idnits 2.17.1 

draft-wang-appsawg-end2end-overload-control-00.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  == The document seems to lack the recommended RFC 2119 boilerplate, even if
     it appears to use RFC 2119 keywords. 

     (The document does seem to have the reference to RFC 2119 which the
     ID-Checklist requires).
  -- The document date (Oct 18, 2013) is 3841 days in the past.  Is this
     intentional?


  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  == Unused Reference: 'RFC2119' is defined on line 241, but no explicit
     reference was found in the text

  == Unused Reference: 'Liao' is defined on line 249, but no explicit
     reference was found in the text


     Summary: 0 errors (**), 0 flaws (~~), 4 warnings (==), 1 comment (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.
--------------------------------------------------------------------------------


2	appsawg                                                          J. Wang
3	Internet-Draft                                                     Q. Yu
4	Intended status: Informational                                   L. Deng
5	                                                            China Mobile
6	                                                            Oct 18, 2013

8	     End-to-end Session Initiation Protocol (SIP) overload control
9	           draft-wang-appsawg-end2end-overload-control-00

11	Abstract

13	   This draft proposes end-to-end Session Initiation Protocol (SIP)
14	   overload control, in which the edge servers of the SIP network
15	   throttle the arriving calls in order to control overload for the SIP
16	   network.  Compared to local and hop-by-hop SIP overload control,
17	   end-to-end SIP overload control can achieve the best performance.

19	Status of This Memo

21	   This Internet-Draft is submitted in full conformance with the
22	   provisions of BCP 78 and BCP 79.

24	   Internet-Drafts are working documents of the Internet Engineering
25	   Task Force (IETF).  Note that other groups may also distribute
26	   working documents as Internet-Drafts.  The list of current Internet-
27	   Drafts is at http://datatracker.ietf.org/drafts/current/.

29	   Internet-Drafts are draft documents valid for a maximum of six months
30	   and may be updated, replaced, or obsoleted by other documents at any
31	   time.  It is inappropriate to use Internet-Drafts as reference
32	   material or to cite them other than as "work in progress."

34	Copyright Notice

36	   Copyright (c) 2013 IETF Trust and the persons identified as the
37	   document authors.  All rights reserved.

39	   This document is subject to BCP 78 and the IETF Trust's Legal
40	   Provisions Relating to IETF Documents
41	   (http://trustee.ietf.org/license-info) in effect on the date of
42	   publication of this document.  Please review these documents
43	   carefully, as they describe your rights and restrictions with respect
44	   to this document.  Code Components extracted from this document must
45	   include Simplified BSD License text as described in Section 4.e of
46	   the Trust Legal Provisions and are provided without warranty as
47	   described in the Simplified BSD License.

49	Table of Contents

51	   1.  Introduction
52	   2.  End-to-end overload control scheme
53	     2.1.  Overview
54	     2.2.  End-to-end overload control design
55	     2.3.  End-to-end overload control algorithm
56	       2.3.1.  End-to-end overload control algorithm metrics
57	       2.3.2.  Default End-to-end overload control algorithm
58	   3.  Security Considerations
59	   4.  IANA Considerations
60	   5.  References
61	     5.1.  Normative References
62	     5.2.  Informative References
63	   Authors' Addresses

65	1.  Introduction

67	   Session Initiation Protocol (SIP) serves as a foundation for many of
68	   today's session-oriented applications, such as Voice over IP (VoIP),
69	   multimedia distributions, video conferencing, instant messaging and
70	   presence service.  The widespread popularity and rapidly growing
71	   deployments of SIP require that SIP servers provide adequate control
72	   mechanisms to handle overload.  Overload of a SIP server occurs if
73	   the message arrival rate to the server exceeds its message processing
74	   capacity.  Under overload, the throughput of a SIP server can drop
75	   significantly and can even reach zero.  In addition, the call setup
76	   delay becomes unacceptable for real-time media calls.  In this case,
77	   the server enters congestion collapse.

79	   [RFC6357] has classified the SIP overload control approaches into
80	   local, hop-by-hop and end-to-end overload control.  In local overload
81	   control, the SIP server monitors its load and starts to reject
82	   requests locally by using 503 (Service Unavailable) responses when it
83	   detects overload.  In hop-by-hop overload control, the overloaded SIP
84	   server can provide feedback to its direct upstream neighbors, which
85	   then adjust the amount of traffic forwarded to this SIP server to
86	   eliminate overload.  In end-to-end overload control, the edge
87	   servers, which are considered as the closest servers to the sources
88	   of traffic in a SIP network, are responsible for adjusting the amount
89	   of traffic forwarded to the overloaded server to eliminate overload.

91	   In deployment scenarios (such as IMS) where a SIP call traverses
92	   multiple SIP servers in a SIP network, local and hop-by-hop
93	   overload control are inefficient because overload is resolved
94	   near the overloaded server.  In this case, the SIP servers located
95	   between the edge server and the overloaded server waste their
96	   processing resources on requests that will eventually be
97	   rejected.  In end-to-end overload control, by contrast, the SIP
98	   network wastes minimal resources on such requests because the
99	   edge servers are responsible for rejecting them at the point of
100	   entry.

102	   The research in [Hilt] indicates that end-to-end overload control
103	   achieves the best performance (highest throughput), although it is
104	   the most complex of the overload control approaches.  Based on this
105	   result, this document proposes an end-to-end overload control
106	   mechanism for networks of SIP servers.

108	2.  End-to-end overload control scheme

110	2.1.  Overview

112	   The SIP network consists of edge servers and core servers.  Each UA
113	   is connected to the network via an edge server located closest to it.
114	   When a SIP call between two UAs goes through the network, the first
115	   server the call arrives at is denoted as the ingress server, and the
116	   last server the call arrives at is denoted as the target server.
117	   Both the ingress server and the target server are edge servers.

119	   The design of end-to-end overload control should follow these
120	   principles:

122	   o  Overload MUST be controlled at ingress servers.  That is, arriving
123	      calls from UAs are throttled at ingress servers.  Overload control
124	      works best when applied at the servers closest to the source of
125	      traffic, because the SIP network then wastes minimal resources on
126	      processing requests that will eventually be rejected.
127	   o  Overload SHOULD be controlled on a per-target basis.  That is,
128	      each ingress server throttles arriving calls from UAs based on
129	      their target server.  Without per-target control, an ingress
130	      server would have to identify which server is overloaded and
131	      throttle the arriving calls that will be routed through it.  With
132	      per-target control, an ingress server only needs to identify
133	      which target server a call passing through the overloaded server
134	      is destined for, and throttle arriving calls that will be
135	      forwarded to that target server.  Per-target control thus makes
136	      it much easier for ingress servers to control overload for the
137	      SIP network.

139	2.2.  End-to-end overload control design

141	   In end-to-end overload control, the core servers SHOULD only
142	   implement local overload control that rejects requests by using 503
143	   responses.  When receiving a 503 response from a downstream neighbor,
144	   the server SHOULD forward this response to the upstream neighbor,
145	   from which the INVITE request related to this response has been
146	   received.  In this way, the 503 response will finally be forwarded to
147	   the edge server.  The edge servers SHOULD calculate and follow the
148	   restrictions on the traffic admitted to the SIP network based on the
149	   received 503 responses.
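
   The 503 path described above can be sketched as a small simulation.
   This is only an illustration under assumed names: the Server class,
   its per-interval capacity model and the send_invite() helper are
   hypothetical and are not defined by this document.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Server:
    """Hypothetical SIP server with a fixed per-interval capacity."""

    name: str
    capacity: int  # requests it can accept per interval (assumed model)
    load: int = 0  # requests accepted so far in the current interval

    def accept(self) -> bool:
        """Local overload control: accept while below capacity."""
        if self.load >= self.capacity:
            return False
        self.load += 1
        return True


def send_invite(path: list[Server]) -> tuple[int, int]:
    """Forward one INVITE along path (ingress first, target last).

    Returns (status, hops). status is the response code seen by the
    ingress server; hops counts the servers the request reached.
    """
    for hops, server in enumerate(path, start=1):
        if not server.accept():
            # The rejecting server answers 503; every upstream server
            # relays it back until it reaches the ingress server.
            return 503, hops
    return 200, len(path)


# An overloaded core server sits between the ingress and the target.
path = [Server("ingress", 10), Server("core", 2), Server("target", 10)]
results = [send_invite(path) for _ in range(4)]
rejections = sum(1 for status, _ in results if status == 503)
```

   In this run the ingress server sees two 503 responses, which is the
   feedback it would use to restrict the traffic it admits.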

151	   In end-to-end overload control, a set of control-units is deployed at
152	   each ingress server to control overload for the SIP network.  At an
153	   ingress server, each control-unit is associated with a specific
154	   target server and controls the arriving calls from UAs that take
155	   this ingress server as the first hop and that target server as the
156	   last hop in the network.  Thus, the load of the network is
157	   controlled by the control-units at all ingress servers.  Note that
158	   this approach is completely distributed: no centralized entity
159	   controls the control-units, each control-unit is functionally
160	   identical and operates independently, and the control-units do not
161	   communicate with each other.  Finally, this approach can be deployed
162	   incrementally by installing control-units on ingress servers, with
163	   no need to alter other servers in the SIP network.  This makes the
164	   approach easy to implement.
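
   A minimal sketch of one ingress server holding one control-unit per
   target server follows.  The class names, the integer per-interval
   budget and the start_interval() method are assumptions made for
   illustration; this document does not prescribe a data structure.

```python
from __future__ import annotations


class ControlUnit:
    """Throttles calls destined for one target server (hypothetical)."""

    def __init__(self, rate: int) -> None:
        self.rate = rate   # calls admitted per control interval
        self.admitted = 0  # calls admitted in the current interval

    def admit(self) -> bool:
        """Admit the call unless the interval budget is exhausted."""
        if self.admitted >= self.rate:
            return False
        self.admitted += 1
        return True

    def start_interval(self) -> None:
        """Reset the budget at the start of a new control interval."""
        self.admitted = 0


class IngressServer:
    """Holds independent control-units, keyed by target server."""

    def __init__(self, initial_rate: int) -> None:
        self.initial_rate = initial_rate
        self.units: dict[str, ControlUnit] = {}

    def handle_call(self, target: str) -> bool:
        """Return True if the call is admitted into the network."""
        if target not in self.units:
            self.units[target] = ControlUnit(self.initial_rate)
        return self.units[target].admit()


ingress = IngressServer(initial_rate=2)
to_a = [ingress.handle_call("target-a") for _ in range(3)]
to_b = ingress.handle_call("target-b")
```

   The third call to "target-a" is throttled while the call to
   "target-b" is still admitted, because the control-units share no
   state.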

166	2.3.  End-to-end overload control algorithm

168	   The main function of the control-unit is to decide the call admission
169	   rate, which is calculated by using the end-to-end overload control
170	   algorithm.

172	2.3.1.  End-to-end overload control algorithm metrics

174	   Aggressiveness: The network is underutilized when the call admission
175	   rate is below the capacity of the network.  In this case, the
176	   control-unit needs to increase the call admission rate as fast as
177	   possible in order to make full use of network resources and avoid
178	   unnecessary call rejections.  Aggressiveness measures how fast a
179	   control-unit makes use of network resources as they become
180	   available.  It is defined as the inverse of the time needed for the
181	   control-unit to increase the call admission rate by a given amount
182	   in response to (1) a step increase of available network resources or
183	   (2) a step increase of call arrival rate when there are available
184	   resources in the network.  High aggressiveness, which implies
185	   potentially high utilization, is desirable.

187	   Responsiveness: The network is overloaded when the call admission
188	   rate exceeds the capacity of the network.  In this case, the
189	   control-unit needs to decrease the call admission rate as fast as
190	   possible in order to eliminate overload.  Responsiveness measures
191	   how fast a control-unit decreases the call admission rate in
192	   response to overload.  It is defined as the inverse of the time
193	   needed for the control-unit to decrease the call admission rate
194	   by a given amount in response to a step increase of network
195	   overload.  High responsiveness, which allows the control-unit to
196	   decrease the call admission rate quickly when overload occurs, is
197	   desirable.

199	   Throughput: The network is fully utilized when the call admission
200	   rate is close to the capacity of the network.  In this case, the
201	   throughput is determined by the overload control algorithm.  High
202	   throughput is desirable.
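
   One possible way to measure aggressiveness and responsiveness from a
   recorded trace of call admission rates is sketched below.  The
   sampling model is an assumption: rates[0] is taken at the step
   change, one sample follows per control interval, and a change that
   never reaches the requested amount is reported as 0.0.  The function
   names are hypothetical.

```python
from __future__ import annotations

from typing import Optional, Sequence


def _intervals_until(rates: Sequence[float], delta: float,
                     sign: int) -> Optional[int]:
    """Number of intervals until the rate has moved by delta.

    sign is +1 to look for an increase and -1 for a decrease.
    Returns None if the trace never reaches the requested change.
    """
    start = rates[0]
    for k in range(1, len(rates)):
        if sign * (rates[k] - start) >= delta:
            return k
    return None


def aggressiveness(rates: Sequence[float], interval: float,
                   delta: float) -> float:
    """Inverse of the time needed to raise the rate by delta."""
    k = _intervals_until(rates, delta, +1)
    return 0.0 if k is None else 1.0 / (k * interval)


def responsiveness(rates: Sequence[float], interval: float,
                   delta: float) -> float:
    """Inverse of the time needed to lower the rate by delta."""
    k = _intervals_until(rates, delta, -1)
    return 0.0 if k is None else 1.0 / (k * interval)


# A rate that climbs by 2 per 0.5 s interval gains 4 in 1.0 s.
agg = aggressiveness([10, 12, 14, 16], interval=0.5, delta=4)
# A rate that halves per 0.5 s interval drops by 50 in 0.5 s.
resp = responsiveness([100, 50, 25], interval=0.5, delta=50)
```

   Both values are in units of 1/second here; comparing two algorithms
   only makes sense when the same delta and interval are used for each.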

204	2.3.2.  Default End-to-end overload control algorithm

206	   The default end-to-end overload control algorithm presented here is
207	   only an example.  Other algorithms that can achieve high
208	   aggressiveness, high responsiveness and high throughput may be used.

210	   The default end-to-end overload control algorithm consists of an
211	   increasing rule and a decreasing rule.  When there is no overload
212	   feedback, the algorithm increases call admission rate according to
213	   the increasing rule.  When receiving the overload feedback, the
214	   algorithm decreases call admission rate according to the decreasing
215	   rule.  The control-unit periodically executes the overload control
216	   algorithm (with interval T) and takes the number of received 503
217	   responses during each T as the overload feedback to the algorithm.

219	   The increasing rule and decreasing rule are shown as follows:

221	      increasing: r(t+1) = r(t) + a,       a > 0
222	      decreasing: r(t+1) = r(t) - b*r(t),  0 < b < 1

224	   where r(t) is the call admission rate at time t, and a and b are
225	   constant factors.  That is, if no call rejection is received, the
226	   call admission rate is increased additively.  Otherwise, it is
227	   decreased multiplicatively.
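
   The two rules can be sketched as a single update step that a
   control-unit would run once per interval T.  The function names and
   the values chosen for a and b are illustrative assumptions; this
   document does not recommend specific constants.

```python
from __future__ import annotations


def next_rate(rate: float, rejections: int, a: float, b: float) -> float:
    """Apply one step of the example algorithm.

    rate       -- r(t), the current call admission rate
    rejections -- number of 503 responses received during the last T
    a          -- additive increase constant, a > 0
    b          -- multiplicative decrease constant, 0 < b < 1
    """
    if a <= 0:
        raise ValueError("a must be greater than 0")
    if not 0 < b < 1:
        raise ValueError("b must be between 0 and 1")
    if rejections == 0:
        return rate + a        # increasing rule: r(t+1) = r(t) + a
    return rate - b * rate     # decreasing rule: r(t+1) = r(t) - b*r(t)


def run(initial: float, feedback: list[int], a: float,
        b: float) -> list[float]:
    """Return the rate trace for a sequence of per-interval 503 counts."""
    trace = [initial]
    for rejections in feedback:
        trace.append(next_rate(trace[-1], rejections, a, b))
    return trace


# Hypothetical constants: a = 2 calls per interval, b = 0.5.
trace = run(10.0, [0, 0, 5, 0], a=2.0, b=0.5)
```

   With these assumed constants the rate climbs from 10 to 14 over two
   quiet intervals, halves to 7 when 503 responses arrive, and then
   resumes the additive increase.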

229	3.  Security Considerations

231	   TBA

233	4.  IANA Considerations

235	   None.

237	5.  References

239	5.1.  Normative References

241	   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
242	              Requirement Levels", BCP 14, RFC 2119, March 1997.

244	5.2.  Informative References

246	   [Hilt]     Hilt, V. and I. Widjaja, "Controlling Overload in Networks
247	              of SIP Servers", October 2008.

249	   [Liao]     Liao, J., Wang, J., Li, T., Wang, J., Wang, J., and X.
250	              Zhu, "A Distributed End-to-end Overload Control Mechanism
251	              for Networks of SIP Servers", Computer Networks, 2012.

253	   [RFC6357]  Hilt, V., Noel, E., Shen, C., and A. Abdelal, "Design
254	              Considerations for Session Initiation Protocol (SIP)
255	              Overload Control", RFC 6357, August 2011.

257	Authors' Addresses

259	   Jinzhu Wang
260	   China Mobile

262	   Email: wangjinzhu@chinamobile.com

264	   Qing Yu
265	   China Mobile

267	   Email: yuqing@chinamobile.com

269	   Lingli Deng
270	   China Mobile

272	   Email: denglingli@chinamobile.com