SOC Working Group                                        V. Gurbani, Ed.
Internet-Draft                         Bell Laboratories, Alcatel-Lucent
Intended status: Standards Track                                 V. Hilt
Expires: April 30, 2012                         Bell Labs/Alcatel-Lucent
                                                          H. Schulzrinne
                                                     Columbia University
                                                        October 28, 2011

           Session Initiation Protocol (SIP) Overload Control
                   draft-ietf-soc-overload-control-05

Abstract

   Overload occurs in Session Initiation Protocol (SIP) networks when
   SIP servers have insufficient resources to handle all SIP messages
   they receive. Even though the SIP protocol provides a limited
   overload control mechanism through its 503 (Service Unavailable)
   response code, SIP servers are still vulnerable to overload. This
   document defines the behaviour of SIP servers involved in overload
   control, and in addition, it specifies a loss-based overload scheme
   for SIP.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on April 30, 2012.

Copyright Notice

   Copyright (c) 2011 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.
   Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1. Introduction
   2. Terminology
   3. Overview of operations
   4. Via header parameters for overload control
     4.1. The oc parameter
     4.2. The oc-algo parameter
     4.3. The oc-validity parameter
     4.4. The oc-seq parameter
   5. General behaviour
     5.1. Handshake to determine support for overload control
     5.2. Creating and updating the overload control parameters
     5.3. Determining the 'oc' Parameter Value
     5.4. Processing the Overload Control Parameters
     5.5. Using the Overload Control Parameter Values
     5.6. Forwarding the overload control parameters
     5.7. Terminating overload control
     5.8. Stabilizing overload control
     5.9. Self-Limiting
     5.10. Responding to an Overload Indication
       5.10.1. Message prioritization at the hop before the
               overloaded server
       5.10.2. Rejecting requests at an overloaded server
     5.11. 100-Trying provisional response and overload control
           parameters
   6. The loss-based overload control scheme
     6.1. Special parameter values for loss-based overload control
     6.2. Example
     6.3. Default algorithm for loss-based overload control
   7. Relationship with other IETF SIP load control efforts
   8. Syntax
   9. Design Considerations
     9.1. SIP Mechanism
       9.1.1. SIP Response Header
       9.1.2. SIP Event Package
     9.2. Backwards Compatibility
   10. Security Considerations
   11. IANA Considerations
   12. References
     12.1. Normative References
     12.2. Informative References
   Appendix A. Acknowledgements
   Appendix B. RFC5390 requirements
   Authors' Addresses

1. Introduction

   As with any network element, a Session Initiation Protocol (SIP)
   [RFC3261] server can suffer from overload when the number of SIP
   messages it receives exceeds the number of messages it can process.
   Overload can pose a serious problem for a network of SIP servers.
   During periods of overload, the throughput of a network of SIP
   servers can be significantly degraded. In fact, overload may lead to
   a situation in which the throughput drops down to a small fraction of
   the original processing capacity. This is often called congestion
   collapse.

   Overload is said to occur if a SIP server does not have sufficient
   resources to process all incoming SIP messages. These resources may
   include CPU processing capacity, memory, network bandwidth,
   input/output, or disk resources.

   For overload control, we only consider failure cases where SIP
   servers are unable to process all SIP requests due to resource
   constraints. There are other cases where a SIP server can
   successfully process incoming requests but has to reject them due to
   failure conditions unrelated to the SIP server being overloaded. For
   example, a PSTN gateway that runs out of trunks but still has plenty
   of capacity to process SIP messages should reject incoming INVITEs
   using a 488 (Not Acceptable Here) response [RFC4412]. Similarly, a
   SIP registrar that has lost connectivity to its registration database
   but is still capable of processing SIP requests should reject
   REGISTER requests with a 500 (Server Error) response [RFC3261].
   Overload control does not apply to these cases and SIP provides
   appropriate response codes for them.

   The SIP protocol provides a limited mechanism for overload control
   through its 503 (Service Unavailable) response code. However, this
   mechanism cannot prevent overload of a SIP server and it cannot
   prevent congestion collapse. In fact, the use of the 503 (Service
   Unavailable) response code may cause traffic to oscillate and to
   shift between SIP servers and thereby worsen an overload condition.
   A detailed discussion of the SIP overload problem, the problems with
   the 503 (Service Unavailable) response code and the requirements for
   a SIP overload control mechanism can be found in [RFC5390].

   This document defines the general behaviour of SIP servers and
   clients involved in overload control in Section 5. In addition,
   Section 6 specifies a loss-based overload control scheme.
   SIP clients and servers conformant to this specification MUST
   implement the loss-based overload control scheme. They MAY implement
   other overload control schemes as well.

2. Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

   The normative statements in this specification as they apply to SIP
   clients and SIP servers assume that both the SIP clients and SIP
   servers support this specification. If, for instance, only a SIP
   client supports this specification and not the SIP server, then it
   follows that the normative statements in this specification pertinent
   to the behavior of a SIP server do not apply to the server that does
   not support this specification.

3. Overview of operations

   We now give an overview of how the overload control mechanism
   operates by introducing the overload control parameters. Section 4
   provides more details and normative behavior on the parameters listed
   below.

   Because overload control is best performed hop-by-hop, the Via
   parameter is attractive since it allows two adjacent SIP entities to
   indicate support for, and exchange information associated with,
   overload control. Additional advantages of this choice are discussed
   in Section 9.1.1. An alternative mechanism using SIP event packages
   was also considered, and the characteristics of that choice are
   further outlined in Section 9.1.2.

   This document defines four new parameters for the SIP Via header for
   overload control. These parameters provide a mechanism for conveying
   overload control information between adjacent SIP entities. The "oc"
   parameter is used by a SIP server to indicate a reduction in the
   number of requests arriving at the server.
   The "oc-algo" parameter
   contains a token or a list of tokens corresponding to the class of
   overload control algorithms supported by the client. The server
   chooses one algorithm from this list. The "oc-validity" parameter
   establishes a time limit for which overload control is in effect, and
   the "oc-seq" parameter aids in sequencing the responses at the
   client. These parameters are discussed in detail in the next
   section.

4. Via header parameters for overload control

   The four Via header parameters are introduced below. Further context
   on how to interpret them under various conditions is provided in
   Section 5.

4.1. The oc parameter

   This parameter is inserted by the SIP client and updated by the SIP
   server.

   A SIP client MUST add an "oc" parameter to the topmost Via header it
   inserts into the SIP request. This provides an indication to
   downstream neighbors that the client supports overload control. When
   inserted into a request by a SIP client to indicate support for
   overload control, there MUST NOT be a value associated with the
   parameter.

   The downstream server MUST add a value to the "oc" parameter in the
   response going upstream. Including a value in the parameter
   conveys two things. First, upon an initial handshake (see Section
   X.X), the addition of a value by the server to this parameter
   indicates (to the client) that the downstream server supports
   overload control as defined in this document. Second, if the value
   added by the server is non-zero, it indicates (to the client) that
   the server wants to perform overload control.

   When a SIP client receives a response with the value in the "oc"
   parameter filled in, it SHOULD reduce, by the amount indicated, the
   number of requests going downstream to the SIP server from which it
   received the response (see Section 5.10 for pertinent discussion on
   traffic reduction).

4.2. The oc-algo parameter

   This parameter is inserted by the SIP client and updated by the SIP
   server.

   A SIP client conformant to this specification MUST add an "oc-algo"
   parameter to the topmost Via header it inserts into the SIP request.
   This parameter contains one or more overload control algorithms. A
   SIP client conformant to this specification MUST support the
   loss-based overload control scheme and MUST insert the token "loss"
   as the "oc-algo" parameter value. In addition, the SIP client MAY
   insert other tokens, separated by a comma, in the "oc-algo" parameter
   if it supports other overload control schemes such as a rate-based
   scheme ([I-D.noel-soc-overload-rate-control]). Each element in the
   comma-separated list corresponds to the class of overload control
   algorithms supported by the SIP client. When more than one class of
   overload control algorithms is present in the "oc-algo" parameter,
   the client may indicate algorithm preference by ordering the list in
   decreasing order of preference. However, the client must not
   assume that the server will pick the most preferred algorithm.

   When a downstream SIP server receives a request with a choice of
   overload control algorithms specified in the "oc-algo" parameter
   value, it MUST choose one algorithm from the list and MUST pare the
   list down to include only the one chosen algorithm. The pared down
   list consisting of the chosen algorithm MUST be returned to the
   upstream SIP client in the response.

   Once a SIP client and a SIP server have converged on a mutually
   agreeable class of overload control algorithm, the agreed-upon class
   stays in effect for a non-trivial duration of time to allow the
   overload control algorithm to stabilize its behaviour (see
   Section 5.8).
   Furthermore, the client MUST continue to include all
   supported algorithms in subsequent requests; the server MUST respond
   with the agreed-upon algorithm until such time that the algorithm is
   changed by the server (see Section 5.8).

4.3. The oc-validity parameter

   This parameter is inserted by the SIP server.

   This parameter contains a value that indicates an interval of time
   (measured in milliseconds) during which the load reduction specified
   in the value of the "oc" parameter should be in effect. The default
   value of the "oc-validity" parameter is 500 milliseconds.

   A value of 0 in the "oc-validity" parameter is reserved to denote the
   event that the server wishes to stop overload control (see
   Section 5.7 for more information).

   A non-zero value for the "oc-validity" parameter MUST only be present
   in conjunction with an "oc" parameter.

4.4. The oc-seq parameter

   This parameter is inserted by the SIP server.

   This parameter contains a value that indicates the sequence number
   associated with the "oc" parameter. Some implementations may be
   capable of updating the overload control information before the
   validity period specified by the "oc-validity" parameter expires.
   Such implementations MUST have an increasing value in the "oc-seq"
   parameter for each response sent to the upstream SIP client. This is
   to allow the upstream SIP client to properly collate out-of-order
   responses.

5. General behaviour

   When forwarding a SIP request, a SIP client uses the SIP procedures
   of [RFC3263] to determine the next hop SIP server. The procedures of
   [RFC3263] take as input a SIP URI, extract the domain portion of that
   URI for use as a lookup key, and query the Domain Name Service (DNS)
   to obtain an ordered set of one or more IP addresses with a port
   number and transport corresponding to each IP address in this set
   (the "Expected Output").
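
   As an illustration only, a client could keep the feedback described
   in Sections 4.3 and 4.4 in a table keyed by the entries of the
   Expected Output (address, port, transport). The sketch below is not
   part of this specification: the names OverloadTable, OverloadState,
   update and lookup are hypothetical, and the rule that forgets a
   server's sequence number once its entry expires is a simplifying
   assumption.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple

DEFAULT_VALIDITY_MS = 500  # default "oc-validity" stated in Section 4.3

# One entry of the "Expected Output": (IP address, port, transport).
ServerKey = Tuple[str, int, str]


@dataclass
class OverloadState:
    oc: int            # last "oc" value received from this server
    seq: int           # last "oc-seq" value received from this server
    expires_at: float  # clock reading at which the "oc" value lapses


class OverloadTable:
    """Hypothetical per-server store for overload control feedback."""

    def __init__(self, clock: Callable[[], float] = time.monotonic):
        self._clock = clock
        self._states: Dict[ServerKey, OverloadState] = {}

    def lookup(self, server: ServerKey) -> Optional[OverloadState]:
        """Return the non-expired state for a server, or None."""
        state = self._states.get(server)
        if state is None:
            return None
        if self._clock() >= state.expires_at:
            # Validity period is over: discard the stored value.
            del self._states[server]
            return None
        return state

    def update(self, server: ServerKey, oc: int, seq: int,
               validity_ms: Optional[int] = None) -> bool:
        """Store new feedback; return False if it is stale."""
        current = self.lookup(server)
        if current is not None and seq <= current.seq:
            return False  # out-of-order response, keep the newer value
        if validity_ms is None:
            validity_ms = DEFAULT_VALIDITY_MS
        deadline = self._clock() + validity_ms / 1000.0
        self._states[server] = OverloadState(oc, seq, deadline)
        return True
```

   With this layout, a response carrying "oc-validity=0" produces an
   entry that is already expired, so a later lookup reports that no
   overload control is in effect for that server.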

   After selecting a specific SIP server from the Expected Output, a SIP
   client compliant to this specification MUST determine if it is
   operating under overload control mode with the server (see
   Section 5.5) or if this is the initial contact with the server.

   If the client determines that this is the initial contact with the
   server, it proceeds in the following manner to determine if the
   downstream server supports overload control and to choose an overload
   control algorithm: A client compliant to this specification MUST
   insert the "oc" parameter without any value, and MUST insert the
   "oc-algo" parameter with a list of algorithms it supports. This list
   MUST include "loss" and MAY include other algorithm names approved by
   IANA and described in corresponding documents. The client transmits
   the request to the chosen server.

   If the server supports overload control as described in this
   document, it MUST set the value of the "oc" parameter in the request
   to "0". In addition, it MUST choose one algorithm from the list of
   algorithms in the "oc-algo" parameter and echo the chosen algorithm
   as the only value of the "oc-algo" parameter in the response sent
   back to the client. A server compliant to this specification MAY
   insert an "oc-validity=0" parameter in the response to further
   qualify the value inserted in the "oc" parameter.

      A client that supports the rate-based overload control scheme
      [I-D.noel-soc-overload-rate-control] will consider a value of "0"
      in the "oc" parameter as an indication not to send any requests
      downstream at all. Thus, when the server inserts "oc-validity=0"
      as well, it is indicating that it does support overload control,
      but it is not under overload mode right now.

5.1. Handshake to determine support for overload control

   When a client contacts a server whose overload control support is not
   known, the client MUST insert the "oc" parameter without any value.
   Additionally, the client MUST insert the "oc-algo" parameter with a
   list of algorithms it supports for overload control. This list MUST
   include "loss" and MAY also include other algorithm names approved by
   IANA and described in their corresponding documents in the future.

   A server that supports overload control MUST set the value of the
   "oc" parameter to 0. In addition, it MUST choose one algorithm
   from the list of algorithms in the "oc-algo" parameter and echo the
   chosen algorithm as the sole parameter value in the "oc-algo"
   parameter. A server that supports overload control MAY insert an
   "oc-validity=0" parameter in the response to further qualify the
   value in the "oc" parameter.

      Note that the rate-based overload control scheme considers "oc=0"
      as an indication not to send any requests downstream at all.
      Thus, having the "oc-validity=0" parameter further imparts the
      semantics that overload control is supported, but turned off (see
      Section 5.7).

5.2. Creating and updating the overload control parameters

   A SIP server can provide overload control feedback to its upstream
   neighbors by providing a value for the "oc" parameter to the topmost
   Via header field of a SIP response. The topmost Via header is
   determined after the SIP server has removed its own Via header; i.e.,
   it is the Via header that was generated by the upstream neighbor.

   Since the topmost Via header of a response will be removed by an
   upstream neighbor after processing it, overload control feedback
   contained in the "oc" parameter will not travel beyond the upstream
   SIP client.
   A Via header parameter therefore provides hop-by-hop
   semantics for overload control feedback (see [RFC6357]) even if the
   next hop neighbor does not support this specification.

   The "oc" parameter can be used in all response types, including
   provisional, success and failure responses (please see Section 5.11
   for special consideration on transporting overload control parameters
   in a 100-Trying response). A SIP server MAY update the "oc"
   parameter in all responses it is sending. A SIP server MUST update
   the "oc" parameter in responses when the transmission of overload
   control feedback is required by the overload control algorithm to
   limit the traffic received by the server. That is, a SIP server MUST
   update the "oc" parameter when the overload control algorithm sets
   the value of an "oc" parameter to a value different than the default
   value.

   A SIP server that has updated the "oc" parameter in the Via header
   SHOULD also add an "oc-validity" parameter to the same Via header.
   The "oc-validity" parameter defines the time in milliseconds during
   which the content (i.e., the overload control feedback) of the "oc"
   parameter is valid. The default value of the "oc-validity" parameter
   is 500 milliseconds. A SIP server SHOULD use a shorter "oc-validity"
   time if its overload status varies quickly and MAY use a longer
   "oc-validity" time if this status is more stable. If the
   "oc-validity" parameter is not present, its default value is used.
   The "oc-validity" parameter MUST NOT be used in a Via header that did
   not originally contain an "oc" parameter when received.
   Furthermore,
   when a SIP server receives a request with the topmost Via header
   containing only an "oc-validity" parameter without the accompanying
   "oc" parameter, it MUST ignore the "oc-validity" parameter.

   When a SIP server retransmits a response, it SHOULD use the "oc"
   parameter value and "oc-validity" parameter value consistent with the
   overload state at the time the retransmitted response is sent. This
   implies that the values in the "oc" and "oc-validity" parameters may
   be different than the ones used in previous retransmissions of the
   response. Because responses sent over UDP may be subject to delays
   in the network and arrive out of order, the "oc-seq" parameter aids
   in detecting a stale "oc" parameter value.

   Implementations that are capable of updating the "oc" and
   "oc-validity" parameter values for retransmissions MUST insert the
   "oc-seq" parameter. The value of this parameter MUST be a set of
   numbers drawn from an increasing sequence.

   Implementations that are not capable of updating the "oc" and
   "oc-validity" parameter values for retransmissions --- or
   implementations that do not want to do so because they will have to
   regenerate the message to be retransmitted --- MUST still insert an
   "oc-seq" parameter in the first response associated with a
   transaction; however, they do not have to update the value in
   subsequent retransmissions.

   The "oc-validity" and "oc-seq" Via header parameters are only defined
   in SIP responses and MUST NOT be used in SIP requests. These
   parameters are only useful to the upstream neighbor of a SIP server
   (i.e., the entity that is sending requests to the SIP server) since
   this is the entity that can offload traffic by redirecting/rejecting
   new requests.
   If requests are forwarded in both directions between
   two SIP servers (i.e., the roles of upstream/downstream neighbors
   change), there are also responses flowing in both directions. Thus,
   both SIP servers can exchange overload information.

   Since overload control protects a SIP server from overload, it is
   RECOMMENDED that a SIP server use the mechanisms described in this
   specification. However, if a SIP server wants to limit its overload
   control capability for privacy reasons, it MAY decide to perform
   overload control only for requests that are received on a secure
   transport channel, such as TLS. This enables a SIP server to protect
   overload control information and ensure that it is only visible to
   trusted parties.

5.3. Determining the 'oc' Parameter Value

   The value of the "oc" parameter is determined by the overloaded
   server using any pertinent information at its disposal. The process
   by which an overloaded server determines the value of the "oc"
   parameter is considered out of scope for this document.

5.4. Processing the Overload Control Parameters

   A SIP client compliant to this specification SHOULD remove "oc",
   "oc-validity" and "oc-seq" parameters from all Via headers of a
   response received, except for the topmost Via header. This prevents
   overload control parameters that were accidentally or maliciously
   inserted into Via headers by a downstream SIP server from traveling
   upstream.

   A SIP client maintains the "oc" parameter values received along with
   the address and port number of the SIP servers from which they were
   received for the duration specified in the "oc-validity" parameter or
   the default duration. Each time a SIP client receives a response
   with an "oc" parameter from a downstream SIP server, it overwrites
   the "oc" value it has currently stored for this server with the new
   value received.
   The SIP client restarts the validity period of an
   "oc" parameter each time a response with an "oc" parameter is
   received from this server. A stored "oc" parameter value MUST be
   discarded once it has reached the end of its validity.

5.5. Using the Overload Control Parameter Values

   A SIP client compliant to this specification MUST honor overload
   control values it receives from downstream neighbors. The SIP client
   MUST NOT forward more requests to a SIP server than allowed by the
   current "oc" parameter value from a particular downstream server.

   When forwarding a SIP request, a SIP client uses the SIP procedures
   of [RFC3263] to determine the next hop SIP server. The procedures of
   [RFC3263] take as input a SIP URI, extract the domain portion of that
   URI for use as a lookup key, and query the Domain Name Service (DNS)
   to obtain an ordered set of one or more IP addresses with a port
   number and transport corresponding to each IP address in this set
   (the "Expected Output").

   After selecting a specific SIP server from the Expected Output, the
   SIP client MUST determine if it already has overload control
   parameter values for the server chosen from the Expected Output. If
   the SIP client has a non-expired "oc" parameter value for the server
   chosen from the Expected Output, then this chosen server is operating
   in overload control mode. Thus, the SIP client MUST determine if it
   can or cannot forward the current request to the SIP server depending
   on the nature of the request and the prevailing overload conditions.

   The particular algorithm used to determine whether or not to forward
   a particular SIP request is a matter of local policy, and may take
   into account a variety of prioritization factors. However, this
   local policy SHOULD generate the same number of SIP requests as the
   default algorithm defined by the overload control scheme being used.

5.6. Forwarding the overload control parameters

   A SIP client MAY forward the content of an "oc" parameter it has
   received from a downstream neighbor on to its upstream neighbor.
   However, forwarding the content of the "oc" parameter is generally
   NOT RECOMMENDED and should only be performed if permitted by the
   configuration of SIP servers. For example, a SIP server that only
   relays messages between exactly two SIP servers may forward an "oc"
   parameter. The "oc" parameter is forwarded by copying it from the
   Via in which it was received into the next Via header (i.e., the Via
   header that will be on top after processing the response). If an
   "oc-validity" parameter is present, it MUST be copied along with the
   "oc" parameter.

5.7. Terminating overload control

   A SIP client stops applying overload control to the number of
   messages forwarded (i.e., it stops reducing the number of messages
   forwarded) if one of the following events occurs:

   1. The "oc" parameter is set to a value that allows the client to
      forward all traffic;
   2. The "oc-validity" period negotiated to put the server and client
      in overload state expires;
   3. The client is explicitly told by the server to stop performing
      overload control using the "oc-validity=0" parameter.

   A SIP server can decide to terminate overload control by explicitly
   signaling the client. To do so, the SIP server MUST set the value of
   the "oc-validity" parameter to 0. The SIP server MUST increment the
   value of "oc-seq", and SHOULD set the value of the "oc" parameter to
   0.

      Note that the loss-based overload control scheme (Section 6) can
      effectively stop overload control by setting the value of the "oc"
      parameter to 0. However, the rate-based scheme
      ([I-D.noel-soc-overload-rate-control]) needs an additional piece
      of information in the form of "oc-validity=0".
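
   The three terminating events listed above can be sketched as a
   single client-side check. This is a non-normative illustration: the
   function name overload_control_active and its arguments are
   hypothetical, and the token "rate" is only an assumed name for the
   rate-based scheme, for which "oc=0" does not mean that all traffic
   may flow.

```python
from typing import Optional

DEFAULT_VALIDITY_MS = 500  # default "oc-validity" stated in Section 4.3


def overload_control_active(algo: str,
                            oc: Optional[int],
                            validity_ms: Optional[int],
                            elapsed_ms: float) -> bool:
    """Return True while the client should keep reducing traffic.

    algo        -- negotiated "oc-algo" token ("loss", or an assumed "rate")
    oc          -- last "oc" value received, or None if none is stored
    validity_ms -- last "oc-validity" received, or None if it was absent
    elapsed_ms  -- time since that feedback was received
    """
    if oc is None:
        return False  # no feedback stored for this server
    if validity_ms is None:
        validity_ms = DEFAULT_VALIDITY_MS
    if validity_ms == 0:
        return False  # event 3: explicit "oc-validity=0"
    if elapsed_ms >= validity_ms:
        return False  # event 2: validity period expired
    if algo == "loss" and oc == 0:
        return False  # event 1: loss-based "oc=0" allows all traffic
    return True
```

   Checking "oc-validity=0" before looking at the "oc" value matches the
   rule that a client disregards "oc" when the validity is 0.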

   When the client receives a response with a higher "oc-seq" number
   than the one it currently is processing, it checks the "oc-validity"
   parameter. If the value of the "oc-validity" parameter is 0, the
   client MUST stop performing overload control of messages destined to
   the server and the traffic should flow without any reduction.
   Furthermore, when the value of the "oc-validity" parameter is 0, the
   client SHOULD disregard the value in the "oc" parameter.

5.8. Stabilizing overload control

   Realities of deployments of SIP necessitate that the overload control
   algorithm be renegotiated upon a system reboot or a software upgrade.
   However, frequent renegotiation of the overload control algorithm
   MUST be avoided. A rapid renegotiation of the overload control
   algorithm will not benefit the client or the server as such flapping
   does not allow the chosen algorithm to measure and fine tune its
   behavior over a period of time. Renegotiation, when desired, is
   simply accomplished by the SIP server choosing a new algorithm from
   the list in the "oc-algo" parameter and sending it back to the client
   in a response.

   The client associates a specific algorithm with each server it sends
   traffic to such that when the server changes the algorithm, the
   client must behave accordingly as well.

   Once the client and server agree on an overload control algorithm, it
   MUST remain in effect for at least 3600 seconds (1 hour) before
   renegotiation occurs.

      One way to accomplish this involves the server saving the time of
      the last negotiation in a lookup table, indexed by the client's
      network identifiers. Renegotiation is only done when the time
      since the last negotiation has surpassed 3600 seconds.

5.9. Self-Limiting

   In some cases, a SIP client may not receive a response from a server
   after sending a request.
RFC3261 [RFC3261] defines that when a
timeout error is received from the transaction layer, it MUST be
treated as if a 408 (Request Timeout) status code has been received.
If a fatal transport error is reported by the transport layer, it
MUST be treated as a 503 (Service Unavailable) status code.

In the event of repeated timeouts or fatal transport errors, the SIP
client MUST stop sending requests to this server. The SIP client
SHOULD occasionally forward a single request to probe if the
downstream server is alive. Once a SIP client has successfully
transmitted a request to the downstream server, the SIP client can
resume normal traffic rates. It should, of course, honor any "oc"
parameters it may receive subsequent to resuming normal traffic
rates.

5.10. Responding to an Overload Indication

A SIP client can receive overload control feedback indicating that it
needs to reduce the traffic it sends to its downstream server. The
client can accomplish this task by sending some of the requests that
would have gone to the overloaded element to a different destination.
It needs to ensure, however, that this destination is not in overload
and is capable of processing the extra load. A client can also buffer
requests in the hope that the overload condition will resolve quickly
and the requests can still be forwarded in time. In many cases,
however, it will need to reject these requests.

5.10.1. Message prioritization at the hop before the overloaded server

During an overload condition, a SIP client needs to prioritize
requests and select those requests that need to be rejected or
redirected. While this selection is largely a matter of local
policy, certain heuristics can be suggested. One, during overload
control, the SIP client should preserve existing dialogs as much as
possible. This suggests that mid-dialog requests MAY be given
preferential treatment.
Similarly, requests that result in releasing
resources (such as a BYE) MAY also be given preferential treatment.

A SIP client SHOULD honor the local policy for prioritizing SIP
requests such as policies based on the content of the
Resource-Priority header (RPH, RFC4412 [RFC4412]). Specific
(namespace.value) RPH contents may indicate high priority requests
that should be preserved as much as possible during overload. The
RPH contents can also indicate a low-priority request that is
eligible to be dropped during times of overload. Other indicators,
such as the SOS URN [RFC5031] indicating an emergency request, may
also be used for prioritization.

Local policy could also include giving precedence to mid-dialog SIP
requests (re-INVITEs, UPDATEs, BYEs, etc.) in times of overload. A
local policy can be expected to combine both the SIP request type and
the prioritization markings, and SHOULD be honored when overload
conditions prevail.

A SIP client SHOULD honor user-level load control filters installed
by signaling neighbors [I-D.ietf-soc-load-control-event-package] by
sending the SIP messages that matched the filter downstream.

5.10.2. Rejecting requests at an overloaded server

If the upstream SIP client to the overloaded server does not support
overload control, it will continue to direct requests to the
overloaded server. Thus, the overloaded server must bear the cost of
rejecting some session requests as well as the cost of processing
other requests to completion. It would be fair to devote the same
amount of processing at the overloaded server to the combination of
rejection and processing as the overloaded server would devote to
processing requests from an upstream SIP client that supported
overload control. This is to ensure that SIP servers that do not
support this specification don't receive an unfair advantage over
those that do.
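One way to read the fairness goal above is that the overloaded server
itself sheds, for neighbors that do not support overload control,
roughly the share of requests that a supporting neighbor would have
withheld. The Python fragment below sketches that reading under stated
assumptions: the names should_reject_locally and supports_oc are
hypothetical, and equating "the same amount of processing" with "the
same fraction of requests" is a simplification of this sketch, not a
rule taken from this document.

```python
import random


def should_reject_locally(supports_oc: bool, oc_percent: float,
                          rng: random.Random) -> bool:
    """Decide whether the overloaded server rejects a request itself.

    supports_oc -- True if the upstream neighbor put "oc" in its Via
    oc_percent  -- loss percentage currently signaled to supporting
                   neighbors (0..100)
    rng         -- random source, passed in so behaviour is repeatable
    """
    if supports_oc:
        # Supporting neighbors already throttle on the server's behalf.
        return False
    # Non-supporting neighbors: reject about oc_percent% of requests.
    # rng.random() is in [0.0, 1.0), so 0 never rejects and 100 always does.
    return rng.random() * 100.0 < oc_percent
```

A request selected by this check would be answered with a 503 (Service
Unavailable) response; requests from supporting neighbors are never
selected, since those neighbors are expected to reduce traffic
themselves.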
A SIP server that is under overload and has started to throttle
incoming traffic MUST use a 503 (Service Unavailable) response
without a Retry-After header to reject a fraction of requests from
upstream neighbors that do not support overload control.

5.11. 100-Trying provisional response and overload control parameters

The overload control information sent from a SIP server to a client
is transported in the responses. While implementations can insert
overload control information in any response, special attention
should be accorded to overload control information transported in a
100-Trying response.

Traditionally, the 100-Trying response has been used in SIP to quench
retransmissions. In some implementations, the 100-Trying message may
not be generated by the transaction user (TU) nor consumed by the TU.
In these implementations, the 100-Trying response is generated at the
transaction layer and sent to the upstream SIP client. At the
receiving SIP client, the 100-Trying is consumed at the transaction
layer by inhibiting the retransmission of the corresponding request.
Consequently, implementations that insert overload control
information in the 100-Trying cannot assume that the upstream SIP
client passed the overload control information in the 100-Trying to
its corresponding TU. For this reason, implementations that insert
overload control information in the 100-Trying MUST re-insert the
same (or updated) overload control information in the first non-100
response being sent to the upstream SIP client.

6. The loss-based overload control scheme

A loss percentage enables a SIP server to ask an upstream neighbor to
reduce the number of requests it would normally forward to this
server by X%. For example, a SIP server can ask an upstream neighbor
to reduce the number of requests this neighbor would normally send by
10%.
The upstream neighbor then redirects or rejects 10% of the
traffic that is destined for this server.

This section specifies the semantics of the overload control
parameters associated with the loss-based overload control scheme.
The general behaviour of SIP clients and servers is specified in
Section 5 and is applicable to SIP clients and servers that implement
loss-based overload control.

6.1. Special parameter values for loss-based overload control

The loss-based overload control scheme is identified using the token
"loss". This token MUST appear in the "oc-algo" parameter.

A SIP server, upon entering the overload state, will assign a value
to the "oc" parameter. This value MUST be restricted to the range
[0, 100], inclusive. This value MUST be interpreted as a percentage,
and the SIP client MUST reduce the number of requests being forwarded
to the overloaded server by that amount. The SIP client may use any
algorithm that reduces the traffic arriving at the overloaded server
by the amount indicated. Such an algorithm SHOULD honor the message
prioritization discussion of Section 5.10.1. While a particular
algorithm is not subject to standardization, for completeness a
default algorithm for loss-based overload control is provided in
Section 6.3.

When a SIP server receives a request from a client with an "oc"
parameter but without a value, and the SIP server is not experiencing
overload, it MUST assign a value of 0 to the "oc" parameter in the
response. Assigning such a value lets the client know that the
server supports overload control and is not currently experiencing
overload.

When the "oc-validity" parameter is used to signify overload control
termination (Section 5.7), the server MUST insert a value of 0 in the
"oc-validity" parameter. The server MUST insert a value of 0 in the
"oc" parameter as well.
When a client receives a response whose
"oc-validity" parameter contains a 0, it MUST treat any non-zero
value in the "oc" parameter as if it had received a value of 0 in
that parameter.

6.2. Example

Consider a SIP client, P1, which is sending requests to another
downstream SIP server, P2. The following snippets of SIP messages
demonstrate how the overload control parameters work.

   INVITE sips:user@example.com SIP/2.0
   Via: SIP/2.0/TLS p1.example.net;
     branch=z9hG4bK2d4790.1;received=192.0.2.111;oc;
     oc-algo="loss,A"
   ...

   SIP/2.0 100 Trying
   Via: SIP/2.0/TLS p1.example.net;
     branch=z9hG4bK2d4790.1;received=192.0.2.111;
     oc=0;oc-algo="loss";
   ...

In the messages above, the first message is sent by P1 to P2. This
message is a SIP request; because P1 supports overload control, it
inserts the "oc" parameter in the topmost Via header that it created.
P1 supports two overload control algorithms: loss and some algorithm
called "A".

The second message, a SIP response, shows the topmost Via header
amended by P2 according to this specification and sent to P1.
Because P2 also supports overload control, it chooses the "loss"
based scheme and sends that back to P1 in the "oc-algo" parameter.
It also sets the value of the "oc" parameter to 0.

Had P2 not supported overload control, it would have left the "oc"
and "oc-algo" parameters unchanged, thus allowing the client to know
that it did not support overload control.

At some later time, P2 starts to experience overload. It sends the
following SIP message indicating that P1 should decrease the number
of messages arriving at P2 by 20% for 1s.

   SIP/2.0 180 Ringing
   Via: SIP/2.0/TLS p1.example.net;
     branch=z9hG4bK2d4790.3;received=192.0.2.111;
     oc=20;oc-algo="loss";oc-validity=1000;
     oc-seq=1282321615.782
   ...

After 500ms, the overload condition at P2 subsides.
It then sends
out the message below to allow P1 to send all messages destined to
P2.

   SIP/2.0 183 Queued
   Via: SIP/2.0/TLS p1.example.net;
     branch=z9hG4bK2d4790.4;received=192.0.2.111;
     oc=0;oc-algo="loss";oc-validity=0;oc-seq=1282321887.783
   ...

6.3. Default algorithm for loss-based overload control

This section describes a default algorithm that a SIP client can use
to throttle SIP traffic going downstream by the percentage loss value
specified in the "oc" parameter.

The client maintains two categories of requests; the first category
will include requests that are candidates for reduction, and the
second category will include requests that are not subject to
reduction (except under extenuating circumstances when there aren't
any messages in the first category that can be reduced).
Section 5.10.1 contains normative directives on how to prioritize
messages for inclusion in the second category. The remaining
messages can be allocated to the first category.

The client determines the mix of requests falling into the first
category and those falling into the second category. For example,
40% of the requests may be eligible for reduction and 60% not
eligible (and therefore, must be sent downstream).

Under an overload condition, the client converts the value of the
"oc" parameter to a value that it applies to requests in the first
category. As a simple example, if "oc=10" and 40% of the requests
should be included in the first category, then:

   10 / 40 * 100 = 25

Or, 25% of the requests in the first category can be reduced to get
an overall reduction of 10%. The client uses random discard to
achieve the 25% reduction of messages in the first category.
Messages in the second category proceed downstream unscathed.
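The conversion just described can be sketched in a few lines of
Python. This is an illustrative fragment under stated assumptions: the
helper names category1_drop_pct and forward_request are hypothetical,
and capping the result at 100% when the first category is too small to
absorb the requested reduction is a choice of this sketch (any
remaining reduction would be left to local policy).

```python
def category1_drop_pct(oc: float, cat1_share: float) -> float:
    """Convert an overall loss percentage into a category-1 drop percentage.

    oc         -- value of the "oc" parameter, 0..100 (overall reduction)
    cat1_share -- percentage of requests eligible for reduction, 0..100
    """
    if cat1_share <= 0:
        # Nothing is eligible; report 100 only if a reduction was requested.
        return 100.0 if oc > 0 else 0.0
    # Cap at 100: category 1 cannot absorb more than its own share.
    return min(100.0, oc / cat1_share * 100.0)


def forward_request(r: int, drop_pct: float) -> bool:
    """Random-discard decision for one category-1 request.

    r is a random integer drawn uniformly from 1..100 by the caller.
    The request is forwarded only if r exceeds the drop percentage.
    """
    return r > drop_pct
```

With "oc=10" and a 40% first-category share, category1_drop_pct(10, 40)
returns 25.0, and forward_request then forwards 75 of the 100 possible
draws.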
To
effect the 25% reduction rate from the first category, the client
draws a random number between 1 and 100 for the request picked from
the first category. If the random number is less than or equal to
the converted value of the "oc" parameter, the request is not
forwarded; otherwise the request is forwarded.

A reference algorithm is shown below.

   cat1 := 40.0          // Category 1: subject to reduction
   cat2 := 100.0 - cat1  // Category 2: not subject to
                         // reduction. 40/60 mix.
   in_oc := false        // Not operating under overload

   while (true) {
     sip_msg := get_sip_msg()
     if (is_response(sip_msg)) {
       process_msg(sip_msg)
     }
     else if (is_request(sip_msg)) {

       // Determine if server wants to enter overload or is
       // in overload
       in_oc := extract_in_oc(sip_msg)

       // Get validity value
       oc_validity := extract_oc_validity(sip_msg)

       // Get oc parameter value
       oc_value := extract_oc_value(sip_msg)

       pct_to_reduce := oc_value / cat1 * 100
       // Example: if oc=10,
       // client uses 10 / 40 * 100 = 25 or 25% of messages in
       // Category 1 can be reduced.

       if (in_oc == false) {
         process_msg(sip_msg)
       }
       else {

         // Either Category 1 or Category 2
         assign_msg_to_category(sip_msg)

         if (oc_validity is in effect) {
           process_msg(get_msg_from_cat2())
           sip_msg := get_msg_from_cat1()

           // Draw a random number between 1 and 100
           r := random()

           if (r <= pct_to_reduce) {
             // Do not send to server, handle locally by
             // generating a final response
           }
           else {
             process_msg(sip_msg)
           }
         }
       }
     }
   }

Note that in the event that there are not enough messages in the
first category to reduce, the client may use local policies to target
messages in the second category.

7. Relationship with other IETF SIP load control efforts

The overload control mechanism described in this document is reactive
in nature and, apart from the message prioritization directives
listed in Section 5.10.1, the mechanisms described in this draft will
not discriminate among requests based on user identity, filtering
action, or arrival time. SIP networks that require pro-active
overload control mechanisms can upload user-level load control
filters as described in [I-D.ietf-soc-load-control-event-package].

8. Syntax

This specification extends the existing definition of the Via header
field parameters of [RFC3261] as follows:

   via-params  = via-ttl / via-maddr
                 / via-received / via-branch
                 / oc / oc-validity
                 / oc-seq / oc-algo / via-extension

   oc          = "oc" [EQUAL oc-num]
   oc-num      = 1*DIGIT
   oc-validity = "oc-validity" [EQUAL delta-ms]
   oc-seq      = "oc-seq" EQUAL 1*12DIGIT "." 1*5DIGIT
   oc-algo     = "oc-algo" EQUAL DQUOTE algo-list *(COMMA algo-list)
                 DQUOTE
   algo-list   = "loss" / *(other-algo)
   other-algo  = %x41-5A / %x61-7A / %x30-39

9. Design Considerations

This section discusses specific design considerations for the
mechanism described in this document. General design considerations
for SIP overload control can be found in [RFC6357].

9.1. SIP Mechanism

A SIP mechanism is needed to convey overload feedback from the
receiving to the sending SIP entity. A number of different
alternatives exist to implement such a mechanism.

9.1.1. SIP Response Header

Overload control information can be transmitted using a new Via
header field parameter for overload control. A SIP server can add
this header parameter to the responses it is sending upstream to
provide overload control feedback to its upstream neighbors. This
approach has the following characteristics:

o A Via header parameter is light-weight and creates very little
  overhead.
  It does not require the transmission of additional
  messages for overload control and does not increase traffic or
  processing burdens in an overload situation.
o Overload control status can frequently be reported to upstream
  neighbors since it is a part of a SIP response. This enables the
  use of this mechanism in scenarios where the overload status needs
  to be adjusted frequently. It also enables the use of overload
  control mechanisms that use regular feedback such as window-based
  overload control.
o With a Via header parameter, overload control status is inherent
  in SIP signaling and is automatically conveyed to all relevant
  upstream neighbors, i.e., neighbors that are currently
  contributing traffic. There is no need for a SIP server to
  specifically track and manage the set of current upstream or
  downstream neighbors with which it should exchange overload
  feedback.
o Overload status is not conveyed to inactive senders. This avoids
  the transmission of overload feedback to inactive senders, which
  do not contribute traffic. If an inactive sender starts to
  transmit while the receiver is in overload it will receive
  overload feedback in the first response and can adjust the amount
  of traffic forwarded accordingly.
o A SIP server can limit the distribution of overload control
  information by only inserting it into responses to known upstream
  neighbors. A SIP server can use transport level authentication
  (e.g., via TLS) with its upstream neighbors.

9.1.2. SIP Event Package

Overload control information can also be conveyed from a receiver to
a sender using a new event package. Such an event package enables a
sending entity to subscribe to the overload status of its downstream
neighbors and receive notifications of overload control status
changes in NOTIFY requests.
This approach has the following
characteristics:

o Overload control information is conveyed decoupled from SIP
  signaling. It enables an overload control manager, which is a
  separate entity, to monitor the load on other servers and provide
  overload control feedback to all SIP servers that have set up
  subscriptions with the controller.
o With an event package, a receiver can send updates to senders that
  are currently inactive. Inactive senders will receive a
  notification about the overload and can refrain from sending
  traffic to this neighbor until the overload condition is resolved.
  The receiver can also notify all potential senders once they are
  permitted to send traffic again. However, these notifications do
  generate additional traffic, which adds to the overall load.
o A SIP entity needs to set up and maintain overload control
  subscriptions with all upstream and downstream neighbors. A new
  subscription needs to be set up before/while a request is
  transmitted to a new downstream neighbor. Servers can be
  configured to subscribe at boot time. However, this would require
  additional protection to avoid the avalanche restart problem for
  overload control. Subscriptions need to be terminated when they
  are not needed any more, which can be done, for example, using a
  timeout mechanism.
o A receiver needs to send NOTIFY messages to all subscribed
  upstream neighbors in a timely manner when the control algorithm
  requires a change in the control variable (e.g., when a SIP server
  is in an overload condition). This includes active as well as
  inactive neighbors. These NOTIFYs add to the amount of traffic
  that needs to be processed. To ensure that these requests will
  not be dropped due to overload, a priority mechanism needs to be
  implemented in all servers these requests will pass through.
o As overload feedback is sent to all senders in separate messages,
  this mechanism is not suitable when frequent overload control
  feedback is needed.
o A SIP server can limit the set of senders that can receive
  overload control information by authenticating subscriptions to
  this event package.
o This approach requires each proxy to implement user agent
  functionality (UAS and UAC) to manage the subscriptions.

9.2. Backwards Compatibility

A new overload control mechanism needs to be backwards compatible so
that it can be gradually introduced into a network and functions
properly if only a fraction of the servers support it.

Hop-by-hop overload control (see [RFC6357]) has the advantage that it
does not require that all SIP entities in a network support it. It
can be used effectively between two adjacent SIP servers if both
servers support overload control and does not depend on the support
from any other server or user agent. The more SIP servers in a
network support hop-by-hop overload control, the better protected the
network is against occurrences of overload.

A SIP server may have multiple upstream neighbors of which only
some may support overload control. If a server simply used this
overload control mechanism, only those that support it would reduce
traffic. Others would keep sending at the full rate and benefit from
the throttling by the servers that support overload control. In
other words, upstream neighbors that do not support overload control
would be better off than those that do.

A SIP server should therefore use 5xx responses towards upstream
neighbors that do not support overload control. The server should
reject the same amount of requests with 5xx responses that would
otherwise be rejected or redirected by the upstream neighbor if it
supported overload control.
If the load condition on the server does
not permit the creation of 5xx responses, the server should drop all
requests from servers that do not support overload control.

10. Security Considerations

Overload control mechanisms can be used by an attacker to conduct a
denial-of-service attack on a SIP entity if the attacker can pretend
that the SIP entity is overloaded. When such a forged overload
indication is received by an upstream SIP client, it will stop
sending traffic to the victim. Thus, the victim is subject to a
denial-of-service attack.

An attacker can create forged overload feedback by inserting itself
into the communication between the victim and its upstream neighbors.
The attacker would need to add overload feedback indicating a high
load to the responses passed from the victim to its upstream
neighbor. Proxies can prevent this attack by communicating via TLS.
Since overload feedback has no meaning beyond the next hop, there is
no need to secure the communication over multiple hops.

Another way to conduct an attack is to send a message containing a
high overload feedback value through a proxy that does not support
this extension. If this feedback is added to the second Via header
(or all Via headers), it will reach the next upstream proxy. If the
attacker can make the recipient believe that the overload status was
created by its direct downstream neighbor (and not by the attacker
further downstream) the recipient stops sending traffic to the
victim. A precondition for this attack is that the victim proxy does
not support this extension since it would not pass through overload
control feedback otherwise.

A malicious SIP entity could gain an advantage by pretending to
support this specification but never reducing the amount of traffic
it forwards to the downstream neighbor.
If its downstream neighbor
receives traffic from multiple sources which correctly implement
overload control, the malicious SIP entity would benefit since all
other sources to its downstream neighbor would reduce load.

   The solution to this problem depends on the overload control
   method. For rate-based and window-based overload control, it is
   very easy for a downstream entity to monitor if the upstream
   neighbor throttles traffic forwarded as directed. For percentage
   throttling this is not always obvious since the load forwarded
   depends on the load received by the upstream neighbor.

11. IANA Considerations

This specification defines four new Via header parameters as detailed
below in the "Header Field Parameter and Parameter Values"
sub-registry as per the registry created by [RFC3968]. The required
information is:

   Header Field  Parameter Name  Predefined Values  Reference
   __________________________________________________________
   Via           oc              Yes                RFCXXXX
   Via           oc-validity     Yes                RFCXXXX
   Via           oc-seq          Yes                RFCXXXX
   Via           oc-algo         Yes                RFCXXXX

   RFC XXXX [NOTE TO RFC-EDITOR: Please replace with final RFC
   number of this specification.]

   NOTE: Do we need to do anything special to register "loss"
   as a value for "oc-algo" parameter?

12. References

12.1. Normative References

[RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
           Requirement Levels", BCP 14, RFC 2119, March 1997.

[RFC3261]  Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston,
           A., Peterson, J., Sparks, R., Handley, M., and E.
           Schooler, "SIP: Session Initiation Protocol", RFC 3261,
           June 2002.

[RFC3263]  Rosenberg, J. and H. Schulzrinne, "Session Initiation
           Protocol (SIP): Locating SIP Servers", RFC 3263,
           June 2002.
[RFC3968]  Camarillo, G., "The Internet Assigned Number Authority
           (IANA) Header Field Parameter Registry for the Session
           Initiation Protocol (SIP)", BCP 98, RFC 3968,
           December 2004.

[RFC4412]  Schulzrinne, H. and J. Polk, "Communications Resource
           Priority for the Session Initiation Protocol (SIP)",
           RFC 4412, February 2006.

12.2. Informative References

[I-D.ietf-soc-load-control-event-package]
           Shen, C., Schulzrinne, H., and A. Koike, "A Session
           Initiation Protocol (SIP) Load Control Event Package",
           draft-ietf-soc-load-control-event-package-01 (work in
           progress), July 2011.

[I-D.noel-soc-overload-rate-control]
           Noel, E., Williams, P., and J. Gunn, "Session Initiation
           Protocol (SIP) Rate Control",
           draft-noel-soc-overload-rate-control-01 (work in
           progress), September 2011.

[RFC5031]  Schulzrinne, H., "A Uniform Resource Name (URN) for
           Emergency and Other Well-Known Services", RFC 5031,
           January 2008.

[RFC5390]  Rosenberg, J., "Requirements for Management of Overload in
           the Session Initiation Protocol", RFC 5390, December 2008.

[RFC6357]  Hilt, V., Noel, E., Shen, C., and A. Abdelal, "Design
           Considerations for Session Initiation Protocol (SIP)
           Overload Control", RFC 6357, August 2011.

Appendix A. Acknowledgements

Many thanks to Bruno Chatras, Keith Drage, Janet Gunn, Rich Terpstra,
Daryl Malas, R. Parthasarathi, Antoine Roly, Jonathan Rosenberg,
Charles Shen, Rahul Srivastava, Padma Valluri, Shaun Bharrat, and
Paul Kyzivat for their contributions to this specification.

Appendix B. RFC5390 requirements

Table 1 provides a summary of how this specification fulfills the
requirements of [RFC5390]. A more detailed view of how each
requirement is fulfilled is provided after the table.
   +-------------+-------------------+
   | Requirement | Meets requirement |
   +-------------+-------------------+
   | REQ 1       | Yes               |
   | REQ 2       | Yes               |
   | REQ 3       | Partially         |
   | REQ 4       | Partially         |
   | REQ 5       | Partially         |
   | REQ 6       | Not applicable    |
   | REQ 7       | Yes               |
   | REQ 8       | Partially         |
   | REQ 9       | Yes               |
   | REQ 10      | Yes               |
   | REQ 11      | Yes               |
   | REQ 12      | Yes               |
   | REQ 13      | Yes               |
   | REQ 14      | Yes               |
   | REQ 15      | Yes               |
   | REQ 16      | Yes               |
   | REQ 17      | Partially         |
   | REQ 18      | Yes               |
   | REQ 19      | Yes               |
   | REQ 20      | Yes               |
   | REQ 21      | Yes               |
   | REQ 22      | Yes               |
   | REQ 23      | Yes               |
   +-------------+-------------------+

        Summary of meeting requirements in RFC5390

                         Table 1

REQ 1: The overload mechanism shall strive to maintain the overall
useful throughput (taking into consideration the quality-of-service
needs of the using applications) of a SIP server at reasonable
levels, even when the incoming load on the network is far in excess
of its capacity. The overall throughput under load is the ultimate
measure of the value of an overload control mechanism.

Meeting REQ 1: Yes, the overload control mechanism allows an
overloaded SIP server to maintain a reasonable level of throughput as
it enters into congestion mode by requesting the upstream clients to
reduce traffic destined downstream.

REQ 2: When a single network element fails, goes into overload, or
suffers from reduced processing capacity, the mechanism should strive
to limit the impact of this on other elements in the network. This
helps to prevent a small-scale failure from becoming a widespread
outage.

Meeting REQ 2: Yes. When a SIP server enters overload mode, it will
request the upstream clients to throttle the traffic destined to it.
As a consequence of this, the overloaded SIP server will itself
generate proportionally less downstream traffic, thereby limiting the
impact on other elements in the network.

REQ 3: The mechanism should seek to minimize the amount of
configuration required in order to work. For example, it is better
to avoid needing to configure a server with its SIP message
throughput, as these kinds of quantities are hard to determine.

Meeting REQ 3: Partially. On the server side, the overload condition
is determined by monitoring S (cf. Section 4 of [RFC6357]) and
reporting a load feedback F as the value of the "oc" parameter. On
the client side, a throttle T is applied to requests going downstream
based on F. This specification does not prescribe any value for S,
nor a particular value for F. The "oc-algo" parameter allows for
automatic convergence to a particular class of overload control
algorithm. There are suggested default values for the "oc-validity"
parameter.

REQ 4: The mechanism must be capable of dealing with elements that do
not support it, so that a network can consist of a mix of elements
that do and don't support it. In other words, the mechanism should
not work only in environments where all elements support it. It is
reasonable to assume that it works better in such environments, of
course. Ideally, there should be incremental improvements in overall
network throughput as increasing numbers of elements in the network
support the mechanism.

Meeting REQ 4: Partially. The mechanism is designed to reduce
congestion when a pair of communicating entities support it. If a
downstream overloaded SIP server does not respond to a request in
time, a SIP client conformant to this specification will attempt to
reduce traffic destined towards the non-responsive server as outlined
in Section 5.9.
REQ 5: The mechanism should not assume that it will only be deployed
in environments with completely trusted elements. It should seek to
operate as effectively as possible in environments where other
elements are malicious; this includes preventing malicious elements
from obtaining more than a fair share of service.

Meeting REQ 5: Partially. Since overload control information is
shared between a pair of communicating entities, a confidential and
authenticated channel can be used for this communication. However,
if such a channel is not available, then the security ramifications
outlined in Section 10 apply.

REQ 6: When overload is signaled by means of a specific message, the
message must clearly indicate that it is being sent because of
overload, as opposed to other, non overload-based failure conditions.
This requirement is meant to avoid some of the problems that have
arisen from the reuse of the 503 response code for multiple purposes.
Of course, overload is also signaled by lack of response to requests.
This requirement applies only to explicit overload signals.

Meeting REQ 6: Not applicable. Overload control information is
signaled as part of the Via header and not in a new header.

REQ 7: The mechanism shall provide a way for an element to throttle
the amount of traffic it receives from an upstream element. This
throttling shall be graded so that it is not all-or-nothing as with
the current 503 mechanism. This recognizes the fact that "overload"
is not a binary state and that there are degrees of overload.

Meeting REQ 7: Yes, please see Section 5.5 and Section 5.10.

REQ 8: The mechanism shall ensure that, when a request was not
processed successfully due to overload (or failure) of a downstream
element, the request will not be retried on another element that is
also overloaded or whose status is unknown.
This requirement derives from REQ 1.

Meeting REQ 8: Partially. A SIP client that has overload information
from multiple downstream servers will not retry the request on
another element. However, if a SIP client does not know the overload
status of a downstream server, it may send the request to that
server.

REQ 9: That a request has been rejected from an overloaded element
shall not unduly restrict the ability of that request to be submitted
to and processed by an element that is not overloaded. This
requirement derives from REQ 1.

Meeting REQ 9: Yes, a SIP client conformant to this specification
will send the request to a different element.

REQ 10: The mechanism should support servers that receive requests
from a large number of different upstream elements, where the set of
upstream elements is not enumerable.

Meeting REQ 10: Yes, there are no constraints on the number of
upstream clients.

REQ 11: The mechanism should support servers that receive requests
from a finite set of upstream elements, where the set of upstream
elements is enumerable.

Meeting REQ 11: Yes, there are no constraints on the number of
upstream clients.

REQ 12: The mechanism should work between servers in different
domains.

Meeting REQ 12: Yes, there are no inherent limitations on using
overload control between domains.

REQ 13: The mechanism must not dictate a specific algorithm for
prioritizing the processing of work within a proxy during times of
overload. It must permit a proxy to prioritize requests based on any
local policy, so that certain ones (such as a call for emergency
services or a call with a specific value of the Resource-Priority
header field [RFC4412]) are given preferential treatment, such as not
being dropped, being given additional retransmission, or being
processed ahead of others.
Meeting REQ 13: Yes, please see Section 5.10.

REQ 14: The mechanism should provide unambiguous directions
to clients on when they should retry a request and when they should
not. This especially applies to TCP connection establishment and SIP
registrations, in order to mitigate against avalanche restart.

Meeting REQ 14: Yes, Section 5.9 provides normative behavior on when
to retry a request after repeated timeouts and fatal transport errors
resulting from communications with a non-responsive downstream SIP
server.

REQ 15: In cases where a network element fails, is so overloaded that
it cannot process messages, or cannot communicate due to a network
failure or network partition, it will not be able to provide explicit
indications of the nature of the failure or its levels of congestion.
The mechanism must properly function in these cases.

Meeting REQ 15: Yes, Section 5.9 provides normative behavior on when
to retry a request after repeated timeouts and fatal transport errors
resulting from communications with a non-responsive downstream SIP
server.

REQ 16: The mechanism should attempt to minimize the overhead of the
overload control messaging.

Meeting REQ 16: Yes, overload control messages are sent in the
topmost Via header, which is always processed by the SIP elements.

REQ 17: The overload mechanism must not provide an avenue for
malicious attack, including DoS and DDoS attacks.

Meeting REQ 17: Partially. Since overload control information is
shared between a pair of communicating entities, a confidential and
authenticated channel can be used for this communication. However,
if such a channel is not available, then the security ramifications
outlined in Section 10 apply.
REQ 18: The overload mechanism should be unambiguous about whether a
load indication applies to a specific IP address, host, or URI, so
that an upstream element can determine the load of the entity to
which a request is to be sent.

Meeting REQ 18: Yes, please see discussion in Section 5.5.

REQ 19: The specification for the overload mechanism should give
guidance on which message types might be desirable to process over
others during times of overload, based on SIP-specific
considerations. For example, it may be more beneficial to process a
SUBSCRIBE refresh with Expires of zero than a SUBSCRIBE refresh with
a non-zero expiration (since the former reduces the overall amount of
load on the element), or to process re-INVITEs over new INVITEs.

Meeting REQ 19: Yes, please see Section 5.10.

REQ 20: In a mixed environment of elements that do and do not
implement the overload mechanism, no disproportionate benefit shall
accrue to the users or operators of the elements that do not
implement the mechanism.

Meeting REQ 20: Yes, an element that does not implement overload
control does not receive any measure of extra benefit.

REQ 21: The overload mechanism should ensure that the system remains
stable. When the offered load drops from above the overall capacity
of the network to below the overall capacity, the throughput should
stabilize and become equal to the offered load.

Meeting REQ 21: Yes, the overload control mechanism described in this
draft ensures the stability of the system.

REQ 22: It must be possible to disable the reporting of load
information towards upstream targets based on the identity of those
targets. This allows a domain administrator who considers the load
of their elements to be sensitive information, to restrict access to
that information.
Of course, in such cases, there is no expectation
that the overload mechanism itself will help prevent overload from
that upstream target.

Meeting REQ 22: Yes, an operator of a SIP server can configure the
SIP server to only report overload control information for requests
received over a confidential channel, for example. However, note
that this requirement is in conflict with REQ 3, as it introduces a
modicum of extra configuration.

REQ 23: It must be possible for the overload mechanism to work in
cases where there is a load balancer in front of a farm of proxies.

Meeting REQ 23: Yes. Depending on the type of load balancer, this
requirement is met. A load balancer fronting a farm of SIP proxies
could be a SIP-aware load balancer or one that is not SIP-aware. If
the load balancer is SIP-aware, it can make conscious decisions on
throttling outgoing traffic towards the individual server in the farm
based on the overload control parameters returned by the server. On
the other hand, if the load balancer is not SIP-aware, then there are
other strategies to perform overload control. Section 6 of [RFC6357]
documents some of these strategies in more detail (see discussion
related to Figure 3(a) in Section 6).

Authors' Addresses

Vijay K. Gurbani (editor)
Bell Laboratories, Alcatel-Lucent
1960 Lucent Lane, Rm 9C-533
Naperville, IL 60563
USA

Email: vkg@bell-labs.com

Volker Hilt
Bell Labs/Alcatel-Lucent
791 Holmdel-Keyport Rd
Holmdel, NJ 07733
USA

Email: volkerh@bell-labs.com

Henning Schulzrinne
Columbia University/Department of Computer Science
450 Computer Science Building
New York, NY 10027
USA

Phone: +1 212 939 7004
Email: hgs@cs.columbia.edu
URI: http://www.cs.columbia.edu