SOC Working Group                                        V. Gurbani, Ed.
Internet-Draft                         Bell Laboratories, Alcatel-Lucent
Intended status: Standards Track                                 V. Hilt
Expires: May 23, 2011                           Bell Labs/Alcatel-Lucent
                                                          H. Schulzrinne
                                                     Columbia University
                                                       November 19, 2010


           Session Initiation Protocol (SIP) Overload Control
                   draft-ietf-soc-overload-control-00

Abstract

   Overload occurs in Session Initiation Protocol (SIP) networks when
   SIP servers have insufficient resources to handle all SIP messages
   they receive. Even though the SIP protocol provides a limited
   overload control mechanism through its 503 (Service Unavailable)
   response code, SIP servers are still vulnerable to overload. This
   document defines an overload control mechanism for SIP.

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on May 23, 2011.

Copyright Notice

   Copyright (c) 2010 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the BSD License.

Table of Contents

   1.  Introduction
   2.  Terminology
   3.  Overview of operations
   4.  Via Header Parameters for Overload Control
     4.1.  The 'oc' Parameter
     4.2.  Creating the Overload Control Parameters
     4.3.  Determining the 'oc' Parameter Value
     4.4.  Processing the Overload Control Parameters
     4.5.  Using the Overload Control Parameter Values
     4.6.  Forwarding the overload control parameters
     4.7.  Self-Limiting
   5.  Responding to an Overload Indication
     5.1.  Message prioritization at the hop before the
           overloaded server
     5.2.  Rejecting requests at an overloaded server
   6.  Syntax
   7.  Design Considerations
     7.1.  SIP Mechanism
       7.1.1.  SIP Response Header
       7.1.2.  SIP Event Package
     7.2.  Backwards Compatibility
   8.  Security Considerations
   9.  IANA Considerations
   10. References
     10.1.  Normative References
     10.2.  Informative References
   Appendix A.  Acknowledgements
   Authors' Addresses

1.  Introduction

   As with any network element, a Session Initiation Protocol (SIP)
   [RFC3261] server can suffer from overload when the number of SIP
   messages it receives exceeds the number of messages it can process.
   Overload can pose a serious problem for a network of SIP servers.
   During periods of overload, the throughput of a network of SIP
   servers can be significantly degraded. In fact, overload may lead to
   a situation in which the throughput drops down to a small fraction of
   the original processing capacity. This is often called congestion
   collapse.

   Overload is said to occur if a SIP server does not have sufficient
   resources to process all incoming SIP messages. These resources may
   include CPU processing capacity, memory, network bandwidth,
   input/output, or disk resources.

   For overload control, we only consider failure cases where SIP
   servers are unable to process all SIP requests due to resource
   constraints. There are other cases where a SIP server can
   successfully process incoming requests but has to reject them due to
   failure conditions unrelated to the SIP server being overloaded. For
   example, a PSTN gateway that runs out of trunk lines but still has
   plenty of capacity to process SIP messages should reject incoming
   INVITEs using a 488 (Not Acceptable Here) response [RFC4412].
   Similarly, a SIP registrar that has lost connectivity to its
   registration database but is still capable of processing SIP requests
   should reject REGISTER requests with a 500 (Server Error) response
   [RFC3261]. Overload control does not apply to these cases and SIP
   provides appropriate response codes for them.

   The SIP protocol provides a limited mechanism for overload control
   through its 503 (Service Unavailable) response code. However, this
   mechanism cannot prevent overload of a SIP server and it cannot
   prevent congestion collapse. In fact, the use of the 503 (Service
   Unavailable) response code may cause traffic to oscillate and to
   shift between SIP servers and thereby worsen an overload condition.
   A detailed discussion of the SIP overload problem, the problems with
   the 503 (Service Unavailable) response code and the requirements for
   a SIP overload control mechanism can be found in [RFC5390].

2.  Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

3.  Overview of operations

   This section gives an overview of how the overload control mechanism
   operates by introducing the overload control parameters. Section 4
   provides more details and normative behavior on the parameters listed
   below.

   Because overload control is best performed hop-by-hop, the Via
   parameter is attractive since it allows two adjacent SIP entities to
   indicate support for, and exchange information associated with,
   overload control. Additional advantages of this choice are discussed
   in Section 7.1.1. An alternative mechanism using SIP event packages
   was also considered, and the characteristics of that choice are
   further outlined in Section 7.1.2.

   This document defines three new parameters for the SIP Via header for
   overload control. These parameters provide a SIP mechanism for
   conveying overload control information between adjacent SIP
   entities. These parameters are:

   1.  oc: This parameter serves a dual purpose; when inserted by a SIP
       entity in a request going downstream, the parameter indicates
       that the SIP entity supports overload control. When the
       downstream SIP server sends a response, the downstream SIP server
       will add a value to the parameter that indicates a loss rate (in
       percentage) by which the requests arriving at the downstream SIP
       server should be reduced (cf. Section 4.2, Section 4.3,
       Section 4.4 and Section 4.5).
   2.  oc-validity: Inserted by the SIP entity sending a response
       upstream. This parameter contains a value that indicates the
       time (in milliseconds) that the load reduction specified by the
       "oc" parameter should be in effect (cf. Section 4.2).
   3.  oc-seq: Inserted by the SIP entity sending a response upstream.
       This parameter contains a value that indicates the sequence
       number associated with the "oc" parameter defined above (cf.
       Section 4.2).

   Consider a SIP entity, P1, which is sending requests to another
   downstream SIP server, P2. The following snippets of SIP messages
   demonstrate how the overload control parameters work.

      INVITE sips:user@example.com SIP/2.0
      Via: SIP/2.0/TLS p1.example.net;
        branch=z9hG4bK2d4790.1;received=192.0.2.111;oc
      ...

      SIP/2.0 100 Trying
      Via: SIP/2.0/TLS p1.example.net;
        branch=z9hG4bK2d4790.1;received=192.0.2.111;
        oc=20;oc-validity=500;oc-seq=1282321615.781
      ...

   In the messages above, the first message is sent by P1 to P2. This
   message is a SIP request; because P1 supports overload control, it
   inserts the "oc" parameter in the topmost Via header that it created.

   The second message --- a SIP response --- shows the topmost Via
   header amended by P2 according to this specification and sent to P1.
   Because P2 also supports overload control, it sends back further
   overload control parameters towards P1 requesting that P1 reduce the
   incoming traffic by 20% for 500 ms. P2 updates the "oc" parameter to
   add a value and inserts the remaining two parameters, "oc-validity"
   and "oc-seq".

4.  Via Header Parameters for Overload Control

4.1.  The 'oc' Parameter

   A SIP entity that supports this specification MUST add an "oc"
   parameter to the Via headers it inserts into SIP requests. This
   provides an indication to downstream neighbors that this server
   supports overload control. When inserted into a request by a SIP
   entity to indicate support for overload control, there MUST NOT be a
   value associated with the parameter.

4.2.  Creating the Overload Control Parameters

   A SIP server can provide overload control feedback to its upstream
   neighbors by providing a value for the "oc" parameter in the topmost
   Via header field of a SIP response. The topmost Via header is
   determined after the SIP server has removed its own Via header; i.e.,
   it is the Via header that was generated by the upstream neighbor.

   Since the topmost Via header of a response will be removed by an
   upstream neighbor after processing it, overload control feedback
   contained in the "oc" parameter will not travel beyond the upstream
   SIP entity. A Via header parameter therefore provides hop-by-hop
   semantics for overload control feedback (see
   [I-D.ietf-soc-overload-design]) even if the next hop neighbor does
   not support this specification.

   The "oc" parameter can be used in all response types, including
   provisional, success and failure responses. A SIP server MAY update
   the "oc" parameter in all responses it sends. A SIP server MUST
   update the "oc" parameter in responses when the transmission of
   overload control feedback is required by the overload control
   algorithm to limit the traffic received by the server. I.e., a SIP
   server MUST update the "oc" parameter when the overload control
   algorithm sets the value of an "oc" parameter to a value different
   than the default value.

   A SIP server that has updated the "oc" parameter in a Via header
   SHOULD also add an "oc_validity" parameter to the same Via header.
   The "oc_validity" parameter defines the time in milliseconds during
   which the content (i.e., the overload control feedback) of the "oc"
   parameter is valid. The default value of the "oc_validity" parameter
   is 500 (milliseconds). A SIP server SHOULD use a shorter
   "oc_validity" time if its overload status varies quickly and MAY use
   a longer "oc_validity" time if this status is more stable. If the
   "oc_validity" parameter is not present, its default value is used.
   The "oc_validity" parameter MUST NOT be used in a Via header that did
   not originally contain an "oc" parameter when received. Furthermore,
   when a SIP server receives a request with the topmost Via header
   containing only an "oc-validity" parameter without the accompanying
   "oc" parameter, it MUST ignore the "oc-validity" parameter.

   When a SIP server retransmits a response, it SHOULD use the "oc"
   parameter value and "oc-validity" parameter value consistent with the
   overload state at the time the retransmitted response is sent. This
   implies that the values in the "oc" and "oc-validity" parameters may
   be different than the ones used in previous retransmissions of the
   response. Because responses sent over UDP may be subject to delays
   in the network and arrive out of order, the "oc-seq" parameter aids
   in detecting a stale "oc" parameter value.

   Implementations that are capable of updating the "oc" and
   "oc-validity" parameter values for retransmissions MUST insert the
   "oc-seq" parameter. The value of this parameter MUST be a set of
   numbers drawn from an increasing sequence.

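   The role of "oc-seq" in detecting stale feedback can be illustrated
   with a minimal sketch. It is not part of this specification. The
   class names OcSeqGenerator and OcSeqFilter are hypothetical, and two
   points are assumptions: that the sender derives values from
   wall-clock time (suggested only by the shape of the example value
   1282321615.781), and that the receiver orders values as decimal
   numbers.

```python
import time
from decimal import Decimal


class OcSeqGenerator:
    """Sender side: produce strictly increasing oc-seq strings.

    Assumption: values are derived from wall-clock time in seconds with
    three fractional digits; the draft only requires an increasing sequence.
    """

    def __init__(self, clock=time.time):
        self._clock = clock
        self._last = Decimal(0)

    def next(self):
        candidate = Decimal(f"{self._clock():.3f}")
        # Guard against a clock that stalls or steps backwards.
        if candidate <= self._last:
            candidate = self._last + Decimal("0.001")
        self._last = candidate
        return str(candidate)


class OcSeqFilter:
    """Receiver side: remember the highest oc-seq seen from one server."""

    def __init__(self):
        self._latest = None

    def is_fresh(self, oc_seq):
        """Return True if oc_seq is newer than anything seen so far."""
        value = Decimal(oc_seq)
        if self._latest is not None and value <= self._latest:
            return False  # stale or duplicate feedback; caller should ignore it
        self._latest = value
        return True
```

   An upstream entity would keep one such filter per downstream server
   and apply the "oc" value of a response only when is_fresh() returns
   True, so that a delayed UDP response cannot overwrite newer feedback.
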
   Implementations that are not capable of updating the "oc" and
   "oc-validity" parameter values for retransmissions --- or
   implementations that do not want to do so because they will have to
   regenerate the message to be retransmitted --- MUST still insert an
   "oc-seq" parameter in the first response associated with a
   transaction; however, they do not have to update the value in
   subsequent retransmissions.

   The "oc_validity" and "oc-seq" Via header parameters are only defined
   in SIP responses and MUST NOT be used in SIP requests. These
   parameters are only useful to the upstream neighbor of a SIP server
   (i.e., the entity that is sending requests to the SIP server) since
   this is the entity that can offload traffic by redirecting/rejecting
   new requests. If requests are forwarded in both directions between
   two SIP servers (i.e., the roles of upstream/downstream neighbors
   change), there are also responses flowing in both directions. Thus,
   both SIP servers can exchange overload information.

   Since overload control protects a SIP server from overload, it is
   RECOMMENDED that a SIP server use the mechanisms described in this
   specification. However, if a SIP server wants to limit its overload
   control capability for privacy reasons, it MAY decide to perform
   overload control only for requests that are received on a secure
   transport channel, such as TLS. This enables a SIP server to protect
   overload control information and ensure that it is only visible to
   trusted parties.

4.3.  Determining the 'oc' Parameter Value

   The value of the "oc" parameter is determined by an overload control
   algorithm (see [I-D.ietf-soc-overload-design]). This specification
   does not mandate the use of a specific overload control algorithm.
   However, the output of an overload control algorithm MUST be
   compliant with the semantics of this Via header parameter.

   The "oc" parameter value specifies the percentage by which the load
   forwarded to this SIP server should be reduced. Possible values
   range from 0 (the traffic forwarded is reduced by 0%, i.e., all
   traffic is forwarded) to 100 (the traffic forwarded is reduced by
   100%, i.e., no traffic forwarded). The default value of this
   parameter is 0.

      OPEN ISSUE 1: The "oc" parameter value specified in this document
      is defined to contain a loss rate. However, other types of
      overload control feedback exist, for example, a target rate for
      rate-based overload control or message confirmations and
      window-size for window-based overload control.

      While it would in theory be possible to allow multiple types of
      overload control feedback to co-exist (e.g., by using different
      parameters for the different feedback types) it is very
      problematic for interoperability purposes and would require SIP
      servers to implement multiple overload control mechanisms.

4.4.  Processing the Overload Control Parameters

   A SIP entity compliant with this specification SHOULD remove "oc",
   "oc_validity" and "oc-seq" parameters from all Via headers of a
   response received, except for the topmost Via header. This prevents
   overload control parameters that were accidentally or maliciously
   inserted into Via headers by a downstream SIP server from traveling
   upstream.

   A SIP entity maintains the "oc" parameter values received along with
   the address and port number of the SIP servers from which they were
   received for the duration specified in the "oc_validity" parameter or
   the default duration. Each time a SIP entity receives a response
   with an "oc" parameter from a downstream SIP server, it overwrites
   the "oc" value it has currently stored for this server with the new
   value received. The SIP entity restarts the validity period of an
   "oc" parameter each time a response with an "oc" parameter is
   received from this server. A stored "oc" parameter value MUST be
   discarded once it has reached the end of its validity.

4.5.  Using the Overload Control Parameter Values

   A SIP entity compliant with this specification MUST honor overload
   control values it receives from downstream neighbors. The SIP entity
   MUST NOT forward more requests to a SIP server than allowed by the
   current "oc" parameter value from a particular downstream server.

   When forwarding a SIP request, a SIP entity uses the SIP procedures
   of [RFC3263] to determine the next hop SIP server. The procedures of
   [RFC3263] take as input a SIP URI, extract the domain portion of that
   URI for use as a lookup key, and query the Domain Name Service (DNS)
   to obtain an ordered set of one or more IP addresses with a port
   number and transport corresponding to each IP address in this set
   (the "Expected Output").

   After selecting a specific SIP server from the Expected Output, the
   SIP entity MUST determine if it already has overload control
   parameter values for the server chosen from the Expected Output. If
   the SIP entity has a non-expired "oc" parameter value for the server
   chosen from the Expected Output, then this chosen server is operating
   in overload control mode. Thus, the SIP entity MUST determine if it
   can or cannot forward the current request to the SIP server depending
   on the nature of the request and the prevailing overload conditions.

   The particular algorithm used to determine whether or not to forward
   a particular SIP request is a matter of local policy, and may take
   into account a variety of prioritization factors. However, this
   local policy SHOULD generate the same number and rate of SIP requests
   as the default algorithm (to be determined), which treats all
   requests as equal.

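   The bookkeeping of Section 4.4 and a per-request forwarding decision
   can be sketched as follows. This is an illustration under stated
   assumptions, not the default algorithm, which this document leaves
   to be determined. The class OverloadState and its methods are
   hypothetical names. The forwarding decision uses a plain random
   draw against the loss percentage, a simple approach that treats all
   requests as equal and may not suffice for all cases.

```python
import random
import time

DEFAULT_VALIDITY_MS = 500  # default validity period given in Section 4.2


class OverloadState:
    """Stored "oc" values per downstream server, with expiry (Section 4.4)."""

    def __init__(self, clock=time.monotonic, rng=random.random):
        self._clock = clock  # injectable for testing
        self._rng = rng      # returns a float in [0, 1)
        self._entries = {}   # (address, port) -> (oc_percent, expires_at)

    def update(self, server, oc, validity_ms=DEFAULT_VALIDITY_MS):
        """Overwrite the stored value and restart its validity period."""
        if not 0 <= oc <= 100:
            raise ValueError("oc must be between 0 and 100")
        self._entries[server] = (oc, self._clock() + validity_ms / 1000.0)

    def current_oc(self, server):
        """Return the non-expired oc value, or the default of 0."""
        entry = self._entries.get(server)
        if entry is None:
            return 0
        oc, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[server]  # discard at the end of validity
            return 0
        return oc

    def may_forward(self, server):
        """Assumed policy: drop each request with probability oc/100."""
        return self._rng() * 100 >= self.current_oc(server)
```

   With a stored value of 20, roughly one request in five is held back
   on average; with 100, none are forwarded until the value expires or
   is overwritten by newer feedback.
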
   In the absence of a different local policy, the SIP entity SHOULD use
   the following default algorithm to determine if it can forward the
   request downstream (TODO: Need to devise an algorithm. The original
   simple algorithm based on random number generation does not suffice
   for all cases.)

4.6.  Forwarding the overload control parameters

   A SIP entity MAY forward the content of an "oc" parameter it has
   received from a downstream neighbor on to its upstream neighbor.
   However, forwarding the content of the "oc" parameter is generally
   NOT RECOMMENDED and should only be performed if permitted by the
   configuration of SIP servers. For example, a SIP server that only
   relays messages between exactly two SIP servers may forward an "oc"
   parameter. The "oc" parameter is forwarded by copying it from the
   Via in which it was received into the next Via header (i.e., the Via
   header that will be on top after processing the response). If an
   "oc_validity" parameter is present, it MUST be copied along with the
   "oc" parameter.

4.7.  Self-Limiting

   In some cases, a SIP entity may not receive a response from a
   downstream server after sending a request. RFC 3261 [RFC3261]
   specifies that when a timeout error is received from the transaction
   layer, it MUST be treated as if a 408 (Request Timeout) status code
   has been received. If a fatal transport error is reported by the
   transport layer, it MUST be treated as a 503 (Service Unavailable)
   status code.

   In the event of repeated timeouts or fatal transport errors, the SIP
   entity MUST stop sending requests to this server. The SIP entity
   SHOULD occasionally forward a single request to probe if the
   downstream server is alive. Once a SIP entity has successfully
   transmitted a request to the downstream server, the SIP entity can
   resume normal traffic rates. It should, of course, honor any "oc"
   parameters it may receive subsequent to resuming normal traffic
   rates.

      OPEN ISSUE 2: If a downstream neighbor does not respond to a
      request at all, the upstream SIP entity will stop sending requests
      to the downstream neighbor. The upstream SIP entity will
      periodically forward a single request to probe the health of its
      downstream neighbor. It has been suggested --- see
      http://www.ietf.org/mail-archive/web/sip-overload/current/
      msg00229.html --- that we have a notification mechanism in place
      for the downstream neighbor to signal to the upstream SIP entity
      that it is ready to receive requests. This notification scheme
      has advantages, but comes with obvious disadvantages as well.
      This needs more discussion.

5.  Responding to an Overload Indication

   A SIP entity can receive overload control feedback indicating that it
   needs to reduce the traffic it sends to its downstream server. The
   entity can accomplish this task by sending some of the requests that
   would have gone to the overloaded element to a different destination.
   It needs to ensure, however, that this destination is not in overload
   and is capable of processing the extra load. An entity can also
   buffer requests in the hope that the overload condition will resolve
   quickly and the requests still can be forwarded in time. In many
   cases, however, it will need to reject these requests.

5.1.  Message prioritization at the hop before the overloaded server

   During an overload condition, a SIP entity needs to prioritize
   requests and select those requests that need to be rejected or
   redirected. While this selection is largely a matter of local
   policy, certain heuristics can be suggested. First, during overload
   control, the SIP entity should preserve existing dialogs as much as
   possible. This suggests that mid-dialog requests MAY be given
   preferential treatment. Similarly, requests that result in releasing
   resources (such as a BYE) MAY also be given preferential treatment.

   A SIP entity SHOULD honor the local policy for prioritizing SIP
   requests such as policies based on the content of the
   Resource-Priority header (RPH, RFC 4412 [RFC4412]). Specific
   (namespace.value) RPH contents may indicate high priority requests
   that should be preserved as much as possible during overload. The
   RPH contents can also indicate a low-priority request that is
   eligible to be dropped during times of overload. Other indicators,
   such as the SOS URN [RFC5031] indicating an emergency request, may
   also be used for prioritization.

   Local policy could also include giving precedence to mid-dialog SIP
   requests (re-INVITEs, UPDATEs, BYEs etc.) in times of overload. A
   local policy can be expected to combine both the SIP request type and
   the prioritization markings, and SHOULD be honored when overload
   conditions prevail.

5.2.  Rejecting requests at an overloaded server

   If the upstream SIP entity to the overloaded server does not support
   overload control, it will continue to direct requests to the
   overloaded server. Thus, the overloaded server must bear the cost of
   rejecting some session requests as well as the cost of processing
   other requests to completion. It would be fair to devote the same
   amount of processing at the overloaded server to the combination of
   rejection and processing as the overloaded server would devote to
   processing requests from an upstream SIP entity that supported
   overload control. This is to ensure that SIP servers that do not
   support this specification don't receive an unfair advantage over
   those that do.

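   The fairness goal above can be sketched as a small admission check
   at the overloaded server. This is an illustration only. The helper
   names via_params, supports_overload_control and admit are
   hypothetical, the Via parsing is deliberately naive (it splits on
   ";" and ignores quoted strings), and rejecting with probability
   oc/100 is an assumed policy rather than one defined here. A request
   that is not admitted would be answered with a 503 (Service
   Unavailable) response.

```python
import random


def via_params(via_value):
    """Split one Via header value into (head, {param-name: value-or-None})."""
    head, *rest = [part.strip() for part in via_value.split(";")]
    params = {}
    for part in rest:
        if not part:
            continue
        name, sep, value = part.partition("=")
        params[name.strip().lower()] = value.strip() if sep else None
    return head, params


def supports_overload_control(via_value):
    """An upstream neighbor signals support by adding a bare "oc" parameter."""
    return "oc" in via_params(via_value)[1]


def admit(via_value, current_oc, rng=random.random):
    """Decide whether the overloaded server processes this request.

    Neighbors that support overload control throttle themselves, so their
    requests are admitted. Requests from other neighbors are rejected with
    probability current_oc/100 (assumed policy) to keep treatment fair.
    """
    if supports_overload_control(via_value):
        return True
    return rng() * 100 >= current_oc
```

   For example, with current_oc set to 20 a request whose topmost Via
   is "SIP/2.0/TLS p1.example.net;branch=z9hG4bK2d4790.1" would be
   rejected about one time in five, while the same request carrying
   ";oc" would always be admitted.
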
477 A SIP server that is under overload and has started to throttle 478 incoming traffic MUST reject this request with a "503 (Service 479 Unavailable)" response without Retry-After header to reject a 480 fraction of requests from upstream neighbors that do not support 481 overload control. 483 6. Syntax 485 This section defines the syntax of new Via header parameters: "oc", 486 "oc_validity", and "oc-seq". 488 The "oc" Via header parameter, when it has a value, MUST restrain 489 that value to a number between 0 and 100. This value describes the 490 percentage by which the traffic (SIP requests) to the SIP server from 491 which the response has been received should be reduced. The default 492 value for this parameter is 0. 494 The "oc_validity" Via header parameter contains the time during which 495 the corresponding "oc" Via header parameter is valid. The 496 "oc_validity" parameter can only be present in a Via header in 497 conjunction with an "oc" parameter. 499 The "oc-seq" Via header parameter contains a sequence number. Those 500 implementations that are capable of providing finer-grained overload 501 control information may do so, however, each response that contains 502 the updated overload control information MUST have an increasing 503 value in this parameter. This is to allow the upstream server to 504 properly order out-of-order responses that contain overload control 505 information. 507 This specification extends the existing definition of the Via header 508 field parameters of [RFC3261] as follows: 510 via-params = via-ttl / via-maddr 511 / via-received / via-branch 512 / oc / oc-validity 513 / oc-seq / via-extension 515 oc = "oc" [EQUAL 0-100] 517 oc-validity = "oc_validity" [EQUAL delta-ms] 519 oc-seq = (1*12DIGIT "." 1*5DIGIT) 521 Example: 523 Via: SIP/2.0/TCP ss1.atlanta.example.com:5060 524 ;branch=z9hG4bK2d4790.1 525 ;received=192.0.2.111 526 ;oc=20;oc_validity=500;oc-seq=1282321615.641 528 7. 
Design Considerations 530 This section discusses specific design considerations for the 531 mechanism described in this document. General design considerations 532 for SIP overload control can be found in 533 [I-D.ietf-soc-overload-design]. 535 7.1. SIP Mechanism 537 A SIP mechanism is needed to convey overload feedback from the 538 receiving to the sending SIP entity. A number of different 539 alternatives exist to implement such a mechanism. 541 7.1.1. SIP Response Header 543 Overload control information can be transmitted using a new Via 544 header field parameter for overload control. A SIP server can add 545 this header parameter to the responses it is sending upstream to 546 provide overload control feedback to its upstream neighbors. This 547 approach has the following characteristics: 549 o A Via header parameter is light-weight and creates very little 550 overhead. It does not require the transmission of additional 551 messages for overload control and does not increase traffic or 552 processing burdens in an overload situation. 553 o Overload control status can frequently be reported to upstream 554 neighbors since it is a part of a SIP response. This enables the 555 use of this mechanism in scenarios where the overload status needs 556 to be adjusted frequently. It also enables the use of overload 557 control mechanisms that use regular feedback such as window-based 558 overload control. 559 o With a Via header parameter, overload control status is inherent 560 in SIP signaling and is automatically conveyed to all relevant 561 upstream neighbors, i.e., neighbors that are currently 562 contributing traffic. There is no need for a SIP server to 563 specifically track and manage the set of current upstream or 564 downstream neighbors with which it should exchange overload 565 feedback. 566 o Overload status is not conveyed to inactive senders. This avoids 567 the transmission of overload feedback to inactive senders, which 568 do not contribute traffic. 
If an inactive sender starts to 569 transmit while the receiver is in overload it will receive 570 overload feedback in the first response and can adjust the amount 571 of traffic forwarded accordingly. 572 o A SIP server can limit the distribution of overload control 573 information by only inserting it into responses to known upstream 574 neighbors. A SIP server can use transport level authentication 575 (e.g., via TLS) with its upstream neighbors. 577 7.1.2. SIP Event Package 579 Overload control information can also be conveyed from a receiver to 580 a sender using a new event package. Such an event package enables a 581 sending entity to subscribe to the overload status of its downstream 582 neighbors and receive notifications of overload control status 583 changes in NOTIFY requests. This approach has the following 584 characteristics: 586 o Overload control information is conveyed decoupled from SIP 587 signaling. It enables an overload control manager, which is a 588 separate entity, to monitor the load on other servers and provide 589 overload control feedback to all SIP servers that have set up 590 subscriptions with the controller. 591 o With an event package, a receiver can send updates to senders that 592 are currently inactive. Inactive senders will receive a 593 notification about the overload and can refrain from sending 594 traffic to this neighbor until the overload condition is resolved. 595 The receiver can also notify all potential senders once they are 596 permitted to send traffic again. However, these notifications do 597 generate additional traffic, which adds to the overall load. 598 o A SIP entity needs to set up and maintain overload control 599 subscriptions with all upstream and downstream neighbors. A new 600 subscription needs to be set up before/while a request is 601 transmitted to a new downstream neighbor. Servers can be 602 configured to subscribe at boot time. 
      However, this would require
      additional protection to avoid the avalanche restart problem for
      overload control. Subscriptions need to be terminated when they
      are no longer needed, which can be done, for example, using a
      timeout mechanism.
   o  A receiver needs to send NOTIFY messages to all subscribed
      upstream neighbors in a timely manner when the control algorithm
      requires a change in the control variable (e.g., when a SIP server
      is in an overload condition). This includes active as well as
      inactive neighbors. These NOTIFYs add to the amount of traffic
      that needs to be processed. To ensure that these requests will
      not be dropped due to overload, a priority mechanism needs to be
      implemented in all servers these requests will pass through.
   o  As overload feedback is sent to all senders in separate messages,
      this mechanism is not suitable when frequent overload control
      feedback is needed.
   o  A SIP server can limit the set of senders that can receive
      overload control information by authenticating subscriptions to
      this event package.
   o  This approach requires each proxy to implement user agent
      functionality (UAS and UAC) to manage the subscriptions.

7.2. Backwards Compatibility

   A new overload control mechanism needs to be backwards compatible so
   that it can be gradually introduced into a network and functions
   properly if only a fraction of the servers support it.

   Hop-by-hop overload control (see [I-D.ietf-soc-overload-design]) has
   the advantage that it does not require that all SIP entities in a
   network support it. It can be used effectively between two adjacent
   SIP servers if both servers support overload control and does not
   depend on the support from any other server or user agent. The more
   SIP servers in a network support hop-by-hop overload control, the
   better protected the network is against occurrences of overload.
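
   Because the mechanism operates hop by hop, a server can decide for
   each upstream neighbor how load is to be shed. The Python sketch
   below is illustrative only and is not part of this specification.
   It assumes that a neighbor signals support by placing an "oc"
   parameter in the topmost Via of its request, that feedback is a
   percentage from 0 to 100, and that neighbors without support are
   answered with 503 for the same share of requests. The names
   handle_request, top_via_params, and shed_percent are hypothetical.

```python
import random


def handle_request(top_via_params, shed_percent, rng=random.random):
    """Return (action, feedback) for one request from an upstream neighbor.

    top_via_params: dict of parameters from the request's topmost Via
                    (hypothetical, already parsed).
    shed_percent:   integer 0-100, share of traffic the local control
                    algorithm wants removed.
    """
    # Assumption: presence of "oc" in the topmost Via signals support.
    if "oc" in top_via_params:
        # The neighbor throttles itself, so accept the request and
        # return the feedback value to place in the response.
        return ("process", {"oc": shed_percent})
    # The neighbor will not throttle, so shed the same share locally.
    if rng() * 100 < shed_percent:
        return ("reject_503", {})
    return ("process", {})
```

   Passing the random source as a parameter keeps the decision
   reproducible in tests; a deployment would likely replace the coin
   flip with whatever admission logic the server already uses.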

   A SIP server may have multiple upstream neighbors, of which only
   some may support overload control. If a server simply used this
   overload control mechanism, only those that support it would reduce
   traffic. Others would keep sending at the full rate and benefit from
   the throttling by the servers that support overload control. In
   other words, upstream neighbors that do not support overload control
   would be better off than those that do.

   A SIP server should therefore use 5xx responses towards upstream
   neighbors that do not support overload control. The server should
   reject the same number of requests with 5xx responses that would
   otherwise be rejected/redirected by the upstream neighbor if it
   supported overload control. If the load condition on the server does
   not permit the creation of 5xx responses, the server should drop all
   requests from servers that do not support overload control.

8. Security Considerations

   Overload control mechanisms can be used by an attacker to conduct a
   denial-of-service attack on a SIP entity if the attacker can pretend
   that the SIP entity is overloaded. When such a forged overload
   indication is received by an upstream SIP entity, it will stop
   sending traffic to the victim. Thus, the victim is subject to a
   denial-of-service attack.

   An attacker can create forged overload feedback by inserting itself
   into the communication between the victim and its upstream neighbors.
   The attacker would need to add overload feedback indicating a high
   load to the responses passed from the victim to its upstream
   neighbor. Proxies can prevent this attack by communicating via TLS.
   Since overload feedback has no meaning beyond the next hop, there is
   no need to secure the communication over multiple hops.
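
   Since feedback is meaningful only on the hop where it was generated,
   a receiver of a response might read it from the topmost Via alone
   and disregard values carried in lower Via headers. The Python
   sketch below is illustrative only. It assumes the parameter is
   named "oc" and carries an integer from 0 to 100; the function name
   next_hop_feedback is hypothetical, and the simple split on ";" does
   not handle quoted parameter values.

```python
def next_hop_feedback(via_values):
    """Return the overload value from the topmost Via, or None.

    via_values: list of Via header field values of a response, in
                order, topmost first (hypothetical input form).
    """
    if not via_values:
        return None
    # Only the first (topmost) Via belongs to this hop; lower Via
    # headers are ignored on purpose.
    params = via_values[0].split(";")[1:]
    for param in params:
        name, _, value = param.strip().partition("=")
        # Assumption: "oc" with an integer value in the range 0-100.
        if name.lower() == "oc" and value.isdigit() and int(value) <= 100:
            return int(value)
    return None
```

   Reading only the topmost Via does not by itself authenticate the
   value; it merely keeps feedback from other hops out of the local
   control algorithm.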

   Another way to conduct an attack is to send a message containing a
   high overload feedback value through a proxy that does not support
   this extension. If this feedback is added to the second Via header
   (or all Via headers), it will reach the next upstream proxy. If the
   attacker can make the recipient believe that the overload status was
   created by its direct downstream neighbor (and not by the attacker
   further downstream), the recipient stops sending traffic to the
   victim. A precondition for this attack is that the victim proxy does
   not support this extension since it would not pass through overload
   control feedback otherwise.

   A malicious SIP entity could gain an advantage by pretending to
   support this specification but never reducing the amount of traffic
   it forwards to the downstream neighbor. If its downstream neighbor
   receives traffic from multiple sources which correctly implement
   overload control, the malicious SIP entity would benefit since all
   other sources to its downstream neighbor would reduce load.

      The solution to this problem depends on the overload control
      method. For rate-based and window-based overload control, it is
      very easy for a downstream entity to monitor if the upstream
      neighbor throttles traffic forwarded as directed. For percentage
      throttling this is not always obvious since the load forwarded
      depends on the load received by the upstream neighbor.

9. IANA Considerations

   [TBD.]

10. References

10.1. Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC3261]  Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston,
              A., Peterson, J., Sparks, R., Handley, M., and E.
              Schooler, "SIP: Session Initiation Protocol", RFC 3261,
              June 2002.

   [RFC3263]  Rosenberg, J. and H.
              Schulzrinne, "Session Initiation
              Protocol (SIP): Locating SIP Servers", RFC 3263,
              June 2002.

   [RFC4412]  Schulzrinne, H. and J. Polk, "Communications Resource
              Priority for the Session Initiation Protocol (SIP)",
              RFC 4412, February 2006.

10.2. Informative References

   [I-D.ietf-soc-overload-design]
              Hilt, V., Noel, E., Shen, C., and A. Abdelal, "Design
              Considerations for Session Initiation Protocol (SIP)
              Overload Control", draft-ietf-soc-overload-design-01 (work
              in progress), August 2010.

   [RFC5031]  Schulzrinne, H., "A Uniform Resource Name (URN) for
              Emergency and Other Well-Known Services", RFC 5031,
              January 2008.

   [RFC5390]  Rosenberg, J., "Requirements for Management of Overload in
              the Session Initiation Protocol", RFC 5390, December 2008.

Appendix A. Acknowledgements

   Many thanks to Rich Terpstra, Daryl Malas, Jonathan Rosenberg,
   Charles Shen, Padma Valluri, Janet Gunn, Shaun Bharrat, and Paul
   Kyzivat for their contributions to this specification.

Authors' Addresses

   Vijay K. Gurbani (editor)
   Bell Laboratories, Alcatel-Lucent
   1960 Lucent Lane, Rm 9C-533
   Naperville, IL 60563
   USA

   Email: vkg@bell-labs.com

   Volker Hilt
   Bell Labs/Alcatel-Lucent
   791 Holmdel-Keyport Rd
   Holmdel, NJ 07733
   USA

   Email: volkerh@bell-labs.com

   Henning Schulzrinne
   Columbia University/Department of Computer Science
   450 Computer Science Building
   New York, NY 10027
   USA

   Phone: +1 212 939 7004
   Email: hgs@cs.columbia.edu
   URI:   http://www.cs.columbia.edu