SIPPING Working Group                                       V. Hilt (Ed.)
Internet-Draft                                   Bell Labs/Alcatel-Lucent
Intended status: Informational                              March 7, 2009
Expires: September 8, 2009


    Design Considerations for Session Initiation Protocol (SIP) Overload
                                  Control
                    draft-ietf-sipping-overload-design-01

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on September 8, 2009.

Copyright Notice

   Copyright (c) 2009 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents in effect on the date of
   publication of this document (http://trustee.ietf.org/license-info).
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.

Abstract

   Overload occurs in Session Initiation Protocol (SIP) networks when
   SIP servers have insufficient resources to handle all SIP messages
   they receive.
   Even though the SIP protocol provides a limited
   overload control mechanism through its 503 (Service Unavailable)
   response code, SIP servers are still vulnerable to overload.  This
   document discusses models and design considerations for a SIP
   overload control mechanism.

Table of Contents

   1.  Introduction
   2.  SIP Overload Problem
   3.  Explicit vs. Implicit Overload Control
   4.  System Model
   5.  Degree of Cooperation
     5.1.  Hop-by-Hop
     5.2.  End-to-End
     5.3.  Local Overload Control
   6.  Topologies
   7.  Fairness
   8.  Performance Metrics
   9.  Explicit Overload Control Feedback
     9.1.  Rate-based Overload Control
     9.2.  Loss-based Overload Control
     9.3.  Window-based Overload Control
     9.4.  Overload Signal-based Overload Control
     9.5.  On-/Off Overload Control
   10. Implicit Overload Control
   11. Overload Control Algorithms
   12. Message Prioritization
   13. Security Considerations
   14. IANA Considerations
   15. Informative References
   Appendix A.  Contributors
   Author's Address

1.  Introduction

   As with any network element, a Session Initiation Protocol (SIP)
   [RFC3261] server can suffer from overload when the number of SIP
   messages it receives exceeds the number of messages it can process.
   Overload occurs if a SIP server does not have sufficient resources to
   process all incoming SIP messages.  These resources may include CPU,
   memory, network bandwidth, input/output, or disk resources.

   Overload can pose a serious problem for a network of SIP servers.
   During periods of overload, the throughput of a network of SIP
   servers can be significantly degraded.  In fact, overload may lead to
   a situation in which the throughput drops down to a small fraction of
   the original processing capacity.  This is often called congestion
   collapse.

   An overload control mechanism enables a SIP server to perform close
   to its capacity limit during times of overload.  Overload control is
   used by a SIP server if it is unable to process all SIP requests due
   to resource constraints.  There are other failure cases in which a
   SIP server can successfully process incoming requests but has to
   reject them for other reasons.  For example, a PSTN gateway that runs
   out of trunk lines but still has plenty of capacity to process SIP
   messages should reject incoming INVITEs using a 488 (Not Acceptable
   Here) response [RFC4412].  Similarly, a SIP registrar that has lost
   connectivity to its registration database but is still capable of
   processing SIP messages should reject REGISTER requests with a 500
   (Server Error) response [RFC3261].  Overload control mechanisms do
   not apply in these cases and SIP provides appropriate response codes
   for them.

   The SIP protocol provides a limited mechanism for overload control
   through its 503 (Service Unavailable) response code and the Retry-
   After header.
   However, this mechanism cannot prevent overload of a
   SIP server and it cannot prevent congestion collapse.  In fact, it
   may cause traffic to oscillate and to shift between SIP servers and
   thereby worsen an overload condition.  A detailed discussion of the
   SIP overload problem, the problems with the 503 (Service Unavailable)
   response code and the Retry-After header, and the requirements for a
   SIP overload control mechanism can be found in [RFC5390].

   This document discusses the models, assumptions and design
   considerations for a SIP overload control mechanism.  The document is
   a product of the SIP overload control design team.

2.  SIP Overload Problem

   A key contributor to the SIP congestion collapse [RFC5390] is the
   regenerative behavior of overload in the SIP protocol.  When SIP is
   running over UDP, it will retransmit messages that were
   dropped by a SIP server due to overload and thereby increase the
   offered load for the already overloaded server.  This increase in
   load worsens the severity of the overload condition and, in turn,
   causes more messages to be dropped.  A congestion collapse can occur
   [Noel et al.], [Shen et al.], [Hilt et al.].

   Regenerative behavior under overload should ideally be avoided by any
   protocol as this would lead to stable operation under overload.
   However, this is often difficult to achieve in practice.  For
   example, changing the SIP retransmission timer mechanisms can reduce
   the degree of regeneration during overload but will impact the
   ability of SIP to recover from message losses.  Without any
   retransmissions, each message that is dropped due to SIP server
   overload will eventually lead to a failed call.

   For a SIP INVITE transaction to be successful, a minimum of three
   messages need to be forwarded by a SIP server.  Often an INVITE
   transaction consists of five or more SIP messages.
   If a SIP server
   under overload randomly discards messages without evaluating them,
   the chances that all messages belonging to a transaction are
   successfully forwarded will decrease as the load increases.  Thus,
   the number of transactions that complete successfully will decrease
   even if the message throughput of a server remains high and the
   overload behavior is fully non-regenerative.  A SIP server might
   (partially) parse incoming messages to determine whether they are new
   requests or messages belonging to an existing transaction.  However,
   after having spent resources on parsing a SIP message, discarding
   this message is expensive as the resources already spent are lost.
   The number of successful transactions will therefore decline with an
   increase in load as fewer and fewer resources can be spent on
   forwarding messages and more and more resources are consumed by
   inspecting messages that will eventually be dropped.  The slope of
   the decline depends on the amount of resources spent to inspect each
   message.

   Another key challenge for SIP overload control is that the rate of
   the true traffic sources usually cannot be controlled.  Overload is
   often caused by a large number of UAs, each of which creates only a
   single message.  These UAs cannot be rate controlled as they only
   send one message.  However, the sum of their traffic can overload a
   SIP server.

3.  Explicit vs. Implicit Overload Control

   The main difference between explicit and implicit overload control
   is the way overload is signaled from a SIP server that is reaching
   overload condition to its upstream neighbors.

   In an explicit overload control mechanism, a SIP server uses an
   explicit overload signal to indicate that it is reaching its capacity
   limit.
   Upstream neighbors receiving this signal can adjust their
   transmission rate as indicated by the overload signal to a level that
   is acceptable to the downstream server.  The overload signal enables
   a SIP server to steer the load it is receiving to a rate at which it
   can perform at maximum capacity.

   Implicit overload control uses the absence of responses and packet
   loss as an indication of overload.  A SIP server that is sensing such
   a condition reduces the load it is forwarding to a downstream
   neighbor.  Since there is no explicit overload signal, this mechanism
   is robust as it does not depend on actions taken by the SIP server
   running into overload.

   The ideas of explicit and implicit overload control are in fact
   complementary.  By considering implicit overload indications, a
   server can avoid overloading an unresponsive downstream neighbor.  An
   explicit overload signal enables a SIP server to actively steer the
   incoming load to a desired level.

4.  System Model

   The model shown in Figure 1 identifies fundamental components of an
   explicit SIP overload control mechanism:

   SIP Processor:  The SIP Processor processes SIP messages and is the
      component that is protected by overload control.

   Monitor:  The Monitor measures the current load of the SIP processor
      on the receiving entity.  It implements the mechanisms needed to
      determine the current usage of resources relevant for the SIP
      processor and reports load samples (S) to the Control Function.

   Control Function:  The Control Function implements the overload
      control algorithm.  The control function uses the load samples (S)
      and determines if overload has occurred and a throttle (T) needs
      to be set to adjust the load sent to the SIP processor on the
      receiving entity.  The control function on the receiving entity
      sends load feedback (F) to the sending entity.
   Actuator:  The Actuator implements the algorithms needed to act on
      the throttles (T) and ensures that the amount of traffic forwarded
      to the receiving entity meets the criteria of the throttle.  For
      example, a throttle may instruct the Actuator to not forward more
      than 100 INVITE messages per second.  The Actuator implements the
      algorithms to achieve this objective, e.g., using message gapping.
      It also implements algorithms to select the messages that will be
      affected and to determine whether they are rejected or redirected.

   The type of feedback (F) conveyed from the receiving to the sending
   entity depends on the overload control method used (i.e., loss-based,
   rate-based, window-based or signal-based overload control; see
   Section 9), the overload control algorithm (see Section 11), as well
   as other design parameters.  The feedback (F) enables the sending
   entity to adjust the amount of traffic forwarded to the receiving
   entity to a level that is acceptable to the receiving entity without
   causing overload.

   Figure 1 depicts a general system model for overload control.  In
   this diagram, one instance of the control function is on the sending
   entity (i.e., associated with the actuator) and one is on the
   receiving entity (i.e., associated with the monitor).  However, a
   specific mechanism may not require both elements.  In this case, one
   of the two control function elements can be empty and simply passes
   along feedback.  E.g., if (F) is defined as a loss-rate (e.g., reduce
   traffic by 10%), there is no need for a control function on the
   sending entity as the content of (F) can be copied directly into (T).

   The model in Figure 1 shows a scenario with one sending and one
   receiving entity.  In a more realistic scenario, a receiving entity
   will receive traffic from multiple sending entities and vice versa
   (see Section 6).
   The feedback generated by a Monitor will therefore
   often be distributed across multiple Actuators.  A Monitor needs to
   be able to split the load it can process across multiple sending
   entities and generate feedback that correctly adjusts the load each
   sending entity is allowed to send.  Similarly, an Actuator needs to
   be prepared to receive different levels of feedback from different
   receiving entities and throttle traffic to these entities
   accordingly.

          Sending                Receiving
           Entity                  Entity
      +----------------+     +----------------+
      |    Server A    |     |    Server B    |
      |  +----------+  |     |  +----------+  |  -+
      |  | Control  |  |  F  |  | Control  |  |   |
      |  | Function |<-+-----+--| Function |  |   |
      |  +----------+  |     |  +----------+  |   |
      |     T |        |     |       ^        |   | Overload
      |       v        |     |       | S      |   | Control
      |  +----------+  |     |  +----------+  |   |
      |  | Actuator |  |     |  | Monitor  |  |   |
      |  +----------+  |     |  +----------+  |   |
      |       |        |     |       ^        |  -+
      |       v        |     |       |        |  -+
      |  +----------+  |     |  +----------+  |   |
    <-+--|   SIP    |  |     |  |   SIP    |  |   | SIP
    --+->|Processor |--+-----+->|Processor |--+-> | System
      |  +----------+  |     |  +----------+  |   |
      +----------------+     +----------------+  -+

        Figure 1: System Model for Explicit Overload Control

5.  Degree of Cooperation

   A SIP request is usually processed by more than one SIP server on its
   path to the destination.  Thus, a design choice for an explicit
   overload control mechanism is where to place the components of
   overload control along the path of a request and, in particular,
   where to place the Monitor and Actuator.  This design choice
   determines the degree of cooperation between the SIP servers on the
   path.  Overload control can be implemented hop-by-hop with the
   Monitor on one server and the Actuator on its direct upstream
   neighbor.  Overload control can be implemented end-to-end with
   Monitors on all SIP servers along the path of a request and an
   Actuator on the sender.
   In this case, the Control Functions
   associated with each Monitor have to cooperate to jointly determine
   the overall feedback for this path.  Finally, overload control can be
   implemented locally on a SIP server if Monitor and Actuator reside on
   the same server.  In this case, the sending entity and receiving
   entity are the same SIP server and Actuator and Monitor operate on
   the same SIP processor (although the Actuator typically operates on
   a pre-processing stage in local overload control).  Local overload
   control is an internal overload control mechanism as the control loop
   is implemented internally on one server.  Hop-by-hop and end-to-end
   are external overload control mechanisms.  All three configurations
   are shown in Figure 2.

                +---------+               +------(+)---------+
    +------+    |         |               |       ^          |
    |      |    |    +---+                |       |     +---+
    v      |    v //=>| C |               v       | //=>| C |
  +---+   +---+  //   +---+             +---+   +---+ //  +---+
  | A |===>| B |                        | A |===>| B |
  +---+   +---+  \\   +---+             +---+   +---+ \\  +---+
            ^    \\=>| D |                ^       | \\=>| D |
            |         +---+               |       |     +---+
            |           |                 |       v       |
            +-----------+                 +------(+)------+

       (a) hop-by-hop                      (b) end-to-end

                     +-+
                     v |
      +-+     +-+    +---+
      v |     v | //=>| C |
     +---+   +---+ // +---+
     | A |===>| B |
     +---+   +---+ \\ +---+
               \\=>| D |
                    +---+
                     ^ |
                     +-+

          (c) local

      ==> SIP request flow
      <-- Overload feedback loop

        Figure 2: Degree of Cooperation between Servers

5.1.  Hop-by-Hop

   The idea of hop-by-hop overload control is to instantiate a separate
   control loop between all neighboring SIP servers that directly
   exchange traffic.  I.e., the Actuator is located on the SIP server
   that is the direct upstream neighbor of the SIP server that has the
   corresponding Monitor.  Each control loop between two servers is
   completely independent of the control loops between other servers
   further up- or downstream.
   In the example in Figure 2(a), three
   independent overload control loops are instantiated: A - B, B - C and
   B - D.  Each loop only controls a single hop.  Overload feedback
   received from a downstream neighbor is not forwarded further
   upstream.  Instead, a SIP server acts on this feedback, for example,
   by rejecting SIP messages if needed.  If the upstream neighbor of a
   server also becomes overloaded, it will report this problem to its
   upstream neighbors, which again take action based on the reported
   feedback.  Thus, in hop-by-hop overload control, overload is always
   resolved by the direct upstream neighbors of the overloaded server
   without the need to involve entities that are located multiple SIP
   hops away.

   Hop-by-hop overload control reduces the impact of overload on a SIP
   network and can avoid congestion collapse.  It is simple and scales
   well to networks with many SIP entities.  An advantage is that it
   does not require feedback to be transmitted across multiple hops,
   possibly crossing multiple trust domains.  Feedback is sent to the
   next hop only.  Furthermore, it does not require a SIP entity to
   aggregate a large number of overload status values or keep track of
   the overload status of SIP servers it is not communicating with.

5.2.  End-to-End

   End-to-end overload control implements an overload control loop along
   the entire path of a SIP request, from UAC to UAS.  An end-to-end
   overload control mechanism consolidates overload information from all
   SIP servers on the way (including all proxies and the UAS) and uses
   this information to throttle traffic as far upstream as possible.  An
   end-to-end overload control mechanism has to be able to frequently
   collect the overload status of all servers on the potential path(s)
   to a destination and combine this data into meaningful overload
   feedback.
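   As an illustration of the combining step, the following sketch (an
   assumption for illustration only; no specific feedback format is
   mandated by this document) merges loss-rate style feedback from all
   servers on one path into a single end-to-end value.  Each server is
   assumed to report the fraction of requests it wants reduced, and a
   request survives the path only if every server admits it, so the
   admission probabilities multiply:

```python
# Illustrative sketch: combining per-server loss-rate feedback along a
# path into one end-to-end loss fraction.  The function name and the
# loss-rate feedback format are assumptions, not part of any
# standardized mechanism.

def combined_loss(per_server_loss):
    """Return the overall loss fraction for a path.

    per_server_loss holds the reduction fraction requested by each
    server on the path (e.g., 0.1 for "reduce traffic by 10%").
    """
    admit = 1.0
    for loss in per_server_loss:
        admit *= (1.0 - loss)  # request must be admitted by every hop
    return 1.0 - admit

# Path A -> B -> D where B asks for a 10% reduction and D for 20%:
print(round(combined_loss([0.0, 0.1, 0.2]), 2))  # 0.28
```

   Even this simple combination presumes that the path is known in
   advance, which, as discussed above, is often not the case.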
   A UA or SIP server only throttles requests if it knows that these
   requests will eventually be forwarded to an overloaded server.  For
   example, if D is overloaded in Figure 2(b), A should only throttle
   requests it forwards to B when it knows that they will be forwarded
   to D.  It should not throttle requests that will eventually be
   forwarded to C, since server C is not overloaded.  In many cases, it
   is difficult for A to determine which requests will be routed to C
   and D since this depends on the local routing decision made by B.
   These routing decisions can be highly variable and, for example,
   depend on call routing policies configured by the user, services
   invoked on a call, load balancing policies, etc.  The fact that a
   previous message to a target has been routed through an overloaded
   server does not necessarily mean the next message to this target will
   also be routed through the same server.

   The main problem of end-to-end overload control is its inherent
   complexity, since a UAC or SIP server needs to monitor all potential
   paths to a destination in order to determine which requests should be
   throttled and which requests may be sent.  Even if this information
   is available, it is not clear which path a specific request will
   take.

   A variant of end-to-end overload control is to implement a control
   loop between a set of well-known SIP servers along the path of a SIP
   request.  For example, an overload control loop can be instantiated
   between a server and its only downstream neighbor or among a set of
   closely coupled SIP servers.  A control loop spanning multiple hops
   can be used if the sending entity has full knowledge about the SIP
   servers on the path of a SIP message.

   A key difference to transport protocols using end-to-end congestion
   control such as TCP is that the traffic exchanged between SIP servers
   consists of many individual SIP messages.
   Each of these SIP messages
   has its own source and destination.  Even SIP messages containing
   identical SIP URIs (e.g., a SUBSCRIBE and an INVITE message to the
   same SIP URI) can be routed to different destinations.  This is
   different from TCP, which controls a stream of packets between a
   single source and a single destination.

5.3.  Local Overload Control

   The idea of local overload control (see Figure 2(c)) is to run the
   Monitor and Actuator on the same server.  This enables the server to
   monitor the current resource usage and to reject messages that can't
   be processed without overusing the local resources.  The fundamental
   assumption behind local overload control is that it is less resource
   consuming for a server to reject messages than to process them.  A
   server can therefore reject the excess messages it cannot process to
   stop all retransmissions of these messages.

   Local overload control can be used in conjunction with other
   overload control mechanisms and provides an additional layer of
   protection against overload.  It is fully implemented on the local
   server and does not require any cooperation from upstream neighbors.
   In general, SIP servers should apply implicit or explicit overload
   control techniques to control load before a local overload control
   mechanism is activated as a mechanism of last resort.

6.  Topologies

   The following topologies describe four generic SIP server
   configurations.  These topologies illustrate specific challenges for
   an overload control mechanism.  An actual SIP server topology is
   likely to consist of combinations of these generic scenarios.

   In the "load balancer" configuration shown in Figure 3(a), a set of
   SIP servers (D, E and F) receives traffic from a single source A.  A
   load balancer is a typical example of such a configuration.
   In this
   configuration, overload control needs to prevent server A (i.e., the
   load balancer) from sending too much traffic to any of its downstream
   neighbors D, E and F.  If one of the downstream neighbors becomes
   overloaded, A can direct traffic to the servers that still have
   capacity.  If one of the servers serves as a backup, it can be
   activated once one of the primary servers reaches overload.

   If A can reliably determine that D, E and F are its only downstream
   neighbors and all of them are in overload, it may choose to report
   overload upstream on behalf of D, E and F.  However, if the set of
   downstream neighbors is not fixed or only some of them are in
   overload, then A should not activate overload control since A can
   still forward the requests destined to non-overloaded downstream
   neighbors.  These requests would be throttled as well if A were to
   use overload control towards its upstream neighbors.

   In the "multiple sources" configuration shown in Figure 3(b), a SIP
   server D receives traffic from multiple upstream sources A, B and C.
   Each of these sources can contribute a different amount of traffic,
   which can vary over time.  The set of active upstream neighbors of D
   can change as servers may become inactive and previously inactive
   servers may start contributing traffic to D.

   If D becomes overloaded, it needs to generate feedback to reduce the
   amount of traffic it receives from its upstream neighbors.  D needs
   to decide by how much each upstream neighbor should reduce traffic.
   This decision can require the consideration of the amount of traffic
   sent by each upstream neighbor, and it may need to be re-adjusted as
   the traffic contributed by each upstream neighbor varies over time.
   Server D can use a local fairness policy to determine how much
   traffic it accepts from each upstream neighbor.
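   One possible local fairness policy is to split D's capacity among
   its currently active upstream neighbors in proportion to the traffic
   each of them has recently offered.  The following sketch is
   illustrative only; the function name, the proportional policy, and
   the rate-style feedback are assumptions, as this document does not
   prescribe a specific policy:

```python
# Hypothetical sketch of a proportional fairness policy for the
# "multiple sources" scenario.  All names are illustrative.

def assign_rate_caps(capacity, offered):
    """Map each upstream neighbor to a rate cap (requests/sec).

    'offered' maps each active upstream neighbor to its measured
    arrival rate.  The caps must be recomputed whenever the set of
    neighbors or their offered traffic changes.
    """
    total = sum(offered.values())
    if total <= capacity:
        # No overload: every neighbor may keep its current rate.
        return dict(offered)
    scale = capacity / total
    return {nbr: rate * scale for nbr, rate in offered.items()}

# D has capacity 100 req/s; A, B and C jointly offer 200 req/s:
caps = assign_rate_caps(100.0, {"A": 80.0, "B": 40.0, "C": 80.0})
print(caps)  # {'A': 40.0, 'B': 20.0, 'C': 40.0}
```

   A service provider could equally well plug in a different policy
   here, e.g., weights derived from service level agreements, which
   corresponds to the customized fairness discussed in Section 7.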
   In many configurations, SIP servers form a "mesh" as shown in
   Figure 3(c).  Here, multiple upstream servers A, B and C forward
   traffic to multiple alternative servers D and E.  This configuration
   is a combination of the "load balancer" and "multiple sources"
   scenarios.

              +---+                   +---+
          /-->| D |                   | A |-\
         /    +---+                   +---+  \
        /                                     \    +---+
   +---+-/    +---+                   +---+    \-->|   |
   | A |----->| E |                   | B |------->| D |
   +---+-\    +---+                   +---+    /-->|   |
        \                                     /    +---+
         \    +---+                   +---+  /
          \-->| F |                   | C |-/
              +---+                   +---+

     (a) load balancer                (b) multiple sources

   +---+
   | A |---\                           a--\
   +---+-\  \---->+---+                    \
          \/----->| D |                b--\ \--->+---+
   +---+--/\  /-->+---+                    \---->|   |
   | B |    \/                         c-------->| D |
   +---+---\/\--->+---+                          |   |
           /\---->| E |                ...  /--->+---+
   +---+--/  /--->+---+                    /
   | C |-----/                         z--/
   +---+

        (c) mesh                       (d) edge proxy

                        Figure 3: Topologies

   Overload control that is based on reducing the number of messages a
   sender is allowed to send is not suited for servers that receive
   requests from a very large population of senders, each of which only
   infrequently sends a request.  This scenario is shown in Figure 3(d).
   An edge proxy that is connected to many UAs is a typical example of
   such a configuration.

   Since each UA typically only contributes a few requests, which are
   often related to the same call, it can't decrease its message rate to
   resolve the overload.  In such a configuration, a SIP server can
   resort to local overload control by rejecting a percentage of the
   requests it receives with 503 (Service Unavailable) responses.  Since
   there are many upstream neighbors that contribute to the overall
   load, sending 503 (Service Unavailable) to a fraction of them can
   gradually reduce load without entirely stopping all incoming traffic.
   The Retry-After header can be used in 503 (Service Unavailable)
   responses to ask UAs to wait a given number of seconds before trying
   the call again.  Using 503 (Service Unavailable) towards individual
   sources cannot, however, prevent overload if a large number of users
   place calls at the same time.

      Note: The requirements of the "edge proxy" topology are different
      from the ones of the other topologies, which may require a
      different method for overload control.

7.  Fairness

   There are many different ways to define fairness if a SIP server has
   multiple upstream neighbors.  In the context of SIP server overload,
   it is helpful to describe two categories of fairness criteria: basic
   fairness and customized fairness.  With basic fairness, a SIP server
   treats all end-users equally and ensures that each end-user has the
   same chance of accessing the server resources.  With customized
   fairness, the server allocates resources according to different
   priorities.  An example application of the basic fairness criterion
   is the "Third caller receives free tickets" scenario, where each end-
   user should have an equal success probability in making calls through
   an overloaded SIP server, regardless of which service provider he/she
   is subscribing to.  An example of customized fairness would be a
   server that gives different resource allocations to its upstream
   neighbors (e.g., service providers) as defined in service level
   agreements.

8.  Performance Metrics

   The performance of an overload control mechanism can be measured
   using different metrics.

   A key performance indicator is the goodput of a SIP server during
   overload.  Ideally, a SIP server is enabled to perform at its
   capacity limit during periods of overload.
   E.g., if a SIP server has
   a processing capacity of 140 INVITE transactions per second, then an
   overload control mechanism should enable it to handle 140 INVITEs per
   second even if the offered load is much higher.  The delay introduced
   by a SIP server is another important indicator.  An overload control
   mechanism should ensure that the delay encountered by a SIP message
   is not increased significantly during periods of overload.

   Reactiveness and stability are other important performance
   indicators.  An overload control mechanism should quickly react to an
   overload occurrence and ensure that a SIP server does not become
   overloaded even during sudden peaks of load.  Similarly, an overload
   control mechanism should quickly remove all throttles if the overload
   disappears.  Stability is another important criterion, as using an
   overload control mechanism should not lead to the oscillation of load
   on a SIP server.  The performance of SIP overload control mechanisms
   is discussed in [Noel et al.], [Shen et al.] and [Hilt et al.].

   In addition to the above metrics, there are other indicators that are
   relevant for the evaluation of an overload control mechanism:

   Fairness:  Which types of fairness does the overload control
      mechanism implement?

   Self-limiting:  Is the overload control self-limiting if a SIP server
      becomes unresponsive?

   Changes in neighbor set:  How does the mechanism adapt to a changing
      set of sending entities?

   Data points to monitor:  Which data points does an overload control
      mechanism need to monitor?

   Tuning requirements:  Does the algorithm work out of the box or is
      parameter tweaking required?

   TBD: A discussion of these metrics for the following overload
   control mechanisms is needed.

9.  Explicit Overload Control Feedback

   Explicit overload control feedback enables a receiver to indicate how
   much traffic it wants to receive.
Explicit overload control mechanisms can be differentiated based on the type of information conveyed in the overload control feedback. Another way to classify explicit overload control mechanisms is by whether the control function resides in the receiving or the sending entity (receiver-based vs. sender-based overload control).

9.1.  Rate-based Overload Control

The key idea of rate-based overload control is to limit the request rate at which an upstream element is allowed to forward traffic to the downstream neighbor. If overload occurs, a SIP server instructs each upstream neighbor to send at most X requests per second. Each upstream neighbor can be assigned a different rate cap.

An example algorithm for the Actuator in a sending entity to implement a rate cap is request gapping. After transmitting a request to a downstream neighbor, a server waits 1/X seconds before it transmits the next request to the same neighbor. Requests that arrive during the waiting period are not forwarded and are either redirected, rejected or buffered.

The rate cap ensures that the number of requests received by a SIP server never exceeds the sum of all rate caps granted to upstream neighbors. Rate-based overload control protects a SIP server against overload even during load spikes, assuming no new upstream neighbors start sending traffic. New upstream neighbors need to be accounted for in all rate caps currently assigned to upstream neighbors. The current overall rate cap of a SIP server is determined by an overload control algorithm, e.g., based on system load.

Rate-based overload control requires a SIP server to assign a rate cap to each of its upstream neighbors while it is activated. Effectively, a server needs to assign a share of its overall capacity to each upstream neighbor.
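The request gapping behavior described above can be sketched as follows. This is a minimal illustration, not part of any specified SIP mechanism; the class name, clock source, and the decision of what to do with refused requests are assumptions of this sketch:

```python
import time

class RequestGapper:
    """Enforce a rate cap of max_rate requests/second towards one
    downstream neighbor by spacing transmissions 1/max_rate apart."""

    def __init__(self, max_rate):
        self.gap = 1.0 / max_rate   # minimum spacing in seconds
        self.next_allowed = 0.0     # earliest time the next request may go

    def try_forward(self, now=None):
        """Return True if a request may be forwarded now.  Requests
        arriving inside the gap are refused; the caller must then
        redirect, reject or buffer them."""
        now = time.monotonic() if now is None else now
        if now >= self.next_allowed:
            self.next_allowed = now + self.gap
            return True
        return False
```

With a rate cap of X = 10 requests per second, try_forward permits at most one request per 100 ms window; whether refused requests are redirected, rejected or buffered is left to the sending entity, as in the text above.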
A server needs to ensure that the sum of all rate caps assigned to upstream neighbors is not (significantly) higher than its actual processing capacity. This requires a SIP server to keep track of the set of upstream neighbors and to adjust the rate caps if a new upstream neighbor appears or an existing neighbor stops transmitting. For example, if the capacity of the server is X and the server is receiving traffic from two upstream neighbors, it can assign a rate cap of X/2 to each of them. If a third sender appears, the rate cap for each sender is lowered to X/3. If the rate caps assigned to upstream neighbors are too high, a server may still experience overload. If the caps are too low, the upstream neighbors will reject requests even though they could be processed by the server.

One approach to estimating a rate cap for each upstream neighbor is to use a fixed proportion of a control variable X, where X is initially equal to the capacity of the SIP server. The server then increases or decreases X until the workload arrival rate matches the actual server capacity. Usually, this means that the sum of the rate caps sent out by the server (i.e., X) exceeds its actual capacity, but it enables upstream neighbors that are not generating more than their fair share of the work to remain effectively unrestricted. With this approach, the server only has to measure the aggregate arrival rate. However, since the overall rate cap is usually higher than the actual capacity, brief periods of overload may occur.

9.2.  Loss-based Overload Control

A loss percentage enables a SIP server to ask an upstream neighbor to reduce the number of requests it would normally forward to this server by a percentage X. For example, a SIP server can ask an upstream neighbor to reduce the number of requests this neighbor would normally send by 10%.
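Such a percentage reduction can be implemented in the sending entity with a per-request random draw, as sketched below. The function name and the use of a pseudo-random number generator are illustrative assumptions of this sketch, not part of any specified mechanism:

```python
import random

def should_forward(loss_percent, rng=random):
    """Decide whether to forward a single request under a loss
    percentage.  Draw a number between 1 and 100; the request is
    throttled (rejected or redirected) if the draw is less than or
    equal to loss_percent, so on average loss_percent of requests
    are not forwarded."""
    return rng.randint(1, 100) > loss_percent
```

For the 10% example above, should_forward(10) forwards roughly 90% of the requests; should_forward(0) forwards everything and should_forward(100) forwards nothing.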
The upstream neighbor then redirects or rejects X percent of the traffic destined for this server.

An algorithm for the sending entity to implement a loss percentage is to draw a random number between 1 and 100 for each request to be forwarded. The request is not forwarded to the server if the random number is less than or equal to X.

An advantage of loss-based overload control is that the receiving entity does not need to track the set of upstream neighbors or the request rate it receives from each upstream neighbor. It is sufficient to monitor the overall system utilization. To reduce load, a server can ask its upstream neighbors to lower the traffic they forward by a certain percentage. The server calculates this percentage by combining the loss percentage that is currently in use (i.e., the loss percentage the upstream neighbors are currently applying when forwarding traffic), the current system utilization and the desired system utilization. For example, if the server load approaches 90% and the current loss percentage is set to a 50% traffic reduction, then the server can decide to increase the loss percentage to 55% in order to get to a system utilization of 80%. Similarly, the server can lower the loss percentage if permitted by the system utilization.

Loss-based overload control requires that the throttle percentage be adjusted to the current overall number of requests received by the server. This is particularly important if the number of requests received fluctuates quickly. For example, if a SIP server sets a throttle value of 10% at time t1 and the number of requests increases by 20% between time t1 and t2 (t1