SOC Working Group                                                V. Hilt
Internet-Draft                                  Bell Labs/Alcatel-Lucent
Intended status: Informational                                   E. Noel
Expires: February 14, 2011                                     AT&T Labs
                                                                 C. Shen
                                                     Columbia University
                                                              A. Abdelal
                                                          Sonus Networks
                                                         August 13, 2010

  Design Considerations for Session Initiation Protocol (SIP) Overload
                                 Control
                    draft-ietf-soc-overload-design-01

Abstract

   Overload occurs in Session Initiation Protocol (SIP) networks when
   SIP servers have insufficient resources to handle all SIP messages
   they receive.  Even though the SIP protocol provides a limited
   overload control mechanism through its 503 (Service Unavailable)
   response code, SIP servers are still vulnerable to overload.
This 23 document discusses models and design considerations for a SIP 24 overload control mechanism. 26 Status of this Memo 28 This Internet-Draft is submitted in full conformance with the 29 provisions of BCP 78 and BCP 79. 31 Internet-Drafts are working documents of the Internet Engineering 32 Task Force (IETF). Note that other groups may also distribute 33 working documents as Internet-Drafts. The list of current Internet- 34 Drafts is at http://datatracker.ietf.org/drafts/current/. 36 Internet-Drafts are draft documents valid for a maximum of six months 37 and may be updated, replaced, or obsoleted by other documents at any 38 time. It is inappropriate to use Internet-Drafts as reference 39 material or to cite them other than as "work in progress." 41 This Internet-Draft will expire on February 14, 2011. 43 Copyright Notice 45 Copyright (c) 2010 IETF Trust and the persons identified as the 46 document authors. All rights reserved. 48 This document is subject to BCP 78 and the IETF Trust's Legal 49 Provisions Relating to IETF Documents 50 (http://trustee.ietf.org/license-info) in effect on the date of 51 publication of this document. Please review these documents 52 carefully, as they describe your rights and restrictions with respect 53 to this document. Code Components extracted from this document must 54 include Simplified BSD License text as described in Section 4.e of 55 the Trust Legal Provisions and are provided without warranty as 56 described in the Simplified BSD License. 58 Table of Contents 60 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3 61 2. SIP Overload Problem . . . . . . . . . . . . . . . . . . . . . 4 62 3. Explicit vs. Implicit Overload Control . . . . . . . . . . . . 5 63 4. System Model . . . . . . . . . . . . . . . . . . . . . . . . . 5 64 5. Degree of Cooperation . . . . . . . . . . . . . . . . . . . . 7 65 5.1. Hop-by-Hop . . . . . . . . . . . . . . . . . . . . . . . . 8 66 5.2. End-to-End . . . . . . . . . . . . . 
. . . . . . . . . . .  9
     5.3.  Local Overload Control . . . . . . . . . . . . . . . . . . 10
   6.  Topologies . . . . . . . . . . . . . . . . . . . . . . . . . . 10
   7.  Fairness . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
   8.  Performance Metrics  . . . . . . . . . . . . . . . . . . . . . 13
   9.  Explicit Overload Control Feedback . . . . . . . . . . . . . . 14
     9.1.  Rate-based Overload Control  . . . . . . . . . . . . . . . 14
     9.2.  Loss-based Overload Control  . . . . . . . . . . . . . . . 15
     9.3.  Window-based Overload Control  . . . . . . . . . . . . . . 16
     9.4.  Overload Signal-based Overload Control . . . . . . . . . . 17
     9.5.  On-/Off Overload Control . . . . . . . . . . . . . . . . . 18
   10. Implicit Overload Control  . . . . . . . . . . . . . . . . . . 18
   11. Overload Control Algorithms  . . . . . . . . . . . . . . . . . 18
   12. Message Prioritization . . . . . . . . . . . . . . . . . . . . 19
   13. Security Considerations  . . . . . . . . . . . . . . . . . . . 20
   14. IANA Considerations  . . . . . . . . . . . . . . . . . . . . . 20
   15. Informative References . . . . . . . . . . . . . . . . . . . . 20
   Appendix A.  Contributors  . . . . . . . . . . . . . . . . . . . . 21
   Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 21

1.  Introduction

   As with any network element, a Session Initiation Protocol (SIP)
   [RFC3261] server can suffer from overload when the number of SIP
   messages it receives exceeds the number of messages it can process.
   Overload occurs if a SIP server does not have sufficient resources
   to process all incoming SIP messages.  These resources may include
   CPU, memory, input/output, or disk resources.

   Overload can pose a serious problem for a network of SIP servers.
   During periods of overload, the throughput of a network of SIP
   servers can be significantly degraded.
In fact, overload may lead to a situation in which the throughput
   drops down to a small fraction of the original processing capacity.
   This is often called congestion collapse.

   An overload control mechanism enables a SIP server to perform close
   to its capacity limit during times of overload.  Overload control is
   used by a SIP server if it is unable to process all SIP requests due
   to resource constraints.  There are other failure cases in which a
   SIP server can successfully process incoming requests but has to
   reject them for other reasons.  For example, a PSTN gateway that
   runs out of trunk lines but still has plenty of capacity to process
   SIP messages should reject incoming INVITEs using a 488 (Not
   Acceptable Here) response [RFC4412].  Similarly, a SIP registrar
   that has lost connectivity to its registration database but is
   still capable of processing SIP messages should reject REGISTER
   requests with a 500 (Server Error) response [RFC3261].  Overload
   control mechanisms do not apply in these cases, and SIP provides
   appropriate response codes for them.

   The SIP protocol provides a limited mechanism for overload control
   through its 503 (Service Unavailable) response code and the Retry-
   After header.  However, this mechanism cannot prevent overload of a
   SIP server and it cannot prevent congestion collapse.  In fact, it
   may cause traffic to oscillate and to shift between SIP servers and
   thereby worsen an overload condition.  A detailed discussion of the
   SIP overload problem, the problems with the 503 (Service
   Unavailable) response code and the Retry-After header, and the
   requirements for a SIP overload control mechanism can be found in
   [RFC5390].

   This document discusses the models, assumptions and design
   considerations for a SIP overload control mechanism.  The document
   is a product of the SIP overload control design team.

2.
SIP Overload Problem

   A key contributor to SIP congestion collapse [RFC5390] is the
   regenerative behavior of overload in the SIP protocol.  When SIP
   runs over UDP, senders retransmit messages that were dropped by a
   SIP server due to overload and thereby increase the offered load
   for the already overloaded server.  This increase in load worsens
   the severity of the overload condition and, in turn, causes more
   messages to be dropped.  A congestion collapse can occur [Hilt et
   al.], [Noel et al.], [Shen et al.] and [Abdelal et al.].

   Regenerative behavior under overload should ideally be avoided by
   any protocol, as this would lead to stable operation under
   overload.  However, this is often difficult to achieve in practice.
   For example, changing the SIP retransmission timer mechanisms can
   reduce the degree of regeneration during overload but will impact
   the ability of SIP to recover from message losses.  Without any
   retransmission, each message that is dropped due to SIP server
   overload will eventually lead to a failed call.

   For a SIP INVITE transaction to be successful, a minimum of three
   messages need to be forwarded by a SIP server.  Often an INVITE
   transaction consists of five or more SIP messages.  If a SIP server
   under overload randomly discards messages without evaluating them,
   the chances that all messages belonging to a transaction are
   successfully forwarded will decrease as the load increases.  Thus,
   the number of transactions that complete successfully will decrease
   even if the message throughput of a server remains constant and the
   overload behavior is fully non-regenerative.  A SIP server might
   (partially) parse incoming messages to determine whether a message
   is a new request or belongs to an existing transaction.
However, after having spent resources on parsing a SIP message,
   discarding this message is expensive, as the resources already
   spent are lost.  The number of successful transactions will
   therefore decline with an increase in load, as fewer and fewer
   resources can be spent on forwarding messages and more and more
   resources are consumed by inspecting messages that will eventually
   be dropped.  The slope of the decline depends on the amount of
   resources spent to inspect each message.

   Another challenge for SIP overload control is that the rate of the
   true traffic source usually cannot be controlled.  Overload is
   often caused by a large number of UAs, each of which creates only a
   single message.  These UAs cannot be rate controlled as they only
   send one message.  However, the sum of their traffic can overload a
   SIP server.

3.  Explicit vs. Implicit Overload Control

   The main difference between explicit and implicit overload control
   is the way overload is signaled from a SIP server that is
   approaching overload to its upstream neighbors.

   In an explicit overload control mechanism, a SIP server uses an
   explicit overload signal to indicate that it is reaching its
   capacity limit.  Upstream neighbors receiving this signal can
   adjust their transmission rate according to the overload signal to
   a level that is acceptable to the downstream server.  The overload
   signal enables a SIP server to steer the load it is receiving to a
   rate at which it can perform at maximum capacity.

   Implicit overload control uses the absence of responses and packet
   loss as an indication of overload.  A SIP server that senses such a
   condition reduces the load it is forwarding to a downstream
   neighbor.  Since there is no explicit overload signal, this
   mechanism is robust, as it does not depend on actions taken by the
   SIP server running into overload.
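   The implicit strategy can be sketched as a sender that backs off
   multiplicatively when responses stop arriving and recovers
   gradually when they resume.  The class name and the backoff
   constants below are illustrative assumptions, not taken from this
   document:

```python
class ImplicitOverloadSender:
    """Sending entity that infers downstream overload from missing
    responses (implicit feedback) and throttles itself."""

    def __init__(self, initial_rate=100.0, min_rate=1.0, max_rate=100.0):
        self.allowed_rate = initial_rate  # requests per second
        self.min_rate = min_rate
        self.max_rate = max_rate

    def on_response(self):
        # Responses are arriving again: recover additively.
        self.allowed_rate = min(self.max_rate, self.allowed_rate + 1.0)

    def on_timeout(self):
        # The absence of a response is treated as an overload
        # indication: back off multiplicatively.
        self.allowed_rate = max(self.min_rate, self.allowed_rate * 0.5)
```

   With these constants, two consecutive timeouts would cut the
   allowed rate from 100 to 25 requests per second, while each
   response nudges it back up by one.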
The ideas of explicit and implicit overload control are in fact
   complementary.  By considering implicit overload indications a
   server can avoid overloading an unresponsive downstream neighbor.
   An explicit overload signal enables a SIP server to actively steer
   the incoming load to a desired level.

4.  System Model

   The model shown in Figure 1 identifies fundamental components of an
   explicit SIP overload control mechanism:

   SIP Processor:  The SIP Processor processes SIP messages and is the
      component that is protected by overload control.

   Monitor:  The Monitor measures the current load of the SIP
      processor on the receiving entity.  It implements the mechanisms
      needed to determine the current usage of resources relevant for
      the SIP processor and reports load samples (S) to the Control
      Function.

   Control Function:  The Control Function implements the overload
      control algorithm.  The control function uses the load samples
      (S) and determines if overload has occurred and a throttle (T)
      needs to be set to adjust the load sent to the SIP processor on
      the receiving entity.  The control function on the receiving
      entity sends load feedback (F) to the sending entity.

   Actuator:  The Actuator implements the algorithms needed to act on
      the throttles (T) and ensures that the amount of traffic
      forwarded to the receiving entity meets the criteria of the
      throttle.  For example, a throttle may instruct the Actuator to
      not forward more than 100 INVITE messages per second.  The
      Actuator implements the algorithms to achieve this objective,
      e.g., using message gapping.  It also implements algorithms to
      select the messages that will be affected and determine whether
      they are rejected or redirected.
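   The components above can be sketched in code.  In this sketch the
   feedback (F) is assumed to be a loss rate that the sender copies
   directly into its throttle (T); all class and method names, and the
   target utilization, are illustrative assumptions:

```python
class Monitor:
    """Measures SIP processor load and reports load samples (S)."""
    def sample(self, busy_time, interval):
        return busy_time / interval  # utilization in [0, 1]

class ControlFunction:
    """Turns load samples (S) into feedback (F), here a loss rate."""
    def __init__(self, target_utilization=0.9):
        self.target = target_utilization

    def feedback(self, sample):
        if sample <= self.target:
            return 0.0  # no throttling needed
        # Ask senders to drop the share of traffic above the target.
        return 1.0 - self.target / sample

class Actuator:
    """Acts on the throttle (T) by dropping a fraction of messages."""
    def __init__(self):
        self.loss_rate = 0.0

    def set_throttle(self, loss_rate):
        self.loss_rate = loss_rate

    def admit(self, draw):
        # draw: a uniform random number in [0, 1) supplied by caller.
        return draw >= self.loss_rate
```

   A fully loaded processor (utilization 1.0) would yield feedback of
   roughly 0.1 here, asking senders to drop about 10% of their
   traffic.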
The type of feedback (F) conveyed from the receiving to the sending
   entity depends on the overload control method used (i.e., loss-
   based, rate-based, window-based or signal-based overload control;
   see Section 9), the overload control algorithm (see Section 11) as
   well as other design parameters.  The feedback (F) enables the
   sending entity to adjust the amount of traffic forwarded to the
   receiving entity to a level that is acceptable to the receiving
   entity without causing overload.

   Figure 1 depicts a general system model for overload control.  In
   this diagram, one instance of the control function is on the
   sending entity (i.e., associated with the actuator) and one is on
   the receiving entity (i.e., associated with the monitor).  However,
   a specific mechanism may not require both elements.  In this case,
   one of the two control function elements can be empty and simply
   pass along feedback.  E.g., if (F) is defined as a loss rate (e.g.,
   reduce traffic by 10%), there is no need for a control function on
   the sending entity, as the content of (F) can be copied directly
   into (T).

   The model in Figure 1 shows a scenario with one sending and one
   receiving entity.  In a more realistic scenario, a receiving entity
   will receive traffic from multiple sending entities and vice versa
   (see Section 6).  The feedback generated by a Monitor will
   therefore often be distributed across multiple Actuators.  A
   Monitor needs to be able to split the load it can process across
   multiple sending entities and generate feedback that correctly
   adjusts the load each sending entity is allowed to send.
   Similarly, an Actuator needs to be prepared to receive different
   levels of feedback from different receiving entities and throttle
   traffic to these entities accordingly.
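   One simple policy for splitting the load a Monitor can process
   across multiple sending entities is an even division of the overall
   rate cap, matching the X/2 and X/3 example given for rate-based
   control in Section 9.1.  The helper below is an illustrative
   sketch, not a mechanism defined by this document:

```python
def split_rate_cap(total_cap, senders):
    """Divide a server's overall rate cap evenly among its current
    upstream neighbors (the simplest feedback-splitting policy)."""
    if not senders:
        return {}
    share = total_cap / len(senders)
    return {sender: share for sender in senders}
```

   When a new sender appears or an existing one stops transmitting,
   the split is simply recomputed over the new neighbor set.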
      Sending               Receiving
      Entity                Entity
   +----------------+    +----------------+
   |    Server A    |    |    Server B    |
   |  +----------+  |    |  +----------+  |   -+
   |  | Control  |  | F  |  | Control  |  |    |
   |  | Function |<-+----+--| Function |  |    |
   |  +----------+  |    |  +----------+  |    |
   |     T |        |    |       ^        |    | Overload
   |       v        |    |       | S      |    | Control
   |  +----------+  |    |  +----------+  |    |
   |  | Actuator |  |    |  | Monitor  |  |    |
   |  +----------+  |    |  +----------+  |    |
   |       |        |    |       ^        |   -+
   |       v        |    |       |        |   -+
   |  +----------+  |    |  +----------+  |    |
 <-+--|   SIP    |  |    |  |   SIP    |  |    | SIP
 --+->|Processor |--+----+->|Processor |--+->  | System
   |  +----------+  |    |  +----------+  |    |
   +----------------+    +----------------+   -+

        Figure 1: System Model for Explicit Overload Control

5.  Degree of Cooperation

   A SIP request is usually processed by more than one SIP server on
   its path to the destination.  Thus, a design choice for an explicit
   overload control mechanism is where to place the components of
   overload control along the path of a request and, in particular,
   where to place the Monitor and Actuator.  This design choice
   determines the degree of cooperation between the SIP servers on the
   path.  Overload control can be implemented hop-by-hop with the
   Monitor on one server and the Actuator on its direct upstream
   neighbor.  Overload control can be implemented end-to-end with
   Monitors on all SIP servers along the path of a request and an
   Actuator on the sender.  In this case, the Control Functions
   associated with each Monitor have to cooperate to jointly determine
   the overall feedback for this path.  Finally, overload control can
   be implemented locally on a SIP server if Monitor and Actuator
   reside on the same server.
In this case, the sending entity and receiving entity are the same
   SIP server, and Actuator and Monitor operate on the same SIP
   processor (although the Actuator typically operates at a pre-
   processing stage in local overload control).  Local overload
   control is an internal overload control mechanism, as the control
   loop is implemented internally on one server.  Hop-by-hop and end-
   to-end are external overload control mechanisms.  All three
   configurations are shown in Figure 2.

     +---------+              +------(+)---------+
     +------+  |              |       ^          |
     |      |  |      +---+   |       |      +---+
     v      |  v //=>| C |    v       | //=>| C |
   +---+   +---+ //  +---+  +---+   +---+ // +---+
   | A |===>| B |           | A |===>| B |
   +---+   +---+ \\  +---+  +---+   +---+ \\ +---+
     ^            \\=>| D |   ^   |       \\=>| D |
     |                +---+   |   |           +---+
     |                        |   v
     +---------+              +------(+)---------+

      (a) hop-by-hop            (b) end-to-end

                +-+
                v |
      +-+   +-+ +---+
      v |   v | //=>| C |
    +---+  +---+ // +---+
    | A |===>| B |
    +---+  +---+ \\ +---+
                  \\=>| D |
                      +---+
                       ^ |
                       +-+

      (c) local

   ==> SIP request flow
   <-- Overload feedback loop

          Figure 2: Degree of Cooperation between Servers

5.1.  Hop-by-Hop

   The idea of hop-by-hop overload control is to instantiate a
   separate control loop between all neighboring SIP servers that
   directly exchange traffic.  I.e., the Actuator is located on the
   SIP server that is the direct upstream neighbor of the SIP server
   that has the corresponding Monitor.  Each control loop between two
   servers is completely independent of the control loop between other
   servers further up- or downstream.  In the example in Figure 2(a),
   three independent overload control loops are instantiated: A - B,
   B - C and B - D.  Each loop only controls a single hop.  Overload
   feedback received from a downstream neighbor is not forwarded
   further upstream.  Instead, a SIP server acts on this feedback, for
   example, by rejecting SIP messages if needed.
If the upstream neighbor of a server also becomes overloaded, it
   will report this problem to its upstream neighbors, which again
   take action based on the reported feedback.  Thus, in hop-by-hop
   overload control, overload is always resolved by the direct
   upstream neighbors of the overloaded server without the need to
   involve entities that are located multiple SIP hops away.

   Hop-by-hop overload control reduces the impact of overload on a SIP
   network and can avoid congestion collapse.  It is simple and scales
   well to networks with many SIP entities.  An advantage is that it
   does not require feedback to be transmitted across multiple hops,
   possibly crossing multiple trust domains.  Feedback is sent to the
   next hop only.  Furthermore, it does not require a SIP entity to
   aggregate a large number of overload status values or keep track of
   the overload status of SIP servers it is not communicating with.

5.2.  End-to-End

   End-to-end overload control implements an overload control loop
   along the entire path of a SIP request, from UAC to UAS.  An end-
   to-end overload control mechanism consolidates overload information
   from all SIP servers on the way (including all proxies and the UAS)
   and uses this information to throttle traffic as far upstream as
   possible.  An end-to-end overload control mechanism has to be able
   to frequently collect the overload status of all servers on the
   potential path(s) to a destination and combine this data into
   meaningful overload feedback.

   A UA or SIP server only throttles requests if it knows that these
   requests will eventually be forwarded to an overloaded server.  For
   example, if D is overloaded in Figure 2(b), A should only throttle
   requests it forwards to B when it knows that they will be forwarded
   to D.  It should not throttle requests that will eventually be
   forwarded to C, since server C is not overloaded.
In many cases, it is difficult for A to determine which requests
   will be routed to C and D, since this depends on the local routing
   decision made by B.  These routing decisions can be highly variable
   and, for example, depend on call routing policies configured by the
   user, services invoked on a call, load balancing policies, etc.
   The fact that a previous message to a target has been routed
   through an overloaded server does not necessarily mean the next
   message to this target will also be routed through the same server.

   The main problem of end-to-end overload control is its inherent
   complexity, since the UAC or SIP servers need to monitor all
   potential paths to a destination in order to determine which
   requests should be throttled and which requests may be sent.  Even
   if this information is available, it is not clear which path a
   specific request will take.

   A variant of end-to-end overload control is to implement a control
   loop between a set of well-known SIP servers along the path of a
   SIP request.  For example, an overload control loop can be
   instantiated between a server that only has one downstream neighbor
   or a set of closely coupled SIP servers.  A control loop spanning
   multiple hops can be used if the sending entity has full knowledge
   about the SIP servers on the path of a SIP message.

   A key difference from transport protocols using end-to-end
   congestion control, such as TCP, is that the traffic exchanged
   between SIP servers consists of many individual SIP messages.  Each
   of these SIP messages has its own source and destination.  Even SIP
   messages containing identical SIP URIs (e.g., a SUBSCRIBE and an
   INVITE message to the same SIP URI) can be routed to different
   destinations.  This is different from TCP, which controls a stream
   of packets between a single source and a single destination.

5.3.
Local Overload Control

   The idea of local overload control (see Figure 2(c)) is to run the
   Monitor and Actuator on the same server.  This enables the server
   to monitor the current resource usage and to reject messages that
   can't be processed without overusing the local resources.  The
   fundamental assumption behind local overload control is that it is
   less resource consuming for a server to reject messages than to
   process them.  A server can therefore reject the excess messages it
   cannot process to stop all retransmissions of these messages.
   Since rejecting messages does consume resources on a SIP server,
   local overload control alone cannot prevent a congestion collapse.

   Local overload control can be used in conjunction with other
   overload control mechanisms and provides an additional layer of
   protection against overload.  It is fully implemented within a SIP
   server and does not require cooperation between servers.  In
   general, SIP servers should apply other overload control techniques
   to control load before a local overload control mechanism is
   activated as a mechanism of last resort.

6.  Topologies

   The following topologies describe four generic SIP server
   configurations.  These topologies illustrate specific challenges
   for an overload control mechanism.  An actual SIP server topology
   is likely to consist of combinations of these generic scenarios.

   In the "load balancer" configuration shown in Figure 3(a), a set of
   SIP servers (D, E and F) receives traffic from a single source A.
   A load balancer is a typical example of such a configuration.  In
   this configuration, overload control needs to prevent server A
   (i.e., the load balancer) from sending too much traffic to any of
   its downstream neighbors D, E and F.  If one of the downstream
   neighbors becomes overloaded, A can direct traffic to the servers
   that still have capacity.
If one of the servers serves as a backup, it can be activated once
   one of the primary servers reaches overload.

   If A can reliably determine that D, E and F are its only downstream
   neighbors and all of them are in overload, it may choose to report
   overload upstream on behalf of D, E and F.  However, if the set of
   downstream neighbors is not fixed or only some of them are in
   overload, then A should not activate overload control, since A can
   still forward the requests destined to non-overloaded downstream
   neighbors.  These requests would be throttled as well if A used
   overload control towards its upstream neighbors.

   In the "multiple sources" configuration shown in Figure 3(b), a SIP
   server D receives traffic from multiple upstream sources A, B and
   C.  Each of these sources can contribute a different amount of
   traffic, which can vary over time.  The set of active upstream
   neighbors of D can change as servers may become inactive and
   previously inactive servers may start contributing traffic to D.

   If D becomes overloaded, it needs to generate feedback to reduce
   the amount of traffic it receives from its upstream neighbors.  D
   needs to decide by how much each upstream neighbor should reduce
   traffic.  This decision can require the consideration of the amount
   of traffic sent by each upstream neighbor, and it may need to be
   re-adjusted as the traffic contributed by each upstream neighbor
   varies over time.  Server D can use a local fairness policy to
   determine how much traffic it accepts from each upstream neighbor.

   In many configurations, SIP servers form a "mesh" as shown in
   Figure 3(c).  Here, multiple upstream servers A, B and C forward
   traffic to multiple alternative servers D and E.  This
   configuration is a combination of the "load balancer" and "multiple
   sources" scenarios.
            +---+                +---+
        /->| D |                | A |-\
       /    +---+                +---+  \
      /                                  \    +---+
  +---+-/   +---+                +---+    \->|   |
  | A |------>| E |              | B |------>| D |
  +---+-\   +---+                +---+    /->|   |
      \                                  /    +---+
       \    +---+                +---+  /
        \->| F |                | C |-/
            +---+                +---+

     (a) load balancer         (b) multiple sources

  +---+
  | A |---\                      a--\
  +---+-\  \---->+---+               \
         \/----->| D |           b--\ \--->+---+
  +---+--/\  /-->+---+               \---->|   |
  | B |    \/                    c-------->| D |
  +---+---\/\--->+---+                     |   |
          /\---->| E |            ...  /--->+---+
  +---+--/  /-->+---+                 /
  | C |-----/                    z--/
  +---+

     (c) mesh                   (d) edge proxy

                      Figure 3: Topologies

   Overload control that is based on reducing the number of messages a
   sender is allowed to send is not suited for servers that receive
   requests from a very large population of senders, each of which
   only infrequently sends a request.  This scenario is shown in
   Figure 3(d).  An edge proxy that is connected to many UAs is a
   typical example of such a configuration.

   Since each UA typically only contributes a few requests, which are
   often related to the same call, it can't decrease its message rate
   to resolve the overload.  In such a configuration, a SIP server can
   resort to local overload control by rejecting a percentage of the
   requests it receives with 503 (Service Unavailable) responses.
   Since there are many upstream neighbors that contribute to the
   overall load, sending 503 (Service Unavailable) to a fraction of
   them can gradually reduce load without entirely stopping all
   incoming traffic.  The Retry-After header can be used in 503
   (Service Unavailable) responses to ask UAs to wait a given number
   of seconds before trying the call again.  Using 503 (Service
   Unavailable) towards individual sources cannot, however, prevent
   overload if a large number of users place calls at the same time.
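   The local overload control described above, rejecting a fraction of
   requests with 503 (Service Unavailable) and a Retry-After header,
   can be sketched as follows.  The load measure, the target value and
   the function name are illustrative assumptions:

```python
def overload_response(current_load, target_load, draw, retry_after=5):
    """Return a (status, headers) rejection for a request, or None to
    process the request normally.

    draw: a uniform random number in [0, 1) supplied by the caller.
    """
    if current_load > target_load:
        # Reject the share of traffic that exceeds the target load.
        reject_probability = 1.0 - target_load / current_load
        if draw < reject_probability:
            return 503, {"Retry-After": str(retry_after)}
    return None
```

   At twice the target load, roughly half of the incoming requests
   would be rejected, gradually reducing the offered load without
   stopping all traffic.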
Note: The requirements of the "edge proxy" topology are different
   from those of the other topologies, which may require a different
   method for overload control.

7.  Fairness

   There are many different ways to define fairness between multiple
   upstream neighbors of a SIP server.  In the context of SIP server
   overload, it is helpful to describe two categories of fairness:
   basic fairness and customized fairness.  With basic fairness, a SIP
   server treats all call attempts equally and ensures that each call
   attempt has the same chance of succeeding.  With customized
   fairness, the server allocates resources according to different
   priorities.  An example application of the basic fairness criterion
   is the "Third caller receives free tickets" scenario, where each
   call attempt should have an equal success probability in making
   calls through an overloaded SIP server, irrespective of the service
   provider from which it was initiated.  An example of customized
   fairness would be a server which assigns different resource
   allocations to its upstream neighbors (e.g., service providers) as
   defined in a service level agreement (SLA).

8.  Performance Metrics

   The performance of an overload control mechanism can be measured
   using different metrics.

   A key performance indicator is the goodput of a SIP server under
   overload.  Ideally, a SIP server will be enabled to perform at its
   capacity limit during periods of overload.  E.g., if a SIP server
   has a processing capacity of 140 INVITE transactions per second,
   then an overload control mechanism should enable it to process 140
   INVITEs per second even if the offered load is much higher.  The
   delay introduced by a SIP server is another important indicator.
   An overload control mechanism should ensure that the delay
   encountered by a SIP message is not increased significantly during
   periods of overload.
Reactiveness and stability are other important performance
   indicators.  An overload control mechanism should quickly react to
   an overload occurrence and ensure that a SIP server does not become
   overloaded even during sudden peaks of load.  Similarly, an
   overload control mechanism should quickly stop rejecting calls if
   the overload disappears.  Stability is another important criterion.
   An overload control mechanism should not cause significant
   oscillations of load on a SIP server.  The performance of SIP
   overload control mechanisms is discussed in [Noel et al.], [Shen et
   al.], [Hilt et al.] and [Abdelal et al.].

   In addition to the above metrics, there are other indicators that
   are relevant for the evaluation of an overload control mechanism:

   Fairness:  Which types of fairness does the overload control
      mechanism implement?

   Self-limiting:  Is the overload control self-limiting if a SIP
      server becomes unresponsive?

   Changes in neighbor set:  How does the mechanism adapt to a
      changing set of sending entities?

   Data points to monitor:  Which and how many data points does an
      overload control mechanism need to monitor?

9.  Explicit Overload Control Feedback

   Explicit overload control feedback enables a receiver to indicate
   how much traffic it wants to receive.  Explicit overload control
   mechanisms can be differentiated based on the type of information
   conveyed in the overload control feedback and whether the control
   function is in the receiving or sending entity (receiver- vs.
   sender-based overload control).

9.1.  Rate-based Overload Control

   The key idea of rate-based overload control is to limit the request
   rate at which an upstream element is allowed to forward traffic to
   the downstream neighbor.  If overload occurs, a SIP server
   instructs each upstream neighbor to send at most X requests per
   second.
Each upstream neighbor can be assigned a different rate cap.

An example algorithm for an Actuator in the sending entity is request gapping. After transmitting a request to a downstream neighbor, a server waits for 1/X seconds before it transmits the next request to the same neighbor. Requests that arrive during the waiting period are not forwarded; they are either redirected, rejected or buffered.

The rate cap ensures that the number of requests received by a SIP server never exceeds the sum of all rate caps granted to upstream neighbors. Rate-based overload control protects a SIP server against overload even during load spikes, assuming no new upstream neighbors start sending traffic. New upstream neighbors need to be accounted for in the rate caps assigned to all upstream neighbors. The overall rate cap of a SIP server is determined by an overload control algorithm, e.g., based on system load.

Rate-based overload control requires a SIP server to assign a rate cap to each of its upstream neighbors while it is activated. Effectively, a server needs to assign a share of its overall capacity to each upstream neighbor. A server needs to ensure that the sum of all rate caps assigned to upstream neighbors does not substantially oversubscribe its actual processing capacity. This requires a SIP server to keep track of the set of upstream neighbors and to adjust the rate caps if a new upstream neighbor appears or an existing neighbor stops transmitting. For example, if the capacity of the server is X and the server is receiving traffic from two upstream neighbors, it can assign a rate cap of X/2 to each of them. If a third sender appears, the rate cap for each sender is lowered to X/3. If the overall rate cap is too high, a server may experience overload.
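The request gapping algorithm described above can be sketched as follows. This is an illustrative sketch only, not part of the specification; the class and method names (RequestGapper, try_forward) are invented for the example, and time is passed in explicitly to keep the sketch deterministic.

```python
class RequestGapper:
    """Sketch of request gapping: after forwarding a request to a
    neighbor, wait 1/X seconds before forwarding the next request
    to the same neighbor (X = rate cap in requests per second)."""

    def __init__(self, rate_cap):
        self.min_gap = 1.0 / rate_cap  # seconds between forwarded requests
        self.next_allowed = 0.0        # earliest time the next send is allowed

    def try_forward(self, now):
        """Return True if a request may be forwarded at time 'now'.
        Requests arriving during the waiting period must be
        redirected, rejected or buffered by the caller."""
        if now >= self.next_allowed:
            self.next_allowed = now + self.min_gap
            return True
        return False

# With a rate cap of X=10 requests/second, at most one request is
# admitted per 0.1-second gap:
gapper = RequestGapper(rate_cap=10)
print(gapper.try_forward(0.00))  # True  (gap starts; next send at 0.10)
print(gapper.try_forward(0.05))  # False (inside the waiting period)
print(gapper.try_forward(0.10))  # True
```

A real Actuator would keep one such gapper per downstream neighbor, since each neighbor may be assigned a different rate cap.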
If the cap is too low, the upstream neighbors will reject requests even though they could be processed by the server.

One approach to estimating a rate cap for each upstream neighbor is to use a fixed proportion of a control variable X, where X is initially equal to the capacity of the SIP server. The server then increases or decreases X until the workload arrival rate matches the actual server capacity. Usually, this means that the sum of the rate caps sent out by the server (i.e., X) exceeds its actual capacity, but it allows upstream neighbors that are not generating more than their fair share of the work to remain effectively unrestricted. In this approach, the server only has to measure the aggregate arrival rate. However, since the overall rate cap is usually higher than the actual capacity, brief periods of overload may occur.

9.2. Loss-based Overload Control

A loss percentage enables a SIP server to ask an upstream neighbor to reduce the number of requests it would normally forward to this server by a percentage X. For example, a SIP server can ask an upstream neighbor to reduce the number of requests this neighbor would normally send by 10%. The upstream neighbor then redirects or rejects X percent of the traffic destined for this server.

An algorithm for the sending entity to implement a loss percentage is to draw a random number between 1 and 100 for each request to be forwarded. The request is not forwarded to the server if the random number is less than or equal to X.

An advantage of loss-based overload control is that the receiving entity does not need to track the set of upstream neighbors or the request rate it receives from each upstream neighbor. It is sufficient to monitor the overall system utilization. To reduce load, a server can ask its upstream neighbors to lower the traffic forwarded by a certain percentage.
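The random-drop algorithm for implementing a loss percentage can be sketched as follows. Again, this is an illustrative sketch, not part of the specification; the function name should_forward is invented for the example.

```python
import random

def should_forward(loss_percentage):
    """Implement a loss percentage X as described above: draw a
    random number between 1 and 100 (inclusive) and throttle the
    request (redirect or reject it) if the number is <= X."""
    return random.randint(1, 100) > loss_percentage

# Boundary cases: 0% forwards everything, 100% forwards nothing.
print(should_forward(0))    # True
print(should_forward(100))  # False

# Over many requests, roughly X percent are throttled; with X=10
# and 10000 requests, about 1000 are dropped on average.
dropped = sum(1 for _ in range(10000) if not should_forward(10))
```

Because the drop decision is independent per request, the sender needs no per-neighbor state; the achieved reduction converges to X percent over a large number of requests.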
The server calculates this percentage by combining the loss percentage currently in use (i.e., the loss percentage the upstream neighbors are currently applying when forwarding traffic), the current system utilization and the desired system utilization. For example, if the server load approaches 90% and the current loss percentage is set to a 50% traffic reduction, then the server can decide to increase the loss percentage to 55% in order to bring system utilization down to 80%. Similarly, the server can lower the loss percentage if system utilization permits.

Loss-based overload control requires that the throttle percentage be adjusted to the current overall number of requests received by the server. This is particularly important if the number of requests received fluctuates quickly. For example, if a SIP server sets a throttle value of 10% at time t1 and the number of requests increases by 20% between time t1 and t2 (t1