TSVWG Working Group                                         G. Fairhurst
Internet-Draft                                    University of Aberdeen
Intended status: Best Current Practice                     July 20, 2015
Expires: January 21, 2016

                   Network Transport Circuit Breakers
                  draft-ietf-tsvwg-circuit-breaker-02

Abstract

   This document explains what is meant by the term "network transport
   circuit breaker" (CB). It describes the need for circuit breakers
   when using network tunnels, and other non-congestion-controlled
   applications.
   It also defines requirements for building a circuit breaker and the
   expected outcomes of using a circuit breaker within the Internet.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on January 21, 2016.

Copyright Notice

   Copyright (c) 2015 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Types of Circuit-Breaker
   2.  Terminology
   3.  Design of a Circuit-Breaker (What makes a good circuit
       breaker?)
     3.1.  Functional Components
   4.  Requirements for a Network Transport Circuit Breaker
     4.1.  Unidirectional Circuit Breakers over Controlled Paths
   5.  Examples of Circuit Breakers
     5.1.  A Fast-Trip Circuit Breaker
       5.1.1.  A Fast-Trip Circuit Breaker for RTP
     5.2.  A Slow-trip Circuit Breaker
     5.3.  A Managed Circuit Breaker
       5.3.1.  A Managed Circuit Breaker for SAToP Pseudo-Wires
   6.  Examples where circuit breakers may not be needed.
     6.1.  CBs over pre-provisioned Capacity
     6.2.  CBs with CC Traffic
     6.3.  CBs with Uni-directional Traffic and no Control Path
   7.  Security Considerations
   8.  IANA Considerations
   9.  Acknowledgments
   10. Revision Notes
   11. References
     11.1.  Normative References
     11.2.  Informative References
   Author's Address

1.  Introduction

   A network transport Circuit Breaker (CB) is an automatic mechanism
   that is used to estimate congestion caused by a flow, and to
   terminate (or significantly reduce the rate of) the flow when
   persistent congestion is detected. This is a safety measure to
   prevent congestion collapse (starvation of resources available to
   other flows), essential for an Internet that is heterogeneous and for
   traffic that is hard to predict in advance.

   The term "Circuit Breaker" originates in electricity supply, and has
   nothing to do with network circuits or virtual circuits.
   In electricity supply, a CB is intended as a protection mechanism of
   last resort. Under normal circumstances, a CB ought not to be
   triggered; it is designed to protect the supply network and attached
   equipment when there is overload. Similarly, people do not expect
   the electrical circuit-breaker (or fuse) in their home to be
   triggered, except when there is a wiring fault or a problem with an
   electrical appliance.

   In networking, the CB principle can be used as a protection mechanism
   of last resort to avoid persistent congestion. Persistent congestion
   (also known as "congestion collapse") was a feature of the early
   Internet of the 1980s. This resulted in excess traffic starving
   other connections of access to the Internet. It was countered by
   the requirement to use congestion control (CC) by the Transmission
   Control Protocol (TCP) [Jacobsen88] [RFC1112]. These mechanisms
   operate in Internet hosts to cause TCP connections to "back off"
   during congestion. The introduction of CC in TCP (currently
   documented in [RFC5681]) ensured the stability of the Internet,
   because it was able to detect congestion and promptly react. This
   worked well while TCP was by far the dominant traffic in the
   Internet, and most TCP flows were long-lived (ensuring that they
   could detect and respond to congestion before the flows terminated).
   This is no longer the case, and non-congestion-controlled traffic,
   including many applications of the User Datagram Protocol (UDP), can
   form a significant proportion of the total traffic traversing a link.
   The current Internet therefore requires that non-congestion-
   controlled traffic be considered to avoid congestion collapse.

   There are important differences between a transport circuit-breaker
   and a congestion-control method.
   Specifically, congestion control (as implemented in TCP, SCTP, and
   DCCP) needs to operate on a timescale on the order of a packet
   round-trip-time (RTT), the time from sender to destination and
   return. Congestion control methods may react to a single packet
   loss/marking and reduce the transmission rate for each loss or
   congestion event. The goal is usually to limit the maximum
   transmission rate to a value that reflects the available capacity of
   a network path. These methods typically operate on individual
   traffic flows (e.g., a 5-tuple).

   In contrast, CBs are recommended for non-congestion-controlled
   Internet flows and for traffic aggregates, e.g., traffic sent using a
   network tunnel. Later sections provide examples of cases where
   circuit-breakers may or may not be desirable.

   A CB needs to measure (meter) the traffic to determine if the network
   is experiencing congestion and must be designed to trigger robustly
   when there is persistent congestion. This means the trigger needs to
   operate on a timescale much longer than the path round trip time
   (e.g., seconds to possibly many tens of seconds). This longer period
   is needed to provide sufficient time for transports (or applications)
   to adjust their rate following congestion, and for the network load
   to stabilise after any adjustment. A CB trigger will often be based
   on a series of successive sample measurements taken over a reasonably
   long period of time. This is to ensure that a CB does not
   accidentally trigger following a single congestion event, or even
   several successive ones (congestion events are what trigger
   congestion control, and are to be regarded as normal on a network
   link operating near its capacity). Once triggered, a control
   function needs to remove traffic from the network, either disabling
   the flow or significantly reducing the level of traffic.
   This reaction provides the required protection to prevent persistent
   congestion being experienced by other flows that share the congested
   part of the network path.

   Section 4 defines requirements for building a circuit breaker.

1.1.  Types of Circuit-Breaker

   There are various forms of network transport circuit breaker. These
   are differentiated mainly on the timescale over which they are
   triggered, but also in the intended protection they offer:

   o  Fast-Trip Circuit Breakers: The relatively short timescale used by
      this form of circuit breaker is intended to protect a flow or
      related group of flows.

   o  Slow-Trip Circuit Breakers: This circuit breaker utilises a longer
      timescale and is designed to protect traffic aggregates.

   o  Managed Circuit Breakers: These utilise the operations and
      management functions that may be present in a managed service to
      implement a circuit breaker.

   Examples of each type of circuit breaker are provided in Section 5.

2.  Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

3.  Design of a Circuit-Breaker (What makes a good circuit breaker?)

   Although circuit breakers have been talked about in the IETF for many
   years, there has not yet been guidance on the cases where circuit
   breakers are needed or upon the design of circuit breaker mechanisms.
   This document seeks to offer advice on these two topics.

   Section 3.1 describes the functional components of a circuit breaker
   and Section 4 defines requirements for implementing a circuit
   breaker.

3.1.  Functional Components

   The basic design of a transport circuit breaker involves
   communication between an ingress point (a sender) and an egress point
   (a receiver) of a network flow.
   A simple picture of CB operation is provided in Figure 1. This shows
   a set of routers (each labelled R) connecting a set of endpoints. A
   CB is used to control traffic passing through a subset of these
   routers, acting between an ingress and an egress point. In some
   cases, the ingress and egress may be within one or both network
   endpoints; in other cases they will be within a network device. For
   example, one expected use would be at the ingress and egress of a
   tunnel service.

 +--------+                                                   +--------+
 |Endpoint|                                                   |Endpoint|
 +--+-----+                                                   +--+-----+
    |                                                            |
    | +-+  +-+  +---------+  +-+  +-+  +-+  +--------+  +-+  +-+ |
    +-+R+--+R+--+ Ingress +--+R+--+R+--+R+--+ Egress |--+R+--+R+-+
      +++  +-+  +-------+-+  +-+  +-+  +-+  +-----+--+  +++  +-+
       |         ^      |                         |      |
  +-+  |         | +----+----+                    |      |  +-+
  +R+--+         | | Measure +<-------------------+      +--+R+
  +++            | +----+----+                              +++
   |             |      |                                    |
   |             | +----+----+                               |
+--+-----+       | | Trigger +                            +--+-----+
|Endpoint|       | +----+----+                            |Endpoint|
+--------+       |      |                                 +--------+
                 +------+
                 Reaction

   Figure 1: A CB controlling the part of the end-to-end path between an
                    ingress point and an egress point.

   The set of components needed to implement a circuit breaker is:

   1.  An ingress meter (at the sender or tunnel ingress) records the
       number of packets/bytes sent in each measurement interval. This
       measures the offered network load. The measurement interval
       could be a few seconds.

   2.  An egress meter (at the receiver or tunnel egress) records the
       number of packets/bytes received in each measurement interval.
       This measures the supported load and may utilise other signals
       to detect the effect of congestion (e.g., loss/marking
       experienced over the path).

   3.  The measured values at the ingress and egress are communicated to
       the CB Measurement function.
       This may use several methods, including: sending return
       measurement packets from a receiver to a trigger function at the
       sender; an implementation using Operations, Administration and
       Management (OAM); or sending another in-band signalling datagram
       to the trigger function. This could also be implemented purely
       as a control plane function using a software-defined network
       controller.

   4.  The measurement function combines the ingress and egress
       measurements to assess the present level of network congestion.
       (For example, the loss rate for each measurement interval could
       be deduced from calculating the difference between ingress and
       egress counter values. Note that accurate measurement intervals
       are not typically important, since isolated loss events need to
       be disregarded.)

   5.  A trigger function determines if the measurements indicate
       persistent congestion. This function defines an appropriate
       threshold for determining there is persistent congestion between
       the ingress and egress (e.g., more than 10% loss, but other
       methods could also be based on the rate of transmission as well
       as the loss rate). The transport CB is triggered when the
       threshold is exceeded in multiple measurement intervals (e.g., 3
       successive measurements). This design needs to be robust, so
       that single or spurious events do not trigger a reaction.

   6.  A reaction that is applied at the ingress when the CB is
       triggered. This seeks to automatically remove the traffic
       causing persistent congestion.

   7.  The CB also triggers when it does not receive both sender and
       receiver measurements, since this also could indicate a loss of
       control packets (also a symptom of heavy congestion or inability
       to control the load).

4.  Requirements for a Network Transport Circuit Breaker

   The requirements for implementing a CB are:

   o  There MUST be a control path from the ingress meter and the egress
      meter to the point of measurement. The CB MUST trigger if this
      control path fails. That is, the feedback indicating a congested
      period is designed so that the CB is triggered when it fails to
      receive measurement reports that indicate an absence of
      congestion, rather than relying on the successful transmission of
      a "congested" signal back to the sender. (The feedback signal
      could itself be lost under congestion.)

   o  A CB MUST define a measurement period over which the receiver
      measures the level of congestion. This method does not have to
      detect individual packet loss, but MUST have a way to know that
      packets have been lost/marked from the traffic flow. If Explicit
      Congestion Notification (ECN) is enabled [RFC3168], an egress
      meter MAY also count the number of ECN congestion marks/events per
      measurement interval, but even if ECN is used, loss MUST still be
      measured, since this better reflects the impact of persistent
      congestion. The type of CB will determine how long this
      measurement period needs to be. The minimum time must be
      significantly longer than the time that current CC algorithms need
      to reduce their rate following detection of congestion (i.e., many
      path RTTs).

   o  A CB is REQUIRED to define a threshold to determine whether the
      measured congestion is considered excessive.

   o  A CB is REQUIRED to define a period over which the trigger uses
      the collected measurements.

   o  A CB MUST be robust to multiple congestion events. This usually
      will define a number of measured persistent congestion events per
      triggering period. For example, a CB may combine the results of
      several measurement periods to determine if the CB is triggered.
      (e.g., triggered when persistent congestion is detected in 3 of
      the measurements within the triggering interval).

   o  A triggered CB MUST react decisively by disabling (or
      significantly reducing) traffic at the source (e.g., tunnel
      ingress). The CB SHOULD be constructed so that it does not
      trigger under light or intermittent congestion, with a default
      response to a trigger that disables all traffic that contributed
      to congestion.

   o  Some circuit breaker designs use a reaction that reduces, rather
      than disables, the flows it controls. This response MUST be much
      more severe than that of a CC algorithm, because the CB reacts to
      more persistent congestion and operates over longer timescales
      (i.e., the overload condition will have persisted for a longer
      time before the CB is triggered). A CB that reduces the rate of a
      flow MUST continue to monitor the level of congestion and MUST
      further reduce the rate if the CB is again triggered.

   o  The reaction to a triggered CB MUST continue for a period of time
      of at least the triggering interval. Manual operator intervention
      will usually be required to restore the flow. If an automated
      response is needed to reset the trigger, then this MUST NOT be
      immediate. The design of this release mechanism needs to be
      sufficiently conservative that it does not adversely interact with
      other mechanisms (including other CB algorithms that control
      traffic over a common path).

   o  When a CB is triggered, it SHOULD be regarded as an abnormal
      network event. As such, this event SHOULD be logged. The
      measurements that led to triggering of the CB SHOULD also be
      logged.

4.1.  Unidirectional Circuit Breakers over Controlled Paths

   A CB can be used to control uni-directional UDP traffic, providing
   that there is a control path to connect the functional components at
   the ingress and egress.
   This control path can exist in networks for which the traffic flow is
   purely unidirectional (e.g., a multicast stream that sends packets
   across an Internet path and can use multicast routing to prune flows
   to shed network load).

   Some paths are provisioned using a control protocol, e.g., flows
   provisioned using the Multi-Protocol Label Switching (MPLS) services,
   paths provisioned using the Resource Reservation Protocol (RSVP), or
   admission-controlled Differentiated Services. For these paths the
   control protocol may be invoked to shed the network load when the
   circuit breaker is triggered.

5.  Examples of Circuit Breakers

   There are multiple types of CB that may be defined for use in
   different deployment cases. This section provides examples of
   different types of circuit breaker.

5.1.  A Fast-Trip Circuit Breaker

   A fast-trip circuit breaker is the most responsive form of CB. It
   has a response time that is only slightly larger than that of the
   traffic it controls. It is suited to traffic with well-understood
   characteristics. It is not suited to arbitrary network traffic,
   since it may prematurely trigger (e.g., when multiple congestion-
   controlled flows lead to short-term overload).

5.1.1.  A Fast-Trip Circuit Breaker for RTP

   A set of fast-trip CB methods have been specified for use together by
   a Real-time Transport Protocol (RTP) flow using the RTP/AVP Profile
   [RTP-CB]. It is expected that, in the absence of severe congestion,
   all RTP applications running on best-effort IP networks will be able
   to run without triggering these circuit breakers. A fast-trip RTP CB
   is therefore implemented as a fail-safe.

   The sender monitors reception of RTCP Reception Report (RR or XRR)
   packets that convey reception quality feedback information. This is
   used to measure (congestion) loss, possibly in combination with ECN
   [RFC6679].

   The CB action (shutdown of the flow) is triggered when any of the
   following trigger conditions are true:

   1.  An RTP CB triggers on reported lack of progress.

   2.  An RTP CB triggers when no receiver report messages are
       received.

   3.  An RTP CB uses a TFRC-style check and sets a hard upper limit to
       the long-term RTP throughput (over many RTTs).

   4.  An RTP CB includes the notion of Media Usability. This circuit
       breaker is triggered when the quality of the transported media
       falls below some required minimum acceptable quality.

5.2.  A Slow-trip Circuit Breaker

   A slow-trip CB may be implemented in an endpoint or network device.
   This type of CB is much slower at responding to congestion than a
   fast-trip CB and is expected to be more common.

   One example where a slow-trip CB is needed is where flows or traffic-
   aggregates use a tunnel or encapsulation and the flows within the
   tunnel do not all support TCP-style congestion control (e.g., TCP,
   SCTP, TFRC), see [RFC5405] section 3.1.3. A use case is where
   tunnels are deployed in the general Internet (rather than "controlled
   environments" within an ISP or Enterprise), especially when the
   tunnel may need to cross a customer access router.

5.3.  A Managed Circuit Breaker

   A managed CB is implemented in the signalling protocol or management
   plane that relates to the traffic aggregate being controlled. This
   type of circuit breaker is typically applicable when the deployment
   is within a "controlled environment".

   A Circuit Breaker requires more than the ability to determine that a
   network path is forwarding data, or to measure the rate of a path -
   which are often normal network operational functions. There is an
   additional need to determine a metric for congestion on the path and
   to trigger a reaction when a threshold is crossed that indicates
   persistent congestion.

5.3.1.  A Managed Circuit Breaker for SAToP Pseudo-Wires

   Section 8 of [RFC4553], SAToP Pseudo-Wires (PWE3), describes an
   example of a managed circuit breaker for isochronous flows.

   If such flows were to run over a pre-provisioned (e.g., MPLS)
   infrastructure, then it may be expected that the Pseudo-Wire (PW)
   would not experience congestion, because a flow is not expected to
   either increase (or decrease) its rate. If instead Pseudo-Wire
   traffic is multiplexed with other traffic over the general Internet,
   it could experience congestion. [RFC4553] states: "If SAToP PWs run
   over a PSN providing best-effort service, they SHOULD monitor packet
   loss in order to detect "severe congestion". The currently
   recommended measurement period is 1 second, and the trigger operates
   when there are more than three measured Severely Errored Seconds
   (SES) within a period.

   If such a condition is detected, a SAToP PW should shut down
   bidirectionally for some period of time...". The concept was that
   when the packet loss ratio (congestion) level increased above a
   threshold, the PW was by default disabled. This use case considered
   fixed-rate transmission, where the PW had no reasonable way to shed
   load.

   The trigger needs to be set at the rate at which the PW was likely to
   experience a serious problem, possibly making the service non-
   compliant. At this point, triggering the CB would remove the
   traffic, preventing undue impact on congestion-responsive traffic
   (e.g., TCP). Part of the rationale was that high loss ratios
   typically indicated that something was "broken" and ought to have
   already resulted in operator intervention, and therefore needed to
   trigger this intervention.
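
   As an illustration only, the trigger described in this section (a
   one-second measurement period, tripping when more than three
   Severely Errored Seconds fall within a period) might be sketched as
   below. The name SesTrigger, the 10% loss ratio used to classify a
   second as severely errored, and the 10-second window are assumptions
   made for this sketch; they are not taken from [RFC4553].

```python
from collections import deque


class SesTrigger:
    """Sliding-window trigger over one-second loss measurements.

    Hypothetical helper for illustration; not defined by any RFC.
    """

    def __init__(self, loss_threshold=0.10, window_seconds=10, max_ses=3):
        # All three defaults are assumptions chosen for this sketch.
        self.loss_threshold = loss_threshold
        self.max_ses = max_ses
        # One flag per measured second; the oldest flag is dropped
        # automatically once the window is full.
        self.window = deque(maxlen=window_seconds)

    def record_second(self, sent, received):
        """Record one measurement period; return True if the CB trips."""
        if sent > 0:
            loss_ratio = max(sent - received, 0) / sent
        else:
            loss_ratio = 0.0
        self.window.append(loss_ratio > self.loss_threshold)
        # Trip only when MORE than max_ses seconds in the window were
        # severely errored, so isolated loss is ignored.
        return sum(self.window) > self.max_ses


trigger = SesTrigger()
# (packets sent, packets received) for six consecutive seconds.
samples = [(1000, 990), (1000, 700), (1000, 650),
           (1000, 995), (1000, 600), (1000, 500)]
tripped = [trigger.record_second(sent, received) for sent, received in samples]
print(tripped)
```

   In this run the first five seconds leave the breaker closed, because
   at most three of them exceed the assumed loss threshold; the sixth
   second is the fourth severely errored one in the window, so only
   then does the sketch report a trip. A real deployment would also
   need the fail-safe behaviour required in Section 4 (triggering when
   measurement reports stop arriving), which is omitted here.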

   An operator-based response provides opportunity for other action to
   restore the service quality, e.g., by shedding other loads or
   assigning additional capacity, or to consciously avoid reacting to
   the trigger while engineering a solution to the problem. This may
   require the trigger to be sent to a third location (e.g., a network
   operations centre, NOC) responsible for operation of the tunnel
   ingress, rather than the tunnel ingress itself.

6.  Examples where circuit breakers may not be needed.

   A CB is not required for a single CC-controlled flow using TCP, SCTP,
   TFRC, etc. In these cases, the CC methods are already designed to
   prevent congestion collapse.

6.1.  CBs over pre-provisioned Capacity

   One common question is whether a CB is needed when a tunnel is
   deployed in a private network with pre-provisioned capacity.

   In this case, compliant traffic that does not exceed the provisioned
   capacity ought not to result in congestion collapse. A CB will hence
   only be triggered when there is non-compliant traffic. It could be
   argued that this event ought never to happen - but it may also be
   argued that the CB equally ought never to be triggered. If a CB were
   to be implemented, it would provide an appropriate response if
   persistent congestion occurs in an operational network. Implementing
   a CB will not reduce the performance of the flows, but offers
   protection in the event that persistent congestion occurs.

6.2.  CBs with CC Traffic

   IP-based traffic is generally assumed to be congestion-controlled,
   i.e., it is assumed that the transport protocols generating IP-based
   traffic at the sender already employ mechanisms that are sufficient
   to address congestion on the path [RFC5405].
   A question therefore arises when people deploy a tunnel that is
   thought to only carry an aggregate of TCP (or some other
   CC-controlled) traffic: Is there an advantage in this case in using a
   CB?

   Certainly, traffic in such a tunnel will respond to congestion.
   However, the answer to the question may not be obvious, because the
   overall traffic formed by an aggregate of flows that implement a CC
   mechanism does not necessarily prevent congestion collapse. For
   instance, most CC mechanisms require long-lived flows to react to
   reduce the rate of a flow; an aggregate of many short flows may
   result in many terminating before they experience congestion. It is
   also often impossible for a tunnel service provider to know that the
   tunnel only contains CC-controlled traffic (e.g., inspecting packet
   headers may not be possible). The important thing to note is that if
   the aggregate of the traffic does not result in persistent congestion
   (impacting other flows), then the CB will not trigger. This is the
   expected case in this context - so implementing a CB will not reduce
   performance of the tunnel, but offers protection in the event that
   persistent congestion occurs.

6.3.  CBs with Uni-directional Traffic and no Control Path

   A one-way forwarding path could have no associated control path, and
   therefore cannot be controlled using an automated process. This
   service could be provided using a path that has dedicated capacity
   and does not share this capacity with other elastic Internet flows
   (i.e., flows that vary their rate).

   When capacity is shared, one way to mitigate the impact on other
   flows is to manage the traffic envelope by using ingress policing.

   Supporting this type of traffic in the general Internet requires
   operator monitoring to detect and respond to persistent congestion.

7.  Security Considerations

   All circuit breaker mechanisms rely upon coordination between the
   ingress and egress meters and communication with the trigger
   function. This is usually achieved by passing network control
   information across the network. Timely operation of a circuit
   breaker depends on the choice of measurement period. If the receiver
   has an interval that is overly long, then the responsiveness of the
   circuit breaker decreases. This impacts the ability of the circuit
   breaker to detect and react to congestion.

   Mechanisms need to be implemented to prevent attacks on the network
   control information that would result in Denial of Service (DoS).
   The source and integrity of control information (measurements and
   triggers) MUST be protected from off-path attacks. Without
   protection, it may be trivial for an attacker to inject packets with
   values that could prematurely trigger a circuit breaker resulting in
   DoS. Simple protection can be provided by using a randomised source
   port, or equivalent field in the packet header (such as the RTP SSRC
   value and the RTP sequence number) expected not to be known to an
   off-path attacker. Stronger protection can be achieved using a
   secure authentication protocol.

   Transmission of network control information consumes network
   capacity. This control traffic needs to be considered in the design
   of a circuit breaker and could potentially add to network congestion.
   If this traffic is sent over a shared path, it is RECOMMENDED that
   this control traffic is prioritised to reduce the probability of loss
   under congestion. Control traffic also needs to be considered when
   provisioning a network that uses a circuit breaker.

   The circuit breaker MUST be designed to be robust to packet loss that
   can also be experienced during congestion/overload.
   Loss of control traffic may be a side-effect of a congested network,
   but also may arise from other causes.

   Each design of a circuit breaker must evaluate whether the particular
   circuit breaker mechanism has new security implications.

8.  IANA Considerations

   This document makes no request of IANA.

9.  Acknowledgments

   There are many people who have discussed and described the issues
   that have motivated this draft. Contributions and comments included:
   Lars Eggert, Colin Perkins, David Black, Matt Mathis and Andrew
   McGregor. This work was part-funded by the European Community under
   its Seventh Framework Programme through the Reducing Internet
   Transport Latency (RITE) project (ICT-317700).

10.  Revision Notes

   XXX RFC-Editor: Please remove this section prior to publication XXX

   Draft 00

   This was the first revision. Help and comments are greatly
   appreciated.

   Draft 01

   Contained clarifications and changes in response to received
   comments, plus addition of diagram and definitions. Comments are
   welcome.

   WG Draft 00

   Approved as a WG work item on 28th Aug 2014.

   WG Draft 01

   Incorporates feedback after Dallas IETF TSVWG meeting. This version
   is thought ready for WGLC comments.

   WG Draft 02

   Minor fixes for typos. Rewritten security considerations section.

11.  References

11.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC3168]  Ramakrishnan, K., Floyd, S., and D. Black, "The Addition
              of Explicit Congestion Notification (ECN) to IP",
              RFC 3168, DOI 10.17487/RFC3168, September 2001.

   [RFC5405]  Eggert, L. and G. Fairhurst, "Unicast UDP Usage Guidelines
              for Application Designers", BCP 145, RFC 5405,
              DOI 10.17487/RFC5405, November 2008.

11.2.  Informative References

   [Jacobsen88]
              Jacobson, V., "Congestion Avoidance and Control", SIGCOMM
              Symposium proceedings on Communications architectures and
              protocols, August 1988.

   [RFC1112]  Deering, S., "Host extensions for IP multicasting", STD 5,
              RFC 1112, DOI 10.17487/RFC1112, August 1989.

   [RFC4553]  Vainshtein, A., Ed. and YJ. Stein, Ed., "Structure-
              Agnostic Time Division Multiplexing (TDM) over Packet
              (SAToP)", RFC 4553, DOI 10.17487/RFC4553, June 2006.

   [RFC5681]  Allman, M., Paxson, V., and E. Blanton, "TCP Congestion
              Control", RFC 5681, DOI 10.17487/RFC5681, September 2009.

   [RFC6679]  Westerlund, M., Johansson, I., Perkins, C., O'Hanlon, P.,
              and K. Carlberg, "Explicit Congestion Notification (ECN)
              for RTP over UDP", RFC 6679, DOI 10.17487/RFC6679, August
              2012.

   [RTP-CB]   Perkins and Singh, "Multimedia Congestion Control:
              Circuit Breakers for Unicast RTP Sessions", February 2014.

Author's Address

   Godred Fairhurst
   University of Aberdeen
   School of Engineering
   Fraser Noble Building
   Aberdeen, Scotland AB24 3UE
   UK

   Email: gorry@erg.abdn.ac.uk
   URI:   http://www.erg.abdn.ac.uk