Mboned                                                        J. Holland
Internet-Draft                                 Akamai Technologies, Inc.
Intended status: Standards Track                            7 March 2022
Expires: 8 September 2022

              Circuit Breaker Assisted Congestion Control
                       draft-ietf-mboned-cbacc-04

Abstract

   This document specifies Circuit Breaker Assisted Congestion Control
   (CBACC). CBACC enables fast-trip Circuit Breakers by publishing rate
   metadata about multicast channels from senders to intermediate
   network nodes or receivers.
   The circuit breaker behavior is defined
   as a supplement to receiver driven congestion control systems, to
   preserve network health if misbehaving or malicious receiver
   applications subscribe to a volume of traffic that exceeds capacity
   policies or capability for a network or receiving device.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 8 September 2022.

Copyright Notice

   Copyright (c) 2022 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Revised BSD License text as described in Section 4.e of the
   Trust Legal Provisions and are provided without warranty as described
   in the Revised BSD License.

Table of Contents

   1.  Introduction
     1.1.  Background and Terminology
     1.2.  Venues for Contribution and Discussion
     1.3.  Non-obvious doc choices
   2.  Circuit Breaker Behavior
     2.1.  Functional Components
       2.1.1.  Bitrate Advertisement
       2.1.2.  Circuit Breaker Node
       2.1.3.  Communication Method
       2.1.4.  Measurement Function
       2.1.5.  Trigger Function
       2.1.6.  Reaction
       2.1.7.  Feedback Control Mechanism
     2.2.  States
       2.2.1.  Interface State
       2.2.2.  Flow State
     2.3.  Implementation Design Considerations
       2.3.1.  Oversubscription Thresholds
       2.3.2.  Fairness Functions
   3.  YANG Module
     3.1.  Tree Diagram
     3.2.  Module
   4.  IANA Considerations
     4.1.  YANG Module Names Registry
     4.2.  The XML Registry
   5.  Security Considerations
     5.1.  Metadata Security
     5.2.  Denial of Service
       5.2.1.  State Overload
   6.  Acknowledgements
   7.  References
     7.1.  Normative References
     7.2.  Informative References
   Appendix A.  Overjoining
   Author's Address

1.  Introduction

   This document defines Circuit Breaker Assisted Congestion Control
   (CBACC). CBACC defines a Network Transport Circuit Breaker (CB), as
   described by [RFC8084].

   The CB behavior defined in this document uses bit-rate metadata about
   multicast data streams coupled with policy, capacity, and load
   information at a network location to prune multicast channels so that
   the network's aggregate capacity at that location is not exceeded by
   the subscribed channels.

   To communicate the required metadata, this document defines a YANG
   [RFC7950] module that augments the DORMS
   [I-D.draft-ietf-mboned-dorms] YANG module. DORMS provides a
   mechanism for senders to publish metadata about the multicast streams
   they're sending through a RESTCONF service, so that receivers or
   forwarding nodes can discover and consume the metadata with a set of
   standard methods. The CBACC metadata MAY be communicated to
   receivers or forwarding nodes by some other method, but the
   definition of any alternative methods is out of scope for this
   document.

   The CB behavior defined in this document matches the description
   provided in Section 3.2.3 of [RFC8084] of a unidirectional CB over a
   controlled path. The control messages from that description are
   composed of the messages containing the metadata required for
   operation of the CB.

   CBACC is designed to supplement protocols that use multicast IP and
   rely on well-behaved receivers to achieve congestion control.
   Examples of congestion control systems fitting this description
   include [PLM], [RLM], [RLC], [FLID-DL], [SMCC], and WEBRC [RFC3738].

   CBACC addresses a problem with "overjoining" by untrusted receivers.

   In an overjoining condition, receivers (either malicious,
   misconfigured, or with implementation errors) subscribe to multicast
   channels but do not respond appropriately to congestion. When
   sufficient multicast traffic is available for subscription by such
   receivers, this can overload any network.

   The overjoining problem is relevant to misbehaving receivers for both
   receiver-driven and feedback-driven congestion control strategies, as
   described in Section 4.1 of [RFC8085].

   Overjoining attacks and the challenges they present are discussed in
   more detail in Appendix A.

   CBACC offers a solution for the recommendation in Section 4 of
   [RFC8085] that circuit breaker solutions be used even where
   congestion control is optional.

1.1.  Background and Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in BCP
   14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

1.2.  Venues for Contribution and Discussion

   This document is in the GitHub repository at:

   https://github.com/GrumpyOldTroll/ietf-dorms-cluster

   Readers are welcome to open issues and send pull requests for this
   document.

   Please note that contributions may be merged and substantially
   edited, and as a reminder, please carefully consider the Note Well
   before contributing: https://datatracker.ietf.org/submit/note-well/

   Substantial discussion of this document should take place on the
   MBONED working group mailing list (mboned@ietf.org).

   *  Join: https://www.ietf.org/mailman/listinfo/mboned

   *  Search: https://mailarchive.ietf.org/arch/browse/mboned/

1.3.  Non-obvious doc choices

   *  Since nothing is necessarily being actively measured by a network
      component at the ingress, referring to the bitrate advertisement
      as an "ingress meter" for this context was considered confusing by
      reviewers, so the section was renamed with just a note pointing to
      the link. Likewise the egress meter and "CB node".

   *  TBD: might need more and better examples explaining the point in
      Section 2.1.5.1? Some reason to believe it's not sufficiently
      clear...

   *  Another TBD: consider Dino's suggestion from 2020-04-09 to include
      an operational considerations section that addresses some possible
      optimizations for CB placement and configuration.

   *  TBD: add a section walking through the requirements in
      https://datatracker.ietf.org/doc/html/rfc8084#section-4 and
      explaining how this matches.

   *  I'm unclear on whether
      https://datatracker.ietf.org/doc/html/rfc8407#section-3.8.2
      applies here, such that providing an augmentation inside the DORMS
      namespace causes an update to the DORMS document.

2.  Circuit Breaker Behavior

2.1.  Functional Components

   This section maps the functional components described in Section 3.1
   of [RFC8084] to the operational components of the CBACC CB defined by
   this document.

2.1.1.  Bitrate Advertisement

   The metadata provides an advertised maximum data bit-rate, namely the
   "max-speed" field in the YANG model in Section 3. This is a
   self-report by the sender about the maximum amount of traffic a
   sender will send within any time interval given by the
   "data-rate-window" field, which is the measurement interval for the
   CB. This value refers to the total IP Payload data for all packets
   in the same (S,G), and its units are in kilobits per second.
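
   As an illustration of how "max-speed" and "data-rate-window" combine,
   the following Python sketch converts an advertisement into a bit
   budget for one measurement window and compares an observed count
   against it. It is a minimal, non-normative sketch: the function
   names are hypothetical, and it assumes a kilobit is 1000 bits and
   that the CB node already counts the relevant bits per window.

```python
def window_budget_bits(max_speed_kbps: int, data_rate_window_ms: int = 2000) -> int:
    """Bits the advertisement allows within one measurement window.

    kilobits/second * milliseconds = bits, assuming 1 kilobit = 1000 bits
    (the factor of 1000 bits/kbit cancels the 1/1000 s/ms).
    The 2000 ms default mirrors the data-rate-window default in Section 3.
    """
    return max_speed_kbps * data_rate_window_ms


def exceeds_advertisement(observed_bits: int, max_speed_kbps: int,
                          data_rate_window_ms: int = 2000) -> bool:
    """True if the bits observed in one window exceed the advertised budget."""
    return observed_bits > window_budget_bits(max_speed_kbps, data_rate_window_ms)


# Hypothetical flow advertising 5000 kbit/s over the default 2000 ms window.
budget = window_budget_bits(5000)
over = exceeds_advertisement(10_000_001, 5000)
```

   With these illustrative numbers, the budget is 10,000,000 bits per
   window, so a window in which 10,000,001 bits were observed would
   count as exceeding the advertisement.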

   The sender MUST NOT send more data for a data stream than the amount
   of data declared according to its advertised data rate within any
   measurement window, and it's RECOMMENDED for the sender to provide
   some margin to account for the possibility of burst forwarding after
   traffic encounters a non-empty queue, e.g. as sometimes observed with
   ACK compression (see [ZSC91] for a description of the phenomenon).
   If a CB node observes a higher data rate transmitted within any
   measurement window, it MAY circuit-break that flow immediately.

   In the terminology of [RFC8084], the bitrate advertisement qualifies
   as an ingress meter.

2.1.2.  Circuit Breaker Node

   A circuit breaker node (CB node) is a location in a network where the
   constraints of the network and the observations about active traffic
   are compared to the bitrate advertisement in order to decide when and
   whether to perform the circuit breaking behavior. In the terminology
   of [RFC8084], the CB node qualifies as an egress meter.

   The CB node has access to several pieces of information that can be
   used as egress metrics, which may include:

   1.  Physical capacity limits on each interface.

   2.  Configured capacity limits for multicast traffic for each
       interface.

   3.  The observed received data rates of subscribed multicast channels
       with CBACC metadata.

   4.  The observed received data rates of subscribed multicast channels
       without CBACC metadata.

   5.  The observed received data rates of competing non-multicast
       traffic.

   6.  The loss rate for subscribed multicast channels, when available.
       The loss rate is only sometimes observable at a CB node; for
       example, when using AMBI [I-D.draft-ietf-mboned-ambi], or when
       the data stream carries a protocol that is known to the CB node
       by some out of band means, and whose traffic can be monitored for
       loss.
       When available, the loss rates may be used.

   Note that any on-path router can behave as a CB node, even though
   there may be other CB nodes downstream or upstream covering the same
   data streams. When viewing CB nodes as egress meters in the context
   of [RFC8084], it's important to recall there's not a single egress
   meter in the network, but rather an egress meter per CB node,
   representing potentially multiple overlaid circuit breakers that may
   redundantly cover parts of the same path, with potentially different
   constraints based on the network location where the egress meter
   operates. All of the CB nodes anywhere on a path constitute separate
   circuit breakers that may trip independently of other circuit
   breakers.

   Also note that other kinds of components besides on-path routers
   forwarding the traffic can act as CB nodes, for example the operating
   system or browser on a device receiving the traffic, or the receiving
   application itself.

2.1.3.  Communication Method

   CBACC generally operates at a CB node, where metrics such as those
   described in Section 2.1.2 are available through system calls, or by
   communication with various locally deployable system monitoring
   applications. However, the CBACC processing can equivalently occur
   on a separate device that can monitor statistics gathered at a CB
   node, as long as the necessary control functions to trigger the CB
   can be invoked.

   The communication path defined in this document for the CB node to
   obtain the bitrate advertisement in Section 2.1.1 is the use of DORMS
   [I-D.draft-ietf-mboned-dorms]. Other methods MAY be used as well or
   instead, but are out of scope for this document.

2.1.4.  Measurement Function

   The measurement function maintains a few values for each interface,
   computed from the metrics described in Section 2.1.2 and
   Section 2.1.1:

   1.  The aggregate advertised maximum bit-rate capacity consumed by
       CBACC data streams. This is the sum of the max-speed values in
       the CBACC metadata for all data streams subscribed through an
       interface.

   2.  An oversubscription threshold for each interface. The
       oversubscription threshold will be determined differently for CB
       nodes in different contexts. In some network devices, it might
       be as simple as an administratively configured absolute value or
       proportion of an interface's capacity. For other situations,
       like a CB node operating in a context with loss visibility, it
       could be a dynamically changing value that grows when data
       streams are successfully subscribed and receiving data without
       loss, and shrinks as loss is observed across subscribed data
       streams. The oversubscription threshold calculation could also
       incorporate other information like out-of-band path capacity
       measurements with bandwidth detection techniques such as
       [PathChirp] or [CapProbe].

       This document covers some non-normative examples of valid
       oversubscription threshold functions in Section 2.3.1. In
       general, the oversubscription threshold is the primary parameter
       that different CBs in different contexts can tune to provide the
       safety guarantees necessary for their context.

2.1.5.  Trigger Function

   The trigger function fires when the aggregate advertised maximum
   bit-rate exceeds the oversubscription threshold for any interface.

   When oversubscribed, the trigger function changes the states of
   subscribed channels to "blocked" until the aggregate subscribed
   bit-rate is below the oversubscription threshold again.

2.1.5.1.  Fairness and Inter-flow Ordering

   The trigger function orders the monitored flows according to a
   fairness function and a within-sender priority ordering (chosen by
   the sender as part of the CBACC metadata).
   When flows are blocked, they're blocked in order until the aggregate
   bitrate of the permitted flows does not exceed the oversubscription
   thresholds monitored by the CB node.

   Flows from a single sender MUST be ordered according to their
   priority field from the CBACC metadata when compared with each other.
   This takes precedence over the fairness function ordering, since
   certain flows from the same sender may need strict priority over
   others.

   For example, consider a sender using File Delivery over
   Unidirectional Transport (FLUTE, defined in [RFC6726]) that sends
   File Delivery Table (FDT) Instances (see Section 3.2 of [RFC6726]) in
   one (S,G) and data for the various referenced files in other (S,G)s.
   In this case the data for the files will not be consumable without
   the (S,G) containing the FDT. Other transport protocols may
   similarly send control information (often with a lower bitrate) on
   one channel, and data information on another. In these cases, the
   sender may need to ensure that data channels are only available when
   the control channels are also available.

   When comparing flows between senders, (S,G)s from the same sender
   with different priorities should be treated as aggregated (S,G)s with
   regard to their declared bitrate consumption, to ensure that if any
   flows from the same sender need to be pruned by the circuit-breaker,
   the least preferred priority flows from that sender are pruned first.

   Between-sender flows and flows from the same sender with the same
   priority are ordered according to the fairness function. TBD: need
   to work through details, this does not work as written. Sample
   fairness function would reward senders for splitting a flow in 2
   (more total subscribers). Maybe should count offload instead? This
   has trouble from favoring padding in your flow, but is (I think?)
   dominated by subscriber count where that's known.
   The fairness function can be different for CBs in different contexts.

   A CBACC CB implementation SHOULD provide mechanisms for
   administrative controls to configure explicit biases, as this may be
   necessary to support Service Level Agreements for specific events or
   providers, or to block or de-prioritize channels with historically
   known misbehavior.

   Subject to the above constraints, where possible the default fairness
   behavior SHOULD favor streams with many receivers over streams with
   few receivers, and streams with a low bit-rate over streams with a
   high bit-rate. See Section 2.3.2 for further considerations and
   examples.

2.1.6.  Reaction

   When the trigger function fires and a subscribed channel becomes
   blocked, the reaction depends on whether it's an upstream interface
   or a downstream interface.

   If a channel is blocked on one or more downstream interfaces, it may
   still be unblocked on other downstream interfaces. When this is the
   case, traffic is simply not forwarded along blocked interfaces, even
   though clients might still be joined downstream of those interfaces.

   When a channel is blocked on all downstream interfaces or when the
   upstream interface is oversubscribed, the channel is pruned so that
   data no longer arrives from the network on the upstream interface.
   The prune would be performed with a PIM prune (Section 3.5 of
   [RFC7761]), or a "leave" operation to be communicated via IGMP, MLD,
   or another multicast group signaling mechanism, according to the
   expected signaling within the network.

   Once initially pruned, a flow SHOULD remain pruned for a minimum
   amount of time. The minimum hold-down duration SHOULD be no less
   than 2.5 minutes by default, even if available bitrate space clears
   up, to ensure downstream subscriptions will notice and respond.
   The hold-down duration SHOULD be extended from the minimum by a
   randomly chosen number of seconds uniformly distributed over a
   configurable desynchronization period, to avoid synchronized recovery
   of different circuit breakers along the path. The default length of
   the desynchronization period should be at least 30 seconds.

   2.5 minutes is chosen to exceed the default maximum lifetime of 2
   minutes that can occur if an IGMP responder suddenly stops operation,
   and ceases responding to IGMP queries with membership reports, and 30
   seconds is chosen to allow for some flexibility in lost packets. The
   values MAY be administratively tuned as needed by network operators
   to meet performance goals specific to their networks or to the
   traffic they're forwarding.

   When enough capacity is available for a circuit-broken stream to be
   unblocked and the circuit-breaker hold-down time is expired, flows
   SHOULD be unblocked according to the priority order until no more
   flows can be unblocked without exceeding the circuit breaker limits.

2.1.7.  Feedback Control Mechanism

   The bitrate advertisement metadata from Section 2.1.1 should be
   refreshed as needed to maintain up to date values. When using DORMS
   and RESTCONF, the Subscription to YANG Notifications for Datastore
   Updates [RFC8641] is the preferred method to receive changes if
   available.

   If datastore subscriptions are not supported by the client or server,
   the HTTP Cache Control headers provide valid refresh time properties
   from the server, and SHOULD be used if present. If No-Cache is used,
   the default refresh timing SHOULD be 30 seconds. A uniformly
   distributed random value between 0 and 10 seconds SHOULD be added to
   the Cache Control or the default refresh timing to avoid
   synchronization across multiple clients.

2.2.  States

2.2.1.  Interface State

   A CB holds the following state for each interface, for both the
   inbound and outbound directions on that interface:

   *  aggregate bandwidth: The sum of the bandwidths of all
      non-circuit-broken CBACC flows that transit this interface in this
      direction.

   *  bandwidth limit: The maximum aggregate CBACC advertised bandwidth
      allowed, not including circuit-broken flows.

      When reducing the bandwidth limit due to congestion, the circuit
      breaker SHOULD NOT reduce the limit by more than half its value in
      10 seconds, and SHOULD use a smoothing function to reduce the
      limit gradually over time.

      It is RECOMMENDED that no more than half the capacity for a link
      be allocated to CBACC flows if the link might be shared with
      unicast traffic that is responsive to congestion.

2.2.2.  Flow State

   Data streams with CBACC metadata have a state for the upstream
   interface through which the stream is joined:

   *  'subscribed'

      Indicates that the circuit breaker is subscribed upstream to the
      flow and forwarding packets through zero or more egress
      interfaces.

   *  'pruned'

      Indicates that the flow has been circuit-broken. A request to
      unsubscribe from the flow has been sent upstream, e.g. a PIM prune
      (Section 3.5 of [RFC7761]) or a "leave" operation communicated via
      IGMP, MLD, or another group membership management mechanism.

   Data streams also have a per-interface state for downstream
   interfaces with subscribers, where the data is being forwarded. It's
   one of:

   *  'forwarding'

      Indicates that the flow is a non-circuit-broken flow in steady
      state, forwarding packets downstream.

   *  'blocked'

      Indicates that data packets for this flow are NOT forwarded
      downstream via this interface.

2.3.  Implementation Design Considerations

2.3.1.  Oversubscription Thresholds

   TBD.

2.3.2.  Fairness Functions

   As an example fairness function that makes good sense for a general
   case of unknown traffic:

   Consider a network where the receiver count for multicast channels is
   known, for example via the experimental PIM extension for population
   count defined in [RFC6807].

   A good fairness metric for a flow is max-bandwidth divided by
   receiver-count, with lower values of the fairness metric favored over
   higher values.

   An overview of some other approaches to appropriate fairness metrics
   is given in Section 2.3 of [RFC5166].

3.  YANG Module

3.1.  Tree Diagram

   The tree diagram below follows the notation defined in [RFC8340].

   module: ietf-cbacc

     augment /dorms:dorms/dorms:metadata/dorms:sender/dorms:group:
       +--rw cbacc!
          +--rw max-speed           uint32
          +--rw max-packet-size?    uint16
          +--rw data-rate-window?   uint32
          +--rw priority?           uint16

3.2.  Module

   <CODE BEGINS>
   file ietf-cbacc@2022-03-07.yang
   module ietf-cbacc {
       yang-version 1.1;

       namespace "urn:ietf:params:xml:ns:yang:ietf-cbacc";
       prefix "cbacc";

       import ietf-dorms {
           prefix "dorms";
           reference "I-D.draft-ietf-mboned-dorms";
       }

       organization "IETF";

       contact
           "Author:   Jake Holland
           ";

       description
       "Copyright (c) 2019 IETF Trust and the persons identified as
        authors of the code. All rights reserved.

        Redistribution and use in source and binary forms, with or
        without modification, is permitted pursuant to, and subject to
        the license terms contained in, the Simplified BSD License set
        forth in Section 4.c of the IETF Trust's Legal Provisions
        Relating to IETF Documents
        (https://trustee.ietf.org/license-info).

        This version of this YANG module is part of
        draft-ietf-mboned-cbacc. See the internet draft for full
        legal notices.

        The key words 'MUST', 'MUST NOT', 'REQUIRED', 'SHALL', 'SHALL
        NOT', 'SHOULD', 'SHOULD NOT', 'RECOMMENDED', 'NOT RECOMMENDED',
        'MAY', and 'OPTIONAL' in this document are to be interpreted as
        described in BCP 14 (RFC 2119) (RFC 8174) when, and only when,
        they appear in all capitals, as shown here.

        This module contains the definition for bandwidth consumption
        metadata for SSM channels, as an extension to DORMS
        (draft-ietf-mboned-dorms).";

       revision 2021-07-08 {
           description "Draft version, post-early-review.";
           reference
               "draft-ietf-mboned-cbacc";
       }

       augment
           "/dorms:dorms/dorms:metadata/dorms:sender/dorms:group" {
           description "Definition of the CBACC bitrate metadata
               for the data stream";

           container cbacc {
               presence "CBACC-enabled flow";
               description
                   "Information to enable fast-trip circuit breakers";
               leaf max-speed {
                   type uint32;
                   units "kilobits/second";
                   mandatory true;
                   description "Maximum bitrate for this stream, in
                       kilobits of IP packet data (including headers)
                       of native multicast traffic per second";
               }
               leaf max-packet-size {
                   type uint16;
                   default 1400;
                   description "Maximum IP payload size, in octets.";
               }
               leaf data-rate-window {
                   type uint32;
                   units "milliseconds";
                   default 2000;
                   description
                       "Time window over which data rate is guaranteed,
                        in milliseconds.";
                   /* TBD: range limits? */
               }
               leaf priority {
                   type uint16;
                   default 256;
                   description
                       "The relative preference level for keeping this
                        flow compared to other flows from this sender
                        (higher value is more preferred to keep)";
               }
           }
       }
   }
   <CODE ENDS>

4.  IANA Considerations

4.1.  YANG Module Names Registry

   This document adds one YANG module to the "YANG Module Names"
   registry maintained at .
The following registrations are made, per the format in 639 Section 14 of [RFC6020]: 641 name: ietf-cbacc 642 namespace: urn:ietf:params:xml:ns:yang:ietf-cbacc 643 prefix: cbacc 644 reference: I-D.draft-ietf-mboned-cbacc 646 4.2. The XML Registry 648 This document adds the following registration to the "ns" subregistry 649 of the "IETF XML Registry" defined in [RFC3688], referencing this 650 document. 652 URI: urn:ietf:params:xml:ns:yang:ietf-cbacc 653 Registrant Contact: The IESG. 654 XML: N/A, the requested URI is an XML namespace. 656 5. Security Considerations 658 TBD: Yang Doctor review from Reshad said this should "mention the 659 YANG data nodes". I think this means "do what 660 https://tools.ietf.org/html/rfc8407#section-3.7 says"? 662 5.1. Metadata Security 664 Be sure to authenticate the metadata. See DORMS security 665 considerations, and don't accept unauthenticated metadata if using an 666 alternative means. 668 5.2. Denial of Service 670 5.2.1. State Overload 672 Since CBACC flows require state, it may be possible for a set of 673 receivers and/or senders, possibly acting in concert, to generate 674 many flows in an attempt to overflow the circuit breakers' state 675 tables. 677 It is permissible for a network node to behave as a CBACC circuit 678 breaker for some CBACC flows while treating other CBACC flows as non- 679 CBACC, as part of a load balancing strategy for the network as a 680 whole, or simply as defense against this concern when the number of 681 monitored flows exceeds some threshold. 683 The same techniques described in Section 3.1 of [RFC4609] can be used 684 to help mitigate this attack, for much the same reasons. It is 685 RECOMMENDED that network operators implement measures to mitigate 686 such attacks. 688 6. 
Acknowledgements 690 Many thanks to Devin Anderson, Ben Kaduk, Cheng Jin, Scott Brown, 691 Miroslav Ponec, Bob Briscoe, Lenny Giuliani, Christian Worm 692 Mortensen, Dino Farinacci, and Reshad Rahman for their thoughtful 693 comments and contributions. 695 7. References 697 7.1. Normative References 699 [I-D.draft-ietf-mboned-ambi] 700 Holland, J. and K. Rose, "Asymmetric Manifest Based 701 Integrity", Work in Progress, Internet-Draft, draft-ietf- 702 mboned-ambi-01, 31 October 2020, 703 . 706 [I-D.draft-ietf-mboned-dorms] 707 Holland, J., "Discovery Of Restconf Metadata for Source- 708 specific multicast", Work in Progress, Internet-Draft, 709 draft-ietf-mboned-dorms-01, 31 October 2020, 710 . 713 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 714 Requirement Levels", BCP 14, RFC 2119, 715 DOI 10.17487/RFC2119, March 1997, 716 . 718 [RFC7950] Bjorklund, M., Ed., "The YANG 1.1 Data Modeling Language", 719 RFC 7950, DOI 10.17487/RFC7950, August 2016, 720 . 722 [RFC8084] Fairhurst, G., "Network Transport Circuit Breakers", 723 BCP 208, RFC 8084, DOI 10.17487/RFC8084, March 2017, 724 . 726 [RFC8085] Eggert, L., Fairhurst, G., and G. Shepherd, "UDP Usage 727 Guidelines", BCP 145, RFC 8085, DOI 10.17487/RFC8085, 728 March 2017, . 730 [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 731 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, 732 May 2017, . 734 [RFC8340] Bjorklund, M. and L. Berger, Ed., "YANG Tree Diagrams", 735 BCP 215, RFC 8340, DOI 10.17487/RFC8340, March 2018, 736 . 738 7.2. Informative References 740 [CapProbe] Kapoor, R., Chen, L., Lao, L., Gerla, M., and M.Y. 741 Sanadidi, "CapProbe: A Simple and Accurate Capacity 742 Estimation Technique", September 2004, 743 . 745 [FLID-DL] Byers, J.W., Horn, G., Luby, M., Mitzenmacher, M., Shaver, 746 W., and IEEE, "FLID-DL: congestion control for layered 747 multicast", DOI 10.1109/JSAC.2002.803998, n.d., 748 . 
   [PathChirp]
              Ribeiro, V.J., Riedi, R.H., Baraniuk, R.G., Navratil, J.,
              Cottrell, L., Department of Electrical and Computer
              Engineering Rice University, and SLAC/SCS-Network
              Monitoring, Stanford University, "pathChirp: Efficient
              Available Bandwidth Estimation for Network Paths", 2003.

   [PLM]      Legout, A., Biersack, E.W., and Institut EURECOM, "PLM:
              Fast Convergence for Cumulative Layered Multicast
              Transmission Schemes", 1999.

   [RFC3688]  Mealling, M., "The IETF XML Registry", BCP 81, RFC 3688,
              DOI 10.17487/RFC3688, January 2004.

   [RFC3738]  Luby, M. and V. Goyal, "Wave and Equation Based Rate
              Control (WEBRC) Building Block", RFC 3738,
              DOI 10.17487/RFC3738, April 2004.

   [RFC4609]  Savola, P., Lehtonen, R., and D. Meyer, "Protocol
              Independent Multicast - Sparse Mode (PIM-SM) Multicast
              Routing Security Issues and Enhancements", RFC 4609,
              DOI 10.17487/RFC4609, October 2006.

   [RFC5166]  Floyd, S., Ed., "Metrics for the Evaluation of Congestion
              Control Mechanisms", RFC 5166, DOI 10.17487/RFC5166,
              March 2008.

   [RFC6020]  Bjorklund, M., Ed., "YANG - A Data Modeling Language for
              the Network Configuration Protocol (NETCONF)", RFC 6020,
              DOI 10.17487/RFC6020, October 2010.

   [RFC6726]  Paila, T., Walsh, R., Luby, M., Roca, V., and R. Lehtonen,
              "FLUTE - File Delivery over Unidirectional Transport",
              RFC 6726, DOI 10.17487/RFC6726, November 2012.

   [RFC6807]  Farinacci, D., Shepherd, G., Venaas, S., and Y. Cai,
              "Population Count Extensions to Protocol Independent
              Multicast (PIM)", RFC 6807, DOI 10.17487/RFC6807,
              December 2012.

   [RFC7761]  Fenner, B., Handley, M., Holbrook, H., Kouvelas, I.,
              Parekh, R., Zhang, Z., and L. Zheng, "Protocol Independent
              Multicast - Sparse Mode (PIM-SM): Protocol Specification
              (Revised)", STD 83, RFC 7761, DOI 10.17487/RFC7761,
              March 2016.

   [RFC8641]  Clemm, A. and E. Voit, "Subscription to YANG Notifications
              for Datastore Updates", RFC 8641, DOI 10.17487/RFC8641,
              September 2019.

   [RLC]      Rizzo, L., Vicisano, L., and J. Crowcroft, "The RLC
              multicast congestion control algorithm", 1999.

   [RLM]      McCanne, S., Jacobson, V., Vetterli, M., University of
              California, Berkeley, and Lawrence Berkeley National
              Laboratory, "Receiver-driven Layered Multicast", 1995.

   [SMCC]     Kwon, G., Byers, J.W., and Computer Science Department,
              Boston University, "Smooth Multirate Multicast Congestion
              Control", 2002.

   [ZSC91]    Zhang, L., Shenker, S., and D.D. Clark, "Observations and
              Dynamics of a Congestion Control Algorithm: The Effects of
              Two-Way Traffic", Proc. ACM SIGCOMM, ACM Computer
              Communications Review (CCR), Vol 21, No 4, pp. 133-147,
              1991.

Appendix A.  Overjoining

   [RFC8085] describes several remedies for unicast congestion control
   under UDP, even though UDP does not itself provide congestion
   control.  In general, any network node under congestion could in
   theory collect evidence that a unicast flow's sending rate is not
   responding to congestion, and would then be justified in circuit-
   breaking it.

   With multicast IP, the situation is different, especially in the
   presence of malicious receivers.  A well-behaved sender using a
   receiver-controlled congestion scheme such as WEBRC does not reduce
   its send rate in response to congestion, instead relying on receivers
   to leave the appropriate multicast groups.

   This leads to a situation where, when a network accepts inter-domain
   multicast traffic, as long as there are senders somewhere in the
   world with aggregate bandwidth that exceeds a network's capacity,
   receivers in that network can join the flows and overflow the network
   capacity.
   A receiver controlled by an attacker could do this at the IGMP/MLD
   level without running the application layer protocol that
   participates in the receiver-controlled congestion control.

   A network might be able to detect and defend against the most naive
   version of such an attack by blocking end users that try to join too
   many flows at once.  However, an attacker can achieve the same effect
   by joining a few high-bandwidth flows, if those exist anywhere, and
   an attacker that controls a few machines in a network can coordinate
   the receivers so they join disjoint sets of non-responsive sending
   flows.

   This scenario will produce congestion in a middle node in the network
   that can't be easily detected at the edge where the IGMP/MLD join is
   accepted.  Thus, an attacker with a small set of machines in a target
   network can always trip a circuit breaker if one is present, or
   otherwise induce excessive congestion within the bandwidth allocated
   to multicast.  This problem gets worse as more multicast flows become
   available.

   Although the same can apply to non-responsive unicast traffic,
   network operators can assume that non-responsive unicast sending
   flows are in violation of congestion control best practices, and can
   therefore cut off flows associated with the misbehaving senders.  By
   contrast, non-responsive multicast senders are likely to be well-
   behaved participants in receiver-controlled congestion control
   schemes.

   However, receiver-controlled congestion control schemes also show the
   most promise for efficient massive-scale content distribution via
   multicast, provided network health can be ensured.  Therefore,
   mechanisms to mitigate overjoining attacks while still permitting
   receiver-controlled congestion control are necessary.

Author's Address

   Jake Holland
   Akamai Technologies, Inc.
   150 Broadway
   Cambridge, MA 02144
   United States of America
   Email: jakeholland.net@gmail.com