ConEx                                                         B. Briscoe
Internet-Draft                                                        BT
Intended status: Informational                             July 16, 2012
Expires: January 17, 2013

         Initial Congestion Exposure (ConEx) Deployment Examples
                  draft-briscoe-conex-initial-deploy-03

Abstract

   This document gives examples of how ConEx deployment might get
   started, focusing on unilateral deployment by a single network.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.
   The list of current Internet-Drafts is at
   http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on January 17, 2013.

Copyright Notice

   Copyright (c) 2012 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Recap: Incremental Deployment Features of the ConEx Protocol
   3.  ConEx Components
     3.1.  Recap of Basic ConEx Components
     3.2.  Per-Network Deployment Concepts
   4.  Example Initial Deployment Arrangements
     4.1.  Single Receiving Network Scenario
       4.1.1.  ConEx Functions in the Single Receiving Network Scenario
       4.1.2.  Incentives to Unilaterally Deploy ConEx in a Receiving
               Network
   5.  Security Considerations
   6.
       IANA Considerations
   7.  Conclusions
   8.  Acknowledgments
   9.  Comments Solicited
   10. Informative References
   Appendix A.  Summary of Changes between Drafts

1.  Introduction

   This document gives examples of how ConEx deployment might get
   started, focusing on unilateral deployment by a single network.

2.  Recap: Incremental Deployment Features of the ConEx Protocol

   The ConEx mechanism document [conex-abstract-mech] goes to great
   lengths to design for incremental deployment in all the respects
   below.  It should be referred to for precise details on each of these
   points:

   o  The ConEx mechanism is essentially a change to the source, in
      order to re-insert congestion feedback into the network.

   o  Source-host-only deployment is possible without any negotiation
      required, and individual transport protocol implementations within
      a source host can be updated separately.

   o  Receiver modification may optionally improve ConEx for some
      transport protocols with feedback limitations (TCP being the main
      example), but it is not a necessity.

   o  Proxies for the source and/or receiver are feasible (though not
      necessarily straightforward).

   o  Queues and network forwarding do not require any modification for
      ConEx.

   o  ECN is not required in the network for ConEx.  If some network
      nodes support ECN, it can be used by ConEx.

   o  ECN is not required at the receiver for ConEx.  The sender should
      nonetheless attempt to negotiate ECN-usage with the receiver,
      given some aspects of ConEx work better the more ECN is deployed,
      particularly auditing and border measurement.

   o  Given ConEx exposes information for IP-layer policy devices to
      use, the design does not preclude possible innovative uses of
      ConEx information by other IP-layer devices, e.g. forwarding
      itself.

   o  Packets indicate whether or not they support ConEx.

3.  ConEx Components

3.1.  Recap of Basic ConEx Components

   [conex-abstract-mech] introduces the following components:

   o  The ConEx Wire Protocol (currently only specified for IPv6
      [conex-destopt], although a possible way to fit ConEx into the
      IPv4 header has been described [intarea-ipv4-id-reuse])

   o  Forwarding devices (unmodified)

   o  Sender (modified for ConEx)

   o  Receiver (optionally modified)

   o  Audit

   o  Policy Devices:

      *  Rest-of-Path Congestion Monitoring Devices (using information
         from the ConEx wire protocol)

      *  Congestion Policers (using rest-of-path congestion monitoring)

   [conex-abstract-mech] should be referred to for definitions of each
   of these components and further explanation.

   The goal of all these ConEx elements for this scenario is to expose
   information about congestion on the whole path to a
   congestion-policer.  A congestion-policer is nearly identical to a
   traditional token-bucket-based bit-rate policer, with two
   differences: tokens fill the bucket at a rate that represents the
   volume of congestion that the customer is allowed to contribute to
   over time, and tokens drain from the bucket at a rate that depends on
   the ConEx signals representing rest-of-path congestion.  [CongPol]
   introduces congestion-policing and [conex-concepts-uses] explains the
   benefits of policing based on congestion-volume compared to methods
   like weighted round-robin traditionally used in a BRAS.

3.2.  Per-Network Deployment Concepts

   Network deployment-related definitions:

   Internet Ingress:  The first IP node a packet traverses that is
      outside the source's own network.
      In a residential access network scenario, for traffic from a home
      this is the first IP-aware node after the home access equipment.
      For Internet access from an enterprise network this is the
      provider edge router.

   Internet Egress:  The last IP node a packet traverses before reaching
      the receiver's network.

   ConEx-Enabled Network:  A network whose edge nodes implement ConEx
      policy functions.

   Each network can unilaterally choose to use any ConEx information
   given by those sources using ConEx, independently of whether other
   networks use it.

   Typically, a network will use ConEx information by deploying a policy
   function at the ingress edge of its network to monitor arriving
   traffic and to act in some way on the congestion information in those
   packets that are ConEx-enabled.  Actions might include policing,
   altering the class of service, or re-routing.  Alternatively, less
   direct actions via a management system might include triggering
   capacity upgrades, triggering penalty clauses in contracts or levying
   charges between networks based on ConEx measurements.

   Typically, a network using ConEx information will deploy a ConEx
   policy function near the ingress edge and a ConEx audit function near
   the egress edge.  The segment of the path between a ConEx policy
   function and a ConEx audit function can be considered to be a
   ConEx-protected segment of the path.  Assuming a network covers all
   its ingresses and egresses with policy functions and audit functions
   respectively, the network within this ring will be a ConEx-protected
   network.

   Of course, because each edge device usually serves as both an ingress
   and an egress, the two functions are both likely to be present in
   each edge device.

4.  Example Initial Deployment Arrangements

   In all the deployment scenarios below, we assume that deployment
   starts with some data sources being modified with ConEx code.
   The rationale for this is that the developer of a scavenger transport
   protocol like LEDBAT has a strong incentive to tell the network how
   little congestion it is causing despite sending large volumes of
   data.  In this case the developer makes the first move expecting it
   will prompt at least some networks to move in response--so that they
   use the ConEx information to reward users of the scavenger protocol.

4.1.  Single Receiving Network Scenario

   The name 'Receiving Network' for this scenario merely emphasises that
   most data is arriving from connected networks and data centres and
   being consumed by residential customers on this access network.  Some
   data is of course also travelling in the other direction.

   [Diagram: the Peer network, Data Centre, CDN, Core, two BRASs,
   DSLAMs, Home-a and Home-b, with the extent of the Access Network
   marked and the locations of the P, M and A functions annotated.]

      P=Congestion-Policer; M=Congestion-Monitor; A=Audit function

               Figure 1: Single Receiving Network Scenario

   Figure 1 is an attempt to show the salient features of a ConEx
   deployment in a typical broadband access provider's network (within
   the constraints of ASCII art).  Broadband remote access servers
   (BRASs) control access to the core network from the access network
   and vice versa.  Home networks (and small businesses) connect to the
   access network, but only two are shown.

   In this diagram, all data is travelling towards the access network of
   Home-b, from the Peer network, the Data centre, the CDN and Home-a.
   Data actually travels in both directions on all links, but only one
   direction is shown.

   The data centre, core and access network are all run by the same
   network operator, but each is the responsibility of a different
   department with internal accounting between them.  The content
   distribution network (CDN) is operated by a third party CDN provider,
   and of course the peer network is also operated by a third party.

   This operator of the data centre, core and access network is the only
   one in the diagram to have deployed ConEx monitoring and policy
   devices at the edges of its network.  However, it has not enabled ECN
   on any of its network elements and neither has any other network in
   the diagram.  The operator has deployed a congestion policing
   function (P) on the provider-edge router where the peer attaches to
   its core, on the BRAS where the CDN attaches and on the other BRAS
   where residential customers like Home-a attach.  On the provider-edge
   router where the data centre attaches it has deployed a congestion
   monitoring function (M).  Each of these policing and monitoring
   functions handles the aggregate of all traffic traversing it, for all
   destinations.

   The operator has deployed an audit function on each logical output
   port of the BRAS for each end-customer site like Home-b.  The audit
   function handles the aggregate of all traffic for that end-customer
   from all sources.  For traffic in the opposite direction (e.g. from
   Home-b to Home-a), there would be equivalent policing (P) and audit
   (A) functions in the converse locations to those shown.

   Some content sources in the CDN and in the data centre are using the
   ConEx protocol, but others are not.  There is a similar situation for
   hosts attached to the Peer network and hosts in home networks like
   Home-a: some are sending ConEx packets at least for bulk data
   transports, while others are not.

4.1.1.
        ConEx Functions in the Single Receiving Network Scenario

   Within the BRAS there are logical ports that model the rate of each
   access line from the DSLAM to each home network [TR-059], [TR-101].
   They are fed by a shared queue that models the rate of the downstream
   link from the BRAS to the DSLAM (sometimes called the backhaul
   network).  If there is congestion anywhere in the set of networks in
   Figure 1 it is nearly always:

   o  either self-congestion in the queues into the logical ports
      representing the access lines

   o  or shared congestion in the shared queue on the BRAS that feeds
      them.

   Any ConEx sources sending data through this BRAS will receive
   feedback about these losses from the destination and re-insert it as
   ConEx markings into the data.  Figure 2 shows an example plot of the
   loss levels that might be seen at different monitoring points along a
   path between the data centre and Home-b, for instance.  The top half
   of the figure shows that the loss probability within the BRAS
   consists of 0.1% at the shared queue and 0.2% self-congestion in the
   logical output port that models the access line, making 0.3% in
   total.  This upper diagram also shows whole path congestion as
   signalled by the ConEx sender, which remains unchanged along the
   whole path at 0.3%.

   The lower half of the figure shows (downstream congestion) = (whole
   path) - (upstream congestion).  Upstream congestion can only be
   monitored locally where the loss actually happens (within the BRAS
   output queues).  Nonetheless, given there is rarely loss anywhere
   else but within the BRAS, this limitation is not significant in this
   scenario.  The lower half of the figure also shows the location of
   the policing and audit functions.  Policing anywhere within or
   upstream of the BRAS will be based on the downstream congestion level
   of 0.3%.
   Auditing within the BRAS, but after all the queues, can check that
   the whole path congestion signalled by ConEx is no less than the loss
   levels experienced within the BRAS itself.

   [Plot: loss probability (vertical) against monitoring point
   (horizontal) along the path from the Data centre through the core and
   the BRAS (shared backhaul queue, then access queue) to the Home.
   Upper plot: whole path congestion (0.3%) and upstream congestion
   (0.1% after the shared queue).  Lower plot: downstream congestion
   (0.3%, then 0.2%), with the policing and audit points marked.]

            Figure 2: Example plot of loss levels along a path

4.1.2.  Incentives to Unilaterally Deploy ConEx in a Receiving Network

   Even a sending application that is modified to use ConEx can choose
   whether to send ConEx or Not-ConEx packets.  Nonetheless, ConEx
   packets bring information to a policer about congestion expected on
   the rest of the path beyond the policer.  Not-ConEx packets bring no
   such information.  Therefore a network that has deployed ConEx
   policers will tend to rate-limit Not-ConEx packets conservatively in
   order to manage the unknown risk of congestion.  In contrast, a
   network does not normally need to rate-limit ConEx-enabled packets
   unless they reveal a persistently high contribution to congestion.
   This natural tendency for networks to favour senders that provide
   ConEx information encourages senders to choose to use the ConEx
   protocol whenever they can.

   In particular, high volume sources have the most incentive to deploy
   ConEx.  This is because high volume sources (e.g.
   video download sites or peer-to-peer file-sharing) can gain by
   implementing a low 'weight' end-to-end transport (i.e. a less
   aggressive response to congestion than other transports).  Then,
   although they send a large volume, they need not contribute
   significantly to congestion.  If the ISP currently limits data
   volume, or offers chargeable tiers based on data volume, such
   customers stand to gain considerably if they can encourage the ISP to
   limit usage based on congestion-volume instead of volume.

   Figure 3 explains why this is the case.  The plots show bit-rate on
   the vertical axis and time horizontally.  A file transfer (e.g. the
   one labelled from customer 'b') is given a simplified representation
   as a rectangle, implying it runs at a set rate for a time, then
   completes.  The maximum height of each plot represents the maximum
   capacity of the shared link across the backhaul network, which is
   typically the bottleneck in a broadband network.  The hatched regions
   represent unused capacity. 'c' represents the high volume source that
   we intend to show has an incentive to deploy ConEx.

   In the upper half of the figure, customers 'b' and 'c' both use
   transports with equal weights, which is why they are shown with equal
   rates when they both compete for the capacity of the line. 'c' sends
   larger files than 'b', so when 'b' completes each of its files, 'c'
   can use the full capacity of the line until 'b' starts the next file.
   In the lower half of the figure, 'c' uses a less aggressive (lower
   weight) transport, so whenever 'b' sends a file, 'c' yields more of
   its rate.  This allows 'b' to complete its transfer earlier, so that
   'c' can take up the full rate earlier. 'b' sends files of the same
   volume (same area in the graph), just faster, so they complete sooner
   (tall and thin instead of shorter and wider).
   As a result, 'c' hardly finishes any later than in the upper diagram.
   However, 'c' will have contributed much less to congestion, and 'b'
   completes the majority of its file transfers much faster. 'b' has
   also contributed less to congestion.

   As we have said, customer 'c' in particular stands to gain if the ISP
   bases usage-limits (or usage charges) on congestion-volume rather
   than volume.  The ISP also has a strong incentive to reward customers
   like 'c', because they make the network performance appear far better
   than before for customers like 'b' (e.g. short Web transfers).
   However, the network cannot make this move until customers like 'c'
   expose congestion information (ConEx) that the ISP can use in its
   traffic management or contracts.

   [Plots: bit-rate (vertical) against time (horizontal) for the
   transfers of customers 'b' and 'c' sharing the backhaul link; hatched
   regions represent unused capacity.]

   Figure 3: Weighted congestion controls with equal weights (upper) and
                             unequal (lower)

   Of course, in reality there would be more than two customers.  But
   this would mean that short transfers like 'b' stand to gain even
   more, as multiple larger files would be yielding at once.
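
   The difference between the two usage metrics can be illustrated with
   a small calculation.  The following is only a sketch under stated
   assumptions, not part of any ConEx specification: it takes
   congestion-volume to be the bytes sent multiplied by the loss
   fraction in force while they were sent, and every traffic figure, as
   well as the names Transfer, volume and congestion_volume, is
   hypothetical.  It compares customer 'c' sending the same volume with
   an equal-weight transport and with a lower-weight one.

```python
from dataclasses import dataclass


@dataclass
class Transfer:
    """A period of sending at a roughly constant congestion level."""
    bytes_sent: int        # volume sent during the period
    loss_fraction: float   # congestion level seen by those bytes (0.003 = 0.3%)


def volume(transfers):
    """Total bytes sent: the metric a volume cap or volume tier uses."""
    return sum(t.bytes_sent for t in transfers)


def congestion_volume(transfers):
    """Bytes sent, weighted by the congestion level they were sent into."""
    return sum(t.bytes_sent * t.loss_fraction for t in transfers)


# Hypothetical figures for customer 'c' with an equal-weight transport:
# 40 GB of its 100 GB is sent while competing with 'b' at 0.3% loss
# (the total loss level used in the example of Section 4.1.1).
equal_weight = [
    Transfer(bytes_sent=60_000_000_000, loss_fraction=0.0),
    Transfer(bytes_sent=40_000_000_000, loss_fraction=0.003),
]

# Hypothetical figures for the same 100 GB sent with a lower-weight
# transport: 'c' yields, so less is sent during contention, and the
# contention is assumed to be milder (0.1% loss).
lower_weight = [
    Transfer(bytes_sent=90_000_000_000, loss_fraction=0.0),
    Transfer(bytes_sent=10_000_000_000, loss_fraction=0.001),
]

for name, usage in [("equal weight", equal_weight),
                    ("lower weight", lower_weight)]:
    print(name, volume(usage), round(congestion_volume(usage)))
```

   Under these assumed figures the volume metric cannot tell the two
   behaviours apart, whereas the congestion-volume metric is an order of
   magnitude lower for the yielding transport, which is the distinction
   an ISP would need ConEx information to observe.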

   We should point out that not all high-volume customers will be
   prepared to temporarily shift their usage out of the way as shown --
   real-time video for instance would still use a higher weight (more
   aggressive) so as to ensure timely delivery.  However, high volume
   applications with elastic (non-real-time) requirements are also
   common (e.g. video streaming, software downloads, etc.).

   We should also point out that a transport that is less aggressive
   against other customers is similar to, but not quite the same as,
   LEDBAT [ledbat-congestion].  LEDBAT does indeed yield more to other
   flows during congestion, but it is designed to only do this if the
   contention for resources is at a slow link, such as the customer's
   own home router.  If the contention is at a fast link, such as a
   BRAS, LEDBAT is designed not to yield.  This is because ISPs
   currently give no reward to a transport that minimises congestion to
   others -- because they do not have the congestion information to be
   able to.

5.  Security Considerations

6.  IANA Considerations

   This document does not require actions by IANA.

7.  Conclusions

   This document has introduced how congestion policing could be
   deployed at the broadband remote access servers in a typical
   broadband access network.  Congestion policing uses ConEx markings
   introduced by data sources and packets discarded by the BRAS to
   determine rest-of-path congestion, and police traffic accordingly.

   It has been shown that high-volume elastic data sources have a strong
   incentive to deploy ConEx speculatively in the expectation that they
   will be able to encourage their ISPs to account for their usage by
   congestion-volume, not volume.  They can use a less aggressive
   transport and prove that they are contributing little to congestion
   despite sending a lot of volume.
   ISPs also have a strong incentive to use this ConEx information to
   encourage their elastic high-volume customers to use less aggressive
   transports, given they improve the performance of all the other
   customers.

   Without ConEx information, ISPs can only use volume as a metric of
   usage, which prevents the above virtuous circle from forming,
   perversely discouraging high-volume elastic customers from such
   friendly behaviour.

8.  Acknowledgments

9.  Comments Solicited

   Comments and questions are encouraged and very welcome.  They can be
   addressed to the IETF Congestion Exposure (ConEx) working group's
   mailing list, and/or to the authors.

10.  Informative References

   [CongPol]                Jacquet, A., Briscoe, B., and T. Moncaster,
                            "Policing Freedom to Use the Internet
                            Resource Pool", Proc ACM Workshop on
                            Re-Architecting the Internet (ReArch'08),
                            December 2008.

   [TR-059]                 Anschutz, T., Ed., "DSL Forum Technical
                            Report TR-059: Requirements for the Support
                            of QoS-Enabled IP Services", September 2003.

   [TR-101]                 Cohen, A., Ed. and E. Shrum, Ed., "Migration
                            to Ethernet-Based DSL Aggregation",
                            April 2006.

   [conex-abstract-mech]    Mathis, M. and B. Briscoe, "Congestion
                            Exposure (ConEx) Concepts and Abstract
                            Mechanism",
                            draft-ietf-conex-abstract-mech-05 (work in
                            progress), July 2012.

   [conex-concepts-uses]    Briscoe, B., Woundy, R., and A. Cooper,
                            "ConEx Concepts and Use Cases",
                            draft-ietf-conex-concepts-uses-04 (work in
                            progress), March 2012.

   [conex-destopt]          Krishnan, S., Kuehlewind, M., and C. Ucendo,
                            "IPv6 Destination Option for Conex",
                            draft-ietf-conex-destopt-02 (work in
                            progress), March 2012.

   [intarea-ipv4-id-reuse]  Briscoe, B., "Reusing the IPv4
                            Identification Field in Atomic Packets",
                            draft-briscoe-intarea-ipv4-id-reuse-01 (work
                            in progress), March 2012.

   [ledbat-congestion]      Hazel, G., Iyengar, J., Kuehlewind, M., and
                            S.
                            Shalunov, "Low Extra Delay Background
                            Transport (LEDBAT)",
                            draft-ietf-ledbat-congestion-09 (work in
                            progress), October 2011.

Appendix A.  Summary of Changes between Drafts

   Detailed changes are available from
   http://tools.ietf.org/html/draft-briscoe-conex-initial-deploy

   From draft-briscoe-02 to draft-briscoe-03:

      *  Removed Mobile and Data Centre scenarios, making this draft
         solely cover the receiving access network scenario.  It then
         becomes a 'sibling' of the drafts on these two subjects, rather
         than a 'parent'.

      *  Consequently Dirk Kutscher is no longer a co-author.

      *  Included more comprehensive background information on ConEx.

      *  Completed Incentives section.

      *  Updated refs.

   From draft-briscoe-01 to draft-briscoe-02:

      *  Added Mobile Scenario section, and Dirk Kutscher as co-author.

   From draft-briscoe-00 to draft-briscoe-01:  Re-issued without textual
      change.  Merely re-submitted to correct a processing error causing
      the whole text of draft-00 to be duplicated within the file.

Author's Address

   Bob Briscoe
   BT
   B54/77, Adastral Park
   Martlesham Heath
   Ipswich  IP5 3RE
   UK

   Phone: +44 1473 645196
   EMail: bob.briscoe@bt.com
   URI:   http://bobbriscoe.net/