TSVWG                                                         B. Briscoe
Internet Draft                                                P. Eardley
draft-briscoe-tsvwg-cl-phb-03.txt                           D. Songhurst
Expires: April 2006                                                   BT

                                                          F. Le Faucheur
                                                               A. Charny
                                                              V. Liatsos
                                                      Cisco Systems, Inc

                                                              J. Babiarz
                                                                 K. Chan
                                                               S. Dudley
                                                                  Nortel

                                                          G. Karagiannis
                                         University of Twente / Ericsson

                                                                A. Bader
                                                             L. Westberg
                                                                Ericsson

                                                        20 October, 2006

                  Pre-Congestion Notification marking
                   draft-briscoe-tsvwg-cl-phb-03.txt

Status of this Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that other
   groups may also distribute working documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

   This Internet-Draft will expire on October 2006.

Copyright Notice

   Copyright (C) The Internet Society (2006). All Rights Reserved.

Abstract

   Pre-Congestion Notification (PCN) builds on the concepts of RFC 3168,
   "The addition of Explicit Congestion Notification to IP". However,
   Pre-Congestion Notification aims at providing notification before any
   congestion actually occurs. Pre-Congestion Notification is applied to
   real-time flows (such as voice, video and multimedia streaming) in
   DiffServ networks. As described in [CL-DEPLOY], it enables "pre"
   congestion control through two procedures, flow admission control and
   flow pre-emption. The draft proposes algorithms that determine when a
   PCN-enabled router writes Admission Marking and Pre-emption Marking
   in a packet header, depending on the traffic level. The draft also
   proposes how to encode these markings. We present simulation results
   with PCN working in an edge-to-edge scenario using the marking
   algorithms described. Other marking algorithms will be investigated
   in the future.

Authors' Note (TO BE DELETED BY THE RFC EDITOR UPON PUBLICATION)

   This document is posted as an Internet-Draft with the intention of
   eventually becoming a STANDARDS track RFC.

Conventions used in this document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

Table of Contents

   1. Overview
      1.1. Introduction
      1.2. Terminology
   2. Admission Marking algorithm
      2.1. Outline
      2.2. Virtual queue based algorithm for Admission Marking
      2.3. Admission control within a CL-region using Pre-Congestion
           Notification
   3. Pre-emption Marking
      3.1. Outline
      3.2. Token bucket based algorithm for Pre-emption Marking
      3.3. Flow pre-emption within a CL-region using Pre-Congestion
           Notification
   4. Simulation results
   5. Encoding the Admission Marked and Pre-emption Marked states
   6. Acknowledgements
   7. Comments solicited
   8. Changes from earlier version of the draft
   9. Appendix A: Explicit Congestion Notification
   10. Appendix B - Details of simulations
      10.1. Network and signalling model
      10.2. Simulated Traffic types
         10.2.1. Voice CBR
         10.2.2. On-off traffic approximating voice with silence
                 compression
         10.2.3. High-rate on-off traffic
      10.3. Admission Control Simulations
         10.3.1. Summary of the key parameters for CAC
            10.3.1.1. Virtual Queue settings
            10.3.1.2. Egress measurement parameters
         10.3.2. Overview of the Admission Control Results
         10.3.3. Sensitivity to Poisson Arrivals assumption
         10.3.4. Sensitivity to marking parameters
         10.3.5. Sensitivity to RTT
         10.3.6. Future Work for Admission Control Experiments
      10.4. Flow Pre-emption Simulations
         10.4.1. Flow Pre-emption Model and key parameters
         10.4.2. Summary of Flow Pre-emption Experiments
         10.4.3. Future Work on Flow Pre-emption Experiments
   11. Appendix C - Alternative ways of encoding the Admission Marked
       and Pre-emption Marked States
      11.1. Alternative 1
      11.2. Alternative 2
      11.3. Alternative 3
      11.4. Alternative 4
      11.5. Alternative 5
      11.6. Comparison of Alternatives
         11.6.1. How compatible is the encoding scheme with RFC 3168
                 ECN?
         11.6.2. Does the encoding scheme allow an "ECN-nonce"?
         11.6.3. Does the encoding scheme require new DSCP(s)?
         11.6.4. Impact on measurements
         11.6.5. Other issues
   12. References
   Authors' Addresses
   Intellectual Property Statement
   Disclaimer of Validity
   Copyright Statement

1. Overview

1.1. Introduction

   Pre-Congestion Notification builds on the concepts of RFC 3168, "The
   addition of Explicit Congestion Notification to IP". Pre-Congestion
   Notification (PCN) is applied to real-time flows (such as voice,
   video and multimedia streaming) in DiffServ-enabled networks. The
   reader is referred to [CL-DEPLOY] for a description of how PCN
   enables "pre" congestion control through two procedures, flow
   admission control and flow pre-emption.
   Flow admission control determines whether a new microflow is admitted
   into the network. Flow pre-emption reduces the current traffic load
   by terminating selected microflows.

   Note that this draft concerns the admission control and pre-emption
   of *flows*, not of packets.

   Appendix A provides a brief summary of Explicit Congestion
   Notification (ECN) [RFC3168]. RFC 3168 specifies that a router sets
   the ECN field to the Congestion Experienced (CE) value as a warning
   of incipient congestion. RFC 3168 doesn't specify a particular
   algorithm for setting the CE codepoint, although RED (Random Early
   Detection) is expected to be used. RFC 3168 states that
   "specifications for Diffserv PHBs [RFC2475] MAY provide more
   specifics" on the CE marking algorithm. This document can be seen as
   effectively providing such "specifics" for PHBs (Per Hop Behaviours)
   targeting real-time services. We imagine future specifications for
   Diffserv PHBs MAY define their ECN marking algorithm by reference to
   this document. In particular we imagine a Controlled Load PHB
   definition would refer to Expedited Forwarding [RFC3246] for its
   scheduling behaviour and to this draft for its ECN marking behaviour.

   This draft does not propose to change the name of the ECN field. The
   term PCN is used solely for the marking process. So we say pre-
   congestion marking is applied to the ECN field (not to the PCN
   field). We also keep the names of the ECN codepoints, except wherever
   new codepoint semantics are required. When we talk of PCN-routers, we
   mean routers arranged so that they will use PCN to mark packets
   carrying specific, configured DSCPs (differentiated services
   codepoints). PCN-routers may still use default ECN semantics to mark
   packets carrying other DSCPs.

   A router enabled with Pre-Congestion Notification marks packets at a
   lower traffic level than an ECN-router, when there still isn't any
   significant build-up of real-time packets in the queue. So PCN-marked
   packets act as an "early warning" that the rate of packets flowing is
   getting close to the engineered capacity and hence indicate to the
   admission control system that requests to admit new real-time flows
   should be rejected.

   In addition to admission control, another essential Quality of
   Service feature in deployed networks is the ability to cope with
   failures of routers and links. In this situation the network's
   capacity is reduced and selected flows may need to be terminated
   (pre-empted) in order to preserve the quality of service of the
   remaining real-time flows. Therefore PCN-routers also include the
   ability to PCN-mark packets to signal that the rate of packets
   flowing is too close to, or exceeds, the engineered capacity and that
   flow pre-emption may be needed.

   So a PCN-router needs to be configured with two reference rates:

   o  configured-admission-rate

   o  configured-pre-emption-rate

   Flow pre-emption should happen at a higher traffic rate than
   admission control for a number of reasons including:

   o  End-users are typically more annoyed by their established call
      dying than by getting a busy tone at call establishment. There may
      also be regulatory obligations on network operators not to drop
      established calls.

   o  A congestion notification based Admission scheme has some inherent
      inaccuracy because of its reactive, measurement-based nature. For
      example, sometimes new load may arrive so fast that the admission
      scheme overshoots before it can measure the effect of new sessions
      admitted elsewhere.
      Such anomalous events can usually be absorbed without any
      disruption, by setting a buffer zone between the configured-
      admission-rate and configured-pre-emption-rate. No more traffic is
      admitted until natural flow departures have cleared the buffer
      zone.

   o  A buffer zone also allows an operator to decide to admit an
      'emergency' or 'Assured Services' call immediately, i.e. without
      admission control. As in the previous bullet, the buffer zone
      usually allows the 'emergency' call to be admitted without any
      disruption to on-going calls. Section 5.4 of [CL-DEPLOY] discusses
      this option.

   If the buffer zone is insufficient then the flow pre-emption
   mechanism will kick in; however this should very rarely happen.

   Both the configured-admission-rate and the configured-pre-emption-
   rate will be lower than the physical line rate. ([CL-DEPLOY] Section
   3.2.2 discusses the case (called implicit pre-emption alerting) where
   the configured-pre-emption-rate is equal to the line rate.)

   Note that admission control is the primary mechanism used to prevent
   congestion from occurring and flow pre-emption would rarely be
   invoked under normal conditions; it is a safety mechanism to prevent
   congestion from persisting after link failures, re-routes, rare over-
   admission and other similar events.

   Together, admission control and flow pre-emption protect the
   forwarding service offered to admitted and non-pre-empted flows, as
   well as protecting service to the traffic classes using the remainder
   of the link capacity.

   Note well that a PCN-router does not achieve admission control or
   flow pre-emption on its own. Just like ECN, a PCN-router requires a
   feedback system in order to control the load causing the congestion
   it is suffering.
   [CL-DEPLOY] describes how to achieve an end-to-end controlled load
   service by using, within a large region of the Internet, DiffServ and
   edge-to-edge distributed measurement-based admission control and flow
   pre-emption. Controlled load (CL) service is a quality of service
   (QoS) closely approximating the QoS that the same flow would receive
   from a lightly loaded network element [RFC2211]. The edge-to-edge
   region (which we call the CL-region) is a controlled environment, in
   that all routers in the CL-region are enabled with Pre-Congestion
   Notification and packets can only enter or leave the CL-region
   through (enhanced) gateways. PCN-marked packets are detected by an
   egress gateway and associated information is sent to the relevant
   ingress gateway to decide whether to admit a new flow, or even
   pre-empt an existing flow. [CL-DEPLOY] also describes a number of
   assumptions about the CL-region, such as that there are a large
   number of real-time flows between each pair of gateways; hence the
   CL-region is typically the backbone of an operator.

   We would also like to use PCN-routers in other deployment models,
   such as:

   o  Where the CL-region spans networks run by different operators.

   o  End-host to end-host, i.e. a similar architecture to that
      described in [RTECN]

   o  A similar architecture to that described in [RMD]

   These deployment models are for further study as some of the
   assumptions made about the CL-region in [CL-DEPLOY] no longer hold.
   We plan later drafts to describe if and how PCN can work in these
   frameworks.

   This document describes Pre-Congestion Notification:

   o  (Section 2) The algorithm that determines when a packet is marked
      so as to warn the admission control mechanism that admission
      control may be needed.

   o  (Section 3) The algorithm that determines when a packet is marked
      so as to warn the pre-emption mechanism that pre-emption may be
      needed.

   o  (Section 4 & Appendix B) Simulation results that demonstrate the
      effectiveness of stateless admission control and flow pre-emption.
      The results were obtained using the algorithms of Sections 2 and
      3. The pdf version of this document includes graphs of simulation
      results that aren't in the text version.

   o  (Section 5 & Appendix C) How to encode the markings, i.e. what
      change to make to which bits of a packet so as to convey the
      admission marking and pre-emption marking to the admission control
      and pre-emption mechanisms on the egress gateway.

   Sections 2 and 3 describe the algorithms a PCN-enabled router uses to
   decide whether it needs to admission mark or pre-emption mark a
   packet. The algorithms are driven by the amount of traffic in the
   specified real-time service class. Note that the measurement is made
   on an aggregate basis, i.e. it doesn't distinguish between real-time
   microflows. Note also that the algorithms run separately for each
   outgoing link of the PCN router. We present example implementations
   but the same effect may be implemented in different ways. Indeed,
   both the admission control and pre-emption algorithms could have been
   implemented as variants of token buckets, but the former is
   implemented as a virtual queue, to present an alternative (yet still
   fairly similar) implementation.

                          +------------+
                          |   Result   |
                          |            V
                      +-------+    +--------+
                      | Bulk  |    |  PCN   |
         Packets ===> | Meter |===>| Marker |===> Marked Packets
                      |       |    |        |
                      +-------+    +--------+

          Figure 1: Block Diagram of Meter and Marker Function

   Currently this draft documents pre-congestion notification algorithms
   that we believe are reasonably good, but not necessarily the best.
   On-going work will consider various alternatives and reach rough
   consensus on the best.

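   As a rough, non-normative illustration of the meter and marker of
   Figure 1, the Python sketch below keeps one set of aggregate state
   per outgoing link and combines the virtual queue of Section 2.2 with
   the token bucket of Section 3.2. Several details are assumptions made
   only for the sketch and are not taken from this draft: the class and
   parameter names, the bucket depth and the bucket starting full, the
   order in which the two meters are applied, and draining or refilling
   lazily on each packet arrival using timestamps in seconds and rates
   in octets per second.

```python
import random

UNMARKED, ADMISSION_MARKED, PREEMPTION_MARKED = range(3)


class PcnLinkMarker:
    """Bulk meter and PCN marker for one outgoing link (hypothetical name)."""

    def __init__(self, admission_rate, preemption_rate,
                 min_threshold, max_threshold, bucket_depth, rng=None):
        self.admission_rate = admission_rate    # configured-admission-rate
        self.preemption_rate = preemption_rate  # configured-pre-emption-rate
        self.min_threshold = min_threshold      # min-marking-threshold, octets
        self.max_threshold = max_threshold      # max-marking-threshold, octets
        self.bucket_depth = bucket_depth        # assumption: depth not given in the text
        self.rng = rng if rng is not None else random.Random()
        self.virtual_queue = 0.0                # just a number, stores no packets
        self.tokens = float(bucket_depth)       # assumption: bucket starts full
        self.last_time = 0.0

    def admission_probability(self):
        """RED-like ramp over the current virtual queue size."""
        if self.virtual_queue <= self.min_threshold:
            return 0.0
        if self.virtual_queue >= self.max_threshold:
            return 1.0
        return ((self.virtual_queue - self.min_threshold)
                / (self.max_threshold - self.min_threshold))

    def on_packet(self, now, octets, marking=UNMARKED):
        """Meter one arriving PCN packet and return its resulting marking."""
        elapsed = max(0.0, now - self.last_time)
        self.last_time = max(self.last_time, now)

        # Drain the virtual queue and refill the token bucket for the
        # time that has passed since the previous packet.
        self.virtual_queue = max(
            0.0, self.virtual_queue - self.admission_rate * elapsed)
        self.tokens = min(
            self.bucket_depth, self.tokens + self.preemption_rate * elapsed)

        # Every PCN packet fills the virtual queue, even if already marked.
        self.virtual_queue += octets

        # Token bucket: packets that are already pre-emption marked take
        # no tokens; a packet that finds too few tokens takes none and is
        # pre-emption marked.
        if marking != PREEMPTION_MARKED:
            if self.tokens >= octets:
                self.tokens -= octets
            else:
                marking = PREEMPTION_MARKED

        # Admission marking is decided independently for each packet and
        # never overwrites a pre-emption mark.
        if (marking == UNMARKED
                and self.rng.random() < self.admission_probability()):
            marking = ADMISSION_MARKED
        return marking


if __name__ == "__main__":
    marker = PcnLinkMarker(admission_rate=1000.0, preemption_rate=2000.0,
                           min_threshold=500.0, max_threshold=1500.0,
                           bucket_depth=3000.0)
    for t in (0.0, 0.1, 0.2, 0.3):
        print(t, marker.on_packet(t, 400))
```

   Keeping both meters in one object per link reflects the statement
   above that the algorithms operate on the aggregate and run separately
   for each outgoing link; a real router would more likely implement the
   drain and refill in hardware timers rather than on packet arrival.
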
   In Sections 2 and 3 we also hint at how Pre-Congestion Notification
   can be used within the CL-region, in order to achieve admission
   control and flow pre-emption "edge-to-edge" across the CL-region.
   Details are in [CL-DEPLOY].

   Section 4 reports some simulation results obtained using these
   algorithms in the CL-region framework. Note that the aim of our
   simulations is to demonstrate to the IETF community that these PCN-
   based admission control and flow pre-emption mechanisms work
   successfully. It isn't to show that the particular marking algorithms
   simulated are the optimum ones; although we believe they are a
   reasonably good choice, on-going work will compare them with various
   alternatives.

   Section 5 presents one possibility for how to encode the markings.
   Although we believe it is a reasonable choice, there are other
   possibilities, some of which are listed and discussed in Appendix C.
   We seek advice and debate as to what scheme should be standardised.
   Note that the choice of how to encode the markings is non-trivial
   because we have five things we potentially want to encode, and only
   have four states in the two bits of the ECN field:

   o  Admission Marking - the traffic level is such that the router
      Admission Marks the packet

   o  Pre-emption Marking - the traffic level is such that the router
      Pre-emption Marks the packet

   o  ECT(0) - the first ECT codepoint, for backwards compatibility with
      the ECN nonce

   o  ECT(1) - the other ECT codepoint, for backwards compatibility with
      the ECN nonce

   o  Not ECT - to indicate to a router that the traffic is not PCN-
      capable.

1.2. Terminology

   o  Pre-Congestion Notification (PCN): two new algorithms that
      determine when a PCN-enabled router Admission Marks and Pre-
      emption Marks a packet, depending on the traffic level.

   o  Admission Marking condition - the traffic level is such that the
      router Admission Marks packets. The router provides an "early
      warning" that the load is nearing the engineered admission control
      capacity, before there is any significant build-up in the queue of
      packets belonging to the specified real-time service class.

   o  Pre-emption Marking condition - the traffic level is such that the
      router Pre-emption Marks packets. The router warns explicitly that
      pre-emption may be needed.

   o  Configured-admission-rate - the reference rate used by the
      admission marking algorithm in a PCN-enabled router.

   o  Configured-pre-emption-rate - the reference rate used by the pre-
      emption marking algorithm in a PCN-enabled router.

2. Admission Marking algorithm

2.1. Outline

   A PCN-enabled router monitors the aggregate traffic in the specified
   real-time service class. Based on this measurement, the probability
   that the router admission marks a packet is determined by the
   algorithm detailed below, configured to use the configured-admission-
   rate. The algorithm ensures that packets are admission marked before
   the actual queue builds up, but when it is in danger of doing so
   soon; the probability increases with the danger. Hence such packets
   act as an "early warning" that the engineered capacity is nearly
   reached, and that no more real-time flows should be admitted.

2.2. Virtual queue based algorithm for Admission Marking

   In order to make the description more specific we assume a virtual
   queue is used; other implementations are possible. By a virtual queue
   we mean a *conceptual* queue - it doesn't store packets, it is just
   an integer. The integer represents the length of a queue that would
   exist if the real-time packets were drained at the configured-
   admission-rate instead of the real scheduling rate for the relevant
   PHB.
   Note that there is a virtual queue for each outgoing link and it
   operates in bulk and not per microflow, i.e. the same virtual queue
   is used for all the real-time packets on that link. The virtual queue
   could be implemented, for example, with a variation of a leaky
   bucket.

   The virtual queue is:

   o  Emptied at the configured-admission-rate, which is slower (perhaps
      considerably slower) than the link speed and the relevant PHB
      scheduling rate. This provides a safety margin to minimise the
      chances of unnecessarily triggering the pre-emption mechanism, for
      instance.

   o  Filled when a packet arrives carrying a DSCP that has been
      configured for PCN (even if the packet is already admission or
      pre-emption marked). The amount added is the same as the number of
      octets in the packet.

   The procedure is visualised in Figure 2:

               _________________     _________________     ____________
     PCN      |increment length |   |    calculate    |   |decide      |
   packet --> |of virtual queue |-> |probability of   |-> |whether to  |
   arrives    |   by size of    |   |admission marking|   |admission   |
              |     packet      |   |     packet      |   |mark packet |
               -----------------     -----------------     ------------

          Figure 2: Router action to support admission marking

   The router computes the probability that the packet should be
   admission marked according to the size of the virtual queue, using
   the following RED-like algorithm:

      Size of virtual queue < min-marking-threshold:

         probability = 0

      min-marking-threshold <= Size of virtual queue
                             < max-marking-threshold:

         probability = (Size of virtual queue - min-marking-threshold)
                       / (max-marking-threshold - min-marking-threshold)

      Size of virtual queue >= max-marking-threshold:

         probability = 1

   Probability    ^
   of Admission   |
   Marking        |
   a packet     1_|                     _______________
                  |                    /
                  |                   /
                  |                  /
                  |                 /
                  |                /
                  |               /
                0_|______________/
                  |
                  ---------------|------|-------------->
                               min-   max-     Size of virtual queue
                             marking- marking-
                            threshold threshold

       Figure 3: Probability of router admission marking a packet

   If the CL traffic is sustained at a level greater than the
   configured-admission-rate then all packets are eventually admission
   marked. However, a short burst of traffic at greater than the
   configured-admission-rate (measured over the burst) may not trigger
   any admission marking if the burst is sufficiently short that the
   virtual queue doesn't grow beyond the min-marking-threshold.

   A packet that is already pre-emption marked is never re-marked to the
   admission marked state. The decision whether to admission mark a
   particular packet is made independently of the decision for the
   previous packet.

2.3. Admission control within a CL-region using Pre-Congestion
     Notification

   As an example of how the Admission Marking algorithm enables
   admission control, we briefly consider the edge-to-edge framework
   described in [CL-DEPLOY]. As real-time packets enter a CL-region,
   they are re-marked to enable PCN marking using the CL DSCP and the
   appropriate ECT field. As these CL-packets travel across the edge-to-
   edge CL-region, routers may admission mark packets, as determined by
   the algorithm described above. The egress gateway of the region
   measures the fraction of the real-time traffic that is in the
   Admission Marked state, with a separate measurement made for traffic
   from each ingress gateway. It calculates the fraction as an
   exponentially weighted moving average (which we term Congestion-
   Level-Estimate, or CLE). When RSVP signalling for a new flow arrives
   at the egress gateway, the egress gateway reports the CLE to the CL-
   region's ingress gateway, piggy-backed on the RSVP signalling. The
   ingress gateway only admits the new real-time microflow if the CLE is
   less than the CLE-threshold.
Hence previously accepted microflows are protected and so 516 suffer minimal queuing delay, jitter and loss. 518 3. Pre-emption Marking 520 3.1. Outline 522 A PCN-enabled router monitors the aggregate traffic in the specified 523 real-time service class. Based on this measurement, when the rate of 524 real-time traffic exceeds the configured-pre-emption-rate for some 525 time, the router will pre-emption mark packets, as determined by the 526 algorithm detailed below. The configured-pre-emption-rate is less 527 than the link speed and less than the relevant PHB scheduling rate, 528 so that Pre-emption Marked packets act as an explicit alert that the 529 engineered capacity is nearly reached, and that some real-time flows 530 may need to be pre-empted. This minimises the chances of a router 531 randomly dropping packets, and hence the Quality of Service of the 532 remaining flows is fully preserved. Also, service is preserved to 533 traffic in other service classes using the remaining capacity. 535 Pre-emption Marking of packets is similar in motivation to ECN- 536 marking of packets in [RFC3168]. With [RFC3168], feedback of an ECN- 537 marked packet causes the TCP source to halve its effective rate, 538 whereas in our mechanism feedback of pre-emption marking enables an 539 upstream node to terminate real-time flow(s). Pre-emption is 540 therefore more aggressive against selected flows, but the gain is 541 that it enables the full QoS of the remaining flows to be preserved. 542 Note that in [RFC3168] ECN-marking a given packet is intended to 543 result in rate adjustment of the flow to which the packet belongs; 544 while in this draft pre-emption marking a packet simply provides an 545 indication that pre-emption may be needed. As described in [CL- 546 DEPLOY] the pre-emption mechanism will then select particular flows 547 to be pre-empted. 549 3.2. 
Token bucket based algorithm for Pre-emption Marking

In order to make the description more specific we assume a token
bucket is used; other implementations are possible.

All PCN routers maintain a token bucket per outgoing link:

o Tokens are added at the configured-pre-emption-rate, which is
  slower than the link speed (and the relevant PHB scheduling rate).

o Usually tokens are removed when a real-time packet arrives; the
  amount removed is the same as the number of octets in the packet.
  However, if the real-time packet has already been pre-emption
  marked, then tokens are not removed. Also, if there are
  insufficient tokens (because removing them would cause a negative
  number of tokens in the token bucket), then tokens are not removed
  and the packet is pre-emption marked. This procedure is visualised
  in Figure 4.

    RT packet
     arrives
        |
        v
 +-------------+     Y     +------------------+
 | Is packet   |---------->| Don't remove any |
 | already     |           | tokens from      |
 | Pre-emption |           | token bucket     |
 | Marked?     |           +------------------+
 +-------------+                    ^
        | N                         |
        |                  +------------------+
        |                  | Pre-emption      |
        |                  | mark packet      |
        |                  +------------------+
        v                           ^
 +-------------+     N              |
 | Are there   |--------------------+
 | sufficient  |
 | tokens in   |     Y     +-------------------+
 | token       |---------->| Remove tokens     |
 | bucket?     |           | (= octets in pkt) |
 +-------------+           | from token bucket |
                           +-------------------+

Figure 4: Router action to support pre-emption alerting

So if traffic in the specified real-time service class is sustained
at a level greater than the configured-pre-emption-rate then
'non-pre-emption-marked' packet arrivals in excess of this rate are
pre-emption marked, but those below it are not marked.
('Non-pre-emption-marked' means 'either unmarked or admission
marked'.)
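
The token bucket procedure of Figure 4 can be sketched in a few lines.
This is only an illustration under stated assumptions: the class name
and its parameters are hypothetical, the bucket is assumed to start
full and to be capped at a configured depth, and tokens are topped up
lazily on each packet arrival rather than by a timer.

```python
class PreemptionMarker:
    """Token bucket for one outgoing link, following Figure 4."""

    def __init__(self, rate_octets_per_s: float, depth_octets: float) -> None:
        self.rate = rate_octets_per_s   # configured-pre-emption-rate
        self.depth = depth_octets       # maximum number of tokens held
        self.tokens = depth_octets      # assumption: bucket starts full
        self.last_arrival = 0.0         # time of previous packet, seconds

    def on_packet(self, now: float, octets: int, already_marked: bool) -> bool:
        """Process a real-time packet; return True if it leaves pre-emption marked."""
        # Add the tokens earned since the previous arrival, up to the depth.
        elapsed = now - self.last_arrival
        self.tokens = min(self.depth, self.tokens + elapsed * self.rate)
        self.last_arrival = now

        if already_marked:
            return True        # already marked upstream: remove no tokens
        if self.tokens < octets:
            return True        # insufficient tokens: mark, remove no tokens
        self.tokens -= octets  # sufficient tokens: remove one per octet
        return False
```

Because a packet that is marked removes no tokens, the unmarked
traffic leaving the link is held to roughly the configured rate.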
The reason is that a packet that finds insufficient tokens removes no
tokens from the token bucket and is itself pre-emption marked. Note
however that a short burst of traffic at greater than the
configured-pre-emption-rate (measured over the burst) may not trigger
any pre-emption marking, if the burst is sufficiently short that the
token bucket doesn't run out of tokens.

3.3. Flow pre-emption within a CL-region using Pre-Congestion
Notification

As an example of how the Pre-emption Marking algorithm enables flow
pre-emption, we briefly consider the edge-to-edge deployment model
described in [CL-DEPLOY]. As real-time packets travel across the
edge-to-edge CL-region, PCN-enabled routers may pre-emption mark
packets, as determined by the algorithm described above.

When the egress gateway of the region detects a Pre-emption Marked
packet, it measures the rate of real-time traffic *excluding* any
packets that are pre-emption marked. Hence it measures the amount of
traffic that the network can actually support safely (which we term
Sustainable-Aggregate-Rate). The measurement is made for traffic from
a particular ingress gateway, and then reported to that ingress
gateway. When it receives this message, the ingress gateway measures
the ingress-aggregate-rate of real-time traffic that is being sent
towards the particular egress gateway. If this measured
ingress-aggregate-rate exceeds the Sustainable-Aggregate-Rate, then
the ingress gateway pre-empts a sufficient number of real-time flows
to bring the ingress-aggregate-rate down to (approximately) the
Sustainable-Aggregate-Rate.

Different implementations of the rate measurement (and the timescale
of this measurement) at the egress and ingress gateways are possible.

4.
Simulation results

We have performed an initial set of simulations of the admission
control and flow pre-emption mechanisms described in this document
and consistent with [CL-DEPLOY].

We investigated the performance of the admission control and flow
pre-emption mechanisms with traffic modelling CBR voice, on-off
traffic approximating voice with silence compression, and more
aggressive on-off traffic with larger packet sizes and peak and mean
rates approximating that of video traffic.

In summary, both the admission control and flow pre-emption
mechanisms worked well for all of these traffic types under the
assumptions of [CL-DEPLOY] (in particular under the assumption that
there are many micro-flows between any pair of ingress / egress
gateways, which, in turn, translates into the assumption that
relatively high speed links are used). Details of the simulation
study are given in Appendix B. In the PDF version of this document
Appendix B also includes graphs of simulation results.

So far the simulations have been run with a sensible estimate of
suitable parameters. While a limited amount of work has been done to
evaluate sensitivity of the results to the simulation parameters (see
Appendix B), investigating further the sensitivity to these
parameters is the next step.

Due to time constraints, we were able to simulate a single
"congestion point" only, i.e. there was a single router where
pre-congestion notification for admission control and/or pre-emption
was triggered. Furthermore, admission control and flow pre-emption
simulations were performed independently. A study of the interaction
of admission control and flow pre-emption is also a subject of future
work.

A further performance evaluation study is presented in [Zhang].

5.
Encoding the Admission Marked and Pre-emption Marked states

In this Section we describe one proposal for how to encode the
Admission Marking and Pre-emption Marking states in a packet, i.e.
what change to make to which bits of a packet.

The encoding scheme uses the two ECN (Explicit Congestion
Notification) bits in the IP header. The four ECN codepoints are used
as follows:

     +-----+-----+
     | ECN FIELD |
     +-----+-----+
     bit 6 bit 7
       0     0         Admission Marking
       0     1         ECT(1)
       1     0         ECT(0)
       1     1         Pre-emption Marking
     Other DSCPs       Non-PCN-Capable

Figure 5: Pre-Congestion Notification's use of the ECN Field in IP

A PCN-capable environment is one in which all the devices behave in
accordance with the PCN mechanisms, for packets in the specific
traffic class(es). Therefore a PCN-capable environment, such as a
CL-region, meets the requirements of [Floyd] for a controlled
environment.

A router knows a packet should be treated with the PCN behaviour if

o Its differentiated services codepoint (DSCP) is one configured for
  PCN marking. Packets with this DSCP are PCN-capable whatever the
  ECN codepoint is.

If necessary the router re-sets the ECN field to '00' to indicate
Admission Marking and to '11' to indicate Pre-emption Marking.
Packets with Admission Marking may be re-marked to Pre-emption
Marking, but not vice-versa.

For the deployment model of [CL-DEPLOY] an ingress gateway knows, as
part of the RSVP signalling set-up, whether a microflow is to be
treated with the PCN behaviour by the CL-region. If necessary it sets
the DSCP to a PCN-capable DSCP. It also sets the ECN field to either
ECT(0) or ECT(1) as it chooses.

Other deployment models would be very similar.
For example, in a framework where Pre-Congestion Notification
operates from one end-host to another, the sending end-host would set
the ECN field to either ECT(0) or ECT(1).

One advantage of this encoding scheme is that it allows the (partial)
use of the ECN nonce, thus providing similar protection against a
cheater as [RFC3540]. However, a drawback is that if PCN marking is
used with a pre-existing scheduling behaviour (such as EF), and some
traffic still uses the legacy (EF) behaviour, then a new DSCP would
be required to distinguish PCN-capable packets from ones that aren't
PCN-capable.

Note that although we believe the encoding scheme is reasonable, it
is not our final proposal. Alternatives are listed and discussed in
Appendix C. We welcome advice and comments as to the most appropriate
scheme.

6. Acknowledgements

This work has evolved from several previous independent efforts:

o Guaranteed QoS Synthesis [Hovell], which evolved from the
  Guaranteed Stream Provider developed in the M3I project [GSPa,
  GSP-TR], which in turn was based on the theoretical work of
  Gibbens and Kelly [DCAC]

o RTECN (Real-Time Explicit Congestion Notification) [RTECN]

o RMD (Resource Management in DiffServ) [RMD] and [Westberg]

7. Comments solicited

Comments and questions are encouraged and very welcome. They can be
sent to the Transport Area Working Group's mailing list,
tsvwg@ietf.org, and/or to the authors.

8. Changes from earlier version of the draft

The main changes are:

From -01 to -02:

Minor clarifications and corrections throughout.

From -00 to -01:

How to use pre-congestion notification marking in a CL-region is now
described in [CL-DEPLOY].

Only one admission marking algorithm is now described.

A pre-emption marking scheme has been added.

Various options for encoding the marking are described and discussed
in Appendix C.

Simulation results are described in Appendix B and summarised in
Section 4.

9. Appendix A: Explicit Congestion Notification

This Appendix provides a brief summary of Explicit Congestion
Notification (ECN).

[RFC3168] specifies the incorporation of ECN to TCP and IP, including
ECN's use of two bits in the IP header. It specifies a method for
indicating incipient congestion to end-nodes (e.g. as in RED, Random
Early Detection), where the notification is through ECN marking
packets rather than dropping them.

ECN uses two bits in the IP header of both IPv4 and IPv6 packets:

   0     1     2     3     4     5     6     7
+-----+-----+-----+-----+-----+-----+-----+-----+
|          DS FIELD, DSCP           | ECN FIELD |
+-----+-----+-----+-----+-----+-----+-----+-----+

  DSCP: differentiated services codepoint
  ECN:  Explicit Congestion Notification

Figure A.1: The Differentiated Services and ECN Fields in IP.

The two bits of the ECN field have four ECN codepoints, '00' to '11':

     +-----+-----+
     | ECN FIELD |
     +-----+-----+
       ECT   CE
        0     0         Not-ECT
        0     1         ECT(1)
        1     0         ECT(0)
        1     1         CE

Figure A.2: The ECN Field in IP.

The not-ECT codepoint '00' indicates a packet that is not using ECN.

The CE codepoint '11' is set by a router to indicate congestion to
the end nodes. The term 'CE packet' denotes a packet that has the CE
codepoint set.

The ECN-Capable Transport (ECT) codepoints '10' and '01' (ECT(0) and
ECT(1) respectively) are set by the data sender to indicate that the
end-points of the transport protocol are ECN-capable. Routers treat
the ECT(0) and ECT(1) codepoints as equivalent. Senders are free to
use either the ECT(0) or the ECT(1) codepoint to indicate ECT, on a
packet-by-packet basis.
The use of two codepoints for ECT is motivated primarily by the
desire to allow mechanisms for the data sender to verify that network
elements are not erasing the CE codepoint, and that data receivers
are properly reporting to the sender the receipt of packets with the
CE codepoint set.

ECN requires support from the transport protocol, in addition to the
functionality given by the ECN field in the IP packet header.
[RFC3168] addresses the addition of ECN Capability to TCP, specifying
three new pieces of functionality: negotiation between the endpoints
during connection setup to determine if they are both ECN-capable; an
ECN-Echo (ECE) flag in the TCP header so that the data receiver can
inform the data sender when a CE packet has been received; and a
Congestion Window Reduced (CWR) flag in the TCP header so that the
data sender can inform the data receiver that the congestion window
has been reduced.

The transport layer (e.g. TCP) must respond, in terms of congestion
control, to a *single* CE packet as it would to a packet drop.

The advantage of setting the CE codepoint as an indication of
congestion, instead of relying on packet drops, is that it allows the
receiver(s) to receive the packet, thus avoiding the potential for
excessive delays due to retransmissions after packet losses.

10. Appendix B - Details of simulations

The results of the simulation study referred to in Section 4 are
presented below. Further evaluation can be found in [Zhang].

10.1. Network and signalling model

In most simulations, the network is modelled as a single link between
an ingress and an egress node, all flows sharing the same link.
Figure B.1 shows the modelled network. A is the ingress node and B is
the egress node.

     A --- B

Figure B.1: Simulated Single Link Network.

     A
      \
   B - D - F
      /
     C

Figure B.2: Simulated Multi Link Network.

A subset of simulations uses a network structured similarly to the
network shown in Figure B.2. A set of ingresses (A, B, C) is connected
to an interior node in the network (D) with links of different
propagation delay. This node in turn is connected to the egress (F).
In this topology, different sets of flows between each ingress and
the egress converge on the single link, where the pre-congestion
notification algorithm is enabled. In our simulations, the network
has 100 ingress nodes, each connected to the interior node with a
different propagation delay (1ms to 100ms). The point of congestion
is taken to be the link (D-F) connecting the interior node to the
egress node. This link is modelled with a 10ms propagation delay.
Therefore the range of RTTs is from 22ms to 220ms.

The simple network topology was chosen due to a lack of time for the
simulations.

Our simulations concentrated primarily on the range of capacities of
'bottleneck' links with sufficient aggregation - above 10 Mbps for
voice and 622 Mbps for "video", up to 1 Gbps. But we also
investigated slower 'bottleneck' links down to 512 kbps.

In the simulation model, a call request arrives at the ingress and
immediately sends a message to the egress. The message arrives at the
egress after the propagation time plus link processing time (but no
queuing delay). When the egress receives this message, it immediately
responds to the ingress with the current Congestion-Level-Estimate.
If the Congestion-Level-Estimate is below the specified
CLE-threshold, the call is admitted, otherwise it is rejected.

The life of a call outside the domain described above is not
modelled. Propagation delay from source to the ingress and from
destination to the egress is assumed negligible and is not modelled.

10.2.
Simulated Traffic types

Three types of traffic were simulated: CBR voice, on-off traffic
approximating voice with silence compression, and on-off traffic with
higher peak and mean rates. We termed the latter "video" as the
chosen peak and mean rate was similar to that of an MPEG video
stream, although no attempt was made to match any other parameters of
this traffic to those of a video stream. Flow durations were
exponentially distributed with a mean of 2 minutes, regardless of the
traffic type. In most of the experiments flows arrived according to a
Poisson process with the mean arrival rate chosen to achieve a
desired amount of overload over the configured-pre-emption-rate or
configured-admission-limit in each experiment. Overloads in the range
2x to 5x have been investigated.

In addition, some experiments investigated a batch Poisson model.
Here the batch represented a set of calls arriving at almost the same
time. The batch arrival process was Poisson, and the batch size was
geometrically distributed with a mean of up to 5 calls per batch.

For on-off traffic, on and off periods were exponentially distributed
with the specified mean.

Traffic parameters for each flow are summarized below:

10.2.1. Voice CBR

* Average rate 64 kbps

* Packet length 160 bytes

* Packet inter-arrival time 20ms

10.2.2. On-off traffic approximating voice with silence compression

* Packet length 160 bytes

* Long-term average rate 21.76 kbps

* On Period mean duration 340ms; during the on period traffic is sent
  with the CBR voice parameters described above

* Off Period mean duration 660ms; no traffic is sent during the off
  period.

10.2.3.
High-rate on-off traffic

* Long-term average rate 4 Mbps

* On Period mean duration 340ms; during the on period the packets are
  sent at 12 Mbps (1500 byte packets, packet inter-arrival: 1ms)

* Off Period mean duration 660ms

10.3. Admission Control Simulations

10.3.1. Summary of the key parameters for CAC

10.3.1.1. Virtual Queue settings

Most of the simulations were run with the following Virtual Queue
thresholds:

* min-marking-threshold: 5ms at link speed,

* max-marking-threshold: 15ms at link speed,

* virtual-queue-upper-limit: 20ms at link speed.

The virtual-queue-upper-limit puts an upper bound on how much the
virtual queue can grow.

Note that the virtual queue is drained at a configured rate smaller
than the link speed. Most of the simulations were set with the
configured-admission-rate of the virtual queue at half the link
speed.

Note that as long as there is no packet loss, the admission control
scheme successfully keeps the load of admitted flows at the desired
level regardless of the actual setting of the
configured-admission-limit. However, it is not clear if this remains
true when the configured-admission-rate is close to the link
speed/actual queue service rate. Further work is necessary to
quantify the performance of the scheme with smaller service
rate/virtual queue rate ratio, where packet loss may be an issue.

10.3.1.2. Egress measurement parameters.

In our simulations, the CLE-threshold was chosen as 0.5. The CLE is
computed as an exponentially weighted moving average (EWMA) with a
weight of 0.01. The CLE is computed on a per-packet basis.

10.3.2. Overview of the Admission Control Results

We found that on links of capacity from 10Mbps to OC3, admission
control for CBR voice and on-off voice traffic works reliably with
the range of parameters we simulated, both with Poisson and Batch
call arrivals.
As the performance of the algorithm was quite good at these speeds,
and generally improves with the degree of aggregation of traffic, we
chose not to investigate higher link speeds for CBR and on-off voice,
within the time constraints of this effort.

For higher-rate on-off "video" traffic, due to time limitations we
simulated 1Gbps and OC12 (622 Mbps) links and Poisson arrivals only.
Note that due to the high mean and peak rates of this traffic model,
slower links are unlikely to yield a sufficient level of aggregation
of this type of traffic to satisfy the flow aggregation assumptions
of [CL-ARCH]. Our simulations indicated that this model also behaved
quite well, although the deviation from the configured-admission-rate
is slightly higher in this case than for the less bursty traffic
models.

For these link speeds and traffic models, we investigated demand
overloads of 2x to 5x.

Table B.1 below summarizes the worst case difference between the
admitted load and the configured-admission-rate. The worst case
difference was taken over all experiments with the corresponding
range of link speeds and demand overloads. In general, the higher the
demand, the more challenging it is for the admission control
algorithm due to a larger number of near-simultaneous arrivals at
higher overloads, and as a result the worst case results in Table B.1
correspond to the 5x demand overload experiments.

------------------------------------------------------------------
|               |          |         | diff between  |           |
|  Link type    | traffic  | call    | mean admitted | standard  |
|               | type     | arrival | load &        | deviation |
|               |          | process | conf-adm-rate |           |
------------------------------------------------------------------
|T3,100Mbps,OC3 | CBR      | POISSON |     0.5%      |   0.5%    |
------------------------------------------------------------------
|T3,100Mbps,OC3 | ON-OFF V | POISSON |     2.5%      |   2.5%    |
------------------------------------------------------------------
|T3,100Mbps,OC3 | CBR      | BATCH   |     1.0%      |   1.0%    |
------------------------------------------------------------------
|T3,100Mbps,OC3 | ON-OFF V | BATCH   |     3.0%      |   3.0%    |
------------------------------------------------------------------
| 1Gbps         | "Video"  | POISSON |     2.0%      |   8.0%    |
------------------------------------------------------------------
| OC12          | "Video"  | POISSON |     0.0%      |  10.0%    |
------------------------------------------------------------------
Table B.1. Summary of the admission control results for links above
T3 speeds
Note: T1 = 1.5Mbps, T3 = 45Mbps, OC3 = 155Mbps, OC12 = 622Mbps

Sample simulation graphs for the experiments summarized in Table B.1
can be viewed in the PDF version of this draft.

Below are sample results for admission control experiments. Graphs a)
and b) show results for a 155 Mbps link with the CBR voice, Poisson
and Batch call arrival models respectively. Graphs c) and d) show
results for a 155 Mbps link with on-off voice, Poisson and Batch
arrival model respectively. Graph e) shows the results for a 1Gbps
link with on-off-video traffic, Poisson call arrival model. All these
results were obtained with min-marking-threshold = 5 ms,
max-marking-threshold = 15 ms, virtual-queue-upper-limit = 20 ms.

On slower links, the accuracy of the admission control algorithm was
lower with Poisson arrivals, and admission control was especially
challenging with burstier Batch arrivals. This is described in
section 10.3.3 below.

In general, we find that the admission control algorithm performs
better the larger the degree of aggregation of traffic on the link.
The algorithm performs well in the range of link speeds we expect to
see in a CL region.

10.3.3. Sensitivity to Poisson Arrivals assumption

We investigated whether making the call arrival process burstier than
Poisson has an effect on the performance of the admission control
algorithm. To that end we investigated the comparative performance of
the algorithm with Poisson and Batch call arrival processes,
described in section 10.2. The mean call arrival rate was the same
for both processes, with the demand overloads ranging from 2x to 5x.

We found that the admission control algorithm works reliably for both
CBR and VBR at links of 1Mbps and above for up to 5x overloads for
both Poisson and Batch call arrivals. We also found that the
admission control algorithm only works reasonably well at links of 1
Mb/s if we assume CBR traffic and Poisson arrival. At T1 speeds and
below, Batch arrivals resulted in over-admission, the degree of which
increased on slower links.

Table B.2 below summarizes the difference between the admitted load
and the configured-admission-rate for CBR Voice in the case of
Poisson and Batch arrivals.
Table B.3 provides a similar summary for on-off traffic simulating
voice with silence compression. The results in the tables correspond
to the worst case across all overload factors (and when multiple link
speeds are listed, across all those link speeds).

--------------------------------------------------------
|                |         | diff between  |           |
| Link type      | arrival | mean admitted | standard  |
|                | model   | load &        | deviation |
|                |         | conf-adm-rate |           |
--------------------------------------------------------
| 1Mbps, T1      | BATCH   |     30.0%     |   30.0%   |
--------------------------------------------------------
| 10 Mbps        | BATCH   |      5.0%     |    8.0%   |
--------------------------------------------------------
| T3,100Mbps,OC3 | BATCH   |      1.0%     |    1.0%   |
--------------------------------------------------------
| 1Mbps, T1      | POISSON |      5.0%     |   10.0%   |
--------------------------------------------------------
| 10 Mbps        | POISSON |      1.0%     |    2.0%   |
--------------------------------------------------------
| T3,100Mbps,OC3 | POISSON |      0.5%     |    0.5%   |
--------------------------------------------------------
Table B.2. Comparison of Poisson and Batch call arrival models for
CBR voice.
Note: T1 = 1.5Mbps, T3 = 45Mbps, OC3 = 155Mbps, OC12 = 622Mbps

--------------------------------------------------------
|                |         | diff between  |           |
| Link type      | arrival | mean admitted | standard  |
|                | model   | load &        | deviation |
|                |         | conf-adm-rate |           |
--------------------------------------------------------
| 1Mbps, T1      | BATCH   |     40.0%     |   30.0%   |
--------------------------------------------------------
| 10 Mbps        | BATCH   |      8.0%     |    6.0%   |
--------------------------------------------------------
| T3,100Mbps,OC3 | BATCH   |      3.0%     |    3.0%   |
--------------------------------------------------------
| 1Mbps, T1      | POISSON |     15.0%     |   20.0%   |
--------------------------------------------------------
| 10 Mbps        | POISSON |      7.0%     |    6.0%   |
--------------------------------------------------------
| T3,100Mbps,OC3 | POISSON |      2.5%     |    2.5%   |
--------------------------------------------------------
Table B.3. Comparison of Poisson and Batch call arrival models for
on-off voice with silence compression.
Note: T1 = 1.5Mbps, T3 = 45Mbps, OC3 = 155Mbps, OC12 = 622Mbps

10.3.4. Sensitivity to marking parameters

The behaviour of the admission control algorithm in all simulation
experiments did not substantially differ depending on whether the
marking was "ramp", i.e. whether a separate min-marking-threshold and
max-marking-threshold were used, with linear marking probability
between these thresholds, or whether the marking was "step" with the
min-marking-threshold and max-marking-threshold collapsed at the
max-marking-threshold value, and marking all packets with probability
1 above this collapsed threshold.

However, the difference between "ramp" and "step" may be more visible
in the multiple congestion point case (recall that only
single-congestion-point experiments have been performed so far).
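
The two marking shapes compared above can be written as one function
of the virtual queue size. This is a sketch under assumptions: the
function names are hypothetical, sizes are expressed in milliseconds
at link speed as in section 10.3.1.1, and a queue exactly at the
collapsed threshold is treated as unmarked because the text only
speaks of marking "above" it.

```python
import random


def marking_probability(vq_ms: float, min_thr_ms: float, max_thr_ms: float) -> float:
    """Probability of admission marking for a given virtual queue size."""
    if vq_ms <= min_thr_ms:
        return 0.0   # at or below min-marking-threshold: never mark
    if vq_ms >= max_thr_ms:
        return 1.0   # at or above max-marking-threshold: always mark
    # "ramp": linear between the two thresholds
    return (vq_ms - min_thr_ms) / (max_thr_ms - min_thr_ms)


def admission_mark(vq_ms, min_thr_ms, max_thr_ms, rng=random.random) -> bool:
    """Independent per-packet marking decision."""
    return rng() < marking_probability(vq_ms, min_thr_ms, max_thr_ms)


if __name__ == "__main__":
    for vq in (4, 10, 16):
        ramp = marking_probability(vq, 5, 15)    # separate thresholds
        step = marking_probability(vq, 15, 15)   # thresholds collapsed
        print(vq, ramp, step)
```

Passing the same value for both thresholds gives the "step" variant,
so the ramp branch (and its division) is never reached in that case.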

Another possible reason for this apparent lack of difference between
"ramp" and "step" may relate to the choice of the egress measurement
parameters and a relatively high CLE-threshold of 50%. Choosing a
lower CLE-acceptance threshold and a faster measurement timescale may
result in a better sensitivity to lower levels of marked traffic.

Investigating the interaction between settings of the marking
thresholds, the CLE-threshold, and the measurement parameters at the
egress is an area of future investigation.

In contrast, the limited number of simulation experiments we
performed indicate that the choice of the absolute value of the
min-marking-threshold, the max-marking-threshold and the
virtual-queue-upper-limit can have an effect on the algorithm
performance. Specifically, choosing the min-marking-threshold and the
max-marking-threshold too small may cause substantial
underutilization, especially on slow links. However, at larger values
of the min-marking-threshold and the max-marking-threshold,
preliminary experiments suggest the algorithm's performance is
insensitive to their values. The choice of the
virtual-queue-upper-limit affects the amount of over-admission (above
the configured-admission-rate threshold) in some cases, although this
effect is not consistent throughout the experiments.

Table B.4 below gives a summary of the difference between the
admitted load and the configured-admission-rate as a function of the
virtual queue parameters, for the 4 Mbps on-off traffic model. The
results in the table represent the worst case result among the
experiments with different degrees of demand overload in the range of
2x-5x. Typically, higher deviation of admitted load from the
configured-admission-rate occurs for the higher degree of demand
overload.

-----------------------------------------------------------
|           |                 | diff between  |           |
| Link type | min-threshold,  | mean admitted | standard  |
|           | max-threshold,  | load &        | deviation |
|           | upper-limit(ms) | conf-adm-rate |           |
-----------------------------------------------------------
| 1Gbps     | 5, 15, 20       |      6.0%     |    8.0%   |
-----------------------------------------------------------
| 1Gbps     | 1, 5, 10        |      2.0%     |    7.0%   |
-----------------------------------------------------------
| 1Gbps     | 5, 15, 45       |      2.0%     |    8.0%   |
-----------------------------------------------------------
| OC12      | 5, 15, 20       |      5.0%     |   11.0%   |
-----------------------------------------------------------
| OC12      | 1, 5, 10        |      2.0%     |   13.0%   |
-----------------------------------------------------------
| OC12      | 5, 15, 45       |      0.0%     |   10.0%   |
-----------------------------------------------------------
Table B.4. Sensitivity of 4 Mbps on-off "video" traffic to the
virtual queue settings.
Note: T1 = 1.5Mbps, T3 = 45Mbps, OC3 = 155Mbps, OC12 = 622Mbps

Impact of the virtual queue parameter setting is a subject of further
study.

10.3.5. Sensitivity to RTT

We performed a limited analysis of the sensitivity of the admission
control algorithm to the range of round-trip propagation times (which
is the dominant component of the control delay in the typical
environment using pre-congestion notification).

Specifically, we studied the case when different groups of flows
sharing a single bottleneck link in the network have a range of
round-trip delays between 22 and 220 ms, as shown in Figure B.2.
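
The 22 to 220 ms range follows from the propagation delays given in
section 10.1. The short calculation below reproduces it, assuming
(this is not stated explicitly in the text) that the round-trip time
is twice the one-way propagation delay and that the 100 access links
take the integer delays 1ms, 2ms, ..., 100ms; the names are
hypothetical.

```python
BOTTLENECK_DELAY_MS = 10  # one-way delay of the D-F link


def round_trip_ms(access_delay_ms: int) -> int:
    """RTT for an ingress whose access link has the given one-way delay."""
    return 2 * (access_delay_ms + BOTTLENECK_DELAY_MS)


# One access link per ingress node, 1ms to 100ms (assumed integer steps).
rtts = [round_trip_ms(delay) for delay in range(1, 101)]

if __name__ == "__main__":
    print(min(rtts), max(rtts))
```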

The results were good for all types of traffic tested, implying that
the admission control algorithm is not sensitive to either the
absolute value or the relative value of the round-trip propagation
time, at least in the range of values tested. We expect this to
remain true for a wider range of round-trip propagation times.

10.3.6. Future Work for Admission Control Experiments

Areas of future investigation include extending the study of
sensitivity to multiple congestion points and topologies, and further
investigation of sensitivity to factors such as marking parameters,
implementation details and time scale of egress measurements, and the
CLE-threshold. Variations on the marking algorithm will also be
studied.

Another area of investigation is to understand the sensitivity to the
ratio of configured-admission-rate to the actual queue service
rate/link speed, and specifically study how close the
configured-admission-rate can be to the actual queue draining rate. A
related investigation is to understand the effect of packet loss on
the admission control mechanisms. Packet loss can occur if the
configured-admission-rate is sufficiently close to the actual queue
rate.

More realistic video modelling and the mix of video and voice traffic
in the same queue are also areas of further study.

10.4. Flow Pre-emption Simulations

10.4.1. Flow Pre-emption Model and key parameters

The same single-congestion-point network model as described in
section 10.1 for admission control is used for flow pre-emption. Flow
arrival and traffic models are also the same as for the admission
control (CAC) simulations.

   In all flow pre-emption simulations, flows arrive at the ingress
   according to a Poisson process, with the mean load of "unrestricted"
   arrivals exceeding the pre-emption threshold by a factor of 2 to 5.
   However, as explained below, the pre-emption simulations involve a
   very sudden surge of traffic to simulate a network failure scenario.

   In the simulation, the router implementing PCN Pre-emption Marking
   operates as described in section 3, marking packets which find no
   token in the token bucket. When an egress gateway receives a marked
   packet from the ingress, it starts measuring its
   Sustainable-Aggregate-Rate for this ingress, if it is not already in
   the pre-emption mode.

   If a marked packet arrives while the egress is already in the
   pre-emption mode, the packet is ignored.

   The measurement is interval based, with a 100ms measurement interval
   chosen in all simulations.

   At the end of the measurement interval, the egress sends the
   measured Sustainable-Aggregate-Rate to the ingress, and leaves the
   pre-emption mode.

   When the ingress receives the sustainable rate from the egress, it
   starts its own interval immediately (unless it is already in a
   measurement interval), and measures its sending rate to that egress.
   Then, at the end of that measurement interval, it pre-empts the
   necessary amount of traffic. The ingress then leaves the pre-emption
   mode until the next time it receives the sustainable rate estimate
   from the egress.

   Due to time limitations, in all our simulations the ingress used the
   same measurement interval length as the egress. Investigation of the
   impact of different measurement intervals is an important area of
   future work.
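
   The marking and egress measurement steps described above can be
   sketched as follows. This is a minimal illustration under stated
   assumptions, not the simulator used for these experiments: the class
   names (TokenBucketMarker, EgressMeter) are hypothetical, tokens are
   counted in packets with a fill rate in packets per second, only
   unmarked packets are counted towards the Sustainable-Aggregate-Rate,
   and the end of an interval is only noticed when the next packet
   arrives.

```python
from dataclasses import dataclass


@dataclass
class TokenBucketMarker:
    """Pre-emption marks a packet that finds no token in the bucket."""
    rate_pps: float      # token fill rate in packets/s (assumed unit)
    depth: float         # bucket depth in packets, e.g. 64 for CBR
    tokens: float = 0.0
    last: float = 0.0

    def __post_init__(self):
        self.tokens = self.depth       # start with a full bucket

    def is_marked(self, now: float) -> bool:
        # Refill for the elapsed time, capped at the bucket depth.
        refill = (now - self.last) * self.rate_pps
        self.tokens = min(self.depth, self.tokens + refill)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return False               # token found: not marked
        return True                    # no token: pre-emption marked


class EgressMeter:
    """Interval-based measurement started by the first marked packet."""

    def __init__(self, interval: float = 0.1):   # 100 ms interval
        self.interval = interval
        self.start = None              # None: not in pre-emption mode
        self.unmarked_bits = 0

    def on_packet(self, now, size_bits, marked):
        """Return the measured rate (bit/s) when an interval ends."""
        report = None
        if self.start is not None and now - self.start >= self.interval:
            report = self.unmarked_bits / self.interval
            self.start = None          # leave pre-emption mode
        if self.start is None:
            if marked:                 # marked packet starts measuring
                self.start, self.unmarked_bits = now, 0
        elif not marked:
            self.unmarked_bits += size_bits
        # A marked packet arriving during a measurement is ignored.
        return report
```

   In this sketch the value returned by on_packet() stands for the rate
   that the egress would report to the ingress at the end of its
   interval.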

   To avoid excessive pre-emption due to rate measurement errors, we
   used two error factors, Error1 and Error2, to decide when to
   pre-empt and how much to pre-empt at the ingress. To that end, the
   ingress did not trigger pre-emption unless the sending rate it
   measured was greater than SAR + Error1 (SAR = Sustainable Aggregate
   Rate). Similarly, the ingress pre-empted enough flows to reduce its
   sending rate to SAR - Error2. Both Error1 and Error2 were in the
   range of 2-5% in all simulations.

   The configured-pre-emption-rate was set to 50% of link speed. Token
   bucket depth was set to 64 packets for CBR and 128 packets for
   on-off traffic.

   We only tested on the network shown in Figure B.1, and we
   experimented with different propagation delay values: 10ms, 50ms and
   100ms.

   Due to time limitations, only links above T3 rate were simulated in
   the pre-emption experiments.

   In all pre-emption experiments, we simulated a base load of traffic
   below the pre-emption threshold. At some point during the
   experiment, the load was suddenly increased to simulate a sudden
   overload such as might occur after a link failure causes rerouting
   of some traffic to a previously uncongested link. In order to model
   the fact that a link failure may cause flows to be rerouted to a
   particular link over a period of time, we simulated a "one-wave"
   traffic surge, where the extra flows arrived nearly simultaneously,
   and a "three-wave" traffic surge, where two surges of traffic arrive
   close together (within one measurement interval), followed by a
   third surge at a later time.

10.4.2. Summary of Flow Pre-emption Experiments

   Our initial simulations demonstrated that, in general, the
   performance of the flow pre-emption mechanism was good, and the
   appropriate amount of traffic was pre-empted in all simulated cases,
   as long as the depth of the pre-emption token bucket was set
   appropriately (64 packets for CBR, 128 or higher for on-off
   traffic). Pre-emption always occurred very fast (in particular, in
   the simulation graphs shown in the PDF version of this document,
   with a time granularity of 1 second, pre-emption looks
   instantaneous).

   Perhaps the most useful result of the simulation experiments we have
   been able to run so far is the importance of choosing a token bucket
   depth deep enough to accommodate the expected burstiness of CL
   traffic. If the token bucket depth is too small, instantaneous
   bursts may cause false pre-emption events. Note that if the traffic
   load is stable or decreasing, then marking some packets erroneously
   during an unexpected short burst does not cause any false
   pre-emption, because the measurement of the sustained rate is not
   affected by a small number of pre-emption-marked packets. However,
   if the traffic load is increasing (while still remaining below the
   pre-emption level on average), a packet marked for pre-emption
   because it found no tokens in the too-shallow token bucket may cause
   a false pre-emption event.

   Below are sample results for pre-emption experiments with CBR voice,
   on-off voice and on-off "video" traffic, and a Poisson call arrival
   model. In all these graphs a single overload event occurs in the
   middle of a simulation run, triggering pre-emption. Graphs a) and b)
   show pre-emption simulations on voice traffic (CBR and on-off) on a
   155Mbps link, with a pre-emption token bucket depth of 64 packets.
   Graph c) shows pre-emption of on-off "video" traffic on a 1Gbps
   link, with a pre-emption token bucket depth of 128 packets. All
   three experiments use Error1=Error2=5%, and the
   configured-pre-emption-rate set to 50% of the link rate.

10.4.3. Future Work on Flow Pre-emption Experiments

   Further work is required to study potential ways of reducing the
   sensitivity of the algorithm to the token bucket depth. One
   potential approach is to smooth out the pre-emption signal by
   requiring a certain number of pre-emption-marked packets to arrive
   at the egress before measurement of the sustainable rate is
   triggered. An obvious trade-off to be quantified is the
   corresponding increase in the reaction time to receiving a
   pre-emption-marked packet.

   Further quantification of the sensitivity to traffic burstiness and
   to rate measurement implementation and time scales is an important
   area for future work.

   More realistic video modelling and the mix of video and voice
   traffic in the same queue are also areas for further study.

   Another area of further investigation is the interaction of flow
   pre-emption and admission control, and specifically understanding
   how close the admission and pre-emption rates can be on one link. A
   related topic is the interaction of flow pre-emption and admission
   control triggered by different links for the same ingress-egress
   pair.

   The exact algorithm for selecting which flows to pre-empt in the
   case of variable rate flows and a mixture of traffic profiles is a
   subject of further study.
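
   For illustration, the ingress rule from Section 10.4.1 (pre-empt
   only when the measured sending rate exceeds SAR + Error1, then
   reduce it to SAR - Error2) can be sketched as below. The function
   name and signature are hypothetical, Error1 and Error2 are taken to
   be fractions of SAR (the text gives them only as percentages), and
   flows are pre-empted in list order purely as a placeholder, since
   the selection algorithm is left for further study.

```python
def flows_to_preempt(flow_rates, sar, error1=0.05, error2=0.05):
    """Return the indices of the flows to pre-empt, in list order.

    flow_rates: measured rate of each flow towards one egress (bit/s)
    sar:        Sustainable-Aggregate-Rate reported by that egress
    error1/2:   error factors, assumed to be fractions of sar
    """
    sending_rate = sum(flow_rates)
    if sending_rate <= sar * (1 + error1):
        return []                  # within measurement error: no action
    target = sar * (1 - error2)
    victims = []
    for index, rate in enumerate(flow_rates):
        if sending_rate <= target:
            break                  # enough traffic has been removed
        victims.append(index)      # placeholder policy: list order
        sending_rate -= rate
    return victims
```

   With 100 flows of 64 kbit/s each and a reported SAR of 5 Mbit/s,
   this sketch selects the first 26 flows, leaving 4.736 Mbit/s, which
   is just under the 4.75 Mbit/s target.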

   Representative graphs for pre-emption experiments are presented in
   the PDF version of this draft.

11. Appendix C - Alternative ways of encoding the Admission Marked and
    Pre-emption Marked States

   In this Appendix we list and discuss alternative ways of encoding
   the Admission Marked and Pre-emption Marked states. We ignore minor
   variants such as swapping the encoding for the Admission Marked and
   Pre-emption Marked states.

11.1. Alternative 1

   The first alternative is the one given in Section 5 above.

      +-----+-----+
      | ECN FIELD |
      +-----+-----+
      bit 6 bit 7
        0     0     Admission Marking
        0     1     ECT(1)
        1     0     ECT(0)
        1     1     Pre-emption Marking

      Other DSCPs   Not ECN capable

   Figure C.1: Encoding scheme Alternative 1

11.2. Alternative 2

   In the second alternative, both Admission Marking and Pre-emption
   Marking are encoded as '11', depending on the original ECT marking:

   o  Setting the ECN field of an ECT(1) packet to '11' indicates
      Admission Marking

   o  Setting the ECN field of an ECT(0) packet to '11' indicates
      Pre-emption Marking

      +-----+-----+
      | ECN FIELD |
      +-----+-----+
      bit 6 bit 7
        0     0     Not-ECT
        0     1     ECT(1/A)  re-mark ECT(1) to '11' to encode
                              Admission Marking
        1     0     ECT(0/P)  re-mark ECT(0) to '11' to encode
                              Pre-emption Marking
        1     1     Admission Marking or Pre-emption Marking

   Figure C.2: Encoding scheme Alternative 2

11.3. Alternative 3

   The third alternative is a combination of the previous two schemes.

      +-----+-----+
      | ECN FIELD |
      +-----+-----+
      bit 6 bit 7
        0     0     Admission Marking
        0     1     ECT(1/A)  re-mark ECT(1) to '00' to encode
                              Admission Marking
        1     0     ECT(0/P)  re-mark ECT(0) to '11' to encode
                              Pre-emption Marking
        1     1     Pre-emption Marking

      Other DSCPs   Not ECN capable

   Figure C.3: Encoding scheme Alternative 3

11.4. Alternative 4

   In the fourth alternative a packet is re-marked with a new DSCP to
   indicate Pre-emption Marking.

      +-----+-----+
      | ECN FIELD |
      +-----+-----+
      bit 6 bit 7
        0     0     Not ECN capable
        0     1     ECT(1)
        1     0     ECT(0)
        1     1     Admission Marking

      New DSCP      Pre-emption Marking

   Figure C.4: Encoding scheme Alternative 4

11.5. Alternative 5

   The fifth alternative doesn't include the ECN nonce.

      +-----+-----+
      | ECN FIELD |
      +-----+-----+
      bit 6 bit 7
        0     0     Not ECN capable
        0     1     PCN capable
        1     0     Admission Marking
        1     1     Pre-emption Marking

   Figure C.5: Encoding scheme Alternative 5

11.6. Comparison of Alternatives

   In this section we compare the encoding alternatives against various
   criteria. No scheme is perfect. We would like feedback and advice
   from the IETF community as to which is most suitable. The choice of
   how to encode the markings is non-trivial because we have five
   things we want to encode, and only four states available in the two
   bits of the ECN field:

   o  Admission Marking - the traffic level is such that the router
      Admission Marks the packet

   o  Pre-emption Marking - the traffic level is such that the router
      Pre-emption Marks the packet

   o  ECT(0) - the first ECT codepoint, for backwards compatibility
      with the ECN nonce

   o  ECT(1) - the other ECT codepoint, for backwards compatibility
      with the ECN nonce

   o  Not ECN - to indicate to a router that the traffic is not
      ECN-capable, and indeed not PCN-capable.

   Some of the issues won't be relevant in particular scenarios. For
   example, with the CL-region framework [CL-ARCH], the edge-to-edge
   region is a controlled environment, so an ECN (RFC3168) packet
   should never encounter a PCN-enabled router.

   Occasionally we use the terminology of the CL-region framework. This
   is merely to make the language more specific.

11.6.1. How compatible is the encoding scheme with RFC 3168 ECN?

   All the encoding schemes for Pre-Congestion Notification use the ECN
   field, so there will be interactions between PCN and ECN. Three
   aspects are:

   o  What happens if an ECN (RFC3168) packet encounters a PCN-enabled
      router?

   o  What happens if a PCN-capable packet encounters an ECN-enabled
      router?

   o  What happens if a flow that has been admitted, using the
      PCN-based admission control mechanism, wants to use ECN (i.e.
      from end-point to end-point as in RFC3168)?

   The first two bullets are about an "unusual" situation, perhaps
   where re-routing means that a PCN-enabled packet gets routed onto an
   ECN router, or perhaps where one of the CL-region's ingress gateways
   is misconfigured so that it allows ECN packets into the CL traffic
   class. The third bullet is when the end-point wants its flow, which
   has been reserved using PCN-based admission control, to also use ECN
   congestion control. There has been some discussion (and
   disagreement) about whether this is a realistic requirement [Floyd]
   [tsvwg-ml].

   o  What happens if an ECN (RFC3168) packet encounters a PCN-enabled
      router?

   The main issue here is what happens if traffic at the PCN-router is
   above the admission or pre-emption threshold, and what then happens
   when the ECN packet reaches the RFC3168 ECN end-point.

   Alternatives 2 and 4 are very safe. If the PCN-router Admission
   Marks a packet ('11'), the ECN end-point interprets this as the CE
   codepoint. The admission threshold is lower (perhaps much lower)
   than an ECN threshold would be.

   Alternative 3 is also safe. If the PCN-router Pre-emption Marks a
   packet ('11'), the ECN end-point interprets this as the CE
   codepoint.
   The pre-emption threshold is likely to be lower than an ECN
   threshold would be, and is definitely lower than the traffic level
   at which packets would start to be dropped.

   Alternative 5 is probably OK. However, if the level of RFC3168
   traffic is above the PCN router's configured-admission-rate but
   below its configured-pre-emption-rate, then packets are admission
   marked (to '10') but not pre-emption marked (to '11'). Therefore the
   ECN traffic would tend to block new PCN flows, but not reduce its
   own rate. This would be safer with the encodings for admission
   marking and pre-emption marking swapped.

   With Alternatives 1 and 3, if traffic is above the admission
   threshold then packets will be re-marked to '00'. A subsequent ECN
   router will therefore think the packet isn't ECN-capable.

   With Alternative 5 packets are admission marked to '10', which could
   confuse an RFC3168 ECN end-point using the ECN nonce.

   o  What happens if a PCN-capable packet encounters an ECN-enabled
      router?

   The main issue is what happens if the ECN-router is becoming
   congested, so that it changes the ECN field to '11' to indicate
   Congestion Experienced (CE).

   With Alternatives 1, 3 and 5, '11' will be interpreted as
   Pre-emption Marking, so the pre-emption mechanism will be triggered.

   With Alternative 2 either the pre-emption or admission mechanism
   would be triggered (depending on whether it was originally a '10' or
   '01' packet).

   With Alternative 4 the admission control mechanism will be
   triggered.

   Interpretation of '11' as pre-emption marking is probably safer than
   interpreting it as admission marking, because it then pre-empts
   flows going through a congested ECN router. However, it isn't
   clear-cut what 'safe' means in this context.

   o  What happens if a flow that has been admitted, using the
      PCN-based admission control mechanism, wants to use ECN (i.e.
      from end-point to end-point as in RFC3168)?

   For instance, with the CL-region framework, it isn't clear what the
   ingress gateway should do if it gets a packet with the CE codepoint,
   '11'. All the PCN encoding schemes have the same issue. Some
   options:

   -  The ingress gateway could re-set a '11' packet to one of the ECT
      codepoints. However, as far as the ECN end-point is concerned,
      the CE information is lost.

   -  The ingress gateway could pre-empt the flow. This is safer, but
      perhaps harsh, as the flow would now be handled by the
      non-PCN-capable class within the CL-region, and by the
      non-ECN-capable class after that.

   -  Tunnelling between the ingress and egress gateways, e.g. all
      PCN-capable traffic could be tunnelled. This preserves both the
      ECN and PCN functionality, but at the cost of the tunnelling.

11.6.2. Does the encoding scheme allow an "ECN-nonce"?

   The Explicit Congestion Notification (ECN)-nonce is an optional
   addition to ECN that protects against accidental or malicious
   concealment of marked packets from the TCP sender. It uses the two
   ECN-Capable Transport (ECT) codepoints in the ECN field of the IP
   header. It improves the robustness of congestion control by enabling
   co-operative senders to prevent receivers from exploiting ECN to
   gain an unfair share of network bandwidth.

   Pre-Congestion Notification is targeted at real-time traffic, which
   we'd expect to use UDP or DCCP rather than TCP. However, we imagine
   an "ECN-nonce" could be defined for DCCP and perhaps UDP with
   similar functionality to the ECN-nonce.
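
   Since the nonce relies on the two ECT codepoints, one quick check is
   whether an encoding still provides both of them. The sketch below
   writes the five alternatives of Sections 11.1 to 11.5 as lookup
   tables keyed by the ECN-field bits (bit 6, bit 7). The dictionary
   layout and the helper name keeps_both_ect are assumptions made for
   this example. Having both ECT codepoints is treated here only as a
   necessary condition; the sketch does not model whether interior
   re-marking can be detected.

```python
# ECN-field codepoints for each alternative, copied from Figures
# C.1 to C.5. Keys are the two ECN bits written as "bit6 bit7".
ENCODINGS = {
    1: {"00": "Admission Marking", "01": "ECT(1)",
        "10": "ECT(0)", "11": "Pre-emption Marking"},
    2: {"00": "Not-ECT", "01": "ECT(1/A)",
        "10": "ECT(0/P)",
        "11": "Admission Marking or Pre-emption Marking"},
    3: {"00": "Admission Marking", "01": "ECT(1/A)",
        "10": "ECT(0/P)", "11": "Pre-emption Marking"},
    4: {"00": "Not ECN capable", "01": "ECT(1)",
        "10": "ECT(0)", "11": "Admission Marking"},
    5: {"00": "Not ECN capable", "01": "PCN capable",
        "10": "Admission Marking", "11": "Pre-emption Marking"},
}


def keeps_both_ect(alternative):
    """True if the encoding has both an ECT(0) and an ECT(1) state."""
    states = ENCODINGS[alternative].values()
    has_ect0 = any(s.startswith("ECT(0") for s in states)
    has_ect1 = any(s.startswith("ECT(1") for s in states)
    return has_ect0 and has_ect1


with_both = [alt for alt in ENCODINGS if keeps_both_ect(alt)]
```

   Here with_both lists Alternatives 1 to 4; Alternative 5 has no ECT
   codepoints at all.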

   Analysing the encoding schemes in the context of an ECN-nonce:

   o  Alternatives 2 and 4 would allow an ECN-nonce

   o  Alternatives 1 and 3 would partly allow an ECN-nonce - in terms
      of the edge-to-edge framework, an egress gateway would be able to
      detect a cheating ingress gateway, but it wouldn't detect an
      interior router re-marking the ECN field from '11' to '00'.

   o  Alternative 5 wouldn't allow an ECN-nonce

   An alternative scheme intended to prevent cheating when using ECN
   for admission control is proposed in [Re-PCN]. This scheme claims to
   provide protection against a much wider range of cheating strategies
   than the ECN-nonce, including against cheating ingress nodes or
   senders, whereas the ECN-nonce requires the sender to be trusted.
   This scheme uses a bit outside the ECN field, so Alternative 5
   combined with that scheme could solve the problem of fitting five
   states into four codepoints.

11.6.3. Does the encoding scheme require new DSCP(s)?

   o  Alternatives 2 and 5 do not.

   o  Alternative 1 does not allow indication of a non-PCN-capable
      transport within the same DSCP as used by PCN-capable transports.
      Therefore, if the PCN-routers are used with a pre-existing
      scheduling behaviour (such as EF), an extra DSCP would have to be
      used to indicate the combination of PCN marking with EF
      scheduling.

   o  Alternative 4 needs a new DSCP so a PCN-router can Pre-emption
      Mark a packet.

   In Section 5 we suggested that the Expedited Forwarding DSCP might
   be used to indicate to a PCN-router that a packet is part of a
   PCN-capable flow. However, PCN could be used similarly to add
   admission control and flow pre-emption to other DSCP classes. With
   Alternative 4 a new DSCP would be needed for each PCN-enabled class.

   It's not clear to what extent the requirement for extra DSCP(s)
   matters.
   DSCPs are plentiful in an IP network, but scarce in an MPLS network,
   where the DSCP/ECN byte is mapped to the three MPLS header EXP bits
   [MPLS/EXP]. However, note that there is at least no need to encode
   the ECN-nonce in the MPLS EXP field, as it is sufficient to encode
   the ECN-nonce in the underlying IP header.

11.6.4. Impact on measurements

   With some of the Alternatives, the measurements (by the egress
   gateway, for instance) have to be modified:

   With Alternatives 2 and 3, it has to measure the rate of ECT(1/A) in
   order to deduce the total number of bits in admission marked
   packets.

   With Alternative 2, the egress moves into the pre-emption alert
   state if the rate of ECT(0/P) is significantly less than 50%. This
   is slower than the other Alternatives, which are triggered by a
   single pre-emption marked packet. It also makes it more likely that
   the egress moves into the pre-emption alert state when the traffic
   level doesn't actually justify this.

   With Alternative 4 the egress has to monitor the new DSCP in order
   to measure pre-emption marked packets.

11.6.5. Other issues

   With Alternatives 2 and 3, Admission Marking means re-marking the
   ECN field of a '01' packet and Pre-emption Marking means re-marking
   a '10' packet. Therefore extra work is required compared with the
   other Alternatives; exactly what the work is depends on the details
   of the framework using PCN.

   With Alternatives 1 and 5 Pre-emption Marking overwrites Admission
   Marking.

   With Alternative 4 Pre-emption Marking is indicated by a new DSCP.
   Some ECMP (Equal Cost Multipath Routing) algorithms use the DSCP
   field as one of the input fields used to calculate which link to
   forward a packet on. Therefore, with a network running ECMP there is
   a danger that a Pre-emption Marked packet might be forwarded on a
   different path to other PCN-capable packets.
   The extent to which this matters is for further study. It is not an
   issue for the other encoding Alternatives.

12. References

   A later version will distinguish normative and informative
   references.

   [CL-DEPLOY] B. Briscoe, P. Eardley, D. Songhurst, F. Le Faucheur,
               A. Charny, S. Dudley, J. Babiarz, K. Chan, G.
               Karagiannis, A. Bader, "A Deployment Model for
               Admission Control over DiffServ using Pre-Congestion
               Notification",
               draft-briscoe-tsvwg-cl-architecture-03.txt (work in
               progress), October 2006

   [DCAC]      Richard J. Gibbens and Frank P. Kelly, "Distributed
               connection acceptance control for a connectionless
               network", In: Proc. International Teletraffic Congress
               (ITC16), Edinburgh, pp. 941-952 (1999).

   [Floyd]     S. Floyd, "Specifying Alternate Semantics for the
               Explicit Congestion Notification (ECN) Field",
               draft-floyd-ecn-alternates-00.txt (work in progress),
               April 2005

   [GSPa]      Karsten (Ed.), Martin, "GSP/ECN Technology &
               Experiments", Deliverable: 15.3 PtIII, M3I Eu Vth
               Framework Project IST-1999-11429, URL:
               http://www.m3i.org/ (February, 2002) (superseded by
               [GSP-TR])

   [GSP-TR]    Martin Karsten and Jens Schmitt, "Admission Control
               Based on Packet Marking and Feedback Signalling -
               Mechanisms, Implementation and Experiments",
               TU-Darmstadt Technical Report TR-KOM-2002-03, URL:
               http://www.kom.e-technik.tu-darmstadt.de/publications/
               abstracts/KS02-5.html (May, 2002)

   [Hovell]    P. Hovell, R. Briscoe, G. Corliano, "Guaranteed QoS
               Synthesis - an example of a scalable core IP quality
               of service solution", BT Technology Journal, Vol 23 No
               2, April 2005

   [Re-PCN]    B. Briscoe, "Emulating Border Flow Policing using
               Re-ECN on Bulk Data",
               draft-briscoe-tsvwg-re-ecn-border-cheat-00 (work in
               progress), February 2006

   [RFC2119]   Bradner, S., "Key words for use in RFCs to Indicate
               Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC2211]   J. Wroclawski, "Specification of the Controlled-Load
               Network Element Service", RFC 2211, September 1997

   [RFC2474]   Nichols, K., Blake, S., Baker, F. and D. Black,
               "Definition of the Differentiated Services Field (DS
               Field) in the IPv4 and IPv6 Headers", RFC 2474,
               December 1998

   [RFC2475]   Blake, S., Black, D., Carlson, M., Davies, E., Wang,
               Z. and W. Weiss, "An Architecture for Differentiated
               Services", RFC 2475, December 1998.

   [RFC2597]   Heinanen, J., Baker, F., Weiss, W. and J. Wroclawski,
               "Assured Forwarding PHB Group", RFC 2597, June 1999.

   [RFC3168]   Ramakrishnan, K., Floyd, S. and D. Black, "The Addition
               of Explicit Congestion Notification (ECN) to IP", RFC
               3168, September 2001.

   [RFC3246]   B. Davie, A. Charny, J.C.R. Bennett, K. Benson, J.Y. Le
               Boudec, W. Courtney, S. Davari, V. Firoiu, D.
               Stiliadis, "An Expedited Forwarding PHB (Per-Hop
               Behavior)", RFC 3246, March 2002.

   [RFC3540]   N. Spring, D. Wetherall, D. Ely, "Robust Explicit
               Congestion Notification (ECN) Signaling with Nonces",
               RFC 3540, June 2003.

   [RMD]       A. Bader, L. Westberg, G. Karagiannis, C. Kappler, T.
               Phelan, "RMD-QOSM - The Resource Management in
               DiffServ QoS model", draft-ietf-nsis-rmd-06 (work in
               progress), February 2006

   [RTECN]     Babiarz, J., Chan, K. and V. Firoiu, "Congestion
               Notification Process for Real-Time Traffic",
               draft-babiarz-tsvwg-rtecn-05 (work in progress),
               October 2005.

   [tsvwg-ml]  Discussion on the TSVWG mailing list, Nov/Dec 2005.

   [Westberg]  L. Westberg, Z. R. Turanyi, D. Partain, A. Bader, G.
               Karagiannis, "Load Control of Real-Time Traffic",
               draft-westberg-loadcntr-04.txt (work in progress), Dec
               2005

   [Zhang]     J. Zhang, A. Charny, V. Liatsos, F. Le Faucheur,
               "Performance Evaluation of CL-PHB Admission and
               pre-emption Algorithms",
               draft-zhang-pcn-performance-evaluation.txt (work in
               progress), October 2005

Authors' Addresses

   Bob Briscoe
   BT Research
   B54/77, Sirius House
   Adastral Park
   Martlesham Heath
   Ipswich, Suffolk
   IP5 3RE
   United Kingdom
   Email: bob.briscoe@bt.com

   Dave Songhurst
   BT Research
   B54/69, Sirius House
   Adastral Park
   Martlesham Heath
   Ipswich, Suffolk
   IP5 3RE
   United Kingdom
   Email: dsonghurst@jungle.bt.co.uk

   Philip Eardley
   BT Research
   B54/77, Sirius House
   Adastral Park
   Martlesham Heath
   Ipswich, Suffolk
   IP5 3RE
   United Kingdom
   Email: philip.eardley@bt.com

   Vassilis Liatsos
   Cisco Systems, Inc.
   1414 Massachusetts Avenue
   Boxborough,
   MA 01719,
   USA
   Email: vliatsos@ciscoyahoo.com

   Francois Le Faucheur
   Cisco Systems, Inc.
   Village d'Entreprise Green Side - Batiment T3
   400, Avenue de Roumanille
   06410 Biot Sophia-Antipolis
   France
   Email: flefauch@cisco.com

   Anna Charny
   Cisco Systems, Inc.
   14164 Massachusetts Ave
   Boxborough,
   MA 01719
   USA
   Email: acharny@cisco.com

   Jozef Babiarz
   Nortel Networks
   3500 Carling Avenue
   Ottawa, Ont. K2H 8E9
   Canada
   Email: babiarz@nortel.com

   Kwok Ho Chan
   Nortel Networks
   600 Technology Park Drive
   Billerica, MA 01821
   USA
   Email: khchan@nortel.com

   Stephen Dudley
   Nortel Networks
   4001 E. Chapel Hill Nelson Highway
   P.O. Box 13010, ms 570-01-0V8
   Research Triangle Park, NC 27709
   USA
   Email: smdudley@nortel.com

   Georgios Karagiannis
   University of Twente
   P.O. BOX 217
   7500 AE Enschede,
   The Netherlands
   Email: g.karagiannis@ewi.utwente.nl

   Attila Báder
   attila.bader@ericsson.com

   Lars Westberg
   Ericsson AB
   SE-164 80 Stockholm
   Sweden
   Email: Lars.Westberg@ericsson.com

Intellectual Property Statement

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed
   to pertain to the implementation or use of the technology described
   in this document or the extent to which any license under such
   rights might or might not be available; nor does it represent that
   it has made any independent effort to identify any such rights.
   Information on the procedures with respect to rights in RFC
   documents can be found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use
   of such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository
   at http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard. Please address the information to the IETF at
   ietf-ipr@ietf.org

Disclaimer of Validity

   This document and the information contained herein are provided on
   an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE
   REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE
   INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR
   IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
   THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Copyright Statement

   Copyright (C) The Internet Society (2006).

   This document is subject to the rights, licenses and restrictions
   contained in BCP 78, and except as set forth therein, the authors
   retain all their rights.