Network Working Group                                         J. Babiarz
Internet-Draft                                                  X-G. Liu
Intended status: Informational                                   K. Chan
Expires: May 22, 2008                                             Nortel
                                                                M. Menth
                                                 University of Wuerzburg
                                                       November 19, 2007


                        Three State PCN Marking
                        draft-babiarz-pcn-3sm-01

Status of this Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on May 22, 2008.

Copyright Notice

   Copyright (C) The IETF Trust (2007).

Abstract

   This document proposes a mechanism for admission control and flow
   termination. It is based on the concept of pre-congestion
   notification (PCN) and uses three different codepoints for packet
   marking: "no pre-congestion", "admission-stop", and "excess-traffic".
   Therefore, the proposal is called three state marking (3sm). The
   behaviour of the edge nodes is presented; it differs from other
   proposals in its low complexity and its ability to cope with
   multipath routing. The algorithms required for packet metering and
   marking are explained in detail.

Table of Contents

   1.  Introduction
     1.1.  Requirements Notation
     1.2.  Terminology Used in this Document
     1.3.  Adapted Terminology
     1.4.  New Terminology
   2.  The 3sm Proposal
     2.1.  Packet Marking in 3sm
     2.2.  Admission Control in 3sm
       2.2.1.  Explicit Edge-to-Edge Tunnels (E3Tunnel)
     2.3.  End-to-End on-Path Signalling (End2PS)
       2.3.1.  Operation of Standard RSVP
       2.3.2.  Modification of Standard RSVP to Perform PCN-Based
               Admission Control
     2.4.  Edge-to-Edge on-Path Signalling (Edge2PS)
     2.5.  Flow Termination in 3sm
       2.5.1.  Marked Flow Termination (MFT)
       2.5.2.  Measured Rate Termination (MRT)
   3.  Three State PCN Marker with Marking Frequency Reduction
       (MFR) for Marked Flow Termination (MFT)
     3.1.  ET-Marker
       3.1.1.  Behaviour of SR-Metering and ET-Marking
       3.1.2.  Pseudo Code for the ET-Marker
       3.1.3.  Configuration of the ET-Marker
       3.1.4.  Characteristics of the Proposed ET-Marker
     3.2.  AS-Marker
       3.2.1.  Behaviour of AR-Metering and AS-Marking
       3.2.2.  Pseudo Code for AS-Marker
       3.2.3.  Configuration of the AS-Marker
       3.2.4.  Characteristics of the Proposed AS-Marker
     3.3.  Marking Codepoints
   4.  Benefits and Shortcomings of the 3sm Proposal
     4.1.  Benefits
     4.2.  Shortcomings of 3sm
   5.  Security Considerations
   6.  Changes from Previous Revision
   7.  Acknowledgements
   8.  Informative References
   Appendix A.  Overview of Token Bucket (TB) and Virtual Queue (VQ)
     A.1.  New features for marking algorithm
       A.1.1.  Virtual Queue (VQ)
       A.1.2.  Token Bucket (TB)
       A.1.3.  Tail Marking
       A.1.4.  Tail Marking with Marking Frequency Reduction (MFR)
       A.1.5.  Tail Marking with Packet Size Independent Marking
               (PSIM) and Proportional Marking Frequency
               Reduction (PMFR)
       A.1.6.  Threshold Marking
   Authors' Addresses
   Intellectual Property and Copyright Statements

1. Introduction

   Pre-Congestion Notification (PCN) builds on the concepts of
   [RFC3168], "The addition of Explicit Congestion Notification (ECN) to
   IP". It is used to implement admission control and flow termination
   for real-time flows (such as voice, video and multimedia streaming)
   in DiffServ [RFC2474], [RFC2475] enabled networks. Flow admission
   control determines whether a new flow can be added to the network
   without overloading any of its links, whereas flow termination
   reduces the current PCN traffic load by terminating marked flows when
   at least one link in the network is overloaded for some reason. For
   a general overview, the reader is referred to
   [I-D.ietf-pcn-architecture].

   This document describes the 3sm proposal, which is a special
   implementation of the general PCN framework
   [I-D.ietf-pcn-architecture]. It relies on three different packet
   markings: "no pre-congestion" (NP), "admission-stop" (AS), and
   "excess-traffic" (ET). Packets enter a network with NP (unmarked).
   The basic idea is as follows. Each link in a PCN domain has an
   admissible rate (AR). If the current PCN traffic rate of a link
   exceeds its AR, all PCN packets on that link are re-marked with AS.
   The PCN egress nodes monitor the packet markings and trigger the PCN
   ingress nodes to admit or reject new flow requests depending on
   whether they observe AS-marked packets or not. Similarly, each link
   has a supportable rate (SR).
   If the current PCN traffic rate of a link exceeds its SR, some of the
   PCN packets on that link are re-marked with ET. The PCN egress nodes
   detect ET-marked packets and pass their flow IDs to the appropriate
   flow termination entity. This concept is called marked flow
   termination (MFT).

   The 3sm architecture expects the above mentioned marking behaviour
   from PCN interior nodes. We propose simple metering and marking
   algorithms for that purpose: threshold marking for AS-marking and
   tail marking with marking frequency reduction (MFR) for ET-marking.
   We describe them in Section 3 based on a token bucket approach. To
   improve the termination behaviour of MFT, we suggest packet size
   independent marking (PSIM) and proportional MFR (PMFR) in Appendix A
   and show that the same behaviour can be specified based on a virtual
   queue.

   The document is structured as follows. After a short section on
   terminology, Section 2 focuses on the edge behaviour of the 3sm
   proposal. Section 3 provides detailed algorithms for the metering
   and marking behaviour expected from interior nodes in 3sm. Section 4
   lists benefits and shortcomings of the 3sm proposal. Appendix A
   summarizes background information about token bucket (TB) and virtual
   queue (VQ) metering and marking, as they are the basis for the
   proposed metering and marking mechanisms.

1.1. Requirements Notation

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

1.2. Terminology Used in this Document

   The terminology used in this document conforms to the terminology of
   [I-D.ietf-pcn-architecture]. However, we adapted some terminology
   for better readability and added some new terminology that we found
   useful in this document.

1.3. Adapted Terminology

   [I-D.ietf-pcn-architecture] has chosen very general terms for the
   rate thresholds, as they have different semantic meanings in
   different proposals for the implementation of the general PCN
   architecture. For better readability, we use names that are more
   intuitive within the 3sm proposal.

   o  Admissible rate (AR) - PCN-lower-rate

   o  Supportable rate (SR) - PCN-upper-rate

   o  AR-overload - Condition that the PCN traffic rate exceeds the AR
      of a link

   o  SR-overload - Condition that the PCN traffic rate exceeds the SR
      of a link

   o  No pre-congestion (NP) - Default marking for PCN packets that have
      not been carried over links with any type of pre-congestion (AR-
      overload or SR-overload).

   o  Admission-stop (AS) - PCN-lower-rate-marking

   o  Excess-traffic (ET) - PCN-upper-rate-marking

   o  AS-marking - Action of re-marking NP-marked packets to AS

   o  ET-marking - Action of re-marking NP- or AS-marked packets to ET

1.4. New Terminology

   We provide a brief definition of the terminology used in this
   document.

   o  PCN - Pre-Congestion Notification meters traffic rates per service
      class on a link and notifies the PCN egress nodes, using packet
      marking, whether a certain rate threshold is exceeded on any link
      of the path through the PCN-enabled network taken by the packet.
      The rate thresholds may be significantly lower than the line rates
      such that PCN egress nodes are notified long before queues build
      up in the buffers and real congestion occurs. PCN is intended for
      the implementation of measurement-based admission control and flow
      termination for real-time inelastic traffic, e.g., voice. The PCN
      marking in the packet headers needs to be standardized.

   o  ECN Field - Refers to the use of the standardized two bit field in
      the IP header that is used for signalling Explicit Congestion
      Notification [RFC3168].
      In the PCN framework the ECN field may be reused to signal two
      levels of PCN marking.

   o  Service class - By service class we mean a grouping of packets
      belonging to one or more applications or services that generate
      traffic with similar characteristics and require similar QoS
      treatment. See [RFC4594] for details.

   o  Admission Control - The function of admitting or blocking requests
      of new flows or sessions for access to the network to prevent AR-
      overload.

   o  Flow Termination - The function of terminating already admitted
      flows in the sense that they cannot continue to send PCN traffic.
      Only flows contributing to SR-overload are terminated to reduce
      SR-overload.

   o  Deployment model - A method to find the PCN information for the
      path of a new flow that requests admission to the PCN domain and
      to perform the required admission decision. Deployment models
      take advantage of information or protocols available in the
      networking scenario where PCN is to be deployed.

   o  Marked flow termination (MFT) - Flow termination function
      terminating marked flows.

   o  Measured rate termination (MRT) - Flow termination function
      terminating a certain rate of traffic that was directly or
      indirectly measured by the system.

   o  Marking frequency reduction (MFR) - Marked flow termination (MFT)
      requires that only some instead of all PCN packets exceeding the
      supportable rate (SR) of a link are marked. MFR achieves this,
      e.g., by adding a slowdown factor of "s" tokens to the fill state
      of the token bucket whenever a packet is ET-marked.

   o  Token bucket (TB) - One base mechanism for packet metering.

   o  Virtual queue (VQ) - Another, equivalent base mechanism for packet
      marking.

2. The 3sm Proposal

   First, a high-level overview of the packet marking in 3sm is
   presented. Then, the edge behaviour for the admission control and
   flow termination function is explained.

2.1. Packet Marking in 3sm

   PCN traffic can be classified by a DSCP or a group of DSCPs and is
   forwarded by an appropriate PHB. For each link, an admissible rate
   (AR) and a supportable rate (SR) are configured. PCN traffic enters
   the network with a "no pre-congestion" (NP) mark. PCN nodes meter
   the PCN traffic on every link. When the PCN traffic rate on a link
   exceeds the corresponding AR, the PCN node re-marks all NP-marked PCN
   packets to "admission-stop" (AS).

   Similarly, PCN nodes re-mark some non-ET-marked PCN packets to
   "excess-traffic" (ET) when the PCN traffic rate on a link exceeds the
   corresponding SR. Figure 1 summarizes the relation between the AR
   and SR thresholds and the marking behaviour. The SR normally is at
   least a delta above the AR and a delta below the maximum service rate
   for PCN traffic for the sake of stability of the measurement-based
   reactive system. If the PCN traffic rate is below AR, no packets are
   re-marked; if it is between AR and SR, all NP-marked PCN packets are
   re-marked to AS; and if it is above SR, some non-ET-marked PCN
   packets are re-marked to ET and all other NP-marked PCN packets are
   re-marked to AS. Hence, the meters and markers operate in a marking-
   aware mode: NP-marked packets can be re-marked to AS or ET, AS-marked
   packets can be re-marked to ET but not to NP, and ET-marked packets
   cannot be re-marked at all.
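
   As an illustration only, the three rate regimes and the marking-aware
   re-marking rule described above can be sketched in Python. The names
   Mark, remark and link_state are hypothetical and not part of the
   proposal. The sketch classifies a link from a given rate value,
   whereas real PCN nodes infer overload from their meters, and it does
   not model which packets are selected for ET-marking.

```python
from enum import IntEnum


class Mark(IntEnum):
    """PCN markings ordered by severity (illustrative encoding only)."""
    NP = 0  # no pre-congestion
    AS = 1  # admission-stop
    ET = 2  # excess-traffic


def remark(current: Mark, proposed: Mark) -> Mark:
    """Marking-aware re-marking: a mark can be escalated but never lowered.

    NP may become AS or ET, AS may become ET, and ET stays ET.
    """
    return max(current, proposed)


def link_state(rate: float, ar: float, sr: float) -> str:
    """Name the regime of Section 2.1 for a given PCN traffic rate."""
    if not 0 <= ar <= sr:
        raise ValueError("expected 0 <= AR <= SR")
    if rate > sr:
        return "AR- and SR-overload"
    if rate > ar:
        return "AR-overload"
    return "no overload"
```

   For example, remark(Mark.AS, Mark.NP) leaves the packet AS-marked,
   which mirrors the rule that AS-marked packets cannot be re-marked to
   NP.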

     PCN traffic rate
                 100%^
                     |  AR- and SR-overload:
                     |  re-mark SOME non-ET-marked
                     |  packets to ET and the remaining to AS,
                     |  indicating that AR and SR are exceeded
     Supportable rate|----------------------------------------------
                  SR |  AR-overload:
                     |  re-mark ALL NP-marked packets to AS,
                     |  indicating admissible rate is exceeded
     Admissible rate |----------------------------------------------
                  AR |
                     |  No overload: do not re-mark any packets
                     |
                   0%+------------------------------------------------->

                Figure 1: Packet re-marking by PCN nodes.

2.2. Admission Control in 3sm

   The admission of a flow request depends on the marking that is
   currently observed on the path the data packets of the future flow
   will take. However, providing this information is not trivial, and
   it may be obtained differently depending on the networking scenario.
   Therefore, we introduce three deployment models that use their own
   methods to get the PCN information for the path of a future data
   flow. These deployment models are adapted to special networking
   scenarios that are in use today and are named after the concept that
   helps to find the path of future data packets:

   o  explicit edge-to-edge tunnels (E3Tunnel)

   o  end-to-end on-path signalling (End2PS)

   o  edge-to-edge on-path signalling (Edge2PS)

2.2.1. Explicit Edge-to-Edge Tunnels (E3Tunnel)

   Explicit edge-to-edge tunnels provide an IP adjacency from a PCN
   ingress node to a PCN egress node. As a consequence, the information
   about the PCN egress node of a packet can be derived from the routing
   table. In addition, we assume that each tunnel is set up on an
   explicit path. This can be realized, e.g., by MPLS label switched
   paths (LSPs) or IP-in-IP tunnelling. We use the name E3Tunnel to
   abbreviate the term "explicit edge-to-edge tunnel" and as the name
   for the corresponding deployment model.
   An E3Tunnel aggregate is the ensemble of PCN traffic carried through
   the PCN domain over a common explicit E3Tunnel.

   In the presence of E3Tunnels, the tunnel a packet is forwarded over
   is derived from the forwarding table of the PCN ingress node. This
   information is required so that a new flow can request admission to
   the appropriate E3Tunnel. The PCN ingress and egress nodes have a
   context for each E3Tunnel they support. The PCN ingress node works
   per context in two different modes:

   o  the reject mode and

   o  the accept mode.

   In the reject mode, the PCN ingress node rejects new admission
   requests, while it admits them in its accept mode.

   Similarly, the PCN egress node works per context in two corresponding
   modes:

   o  the admission-stop mode and

   o  the admission-continue mode.

   The PCN egress node monitors the markings of the packets received
   from a specific E3Tunnel and, depending on their marking, switches to
   the admission-stop or admission-continue mode. Depending on its
   mode, the PCN egress node periodically sends "admission-stop" or
   "admission-continue" messages to the PCN ingress node belonging to
   the same E3Tunnel. If the PCN ingress node receives an "admission-
   stop" message, it switches within the corresponding E3Tunnel context
   to its reject mode, and if it receives an "admission-continue"
   message, it switches to its accept mode.

   Different algorithms can be applied by the PCN egress node to decide
   when to send which control messages. We give two examples.

   o  Option 1 (single-packet-based): If the PCN egress node observes a
      single marked packet from a specific E3Tunnel, it turns to the
      admission-stop mode. It returns to the admission-continue mode
      after some time unless it still observes marked packets
      occasionally. This option can be implemented without rate
      measurement and even without any counters requiring per-packet
      modification.

   o  Option 2 (CLE-based): The PCN egress node tracks the number of
      marked and unmarked packets from a specific E3Tunnel within a
      measurement interval. It calculates the congestion level estimate
      (CLE), which is the ratio of marked packets to all packets
      observed during this interval. If no packet was received within
      the last measurement interval, the PCN egress node switches to the
      admission-continue mode. If the PCN egress node is in the
      admission-continue mode and the CLE of the last measurement
      interval is larger than a predefined admission-stop threshold, it
      switches to the admission-stop mode. If the PCN egress node is in
      the admission-stop mode and the CLE is smaller than a predefined
      admission-continue threshold, it switches to the admission-
      continue mode.

2.3. End-to-End on-Path Signalling (End2PS)

   End-to-end on-path signalling sends PATH messages downstream to
   discover the path of future data packets and RESV messages upstream
   to trigger the admission request for the correct forward path using
   the gathered PATH information. The deployment model End2PS reuses
   the end-to-end on-path signalling protocol for the probing on which
   the admission decision is based. We first explain the basic
   operation of standard RSVP and then adapt it to perform PCN-based
   admission control.

2.3.1. Operation of Standard RSVP

   A popular example of such a protocol is RSVP [RFC2205]. With RSVP,
   the data source issues a PATH message which is carried hop-by-hop
   over the same path future data packets will take. To that end, the
   PATH message uses the same source and destination address as future
   data packets, and all other header fields that are possible input for
   routing and load balancing decisions need to be the same as well.
   When a PATH message arrives at an RSVP-capable node, a PATH state is
   established pointing to the previous hop before the PATH message is
   forwarded further downstream. When the PATH message arrives at the
   destination, the destination triggers the end-to-end reservation for
   the flow by sending a RESV message upstream along the nodes that set
   up a PATH state. In these nodes, the RESV message is processed. In
   particular, resource admission control is performed for the new flow
   request and, if it succeeds, the node forwards the RESV message to
   the previous hop recorded by the PATH state. This two-pass
   signalling approach guarantees that the reservation is done on the
   downstream path of the future data flow. In contrast to PATH
   messages, RESV messages have the source address of the sending node
   and the destination address of the hop pointed to by the PATH state.
   That way, the information about the downstream next hop of the future
   data stream is conveyed to the previous hop and the flow-related
   information is stored in a RESV state. RSVP is a soft-state
   protocol, i.e., the PATH and RESV control messages are periodically
   sent to keep the PATH and RESV states alive and, thereby, the flow
   reservations. Admission control needs to be performed for a flow
   only once, namely when no RESV state has been set up yet.

2.3.2. Modification of Standard RSVP to Perform PCN-Based Admission
       Control

   We assume that interior nodes of a PCN domain are RSVP-disabled.
   That means that they just forward RSVP messages without processing
   them, so that PCN ingress and egress nodes are neighboring RSVP-
   capable nodes. As a consequence, PCN ingress nodes decide whether
   new flows can be admitted and carried through the domain or not.

   When the initial PATH message travels downstream, it is either marked
   or not, and is eventually received by the PCN egress node.
   If no PATH state can be found for this flow at the PCN egress node,
   this PATH message is the first one and not a REFRESH message. If the
   PATH message is the first of the flow and if it is marked with
   "admission-stop", the RSVP engine sends back a PATHERR message to
   reject the flow. If the PATH message is not marked (NP), the RSVP
   PATH state is established at the PCN egress node and the PATH message
   is forwarded further downstream. REFRESH messages are just forwarded
   according to standard RSVP. When the PATH message arrives at the
   destination, a RESV message is sent. Eventually, the corresponding
   RESV message arrives at the PCN ingress node. When no RESV state is
   set up yet, this is the first RESV message and admission control must
   be performed. By the mere fact that the RESV message arrives, the
   PCN ingress node knows that the corresponding initial PATH message
   was not marked. Thus, it can admit any flow for which a new RESV
   message arrives.

   A single probe is sufficient in 3sm because 3sm marks all packets
   when AR-overload occurs. Note that RSVP is only an example for
   End2PS; End2PS works equally well with other two-pass on-path
   signalling protocols.

2.4. Edge-to-Edge on-Path Signalling (Edge2PS)

   In the absence of explicit edge-to-edge tunnels or other on-path
   signalling protocols, PCN information about the path that future
   packets of a new flow will take is still required for a qualified
   admission decision at the PCN ingress node. To that end, we propose
   to use an edge-to-edge on-path signalling protocol (Edge2PS).

   Again, we assume that all interior nodes of the PCN domain are RSVP-
   disabled. When a request for the admission of a new flow arrives,
   e.g., signalled via SIP INVITE or other means, the PCN ingress and
   egress nodes act as RSVP proxies for the source and destination node
   of the data flow and set up a reservation request between the PCN
   ingress and egress node. The source and destination address of the
   PATH messages are those of the actual data flow. This guarantees
   that they are carried on the same path as future data packets will
   be. The PCN edge nodes do not forward the RSVP control messages
   outside the PCN domain. Instead, the PCN egress node returns a RESV
   message to the PCN ingress node. The RSVP messages created by PCN
   nodes on behalf of the flow need to be discerned somehow from regular
   end-to-end RSVP messages because they must not be forwarded outside
   the PCN domain.

   With this addition, the above presented modification of standard RSVP
   can be used to admit flows based on a single probe message. Note
   that there is no stringent need to use RSVP for Edge2PS. A less
   complex two-pass protocol suffices.

2.5. Flow Termination in 3sm

   Although flows are admitted only if the PCN traffic rate does not
   exceed the admissible rate (AR) on any link of their paths, it is
   possible that the PCN traffic rate on a link exceeds the SR, e.g.,
   due to changed sending behaviour of admitted flows or due to route
   changes after a failure.

   With 3sm, two different options for flow termination can be
   supported: marked flow termination (MFT) and measured rate
   termination (MRT). We explain them in the following.

2.5.1. Marked Flow Termination (MFT)

   Each PCN egress node monitors its received PCN traffic. If it
   detects an ET-marked packet, the corresponding PCN ingress node is
   identified and the PCN egress node sends the flow ID belonging to the
   ET-marked packet in a "traffic-reduction" message to the flow
   termination entity (e.g., the appropriate PCN ingress node) for
   termination.
   To save signalling overhead, several IDs may be signalled in a single
   "traffic-reduction" message.

   Marked flow termination can be applied with any deployment model.
   How the PCN ingress node of an ET-marked packet is derived depends on
   the deployment model:

   o  With E3Tunnel, the adjacency from which the packet is received
      defines the E3Tunnel such that the corresponding PCN ingress node
      is known.

   o  With End2PS or Edge2PS, the PCN egress nodes can map the packets
      to RSVP reservations using RSVP classifiers, and the corresponding
      RSVP PATH state contains the address of the PCN ingress node.

   Marked flow termination (MFT) requires that only some of the packets
   that exceed the supportable rate are ET-marked. Otherwise, MFT
   terminates too many flows. With MFT, only a small fraction of the
   traffic is removed within a round-trip time, but the process
   continues if the PCN traffic rate still exceeds the supportable rate
   (SR). Therefore, the PCN traffic rate is gradually reduced until it
   drops below SR. Then, the flow termination process stops since no
   more packets are ET-marked.

2.5.2. Measured Rate Termination (MRT)

   Measured rate termination can be applied only in the E3Tunnel
   deployment model. It requires that all packets exceeding SR are ET-
   marked. Each PCN egress node measures the rate of ET-marked packets
   (ETR) per E3Tunnel and, if it is larger than zero, signals that ETR
   to the corresponding flow termination entity in a "traffic-
   reduction" message. The flow termination entity may be the
   corresponding PCN ingress node, which then terminates a subset of the
   flows carried over the respective E3Tunnel such that the rate of
   these flows is about ETR. However, this is not an easy task since
   the actual rates of the flows are in general not known. If SR-
   overload continues in spite of the flow termination, the PCN egress
   node must wait some time before it sends a new "traffic-reduction"
   message to guarantee that the impact of the previous one is already
   reflected by the newly measured rate of ET-marked traffic (ETR).

   Measured rate termination can reduce SR-overload very quickly, but it
   has several drawbacks:

   o  Rate measurement of ET-marked packets is complex.

   o  The choice of the right subset of flows is difficult and requires
      that the rates of individual flows are known.

   As marked flow termination (MFT) is simpler than measured rate
   termination (MRT), we propose MFT as the preferred method for flow
   termination in 3sm.

3. Three State PCN Marker with Marking Frequency Reduction (MFR) for
   Marked Flow Termination (MFT)

   The three state PCN marker (3sm) meters PCN packet streams per link
   and performs packet re-marking according to Figure 1. As a
   consequence, the following three marking states are required:

   o  no pre-congestion (NP),

   o  admission-stop (AS), and

   o  excess-traffic (ET).

   In theory, the meter meters each packet and passes the packet and the
   metering result to the marker, and the marker marks packets according
   to the results of the meter. This is illustrated in Figure 2. The
   marking may be coded in the ECN field [RFC3168] of the packet for a
   specified PHB in a specific manner.

                           +------------+
                           |   Result   |
                           |            V
                       +-------+    +--------+
                       |       |    |        |
     Packet stream ===>| Meter |===>| Marker |===> Marked packet stream
                       |       |    |        |
                       +-------+    +--------+

          Figure 2: Block diagram of meter and marker function.

   The behaviour of the two functions is often described by a single
   metering and marking algorithm.
   Therefore, we refer to the algorithm that meters packets relative to
   the admissible rate (AR) and re-marks them to AS simply as the
   AS-marker, and to the algorithm that meters packets relative to the
   supportable rate (SR) and re-marks them to ET simply as the
   ET-marker.

   In the following, we explain the behaviour of the ET- and AS-marker
   using token bucket (TB) based algorithms.  The packet sizes counted
   by the meters and markers pertain to the size of the IP packet
   including its header bytes.  Equivalent virtual queue (VQ) based
   algorithms are presented in Appendix A.  Other implementations
   approximating the described behaviours can be used.

3.1.  ET-Marker

   We describe the behaviour for SR-metering and ET-marking, present
   pseudo code, explain its configuration, and discuss its behaviour.
   We use object-oriented notation for most variables.

3.1.1.  Behaviour of SR-Metering and ET-Marking

   We propose an ET-marker based on a token bucket with tail marking and
   marking frequency reduction (see Appendix A for an explanation of the
   different options).  The TB has a bucket of size TB.size which is
   continuously filled with tokens at rate TB.rate.  When a PCN packet
   arrives and the fill state of the bucket (TB.fill) in tokens is
   smaller than the packet size (packet.size) in bytes, the packet is
   re-marked to "ET" and "s" additional tokens are added to the bucket;
   otherwise, the fill state is reduced by packet.size tokens.  The
   slowdown parameter "s" reduces the marking frequency of the
   algorithm.

3.1.2.  Pseudo Code for the ET-Marker

   The behaviour of the token bucket with tail marking and marking
   frequency reduction for SR-metering and ET-marking is expressed by
   the following pseudo code.  It requires the time variable
   TB.lastUpdate indicating when the fill state of TB was last updated
   and a global variable "now" providing the current time.
   A PCN packet has the variables packet.mark showing its marking (NP,
   AS, ET) and packet.size showing its size.

   Input: pcn packet

      TB.fill = min(TB.size, TB.fill + TB.rate * (now - TB.lastUpdate));
      TB.lastUpdate = now;
      if (TB.fill < packet.size)
         packet.mark = ET;
         TB.fill = min(TB.size, TB.fill + s);
      else
         TB.fill = TB.fill - packet.size;
      endif

   Output: void

   Several enhancements of this algorithm are presented in
   Appendix A.1.5.  The enhanced algorithm marks packets independently
   of their size and applies MFR proportionally to the packet size.
   Early results show that this equalizes the termination probability
   for flows with different packet sizes and makes the time to remove
   the overload independent of packet sizes.  In addition, MFR is also
   applied when packets are already ET-marked by previous nodes.  This
   reduces the potential overtermination in case of multiple bottleneck
   links.  These enhancements are simple, but still make the algorithm
   slightly more complex.  Therefore, more performance results and
   discussions are needed to decide whether these enhancements should
   be standard marking behaviour or not
   [I-D.babiarz-pcn-explicit-marking], [I-D.menth-pcn-performance].

3.1.3.  Configuration of the ET-Marker

   The following parameters must be configured:

   o  TB.rate: supportable rate (SR)

   o  TB.size: supportable burst size (SBS), needs to be set
      appropriately

   o  Slowdown parameter "s": needs to be set appropriately

3.1.4.  Characteristics of the Proposed ET-Marker

   The proposed algorithm can be applied with and without marking
   frequency reduction (MFR), i.e., s>0 and s=0, respectively.

   a) No marking frequency reduction (noMFR, s=0)

   If the slowdown parameter is set to s=0, MFR is switched off.  As an
   alternative, a simplified version of the given algorithm can be used.
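
   The effect of the slowdown parameter can be illustrated with a small
   simulation.  The following Python sketch is not part of the
   proposal: it transcribes the ET-marker pseudo code of Section 3.1.2
   and feeds it constant-rate traffic at twice SR.  The names Packet,
   EtMarker and count_et_marks, as well as all numeric values, are
   assumptions chosen for illustration only.

```python
from dataclasses import dataclass

NP, AS, ET = "NP", "AS", "ET"


@dataclass
class Packet:
    size: int        # IP packet size in bytes, header included
    mark: str = NP   # current marking: NP, AS or ET


@dataclass
class EtMarker:
    rate: float                # TB.rate: supportable rate (SR), bytes/second
    size: float                # TB.size: supportable burst size (SBS), bytes
    s: float = 0.0             # slowdown parameter; s = 0 switches MFR off
    fill: float = 0.0          # TB.fill: current number of tokens
    last_update: float = 0.0   # TB.lastUpdate, in seconds

    def process(self, packet: Packet, now: float) -> None:
        # Refill the bucket for the time elapsed since the last update.
        elapsed = now - self.last_update
        self.fill = min(self.size, self.fill + self.rate * elapsed)
        self.last_update = now
        if self.fill < packet.size:
            # Tail marking: not enough tokens, so the packet is ET-marked.
            packet.mark = ET
            # Marking frequency reduction: credit "s" tokens.
            self.fill = min(self.size, self.fill + self.s)
        else:
            self.fill -= packet.size


def count_et_marks(s: float) -> int:
    """Count ET marks for 1000 packets sent at twice the assumed SR."""
    marker = EtMarker(rate=1000.0, size=1500.0, s=s, fill=1500.0)
    marked = 0
    for i in range(1, 1001):
        # One 100-byte packet every 50 ms is 2000 bytes/second.
        packet = Packet(size=100)
        marker.process(packet, now=i * 0.05)
        if packet.mark == ET:
            marked += 1
    return marked
```

   With these assumed values, count_et_marks(0) marks roughly every
   second packet once the bucket is drained, whereas
   count_et_marks(500) marks far fewer packets in the same interval.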
   If the PCN traffic rate on a link constantly exceeds its SR, the fill
   state of the TB decreases.  Arriving packets for which the number of
   tokens in the bucket does not suffice are ET-marked.  The size of the
   token bucket (supportable burst size (SBS)) controls how fast the
   marker reacts to a traffic rate above SR: if it is set to a low
   value, packets are already marked at a rate lower than SR in the
   presence of bursts; if it is set to a high value, marking starts
   with a delay when the PCN traffic rate exceeds SR.  A nice property
   of this option is that the rate of the ET-marked packets is exactly
   the rate of the traffic that exceeds SR (excess traffic rate) under
   the assumption that there was no packet loss.  This observation can
   be used for "measured rate termination" (MRT): the rate of ET-marked
   traffic is measured per ingress-egress aggregate by the PCN egress
   node and signalled to the corresponding flow termination entity.

   Measured rate termination (MRT) is also an option to realize flow
   termination in 3sm, but the preferred option for 3sm is marked flow
   termination (MFT).  The drawback of "no MFR" arises in conjunction
   with MFT: too many packets are ET-marked and too many flows are
   terminated.  That is the motivation to reduce the marking frequency
   by setting s>0.

   b) Marking frequency reduction (MFR, s>0)

   If the slowdown parameter is set to a value s>0, MFR is achieved
   because for each marked packet up to "s" additional bytes that exceed
   SR can pass the ET-marker without being re-marked to ET.  Thus,
   increasing the slowdown parameter "s" decreases the number of
   ET-marked packets in a short time period.  In combination with MFT, a
   suitable "s" is required to achieve a fast termination of
   sufficiently many flows without terminating more flows than
   necessary.

3.2.  AS-Marker

   We explain the behaviour for AR-metering and AS-marking, present
   pseudo code, explain its configuration, and discuss its behaviour.

3.2.1.  Behaviour of AR-Metering and AS-Marking

   We propose an AS-marker based on a token bucket with threshold
   marking (see Appendix A for an explanation of the different options).
   The TB has a bucket of size TB.size which is continuously filled with
   tokens at rate TB.rate.  The AR-meter and AS-marker consider only
   packets that are not ET-marked.  When a non-ET-marked PCN packet
   arrives, it is re-marked to "AS" if the fill state of the bucket
   (TB.fill) is smaller than the marking threshold (TB.threshold);
   otherwise, it is not re-marked.  In both cases, the fill state is
   then reduced by packet.size tokens, but not below zero.

   The AS-marker is sensitive to ET-markings in the sense that only
   non-ET-marked packets are considered for re-marking.

3.2.2.  Pseudo Code for AS-Marker

   The behaviour of the token bucket with threshold marking for
   AR-metering and AS-marking is expressed by the following pseudo code
   using the same nomenclature as above.

   Input: pcn packet

      TB.fill = min(TB.size, TB.fill + TB.rate * (now - TB.lastUpdate));
      TB.lastUpdate = now;
      if (packet.mark <> ET)   // mark only non-ET-marked packets
         if (TB.fill < TB.threshold)
            packet.mark = AS;
         endif
         TB.fill = max(0, TB.fill - packet.size);
      endif

   Output: void

3.2.3.  Configuration of the AS-Marker

   The following parameters must be configured:

   o  TB.rate: admissible rate (AR)

   o  TB.size (TBS): admissible burst size (ABS), needs to be set
      appropriately

   o  TB.threshold: needs to be set appropriately

3.2.4.  Characteristics of the Proposed AS-Marker

   If the AR is exceeded, the TB fill state continuously decreases and
   eventually falls below its marking threshold TB.threshold.  It only
   increases again if the PCN traffic rate on the link falls below AR.
   As a consequence, all packets are AS-marked during that time and
   admission of further flows is stopped until the PCN traffic rate
   drops below AR.  In particular, all probe packets are AS-marked.

   [TR437] investigates the impact of the TB parameters on the marking
   characteristics, i.e., the percentage of marked packets depending on
   the PCN traffic rate relative to AR.  If the PCN traffic rate is
   above AR, all packets must be marked with AS.  To that end,
   TB.threshold must be set large enough.  If the PCN traffic rate is
   below AR, no packets should be marked.  To that end,
   TB.size-TB.threshold must be set large enough.  Then, we get marking
   with clear decisions, i.e., packets are marked if the PCN rate is
   above AR and they are not marked if the PCN rate is below AR.  This
   type of marking is required for 3sm.  [I-D.menth-pcn-performance]
   provides a summary of [TR437].

3.3.  Marking Codepoints

   PCN metering and marking requires classification of traffic which is
   subject to PCN metering and PCN marking (PCN-capable).  Furthermore,
   PCN-aware flows that are subject to PCN-marking require at least the
   following codepoints:

   o  "no-precongestion" (NP),

   o  "admission-stop" (AS), and

   o  "excess-traffic" (ET).

   These signals may be encoded by re-using the two-bit ECN field or by
   different DS codepoints.  The actual encoding is out of the scope of
   this document.

4.  Benefits and Shortcomings of the 3sm Proposal

   This section highlights some benefits of the 3sm proposal that we
   think are important.
   Some of them are based on performance results reported in [SIM-07],
   [I-D.babiarz-pcn-explicit-marking], [I-D.menth-pcn-performance], and
   [TR437].

4.1.  Benefits

   o  3sm does not require the standardization of the exact algorithm in
      the PCN interior node but only the standardization of the
      behaviour.

   o  3sm does not require periodic transmission of measurement results
      from PCN egress to PCN ingress.

   o  The supportable rate of a link can be chosen independently of the
      admissible rate as long as it is not smaller.  This gives maximum
      freedom and does not imply any degradation in resource efficiency
      for resilient admission control [PCN-Config].

   o  3sm works well with multipath routing due to marked flow
      termination and the application of probing when needed.

      *  A flow is terminated only if it is carried over an
         SR-overloaded path and if one of its packets was ET-marked.
         Therefore, its termination reduces the SR-overload on a
         bottleneck link.

      *  Conversely, flows that are not carried over a bottleneck link
         cannot be ET-marked and, therefore, are not terminated.

   o  Marked flow termination is very robust:

      *  The slowdown factor "s" can be reasonably chosen such that the
         time to remove the overload is short and that not more traffic
         than necessary is terminated.

      *  It works well with low and high packet loss.  Packet loss just
         increases the time to remove the overload.

      *  It works well with a small and large number of flows.  In
         particular, the right number of flows is terminated if there is
         only one flow per aggregate and an SR-overload of, e.g., 30%.

      *  It works well with constant bit rate (CBR) voice, on/off voice,
         and variable bit rate (VBR) traffic.  It is not very sensitive
         to different traffic characteristics [SIM-07].

      *  It works well with multiple bottleneck links in the path of a
         flow.

      *  It works well with flows that have significantly different
         round-trip times (RTT) or flow termination delays.  In
         particular, flows with different RTTs suffer the same flow
         termination probability.

      *  In case of terminating bidirectional flows, the termination of
         a flow entails the termination of the downstream and upstream
         flow.  In case of overload on both the downstream and upstream
         path, flows are terminated on both paths.  This leads to
         increased termination aggressiveness.  However, marked flow
         termination leads only to little overtermination as it
         terminates traffic only gradually.

      *  Different slowdown factors "s" may be used for the
         configuration of marking frequency reduction on different links
         within a single PCN domain.

      *  It leads to higher termination probabilities for flows with
         larger packets.  This can be repaired by packet size
         independent marking (PSIM, in Appendix A.1.5).

4.2.  Shortcomings of 3sm

   o  Marked flow termination leads to higher termination probabilities
      for flows with larger packet frequency (higher rate flows).
      Currently, there is no mechanism to improve this.

5.  Security Considerations

   The Three State PCN Marker has no known security concerns.

6.  Changes from Previous Revision

   Changes in version -01 compared to version -00:

   *  Abstract: adapted: more general, better summary

   *  shortened introduction and adapted it to new document structure

   *  put "1.3 Terminology" into separate section, clarification and
      some minor changes

   *  put "1.2 Overview of PCN" into a separate section on 3sm edge
      behaviour

   *  added more structure and content to that section

      o  deployment scenarios for admission control

      o  added measured rate termination (MRT) as a non-preferred option
         to marked flow termination (MFT) in 3sm

   *  introduced AS-marker as abbreviation for AR-metering and
      AS-marking function

   *  introduced ET-marker as abbreviation for SR-metering and
      ET-marking function

   *  removed comment about similarity of 3sm marking and srTCM
      (RFC2697)

   *  current 3.1: some reformulation, added some new simulation
      insights, moved optimizations into Appendix A.1.5

   *  current 3.2: some reformulation, added some new simulation
      insights, changed the AS-marker according to

      o  Joe Babiarz's comment saying all packets need to be considered
         for AR-metering

      o  Bob Briscoe's comment saying TB.fill should be set to zero if
         it is smaller than packet.size

      o  provided insights for the setting of TB parameters to obtain
         desired marking behaviour

   *  new section on benefits of the 3sm proposal added

   *  A.1 - A.1.4 minor changes (rewording)

   *  A.1.5 new section on optimization for 3sm marker (PSIM, PMFR, MFR
      for already marked packets)

   *  A.1.6 simplified threshold marking according to Bob's comments

   *  old A.6 Section on Related Work (srTCM) removed as srTCM without
      change cannot be reused for implementation of threshold marking

   *  removed Appendix B and integrated the comments into the previous
      sections of the document if necessary

7.  Acknowledgements

   The authors would like to thank the following people for reviewing
   this draft or earlier versions thereof and for their suggestions to
   make this document more complete: Dave McDysan, Nicolas Chevrollier,
   Frank Lehrieder, Bob Briscoe, and Ben Strulo.

8.  Informative References

   [I-D.babiarz-pcn-explicit-marking]
              Liu, X. and J. Babiarz, "Simulations Results for 3sM",
              draft-babiarz-pcn-explicit-marking-01 (work in progress),
              July 2007.

   [I-D.ietf-pcn-architecture]
              Eardley, P., "Pre-Congestion Notification Architecture",
              draft-ietf-pcn-architecture-01 (work in progress),
              October 2007.

   [I-D.menth-pcn-performance]
              Menth, M. and F. Lehrieder, "Performance Evaluation of
              PCN-Based Algorithms", draft-menth-pcn-performance-00
              (work in progress), November 2007.

   [Maglaris-88]
              Maglaris et al, "Performance Models of Statistical
              Multiplexing in Packet Video Communications", IEEE
              Transactions on Communications 36, pp. 834-844,
              July 1988.

   [PCN-Config]
              Menth, M. and M. Hartmann, "PCN-Based Resilient Network
              Admission Control: The Impact of a Single Bit"
              (http://www3.informatik.uni-wuerzburg.de/staff/menth/
              Publications/Menth07-PCN-Config.pdf), 2007.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC2205]  Braden, B., Zhang, L., Berson, S., Herzog, S., and S.
              Jamin, "Resource ReSerVation Protocol (RSVP) -- Version 1
              Functional Specification", RFC 2205, September 1997.

   [RFC2474]  Nichols, K., Blake, S., Baker, F., and D. Black,
              "Definition of the Differentiated Services Field (DS
              Field) in the IPv4 and IPv6 Headers", RFC 2474,
              December 1998.

   [RFC2475]  Blake, S., Black, D., Carlson, M., Davies, E., Wang, Z.,
              and W. Weiss, "An Architecture for Differentiated
              Services", RFC 2475, December 1998.

   [RFC3168]  Ramakrishnan, K., Floyd, S., and D.
              Black, "The Addition of Explicit Congestion Notification
              (ECN) to IP", RFC 3168, September 2001.

   [RFC4594]  Babiarz, J., Chan, K., and F. Baker, "Configuration
              Guidelines for DiffServ Service Classes", RFC 4594,
              August 2006.

   [SIM-07]   Liu, X-G. and J. Babiarz, "Simulation Results for Explicit
              PCN Marking and Flow Termination"
              (http://standards.nortel.com/pcn/Simulation_EPCN.pdf),
              February 2007.

   [TR437]    Menth, M. and F. Lehrieder, "Comparison of Marking
              Algorithms for PCN-Based Admission Control", Technical
              Report No. 437
              (http://www-info3.informatik.uni-wuerzburg.de/TR/tr437.pdf),
              October 2007.

Appendix A.  Overview of Token Bucket (TB) and Virtual Queue (VQ)

   Token buckets (TB) and virtual queues (VQ) are equivalent base
   mechanisms for algorithms to control whether a packet conforms to
   its flow's indicated rate R and burst size S.  Therefore, TB
   parameters are frequently used as traffic descriptors.  TBs and VQs
   are dual approaches: while packets are TB-conformant as long as
   sufficient tokens are in the bucket at their arrival times, they are
   VQ-conformant as long as sufficient free space is available in the
   queue at their arrival times.  Therefore, TBs and VQs can be used
   interchangeably and, in particular, algorithms given based on a TB
   description can be implemented by a VQ and vice versa.

   In the following, we explain the basic VQ and TB mechanisms
   (Appendix A.1.1 and Appendix A.1.2).  Packets are marked depending on
   the state of the VQ or TB at their arrival time.  There are different
   marking options.
   Only those packets that do not conform to their flow description may
   be marked (tail marking, Appendix A.1.3), or only some non-conforming
   packets may be marked (tail marking with marking frequency reduction,
   Appendix A.1.4), or all packets may be marked until the flow again
   reaches conformity (threshold marking, Appendix A.1.6).

A.1.  New features for marking algorithm

A.1.1.  Virtual Queue (VQ)

   We use an object-oriented notation to make the algorithms easier to
   read.  The VQ has a VQ rate (VQ.rate) and a queue which is capable of
   storing up to VQ.size bytes.  The current length of the queue is
   denoted by VQ.length.  This length is reduced over time at rate
   VQ.rate.  When a packet arrives, it is "accepted" by the VQ and
   increments VQ.length by its size (packet.size) if there is still
   enough free space in the queue to accommodate it; otherwise it is
   "rejected".  As the queue length is decreased continuously over time,
   the behaviour of a VQ is best described by a fluid model.  However,
   the state of the VQ shortly after packet arrivals can be calculated
   based on the current time "now" and the length of the VQ at the last
   update time of the VQ (VQ.lastUpdate) using the following algorithm:

      VQ.length = max(0, VQ.length - (now - VQ.lastUpdate) * VQ.rate);
      VQ.lastUpdate = now;
      if (VQ.length + packet.size <= VQ.size)
         VQ.length = VQ.length + packet.size;
      endif

A.1.2.  Token Bucket (TB)

   The TB is basically the same mechanism, but it looks at the problem
   from a different angle.  The TB has a rate (TB.rate) and a bucket
   which is capable of storing up to TB.size tokens.  A token is the
   permission to send one byte.  The current fill state of the bucket is
   denoted by TB.fill.  This fill state is increased over time at rate
   TB.rate.
   When a packet arrives, it is "accepted" and decrements TB.fill by its
   size (packet.size) if there are enough tokens in the bucket to send
   the entire packet; otherwise it is "rejected".  As the fill state is
   increased continuously over time, the behaviour of a TB is best
   described by a fluid model.  However, the state of the TB shortly
   after packet arrival can be calculated based on the current time
   "now" and the fill state of the TB at the last update time of the TB
   (TB.lastUpdate) using the following algorithm:

      TB.fill = min(TB.size, TB.fill + (now - TB.lastUpdate) * TB.rate);
      TB.lastUpdate = now;
      if (TB.fill >= packet.size)
         TB.fill = TB.fill - packet.size;
      endif

A.1.3.  Tail Marking

   To check whether the packets of a stream conform to the description
   given by rate R and maximum burst size MBS, the stream is metered
   either by a VQ with VQ.rate=R and VQ.size=MBS, or by a TB with
   TB.rate=R and TB.size=MBS.  If a packet is accepted by the VQ or by
   the TB, it is marked in-profile.  If it is rejected, it is marked
   out-of-profile.

   The corresponding pseudo codes are for the VQ:

      VQ.length = max(0, VQ.length - (now - VQ.lastUpdate) * VQ.rate);
      VQ.lastUpdate = now;
      if (VQ.length + packet.size <= VQ.size)
         VQ.length = VQ.length + packet.size;
         packet.mark = in-profile;
      else
         packet.mark = out-of-profile;
      endif

   and for the TB:

      TB.fill = min(TB.size, TB.fill + (now - TB.lastUpdate) * TB.rate);
      TB.lastUpdate = now;
      if (TB.fill >= packet.size)
         TB.fill = TB.fill - packet.size;
         packet.mark = in-profile;
      else
         packet.mark = out-of-profile;
      endif

A.1.4.  Tail Marking with Marking Frequency Reduction (MFR)

   The objective of tail marking with MFR is to mark only some of the
   packets that are out-of-profile.  The strength of the reduction can
   be controlled by the slowdown parameter "s".
   When a packet is classified out-of-profile, the VQ length is
   decremented by "s" bytes and the TB fill state is incremented by "s"
   tokens, respectively.  As a consequence, the VQ and the TB are not
   likely to mark consecutive packets as out-of-profile, which reduces
   their marking frequency.

   The corresponding pseudo codes are for the VQ:

      VQ.length = max(0, VQ.length - (now - VQ.lastUpdate) * VQ.rate);
      VQ.lastUpdate = now;
      if (VQ.length + packet.size <= VQ.size)
         VQ.length = VQ.length + packet.size;
         packet.mark = in-profile;
      else
         VQ.length = max(0, VQ.length - s);   // marking freq. reduction
         packet.mark = out-of-profile;
      endif

   and for the TB:

      TB.fill = min(TB.size, TB.fill + (now - TB.lastUpdate) * TB.rate);
      TB.lastUpdate = now;
      if (TB.fill >= packet.size)
         TB.fill = TB.fill - packet.size;
         packet.mark = in-profile;
      else
         TB.fill = min(TB.size, TB.fill + s);   // marking freq. reduction
         packet.mark = out-of-profile;
      endif

   If the slowdown parameter is set to s=0, the marking algorithm
   behaves like pure tail marking.

A.1.5.  Tail Marking with Packet Size Independent Marking (PSIM) and
        Proportional Marking Frequency Reduction (PMFR)

   For the sake of fairness, large packets should not have a larger
   packet marking probability; otherwise this creates incentives to send
   many small packets instead of a few large packets.  Therefore, we
   propose to test at each packet arrival whether a packet of MTU size
   could still be supported.  Then, the marking probability depends only
   on VQ.length or TB.fill, respectively.

   For the sake of better control of the termination aggressiveness,
   more flows should be marked and terminated in the presence of low bit
   rate flows than in the presence of high bit rate flows.  To that end,
   we propose to remove (VQ) or add (TB) the slowdown factor
   proportionally to the packet size.

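
   The two rules above, PSIM and PMFR, can be sketched for the TB
   variant as follows.  This Python fragment is an illustrative sketch
   only: the names TokenBucket and et_mark_psim_pmfr, the default MTU
   of 1500 bytes, and all numeric values in the usage note are
   assumptions, not part of the proposal.

```python
from dataclasses import dataclass


@dataclass
class TokenBucket:
    rate: float                # TB.rate in bytes per second
    size: float                # TB.size in bytes
    fill: float                # TB.fill: current number of tokens
    last_update: float = 0.0   # TB.lastUpdate in seconds


def et_mark_psim_pmfr(tb: TokenBucket, packet_size: int, now: float,
                      s: float, mtu: int = 1500) -> bool:
    """Meter one packet; return True if it is to be ET-marked."""
    tb.fill = min(tb.size, tb.fill + tb.rate * (now - tb.last_update))
    tb.last_update = now
    if tb.fill >= mtu:
        # PSIM: the test is against the MTU, not the packet's own size,
        # so the decision depends only on the fill state.
        tb.fill -= packet_size
        return False
    # PMFR: the slowdown credit is scaled by packet_size / MTU.
    tb.fill = min(tb.size, tb.fill + s * packet_size / mtu)
    return True
```

   For example, with an assumed fill state of 1000 tokens and s=1500, a
   100-byte packet and a 1500-byte packet are both marked, but the
   small packet credits only 100 tokens whereas the large one credits
   1500.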
   The reason for adding "s" tokens to the bucket when a packet is
   marked is the anticipation of the termination of that flow.  This is
   done to approximate the fill state with this flow already being
   terminated in order to avoid that too many additional flows are
   marked.  The same condition is met when an already ET-marked packet
   arrives at the ET-marker.  Therefore, the same slowdown factor "s" is
   added to the bucket of the ET-marker as if it had ET-marked the
   packet itself.  This is done to avoid that more traffic than
   necessary is terminated in case a traffic aggregate crosses several
   bottleneck links.  However, this situation is rather unlikely and the
   effect is clearly visible only under pathological conditions.
   Therefore, optimal behaviour is not crucial and the algorithm may be
   simplified by ignoring packets that are already ET-marked.

   These are obvious enhancements of the base algorithm, but they are
   only recommended when solid performance results indicate that the
   improvements have a significant impact in practice.  These studies
   are still ongoing.

   The following pseudo codes incorporate all proposed enhancements.
   We give them for a VQ approach:

      VQ.length = max(0, VQ.length - (now - VQ.lastUpdate) * VQ.rate);
      VQ.lastUpdate = now;
      if (packet.mark <> ET)
         if (VQ.length + MTU <= VQ.size)   // PSIM
            VQ.length = VQ.length + packet.size;
         else
            VQ.length = max(0, VQ.length - s*packet.size/MTU); // PMFR
            packet.mark = ET;
         endif
      else
         VQ.length = max(0, VQ.length - s*packet.size/MTU);   // PMFR
      endif

   and for the TB:

      TB.fill = min(TB.size, TB.fill + (now - TB.lastUpdate) * TB.rate);
      TB.lastUpdate = now;
      if (packet.mark <> ET)
         if (TB.fill >= MTU)   // PSIM
            TB.fill = TB.fill - packet.size;
         else
            TB.fill = min(TB.size, TB.fill + s*packet.size/MTU); // PMFR
            packet.mark = ET;
         endif
      else
         TB.fill = min(TB.size, TB.fill + s*packet.size/MTU);   // PMFR
      endif

A.1.6.  Threshold Marking

   The objective of threshold marking is to mark all packets with, e.g.,
   "rate-exceeded" as long as some packets are out-of-profile with
   respect to flow parameters R and MBS.  We achieve that by setting the
   VQ or TB size larger than MBS.  Packets are marked if the VQ length
   exceeds MBS or if the fill state of the TB falls below TB.size-MBS.
   Furthermore, the packet size is always added to the queue or removed
   from the bucket.  Note that marking thresholds need to be configured
   differently for VQ and TB to obtain the same behaviour:

   o  VQ.threshold = MBS;

   o  TB.threshold = TB.size - MBS.

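
   The duality of the two thresholds can be checked with a short
   Python sketch.  It is an illustration under stated assumptions, not
   part of the proposal: both meters use the same rate and size, the VQ
   starts empty and the TB starts full, and the threshold is tested
   before the packet is charged.  The function names and all numeric
   values are hypothetical.

```python
def vq_threshold_marks(arrivals, rate, size, mbs):
    """Virtual queue: mark while VQ.length exceeds VQ.threshold = MBS."""
    length, last, marks = 0.0, 0.0, []
    for now, packet_size in arrivals:
        length = max(0.0, length - (now - last) * rate)
        last = now
        marks.append(length > mbs)
        # The packet size is always added, capped at VQ.size.
        length = min(size, length + packet_size)
    return marks


def tb_threshold_marks(arrivals, rate, size, mbs):
    """Token bucket: mark while TB.fill is below TB.size - MBS."""
    threshold = size - mbs
    fill, last, marks = size, 0.0, []
    for now, packet_size in arrivals:
        fill = min(size, fill + (now - last) * rate)
        last = now
        marks.append(fill < threshold)
        # The packet size is always removed, floored at zero.
        fill = max(0.0, fill - packet_size)
    return marks


# Assumed traffic: one 200-byte packet per second for 10 seconds at a
# metered rate of 100 bytes/second, then one packet after a 10 s pause.
arrivals = [(float(t), 200.0) for t in range(1, 11)] + [(20.0, 200.0)]
vq_marks = vq_threshold_marks(arrivals, rate=100.0, size=1000.0, mbs=400.0)
tb_marks = tb_threshold_marks(arrivals, rate=100.0, size=1000.0, mbs=400.0)
```

   Under these assumptions, TB.fill equals VQ.size minus VQ.length at
   every arrival, so both meters mark exactly the same packets.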
   The corresponding pseudo codes are for the VQ:

      VQ.length = max(0, VQ.length - (now - VQ.lastUpdate) * VQ.rate);
      VQ.lastUpdate = now;
      if (VQ.length > VQ.threshold)
         packet.mark = "rate-exceeded";   // packet out-of-profile
      endif
      VQ.length = min(VQ.size, VQ.length + packet.size);

   and for the TB:

      TB.fill = min(TB.size, TB.fill + (now - TB.lastUpdate) * TB.rate);
      TB.lastUpdate = now;
      if (TB.fill < TB.threshold)
         packet.mark = "rate-exceeded";   // packet out-of-profile
      endif
      TB.fill = max(0, TB.fill - packet.size);

Authors' Addresses

   Jozef Z. Babiarz
   Nortel
   3500 Carling Avenue
   Ottawa, Ont.  K2H 8E9
   Canada

   Phone: +1-613-763-6098
   Email: babiarz@nortel.com


   Xiao-Gao Liu
   Nortel
   3500 Carling Avenue
   Ottawa, Ont.  K2H 8E9
   Canada

   Phone: +1-613-763-7516
   Email: xgliu@nortel.com


   Kwok Ho Chan
   Nortel
   600 Technology Park Drive
   Billerica, MA  01821
   USA

   Phone: +1-978-288-8175
   Email: khchan@nortel.com


   Dr. Michael Menth
   University of Wuerzburg
   Institute of Computer Science
   Am Hubland, D-97074 Wuerzburg, Room B206
   Germany

   Phone: (+49)-931/888-6644
   Email: menth@informatik.uni-wuerzburg.de

Full Copyright Statement

   Copyright (C) The IETF Trust (2007).

   This document is subject to the rights, licenses and restrictions
   contained in BCP 78, and except as set forth therein, the authors
   retain all their rights.

   This document and the information contained herein are provided on an
   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
   THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
   OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
   THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Intellectual Property

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights.  Information
   on the procedures with respect to rights in RFC documents can be
   found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use of
   such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository at
   http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard.  Please address the information to the IETF at
   ietf-ipr@ietf.org.

Acknowledgment

   Funding for the RFC Editor function is provided by the IETF
   Administrative Support Activity (IASA).