PWE3                                                         Y(J). Stein
Internet-Draft                                   RAD Data Communications
Intended status: Standards Track                            July 1, 2009
Expires: January 2, 2010


               Ethernet PW Congestion Handling Mechanisms
                    draft-stein-pwe3-ethpwcong-00.txt

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that other
   groups may also distribute working documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on January 2, 2010.

Copyright Notice

   Copyright (c) 2009 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents in effect on the date of
   publication of this document (http://trustee.ietf.org/license-info).
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.

Abstract

   Mechanisms for handling congestion in Ethernet pseudowires are
   presented. These mechanisms extend capabilities of the native
   service across the PSN, and require use of the PWE3 control word.

Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

Table of Contents

   1. Introduction
   2. Control Word Format
   3. Drop Eligibility Indication
   4. Explicit Congestion Notification
   5. Security Considerations
   6. IANA Considerations
   7. References
      7.1. Normative References
      7.2. Informative References
   Author's Address

1. Introduction

   Ethernet PWs do not presently have mechanisms for handling AC
   congestion. When the egress AC becomes congested, the egress PE will
   receive PAUSE (802.3x) frames or experience back-pressure, denying
   it the capability of forwarding frames to the AC. This will result
   in the egress PE's output buffers filling up, and eventually Ethernet
   frames will need to be discarded. Not only are such frames lost
   after precious PSN bandwidth has already been consumed, they are also
   discarded without regard to importance, priority, or fairness.

   If the Ethernet frames being transported are carrying TCP/IP traffic,
   then TCP rate cut-back will limit the traffic volume to some extent.
   However, the early discard that triggers the rate cut-back also
   results in packet retransmission, adding further Ethernet PW traffic
   to be transported. When the Ethernet frames are not carrying TCP/IP,
   but rather UDP/IP, or any other non-TCP/IP traffic that does not
   react to packet discard by cutting back the transmission rate, the
   situation is potentially worse.

   The native Ethernet service handles congestion by causing the sender
   to stop sending frames.
   On full duplex links this is accomplished by the congested receiver
   sending PAUSE frames. On half-duplex networks this is accomplished
   by the congested receiver introducing back-pressure. In either case
   the effect is that the sender stops forwarding frames until the
   receiver is once again ready to process them, thus eliminating
   congestion.

   Ethernet PWs do not transport received congestion indications across
   the PSN, nor do they generate congestion indications when the egress
   PE detects congestion.

   It is possible to rectify this lack of functionality by adding
   indications in the PWE3 control word. The arbitrariness of the
   packets discarded can be alleviated by including a drop eligibility
   indication. The loss itself can possibly be avoided by mechanisms
   that explicitly indicate forward and backward congestion. Such
   indications enable a PE to reflect the egress AC congestion status
   back towards the ingress AC, where steps can be taken to limit the
   ingress rate, thus avoiding buffer overflow.

2. Control Word Format

   The mechanisms described herein are only available when the Ethernet
   PW employs the PWE3 control word. Thus, when congestion handling is
   supported, the control word MUST be included in the PW packet. The
   use of the control word is usually signaled using the PWE3 control
   protocol [RFC4447]. There is no need to additionally signal the use
   of the mechanisms described herein, as the default actions suffice.

   The format of the control word is given in Figure 1 and has been
   chosen to be compatible with that of RFC 4619 [RFC4619].

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |0 0 0 0|F|B|D|R|FRG|   Length  |         Sequence Number       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                    Figure 1. Control Word structure

   Bits 0 to 3  In the above diagram, the first 4 bits MUST be set to 0.

   F (bit 4)  Forward Explicit Congestion Notification (FECN) bit.

   B (bit 5)  Backward Explicit Congestion Notification (BECN) bit.

   D (bit 6)  Drop Eligibility Indicator (DEI) bit.

   R (bit 7)  RESERVED bit.

   FRG (bits 8 and 9)  described in RFC 4623 [RFC4623].

   Length (bits 10 - 15)  described in RFC 4385 [RFC4385].

   Sequence Number (bits 16 - 31)  described in RFC 4385 [RFC4385] and
      service specific encapsulation documents.

3. Drop Eligibility Indication

   If drop eligibility is supported, then the ingress PE MUST set the
   Drop Eligibility Indicator (DEI) bit in the PWE3 control word, and
   during congestion the egress PE MUST preferentially discard Ethernet
   frames that arrived in PW packets with the DEI bit set.

   When the ingress PE receives a Q-in-Q Ethernet frame from the AC, it
   MUST copy the DEI bit from the Ethernet frame into the DEI bit in the
   PWE3 control word.

   The ingress PE SHOULD perform the MEF bandwidth profile (token
   bucket) algorithm [MEF10.1]. Frames marked red MUST be discarded,
   and green and yellow frames MUST be encapsulated and forwarded. For
   yellow frames the ingress PE MUST set the DEI bit in the PWE3 control
   word.

   Intermediate network elements MUST NOT clear the DEI bit.
   Intermediate PW-aware network elements (e.g., S-PEs) MAY set the DEI
   bit upon experiencing congestion, if they run the MEF bandwidth
   profile (token bucket) algorithm.

   When the egress PE needs to discard an Ethernet frame, it MUST
   discard packets with the DEI bit set before discarding packets with
   the DEI bit cleared.

   When the egress PE forwards a Q-in-Q Ethernet frame to the AC, it
   MUST copy the DEI bit from the PWE3 control word into the DEI bit in
   the Ethernet frame.

4. Explicit Congestion Notification

   If explicit congestion notification is supported, then the egress PE
   MUST make the ingress PE aware of the congestion experienced, and the
   ingress PE MAY make the egress PE aware of such congestion. An
   ingress PE being informed of congestion by the egress PE SHOULD take
   steps to alleviate this congestion.

   If the egress PE receives PAUSE frames, detects Ethernet
   back-pressure, or detects that its AC-bound queues pass a
   preconfigured threshold, then it MUST set the BECN bit in the PWE3
   control word of all PW packets sent in the opposite direction towards
   the ingress PE. If no packets are available for sending in the
   backward direction, the egress PE MUST send dummy BECN PW packets
   towards the ingress PE at a preconfigured rate (default is one per
   second). These dummy BECN packets have their BECN bit set and their
   length field set to zero, and contain no data.

   When the egress PE PAUSE timer expires, or it detects that
   back-pressure that had been applied has been removed, or its AC-bound
   queues drop below a preconfigured threshold, it MUST clear the BECN
   bit of all PW packets sent towards the ingress PE. If no packets are
   available for sending in the backward direction, the egress PE MUST
   send three dummy BECN PW packets towards the ingress PE at a
   preconfigured rate (default is one per second). These dummy BECN
   packets have their BECN bit cleared and their length field set to
   zero, and contain no data.

   Intermediate network elements MUST NOT clear the BECN bit.
   Intermediate PW-aware network elements (e.g., S-PEs) upon
   experiencing congestion MAY set the BECN bit on packets forwarded in
   the opposite direction.

   When the ingress PE receives packets with the BECN bit set (including
   dummy BECN packets), it SHOULD perform one of the following
   operations to ameliorate the situation.

   It SHOULD send PAUSE packets or apply back-pressure towards the
   ingress AC.

   If its Ethernet interface does not support PAUSE or back-pressure, it
   SHOULD apply the MEF bandwidth profile algorithm to frames received
   from the AC before sending them towards the PSN.

   If the ingress PE has admission control functionality, it SHOULD
   refuse further connections with traffic that would be forwarded to
   the egress PE, and MAY withdraw low priority connections.

   If the ingress PE detects that its output queues pass a preconfigured
   threshold, then it SHOULD send PAUSE frames or apply back-pressure to
   the AC. It SHOULD also set the FECN bit in the PWE3 control word of
   all PW packets sent towards the egress PE, in order to inform the
   egress PE to expect delays.

   Intermediate network elements MUST NOT clear the FECN bit.
   Intermediate PW-aware network elements (e.g., S-PEs) MAY set the FECN
   bit upon experiencing congestion in the forward direction.

   If packets with FECN set have been sent, then when the ingress PE
   sees that its PSN-bound queues drop below a preconfigured threshold,
   it MUST clear the FECN bit of all PW packets sent towards the egress
   PE. If no packets are available for sending in the forward
   direction, the ingress PE MUST send three dummy FECN PW packets
   towards the egress PE at a preconfigured rate (default is one per
   second). These dummy FECN packets have their FECN bit cleared and
   their length field set to zero, and contain no data.

5. Security Considerations

   The congestion handling mechanisms introduced here do not introduce
   significant security considerations above those present for PWs that
   do not use these mechanisms. For example, a denial of service attack
   based on forcing the ingress PE to slow down would require the
   ability to inject otherwise valid PW packets.
   A malicious entity that has attained that level has already breached
   the fundamental security of the PW infrastructure.

6. IANA Considerations

   This document requires no IANA actions.

7. References

7.1. Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC4385]  Bryant, S., Swallow, G., Martini, L., and D. McPherson,
              "Pseudowire Emulation Edge-to-Edge (PWE3) Control Word for
              Use over an MPLS PSN", RFC 4385, February 2006.

   [RFC4623]  Malis, A. and M. Townsley, "Pseudowire Emulation
              Edge-to-Edge (PWE3) Fragmentation and Reassembly",
              RFC 4623, August 2006.

   [MEF10.1]  "MEF Technical Specification MEF 10.1 - Ethernet Service
              Attributes Phase 2", Metro Ethernet Forum MEF 10.1,
              November 2006.

7.2. Informative References

   [RFC4447]  Martini, L., Rosen, E., El-Aawar, N., Smith, T., and G.
              Heron, "Pseudowire Setup and Maintenance Using the Label
              Distribution Protocol (LDP)", RFC 4447, April 2006.

   [RFC4619]  Martini, L., Kawa, C., and A. Malis, "Encapsulation
              Methods for Transport of Frame Relay over Multiprotocol
              Label Switching (MPLS) Networks", RFC 4619,
              September 2006.

Author's Address

   Yaakov (Jonathan) Stein
   RAD Data Communications
   24 Raoul Wallenberg St., Bldg C
   Tel Aviv 69719
   ISRAEL

   Phone: +972 3 645-5389
   Email: yaakov_s@rad.com
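
   As a non-normative illustration, the control word layout of Figure 1
   and the egress discard rule of Section 3 might be sketched in Python
   as follows. The names ControlWord, dummy_becn, and pick_victim are
   hypothetical and do not come from this document. The sketch assumes
   that the RESERVED bit is sent as zero and ignored on receipt, and
   that the oldest eligible packet is the one discarded; the text above
   only requires that packets with DEI set be discarded before packets
   with DEI cleared.

```python
import struct
from dataclasses import dataclass

# Bit 0 is the most significant bit of the 32-bit control word (Figure 1).
F_BIT = 1 << 27  # bit 4: FECN
B_BIT = 1 << 26  # bit 5: BECN
D_BIT = 1 << 25  # bit 6: DEI


@dataclass
class ControlWord:
    """Hypothetical in-memory view of the control word in Figure 1."""
    fecn: bool = False
    becn: bool = False
    dei: bool = False
    frg: int = 0      # bits 8-9
    length: int = 0   # bits 10-15
    seq: int = 0      # bits 16-31

    def pack(self) -> bytes:
        """Encode to 4 bytes in network byte order."""
        if not (0 <= self.frg <= 0x3 and 0 <= self.length <= 0x3F
                and 0 <= self.seq <= 0xFFFF):
            raise ValueError("field out of range")
        word = (self.frg << 22) | (self.length << 16) | self.seq
        if self.fecn:
            word |= F_BIT
        if self.becn:
            word |= B_BIT
        if self.dei:
            word |= D_BIT
        # Bits 0-3 and the RESERVED bit (assumption) are left as zero.
        return struct.pack("!I", word)

    @classmethod
    def unpack(cls, data: bytes) -> "ControlWord":
        """Decode the first 4 bytes of a PW packet payload."""
        (word,) = struct.unpack("!I", data[:4])
        if word >> 28:
            raise ValueError("first four bits must be zero")
        return cls(
            fecn=bool(word & F_BIT),
            becn=bool(word & B_BIT),
            dei=bool(word & D_BIT),
            frg=(word >> 22) & 0x3,
            length=(word >> 16) & 0x3F,
            seq=word & 0xFFFF,
        )


def dummy_becn(congested: bool) -> bytes:
    """Dummy BECN PW packet: BECN set or cleared, length zero, no data."""
    return ControlWord(becn=congested, length=0).pack()


def pick_victim(queue: list) -> int:
    """Index of the packet to discard from a non-empty queue.

    Packets with DEI set go first; choosing the oldest one is an
    assumption of this sketch.
    """
    for index, cw in enumerate(queue):
        if cw.dei:
            return index
    return 0
```

   An egress PE in this sketch would call dummy_becn(True) at the
   preconfigured rate while its AC is congested and no reverse traffic
   is available, and pick_victim() whenever its AC-bound queue
   overflows.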