idnits 2.17.1 draft-ietf-pals-ethernet-cw-07.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- No issues found here. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year (Using the creation date from RFC4448, updated by this document, for RFC5378 checks: 2002-09-04) -- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. If not, you may need to add the pre-RFC5378 disclaimer. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- The document date (July 02, 2018) is 2123 days in the past. Is this intentional? Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) No issues found here. Summary: 0 errors (**), 0 flaws (~~), 1 warning (==), 2 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 PALS Working Group S. Bryant 3 Internet-Draft A. 
Malis 4 Updates: 4448 (if approved) Huawei 5 Intended status: Standards Track I. Bagdonas 6 Expires: January 3, 2019 Equinix 7 July 02, 2018 9 Use of Ethernet Control Word RECOMMENDED 10 draft-ietf-pals-ethernet-cw-07 12 Abstract 14 The pseudowire (PW) encapsulation of Ethernet, as defined in RFC 15 4448, specifies that the use of the control word (CW) is optional. 16 In the absence of the CW an Ethernet pseudowire packet can be 17 misidentified as an IP packet by a label switching router (LSR). 18 This in turn may lead to the selection of the wrong equal-cost-multi- 19 path (ECMP) path for the packet, leading in turn to the misordering 20 of packets. This problem has become more serious due to the 21 deployment of equipment with Ethernet MAC addresses that start with 22 0x4 or 0x6. The use of the Ethernet PW CW addresses this problem. 23 This document recommends the use of the Ethernet pseudowire control 24 word in all but exceptional circumstances. 26 This document updates RFC 4448. 28 Status of This Memo 30 This Internet-Draft is submitted in full conformance with the 31 provisions of BCP 78 and BCP 79. 33 Internet-Drafts are working documents of the Internet Engineering 34 Task Force (IETF). Note that other groups may also distribute 35 working documents as Internet-Drafts. The list of current Internet- 36 Drafts is at http://datatracker.ietf.org/drafts/current/. 38 Internet-Drafts are draft documents valid for a maximum of six months 39 and may be updated, replaced, or obsoleted by other documents at any 40 time. It is inappropriate to use Internet-Drafts as reference 41 material or to cite them other than as "work in progress." 43 This Internet-Draft will expire on January 3, 2019. 45 Copyright Notice 47 Copyright (c) 2018 IETF Trust and the persons identified as the 48 document authors. All rights reserved. 
50 This document is subject to BCP 78 and the IETF Trust's Legal 51 Provisions Relating to IETF Documents 52 (http://trustee.ietf.org/license-info) in effect on the date of 53 publication of this document. Please review these documents 54 carefully, as they describe your rights and restrictions with respect 55 to this document. Code Components extracted from this document must 56 include Simplified BSD License text as described in Section 4.e of 57 the Trust Legal Provisions and are provided without warranty as 58 described in the Simplified BSD License. 60 Table of Contents 62 1. Introduction 63 2. Specification of Requirements 64 3. Background 65 4. Recommendation 66 5. Equal Cost Multi-path (ECMP) 67 6. Mitigations 68 7. Operational Considerations 69 8. Security Considerations 70 9. IANA Considerations 71 10. Acknowledgments 72 11. References 73 11.1. Normative References 74 11.2. Informative References 75 Authors' Addresses 77 1. Introduction 79 The pseudowire (PW) encapsulation of Ethernet, as defined in 80 [RFC4448], specifies that the use of the control word (CW) is 81 optional.
It is common for label switching routers (LSRs) to search 82 past the end of the label stack to determine whether the payload is 83 an IP packet, and if the payload is an IP packet, to select the next 84 hop based on the so-called "five-tuple" (IP source address, IP 85 destination address, protocol/next-header, transport layer source 86 port and transport layer destination port). In the absence of a PW 87 CW an Ethernet pseudowire packet can be misidentified as an IP packet 88 by a label switching router (LSR) selecting the equal-cost-multi-path 89 (ECMP) path based on the five-tuple. This in turn may lead to the 90 selection of the wrong ECMP path for the packet, leading in turn to 91 the misordering of packets. Further discussion of this topic is 92 published in [RFC4928]. 94 Flow misordering can also happen in a single path scenario when 95 traffic classification and differential forwarding treatment 96 mechanisms are in use. These errors occur when a forwarder 97 incorrectly assumes that the packet is IP and applies forwarding 98 policy based on fields in the PW payload. 100 IPv4 and IPv6 packets respectively start with the values 0x4 and 0x6. 101 Misidentification can arise if an Ethernet PW packet without a CW is 102 carrying an Ethernet packet with a destination address that starts 103 with either of these values. 105 This problem has recently become more serious for two 106 reasons. The first is the deployment of equipment with Ethernet 107 MAC addresses that start with 0x4 or 0x6 assigned by the IEEE 108 Registration Authority Committee (RAC). The second is that concerns over 109 privacy have led to the use of MAC address randomization, which 110 assigns local MAC addresses randomly. Random assignment 111 results in addresses starting with one of these two values one time 112 in eight. 114 The use of the Ethernet PW CW addresses this problem.
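The misidentification described above can be illustrated with a minimal sketch. It assumes an LSR that classifies the MPLS payload purely from the first nibble after the label stack; the function names, the example MAC addresses, and the all-zero control word are hypothetical choices made for illustration and do not describe any particular implementation.

```python
# Illustrative sketch only: all names and example addresses are hypothetical.

def classify_mpls_payload(payload: bytes) -> str:
    """Guess the payload type from the first nibble after the label stack."""
    first_nibble = payload[0] >> 4
    if first_nibble == 0x4:
        return "ipv4"
    if first_nibble == 0x6:
        return "ipv6"
    return "non-ip"


def ethernet_pw_payload(dst_mac: bytes, src_mac: bytes, use_cw: bool) -> bytes:
    """Build the start of an Ethernet PW payload, optionally behind a CW.

    The CW is modeled as four zero octets, so its first nibble is 0
    and the remaining fields are simply left at zero.
    """
    frame = dst_mac + src_mac + b"\x08\x00"  # EtherType included for completeness
    control_word = b"\x00\x00\x00\x00" if use_cw else b""
    return control_word + frame


# Destination MAC whose first nibble is 0x4, as in the problem statement.
dst = bytes.fromhex("4c0000000001")
src = bytes.fromhex("020000000002")

without_cw = classify_mpls_payload(ethernet_pw_payload(dst, src, use_cw=False))
with_cw = classify_mpls_payload(ethernet_pw_payload(dst, src, use_cw=True))

# Share of the 16 possible first-nibble values that look like IP.
ambiguous_fraction = sum(1 for n in range(16) if n in (0x4, 0x6)) / 16

print(without_cw, with_cw, ambiguous_fraction)
```

Without the CW the frame is taken for IPv4 because the destination MAC address begins with 0x4. With the CW the first nibble is zero and the heuristic no longer matches. Two of the sixteen nibble values are ambiguous, which gives the one-in-eight figure quoted for randomly assigned addresses.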
116 This document recommends the use of the Ethernet pseudowire control 117 word in all but exceptional circumstances. 119 2. Specification of Requirements 121 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 122 "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and 123 "OPTIONAL" in this document are to be interpreted as described in BCP 124 14 [RFC2119] [RFC8174] when, and only when, they appear in all 125 capitals, as shown here. 127 3. Background 129 Ethernet pseudowire encapsulation is specified in [RFC4448]. In 130 particular, the reader's attention is drawn to Section 4.6 of that document, part of which is 131 quoted below for convenience: 133 "The control word defined in this section is based on the Generic 134 PW MPLS Control Word as defined in [RFC4385]. It provides the 135 ability to sequence individual frames on the PW, avoidance of 136 equal-cost multiple-path load-balancing (ECMP) [RFC2992], and 137 Operations and Management (OAM) mechanisms including VCCV 138 [RFC5085]. 140 "[RFC4385] states, "If a PW is sensitive to packet misordering 141 and is being carried over an MPLS PSN that uses the contents 142 of the MPLS payload to select the ECMP path, it MUST employ a 143 mechanism which prevents packet misordering." This is necessary 144 because ECMP implementations may examine the first nibble after 145 the MPLS label stack to determine whether the labeled packet 146 is IP or not. Thus, if the source MAC address of an Ethernet 147 frame carried over the PW without a control word present begins 148 with 0x4 or 0x6, it could be mistaken for an IPv4 or IPv6 149 packet. This could, depending on the configuration and 150 topology of the MPLS network, lead to a situation where all 151 packets for a given PW do not follow the same path. This may 152 increase out-of-order frames on a given PW, or cause OAM packets 153 to follow a different path than actual traffic (see 154 Section 4.4.3, "Frame Ordering").
156 "The features that the control word provides may not be needed 157 for a given Ethernet PW. For example, ECMP may not be present 158 or active on a given MPLS network, strict frame sequencing may 159 not be required, etc. If this is the case, the control word 160 provides little value and is therefore optional. Early Ethernet 161 PW implementations have been deployed that do not include a 162 control word or the ability to process one if present. To 163 aid in backwards compatibility, future implementations MUST 164 be able to send and receive frames without the control word 165 present." 167 At the time when pseudowires were first deployed, some equipment of 168 commercial significance was unable to process the Ethernet Control 169 Word. In addition, at that time it was believed that no Ethernet 170 MAC address starting with 0x4 or 0x6 had been issued by the IEEE Registration Authority 171 Committee (RAC), and thus it was thought 172 to be safe to deploy Ethernet PWs without the CW. 174 Since that time the RAC has issued Ethernet MAC addresses that start with 175 0x4 or 0x6, and thus the assumption that in practical networks there 176 would be no confusion between an Ethernet PW packet without the CW 177 and an IP packet is no longer correct. 179 Possibly through the use of unauthorized Ethernet MAC addresses, this 180 assumption has been unsafe for a while, leading some equipment 181 vendors to implement more complex, proprietary methods to 182 discriminate between Ethernet PW packets and IP packets. Such 183 mechanisms rely on heuristic examination of transit packets to 184 determine the exact payload type of the packet and cannot be 185 reliable due to the random nature of the payload carried within such 186 packets. 188 A posting on the NANOG email list highlighted this problem: 190 https://mailman.nanog.org/pipermail/nanog/2016-December/089395.html 192 RFC EDITOR Please delete this paragraph.
193 Kramdown does not include references when they are only found in 194 literal text so I include them here: [RFC4385] [RFC2992] [RFC5085] as 195 a fixup. 197 4. Recommendation 199 The ambiguity between an MPLS payload that is an Ethernet PW and one 200 that is an IP packet is resolved when the Ethernet PW control word is 201 used. This document updates [RFC4448] to state that where both the 202 ingress PE and the egress PE support the Ethernet pseudowire control 203 word, then the CW MUST be used. 205 Where the application of ECMP to Ethernet PW traffic is required, 206 and where both the ingress and the egress PEs support [RFC6790] 207 (Entropy Label Indicator/Entropy Label (ELI/EL)) or both the ingress 208 and the egress PEs support [RFC6391] (FAT PW), then either method may 209 be used. The use of both methods on the same PW is not normally 210 necessary and should be avoided unless circumstances require it. In 211 the case of multi-segment PWs, if ELI/EL is used then it SHOULD be 212 used on every segment of the PW. The method by which usage of ELI/EL 213 on every segment is guaranteed is out of scope of this document. 215 5. Equal Cost Multi-path (ECMP) 217 Where the volume of traffic on an Ethernet PW is such that ECMP is 218 required then one of two methods may be used: 220 o Flow-Aware Transport (FAT) of Pseudowires over an MPLS Packet 221 Switched Network specified in [RFC6391], or 223 o LSP entropy labels specified in [RFC6790] 225 RFC6391 works by increasing the entropy of the bottom of stack label. 226 It requires that both the ingress and egress provider edges (PEs) 227 support this feature. It also requires that sufficient LSRs on the 228 LSP between the ingress and egress PE be able to select an 229 ECMP path on an MPLS packet with the resultant stack depth. 231 RFC6790 works by including an entropy value in the LSP part of the 232 label stack.
This requires that the ingress and egress PEs support 233 the insertion and removal of the EL and the entropy label indicator, 234 and that sufficient LSRs on the LSP are able to perform ECMP based on 235 the EL. 237 In both cases there are considerations in getting Operations, 238 Administration, and Maintenance (OAM) packets to follow the same path 239 as a data packet. This is described in detail in Section 7 of 240 [RFC6391] and Section 6 of [RFC6790]. However, in both cases the 241 situation is improved compared to the ECMP behavior in the case where 242 the Ethernet PW CW was not used, since there is currently no known 243 method of getting a PW OAM packet to follow the same path as a PW 244 data packet subjected to ECMP based on the five-tuple of the IP 245 payload. 247 The PW label is pushed before the LSP label. As the EL/ELI labels 248 are part of the LSP layer rather than part of the PW layer, they are 249 pushed after the PW label has been pushed. 251 6. Mitigations 253 Where it is not possible to use the Ethernet PW CW, the effects of 254 ECMP can be disabled by carrying the PW over a traffic engineered 255 path that does not subject the payload to load balancing (for example 256 [RFC3209]). However, such paths may be subjected to link bundle load 257 balancing and of course the single LSP has to carry the full PW load. 259 7. Operational Considerations 261 In some cases, the inclusion of a CW in the PW is determined by 262 equipment configuration. Furthermore, it is possible that the 263 default configuration in such cases is to disable use of the CW. 264 Care needs to be taken to ensure that software that implements this 265 recommendation does not depend on existing configuration settings 266 that prevent the use of the control word. It is recommended that 267 platform software emit a rate-limited message indicating that the CW can 268 be used but is disabled due to existing configuration.
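The rate-limited message recommended above could be approached along the following lines. This is a minimal sketch under stated assumptions: the class and function names, the 60-second interval, the status strings, and the message text are all hypothetical, and a real platform would use its own logging and configuration interfaces.

```python
import time

# Illustrative sketch only: names, interval and message text are hypothetical.

class RateLimitedNotice:
    """Emit a message at most once per `interval` seconds."""

    def __init__(self, interval: float, clock=time.monotonic):
        self.interval = interval
        self.clock = clock          # injectable so the behavior can be tested
        self.last_emitted = None
        self.emitted = []

    def notify(self, message: str) -> bool:
        now = self.clock()
        if self.last_emitted is not None and now - self.last_emitted < self.interval:
            return False            # suppressed: still inside the quiet period
        self.last_emitted = now
        self.emitted.append(message)
        print(message)
        return True


def cw_status(local_supports_cw: bool, remote_supports_cw: bool,
              config_disables_cw: bool) -> str:
    """Report whether the CW is in use, blocked by config, or unavailable."""
    if not (local_supports_cw and remote_supports_cw):
        return "cw-unavailable"
    if config_disables_cw:
        return "cw-disabled-by-config"
    return "cw-in-use"


# Simulated clock so the example is deterministic.
fake_time = [0.0]
notice = RateLimitedNotice(interval=60.0, clock=lambda: fake_time[0])

results = []
for t in (0.0, 10.0, 61.0):
    fake_time[0] = t
    if cw_status(True, True, config_disables_cw=True) == "cw-disabled-by-config":
        results.append(notice.notify(
            "PW: CW supported by both PEs but disabled by configuration"))
```

In this run the notice is emitted at 0 and 61 seconds and suppressed at 10 seconds, so an operator sees the condition without the log being flooded.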
270 Instead of including a payload type in the packet, MPLS relies on the 271 control plane to signal the payload type that follows the bottom of 272 the label stack. Some LSRs attempt to deduce the packet type by MPLS 273 payload inspection, in some cases looking past the PW CW. If the 274 payload appears to be IP or IP carried in an Ethernet header they 275 perform an ECMP calculation based on what they assume to be the five-tuple 276 fields. However, deduction of the payload type in this way is 277 not an exact science, and where a packet that is not IP is mistaken 278 for an IP packet the result can be packets delivered out of order. 279 Misordering of this type can be difficult for an operator to 280 diagnose. Operators should be aware, when enabling a capability that 281 allows information gleaned from packet inspection past the PW CW to 282 be used in any ECMP calculation, that this may cause Ethernet frames 283 to be delivered out of order despite the presence of the CW. 285 8. Security Considerations 287 This document expresses a preference for one existing and widely 288 deployed Ethernet PW encapsulation over another. These methods have 289 identical security considerations, which are discussed in [RFC4448]. 290 This document introduces no additional security issues. 292 9. IANA Considerations 294 This document makes no IANA requests. 296 10. Acknowledgments 298 The authors thank Job Snijders for drawing attention to this problem. 299 The authors also thank Pat Thaler for clarifying the matter of local 300 MAC address assignment. We thank Sasha Vainshtein for his valuable 301 review comments. 303 11. References 305 11.1. Normative References 307 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 308 Requirement Levels", BCP 14, RFC 2119, 309 DOI 10.17487/RFC2119, March 1997. 312 [RFC4385] Bryant, S., Swallow, G., Martini, L., and D. 
McPherson, 313 "Pseudowire Emulation Edge-to-Edge (PWE3) Control Word for 314 Use over an MPLS PSN", RFC 4385, DOI 10.17487/RFC4385, 315 February 2006. 317 [RFC4448] Martini, L., Ed., Rosen, E., El-Aawar, N., and G. Heron, 318 "Encapsulation Methods for Transport of Ethernet over MPLS 319 Networks", RFC 4448, DOI 10.17487/RFC4448, April 2006. 322 [RFC4928] Swallow, G., Bryant, S., and L. Andersson, "Avoiding Equal 323 Cost Multipath Treatment in MPLS Networks", BCP 128, 324 RFC 4928, DOI 10.17487/RFC4928, June 2007. 327 [RFC6391] Bryant, S., Ed., Filsfils, C., Drafz, U., Kompella, V., 328 Regan, J., and S. Amante, "Flow-Aware Transport of 329 Pseudowires over an MPLS Packet Switched Network", 330 RFC 6391, DOI 10.17487/RFC6391, November 2011. 333 [RFC6790] Kompella, K., Drake, J., Amante, S., Henderickx, W., and 334 L. Yong, "The Use of Entropy Labels in MPLS Forwarding", 335 RFC 6790, DOI 10.17487/RFC6790, November 2012. 338 [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 339 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, 340 May 2017. 342 11.2. Informative References 344 [RFC2992] Hopps, C., "Analysis of an Equal-Cost Multi-Path 345 Algorithm", RFC 2992, DOI 10.17487/RFC2992, November 2000. 348 [RFC3209] Awduche, D., Berger, L., Gan, D., Li, T., Srinivasan, V., 349 and G. Swallow, "RSVP-TE: Extensions to RSVP for LSP 350 Tunnels", RFC 3209, DOI 10.17487/RFC3209, December 2001. 353 [RFC5085] Nadeau, T., Ed. and C. Pignataro, Ed., "Pseudowire Virtual 354 Circuit Connectivity Verification (VCCV): A Control 355 Channel for Pseudowires", RFC 5085, DOI 10.17487/RFC5085, 356 December 2007. 358 Authors' Addresses 360 Stewart Bryant 361 Huawei 363 Email: stewart.bryant@gmail.com 365 Andrew G Malis 366 Huawei 368 Email: agmalis@gmail.com 369 Ignas Bagdonas 370 Equinix 372 Email: ibagdona.ietf@gmail.com