idnits 2.17.1 draft-ietf-mpls-ecmp-bcp-03.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- ** It looks like you're using RFC 3978 boilerplate. You should update this to the boilerplate described in the IETF Trust License Policy document (see https://trustee.ietf.org/license-info), which is required now. -- Found old boilerplate from RFC 3978, Section 5.1 on line 23. -- Found old boilerplate from RFC 3978, Section 5.5, updated by RFC 4748 on line 340. -- Found old boilerplate from RFC 3979, Section 5, paragraph 1 on line 313. -- Found old boilerplate from RFC 3979, Section 5, paragraph 3 on line 326. ** The document seems to lack an RFC 3979 Section 5, para. 2 IPR Disclosure Acknowledgement -- however, there's a paragraph with a matching beginning. Boilerplate error? Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- == The page length should not exceed 58 lines per page, but there was 8 longer pages, the longest (page 3) being 60 lines Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** The document seems to lack an IANA Considerations section. (See Section 2.2 of https://www.ietf.org/id-info/checklist for how to handle the case when there are no actions for IANA.) Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust Copyright Line does not match the current year -- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. 
If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. If not, you may need to add the pre-RFC5378 disclaimer. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- The document date (February 2007) is 6273 days in the past. Is this intentional? Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) == Unused Reference: 'RFC3036' is defined on line 245, but no explicit reference was found in the text == Unused Reference: 'RFC3107' is defined on line 248, but no explicit reference was found in the text == Unused Reference: 'RFC3209' is defined on line 251, but no explicit reference was found in the text == Unused Reference: 'RFC3478' is defined on line 254, but no explicit reference was found in the text == Unused Reference: 'RFC3479' is defined on line 257, but no explicit reference was found in the text == Unused Reference: 'RFC4206' is defined on line 260, but no explicit reference was found in the text == Unused Reference: 'RFC4220' is defined on line 264, but no explicit reference was found in the text == Unused Reference: 'RFC4221' is defined on line 267, but no explicit reference was found in the text == Unused Reference: 'RFC4378' is defined on line 270, but no explicit reference was found in the text == Unused Reference: 'RFC4379' is defined on line 274, but no explicit reference was found in the text -- Obsolete informational reference (is this intentional?): RFC 3036 (Obsoleted by RFC 5036) -- Obsolete informational reference (is this intentional?): RFC 3107 (Obsoleted by RFC 8277) -- Obsolete informational reference (is this intentional?): RFC 4379 (Obsoleted by RFC 8029) Summary: 3 errors (**), 0 flaws 
(~~), 12 warnings (==), 9 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 Network Working Group George Swallow 3 Internet Draft Cisco Systems, Inc. 4 Category: Standards Track 5 Expiration Date: August 2007 6 Stewart Bryant 7 Cisco Systems, Inc. 9 Loa Andersson 10 Acreo 12 February 2007 14 Avoiding Equal Cost Multipath Treatment in MPLS Networks 16 draft-ietf-mpls-ecmp-bcp-03.txt 18 Status of this Memo 20 By submitting this Internet-Draft, each author represents that any 21 applicable patent or other IPR claims of which he or she is aware 22 have been or will be disclosed, and any of which he or she becomes 23 aware will be disclosed, in accordance with Section 6 of BCP 79. 25 Internet-Drafts are working documents of the Internet Engineering 26 Task Force (IETF), its areas, and its working groups. Note that 27 other groups may also distribute working documents as Internet- 28 Drafts. 30 Internet-Drafts are draft documents valid for a maximum of six months 31 and may be updated, replaced, or obsoleted by other documents at any 32 time. It is inappropriate to use Internet-Drafts as reference 33 material or to cite them other than as "work in progress." 35 The list of current Internet-Drafts can be accessed at 36 http://www.ietf.org/1id-abstracts.html 38 The list of Internet-Draft Shadow Directories can be accessed at 39 http://www.ietf.org/shadow.html 41 Abstract 43 This document describes the Equal Cost Multipath (ECMP) behavior of 44 currently deployed MPLS networks. This document makes best practice 45 recommendations for anyone defining an application to run over an 46 MPLS network that wishes to avoid the reordering that can result from 47 transmission of different packets from the same flow over multiple 48 different equal cost paths. 50 Contents 52 1 Introduction .............................................. 
3 53 1.1 Terminology ............................................... 3 54 2 Current ECMP Practices .................................... 3 55 3 Recommendations for Avoiding ECMP Treatment ............... 5 56 4 Security Considerations ................................... 6 57 5 References ................................................ 6 58 5.1 Normative References ...................................... 6 59 5.2 Informative References .................................... 7 60 6 Authors' Addresses ........................................ 8 62 1. Introduction 64 This document describes the Equal Cost Multipath (ECMP) behavior of 65 currently deployed MPLS networks. We discuss cases where multiple 66 packets from the same top-level LSP might be transmitted over differ- 67 ent equal cost paths, resulting in possible mis-ordering of packets 68 which are part of the same top-level LSP. This document also makes 69 best practice recommendations for anyone defining an application to 70 run over an MPLS network that wishes to avoid the resulting potential 71 for mis-ordered packets. While disabling ECMP behavior is an option 72 open to most operators, few (if any) have chosen to do so, and the 73 application designer does not have control over the behavior of the 74 networks that the application may run over. Thus ECMP behavior is a 75 reality that must be reckoned with. 77 1.1. Terminology 79 ECMP Equal Cost Multipath 81 FEC Forwarding Equivalence Class 83 IP ECMP A forwarding behavior in which the selection of the 84 next-hop between equal cost routes is based on the 85 header(s) of an IP packet 87 Label ECMP A forwarding behavior in which the selection of the 88 next-hop between equal cost routes is based on the 89 label stack of an MPLS packet 91 LSP Label Switched Path 93 LSR Label Switching Router 95 2. Current ECMP Practices 97 The MPLS label stack and Forwarding Equivalence Classes are defined 98 in [RFC3031]. 
The MPLS label stack does not carry a Protocol Identifier.
 99     Instead, the payload of an MPLS packet is identified by the
100     Forwarding Equivalence Class (FEC) of the bottommost label. Thus it
101     is not possible to know the payload type if one does not know the
102     label binding for the bottommost label. Since an LSR which is
103     processing a label stack need only know the binding for the label(s) it
104     must process, it is very often the case that LSRs along an LSP are
105     unable to determine the payload type of the carried contents.

107     As a means of potentially reducing delay and congestion, IP networks
108     have taken advantage of multiple paths through a network by splitting
109     traffic flows across those paths. The general name for this practice
110     is Equal Cost Multipath or ECMP. In general, this is done by hashing
111     on various fields of the IP or contained headers. In practice,
112     within a network core, the hashing is based mainly or exclusively on
113     the IP source and destination addresses. The reason for splitting
114     aggregated flows in this manner is to minimize the re-ordering of
115     packets belonging to individual flows contained within the aggregated
116     flow. Within this document we use the term IP ECMP for this type of
117     forwarding algorithm.

119     For packets that contain both a label stack and an encapsulated IPv4
120     (or IPv6) packet, current implementations in some cases may hash on
121     any combination of labels and IPv4 (or IPv6) source and destination
122     addresses.

124     In the early days of MPLS, the payload was almost exclusively IP.
125     Even today the overwhelming majority of carried traffic remains IP.
126     Providers of MPLS equipment sought to continue this IP ECMP behavior.
127     As shown above, it is not possible to know whether the payload of an
128     MPLS packet is IP at every place where IP ECMP needs to be performed.
129     Thus vendors have taken the liberty of guessing what the payload is.
130     By inspecting the first nibble beyond the label stack, existing
131     equipment infers that a packet is not IPv4 or IPv6 if the value of
132     the nibble (where the IP version number would be found) is not 0x4 or
133     0x6 respectively. Most deployed LSRs will treat a packet whose first
134     nibble is equal to 0x4 as if the payload were IPv4 for purposes of IP
135     ECMP.

137     A consequence of this is that any application which defines a FEC
138     which does not take measures to prevent the values 0x4 and 0x6 from
139     occurring in the first nibble of the payload may be subject to IP
140     ECMP, and thus may have its flows take multiple paths and arrive
141     with considerable jitter and possibly out of order. While none of
142     this is in violation of the basic service offering of IP, it is
143     detrimental to the performance of various classes of applications.
144     It also complicates the measurement, monitoring and tracing of those
145     flows.

147     New MPLS payload types are emerging such as those specified by the
148     IETF PWE3 and AVT working groups. These payloads are not IP and, if
149     specified without constraint, might be mistaken for IP.

151     It must also be noted that LSRs which correctly identify a payload as
152     not being IP will most often load-share traffic across multiple
153     equal-cost paths based on the label stack. Any reserved label, no
154     matter where it is located in the stack, may be included in the
155     computation for load balancing. Modification of the label stack between
156     packets of a single flow could result in re-ordering that flow. That
157     is, were an explicit null or a router-alert label to be added to a
158     packet, that packet could take a different path through the network.

160     Note that for some applications, being mistaken for IPv4 may not be
161     detrimental. The trivial case is where the payload behind the top
162     label is a packet belonging to an MPLS IPv4 VPN. Here the real payload is
163     IP and most (if not all) deployed equipment will locate the end of
164     the label stack and correctly perform IP ECMP.

166     A less obvious case is when the packets of a given flow happen to
167     have constant values in the fields upon which IP ECMP would be
168     performed. For example, if an Ethernet frame immediately follows the
169     label and the LSR does not do ECMP on IPv6, then either the first
170     nibble will be 0x4 or it will be something else. If the nibble is
171     not 0x4 then no IP ECMP is performed, but Label ECMP may be
172     performed. If it is 0x4, then the constant values of the MAC addresses
173     overlay the fields that would have been occupied by the source and
174     destination addresses of an IP header. As a result the ECMP
175     algorithm would be fed a constant value and thus would always return the
176     same result.

178  3. Recommendations for Avoiding ECMP Treatment

180     We will use the term "Application Label" to refer to a label that has
181     been allocated with a FEC Type that is defined (or simply used) by an
182     application. Such labels necessarily appear at the bottom of the
183     label stack, that is, below labels associated with transporting the
184     packet across an MPLS network. The FEC Type of the Application Label
185     defines the payload that follows. Anyone defining an application to
186     be transported over MPLS is free to define new FEC Types and the
187     format of the payload which will be carried.

189      0                   1                   2                   3
190      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
191     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
192     |                 Label                 | Exp |0|      TTL      |
193     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
194     .                                       .     . .               .
195     .                                       .     . .               .
196     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
197     |                 Label                 | Exp |0|      TTL      |
198     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
199     |           Application Label           | Exp |1|      TTL      |
200     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
201     |1st Nbl|                                                       |
202     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
203     In order to avoid IP ECMP treatment it is necessary that an
204     application take precautions to not be mistaken as IP by deployed equipment
205     that snoops on the presumed location of the IP Version field. Thus,
206     at a minimum, the chosen format must disallow the values 0x4 and 0x6
207     in the first nibble of the payload.

209     It is strongly recommended, however, that applications restrict the
210     first nibble values to 0x0 and 0x1. This will ensure that their
211     traffic flows will not be affected if some future routing equipment
212     does similar snooping on some future version of IP.

214     For an example of how ECMP is avoided in Pseudowires, see [RFC4385].

216  4. Security Considerations

218     This memo discusses the conditions under which MPLS traffic
219     associated with a single top-level LSP either does or does not have the
220     possibility of being split between multiple paths, implying the
221     possibility of mis-ordering between packets belonging to the same
222     top-level LSP. From a security point of view, the worst that could result
223     from a security breach of the mechanisms described here would be
224     mis-ordering of packets, and possible corresponding loss of throughput
225     (for example, TCP connections may in some cases reduce the window
226     size in response to mis-ordered packets). However, in order to create
227     even this limited result, an attacker would need to either change the
228     configuration or implementation of a router, or change the bits on
229     the wire as transmitted in a packet.
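
As an illustration of the first-nibble heuristic described in Section 2 and the payload-format recommendation in Section 3, the following Python sketch walks a label stack to the bottom-of-stack entry and guesses the payload type from the next nibble. It is a minimal sketch only: the function names (label_entry, first_nibble, guess_payload, avoids_ip_ecmp) are hypothetical, and deployed LSRs perform this kind of inspection in vendor-specific forwarding hardware, not in code like this.

```python
import struct

IPV4_NIBBLE = 0x4
IPV6_NIBBLE = 0x6
RECOMMENDED_NIBBLES = (0x0, 0x1)  # first-nibble values recommended in Section 3


def label_entry(label, bottom, exp=0, ttl=64):
    """Pack one 32-bit label stack entry: Label(20) | Exp(3) | S(1) | TTL(8)."""
    return struct.pack("!I", (label << 12) | (exp << 9) | (int(bottom) << 8) | ttl)


def first_nibble(packet):
    """Skip label stack entries up to and including the one with the
    bottom-of-stack (S) bit set, then return the high-order nibble of the
    first payload octet."""
    offset = 0
    while True:
        if offset + 4 > len(packet):
            raise ValueError("label stack ends without a bottom-of-stack entry")
        (entry,) = struct.unpack_from("!I", packet, offset)
        offset += 4
        if entry & 0x100:  # S bit set: this was the bottom label
            break
    if offset >= len(packet):
        raise ValueError("no payload after the label stack")
    return packet[offset] >> 4


def guess_payload(packet):
    """Guess the payload type the way the text says deployed equipment does."""
    nibble = first_nibble(packet)
    if nibble == IPV4_NIBBLE:
        return "ipv4"
    if nibble == IPV6_NIBBLE:
        return "ipv6"
    return "not-ip"


def avoids_ip_ecmp(nibble):
    """True if a payload format's first nibble follows the recommendation."""
    return nibble in RECOMMENDED_NIBBLES


# Two-entry stack: a transport label above an application label.
stack = label_entry(100, bottom=False) + label_entry(200, bottom=True)
print(guess_payload(stack + bytes([0x45, 0x00])))  # a real IPv4 header
print(guess_payload(stack + bytes([0x4A, 0x00])))  # non-IP payload starting with 0x4
print(guess_payload(stack + bytes([0x00, 0x00])))  # payload starting with 0x0
```

The second call shows the hazard: a non-IP payload whose first nibble happens to be 0x4 is classified as IPv4 and so becomes a candidate for IP ECMP, whereas a payload format that fixes the first nibble to 0x0 or 0x1 is not.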
231     Other security issues in the deployment of MPLS are outside of the
232     scope of this document, but are discussed in other MPLS
233     specifications such as RFCs 3031, 3036, 3107, 3209, 3478, 3479, 4206, 4220,
234     4221, 4378, and 4379.

236  5. References

238  5.1. Normative References

240     [RFC3031] Rosen, E., et al., "Multiprotocol Label Switching
241               Architecture", RFC 3031, January 2001.

243  5.2. Informative References

245     [RFC3036] Andersson, L., et al., "LDP Specification", RFC 3036,
246               January 2001.

248     [RFC3107] Rekhter, Y. and E. Rosen, "Carrying Label Information in
249               BGP-4", RFC 3107, May 2001.

251     [RFC3209] Awduche, D., et al., "RSVP-TE: Extensions to RSVP for
252               LSP Tunnels", RFC 3209, December 2001.

254     [RFC3478] Leelanivas, M., et al., "Graceful Restart Mechanism for
255               Label Distribution Protocol", RFC 3478, February 2003.

257     [RFC3479] Farrel, A., "Fault Tolerance for the Label Distribution
258               Protocol (LDP)", RFC 3479, February 2003.

260     [RFC4206] Kompella, K. and Y. Rekhter, "Label Switched Paths (LSP)
261               Hierarchy with Generalized Multi-Protocol Label Switching
262               (GMPLS) Traffic Engineering (TE)", RFC 4206, October 2005.

264     [RFC4220] Dubuc, M., et al., "Traffic Engineering Link Management
265               Information Base", RFC 4220, November 2005.

267     [RFC4221] Nadeau, T., et al., "Multiprotocol Label Switching (MPLS)
268               Management Overview", RFC 4221, November 2005.

270     [RFC4378] Allan, D. and T. Nadeau, "A Framework for Multi-Protocol
271               Label Switching (MPLS) Operations and Management (OAM)",
272               RFC 4378, February 2006.

274     [RFC4379] Kompella, K. and G. Swallow, "Detecting Multi-Protocol
275               Label Switched (MPLS) Data Plane Failures", RFC 4379,
276               February 2006.

278     [RFC4385] Bryant, S., et al., "Pseudowire Emulation Edge-to-Edge
279               (PWE3) Control Word for Use over an MPLS PSN", RFC 4385,
280               February 2006.

282  6.
Authors' Addresses 284 Loa Andersson 285 Acreo 287 Email: loa@pi.se 289 Stewart Bryant 290 Cisco Systems 291 250, Longwater, 292 Green Park, 293 Reading, RG2 6GB, UK 295 Email: stbryant@cisco.com 297 George Swallow 298 Cisco Systems, Inc. 299 1414 Massachusetts Ave 300 Boxborough, MA 01719 302 Email: swallow@cisco.com 304 Intellectual Property 306 The IETF takes no position regarding the validity or scope of any 307 Intellectual Property Rights or other rights that might be claimed to 308 pertain to the implementation or use of the technology described in 309 this document or the extent to which any license under such rights 310 might or might not be available; nor does it represent that it has 311 made any independent effort to identify any such rights. Information 312 on the procedures with respect to rights in RFC documents can be 313 found in BCP 78 and BCP 79. 315 Copies of IPR disclosures made to the IETF Secretariat and any assur- 316 ances of licenses to be made available, or the result of an attempt 317 made to obtain a general license or permission for the use of such 318 proprietary rights by implementers or users of this specification can 319 be obtained from the IETF on-line IPR repository at 320 http://www.ietf.org/ipr. 322 The IETF invites any interested party to bring to its attention any 323 copyrights, patents or patent applications, or other proprietary 324 rights that may cover technology that may be required to implement 325 this standard. Please address the information to the IETF at ietf- 326 ipr@ietf.org. 328 Full Copyright Notice 330 Copyright (C) The IETF Trust (2007). This document is subject to the 331 rights, licenses and restrictions contained in BCP 78, and except as 332 set forth therein, the authors retain all their rights. 
334 This document and the information contained herein are provided on an 335 "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS 336 OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND 337 THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS 338 OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF 339 THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED 340 WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.