idnits 2.17.1

draft-bryant-perlman-trill-pwe-encap-00.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** It looks like you're using RFC 3978 boilerplate. You should update this
     to the boilerplate described in the IETF Trust License Policy document
     (see https://trustee.ietf.org/license-info), which is required now.

  -- Found old boilerplate from RFC 3978, Section 5.1 on line 20.

  -- Found old boilerplate from RFC 3978, Section 5.5 on line 413.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 1 on line 390.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 2 on line 397.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 3 on line 403.

  ** This document has an original RFC 3978 Section 5.4 Copyright Line,
     instead of the newer IETF Trust Copyright according to RFC 4748.

  ** This document has an original RFC 3978 Section 5.5 Disclaimer, instead of
     the newer disclaimer which includes the IETF Trust according to RFC 4748.


  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard


  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack an Introduction section.

  ** The document seems to lack an IANA Considerations section. (See Section
     2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
     when there are no actions for IANA.)

  ** The document seems to lack separate sections for Informative/Normative
     References. All references will be assumed normative when checking for
     downward references.
  ** The document seems to lack a both a reference to RFC 2119 and the
     recommended RFC 2119 boilerplate, even if it appears to use RFC 2119
     keywords.

     RFC 2119 keyword, line 116: '... stack [RFC3032] plus an OPTIONAL four
     byte control word. At least...'
     RFC 2119 keyword, line 127: '...n is required the control word MUST be...'
     RFC 2119 keyword, line 128: '...L implementation MAY omit the PWE3 con...'


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the RFC 3978 Section 5.4 Copyright Line does not
     match the current year

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008. If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment. If not, you may need to add the pre-RFC5378 disclaimer.
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (October 16, 2005) is 6760 days in the past. Is this
     intentional?


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  -- Looks like a reference, but probably isn't: '0' on line 240

  == Outdated reference: A later version (-03) exists of
     draft-bryant-shand-lf-conv-frmwk-00

  -- Possible downref: Normative reference to a draft: ref. 'CCONV'

  == Outdated reference: A later version (-03) exists of
     draft-ietf-mpls-ecmp-bcp-01

  == Outdated reference: A later version (-11) exists of
     draft-ietf-pwe3-ethernet-encap-10

  -- Possible downref: Normative reference to a draft: ref. 'RBRIDGE'

  -- No information found for draft-ietf-l2vpn-ldp - is the name correct?

  -- Possible downref: Normative reference to a draft: ref. 'VPLS'


     Summary: 7 errors (**), 0 flaws (~~), 5 warnings (==), 12 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

2    Network Working Group                                          S. Bryant
3    Internet-Draft                                             Cisco Systems
4    Expires: April 19, 2006                                       R. Perlman
5                                                            Sun Microsystems
6                                                                    A. Atlas
7                                                                      Google
8                                                                    D. Fedyk
9                                                             Nortel Networks
10                                                           October 16, 2005

12            TRILL using Pseudo-Wire Emulation (PWE) Encapsulation
13                   draft-bryant-perlman-trill-pwe-encap-00

15   Status of this Memo

17      By submitting this Internet-Draft, each author represents that any
18      applicable patent or other IPR claims of which he or she is aware
19      have been or will be disclosed, and any of which he or she becomes
20      aware will be disclosed, in accordance with Section 6 of BCP 79.

22      Internet-Drafts are working documents of the Internet Engineering
23      Task Force (IETF), its areas, and its working groups. Note that
24      other groups may also distribute working documents as Internet-
25      Drafts.

27      Internet-Drafts are draft documents valid for a maximum of six months
28      and may be updated, replaced, or obsoleted by other documents at any
29      time. It is inappropriate to use Internet-Drafts as reference
30      material or to cite them other than as "work in progress."

32      The list of current Internet-Drafts can be accessed at
33      http://www.ietf.org/ietf/1id-abstracts.txt.

35      The list of Internet-Draft Shadow Directories can be accessed at
36      http://www.ietf.org/shadow.html.

38      This Internet-Draft will expire on April 19, 2006.

40   Copyright Notice

42      Copyright (C) The Internet Society (2005).

44   Abstract

46      A new layer of encapsulation is required with RBridges. This layer
47      must contain at least a time-to-live and an RBridge identifier field.
48      This document proposes the reuse of the encapsulation defined by
49      PWE3 for encapsulation of Ethernet frames over an MPLS packet
50      switched network.

52   Table of Contents

54      1.  Motivation
55      2.  Forwarding Considerations
56        2.1.  Forwarding Table Population
57        2.2.  QoS Treatment
58        2.3.  Load Balancing
59        2.4.  Multicast and Broadcast Frames
60      3.  Dynamic Assignment of 19-bit Nicknames
61      4.  Security Considerations
62      5.  References
63      Authors' Addresses
64      Intellectual Property and Copyright Statements

66   1.  Motivation

68      The TRILL encapsulation requires a TTL and an RBridge ID, which could
69      be the ingress or the egress depending upon the particular packet.
70      There are four encapsulation mechanisms that TRILL could use:

72      a.  It could design its own encapsulation from scratch.

74      b.  It could use an Ethernet based encapsulation.

76      c.  It could use an IP based encapsulation.

78      d.  It could use an MPLS based encapsulation.

80      Adding or removing an encapsulation, or forwarding a packet based on
81      an encapsulation, is one of the most time-critical operations in any
82      networking equipment, and usually requires hardware support. The use
83      of a new network encapsulation type is always problematic because new
84      hardware is usually required. This is expensive to design and
85      deploy, and frequently has a significant time and risk impact on the
86      market acceptance of a new network architecture. The use of a new,
87      TRILL specific, encapsulation should therefore, if possible, be
88      avoided.

90      TRILL could opt to use an Ethernet based encapsulation. The nesting
91      of 802.x tags is a well understood technology and suitable hardware
92      is widely deployed.
        However, the absence of a TTL field in the header
93      means that a controlled convergence technology needs to be used
94      [CCONV] to avoid the collateral damage caused by microlooping packets
95      during network convergence. Although convergence control
96      technologies are now available, they are not well understood by the
97      networking industry, and their use by TRILL may not be accepted by
98      the industry.

100     TRILL could use an IP encapsulation, but using an IP header for this
101     purpose has issues (see Section 5.5 in [RBRIDGE]). Such issues
102     include the encapsulation overhead, the complexity of providing L2
103     services within the L3 subnet, and the additional potential work for
104     fragmentation and reassembly.

106     The simplest existing encapsulation that meets the TRILL requirement
107     is that defined by PWE3 for the encapsulation of Ethernet frames over
108     an MPLS packet switched network [PWE3-ETHER]. The forwarding
109     functionality required by TRILL is very similar to that needed to
110     implement Virtual Private LAN Service (VPLS) [VPLS]. Equipment
111     capable of encapsulating Ethernet packets for carriage over an MPLS
112     core is widely available, and the modifications necessary to support
113     TRILL would reside primarily in the control plane.

115     The encapsulation described in [PWE3-ETHER] consists of an MPLS label
116     stack [RFC3032] plus an OPTIONAL four byte control word. At least
117     one MPLS label stack entry (LSE) will be present in the TRILL packet.
118     In addition to containing the label (delivery address), the LSE also
119     contains the TTL field required by TRILL, and a QoS field (EXP bits)
120     that may also be of use.

122     The control word carries some information that prevents the packet
123     being mistaken for an IP packet in an MPLS network and incorrectly
124     being subjected to ECMP. This functionality is not required in a
125     TRILL network.
        The control word also contains a sequence number
126     which is used to prevent the out-of-order delivery of PWE3 Ethernet
127     payloads. If order preservation is required the control word MUST be
128     used, otherwise a TRILL implementation MAY omit the PWE3 control
129     word.

131     The use of the PWE3 Ethernet over MPLS encapsulation by TRILL would
132     facilitate the integration of TRILL and MPLS networking.

134  2.  Forwarding Considerations

136     As described in Section 3, each RBridge can obtain two 19-bit
137     nicknames. The first nickname can be used for the RBridge when
138     unicast traffic is directed to it; it is the egress RBridge nickname.
139     The second nickname can be used for multicast and broadcast traffic
140     from the RBridge; it will be the ingress RBridge nickname.

142     An MPLS shim header contains a 20-bit label field. The same format
143     can be used for the TRILL shim header; the labels will be distributed
144     via the link-state protocol used between RBridges; those labels will
145     be unique within this RBridge network instance. The Ethertype will
146     indicate that it is a TRILL frame; this will be used to provide the
147     correct forwarding context for the label space. The bottom-most bit
148     of the label field can indicate whether the top 19 bits indicate a
149     unicast nickname or a multicast and broadcast nickname. The
150     forwarding behavior will differ based upon this.

152     In the unicast case, when an Ethernet frame is received without the
153     new TRILL Ethertype, the ingress RBridge will look up the egress
154     RBridge, as specified in [RBRIDGE], and obtain its egress RBridge
155     nickname. The ingress RBridge will also determine if the Ethernet
156     frame has a priority specified as in 802.1p and will extract that
157     3-bit priority field.
        Then the original Ethernet frame will be
158     encapsulated as follows:

160      0                   1                   2                   3
161      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
162     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
163     |           Egress Nickname           |0| Exp |S|      TTL      |
164     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
165     |                                                               |
166     |                   Received Ethernet Frame                   ///
167     ///                                                             |
168     |                                                               |
169     +---------------------------------------------------------------+

171     Exp: Indicates Priority
172     S: Bottom of Stack, 1 bit
173     TTL: Time to Live, 8 bits

175                      Figure 1: Unicast Encapsulation

177     Traditional bridges avoid misordering; it is an Ethernet invariant.
178     During a traditional network convergence using a link-state protocol,
179     it is possible for packets to be misordered. The PWE3 control word
180     can be used to preserve ordering with pseudo-wires (Section 3.7 in
181     [PWE3-ETHER]); however, such use might require too much hardware
182     state due to the desired load-balancing of flows.

184     This gives the encapsulated frame the same format as an Ethernet
185     pseudo-wire [PWE3-ETHER]. The forwarding path can be exactly the
186     same as that used for an Ethernet pseudo-wire.

188  2.1.  Forwarding Table Population

190     When an RBridge X learns a new egress nickname A, it constructs a
191     label on each interface: the top 19 bits of the label are filled with
192     the new nickname and the bottom bit (the unicast/other bit) is set to
193     0. An in-segment for that label is created (usually by adding an
194     entry to the incoming label map (ILM) table). A corresponding
195     out-segment is installed for each interface that is on the shortest
196     path tree from the RBridge X to the RBridge indicated by A. That
197     out-segment does a label swap operation, where the label swapped to
198     is the same constructed label. The created in-segment is connected to
199     the created out-segments with load balancing specified; only one
200     out-segment will be used for a particular frame.

202  2.2.  QoS Treatment

204     The encapsulation preserves the priority, if specified, of the frame
205     without requiring intermediate RBridges to examine the encapsulated
206     frame. The ingress RBridge extracts the priority from the 802.1p
207     field and stores that in the EXP field of the shim header.

209     When an RBridge adds the outer Ethernet frame to a TRILL
210     encapsulated frame, the RBridge can specify an 802.1p field with a
211     priority equal to that stored in the EXP field of the shim header.
212     If the EXP field is 0, then no 802.1p field is necessary.

214  2.3.  Load Balancing

216     Load balancing between multiple equal cost paths is a concern for
217     RBridges. To properly load balance TRILL encapsulated frames, an
218     RBridge should identify TRILL encapsulated frames and implement a
219     specific hashing algorithm for this Ethertype. A specific Ethertype
220     would be used for TRILL frames, making them trivial to identify.

222     The load balancing that would be provided by current mechanisms is
223     not sufficient. Without the PWE3 control word, either the TRILL
224     encapsulated frame would appear as non-IP and would be load balanced
225     based on a hash of the label stack (known as LABEL ECMP [MPLS-ECMP])
226     or it would be mis-identified as IP and load balanced based on the
227     bits located where IP addresses would be if the encapsulated Ethernet
228     frame were an IP packet. The former case would provide no flow
229     diversity, since all TRILL encapsulated frames would have the same
230     label, corresponding to the same egress RBridge nickname. The latter
231     case could risk packet re-ordering. Current mechanisms seeing the
232     PWE3 control word would use LABEL ECMP and thus provide no flow
233     diversity.

235  2.4.  Multicast and Broadcast Frames

237     For multicast/broadcast frames, the ingress RBridge nickname
238     indicates the spanning tree which should be used.
        As with the
239     unicast case, a label is formed of the nickname field and the
240     unicast/other field (label[19:1] = nickname[18:0] and label[0] = 1).
241     The treatment of the TTL and EXP fields is the same.

243     When an RBridge learns of a new ingress RBridge nickname, an ILM
244     entry corresponding to the label is created. An out-segment is
245     created for each interface that is in the SPT rooted at the ingress
246     RBridge. The in-segment is connected to the created out-segments
247     with multicasting specified; subject to filtering, each frame will be
248     sent out each out-segment. Except for the egress filtering, the
249     above forwarding behavior is already part of MPLS; it is used to
250     support point-to-multipoint MPLS LSPs.

252     Filtering may be applied based upon the frame and the outgoing
253     interface's membership. For instance, if a frame is being broadcast
254     along a VLAN and an interface is marked as not being connected to any
255     bridges or RBridges with VLAN membership, then the frame need not be
256     sent out that interface. Similarly, if a frame is being multicast,
257     the RBridge could decide to filter the frame if the interface is
258     explicitly known to not be part of the multicast tree.

260  3.  Dynamic Assignment of 19-bit Nicknames

262     We assume each RBridge has a unique 6-byte system ID, which it uses
263     as its IS-IS ID. In order to use the compressed MPLS-like encoding
264     of the shim header, we need to create an identifier which is 19 bits
265     long. This gives a space of 2^19 = 524,288 nicknames, which should
266     be ample. We do, however, need a method for
267     assigning nicknames to RBridges so that the nicknames are unique
268     within the RBridge domain.

270     We will assign a new TLV type value to be carried in IS-IS link state
271     PDUs (LSPs). The TLV will carry the nickname the LSP source wishes to use.
        The TLV will be:

273     +------+--------+-----------------------+
274     | type | length | value=19 bit nickname |
275     +------+--------+-----------------------+

277                          Figure 2: Nickname TLV

279     Each RBridge chooses its own nickname. However, each RBridge is also
280     responsible for ensuring that its nickname is unique. If R1 chooses
281     nickname x, and R1 discovers, through receipt of R2's LSP, that R2
282     has also chosen x, then the RBridge with the lower system ID keeps
283     the nickname, and the other one must choose a new nickname.

285     If two RBridge domains merge, then there might be a lot of nickname
286     collisions for a short time, but as soon as each side receives the
287     link state packets of the other, the RBridges that need to change
288     nicknames will quickly become aware of this, and choose new nicknames
289     that do not, to the best of their ability, collide with any existing
290     nicknames.

292     To minimize the probability of nickname collisions, each RBridge
293     chooses its nickname randomly from the set of nicknames not already
294     in use. Alternatively, we could use some sort of hash algorithm (such
295     as the bottom 19 bits of the MD5 of the RBridge's system ID) to choose
296     the first nickname, and then if there is a collision, go to the next
297     19 bits of the MD5, and so on, until all 128 bits of the MD5 hash are
298     exhausted, in which case the RBridge hashes its own system ID again,
299     this time together with the constant "1".

301     There is no reason for all RBridges to use the same algorithm for
302     choosing nicknames. Picking them at random, or using a hash, is an
303     attempt to avoid collisions when the network starts up, but that is
304     only an optimization. Even if all RBridges used the same algorithm
305     (say, as a worst case, they all start with "1" and count up
306     sequentially until they find an uncontested nickname), the network
307     will eventually stabilize. And once it is stable, nicknames should
308     remain stable even as routers go up or down.
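
        As an illustration only, the hash-based selection and the collision
        tie-break described above can be sketched in Python. The function
        names, the byte order used to slice the MD5 digest, the treatment
        of the 14 bits left over after six 19-bit slices, and the suffix
        used for rounds after the second are assumptions of this sketch;
        this document does not specify them. The last helper shows how a
        19-bit nickname fits into the 20-bit label of Section 2.

```python
import hashlib

NICKNAME_BITS = 19
NICKNAME_MASK = (1 << NICKNAME_BITS) - 1  # 0x7FFFF


def candidate_nicknames(system_id, rounds=2):
    """Yield 19-bit candidates sliced from MD5 digests of the system ID."""
    for round_no in range(rounds):
        # Round 0 hashes the bare system ID; round 1 appends the constant "1"
        # as the text describes. Appending "2", "3", ... in later rounds is
        # an assumption of this sketch.
        suffix = b"" if round_no == 0 else str(round_no).encode("ascii")
        digest = int.from_bytes(hashlib.md5(system_id + suffix).digest(), "big")
        # Bottom 19 bits first, then the next 19, and so on. A 128-bit digest
        # holds six whole slices; the remaining 14 bits are skipped (assumption).
        for i in range(128 // NICKNAME_BITS):
            yield (digest >> (i * NICKNAME_BITS)) & NICKNAME_MASK


def choose_nickname(system_id, in_use, rounds=4):
    """Return the first candidate that is not in the set of nicknames in use."""
    for nickname in candidate_nicknames(system_id, rounds):
        if nickname not in in_use:
            return nickname
    raise RuntimeError("no uncontested nickname found")


def keeps_nickname(my_system_id, other_system_id):
    """On a collision, the RBridge with the lower system ID keeps the nickname."""
    # Equal-length byte strings compare like big-endian unsigned integers.
    return my_system_id < other_system_id


def label_for(nickname, multicast):
    """Build the 20-bit label: label[19:1] = nickname, label[0] = unicast/other."""
    return ((nickname & NICKNAME_MASK) << 1) | (1 if multicast else 0)
```

        For example, with the hypothetical system IDs 00:00:5e:00:53:01 and
        00:00:5e:00:53:02, keeps_nickname() reports that the first RBridge
        keeps a contested nickname; the second would add the contested
        value to in_use and call choose_nickname() again.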

310     To minimize the probability of a new RBridge usurping a nickname
311     already in use, an RBridge should wait to acquire the link state
312     database from a neighbor before it announces its own nickname.

314  4.  Security Considerations

316     The security implications of selecting this format have not yet been
317     considered.

319  5.  References

321     [CCONV]    Bryant, S. and M. Shand, "Applicability of Loop-free
322                Convergence", draft-bryant-shand-lf-conv-frmwk-00.txt
323                (work in progress), June 2005.

325     [MPLS-ECMP]
326                Swallow, G., Bryant, S., and L. Andersson, "Avoiding Equal
327                Cost Multipath Treatment in MPLS Networks",
328                draft-ietf-mpls-ecmp-bcp-01.txt (work in progress),
329                July 2005.

331     [PWE3-ETHER]
332                Martini, L., Rosen, E., and G. Heron, "Encapsulation
333                Methods for Transport of Ethernet Over MPLS Networks",
334                draft-ietf-pwe3-ethernet-encap-10.txt (work in progress),
335                June 2005.

337     [RBRIDGE]  Perlman, R., Touch, J., and A. Yegin, "RBridges:
338                Transparent Routing", draft-perlman-rbridge-03.txt (work
339                in progress), May 2005.

341     [RFC3032]  Rosen, E., Tappan, D., Fedorkow, G., Rekhter, Y.,
342                Farinacci, D., Li, T., and A. Conta, "MPLS Label Stack
343                Encoding", RFC 3032, January 2001.

345     [VPLS]     Lasserre, M. and V. Kompella, "Virtual Private LAN
346                Services over MPLS", draft-ietf-l2vpn-ldp-07.txt (work in
347                progress), July 2005.

349  Authors' Addresses

351     Stewart Bryant
352     Cisco Systems
353     250, Longwater, Green Park
354     Reading RG2 6GB
355     United Kingdom

357     Email: stbryant@cisco.com

359     Radia Perlman
360     Sun Microsystems

362     Email: Radia.Perlman@sun.com

364     Alia K. Atlas
365     Google
366     1600 Amphitheatre Parkway
367     Mountain View, CA 94043
368     USA

370     Email: akatlas@alum.mit.edu

372     Don Fedyk
373     Nortel Networks
374     600 Technology Park
375     Billerica, MA 01821
376     USA

378     Phone: +1 978 288 3041
379     Email: dwfedyk@nortelnetworks.com

381  Intellectual Property Statement

383     The IETF takes no position regarding the validity or scope of any
384     Intellectual Property Rights or other rights that might be claimed to
385     pertain to the implementation or use of the technology described in
386     this document or the extent to which any license under such rights
387     might or might not be available; nor does it represent that it has
388     made any independent effort to identify any such rights. Information
389     on the procedures with respect to rights in RFC documents can be
390     found in BCP 78 and BCP 79.

392     Copies of IPR disclosures made to the IETF Secretariat and any
393     assurances of licenses to be made available, or the result of an
394     attempt made to obtain a general license or permission for the use of
395     such proprietary rights by implementers or users of this
396     specification can be obtained from the IETF on-line IPR repository at
397     http://www.ietf.org/ipr.

399     The IETF invites any interested party to bring to its attention any
400     copyrights, patents or patent applications, or other proprietary
401     rights that may cover technology that may be required to implement
402     this standard. Please address the information to the IETF at
403     ietf-ipr@ietf.org.

405  Disclaimer of Validity

407     This document and the information contained herein are provided on an
408     "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
409     OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
410     ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
411     INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
412     INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
413     WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

415  Copyright Statement

417     Copyright (C) The Internet Society (2005). This document is subject
418     to the rights, licenses and restrictions contained in BCP 78, and
419     except as set forth therein, the authors retain all their rights.

421  Acknowledgment

423     Funding for the RFC Editor function is currently provided by the
424     Internet Society.