idnits 2.17.1 draft-eastlake-trill-rbridge-clear-correct-03.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- == There are 1 instance of lines with non-RFC6890-compliant IPv4 addresses in the document. If these are example addresses, they should be changed. -- The draft header indicates that this document updates RFC6439, but the abstract doesn't seem to directly say this. It does mention RFC6439 though, so this could be OK. -- The draft header indicates that this document updates RFC6325, but the abstract doesn't seem to directly say this. It does mention RFC6325 though, so this could be OK. -- The draft header indicates that this document updates RFC6327, but the abstract doesn't seem to mention this, which it should. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year (Using the creation date from RFC6325, updated by this document, for RFC5378 checks: 2006-05-11) -- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. If not, you may need to add the pre-RFC5378 disclaimer. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) 
-- The document date (January 8, 2012) is 4490 days in the past. Is this intentional? Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) -- Possible downref: Non-RFC (?) normative reference: ref. 'IS-IS' ** Obsolete normative reference: RFC 5306 (Obsoleted by RFC 8706) ** Obsolete normative reference: RFC 6327 (Obsoleted by RFC 7177) ** Obsolete normative reference: RFC 6439 (Obsoleted by RFC 8139) Summary: 3 errors (**), 0 flaws (~~), 2 warnings (==), 6 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 1 TRILL Working Group Donald Eastlake 2 INTERNET-DRAFT Mingui Zhang 3 Intended status: Proposed Standard Huawei 4 Updates: 6325, 6327, 6439 Anoop Ghanwani 5 Dell 6 Ayan Banerjee 7 Cisco 8 Vishwas Manral 9 Hewlett-Packard 10 Expires: July 7, 2012 January 8, 2012 12 TRILL: Clarifications, Corrections, and Updates 13 15 Abstract 17 The IETF TRILL (TRansparent Interconnection of Lots of Links) 18 protocol provides least cost pair-wise data forwarding without 19 configuration in multi-hop networks with arbitrary topology, safe 20 forwarding even during periods of temporary loops, and support for 21 multipathing of both unicast and multicast traffic. TRILL 22 accomplishes this by using IS-IS (Intermediate System to Intermediate 23 System) link state routing and by encapsulating traffic using a 24 header that includes a hop count. Since the TRILL base protocol was 25 approved in March 2010, active development of TRILL has revealed a 26 few errata in the original RFC 6325 and some cases that could use 27 clarifications or updates. 
   RFC 6327, RFC 6439, and RFC XXXX provide clarifications with
   respect to Adjacency, Appointed Forwarders, and the TRILL ESADI
   protocol. This document provides other known clarifications,
   corrections, and updates to RFCs 6325, 6327, and 6439.

Status of This Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79. Distribution of this document is
   unlimited. Comments should be sent to the TRILL working group
   mailing list.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/1id-abstracts.html

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

Table of Contents

   1. Introduction............................................4
   1.1 Precedence.............................................4
   1.2 Terminology and Acronyms...............................4

   2.
Overloaded and/or Unreachable RBridges..................6 63 2.1 Reachability...........................................6 64 2.2 Distribution Trees.....................................7 65 2.3 Overloaded Receipt of TRILL Data Frames................7 66 2.3.1 Known Unicast Receipt................................7 67 2.3.2 Multi-Destination Receipt............................8 68 2.4 Overloaded Origination of TRILL Data Frames............8 69 2.4.1 Known Unicast Origination............................8 70 2.4.2 Multi-Destination Origination........................8 71 2.4.2.1 An Example Network.................................8 72 2.4.2.2 Indicating OOMF Support............................9 73 2.4.2.3 Using OOMF Service................................10 75 3. Distribution Trees.....................................11 76 3.1 Number of Distribution Trees..........................11 77 3.2 Distribution Tree Updates.............................11 79 4. Nickname Selection.....................................12 81 5. MTU (Maximum Transmission Unit)........................14 82 5.1 MTU Related Errata in RFC 6325........................14 83 5.1.1 MTU PDU Addressing..................................14 84 5.1.2 MTU PDU Processing..................................14 85 5.1.3 MTU Testing.........................................15 86 5.2 Ethernet MTU Values...................................15 88 6. Port Modes.............................................17 89 7. The CFI / DEI Bit......................................18 90 8. Graceful Restart.......................................19 91 9. Some Updates to RFC 6327...............................20 92 10. Updates on Appointed Forwarders and Inhibition........21 93 10.1 Optional TRILL Hello Reduction.......................21 94 10.2 Overflow and Appointed Forwarders....................23 96 11. IANA Considerations...................................24 97 12. 
Security Considerations...............................25 98 Acknowledgements..........................................25 99 Normative References......................................26 100 Informative References....................................26 101 Authors' Addresses........................................27 102 Appendix: Change Record...................................28 104 1. Introduction 106 The IETF TRILL (Transparent Interconnection of Lots of Links) 107 protocol [RFC6325] provides optimal pair-wise data frame forwarding 108 without configuration in multi-hop networks with arbitrary topology, 109 safe forwarding even during periods of temporary loops, and support 110 for multipathing of both unicast and multicast traffic. TRILL 111 accomplishes this by using IS-IS (Intermediate System to Intermediate 112 System) [IS-IS] [RFC1195] [RFC6326bis] link state routing and 113 encapsulating traffic using a header that includes a hop count. The 114 design supports VLANs (Virtual Local Area Networks) and optimization 115 of the distribution of multi-destination frames based on VLANs and IP 116 derived multicast groups. 118 Since the TRILL base protocol [RFC6325] was approved, the active 119 development of TRILL has revealed cases that could use clarifications 120 or updates and a few errors in the original specification document 121 [RFC6325]. 123 [RFC6327], [RFC6439], and [RFCXXXX], provide clarifications with 124 respect to Adjacency, Appointed Forwarders, and the TRILL ESADI 125 protocol. This document provides other known clarifications, 126 corrections, and updates to [RFC6325], [RFC6327], and [RFC6439]. 128 1.1 Precedence 130 In case of conflict between this document and any of [RFC6325], 131 [RFC6327], or [RFC6439], this document takes precedence. 
   In addition, Section 1.2 (Normative Content and Precedence) of
   [RFC6325] is updated to provide a more complete precedence ordering
   of the sections of [RFC6325] as follows, where sections to the left
   take precedence over sections to their right:

      4 > 3 > 7 > 5 > 2 > 6 > 1

1.2 Terminology and Acronyms

   This document uses the acronyms defined in [RFC6325] and the
   following additional acronyms:

      CFI - Canonical Format Indicator [802]

      DEI - Drop Eligibility Indicator [802.1Q-2011]

      OOMF - Overload Originated Multi-destination Frame

      TRILL Switch - An alternative name for an RBridge

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in
   [RFC2119].

2. Overloaded and/or Unreachable RBridges

   RBridges may be in overload as indicated by the [IS-IS] overload flag
   in their LSPs. This means that either (1) they are incapable of
   holding the entire link state database and thus do not have a view of
   the entire topology or (2) they have been configured to have the
   overload bit on. Although networks should be engineered to avoid
   actual link state overload, it might occur under various
   circumstances, for example, if a large campus included one or more
   low-end TRILL Switches.

   It is a common operational practice to set the overload bit in an
   [IS-IS] router (such as an RBridge) when performing maintenance on
   that router that might affect its ability to correctly forward
   frames; this will usually leave the router reachable for maintenance
   traffic but transit traffic will not normally be routed through it.
   (Also, in some cases, TRILL provides for setting the overload bit in
   the pseudo node of a link to stop TRILL Data traffic on an access
   link (see Section 4.9.1 of [RFC6325]).)
   [IS-IS] and TRILL make a reasonable effort to do what they can even
   if some RBridges/routers are in overload. They can do reasonably well
   if a few scattered nodes are in overload. However, actual least cost
   paths are no longer assured if any RBridges are in overload.

   For the effect of overload on the appointment of forwarders, see
   Section 10.2.

2.1 Reachability

   Frames are not least cost routed through an overloaded TRILL Switch
   if any other path is available, although they may originate or
   terminate at an overloaded TRILL Switch. In addition, frames will not
   be least cost routed over links with cost 2**24 - 1; such links are
   reserved for traffic engineered frames, the handling of which is
   beyond the scope of this document.

   As a result, a portion of the campus may be unreachable for least
   cost routed TRILL Data because all paths to it would be through a
   link with cost 2**24 - 1. For example, an RBridge RB1 is not
   reachable by TRILL Data if all of its neighbors are connected to RB1
   by links with cost 2**24 - 1. Such RBridges are called "data
   unreachable".

   The link state database at an RBridge RB1 can also contain
   information on TRILL Switches that are unreachable by IS-IS link
   state flooding due to link or RBridge failures. When such failures
   partition the campus, the TRILL Switches adjacent to the failure and
   on the same side of the failure as RB1 will update their LSPs to show
   the lack of connectivity and RB1 will receive those updates. However,
   LSPs held by RB1 for TRILL Switches on the far side of the failure
   will not be updated and may stay around until they time out, which
   could be tens of seconds or longer. As a result, RB1 will be aware of
   the partition. Nodes on the far side of the partition are both
   "IS-IS unreachable" and data unreachable.
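
   The "data unreachable" notion above can be illustrated with a small
   sketch. It is only an illustration under stated assumptions: the
   function name data_reachable and the (node, node, cost) link tuples
   are hypothetical, links are treated as bidirectional with a single
   cost, and the separate rule about avoiding transit through
   overloaded TRILL Switches is deliberately left out.

```python
from collections import deque

# Links with this cost are not used for least cost routed TRILL Data.
MAX_LINK_COST = 2**24 - 1


def data_reachable(links, source):
    """Return the set of nodes reachable from `source` without
    crossing any link whose cost is 2**24 - 1.

    `links` is an iterable of (node_a, node_b, cost) tuples.
    Assumption: each link is bidirectional with one cost value.
    """
    adjacency = {}
    for a, b, cost in links:
        if cost >= MAX_LINK_COST:
            continue  # reserved for traffic engineered frames
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)

    # Plain breadth-first search over the remaining links.
    seen = {source}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen


# RB3 hangs off RB2 by a single maximum-cost link, so it is
# data unreachable from every other node in this toy campus.
example_links = [
    ("RB1", "RB2", 10),
    ("RB2", "RB3", MAX_LINK_COST),
    ("RB1", "RB4", 10),
]
reachable_from_rb4 = data_reachable(example_links, "RB4")
```

   In this toy campus RB3 stays in the link state database but is
   absent from the returned set, which matches the definition of data
   unreachable given above.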
2.2 Distribution Trees

   An RBridge in overload cannot be trusted to correctly calculate
   distribution trees or correctly perform the Reverse Path Forwarding
   Check. Therefore, it cannot be trusted to forward multi-destination
   TRILL Data frames. It can only appear as a leaf node in a TRILL
   multi-destination distribution tree. Furthermore, if all the
   immediate neighbors of an RBridge are overloaded, then it is omitted
   from all trees in the campus and is unreachable by multi-destination
   frames.

   When an RBridge determines what nicknames to use as the roots of the
   distribution trees it calculates, it MUST ignore all nicknames held
   by TRILL Switches that are in overload or are data unreachable. When
   calculating Reverse Path Forwarding Checks for multi-destination
   frames, an RBridge RB1 can similarly ignore any trees that cannot
   reach RB1 even if other RBridges list those trees as trees those
   other TRILL Switches might use. (But see Section 3.)

2.3 Overloaded Receipt of TRILL Data Frames

   The receipt of TRILL Data frames by overloaded RBridge RB2 is
   discussed in the subsections below. In all cases, the normal Hop
   Count decrement is performed and the TRILL Data frame is discarded if
   the result is less than one or if the egress nickname is illegal.

2.3.1 Known Unicast Receipt

   RB2 will not usually receive unicast TRILL Data frames unless it is
   the egress, in which case it decapsulates and delivers the frames
   normally. If RB2 receives a unicast TRILL Data frame for which it is
   not the egress, perhaps because a neighbor does not yet know it is in
   overload, RB2 MUST NOT discard the frame merely because the egress is
   an unknown nickname, since RB2 might not know about all nicknames due
   to its overloaded condition. If any neighbor, other than the neighbor
   from which it received the frame, is not overloaded, RB2 MUST attempt
   to forward the frame to one of those neighbors.
If there is no such 250 neighbor, the frame is discarded. 252 2.3.2 Multi-Destination Receipt 254 If RB2 in overload receives a multi-destination TRILL Data frame, RB2 255 MUST NOT apply a Reverse Path Forwarding Check since, due to 256 overload, it might not do so correctly. RB2 decapsulates and delivers 257 the frame locally where it is Appointed Forwarder for the frame's 258 VLAN, subject to any multicast pruning. But since, as stated above, 259 RB2 can only be the leaf of a distribution tree, it MUST NOT forward 260 a multi-destination TRILL Data frame (except as an egressed native 261 frame where RB2 is Appointed Forwarder). 263 2.4 Overloaded Origination of TRILL Data Frames 265 Overloaded origination of unicast frames with known egress and of 266 multi-destination frames are discussed in the subsections below. 268 2.4.1 Known Unicast Origination 270 When an overloaded RBridge RB2 ingresses or creates a known 271 destination unicast TRILL Data frame, it delivers it locally if the 272 destination MAC is local. Otherwise RB2 unicasts it to any neighbor 273 TRILL Switch that is not overloaded. It MAY use what routing 274 information it has to help select the neighbor. 276 2.4.2 Multi-Destination Origination 278 Overloaded RBridge RB2 ingressing to or creating a multi-destination 279 TRILL Data frame is more complex than for a known unicast frame. 281 2.4.2.1 An Example Network 283 For example, consider the network below in which, for simplicity, end 284 stations and any bridges are not shown. There is one distribution 285 tree of which RB4 is the root and which is represented by double 286 lines. Only RBridge RB2 is overloaded. 

      +-----+   +-----+     +-----+
      | RB7 +===+ RB5 +=====+ RB3 |
      +-----+   +--+--+     +-++--+
                   |          ||
               +---+---+      ||
          +----+RB2(ov)|======++
          |    +-------+      ||
          |                   ||
      +---+-+    +-----+  ++==++=++
      | RB8 +====+ RB6 +==++ RB4 ||
      +-----+    +-----+  ++=====++

   Since RB2 is overloaded it does not know what the distribution tree
   or trees are for the network. Thus there is no way it can provide
   normal TRILL Data encapsulation for multi-destination native frames.
   So, if RB2 has a neighbor that is not overloaded and that signals it
   is willing to offer this service, RB2 tunnels the frame to that
   neighbor, which handles the frame specially. RBridges indicate this
   in their Hellos as described below. This service is called OOMF
   (Overload Originated Multi-destination Frame) service.

   -  The multi-destination frame MUST NOT be locally distributed in
      native form at RB2 before tunneling to a neighbor because this
      would cause the frame to be delivered twice. For example, if
      RB2 locally distributed a multicast native frame and then
      tunneled it to RB5, RB2 would get a copy of the frame when RB3
      transmitted it as a TRILL Data frame on the multi-access
      RB2-RB3-RB4 link. Since RB2 would, in general, not be able to
      tell that this was a frame it had tunneled for distribution,
      RB2 would decapsulate it and locally distribute it a second
      time.

   -  On the other hand, if there were no neighbor of RB2 willing to
      offer RB2 the OOMF service, RB2 cannot tunnel the frame to a
      neighbor. In this case RB2 MUST locally distribute the frame
      where it is Appointed Forwarder for the frame's VLAN and
      optionally subject to multicast pruning.

2.4.2.2 Indicating OOMF Support

   An RBridge RB3 indicates its willingness to offer the OOMF service to
   RB2 in the TRILL Neighbor TLV in RB3's TRILL Hellos by setting a bit
   associated with the SNPA (MAC address) of RB2 on the link. (See
   Section 11.)
   Overloaded RBridge RB2 can only distribute multi-destination TRILL
   Data frames to the campus if a neighbor of RB2 not in overload offers
   RB2 the OOMF service. If RB2 does not have OOMF service available to
   it, RB2 can still receive multi-destination frames from
   non-overloaded neighbors and, if RB2 should originate or ingress such
   a frame, it distributes it locally in native form.

2.4.2.3 Using OOMF Service

   If RB2 sees this OOMF (Overload Originated Multi-destination Frame)
   service advertised by any of its neighbors on any link to which RB2
   connects, it selects one such neighbor by a means beyond the scope of
   this document. Assume RB2 selects RB3 to handle the multi-destination
   frames it originates. RB2 MUST advertise in its LSP that it might use
   any of the distribution trees that RB3 advertises it might use so
   that the Reverse Path Forwarding Check will work in the rest of the
   campus. Thus, notwithstanding its overloaded state, RB2 MUST retain
   this information from RB3 LSPs, which it will receive as it is
   directly connected to RB3.

   RB2 then encapsulates such frames as TRILL Data frames to RB3 as
   follows: M bit = 0, Hop Count = 2, ingress nickname = a nickname held
   by RB2, and, since RB2 cannot tell what distribution tree RB3 will
   use, egress nickname = a special nickname indicating an OOMF frame
   (see Section 6). RB2 then unicasts this TRILL Data frame to RB3.
   (Implementation of Item 4 in Section 4 below provides reasonable
   assurance that, notwithstanding its overloaded state, the ingress
   nickname used by RB2 will be unique within at least the portion of
   the campus that is IS-IS reachable from RB2.)
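
   The header values RB2 uses when tunneling a frame for OOMF service
   can be sketched as follows. This is a minimal illustration, not a
   definitive implementation: the TrillHeader class and the
   oomf_encapsulate function are hypothetical names, only the four
   fields named above are modeled, and the special OOMF nickname is
   passed in as a parameter because this text does not give its value.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrillHeader:
    """Only the TRILL header fields discussed above (hypothetical model)."""
    m_bit: int             # 0 = unicast, 1 = multi-destination
    hop_count: int
    ingress_nickname: int
    egress_nickname: int


def oomf_encapsulate(rb2_nickname: int, oomf_nickname: int) -> TrillHeader:
    """Build the header overloaded RB2 puts on a frame it unicasts
    to the neighbor offering OOMF service.

    `oomf_nickname` stands for the special nickname indicating an
    OOMF frame; its actual value is not specified in this text, so
    the caller must supply it.
    """
    return TrillHeader(
        m_bit=0,                       # sent as unicast to the neighbor
        hop_count=2,                   # value given in the text above
        ingress_nickname=rb2_nickname,
        egress_nickname=oomf_nickname,
    )


# Both nickname values below are made-up placeholders for illustration.
header = oomf_encapsulate(rb2_nickname=0x1234, oomf_nickname=0xFFC0)
```

   Keeping the OOMF nickname as an explicit argument avoids baking an
   unassigned value into the sketch.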

   On receipt of such a frame, RB3 does the following:

   -  change the egress nickname field to designate a distribution tree
      that RB3 normally uses,
   -  set the M bit to one,
   -  change the Hop Count to the value it would normally use if it were
      the ingress, and
   -  forward the frame on that tree.

   RB3 MAY rate limit the number of frames for which it is providing
   this service by discarding some such frames from RB2. The provision
   of even limited bandwidth for OOMFs by RB3, perhaps via the slow
   path, may be important to the bootstrapping of services at RB2 or at
   end stations connected to RB2, such as supporting DHCP and ARP/ND.
   (Everyone sometimes needs a little OOMF (pronounced oompf) to get off
   the ground.)

3. Distribution Trees

   A correction and a clarification related to distribution trees
   appear in the subsections below. See also Section 2.2.

3.1 Number of Distribution Trees

   In [RFC6325], Section 4.5.2, page 56, Point 2, 4th paragraph, the
   parenthetical "(up to the maximum of {j,k})" is incorrect. It should
   read "(up to k if j is zero or the minimum of (j, k) if j is
   non-zero)".

3.2 Distribution Tree Updates

   When a link state database change causes a change in the distribution
   tree(s), there are several possibilities. If a tree root remains a
   tree root but the tree changes, then local forwarding and RPFC
   entries for that tree should be updated as soon as practical.
   Similarly, if a new nickname becomes a tree root, forwarding and RPFC
   entries for the new tree should be installed as soon as practical.
   However, if a nickname ceases to be a tree root and there is
   sufficient room in local tables, the forwarding and RPFC entries for
   the former tree MAY be retained so that any multi-destination TRILL
   Data frames already in flight on that tree have a higher probability
   of being delivered.

4.
Nickname Selection 404 Nickname selection is covered by Section 3.7.3 of [RFC6325]. However, 405 the following should be noted: 407 1. The second sentence in the second bullet item in Section 3.7.3 of 408 [RFC6325] on page 25 is erroneous and is corrected as follows: 410 1.a The occurrence of "IS-IS ID (LAN ID)" is replaced with 411 "priority". 413 1.b The occurrence of "IS-IS System ID" is replaced with "seven 414 byte IS-IS ID (LAN ID)". 416 The resulting corrected [RFC6325] sentence reads as follows: "If 417 RB1 chooses nickname x, and RB1 discovers, through receipt of an 418 LSP for RB2 at any later time, that RB2 has also chosen x, then 419 the RBridge or pseudonode with the numerically higher priority 420 keeps the nickname, or if there is a tie in priority, the RBridge 421 with the numerically higher seven byte IS-IS ID (LAN ID) keeps the 422 nickname, and the other RBridge MUST select a new nickname." 424 2. In examining the link state database for nickname conflicts, 425 nicknames held by IS-IS unreachable TRILL Switches MUST be ignored 426 but nicknames held by IS-IS reachable TRILL Switches MUST NOT be 427 ignored even if they are data unreachable. 429 3. An RBridge may need to select a new nickname, either initially 430 because it has none or because of a conflict. When doing so, the 431 RBridge MUST consider as available all nicknames that do not 432 appear in its link state database or that appear to be held by IS- 433 IS unreachable TRILL Switches; however, it SHOULD give preference 434 to selecting new nicknames that do not appear to be held by any 435 TRILL Switch in the campus, reachable or unreachable, so as to 436 minimize conflicts if IS-IS unreachable TRILL Switches later 437 become reachable. 439 4. An RBridge, even after it has acquired a nickname for which there 440 appears to be no conflicting claimant, MUST continue to monitor 441 for conflicts with the nickname or nicknames it holds. 
It does so 442 by checking in LSPs it receives that should update its link state 443 database for any of its nicknames held with higher priority by 444 another TRILL Switch that is IS-IS reachable. If it finds such a 445 conflict, it MUST select a new nickname. (It is possible to 446 receive an LSP that should update the link state database but does 447 not due to overflow.) 449 5. In the very unlikely case that an RBridge is unable to obtain a 450 nickname because all valid nicknames (0x0001 through 0xFFBF 451 inclusive) are in use with higher priority by IS-IS reachable 452 TRILL Switches, it will be unable to act as an ingress, egress, or 453 tree root but will still be able to function as a transit TRILL 454 Switch. Although it cannot be a tree root, such an RBridge is 455 included in every distribution tree computed for the campus. It 456 would not be possible to send an RBridge Channel message to such a 457 TRILL Switch [Channel]. 459 5. MTU (Maximum Transmission Unit) 461 MTU values in TRILL key off the originatingL1LSPBufferSize value 462 communicated in the IS-IS originatingLSPBufferSize TLV [IS-IS]. The 463 campus-wide value Sz, as described in [RFC6325] Section 4.3.1, is the 464 minimum value of originatingL1LSPBufferSize for the RBridges in a 465 campus, but not less than 1470. The MTU testing mechanism and 466 limiting LSPs to Sz assures that the LSPs can be flooded properly by 467 IS-IS and thus that IS-IS can operate properly. 469 If nothing is known about the campus, the originatingL1LSPBufferSize 470 for an RBridge should default to the minimum of the LSP size that its 471 TRILL IS-IS software can handle and the minimum MTU of the ports that 472 it might use to receive or transmit LSPs. 
However, to avoid having to 473 refragment LSPs, originatingL1LSPBufferSize SHOULD be configured to a 474 smaller value if it is known that other RBridges will be announcing 475 such smaller value or that the campus will partition due to a 476 significant number of links with an MTU of such smaller value. In a 477 well configured campus, to minimize any LSP re-sizing, it is 478 desirable for all RBridges to be configured with the same 479 originatingL1LSPBufferSize. 481 Section 5.1 below corrects errata in [RFC6325] and Section 5.2 482 clarifies the meaning of various MTU (Maximum Transmission Unit) 483 limits for TRILL Ethernet links. 485 5.1 MTU Related Errata in RFC 6325 487 Three MTU related errata in [RFC6325] are corrected in the 488 subsections below. 490 5.1.1 MTU PDU Addressing 492 Section 4.3.2 of [RFC6325] incorrectly states that multi-destination 493 MTU-probe and MTU-ack TRILL IS-IS PDUs are sent on Ethernet links 494 with the All-RBridges multicast address as the Outer.MacDA. As TRILL 495 IS-IS PDUs, when multicast on an Ethernet link, they MUST be sent to 496 the All-IS-IS-RBridges multicast address. 498 5.1.2 MTU PDU Processing 500 As discussed in [RFC6325] and, in more detail, in [RFC6327], MTU- 501 probe and MTU-ack PDUs MAY be unicast; however, Section 4.6 of 502 [RFC6325] erroneously does not allow for this possibility. It is 503 corrected by replacing Item numbered "1" in Section 4.6.2 of 504 [RFC6325] with the following quoted text to which TRILL Switches MUST 505 conform: 507 "1. If the Ethertype is L2-IS-IS and the Outer.MacDA is either All- 508 IS-IS-RBridges or the unicast MAC address of the receiving 509 RBridge port, the frame is handled as described in Section 510 4.6.2.1" 512 The reference to "Section 4.6.2.1" in the above quoted text is to 513 that Section in [RFC6325]. 515 5.1.3 MTU Testing 517 The last two sentences of Section 4.3.2 of [RFC6325] have errors. 
518 They currently read: 520 If X is not greater than Sz, then RB1 sets the "failed minimum MTU 521 test" flag for RB2 in RB1's Hello. If size X succeeds, and X > Sz, 522 then RB1 advertises the largest tested X for each adjacency in the 523 TRILL Hellos RB1 sends on that link, and RB1 MAY advertise X as an 524 attribute of the link to RB2 in RB1's LSP. 526 They should read: 528 If X is not greater than or equal to Sz, then RB1 sets the "failed 529 minimum MTU test" flag for RB2 in RB1's Hello. If size X succeeds, 530 and X >= Sz, then RB1 advertises the largest tested X for each 531 adjacency in the TRILL Hellos RB1 sends on that link, and RB1 MAY 532 advertise X as an attribute of the link to RB2 in RB1's LSP. 534 5.2 Ethernet MTU Values 536 originatingL1LSPBufferSize is the maximum permitted size of LSPs 537 after the eight byte fixed IS-IS PDU header. This IS-IS PDU header 538 starts with the 0x83 Intradomain Routeing Protocol Discriminator byte 539 and ends with the Maximum Area Addresses byte, inclusive. In layer 3 540 IS-IS, originatingL1LSPBufferSize defaults to 1492 bytes and thus the 541 default Layer 3 LSP size, including this header, is 1500 bytes. In 542 TRILL, originatingL1LSPBufferSize defaults to 1470 bytes, allowing 22 543 bytes of additional headroom or safety margin to accommodate legacy 544 devices with, for example, the classic Ethernet maximum MTU, and 545 headers such as an Outer.VLAN. We will call this safety margin 546 "Margin" below. 
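
   The sizes quoted above, the campus-wide Sz from Section 5, and the
   corrected comparison from Section 5.1.3 can be checked with a short
   sketch. The constant and function names are hypothetical; the
   numbers are taken from the text, and the flag logic simply encodes
   "X is not greater than or equal to Sz".

```python
# Values quoted in the text above.
LAYER3_DEFAULT_BUFFER = 1492   # layer 3 IS-IS originatingL1LSPBufferSize
TRILL_DEFAULT_BUFFER = 1470    # TRILL originatingL1LSPBufferSize default
ISIS_FIXED_HEADER = 8          # fixed IS-IS PDU header, in bytes
SZ_FLOOR = 1470                # Sz is never taken to be less than this

# "Margin": headroom gained by the smaller TRILL default.
MARGIN = LAYER3_DEFAULT_BUFFER - TRILL_DEFAULT_BUFFER

# Default layer 3 LSP size including the fixed header.
LAYER3_DEFAULT_LSP_SIZE = LAYER3_DEFAULT_BUFFER + ISIS_FIXED_HEADER


def campus_sz(buffer_sizes):
    """Campus-wide Sz: the minimum advertised
    originatingL1LSPBufferSize, but not less than 1470."""
    return max(SZ_FLOOR, min(buffer_sizes))


def failed_minimum_mtu_test(x, sz):
    """Corrected Section 5.1.3 rule: the flag is set when the largest
    successfully tested size X is not >= Sz."""
    return x < sz


sz = campus_sz([1470, 9000, 1400])          # floor applies
flag_equal = failed_minimum_mtu_test(1470, sz)
flag_short = failed_minimum_mtu_test(1469, sz)
```

   The case X == Sz is the one the erratum changes: under the corrected
   wording it passes the minimum MTU test rather than failing it.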

   Assuming the campus wide minimum link MTU is Sz, RBridges on Ethernet
   links MUST limit most TRILL IS-IS PDUs so that PDUz (the length of
   the PDU starting just before and including the L2-IS-IS Ethertype and
   ending just before the Ethernet frame FCS) does not exceed

      PDUz = ( Sz + 32 - Margin ) bytes

   The PDU exceptions are TRILL Hello PDUs, which MUST NOT exceed this
   limit assuming an Sz of 1470 bytes, and MTU-probe and MTU-ack PDUs,
   which are padded, depending on the size Tz being tested, to
   ( Tz + 32 - Margin ) bytes.

   Sz does not limit TRILL Data frames. They are only limited by the MTU
   of the RBridges and links that they actually pass through; however,
   links that can accommodate IS-IS PDUs up to Sz should accommodate,
   with a reasonable safety margin, TRILL Data frame payloads, starting
   after the Inner.VLAN and ending just before the FCS, of
   ( Sz + 10 - Margin ) bytes. Most modern Ethernet equipment has ample
   headroom for frames with extensive headers and is sometimes
   engineered to accommodate 9K byte jumbo frames.

6. Port Modes

   Section 4.9.1 of [RFC6325] specified four mode bits for RBridge ports
   but may not be completely clear on the effects of various
   combinations of bits.

   The table below explicitly indicates the effect of all possible
   combinations of the port mode bits. "*" in one of the first four
   columns indicates that the bit can be either zero or one. The
   following columns indicate allowed frame types. The Disable bit
   normally disables all frames but, as an implementation choice, some
   or all low level Layer 2 control frames (as specified in [RFC6325]
   Section 1.4) can still be sent or received.

      +-+-+-+-+--------+-------+-----+-----+-----+
      |D| | | |        |       |     |     |     |
      |i| |A| |        |       |TRILL|     |     |
      |s| |c|T|        |       |Data |     |     |
      |a| |c|r|        |       |     |     |     |
      |b|P|e|u|        |native | LSP |     |     |
      |l|2|s|n|Layer 2 |ingress| SNP |TRILL| P2P |
      |e|P|s|k|Control |egress | MTU |Hello|Hello|
      +-+-+-+-+--------+-------+-----+-----+-----+
      |0|0|0|0|  Yes   |  Yes  | Yes | Yes | No  |
      +-+-+-+-+--------+-------+-----+-----+-----+
      |0|0|0|1|  Yes   |  No   | Yes | Yes | No  |
      +-+-+-+-+--------+-------+-----+-----+-----+
      |0|0|1|0|  Yes   |  Yes  | No  | Yes | No  |
      +-+-+-+-+--------+-------+-----+-----+-----+
      |0|0|1|1|  Yes   |  No   | No  | Yes | No  |
      +-+-+-+-+--------+-------+-----+-----+-----+
      |0|1|0|*|  Yes   |  No   | Yes | No  | Yes |
      +-+-+-+-+--------+-------+-----+-----+-----+
      |0|1|1|*|  Yes   |  No   | No  | No  | Yes |
      +-+-+-+-+--------+-------+-----+-----+-----+
      |1|*|*|*|Optional|  No   | No  | No  | No  |
      +-+-+-+-+--------+-------+-----+-----+-----+

7. The CFI / DEI Bit

   In May 2011, the IEEE promulgated [802.1Q-2011] which changes the
   meaning of the bit between the priority and VLAN ID bits in the
   payload of C-VLAN tags. Previously this bit was called the CFI
   (Canonical Format Indicator) bit and had a special meaning in
   connection with IEEE 802.5 (Token Ring) frames. Now, under
   [802.1Q-2011], it is a DEI (Drop Eligibility Indicator) bit, similar
   to that bit in S-VLAN (also known as B-VLAN) tags where this bit has
   always been a DEI bit.

   The TRILL base protocol specification [RFC6325] assumed, in effect,
   that the link by which end stations are connected to TRILL Switches
   and the virtual link provided by the TRILL Data frame are IEEE 802.3
   Ethernet links on which the CFI bit is always zero.
   Should an end station be attached by some other type of link, such
   as a Token Ring link, [RFC6325] implicitly assumed that such frames
   would be canonicalized to 802.3 frames before being ingressed and
   similarly, on egress, such frames would be converted from 802.3 to
   the appropriate frame type for the link. Thus, [RFC6325] required
   that the CFI bit in the Inner.VLAN always be zero.

   However, for TRILL Switches with ports conforming to the change
   incorporated in the IEEE 802.1Q-2011 standard, the bit in the
   Inner.VLAN, now a DEI bit, MUST be set to the DEI value provided by
   the EISS interface on ingressing a native frame. Similarly, this bit
   MUST be provided to the EISS when transiting or egressing a TRILL
   Data frame. As with the 3-bit priority field, the DEI bit to use in
   forwarding a transit frame MUST be taken from the Inner.VLAN. The
   exact effect on the Outer.VLAN DEI and priority bits and whether or
   not an Outer.VLAN appears at all on the wire for output frames
   depends on output port configuration.

   TRILL Switch campuses with a mixture of ports, some compliant with
   [802.1Q-2011] and some compliant with pre-802.1Q-2011 standards,
   especially if they have actual Token Ring links, may operate
   incorrectly and may corrupt data, just as a bridged LAN with such
   mixed bridges and ports would.

8. Graceful Restart

   TRILL Switches SHOULD support the features specified in [RFC5306]
   which describes a mechanism for a restarting IS-IS router to signal
   to its neighbors that it is restarting, allowing them to reestablish
   their adjacencies without cycling through the down state, while still
   correctly initiating link state database synchronization.

9. Some Updates to RFC 6327

   [RFC6327] provides for multiple states of the potential adjacency
   between two TRILL Switches. It makes clear that only an adjacency in
   the "Report" state is reported in LSPs.
   LSP synchronization (LSP and SNP transmission and receipt), however,
   is performed if and only if there is at least one adjacency on the
   link in the "Two-Way" or "Report" state.

   To support the PORT-TRILL-VER sub-TLV specified in [RFC6326bis], the
   following updates are made to [RFC6327]:

   1. The paragraph immediately before the 3.2 section header is
      modified by adding "TRILL-PORT-VER sub-TLV [RFC6326bis] if
      included" to those items which MUST be the same in all TRILL
      Hellos sent out the same RBridge port regardless of the VLAN on
      which they are sent but can occasionally change.

   2. In Section 3.2, the state entry for each adjacency is expanded
      to include the 5 bytes of data from the TRILL-PORT-VER received
      in the most recent TRILL Hello from the remote RBridge.

   3. In Section 3.3, a bullet item as follows is added to the bullet
      items after the event descriptions: "The five bytes of
      TRILL-PORT-VER data are set from that sub-TLV in the Hello or set
      to zero if that sub-TLV does not occur in the Hello."

   4. In the first part of Section 4, a bullet item is added to the
      list as follows: "The five bytes of TRILL-PORT-VER sub-TLV data
      used in TRILL Hellos sent on the port."

10. Updates on Appointed Forwarders and Inhibition

   An optional method of Hello reduction is specified in Section 10.1
   below and a recommendation on forwarder appointments in the face of
   overload is given in Section 10.2.

10.1 Optional TRILL Hello Reduction

   If a network manager has sufficient confidence that they know the
   configuration of bridges, ports, and the like, within a link, they
   may be able to reduce the number of TRILL Hellos sent on that link;
   for example, if all RBridges on the link will see all Hellos
   regardless of VLAN constraints, Hellos could be sent on fewer VLANs.
   However, because adjacencies are established in the Designated VLAN,
   an RBridge MUST always attempt to send Hellos in the Designated VLAN.
   Hello reduction makes TRILL less robust in the face of partitioned
   VLANs or disagreement over the Designated VLAN or the like in a link;
   however, as long as all RBridge ports on the link are configured for
   the same desired Designated VLAN, can see each other's frames in that
   VLAN, and utilize the mechanisms specified below to update VLAN
   inhibition timers, operations will be safe. (These considerations do
   not arise on links between RBridges that are configured as
   point-to-point since, in that case, each RBridge sends point-to-point
   Hellos, other TRILL IS-IS PDUs, and TRILL Data frames only in what it
   believes to be the Designated VLAN of the link and no native frame
   end station service is provided.)

   The provision for a configurable set of "Announcing VLANs", as
   described in Section 4.6.3 of [RFC6325], provides a mechanism in the
   TRILL base protocol for a reduction in TRILL Hellos.

   To maintain loop safety in the face of occasional lost frames,
   RBridge failures, link failures, new RBridges coming up on a link,
   and the like, the inhibition mechanism specified in [RFC6439] is
   still required. Under Section 3 of [RFC6439], a VLAN inhibition timer
   can only be set by the receipt of a Hello sent or received in that
   VLAN. Thus, safely sending a reduced number of TRILL Hellos on a
   reduced number of VLANs requires additional mechanisms to set the
   VLAN inhibition timers at an RBridge, thus extending Section 3, Item
   4, of [RFC6439]. Two such mechanisms are specified below. Support for
   both of these mechanisms is indicated by a capability bit in the
   TRILL-PORT-VER sub-TLV (see Section 9 above and [RFC6326bis]).
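
   The two rules that the mechanisms in this section rely on, the
   capability check on non-Down adjacencies and the "maximum of current
   value and Holding Time" timer update, can be sketched in Python as
   follows. This is a minimal illustration under stated assumptions:
   the Adjacency class, its field names, the function names, and the
   representation of timers as remaining seconds keyed by VLAN ID are
   all hypothetical and are not defined by [RFC6439] or [RFC6326bis].

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass
class Adjacency:
    """Hypothetical per-neighbor record kept for one port."""
    state: str                         # e.g. "Down", "Two-Way", "Report"
    supports_inhibition_subtlvs: bool  # assumed decoding of the capability bit


def safe_to_reduce_hellos(adjacencies: Iterable[Adjacency]) -> bool:
    """True if every adjacency not in the Down state advertises support.

    This checks only the capability condition; the mechanisms must
    also actually be used on the link.
    """
    return all(a.supports_inhibition_subtlvs
               for a in adjacencies if a.state != "Down")


def refresh_inhibition_timers(timers: Dict[int, float],
                              appointed_vlans: Iterable[int],
                              holding_time: float) -> None:
    """Raise each listed VLAN's timer to max(current, Holding Time).

    A timer is never shortened by this update. VLANs with no running
    timer are treated as having a current value of zero (an assumption).
    """
    for vlan in appointed_vlans:
        timers[vlan] = max(timers.get(vlan, 0.0), holding_time)
```

   Here appointed_vlans stands for the VLANs listed in a received
   Appointed Forwarders or VLANs Appointed sub-TLV, and holding_time
   for the Holding Time of the Hello that carried it.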
   Unless all adjacencies out a port that are not in the Down state
   indicate support of these mechanisms and the mechanisms are used, it
   may be unsafe to reduce the VLANs on which TRILL Hellos are sent to
   fewer VLANs than recommended in [RFC6325].

   1. An RBridge RB2 MAY include in any TRILL Hello an Appointed
      Forwarders sub-TLV [RFC6326bis] appointing itself for one or more
      ranges of VLANs. The Appointee Nickname field(s) in the Appointed
      Forwarders sub-TLV MUST be the same as the Sender Nickname in the
      Special VLANs and Flags sub-TLV in the TRILL Hello. This indicates
      the sending RBridge believes it is Appointed Forwarder for those
      VLANs. An RBridge receiving such a sub-TLV sets each of its VLAN
      inhibition timers for every VLAN in the block or blocks listed in
      the Appointed Forwarders sub-TLV to the maximum of its current
      value and the Holding Time of the Hello containing the sub-TLV.
      This is backwards compatible because such sub-TLVs will have no
      effect on any receiving RBridge not implementing this mechanism
      unless RB2 is the DRB sending a Hello on the Designated VLAN, in
      which case, as specified in [RFC6439], RB2 MUST include in the
      Hello all forwarder appointments, if any, for RBridges other than
      itself on the link.

   2. An RBridge MAY use the new VLANs Appointed sub-TLV [RFC6326bis].
      When RB1 receives a VLANs Appointed sub-TLV in a TRILL Hello from
      RB2 on any VLAN, RB1 updates the VLAN inhibition timers for all
      the VLANs that RB2 lists in that sub-TLV as VLANs for which RB2 is
      Appointed Forwarder. Each such timer is updated to the maximum of
      its current value and the Holding Time of the TRILL Hello
      containing the VLANs Appointed sub-TLV. This sub-TLV will be an
      unknown sub-TLV to RBridges not implementing it and such RBridges
      will ignore it.
      Even in a TRILL Hello sent by the DRB on the Designated VLAN, one
      or more VLANs Appointed sub-TLVs may be included and, as long as
      no Appointed Forwarders sub-TLVs appear, the Hello is not required
      to indicate all forwarder appointments.

   Two different encodings are provided above to optimize the listing of
   VLANs. Large blocks of contiguous VLANs are more efficiently encoded
   with the Appointed Forwarders sub-TLV and scattered VLANs are more
   efficiently encoded with the VLANs Appointed sub-TLV. These encodings
   may be mixed in the same Hello and the use of these sub-TLVs does not
   affect the requirement that the "AF" bit in the Special VLANs and
   Flags sub-TLV MUST be set if the originating RBridge believes it is
   Appointed Forwarder for the VLAN in which the Hello is sent. If the
   above mechanisms are used on a link, then each RBridge on the link
   MUST send Hellos in one or more VLANs with such VLANs Appointed
   sub-TLV(s) and/or self-appointment Appointed Forwarders sub-TLV(s)
   and the "AF" bit appropriately set such that no VLAN inhibition timer
   will improperly expire unless three or more Hellos are lost. For
   example, an RBridge could announce all VLANs for which it believes it
   is Appointed Forwarder in a Hello sent on the Designated VLAN three
   times per Holding Time.

10.2 Overload and Appointed Forwarders

   An RBridge in overload (see Section 2) will, in general, do a poorer
   job of ingressing and forwarding frames than an RBridge not in
   overload that has full knowledge of the campus topology. For example,
   an overloaded RBridge may not be able to distribute multi-destination
   TRILL Data frames at all.

   Therefore, the DRB SHOULD NOT appoint an RBridge in overload as
   Appointed Forwarder for a VLAN unless there is no alternative.
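
   The preference just stated can be pictured with a short Python
   sketch. It is an illustration only: the Candidate type, the
   pick_forwarder function, and the idea that the DRB holds an ordered
   list of candidates are assumptions of this example, and it does not
   model the counter-example discussed later in this section.

```python
from typing import NamedTuple, Optional, Sequence


class Candidate(NamedTuple):
    """Hypothetical view the DRB has of an RBridge on the link."""
    nickname: int
    overloaded: bool


def pick_forwarder(candidates: Sequence[Candidate]) -> Optional[Candidate]:
    """Return the first non-overloaded candidate, in the DRB's own order.

    An overloaded RBridge is chosen only when there is no alternative.
    Returns None when there are no candidates at all.
    """
    for candidate in candidates:
        if not candidate.overloaded:
            return candidate
    # No alternative: fall back to the first (overloaded) candidate.
    return candidates[0] if candidates else None
```

   The order of the candidates list stands in for whatever local policy
   the DRB otherwise applies; that policy is outside the scope of this
   sketch.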
   Furthermore, if an Appointed Forwarder becomes overloaded, the DRB
   SHOULD re-assign VLANs from the overloaded RBridge to another
   RBridge on the link that is not overloaded, if one is available.

   A counter-example would be if all end stations in VLAN-x were on
   links attached to RB1 via ports where VLAN-x was enabled. In such a
   case, RB1 SHOULD be made the VLAN-x Appointed Forwarder on all such
   links even if RB1 is overloaded.

11. IANA Considerations

   The following IANA actions are required:

   1. The previously reserved nickname 0xTBD [0xFFC1 suggested] is
      allocated for use in the TRILL Header egress nickname field to
      indicate an Overload Originated Multi-destination Frame (OOMF).

   2. Bit 1 from the seven previously reserved (RESV) bits in the per
      neighbor "Neighbor RECORD" in the TRILL Neighbor TLV [RFC6326bis]
      is allocated to indicate that the RBridge sending the TRILL Hello
      volunteers to provide the OOMF forwarding service described in
      Section 2.4.2 to such frames originated by the TRILL Switch whose
      SNPA (MAC address) appears in that Neighbor RECORD.

   3. Bit 0 is allocated from the Capability bits in the TRILL-PORT-VER
      sub-TLV [RFC6326bis] to indicate support of the VLANs Appointed
      sub-TLV [RFC6326bis] and the VLAN inhibition setting mechanisms
      specified in Section 10.

12. Security Considerations

   This memo improves the documentation of the TRILL protocol, corrects
   some errors in [RFC6325], and updates [RFC6325], [RFC6327], and
   [RFC6439]. It does not change the security considerations of these
   RFCs.

Acknowledgements

   The contributions of the following persons are gratefully
   acknowledged:

      Somnath Chatterjee, Weiguo Hao, Rakesh Kumar, Yizhou Li, Radia
      Perlman

   This document was produced with raw nroff. All macros used were
   defined in the source file.

Normative References

   [802.1Q-2011] - IEEE 802.1, "IEEE Standard for Local and metropolitan
         area networks - Virtual Bridged Local Area Networks", IEEE Std
         802.1Q-2011, May 2011.

   [IS-IS] - ISO/IEC 10589:2002, Second Edition, "Intermediate System to
         Intermediate System Intra-Domain Routeing Exchange Protocol for
         use in Conjunction with the Protocol for Providing the
         Connectionless-mode Network Service (ISO 8473)", 2002.

   [RFC1195] - Callon, R., "Use of OSI IS-IS for routing in TCP/IP and
         dual environments", RFC 1195, December 1990.

   [RFC2119] - Bradner, S., "Key words for use in RFCs to Indicate
         Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC5306] - Shand, M. and L. Ginsberg, "Restart Signaling for IS-IS",
         RFC 5306, October 2008.

   [RFC6325] - Perlman, R., Eastlake 3rd, D., Dutt, D., Gai, S., and A.
         Ghanwani, "Routing Bridges (RBridges): Base Protocol
         Specification", RFC 6325, July 2011.

   [RFC6327] - Eastlake 3rd, D., Perlman, R., Ghanwani, A., Dutt, D.,
         and V. Manral, "Routing Bridges (RBridges): Adjacency", RFC
         6327, July 2011.

   [RFC6439] - Perlman, R., Eastlake, D., Li, Y., Banerjee, A., and F.
         Hu, "Routing Bridges (RBridges): Appointed Forwarders", RFC
         6439, November 2011.

   [RFC6326bis] - Eastlake, D., Banerjee, A., Dutt, D., Perlman, R., and
         A. Ghanwani, draft-eastlake-isis-rfc6326bis, work in progress.

Informative References

   [802] - IEEE 802, "IEEE Standard for Local and metropolitan area
         networks: Overview and Architecture", IEEE Std 802.1-2001, 8
         March 2002.

   [Channel] - draft-ietf-trill-rbridge-channel, work in progress.

   [RFCXXXX] - H. Zhai, F. Hu, R. Perlman, D. Eastlake, "RBridges: The
         ESADI Protocol", draft-hu-trill-rbridge-esadi, work in
         progress.

Authors' Addresses

   Donald Eastlake
   Huawei Technologies
   155 Beaver Street
   Milford, MA 01757 USA

   Phone: +1-508-333-2270
   Email: d3e3e3@gmail.com

   Mingui Zhang
   Huawei Technologies Co., Ltd
   Huawei Building, No.156 Beiqing Rd.
   Z-park, Shi-Chuang-Ke-Ji-Shi-Fan-Yuan, Hai-Dian District,
   Beijing 100095 P.R. China

   Email: zhangmingui@huawei.com

   Anoop Ghanwani
   Dell
   350 Holger Way
   San Jose, CA 95134 USA

   Phone: +1-408-571-3500
   Email: anoop@alumni.duke.edu

   Ayan Banerjee
   Cisco Systems
   170 West Tasman Drive
   San Jose, CA 95134 USA

   Tel.: +1-408-527-0539
   Email: ayabaner@cisco.com

   Vishwas Manral
   HP Networking
   19111 Pruneridge Avenue
   Cupertino, CA 95014 USA

   Tel: +1-408-477-0000
   Email: vishwas.manral@hp.com

Appendix: Change Record

   This appendix summarizes changes between versions of this draft.

   RFC Editor: Please delete this Appendix before publication.

   From -00 to -01

   1. Add Section updating [RFC6327].

   2. Add some material to Section 5.2 on MTUs.

   3. Minor editorial changes.

   From -01 to -02

   1. Add section 1.1 on Precedence.

   2. Add section 3.1 to fix "maximum" typo in 4.5.2, point 2, on number
      of distribution trees.

   3. Fix point 2 in section 4 on nickname selection.

   4. Section 5 on MTU re-organized and substantial material added.

   5. Section 6 on Port Modes added.

   6. Add section 10 updating matters related to Appointed Forwarders.

   7. Add Acknowledgement section.

   8. Update References.

   9. Update Author Info.

   10. Assorted editorial changes.

   From -02 to -03

   Minor editorial change.

Copyright and IPR Provisions

   Copyright (c) 2012 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License. The definitive version of
   an IETF Document is that published by, or under the auspices of, the
   IETF. Versions of IETF Documents that are published by third parties,
   including those that are translated into other languages, should not
   be considered to be definitive versions of IETF Documents. The
   definitive version of these Legal Provisions is that published by, or
   under the auspices of, the IETF. Versions of these Legal Provisions
   that are published by third parties, including those that are
   translated into other languages, should not be considered to be
   definitive versions of these Legal Provisions. For the avoidance of
   doubt, each Contributor to the IETF Standards Process licenses each
   Contribution that he or she makes as part of the IETF Standards
   Process to the IETF Trust pursuant to the provisions of RFC 5378. No
   language to the contrary, or terms, conditions or rights that differ
   from or are inconsistent with the rights and licenses granted under
   RFC 5378, shall have any effect and shall be null and void, whether
   published or posted by such Contributor, or included with or in such
   Contribution.