idnits 2.17.1

draft-tissa-trill-cmt-01.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** There are 3 instances of too long lines in the document, the longest one
     being 36 characters in excess of 72.

  -- The draft header indicates that this document updates RFC6325, but the
     abstract doesn't seem to mention this, which it should.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

     (Using the creation date from RFC6325, updated by this document, for
     RFC5378 checks: 2006-05-11)

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer.
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (April 13, 2012) is 4388 days in the past.  Is this
     intentional?

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Missing Reference: 'RFC6439' is mentioned on line 135, but not defined

  ** Obsolete undefined reference: RFC 6439 (Obsoleted by RFC 8139)

  == Unused Reference: 'RFC6349' is defined on line 504, but no explicit
     reference was found in the text

  == Unused Reference: 'RFC6165' is defined on line 513, but no explicit
     reference was found in the text

  == Unused Reference: 'RFC4971' is defined on line 516, but no explicit
     reference was found in the text

  == Unused Reference: 'TRILLPN' is defined on line 520, but no explicit
     reference was found in the text

  ** Obsolete normative reference: RFC 6327 (Obsoleted by RFC 7177)

  ** Downref: Normative reference to an Informational RFC: RFC 6349

  == Outdated reference: A later version (-09) exists of
     draft-eastlake-isis-rfc6326bis-02

  -- Obsolete informational reference (is this intentional?): RFC 4971
     (Obsoleted by RFC 7981)

  -- No information found for draft-hu-trill-psuedonode-nickname - is the name
     correct?

     Summary: 4 errors (**), 0 flaws (~~), 7 warnings (==), 5 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

TRILL Working Group                                   Tissa Senevirathne
Internet Draft                                                     CISCO
Intended status: Standards Track                    Janardhanan Pathangi
Updates: 6325                                                       DELL
                                                              Jon Hudson
                                                                 Brocade

                                                          April 13, 2012
Expires: October 2012


              Coordinated Multicast Trees (CMT) for TRILL
                     draft-tissa-trill-cmt-01.txt

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.
   Note that other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.  It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

   This Internet-Draft will expire on October 13, 2012.

Copyright Notice

   Copyright (c) 2012 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document.

Abstract

   TRILL facilitates loop-free connectivity to non-TRILL legacy
   networks by choosing an Appointed Forwarder for a set of VLANs.  The
   Appointed Forwarder provides VLAN-level load sharing with an
   active-standby model.  Mission-critical operations such as High
   Performance Data Centers require an active-active load-sharing
   model.  An active-active load-sharing model can be accomplished by
   representing any given non-TRILL legacy network with a single
   virtual RBridge.  Representing the non-TRILL legacy network with a
   single virtual RBridge poses serious challenges in
   multi-destination Reverse Path Forwarding (RPF) calculations.  This
   document presents the enhancements required to build Coordinated
   Multicast Trees (CMT) within the TRILL campus to solve the related
   RPF issues.  CMT gives RBridges the flexibility to select the
   desired path of association to a given distribution tree.

Table of Contents

   1. Introduction
      1.1. Scope and Applicability
      1.2. Contributors
   2. Conventions used in this document
   3. AFFINITY TLV
   4. Multicast Tree Construction and use of Affinity Sub-TLV
      4.1. Update to RFC 6325
      4.2. Announcing virtual RBridge nickname
      4.3. Affinity Sub-TLV capability
   5. Theory of operation
      5.1. Distribution Tree provisioning
      5.2. Affinity Sub-TLV advertisement
      5.3. Affinity sub-TLV conflict resolution
      5.4. Ingress Multi-Destination Forwarding
         5.4.1. Forwarding when n < k
      5.5. Egress Multi-Destination Forwarding
         5.5.1. Traffic Arriving on an assigned Tree to RBk-RBv
         5.5.2. Traffic Arriving on other Trees
      5.6. Failure scenarios
         5.6.1. Edge RBridge RBk failure
      5.7. Backward compatibility
   6. Security Considerations
   7. IANA Considerations
   8. References
      8.1. Normative References
      8.2. Informative References
   9. Acknowledgments
   10. Authors' Addresses

1. Introduction

   TRILL, presented in [RFC6325] and other related documents, provides
   methods of utilizing all available paths for active forwarding with
   minimum configuration.  TRILL utilizes IS-IS as its control plane
   and encapsulates native frames with a TRILL header.

   Legacy networks utilize the IEEE 802.1D Spanning Tree Protocol as
   the control protocol and, at any given time, use a single path among
   all available paths for active forwarding.  Legacy networks forward
   frames in "native" format.

   [RFC6325], [RFC6327], and [RFC6439] provide methods for
   interoperability between TRILL and Legacy networks.  [RFC6439]
   provides an active-standby solution, where only one of the RBridges
   is in the active forwarding state for any given VLAN.  The RBridge
   in the active forwarding state for a given VLAN is referred to as
   the Appointed Forwarder (AF).  All frames ingressing into a TRILL
   network via the Appointed Forwarder are encapsulated with a TRILL
   header carrying a nickname held by the ingress AF RBridge.  Due to
   failures, reconfigurations, and other network dynamics, the
   Appointed Forwarder for any set of VLANs may change.  RBridges
   maintain forwarding tables that contain destination MAC address to
   egress RBridge bindings.  In the event of an AF change, remote
   RBridges may continue to forward traffic to the previous AF based on
   their existing forwarding tables, and that traffic may be discarded
   at the egress, causing traffic disruption.

   Mission-critical applications such as High Performance Data Centers
   require resiliency during failover.  The active-active forwarding
   model minimizes impact during failures and maximizes the available
   network bandwidth.  A typical deployment scenario, depicted in
   Figure 1, may have End Stations and/or Legacy bridges attached to
   the RBridges.  These Legacy devices typically are multi-homed to
   several RBridges and treat all of the uplinks as a single Link
   Aggregation (LAG) bundle.
   The Appointed Forwarder designation presented in [RFC6439] requires
   each of the edge RBridges to exchange TRILL Hello packets.  By
   design, a LAG does not forward packets received on one of its member
   ports to other member ports of the same LAG.  As a result, the AF
   designation methods presented in [RFC6439] cannot be applied to the
   deployment scenario depicted in Figure 1.

   An active-active load-sharing model can be implemented by
   representing the edge of the network connected to a specific group
   of RBridges by a single virtual RBridge.  In addition to the
   active-active forwarding model, there may be other applications that
   require similar representations.

   Sections 4.5.1 and 4.5.2 of [RFC6325] specify the distribution tree
   calculation and Reverse Path Forwarding Check calculation algorithms
   for multi-destination forwarding.  The algorithms specified in
   [RFC6325] depend strictly on link cost and parent RBridge priority.
   As a result, depending on the network topology, a given edge RBridge
   that is forwarding on behalf of the virtual RBridge may not have a
   candidate multicast tree on which it can forward traffic, because
   there is no tree for which the virtual RBridge is a leaf node from
   the edge RBridge.

   In this document we present a method that allows RBridges to specify
   the path of association to distribution trees.  Remote RBridges
   calculate the SPF and derive the RPF for distribution trees based on
   the distribution tree association advertisements.  In the absence of
   distribution tree association advertisements, remote RBridges derive
   the SPF based on the algorithm specified in Section 4.5.1 of
   [RFC6325].
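
   As a non-normative illustration of the preceding paragraph, the
   sketch below shows how a remote RBridge might choose the parent of a
   child nickname in a given distribution tree: an advertised
   association takes precedence, and otherwise the parent produced by
   the calculation in Section 4.5.1 of [RFC6325] is used.  The function
   name, the dictionary layout, and the sample nicknames are
   assumptions made for this example; none of them is defined by this
   document, and the check that all RBridges support the capability is
   omitted.

```python
def select_parent(tree, child, affinity_ads, spf_parent):
    """Return the parent nickname to use for `child` in tree `tree`.

    affinity_ads: hypothetical dict mapping (tree number, child nickname)
                  to the nickname of the RBridge that advertised the
                  association.
    spf_parent:   parent obtained from the default tree calculation
                  (Section 4.5.1 of RFC 6325), computed elsewhere.
    """
    # An advertised association overrides the default calculation;
    # without one, the default parent is kept.
    return affinity_ads.get((tree, child), spf_parent)


# Example: RB1 advertised that it is the parent of RBv in tree 1 only.
ads = {(1, "RBv"): "RB1"}
print(select_parent(1, "RBv", ads, spf_parent="RB2"))  # association used
print(select_parent(2, "RBv", ads, spf_parent="RB2"))  # default parent used
```

   Keeping the default parent as an explicit argument reflects that the
   association only overrides one step of the tree calculation and does
   not replace the calculation as a whole.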

   Other applications, besides the above-mentioned active-active
   forwarding model, may utilize the distribution tree association
   framework presented in this document to associate to distribution
   trees through a preferred path.

   This proposal requires the presence of multiple multi-destination
   trees within the TRILL campus and requires updating all the RBridges
   in the network to support the new Affinity sub-TLV.  It is expected
   that both of these requirements will be met, as they are control
   plane changes, and that this will be a common deployment scenario.
   In case either of the above two conditions is not met, RBridges MUST
   support a fallback option for interoperability.  Since the fallback
   is expected to be a temporary phenomenon until all RBridges are
   upgraded, this proposal gives guidelines for such fallbacks and does
   not mandate or specify any specific set of fallback options.

1.1. Scope and Applicability

   This document introduces the Affinity sub-TLV to solve the
   associated RPF issues at the active-active edge.  The specific
   methods in this document for making use of the Affinity sub-TLV are
   applicable where multiple RBridges are connected to an edge device
   through link aggregation, to a multiport server, or through some
   similar arrangement where the RBridges cannot see each other's
   Hellos.

   This document DOES NOT provide the other operational elements
   required to implement an active-active edge solution, such as
   methods of link aggregation.  Solution-specific operational elements
   are outside the scope of this document and will be covered in
   solution-specific documents.

   Examples provided in this document are for illustration purposes
   only.

1.2. Contributors

   The work in this document is the result of many passionate
   discussions and contributions from the following individuals.
   Their names are listed in alphabetical order:

   Ayan Banerjee, Dinesh Dutt, Donald Eastlake, Mingui Zhang, Radia
   Perlman, Sam Aldrin, Shivakumar Sundaram, Zhai Hongjun.

2. Conventions used in this document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

   In this document, these words will appear with that interpretation
   only when in ALL CAPS.  Lower case uses of these words are not to be
   interpreted as carrying [RFC2119] significance.

3. AFFINITY TLV

   Association of an RBridge to a multicast tree through a specific
   path is accomplished by using a new IS-IS sub-TLV, the Affinity
   sub-TLV.

   The Affinity sub-TLV is a sub-TLV under the Router Capability TLV
   (242) [RFC4971].  Section 2.3.10 of [RFC6326bis] formally specifies
   the code point and data structure for the Affinity sub-TLV.

4. Multicast Tree Construction and use of Affinity Sub-TLV

   Figure 1 and Figure 2 below show the reference topology and a
   logical topology using CMT to provide active-active service.

                ------------------
               /                  \
              |                    |
              |    TRILL Campus    |
              |                    |
               \                  /
              ---------------------
               |       |   |
               |       |   | ----------
            DRB|       |              |
            +------+ +------+      +------+
            |      | |      |      |      |
            |(RB1) | |(RB2) |      | (RBk)|
            +------+ +------+      +------+
               |       |              |
               |       |    ----------- ----
               |       |    | ------------- |
           -----       |    | ----------- | |
  LAG---->(| | | )                     (| | |) LAG
          +-------+       . . .       +-------+
          |  CE1  |                   |  CEn  |
          |       |                   |       |
          +-------+                   +-------+

                    Figure 1 Reference Topology

                ------------------          Sample Multicast Tree (T1)
               /                  \
              |                    |                    |
              |    TRILL Campus    |                    o RBn
              |                    |                  / | \
               \                  /                 /   |  ---\
              ---------------------            RB1o     o      o
               |       |   |                      |    RB2    RBk
               |       |   | ----------           |
            DRB|       |              |           oRBv
            +------+ +------+      +------+
            |      | |      |      |      |
            |(RB1) | |(RB2) |      | (RBk)|
            +------+ +------+      +------+
            ooo|ooooooo|oooooooooooooo|oo
            o                           o
            o    virtual RBridge RBv    o
            ooo|ooooooo|ooo|ooooooooooooo ----
               |       |   | ----------     |
           -----       |   | ----------   | |
  LAG---->(| | | )                     (| | | ) LAG
          +-------+       . . .       +-------+
          |  CE1  |                   |  CEn  |
          |       |                   |       |
          +-------+                   +-------+

                 Figure 2 Example Logical Topology

4.1. Update to RFC 6325

   Section 4.5.1 of [RFC6325] is updated as below:

   Each RBridge that desires to be the parent RBridge of child RBridge
   RBy in a specific multi-destination distribution tree x announces
   the desired association through an Affinity sub-TLV.  The child
   RBridge RBy is specified by its nickname (or one of its nicknames if
   it holds more than one).

   When such an Affinity sub-TLV is present, the association specified
   by the Affinity sub-TLV MUST be used when constructing the SPF tree.
   In the absence of such an Affinity sub-TLV, or if there are RBridges
   in the network that do not support the Affinity sub-TLV, the SPF
   tree is calculated as specified in Section 4.5.1 of [RFC6325].
   Section 4.3 below explains how to identify RBridges that support the
   Affinity sub-TLV capability.

4.2. Announcing virtual RBridge nickname

   Each edge RBridge RB1 to RBk advertises the virtual RBridge nickname
   RBv using the Nickname sub-TLV (6) [RFC6326bis], along with its
   regular nickname or nicknames.

4.3. Affinity Sub-TLV capability

   RBridges that announce the TRILL Version sub-TLV [RFC6326bis] and
   set the Affinity capability bit (Section 7) support the Affinity
   sub-TLV and the calculation of multi-destination distribution trees
   as specified herein.

5. Theory of operation

5.1. Distribution Tree provisioning

   Let's assume there are n distribution trees and k edge RBridges in
   the edge group of interest.

   If n >= k

      Let's assume edge RBridges are sorted in numerically ascending
      order by System ID such that RB1 < RB2 < RBk.  Each RBridge in
      the numerically sorted list is assigned a monotonically
      increasing number j such that RB1=0, RB2=1, RBi=j, and
      RBi+1=j+1.

      Assign each tree to RBi such that tree number
      {(tree_number) % k}+1 is assigned to RBridge i for tree_number
      from 1 to n, where n is the number of trees and k is the number
      of RBridges considered for tree allocation.

   If n < k

      Distribution trees are assigned to RBridges RB1 to RBn, using the
      same algorithm as in the n >= k case.  RBridges RBn+1 to RBk do
      not participate in the active-active forwarding process on behalf
      of RBv.

5.2. Affinity Sub-TLV advertisement

   Each RBridge in the RB1..RBk domain advertises an Affinity sub-TLV
   on behalf of RBv.

   As an example, let's assume that RB1 has chosen trees t1 and tk+1 on
   behalf of RBv.

   RB1 advertises the Affinity sub-TLV {RBv, Num of Trees=2, t1, tk+1}.

   Other RBridges in the RB1..RBk edge group follow the same procedure.

5.3. Affinity sub-TLV conflict resolution

   If different RBridges advertise Affinity sub-TLVs that try to
   associate the same virtual RBridge as their child in the same tree
   or trees, those Affinity sub-TLVs are in conflict for those trees.
   The nicknames of the conflicting RBridges are compared to identify
   which RBridge holds the nickname that has the highest priority to be
   a tree root, with the System ID as the tie breaker.

   The RBridge with the highest priority to be a tree root will retain
   the Affinity association.
   Other RBridges with lower priority to be a tree root MUST stop
   advertising their conflicting Affinity sub-TLV, re-calculate the
   multicast tree affinity allocation, and, if appropriate, advertise a
   new non-conflicting Affinity sub-TLV.

   Similarly, remote RBridges MUST honor the Affinity sub-TLV from the
   RBridge with the highest priority to be a tree root and ignore the
   conflicting Affinity sub-TLV entries advertised by the RBridges with
   lower priorities to be tree roots.

5.4. Ingress Multi-Destination Forwarding

   If there is at least one tree on which RBv has affinity via RBk,
   then RBk performs the following operations for multi-destination
   frames received from a CE node:

   1. Flood to locally attached CE nodes, subject to VLAN and multicast
      pruning.
   2. Encapsulate in a TRILL header and assign RBv (the nickname of the
      virtual RBridge) as the ingress RBridge nickname.
   3. Forward on one of the distribution trees, tree x, in which RBv is
      associated with RBk.

5.4.1. Forwarding when n < k

   If there is no tree on which RBv can claim affinity via RBk
   (probably because the number of trees n built is less than the
   number of RBridges k announcing the Affinity sub-TLV), then RBk MUST
   fall back to one of the following:

   1. This RBridge should stop forwarding frames from the CE nodes and
      should mark its link as passive.  This will prevent CE nodes
      from forwarding data to this RBridge, so that they use only those
      RBridges that have been assigned a tree.

      -OR-

   2. This RBridge tunnels multi-destination frames received from
      attached native devices to an RBridge RBy that has an assigned
      tree.  The tunnel destination should forward the frames to the
      TRILL network and also to its local access links.  (The
      mechanism of tunneling and the handshake between the tunnel
      source and destination are out of scope of this specification
      and may be addressed in future documents.)
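
   As a non-normative illustration, the tree assignment of Section 5.1
   and the ingress decision of Sections 5.4 and 5.4.1 can be sketched
   as follows.  The sketch assumes one particular reading of the
   assignment formula, namely that tree t goes to the RBridge at
   position ((t - 1) % k) + 1 in the sorted list.  This reading is
   chosen because it reproduces the example in Section 5.2 (RB1 holds
   t1 and tk+1) and the n < k rule (only RB1 to RBn receive a tree);
   the formula as worded may admit another reading.  The function
   names, the use of equal-length strings for System IDs, and the
   choice of the first assigned tree are likewise assumptions of this
   example.

```python
def assign_trees(system_ids, n_trees):
    """Map each edge RBridge (keyed by System ID) to its tree numbers.

    system_ids: System IDs of the k edge RBridges in the group, given as
                equal-length strings so that string order matches
                numeric order (an assumption of this sketch).
    n_trees:    number of distribution trees n, numbered 1..n.
    """
    members = sorted(system_ids)  # RB1 < RB2 < ... < RBk
    assignment = {sid: [] for sid in members}
    k = len(members)
    if k == 0:
        return assignment
    for tree in range(1, n_trees + 1):
        # Assumed reading: t1 -> RB1, t2 -> RB2, ..., tk+1 -> RB1 again.
        assignment[members[(tree - 1) % k]].append(tree)
    return assignment


def ingress_action(my_trees):
    """Decide how RBk handles a multi-destination frame from a CE node."""
    if my_trees:
        # Section 5.4: encapsulate with ingress nickname RBv and forward
        # on one of the assigned trees.  Which tree to use is not
        # specified; this sketch simply takes the first one.
        return ("encapsulate-as-RBv", my_trees[0])
    # Section 5.4.1: no tree available, so a fallback option applies.
    return ("fallback", None)


# k = 3 edge RBridges and n = 4 trees: RB1 holds t1 and t4 (= tk+1).
trees = assign_trees(["0003", "0001", "0002"], 4)
print(trees)
print(ingress_action(trees["0001"]))

# n = 2 < k = 3: the third RBridge receives no tree and must fall back.
fewer = assign_trees(["0001", "0002", "0003"], 2)
print(ingress_action(fewer["0003"]))
```

   Because every member sorts the same System IDs and applies the same
   rule, each member can compute the full assignment locally without
   any additional message exchange.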

   The above fallback options may be very specific to the active-active
   forwarding scenario.  However, as stated above, the Affinity sub-TLV
   may be used in other applications.  In such an event, the
   application SHOULD specify applicable fallback options.

5.5. Egress Multi-Destination Forwarding

5.5.1. Traffic Arriving on an assigned Tree to RBk-RBv

   Multi-destination frames arriving at RBk on a tree x, where RBk has
   announced the affinity of RBv via x, MUST be forwarded to CE members
   of RBv.  Forwarding to other end-nodes and RBridges that are not
   part of the network represented by the RBv virtual RBridge MUST
   follow the forwarding rules specified in [RFC6325].

5.5.2. Traffic Arriving on other Trees

   Multi-destination frames arriving at RBk on a tree y, where RBk has
   not announced the affinity of RBv via y, MUST NOT be forwarded to CE
   members of RBv.  Forwarding to other end-nodes and RBridges that are
   not part of the network represented by the RBv virtual RBridge MUST
   follow the forwarding rules specified in [RFC6325].

5.6. Failure scenarios

5.6.1. Edge RBridge RBk failure

   The failure recovery algorithm below is presented only as a
   guideline.  Implementations MAY include other failure recovery
   algorithms.  Details of such algorithms are outside the scope of
   this document.

   Each of the member RBridges of a given virtual RBridge edge group is
   aware of the other member RBridges through configuration or some
   other method.

   Member RBridges detect nodal failure of a member RBridge through
   IS-IS LSP advertisements or the lack thereof.

   Upon detecting a member failure, each of the member RBridges of the
   RBv edge group starts a recovery timer T_rec for the failed RBridge
   RBi.  If the failed RBridge RBi has not recovered by the expiry of
   timer T_rec, the member RBridges perform the distribution tree
   assignment algorithm specified in Section 5.1.
   Each of the member RBridges re-advertises the Affinity sub-TLV with
   the new tree assignment.  This action causes the campus to update
   the tree calculation with the new assignment.

   Upon start-up, RBi starts advertising its presence through IS-IS
   LSPs and starts a timer T_i.  Member RBridges detecting the presence
   of RBi start a timer T_j.  Timer T_j SHOULD be less than T_i/2.
   (Please see the note below.)

   Upon expiry of timer T_j, member RBridges recalculate the
   multi-destination tree assignment and advertise the related trees
   using the Affinity sub-TLV.

   Upon expiry of timer T_i, RBi recalculates the multi-destination
   tree assignment and advertises the related trees using the Affinity
   sub-TLV.

   Note: Timers T_i and T_j are designed so as to minimize traffic
   downtime and avoid multi-destination packet duplication.

5.7. Backward compatibility

   Implementations MUST support a backward compatibility mode to
   interoperate with pre-Affinity sub-TLV RBridges in the network.
   Such backward compatibility operation MAY include, but is not
   limited to, tunneling and/or active-standby modes of operation.

   Example:

   Step 1. Stop using the virtual RBridge nickname for traffic
      ingressing from CE nodes.
   Step 2. Stop performing active-active forwarding and fall back to
      active-standby forwarding, based on locally defined policies.
      The definition of such policies is outside the scope of this
      document and may be addressed in future documents.

6. Security Considerations

   Security considerations are similar to those of RFC 6325, RFC 6326,
   and RFC 6327.  Additional security considerations are being
   discussed.

7. IANA Considerations

   IANA is requested to allocate a capability bit for "Affinity
   Supported" in the TRILL-VER sub-TLV.  The "Affinity Supported"
   capability bit and the Affinity sub-TLV are specified and allocated
   in [RFC6326bis].

8. References

8.1. Normative References

   [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
             Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC6325] Perlman, R., et al., "RBridge: Base Protocol
             Specification", RFC 6325, July 2011.

   [RFC6327] Eastlake, D., et al., "RBridge: Adjacency", RFC 6327,
             July 2011.

   [RFC6439] Eastlake, D., et al., "RBridge: Appointed Forwarder",
             RFC 6439, November 2011.

   [RFC6326bis] Eastlake, D., et al., "Transparent Interconnection of
             Lots of Links (TRILL) Use of IS-IS",
             draft-eastlake-isis-rfc6326bis-02.txt, Work in Progress,
             December 2011.

8.2. Informative References

   [RFC6165] Banerjee, A. and Ward, D., "Extensions to IS-IS for
             Layer-2 Systems", RFC 6165, April 2011.

   [RFC4971] Vasseur, JP., et al., "Intermediate System to
             Intermediate System (IS-IS) Extensions for Advertising
             Router Information", RFC 4971, July 2007.

   [TRILLPN] Zhai, H., et al., "RBridge: Pseudonode Nickname",
             draft-hu-trill-psuedonode-nickname-01, Work in Progress,
             November 2011.

9. Acknowledgments

   The authors wish to extend their appreciation to the individuals who
   volunteered to review and comment on the work presented in this
   document and who provided constructive and critical feedback.
   Specific acknowledgements are due to Anoop Ghanwani, Ronak Desai,
   and Varun Shah.

   This document was prepared using 2-Word-v2.0.template.dot.

10. Authors' Addresses

   Tissa Senevirathne
   Cisco Systems
   375 East Tasman Drive,
   San Jose, CA 95134

   Phone: +1-408-853-2291
   Email: tsenevir@cisco.com

   Janardhanan Pathangi
   Dell/Force10 Networks
   Olympia Technology Park,
   Guindy Chennai 600 032

   Phone: +91 44 4220 8400
   Email: Pathangi_Janardhanan@Dell.com

   Jon Hudson
   Brocade
   130 Holger Way
   San Jose, CA 95134 USA

   Email: jon.hudson@gmail.com