idnits 2.17.1

draft-ietf-trill-cmt-00.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  -- The draft header indicates that this document updates RFC6325, but the
     abstract doesn't seem to mention this, which it should.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

     (Using the creation date from RFC6325, updated by this document, for
     RFC5378 checks: 2006-05-11)

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer.
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (April 17, 2012) is 4390 days in the past.  Is this
     intentional?
  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Unused Reference: 'RFC6165' is defined on line 519, but no explicit
     reference was found in the text

  == Unused Reference: 'RFC4971' is defined on line 522, but no explicit
     reference was found in the text

  == Unused Reference: 'TRILLPN' is defined on line 526, but no explicit
     reference was found in the text

  ** Obsolete normative reference: RFC 6327 (Obsoleted by RFC 7177)

  ** Obsolete normative reference: RFC 6439 (Obsoleted by RFC 8139)

  == Outdated reference: A later version (-09) exists of
     draft-eastlake-isis-rfc6326bis-07

  -- Obsolete informational reference (is this intentional?): RFC 4971
     (Obsoleted by RFC 7981)

  == Outdated reference: A later version (-08) exists of
     draft-hu-trill-pseudonode-nickname-01


     Summary: 2 errors (**), 0 flaws (~~), 6 warnings (==), 4 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

  1  TRILL Working Group                                   Tissa Senevirathne
  2  Internet Draft                                                     CISCO
  3  Intended status: Standards Track                    Janardhanan Pathangi
  4  Updates: 6325                                                       DELL
  5                                                                Jon Hudson
  6                                                                   Brocade

  8  April 17, 2012
  9  Expires: October 2012

 11                Coordinated Multicast Trees (CMT) for TRILL
 12                        draft-ietf-trill-cmt-00.txt

 14  Status of this Memo

 16     This Internet-Draft is submitted in full conformance with the
 17     provisions of BCP 78 and BCP 79.

 19     Internet-Drafts are working documents of the Internet Engineering
 20     Task Force (IETF), its areas, and its working groups. Note that
 21     other groups may also distribute working documents as Internet-
 22     Drafts.

 24     Internet-Drafts are draft documents valid for a maximum of six
 25     months and may be updated, replaced, or obsoleted by other documents
 26     at any time.
        It is inappropriate to use Internet-Drafts as
 27     reference material or to cite them other than as "work in progress."

 29     The list of current Internet-Drafts can be accessed at
 30     http://www.ietf.org/ietf/1id-abstracts.txt

 32     The list of Internet-Draft Shadow Directories can be accessed at
 33     http://www.ietf.org/shadow.html

 35     This Internet-Draft will expire on October 17, 2012.

 37  Copyright Notice

 39     Copyright (c) 2012 IETF Trust and the persons identified as the
 40     document authors. All rights reserved.

 42     This document is subject to BCP 78 and the IETF Trust's Legal
 43     Provisions Relating to IETF Documents
 44     (http://trustee.ietf.org/license-info) in effect on the date of
 45     publication of this document. Please review these documents
 46     carefully, as they describe your rights and restrictions with
 47     respect to this document. Code Components extracted from this
 48     document must include Simplified BSD License text as described in
 49     Section 4.e of the Trust Legal Provisions and are provided without
 50     warranty as described in the Simplified BSD License.

 52  Abstract

 54     TRILL facilitates loop-free connectivity to non-TRILL legacy
 55     networks via choice of an Appointed Forwarder for a set of VLANs.
 56     The Appointed Forwarder provides VLAN-level load sharing with active-
 57     standby model. Mission-critical operations such as High Performance
 58     Data Centers require an active-active load-sharing model. An
 59     active-active load-sharing model can be accomplished by representing
 60     any given non-TRILL legacy network with a single virtual RBridge.
 61     Virtual representation of the non-TRILL legacy network with a single
 62     RBridge poses serious challenges in multi-destination RPF
 63     calculations. This document presents the required enhancements to
 64     build Coordinated Multicast Trees (CMT) within the TRILL campus to
 65     solve related RPF issues. CMT gives RBridges the flexibility to
 66     select the desired path of association to a given distribution tree.

 68  Table of Contents

 70     1.
        Introduction
 71        1.1. Scope and Applicability
 72        1.2. Contributors
 73     2. Conventions used in this document
 74     3. AFFINITY TLV
 75     4. Multicast Tree Construction and use of Affinity Sub-TLV
 76        4.1. Update to RFC 6325
 77        4.2. Announcing virtual RBridge nickname
 78        4.3. Affinity Sub-TLV capability
 79     5. Theory of operation
 80        5.1. Distribution Tree provisioning
 81        5.2. Affinity Sub-TLV advertisement
 82        5.3. Affinity sub-TLV conflict resolution
 83        5.4. Ingress Multi-Destination Forwarding
 84           5.4.1. Forwarding when n < k
 85        5.5. Egress Multi-Destination Forwarding
 86           5.5.1. Traffic Arriving on an assigned Tree to RBk-RBv
 87           5.5.2. Traffic Arriving on other Trees
 88        5.6. Failure scenarios
 89           5.6.1. Edge RBridge RBk failure
 90        5.7. Backward compatibility
 91     6. Security Considerations
 92     7. IANA Considerations
 93     8. References
 94        8.1. Normative References
 95        8.2. Informative References
 96     9. Acknowledgments
 97     10. Authors' Addresses

 99  1.
     Introduction

101     TRILL, presented in [RFC6325] and other related documents, provides
102     methods of utilizing all available paths for active forwarding, with
103     minimum configuration. TRILL utilizes IS-IS as its control plane and
104     encapsulates native frames with a TRILL header.

106     Legacy networks utilize IEEE 802.1D Spanning Tree Protocol as the
107     control protocol and utilize, at any given time, a single path among
108     all available paths for active forwarding. Legacy networks forward
109     frames in "native" format.

111     [RFC6325], [RFC6327] and [RFC6439] provide methods for
112     interoperability between TRILL and Legacy networks. [RFC6439]
113     provides an active-standby solution, where only one of the RBridges
114     is in active forwarding state for any given VLAN. The RBridge in
115     active forwarding state for any given VLAN is referred to as the
116     Appointed Forwarder (AF). All frames ingressing into a TRILL network
117     via the Appointed Forwarder are encapsulated with the TRILL header
118     with a nickname held by the ingress AF RBridge. Due to failures, re-
119     configurations and other network dynamics, the Appointed Forwarder
120     for any set of VLANs may change. RBridges maintain forwarding tables
121     that contain destination MAC address to egress RBridge bindings. In
122     the event of an AF change, remote RBridges with stale forwarding
123     tables may continue to forward traffic to the previous AF, where it
124     may be discarded at the egress, causing traffic disruption.

126     Mission-critical applications such as High Performance Data Centers
127     require resiliency during failover. The active-active forwarding
128     model minimizes impact during failures and maximizes the available
129     network bandwidth. A typical deployment scenario, depicted in Figure
130     1, may have End Stations and/or Legacy bridges attached
131     to the RBridges. These Legacy devices typically are multi-homed to
132     several RBridges and treat all of the uplinks as a single Link
133     Aggregation (LAG) bundle.
        The Appointed Forwarder designation
134     presented in [RFC6439] requires each of the edge RBridges to
135     exchange TRILL Hello packets. By design, a LAG does not forward
136     packets received on one of the member ports of the LAG to other
137     member ports of the same LAG. As a result, the AF designation
138     methods presented in [RFC6439] cannot be applied to the deployment
139     scenario depicted in Figure 1.

141     An active-active load sharing model can be implemented by
142     representing the edge of the network connected to a specific group
143     of RBridges by a single virtual RBridge. In addition to an active-
144     active forwarding model, there may be other applications that
145     require similar representations.

147     Sections 4.5.1 and 4.5.2 of [RFC6325] specify distribution tree
148     calculation and Reverse Path Forwarding Check calculation algorithms
149     for multi-destination forwarding. The algorithms specified in
150     [RFC6325] depend strictly on link cost and parent RBridge
151     priority. As a result, based on the network topology, it may be
152     possible that a given edge RBridge, if it is forwarding on behalf of
153     the virtual RBridge, may not have a candidate multicast tree that
154     the edge RBridge can forward traffic on because there is no tree for
155     which the virtual RBridge is a leaf node from the edge RBridge.

157     In this document we present a method that allows RBridges to specify
158     the path of association to distribution trees. Remote RBridges
159     calculate the SPF and derive the RPF for distribution trees based on
160     the distribution tree association advertisements. In the absence of
161     distribution tree association advertisements, remote RBridges derive
162     the SPF based on the algorithm specified in section 4.5.1 of [RFC
163     6325].
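
        As a rough illustration of the behaviour described above, the
        following Python sketch shows a remote RBridge preferring an
        advertised tree association over its default calculation. It is
        not part of the specification. The function name, the dictionary
        layout, and the nicknames are hypothetical, and the default
        parent is assumed to come from the [RFC6325] section 4.5.1
        calculation, which is not reproduced here.

```python
from typing import Dict, Tuple

# (tree number, child nickname) -> parent nickname, as learned from
# distribution tree association advertisements. Hypothetical layout.
Affinities = Dict[Tuple[int, str], str]


def tree_parent(tree: int, child: str, default_parent: str,
                affinities: Affinities) -> str:
    """Return the parent of `child` in distribution tree `tree`.

    `default_parent` stands in for the result of the RFC 6325
    section 4.5.1 calculation (assumed to be computed elsewhere).
    An advertised association, when present, takes precedence.
    """
    return affinities.get((tree, child), default_parent)


# Example with made-up nicknames: RB1 and RB2 advertise RBv as their
# child on trees 1 and 2 respectively; tree 3 has no advertisement.
affinities = {(1, "RBv"): "RB1", (2, "RBv"): "RB2"}
print(tree_parent(1, "RBv", "RB2", affinities))  # RB1 (advertised)
print(tree_parent(3, "RBv", "RB2", affinities))  # RB2 (default)
```

        The sketch covers only parent selection; deriving the RPF check
        from the resulting tree is left out.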

165     Other applications, besides the above-mentioned active-active
166     forwarding model, may utilize the distribution tree association
167     framework presented in this document to associate to distribution
168     trees through a preferred path.

170     This proposal requires the presence of multiple multi-destination
171     trees within the TRILL campus and updating all the RBridges in the
172     network to support the new Affinity sub-TLV. It is expected that
173     both of these requirements will be met, as they are control plane
174     changes and will be a common deployment scenario. If either of the
175     two conditions is not met, RBridges MUST support a fallback option
176     for interoperability. Since the fallback is expected to be a
177     temporary phenomenon until all RBridges are upgraded, this proposal
178     gives guidelines for such fallbacks and does not mandate or specify
179     any specific set of fallback options.

181  1.1. Scope and Applicability

183     This document introduces the concept of an Affinity sub-TLV to solve
184     associated RPF issues at the active-active edge. Specific methods in
185     this document for making use of the Affinity sub-TLV are applicable
186     where multiple RBridges are connected to an edge device through link
187     aggregation or to a multiport server or some similar arrangement
188     where the RBridges cannot see each other's Hellos.

190     This document DOES NOT provide other required operational elements
191     to implement an active-active edge solution, such as methods of link
192     aggregation. Solution-specific operational elements are outside the
193     scope of this document and will be covered in solution-specific
194     documents.

196     Examples provided in this document are for illustration purposes
197     only.

199  1.2. Contributors

201     The work in this document is a result of many passionate discussions
202     and contributions from the following individuals.
        Their names are listed
203     in alphabetical order:

205     Ayan Banerjee, Dinesh Dutt, Donald Eastlake, Mingui Zhang, Radia
206     Perlman, Sam Aldrin, Shivakumar Sundaram, Zhai Hongjun.

208  2. Conventions used in this document

210     The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
211     "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
212     document are to be interpreted as described in [RFC2119].

214     In this document, these words will appear with that interpretation
215     only when in ALL CAPS. Lower case uses of these words are not to be
216     interpreted as carrying [RFC2119] significance.

218  3. AFFINITY TLV

220     Association of an RBridge to a multicast tree through a specific path
221     is accomplished by using a new IS-IS sub-TLV, the Affinity sub-TLV.

223     The Affinity sub-TLV is carried under the Router Capability TLV (242)
224     [RFC 4971]. Section 2.3.10 of [6326bis] formally specifies the code
225     point and data structure for the Affinity sub-TLV.

227  4. Multicast Tree Construction and use of Affinity Sub-TLV

229     Figure 1 and Figure 2 below show the reference topology and a
230     logical topology using CMT to provide active-active service.

232                    ------------------
233                   /                  \
234                  |                    |
235                  |    TRILL Campus    |
236                  |                    |
237                   \                  /
238                 ---------------------
239                    |       |        |
240                    |       |        ----------
241                 DRB|       |                 |
242                 +------+ +------+        +------+
243                 |      | |      |        |      |
244                 |(RB1) | |(RB2) |        | (RBk)|
245                 +------+ +------+        +------+
246                    |        |                |
247                    |        | -----------    ----
248                    |        | | ------------- |
249                  -----      | | ----------- | |
250             LAG---->(|      | | )        (| | |) LAG
251                  +-------+    . . .      +-------+
252                  | CE1   |               | CEn   |
253                  |       |               |       |
254                  +-------+               +-------+

256                       Figure 1 Reference Topology

258            ------------------        Sample Multicast Tree (T1)
259           /                  \
260          |                    |                |
261          |    TRILL Campus    |                o RBn
262          |                    |              / | \
263           \                  /              /  |  ---\
264         ---------------------           RB1o   o      o
265            |       |        |              |  RB2    RBk
266            |       |        ----------     |
267         DRB|       |                 |     oRBv
268         +------+ +------+        +------+
269         |      | |      |        |      |
270         |(RB1) | |(RB2) |        | (RBk)|
271         +------+ +------+        +------+
272         ooo|ooooooo|oooooooooooooo|oo
273         o                           o
274         o    virtual RBridge RBv    o
275         ooo|ooooooo|ooo|ooooooooooooo    ----
276            |       |   |    ----------      |
277          -----     |   |    ----------  |   |
278     LAG---->(|     |   | )          (|  |   | ) LAG
279          +-------+      . . .       +-------+
280          | CE1   |                  | CEn   |
281          |       |                  |       |
282          +-------+                  +-------+

284                    Figure 2 Example Logical Topology

286  4.1. Update to RFC 6325

288     Section 4.5.1 of [RFC6325] is updated as follows:

290     Each RBridge that desires to be a parent RBridge for a specific
291     multi-destination distribution tree x for child RBridge RBy
292     announces the desired association through an Affinity sub-TLV. The
293     child RBridge RBy is specified by its nickname (or one of its
294     nicknames if it holds more than one).

296     When such an Affinity sub-TLV is present, the association specified
297     by the Affinity sub-TLV MUST be used when constructing the SPF tree.
298     In the absence of such an Affinity sub-TLV, or if there are RBridges
299     in the network that do not support the Affinity sub-TLV, the SPF
300     tree is calculated as specified in section 4.5.1 of [RFC6325].
301     Section 4.3 below explains methods of identifying RBridges that
302     support Affinity sub-TLV capability.

304  4.2. Announcing virtual RBridge nickname

306     Each edge RBridge RB1 to RBk advertises the virtual RBridge nickname
307     RBv using the nickname sub-TLV (6) [6326bis], along with its regular
308     nickname or nicknames.

310  4.3. Affinity Sub-TLV capability

312     RBridges that announce the TRILL version sub-TLV [6326bis] and set
313     the Affinity capability bit (section 7.
        ) support the Affinity sub-
314     TLV and calculation of multi-destination distribution trees as
315     specified herein.

317  5. Theory of operation

319  5.1. Distribution Tree provisioning

321     Let's assume there are n distribution trees and k edge RBridges in
322     the edge group of interest.

324     If n >= k

326       Let's assume edge RBridges are sorted in numerically ascending
327       order by System ID such that RB1 < RB2 < ... < RBk. Each RBridge
328       in the numerically sorted list is assigned a monotonically
329       increasing number j such that RB1=0, RB2=1, RBi=j and RBi+1=j+1.

331       Assign each tree to RBi such that tree number { (tree_number) %
332       k}+1 is assigned to RBridge i for tree_number from 1 to n, where n
333       is the number of trees and k is the number of RBridges considered
334       for tree allocation.

336     If n < k

338       Distribution trees are assigned to RBridges RB1 to RBn, using the
339       same algorithm as in the n >= k case. RBridges RBn+1 to RBk do not
340       participate in the active-active forwarding process on behalf of
          RBv.

342  5.2. Affinity Sub-TLV advertisement

344     Each RBridge in the RB1..RBk domain advertises an Affinity sub-TLV on
345     behalf of RBv.

347     As an example, let's assume that RB1 has chosen Trees t1 and tk+1 on
348     behalf of RBv.

350     RB1 advertises the Affinity sub-TLV: {RBv, Num of Trees=2, t1, tk+1}.

352     Other RBridges in the RB1..RBk edge group follow the same procedure.

354  5.3. Affinity sub-TLV conflict resolution

356     If different RBridges advertise Affinity sub-TLVs that try to
357     associate the same virtual RBridge as their child in the same tree
358     or trees, those Affinity sub-TLVs are in conflict for those trees.
359     The nicknames of the conflicting RBridges are compared to identify
360     which RBridge holds the nickname that is the highest priority to be
361     a tree root, with the System ID as the tie-breaker.

363     The RBridge with the highest priority to be a tree root will retain
364     the Affinity association.
        Other RBridges with lower priority to be a
365     tree root MUST stop advertising their conflicting Affinity sub-TLV,
366     re-calculate the multicast tree affinity allocation, and, if
367     appropriate, advertise a new non-conflicting Affinity sub-TLV.

369     Similarly, remote RBridges MUST honor the Affinity sub-TLV from the
370     RBridge with the highest priority to be a tree root and ignore the
371     conflicting Affinity sub-TLV entries advertised by the RBridges with
372     lower priorities to be tree roots.

374  5.4. Ingress Multi-Destination Forwarding

376     If there is at least one tree on which RBv has affinity via RBk,
377     then RBk performs the following operations for multi-destination
378     frames received from a CE node:

380     1. Flood to locally attached CE nodes subject to VLAN and multicast
381        pruning.
382     2. Encapsulate in a TRILL header and set the ingress RBridge
383        nickname to RBv (the nickname of the virtual RBridge).
384     3. Forward on one of the distribution trees, tree x, in which RBv is
385        associated with RBk.

387  5.4.1. Forwarding when n < k

389     If there is no tree on which RBv can claim affinity via RBk
390     (probably because the number of trees n built is less than the
391     number of RBridges k announcing the Affinity sub-TLV), then RBk MUST
392     fall back to one of the following:

394       1. This RBridge should stop forwarding frames from the CE nodes,
395          and should mark its link as passive. This will prevent CE nodes
396          from forwarding data on to this RBridge, so that they only use
397          those RBridges which have been assigned a tree, -OR-
398       2. This RBridge tunnels multi-destination frames received from
399          attached native devices to an RBridge RBy that has an assigned
400          tree. The tunnel destination should forward them to the TRILL
401          network, and also to its local access links. (The mechanism
402          of tunneling and handshake between the tunnel source and
403          destination are out of scope of this specification and may be
404          addressed in future documents).
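
        The tree assignment of section 5.1 and the ingress decision of
        sections 5.4 and 5.4.1 can be sketched in Python as follows.
        This is an illustrative sketch only, not part of the
        specification. It assumes that tree t is assigned to the RBridge
        at position ((t - 1) % k) + 1 of the list sorted by System ID,
        an interpretation chosen because it reproduces the section 5.2
        example (RB1 holding t1 and tk+1) and leaves RBn+1 to RBk
        without a tree when n < k. The function names, the System ID
        values, and the use of a flow hash to choose among several
        assigned trees are hypothetical.

```python
from typing import Dict, List, Optional


def assign_trees(system_ids: Dict[str, int],
                 n_trees: int) -> Dict[str, List[int]]:
    """Assign trees 1..n_trees to edge RBridges sorted by System ID.

    Assumption (not taken verbatim from the draft): tree t goes to
    the RBridge at 1-based position ((t - 1) % k) + 1.
    """
    ordered = sorted(system_ids, key=system_ids.get)  # ascending System ID
    k = len(ordered)
    assignment: Dict[str, List[int]] = {rb: [] for rb in ordered}
    if k == 0:
        return assignment
    for tree in range(1, n_trees + 1):
        assignment[ordered[(tree - 1) % k]].append(tree)
    return assignment


def ingress_tree(rb: str, assignment: Dict[str, List[int]],
                 flow_hash: int) -> Optional[int]:
    """Pick the tree on which `rb` forwards a frame ingressed as RBv.

    Returns None when `rb` holds no tree (section 5.4.1); the caller
    then has to apply one of the fallback options instead.
    """
    trees = assignment.get(rb, [])
    if not trees:
        return None
    return trees[flow_hash % len(trees)]  # hypothetical selection policy


# Made-up System IDs for an edge group of k = 3 RBridges.
edge_group = {"RB1": 0x0A01, "RB2": 0x0A02, "RB3": 0x0A03}
print(assign_trees(edge_group, 4))  # {'RB1': [1, 4], 'RB2': [2], 'RB3': [3]}
print(assign_trees(edge_group, 2))  # {'RB1': [1], 'RB2': [2], 'RB3': []}
print(ingress_tree("RB3", assign_trees(edge_group, 2), flow_hash=7))  # None
```

        With n = 4 and k = 3, RB1 ends up with trees 1 and 4 (t1 and
        tk+1). With n = 2, RB3 receives no tree and falls back as
        described in section 5.4.1.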

406     The above fallback options may be very specific to the active-active
407     forwarding scenario. However, as stated above, the Affinity sub-TLV
408     may be used in other applications. In such an event, the application
409     SHOULD specify applicable fallback options.

411  5.5. Egress Multi-Destination Forwarding

413  5.5.1. Traffic Arriving on an assigned Tree to RBk-RBv

415     Multi-destination frames arriving at RBk on a Tree x, where RBk has
416     announced the affinity of RBv via x, MUST be forwarded to CE members
417     of RBv. Forwarding to other end-nodes and RBridges that are not part
418     of the network represented by the RBv virtual RBridge MUST follow
419     the forwarding rules specified in [RFC6325].

421  5.5.2. Traffic Arriving on other Trees

423     Multi-destination frames arriving at RBk on a Tree y, where RBk has
424     not announced the affinity of RBv via y, MUST NOT be forwarded to CE
425     members of RBv. Forwarding to other end-nodes and RBridges that are
426     not part of the network represented by the RBv virtual RBridge MUST
427     follow the forwarding rules specified in [RFC6325].

429  5.6. Failure scenarios

431  5.6.1. Edge RBridge RBk failure

433     The failure recovery algorithm below is presented only as a
434     guideline. Implementations MAY include other failure recovery
435     algorithms. Details of such algorithms are outside the scope of this
436     document.

438     Each of the member RBridges of a given virtual RBridge edge group is
439     aware of its member RBridges through configuration or some other
440     method.

442     Member RBridges detect nodal failure of a member RBridge through IS-
443     IS LSP advertisements or lack thereof.

445     Upon detecting a member failure, each of the member RBridges of the
446     RBv edge group starts a recovery timer T_rec for the failed RBridge
447     RBi. If the previously failed RBridge RBi has not recovered after
448     the expiry of timer T_rec, member RBridges perform the distribution
449     tree assignment algorithm specified in section 5.1.
        Each of the member
450     RBridges re-advertises the Affinity sub-TLV with the new tree
451     assignment. This action causes the campus to update the tree
452     calculation with the new assignment.

454     RBi, upon start-up, starts advertising its presence through IS-IS
455     LSPs and starts a timer T_i. Member RBridges detecting the presence
456     of RBi start a timer T_j. Timer T_j SHOULD be less than T_i/2.
457     (Please see note below.)

459     Upon expiry of timer T_j, member RBridges recalculate the multi-
460     destination tree assignment and advertise the related trees using
461     the Affinity sub-TLV.

463     Upon expiry of timer T_i, RBi recalculates the multi-destination tree
464     assignment and advertises the related trees using the Affinity
        sub-TLV.

466     Note: Timers T_i and T_j are designed so as to minimize traffic
467     downtime and avoid multi-destination packet duplication.

469  5.7. Backward compatibility

471     Implementations MUST support a backward compatibility mode to
472     interoperate with pre-Affinity sub-TLV RBridges in the network. Such
473     backward compatibility operation MAY include, but is not limited
474     to, tunneling and/or active-standby modes of operation.

476     Example:

478     Step 1. Stop using the virtual RBridge nickname for traffic
479        ingressing from CE nodes.
480     Step 2. Stop performing active-active forwarding and fall back to
481        active-standby forwarding, based on locally defined policies.
482        Definition of such policies is outside the scope of this document
483        and may be addressed in future documents.

485  6. Security Considerations

487     Security considerations are similar to RFC 6325, RFC 6326 and RFC
488     6327. Additional security considerations are being discussed.

490  7. IANA Considerations

492     IANA is requested to allocate a capability bit for "Affinity
493     Supported" in the TRILL-VER sub-TLV. The "Affinity Supported"
494     capability bit and Affinity sub-TLV are specified and allocated in
495     [6326bis].

497  8. References

499  8.1.
     Normative References

501     [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
502               Requirement Levels", BCP 14, RFC 2119, March 1997.

504     [RFC6325] Perlman, R., et al., "RBridge: Base Protocol
505               Specification", RFC 6325, July 2011.

507     [RFC6327] Eastlake, D., et al., "RBridge: Adjacency", RFC 6327, July
508               2011.

510     [RFC6439] Eastlake, D., et al., "RBridge: Appointed Forwarder", RFC
511               6439, November 2011.

513     [6326bis] Eastlake, D., et al., "Transparent Interconnection of Lots
514               of Links (TRILL) Use of IS-IS", draft-eastlake-isis-
515               rfc6326bis-07.txt, Work in Progress, December 2011.

517  8.2. Informative References

519     [RFC6165] Banerjee, A. and Ward, D., "Extensions to IS-IS for Layer-2
520               Systems", RFC 6165, April 2011.

522     [RFC4971] Vasseur, JP., et al., "Intermediate System to Intermediate
523               System (IS-IS) Extensions for Advertising Router
524               Information", RFC 4971, July 2007.

526     [TRILLPN] Zhai, H., et al., "RBridge: Pseudonode Nickname", draft-hu-
527               trill-pseudonode-nickname-01, Work in Progress, November
528               2011.

530  9. Acknowledgments

532     The authors wish to extend their appreciation to the individuals who
533     volunteered to review and comment on the work presented in this
534     document and provided constructive and critical feedback. Specific
535     acknowledgements are due to Anoop Ghanwani, Ronak Desai, and Varun
536     Shah.

538     This document was prepared using 2-Word-v2.0.template.dot.

540  10. Authors' Addresses

542     Tissa Senevirathne
543     Cisco Systems
544     375 East Tasman Drive,
545     San Jose, CA 95134

547     Phone: +1-408-853-2291
548     Email: tsenevir@cisco.com

550     Janardhanan Pathangi
551     Dell/Force10 Networks
552     Olympia Technology Park,
553     Guindy Chennai 600 032

555     Phone: +91 44 4220 8400
556     Email: Pathangi_Janardhanan@Dell.com

558     Jon Hudson
559     Brocade
560     130 Holger Way
561     San Jose, CA 95134 USA

563     Email: jon.hudson@gmail.com