TRILL Working Group                                                Y. Li
INTERNET-DRAFT                                               D. Eastlake
Intended Status: Standards Track                                  W. Hao
                                                                 H. Chen
                                                     Huawei Technologies
                                                           Radia Perlman
                                                                     EMC
                                                            Naveen Nimmu
                                                                Broadcom
                                                           S. Chatterjee
                                                                   Cisco
                                                       Sunny Rajagopalan
                                                                     IBM
Expires: September 4, 2015                                 March 3, 2015

   TRILL: Data Label based Tree Selection for Multi-destination Data
                 draft-yizhou-trill-tree-selection-04

Abstract

   TRILL uses distribution trees to deliver multi-destination frames.
   Multiple trees can be used by an ingress RBridge for flows regardless
   of the VLAN, Fine Grained Label (FGL), and/or multicast group of the
   flow. Different ingress RBridges may choose different distribution
   trees for TRILL Data packets in the same VLAN, FGL, and/or multicast
   group. To avoid unnecessary link utilization, distribution trees
   should be pruned based on VLAN and/or FGL and/or multicast
   destination address. If any VLAN, FGL, or multicast group can be sent
   on any tree, for typical fast path hardware, the amount of pruning
   information is multiplied by the number of trees; however, there is a
   limited capacity for such pruning information.

   This document specifies an optional facility to restrict the TRILL
   Data packets sent on particular distribution trees by VLAN, FGL,
   and/or multicast group, thus reducing the total amount of pruning
   information so that it can more easily be accommodated by fast path
   hardware.

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/1id-abstracts.html

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

Copyright and License Notice

   Copyright (c) 2014 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1. Introduction
     1.1. Background Description
     1.2. Motivations
   2. Terminology Used in This Document
   3. Data Label based Tree Selection
     3.1 Overview
     3.2. Sub-TLVs for the Router Capability TLV
       3.2.1. The Tree and VLANs APPsub-TLV
       3.2.2. The Tree and VLANs Used APPsub-TLV
       3.2.3. The Tree and FGLs APPsub-TLV
       3.2.4. The Tree and FGLs Used APPsub-TLV
     3.3. Detailed Processing
     3.4. Failure Handling
     3.5. Multicast Extensions
   4. Backward Compatibility
   5. Security Considerations
   6. IANA Considerations
   7. References
     7.1 Normative References
     7.2 Informative References
   8. Acknowledgments
   Authors' Addresses

1. Introduction

1.1. Background Description

   One or more distribution trees, identified by their root nickname,
   are used to distribute multi-destination data in a TRILL campus
   [RFC6325]. The RBridge having the highest tree root priority
   announces the total number of trees that should be computed for the
   campus. It may also specify the ordered list of trees that RBridges
   need to compute using the Tree Identifiers (TREE-RT-IDs) sub-TLV
   [RFC7176]. Every RBridge can specify the trees it will use in the
   Trees Used Identifiers (TREE-USE-IDs) sub-TLV, and the VLANs or fine
   grained labels (FGLs [RFC7172]) it is interested in using the
   Interested VLANs and/or Interested Labels sub-TLVs [RFC7176]. It is
   suggested that, by default, the ingress RBridge use the distribution
   tree whose root is the closest [RFC6325]. Trees Used Identifiers
   sub-TLVs are used to build the RPF Check table that is used for the
   reverse path forwarding check; Interested VLANs and Interested Labels
   sub-TLVs are used for distribution tree pruning, and the
   multi-destination forwarding table with pruning information is built
   from them. Each distribution tree SHOULD be pruned per VLAN/FGL,
   eliminating branches that have no potential receivers downstream
   [RFC6325]. Further pruning based on Layer 2 or Layer 3 multicast
   address is also possible.

   Defaults are provided but it is implementation dependent how many
   trees to calculate, where the tree roots are located, and which
   tree(s) are to be used by an ingress RBridge. With the increasing
   demand to use TRILL in data center networks, there are some features
   we can explore for multi-destination frames in the data center use
   case. In order to achieve non-blocking data forwarding, a fat tree
   structure is often used. Figure 1 shows a typical fat tree structure
   based data center network. RB1 and RB2 are aggregation switches and
   RB11 to RB14 are access switches. It is a common practice to
   configure the tree roots to be at the aggregation switches for more
   efficient traffic transportation. All the ingress RBridges that are
   access switches have the same distance to all the tree roots.

                    +-----+     +-----+
                    | RB1 |     | RB2 |
                    +-----+     +-----+
                    /  |  \\     / /|\
                   /   |   \ \  / / | \
                  /    |    \  \ /  |  \-----+
                 /     |     \/ /\  |        |
                /      |     /\/   \|        |
               /  /----+----/ /\    |\       |
              /  /     |     /  \   |  \     |
             /  /      |    /    \  |    \   |
            /  /       |   /      \ |      \ |
          +-----+    +-----+    +-----+    +-----+
          | RB11|    | RB12|    | RB13|    | RB14|
          +-----+    +-----+    +-----+    +-----+

           Figure 1. Fat Tree Structure based TRILL network

1.2. Motivations

   In the structure of Figure 1, if we choose to put the tree roots at
   RB1 and RB2, the ingress RBridge (e.g. RB11) would find more than one
   closest tree root (i.e. RB1 & RB2). An ingress RBridge has two
   options to select the tree root for multi-destination frames: choose
   one and only one as distribution tree root, or use an ECMP-like
   algorithm to balance the traffic among the multiple trees whose roots
   are at the same distance.

   - The former, a single tree used by each ingress RBridge, has the
   obvious problem of inefficient link usage. For example, if RB11
   chooses tree1, which is rooted at RB1, as the distribution tree, the
   link between RB11 and RB2 will never be used for multi-destination
   frames ingressed by RB11.

   - For the latter, ECMP based tree selection results in a linear
   increase in multicast forwarding table size with the number of trees
   as explained in the next paragraph.

   A multicast forwarding table at an RBridge is normally used to map
   the key of (tree nickname + VLAN) to an index to a list of ports for
   multicast packet replication. The key used for mapping is simply the
   tree nickname when the RBridge does not prune the tree, and the key
   could be (tree nickname + VLAN + Layer 2 or 3 multicast address) when
   the RBridge has been programmed by the control plane with Layer 2 or
   3 multicast pruning information.

   For any RBridge RBn, for each VLAN x, if RBn is in a distribution
   tree t for VLAN x, there will be an entry of (t, x, port list) in the
   multicast forwarding table on RBn. Typically each entry contains a
   distinct combination of (tree nickname, VLAN) as the lookup key. If
   there are n such trees and m such VLANs, the multicast forwarding
   table size on RBn is n*m entries. If fine-grained labeling is used
   [RFC7172] and/or finer pruning is used (for example, VLAN + multicast
   group address is used for pruning), the value of m increases. In a
   larger scale data center, more trees would be necessary for better
   load balancing, which increases the value of n. In either case, the
   number of table entries n*m will increase dramatically.

   The left table in Figure 2 shows an example of the multicast
   forwarding table on RB11 in the Figure 1 topology with 2 distribution
   trees in a campus using typical fast path hardware. The number of
   entries is approximately 2 * 4K in this case.
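
   As an illustration only, the n*m growth described above can be
   reproduced with a few lines of Python. This is a minimal sketch, not
   part of the mechanism: the function name table_entries, the tree
   counts tried, and the 16K capacity figure are assumptions chosen for
   the example, and "4K" is taken to mean 4096.

```python
def table_entries(num_trees: int, num_labels: int) -> int:
    """Entries needed when every label may appear on every tree (n * m)."""
    return num_trees * num_labels


K = 1024
NUM_VLANS = 4 * K          # "4K" VLANs, taken here as 4096 (assumption)
ASSUMED_CAPACITY = 16 * K  # hypothetical shared hardware table size

# Print the table demand for a few hypothetical tree counts.
for trees in (1, 2, 4, 8):
    used = table_entries(trees, NUM_VLANS)
    verdict = "fits" if used <= ASSUMED_CAPACITY else "exceeds"
    print(f"{trees} tree(s): {used} entries, {verdict} the assumed capacity")
```

   The point of the sketch is only that the demand scales linearly with
   the number of trees while the hardware table does not grow with it.
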

   If 4 distribution trees are used in a TRILL campus and RBn has 4K
   VLANs with downstream receivers, it consumes 16K table entries. TRILL
   multicast forwarding tables have a limited size in hardware
   implementations, and the table entries are a precious resource. In
   some implementations, the table is shared with Layer 3 IP multicast
   for a total of 16K or 8K table entries. Therefore we want to reduce
   the table size consumed as much as possible while maintaining load
   balancing among trees.

   In cases where blocks of consecutive VLANs or FGLs can be assigned to
   a tree, it would be very helpful in compressing the multicast
   forwarding table if entries could have a Data Label value and mask
   and the fast path hardware could do longest prefix matching. But few
   if any fast path implementations provide such logic.

   A straightforward way to alleviate the limited table entries problem
   is not to prune the distribution tree. However, this can only be used
   in restricted scenarios for the following reasons:

   - Not pruning unnecessarily wastes bandwidth for multi-destination
   packets. There is broadcast traffic in each VLAN, like ARP and
   unknown unicast. In addition, if there is a lot of Layer 3 multicast
   traffic in some VLAN, not pruning may result in the worse consequence
   of Layer 3 user data being unnecessarily flooded over the campus. The
   volume could be huge if certain applications like IPTV are supported.
   Finer pruning, such as pruning based on multicast group, may be
   desirable in this case.

   - Not pruning is only useful at pure transit nodes. Edge nodes always
   need to maintain the multicast forwarding table with the key of (tree
   nickname + VLAN) since the edge node needs to decide whether and how
   to replicate the frame to local access ports based on VLAN. It is
   very likely that edge nodes are relatively low scale switches with a
   smaller shared table size, say 4K, available.

   - Security concerns. VLAN based traffic isolation is a basic
   requirement in some scenarios. Not pruning may result in unnecessary
   leakage of traffic. Misbehaving RBridges may take advantage of this.

   In addition to the multicast table size concern, some silicon does
   not currently support hashing-based tree nickname selection at the
   ingress RBridge. VLAN based tree selection is used instead. The
   control plane of the ingress RBridge maps the incoming VLAN x to a
   tree nickname t. Then the data plane will always use tree t for VLAN
   x multi-destination frames. Though an ingress RBridge may choose
   multiple trees to be used for load sharing, it can use one and only
   one tree for each VLAN. If we make sure all ingress RBridges
   campus-wide send VLAN x multi-destination packets only using tree t,
   then there would be no need to store the multicast table entry with
   the key of (tree-other-than-t, x) on any RBridge.

   This document describes the TRILL control plane support for a VLAN
   based tree selection mechanism to reduce the multicast forwarding
   table size. It is compatible with the silicon implementation
   mentioned in the previous paragraph. Here VLAN based tree selection
   is a general term which also includes finer granularity cases such as
   VLAN + Layer 2 or 3 multicast or FGL group based selection.

2. Terminology Used in This Document

   This document uses the terminology from [RFC6325] and [RFC7172], some
   of which is repeated below for convenience, along with some
   additional terms listed below:

   campus: Name for a TRILL network, like "bridged LAN" is a name for a
   bridged network. It does not have any academic implication.

   Data Label: VLAN or FGL.

   ECMP: Equal Cost Multi-Path [RFC6325].

   FGL: Fine-Grained Label [RFC7172].

   IPTV: "Television" (video) over IP.

   RBridge: An alternative name for a TRILL switch.

   TRILL: Transparent Interconnection of Lots of Links (or Tunneled
   Routing in the Link Layer).

   TRILL switch: A device implementing the TRILL protocol. Sometimes
   called an RBridge.

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

3. Data Label based Tree Selection

   Data Label based tree selection can be used as a complementary
   distribution tree selection mechanism, especially when the multicast
   forwarding table size is a concern.

3.1 Overview

   The tree root with the highest priority announces the tree nicknames
   and the Data Labels allowed on each tree. Such tree to Data Label
   correspondence announcements can be based on static configuration or
   some predefined algorithm beyond the scope of this document. An
   ingress RBridge selects the tree-VLAN correspondence it wishes to use
   from the list announced by the highest priority tree root. It SHOULD
   NOT transmit a VLAN x frame on tree y if the highest priority tree
   root does not say VLAN x is allowed on tree y.

   If we make sure one VLAN is allowed on one and only one tree, we can
   keep the number of multicast forwarding table entries on any RBridge
   fixed at 4K maximum (or up to 16M in the case of fine grained
   labels). Take Figure 1 as an example, with two trees rooted at RB1
   and RB2 respectively. The highest priority tree root appoints tree1
   to carry VLAN 1-2000 and tree2 to carry VLAN 2001-4095. With such an
   announcement by the highest priority tree root, every RBridge which
   understands the announcement will not send VLAN 2001-4095 traffic on
   tree1 and will not send VLAN 1-2000 traffic on tree2.
   Then no RBridge would need to store the entries for
   tree1/VLAN2001-4095 or tree2/VLAN1-2000. Figure 2 shows the multicast
   forwarding table on an RBridge before and after we perform the VLAN
   based tree selection. The number of entries is reduced by a factor f,
   f being the number of trees used in the campus. In this example, it
   is reduced from 2*4095 to 4095. This affects both transit nodes and
   edge nodes. Data plane encoding does not change.

   +--------------+-----+---------+  +--------------+-----+---------+
   |tree nickname |VLAN |port list|  |tree nickname |VLAN |port list|
   +--------------+-----+---------+  +--------------+-----+---------+
   |   tree 1     |  1  |         |  |   tree 1     |  1  |         |
   +--------------+-----+---------+  +--------------+-----+---------+
   |   tree 1     |  2  |         |  |   tree 1     |  2  |         |
   +--------------+-----+---------+  +--------------+-----+---------+
   |   tree 1     | ... |         |  |   tree 1     | ... |         |
   +--------------+-----+---------+  +--------------+-----+---------+
   |   tree 1     | ... |         |  |   tree 1     | 1999|         |
   +--------------+-----+---------+  +--------------+-----+---------+
   |   tree 1     | ... |         |  |   tree 1     | 2000|         |
   +--------------+-----+---------+  +--------------+-----+---------+
   |   tree 1     | 4094|         |  |   tree 2     | 2001|         |
   +--------------+-----+---------+  +--------------+-----+---------+
   |   tree 1     | 4095|         |  |   tree 2     | 2002|         |
   +--------------+-----+---------+  +--------------+-----+---------+
   |   tree 2     |  1  |         |  |   tree 2     | ... |         |
   +--------------+-----+---------+  +--------------+-----+---------+
   |   tree 2     |  2  |         |  |   tree 2     | 4094|         |
   +--------------+-----+---------+  +--------------+-----+---------+
   |   tree 2     | ... |         |  |   tree 2     | 4095|         |
   +--------------+-----+---------+  +--------------+-----+---------+
   |   tree 2     | ... |         |
   +--------------+-----+---------+
   |   tree 2     | ... |         |
   +--------------+-----+---------+
   |   tree 2     | ... |         |
   +--------------+-----+---------+
   |   tree 2     | 4094|         |
   +--------------+-----+---------+
   |   tree 2     | 4095|         |
   +--------------+-----+---------+

   Figure 2. Multicast forwarding table before (left) & after (right)

3.2. Sub-TLVs for the Router Capability TLV

   Four new APPsub-TLVs that can be carried in E-L1FS FS-LSPs
   [rfc7180bis] are defined below. They can be considered analogous to
   finer granularity versions of the Tree Identifiers Sub-TLV and the
   Trees Used Identifiers Sub-TLV in [RFC7176].

3.2.1. The Tree and VLANs APPsub-TLV

   The Tree and VLANs (TREE-VLANs) APPsub-TLV is used to announce the
   VLANs allowed on each tree by the RBridge that has the highest
   priority to be a tree root. Multiple instances of this sub-TLV may be
   carried. The same tree nicknames may occur in the multiple Tree-VLAN
   RECORDs within the same or across multiple sub-TLVs. The sub-TLV
   format is as follows:

                        1 1 1 1 1 1
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |       Type = tbd1             |  (2 bytes)
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |       Length                  |  (2 bytes)
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-...-+-+
   |   Tree-VLAN RECORD (1)                    |  (6 bytes)
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-...-+-+
   |   .................                       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-...-+-+
   |   Tree-VLAN RECORD (N)                    |  (6 bytes)
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-...-+-+

   where each Tree-VLAN RECORD is of the form:

   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |           Nickname            |  (2 bytes)
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | RESV  |      Start.VLAN       |  (2 bytes)
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | RESV  |       End.VLAN        |  (2 bytes)
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   o Type: TRILL GENINFO APPsub-TLV type, set to tbd1 (TREE-VLANs).

   o Length: 6*n bytes, where there are n Tree-VLAN RECORDs. Thus the
   value of Length can be used to determine n.
   If Length is not a multiple of 6, the sub-TLV is corrupt and MUST be
   ignored.

   o Nickname: The nickname identifying the distribution tree by its
   root.

   o RESV: 4 bits that MUST be sent as zero and ignored on receipt.

   o Start.VLAN, End.VLAN: These fields are the VLAN IDs of the allowed
   VLAN range on the tree, inclusive. To specify a single VLAN, the
   VLAN's ID appears as both the start and end VLAN. If End.VLAN is less
   than Start.VLAN, the Tree-VLAN RECORD MUST be ignored.

3.2.2. The Tree and VLANs Used APPsub-TLV

   This APPsub-TLV has the same structure as the Tree and VLANs
   APPsub-TLV (TREE-VLANs) specified in Section 3.2.1. The only
   difference is that its APPsub-TLV type is set to tbd2
   (TREE-VLAN-USE), and the Tree-VLAN RECORDs listed are those the
   originating RBridge allows.

3.2.3. The Tree and FGLs APPsub-TLV

   The Tree and FGLs (TREE-FGLs) APPsub-TLV is used to announce the FGLs
   allowed on each tree by the RBridge that has the highest priority to
   be a tree root. Multiple instances of this APPsub-TLV may be carried.
   The same tree nicknames may occur in the multiple Tree-FGL RECORDs
   within the same or across multiple APPsub-TLVs. Its format is as
   follows:

                        1 1 1 1 1 1
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |       Type = tbd3             |  (2 bytes)
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |       Length                  |  (2 bytes)
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-...-+-+
   |   Tree-FGL RECORD (1)                     |  (8 bytes)
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-...-+-+
   |   .................                       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-...-+-+
   |   Tree-FGL RECORD (N)                     |  (8 bytes)
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-...-+-+

   where each Tree-FGL RECORD is of the form:

   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |           Nickname            |                  (2 bytes)
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                  Start.FGL                    |  (3 bytes)
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                   End.FGL                     |  (3 bytes)
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   o Type: TRILL GENINFO APPsub-TLV type, set to tbd3 (TREE-FGLs).

   o Length: 8*n bytes, where there are n Tree-FGL RECORDs. Thus the
   value of Length can be used to determine n. If Length is not a
   multiple of 8, the sub-TLV is corrupt and MUST be ignored.

   o Nickname: The nickname identifying the distribution tree by its
   root.

   o Start.FGL, End.FGL: These fields are the FGL IDs of the allowed
   FGL range on the tree, inclusive. To specify a single FGL, the FGL's
   ID appears as both the start and end FGL. If End.FGL is less than
   Start.FGL, the Tree-FGL RECORD MUST be ignored.

3.2.4. The Tree and FGLs Used APPsub-TLV

   This APPsub-TLV has the same structure as the Tree and FGLs
   APPsub-TLV (TREE-FGLs) specified in Section 3.2.3. The only
   difference is that its APPsub-TLV type is set to tbd4
   (TREE-FGL-USE), and the Tree-FGL RECORDs listed are those the
   originating RBridge allows.

3.3. Detailed Processing

   The highest priority tree root RBridge MUST include all the necessary
   tree related sub-TLVs defined in [RFC7176] as usual in its LSP and
   MAY include the Tree and VLANs APPsub-TLV (TREE-VLANs) and/or the
   Tree and FGLs APPsub-TLV (TREE-FGLs) in its E-L1FS FS-LSP
   [rfc7180bis].
   In this way it MAY indicate that each VLAN and/or FGL is only allowed
   on one or some other number of trees less than the number of trees
   being calculated in the campus in order to save table space in the
   fast path forwarding hardware.

   An ingress RBridge that understands the TREE-VLANs APPsub-TLV SHOULD
   select the tree-VLAN correspondences it wishes to use and put them in
   TREE-VLAN-USE APPsub-TLVs. If multiple tree nicknames were announced
   in the TREE-VLANs APPsub-TLV for a VLAN x, the ingress RBridge must
   choose one of them if it supports this feature. For example, the
   ingress RBridge may choose the closest (minimum cost) root among
   them. How to make such a choice is out of the scope of this document.
   It may be desirable to have some fixed algorithm to make sure all
   ingress RBridges choose the same tree for VLAN x in this case. Any
   single Data Label that the ingress RBridge is interested in should be
   related to one and only one tree ID in TREE-VLAN-USE to minimize the
   multicast forwarding table size on other RBridges, but as long as the
   Data Label is related to fewer than all the trees being calculated,
   it will reduce the burden on the forwarding table size.

   When an ingress RBridge tries to encapsulate a multi-destination
   frame for Data Label x, it SHOULD use the tree nickname that it
   selected previously in TREE-VLAN-USE or TREE-FGL-USE for Data Label
   x.

   If RBridge RBn does not perform pruning, it builds the multicast
   forwarding table exactly the same as that in [RFC6325].

   If RBn prunes the distribution tree based on VLANs, RBn uses the
   information received in TREE-VLAN-USE APPsub-TLVs to mark the set of
   VLANs reachable downstream for each adjacency and for each related
   tree.
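
   The ingress-side steps above can be sketched as follows. This is a
   minimal sketch under stated assumptions, not a normative algorithm.
   It decodes the value portion of a TREE-VLANs APPsub-TLV using the
   Section 3.2.1 rules and then picks one allowed tree for a VLAN. The
   names parse_tree_vlan_records, select_tree and root_cost, the
   nicknames 0x0001 and 0x0002, and the tie-break on lowest nickname are
   all hypothetical; this document leaves the choice among several
   allowed trees out of scope.

```python
import struct

RECORD_LEN = 6  # Nickname (2) + RESV/Start.VLAN (2) + RESV/End.VLAN (2)


def parse_tree_vlan_records(value: bytes):
    """Decode a TREE-VLANs value field into (nickname, start, end) tuples."""
    if len(value) % RECORD_LEN != 0:
        return []  # Length not a multiple of 6: whole sub-TLV is ignored
    records = []
    for offset in range(0, len(value), RECORD_LEN):
        nickname, start, end = struct.unpack_from("!HHH", value, offset)
        start &= 0x0FFF  # top 4 bits are RESV and ignored on receipt
        end &= 0x0FFF
        if end < start:
            continue  # End.VLAN < Start.VLAN: this record is ignored
        records.append((nickname, start, end))
    return records


def select_tree(records, vlan, root_cost):
    """Pick one allowed tree for `vlan`: minimum root cost, then lowest
    nickname (the tie-break is an assumption made for this sketch)."""
    allowed = {nick for nick, start, end in records if start <= vlan <= end}
    if not allowed:
        return None  # no tree has been announced for this VLAN
    return min(allowed, key=lambda n: (root_cost.get(n, float("inf")), n))


# Hypothetical announcement: tree 0x0001 carries VLANs 1-2000 and
# tree 0x0002 carries VLANs 2001-4095, as in the Section 3.1 example.
value = struct.pack("!6H", 0x0001, 1, 2000, 0x0002, 2001, 4095)
records = parse_tree_vlan_records(value)
costs = {0x0001: 1, 0x0002: 1}  # hypothetical equal costs to both roots
print(select_tree(records, 10, costs), select_tree(records, 3000, costs))
```

   A deterministic tie-break of this kind is one way to make all ingress
   RBridges converge on the same tree for a given VLAN when several
   trees are allowed.
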

   If RBn prunes the distribution tree based on FGLs, RBn uses the
   information received in TREE-FGL-USE APPsub-TLVs to mark the set of
   FGLs reachable downstream for each adjacency and for each related
   tree.

   Logically, an ingress RBridge that does not support VLAN based tree
   selection is equivalent to one that supports it and announces all
   the combinations of tree-id-used and interested-vlan as
   TREE-VLAN-USE, and correspondingly for FGL.

3.4. Failure Handling

   Failure of a tree root that is not the highest priority: It is the
   responsibility of the highest priority tree root to inform other
   RBridges of any change in the allowed tree-VLAN correspondence. When
   the highest priority tree root learns that the root of tree t has
   failed, it should re-assign the VLANs allowed on tree t to other
   trees or to a tree replacing the failed one.

   Failure of the highest priority tree root: It is RECOMMENDED that the
   second highest priority tree root be pre-configured with the proper
   knowledge of the tree-VLAN correspondence allowed when the highest
   priority tree root fails. The information announced by the second
   priority tree root would be stored by all RBridges but would not take
   effect unless the RBridge noticed the failure of the highest priority
   tree root. When the highest priority tree root fails, the former
   second priority tree root will become the highest priority tree root
   of the campus. When an RBridge notices the failure of the original
   highest priority tree root, it can immediately use the stored
   information announced by the original second priority tree root. It
   is recommended that the tree-VLAN correspondence information be
   pre-configured on the second highest priority tree root to be the
   same as that on the highest priority tree root for the trees other
   than the highest priority tree itself.
   This can minimize the change of the multicast forwarding table in
   case of the highest priority tree root failure. For a large campus,
   it may make sense to pre-configure this information in a similar way
   on the third, fourth, or even lower priority tree root RBridges.

   In some transient conditions or in case of misbehavior by the highest
   priority tree root, an ingress RBridge may encounter the following
   scenarios:

   - No tree has been announced to allow VLAN x frames.

   - An ingress RBridge is supposed to transmit VLAN x frames on tree t,
   but the root of tree t is no longer reachable.

   For the second case, an ingress RBridge may choose another reachable
   tree root which allows VLAN x according to the highest priority tree
   root announcement. If there is no such tree available, then it is the
   same as the first case above. Then the ingress RBridge should be
   'downgraded' to a conventional RBridge with behavior as specified in
   [RFC6325]. A timer should be set to allow the temporary transient
   stage to complete before the change of responsive tree or 'downgrade'
   takes effect. The value of the timer should be set to at least the
   LSP flooding time of the campus.

3.5. Multicast Extensions

   Data Label based tree selection is easily extended to (Data Label +
   Layer 2 or 3 multicast group) based tree selection. We can appoint
   multicast group 1 in VLAN 10 to tree1 and appoint group 2 in VLAN 10
   to tree2 for better load sharing. One additional APPsub-TLV is
   specified as follows:

   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |       Type = tbd5             |  (2 bytes)
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |       Length                  |  (2 bytes)
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |       Tree Nickname           |  (2 bytes)
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   Group Sub-Sub-TLVs             (variable)
   +-+-+-+-+-+-+-+-+-+....

   o Type: TRILL GENINFO APPsub-TLV type, set to tbd5 (TREE-GROUPs).

   o Length: 2 + the length of the Group Sub-Sub-TLVs included.

   o Tree Nickname: The nickname identifying the distribution tree by
   its root.

   o Group Sub-Sub-TLVs: Zero or more of the TLV structures that are
   allowed as sub-TLVs of the GADDR TLV [RFC7176]. Each such TLV
   structure specifies a multicast group and either a VLAN or FGL.
   Although these TLV structures are considered sub-TLVs when they
   appear inside a GADDR TLV, they are technically sub-sub-TLVs when
   they appear inside the TREE-GROUPs APPsub-TLV.

4. Backward Compatibility

   RBridges MUST include the TREE-USE-IDs and INT-VLAN sub-TLVs in their
   LSPs when required by [RFC6325] whether or not they support the new
   TREE-VLAN-USE or TREE-FGL-USE sub-TLVs specified by this draft.

   RBridges that understand the new TREE-VLAN-USE sub-TLV sent from
   another RBridge RBn should use it to build the multicast forwarding
   table and ignore the TREE-USE-IDs and INT-VLAN sub-TLVs sent from the
   same RBridge. TREE-USE-IDs and INT-VLAN sub-TLVs are still useful for
   some purposes other than building the multicast forwarding table, for
   example RPF table building, spanning tree root notification, etc. If
   the RBridge does not receive a TREE-VLAN-USE sub-TLV from RBn, it
   uses the conventional way described in [RFC6325] to build the
   multicast forwarding table.

   For example, there are two distribution trees, tree1 and tree2, in
   the campus. RB1 and RB2 are RBridges that use the new APPsub-TLVs
   described in this document. RB3 is an old RBridge that is compatible
   with [RFC6325]. Assume RB2 is interested in VLANs 10 and 11 and RB3
   is interested in VLANs 100 and 101. Hence RB1 receives ((tree1,
   VLAN10), (tree2, VLAN11)) as TREE-VLAN-USE sub-TLV and (tree1, tree2)
   as TREE-USE-IDs sub-TLV from RB2 on port x.
   And RB1 receives (tree1) as TREE-USE-IDs sub-TLV and no TREE-VLAN-USE
   sub-TLV from RB3 on port y. RB2 and RB3 announce their interested
   VLANs in the INT-VLAN sub-TLV as usual. Then RB1 will build the
   entries (tree1, VLAN10, port x) and (tree2, VLAN11, port x) based on
   RB2's LSP and the mechanism specified in this document. RB1 also
   builds the entries (tree1, VLAN100, port y), (tree1, VLAN101, port
   y), (tree2, VLAN100, port y), and (tree2, VLAN101, port y) based on
   RB3's LSP in the conventional way. The multicast forwarding table on
   RB1 with the merged entries would look like the following.

   +--------------+-----+---------+
   |tree nickname |VLAN |port list|
   +--------------+-----+---------+
   |   tree 1     | 10  |    x    |
   +--------------+-----+---------+
   |   tree 1     | 100 |    y    |
   +--------------+-----+---------+
   |   tree 1     | 101 |    y    |
   +--------------+-----+---------+
   |   tree 2     | 11  |    x    |
   +--------------+-----+---------+
   |   tree 2     | 100 |    y    |
   +--------------+-----+---------+
   |   tree 2     | 101 |    y    |
   +--------------+-----+---------+

   It is expected that the table is not as small as the one where every
   RBridge supports the new TREE-VLAN-USE sub-TLVs. The worst case in a
   hybrid campus is a number of entries equal to the number in current
   practice, which does not support VLAN based tree selection. Such an
   extreme case happens when the interested VLAN set from the new
   RBridges is a subset of the interested VLAN set from the old
   RBridges.

   VLAN based tree selection is compatible with the current practice.
   Its effectiveness increases with more RBridges supporting this
   feature in the TRILL campus.

5. Security Considerations

   This document does not change the general RBridge security
   considerations of the TRILL base protocol. The APPsub-TLVs specified
   can be secured using the IS-IS authentication feature [RFC5310].
   See Section 6 of [RFC6325] for general TRILL security considerations.

6. IANA Considerations

   IANA is requested to assign five new TRILL APPsub-TLV type codes as
   specified in Section 3 and update the TRILL Parameters registry as
   shown below.

      Type   Name            Reference
      ----   ----            ---------

      tbd1   TREE-VLANs      [this document]
      tbd2   TREE-VLAN-USE   [this document]
      tbd3   TREE-FGLs       [this document]
      tbd4   TREE-FGL-USE    [this document]
      tbd5   TREE-GROUPs     [this document]

7. References

7.1 Normative References

   [RFC6325] Perlman, R., et al., "Routing Bridges (RBridges): Base
             Protocol Specification", RFC 6325, July 2011.

   [RFC6439] Eastlake, D., et al., "Routing Bridges (RBridges):
             Appointed Forwarders", RFC 6439, November 2011.

   [RFC7172] Eastlake 3rd, D., Zhang, M., Agarwal, P., Perlman, R., and
             D. Dutt, "Transparent Interconnection of Lots of Links
             (TRILL): Fine-Grained Labeling", RFC 7172, May 2014.

   [RFC7176] Eastlake 3rd, D., Senevirathne, T., Ghanwani, A., Dutt, D.,
             and A. Banerjee, "Transparent Interconnection of Lots of
             Links (TRILL) Use of IS-IS", RFC 7176, May 2014.

   [rfc7180bis] Eastlake 3rd, D., et al.,
             draft-eastlake-trill-rfc7180bis, work in progress.

7.2 Informative References

   [RFC5310] Bhatia, M., Manral, V., Li, T., Atkinson, R., White, R.,
             and M. Fanto, "IS-IS Generic Cryptographic
             Authentication", RFC 5310, February 2009.

8. Acknowledgments

   The authors wish to thank David M. Bond, Liangliang Ma, and Rakesh
   Kumar R for their valuable comments (names in alphabetical order).
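
Appendix A. Illustrative Sketch of the Section 4 Example

   This appendix is non-normative. The Python sketch below shows one
   possible way to reproduce the merged multicast forwarding table from
   the Section 4 example. The function name entries_for_neighbor, its
   parameters, and the string labels for trees and ports are
   hypothetical and are not defined by this document. The sketch
   assumes that port y is downstream on both trees and that a
   neighbor's TREE-VLAN-USE content is already decoded into (tree,
   VLAN) pairs.

```python
from typing import Iterable, Optional, Set, Tuple

Entry = Tuple[str, int, str]  # (tree nickname, VLAN, port)


def entries_for_neighbor(
    port: str,
    campus_trees: Iterable[str],
    interested_vlans: Iterable[int],
    tree_vlan_use: Optional[Iterable[Tuple[str, int]]] = None,
) -> Set[Entry]:
    """Return forwarding entries contributed by one neighbor (hypothetical helper)."""
    if tree_vlan_use is not None:
        # Neighbor sent TREE-VLAN-USE: keep only the advertised (tree, VLAN) pairs.
        return {(tree, vlan, port) for tree, vlan in tree_vlan_use}
    # No TREE-VLAN-USE received: conventional behavior, every interested
    # VLAN is installed on every tree (assumes the port is on each tree).
    return {(tree, vlan, port) for tree in campus_trees for vlan in interested_vlans}


campus_trees = ["tree1", "tree2"]

table: Set[Entry] = set()
# RB2 (port x) supports the new sub-TLV.
table |= entries_for_neighbor(
    "x", campus_trees, [10, 11],
    tree_vlan_use=[("tree1", 10), ("tree2", 11)],
)
# RB3 (port y) is an old RBridge and sends no TREE-VLAN-USE.
table |= entries_for_neighbor("y", campus_trees, [100, 101])

merged = sorted(table)
for tree, vlan, port in merged:
    print(tree, vlan, port)
```

   Under these assumptions the six resulting entries correspond to the
   rows of the table in Section 4. If RB3 also advertised a
   TREE-VLAN-USE sub-TLV, its contribution would shrink from four
   entries to at most one per advertised (tree, VLAN) pair.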

Authors' Addresses

   Yizhou Li
   Huawei Technologies
   101 Software Avenue,
   Nanjing 210012
   China

   Phone: +86-25-56624629
   Email: liyizhou@huawei.com

   Donald Eastlake
   Huawei R&D USA
   155 Beaver Street
   Milford, MA 01757 USA

   Phone: +1-508-333-2270
   Email: d3e3e3@gmail.com

   Weiguo Hao
   Huawei Technologies
   101 Software Avenue,
   Nanjing 210012
   China

   Phone: +86-25-56623144
   Email: haoweiguo@huawei.com

   Hao Chen
   Huawei Technologies
   101 Software Avenue,
   Nanjing 210012
   China

   Email: philips.chenhao@huawei.com

   Radia Perlman
   EMC
   2010 256th Avenue NE, #200
   Bellevue, WA 98007
   USA

   Email: Radia@alum.mit.edu

   Naveen Nimmu
   Broadcom
   9th Floor, Building no 9, Raheja Mind space
   Hi-Tec City, Madhapur,
   Hyderabad - 500 081, INDIA

   Phone: +1-408-218-8893
   Email: naveen@broadcom.com

   Somnath Chatterjee
   Cisco Systems,
   SEZ Unit, Cessna Business Park,
   Outer ring road,
   Bangalore - 560087
   India

   Email: somnath.chatterjee01@gmail.com

   Sunny Rajagopalan
   IBM

   Email: sunny.rajagopalan@us.ibm.com