TRILL Working Group                                              H. Zhai
Internet-Draft                                                       ZTE
Intended Status: Standards Track                         T. Senevirathne
Expires: December 27, 2014                                 Cisco Systems
                                                              R. Perlman
                                                              Intel Labs
                                                         D. Eastlake 3rd
                                                                M. Zhang
                                                                   Y. Li
                                                                  Huawei
                                                           June 25, 2014

           RBridge: Pseudo-Nickname for Active-active Access
                 draft-hu-trill-pseudonode-nickname-08

Abstract

   The IETF TRILL (TRansparent Interconnection of Lots of Links)
   protocol provides support for flow level multi-pathing for both
   unicast and multi-destination traffic in networks with arbitrary
   topology. Active-active access at the TRILL edge is the extension of
   these characteristics to end stations that are multiply connected to
   a TRILL campus.
   In this document, the group of edge RBridges (TRILL switches)
   providing active-active access to such an end station is represented
   as a Virtual RBridge. Based on the concept of the Virtual RBridge and
   its pseudo-nickname, this document facilitates TRILL active-active
   access for such end stations.

Status of This Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/1id-abstracts.html

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

Copyright and License Notice

   Copyright (c) 2014 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1. Introduction
      1.1. Terminology and Acronyms
   2. Overview
   3. Virtual RBridge and its Pseudo-nickname
   4. Member RBridges Auto-Discovery
      4.1. Discovering Member RBridge for an RBv
      4.2. Selection of Pseudo-nickname for RBv
   5. Distribution Trees and Designated Forwarder
      5.1. Different Trees for Different Member RBridges
      5.2. Designated Forwarder for Member RBridges
      5.3. Ingress Nickname Filtering
   6. TRILL traffic Processing
      6.1. Native Frames Ingressing
      6.2. Egressing TRILL Data Packets
         6.2.1. Unicast TRILL Data Packets
         6.2.2. Multi-Destination TRILL Data Packets
   7. MAC Information Synchronization in Edge Group
   8. Member Link Failure in RBv
      8.1. Link Protection for Unicast Frame Egressing
   9. TLV Extensions for Edge RBridge Group
      9.1. MC-LAG Membership (LM) Sub-TLV
      9.2. PN-RBV sub-TLV
      9.3. MAC-RI-MC-LAG Boundary sub-TLVs
   10. OAM Frames
   11. Configuration Consistency
   12. Security Considerations
   13. IANA Considerations
   14. Acknowledgments
   15. Contributing Authors
   16. References
      16.1. Normative References
      16.2. Informative References
   Authors' Addresses

1. Introduction

   The IETF TRILL protocol [RFC6325] provides optimal pair-wise data
   frame forwarding without configuration, safe forwarding even during
   periods of temporary loops, and support for multi-pathing of both
   unicast and multicast traffic. TRILL accomplishes this by using IS-IS
   [IS-IS] [RFC7176] link state routing and encapsulating traffic using
   a header that includes a hop count. Devices that implement TRILL are
   called RBridges or TRILL switches.

   In the base TRILL protocol, an end node can be attached to the TRILL
   campus via a point-to-point link or a shared link (such as a Local
   Area Network (LAN) segment). Although there might be more than one
   edge RBridge on a shared link, to avoid potential forwarding loops,
   one and only one of the edge RBridges is permitted to provide
   forwarding service for end station traffic in each VLAN (Virtual
   LAN). That RBridge is referred to as the Appointed Forwarder (AF) for
   the VLAN on the link [RFC6325] [RFC6439]. However, in some practical
   deployments, to increase the access bandwidth and reliability, an end
   station might multiply connect to several edge RBridges and treat all
   of the uplinks as a Multi-Chassis Link Aggregation (MC-LAG) bundle.
   In this case, any of the RBridges must be able to ingress traffic
   into, and egress traffic from, the TRILL campus for each given VLAN.
   These RBridges constitute an Active-Active Edge (AAE) RBridge group
   for the end station.

   Traffic with the same VLAN and source MAC address but belonging to
   different flows might be sent by such an end station to different
   member RBridges of the AAE group and then ingressed into the TRILL
   campus.
   When an RBridge receives such TRILL Data packets ingressed by
   different RBridges, it repeatedly relearns different VLAN and MAC
   address to nickname correspondences as it decapsulates the packets.
   This is known as the "MAC flip-flopping" issue; it makes most TRILL
   switches behave badly and causes the returning traffic to reach the
   destination via different paths, resulting in persistent re-ordering
   of the frames. In addition, other issues such as duplicate egressing
   and looping of multi-destination frames may also disturb the end
   stations multiply connected to the member RBridges of an AAE group
   [AAProb].

   This document addresses these AAE issues by representing an edge
   RBridge group as a Virtual RBridge (RBv) and assigning it a
   pseudo-nickname. A member RBridge of such a group uses the
   pseudo-nickname, instead of its own nickname, as the ingress RBridge
   nickname when ingressing frames received on attached MC-LAG links.

   The main body of this document is organized as follows: Section 2
   gives an overview of the TRILL active-active access issues and the
   reason that a virtual RBridge (RBv) is used to resolve the issues.
   Section 3 gives the concept of the virtual RBridge and its
   pseudo-nickname. Section 4 describes how edge RBridges constitute an
   RBv automatically and get a pseudo-nickname for the RBv. Section 5
   discusses how to protect multi-destination traffic against disruption
   due to Reverse Path Forwarding (RPF) check failure, duplication,
   forwarding loops, etc. Section 6 covers the special processing of
   native frames and TRILL Data packets at member RBridges of an RBv
   (also referred to as an Active-Active Edge (AAE) RBridge group),
   followed by Section 7, which describes the MAC information
   synchronization among the member RBridges of an RBv.
   Section 8 discusses the protection against downlink failure at a
   member RBridge, and Section 9 gives the necessary TLV extensions for
   the AAE RBridge group.

1.1. Terminology and Acronyms

   This document uses the acronyms and terms defined in [RFC6325] and
   [AAProb], and the following additional acronyms:

   CE - As in [CMT], Classic Ethernet device (end station or bridge).
   The device can be either physical or virtual equipment.

   FGL - Fine-Grained Labeling or Fine-Grained Labeled or Fine-Grained
   Label [RFC7172].

   AAE - Active-active Edge RBridge group, a group of edge RBridges to
   which at least one CE is multiply attached using MC-LAG. AAE is also
   referred to as edge group or Virtual RBridge in this document.

   RBv - Virtual RBridge, an alias of active-active edge RBridge group
   in this document.

   vDRB - The Designated RBridge in an RBv. It is responsible for
   deciding on a pseudo-nickname for the RBv.

   OE flag - A flag used by a member RBridge of an MC-LAG to tell other
   edge RBridges whether that MC-LAG is willing to share an RBv with
   other MC-LAGs that multiply attach to the same set of edge RBridges.
   If this flag is 1 for an MC-LAG, the MC-LAG needs to be served by an
   RBv of its own and is not willing to share, i.e., it should Occupy an
   RBv Exclusively (OE).

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

2. Overview

   To minimize impact during failures and maximize available access
   bandwidth, end stations (referred to as CEs in this document) may be
   multiply connected to the TRILL campus via multiple edge RBridges.
   Figure 1 shows such a typical deployment scenario, where CE1 attaches
   to RB1, RB2, ... RBk and treats all of the uplinks as a Multi-Chassis
   Link Aggregation (MC-LAG) bundle.
   Then RB1, RB2, ... RBk constitute an Active-active Edge (AAE) RBridge
   group for CE1 in this MC-LAG. Even if a member RBridge or an uplink
   fails, CE1 can still get frame forwarding service from the TRILL
   campus as long as member RBridges and uplinks remain available in the
   AAE group. Furthermore, CE1 can perform flow-based load balancing
   across the available member links of the MC-LAG bundle in the AAE
   group when it communicates with other end stations across the TRILL
   campus [AAProb].

                 ----------------------
                |                      |
                |     TRILL Campus     |
                |                      |
                 ----------------------
                     |       |    |
               +-----+       |    +--------+
               |             |             |
            +------+      +------+      +------+
            |(RB1) |      |(RB2) |      | (RBk)|
            +------+      +------+      +------+
              |..|          |..|          |..|
              |  +----+     |  |          |  |
              |   +---|-----|--|----------+  |
              | +-|---|-----+  +-----------+ |
      MC-     | | |   +------------------+ | |
      LAG1-->(| | |)                    (| | |) <--MC-LAGn
            +-------+       . . .      +-------+
            |  CE1  |                  |  CEn  |
            +-------+                  +-------+

        Figure 1 Active-Active Connection to TRILL Edge RBridges

   By design, an MC-LAG (say MC-LAG1) does not forward packets received
   on one member port to other member ports. As a result, the TRILL
   Hello messages sent by one member RBridge (say RB1) via a port to CE1
   will not be forwarded to other member RBridges by CE1. That is to
   say, member RBridges will not see each other's Hellos via the MC-LAG.
   So every member RBridge of MC-LAG1 considers itself the appointed
   forwarder for all VLANs enabled on an MC-LAG1 link and can
   ingress/egress frames simultaneously in these VLANs. The simultaneous
   flow-based ingressing/egressing may cause some problems.
   For example, simultaneous egressing of multi-destination traffic by
   multiple member RBridges will result in frame duplication at CE1 (see
   Section 3.1 of [AAProb]); simultaneous ingressing of frames
   originated by CE1 for different flows in the same VLAN will result in
   MAC address flip-flopping at remote egress RBridges (see Section 3.3
   of [AAProb]). The flip-flopping in turn causes packet re-ordering in
   reverse traffic.

   Since edge RBridges by default learn Data Label and MAC address to
   nickname correspondences by decapsulating TRILL Data packets (see
   Section 4.8.1 of [RFC6325] as updated by [RFC7172]), the MAC
   flip-flopping issue should be solved under the assumption that this
   default learning is enabled at edge RBridges. This document therefore
   specifies the Virtual RBridge, together with its pseudo-nickname, to
   fix these issues.

3. Virtual RBridge and its Pseudo-nickname

   A Virtual RBridge (RBv) represents a group of edge RBridges to which
   at least one CE is multiply attached using MC-LAG. More exactly, it
   represents a group of end station service ports on the edge RBridges
   and the end station service provided to the CE(s) on these ports,
   through which the CE(s) are multiply attached to the TRILL campus
   using MC-LAG(s). Such end station service ports are called RBv ports;
   in contrast, other access ports at edge RBridges are called regular
   access ports in this document. RBv ports are always MC-LAG connecting
   ports, but not vice versa (see Section 4.1). For an edge RBridge, if
   one or more of its end station service ports are ports of an RBv,
   that RBridge is a member RBridge of that RBv.

   For convenience of description, a Virtual RBridge is also referred to
   as an Active-Active Edge (AAE) group in this document. In the TRILL
   campus, an RBv is identified by its pseudo-nickname, which is
   different from any RBridge's regular nickname(s).
   An RBv has one and only one pseudo-nickname. Each member RBridge (say
   RB1, RB2, ..., RBk) of an RBv (say RBvn) advertises RBvn's
   pseudo-nickname, along with its regular nickname(s), using a Nickname
   sub-TLV in its TRILL IS-IS LSP (Link State PDU) [RFC7176] and SHOULD
   do so with maximum priority of use (0xFF). (Maximum priority is
   recommended to avoid the disruption to the AAE group that would occur
   if the nickname were taken away by a higher priority RBridge.) From
   these LSPs, other RBridges outside the AAE group know that RBvn is
   reachable through RB1 to RBk.

   A member RBridge (say RBi) loses its membership in RBvn when its
   last port of RBvn becomes unavailable due to failure,
   re-configuration, etc. RBi then removes RBvn's pseudo-nickname from
   its LSP and distributes the updated LSP as usual. From those updated
   LSPs, other RBridges know that their path(s) to RBvn are no longer
   available through RBi.

   When member RBridges receive native frames on their RBv ports and
   decide to ingress the frames into the TRILL campus, they use that
   RBv's pseudo-nickname instead of their own regular nicknames as the
   ingress nickname to encapsulate them into TRILL Data packets. So when
   these packets arrive at an egress RBridge, even if they are
   originated by the same end station in the same VLAN but ingressed by
   different member RBridges, no address flip-flopping is observed on
   the egress RBridge when decapsulating these packets. (When a member
   RBridge of an AAE group ingresses a frame from a non-RBv port, it
   still uses its own nickname as the ingress nickname.)

   Since an RBv is not a physical node and no TRILL frames are forwarded
   between its ports via a local MC-LAG, pseudo-node LSP(s) MUST NOT be
   created for an RBv.
   An RBv cannot act as a root when constructing distribution trees for
   multicast traffic, and its pseudo-nickname is ignored when
   determining the distribution tree root for the TRILL campus [CMT]. So
   the tree root priority of the RBv's nickname SHOULD be set to 0, and
   this nickname SHOULD NOT be listed in the "s" nicknames (see Section
   2.5 of [RFC6325]) by the RBridge holding the highest priority tree
   root nickname.

   NOTE: In order to reduce the consumption of nicknames, especially in
   a large TRILL campus with lots of RBridges and/or active-active
   accesses, when multiple CEs attach to the exact same set of edge
   RBridges via MC-LAGs, those edge RBridges should be considered as a
   single RBv with a single pseudo-nickname.

4. Member RBridges Auto-Discovery

   Edge RBridges connected to CE(s) via MC-LAG(s) can automatically
   discover each other with minimal configuration through the exchange
   of MC-LAG information.

   From the perspective of edge RBridges, a CE that connects to edge
   RBridges via an MC-LAG can be identified by the globally unique ID of
   the MC-LAG (i.e., the MC-LAG System ID [802.1AX], also referred to as
   the MC-LAG ID in this document). On each such edge RBridge, the
   access port to such a CE is associated with an MC-LAG ID for the CE.
   An MC-LAG is considered valid on an edge RBridge only if the RBridge
   still has an operational downlink to that MC-LAG. Such an edge
   RBridge advertises a list of MC-LAG IDs for all its valid local
   MC-LAGs to other edge RBridges via its TRILL IS-IS LSP(s). Based on
   the MC-LAG IDs advertised by other edge RBridges, each RBridge can
   determine which edge RBridges could constitute an AAE group (see
   Section 4.1 for more details). Then one RBridge is elected from the
   group to allocate an available nickname (i.e., the pseudo-nickname)
   for the group (see Section 4.2 for more details).

4.1. Discovering Member RBridge for an RBv

   Take Figure 2 as an example, where CE1 and CE2 multiply attach to
   RB1, RB2 and RB3 via MC-LAG1 and MC-LAG2, respectively; CE3 and CE4
   attach to RB3 and RB4 via MC-LAG3 and MC-LAG4, respectively. Assume
   MC-LAG3 is configured to occupy a Virtual RBridge by itself.

                     ---------------------
                    /                     \
                   |     TRILL Campus      |
                    \                     /
                     ---------------------
                      | |             | |
              +-------+ |             | +--------+
              |         |             |          |
        +-------+   +-------+     +-------+     +-------+
        |  RB1  |   |  RB2  |     |  RB3  |     |  RB4  |
        +-------+   +-------+     +-------+     +-------+
          |   |       |   |        | | | |       |    |
          |   +-------|-+ | +------|-+ | +-------|--+ |
          | +---------+ | | |      |   |         |  | |
          | | +---------|-|-|------+   | +-------+  | |
   MC-    | | |  MC-    | | |   MC-    | |   MC-    | |
   LAG1->(| | |) LAG2->(| | |)  LAG3->(| |)  LAG4->(| |)
        +-------+     +-------+     +-------+    +-------+
        |  CE1  |     |  CE2  |     |  CE3  |    |  CE4  |
        +-------+     +-------+     +-------+    +-------+

               Figure 2 Different MC-LAGs to TRILL Campus

   RB1 and RB2 each advertise {MC-LAG1, MC-LAG2} in the MC-LAG
   Membership sub-TLV (see Section 9.1 for more details) via their TRILL
   IS-IS LSPs; RB3 announces {MC-LAG1, MC-LAG2, MC-LAG3, MC-LAG4}; and
   RB4 announces {MC-LAG3, MC-LAG4}.

   An edge RBridge is called an MC-LAG related RBridge if it has at
   least one MC-LAG configured on an access port. On receipt of the
   MC-LAG Membership sub-TLVs, RBn ignores them if it is not an MC-LAG
   related RBridge; otherwise, RBn SHOULD use the MC-LAG information
   contained in the sub-TLVs, along with its own MC-LAG Membership
   sub-TLVs, to decide which RBv(s) it should join and which edge
   RBridges constitute each of such RBvs.
   Based on the information received, each of the 4 RBridges knows the
   following information:

      MC-LAG ID   OE-flag   Set of edge RBridges
      ---------   -------   --------------------
      MC-LAG1     0         {RB1, RB2, RB3}
      MC-LAG2     0         {RB1, RB2, RB3}
      MC-LAG3     1         {RB3, RB4}
      MC-LAG4     0         {RB3, RB4}

   Here the OE-flag indicates whether an MC-LAG is willing to share an
   RBv with other MC-LAGs that multiply attach to exactly the same set
   of edge RBridges. If the OE-flag of an MC-LAG (for example MC-LAG3)
   is 1, that MC-LAG does not want to share, so it MUST Occupy an RBv
   Exclusively (OE).

   Otherwise, the MC-LAG (for example MC-LAG1) will share an RBv with
   other MC-LAGs if possible. By default, this flag is set to 0. For an
   MC-LAG, this flag is considered to be 1 if any edge RBridge
   advertises it as 1 (see Section 9.1).

   In the above table, there might be some MC-LAGs that attach to a
   single RBridge due to mis-configuration, link failure, etc. Those
   MC-LAGs are considered invalid entries. Each of the MC-LAG related
   edge RBridges then performs the following steps to decide which valid
   MC-LAGs can be served by an RBv.

   Step 1: Take all the valid MC-LAGs that have their OE-flags set to 1
   out of the table and create an RBv per such MC-LAG.

   Step 2: Sort the remaining valid MC-LAGs in the table in descending
   order based on the number of RBridges in their associated set of
   multi-homed RBridges.

   Step 3: Take the valid MC-LAG (say MC-LAG_i) with the maximum set of
   RBridges, say S_i, out of the table and create a new RBv (say RBv_i)
   for it.

   Step 4: Walk through the remaining valid MC-LAGs in the table one by
   one, pick out all the valid MC-LAGs whose sets of multi-homed
   RBridges contain exactly the same RBridges as that of MC-LAG_i, and
   take them out of the table. Then appoint RBv_i as the servicing RBv
   for those MC-LAGs.

   Step 5: Repeat Steps 3-4 for the remaining MC-LAGs until all the
   valid entries in the table have been associated with an RBv.

   After performing the above steps, all 4 RBridges know that MC-LAG3 is
   served by an RBv, say RBv1, which has RB3 and RB4 as member RBridges;
   MC-LAG1 and MC-LAG2 are served by another RBv, say RBv2, which has
   RB1, RB2 and RB3 as member RBridges; and MC-LAG4 is served by RBv3,
   which has RB3 and RB4 as member RBridges, as shown below:

      RBv     Serving MC-LAGs      Member RBridges
      -----   ------------------   ---------------
      RBv1    {MC-LAG3}            {RB3, RB4}
      RBv2    {MC-LAG1, MC-LAG2}   {RB1, RB2, RB3}
      RBv3    {MC-LAG4}            {RB3, RB4}

   In each RBv, one of the member RBridges is elected as the DRB
   (Designated RBridge) of the RBv. This RBridge then picks an available
   nickname as the pseudo-nickname for the RBv and announces it to all
   other member RBridges of the RBv via its TRILL IS-IS LSPs (refer to
   Section 9.2 for the relevant extended sub-TLVs).

4.2. Selection of Pseudo-nickname for RBv

   As described in Section 3, in the TRILL campus, an RBv is identified
   by its pseudo-nickname. In an AAE group (i.e., RBv), one member
   RBridge is elected to select a pseudo-nickname for this RBv; this
   RBridge is called the Designated RBridge of the RBv (vDRB) in this
   document. The winner is the RBridge in the group with the largest
   IS-IS System ID, considered as an unsigned integer. Then, based on
   its TRILL IS-IS link state database and the potential
   pseudo-nickname(s) reported in the MC-LAG Membership sub-TLVs by
   other member RBridges of this RBv (see Section 9.1 for more details),
   the vDRB selects an available nickname as the pseudo-nickname for
   this RBv and advertises it to the other RBridges via its TRILL IS-IS
   LSP(s) (see Section 9.2).
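
   As a non-normative illustration, Steps 1-5 of Section 4.1 and the
   vDRB election rule above can be sketched in Python. The function
   names, the use of strings as RBridge and MC-LAG identifiers, the
   dictionary layout of the advertisements, and the tie order among
   equally sized RBridge sets are assumptions made for this example
   only; they are not defined by this document.

```python
from collections import defaultdict


def build_table(adverts):
    """Invert per-RBridge advertisements into the per-MC-LAG table.

    adverts maps an RBridge name to {MC-LAG ID: OE flag} (hypothetical
    layout).  MC-LAGs seen on a single RBridge are invalid and dropped.
    """
    members = defaultdict(set)
    oe_flag = defaultdict(bool)
    for rbridge, lags in adverts.items():
        for lag, flag in lags.items():
            members[lag].add(rbridge)
            # The flag counts as 1 if any edge RBridge advertises it as 1.
            oe_flag[lag] = oe_flag[lag] or bool(flag)
    return {lag: (oe_flag[lag], frozenset(rbs))
            for lag, rbs in members.items() if len(rbs) > 1}


def group_rbvs(table):
    """Return a list of (served MC-LAGs, member RBridges), one per RBv."""
    rbvs = []
    # Step 1: each valid MC-LAG with OE flag 1 gets an RBv of its own.
    for lag in sorted(table):
        if table[lag][0]:
            rbvs.append(({lag}, table[lag][1]))
    # Step 2: sort the rest by descending size of the RBridge set
    # (ties ordered by MC-LAG ID, an assumption of this sketch).
    remaining = sorted((lag for lag in table if not table[lag][0]),
                       key=lambda lag: (-len(table[lag][1]), lag))
    # Steps 3-5: repeatedly take the largest set and merge exact matches.
    while remaining:
        rbridges = table[remaining[0]][1]
        served = {lag for lag in remaining if table[lag][1] == rbridges}
        rbvs.append((served, rbridges))
        remaining = [lag for lag in remaining if lag not in served]
    return rbvs


def elect_vdrb(system_ids):
    """Pick the member with the largest IS-IS System ID (unsigned)."""
    return max(system_ids,
               key=lambda rb: int.from_bytes(system_ids[rb], "big"))


# Advertisements from Figure 2; MC-LAG3 has its OE flag set.
adverts = {
    "RB1": {"MC-LAG1": 0, "MC-LAG2": 0},
    "RB2": {"MC-LAG1": 0, "MC-LAG2": 0},
    "RB3": {"MC-LAG1": 0, "MC-LAG2": 0, "MC-LAG3": 1, "MC-LAG4": 0},
    "RB4": {"MC-LAG3": 1, "MC-LAG4": 0},
}
rbvs = group_rbvs(build_table(adverts))
```

   With the Figure 2 advertisements, the resulting list corresponds to
   RBv1, RBv2 and RBv3 in the table of Section 4.1. Because every member
   RBridge runs the same deterministic procedure on the same link state
   database, all members arrive at the same grouping without further
   negotiation.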
   Except as provided below, the selection of a nickname to use as the
   pseudo-nickname follows the usual TRILL rules given in [RFC6325] as
   updated by [RFC7180]. On receipt of the pseudo-nickname advertised by
   the vDRB, all the other RBridges of that group associate it with the
   MC-LAGs served by the RBv, and then download the association to their
   data plane fast path logic.

   To reduce the traffic disruption caused by nickname changes, the vDRB
   SHOULD, if possible, attempt to reuse the pseudo-nickname recently
   used by the group when selecting a nickname for the RBv. To help the
   vDRB do so, each MC-LAG related RBridge advertises a re-using
   pseudo-nickname for each of its MC-LAGs in its MC-LAG Membership
   sub-TLV if it has recently used one for that MC-LAG. Although it is
   up to the implementation of the vDRB how to treat the re-using
   pseudo-nicknames, one suggestion is given as follows:

   o If more than one available re-using pseudo-nickname is reported by
     all the member RBridges of some MC-LAGs in this RBv, the available
     one that is reported by the most such MC-LAGs is chosen as the
     pseudo-nickname for this RBv. In the case of a tie, the re-using
     pseudo-nickname with the smallest value, considered as an unsigned
     integer, is chosen.

   o If only one re-using pseudo-nickname is reported, it SHOULD be
     chosen if available.

   If there is no available re-using pseudo-nickname reported, the vDRB
   selects a nickname by its usual method.

   The selected pseudo-nickname is then announced by the vDRB to the
   other member RBridges of this RBv in the PN-RBv sub-TLV (see Section
   9.2) via its TRILL IS-IS LSP(s). After receiving the pseudo-nickname,
   the other RBridges of that RBv associate the nickname with their
   ports of that RBv and download the association to their data plane
   fast path logic.

5. Distribution Trees and Designated Forwarder

   In an AAE group (i.e., an RBv), each member RBridge considers itself
   the appointed forwarder for every VLAN, so without changes made for
   active-active connection support, all of them would ingress/egress
   frames into/from the TRILL campus for all VLANs. When more than one
   member RBridge ingresses multi-destination frames, some of the
   resulting TRILL Data packets may be discarded due to failure of the
   Reverse Path Forwarding (RPF) check on other RBridges; when more than
   one member RBridge egresses multi-destination traffic, local CE(s)
   may receive duplicate frames [AAProb]. Furthermore, in an AAE group,
   a multi-destination frame sent by a CE (say CEi) may be ingressed
   into the TRILL campus by one member RBridge and then received from
   the campus by another member RBridge, which egresses it to CEi; the
   frame thus loops back to CEi.

   In the following sub-sections, the first two issues are discussed in
   Section 5.1 and Section 5.2, respectively; the third one is discussed
   in Section 5.3.

5.1. Different Trees for Different Member RBridges

   In TRILL, RBridges use distribution trees to forward
   multi-destination frames (although under some circumstances they can
   be unicast as specified in [RFC7172]). The RPF check, along with
   other checks, is used to avoid temporary multicast loops during
   topology changes (Section 4.5.2 of [RFC6325]). The RPF check only
   allows a multi-destination frame ingressed by an RBridge RBi and
   forwarded on a distribution tree Tx to arrive at another RBridge RBn
   on an expected port. If it arrives on any other port, the frame MUST
   be dropped. To avoid address flip-flopping on remote RBridges, member
   RBridges use the RBv's pseudo-nickname instead of their regular
   nicknames as the ingress nickname when ingressing native frames,
   including multicast frames.
   From the view of other RBridges, these frames appear as if they were
   ingressed by the RBv. When multicast frames of different flows are
   ingressed by different member RBridges of an RBv and forwarded along
   the same distribution tree, they may arrive at RBn on different
   ports. Some of them will violate the RPF check principle at RBn and
   be dropped, which may result in traffic disruption.

   In an RBv, if different member RBridges use different distribution
   trees to ingress multi-destination frames, the RPF check violation
   issue can be fixed. Coordinated Multicast Trees (CMT) proposes such
   an approach and makes use of the Affinity sub-TLV defined in
   [RFC7176] to tell other RBridges which trees a member RBridge (say
   RBi) may choose when ingressing multi-destination frames; all
   RBridges in the TRILL campus then calculate RPF check information for
   RBi on those trees [CMT].

   In this document, the approach proposed in [CMT] is used to fix the
   RPF check violation issue; please refer to [CMT] for more details of
   the approach.

5.2. Designated Forwarder for Member RBridges

   Take Figure 3 as an example, where CE1 and CE2 are served by an RBv,
   which has RB1 and RB2 as member RBridges, and CE3 is attached to RB2
   only. In VLAN x, the three CEs can communicate with each other.

                 ---------------------
                /                     \
               |     TRILL Campus      |
                \                     /
                -----------------------
                    |             |
               +----+             +------+
               |                         |
          +---------+                +--------+
          |   RB1   |                |  RB2   |
          | oooooooo|oooooooooooooooo|ooooo   |
          +o--------+      RBv       +-----o--+
            o|oooo|oooooooooooooooooooo|o|o  |
             | +--|--------------------+ |   |
             | |  +---------+ +----------+   |
            (| |)<-MC-LAG1 (| |)<-MC-LAG2    |
          +-------+      +-------+       +-------+
          |  CE1  |      |  CE2  |       |  CE3  |
          +-------+      +-------+       +-------+

       Figure 3 A Topology with Multi-homed and Single-homed CEs

   When a remote RBridge (say RBn) sends a multi-destination TRILL Data
   packet in VLAN x (or the FGL that VLAN x maps to if the packet is an
   FGL one), both RB1 and RB2 will receive it. As each of them thinks it
   is the appointed forwarder for VLAN x, without changes made for
   active-active connection support, they would both forward the frame
   to CE1/CE2. As a result, CE1/CE2 would receive duplicate copies of
   the frame through this RBv.

   In another case, assume CE3 is single-homed to RB2. When it transmits
   a native multi-destination frame onto link CE3-RB2 in VLAN x, the
   frame can be locally replicated to the ports to CE1/CE2, and also
   encapsulated into a TRILL Data packet and ingressed into the TRILL
   campus. When the packet arrives at RB1 across the TRILL campus, it
   will be egressed to CE1/CE2 by RB1. CE1/CE2 then receive duplicate
   copies from RB1 and RB2.

   In this document, a Designated Forwarder (DF) for a VLAN is
   introduced to avoid the duplicate copies. The basic idea of the DF is
   to elect one RBridge per VLAN from an RBv to egress multi-destination
   TRILL Data traffic and replicate locally-received multi-destination
   native frames to the CEs served by the RBv.

   Note that the DF affects only the egressing/replicating of
   multi-destination traffic; it has no effect on the ingressing of
   frames or the forwarding/egressing of unicast frames.
   Furthermore, the DF check is performed only on RBv ports, not on
   regular access ports.

   Each RBridge in an RBv elects a DF using the same algorithm, which
   guarantees that the same RBridge is elected as DF per VLAN.

   Assume there are m MC-LAGs and k member RBridges in an RBv; each
   MC-LAG is referred to as MC-LAGi where 0 <= i < m, and each RBridge
   is referred to as RBj where 0 <= j < k. The DF election algorithm
   per VLAN is as follows:

   Step 1: For MC-LAGi, sort all the RBridges in numerically ascending
   order based on (System IDj | MC-LAGi) mod k, where "System IDj" is
   the IS-IS System ID of RBj, "|" means concatenation, and MC-LAGi is
   the MC-LAG ID for MC-LAGi. RBridges that get the same result of the
   mod operation are sorted among themselves in numerically ascending
   order of their System IDs.

   Step 2: Each RBridge in the sorted list is assigned a monotonically
   increasing number j corresponding to its position in the list, i.e.,
   the first RBridge (the one with the smallest (System ID | MC-LAG ID)
   mod k) is assigned zero and the last is assigned k-1.

   Step 3: For VLAN ID n, choose the RBridge whose number equals (n mod
   k) as DF.

   Step 4: Repeat Steps 1-3 for the remaining MC-LAGs until there is a
   DF per VLAN per MC-LAG in the RBv.
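
   The election steps above can be sketched in a few lines of Python.
   This is only an illustration under stated assumptions, not a
   normative encoding: the concatenation "System ID | MC-LAG ID" is
   assumed to be read as one big-endian unsigned integer, and the
   function names, the 6-byte System IDs, and the 8-byte MC-LAG IDs
   used here are hypothetical.

```python
from typing import List


def df_order(system_ids: List[bytes], mc_lag_id: bytes) -> List[bytes]:
    """Steps 1-2: order the member RBridges for one MC-LAG.

    Position 0 holds the RBridge with the smallest
    (System ID | MC-LAG ID) mod k; ties are broken by System ID.
    """
    k = len(system_ids)

    def sort_key(system_id: bytes):
        # Assumption: the concatenation is read as a big-endian integer.
        concatenated = int.from_bytes(system_id + mc_lag_id, "big")
        return (concatenated % k, system_id)

    return sorted(system_ids, key=sort_key)


def elect_df(system_ids: List[bytes], mc_lag_id: bytes, vlan_id: int) -> bytes:
    """Step 3: the DF for a VLAN is the RBridge numbered (vlan_id mod k)."""
    order = df_order(system_ids, mc_lag_id)
    return order[vlan_id % len(order)]


# Hypothetical identifiers, for illustration only.
rb1, rb2, rb3 = (bytes(5) + bytes([n]) for n in (1, 2, 3))
mc_lag1 = bytes(8)
mc_lag2 = bytes(7) + b"\x01"
members = [rb1, rb2, rb3]

# Step 4: repeat the election for every MC-LAG (and each VLAN of interest).
df_table = {
    (lag, vlan): elect_df(members, lag, vlan)
    for lag in (mc_lag1, mc_lag2)
    for vlan in (9, 10, 11)
}
```

   Because the sort key depends only on the identifiers and on k, every
   member RBridge that runs this computation over the same membership
   list arrives at the same DF without exchanging further messages.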

   When RBi, an MC-LAG attached RBridge, receives a multi-destination
   native frame of VLAN x, in addition to locally replicating the frame
   to regular access ports as per [RFC6325] (and [RFC7172] for FGL), it
   should also locally replicate the frame to the following RBv ports:

   1) RBv ports associated with the same pseudo-nickname as that of the
      incoming port, no matter whether RBi is the DF for the frame's
      VLAN on the outgoing ports;

   2) RBv ports on which RBi is the DF for the frame's VLAN while they
      are associated with different pseudo-nickname(s) to that of the
      incoming port.

   Furthermore, the frame MUST NOT be replicated back to the incoming
   port. For non-MC-LAG related RBridges or for non-RBv ports on an
   MC-LAG related RBridge, local replication is performed as per
   [RFC6325].

   When RBi receives a multi-destination TRILL Data packet, it MUST NOT
   egress the packet out of the RBv ports where it is not DF for the
   frame's Inner.VLAN (or for the VLAN corresponding to the Inner.Label
   if the packet is an FGL one). On the remaining RBv ports, whether the
   packet is egressed is further subject to the result of the filtering
   check of the frame's ingress nickname on these ports (see Section
   5.3).

5.3. Ingress Nickname Filtering

   As shown in Figure 3, CE1 may send multicast traffic in VLAN x to the
   TRILL campus via a member RBridge (say RB1). The traffic is then
   TRILL-encapsulated by RB1 and delivered through the TRILL campus to
   multi-destination receivers. RB2 may receive the traffic, and egress
   it back to CE1 if it is the DF for VLAN x on the port to MC-LAG1.
   Then the traffic loops back to CE1 (see Section 3.2 of [AAProb]).

   To fix the above issue, an ingress nickname filtering check is
   required by this document. The idea of this check is to check the
   ingress nickname of a multi-destination TRILL Data packet before
   egressing a copy of it out of an RBv port.
   If the ingress nickname matches the pseudo-nickname of the RBv
   (associated with the port), the filtering check fails and the copy
   MUST NOT be egressed out of that RBv port. Otherwise, the copy is
   egressed out of that port if it has also passed the other checks,
   such as the appointed forwarder check in Section 4.6.2.5 of [RFC6325]
   and the DF check in Section 5.2.

   Note that this ingress nickname filtering check has no effect on the
   multi-destination native frames received on access ports and
   replicated to other local ports (including RBv ports), since there is
   no ingress nickname associated with such frames. Furthermore, no
   pseudo-nickname is associated with an RBridge's regular access ports,
   so no ingress nickname filtering check is required on those ports.

   More details of data packet processing on RBv ports are given in the
   next section.

6. TRILL Traffic Processing

   This section provides more details of native frame and TRILL Data
   packet processing as it relates to the RBv's pseudo-nickname.

6.1. Native Frames Ingressing

   When RB1 receives a unicast native frame from one of its ports that
   has end-station service enabled, it processes the frame as described
   in Section 4.6.1.1 of [RFC6325] with the following exception.

   o If the port is an RBv port, RB1 uses the RBv's pseudo-nickname,
     instead of one of its regular nickname(s), as the ingress nickname
     when doing TRILL encapsulation on the frame.

   When RB1 receives a native BUM (Broadcast, Unknown unicast or
   Multicast) frame from one of its access ports (including regular
   access ports and RBv ports), it processes the frame as described in
   Section 4.6.1.2 of [RFC6325] with the following exceptions.

   o If the incoming port is an RBv port, RB1 uses the RBv's
     pseudo-nickname, instead of one of its regular nickname(s), as the
     ingress nickname when doing TRILL encapsulation on the frame.

   o For the copies of the frame replicated locally to RBv ports, there
     are two cases as follows:

      - If the outgoing port(s) is associated with the same
        pseudo-nickname as that of the incoming port, the copies are
        forwarded out of that outgoing port(s) after passing the
        appointed forwarder check for the frame's VLAN. That is to say,
        the copies are processed on such port(s) as specified in
        Section 4.6.1.2 of [RFC6325].

      - Else, the Designated Forwarder (DF) check is further made on
        the outgoing ports for the frame's VLAN after the appointed
        forwarder check. The copies are not output through the ports
        that fail the DF check (i.e., RB1 is not DF for the frame's
        VLAN on the ports); otherwise, the copies are forwarded out of
        the ports that pass the DF check (see Section 5.2).

   For such a received frame, the MAC address information learned by
   observing it, together with the MC-LAG ID of the incoming port,
   SHOULD be shared with the other member RBridges in the group (see
   Section 7).

6.2. Egressing TRILL Data Packets

   This section describes egress processing of the TRILL Data packets
   received on a member RBridge (say RBn). Section 6.2.1 describes the
   egress processing of unicast TRILL Data packets and Section 6.2.2
   specifies the egressing of multi-destination TRILL Data packets.

6.2.1. Unicast TRILL Data Packets

   When receiving a unicast TRILL Data packet, RBn checks the egress
   nickname in the TRILL header of the packet. If the egress nickname
   is one of RBn's regular nicknames, the packet is processed as defined
   in Section 4.6.2.4 of [RFC6325].

   If the egress nickname is the pseudo-nickname of a local RBv, RBn
   is responsible for learning the source MAC address.
   The learned {Inner.MacSA, Data Label, ingress nickname} triplet
   SHOULD be shared within the AAE group (see Section 7).

   Then the packet is decapsulated to its native form. The Inner.MacDA
   and Data Label are looked up in RBn's local forwarding tables, and
   one of the three following cases may occur. RBn uses the first case
   that applies and ignores the remaining cases:

   o If the destination end station identified by the Inner.MacDA and
     Data Label is on a local link, the native frame is sent onto that
     link with the VLAN from the Inner.VLAN, or with the VLAN
     corresponding to the Inner.Label if the packet is FGL.

   o Else if RBn can reach the destination through another member
     RBridge RBk, it tunnels the native frame to RBk by
     re-encapsulating it into a unicast TRILL Data packet and sends it
     to RBk. RBn uses RBk's regular nickname, instead of the
     pseudo-nickname, as the egress nickname for the re-encapsulation,
     and the ingress nickname remains unchanged (Section 2.4.2.1 of
     [RFC7180]). If the hop count value of the packet is too small for
     it to reach RBk safely, RBn SHOULD increase that value
     appropriately when doing the re-encapsulation. (NOTE: When
     receiving that re-encapsulated TRILL Data packet, as the egress
     nickname of the packet is RBk's regular nickname rather than the
     pseudo-nickname of a local RBv, RBk will process it as specified
     in Section 4.6.2.4 of [RFC6325], and will not re-forward it to
     another RBridge.)

   o Else, RBn does not know how to reach the destination; it sends the
     native frame out of all the local ports on which it is appointed
     forwarder for the Inner.VLAN (or appointed forwarder for the VLAN
     into which the Inner.Label maps for an FGL TRILL Data packet
     [RFC7172]).

6.2.2. Multi-Destination TRILL Data Packets

   When RBn receives a multi-destination TRILL Data packet, it checks
   and processes the packet as described in Section 4.6.2.5 of [RFC6325]
   with the following exception.

   o On each RBv port where RBn is the appointed forwarder for the
     packet's Inner.VLAN (or for the VLAN to which the packet's
     Inner.Label maps if it is an FGL TRILL Data packet), the
     Designated Forwarder check (see Section 5.2) and the Ingress
     Nickname Filtering check (see Section 5.3) are further performed.
     For such an RBv port, if either the DF check or the filtering
     check fails, the frame MUST NOT be egressed out of that port. That
     is to say, the packet is not egressed out of the port 1) if the
     port is associated with the same pseudo-nickname as the ingress
     nickname of the packet, or 2) if RBn is not the DF for the
     packet's Inner.VLAN (or the VLAN the packet's Inner.Label maps to)
     on the port; otherwise, it can be egressed out of the port.

7. MAC Information Synchronization in Edge Group

   An edge RBridge, say RB1 in MC-LAG1, may have learned a MAC address
   and Data Label to nickname correspondence for a remote host h1 when
   h1 sends a packet to CE1. The returning traffic from CE1 may go to
   any other member RBridge of MC-LAG1, for example RB2. RB2 may not
   have that correspondence stored. Therefore it has to flood the
   traffic as unknown unicast. Such flooding is unnecessary since the
   returning traffic is almost always expected and RB1 has already
   learned the address correspondence. To avoid the unnecessary
   flooding, RB1 SHOULD share the correspondence with the other RBridges
   of MC-LAG1. RB1 synchronizes the correspondence by using the MAC-RI
   sub-TLV [RFC6165] in its ESADI LSPs [ESADI].

   On the other hand, RB2 has learned the MAC&VLAN of CE1 when CE1 sends
   a frame to h1 through RB2. The returning traffic from h1 may go to
   RB1.
   RB1 may not have CE1's MAC&VLAN stored even though it is in the
   same MC-LAG for CE1 as RB2. Therefore it has to flood the traffic out
   of all its access ports where it is appointed forwarder for the VLAN
   (see Section 6.2.1). Such flooding is unnecessary since the returning
   traffic is almost always expected and RB2 has already learned CE1's
   MAC&VLAN information. To avoid that unnecessary flooding, RB2 SHOULD
   share the MAC and VLAN (or MAC and FGL if the egress port is an FGL
   port [RFC7172]) with the other RBridges of MC-LAG1. RB2 synchronizes
   the MAC and Data Label by enclosing the relevant MAC-RI TLV within a
   pair of boundary TRILL APPsub-TLVs for MC-LAG1 (see Section 9.3) in
   its ESADI LSP [ESADI]. After receiving the enclosed MAC-RI TLVs, the
   member RBridges of MC-LAG1 (i.e., MC-LAG1 related RBridges) treat
   the MAC and Data Label as if they had learned them locally on their
   member ports of MC-LAG1; the MC-LAG1 unrelated RBridges just ignore
   MC-LAG1's information contained in the boundary sub-TLVs and treat
   the MAC and Data Label per [ESADI]. Furthermore, in order to let the
   MC-LAG1 unrelated RBridges know that the MAC/Data Label is reachable
   through the RBv that provides service to MC-LAG1, the
   Topology-id/Nickname field of the MAC-RI TLV SHOULD carry the
   pseudo-nickname of the RBv rather than zero or one of the
   originating RBridge's (i.e., RB2's) regular nicknames.

8. Member Link Failure in RBv

   As shown in Figure 4, suppose the link RB1-CE1 fails. Although a new
   RBv will be formed by RB2 and RB3 to provide active-active service
   for MC-LAG1 (see Section 5), the unicast traffic to CE1 might still
   be forwarded to RB1 before the remote RBridge learns that CE1 is
   attached to the new RBv. That traffic might be disrupted by the link
   failure. Section 8.1 discusses the failure protection in this
   scenario.

   However, multi-destination TRILL Data packets can reach all member
   RBridges of the new RBv and be egressed to CE1 by either RB2 or RB3
   (i.e., the new DF for the traffic's Inner.VLAN or the VLAN the
   packet's Inner.Label maps to in the new RBv), so special actions to
   protect such multi-destination packets against down-link failure are
   not needed.

                    ------------------
                  /                    \
                 |     TRILL Campus     |
                  \                    /
                   --------------------
                     |        |      |
                 +---+        |      +----+
                 |            |           |
              +------+     +------+   +------+
              |  RB1 |     |  RB2 |   |  RB3 |
              ooooooo|ooooo|oooooo|ooo|ooooo |
             o+------+ RBv +------+   +-----o+
              o|oooo|ooooo  |oooo|ooooo|oo|o
               |    |       |  +-|-----+  |
              \|/+--|-------+  | +------+ |
             - B |  +----------|------+ | |
              /|\| +-----------+      | | |
              (| | |)<--MC-LAG1      (| | |)<--MC-LAG2
             +-------+              +-------+
             |  CE1  |              |  CE2  |
             +-------+              +-------+
                  B - Failed Link or Link bundle

      Figure 4 A Topology with Multi-homed and Single-homed CEs

8.1. Link Protection for Unicast Frame Egressing

   When the link CE1-RB1 fails, RB1 loses its direct connection to CE1.
   The MAC entry through the failed link to CE1 is removed from RB1's
   local forwarding table immediately. Another MAC entry learned from
   another member RBridge of MC-LAG1 (for example RB2, since it is still
   a member RBridge of MC-LAG1) is installed into RB1's forwarding table
   (see Section 9.3). In that new entry, RB2 (identified by one of its
   regular nicknames) is the egress RBridge for CE1's MAC address. Then
   when a TRILL Data packet to CE1 is delivered to RB1, it can be
   tunneled to RB2 after being re-encapsulated (the ingress nickname
   remains unchanged and the egress nickname is replaced by RB2's
   regular nickname) based on the above installed MAC entry (see bullet
   2 in Section 6.2.1). Then RB2 receives the frame and egresses it to
   CE1.
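
   The forwarding decision that makes this protection work (the three
   cases of Section 6.2.1) can be sketched as follows. This is a
   minimal illustration, not an implementation of any particular data
   plane: the TrillPacket and MacEntry types, the forwarding-table
   layout, and the hops_to_peer parameter are all hypothetical names
   introduced for the example.

```python
from dataclasses import dataclass, replace
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class TrillPacket:
    ingress_nickname: int
    egress_nickname: int
    hop_count: int
    inner_mac_da: str
    data_label: int


@dataclass(frozen=True)
class MacEntry:
    local_port: Optional[str] = None     # end station is on a local link
    peer_nickname: Optional[int] = None  # regular nickname of member RBk


Fdb = Dict[Tuple[str, int], MacEntry]


def egress_unicast(pkt: TrillPacket, fdb: Fdb, hops_to_peer: int = 1):
    """Pick the first applicable case for a packet sent to the pseudo-nickname."""
    entry = fdb.get((pkt.inner_mac_da, pkt.data_label))
    if entry is not None and entry.local_port is not None:
        # Case 1: destination is on a local link; send the native frame.
        return ("send_native", entry.local_port)
    if entry is not None and entry.peer_nickname is not None:
        # Case 2: tunnel to the peer. The egress nickname becomes the
        # peer's regular nickname, the ingress nickname is untouched, and
        # the hop count is raised if it is too small to reach the peer.
        tunneled = replace(
            pkt,
            egress_nickname=entry.peer_nickname,
            hop_count=max(pkt.hop_count, hops_to_peer),
        )
        return ("tunnel", tunneled)
    # Case 3: unknown destination; flood on appointed-forwarder ports.
    return ("flood", None)
```

   In this sketch a link failure is modeled simply by replacing the
   local-port entry for CE1 with an entry that names RB2's regular
   nickname; the same lookup then yields the tunnel action instead of
   local delivery.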

   After the failure recovery, RB1 learns that it can reach CE1 via link
   CE1-RB1 again, either by observing CE1's native frames or from the
   MAC information synchronization by member RBridge(s) of MC-LAG1
   described in Section 7. It then restores the MAC entry to its
   previous one and downloads it to its data plane fast path logic.

9. TLV Extensions for Edge RBridge Group

9.1. MC-LAG Membership (LM) Sub-TLV

   This sub-TLV is used by an edge RBridge to announce its associated
   MC-LAG information. It is defined as a sub-TLV of the Router
   Capability TLV (#242) and the Multi-Topology-Aware Capability
   (MT-CAP) TLV (#144). It has the following format:

      +-+-+-+-+-+-+-+-+
      |   Type= LM    |                                (1 byte)
      +-+-+-+-+-+-+-+-+
      |   Length      |                                (1 byte)
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+...+-+
      |             MC-LAG RECORD(1)              |    (11 bytes)
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+...+-+
      .                                           .
      .                                           .
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+...+-+
      |             MC-LAG RECORD(n)              |    (11 bytes)
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+...+-+

           Figure 5 MC-LAG Membership Advertisement Sub-TLV

   where each MC-LAG record has the following form:

      +--+-+-+-+-+-+-+-+
      |OE|    RESV     |                               (1 byte)
      +--+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |    Re-using Pseudo-nickname    |               (2 bytes)
      +--+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+...+-+
      |              MC-LAG System ID              |   (8 bytes)
      +--+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+...+-+

   o LM (1 byte): Defines the type of this sub-TLV, #TBD.

   o Length (1 byte): 11*n bytes, where there are n MC-LAG Records.

   o OE (1 bit): a flag indicating whether or not the MC-LAG wants to
     occupy an RBv by itself; 1 for occupying by itself (or Occupying
     Exclusively (OE)). By default, it is set to 0 on transmit. This
     bit is used for edge RBridge group auto-discovery (see Section
     4.1). For any one MC-LAG, the values of this flag might conflict
     in the LSPs advertised by different member RBridges of that
     MC-LAG.
     In that case, the flag for that MC-LAG is considered as 1.

   o RESV (7 bits): Transmitted as zero and ignored on receipt.

   o Re-using Pseudo-nickname (2 bytes): In an MC-LAG record, it
     suggests the pseudo-nickname of the AAE group serving the MC-LAG.
     If the MC-LAG is not served by any AAE group, this field MUST be
     set to zero. It is used by the originating RBridge to help the
     vDRB to reuse the pseudo-nickname of an AAE group (see Section
     4.2).

   o MC-LAG System ID (8 bytes): The System ID of the MC-LAG as
     specified in Section 5.3.2 in [802.1AX].

   On receipt of such a sub-TLV, if RBn is not an MC-LAG related edge
   RBridge, it ignores the sub-TLV; otherwise, it parses the sub-TLV.
   When, compared to its old copy, new MC-LAGs are found or old ones
   are withdrawn, and they are also configured on RBn, RBn is triggered
   to perform the "Member RBridges Auto-Discovery" approach described
   in Section 4.1.

9.2. PN-RBv sub-TLV

   The PN-RBv sub-TLV is used by the Designated RBridge of a Virtual
   RBridge (vDRB) to appoint a pseudo-nickname for the MC-LAGs served
   by the RBv. It is defined as a sub-TLV of the Router Capability TLV
   (#242) and the Multi-Topology-Aware Capability (MT-CAP) TLV (#144).
   It has the following format:

      +-+-+-+-+-+-+-+-+
      | Type= PN-RBv  |                                  (1 byte)
      +-+-+-+-+-+-+-+-+
      |    Length     |                                  (1 byte)
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |     RBv's Pseudo-Nickname     |                  (2 bytes)
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+...+-+
      |            MC-LAG System ID (1)             |    (8 bytes)
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+...+-+
      .                                             .
      .                                             .
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+...+-+
      |            MC-LAG System ID (n)             |    (8 bytes)
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+...+-+

   o PN-RBv (1 byte): Defines the type of this sub-TLV, #TBD.

   o Length (1 byte): 2+8*n bytes, where there are n MC-LAG System IDs.

   o RBv's Pseudo-Nickname (2 bytes): The appointed pseudo-nickname for
     the RBv that serves the MC-LAGs listed in the following fields.

   o MC-LAG System ID (8 bytes): The System ID of the MC-LAG as
     specified in Section 5.3.2 in [802.1AX].

   On receipt of such a sub-TLV, if RBn is not an MC-LAG related edge
   RBridge, it ignores the sub-TLV. Otherwise, if RBn is also a member
   RBridge of the RBv identified by the list of MC-LAGs, it associates
   the pseudo-nickname with the ports of these MC-LAGs and downloads the
   association onto its data plane fast path logic.

9.3. MAC-RI-MC-LAG Boundary sub-TLVs

   In this document, two sub-TLVs are used as boundary sub-TLVs by an
   edge RBridge to enclose the MAC-RI TLV(s) containing the MAC address
   information learned from a local port of an MC-LAG when this RBridge
   wants to share the information with other edge RBridges. They are
   defined as TRILL APPsub-TLVs [ESADI]. The MAC-RI-MC-LAG-INFO-START
   sub-TLV has the following format:

      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |Type = MAC-RI-MC-LAG-INFO-START|               (2 bytes)
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |            Length             |               (2 bytes)
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-...+-+-+-+-+-+-+
      |             MC-LAG System ID             |    (8 bytes)
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-...+-+-+-+-+-+-+

   o MAC-RI-MC-LAG-INFO-START (2 bytes): Defines the type of this
     sub-TLV, #TBD.

   o Length (2 bytes): 8.

   o MC-LAG System ID (8 bytes): The System ID of the MC-LAG as
     specified in Section 5.3.2 in [802.1AX]. This ID identifies the
     MC-LAG for all MAC addresses contained in the following MAC-RI
     TLVs until a MAC-RI-MC-LAG-INFO-END sub-TLV is encountered.
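
   As an illustration of the wire layout just described, the following
   sketch packs and parses a MAC-RI-MC-LAG-INFO-START sub-TLV. It
   assumes the 2-byte Type and 2-byte Length widths shown in the figure
   and network byte order. The type code is still "#TBD" in this
   document, so TYPE_INFO_START below is a made-up placeholder, and the
   function names are hypothetical.

```python
import struct

# Hypothetical placeholder: the real type code is "#TBD" until assigned.
TYPE_INFO_START = 0xFFF0

# 2-byte Type followed by 2-byte Length, network byte order (assumed).
_HEADER = struct.Struct("!HH")


def encode_info_start(mc_lag_system_id: bytes) -> bytes:
    """Build the sub-TLV: Type, Length = 8, then the 8-byte System ID."""
    if len(mc_lag_system_id) != 8:
        raise ValueError("MC-LAG System ID must be 8 bytes")
    return _HEADER.pack(TYPE_INFO_START, 8) + mc_lag_system_id


def decode_info_start(data: bytes) -> bytes:
    """Return the MC-LAG System ID carried in a START sub-TLV."""
    sub_type, length = _HEADER.unpack_from(data, 0)
    if sub_type != TYPE_INFO_START or length != 8:
        raise ValueError("not a MAC-RI-MC-LAG-INFO-START sub-TLV")
    value = data[_HEADER.size:_HEADER.size + length]
    if len(value) != length:
        raise ValueError("truncated sub-TLV")
    return value
```

   A receiver that recognizes the sub-TLV would remember the returned
   System ID and apply it to the MAC-RI TLVs that follow; one that does
   not recognize the type can skip the value using the Length field.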

   The MAC-RI-MC-LAG-INFO-END sub-TLV is defined as follows:

      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      | Type = MAC-RI-MC-LAG-INFO-END |               (2 bytes)
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |            Length             |               (2 bytes)
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   o MAC-RI-MC-LAG-INFO-END (2 bytes): Defines the type of this
     sub-TLV, #TBD.

   o Length (2 bytes): 0.

   This pair of sub-TLVs can be carried multiple times in a message and
   in multiple messages. When an MC-LAG related edge RBridge (say RBn)
   wants to share with other edge RBridges the MAC addresses learned on
   its local ports of different MC-LAGs, it uses one or more pairs of
   such sub-TLVs for each of these MC-LAGs in its ESADI LSPs. Each pair
   encloses the MAC-RI TLVs containing the MAC addresses learned from
   the MC-LAG. Furthermore, if the MC-LAG is served by a local RBv, the
   value of the Topology ID/Nickname field in the relevant MAC-RI TLVs
   SHOULD be the pseudo-nickname of the RBv rather than one of RBn's
   regular nicknames or zero. Then on receipt of such a MAC-RI TLV,
   remote RBridges know that the contained MAC addresses are reachable
   through the RBv.

   On receipt of such boundary sub-TLVs, when the edge RBridge is not an
   MC-LAG related one or cannot recognize such sub-TLVs, it ignores them
   and continues to parse the enclosed MAC-RI TLVs per [ESADI].
   Otherwise, the recipient parses the boundary sub-TLVs, and

   1) If the edge RBridge is configured with the contained MC-LAG and
      the MC-LAG is also enabled locally, it treats all the MAC
      addresses contained in the following MAC-RI TLVs enclosed by the
      corresponding pair of boundary sub-TLVs as if they were learned
      from its local port of that MC-LAG;

   2) Else, it ignores these boundary sub-TLVs and continues to parse
      the following MAC-RI TLVs per [ESADI] until another pair of
      boundary sub-TLVs is encountered.

10. OAM Frames

   Attention must be paid when generating OAM frames. To ensure the
   response messages can return to the originating member RBridge of an
   RBv, the pseudo-nickname cannot be used as the ingress nickname in
   TRILL OAM messages, except in the response to an OAM message that
   has that RBv's pseudo-nickname as its egress nickname. For example,
   assume RB1 is a member RBridge of RBvi. RB1 cannot use RBvi's
   pseudo-nickname as the ingress nickname when originating OAM
   messages; otherwise the responses to the messages may be delivered
   to another member RBridge of RBvi rather than RB1. But when RB1
   responds to an OAM message with RBvi's pseudo-nickname as egress
   nickname, it can use that pseudo-nickname as the ingress nickname in
   the response message.

   Since OAM messages cannot be used by RBridges for the learning of MAC
   addresses (Section 3.2.1 of [RFC7174]), it will not lead to MAC
   address flip-flopping at a remote RBridge even though RB1 uses its
   regular nicknames as ingress nicknames in its TRILL OAM messages
   while using RBvi's pseudo-nickname in its TRILL Data packets.

11. Configuration Consistency

   The VLAN membership of all the RBridge ports in an MC-LAG MUST be
   the same. Any inconsistencies in VLAN membership may result in
   packet loss or non-shortest paths.

   Take Figure 1 for example, and suppose RB1 configures VLAN1 and VLAN2
   for the link CE1-RB1, while RB2 only configures VLAN1 for the CE1-RB2
   link. Both RB1 and RB2 use the same ingress nickname RBv for all
   frames originating from CE1. Hence, a remote RBridge RBx will learn
   that CE1's MAC address in VLAN2 is originating from RBv. As a
   result, on the returning path, remote RBridge RBx may deliver VLAN2
   traffic to RB2.
   However, RB2 does not have VLAN2 configured on the CE1-RB2 link, and
   hence the frame may be dropped or has to be redirected to RB1 if RB2
   knows RB1 can reach CE1 in VLAN2.

   Furthermore, if any VLAN in an MC-LAG is mapped by edge RBridges to
   an FGL [RFC7172], the mapping MUST be the same for all edge RBridge
   ports in the MC-LAG. Otherwise, for example, unicast FGL TRILL Data
   packets from remote RBridges may get mapped into different VLANs
   depending on which edge RBridge receives and egresses them.

12. Security Considerations

   This draft does not introduce any extra security risks. For general
   TRILL Security Considerations, see [RFC6325]. For ESADI Security
   Considerations, see [ESADI].

13. IANA Considerations

   IANA is requested to allocate code points for the 4 sub-TLVs defined
   in Section 9.

14. Acknowledgments

   We would like to thank Mingjiang Chen for his contributions to this
   document. Additionally, we would like to thank Erik Nordmark, Les
   Ginsberg, Ayan Banerjee, Dinesh Dutt, Anoop Ghanwani, Janardhanan
   Pathang, Jon Hudson and Fangwei Hu for their good questions and
   comments.

15. Contributing Authors

   Weiguo Hao
   Huawei Technologies
   101 Software Avenue,
   Nanjing 210012
   China

   Phone: +86-25-56623144
   Email: haoweiguo@huawei.com

16. References

16.1. Normative References

   [CMT]     T. Senevirathne, J. Pathangi, and J. Hudson, "Coordinated
             Multicast Trees (CMT) for TRILL",
             draft-ietf-trill-cmt-01.txt, Work in Progress, April 2014.

   [ESADI]   H. Zhai, F. Hu, R. Perlman, D. Eastlake, "TRILL
             (Transparent Interconnection of Lots of Links): The ESADI
             (End Station Address Distribution Information) Protocol",
             draft-ietf-trill-esadi-09, June 2014.

   [RFC1195] R. Callon, "Use of OSI IS-IS for routing in TCP/IP and
             dual environments", RFC 1195, December 1990.

   [RFC2119] S. Bradner, "Key words for use in RFCs to Indicate
             Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC6325] R. Perlman, D. Eastlake, D. Dutt, S. Gai, and A.
             Ghanwani, "Routing Bridges (RBridges): Base Protocol
             Specification", RFC 6325, July 2011.

   [RFC6165] Banerjee, A. and D. Ward, "Extensions to IS-IS for Layer-2
             Systems", RFC 6165, April 2011.

   [RFC7172] Eastlake 3rd, D., Zhang, M., Agarwal, P., Perlman, R., and
             D. Dutt, "Transparent Interconnection of Lots of Links
             (TRILL): Fine-Grained Labeling", RFC 7172, May 2014.

   [RFC7176] D. Eastlake, A. Banerjee, A. Ghanwani, and R. Perlman,
             "Transparent Interconnection of Lots of Links (TRILL) Use
             of IS-IS", RFC 7176, May 2014.

   [RFC7180] D. Eastlake, M. Zhang, A. Ghanwani, V. Manral and A.
             Banerjee, "Transparent Interconnection of Lots of Links
             (TRILL): Clarifications, Corrections, and Updates",
             RFC 7180, May 2014.

   [802.1AX] IEEE, "IEEE Standard for Local and Metropolitan Area
             Networks: Link Aggregation", 802.1AX-2008, 1 January 2008.

16.2. Informative References

   [AAProb]  Y. Li, W. Hao, R. Perlman, J. Hudson and H. Zhai, "Problem
             Statement and Goals for Active-Active TRILL Edge",
             draft-ietf-trill-active-active-connection-prob-04, June
             2014.

Authors' Addresses

   Hongjun Zhai
   ZTE Corporation
   68 Zijinghua Road, Yuhuatai District
   Nanjing, Jiangsu 210012
   China

   Phone: +86 25 52877345
   Email: zhai.hongjun@zte.com.cn

   Tissa Senevirathne
   Cisco Systems
   375 East Tasman Drive
   San Jose, CA 95134
   USA

   Phone: +1-408-853-2291
   Email: tsenevir@cisco.com

   Radia Perlman
   Intel Labs
   2200 Mission College Blvd
   Santa Clara, CA 95054-1549
   USA

   Phone: +1-408-765-8080
   Email: Radia@alum.mit.edu

   Donald Eastlake 3rd
   Huawei Technologies
   155 Beaver Street
   Milford, MA 01757
   USA

   Phone: +1-508-333-2270
   Email: d3e3e3@gmail.com

   Mingui Zhang
   Huawei Technologies
   Huawei Building, No.156 Beiqing Rd.
   Beijing, Beijing 100095
   China

   Email: zhangmingui@huawei.com

   Yizhou Li
   Huawei Technologies
   101 Software Avenue,
   Nanjing 210012
   China

   Phone: +86-25-56625409
   Email: liyizhou@huawei.com