INTERNET-DRAFT                                              Mingui Zhang
Intended Status: Proposed Standard                                Huawei
                                                           Radia Perlman
                                                                     EMC
                                                            Hongjun Zhai
                                                                     JIT
                                                        Muhammad Durrani
                                                           Cisco Systems
                                                             Sujay Gupta
                                                             IP Infusion
Expires: March 28, 2016                               September 25, 2015

        TRILL Active-Active Edge Using Multiple MAC Attachments
                draft-ietf-trill-aa-multi-attach-06.txt

Abstract

   TRILL (Transparent Interconnection of Lots of Links) active-active
   service provides end stations with flow level load balance and
   resilience against link failures at the edge of TRILL campuses as
   described in RFC 7379.

   This draft specifies a method by which member RBridges in an active-
   active edge RBridge group use their own nicknames as ingress RBridge
   nicknames to encapsulate frames from attached end systems. Thus,
   remote edge RBridges (which are not in the group) will see one host
   MAC address being associated with the multiple RBridges in the group.
   Such remote edge RBridges are required to maintain all those
   associations (i.e., MAC attachments) and not to flip-flop among them,
   which would be the behavior prior to this specification. Design goals
   of this specification are discussed in the document.

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/1id-abstracts.html

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

Copyright and License Notice

   Copyright (c) 2015 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1. Introduction
   2. Acronyms and Terminology
     2.1. Acronyms and Terms
     2.2. Terminology
   3. Overview
   4. Incremental Deployable Options
     4.1. Details of Option B
       4.1.1. Advertising Data Labels for Active-Active Edge
       4.1.2. Discovery of Active-Active Edge Members
       4.1.3. Advertising Learned MAC Addresses
     4.2. Extended RBridge Capability Flags APPsub-TLV
   5. Meeting the Design Goals
     5.1. No MAC Flip-Flopping (Normal Unicast Egress)
     5.2. Regular Unicast/Multicast Ingress
     5.3. Correct Multicast Egress
       5.3.1.
          No Duplication (Single Exit Point)
       5.3.2. No Echo (Split Horizon)
     5.4. No Black-hole or Triangular Forwarding
     5.5. Load Balance Towards the AAE
     5.6. Scalability
   6. E-L1FS Backwards Compatibility
   7. Security Considerations
   8. IANA Considerations
     8.1. TRILL APPsub-TLVs
     8.2. Extended RBridge Capabilities Registry
     8.3. Active-Active Flags
   9. Acknowledgements
   10. References
     10.1. Normative References
     10.2. Informative References
   Appendix A. Scenarios for Split Horizon
   Author's Addresses

1. Introduction

   As discussed in [RFC7379], in a TRILL (Transparent Interconnection of
   Lots of Links) Active-Active Edge (AAE) topology, a Local Active-
   Active Link Protocol (LAALP), for example, a Multi-Chassis Link
   Aggregation Group (MC-LAG), is used to connect multiple RBridges to
   multi-port Customer Equipment (CE), such as a switch, vSwitch or a
   multi-port end station. In the case of a switch or vSwitch, a set of
   endnodes is attached to the CE. It is required that data traffic
   within a specific VLAN from this endnode set (including the multi-
   port end station case) can be ingressed and egressed by any of these
   RBridges simultaneously. End systems in the set can spread their
   traffic among these edge RBridges at the flow level.
   When a link fails, end systems keep using the remaining links in the
   LAALP without waiting for the convergence of TRILL, which provides
   resilience to link failures.

   Since a frame from each endnode can be ingressed by any RBridge in
   the local AAE group, a remote edge RBridge may observe multiple
   attachment points (i.e., egress RBridges) for this endnode. This
   issue is known as "MAC flip-flopping". See [RFC7379] for a
   discussion of the MAC flip-flopping issue.

   In this document, AAE member RBridges use their own nicknames to
   ingress frames into the TRILL campus. Remote edge RBridges are
   required to keep multiple points of attachment per MAC address and
   Data Label (VLAN or Fine Grained Label [RFC7172]) attached to the
   AAE. This addresses the MAC flip-flopping issue. The use of the
   solution specified in this document in one AAE group does not
   prohibit the use of other solutions in other AAE groups in the same
   TRILL campus. For example, the specification in this draft and the
   specification in [PN] could be simultaneously deployed for different
   AAE groups in the same campus.

   The main body of this document is organized as follows. Section 2
   lists acronyms and terminology. Section 3 gives the overview model.
   Section 4 provides options for incremental deployment. Section 5
   describes how this approach meets the design goals. The sections
   after Section 5 cover backwards compatibility, security, and IANA
   considerations.

2. Acronyms and Terminology

2.1. Acronyms and Terms

   AAE: Active-Active Edge

   Campus: a TRILL network consisting of TRILL switches, links, and
   possibly bridges bounded by end stations and IP routers. For TRILL,
   there is no "academic" implication in the name "campus".

   CE: Customer Equipment (end station or bridge). The device can be
   either physical or virtual equipment.

   Data Label: VLAN or FGL

   DRNI: Distributed Resilient Network Interconnect. A link aggregation
   specified in [802.1AX] that can provide an LAALP between one to
   three CEs and two or three RBridges.

   Edge RBridge: An RBridge providing end station service on one or more
   of its ports.

   E-L1FS: Extended Level 1 Flooding Scope

   ESADI: End Station Address Distribution Information [RFC7357]

   FGL: Fine Grained Label [RFC7172]

   FS-LSP: Flooding Scoped Link State PDU

   IS: Intermediate System [ISIS]

   IS-IS: Intermediate System to Intermediate System [ISIS]

   LAALP: As in [RFC7379], Local Active-Active Link Protocol. Any
   protocol similar to MC-LAG (or DRNI) that runs in a distributed
   fashion on a CE, the links from that CE to a set of edge group
   RBridges, and on those RBridges.

   LSP: Link State PDU

   MC-LAG: Multi-Chassis LAG. Proprietary extensions of Link Aggregation
   [802.1AX] that can provide an LAALP between one CE and two or more
   RBridges.

   PDU: Protocol Data Unit

   RBridge: A device implementing the TRILL protocol.

   TRILL: TRansparent Interconnection of Lots of Links or Tunneled
   Routing in the Link Layer [RFC6325] [RFC7177].

   TRILL switch: An alternative name for an RBridge.

   vSwitch: A virtual switch such as a hypervisor that also simulates a
   bridge.

2.2. Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

   Familiarity with [RFC6325], [RFC6439] and [RFC7177] is assumed in
   this document.

3.
Overview

                     +-----+
                     | RB4 |
          +----------+-----+----------+
          |                           |
          |                           |
          |      Rest of campus       |
          |                           |
          |                           |
          +-+-----+--+-----+--+-----+-+
            | RB1 |  | RB2 |  | RB3 |
            +-----\  +-----+  /-----+
                   \    |    /
                    \   |   /
                     |||LAALP1
                     |||
                    +---+
                    | B |
                    +---+
               H1 H2 H3 H4: VLAN 10

      Figure 3.1: An example topology for TRILL Active-Active Edge

   Figure 3.1 shows an example network for TRILL Active-Active Edge (see
   also Figure 1 in [RFC7379]). In this figure, endnodes (H1, H2, H3 and
   H4) are attached to a bridge B that communicates with multiple
   RBridges (RB1, RB2 and RB3) via the LAALP. Suppose RB4 is a 'remote'
   RBridge not in the AAE group in the TRILL campus. This connection
   model is also applicable to a virtualized environment where the
   physical bridge is replaced with a vSwitch and the bare metal hosts
   are replaced with virtual machines (VMs).

   For a frame received from its attached endnode sets, a member RBridge
   of the AAE group conforming to this document always encapsulates that
   frame using its own nickname as the ingress nickname, no matter
   whether the frame is unicast or multicast.

   With the options specified below, even though the remote RBridge RB4
   will see multiple attachments for each MAC address from one of the
   endnodes, MAC flip-flopping will not cause any problem.

4. Incremental Deployable Options

   Two options are specified. Option A requires new hardware support.
   Option B can be incrementally implemented throughout a TRILL campus
   with common existing TRILL fast path hardware. Further details on
   Option B are given in Section 4.1.

   -- Option A

      A new capability announcement would appear in LSPs: "I can cope
      with data plane learning of multiple attachments for an endnode".
      This mode of operation is generally not supported by existing
      TRILL fast path hardware.
      The AAE group can safely use this approach only if this capability
      is announced by all edge RBridges that have data connectivity to
      the group and are interested in any of the Data Labels in which
      the AAE is interested. If not all such RBridges announce this
      "Option A" capability, then a fallback would be needed, such as
      reverting from active-active to active-standby operation or
      isolating the RBridges that would need to support this capability
      but do not. Further details for Option A are beyond the scope of
      this document, except that in Section 4.2 a bit is reserved to
      indicate support for Option A because a remote RBridge supporting
      Option A is compatible with an AAE group using Option B.

   -- Option B

      As pointed out in Section 4.2.6 of [RFC6325] and Section 5.3 of
      [RFC7357], one MAC address may be persistently claimed to be
      attached to multiple RBridges within the same Data Label in the
      TRILL ESADI-LSPs. For Option B, AAE member RBridges make use of
      the TRILL ESADI (End Station Address Distribution Information)
      protocol to distribute multiple attachments of a MAC address.
      Remote RBridges SHOULD disable data plane MAC learning from TRILL
      Data packet decapsulation for such multi-attached MAC addresses
      unless they also support Option A. The ability to configure an
      RBridge to disable data plane learning is provided by the base
      TRILL protocol [RFC6325].

4.1. Details of Option B

   With Option B, the receiving edge RBridges MUST avoid flip-flop
   errors for MAC addresses learned from TRILL Data packet
   decapsulation for the originating RBridge within these Data Labels.
   It is RECOMMENDED that the receiving edge RBridge disable data plane
   MAC learning from TRILL Data packet decapsulation within those
   advertised Data Labels for the originating RBridge unless the
   receiving RBridge also supports Option A.
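
   The recommended behavior can be illustrated with a minimal sketch.
   It is not part of the specification: the class name, method names,
   and the way advertisements are recorded are hypothetical, and the
   sketch assumes the receiving RBridge already knows which (originating
   nickname, Data Label) pairs have been advertised by AAE members.

```python
class RemoteEdgeMacTable:
    """Hypothetical MAC table of a receiving (remote) edge RBridge."""

    def __init__(self, supports_option_a=False):
        self.supports_option_a = supports_option_a
        # (originating nickname, Data Label) pairs advertised by AAE members.
        self.aa_advertised = set()
        # (MAC, Data Label) -> set of nicknames the MAC is attached to.
        self.attachments = {}

    def note_aa_advertisement(self, nickname, data_label):
        """Record that `nickname` serves an LAALP in `data_label`."""
        self.aa_advertised.add((nickname, data_label))

    def learn_from_decapsulation(self, mac, data_label, ingress_nickname):
        """Data plane learning; returns False when learning is suppressed."""
        key = (mac, data_label)
        from_aae = (ingress_nickname, data_label) in self.aa_advertised
        if from_aae and not self.supports_option_a:
            # Option B: do not learn from the data plane; wait for ESADI.
            return False
        if from_aae:
            # Option A: keep every attachment instead of flip-flopping.
            self.attachments.setdefault(key, set()).add(ingress_nickname)
        else:
            # Ordinary learning: the latest attachment replaces the old one.
            self.attachments[key] = {ingress_nickname}
        return True
```

   In this sketch an Option B receiver simply ignores data plane
   observations for advertised pairs, so its table cannot flip-flop
   for them.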
   Alternative implementations that produce the same expected behavior,
   i.e., the receiving edge RBridge does not flip-flop among multiple
   MAC attachments, are acceptable. For example, the confidence level
   mechanism specified in [RFC6325] can be used. The receiving edge
   RBridge gives the first MAC attachment learned from the data plane a
   confidence value (e.g., 0x21) that prevails over other attachments
   learned from TRILL Data packet decapsulation. The receiving edge
   RBridge will stick to this MAC attachment until it is overridden by
   one learned from the ESADI protocol [RFC7357]. The MAC attachment
   learned from ESADI is given a higher confidence value (e.g., 0x80) so
   that it overrides any alternative learned from the decapsulation of
   received TRILL Data packets [RFC6325].

4.1.1. Advertising Data Labels for Active-Active Edge

   RBridges in an AAE group MUST participate in ESADI in the Data Labels
   enabled for their attached LAALPs. This document further registers
   two flag bits, which are used to advertise that the originating
   RBridge supports and participates in an Active-Active Edge. These two
   flags are allocated from the Interested VLANs Flag Bits that appear
   in the Interested VLANs and Spanning Tree Roots Sub-TLV and the
   Interested Labels Flag Bits that appear in the Interested Labels and
   Spanning Tree Roots Sub-TLV [RFC7176] (see Section 8.3). When these
   flags are set to 1, the originating RBridge is advertising Data
   Labels for LAALPs rather than plain LAN links.

4.1.2. Discovery of Active-Active Edge Members

   Remote edge RBridges need to discover RBridges in an AAE. This is
   achieved by listening to the following "AA LAALP Group RBridges"
   TRILL APPsub-TLV included in the TRILL GENINFO TLV in FS-LSPs
   [RFC7180bis].

   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Type = AA-LAALP-GROUP-RBRIDGES|  (2 bytes)
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Length                        |  (2 bytes)
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Sender Nickname               |  (2 bytes)
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | LAALP ID Size |                  (1 byte)
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+...+-+-+
   | LAALP ID                              (k bytes) |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+...+-+-+

   o  Type: AA LAALP Group RBridges (TRILL APPsub-TLV type tbd1)

   o  Length: 3+k

   o  Sender Nickname: The nickname the originating RBridge will use as
      the ingress nickname. This field is useful because the originating
      RBridge might own multiple nicknames.

   o  LAALP ID Size: The length k of the LAALP ID in bytes.

   o  LAALP ID: The ID of the LAALP, which is k bytes long. If the LAALP
      is an MC-LAG or DRNI, it is the 8-byte ID specified in Clause
      6.3.2 in [802.1AX].

   This APPsub-TLV is expected to change rarely: it changes only when an
   AAE group is created or eliminated, or when a link to the CE in such
   a group fails or is restored.

4.1.3. Advertising Learned MAC Addresses

   Whenever MAC addresses from the LAALP of this AAE are learned through
   ingress or configuration, the originating RBridge MUST advertise
   these MAC addresses using the MAC-Reachability TLV [RFC6165] via the
   ESADI protocol [RFC7357]. The MAC-Reachability TLVs are composed so
   that each TLV contains only MAC addresses of end-nodes attached to a
   single LAALP. Each such TLV is enclosed in a TRILL APPsub-TLV
   defined as follows.

   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Type = AA-LAALP-GROUP-MAC     |  (2 bytes)
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Length                        |  (2 bytes)
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | LAALP ID Size |                  (1 byte)
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+...+-+-+
   | LAALP ID                              (k bytes) |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+...+-+-+
   | MAC-Reachability TLV           (7 + 6*n bytes)  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+...+-+-+

   o  Type: AA LAALP Group MAC (TRILL APPsub-TLV type tbd2)

   o  Length: The MAC-Reachability TLV [RFC6165] is contained in the
      value field as a sub-TLV. The total number of bytes contained in
      the value field is given by k+8+6*n.

   o  LAALP ID Size: The length k of the LAALP ID in bytes.

   o  LAALP ID: The ID of the LAALP, which is k bytes long. Here, it
      also serves as the identifier of the AAE. If the LAALP is an
      MC-LAG (or DRNI), it is the 8-byte ID specified in Clause 6.3.2 in
      [802.1AX].

   o  MAC-Reachability sub-TLV: The AA-LAALP-GROUP-MAC APPsub-TLV value
      contains the MAC-Reachability TLV as a sub-TLV (see [RFC6165]; n
      is the number of MAC addresses present). As specified in Section
      2.2 in [RFC7356], the type and length fields of the MAC-
      Reachability TLV are encoded as unsigned 16 bit integers. The
      one-octet unsigned Confidence carried in these TLVs SHOULD be set
      to prevail over the confidence of MAC addresses learned from TRILL
      Data packet decapsulation by remote edge RBridges.

   This AA-LAALP-GROUP-MAC APPsub-TLV MUST be included in a TRILL
   GENINFO TLV [RFC7357] in the ESADI-LSP. There may be more than one
   occurrence of this TRILL APPsub-TLV in one ESADI-LSP fragment.

   For those MAC addresses contained in an AA-LAALP-GROUP-MAC APPsub-
   TLV, this document applies. Otherwise, [RFC7357] applies.
   For example, an AAE member RBridge continues to enclose MAC addresses
   learned from TRILL Data packet decapsulation in a MAC-Reachability
   TLV as per [RFC6165] and to advertise them using the ESADI protocol.

   When the remote RBridge learns MAC addresses contained in the AA-
   LAALP-GROUP-MAC APPsub-TLV via the ESADI protocol [RFC7357], it sends
   the packets destined to these MAC addresses to the closest one (the
   one to which the remote RBridge has the least cost forwarding path)
   of those RBridges in the AAE identified by the LAALP ID in the AA-
   LAALP-GROUP-MAC APPsub-TLV. If there are multiple equal least cost
   member RBridges, the ingress RBridge is required to select a unique
   one in a pseudo-random way as specified in Section 5.3 of [RFC7357].

   When another RBridge in the same AAE group receives an ESADI-LSP with
   the AA-LAALP-GROUP-MAC APPsub-TLV, it also learns the MAC addresses
   of those end-nodes served by the corresponding LAALP. These MAC
   addresses SHOULD be learned as if those end-nodes were locally
   attached to this RBridge itself.

   An AAE member RBridge MUST use the AA-LAALP-GROUP-MAC APPsub-TLV to
   advertise in ESADI the MAC addresses learned from a plain local link
   (a non-LAALP link) with Data Labels that happen to be covered by the
   Data Labels of any attached LAALP. The reason is that MAC learning
   from TRILL Data packet decapsulation within these Data Labels at the
   remote edge RBridge has normally been disabled for this RBridge.

   This APPsub-TLV changes whenever the MAC reachability situation for
   the LAALP changes.

4.2. Extended RBridge Capability Flags APPsub-TLV

   The following Extended RBridge Capability Flags APPsub-TLV will be
   included in an E-L1FS FS-LSP fragment zero [RFC7180bis] as an APPsub-
   TLV of the TRILL GENINFO-TLV.

   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Type = EXTENDED-RBRIDGE-CAP   |  (2 bytes)
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Length                        |  (2 bytes)
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Topology                      |  (2 bytes)
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |E|H|                         Reserved                          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                     Reserved (continued)                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   o  Type: Extended RBridge Capability (TRILL APPsub-TLV type tbd3)

   o  Length: Set to 8.

   o  Topology: Indicates the topology to which the capabilities apply.
      When this field is set to zero, this implies that the capabilities
      apply to all topologies or topologies are not in use [TRILL-MT].

   o  E: Bit 0 of the capability bits. When this bit is set, it
      indicates the originating RBridge acts as specified in Option B
      above.

   o  H: Bit 1 of the capability bits. When this bit is set, it
      indicates that the originating RBridge keeps multiple MAC
      attachments learned from TRILL Data packet decapsulation with fast
      path hardware, that is, it acts as specified in Option A above.

   o  Reserved: Flags extending from bit 2 through bit 63 of the
      capability bits, reserved for future use. These MUST be sent as
      zero and ignored on receipt.

   The Extended RBridge Capability Flags TRILL APPsub-TLV is used to
   notify other RBridges whether the originating RBridge supports the
   capability indicated by the E and H bits. For example, if the E bit
   is set, it indicates the originating RBridge will act as defined in
   Option B. That is, it will disable MAC learning from TRILL Data
   packet decapsulation within Data Labels advertised by AAE RBridges
   while waiting for the TRILL ESADI-LSPs to distribute the {MAC,
   Nickname, Data Label} association. Meanwhile, this RBridge is able to
   act as an AAE RBridge.
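
   As an illustration only, the E and H bits might be tested as in the
   sketch below. The function names are hypothetical, and the sketch
   assumes that the capability bits arrive as a byte string in network
   byte order with bit 0 as the most significant bit of the first byte,
   as the layout of the figure suggests.

```python
E_BIT = 0  # Option B support (bit 0 of the capability bits)
H_BIT = 1  # Option A support (bit 1 of the capability bits)


def capability_bit(flags: bytes, bit: int) -> bool:
    """Return True if capability bit `bit` is set.

    Assumption: bit 0 is the most significant bit of the first byte.
    """
    value = int.from_bytes(flags, "big")
    width = len(flags) * 8
    return bool((value >> (width - 1 - bit)) & 1)


def supports_multi_attach(flags: bytes) -> bool:
    """A remote RBridge is usable if it announces Option A or Option B."""
    return capability_bit(flags, E_BIT) or capability_bit(flags, H_BIT)
```

   The second helper reflects the statement in Section 4 that a remote
   RBridge supporting Option A is compatible with an AAE group using
   Option B.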
   An RBridge that sets the E bit and acts as an AAE RBridge is required
   to advertise MAC addresses learned from local LAALPs in TRILL ESADI-
   LSPs using the AA-LAALP-GROUP-MAC APPsub-TLV defined in Section
   4.1.3. If an RBridge in an AAE group, as specified herein, observes a
   remote RBridge that is interested in one or more of that AAE group's
   Data Labels and that does not support, as indicated by its extended
   capabilities, either Option A or Option B, then the AAE group MUST
   fall back to active-standby mode.

   This APPsub-TLV is expected to change rarely, as it only needs to be
   updated when RBridge capabilities change, for example due to an
   upgrade or reconfiguration.

5. Meeting the Design Goals

   This section explores how this specification meets the major design
   goals of AAE.

5.1. No MAC Flip-Flopping (Normal Unicast Egress)

   Since all RBridges talking with the AAE RBridges in the campus are
   able to see multiple attachments for one MAC address in ESADI
   [RFC7357], a MAC address learned from one AAE member will not be
   overwritten by the same MAC address learned from another AAE member.
   Although multiple entries for this MAC address will be created, for
   return traffic the remote RBridge is required to adhere to a unique
   one of the attachments for each MAC address rather than keep flip-
   flopping among them (see Section 4.2.6 of [RFC6325] and Section 5.3
   of [RFC7357]).

5.2. Regular Unicast/Multicast Ingress

   The LAALP guarantees that each frame will be sent upward to the AAE
   via exactly one uplink. RBridges in the AAE simply follow the process
   per [RFC6325] to ingress the frame. For example, each RBridge uses
   its own nickname as the ingress nickname to encapsulate the frame. In
   such a scenario, each RBridge takes for granted that it is the
   Appointed Forwarder for the VLANs enabled on the uplink of the LAALP.

5.3.
Correct Multicast Egress

   A fundamental design goal of AAE is that there must be no duplication
   or forwarding loop.

5.3.1. No Duplication (Single Exit Point)

   When multi-destination TRILL Data packets for a specific Data Label
   are received from the campus, it is important that exactly one
   RBridge out of the AAE group lets through each multi-destination
   packet so that no duplication happens. The LAALP will have defined
   its selection function (using a hashing or election algorithm) to
   designate a forwarder for a multi-destination frame. Since AAE member
   RBridges support the LAALP, they are able to utilize that selection
   function to determine the single exit point. If the output of the
   selection function points to the port attached to the receiving
   RBridge itself (i.e., the packet should be egressed out of this
   node), the receiving RBridge MUST egress this packet for that AAE
   group. Otherwise, the packet MUST NOT be egressed for that AAE group.
   (For ports that lead to non-AAE links, the receiving RBridge
   determines whether to egress the packet or not according to
   [RFC6325], which is updated by [RFC7172].)

5.3.2. No Echo (Split Horizon)

   When a multi-destination frame originated from an LAALP is ingressed
   by an RBridge of an AAE group, distributed to the TRILL network and
   then received by another RBridge in the same AAE group, it is
   important that this receiving RBridge does not egress this frame back
   to this LAALP. Otherwise, it will cause a forwarding loop (echo). The
   well known 'split horizon' technique (as discussed in Section 2.2.1
   of [RFC1058]) is used to eliminate the echo issue.

   RBridges in the AAE group need to split horizon based on the ingress
   RBridge nickname plus the VLAN of the TRILL Data packet. They need to
   set up per port filtering lists consisting of the tuple of
   <ingress nickname, VLAN>.
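
   A per port filtering list of this kind can be sketched as a set of
   (ingress nickname, VLAN) pairs. The sketch is illustrative only: the
   class and method names are hypothetical, and nicknames are shown as
   strings for readability rather than as 16-bit values.

```python
class LaalpPort:
    """Hypothetical port attached to an LAALP with a split horizon filter."""

    def __init__(self):
        # Set of (ingress nickname, VLAN) pairs that must not be egressed.
        self.filter_list = set()

    def block(self, ingress_nickname, vlan):
        """Add one (ingress nickname, VLAN) entry to the filtering list."""
        self.filter_list.add((ingress_nickname, vlan))

    def passes_split_horizon(self, ingress_nickname, vlan):
        """Return True if a packet with this ingress nickname and VLAN
        is not matched by the filtering list of this port."""
        return (ingress_nickname, vlan) not in self.filter_list
```

   A set gives a constant-time lookup per packet; a hardware
   implementation would more likely use an egress filter table, which
   this sketch does not attempt to model.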
   Packets with information matching any entry of the filtering list
   MUST NOT be egressed out of that port. The information for such
   filters is obtained by listening to the AA-LAALP-GROUP-RBRIDGES TRILL
   APPsub-TLVs as defined in Section 4.1.2. Note that all enabled VLANs
   MUST be consistent on all ports connected to an LAALP, so the enabled
   VLANs need not be included in these TRILL APPsub-TLVs. They can be
   locally obtained from the port attached to that LAALP. Through
   parsing these APPsub-TLVs, the receiving RBridge discovers all other
   RBridges connected to the same LAALP. The Sender Nickname of the
   originating RBridge will be added into the filtering list of the port
   attached to the LAALP. For example, RB3 in Figure 3.1 will set up a
   filtering list that looks like {<RB1, VLAN10>, <RB2, VLAN10>} on its
   port attached to LAALP1. According to split horizon, TRILL Data
   packets within VLAN10 ingressed by RB1 or RB2 will not be egressed
   out of this port.

   When there are multiple LAALPs connected to the same RBridge, these
   LAALPs may have VLANs that overlap. Here, a VLAN overlaps if its VLAN
   ID is enabled on multiple LAALPs. A customer may require that hosts
   within these overlapped VLANs communicate with each other. In
   Appendix A, several scenarios are given to explain how hosts
   communicate within the overlapped VLANs and how split horizon
   happens.

5.4. No Black-hole or Triangular Forwarding

   If a sub-link of the LAALP fails while remote RBridges continue to
   send packets towards the failed port, a black-hole happens. If the
   AAE member RBridge with that failed port starts to redirect the
   packets to other member RBridges for delivery, triangular forwarding
   occurs.

   The member RBridge attached to the failed sub-link makes use of the
   ESADI protocol to flush those failure-affected MAC addresses as
   defined in Section 5.2 of [RFC7357].
   After doing that, no packets will be sent towards the failed port,
   hence no black-hole will happen. Nor will the member RBridge need to
   redirect packets to other member RBridges, which may otherwise lead
   to triangular forwarding.

5.5. Load Balance Towards the AAE

   Since a remote RBridge can see multiple attachments of one MAC
   address in ESADI, this remote RBridge can choose to spread the
   traffic towards the AAE members on a per flow basis. Each of them is
   able to act as the egress point. In doing this, the forwarding paths
   need not be limited to the least cost path selection from the ingress
   RBridge to the AAE RBridges. The traffic load from the remote RBridge
   towards the AAE RBridges can be balanced based on a pseudo-random
   selection method (see Section 4.1).

   Note that the load balance method adopted at a remote ingress RBridge
   is not intended to replace the load balance mechanism of the LAALP.
   These two load spreading mechanisms should take effect separately.

5.6. Scalability

   With Option A, multiple attachments need to be recorded for a MAC
   address learned from AAE RBridges. More entries may be consumed in
   the MAC learning table. However, MAC addresses attached to an LAALP
   are usually only a small part of all MAC addresses in the whole TRILL
   campus. As a result, the extra space required by the multi-attached
   MAC addresses can usually be accommodated by the RBridges' unused MAC
   table space.

   With Option B, remote RBridges will keep the multiple attachments of
   a MAC address in the ESADI link state databases, which are usually
   maintained by software. In the MAC table, which is normally
   implemented in hardware, an RBridge still establishes only one entry
   for each MAC address.

6. E-L1FS Backwards Compatibility

   The Extended TLVs defined in Sections 4 and 5 are to be used in an
   Extended Level 1 Flooding Scope (E-L1FS [RFC7356] [RFC7180bis]) PDU.
   For those RBridges that do not support E-L1FS, the EXTENDED-RBRIDGE-
   CAP TRILL APPsub-TLV will not be sent out either, and MAC multi-
   attach active-active is not supported.

7. Security Considerations

   For security considerations pertaining to extensions transported by
   TRILL ESADI, see the Security Considerations section in [RFC7357].

   For extensions not transported by TRILL ESADI, RBridges may be
   configured to include the IS-IS Authentication TLV (10) in the IS-IS
   PDUs to use the IS-IS security [RFC5304][RFC5310].

   Since currently deployed LAALPs [RFC7379] are proprietary, security
   over membership in and internal management of active-active edge
   groups is proprietary. In environments where the above authentication
   is not adopted, a rogue RBridge that insinuates itself into an
   active-active edge group can disrupt end station traffic flowing into
   or out of that group. For example, if there are N RBridges in the
   group, it could typically control 1/Nth of the traffic flowing out of
   that group and a similar amount of unicast traffic flowing into that
   group.

   For general TRILL security considerations, see [RFC6325].

8. IANA Considerations

8.1. TRILL APPsub-TLVs

   IANA is requested to allocate three new types under the TRILL GENINFO
   TLV [RFC7357] for the TRILL APPsub-TLVs defined in Sections 4.1 and
   4.2 of this document. The following entries are added to the "TRILL
   APPsub-TLV Types under IS-IS TLV 251 Application Identifier 1"
   Registry on the TRILL Parameters IANA web page.

      Type         Name                       Reference
      ---------    ----                       ---------
      tbd1(252)    AA-LAALP-GROUP-RBRIDGES    [This document]
      tbd2(253)    AA-LAALP-GROUP-MAC         [This document]
      tbd3(254)    EXTENDED-RBRIDGE-CAP       [This document]

8.2.
Extended RBridge Capabilities Registry

682     IANA is requested to create a registry under the TRILL Parameters
683     registry as follows:

685        Name: Extended RBridge Capabilities

687        Registration Procedure: Expert Review

689        Reference: [this document]

691        Bit   Mnemonic  Description       Reference
692        ----  --------  -----------       ---------
693        0     E         Option B Support  [this document]
694        1     H         Option A Support  [this document]
695        2-63  -         Unassigned

697  8.3. Active-Active Flags

699     IANA is requested to allocate two flag bits, with mnemonic "AA", as
700     follows:

702     One flag bit is allocated from the Interested VLANs Flag Bits.

704        Bit       Mnemonic  Description              Reference
705        ---       --------  -----------              ---------
706        tbd4(16)  AA        VLANs for Active-Active  [This document]

708     One flag bit is allocated from the Interested Labels Flag Bits.

710        Bit       Mnemonic  Description              Reference
711        ---       --------  -----------              ---------
712        tbd5(4)   AA        FGLs for Active-Active   [This document]

714  9. Acknowledgements

716     The authors would like to thank Andrew Qu, Donald Eastlake, Erik
717     Nordmark, Fangwei Hu, Liang Xia, Weiguo Hao, Yizhou Li, and Mukhtiar
718     Shaikh for their comments and suggestions.

720  10. References

722  10.1. Normative References

724     [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
725               Requirement Levels", BCP 14, RFC 2119, March 1997.

727     [RFC6165] Banerjee, A. and D. Ward, "Extensions to IS-IS for Layer-2
728               Systems", RFC 6165, April 2011.

730     [RFC6325] Perlman, R., Eastlake 3rd, D., Dutt, D., Gai, S., and A.
731               Ghanwani, "Routing Bridges (RBridges): Base Protocol
732               Specification", RFC 6325, July 2011.

734     [RFC6439] Perlman, R., Eastlake, D., Li, Y., Banerjee, A., and F. Hu,
735               "Routing Bridges (RBridges): Appointed Forwarders", RFC
736               6439, November 2011.

738     [RFC7172] D. Eastlake 3rd and M. Zhang and P. Agarwal and R. Perlman
739               and D. Dutt, "Transparent Interconnection of Lots of Links
740               (TRILL): Fine-Grained Labeling", RFC 7172, May 2014.

742     [RFC7176] D. Eastlake 3rd and T.
Senevirathne and A. Ghanwani and D.
743               Dutt and A. Banerjee, "Transparent Interconnection of Lots
744               of Links (TRILL) Use of IS-IS", RFC 7176, May 2014.

746     [RFC7177] D. Eastlake 3rd and R. Perlman and A. Ghanwani and H. Yang
747               and V. Manral, "Transparent Interconnection of Lots of
748               Links (TRILL): Adjacency", RFC 7177, May 2014.

750     [RFC7356] Ginsberg, L., Previdi, S., and Y. Yang, "IS-IS Flooding
751               Scope Link State PDUs (LSPs)", RFC 7356, September 2014.

753     [RFC7357] Zhai, H., Hu, F., Perlman, R., Eastlake 3rd, D., and O.
754               Stokes, "Transparent Interconnection of Lots of Links
755               (TRILL): End Station Address Distribution Information
756               (ESADI) Protocol", RFC 7357, September 2014.

758     [RFC7180bis] D. Eastlake, M. Zhang, et al, "TRILL: Clarifications,
759               Corrections, and Updates", draft-ietf-trill-rfc7180bis,
760               work in progress.

762     [802.1AX] IEEE, "IEEE Standard for Local and Metropolitan Area
763               Networks - Link Aggregation", 802.1AX-2014, 24 December
764               2014.

766  10.2. Informative References

768     [RFC7379] Li, Y., Hao, W., Perlman, R., Hudson, J., and H. Zhai,
769               "Problem Statement and Goals for Active-Active Connection
770               at the Transparent Interconnection of Lots of Links (TRILL)
771               Edge", RFC 7379, October 2014.

773     [PN]      H. Zhai, T. Senevirathne, et al, "TRILL: Pseudo-Nickname
774               for Active-active Access", draft-ietf-trill-pseudonode-
775               nickname, work in progress.

777     [TRILL-MT] D. Eastlake, M. Zhang, A. Banerjee, V. Manral, "TRILL:
778               Multi-Topology", draft-eastlake-trill-multi-topology, work
779               in progress.

781     [ISIS]    ISO, "Intermediate system to Intermediate system routeing
782               information exchange protocol for use in conjunction with
783               the Protocol for providing the Connectionless-mode Network
784               Service (ISO 8473)", ISO/IEC 10589:2002.

786     [RFC5310] Bhatia, M., Manral, V., Li, T., Atkinson, R., White, R.,
787               and M. Fanto, "IS-IS Generic Cryptographic Authentication",
788               RFC 5310, February 2009.

790  Appendix A.
Scenarios for Split Horizon

792     +------------------+   +------------------+   +------------------+
793     |       RB1        |   |       RB2        |   |       RB3        |
794     +------------------+   +------------------+   +------------------+
795       L1      L2     L3      L1      L2     L3      L1      L2     L3
796     VL10~20 VL15~25 VL15   VL10~20 VL15~25 VL15   VL10~20 VL15~25 VL15
797     LAALP1  LAALP2  LAN    LAALP1  LAALP2  LAN    LAALP1  LAALP2  LAN
798       B1      B2    B10      B1      B2    B20      B1      B2    B30

800           Figure A.1: An example topology to explain split horizon

802     Suppose RB1, RB2 and RB3 are the Active-Active group connecting
803     LAALP1 and LAALP2. LAALP1 and LAALP2 are connected to B1 and B2 at
804     their other ends. Suppose all these RBridges use port L1 to connect
805     LAALP1 while they use port L2 to connect LAALP2. Assume all three L1
806     enable VLAN 10~20 while all three L2 enable VLAN 15~25, so that there
807     is an overlap of VLAN 15~20. A customer may require that hosts within
808     these overlapped VLANs communicate with each other. That is, hosts
809     attached to B1 in VLAN 15~20 need to communicate with hosts attached
810     to B2 in VLAN 15~20. Assume the remote plain RBridge RB4 also has
811     hosts attached in VLAN 15~20 which need to communicate with those
812     hosts in these VLANs attached to B1 and B2.

814     Two major requirements:

816     1. Frames ingressed from RB1-L1-VLAN 15~20 MUST NOT be egressed out
817        of ports RB2-L1 and RB3-L1. At the same time,

819     2. frames coming from B1-VLAN 15~20 should reach B2-VLAN 15~20.

821     RB3 stores the information for split horizon on its ports L1 and L2.
822     On L1: {<ingress_nickname_RB1, VLAN 10~20>, <ingress_nickname_RB2,
823     VLAN 10~20>} and on L2: {<ingress_nickname_RB1, VLAN 15~25>,
824     <ingress_nickname_RB2, VLAN 15~25>}.

826     Five clarification scenarios:

828     a. Suppose RB2/RB3 receives a TRILL multi-destination data packet
829        with VLAN 15 and ingress nickname RB1. RB3 is the single exit
830        point (selected according to the hashing function of the LAALP)
831        for this packet. On ports L1 and L2, RB3 has covered
832        <ingress_nickname_RB1, VLAN 15>, so that RB3 will not egress this
833        packet out of either L1 or L2. Here, _split horizon_ happens.
835        Beforehand, RB1 obtains a native frame on port L1 from B1 in VLAN
836        15. RB1 judges it should be forwarded as a multi-destination
837        packet across the TRILL campus. Also, RB1 replicates this frame
838        without TRILL encapsulation and sends it out of port L2, so that
839        B2 will get this frame.

841     b. Suppose RB2/RB3 receives a TRILL multi-destination data packet
842        with VLAN 15 and ingress nickname RB4. RB3 is the single exit
843        point. On ports L1 and L2, since RB3 has not stored any tuple with
844        ingress_nickname_RB4, RB3 will decapsulate the packet and egress
845        it out of both ports L1 and L2. So both B1 and B2 will receive the
846        frame.

848     c. Suppose there is a plain LAN link port L3 on RB1, RB2 and RB3,
849        connecting to B10, B20 and B30 respectively. These L3 ports happen
850        to be configured with VLAN 15. On port L3, RB2 and RB3 store no
851        information of split horizon for AAE (since this port has not been
852        configured to be in any LAALP). They will egress the packet
853        ingressed from RB1-L1 in VLAN 15.

855     d. If a packet is ingressed from RB1-L1 or RB1-L2 with VLAN 15, port
856        RB1-L3 will not egress packets with ingress-nickname-RB1. RB1
857        needs to replicate this frame without encapsulation and send it
858        out of port L3. This kind of 'bounce' behavior for multi-
859        destination frames is just as specified in paragraph 2 of Section
860        4.6.1.2 of [RFC6325].

862     e. If a packet is ingressed from RB1-L3, since RB1-L1 and RB1-L2
863        cannot egress packets with VLAN 15 and ingress-nickname-RB1, RB1
864        needs to replicate this frame without encapsulation and send it
865        out of ports L1 and L2. (Also see paragraph 2 of Section 4.6.1.2 of
866        [RFC6325].)

868  Authors' Addresses

870     Mingui Zhang
871     Huawei Technologies
872     No.156 Beiqing Rd. Haidian District,
873     Beijing 100095 P.R.
China

875     EMail: zhangmingui@huawei.com

877     Radia Perlman
878     EMC
879     2010 256th Avenue NE, #200
880     Bellevue, WA 98007 USA

882     EMail: radia@alum.mit.edu

884     Hongjun Zhai
885     Jinling Institute of Technology
886     99 Hongjing Avenue, Jiangning District
887     Nanjing, Jiangsu 211169 China

889     EMail: honjun.zhai@tom.com

891     Muhammad Durrani
892     Cisco Systems
893     170 West Tasman Dr.
894     San Jose, CA 95134

896     EMail: mdurrani@cisco.com

898     Sujay Gupta
899     IP Infusion,
900     RMZ Centennial
901     Mahadevapura Post
902     Bangalore - 560048
903     India

905     EMail: sujay.gupta@ipinfusion.com