idnits 2.17.1 draft-ietf-trill-active-active-connection-prob-07.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- No issues found here. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year == Line 240 has weird spacing: '...o which uplin...' -- The document date (August 25, 2014) is 3532 days in the past. Is this intentional? Checking references for intended status: Informational ---------------------------------------------------------------------------- ** Obsolete normative reference: RFC 6439 (Obsoleted by RFC 8139) Summary: 1 error (**), 0 flaws (~~), 2 warnings (==), 1 comment (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 TRILL Working Group Yizhou Li 3 INTERNET-DRAFT Weiguo Hao 4 Intended Status: Informational Huawei Technologies 5 Radia Perlman 6 EMC 7 Jon Hudson 8 Brocade 9 Hongjun Zhai 10 ZTE 11 Expires: Feb 26, 2015 August 25, 2014 13 Problem Statement and Goals for Active-Active TRILL Edge 14 draft-ietf-trill-active-active-connection-prob-07 16 Abstract 18 The IETF TRILL (Transparent Interconnection of Lots of Links) 19 protocol provides support for flow level multi-pathing with rapid 20 failover for both unicast and multi-destination traffic in networks 21 with arbitrary topology. 
Active-active at the TRILL edge is the 22 extension of these characteristics to end stations that are multiply 23 connected to a TRILL campus. This informational document discusses 24 the high level problems and goals when providing active-active 25 connection at the TRILL edge. 27 Status of this Memo 29 This Internet-Draft is submitted to IETF in full conformance with the 30 provisions of BCP 78 and BCP 79. 32 Internet-Drafts are working documents of the Internet Engineering 33 Task Force (IETF), its areas, and its working groups. Note that 34 other groups may also distribute working documents as 35 Internet-Drafts. 37 Internet-Drafts are draft documents valid for a maximum of six months 38 and may be updated, replaced, or obsoleted by other documents at any 39 time. It is inappropriate to use Internet-Drafts as reference 40 material or to cite them other than as "work in progress." 42 The list of current Internet-Drafts can be accessed at 43 http://www.ietf.org/1id-abstracts.html 45 The list of Internet-Draft Shadow Directories can be accessed at 46 http://www.ietf.org/shadow.html 48 Copyright and License Notice 50 Copyright (c) 2014 IETF Trust and the persons identified as the 51 document authors. All rights reserved. 53 This document is subject to BCP 78 and the IETF Trust's Legal 54 Provisions Relating to IETF Documents 55 (http://trustee.ietf.org/license-info) in effect on the date of 56 publication of this document. Please review these documents 57 carefully, as they describe your rights and restrictions with respect 58 to this document. Code Components extracted from this document must 59 include Simplified BSD License text as described in Section 4.e of 60 the Trust Legal Provisions and are provided without warranty as 61 described in the Simplified BSD License.
63 Table of Contents 65 1. Introduction 66 1.1 Terminology 67 2. Target Scenario 68 2.1 LAALP and Edge Group Characteristics 69 3. Problems in Active-Active at the TRILL Edge 70 3.1 Frame Duplications 71 3.2 Loop Back 72 3.3 Address Flip-Flop 73 3.4 Unsynchronized Information Among Member RBridges 74 4. High Level Requirements and Goals for Solutions 75 5. Security Considerations 76 6. IANA Considerations 77 7. Acknowledgments 78 8. References 79 8.1 Normative References 80 8.2 Informative References 81 Authors' Addresses 83 1. Introduction 85 The IETF TRILL (Transparent Interconnection of Lots of Links) 86 [RFC6325] protocol provides loop free and per hop based multipath 87 data forwarding with minimum configuration. TRILL uses [IS-IS] 88 [RFC6165] [RFC7176] as its control plane routing protocol and defines 89 a TRILL specific header for user data. In a TRILL campus, 90 communications between TRILL switches can 92 (1) use multiple parallel links and/or paths, 94 (2) spread load over different links and/or paths at a fine grained 95 flow level through equal cost multipathing of unicast traffic and 96 multiple distribution trees for multi-destination traffic, and 98 (3) rapidly re-configure to accommodate link or node failures or 99 additions. 101 "Active-active" is the extension, to the extent practical, of similar 102 load spreading and robustness to the connections between end stations 103 and the TRILL campus.
Such end stations may have multiple ports and 104 will be connected, directly or via bridges, to multiple edge TRILL 105 switches. It must be possible, except in some failure conditions, to 106 spread end station traffic load at the granularity of flows across 107 links to such multiple edge TRILL switches and rapidly re-configure 108 to accommodate topology changes. 110 1.1 Terminology 112 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 113 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 114 document are to be interpreted as described in RFC 2119 [RFC2119]. 116 The acronyms and terminology in [RFC6325] are used herein with the 117 following additions: 119 CE - Customer Equipment (end station or bridge). 121 Data Label - VLAN or FGL (Fine Grained Label [RFC7172]). 123 LAALP - Local Active-Active Link Protocol. Any protocol similar to 124 MC-LAG that runs in a distributed fashion on a CE, the links from 125 that CE to a set of edge group RBridges, and on those RBridges. 127 MC-LAG - Multi-Chassis Link Aggregation. Proprietary extensions to 128 the IEEE Std 802.1AX-2011 [802.1AX] standard so that the aggregated links 129 can, at one end of the aggregation, attach to different switches. 131 Edge group - a group of edge RBridges to which at least one CE is 132 multiply attached using an LAALP. When multiple CEs attach to the 133 exact same set of edge RBridges, those edge RBridges can be 134 considered as a single edge group. An RBridge can be in more than one 135 edge group. 137 RBridge - Routing Bridge - an alternative name for a TRILL switch. 139 TRILL - Transparent Interconnection of Lots of Links. 141 TRILL switch - a device that implements the TRILL protocol; an 142 alternative term for an RBridge. 144 2. Target Scenario 146 This section presents a typical scenario of active-active connections 147 to a TRILL campus via multiple edge RBridges where the current TRILL 148 appointed forwarder mechanism does not work as expected.
150 The TRILL appointed forwarder mechanism [RFC6439] can handle fail 151 over (active-standby), provides loop avoidance and, with 152 administrative configuration, provides load spreading based on VLAN. 153 One and only one appointed RBridge can ingress/egress native frames 154 into/from the TRILL campus for a given VLAN among all edge RBridges 155 connecting a legacy network to the TRILL campus. This is true whether 156 the legacy network is a simple point-to-point link or a complex 157 bridged LAN or anything in between. By carefully selecting different 158 RBridges as appointed forwarder for different sets of VLANs, load 159 spreading over different edge RBridges across different Data Labels 160 can be achieved. 162 The appointed forwarder mechanism [RFC6439] requires all of the edge 163 group RBridges to exchange TRILL IS-IS Hello packets through their 164 access ports. As Figure 1 shows, when multiple access links of 165 multiple edge RBridges are connected to a CE by an LAALP, Hello 166 messages sent by RB1 via access port to CE1 will not be forwarded to 167 RB2 by CE1. RB2 (and other members of LAALP1) will not see that Hello 168 from RB1 via the LAALP1. Every member RBridge of LAALP1 therefore considers 169 itself the appointed forwarder on its LAALP1 link for all VLANs and 170 will ingress/egress frames. Hence the appointed forwarder mechanism 171 cannot provide active-active or even active-standby service across 172 the edge group in such a scenario.

                 ----------------------
                |                      |
                |   TRILL Campus       |
                |                      |
                 ----------------------
                     |       |    |
                -----        |     --------
               |             |             |
           +------+      +------+      +------+
           |      |      |      |      |      |
           |(RB1) |      |(RB2) |      | (RBk)|
           +------+      +------+      +------+
             |..|          |..|          |..|
             |  +----+     |  |          |  |
             |   +---|-----|--|----------+  |
             | +-|---|-----+  +-----------+ |
             | | |   +------------------+ | |
    LAALP1--->(| | |)                    (| | |) <---LAALPn
            +-------+    .  .  .       +-------+
            | CE1   |                  | CEn   |
            |       |                  |       |
            +-------+                  +-------+

197 Figure 1 Active-Active connection to TRILL edge RBridges
199 Active-Active connection is useful when we want to achieve the 200 following two goals: 202 - Flow rather than VLAN based load balancing is desired. 204 - More rapid failure recovery is desired. The current appointed 205 forwarder mechanism relies on the TRILL Hello timer expiration to 206 detect the unreachability of another edge RBridge connecting to the 207 same local link. Then re-appointing the forwarder for specific VLANs 208 may be required. Such procedures take time on the scale of seconds 209 although this can be improved with TRILL use of BFD [RFC7175]. 210 Active-Active connection usually has a faster built-in mechanism for 211 member node and/or link failure detection. Faster detection of 212 failures minimizes the frame loss and recovery time. 214 Today LAALP is usually a proprietary facility whose implementation 215 varies by vendor. So, to be sure the LAALP operates successfully 216 across a group of edge RBridges, those edge RBridges will almost 217 always have to be from the same vendor. In the case where the LAALP 218 is an MC-LAG, the CE normally implements standard [802.1AX] logic so 219 proprietary elements would only be at the edge group end. There is 220 also a revision of [802.1AX] underway (802.1AX-REV) to remove the 221 restriction in [802.1AX] that there be one box at each end of the 222 aggregation. So it is possible that in the future an LAALP could be 223 implemented through such a revised [802.1AX] with standards 224 conformant logic at both the CE and edge group ends. In order to have 225 a common understanding of active-active connection scenarios, the 226 assumptions in Section 2.1 are made about the characteristics of the 227 LAALP and edge group of RBridges.
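The difference between VLAN based and flow based load spreading can be illustrated with a minimal sketch. It is not taken from any TRILL or LAALP implementation: the edge group membership, the modulo rule standing in for administrative appointed forwarder configuration, and the CRC32 flow hash are all assumptions chosen for illustration only.

```python
import zlib

# Hypothetical edge group; the names are illustrative only.
EDGE_RBRIDGES = ["RB1", "RB2", "RB3"]


def vlan_based_forwarder(vlan_id, rbridges=EDGE_RBRIDGES):
    """VLAN based spreading: every frame in a VLAN uses the same RBridge.

    The modulo rule is an assumed stand-in for administrative
    configuration of appointed forwarders per VLAN.
    """
    return rbridges[vlan_id % len(rbridges)]


def flow_based_uplink(src_mac, dst_mac, vlan_id, rbridges=EDGE_RBRIDGES):
    """Flow based spreading: a hash of header fields picks the uplink.

    CRC32 is an assumed hash; real LAALP hash rules are vendor specific.
    """
    key = f"{src_mac}|{dst_mac}|{vlan_id}".encode()
    return rbridges[zlib.crc32(key) % len(rbridges)]


if __name__ == "__main__":
    src = "00:00:5e:00:53:01"
    for last_octet in range(2, 6):
        dst = f"00:00:5e:00:53:{last_octet:02x}"
        # Same VLAN: the VLAN based choice never changes, while the
        # flow based choice may differ from one flow to the next.
        print(dst, vlan_based_forwarder(10), flow_based_uplink(src, dst, 10))
```

With VLAN based spreading all traffic of a VLAN is pinned to one edge RBridge, so a single busy VLAN cannot use the other up-links. The flow based rule keeps each flow on one up-link while allowing different flows in the same VLAN to use different edge RBridges.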
229 2.1 LAALP and Edge Group Characteristics 231 For a CE connecting to multiple edge RBridges via an LAALP (active- 232 active connection), the following characteristics apply: 234 a) The LAALP will deliver a frame from an endnode to TRILL at exactly 235 one edge group RBridge. 236 b) The LAALP will never forward frames it receives from one up-link 237 to another. 238 c) The LAALP will attempt to send all frames for a given flow on the 239 same uplink. To do this, it has some unknown rule for which frames 240 get sent to which uplinks (typically based on a simple hash function 241 of Layer 2 through 4 header fields). 242 d) Frames are accepted from any of the uplinks and passed down to 243 endnodes (if any exist). 244 e) The LAALP cannot be assumed to send useful control information to 245 the up-link such as "this is the set of other RBridges to which this 246 CE is attached", or "these are all the MAC addresses attached". 248 For an edge group of RBridges to which a CE is multiply attached with 249 an LAALP: 251 a) Any two RBridges in the edge group are reachable from each other 252 via the TRILL campus. 253 b) Each RBridge in the edge group knows an ID for each LAALP instance 254 multiply attached to that group. The ID will be consistent across 255 the edge group and globally unique across the TRILL campus. For 256 example, if CE1 attaches to RB1, RB2, ... RBn using an LAALP, then 257 each of the RBs will know, for the port to CE1, that it has some label 258 such as "LAALP1". 259 c) Each RB in the edge group can be configured with the set of 260 acceptable VLANs for the ports to any CE. The acceptable VLANs 261 configured for those ports should include all the VLANs the CE has 262 joined and be consistent for all the member RBridges of the edge 263 group. 264 d) When an RBridge fails, all the other RBridges having formed any 265 LAALP instance with it learn of the failure in a timely fashion.
266 e) When a down-link of an edge group RBridge to an LAALP instance 267 fails, that RBridge and all the other RBridges participating in the 268 LAALP instance including that down-link know of the failure in a 269 timely fashion. 270 f) The RBridges in the edge group have some mechanism to exchange 271 information with each other, including the set of CEs they are 272 connecting to or the IDs of the LAALP instances their down-links are 273 part of. 275 Other than the applicable characteristics above, the internals of an 276 LAALP are out of scope for TRILL. 278 3. Problems in Active-Active at the TRILL Edge 280 This section presents the problems that need to be addressed in 281 active-active connection scenarios. The topology in Figure 1 is used 282 in the following sub-sections as the example scenario for 283 illustration purposes. 285 3.1 Frame Duplications 287 When a remote RBridge ingresses a multi-destination TRILL Data packet 288 in VLAN x, all edge group RBridges of LAALP1 will receive the frame 289 if any local CE1 joins VLAN x. As each of them thinks it is the 290 appointed forwarder for VLAN x, without changes made for active- 291 active connection support, they would all forward the frame to CE1. 292 The bad consequence is that CE1 receives multiple copies of that 293 multi-destination frame from the remote end host source. 295 Frame duplication may also occur when an ingress RBridge is non- 296 remote, say ingress and egress are two RBridges belonging to the same 297 edge group. Assume LAALP m connects to an edge group g and the edge 298 group g consists of RB1, RB2 and RB3. The multi-destination frames 299 ingressed from a port not connected to LAALP m by RB1 can be locally 300 replicated to other ports on RB1 and also TRILL encapsulated and 301 forwarded to RB2 and RB3. CE1 will receive duplicate copies from RB1, 302 RB2 and RB3. 304 Note that frame duplication is only a problem in multi-destination 305 frame forwarding. 
Unicast forwarding does not have this issue as 306 there is only ever one copy of the packet. 308 3.2 Loop Back 310 As shown in Figure 1, CE1 may send a native multi-destination frame 311 to the TRILL campus via a member of the LAALP1 edge group (say RB1). 312 This frame will be TRILL encapsulated and then forwarded through the 313 campus to the multi-destination receivers. Other members (say RB2) of 314 the same LAALP edge group will receive this multicast packet as well. 315 In this case, without changes made for active-active connection 316 support, RB2 will decapsulate the frame and egress it. The frame 317 loops back to CE1. 319 3.3 Address Flip-Flop 321 Consider RB1 and RB2 each using its own nickname as ingress nickname for 322 data into a TRILL campus. As shown by Figure 1, CE1 may send a data 323 frame with the same VLAN and source MAC address to any member of the 324 edge group LAALP1. If some egress RBridge receives TRILL data packets 325 from different ingress RBridges but with the same source Data Label and 326 MAC address, it learns different Data Label and MAC to nickname 327 address correspondences when decapsulating the data frames. Address 328 correspondence may keep flip-flopping among nicknames of the member 329 RBridges of the LAALP for the same Data Label and MAC address. 330 Existing hardware does not support data plane learning of multiple 331 nicknames for the same MAC address and data label -- when data plane 332 learning indicates attachment of the MAC to a new nickname, it 333 overwrites the old attachment nickname. 335 Implementers have stated that most current TRILL switch hardware, 336 when doing data plane learning, behaves badly under these 337 circumstances and, for example, interprets address flip-flopping as a 338 severe network problem. It may also cause the returning traffic to go 339 through different paths to reach the destination resulting in 340 persistent re-ordering of the frames.
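The flip-flop behavior described above can be sketched as a learning table that holds exactly one nickname per (Data Label, MAC) pair. This is an illustrative model only: the class name, the move counter, and the example MAC address are assumptions and do not describe any particular switch implementation.

```python
class EgressMacTable:
    """Data plane learning with one nickname per (Data Label, MAC) key."""

    def __init__(self):
        self.table = {}  # (data_label, mac) -> ingress nickname
        self.moves = 0   # how many times an existing entry was overwritten

    def learn(self, data_label, src_mac, ingress_nickname):
        key = (data_label, src_mac)
        old = self.table.get(key)
        if old is not None and old != ingress_nickname:
            # The MAC appears to have moved to a different RBridge.
            self.moves += 1
        # A new attachment overwrites the old one, as described above.
        self.table[key] = ingress_nickname


# A remote egress RBridge decapsulates frames from the same CE1 source
# MAC that the LAALP has spread across RB1 and RB2.
table = EgressMacTable()
for nickname in ["RB1", "RB2", "RB1", "RB2"]:
    table.learn(10, "00:00:5e:00:53:01", nickname)

print(table.moves)  # 3: every frame after the first overwrites the entry
```

Each overwrite also changes the nickname used for return traffic toward CE1, which is how the re-ordering mentioned above can arise.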
342 3.4 Unsynchronized Information Among Member RBridges 344 A local RBridge, say RB1 connected to LAALP1, may have learned a Data 345 Label and MAC to nickname correspondence for a remote host h1 when h1 346 sends a packet to CE1. The returning traffic from CE1 may go to any 347 other member RBridge of LAALP1, for example RB2. RB2 may not have 348 h1's Data Label and MAC to nickname correspondence stored. Therefore 349 it has to do the flooding for unknown unicast [RFC6325]. Such 350 flooding is unnecessary since the returning traffic is almost always 351 expected and RB1 had learned the address correspondence. It is 352 desirable to avoid flooding; it imposes a greater burden on the 353 network than known destination unicast traffic because the flooded 354 traffic is sent over more links. 356 Synchronization of the Data Label and MAC to nickname correspondence 357 information among member RBridges will reduce such unnecessary 358 flooding. 360 4. High Level Requirements and Goals for Solutions 361 The problems identified in section 3 should be solved in any solution 362 for active-active connection to edge RBridges. The following high- 363 level requirements and goals should be met. 365 Data plane: 367 1) All up-links of CE MUST be active: the LAALP is free to choose any 368 up-link on which to send packets and the CE is able to receive 369 packets from any up-link of an edge group. 370 2) Looping back and frame duplication MUST be prevented. 371 3) Learning of Data Label and MAC to nickname correspondence by a 372 remote RBridge MUST NOT flip-flop between the local multiply attached 373 edge RBridges. 374 4) Packets for a flow SHOULD stay in order. 375 5) The Reverse Path Forwarding Check MUST work properly as per 376 [RFC6325]. 377 6) Single up-link failure on CE to an edge group MUST NOT cause 378 persistent packet delivery failure between TRILL campus and CE. 
380 Control plane: 382 1) No requirement for new information to be passed between edge 383 RBridges and CE or between edge RBridges and endnodes. 384 2) If there is any TRILL specific information required to be 385 exchanged between RBridges in an edge group, for example data labels 386 and MAC addresses binding to nicknames, a solution MUST specify the 387 mechanism to perform such exchange unless this is handled internal to 388 the LAALP. 389 3) RBridges SHOULD be able to discover other members in the same edge 390 group by exchanging their LAALP attachment information. 392 Configuration, incremental deployment, and others: 394 1) Solution SHOULD require minimal configuration. 395 2) Solution SHOULD automatically detect misconfiguration of edge 396 RBridge group. 397 3) Solution SHOULD support incremental deployment, that is, not 398 require campus wide upgrading for all RBridges, only changes to the 399 edge group RBridges. 400 4) Solution SHOULD be able to support from 2 up to at least 4 active- 401 active up-links on a multiply attached CE. 402 5) Solution SHOULD NOT assume there is a dedicated physical link 403 between any two of the edge RBridges in an edge group. 405 5. Security Considerations 407 As an informational overview, this draft does not introduce any extra 408 security risks. Security risks introduced by any particular LAALP or 409 other elements of solutions to the problems presented here will be 410 discussed in the separate document(s) describing such LAALP or 411 solutions. 413 End station links in TRILL are Ethernet links and consideration 414 should be given to securing them with [802.1AE] link security for the 415 protection of end station data and link level control messages 416 including any LAALP control messages. 418 For general TRILL Security Considerations, see [RFC6325]. 420 6. IANA Considerations 422 No IANA action is required. RFC Editor: please delete this section 423 before publication. 425 7. 
Acknowledgments 427 Special acknowledgments to Donald Eastlake, Adrian Farrel and Mingui 428 Zhang for their valuable comments. 430 8. References 432 8.1 Normative References 434 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 435 Requirement Levels", BCP 14, RFC 2119, March 1997. 437 [IS-IS] ISO/IEC 10589:2002, Second Edition, "Intermediate System to 438 Intermediate System Intra-Domain Routing Exchange Protocol 439 for use in Conjunction with the Protocol for Providing the 440 Connectionless-mode Network Service (ISO 8473)", 2002. 442 [RFC6165] Banerjee, A. and D. Ward, "Extensions to IS-IS for Layer-2 443 Systems", RFC 6165, April 2011. 445 [RFC6325] Perlman, R., Eastlake 3rd, D., Dutt, D., Gai, S., and A. 446 Ghanwani, "Routing Bridges (RBridges): Base Protocol 447 Specification", RFC 6325, July 2011. 449 [RFC6439] Perlman, R., Eastlake, D., Li, Y., Banerjee, A., and F. Hu, 450 "Routing Bridges (RBridges): Appointed Forwarders", RFC 451 6439, November 2011. 453 [RFC7172] Eastlake, D., M. Zhang, P. Agarwal, R. Perlman, D. Dutt, 454 "Transparent Interconnection of Lots of Links (TRILL): 455 Fine-Grained Labeling", RFC 7172, May 2014. 457 [RFC7176] Eastlake 3rd, D., Senevirathne, T., Ghanwani, A., Dutt, D., 458 and A. Banerjee, "Transparent Interconnection of Lots of 459 Links (TRILL) Use of IS-IS", RFC 7176, May 2014. 461 8.2 Informative References 463 [RFC7175] Manral, V., D. Eastlake, D. Ward, A. Banerjee, "Transparent 464 Interconnection of Lots of Links (TRILL): Bidirectional 465 Forwarding Detection (BFD) Support", RFC 7175, May 2014. 467 [802.1AX] IEEE, "Link Aggregation", 802.1AX-2008, 3 November 2008. 469 [802.1Q] IEEE, "Media Access Control (MAC) Bridges and Virtual 470 Bridged Local Area Networks", IEEE Std 802.1Q-2011, 31 471 August 2011. 473 [802.1AE] IEEE, "Media Access Control (MAC) Security", IEEE Std 474 802.1AE-2006, 18 August 2006.
476 Authors' Addresses 478 Yizhou Li 479 Huawei Technologies 480 101 Software Avenue, 481 Nanjing 210012 482 China 484 Phone: +86-25-56625409 485 EMail: liyizhou@huawei.com 487 Weiguo Hao 488 Huawei Technologies 489 101 Software Avenue, 490 Nanjing 210012 491 China 493 Phone: +86-25-56623144 494 EMail: haoweiguo@huawei.com 496 Radia Perlman 497 EMC 498 2010 256th Avenue NE, #200 499 Bellevue, WA 98007 USA 500 Email: Radia@alum.mit.edu 502 Jon Hudson 503 Brocade 504 130 Holger Way 505 San Jose, CA 95134 USA 507 Phone: +1-408-333-4062 508 jon.hudson@gmail.com 510 Hongjun Zhai 511 ZTE 512 68 Zijinghua Road, Yuhuatai District 513 Nanjing, Jiangsu 210012 514 China 516 Phone: +86 25 52877345 517 Email: zhai.hongjun@zte.com.cn