idnits 2.17.1

draft-ietf-trill-directory-assisted-encap-10.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (March 4, 2018) is 2217 days in the past.  Is this
     intentional?

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

     No issues found here.

     Summary: 0 errors (**), 0 flaws (~~), 1 warning (==), 1 comment (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

INTERNET-DRAFT                                             Linda Dunbar
Intended status: Proposed Standard                      Donald Eastlake
                                                                 Huawei
                                                          Radia Perlman
                                                               Dell/EMC
Expires: September 3, 2018                                March 4, 2018

                 Directory Assisted TRILL Encapsulation

Abstract

   This draft describes how data center networks can benefit from
   non-RBridge nodes performing TRILL encapsulation with assistance
   from a directory service.

Status of This Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Distribution of this document is unlimited. Comments should be sent
   to the authors or the TRILL working group mailing list:
   trill@ietf.org

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/1id-abstracts.html. The list of Internet-Draft
   Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

Table of Contents

   1. Introduction
   2. Conventions Used in This Document
   3. Directory Assistance to Non-RBridge
   4. Source Nickname in Encapsulation by Non-RBridge Nodes
   5. Benefits of Non-RBridge Performing TRILL Encapsulation
      5.1. Avoid Nickname Exhaustion Issue
      5.2. Reduce MAC Tables for Switches on Bridged LANs
   6. Manageability Considerations
   7. Security Considerations
   8. IANA Considerations
   Normative References
   Informative References
   Acknowledgments
   Authors' Addresses

1. Introduction

   This document describes how data center networks can benefit from
   non-RBridge nodes performing TRILL encapsulation with assistance from
   a directory service and specifies a method for them to do so.

   [RFC7067] and [RFC8171] describe the framework and methods for edge
   RBridges to get MAC&VLAN <-> Edge RBridge mapping from a directory
   service instead of flooding unknown destination MAC addresses across
   a TRILL domain. If it has the needed directory information, any node,
   even a non-RBridge node, can perform the TRILL data packet
   encapsulation. This draft describes the benefits of and a scheme for
   non-RBridge nodes performing TRILL encapsulation.

2. Conventions Used in This Document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

   AF:       Appointed Forwarder RBridge port [RFC8139].

   Bridge:   An IEEE 802.1Q compliant device. In this draft, Bridge is
             used interchangeably with Layer 2 switch.

   DA:       Destination Address.

   ES-IS:    End Station to Intermediate System [RFC8171].

   Host:     A physical server or a virtual machine running
             applications. A host usually has at least one IP address
             and at least one MAC address.

   IS-IS:    Intermediate System to Intermediate System [RFC7176].

   SA:       Source Address.

   TRILL-EN: TRILL Encapsulating node. A node that performs the TRILL
             encapsulation but doesn't participate in RBridge's IS-IS
             routing.

   VM:       Virtual Machine.

3. Directory Assistance to Non-RBridge

   With directory assistance [RFC7067] [RFC8171], a non-RBridge node can
   learn whether a data packet needs to be forwarded across the RBridge
   domain and, if so, the corresponding egress RBridge.

   Suppose the RBridge domain boundary starts at network switches (not
   virtual switches embedded on servers). (See Figure 1 for a high-level
   diagram of a typical data center network.) A directory can assist
   Virtual Switches embedded on servers to encapsulate with a proper
   TRILL header by providing the nickname of the egress RBridge edge to
   which the destination is attached. The other information needed to
   encapsulate can be learned either by listening to TRILL ES-IS and/or
   IS-IS Hellos [RFC7176] [RFC8171], which will indicate the MAC address
   and nickname of appropriate local edge RBridges, or by configuration.

   If it is not known whether a destination is attached to one or more
   RBridge edge nodes, based on the directory, the non-RBridge node can
   forward the data frames natively, i.e., without encapsulating them
   with any TRILL header. Or, if the directory is known to be complete,
   the non-RBridge node can discard such data frames.

      \              +-------+         +------+ TRILL Domain/
       \           +/------+ |       +/-----+ |             /
        \          | Aggr11| + ----- |AggrN1| +            /
         \         +---+---+/        +------+/            /
          \         /     \            /      \          /
           \       /       \          /        \        /
            \   +---+    +---+      +---+     +---+    /
             \- |T11|... |T1x|      |T21| ..  |T2y|---
                +---+    +---+      +---+     +---+
                  |        |          |         |
                +-|-+    +-|-+      +-|-+     +-|-+
                |   |... | V |      | V | ..  | V |<- vSwitch
                +---+    +---+      +---+     +---+
                |   |... | V |      | V | ..  | V |
                +---+    +---+      +---+     +---+
                |   |... | V |      | V | ..  | V |
                +---+    +---+      +---+     +---+

        Figure 1. TRILL domain in a typical Data Center Network

   When a TRILL encapsulated data packet reaches the ingress RBridge,
   that RBridge simply performs the usual TRILL processing and forwards
   the pre-encapsulated packet to the RBridge that is specified by the
   egress nickname field of the TRILL header. When an ingress RBridge
   receives a native Ethernet frame in an environment with complete
   directory information, the ingress RBridge doesn't flood or forward
   the received data frame if the frame's destination MAC address is
   unknown.
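
   The ingress RBridge behavior described in the preceding paragraph
   can be sketched as a small decision function. This is an
   illustrative sketch only and not part of the specification: the
   names Action and ingress_action and the three boolean inputs are
   hypothetical. The ENCAPSULATE_AND_FORWARD and FLOOD branches are
   assumptions reflecting ordinary TRILL handling of native frames,
   cases that the paragraph itself does not spell out.

```python
from enum import Enum


class Action(Enum):
    """Hypothetical labels for what an ingress RBridge does with a frame."""
    FORWARD_BY_EGRESS_NICKNAME = "forward by egress nickname"
    ENCAPSULATE_AND_FORWARD = "encapsulate and forward"
    DROP = "drop"
    FLOOD = "flood"


def ingress_action(pre_encapsulated: bool,
                   dest_known: bool,
                   directory_complete: bool) -> Action:
    """Choose an action for a frame arriving at an ingress RBridge.

    pre_encapsulated:   frame already carries a TRILL header (from a TRILL-EN)
    dest_known:         destination MAC&VLAN is known to this RBridge
    directory_complete: the directory is known to hold complete information
    """
    if pre_encapsulated:
        # Usual TRILL processing: forward toward the RBridge named in the
        # egress nickname field of the TRILL header.
        return Action.FORWARD_BY_EGRESS_NICKNAME
    if dest_known:
        # Assumption: a native frame with a known destination is
        # encapsulated by the ingress RBridge as in ordinary TRILL.
        return Action.ENCAPSULATE_AND_FORWARD
    if directory_complete:
        # Unknown destination with a complete directory: neither flood
        # nor forward the frame.
        return Action.DROP
    # Assumption: without a complete directory, fall back to ordinary
    # unknown-destination handling.
    return Action.FLOOD
```

   The order of the checks matters: a pre-encapsulated packet is
   forwarded by its egress nickname regardless of what the ingress
   RBridge itself knows about the inner destination.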

   When all end nodes attached to an ingress RBridge pre-encapsulate
   with a TRILL header for traffic across the TRILL domain, the ingress
   RBridge doesn't need to encapsulate any native Ethernet frames to the
   TRILL domain. The attached nodes can be connected to multiple edge
   RBridges by having multiple ports or through a bridged LAN. All
   RBridge edge ports connected to one bridged LAN can receive and
   forward pre-encapsulated traffic, which can greatly improve the
   overall network utilization. However, it is still necessary to
   designate AF ports, for example, to ensure that multi-destination
   packets from the TRILL campus are only egressed through one RBridge.

   Section 4.6.2, Bullet 8, of the TRILL base protocol specification
   [RFC6325] specifies that an RBridge port can be configured to accept
   TRILL encapsulated frames from a neighbor that is not an RBridge.

   When a TRILL frame arrives at an RBridge whose nickname matches the
   destination nickname in the TRILL header of the frame, the processing
   is exactly as normal: as specified in [RFC6325], the RBridge
   decapsulates the received TRILL frame and forwards the decapsulated
   frame to the target attached to its edge ports. When the destination
   MAC address of the decapsulated Ethernet frame is not in the egress
   RBridge's local MAC attachment tables, the egress RBridge floods the
   decapsulated frame to all attached links in the frame's VLAN, or
   drops the frame (if the egress RBridge is configured with that
   policy).

   We call a node that, as specified herein, only performs TRILL
   encapsulation, but doesn't participate in RBridge's IS-IS routing, a
   TRILL Encapsulating node (TRILL-EN). The TRILL Encapsulating Node can
   pull MAC&VLAN <-> Edge RBridge mapping from directory servers
   [RFC8171]. In order to do this, a TRILL-EN MUST support TRILL ES-IS
   [RFC8171].
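
   As an illustration, the way a TRILL-EN might use a pulled
   MAC&VLAN <-> Edge RBridge mapping can be sketched as a table lookup.
   This is a minimal sketch under stated assumptions, not the mechanism
   defined by [RFC8171]: the table layout, the function name
   trill_en_decision, the directory_complete flag, and the sample MAC
   address and nickname are all hypothetical. The pull request itself is
   not modeled; the table is assumed to already hold whatever the
   directory returned.

```python
from typing import Dict, Optional, Tuple

# Hypothetical table: (destination MAC in lowercase, VLAN ID) -> nickname of
# the egress edge RBridge, as learned from a pull directory.
MappingTable = Dict[Tuple[str, int], int]


def trill_en_decision(mapping: MappingTable,
                      dest_mac: str,
                      vlan: int,
                      directory_complete: bool) -> Tuple[str, Optional[int]]:
    """Decide how a TRILL-EN handles a native Ethernet frame.

    Returns one of:
      ("encapsulate", egress_nickname)  mapping found, add a TRILL header
      ("discard", None)                 no mapping and directory is complete
      ("native", None)                  no mapping, send the frame unencapsulated
    """
    egress_nickname = mapping.get((dest_mac.lower(), vlan))
    if egress_nickname is not None:
        return ("encapsulate", egress_nickname)
    if directory_complete:
        # The directory is known to be complete, so an absent entry means
        # the destination is not reachable through the TRILL domain.
        return ("discard", None)
    # It is not known whether the destination sits behind an edge RBridge:
    # forward the frame natively, without any TRILL header.
    return ("native", None)


# Example with made-up values: one known destination in VLAN 10.
example_table: MappingTable = {("00:11:22:33:44:55", 10): 0x1234}
example_result = trill_en_decision(example_table, "00:11:22:33:44:55", 10, False)
```

   Keying the table on the (MAC, VLAN) pair rather than on the MAC
   address alone mirrors the MAC&VLAN form of the mapping used
   throughout this document.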

   Upon receiving or locally generating a native Ethernet frame, the
   TRILL-EN checks the MAC&VLAN <-> Edge RBridge mapping and, if the
   mapping entry is found, performs the corresponding TRILL
   encapsulation as shown in Figure 2. If the destination MAC address
   and VLAN of the received Ethernet frame do not exist in the mapping
   table and there is no positive reply from pull requests to a
   directory, the Ethernet frame is dropped or is forwarded in native
   form to an edge RBridge, depending on the TRILL-EN configuration.

   +------------+--------+---------+---------+--+-------+---+
   |OuterEtherHd|TRILL HD| InnerDA | InnerSA |..|Payload|FCS|
   +------------+--------+---------+---------+--+-------+---+
           |
           |        |        |
           |
           |
           |       +-------+   TRILL   +------+
           |       |  R1   |-----------|  R2  |  Decapsulate
           |       +---+---+   domain  +------+  TRILL header
           v           |                  |
           +---------->|                  |
                       |                  |
                    +-----+            +-----+
   Non-RBridge node:|T12  |            | T22 |
   Encapsulate TRILL+-----+            +-----+
   Header for data
   Frames to traverse TRILL domain.

                 Figure 2. Data frames from a TRILL-EN

4. Source Nickname in Encapsulation by Non-RBridge Nodes

   The TRILL header includes a Source RBridge's Nickname (ingress) and
   Destination RBridge's Nickname (egress). When a TRILL header is added
   to a data packet by a TRILL-EN, the Ingress RBridge nickname field in
   the TRILL header is set to a nickname of the AF for the data packet's
   VLAN. The TRILL-EN determines the AF by snooping on IS-IS Hellos from
   the edge RBridges on the link with the TRILL-EN in the same way that
   the RBridges on the link determine the AF [RFC8139]. A TRILL-EN is
   free to send the encapsulated data frame to any of the edge RBridges
   on its link.

5. Benefits of Non-RBridge Performing TRILL Encapsulation

   This section summarizes the benefits of having a non-RBridge node
   perform TRILL encapsulation.

5.1. Avoid Nickname Exhaustion Issue

   For a large Data Center with hundreds of thousands of virtualized
   servers, setting the TRILL boundary at the servers' virtual switches
   will create a TRILL domain with hundreds of thousands of RBridge
   nodes, which raises the issue of TRILL Nickname exhaustion and poses
   challenges to IS-IS. On the other hand, setting the TRILL boundary at
   aggregation switches that have many virtualized servers attached can
   limit the number of RBridge nodes in a TRILL domain, but introduces
   the issue of very large MAC&VLAN <-> Edge RBridge mapping tables to
   be maintained by RBridge edge nodes.

   Allowing Non-RBridge nodes to pre-encapsulate data frames with TRILL
   headers makes it possible to have a TRILL domain with a reasonable
   number of RBridge nodes in a large data center. All the TRILL-ENs
   attached to one RBridge can be represented by one TRILL nickname,
   which can avoid the Nickname exhaustion problem.

5.2. Reduce MAC Tables for Switches on Bridged LANs

   When hosts in a VLAN (or subnet) span across multiple RBridge edge
   nodes and each RBridge edge has multiple VLANs enabled, the switches
   on the bridged LANs attached to the RBridge edge are exposed to all
   MAC addresses among all the VLANs enabled.

   For example, for an Access Switch with 40 physical servers attached,
   where each server has 100 VMs, there are 4000 hosts under the Access
   Switch. If indeed hosts/VMs can be moved anywhere, the worst case for
   the Access Switch is when all those 4000 VMs belong to different
   VLANs, i.e., the access switch has 4000 VLANs enabled. If each VLAN
   has 200 hosts, this access switch's MAC table potentially has
   200*4000 = 800,000 entries.
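
   The worst-case arithmetic in the example above can be reproduced
   with a few lines of code. This is only a worked restatement of the
   numbers in the text; the function and parameter names are
   hypothetical.

```python
def worst_case_mac_entries(servers_per_switch: int,
                           vms_per_server: int,
                           hosts_per_vlan: int) -> int:
    """Worst-case MAC table size for an access switch.

    Worst case from the text: every VM under the switch belongs to a
    different VLAN, so the number of enabled VLANs equals the number of
    hosts, and the switch may learn every host in each of those VLANs.
    """
    hosts_under_switch = servers_per_switch * vms_per_server
    vlans_enabled = hosts_under_switch  # one VLAN per VM in the worst case
    return vlans_enabled * hosts_per_vlan


# Values from the example: 40 servers, 100 VMs each, 200 hosts per VLAN.
hosts = 40 * 100
entries = worst_case_mac_entries(40, 100, 200)
```

   With the figures from the text, hosts is 4000 and entries is 800000,
   matching the 200*4000 = 800,000 result given above.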

   If the virtual switches on servers pre-encapsulate the data frames
   destined for hosts attached to remote RBridge Edge nodes, the outer
   MAC destination address of those TRILL encapsulated data frames will
   be the MAC address of a local RBridge edge, i.e., the ingress
   RBridge. The switches on the local bridged LAN don't need to keep the
   MAC entries for remote hosts attached to other edge RBridges.

   However, TRILL traffic arriving from nodes attached to other RBridges
   is decapsulated and carries the true source and destination MAC
   addresses. One simple way to prevent local bridges from learning
   remote hosts' MAC addresses and adding them to their MAC tables, if
   that would be a problem, is to disable this data plane learning on
   local bridges. The local bridges can be pre-configured with MAC
   addresses of local hosts with the assistance of a directory. The
   local bridges can always send frames with unknown destination MAC
   addresses to the ingress RBridge.

   In an environment where a large number of VMs are instantiated in one
   server, the number of remote MAC addresses could be very large. If it
   is not feasible to disable learning and pre-configure MAC tables for
   local bridges and all important traffic is IP, one effective method
   to minimize local bridges' MAC table size is to use the server's MAC
   address to hide MAC addresses of the attached VMs. That is, the
   server acting as an edge node uses its own MAC address in the source
   MAC address field of the packets originated from an embedded host (or
   VM). When the Ethernet frame arrives at the target edge node (the
   egress), the target edge node can send the packet to the
   corresponding destination host based on the packet's IP address. Very
   often, the target edge node communicates with the embedded VMs via a
   layer 2 virtual switch. In this case, the target edge node can
   construct the proper Ethernet header with the assistance of the
   directory. The information from the directory includes the proper
   host IP to MAC mapping information.

6. Manageability Considerations

   Directory assistance [RFC8171] is required to make it possible for a
   non-TRILL node to pre-encapsulate packets destined towards remote
   RBridges. TRILL-ENs have the same configuration options as any pull
   directory client. See Section 4 of [RFC8171].

7. Security Considerations

   The mechanism described in this document requires TRILL-ENs to be
   aware of the MAC address(es) of the TRILL edge RBridge(s) to which
   the TRILL-EN is attached and the egress RBridge nickname from which
   the destination of the packets is reachable. With that information,
   TRILL-ENs can learn a substantial amount about the topology of the
   TRILL domain. Therefore, there could be a potential security risk
   when the TRILL-ENs are not trusted or are compromised. In addition,
   if the path between the directory and the TRILL-ENs is attacked,
   false mappings can be sent to the TRILL-EN, causing packets from the
   TRILL-EN to be sent to wrong destinations, possibly violating
   security policy. Therefore, a combination of authentication and
   encryption is RECOMMENDED between the Directory and TRILL-EN. The
   entities involved will need to properly authenticate with each other,
   provide session encryption, maintain security patch levels, and
   configure their systems to allow minimal access and running processes
   to protect sensitive information.

   Use of directory assisted encapsulation by TRILL-ENs essentially
   involves those TRILL-ENs spoofing edge RBridges to which they are
   connected, which is another reason that TRILL-ENs should be trusted
   nodes. Such spoofing cannot cause looping traffic because TRILL has a
   hop count in the TRILL header [RFC6325] so that, should there be a
   loop, a TRILL packet caught in that loop (i.e., an encapsulated
   frame) will be discarded. (In the potentially more dangerous case of
   multi-destination packets, as compared with known unicast, where
   copies could multiply due to forks in the distribution tree, a
   Reverse Path Forwarding Check is also used [RFC6325] to discard
   packets that appear to be on the wrong link or when there is
   disagreement about the distribution tree.)

   For Pull Directory and TRILL ES-IS security considerations, see
   [RFC8171].

   For general TRILL security considerations, see [RFC6325].

8. IANA Considerations

   This document requires no IANA actions. RFC Editor: please remove
   this section before publication.

Normative References

   [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
             Requirement Levels", BCP 14, RFC 2119, DOI
             10.17487/RFC2119, March 1997.

   [RFC6325] Perlman, R., Eastlake 3rd, D., Dutt, D., Gai, S., and A.
             Ghanwani, "Routing Bridges (RBridges): Base Protocol
             Specification", RFC 6325, DOI 10.17487/RFC6325, July 2011.

   [RFC7176] Eastlake 3rd, D., Senevirathne, T., Ghanwani, A., Dutt, D.,
             and A. Banerjee, "Transparent Interconnection of Lots of
             Links (TRILL) Use of IS-IS", RFC 7176, DOI
             10.17487/RFC7176, May 2014.

   [RFC8139] Eastlake 3rd, D., Li, Y., Umair, M., Banerjee, A., and F.
             Hu, "Transparent Interconnection of Lots of Links (TRILL):
             Appointed Forwarders", RFC 8139, DOI 10.17487/RFC8139, June
             2017.

   [RFC8171] Eastlake 3rd, D., Dunbar, L., Perlman, R., and Y. Li,
             "Transparent Interconnection of Lots of Links (TRILL): Edge
             Directory Assistance Mechanisms", RFC 8171, DOI
             10.17487/RFC8171, June 2017.

Informative References

   [RFC7067] Dunbar, L., Eastlake 3rd, D., Perlman, R., and I.
             Gashinsky, "Directory Assistance Problem and High-Level
             Design Proposal", RFC 7067, DOI 10.17487/RFC7067, November
             2013.

Acknowledgments

   The following are thanked for their contributions:

      Igor Gashinsky
      Ben Nevin-Jenkins

   The document was prepared in raw nroff. All macros used were defined
   within the source file.

Authors' Addresses

   Linda Dunbar
   Huawei Technologies
   5340 Legacy Drive, Suite 175
   Plano, TX 75024, USA

   Phone: +1-469-277-5840
   Email: linda.dunbar@huawei.com

   Donald Eastlake
   Huawei Technologies
   155 Beaver Street
   Milford, MA 01757 USA

   Phone: +1-508-333-2270
   Email: d3e3e3@gmail.com

   Radia Perlman
   Dell/EMC
   2010 256th Avenue NE, #200
   Bellevue, WA 98007 USA

   Email: Radia@alum.mit.edu

Copyright, Disclaimer, and Additional IPR Provisions

   Copyright (c) 2018 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License. The definitive version of
   an IETF Document is that published by, or under the auspices of, the
   IETF. Versions of IETF Documents that are published by third parties,
   including those that are translated into other languages, should not
   be considered to be definitive versions of IETF Documents. The
   definitive version of these Legal Provisions is that published by, or
   under the auspices of, the IETF. Versions of these Legal Provisions
   that are published by third parties, including those that are
   translated into other languages, should not be considered to be
   definitive versions of these Legal Provisions. For the avoidance of
   doubt, each Contributor to the IETF Standards Process licenses each
   Contribution that he or she makes as part of the IETF Standards
   Process to the IETF Trust pursuant to the provisions of RFC 5378. No
   language to the contrary, or terms, conditions or rights that differ
   from or are inconsistent with the rights and licenses granted under
   RFC 5378, shall have any effect and shall be null and void, whether
   published or posted by such Contributor, or included with or in such
   Contribution.