INTERNET-DRAFT                                              Linda Dunbar
Intended status: Proposed Standard                       Donald Eastlake
                                                                  Huawei
                                                           Radia Perlman
                                                                     EMC
                                                          Igor Gashinsky
                                                                   Intel
Expires: January 7, 2017                                    July 8, 2016

                 Directory Assisted TRILL Encapsulation
            draft-ietf-trill-directory-assisted-encap-03.txt

Abstract

   This draft describes how data center networks can benefit from
   non-RBridge nodes performing TRILL encapsulation with assistance
   from a directory service.

Status of This Memo

   This Internet-Draft is submitted to IETF in full conformance with
   the provisions of BCP 78 and BCP 79.

   Distribution of this document is unlimited. Comments should be sent
   to the authors or the TRILL working group mailing list:
   trill@ietf.org

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/1id-abstracts.html. The list of Internet-Draft
   Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

Table of Contents

   1. Introduction
   2. Conventions Used in This Document
   3. Directory Assistance to Non-RBridge
   4. Source Nickname in Encapsulation by Non-RBridge Nodes
   5. Benefits of Non-RBridge Performing TRILL Encapsulation
      5.1. Avoid Nickname Exhaustion Issue
      5.2. Reduce MAC Tables for Switches on Bridged LANs
   6. Manageability Considerations
   7. Security Considerations
   8. IANA Considerations

   Normative References
   Informative References

   Acknowledgments
   Authors' Addresses

1.
Introduction

   This document describes how data center networks can benefit from
   non-RBridge nodes performing TRILL encapsulation with assistance
   from a directory service and specifies a method for them to do so.

   [RFC7067] and [Directory] describe the framework and methods by
   which an edge RBridge obtains the MAC&VLAN<->RBridgeEdge mapping
   from a directory service in data center environments instead of
   flooding frames with unknown DAs across the TRILL domain. If it has
   the needed directory information, any node, even a non-RBridge node,
   can perform the TRILL encapsulation. This draft describes the
   benefits of, and a scheme for, non-RBridge nodes performing TRILL
   encapsulation.

2. Conventions Used in This Document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

   AF:       Appointed Forwarder RBridge port [RFC6439]

   Bridge:   IEEE 802.1Q compliant device. In this draft, Bridge
             is used interchangeably with Layer 2 switch.

   DA:       Destination Address

   Host:     Application running on a physical server or a
             virtual machine. A host usually has at least one IP
             address and at least one MAC address.

   SA:       Source Address

   TRILL-EN: TRILL Encapsulating node. It is a node that only
             performs the TRILL encapsulation but doesn't
             participate in RBridge's IS-IS routing.

   VM:       Virtual Machine

3. Directory Assistance to Non-RBridge

   With directory assistance [RFC7067] [Directory], a non-RBridge node
   can be informed whether a packet needs to be forwarded across the
   RBridge domain and, if so, of the corresponding egress RBridge.
   Suppose the RBridge domain boundary starts at network switches (not
   at virtual switches embedded in servers). A directory can then
   assist the virtual switches embedded in servers to encapsulate with
   a proper TRILL header by providing the nickname of the egress
   RBridge edge to which the destination is attached. The other
   information needed to encapsulate can be learned either by listening
   to TRILL Hellos, which indicate the MAC address and nickname of
   appropriate edge RBridges, or by configuration.

   If, based on the directory, a destination is not shown as attached
   to one or more other RBridge edge nodes, the non-RBridge node can
   forward the data frames natively, i.e., without encapsulating them
   with any TRILL header.

    \              +-------+         +------+ TRILL Domain/
     \           +/------+ |       +/-----+ |           /
      \          | Aggr11| + ----- |AggrN1| +          /
       \         +---+---+/        +------+/          /
        \         /     \            /      \        /
         \       /       \          /        \      /
          \   +---+    +---+      +---+     +---+  /
           \- |T11|... |T1x|      |T21| ..  |T2y|---
              +---+    +---+      +---+     +---+
                |        |          |         |
              +-|-+    +-|-+      +-|-+     +-|-+
              |   |... | V |      | V | ..  | V |<- vSwitch
              +---+    +---+      +---+     +---+
              |   |... | V |      | V | ..  | V |
              +---+    +---+      +---+     +---+
              |   |... | V |      | V | ..  | V |
              +---+    +---+      +---+     +---+

          Figure 1. TRILL domain in typical Data Center Network

   When a TRILL encapsulated data packet reaches the ingress RBridge,
   the ingress RBridge simply forwards the pre-encapsulated packet to
   the RBridge that is specified by the egress nickname field of the
   TRILL header of the data frame. When the ingress RBridge receives a
   native Ethernet frame, it handles it as usual and may drop it if it
   has complete directory information indicating that the target is not
   attached to the TRILL campus.

   In such an environment with complete directory information, the
   ingress RBridge doesn't flood or forward the received data frames
   when the DA in the Ethernet data frames is unknown.
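
   The ingress RBridge behavior described in the two paragraphs above
   can be sketched as a small decision function. This is only an
   illustrative sketch, not part of the specification: the names Frame,
   Directory, and ingress_action are hypothetical, the directory is
   modeled as a plain in-memory mapping, and the fallback to flooding
   when directory information is incomplete is an assumption standing
   in for "handles it as usual".

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

# Hypothetical directory view: (destination MAC, VLAN) -> egress nickname.
Directory = Dict[Tuple[str, int], int]


@dataclass(frozen=True)
class Frame:
    dst_mac: str
    vlan: int
    # Set only when the sender already added a TRILL header.
    egress_nickname: Optional[int] = None


def ingress_action(frame: Frame, directory: Directory,
                   directory_complete: bool) -> str:
    """Return the action an ingress RBridge takes for one frame."""
    if frame.egress_nickname is not None:
        # Pre-encapsulated: forward toward the nickname in the TRILL header.
        return f"forward:{frame.egress_nickname:#06x}"
    nickname = directory.get((frame.dst_mac, frame.vlan))
    if nickname is not None:
        # Native frame with a known destination: encapsulate as usual.
        return f"encapsulate:{nickname:#06x}"
    if directory_complete:
        # Complete directory shows the target is not attached to the campus.
        return "drop"
    # Assumption: without complete information, fall back to flooding.
    return "flood"
```

   The point of the sketch is the ordering of the checks: a frame that
   already carries a TRILL header is never looked up in the directory,
   and an unknown destination is only dropped when the directory is
   known to be complete.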

   When all nodes attached to an ingress RBridge can pre-encapsulate
   with a TRILL header for traffic across the TRILL domain, the ingress
   RBridge doesn't need to encapsulate any native Ethernet frames to
   the TRILL domain. The attached nodes can be connected to multiple
   edge RBridges by having multiple ports or by a bridged LAN. In this
   environment, there is no need to designate AF ports, and all RBridge
   edge ports connected to one bridged LAN can receive and forward
   pre-encapsulated traffic, which can greatly improve the overall
   network utilization.

   The TRILL base protocol specification [RFC6325] Section 4.6.2 Bullet
   8 specifies that an RBridge port can be configured to accept TRILL
   encapsulated frames from a neighbor that is not an RBridge.

   When a TRILL frame arrives at an RBridge whose nickname matches the
   destination nickname in the TRILL header of the frame, the
   processing is exactly the same as normal, i.e., as specified in
   [RFC6325] the RBridge decapsulates the received TRILL frame and
   forwards the decapsulated frame to the target attached to its edge
   ports. When the DA of the decapsulated Ethernet frame is not in the
   egress RBridge's local MAC attachment tables, the egress RBridge
   floods the decapsulated frame to all attached links in the frame's
   VLAN, or drops the frame (if the egress RBridge is configured with
   that policy).

   We call a node that only performs the TRILL encapsulation but
   doesn't participate in RBridge's IS-IS routing a TRILL Encapsulating
   node (TRILL-EN). The TRILL-EN can obtain the MAC&VLAN<->RBridgeEdge
   mapping table either by pulling it from directory servers or by
   having directory servers push it [DirectoryExtensions].

   Upon receiving a native Ethernet frame, the TRILL-EN checks the
   MAC&VLAN<->RBridgeEdge mapping table and performs the corresponding
   TRILL encapsulation if the entry is found in the mapping table.
   If the destination address and VLAN of the received Ethernet frame
   do not exist in the mapping table and there is no positive reply
   from pulling requests to a directory, the Ethernet frame is dropped
   or forwarded in native form to an edge RBridge.

   +------------+--------+---------+---------+--+-------+---+
   |OuterEtherHd|TRILL HD| InnerDA | InnerSA |..|Payload|FCS|
   +------------+--------+---------+---------+--+-------+---+
            ^
            |            |                   |
            |
            |
            |      +-------+  TRILL    +------+
            |      |  R1   |-----------|  R2  |  Decapsulate
            |      +---+---+  domain   +------+  TRILL header
            |          |                  |
            +----------|                  |
                       |                  |
                    +-----+            +-----+
   Non-RBridge node:|T12  |            | T22 |
   Encapsulate TRILL+-----+            +-----+
   Header for data
   Frames to traverse
   TRILL domain.

                 Figure 2. Data frames from TRILL-EN

4. Source Nickname in Encapsulation by Non-RBridge Nodes

   The TRILL header includes a Source RBridge's Nickname (ingress) and
   Destination RBridge's Nickname (egress). When a TRILL header is
   added by a TRILL-EN, the Ingress RBridge edge node's nickname is
   used in the source (ingress) nickname field. The TRILL-EN learns
   this nickname by listening to the TRILL IS-IS Hellos from the
   Ingress RBridge. Those Hellos have that nickname in a field in the
   Special VLANs and Flags Sub-TLV [RFC7176] contained in the Hello.

5. Benefits of Non-RBridge Performing TRILL Encapsulation

5.1. Avoid Nickname Exhaustion Issue

   For a large Data Center with hundreds of thousands of virtualized
   servers, setting the TRILL boundary at the servers' virtual switches
   will create a TRILL domain with hundreds of thousands of RBridge
   nodes, which raises the issue of TRILL nickname exhaustion and poses
   challenges to IS-IS.
   On the other hand, setting the TRILL boundary at aggregation
   switches that have many virtualized servers attached can limit the
   number of RBridge nodes in a TRILL domain, but introduces the issues
   of a very large MAC&VLAN<->RBridgeEdge mapping table to be
   maintained by RBridge edge nodes and the necessity of enforcing AF
   ports.

   Allowing non-RBridge nodes to pre-encapsulate data frames with a
   TRILL header makes it possible to have a TRILL domain with a
   reasonable number of RBridge nodes in a large data center. All the
   TRILL-ENs attached to one RBridge are represented by one TRILL
   nickname, which can avoid the nickname exhaustion problem.

5.2. Reduce MAC Tables for Switches on Bridged LANs

   When hosts in a VLAN (or subnet) span across multiple RBridge edge
   nodes and each RBridge edge has multiple VLANs enabled, the switches
   on the bridged LANs attached to the RBridge edge are exposed to all
   MAC addresses among all the VLANs enabled.

   For example, for an access switch with 40 physical servers attached,
   where each server has 100 VMs, there are 4000 hosts under the access
   switch. If indeed hosts/VMs can be moved anywhere, the worst case
   for the access switch is when all those 4000 VMs belong to different
   VLANs, i.e., the access switch has 4000 VLANs enabled. If each VLAN
   has 200 hosts, this access switch's MAC table potentially has
   200*4000 = 800,000 entries.

   If the virtual switches on servers pre-encapsulate the data frames
   destined for hosts attached to other RBridge edge nodes, the outer
   MAC DA of those TRILL encapsulated data frames will be the MAC
   address of the local RBridge edge, i.e., the ingress RBridge.
   Therefore, the switches on the local bridged LAN don't need to keep
   the MAC entries for remote hosts attached to other edge RBridges.

   However, the traffic from nodes attached to other RBridges is
   decapsulated and has the true source and destination MACs.
   One simple way to prevent local bridges from learning remote hosts'
   MACs and adding them to their MAC tables, if that is a problem, is
   to disable this data plane learning on local bridges. The local
   bridges can be pre-configured with MAC addresses of local hosts with
   the assistance of a directory. The local bridges can always send
   frames with an unknown destination to the ingress RBridge. In an
   environment where a large number of VMs are instantiated in one
   server, the number of remote MAC addresses could be very large. If
   it is not feasible to disable learning and pre-configure MAC tables
   for local bridges, one effective method to minimize local bridges'
   MAC table size is to use the server's MAC address to hide MAC
   addresses of the attached VMs. That is, the server acting as an edge
   node uses its own MAC address in the Source Address field of the
   packets originating from an embedded host (or VM). When the Ethernet
   frame arrives at the target edge node (the server), the target edge
   node can send the packet to the corresponding destination host based
   on the packet's destination IP address. Very often, the target edge
   node communicates with the embedded VMs via a layer 2 virtual
   switch. In this case, the target edge node can construct the proper
   Ethernet header with the assistance of the directory. The
   information from the directory includes the proper host IP to MAC
   mapping information.

6. Manageability Considerations

   Directory assistance [DirectoryExtensions] is required to make it
   possible for a non-TRILL node to pre-encapsulate packets destined
   towards remote RBridges.

7. Security Considerations

   For security considerations of the extension of directory services
   to non-RBridges, see [DirectoryExtensions].

   For general TRILL security considerations, see [RFC6325].

8. IANA Considerations

   This document requires no IANA actions.
   RFC Editor: please remove this section before publication.

Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC6325]  Perlman, R., Eastlake 3rd, D., Dutt, D., Gai, S., and A.
              Ghanwani, "Routing Bridges (RBridges): Base Protocol
              Specification", RFC 6325, DOI 10.17487/RFC6325, July
              2011.

   [RFC6439]  Perlman, R., Eastlake, D., Li, Y., Banerjee, A., and F.
              Hu, "Routing Bridges (RBridges): Appointed Forwarders",
              RFC 6439, DOI 10.17487/RFC6439, November 2011.

   [RFC7176]  Eastlake 3rd, D., Senevirathne, T., Ghanwani, A., Dutt,
              D., and A. Banerjee, "Transparent Interconnection of Lots
              of Links (TRILL) Use of IS-IS", RFC 7176,
              DOI 10.17487/RFC7176, May 2014.

   [Directory]  Eastlake, D., Dunbar, L., Perlman, R., and Y. Li,
              "TRILL: Edge Directory Assist Mechanisms",
              draft-ietf-trill-directory-assist-mechanisms, work in
              progress.

   [DirectoryExtensions]  Eastlake, D., Dunbar, L., Perlman, R., and F.
              Hu, "TRILL: Directory Extensions",
              draft-ietf-trill-directory-extensions, work in progress.

Informative References

   [RFC7067]  Dunbar, L., et al., "Directory Assistance Problem and
              High-Level Design Proposal", RFC 7067, November 2013.

Acknowledgments

   The document was prepared in raw nroff. All macros used were defined
   within the source file.

Authors' Addresses

   Linda Dunbar
   Huawei Technologies
   5340 Legacy Drive, Suite 175
   Plano, TX 75024, USA

   Phone: +1-469-277-5840
   Email: linda.dunbar@huawei.com

   Donald Eastlake
   Huawei Technologies
   155 Beaver Street
   Milford, MA 01757 USA

   Phone: +1-508-333-2270
   Email: d3e3e3@gmail.com

   Radia Perlman
   EMC
   2010 256th Avenue NE, #200
   Bellevue, WA 98007 USA

   Email: Radia@alum.mit.edu

   Igor Gashinsky
   Yahoo
   45 West 18th Street 6th floor
   New York, NY 10011 USA

   Email: igor@yahoo-inc.com

Copyright, Disclaimer, and Additional IPR Provisions

   Copyright (c) 2016 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document. Code Components extracted from this
   document must include Simplified BSD License text as described in
   Section 4.e of the Trust Legal Provisions and are provided without
   warranty as described in the Simplified BSD License. The definitive
   version of an IETF Document is that published by, or under the
   auspices of, the IETF. Versions of IETF Documents that are published
   by third parties, including those that are translated into other
   languages, should not be considered to be definitive versions of
   IETF Documents. The definitive version of these Legal Provisions is
   that published by, or under the auspices of, the IETF. Versions of
   these Legal Provisions that are published by third parties,
   including those that are translated into other languages, should not
   be considered to be definitive versions of these Legal Provisions.
   For the avoidance of doubt, each Contributor to the IETF Standards
   Process licenses each Contribution that he or she makes as part of
   the IETF Standards Process to the IETF Trust pursuant to the
   provisions of RFC 5378. No language to the contrary, or terms,
   conditions or rights that differ from or are inconsistent with the
   rights and licenses granted under RFC 5378, shall have any effect
   and shall be null and void, whether published or posted by such
   Contributor, or included with or in such Contribution.