idnits 2.17.1

draft-dunbar-trill-directory-assisted-encap-07.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  == The document doesn't use any RFC 2119 keywords, yet seems to have RFC
     2119 boilerplate text.

  -- The document date (April 8, 2014) is 3665 days in the past.  Is this
     intentional?

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  ** Obsolete normative reference: RFC 6439 (Obsoleted by RFC 8139)

     Summary: 1 error (**), 0 flaws (~~), 2 warnings (==), 1 comment (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

TRILL working group                                          L. Dunbar
Internet Draft                                             D. Eastlake
Intended status: Standard Track                                 Huawei
Expires: October 2014                                    Radia Perlman
                                                                 Intel
                                                          I. Gashinsky
                                                                 Yahoo
                                                         April 8, 2014

               Directory Assisted TRILL Encapsulation
         draft-dunbar-trill-directory-assisted-encap-07.txt

Status of this Memo

   This Internet-Draft is submitted in full conformance with
   the provisions of BCP 78 and BCP 79.

   This Internet-Draft is submitted in full conformance with
   the provisions of BCP 78 and BCP 79. This document may
   not be modified, and derivative works of it may not be
   created, except to publish it as an RFC and to translate
   it into languages other than English.

   Internet-Drafts are working documents of the Internet
   Engineering Task Force (IETF), its areas, and its working
   groups. Note that other groups may also distribute
   working documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum
   of six months and may be updated, replaced, or obsoleted
   by other documents at any time. It is inappropriate to
   use Internet-Drafts as reference material or to cite them
   other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be
   accessed at http://www.ietf.org/shadow.html

   This Internet-Draft will expire on September 8, 2014.

Copyright Notice

   Copyright (c) 2014 IETF Trust and the persons identified
   as the document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's
   Legal Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the
   date of publication of this document. Please review these
   documents carefully, as they describe your rights and
   restrictions with respect to this document. Code
   Components extracted from this document must include
   Simplified BSD License text as described in Section 4.e
   of the Trust Legal Provisions and are provided without
   warranty as described in the Simplified BSD License.

Abstract

   This draft describes how data center networks can benefit from
   non-RBridge nodes performing TRILL encapsulation with
   assistance from a directory service.

Table of Contents

   1. Introduction
   2. Conventions used in this document
   3. Directory Assistance to Non-RBridge
   4. Source Nickname in Frames Encapsulated by Non-RBridge Nodes
   5. Benefits of Non-RBridge encapsulating TRILL header
      5.1. Avoid Nickname Exhaustion Issue
      5.2. Reduce MAC Tables for switches on Bridged LANs
   6. Conclusion and Recommendation
   7. Manageability Considerations
   8. Security Considerations
   9. IANA Considerations
   10. References
      10.1. Normative References
      10.2. Informative References
   11. Acknowledgments

1. Introduction

   This draft describes how data center networks can benefit from
   non-RBridge nodes performing TRILL encapsulation with
   assistance from a directory service.

   [RFC7067] describes a framework in which RBridge edge nodes
   obtain the MAC&VLAN<->RBridgeEdge mapping from a directory
   service in a data center environment instead of flooding
   frames with unknown DAs across the TRILL domain. When a
   directory is used, any node, even a non-RBridge node, can
   perform the TRILL encapsulation. This draft describes the
   benefits of, and a scheme for, non-RBridge nodes performing
   TRILL encapsulation.

2. Conventions used in this document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL",
   "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED",
   "MAY", and "OPTIONAL" in this document are to be
   interpreted as described in RFC-2119 [RFC2119].

   In this document, these words will appear with that
   interpretation only when in ALL CAPS.
   Lower case uses of these words are not to be interpreted as
   carrying RFC-2119 significance.

   AF:       Appointed Forwarder RBridge port [RFC6439].

   Bridge:   IEEE 802.1Q compliant device. In this draft, Bridge
             is used interchangeably with Layer 2 switch.

   DA:       Destination Address.

   DC:       Data Center.

   EoR:      End of Row switch in a data center. Also known as
             an aggregation switch in some data centers.

   Host:     Application running on a physical server or a
             virtual machine. A host usually has at least one IP
             address and at least one MAC address.

   SA:       Source Address.

   ToR:      Top of Rack switch in a data center. Also known as
             an access switch in some data centers.

   TRILL-EN: TRILL Encapsulating Node. A node that only
             performs the TRILL encapsulation but doesn't
             participate in the RBridges' IS-IS routing.

   VM:       Virtual Machine.

3. Directory Assistance to Non-RBridge

   With directory assistance [RFC7067], a non-RBridge node can be
   informed whether a packet needs to be forwarded across the
   RBridge domain and, if so, which egress RBridge it should be
   sent to. Suppose the RBridge domain boundary starts at network
   switches (not at virtual switches embedded on servers). A
   directory can then assist the virtual switches embedded on
   servers to encapsulate frames with a proper TRILL header by
   providing the nickname of the egress RBridge edge to which the
   target is attached. The other information needed for
   encapsulation can either be learned by listening to TRILL
   Hellos, which indicate the MAC address and nickname of the
   appropriate edge RBridges, or be configured.

   If, according to the directory [RFC7067], a target is not
   attached to another RBridge edge node, the non-RBridge node
   can forward the data frames natively, i.e., without adding a
   TRILL header.

       \              +-------+         +------+    TRILL Domain/
        \           +/------+ |       +/-----+ |               /
         \          | Aggr11| + ----- |AggrN1| +              /
          \         +---+---+/        +------+/              /
           \         /     \            /      \            /
            \       /       \          /        \          /
             \   +---+    +---+      +---+     +---+      /
              \- |T11|... |T1x|      |T21| ... |T2y|---
                 +---+    +---+      +---+     +---+
                   |        |          |         |
                 +-|-+    +-|-+      +-|-+     +-|-+
                 |   |... | V |      | V | ... | V |<- vSwitch
                 +---+    +---+      +---+     +---+
                 |   |... | V |      | V | ... | V |
                 +---+    +---+      +---+     +---+
                 |   |... | V |      | V | ... | V |
                 +---+    +---+      +---+     +---+

       Figure 1 TRILL domain in typical Data Center Network

   When a TRILL encapsulated data packet reaches the ingress
   RBridge, the ingress RBridge can simply forward the
   pre-encapsulated packet to the RBridge that is specified by
   the egress nickname field of the TRILL header of the data
   frame. When the ingress RBridge receives a native Ethernet
   frame, it handles it as usual and may drop it if it has
   complete directory information indicating that the target is
   not attached to the TRILL campus.

   In this environment with complete directory information, the
   ingress RBridge doesn't flood or send the received Ethernet
   data frames into the TRILL domain when the DA in the Ethernet
   data frames is unknown.

   When all nodes attached to the ingress RBridge can
   pre-encapsulate traffic that crosses the TRILL domain with a
   TRILL header, the ingress RBridge doesn't need to encapsulate
   any native Ethernet frames into the TRILL domain. All native
   Ethernet frames are switched by the attached bridged LAN per
   IEEE 802.1Q. In this environment, there is no need to
   designate AF ports, and all RBridge edge ports connected to
   one bridged LAN can receive and forward pre-encapsulated
   traffic, which can greatly improve the overall network
   utilization.

   Note: [RFC6325] Section 4.6.2 Bullet 8 specifies that an
   RBridge port can be configured to accept TRILL encapsulated
   frames from a neighbor that is not an RBridge.
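
   The ingress RBridge behavior described above can be sketched as
   a small decision function. This is only an illustration under
   stated assumptions: the names Action and ingress_action, and the
   three boolean inputs, are hypothetical and do not come from
   [RFC6325] or [RFC7067]. The draft says the RBridge "may" drop a
   native frame when complete directory information shows the
   target is absent; the sketch assumes a policy that does drop.

```python
from enum import Enum


class Action(Enum):
    """Hypothetical outcomes for a frame arriving at an ingress RBridge."""
    FORWARD_BY_EGRESS_NICKNAME = 1  # pre-encapsulated: forward per TRILL header
    NORMAL_INGRESS = 2              # native frame: usual ingress processing
    DROP = 3                        # native frame, target known to be absent


def ingress_action(pre_encapsulated: bool,
                   directory_complete: bool,
                   target_in_campus: bool) -> Action:
    """Pick an action for a frame received on an RBridge edge port.

    Assumption: the port is configured to accept TRILL encapsulated
    frames from a non-RBridge neighbor, and the drop policy is enabled.
    """
    if pre_encapsulated:
        # The TRILL-EN already chose the egress nickname; just forward.
        return Action.FORWARD_BY_EGRESS_NICKNAME
    if directory_complete and not target_in_campus:
        # Complete directory information says the target is not attached
        # to the TRILL campus, so the frame is not flooded.
        return Action.DROP
    return Action.NORMAL_INGRESS
```

   For example, ingress_action(False, True, False) yields
   Action.DROP, while a pre-encapsulated frame is always forwarded
   by its egress nickname regardless of the other two inputs.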

   When a TRILL frame arrives at an RBridge whose nickname
   matches the destination (egress) nickname in the TRILL header
   of the frame, the processing is exactly the same as normal,
   i.e., the RBridge decapsulates the received TRILL frame and
   forwards the decapsulated Ethernet frame to the target
   attached to its edge ports. When the DA of the decapsulated
   Ethernet frame is not in the egress RBridge's local MAC
   attachment tables, the egress RBridge can flood the
   decapsulated Ethernet frame to all attached hosts or drop the
   frame (if the egress RBridge is configured with such a
   policy).

   We call a node that only performs the TRILL encapsulation but
   doesn't participate in the RBridges' IS-IS routing a TRILL
   Encapsulating Node (TRILL-EN). The TRILL Encapsulating Node
   can pull the MAC&VLAN<->RBridgeEdge mapping table from
   directory servers [RFC7067].

   Editor's note: RFC7067 defines Push and Pull models for edge
   nodes to get directory mapping information. While the Pull
   model is relatively simple for a TRILL-EN to implement, the
   Push model requires a reliable flooding mechanism, like the
   one used by IS-IS, between the edge RBridge and the TRILL
   encapsulating node. Something like an extension to ES-IS might
   be needed.

   Upon receiving a native Ethernet frame, the TRILL-EN checks
   the MAC&VLAN<->RBridgeEdge mapping table and performs the
   corresponding TRILL encapsulation if an entry is found in the
   mapping table. If the destination address and VLAN of the
   received Ethernet frame do not exist in the mapping table and
   a pull request to a directory returns no positive reply, the
   Ethernet frame is dropped or forwarded per IEEE 802.1Q.
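
   The TRILL-EN procedure in the preceding paragraph can be
   sketched as follows. This is a minimal illustration, not a
   definitive implementation: TrillFrame, trill_en_handle, the
   dictionary-based mapping table and the optional pull callback
   are all hypothetical names chosen for the example, and the
   frame layout is reduced to the few fields this draft mentions
   (outer DA set to the local edge RBridge's MAC, ingress nickname
   set to that RBridge's nickname, egress nickname taken from the
   directory).

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple

# Hypothetical key type: (destination MAC, VLAN ID).
MacVlan = Tuple[str, int]


@dataclass(frozen=True)
class TrillFrame:
    """Simplified stand-in for a TRILL encapsulated frame."""
    outer_da: str           # MAC address of the local edge RBridge
    ingress_nickname: int   # nickname of the local edge RBridge
    egress_nickname: int    # nickname learned from the directory
    inner_frame: bytes      # original native Ethernet frame


def trill_en_handle(
    dst_mac: str,
    vlan: int,
    frame: bytes,
    mapping: Dict[MacVlan, int],
    local_rbridge_mac: str,
    local_rbridge_nickname: int,
    pull: Optional[Callable[[str, int], Optional[int]]] = None,
) -> Optional[TrillFrame]:
    """Return an encapsulated frame, or None if no mapping is known.

    None means the caller drops the frame or forwards it natively
    per IEEE 802.1Q, as local policy dictates.
    """
    egress = mapping.get((dst_mac, vlan))
    if egress is None and pull is not None:
        # Assumed pull interface: returns an egress nickname or None.
        egress = pull(dst_mac, vlan)
        if egress is not None:
            mapping[(dst_mac, vlan)] = egress  # cache the positive reply
    if egress is None:
        return None
    return TrillFrame(local_rbridge_mac, local_rbridge_nickname,
                      egress, frame)
```

   Caching a positive pull reply in the local table is a design
   choice of this sketch; the draft does not say how long a
   TRILL-EN keeps pulled entries.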

   +------------+--------+---------+---------+--+-------+---+
   |OuterEtherHd|TRILL HD| InnerDA | InnerSA |..|Payload|FCS|
   +------------+--------+---------+---------+--+-------+---+
           ^
           |         |                   |
           |
           |
           |     +-------+   TRILL   +------+
           |     |  R1   |-----------|  R2  |  Decapsulate
           |     +---+---+   domain  +------+  TRILL header
           |         |                   |
           +----------|                  |
                     |                   |
                  +-----+             +-----+
 Non-RBridge node:|T12  |             | T22 |
 Encapsulate TRILL+-----+             +-----+
 Header for data
 Frames to traverse
 TRILL domain.

            Figure 2 Data frames from TRILL-EN

4. Source Nickname in Frames Encapsulated by Non-RBridge Nodes

   The TRILL header includes a source RBridge's nickname
   (ingress) and a destination RBridge's nickname (egress). When
   a TRILL header is added by a TRILL-EN, the ingress RBridge
   edge node's nickname is used in the ingress (source) nickname
   field.

5. Benefits of Non-RBridge encapsulating TRILL header

5.1. Avoid Nickname Exhaustion Issue

   For a large data center with hundreds of thousands of
   virtualized servers, setting the TRILL boundary at the
   servers' virtual switches would create a TRILL domain with
   hundreds of thousands of RBridge nodes, which would exhaust
   TRILL nicknames and pose challenges to IS-IS. Setting the
   TRILL boundary at aggregation switches that have many
   virtualized servers attached can limit the number of RBridge
   nodes in a TRILL domain, but it introduces the issues of a
   very large MAC&VLAN<->RBridgeEdge mapping table to be
   maintained by RBridge edge nodes and the necessity of
   enforcing AF ports.

   Allowing non-RBridge nodes to pre-encapsulate data frames with
   a TRILL header makes it possible to have a TRILL domain with a
   reasonable number of RBridge nodes in a large data center. All
   the TRILL-ENs attached to one RBridge are represented by one
   TRILL nickname, which can avoid the nickname exhaustion
   problem.

5.2. Reduce MAC Tables for switches on Bridged LANs

   When hosts in a VLAN (or subnet) span multiple RBridge edge
   nodes and each RBridge edge has multiple VLANs enabled, the
   switches on the bridged LANs attached to the RBridge edge are
   exposed to all MAC addresses in all the enabled VLANs.

   For example, for an access switch with 40 physical servers
   attached, where each server has 100 VMs, there are 4000 hosts
   under the access switch. If hosts/VMs can indeed be moved
   anywhere, the worst case for the access switch is when all
   those 4000 VMs belong to different VLANs, i.e., the access
   switch has 4000 VLANs enabled. If each VLAN has 200 hosts,
   this access switch's MAC table potentially has 200*4000 =
   800,000 entries.

   If the virtual switches on servers pre-encapsulate the data
   frames destined for hosts attached to other RBridge edge nodes
   with a TRILL header, the outer MAC DA of those TRILL
   encapsulated data frames will be the MAC address of the local
   RBridge edge, i.e., the ingress RBridge. Therefore, the
   switches on the local bridged LAN don't need to keep MAC
   entries for remote hosts attached to other RBridge edges.

   But the traffic from nodes attached to other RBridges is
   decapsulated and carries the true source and destination MACs.
   To prevent local bridges from learning remote hosts' MACs and
   adding them to their MAC tables, one simple way is to disable
   learning on local bridges. The local bridges can be
   pre-installed with the MAC addresses of local hosts with the
   assistance of a directory. The local bridges can always send
   frames with an unknown destination to the ingress RBridge. In
   an environment where end stations are VMs embedded in a
   server, the number of remote MAC addresses could be very large.
   If it is not feasible to disable learning and pre-install MAC
   tables for local bridges, one effective method to minimize the
   local bridges' MAC table size is to use the server's MAC
   address to hide the MAC addresses of the attached VMs. That
   is, the server, acting as an edge node, uses its own MAC
   address in the Source Address field of the packets originated
   from an embedded host (or VM). When the Ethernet frame arrives
   at the target edge node (the server), the target edge node can
   send the packet to the corresponding destination host based on
   the packet's IP address. Very often, the target edge node
   communicates with the embedded VMs via a layer 2 virtual
   switch. In this case, the target edge node can construct the
   proper Ethernet header with assistance from the directory. The
   information from the directory includes the proper host IP to
   MAC mapping information.

6. Conclusion and Recommendation

   When directory information is available, nodes outside the
   TRILL domain become capable of encapsulating data frames
   destined for remote RBridges that are not on the same bridged
   LAN with a TRILL header. The non-RBridge encapsulation
   approach is especially useful when there are a large number of
   servers in a data center equipped with hypervisor-based
   virtual switches. It is relatively easy for virtual switches,
   which are usually software based, to get directory assistance
   and perform network address encapsulation.

7. Manageability Considerations

   Directory assistance is required for a non-TRILL node to
   pre-encapsulate packets destined for remote RBridges.

8. Security Considerations

   Pull Directory queries and responses are transmitted as
   RBridge-to-RBridge or native RBridge Channel messages. Such
   messages can be secured as specified in [ChannelTunnel].

   For general TRILL security considerations, see [RFC6325].

9. IANA Considerations

   This document requires no IANA actions. RFC Editor:
   Please remove this section before publication.

10. References

10.1. Normative References

   [RFC2119] Bradner, S., "Key words for use in RFCs to
             Indicate Requirement Levels", BCP 14, RFC 2119,
             March 1997.

   [RFC6325] Perlman, R., et al., "Routing Bridges (RBridges):
             Base Protocol Specification", RFC 6325, July
             2011.

   [RFC6439] Perlman, R., Eastlake, D., Li, Y., Banerjee,
             A., and F. Hu, "Routing Bridges (RBridges):
             Appointed Forwarders", RFC 6439, November 2011.

10.2. Informative References

   [RFC7067] Dunbar, L., et al., "Directory Assistance Problem
             and High-Level Design Proposal", RFC 7067,
             November 2013.

   [ChannelTunnel] Eastlake, D. and Y. Li, "TRILL: RBridge
             Channel Tunnel Protocol", draft-eastlake-trill-
             channel-tunnel, work in progress.

11. Acknowledgments

   This document was prepared using 2-Word-v2.0.template.dot.

Authors' Addresses

   Linda Dunbar
   Huawei Technologies
   5340 Legacy Drive, Suite 175
   Plano, TX 75024, USA
   Phone: (469) 277 5840
   Email: linda.dunbar@huawei.com

   Donald Eastlake
   Huawei Technologies
   155 Beaver Street
   Milford, MA 01757 USA
   Phone: 1-508-333-2270
   Email: d3e3e3@gmail.com

   Radia Perlman
   Intel Labs
   2200 Mission College Blvd.
   Santa Clara, CA 95054-1549 USA
   Phone: 1-408-765-8080
   Email: Radia@alum.mit.edu

   Igor Gashinsky
   Yahoo
   45 West 18th Street 6th floor
   New York, NY 10011
   Email: igor@yahoo-inc.com