TRILL working group                                        L. Dunbar
Internet Draft                                           D. Eastlake
Intended status: Standards Track                              Huawei
Expires: June 2015                                     Radia Perlman
                                                               Intel
                                                        I. Gashinsky
                                                               Yahoo
                                                   December 16, 2014

               Directory Assisted TRILL Encapsulation
          draft-ietf-trill-directory-assisted-encap-00.txt

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   This document may not be modified, and derivative works of it may
   not be created, except to publish it as an RFC and to translate it
   into languages other than English.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other
   documents at any time. It is inappropriate to use Internet-Drafts
   as reference material or to cite them other than as "work in
   progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

   This Internet-Draft will expire on June 16, 2015.

Copyright Notice

   Copyright (c) 2014 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document. Code Components extracted from this
   document must include Simplified BSD License text as described in
   Section 4.e of the Trust Legal Provisions and are provided without
   warranty as described in the Simplified BSD License.

Abstract

   This draft describes how data center networks can benefit from
   non-RBridge nodes performing TRILL encapsulation with assistance
   from a directory service.

Table of Contents

   1. Introduction
   2. Conventions used in this document
   3. Directory Assistance to Non-RBridge
   4. Source Nickname in Frames Encapsulated by Non-RBridge Nodes
   5. Benefits of Non-RBridge encapsulating TRILL header
      5.1. Avoid Nickname Exhaustion Issue
      5.2. Reduce MAC Tables for switches on Bridged LANs
   6. Conclusion and Recommendation
   7. Manageability Considerations
   8. Security Considerations
   9. IANA Considerations
   10. References
      10.1. Normative References
      10.2. Informative References
   11. Acknowledgments

1. Introduction

   This draft describes how data center networks can benefit from
   non-RBridge nodes performing TRILL encapsulation with assistance
   from a directory service.

   [RFC7067] describes the framework for edge RBridges to get the
   MAC&VLAN<->RBridgeEdge mapping from a directory service in data
   center environments instead of flooding unknown DAs across the
   TRILL domain. If it has the needed directory information, any
   node, even a non-RBridge node, can perform the TRILL
   encapsulation. This draft describes the benefits of, and a scheme
   for, non-RBridge nodes performing TRILL encapsulation.

2. Conventions used in this document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
   NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL"
   in this document are to be interpreted as described in RFC-2119
   [RFC2119].

   In this document, these words will appear with that interpretation
   only when in ALL CAPS. Lower case uses of these words are not to
   be interpreted as carrying RFC-2119 significance.

   AF:       Appointed Forwarder RBridge port [RFC6439]

   Bridge:   IEEE 802.1Q compliant device. In this draft, Bridge is
             used interchangeably with Layer 2 switch.

   DA:       Destination Address

   DC:       Data Center

   EoR:      End of Row switches in a data center. Also known as
             Aggregation switches in some data centers.

   Host:     Application running on a physical server or a virtual
             machine. A host usually has at least one IP address and
             at least one MAC address.

   SA:       Source Address

   ToR:      Top of Rack switch in a data center. Also known as an
             access switch in some data centers.

   TRILL-EN: TRILL Encapsulating Node. A node that only performs the
             TRILL encapsulation but doesn't participate in RBridge's
             IS-IS routing.

   VM:       Virtual Machine

3. Directory Assistance to Non-RBridge

   With directory assistance [RFC7067], a non-RBridge node can be
   informed whether a packet needs to be forwarded across the RBridge
   domain and, if so, which egress RBridge to use. Suppose the
   RBridge domain boundary starts at network switches (not virtual
   switches embedded on servers). A directory can then assist the
   virtual switches embedded on servers to encapsulate frames with a
   proper TRILL header by providing the nickname of the egress edge
   RBridge to which the destination is attached. The other
   information needed to encapsulate can be either learned by
   listening to TRILL Hellos, which indicate the MAC address and
   nickname of appropriate edge RBridges, or provided by
   configuration.

   If, based on the directory [RFC7067], a destination is not
   attached to another RBridge edge node, the non-RBridge node can
   forward the data frames natively, i.e., without adding a TRILL
   header.

       \              +-------+         +------+ TRILL Domain/
        \           +/------+ |       +/-----+ |            /
         \          | Aggr11| + ----- |AggrN1| +           /
          \         +---+---+/        +------+/           /
           \         /     \            /      \         /
            \       /       \          /        \       /
             \   +---+    +---+      +---+     +---+   /
              \- |T11|... |T1x|      |T21| ..  |T2y|---
                 +---+    +---+      +---+     +---+
                   |        |          |         |
                 +-|-+    +-|-+      +-|-+     +-|-+
                 |   |... | V |      | V | ..  | V |<- vSwitch
                 +---+    +---+      +---+     +---+
                 |   |... | V |      | V | ..  | V |
                 +---+    +---+      +---+     +---+
                 |   |... | V |      | V | ..  | V |
                 +---+    +---+      +---+     +---+

         Figure 1 TRILL domain in typical Data Center Network

   When a TRILL encapsulated data packet reaches the ingress RBridge,
   the ingress RBridge simply forwards the pre-encapsulated packet to
   the RBridge that is specified by the egress nickname field of the
   TRILL header of the data frame. When the ingress RBridge receives
   a native Ethernet frame, it handles it as usual and may drop it if
   it has complete directory information indicating that the target
   is not attached to the TRILL campus.

   In this environment with complete directory information, the
   ingress RBridge doesn't flood or forward the received data frames
   when the DA in the Ethernet data frames is unknown.

   When all nodes attached to an ingress RBridge can pre-encapsulate
   the TRILL header for traffic across the TRILL domain, the ingress
   RBridge doesn't need to encapsulate any native Ethernet frames
   into the TRILL domain. The attached nodes can be connected to
   multiple edge RBridges by having multiple ports or by a bridged
   LAN. In this environment, there is no need to designate AF ports,
   and all RBridge edge ports connected to one bridged LAN can
   receive and forward pre-encapsulated traffic, which can greatly
   improve the overall network utilization.

   Note: [RFC6325] Section 4.6.2 Bullet 8 specifies that an RBridge
   port can be configured to accept TRILL encapsulated frames from a
   neighbor that is not an RBridge.

   When a TRILL frame arrives at an RBridge whose nickname matches
   the destination (egress) nickname in the TRILL header of the
   frame, the processing is exactly the same as normal, i.e., the
   RBridge decapsulates the received TRILL frame and forwards the
   decapsulated frame to the target attached to its edge ports. When
   the DA of the decapsulated Ethernet frame is not in the egress
   RBridge's local MAC attachment tables, the egress RBridge floods
   the decapsulated frame to all attached links in the frame's VLAN,
   or drops the frame (if the egress RBridge is configured with that
   policy).

   We call a node that only performs the TRILL encapsulation but
   doesn't participate in RBridge's IS-IS routing a TRILL
   Encapsulating Node (TRILL-EN). The TRILL-EN can get the
   MAC&VLAN<->RBridgeEdge mapping table pulled from directory servers
   [RFC7067].

   Editor's note: RFC7067 has defined Push and Pull models for edge
   nodes to get directory mapping information. While the Pull model
   is relatively simple for a TRILL-EN to implement, the Push model
   requires some reliable flooding mechanism, like the one used by
   IS-IS, between the edge RBridge and the TRILL encapsulating node.
   Something like an extension to ES-IS might be needed.

   Upon receiving a native Ethernet frame, the TRILL-EN checks the
   MAC&VLAN<->RBridgeEdge mapping table and performs the
   corresponding TRILL encapsulation if the entry is found in the
   mapping table. If the destination address and VLAN of the received
   Ethernet frame do not exist in the mapping table and a pull
   request to a directory yields no positive reply, the Ethernet
   frame is dropped or forwarded in native form to an edge RBridge.
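
   The TRILL-EN lookup procedure described above can be sketched in
   a few lines of Python. This is only an illustrative sketch, not
   part of the specification: the names trill_en_decide, Decision,
   Action, and the pull callback are hypothetical, the mapping table
   is modeled as a plain dictionary keyed by (MAC, VLAN), and the
   choice between dropping and native forwarding on a miss is
   assumed to be a local policy flag.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Dict, Optional, Tuple

# Hypothetical key type: (destination MAC, VLAN ID)
Key = Tuple[str, int]


class Action(Enum):
    ENCAPSULATE = "encapsulate"        # add a TRILL header
    FORWARD_NATIVE = "forward-native"  # send unencapsulated to an edge RBridge
    DROP = "drop"


@dataclass
class Decision:
    action: Action
    egress_nickname: Optional[int] = None  # set only when encapsulating


def trill_en_decide(
    dest_mac: str,
    vlan: int,
    mapping: Dict[Key, int],
    pull: Callable[[str, int], Optional[int]],
    drop_on_miss: bool = False,
) -> Decision:
    """Decide how a TRILL-EN handles one native Ethernet frame.

    mapping models the MAC&VLAN<->RBridgeEdge table; pull stands in
    for a Pull Directory query and returns an egress nickname, or
    None when there is no positive reply (both are assumptions).
    """
    key = (dest_mac.lower(), vlan)
    nickname = mapping.get(key)
    if nickname is None:
        # Table miss: ask the directory, and cache a positive reply.
        nickname = pull(key[0], vlan)
        if nickname is not None:
            mapping[key] = nickname
    if nickname is not None:
        return Decision(Action.ENCAPSULATE, nickname)
    # No entry and no positive reply: drop or forward natively.
    return Decision(Action.DROP if drop_on_miss else Action.FORWARD_NATIVE)


if __name__ == "__main__":
    table = {("00:11:22:33:44:55", 10): 0x1234}
    no_reply = lambda mac, vlan: None
    print(trill_en_decide("00:11:22:33:44:55", 10, table, no_reply))
    print(trill_en_decide("aa:bb:cc:dd:ee:ff", 20, table, no_reply))
```

   A real TRILL-EN would also need to age out cached entries and
   bound the rate of pull requests; those details are left out here.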

      +------------+--------+---------+---------+--+-------+---+
      |OuterEtherHd|TRILL HD| InnerDA | InnerSA |..|Payload|FCS|
      +------------+--------+---------+---------+--+-------+---+
            ^
            |               |                      |
            |
            |
            |      +-------+  TRILL    +------+
            |      |  R1   |-----------|  R2  |  Decapsulate
            |      +---+---+  domain   +------+  TRILL header
            |          |                  |
            +----------|                  |
                       |                  |
                    +-----+            +-----+
   Non-RBridge node:| T12 |            | T22 |
   Encapsulate TRILL+-----+            +-----+
   Header for data
   Frames to traverse
   TRILL domain.

                Figure 2 Data frames from TRILL-EN

4. Source Nickname in Frames Encapsulated by Non-RBridge Nodes

   The TRILL header includes a Source RBridge's Nickname (ingress)
   and a Destination RBridge's Nickname (egress). When a TRILL header
   is added by a TRILL-EN, the nickname of the ingress edge RBridge
   is used as the source (ingress) nickname.

5. Benefits of Non-RBridge encapsulating TRILL header

5.1. Avoid Nickname Exhaustion Issue

   For a large data center with hundreds of thousands of virtualized
   servers, setting the TRILL boundary at the servers' virtual
   switches will create a TRILL domain with hundreds of thousands of
   RBridge nodes, which raises the issues of TRILL nickname
   exhaustion and challenges to IS-IS. On the other hand, setting the
   TRILL boundary at aggregation switches that have many virtualized
   servers attached can limit the number of RBridge nodes in a TRILL
   domain, but introduces the issues of a very large
   MAC&VLAN<->RBridgeEdge mapping table to be maintained by RBridge
   edge nodes and the necessity of enforcing AF ports.

   Allowing non-RBridge nodes to pre-encapsulate data frames with a
   TRILL header makes it possible to have a TRILL domain with a
   reasonable number of RBridge nodes in a large data center. All the
   TRILL-ENs attached to one RBridge are represented by one TRILL
   nickname, which can avoid the nickname exhaustion problem.

5.2. Reduce MAC Tables for switches on Bridged LANs

   When hosts in a VLAN (or subnet) span multiple RBridge edge nodes
   and each RBridge edge has multiple VLANs enabled, the switches on
   the bridged LANs attached to the RBridge edge are exposed to all
   MAC addresses among all the VLANs enabled.

   For example, for an Access switch with 40 physical servers
   attached, where each server has 100 VMs, there are 4000 hosts
   under the Access switch. If hosts/VMs can indeed be moved
   anywhere, the worst case for the Access switch is when all those
   4000 VMs belong to different VLANs, i.e., the Access switch has
   4000 VLANs enabled. If each VLAN has 200 hosts, this Access
   switch's MAC table potentially has 200*4000 = 800,000 entries.

   If the virtual switches on servers pre-encapsulate the data frames
   destined for hosts attached to other RBridge edge nodes, the outer
   MAC DA of those TRILL encapsulated data frames will be the MAC
   address of the local RBridge edge, i.e., the ingress RBridge.
   Therefore, the switches on the local bridged LAN don't need to
   keep the MAC entries for remote hosts attached to other edge
   RBridges.

   But the traffic from nodes attached to other RBridges is
   decapsulated and has the true source and destination MACs. To
   prevent local bridges from learning remote hosts' MACs and adding
   them to their MAC tables, one simple way is to disable data plane
   learning on local bridges. The local bridges can be pre-configured
   with the MAC addresses of local hosts with the assistance of a
   directory. The local bridges can always send frames with an
   unknown destination to the ingress RBridge. In an environment
   where a large number of VMs are instantiated in one server, the
   number of remote MAC addresses could be very large.
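
   The worst-case figure from the Access switch example in this
   section can be reproduced with a short calculation. The sketch
   below is illustrative only: the function names are hypothetical,
   and the second function assumes that learning is disabled and
   that only the MAC addresses of locally attached hosts are
   configured on the switch.

```python
def worst_case_mac_entries(servers: int, vms_per_server: int,
                           hosts_per_vlan: int) -> int:
    """MAC entries when every local VM sits in a different VLAN and the
    switch learns every host of every enabled VLAN."""
    enabled_vlans = servers * vms_per_server  # one VLAN per local VM
    return enabled_vlans * hosts_per_vlan


def local_only_mac_entries(servers: int, vms_per_server: int) -> int:
    """MAC entries when only locally attached hosts are kept
    (assumption: remote MACs are never learned)."""
    return servers * vms_per_server


if __name__ == "__main__":
    # Numbers from the example: 40 servers, 100 VMs each, 200 hosts per VLAN.
    print(worst_case_mac_entries(40, 100, 200))  # 800000
    print(local_only_mac_entries(40, 100))       # 4000
```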

   If it is not feasible to disable learning and pre-configure MAC
   tables for local bridges, one effective method to minimize local
   bridges' MAC table size is to use the server's MAC address to hide
   the MAC addresses of the attached VMs. That is, the server acting
   as an edge node uses its own MAC address in the Source Address
   field of the packets originated from an embedded host (or VM).
   When the Ethernet frame arrives at the target edge node (the
   server), the target edge node can send the packet to the
   corresponding destination host based on the packet's IP address.
   Very often, the target edge node communicates with the embedded
   VMs via a layer 2 virtual switch. In this case, the target edge
   node can construct the proper Ethernet header with assistance from
   the directory. The information from the directory includes the
   proper host IP to MAC mapping information.

6. Conclusion and Recommendation

   When directory information is available, nodes outside the TRILL
   domain can encapsulate data frames destined for nodes attached to
   remote RBridges. The non-RBridge encapsulation approach is
   especially useful when there are a large number of servers in a
   data center equipped with hypervisor-based virtual switches. It is
   relatively easy for virtual switches, which are usually software
   based, to get directory assistance and perform network address
   encapsulation.

7. Manageability Considerations

   Directory assistance is required for a non-TRILL node to
   pre-encapsulate packets destined towards remote RBridges.

8. Security Considerations

   Pull Directory queries and responses are transmitted as
   RBridge-to-RBridge or native RBridge Channel messages. Such
   messages can be secured as specified in [ChannelTunnel].

   For general TRILL security considerations, see [RFC6325].

9. IANA Considerations

   This document requires no IANA actions.
   RFC Editor: Please remove this section before publication.

10. References

10.1. Normative References

   [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
             Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC6325] Perlman, R., et al., "Routing Bridges (RBridges): Base
             Protocol Specification", RFC 6325, July 2011.

   [RFC6439] Perlman, R., Eastlake, D., Li, Y., Banerjee, A., and F.
             Hu, "Routing Bridges (RBridges): Appointed Forwarders",
             RFC 6439, November 2011.

10.2. Informative References

   [RFC7067] Dunbar, L., et al., "Directory Assistance Problem and
             High-Level Design Proposal", RFC 7067, November 2013.

   [ChannelTunnel] Eastlake, D. and Y. Li, "TRILL: RBridge Channel
             Tunnel Protocol", draft-eastlake-trill-channel-tunnel,
             work in progress.

11. Acknowledgments

   This document was prepared using 2-Word-v2.0.template.dot.

Authors' Addresses

   Linda Dunbar
   Huawei Technologies
   5340 Legacy Drive, Suite 175
   Plano, TX 75024, USA
   Phone: (469) 277 5840
   Email: linda.dunbar@huawei.com

   Donald Eastlake
   Huawei Technologies
   155 Beaver Street
   Milford, MA 01757 USA
   Phone: 1-508-333-2270
   Email: d3e3e3@gmail.com

   Radia Perlman
   Intel Labs
   2200 Mission College Blvd.
   Santa Clara, CA 95054-1549 USA
   Phone: 1-408-765-8080
   Email: Radia@alum.mit.edu

   Igor Gashinsky
   Yahoo
   45 West 18th Street 6th floor
   New York, NY 10011
   Email: igor@yahoo-inc.com