Internet Engineering Task Force                       S. Pallagatti, Ed.
Internet-Draft                                                   Rtbrick
Intended status: Standards Track                             S. Paragiri
Expires: February 18, 2019                              Juniper Networks
                                                             V. Govindan
                                                            M. Mudigonda
                                                                   Cisco
                                                               G. Mirsky
                                                               ZTE Corp.
                                                         August 17, 2018


                             BFD for VXLAN
                        draft-ietf-bfd-vxlan-02

Abstract

   This document describes the use of the Bidirectional Forwarding
   Detection (BFD) protocol in Virtual eXtensible Local Area Network
   (VXLAN) overlay networks.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on February 18, 2019.

Copyright Notice

   Copyright (c) 2018 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Conventions used in this document
     2.1.  Terminology
     2.2.  Requirements Language
   3.  Use cases
   4.  Deployment
   5.  BFD Packet Transmission over VXLAN Tunnel
     5.1.  BFD Packet Encapsulation in VXLAN
   6.  Reception of BFD packet from VXLAN Tunnel
     6.1.  Demultiplexing of the BFD packet
   7.  Use of reserved VNI
   8.  Echo BFD
   9.  IANA Considerations
   10. Security Considerations
   11. Contributors
   12. Acknowledgments
   13. Normative References
   Authors' Addresses

1.  Introduction

   Virtual eXtensible Local Area Network (VXLAN) [RFC7348] provides an
   encapsulation scheme that allows virtual machines (VMs) to
   communicate in a data center network.

   VXLAN is typically deployed in data centers interconnecting
   virtualized hosts, which may be spread across multiple racks.  The
   individual racks may be parts of different Layer 3 networks, or they
   could be in a single Layer 2 network.  The VXLAN segments/overlays
   are overlaid on top of a Layer 3 network.

   A VM can communicate with another VM only if they are on the same
   VXLAN segment.  VMs are unaware of VXLAN tunnels, as a VXLAN tunnel
   is terminated on a VXLAN Tunnel End Point (VTEP) (hypervisor/TOR).
   VTEPs (hypervisor/TOR) are responsible for encapsulating and
   decapsulating frames exchanged among VMs.

   The ability to monitor path continuity, i.e., to perform a proactive
   continuity check (CC) for these tunnels, is important.  The
   asynchronous mode of BFD, as defined in [RFC5880], can be used to
   monitor a VXLAN tunnel.  Use of [I-D.ietf-bfd-multipoint] is for
   future study.

   Also, BFD in VXLAN can be used to monitor the particular service
   nodes that are designated to properly handle Layer 2 broadcast,
   unknown unicast, and multicast (BUM) traffic.  Such nodes, often
   referred to as "replicators", are usually virtual VTEPs and can be
   monitored by physical VTEPs to minimize BUM traffic directed to an
   unavailable replicator.

   This document describes the use of the Bidirectional Forwarding
   Detection (BFD) protocol in VXLAN to monitor the continuity of the
   path between Network Virtualization Edges (NVEs) and/or the
   availability of a replicator service node.

   In this document, the terms NVE and VTEP are used interchangeably.

2.  Conventions used in this document

2.1.  Terminology

   BFD - Bidirectional Forwarding Detection

   CC - Continuity Check

   NVE - Network Virtualization Edge

   TOR - Top of Rack

   VM - Virtual Machine

   VTEP - VXLAN Tunnel End Point

   VXLAN - Virtual eXtensible Local Area Network

2.2.  Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in BCP
   14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

3.  Use cases

   The primary use case of BFD for VXLAN is the continuity check of a
   tunnel.  By exchanging BFD control packets between VTEPs, an operator
   exercises the VXLAN path in both the underlay and the overlay, thus
   verifying VXLAN path availability and VTEP reachability.  BFD
   failure detection can be used for maintenance.  There are other use
   cases such as the following:

   Layer 2 VMs:

      Most deployments will have VMs with only L2 capabilities that
      may not support L3.  BFD, being an L3 protocol, can be used as a
      tunnel CC mechanism, where BFD will start and terminate at the
      NVEs, e.g., VTEPs.

      It is possible to aggregate the CC sessions for multiple
      tenants by running a BFD session between the VTEPs over the
      VXLAN tunnel.

   Fault localization:

      It is also possible that VMs are L3 aware and can host a BFD
      session.  In these cases, BFD sessions can be established among
      VMs for CC.  Also, BFD sessions can be created among VTEPs for
      tunnel CC.  Having a hierarchical OAM model helps localize
      faults, though it requires additional consideration.

   Service node reachability:

      The service node is responsible for sending BUM traffic.  A
      service node tunnel may terminate at a VTEP that does not host
      any VMs.  In that case, a BFD session between the TOR/hypervisor
      and the service node can be used to monitor service node
      reachability.

4.  Deployment

   Figure 1 illustrates the scenario with two servers, each of them
   hosting two VMs.  The servers host VTEPs that terminate two VXLAN
   tunnels with VXLAN Network Identifiers (VNIs) 100 and 200,
   respectively.  Separate BFD sessions can be established between the
   VTEPs (IP1 and IP2) for monitoring each of the VXLAN tunnels (VNI 100
   and 200).  BFD packets intended for a hypervisor VTEP should not be
   forwarded to a VM, because the VM may drop them and cause a false
   negative.  This method is applicable whether the VTEP is a virtual
   or physical device.

      +------------+-------------+
      |        Server 1          |
      |                          |
      | +----+----+  +----+----+ |
      | |VM1-1    |  |VM1-2    | |
      | |VNI 100  |  |VNI 200  | |
      | |         |  |         | |
      | +---------+  +---------+ |
      | Hypervisor VTEP (IP1)    |
      +--------------------------+
                            |
                            |
                            |
                            |   +-------------+
                            |   |   Layer 3   |
                            |---|   Network   |
                                |             |
                                +-------------+
                                    |
                                    |
                                    +-----------+
                                                |
                                                |
                                   +------------+-------------+
                                   |    Hypervisor VTEP (IP2) |
                                   | +----+----+  +----+----+ |
                                   | |VM2-1    |  |VM2-2    | |
                                   | |VNI 100  |  |VNI 200  | |
                                   | |         |  |         | |
                                   | +---------+  +---------+ |
                                   |      Server 2            |
                                   +--------------------------+

                   Figure 1: Reference VXLAN domain

5.  BFD Packet Transmission over VXLAN Tunnel

   BFD packets MUST be encapsulated and sent to a remote VTEP as
   explained in Section 5.1.  Implementations SHOULD ensure that the BFD
   packets follow the same lookup path as VXLAN data packets within the
   sender system.

5.1.  BFD Packet Encapsulation in VXLAN

   BFD packets are encapsulated in VXLAN as described below.  The VXLAN
   packet format is defined in Section 5 of [RFC7348].  The Outer IP/UDP
   and VXLAN headers MUST be encoded by the sender as defined in
   [RFC7348].

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   ~                     Outer Ethernet Header                     ~
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   ~                       Outer IPvX Header                       ~
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   ~                       Outer UDP Header                        ~
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   ~                         VXLAN Header                          ~
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   ~                     Inner Ethernet Header                     ~
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   ~                       Inner IPvX Header                       ~
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   ~                       Inner UDP Header                        ~
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   ~                      BFD Control Message                      ~
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                              FCS                              |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

         Figure 2: VXLAN Encapsulation of BFD Control Message

   The BFD packet MUST be carried inside the inner MAC frame of the
   VXLAN packet.
   The inner MAC frame carrying the BFD payload has the following
   format:

   Ethernet Header:

      Destination MAC: This MUST be the dedicated MAC TBA (Section 9)
      or the MAC address of the destination VTEP.  The details of how
      the MAC address of the destination VTEP is obtained are outside
      the scope of this document.

      Source MAC: MAC address of the originating VTEP.

   IP header:

      Source IP: IP address of the originating VTEP.

      Destination IP: IP address of the terminating VTEP.

      TTL: MUST be set to 1 to ensure that the BFD packet is not
      routed within the L3 underlay network.

   The fields of the UDP header and the BFD control packet are
   encoded as specified in [RFC5881] for point-to-point (p2p) VXLAN
   tunnels.

6.  Reception of BFD packet from VXLAN Tunnel

   Once a packet is received, the VTEP MUST validate the packet as
   described in Section 4.1 of [RFC7348].  If the Destination MAC of
   the inner MAC frame matches the dedicated MAC or the MAC address of
   the VTEP, the packet MUST be processed further.

   The UDP destination port and the TTL of the inner IP packet carried
   in the inner Ethernet frame MUST be validated to determine if the
   received packet can be processed by BFD.  A BFD packet whose inner
   Destination MAC is the MAC address of the VTEP or the dedicated MAC
   address MUST NOT be forwarded to VMs.

   To ensure BFD detects the proper configuration of the VXLAN Network
   Identifier (VNI) in a remote VTEP, a lookup SHOULD be performed with
   the MAC-DA and VNI as the key in the Virtual Forwarding Instance
   (VFI) table of the originating/terminating VTEP to exercise the VFI
   associated with the VNI.

6.1.  Demultiplexing of the BFD packet

   Demultiplexing of IP BFD packets is defined in Section 3 of
   [RFC5881].  Since multiple BFD sessions may be running between two
   VTEPs, there needs to be a mechanism for demultiplexing received BFD
   packets to the proper session.
   The procedure for demultiplexing packets with Your Discriminator
   equal to 0 is different from [RFC5880].  For such packets, the BFD
   session MUST be identified using the inner headers, i.e., the source
   IP and the destination IP present in the IP header carried by the
   payload of the VXLAN encapsulated packet.  The VNI of the packet
   SHOULD be used to derive interface-related information for
   demultiplexing the packet.  If a BFD packet is received with a
   non-zero Your Discriminator, then the BFD session MUST be
   demultiplexed only with Your Discriminator as the key.

7.  Use of reserved VNI

   A BFD session MAY be established for the reserved VNI 0.  One way to
   aggregate BFD sessions between VTEPs is to establish a BFD session
   with VNI 0.  A VTEP MAY also use VNI 0 to establish a BFD session
   with a service node.

8.  Echo BFD

   Support for echo BFD is outside the scope of this document.

9.  IANA Considerations

   IANA has assigned TBA as a dedicated MAC address to be used as the
   Destination MAC address of the inner Ethernet of VXLAN when carrying
   BFD control packets.

10.  Security Considerations

   This document requires setting the inner IP TTL to 1, which could
   lead to a DDoS attack.  Thus the implementation MUST have throttling
   in place.  Throttling MAY be relaxed for BFD packets based on port
   number.

   Other than the inner IP TTL being set to 1, this specification does
   not raise any additional security issues beyond those of the
   specifications referred to in the list of normative references.

11.  Contributors

   Reshad Rahman
   rrahman@cisco.com
   Cisco

12.  Acknowledgments

   The authors would like to thank Jeff Haas of Juniper Networks for his
   reviews and feedback on this material.

   The authors would also like to thank Nobo Akiya, Marc Binderberger,
   Shahram Davari, and Donald E. Eastlake 3rd for their extensive
   reviews and their detailed and helpful comments.

13.  Normative References

   [I-D.ietf-bfd-multipoint]
              Katz, D., Ward, D., Networks, J., and G. Mirsky, "BFD for
              Multipoint Networks", draft-ietf-bfd-multipoint-18 (work
              in progress), June 2018.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC5880]  Katz, D. and D. Ward, "Bidirectional Forwarding Detection
              (BFD)", RFC 5880, DOI 10.17487/RFC5880, June 2010.

   [RFC5881]  Katz, D. and D. Ward, "Bidirectional Forwarding Detection
              (BFD) for IPv4 and IPv6 (Single Hop)", RFC 5881,
              DOI 10.17487/RFC5881, June 2010.

   [RFC7348]  Mahalingam, M., Dutt, D., Duda, K., Agarwal, P., Kreeger,
              L., Sridhar, T., Bursell, M., and C. Wright, "Virtual
              eXtensible Local Area Network (VXLAN): A Framework for
              Overlaying Virtualized Layer 2 Networks over Layer 3
              Networks", RFC 7348, DOI 10.17487/RFC7348, August 2014.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017.

Authors' Addresses

   Santosh Pallagatti (editor)
   Rtbrick

   Email: santosh.pallagatti@gmail.com


   Sudarsan Paragiri
   Juniper Networks
   1194 N. Mathilda Ave.
   Sunnyvale, California  94089-1206
   USA

   Email: sparagiri@juniper.net


   Vengada Prasad Govindan
   Cisco

   Email: venggovi@cisco.com


   Mallik Mudigonda
   Cisco

   Email: mmudigon@cisco.com


   Greg Mirsky
   ZTE Corp.

   Email: gregimirsky@gmail.com
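
   As an informal illustration only, and not part of this
   specification, the demultiplexing rules of Section 6.1 might be
   sketched in Python as follows.  The names BfdSession, BfdDemux,
   register, and lookup are hypothetical.  Keying packets that carry
   Your Discriminator 0 on the tuple (VNI, inner source IP, inner
   destination IP) is an assumption about how an implementation could
   apply the SHOULD on using the VNI; other designs are possible.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class BfdSession:
    """Hypothetical per-session state kept by a VTEP."""
    local_discr: int   # nonzero discriminator chosen by this VTEP
    vni: int           # VNI the session is bound to
    local_ip: str      # inner IP address of this VTEP
    remote_ip: str     # inner IP address of the peer VTEP


@dataclass
class BfdDemux:
    """Hypothetical session table with the two lookup keys of Section 6.1."""
    by_discr: Dict[int, BfdSession] = field(default_factory=dict)
    by_addr: Dict[Tuple[int, str, str], BfdSession] = field(default_factory=dict)

    def register(self, session: BfdSession) -> None:
        self.by_discr[session.local_discr] = session
        # Received packets carry the peer as source and this VTEP as
        # destination, so the address key is stored in that order.
        key = (session.vni, session.remote_ip, session.local_ip)
        self.by_addr[key] = session

    def lookup(self, your_discr: int, vni: int,
               inner_src_ip: str, inner_dst_ip: str) -> Optional[BfdSession]:
        if your_discr != 0:
            # Non-zero Your Discriminator: it is the only key used.
            return self.by_discr.get(your_discr)
        # Your Discriminator is 0: fall back to the inner IP addresses,
        # with the VNI supplying interface-related context (assumption).
        return self.by_addr.get((vni, inner_src_ip, inner_dst_ip))
```

   In this sketch, a VTEP would call register() when it creates a
   session and lookup() for each received BFD control packet that has
   passed the checks in Section 6.  A return value of None means that
   no session matched; how such a packet is then handled is left to
   the implementation.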