idnits 2.17.1

draft-ietf-bfd-vxlan-06.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (December 26, 2018) is 1941 days in the past.  Is this
     intentional?

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  ** Downref: Normative reference to an Informational RFC: RFC 7348

     Summary: 1 error (**), 0 flaws (~~), 1 warning (==), 1 comment (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

BFD                                                   S. Pallagatti, Ed.
Internet-Draft                                                   Rtbrick
Intended status: Standards Track                             S. Paragiri
Expires: June 29, 2019                                  Juniper Networks
                                                             V. Govindan
                                                            M. Mudigonda
                                                                   Cisco
                                                               G. Mirsky
                                                               ZTE Corp.
                                                       December 26, 2018

                             BFD for VXLAN
                        draft-ietf-bfd-vxlan-06

Abstract

   This document describes the use of the Bidirectional Forwarding
   Detection (BFD) protocol in Virtual eXtensible Local Area Network
   (VXLAN) overlay networks.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on June 29, 2019.

Copyright Notice

   Copyright (c) 2018 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Conventions used in this document
     2.1.  Terminology
     2.2.  Requirements Language
   3.  Use cases
   4.  Deployment
   5.  BFD Packet Transmission over VXLAN Tunnel
     5.1.  BFD Packet Encapsulation in VXLAN
   6.  Reception of BFD packet from VXLAN Tunnel
     6.1.  Demultiplexing of the BFD packet
   7.  Use of reserved VNI
   8.  Echo BFD
   9.  IANA Considerations
   10. Security Considerations
   11. Contributors
   12. Acknowledgments
   13. References
     13.1.  Normative References
     13.2.  Informational References
   Authors' Addresses

1.  Introduction

   "Virtual eXtensible Local Area Network" (VXLAN) [RFC7348] provides
   an encapsulation scheme that allows building an overlay network by
   decoupling the address space of the attached virtual hosts from that
   of the network.

   One use of VXLAN is in data centers interconnecting VMs of a tenant.
   VXLAN addresses the requirements of the Layer 2 and Layer 3 data
   center network infrastructure in the presence of VMs in a
   multi-tenant environment, discussed in Section 3 of [RFC7348], by
   providing a Layer 2 overlay scheme on a Layer 3 network.  Another use
   is as an encapsulation for Ethernet VPN [RFC8365].

   This document is written assuming the use of VXLAN for virtualized
   hosts and refers to VMs and VTEPs in hypervisors.  However, the
   concepts are equally applicable to non-virtualized hosts attached to
   VTEPs in switches.

   In the absence of a router in the overlay, a VM can communicate with
   another VM only if they are on the same VXLAN segment.  VMs are
   unaware of VXLAN tunnels, as a VXLAN tunnel is terminated on a VXLAN
   Tunnel End Point (VTEP) (hypervisor/TOR).  VTEPs (hypervisor/TOR) are
   responsible for encapsulating and decapsulating frames exchanged
   among VMs.

   The ability to monitor path continuity, i.e., to perform a proactive
   continuity check (CC) for these tunnels, is important.  The
   asynchronous mode of BFD, as defined in [RFC5880], can be used to
   monitor a VXLAN tunnel.  Use of [I-D.ietf-bfd-multipoint] is for
   future study.

   Also, BFD in VXLAN can be used to monitor the particular service
   nodes that are designated to properly handle Layer 2 broadcast,
   unknown unicast, and multicast (BUM) traffic.  Such nodes, discussed
   in detail in [RFC8293] and often referred to as "replicators", are
   usually virtual VTEPs and can be monitored by physical VTEPs to
   minimize BUM traffic directed to an unavailable replicator.

   This document describes the use of the Bidirectional Forwarding
   Detection (BFD) protocol in VXLAN to enable monitoring of the
   continuity of the path between Network Virtualization Edges (NVEs)
   and/or the availability of a replicator service node.

   In this document, the terms NVE and VTEP are used interchangeably.

2.  Conventions used in this document

2.1.  Terminology

   BFD - Bidirectional Forwarding Detection

   CC - Continuity Check

   NVE - Network Virtualization Edge

   TOR - Top of Rack

   VM - Virtual Machine

   VTEP - VXLAN Tunnel End Point

   VXLAN - Virtual eXtensible Local Area Network

2.2.  Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in BCP
   14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

3.  Use cases

   The primary use case of BFD for VXLAN is the continuity check of a
   tunnel.  By exchanging BFD control packets between VTEPs, an operator
   exercises the VXLAN path in both the underlay and the overlay, thus
   ensuring VXLAN path availability and VTEP reachability.  BFD failure
   detection can be used for maintenance.  There are other use cases,
   such as the following:

   Layer 2 VMs:

      Deployments might have VMs with only L2 capabilities that do not
      have an IP address assigned or, in other cases, VMs that are
      assigned an IP address but are restricted to communicate only
      within their subnet.  BFD, being an L3 protocol, can be used as a
      tunnel CC mechanism, where BFD will start and terminate at the
      NVEs, e.g., VTEPs.

      It is possible to aggregate the CC sessions for multiple tenants
      by running a BFD session between the VTEPs over a VXLAN tunnel.

   Fault localization:

      It is also possible that VMs are L3 aware and can host a BFD
      session.  In these cases, BFD sessions can be established among
      VMs for CC.  Also, BFD sessions can be created among VTEPs for
      tunnel CC.  Having a hierarchical OAM model helps localize faults,
      though it requires additional consideration of, for example,
      coordination of BFD intervals across the OAM layers.

   Service node reachability:

      The service node is responsible for sending BUM traffic.  A
      service node tunnel may terminate at a VTEP that does not host
      any VM.  A BFD session between the TOR/hypervisor and the service
      node can be used to monitor service node reachability.

4.  Deployment

   Figure 1 illustrates the scenario with two servers, each of them
   hosting two VMs.  The servers host VTEPs that terminate two VXLAN
   tunnels with VNIs 100 and 200, respectively.  Separate BFD sessions
   can be established between the VTEPs (IP1 and IP2) for monitoring
   each of the VXLAN tunnels (VNI 100 and 200).  The implementation
   SHOULD have a reasonable upper bound on the number of BFD sessions
   that can be created between the same pair of VTEPs.  No BFD packets
   intended for a Hypervisor VTEP should be forwarded to a VM, as a VM
   may drop BFD packets, leading to a false negative.  This method is
   applicable whether the VTEP is a virtual or physical device.

      +------------+-------------+
      |        Server 1          |
      |                          |
      | +----+----+  +----+----+ |
      | |VM1-1    |  |VM1-2    | |
      | |VNI 100  |  |VNI 200  | |
      | |         |  |         | |
      | +---------+  +---------+ |
      | Hypervisor VTEP (IP1)    |
      +--------------------------+
                   |
                   |
                   |
                   |   +-------------+
                   |   |   Layer 3   |
                   |---|   Network   |
                       |             |
                       +-------------+
                              |
                              |
                              +-----------+
                                          |
                                          |
                             +------------+-------------+
                             |    Hypervisor VTEP (IP2) |
                             | +----+----+  +----+----+ |
                             | |VM2-1    |  |VM2-2    | |
                             | |VNI 100  |  |VNI 200  | |
                             | |         |  |         | |
                             | +---------+  +---------+ |
                             |      Server 2            |
                             +--------------------------+

                     Figure 1: Reference VXLAN domain

5.  BFD Packet Transmission over VXLAN Tunnel

   A BFD packet MUST be encapsulated and sent to a remote VTEP as
   explained in Section 5.1.  Implementations SHOULD ensure that the BFD
   packets follow the same lookup path as VXLAN data packets within the
   sender system.

5.1.  BFD Packet Encapsulation in VXLAN

   BFD packets are encapsulated in VXLAN as described below.  The VXLAN
   packet format is defined in Section 5 of [RFC7348].  The Outer IP/UDP
   and VXLAN headers MUST be encoded by the sender as defined in
   [RFC7348].
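
   As a non-normative illustration of the VXLAN header mentioned above,
   the following Python sketch packs and parses the 8-octet header
   layout of [RFC7348]: a flags octet with the I bit set, 24 reserved
   bits, the 24-bit VNI, and 8 reserved bits.  The function and
   constant names are hypothetical; they are not defined by this
   document or by [RFC7348].

```python
import struct

# Values taken from RFC 7348; the Python names themselves are illustrative.
VXLAN_UDP_PORT = 4789   # IANA-assigned outer UDP destination port for VXLAN
VXLAN_FLAG_I = 0x08     # "I" flag: the VNI field carries a valid value


def encode_vxlan_header(vni: int) -> bytes:
    """Return the 8-octet VXLAN header for a 24-bit VNI."""
    if not 0 <= vni < (1 << 24):
        raise ValueError("VNI must fit in 24 bits")
    # Word 1: flags (8 bits) followed by 24 reserved bits set to zero.
    # Word 2: VNI (24 bits) followed by 8 reserved bits set to zero.
    return struct.pack("!II", VXLAN_FLAG_I << 24, vni << 8)


def decode_vni(header: bytes) -> int:
    """Extract the VNI from a VXLAN header, requiring the I flag."""
    word1, word2 = struct.unpack("!II", header[:8])
    if not (word1 >> 24) & VXLAN_FLAG_I:
        raise ValueError("I flag not set; VNI is not valid")
    return word2 >> 8
```

   For example, encode_vxlan_header(100) yields the header for the
   VNI 100 tunnel of Figure 1, and decode_vni() recovers the VNI on the
   receiving side.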

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   ~                     Outer Ethernet Header                     ~
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   ~                       Outer IPvX Header                       ~
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   ~                       Outer UDP Header                        ~
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   ~                         VXLAN Header                          ~
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   ~                     Inner Ethernet Header                     ~
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   ~                       Inner IPvX Header                       ~
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   ~                       Inner UDP Header                        ~
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   ~                      BFD Control Message                      ~
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                              FCS                              |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

          Figure 2: VXLAN Encapsulation of BFD Control Message

   The BFD packet MUST be carried inside the inner MAC frame of the
   VXLAN packet.  The inner MAC frame carrying the BFD payload has the
   following format:

      Ethernet Header:

         Destination MAC: This MUST be the dedicated MAC TBA (Section 9)
         or the MAC address of the destination VTEP.  The details of how
         the MAC address of the destination VTEP is obtained are outside
         the scope of this document.

         Source MAC: MAC address of the originating VTEP.

      IP header:

         Source IP: IP address of the originating VTEP.

         Destination IP: IP address of the terminating VTEP.

         TTL: MUST be set to 1 to ensure that the BFD packet is not
         routed within the L3 underlay network.

      The fields of the UDP header and the BFD control packet are
      encoded as specified in [RFC5881] for point-to-point VXLAN
      tunnels.

6.  Reception of BFD packet from VXLAN Tunnel

   Once a packet is received, the VTEP MUST validate the packet as
   described in Section 4.1 of [RFC7348].  If the Destination MAC of
   the inner MAC frame matches the dedicated MAC or the MAC address of
   the VTEP, the packet MUST be processed further.

   The UDP destination port and the TTL of the inner IP packet MUST be
   validated to determine if the received packet can be processed by
   BFD.  A BFD packet whose inner Destination MAC is set to the MAC
   address of the VTEP or to the dedicated MAC address MUST NOT be
   forwarded to VMs.

   To ensure BFD detects the proper configuration of the VXLAN Network
   Identifier (VNI) in a remote VTEP, a lookup SHOULD be performed with
   the MAC-DA and VNI as key in the Virtual Forwarding Instance (VFI)
   table of the originating/terminating VTEP to exercise the VFI
   associated with the VNI.

6.1.  Demultiplexing of the BFD packet

   Demultiplexing of IP BFD packets is defined in Section 3 of
   [RFC5881].  Since multiple BFD sessions may be running between two
   VTEPs, there needs to be a mechanism for demultiplexing received BFD
   packets to the proper session.  The procedure for demultiplexing
   packets with Your Discriminator equal to 0 is different from
   [RFC5880].  For such packets, the BFD session MUST be identified
   using the inner headers, i.e., the source IP, the destination IP, and
   the source UDP port number present in the IP header carried by the
   payload of the VXLAN encapsulated packet.  The VNI of the packet
   SHOULD be used to derive interface-related information for
   demultiplexing the packet.  If a BFD packet is received with a
   non-zero Your Discriminator, then the BFD session MUST be
   demultiplexed using only Your Discriminator as the key.

7.  Use of reserved VNI

   In most cases, a single BFD session is sufficient for the given VTEP
   to monitor the reachability of a remote VTEP, regardless of the
   number of VNIs in common.  When a single BFD session is used to
   monitor reachability of the remote VTEP, an implementation SHOULD use
   a VNI of 0.

8.  Echo BFD

   Support for echo BFD is outside the scope of this document.

9.  IANA Considerations

   IANA has assigned TBA as a dedicated MAC address from the IANA 48-bit
   unicast MAC address registry to be used as the Destination MAC
   address of the inner Ethernet of VXLAN when carrying BFD control
   packets.

10.  Security Considerations

   The document requires setting the inner IP TTL to 1, which could be
   used as a DDoS attack vector.  Thus, the implementation MUST have
   throttling in place to control the rate of BFD control packets sent
   to the control plane.  Throttling MAY be relaxed for BFD packets
   based on port number.

   The implementation SHOULD have a reasonable upper bound on the number
   of BFD sessions that can be created between the same pair of VTEPs.

   Other than setting the inner IP TTL to 1 and limiting the number of
   BFD sessions between the same pair of VTEPs, this specification does
   not raise any additional security issues beyond those of the
   specifications referred to in the list of normative references.

11.  Contributors

   Reshad Rahman
   rrahman@cisco.com
   Cisco

12.  Acknowledgments

   The authors would like to thank Jeff Haas of Juniper Networks for his
   reviews and feedback on this material.

   The authors would also like to thank Nobo Akiya, Marc Binderberger,
   Shahram Davari, Donald E. Eastlake 3rd, and Anoop Ghanwani for the
   extensive reviews and the most detailed and helpful comments.

13.  References

13.1.  Normative References

   [I-D.ietf-bfd-multipoint]
              Katz, D., Ward, D., Networks, J., and G. Mirsky, "BFD for
              Multipoint Networks", draft-ietf-bfd-multipoint-19 (work
              in progress), December 2018.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC5880]  Katz, D. and D. Ward, "Bidirectional Forwarding Detection
              (BFD)", RFC 5880, DOI 10.17487/RFC5880, June 2010.

   [RFC5881]  Katz, D. and D. Ward, "Bidirectional Forwarding Detection
              (BFD) for IPv4 and IPv6 (Single Hop)", RFC 5881,
              DOI 10.17487/RFC5881, June 2010.

   [RFC7348]  Mahalingam, M., Dutt, D., Duda, K., Agarwal, P., Kreeger,
              L., Sridhar, T., Bursell, M., and C. Wright, "Virtual
              eXtensible Local Area Network (VXLAN): A Framework for
              Overlaying Virtualized Layer 2 Networks over Layer 3
              Networks", RFC 7348, DOI 10.17487/RFC7348, August 2014.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017.

13.2.  Informational References

   [RFC8293]  Ghanwani, A., Dunbar, L., McBride, M., Bannai, V., and R.
              Krishnan, "A Framework for Multicast in Network
              Virtualization over Layer 3", RFC 8293,
              DOI 10.17487/RFC8293, January 2018.

   [RFC8365]  Sajassi, A., Ed., Drake, J., Ed., Bitar, N., Shekhar, R.,
              Uttaro, J., and W. Henderickx, "A Network Virtualization
              Overlay Solution Using Ethernet VPN (EVPN)", RFC 8365,
              DOI 10.17487/RFC8365, March 2018.

Authors' Addresses

   Santosh Pallagatti (editor)
   Rtbrick

   Email: santosh.pallagatti@gmail.com


   Sudarsan Paragiri
   Juniper Networks
   1194 N. Mathilda Ave.
   Sunnyvale, California  94089-1206
   USA

   Email: sparagiri@juniper.net


   Vengada Prasad Govindan
   Cisco

   Email: venggovi@cisco.com


   Mallik Mudigonda
   Cisco

   Email: mmudigon@cisco.com


   Greg Mirsky
   ZTE Corp.

   Email: gregimirsky@gmail.com
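
   As a non-normative illustration of the demultiplexing rules in
   Section 6.1, the following Python sketch keeps two lookup tables:
   one keyed by Your Discriminator, used whenever that field is
   non-zero, and one keyed by the inner source IP, destination IP, and
   source UDP port, used when Your Discriminator is 0.  Including the
   VNI in the second key is an assumption of this sketch, standing in
   for the "interface-related information" that the VNI SHOULD be used
   to derive.  All class, field, and method names are hypothetical.

```python
from typing import Dict, NamedTuple, Optional, Tuple


class InnerPacket(NamedTuple):
    """Fields a receiver needs from a decapsulated BFD control packet."""
    vni: int                 # taken from the VXLAN header
    src_ip: str              # inner source IP (remote VTEP)
    dst_ip: str              # inner destination IP (local VTEP)
    src_port: int            # inner UDP source port
    your_discriminator: int  # BFD Your Discriminator field


# Assumption: the VNI is folded into the key as interface-related context.
BootstrapKey = Tuple[int, str, str, int]


class SessionTable:
    """Maps received BFD packets to session names (illustrative only)."""

    def __init__(self) -> None:
        self.by_discriminator: Dict[int, str] = {}
        self.by_inner_headers: Dict[BootstrapKey, str] = {}

    def add(self, name: str, local_discriminator: int, vni: int,
            src_ip: str, dst_ip: str, src_port: int) -> None:
        """Register a session under both lookup keys.

        src_ip, dst_ip, and src_port are given as they appear in packets
        received from the peer.
        """
        self.by_discriminator[local_discriminator] = name
        self.by_inner_headers[(vni, src_ip, dst_ip, src_port)] = name

    def demultiplex(self, pkt: InnerPacket) -> Optional[str]:
        """Return the matching session name, or None if there is none."""
        if pkt.your_discriminator != 0:
            # Non-zero Your Discriminator: it is the only key used.
            return self.by_discriminator.get(pkt.your_discriminator)
        # Your Discriminator is 0: fall back to the inner headers.
        key = (pkt.vni, pkt.src_ip, pkt.dst_ip, pkt.src_port)
        return self.by_inner_headers.get(key)
```

   A caller would register one entry per BFD session, for example one
   each for VNI 100 and VNI 200 between the same pair of VTEPs, and
   pass every validated inner BFD packet to demultiplex().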