idnits 2.17.1

draft-ietf-bfd-vxlan-04.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (November 23, 2018) is 1980 days in the past.  Is this
     intentional?

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Outdated reference: A later version (-19) exists of
     draft-ietf-bfd-multipoint-18

  ** Downref: Normative reference to an Informational RFC: RFC 7348

     Summary: 1 error (**), 0 flaws (~~), 2 warnings (==), 1 comment (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

BFD                                                   S. Pallagatti, Ed.
Internet-Draft                                                   Rtbrick
Intended status: Standards Track                             S. Paragiri
Expires: May 27, 2019                                   Juniper Networks
                                                             V. Govindan
                                                            M. Mudigonda
                                                                   Cisco
                                                               G. Mirsky
                                                               ZTE Corp.
                                                       November 23, 2018

                             BFD for VXLAN
                        draft-ietf-bfd-vxlan-04

Abstract

   This document describes the use of the Bidirectional Forwarding
   Detection (BFD) protocol in Virtual eXtensible Local Area Network
   (VXLAN) overlay networks.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on May 27, 2019.

Copyright Notice

   Copyright (c) 2018 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Conventions used in this document
     2.1.  Terminology
     2.2.  Requirements Language
   3.  Use cases
   4.  Deployment
   5.  BFD Packet Transmission over VXLAN Tunnel
     5.1.  BFD Packet Encapsulation in VXLAN
   6.
       Reception of BFD packet from VXLAN Tunnel
     6.1.  Demultiplexing of the BFD packet
   7.  Use of reserved VNI
   8.  Echo BFD
   9.  IANA Considerations
   10. Security Considerations
   11. Contributors
   12. Acknowledgments
   13. References
     13.1.  Normative References
     13.2.  Informational References
   Authors' Addresses

1.  Introduction

   "Virtual eXtensible Local Area Network" (VXLAN) [RFC7348] provides
   an encapsulation scheme that allows building an overlay network by
   decoupling the address space of the attached virtual hosts from that
   of the network.

   VXLAN is typically deployed in data centers interconnecting
   virtualized hosts of a tenant.  VXLAN addresses requirements of the
   Layer 2 and Layer 3 data center network infrastructure in the
   presence of VMs in a multi-tenant environment, discussed in Section 3
   of [RFC7348], by providing a Layer 2 overlay scheme on a Layer 3
   network.

   In the absence of a router in the overlay, a VM can communicate with
   another VM only if they are on the same VXLAN segment.  VMs are
   unaware of VXLAN tunnels because a VXLAN tunnel is terminated on a
   VXLAN Tunnel End Point (VTEP) (hypervisor/TOR).  VTEPs are
   responsible for encapsulating and decapsulating frames exchanged
   among VMs.

   The ability to monitor path continuity, i.e., to perform proactive
   continuity check (CC) for these tunnels, is important.
   The asynchronous mode of BFD, as defined in [RFC5880], can be used
   to monitor a VXLAN tunnel.  Use of [I-D.ietf-bfd-multipoint] is for
   future study.

   Also, BFD in VXLAN can be used to monitor the particular service
   nodes that are designated to properly handle Layer 2 broadcast,
   unknown unicast, and multicast (BUM) traffic.  Such nodes, discussed
   in detail in [RFC8293], are often referred to as "replicators".
   They are usually virtual VTEPs and can be monitored by physical
   VTEPs to minimize BUM traffic directed to an unavailable
   replicator.

   This document describes the use of the Bidirectional Forwarding
   Detection (BFD) protocol in VXLAN to enable monitoring of the
   continuity of the path between Network Virtualization Edges (NVEs)
   and/or the availability of a replicator service node.

   In this document, the terms NVE and VTEP are used interchangeably.

2.  Conventions used in this document

2.1.  Terminology

   BFD - Bidirectional Forwarding Detection

   CC - Continuity Check

   NVE - Network Virtualization Edge

   TOR - Top of Rack

   VM - Virtual Machine

   VTEP - VXLAN Tunnel End Point

   VXLAN - Virtual eXtensible Local Area Network

2.2.  Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in BCP
   14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

3.  Use cases

   The primary use case of BFD for VXLAN is the continuity check of a
   tunnel.  By exchanging BFD control packets between VTEPs, an operator
   exercises the VXLAN path in both the underlay and the overlay, thus
   verifying VXLAN path availability and VTEP reachability.  BFD
   failure detection can be used for maintenance.
   There are other use cases such as the following:

   Layer 2 VMs:

      Deployments might have VMs with only L2 capabilities that do not
      have an IP address assigned or, in other cases, VMs that are
      assigned an IP address but are restricted to communicating only
      within their subnet.  BFD, being an L3 protocol, can be used as a
      tunnel CC mechanism, where BFD will start and terminate at the
      NVEs, e.g., VTEPs.

      It is possible to aggregate the CC sessions for multiple
      tenants by running a BFD session between the VTEPs over a VXLAN
      tunnel.

   Fault localization:

      It is also possible that VMs are L3 aware and can host a BFD
      session.  In these cases, BFD sessions can be established among
      VMs for CC.  Also, BFD sessions can be created among VTEPs for
      tunnel CC.  Having a hierarchical OAM model helps localize
      faults, though it requires additional consideration of, for
      example, coordination of BFD intervals across the OAM layers.

   Service node reachability:

      The service node is responsible for sending BUM traffic.  A
      service node tunnel may terminate at a VTEP that does not host
      any VMs.  A BFD session between the TOR/hypervisor and the
      service node can be used to monitor service node reachability.

4.  Deployment

   Figure 1 illustrates the scenario with two servers, each of them
   hosting two VMs.  The servers host VTEPs that terminate two VXLAN
   tunnels with VNI numbers 100 and 200, respectively.  Separate BFD
   sessions can be established between the VTEPs (IP1 and IP2) for
   monitoring each of the VXLAN tunnels (VNI 100 and 200).  The
   implementation SHOULD have a reasonable upper bound on the number of
   BFD sessions that can be created between the same pair of VTEPs.  No
   BFD packets intended for a Hypervisor VTEP should be forwarded to a
   VM, as a VM may drop BFD packets, leading to a false negative.  This
   method is applicable whether the VTEP is a virtual or physical
   device.
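
   The upper bound on per-pair sessions recommended above could be
   enforced along the following lines.  This is only a minimal sketch
   and not part of the specification: the class name, the method names,
   and the default limit of 8 are hypothetical choices made for
   illustration.

```python
from collections import defaultdict


class BfdSessionTable:
    """Tracks per-VNI BFD sessions between VTEP pairs, with an upper bound.

    Hypothetical helper: the draft only says the bound SHOULD be
    "reasonable"; the default of 8 is an arbitrary illustrative value.
    """

    def __init__(self, max_sessions_per_pair: int = 8):
        self.max_sessions_per_pair = max_sessions_per_pair
        # Maps (local VTEP IP, remote VTEP IP) -> set of monitored VNIs.
        self._sessions = defaultdict(set)

    def create(self, local_ip: str, remote_ip: str, vni: int) -> bool:
        """Return True if the session exists or was created, False if refused."""
        vnis = self._sessions[(local_ip, remote_ip)]
        if vni in vnis:
            return True  # Session for this VNI is already present.
        if len(vnis) >= self.max_sessions_per_pair:
            return False  # Bound reached: refuse to create another session.
        vnis.add(vni)
        return True

    def count(self, local_ip: str, remote_ip: str) -> int:
        """Number of sessions currently held for this VTEP pair."""
        return len(self._sessions.get((local_ip, remote_ip), ()))
```

   With a limit of 2, sessions for VNI 100 and VNI 200 between IP1 and
   IP2 would be accepted and a third request would be refused.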

      +------------+-------------+
      |        Server 1          |
      |                          |
      | +----+----+  +----+----+ |
      | |VM1-1    |  |VM1-2    | |
      | |VNI 100  |  |VNI 200  | |
      | |         |  |         | |
      | +---------+  +---------+ |
      | Hypervisor VTEP (IP1)    |
      +--------------------------+
                            |
                            |
                            |
                            |   +-------------+
                            |   |   Layer 3   |
                            |---|   Network   |
                                |             |
                                +-------------+
                                    |
                                    |
                                    +-----------+
                                                |
                                                |
                                  +------------+-------------+
                                  |    Hypervisor VTEP (IP2) |
                                  | +----+----+  +----+----+ |
                                  | |VM2-1    |  |VM2-2    | |
                                  | |VNI 100  |  |VNI 200  | |
                                  | |         |  |         | |
                                  | +---------+  +---------+ |
                                  |      Server 2            |
                                  +--------------------------+

                    Figure 1: Reference VXLAN domain

5.  BFD Packet Transmission over VXLAN Tunnel

   BFD packets MUST be encapsulated and sent to a remote VTEP as
   explained in Section 5.1.  Implementations SHOULD ensure that the BFD
   packets follow the same lookup path as VXLAN data packets within the
   sender system.

5.1.  BFD Packet Encapsulation in VXLAN

   BFD packets are encapsulated in VXLAN as described below.  The VXLAN
   packet format is defined in Section 5 of [RFC7348].  The Outer IP/UDP
   and VXLAN headers MUST be encoded by the sender as defined in
   [RFC7348].

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   ~                     Outer Ethernet Header                     ~
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   ~                       Outer IPvX Header                       ~
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   ~                       Outer UDP Header                        ~
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   ~                         VXLAN Header                          ~
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   ~                     Inner Ethernet Header                     ~
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   ~                       Inner IPvX Header                       ~
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   ~                       Inner UDP Header                        ~
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   ~                      BFD Control Message                      ~
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                              FCS                              |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

          Figure 2: VXLAN Encapsulation of BFD Control Message

   The BFD packet MUST be carried inside the inner MAC frame of the
   VXLAN packet.  The inner MAC frame carrying the BFD payload has the
   following format:

   Ethernet Header:

      Destination MAC: This MUST be the dedicated MAC TBA (Section 9)
      or the MAC address of the destination VTEP.  The details of how
      the MAC address of the destination VTEP is obtained are outside
      the scope of this document.

      Source MAC: MAC address of the originating VTEP.

   IP header:

      Source IP: IP address of the originating VTEP.

      Destination IP: IP address of the terminating VTEP.

      TTL: MUST be set to 1 to ensure that the BFD packet is not
      routed within the L3 underlay network.
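
   The inner Ethernet and IP header fields listed above, together with
   the VXLAN header of [RFC7348], can be sketched as follows.  This is a
   minimal illustration under stated assumptions, not a reference
   implementation: it covers IPv4 only, the function names are
   hypothetical, the destination MAC is supplied by the caller because
   the dedicated MAC is still TBA, and the inner UDP datagram that
   carries the BFD control packet is treated as opaque bytes.

```python
import socket
import struct


def vxlan_header(vni: int) -> bytes:
    """8-byte VXLAN header per RFC 7348: I flag set, 24-bit VNI, rest reserved."""
    if not 0 <= vni < 2 ** 24:
        raise ValueError("VNI must fit in 24 bits")
    return struct.pack("!II", 0x08 << 24, vni << 8)


def ipv4_checksum(header: bytes) -> int:
    """Standard Internet checksum over an even-length IPv4 header."""
    total = sum(struct.unpack("!%dH" % (len(header) // 2), header))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def inner_frame(dst_mac: bytes, src_mac: bytes,
                src_ip: str, dst_ip: str, udp_datagram: bytes) -> bytes:
    """Inner Ethernet + IPv4 headers in front of an opaque UDP datagram.

    dst_mac is the dedicated MAC or the destination VTEP's MAC; src_mac,
    src_ip and dst_ip identify the originating and terminating VTEPs.
    """
    if len(dst_mac) != 6 or len(src_mac) != 6:
        raise ValueError("MAC addresses must be 6 bytes")
    eth = dst_mac + src_mac + struct.pack("!H", 0x0800)  # EtherType IPv4
    ip = struct.pack(
        "!BBHHHBBH4s4s",
        0x45,                    # version 4, header length 5 words
        0,                       # DSCP/ECN
        20 + len(udp_datagram),  # total length
        0, 0,                    # identification, flags/fragment offset
        1,                       # TTL MUST be 1
        17,                      # protocol: UDP
        0,                       # checksum placeholder
        socket.inet_aton(src_ip),
        socket.inet_aton(dst_ip),
    )
    ip = ip[:10] + struct.pack("!H", ipv4_checksum(ip)) + ip[12:]
    return eth + ip + udp_datagram
```

   The VXLAN payload would then be vxlan_header(vni) followed by the
   result of inner_frame(); the outer Ethernet, IP, and UDP headers are
   normally added by the sending VTEP's regular VXLAN data path.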

   The fields of the UDP header and the BFD control packet are
   encoded as specified in [RFC5881] for p2p VXLAN tunnels.

6.  Reception of BFD packet from VXLAN Tunnel

   Once a packet is received, the VTEP MUST validate the packet as
   described in Section 4.1 of [RFC7348].  If the Destination MAC of the
   inner MAC frame matches the dedicated MAC or the MAC address of the
   VTEP, the packet MUST be processed further.

   The UDP destination port and the TTL of the inner IP packet MUST be
   validated to determine if the received packet can be processed by
   BFD.  BFD packets with the inner Destination MAC set to the MAC
   address of the VTEP or to the dedicated MAC address MUST NOT be
   forwarded to VMs.

   To ensure BFD detects the proper configuration of the VXLAN Network
   Identifier (VNI) in a remote VTEP, a lookup SHOULD be performed with
   the MAC-DA and VNI as the key in the Virtual Forwarding Instance
   (VFI) table of the originating/terminating VTEP to exercise the VFI
   associated with the VNI.

6.1.  Demultiplexing of the BFD packet

   Demultiplexing of IP BFD packets is defined in Section 3 of
   [RFC5881].  Since multiple BFD sessions may be running between two
   VTEPs, there needs to be a mechanism for demultiplexing received BFD
   packets to the proper session.  The procedure for demultiplexing
   packets with Your Discriminator equal to 0 is different from
   [RFC5880].  For such packets, the BFD session MUST be identified
   using the inner headers, i.e., the source IP, the destination IP, and
   the source UDP port number present in the IP packet carried in the
   payload of the VXLAN-encapsulated packet.  The VNI of the packet
   SHOULD be used to derive interface-related information for
   demultiplexing the packet.  If a BFD packet is received with a
   non-zero Your Discriminator, then the BFD session MUST be
   demultiplexed only with Your Discriminator as the key.

7.
    Use of reserved VNI

   In most cases, a single BFD session is sufficient for a given VTEP
   to monitor the reachability of a remote VTEP, regardless of the
   number of VNIs in common.  When a single BFD session is used to
   monitor the reachability of the remote VTEP, an implementation SHOULD
   use a VNI of 0.

8.  Echo BFD

   Support for echo BFD is outside the scope of this document.

9.  IANA Considerations

   IANA has assigned TBA as a dedicated MAC address from the IANA 48-bit
   unicast MAC address registry to be used as the Destination MAC
   address of the inner Ethernet of VXLAN when carrying BFD control
   packets.

10.  Security Considerations

   The document requires setting the inner IP TTL to 1, which could be
   used as a DDoS attack vector.  Thus the implementation MUST have
   throttling in place to control the rate of BFD control packets sent
   to the control plane.  Throttling MAY be relaxed for BFD packets
   based on port number.

   The implementation SHOULD have a reasonable upper bound on the number
   of BFD sessions that can be created between the same pair of VTEPs.

   Other than the requirements to set the inner IP TTL to 1 and to limit
   the number of BFD sessions between the same pair of VTEPs, this
   specification does not raise any additional security issues beyond
   those of the specifications referred to in the list of normative
   references.

11.  Contributors

   Reshad Rahman
   rrahman@cisco.com
   Cisco

12.  Acknowledgments

   Authors would like to thank Jeff Haas of Juniper Networks for his
   reviews and feedback on this material.

   Authors would also like to thank Nobo Akiya, Marc Binderberger,
   Shahram Davari, Donald E. Eastlake 3rd, and Anoop Ghanwani for the
   extensive reviews and the most detailed and helpful comments.

13.  References

13.1.  Normative References

   [I-D.ietf-bfd-multipoint]
              Katz, D., Ward, D., Networks, J., and G.
              Mirsky, "BFD for Multipoint Networks",
              draft-ietf-bfd-multipoint-18 (work in progress),
              June 2018.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC5880]  Katz, D. and D. Ward, "Bidirectional Forwarding Detection
              (BFD)", RFC 5880, DOI 10.17487/RFC5880, June 2010.

   [RFC5881]  Katz, D. and D. Ward, "Bidirectional Forwarding Detection
              (BFD) for IPv4 and IPv6 (Single Hop)", RFC 5881,
              DOI 10.17487/RFC5881, June 2010.

   [RFC7348]  Mahalingam, M., Dutt, D., Duda, K., Agarwal, P., Kreeger,
              L., Sridhar, T., Bursell, M., and C. Wright, "Virtual
              eXtensible Local Area Network (VXLAN): A Framework for
              Overlaying Virtualized Layer 2 Networks over Layer 3
              Networks", RFC 7348, DOI 10.17487/RFC7348, August 2014.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017.

13.2.  Informational References

   [RFC8293]  Ghanwani, A., Dunbar, L., McBride, M., Bannai, V., and R.
              Krishnan, "A Framework for Multicast in Network
              Virtualization over Layer 3", RFC 8293,
              DOI 10.17487/RFC8293, January 2018.

Authors' Addresses

   Santosh Pallagatti (editor)
   Rtbrick

   Email: santosh.pallagatti@gmail.com


   Sudarsan Paragiri
   Juniper Networks
   1194 N. Mathilda Ave.
   Sunnyvale, California  94089-1206
   USA

   Email: sparagiri@juniper.net


   Vengada Prasad Govindan
   Cisco

   Email: venggovi@cisco.com


   Mallik Mudigonda
   Cisco

   Email: mmudigon@cisco.com


   Greg Mirsky
   ZTE Corp.

   Email: gregimirsky@gmail.com