Internet Engineering Task Force                       S. Pallagatti, Ed.
Internet-Draft                                                   Rtbrick
Intended status: Standards Track                             S. Paragiri
Expires: February 7, 2019                               Juniper Networks
                                                             V. Govindan
                                                            M. Mudigonda
                                                                   Cisco
                                                               G. Mirsky
                                                               ZTE Corp.
                                                          August 6, 2018


                             BFD for VXLAN
                        draft-ietf-bfd-vxlan-01

Abstract

   This document describes the use of the Bidirectional Forwarding
   Detection (BFD) protocol in Virtual eXtensible Local Area Network
   (VXLAN) overlay networks.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on February 7, 2019.

Copyright Notice

   Copyright (c) 2018 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Conventions used in this document
     2.1.  Terminology
     2.2.  Requirements Language
   3.  Use cases
   4.  Deployment
   5.  BFD Packet Transmission over VXLAN Tunnel
     5.1.  BFD Packet Encapsulation in VXLAN
   6.  Reception of BFD packet from VXLAN Tunnel
     6.1.  Demultiplexing of the BFD packet
   7.  Use of reserved VNI
   8.  Echo BFD
   9.  IANA Considerations
   10. Security Considerations
   11. Contributors
   12. Acknowledgments
   13. Normative References
   Authors' Addresses

1.  Introduction

   "Virtual eXtensible Local Area Network (VXLAN)" has been described in
   [RFC7348].  VXLAN provides an encapsulation scheme that allows
   virtual machines (VMs) to communicate in a data center network.

   VXLAN is typically deployed in data centers interconnecting
   virtualized hosts, which may be spread across multiple racks.  The
   individual racks may be part of a different Layer 3 network, or they
   could be in a single Layer 2 network.  The VXLAN segments/overlay
   networks are overlaid on top of these Layer 2 or Layer 3 networks.

   A VM can communicate with another VM only if they are on the same
   VXLAN.  VMs are unaware of VXLAN tunnels because a VXLAN tunnel is
   terminated on a VXLAN Tunnel End Point (VTEP) (hypervisor/TOR).
   VTEPs (hypervisor/TOR) are responsible for encapsulating and
   decapsulating frames exchanged among VMs.

   Since the underlay is an L3 network, the ability to monitor path
   continuity, i.e., to perform proactive continuity check (CC) for
   these tunnels, is important.  The asynchronous mode of BFD, as
   defined in [RFC5880], can be used to monitor a VXLAN tunnel.  Use of
   [I-D.ietf-bfd-multipoint] is for future study.

   Also, BFD in VXLAN can be used to monitor the particular service
   nodes that are designated to properly handle Layer 2 broadcast,
   unknown unicast, and multicast (BUM) traffic.  Such nodes, often
   referred to as "replicators", are usually virtual VTEPs and can be
   monitored by physical VTEPs to minimize BUM traffic directed to an
   unavailable replicator.

   This document describes the use of the Bidirectional Forwarding
   Detection (BFD) protocol in VXLAN to enable monitoring of continuity
   between Network Virtualization Edges (NVEs) and/or of the
   availability of a replicator service node.

2.  Conventions used in this document

2.1.  Terminology

   BFD - Bidirectional Forwarding Detection

   CC - Continuity Check

   NVE - Network Virtualization Edge

   TOR - Top of Rack

   VM - Virtual Machine

   VTEP - VXLAN Tunnel End Point

   VXLAN - Virtual eXtensible Local Area Network

2.2.  Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in BCP
   14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

3.  Use cases

   The primary use case of BFD for VXLAN is the continuity check of a
   tunnel.  By exchanging BFD control packets between VTEPs, an operator
   exercises the VXLAN path in both the underlay and the overlay, thus
   verifying VXLAN path availability and VTEP reachability.  BFD
   failure detection can be used for maintenance.  There are other use
   cases, such as:

   Layer 2 VMs:

      Most deployments will have VMs with only L2 capabilities that
      may not support L3.  BFD, being an L3 protocol, can be used as a
      tunnel CC mechanism, where BFD will start and terminate at the
      NVEs, e.g., VTEPs.

      It is possible to aggregate the CC sessions for multiple
      tenants by running a BFD session between the VTEPs over a VXLAN
      tunnel.  In the rest of this document, the terms NVE and VTEP are
      used interchangeably.

   Fault localization:

      It is also possible that VMs are L3 aware and can host a BFD
      session.  In these cases, BFD sessions can be established among
      VMs for CC.  In addition, BFD sessions can be established among
      VTEPs for tunnel CC.  Having a hierarchical OAM model helps
      localize faults, though it requires additional consideration.

   Service node reachability:

      The service node is responsible for sending BUM traffic.  A
      service node tunnel may terminate at a VTEP that does not host
      any VM.  A BFD session between the TOR/hypervisor and the service
      node can be used to monitor service node reachability.

4.  Deployment

   Figure 1 illustrates the scenario with two servers, each of them
   hosting two VMs.  The servers host VTEPs that terminate two VXLAN
   tunnels with VNI numbers 100 and 200, respectively.  Separate BFD
   sessions can be established between the VTEPs (IP1 and IP2) for
   monitoring each of the VXLAN tunnels (VNI 100 and 200).  No BFD
   packets intended for a Hypervisor VTEP should be forwarded to a VM,
   as the VM may drop BFD packets, leading to a false negative.  This
   method is applicable whether the VTEP is a virtual or a physical
   device.

      +------------+-------------+
      |        Server 1          |
      |                          |
      | +----+----+  +----+----+ |
      | |VM1-1    |  |VM1-2    | |
      | |VNI 100  |  |VNI 200  | |
      | |         |  |         | |
      | +---------+  +---------+ |
      | Hypervisor VTEP (IP1)    |
      +--------------------------+
                            |
                            |
                            |
                            |   +-------------+
                            |   |   Layer 3   |
                            |---|   Network   |
                                |             |
                                +-------------+
                                    |
                                    |
                                    +-----------+
                                                |
                                                |
                                   +------------+-------------+
                                   |    Hypervisor VTEP (IP2) |
                                   | +----+----+  +----+----+ |
                                   | |VM2-1    |  |VM2-2    | |
                                   | |VNI 100  |  |VNI 200  | |
                                   | |         |  |         | |
                                   | +---------+  +---------+ |
                                   |      Server 2            |
                                   +--------------------------+

                    Figure 1: Reference VXLAN domain

5.  BFD Packet Transmission over VXLAN Tunnel

   BFD packets MUST be encapsulated and sent to a remote VTEP as
   explained in Section 5.1.  Implementations SHOULD ensure that the BFD
   packets follow the same lookup path as VXLAN packets within the
   sender system.

5.1.  BFD Packet Encapsulation in VXLAN

   The VXLAN packet format has been described in Section 5 of [RFC7348].
   The Outer IP/UDP and VXLAN headers MUST be encoded by the sender as
   defined in [RFC7348].
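
   As an illustration of the VXLAN header encoding referenced above,
   the following is a minimal Python sketch, not part of this
   specification.  It assumes the 8-octet header layout of [RFC7348]
   (an 8-bit flags field with the I flag set, 24 reserved bits, a
   24-bit VNI, and 8 reserved bits).  The names pack_vxlan_header and
   unpack_vni are hypothetical and chosen only for this example.

```python
import struct

VXLAN_UDP_PORT = 4789  # IANA-assigned outer UDP destination port (RFC 7348)
VXLAN_FLAG_I = 0x08    # "I" flag: the VNI field is valid


def pack_vxlan_header(vni: int) -> bytes:
    """Build the 8-octet VXLAN header for a given VNI (hypothetical helper)."""
    if not 0 <= vni < (1 << 24):
        raise ValueError("VNI must fit in 24 bits")
    # Word 0: flags (8 bits) followed by 24 reserved bits set to zero.
    # Word 1: VNI (24 bits) followed by 8 reserved bits set to zero.
    return struct.pack("!II", VXLAN_FLAG_I << 24, vni << 8)


def unpack_vni(header: bytes) -> int:
    """Extract the VNI from a VXLAN header, checking the I flag."""
    if len(header) < 8:
        raise ValueError("VXLAN header is 8 octets long")
    word0, word1 = struct.unpack("!II", header[:8])
    if not (word0 >> 24) & VXLAN_FLAG_I:
        raise ValueError("I flag not set; VNI is not valid")
    return word1 >> 8


if __name__ == "__main__":
    hdr = pack_vxlan_header(100)
    print(hdr.hex(), unpack_vni(hdr))
```

   The same helper covers VNI 0, since only the range of the value is
   checked.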

       0                   1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                                                               |
      ~                     Outer Ethernet Header                     ~
      |                                                               |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                                                               |
      ~                       Outer IPvX Header                       ~
      |                                                               |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                                                               |
      ~                       Outer UDP Header                        ~
      |                                                               |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                                                               |
      ~                         VXLAN Header                          ~
      |                                                               |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                                                               |
      ~                     Inner Ethernet Header                     ~
      |                                                               |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                                                               |
      ~                       Inner IPvX Header                       ~
      |                                                               |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                                                               |
      ~                       Inner UDP Header                        ~
      |                                                               |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                                                               |
      ~                      BFD Control Message                      ~
      |                                                               |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                              FCS                              |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

          Figure 2: VXLAN Encapsulation of BFD Control Message

   The BFD packet MUST be carried inside the inner MAC frame of the
   VXLAN packet.  The inner MAC frame carrying the BFD payload has the
   following format:

   Ethernet Header:

      Destination MAC: This MUST be either a dedicated MAC address
      (TBA, see Section 9) or the MAC address of the destination VTEP.
      The details of how the MAC address of the destination VTEP is
      obtained are outside the scope of this document.

      Source MAC: MAC address of the originating VTEP.

   IP header:

      Source IP: IP address of the originating VTEP.

      Destination IP: IP address of the terminating VTEP.

      TTL: MUST be set to 1 to ensure that the BFD packet is not
      routed within the L3 underlay network.

      The fields of the UDP header and the BFD control packet are
      encoded as specified in [RFC5881] for p2p VXLAN tunnels.

6.  Reception of BFD packet from VXLAN Tunnel

   Once a packet is received, the VTEP MUST validate the packet as
   described in Section 4.1 of [RFC7348].  If the Destination MAC of
   the inner MAC frame matches the dedicated MAC or the MAC address of
   the VTEP, the packet MUST be processed further.

   The UDP destination port and the IP TTL carried in the inner
   Ethernet frame MUST be validated to determine if the received packet
   can be processed by BFD.  A BFD packet whose inner destination MAC
   is set to the VTEP's MAC address or to the dedicated MAC address
   MUST NOT be forwarded to VMs.

   To ensure BFD detects the proper configuration of the VXLAN Network
   Identifier (VNI) in a remote VTEP, a lookup SHOULD be performed with
   the MAC-DA and VNI as key in the Virtual Forwarding Instance (VFI)
   table of the originating/terminating VTEP to exercise the VFI
   associated with the VNI.

6.1.  Demultiplexing of the BFD packet

   Demultiplexing of IP BFD packets has been defined in Section 3 of
   [RFC5881].  Since multiple BFD sessions may be running between two
   VTEPs, there needs to be a mechanism for demultiplexing received BFD
   packets to the proper session.  The procedure for demultiplexing
   packets with Your Discriminator equal to 0 is different from
   [RFC5880].  For such packets, the BFD session MUST be identified
   using the inner headers, i.e., the source IP and the destination IP
   present in the IP header carried by the payload of the VXLAN
   encapsulated packet.  The VNI of the packet SHOULD be used to derive
   interface-related information for demultiplexing the packet.  If a
   BFD packet is received with a non-zero Your Discriminator, then the
   BFD session MUST be demultiplexed only with Your Discriminator as
   the key.

7.  Use of reserved VNI

   A BFD session MAY be established for the reserved VNI 0.
   One way to aggregate BFD sessions between VTEPs is to establish a
   BFD session with VNI 0.  A VTEP MAY also use VNI 0 to establish a
   BFD session with a service node.

8.  Echo BFD

   Support for echo BFD is outside the scope of this document.

9.  IANA Considerations

   IANA is requested to assign a dedicated MAC address to be used as the
   Destination MAC address of the inner Ethernet frame that carries the
   BFD control packet in IP/UDP encapsulation.

10.  Security Considerations

   This document requires setting the inner IP TTL to 1, which could
   lead to a DDoS attack.  Thus, the implementation MUST have
   throttling in place.  Throttling MAY be relaxed for BFD packets
   based on port number.

   Other than the inner IP TTL being set to 1, this specification does
   not raise any additional security issues beyond those of the
   specifications referred to in the list of normative references.

11.  Contributors

   Reshad Rahman
   rrahman@cisco.com
   Cisco

12.  Acknowledgments

   The authors would like to thank Jeff Haas of Juniper Networks for
   his reviews and feedback on this material.

   The authors would also like to thank Nobo Akiya, Marc Binderberger,
   and Shahram Davari for their extensive review.

13.  Normative References

   [I-D.ietf-bfd-multipoint]
              Katz, D., Ward, D., Networks, J., and G. Mirsky, "BFD for
              Multipoint Networks", draft-ietf-bfd-multipoint-18 (work
              in progress), June 2018.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC5880]  Katz, D. and D. Ward, "Bidirectional Forwarding Detection
              (BFD)", RFC 5880, DOI 10.17487/RFC5880, June 2010.

   [RFC5881]  Katz, D. and D. Ward, "Bidirectional Forwarding Detection
              (BFD) for IPv4 and IPv6 (Single Hop)", RFC 5881,
              DOI 10.17487/RFC5881, June 2010.

   [RFC7348]  Mahalingam, M., Dutt, D., Duda, K., Agarwal, P., Kreeger,
              L., Sridhar, T., Bursell, M., and C. Wright, "Virtual
              eXtensible Local Area Network (VXLAN): A Framework for
              Overlaying Virtualized Layer 2 Networks over Layer 3
              Networks", RFC 7348, DOI 10.17487/RFC7348, August 2014.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017.

Authors' Addresses

   Santosh Pallagatti (editor)
   Rtbrick

   Email: santosh.pallagatti@gmail.com


   Sudarsan Paragiri
   Juniper Networks
   1194 N. Mathilda Ave.
   Sunnyvale, California  94089-1206
   USA

   Email: sparagiri@juniper.net


   Vengada Prasad Govindan
   Cisco

   Email: venggovi@cisco.com


   Mallik Mudigonda
   Cisco

   Email: mmudigon@cisco.com


   Greg Mirsky
   ZTE Corp.

   Email: gregimirsky@gmail.com