idnits 2.17.1

draft-rosen-idr-tunnel-encaps-00.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  == The document seems to lack the recommended RFC 2119 boilerplate, even if
     it appears to use RFC 2119 keywords -- however, there's a paragraph with
     a matching beginning. Boilerplate error?

     (The document does seem to have the reference to RFC 2119 which the
     ID-Checklist requires).

     (Using the creation date from RFC5512, updated by this document, for
     RFC5378 checks: 2008-01-24)

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008. If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment. If not, you may need to add the pre-RFC5378 disclaimer.
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (June 24, 2015) is 3228 days in the past. Is this
     intentional?
  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  ** Obsolete normative reference: RFC 5512 (Obsoleted by RFC 9012)

     Summary: 1 error (**), 0 flaws (~~), 2 warnings (==), 2 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

IDR Working Group                                          E. Rosen, Ed.
Internet-Draft                                    Juniper Networks, Inc.
Updates: 5512 (if approved)                                     K. Patel
Intended status: Standards Track                           Cisco Systems
Expires: December 26, 2015                               G. Van de Velde
                                                          Alcatel-Lucent
                                                           June 24, 2015

      Using the BGP Tunnel Encapsulation Attribute without the BGP
                           Encapsulation SAFI
                    draft-rosen-idr-tunnel-encaps-00

Abstract

   RFC 5512 defines a BGP Path Attribute known as the "Tunnel
   Encapsulation Attribute". This attribute allows one to specify a set
   of tunnels. For each such tunnel, the attribute can provide
   additional information used to create a tunnel and the corresponding
   encapsulation header, and can also provide information that aids in
   choosing whether a particular packet is to be sent through a
   particular tunnel. RFC 5512 states that the attribute is only
   carried in BGP UPDATEs that have the "Encapsulation Subsequent
   Address Family (Encapsulation SAFI)". This document updates RFC 5512
   by removing that restriction, and by specifying semantics for the
   attribute when it is carried in UPDATEs of certain other SAFIs. This
   document also extends the attribute by enabling it to carry
   additional information needed to create the encapsulation headers
   for additional tunnel types not mentioned in RFC 5512. Finally, this
   document also extends the attribute by allowing it to specify a
   remote tunnel endpoint address for each tunnel.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on December 26, 2015.

Copyright Notice

   Copyright (c) 2015 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Tunnel Encapsulation Attribute Sub-TLVs
     2.1.  The Remote Endpoint Sub-TLV
     2.2.  Encapsulation Sub-TLVs for Particular Tunnel Types
       2.2.1.  VXLAN
       2.2.2.  VXLAN-GPE
       2.2.3.  NVGRE
       2.2.4.  GTP
       2.2.5.  MPLS-in-GRE
     2.3.  Outer Encapsulation Sub-TLVs
       2.3.1.  IPv4 DS Field
       2.3.2.  UDP Destination Port
     2.4.  Embedded Label Handling Sub-TLV
   3.  Semantics and Usage of the Tunnel Encapsulation attribute
   4.  Routing Considerations
     4.1.  No Impact on BGP Decision Process
     4.2.  Looping, Infinite Stacking, Etc.
   5.  Recursive Next Hop Resolution
   6.  Tunnel Encapsulation Extended Community
   7.  Use of Virtual Network Identifiers and Embedded Labels when
       Imposing a Tunnel Encapsulation
     7.1.  Unlabeled Address Families
     7.2.  Labeled Address Families
       7.2.1.  When a Valid VNID has been Signaled
       7.2.2.  When a Valid VNID has not been Signaled
       7.2.3.  Applicability Restrictions
   8.  Scoping
   9.  Error Handling
   10. IANA Considerations
   11. Security Considerations
   12. Acknowledgments
   13. Contributor Addresses
   14. References
     14.1.  Normative References
     14.2.  Informative References
   Authors' Addresses

1.  Introduction

   [RFC5512] defines a BGP Path Attribute known as the Tunnel
   Encapsulation attribute. This attribute consists of one or more
   TLVs. Each TLV identifies a particular type of tunnel. Each TLV
   also contains one or more sub-TLVs. Some of the sub-TLVs, e.g., the
   "Encapsulation sub-TLV", contain information that may be used to form
   the encapsulation header for the specified tunnel type. Other
   sub-TLVs, e.g., the "color sub-TLV" and the "protocol sub-TLV",
   contain information that aids in determining whether particular
   packets should be sent through the tunnel that the TLV identifies.

   [RFC5512] only allows the Tunnel Encapsulation attribute to be
   attached to BGP UPDATE messages that have the "Encapsulation SAFI"
   (i.e., UPDATE messages with AFI/SAFI 1/7 or 2/7). In an UPDATE of
   the Encapsulation SAFI, the NLRI is an address of the BGP speaker
   originating the UPDATE. Consider the following scenario:

   o  BGP speaker R1 has received and installed UPDATE U;

   o  UPDATE U's SAFI is the Encapsulation SAFI;

   o  UPDATE U has the address R2 as its NLRI;

   o  UPDATE U has a Tunnel Encapsulation attribute;

   o  R1 has a packet, P, to transmit to destination D;

   o  R1's best path to D is a BGP route that has R2 as its next hop.

   In this scenario, when R1 transmits packet P, it should transmit it
   to R2 through one of the tunnels specified in U's Tunnel
   Encapsulation attribute. The IP address of the remote endpoint of
   each such tunnel is R2. Packet P is known as the tunnel's "payload".

   While the ability to specify tunnel information in a BGP UPDATE is
   useful, the procedures of [RFC5512] have certain limitations:

   o  The requirement to use the "Encapsulation SAFI" presents an
      unfortunate operational cost, as each BGP session that may need to
      carry tunnel encapsulation information needs to be reconfigured to
      support the Encapsulation SAFI.

   o  There is no way to use the Tunnel Encapsulation attribute to
      specify the remote endpoint address of a given tunnel; [RFC5512]
      assumes that the remote endpoint of each tunnel is specified as
      the NLRI of an UPDATE of the Encapsulation SAFI.

   o  If the respective best paths to two different address prefixes
      have the same next hop, [RFC5512] does not provide a
      straightforward method to associate each prefix with a different
      tunnel.

   In this document we address these deficiencies by:

   o  Defining a new "Remote Endpoint Address sub-TLV" that can be
      included in any of the TLVs contained in the Tunnel Encapsulation
      attribute. This sub-TLV can be used to specify the remote
      endpoint address of a particular tunnel.

   o  Allowing the Tunnel Encapsulation attribute to be carried by BGP
      UPDATEs of additional AFI/SAFIs. Appropriate semantics are
      provided for this way of using the attribute.

   One of the sub-TLVs defined in [RFC5512] is the "Encapsulation
   sub-TLV". For a given tunnel, the encapsulation sub-TLV specifies
   some of the information needed to construct the encapsulation header
   used when sending packets through that tunnel. This document defines
   encapsulation sub-TLVs for a number of tunnel types not discussed in
   [RFC5512]: VXLAN, VXLAN-GPE, NVGRE, GTP, and MPLS-in-GRE.
   MPLS-in-UDP [RFC7510] is also supported, but an Encapsulation sub-TLV
   for it is not needed.

   Some of the encapsulations mentioned in the previous paragraph need
   to be further encapsulated inside UDP and/or IP. [RFC5512] provides
   no way to specify that certain information is to appear in these
   outer IP and/or UDP encapsulations. This document provides a
   framework for including such information in the TLVs of the Tunnel
   Encapsulation attribute.

   When the Tunnel Encapsulation attribute is attached to a BGP UPDATE
   whose AFI/SAFI identifies one of the labeled address families, it is
   not always obvious whether the label embedded in the NLRI is to
   appear somewhere in the tunnel encapsulation header (and if so,
   where), or whether it is to appear in the payload, or whether it can
   be omitted altogether. This is especially true if the tunnel
   encapsulation header itself contains a "virtual network identifier".
   This document provides a mechanism that allows one to signal (by
   using sub-TLVs of the Tunnel Encapsulation attribute) how one wants
   to use the embedded label when the tunnel encapsulation has its own
   virtual network identifier field.

   [RFC5512] defines a Tunnel Encapsulation Extended Community that can
   be used instead of the Tunnel Encapsulation attribute under certain
   circumstances. This document addresses the issue of how to handle a
   BGP UPDATE that carries both a Tunnel Encapsulation attribute and one
   or more Tunnel Encapsulation Extended Communities.

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL", when and only when appearing in all capital letters, are
   to be interpreted as described in [RFC2119].

2.  Tunnel Encapsulation Attribute Sub-TLVs

   [RFC5512] specifies three sub-TLVs for the Tunnel Encapsulation
   attribute: the Encapsulation sub-TLV, the Color sub-TLV, and the
   Protocol Type sub-TLV. In this section we specify a number of
   additional sub-TLVs. We also specify Encapsulation sub-TLVs for a
   number of tunnel types that are not mentioned in [RFC5512].

2.1.  The Remote Endpoint Sub-TLV

   The Remote Endpoint sub-TLV is a sub-TLV whose value field contains
   an IP address sub-field and an Autonomous System (AS) number
   sub-field.
   The IP address may be either an IPv4 address (a /32 IPv4 prefix) or
   an IPv6 address (a /128 IPv6 prefix). However, IPv6 link local
   addresses are not valid values of the IP address field. Also, IPv4
   broadcast addresses are not valid values of this field.

   If the length of the value field is eight octets, the value field
   contains a four-octet IPv4 address field followed by a four-octet AS
   number field. If the length of the value field is 20 octets, the
   value field contains a sixteen-octet IPv6 address field followed by a
   four-octet AS number field.

   In a given BGP UPDATE, the address family (IPv4 or IPv6) of a Remote
   Endpoint sub-TLV is independent of the address family of the
   UPDATE itself. For example, an UPDATE whose NLRI is an IPv4 address
   may have a Tunnel Encapsulation attribute containing Remote Endpoint
   sub-TLVs that contain IPv6 addresses. Also, different tunnels
   represented in the Tunnel Encapsulation attribute may have Remote
   Endpoints of different address families.

   A two-octet AS number can be carried in the AS number field by
   setting the two high order octets to zero, and carrying the number in
   the two low order octets of the field.

   The AS number in the sub-TLV MUST be the number of the AS to which
   the IP address in the sub-TLV belongs.

   There is one special case: the Remote Endpoint sub-TLV MAY have a
   value field consisting entirely of zeroes. This means that the
   tunnel's remote endpoint is the UPDATE's BGP next hop.

   If the Remote Endpoint sub-TLV has a non-zero value, then if any of
   the following conditions hold, the Remote Endpoint sub-TLV is
   considered to be "invalid":

   o  If the sub-TLV's value field is any length other than eight or
      twenty octets, the sub-TLV is considered to be malformed. If the
      Remote Endpoint sub-TLV is malformed, the TLV containing it is
      also considered to be malformed, and the entire TLV MUST be
      ignored.
      However, the Tunnel Encapsulation attribute SHOULD NOT
      be considered to be malformed in this case; other TLVs in the
      attribute SHOULD be processed.

   o  The IPv4 or IPv6 address field of the sub-TLV contains a value
      that is not a valid (see above) IPv4 or IPv6 address,
      respectively.

   o  It can be determined that the IP address in the sub-TLV does not
      belong to the non-zero AS whose number is in the sub-TLV. (See
      Section 11 for discussion of one way to determine this.)

   If the Remote Endpoint sub-TLV is invalid, the entire TLV containing
   it SHOULD be ignored. However, other TLVs in the Tunnel
   Encapsulation attribute SHOULD NOT be ignored.

   When redistributing a route that is carrying a Tunnel Encapsulation
   attribute that contains a TLV that itself contains an invalid Remote
   Endpoint sub-TLV, the TLV SHOULD be removed from the attribute before
   redistribution.

   See Section 9 for further discussion of how to handle errors that are
   encountered when parsing the Tunnel Encapsulation attribute.

   If the Remote Endpoint sub-TLV contains an IPv4 or IPv6 address that
   is not reachable, the sub-TLV is NOT considered to be invalid, and
   the containing TLV SHOULD NOT be removed from the attribute before
   redistribution. However, the tunnel identified by the TLV containing
   that sub-TLV cannot be used until such time as the address becomes
   reachable. See Section 3.

2.2.  Encapsulation Sub-TLVs for Particular Tunnel Types

   Tunnel Encapsulation sub-TLVs for the following tunnel types are
   defined in [RFC5512]: L2TPv3 and GRE.

   This section defines Tunnel Encapsulation sub-TLVs for the following
   tunnel types: VXLAN ([RFC7348]), VXLAN-GPE ([VXLAN-GPE]), NVGRE
   ([NVGRE]), GTP ([GTP-U]), and MPLS-in-GRE ([RFC2784], [RFC2890],
   [RFC4023]).

   Rules for forming the encapsulation based on the information in a
   given TLV are given in Section 7.

2.2.1.  VXLAN

   This document defines an encapsulation sub-TLV for VXLAN tunnels.
   When the tunnel type is VXLAN, the following is the structure of the
   value field in the encapsulation sub-TLV:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |V|M|R|R|R|R|R|R|          VN-ID (3 Octets)                     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                 MAC Address (4 Octets)                        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   MAC Address (2 Octets)      |   Reserved                    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                  Figure 1: VXLAN Encapsulation Sub-TLV

   V: This bit is set to 1 to indicate that a valid VN-ID is present
      in the encapsulation sub-TLV.

   M: This bit is set to 1 to indicate that a valid MAC Address is
      present in the encapsulation sub-TLV.

   R: The remaining bits in the 8-bit flags field are reserved for
      future use. They SHOULD always be set to 0.

   VN-ID: If the V bit is set, the VN-ID field contains a 3 octet
      VN-ID value. If the V bit is not set, the VN-ID field SHOULD be
      set to zero.

   MAC Address: If the M bit is set, this field contains a 6 octet
      Ethernet MAC address. If the M bit is not set, this field SHOULD
      be set to all zeroes.

   When forming the VXLAN encapsulation header:

   o  The values of the V, M, and R bits are NOT copied into the flags
      field of the VXLAN header. The flags field of the VXLAN header is
      set as per [RFC7348].

   o  If the M bit is set, the MAC Address is copied into the Inner
      Destination MAC Address field of the Inner Ethernet Header (see
      section 5 of [RFC7348]). If the M bit is not set, the Inner
      Destination MAC address field is set to a configured value. If
      the M bit is not set, and there is no configured value, the VXLAN
      tunnel cannot be used.

   o  See Section 7 to see how the VNI field of the VXLAN encapsulation
      header is set.

   Note that what we are calling a "VXLAN tunnel" is actually an
   "ethernet-in-VXLAN" tunnel. Although, strictly speaking, VXLAN
   tunnels only carry ethernet frames, an IP packet or an MPLS packet
   can be carried through a "VXLAN tunnel" by forming an
   IP-in-ethernet-in-VXLAN or MPLS-in-ethernet-in-VXLAN tunnel.

2.2.2.  VXLAN-GPE

   This document defines an encapsulation sub-TLV for VXLAN-GPE tunnels.
   When the tunnel type is VXLAN-GPE, the following is the structure of
   the value field in the encapsulation sub-TLV:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |Ver|V|R|R|R|R|R|                 Reserved                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                       VN-ID                   |   Reserved    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                Figure 2: VXLAN-GPE Encapsulation Sub-TLV

   V: This bit is set to 1 to indicate that a valid VN-ID is present
      in the encapsulation sub-TLV.

   R: The bits designated "R" above are reserved for future use.
      They SHOULD always be set to zero.

   Version (Ver): Indicates VXLAN-GPE protocol version. If the
      indicated version is not supported, the TLV that contains this
      Encapsulation sub-TLV MUST be treated as specifying an unsupported
      tunnel type. The value of this field will be copied into the
      corresponding field of the VXLAN-GPE encapsulation header.

   VN-ID: If the V bit is set, this field contains a 3 octet VN-ID
      value. If the V bit is not set, this field SHOULD be set to zero.

   When forming the VXLAN-GPE encapsulation header:

   o  The values of the V and R bits are NOT copied into the flags field
      of the VXLAN-GPE header. However, the values of the Ver bits are
      copied into the VXLAN-GPE header.
      Other bits in the flags field
      of the VXLAN-GPE header are set as per [VXLAN-GPE].

   o  See Section 7 to see how the VNI field of the VXLAN-GPE
      encapsulation header is set.

2.2.3.  NVGRE

   This document defines an encapsulation sub-TLV for NVGRE tunnels.
   When the tunnel type is NVGRE, the following is the structure of the
   value field in the encapsulation sub-TLV:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |V|M|R|R|R|R|R|R|          VN-ID (3 Octets)                     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                 MAC Address (4 Octets)                        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   MAC Address (2 Octets)      |   Reserved                    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                  Figure 3: NVGRE Encapsulation Sub-TLV

   V: This bit is set to 1 to indicate that a valid VN-ID is present
      in the encapsulation sub-TLV.

   M: This bit is set to 1 to indicate that a valid MAC Address is
      present in the encapsulation sub-TLV.

   R: The remaining bits in the 8-bit flags field are reserved for
      future use. They SHOULD always be set to 0.

   VN-ID: If the V bit is set, the VN-ID field contains a 3 octet
      VN-ID value. If the V bit is not set, the VN-ID field SHOULD be
      set to zero.

   MAC Address: If the M bit is set, this field contains a 6 octet
      Ethernet MAC address. If the M bit is not set, this field SHOULD
      be set to all zeroes.

   When forming the NVGRE encapsulation header:

   o  The values of the V, M, and R bits are NOT copied into the flags
      field of the NVGRE header. The flags field of the NVGRE header is
      set as per [NVGRE].

   o  If the M bit is set, the MAC Address is copied into the Inner
      Destination MAC Address field of the Inner Ethernet Header (see
      section 5 of [NVGRE]).
      If the M bit is not set, the Inner
      Destination MAC address field is set to a configured value. If
      the M bit is not set, and there is no configured value, the NVGRE
      tunnel cannot be used.

   o  See Section 7 to see how the virtual network identifier (VSID)
      field of the NVGRE encapsulation header is set.

2.2.4.  GTP

   When the tunnel type is GTP [GTP-U], the Encapsulation sub-TLV
   contains information needed to send data packets through a GTP
   tunnel, and also contains information needed by the tunnel's remote
   endpoint to create a "reverse" tunnel back to the transmitter. This
   allows a bidirectional control connection to be created. The format
   of the Encapsulation Sub-TLV is:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                     Remote TEID (4 Octets)                    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                     Local TEID (4 Octets)                     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |        Local Endpoint Address (4/16 Octets (IPv4/IPv6))       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                   Figure 4: GTP Encapsulation Sub-TLV

   Remote TEID: Contains the 32-bit Tunnel Endpoint Identifier of the
      GTP tunnel through which data packets are to be sent. When data
      packets are sent through the tunnel, the Remote TEID is carried in
      the GTP encapsulation header. The GTP header is itself
      encapsulated within an IP header, whose IP destination address
      field is set to the value of the Remote Endpoint sub-TLV.

   Local TEID: Contains a 32-bit Tunnel Endpoint Identifier of a GTP
      tunnel assigned by EPC ([vEPC]).

   Local Endpoint Address: Contains an IPv4 or IPv6 anycast address.
      This is used, along with the Local TEID, to set up a tunnel in the
      reverse direction. See [vEPC] for details.

2.2.5.  MPLS-in-GRE

   When the tunnel type is MPLS-in-GRE, the following is the structure
   of the value field in an optional encapsulation sub-TLV:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                      GRE-Key (4 Octets)                       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

               Figure 5: MPLS-in-GRE Encapsulation Sub-TLV

   GRE-Key: 4-octet field [RFC2890] that is generated by the
      advertising router. The actual method by which the key is
      obtained is beyond the scope of this document. The key is
      inserted into the GRE encapsulation header of the payload packets
      sent by ingress routers to the advertising router. It is intended
      to be used for identifying extra context information about the
      received payload. Note that the key is optional. Unless a key
      value is being advertised, the MPLS-in-GRE encapsulation sub-TLV
      MUST NOT be present.

   Note that the GRE tunnel type defined in [RFC5512] can be used
   instead of the MPLS-in-GRE tunnel type when it is necessary to
   encapsulate MPLS in GRE. Including a TLV of the MPLS-in-GRE tunnel
   type is equivalent to including a TLV of the GRE tunnel type that
   also includes a Protocol Type sub-TLV ([RFC5512]) specifying MPLS as
   the protocol to be encapsulated. That is, if a TLV specifies
   MPLS-in-GRE or if it includes a Protocol Type sub-TLV specifying
   MPLS, the GRE tunnel advertised in that TLV MUST NOT be used for
   carrying IP packets.

2.3.  Outer Encapsulation Sub-TLVs

   The Encapsulation sub-TLV for a particular tunnel type allows one to
   specify the values that are to be placed in certain fields of the
   encapsulation header for that tunnel type. However, some tunnel
   types require an outer IP encapsulation, and some also require an
   outer UDP encapsulation.
   The Encapsulation sub-TLV for a given
   tunnel type does not usually provide a way to specify values for
   fields of the outer IP and/or UDP encapsulations. If it is necessary
   to specify values for fields of the outer encapsulation, additional
   sub-TLVs must be used. This document defines two such sub-TLVs.

   If an outer encapsulation sub-TLV occurs in a TLV for a tunnel type
   that does not use the corresponding outer encapsulation, the sub-TLV
   is treated as if it were an unknown type of sub-TLV.

2.3.1.  IPv4 DS Field

   Most of the tunnel types that can be specified in the Tunnel
   Encapsulation attribute require an outer IP encapsulation. The IPv4
   DS Field sub-TLV can be carried in the TLV of any such tunnel type.
   It specifies the setting of the one-octet Differentiated Services
   field in the outer IP encapsulation (see [RFC2474]). The value field
   is always a single octet.

2.3.2.  UDP Destination Port

   Some of the tunnel types that can be specified in the Tunnel
   Encapsulation attribute require an outer UDP encapsulation.
   Generally there is a standard UDP Destination Port value for a
   particular tunnel type. However, sometimes it is useful to be able
   to use a non-standard UDP destination port. If a particular tunnel
   type requires an outer UDP encapsulation, and it is desired to use a
   UDP destination port other than the standard one, the port to be used
   can be specified by including a UDP Destination Port sub-TLV. The
   value field of this sub-TLV is always a two-octet field, containing
   the port value.

2.4.  Embedded Label Handling Sub-TLV

   Certain BGP address families (corresponding to particular AFI/SAFI
   pairs, e.g., 1/4, 2/4, 1/128, 2/128) have MPLS labels embedded in
   their NLRIs. We will use the term "embedded label" to refer to the
   MPLS label that is embedded in an NLRI, and the term "labeled address
   family" to refer to any AFI/SAFI that has embedded labels.

   Some of the tunnel types (e.g., VXLAN, VXLAN-GPE, and NVGRE) that can
   be specified in the Tunnel Encapsulation attribute have an
   encapsulation header containing a "Virtual Network" identifier of
   some sort. The Encapsulation sub-TLVs for these tunnel types may
   optionally specify a value for the virtual network identifier.

   Suppose a Tunnel Encapsulation attribute is attached to an UPDATE of
   a labeled address family, and it is decided to use a particular
   tunnel (specified in one of the attribute's TLVs) for transmitting a
   packet that is being forwarded according to that UPDATE. When
   forming the encapsulation header for that packet, different
   deployment scenarios require different handling of the embedded label
   and/or the virtual network identifier. The Embedded Label Handling
   sub-TLV can be used to control the placement of the embedded label
   and/or the virtual network identifier in the encapsulation.

   The Embedded Label Handling sub-TLV may be included in any TLV of the
   Tunnel Encapsulation attribute. If the Tunnel Encapsulation
   attribute is attached to an UPDATE of a non-labeled address family,
   the sub-TLV is treated as a no-op. If the sub-TLV is contained in a
   TLV whose tunnel type does not have a virtual network identifier in
   its encapsulation header, the sub-TLV is treated as a no-op.

   The sub-TLV's Length field always contains the value 1, and its value
   field consists of a single octet. The following values are defined:

   1: The payload will be an MPLS packet with the embedded label at the
      top of its label stack.

   2: The embedded label is not carried in the payload, but is carried
      either in the virtual network identifier field of the
      encapsulation header, or else is ignored entirely.

   Please see Section 7 for the details of how this sub-TLV is used when
   it is carried by an UPDATE of a labeled address family.
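
   The value and no-op rules of the Embedded Label Handling sub-TLV
   can be illustrated with a minimal Python sketch. This is not part of
   the specification. The tunnel type names are placeholder strings
   rather than registry code points, the set of labeled AFI/SAFI pairs
   contains only the examples given in the text, and raising an error
   for an undefined value is an assumption of the sketch, since the
   text above does not say how undefined values are handled.

```python
from enum import IntEnum
from typing import Optional

# AFI/SAFI pairs that the text gives as examples of labeled address families.
LABELED_FAMILIES = {(1, 4), (2, 4), (1, 128), (2, 128)}

# Tunnel types the text names as having a virtual network identifier.
# These strings are placeholders for this sketch, not registry values.
VNID_TUNNEL_TYPES = {"VXLAN", "VXLAN-GPE", "NVGRE"}


class LabelHandling(IntEnum):
    LABEL_IN_PAYLOAD = 1      # payload is MPLS, embedded label on top of stack
    LABEL_NOT_IN_PAYLOAD = 2  # label goes in the VNID field or is ignored


def embedded_label_handling(
    value: bytes, afi: int, safi: int, tunnel_type: str
) -> Optional[LabelHandling]:
    """Return the handling to apply, or None when the sub-TLV is a no-op."""
    if len(value) != 1:
        raise ValueError("Embedded Label Handling value must be one octet")
    if (afi, safi) not in LABELED_FAMILIES:
        return None  # non-labeled address family: no-op
    if tunnel_type not in VNID_TUNNEL_TYPES:
        return None  # tunnel type has no virtual network identifier: no-op
    # Assumption: values other than 1 and 2 raise ValueError here.
    return LabelHandling(value[0])
```

   For example, a value of 1 on a VPN-IPv4 (1/128) UPDATE with a VXLAN
   TLV selects LABEL_IN_PAYLOAD, while the same value on an IPv4
   Unicast (1/1) UPDATE yields None.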

   If the Embedded Label Handling sub-TLV is carried by an UPDATE of a
   non-labeled address family, it is treated as a no-op. However, it
   SHOULD NOT be stripped from the TLV before the UPDATE is forwarded.

3.  Semantics and Usage of the Tunnel Encapsulation attribute

   [RFC5512] specifies the use of the Tunnel Encapsulation attribute in
   BGP UPDATE messages of AFI/SAFI 1/7 and 2/7. That document restricts
   the use of this attribute to UPDATE messages of those SAFIs. This
   document removes that restriction.

   The BGP Tunnel Encapsulation attribute MAY be carried in any BGP
   UPDATE message whose AFI/SAFI is 1/1 (IPv4 Unicast), 2/1 (IPv6
   Unicast), 1/4 (IPv4 Labeled Unicast), 2/4 (IPv6 Labeled Unicast),
   1/128 (VPN-IPv4 Labeled Unicast), 2/128 (VPN-IPv6 Labeled Unicast),
   or 25/70 (EVPN). Use of the Tunnel Encapsulation attribute in BGP
   UPDATE messages of other AFI/SAFIs is outside the scope of this
   document.

   The decision to attach a Tunnel Encapsulation attribute to a given
   BGP UPDATE is determined by policy. The set of TLVs and sub-TLVs
   contained in the attribute is also determined by policy.

   When the Tunnel Encapsulation attribute is carried in an UPDATE of
   one of the AFI/SAFIs specified above, each TLV MUST have a Remote
   Endpoint sub-TLV. If a TLV does not have a Remote Endpoint sub-TLV,
   that TLV should be treated as if it had a malformed Remote Endpoint
   sub-TLV (see Section 2.1).

   Suppose that:

   o  a given packet P must be forwarded by router R;

   o  the path along which P is to be forwarded is determined by BGP
      UPDATE U;

   o  UPDATE U has a Tunnel Encapsulation attribute, containing at least
      one TLV that identifies a "feasible tunnel" for packet P.
      A tunnel is considered feasible if it has the following three
      properties:

      *  The tunnel type is supported (i.e., router R knows how to set
         up tunnels of that type, how to create the encapsulation header
         for tunnels of that type, etc.).

      *  The tunnel is of a type that can be used to carry packet P
         (e.g., an MPLS-in-UDP tunnel would not be a feasible tunnel for
         carrying an IP packet, UNLESS the IP packet can first be
         converted to an MPLS packet).

      *  The tunnel is specified in a TLV whose Remote Endpoint sub-TLV
         identifies an IP address that is reachable.

   Then router R SHOULD send packet P through one of the feasible
   tunnels identified in the Tunnel Encapsulation attribute of UPDATE U.

   If the Tunnel Encapsulation attribute contains several TLVs (i.e., if
   it specifies several tunnels), router R may choose any one of those
   tunnels, based upon local policy. If any of the tunnels' TLVs contain
   the Color sub-TLV and/or the Protocol Type sub-TLV defined in
   [RFC5512], the choice of tunnel may be influenced by these sub-TLVs.

   If a particular tunnel is not feasible at some moment because its
   Remote Endpoint cannot be reached at that moment, the tunnel may
   become feasible at a later time. When this happens, router R SHOULD
   reconsider its choice of tunnel to use, and MAY choose to now use the
   tunnel.

   A TLV specifying a non-feasible tunnel is not considered to be
   malformed or erroneous in any way, and the TLV SHOULD NOT be stripped
   from the Tunnel Encapsulation attribute before redistribution.

   In addition to the sub-TLVs already defined, additional sub-TLVs may
   be defined that affect the choice of tunnel to be used, or that
   affect the contents of the tunnel encapsulation header. The
   documents that define any such additional sub-TLVs must specify the
   effect that including the sub-TLV is to have.
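
   The feasibility test and tunnel choice described above can be
   sketched in Python as follows. This is a minimal illustration under
   stated assumptions, not a normative procedure. The TunnelTLV
   structure, the symbolic tunnel type names, and the can_carry and
   is_reachable callbacks are hypothetical stand-ins for
   implementation-specific state. The "prefer a matching color,
   otherwise take the first feasible tunnel" rule is just one possible
   local policy; this document leaves the choice to local policy.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional, Set


@dataclass(frozen=True)
class TunnelTLV:
    """Hypothetical, simplified view of one TLV of the attribute."""
    tunnel_type: str             # symbolic name, e.g. "vxlan" (illustrative)
    remote_endpoint: str         # address from the Remote Endpoint sub-TLV
    color: Optional[int] = None  # value of a Color sub-TLV, if present


def feasible_tunnels(tlvs: Iterable[TunnelTLV],
                     supported_types: Set[str],
                     can_carry: Callable[[str], bool],
                     is_reachable: Callable[[str], bool]) -> List[TunnelTLV]:
    """Keep only the TLVs that satisfy all three feasibility properties."""
    return [
        t for t in tlvs
        if t.tunnel_type in supported_types    # 1. tunnel type is supported
        and can_carry(t.tunnel_type)           # 2. type can carry packet P
        and is_reachable(t.remote_endpoint)    # 3. remote endpoint reachable
    ]


def choose_tunnel(feasible: List[TunnelTLV],
                  preferred_color: Optional[int] = None) -> Optional[TunnelTLV]:
    """Example local policy (an assumption): prefer a tunnel whose color
    matches, otherwise use the first feasible tunnel, otherwise none."""
    if preferred_color is not None:
        for t in feasible:
            if t.color == preferred_color:
                return t
    return feasible[0] if feasible else None


# Usage: packet P is a plain IP packet, so MPLS-in-UDP cannot carry it.
tlvs = [
    TunnelTLV("mpls-in-udp", "192.0.2.2"),
    TunnelTLV("gre", "198.51.100.7", color=5),  # endpoint unreachable below
    TunnelTLV("vxlan", "192.0.2.1"),
    TunnelTLV("gre", "192.0.2.3", color=7),
]
usable = feasible_tunnels(
    tlvs,
    supported_types={"vxlan", "gre", "mpls-in-udp"},
    can_carry=lambda tunnel_type: tunnel_type != "mpls-in-udp",
    is_reachable=lambda addr: addr.startswith("192.0.2."),
)
selected = choose_tunnel(usable, preferred_color=7)
```

   Because reachability can change over time, an implementation along
   these lines would re-run the feasibility test whenever the
   reachability of a Remote Endpoint changes.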

   If it is determined to send a packet through the tunnel specified in
   a particular TLV of a particular Tunnel Encapsulation attribute, and
   if that TLV contains a Remote Endpoint sub-TLV, then the tunnel's
   remote endpoint address is the IP address contained in the sub-TLV.
   If the TLV does not contain a Remote Endpoint sub-TLV, or if it
   contains a Remote Endpoint sub-TLV whose value field is all zeroes,
   then the tunnel's remote endpoint is the IP address specified as the
   Next Hop of the BGP UPDATE containing the Tunnel Encapsulation
   attribute.

   The procedure for sending a packet through a particular tunnel type
   to a particular remote endpoint depends upon the tunnel type, and is
   outside the scope of this document. The contents of the tunnel
   encapsulation header MAY be influenced by the Encapsulation sub-TLV.

   Note that some tunnel types may require the execution of an explicit
   tunnel setup protocol before they can be used for carrying data.
   Other tunnel types may not require any tunnel setup protocol.
   Whenever a new Tunnel Type TLV is defined, the specification of that
   TLV must describe (or reference) the procedures for creating the
   encapsulation header used to forward packets through that tunnel
   type.

   If a Tunnel Encapsulation attribute specifies several tunnels, the
   way in which a router chooses which one to use is a matter of policy,
   subject to the following constraint: if a router can determine that a
   given tunnel is not functional, it MUST NOT use that tunnel. In
   particular, if the tunnel is identified in a TLV that has a Remote
   Endpoint sub-TLV, and if the IP address specified in the sub-TLV is
   not reachable from router R, then the tunnel SHOULD be considered
   non-functional. Other means of determining whether a given tunnel is
   functional MAY be used; specification of such means is outside the
   scope of this specification.
   Of course, if a non-functional tunnel later becomes functional,
   router R SHOULD reevaluate its choice of tunnels.

   If router R determines that it cannot use any of the tunnels
   specified in the Tunnel Encapsulation attribute, it MAY either drop
   packet P, or it MAY transmit packet P as it would had the Tunnel
   Encapsulation attribute not been present. This is a matter of local
   policy. By default, the packet SHOULD be transmitted as if the
   Tunnel Encapsulation attribute had not been present.

   A Tunnel Encapsulation attribute may contain several TLVs that all
   specify the same tunnel type. Each TLV should be considered as
   specifying a different tunnel. Two tunnels of the same type may have
   different Remote Endpoint sub-TLVs, different Encapsulation sub-TLVs,
   etc. Choosing between two such tunnels is a matter of local policy.

   Once router R has decided to send packet P through a particular
   tunnel, it encapsulates packet P appropriately and then forwards it
   according to the route that leads to the tunnel's remote endpoint.
   This route may itself be a BGP route with a Tunnel Encapsulation
   attribute. If so, the encapsulated packet is treated as the payload
   and is encapsulated according to the Tunnel Encapsulation attribute
   of that route. That is, tunnels may be "stacked".

4. Routing Considerations

4.1. No Impact on BGP Decision Process

   The presence of the Tunnel Encapsulation attribute does not affect
   the BGP bestpath selection algorithm.

   Under certain circumstances, this may lead to counter-intuitive
   consequences.
   For example, suppose:

   o  router R1 receives a BGP UPDATE message from router R2, such that

      *  the NLRI of that UPDATE is prefix X,

      *  the UPDATE contains a Tunnel Encapsulation attribute specifying
         two tunnels, T1 and T2,

      *  R1 cannot use tunnel T1 or tunnel T2, either because the tunnel
         remote endpoint is not reachable or because R1 does not support
         that kind of tunnel

   o  router R1 receives a BGP UPDATE message from router R3, such that

      *  the NLRI of that UPDATE is prefix X,

      *  the UPDATE contains a Tunnel Encapsulation attribute specifying
         two tunnels, T3 and T4,

      *  R1 can use at least one of the two tunnels

   Since the Tunnel Encapsulation attribute does not affect bestpath
   selection, R1 may well install the route from R2 rather than the
   route from R3, even though R2's route contains no usable tunnels.

   This possibility must be kept in mind whenever a Remote Endpoint
   sub-TLV carried by a given UPDATE specifies an IP address that is
   different than the next hop of that UPDATE.

4.2. Looping, Infinite Stacking, Etc.

   Consider a packet destined for address X. Suppose a BGP UPDATE for
   address prefix X carries a Tunnel Encapsulation attribute that
   specifies a remote tunnel endpoint of Y. And suppose that a BGP
   UPDATE for address prefix Y carries a Tunnel Encapsulation attribute
   that specifies a Remote Endpoint of X. It is easy to see that this
   will cause an infinite number of encapsulation headers to be put on
   the given packet.

   This could happen as a result of misconfiguration, either accidental
   or intentional. It could also happen if the Tunnel Encapsulation
   attribute were altered by a malicious agent. Implementations should
   be aware of this.

   Improper setting (or malicious altering) of the Tunnel Encapsulation
   attribute could also cause data packets to loop.
   Suppose a BGP UPDATE for address prefix X carries a Tunnel
   Encapsulation attribute that specifies a remote tunnel endpoint of Y.
   Suppose router R receives and processes the update. When router R
   receives a packet destined for X, it will apply the encapsulation and
   send the encapsulated packet to Y. Y will decapsulate the packet and
   forward it further. If Y is further away from X than is router R, it
   is possible that the path from Y to X will traverse R. This would
   cause a long-lasting routing loop.

   These possibilities must also be kept in mind whenever the Remote
   Endpoint for a given prefix differs from the BGP next hop for that
   prefix.

5. Recursive Next Hop Resolution

   Suppose that:

   o  a given packet P must be forwarded by router R1;

   o  the path along which P is to be forwarded is determined by BGP
      UPDATE U1;

   o  UPDATE U1 does not have a Tunnel Encapsulation attribute;

   o  the next hop of UPDATE U1 is router R2;

   o  the best path to router R2 is a BGP route that was advertised in
      UPDATE U2;

   o  UPDATE U2 has a Tunnel Encapsulation attribute.

   Then packet P SHOULD be sent through one of the tunnels identified in
   the Tunnel Encapsulation attribute of UPDATE U2. See Section 3 for
   further details.

   Note that if UPDATE U1 and UPDATE U2 both have Tunnel Encapsulation
   attributes, packet P will be carried through a pair of nested
   tunnels. P will first be encapsulated based on the Tunnel
   Encapsulation attribute of U1. This encapsulated packet then becomes
   the payload, and is encapsulated based on the Tunnel Encapsulation
   attribute of U2. This is another way of "stacking" tunnels (see also
   Section 3).

6. Tunnel Encapsulation Extended Community

   [RFC5512] defines an Encapsulation Extended Community. This Extended
   Community may be attached to a route of any AFI/SAFI to which the
   Tunnel Encapsulation attribute may be attached.
   Each such Extended Community identifies a particular tunnel type. If
   the Encapsulation Extended Community identifies a particular tunnel
   type, its semantics are exactly equivalent to the semantics of a
   Tunnel Encapsulation attribute TLV that:

   o  identifies the same tunnel type, and

   o  has a Remote Endpoint sub-TLV whose IP address field contains the
      address of the BGP next hop of the route to which it is attached,
      and

   o  has no other sub-TLVs.

7. Use of Virtual Network Identifiers and Embedded Labels when Imposing
   a Tunnel Encapsulation

   Three of the tunnel types that can be specified in a Tunnel
   Encapsulation TLV have virtual network identifier fields in their
   encapsulation headers. In the VXLAN and VXLAN-GPE encapsulations,
   this field is called the VNI field; in the NVGRE encapsulation, this
   field is called the VSID field.

   When one of these tunnel encapsulations is imposed on a packet, the
   setting of the virtual network identifier field in the encapsulation
   header depends upon the contents of the Encapsulation sub-TLV (if one
   is present). When the Tunnel Encapsulation attribute is being
   carried on a BGP UPDATE of a labeled address family, the setting of
   the virtual network identifier field also depends upon the contents
   of the Embedded Label Handling sub-TLV (if present).

   This section specifies the procedures for choosing the value to set
   in the virtual network identifier field of the encapsulation header.
   These procedures apply only when the tunnel type is VXLAN, VXLAN-GPE,
   or NVGRE.

7.1. Unlabeled Address Families

   This sub-section applies when:

   o  the Tunnel Encapsulation attribute is carried on a BGP UPDATE of
      an unlabeled address family, and

   o  at least one of the attribute's TLVs identifies a tunnel type that
      uses a virtual network identifier, and

   o  it has been determined to send a packet through one of those
      tunnels.

   If the TLV identifying the tunnel contains an Encapsulation sub-TLV
   whose V bit is set, the virtual network identifier field of the
   encapsulation header is set to the value of the virtual network
   identifier field of the Encapsulation sub-TLV.

   Otherwise, the virtual network identifier field of the encapsulation
   header is set to a configured value; if there is no configured value,
   the tunnel cannot be used.

7.2. Labeled Address Families

   This sub-section applies when:

   o  the Tunnel Encapsulation attribute is carried on a BGP UPDATE of a
      labeled address family, and

   o  at least one of the attribute's TLVs identifies a tunnel type that
      uses a virtual network identifier, and

   o  it has been determined to send a packet through one of those
      tunnels.

7.2.1. When a Valid VNID has been Signaled

   If the TLV identifying the tunnel contains an Encapsulation sub-TLV
   whose V bit is set, the virtual network identifier field of the
   encapsulation header is set as follows:

   o  If the TLV does not contain an Embedded Label Handling sub-TLV, or
      if it contains an Embedded Label Handling sub-TLV whose value is
      1, then the virtual network identifier field of the encapsulation
      header is set to the value of the virtual network identifier field
      of the Encapsulation sub-TLV.

      The embedded label (from the NLRI of the route that is carrying
      the Tunnel Encapsulation attribute) appears at the top of the MPLS
      label stack in the encapsulation payload.

   o  If the TLV contains an Embedded Label Handling sub-TLV whose value
      is 2, the embedded label is ignored entirely, and the virtual
      network identifier field of the encapsulation header is set to the
      value of the virtual network identifier field of the Encapsulation
      sub-TLV.

7.2.2. When a Valid VNID has not been Signaled

   If the TLV identifying the tunnel does not contain an Encapsulation
   sub-TLV whose V bit is set, the virtual network identifier field of
   the encapsulation header is set as follows:

   o  If the TLV does not contain an Embedded Label Handling sub-TLV, or
      if it contains an Embedded Label Handling sub-TLV whose value is
      1, then the virtual network identifier field of the encapsulation
      header is set to a configured value.

      If there is no configured value, the tunnel cannot be used.

      The embedded label (from the NLRI of the route that is carrying
      the Tunnel Encapsulation attribute) appears at the top of the MPLS
      label stack in the encapsulation payload.

   o  If the TLV contains an Embedded Label Handling sub-TLV whose value
      is 2, the embedded label is copied into the virtual network
      identifier field of the encapsulation header.

      The embedded label does not appear in the MPLS label stack of the
      payload.

7.2.3. Applicability Restrictions

   In a given UPDATE of a labeled address family, the label embedded in
   the NLRI is generally a label that is meaningful only to the router
   whose address appears as the next hop. Certain of the procedures of
   Section 7.2.1 or Section 7.2.2 cause the embedded label to be carried
   by a data packet to the router whose address appears in the Remote
   Endpoint sub-TLV. If the Remote Endpoint sub-TLV does not identify
   the same router that is the next hop, sending the packet through the
   tunnel may cause the label to be misinterpreted at the tunnel's
   remote endpoint. This may cause misdelivery of the packet.

   Therefore the embedded label MUST NOT be carried by a data packet
   traveling through a tunnel unless it is known that the label will be
   properly interpreted at the tunnel's remote endpoint. How this is
   known is outside the scope of this document.
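
   The selection rules of Sections 7.1, 7.2.1, and 7.2.2 can be
   summarized by the following Python sketch. It is an illustration
   only. The function select_vni, the Encap result type, and all
   parameter names are hypothetical. A signaled_vni of None stands for
   "no Encapsulation sub-TLV with the V bit set", and a handling of None
   stands for "no Embedded Label Handling sub-TLV present". The sketch
   does not model the applicability restriction stated above; the caller
   is assumed to have established that the label will be interpreted
   properly at the tunnel's remote endpoint.

```python
from typing import NamedTuple, Optional


class Encap(NamedTuple):
    vni: int                 # value for the VNI/VSID field of the header
    label_in_payload: bool   # True if the embedded label tops the MPLS stack


def select_vni(labeled: bool,
               signaled_vni: Optional[int],
               handling: Optional[int],
               embedded_label: Optional[int],
               configured_vni: Optional[int]) -> Optional[Encap]:
    """Return the encapsulation choice, or None if the tunnel cannot be used."""
    if handling not in (None, 1, 2):
        raise ValueError("handling must be None, 1, or 2")

    if not labeled:
        # Section 7.1: the Embedded Label Handling sub-TLV is a no-op here.
        vni = signaled_vni if signaled_vni is not None else configured_vni
        return None if vni is None else Encap(vni, False)

    if handling == 2:
        # Section 7.2.1: signaled VNI is used and the label is ignored.
        # Section 7.2.2: the embedded label is copied into the VNI field.
        vni = signaled_vni if signaled_vni is not None else embedded_label
        return None if vni is None else Encap(vni, False)

    # Sub-TLV absent or value 1: the label is carried in the payload, and
    # the VNI is the signaled value, else a configured value, else unusable.
    vni = signaled_vni if signaled_vni is not None else configured_vni
    return None if vni is None else Encap(vni, True)
```

   For example, select_vni(True, None, 2, 17, 5) yields a header whose
   virtual network identifier field carries the embedded label 17, with
   no label in the payload.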

   Note that if the Tunnel Encapsulation attribute is attached to a
   VPN-IP route [RFC4364], and if Inter-AS "option b" (see section 10 of
   [RFC4364]) is being used, and if the Remote Endpoint sub-TLV contains
   an IP address that is not in the same AS as the router receiving the
   route, it is very likely that the embedded label has been changed.
   Therefore use of the Tunnel Encapsulation attribute in an "Inter-AS
   option b" scenario is not supported.

8. Scoping

   The Tunnel Encapsulation attribute is defined as a transitive
   attribute, so that it may be passed along by BGP speakers that do not
   recognize it. However, it is intended that the Tunnel Encapsulation
   attribute be used only within a well-defined scope, e.g., within a
   set of Autonomous Systems that belong to a single administrative
   entity. If the attribute is distributed beyond its intended scope,
   packets may be sent through tunnels in a manner that is not intended.

   To prevent the Tunnel Encapsulation attribute from being distributed
   beyond its intended scope, any BGP speaker that understands the
   attribute MUST be able to filter the attribute from incoming BGP
   UPDATE messages. When the attribute is filtered from an incoming
   UPDATE, the attribute is neither processed nor redistributed. This
   filtering SHOULD be possible on a per-BGP-session basis. For each
   session, filtering of the attribute on incoming UPDATEs MUST be
   enabled by default.

   In addition, any BGP speaker that understands the attribute MUST be
   able to filter the attribute from outgoing BGP UPDATE messages. This
   filtering SHOULD be possible on a per-BGP-session basis. For each
   session, filtering of the attribute on outgoing UPDATEs MUST be
   enabled by default.

9. Error Handling

   The Tunnel Encapsulation attribute is a sequence of TLVs, each of
   which is a sequence of sub-TLVs.
   The final octet of a TLV is determined by its length field.
   Similarly, the final octet of a sub-TLV is determined by its length
   field. The final octet of a TLV must also be the final octet of its
   final sub-TLV. If this is not the case, the TLV MUST be considered
   invalid. A TLV that is found to be invalid for this reason MUST NOT
   be processed, and MUST be stripped from the Tunnel Encapsulation
   attribute before redistribution. Subsequent TLVs in the Tunnel
   Encapsulation attribute may still be valid, in which case they MUST
   be processed and redistributed normally.

   If a Tunnel Encapsulation attribute does not have any valid TLVs, or
   it does not have the transitive bit set, the "Attribute Discard"
   procedure of [ERRORS] is applied.

   If a Tunnel Encapsulation attribute can be parsed correctly, but
   contains a TLV that is not recognized (i.e., the tunnel type is not
   recognized) by a particular BGP speaker, the attribute is NOT
   considered to be malformed. The unrecognized TLV MUST be ignored,
   and the BGP speaker MUST interpret the attribute as if the
   unrecognized TLV had not been present. If the route carrying the
   Tunnel Encapsulation attribute is redistributed with the attribute,
   the unrecognized TLV SHOULD remain in the attribute.

   If a TLV of a Tunnel Encapsulation attribute contains a sub-TLV that
   is not recognized by a particular BGP speaker, the BGP speaker SHOULD
   process that TLV as if the unrecognized sub-TLV had not been present.
   If the route carrying the Tunnel Encapsulation attribute is
   redistributed with the attribute, the unrecognized sub-TLV SHOULD
   remain in the attribute.

   In general, if a TLV contains a sub-TLV that is invalid (e.g.,
   contains a length field whose value is not legal for that sub-TLV),
   the sub-TLV should be treated as if it were an unrecognized sub-TLV.
   This document specifies one exception to this rule: if a TLV
   contains an invalid Remote Endpoint sub-TLV (as defined in
   Section 2.1), the entire TLV MUST be ignored, and SHOULD be removed
   from the Tunnel Encapsulation attribute before the route carrying
   that attribute is redistributed.

   A TLV that does not contain the Remote Endpoint sub-TLV MUST be
   treated as if it contained an invalid Remote Endpoint sub-TLV.

   A TLV identifying a particular tunnel type may contain a sub-TLV that
   is meaningless for that tunnel type. For example, perhaps the TLV
   contains a "UDP Destination Port" sub-TLV, but the identified tunnel
   type does not use UDP encapsulation at all. Sub-TLVs of this sort
   SHOULD be treated as no-ops. That is, they SHOULD NOT affect the
   creation of the encapsulation header. However, the sub-TLV MUST NOT
   be considered to be invalid, and MUST NOT be removed from the TLV
   before the route carrying the Tunnel Encapsulation attribute is
   redistributed.

   There is no significance to the order in which the TLVs occur within
   the Tunnel Encapsulation attribute. Multiple TLVs may occur for a
   given tunnel type; each such TLV is regarded as describing a
   different tunnel.

10. IANA Considerations

   IANA is requested to assign a codepoint from the "BGP Tunnel
   Encapsulation Attribute Sub-TLVs" registry for "Remote Endpoint",
   with this document being the reference.

   IANA is requested to assign a codepoint from the "BGP Tunnel
   Encapsulation Attribute Sub-TLVs" registry for "IPv4 DS Field", with
   this document being the reference.

   IANA is requested to assign a codepoint from the "BGP Tunnel
   Encapsulation Attribute Sub-TLVs" registry for "UDP Destination
   Port", with this document being the reference.

   IANA is requested to assign a codepoint from the "BGP Tunnel
   Encapsulation Attribute Sub-TLVs" registry for "Embedded Label
   Handling", with this document being the reference.

   IANA is requested to add this document as a reference for tunnel
   types 8-13 in the "BGP Tunnel Encapsulation Tunnel Types" registry.

11. Security Considerations

   The Tunnel Encapsulation attribute can cause traffic to be diverted
   from its normal path, especially when the Remote Endpoint sub-TLV is
   used. This can have serious consequences if the attribute is added
   or modified illegitimately, as it enables traffic to be "hijacked".

   The Remote Endpoint sub-TLV contains both an IP address and an AS
   number. BGP Origin Validation [RFC6811] can be used to obtain
   assurance that the given IP address belongs to the given AS. While
   this provides some protection against misconfiguration, it does not
   prevent a malicious agent from inserting a sub-TLV that will appear
   valid.

   Before sending a packet through the tunnel identified in a particular
   TLV of a Tunnel Encapsulation attribute, it may be advisable to use
   BGP Origin Validation to obtain the following additional assurances:

   o  the origin AS of the route carrying the Tunnel Encapsulation
      attribute is correct;

   o  the origin AS of the route to the IP address specified in the
      Remote Endpoint sub-TLV is correct, and is the same AS that is
      specified in the Remote Endpoint sub-TLV.

   One then has some level of assurance that the tunneled traffic is
   going to the same destination AS that it would have gone to had the
   Tunnel Encapsulation attribute not been present. However, this may
   not suit all use cases, and in any event is not very strong
   protection against hijacking.

   For these reasons, BGP Origin Validation should not be relied upon
   exclusively, and the filtering procedures of Section 8 should always
   be in place.

   Increased protection can be obtained by using BGP Path Validation
   [BGPSEC] to ensure that the route carrying the Tunnel Encapsulation
   attribute, and the routes to the Remote Endpoint of each specified
   tunnel, have not been altered illegitimately.

   If BGP Origin Validation is used as specified above, and the tunnel
   specified in a particular TLV of a Tunnel Encapsulation attribute is
   therefore regarded as "suspicious", that tunnel should not be used.
   Other tunnels specified in (other TLVs of) the Tunnel Encapsulation
   attribute may still be used.

12. Acknowledgments

   The authors wish to thank Ron Bonica, John Drake, Satoru Matsushima,
   Dhananjaya Rao, John Scudder, and Ravi Singh for their review,
   comments, and/or helpful discussions.

13. Contributor Addresses

   Below is a list of other contributing authors in alphabetical order:

   Randy Bush
   Internet Initiative Japan
   5147 Crystal Springs
   Bainbridge Island, Washington 98110
   United States

   Email: randy@psg.com

   Robert Raszuk
   Mirantis Inc.
   615 National Ave. #100
   Mountain View, California 94043
   United States

   Email: robert@raszuk.net

14. References

14.1. Normative References

   [ERRORS]   Chen, E., Scudder, J., Mohapatra, P., and K. Patel,
              "Revised Error Handling for BGP UPDATE Messages",
              internet-draft draft-ietf-idr-error-handling-19, April
              2015.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC5512]  Mohapatra, P. and E. Rosen, "The BGP Encapsulation
              Subsequent Address Family Identifier (SAFI) and the BGP
              Tunnel Encapsulation Attribute", RFC 5512, April 2009.

14.2. Informative References

   [BGPSEC]   Lepinski, M.
              and S. Turner, "An Overview of BGPsec",
              internet-draft draft-ietf-sidr-bgpsec-overview, January
              2015.

   [GTP-U]    3GPP, "GPRS Tunneling Protocol User Plane, TS 29.281",
              2014.

   [NVGRE]    Garg, P. and Y. Wang, "NVGRE: Network Virtualization using
              Generic Routing Encapsulation", internet-draft
              draft-sridharan-virtualization-nvgre, April 2015.

   [RFC2474]  Nichols, K., Blake, S., Baker, F., and D. Black,
              "Definition of the Differentiated Services Field (DS
              Field) in the IPv4 and IPv6 Headers", RFC 2474, December
              1998.

   [RFC2784]  Farinacci, D., Li, T., Hanks, S., Meyer, D., and P.
              Traina, "Generic Routing Encapsulation (GRE)", RFC 2784,
              March 2000.

   [RFC2890]  Dommety, G., "Key and Sequence Number Extensions to GRE",
              RFC 2890, September 2000.

   [RFC4023]  Worster, T., Rekhter, Y., and E. Rosen, "Encapsulating
              MPLS in IP or Generic Routing Encapsulation (GRE)", RFC
              4023, March 2005.

   [RFC4364]  Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private
              Networks (VPNs)", RFC 4364, February 2006.

   [RFC6811]  Mohapatra, P., Scudder, J., Ward, D., Bush, R., and R.
              Austein, "BGP Prefix Origin Validation", RFC 6811, January
              2013.

   [RFC7348]  Mahalingam, M., Dutt, D., Duda, K., Agarwal, P., Kreeger,
              L., Sridhar, T., Bursell, M., and C. Wright, "Virtual
              eXtensible Local Area Network (VXLAN): A Framework for
              Overlaying Virtualized Layer 2 Networks over Layer 3
              Networks", RFC 7348, August 2014.

   [RFC7510]  Xu, X., Sheth, N., Yong, L., Callon, R., and D. Black,
              "Encapsulating MPLS in UDP", RFC 7510, April 2015.

   [vEPC]     Matsushima, S. and R. Wakikawa, "Stateless User-Plane
              Architecture for Virtualized EPC", internet-draft
              draft-matsushima-stateless-uplane-vepc-04, March 2015.

   [VXLAN-GPE]
              Quinn, P., Manur, R., Kreeger, L., Lewis, D., Maino, F.,
              Smith, M., Agarwal, P., Xu, X., Elzur, U., Garg, P.,
              Melman, D., and R.
              Manur, "Generic Protocol Extension for
              VXLAN", internet-draft draft-ietf-nvo3-vxlan-gpe, May
              2015.

Authors' Addresses

   Eric C. Rosen (editor)
   Juniper Networks, Inc.
   10 Technology Park Drive
   Westford, Massachusetts 01886
   United States

   Email: erosen@juniper.net

   Keyur Patel
   Cisco Systems
   170 W. Tasman Drive
   San Jose, CA 95134
   United States

   Email: keyupate@cisco.com

   Gunter Van de Velde
   Alcatel-Lucent
   Copernicuslaan 50
   Antwerpen 2018
   Belgium

   Email: gunter.van_de_velde@alcatel-lucent.com