idnits 2.17.1 draft-rosen-idr-tunnel-encaps-01.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- No issues found here. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year == The document seems to lack the recommended RFC 2119 boilerplate, even if it appears to use RFC 2119 keywords -- however, there's a paragraph with a matching beginning. Boilerplate error? (The document does seem to have the reference to RFC 2119 which the ID-Checklist requires). (Using the creation date from RFC5512, updated by this document, for RFC5378 checks: 2008-01-24) -- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. If not, you may need to add the pre-RFC5378 disclaimer. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- The document date (July 6, 2015) is 3209 days in the past. Is this intentional? 
Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) ** Obsolete normative reference: RFC 5512 (Obsoleted by RFC 9012) Summary: 1 error (**), 0 flaws (~~), 2 warnings (==), 2 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 IDR Working Group E. Rosen, Ed. 3 Internet-Draft Juniper Networks, Inc. 4 Updates: 5512 (if approved) K. Patel 5 Intended status: Standards Track Cisco Systems 6 Expires: January 7, 2016 G. Van de Velde 7 Alcatel-Lucent 8 July 6, 2015 10 Using the BGP Tunnel Encapsulation Attribute without the BGP 11 Encapsulation SAFI 12 draft-rosen-idr-tunnel-encaps-01 14 Abstract 16 RFC 5512 defines a BGP Path Attribute known as the "Tunnel 17 Encapsulation Attribute". This attribute allows one to specify a set 18 of tunnels. For each such tunnel, the attribute can provide 19 additional information used to create a tunnel and the corresponding 20 encapsulation header, and can also provide information that aids in 21 choosing whether a particular packet is to be sent through a 22 particular tunnel. RFC 5512 states that the attribute is only 23 carried in BGP UPDATEs that have the "Encapsulation Subsequent 24 Address Family (Encapsulation SAFI)". This document updates RFC 5512 25 by removing that restriction, and by specifying semantics for the 26 attribute when it is carried in UPDATEs of certain other SAFIs. This 27 document also extends the attribute by enabling it to carry 28 additional information needed to create the encapsulation headers 29 for additional tunnel types not mentioned in RFC 5512. Finally, this 30 document also extends the attribute by allowing it to specify a 31 remote tunnel endpoint address for each tunnel. 
33 Status of This Memo 35 This Internet-Draft is submitted in full conformance with the 36 provisions of BCP 78 and BCP 79. 38 Internet-Drafts are working documents of the Internet Engineering 39 Task Force (IETF). Note that other groups may also distribute 40 working documents as Internet-Drafts. The list of current Internet- 41 Drafts is at http://datatracker.ietf.org/drafts/current/. 43 Internet-Drafts are draft documents valid for a maximum of six months 44 and may be updated, replaced, or obsoleted by other documents at any 45 time. It is inappropriate to use Internet-Drafts as reference 46 material or to cite them other than as "work in progress." 48 This Internet-Draft will expire on January 7, 2016. 50 Copyright Notice 52 Copyright (c) 2015 IETF Trust and the persons identified as the 53 document authors. All rights reserved. 55 This document is subject to BCP 78 and the IETF Trust's Legal 56 Provisions Relating to IETF Documents 57 (http://trustee.ietf.org/license-info) in effect on the date of 58 publication of this document. Please review these documents 59 carefully, as they describe your rights and restrictions with respect 60 to this document. Code Components extracted from this document must 61 include Simplified BSD License text as described in Section 4.e of 62 the Trust Legal Provisions and are provided without warranty as 63 described in the Simplified BSD License.
65 Table of Contents
67 1. Introduction
68 2. Tunnel Encapsulation Attribute Sub-TLVs
69 2.1. The Remote Endpoint Sub-TLV
70 2.2. Encapsulation Sub-TLVs for Particular Tunnel Types
71 2.2.1. VXLAN
72 2.2.2. VXLAN-GPE
73 2.2.3. NVGRE
74 2.2.4. GTP
75 2.2.5. MPLS-in-GRE
76 2.3. Outer Encapsulation Sub-TLVs
77 2.3.1. IPv4 DS Field
78 2.3.2. UDP Destination Port
79 2.4. Embedded Label Handling Sub-TLV
80 3. Semantics and Usage of the Tunnel Encapsulation 81 attribute
82 4. Routing Considerations
83 4.1. No Impact on BGP Decision Process
84 4.2. Looping, Infinite Stacking, Etc.
85 5. Recursive Next Hop Resolution
86 6. Tunnel Encapsulation Extended Community
87 7. Use of Virtual Network Identifiers and Embedded Labels 88 when Imposing a Tunnel Encapsulation
89 7.1. Unlabeled Address Families
90 7.2. Labeled Address Families
91 7.2.1. When a Valid VNID has been Signaled
92 7.2.2. When a Valid VNID has not been Signaled
93 7.2.3. Applicability Restrictions
94 8. Scoping
95 9. Error Handling
96 10. IANA Considerations
97 11. Security Considerations
98 12. Acknowledgments
99 13. Contributor Addresses
100 14. References
101 14.1. Normative References
102 14.2. Informative References
103 Authors' Addresses
105 1. 
Introduction 107 [RFC5512] defines a BGP Path Attribute known as the Tunnel 108 Encapsulation attribute. This attribute consists of one or more 109 TLVs. Each TLV identifies a particular type of tunnel. Each TLV 110 also contains one or more sub-TLVs. Some of the sub-TLVs, e.g., the 111 "Encapsulation sub-TLV", contain information that may be used to form 112 the encapsulation header for the specified tunnel type. Other sub- 113 TLVs, e.g., the "color sub-TLV" and the "protocol sub-TLV", contain 114 information that aids in determining whether particular packets 115 should be sent through the tunnel that the TLV identifies. 117 [RFC5512] only allows the Tunnel Encapsulation attribute to be 118 attached to BGP UPDATE messages that have the "Encapsulation SAFI" 119 (i.e., UPDATE messages with AFI/SAFI 1/7 or 2/7). In an UPDATE of 120 the Encapsulation SAFI, the NLRI is an address of the BGP speaker 121 originating the UPDATE. Consider the following scenario: 123 o BGP speaker R1 has received and installed UPDATE U; 125 o UPDATE U's SAFI is the Encapsulation SAFI; 127 o UPDATE U has the address R2 as its NLRI; 129 o UPDATE U has a Tunnel Encapsulation attribute. 131 o R1 has a packet, P, to transmit to destination D; 133 o R1's best path to D is a BGP route that has R2 as its next hop; 135 In this scenario, when R1 transmits packet P, it should transmit it 136 to R2 through one of the tunnels specified in U's Tunnel 137 Encapsulation attribute. The IP address of the remote endpoint of 138 each such tunnel is R2. Packet P is known as the tunnel's "payload". 140 While the ability to specify tunnel information in a BGP UPDATE is 141 useful, the procedures of [RFC5512] have certain limitations: 143 o The requirement to use the "Encapsulation SAFI" presents an 144 unfortunate operational cost, as each BGP session that may need to 145 carry tunnel encapsulation information needs to be reconfigured to 146 support the Encapsulation SAFI. 
148 o There is no way to use the Tunnel Encapsulation attribute to 149 specify the remote endpoint address of a given tunnel; [RFC5512] 150 assumes that the remote endpoint of each tunnel is specified as 151 the NLRI of an UPDATE of the Encapsulation SAFI. 153 o If the respective best paths to two different address prefixes 154 have the same next hop, [RFC5512] does not provide a 155 straightforward method to associate each prefix with a different 156 tunnel. 158 In this document we address these deficiencies by: 160 o Defining a new "Remote Endpoint sub-TLV" that can be 161 included in any of the TLVs contained in the Tunnel Encapsulation 162 attribute. This sub-TLV can be used to specify the remote 163 endpoint address of a particular tunnel. 165 o Allowing the Tunnel Encapsulation attribute to be carried by BGP 166 UPDATEs of additional AFI/SAFIs. Appropriate semantics are 167 provided for this way of using the attribute. 169 One of the sub-TLVs defined in [RFC5512] is the "Encapsulation sub- 170 TLV". For a given tunnel, the encapsulation sub-TLV specifies some 171 of the information needed to construct the encapsulation header used 172 when sending packets through that tunnel. This document defines 173 encapsulation sub-TLVs for a number of tunnel types not discussed in 174 [RFC5512]: VXLAN, VXLAN-GPE, NVGRE, GTP, and MPLS-in-GRE. MPLS-in- 175 UDP [RFC7510] is also supported, but an Encapsulation sub-TLV for it 176 is not needed. 178 Some of the encapsulations mentioned in the previous paragraph need 179 to be further encapsulated inside UDP and/or IP. [RFC5512] provides 180 no way to specify that certain information is to appear in these 181 outer IP and/or UDP encapsulations. This document provides a 182 framework for including such information in the TLVs of the Tunnel 183 Encapsulation attribute. 
185 When the Tunnel Encapsulation attribute is attached to a BGP UPDATE 186 whose AFI/SAFI identifies one of the labeled address families, it is 187 not always obvious whether the label embedded in the NLRI is to 188 appear somewhere in the tunnel encapsulation header (and if so, 189 where), or whether it is to appear in the payload, or whether it can 190 be omitted altogether. This is especially true if the tunnel 191 encapsulation header itself contains a "virtual network identifier". 192 This document provides a mechanism that allows one to signal (by 193 using sub-TLVs of the Tunnel Encapsulation attribute) how one wants 194 to use the embedded label when the tunnel encapsulation has its own 195 virtual network identifier field. 197 [RFC5512] defines a Tunnel Encapsulation Extended Community, that can 198 be used instead of the Tunnel Encapsulation attribute under certain 199 circumstances. This document addresses the issue of how to handle a 200 BGP UPDATE that carries both a Tunnel Encapsulation attribute and one 201 or more Tunnel Encapsulation Extended Communities. 203 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 204 "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and 205 "OPTIONAL", when and only when appearing in all capital letters, are 206 to be interpreted as described in [RFC2119]. 208 2. Tunnel Encapsulation Attribute Sub-TLVs 210 [RFC5512] specifies three sub-TLVs for the Tunnel Encapsulation 211 attribute: the Encapsulation sub-TLV, the Color sub-TLV, and the 212 Protocol Type sub-TLV. In this section we specify a number of 213 additional sub-TLVs. We also specify Encapsulation sub-TLVs for a 214 number of tunnel types that are not mentioned in [RFC5512]. 216 2.1. The Remote Endpoint Sub-TLV 218 The Remote Endpoint sub-TLV is a sub-TLV whose value field contains 219 three sub-fields: 221 1. a four-octet Autonomous System (AS) number sub-field 223 2. a two-octet Address Family sub-field 225 3. 
an address sub-field, whose length depends upon the Address 226 Family. 228 0 1 2 3 229 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 230 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 231 | Autonomous System Number | 232 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 233 | Address Family | Address ~ 234 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + 235 ~ ~ 236 | | 237 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 239 Figure 1: Remote Endpoint Sub-TLV Value Field 241 The Address Family subfield contains a value from IANA's "Address 242 Family Numbers" registry. In this document, we assume that the 243 Address Family is either IPv4 or IPv6; use of other address families 244 is outside the scope of this document. 246 If the Address Family subfield contains the value for IPv4, the 247 address subfield must contain an IPv4 address (a /32 IPv4 prefix). 248 In this case, the length field of Remote Endpoint sub-TLV must 249 contain the value 10 (0xa). IPv4 broadcast addresses are not valid 250 values of this field. 252 If the Address Family subfield contains the value for IPv6, the 253 address sub-field must contain an IPv6 address (a /128 IPv6 prefix). 254 In this case, the length field of Remote Endpoint sub-TLV must 255 contain the value 22 (0x16). IPv6 link local addresses are not valid 256 values of the IP address field. 258 In a given BGP UPDATE, the address family (IPv4 or IPv6) of a Remote 259 Endpoint sub-TLV is independent of the address family of the UPDATE 260 itself. For example, an UPDATE whose NLRI is an IPv4 address may 261 have a Tunnel Encapsulation attribute containing Remote Endpoint sub- 262 TLVs that contain IPv6 addresses. Also, different tunnels 263 represented in the Tunnel Encapsulation attribute may have Remote 264 Endpoints of different address families. 
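The layout in Figure 1 can be illustrated with a minimal Python sketch, assuming the IANA Address Family Numbers 1 (IPv4) and 2 (IPv6). The function name `encode_remote_endpoint_value` and the example AS number and addresses are hypothetical and are not part of the draft; the sketch builds only the value field, not the sub-TLV type and length octets.

```python
import ipaddress
import struct

# IANA "Address Family Numbers" registry values for IPv4 and IPv6.
AF_IPV4 = 1
AF_IPV6 = 2


def encode_remote_endpoint_value(as_number: int, address: str) -> bytes:
    """Build the value field: 4-octet AS number, 2-octet Address Family, address."""
    ip = ipaddress.ip_address(address)
    family = AF_IPV4 if ip.version == 4 else AF_IPV6
    # "!IH" packs a 4-octet and a 2-octet unsigned integer in network byte order.
    return struct.pack("!IH", as_number, family) + ip.packed


# Illustrative values only (private-use AS number, documentation prefixes).
v4 = encode_remote_endpoint_value(64512, "192.0.2.1")
v6 = encode_remote_endpoint_value(64512, "2001:db8::1")
print(len(v4), len(v6))  # 10 22
```

The two lengths printed match the values 10 (0xa) and 22 (0x16) that the text requires for IPv4 and IPv6 respectively.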
266 A two-octet AS number can be carried in the AS number field by 267 setting the two high order octets to zero, and carrying the number in 268 the two low order octets of the field. 270 The AS number in the sub-TLV MUST be the number of the AS to which 271 the IP address in the sub-TLV belongs. 273 There is one special case: the Remote Endpoint sub-TLV MAY have a 274 value field whose Address Family subfield contains 0. This means 275 that the tunnel's remote endpoint is the UPDATE's BGP next hop. If 276 the Address Family subfield contains 0, the Address subfield is 277 omitted, and the Autonomous System number field is set to 0. 279 If any of the following conditions holds, the Remote Endpoint sub-TLV 280 is considered to be "malformed": 282 o The sub-TLV contains the value for IPv4 in its Address Family 283 subfield, but the length of the sub-TLV's value field is other 284 than 10 (0xa). 286 o The sub-TLV contains the value for IPv6 in its Address Family 287 subfield, but the length of the sub-TLV's value field is other 288 than 22 (0x16). 290 o The sub-TLV contains the value zero in its Address Family subfield, 291 but the length of the sub-TLV's value field is other than 6, or 292 the Autonomous System subfield is not set to zero. 294 o The IP address in the sub-TLV's address subfield is not a valid IP 295 address (e.g., it's an IPv4 broadcast address). 297 o It can be determined that the IP address in the sub-TLV's address 298 subfield does not belong to the non-zero AS whose number is in 299 its Autonomous System subfield. (See Section 11 for 300 discussion of one way to determine this.) 302 If the Remote Endpoint sub-TLV is malformed, the TLV containing it is 303 also considered to be malformed, and the entire TLV MUST be ignored. 304 However, the Tunnel Encapsulation attribute SHOULD NOT be considered 305 to be malformed in this case; other TLVs in the attribute SHOULD be 306 processed (if they can be parsed correctly). 
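The malformation conditions listed above can be sketched as a single check over the value field. This is an illustration under stated assumptions, not a definitive implementation: the function name is hypothetical, only the limited broadcast address 255.255.255.255 is recognized as an IPv4 broadcast address (subnet broadcasts cannot be detected without more context), address families other than 0, IPv4, and IPv6 are rejected because they are outside the scope of the text, and the AS ownership condition is omitted because it needs external data.

```python
import ipaddress
import struct

AF_NEXT_HOP = 0  # special case: the remote endpoint is the UPDATE's BGP next hop
AF_IPV4 = 1
AF_IPV6 = 2
# Required value-field length for each Address Family value.
EXPECTED_LENGTH = {AF_NEXT_HOP: 6, AF_IPV4: 10, AF_IPV6: 22}


def is_malformed_remote_endpoint(value: bytes) -> bool:
    """Return True if the value field meets one of the checkable conditions."""
    if len(value) < 6:
        return True  # too short to hold the AS number and Address Family
    as_number, family = struct.unpack("!IH", value[:6])
    if family not in EXPECTED_LENGTH:
        return True  # assumption: out-of-scope families are rejected here
    if len(value) != EXPECTED_LENGTH[family]:
        return True
    if family == AF_NEXT_HOP:
        return as_number != 0
    if family == AF_IPV4:
        # Assumption: only the limited broadcast address is detected.
        return ipaddress.IPv4Address(value[6:]) == ipaddress.IPv4Address("255.255.255.255")
    return ipaddress.IPv6Address(value[6:]).is_link_local


ok = struct.pack("!IH", 64512, AF_IPV4) + bytes([192, 0, 2, 1])
bad = struct.pack("!IH", 64512, AF_IPV4) + bytes([255, 255, 255, 255])
print(is_malformed_remote_endpoint(ok), is_malformed_remote_endpoint(bad))  # False True
```

A caller that gets `True` would ignore the containing TLV and continue processing the other TLVs of the attribute, as the text requires.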
308 When redistributing a route that is carrying a Tunnel Encapsulation 309 attribute containing a TLV that itself contains a malformed Remote 310 Endpoint sub-TLV, the TLV SHOULD be removed from the attribute before 311 redistribution. 313 See Section 9 for further discussion of how to handle errors that are 314 encountered when parsing the Tunnel Encapsulation attribute. 316 If the Remote Endpoint sub-TLV contains an IPv4 or IPv6 address that 317 is valid but not reachable, the sub-TLV is NOT considered to be 318 malformed, and the containing TLV SHOULD NOT be removed from the 319 attribute before redistribution. However, the tunnel identified by 320 the TLV containing that sub-TLV cannot be used until such time as the 321 address becomes reachable. See Section 3. 323 2.2. Encapsulation Sub-TLVs for Particular Tunnel Types 325 Tunnel Encapsulation sub-TLVs for the following tunnel types are 326 defined in [RFC5512]: L2TPv3, and GRE. 328 This section defines Tunnel Encapsulation sub-TLVs for the following 329 tunnel types: VXLAN ([RFC7348]), VXLAN-GPE ([VXLAN-GPE]), NVGRE 330 ([NVGRE]), GTP [GTP-U], and MPLS-in-GRE ([RFC2784], [RFC2890], 331 [RFC4023]). 333 Rules for forming the encapsulation based on the information in a 334 given TLV are given in Section 7. 336 2.2.1. VXLAN 338 This document defines an encapsulation sub-TLV for VXLAN tunnels. 
339 When the tunnel type is VXLAN, the following is the structure of the 340 value field in the encapsulation sub-TLV: 342 0 1 2 3 343 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 344 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 345 |V|M|R|R|R|R|R|R| VN-ID (3 Octets) | 346 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 347 | MAC Address (4 Octets) | 348 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 349 | MAC Address (2 Octets) | Reserved | 350 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 352 Figure 2: VXLAN Encapsulation Sub-TLV 354 V: This bit is set to 1 to indicate that a valid VN-ID is present 355 in the encapsulation sub-TLV. 357 M: This bit is set to 1 to indicate that a valid MAC Address is 358 present in the encapsulation sub-TLV. 360 R: The remaining bits in the 8-bit flags field are reserved for 361 future use. They SHOULD always be set to 0. 363 VN-ID: If the V bit is set, the VN-ID field contains a 3 octet VN- 364 ID value. If the V bit is not set, the VN-ID field SHOULD be set 365 to zero. 367 MAC Address: If the M bit is set, this field contains a 6 octet 368 Ethernet MAC address. If the M bit is not set, this field SHOULD 369 be set to all zeroes. 371 When forming the VXLAN encapsulation header: 373 o The values of the V, M, and R bits are NOT copied into the flags 374 field of the VXLAN header. The flags field of the VXLAN header is 375 set as per [RFC7348]. 377 o If the M bit is set, the MAC Address is copied into the Inner 378 Destination MAC Address field of the Inner Ethernet Header (see 379 Section 5 of [RFC7348]). If the M bit is not set, the Inner 380 Destination MAC address field is set to a configured value. If 381 the M bit is not set, and there is no configured value, the VXLAN 382 tunnel cannot be used. 384 o See Section 7 to see how the VNI field of the VXLAN encapsulation 385 header is set. 
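The value field in Figure 2 can be sketched as a pair of Python helpers. This is a minimal illustration, assuming that bit 0 in the figure is the most significant bit of the flags octet (so V is 0x80 and M is 0x40) and that the value field is the 12 octets the figure shows; the function names and the example VN-ID and MAC address are hypothetical.

```python
from typing import Optional, Tuple

V_BIT = 0x80  # bit 0 of the flags octet (most significant bit)
M_BIT = 0x40  # bit 1 of the flags octet


def encode_vxlan_encap_value(vn_id: Optional[int] = None,
                             mac: Optional[bytes] = None) -> bytes:
    """Build the 12-octet value field: flags, VN-ID, MAC address, reserved."""
    flags = 0
    if vn_id is not None:
        if not 0 <= vn_id < (1 << 24):
            raise ValueError("VN-ID must fit in 3 octets")
        flags |= V_BIT
    if mac is not None:
        if len(mac) != 6:
            raise ValueError("MAC address must be 6 octets")
        flags |= M_BIT
    # Fields whose flag bit is clear are set to zero; reserved bits stay 0.
    vn_field = (vn_id or 0).to_bytes(3, "big")
    mac_field = mac if mac is not None else bytes(6)
    return bytes([flags]) + vn_field + mac_field + bytes(2)


def decode_vxlan_encap_value(value: bytes) -> Tuple[Optional[int], Optional[bytes]]:
    """Return (VN-ID, MAC), using None for a field whose flag bit is clear."""
    if len(value) != 12:
        raise ValueError("expected a 12-octet value field")
    flags = value[0]
    vn_id = int.from_bytes(value[1:4], "big") if flags & V_BIT else None
    mac = value[4:10] if flags & M_BIT else None
    return vn_id, mac


value = encode_vxlan_encap_value(vn_id=5000, mac=bytes.fromhex("020000000001"))
print(value.hex())  # c00013880200000000010000
```

The decoder ignores the reserved bits rather than rejecting them, which is one reasonable reading of "SHOULD always be set to 0" and is an assumption of this sketch.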
387 Note that what we are calling a "VXLAN tunnel" is actually an 388 "ethernet-in-VXLAN" tunnel. Although, strictly speaking, VXLAN 389 tunnels only carry ethernet frames, an IP packet or an MPLS packet can 390 be carried through a "VXLAN tunnel" by forming an IP-in-ethernet-in- 391 VXLAN or MPLS-in-ethernet-in-VXLAN tunnel. 393 2.2.2. VXLAN-GPE 395 This document defines an encapsulation sub-TLV for VXLAN-GPE tunnels. 396 When the tunnel type is VXLAN-GPE, the following is the structure of 397 the value field in the encapsulation sub-TLV: 399 0 1 2 3 400 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 401 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 402 |Ver|V|R|R|R|R|R| Reserved | 403 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 404 | VN-ID | Reserved | 405 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 407 Figure 3: VXLAN-GPE Encapsulation Sub-TLV 409 V: This bit is set to 1 to indicate that a valid VN-ID is present 410 in the encapsulation sub-TLV. 412 R: The bits designated "R" above are reserved for future use. 413 They SHOULD always be set to zero. 415 Version (Ver): Indicates the VXLAN-GPE protocol version. If the 416 indicated version is not supported, the TLV that contains this 417 Encapsulation sub-TLV MUST be treated as specifying an unsupported 418 tunnel type. The value of this field will be copied into the 419 corresponding field of the VXLAN-GPE encapsulation header. 421 VN-ID: If the V bit is set, this field contains a 3 octet VN-ID 422 value. If the V bit is not set, this field SHOULD be set to zero. 424 When forming the VXLAN-GPE encapsulation header: 426 o The values of the V and R bits are NOT copied into the flags field 427 of the VXLAN-GPE header. However, the values of the Ver bits are 428 copied into the VXLAN-GPE header. Other bits in the flags field 429 of the VXLAN-GPE header are set as per [VXLAN-GPE]. 
431 o See Section 7 to see how the VNI field of the VXLAN-GPE 432 encapsulation header is set. 434 2.2.3. NVGRE 436 This document defines an encapsulation sub-TLV for NVGRE tunnels. 437 When the tunnel type is NVGRE, the following is the structure of the 438 value field in the encapsulation sub-TLV: 440 0 1 2 3 441 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 442 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 443 |V|M|R|R|R|R|R|R| VN-ID (3 Octets) | 444 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 445 | MAC Address (4 Octets) | 446 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 447 | MAC Address (2 Octets) | Reserved | 448 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 450 Figure 4: NVGRE Encapsulation Sub-TLV 452 V: This bit is set to 1 to indicate that a valid VN-ID is present 453 in the encapsulation sub-TLV. 455 M: This bit is set to 1 to indicate that a valid MAC Address is 456 present in the encapsulation sub-TLV. 458 R: The remaining bits in the 8-bit flags field are reserved for 459 future use. They SHOULD always be set to 0. 461 VN-ID: If the V bit is set, the VN-ID field contains a 3 octet VN- 462 ID value. If the V bit is not set, the VN-ID field SHOULD be set 463 to zero. 465 MAC Address: If the M bit is set, this field contains a 6 octet 466 Ethernet MAC address. If the M bit is not set, this field SHOULD 467 be set to all zeroes. 469 When forming the NVGRE encapsulation header: 471 o The values of the V, M, and R bits are NOT copied into the flags 472 field of the NVGRE header. The flags field of the NVGRE header is 473 set as per [NVGRE]. 475 o If the M bit is set, the MAC Address is copied into the Inner 476 Destination MAC Address field of the Inner Ethernet Header (see 477 section 5 of [NVGRE]). If the M bit is not set, the Inner 478 Destination MAC address field is set to a configured value. 
If 479 the M bit is not set, and there is no configured value, the NVGRE 480 tunnel cannot be used. 482 o See Section 7 to see how the virtual network identifier field of the NVGRE encapsulation 483 header is set. 485 2.2.4. GTP 487 When the tunnel type is GTP [GTP-U], the Encapsulation sub-TLV 488 contains information needed to send data packets through a GTP 489 tunnel, and also contains information needed by the tunnel's remote 490 endpoint to create a "reverse" tunnel back to the transmitter. This 491 allows a bidirectional control connection to be created. The format 492 of the Encapsulation Sub-TLV is: 494 0 1 2 3 495 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 496 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 497 | Remote TEID (4 Octets) | 498 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 499 | Local TEID (4 Octets) | 500 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 501 | Local Endpoint Address (4/16 Octets (IPv4/IPv6)) | 502 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 504 Figure 5: GTP Encapsulation Sub-TLV 506 Remote TEID: Contains the 32-bit Tunnel Endpoint Identifier of the 507 GTP tunnel through which data packets are to be sent. When data 508 packets are sent through the tunnel, the Remote TEID is carried in 509 the GTP encapsulation header. The GTP header is itself 510 encapsulated within an IP header, whose IP destination address 511 field is set to the value of the Remote Endpoint sub-TLV. 513 Local TEID: Contains a 32-bit Tunnel Endpoint Identifier of a GTP 514 tunnel assigned by EPC ([vEPC]). 516 Local Endpoint Address: Contains an IPv4 or IPv6 anycast address. 517 This is used, along with the Local TEID, to set up a tunnel in the 518 reverse direction. See [vEPC] for details. 520 2.2.5. 
MPLS-in-GRE 522 When the tunnel type is MPLS-in-GRE, the following is the structure 523 of the value field in an optional encapsulation sub-TLV: 525 0 1 2 3 526 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 527 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 528 | GRE-Key (4 Octets) | 529 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 531 Figure 6: MPLS-in-GRE Encapsulation Sub-TLV 533 GRE-Key: 4-octet field [RFC2890] that is generated by the 534 advertising router. The actual method by which the key is 535 obtained is beyond the scope of this document. The key is 536 inserted into the GRE encapsulation header of the payload packets 537 sent by ingress routers to the advertising router. It is intended 538 to be used for identifying extra context information about the 539 received payload. Note that the key is optional. Unless a key 540 value is being advertised, the MPLS-in-GRE encapsulation sub-TLV 541 MUST NOT be present. 543 Note that the GRE tunnel type defined in [RFC5512] can be used 544 instead of the MPLS-in-GRE tunnel type when it is necessary to 545 encapsulate MPLS in GRE. Including a TLV of the MPLS-in-GRE tunnel 546 type is equivalent to including a TLV of the GRE tunnel type that 547 also includes a Protocol Type sub-TLV ([RFC5512]) specifying MPLS as 548 the protocol to be encapsulated. That is, if a TLV specifies MPLS- 549 in-GRE or if it includes a Protocol Type sub-TLV specifying MPLS, the 550 GRE tunnel advertised in that TLV MUST NOT be used for carrying IP 551 packets. 553 2.3. Outer Encapsulation Sub-TLVs 555 The Encapsulation sub-TLV for a particular tunnel type allows one to 556 specify the values that are to be placed in certain fields of the 557 encapsulation header for that tunnel type. However, some tunnel 558 types require an outer IP encapsulation, and some also require an 559 outer UDP encapsulation. 
The Encapsulation sub-TLV for a given 560 tunnel type does not usually provide a way to specify values for 561 fields of the outer IP and/or UDP encapsulations. If it is necessary 562 to specify values for fields of the outer encapsulation, additional 563 sub-TLVs must be used. This document defines two such sub-TLVs. 565 If an outer encapsulation sub-TLV occurs in a TLV for a tunnel type 566 that does not use the corresponding outer encapsulation, the sub-TLV 567 is treated as if it were an unknown type of sub-TLV. 569 2.3.1. IPv4 DS Field 571 Most of the tunnel types that can be specified in the Tunnel 572 Encapsulation attribute require an outer IP encapsulation. The IPv4 573 DS Field sub-TLV can be carried in the TLV of any such tunnel type. 574 It specifies the setting of the one-octet Differentiated Services field 575 in the outer IP encapsulation (see [RFC2474]). The value field is 576 always a single octet. 578 2.3.2. UDP Destination Port 580 Some of the tunnel types that can be specified in the Tunnel 581 Encapsulation attribute require an outer UDP encapsulation. 582 Generally there is a standard UDP Destination Port value for a 583 particular tunnel type. However, sometimes it is useful to be able 584 to use a non-standard UDP destination port. If a particular tunnel 585 type requires an outer UDP encapsulation, and it is desired to use a 586 UDP destination port other than the standard one, the port to be used 587 can be specified by including a UDP Destination Port sub-TLV. The 588 value field of this sub-TLV is always a two-octet field, containing 589 the port value. 591 2.4. Embedded Label Handling Sub-TLV 593 Certain BGP address families (corresponding to particular AFI/SAFI 594 pairs, e.g., 1/4, 2/4, 1/128, 2/128) have MPLS labels embedded in 595 their NLRIs. We will use the term "embedded label" to refer to the 596 MPLS label that is embedded in an NLRI, and the term "labeled address 597 family" to refer to any AFI/SAFI that has embedded labels. 
599 Some of the tunnel types (e.g., VXLAN, VXLAN-GPE, and NVGRE) that can 600 be specified in the Tunnel Encapsulation attribute have an 601 encapsulation header containing a "Virtual Network" identifier of some 602 sort. The Encapsulation sub-TLVs for these tunnel types may 603 optionally specify a value for the virtual network identifier. 605 Suppose a Tunnel Encapsulation attribute is attached to an UPDATE of 606 a labeled address family, and it is decided to use a particular 607 tunnel (specified in one of the attribute's TLVs) for transmitting a 608 packet that is being forwarded according to that UPDATE. When 609 forming the encapsulation header for that packet, different 610 deployment scenarios require different handling of the embedded label 611 and/or the virtual network identifier. The Embedded Label Handling 612 sub-TLV can be used to control the placement of the embedded label 613 and/or the virtual network identifier in the encapsulation. 615 The Embedded Label Handling sub-TLV may be included in any TLV of the 616 Tunnel Encapsulation attribute. If the Tunnel Encapsulation 617 attribute is attached to an UPDATE of a non-labeled address family, 618 the sub-TLV is treated as a no-op. If the sub-TLV is contained in a 619 TLV whose tunnel type does not have a virtual network identifier in 620 its encapsulation header, the sub-TLV is treated as a no-op. 622 The sub-TLV's Length field always contains the value 1, and its value 623 field consists of a single octet. The following values are defined: 625 1: The payload will be an MPLS packet with the embedded label at the 626 top of its label stack. 628 2: The embedded label is not carried in the payload, but is carried 629 either in the virtual network identifier field of the 630 encapsulation header, or else is ignored entirely. 632 Please see Section 7 for the details of how this sub-TLV is used when 633 it is carried by an UPDATE of a labeled address family. 
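The no-op rules and the two defined values can be sketched as a small lookup. This is an illustration under loud assumptions: the set of labeled families contains only the AFI/SAFI pairs the text gives as examples, the set of tunnel types with a virtual network identifier contains only the three named examples, tunnel types are represented by hypothetical string names rather than their assigned code points, and treating an undefined value as a no-op is a choice made for this sketch, not something the text specifies.

```python
from typing import Optional

# Assumption: only the AFI/SAFI pairs given as examples in the text.
LABELED_FAMILIES = {(1, 4), (2, 4), (1, 128), (2, 128)}
# Assumption: only the tunnel types named as having a virtual network identifier.
VNI_TUNNEL_TYPES = {"VXLAN", "VXLAN-GPE", "NVGRE"}

LABEL_IN_PAYLOAD = 1          # MPLS payload with the embedded label on top
LABEL_IN_VNI_OR_IGNORED = 2   # label goes in the VNI field or is ignored


def effective_label_handling(afi: int, safi: int, tunnel_type: str,
                             value: int) -> Optional[int]:
    """Return the handling value that applies, or None when the sub-TLV is a no-op."""
    if (afi, safi) not in LABELED_FAMILIES:
        return None  # non-labeled address family: no-op
    if tunnel_type not in VNI_TUNNEL_TYPES:
        return None  # no virtual network identifier in the header: no-op
    if value not in (LABEL_IN_PAYLOAD, LABEL_IN_VNI_OR_IGNORED):
        return None  # assumption of this sketch: undefined values are ignored
    return value


print(effective_label_handling(1, 128, "VXLAN", 1))  # 1
print(effective_label_handling(1, 1, "VXLAN", 1))    # None
```

Returning `None` rather than removing the sub-TLV keeps the decision local to forwarding, which fits the requirement that a no-op sub-TLV is not stripped before the UPDATE is forwarded.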
   If the Embedded Label Handling sub-TLV is carried by an UPDATE of a non-labeled address family, it is treated as a no-op. However, it SHOULD NOT be stripped from the TLV before the UPDATE is forwarded.

3.  Semantics and Usage of the Tunnel Encapsulation attribute

   [RFC5512] specifies the use of the Tunnel Encapsulation attribute in BGP UPDATE messages of AFI/SAFI 1/7 and 2/7. That document restricts the use of this attribute to UPDATE messages of those SAFIs. This document removes that restriction.

   The BGP Tunnel Encapsulation attribute MAY be carried in any BGP UPDATE message whose AFI/SAFI is 1/1 (IPv4 Unicast), 2/1 (IPv6 Unicast), 1/4 (IPv4 Labeled Unicast), 2/4 (IPv6 Labeled Unicast), 1/128 (VPN-IPv4 Labeled Unicast), 2/128 (VPN-IPv6 Labeled Unicast), or 25/70 (EVPN). Use of the Tunnel Encapsulation attribute in BGP UPDATE messages of other AFI/SAFIs is outside the scope of this document.

   The decision to attach a Tunnel Encapsulation attribute to a given BGP UPDATE is determined by policy. The set of TLVs and sub-TLVs contained in the attribute is also determined by policy.

   When the Tunnel Encapsulation attribute is carried in an UPDATE of one of the AFI/SAFIs specified above, each TLV MUST have a Remote Endpoint sub-TLV. If a TLV does not have a Remote Endpoint sub-TLV, that TLV should be treated as if it had a malformed Remote Endpoint sub-TLV (see Section 2.1).

   Suppose that:

   o  a given packet P must be forwarded by router R;

   o  the path along which P is to be forwarded is determined by BGP UPDATE U;

   o  UPDATE U has a Tunnel Encapsulation attribute, containing at least one TLV that identifies a "feasible tunnel" for packet P.
      A tunnel is considered feasible if it has the following three properties:

      *  The tunnel type is supported (i.e., router R knows how to set up tunnels of that type, how to create the encapsulation header for tunnels of that type, etc.).

      *  The tunnel is of a type that can be used to carry packet P (e.g., an MPLS-in-UDP tunnel would not be a feasible tunnel for carrying an IP packet, UNLESS the IP packet can first be converted to an MPLS packet).

      *  The tunnel is specified in a TLV whose Remote Endpoint sub-TLV identifies an IP address that is reachable.

   Then router R SHOULD send packet P through one of the feasible tunnels identified in the Tunnel Encapsulation attribute of UPDATE U.

   If the Tunnel Encapsulation attribute contains several TLVs (i.e., if it specifies several tunnels), router R may choose any one of those tunnels, based upon local policy. If any of the tunnels' TLVs contain the Color sub-TLV and/or the Protocol Type sub-TLV defined in [RFC5512], the choice of tunnel may be influenced by these sub-TLVs.

   If a particular tunnel is not feasible at some moment because its Remote Endpoint cannot be reached at that moment, the tunnel may become feasible at a later time. When this happens, router R SHOULD reconsider its choice of tunnel to use, and MAY choose to now use the tunnel.

   A TLV specifying a non-feasible tunnel is not considered to be malformed or erroneous in any way, and the TLV SHOULD NOT be stripped from the Tunnel Encapsulation attribute before redistribution.

   In addition to the sub-TLVs already defined, additional sub-TLVs may be defined that affect the choice of tunnel to be used, or that affect the contents of the tunnel encapsulation header. The documents that define any such additional sub-TLVs must specify the effect that including the sub-TLV is to have.
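   The feasibility conditions listed earlier in this section (supported tunnel type, ability to carry the packet, reachable Remote Endpoint) can be sketched in Python as a simple filter. This is an informal illustration, not part of this specification. The TunnelTLV type, the function name, and the two callback parameters are hypothetical; how a router decides reachability or payload compatibility is left to the implementation. The choice among the feasible tunnels remains a matter of local policy and is not modeled here.

```python
from dataclasses import dataclass
from typing import Callable, List, Set


@dataclass
class TunnelTLV:
    """Hypothetical in-memory view of one TLV of the attribute."""
    tunnel_type: int       # tunnel type code carried in the TLV
    remote_endpoint: str   # IP address from the Remote Endpoint sub-TLV


def feasible_tunnels(
    tlvs: List[TunnelTLV],
    supported_types: Set[int],
    can_carry: Callable[[int], bool],
    is_reachable: Callable[[str], bool],
) -> List[TunnelTLV]:
    """Keep only the TLVs that identify a feasible tunnel for a packet.

    can_carry(tunnel_type) says whether that tunnel type can carry the
    packet at hand; is_reachable(address) says whether the endpoint is
    currently reachable. Both are supplied by the caller.
    """
    return [
        tlv for tlv in tlvs
        if tlv.tunnel_type in supported_types
        and can_carry(tlv.tunnel_type)
        and is_reachable(tlv.remote_endpoint)
    ]
```

   Because reachability can change over time, a router following the text above would re-run such a filter when the reachability of a Remote Endpoint changes. TLVs that fail the filter are left in the attribute; they are not stripped.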
   If it is determined to send a packet through the tunnel specified in a particular TLV of a particular Tunnel Encapsulation attribute, and if that TLV contains a Remote Endpoint sub-TLV, then the tunnel's remote endpoint address is the IP address contained in the sub-TLV. If the TLV does not contain a Remote Endpoint sub-TLV, or if it contains a Remote Endpoint sub-TLV whose value field is all zeroes, then the tunnel's remote endpoint is the IP address specified as the Next Hop of the BGP UPDATE containing the Tunnel Encapsulation attribute.

   The procedure for sending a packet through a particular tunnel type to a particular remote endpoint depends upon the tunnel type, and is outside the scope of this document. The contents of the tunnel encapsulation header MAY be influenced by the Encapsulation sub-TLV.

   Note that some tunnel types may require the execution of an explicit tunnel setup protocol before they can be used for carrying data. Other tunnel types may not require any tunnel setup protocol. Whenever a new Tunnel Type TLV is defined, the specification of that TLV must describe (or reference) the procedures for creating the encapsulation header used to forward packets through that tunnel type.

   If a Tunnel Encapsulation attribute specifies several tunnels, the way in which a router chooses which one to use is a matter of policy, subject to the following constraint: if a router can determine that a given tunnel is not functional, it MUST NOT use that tunnel. In particular, if the tunnel is identified in a TLV that has a Remote Endpoint sub-TLV, and if the IP address specified in the sub-TLV is not reachable from router R, then the tunnel SHOULD be considered non-functional. Other means of determining whether a given tunnel is functional MAY be used; specification of such means is outside the scope of this specification.
   Of course, if a non-functional tunnel later becomes functional, router R SHOULD reevaluate its choice of tunnels.

   If router R determines that it cannot use any of the tunnels specified in the Tunnel Encapsulation attribute, it MAY either drop packet P or transmit packet P as it would had the Tunnel Encapsulation attribute not been present. This is a matter of local policy. By default, the packet SHOULD be transmitted as if the Tunnel Encapsulation attribute had not been present.

   A Tunnel Encapsulation attribute may contain several TLVs that all specify the same tunnel type. Each TLV should be considered as specifying a different tunnel. Two tunnels of the same type may have different Remote Endpoint sub-TLVs, different Encapsulation sub-TLVs, etc. Choosing between two such tunnels is a matter of local policy.

   Once router R has decided to send packet P through a particular tunnel, it encapsulates packet P appropriately and then forwards it according to the route that leads to the tunnel's remote endpoint. This route may itself be a BGP route with a Tunnel Encapsulation attribute. If so, the encapsulated packet is treated as the payload and is encapsulated according to the Tunnel Encapsulation attribute of that route. That is, tunnels may be "stacked".

4.  Routing Considerations

4.1.  No Impact on BGP Decision Process

   The presence of the Tunnel Encapsulation attribute does not affect the BGP bestpath selection algorithm.

   Under certain circumstances, this may lead to counter-intuitive consequences.
   For example, suppose:

   o  router R1 receives a BGP UPDATE message from router R2, such that

      *  the NLRI of that UPDATE is prefix X,

      *  the UPDATE contains a Tunnel Encapsulation attribute specifying two tunnels, T1 and T2,

      *  R1 cannot use tunnel T1 or tunnel T2, either because the tunnel remote endpoint is not reachable or because R1 does not support that kind of tunnel;

   o  router R1 receives a BGP UPDATE message from router R3, such that

      *  the NLRI of that UPDATE is prefix X,

      *  the UPDATE contains a Tunnel Encapsulation attribute specifying two tunnels, T3 and T4,

      *  R1 can use at least one of the two tunnels.

   Since the Tunnel Encapsulation attribute does not affect bestpath selection, R1 may well install the route from R2 rather than the route from R3, even though R2's route contains no usable tunnels.

   This possibility must be kept in mind whenever a Remote Endpoint sub-TLV carried by a given UPDATE specifies an IP address that is different from the next hop of that UPDATE.

4.2.  Looping, Infinite Stacking, Etc.

   Consider a packet destined for address X. Suppose a BGP UPDATE for address prefix X carries a Tunnel Encapsulation attribute that specifies a remote tunnel endpoint of Y. And suppose that a BGP UPDATE for address prefix Y carries a Tunnel Encapsulation attribute that specifies a Remote Endpoint of X. It is easy to see that this will cause an infinite number of encapsulation headers to be put on the given packet.

   This could happen as a result of misconfiguration, either accidental or intentional. It could also happen if the Tunnel Encapsulation attribute were altered by a malicious agent. Implementations should be aware of this.

   Improper setting (or malicious altering) of the Tunnel Encapsulation attribute could also cause data packets to loop.
   Suppose a BGP UPDATE for address prefix X carries a Tunnel Encapsulation attribute that specifies a remote tunnel endpoint of Y. Suppose router R receives and processes the update. When router R receives a packet destined for X, it will apply the encapsulation and send the encapsulated packet to Y. Y will decapsulate the packet and forward it further. If Y is further away from X than is router R, it is possible that the path from Y to X will traverse R. This would cause a long-lasting routing loop.

   These possibilities must also be kept in mind whenever the Remote Endpoint for a given prefix differs from the BGP next hop for that prefix.

5.  Recursive Next Hop Resolution

   Suppose that:

   o  a given packet P must be forwarded by router R1;

   o  the path along which P is to be forwarded is determined by BGP UPDATE U1;

   o  UPDATE U1 does not have a Tunnel Encapsulation attribute;

   o  the next hop of UPDATE U1 is router R2;

   o  the best path to router R2 is a BGP route that was advertised in UPDATE U2;

   o  UPDATE U2 has a Tunnel Encapsulation attribute.

   Then packet P SHOULD be sent through one of the tunnels identified in the Tunnel Encapsulation attribute of UPDATE U2. See Section 3 for further details.

   Note that if UPDATE U1 and UPDATE U2 both have Tunnel Encapsulation attributes, packet P will be carried through a pair of nested tunnels. P will first be encapsulated based on the Tunnel Encapsulation attribute of U1. This encapsulated packet then becomes the payload, and is encapsulated based on the Tunnel Encapsulation attribute of U2. This is another way of "stacking" tunnels (see also Section 3).

6.  Tunnel Encapsulation Extended Community

   [RFC5512] defines an Encapsulation Extended Community. This Extended Community may be attached to a route of any AFI/SAFI to which the Tunnel Encapsulation attribute may be attached.
   Each such Extended Community identifies a particular tunnel type. If the Encapsulation Extended Community identifies a particular tunnel type, its semantics are exactly equivalent to the semantics of a Tunnel Encapsulation attribute TLV that:

   o  identifies the same tunnel type, and

   o  has a Remote Endpoint sub-TLV whose IP address field contains the address of the BGP next hop of the route to which it is attached, and

   o  has no other sub-TLVs.

7.  Use of Virtual Network Identifiers and Embedded Labels when Imposing a Tunnel Encapsulation

   Three of the tunnel types that can be specified in a Tunnel Encapsulation TLV have virtual network identifier fields in their encapsulation headers. In the VXLAN and VXLAN-GPE encapsulations, this field is called the VNI field; in the NVGRE encapsulation, this field is called the VSID field.

   When one of these tunnel encapsulations is imposed on a packet, the setting of the virtual network identifier field in the encapsulation header depends upon the contents of the Encapsulation sub-TLV (if one is present). When the Tunnel Encapsulation attribute is being carried on a BGP UPDATE of a labeled address family, the setting of the virtual network identifier field also depends upon the contents of the Embedded Label Handling sub-TLV (if present).

   This section specifies the procedures for choosing the value to set in the virtual network identifier field of the encapsulation header. These procedures apply only when the tunnel type is VXLAN, VXLAN-GPE, or NVGRE.

7.1.  Unlabeled Address Families

   This sub-section applies when:

   o  the Tunnel Encapsulation attribute is carried on a BGP UPDATE of an unlabeled address family, and

   o  at least one of the attribute's TLVs identifies a tunnel type that uses a virtual network identifier, and

   o  it has been determined to send a packet through one of those tunnels.
   If the TLV identifying the tunnel contains an Encapsulation sub-TLV whose V bit is set, the virtual network identifier field of the encapsulation header is set to the value of the virtual network identifier field of the Encapsulation sub-TLV.

   Otherwise, the virtual network identifier field of the encapsulation header is set to a configured value; if there is no configured value, the tunnel cannot be used.

7.2.  Labeled Address Families

   This sub-section applies when:

   o  the Tunnel Encapsulation attribute is carried on a BGP UPDATE of a labeled address family, and

   o  at least one of the attribute's TLVs identifies a tunnel type that uses a virtual network identifier, and

   o  it has been determined to send a packet through one of those tunnels.

7.2.1.  When a Valid VNID has been Signaled

   If the TLV identifying the tunnel contains an Encapsulation sub-TLV whose V bit is set, the virtual network identifier field of the encapsulation header is set as follows:

   o  If the TLV does not contain an Embedded Label Handling sub-TLV, or if it contains an Embedded Label Handling sub-TLV whose value is 1, then the virtual network identifier field of the encapsulation header is set to the value of the virtual network identifier field of the Encapsulation sub-TLV.

      The embedded label (from the NLRI of the route that is carrying the Tunnel Encapsulation attribute) appears at the top of the MPLS label stack in the encapsulation payload.

   o  If the TLV contains an Embedded Label Handling sub-TLV whose value is 2, the embedded label is ignored entirely, and the virtual network identifier field of the encapsulation header is set to the value of the virtual network identifier field of the Encapsulation sub-TLV.

7.2.2.  When a Valid VNID has not been Signaled

   If the TLV identifying the tunnel does not contain an Encapsulation sub-TLV whose V bit is set, the virtual network identifier field of the encapsulation header is set as follows:

   o  If the TLV does not contain an Embedded Label Handling sub-TLV, or if it contains an Embedded Label Handling sub-TLV whose value is 1, then the virtual network identifier field of the encapsulation header is set to a configured value.

      If there is no configured value, the tunnel cannot be used.

      The embedded label (from the NLRI of the route that is carrying the Tunnel Encapsulation attribute) appears at the top of the MPLS label stack in the encapsulation payload.

   o  If the TLV contains an Embedded Label Handling sub-TLV whose value is 2, the embedded label is copied into the virtual network identifier field of the encapsulation header.

      The embedded label does not appear in the MPLS label stack of the payload.

7.2.3.  Applicability Restrictions

   In a given UPDATE of a labeled address family, the label embedded in the NLRI is generally a label that is meaningful only to the router whose address appears as the next hop. Certain of the procedures of Section 7.2.1 or Section 7.2.2 cause the embedded label to be carried by a data packet to the router whose address appears in the Remote Endpoint sub-TLV. If the Remote Endpoint sub-TLV does not identify the same router that is the next hop, sending the packet through the tunnel may cause the label to be misinterpreted at the tunnel's remote endpoint. This may cause misdelivery of the packet.

   Therefore the embedded label MUST NOT be carried by a data packet traveling through a tunnel unless it is known that the label will be properly interpreted at the tunnel's remote endpoint. How this is known is outside the scope of this document.
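   The selection procedures of Sections 7.1, 7.2.1, and 7.2.2 can be summarized in the following Python sketch. It is an informal illustration and not part of this specification. The Encap type, the function name, and all parameter names are hypothetical. The sketch assumes the caller has already extracted the virtual network identifier from the Encapsulation sub-TLV (passing None when the V bit is not set) and the Embedded Label Handling value (passing None when that sub-TLV is absent). It does not model the restriction stated just above on when the embedded label may be carried at all.

```python
from typing import NamedTuple, Optional


class Encap(NamedTuple):
    vni: int                # value for the virtual network identifier field
    label_in_payload: bool  # True if the embedded label tops the payload's MPLS stack


def choose_vni(
    labeled_family: bool,
    signaled_vni: Optional[int],    # from the Encapsulation sub-TLV, None if V bit not set
    configured_vni: Optional[int],  # locally configured value, None if there is none
    label_handling: Optional[int],  # Embedded Label Handling value, None if absent
    embedded_label: Optional[int],  # label from the NLRI, None for unlabeled families
) -> Optional[Encap]:
    """Return the encapsulation choice, or None if the tunnel cannot be used."""
    if not labeled_family:
        # Section 7.1: signaled value if present, else configured value.
        vni = signaled_vni if signaled_vni is not None else configured_vni
        return None if vni is None else Encap(vni, False)

    if embedded_label is None:
        raise ValueError("a labeled address family must supply an embedded label")

    label_on_stack = label_handling in (None, 1)
    if signaled_vni is not None:
        # Section 7.2.1: the signaled value is always used; with handling
        # value 2 the embedded label is ignored entirely.
        return Encap(signaled_vni, label_on_stack)
    if label_on_stack:
        # Section 7.2.2, sub-TLV absent or value 1: a configured value is required.
        return None if configured_vni is None else Encap(configured_vni, True)
    # Section 7.2.2, value 2: the embedded label is copied into the VN identifier field.
    return Encap(embedded_label, False)
```

   For instance, choose_vni(True, None, None, 2, 30) yields Encap(vni=30, label_in_payload=False), the case in which the embedded label is copied into the virtual network identifier field.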
   Note that if the Tunnel Encapsulation attribute is attached to a VPN-IP route [RFC4364], and if Inter-AS "option b" (see section 10 of [RFC4364]) is being used, and if the Remote Endpoint sub-TLV contains an IP address that is not in the same AS as the router receiving the route, it is very likely that the embedded label has been changed. Therefore use of the Tunnel Encapsulation attribute in an "Inter-AS option b" scenario is not supported.

8.  Scoping

   The Tunnel Encapsulation attribute is defined as a transitive attribute, so that it may be passed along by BGP speakers that do not recognize it. However, it is intended that the Tunnel Encapsulation attribute be used only within a well-defined scope, e.g., within a set of Autonomous Systems that belong to a single administrative entity. If the attribute is distributed beyond its intended scope, packets may be sent through tunnels in a manner that is not intended.

   To prevent the Tunnel Encapsulation attribute from being distributed beyond its intended scope, any BGP speaker that understands the attribute MUST be able to filter the attribute from incoming BGP UPDATE messages. When the attribute is filtered from an incoming UPDATE, the attribute is neither processed nor redistributed. This filtering SHOULD be possible on a per-BGP-session basis. For each session, filtering of the attribute on incoming UPDATEs MUST be enabled by default.

   In addition, any BGP speaker that understands the attribute MUST be able to filter the attribute from outgoing BGP UPDATE messages. This filtering SHOULD be possible on a per-BGP-session basis. For each session, filtering of the attribute on outgoing UPDATEs MUST be enabled by default.

9.  Error Handling

   The Tunnel Encapsulation attribute is a sequence of TLVs, each of which is a sequence of sub-TLVs.
   The final octet of a TLV is determined by its length field. Similarly, the final octet of a sub-TLV is determined by its length field. The final octet of a TLV must also be the final octet of its final sub-TLV. If this is not the case, the TLV MUST be considered malformed. A TLV that is found to be malformed for this reason MUST NOT be processed, and MUST be stripped from the Tunnel Encapsulation attribute before redistribution. Subsequent TLVs in the Tunnel Encapsulation attribute may still be valid, in which case they MUST be processed and redistributed normally.

   If a Tunnel Encapsulation attribute does not have any valid TLVs, or it does not have the transitive bit set, the "Attribute Discard" procedure of [ERRORS] is applied.

   If a Tunnel Encapsulation attribute can be parsed correctly, but contains a TLV that is not recognized (i.e., the tunnel type is not recognized) by a particular BGP speaker, the attribute is NOT considered to be malformed. The unrecognized TLV MUST be ignored, and the BGP speaker MUST interpret the attribute as if the unrecognized TLV had not been present. If the route carrying the Tunnel Encapsulation attribute is redistributed with the attribute, the unrecognized TLV SHOULD remain in the attribute.

   If a TLV of a Tunnel Encapsulation attribute contains a sub-TLV that is not recognized by a particular BGP speaker, the BGP speaker SHOULD process that TLV as if the unrecognized sub-TLV had not been present. If the route carrying the Tunnel Encapsulation attribute is redistributed with the attribute, the unrecognized sub-TLV SHOULD remain in its TLV.

   In general, if a TLV contains a sub-TLV that is malformed (e.g., contains a length field whose value is not legal for that sub-TLV), the sub-TLV should be treated as if it were an unrecognized sub-TLV.
   This document specifies one exception to this rule: if a TLV contains a malformed Remote Endpoint sub-TLV (as defined in Section 2.1), the entire TLV MUST be ignored, and SHOULD be removed from the Tunnel Encapsulation attribute before the route carrying that attribute is redistributed.

   A TLV that does not contain the Remote Endpoint sub-TLV MUST be treated as if it contained a malformed Remote Endpoint sub-TLV.

   A TLV identifying a particular tunnel type may contain a sub-TLV that is meaningless for that tunnel type. For example, perhaps the TLV contains a "UDP Destination Port" sub-TLV, but the identified tunnel type does not use UDP encapsulation at all. Sub-TLVs of this sort SHOULD be treated as no-ops. That is, they SHOULD NOT affect the creation of the encapsulation header. However, the sub-TLV MUST NOT be considered to be malformed, and MUST NOT be removed from the TLV before the route carrying the Tunnel Encapsulation attribute is redistributed.

   There is no significance to the order in which the TLVs occur within the Tunnel Encapsulation attribute. Multiple TLVs may occur for a given tunnel type; each such TLV is regarded as describing a different tunnel.

10.  IANA Considerations

   IANA is requested to assign a codepoint from the "BGP Tunnel Encapsulation Attribute Sub-TLVs" registry for "Remote Endpoint", with this document being the reference.

   IANA is requested to assign a codepoint from the "BGP Tunnel Encapsulation Attribute Sub-TLVs" registry for "IPv4 DS Field", with this document being the reference.

   IANA is requested to assign a codepoint from the "BGP Tunnel Encapsulation Attribute Sub-TLVs" registry for "UDP Destination Port", with this document being the reference.
   IANA is requested to assign a codepoint from the "BGP Tunnel Encapsulation Attribute Sub-TLVs" registry for "Embedded Label Handling", with this document being the reference.

   IANA is requested to add this document as a reference for tunnel types 8-13 in the "BGP Tunnel Encapsulation Tunnel Types" registry.

11.  Security Considerations

   The Tunnel Encapsulation attribute can cause traffic to be diverted from its normal path, especially when the Remote Endpoint sub-TLV is used. This can have serious consequences if the attribute is added or modified illegitimately, as it enables traffic to be "hijacked".

   The Remote Endpoint sub-TLV contains both an IP address and an AS number. BGP Origin Validation [RFC6811] can be used to obtain assurance that the given IP address belongs to the given AS. While this provides some protection against misconfiguration, it does not prevent a malicious agent from inserting a sub-TLV that will appear valid.

   Before sending a packet through the tunnel identified in a particular TLV of a Tunnel Encapsulation attribute, it may be advisable to use BGP Origin Validation to obtain the following additional assurances:

   o  the origin AS of the route carrying the Tunnel Encapsulation attribute is correct;

   o  the origin AS of the route to the IP address specified in the Remote Endpoint sub-TLV is correct, and is the same AS that is specified in the Remote Endpoint sub-TLV.

   One then has some level of assurance that the tunneled traffic is going to the same destination AS that it would have gone to had the Tunnel Encapsulation attribute not been present. However, this may not suit all use cases, and in any event is not very strong protection against hijacking.
   For these reasons, BGP Origin Validation should not be relied upon exclusively, and the filtering procedures of Section 8 should always be in place.

   Increased protection can be obtained by using BGP Path Validation [BGPSEC] to ensure that the route carrying the Tunnel Encapsulation attribute, and the routes to the Remote Endpoint of each specified tunnel, have not been altered illegitimately.

   If BGP Origin Validation is used as specified above, and the tunnel specified in a particular TLV of a Tunnel Encapsulation attribute is therefore regarded as "suspicious", that tunnel should not be used. Other tunnels specified in (other TLVs of) the Tunnel Encapsulation attribute may still be used.

12.  Acknowledgments

   The authors wish to thank Ron Bonica, John Drake, Satoru Matsushima, Dhananjaya Rao, John Scudder, and Ravi Singh for their review, comments, and/or helpful discussions.

13.  Contributor Addresses

   Below is a list of other contributing authors in alphabetical order:

   Randy Bush
   Internet Initiative Japan
   5147 Crystal Springs
   Bainbridge Island, Washington 98110
   United States

   Email: randy@psg.com

   Robert Raszuk
   Mirantis Inc.
   615 National Ave. #100
   Mountain View, California 94043
   United States

   Email: robert@raszuk.net

14.  References

14.1.  Normative References

   [ERRORS]   Chen, E., Scudder, J., Mohapatra, P., and K. Patel, "Revised Error Handling for BGP UPDATE Messages", internet-draft draft-ietf-idr-error-handling-19, April 2015.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC5512]  Mohapatra, P. and E. Rosen, "The BGP Encapsulation Subsequent Address Family Identifier (SAFI) and the BGP Tunnel Encapsulation Attribute", RFC 5512, April 2009.

14.2.  Informative References

   [BGPSEC]   Lepinski, M. and S. Turner, "An Overview of BGPsec", internet-draft draft-ietf-sidr-bgpsec-overview, January 2015.

   [GTP-U]    3GPP, "GPRS Tunneling Protocol User Plane, TS 29.281", 2014.

   [NVGRE]    Garg, P. and Y. Wang, "NVGRE: Network Virtualization using Generic Routing Encapsulation", internet-draft draft-sridharan-virtualization-nvgre, April 2015.

   [RFC2474]  Nichols, K., Blake, S., Baker, F., and D. Black, "Definition of the Differentiated Services Field (DS Field) in the IPv4 and IPv6 Headers", RFC 2474, December 1998.

   [RFC2784]  Farinacci, D., Li, T., Hanks, S., Meyer, D., and P. Traina, "Generic Routing Encapsulation (GRE)", RFC 2784, March 2000.

   [RFC2890]  Dommety, G., "Key and Sequence Number Extensions to GRE", RFC 2890, September 2000.

   [RFC4023]  Worster, T., Rekhter, Y., and E. Rosen, "Encapsulating MPLS in IP or Generic Routing Encapsulation (GRE)", RFC 4023, March 2005.

   [RFC4364]  Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private Networks (VPNs)", RFC 4364, February 2006.

   [RFC6811]  Mohapatra, P., Scudder, J., Ward, D., Bush, R., and R. Austein, "BGP Prefix Origin Validation", RFC 6811, January 2013.

   [RFC7348]  Mahalingam, M., Dutt, D., Duda, K., Agarwal, P., Kreeger, L., Sridhar, T., Bursell, M., and C. Wright, "Virtual eXtensible Local Area Network (VXLAN): A Framework for Overlaying Virtualized Layer 2 Networks over Layer 3 Networks", RFC 7348, August 2014.

   [RFC7510]  Xu, X., Sheth, N., Yong, L., Callon, R., and D. Black, "Encapsulating MPLS in UDP", RFC 7510, April 2015.

   [vEPC]     Matsushima, S. and R. Wakikawa, "Stateless User-Plane Architecture for Virtualized EPC", internet-draft draft-matsushima-stateless-uplane-vepc-04, March 2015.

   [VXLAN-GPE]
              Quinn, P., Manur, R., Kreeger, L., Lewis, D., Maino, F., Smith, M., Agarwal, P., Xu, X., Elzur, U., Garg, P., Melman, D., and R. Manur, "Generic Protocol Extension for VXLAN", internet-draft draft-ietf-nvo3-vxlan-gpe, May 2015.

Authors' Addresses

   Eric C. Rosen (editor)
   Juniper Networks, Inc.
   10 Technology Park Drive
   Westford, Massachusetts 01886
   United States

   Email: erosen@juniper.net

   Keyur Patel
   Cisco Systems
   170 W. Tasman Drive
   San Jose, CA 95134
   United States

   Email: keyupate@cisco.com

   Gunter Van de Velde
   Alcatel-Lucent
   Copernicuslaan 50
   Antwerpen 2018
   Belgium

   Email: gunter.van_de_velde@alcatel-lucent.com