IDR Working Group                                          E. Rosen, Ed.
Internet-Draft                                    Juniper Networks, Inc.
Obsoletes: 5512 (if approved)                                   K. Patel
Intended status: Standards Track                                  Arrcus
Expires: December 16, 2017                               G. Van de Velde
                                                                   Nokia
                                                           June 14, 2017

                 The BGP Tunnel Encapsulation Attribute
                    draft-ietf-idr-tunnel-encaps-06

Abstract

   RFC 5512 defines a BGP Path Attribute known as the "Tunnel
   Encapsulation Attribute". This attribute allows one to specify a set
   of tunnels. For each such tunnel, the attribute can provide the
   information needed to create the tunnel and the corresponding
   encapsulation header. The attribute can also provide information
   that aids in choosing whether a particular packet is to be sent
   through a particular tunnel. RFC 5512 states that the attribute is
   only carried in BGP UPDATEs that have the "Encapsulation Subsequent
   Address Family (Encapsulation SAFI)". This document deprecates the
   Encapsulation SAFI (which has never been used in production), and
   specifies semantics for the attribute when it is carried in UPDATEs
   of certain other SAFIs. This document adds support for additional
   tunnel types, and allows a remote tunnel endpoint address to be
   specified for each tunnel. This document also provides support for
   specifying fields of any inner or outer encapsulations that may be
   used by a particular tunnel.

   This document obsoletes RFC 5512.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on December 16, 2017.

Copyright Notice

   Copyright (c) 2017 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1. Introduction
     1.1. Brief Summary of RFC 5512
     1.2. Deficiencies in RFC 5512
     1.3. Brief Summary of Changes from RFC 5512
     1.4. Impact on RFC 5566
   2. The Tunnel Encapsulation Attribute
   3. Tunnel Encapsulation Attribute Sub-TLVs
     3.1. The Remote Endpoint Sub-TLV
     3.2. Encapsulation Sub-TLVs for Particular Tunnel Types
       3.2.1. VXLAN
       3.2.2. VXLAN-GPE
       3.2.3. NVGRE
       3.2.4. L2TPv3
       3.2.5. GTP
       3.2.6. GRE
       3.2.7. MPLS-in-GRE
     3.3. Outer Encapsulation Sub-TLVs
       3.3.1. IPv4 DS Field
       3.3.2. UDP Destination Port
     3.4. Sub-TLVs for Aiding Tunnel Selection
       3.4.1. Protocol Type Sub-TLV
       3.4.2. Color Sub-TLV
     3.5. Embedded Label Handling Sub-TLV
     3.6. MPLS Label Stack Sub-TLV
     3.7. Prefix-SID Sub-TLV
   4. Extended Communities Related to the Tunnel Encapsulation Attribute
     4.1. Encapsulation Extended Community
     4.2. Router's MAC Extended Community
     4.3. Color Extended Community
   5. Semantics and Usage of the Tunnel Encapsulation attribute
   6. Routing Considerations
     6.1. No Impact on BGP Decision Process
     6.2. Looping, Infinite Stacking, Etc.
   7. Recursive Next Hop Resolution
   8. Use of Virtual Network Identifiers and Embedded Labels
      when Imposing a Tunnel Encapsulation
     8.1. Tunnel Types without a Virtual Network Identifier Field
     8.2. Tunnel Types with a Virtual Network Identifier Field
       8.2.1. Unlabeled Address Families
       8.2.2. Labeled Address Families
         8.2.2.1. When a Valid VNI has been Signaled
         8.2.2.2. When a Valid VNI has not been Signaled
   9. Applicability Restrictions
   10. Scoping
   11. Error Handling
   12. IANA Considerations
     12.1. Subsequent Address Family Identifiers
     12.2. BGP Path Attributes
     12.3. Extended Communities
     12.4. BGP Tunnel Encapsulation Attribute Sub-TLVs
     12.5. Tunnel Types
   13. Security Considerations
   14. Acknowledgments
   15. Contributor Addresses
   16. References
     16.1. Normative References
     16.2. Informative References
   Authors' Addresses

1. Introduction

   This document obsoletes RFC 5512. The deficiencies of RFC 5512, and
   a summary of the changes made, are discussed in Sections 1.1-1.3.
   The material from RFC 5512 that is retained has been incorporated
   into this document.

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL", when and only when appearing in all capital letters, are
   to be interpreted as described in [RFC2119].

1.1. Brief Summary of RFC 5512

   [RFC5512] defines a BGP Path Attribute known as the Tunnel
   Encapsulation attribute. This attribute consists of one or more
   TLVs. Each TLV identifies a particular type of tunnel.
   Each TLV also contains one or more sub-TLVs. Some of the sub-TLVs,
   e.g., the "Encapsulation sub-TLV", contain information that may be
   used to form the encapsulation header for the specified tunnel type.
   Other sub-TLVs, e.g., the "color sub-TLV" and the "protocol sub-TLV",
   contain information that aids in determining whether particular
   packets should be sent through the tunnel that the TLV identifies.

   [RFC5512] only allows the Tunnel Encapsulation attribute to be
   attached to BGP UPDATE messages of the Encapsulation Address Family.
   These UPDATE messages have an AFI (Address Family Identifier) of 1 or
   2, and a SAFI of 7. In an UPDATE of the Encapsulation SAFI, the NLRI
   (Network Layer Reachability Information) is an address of the BGP
   speaker originating the UPDATE. Consider the following scenario:

   o  BGP speaker R1 has received and installed UPDATE U;

   o  UPDATE U's SAFI is the Encapsulation SAFI;

   o  UPDATE U has the address R2 as its NLRI;

   o  UPDATE U has a Tunnel Encapsulation attribute;

   o  R1 has a packet, P, to transmit to destination D;

   o  R1's best path to D is a BGP route that has R2 as its next hop.

   In this scenario, when R1 transmits packet P, it should transmit it
   to R2 through one of the tunnels specified in U's Tunnel
   Encapsulation attribute. The IP address of the remote endpoint of
   each such tunnel is R2. Packet P is known as the tunnel's "payload".

1.2. Deficiencies in RFC 5512

   While the ability to specify tunnel information in a BGP UPDATE is
   useful, the procedures of [RFC5512] have certain limitations:

   o  The requirement to use the "Encapsulation SAFI" presents an
      unfortunate operational cost, as each BGP session that may need to
      carry tunnel encapsulation information needs to be reconfigured to
      support the Encapsulation SAFI.
      The Encapsulation SAFI has never been used, and this requirement
      has served only to discourage the use of the Tunnel Encapsulation
      attribute.

   o  There is no way to use the Tunnel Encapsulation attribute to
      specify the remote endpoint address of a given tunnel; [RFC5512]
      assumes that the remote endpoint of each tunnel is specified as
      the NLRI of an UPDATE of the Encapsulation-SAFI.

   o  If the respective best paths to two different address prefixes
      have the same next hop, [RFC5512] does not provide a
      straightforward method to associate each prefix with a different
      tunnel.

   o  If a particular tunnel type requires an outer IP or UDP
      encapsulation, there is no way to signal the values of any of the
      fields of the outer encapsulation.

   o  In [RFC5512]'s specification of the sub-TLVs, each sub-TLV has a
      one-octet length field. In some cases, a two-octet length field
      may be needed.

1.3. Brief Summary of Changes from RFC 5512

   In this document we address these deficiencies by:

   o  Deprecating the Encapsulation SAFI.

   o  Defining a new "Remote Endpoint Address sub-TLV" that can be
      included in any of the TLVs contained in the Tunnel Encapsulation
      attribute. This sub-TLV can be used to specify the remote
      endpoint address of a particular tunnel.

   o  Allowing the Tunnel Encapsulation attribute to be carried by BGP
      UPDATEs of additional AFI/SAFIs. Appropriate semantics are
      provided for this way of using the attribute.

   o  Defining a number of new sub-TLVs that provide additional
      information that is useful when forming the encapsulation header
      used to send a packet through a particular tunnel.

   o  Defining the sub-TLV type field so that a sub-TLV whose type is in
      the range from 1 to 127 inclusive has a one-octet length field,
      but a sub-TLV whose type is in the range from 128 to 254 inclusive
      has a two-octet length field.
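
   The length rule in the last item above can be sketched in Python.
   This is a minimal illustration, not part of the specification: the
   function name split_sub_tlvs is hypothetical, and the sketch assumes
   that a two-octet length is carried in network byte order and that
   type values 0 and 255 (which the rule above does not cover) are
   rejected.

```python
def split_sub_tlvs(value: bytes):
    """Split the value field of a tunnel TLV into (type, value) pairs.

    Hypothetical helper: types 1-127 use a one-octet length field,
    types 128-254 use a two-octet length field (assumed big-endian).
    """
    result = []
    offset = 0
    while offset < len(value):
        sub_type = value[offset]
        offset += 1
        if 1 <= sub_type <= 127:
            length_size = 1
        elif 128 <= sub_type <= 254:
            length_size = 2
        else:
            # Assumption: 0 and 255 are outside both ranges, so reject.
            raise ValueError(f"sub-TLV type {sub_type} has no length rule")
        if offset + length_size > len(value):
            raise ValueError("truncated sub-TLV length field")
        length = int.from_bytes(value[offset:offset + length_size], "big")
        offset += length_size
        if offset + length > len(value):
            raise ValueError("truncated sub-TLV value field")
        result.append((sub_type, value[offset:offset + length]))
        offset += length
    return result
```

   For example, a type-1 sub-TLV with two value octets followed by a
   type-128 sub-TLV with one value octet occupies 4 + 4 octets, because
   the second sub-TLV spends two octets on its length.
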

   One of the sub-TLVs defined in [RFC5512] is the "Encapsulation
   sub-TLV". For a given tunnel, the encapsulation sub-TLV specifies
   some of the information needed to construct the encapsulation header
   used when sending packets through that tunnel. This document defines
   encapsulation sub-TLVs for a number of tunnel types not discussed in
   [RFC5512]: VXLAN (Virtual Extensible Local Area Network, [RFC7348]),
   VXLAN-GPE (Generic Protocol Extension for VXLAN, [VXLAN-GPE]), NVGRE
   (Network Virtualization Using Generic Routing Encapsulation
   [RFC7637]), GTP, and MPLS-in-GRE (MPLS in Generic Routing
   Encapsulation [RFC2784], [RFC2890], [RFC4023]). MPLS-in-UDP
   [RFC7510] is also supported, but an Encapsulation sub-TLV for it is
   not needed.

   Some of the encapsulations mentioned in the previous paragraph need
   to be further encapsulated inside UDP and/or IP. [RFC5512] provides
   no way to specify that certain information is to appear in these
   outer IP and/or UDP encapsulations. This document provides a
   framework for including such information in the TLVs of the Tunnel
   Encapsulation attribute.

   When the Tunnel Encapsulation attribute is attached to a BGP UPDATE
   whose AFI/SAFI identifies one of the labeled address families, it is
   not always obvious whether the label embedded in the NLRI is to
   appear somewhere in the tunnel encapsulation header (and if so,
   where), or whether it is to appear in the payload, or whether it can
   be omitted altogether. This is especially true if the tunnel
   encapsulation header itself contains a "virtual network identifier".
   This document provides a mechanism that allows one to signal (by
   using sub-TLVs of the Tunnel Encapsulation attribute) how one wants
   to use the embedded label when the tunnel encapsulation has its own
   virtual network identifier field.

   [RFC5512] defines a Tunnel Encapsulation Extended Community that can
   be used instead of the Tunnel Encapsulation attribute under certain
   circumstances. This document addresses the issue of how to handle a
   BGP UPDATE that carries both a Tunnel Encapsulation attribute and one
   or more Tunnel Encapsulation Extended Communities.

1.4. Impact on RFC 5566

   [RFC5566] uses the mechanisms defined in [RFC5512]. While this
   document obsoletes [RFC5512], it does not address the issue of how to
   use the mechanisms of [RFC5566] without also using the Encapsulation
   SAFI. Those issues are considered to be outside the scope of this
   document.

2. The Tunnel Encapsulation Attribute

   The Tunnel Encapsulation attribute is an optional transitive BGP Path
   attribute. IANA has assigned the value 23 as the type code of the
   attribute. The attribute is composed of a set of Type-Length-Value
   (TLV) encodings. Each TLV contains information corresponding to a
   particular tunnel type. A TLV is structured as shown in Figure 1:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    Tunnel Type (2 Octets)     |        Length (2 Octets)      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   |                             Value                             |
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

             Figure 1: Tunnel Encapsulation TLV Value Field

   o  Tunnel Type (2 octets): identifies a type of tunnel. The field
      contains values from the IANA Registry "BGP Tunnel Encapsulation
      Attribute Tunnel Types".

      Note that for tunnel types whose names are of the form "X-in-Y",
      e.g., "MPLS-in-GRE", only packets of the specified payload type
      "X" are to be carried through the tunnel of type "Y".
      This is the equivalent of specifying a tunnel type "Y" and
      including in its TLV a Protocol Type sub-TLV (see Section 3.4.1)
      specifying protocol "X".

   o  Length (2 octets): the total number of octets of the value field.

   o  Value (variable): composed of multiple sub-TLVs.

   Each sub-TLV consists of three fields: a 1-octet type, a 1-octet or
   2-octet length (depending on the type), and zero or more octets of
   value. A sub-TLV is structured as shown in Figure 2:

                   +-----------------------------------+
                   |      Sub-TLV Type (1 Octet)       |
                   +-----------------------------------+
                   |     Sub-TLV Length (1 or 2 Octets)|
                   +-----------------------------------+
                   |     Sub-TLV Value (Variable)      |
                   |                                   |
                   +-----------------------------------+

             Figure 2: Tunnel Encapsulation Sub-TLV Format

   o  Sub-TLV Type (1 octet): each sub-TLV type defines a certain
      property about the tunnel TLV that contains this sub-TLV.

   o  Sub-TLV Length (1 or 2 octets): the total number of octets of the
      sub-TLV value field. The Sub-TLV Length field contains 1 octet if
      the Sub-TLV Type field contains a value in the range from 1-127.

      The Sub-TLV Length field contains two octets if the Sub-TLV Type
      field contains a value in the range from 128-254.

   o  Sub-TLV Value (variable): encodings of the value field depend on
      the sub-TLV type as enumerated above. The following sub-sections
      define the encoding in detail.

3. Tunnel Encapsulation Attribute Sub-TLVs

   In this section, we specify a number of sub-TLVs. These sub-TLVs can
   be included in a TLV of the Tunnel Encapsulation attribute.

3.1. The Remote Endpoint Sub-TLV

   The Remote Endpoint sub-TLV is a sub-TLV whose value field contains
   three sub-fields:

   1. a four-octet Autonomous System (AS) number sub-field

   2. a two-octet Address Family sub-field

   3. an address sub-field, whose length depends upon the Address
      Family.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    Autonomous System Number                   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |        Address Family         |            Address            ~
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+                               +
   ~                                                               ~
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

             Figure 3: Remote Endpoint Sub-TLV Value Field

   The Address Family subfield contains a value from IANA's "Address
   Family Numbers" registry. In this document, we assume that the
   Address Family is either IPv4 or IPv6; use of other address families
   is outside the scope of this document.

   If the Address Family subfield contains the value for IPv4, the
   address subfield must contain an IPv4 address (a /32 IPv4 prefix).
   In this case, the length field of Remote Endpoint sub-TLV must
   contain the value 10 (0xa). IPv4 broadcast addresses are not valid
   values of this field.

   If the Address Family subfield contains the value for IPv6, the
   address sub-field must contain an IPv6 address (a /128 IPv6 prefix).
   In this case, the length field of Remote Endpoint sub-TLV must
   contain the value 22 (0x16). IPv6 link local addresses are not valid
   values of the IP address field.

   In a given BGP UPDATE, the address family (IPv4 or IPv6) of a Remote
   Endpoint sub-TLV is independent of the address family of the UPDATE
   itself. For example, an UPDATE whose NLRI is an IPv4 address may
   have a Tunnel Encapsulation attribute containing Remote Endpoint
   sub-TLVs that contain IPv6 addresses. Also, different tunnels
   represented in the Tunnel Encapsulation attribute may have Remote
   Endpoints of different address families.

   A two-octet AS number can be carried in the AS number field by
   setting the two high order octets to zero, and carrying the number in
   the two low order octets of the field.
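
   As an illustration of the layout and lengths described above, the
   following Python sketch builds the value field of a Remote Endpoint
   sub-TLV. The function name encode_remote_endpoint and the constants
   are hypothetical; the values 1 (IPv4) and 2 (IPv6) are taken from
   IANA's "Address Family Numbers" registry, and network byte order is
   assumed for the multi-octet fields.

```python
import ipaddress
import struct

AFI_IPV4 = 1  # IANA Address Family Number for IPv4
AFI_IPV6 = 2  # IANA Address Family Number for IPv6


def encode_remote_endpoint(asn: int, address: str) -> bytes:
    """Build the Remote Endpoint value field (hypothetical helper).

    Layout: 4-octet AS number, 2-octet Address Family, then the
    address (4 octets for IPv4, 16 octets for IPv6).
    """
    ip = ipaddress.ip_address(address)
    afi = AFI_IPV4 if ip.version == 4 else AFI_IPV6
    # A two-octet AS number simply leaves the two high-order octets
    # zero, which "!I" does for any value below 65536.
    return struct.pack("!IH", asn, afi) + ip.packed
```

   With these assumptions, an IPv4 endpoint yields 4 + 2 + 4 = 10 octets
   and an IPv6 endpoint yields 4 + 2 + 16 = 22 octets, matching the
   length values given above.
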

   The AS number in the sub-TLV MUST be the number of the AS to which
   the IP address in the sub-TLV belongs.

   There is one special case: the Remote Endpoint sub-TLV MAY have a
   value field whose Address Family subfield contains 0. This means
   that the tunnel's remote endpoint is the UPDATE's BGP next hop. If
   the Address Family subfield contains 0, the Address subfield is
   omitted, and the Autonomous System number field is set to 0.

   If any of the following conditions hold, the Remote Endpoint sub-TLV
   is considered to be "malformed":

   o  The sub-TLV contains the value for IPv4 in its Address Family
      subfield, but the length of the sub-TLV's value field is other
      than 10 (0xa).

   o  The sub-TLV contains the value for IPv6 in its Address Family
      subfield, but the length of the sub-TLV's value field is other
      than 22 (0x16).

   o  The sub-TLV contains the value zero in its Address Family field,
      but the length of the sub-TLV's value field is other than 6, or
      the Autonomous System subfield is not set to zero.

   o  The IP address in the sub-TLV's address subfield is not a valid IP
      address (e.g., it's an IPv4 broadcast address).

   o  It can be determined that the IP address in the sub-TLV's address
      subfield does not belong to the non-zero AS whose number is in
      its Autonomous System subfield. (See Section 13 for discussion of
      one way to determine this.)

   If the Remote Endpoint sub-TLV is malformed, the TLV containing it is
   also considered to be malformed, and the entire TLV MUST be ignored.
   However, the Tunnel Encapsulation attribute SHOULD NOT be considered
   to be malformed in this case; other TLVs in the attribute SHOULD be
   processed (if they can be parsed correctly).
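
   The malformation conditions above can be sketched as a Python check.
   This is only an illustration under stated assumptions: the function
   name remote_endpoint_problem is hypothetical, only the limited
   broadcast address 255.255.255.255 is detected (a directed broadcast
   cannot be recognized without a prefix length), the AS-ownership
   condition is not checked because it needs external data, and address
   families other than IPv4 and IPv6 are reported as out of scope.

```python
import ipaddress
import struct


def remote_endpoint_problem(value: bytes):
    """Return a reason string if the value field looks malformed,
    or None if no problem was found (hypothetical helper)."""
    if len(value) < 6:
        return "shorter than AS number plus Address Family"
    asn, afi = struct.unpack("!IH", value[:6])
    address = value[6:]
    if afi == 0:
        # Special case: the remote endpoint is the BGP next hop.
        if len(value) != 6:
            return "Address Family 0 but length is not 6"
        if asn != 0:
            return "Address Family 0 but AS number is not 0"
        return None
    if afi == 1:  # IPv4
        if len(value) != 10:
            return "IPv4 but length is not 10"
        if address == b"\xff\xff\xff\xff":
            return "IPv4 broadcast address"
        return None
    if afi == 2:  # IPv6
        if len(value) != 22:
            return "IPv6 but length is not 22"
        if ipaddress.IPv6Address(address).is_link_local:
            return "IPv6 link local address"
        return None
    # Assumption: other families are outside the scope of this sketch.
    return "address family outside the scope of this sketch"
```

   A caller that gets a reason string back would ignore the containing
   TLV and continue with the other TLVs in the attribute, as the
   paragraph above requires.
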

   When redistributing a route that is carrying a Tunnel Encapsulation
   attribute containing a TLV that itself contains a malformed Remote
   Endpoint sub-TLV, the TLV SHOULD be removed from the attribute before
   redistribution.

   See Section 11 for further discussion of how to handle errors that
   are encountered when parsing the Tunnel Encapsulation attribute.

   If the Remote Endpoint sub-TLV contains an IPv4 or IPv6 address that
   is valid but not reachable, the sub-TLV is NOT considered to be
   malformed, and the containing TLV SHOULD NOT be removed from the
   attribute before redistribution. However, the tunnel identified by
   the TLV containing that sub-TLV cannot be used until such time as the
   address becomes reachable. See Section 5.

3.2. Encapsulation Sub-TLVs for Particular Tunnel Types

   This section defines Tunnel Encapsulation sub-TLVs for the following
   tunnel types: VXLAN ([RFC7348]), VXLAN-GPE ([VXLAN-GPE]), NVGRE
   ([RFC7637]), GTP ([GTP-U]), MPLS-in-GRE ([RFC2784], [RFC2890],
   [RFC4023]), L2TPv3 ([RFC3931]), and GRE ([RFC2784], [RFC2890],
   [RFC4023]).

   Rules for forming the encapsulation based on the information in a
   given TLV are given in Section 8. For some tunnel types, the rules
   are obvious and not mentioned in this document. There are also
   tunnel types for which it is not necessary to define an Encapsulation
   sub-TLV.

3.2.1. VXLAN

   This document defines an encapsulation sub-TLV for VXLAN tunnels.
   When the tunnel type is VXLAN, the following is the structure of the
   value field in the encapsulation sub-TLV:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |V|M|R|R|R|R|R|R|               VN-ID (3 Octets)                |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    MAC Address (4 Octets)                     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    MAC Address (2 Octets)     |           Reserved            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                 Figure 4: VXLAN Encapsulation Sub-TLV

      V: This bit is set to 1 to indicate that a "valid" VN-ID (Virtual
      Network Identifier) is present in the encapsulation sub-TLV.
      Please see Section 8.

      M: This bit is set to 1 to indicate that a valid MAC Address is
      present in the encapsulation sub-TLV.

      R: The remaining bits in the 8-bit flags field are reserved for
      future use. They SHOULD always be set to 0.

      VN-ID: If the V bit is set, the VN-ID field contains a 3-octet
      VN-ID value. If the V bit is not set, the VN-ID field SHOULD be
      set to zero.

      MAC Address: If the M bit is set, this field contains a 6-octet
      Ethernet MAC address. If the M bit is not set, this field SHOULD
      be set to all zeroes.

   Note that, strictly speaking, VXLAN tunnels only carry ethernet
   frames. To send an IP packet or an MPLS packet through a VXLAN
   tunnel, it is necessary to form an IP-in-ethernet-in-VXLAN or an
   MPLS-in-ethernet-in-VXLAN tunnel.

   When forming the VXLAN encapsulation header:

   o  The values of the V, M, and R bits are NOT copied into the flags
      field of the VXLAN header. The flags field of the VXLAN header is
      set as per [RFC7348].

   o  If the M bit is set, the MAC Address is copied into the Inner
      Destination MAC Address field of the Inner Ethernet Header (see
      section 5 of [RFC7348]).

      If the M bit is not set, and the payload being sent through the
      VXLAN tunnel is an ethernet frame, the Destination MAC Address
      field of the Inner Ethernet Header is just the Destination MAC
      Address field of the payload's ethernet header.

      If the M bit is not set, and the payload being sent through the
      VXLAN tunnel is an IP or MPLS packet, the Inner Destination MAC
      address field is set to a configured value; if there is no
      configured value, the VXLAN tunnel cannot be used.

   o  See Section 8 to see how the VNI field of the VXLAN encapsulation
      header is set.

3.2.2. VXLAN-GPE

   This document defines an encapsulation sub-TLV for VXLAN-GPE tunnels.
   When the tunnel type is VXLAN-GPE, the following is the structure of
   the value field in the encapsulation sub-TLV:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |Ver|V|R|R|R|R|R|                   Reserved                    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                     VN-ID                     |   Reserved    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

               Figure 5: VXLAN GPE Encapsulation Sub-TLV

      V: This bit is set to 1 to indicate that a "valid" VN-ID is
      present in the encapsulation sub-TLV. Please see Section 8.

      R: The bits designated "R" above are reserved for future use.
      They SHOULD always be set to zero.

      Version (Ver): Indicates VXLAN GPE protocol version. If the
      indicated version is not supported, the TLV that contains this
      Encapsulation sub-TLV MUST be treated as specifying an unsupported
      tunnel type. The value of this field will be copied into the
      corresponding field of the VXLAN-GPE encapsulation header.

      VN-ID: If the V bit is set, this field contains a 3-octet VN-ID
      value. If the V bit is not set, this field SHOULD be set to zero.
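
   The VXLAN-GPE field definitions above can be illustrated with a short
   Python sketch that builds the 8-octet value field. The function name
   encode_vxlan_gpe is hypothetical. The bit positions are an assumption
   read from Figure 5: Ver occupies the two most significant bits of the
   first octet, V is the next bit, and the VN-ID occupies the first
   three octets of the second 32-bit word.

```python
def encode_vxlan_gpe(version: int, vn_id=None) -> bytes:
    """Build the VXLAN-GPE encapsulation sub-TLV value (hypothetical).

    vn_id=None means no valid VN-ID is signaled: the V bit stays 0
    and the VN-ID field is set to zero.
    """
    if not 0 <= version <= 3:
        raise ValueError("Ver is a 2-bit field")
    flags = version << 6           # Ver in the two high-order bits
    if vn_id is None:
        vn_id = 0                  # V bit clear, VN-ID zero
    else:
        if not 0 <= vn_id < 2 ** 24:
            raise ValueError("VN-ID is a 3-octet field")
        flags |= 0x20              # V bit (assumed third bit)
    # flags octet, 3 reserved octets, 3-octet VN-ID, 1 reserved octet
    return bytes([flags]) + bytes(3) + vn_id.to_bytes(3, "big") + bytes(1)
```

   All reserved bits and octets are left at zero, in line with the
   SHOULD in the definition of the R bits.
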

   When forming the VXLAN-GPE encapsulation header:

   o  The values of the V and R bits are NOT copied into the flags field
      of the VXLAN-GPE header. However, the values of the Ver bits are
      copied into the VXLAN-GPE header. Other bits in the flags field
      of the VXLAN-GPE header are set as per [VXLAN-GPE].

   o  See Section 8 to see how the VNI field of the VXLAN-GPE
      encapsulation header is set.

3.2.3. NVGRE

   This document defines an encapsulation sub-TLV for NVGRE tunnels.
   When the tunnel type is NVGRE, the following is the structure of the
   value field in the encapsulation sub-TLV:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |V|M|R|R|R|R|R|R|               VN-ID (3 Octets)                |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    MAC Address (4 Octets)                     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    MAC Address (2 Octets)     |           Reserved            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                 Figure 6: NVGRE Encapsulation Sub-TLV

      V: This bit is set to 1 to indicate that a "valid" VN-ID is
      present in the encapsulation sub-TLV. Please see Section 8.

      M: This bit is set to 1 to indicate that a valid MAC Address is
      present in the encapsulation sub-TLV.

      R: The remaining bits in the 8-bit flags field are reserved for
      future use. They SHOULD always be set to 0.

      VN-ID: If the V bit is set, the VN-ID field contains a 3-octet
      VN-ID value. If the V bit is not set, the VN-ID field SHOULD be
      set to zero.

      MAC Address: If the M bit is set, this field contains a 6-octet
      Ethernet MAC address. If the M bit is not set, this field SHOULD
      be set to all zeroes.

   When forming the NVGRE encapsulation header:

   o  The values of the V, M, and R bits are NOT copied into the flags
      field of the NVGRE header.
The flags field of the VXLAN header is 611 set as per [RFC7637]. 613 o If the M bit is set, the MAC Address is copied into the Inner 614 Destination MAC Address field of the Inner Ethernet Header (see 615 section 3.2 of [RFC7637]. 617 If the M bit is not set, and the payload being sent through the 618 NVGRE tunnel is an ethernet frame, the Destination MAC Address 619 field of the Inner Ethernet Header is just the Destination MAC 620 Address field of the payload's ethernet header. 622 If the M bit is not set, and the payload being sent through the 623 NVGRE tunnel is an IP or MPLS packet, the Inner Destination MAC 624 address field is set to a configured value; if there is no 625 configured value, the NVGRE tunnel cannot be used. 627 o See Section 8 to see how the VSID (Virtual Subnet Identifier) 628 field of the NVGRE encapsulation header is set. 630 3.2.4. L2TPv3 632 When the tunnel type of the TLV is L2TPv3 over IP, the following is 633 the structure of the value field of the encapsulation sub-TLV: 635 0 1 2 3 636 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 637 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 638 | Session ID (4 octets) | 639 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 640 | | 641 | Cookie (Variable) | 642 | | 643 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 645 Figure 7: L2TPv3 Encapsulation Sub-TLV 647 Session ID: a non-zero 4-octet value locally assigned by the 648 advertising router that serves as a lookup key in the incoming 649 packet's context. 651 Cookie: an optional, variable length (encoded in octets -- 0 to 8 652 octets) value used by L2TPv3 to check the association of a 653 received data message with the session identified by the Session 654 ID. Generation and usage of the cookie value is as specified in 655 [RFC3931]. 657 The length of the cookie is not encoded explicitly, but can be 658 calculated as (sub-TLV length - 4). 660 3.2.5. 
GTP 662 When the tunnel type is GTP [GTP-U], the Encapsulation sub-TLV 663 contains information needed to send data packets through a GTP 664 tunnel, and also contains information needed by the tunnel's remote 665 endpoint to create a "reverse" tunnel back to the transmitter. This 666 allows a bidirectional control connection to be created. The format 667 of the Encapsulation Sub-TLV is: 669 0 1 2 3 670 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 671 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 672 | Remote TEID (4 Octets) | 673 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 674 | Local TEID (4 Octets) | 675 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 676 | Local Endpoint Address (4/16 Octets (IPv4/IPv6)) | 677 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 679 Figure 8: GTP Encapsulation Sub-TLV 681 Remote TEID: Contains the 32-bit Tunnel Endpoint Identifier of the 682 GTP tunnel through which data packets are to be sent. When data 683 packets are sent through the tunnel, the Remote TEID is carried in 684 the GTP encapsulation header. The GTP header is itself 685 encapsulated within an IP header, whose IP destination address 686 field is set to the value of the Remote Endpoint sub-TLV. 688 Local TEID: Contains a 32-bit Tunnel Endpoint Identifier of a GTP 689 tunnel assigned by EPC ([vEPC]). 691 Local Endpoint Address: Contains an IPv4 or IPv6 anycast address. 692 This is used, along with the Local TEID, to set up a tunnel in the 693 reverse direction. See [vEPC] for details. 695 3.2.6.
GRE 697 When the tunnel type of the TLV is GRE, the following is the 698 structure of the value field of the encapsulation sub-TLV: 700 0 1 2 3 701 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 702 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 703 | GRE Key (4 octets) | 704 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 706 Figure 9: GRE Encapsulation Sub-TLV 708 GRE Key: 4-octet field [RFC2890] that is generated by the 709 advertising router. The actual method by which the key is 710 obtained is beyond the scope of this document. The key is 711 inserted into the GRE encapsulation header of the payload packets 712 sent by ingress routers to the advertising router. It is intended 713 to be used for identifying extra context information about the 714 received payload. 716 Note that the key is optional. Unless a key value is being 717 advertised, the GRE encapsulation sub-TLV MUST NOT be present. 719 3.2.7. MPLS-in-GRE 721 When the tunnel type is MPLS-in-GRE, the following is the structure 722 of the value field in an optional encapsulation sub-TLV: 724 0 1 2 3 725 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 726 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 727 | GRE-Key (4 Octets) | 728 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 730 Figure 10: MPLS-in-GRE Encapsulation Sub-TLV 732 GRE-Key: 4-octet field [RFC2890] that is generated by the 733 advertising router. The actual method by which the key is 734 obtained is beyond the scope of this document. The key is 735 inserted into the GRE encapsulation header of the payload packets 736 sent by ingress routers to the advertising router. It is intended 737 to be used for identifying extra context information about the 738 received payload. Note that the key is optional. Unless a key 739 value is being advertised, the MPLS-in-GRE encapsulation sub-TLV 740 MUST NOT be present. 
742 Note that the GRE tunnel type defined in Section 3.2.6 can be used 743 instead of the MPLS-in-GRE tunnel type when it is necessary to 744 encapsulate MPLS in GRE. Including a TLV of the MPLS-in-GRE tunnel 745 type is equivalent to including a TLV of the GRE tunnel type that 746 also includes a Protocol Type sub-TLV ([RFC5512]) specifying MPLS as 747 the protocol to be encapsulated. That is, if a TLV specifies MPLS- 748 in-GRE or if it includes a Protocol Type sub-TLV specifying MPLS, the 749 GRE tunnel advertised in that TLV MUST NOT be used for carrying IP 750 packets. 752 While it is not really necessary to have both the GRE and MPLS-in-GRE 753 tunnel types, both are included for reasons of backwards 754 compatibility. 756 3.3. Outer Encapsulation Sub-TLVs 758 The Encapsulation sub-TLV for a particular tunnel type allows one to 759 specify the values that are to be placed in certain fields of the 760 encapsulation header for that tunnel type. However, some tunnel 761 types require an outer IP encapsulation, and some also require an 762 outer UDP encapsulation. The Encapsulation sub-TLV for a given 763 tunnel type does not usually provide a way to specify values for 764 fields of the outer IP and/or UDP encapsulations. If it is necessary 765 to specify values for fields of the outer encapsulation, additional 766 sub-TLVs must be used. This document defines two such sub-TLVs. 768 If an outer encapsulation sub-TLV occurs in a TLV for a tunnel type 769 that does not use the corresponding outer encapsulation, the sub-TLV 770 is treated as if it were an unknown type of sub-TLV. 772 3.3.1. IPv4 DS Field 774 Most of the tunnel types that can be specified in the Tunnel 775 Encapsulation attribute require an outer IP encapsulation. The IPv4 776 Differentiated Services (DS) Field sub-TLV can be carried in the TLV 777 of any such tunnel type. It specifies the setting of the one-octet 778 Differentiated Services field in the outer IP encapsulation (see 779 [RFC2474]). 
The value field is always a single octet. 781 3.3.2. UDP Destination Port 783 Some of the tunnel types that can be specified in the Tunnel 784 Encapsulation attribute require an outer UDP encapsulation. 785 Generally there is a standard UDP Destination Port value for a 786 particular tunnel type. However, sometimes it is useful to be able 787 to use a non-standard UDP destination port. If a particular tunnel 788 type requires an outer UDP encapsulation, and it is desired to use a 789 UDP destination port other than the standard one, the port to be used 790 can be specified by including a UDP Destination Port sub-TLV. The 791 value field of this sub-TLV is always a two-octet field, containing 792 the port value. 794 3.4. Sub-TLVs for Aiding Tunnel Selection 796 3.4.1. Protocol Type Sub-TLV 798 The protocol type sub-TLV MAY be included in a given TLV to indicate 799 the type of the payload packets that may be encapsulated with the 800 tunnel parameters that are being signaled in the TLV. The value 801 field of the sub-TLV contains a 2-octet value from IANA's ethertype 802 registry [Ethertypes]. 804 For example, if we want to use three L2TPv3 sessions, one carrying 805 IPv4 packets, one carrying IPv6 packets, and one carrying MPLS 806 packets, the egress router will include three TLVs of L2TPv3 807 encapsulation type, each specifying a different Session ID and a 808 different payload type. The protocol type sub-TLV for these will be 809 IPv4 (protocol type = 0x0800), IPv6 (protocol type = 0x86dd), and 810 MPLS (protocol type = 0x8847), respectively. This informs the 811 ingress routers of the appropriate encapsulation information to use 812 with each of the given protocol types. Insertion of the specified 813 Session ID at the ingress routers allows the egress to process the 814 incoming packets correctly, according to their protocol type. 816 3.4.2. Color Sub-TLV 818 The color sub-TLV MAY be encoded as a way to "color" the 819 corresponding tunnel TLV. 
The value field of the sub-TLV consists of 820 a Color Extended Community, as defined in Section 4.3. For the use 821 of this sub-TLV and Extended Community, please see Section 7. 823 3.5. Embedded Label Handling Sub-TLV 825 Certain BGP address families (corresponding to particular AFI/SAFI 826 pairs, e.g., 1/4, 2/4, 1/128, 2/128) have MPLS labels embedded in 827 their NLRIs. We will use the term "embedded label" to refer to the 828 MPLS label that is embedded in an NLRI, and the term "labeled address 829 family" to refer to any AFI/SAFI that has embedded labels. 831 Some of the tunnel types (e.g., VXLAN, VXLAN-GPE, and NVGRE) that can 832 be specified in the Tunnel Encapsulation attribute have an 833 encapsulation header containing a "Virtual Network" identifier of some 834 sort. The Encapsulation sub-TLVs for these tunnel types may 835 optionally specify a value for the virtual network identifier. 837 Suppose a Tunnel Encapsulation attribute is attached to an UPDATE of 838 a labeled address family, and it is decided to use a particular 839 tunnel (specified in one of the attribute's TLVs) for transmitting a 840 packet that is being forwarded according to that UPDATE. When 841 forming the encapsulation header for that packet, different 842 deployment scenarios require different handling of the embedded label 843 and/or the virtual network identifier. The Embedded Label Handling 844 sub-TLV can be used to control the placement of the embedded label 845 and/or the virtual network identifier in the encapsulation. 847 The Embedded Label Handling sub-TLV may be included in any TLV of the 848 Tunnel Encapsulation attribute. If the Tunnel Encapsulation 849 attribute is attached to an UPDATE of a non-labeled address family, 850 the sub-TLV is treated as a no-op. If the sub-TLV is contained in a 851 TLV whose tunnel type does not have a virtual network identifier in 852 its encapsulation header, the sub-TLV is treated as a no-op.
In 853 those cases where the sub-TLV is treated as a no-op, it SHOULD NOT be 854 stripped from the TLV before the UPDATE is forwarded. 856 The sub-TLV's Length field always contains the value 1, and its value 857 field consists of a single octet. The following values are defined: 859 1: The payload will be an MPLS packet with the embedded label at the 860 top of its label stack. 862 2: The embedded label is not carried in the payload, but is carried 863 either in the virtual network identifier field of the 864 encapsulation header, or else is ignored entirely. 866 Please see Section 8 for the details of how this sub-TLV is used when 867 it is carried by an UPDATE of a labeled address family. 869 3.6. MPLS Label Stack Sub-TLV 871 This sub-TLV allows an MPLS label stack ([RFC3032]) to be associated 872 with a particular tunnel. 874 The value field of this sub-TLV is a sequence of MPLS label stack 875 entries. The first entry in the sequence is the "topmost" label, the 876 final entry in the sequence is the "bottommost" label. When this 877 label stack is pushed onto a packet, this ordering MUST be preserved. 879 Each label stack entry has the following format: 881 0 1 2 3 882 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 883 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 884 | Label | TC |S| TTL | 885 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 887 Figure 11: MPLS Label Stack Sub-TLV 889 If a packet is to be sent through the tunnel identified in a 890 particular TLV, and if that TLV contains an MPLS Label Stack sub-TLV, 891 then the label stack appearing in the sub-TLV MUST be pushed onto the 892 packet. This label stack MUST be pushed onto the packet before any 893 other labels are pushed onto the packet. 
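As an illustration of the entry format in Figure 11, the following Python sketch packs and unpacks the value field of an MPLS Label Stack sub-TLV. It is a minimal sketch, not a normative encoding routine: the function names are hypothetical, and the field widths (20-bit Label, 3-bit TC, 1-bit S, 8-bit TTL) are taken from [RFC3032]. The list order is topmost label first, matching the ordering rule above.

```python
import struct

def pack_label_stack(entries):
    """Encode (label, tc, s, ttl) tuples, topmost first, as 4-octet entries."""
    out = b""
    for label, tc, s, ttl in entries:
        if not (0 <= label < (1 << 20) and 0 <= tc < 8
                and s in (0, 1) and 0 <= ttl < 256):
            raise ValueError("label stack entry field out of range")
        # Label occupies the top 20 bits, then TC (3), S (1), TTL (8).
        word = (label << 12) | (tc << 9) | (s << 8) | ttl
        out += struct.pack("!I", word)
    return out

def unpack_label_stack(value):
    """Decode a sub-TLV value field into (label, tc, s, ttl) tuples."""
    if len(value) % 4:
        raise ValueError("value field is not a whole number of entries")
    entries = []
    for (word,) in struct.iter_unpack("!I", value):
        entries.append((word >> 12, (word >> 9) & 0x7,
                        (word >> 8) & 0x1, word & 0xFF))
    return entries
```

The number of entries is simply the value length divided by four, so the decoder never needs to consult the S bit to find the end of the stack.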
895 In particular, if the Tunnel Encapsulation attribute is attached to a 896 BGP UPDATE of a labeled address family, the contents of the MPLS 897 Label Stack sub-TLV MUST be pushed onto the packet before the label 898 embedded in the NLRI is pushed onto the packet. 900 If the MPLS label stack sub-TLV is included in a TLV identifying a 901 tunnel type that uses virtual network identifiers (see Section 8), 902 the contents of the MPLS label stack sub-TLV MUST be pushed onto the 903 packet before the procedures of Section 8 are applied. 905 The number of label stack entries in the sub-TLV MUST be determined 906 from the sub-TLV length field. Thus it is not necessary to set the S 907 bit in any of the label stack entries of the sub-TLV, and the setting 908 of the S bit is ignored when parsing the sub-TLV. When the label 909 stack entries are pushed onto a packet that already has a label 910 stack, the S bits of all the entries MUST be cleared. When the label 911 stack entries are pushed onto a packet that does not already have a 912 label stack, the S bit of the bottommost label stack entry MUST be 913 set, and the S bit of all the other label stack entries MUST be 914 cleared. 916 By default, the TC (Traffic Class) field ([RFC3032], [RFC5462]) of 917 each label stack entry is set to 0. This may of course be changed by 918 policy at the originator of the sub-TLV. When pushing the label 919 stack onto a packet, the TC of the label stack entries is preserved 920 by default. However, local policy at the router that is pushing on 921 the stack MAY cause modification of the TC values. 923 By default, the TTL (Time to Live) field of each label stack entry is 924 set to 255. This may be changed by policy at the originator of the 925 sub-TLV. When pushing the label stack onto a packet, the TTL of the 926 label stack entries is preserved by default. However, local policy 927 at the router that is pushing on the stack MAY cause modification of 928 the TTL values.
If any label stack entry in the sub-TLV has a TTL 929 value of zero, the router that is pushing the stack on a packet MUST 930 change the value to a non-zero value. 932 Note that this sub-TLV can appear within a TLV identifying any 933 type of tunnel, not just within a TLV identifying an MPLS tunnel. 934 However, if this sub-TLV appears within a TLV identifying an MPLS 935 tunnel (or an MPLS-in-X tunnel), this sub-TLV plays the same role 936 that would be played by an MPLS Encapsulation sub-TLV. Therefore, an 937 MPLS Encapsulation sub-TLV is not defined. 939 3.7. Prefix-SID Sub-TLV 941 [Prefix-SID-Attribute] defines a BGP Path attribute known as the 942 "Prefix-SID Attribute". This attribute is defined to contain a 943 sequence of one or more TLVs, where each TLV is either a "Label- 944 Index" TLV, an "IPv6 SID (Segment Identifier)" TLV, or an "Originator 945 SRGB (Source Routing Global Block)" TLV. 947 In this document, we define a Prefix-SID sub-TLV. The value field of 948 the Prefix-SID sub-TLV can be set to any valid value of the value 949 field of a BGP Prefix-SID attribute, as defined in 950 [Prefix-SID-Attribute]. 952 The Prefix-SID sub-TLV can occur in a TLV identifying any type of 953 tunnel. If an Originator SRGB is specified in the sub-TLV, that SRGB 954 MUST be interpreted to be the SRGB used by the tunnel's Remote 955 Endpoint. The Label-Index, if present, is the Segment Routing SID 956 that the tunnel's Remote Endpoint uses to represent the prefix 957 appearing in the NLRI field of the BGP UPDATE to which the Tunnel 958 Encapsulation attribute is attached. 960 If a Label-Index is present in the prefix-SID sub-TLV, then when a 961 packet is sent through the tunnel identified by the TLV, the 962 corresponding MPLS label MUST be pushed on the packet's label stack. 963 The corresponding MPLS label is computed from the Label-Index value 964 and the SRGB of the route's originator.
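This document does not spell out the label computation itself. The following Python sketch assumes the usual Segment Routing rule, in which the label is the SRGB base plus the Label-Index, provided the index falls inside the block. The function name is hypothetical, and the SRGB is modeled as a single (base, size) range, which is a simplification.

```python
def label_from_index(label_index, srgb_base, srgb_size):
    """Return the MPLS label for a Label-Index within a single SRGB range.

    Assumption: label = SRGB base + Label-Index, valid only when the
    index lies inside the block and the result fits in 20 bits.
    """
    if not 0 <= label_index < srgb_size:
        raise ValueError("Label-Index falls outside the SRGB")
    label = srgb_base + label_index
    if label >= (1 << 20):
        raise ValueError("computed label does not fit in 20 bits")
    return label
```

For example, with an SRGB of base 16000 and size 8000, a Label-Index of 5 yields label 16005.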
966 If the Originator SRGB is not present, it is assumed that the 967 originator's SRGB is known by other means. Such "other means" are 968 outside the scope of this document. 970 The corresponding MPLS label is pushed on after the processing of the 971 MPLS Label Stack sub-TLV, if present, as specified in Section 3.6. 972 It is pushed on before any other labels (e.g., a label embedded in 973 the UPDATE's NLRI, or a label determined by the procedures of Section 8) 974 are pushed on the stack. 976 The Prefix-SID sub-TLV has slightly different semantics than the 977 Prefix-SID attribute. When the Prefix-SID attribute is attached to a 978 given route, the BGP speaker that originally attached the attribute 979 is expected to be in the same Segment Routing domain as the BGP 980 speakers who receive the route with the attached attribute. The 981 Label-Index tells the receiving BGP speakers that the prefix-SID is 982 for the advertised prefix in that Segment Routing domain. When the 983 Prefix-SID sub-TLV is used, the BGP speaker at the head end of the 984 tunnel need not even be in the same Segment Routing Domain as the 985 tunnel's Remote Endpoint, and there is no implication that the 986 prefix-SID for the advertised prefix is the same in the Segment 987 Routing domains of the BGP speaker that originated the sub-TLV and 988 the BGP speaker that received it. 990 4. Extended Communities Related to the Tunnel Encapsulation Attribute 991 4.1. Encapsulation Extended Community 993 The Encapsulation Extended Community is a Transitive Opaque Extended 994 Community. This Extended Community may be attached to a route of any 995 AFI/SAFI to which the Tunnel Encapsulation attribute may be attached. 996 Each such Extended Community identifies a particular tunnel type.
If 997 the Encapsulation Extended Community identifies a particular tunnel 998 type, its semantics are exactly equivalent to the semantics of a 999 Tunnel Encapsulation attribute Tunnel TLV for which the following 1000 three conditions all hold: 1002 1. it identifies the same tunnel type, 1004 2. it has a Remote Endpoint sub-TLV for which one of the following 1005 two conditions holds: 1007 a. its "Address Family" subfield contains zero, or 1009 b. its "Address" subfield contains the same IP address that 1010 appears in the next hop field of the route to which the 1011 Tunnel Encapsulation attribute is attached 1013 3. it has no other sub-TLVs. 1015 We will refer to such a Tunnel TLV as a "barebones" Tunnel TLV. 1017 The Encapsulation Extended Community was first defined in [RFC5512]. 1018 While it provides only a small subset of the functionality of the 1019 Tunnel Encapsulation attribute, it is used in a number of deployed 1020 applications, and is still needed for backwards compatibility. To 1021 ensure backwards compatibility, this specification establishes the 1022 following rules: 1024 1. If the Tunnel Encapsulation attribute of a given route contains a 1025 barebones Tunnel TLV identifying a particular tunnel type, an 1026 Encapsulation Extended Community identifying the same tunnel type 1027 SHOULD be attached to the route. 1029 2. If the Encapsulation Extended Community identifying a particular 1030 tunnel type is attached to a given route, the corresponding 1031 barebones Tunnel TLV MAY be omitted from the Tunnel Encapsulation 1032 attribute. 1034 3. Suppose a particular route has both (a) an Encapsulation Extended 1035 Community specifying a particular tunnel type, and (b) a Tunnel 1036 Encapsulation attribute with a barebones Tunnel TLV specifying 1037 that same tunnel type. Both (a) and (b) MUST be interpreted as 1038 denoting the same tunnel. 
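The three "barebones" conditions can be expressed as a small predicate. The Python sketch below is illustrative only: the `TunnelTLV` and `RemoteEndpoint` classes and both function names are hypothetical data models, not structures defined by this document, and the tunnel type is treated as an opaque integer.

```python
from dataclasses import dataclass, field
from ipaddress import ip_address
from typing import Optional

@dataclass
class RemoteEndpoint:
    address_family: int            # 0 means "use the route's next hop"
    address: Optional[str] = None  # textual IPv4/IPv6 address, if any

@dataclass
class TunnelTLV:
    tunnel_type: int
    remote_endpoint: Optional[RemoteEndpoint] = None
    other_sub_tlvs: list = field(default_factory=list)

def is_barebones(tlv, next_hop):
    """Conditions 2 and 3: a suitable Remote Endpoint and no other sub-TLVs."""
    ep = tlv.remote_endpoint
    if ep is None or tlv.other_sub_tlvs:
        return False
    if ep.address_family == 0:
        return True
    return ep.address is not None and ip_address(ep.address) == ip_address(next_hop)

def equivalent_to_extended_community(tlv, ec_tunnel_type, next_hop):
    """Condition 1 plus the barebones test."""
    return tlv.tunnel_type == ec_tunnel_type and is_barebones(tlv, next_hop)
```

Comparing parsed addresses rather than strings avoids false mismatches between different textual forms of the same IPv6 address.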
1040 In short, in situations where one could use either the Encapsulation 1041 Extended Community or a barebones Tunnel TLV, one may use either or 1042 both. However, to ensure backwards compatibility with applications 1043 that do not support the Tunnel Encapsulation attribute, it is 1044 preferable to use the Encapsulation Extended Community. If the 1045 Extended Community (identifying a particular tunnel type) is present, 1046 the corresponding Tunnel TLV is optional. 1048 Note that for tunnel types of the form "X-in-Y", e.g., MPLS-in-GRE, 1049 the Encapsulation Extended Community implies that only packets of the 1050 specified payload type "X" are to be carried through the tunnel of 1051 type "Y". 1053 In the remainder of this specification, when we speak of a route as 1054 containing a Tunnel Encapsulation attribute with a TLV identifying a 1055 particular tunnel type, we are implicitly including the case where 1056 the route contains an Encapsulation Extended Community 1057 identifying that tunnel type. 1059 4.2. Router's MAC Extended Community 1061 [EVPN-Inter-Subnet] defines a Router's MAC Extended Community. This 1062 Extended Community provides information that may conflict with 1063 information in one or more of the Encapsulation Sub-TLVs of a Tunnel 1064 Encapsulation attribute. In case of such a conflict, the information 1065 in the Encapsulation Sub-TLV takes precedence. 1067 4.3.
Color Extended Community 1069 The Color Extended Community is a Transitive Opaque Extended 1070 Community with the following encoding: 1072 0 1 2 3 1073 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1074 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1075 | 0x03 | 0x0b | Reserved | 1076 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1077 | Color Value | 1078 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1080 Figure 12: Color Extended Community 1082 For the use of this Extended Community please see Section 7. 1084 5. Semantics and Usage of the Tunnel Encapsulation attribute 1086 [RFC5512] specifies the use of the Tunnel Encapsulation attribute in 1087 BGP UPDATE messages of AFI/SAFI 1/7 and 2/7. That document restricts 1088 the use of this attribute to UPDATE messages of those SAFIs. This 1089 document removes that restriction. 1091 The BGP Tunnel Encapsulation attribute MAY be carried in any BGP 1092 UPDATE message whose AFI/SAFI is 1/1 (IPv4 Unicast), 2/1 (IPv6 1093 Unicast), 1/4 (IPv4 Labeled Unicast), 2/4 (IPv6 Labeled Unicast), 1094 1/128 (VPN-IPv4 Labeled Unicast), 2/128 (VPN-IPv6 Labeled Unicast), 1095 or 25/70 (Ethernet VPN, usually known as EVPN). Use of the Tunnel 1096 Encapsulation attribute in BGP UPDATE messages of other AFI/SAFIs is 1097 outside the scope of this document. 1099 It has been suggested that it may sometimes be useful to attach a 1100 Tunnel Encapsulation attribute to a BGP UPDATE message that is also 1101 carrying a PMSI (Provider Multicast Service Interface) Tunnel 1102 attribute [RFC6514]. If the PMSI Tunnel attribute specifies an IP 1103 tunnel, the Tunnel Encapsulation attribute could be used to provide 1104 additional information about the IP tunnel. The usage of the Tunnel 1105 Encapsulation attribute in combination with the PMSI Tunnel attribute 1106 is outside the scope of this document.
1108 The decision to attach a Tunnel Encapsulation attribute to a given 1109 BGP UPDATE is determined by policy. The set of TLVs and sub-TLVs 1110 contained in the attribute is also determined by policy. 1112 When the Tunnel Encapsulation attribute is carried in an UPDATE of 1113 one of the AFI/SAFIs specified above, each TLV 1114 MUST have a Remote Endpoint sub-TLV. If a TLV does not have a 1115 Remote Endpoint sub-TLV, that TLV should be treated as if it had a 1116 malformed Remote Endpoint sub-TLV (see Section 3.1). 1118 Suppose that: 1120 o a given packet P must be forwarded by router R; 1122 o the path along which P is to be forwarded is determined by BGP 1123 UPDATE U; 1125 o UPDATE U has a Tunnel Encapsulation attribute, containing at least 1126 one TLV that identifies a "feasible tunnel" for packet P. A 1127 tunnel is considered feasible if it has the following three 1128 properties: 1130 * The tunnel type is supported (i.e., router R knows how to set 1131 up tunnels of that type, how to create the encapsulation header 1132 for tunnels of that type, etc.) 1134 * The tunnel is of a type that can be used to carry packet P 1135 (e.g., an MPLS-in-UDP tunnel would not be a feasible tunnel for 1136 carrying an IP packet, UNLESS the IP packet can first be 1137 converted to an MPLS packet). 1139 * The tunnel is specified in a TLV whose Remote Endpoint sub-TLV 1140 identifies an IP address that is reachable. 1142 Then router R SHOULD send packet P through one of the feasible 1143 tunnels identified in the Tunnel Encapsulation attribute of UPDATE U. 1145 If the Tunnel Encapsulation attribute contains several TLVs (i.e., if 1146 it specifies several tunnels), router R may choose any one of those 1147 tunnels, based upon local policy. If any of the tunnels' TLVs contain 1148 the Color sub-TLV (Section 3.4.2) and/or the Protocol Type sub-TLV 1149 (Section 3.4.1), the choice of tunnel may be influenced by these sub- 1150 TLVs.
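The feasibility test described in this section can be sketched as a filter over the attribute's TLVs. The Python below is a minimal sketch under stated assumptions: TLVs are modeled as plain dictionaries, and `supported_types`, `can_carry`, and `is_reachable` are hypothetical inputs standing in for router R's capabilities, payload compatibility rules, and reachability lookup. It also assumes the remote endpoint address has already been resolved. Choosing among the returned tunnels remains a matter of local policy.

```python
def feasible_tunnels(tlvs, payload_type, supported_types, can_carry, is_reachable):
    """Return the TLVs that identify a feasible tunnel for one packet.

    tlvs:            list of dicts with 'tunnel_type' and 'remote_endpoint'
    supported_types: set of tunnel types router R implements
    can_carry:       callable(tunnel_type, payload_type) -> bool
    is_reachable:    callable(address) -> bool
    """
    result = []
    for tlv in tlvs:
        if tlv["tunnel_type"] not in supported_types:
            continue  # tunnel type is not supported
        if not can_carry(tlv["tunnel_type"], payload_type):
            continue  # tunnel type cannot carry this packet
        if not is_reachable(tlv["remote_endpoint"]):
            continue  # Remote Endpoint is not reachable
        result.append(tlv)
    return result
```

A TLV that fails the filter is merely skipped; it is not treated as malformed.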
1152 Note that if none of the TLVs specifies the MPLS tunnel type, a Label 1153 Switched Path SHOULD NOT be used unless none of the TLVs specifies a 1154 feasible tunnel. 1156 If a particular tunnel is not feasible at some moment because its 1157 Remote Endpoint cannot be reached at that moment, the tunnel may 1158 become feasible at a later time. When this happens, router R SHOULD 1159 reconsider its choice of tunnel to use, and MAY choose to now use the 1160 tunnel. 1162 A TLV specifying a non-feasible tunnel is not considered to be 1163 malformed or erroneous in any way, and the TLV SHOULD NOT be stripped 1164 from the Tunnel Encapsulation attribute before redistribution. 1166 In addition to the sub-TLVs already defined, additional sub-TLVs may 1167 be defined that affect the choice of tunnel to be used, or that 1168 affect the contents of the tunnel encapsulation header. The 1169 documents that define any such additional sub-TLVs must specify the 1170 effect that including the sub-TLV is to have. 1172 If it is determined to send a packet through the tunnel specified in 1173 a particular TLV of a particular Tunnel Encapsulation attribute, then 1174 the tunnel's remote endpoint address is the IP address contained in 1175 that TLV's Remote Endpoint sub-TLV. If the TLV contains a Remote Endpoint sub-TLV whose 1176 value field is all zeroes, then the tunnel's remote endpoint is the 1177 IP address specified as the Next Hop of the BGP UPDATE containing the 1178 Tunnel Encapsulation attribute. 1180 The procedure for sending a packet through a particular tunnel type 1181 to a particular remote endpoint depends upon the tunnel type, and is 1182 outside the scope of this document. The contents of the tunnel 1183 encapsulation header MAY be influenced by the Encapsulation sub-TLV. 1185 Note that some tunnel types may require the execution of an explicit 1186 tunnel setup protocol before they can be used for carrying data. 1187 Other tunnel types may not require any tunnel setup protocol.
1188 Whenever a new Tunnel Type TLV is defined, the specification of that 1189 TLV must describe (or reference) the procedures for creating the 1190 encapsulation header used to forward packets through that tunnel 1191 type. 1193 If a Tunnel Encapsulation attribute specifies several tunnels, the 1194 way in which a router chooses which one to use is a matter of policy, 1195 subject to the following constraint: if a router can determine that a 1196 given tunnel is not functional, it MUST NOT use that tunnel. In 1197 particular, if the tunnel is identified in a TLV that has a Remote 1198 Endpoint sub-TLV, and if the IP address specified in the sub-TLV is 1199 not reachable from router R, then the tunnel SHOULD be considered 1200 non-functional. Other means of determining whether a given tunnel is 1201 functional MAY be used; specification of such means is outside the 1202 scope of this specification. Of course, if a non-functional tunnel 1203 later becomes functional, router R SHOULD reevaluate its choice of 1204 tunnels. 1206 If router R determines that it cannot use any of the tunnels 1207 specified in the Tunnel Encapsulation attribute, it MAY either drop 1208 packet P, or it MAY transmit packet P as it would had the Tunnel 1209 Encapsulation attribute not been present. This is a matter of local 1210 policy. By default, the packet SHOULD be transmitted as if the 1211 Tunnel Encapsulation attribute had not been present. 1213 A Tunnel Encapsulation attribute may contain several TLVs that all 1214 specify the same tunnel type. Each TLV should be considered as 1215 specifying a different tunnel. Two tunnels of the same type may have 1216 different Remote Endpoint sub-TLVs, different Encapsulation sub-TLVs, 1217 etc. Choosing between two such tunnels is a matter of local policy. 
1219 Once router R has decided to send packet P through a particular 1220 tunnel, it encapsulates packet P appropriately and then forwards it 1221 according to the route that leads to the tunnel's remote endpoint. 1222 This route may itself be a BGP route with a Tunnel Encapsulation 1223 attribute. If so, the encapsulated packet is treated as the payload 1224 and is encapsulated according to the Tunnel Encapsulation attribute 1225 of that route. That is, tunnels may be "stacked". 1227 Notwithstanding anything said in this document, a BGP speaker MAY 1228 have local policy that influences the choice of tunnel, and the way 1229 the encapsulation is formed. A BGP speaker MAY also have a local 1230 policy that tells it to ignore the Tunnel Encapsulation attribute 1231 entirely or in part. Of course, interoperability issues must be 1232 considered when such policies are put into place. 1234 6. Routing Considerations 1236 6.1. No Impact on BGP Decision Process 1238 The presence of the Tunnel Encapsulation attribute does not affect 1239 the BGP bestpath selection algorithm. 1241 Under certain circumstances, this may lead to counter-intuitive 1242 consequences. 
For example, suppose: 1244 o router R1 receives a BGP UPDATE message from router R2, such that 1246 * the NLRI of that UPDATE is prefix X, 1248 * the UPDATE contains a Tunnel Encapsulation attribute specifying 1249 two tunnels, T1 and T2, 1251 * R1 cannot use tunnel T1 or tunnel T2, either because the tunnel 1252 remote endpoint is not reachable or because R1 does not support 1253 that kind of tunnel 1255 o router R1 receives a BGP UPDATE message from router R3, such that 1257 * the NLRI of that UPDATE is prefix X, 1259 * the UPDATE contains a Tunnel Encapsulation attribute specifying 1260 two tunnels, T3 and T4, 1262 * R1 can use at least one of the two tunnels 1264 Since the Tunnel Encapsulation attribute does not affect bestpath 1265 selection, R1 may well install the route from R2 rather than the 1266 route from R3, even though R2's route contains no usable tunnels. 1268 This possibility must be kept in mind whenever a Remote Endpoint sub- 1269 TLV carried by a given UPDATE specifies an IP address that is 1270 different than the next hop of that UPDATE. 1272 6.2. Looping, Infinite Stacking, Etc. 1274 Consider a packet destined for address X. Suppose a BGP UPDATE for 1275 address prefix X carries a Tunnel Encapsulation attribute that 1276 specifies a remote tunnel endpoint of Y. And suppose that a BGP 1277 UPDATE for address prefix Y carries a Tunnel Encapsulation attribute 1278 that specifies a Remote Endpoint of X. It is easy to see that this 1279 will cause an infinite number of encapsulation headers to be put on 1280 the given packet. 1282 This could happen as a result of misconfiguration, either accidental 1283 or intentional. It could also happen if the Tunnel Encapsulation 1284 attribute were altered by a malicious agent. Implementations should 1285 be aware of this. 1287 Improper setting (or malicious altering) of the Tunnel Encapsulation 1288 attribute could also cause data packets to loop. 
Suppose a BGP 1289 UPDATE for address prefix X carries a Tunnel Encapsulation attribute 1290 that specifies a remote tunnel endpoint of Y. Suppose router R 1291 receives and processes the update. When router R receives a packet 1292 destined for X, it will apply the encapsulation and send the 1293 encapsulated packet to Y. Y will decapsulate the packet and forward 1294 it further. If Y is further away from X than is router R, it is 1295 possible that the path from Y to X will traverse R. This would cause 1296 a long-lasting routing loop. 1298 These possibilities must also be kept in mind whenever the Remote 1299 Endpoint for a given prefix differs from the BGP next hop for that 1300 prefix. 1302 7. Recursive Next Hop Resolution 1304 Suppose that: 1306 o a given packet P must be forwarded by router R1; 1308 o the path along which P is to be forwarded is determined by BGP 1309 UPDATE U1; 1311 o UPDATE U1 does not have a Tunnel Encapsulation attribute; 1313 o the next hop of UPDATE U1 is router R2; 1315 o the best path to router R2 is a BGP route that was advertised in 1316 UPDATE U2; 1318 o UPDATE U2 has a Tunnel Encapsulation attribute. 1320 Then packet P SHOULD be sent through one of the tunnels identified in 1321 the Tunnel Encapsulation attribute of UPDATE U2. See Section 5 for 1322 further details. 1324 However, suppose that one of the TLVs in U2's Tunnel Encapsulation 1325 attribute contains the Color Sub-TLV. In that case, packet P SHOULD 1326 NOT be sent through the tunnel identified in that TLV, unless U1 is 1327 carrying the Color Extended Community that is identified in U2's 1328 Color Sub-TLV. 1330 Note that if UPDATE U1 and UPDATE U2 both have Tunnel Encapsulation 1331 attributes, packet P will be carried through a pair of nested 1332 tunnels. P will first be encapsulated based on the Tunnel 1333 Encapsulation attribute of U1. 
This encapsulated packet then becomes 1334 the payload, and is encapsulated based on the Tunnel Encapsulation 1335 attribute of U2. This is another way of "stacking" tunnels (see also 1336 Section 5). 1338 The procedures in this section presuppose that U1's next hop resolves 1339 to a BGP route, and that U2's next hop resolves (perhaps after 1340 further recursion) to a non-BGP route. 1342 8. Use of Virtual Network Identifiers and Embedded Labels when Imposing 1343 a Tunnel Encapsulation 1345 If the TLV specifying a tunnel contains an MPLS Label Stack sub-TLV, 1346 then when sending a packet through that tunnel, the procedures of 1347 Section 3.6 are applied before the procedures of this section. 1349 If the TLV specifying a tunnel contains a Prefix-SID sub-TLV, the 1350 procedures of Section 3.7 are applied before the procedures of this 1351 section. If the TLV also contains an MPLS Label Stack sub-TLV, the 1352 procedures of Section 3.6 are applied before the procedures of 1353 Section 3.7. 1355 8.1. Tunnel Types without a Virtual Network Identifier Field 1357 If a Tunnel Encapsulation attribute is attached to an UPDATE of a 1358 labeled address family, there will be one or more labels specified in 1359 the UPDATE's NLRI. When a packet is sent through a tunnel specified 1360 in one of the attribute's TLVs, and that tunnel type does not contain 1361 a virtual network identifier field, the label or labels from the NLRI 1362 are pushed on the packet's label stack. The resulting MPLS packet is 1363 then further encapsulated, as specified by the TLV. 1365 8.2. Tunnel Types with a Virtual Network Identifier Field 1367 Three of the tunnel types that can be specified in a Tunnel 1368 Encapsulation TLV have virtual network identifier fields in their 1369 encapsulation headers.
In the VXLAN and VXLAN-GPE encapsulations, 1370 this field is called the VNI (Virtual Network Identifier) field; in 1371 the NVGRE encapsulation, this field is called the VSID (Virtual 1372 Subnet Identifier) field. 1374 When one of these tunnel encapsulations is imposed on a packet, the 1375 setting of the virtual network identifier field in the encapsulation 1376 header depends upon the contents of the Encapsulation sub-TLV (if one 1377 is present). When the Tunnel Encapsulation attribute is being 1378 carried on a BGP UPDATE of a labeled address family, the setting of 1379 the virtual network identifier field also depends upon the contents 1380 of the Embedded Label Handling sub-TLV (if present). 1382 This section specifies the procedures for choosing the value to set 1383 in the virtual network identifier field of the encapsulation header. 1384 These procedures apply only when the tunnel type is VXLAN, VXLAN-GPE, 1385 or NVGRE. 1387 8.2.1. Unlabeled Address Families 1389 This sub-section applies when: 1391 o the Tunnel Encapsulation attribute is carried on a BGP UPDATE of 1392 an unlabeled address family, and 1394 o at least one of the attribute's TLVs identifies a tunnel type that 1395 uses a virtual network identifier, and 1397 o it has been determined to send a packet through one of those 1398 tunnels. 1400 If the TLV identifying the tunnel contains an Encapsulation sub-TLV 1401 whose V bit is set, the virtual network identifier field of the 1402 encapsulation header is set to the value of the virtual network 1403 identifier field of the Encapsulation sub-TLV. 1405 Otherwise, the virtual network identifier field of the encapsulation 1406 header is set to a configured value; if there is no configured value, 1407 the tunnel cannot be used. 1409 8.2.2. 
Labeled Address Families 1411 This sub-section applies when: 1413 o the Tunnel Encapsulation attribute is carried on a BGP UPDATE of a 1414 labeled address family, and 1416 o at least one of the attribute's TLVs identifies a tunnel type that 1417 uses a virtual network identifier, and 1419 o it has been determined to send a packet through one of those 1420 tunnels. 1422 8.2.2.1. When a Valid VNI has been Signaled 1424 If the TLV identifying the tunnel contains an Encapsulation sub-TLV 1425 whose V bit is set, the virtual network identifier field of the 1426 encapsulation header is set as follows: 1428 o If the TLV contains an Embedded Label Handling sub-TLV whose value 1429 is 1, then the virtual network identifier field of the 1430 encapsulation header is set to the value of the virtual network 1431 identifier field of the Encapsulation sub-TLV. 1433 The embedded label (from the NLRI of the route that is carrying 1434 the Tunnel Encapsulation attribute) appears at the top of the MPLS 1435 label stack in the encapsulation payload. 1437 o If the TLV does not contain an Embedded Label Handling sub-TLV, or 1438 if it contains an Embedded Label Handling sub-TLV whose value is 2, 1439 the embedded label is ignored entirely, and the virtual network 1440 identifier field of the encapsulation header is set to the value 1441 of the virtual network identifier field of the Encapsulation sub- 1442 TLV. 1444 8.2.2.2. When a Valid VNI has not been Signaled 1446 If the TLV identifying the tunnel does not contain an Encapsulation 1447 sub-TLV whose V bit is set, the virtual network identifier field of 1448 the encapsulation header is set as follows: 1450 o If the TLV contains an Embedded Label Handling sub-TLV whose value 1451 is 1, then the virtual network identifier field of the 1452 encapsulation header is set to a configured value. 1454 If there is no configured value, the tunnel cannot be used.
1456 The embedded label (from the NLRI of the route that is carrying 1457 the Tunnel Encapsulation attribute) appears at the top of the MPLS 1458 label stack in the encapsulation payload. 1460 o If the TLV does not contain an Embedded Label Handling sub-TLV, or 1461 if it contains an Embedded Label Handling sub-TLV whose value is 1462 2, the embedded label is copied into the virtual network 1463 identifier field of the encapsulation header. 1465 In this case, the payload may or may not contain an MPLS label 1466 stack, depending upon other factors. If the payload does contain 1467 an MPLS label stack, the embedded label does not appear in that 1468 stack. 1470 9. Applicability Restrictions 1472 In a given UPDATE of a labeled address family, the label embedded in 1473 the NLRI is generally a label that is meaningful only to the router 1474 whose address appears as the next hop. Certain of the procedures of 1475 Section 8.2.2.1 or Section 8.2.2.2 cause the embedded label to be 1476 carried by a data packet to the router whose address appears in the 1477 Remote Endpoint sub-TLV. If the Remote Endpoint sub-TLV does not 1478 identify the same router that is the next hop, sending the packet 1479 through the tunnel may cause the label to be misinterpreted at the 1480 tunnel's remote endpoint. This may cause misdelivery of the packet. 1482 Therefore the embedded label MUST NOT be carried by a data packet 1483 traveling through a tunnel unless it is known that the label will be 1484 properly interpreted at the tunnel's remote endpoint. How this is 1485 known is outside the scope of this document. 1487 Note that if the Tunnel Encapsulation attribute is attached to a VPN- 1488 IP route [RFC4364], and if Inter-AS "option b" (see section 10 of 1489 [RFC4364]) is being used, and if the Remote Endpoint sub-TLV contains 1490 an IP address that is not in the same AS as the router receiving the 1491 route, it is very likely that the embedded label has been changed.
1492 Therefore use of the Tunnel Encapsulation attribute in an "Inter-AS 1493 option b" scenario is not supported. 1495 10. Scoping 1497 The Tunnel Encapsulation attribute is defined as a transitive 1498 attribute, so that it may be passed along by BGP speakers that do not 1499 recognize it. However, it is intended that the Tunnel Encapsulation 1500 attribute be used only within a well-defined scope, e.g., within a 1501 set of Autonomous Systems that belong to a single administrative 1502 entity. If the attribute is distributed beyond its intended scope, 1503 packets may be sent through tunnels in a manner that is not intended. 1505 To prevent the Tunnel Encapsulation attribute from being distributed 1506 beyond its intended scope, any BGP speaker that understands the 1507 attribute MUST be able to filter the attribute from incoming BGP 1508 UPDATE messages. When the attribute is filtered from an incoming 1509 UPDATE, the attribute is neither processed nor redistributed. This 1510 filtering SHOULD be possible on a per-BGP-session basis. For each 1511 session, filtering of the attribute on incoming UPDATEs MUST be 1512 enabled by default. 1514 In addition, any BGP speaker that understands the attribute MUST be 1515 able to filter the attribute from outgoing BGP UPDATE messages. This 1516 filtering SHOULD be possible on a per-BGP-session basis. For each 1517 session, filtering of the attribute on outgoing UPDATEs MUST be 1518 enabled by default. 1520 11. Error Handling 1522 The Tunnel Encapsulation attribute is a sequence of TLVs, each of 1523 which is a sequence of sub-TLVs. The final octet of a TLV is 1524 determined by its length field. Similarly, the final octet of a sub- 1525 TLV is determined by its length field. The final octet of a TLV MUST 1526 also be the final octet of its final sub-TLV. If this is not the 1527 case, the TLV MUST be considered to be malformed. 
A TLV that is 1528 found to be malformed for this reason MUST NOT be processed, and MUST 1529 be stripped from the Tunnel Encapsulation attribute before the 1530 attribute is propagated. Subsequent TLVs in the Tunnel Encapsulation 1531 attribute may still be valid, in which case they MUST be processed 1532 and redistributed normally. 1534 If a Tunnel Encapsulation attribute does not have any valid TLVs, or 1535 it does not have the transitive bit set, the "Attribute Discard" 1536 procedure of [RFC7606] is applied. 1538 If a Tunnel Encapsulation attribute can be parsed correctly, but 1539 contains a TLV whose tunnel type is not recognized by a particular 1540 BGP speaker, that BGP speaker MUST NOT consider the attribute to be 1541 malformed. Rather, the TLV with the unrecognized tunnel type MUST be 1542 ignored, and the BGP speaker MUST interpret the attribute as if that 1543 TLV had not been present. If the route carrying the Tunnel 1544 Encapsulation attribute is propagated with the attribute, the 1545 unrecognized TLV SHOULD remain in the attribute. 1547 If a TLV of a Tunnel Encapsulation attribute contains a sub-TLV that 1548 is not recognized by a particular BGP speaker, the BGP speaker SHOULD 1549 process that TLV as if the unrecognized sub-TLV had not been present. 1550 If the route carrying the Tunnel Encapsulation attribute is 1551 propagated with the attribute, the unrecognized sub-TLV SHOULD remain in 1552 the attribute. 1554 In general, if a TLV contains a sub-TLV that is malformed (e.g., 1555 contains a length field whose value is not legal for that sub-TLV), 1556 the sub-TLV should be treated as if it were an unrecognized sub-TLV. 1557 This document specifies one exception to this rule -- if a TLV 1558 contains a malformed Remote Endpoint sub-TLV (as defined in 1559 Section 3.1), the entire TLV MUST be ignored, and SHOULD be removed 1560 from the Tunnel Encapsulation attribute before the route carrying 1561 that attribute is redistributed.
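The length-consistency rule at the start of this section can be illustrated with a short, non-normative Python sketch. It is not part of the specification. It assumes the encoding described elsewhere in this document: each TLV begins with a two-octet tunnel type and a two-octet length, and each sub-TLV begins with a one-octet type followed by a one-octet length (types below 128) or a two-octet length (types 128 and above). The function names parse_tunnel_encap and split_sub_tlvs are hypothetical, and stopping when a TLV's length runs past the end of the attribute is an assumption of this sketch, as is the treatment of the reserved sub-TLV types 0 and 255.

```python
import struct


def split_sub_tlvs(value: bytes):
    """Split a TLV value into (sub_type, sub_value) pairs.

    Returns None when the sub-TLVs do not end exactly on the final
    octet of the enclosing TLV, i.e. the TLV is malformed.
    """
    subs = []
    pos = 0
    while pos < len(value):
        sub_type = value[pos]
        pos += 1
        # Assumption: reserved types 0 and 255 simply follow the same
        # length-width rule as their neighbours.
        width = 1 if sub_type < 128 else 2
        if pos + width > len(value):
            return None  # length field itself is truncated
        sub_len = int.from_bytes(value[pos:pos + width], "big")
        pos += width
        if pos + sub_len > len(value):
            return None  # sub-TLV runs past the end of the TLV
        subs.append((sub_type, value[pos:pos + sub_len]))
        pos += sub_len
    return subs


def parse_tunnel_encap(attr_value: bytes):
    """Return the well-formed TLVs as (tunnel_type, sub_tlvs) pairs."""
    valid = []
    pos = 0
    while pos + 4 <= len(attr_value):
        tunnel_type, tlv_len = struct.unpack_from("!HH", attr_value, pos)
        pos += 4
        if pos + tlv_len > len(attr_value):
            break  # assumption: nothing further can be parsed reliably
        subs = split_sub_tlvs(attr_value[pos:pos + tlv_len])
        pos += tlv_len
        if subs is not None:
            valid.append((tunnel_type, subs))
        # A malformed TLV is skipped; later TLVs are still examined.
    return valid
```

A TLV whose sub-TLVs do not end on the TLV's final octet is dropped, while subsequent TLVs are still returned. The sketch checks only length consistency; it does not check for the presence or validity of the Remote Endpoint sub-TLV, and it does not decide whether the "Attribute Discard" procedure applies.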
1563 A TLV that does not contain the Remote Endpoint sub-TLV MUST be 1564 treated as if it contained a malformed Remote Endpoint sub-TLV. 1566 A TLV identifying a particular tunnel type may contain a sub-TLV that 1567 is meaningless for that tunnel type. For example, perhaps the TLV 1568 contains a "UDP Destination Port" sub-TLV, but the identified tunnel 1569 type does not use UDP encapsulation at all. Sub-TLVs of this sort 1570 SHOULD be treated as no-ops. That is, they SHOULD NOT affect the 1571 creation of the encapsulation header. However, the sub-TLV MUST NOT 1572 be considered to be malformed, and MUST NOT be removed from the TLV 1573 before the route carrying the Tunnel Encapsulation attribute is 1574 redistributed. (This allows for the possibility that such sub-TLVs 1575 may be given a meaning, in the context of the specified tunnel type, 1576 in the future.) 1578 There is no significance to the order in which the TLVs occur within 1579 the Tunnel Encapsulation attribute. Multiple TLVs may occur for a 1580 given tunnel type; each such TLV is regarded as describing a 1581 different tunnel. 1583 12. IANA Considerations 1585 12.1. Subsequent Address Family Identifiers 1587 IANA is requested to modify the "Subsequent Address Family 1588 Identifiers" registry to indicate that the Encapsulation SAFI is 1589 deprecated. This document should be the reference. 1591 12.2. BGP Path Attributes 1593 IANA has assigned value 23 from the "BGP Path Attributes" Registry, 1594 to "Tunnel Encapsulation Attribute". IANA is requested to add this 1595 document as a reference. 1597 12.3. Extended Communities 1599 IANA has assigned values from the "Transitive Opaque Extended 1600 Community" type Registry to the "Color Extended Community" (sub-type 1601 0x0b), and to the "Encapsulation Extended Community" (0x030c). IANA 1602 is requested to add this document as a reference for both 1603 assignments. 1605 12.4.
BGP Tunnel Encapsulation Attribute Sub-TLVs 1607 IANA is requested to add the following note to the "BGP Tunnel 1608 Encapsulation Attribute Sub-TLVs" registry: 1610 If the Sub-TLV Type is in the range from 1 to 127 inclusive, the 1611 Sub-TLV Length field contains one octet. If the Sub-TLV Type is 1612 in the range from 128 to 254 inclusive, the Sub-TLV Length field 1613 contains two octets. 1615 IANA is requested to change the registration policy of the "BGP 1616 Tunnel Encapsulation Attribute Sub-TLVs" registry to the following: 1618 o The values 0 and 255 are reserved. 1620 o The values in the ranges 1-63 and 128-191 are to be allocated using 1621 the "Standards Action" registration procedure. 1623 o The values in the ranges 64-125 and 192-252 are to be allocated 1624 using the "First Come, First Served" registration procedure. 1626 o The values in the ranges 126-127 and 253-254 are reserved for 1627 experimental use; IANA shall not allocate values from these ranges. 1629 IANA is requested to assign a codepoint, from the range 1-63 of the 1630 "BGP Tunnel Encapsulation Attribute Sub-TLVs" registry, for "Remote 1631 Endpoint", with this document being the reference. 1633 IANA is requested to assign a codepoint, from the range 1-63 of the 1634 "BGP Tunnel Encapsulation Attribute Sub-TLVs" registry, for "IPv4 DS 1635 Field", with this document being the reference. 1637 IANA is requested to assign a codepoint from the "BGP Tunnel 1638 Encapsulation Attribute Sub-TLVs" registry for "UDP Destination 1639 Port", with this document being the reference. 1641 IANA is requested to assign a codepoint, from the range 1-63 of the 1642 "BGP Tunnel Encapsulation Attribute Sub-TLVs" registry, for "Embedded 1643 Label Handling", with this document being the reference. 1645 IANA is requested to assign a codepoint, from the range 1-63 of the 1646 "BGP Tunnel Encapsulation Attribute Sub-TLVs" registry, for "MPLS 1647 Label Stack", with this document being the reference.
1649 IANA is requested to assign a codepoint, from the range 1-63 of the 1650 "BGP Tunnel Encapsulation Attribute Sub-TLVs" registry, for "Prefix 1651 SID", with this document being the reference. 1653 IANA has assigned codepoints from the "BGP Tunnel Encapsulation 1654 Attribute Sub-TLVs" registry for "Encapsulation", "Protocol Type", 1655 and "Color". IANA is requested to add this document as a reference. 1657 12.5. Tunnel Types 1659 IANA is requested to add this document as a reference for tunnel 1660 types 8 (VXLAN), 9 (NVGRE), 11 (MPLS-in-GRE), and 12 (VXLAN-GPE) in 1661 the "BGP Tunnel Encapsulation Tunnel Types" registry. 1663 IANA is requested to assign a codepoint from the "BGP Tunnel 1664 Encapsulation Tunnel Types" registry for "GTP". 1666 IANA is requested to add this document as a reference for tunnel 1667 types 1 (L2TPv3), 2 (GRE), and 7 (IP in IP) in the "BGP Tunnel 1668 Encapsulation Tunnel Types" registry. 1670 13. Security Considerations 1672 The Tunnel Encapsulation attribute can cause traffic to be diverted 1673 from its normal path, especially when the Remote Endpoint sub-TLV is 1674 used. This can have serious consequences if the attribute is added 1675 or modified illegitimately, as it enables traffic to be "hijacked". 1677 The Remote Endpoint sub-TLV contains both an IP address and an AS 1678 number. BGP Origin Validation [RFC6811] can be used to obtain 1679 assurance that the given IP address belongs to the given AS. While 1680 this provides some protection against misconfiguration, it does not 1681 prevent a malicious agent from inserting a sub-TLV that will appear 1682 valid. 
1684 Before sending a packet through the tunnel identified in a particular 1685 TLV of a Tunnel Encapsulation attribute, it may be advisable to use 1686 BGP Origin Validation to obtain the following additional assurances: 1688 o the origin AS of the route carrying the Tunnel Encapsulation 1689 attribute is correct; 1691 o the origin AS of the route to the IP address specified in the 1692 Remote Endpoint sub-TLV is correct, and is the same AS that is 1693 specified in the Remote Endpoint sub-TLV. 1695 One then has some level of assurance that the tunneled traffic is 1696 going to the same destination AS that it would have gone to had the 1697 Tunnel Encapsulation attribute not been present. However, this may 1698 not suit all use cases, and in any event is not very strong 1699 protection against hijacking. 1701 For these reasons, BGP Origin Validation should not be relied upon 1702 exclusively, and the filtering procedures of Section 10 should always 1703 be in place. 1705 Increased protection can be obtained by using BGP Path Validation 1706 [BGPSEC] to ensure that the route carrying the Tunnel Encapsulation 1707 attribute, and the routes to the Remote Endpoint of each specified 1708 tunnel, have not been altered illegitimately. 1710 If BGP Origin Validation is used as specified above, and the tunnel 1711 specified in a particular TLV of a Tunnel Encapsulation attribute is 1712 therefore regarded as "suspicious", that tunnel should not be used. 1713 Other tunnels specified in (other TLVs of) the Tunnel Encapsulation 1714 attribute may still be used. 1716 14. Acknowledgments 1718 This document contains text from RFC5512, co-authored by Pradosh 1719 Mohapatra. The authors of the current document wish to thank Pradosh 1720 for his contribution. RFC5512 itself built upon prior work by Gargi 1721 Nalawade, Ruchi Kapoor, Dan Tappan, David Ward, Scott Wainner, Simon 1722 Barber, and Chris Metz, whom we also thank for their contributions. 
1724 The authors wish to thank Lou Berger, Ron Bonica, John Drake, Satoru 1725 Matsushima, Dhananjaya Rao, John Scudder, Ravi Singh, Thomas Morin, 1726 Xiaohu Xu, and Zhaohui Zhang for their review, comments, and/or 1727 helpful discussions. 1729 15. Contributor Addresses 1731 Below is a list of other contributing authors in alphabetical order: 1733 Randy Bush 1734 Internet Initiative Japan 1735 5147 Crystal Springs 1736 Bainbridge Island, Washington 98110 1737 United States 1739 Email: randy@psg.com 1741 Robert Raszuk 1742 Bloomberg LP 1743 731 Lexington Ave 1744 New York City, NY 10022 1745 United States 1747 Email: robert@raszuk.net 1749 16. References 1751 16.1. Normative References 1753 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 1754 Requirement Levels", BCP 14, RFC 2119, 1755 DOI 10.17487/RFC2119, March 1997, 1756 . 1758 [RFC5512] Mohapatra, P. and E. Rosen, "The BGP Encapsulation 1759 Subsequent Address Family Identifier (SAFI) and the BGP 1760 Tunnel Encapsulation Attribute", RFC 5512, 1761 DOI 10.17487/RFC5512, April 2009, 1762 . 1764 [RFC7606] Chen, E., Ed., Scudder, J., Ed., Mohapatra, P., and K. 1765 Patel, "Revised Error Handling for BGP UPDATE Messages", 1766 RFC 7606, DOI 10.17487/RFC7606, August 2015, 1767 . 1769 16.2. Informative References 1771 [BGPSEC] Lepinski, M. and S. Turner, "An Overview of BGPsec", 1772 internet-draft draft-ietf-sidr-bgpsec-overview-08, June 1773 2016. 1775 [Ethertypes] 1776 "IANA Ethertype Registry", 1777 . 1780 [EVPN-Inter-Subnet] 1781 Sajassi, A., Salem, S., Thoria, S., Drake, J., Rabadan, 1782 J., and L. Yong, "Integrated Routing and Bridging in 1783 EVPN", internet-draft draft-ietf-bess-evpn-inter-subnet- 1784 forwarding-03, February 2017. 1786 [GTP-U] 3GPP, "GPRS Tunneling Protocol User Plane, TS 29.281", 1787 2014. 1789 [Prefix-SID-Attribute] 1790 Previdi, S., Filsfils, C., Lindem, A., Patel, K., 1791 Sreekantiah, A., Ray, S., and H. 
Gredler, "Segment Routing 1792 Prefix SID extensions for BGP", internet-draft draft-ietf- 1793 idr-bgp-prefix-sid-05, April 2016. 1795 [RFC2474] Nichols, K., Blake, S., Baker, F., and D. Black, 1796 "Definition of the Differentiated Services Field (DS 1797 Field) in the IPv4 and IPv6 Headers", RFC 2474, 1798 DOI 10.17487/RFC2474, December 1998, 1799 . 1801 [RFC2784] Farinacci, D., Li, T., Hanks, S., Meyer, D., and P. 1802 Traina, "Generic Routing Encapsulation (GRE)", RFC 2784, 1803 DOI 10.17487/RFC2784, March 2000, 1804 . 1806 [RFC2890] Dommety, G., "Key and Sequence Number Extensions to GRE", 1807 RFC 2890, DOI 10.17487/RFC2890, September 2000, 1808 . 1810 [RFC3032] Rosen, E., Tappan, D., Fedorkow, G., Rekhter, Y., 1811 Farinacci, D., Li, T., and A. Conta, "MPLS Label Stack 1812 Encoding", RFC 3032, DOI 10.17487/RFC3032, January 2001, 1813 . 1815 [RFC3931] Lau, J., Ed., Townsley, M., Ed., and I. Goyret, Ed., 1816 "Layer Two Tunneling Protocol - Version 3 (L2TPv3)", 1817 RFC 3931, DOI 10.17487/RFC3931, March 2005, 1818 . 1820 [RFC4023] Worster, T., Rekhter, Y., and E. Rosen, Ed., 1821 "Encapsulating MPLS in IP or Generic Routing Encapsulation 1822 (GRE)", RFC 4023, DOI 10.17487/RFC4023, March 2005, 1823 . 1825 [RFC4364] Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private 1826 Networks (VPNs)", RFC 4364, DOI 10.17487/RFC4364, February 1827 2006, . 1829 [RFC5462] Andersson, L. and R. Asati, "Multiprotocol Label Switching 1830 (MPLS) Label Stack Entry: "EXP" Field Renamed to "Traffic 1831 Class" Field", RFC 5462, DOI 10.17487/RFC5462, February 1832 2009, . 1834 [RFC5566] Berger, L., White, R., and E. Rosen, "BGP IPsec Tunnel 1835 Encapsulation Attribute", RFC 5566, DOI 10.17487/RFC5566, 1836 June 2009, . 1838 [RFC6514] Aggarwal, R., Rosen, E., Morin, T., and Y. Rekhter, "BGP 1839 Encodings and Procedures for Multicast in MPLS/BGP IP 1840 VPNs", RFC 6514, DOI 10.17487/RFC6514, February 2012, 1841 . 
1843 [RFC6811] Mohapatra, P., Scudder, J., Ward, D., Bush, R., and R. 1844 Austein, "BGP Prefix Origin Validation", RFC 6811, 1845 DOI 10.17487/RFC6811, January 2013, 1846 . 1848 [RFC7348] Mahalingam, M., Dutt, D., Duda, K., Agarwal, P., Kreeger, 1849 L., Sridhar, T., Bursell, M., and C. Wright, "Virtual 1850 eXtensible Local Area Network (VXLAN): A Framework for 1851 Overlaying Virtualized Layer 2 Networks over Layer 3 1852 Networks", RFC 7348, DOI 10.17487/RFC7348, August 2014, 1853 . 1855 [RFC7510] Xu, X., Sheth, N., Yong, L., Callon, R., and D. Black, 1856 "Encapsulating MPLS in UDP", RFC 7510, 1857 DOI 10.17487/RFC7510, April 2015, 1858 . 1860 [RFC7637] Garg, P., Ed. and Y. Wang, Ed., "NVGRE: Network 1861 Virtualization Using Generic Routing Encapsulation", 1862 RFC 7637, DOI 10.17487/RFC7637, September 2015, 1863 . 1865 [vEPC] Matsushima, S. and R. Wakikawa, "Stateless User-Plane 1866 Architecture for Virtualized EPC", internet-draft draft- 1867 matsushima-stateless-uplane-vepc-06, March 2016. 1869 [VXLAN-GPE] 1870 Kreeger, L. and U. Elzur, "Generic Protocol Extension for 1871 VXLAN", internet-draft draft-ietf-nvo3-vxlan-gpe, October 1872 2016. 1874 Authors' Addresses 1876 Eric C. Rosen (editor) 1877 Juniper Networks, Inc. 1878 10 Technology Park Drive 1879 Westford, Massachusetts 01886 1880 United States 1882 Email: erosen@juniper.net 1884 Keyur Patel 1885 Arrcus 1887 Email: keyur@arrcus.com 1889 Gunter Van de Velde 1890 Nokia 1891 Copernicuslaan 50 1892 Antwerpen 2018 1893 Belgium 1895 Email: gunter.van_de_velde@nokia.com