IDR Working Group                                               K. Patel
Internet-Draft                                               Arrcus, Inc
Obsoletes: 5512 (if approved)                            G. Van de Velde
Intended status: Standards Track                                   Nokia
Expires: January 23, 2020                                      S. Sangli
                                                   Juniper Networks, Inc
                                                                E. Rosen
                                                           July 22, 2019


                 The BGP Tunnel Encapsulation Attribute
                  draft-ietf-idr-tunnel-encaps-13.txt

Abstract

   RFC 5512 defines a BGP Path Attribute known as the "Tunnel
   Encapsulation Attribute". This attribute allows one to specify a set
   of tunnels. For each such tunnel, the attribute can provide the
   information needed to create the tunnel and the corresponding
   encapsulation header. The attribute can also provide information
   that aids in choosing whether a particular packet is to be sent
   through a particular tunnel. RFC 5512 states that the attribute is
   only carried in BGP UPDATEs that have the "Encapsulation Subsequent
   Address Family (Encapsulation SAFI)". This document deprecates the
   Encapsulation SAFI (which has never been used in production), and
   specifies semantics for the attribute when it is carried in UPDATEs
   of certain other SAFIs. This document adds support for additional
   tunnel types, and allows a remote tunnel endpoint address to be
   specified for each tunnel. This document also provides support for
   specifying fields of any inner or outer encapsulations that may be
   used by a particular tunnel.

   This document obsoletes RFC 5512.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on January 23, 2020.

Copyright Notice

   Copyright (c) 2019 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Brief Summary of RFC 5512
     1.2.  Deficiencies in RFC 5512
     1.3.  Brief Summary of Changes from RFC 5512
     1.4.  Impact on RFC 5566
   2.  The Tunnel Encapsulation Attribute
   3.  Tunnel Encapsulation Attribute Sub-TLVs
     3.1.  The Tunnel Endpoint Sub-TLV
     3.2.  Encapsulation Sub-TLVs for Particular Tunnel Types
       3.2.1.  VXLAN
       3.2.2.  VXLAN-GPE
       3.2.3.  NVGRE
       3.2.4.  L2TPv3
       3.2.5.  GRE
       3.2.6.  MPLS-in-GRE
       3.2.7.  IP-in-IP
     3.3.  Outer Encapsulation Sub-TLVs
       3.3.1.  IPv4 DS Field
       3.3.2.  UDP Destination Port
     3.4.  Sub-TLVs for Aiding Tunnel Selection
       3.4.1.  Protocol Type Sub-TLV
       3.4.2.  Color Sub-TLV
     3.5.  Embedded Label Handling Sub-TLV
     3.6.  MPLS Label Stack Sub-TLV
     3.7.  Prefix-SID Sub-TLV
   4.  Extended Communities Related to the Tunnel Encapsulation
       Attribute
     4.1.  Encapsulation Extended Community
     4.2.  Router's MAC Extended Community
     4.3.  Color Extended Community
   5.  Semantics and Usage of the Tunnel Encapsulation attribute
   6.  Routing Considerations
     6.1.  Impact on BGP Decision Process
     6.2.  Looping, Infinite Stacking, Etc.
   7.  Recursive Next Hop Resolution
   8.  Use of Virtual Network Identifiers and Embedded Labels when
       Imposing a Tunnel Encapsulation
     8.1.  Tunnel Types without a Virtual Network Identifier Field
     8.2.  Tunnel Types with a Virtual Network Identifier Field
       8.2.1.  Unlabeled Address Families
       8.2.2.  Labeled Address Families
   9.  Applicability Restrictions
   10. Scoping
   11. Error Handling
   12. IANA Considerations
     12.1.  Subsequent Address Family Identifiers
     12.2.  BGP Path Attributes
     12.3.  Extended Communities
     12.4.  BGP Tunnel Encapsulation Attribute Sub-TLVs
     12.5.  Tunnel Types
     12.6.  Flags Field of Vxlan Encapsulation sub-TLV
     12.7.  Flags Field of Vxlan-GPE Encapsulation sub-TLV
     12.8.  Flags Field of NVGRE Encapsulation sub-TLV
     12.9.  Embedded Label Handling sub-TLV
   13. Security Considerations
   14. Acknowledgments
   15. Contributor Addresses
   16. References
     16.1.  Normative References
     16.2.  Informative References
   Authors' Addresses

1.  Introduction

   This document obsoletes RFC 5512. The deficiencies of RFC 5512, and
   a summary of the changes made, are discussed in Sections 1.1-1.3.
   The material from RFC 5512 that is retained has been incorporated
   into this document.

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in BCP
   14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

1.1.  Brief Summary of RFC 5512

   [RFC5512] defines a BGP Path Attribute known as the Tunnel
   Encapsulation attribute. This attribute consists of one or more
   TLVs.
   Each TLV identifies a particular type of tunnel. Each TLV also
   contains one or more sub-TLVs. Some of the sub-TLVs, e.g., the
   "Encapsulation sub-TLV", contain information that may be used to form
   the encapsulation header for the specified tunnel type. Other
   sub-TLVs, e.g., the "color sub-TLV" and the "protocol sub-TLV",
   contain information that aids in determining whether particular
   packets should be sent through the tunnel that the TLV identifies.

   [RFC5512] only allows the Tunnel Encapsulation attribute to be
   attached to BGP UPDATE messages of the Encapsulation Address Family.
   These UPDATE messages have an AFI (Address Family Identifier) of 1 or
   2, and a SAFI of 7. In an UPDATE of the Encapsulation SAFI, the NLRI
   (Network Layer Reachability Information) is an address of the BGP
   speaker originating the UPDATE. Consider the following scenario:

   o  BGP speaker R1 has received and installed UPDATE U;

   o  UPDATE U's SAFI is the Encapsulation SAFI;

   o  UPDATE U has the address R2 as its NLRI;

   o  UPDATE U has a Tunnel Encapsulation attribute;

   o  R1 has a packet, P, to transmit to destination D;

   o  R1's best path to D is a BGP route that has R2 as its next hop.

   In this scenario, when R1 transmits packet P, it should transmit it
   to R2 through one of the tunnels specified in U's Tunnel
   Encapsulation attribute. The IP address of the tunnel egress
   endpoint of each such tunnel is R2. Packet P is known as the
   tunnel's "payload".

1.2.  Deficiencies in RFC 5512

   While the ability to specify tunnel information in a BGP UPDATE is
   useful, the procedures of [RFC5512] have certain limitations:

   o  The requirement to use the "Encapsulation SAFI" presents an
      unfortunate operational cost, as each BGP session that may need to
      carry tunnel encapsulation information needs to be reconfigured to
      support the Encapsulation SAFI.
      The Encapsulation SAFI has never been used, and this requirement
      has served only to discourage the use of the Tunnel Encapsulation
      attribute.

   o  There is no way to use the Tunnel Encapsulation attribute to
      specify the tunnel egress endpoint address of a given tunnel;
      [RFC5512] assumes that the tunnel egress endpoint of each tunnel
      is specified as the NLRI of an UPDATE of the Encapsulation SAFI.

   o  If the respective best paths to two different address prefixes
      have the same next hop, [RFC5512] does not provide a
      straightforward method to associate each prefix with a different
      tunnel.

   o  If a particular tunnel type requires an outer IP or UDP
      encapsulation, there is no way to signal the values of any of the
      fields of the outer encapsulation.

   o  In [RFC5512]'s specification of the sub-TLVs, each sub-TLV has a
      one-octet length field. In some cases, a two-octet length field
      may be needed.

1.3.  Brief Summary of Changes from RFC 5512

   In this document we address these deficiencies by:

   o  Deprecating the Encapsulation SAFI.

   o  Defining a new "Tunnel Endpoint sub-TLV" that can be included in
      any of the TLVs contained in the Tunnel Encapsulation attribute.
      This sub-TLV can be used to specify the remote endpoint address of
      a particular tunnel.

   o  Allowing the Tunnel Encapsulation attribute to be carried by BGP
      UPDATEs of additional AFI/SAFIs. Appropriate semantics are
      provided for this way of using the attribute.

   o  Defining a number of new sub-TLVs that provide additional
      information that is useful when forming the encapsulation header
      used to send a packet through a particular tunnel.

   o  Defining the sub-TLV type field so that a sub-TLV whose type is in
      the range from 0 to 127 inclusive has a one-octet length field,
      but a sub-TLV whose type is in the range from 128 to 255 inclusive
      has a two-octet length field.
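
   As an illustration of the length rule in the last bullet above, the
   following Python sketch splits the value field of a TLV into its
   sub-TLVs. It is not part of the specification: the function name
   parse_sub_tlvs and the choice to raise ValueError on truncated input
   are assumptions made for the example, and the type codes used in the
   usage note are arbitrary.

```python
import struct


def parse_sub_tlvs(value: bytes) -> list[tuple[int, bytes]]:
    """Split a TLV value field into (sub-TLV type, sub-TLV value) pairs.

    Types 0-127 carry a one-octet length field; types 128-255 carry a
    two-octet length field (network byte order).
    """
    sub_tlvs = []
    offset = 0
    while offset < len(value):
        sub_type = value[offset]
        offset += 1
        # The width of the length field depends only on the type code.
        length_width = 1 if sub_type <= 127 else 2
        if offset + length_width > len(value):
            raise ValueError("truncated sub-TLV length field")
        if length_width == 1:
            length = value[offset]
        else:
            (length,) = struct.unpack_from("!H", value, offset)
        offset += length_width
        if offset + length > len(value):
            raise ValueError("truncated sub-TLV value field")
        sub_tlvs.append((sub_type, value[offset:offset + length]))
        offset += length
    return sub_tlvs
```

   For instance, parse_sub_tlvs(bytes([1, 2, 0xAA, 0xBB, 200, 0, 1,
   0xCC])) yields one sub-TLV of type 1 with a two-octet value and one
   sub-TLV of type 200 whose length was read from two octets.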

   One of the sub-TLVs defined in [RFC5512] is the "Encapsulation
   sub-TLV". For a given tunnel, the encapsulation sub-TLV specifies
   some of the information needed to construct the encapsulation header
   used when sending packets through that tunnel. This document defines
   encapsulation sub-TLVs for a number of tunnel types not discussed in
   [RFC5512]: VXLAN (Virtual Extensible Local Area Network, [RFC7348]),
   VXLAN-GPE (Generic Protocol Extension for VXLAN,
   [I-D.ietf-nvo3-vxlan-gpe]), NVGRE (Network Virtualization Using
   Generic Routing Encapsulation [RFC7637]), and MPLS-in-GRE (MPLS in
   Generic Routing Encapsulation [RFC2784], [RFC2890], [RFC4023]).
   MPLS-in-UDP [RFC7510] is also supported, but an Encapsulation sub-TLV
   for it is not needed.

   Some of the encapsulations mentioned in the previous paragraph need
   to be further encapsulated inside UDP and/or IP. [RFC5512] provides
   no way to specify that certain information is to appear in these
   outer IP and/or UDP encapsulations. This document provides a
   framework for including such information in the TLVs of the Tunnel
   Encapsulation attribute.

   When the Tunnel Encapsulation attribute is attached to a BGP UPDATE
   whose AFI/SAFI identifies one of the labeled address families, it is
   not always obvious whether the label embedded in the NLRI is to
   appear somewhere in the tunnel encapsulation header (and if so,
   where), or whether it is to appear in the payload, or whether it can
   be omitted altogether. This is especially true if the tunnel
   encapsulation header itself contains a "virtual network identifier".
   This document provides a mechanism that allows one to signal (by
   using sub-TLVs of the Tunnel Encapsulation attribute) how one wants
   to use the embedded label when the tunnel encapsulation has its own
   virtual network identifier field.

   [RFC5512] defines a Tunnel Encapsulation Extended Community that can
   be used instead of the Tunnel Encapsulation attribute under certain
   circumstances. This document addresses the issue of how to handle a
   BGP UPDATE that carries both a Tunnel Encapsulation attribute and one
   or more Tunnel Encapsulation Extended Communities.

1.4.  Impact on RFC 5566

   [RFC5566] uses the mechanisms defined in [RFC5512]. While this
   document obsoletes [RFC5512], it does not address the issue of how to
   use the mechanisms of [RFC5566] without also using the Encapsulation
   SAFI. Those issues are considered to be outside the scope of this
   document.

2.  The Tunnel Encapsulation Attribute

   The Tunnel Encapsulation attribute is an optional transitive BGP Path
   attribute. IANA has assigned the value 23 as the type code of the
   attribute. The attribute is composed of a set of Type-Length-Value
   (TLV) encodings. Each TLV contains information corresponding to a
   particular tunnel type. A TLV is structured as shown in Figure 1:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    Tunnel Type (2 Octets)     |        Length (2 Octets)      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   |                             Value                             |
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

            Figure 1: Tunnel Encapsulation TLV Value Field

   o  Tunnel Type (2 octets): identifies a type of tunnel. The field
      contains values from the IANA Registry "BGP Tunnel Encapsulation
      Attribute Tunnel Types".

      Note that for tunnel types whose names are of the form "X-in-Y",
      e.g., "MPLS-in-GRE", only packets of the specified payload type
      "X" are to be carried through the tunnel of type "Y".
      This is the equivalent of specifying a tunnel type "Y" and
      including in its TLV a Protocol Type sub-TLV (see Section 3.4.1)
      specifying protocol "X". If the tunnel type is "X-in-Y", it is
      unnecessary, though harmless, to include a Protocol Type sub-TLV
      specifying "X".

   o  Length (2 octets): the total number of octets of the value field.

   o  Value (variable): composed of multiple sub-TLVs.

   Each sub-TLV consists of three fields: a 1-octet type, a 1-octet or
   2-octet length field (depending on the type), and zero or more octets
   of value. A sub-TLV is structured as shown in Table 1:

                   +--------------------------------+
                   | Sub-TLV Type (1 Octet)         |
                   +--------------------------------+
                   | Sub-TLV Length (1 or 2 Octets) |
                   +--------------------------------+
                   | Sub-TLV Value (Variable)       |
                   +--------------------------------+

             Table 1: Tunnel Encapsulation Sub-TLV Format

   o  Sub-TLV Type (1 octet): each sub-TLV type defines a certain
      property about the tunnel TLV that contains this sub-TLV.

   o  Sub-TLV Length (1 or 2 octets): the total number of octets of the
      sub-TLV value field. The Sub-TLV Length field contains 1 octet if
      the Sub-TLV Type field contains a value in the range from 0-127.
      The Sub-TLV Length field contains two octets if the Sub-TLV Type
      field contains a value in the range from 128-255.

   o  Sub-TLV Value (variable): the encoding of the value field depends
      on the sub-TLV type. The following sections define the encodings
      in detail.

3.  Tunnel Encapsulation Attribute Sub-TLVs

   In this section, we specify a number of sub-TLVs. These sub-TLVs can
   be included in a TLV of the Tunnel Encapsulation attribute.

3.1.  The Tunnel Endpoint Sub-TLV

   The Tunnel Endpoint sub-TLV specifies the address of the endpoint of
   the tunnel, that is, the address of the router that will decapsulate
   the payload.
   It is a sub-TLV whose value field contains three sub-fields:

   1.  a four-octet Autonomous System (AS) number sub-field

   2.  a two-octet Address Family sub-field

   3.  an address sub-field, whose length depends upon the Address
       Family.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                   Autonomous System Number                    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |        Address Family         |           Address             ~
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+                               +
   ~                                                               ~
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

             Figure 2: Tunnel Endpoint Sub-TLV Value Field

   The Address Family subfield contains a value from IANA's "Address
   Family Numbers" registry. In this document, we assume that the
   Address Family is either IPv4 or IPv6; use of other address families
   is outside the scope of this document.

   If the Address Family subfield contains the value for IPv4, the
   address subfield must contain an IPv4 address (a /32 IPv4 prefix).
   In this case, the length field of Tunnel Endpoint sub-TLV must
   contain the value 10 (0xa).

   If the Address Family subfield contains the value for IPv6, the
   address sub-field must contain an IPv6 address (a /128 IPv6 prefix).
   In this case, the length field of Tunnel Endpoint sub-TLV must
   contain the value 22 (0x16). IPv6 link local addresses are not valid
   values of the IP address field.

   In a given BGP UPDATE, the address family (IPv4 or IPv6) of a Tunnel
   Endpoint sub-TLV is independent of the address family of the UPDATE
   itself. For example, an UPDATE whose NLRI is an IPv4 address may
   have a Tunnel Encapsulation attribute containing Tunnel Endpoint
   sub-TLVs that contain IPv6 addresses. Also, different tunnels
   represented in the Tunnel Encapsulation attribute may have Tunnel
   Endpoints of different address families.
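
   The value field layout of Figure 2 can be sketched in Python as
   follows. This is an illustration only, under stated assumptions: the
   function name encode_tunnel_endpoint is hypothetical, the constants
   1 and 2 are the IPv4 and IPv6 entries of IANA's "Address Family
   Numbers" registry, and the AS number and addresses in the usage note
   are documentation values chosen for the example.

```python
import ipaddress
import struct

# IANA "Address Family Numbers" registry values for IP and IP6.
AFI_IPV4 = 1
AFI_IPV6 = 2


def encode_tunnel_endpoint(as_number: int, address: str) -> bytes:
    """Build the value field of a Tunnel Endpoint sub-TLV.

    Layout: 4-octet AS number, 2-octet Address Family, then the
    address (4 octets for IPv4, 16 octets for IPv6).
    """
    ip = ipaddress.ip_address(address)
    afi = AFI_IPV4 if ip.version == 4 else AFI_IPV6
    # "!IH" packs a 32-bit and a 16-bit unsigned integer in network
    # byte order; ip.packed is the address in network byte order.
    return struct.pack("!IH", as_number, afi) + ip.packed
```

   With these definitions, encode_tunnel_endpoint(64496, "192.0.2.1")
   is 10 octets long and encode_tunnel_endpoint(64496, "2001:db8::1")
   is 22 octets long, matching the length values given in the text.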

   A two-octet AS number can be carried in the AS number field by
   setting the two high order octets to zero, and carrying the number in
   the two low order octets of the field.

   The AS number in the sub-TLV MUST be the number of the AS to which
   the IP address in the sub-TLV belongs.

   There is one special case: the Tunnel Endpoint sub-TLV MAY have a
   value field whose Address Family subfield contains 0. This means
   that the tunnel's egress endpoint is the UPDATE's BGP next hop. If
   the Address Family subfield contains 0, the Address subfield is
   omitted, and the Autonomous System number field is set to 0.

   If any of the following conditions hold, the Tunnel Endpoint sub-TLV
   is considered to be "malformed":

   o  The sub-TLV contains the value for IPv4 in its Address Family
      subfield, but the length of the sub-TLV's value field is other
      than 10 (0xa).

   o  The sub-TLV contains the value for IPv6 in its Address Family
      subfield, but the length of the sub-TLV's value field is other
      than 22 (0x16).

   o  The sub-TLV contains the value zero in its Address Family field,
      but the length of the sub-TLV's value field is other than 6, or
      the Autonomous System subfield is not set to zero.

   o  The IP address in the sub-TLV's address subfield is not a valid IP
      address (e.g., it's an IPv4 broadcast address).

   o  It can be determined that the IP address in the sub-TLV's address
      subfield does not belong to the non-zero AS whose number is in its
      Autonomous System subfield. (See Section 13 for discussion of one
      way to determine this.)

   If the Tunnel Endpoint sub-TLV is malformed, the TLV containing it is
   also considered to be malformed, and the entire TLV MUST be ignored.
   However, the Tunnel Encapsulation attribute MUST NOT be considered to
   be malformed in this case; other TLVs in the attribute MUST be
   processed (if they can be parsed correctly).
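
   The "malformed" conditions listed above might be checked along the
   following lines. This Python sketch is illustrative rather than
   normative, and several points are assumptions: the function name is
   hypothetical; only the two invalid-address examples named in this
   section (the IPv4 broadcast address and IPv6 link local addresses)
   are tested; address families other than 0, IPv4, and IPv6 are
   reported as malformed even though the text leaves them out of scope;
   and the AS ownership check is omitted because it needs information
   from outside the sub-TLV.

```python
import ipaddress
import struct

AFI_NEXT_HOP = 0  # special case: egress endpoint is the BGP next hop
AFI_IPV4 = 1      # IANA "Address Family Numbers" registry
AFI_IPV6 = 2


def tunnel_endpoint_is_malformed(value: bytes) -> bool:
    """Return True if a Tunnel Endpoint sub-TLV value is malformed."""
    if len(value) < 6:
        # Too short to hold the AS number and Address Family subfields.
        return True
    as_number, afi = struct.unpack_from("!IH", value, 0)
    address = value[6:]
    if afi == AFI_NEXT_HOP:
        # Address subfield omitted and AS number set to zero.
        return len(value) != 6 or as_number != 0
    if afi == AFI_IPV4:
        if len(value) != 10:
            return True
        broadcast = ipaddress.IPv4Address("255.255.255.255")
        return ipaddress.IPv4Address(address) == broadcast
    if afi == AFI_IPV6:
        if len(value) != 22:
            return True
        return ipaddress.IPv6Address(address).is_link_local
    # Assumption: other address families are rejected by this sketch.
    return True
```

   A receiver following the rule above would drop only the TLV whose
   Tunnel Endpoint sub-TLV fails this check and continue processing the
   remaining TLVs of the attribute.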

   When redistributing a route that is carrying a Tunnel Encapsulation
   attribute containing a TLV that itself contains a malformed Tunnel
   Endpoint sub-TLV, the TLV MUST be removed from the attribute before
   redistribution.

   See Section 11 for further discussion of how to handle errors that
   are encountered when parsing the Tunnel Encapsulation attribute.

   If the Tunnel Endpoint sub-TLV contains an IPv4 or IPv6 address that
   is valid but not reachable, the sub-TLV is NOT considered to be
   malformed.

3.2.  Encapsulation Sub-TLVs for Particular Tunnel Types

   This section defines Tunnel Encapsulation sub-TLVs for the following
   tunnel types: VXLAN ([RFC7348]), VXLAN-GPE
   ([I-D.ietf-nvo3-vxlan-gpe]), NVGRE ([RFC7637]), MPLS-in-GRE
   ([RFC2784], [RFC2890], [RFC4023]), L2TPv3 ([RFC3931]), and GRE
   ([RFC2784], [RFC2890], [RFC4023]).

   Rules for forming the encapsulation based on the information in a
   given TLV are given in Sections 5 and 8.

   There are also tunnel types for which it is not necessary to define
   an Encapsulation sub-TLV, because there are no fields in the
   encapsulation header whose values need to be signaled from the tunnel
   egress endpoint.

3.2.1.  VXLAN

   This document defines an encapsulation sub-TLV for VXLAN tunnels.
   When the tunnel type is VXLAN, the following is the structure of the
   value field in the encapsulation sub-TLV:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |V|M|R|R|R|R|R|R|          VN-ID (3 Octets)                     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                     MAC Address (4 Octets)                    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   MAC Address (2 Octets)      |           Reserved            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                 Figure 3: VXLAN Encapsulation Sub-TLV

   V: This bit is set to 1 to indicate that a "valid" VN-ID (Virtual
      Network Identifier) is present in the encapsulation sub-TLV.
      Please see Section 8.

   M: This bit is set to 1 to indicate that a valid MAC Address is
      present in the encapsulation sub-TLV.

   R: The remaining bits in the 8-bit flags field are reserved for
      future use. They MUST always be set to 0 by the originator of
      the sub-TLV. Intermediate routers MUST propagate them without
      modification. Any receiving routers MUST ignore these bits upon
      receipt of the sub-TLV.

   VN-ID: If the V bit is set, the VN-ID field contains a 3 octet
      VN-ID value. If the V bit is not set, the VN-ID field MUST be set
      to zero.

   MAC Address: If the M bit is set, this field contains a 6 octet
      Ethernet MAC address. If the M bit is not set, this field MUST be
      set to all zeroes.

   When forming the VXLAN encapsulation header:

   o  The values of the V, M, and R bits are NOT copied into the flags
      field of the VXLAN header. The flags field of the VXLAN header is
      set as per [RFC7348].

   o  If the M bit is set, the MAC Address is copied into the Inner
      Destination MAC Address field of the Inner Ethernet Header (see
      section 5 of [RFC7348]).

      If the M bit is not set, and the payload being sent through the
      VXLAN tunnel is an ethernet frame, the Destination MAC Address
      field of the Inner Ethernet Header is just the Destination MAC
      Address field of the payload's ethernet header.

      If the M bit is not set, and the payload being sent through the
      VXLAN tunnel is an IP or MPLS packet, the Inner Destination MAC
      address field is set to a configured value; if there is no
      configured value, the VXLAN tunnel cannot be used.

   o  See Section 8 to see how the VNI field of the VXLAN encapsulation
      header is set.

   Note that in order to send an IP packet or an MPLS packet through a
   VXLAN tunnel, the packet must first be encapsulated in an ethernet
   header, which becomes the "inner ethernet header" described in
   [RFC7348]. The VXLAN Encapsulation sub-TLV may contain information
   (e.g., the MAC address) that is used to form this ethernet header.

3.2.2.  VXLAN-GPE

   This document defines an encapsulation sub-TLV for VXLAN-GPE tunnels.
   When the tunnel type is VXLAN-GPE, the following is the structure of
   the value field in the encapsulation sub-TLV:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |Ver|V|R|R|R|R|R|                 Reserved                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                       VN-ID                   |   Reserved    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

               Figure 4: VXLAN-GPE Encapsulation Sub-TLV

   V: This bit is set to 1 to indicate that a "valid" VN-ID is
      present in the encapsulation sub-TLV. Please see Section 8.

   R: The bits designated "R" above are reserved for future use.
      They MUST always be set to 0 by the originator of the sub-TLV.
      Intermediate routers MUST propagate them without modification.
      Any receiving routers MUST ignore these bits upon receipt of the
      sub-TLV.

   Version (Ver): Indicates the VXLAN-GPE protocol version. (See the
      "Version Bits" section of [I-D.ietf-nvo3-vxlan-gpe].) If the
      indicated version is not supported, the TLV that contains this
      Encapsulation sub-TLV MUST be treated as specifying an unsupported
      tunnel type. The value of this field will be copied into the
      corresponding field of the VXLAN-GPE encapsulation header.

   VN-ID: If the V bit is set, this field contains a 3 octet VN-ID
      value. If the V bit is not set, this field MUST be set to zero.

   When forming the VXLAN-GPE encapsulation header:

   o  The values of the V and R bits are NOT copied into the flags field
      of the VXLAN-GPE header. However, the values of the Ver bits are
      copied into the VXLAN-GPE header. Other bits in the flags field
      of the VXLAN-GPE header are set as per [I-D.ietf-nvo3-vxlan-gpe].

   o  See Section 8 to see how the VNI field of the VXLAN-GPE
      encapsulation header is set.

3.2.3.  NVGRE

   This document defines an encapsulation sub-TLV for NVGRE tunnels.
   When the tunnel type is NVGRE, the following is the structure of the
   value field in the encapsulation sub-TLV:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |V|M|R|R|R|R|R|R|          VN-ID (3 Octets)                     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                     MAC Address (4 Octets)                    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   MAC Address (2 Octets)      |           Reserved            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                 Figure 5: NVGRE Encapsulation Sub-TLV

   V: This bit is set to 1 to indicate that a "valid" VN-ID is
      present in the encapsulation sub-TLV. Please see Section 8.

   M: This bit is set to 1 to indicate that a valid MAC Address is
      present in the encapsulation sub-TLV.

   R: The remaining bits in the 8-bit flags field are reserved for
      future use. They MUST always be set to 0 by the originator of
      the sub-TLV. Intermediate routers MUST propagate them without
      modification. Any receiving routers MUST ignore these bits upon
      receipt of the sub-TLV.

   VN-ID: If the V bit is set, the VN-ID field contains a 3 octet
      VN-ID value. If the V bit is not set, the VN-ID field MUST be set
      to zero.

   MAC Address: If the M bit is set, this field contains a 6 octet
      Ethernet MAC address. If the M bit is not set, this field MUST be
      set to all zeroes.

   When forming the NVGRE encapsulation header:

   o  The values of the V, M, and R bits are NOT copied into the flags
      field of the NVGRE header. The flags field of the NVGRE header is
      set as per [RFC7637].

   o  If the M bit is set, the MAC Address is copied into the Inner
      Destination MAC Address field of the Inner Ethernet Header (see
      section 3.2 of [RFC7637]).

      If the M bit is not set, and the payload being sent through the
      NVGRE tunnel is an ethernet frame, the Destination MAC Address
      field of the Inner Ethernet Header is just the Destination MAC
      Address field of the payload's ethernet header.

      If the M bit is not set, and the payload being sent through the
      NVGRE tunnel is an IP or MPLS packet, the Inner Destination MAC
      address field is set to a configured value; if there is no
      configured value, the NVGRE tunnel cannot be used.

   o  See Section 8 to see how the VSID (Virtual Subnet Identifier)
      field of the NVGRE encapsulation header is set.

3.2.4.  L2TPv3

   When the tunnel type of the TLV is L2TPv3 over IP, the following is
   the structure of the value field of the encapsulation sub-TLV:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                      Session ID (4 octets)                    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   |                        Cookie (Variable)                      |
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                 Figure 6: L2TPv3 Encapsulation Sub-TLV

   Session ID: a non-zero 4-octet value locally assigned by the
      advertising router that serves as a lookup key in the incoming
      packet's context.

   Cookie: an optional, variable-length value (0 to 8 octets) used by
      L2TPv3 to check the association of a received data message with
      the session identified by the Session ID. Generation and usage of
      the cookie value is as specified in [RFC3931].

      The length of the cookie is not encoded explicitly, but can be
      calculated as (sub-TLV length - 4).

3.2.5.  GRE

   When the tunnel type of the TLV is GRE, the following is the
   structure of the value field of the encapsulation sub-TLV:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                      GRE Key (4 octets)                       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                  Figure 7: GRE Encapsulation Sub-TLV

   GRE Key: 4-octet field [RFC2890] that is generated by the
      advertising router. The actual method by which the key is
      obtained is beyond the scope of this document. The key is
      inserted into the GRE encapsulation header of the payload packets
      sent by ingress routers to the advertising router. It is intended
      to be used for identifying extra context information about the
      received payload.

   Note that the key is optional.
Unless a key value is being 697 advertised, the GRE encapsulation sub-TLV MUST NOT be present. 699 3.2.6. MPLS-in-GRE 701 When the tunnel type is MPLS-in-GRE, the following is the structure 702 of the value field in an optional encapsulation sub-TLV: 704 0 1 2 3 705 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 706 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 707 | GRE-Key (4 Octets) | 708 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 710 Figure 8: MPLS-in-GRE Encapsulation Sub-TLV 712 GRE-Key: 4-octet field [RFC2890] that is generated by the 713 advertising router. The actual method by which the key is 714 obtained is beyond the scope of this document. The key is 715 inserted into the GRE encapsulation header of the payload packets 716 sent by ingress routers to the advertising router. It is intended 717 to be used for identifying extra context information about the 718 received payload. Note that the key is optional. Unless a key 719 value is being advertised, the MPLS-in-GRE encapsulation sub-TLV 720 MUST NOT be present. 722 Note that the GRE tunnel type defined in Section 3.2.5 can be used 723 instead of the MPLS-in-GRE tunnel type when it is necessary to 724 encapsulate MPLS in GRE. Including a TLV of the MPLS-in-GRE tunnel 725 type is equivalent to including a TLV of the GRE tunnel type that 726 also includes a Protocol Type sub-TLV (Section 3.4.1) specifying MPLS 727 as the protocol to be encapsulated. That is, if a TLV specifies 728 MPLS-in-GRE or if it includes a Protocol Type sub-TLV specifying 729 MPLS, the GRE tunnel advertised in that TLV MUST NOT be used for 730 carrying IP packets. 732 While it is not really necessary to have both the GRE and MPLS-in-GRE 733 tunnel types, both are included for reasons of backwards 734 compatibility. 736 3.2.7. IP-in-IP 738 When the tunnel type of the TLV is IP-in-IP, it does not have Virtual 739 Network Identifier. 
See Section 8.1 for Embedded Label handling on 740 IP-in-IP tunnels. 742 3.3. Outer Encapsulation Sub-TLVs 744 The Encapsulation sub-TLV for a particular tunnel type allows one to 745 specify the values that are to be placed in certain fields of the 746 encapsulation header for that tunnel type. However, some tunnel 747 types require an outer IP encapsulation, and some also require an 748 outer UDP encapsulation. The Encapsulation sub-TLV for a given 749 tunnel type does not usually provide a way to specify values for 750 fields of the outer IP and/or UDP encapsulations. If it is necessary 751 to specify values for fields of the outer encapsulation, additional 752 sub-TLVs must be used. This document defines two such sub-TLVs. 754 If an outer encapsulation sub-TLV occurs in a TLV for a tunnel type 755 that does not use the corresponding outer encapsulation, the sub-TLV 756 is treated as if it were an unknown type of sub-TLV. 758 3.3.1. IPv4 DS Field 760 Most of the tunnel types that can be specified in the Tunnel 761 Encapsulation attribute require an outer IP encapsulation. The IPv4 762 Differentiated Services (DS) Field sub-TLV can be carried in the TLV 763 of any such tunnel type. It specifies the setting of the one-octet 764 Differentiated Services field in the outer IP encapsulation (see 765 [RFC2474]). The value field is always a single octet. 767 3.3.2. UDP Destination Port 769 Some of the tunnel types that can be specified in the Tunnel 770 Encapsulation attribute require an outer UDP encapsulation. 771 Generally there is a standard UDP Destination Port value for a 772 particular tunnel type. However, sometimes it is useful to be able 773 to use a non-standard UDP destination port. If a particular tunnel 774 type requires an outer UDP encapsulation, and it is desired to use a 775 UDP destination port other than the standard one, the port to be used 776 can be specified by including a UDP Destination Port sub-TLV.
The 777 value field of this sub-TLV is always a two-octet field, containing 778 the port value. 780 3.4. Sub-TLVs for Aiding Tunnel Selection 782 3.4.1. Protocol Type Sub-TLV 784 The protocol type sub-TLV MAY be included in a given TLV to indicate 785 the type of the payload packets that may be encapsulated with the 786 tunnel parameters that are being signaled in the TLV. The value 787 field of the sub-TLV contains a 2-octet value from IANA's ethertype 788 registry [Ethertypes]. 790 For example, if we want to use three L2TPv3 sessions, one carrying 791 IPv4 packets, one carrying IPv6 packets, and one carrying MPLS 792 packets, the egress router will include three TLVs of L2TPv3 793 encapsulation type, each specifying a different Session ID and a 794 different payload type. The protocol type sub-TLV for these will be 795 IPv4 (protocol type = 0x0800), IPv6 (protocol type = 0x86dd), and 796 MPLS (protocol type = 0x8847), respectively. This informs the 797 ingress routers of the appropriate encapsulation information to use 798 with each of the given protocol types. Insertion of the specified 799 Session ID at the ingress routers allows the egress to process the 800 incoming packets correctly, according to their protocol type. 802 3.4.2. Color Sub-TLV 804 The color sub-TLV MAY be encoded as a way to "color" the 805 corresponding tunnel TLV. The value field of the sub-TLV is eight 806 octets long, and consists of a Color Extended Community, as defined 807 in Section 4.3. For the use of this sub-TLV and Extended Community, 808 please see Section 7. 810 Note that the high-order octet of this sub-TLV's value field MUST be 811 set to 3, and the next octet MUST be set to 0x0b. (Otherwise the 812 value field is not identical to a Color Extended Community.) 814 If a Color sub-TLV is not of the proper length, or the first two 815 octets of its value field are not 0x030b, the sub-TLV should be 816 treated as if it were an unrecognized sub-TLV (see Section 11). 818 3.5. 
Embedded Label Handling Sub-TLV 820 Certain BGP address families (corresponding to particular AFI/SAFI 821 pairs, e.g., 1/4, 2/4, 1/128, 2/128) have MPLS labels embedded in 822 their NLRIs. We will use the term "embedded label" to refer to the 823 MPLS label that is embedded in an NLRI, and the term "labeled address 824 family" to refer to any AFI/SAFI that has embedded labels. 826 Some of the tunnel types (e.g., VXLAN, VXLAN-GPE, and NVGRE) that can 827 be specified in the Tunnel Encapsulation attribute have an 828 encapsulation header containing a "Virtual Network" identifier of some 829 sort. The Encapsulation sub-TLVs for these tunnel types may 830 optionally specify a value for the virtual network identifier. 832 Suppose a Tunnel Encapsulation attribute is attached to an UPDATE of 833 a labeled address family, and it is decided to use a particular 834 tunnel (specified in one of the attribute's TLVs) for transmitting a 835 packet that is being forwarded according to that UPDATE. When 836 forming the encapsulation header for that packet, different 837 deployment scenarios require different handling of the embedded label 838 and/or the virtual network identifier. The Embedded Label Handling 839 sub-TLV can be used to control the placement of the embedded label 840 and/or the virtual network identifier in the encapsulation. 842 The Embedded Label Handling sub-TLV may be included in any TLV of the 843 Tunnel Encapsulation attribute. If the Tunnel Encapsulation 844 attribute is attached to an UPDATE of a non-labeled address family, 845 the sub-TLV is treated as a no-op. If the sub-TLV is contained in a 846 TLV whose tunnel type does not have a virtual network identifier in 847 its encapsulation header, the sub-TLV is treated as a no-op. In 848 those cases where the sub-TLV is treated as a no-op, it SHOULD NOT be 849 stripped from the TLV before the UPDATE is forwarded.
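As a non-normative illustration of the two no-op conditions just described, the following Python sketch decides whether an Embedded Label Handling sub-TLV has any effect. The function name, the string tunnel-type names, and both sets are assumptions made for this example. In particular, the set of labeled AFI/SAFI pairs holds only the examples cited in this section and is not meant to be exhaustive.

```python
# Illustrative only: AFI/SAFI pairs cited in this section as examples of
# labeled address families (not an exhaustive list).
LABELED_AFI_SAFI = {(1, 4), (2, 4), (1, 128), (2, 128)}

# Tunnel types cited in this section as having a virtual network identifier.
# The string names are hypothetical labels, not registry codepoints.
VNI_TUNNEL_TYPES = {"VXLAN", "VXLAN-GPE", "NVGRE"}


def embedded_label_handling_applies(afi: int, safi: int, tunnel_type: str) -> bool:
    """Return True if the sub-TLV is acted upon, False if it is a no-op."""
    if (afi, safi) not in LABELED_AFI_SAFI:
        return False  # non-labeled address family: treated as a no-op
    if tunnel_type not in VNI_TUNNEL_TYPES:
        return False  # no virtual network identifier: treated as a no-op
    return True
```

Note that a no-op result only affects local processing; the sub-TLV is still left in place when the UPDATE is forwarded.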
851 The sub-TLV's Length field always contains the value 1, and its value 852 field consists of a single octet. The following values are defined: 854 1: The payload will be an MPLS packet with the embedded label at the 856 top of its label stack. 858 2: The embedded label is not carried in the payload, but is carried 859 either in the virtual network identifier field of the 860 encapsulation header, or else is ignored entirely. 862 Please see Section 8 for the details of how this sub-TLV is used when 863 it is carried by an UPDATE of a labeled address family. 865 3.6. MPLS Label Stack Sub-TLV 867 This sub-TLV allows an MPLS label stack ([RFC3032]) to be associated 868 with a particular tunnel. 870 The value field of this sub-TLV is a sequence of MPLS label stack 871 entries. The first entry in the sequence is the "topmost" label, the 872 final entry in the sequence is the "bottommost" label. When this 873 label stack is pushed onto a packet, this ordering MUST be preserved. 875 Each label stack entry has the following format: 877 0 1 2 3 878 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 879 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 880 | Label | TC |S| TTL | 881 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 883 Figure 9: MPLS Label Stack Sub-TLV 885 If a packet is to be sent through the tunnel identified in a 886 particular TLV, and if that TLV contains an MPLS Label Stack sub-TLV, 887 then the label stack appearing in the sub-TLV MUST be pushed onto the 888 packet. This label stack MUST be pushed onto the packet before any 889 other labels are pushed onto the packet. 891 In particular, if the Tunnel Encapsulation attribute is attached to a 892 BGP UPDATE of a labeled address family, the contents of the MPLS 893 Label Stack sub-TLV MUST be pushed onto the packet before the label 894 embedded in the NLRI is pushed onto the packet. 
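The label stack entry layout of Figure 9 can be sketched in Python as follows. This is illustrative only. The field widths (20-bit Label, 3-bit TC, 1-bit S, 8-bit TTL) follow [RFC3032]; the function names, the dictionary representation, and the default argument values are assumptions of this example.

```python
import struct
from typing import Dict, List


def pack_entry(label: int, tc: int = 0, s: int = 0, ttl: int = 255) -> bytes:
    """Encode one 32-bit label stack entry in network byte order."""
    if not 0 <= label < (1 << 20):
        raise ValueError("label must fit in 20 bits")
    if not 0 <= tc < 8 or s not in (0, 1) or not 0 <= ttl < 256:
        raise ValueError("tc, s, or ttl out of range")
    word = (label << 12) | (tc << 9) | (s << 8) | ttl
    return struct.pack("!I", word)


def unpack_entries(value: bytes) -> List[Dict[str, int]]:
    """Decode a sub-TLV value field; the first entry is the topmost label."""
    if len(value) % 4 != 0:
        raise ValueError("value field must be a multiple of 4 octets")
    entries = []
    for (word,) in struct.iter_unpack("!I", value):
        entries.append({
            "label": word >> 12,
            "tc": (word >> 9) & 0x7,
            "s": (word >> 8) & 0x1,
            "ttl": word & 0xFF,
        })
    return entries
```

The list returned by unpack_entries keeps the order of the value field, so pushing it onto a packet in list order preserves the topmost-to-bottommost ordering required above.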
896 If the MPLS label stack sub-TLV is included in a TLV identifying a 897 tunnel type that uses virtual network identifiers (see Section 8), 898 the contents of the MPLS label stack sub-TLV MUST be pushed onto the 899 packet before the procedures of Section 8 are applied. 901 The number of label stack entries in the sub-TLV MUST be determined 902 from the sub-TLV length field. Thus it is not necessary to set the S 903 bit in any of the label stack entries of the sub-TLV, and the setting 904 of the S bit is ignored when parsing the sub-TLV. When the label 905 stack entries are pushed onto a packet that already has a label 906 stack, the S bits of all the entries MUST be cleared. When the label 907 stack entries are pushed onto a packet that does not already have a 908 label stack, the S bit of the bottommost label stack entry MUST be 909 set, and the S bit of all the other label stack entries MUST be 910 cleared. 912 By default, the TC (Traffic Class) field ([RFC3032], [RFC5462]) of 913 each label stack entry is set to 0. This may of course be changed by 914 policy at the originator of the sub-TLV. When pushing the label 915 stack onto a packet, the TC of the label stack entries is preserved 916 by default. However, local policy at the router that is pushing on 917 the stack MAY cause modification of the TC values. 919 By default, the TTL (Time to Live) field of each label stack entry is 920 set to 255. This may be changed by policy at the originator of the 921 sub-TLV. When pushing the label stack onto a packet, the TTL of the 922 label stack entries is preserved by default. However, local policy 923 at the router that is pushing on the stack MAY cause modification of 924 the TTL values. If any label stack entry in the sub-TLV has a TTL 925 value of zero, the router that is pushing the stack on a packet MUST 926 change the value to a non-zero value. 
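The S bit and TTL rules above can be summarized in a short Python sketch. It is a non-normative illustration: the function name and the dictionary representation of a label stack entry are assumptions, and the replacement TTL of 255 is an arbitrary choice, since the text only requires that a zero TTL be changed to some non-zero value.

```python
from typing import Dict, List


def prepare_for_push(entries: List[Dict[str, int]],
                     packet_has_label_stack: bool,
                     replacement_ttl: int = 255) -> List[Dict[str, int]]:
    """Return copies of the sub-TLV's entries, ready to push on a packet.

    entries[0] is the topmost label and entries[-1] the bottommost.
    """
    if replacement_ttl == 0:
        raise ValueError("replacement TTL must be non-zero")
    prepared = []
    last = len(entries) - 1
    for i, entry in enumerate(entries):
        out = dict(entry)  # do not modify the received sub-TLV contents
        # The S bit received in the sub-TLV is ignored. It is set only on
        # the bottommost entry, and only if the packet has no label stack.
        out["s"] = 1 if (i == last and not packet_has_label_stack) else 0
        # A TTL of zero must be replaced by a non-zero value.
        if out["ttl"] == 0:
            out["ttl"] = replacement_ttl
        prepared.append(out)
    return prepared
```

TC and non-zero TTL values are passed through unchanged here, which corresponds to the default behavior; a local policy that rewrites them would be applied as an additional step.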
928 Note that this sub-TLV can appear within a TLV identifying any type 929 of tunnel, not just within a TLV identifying an MPLS tunnel. 930 However, if this sub-TLV appears within a TLV identifying an MPLS 931 tunnel (or an MPLS-in-X tunnel), this sub-TLV plays the same role 932 that would be played by an MPLS Encapsulation sub-TLV. Therefore, an 933 MPLS Encapsulation sub-TLV is not defined. 935 3.7. Prefix-SID Sub-TLV 937 [I-D.ietf-idr-bgp-prefix-sid] defines a BGP Path attribute known as 938 the "Prefix-SID Attribute". This attribute is defined to contain a 939 sequence of one or more TLVs, where each TLV is either a "Label- 940 Index" TLV, an "IPv6 SID (Segment Identifier)" TLV, or an "Originator 941 SRGB (Source Routing Global Block)" TLV. 943 In this document, we define a Prefix-SID sub-TLV. The value field of 944 the Prefix-SID sub-TLV can be set to any valid value of the value 945 field of a BGP Prefix-SID attribute, as defined in 946 [I-D.ietf-idr-bgp-prefix-sid]. 948 The Prefix-SID sub-TLV can occur in a TLV identifying any type of 949 tunnel. If an Originator SRGB is specified in the sub-TLV, that SRGB 950 MUST be interpreted to be the SRGB used by the tunnel's egress 951 endpoint. The Label-Index, if present, is the Segment Routing SID 952 that the tunnel's egress endpoint uses to represent the prefix 953 appearing in the NLRI field of the BGP UPDATE to which the Tunnel 954 Encapsulation attribute is attached. 956 If a Label-Index is present in the prefix-SID sub-TLV, then when a 957 packet is sent through the tunnel identified by the TLV, the 958 corresponding MPLS label MUST be pushed on the packet's label stack. 959 The corresponding MPLS label is computed from the Label-Index value 960 and the SRGB of the route's originator. 962 If the Originator SRGB is not present, it is assumed that the 963 originator's SRGB is known by other means. Such "other means" are 964 outside the scope of this document. 
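As a non-normative sketch of the label computation mentioned above, the following Python function derives an MPLS label from a Label-Index and an SRGB. It assumes that the SRGB is an ordered list of (first label, number of labels) ranges and that the Label-Index is an offset into the concatenation of those ranges, which is the conventional Segment Routing interpretation; the function name and the data representation are assumptions of this example.

```python
from typing import List, Tuple


def label_from_index(label_index: int, srgb: List[Tuple[int, int]]) -> int:
    """Map a Label-Index to a label using (base, size) SRGB ranges."""
    if label_index < 0:
        raise ValueError("Label-Index must be non-negative")
    offset = label_index
    for base, size in srgb:
        if offset < size:
            return base + offset
        offset -= size  # index lies beyond this range; try the next one
    raise ValueError("Label-Index falls outside the SRGB")
```

With a single range starting at 16000, for instance, a Label-Index of 5 maps to label 16005. An index that falls outside every range is reported as an error rather than mapped to a label.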
966 The corresponding MPLS label is pushed on after the processing of the 967 MPLS Label Stack sub-TLV, if present, as specified in Section 3.6. 968 It is pushed on before any other labels (e.g., a label embedded in 969 the UPDATE's NLRI, or a label determined by the procedures of Section 8) 970 are pushed on the stack. 972 The Prefix-SID sub-TLV has slightly different semantics than the 973 Prefix-SID attribute. When the Prefix-SID attribute is attached to a 974 given route, the BGP speaker that originally attached the attribute 975 is expected to be in the same Segment Routing domain as the BGP 976 speakers who receive the route with the attached attribute. The 977 Label-Index tells the receiving BGP speakers what the prefix-SID is 978 for the advertised prefix in that Segment Routing domain. When the 979 Prefix-SID sub-TLV is used, the BGP speaker at the head end of the 980 tunnel need not even be in the same Segment Routing domain as the 981 tunnel's egress endpoint, and there is no implication that the 982 prefix-SID for the advertised prefix is the same in the Segment 983 Routing domains of the BGP speaker that originated the sub-TLV and 984 the BGP speaker that received it. 986 4. Extended Communities Related to the Tunnel Encapsulation Attribute 988 4.1. Encapsulation Extended Community 990 The Encapsulation Extended Community is a Transitive Opaque Extended 991 Community. This Extended Community may be attached to a route of any 992 AFI/SAFI to which the Tunnel Encapsulation attribute may be attached. 993 Each such Extended Community identifies a particular tunnel type. If 994 the Encapsulation Extended Community identifies a particular tunnel 995 type, its semantics are exactly equivalent to the semantics of a 996 Tunnel Encapsulation attribute Tunnel TLV for which the following 997 three conditions all hold: 999 1. it identifies the same tunnel type, 1000 2. it has a Tunnel Endpoint sub-TLV for which one of the following 1001 two conditions holds: 1003 A.
its "Address Family" subfield contains zero, or 1005 B. its "Address" subfield contains the same IP address that 1006 appears in the next hop field of the route to which the 1007 Tunnel Encapsulation attribute is attached 1009 3. it has no other sub-TLVs. 1011 We will refer to such a Tunnel TLV as a "barebones" Tunnel TLV. 1013 The Encapsulation Extended Community was first defined in [RFC5512]. 1014 While it provides only a small subset of the functionality of the 1015 Tunnel Encapsulation attribute, it is used in a number of deployed 1016 applications, and is still needed for backwards compatibility. To 1017 ensure backwards compatibility, this specification establishes the 1018 following rules: 1020 1. If the Tunnel Encapsulation attribute of a given route contains a 1021 barebones Tunnel TLV identifying a particular tunnel type, an 1022 Encapsulation Extended Community identifying the same tunnel type 1023 SHOULD be attached to the route. 1025 2. If the Encapsulation Extended Community identifying a particular 1026 tunnel type is attached to a given route, the corresponding 1027 barebones Tunnel TLV MAY be omitted from the Tunnel Encapsulation 1028 attribute. 1030 3. Suppose a particular route has both (a) an Encapsulation Extended 1031 Community specifying a particular tunnel type, and (b) a Tunnel 1032 Encapsulation attribute with a barebones Tunnel TLV specifying 1033 that same tunnel type. Both (a) and (b) MUST be interpreted as 1034 denoting the same tunnel. 1036 In short, in situations where one could use either the Encapsulation 1037 Extended Community or a barebones Tunnel TLV, one may use either or 1038 both. However, to ensure backwards compatibility with applications 1039 that do not support the Tunnel Encapsulation attribute, it is 1040 preferable to use the Encapsulation Extended Community. If the 1041 Extended Community (identifying a particular tunnel type) is present, 1042 the corresponding Tunnel TLV is optional. 
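The notion of a "barebones" Tunnel TLV, and its equivalence to the Encapsulation Extended Community, can be illustrated with the Python sketch below. It is not a definitive implementation: the TunnelTLV class, its field names, and the helper functions are hypothetical, the model assumes every TLV has a Tunnel Endpoint sub-TLV, and the integer tunnel types in the usage are placeholders.

```python
from dataclasses import dataclass, field
from ipaddress import ip_address
from typing import Iterable, List, Optional, Set


@dataclass
class TunnelTLV:
    tunnel_type: int
    endpoint_family: int                    # "Address Family" subfield
    endpoint_address: Optional[str] = None  # "Address" subfield, if any
    other_subtlvs: List[object] = field(default_factory=list)


def is_barebones(tlv: TunnelTLV, next_hop: str) -> bool:
    """Check conditions 2 and 3 of the barebones definition."""
    if tlv.other_subtlvs:
        return False                        # condition 3 fails
    if tlv.endpoint_family == 0:
        return True                         # condition 2.A
    return (tlv.endpoint_address is not None
            and ip_address(tlv.endpoint_address) == ip_address(next_hop))


def barebones_tunnel_types(community_types: Iterable[int],
                           tlvs: Iterable[TunnelTLV],
                           next_hop: str) -> Set[int]:
    """Tunnel types signaled by an Extended Community or a barebones TLV.

    Using a set reflects rule 3: an Extended Community and a barebones
    TLV of the same tunnel type denote one and the same tunnel.
    """
    types = set(community_types)
    types.update(t.tunnel_type for t in tlvs if is_barebones(t, next_hop))
    return types
```

A TLV that carries any further sub-TLV, or whose endpoint address differs from the next hop, is not barebones and therefore has no Extended Community equivalent.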
1044 Note that for tunnel types of the form "X-in-Y", e.g., MPLS-in-GRE, 1045 the Encapsulation Extended Community implies that only packets of the 1046 specified payload type "X" are to be carried through the tunnel of 1047 type "Y". 1049 In the remainder of this specification, when we speak of a route as 1050 containing a Tunnel Encapsulation attribute with a TLV identifying a 1051 particular tunnel type, we are implicitly including the case where 1052 the route contains an Encapsulation Extended Community 1053 identifying that tunnel type. 1055 4.2. Router's MAC Extended Community 1057 [I-D.ietf-bess-evpn-inter-subnet-forwarding] defines a Router's MAC 1058 Extended Community. This Extended Community provides information 1059 that may conflict with information in one or more of the 1060 Encapsulation Sub-TLVs of a Tunnel Encapsulation attribute. In case 1061 of such a conflict, the information in the Encapsulation Sub-TLV 1062 takes precedence. 1064 4.3. Color Extended Community 1066 The Color Extended Community is a Transitive Opaque Extended 1067 Community with the following encoding: 1069 0 1 2 3 1070 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1071 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1072 | 0x03 | 0x0b | Reserved | 1073 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1074 | Color Value | 1075 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1077 Figure 10: Color Extended Community 1079 For the use of this Extended Community, please see Section 7. 1081 5. Semantics and Usage of the Tunnel Encapsulation attribute 1083 [RFC5512] specifies the use of the Tunnel Encapsulation attribute in 1084 BGP UPDATE messages of AFI/SAFI 1/7 and 2/7. That document restricts 1085 the use of this attribute to UPDATE messages of those SAFIs. This 1086 document removes that restriction.
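Returning briefly to the Color Extended Community encoding of Figure 10 (Section 4.3), which is also the value field of the Color sub-TLV (Section 3.4.2), the following Python sketch shows one way to build and check the eight octets. It is non-normative; the function names are assumptions, and setting the Reserved field to zero on encoding is a choice made for this example.

```python
import struct
from typing import Optional


def encode_color_community(color: int, reserved: int = 0) -> bytes:
    """Build the 8-octet value: 0x03, 0x0b, Reserved (2), Color Value (4)."""
    return struct.pack("!BBHI", 0x03, 0x0B, reserved, color)


def decode_color_value(value: bytes) -> Optional[int]:
    """Return the Color Value, or None if the value is not well formed.

    A wrong length or a prefix other than 0x030b means the Color sub-TLV
    is to be treated as unrecognized, which None represents here.
    """
    if len(value) != 8 or value[:2] != b"\x03\x0b":
        return None
    return struct.unpack("!I", value[4:])[0]
```

A caller that receives None would skip the sub-TLV while still processing the rest of the TLV.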
1088 The BGP Tunnel Encapsulation attribute MAY be carried in any BGP 1089 UPDATE message whose AFI/SAFI is 1/1 (IPv4 Unicast), 2/1 (IPv6 1090 Unicast), 1/4 (IPv4 Labeled Unicast), 2/4 (IPv6 Labeled Unicast), 1091 1/128 (VPN-IPv4 Labeled Unicast), 2/128 (VPN-IPv6 Labeled Unicast), 1092 or 25/70 (Ethernet VPN, usually known as EVPN). Use of the Tunnel 1093 Encapsulation attribute in BGP UPDATE messages of other AFI/SAFIs is 1094 outside the scope of this document. 1096 It has been suggested that it may sometimes be useful to attach a 1097 Tunnel Encapsulation attribute to a BGP UPDATE message that is also 1098 carrying a PMSI (Provider Multicast Service Interface) Tunnel 1099 attribute [RFC6514]. If the PMSI Tunnel attribute specifies an IP 1100 tunnel, the Tunnel Encapsulation attribute could be used to provide 1101 additional information about the IP tunnel. The usage of the Tunnel 1102 Encapsulation attribute in combination with the PMSI Tunnel attribute 1103 is outside the scope of this document. 1105 The decision to attach a Tunnel Encapsulation attribute to a given 1106 BGP UPDATE is determined by policy. The set of TLVs and sub-TLVs 1107 contained in the attribute is also determined by policy. 1109 When the Tunnel Encapsulation attribute is carried in an UPDATE of 1110 one of the AFI/SAFIs listed above, each TLV 1111 MUST have a Tunnel Endpoint sub-TLV. If a TLV does not have a 1112 Tunnel Endpoint sub-TLV, that TLV should be treated as if it had a 1113 malformed Tunnel Endpoint sub-TLV (see Section 3.1). 1115 Suppose that: 1117 o a given packet P must be forwarded by router R; 1119 o the path along which P is to be forwarded is determined by BGP 1120 UPDATE U; 1122 o UPDATE U has a Tunnel Encapsulation attribute, containing at least 1123 one TLV that identifies a "feasible tunnel" for packet P.
A 1124 tunnel is considered feasible if it has the following three 1125 properties: 1127 * The tunnel type is supported (i.e., router R knows how to set 1128 up tunnels of that type, how to create the encapsulation header 1129 for tunnels of that type, etc.) 1131 * The tunnel is of a type that can be used to carry packet P 1132 (e.g., an MPLS-in-UDP tunnel would not be a feasible tunnel for 1133 carrying an IP packet, UNLESS the IP packet can first be 1134 converted to an MPLS packet). 1136 * The tunnel is specified in a TLV whose Tunnel Endpoint sub-TLV 1137 identifies an IP address that is reachable. 1139 Then router R MUST send packet P through one of the feasible tunnels 1140 identified in the Tunnel Encapsulation attribute of UPDATE U. 1142 If the Tunnel Encapsulation attribute contains several TLVs (i.e., if 1143 it specifies several tunnels), router R may choose any one of those 1144 tunnels, based upon local policy. If any tunnel TLV contains one or 1145 more Color sub-TLVs (Section 3.4.2) and/or the Protocol Type sub-TLV 1146 (Section 3.4.1), the choice of tunnel may be influenced by these sub- 1147 TLVs. 1149 If a particular tunnel is not feasible at some moment because its 1150 Tunnel Endpoint cannot be reached at that moment, the tunnel may 1151 become feasible at a later time (when its endpoint becomes 1152 reachable). Router R should take note of this. If router R is 1153 already using a different tunnel, it MAY switch to the tunnel that 1154 just became feasible, or it MAY decide to continue using the tunnel 1155 that it is already using. How this decision is made is outside the 1156 scope of this document. 1158 In addition to the sub-TLVs already defined, additional sub-TLVs may 1159 be defined that affect the choice of tunnel to be used, or that 1160 affect the contents of the tunnel encapsulation header. The 1161 documents that define any such additional sub-TLVs must specify the 1162 effect that including the sub-TLV is to have. 
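The three feasibility properties and the policy-based choice described above can be sketched in Python as follows. This is an illustration under stated assumptions, not a prescribed algorithm: the Tunnel class, the callback parameters, and the default of taking the first feasible tunnel are all hypothetical, since the actual choice is a matter of local policy.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional, Set


@dataclass(frozen=True)
class Tunnel:
    tunnel_type: str  # illustrative name such as "VXLAN"
    endpoint: str     # address from the Tunnel Endpoint sub-TLV


def feasible_tunnels(tunnels: Iterable[Tunnel],
                     supported_types: Set[str],
                     can_carry: Callable[[Tunnel], bool],
                     is_reachable: Callable[[str], bool]) -> List[Tunnel]:
    """Keep only tunnels that satisfy all three feasibility properties."""
    return [
        t for t in tunnels
        if t.tunnel_type in supported_types  # tunnel type is supported
        and can_carry(t)                     # can carry packet P
        and is_reachable(t.endpoint)         # endpoint address is reachable
    ]


def choose_tunnel(feasible: List[Tunnel],
                  policy: Optional[Callable[[List[Tunnel]], Tunnel]] = None
                  ) -> Optional[Tunnel]:
    """Pick one feasible tunnel; None means no tunnel can be used."""
    if not feasible:
        return None
    if policy is None:
        return feasible[0]  # placeholder for local policy
    return policy(feasible)
```

Because reachability can change over time, a router would typically re-run the filter when the reachability of an endpoint changes; whether it then switches tunnels is left to the policy callback.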
1164 Once it is determined to send a packet through the tunnel specified 1165 in a particular TLV of a particular Tunnel Encapsulation attribute, 1166 then the tunnel's egress endpoint address is the IP address contained 1167 in the sub-TLV. If the TLV contains a Tunnel Endpoint sub-TLV whose 1168 value field is all zeroes, then the tunnel's egress endpoint is the 1169 IP address specified as the Next Hop of the BGP Update containing the 1170 Tunnel Encapsulation attribute. The address of the tunnel egress 1171 endpoint generally appears in a "destination address" field of the 1172 encapsulation. 1174 The full set of procedures for sending a packet through a particular 1175 tunnel type to a particular tunnel egress endpoint depends upon the 1176 tunnel type, and is outside the scope of this document. Note that 1177 some tunnel types may require the execution of an explicit tunnel 1178 setup protocol before they can be used for carrying data. Other 1179 tunnel types may not require any tunnel setup protocol. 1181 Sending a packet through a tunnel always requires that the packet be 1182 encapsulated, with an encapsulation header that is appropriate for 1183 the tunnel type. The contents of the tunnel encapsulation header MAY 1184 be influenced by the Encapsulation sub-TLV. If there is no 1185 Encapsulation sub-TLV present, the router transmitting the packet 1186 through the tunnel must have a priori knowledge (e.g., by 1187 provisioning) of how to fill in the various fields in the 1188 encapsulation header. 1190 Whenever a new Tunnel Type TLV is defined, the specification of that 1191 TLV should describe (or reference) the procedures for creating the 1192 encapsulation header used to forward packets through that tunnel 1193 type. 
If a tunnel type codepoint is assigned in the IANA "BGP Tunnel 1194 Encapsulation Tunnel Types" registry, but there is no corresponding 1195 specification that defines an Encapsulation sub-TLV for that tunnel 1196 type, the transmitting endpoint of such a tunnel is presumed to know 1197 a priori how to form the encapsulation header for that tunnel type. 1199 If a Tunnel Encapsulation attribute specifies several tunnels, the 1200 way in which a router chooses which one to use is a matter of policy, 1201 subject to the following constraint: if a router can determine that a 1202 given tunnel is not functional, it MUST NOT use that tunnel. In 1203 particular, if the tunnel is identified in a TLV that has a Tunnel 1204 Endpoint sub-TLV, and if the IP address specified in the sub-TLV is 1205 not reachable from router R, then the tunnel MUST be considered non- 1206 functional. Other means of determining whether a given tunnel is 1207 functional MAY be used; specification of such means is outside the 1208 scope of this specification. Of course, if a non-functional tunnel 1209 later becomes functional, router R SHOULD reevaluate its choice of 1210 tunnels. 1212 If router R determines that it cannot use any of the tunnels 1213 specified in the Tunnel Encapsulation attribute, it MAY either drop 1214 packet P, or it MAY transmit packet P as it would had the Tunnel 1215 Encapsulation attribute not been present. This is a matter of local 1216 policy. By default, the packet SHOULD be transmitted as if the 1217 Tunnel Encapsulation attribute had not been present. 1219 A Tunnel Encapsulation attribute may contain several TLVs that all 1220 specify the same tunnel type. Each TLV should be considered as 1221 specifying a different tunnel. Two tunnels of the same type may have 1222 different Tunnel Endpoint sub-TLVs, different Encapsulation sub-TLVs, 1223 etc. Choosing between two such tunnels is a matter of local policy. 
1225 Once router R has decided to send packet P through a particular 1226 tunnel, it encapsulates packet P appropriately and then forwards it 1227 according to the route that leads to the tunnel's egress endpoint. 1228 This route may itself be a BGP route with a Tunnel Encapsulation 1229 attribute. If so, the encapsulated packet is treated as the payload 1230 and is encapsulated according to the Tunnel Encapsulation attribute 1231 of that route. That is, tunnels may be "stacked". 1233 Notwithstanding anything said in this document, a BGP speaker MAY 1234 have local policy that influences the choice of tunnel, and the way 1235 the encapsulation is formed. A BGP speaker MAY also have a local 1236 policy that tells it to ignore the Tunnel Encapsulation attribute 1237 entirely or in part. Of course, interoperability issues must be 1238 considered when such policies are put into place. 1240 6. Routing Considerations 1242 6.1. Impact on BGP Decision Process 1244 The presence of the Tunnel Encapsulation attribute affects the BGP 1245 bestpath selection algorithm. For all the tunnels described in the 1246 Tunnel Encapsulation attribute for a path, if no Tunnel Endpoint 1247 address is feasible, then that path MUST NOT be considered resolvable 1248 for the purposes of Route Resolvability Condition [RFC4271] section 1249 9.1.2.1. 1251 6.2. Looping, Infinite Stacking, Etc. 1253 Consider a packet destined for address X. Suppose a BGP UPDATE for 1254 address prefix X carries a Tunnel Encapsulation attribute that 1255 specifies a tunnel egress endpoint of Y. And suppose that a BGP 1256 UPDATE for address prefix Y carries a Tunnel Encapsulation attribute 1257 that specifies a Tunnel Endpoint of X. It is easy to see that this 1258 will cause an infinite number of encapsulation headers to be put on 1259 the given packet. 1261 This could happen as a result of misconfiguration, either accidental 1262 or intentional. 
It could also happen if the Tunnel Encapsulation 1263 attribute were altered by a malicious agent. Implementations should 1264 be aware of this. This document does not specify a maximum number of 1265 recursions; that is an implementation-specific matter. 1267 Improper setting (or malicious altering) of the Tunnel Encapsulation 1268 attribute could also cause data packets to loop. Suppose a BGP 1269 UPDATE for address prefix X carries a Tunnel Encapsulation attribute 1270 that specifies a tunnel egress endpoint of Y. Suppose router R 1271 receives and processes the update. When router R receives a packet 1272 destined for X, it will apply the encapsulation and send the 1273 encapsulated packet to Y. Y will decapsulate the packet and forward 1274 it further. If Y is further away from X than is router R, it is 1275 possible that the path from Y to X will traverse R. This would cause 1276 a long-lasting routing loop. The control plane itself cannot detect 1277 this situation, though a TTL field in the payload packets would 1278 presumably prevent any given packet from looping infinitely. 1280 These possibilities must also be kept in mind whenever the Tunnel 1281 Endpoint for a given prefix differs from the BGP next hop for that 1282 prefix. 1284 7. Recursive Next Hop Resolution 1286 Suppose that: 1288 o a given packet P must be forwarded by router R1; 1290 o the path along which P is to be forwarded is determined by BGP 1291 UPDATE U1; 1293 o UPDATE U1 does not have a Tunnel Encapsulation attribute; 1295 o the next hop of UPDATE U1 is router R2; 1297 o the best path to router R2 is a BGP route that was advertised in 1298 UPDATE U2; 1300 o UPDATE U2 has a Tunnel Encapsulation attribute. 1302 Then packet P MUST be sent through one of the tunnels identified in 1303 the Tunnel Encapsulation attribute of UPDATE U2. See Section 5 for 1304 further details. 1306 However, suppose that one of the TLVs in U2's Tunnel Encapsulation 1307 attribute contains the Color Sub-TLV. 
   In that case, packet P MUST
   NOT be sent through the tunnel identified in that TLV, unless U1 is
   carrying the Color Extended Community that is identified in U2's
   Color Sub-TLV.

   Note that if UPDATE U1 and UPDATE U2 both have Tunnel Encapsulation
   attributes, packet P will be carried through a pair of nested
   tunnels. P will first be encapsulated based on the Tunnel
   Encapsulation attribute of U1. This encapsulated packet then becomes
   the payload, and is encapsulated based on the Tunnel Encapsulation
   attribute of U2. This is another way of "stacking" tunnels (see also
   Section 5).

   The procedures in this section presuppose that U1's next hop resolves
   to a BGP route, and that U2's next hop resolves (perhaps after
   further recursion) to a non-BGP route.

8. Use of Virtual Network Identifiers and Embedded Labels when Imposing
   a Tunnel Encapsulation

   If the TLV specifying a tunnel contains an MPLS Label Stack sub-TLV,
   then when sending a packet through that tunnel, the procedures of
   Section 3.6 are applied before the procedures of this section.

   If the TLV specifying a tunnel contains a Prefix-SID sub-TLV, the
   procedures of Section 3.7 are applied before the procedures of this
   section. If the TLV also contains an MPLS Label Stack sub-TLV, the
   procedures of Section 3.6 are applied before the procedures of
   Section 3.7.

8.1. Tunnel Types without a Virtual Network Identifier Field

   If a Tunnel Encapsulation attribute is attached to an UPDATE of a
   labeled address family, there will be one or more labels specified in
   the UPDATE's NLRI.

   o  If the TLV contains an Embedded Label Handling sub-TLV whose value
      is 1, the label or labels from the NLRI are pushed on the packet's
      label stack.

   o  If the TLV does not contain an Embedded Label Handling sub-TLV, or
      if it contains an Embedded Label Handling sub-TLV whose value is
      2, the embedded label is ignored completely. The tunnel is
      assumed to have terminated at the corresponding VRF.

   The resulting MPLS packet is then further encapsulated, as specified
   by the TLV.

8.2. Tunnel Types with a Virtual Network Identifier Field

   Three of the tunnel types that can be specified in a Tunnel
   Encapsulation TLV have virtual network identifier fields in their
   encapsulation headers. In the VXLAN and VXLAN-GPE encapsulations,
   this field is called the VNI (Virtual Network Identifier) field; in
   the NVGRE encapsulation, this field is called the VSID (Virtual
   Subnet Identifier) field.

   When one of these tunnel encapsulations is imposed on a packet, the
   setting of the virtual network identifier field in the encapsulation
   header depends upon the contents of the Encapsulation sub-TLV (if one
   is present). When the Tunnel Encapsulation attribute is being
   carried on a BGP UPDATE of a labeled address family, the setting of
   the virtual network identifier field also depends upon the contents
   of the Embedded Label Handling sub-TLV (if present).

   This section specifies the procedures for choosing the value to set
   in the virtual network identifier field of the encapsulation header.
   These procedures apply only when the tunnel type is VXLAN, VXLAN-GPE,
   or NVGRE.

8.2.1. Unlabeled Address Families

   This sub-section applies when:

   o  the Tunnel Encapsulation attribute is carried on a BGP UPDATE of
      an unlabeled address family, and

   o  at least one of the attribute's TLVs identifies a tunnel type that
      uses a virtual network identifier, and

   o  it has been determined to send a packet through one of those
      tunnels.

   If the TLV identifying the tunnel contains an Encapsulation sub-TLV
   whose V bit is set, the virtual network identifier field of the
   encapsulation header is set to the value of the virtual network
   identifier field of the Encapsulation sub-TLV.

   Otherwise, the virtual network identifier field of the encapsulation
   header is set to a configured value; if there is no configured value,
   the tunnel cannot be used.

8.2.2. Labeled Address Families

   This sub-section applies when:

   o  the Tunnel Encapsulation attribute is carried on a BGP UPDATE of a
      labeled address family, and

   o  at least one of the attribute's TLVs identifies a tunnel type that
      uses a virtual network identifier, and

   o  it has been determined to send a packet through one of those
      tunnels.

8.2.2.1. When a Valid VNI has been Signaled

   If the TLV identifying the tunnel contains an Encapsulation sub-TLV
   whose V bit is set, the virtual network identifier field of the
   encapsulation header is set as follows:

   o  If the TLV contains an Embedded Label Handling sub-TLV whose value
      is 1, then the virtual network identifier field of the
      encapsulation header is set to the value of the virtual network
      identifier field of the Encapsulation sub-TLV.

      The embedded label (from the NLRI of the route that is carrying
      the Tunnel Encapsulation attribute) appears at the top of the MPLS
      label stack in the encapsulation payload.

   o  If the TLV does not contain an Embedded Label Handling sub-TLV, or
      if it contains an Embedded Label Handling sub-TLV whose value is 2,
      the embedded label is ignored entirely, and the virtual network
      identifier field of the encapsulation header is set to the value
      of the virtual network identifier field of the Encapsulation
      sub-TLV.

8.2.2.2. When a Valid VNI has not been Signaled

   If the TLV identifying the tunnel does not contain an Encapsulation
   sub-TLV whose V bit is set, the virtual network identifier field of
   the encapsulation header is set as follows:

   o  If the TLV contains an Embedded Label Handling sub-TLV whose value
      is 1, then the virtual network identifier field of the
      encapsulation header is set to a configured value.

      If there is no configured value, the tunnel cannot be used.

      The embedded label (from the NLRI of the route that is carrying
      the Tunnel Encapsulation attribute) appears at the top of the MPLS
      label stack in the encapsulation payload.

   o  If the TLV does not contain an Embedded Label Handling sub-TLV, or
      if it contains an Embedded Label Handling sub-TLV whose value is
      2, the embedded label is copied into the virtual network
      identifier field of the encapsulation header.

      In this case, the payload may or may not contain an MPLS label
      stack, depending upon other factors. If the payload does contain
      an MPLS label stack, the embedded label does not appear in that
      stack.

9. Applicability Restrictions

   In a given UPDATE of a labeled address family, the label embedded in
   the NLRI is generally a label that is meaningful only to the router
   whose address appears as the next hop. Certain of the procedures of
   Section 8.2.2.1 or Section 8.2.2.2 cause the embedded label to be
   carried by a data packet to the router whose address appears in the
   Tunnel Endpoint sub-TLV. If the Tunnel Endpoint sub-TLV does not
   identify the same router that is the next hop, sending the packet
   through the tunnel may cause the label to be misinterpreted at the
   tunnel's egress endpoint. This may cause misdelivery of the packet.
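
   To see which of the Section 8.2 cases place the embedded label in
   the data packet, it may help to lay the cases out side by side. The
   following Python sketch is illustrative only and is not part of the
   specification. The names VniDecision and choose_vni, and all of the
   parameter names, are hypothetical. A signaled_vni of None stands
   for "no Encapsulation sub-TLV whose V bit is set", and elh stands
   for the value of the Embedded Label Handling sub-TLV (None if the
   sub-TLV is absent). Rejecting elh values other than 1 and 2 is an
   assumption of the sketch; the text above describes only those two
   values.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VniDecision:
    usable: bool                    # False: "the tunnel cannot be used"
    vni: Optional[int] = None       # value for the VNI/VSID header field
    label_in_payload: bool = False  # embedded label tops the payload MPLS stack
    label_in_vni: bool = False      # embedded label copied into the VNI field


def choose_vni(labeled, signaled_vni=None, elh=None,
               embedded_label=None, configured_vni=None):
    """Sketch of Sections 8.2.1, 8.2.2.1 and 8.2.2.2 (hypothetical helper)."""
    if elh not in (None, 1, 2):
        # Assumption: other values are outside what the text describes.
        raise ValueError("unsupported Embedded Label Handling value")
    # The embedded label is carried in the payload only for labeled
    # address families with an Embedded Label Handling value of 1.
    push = labeled and elh == 1
    if signaled_vni is not None:
        # 8.2.1 (V bit set) and 8.2.2.1: use the signaled value.
        return VniDecision(True, signaled_vni, label_in_payload=push)
    if labeled and not push:
        # 8.2.2.2, second bullet: the label itself becomes the VNI.
        return VniDecision(True, embedded_label, label_in_vni=True)
    if configured_vni is None:
        # 8.2.1 and 8.2.2.2, first bullet: no configured value.
        return VniDecision(False)
    return VniDecision(True, configured_vni, label_in_payload=push)
```

   Whenever label_in_payload or label_in_vni is true, the embedded
   label travels to the tunnel's egress endpoint, so those are the
   outcomes to which the restriction in this section applies.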

   Therefore the embedded label MUST NOT be carried by a data packet
   traveling through a tunnel unless it is known that the label will be
   properly interpreted at the tunnel's egress endpoint. How this is
   known is outside the scope of this document.

   Note that if the Tunnel Encapsulation attribute is attached to a
   VPN-IP route [RFC4364], and if Inter-AS "option b" (see section 10 of
   [RFC4364]) is being used, and if the Tunnel Endpoint sub-TLV contains
   an IP address that is not in the same AS as the router receiving the
   route, it is very likely that the embedded label has been changed.
   Therefore use of the Tunnel Encapsulation attribute in an "Inter-AS
   option b" scenario is not supported.

10. Scoping

   The Tunnel Encapsulation attribute is defined as a transitive
   attribute, so that it may be passed along by BGP speakers that do not
   recognize it. However, it is intended that the Tunnel Encapsulation
   attribute be used only within a well-defined scope, e.g., within a
   set of Autonomous Systems that belong to a single administrative
   entity. If the attribute is distributed beyond its intended scope,
   packets may be sent through tunnels in a manner that is not intended.

   To prevent the Tunnel Encapsulation attribute from being distributed
   beyond its intended scope, any BGP speaker that understands the
   attribute MUST be able to filter the attribute from incoming BGP
   UPDATE messages. When the attribute is filtered from an incoming
   UPDATE, the attribute is neither processed nor redistributed. This
   filtering SHOULD be possible on a per-BGP-session basis. For each
   session, filtering of the attribute on incoming UPDATEs MUST be
   enabled by default.

   In addition, any BGP speaker that understands the attribute MUST be
   able to filter the attribute from outgoing BGP UPDATE messages. This
   filtering SHOULD be possible on a per-BGP-session basis.
   For each
   session, filtering of the attribute on outgoing UPDATEs MUST be
   enabled by default.

11. Error Handling

   The Tunnel Encapsulation attribute is a sequence of TLVs, each of
   which is a sequence of sub-TLVs. The final octet of a TLV is
   determined by its length field. Similarly, the final octet of a
   sub-TLV is determined by its length field. The final octet of a TLV
   MUST also be the final octet of its final sub-TLV. If this is not the
   case, the TLV MUST be considered to be malformed. A TLV that is
   found to be malformed for this reason MUST NOT be processed, and MUST
   be stripped from the Tunnel Encapsulation attribute before the
   attribute is propagated. Subsequent TLVs in the Tunnel Encapsulation
   attribute may still be valid, in which case they MUST be processed
   and redistributed normally.

   If a Tunnel Encapsulation attribute does not have any valid TLVs, or
   it does not have the transitive bit set, the "Attribute Discard"
   procedure of [RFC7606] is applied.

   If a Tunnel Encapsulation attribute can be parsed correctly, but
   contains a TLV whose tunnel type is not recognized by a particular
   BGP speaker, that BGP speaker MUST NOT consider the attribute to be
   malformed. Rather, the TLV with the unrecognized tunnel type MUST be
   ignored, and the BGP speaker MUST interpret the attribute as if that
   TLV had not been present. If the route carrying the Tunnel
   Encapsulation attribute is propagated with the attribute, the
   unrecognized TLV MUST remain in the attribute.

   If a TLV of a Tunnel Encapsulation attribute contains a sub-TLV that
   is not recognized by a particular BGP speaker, the BGP speaker MUST
   process that TLV as if the unrecognized sub-TLV had not been present.
   If the route carrying the Tunnel Encapsulation attribute is
   propagated with the attribute, the unrecognized sub-TLV MUST remain
   in the attribute.
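
   The length checks described above can be sketched in Python as
   follows. This is a non-normative illustration under stated
   assumptions. The function names parse_sub_tlvs and parse_attribute
   are hypothetical. The sketch assumes that each TLV begins with a
   2-octet tunnel type followed by a 2-octet length, as laid out in
   the encoding sections of this document (outside the text shown
   here), and it uses the sub-TLV length rule noted in Section 12.4
   (one length octet for types 0 to 127, two for types 128 to 255).
   Stopping at a TLV whose length overruns the attribute is a choice
   made by the sketch; the text above does not address that case.

```python
import struct


def parse_sub_tlvs(value):
    """Split a TLV value into (type, value) pairs.

    Returns None when the final sub-TLV does not end exactly on the
    final octet of the TLV, i.e. when the TLV is malformed.
    """
    subs, i = [], 0
    while i < len(value):
        sub_type = value[i]
        len_size = 1 if sub_type <= 127 else 2
        start = i + 1 + len_size
        if start > len(value):
            return None  # length field itself runs past the TLV
        sub_len = int.from_bytes(value[i + 1:start], "big")
        end = start + sub_len
        if end > len(value):
            return None  # sub-TLV value runs past the TLV
        subs.append((sub_type, bytes(value[start:end])))
        i = end
    return subs


def parse_attribute(data, known_tunnel_types):
    """Return (keep, usable) for an attribute value.

    keep:   raw (tunnel_type, value) TLVs to retain when propagating;
            malformed TLVs are stripped, unrecognized ones remain.
    usable: (tunnel_type, sub_tlvs) for recognized tunnel types only.
    """
    keep, usable, i = [], [], 0
    while i + 4 <= len(data):
        tunnel_type, tlv_len = struct.unpack_from("!HH", data, i)
        value = data[i + 4:i + 4 + tlv_len]
        i += 4 + tlv_len
        if len(value) < tlv_len:
            break  # assumption: stop at a TLV that overruns the attribute
        subs = parse_sub_tlvs(value)
        if subs is None:
            continue  # malformed: neither processed nor propagated
        keep.append((tunnel_type, value))
        if tunnel_type in known_tunnel_types:
            usable.append((tunnel_type, subs))
    return keep, usable
```

   A caller would apply the "Attribute Discard" procedure when keep
   comes back empty. Because keep holds the raw TLV value, any
   unrecognized sub-TLVs inside a retained TLV are carried along
   unchanged if the attribute is propagated.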

   If the type code of a sub-TLV appears as "reserved" in the IANA "BGP
   Tunnel Encapsulation Attribute Sub-TLVs" registry, the sub-TLV MUST
   be treated as an unrecognized sub-TLV.

   In general, if a TLV contains a sub-TLV that is malformed (e.g.,
   contains a length field whose value is not legal for that sub-TLV),
   the sub-TLV should be treated as if it were an unrecognized sub-TLV.
   This document specifies one exception to this rule: within a Tunnel
   Encapsulation attribute that is carried by a BGP UPDATE whose
   AFI/SAFI is one of those explicitly listed in the second paragraph of
   Section 5, if a TLV contains a malformed Tunnel Endpoint sub-TLV (as
   defined in Section 3.1), the entire TLV MUST be ignored, and MUST be
   removed from the Tunnel Encapsulation attribute before the route
   carrying that attribute is redistributed.

   Within a Tunnel Encapsulation attribute that is carried by a BGP
   UPDATE whose AFI/SAFI is one of those explicitly listed in the second
   paragraph of Section 5, a TLV that does not contain exactly one
   Tunnel Endpoint sub-TLV MUST be treated as if it contained a
   malformed Tunnel Endpoint sub-TLV.

   A TLV identifying a particular tunnel type may contain a sub-TLV that
   is meaningless for that tunnel type. For example, perhaps the TLV
   contains a "UDP Destination Port" sub-TLV, but the identified tunnel
   type does not use UDP encapsulation at all. Sub-TLVs of this sort
   MUST be treated as a no-op. That is, they MUST NOT affect the
   creation of the encapsulation header. However, the sub-TLV MUST NOT
   be considered to be malformed, and MUST NOT be removed from the TLV
   before the route carrying the Tunnel Encapsulation attribute is
   redistributed. (This allows for the possibility that such sub-TLVs
   may be given a meaning, in the context of the specified tunnel type,
   in the future.)
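
   The per-TLV rules above can be summarized in a short Python sketch.
   It is illustrative only and assumes the UPDATE's AFI/SAFI is one of
   those listed in Section 5. The name effective_sub_tlvs and its
   parameters are hypothetical: well_formed stands for a
   caller-supplied check of a sub-TLV's encoding, and applicable for
   the set of sub-TLV types that are meaningful for the TLV's tunnel
   type. The value 6 is the Tunnel Endpoint codepoint listed in
   Section 12.4.

```python
TUNNEL_ENDPOINT = 6  # sub-TLV codepoint listed in Section 12.4


def effective_sub_tlvs(subs, well_formed, applicable):
    """Select the sub-TLVs that may shape the encapsulation header.

    subs:        list of (type, value) pairs from one TLV
    well_formed: callable (type, value) -> bool (hypothetical check)
    applicable:  set of sub-TLV types meaningful for the tunnel type

    Returns None when the whole TLV must be ignored and removed, that
    is, when it lacks exactly one well-formed Tunnel Endpoint sub-TLV.
    """
    endpoints = [v for t, v in subs if t == TUNNEL_ENDPOINT]
    if len(endpoints) != 1 or not well_formed(TUNNEL_ENDPOINT, endpoints[0]):
        return None
    # Other malformed sub-TLVs are treated as unrecognized, and
    # sub-TLVs that are meaningless for the tunnel type are no-ops:
    # neither kind affects the header, but both stay in the TLV.
    return [(t, v) for t, v in subs if well_formed(t, v) and t in applicable]
```

   The function only filters what is used locally. When the TLV is
   retained, a propagating speaker would still forward it with all of
   its sub-TLVs intact.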

   There is no significance to the order in which the TLVs occur within
   the Tunnel Encapsulation attribute. Multiple TLVs may occur for a
   given tunnel type; each such TLV is regarded as describing a
   different tunnel.

   The following sub-TLVs defined in this document MUST NOT occur more
   than once in a given Tunnel TLV: Tunnel Endpoint (discussed above),
   Encapsulation, IPv4 DS, UDP Destination Port, Embedded Label
   Handling, MPLS Label Stack, Prefix-SID. If a Tunnel TLV has more
   than one of any of these sub-TLVs, all but the first occurrence of
   each such sub-TLV type MUST be treated as a no-op. However, the
   Tunnel TLV containing them MUST NOT be considered to be malformed,
   and all the sub-TLVs MUST be propagated if the route carrying the
   Tunnel Encapsulation attribute is propagated.

   The following sub-TLVs defined in this document may appear zero or
   more times in a given Tunnel TLV: Protocol Type, Color. Each
   occurrence of such sub-TLVs is meaningful. For example, the Color
   sub-TLV may appear multiple times to assign multiple colors to a
   tunnel.

12. IANA Considerations

12.1. Subsequent Address Family Identifiers

   IANA is requested to modify the "Subsequent Address Family
   Identifiers" registry to indicate that the Encapsulation SAFI is
   deprecated. This document should be the reference.

12.2. BGP Path Attributes

   IANA has previously assigned value 23 from the "BGP Path Attributes"
   Registry to "Tunnel Encapsulation Attribute". IANA is requested to
   add this document as a reference.

12.3. Extended Communities

   IANA has previously assigned values from the "Transitive Opaque
   Extended Community" type Registry to the "Color Extended Community"
   (sub-type 0x0b), and to the "Encapsulation Extended
   Community" (0x030c). IANA is requested to add this document as a
   reference for both assignments.

12.4. BGP Tunnel Encapsulation Attribute Sub-TLVs

   IANA is requested to add the following note to the "BGP Tunnel
   Encapsulation Attribute Sub-TLVs" registry:

      If the Sub-TLV Type is in the range from 0 to 127 inclusive, the
      Sub-TLV Length field contains one octet. If the Sub-TLV Type is
      in the range from 128 to 255 inclusive, the Sub-TLV Length field
      contains two octets.

   IANA is requested to change the registration policy of the "BGP
   Tunnel Encapsulation Attribute Sub-TLVs" registry to the following:

   o  The values 0 and 255 are reserved.

   o  The values in the range 1-63 and 128-191 are to be allocated using
      the "Standards Action" registration procedure.

   o  The values in the range 64-125 and 192-252 are to be allocated
      using the "First Come, First Served" registration procedure.

   o  The values in the range 126-127 and 253-254 are reserved for
      experimental use; IANA shall not allocate values from this range.

   IANA has assigned the following codepoints in the "BGP Tunnel
   Encapsulation Attribute Sub-TLVs" registry:

      6: Remote Endpoint

         IANA is requested to change the name of "Remote Endpoint" to
         "Tunnel Egress Endpoint".

      7: IPv4 DS Field

      8: UDP Destination Port

      9: Embedded Label Handling

      10: MPLS Label Stack

      11: Prefix SID

   IANA has previously assigned codepoints from the "BGP Tunnel
   Encapsulation Attribute Sub-TLVs" registry for "Encapsulation",
   "Protocol Type", and "Color". IANA is requested to add this document
   as a reference.

12.5. Tunnel Types

   IANA is requested to add this document as a reference for tunnel
   types 8 (VXLAN), 9 (NVGRE), 11 (MPLS-in-GRE), and 12 (VXLAN-GPE) in
   the "BGP Tunnel Encapsulation Tunnel Types" registry.

   IANA is requested to add this document as a reference for tunnel
   types 1 (L2TPv3), 2 (GRE), and 7 (IP in IP) in the "BGP Tunnel
   Encapsulation Tunnel Types" registry.

12.6. Flags Field of Vxlan Encapsulation sub-TLV

   IANA is requested to add this document as a reference for creating
   the flags field of the Vxlan Encapsulation sub-TLV registry.

   IANA is requested to add this document as a reference for flag bits V
   and M in the "Flags field of Vxlan Encapsulation sub-TLV" registry.

12.7. Flags Field of Vxlan-GPE Encapsulation sub-TLV

   IANA is requested to add this document as a reference for creating
   the flags field of the Vxlan-GPE Encapsulation sub-TLV registry.

   IANA is requested to add this document as a reference for flag bit V
   in the "Flags field of Vxlan-GPE Encapsulation sub-TLV" registry.

12.8. Flags Field of NVGRE Encapsulation sub-TLV

   IANA is requested to add this document as a reference for creating
   the flags field of the NVGRE Encapsulation sub-TLV registry.

   IANA is requested to add this document as a reference for flag bits V
   and M in the "Flags field of NVGRE Encapsulation sub-TLV" registry.

12.9. Embedded Label Handling sub-TLV

   IANA is requested to add this document as a reference for creating
   the sub-TLV's value field of the Embedded Label Handling sub-TLV
   registry.

   IANA is requested to add this document as a reference for values 1
   (Payload of MPLS with embedded label) and 2 (no embedded label in
   payload) in the "sub-TLV's value field of the Embedded Label Handling
   sub-TLV" registry.

13. Security Considerations

   The Tunnel Encapsulation attribute can cause traffic to be diverted
   from its normal path, especially when the Tunnel Endpoint sub-TLV is
   used. This can have serious consequences if the attribute is added
   or modified illegitimately, as it enables traffic to be "hijacked".

   The Tunnel Endpoint sub-TLV contains both an IP address and an AS
   number. BGP Origin Validation [RFC6811] can be used to obtain
   assurance that the given IP address belongs to the given AS.
   While
   this provides some protection against misconfiguration, it does not
   prevent a malicious agent from inserting a sub-TLV that will appear
   valid.

   Before sending a packet through the tunnel identified in a particular
   TLV of a Tunnel Encapsulation attribute, it may be advisable to use
   BGP Origin Validation to obtain the following additional assurances:

   o  the origin AS of the route carrying the Tunnel Encapsulation
      attribute is correct;

   o  the origin AS of the route to the IP address specified in the
      Tunnel Endpoint sub-TLV is correct, and is the same AS that is
      specified in the Tunnel Endpoint sub-TLV.

   One then has some level of assurance that the tunneled traffic is
   going to the same destination AS that it would have gone to had the
   Tunnel Encapsulation attribute not been present. However, this may
   not suit all use cases, and in any event is not very strong
   protection against hijacking.

   For these reasons, BGP Origin Validation should not be relied upon
   exclusively, and the filtering procedures of Section 10 should always
   be in place.

   Increased protection can be obtained by using BGPSEC [RFC8205] to
   ensure that the route carrying the Tunnel Encapsulation attribute,
   and the routes to the Tunnel Endpoint of each specified tunnel, have
   not been altered illegitimately.

   If BGP Origin Validation is used as specified above, and the tunnel
   specified in a particular TLV of a Tunnel Encapsulation attribute is
   therefore regarded as "suspicious", that tunnel should not be used.
   Other tunnels specified in (other TLVs of) the Tunnel Encapsulation
   attribute may still be used.

14. Acknowledgments

   This document contains text from RFC5512, co-authored by Pradosh
   Mohapatra. The authors of the current document wish to thank Pradosh
   for his contribution.
   RFC5512 itself built upon prior work by Gargi
   Nalawade, Ruchi Kapoor, Dan Tappan, David Ward, Scott Wainner, Simon
   Barber, and Chris Metz, whom we also thank for their contributions.

   The authors wish to thank Lou Berger, Ron Bonica, Martin Djernaes,
   John Drake, Satoru Matsushima, Dhananjaya Rao, John Scudder, Ravi
   Singh, Thomas Morin, Xiaohu Xu, and Zhaohui Zhang for their review,
   comments, and/or helpful discussions.

15. Contributor Addresses

   Below is a list of other contributing authors in alphabetical order:

   Randy Bush
   Internet Initiative Japan
   5147 Crystal Springs
   Bainbridge Island, Washington 98110
   United States

   Email: randy@psg.com

   Robert Raszuk
   Bloomberg LP
   731 Lexington Ave
   New York City, NY 10022
   United States

   Email: robert@raszuk.net

16. References

16.1. Normative References

   [I-D.ietf-idr-bgp-prefix-sid]
              Previdi, S., Filsfils, C., Lindem, A., Sreekantiah, A.,
              and H. Gredler, "Segment Routing Prefix SID extensions for
              BGP", draft-ietf-idr-bgp-prefix-sid-27 (work in progress),
              June 2018.

   [I-D.ietf-nvo3-vxlan-gpe]
              Maino, F., Kreeger, L., and U. Elzur, "Generic Protocol
              Extension for VXLAN", draft-ietf-nvo3-vxlan-gpe-07 (work
              in progress), April 2019.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC2784]  Farinacci, D., Li, T., Hanks, S., Meyer, D., and P.
              Traina, "Generic Routing Encapsulation (GRE)", RFC 2784,
              DOI 10.17487/RFC2784, March 2000.

   [RFC2890]  Dommety, G., "Key and Sequence Number Extensions to GRE",
              RFC 2890, DOI 10.17487/RFC2890, September 2000.

   [RFC3032]  Rosen, E., Tappan, D., Fedorkow, G., Rekhter, Y.,
              Farinacci, D., Li, T., and A. Conta, "MPLS Label Stack
              Encoding", RFC 3032, DOI 10.17487/RFC3032, January 2001.

   [RFC3931]  Lau, J., Ed., Townsley, M., Ed., and I. Goyret, Ed.,
              "Layer Two Tunneling Protocol - Version 3 (L2TPv3)",
              RFC 3931, DOI 10.17487/RFC3931, March 2005.

   [RFC4023]  Worster, T., Rekhter, Y., and E. Rosen, Ed.,
              "Encapsulating MPLS in IP or Generic Routing Encapsulation
              (GRE)", RFC 4023, DOI 10.17487/RFC4023, March 2005.

   [RFC4271]  Rekhter, Y., Ed., Li, T., Ed., and S. Hares, Ed., "A
              Border Gateway Protocol 4 (BGP-4)", RFC 4271,
              DOI 10.17487/RFC4271, January 2006.

   [RFC4760]  Bates, T., Chandra, R., Katz, D., and Y. Rekhter,
              "Multiprotocol Extensions for BGP-4", RFC 4760,
              DOI 10.17487/RFC4760, January 2007.

   [RFC5512]  Mohapatra, P. and E. Rosen, "The BGP Encapsulation
              Subsequent Address Family Identifier (SAFI) and the BGP
              Tunnel Encapsulation Attribute", RFC 5512,
              DOI 10.17487/RFC5512, April 2009.

   [RFC5566]  Berger, L., White, R., and E. Rosen, "BGP IPsec Tunnel
              Encapsulation Attribute", RFC 5566, DOI 10.17487/RFC5566,
              June 2009.

   [RFC7348]  Mahalingam, M., Dutt, D., Duda, K., Agarwal, P., Kreeger,
              L., Sridhar, T., Bursell, M., and C. Wright, "Virtual
              eXtensible Local Area Network (VXLAN): A Framework for
              Overlaying Virtualized Layer 2 Networks over Layer 3
              Networks", RFC 7348, DOI 10.17487/RFC7348, August 2014.

   [RFC7510]  Xu, X., Sheth, N., Yong, L., Callon, R., and D. Black,
              "Encapsulating MPLS in UDP", RFC 7510,
              DOI 10.17487/RFC7510, April 2015.

   [RFC7606]  Chen, E., Ed., Scudder, J., Ed., Mohapatra, P., and K.
              Patel, "Revised Error Handling for BGP UPDATE Messages",
              RFC 7606, DOI 10.17487/RFC7606, August 2015.

   [RFC7637]  Garg, P., Ed. and Y. Wang, Ed., "NVGRE: Network
              Virtualization Using Generic Routing Encapsulation",
              RFC 7637, DOI 10.17487/RFC7637, September 2015.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017.

16.2. Informative References

   [Ethertypes]
              "IANA Ethertype Registry".

   [I-D.ietf-bess-evpn-inter-subnet-forwarding]
              Sajassi, A., Salam, S., Thoria, S., Drake, J., and J.
              Rabadan, "Integrated Routing and Bridging in EVPN",
              draft-ietf-bess-evpn-inter-subnet-forwarding-08 (work in
              progress), March 2019.

   [RFC2474]  Nichols, K., Blake, S., Baker, F., and D. Black,
              "Definition of the Differentiated Services Field (DS
              Field) in the IPv4 and IPv6 Headers", RFC 2474,
              DOI 10.17487/RFC2474, December 1998.

   [RFC4364]  Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private
              Networks (VPNs)", RFC 4364, DOI 10.17487/RFC4364, February
              2006.

   [RFC5462]  Andersson, L. and R. Asati, "Multiprotocol Label Switching
              (MPLS) Label Stack Entry: "EXP" Field Renamed to "Traffic
              Class" Field", RFC 5462, DOI 10.17487/RFC5462, February
              2009.

   [RFC6514]  Aggarwal, R., Rosen, E., Morin, T., and Y. Rekhter, "BGP
              Encodings and Procedures for Multicast in MPLS/BGP IP
              VPNs", RFC 6514, DOI 10.17487/RFC6514, February 2012.

   [RFC6811]  Mohapatra, P., Scudder, J., Ward, D., Bush, R., and R.
              Austein, "BGP Prefix Origin Validation", RFC 6811,
              DOI 10.17487/RFC6811, January 2013.

   [RFC8205]  Lepinski, M., Ed. and K. Sriram, Ed., "BGPsec Protocol
              Specification", RFC 8205, DOI 10.17487/RFC8205, September
              2017.

Authors' Addresses

   Keyur Patel
   Arrcus, Inc
   2077 Gateway Pl
   San Jose, CA 95110
   United States

   Email: keyur@arrcus.com

   Gunter Van de Velde
   Nokia
   Copernicuslaan 50
   Antwerpen 2018
   Belgium

   Email: gunter.van_de_velde@nokia.com

   Srihari R. Sangli
   Juniper Networks, Inc
   10 Technology Park Drive
   Westford, Massachusetts 01886
   United States

   Email: ssangli@juniper.net

   Eric C. Rosen