Internet Engineering Task Force                            E. Rosen, Ed.
Internet-Draft                                    Juniper Networks, Inc.
Obsoletes: 3107 (if approved)                               May 31, 2016
Intended status: Standards Track
Expires: December 2, 2016

           Using BGP to Bind MPLS Labels to Address Prefixes
                     draft-rosen-mpls-rfc3107bis-01

Abstract

   This document specifies a set of procedures for using BGP to
   advertise that a specified router has bound a specified MPLS label
   (or a specified sequence of MPLS labels, organized as a contiguous
   part of a label stack) to a specified address prefix. This can be
   done by sending a BGP UPDATE message whose Network Layer Reachability
   Information field contains both the prefix and the MPLS label(s), and
   whose Next Hop field identifies the node at which said prefix is
   bound to said label(s). This document obsoletes RFC 3107.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on December 2, 2016.

Copyright Notice

   Copyright (c) 2016 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.
   Code Components extracted from this document must include Simplified
   BSD License text as described in Section 4.e of the Trust Legal
   Provisions and are provided without warranty as described in the
   Simplified BSD License.

Table of Contents

   1. Introduction
   2. Using BGP to Bind an Address Prefix to One or More MPLS Labels
     2.1. Multiple Labels Capability
     2.2. NLRI Encoding when the Multiple Labels Capability is Not Used
     2.3. NLRI Encoding when the Multiple Labels Capability is Used
     2.4. How to Explicitly Withdraw the Binding of a Label to a Prefix
     2.5. Changing the Label that is Bound to a Prefix
   3. Installing and/or Propagating SAFI-4 or SAFI-128 Routes
     3.1. Comparability of Routes
     3.2. Modification of Label(s) Field When Propagating
       3.2.1. When the Next Hop Field is Unchanged
       3.2.2. When the Next Hop Field is Changed
   4. Data Plane
   5. Relationship Between SAFI-4 and SAFI-1 Routes
   6. IANA Considerations
   7. Security Considerations
   8. Acknowledgements
   9. References
     9.1. Normative References
     9.2. Informative References
   Author's Address

1. Introduction

   [RFC3107] specifies encodings and procedures for using BGP to
   indicate that a particular router has bound either a single MPLS
   label or a sequence of MPLS labels (organized as a contiguous part of
   an MPLS label stack) ([RFC3031], [RFC3032]) to a particular address
   prefix. This is done by sending a BGP UPDATE message whose Network
   Layer Reachability Information field contains both the prefix and the
   MPLS label(s), and whose Next Hop field identifies the node at which
   said prefix is bound to said label(s). Each such UPDATE also
   advertises a path to the specified prefix, via the specified next
   hop.

   Although there are many implementations and deployments of [RFC3107],
   there are a number of issues with [RFC3107] that have impeded
   interoperability in the past, and may potentially impede
   interoperability in the future:

   o  Although [RFC3107] specifies an encoding that allows a sequence of
      MPLS labels (rather than just a single label) to be bound to a
      prefix, it does not specify the semantics of binding a sequence of
      labels to a prefix.

   o  Many implementations of [RFC3107] assume that only one label will
      be bound to a prefix, and cannot properly process a BGP UPDATE
      message that binds a sequence of labels to a prefix. Thus an
      implementation attempting to provide this feature is likely to
      experience problems interoperating with other implementations.

   o  [RFC3107]'s procedures for withdrawing the binding of a label or
      sequence of labels to a prefix are not specified clearly and
      correctly.

   o  [RFC3107] specifies an optional feature, known as "Advertising
      Multiple Routes to a Destination", that, to the best of the
      author's knowledge, has never been implemented as specified.
      The functionality that this feature was intended to provide can
      be, and has been, implemented in a different way using the
      procedures of [ADD-PATHS], which were not available at the time
      that [RFC3107] was written. In [RFC3107], this feature was
      controlled by a BGP Capability Code that has never been
      implemented, and is now essentially obsolete.

   o  It is possible for a BGP speaker to receive two BGP UPDATEs that
      advertise paths to the same address prefix, where one UPDATE binds
      a label (or sequence of labels) to the prefix and the other does
      not. [RFC3107] is silent on the issue of how the presence of two
      such UPDATEs impacts the BGP decision process, and does not say
      explicitly whether one or the other or both of these UPDATEs
      should be propagated. This has led different implementations to
      handle this situation in different ways.

   o  Much of [RFC3107] applies to the VPN-IPv4 ([RFC4364]) and VPN-IPv6
      ([RFC4659]) address families, but those address families are not
      mentioned in [RFC3107].

   This document replaces and obsoletes [RFC3107]. It defines a new BGP
   Capability to be used when binding a sequence of labels to a prefix;
   by using this Capability, the interoperability problems alluded to
   above can be avoided. This document also removes the unimplemented
   "Advertising Multiple Routes to a Destination" feature, while
   specifying how to use [ADD-PATHS] to provide the same functionality.
   This document also addresses the issue of how UPDATEs that bind
   labels to a given prefix interact with UPDATEs that advertise paths
   to that prefix but do not bind labels to it. However, for backwards
   compatibility, it declares most of these interactions to be matters
   of local policy.

   The places where this specification differs from [RFC3107] are
   indicated in the text.
   It is believed that implementations that conform to the current
   document will interoperate correctly with existing deployed
   implementations of [RFC3107].

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in
   [RFC2119].

2. Using BGP to Bind an Address Prefix to One or More MPLS Labels

   BGP may be used to advertise that a particular node, call it N, has
   bound a particular MPLS label, or a particular sequence of MPLS
   labels (organized as a contiguous part of an MPLS label stack), to a
   particular address prefix. This is done by sending a Multiprotocol
   BGP UPDATE message, i.e., an UPDATE message with an MP_REACH_NLRI
   attribute ([RFC4760]). The "Network Address of Next Hop" field of
   that attribute contains an IP address of node N. The label(s) and
   the prefix are encoded in the NLRI field of the MP_REACH_NLRI. The
   encoding of the NLRI field is specified in Sections 2.2 and 2.3.

   If the prefix is an IPv4 address prefix or a VPN-IPv4 ([RFC4364])
   address prefix, the Address Family Identifier (AFI) of the
   MP_REACH_NLRI attribute is set to 1. If the prefix is an IPv6
   address prefix or a VPN-IPv6 ([RFC4659]) address prefix, the AFI is
   set to 2.

   If the prefix is an IPv4 address prefix or an IPv6 address prefix,
   the Subsequent Address Family Identifier (SAFI) field is set to 4.
   If the prefix is a VPN-IPv4 address prefix or a VPN-IPv6 address
   prefix, the SAFI is set to 128.

   The use of SAFI 4 or SAFI 128 when the AFI is other than 1 or 2 is
   outside the scope of this document.

   This document does not specify the format of the "Network Address of
   Next Hop" field of the MP_REACH_NLRI attribute.
   The format of the next hop field depends upon a number of factors,
   and is discussed in a number of other RFCs: see [RFC4364],
   [RFC4659], [RFC4798], and [RFC5549].

   There are a variety of applications that make use of alternative
   methods of using BGP to advertise MPLS label bindings. See, e.g.,
   [RFC7432], [RFC6514], or [TUNNEL-ENCAPS]. The method described in
   the current document is not claimed to be the only way of using BGP
   to advertise MPLS label bindings. Discussion of which method to use
   for which application is outside the scope of the current document.

   In the remainder of this document, we will use the term "SAFI-x
   UPDATE" to refer to a BGP UPDATE message containing an MP_REACH_NLRI
   attribute or an MP_UNREACH_NLRI attribute ([RFC4760]) whose SAFI
   field contains the value x.

   This document defines a BGP Optional Capabilities parameter
   ([RFC5492]) known as the "Multiple Labels Capability".

   o  Unless this Capability is sent on a given BGP session by both of
      that session's BGP speakers, a SAFI-4 or SAFI-128 UPDATE message
      sent on that session from either speaker MUST bind a prefix to
      only a single label, and MUST use the encoding of Section 2.2.

   o  If this Capability is sent by both BGP speakers on a given
      session, an UPDATE message on that session, from either speaker,
      MUST use the encoding of Section 2.3, and MAY bind a prefix to a
      sequence of more than one label.

   The encoding of the Multiple Labels Capability is specified in
   Section 2.1.

   Procedures for explicitly withdrawing a label binding are given in
   Section 2.4. Procedures for changing the label(s) bound to a given
   prefix by a given node are given in Section 2.5.

   Procedures for propagating SAFI-4 and SAFI-128 UPDATEs are discussed
   in Section 3.
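
   As an informal illustration, the AFI/SAFI selection and the choice
   between the encodings of Sections 2.2 and 2.3 described above can be
   sketched in Python. The names PrefixKind, afi_safi_for, and
   nlri_encoding_section are hypothetical and exist only for this
   sketch; they are not defined by this document.

```python
from enum import Enum


class PrefixKind(Enum):
    """Hypothetical classification of the prefix being advertised."""
    IPV4 = "ipv4"
    IPV6 = "ipv6"
    VPN_IPV4 = "vpn-ipv4"
    VPN_IPV6 = "vpn-ipv6"


def afi_safi_for(kind):
    """Return the (AFI, SAFI) pair used in MP_REACH_NLRI for this prefix."""
    # AFI 1 covers IPv4 and VPN-IPv4; AFI 2 covers IPv6 and VPN-IPv6.
    afi = 1 if kind in (PrefixKind.IPV4, PrefixKind.VPN_IPV4) else 2
    # SAFI 4 covers plain IP prefixes; SAFI 128 covers VPN prefixes.
    safi = 4 if kind in (PrefixKind.IPV4, PrefixKind.IPV6) else 128
    return afi, safi


def nlri_encoding_section(capability_sent, capability_received):
    """Return the section whose NLRI encoding applies on a session.

    The multi-label encoding applies only when the Multiple Labels
    Capability was both sent and received on the session.
    """
    return "2.3" if capability_sent and capability_received else "2.2"
```

   For example, afi_safi_for(PrefixKind.VPN_IPV6) yields (2, 128). The
   per-AFI/SAFI refinement of the encoding choice is given in
   Section 2.1 and is not modeled here.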

   When a BGP speaker installs and propagates a SAFI-4 or SAFI-128
   update, and if it changes the value of the Network Address of Next
   Hop field, it must program its data plane appropriately. This is
   discussed in Section 4.

2.1. Multiple Labels Capability

   [RFC5492] defines the "Capabilities Optional Parameter". A BGP
   speaker can include a Capabilities Optional Parameter in a BGP OPEN
   message. The Capabilities Optional Parameter is a triple including a
   one-octet Capability Code, a one-octet Capability length, and a
   variable-length Capability Value.

   This document defines a Capability Code, known as the "Multiple
   Labels Capability" code. IANA will assign a codepoint for this
   Capability Code. (This Capability Code is new to this document and
   does not appear in [RFC3107].)

   If a BGP speaker has not sent the Multiple Labels Capability in its
   BGP OPEN message on a particular BGP session, or if it has not
   received the Multiple Labels Capability in the BGP OPEN message from
   its peer on that BGP session, that BGP speaker MUST NOT send on that
   session any UPDATE message that binds more than one MPLS label to any
   given prefix. Further, when advertising the binding of a single
   label to a prefix, the BGP speaker MUST use the encoding specified in
   Section 2.2.

   The value field of the Multiple Labels Capability (shown in Figure 1)
   consists of one or more triples, where each triple consists of four
   octets. The first two octets of a triple specify an AFI value, the
   third octet specifies a SAFI value, and the fourth specifies a Count.
   If one of the triples is <AFI, SAFI, Count>, the Count is the maximum
   number of labels that the BGP speaker can process in a received
   UPDATE of the specified AFI/SAFI.

   If the Capability contains more than one triple with a given AFI/
   SAFI, all but the first MUST be ignored.

   Any triple of the form <AFI, SAFI, 0> MUST be ignored.

   If the Capability contains the triple <AFI, SAFI, 255>, then no
   limit has been placed on the number of labels that can be advertised
   in UPDATEs of the corresponding AFI/SAFI.

   A Multiple Labels Capability whose length is not a multiple of four
   MUST be considered to be malformed.

   [RFC4724] ("Graceful Restart Mechanism for BGP") describes a
   procedure that allows routes learned over a given BGP session to be
   maintained when the session fails and then restarts. This procedure
   requires the entire RIB to be transmitted when the session restarts.
   If the Multiple Labels Capability for a given AFI/SAFI had been
   exchanged on the failed session, but is not exchanged on the
   restarted session, then any prefixes advertised in that AFI/SAFI with
   multiple labels MUST be explicitly withdrawn. Similarly, if the
   maximum label count specified in the Capability for a given AFI/SAFI
   is reduced, any prefixes advertised with more labels than are valid
   for the current session MUST be explicitly withdrawn.

   [Enhanced-GR] ("Accelerated Routing Convergence for BGP Graceful
   Restart") describes another procedure that allows the routes learned
   over a given BGP session to be maintained when the session fails and
   then restarts. These procedures MUST NOT be applied if either of the
   following conditions holds:

   o  The Multiple Labels Capability for a given AFI/SAFI had been
      exchanged prior to the restart, but has not been exchanged on the
      restarted session.

   o  The Multiple Labels Capability for a given AFI/SAFI had been
      exchanged with a given Count prior to the restart, but has been
      exchanged with a smaller Count on the restarted session.

   If either of these conditions holds, the complete set of routes of
   the given AFI/SAFI MUST be exchanged.
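
   As an informal illustration, the value-field rules above can be
   sketched as a small Python parser. The function name and return
   type are hypothetical. The sketch assumes that the two special
   triples in the text are the ones with Count 0 (ignored) and Count
   255 (no limit), and it assumes that the "first triple wins" rule is
   applied before the Count-0 rule; the text does not state an order.

```python
import struct


def parse_multiple_labels_value(value):
    """Parse the Capability value into {(afi, safi): limit}.

    A limit of None means "no limit". Raises ValueError when the
    length is not a multiple of four (malformed Capability).
    """
    if len(value) % 4 != 0:
        raise ValueError("Multiple Labels Capability is malformed")

    limits = {}
    seen = set()
    for offset in range(0, len(value), 4):
        # Each triple: 2-octet AFI, 1-octet SAFI, 1-octet Count.
        afi, safi, count = struct.unpack_from("!HBB", value, offset)
        key = (afi, safi)
        if key in seen:
            continue  # all but the first triple for an AFI/SAFI are ignored
        seen.add(key)
        if count == 0:
            continue  # assumption: a Count of 0 means the triple is ignored
        # Assumption: a Count of 255 means no limit on the label count.
        limits[key] = None if count == 255 else count
    return limits
```

   A caller would typically keep the parsed result for both its own
   Capability and its peer's, since the rules later in this section
   depend on both.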

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |             AFI               |     SAFI      |    Count      ~
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   ~             AFI               |     SAFI      |    Count      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

          Figure 1: Value Field of Multiple Labels Capability

   If a BGP OPEN message contains multiple copies of the Multiple Labels
   Capability, only the first copy is significant; subsequent copies
   MUST be ignored.

   If a BGP speaker has sent the Multiple Labels Capability in its BGP
   OPEN message for a particular BGP session, and has also received the
   Multiple Labels Capability in its peer's BGP OPEN message for that
   session, and if both Capabilities specify AFI/SAFI x/y, then:

      when using an UPDATE of AFI x and SAFI y to advertise the binding
      of a label or sequence of labels to a given prefix, the BGP
      speaker MUST use the encoding of Section 2.3. This encoding MUST
      be used even if only one label is being bound to a given prefix.

   If both BGP speakers of a given BGP session have sent the Multiple
   Labels Capability, but AFI/SAFI x/y has not been specified in both
   Capabilities, then UPDATEs of AFI/SAFI x/y on that session MUST use
   the encoding of Section 2.2, and such UPDATEs can only bind one label
   to a prefix.

   A BGP speaker SHOULD NOT send an UPDATE that binds more labels to a
   given prefix than its peer is capable of receiving, as specified in
   the Multiple Labels Capability sent by that peer. If a BGP speaker
   receives an UPDATE that binds more labels to a given prefix than the
   number of labels the BGP speaker is prepared to receive (as
   announced in its Multiple Labels Capability), the BGP speaker MUST
   apply the "treat-as-withdraw" strategy of [RFC7606] to that UPDATE.
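
   As an informal illustration, the per-AFI/SAFI negotiation and the
   send and receive checks described above can be sketched in Python.
   All names are hypothetical. The sketch assumes each speaker's
   Capability has already been reduced to a mapping from (AFI, SAFI) to
   a label limit, with None meaning "no limit", and that the mapping
   itself is None when the Capability was not sent.

```python
from typing import Dict, Optional, Tuple

AfiSafi = Tuple[int, int]
# Assumed representation: None when the Capability was not sent at all.
Capability = Optional[Dict[AfiSafi, Optional[int]]]


def multi_label_encoding_in_use(local: Capability, peer: Capability,
                                afi_safi: AfiSafi) -> bool:
    """True when both Capabilities were exchanged and both list AFI/SAFI."""
    return (local is not None and peer is not None
            and afi_safi in local and afi_safi in peer)


def should_send(label_count: int, local: Capability, peer: Capability,
                afi_safi: AfiSafi) -> bool:
    """Decide whether an UPDATE binding label_count labels may be sent."""
    if label_count < 1:
        return False
    if not multi_label_encoding_in_use(local, peer, afi_safi):
        return label_count == 1  # Section 2.2 encoding: exactly one label
    limit = peer[afi_safi]
    return limit is None or label_count <= limit


def receive_action(label_count: int, local: Capability, peer: Capability,
                   afi_safi: AfiSafi) -> str:
    """Return 'accept' or 'treat-as-withdraw' for a received UPDATE."""
    if multi_label_encoding_in_use(local, peer, afi_safi):
        limit = local[afi_safi]
        if limit is not None and label_count > limit:
            return "treat-as-withdraw"
    return "accept"
```

   Note that the sender is checked against the peer's announced limit,
   while the receiver is checked against its own announced limit.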

   Notwithstanding the number of labels that a BGP speaker has claimed
   to be able to receive, its peer MUST NOT attempt to send more labels
   than can be properly encoded in the NLRI field of the MP_REACH_NLRI
   attribute. Please note that there is only a limited amount of space
   in the NLRI field for labels:

   o  per [RFC4760] the size of this field is limited to 255 bits (not
      255 octets), including the number of bits in the prefix;

   o  in a SAFI-128 update, the prefix is at least 64 bits long, and may
      be as long as 192 bits (e.g., in a VPN-IPv6 host route).

2.2. NLRI Encoding when the Multiple Labels Capability is Not Used

   If the Multiple Labels Capability has not been both sent and received
   on a given BGP session, then in a BGP UPDATE on that session whose
   MP_REACH_NLRI attribute contains one of the AFI/SAFI combinations
   specified in Section 2, the NLRI field is encoded as shown in
   Figure 2:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    Length     |                 Label                 |Rsrv |S|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                          Prefix                               ~
   ~                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                     Figure 2: NLRI With One Label

   - Length:

      The Length field consists of a single octet. It specifies the
      length in bits of the remainder of the NLRI field.

      Note that the length will always be the sum of 20 (number of bits
      in label field) plus 3 (number of bits in Rsrv field) plus 1
      (number of bits in S field) plus the length in bits of the prefix.

      In an MP_REACH_NLRI attribute whose AFI/SAFI is 1/4, the prefix
      length will be 32 bits or less. In an MP_REACH_NLRI attribute
      whose AFI/SAFI is 2/4, the prefix length will be 128 bits or less.
      In an MP_REACH_NLRI attribute whose SAFI is 128, the prefix will
      be 96 bits or less if the AFI is 1, and will be 192 bits or less
      if the AFI is 2.

      As specified in [RFC4760], the actual length of the NLRI field
      will be the number of bits specified in the Length field, rounded
      up to the nearest integral number of octets.

   - Label:

      The Label field is a 20-bit field, containing an MPLS label value
      (see [RFC3032]).

   - Rsrv:

      This 3-bit field SHOULD be set to zero on transmission, and MUST
      be ignored on reception.

   - S:

      This 1-bit field MUST be set to one on transmission, and MUST be
      ignored on reception.

   Note that the UPDATE message not only advertises the binding between
   the prefix and the label, it also advertises a path to the prefix via
   the node identified in the "Network Address of Next Hop" field of the
   MP_REACH_NLRI attribute.

   [RFC3107] requires that if only a single label is bound to a prefix,
   the S bit must be set. If the S bit is not set, [RFC3107] specifies
   that additional labels will appear in the NLRI. However, some
   implementations assume that the NLRI will contain only a single
   label, and do not check the setting of the S bit. The procedures
   specified in the current document will interwork with such
   implementations. As long as the Multiple Labels Capability is not
   sent and received by both BGP speakers on a given BGP session, this
   document REQUIRES that only one label be specified in the NLRI, that
   the S bit be set on transmission, and that it be ignored on
   reception.

   If the procedures of [ADD-PATHS] are being used, a four-octet "path
   identifier" (as defined in Section 3 of [ADD-PATHS]) is part of the
   NLRI, and precedes the Length field.

2.3. NLRI Encoding when the Multiple Labels Capability is Used

   If the Multiple Labels Capability has been both sent and received on
   a given BGP session, then in a BGP UPDATE on that session whose
   MP_REACH_NLRI attribute contains one of the AFI/SAFI combinations
   specified in Section 2, the NLRI field is encoded as shown in
   Figure 3:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+
   |    Length     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                 Label                 |Rsrv |S~
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   ~                 Label                 |Rsrv |S|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                          Prefix                               ~
   ~                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                  Figure 3: NLRI With Multiple Labels

   - Length:

      The Length field consists of a single octet. It specifies the
      length in bits of the remainder of the NLRI field.

      Note that for each label, the length is increased by 24 bits (20
      bits in label field plus 3 bits in Rsrv field plus 1 S bit).

      In an MP_REACH_NLRI attribute whose AFI/SAFI is 1/4, the prefix
      length will be 32 bits or less. In an MP_REACH_NLRI attribute
      whose AFI/SAFI is 2/4, the prefix length will be 128 bits or less.
      In an MP_REACH_NLRI attribute whose SAFI is 128, the prefix will
      be 96 bits or less if the AFI is 1, and will be 192 bits or less
      if the AFI is 2.

      As specified in [RFC4760], the actual length of the NLRI field
      will be the number of bits specified in the Length field, rounded
      up to the nearest integral number of octets.

   - Label:

      The Label field is a 20-bit field, containing an MPLS label value
      ([RFC3032]).

   - Rsrv:

      This 3-bit field SHOULD be set to zero on transmission, and MUST
      be ignored on reception.

   - S:

      In all labels except the last (i.e., in all labels except the one
      immediately preceding the prefix), the S bit MUST be 0.
      In the last label, the S bit MUST be 1.

      Note that failure to set the S bit in the last label will make it
      impossible to parse the NLRI correctly. See Section 3 paragraph j
      of [RFC7606] for a discussion of error handling when the NLRI
      cannot be parsed.

   Note that the UPDATE message not only advertises the binding between
   the prefix and the labels, it also advertises a path to the prefix
   via the node identified in the "next hop" field of the MP_REACH_NLRI
   attribute.

   If the procedures of [ADD-PATHS] are being used, a four-octet "path
   identifier" (as defined in Section 3 of [ADD-PATHS]) is part of the
   NLRI, and precedes the Length field.

2.4. How to Explicitly Withdraw the Binding of a Label to a Prefix

   Suppose a BGP speaker has announced, on a given BGP session, the
   binding of a given label or sequence of labels to a given prefix.
   Suppose it now wishes to withdraw that binding. To do so, it may
   send a BGP UPDATE message with an MP_UNREACH_NLRI attribute. The
   NLRI field of this attribute is encoded as follows:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    Length     |        Compatibility                          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                          Prefix                               ~
   ~                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                     Figure 4: NLRI For Withdrawal

   Upon transmission, the Compatibility field SHOULD be set to 0x800000.
   Upon reception, the value of the Compatibility field MUST be ignored.

   This encoding is used for explicitly withdrawing the binding, on a
   given BGP session, between the specified prefix and whatever label or
   sequence of labels had previously been bound by the procedures of
   this document to that prefix on the given session.
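
   As an informal illustration, the NLRI layouts of Figures 2, 3, and 4
   can be sketched in Python. The function names are hypothetical. The
   sketch omits the [ADD-PATHS] path identifier, and it leaves the
   choice between one label and several labels to the caller, which is
   assumed to have applied the Capability rules of Section 2.1.

```python
def _prefix_octets(prefix, prefix_bits):
    """Return the prefix truncated to the octets that prefix_bits needs."""
    needed = (prefix_bits + 7) // 8  # round up to whole octets
    if len(prefix) < needed:
        raise ValueError("prefix shorter than prefix_bits requires")
    return bytes(prefix[:needed])


def encode_labeled_nlri(labels, prefix, prefix_bits):
    """Encode Length, one or more label entries, and the prefix."""
    if not labels:
        raise ValueError("at least one label is required")
    length_bits = 24 * len(labels) + prefix_bits
    if length_bits > 255:
        raise ValueError("NLRI Length field cannot exceed 255 bits")
    out = bytearray([length_bits])
    for index, label in enumerate(labels):
        if not 0 <= label < (1 << 20):
            raise ValueError("label does not fit in 20 bits")
        # S bit is 1 only in the last entry; Rsrv bits are sent as zero.
        s_bit = 1 if index == len(labels) - 1 else 0
        out += ((label << 4) | s_bit).to_bytes(3, "big")
    out += _prefix_octets(prefix, prefix_bits)
    return bytes(out)


def encode_withdraw_nlri(prefix, prefix_bits):
    """Encode the withdrawal NLRI with Compatibility set to 0x800000."""
    length_bits = 24 + prefix_bits
    if length_bits > 255:
        raise ValueError("NLRI Length field cannot exceed 255 bits")
    compatibility = (0x800000).to_bytes(3, "big")
    return (bytes([length_bits]) + compatibility
            + _prefix_octets(prefix, prefix_bits))
```

   With a single label, encode_labeled_nlri produces the layout of
   Figure 2; with several labels it produces the layout of Figure 3.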
   This encoding is used whether or not the Multiple Labels Capability
   has been sent or received on the session. Note that label/prefix
   bindings that were not advertised on the given session cannot be
   withdrawn by this method.

   When using an MP_UNREACH_NLRI attribute to withdraw a route whose
   NLRI was previously specified in an MP_REACH_NLRI attribute, the
   lengths and values of the respective prefixes must match, and the
   respective AFI/SAFIs must match. If the procedures of [ADD-PATHS]
   are being used, the respective values of the "path identifier" fields
   must match as well. Note that the prefix length is not the same as
   the NLRI length; to determine the prefix length of a prefix in an
   MP_UNREACH_NLRI, the length of the Compatibility field must be
   subtracted from the length of the NLRI.

   An explicit withdrawal in a SAFI-x UPDATE on a given BGP session not
   only withdraws the binding between the prefix and the label(s), it
   also withdraws the path to that prefix that was previously advertised
   in a SAFI-x UPDATE on that session.

   [RFC3107] allowed one to specify a particular label value in the
   Compatibility field. However, the functionality that required the
   presence of a particular label value (or sequence of label values)
   was never implemented, and that functionality is not present in the
   current document. Hence the value of this field is of no
   significance; there is never any reason for this field to contain a
   label value or a sequence of label values.

   [RFC3107] also allowed one to withdraw a binding without specifying
   the label explicitly, by setting the Compatibility field to 0x800000.
   However, some implementations set it to 0x000000. In order to ensure
   backwards compatibility, this document RECOMMENDS that the
   Compatibility field be set to 0x800000, but REQUIRES that it be
   ignored upon reception.

2.5. Changing the Label that is Bound to a Prefix

   Suppose a BGP speaker, S1, has received on a given BGP session, a
   SAFI-4 or SAFI-128 UPDATE, U1, that specifies label (or sequence of
   labels) L1, prefix P, and next hop N1. As specified above, this
   indicates that label (or sequence of labels) L1 is bound to prefix P
   at node N1. Suppose that S1 now receives, on the same session, an
   UPDATE, U2, of the same AFI/SAFI, that specifies label (or sequence
   of labels) L2, prefix P, and the same next hop, N1.

   o  If [ADD-PATHS] is not being used, UPDATE U2 MUST be interpreted as
      meaning that L2 is now bound to P at N1, and that L1 is no longer
      bound to P at N1. That is, the UPDATE U1 is implicitly withdrawn,
      and is replaced by UPDATE U2.

   o  Suppose that [ADD-PATHS] is being used, that UPDATE U1 has Path
      Identifier I1, and that UPDATE U2 has Path Identifier I2.

      *  If I1 is the same as I2, UPDATE U2 MUST be interpreted as
         meaning that L2 is now bound to P at N1, and that L1 is no
         longer bound to P at N1. UPDATE U1 is implicitly withdrawn.

      *  If I1 is not the same as I2, U2 MUST be interpreted as meaning
         that L2 is now bound to P at N1, but MUST NOT be interpreted as
         meaning that L1 is no longer bound to P at N1. Under certain
         conditions (specification of which is outside the scope of this
         document), S1 may choose to load balance traffic between the
         path represented by U1 and the path represented by U2. To send
         traffic on the path represented by U1, S1 uses the label(s)
         advertised in U1; to send traffic on the path represented by
         U2, S1 uses the label(s) advertised in U2. (Although these two
         paths have the same next hop, one must suppose that they
         diverge further downstream.)

   Suppose a BGP speaker, S1, has received, on a given BGP session, a
   SAFI-4 or SAFI-128 UPDATE that specifies label L1, prefix P, and next
   hop N1.
   Suppose that S1 now receives, on a different BGP session, an
   UPDATE, of the same AFI/SAFI, that specifies label L2, prefix P, and
   the same next hop, N1. BGP speaker S1 SHOULD treat this as
   indicating that N1 has at least two paths to P, and S1 MAY use this
   fact to do load-balancing of any traffic that it has to send to P.

   Note that this section discusses only the case where two UPDATEs have
   the same next hop. Procedures for the case where two UPDATEs have
   different next hops are adequately described in [RFC4271].

3. Installing and/or Propagating SAFI-4 or SAFI-128 Routes

3.1. Comparability of Routes

   Suppose a BGP speaker has received two SAFI-4 UPDATEs specifying the
   same Prefix, and that either:

   o  the two UPDATEs are received on different BGP sessions, or

   o  the two UPDATEs are received on the same session, [ADD-PATHS] is
      used on that session, and the NLRIs of the two UPDATEs have
      different path identifiers.

   These two routes MUST be considered to be comparable, even if they
   specify different labels. Thus the BGP bestpath selection procedures
   (Section 9.1 of [RFC4271]) are applied to select one of them as the
   better path. If the procedures of [ADD-PATHS] are not being used on
   a particular BGP session, only the best path is propagated on that
   session. If the procedures of [ADD-PATHS] are being used on a
   particular BGP session, then both paths may be propagated on that
   session, though with different path identifiers.

   The same applies to SAFI-128 routes.

3.2. Modification of Label(s) Field When Propagating

3.2.1. When the Next Hop Field is Unchanged

   When a SAFI-4 or SAFI-128 route is propagated, if the "Network
   Address of Next Hop" field is left unchanged, the label field(s) MUST
   also be left unchanged.

   Note that a given route MUST NOT be propagated to a given peer if the
   route's NLRI has multiple labels, but the peer can only handle a
   single label in the NLRI. Similarly, a given route SHOULD NOT be
   propagated to a given peer if the route's NLRI has more labels than
   the peer has announced (through its "Multiple Labels" Capability)
   that it can handle. In either case, if a previous route with the
   same AFI, SAFI, and prefix (but with fewer labels) has already been
   propagated to the peer, that route MUST be withdrawn from that peer,
   using the procedure of Figure 4.

3.2.2. When the Next Hop Field is Changed

   If the "Network Address of Next Hop" field is changed before a SAFI-4
   or SAFI-128 route is propagated, the label field(s) of the propagated
   route MUST contain the label(s) that is (are) bound to the prefix at
   the new next hop.

   Suppose BGP speaker S1 has received an UPDATE that binds a particular
   sequence of one or more labels to a particular prefix. If S1 chooses
   to propagate this route after changing its next hop, S1 may change
   the label in any of the following ways, depending upon local policy:

   o  A single label may be replaced by a single label, of the same or
      different value.

   o  A sequence of multiple labels may be replaced by a single label.

   o  A single label may be replaced by a sequence of multiple labels.

   o  A sequence of multiple labels may be replaced by a sequence of
      multiple labels; the number of labels may be left the same, or may
      be changed.

   Of course, when deciding whether to propagate, to a given BGP peer,
   an UPDATE binding a sequence of more than one label, a BGP speaker
   must attend to the information provided by the Multiple Labels
   Capability (see Section 2.1).
A BGP speaker MUST NOT send multiple labels to a peer with which it has not exchanged the "Multiple Labels" Capability, and SHOULD NOT send more labels to a given peer than the peer has announced (via the "Multiple Labels" Capability) that it can handle.

It is possible that a BGP speaker's local policy will tell it to encode N labels in a given route's NLRI before propagating the route, but that one of the BGP speaker's peers cannot handle N labels in the NLRI. In this case, the BGP speaker has two choices:

o It can propagate the route to the given peer with fewer than N labels. However, whether this makes sense, and if so, how to choose the labels, is also a matter of local policy.

o It can decide not to propagate the route to the given peer. In that case, if a previous route with the same AFI, SAFI, and prefix (but with fewer labels) has already been propagated to that peer, that route MUST be withdrawn from that peer, using the procedure of Figure 4.

4. Data Plane

In the following, we will use the phrase "node S tunnels packet P to node N", where packet P is an MPLS packet. By this phrase, we mean that node S encapsulates packet P and causes packet P to be delivered to node N, in such a way that P's label stack before encapsulation will be seen unchanged by N, but will not be seen by the nodes (if any) between S and N.

If the tunnel is a Label Switched Path (LSP), encapsulating the packet may be as simple as pushing on another MPLS label. If node N is a layer 2 adjacency of node S, a layer 2 encapsulation may be all that is needed. Other sorts of tunnels (e.g., IP tunnels, GRE tunnels, UDP tunnels) may also be used, depending upon the particular deployment scenario.
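The LSP case can be pictured with a small sketch. It is illustrative only: the function names and label values are hypothetical, a label stack is modeled as a Python list whose first element is the top of the stack, and the sketch assumes the tunnel label is removed at node N itself (that is, no penultimate hop popping).

```python
from typing import List


def tunnel_push(stack: List[int], tunnel_label: int) -> List[int]:
    """Node S: encapsulate by pushing one more label (index 0 is the top)."""
    return [tunnel_label] + stack


def transit_swap(stack: List[int], out_label: int) -> List[int]:
    """Intermediate node: acts on the top (tunnel) label only."""
    return [out_label] + stack[1:]


def tunnel_pop(stack: List[int]) -> List[int]:
    """Node N: remove the tunnel label, exposing the original stack."""
    return stack[1:]


original = [1001, 1002]  # packet P's label stack before encapsulation
in_flight = transit_swap(tunnel_push(original, 50), 51)
delivered = tunnel_pop(in_flight)
```

The intermediate node swaps only the tunnel label, so the stack that N recovers in delivered is the same as original, which is the property the definition of "tunnels" requires.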
Suppose BGP speaker S1 receives a SAFI-4 or SAFI-128 BGP UPDATE with an MP_REACH_NLRI specifying label L1, prefix P, and next hop N1, and suppose S1 installs this route as its (or one of its) bestpath(s) towards P. And suppose S1 propagates this route after changing the next hop to itself and changing the label to L2. Suppose further that S1 receives an MPLS data packet, and in the process of forwarding that MPLS data packet, S1 sees label L2 rise to the top of the packet's label stack. Then to forward the packet further, S1 must replace L2 with L1 as the top entry in the packet's label stack, and S1 must then tunnel the packet to N1.

Suppose that the route received by S1 specified not a single label, but a sequence of k labels <L11,L12,...,L1k>, where L11 is the first label appearing in the NLRI, and L1k is the last. And suppose again that S1 propagates this route after changing the next hop to itself and changing the label field to the single label L2. Suppose further that S1 receives an MPLS data packet, and in the process of forwarding that MPLS data packet, S1 sees label L2 rise to the top of the packet's label stack. In this case, instead of simply replacing L2 with L1, S1 removes L2 from the top of the label stack, and then pushes labels L1k through L11 onto the label stack, such that L11 is now at the top of the label stack. Then S1 must tunnel the packet to N1. (Note that L1k will not be at the bottom of the packet's label stack, and hence will not have the "bottom of stack" bit set, unless L2 had previously been at the bottom of the packet's label stack.)

The above paragraph assumes that when S1 propagates a SAFI-4 or SAFI-128 route after setting the next hop to itself, it replaces the label or labels specified in the NLRI of that route with a single label.
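The single-label behavior described so far can be sketched as follows. This is a minimal illustration, not a definitive implementation: the function name forward_at_s1 and the numeric label values are hypothetical, a label stack is modeled as a list whose first element is the top of the stack, and the subsequent tunneling of the packet to N1 is left out.

```python
from typing import List, Sequence


def forward_at_s1(stack: List[int],
                  advertised_label: int,
                  received_labels: Sequence[int]) -> List[int]:
    """Rewrite the stack when the label S1 advertised is on top.

    received_labels is the label sequence from the received NLRI, in NLRI
    order. With one received label this amounts to a swap; with k labels,
    the advertised label is removed and the k labels are pushed last to
    first, so the first NLRI label ends up on top.
    """
    if not stack or stack[0] != advertised_label:
        raise ValueError("top label is not the label S1 advertised")
    rest = stack[1:]                        # remove the advertised label
    for label in reversed(received_labels):
        rest.insert(0, label)               # push the last NLRI label first
    return rest
```

With an advertised label of 20 and an inner label of 999, forward_at_s1([20, 999], 20, [10]) gives [10, 999], and forward_at_s1([20, 999], 20, [11, 12, 13]) gives [11, 12, 13, 999]. In the second case the last pushed-in label (13) is not at the bottom of the stack, because the advertised label was not at the bottom either.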
However, it is also possible, as determined by local policy, for a BGP speaker to specify multiple labels when it propagates a SAFI-4 or SAFI-128 route after setting the next hop to itself.

Suppose, for example, that S1 supports context labels ([RFC5331]). Let L21 be a context label supported by S1, and let L22 be a label that is in the label space identified (at S1) by L21. Suppose S1 receives a SAFI-4 or SAFI-128 UPDATE whose prefix is P, whose label field is <L11,L12,...,L1k>, and whose next hop is N1. Before propagating the UPDATE, S1 may set the next hop to itself (by replacing N1 with S1), and may replace the label stack <L11,L12,...,L1k> with the pair of labels <L21,L22>.

In this case, if S1 receives an MPLS data packet whose top label is L21 and whose second label is L22, S1 will remove both L21 and L22 from the label stack, and replace them with <L11,L12,...,L1k>. Note that the fact that L21 is a context label is known only to S1; other BGP speakers do not know how S1 will interpret L21 (or L22).

The ability to replace one or more labels by one or more labels can provide great flexibility, but must be done carefully. Let's suppose again that S1 receives an UPDATE that specifies prefix P, label stack <L11,L12,...,L1k>, and next hop N1. And suppose that S1 propagates this UPDATE to BGP speaker S2 after setting next hop self and after replacing the label field with <L21,L22>. Finally, suppose that S1 programs its data plane so that when it processes a received MPLS packet whose top label is L21, it replaces L21 with <L11,L12,...,L1k>, and then tunnels the packet to N1.

In this case, BGP speaker S2 will have received a route with prefix P, label field <L21,L22>, and next hop S1. If S2 decides to forward an IP packet according to this route, it will push <L21,L22> onto the packet's label stack, and tunnel the packet to S1. S1 will replace L21 with <L11,L12,...,L1k>, and will tunnel the packet to N1. N1 will receive the packet with the following label stack: <L11,L12,...,L1k,L22>.
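The difference between the two data-plane choices just described (S1 matching both advertised labels, versus matching only the top one) can be traced with a short sketch. It is illustrative only: the function name replace_top is hypothetical, labels are modeled as strings, k is taken to be 3, and the first list element is the top of the stack.

```python
from typing import List, Sequence


def replace_top(stack: Sequence[str],
                match: Sequence[str],
                new_labels: Sequence[str]) -> List[str]:
    """Replace the labels in `match`, found at the top of `stack`, with `new_labels`."""
    n = len(match)
    if list(stack[:n]) != list(match):
        raise ValueError("top of stack does not match the expected labels")
    return list(new_labels) + list(stack[n:])


received = ["L11", "L12", "L13"]   # label field S1 received from N1 (k = 3)
from_s2 = ["L21", "L22"]           # stack pushed by S2 on an IP packet

# Context-label case: S1 removes both L21 and L22.
both_matched = replace_top(from_s2, ["L21", "L22"], received)

# Second case: S1 replaces only L21, so L22 stays underneath.
top_only = replace_top(from_s2, ["L21"], received)
```

In the second case top_only still carries L22 below the labels that N1 advertised, so N1 has to be prepared to find a label it did not assign once it has processed its own labels.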
While this may be useful in certain scenarios, it may provide unintended results in other scenarios.

Procedures for choosing, setting up, maintaining, or determining the liveness of a particular tunnel or type of tunnel, are outside the scope of this document.

It is a matter of local policy whether SAFI-4 routes can be used as the basis for forwarding IP packets, or whether SAFI-4 routes can only be used for forwarding MPLS packets. If BGP speaker S1 is forwarding IP packets according to SAFI-4 routes, then consider an IP packet with destination address D, such that P is the "longest prefix match" for D from among the routes that are being used to forward IP packets. And suppose the packet is being forwarded according to a SAFI-4 route whose prefix is P, whose next hop is N1, and whose sequence of labels is L1. To forward the packet according to this route, S1 must create a label stack for the packet, push on the sequence of labels L1, and then tunnel the packet to N1.

5. Relationship Between SAFI-4 and SAFI-1 Routes

It is possible that a BGP speaker will receive both a SAFI-1 route for prefix P and a SAFI-4 route for prefix P. The significance of this is a matter of local policy.

For example, some implementations may regard SAFI-1 routes and SAFI-4 routes as completely independent, and may treat them in a "ships in the night" fashion. In this case, bestpath selection for the two SAFIs is independent, and there will be a best SAFI-1 route to P as well as a best SAFI-4 route to P. Which packets get forwarded according to the routes of which SAFI is then a matter of local policy.

Other implementations may treat the SAFI-1 and SAFI-4 routes for a given prefix as comparable, such that the best route to prefix P is either a SAFI-1 route or a SAFI-4 route, but not both.
In implementations of this sort, if load balancing is done among a set of equal cost routes, some of the equal cost routes may be SAFI-1 routes and some may be SAFI-4 routes. Whether this is allowed is again a matter of local policy.

Some implementations may allow a single BGP session to carry UPDATEs of both SAFI-1 and SAFI-4; other implementations may disallow this.

A BGP speaker may receive a SAFI-4 route over a given BGP session, but may have other BGP sessions for which SAFI-4 is not enabled. In this case, the BGP speaker MAY convert the SAFI-4 route to a SAFI-1 route and then propagate the result over the session on which SAFI-4 is not enabled. Whether this is done is a matter of local policy.

6. IANA Considerations

IANA is requested to assign a codepoint for "Multiple Labels Capability" in the "BGP Capability Codes" registry, with this document as the reference.

IANA is requested to modify the "BGP Capability Codes" registry to mark Capability Code 4 ("Multiple routes to a destination") as deprecated, with this document as the reference.

IANA is requested to change the reference for SAFI 4 in the "Subsequent Address Family Identifiers (SAFI) Parameters" to this document. IANA is also requested to add this document as a reference for SAFI 128 in that same registry.

7. Security Considerations

The security considerations of the BGP specification ([RFC4271]) apply.

If a BGP implementation, not conformant with the current document, encodes multiple labels in the NLRI but has not sent and received the "Multiple Labels" Capability, a BGP implementation that does conform with the current document will likely reset the BGP session.

This document specifies that certain data packets be "tunneled" from one BGP speaker to another. This requires that the packets be encapsulated while in flight.
This document does not specify the encapsulation to be used. However, if a particular encapsulation is used, the security considerations of that encapsulation are applicable.

If a particular tunnel encapsulation does not provide integrity and authentication, it is possible that a data packet's label stack can be modified, through error or malfeasance, while the packet is in flight. This can result in misdelivery of the packet.

There are various techniques one can use to constrain the distribution of BGP UPDATE messages. If a BGP UPDATE advertises the binding of a particular (set of) label(s) to a particular address prefix, such techniques can be used to control the set of BGP speakers that are intended to learn of that binding. However, if BGP sessions do not provide privacy, other routers may learn of that binding.

When a BGP speaker processes a received MPLS data packet whose top label it advertised, there is no guarantee that the label in question was put on the packet by a router that was intended to know about the label binding. If a BGP speaker is using the procedures of this document, it may be useful for that speaker to distinguish its "internal" interfaces from its "external" interfaces, and to avoid advertising the same labels to BGP speakers reached on internal interfaces as to BGP speakers reached on external interfaces. Then a data packet can be discarded if its top label was not advertised over the type of interface from which the packet was received. This reduces the likelihood of forwarding packets whose labels have been "spoofed" by untrusted sources.

8. Acknowledgements

This draft obsoletes RFC 3107. We wish to thank Yakov Rekhter, co-author of RFC 3107, for his work on that document. We also wish to thank Ravi Chandra, Enke Chen, Srihari R. Sangli, Eric Gray, and Liam Casey for their review of and comments on that document.
We thank Alexander Okonnikov and David Lamparter for pointing out a number of the errors in RFC 3107.

We wish to thank Lili Wang and Kaliraj Vairavakkalai for their help and advice during the preparation of this document.

We also thank Bruno Decraene, Jie Dong, Jeff Haas, Jakob Heitz, Keyur Patel, and Kevin Wang for their review of and comments on this document.

9. References

9.1. Normative References

[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1997.

[RFC3031] Rosen, E., Viswanathan, A., and R. Callon, "Multiprotocol Label Switching Architecture", RFC 3031, DOI 10.17487/RFC3031, January 2001.

[RFC3032] Rosen, E., Tappan, D., Fedorkow, G., Rekhter, Y., Farinacci, D., Li, T., and A. Conta, "MPLS Label Stack Encoding", RFC 3032, DOI 10.17487/RFC3032, January 2001.

[RFC3107] Rekhter, Y. and E. Rosen, "Carrying Label Information in BGP-4", RFC 3107, DOI 10.17487/RFC3107, May 2001.

[RFC4271] Rekhter, Y., Ed., Li, T., Ed., and S. Hares, Ed., "A Border Gateway Protocol 4 (BGP-4)", RFC 4271, DOI 10.17487/RFC4271, January 2006.

[RFC4364] Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private Networks (VPNs)", RFC 4364, DOI 10.17487/RFC4364, February 2006.

[RFC4659] De Clercq, J., Ooms, D., Carugi, M., and F. Le Faucheur, "BGP-MPLS IP Virtual Private Network (VPN) Extension for IPv6 VPN", RFC 4659, DOI 10.17487/RFC4659, September 2006.

[RFC4760] Bates, T., Chandra, R., Katz, D., and Y. Rekhter, "Multiprotocol Extensions for BGP-4", RFC 4760, DOI 10.17487/RFC4760, January 2007.

[RFC4798] De Clercq, J., Ooms, D., Prevost, S., and F. Le Faucheur, "Connecting IPv6 Islands over IPv4 MPLS Using IPv6 Provider Edge Routers (6PE)", RFC 4798, DOI 10.17487/RFC4798, February 2007.

[RFC5492] Scudder, J. and R.
Chandra, "Capabilities Advertisement with BGP-4", RFC 5492, DOI 10.17487/RFC5492, February 2009.

[RFC5549] Le Faucheur, F. and E. Rosen, "Advertising IPv4 Network Layer Reachability Information with an IPv6 Next Hop", RFC 5549, DOI 10.17487/RFC5549, May 2009.

[RFC7606] Chen, E., Ed., Scudder, J., Ed., Mohapatra, P., and K. Patel, "Revised Error Handling for BGP UPDATE Messages", RFC 7606, DOI 10.17487/RFC7606, August 2015.

9.2. Informative References

[ADD-PATHS] Walton, D., Retana, A., Chen, E., and J. Scudder, "Advertisement of Multiple Paths in BGP", internet-draft draft-ietf-idr-add-paths-15, May 2016.

[Enhanced-GR] Patel, K., Chen, E., Fernando, R., and J. Scudder, "Accelerated Routing Convergence for BGP Graceful Restart", internet-draft draft-ietf-idr-enhanced-gr-05, December 2014.

[RFC4724] Sangli, S., Chen, E., Fernando, R., Scudder, J., and Y. Rekhter, "Graceful Restart Mechanism for BGP", RFC 4724, DOI 10.17487/RFC4724, January 2007.

[RFC5331] Aggarwal, R., Rekhter, Y., and E. Rosen, "MPLS Upstream Label Assignment and Context-Specific Label Space", RFC 5331, DOI 10.17487/RFC5331, August 2008.

[RFC6514] Aggarwal, R., Rosen, E., Morin, T., and Y. Rekhter, "BGP Encodings and Procedures for Multicast in MPLS/BGP IP VPNs", RFC 6514, DOI 10.17487/RFC6514, February 2012.

[RFC7432] Sajassi, A., Ed., Aggarwal, R., Bitar, N., Isaac, A., Uttaro, J., Drake, J., and W. Henderickx, "BGP MPLS-Based Ethernet VPN", RFC 7432, DOI 10.17487/RFC7432, February 2015.

[TUNNEL-ENCAPS] Rosen, E., Patel, K., and G. Vandevelde, "The BGP Tunnel Encapsulation Attribute", internet-draft draft-ietf-idr-tunnel-encaps-01, December 2015.

Author's Address

Eric C. Rosen (editor)
Juniper Networks, Inc.
10 Technology Park Drive
Westford, Massachusetts 01886
United States

Email: erosen@juniper.net