idnits 2.17.1

draft-chan-idr-bgp-lu2-04.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  == The page length should not exceed 58 lines per page, but there was 6
     longer pages, the longest (page 7) being 61 lines

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** There are 146 instances of too long lines in the document, the longest
     one being 14 characters in excess of 72.

  == There are 1 instance of lines with non-RFC6890-compliant IPv4 addresses
     in the document. If these are example addresses, they should be changed.

  == There are 7 instances of lines with private range IPv4 addresses in the
     document. If these are generic example addresses, they should be changed
     to use any of the ranges defined in RFC 6890 (or successor): 192.0.2.x,
     198.51.100.x or 203.0.113.x.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  == The "Author's Address" (or "Authors' Addresses") section title is
     misspelled.

  == The document doesn't use any RFC 2119 keywords, yet seems to have RFC
     2119 boilerplate text.

  -- The document date (Aug 23, 2021) is 977 days in the past. Is this
     intentional?

  Checking references for intended status: Experimental
  ----------------------------------------------------------------------------

  == Missing Reference: 'RFC5492' is mentioned on line 253, but not defined

  == Missing Reference: 'RFC4760' is mentioned on line 283, but not defined

  == Missing Reference: '1000 2000' is mentioned on line 337, but not defined

  == Missing Reference: '1001 3000' is mentioned on line 339, but not defined

  == Missing Reference: '1002 4000' is mentioned on line 341, but not defined

  == Missing Reference: 'FLEXALGO' is mentioned on line 397, but not defined

  == Missing Reference: '100101 100001' is mentioned on line 407, but not
     defined

  == Missing Reference: '400101 400001' is mentioned on line 409, but not
     defined

  == Unused Reference: 'RFC3107' is defined on line 464, but no explicit
     reference was found in the text

  == Unused Reference: 'RFC4360' is defined on line 469, but no explicit
     reference was found in the text

  == Unused Reference: 'RFC5512' is defined on line 473, but no explicit
     reference was found in the text

  == Unused Reference: 'RFC5575' is defined on line 479, but no explicit
     reference was found in the text

  == Unused Reference: 'RFC7311' is defined on line 485, but no explicit
     reference was found in the text

  == Unused Reference: 'RFC8277' is defined on line 496, but no explicit
     reference was found in the text

  -- Obsolete informational reference (is this intentional?): RFC 3107
     (Obsoleted by RFC 8277)

  -- Obsolete informational reference (is this intentional?): RFC 5512
     (Obsoleted by RFC 9012)

  -- Obsolete informational reference (is this intentional?): RFC 5575
     (Obsoleted by RFC 8955)

     Summary: 1 error (**), 0 flaws (~~), 20 warnings (==), 4 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.
--------------------------------------------------------------------------------

2   IDR Working Group                                            Louis Chan
3   INTERNET-DRAFT
4   Intended status: Experimental                          Juniper Networks
5   Expires: Feb 23, 2022                                      Aug 23, 2021

7                  Color Operation with BGP Label Unicast
8                      draft-chan-idr-bgp-lu2-04.txt

10  Abstract

12  This document specifies how to carry colored path advertisements via an enhancement
13  to the existing BGP Label Unicast protocol. It allows backward compatibility
14  with RFC8277.

16  The targeted solution is to use a stack of labels advertised via BGP Label Unicast
17  2.0 for end to end traffic steering across multiple IGP domains. The operation is
18  similar to Segment Routing.

20  This proposed protocol will convey the necessary reachability information to the
21  ingress PE node to construct an end to end path.

23  Two further problems addressed here are interworking with Flex-Algo and the
24  MPLS label space limit.

26  Please note that there is a major change of protocol format starting from the version
27  01 draft. Except for the optional BGP capability code, the rest of the BGP attributes
28  used in this draft are defined in previous RFCs or are in use today in other scenarios.

30  Status of this Memo

32  This Internet-Draft is submitted in full conformance with the provisions of BCP 78
33  and BCP 79.

35  Internet-Drafts are working documents of the Internet Engineering Task Force
36  (IETF). Note that other groups may also distribute working documents as Internet-
37  Drafts. The list of current Internet-Drafts is at
38  http://datatracker.ietf.org/drafts/current/.

40  Internet-Drafts are draft documents valid for a maximum of six months and may be
41  updated, replaced, or obsoleted by other documents at any time. It is
42  inappropriate to use Internet-Drafts as reference material or to cite them other
43  than as "work in progress."

45  This Internet-Draft will expire on Feb 23, 2022.
47  Copyright Notice

49  Copyright (c) 2017 IETF Trust and the persons identified as the document authors.
50  All rights reserved.

52  This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating
53  to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of
54  publication of this document. Please review these documents carefully, as they
55  describe your rights and restrictions with respect to this document. Code
56  Components extracted from this document must include Simplified BSD License text as
57  described in Section 4.e of the Trust Legal Provisions and are provided without
58  warranty as described in the Simplified BSD License.

60  Table of Contents

62     1. Introduction...................................................2
63     2. Conventions used in this document..............................4
64     3. Carrying Label Mapping Information with Color and Label Stack..4
65        3.1. Use of Add-path to advertise multiple color paths.........4
66        3.2. Color extended community for BGP Labeled Unicast..........5
67        3.3. Color extended community for service prefixes.............6
68        3.4. Color Slicing Capability..................................6
69     4. Uniqueness of path entries.....................................7
70     5. AIGP consideration.............................................8
71     6. Explicit Withdraw of a .............8
72     7. Error Handling Procedure.......................................8
73     8. Controller Compatibility.......................................8
74     9. Interworking with Flex Algo....................................9
75     10. Label stacking to increase label space........................9
76     11. Tunneling SRv6 packet via MPLS................................9
77     12. Security Considerations......................................10
78     13. IANA Considerations..........................................10
79     14. References...................................................10
80        14.1. Normative References....................................10
81        14.2. Informative References..................................10
82     15. Acknowledgments..............................................11

84  1. Introduction

86  The proposed protocol aims to solve interdomain traffic steering, with
87  different transport services in mind. One application is low latency service across
88  multiple IGP domains, which could scale up to a network of 100k or more routers.

90  BGP is a flexible protocol. With the addition of a color attribute to BGP Label
91  Unicast, a path with a specific color is given a meaning in application - a low
92  latency path, a fully protected path, or a path for diversity.

94  The stack of labels represents an end to end path across domains through each ABR
95  or ASBR. Each ABR or ASBR takes one label from the stack, and hence picks the
96  forwarding path to the next ABR, ASBR, or the final destination.

98  Each label in the stack may be derived from any of the below
99     - Prefix SID
100    - Binding SID for RSVP LSP
101    - Binding SID for SR-TE LSP
102    - Local assigned label

104 The enhancement to the original RFC8277 is to add the color extended community, with
105 multiple advertisements allowed. The result is similar to multi-topology BGP-LU with
106 different colors.

108 With the Add-path [RFC7911] feature, the non-colored RIB and colored RIBs could be advertised
109 to the BGP neighbors without new additional attributes. The Add-path capability is
110 required to advertise multiple paths with the same prefix but different colors.

112 A new BGP capability [BGP-CAP] should be required to enforce such slicing operation during
113 negotiation.

115 On the other hand, to enable service prefixes (L3VPN, L2VPN, EVPN and IP
116 prefixes with BGP signaling) to be mapped accordingly, the color extended community
117 is also added to them. In the PE node, service prefixes with a color are
118 matched to a transport tunnel with the same color.

120 The following is an example.
Between PE1 and PE2, there is a VPN service running
121 with label 16, which is associated with color 100.

123    PE1----ABR1-----ABR2-----PE2

125 PE1 will send the following labels with a color 100 path plus VPN label

127    [2001 13001 801 16], where

129    2001 - SR label to reach ABR1

131    13001 - a Binding-SID label for ABR1-ABR2 tunnel. Underlying tunnel type is RSVP-TE

133    801 - a Binding-SID label for ABR2-PE2 tunnel. Underlying tunnel type is SR-TE

135    16 - a VPN label, which is signaled via other means

137    [2001 13001 801] - denotes the label stack for this color 100 path to reach PE2

139 This document describes how PE1 gains enough information to build
140 this label stack across routing domains.

142 If PE1 wants to reach PE2 with another colored path, say color 200, the label stack
143 could be different.

145 At the same time, this architecture is also controller friendly, since all the
146 notation is Segment Routing compatible, such as the use of Binding-SID.

148 The above architecture could be used in conjunction with Flex-Algo [FLEXAGLO] where
149 one color could represent a Flex Algorithm, e.g. color 128 equals Algo 128.
150 When used with Flex Algo in a huge network, there could be a label space limit. The
151 MPLS label is 20 bits long and the maximum label space is around 1 million. In order
152 to represent more IPv4 or IPv6 nodes, the label stacking method is recommended. One IP
153 loopback address could be represented by one or more labels. In this case, (20 bits x
154 n) of label address space is possible.

156 2. Conventions used in this document

158 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD",
159 "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be
160 interpreted as described in RFC 2119 [RFC2119].

162 In this document, these words will appear with that interpretation only when in ALL
163 CAPS.
Lower case uses of these words are not to be interpreted as carrying
164 the significance described in RFC 2119.

166 3. Carrying Label Mapping Information with Color and Label Stack

168 3.1. Use of Add-path to advertise multiple color paths

170 The Path Identifier is used to allow multiple advertisements of the same prefix
171 but with different colors or a null color.

173 The extended NLRI format is as follows

175 +--------------------------------+
176 |   Path Identifier (4 octets)   |
177 +--------------------------------+
178 |        Length (1 octet)        |
179 +--------------------------------+
180 |        Label (3 octets)        ~
181 +--------------------------------+
182 ~        Label (3 octets)        |
183 +--------------------------------+
184 |        Prefix (variable)       |
185 +--------------------------------+

187 3.2. Color extended community for BGP Labeled Unicast

189 The Color Extended Community added here is an opaque extended community from
190 RFC4360 and RFC5512. This draft allows multiple color values to be advertised.

192  0                   1                   2                   3
193  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
194 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
195 |     0x03      |     0x0b      |C|O|      Reserved       |X|X|X|
196 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
197 |                          Color Value                          ~
198 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
199 ~     0x03      |     0x0b      |C|O|      Reserved       |X|X|X|
200 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
201 |                          Color Value                          |
202 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

204                Figure 1: Color value advertisement format

206 In both BGP update and MP_UNREACH_NLRI messages, multiple color extended communities
207 could be included. This means that multiple colors, indicating different kinds of
208 services, could share the same label stack. With the use of Path-ID, the multiple
209 colors are considered as one bundled update. Any subsequent update is based on
210 Path-ID.
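
As a rough illustration of the layout in Figure 1, the following Python sketch packs and unpacks one 8-octet Color Extended Community. The function names are hypothetical and not part of this draft. The bit positions assume the usual RFC convention that bit 0 in the figure is the most significant bit, so C and O are the two high bits of the 16-bit flags field and the three X bits are its three low bits.

```python
import struct

COLOR_TYPE = 0x03     # first octet shown in Figure 1
COLOR_SUBTYPE = 0x0B  # second octet shown in Figure 1


def pack_color_community(color, x_bits=0, c_flag=0, o_flag=0):
    """Encode one Color Extended Community as 8 octets (hypothetical helper)."""
    if not 0 <= color <= 0xFFFFFFFF:
        raise ValueError("color must fit in 32 bits")
    if not 0 <= x_bits <= 7:
        raise ValueError("the X field is 3 bits wide")
    # Assumed layout of the 16-bit flags field: C, O, 11 reserved bits, 3 X bits.
    flags = ((c_flag & 1) << 15) | ((o_flag & 1) << 14) | x_bits
    # "!" selects network byte order: type, sub-type, flags, color value.
    return struct.pack("!BBHI", COLOR_TYPE, COLOR_SUBTYPE, flags, color)


def unpack_color_community(data):
    """Decode 8 octets into (color, x_bits, c_flag, o_flag)."""
    ctype, subtype, flags, color = struct.unpack("!BBHI", data)
    if (ctype, subtype) != (COLOR_TYPE, COLOR_SUBTYPE):
        raise ValueError("not a color extended community")
    return color, flags & 0x7, (flags >> 15) & 1, (flags >> 14) & 1
```

Several values encoded this way could be concatenated when more than one color is advertised for the same label stack.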
212 If no color extended community is present in a BGP update message, the update is
213 treated as normal BGP-LU without any color.

215 The 3 bits marked XXX are reserved for use by this draft.

217 XXX is interpreted as a sub-slice of a color, with values 0 to 7 in decimal,
218 or 000b to 111b in binary. These sub-slices could be used in either of the
219 following cases.

221 a) Primary path and fallback paths in order of preference
222       0 - primary path
223       1 - first and most preferred backup path
224       ....
225       7 - least preferred backup path

227 b) ECMP paths up to 8, since all paths should be active in forwarding plane.

229 Color value 0 is reserved for future interoperability purposes.

231 Color values 1 - 31 are not recommended for use; this range is reserved for
232 future use.

234 3.3. Color extended community for service prefixes

236 The same format of color extended community is advertised with service prefixes,
237 which could be VPN prefixes or IP prefixes. The order of the color extended
238 communities could be interpreted as

240    - Order of primary and fallback colors
241    - Or, ECMP of equal split between color paths

243 The above would be interpreted by the receiving PE according to its local configuration.

245 It is optional to enable sub-slice notation.

247 If sub-slice bits are used, they map directly to the corresponding sub-
248 slice path. If a sub-slice path is not available for mapping, resolution should fall back
249 to resolving by color.

251 3.4. Color Slicing Capability

253 The Color Slicing Capability is a BGP capability [RFC5492], with Capability Code xx
254 (TBD).

256 The color slicing capability is optional but preferred. It
257 could be a configurable parameter at both sides of a BGP session, but assumes
258 BGP add-path support [RFC7911]. If this specific BGP capability is not negotiated,
259 version 0 without sub-slice notation is assumed.
In this case, multiple paths
260 with a color attribute are advertised through BGP add-path.

262 The Capability Length field of this capability is variable. The Capability Value
263 field consists of one or more of the following tuples:

265 +------------------------------------------------+
266 | Address Family Identifier (2 octets)           |
267 +------------------------------------------------+
268 | Subsequent Address Family Identifier (1 octet) |
269 +------------------------------------------------+
270 | version (1 octet)                              |
271 +------------------------------------------------+
272 | Reserved (3 octets)                            |
273 +------------------------------------------------+

275 The meaning and use of the fields are as follows:

277    Address Family Identifier (AFI):

279       This field is the same as the one used in [RFC4760].

281    Subsequent Address Family Identifier (SAFI):

283       This field is the same as the one used in [RFC4760].

285    Version:

287       This field is for capability negotiation.

289        0 1 2 3 4 5 6 7
290       +-+-+-+-+-+-+-+-+
291       |v v v v|     |s|
292       +-+-+-+-+-+-+-+-+

294       Each of the 4 v bits is a flag for one version from 1 to 4, where the LSB denotes
295       support of version 1 and the MSB denotes version 4. Version 0 is the default mode of
296       operation, which is described in this document. To determine the common capability
297       between the two BGP peers, a logical AND is used to determine the highest
298       common protocol version.

300       For example, if BGP receives 0b0110 from its peer and performs an AND with its
301       own capability 0b0010, the result is 0b0010. Version 2 is selected.

303       The other examples are
304       - 0b0110 AND 0b0110, version 3 is selected
305       - 0b0100 AND 0b0010, version 0 is selected

307       Version 1 (0b0001) is reserved.

309       The S-flag indicates the use of sub-slices. It is set to 1 if sub-slice notation is
310       enforced. If either side sets the S-flag to 0, sub-slices are not in use.

312    Reserved:

314       This field is reserved for future use.

316 4. Uniqueness of path entries

318 a) The use of color can be considered as slicing into multiple BGP Label Unicast RIBs.
319 Therefore, it should be treated as unique entries for the .

322 e.g. , [labels]

324    <123, 100, 10.1.1.1/32>, [1000 2000]

326    <124, 200, 10.1.1.1/32>, [1000 2000]
327    <222, {300,400}, 10.1.1.1/32>, [1000 2000]

329    <223, null, 10.1.1.1/32>, [1000 2000]

331 All these 4 NLRI are considered different but valid entries for different color
332 instances.

334 b) With sub-slice notation
335 , [labels]

337    <901, 100-0, 10.1.1.1/32>, [1000 2000]

339    <902, 100-1, 10.1.1.1/32>, [1001 3000]

341    <903, 100-7, 10.1.1.1/32>, [1002 4000]

343 These 3 NLRI are distinct, and the second and third NLRI could be used for
344 backup or ECMP purposes.

346 5. AIGP consideration

348 AIGP (RFC7311) would also be used here to carry certain metrics across domains.

350 6. Explicit Withdraw of a

352 According to RFC8277, MP_UNREACH_NLRI can be used to remove the binding of a .

355 If a path-id is associated with a prefix with multiple colors, the withdrawal would
356 be applied to all associated colors.

358 To withdraw color(s) partially from the same path-id advertisement, a BGP update
359 should be used instead.

361 7. Error Handling Procedure

363 If a BGP receiver cannot handle the NLRI, it should silently discard it and log
364 an error.

366 8. Controller Compatibility

368 The proposed architecture is compatible with a controller for end to end
369 provisioning. Persistent labels, such as Binding-SIDs, are recommended. Hence, a
370 controller could learn these labels from the network and program a specific end to
371 end path.

373 In this case, BGP-LU2 will provide a second best path to an ingress PE node, while
374 a controller, with more external information, could provide the best path from
375 an overall perspective.

377 A controller could also be deployed on a domain by domain basis, e.g.
378 optimizing the latency of an RSVP LSP, or maintaining the bandwidth and loading between SR-
379 TE LSPs.
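
The version negotiation described in Section 3.4 can be sketched in Python as below. This is an illustrative sketch only: the function names are hypothetical, and the octet parsing assumes that bit 0 in the version figure is the most significant bit of the octet, so the four v bits form the high nibble and the S-flag is the lowest bit.

```python
def parse_version_octet(octet):
    """Split the version octet into (v_bits, s_flag).

    Assumption: bit 0 in the figure is the most significant bit,
    so v occupies the high nibble and s is the least significant bit.
    """
    v_bits = (octet >> 4) & 0x0F
    s_flag = octet & 0x01
    return v_bits, s_flag


def negotiate_version(local_v, peer_v):
    """Return the highest version flagged by both peers, or 0 if none is shared."""
    common = local_v & peer_v & 0x0F
    # bit_length() gives the 1-based position of the highest set bit,
    # which matches LSB = version 1 up to MSB = version 4.
    return common.bit_length()


def sub_slice_in_use(local_s, peer_s):
    """Sub-slice notation applies only when both sides set the S-flag."""
    return bool(local_s) and bool(peer_s)
```

With the values used in Section 3.4, negotiate_version(0b0110, 0b0010) selects version 2, negotiate_version(0b0110, 0b0110) selects version 3, and negotiate_version(0b0100, 0b0010) falls back to version 0.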
381 9. Interworking with Flex Algo

383 Flex Algo is a way of network slicing, but it operates only within the IGP. In order to
384 scale across different domains, BGP is recommended as the method to distribute the
385 information across.

387 With the color notation in this proposal, one router can distribute this information to
388 another domain via BGP.

390 There are two ways of mapping Flex-Algo to the color attribute in BGP-LU2

392 a) Color 128 equals Flex Algo 128
393 b) Or, Color 400 is mapped to Flex Algo 128

395 10. Label stacking to increase label space

397 Due to the use of Flex-Algo [FLEXALGO], the MPLS label space might reach its limit.
398 Each node will need extra labels for each Algo.

400 The idea is to use multiple labels to represent a single node. In this case, the
401 label space becomes (2^20)^n, where n is the stacking depth.

403 For IPv6 addresses, there would be enough label space even if running with SR-MPLS.

405 For example, for node 1.1.1.1, 2 consecutive labels are used to represent the node.

407    Algo 0: [100101 100001]

409    Algo 128: [400101 400001]

411 How the forwarding plane treats the stacked labels is out of scope here.

413 11. Tunneling SRv6 packet via MPLS

415    PE1-----ABR1-----ABR2-----PE2

417 In an SRv6 network, PE1 and PE2 are using SRv6 for VPN service. The network between ABR1 and
418 ABR2 is capable of MPLS only. BGP-LU2 could be used to provide
419 locator route mapping to an MPLS tunnel between the ABRs.

421 At ABR1, the mapping options could be
422 a) Use of the color attribute associated with the VPN advertisement, mapped to the
423 desired tunnel.
424 b) Up to the locator route.
For example, use the first 48 bits of the SRv6 header
425    FC00:0000:nnnn::/48 ; where nnnn is the locator portion

427 c) Making use of sub-slice information as defined in [SRV6-SUBSLICE]

429 +------------+----------------+--------------------------+
430 |  Locator   |  Sub-slice ID  |  Remainder for behavior  |
431 +------------+----------------+--------------------------+
432              |<-           Endpoint Behavior           ->|

434 Sub-slice ID could be used for mapping to different color paths in MPLS. For
435 example,

437    FC00:0000:nnnn:ssss::/64 ; where ssss is a sub-slice ID

439 ABR2 advertises a /64 prefix route inclusive of the sub-slice ID via BGP-LU2 to
440 ABR1. Hence, traffic will be redirected to an MPLS tunnel from ABR1.

442 d) With the format described in [SRV6-SUBSLICE], a mapping could be made between
443 sub-slice ID and mentioned in section 3.2.

445 12. Security Considerations

447 TBD

449 13. IANA Considerations

451 TBD. It will require a new BGP capability code to enable such color operation.

453 A new SAFI might be required as well.

455 14. References

457 14.1. Normative References

459 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels",
460           BCP 14, RFC 2119, March 1997.

462 14.2. Informative References

464 [RFC3107] Rekhter, Y. and E. Rosen, "Carrying Label Information in BGP-4", RFC
465           3107, DOI 10.17487/RFC3107, May 2001,

467           .

469 [RFC4360] Sangli, S., Tappan, D., and Y. Rekhter, "BGP Extended
470           Communities Attribute", RFC 4360, February 2006
471           .

473 [RFC5512] Mohapatra, P. and E. Rosen, "The BGP Encapsulation Subsequent Address
474           Family Identifier (SAFI) and the BGP Tunnel Encapsulation Attribute", RFC
475           5512, April 2009.

477           .

479 [RFC5575] Marques, P., Sheth, N., Raszuk, R., Greene, B., Mauch, J., and D.
480           McPherson, "Dissemination of Flow Specification Rules", RFC 5575, DOI
481           10.17487/RFC5575, August 2009,

483           .

485 [RFC7311] Mohapatra, P., Fernando, R., Rosen, E., and J. Uttaro,
486           "The Accumulated IGP Metric Attribute for BGP", RFC 7311,
487           DOI 10.17487/RFC7311, August 2014,

489           .

491 [RFC7911] Walton, D., Retana, A., Chen, E., and J. Scudder, "Advertisement of
492           Multiple Paths in BGP", RFC 7911, DOI 10.17487/RFC7911, July 2016,

494           .

496 [RFC8277] Rosen, E., "Using BGP to Bind MPLS Labels to Address Prefixes", RFC 8277,
497           DOI 10.17487/RFC8277, October 2017,

499           .

501 [BGP-CAP] Chandra, R. and J. Scudder, "Capabilities Advertisement

503           with BGP-4", RFC 2842, May 2000.

505 [FLEXAGLO] Hegde, S., Psenak, P., et al., "IGP Flexible Algorithm",

507           https://datatracker.ietf.org/doc/draft-ietf-lsr-flex-algo

509 [SRV6-SUBSLICE] Chan, L., "Sub-slicing for SRv6",

511           https://datatracker.ietf.org/doc/draft-chan-srv6-sub-slice/

513 15. Acknowledgments

515 The following people have contributed to this document:

517    Jeff Haas, Juniper Networks

519    Shraddha Hedge, Juniper Networks
520    Santosh Kolenchery, Juniper Networks

522    Shihari Sangli, Juniper Networks

524    Krzysztof Szarkowicz, Juniper Networks

526    Yimin Shen, Juniper Networks

528 Author Address

530    Louis Chan (editor)
531    Juniper Networks
532    2604, Cityplaza One, 1111 King's Road
533    Taikoo Shing
534    Hong Kong

536    Phone: +85225876659
537    Email: louisc@juniper.net