idnits 2.17.1 draft-dskc-bess-bgp-car-04.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** The document seems to lack a Security Considerations section. ** There are 9 instances of too long lines in the document, the longest one being 22 characters in excess of 72. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year == Line 1715 has weird spacing: '... policy v :...' == Using lowercase 'not' together with uppercase 'MUST', 'SHALL', 'SHOULD', or 'RECOMMENDED' is not an accepted usage according to RFC 2119. Please use uppercase 'NOT' together with RFC 2119 keywords (if that is what you mean). Found 'MUST not' in this paragraph: - T bit unset to indicate TLV is non-transitive. An unrecognized non-transitive TLV MUST not be propagated by a speaker that changes next hop -- The document date (April 27, 2022) is 727 days in the past. Is this intentional? 
Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) == Unused Reference: 'I-D.ietf-idr-bgp-ipv6-rt-constrain' is defined on line 1490, but no explicit reference was found in the text == Unused Reference: 'RFC4360' is defined on line 1523, but no explicit reference was found in the text == Unused Reference: 'RFC4684' is defined on line 1527, but no explicit reference was found in the text == Unused Reference: 'RFC5512' is defined on line 1539, but no explicit reference was found in the text == Unused Reference: 'RFC5701' is defined on line 1545, but no explicit reference was found in the text == Unused Reference: 'I-D.ietf-mpls-seamless-mpls' is defined on line 1585, but no explicit reference was found in the text == Unused Reference: 'RFC3906' is defined on line 1590, but no explicit reference was found in the text == Unused Reference: 'RFC4271' is defined on line 1595, but no explicit reference was found in the text == Unused Reference: 'RFC4272' is defined on line 1600, but no explicit reference was found in the text == Unused Reference: 'RFC6952' is defined on line 1608, but no explicit reference was found in the text == Unused Reference: 'RFC7911' is defined on line 1614, but no explicit reference was found in the text == Outdated reference: A later version (-26) exists of draft-ietf-lsr-flex-algo-19 ** Obsolete normative reference: RFC 5512 (Obsoleted by RFC 9012) Summary: 3 errors (**), 0 flaws (~~), 15 warnings (==), 1 comment (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 BESS WorkGroup D. Rao 3 Internet-Draft S. Agrawal 4 Intended status: Standards Track C. Filsfils 5 Expires: October 29, 2022 Cisco Systems 6 D. 
Steinberg 7 Lapishills Consulting Limited 8 L. Jalil 9 Verizon 10 Y. Su 11 Alibaba, Inc 12 B. Decraene 13 Orange 14 J. Guichard 15 Futurewei 16 K. Talaulikar 17 K. Patel 18 Arrcus, Inc 19 H. Wang 20 Huawei Technologies 21 April 27, 2022 23 BGP Color-Aware Routing (CAR) 24 draft-dskc-bess-bgp-car-04 26 Abstract 28 This document describes a BGP based routing solution to establish 29 end-to-end intent-aware paths across a multi-domain service provider 30 transport network. This solution is called BGP Color-Aware Routing 31 (BGP CAR). 33 Status of This Memo 35 This Internet-Draft is submitted in full conformance with the 36 provisions of BCP 78 and BCP 79. 38 Internet-Drafts are working documents of the Internet Engineering 39 Task Force (IETF). Note that other groups may also distribute 40 working documents as Internet-Drafts. The list of current Internet- 41 Drafts is at https://datatracker.ietf.org/drafts/current/. 43 Internet-Drafts are draft documents valid for a maximum of six months 44 and may be updated, replaced, or obsoleted by other documents at any 45 time. It is inappropriate to use Internet-Drafts as reference 46 material or to cite them other than as "work in progress." 48 This Internet-Draft will expire on October 29, 2022. 50 Copyright Notice 52 Copyright (c) 2022 IETF Trust and the persons identified as the 53 document authors. All rights reserved. 55 This document is subject to BCP 78 and the IETF Trust's Legal 56 Provisions Relating to IETF Documents 57 (https://trustee.ietf.org/license-info) in effect on the date of 58 publication of this document. Please review these documents 59 carefully, as they describe your rights and restrictions with respect 60 to this document. Code Components extracted from this document must 61 include Simplified BSD License text as described in Section 4.e of 62 the Trust Legal Provisions and are provided without warranty as 63 described in the Simplified BSD License. 65 Table of Contents 67 1. Introduction . . . . . . 
. . . . . . . . . . . . . . . . . . 3 68 1.1. Terminology . . . . . . . . . . . . . . . . . . . . . . . 3 69 1.2. Illustration . . . . . . . . . . . . . . . . . . . . . . 5 70 1.3. Requirements Language . . . . . . . . . . . . . . . . . . 7 71 2. BGP CAR SAFI . . . . . . . . . . . . . . . . . . . . . . . . 7 72 2.1. Data Model . . . . . . . . . . . . . . . . . . . . . . . 7 73 2.2. Extensible encoding . . . . . . . . . . . . . . . . . . . 7 74 2.3. BGP CAR Route Origination . . . . . . . . . . . . . . . . 8 75 2.4. BGP CAR Route Validation . . . . . . . . . . . . . . . . 8 76 2.5. BGP CAR Route Resolution . . . . . . . . . . . . . . . . 8 77 2.6. AIGP Metric Computation . . . . . . . . . . . . . . . . . 9 78 2.7. Path Availability . . . . . . . . . . . . . . . . . . . . 9 79 2.8. BGP CAR signaling through different color domains . . . . 10 80 2.9. Format and Encoding . . . . . . . . . . . . . . . . . . . 11 81 2.9.1. BGP CAR SAFI NLRI Format . . . . . . . . . . . . . . 11 82 2.9.2. Color-Aware Routes NLRI Type . . . . . . . . . . . . 12 83 2.9.3. Local-Color-Mapping (LCM) Extended Community . . . . 16 84 2.10. Error Handling . . . . . . . . . . . . . . . . . . . . . 17 85 3. Service route Automated Steering on Color-Aware path . . . . 18 86 4. Intents . . . . . . . . . . . . . . . . . . . . . . . . . . . 19 87 5. (E, C) Subscription and Filtering . . . . . . . . . . . . . . 19 88 5.1. Illustration . . . . . . . . . . . . . . . . . . . . . . 19 89 5.2. Definition . . . . . . . . . . . . . . . . . . . . . . . 20 90 6. Scaling . . . . . . . . . . . . . . . . . . . . . . . . . . . 20 91 6.1. Ultra-Scale Reference Topology . . . . . . . . . . . . . 21 92 6.2. Deployment model . . . . . . . . . . . . . . . . . . . . 22 93 6.2.1. Flat . . . . . . . . . . . . . . . . . . . . . . . . 22 94 6.2.2. Hierarchical Design with next-hop-self at ingress 95 domain BR . . . . . . . . . . . . . . . . . . . . . . 23 96 6.2.3. 
Hierarchical Design with Next Hop Unchanged at 97 ingress domain BR . . . . . . . . . . . . . . . . . . 25 99 6.3. Scale Analysis . . . . . . . . . . . . . . . . . . . . . 26 100 6.4. Scaling Benefits of the (E, C) BGP Subscription and 101 Filtering . . . . . . . . . . . . . . . . . . . . . . . . 28 102 6.5. Anycast SID . . . . . . . . . . . . . . . . . . . . . . . 28 103 6.5.1. Anycast SID for transit inter-domain nodes . . . . . 28 104 6.5.2. Anycast SID for transport color endpoints (e.g., PEs) 29 105 7. Routing Convergence . . . . . . . . . . . . . . . . . . . . . 29 106 8. VPN CAR . . . . . . . . . . . . . . . . . . . . . . . . . . . 29 107 9. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 31 108 9.1. BGP CAR NLRI Types Registry . . . . . . . . . . . . . . . 31 109 9.2. BGP CAR NLRI TLV Registry . . . . . . . . . . . . . . . . 31 110 9.3. Guidance for Designated Experts . . . . . . . . . . . . . 32 111 9.4. BGP Extended Community Registry . . . . . . . . . . . . . 32 112 10. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 32 113 11. References . . . . . . . . . . . . . . . . . . . . . . . . . 32 114 11.1. Normative References . . . . . . . . . . . . . . . . . . 32 115 11.2. Informative References . . . . . . . . . . . . . . . . . 34 116 Appendix A. Illustrations of Service Steering . . . . . . . . . 35 117 A.1. E2E BGP transport CAR intent realized using IGP FA . . . 35 118 A.2. E2E BGP transport CAR intent realized using SR Policy . . 37 119 A.3. BGP transport CAR intent realized in a section of the 120 network . . . . . . . . . . . . . . . . . . . . . . . . . 39 121 A.4. Transit network domains that do not support CAR . . . . . 41 122 Appendix B. Color Mapping Illustrations . . . . . . . . . . . . 42 123 B.1. Single color domain containing network domains with N:N 124 color distribution . . . . . . . . . . . . . . . . . . . 42 125 B.2. Single color domain containing network domains with N:M 126 color distribution . . . . . . . . . 
. . . . . . . . . . 43 127 B.3. Multiple color domains . . . . . . . . . . . . . . . . . 43 128 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 44 130 1. Introduction 132 This document specifies a new BGP SAFI called BGP Color-Aware Routing 133 (BGP CAR). BGP CAR fulfills the transport and VPN problem statement 134 and requirements described in [dskc-bess-bgp-car-problem-statement]. 136 1.1. Terminology 138 +---------------+---------------------------------------------------+ 139 | Intent | Any combination of the following behaviors: a/ | 140 | | Topology path selection (e.g. minimize metric, | 141 | | avoid resource), b/ NFV service insertion (e.g. | 142 | | service chain steering), c/ per-hop behavior | 143 | | (e.g. 5G slice). | 144 | | | 145 | Color | A 32-bit numerical value associated with an | 146 | | intent: e.g. low-cost vs low-delay vs avoiding | 147 | | some resources. | 148 | | | 149 | Colored | An egress PE E2 colors its BGP VPN route V/v to | 150 | Service Route | indicate the intent that it requests for the | 151 | | traffic bound to V/v. The color is encoded as a | 152 | | BGP Color Extended community | 153 | | [I-D.ietf-idr-tunnel-encaps]. | 154 | | | 155 | Color-Aware | A routed path to E2 which satisfies the intent | 156 | Path to (E2, | associated with color C. Several technologies | 157 | C) | may provide a Color-Aware Path to (E2, C): SR | 158 | | Policy [I-D.ietf-spring-segment-routing-policy], | 159 | | IGP Flex-Algo [I-D.ietf-lsr-flex-algo], BGP CAR | 160 | | [specified in this document]. | 161 | | | 162 | Color-Aware | A distributed or signaled route that builds a | 163 | Route (E2, C) | color-aware path to E2 for color C. | 164 | | | 165 | Service Route | E1 automatically steers a C-colored service route | 166 | Automated | V/v from E2 onto an (E2, C) path. If several such | 167 | Steering on | paths exist, a preference scheme is used to | 168 | Color-aware | select the best path: E.g. 
IGP Flex-Algo first | 169 | path | then BGP CAR then SR Policy. | 170 | | | 171 | Color Domain | A set of nodes which share the same Color-to- | 172 | | Intent mapping. This set can be organized in one | 173 | | or several IGP instances or BGP domains. | 174 | | | 175 | Resolution of | An inter-domain BGP CAR route (E, C) from N is | 176 | a BGP CAR | resolved on an intra-domain color-aware path (N, | 177 | route (E, C) | C) where N is the next-hop of the BGP CAR route. | 178 | | | 179 | Resolution vs | In this document and consistently with the | 180 | Steering | terminology of the SR Policy document | 181 | | [I-D.ietf-spring-segment-routing-policy], | 182 | | steering is used to describe the mapping of a | 183 | | service route onto a BGP CAR path while the term | 184 | | resolution is preserved for the mapping of an | 185 | | inter-domain BGP CAR route on an intra-domain | 186 | | color-aware path. | 187 | | | 188 | | Service Steering: Service route -> BGP CAR path | 189 | | (or other Color-Aware Routed Paths: e.g., SR | 190 | | Policy) | 191 | | | 192 | | Intra-Domain Resolution: BGP CAR route -> intra- | 193 | | domain color aware path (e.g. SR Policy, IGP | 194 | | Flex-Algo, BGP CAR) | 195 +---------------+---------------------------------------------------+ 197 1.2. Illustration 199 Here is a brief illustration of the salient properties of the BGP CAR 200 solution. 
202 +-------------+ +-------------+ +-------------+ 203 | | | | | | V/v with C1 204 |----+ |------| |------| +----|/ 205 | E1 | | | | | | E2 |\ 206 |----+ | | | | +----| W/w with C2 207 | |------| |------| | 208 | Domain 1 | | Domain 2 | | Domain 3 | 209 +-------------+ +-------------+ +-------------+ 211 Figure 1 213 All the nodes are part of an interdomain network under a single 214 authority and with a consistent color-to-intent mapping: 216 o C1 is mapped to "low-delay" 218 * Flex-Algo FA1 is mapped to "low delay" and hence to C1 220 o C2 is mapped to "low-delay and avoid resource R" 222 * Flex-Algo FA2 is mapped to "low delay and avoid resource R" and 223 hence C2 225 E1 receives two service routes from E2: 227 o V/v with BGP Extended-Color community C1 229 o W/w with BGP Extended-Color community C2 231 E1 has the following color-aware paths: 233 o (E2, C1) provided by BGP CAR with the following per-domain 234 support: 236 * Domain1: over IGP FA1 238 * Domain2: over SR Policy bound to color C1 240 * Domain3: over IGP FA1 242 o (E2, C2) provided by SR Policy 244 E1 automatically steers the received service routes as follows: 246 o V/v via (E2, C1) provided by BGP CAR 248 o W/w via (E2, C2) provided by SR Policy 250 Illustrated Properties: 252 o Leverage of the BGP Color Extended-Community 254 * The service routes are colored with widely-used BGP Extended- 255 Color Community 257 o (E, C) Automated Steering 259 * V/v and W/w are automatically steered on the appropriate color- 260 aware path 262 o Seamless co-existence of BGP CAR and SR Policy 264 * V/v is steered on BGP CAR color-aware path 266 * W/w is steered on SR Policy color-aware path 268 o Seamless interworking of BGP CAR and SR Policy 270 * V/v is steered on a BGP CAR color-aware path that is itself 271 resolved within domain 2 onto an SR Policy bound to the color 272 of V/v 274 Other properties: 276 o MPLS dataplane: with 300k PE's and 5 colors, the BGP CAR solution 277 ensures that no single node needs 
to support a dataplane scaling 278 in the order of Remote PE * C. This would otherwise blow the MPLS 279 dataplane. 281 o Control-Plane: a node should not install a (E, C) path if it does 282 not need it 284 o Incongruent Color-Intent mapping: the solution supports the 285 signaling of a BGP CAR route across different color domains 287 The keys to this simplicity are: 289 o the leverage of the BGP Color Extended-Community to color service 290 routes 292 o the definition of the automated steering: a C-colored service 293 route V/v from E2 is steered onto a color-aware path (E2, C) 295 o the definition of the data model of a BGP CAR path: (E, C) 297 * consistent with SR Policy data model 299 o the definition of the recursive resolution of a BGP CAR route: a 300 BGP CAR (E2, C) via N is resolved onto the color-aware path (N, C) 301 which may itself be provided by BGP CAR or via another color-aware 302 routing solution: SR Policy, IGP Flex-Algo. 304 1.3. Requirements Language 306 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 307 "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and 308 "OPTIONAL" in this document are to be interpreted as described in BCP 309 14 [RFC2119] [RFC8174] when, and only when, they appear in all 310 capitals, as shown here. 312 2. BGP CAR SAFI 314 2.1. Data Model 316 The BGP CAR data model is: 318 o NLRI Key: IP Prefix, Color 320 o NLRI non-key encapsulation data: MPLS label stack, Label index, 321 SRv6 SID list etc. 323 o BGP Next Hop 325 o AIGP Metric: accumulates color/intent specific metric across 326 domains 328 o Local-Color-Mapping Extended-Community (LCM-EC): Optional 32-bit 329 Color value used when a CAR route propagates between different 330 color domains 332 2.2. 
Extensible encoding 334 Extensible encoding is ensured by: 336 o NLRI Route-Type field: provides extensibility to add new NLRI 337 formats for new route-types 339 o Key length: field enables handling of unsupported route-types 340 opaquely, enabling transitivity via RRs 342 o TLV-based encoding of non-key NLRI: enables support for multiple 343 encapsulations with efficient update packing 345 o AIGP Attribute provides extensibility via TLVs, enabling 346 definition of additional metric semantics for a color as needed 347 for an intent 349 2.3. BGP CAR Route Origination 351 A BGP CAR route may be originated locally (e.g., loopback) or through 352 redistribution of an (E, C) color-aware path provided by another 353 routing solution: SR Policy, IGP Flex-Algo or BGP-LU [RFC8277]. 355 2.4. BGP CAR Route Validation 357 A BGP CAR path (E, C) from N with encapsulation T is valid if color- 358 aware path (N, C) exists and T is dataplane available. 360 A local policy may customize the validation process: 362 o the color constraint in the first check may be relaxed: instead N 363 is reachable in the default routing table 365 o the dataplane availability constraint of T may be relaxed 367 o addition of a performance-measurement verification to ensure that 368 the intent associated with C is met (e.g. delay < bound) 370 2.5. BGP CAR Route Resolution 372 A BGP color-aware route (E2, C1) from N is resolved over a color- 373 aware route (N, C1). The color-aware route (N, C1) may be provided 374 recursively by BGP CAR or by other routing solutions: SR Policy, IGP 375 Flex-Algo, BGP-LU. 377 When multiple resolutions are possible, the default preference should 378 be: IGP Flex-Algo, SR Policy, BGP CAR, BGP LU. 380 Through local policy, a BGP color-aware route (E2, C1) from N may be 381 resolved over a color-aware route (N, C2): i.e. the local policy maps 382 the resolution of C1 over C2. 
For example, in a domain where 383 resource R is known to not be present, the inter-domain intent 384 C1="low delay and avoid R" may be resolved over an intra-domain path 385 of intent C2="low delay". 387 The color-aware route (N, C1) may have a different dataplane 388 encapsulation than the one of (E2, C1): e.g. a BGP CAR route (E2, C1) 389 with SR-MPLS encapsulation may be transported over an intermediate 390 SRv6 domain. 392 2.6. AIGP Metric Computation 394 The Accumulated IGP (AIGP) Attribute is updated as the BGP CAR route 395 propagates across the network. 397 The value set (or appropriately incremented) in the AIGP TLV 398 corresponds to the metric associated with the underlying intent of 399 the color. For example, when the color is associated with a low- 400 latency path, the metric value is set based on the delay metric. 402 Information regarding the metric type used by the underlying intra- 403 domain mechanism can also be set. 405 If BGP CAR routes traverse across a discontinuity in the transport 406 path for a given intent, add a penalty in accumulated IGP metric. 407 The discontinuity is also indicated to upstream nodes via a bit in 408 the AIGP TLV. 410 AIGP metric computation is recursive. 412 To avoid continuous IGP metric churn causing end to end BGP CAR 413 churn, an implementation should provide thresholds to trigger AIGP 414 update. 416 Additional AIGP extensions may be defined to signal state for 417 specific use-cases: MSD along the BGP CAR advertisement, Minimum MTU 418 along the BGP CAR advertisement. 420 2.7. Path Availability 422 The (E, C) route inherently provides availability of redundant paths 423 at every hop. For instance, BGP CAR routes originated by two egress 424 ABRs in a domain are advertised as multiple paths to ingress ABRs in 425 the domain, where they become equal-cost or primary-backup paths. 
A 426 failure of an egress ABR is detected and handled by ingress ABRs 427 locally within the domain for faster convergence, without any 428 necessity to propagate the event to upstream nodes for traffic 429 restoration. 431 BGP ADD-PATH should be enabled for BGP CAR to signal multiple next 432 hops through a transport RR. 434 2.8. BGP CAR signaling through different color domains 436 [Color Domain 1 A]-----[B Color Domain 2 E2] 437 [C1=low-delay ] [C2=low-delay ] 439 Let us assume a BGP CAR route (E2, C2) is signaled from B to A, the 440 border routers of domain 2 and domain 1 respectively. Let us assume 441 that these two domains do not share the same color-to-intent mapping: 442 low-delay is color C2 in domain 2 and color C1 in domain 1 (C1 <> C2). 444 The BGP CAR solution seamlessly supports this (rare) scenario while 445 maintaining the separation and independence of the administrative 446 authority in different color domains. 448 The solution works as follows: 450 o Within domain 2, the BGP CAR route is (E2, C2) via E2 452 o B signals to A the BGP CAR route as (E2, C2) via B with Local- 453 Color-Mapping-Extended-Community (LCM-EC) of color C2 455 o A is aware (classic peering agreement) of the intent-to-color 456 mapping within domain 2 ("low-delay" in domain 2 is C2) 458 o A maps C2 in LCM-EC to C1 and signals within domain 1 the received 459 BGP CAR route as (E2, C2) via A with LCM-EC(C1) 461 o The nodes within the receiving domain 1 use the local color 462 encoded in the LCM-EC for next-hop resolution and BGP CAR route 463 installation 465 Salient properties: 467 o The NLRI never changes 469 o E is globally unique, which makes E-C in that order unique 471 o In the vast majority of cases, the color of the NLRI is used 472 for resolution and steering 474 o In the rare case of color incongruence, the local color encoded in 475 LCM-EC takes precedence 477 Further illustrations are provided in Appendix B. 479 2.9. 
Format and Encoding 481 BGP CAR leverages the BGP multi-protocol extensions [RFC4760] and 482 uses the MP_REACH_NLRI and MP_UNREACH_NLRI attributes for route 483 updates by using the SAFI value TBD1 along with AFI 1 for IPv4 484 prefixes and AFI 2 for IPv6 prefixes. 486 BGP speakers MUST use BGP Capabilities Advertisement to ensure 487 support for processing of BGP CAR updates. This is done as specified 488 in [RFC4760], by using capability code 1 (multi-protocol BGP), with 489 AFI 1 and 2 (as required) and SAFI TBD1. 491 The sub-sections below specify the generic encoding of the BGP CAR 492 NLRI followed by the encoding for specific NLRI types introduced in 493 this document. 495 2.9.1. BGP CAR SAFI NLRI Format 497 The generic format for the BGP CAR SAFI NLRI is shown below: 499 0 1 2 3 500 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 501 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 502 | NLRI Length | Key Length | NLRI Type | // 503 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ // 504 | Type-specific Key Fields // 505 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 506 | Type-specific Non-Key Fields (if applicable) // 507 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 509 where: 511 o NLRI Length: 1 octet field that indicates the length in octets of 512 the NLRI excluding the NLRI Length field itself. 514 o Key Length: 1 octet field that indicates the length in octets of 515 the NLRI type-specific key fields. Key length MUST be at least 2 516 less than the NLRI length. 518 o NLRI Type: 1 octet field that indicates the type of the BGP CAR 519 NLRI. 521 o Type-Specific Key Fields: Depend on the NLRI type and of length 522 indicated by the Key Length. 524 o Type-Specific Non-Key Fields: optional and variable depending on 525 the NLRI type. The NLRI encoding allows for encoding of specific 526 non-key information associated with the route (i.e. 
the key) as 527 part of the NLRI for efficient packing of BGP updates. 529 The indication of the key length enables BGP Speakers to determine 530 the key portion of the NLRI and use it along with the NLRI Type field 531 in an opaque manner for handling of unknown or unsupported NLRI 532 types. This can help Route Reflectors (RR) to propagate NLRI types 533 introduced in the future in a transparent manner. 535 The NLRI encoding allows for encoding of specific non-key information 536 associated with the route (i.e. the key) as part of the NLRI for 537 efficient packing of BGP updates. 539 The non-key portion of the NLRI MUST be omitted while carrying it 540 within the MP_UNREACH_NLRI when withdrawing the route advertisement. 542 2.9.2. Color-Aware Routes NLRI Type 544 The Color-Aware Routes NLRI Type is used for advertisement of color- 545 aware routes and has the following format: 547 0 1 2 3 548 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 549 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 550 | NLRI Length | Key Length | NLRI Type |Prefix Length | 551 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 552 | IP Prefix (variable) // 553 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 554 | Color (4 octets) | 555 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 557 Followed by optional TLVs encoded as below: 559 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 560 | Type | Length | Value (variable) // 561 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 563 where: 565 o NLRI Length: variable 567 o Key Length: variable. It indicates the total length comprised of 568 the Prefix Length field, IP Prefix field, and the Color field, as 569 described below. For IPv4 (AFI=1), the minimum length is 5 and 570 maximum length is 9. For IPv6 (AFI=2), the minimum length is 5 571 and maximum length is 21. 
573 o NLRI Type: 1 574 o Type-Specific Key Fields: as below 576 * Prefix Length: 1 octet field that carries the length of the prefix 577 in bits. Length MUST be less than or equal to 32 for IPv4 578 (AFI=1) and less than or equal to 128 for IPv6 (AFI=2). 580 * IP Prefix: IPv4 or IPv6 prefix (based on the AFI). A variable 581 size field that contains the most significant octets of the 582 prefix, i.e., 0 octets for prefix length 0, 1 octet for prefix 583 length 1 to 8, 2 octets for prefix length 9 to 16, 3 octets for 584 prefix length 17 up to 24, 4 octets for prefix length 25 up to 585 32, and so on. The size of the field MUST be less than or 586 equal to 4 for IPv4 (AFI=1) and less than or equal to 16 for 587 IPv6 (AFI=2). 589 * Color: 4 octet field that contains the color value associated with the 590 prefix. 592 o Type-Specific Non-Key Fields: specified in the form of optional 593 TLVs as below: 595 * Type: 1 octet that contains the type code and flags. It is 596 encoded as shown below: 598 0 1 2 3 4 5 6 7 599 +-+-+-+-+-+-+-+-+ 600 |R|T| Type code | 601 +-+-+-+-+-+-+-+-+ 602 where: 604 + R: Bit is reserved and MUST be set to 0 and ignored on 605 receipt. 607 + T: Transitive bit, applicable to speakers that change the 608 BGP CAR next hop 610 - T bit set to indicate TLV is transitive. An unrecognized 611 transitive TLV MUST be propagated by a speaker that 612 changes the next hop 614 - T bit unset to indicate TLV is non-transitive. An 615 unrecognized non-transitive TLV MUST NOT be propagated by 616 a speaker that changes the next hop 618 A speaker that does not change the next hop should ignore the 619 T-bit and propagate all received TLVs. 621 + Type code: Remaining 6 bits contain the type of the TLV. 623 * Length: 1 octet field that contains the length of the value 624 portion of the non-key TLV in terms of octets 626 * Value: variable length field as indicated by the length field 627 and to be interpreted as per the type field. 
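As a non-normative illustration, the key layout and the Type octet described above can be sketched in Python. The names encode_car_nlri, decode_car_nlri and split_tlv_type are hypothetical and are not defined by this document. The sketch assumes that the R bit is the most significant bit of the Type octet, following the bit numbering of the diagram, and it passes the non-key TLVs through as opaque bytes.

```python
import ipaddress
import struct

NLRI_TYPE_COLOR_AWARE = 1  # "NLRI Type: 1" in the field list above


def encode_car_nlri(prefix: str, color: int, tlvs: bytes = b"") -> bytes:
    """Encode one Color-Aware Routes NLRI (hypothetical helper)."""
    net = ipaddress.ip_network(prefix)
    prefix_octets = (net.prefixlen + 7) // 8  # most significant octets only
    key = (bytes([net.prefixlen])
           + net.network_address.packed[:prefix_octets]
           + struct.pack("!I", color))
    # NLRI Length covers everything after the NLRI Length octet itself.
    body = bytes([len(key), NLRI_TYPE_COLOR_AWARE]) + key + tlvs
    if len(body) > 255:
        raise ValueError("NLRI does not fit the 1 octet NLRI Length field")
    return bytes([len(body)]) + body


def decode_car_nlri(data: bytes, ipv6: bool = False):
    """Decode one NLRI and return (prefix, color, raw_tlv_bytes)."""
    max_octets = 16 if ipv6 else 4
    if len(data) < 3:
        raise ValueError("truncated NLRI header")
    nlri_len, key_len, nlri_type = data[0], data[1], data[2]
    # Key Length must be at least 2 less than NLRI Length.
    if len(data) < 1 + nlri_len or key_len + 2 > nlri_len:
        raise ValueError("NLRI length conflicts with data or key length")
    if nlri_type != NLRI_TYPE_COLOR_AWARE:
        raise ValueError("not a Color-Aware Routes NLRI")
    # Key = Prefix Length (1) + IP Prefix (0..4 or 0..16) + Color (4).
    if not 5 <= key_len <= 5 + max_octets:
        raise ValueError("key length out of range for this AFI")
    prefix_len = data[3]
    prefix_octets = (prefix_len + 7) // 8
    if prefix_len > 8 * max_octets or key_len != 1 + prefix_octets + 4:
        raise ValueError("key length conflicts with prefix length")
    raw = data[4:4 + prefix_octets]
    (color,) = struct.unpack("!I", data[4 + prefix_octets:8 + prefix_octets])
    addr_cls = ipaddress.IPv6Address if ipv6 else ipaddress.IPv4Address
    padded = raw.ljust(max_octets, b"\x00")
    net = ipaddress.ip_network(f"{addr_cls(padded)}/{prefix_len}", strict=False)
    return str(net), color, data[3 + key_len:1 + nlri_len]


def split_tlv_type(octet: int) -> dict:
    """Split the non-key TLV Type octet into R bit, T bit and 6-bit type code."""
    return {"reserved": octet >> 7,
            "transitive": bool(octet & 0x40),
            "code": octet & 0x3F}
```

For example, encode_car_nlri("10.0.0.1/32", 100) produces a 12 octet NLRI whose Key Length is 9, the IPv4 maximum stated above, and decode_car_nlri returns the same prefix and color. A real implementation would take the AFI from the enclosing MP_REACH_NLRI rather than from an ipv6 flag.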
629 The prefix is routable across the administrative domain where BGP 630 transport CAR is deployed. It is possible that the same prefix is 631 originated by multiple BGP CAR speakers in the case of anycast 632 addressing or multi-homing. 634 The Color is introduced to enable multiple route advertisements for 635 the same prefix. The color is associated with an intent (e.g. low- 636 latency) in originator color-domain. 638 The following sub-sections specify the non-key TLVs associated with 639 the Color-Aware Routes NLRI type. 641 2.9.2.1. Label TLV 643 The Label TLV is used for advertisement of color-aware routes along 644 with their MPLS labels and has the following format: 646 0 1 2 3 647 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 648 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 649 | Type | Length | 650 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 652 Followed by one (or more) Labels encoded as below: 654 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 655 | Label |Rsrv |S| 656 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 658 where: 660 o Type : Type code is 1. T bit MUST be unset 662 o Length: variable, MUST be a multiple of 3 664 o Label Information: multiples of 3 octet fields to convey the MPLS 665 label(s) associated with the advertised color-aware route. It is 666 used for encoding a single label or a stack of labels as per 667 procedures specified in [RFC8277]. 669 When a BGP transport CAR speaker is propagating the route further 670 after setting itself as the nexthop, it allocates a local label for 671 the specific prefix and color combination which it updates in this 672 TLV. It also MUST program a label cross-connect that would result in 673 the label swap operation for the incoming label that it advertises 674 with the label received from its best-path router(s). 676 2.9.2.2. 
Label Index TLV 678 The Label Index TLV is used for advertisement of Segment Routing MPLS 679 (SR-MPLS) Segment Identifier (SID) [RFC8402] information associated 680 with the labeled color-aware routes and has the following format: 682 0 1 2 3 683 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 684 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 685 | Type | Length | Reserved | Flags ~ 686 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 687 ~ | Label Index ~ 688 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 689 ~ | 690 +-+-+-+-+-+-+-+-+ 692 where: 694 o Type : Type code is 2. T bit MUST be set 696 o Length: 7 698 o Reserved: 1 octet field that MUST be set to 0 and ignored on 699 receipt. 701 o Flags: 2 octet field that maps to the Flags field of the Label- 702 Index TLV of the BGP Prefix SID Attribute [RFC8669]. 704 o Label Index: 4 octet field that maps to the Label Index field of 705 the Label-Index TLV of the BGP Prefix SID Attribute [RFC8669]. 707 This TLV provides the same functionality as the Label-Index TLV of 708 [RFC8669] for Transport CAR in SR-MPLS deployments. For better BGP 709 packing efficiency, the BGP Prefix SID Attribute SHOULD be omitted 710 from the labeled color-aware routes when the attribute would be used 711 only to convey the Label Index TLV. 713 When a BGP Transport CAR speaker is propagating the route further 714 after setting itself as the nexthop, it allocates a local label for 715 the specific prefix and color combination. When the received update 716 has the Label Index TLV, it SHOULD use that hint to allocate the 717 local label from the SR Global Block (SRGB) using procedures as 718 specified in [RFC8669]. 720 2.9.2.3. SRv6 SID TLV 722 BGP Transport CAR can also be used to set up end-to-end color-aware 723 connectivity using Segment Routing over IPv6 (SRv6) [RFC8402]. 724 [I-D.ietf-spring-srv6-network-programming] specifies the SRv6 725 Endpoint behaviors (e.g. 
End PSP) which MAY be leveraged for BGP CAR 726 with SRv6. The SRv6 SID TLV is used for advertisement of color-aware 727 routes along with their SRv6 SIDs and has the following format: 729 0 1 2 3 730 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 731 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 732 | Type | Length | SRv6 SID Info (variable) // 733 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 735 where: 737 o Type : Type code is 3. T bit MUST be unset 739 o Length: variable, MUST be either less than or equal to 16, or be a 740 multiple of 16 742 o SRv6 SID Information: field of the size indicated by the length 743 that carries the SRv6 SID(s) for the advertised color-aware 744 route as one of the following: 746 * A single 128-bit SRv6 SID or a stack of 128-bit SRv6 SIDs 748 * A transposed portion (refer to [I-D.ietf-bess-srv6-services]) of 749 the SRv6 SID that MUST be of size in multiples of one octet and 750 less than 16. 752 The BGP color-aware route update for SRv6 MUST include the BGP 753 Prefix-SID attribute along with the TLV carrying the SRv6 SID 754 information as specified in [I-D.ietf-bess-srv6-services] when using 755 the transposition scheme of encoding for packing efficiency of BGP 756 updates. 758 2.9.3. Local-Color-Mapping (LCM) Extended Community 760 This document defines a new BGP Extended Community called "LCM". The 761 LCM is a Transitive Opaque Extended Community with the following 762 encoding: 764 0 1 2 3 765 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 766 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 767 | Type=0x3 | Sub-Type=TBD2 | Reserved | 768 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 769 | Color | 770 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 772 where: 774 o Type: 0x3 776 o Sub-Type: TBD2. 

   o  Reserved: 2-octet reserved field that MUST be set to zero on
      transmission and ignored on reception.

   o  Color: 4-octet field that carries the 32-bit color value.

   When a CAR route crosses the boundary of the originator's color
   domain, the LCM EC is added.  The LCM EC conveys the local color
   mapping for the intent (e.g., low latency) into transit or remote
   color domains.

   When present, the LCM EC MAY be used for filtering of BGP CAR routes
   and/or for applying routing policies for the intent.

2.10.  Error Handling

   The fault management actions described in [RFC7606] are applicable to
   the handling of BGP update messages for BGP-CAR.

   When the error allows the router to skip the malformed NLRI(s) and
   continue processing the rest of the update message, the router MUST
   handle such malformed NLRIs as 'Treat-as-withdraw'.  In other cases,
   where the error in the NLRI encoding results in the inability to
   process the BGP update message, the router SHOULD handle such
   malformed NLRIs as 'AFI/SAFI disable' when other AFI/SAFIs besides
   BGP-CAR are being advertised over the same session.  Alternatively,
   the router MUST perform 'session reset' when the session is only
   being used for BGP-CAR.

   The following errors result in 'AFI/SAFI disable' or 'session reset':

   o  Minimum NLRI length check error.

   o  NLRI length conflict with key length.

   o  Key length encoding errors (such as minimum, maximum and conflict
      with prefix length).

   There can be cases where the NLRI length value is in conflict with
   the enclosed non-key TLVs, which themselves carry length values.
   Either the length of a TLV would cause the NLRI length to be exceeded
   when parsing the TLV, or fewer than 2 bytes remain when beginning to
   parse the TLV.
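
   The two length conflicts described above can be illustrated with a
   short Python sketch.  It is a minimal, non-normative illustration
   that assumes the non-key part of one NLRI has already been isolated
   using the NLRI Length and Key Length, and that every non-key TLV
   starts with a 1-octet Type and a 1-octet Length.  The names
   split_non_key_tlvs and NonKeyTlvLengthError are hypothetical and are
   not defined by this document.

```python
from typing import List, Tuple


class NonKeyTlvLengthError(ValueError):
    """Raised when a non-key TLV does not fit inside the NLRI."""


def split_non_key_tlvs(region: bytes) -> List[Tuple[int, bytes]]:
    """Split the non-key part of one NLRI into (type, value) pairs.

    Assumption: `region` holds exactly the bytes that the NLRI Length
    leaves after the key, so len(region) is the budget for all TLVs.
    """
    tlvs: List[Tuple[int, bytes]] = []
    offset = 0
    while offset < len(region):
        remaining = len(region) - offset
        # Conflict 1: not enough bytes left for Type + Length.
        if remaining < 2:
            raise NonKeyTlvLengthError(
                "fewer than 2 bytes remain when beginning to parse a TLV")
        tlv_type = region[offset]
        tlv_len = region[offset + 1]
        # Conflict 2: the TLV would run past the end of the NLRI.
        if tlv_len > remaining - 2:
            raise NonKeyTlvLengthError(
                "TLV length would cause the NLRI length to be exceeded")
        value = region[offset + 2:offset + 2 + tlv_len]
        tlvs.append((tlv_type, value))
        offset += 2 + tlv_len
    return tlvs
```

   The sketch only detects the condition.  Because the caller already
   knows where the NLRI ends, it can still locate the next NLRI after
   such an error is raised.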

   In either of these cases, an error condition exists and the
   "treat-as-withdraw" approach MUST be used (unless some other, more
   severe error is encountered dictating a stronger approach), and the
   NLRI Length MUST be relied upon to enable the beginning of the next
   NLRI field to be located.  The above recommendations follow the
   principle defined in Section 4 of [RFC7606].

   Type-Specific Non-Key TLV handling:

   o  If multiple instances of the same type are encountered, all but
      the first instance MUST be ignored.

   o  Type-specific length constraints should be verified.  The TLV is
      discarded if there is an error.

   o  A TLV is not considered malformed because of failing any semantic
      validation of its Value field.

   o  A speaker that modifies the BGP next hop MUST recognize at least
      one of the forwarding information TLVs (such as Label and SRv6
      SID).  If it is not able to, the NLRI is considered invalid and
      not eligible for best path selection.

3.  Service route Automated Steering on Color-Aware path

   E1 automatically steers a C-colored service route V/v from E2 onto an
   (E2, C) color-aware path.  If several such paths exist, a preference
   scheme is used to select the best path, e.g., IGP Flex-Algo first,
   then BGP CAR, then SR Policy.

   This is consistent with the automated service route steering on SR
   Policy (a routing solution providing a color-aware path) defined in
   [I-D.ietf-spring-segment-routing-policy].  All the steering
   variations defined in [I-D.ietf-spring-segment-routing-policy] are
   applicable to the BGP CAR color-aware path: on-demand steering,
   per-destination, per-flow, CO-only.  For brevity, in this revision,
   we refer the reader to the [I-D.ietf-spring-segment-routing-policy]
   text.

   Salient property: Seamless integration of BGP CAR and SR Policy.

   Appendix A provides illustrations of service route automated
   steering.

4.  Intents

   The widely deployed color-aware path SR Policy solution demonstrates
   that the following intents can easily be associated with a color:

   1.  Minimization of a cost metric vs a latency metric

       *  Minimization of different metric types, static and dynamic

   2.  Exclusion/Inclusion of SRLG and/or Link Affinity and/or minimum
       MTU/number of hops

   3.  Bandwidth management

   4.  In the inter-domain context, exclusion/inclusion of entire
       domains, and border routers

   5.  Inclusion of one or several virtual network function chains

       *  Located in a regional domain and/or core domain, in a DC

   6.  Localization of the virtual network function chains

       *  Some functions may be desired in the regional DC or vice versa

   7.  Per-Destination and Per-Flow steering

   It is straightforward to note that the BGP CAR color-aware
   alternative supports intents 1, 2, 4 and 7.

   Future revisions of this document will analyze BGP CAR support for
   intents 3, 5 and 6.

5.  (E, C) Subscription and Filtering

   This section defines an (E, C) BGP subscription model that allows
   filtering of the (E, C) routes learned by a BGP CAR node.

5.1.  Illustration

   E1-----------------A-------------------B-------------------E2
                                           <--- (E2, C1) ----
     -- F (E2, C1) -->  --- F (E2, C1) -->
                      |                   |
     <-- (E2, C1) ----  <--- (E2, C1) ----

   o  BGP CAR route (E2, C1) advertised by E2 is not unconditionally
      distributed beyond a certain point (e.g., B)

   o  E1 subscribes to (E2, C1) by advertising a filter route F (E2, C1)
      to its upstream peer A

   o  If A has (E2, C1) in its BGP RIB, it will advertise (E2, C1) to E1

   o  If A does not have (E2, C1), it will advertise F (E2, C1) to its
      peer B

   o  B will advertise (E2, C1) to A, which will distribute it to E1

   E1 may trigger a subscription for BGP CAR route (E2, C1) as a result
   of receiving a C1-colored service route V/v from E2, for on-demand
   steering via (E2, C1).

5.2.  Definition

   To be provided in a future version of this document.

6.  Scaling

   This section analyses the key scale requirement of
   [ref:dskc-bess-bgp-car-problem-statement], specifically:

   o  No intermediate node dataplane should need to scale to (Colors *
      PEs)

   o  No node should learn and install a BGP CAR route to (E,C) if it
      does not install a Colored service route to E

   Figure 2 provides an ultra-scale reference topology.  Section 6.2
   presents three design models to deploy BGP CAR in the reference
   topology.  Section 6.3 analyses the scaling properties of each model.
   Section 6.4 illustrates the scaling benefits of the (E, C) BGP
   subscription and filtering.

6.1.  Ultra-Scale Reference Topology

                                       RD:V/v via E2
         +-----+              +-----+ vpn label:30030 +-----+
 ....... |S-RR1| <........... |S-RR2| <...............|S-RR3| <......
 :       +-----+              +-----+     Color C1    +-----+       :
 :                                                                  :
 :                                                                  :
 :                                                                  :
+:------------+--------------+--------------+--------------+--------:-+
|:            |              |              |              |        : |
|:            |              |              |              |        : |
|:          +---+          +---+          +---+          +---+      : |
|:          |121|          |231|          |341|          |451|      : |
|:          +---+          +---+          +---+          +---+      : |
|---+         |              |              |              |      +---|
| E1|         |              |              |              |      | E2|
|---+         |              |              |              |      +---|
|           +---+          +---+          +---+          +---+        |
|           |122|          |232|          |342|          |452|        |
|           +---+          +---+          +---+          +---+        |
|   Access    |    Metro     |    Core      |    Metro     |  Access  |
|  domain 1   |   domain 2   |   domain 3   |   domain 4   | domain 5 |
+-------------+--------------+--------------+--------------+----------+
 iPE        iBRM           iBRC           eBRC           eBRM      ePE

                Figure 2: Ultra-Scale Reference Topology

   The following applies to the reference topology above:

   o  Independent ISIS/OSPF SR instance in each domain.

   o  Each domain has Flex Algo 128.  Prefix SID for a node is SRGB
      168000 plus node number.

   o  A BGP CAR route (E2, C1) is advertised by egress BRM node 451.
      The route is sourced locally from redistribution from IGP-FA 128.

   o  Not shown for simplicity, node 452 will also advertise (E2, C1).
987 o When a transport RR is used within the domain or across domains, 988 ADD-PATH is enabled to advertise paths from both egress BRs to 989 it's clients. 991 o Egress PE E2 advertises a VPN route RD:V/v with BGP Color extended 992 community C1 that propagates via service RRs to ingress PE E1. 994 o E1 steers V/v prefix via color-aware path (E2,C1) and VPN label 995 30030 997 6.2. Deployment model 999 6.2.1. Flat 1001 RD:V/v via E2 1002 +-----+ +-----+ vpn label:30030 +-----+ 1003 ....... |S-RR1| <........... |S-RR2| <...............|S-RR3| <...... 1004 : +-----+ +-----+ Color C1 +-----+ : 1005 : : 1006 : : 1007 : : 1008 +:------------+--------------+--------------+--------------+--------:-+ 1009 |: | | | | : | 1010 |: | (E2,C1) | (E2,C1) | (E2,C1) | : | 1011 |: +---+ via 231 +---+ via 341 +---+ via 451 +---+ : | 1012 |:(E2,C1) |121|<---------|231|<---------|341|<---------|451| : | 1013 |: via 121 /+---+ L=168002 +---+ L=168002 +---+ L=168002 +---+ : | 1014 |---+ / | | | | +---| 1015 | E1| <--/ | | | | | E2| 1016 |---+ L=168002| | | | +---| 1017 | +---+ +---+ +---+ +---+ | 1018 | |122| |232| |342| |452| | 1019 | +---+ +---+ +---+ +---+ | 1020 | Access | Metro | Core | Metro | Access | 1021 | domain 1 | domain 2 | domain 3 | domain 4 | domain 5 | 1022 +-------------+--------------+--------------+--------------+----------+ 1023 iPE iBRM iBRC eBRC eBRM ePE 1025 168121 168231 168341 168451 1026 168002 168002 168002 168002 168002 1027 30030 30030 30030 30030 30030 30030 1029 Figure 3 1031 1. Node 451 advertises BGP CAR route (E2, C1) to 341, from which it 1032 goes to 231 then to 121 and finally to E1 1034 2. Each BGP hop allocates local label and programs swap entry in 1035 forwarding for (E2, C1) 1037 3. E1 receives BGP CAR route (E2, C1) via 121 with label 168002 1039 1. Let's assume E1 selects that path 1041 4. E1 resolves BGP CAR route (E2, C1) via 121 on color-aware path 1042 (121, C1) 1044 1. 
Color-aware path (121, C1) is FA128 path to 121 (label 1045 168121) 1047 5. E1's imposition color-aware label-stack for V/v is thus 1049 1. 30030 <=> V/v 1051 2. 168002 <=> (E2, C1) 1053 3. 168121 <=> (121, C1) 1055 6. Each BGP hop performs swap operation on 168002 bound to color- 1056 aware path (E2,C1) 1058 6.2.2. Hierarchical Design with next-hop-self at ingress domain BR 1060 (E2,C1) 1061 +-----+ via 451 +-----+ 1062 |T-RR1| <-------------- |T-RR2| 1063 / +-----+ L=168002 +-----+\ 1064 / \ 1065 +-------------+---/----------+--------------+-----------\--+----------+ 1066 | | / | | \ | | 1067 | (E2,C1) | / (451,C1) | (451,C1) | \| | 1068 | via 121 +---+ via 231 +---+ via 341 +---+ +---+ | 1069 | L=168002 |121| <======= |231| <========|341| <======= |451| | 1070 | / +---+ L=168451 +---+ L=168451 +---+ +---+ | 1071 |---+ / | | | | +---| 1072 | E1|<--/ | | | | | E2| 1073 |---+ | | | | +---| 1074 | +---+ +---+ +---+ +---+ | 1075 | |122| |232| |342| |452| | 1076 | +---+ +---+ +---+ +---+ | 1077 | Access | Metro | Core | Metro | Access | 1078 | domain 1 | domain 2 | domain 3 | domain 4 | domain 5 | 1079 +-------------+--------------+--------------+--------------+----------+ 1080 iPE iBRM iBRC eBRC eBRM ePE 1082 168231 168341 1083 168121 168451 168451 168451 1084 168002 168002 168002 168002 168002 1085 30030 30030 30030 30030 30030 30030 1087 Figure 4: Heirarchical BGP transport CAR, NHS at iBR 1089 1. Node 451 advertises BGP CAR route (451, C1) to 341, from which 1090 it goes to 231 and finally to 121 1092 2. Each BGP hop allocates local label and programs swap entry in 1093 forwarding for (451, C1) 1095 3. 121 resolves received BGP CAR route (451, C1) via 231 (label 1096 168451) on color-aware path (231, C1) 1098 1. Color-aware path (231, C1) is FA128 path to 231 (label 1099 168231) 1101 4. 451 advertises BGP CAR route (E2, C1) via 451 to Transport RR 1102 T-RR2, which reflects it to T-RR1, which reflects it to 121 1104 5. 
121 receives BGP CAR route (E2, C1) via 451 with label 168002

        1.  Let's assume 121 selects that path

   6.   121 resolves BGP CAR route (E2, C1) via 451 on color-aware path
        (451, C1)

        1.  Color-aware path (451, C1) is BGP CAR path to 451 (label
            168451)

   7.   121's imposition color-aware label stack for (E2, C1) is thus:

        1.  168002 <=> (E2, C1)

        2.  168451 <=> (451, C1)

        3.  168231 <=> (231, C1)

   8.   121 advertises (E2, C1) to E1 with next hop self (121) and label
        168002

   9.   E1 constructs the same imposition color-aware label stack for
        V/v via (E2, C1) as in the flat model:

        1.  30030 <=> V/v

        2.  168002 <=> (E2, C1)

        3.  168121 <=> (121, C1)

   10.  121 performs swap operation on 168002 with the hierarchical
        color-aware label stack for (E2, C1) via 451 from step 7

   11.  Nodes 231 and 341 perform swap operation on 168451 bound to
        color-aware path (451, C1)

   12.  451 performs swap operation on 168002 bound to color-aware path
        (E2, C1)

   Note: E1 does not need the BGP CAR (451, C1) route

6.2.3.
Hierarchical Design with Next Hop Unchanged at ingress domain BR 1147 (E2,C1) 1148 +-----+ via 451 +-----+ 1149 |T-RR1| <-------------- |T-RR2| 1150 / +-----+ L=168002 +-----+\ 1151 / \ 1152 +-------------+---/----------+--------------+-----------\--+----------+ 1153 | | / | | \ | | 1154 | (E2,C1) | / (451,C1) | (451,C1) | \| | 1155 | via 451 +---+ via 231 +---+ via 341 +---+ +---+ | 1156 | L=168002/|121| <======= |231| <========|341| <======= |451| | 1157 | / +---+ L=168451 +---+ L=168451 +---+ +---+ | 1158 |---+ <--/ //| | | | +---| 1159 | E1| // | | | | | E2| 1160 |---+ <===// | | | | +---| 1161 | (451,C1) +---+ +---+ +---+ +---+ | 1162 | via 121 |122| |232| |342| |452| | 1163 | L=168451 +---+ +---+ +---+ +---+ | 1164 | | | | | | 1165 | Access | Metro | Core | Metro | Access | 1166 | domain 1 | domain 2 | domain 3 | domain 4 | domain 5 | 1167 +-------------+--------------+--------------+--------------+----------+ 1168 iPE iBRM iBRC eBRC eBRM ePE 1170 168121 168231 168341 1171 168451 168451 168451 168451 1172 168002 168002 168002 168002 168002 1173 30030 30030 30030 30030 30030 30030 1175 Figure 5: Heirarchical BGP transport CAR, NHU at iBR 1177 1. Nodes 341, 231 and 121 receive and resolve BGP CAR route (451, 1178 C1) the same as in the previous model 1180 2. Node 121 allocates local label and programs swap entry in 1181 forwarding for (451, C1) 1183 3. 451 advertises BGP CAR route (E2, C1) to Transport RR T-RR2, 1184 which reflects it to T-RR1, which reflects it to 121 1186 4. Node 121 advertises (E2, C1) to E1 with next hop as 451 i.e. 1187 next-hop unchanged 1189 5. 121 also advertises (451, C1) to E1 with next hop self (121) and 1190 label 168451 1192 6. E1 resolves BGP CAR route (451, C1) via 121 on color-aware path 1193 (121, C1) 1195 1. Color-aware path (121, C1) is FA128 path to 121 (label 1196 168121) 1198 7. E1 receives BGP CAR route (E2, C1) via 451 with label 168002 1200 1. Let's assume E1 selects that path 1202 8. 
E1 resolves BGP CAR route (E2, C1) via 451 on color-aware path 1203 (451, C1) 1205 1. Color-aware path (451, C1) is BGP CAR path to 451 (label 1206 168451) 1208 9. E1's imposition color-aware label-stack for V/v is thus 1210 1. 30030 <=> V/v 1212 2. 168002 <=> (E2, C1) 1214 3. 168451 <=> (451, C1) 1216 4. 168121 <=> (121, C1) 1218 10. Nodes 121, 231 and 341 perform swap operation on 168451 bound to 1219 (451, C1) 1221 11. 451 performs swap operation on 168002 bound to color-aware path 1222 (E2, C1) 1224 6.3. Scale Analysis 1226 The following two tables summarize the control-plane and dataplane 1227 scale of these three models: 1229 | E1 | 121 | 231 1230 -----+---------------------+---------------------+-------------------- 1231 FLAT | (E2,C) via (121,C) | (E2,C) via (231,C) | (E2,C) via (341,C) 1232 -----+---------------------+---------------------+-------------------- 1233 H.NHS| (E2,C) via (121,C) | (E2,C) via (451,C) | 1234 | | (451,C) via (231,C) | (451,C) via (341,C) 1235 -----+---------------------+---------------------+-------------------- 1236 H.NHU| (E2,C) via (451,C) | | 1237 | (451,C) via (121,C) | (451,C) via (231,C) | (451,C) via (341,C) 1238 -----+---------------------+---------------------+-------------------- 1240 | E1 | 121 | 231 1241 -----+---------------------+---------------------+-------------------- 1242 FLAT | V -> 30030 | 168002 -> 168002 | 168002 -> 168002 1243 | 168002 | 168231 | 168341 1244 | 168121 | | 1245 -----+---------------------+---------------------+-------------------- 1246 H.NHS| V -> 30030 | 168002 -> 168002 | 168451 -> 168451 1247 | 168002 | 168451 | 168341 1248 | 168121 | 168231 | 1249 -----+---------------------+---------------------+-------------------- 1250 H.NHU| V -> 30030 | 168451 -> 168451 | 168451 -> 168451 1251 | 168002 | 168231 | 168341 1252 | 168451 | | 1253 | 168121 | | 1254 -----+---------------------+---------------------+-------------------- 1256 o The flat model is the simplest design, with a single BGP 
      transport level.  It results in the minimum label/SID stack at
      each BGP hop.  However, it significantly increases the scale
      impact on the core BRs (e.g., 341), whose FIB capacity and even
      MPLS label space may be exceeded.

      *  341's dataplane scales with (E2,C) where there may be 300k E's
         and 5 C's, hence 1.5M entries > 1M MPLS dataplane

   o  The hierarchical models avoid the need for core BRs to learn
      routes and install label forwarding entries for (E, C) routes.

      *  Whether NH self or unchanged at 121, 341's dataplane scales
         with (451,C) where there may be thousands of 451's and 5 C's,
         hence well under the 1M MPLS dataplane

   o  The next-hop-self option at ingress BRM (e.g., 121) hides the
      hierarchical design from the ingress PE, keeping its outgoing
      label programming as simple as the flat model.  However, the
      ingress BRM requires an additional BGP transport level recursion,
      which coupled with load-balancing adds dataplane complexity.  It
      needs to support a swap and push operation.  It also needs to
      install label forwarding entries for the egress PEs that are of
      interest to its local ingress PEs.

   o  With the next-hop-unchanged option at ingress BRM (e.g., 121),
      only an ingress PE needs to learn and install output label entries
      for egress (E, C) routes.  The ingress BRM only installs label
      forwarding entries for the egress ABR (e.g., 451).  However, the
      ingress PE needs an additional BGP transport level recursion and
      pushes a BGP VPN label and two BGP transport labels.  It may also
      need to handle load-balancing for the egress ABRs.  This is the
      most complex dataplane option for the ingress PE.

6.4.
Scaling Benefits of the (E, C) BGP Subscription and Filtering

   The (E, C) subscription scheme from Section 5 provides the following
   scaling benefits for the models in Section 6.2:

   o  An ingress PE (E1) only learns (E, C) routes that it needs to
      install into data plane for service route automated steering

   o  An ingress BRM (121) only learns (E, C) routes that it needs to
      install into data plane (for Next-Hop-Self), or that it needs to
      distribute towards its ingress PEs (inline RR with
      Next-Hop-Unchanged)

   o  An ingress BRM or a transport RR only needs to distribute the
      necessary subset of (E, C) routes to each client (subscriber);
      this minimizes their processing load for generating updates

   o  As a result, withdrawal of (E, C) routes when a remote node (E2)
      fails may also be faster, aiding better convergence

6.5.  Anycast SID

   This section describes how Anycast SID complements and improves the
   scaling designs above.

6.5.1.  Anycast SID for transit inter-domain nodes

   o  Redundant BRs (e.g., two egress BRMs, 451 and 452) advertise BGP
      CAR routes for a local PE (e.g., E2) with the same SID (based on
      label-index).  Such egress BRMs may be assigned a common Anycast
      SID, so that the BGP next-hops for these routes will also resolve
      via a color-aware path to the Anycast SID.

   o  The use of Anycast SID naturally provides fast local convergence
      upon failure of an egress BRM node.  In addition, it decreases the
      recursive resolution and load-balancing complexity at an ingress
      BRM or PE in the hierarchical designs above.

6.5.2.  Anycast SID for transport color endpoints (e.g., PEs)

   The common Anycast SID technique may also be used for a redundant
   pair of PEs that share an identical set of service (VPN) attachments.

   o  For example, assume a node E2' paired with E2 above.
      Both PEs should be configured with the same static label/SID for
      the services (e.g., per-VRF VPN label/SID), and will advertise
      associated service routes with the Anycast IP as BGP next-hop.

   o  This design provides a convergence and recursive resolution
      benefit on an ingress PE or ABR similar to the egress ABR case
      above.

7.  Routing Convergence

   This section will analyze routing convergence.

8.  VPN CAR

   This section illustrates the extension of BGP CAR to address the VPN
   CAR requirement stated in Section 3.2 of
   [dskc-bess-bgp-car-problem-statement].

   CE1 -------------- PE1 -------------------- PE2 -------------- CE2 - V

   o  BGP CAR is enabled between CE1-PE1 and PE2-CE2

   o  BGP VPN CAR is enabled between PE1 and PE2

   o  Provider publishes that intent 'low-delay' is mapped to color CP
      on its inbound peering links

   o  Within its infrastructure, Provider maps intent 'low-delay' to
      color CPT

   o  On CE1 and CE2, intent 'low-delay' is mapped to CC

   (V, CC) is a Color-Aware route originated by CE2

   1.   CE2 sends to PE2     : [(V, CC), Label L1] via CE2 with LCM (CP)
   2.   PE2 installs in VRF A: [(V, CC), L1] via CE2 which resolves on
                               (CE2, CP) / connected OIF
   2.a. PE2 allocates VPN Label L2 and programs swap entry for (V, CC)
   3.   PE2 sends to PE1     : [(RD, V, CC), L2] via PE2 with regular
                               Color Extended Community (CPT)
   4.   PE1 installs in VRF A: [(V, CC), L2] via (PE2, CPT) steered on
                               (PE2, CPT)
   4.a. PE1 allocates Label L3 and programs swap entry for (V, CC)
   5.   PE1 sends to CE1     : [(V, CC), L3] via PE1 without any LCM
   6.   CE1 installs         : [(V, CC), L3] via PE1 which resolves on
                               (PE1, CC) / connected OIF
   6.a. Label L3 is installed as the imposition label for (V, CC)

   VPN CAR distribution for (RD, V, CC) requires a new SAFI that follows
   the same VPN semantics as defined in [RFC4364], the difference being
   that the advertised routes carry the CAR NLRI defined in
   Section 2.9.2 of this document.

   The VPN CAR NLRI with RD has the format shown below:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  NLRI Length  |  Key Length   |   NLRI Type   | Prefix Length |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                      Route Distinguisher                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                      Route Distinguisher                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                     IP Prefix (variable)                      //
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                       Color (4 octets)                        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Followed by optional TLVs encoded as below:

   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     Type      |    Length     |       Value (variable)        //
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   where:

   Route Distinguisher: 8-octet field encoded according to [RFC4364].

9.  IANA Considerations

   IANA is requested to assign SAFI value 83 (BGP CAR) and SAFI value
   84 (BGP VPN CAR) from the "SAFI Values" sub-registry under the
   "Subsequent Address Family Identifiers (SAFI) Parameters" registry
   with this document as a reference.

9.1.  BGP CAR NLRI Types Registry

   IANA is requested to create a "BGP CAR NLRI Types" sub-registry under
   the "Border Gateway Protocol (BGP) Parameters" registry with this
   document as a reference.  The registry is for assignment of the
   one-octet code points for BGP CAR NLRI types and is populated with
   the values shown below:

   Type         NLRI Type                         Reference
   -----------------------------------------------------------------
   0            Reserved (not to be used)         [This document]
   1            Color-Aware Routes NLRI           [This document]
   2-255        Unassigned

   Allocations within the registry are to be made under the
   "Specification Required" policy as specified in [RFC8126].

9.2.  BGP CAR NLRI TLV Registry

   IANA is requested to create a "BGP CAR NLRI TLV Types" sub-registry
   under the "Border Gateway Protocol (BGP) Parameters" registry with
   this document as a reference.  The registry is for assignment of the
   one-octet code points for BGP-CAR NLRI non-key TLV types and is
   populated with the values shown below:

   Type         NLRI TLV Type                     Reference
   -----------------------------------------------------------------
   0            Reserved (not to be used)         [This document]
   1            Label TLV                         [This document]
   2            Label Index TLV                   [This document]
   3            SRv6 SID TLV                      [This document]
   4-255        Unassigned

   Allocations within the registry are to be made under the
   "Specification Required" policy as specified in [RFC8126].

9.3.  Guidance for Designated Experts

   In all cases of review by the Designated Expert (DE) described here,
   the DE is expected to ascertain the existence of suitable
   documentation (a specification) as described in [RFC8126].  The DE is
   also expected to check the clarity of purpose and use of the
   requested code points.  Additionally, the DE must verify that any
   request for one of these code points has been made available for
   review and comment within the IETF: the DE will post the request to
   the IDR Working Group mailing list (or a successor mailing list
   designated by the IESG).  If the request comes from within the IETF,
   it should be documented in an Internet-Draft.
   Lastly, the DE must ensure that any other request for a code point
   does not conflict with work that is active or already published
   within the IETF.

9.4.  BGP Extended Community Registry

   IANA is requested to allocate the sub-type TBD2 for "Local Color
   Mapping (LCM)" under the "BGP Transitive Opaque Extended Community"
   registry under the "BGP Extended Community" parameter registry.

10.  Acknowledgements

   The authors would like to acknowledge the review and inputs from many
   people.  TBD

11.  References

11.1.  Normative References

   [I-D.ietf-bess-srv6-services]
              Dawra, G., Filsfils, C., Talaulikar, K., Raszuk, R.,
              Decraene, B., Zhuang, S., and J. Rabadan, "SRv6 BGP based
              Overlay Services", draft-ietf-bess-srv6-services-15 (work
              in progress), March 2022.

   [I-D.ietf-idr-bgp-ipv6-rt-constrain]
              Patel, K., Raszuk, R., Djernaes, M., Dong, J., and M.
              Chen, "IPv6 Extensions for Route Target Distribution",
              draft-ietf-idr-bgp-ipv6-rt-constrain-12 (work in
              progress), April 2018.

   [I-D.ietf-idr-tunnel-encaps]
              Patel, K., Velde, G. V. D., Sangli, S. R., and J. Scudder,
              "The BGP Tunnel Encapsulation Attribute",
              draft-ietf-idr-tunnel-encaps-22 (work in progress),
              January 2021.

   [I-D.ietf-lsr-flex-algo]
              Psenak, P., Hegde, S., Filsfils, C., Talaulikar, K., and
              A. Gulko, "IGP Flexible Algorithm",
              draft-ietf-lsr-flex-algo-19 (work in progress),
              April 2022.

   [I-D.ietf-spring-segment-routing-policy]
              Filsfils, C., Talaulikar, K., Voyer, D., Bogdanov, A., and
              P. Mattes, "Segment Routing Policy Architecture",
              draft-ietf-spring-segment-routing-policy-22 (work in
              progress), March 2022.

   [I-D.ietf-spring-srv6-network-programming]
              Filsfils, C., Garvia, P. C., Leddy, J., Voyer, D.,
              Matsushima, S., and Z. Li, "Segment Routing over IPv6
              (SRv6) Network Programming",
              draft-ietf-spring-srv6-network-programming-28 (work in
              progress), December 2020.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC4360]  Sangli, S., Tappan, D., and Y. Rekhter, "BGP Extended
              Communities Attribute", RFC 4360, DOI 10.17487/RFC4360,
              February 2006.

   [RFC4684]  Marques, P., Bonica, R., Fang, L., Martini, L., Raszuk,
              R., Patel, K., and J. Guichard, "Constrained Route
              Distribution for Border Gateway Protocol/MultiProtocol
              Label Switching (BGP/MPLS) Internet Protocol (IP) Virtual
              Private Networks (VPNs)", RFC 4684, DOI 10.17487/RFC4684,
              November 2006.

   [RFC4760]  Bates, T., Chandra, R., Katz, D., and Y. Rekhter,
              "Multiprotocol Extensions for BGP-4", RFC 4760,
              DOI 10.17487/RFC4760, January 2007.

   [RFC5512]  Mohapatra, P. and E. Rosen, "The BGP Encapsulation
              Subsequent Address Family Identifier (SAFI) and the BGP
              Tunnel Encapsulation Attribute", RFC 5512,
              DOI 10.17487/RFC5512, April 2009.

   [RFC5701]  Rekhter, Y., "IPv6 Address Specific BGP Extended Community
              Attribute", RFC 5701, DOI 10.17487/RFC5701, November 2009.

   [RFC7311]  Mohapatra, P., Fernando, R., Rosen, E., and J. Uttaro,
              "The Accumulated IGP Metric Attribute for BGP", RFC 7311,
              DOI 10.17487/RFC7311, August 2014.

   [RFC7606]  Chen, E., Ed., Scudder, J., Ed., Mohapatra, P., and K.
              Patel, "Revised Error Handling for BGP UPDATE Messages",
              RFC 7606, DOI 10.17487/RFC7606, August 2015.

   [RFC8126]  Cotton, M., Leiba, B., and T. Narten, "Guidelines for
              Writing an IANA Considerations Section in RFCs", BCP 26,
              RFC 8126, DOI 10.17487/RFC8126, June 2017.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017.

   [RFC8277]  Rosen, E., "Using BGP to Bind MPLS Labels to Address
              Prefixes", RFC 8277, DOI 10.17487/RFC8277, October 2017.

   [RFC8402]  Filsfils, C., Ed., Previdi, S., Ed., Ginsberg, L.,
              Decraene, B., Litkowski, S., and R. Shakir, "Segment
              Routing Architecture", RFC 8402, DOI 10.17487/RFC8402,
              July 2018.

   [RFC8669]  Previdi, S., Filsfils, C., Lindem, A., Ed., Sreekantiah,
              A., and H. Gredler, "Segment Routing Prefix Segment
              Identifier Extensions for BGP", RFC 8669,
              DOI 10.17487/RFC8669, December 2019.

11.2.  Informative References

   [I-D.ietf-mpls-seamless-mpls]
              Leymann, N., Decraene, B., Filsfils, C., Konstantynowicz,
              M., and D. Steinberg, "Seamless MPLS Architecture",
              draft-ietf-mpls-seamless-mpls-07 (work in progress),
              June 2014.

   [RFC3906]  Shen, N. and H. Smit, "Calculating Interior Gateway
              Protocol (IGP) Routes Over Traffic Engineering Tunnels",
              RFC 3906, DOI 10.17487/RFC3906, October 2004.

   [RFC4271]  Rekhter, Y., Ed., Li, T., Ed., and S. Hares, Ed., "A
              Border Gateway Protocol 4 (BGP-4)", RFC 4271,
              DOI 10.17487/RFC4271, January 2006.

   [RFC4272]  Murphy, S., "BGP Security Vulnerabilities Analysis",
              RFC 4272, DOI 10.17487/RFC4272, January 2006.

   [RFC4364]  Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private
              Networks (VPNs)", RFC 4364, DOI 10.17487/RFC4364,
              February 2006.

   [RFC6952]  Jethanandani, M., Patel, K., and L. Zheng, "Analysis of
              BGP, LDP, PCEP, and MSDP Issues According to the Keying
              and Authentication for Routing Protocols (KARP) Design
              Guide", RFC 6952, DOI 10.17487/RFC6952, May 2013.

   [RFC7911]  Walton, D., Retana, A., Chen, E., and J.
Scudder, 1615 "Advertisement of Multiple Paths in BGP", RFC 7911, 1616 DOI 10.17487/RFC7911, July 2016, 1617 . 1619 Appendix A. Illustrations of Service Steering 1621 The following sub-sections illustrate example scenarios of Colored 1622 Service Route Steering over E2E BGP CAR resolving over different 1623 intra-domain mechanisms 1625 The examples use MPLS/SR for the transport data plane. Scenarios 1626 specific to other encapsulations will be added in subsequent 1627 versions. 1629 A.1. E2E BGP transport CAR intent realized using IGP FA 1630 RD:V/v via E2 1631 +-----+ vpn label: 30030 +-----+ 1632 ...... |S-RR1| <..................................|S-RR2| <....... 1633 : +-----+ Color C1 +-----+ : 1634 : : 1635 : : 1636 : : 1637 +-:-----------------------+----------------------+------------------:--+ 1638 | : | | : | 1639 | : | | : | 1640 | : (E2,C1) via 121 | (E2,C1) via 231 | (E2,C1)via E2 : | 1641 | : L=168002,AIGP=110 +---+ L=168002,AIGP=10 +---+ L=0x3,LI=8002 : | 1642 | : |-------------------|121|<-----------------|231|<-------------| : | 1643 | : V LI=8002 +---+ LI=8002 +---+ | : | 1644 |----+ | | +-----| 1645 | E1 | | | | E2 | 1646 |----+(E2,C1) via 122 | (E2,C1) via 232 | (E2,C1)via E2+-----| 1647 | ^ L=168002,AIGP=210 +---+ L=168002,AIGP=20 +---+ L=0x3 | | 1648 | |---------------- |122|<-----------------|232|<-------------| | 1649 | LI=8002 +---+ LI=8002 +---+ LI=8002 | 1650 | | | | 1651 | ISIS SR | ISIS SR | ISIS SR | 1652 | FA 128 | FA 128 | FA 128 | 1653 +-------------------------+----------------------+---------------------+ 1654 iPE iABR eABR ePE 1656 +------+ +------+ 1657 |168121| |168231| 1658 +------+ +------+ 1659 +------+ +------+ +------+ 1660 |168002| |168002| |168002| 1661 +------+ +------+ +------+ 1662 +------+ +------+ +------+ 1663 |30030 | |30030 | |30030 | 1664 +------+ +------+ +------+ 1666 Figure 6: BGP FA Aware transport CAR path 1668 Use case: Provide end to end intent for service flows. 
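
   The label and AIGP values shown on the upper path of Figure 6 can be
   reproduced with a small Python sketch.  It is a non-normative
   illustration resting on assumptions that the figure only implies: an
   SRGB starting at 160000 (inferred from L=168002 with LI=8002), an
   SRGB size of 10000 chosen purely for illustration, a route originated
   by E2 with an AIGP of 0, and FA 128 distances of 10 (231 to E2) and
   100 (121 to 231) inferred from the AIGP values.  The label derivation
   follows the Label-Index procedure of [RFC8669] and the metric
   accumulation follows [RFC7311].  The helper names are hypothetical.

```python
from typing import Optional


def label_from_index(srgb_base: int, srgb_size: int,
                     label_index: int) -> Optional[int]:
    """Return SRGB base + label index, or None if the index is out of range.

    None means the hint cannot be honoured and some other local label
    would have to be allocated.
    """
    if 0 <= label_index < srgb_size:
        return srgb_base + label_index
    return None


def aigp_after_next_hop_self(received_aigp: int,
                             distance_to_next_hop: int) -> int:
    """Add the local distance to the next hop to the received AIGP value."""
    return received_aigp + distance_to_next_hop


# Assumed values read off the upper path of Figure 6.
SRGB_BASE = 160000         # inferred: 168002 - 8002
SRGB_SIZE = 10000          # illustrative only
LABEL_INDEX_E2_C1 = 8002   # LI shown in the figure

label = label_from_index(SRGB_BASE, SRGB_SIZE, LABEL_INDEX_E2_C1)
aigp_at_231 = aigp_after_next_hop_self(0, 10)              # 231 towards E2
aigp_at_121 = aigp_after_next_hop_self(aigp_at_231, 100)   # 121 towards 231
print(label, aigp_at_231, aigp_at_121)  # prints: 168002 10 110
```

   Because every hop derives the same label from the same label index,
   the BGP CAR label stays 168002 end to end while the top label changes
   per domain.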

1670    o  With reference to the topology above:

1672       *  IGP FA 128 is running in each domain.

1674       *  Egress PE E2 advertises a VPN route RD:V/v colored with (color
1675          extended community) C1 to steer traffic to BGP transport CAR
1676          (E2, C1).  The VPN route propagates via service RRs to
1677          ingress PE E1.

1679       *  BGP CAR route (E2, C1), with next-hop, label-index and label
1680          as shown above, is advertised through border routers in each
1681          domain.  When an RR is used in the domain, ADD-PATH is enabled
1682          to advertise multiple available paths.

1684       *  Local policy on each hop maps intent C1 to resolve CAR route
1685          next-hop over IGP FA 128 of the domain.  The AIGP attribute
1686          influences BGP CAR route best path decision as per [RFC7311].
1687          BGP CAR label swap entry is installed that goes over FA 128 LSP
1688          to next-hop, providing intent in each IGP domain.  The AIGP
1689          metric is updated to reflect the FA 128 metric to next-hop.

1691       *  Ingress PE E1 learns CAR route (E2, C1).  It steers colored VPN
1692          route RD:V/v into (E2, C1).

1694    o  Important:

1696       *  IGP FA 128 top label provides intent in each domain.

1698       *  BGP CAR label (e.g. 168002) carries end to end intent.  It thus
1699          stitches intent over the intra-domain FA 128 paths.

1701 A.2.  E2E BGP transport CAR intent realized using SR Policy

1702                              RD:1/8 via E2
1703           +-----+          vpn label: 30030          +-----+
1704    ...... |S-RR1| <..................................|S-RR2| <......
1705    :      +-----+              Color C1              +-----+         :
1706    :                                                                 :
1707    :                                                                 :
1708    :                                                                 :
1709  +-:-----------------------+----------------------+------------------:-+
1710  | :                       |                      |                  : |
1711  | :                       |                      |                  : |
1712  | :  <-(E2,C1) via 121    |  <-(E2,C1) via 231   | <-(E2,C1)via E2  : |
1713  | :                     +---+                  +---+                : |
1714  | :  ------------------>|121|----------------->|231|--------------| : |
1715  | :  | SR policy(C1,121)+---+ SR policy(C1,231)+---+ SR policy    v : |
1716  |----+                    |                      |     (C1,E2)    +---|
1717  | E1 |                    |                      |                |E2 |
1718  |----+ <-(E2,C1) via 122  |    (E2,C1) via 232   | <-(E2,C1)via E2+---|
1719  |    |                  +---+                  +---+               ^  |
1720  |    ------------------>|122|----------------->|232|---------------|  |
1721  |      SR policy(C1,122)+---+ SR policy(C1,232)+---+ SR policy(C1,E2) |
1722  |                         |                      |                    |
1723  |                         |                      |                    |
1724  |         ISIS SR         |       ISIS SR        |       ISIS SR      |
1725  +-------------------------+----------------------+--------------------+
1726  iPE                     iABR                   eABR                 ePE

1728            Figure 7: BGP SR policy Aware transport CAR path

1730    Use case: Provide end to end intent for service flows.

1732    o  With reference to the topology above:

1734       *  SR Policy provides intra-domain intent.

1736       *  Egress PE E2 advertises a VPN route RD:V/v colored with (color
1737          extended community) C1 to steer traffic to BGP transport CAR
1738          (E2, C1).  The VPN route propagates via service RRs to
1739          ingress PE E1.

1741       *  BGP CAR route (E2, C1), with next-hop, label-index and label
1742          as shown above, is advertised through border routers in each
1743          domain.  When an RR is used in the domain, ADD-PATH is enabled
1744          to advertise multiple available paths.

1746       *  Local policy on each hop maps intent C1 to resolve CAR route
1747          next-hop over an SR policy(C1, next-hop).  BGP CAR label swap
1748          entry is installed that goes over the SR policy segment list.

1750       *  Ingress PE E1 learns CAR route (E2, C1).  It steers colored VPN
1751          route RD:V/v into (E2, C1).

1753    o  Important:

1755       *  SR policy provides intent in each domain.

1757       *  BGP CAR label (e.g. 168002) carries end to end intent.
1758          It thus stitches intent over the intra-domain SR policies.

1760 A.3.  BGP transport CAR intent realized in a section of the network

1761                              RD:1/8 via E2
1762           +-----+          vpn label: 30030          +-----+
1763    ...... |S-RR1| <..................................|S-RR2| <.......
1764    :      +-----+              Color C1              +-----+         :
1765    :                                                                 :
1766    :                                                                 :
1767    :                                                                 :
1768  +-:-----------------------+----------------------+------------------:--+
1769  | :                       |                      |                  :  |
1770  | :                       |                      |                  :  |
1771  | :   (E2,C1) via 121     |  (E2,C1) via 231     |  (E2,C1) via E2  :  |
1772  | :   L=168002,AIGP=1110+---+L=168002,AIGP=1010+---+ L=0x3          :  |
1773  | : |-------------------|121|<-----------------|231|<-------------| :  |
1774  | : V    LI=8002        +---+    LI=8002       +---+              | :  |
1775  |----+                    |                      |               +-----|
1776  | E1 |                    |                      |               | E2  |
1777  |----+(E2,C1) via 122     |  (E2,C1) via 232     | (E2,C1) via E2+-----|
1778  |  ^  L=168002,AIGP=1210+---+L=168002,AIGP=1020+---+ L=0x3        |    |
1779  |  |----------------    |122|<-----------------|232|<-------------|    |
1780  |        LI=8002        +---+    LI=8002       +---+                   |
1781  |                         |                      |                     |
1782  |         ISIS SR         |       ISIS SR        |       ISIS SR       |
1783  |          FA 0           |       FA 128         |        FA 0         |
1784  |         Access          |        Core          |       Access        |
1785  +-------------------------+----------------------+---------------------+
1786  iPE                     iABR                   eABR                  ePE

1788           +------+                +------+
1789           |160121|                |168231|
1790           +------+                +------+
1791           +------+                +------+               +------+
1792           |168002|                |168002|               |160002|
1793           +------+                +------+               +------+
1794           +------+                +------+               +------+
1795           |30030 |                |30030 |               |30030 |
1796           +------+                +------+               +------+

1798            Figure 8: BGP Hybrid FA Aware transport CAR path

1800    Use case: Provide intent for service flows only in Core domain.

1802    o  With reference to the topology above:

1804       *  IGP FA 128 is only enabled in Core (e.g. WAN network).  Access
1805          only has base algo 0.

1807       *  Egress PE E2 advertises a VPN route RD:V/v colored with (color
1808          extended community) C1 to steer traffic to BGP transport CAR
1809          (E2, C1).  The VPN route propagates via service RRs to
1810          ingress PE E1.

1812       *  BGP CAR route (E2, C1), with next-hop, label-index and label
1813          as shown above, is advertised through border routers in each
1814          domain.  When an RR is used in the domain, ADD-PATH is enabled
1815          to advertise multiple available paths.

1817       *  Local policy on 231 and 232 maps intent C1 to resolve CAR route
1818          next-hop over IGP base algo 0 in the right access domain.  BGP
1819          CAR label swap entry is installed that goes over algo 0 LSP to
1820          next-hop.  The AIGP metric is updated to reflect the algo 0
1821          metric to next-hop, with an additional penalty.

1823       *  Local policy on 121 and 122 maps intent C1 to resolve CAR route
1824          next-hop learnt from Core domain over IGP FA 128.  BGP CAR
1825          label swap entry is installed that goes over FA 128 LSP to
1826          next-hop, providing intent in Core IGP domain.

1828       *  Ingress PE E1 learns CAR route (E2, C1).  It maps intent C1 to
1829          resolve CAR route next-hop over IGP base algo 0.  It steers
1830          colored VPN route RD:V/v into (E2, C1).

1832    o  Important:

1834       *  IGP FA 128 top label provides intent in Core domain.

1836       *  BGP CAR label (e.g. 168002) carries intent from PEs, which is
1837          realized in the Core domain.

1839 A.4.  Transit network domains that do not support CAR

1841    o  In a brownfield deployment, color-aware paths between two PEs may
1842       need to go through a transit domain that does not support CAR.
1843       Examples include an MPLS LDP network with IGP best-effort, or a
1844       BGP-LU based multi-domain network.  An MPLS LDP network with
1845       best-effort IGP can adopt the above scheme.  Below is an example
1846       for BGP-LU.

1848    o  Reference topology:

1850          E1 --- BR1 --- BR2 ......... BR3 ---- BR4 --- E2
1851              Ci           <----LU---->            Ci

1853       *  The network between BR2 and BR3 comprises multiple BGP-LU hops
1854          (over IGP-LDP domains).

1856       *  E1, BR1, BR4 and E2 are enabled for BGP CAR, with Ci colors.

1857       *  BR1 and BR2 are directly connected; BR3 and BR4 are directly
1858          connected.

1860    o  BR1 and BR4 form an over-the-top peering (via RRs as needed) to
1861       exchange BGP CAR routes.

1863    o  BR1 and BR4 also form direct BGP-LU sessions to BR2 and BR3
1864       respectively, to establish labeled paths between each other
1865       through the BGP-LU network.

1867    o  BR1 recursively resolves the BGP CAR next-hop for CAR routes
1868       learnt from BR4 via the BGP-LU path to BR4.

1870    o  BR1 signals the transport discontinuity to E1 via the AIGP TLV, so
1871       that E1 can prefer other paths if available.

1873    o  BR4 does the same in the reverse direction.

1875    o  Thus, the color-awareness of the routes and hence the paths in the
1876       data plane are maintained between E1 and E2, even if the intent is
1877       not available within the BGP-LU island.

1879    o  A similar design can be used for going over network islands of
1880       other types.

1882 Appendix B.  Color Mapping Illustrations

1884    There are a variety of deployment scenarios that arise with respect
1885    to different color mappings in an inter-domain environment.  This
1886    section attempts to enumerate them and provide clarity into the usage
1887    of the color related protocol constructs.

1889 B.1.  Single color domain containing network domains with N:N color
1890       distribution

1892    o  All network domains (ingress, egress and all transit domains) are
1893       enabled for the same N colors.

1895       *  A color may of course be realized by different technologies in
1896          different domains as described above.

1898    o  The N intents are both signaled end-to-end via BGP CAR routes and
1899       realized in the data plane.

1901    o  Appendix A.1 is an example of this case.

1903 B.2.  Single color domain containing network domains with N:M color
1904       distribution

1906    o  Certain network domains may not be enabled for some of the colors,
1907       but may still be required to provide transit.

1909    o  When an (E, C) route traverses a domain where color C is not
1910       available, the operator may decide to use a different intent of
1911       color c that is available in that domain to resolve the next-hop
1912       and establish a path through the domain.

1914       *  The next-hop resolution may occur via paths of any intra-domain
1915          protocol or even via paths provided by BGP CAR.

1917       *  The next-hop resolution color c may be defined as a local
1918          policy at ingress or transit nodes of the domain.

1920       *  It may also be automatically signaled from egress border nodes
1921          by attaching a color extended community with value c to the BGP
1922          CAR routes.

1924    o  Hence, routes of N colors may be resolved via a smaller set of M
1925       colored paths in a transit domain, while preserving the original
1926       color-awareness end-to-end.

1928    o  Any ingress PE that installs a service (VPN) route with a color C
1929       must have C enabled locally to install IP routes to (E, C) and
1930       resolve the service route next-hop.

1932    o  A degenerate variation of this scenario is where a transit domain
1933       does not support any color.  Appendix A.3 describes an example of
1934       this case.

1936 B.3.  Multiple color domains

1938    When the routes are distributed between domains with different color-
1939    to-intent mapping schemes, both N:N and N:M cases are possible,
1940    although an N:M mapping is more likely to occur.

1942    Reference topology:

1944          D1 ----- D2 ----- D3
1945          C1       C2       C3

1947    o  C1 in D1 maps to C2 in D2 and to C3 in D3.

1949    o  BGP CAR is enabled in all three domains.

1950    The reference topology above is used to elaborate on the design
1951    described in Section 2.8.

1953    When the route originates in color domain D1 and gets advertised to a
1954    different color domain D2, the following procedures apply:

1956    o  The original intent in the BGP CAR route is preserved; i.e.,
1957       the route is (E, C1).

1959    o  A BR of D1 attaches LCM-EC with value C1 when advertising to a BR
1960       in D2.

1962    o  A BR in D2 receiving (E, C1) maps C1 in the received LCM-EC to a
1963       local color, say C2.

1965    o  Within D2, this LCM-EC value of C2 is used instead of the Color in
1966       CAR route NLRI (E, C1).  This applies to all procedures described
1967       in the earlier section for a single color domain, such as next-hop
1968       resolution and installation of route and forwarding entries.

1970    o  A colored service route V/v originated in domain D1 with next-hop
1971       E and color C1 will also have its color extended-community value
1972       re-mapped to C2, typically at a service RR.

1974    o  On an ingress PE in D2, V/v will resolve via C2.

1976    o  When a BR in D2 advertises the route to a BR in D3, the same
1977       process repeats.

1979 Authors' Addresses

1981    Dhananjaya Rao
1982    Cisco Systems
1983    USA

1985    Email: dhrao@cisco.com

1987    Swadesh Agrawal
1988    Cisco Systems
1989    USA

1991    Email: swaagraw@cisco.com

1992    Clarence Filsfils
1993    Cisco Systems
1994    Belgium

1996    Email: cfilsfil@cisco.com

1998    Dirk Steinberg
1999    Lapishills Consulting Limited
2000    Germany

2002    Email: dirk@lapishills.com

2004    Luay Jalil
2005    Verizon
2006    USA

2008    Email: luay.jalil@verizon.com

2010    Yuanchao Su
2011    Alibaba, Inc

2013    Email: yitai.syc@alibaba-inc.com

2015    Bruno Decraene
2016    Orange
2017    France

2019    Email: bruno.decraene@orange.com

2021    Jim Guichard
2022    Futurewei
2023    USA

2025    Email: james.n.guichard@futurewei.com

2027    Ketan Talaulikar
2028    Arrcus, Inc
2029    India

2031    Email: ketant.ietf@gmail.com

2032    Keyur Patel
2033    Arrcus, Inc
2034    USA

2036    Email: keyur@arrcus.com

2038    Haibo Wang
2039    Huawei Technologies
2040    China

2042    Email: rainsword.wang@huawei.com