idnits 2.17.1 draft-dskc-bess-bgp-car-02.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** The document seems to lack a Security Considerations section. ** There are 9 instances of too long lines in the document, the longest one being 22 characters in excess of 72. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year == Line 1643 has weird spacing: '... policy v :...' -- The document date (May 11, 2021) is 1074 days in the past. Is this intentional? 
Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) == Unused Reference: 'I-D.ietf-idr-bgp-ipv6-rt-constrain' is defined on line 1418, but no explicit reference was found in the text == Unused Reference: 'RFC4360' is defined on line 1451, but no explicit reference was found in the text == Unused Reference: 'RFC4684' is defined on line 1455, but no explicit reference was found in the text == Unused Reference: 'RFC5512' is defined on line 1467, but no explicit reference was found in the text == Unused Reference: 'RFC5701' is defined on line 1473, but no explicit reference was found in the text == Unused Reference: 'I-D.ietf-mpls-seamless-mpls' is defined on line 1513, but no explicit reference was found in the text == Unused Reference: 'RFC3906' is defined on line 1518, but no explicit reference was found in the text == Unused Reference: 'RFC4271' is defined on line 1523, but no explicit reference was found in the text == Unused Reference: 'RFC4272' is defined on line 1528, but no explicit reference was found in the text == Unused Reference: 'RFC6952' is defined on line 1536, but no explicit reference was found in the text == Unused Reference: 'RFC7911' is defined on line 1542, but no explicit reference was found in the text == Outdated reference: A later version (-15) exists of draft-ietf-bess-srv6-services-07 == Outdated reference: A later version (-26) exists of draft-ietf-lsr-flex-algo-15 == Outdated reference: A later version (-22) exists of draft-ietf-spring-segment-routing-policy-11 ** Obsolete normative reference: RFC 5512 (Obsoleted by RFC 9012) Summary: 3 errors (**), 0 flaws (~~), 16 warnings (==), 1 comment (--). Run idnits with the --verbose option for more detailed information about the items above. 
-------------------------------------------------------------------------------- 2 BESS WorkGroup D. Rao 3 Internet-Draft S. Agrawal 4 Intended status: Standards Track C. Filsfils 5 Expires: November 12, 2021 K. Talaulikar 6 Cisco Systems 7 D. Steinberg 8 Lapishills Consulting Limited 9 L. Jalil 10 Verizon 11 Y. Su 12 Alibaba, Inc 13 J. Guichard 14 Futurewei 15 K. Patel 16 Arrcus, Inc 17 H. Wang 18 Huawei Technologies 19 May 11, 2021 21 BGP Color-Aware Routing (CAR) 22 draft-dskc-bess-bgp-car-02 24 Abstract 26 This document describes a BGP based routing solution to establish 27 end-to-end intent-aware paths across a multi-domain service provider 28 transport network. This solution is called BGP Color-Aware Routing 29 (BGP CAR). 31 Status of This Memo 33 This Internet-Draft is submitted in full conformance with the 34 provisions of BCP 78 and BCP 79. 36 Internet-Drafts are working documents of the Internet Engineering 37 Task Force (IETF). Note that other groups may also distribute 38 working documents as Internet-Drafts. The list of current Internet- 39 Drafts is at https://datatracker.ietf.org/drafts/current/. 41 Internet-Drafts are draft documents valid for a maximum of six months 42 and may be updated, replaced, or obsoleted by other documents at any 43 time. It is inappropriate to use Internet-Drafts as reference 44 material or to cite them other than as "work in progress." 46 This Internet-Draft will expire on November 12, 2021. 48 Copyright Notice 50 Copyright (c) 2021 IETF Trust and the persons identified as the 51 document authors. All rights reserved. 53 This document is subject to BCP 78 and the IETF Trust's Legal 54 Provisions Relating to IETF Documents 55 (https://trustee.ietf.org/license-info) in effect on the date of 56 publication of this document. Please review these documents 57 carefully, as they describe your rights and restrictions with respect 58 to this document. 
Code Components extracted from this document must 59 include Simplified BSD License text as described in Section 4.e of 60 the Trust Legal Provisions and are provided without warranty as 61 described in the Simplified BSD License. 63 Table of Contents 65 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 66 1.1. Terminology . . . . . . . . . . . . . . . . . . . . . . . 3 67 1.2. Illustration . . . . . . . . . . . . . . . . . . . . . . 5 68 1.3. Requirements Language . . . . . . . . . . . . . . . . . . 7 69 2. BGP CAR SAFI . . . . . . . . . . . . . . . . . . . . . . . . 7 70 2.1. Data Model . . . . . . . . . . . . . . . . . . . . . . . 7 71 2.2. Extensible encoding . . . . . . . . . . . . . . . . . . . 7 72 2.3. BGP CAR Route Origination . . . . . . . . . . . . . . . . 8 73 2.4. BGP CAR Route Validation . . . . . . . . . . . . . . . . 8 74 2.5. BGP CAR Route Resolution . . . . . . . . . . . . . . . . 8 75 2.6. AIGP Metric Computation . . . . . . . . . . . . . . . . . 9 76 2.7. Path Availability . . . . . . . . . . . . . . . . . . . . 9 77 2.8. BGP CAR signaling through different color domains . . . . 10 78 2.9. Format and Encoding . . . . . . . . . . . . . . . . . . . 11 79 2.9.1. BGP CAR SAFI NLRI Format . . . . . . . . . . . . . . 11 80 2.9.2. CAR NLRI Type . . . . . . . . . . . . . . . . . . . . 12 81 2.9.3. Local-Color-Mapping (LCM) Extended Community . . . . 16 82 2.10. Fault Handling . . . . . . . . . . . . . . . . . . . . . 17 83 3. Service route Automated Steering on Color-Aware path . . . . 17 84 4. Intents . . . . . . . . . . . . . . . . . . . . . . . . . . . 17 85 5. (E, C) Subscription and Filtering . . . . . . . . . . . . . . 18 86 5.1. Illustration . . . . . . . . . . . . . . . . . . . . . . 18 87 5.2. Definition . . . . . . . . . . . . . . . . . . . . . . . 19 88 6. Scaling . . . . . . . . . . . . . . . . . . . . . . . . . . . 19 89 6.1. Ultra-Scale Reference Topology . . . . . . . . . . . . . 19 90 6.2. Deployment model . . . . . . . . 
. . . . . . . . . . . . 21 91 6.2.1. Flat . . . . . . . . . . . . . . . . . . . . . . . . 21 92 6.2.2. Hierarchical Design with next-hop-self at ingress 93 domain BR . . . . . . . . . . . . . . . . . . . . . . 22 94 6.2.3. Hierarchical Design with Next Hop Unchanged at 95 ingress domain BR . . . . . . . . . . . . . . . . . . 24 97 6.3. Scale Analysis . . . . . . . . . . . . . . . . . . . . . 25 98 6.4. Scaling Benefits of the (E, C) BGP Subscription and 99 Filtering . . . . . . . . . . . . . . . . . . . . . . . . 27 100 6.5. Anycast SID . . . . . . . . . . . . . . . . . . . . . . . 27 101 6.5.1. Anycast SID for transit inter-domain nodes . . . . . 27 102 6.5.2. Anycast SID for transport color endpoints (e.g., PEs) 28 103 7. Routing Convergence . . . . . . . . . . . . . . . . . . . . . 28 104 8. VPN CAR . . . . . . . . . . . . . . . . . . . . . . . . . . . 28 105 9. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 30 106 9.1. BGP CAR NLRI Types Registry . . . . . . . . . . . . . . . 30 107 9.2. BGP CAR NLRI TLV Registry . . . . . . . . . . . . . . . . 30 108 9.3. Guidance for Designated Experts . . . . . . . . . . . . . 31 109 9.4. BGP Extended Community Registry . . . . . . . . . . . . . 31 110 10. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 31 111 11. References . . . . . . . . . . . . . . . . . . . . . . . . . 31 112 11.1. Normative References . . . . . . . . . . . . . . . . . . 31 113 11.2. Informative References . . . . . . . . . . . . . . . . . 33 114 Appendix A. Illustrations of Service Steering . . . . . . . . . 34 115 A.1. E2E BGP transport CAR intent realized using IGP FA . . . 34 116 A.2. E2E BGP transport CAR intent realized using SR Policy . . 36 117 A.3. BGP transport CAR intent realized in a section of the 118 network . . . . . . . . . . . . . . . . . . . . . . . . . 38 119 A.4. Transit network domains that do not support CAR . . . . . 40 120 Appendix B. Color Mapping Illustrations . . . . . . . . . . . . 41 121 B.1. 
Single color domain containing network domains with N:N 122 color distribution . . . . . . . . . . . . . . . . . . . 41 123 B.2. Single color domain containing network domains with N:M 124 color distribution . . . . . . . . . . . . . . . . . . . 42 125 B.3. Multiple color domains . . . . . . . . . . . . . . . . . 42 126 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 43 128 1. Introduction 130 This document specifies a new BGP SAFI called BGP Color-Aware Routing 131 (BGP CAR). BGP CAR fulfills the transport and VPN problem statement 132 and requirements described in [dskc-bess-bgp-car-problem-statement]. 134 1.1. Terminology 136 +---------------+---------------------------------------------------+ 137 | Intent | Any combination of the following behaviors: a/ | 138 | | Topology path selection (e.g. minimize metric, | 139 | | avoid resource), b/ NFV service insertion (e.g. | 140 | | service chain steering), c/ per-hop behavior | 141 | | (e.g. 5G slice). | 142 | | | 143 | Color | A 32-bit numerical value associated with an | 144 | | intent: e.g. low-cost vs low-delay vs avoiding | 145 | | some resources. | 146 | | | 147 | Colored | An egress PE E2 colors its BGP VPN route V/v to | 148 | Service Route | indicate the intent that it requests for the | 149 | | traffic bound to V/v. The color is encoded as a | 150 | | BGP Color Extended community | 151 | | [I-D.ietf-idr-tunnel-encaps]. | 152 | | | 153 | Color-Aware | A routed path to E2 which satisfies the intent | 154 | Path to (E2, | associated with color C. Several technologies | 155 | C) | may provide a Color-Aware Path to (E2, C): SR | 156 | | Policy [I-D.ietf-spring-segment-routing-policy], | 157 | | IGP Flex-Algo [I-D.ietf-lsr-flex-algo], BGP CAR | 158 | | [specified in this document]. | 159 | | | 160 | Color-Aware | A distributed or signaled route that builds a | 161 | Route (E2, C) | color-aware path to E2 for color C. 
| 162 | | | 163 | Service Route | E1 automatically steers a C-colored service route | 164 | Automated | V/v from E2 onto an (E2, C) path. If several such | 165 | Steering on | paths exist, a preference scheme is used to | 166 | Color-aware | select the best path: E.g. IGP Flex-Algo first | 167 | path | then BGP CAR then SR Policy. | 168 | | | 169 | Color Domain | A set of nodes which share the same Color-to- | 170 | | Intent mapping. This set can be organized in one | 171 | | or several IGP instances or BGP domains. | 172 | | | 173 | Resolution of | An inter-domain BGP CAR route (E, C) from N is | 174 | a BGP CAR | resolved on an intra-domain color-aware path (N, | 175 | route (E, C) | C) where N is the next-hop of the BGP CAR route. | 176 | | | 177 | Resolution vs | In this document and consistently with the | 178 | Steering | terminology of the SR Policy document | 179 | | [I-D.ietf-spring-segment-routing-policy], | 180 | | steering is used to describe the mapping of a | 181 | | service route onto a BGP CAR path while the term | 182 | | resolution is preserved for the mapping of an | 183 | | inter-domain BGP CAR route on an intra-domain | 184 | | color-aware path. | 185 | | | 186 | | Service Steering: Service route -> BGP CAR path | 187 | | (or other Color-Aware Routed Paths: e.g., SR | 188 | | Policy) | 189 | | | 190 | | Intra-Domain Resolution: BGP CAR route -> intra- | 191 | | domain color aware path (e.g. SR Policy, IGP | 192 | | Flex-Algo, BGP CAR) | 193 +---------------+---------------------------------------------------+ 195 1.2. Illustration 197 Here is a brief illustration of the salient properties of the BGP CAR 198 solution. 
200 +-------------+ +-------------+ +-------------+ 201 | | | | | | V/v with C1 202 |----+ |------| |------| +----|/ 203 | E1 | | | | | | E2 |\ 204 |----+ | | | | +----| W/w with C2 205 | |------| |------| | 206 | Domain 1 | | Domain 2 | | Domain 3 | 207 +-------------+ +-------------+ +-------------+ 209 Figure 1 211 All the nodes are part of an interdomain network under a single 212 authority and with a consistent color-to-intent mapping: 214 o C1 is mapped to "low-delay" 216 * Flex-Algo FA1 is mapped to "low delay" and hence to C1 218 o C2 is mapped to "low-delay and avoid resource R" 220 * Flex-Algo FA2 is mapped to "low delay and avoid resource R" and 221 hence C2 223 E1 receives two service routes from E2: 225 o V/v with BGP Extended-Color community C1 227 o W/w with BGP Extended-Color community C2 229 E1 has the following color-aware paths: 231 o (E2, C1) provided by BGP CAR with the following per-domain 232 support: 234 * Domain1: over IGP FA1 236 * Domain2: over SR Policy bound to color C1 238 * Domain3: over IGP FA1 240 o (E2, C2) provided by SR Policy 242 E1 automatically steers the received service routes as follows: 244 o V/v via (E2, C1) provided by BGP CAR 246 o W/w via (E2, C2) provided by SR Policy 248 Illustrated Properties: 250 o Leverage of the BGP Color Extended-Community 252 * The service routes are colored with widely-used BGP Extended- 253 Color Community 255 o (E, C) Automated Steering 257 * V/v and W/w are automatically steered on the appropriate color- 258 aware path 260 o Seamless co-existence of BGP CAR and SR Policy 262 * V/v is steered on BGP CAR color-aware path 264 * W/w is steered on SR Policy color-aware path 266 o Seamless interworking of BGP CAR and SR Policy 268 * V/v is steered on a BGP CAR color-aware path that is itself 269 resolved within domain 2 onto an SR Policy bound to the color 270 of V/v 272 Other properties: 274 o MPLS dataplane: with 300k PE's and 5 colors, the BGP CAR solution 275 ensures that no single node needs 
to support a dataplane scaling 276 in the order of Remote PE * C. This would otherwise blow the MPLS 277 dataplane. 279 o Control-Plane: a node should not install a (E, C) path if it does 280 not need it 282 o Incongruent Color-Intent mapping: the solution supports the 283 signaling of a BGP CAR route across different color domains 285 The keys to this simplicity are: 287 o the leverage of the BGP Color Extended-Community to color service 288 routes 290 o the definition of the automated steering: a C-colored service 291 route V/v from E2 is steered onto a color-aware path (E2, C) 293 o the definition of the data model of a BGP CAR path: (E, C) 295 * consistent with SR Policy data model 297 o the definition of the recursive resolution of a BGP CAR route: a 298 BGP CAR (E2, C) via N is resolved onto the color-aware path (N, C) 299 which may itself be provided by BGP CAR or via another color-aware 300 routing solution: SR Policy, IGP Flex-Algo. 302 1.3. Requirements Language 304 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 305 "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and 306 "OPTIONAL" in this document are to be interpreted as described in BCP 307 14 [RFC2119] [RFC8174] when, and only when, they appear in all 308 capitals, as shown here. 310 2. BGP CAR SAFI 312 2.1. Data Model 314 The BGP CAR data model is: 316 o NLRI Key: IP Prefix, Color 318 o NLRI non-key encapsulation data: MPLS label stack, Label index, 319 SRv6 SID list etc. 321 o BGP Next Hop 323 o AIGP Metric: accumulates color/intent specific metric across 324 domains 326 o Local-Color-Mapping Extended-Community (LCM-EC): Optional 32-bit 327 Color value used when a CAR route propagates between different 328 color domains 330 2.2. 
Extensible encoding 332 Extensible encoding is ensured by: 334 o NLRI Route-Type field: provides extensibility to add new NLRI 335 formats for new route-types 337 o Key length: field enables handling of unsupported route-types 338 opaquely, enabling transitivity via RRs 340 o TLV-based encoding of non-key NLRI: enables support for multiple 341 encapsulations with efficient update packing 343 o AIGP Attribute provides extensibility via TLVs, enabling 344 definition of additional metric semantics for a color as needed 345 for an intent 347 2.3. BGP CAR Route Origination 349 A BGP CAR route may be originated locally (e.g., loopback) or through 350 redistribution of an (E, C) color-aware path provided by another 351 routing solution: SR Policy, IGP Flex-Algo or BGP-LU [RFC8277]. 353 2.4. BGP CAR Route Validation 355 A BGP CAR path (E, C) from N with encapsulation T is valid if color- 356 aware path (N, C) exists and T is dataplane available. 358 A local policy may customize the validation process: 360 o the color constraint in the first check may be relaxed: instead N 361 is reachable in the default routing table 363 o the dataplane availability constraint of T may be relaxed 365 o addition of a performance-measurement verification to ensure that 366 the intent associated with C is met (e.g. delay < bound) 368 2.5. BGP CAR Route Resolution 370 A BGP color-aware route (E2, C1) from N is resolved over a color- 371 aware route (N, C1). The color-aware route (N, C1) may be provided 372 recursively by BGP CAR or by other routing solutions: SR Policy, IGP 373 Flex-Algo, BGP-LU. 375 When multiple resolutions are possible, the default preference should 376 be: IGP Flex-Algo, SR Policy, BGP CAR, BGP LU. 378 Through local policy, a BGP color-aware route (E2, C1) from N may be 379 resolved over a color-aware route (N, C2): i.e. the local policy maps 380 the resolution of C1 over C2. 
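The resolution rule above can be sketched in Python as follows. This is only an illustrative sketch: the names `Provider`, `PathTable`, `resolve` and `color_map` are assumptions made for this example and are not defined by this document. The preference order is the default listed above, and `color_map` models the optional local policy that resolves one color over another.

```python
from enum import IntEnum
from typing import Dict, List, Optional, Tuple


class Provider(IntEnum):
    """Routing solutions that can provide a path; lower value = preferred.

    The ordering is the default resolution preference given in the text.
    """
    IGP_FLEX_ALGO = 0
    SR_POLICY = 1
    BGP_CAR = 2
    BGP_LU = 3


# Hypothetical table of intra-domain paths: (next hop, color) -> providers.
PathTable = Dict[Tuple[str, int], List[Provider]]


def resolve(next_hop: str, color: int, paths: PathTable,
            color_map: Optional[Dict[int, int]] = None
            ) -> Optional[Tuple[int, Provider]]:
    """Resolve a BGP CAR route (E, color) received from next_hop.

    Returns (resolution color, chosen provider), or None when no
    color-aware path to the next hop exists for the resolution color.
    """
    # Local policy may map the route's color to a different color.
    resolution_color = (color_map or {}).get(color, color)
    candidates = paths.get((next_hop, resolution_color), [])
    if not candidates:
        return None
    # Pick the most preferred provider among those available.
    return resolution_color, min(candidates)


# Example: (N, 1) is available via BGP CAR and SR Policy, (N, 2) via Flex-Algo.
paths = {
    ("N", 1): [Provider.BGP_CAR, Provider.SR_POLICY],
    ("N", 2): [Provider.IGP_FLEX_ALGO],
}
print(resolve("N", 1, paths))                    # SR Policy wins over BGP CAR
print(resolve("N", 3, paths, color_map={3: 2}))  # color 3 resolved over color 2
```

Keeping the color mapping as a separate, optional table leaves the default behavior (resolve over the same color) untouched when no local policy is configured.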
For example, in a domain where
381   resource R is known not to be present, the inter-domain intent
382   C1="low delay and avoid R" may be resolved over an intra-domain path
383   of intent C2="low delay".

385   The color-aware route (N, C1) may have a different dataplane
386   encapsulation than the one of (E2, C1): e.g. a BGP CAR route (E2, C1)
387   with SR-MPLS encapsulation may be transported over an intermediate
388   SRv6 domain.

390 2.6. AIGP Metric Computation

392   The Accumulated IGP (AIGP) Attribute is updated as the BGP CAR route
393   propagates across the network.

395   The value set (or appropriately incremented) in the AIGP TLV
396   corresponds to the metric associated with the underlying intent of
397   the color. For example, when the color is associated with a low-
398   latency path, the metric value is set based on the delay metric.

400   Information regarding the metric type used by the underlying intra-
401   domain mechanism can also be set.

403   If a BGP CAR route traverses a discontinuity in the transport path
404   for a given intent, a penalty is added to the accumulated IGP metric.
405   The discontinuity is also indicated to upstream nodes via a bit in
406   the AIGP TLV.

408   AIGP metric computation is recursive.

410   To prevent continuous IGP metric churn from causing end-to-end BGP
411   CAR churn, an implementation should provide thresholds that trigger
412   an AIGP update.

414   Additional AIGP extensions may be defined to signal state for
415   specific use-cases: MSD along the BGP CAR advertisement, Minimum MTU
416   along the BGP CAR advertisement.

418 2.7. Path Availability

420   The (E, C) route inherently provides availability of redundant paths
421   at every hop. For instance, BGP CAR routes originated by two egress
422   ABRs in a domain are advertised as multiple paths to ingress ABRs in
423   the domain, where they become equal-cost or primary-backup paths. A
424   failure of an egress ABR is detected and handled by ingress ABRs
425   locally within the domain for faster convergence, without any
426   necessity to propagate the event to upstream nodes for traffic
427   restoration.

429   BGP ADD-PATH should be enabled for BGP CAR to signal multiple next
430   hops through a transport RR.

432 2.8. BGP CAR signaling through different color domains

434   [Color Domain 1 A]-----[B Color Domain 2 E2]
435   [C1=low-delay    ]     [C2=low-delay       ]

437   Let us assume a BGP CAR route (E2, C2) is signaled from B to A, the
438   border routers of domain 2 and domain 1 respectively. Let us assume
439   that these two domains do not share the same color-to-intent mapping:
440   low-delay is color C2 in domain 2 but color C1 in domain 1 (C1 <> C2).

442   The BGP CAR solution seamlessly supports this (rare) scenario while
443   maintaining the separation and independence of the administrative
444   authority in different color domains.

446   The solution works as follows:

448   o  Within domain 2, the BGP CAR route is (E2, C2) via E2

450   o  B signals to A the BGP CAR route as (E2, C2) via B with Local-
451      Color-Mapping-Extended-Community (LCM-EC) of color C2

453   o  A is aware (classic peering agreement) of the intent-to-color
454      mapping within domain 2 ("low-delay" in domain 2 is C2)

456   o  A maps C2 in LCM-EC to C1 and signals within domain 1 the received
457      BGP CAR route as (E2, C2) via A with LCM-EC(C1)

459   o  The nodes within the receiving domain 1 use the local color
460      encoded in the LCM-EC for next-hop resolution and BGP CAR route
461      installation

463   Salient properties:

465   o  The NLRI never changes

467   o  E is globally unique, which makes E-C in that order unique

469   o  In the vast majority of cases, the color of the NLRI is used
470      for resolution and steering

472   o  In the rare case of color incongruence, the local color encoded in
473      LCM-EC takes precedence

475   Further illustrations are provided in Appendix B.

477 2.9.
Format and Encoding

479   BGP CAR leverages the BGP multi-protocol extensions [RFC4760] and
480   uses the MP_REACH_NLRI and MP_UNREACH_NLRI attributes for route
481   updates by using the SAFI value TBD1 along with AFI 1 for IPv4
482   prefixes and AFI 2 for IPv6 prefixes.

484   BGP speakers MUST use BGP Capabilities Advertisement to ensure
485   support for processing of BGP CAR updates. This is done as specified
486   in [RFC4760], by using capability code 1 (multi-protocol BGP), with
487   AFI 1 and 2 (as required) and SAFI TBD1.

489   The sub-sections below specify the generic encoding of the BGP CAR
490   NLRI followed by the encoding for specific NLRI types introduced in
491   this document.

493 2.9.1. BGP CAR SAFI NLRI Format

495   The generic format for the BGP CAR SAFI NLRI is shown below:

497    0                   1                   2                   3
498    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
499   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
500   |  NLRI Length  |  Key Length   |   NLRI Type   |              //
501   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+              //
502   |                  Type-specific Key Fields                    //
503   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
504   |         Type-specific Non-Key Fields (if applicable)         //
505   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

507   where:

509   o  NLRI Length: 1 octet field that indicates the length in octets of
510      the NLRI excluding the NLRI Length field itself.

512   o  Key Length: 1 octet field that indicates the length in octets of
513      the NLRI type-specific key fields. Key length MUST be at least 2
514      less than the NLRI length.

516   o  NLRI Type: 1 octet field that indicates the type of the BGP CAR
517      NLRI.

519   o  Type-Specific Key Fields: Depend on the NLRI type; their length is
520      indicated by the Key Length field.

522   o  Type-Specific Non-Key Fields: optional and variable depending on
523      the NLRI type. The NLRI encoding allows for encoding of specific
524      non-key information associated with the route (i.e. the key) as
525      part of the NLRI for efficient packing of BGP updates.

527   The indication of the key length enables BGP Speakers to determine
528   the key portion of the NLRI and use it along with the NLRI Type field
529   in an opaque manner for handling of unknown or unsupported NLRI
530   types. This can help Route Reflectors (RR) to propagate NLRI types
531   introduced in the future in a transparent manner.

533   The NLRI encoding allows for encoding of specific non-key information
534   associated with the route (i.e. the key) as part of the NLRI for
535   efficient packing of BGP updates.

537   The non-key portion of the NLRI MUST be omitted while carrying it
538   within the MP_UNREACH_NLRI when withdrawing the route advertisement.

540 2.9.2. CAR NLRI Type

542   The Color-Aware Routes NLRI Type is used for advertisement of color-
543   aware routes and has the following format:

545    0                   1                   2                   3
546    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
547   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
548   |  NLRI Length  |  Key Length   |   NLRI Type   |Prefix Length  |
549   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
550   |                     IP Prefix (variable)                     //
551   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
552   |                       Color (4 octets)                        |
553   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

555   Followed by optional TLVs encoded as below:

557   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
558   |     Type      |    Length     |      Value (variable)        //
559   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

561   where:

563   o  NLRI Length: variable

565   o  Key Length: variable

567   o  NLRI Type: 1

569   o  Type-Specific Key Fields: as below
570      *  Prefix Length: 1 octet field that carries the length of the
571         prefix in bits. Length MUST be less than or equal to 32 for IPv4
572         (AFI=1) and less than or equal to 128 for IPv6 (AFI=2).

574      *  IP Prefix: IPv4 or IPv6 prefix (based on the AFI). A variable
575         size field that contains the most significant octets of the
576         prefix, i.e., 1 octet for prefix length 1 to 8, 2 octets for
577         prefix length 9 to 16, 3 octets for prefix length 17 up to 24,
578         4 octets for prefix length 25 up to 32, and so on. The size of
579         the field MUST be less than or equal to 4 for IPv4 (AFI=1) and
580         less than or equal to 16 for IPv6 (AFI=2).

582      *  Color: 4-octet field that contains the color value associated
583         with the prefix.

585   o  Type-Specific Non-Key Fields: specified in the form of optional
586      TLVs as below:

588      *  Type: 1 octet field that contains the type of the non-key TLV

590      *  Length: 1 octet field that contains the length of the value
591         portion of the non-key TLV in terms of octets

593      *  Value: variable length field as indicated by the length field
594         and to be interpreted as per the type field.

596   The prefix is routable across the administrative domain where BGP
597   transport CAR is deployed. It is possible that the same prefix is
598   originated by multiple BGP CAR speakers in the case of anycast
599   addressing or multi-homing.

601   The Color is introduced to enable multiple route advertisements for
602   the same prefix. The color is associated with an intent (e.g. low-
603   latency) in the originator's color domain.

605   The following sub-sections specify the non-key TLVs associated with
606   the Color-Aware Routes NLRI type.

608 2.9.2.1.
Label TLV

610   The Label TLV is used for advertisement of color-aware routes along
611   with their MPLS labels and has the following format:

613    0                   1                   2                   3
614    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
615   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
616   |     Type      |    Length     |
617   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

619   Followed by one (or more) Labels encoded as below:

621   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
622   |                 Label                 |Rsrv |S|
623   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

625   where:

627   o  Type : 1

629   o  Length: variable, MUST be a multiple of 3

631   o  Label Information: multiples of 3 octet fields to convey the MPLS
632      label(s) associated with the advertised color-aware route. It is
633      used for encoding a single label or a stack of labels as per
634      procedures specified in [RFC8277].

636   When a BGP transport CAR speaker is propagating the route further
637   after setting itself as the nexthop, it allocates a local label for
638   the specific prefix and color combination which it updates in this
639   TLV. It also MUST program a label cross-connect that would result in
640   the label swap operation for the incoming label that it advertises
641   with the label received from its best-path router(s).

643 2.9.2.2.
Label Index TLV

645   The Label Index TLV is used for advertisement of Segment Routing MPLS
646   (SR-MPLS) Segment Identifier (SID) [RFC8402] information associated
647   with the labeled color-aware routes and has the following format:

649    0                   1                   2                   3
650    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
651   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
652   |     Type      |    Length     |   Reserved    |    Flags      ~
653   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
654   ~               |                  Label Index                  ~
655   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
656   ~               |
657   +-+-+-+-+-+-+-+-+

659   where:

661   o  Type : 2

663   o  Length: 7

665   o  Reserved: 1 octet field that MUST be set to 0 and ignored on
666      receipt.

668   o  Flags: 2 octet field that maps to the Flags field of the Label-
669      Index TLV of the BGP Prefix SID Attribute [RFC8669].

671   o  Label Index: 4 octet field that maps to the Label Index field of
672      the Label-Index TLV of the BGP Prefix SID Attribute [RFC8669].

674   This TLV provides functionality equivalent to the Label-Index TLV of
675   [RFC8669] for Transport CAR in SR-MPLS deployments. The BGP Prefix
676   SID Attribute SHOULD be omitted from the labeled color-aware routes
677   when the attribute is being used to only convey the Label Index TLV
678   for better BGP packing efficiency.

680   When a BGP Transport CAR speaker is propagating the route further
681   after setting itself as the nexthop, it allocates a local label for
682   the specific prefix and color combination. When the received update
683   has the Label Index TLV, it SHOULD use that hint to allocate the
684   local label from the SR Global Block (SRGB) using procedures as
685   specified in [RFC8669].

687 2.9.2.3. SRv6 SID TLV

689   BGP Transport CAR can also be used to set up end-to-end color-aware
690   connectivity using Segment Routing over IPv6 (SRv6) [RFC8402].
691   [I-D.ietf-spring-srv6-network-programming] specifies the SRv6
692   Endpoint behaviors (e.g. End PSP) which MAY be leveraged for BGP CAR
693   with SRv6. The SRv6 SID TLV is used for advertisement of color-aware
694   routes along with their SRv6 SIDs and has the following format:

696    0                   1                   2                   3
697    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
698   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
699   |     Type      |    Length     |   SRv6 SID Info (variable)   //
700   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

702   where:

704   o  Type : 3

706   o  Length: variable, MUST be either less than or equal to 16, or be a
707      multiple of 16

709   o  SRv6 SID Information: field whose size is indicated by the Length
710      field and that carries the SRv6 SID(s) for the advertised color-
711      aware route as one of the following:

713      *  A single 128-bit SRv6 SID or a stack of 128-bit SRv6 SIDs

715      *  A transposed portion (refer to [I-D.ietf-bess-srv6-services]) of
716         the SRv6 SID that MUST be of size in multiples of one octet and
717         less than 16.

719   The BGP color-aware route update for SRv6 MUST include the BGP
720   Prefix-SID attribute along with the TLV carrying the SRv6 SID
721   information as specified in [I-D.ietf-bess-srv6-services] when using
722   the transposition scheme of encoding for packing efficiency of BGP
723   updates.

725 2.9.3. Local-Color-Mapping (LCM) Extended Community

727   This document defines a new BGP Extended Community called "LCM". The
728   LCM is a Transitive Opaque Extended Community with the following
729   encoding:

731    0                   1                   2                   3
732    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
733   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
734   |   Type=0x3    | Sub-Type=TBD2 |           Reserved            |
735   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
736   |                             Color                             |
737   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

739   where:

741   o  Type: 0x3

743   o  Sub-Type: TBD2.

745   o  Reserved: 2-octet reserved field that MUST be set to zero on
746      transmission and ignored on reception.
748     o  Color: 4-octet field that carries the 32-bit color value.

750     When a CAR route crosses the originator color domain's boundary, the
751     LCM EC is added. The LCM EC conveys the local color mapping for the
752     intent (e.g. low latency) into transit or remote color domains.

754     The LCM EC MAY be used for filtering of BGP CAR routes and/or for
755     applying routing policies for the intent, when present.

757  2.10. Fault Handling

759     The fault management actions described in [RFC7606] are
760     applicable to the handling of BGP update messages for BGP-CAR.

762     When the error determined allows the router to skip the malformed
763     NLRI(s) and continue processing the rest of the update message,
764     it MUST handle such malformed NLRIs as 'Treat-as-withdraw'. In
765     other cases, where the error in the NLRI encoding results in the
766     inability to process the BGP update message (e.g. length related
767     encoding errors), the router SHOULD handle such malformed NLRIs
768     as 'AFI/SAFI disable' when other AFI/SAFIs besides BGP-CAR are being
769     advertised over the same session. Alternatively, the router MUST
770     perform 'session reset' when the session is only being used for
771     BGP-CAR.

773  3. Service route Automated Steering on Color-Aware path

775     E1 automatically steers a C-colored service route V/v from E2 onto an
776     (E2, C) color-aware path. If several such paths exist, a preference
777     scheme is used to select the best path, e.g. IGP Flex-Algo first,
778     then BGP CAR, then SR Policy.

780     This is consistent with the automated service route steering on SR
781     Policy (a routing solution providing color-aware paths) defined in
782     [I-D.ietf-spring-segment-routing-policy]. All the steering
783     variations defined in [I-D.ietf-spring-segment-routing-policy] are
784     applicable to BGP CAR color-aware paths: on-demand steering,
785     per-destination, per-flow, CO-only.
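
The preference scheme described in this section can be sketched in Python as follows. This is only an illustration under stated assumptions: the class `ColorAwarePath`, the function `select_color_aware_path`, and the source names in `PREFERENCE` are hypothetical and not defined by this document, the order shown is the example order from the text (other orders are a local policy choice), and returning `None` when no color-aware path exists is likewise an assumption.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

# Example preference order taken from the text; actual order is local policy.
PREFERENCE = ("igp-flex-algo", "bgp-car", "sr-policy")


@dataclass(frozen=True)
class ColorAwarePath:
    """Hypothetical record for one (E, C) color-aware path."""
    endpoint: str  # e.g. "E2"
    color: int     # e.g. 1 for C1
    source: str    # which solution provides the path, one of PREFERENCE


def select_color_aware_path(
    next_hop: str, color: int, paths: Iterable[ColorAwarePath]
) -> Optional[ColorAwarePath]:
    """Pick the most preferred (next_hop, color) path, or None if none exists."""
    candidates = [
        p for p in paths
        if p.endpoint == next_hop and p.color == color and p.source in PREFERENCE
    ]
    if not candidates:
        return None  # assumption: fallback behavior is out of scope here
    # A lower index in PREFERENCE means a more preferred source.
    return min(candidates, key=lambda p: PREFERENCE.index(p.source))


paths = [
    ColorAwarePath("E2", 1, "sr-policy"),
    ColorAwarePath("E2", 1, "bgp-car"),
    ColorAwarePath("E2", 2, "igp-flex-algo"),
]
best = select_color_aware_path("E2", 1, paths)
print(best.source)  # bgp-car
```

With these inputs, a C1-colored service route with next hop E2 is steered onto the BGP CAR path, because no IGP Flex-Algo path exists for (E2, C1) and BGP CAR is preferred over SR Policy in the example order.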
For brevity, in this revision, we
786     refer the reader to the [I-D.ietf-spring-segment-routing-policy]
787     text.

789     Salient property: Seamless integration of BGP CAR and SR Policy.

791     Appendix A provides illustrations of service route automated
792     steering.

794  4. Intents

796     The widely deployed color-aware path SR Policy solution demonstrates
797     that the following intents can easily be associated with a color:

799     1.  Minimization of a cost metric vs a latency metric

801         *  Minimization of different metric types, static and dynamic

803     2.  Exclusion/Inclusion of SRLG and/or Link Affinity and/or minimum
804         MTU/number of hops

806     3.  Bandwidth management

808     4.  In the inter-domain context, exclusion/inclusion of entire
809         domains, and border routers

811     5.  Inclusion of one or several virtual network function chains

813         *  Located in a regional domain and/or core domain, in a DC

815     6.  Localization of the virtual network function chains

817         *  Some functions may be desired in the regional DC or vice versa

819     7.  Per-Destination and Per-Flow steering

821     It is straightforward to note that the BGP CAR color-aware
822     alternative supports intents 1, 2, 4 and 7.

824     Future revisions of this document will analyze BGP CAR support
825     for intents 3, 5 and 6.

827  5. (E, C) Subscription and Filtering

829     This section defines an (E, C) BGP subscription model that allows
830     filtering of the (E, C) routes learned by a BGP CAR node.

832  5.1.
Illustration 834 E1-----------------A-------------------B-------------------E2 835 <--- (E2, C1) ---- 836 -- F (E2, C1) --> --- F (E2, C1) --> 837 | | 838 <-- (E2, C1) ---- <--- (E2, C1) ---- 840 o BGP CAR route (E2, C1) advertised by E2 is not unconditionally 841 distributed beyond a certain point (e.g., B) 843 o E1 subscribes to (E2, C1) by advertising a filter route F (E2, C1) 844 to its upstream peer A 846 o If A has (E2, C1) in its BGP RIB, it will advertise (E2, C1) to E1 848 o If A does not have (E2, C1), it will advertise F (E2, C1) to its 849 peer B 851 o B will advertise (E2, C1) to A, which will distribute it to E1 852 E1 may trigger a subscription for BGP CAR route (E2, C1) as a result 853 of receiving a C1-colored service route V/v from E2, for on-demand 854 steering via (E2, C1). 856 5.2. Definition 858 future version of this document 860 6. Scaling 862 This section analyses the key scale requirement of [ref:dskc-bess- 863 bgp-car-problem-statement], specifically: 865 o No intermediate node dataplane should need to scale to (Colors * 866 PEs) 868 o No node should learn and install a BGP CAR route to (E,C) if it 869 does not install a Colored service route to E 871 Figure 2 provides an ultra-scale reference topology. Section 6.2 872 presents three design models to deploy BGP CAR in the reference 873 topology. Section 6.3 analyses the scaling properties of each model. 874 Section 6.4 illustrates the scaling benefits of the (E, C) BGP 875 subscription and filtering. 877 6.1. Ultra-Scale Reference Topology 878 RD:V/v via E2 879 +-----+ +-----+ vpn label:30030 +-----+ 880 ....... |S-RR1| <........... |S-RR2| <...............|S-RR3| <...... 
881 : +-----+ +-----+ Color C1 +-----+ : 882 : : 883 : : 884 : : 885 +:------------+--------------+--------------+--------------+--------:-+ 886 |: | | | | : | 887 |: | | | | : | 888 |: +---+ +---+ +---+ +---+ : | 889 |: |121| |231| |341| |451| : | 890 |: +---+ +---+ +---+ +---+ : | 891 |---+ | | | | +---| 892 | E1| | | | | | E2| 893 |---+ | | | | +---| 894 | +---+ +---+ +---+ +---+ | 895 | |122| |232| |342| |452| | 896 | +---+ +---+ +---+ +---+ | 897 | Access | Metro | Core | Metro | Access | 898 | domain 1 | domain 2 | domain 3 | domain 4 | domain 5 | 899 +-------------+--------------+--------------+--------------+----------+ 900 iPE iBRM iBRC eBRC eBRM ePE 902 Figure 2: Ultra-Scale Reference Topology 904 The following applies to the reference topology above: 906 o Independent ISIS/OSPF SR instance in each domain. 908 o Each domain has Flex Algo 128. Prefix SID for a node is SRGB 909 168000 plus node number. 911 o A BGP CAR route (E2, C1) is advertised by egress BRM node 451.The 912 route is sourced locally from redistribution from IGP-FA 128. 914 o Not shown for simplicity, node 452 will also advertise (E2, C1). 916 o When a transport RR is used within the domain or across domains, 917 ADD-PATH is enabled to advertise paths from both egress BRs to 918 it's clients. 920 o Egress PE E2 advertises a VPN route RD:V/v with BGP Color extended 921 community C1 that propagates via service RRs to ingress PE E1. 923 o E1 steers V/v prefix via color-aware path (E2,C1) and VPN label 924 30030 926 6.2. Deployment model 928 6.2.1. Flat 930 RD:V/v via E2 931 +-----+ +-----+ vpn label:30030 +-----+ 932 ....... |S-RR1| <........... |S-RR2| <...............|S-RR3| <...... 
933 : +-----+ +-----+ Color C1 +-----+ : 934 : : 935 : : 936 : : 937 +:------------+--------------+--------------+--------------+--------:-+ 938 |: | | | | : | 939 |: | (E2,C1) | (E2,C1) | (E2,C1) | : | 940 |: +---+ via 231 +---+ via 341 +---+ via 451 +---+ : | 941 |:(E2,C1) |121|<---------|231|<---------|341|<---------|451| : | 942 |: via 121 /+---+ L=168002 +---+ L=168002 +---+ L=168002 +---+ : | 943 |---+ / | | | | +---| 944 | E1| <--/ | | | | | E2| 945 |---+ L=168002| | | | +---| 946 | +---+ +---+ +---+ +---+ | 947 | |122| |232| |342| |452| | 948 | +---+ +---+ +---+ +---+ | 949 | Access | Metro | Core | Metro | Access | 950 | domain 1 | domain 2 | domain 3 | domain 4 | domain 5 | 951 +-------------+--------------+--------------+--------------+----------+ 952 iPE iBRM iBRC eBRC eBRM ePE 954 168121 168231 168341 168451 955 168002 168002 168002 168002 168002 956 30030 30030 30030 30030 30030 30030 958 Figure 3 960 1. Node 451 advertises BGP CAR route (E2, C1) to 341, from which it 961 goes to 231 then to 121 and finally to E1 963 2. Each BGP hop allocates local label and programs swap entry in 964 forwarding for (E2, C1) 966 3. E1 receives BGP CAR route (E2, C1) via 121 with label 168002 968 1. Let's assume E1 selects that path 970 4. E1 resolves BGP CAR route (E2, C1) via 121 on color-aware path 971 (121, C1) 972 1. Color-aware path (121, C1) is FA128 path to 121 (label 973 168121) 975 5. E1's imposition color-aware label-stack for V/v is thus 977 1. 30030 <=> V/v 979 2. 168002 <=> (E2, C1) 981 3. 168121 <=> (121, C1) 983 6. Each BGP hop performs swap operation on 168002 bound to color- 984 aware path (E2,C1) 986 6.2.2. 
Hierarchical Design with next-hop-self at ingress domain BR 988 (E2,C1) 989 +-----+ via 451 +-----+ 990 |T-RR1| <-------------- |T-RR2| 991 / +-----+ L=168002 +-----+\ 992 / \ 993 +-------------+---/----------+--------------+-----------\--+----------+ 994 | | / | | \ | | 995 | (E2,C1) | / (451,C1) | (451,C1) | \| | 996 | via 121 +---+ via 231 +---+ via 341 +---+ +---+ | 997 | L=168002 |121| <======= |231| <========|341| <======= |451| | 998 | / +---+ L=168451 +---+ L=168451 +---+ +---+ | 999 |---+ / | | | | +---| 1000 | E1|<--/ | | | | | E2| 1001 |---+ | | | | +---| 1002 | +---+ +---+ +---+ +---+ | 1003 | |122| |232| |342| |452| | 1004 | +---+ +---+ +---+ +---+ | 1005 | Access | Metro | Core | Metro | Access | 1006 | domain 1 | domain 2 | domain 3 | domain 4 | domain 5 | 1007 +-------------+--------------+--------------+--------------+----------+ 1008 iPE iBRM iBRC eBRC eBRM ePE 1010 168231 168341 1011 168121 168451 168451 168451 1012 168002 168002 168002 168002 168002 1013 30030 30030 30030 30030 30030 30030 1015 Figure 4: Heirarchical BGP transport CAR, NHS at iBR 1017 1. Node 451 advertises BGP CAR route (451, C1) to 341, from which 1018 it goes to 231 and finally to 121 1020 2. Each BGP hop allocates local label and programs swap entry in 1021 forwarding for (451, C1) 1023 3. 121 resolves received BGP CAR route (451, C1) via 231 (label 1024 168451) on color-aware path (231, C1) 1026 1. Color-aware path (231, C1) is FA128 path to 231 (label 1027 168231) 1029 4. 451 advertises BGP CAR route (E2, C1) via 451 to Transport RR 1030 T-RR2, which reflects it to T-RR1, which reflects it to 121 1032 5. 121 receives BGP CAR route (E2, C1) via 451 with label 168002 1034 1. Let's assume 121 selects that path 1036 6. 121 resolves BGP CAR route (E2, C1) via 451 on color-aware path 1037 (451, C1) 1039 1. Color-aware path (451, C1) is BGP CAR path to 451 (label 1040 168451) 1042 7. 121 imposition of color-aware label stack for (E2, C1) is thus 1044 1. 
168002 <=> (E2, C1) 1046 2. 168451 <=> (451, C1) 1048 3. 168231 <=> (231, C1) 1050 8. 121 advertises (E2, C1) to E1 with next hop self (121) and label 1051 168002 1053 9. E1 constructs same imposition color-aware label-stack for V/v 1054 via (E2, C1) as in the flat model: 1056 1. 30030 <=> V/v 1058 2. 168002 <=> (E2, C1) 1060 3. 168121 <=> (121, C1) 1062 10. 121 performs swap operation on 168002 with hierarchical color- 1063 aware label stack for (E2, C1) via 451 from step 7 1065 11. Nodes 231 and 341 perform swap operation on 168451 bound to 1066 color-aware path (451, C1) 1068 12. 451 performs swap operation on 168002 bound to color-aware path 1069 (E2, C1) 1071 Note: E1 does not need the BGP CAR (451, C1) route 1073 6.2.3. Hierarchical Design with Next Hop Unchanged at ingress domain BR 1075 (E2,C1) 1076 +-----+ via 451 +-----+ 1077 |T-RR1| <-------------- |T-RR2| 1078 / +-----+ L=168002 +-----+\ 1079 / \ 1080 +-------------+---/----------+--------------+-----------\--+----------+ 1081 | | / | | \ | | 1082 | (E2,C1) | / (451,C1) | (451,C1) | \| | 1083 | via 451 +---+ via 231 +---+ via 341 +---+ +---+ | 1084 | L=168002/|121| <======= |231| <========|341| <======= |451| | 1085 | / +---+ L=168451 +---+ L=168451 +---+ +---+ | 1086 |---+ <--/ //| | | | +---| 1087 | E1| // | | | | | E2| 1088 |---+ <===// | | | | +---| 1089 | (451,C1) +---+ +---+ +---+ +---+ | 1090 | via 121 |122| |232| |342| |452| | 1091 | L=168451 +---+ +---+ +---+ +---+ | 1092 | | | | | | 1093 | Access | Metro | Core | Metro | Access | 1094 | domain 1 | domain 2 | domain 3 | domain 4 | domain 5 | 1095 +-------------+--------------+--------------+--------------+----------+ 1096 iPE iBRM iBRC eBRC eBRM ePE 1098 168121 168231 168341 1099 168451 168451 168451 168451 1100 168002 168002 168002 168002 168002 1101 30030 30030 30030 30030 30030 30030 1103 Figure 5: Heirarchical BGP transport CAR, NHU at iBR 1105 1. 
Nodes 341, 231 and 121 receive and resolve BGP CAR route (451, 1106 C1) the same as in the previous model 1108 2. Node 121 allocates local label and programs swap entry in 1109 forwarding for (451, C1) 1111 3. 451 advertises BGP CAR route (E2, C1) to Transport RR T-RR2, 1112 which reflects it to T-RR1, which reflects it to 121 1114 4. Node 121 advertises (E2, C1) to E1 with next hop as 451 i.e. 1115 next-hop unchanged 1117 5. 121 also advertises (451, C1) to E1 with next hop self (121) and 1118 label 168451 1120 6. E1 resolves BGP CAR route (451, C1) via 121 on color-aware path 1121 (121, C1) 1123 1. Color-aware path (121, C1) is FA128 path to 121 (label 1124 168121) 1126 7. E1 receives BGP CAR route (E2, C1) via 451 with label 168002 1128 1. Let's assume E1 selects that path 1130 8. E1 resolves BGP CAR route (E2, C1) via 451 on color-aware path 1131 (451, C1) 1133 1. Color-aware path (451, C1) is BGP CAR path to 451 (label 1134 168451) 1136 9. E1's imposition color-aware label-stack for V/v is thus 1138 1. 30030 <=> V/v 1140 2. 168002 <=> (E2, C1) 1142 3. 168451 <=> (451, C1) 1144 4. 168121 <=> (121, C1) 1146 10. Nodes 121, 231 and 341 perform swap operation on 168451 bound to 1147 (451, C1) 1149 11. 451 performs swap operation on 168002 bound to color-aware path 1150 (E2, C1) 1152 6.3. 
Scale Analysis 1154 The following two tables summarize the control-plane and dataplane 1155 scale of these three models: 1157 | E1 | 121 | 231 1158 -----+---------------------+---------------------+-------------------- 1159 FLAT | (E2,C) via (121,C) | (E2,C) via (231,C) | (E2,C) via (341,C) 1160 -----+---------------------+---------------------+-------------------- 1161 H.NHS| (E2,C) via (121,C) | (E2,C) via (451,C) | 1162 | | (451,C) via (231,C) | (451,C) via (341,C) 1163 -----+---------------------+---------------------+-------------------- 1164 H.NHU| (E2,C) via (451,C) | | 1165 | (451,C) via (121,C) | (451,C) via (231,C) | (451,C) via (341,C) 1166 -----+---------------------+---------------------+-------------------- 1168 | E1 | 121 | 231 1169 -----+---------------------+---------------------+-------------------- 1170 FLAT | V -> 30030 | 168002 -> 168002 | 168002 -> 168002 1171 | 168002 | 168231 | 168341 1172 | 168121 | | 1173 -----+---------------------+---------------------+-------------------- 1174 H.NHS| V -> 30030 | 168002 -> 168002 | 168451 -> 168451 1175 | 168002 | 168451 | 168341 1176 | 168121 | 168231 | 1177 -----+---------------------+---------------------+-------------------- 1178 H.NHU| V -> 30030 | 168451 -> 168451 | 168451 -> 168451 1179 | 168002 | 168231 | 168341 1180 | 168451 | | 1181 | 168121 | | 1182 -----+---------------------+---------------------+-------------------- 1184 o The flat model is the simplest design, with a single BGP transport 1185 level. It results in the minimum label/SID stack at each BGP hop. 1186 However, it significantly increases the scale impact on the core 1187 BRs (e.g. 341), whose FIB capacity and even MPLS label space may 1188 be exceeded. 1190 * 341's dataplane scales with (E2,C) where there may be 300k E's 1191 and 5 C's hence 1.5M entries > 1M MPLS dataplane 1193 o The hierarchical models avoid the need for core BRs to learn 1194 routes and install label forwarding entries for (E, C) routes. 
1196 * Whether NH self or unchanged at 121, 341's dataplane scales 1197 with (451,C) where there may be thousands of 451's and 5 C's 1198 hence well under the 1M MPLS dataplane 1200 o The next-hop-self option at ingress BRM (e.g. 121) hides the 1201 hierarchical design from the ingress PE, keeping its outgoing 1202 label programming as simple as the flat model. However, the 1203 ingress BRM requires an additional BGP transport level recursion, 1204 which coupled with load-balancing adds dataplane complexity. It 1205 needs to support a swap and push operation. It also needs to 1206 install label forwarding entries for the egress PEs that are of 1207 interest to its local ingress PEs. 1209 o With the next-hop-unchanged option at ingress BRM (e.g. 121), only 1210 an ingress PE needs to learn and install output label entries for 1211 egress (E, C) routes. The ingress BRM only installs label 1212 forwarding entries for the egress ABR (e.g. 451). However, the 1213 ingress PE needs an additional BGP transport level recursion and 1214 pushes a BGP VPN label and two BGP transport labels. It may also 1215 need to handle load-balancing for the egress ABRs. This is the 1216 most complex dataplane option for the ingress PE. 1218 6.4. 
Scaling Benefits of the (E, C) BGP Subscription and Filtering

1220     The (E, C) subscription scheme from Section 5 provides the following
1221     scaling benefits for the models in Section 6.2:

1223     o  An ingress PE (E1) only learns (E, C) routes that it needs to
1224        install into the data plane for service route automated steering

1226     o  An ingress BRM (121) only learns (E, C) routes that it needs to
1227        install into the data plane (for Next-Hop-Self), or that it needs
1228        to distribute towards its ingress PEs (inline RR with
1229        Next-Hop-Unchanged)

1231     o  An ingress BRM or a transport RR only needs to distribute the
1232        necessary subset of (E, C) routes to each client (subscriber);
1233        this minimizes their processing load for generating updates

1235     o  As a result, withdrawal of (E, C) routes when a remote node (E2)
1236        fails may also be faster, aiding better convergence

1238  6.5. Anycast SID

1240     This section describes how Anycast SID complements and improves the
1241     scaling designs above.

1243  6.5.1. Anycast SID for transit inter-domain nodes

1245     o  Redundant BRs (e.g. two egress BRMs, 451 and 452) advertise BGP
1246        CAR routes for a local PE (e.g., E2) with the same SID (based on
1247        label-index). Such egress BRMs may be assigned a common Anycast
1248        SID, so that the BGP next-hops for these routes will also resolve
1249        via a color-aware path to the Anycast SID.

1251     o  The use of Anycast SID naturally provides fast local convergence
1252        upon failure of an egress BRM node. In addition, it decreases the
1253        recursive resolution and load-balancing complexity at an ingress
1254        BRM or PE in the hierarchical designs above.

1256  6.5.2. Anycast SID for transport color endpoints (e.g., PEs)

1258     The common Anycast SID technique may also be used for a redundant
1259     pair of PEs that share an identical set of service (VPN) attachments.

1261     o  For example, assume a node E2' paired with E2 above.
Both PEs 1262 should be configured with the same static label/SID for the 1263 services (e.g., per-VRF VPN label/SID), and will advertise 1264 associated service routes with the Anycast IP as BGP next-hop. 1266 o This design provides a convergence and recursive resolution 1267 benefit on an ingress PE or ABR similar to the egress ABR case 1268 above. 1270 7. Routing Convergence 1272 This section will analyze routing convergence. 1274 8. VPN CAR 1276 This section illustrates the extension of BGP CAR to address the VPN 1277 CAR requirement stated in Section 3.2 of [dskc-bess-bgp-car-problem- 1278 statement]. 1280 CE1 -------------- PE1 -------------------- PE2 -------------- CE2 - V 1282 o BGP CAR is enabled between CE1-PE1 and PE2-CE2 1284 o BGP VPN CAR is enabled between PE1 and PE2 1286 o Provider publishes intent 'low-delay' is mapped to color CP on its 1287 inbound peering links 1289 o Within its infrastructure, Provider maps intent 'low-delay' to 1290 color CPT 1292 o On CE1 and CE2, intent 'low-delay' is mapped to CC 1294 (V, CC) is a Color-Aware route originated by CE2 1295 1. CE2 sends to PE2 : [(V, CC), Label L1] via CE2 with LCM (CP) 1296 2. PE2 installs in VRF A: [(V, CC), L1] via CE2 which resolves on (CE2, CP) 1297 / connected OIF 1298 2.a. PE2 allocates VPN Label L2 and programs swap entry for (V, CC) 1299 3. PE2 sends to PE1 : [(RD, V, CC), L2] via PE2 with regular Color Extended 1300 Community (CPT) 1301 4. PE1 installs in VRF A: [(V, CC), L2] via (PE2, CPT) steered on (PE2, CPT) 1302 4.a. PE1 allocates Label L3 and programs swap entry for (V, CC) 1303 5. PE1 sends to CE1 : [(V, CC), L3] via PE1 without any LCM 1304 6. CE1 installs : [(V, CC), L3] via PE1 which resolves on (PE1, CC) 1305 / connected OIF 1306 6.a. 
Label L3 is installed as the imposition label for (V, CC)

1308     VPN CAR distribution for (RD, V, CC) requires a new SAFI that follows
1309     the same VPN semantics as defined in [RFC4364], the difference being
1310     that the advertised routes carry the CAR NLRI defined in Section
1311     2.9.2 of this document.

1313     The VPN CAR NLRI with RD has the format shown below:

1315      0                   1                   2                   3
1316      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
1317     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1318     |  NLRI Length  |  Key Length   |   NLRI Type   | Prefix Length |
1319     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1320     |                      Route Distinguisher                      |
1321     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1322     |                      Route Distinguisher                      |
1323     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1324     |                     IP Prefix (variable)                     //
1325     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1326     |                       Color (4 octets)                        |
1327     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

1329     Followed by optional TLVs encoded as below:

1331     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1332     |     Type      |    Length     |       Value (variable)       //
1333     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

1335     where:

1337     Route Distinguisher: 8-octet field encoded according to [RFC4364]

1339  9. IANA Considerations

1341     IANA is requested to assign SAFI value TBD1 (BGP CAR) and SAFI value
1342     TBD2 (BGP VPN CAR) from the "SAFI Values" sub-registry under the
1343     "Subsequent Address Family Identifiers (SAFI) Parameters" registry
1344     with this document as a reference.

1346  9.1. BGP CAR NLRI Types Registry

1348     IANA is requested to create a "BGP CAR NLRI Types" sub-registry under
1349     the "Border Gateway Protocol (BGP) Parameters" registry with this
1350     document as a reference.
The registry is for assignment of the
1351     one-octet code points for BGP CAR NLRI types and is populated with
1352     the values shown below:

1354     Type     NLRI Type                             Reference
1355     -----------------------------------------------------------------
1356     0        Reserved (not to be used)             [This document]
1357     1        Color-Aware Routes NLRI               [This document]
1358     2-255    Unassigned

1360     Allocations within the registry are to be made under the
1361     "Specification Required" policy as specified in [RFC8126].

1363  9.2. BGP CAR NLRI TLV Registry

1365     IANA is requested to create a "BGP CAR NLRI TLV Types" sub-registry
1366     under the "Border Gateway Protocol (BGP) Parameters" registry with
1367     this document as a reference. The registry is for assignment of the
1368     one-octet code points for BGP-CAR NLRI non-key TLV types and is
1369     populated with the values shown below:

1371     Type     TLV Type                              Reference
1372     -----------------------------------------------------------------
1373     0        Reserved (not to be used)             [This document]
1374     1        Label TLV                             [This document]
1375     2        Label Index TLV                       [This document]
1376     3        SRv6 SID TLV                          [This document]
1377     4-255    Unassigned

1379     Allocations within the registry are to be made under the
1380     "Specification Required" policy as specified in [RFC8126].

1382  9.3. Guidance for Designated Experts

1384     In all cases of review by the Designated Expert (DE) described here,
1385     the DE is expected to ascertain the existence of suitable
1386     documentation (a specification) as described in [RFC8126]. The DE is
1387     also expected to check the clarity of purpose and use of the
1388     requested code points. Additionally, the DE must verify that any
1389     request for one of these code points has been made available for
1390     review and comment within the IETF: the DE will post the request to
1391     the IDR Working Group mailing list (or a successor mailing list
1392     designated by the IESG). If the request comes from within the IETF,
1393     it should be documented in an Internet-Draft.
Lastly, the DE must
1394     ensure that any other request for a code point does not conflict with
1395     work that is active or already published within the IETF.

1397  9.4. BGP Extended Community Registry

1399     IANA is requested to allocate the sub-type TBD2 for "Local Color
1400     Mapping (LCM)" under the "BGP Transitive Opaque Extended Community"
1401     registry under the "BGP Extended Community" parameter registry.

1403  10. Acknowledgements

1405     The authors would like to acknowledge the review and inputs from many
1406     people. TBD

1408  11. References

1410  11.1. Normative References

1412     [I-D.ietf-bess-srv6-services]
1413                Dawra, G., Filsfils, C., Talaulikar, K., Raszuk, R.,
1414                Decraene, B., Zhuang, S., and J. Rabadan, "SRv6 BGP based
1415                Overlay Services", draft-ietf-bess-srv6-services-07 (work
1416                in progress), April 2021.

1418     [I-D.ietf-idr-bgp-ipv6-rt-constrain]
1419                Patel, K., Raszuk, R., Djernaes, M., Dong, J., and M.
1420                Chen, "IPv6 Extensions for Route Target Distribution",
1421                draft-ietf-idr-bgp-ipv6-rt-constrain-12 (work in
1422                progress), April 2018.

1424     [I-D.ietf-idr-tunnel-encaps]
1425                Patel, K., Velde, G. V. D., Sangli, S. R., and J. Scudder,
1426                "The BGP Tunnel Encapsulation Attribute",
1427                draft-ietf-idr-tunnel-encaps-22 (work in progress), January 2021.

1429     [I-D.ietf-lsr-flex-algo]
1430                Psenak, P., Hegde, S., Filsfils, C., Talaulikar, K., and
1431                A. Gulko, "IGP Flexible Algorithm",
1432                draft-ietf-lsr-flex-algo-15 (work in progress), April 2021.

1434     [I-D.ietf-spring-segment-routing-policy]
1435                Filsfils, C., Talaulikar, K., Voyer, D., Bogdanov, A., and
1436                P. Mattes, "Segment Routing Policy Architecture",
1437                draft-ietf-spring-segment-routing-policy-11 (work in progress),
1438                April 2021.

1440     [I-D.ietf-spring-srv6-network-programming]
1441                Filsfils, C., Garvia, P. C., Leddy, J., Voyer, D.,
1442                Matsushima, S., and Z.
Li, "Segment Routing over IPv6 1443 (SRv6) Network Programming", draft-ietf-spring-srv6- 1444 network-programming-28 (work in progress), December 2020. 1446 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 1447 Requirement Levels", BCP 14, RFC 2119, 1448 DOI 10.17487/RFC2119, March 1997, 1449 . 1451 [RFC4360] Sangli, S., Tappan, D., and Y. Rekhter, "BGP Extended 1452 Communities Attribute", RFC 4360, DOI 10.17487/RFC4360, 1453 February 2006, . 1455 [RFC4684] Marques, P., Bonica, R., Fang, L., Martini, L., Raszuk, 1456 R., Patel, K., and J. Guichard, "Constrained Route 1457 Distribution for Border Gateway Protocol/MultiProtocol 1458 Label Switching (BGP/MPLS) Internet Protocol (IP) Virtual 1459 Private Networks (VPNs)", RFC 4684, DOI 10.17487/RFC4684, 1460 November 2006, . 1462 [RFC4760] Bates, T., Chandra, R., Katz, D., and Y. Rekhter, 1463 "Multiprotocol Extensions for BGP-4", RFC 4760, 1464 DOI 10.17487/RFC4760, January 2007, 1465 . 1467 [RFC5512] Mohapatra, P. and E. Rosen, "The BGP Encapsulation 1468 Subsequent Address Family Identifier (SAFI) and the BGP 1469 Tunnel Encapsulation Attribute", RFC 5512, 1470 DOI 10.17487/RFC5512, April 2009, 1471 . 1473 [RFC5701] Rekhter, Y., "IPv6 Address Specific BGP Extended Community 1474 Attribute", RFC 5701, DOI 10.17487/RFC5701, November 2009, 1475 . 1477 [RFC7311] Mohapatra, P., Fernando, R., Rosen, E., and J. Uttaro, 1478 "The Accumulated IGP Metric Attribute for BGP", RFC 7311, 1479 DOI 10.17487/RFC7311, August 2014, 1480 . 1482 [RFC7606] Chen, E., Ed., Scudder, J., Ed., Mohapatra, P., and K. 1483 Patel, "Revised Error Handling for BGP UPDATE Messages", 1484 RFC 7606, DOI 10.17487/RFC7606, August 2015, 1485 . 1487 [RFC8126] Cotton, M., Leiba, B., and T. Narten, "Guidelines for 1488 Writing an IANA Considerations Section in RFCs", BCP 26, 1489 RFC 8126, DOI 10.17487/RFC8126, June 2017, 1490 . 
1492 [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 1493 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, 1494 May 2017, . 1496 [RFC8277] Rosen, E., "Using BGP to Bind MPLS Labels to Address 1497 Prefixes", RFC 8277, DOI 10.17487/RFC8277, October 2017, 1498 . 1500 [RFC8402] Filsfils, C., Ed., Previdi, S., Ed., Ginsberg, L., 1501 Decraene, B., Litkowski, S., and R. Shakir, "Segment 1502 Routing Architecture", RFC 8402, DOI 10.17487/RFC8402, 1503 July 2018, . 1505 [RFC8669] Previdi, S., Filsfils, C., Lindem, A., Ed., Sreekantiah, 1506 A., and H. Gredler, "Segment Routing Prefix Segment 1507 Identifier Extensions for BGP", RFC 8669, 1508 DOI 10.17487/RFC8669, December 2019, 1509 . 1511 11.2. Informative References 1513 [I-D.ietf-mpls-seamless-mpls] 1514 Leymann, N., Decraene, B., Filsfils, C., Konstantynowicz, 1515 M., and D. Steinberg, "Seamless MPLS Architecture", draft- 1516 ietf-mpls-seamless-mpls-07 (work in progress), June 2014. 1518 [RFC3906] Shen, N. and H. Smit, "Calculating Interior Gateway 1519 Protocol (IGP) Routes Over Traffic Engineering Tunnels", 1520 RFC 3906, DOI 10.17487/RFC3906, October 2004, 1521 . 1523 [RFC4271] Rekhter, Y., Ed., Li, T., Ed., and S. Hares, Ed., "A 1524 Border Gateway Protocol 4 (BGP-4)", RFC 4271, 1525 DOI 10.17487/RFC4271, January 2006, 1526 . 1528 [RFC4272] Murphy, S., "BGP Security Vulnerabilities Analysis", 1529 RFC 4272, DOI 10.17487/RFC4272, January 2006, 1530 . 1532 [RFC4364] Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private 1533 Networks (VPNs)", RFC 4364, DOI 10.17487/RFC4364, February 1534 2006, . 1536 [RFC6952] Jethanandani, M., Patel, K., and L. Zheng, "Analysis of 1537 BGP, LDP, PCEP, and MSDP Issues According to the Keying 1538 and Authentication for Routing Protocols (KARP) Design 1539 Guide", RFC 6952, DOI 10.17487/RFC6952, May 2013, 1540 . 1542 [RFC7911] Walton, D., Retana, A., Chen, E., and J. 
Scudder, 1543 "Advertisement of Multiple Paths in BGP", RFC 7911, 1544 DOI 10.17487/RFC7911, July 2016, 1545 . 1547 Appendix A. Illustrations of Service Steering 1549 The following sub-sections illustrate example scenarios of Colored 1550 Service Route Steering over E2E BGP CAR resolving over different 1551 intra-domain mechanisms 1553 The examples use MPLS/SR for the transport data plane. Scenarios 1554 specific to other encapsulations will be added in subsequent 1555 versions. 1557 A.1. E2E BGP transport CAR intent realized using IGP FA 1558 RD:V/v via E2 1559 +-----+ vpn label: 30030 +-----+ 1560 ...... |S-RR1| <..................................|S-RR2| <....... 1561 : +-----+ Color C1 +-----+ : 1562 : : 1563 : : 1564 : : 1565 +-:-----------------------+----------------------+------------------:--+ 1566 | : | | : | 1567 | : | | : | 1568 | : (E2,C1) via 121 | (E2,C1) via 231 | (E2,C1)via E2 : | 1569 | : L=168002,AIGP=110 +---+ L=168002,AIGP=10 +---+ L=0x3,LI=8002 : | 1570 | : |-------------------|121|<-----------------|231|<-------------| : | 1571 | : V LI=8002 +---+ LI=8002 +---+ | : | 1572 |----+ | | +-----| 1573 | E1 | | | | E2 | 1574 |----+(E2,C1) via 122 | (E2,C1) via 232 | (E2,C1)via E2+-----| 1575 | ^ L=168002,AIGP=210 +---+ L=168002,AIGP=20 +---+ L=0x3 | | 1576 | |---------------- |122|<-----------------|232|<-------------| | 1577 | LI=8002 +---+ LI=8002 +---+ LI=8002 | 1578 | | | | 1579 | ISIS SR | ISIS SR | ISIS SR | 1580 | FA 128 | FA 128 | FA 128 | 1581 +-------------------------+----------------------+---------------------+ 1582 iPE iABR eABR ePE 1584 +------+ +------+ 1585 |168121| |168231| 1586 +------+ +------+ 1587 +------+ +------+ +------+ 1588 |168002| |168002| |168002| 1589 +------+ +------+ +------+ 1590 +------+ +------+ +------+ 1591 |30030 | |30030 | |30030 | 1592 +------+ +------+ +------+ 1594 Figure 6: BGP FA Aware transport CAR path 1596 Use case: Provide end to end intent for service flows. 
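
As a rough illustration of the label stacks shown in Figure 6, the following Python sketch derives labels from label indexes and builds the stack that E1 imposes. Several values are assumptions rather than statements of this document: the SRGB base of 160000 is inferred from the figure (label 168002 advertised with LI=8002), the FA 128 label index 8121 for node 121 is inferred from label 168121, the block size is arbitrary, and the function names are hypothetical.

```python
# Illustrative sketch only; names and SRGB values are assumptions.
SRGB_BASE = 160000  # assumed: inferred from L=168002 with LI=8002 in Figure 6
SRGB_SIZE = 10000   # assumed block size, chosen only to cover the example


def label_from_index(label_index, base=SRGB_BASE, size=SRGB_SIZE):
    """Map a label index to a label in the SRGB (base + index)."""
    if not 0 <= label_index < size:
        raise ValueError("label index outside the SRGB")
    return base + label_index


def e1_label_stack(vpn_label, car_label_index, nh_fa_label_index):
    """Return the imposition stack at E1, top of stack first.

    Order: FA 128 label to the BGP next hop, BGP CAR label for (E2, C1),
    then the VPN service label.
    """
    return [
        label_from_index(nh_fa_label_index),
        label_from_index(car_label_index),
        vpn_label,
    ]


stack = e1_label_stack(vpn_label=30030, car_label_index=8002,
                       nh_fa_label_index=8121)
print(stack)  # [168121, 168002, 30030]
```

The result matches the leftmost stack in Figure 6: the top label changes per domain (168121, then 168231), while the BGP CAR label 168002 and the VPN label 30030 are carried end to end.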
1598 o With reference to the topology above: 1600 * IGP FA 128 is running in each domain. 1602 * Egress PE E2 advertises a VPN route RD:V/v colored with (color 1603 extended community) C1 to steer traffic to BGP transport CAR 1604 (E2, C1). VPN route propagates via service RRs to ingress PE 1605 E1. 1607 * BGP CAR route (E2, C1) with next-hop, label-index and label as 1608 shown above are advertised through border routers in each 1609 domain. When a RR is used in the domain, ADD-PATH is enabled 1610 to advertise multiple available paths. 1612 * Local policy on each hop maps intent C1 to resolve CAR route 1613 next-hop over IGP FA 128 of the domain. AIGP attribute 1614 influences BGP CAR route best path decision as per [RFC7311]. 1615 BGP CAR label swap entry is installed that goes over FA 128 LSP 1616 to next-hop providing intent in each IGP domain. Update AIGP 1617 metric to reflect FA 128 metric to next-hop. 1619 * Ingress PE E1 learns CAR route (E2, C1). It steers colored VPN 1620 route RD:V/v into (E2, C1) 1622 o Important: 1624 * IGP FA 128 top label provides intent in each domain. 1626 * BGP CAR label (e.g. 168002) carries end to end intent. Thus 1627 stitches intent over intra domain FA 128. 1629 A.2. E2E BGP transport CAR intent realized using SR Policy 1630 RD:1/8 via E2 1631 +-----+ vpn label: 30030 +-----+ 1632 ...... |S-RR1| <..................................|S-RR2| <...... 
1633   :      +-----+              Color C1              +-----+         :
1634   :                                                                 :
1635   :                                                                 :
1636   :                                                                 :
1637 +-:-----------------------+----------------------+------------------:-+
1638 | :                       |                      |                  : |
1639 | :                       |                      |                  : |
1640 | : <-(E2,C1) via 121     | <-(E2,C1) via 231    | <-(E2,C1)via E2  : |
1641 | :                     +---+                  +---+                : |
1642 | :  ------------------>|121|----------------->|231|--------------| : |
1643 | : | SR policy(C1,121) +---+ SR policy(C1,231)+---+ SR policy    v : |
1644 |----+                    |                      |      (C1,E2)   +---|
1645 | E1 |                    |                      |                |E2 |
1646 |----+ <-(E2,C1) via 122  |   (E2,C1) via 232    | <-(E2,C1)via E2+---|
1647 |   |                   +---+                  +---+               ^  |
1648 |    ------------------>|122|----------------->|232|---------------|  |
1649 |     SR policy(C1,122) +---+ SR policy(C1,232)+---+ SR policy(C1,E2) |
1650 |                         |                      |                    |
1651 |                         |                      |                    |
1652 |        ISIS SR          |       ISIS SR        |      ISIS SR       |
1653 +-------------------------+----------------------+--------------------+
1654  iPE                    iABR                   eABR                ePE

1656              Figure 7: BGP SR policy Aware transport CAR path

1658    Use case: Provide end to end intent for service flows.

1660    o  With reference to the topology above:

1662       *  SR Policy provides intra-domain intent.

1664       *  Egress PE E2 advertises a VPN route RD:V/v colored with (color
1665          extended community) C1 to steer traffic to BGP transport CAR
1666          (E2, C1).  The VPN route propagates via service RRs to ingress PE
1667          E1.

1669       *  BGP CAR route (E2, C1) with next-hop, label-index and label as
1670          shown above is advertised through border routers in each
1671          domain.  When an RR is used in the domain, ADD-PATH is enabled
1672          to advertise multiple available paths.

1674       *  Local policy on each hop maps intent C1 to resolve CAR route
1675          next-hop over an SR policy(C1, next-hop).  A BGP CAR label swap
1676          entry is installed that goes over the SR policy segment list.

1678       *  Ingress PE E1 learns CAR route (E2, C1).  It steers colored VPN
1679          route RD:V/v into (E2, C1).

1681    o  Important:

1683       *  SR policy provides intent in each domain.

1685       *  BGP CAR label (e.g. 168002) carries end to end intent.
It thus
1686          stitches intent over intra-domain SR policies.

1688 A.3.  BGP transport CAR intent realized in a section of the network
1689                           RD:1/8 via E2
1690          +-----+          vpn label: 30030          +-----+
1691   ...... |S-RR1| <..................................|S-RR2| <.......
1692   :      +-----+              Color C1              +-----+         :
1693   :                                                                 :
1694   :                                                                 :
1695   :                                                                 :
1696 +-:-----------------------+----------------------+------------------:--+
1697 | :                       |                      |                  :  |
1698 | :                       |                      |                  :  |
1699 | :   (E2,C1) via 121     |   (E2,C1) via 231    | (E2,C1) via E2   :  |
1700 | :   L=168002,AIGP=1110+---+L=168002,AIGP=1010+---+ L=0x3          :  |
1701 | : |-------------------|121|<-----------------|231|<-------------| :  |
1702 | : V LI=8002           +---+ LI=8002          +---+              | :  |
1703 |----+                    |                      |               +-----|
1704 | E1 |                    |                      |               | E2  |
1705 |----+(E2,C1) via 122     |   (E2,C1) via 232    | (E2,C1) via E2+-----|
1706 |   ^ L=168002,AIGP=1210+---+L=168002,AIGP=1020+---+ L=0x3        |    |
1707 |   |----------------   |122|<-----------------|232|<-------------|    |
1708 |     LI=8002           +---+ LI=8002          +---+                   |
1709 |                         |                      |                     |
1710 |        ISIS SR          |       ISIS SR        |      ISIS SR        |
1711 |        FA 0             |       FA 128         |      FA 0           |
1712 |        Access           |       Core           |      Access         |
1713 +-------------------------+----------------------+---------------------+
1714  iPE                    iABR                   eABR                 ePE

1716          +------+                +------+
1717          |160121|                |168231|
1718          +------+                +------+
1719          +------+                +------+               +------+
1720          |168002|                |168002|               |160002|
1721          +------+                +------+               +------+
1722          +------+                +------+               +------+
1723          |30030 |                |30030 |               |30030 |
1724          +------+                +------+               +------+

1726             Figure 8: BGP Hybrid FA Aware transport CAR path

1728    Use case: Provide intent for service flows only in Core domain.

1730    o  With reference to the topology above:

1732       *  IGP FA 128 is only enabled in Core (e.g. WAN network).  Access
1733          only has base algo 0.

1735       *  Egress PE E2 advertises a VPN route RD:V/v colored with (color
1736          extended community) C1 to steer traffic to BGP transport CAR
1737          (E2, C1).  The VPN route propagates via service RRs to ingress PE
1738          E1.
1740       *  BGP CAR route (E2, C1) with next-hop, label-index and label as
1741          shown above is advertised through border routers in each
1742          domain.  When an RR is used in the domain, ADD-PATH is enabled
1743          to advertise multiple available paths.

1745       *  Local policy on 231 and 232 maps intent C1 to resolve CAR route
1746          next-hop over IGP base algo 0 in the right-hand access domain.
1747          BGP CAR label swap entry is installed that goes over the algo 0
1748          LSP to the next-hop.  The AIGP metric is updated to reflect the
1749          algo 0 metric to the next-hop, with an additional penalty.

1751       *  Local policy on 121 and 122 maps intent C1 to resolve CAR route
1752          next-hop learnt from Core domain over IGP FA 128.  BGP CAR
1753          label swap entry is installed that goes over FA 128 LSP to
1754          next-hop providing intent in Core IGP domain.

1756       *  Ingress PE E1 learns CAR route (E2, C1).  It maps intent C1 to
1757          resolve CAR route next-hop over IGP base algo 0.  It steers
1758          colored VPN route RD:V/v into (E2, C1).

1760    o  Important:

1762       *  IGP FA 128 top label provides intent in Core domain.

1764       *  BGP CAR label (e.g. 168002) carries intent from the PEs, which
1765          is realized in the Core domain.

1767 A.4.  Transit network domains that do not support CAR

1769    o  In a brownfield deployment, color-aware paths between two PEs may
1770       need to go through a transit domain that does not support CAR.
1771       Examples include an MPLS LDP network with IGP best-effort, or a
1772       BGP-LU based multi-domain network.  An MPLS LDP network with
1773       best-effort IGP can adopt the above scheme.  Below is an example
1774       for BGP-LU.

1776    o  Reference topology:

1778       E1 --- BR1 --- BR2 ......... BR3 ---- BR4 --- E2
1779          Ci            <----LU---->             Ci

1781       *  Network between BR2 and BR3 comprises multiple BGP-LU hops
1782          (over IGP-LDP domains).
1784       *  E1, BR1, BR4 and E2 are enabled for BGP CAR, with Ci colors.

1785       *  BR1 and BR2 are directly connected; BR3 and BR4 are directly
1786          connected.

1788    o  BR1 and BR4 form an over-the-top peering (via RRs as needed) to
1789       exchange BGP CAR routes.

1791    o  BR1 and BR4 also form direct BGP-LU sessions to BR2 and BR3
1792       respectively, to establish labeled paths between each other
1793       through the BGP-LU network.

1795    o  BR1 recursively resolves the BGP CAR next-hop for CAR routes
1796       learnt from BR4 via the BGP-LU path to BR4.

1798    o  BR1 signals the transport discontinuity to E1 via the AIGP TLV, so
1799       that E1 can prefer other paths if available.

1801    o  BR4 does the same in the reverse direction.

1803    o  Thus, the color-awareness of the routes, and hence of the paths in
1804       the data plane, is maintained between E1 and E2, even if the
1805       intent is not available within the BGP-LU island.

1807    o  A similar design can be used for going over network islands of
1808       other types.

1810 Appendix B.  Color Mapping Illustrations

1812    There are a variety of deployment scenarios that arise with respect
1813    to different color mappings in an inter-domain environment.  This
1814    section attempts to enumerate them and clarify the usage of the
1815    color-related protocol constructs.

1817 B.1.  Single color domain containing network domains with N:N color
1818       distribution

1820    o  All network domains (ingress, egress and all transit domains) are
1821       enabled for the same N colors.

1823       *  A color may of course be realized by different technologies in
1824          different domains as described above.

1826    o  The N intents are both signaled end-to-end via BGP CAR routes and
1827       realized in the data plane.

1829    o  Appendix A.1 is an example of this case.

1831 B.2.  Single color domain containing network domains with N:M color
1832       distribution

1834    o  Certain network domains may not be enabled for some of the colors,
1835       but may still be required to provide transit.
1837    o  When an (E, C) route traverses a domain where color C is not
1838       available, the operator may decide to use a different intent of
1839       color c that is available in that domain to resolve the next-hop
1840       and establish a path through the domain.

1842       *  The next-hop resolution may occur via paths of any intra-domain
1843          protocol or even via paths provided by BGP CAR.

1845       *  The next-hop resolution color c may be defined as a local
1846          policy at ingress or transit nodes of the domain.

1848       *  It may also be automatically signaled from egress border nodes
1849          by attaching a color extended community with value c to the BGP
1850          CAR routes.

1852    o  Hence, routes of N colors may be resolved via a smaller set of M
1853       colored paths in a transit domain, while preserving the original
1854       color-awareness end-to-end.

1856    o  Any ingress PE that installs a service (VPN) route with a color C
1857       must have C enabled locally to install IP routes to (E, C) and
1858       resolve the service route next-hop.

1860    o  A degenerate variation of this scenario is where a transit domain
1861       does not support any color.  Appendix A.3 describes an example of
1862       this case.

1864 B.3.  Multiple color domains

1866    When the routes are distributed between domains with different color-
1867    to-intent mapping schemes, both N:N and N:M cases are possible,
1868    although an N:M mapping is more likely to occur.

1870    Reference topology:

1872    D1 ----- D2 ----- D3
1873    C1       C2       C3

1875    o  C1 in D1 maps to C2 in D2 and to C3 in D3.

1877    o  BGP CAR is enabled in all three domains.

1878    The reference topology above is used to elaborate on the design
1879    described in Section 2.8.

1881    When the route originates in color domain D1 and gets advertised to a
1882    different color domain D2, the following procedures apply:

1884    o  The original intent in the BGP CAR route is preserved; i.e.
route
1885       is (E, C1).

1887    o  A BR of D1 attaches LCM-EC with value C1 when advertising to a BR
1888       in D2.

1890    o  A BR in D2 receiving (E, C1) maps C1 in the received LCM-EC to a
1891       local color, say C2.

1893    o  Within D2, this LCM-EC value of C2 is used instead of the Color in
1894       CAR route NLRI (E, C1).  This applies to all procedures described
1895       in the earlier section for a single color domain, such as next-hop
1896       resolution and installation of route and forwarding entries.

1898    o  A colored service route V/v originated in domain D1 with next-hop
1899       E and color C1 will also have its color extended-community value
1900       re-mapped to C2, typically at a service RR.

1902    o  On an ingress PE in D2, V/v will resolve via C2.

1904    o  When a BR in D2 advertises the route to a BR in D3, the same
1905       process repeats.

1907 Authors' Addresses

1909    Dhananjaya Rao
1910    Cisco Systems
1911    USA

1913    Email: dhrao@cisco.com

1915    Swadesh Agrawal
1916    Cisco Systems
1917    USA

1919    Email: swaagraw@cisco.com

1920    Clarence Filsfils
1921    Cisco Systems
1922    Belgium

1924    Email: cfilsfil@cisco.com

1926    Ketan Talaulikar
1927    Cisco Systems
1928    India

1930    Email: ketant@cisco.com

1932    Dirk Steinberg
1933    Lapishills Consulting Limited
1934    Germany

1936    Email: dirk@lapishills.com

1938    Luay Jalil
1939    Verizon
1940    USA

1942    Email: luay.jalil@verizon.com

1944    Yuanchao Su
1945    Alibaba, Inc

1947    Email: yitai.syc@alibaba-inc.com

1949    Jim Guichard
1950    Futurewei
1951    USA

1953    Email: james.n.guichard@futurewei.com

1955    Keyur Patel
1956    Arrcus, Inc
1957    USA

1959    Email: keyur@arrcus.com

1960    Haibo Wang
1961    Huawei Technologies
1962    China

1964    Email: rainsword.wang@huawei.com