Network Working Group                              K. Vairavakkalai, Ed.
Internet-Draft                                           N. Venkataraman
Intended status: Standards Track                          B. Rajagopalan
Expires: 27 January 2022                          Juniper Networks, Inc.
                                                               G. Mishra
                                             Verizon Communications Inc.
                                                              M. Khaddam
                                                 Cox Communications Inc.
                                                                   X. Xu
                                                          Capitalonline.
                                                             R. Szarecki
                                                                 Google.
                                                            26 July 2021

                     BGP Classful Transport Planes
           draft-kaliraj-idr-bgp-classful-transport-planes-11

Abstract

   This document specifies a mechanism, referred to as "service
   mapping", to express association of overlay routes with underlay
   routes satisfying a certain SLA, using BGP. The document describes a
   framework for classifying underlay routes into transport classes, and
   mapping service routes to a specific transport class.

   The "Transport class" construct maps to a desired SLA, and can be
   used to realize the "Topology Slice" in 5G Network slicing
   architecture.

   This document specifies BGP protocol procedures that enable
   dissemination of such service mapping information that may span
   multiple co-operating administrative domains. These domains may be
   administered by the same provider or by closely coordinating provider
   networks.

   This mechanism makes it possible to advertise multiple tunnels to the
   same destination address, thus avoiding the need for multiple
   loopbacks on the egress node.

   A new BGP transport layer address family (SAFI 76) is defined for
   this purpose that uses RFC-4364 technology and follows RFC-8277 NLRI
   encoding. This new address family is called "BGP Classful
   Transport", aka BGP CT.

   It carries transport prefixes across tunnel domain boundaries (e.g.
   in Inter-AS Option-C networks), parallel to BGP LU (SAFI 4). It
   disseminates "Transport class" information for the transport prefixes
   across the participating domains, which is not possible with BGP LU.
   This makes the end-to-end network a "Transport Class" aware tunneled
   network.

Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 27 January 2022.

Copyright Notice

   Copyright (c) 2021 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Terminology
   3.  Transport Class
   4.  "Transport Class" Route Target Extended Community
   5.  Transport RIB
   6.  Transport Routing Instance
   7.  Nexthop Resolution Scheme
   8.  BGP Classful Transport Family NLRI
   9.  Comparison with other families using RFC-8277 encoding
   10. Protocol Procedures
   11. Scaling considerations
     11.1.  Avoiding unintended spread of CT routes across domains.
     11.2.  Constrained distribution of PNHs to SNs (On Demand Nexthop)
     11.3.  Limiting scope of visibility of PE loopback as PNHs
   12. OAM considerations
   13. Applicability to Network Slicing
   14. SRv6 support
   15. Illustration of procedures with example topology
     15.1.  Topology
     15.2.  Service Layer route exchange
     15.3.  Transport Layer route propagation
     15.4.  Data plane view
       15.4.1.  Steady state
       15.4.2.  Absorbing failure of primary path
   16. IANA Considerations
     16.1.  New BGP SAFI
     16.2.  New Format for BGP Extended Community
       16.2.1.  Existing registries to be modified
       16.2.2.  New registries to be created
     16.3.  MPLS OAM code points
   17. Security Considerations
   18. Contributors
   19. Acknowledgements
   20. Normative References
   Authors' Addresses

1.  Introduction

   To facilitate service mapping, the tunnels in a network can be
   grouped by the purpose they serve into a "Transport Class". The
   tunnels could be created using any signaling protocol, such as LDP,
   RSVP, BGP LU or SPRING. The tunnels could also use native IP or
   IPv6, as long as they can carry MPLS payload. Tunnels may exist
   between different pairs of end points. Multiple tunnels may exist
   between the same pair of end points.

   Thus, a Transport Class consists of tunnels created by various
   protocols that satisfy the properties of the class. For example, a
   "Gold" transport class may consist of tunnels that traverse the
   shortest path with fast re-route protection, a "Silver" transport
   class may hold tunnels that traverse shortest paths without
   protection, a "To NbrAS Foo" transport class may hold tunnels that
   exit to neighboring AS Foo, and so on.

   The extensions specified in this document can be used to create a BGP
   transport tunnel that potentially spans domains, while preserving its
   Transport Class. Examples of a domain are an Autonomous System (AS)
   or an IGP area. Within each domain, there is a second level underlay
   tunnel used by BGP to cross the domain. The second level underlay
   tunnels could be heterogeneous: each domain may use a different type
   of tunnel (e.g. MPLS, IP, GRE), or use a different signaling
   protocol. A domain boundary is demarcated by a rewrite of BGP
   nexthop to 'self' while re-advertising tunnel routes in BGP.
   Examples of domain boundaries are inter-AS links and inter-region
   ABRs.
   The path uses MPLS label-switching when crossing a domain boundary
   and uses the native intra-AS tunnel of the desired transport class
   when traversing within a domain.

   Overlay routes carry sufficient indication of the Transport Classes
   they should be encapsulated over, in the form of a BGP community
   called the "Mapping community". Based on the mapping community, the
   "route resolution" procedure on the ingress node selects from the
   corresponding Transport Class an appropriate tunnel whose destination
   matches (LPM) the nexthop of the overlay route. If the overlay route
   is carried in BGP, the protocol nexthop (or, PNH) is generally
   carried as an attribute of the route.

   The PNH of the overlay route is also referred to as "service
   endpoint" (SEP). The service endpoint may exist in the same domain
   as the service ingress node or lie in a different domain, adjacent or
   non-adjacent. In the former case, reachability to the SEP is
   provided by an intra-domain tunneling protocol, and in the latter
   case, reachability to the SEP is via BGP transport families.

   In this architecture, the intra-domain transport protocols (e.g.
   RSVP, SRTE) are also "Transport Class aware", and they publish
   ingress routes in the Transport RIB associated with the Transport
   Class, at the tunnel ingress node. These routes are then
   redistributed into BGP CT to be advertised to adjacent domains. It
   is outside the scope of this document how exactly the transport
   protocols are made transport class aware, though configuration on the
   tunnel ingress node is a simple mechanism to achieve it.

   This document describes mechanisms to:

      Model a "Transport Class" as "Transport RIB" on a router,
      consisting of tunnel ingress routes of a certain class.

      Enable service routes to resolve over an intended Transport Class
      by virtue of carrying the appropriate "Mapping community".
      This results in using the corresponding Transport RIB for finding
      nexthop reachability.

      Advertise tunnel ingress routes in a Transport RIB via BGP without
      any path hiding, using BGP VPN technology and Add-path. This
      allows overlay routes in the receiving domains to also resolve
      over tunnels of the associated Transport Class.

      Provide a way for co-operating domains to reconcile any
      differences in extended community namespaces, and interoperate
      between different transport signaling protocols in each domain.

   In this document we focus mainly on MPLS as the intra-domain
   transport tunnel forwarding, but the mechanisms described here would
   work in a similar manner for non-MPLS (e.g. IP, GRE, UDP) transport
   tunnel forwarding technologies too.

   This document assumes MPLS forwarding when crossing domain
   boundaries, as that is the de facto standard in deployed networks
   today. But mechanisms specified in this document can also support
   different forwarding technologies (e.g. SRv6).
   Section 14 of this document describes the adaptation of BGP CT over
   the SRv6 data plane [SRV6-INTER-DOMAIN].

   The document Seamless Segment Routing [Seamless-SR] describes various
   use cases and applications of procedures described in this document.

2.  Terminology

   LSP: Label Switched Path.

   TE : Traffic Engineering.

   SN : Service Node.

   BN : Border Node.

   TN : Transport Node, P-router.

   BGP-VPN : VPNs built using RFC4364 mechanisms.

   RT : Route-Target extended community.

   RD : Route-Distinguisher.

   PNH : Protocol-Nexthop address carried in a BGP Update message.

   SEP : Service End point, the PNH of a Service route.

   LPM : Longest Prefix Match.

   Service Family : BGP address family used for advertising routes for
   "data traffic", as opposed to tunnels.

   Transport Family : BGP address family used for advertising tunnels,
   which are in turn used by service routes for resolution.

   Transport Tunnel : A tunnel over which a service may place traffic.
   These tunnels can be GRE, UDP, LDP, RSVP, or SR-TE.

   Tunnel Domain : A domain of the network containing SN and BN, under a
   single administrative control that has a tunnel between SN and BN.
   An end-to-end tunnel spanning several adjacent tunnel domains can be
   created by "stitching" them together using labels.

   Transport Class : A group of transport tunnels offering the same type
   of service.

   Transport Class RT : A Route-Target extended community used to
   identify a specific Transport Class.

   Transport RIB : At the SN and BN, a Transport Class has an associated
   Transport RIB that holds its tunnel routes.

   Transport Plane : An end-to-end plane comprising transport tunnels
   belonging to the same transport class. Tunnels of the same transport
   class are stitched together by BGP route readvertisements with
   nexthop-self, to span across domain boundaries using a Label-Swap
   forwarding mechanism similar to Inter-AS option-b.

   Mapping Community : BGP Community/Extended-community on a service
   route, that maps it to resolve over a Transport Class.

3.  Transport Class

   A Transport Class is defined as a set of transport tunnels that share
   certain characteristics useful for underlay selection.

   On the wire, a transport class is represented as the Transport Class
   RT, which is a new Route-Target extended community.

   A Transport Class is configured at SN and BN, along with attributes
   like RD and Route-Target. Creation of a Transport Class instantiates
   the associated Transport RIB and a Transport routing instance to
   contain them all.

   The operator may configure a SN/BN to classify a tunnel into an
   appropriate Transport Class, which causes the tunnel's ingress routes
   to be installed in the corresponding Transport RIB. At a BN, these
   tunnel routes may then be advertised into BGP CT.

   Alternatively, a router receiving the transport routes in BGP with
   appropriate signaling information can associate those ingress routes
   to the appropriate Transport Class. E.g. for Classful Transport
   family (SAFI 76) routes, the Transport Class RT indicates the
   Transport Class. For BGP LU family (SAFI 4) routes, import
   processing based on Communities or inter-AS source-peer may be used
   to place the route in the desired Transport Class.

   When the ingress route is received via SRTE [SRTE], which encodes the
   Transport Class as an integer 'Color' in the NLRI as
   "Color:Endpoint", the 'Color' is mapped to a Transport Class during
   import processing. The SRTE ingress route for 'Endpoint' is
   installed in that transport class. The SRTE route, when advertised
   out to BGP speakers, will then be advertised in Classful Transport
   family with Transport Class RT and a new label. The MPLS swap route
   thus installed for the new label will pop the label and deliver
   decapsulated traffic into the path determined by the SRTE route.

   RFC8664 [RFC8664] extends PCEP to carry SRTE Color. This color
   association thus learnt is also mapped to a Transport Class, thus
   associating the PCEP signaled SRTE LSP with the desired Transport
   Class.

   Similarly, PCEP-RSVP-COLOR [PCEP-RSVP-COLOR] extends PCEP to carry
   RSVP Color. This color association thus learnt is also mapped to a
   Transport Class, thus associating the PCEP signaled RSVP LSP with the
   desired Transport Class.

4.  "Transport Class" Route Target Extended Community

   This document defines a new type of Route Target, called "Transport
   Class" Route Target Extended Community.

   "Transport Class" Route Target extended community is a transitive
   extended community EXT-COMM [RFC4360] of extended-type, with a new
   Format (Type high = 0xa) and SubType as 0x2 (Route Target).
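
   As a non-normative illustration of the 8-octet layout specified in
   this section (Type 0xa, SubType 0x2, two Reserved octets, and a
   4-octet Transport Class ID), the following Python sketch packs and
   parses the extended community. The function and constant names are
   hypothetical and are not defined by this document.

```python
import struct

TYPE_TRANSPORT_CLASS = 0x0A   # "Type high" value given in this section
SUBTYPE_ROUTE_TARGET = 0x02   # SubType indicating 'Route Target'


def encode_transport_class_rt(transport_class_id: int) -> bytes:
    """Pack Type, SubType, 2 Reserved octets (zero) and the 32-bit ID."""
    if not 0 <= transport_class_id <= 0xFFFFFFFF:
        raise ValueError("Transport Class ID must fit in 32 bits")
    return struct.pack("!BBHI", TYPE_TRANSPORT_CLASS,
                       SUBTYPE_ROUTE_TARGET, 0, transport_class_id)


def decode_transport_class_rt(value: bytes) -> int:
    """Return the Transport Class ID; the Reserved octets are ignored."""
    if len(value) != 8:
        raise ValueError("extended community must be 8 octets")
    ext_type, subtype, _reserved, tc_id = struct.unpack("!BBHI", value)
    if (ext_type, subtype) != (TYPE_TRANSPORT_CLASS, SUBTYPE_ROUTE_TARGET):
        raise ValueError("not a Transport Class Route Target")
    return tc_id
```

   The decoder deliberately does not validate the Reserved octets,
   mirroring the rule that a receiver ignores them.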

   This new Route Target Format has the following encoding:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Type= 0xa     | SubType= 0x02 |            Reserved           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                      Transport Class ID                       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

           "Transport Class" Route Target Extended Community

      Type: 1 octet

         Type field contains value 0xa.

      SubType: 1 octet

         SubType field contains 0x2. This indicates 'Route Target'.

      Transport Class ID: 4 octets

         The least significant 32 bits of the value field contain the
         "Transport Class" identifier, which is a 32-bit integer.

      The remaining 2 octets after the SubType field are Reserved. They
      MUST be set to zero by the originator, and ignored and left
      unaltered by the receiver.

   The "Transport class" Route Target Extended community follows the
   mechanisms for VPN route import, export as specified in BGP-VPN
   [RFC4364], and follows the Route Target Constrain mechanisms as
   specified in VPN-RTC [RFC4684].

   A BGP speaker that implements RT Constraint VPN-RTC [RFC4684] MUST
   apply the RT Constraint procedures to the "Transport class" Route
   Target Extended community as well.

   The Transport Class Route Target Extended community is carried on
   Classful Transport family routes, and allows associating them with
   appropriate Transport RIBs at receiving BGP speakers.

   Use of the Transport Class Route Target Extended community with a new
   Type code avoids conflicts with any VPN Route Target assignments
   already in use for service families.

5.  Transport RIB

   A Transport RIB is a routing-only RIB that is not installed in the
   forwarding path. However, the routes in this RIB are used to resolve
   reachability of overlay routes' PNH.
   A Transport RIB is created when the Transport Class it represents is
   configured.

   Overlay routes that want to use a specific Transport Class confine
   the scope of nexthop resolution to the set of routes contained in the
   corresponding Transport RIB. This Transport RIB is the "Routing
   Table" referred to in Section 9.1.2.1 of RFC4271
   (https://www.rfc-editor.org/rfc/rfc4271#section-9.1.2.1).

   Routes in a Transport RIB are exported out in 'Classful Transport'
   address family.

6.  Transport Routing Instance

   A Transport Routing Instance is a BGP VPN routing instance that is a
   container for the Transport RIB. It imports and exports routes in
   this RIB with Transport Class RT. Tunnel destination addresses in
   this routing instance's context come from the "provider namespace".
   This is different from, for example, user VRFs, which contain
   prefixes in the "customer namespace".

   The Transport Routing instance uses the RD and RT configured for the
   Transport Class.

7.  Nexthop Resolution Scheme

   An implementation may provide an option for the service route to
   resolve over less preferred Transport Classes, should the resolution
   over the preferred, or "primary", Transport Class fail.

   To accomplish this, the set of service routes may be associated with
   a user-configured "resolution scheme", which consists of the primary
   Transport Class, and optionally, an ordered list of fallback
   Transport Classes.

   A community called the "Mapping Community" is configured for a
   "resolution scheme". A Mapping community maps to exactly one
   resolution scheme. A resolution scheme comprises one primary
   transport class and optionally one or more fallback transport
   classes.

   A BGP route is associated with a resolution scheme during import
   processing. The first community on the route that matches a mapping
   community of a locally configured resolution scheme is considered the
   effective mapping community for the route.
   The resolution scheme thus found is used when resolving the route's
   PNH. If a route contains more than one mapping community, it
   indicates that the route considers these multiple mapping communities
   as equivalent. So the first community that maps to a resolution
   scheme is chosen.

   A transport route received in BGP Classful Transport family SHOULD
   use a resolution scheme that contains the primary Transport Class
   without any fallback to best effort tunnels. The primary Transport
   Class is identified by the Transport Class RT carried on the route.
   Thus Transport Class RT serves as the Mapping Community for Classful
   Transport routes.

   A service route received in a BGP service family MAY map to a
   resolution scheme that contains the primary Transport Class
   identified by the mapping community on the route, and a fallback to
   the best effort tunnels transport class. The primary Transport Class
   is identified by the Mapping community carried on the route. For
   example, the Extended Color community may serve as the Mapping
   Community for service routes. Color:0:<TC> MAY map to a resolution
   scheme that has primary transport class <TC>, and a fallback to the
   best-effort transport class.

8.  BGP Classful Transport Family NLRI

   The Classful Transport (CT) family will use the existing AFI of IPv4
   or IPv6, and a new SAFI 76 "Classful Transport" that will apply to
   both IPv4 and IPv6 AFIs. These AFI, SAFI pairs of values MUST be
   negotiated in the Multiprotocol Extensions capability described in
   [RFC4760] to be able to send and receive BGP CT routes.

   The "Classful Transport" SAFI NLRI itself is encoded as specified in
   https://tools.ietf.org/html/rfc8277#section-2 [RFC8277].

   When AFI is IPv4 the "Prefix" portion of Classful Transport family
   NLRI consists of an 8-byte RD followed by an IPv4 prefix. When AFI
   is IPv6 the "Prefix" consists of an 8-byte RD followed by an IPv6
   prefix.
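
   As a non-normative illustration of the NLRI layout described above,
   the following Python sketch builds a single-label Classful Transport
   NLRI: a one-octet length in bits, a 3-octet label field, the 8-byte
   RD, and the significant octets of the IP prefix. It assumes the
   single-label form of RFC 8277 with the S bit set to one, and uses a
   Type 1 RD from RFC 4364 purely as an example. The function names and
   the sample addresses are hypothetical.

```python
import ipaddress
import struct


def rd_type1(ip: str, number: int) -> bytes:
    """RFC 4364 Type 1 RD: 2-octet type, IPv4 address, 2-octet number."""
    return struct.pack("!H4sH", 1, ipaddress.IPv4Address(ip).packed, number)


def encode_ct_nlri(label: int, rd: bytes, prefix: str) -> bytes:
    """Length (bits) | Label(20) Rsrv(3) S(1) | 8-octet RD | IP prefix."""
    if len(rd) != 8:
        raise ValueError("RD must be 8 octets")
    if not 0 <= label < (1 << 20):
        raise ValueError("label must fit in 20 bits")
    net = ipaddress.ip_network(prefix)
    prefix_octets = (net.prefixlen + 7) // 8       # significant octets only
    label_field = ((label << 4) | 0x1).to_bytes(3, "big")  # S bit set to 1
    length_bits = 24 + 64 + net.prefixlen          # label + RD + prefix bits
    return (bytes([length_bits]) + label_field + rd
            + net.network_address.packed[:prefix_octets])


nlri = encode_ct_nlri(100, rd_type1("192.0.2.1", 1), "203.0.113.5/32")
```

   The same function accepts an IPv6 prefix string, in which case the
   RD is followed by the significant octets of the IPv6 prefix.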

   Attributes on a Classful Transport route include the Transport Class
   Route-Target extended community, which is used to leak the route into
   the right Transport RIBs on SNs and BNs in the network.

   SAFI 76 routes can be sent with either IPv4 or IPv6 nexthop. The
   type of nexthop is inferred from the length of nexthop.

   When the length of Next Hop Address field is 24 (or 48) the nexthop
   address is of type VPN-IPv6 with 8-octet RD set to zero (potentially
   followed by the link-local VPN-IPv6 address of the next hop with an
   8-octet RD set to zero).

   When the length of Next Hop Address field is 12 the nexthop address
   is of type VPN-IPv4 with 8-octet RD set to zero.

9.  Comparison with other families using RFC-8277 encoding

   SAFI 128 (Inet-VPN) is a RFC8277 encoded family that carries service
   prefixes in the NLRI, where the prefixes come from the customer
   namespaces, and are contextualized into separate user virtual service
   RIBs called VRFs, using RFC4364 procedures.

   SAFI 4 (BGP LU) is a RFC8277 encoded family that carries transport
   prefixes in the NLRI, where the prefixes come from the provider
   namespace.

   SAFI 76 (Classful Transport) is a RFC8277 encoded family that carries
   transport prefixes in the NLRI, where the prefixes come from the
   provider namespace, but are contextualized into separate Transport
   RIBs, using RFC4364 procedures.

   It is worth noting that SAFI 128 has been used to carry transport
   prefixes in "L3VPN Inter-AS Carrier's carrier" scenario, where BGP
   LU/LDP prefixes in CsC VRF are advertised in SAFI 128 towards the
   remote-end baby carrier.

   In this document a new AFI/SAFI is used instead of reusing SAFI 128
   to carry these transport routes, because it is operationally
   advantageous to segregate transport and service prefixes into
   separate address families and RIBs. E.g.
   It allows a "per-prefix" label allocation scheme to be safely enabled
   for Classful Transport prefixes without affecting SAFI 128 service
   prefixes, which may have huge scale. The "per-prefix" label
   allocation scheme keeps the routing churn local during topology
   changes.

   A new family also facilitates having a different readvertisement path
   of the transport family routes in a network than the service route
   readvertisement path; viz., service routes (Inet-VPN) are exchanged
   over an EBGP multihop session between Autonomous systems with nexthop
   unchanged, whereas Classful Transport routes are readvertised over
   EBGP single hop sessions with "nexthop-self" rewrite over inter-AS
   links.

   The Classful Transport family is in a similar vein to BGP LU, in that
   it carries transport prefixes. The only difference is that it also
   carries in the Route Target an indication of which Transport Class
   the transport prefix belongs to, and uses the RD to disambiguate
   multiple instances of the same transport prefix in a BGP Update.

10.  Protocol Procedures

   This section summarizes the procedures followed by various nodes
   speaking Classful Transport family.

   Preparing the network for deploying Classful Transport planes

      Operator decides on the Transport Classes that exist in the
      network, and allocates a Route-Target to identify each Transport
      Class.

      Operator configures Transport Classes on the SNs and BNs in the
      network with unique Route-Distinguishers and Route-Targets.

      Implementations may provide automatic generation and assignment of
      RD, RT values for a transport routing instance; they MAY also
      provide a way to manually override the automatic mechanism, in
      order to deal with any conflicts that may arise with existing RD,
      RT values in the different network domains participating in a
      deployment.

   Origination of Classful Transport route:

      At the ingress node of the tunnel's home domain, the tunneling
      protocols install routes in the Transport RIB associated with the
      Transport Class the tunnel belongs to.

      The ingress node then advertises this tunnel destination into BGP
      as a Classful Transport family route with NLRI RD:TunnelEndpoint,
      attaching a 'Transport Class' Route Target that identifies the
      Transport Class. This BGP CT route is advertised to EBGP peers
      and IBGP peers which are RR-clients. This route MUST NOT be
      advertised to the IBGP peers that are not RR-clients.

      Alternatively, the egress node of the tunnel, i.e. the tunnel
      endpoint, can originate the same BGP Classful Transport route,
      with NLRI RD:TunnelEndpoint and PNH TunnelEndpoint, which will
      resolve over the tunnel route at the ingress node. When the
      tunnel is up, the Classful Transport BGP route will become usable
      and get re-advertised.

      A unique RD SHOULD be used by the originator of a Classful
      Transport route to disambiguate the multiple BGP advertisements
      for a transport end point.

   Ingress node receiving Classful Transport route

      On receiving a BGP Classful Transport route with a PNH that is not
      directly connected, e.g. an IBGP-route, a mapping community on the
      route (the Transport Class RT) indicates which Transport Class
      this route maps to. The routes in the associated Transport RIB
      are used to resolve the received PNH. If there does not exist a
      route in the Transport RIB matching the PNH, the Classful
      Transport route is considered unusable, and MUST NOT be
      re-advertised further.

   Border node readvertising Classful Transport route with nexthop self:

      The BN allocates an MPLS label to advertise upstream in Classful
      Transport NLRI.
      The BN also installs an MPLS swap-route for that label that swaps
      the incoming label with a label received from the downstream BGP
      speaker, or pops the incoming label, and then pushes received
      traffic to the transport tunnel or direct interface that the
      Classful Transport route's PNH resolved over.

      The label SHOULD be allocated with "per-prefix" label allocation
      semantics. RD is stripped from the BGP CT NLRI prefix when a BGP
      CT route is leaked to a Transport RIB. The IP prefix in the
      transport RIB context (IP-prefix, Transport-Class) is used as the
      key to do per-prefix label allocation. This helps in avoiding BGP
      CT route churn throughout the CT network when a failure happens
      in a domain. The failure is not propagated further than the BN
      closest to the failure.

      The value of the advertised MPLS label is locally significant, and
      is dynamic by default. The BN may provide an option to allocate a
      value from a statically carved out range. This can be achieved
      using locally configured export policy, or via mechanisms
      described in BGP Prefix-SID [RFC8669].

   Border node receiving Classful Transport route on EBGP :

      If the route is received with PNH that is known to be directly
      connected, e.g. EBGP single-hop peering address, the directly
      connected interface is checked for MPLS forwarding capability. No
      other nexthop resolution process is performed, as the inter-AS
      link can be used for any Transport Class.

      If the inter-AS links should honor Transport Class, then the BN
      SHOULD follow procedures of an Ingress node described above, and
      perform nexthop resolution process. The interface routes SHOULD
      be installed in the Transport RIB belonging to the associated
      Transport Class.
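
   The per-prefix label allocation described above for a border node
   can be sketched as follows. This is a minimal, non-normative Python
   illustration that assumes the allocation key is exactly (IP-prefix,
   Transport-Class) with the RD stripped, as stated above. The class
   name, the textual RD form, and the starting label value are
   hypothetical.

```python
import ipaddress


class PerPrefixLabelAllocator:
    """Allocates one label per (IP prefix, Transport Class) key."""

    def __init__(self, first_label: int = 1000):
        self._next = first_label      # hypothetical start of dynamic range
        self._labels = {}

    def label_for(self, rd: str, prefix: str, transport_class: int) -> int:
        # The RD is deliberately not part of the key: it is stripped when
        # the route is leaked into the Transport RIB.
        key = (ipaddress.ip_network(prefix), transport_class)
        if key not in self._labels:
            self._labels[key] = self._next
            self._next += 1
        return self._labels[key]


alloc = PerPrefixLabelAllocator()
via_bn1 = alloc.label_for("192.0.2.1:1", "203.0.113.5/32", 100)
via_bn2 = alloc.label_for("192.0.2.2:1", "203.0.113.5/32", 100)
other_class = alloc.label_for("192.0.2.1:1", "203.0.113.5/32", 200)
```

   Because two routes that differ only in RD share one label, a change
   of the selected path for the prefix does not change the label
   advertised upstream, which is what keeps the churn local.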

   Avoiding path-hiding through Route Reflectors

      When multiple BNs exist that advertise a RDn:PEn prefix to RRs,
      the RRs may hide all but one of the BNs, unless ADDPATH [RFC7911]
      is used for the Classful Transport family. This is similar to
      L3VPN option-B scenarios. Hence ADDPATH SHOULD be used for
      Classful Transport family, to avoid path-hiding through RRs.

   Avoiding loop between Route Reflectors in forwarding path

      A pair of redundant ABRs acting as RRs with nexthop-self may
      choose each other as best path instead of the upstream ASBR,
      causing a traffic forwarding loop.

      Implementations SHOULD provide a way to alter the tie-breaking
      rule specified in BGP RR [RFC4456] to tie-break on CLUSTER_LIST
      step before ROUTER-ID step, when performing path selection for BGP
      CT routes. RFC4456 considers a pure RR that is not in the
      forwarding path. When an RR is in the forwarding path and
      reflects routes with nexthop-self, which is the case for ABR BNs
      in a BGP transport network, this rule may cause loops. This
      document suggests the following modification to the BGP Decision
      Process Tie Breaking rules (Sect. 9.1.2.2, [RFC4271]) when doing
      path selection for BGP CT family routes:

      The following rule SHOULD be inserted between Steps e) and f): a
      BGP Speaker SHOULD prefer a route with the shorter CLUSTER_LIST
      length. The CLUSTER_LIST length is zero if a route does not carry
      the CLUSTER_LIST attribute.

      Some deployment considerations can also help in avoiding this
      problem:

      -  IGP metric should be assigned such that "ABR to redundant ABR"
         cost is inferior to "ABR to upstream ASBR" cost.

      -  Tunnels belonging to special Transport classes SHOULD NOT be
         provisioned between ABRs. This will ensure that the route
         received from an ABR with nexthop-self will not be usable at a
         redundant ABR.

      Such provisioning avoids the possibility of these loops
      altogether, irrespective of whether the path selection
      modification mentioned above is implemented.

   Ingress node receiving service route with mapping community

      Service routes received with mapping community resolve using
      Transport RIBs determined by the resolution scheme.  If the
      resolution process does not find a usable Classful Transport
      route or tunnel route in any of the Transport RIBs, the service
      route MUST be considered unusable for forwarding purpose.

   Coordinating between domains using different community namespaces.

      Cooperating option-C domains may sometimes not agree on RT, RD,
      Mapping-community or Transport Route Target values because of
      differences in community namespaces; e.g. during network mergers
      or renumbering for expansion.  Such deployments may deploy
      mechanisms to map and rewrite the Route-target values on domain
      boundaries, using per ASBR import policies.  This is no different
      from any other BGP VPN family.  Mechanisms employed in inter-AS
      VPN deployments may be used with the Classful Transport family
      also.

      The resolution schemes SHOULD allow association with multiple
      mapping communities.  This helps with renumbering, network
      mergers, or transitions.

      Though RD can also be rewritten on domain boundaries, deploying
      unique RDs is strongly RECOMMENDED, because it helps in
      troubleshooting by uniquely identifying the originator of a
      route, and avoids path-hiding.

      This document defines a new format of Route-Target extended-
      community to carry Transport Class; this avoids collision with
      the regular Route Target namespace used by service routes.

11.  Scaling considerations

11.1.  Avoiding unintended spread of CT routes across domains.

   RFC8212 [RFC8212] suggests BGP speakers require explicit
   configuration of both BGP Import and Export Policies for any EBGP
   sessions, in order to receive or send routes on EBGP sessions.

   It is recommended to follow this for BGP CT routes.  It will prohibit
   unintended advertisement of transport routes throughout the BGP CT
   transport domain which may span multiple AS.  This will conserve
   usage of MPLS label and nexthop resources in the network.  An ASBR of
   a domain can be provisioned to allow routes with only the Transport
   targets that are required by SNs in the domain.

11.2.  Constrained distribution of PNHs to SNs (On Demand Nexthop)

   This section describes how the number of Protocol Nexthops
   advertised to a SN or BN can be constrained using BGP Classful
   Transport and VPN RTC [RFC4684].

      An egress SN MAY advertise BGP CT route for RD:eSN with two Route
      Targets: transport-target:0: and a RT carrying :.
      Where TC is the Transport Class identifier, and eSN is the IP-
      address used by SN as BGP nexthop in its service route
      advertisements.

      transport-target:0: is the new type of route target (Transport
      Class RT) defined in this document.  It is carried in BGP extended
      community attribute (BGP attribute code 16).

      The RT carrying : MAY be an IP-address specific regular
      RT (BGP attribute code 16), IPv6-address specific RT (BGP
      attribute code 25), or a Wide-communities based RT (BGP attribute
      code 34) as described in RTC-Ext [RTC-Ext].

      An ingress SN MAY import BGP CT routes with Route Target carrying:
      :.  The ingress SN MAY learn the eSN values either by
      configuration, or it MAY discover them from the BGP nexthop field
      in the BGP VPN service routes received from eSN.
      A BGP ingress SN
      receiving a BGP service route with nexthop of eSN SHOULD generate
      a RTC/Extended-RTC route for Route Target prefix :/[80|176]
      in order to learn BGP CT transport routes to
      reach eSN.  This allows constrained distribution of the transport
      routes to the PNHs actually required by iSN.

      When the path of route propagation of BGP CT routes is the same as
      that of the RTC routes, a BN would learn the RTC routes advertised
      by ingress SNs and propagate them further.  This will allow
      constraining distribution of BGP CT routes for a PNH to only the
      necessary BNs in the network, closer to the egress SN.

      This mechanism provides "On Demand Nexthop" of BGP CT routes,
      which helps with scaling of MPLS forwarding state at SN and BN.

      But the amount of state carried in RTC family may become
      proportional to number of PNHs in the network.  To strike a
      balance, the RTC route advertisements for :/[80|176]
      MAY be confined to the BNs in home region of
      ingress-SN, or the BNs of a super core.

      Such a BN in the core of the network SHOULD import BGP CT routes
      with Transport Class Route Target: 0:, and generate a RTC
      route for :0:/96, while not propagating the more
      specific RTC requests for specific PNHs.  This will let the BN
      learn transport routes to all eSN nodes, but confine their
      propagation to ingress-SNs.

11.3.  Limiting scope of visibility of PE loopback as PNHs

   It may be even more desirable to limit the number of PNHs that are
   globally visible in the network.  This is possible using the
   mechanism described in MPLS Namespaces [MPLS-NAMESPACES], such that
   advertisement of PE loopback addresses as next-hop in BGP
   service routes is confined to the region they belong to.  An anycast
   IP-address called "Context Protocol Nexthop Address" abstracts the
   PEs in a region from other regions in the network, swapping the PE
   scoped service label with a CPNH scoped private namespace label.

   This provides much greater advantage in terms of scaling and
   convergence.  Changes to implement this feature are required only on
   the region's BNs and RR.

12.  OAM considerations

   Standard MPLS OAM procedures specified in [RFC8029] also apply to BGP
   Classful Transport.

   The 'Target FEC Stack' sub-TLV for IPv4 Classful Transport has a Sub-
   Type of [TBD], and a length of 13.  The Value field consists of the
   RD advertised with the Classful Transport prefix, the IPv4 prefix
   (with trailing 0 bits to make 32 bits in all), and a prefix length,
   encoded as follows:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                      Route Distinguisher                      |
   |                          (8 octets)                           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                          IPv4 prefix                          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Prefix Length |                 Must Be Zero                  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                  Figure 1: Classful Transport IPv4 FEC

   The 'Target FEC Stack' sub-TLV for IPv6 Classful Transport has a Sub-
   Type of [TBD], and a length of 25.  The Value field consists of the
   RD advertised with the Classful Transport prefix, the IPv6 prefix
   (with trailing 0 bits to make 128 bits in all), and a prefix length,
   encoded as follows:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                      Route Distinguisher                      |
   |                          (8 octets)                           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                          IPv6 prefix                          |
   |                                                               |
   |                                                               |
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Prefix Length |                 Must Be Zero                  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                  Figure 2: Classful Transport IPv6 FEC

13.  Applicability to Network Slicing

   In Network Slicing, the Transport Slice Controller (TSC) sets up the
   Topology (e.g. RSVP, SR-TE tunnels with desired characteristics) and
   resources (e.g. policers/shapers) in a transport network to create a
   Transport slice.  The Transport class construct described in this
   document represents the "Topology Slice" portion of this equation.

   The TSC can use the Transport Class Identifier (Color value) to
   provision a transport tunnel in a specific Topology Slice.

   Further, the Network Slice Controller can use the Mapping community
   on the service route to map traffic to the desired Transport slice.

14.  SRv6 support

   This section describes how BGP CT may be used to set up inter domain
   tunnels of a certain Transport Class, when using Segment Routing over
   IPv6 (SRv6) data plane on the inter AS links or as intra-AS tunneling
   mechanism.

   RFC8986, [SRV6-INTER-DOMAIN] specify the SRv6 Endpoint behaviors (End
   USD, End.BM, End.B6.Encaps and End.Replace, End.ReplaceB6,
   respectively).  These are leveraged for BGP CT with SRv6 data plane.

   The BGP Classful Transport route update for SRv6 MUST include the BGP
   Prefix-SID attribute along with SRv6 SID information as specified in
   [SRV6-SERVICES].  It may also include SRv6 SID structure for
   Transposition as specified in [SRV6-SERVICES].  It should be noted
   that prefixes carried in BGP CT family are transport layer end-
   points, e.g. PE loopback addresses.  Thus the SRv6 SID carried in a
   BGP CT route is also a transport layer identifier.

   This document extends the usage of "SRv6 label route tunnel" TLV to
   AFI=1/2 SAFI 76.  "SRv6 label route tunnel" is the TLV of the BGP
   Prefix-SID Attribute as specified in [SRV6-MPLS-AGRWL].

15.  Illustration of procedures with example topology

15.1.  Topology

               [RR26]       [RR27]                   [RR16]
                  |            |                        |
                  |            |                        |
                  |+-[ABR23]--+|+--[ASBR21]---[ASBR13]-+|+--[PE11]--+
                  ||          |||          `  /        |||          |
[CE41]--[PE25]--[P28]        [P29]          `/        [P15]      [CE31]
                  |           | |           /`         | |          |
                  |           | |          /  `        | |          |
                  |           | |         /    `       | |          |
                  +--[ABR24]--+ +--[ASBR22]---[ASBR14]-+ +--[PE12]--+

       |                       |            |                   |
       +                       +            +                   +
    CE |       region-1        |  region-2  |                   |CE
   AS4               ...AS2...                       AS1         AS3

   41.41.41.41 ------------ Traffic Direction ----------> 31.31.31.31

   This example shows a provider network that comprises two
   Autonomous systems, AS1, AS2.  They are serving customers AS3, AS4
   respectively.  Traffic direction being described is CE41 to CE31.
   CE31 may request a specific SLA, e.g. Gold for this traffic, when
   traversing these provider networks.

   AS2 is further divided into two regions.  So there are three tunnel
   domains in provider space.  AS1 uses ISIS Flex-Algo intra-domain
   tunnels, whereas AS2 uses RSVP intra-domain tunnels.

   The network has two Transport classes: Gold with transport class id
   100, Bronze with transport class id 200.  These transport classes are
   provisioned at the PEs and the Border nodes (ABRs, ASBRs) in the
   network.

   Following tunnels exist for Gold transport class.

      PE25_to_ABR23_gold - RSVP tunnel

      PE25_to_ABR24_gold - RSVP tunnel

      ABR23_to_ASBR22_gold - RSVP tunnel

      ASBR13_to_PE11_gold - ISIS FlexAlgo tunnel

      ASBR14_to_PE11_gold - ISIS FlexAlgo tunnel

   Following tunnels exist for Bronze transport class.

      PE25_to_ABR23_bronze - RSVP tunnel

      ABR23_to_ASBR21_bronze - RSVP tunnel

      ABR23_to_ASBR22_bronze - RSVP tunnel

      ABR24_to_ASBR21_bronze - RSVP tunnel

      ASBR13_to_PE12_bronze - ISIS FlexAlgo tunnel

      ASBR14_to_PE11_bronze - ISIS FlexAlgo tunnel

   These tunnels are either provisioned or auto-discovered to belong to
   transport class 100 or 200.

15.2.  Service Layer route exchange

   Service nodes PE11, PE12 negotiate service families (SAFI 1, 128) on
   the BGP session with RR16.  Service helpers RR16, RR26 have multihop
   EBGP session to exchange service routes between the two AS.
   Similarly PE25 negotiates service families with RR26.

   Forwarding happens using service routes at service nodes PE25, PE11,
   PE12 only.  Routes received from CEs are not present in any other
   nodes' FIB in the network.

   CE31 advertises a route for example prefix 31.31.31.31 with nexthop
   self to PE11, PE12.  CE31 can attach a mapping community Color:0:100
   on this route, to indicate its request for Gold SLA.  Or, PE11 can
   attach the same using locally configured policies.  Let us assume
   CE41 is getting VPN service from PE25.

   The 31.31.31.31 route is readvertised in SAFI 128 by PE11 with
   nexthop self (1.1.1.1) and label V-L1, to RR16 with the mapping
   community Color:0:100 attached.  This SAFI 128 route reaches PE25 via
   RR16, RR26 with the nexthop unchanged, as PE11 and label V-L1.  Now
   PE25 can resolve the PNH 1.1.1.1 using transport routes received in
   BGP CT or BGP LU.

   The IP FIB at PE25 will have a route for 31.31.31.31 with a nexthop
   thus found, that points to a Gold tunnel in ingress domain.

15.3.  Transport Layer route propagation

   ASBR13 negotiates BGP CT family with transport ASBRs ASBR21, ASBR22.
   They negotiate BGP CT family with RR27 in region 2.  ABR23, ABR24
   negotiate BGP CT family with RR27 in region 2 and RR26 in region 1.
   PE25 receives BGP CT routes from RR26.  BGP LU family is also
   negotiated on these sessions alongside BGP CT family.  BGP LU carries
   "best effort" transport class routes, BGP CT carries gold, bronze
   transport class routes.

   ASBR13 is provisioned with transport class 100, RD value 1.1.1.3:10
   and a transport route target 0:100, and with transport class 200,
   RD value 1.1.1.3:20 and a transport route target 0:200.
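
   The provisioning of ASBR13 described above can be modeled as a small
   data structure.  This is only a sketch: the class and attribute
   names are hypothetical, and the textual forms "transport-target:0:N"
   and "RD:prefix" are the notation used in this example section, not
   an on-the-wire encoding.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TransportClass:
    name: str
    tc_id: int   # Transport Class identifier (Color value)
    rd: str      # Route Distinguisher, unique per node and class

    @property
    def route_target(self):
        # Transport Class route target, same value on every node.
        return f"transport-target:0:{self.tc_id}"

    def nlri(self, prefix):
        # RD-prefixed form used in this section, e.g. 1.1.1.3:10:1.1.1.1
        return f"{self.rd}:{prefix}"


# Values taken from the ASBR13 provisioning in the text.
ASBR13_CLASSES = {
    "gold": TransportClass("gold", 100, "1.1.1.3:10"),
    "bronze": TransportClass("bronze", 200, "1.1.1.3:20"),
}

gold_rt = ASBR13_CLASSES["gold"].route_target
gold_nlri = ASBR13_CLASSES["gold"].nlri("1.1.1.1")
```

   Other nodes would carry the same tc_id values, and hence the same
   route targets, with their own RDs.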

   Similarly, these transport classes are also configured on ASBRs, ABRs
   and PEs, with same transport route target, but unique RDs.

   Ingress route for ASBR13_to_PE11_gold is advertised by ASBR13 in BGP
   CT family to ASBRs ASBR21, ASBR22.  This route is sent with a NLRI
   containing RD prefix 1.1.1.3:10:1.1.1.1, Label B-L1 and a route
   target extended community transport-target:0:100.  MPLS swap route is
   installed at ASBR13 for B-L1 with a nexthop pointing to
   ASBR13_to_PE11_gold tunnel.

   Ingress route for ASBR13_to_PE11_bronze is advertised by ASBR13 in
   BGP CT family to ASBRs ASBR21, ASBR22.  This route is sent with a
   NLRI containing RD prefix 1.1.1.3:20:1.1.1.1, Label B-L2 and a route
   target extended community transport-target:0:200.  MPLS swap route is
   installed at ASBR13 for label B-L2 with a nexthop pointing to
   ASBR13_to_PE11_bronze tunnel.

   ASBR21 receives BGP CT route 1.1.1.3:10:1.1.1.1 over the single hop
   EBGP session, and readvertises with nexthop self (loopback address
   2.2.2.1) to RR27, advertising a new label B-L3.  MPLS swap route is
   installed for label B-L3 at ASBR21 to swap to received label B-L1 and
   forwards to ASBR13.  RR27 readvertises this BGP CT route to ABR23,
   ABR24.

   ASBR22 receives BGP CT route 1.1.1.3:10:1.1.1.1 over the single hop
   EBGP session, and readvertises with nexthop self (loopback address
   2.2.2.2) to RR27, advertising a new label B-L4.  MPLS swap route is
   installed for label B-L4 at ASBR22 to swap to received label B-L2 and
   forwards to ASBR13.  RR27 readvertises this BGP CT route to ABR23,
   ABR24.

   Addpath is enabled for BGP CT family on the sessions between RR27 and
   ASBRs, ABRs, such that routes for 1.1.1.3:10:1.1.1.1 with the
   nexthops ASBR21 and ASBR22 are reflected to ABR23, ABR24 without any
   path hiding.  This gives ABR23 visibility of both available nexthops
   for Gold SLA.
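
   With both paths visible, ABR23 resolves each nexthop strictly within
   the Transport Class named by the route target, as the following
   paragraphs walk through.  The sketch below previews that step using
   the tunnels listed in Section 15.1 and the ASBR loopbacks given
   above.  The table layout and function names are assumptions made for
   illustration.

```python
# Tunnels available at ABR23 (from the lists in Section 15.1), keyed by
# (tunnel endpoint loopback, transport class id).
ABR23_TUNNELS = {
    ("2.2.2.2", 100): "ABR23_to_ASBR22_gold",
    ("2.2.2.1", 200): "ABR23_to_ASBR21_bronze",
    ("2.2.2.2", 200): "ABR23_to_ASBR22_bronze",
}


def transport_class_of(route_target):
    # "transport-target:0:100" -> 100
    return int(route_target.rsplit(":", 1)[1])


def resolve(nexthop, route_target, tunnels):
    """Return the tunnel to nexthop in the route's own class, else None.

    There is deliberately no fallback to another class: resolution is
    strict.
    """
    return tunnels.get((nexthop, transport_class_of(route_target)))


# The two Addpath paths for 1.1.1.3:10:1.1.1.1 reflected by RR27.
addpath_paths = [
    {"nexthop": "2.2.2.1", "label": "B-L3"},  # via ASBR21
    {"nexthop": "2.2.2.2", "label": "B-L4"},  # via ASBR22
]

resolved = {
    p["nexthop"]: resolve(p["nexthop"], "transport-target:0:100", ABR23_TUNNELS)
    for p in addpath_paths
}
```

   Only the path via 2.2.2.2 resolves, because ABR23 has no Gold tunnel
   towards ASBR21; a path whose nexthop resolves to None is treated as
   unusable and is not propagated further.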

   ABR23 receives the route with nexthop 2.2.2.1, label B-L3 from RR27.
   The route target "transport-target:0:100" on this route acts as
   mapping community, and instructs ABR23 to strictly resolve the
   nexthop using transport class 100 routes only.  ABR23 is unable to
   find a route for 2.2.2.1 with transport class 100.  Thus it considers
   this route unusable and does not propagate it further.  This prunes
   ASBR21 from Gold SLA tunneled path.

   ABR23 also receives the route with nexthop 2.2.2.2, label B-L4 from
   RR27.  The route target "transport-target:0:100" on this route acts
   as mapping community, and instructs ABR23 to strictly resolve the
   nexthop using transport class 100 routes only.  ABR23 successfully
   resolves the nexthop to point to ABR23_to_ASBR22_gold tunnel.  ABR23
   readvertises this route with nexthop self (loopback address 2.2.2.3)
   and a new label B-L5 to RR26.  Swap route for B-L5 is installed by
   ABR23 to swap to label B-L4, and forward into ABR23_to_ASBR22_gold
   tunnel.

   RR26 reflects the route from ABR23 to PE25.  PE25 receives the BGP CT
   route for prefix 1.1.1.3:10:1.1.1.1 with label B-L5, nexthop 2.2.2.3
   and transport-target:0:100 from RR26.  It similarly resolves the
   nexthop 2.2.2.3 over transport class 100, pushing labels associated
   with PE25_to_ABR23_gold tunnel.

   In this manner, the Gold transport LSP "ASBR13_to_PE11_gold" in
   egress-domain is extended by BGP CT up to the ingress-node PE25 in
   ingress domain, to create an end-to-end Gold SLA path.  MPLS swap
   routes are installed at ASBR13, ASBR22 and ABR23, when propagating
   the PE11 BGP CT Gold transport class route 1.1.1.3:10:1.1.1.1 with
   nexthop self towards PE25.

   The BGP CT LSP thus formed originates in PE25 and terminates in
   ASBR13, traversing over the Gold underlay LSPs in each domain.
   ASBR13 uses UHP to stitch the BGP CT LSP into the
   "ASBR13_to_PE11_gold" LSP to traverse the last domain, thus
   satisfying Gold SLA end-to-end.

   When PE25 receives service route with nexthop 1.1.1.1 and mapping
   community Color:0:100, it resolves over this BGP CT route
   1.1.1.3:10:1.1.1.1.  It thus pushes label B-L5, and pushes as top
   label the labels associated with PE25_to_ABR23_gold tunnel.

15.4.  Data plane view

15.4.1.  Steady state

   This section describes what the data plane looks like in steady
   state.

   CE41 transmits an IP packet with destination as 31.31.31.31.  On
   receiving this packet PE25 performs a lookup in the IP FIB associated
   with the CE41 interface.  This lookup yields the service route that
   pushes the VPN service label V-L1, BGP CT label B-L5, and labels for
   PE25_to_ABR23_gold tunnel.  Thus PE25 encapsulates the IP packet in
   MPLS packet with label V-L1 (innermost), B-L5, and top label as
   PE25_to_ABR23_gold tunnel.  This MPLS packet is thus transmitted to
   ABR23 using Gold SLA.

   ABR23 decapsulates the packet received on PE25_to_ABR23_gold tunnel
   as required, and finds the MPLS packet with label B-L5.  It performs
   lookup for label B-L5 in the global MPLS FIB.  This yields the route
   that swaps label B-L5 with label B-L4, and pushes top label provided
   by ABR23_to_ASBR22_gold tunnel.  Thus ABR23 transmits the MPLS packet
   with label B-L4 to ASBR22, on a tunnel that satisfies Gold SLA.

   ASBR22 similarly performs a lookup for label B-L4 in global MPLS FIB,
   finds the route that swaps label B-L4 with label B-L2, and forwards
   to ASBR13 over the directly connected MPLS enabled interface.  This
   interface is a common resource not dedicated to any specific
   transport class, in this example.
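
   The label operations at the three hops described so far can be
   traced with a toy label stack.  This is an illustrative sketch only:
   the stack is a Python list with the top label first, each underlay
   tunnel is represented by its name rather than by real label values,
   and the function names are hypothetical.

```python
def pe25_ingress(payload):
    # Service route at PE25: V-L1 innermost, then B-L5, then the labels
    # of the PE25_to_ABR23_gold tunnel on top.
    return ["PE25_to_ABR23_gold", "B-L5", "V-L1", payload]


def abr23(stack):
    assert stack[0] == "PE25_to_ABR23_gold"
    stack = stack[1:]                    # underlay tunnel ends at ABR23
    assert stack[0] == "B-L5"
    # Swap B-L5 with B-L4 and push the next Gold tunnel on top.
    return ["ABR23_to_ASBR22_gold", "B-L4"] + stack[1:]


def asbr22(stack):
    assert stack[0] == "ABR23_to_ASBR22_gold"
    stack = stack[1:]                    # underlay tunnel ends at ASBR22
    assert stack[0] == "B-L4"
    # Swap B-L4 with B-L2; forwarded on the direct inter-AS interface,
    # so no tunnel label is pushed.
    return ["B-L2"] + stack[1:]


at_pe25 = pe25_ingress("IP:31.31.31.31")
at_abr23_out = abr23(at_pe25)
at_asbr22_out = asbr22(at_abr23_out)
```

   The VPN label V-L1 and the payload are untouched at every transport
   hop; only the BGP CT label and the underlay tunnel labels change.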

   ASBR13 receives the MPLS packet with label B-L2, and performs a
   lookup in MPLS FIB, finds the route that pops label B-L2, and pushes
   labels associated with ASBR13_to_PE11_gold tunnel.  This transmits
   the MPLS packet with VPN label V-L1 to PE11, using a tunnel that
   preserves Gold SLA in AS 1.

   PE11 receives the MPLS packet with V-L1, and performs VPN forwarding,
   thus transmitting the original IP payload from CE41 to CE31.  The
   payload has traversed a path satisfying Gold SLA end-to-end.

15.4.2.  Absorbing failure of primary path

   This section describes how the data plane reacts when the gold path
   experiences a failure.

   Let us assume tunnel ABR23_to_ASBR22_gold goes down, such that now
   end-to-end Gold path does not exist in the network.  This makes the
   BGP CT route for RD prefix 1.1.1.3:10:1.1.1.1 unusable at ABR23.
   This makes ABR23 send a BGP withdrawal for 1.1.1.3:10:1.1.1.1 to
   RR26, which then withdraws the prefix from PE25.

   Withdrawal for 1.1.1.3:10:1.1.1.1 allows PE25 to react to the loss of
   gold path to 1.1.1.1.  Let us assume PE25 is provisioned to use best-
   effort transport class as the backup path.  This withdrawal of BGP CT
   route allows PE25 to adjust the nexthop of the VPN Service-route to
   push the labels provided by the BGP LU route.  That repairs the
   traffic to go via best effort path.  PE25 can also be provisioned to
   use Bronze transport class as the backup path.  The repair will
   happen in similar manner in that case as well.

   Traffic repair to absorb the failure happens at ingress node PE25, in
   a service prefix scale independent manner.  This is called PIC
   (Prefix scale Independent Convergence).  The repair time will be
   proportional to time taken for withdrawing the BGP CT route.

16.  IANA Considerations

   This document makes following requests of IANA.

16.1.  New BGP SAFI

   New BGP SAFI code for "Classful Transport".
   Value 76.

   This will be used to create new AFI,SAFI pairs for IPv4, IPv6
   Classful Transport families. viz:

   *  "Inet, Classful Transport".  AFI/SAFI = "1/76" for carrying IPv4
      Classful Transport prefixes.

   *  "Inet6, Classful Transport".  AFI/SAFI = "2/76" for carrying IPv6
      Classful Transport prefixes.

16.2.  New Format for BGP Extended Community

   Please assign a new Format (Type high = 0xa) of extended community
   EXT-COMM [RFC4360] called "Transport Class" from the following
   registries:

      the "BGP Transitive Extended Community Types" registry, and

      the "BGP Non-Transitive Extended Community Types" registry.

   Please assign the same low-order six bits for both allocations.

   This document uses this new Format with subtype 0x2 (route target),
   as a transitive extended community.

   The Route Target thus formed is called "Transport Class" route target
   extended community.

   Taking reference of RFC7153 [RFC7153], following requests are made:

16.2.1.  Existing registries to be modified

16.2.1.1.  Registries for the "Type" Field

16.2.1.1.1.  Transitive Types

   This registry contains values of the high-order octet (the "Type"
   field) of a Transitive Extended Community.

   Registry Name: BGP Transitive Extended Community Types

        TYPE VALUE       NAME
   +    0x0a             Transitive Transport Class Extended
   +                     Community (Sub-Types are defined in the
   +                     "Transitive Transport Class Extended
   +                     Community Sub-Types" registry)

16.2.1.1.2.  Non-Transitive Types

   This registry contains values of the high-order octet (the "Type"
   field) of a Non-transitive Extended Community.

   Registry Name: BGP Non-Transitive Extended Community Types

        TYPE VALUE       NAME

   +    0x4a             Non-Transitive Transport Class Extended
   +                     Community (Sub-Types are defined in the
   +                     "Non-Transitive Transport Class Extended
   +                     Community Sub-Types" registry)

16.2.2.  New registries to be created

16.2.2.1.  Transitive "Transport Class" Extended Community Sub-Types
           Registry

   This registry contains values of the second octet (the "Sub-Type"
   field) of an extended community when the value of the first octet
   (the "Type" field) is 0x0a.

   Registry Name: Transitive Transport Class Extended
                  Community Sub-Types

        RANGE              REGISTRATION PROCEDURE

        0x00-0xBF          First Come First Served
        0xC0-0xFF          IETF Review

        SUB-TYPE VALUE     NAME

        0x02               Route Target

16.2.2.2.  Non-Transitive "Transport Class" Extended Community Sub-Types
           Registry

   This registry contains values of the second octet (the "Sub-Type"
   field) of an extended community when the value of the first octet
   (the "Type" field) is 0x4a.

   Registry Name: Non-Transitive Transport Class Extended
                  Community Sub-Types

        RANGE              REGISTRATION PROCEDURE

        0x00-0xBF          First Come First Served
        0xC0-0xFF          IETF Review

        SUB-TYPE VALUE     NAME

        0x02               Route Target

16.3.  MPLS OAM code points

   The following two code points are sought for Target FEC Stack sub-
   TLVs:

   *  IPv4 BGP Classful Transport

   *  IPv6 BGP Classful Transport

17.  Security Considerations

   Mechanisms described in this document carry Transport routes in a new
   BGP address family.  This minimizes the possibility of these routes
   leaking outside the expected domain or mixing with service routes.

   When redistributing between SAFI 4 and SAFI 76 Classful Transport
   routes, there is a possibility of SAFI 4 routes mixing with SAFI 1
   service routes.  To avoid such scenarios, it is RECOMMENDED that
   implementations support keeping SAFI 4 routes in a separate transport
   RIB, distinct from service RIB that contain SAFI 1 service routes.

18.  Contributors

   Rajesh M
   Juniper Networks, Inc.
   Electra, Exora Business Park
   Marathahalli - Sarjapur Outer Ring Road,
   Bangalore 560103
   KA
   India
   Email: mrajesh@juniper.net

19.  Acknowledgements

   The authors thank Jeff Haas, John Scudder, Navaneetha Krishnan, Ravi
   M R, Chandrasekar Ramachandran, Shradha Hegde, Richard Roberts,
   Krzysztof Szarkowicz, John E Drake, Srihari Sangli, Vijay Kestur,
   Santosh Kolenchery, Robert Raszuk, Ahmed Darwish for the valuable
   discussions and review comments.

   The decision to not reuse SAFI 128 and create a new address-family to
   carry these transport-routes was based on suggestion made by Richard
   Roberts and Krzysztof Szarkowicz.

20.  Normative References

   [MPLS-NAMESPACES]
              Vairavakkalai, Ed., "BGP signalled MPLS-namespaces", 11
              June 2021.

   [PCEP-RSVP-COLOR]
              Rajagopalan, Ed., "Path Computation Element Protocol(PCEP)
              Extension for RSVP Color", 15 January 2021.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC4271]  Rekhter, Y., Ed., Li, T., Ed., and S. Hares, Ed., "A
              Border Gateway Protocol 4 (BGP-4)", RFC 4271,
              DOI 10.17487/RFC4271, January 2006.

   [RFC4360]  Sangli, S., Tappan, D., and Y. Rekhter, "BGP Extended
              Communities Attribute", RFC 4360, DOI 10.17487/RFC4360,
              February 2006.

   [RFC4364]  Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private
              Networks (VPNs)", RFC 4364, DOI 10.17487/RFC4364, February
              2006.

   [RFC4456]  Bates, T., Chen, E., and R. Chandra, "BGP Route
              Reflection: An Alternative to Full Mesh Internal BGP
              (IBGP)", RFC 4456, DOI 10.17487/RFC4456, April 2006.

   [RFC4684]  Marques, P., Bonica, R., Fang, L., Martini, L., Raszuk,
              R., Patel, K., and J.
              Guichard, "Constrained Route
              Distribution for Border Gateway Protocol/MultiProtocol
              Label Switching (BGP/MPLS) Internet Protocol (IP) Virtual
              Private Networks (VPNs)", RFC 4684, DOI 10.17487/RFC4684,
              November 2006.

   [RFC4760]  Bates, T., Chandra, R., Katz, D., and Y. Rekhter,
              "Multiprotocol Extensions for BGP-4", RFC 4760,
              DOI 10.17487/RFC4760, January 2007.

   [RFC7153]  Rosen, E. and Y. Rekhter, "IANA Registries for BGP
              Extended Communities", RFC 7153, DOI 10.17487/RFC7153,
              March 2014.

   [RFC7911]  Walton, D., Retana, A., Chen, E., and J. Scudder,
              "Advertisement of Multiple Paths in BGP", RFC 7911,
              DOI 10.17487/RFC7911, July 2016.

   [RFC8029]  Kompella, K., Swallow, G., Pignataro, C., Ed., Kumar, N.,
              Aldrin, S., and M. Chen, "Detecting Multiprotocol Label
              Switched (MPLS) Data-Plane Failures", RFC 8029,
              DOI 10.17487/RFC8029, March 2017.

   [RFC8212]  Mauch, J., Snijders, J., and G. Hankins, "Default External
              BGP (EBGP) Route Propagation Behavior without Policies",
              RFC 8212, DOI 10.17487/RFC8212, July 2017.

   [RFC8277]  Rosen, E., "Using BGP to Bind MPLS Labels to Address
              Prefixes", RFC 8277, DOI 10.17487/RFC8277, October 2017.

   [RFC8664]  Sivabalan, S., Filsfils, C., Tantsura, J., Henderickx, W.,
              and J. Hardwick, "Path Computation Element Communication
              Protocol (PCEP) Extensions for Segment Routing", RFC 8664,
              DOI 10.17487/RFC8664, December 2019.

   [RFC8669]  Previdi, S., Filsfils, C., Lindem, A., Ed., Sreekantiah,
              A., and H. Gredler, "Segment Routing Prefix Segment
              Identifier Extensions for BGP", RFC 8669,
              DOI 10.17487/RFC8669, December 2019.

   [RTC-Ext]  Zhang, Z., Ed., "Route Target Constrain Extension", 12
              July 2020.

   [Seamless-SR]
              Hegde, Ed., "Seamless Segment Routing", 17 November 2020.

   [SRTE]     Previdi, S., Ed., "Advertising Segment Routing Policies in
              BGP", 18 November 2019.

   [SRV6-INTER-DOMAIN]
              K A, Ed., "SRv6 inter-domain mapping SIDs", 10 January
              2021.

   [SRV6-MPLS-AGRWL]
              Agrawal, Ed., "SRv6 and MPLS interworking", 22 February
              2021.

   [SRV6-SERVICES]
              Dawra, Ed., "SRv6 BGP based Overlay Services", 11 April
              2021.

Authors' Addresses

   Kaliraj Vairavakkalai (editor)
   Juniper Networks, Inc.
   1133 Innovation Way,
   Sunnyvale, CA 94089
   United States of America

   Email: kaliraj@juniper.net

   Natrajan Venkataraman
   Juniper Networks, Inc.
   1133 Innovation Way,
   Sunnyvale, CA 94089
   United States of America

   Email: natv@juniper.net

   Balaji Rajagopalan
   Juniper Networks, Inc.
   Electra, Exora Business Park
   Marathahalli - Sarjapur Outer Ring Road,
   Bangalore 560103
   KA
   India

   Email: balajir@juniper.net

   Gyan Mishra
   Verizon Communications Inc.
   13101 Columbia Pike
   Silver Spring, MD 20904
   United States of America

   Email: gyan.s.mishra@verizon.com

   Mazen Khaddam
   Cox Communications Inc.
   Atlanta, GA
   United States of America

   Email: mazen.khaddam@cox.com

   Xiaohu Xu
   Capitalonline.
   Beijing
   China

   Email: xiaohu.xu@capitalonline.net

   Rafal Jan Szarecki
   Google.
   1160 N Mathilda Ave, Bldg 5,
   Sunnyvale, CA 94089
   United States of America

   Email: szarecki@google.com