idnits 2.17.1

draft-ietf-idr-rpd-14.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  == There are 8 instances of lines with non-RFC6890-compliant IPv4 addresses
     in the document.  If these are example addresses, they should be changed.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  == Using lowercase 'not' together with uppercase 'MUST', 'SHALL', 'SHOULD',
     or 'RECOMMENDED' is not an accepted usage according to RFC 2119.  Please
     use uppercase 'NOT' together with RFC 2119 keywords (if that is what you
     mean).

     Found 'MUST not' in this paragraph:

     GeMask: 1 octet for route prefix length match range's lower bound, MUST
     not be less than Mask or be 0.

  -- The document date (3 August 2021) is 968 days in the past.  Is this
     intentional?

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Unused Reference: 'I-D.ietf-idr-registered-wide-bgp-communities' is
     defined on line 867, but no explicit reference was found in the text

  == Outdated reference: A later version (-11) exists of
     draft-ietf-idr-wide-bgp-communities-05

  == Outdated reference: A later version (-06) exists of
     draft-ietf-idr-long-lived-gr-00

     Summary: 0 errors (**), 0 flaws (~~), 6 warnings (==), 1 comment (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

Network Working Group                                              Z. Li
Internet-Draft                                                    Huawei
Intended status: Standards Track                                   L. Ou
Expires: 4 February 2022                                          Y. Luo
                                                  China Telcom Co., Ltd.
                                                                   S. Lu
                                                                 Tencent
                                                               G. Mishra
                                                            Verizon Inc.
                                                                 H. Chen
                                                               Futurewei
                                                               S. Zhuang
                                                                 H. Wang
                                                                  Huawei
                                                           3 August 2021

          BGP Extensions for Routing Policy Distribution (RPD)
                         draft-ietf-idr-rpd-14

Abstract

   It is hard to adjust traffic and optimize traffic paths in a
   traditional IP network from time to time through manual
   configurations.  It is desirable to have a mechanism for setting up
   routing policies, which adjusts traffic and optimizes traffic paths
   automatically.  This document describes BGP Extensions for Routing
   Policy Distribution (BGP RPD) to support this.

Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119] [RFC8174]
   when, and only when, they appear in all capitals, as shown here.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 4 February 2022.

Copyright Notice

   Copyright (c) 2021 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents (https://trustee.ietf.org/
   license-info) in effect on the date of publication of this document.
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.  Code Components
   extracted from this document must include Simplified BSD License text
   as described in Section 4.e of the Trust Legal Provisions and are
   provided without warranty as described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Terminology
   3.  Problem Statement
     3.1.  Inbound Traffic Control
     3.2.  Outbound Traffic Control
   4.  Protocol Extensions
     4.1.  Using a New AFI and SAFI
     4.2.  BGP Wide Community and Atoms
       4.2.1.  RouteAttr Sub-TLV
       4.2.2.  Sub-TLVs of the Parameters TLV
     4.3.  Capability Negotiation
   5.  Operations
     5.1.  Application Scenario
     5.2.  About Failure
   6.  Contributors
   7.  Security Considerations
   8.  Acknowledgements
   9.  IANA Considerations
     9.1.  Existing Assignments
     9.2.  RouteAttr Atom Type
     9.3.  Route Attributes Sub-sub-TLV Registry
     9.4.  Attribute Change Sub-TLV Registry
   10. References
     10.1.  Normative References
     10.2.  Informative References
   Authors' Addresses

1.  Introduction

   It is difficult to optimize traffic paths in a traditional IP network
   because of the following:

   *  Complex.  Traffic can only be adjusted device by device.  The
      configurations on all the routers that the traffic traverses need
      to be changed or added.  There are already lots of policies
      configured on the routers in an operational network.  There are
      different types of policies, which include security, management
      and control policies.  These policies are relatively stable.
      However, the policies for adjusting traffic are dynamic.  Whenever
      the traffic through a route is not as expected, the policies to
      adjust the traffic for that route are configured on the related
      routers.  It is complex to dynamically add policies to, or change
      policies among, the existing policies on specific routers in order
      to adjust the traffic.
      Some people would like to separate the stable route
      policies from the dynamic ones even though they have configuration
      automation systems (including YANG models).

   *  Difficult maintenance.  The routing policies used to adjust
      network traffic are dynamic, which makes subsequent maintenance
      difficult.  High maintenance skills are required.

   *  Slow.  Adding or changing route policies on routers through a
      configuration automation system, in order to adjust traffic and
      avoid congestion, may be slow.

   It is desirable to have an automatic mechanism for setting up routing
   policies, which can simplify routing policy configuration and be
   fast.  This document describes extensions to BGP for Routing Policy
   Distribution to resolve these issues.

2.  Terminology

   The following terminology is used in this document.

   *  ACL: Access Control List

   *  BGP: Border Gateway Protocol [RFC4271]

   *  FS: Flow Specification

   *  NLRI: Network Layer Reachability Information [RFC4271]

   *  PBR: Policy-Based Routing

   *  RPD: Routing Policy Distribution

   *  VPN: Virtual Private Network

3.  Problem Statement

   Providers need to adjust their business traffic routing policies from
   time to time because of the following:

   *  Business development or network failure introduces link congestion
      and overload.

   *  Business changes or network additions produce unused resources
      such as idle links.

   *  Network transmission quality degrades because of delay or loss,
      and traffic needs to be moved to other paths.

   *  To control OPEX and CAPEX, they may prefer the transit provider
      with the lower price.

3.1.  Inbound Traffic Control

   In Figure 1, for the reasons above, the provider P of AS100 may wish
   the inbound traffic from AS200 to enter AS100 through link L3 instead
   of the others.
   Since P doesn't have any administrative control over
   AS200, there is no way for P to directly modify the route selection
   criteria inside AS200.

                     Traffic from PE1 to Prefix1
                ----------------------------------->

   +-----------------+            +-------------------------+
   |   +---------+   |     L1     | +----+      +----------+|
   |   |Speaker1 |   +------------+ |IGW1|      |policy    ||
   |   +---------+   |**      L2**| +----+      |controller||
   |                 |  **  **    |             +----------+|
   | +---+           |    ****    |                         |
   | |PE1|           |    ****    |                         |
   | +---+           |  **    **  |                         |
   |   +---------+   |**      L3**| +----+                  |
   |   |Speaker2 |   +------------+ |IGW2|       AS100      |
   |   +---------+   |     L4     | +----+                  |
   |                 |            |                         |
   |    AS200        |            |                         |
   |                 |            |   ...                   |
   |                 |            |                         |
   |   +---------+   |            | +----+      +-------+   |
   |   |Speakern |   |            | |IGWn|      |Prefix1|   |
   |   +---------+   |            | +----+      +-------+   |
   +-----------------+            +-------------------------+

               Prefix1 advertised from AS100 to AS200
              <----------------------------------------

                Figure 1: Inbound Traffic Control case

3.2.  Outbound Traffic Control

   In Figure 2, the provider P of AS100 prefers link L3 for the traffic
   to the destination Prefix2 among multiple exits and links to AS200.
   This preference can be dynamic and might change frequently because of
   the reasons above.  So, provider P expects an efficient and
   convenient solution.

                     Traffic from PE2 to Prefix2
                ----------------------------------->
   +-------------------------+            +-----------------+
   |+----------+      +----+ |L1          |   +---------+   |
   ||policy    |      |IGW1| +------------+   |Speaker1 |   |
   ||controller|      +----+ |**        **|   +---------+   |
   |+----------+             |L2**    **  |        +-------+|
   |                         |    ****    |        |Prefix2||
   |                         |    ****    |        +-------+|
   |                         |L3**    **  |                 |
   |       AS100      +----+ |**        **|   +---------+   |
   |                  |IGW2| +------------+   |Speaker2 |   |
   |                  +----+ |L4          |   +---------+   |
   |                         |            |                 |
   |+---+                    |            |    AS200        |
   ||PE2|             ...    |            |                 |
   |+---+                    |            |                 |
   |                  +----+ |            |   +---------+   |
   |                  |IGWn| |            |   |Speakern |   |
   |                  +----+ |            |   +---------+   |
   +-------------------------+            +-----------------+

               Prefix2 advertised from AS200 to AS100
              <----------------------------------------

                Figure 2: Outbound Traffic Control case

4.  Protocol Extensions

   This document specifies a solution using a new AFI and SAFI with the
   BGP Wide Community for encoding a routing policy.

4.1.  Using a New AFI and SAFI

   A new AFI and SAFI are defined: the Routing Policy AFI, whose
   codepoint 16398 has been assigned by IANA, and the Routing Policy
   SAFI, whose codepoint 75 has been assigned by IANA.

   The AFI and SAFI pair uses a new NLRI, which is defined as follows:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+
   |  NLRI Length  |
   +-+-+-+-+-+-+-+-+
   |  Policy Type  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    Distinguisher (4 octets)                   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                     Peer IP (4/16 octets)                     ~
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Where:

   NLRI Length:  1 octet that gives the length of the NLRI.  If the
      Length is anything other than 9 or 21, the NLRI is corrupt and the
      enclosing UPDATE message MUST be ignored.

   Policy Type:  1 octet that indicates the type of a policy.  The value
      1 is for an Export policy and 2 is for an Import policy.  If the
      Policy Type is any other value, the NLRI is corrupt and the
      enclosing UPDATE message MUST be ignored.

   Distinguisher:  4 octet unsigned integer that uniquely identifies the
      content/policy.  It is used to sort/order the policies from the
      lower to higher distinguisher.  They are applied in order.  The
      policy with a lower/smaller distinguisher is applied before the
      policies with higher/larger distinguishers.

   Peer IP:  4/16 octet value indicates IPv4/IPv6 peers.
      Its default
      value is 0, which indicates that when receiving a BGP UPDATE
      message with the NLRI, a BGP speaker will apply the policy in the
      message to all its IPv4/IPv6 peers.

   Under RPD AFI/SAFI, the RPD routes are stored and ordered according
   to their keys.  Under IPv4/IPv6 Unicast AFI/SAFI, there are IPv4/IPv6
   unicast routes learned and various static policies configured.  In
   addition, there are dynamic RPD policies from the RPD AFI/SAFI when
   RPD is enabled.

   Before advertising an IPv4/IPv6 Unicast AFI/SAFI route, the
   configured policies are applied to it first, and then the RPD Export
   policies are applied.

   The NLRI containing the Routing Policy is carried in MP_REACH_NLRI
   and MP_UNREACH_NLRI path attributes in a BGP UPDATE message, which
   MUST also contain the BGP mandatory attributes and MAY contain some
   BGP optional attributes.

   When receiving a BGP UPDATE message with routing policy, a BGP
   speaker processes it as follows:

   *  If the peer IP in the NLRI is 0, then apply the routing policy to
      all the remote peers of this BGP speaker.

   *  If the peer IP in the NLRI is non-zero, then the IP address
      indicates a remote peer of this BGP speaker and the routing policy
      will be applied to it.

   The content of the Routing Policy is encoded in a BGP Wide Community.

4.2.  BGP Wide Community and Atoms

   The BGP wide community is defined in
   [I-D.ietf-idr-wide-bgp-communities].  It can be used to facilitate
   the delivery of new network services and be extended easily for
   distributing different kinds of routing policies.

   A wide community Atom is a TLV (or sub-TLV), which may be included in
   a BGP wide community container (or BGP wide community for short)
   containing some BGP Wide Community TLVs.
   Three BGP Wide Community
   TLVs are defined in [I-D.ietf-idr-wide-bgp-communities], which are
   BGP Wide Community Target(s) TLV, Exclude Target(s) TLV, and
   Parameter(s) TLV.  The value of each of these TLVs comprises a series
   of Atoms, each of which is a TLV (or sub-TLV).  A new wide community
   Atom is defined for BGP Wide Community Target(s) TLV and a few new
   Atoms are defined for BGP Wide Community Parameter(s) TLV.  For
   reference, the format of the TLV is illustrated below:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     Type      |             Length            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                       Value (variable)                        ~
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                   Format of Wide Community Atom TLV

4.2.1.  RouteAttr Sub-TLV

   A RouteAttr Atom sub-TLV (or RouteAttr sub-TLV for short) is defined
   and may be included in a Target TLV.  It has the following format.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Type (TBD1)  |       Length (variable)       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                         sub-sub-TLVs                          ~
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                    Format of RouteAttr Atom sub-TLV

   The Type for RouteAttr is TBD1.  In the RouteAttr sub-TLV, four
   sub-sub-TLVs are defined: the IPv4 Prefix, IPv6 Prefix, AS-Path, and
   Community sub-sub-TLVs.

   An IPv4 Prefix sub-sub-TLV gives matching criteria on IPv4 prefixes.
   Its format is illustrated below:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   Type  1     |        Length (N x 8)         |M-Type | Flags |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                         IPv4 Address                          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     Mask      |     GeMask    |     LeMask    |M-Type | Flags |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   ~                             . . .
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                         IPv4 Address                          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     Mask      |     GeMask    |     LeMask    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                   Format of IPv4 Prefix sub-sub-TLV

   Type:  1 for IPv4 Prefix.

   Length:  N x 8, where N is the number of tuples <M-Type, Flags, IPv4
      Address, Mask, GeMask, LeMask>.  If Length is not a multiple of 8,
      the Atom is corrupt and the enclosing UPDATE message MUST be
      ignored.

   M-Type:  4-bit field specifying match type.  The following four
      values are defined.  IPaddress is the IP address in the sub-sub-
      TLV while IProute is the IP route being matched.

      M-Type = 0:  Exact match with the Mask length IP address prefix.
         GeMask and LeMask MUST be sent as zero and ignored on receipt.

      M-Type = 1:  Matches if the Mask number of prefix bits exactly
         match between IPaddress and IProute and the actual prefix
         length of IProute is greater than or equal to GeMask.  LeMask
         MUST be sent as zero and ignored on receipt.

      M-Type = 2:  Matches if the Mask number of prefix bits exactly
         match between IPaddress and IProute and the actual prefix
         length of IProute is less than or equal to LeMask.  GeMask MUST
         be sent as zero and ignored on receipt.

      M-Type = 3:  Matches if the Mask number of prefix bits exactly
         match between IPaddress and IProute and the actual prefix
         length of IProute is less than or equal to LeMask and greater
         than or equal to GeMask.

   Flags:  4 bits.  No flags are currently defined.  They MUST be sent
      as zero and ignored on receipt.

   IPv4 Address:  4 octets for an IPv4 address.

   Mask:  1 octet for the IP address prefix length that needs to exactly
      match between the IP address in the sub-sub-TLV and the route.

   GeMask:  1 octet for the lower bound of the route prefix length match
      range.  The value MUST be either 0 or greater than or equal to
      Mask.

   LeMask:  1 octet for the upper bound of the route prefix length match
      range.  The value MUST be either 0 or greater than Mask.

   For example, tuple

      <M-Type=0, Flags=0, IPv4 Address=1.1.0.0, Mask=22, GeMask=0,
      LeMask=0>

   represents an exact IP prefix match for 1.1.0.0/22.

      <M-Type=1, Flags=0, IPv4 Address=16.1.0.0, Mask=24, GeMask=24,
      LeMask=0>

   represents match IP prefix 16.1.0.0/24 greater-equal 24 (i.e., route
   matches if route's first Mask=24 bits match 16.1.0 and 24 <= route's
   prefix length <= 32).

      <M-Type=2, Flags=0, IPv4 Address=17.1.0.0, Mask=24, GeMask=0,
      LeMask=26>

   represents match IP prefix 17.1.0.0/24 less-equal 26 (i.e., route
   matches if route's first Mask=24 bits match 17.1.0 and 24 <= route's
   prefix length <= 26).

      <M-Type=3, Flags=0, IPv4 Address=18.1.0.0, Mask=24, GeMask=24,
      LeMask=30>

   represents match IP prefix 18.1.0.0/24 greater-equal 24 and less-
   equal 30 (i.e., route matches if route's first Mask=24 bits match
   18.1.0 and 24 <= route's prefix length <= 30).

   Similarly, an IPv6 Prefix sub-sub-TLV represents match criteria on
   IPv6 prefixes.  Its format is illustrated below:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   Type  4     |        Length (N x 20)        |M-Type | Flags |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                   IPv6 Address (16 octets)                    ~
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     Mask      |     GeMask    |     LeMask    |M-Type | Flags |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   ~                             . . .
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                   IPv6 Address (16 octets)                    ~
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     Mask      |     GeMask    |     LeMask    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                   Format of IPv6 Prefix sub-sub-TLV

   An AS-Path sub-sub-TLV represents a match criterion expressed as a
   regular expression string.  Its format is illustrated below:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    Type  2    |      Length (Variable)        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                     AS-Path Regex String                      |
   :                                                               :
   |                                                               ~
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                     Format of AS Path sub-sub-TLV

   Type:  2 for AS-Path.

   Length:  Variable, maximum is 1024.

   AS-Path Regex String:  AS-Path regular expression string.

   A Community sub-sub-TLV represents a list of communities, all of
   which must be matched.  Its format is illustrated below:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    Type  3    |       Length (N x 4 + 1)      |     Flags     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                       Community 1 Value                       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   ~                             . . .                             ~
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                       Community N Value                       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                    Format of Community sub-sub-TLV

   Type:  3 for Community.

   Length:  N x 4 + 1, where N is the number of communities.  If Length
      is not a multiple of 4 plus 1, the Atom is corrupt and the
      enclosing UPDATE MUST be ignored.

   Flags:  1 octet.  No flags are currently defined.  These bits MUST be
      sent as zero and ignored on receipt.

4.2.2.  Sub-TLVs of the Parameters TLV

   This document introduces 2 community values:

   MATCH AND SET ATTR:  If the IPv4/IPv6 unicast routes to a remote peer
      match the specific conditions defined in the routing policy
      extracted from the RPD route, then the attributes of the IPv4/IPv6
      unicast routes will be modified when sending to the remote peer
      per the actions defined in the RPD route.

   MATCH AND NOT ADVERTISE:  If the IPv4/IPv6 unicast routes to a remote
      peer match the specific conditions defined in the routing policy
      extracted from the RPD route, then the IPv4/IPv6 unicast routes
      will not be advertised to the remote peer.

   For the Parameter(s) TLV, two action sub-TLVs are defined: MED change
   sub-TLV and AS-Path change sub-TLV.  When the community in the
   container is MATCH AND SET ATTR, the Parameter(s) TLV can include
   these sub-TLVs.  When the community is MATCH AND NOT ADVERTISE, the
   Parameter(s) TLV's value is empty.

   A MED change sub-TLV indicates an action to change the MED.  Its
   format is illustrated below:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    Type  1    |          Length (5)           |      OP       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                             Value                             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                      Format of MED Change sub-TLV

   Type:  1 for MED Change.

   Length:  5.  If Length is any other value, the sub-TLV is corrupt and
      the enclosing UPDATE MUST be ignored.

   OP:  1 octet.  Three values are defined:

      OP = 0:  assign the Value to the existing MED.

      OP = 1:  add the Value to the existing MED.  If the sum is greater
         than the maximum value for MED, assign the maximum value to
         MED.

      OP = 2:  subtract the Value from the existing MED.  If the
         existing MED minus the Value is less than 0, assign 0 to MED.

      If OP is any other value, the sub-TLV is ignored.

   Value:  4 octets.

   An AS-Path change sub-TLV indicates an action to change the AS-Path.
   Its format is illustrated below:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    Type  2    |        Length (n x 5)         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                              AS1                              |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    Count1     |
   +-+-+-+-+-+-+-+-+
   ~                             . . .
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                              ASn                              |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    Countn     |
   +-+-+-+-+-+-+-+-+

                    Format of AS-Path Change sub-TLV

   Type:  2 for AS-Path Change.

   Length:  n x 5.  If Length is not a multiple of 5, the sub-TLV is
      corrupt and the enclosing UPDATE MUST be ignored.

   ASi:  4 octets.  An AS number.

   Counti:  1 octet.  ASi repeats Counti times.

   The sequence of AS numbers is added to the existing AS Path.

4.3.  Capability Negotiation

   It is necessary to negotiate the capability to support BGP Extensions
   for Routing Policy Distribution (RPD).  The BGP RPD Capability is a
   new BGP capability [RFC5492].  The Capability Code for this
   capability is 72, assigned by IANA.  The Capability Length field
   of this capability is variable.
   The Capability Value field consists
   of one or more of the following tuples:

   +--------------------------------------------------+
   |  Address Family Identifier (2 octets)            |
   +--------------------------------------------------+
   |  Subsequent Address Family Identifier (1 octet)  |
   +--------------------------------------------------+
   |  Send/Receive (1 octet)                          |
   +--------------------------------------------------+

                           BGP RPD Capability

   The meaning and use of the fields are as follows:

   Address Family Identifier (AFI):  This field is the same as the one
      used in [RFC4760].

   Subsequent Address Family Identifier (SAFI):  This field is the same
      as the one used in [RFC4760].

   Send/Receive:  This field indicates whether the sender (a) is willing
      to receive Routing Policies from its peer (value 1), (b) would
      like to send Routing Policies to its peer (value 2), or (c) both
      (value 3) for the <AFI, SAFI>.  If Send/Receive is any other
      value, that tuple is ignored but any other tuples present are
      still used.

5.  Operations

   This section presents a typical application scenario and some details
   about handling a related failure.

5.1.  Application Scenario

   Figure 3 illustrates a typical scenario, where RPD is used by a
   controller with a Route Reflector (RR) to adjust traffic dynamically.

               +--------------+
               |  Controller  |
               +-------+------+
                        \
                         \ RPD
                       .--\._.+--+                            ___...__
                    __(    \      '.---...                   (        )
                   /        RR o -------- A o) ----------  (o X   AS2 )
                  (o E         |\             )      _____//(___   ___)
                 (             | \_______ B o) ____/     /      '''
                  (o F          \             )     ____/
                   (             \_____ C o) ______/          ___...__
                    '     AS1            _) \_____           (        )
                     '---._.-.          )         \_______ (o Y   AS3 )
                                '---'                       (___   ___)
                                                                '''

              Figure 3: Controller with RR Adjusts Traffic

   The controller connects the RR through a BGP session.  There is a BGP
   session between the RR and each of routers A, B and C in AS1, which
   is shown in the figure.
   Other sessions in AS1 are not shown in the
   figure.

   There is router X in AS2.  There is a BGP session between X and each
   of routers A, B and C in AS1.

   There is router Y in AS3.  There is a BGP session between Y and
   router C in AS1.

   The controller sends an RPD route to the RR.  After receiving the RPD
   route from the controller, the RR reflects the RPD route to routers
   A, B and C.  After receiving the RPD route from the RR, routers A, B
   and C extract the routing policy from the RPD route.  If the peer IP
   in the NLRI of the RPD route is 0, then the routing policy is applied
   to all the remote peers of routers A, B and C.  If the peer IP in the
   NLRI of the RPD route is non-zero, then the IP address indicates a
   remote peer of routers A, B and C and the routing policy is applied
   to that specific remote peer.  The IPv4/IPv6 unicast routes towards
   router X in AS2 and router Y in AS3 will be adjusted based on the
   routing policy sent by the controller via an RPD route.

   The controller uses the Route Target (RT) extended community to
   notify a router whether to receive an RPD policy.  For example, if
   there is no adjustment on router B, the controller sends RPD routes
   with the RTs for A and C.  B will not receive the routes.

   The process of adjusting traffic in a network is a closed loop.  The
   loop starts from the controller with some traffic expectations on a
   set of routes.  The controller obtains the information about traffic
   flows for the related routes.  It analyzes the traffic and checks
   whether the current traffic flows meet the expectations.  If the
   expectations are not met, the controller adjusts the traffic.  The
   loop then returns to its start (the controller obtains the
   information about traffic flows, and so on).

5.2.  About Failure

   This section describes some details about handling a failure related
   to an RPD route being applied.

   An RPD route is not a configuration.
   When it is sent to a router from
   a controller, no acknowledgement is needed from the router.  The
   existing BGP mechanisms are re-used for delivering an RPD route.
   After the route is delivered to a router, it will be successful.
   This is guaranteed by the BGP protocol.

   If there is a failure for the router to install the route locally,
   this failure is a bug in the router.  The bug needs to be fixed.

   The errors mentioned in [RFC7606] are handled according to
   [RFC7606].  These errors are bugs, which need to be resolved.

   When the controller fails while an RPD route is being applied, for
   example while the route is on its way to the router, some existing
   mechanisms such as BGP Graceful Restart (GR) [RFC4724] and BGP
   Long-lived Graceful Restart (LLGR) can be used to let the router keep
   the routes from the controller for some time.

   With support of "Long-lived Graceful Restart Capability"
   [I-D.ietf-idr-long-lived-gr], the routes can be retained for a longer
   time after the controller fails.

   After the controller recovers from its failure, the router will have
   all the routes (including the RPD route being applied) from the
   controller.

   In the worst case, the controller fails and the RPD routes for
   adjusting the traffic are withdrawn.  The traffic adjusted/redirected
   may take its old path.  This should be acceptable.

6.  Contributors

   The following people have substantially contributed to the definition
   of the BGP-FS RPD and to the editing of this document:

   Peng Zhou
   Huawei
   Email: Jewpon.zhou@huawei.com

7.  Security Considerations

   Protocol extensions defined in this document do not affect BGP
   security other than as discussed in the Security Considerations
   section of [RFC8955].

8.  Acknowledgements

   The authors would like to thank Acee Lindem, Jeff Haas, Jie Dong,
   Lucy Yong, Qiandeng Liang, Zhenqiang Li, Robert Raszuk, Donald
   Eastlake, Ketan Talaulikar, and Jakob Heitz for their comments on
   this work.

9.  IANA Considerations

9.1.  Existing Assignments

   IANA has assigned an AFI of value 16398 from the registry "Address
   Family Numbers" for Routing Policy.

   IANA has assigned a SAFI of value 75 from the registry "Subsequent
   Address Family Identifiers (SAFI) Parameters" for Routing Policy.

   IANA has assigned a Code Point of value 72 from the registry
   "Capability Codes" for Routing Policy Distribution.

9.2.  RouteAttr Atom Type

   IANA is requested to assign a code-point from the registry "BGP
   Community Container Atom Types" as follows:

   +---------------------+------------------------------+-------------+
   |   Atom Code Point   |  Description                 | Reference   |
   +---------------------+------------------------------+-------------+
   | TBD1 (48 suggested) |  RouteAttr Atom              |This document|
   +---------------------+------------------------------+-------------+

9.3.  Route Attributes Sub-sub-TLV Registry

   IANA is requested to create a registry called "Route Attributes Sub-
   sub-TLV" under RouteAttr Atom Sub-TLV.  The allocation policy of this
   registry is "First Come First Served (FCFS)".
   The initial code points are as follows:

   +-------------+-----------------------------------+-------------+
   | Code Point  | Description                       | Reference   |
   +-------------+-----------------------------------+-------------+
   | 0           | Reserved                          |             |
   +-------------+-----------------------------------+-------------+
   | 1           | IPv4 Prefix Sub-sub-TLV           |This document|
   +-------------+-----------------------------------+-------------+
   | 2           | AS-Path Sub-sub-TLV               |This document|
   +-------------+-----------------------------------+-------------+
   | 3           | Community Sub-sub-TLV             |This document|
   +-------------+-----------------------------------+-------------+
   | 4           | IPv6 Prefix Sub-sub-TLV           |This document|
   +-------------+-----------------------------------+-------------+
   | 5 - 255     | Available                         |             |
   +-------------+-----------------------------------+-------------+

9.4.  Attribute Change Sub-TLV Registry

   IANA is requested to create a registry called "Attribute Change
   Sub-TLV" under Parameter(s) TLV.  The allocation policy of this
   registry is "First Come First Served (FCFS)".

   Initial code points are as follows:

   +-------------+-----------------------------------+-------------+
   | Code Point  | Description                       | Reference   |
   +-------------+-----------------------------------+-------------+
   | 0           | Reserved                          |             |
   +-------------+-----------------------------------+-------------+
   | 1           | MED Change Sub-TLV                |This document|
   +-------------+-----------------------------------+-------------+
   | 2           | AS-Path Change Sub-TLV            |This document|
   +-------------+-----------------------------------+-------------+
   | 3 - 255     | Available                         |             |
   +-------------+-----------------------------------+-------------+

10.  References

10.1.  Normative References

   [I-D.ietf-idr-wide-bgp-communities]
              Raszuk, R., Haas, J., Lange, A., Decraene, B., Amante, S.,
              and P. Jakma, "BGP Community Container Attribute", Work in
              Progress, Internet-Draft,
              draft-ietf-idr-wide-bgp-communities-05, 2 July 2018.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC4271]  Rekhter, Y., Ed., Li, T., Ed., and S. Hares, Ed., "A
              Border Gateway Protocol 4 (BGP-4)", RFC 4271,
              DOI 10.17487/RFC4271, January 2006.

   [RFC4760]  Bates, T., Chandra, R., Katz, D., and Y. Rekhter,
              "Multiprotocol Extensions for BGP-4", RFC 4760,
              DOI 10.17487/RFC4760, January 2007.

   [RFC5492]  Scudder, J. and R. Chandra, "Capabilities Advertisement
              with BGP-4", RFC 5492, DOI 10.17487/RFC5492, February
              2009.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017.

   [RFC8955]  Loibl, C., Hares, S., Raszuk, R., McPherson, D., and M.
              Bacher, "Dissemination of Flow Specification Rules",
              RFC 8955, DOI 10.17487/RFC8955, December 2020.

10.2.  Informative References

   [I-D.ietf-idr-long-lived-gr]
              Uttaro, J., Chen, E., Decraene, B., and J. G. Scudder,
              "Support for Long-lived BGP Graceful Restart", Work in
              Progress, Internet-Draft, draft-ietf-idr-long-lived-gr-00,
              5 September 2019.

   [I-D.ietf-idr-registered-wide-bgp-communities]
              Raszuk, R. and J. Haas, "Registered Wide BGP Community
              Values", Work in Progress, Internet-Draft,
              draft-ietf-idr-registered-wide-bgp-communities-02,
              31 May 2016.

   [RFC4724]  Sangli, S., Chen, E., Fernando, R., Scudder, J., and Y.
              Rekhter, "Graceful Restart Mechanism for BGP", RFC 4724,
              DOI 10.17487/RFC4724, January 2007.

   [RFC7606]  Chen, E., Ed., Scudder, J., Ed., Mohapatra, P., and K.
              Patel, "Revised Error Handling for BGP UPDATE Messages",
              RFC 7606, DOI 10.17487/RFC7606, August 2015.
Authors' Addresses

   Zhenbin Li
   Huawei
   Huawei Bld., No.156 Beiqing Rd.
   Beijing
   100095
   China

   Email: lizhenbin@huawei.com

   Liang Ou
   China Telcom Co., Ltd.
   109 West Zhongshan Ave, Tianhe District
   Guangzhou
   510630
   China

   Email: ouliang@chinatelecom.cn

   Yujia Luo
   China Telcom Co., Ltd.
   109 West Zhongshan Ave, Tianhe District
   Guangzhou
   510630
   China

   Email: luoyuj@sdu.edu.cn

   Sujian Lu
   Tencent
   Tengyun Building, Tower A, No. 397 Tianlin Road
   Shanghai
   Xuhui District, 200233
   China

   Email: jasonlu@tencent.com

   Gyan S. Mishra
   Verizon Inc.
   13101 Columbia Pike
   Silver Spring, MD 20904
   United States of America

   Phone: 301 502-1347
   Email: gyan.s.mishra@verizon.com

   Huaimo Chen
   Futurewei
   Boston, MA
   United States of America

   Email: Huaimo.chen@futurewei.com

   Shunwan Zhuang
   Huawei
   Huawei Bld., No.156 Beiqing Rd.
   Beijing
   100095
   China

   Email: zhuangshunwan@huawei.com

   Haibo Wang
   Huawei
   Huawei Bld., No.156 Beiqing Rd.
   Beijing
   100095
   China

   Email: rainsword.wang@huawei.com