idnits 2.17.1 draft-ietf-idr-rpd-12.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- == There are 12 instances of lines with non-RFC6890-compliant IPv4 addresses in the document. If these are example addresses, they should be changed. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year -- The document date (28 July 2021) is 1003 days in the past. Is this intentional? Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) == Unused Reference: 'RFC1997' is defined on line 803, but no explicit reference was found in the text == Unused Reference: 'RFC8126' is defined on line 831, but no explicit reference was found in the text == Unused Reference: 'I-D.ietf-idr-registered-wide-bgp-communities' is defined on line 849, but no explicit reference was found in the text == Outdated reference: A later version (-11) exists of draft-ietf-idr-wide-bgp-communities-05 ** Obsolete normative reference: RFC 5575 (Obsoleted by RFC 8955) == Outdated reference: A later version (-06) exists of draft-ietf-idr-long-lived-gr-00 Summary: 1 error (**), 0 flaws (~~), 7 warnings (==), 1 comment (--). Run idnits with the --verbose option for more detailed information about the items above. 
-------------------------------------------------------------------------------- 2 Network Working Group Z. Li 3 Internet-Draft Huawei 4 Intended status: Standards Track L. Ou 5 Expires: 29 January 2022 Y. Luo 6 China Telcom Co., Ltd. 7 S. Lu 8 Tencent 9 G. Mishra 10 Verizon Inc. 11 H. Chen 12 Futurewei 13 S. Zhuang 14 H. Wang 15 Huawei 16 28 July 2021 18 BGP Extensions for Routing Policy Distribution (RPD) 19 draft-ietf-idr-rpd-12 21 Abstract 23 It is hard to adjust traffic and optimize traffic paths in a 24 traditional IP network from time to time through manual 25 configurations. It is desirable to have a mechanism for setting up 26 routing policies, which adjusts traffic and optimizes traffic paths 27 automatically. This document describes BGP Extensions for Routing 28 Policy Distribution (BGP RPD) to support this. 30 Requirements Language 32 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 33 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 34 document are to be interpreted as described in [RFC2119] [RFC8174] 35 when, and only when, they appear in all capitals, as shown here. 37 Status of This Memo 39 This Internet-Draft is submitted in full conformance with the 40 provisions of BCP 78 and BCP 79. 42 Internet-Drafts are working documents of the Internet Engineering 43 Task Force (IETF). Note that other groups may also distribute 44 working documents as Internet-Drafts. The list of current Internet- 45 Drafts is at https://datatracker.ietf.org/drafts/current/. 47 Internet-Drafts are draft documents valid for a maximum of six months 48 and may be updated, replaced, or obsoleted by other documents at any 49 time. It is inappropriate to use Internet-Drafts as reference 50 material or to cite them other than as "work in progress." 52 This Internet-Draft will expire on 29 January 2022. 54 Copyright Notice 56 Copyright (c) 2021 IETF Trust and the persons identified as the 57 document authors. All rights reserved. 
59 This document is subject to BCP 78 and the IETF Trust's Legal 60 Provisions Relating to IETF Documents (https://trustee.ietf.org/ 61 license-info) in effect on the date of publication of this document. 62 Please review these documents carefully, as they describe your rights 63 and restrictions with respect to this document. Code Components 64 extracted from this document must include Simplified BSD License text 65 as described in Section 4.e of the Trust Legal Provisions and are 66 provided without warranty as described in the Simplified BSD License. 68 Table of Contents 70 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 71 2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 3 72 3. Problem Statement . . . . . . . . . . . . . . . . . . . . . . 4 73 3.1. Inbound Traffic Control . . . . . . . . . . . . . . . . . 4 74 3.2. Outbound Traffic Control . . . . . . . . . . . . . . . . 5 75 4. Protocol Extensions . . . . . . . . . . . . . . . . . . . . . 6 76 4.1. Using a New AFI and SAFI . . . . . . . . . . . . . . . . 6 77 4.2. BGP Wide Community and Atoms . . . . . . . . . . . . . . 8 78 4.2.1. RouteAttr Sub-TLV . . . . . . . . . . . . . . . . . . 8 79 4.2.2. Sub-TLVs of the Parameters TLV . . . . . . . . . . . 12 80 4.3. Capability Negotiation . . . . . . . . . . . . . . . . . 14 81 5. Operations . . . . . . . . . . . . . . . . . . . . . . . . . 15 82 5.1. Application Scenario . . . . . . . . . . . . . . . . . . 15 83 5.2. About Failure . . . . . . . . . . . . . . . . . . . . . . 16 84 6. Contributors . . . . . . . . . . . . . . . . . . . . . . . . 17 85 7. Security Considerations . . . . . . . . . . . . . . . . . . . 17 86 8. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 17 87 9. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 17 88 9.1. Existing Assignments . . . . . . . . . . . . . . . . . . 17 89 9.2. RouteAttr Atom Type . . . . . . . . . . . . . . . . . . . 18 90 9.3. 
Route Attributes Sub-sub-TLV Registry . . . . . . . . . . 18 91 9.4. Attribute Change Sub-TLV Registry . . . . . . . . . . . . 18 92 10. References . . . . . . . . . . . . . . . . . . . . . . . . . 19 93 10.1. Normative References . . . . . . . . . . . . . . . . . . 19 94 10.2. Informative References . . . . . . . . . . . . . . . . . 20 96 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 20 98 1. Introduction 100 It is difficult to optimize traffic paths in a traditional IP network 101 because of the following: 103 * Complex and error prone configuration. Traffic can only be 104 adjusted device by device. The configurations on all the routers 105 that the traffic traverses need to be changed or added. There are 106 already lots of policies configured on the routers in an 107 operational network. There are different types of policies, which 108 include security, management and control policies. These policies 109 are relatively stable. However, the policies for adjusting 110 traffic are dynamic. Whenever the traffic through a route is not 111 expected, the policies to adjust the traffic for that route are 112 configured on the related routers. It is complex and error prone 113 to dynamically add or change the policies to the existing policies 114 on the special routers to adjust the traffic. 116 * Difficult maintenance. The routing policies used to control 117 network routes are dynamic, posing difficulties to subsequent 118 maintenance. High maintenance skills are required. 120 It is desirable to have an automatic mechanism for setting up routing 121 policies, which can simplify routing policy configuration. This 122 document describes extensions to BGP for Routing Policy Distribution 123 to resolve these issues. 125 2. Terminology 127 The following terminology is used in this document. 
   *  ACL: Access Control List

   *  BGP: Border Gateway Protocol [RFC4271]

   *  FS: Flow Specification

   *  NLRI: Network Layer Reachability Information [RFC4271]

   *  PBR: Policy-Based Routing

   *  RPD: Routing Policy Distribution

   *  VPN: Virtual Private Network

3.  Problem Statement

   Providers need to adjust their business traffic routing policies
   from time to time for the following reasons:

   *  Business development or network failure introduces link congestion
      and overload.

   *  Business changes or network additions produce unused resources
      such as idle links.

   *  Network transmission quality degrades as a result of delay or
      loss, and traffic needs to be moved to other paths.

   *  To control OPEX and CAPEX, they may prefer the transit provider
      with the lower price.

3.1.  Inbound Traffic Control

   In Figure 1, for the reasons above, the provider P of AS100 may wish
   the inbound traffic from AS200 to enter AS100 through link L3 instead
   of the others.  Since P doesn't have any administrative control over
   AS200, there is no way for P to directly modify the route selection
   criteria inside AS200.

                  Traffic from PE1 to Prefix1
              ----------------------------------->

   +-----------------+            +-------------------------+
   |   +---------+   |     L1     | +----+      +----------+|
   |   |Speaker1 |   +------------+ |IGW1|      |policy    ||
   |   +---------+   |**      L2**| +----+      |controller||
   |                 |  **    **  |             +----------+|
   | +---+           |    ****    |                         |
   | |PE1|           |    ****    |                         |
   | +---+           |  **    **  |                         |
   |   +---------+   |**      L3**| +----+                  |
   |   |Speaker2 |   +------------+ |IGW2|     AS100        |
   |   +---------+   |     L4     | +----+                  |
   |                 |            |                         |
   |      AS200      |            |                         |
   |                 |            |   ...
| 185 | | | | 186 | +---------+ | | +----+ +-------+ | 187 | |Speakern | | | |IGWn| |Prefix1| | 188 | +---------+ | | +----+ +-------+ | 189 +-----------------+ +-------------------------+ 191 Prefix1 advertised from AS100 to AS200 192 <---------------------------------------- 194 Figure 1: Inbound Traffic Control case 196 3.2. Outbound Traffic Control 198 In Figure 2, the provider P of AS100 prefers link L3 for the traffic 199 to the destination Prefix2 among multiple exits and links to AS200. 200 This preference can be dynamic and might change frequently because of 201 the reasons above. So, provider P expects an efficient and 202 convenient solution. 204 Traffic from PE2 to Prefix2 205 -----------------------------------> 206 +-------------------------+ +-----------------+ 207 |+----------+ +----+ |L1 | +---------+ | 208 ||policy | |IGW1| +------------+ |Speaker1 | | 209 ||controller| +----+ |** **| +---------+ | 210 |+----------+ |L2** ** | +-------+| 211 | | **** | |Prefix2|| 212 | | **** | +-------+| 213 | |L3** ** | | 214 | AS100 +----+ |** **| +---------+ | 215 | |IGW2| +------------+ |Speaker2 | | 216 | +----+ |L4 | +---------+ | 217 | | | | 218 |+---+ | | AS200 | 219 ||PE2| ... | | | 220 |+---+ | | | 221 | +----+ | | +---------+ | 222 | |IGWn| | | |Speakern | | 223 | +----+ | | +---------+ | 224 +-------------------------+ +-----------------+ 226 Prefix2 advertised from AS200 to AS100 227 <---------------------------------------- 229 Figure 2: Outbound Traffic Control case 231 4. Protocol Extensions 233 This document specifies a solution using a new AFI and SAFI with the 234 BGP Wide Community for encoding a routing policy. 236 4.1. Using a New AFI and SAFI 238 A new AFI and SAFI are defined: the Routing Policy AFI whose 239 codepoint 16398 has been assigned by IANA, and SAFI whose codepoint 240 75 has been assigned by IANA. 
   The AFI and SAFI pair uses a new NLRI, which is defined as follows:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+
   |  NLRI Length  |
   +-+-+-+-+-+-+-+-+
   |  Policy Type  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                   Distinguisher (4 octets)                    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                     Peer IP (4/16 octets)                     ~
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Where:

   NLRI Length:  1 octet giving the length of the NLRI.  If the Length
      is anything other than 9 or 21, the NLRI is corrupt and the
      enclosing UPDATE message MUST be ignored.

   Policy Type:  1 octet indicating the type of a policy.  1 is for
      Export policy.  2 is for Import policy.  If the Policy Type is any
      other value, the NLRI is corrupt and the enclosing UPDATE message
      MUST be ignored.

   Distinguisher:  4-octet unsigned integer that uniquely identifies the
      content/policy.  It is used to order the policies from lower to
      higher distinguisher, and the policies are applied in that order:
      a policy with a lower distinguisher is applied before policies
      with higher distinguishers.

   Peer IP:  4/16-octet value identifying an IPv4/IPv6 peer.  Its
      default value is 0, which indicates that when receiving a BGP
      UPDATE message with the NLRI, a BGP speaker will apply the policy
      in the message to all its IPv4/IPv6 peers.

   Under RPD AFI/SAFI, the RPD routes are stored and ordered according
   to their keys.  Under IPv4/IPv6 Unicast AFI/SAFI, there are IPv4/IPv6
   unicast routes learned and various static policies configured.  In
   addition, there are dynamic RPD policies from the RPD AFI/SAFI when
   RPD is enabled.

   Before advertising an IPv4/IPv6 Unicast AFI/SAFI route, the
   configured policies are applied to it first, and then the RPD Export
   policies are applied.
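   The NLRI layout and validity rules above can be illustrated with a
   short, non-normative Python sketch.  It is not part of the
   specification: the names RpdNlri, CorruptNlri, parse_rpd_nlri and
   application_order are hypothetical, and the sketch assumes the
   caller passes exactly one NLRI, including its leading NLRI Length
   octet.

```python
import ipaddress
import struct
from dataclasses import dataclass
from typing import Iterable, List, Union

EXPORT, IMPORT = 1, 2           # Policy Type values from the NLRI definition
VALID_LENGTHS = {9: 4, 21: 16}  # NLRI Length -> Peer IP size in octets


class CorruptNlri(ValueError):
    """Hypothetical error type: the enclosing UPDATE should be ignored."""


@dataclass(frozen=True)
class RpdNlri:
    policy_type: int
    distinguisher: int
    peer_ip: Union[ipaddress.IPv4Address, ipaddress.IPv6Address]

    @property
    def applies_to_all_peers(self) -> bool:
        # A Peer IP of 0 means the policy applies to all peers.
        return int(self.peer_ip) == 0


def parse_rpd_nlri(data: bytes) -> RpdNlri:
    """Decode one RPD NLRI (assumed to start at the NLRI Length octet)."""
    if not data or data[0] not in VALID_LENGTHS or len(data) != 1 + data[0]:
        raise CorruptNlri("NLRI Length must be 9 or 21 and match the data")
    # Policy Type (1 octet) and Distinguisher (4 octets), network byte order.
    policy_type, distinguisher = struct.unpack_from("!BI", data, 1)
    if policy_type not in (EXPORT, IMPORT):
        raise CorruptNlri("Policy Type must be 1 (Export) or 2 (Import)")
    peer = data[6:]
    if len(peer) == 4:
        peer_ip = ipaddress.IPv4Address(peer)
    else:
        peer_ip = ipaddress.IPv6Address(peer)
    return RpdNlri(policy_type, distinguisher, peer_ip)


def application_order(nlris: Iterable[RpdNlri]) -> List[RpdNlri]:
    """Lower distinguishers are applied first."""
    return sorted(nlris, key=lambda n: n.distinguisher)
```

   In this sketch a length of 9 selects a 4-octet Peer IP and a length
   of 21 selects a 16-octet Peer IP, since the length covers the Policy
   Type (1 octet), the Distinguisher (4 octets) and the Peer IP.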
288 The NLRI containing the Routing Policy is carried in MP_REACH_NLRI 289 and MP_UNREACH_NLRI path attributes in a BGP UPDATE message, which 290 MUST also contain the BGP mandatory attributes and MAY contain some 291 BGP optional attributes. 293 When receiving a BGP UPDATE message with routing policy, a BGP 294 speaker processes it as follows: 296 * If the peer IP in the NLRI is 0, then apply the routing policy to 297 all the remote peers of this BGP speaker. 299 * If the peer IP in the NLRI is non-zero, then the IP address 300 indicates a remote peer of this BGP speaker and the routing policy 301 will be applied to it. 303 The content of the Routing Policy is encoded in a BGP Wide Community. 305 4.2. BGP Wide Community and Atoms 307 The BGP wide community is defined in 308 [I-D.ietf-idr-wide-bgp-communities]. It can be used to facilitate 309 the delivery of new network services and be extended easily for 310 distributing different kinds of routing policies. 312 A wide community Atom is a TLV (or sub-TLV), which may be included in 313 a BGP wide community container (or BGP wide community for short) 314 containing some BGP Wide Community TLVs. Three BGP Wide Community 315 TLVs are defined in [I-D.ietf-idr-wide-bgp-communities], which are 316 BGP Wide Community Target(s) TLV, Exclude Target(s) TLV, and 317 Parameter(s) TLV. The value of each of these TLVs comprises a series 318 of Atoms, each of which is a TLV (or sub-TLV). A new wide community 319 Atom is defined for BGP Wide Community Target(s) TLV and a few new 320 Atoms are defined for BGP Wide Community Parameter(s) TLV. 
For your 321 reference, the format of the TLV is illustrated below: 323 0 1 2 3 324 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 325 +-+-+-+-+-+-+-+-++-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 326 | Type | Length | 327 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 328 | Value (variable) ~ 329 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 331 Format of Wide Community Atom TLV 333 4.2.1. RouteAttr Sub-TLV 335 A RouteAttr Atom sub-TLV (or RouteAttr sub-TLV for short) is defined 336 and may be included in a Target TLV. It has the following format. 338 0 1 2 3 339 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 340 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 341 | Type (TBD1) | Length (variable) | 342 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 343 | sub-sub-TLVs ~ 344 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 346 Format of RouteAttr Atom sub-TLV 348 The Type for RouteAttr is TBD1. In RouteAttr sub-TLV, four sub-sub- 349 TLVs are defined: IPv4 Prefix, IPv6 Prefix, AS-Path, and Community 350 sub-sub-TLV. 352 An IP prefix sub-sub-TLV gives matching criteria on IPv4 prefixes. 353 Its format is illustrated below: 355 0 1 2 3 356 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 357 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 358 | Type 1 | Length (N x 8) |M-Type | Flags | 359 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 360 | IPv4 Address | 361 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 362 | Mask | GeMask | LeMask |M-Type | Flags | 363 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 364 ~ . . . 
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                         IPv4 Address                          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     Mask      |    GeMask     |    LeMask     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                 Format of IPv4 Prefix sub-sub-TLV

   Type:  1 for IPv4 Prefix.

   Length:  N x 8, where N is the number of tuples <M-Type, Flags, IPv4
      Address, Mask, GeMask, LeMask>.  If Length is not a multiple of 8,
      the Atom is corrupt and the enclosing UPDATE message MUST be
      ignored.

   M-Type:  4-bit field specifying match type.  The following four
      values are defined:

      M-Type = 0:  Exact match with the Mask length IP address prefix.
         GeMask and LeMask MUST be sent as zero and ignored on receipt.

      M-Type = 1:  Match prefixes whose length is greater than or equal
         to GeMask.

      M-Type = 2:  Match prefixes whose length is less than or equal to
         LeMask.

      M-Type = 3:  Match prefixes whose length is within the range from
         GeMask to LeMask.

   Flags:  4 bits.  No flags are currently defined.

   IPv4 Address:  4 octets for an IPv4 address.

   Mask:  1 octet for the IP address prefix length.

   GeMask:  1 octet for the lower bound of the match range.  It must
      be 0 or not less than Mask.

   LeMask:  1 octet for the upper bound of the match range.  It must
      be 0 or greater than Mask.

   For example, tuple <M-Type=0, Flags=0, IPv4 Address = 1.1.0.0,
   Mask = 22, GeMask = 0, LeMask = 0> represents an exact IP prefix
   match for 1.1.0.0/22.

   <M-Type=1, Flags=0, IPv4 Address = 1.1.0.0, Mask = 24, GeMask = 24,
   LeMask = 0> represents match IP prefix 1.1.0.0/24 greater-equal 24
   (i.e., 1.1.0.0/24 or 1.1.0.0/25 or ... or 1.1.0.0/32).

   <M-Type=2, Flags=0, IPv4 Address = 17.1.0.0, Mask = 24, GeMask = 0,
   LeMask = 26> represents match IP prefix 17.1.0.0/24 less-equal 26
   (i.e., 17.1.0.0/24 or 17.1.0.0/25 or 17.1.0.0/26).

   <M-Type=3, Flags=0, IPv4 Address = 18.1.0.0, Mask = 24, GeMask = 24,
   LeMask = 32> represents match IP prefix 18.1.0.0/24 greater-equal 24
   and less-equal 32 (i.e., 18.1.0.0/24 or 18.1.0.0/25 or ... or
   18.1.0.0/32).

   Similarly, an IPv6 Prefix sub-sub-TLV represents match criteria on
   IPv6 prefixes.
Its format is illustrated below: 424 0 1 2 3 425 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 426 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 427 | Type 4 | Length (N x 20) |M-Type | Flags | 428 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 429 | IPv6 Address (16 octets) ~ 430 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 431 | Mask | GeMask | LeMask |M-Type | Flags | 432 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 433 ~ . . . 434 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 435 | IPv6 Address (16 octets ~ 436 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 437 | Mask | GeMask | LeMask | 438 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 440 Format of IPv6 Prefix sub-sub-TLV 442 An AS-Path sub-sub-TLV represents a match criteria in a regular 443 expression string. Its format is illustrated below: 445 0 1 2 3 446 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 447 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 448 | Type 2 | Length (Variable) | 449 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 450 | AS-Path Regex String | 451 : : 452 | ~ 453 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 455 Format of AS Path sub-sub-TLV 457 Type: 2 for AS-Path. 459 Length: Variable, maximum is 1024. 461 AS-Path Regex String: AS-Path regular expression string. 463 A community sub-sub-TLV represents a list of communities to be 464 matched all. Its format is illustrated below: 466 0 1 2 3 467 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 468 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 469 | Type 3 | Length (N x 4 + 1) | Flags | 470 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 471 | Community 1 Value | 472 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 473 ~ . . . 
~ 474 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 475 | Community N Value | 476 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 478 Format of Community sub-sub-TLV 480 Type: 3 for Community. 482 Length: N x 4 + 1, where N is the number of communities. If Length 483 is not a multiple of 4 plus 1, the Atom is corrupt and the 484 enclosing UPDATE MUST be ignored. 486 Flags: 1 octet. No flags are currently defined. These bits MUST be 487 sent as zero and ignored on receipt. 489 4.2.2. Sub-TLVs of the Parameters TLV 491 This document introduces 2 community values: 493 MATCH AND SET ATTR: If the IPv4/IPv6 unicast routes to a remote peer 494 match the specific conditions defined in the routing policy 495 extracted from the RPD route, then the attributes of the IPv4/IPv6 496 unicast routes will be modified when sending to the remote peer 497 per the actions defined in the RPD route. 499 MATCH AND NOT ADVERTISE: If the IPv4/IPv6 unicast routes to a remote 500 peer match the specific conditions defined in the routing policy 501 extracted from the RPD route, then the IPv4/IPv6 unicast routes 502 will not be advertised to the remote peer. 504 For the Parameter(s) TLV, two action sub-TLVs are defined: MED change 505 sub-TLV and AS-Path change sub-TLV. When the community in the 506 container is MATCH AND SET ATTR, the Parameter(s) TLV can include 507 these sub-TLVs. When the community is MATCH AND NOT ADVERTISE, the 508 Parameter(s) TLV's value is empty. 510 A MED change sub-TLV indicates an action to change the MED. 
   Its format is illustrated below:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    Type 1     |          Length (5)           |      OP       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                             Value                             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                    Format of MED Change sub-TLV

   Type:  1 for MED Change.

   Length:  5.  If Length is any other value, the sub-TLV is corrupt and
      the enclosing UPDATE MUST be ignored.

   OP:  1 octet.  Three values are defined:

      OP = 0:  assign the Value to the existing MED.

      OP = 1:  add the Value to the existing MED.  If the sum is greater
         than the maximum value for MED, assign the maximum value to
         MED.

      OP = 2:  subtract the Value from the existing MED.  If the
         existing MED minus the Value is less than 0, assign 0 to MED.

      If OP is any other value, the sub-TLV is ignored.

   Value:  4 octets.

   An AS-Path change sub-TLV indicates an action to change the AS-Path.
   Its format is illustrated below:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    Type 2     |        Length (n x 5)         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                              AS1                              |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    Count1     |
   +-+-+-+-+-+-+-+-+
   ~                             . . .
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                              ASn                              |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    Countn     |
   +-+-+-+-+-+-+-+-+

                  Format of AS-Path Change sub-TLV

   Type:  2 for AS-Path Change.

   Length:  n x 5.  If Length is not a multiple of 5, the sub-TLV is
      corrupt and the enclosing UPDATE MUST be ignored.

   ASi:  4 octets.  An AS number.

   Counti:  1 octet.  ASi repeats Counti times.

   The sequence of AS numbers is added to the existing AS Path.

4.3.
Capability Negotiation

   It is necessary to negotiate the capability to support BGP Extensions
   for Routing Policy Distribution (RPD).  The BGP RPD Capability is a
   new BGP capability [RFC5492].  The Capability Code for this
   capability is 72, assigned by IANA.  The Capability Length field
   of this capability is variable.  The Capability Value field consists
   of one or more of the following tuples:

   +--------------------------------------------------+
   |  Address Family Identifier (2 octets)            |
   +--------------------------------------------------+
   |  Subsequent Address Family Identifier (1 octet)  |
   +--------------------------------------------------+
   |  Send/Receive (1 octet)                          |
   +--------------------------------------------------+

                        BGP RPD Capability

   The meaning and use of the fields are as follows:

   Address Family Identifier (AFI):  This field is the same as the one
      used in [RFC4760].

   Subsequent Address Family Identifier (SAFI):  This field is the same
      as the one used in [RFC4760].

   Send/Receive:  This field indicates whether the sender is (a) willing
      to receive Routing Policies from its peer (value 1), (b) would
      like to send Routing Policies to its peer (value 2), or (c) both
      (value 3) for the <AFI, SAFI>.  If Send/Receive is any other
      value, that tuple is ignored but any other tuples present are
      still used.

5.  Operations

   This section presents a typical application scenario and some details
   about handling a related failure.

5.1.  Application Scenario

   Figure 3 illustrates a typical scenario, where RPD is used by a
   controller with a Route Reflector (RR) to adjust traffic dynamically.

             +--------------+
             |  Controller  |
             +-------+------+
                      \
                       \ RPD
                     .--\._.+--+                        ___...__
               __(       \      '.---...
                                                       (         )
              /        RR o -------- A o) ----------  (o X   AS2 )
             (o E         |\             )      _____//(___   ___)
              (           | \_______ B o)  ____/    /      '''
               (o F        \             )     ____/
                (           \_____ C o) ______/          ___...__
                 '   AS1            _)  \_____          (         )
                  '---._.-.       )           \_______ (o Y   AS3 )
                            '---'                       (___   ___)
                                                            '''

            Figure 3: Controller with RR Adjusts Traffic

   The controller connects to the RR through a BGP session.  There is a
   BGP session between the RR and each of routers A, B and C in AS1, as
   shown in the figure.  Other sessions in AS1 are not shown in the
   figure.

   Router X is in AS2.  There is a BGP session between X and each of
   routers A, B and C in AS1.

   Router Y is in AS3.  There is a BGP session between Y and router C
   in AS1.

   The controller sends an RPD route to the RR.  After receiving the RPD
   route from the controller, the RR reflects the RPD route to routers
   A, B and C.  After receiving the RPD route from the RR, routers A, B
   and C extract the routing policy from the RPD route.  If the peer IP
   in the NLRI of the RPD route is 0, the routing policy is applied to
   all the remote peers of routers A, B and C.  If the peer IP in the
   NLRI of the RPD route is non-zero, the IP address indicates a remote
   peer of routers A, B and C, and the routing policy is applied to
   that specific remote peer.  The IPv4/IPv6 unicast routes towards
   router X in AS2 and router Y in AS3 will be adjusted based on the
   routing policy sent by the controller via an RPD route.

   The controller uses the Route Target (RT) extended community to
   notify a router whether to receive an RPD policy.  For example, if
   there is no adjustment on router B, the controller sends RPD routes
   with the RTs for A and C.  B will not receive the routes.

   The process of adjusting traffic in a network is a closed loop.  The
   loop starts from the controller with some traffic expectations on a
   set of routes.
   The controller obtains the information about traffic flows for the
   related routes.  It analyzes the traffic and checks whether the
   current traffic flows meet the expectations.  If the expectations
   are not met, the controller adjusts the traffic.  The loop then
   returns to its start (the controller obtains the information about
   traffic ...).

5.2.  About Failure

   An RPD route is not a configuration.  When it is sent to a router, no
   acknowledgement is needed from the router.  The existing BGP
   mechanisms are reused for delivering an RPD route.  Once the route
   is sent to a router, its delivery will be successful.  This is
   guaranteed by BGP.

   If the router fails to install the route locally, this failure is a
   bug of the router.  The bug needs to be fixed.

   The errors mentioned in [RFC7606] are handled according to
   [RFC7606].  These errors are bugs, which need to be resolved.

   Regarding the failure of the controller, some existing mechanisms
   such as BGP GR [RFC4724] and BGP Long-lived Graceful Restart (LLGR)
   can be used to let the router keep the routes from the controller for
   some time.

   With support of "Long-lived Graceful Restart Capability"
   [I-D.ietf-idr-long-lived-gr], the routes can be retained for a longer
   time after the controller fails.

   In the worst case, the controller fails and the RPD routes for
   adjusting the traffic are withdrawn.  The adjusted/redirected traffic
   may then take its old path.  This should be acceptable.

6.  Contributors

   The following people have substantially contributed to the definition
   of the BGP-FS RPD and to the editing of this document:

   Peng Zhou
   Huawei
   Email: Jewpon.zhou@huawei.com

7.  Security Considerations

   Protocol extensions defined in this document do not affect BGP
   security other than as discussed in the Security Considerations
   section of [RFC5575].

8.
Acknowledgements 718 The authors would like to thank Acee Lindem, Jeff Haas, Jie Dong, 719 Lucy Yong, Qiandeng Liang, Zhenqiang Li, Robert Raszuk, Donald 720 Eastlake, Ketan Talaulikar, and Jakob Heitz for their comments to 721 this work. 723 9. IANA Considerations 725 9.1. Existing Assignments 727 IANA has assigned an AFI of value 16398 from the registry "Address 728 Family Numbers" for Routing Policy. 730 IANA has assigned a SAFI of value 75 from the registry "Subsequent 731 Address Family Identifiers (SAFI) Parameters" for Routing Policy. 733 IANA has assigned a Code Point of value 72 from the registry 734 "Capability Codes" for Routing Policy Distribution. 736 9.2. RouteAttr Atom Type 738 IANA is requested to assign a code-point from the registry "BGP 739 Community Container Atom Types" as follows: 741 +---------------------+------------------------------+-------------+ 742 | Atom Code Point | Description | Reference | 743 +---------------------+------------------------------+-------------+ 744 | TBD1 (48 suggested) | RouteAttr Atom |This document| 745 +---------------------+------------------------------+-------------+ 747 9.3. Route Attributes Sub-sub-TLV Registry 749 IANA is requested to create a registry called "Route Attributes Sub- 750 sub-TLV" under RouteAttr Atom Sub-TLV. The allocation policy of this 751 registry is "First Come First Served (FCFS)". 
753 The initial code points are as follows: 755 +-------------+-----------------------------------+-------------+ 756 | Code Point | Description | Reference | 757 +-------------+-----------------------------------+-------------+ 758 | 0 | Reserved | | 759 +-------------+-----------------------------------+-------------+ 760 | 1 | IPv4 Prefix Sub-sub-TLV |This document| 761 +-------------+-----------------------------------+-------------+ 762 | 2 | AS-Path Sub-sub-TLV |This document| 763 +-------------+-----------------------------------+-------------+ 764 | 3 | Community Sub-sub-TLV |This document| 765 +-------------+-----------------------------------+-------------+ 766 | 4 | IPv6 Prefix Sub-sub-TLV |This document| 767 +-------------+-----------------------------------+-------------+ 768 | 5 - 255 | Available | | 769 +-------------+-----------------------------------+-------------+ 771 9.4. Attribute Change Sub-TLV Registry 773 IANA is requested to create a registry called "Attribute Change Sub- 774 TLV" under Parameter(s) TLV. The allocation policy of this registry 775 is "First Come First Served (FCFS)". 777 Initial code points are as follows: 779 +-------------+-----------------------------------+-------------+ 780 | Code Point | Description | Reference | 781 +-------------+-----------------------------------+-------------+ 782 | 0 | Reserved | | 783 +-------------+-----------------------------------+-------------+ 784 | 1 | MED Change Sub-TLV |This document| 785 +-------------+-----------------------------------+-------------+ 786 | 2 | AS-Path Change Sub-TLV |This document| 787 +-------------+-----------------------------------+-------------+ 788 | 3 - 255 | Available | | 789 +-------------+-----------------------------------+-------------+ 791 10. References 793 10.1. Normative References 795 [I-D.ietf-idr-wide-bgp-communities] 796 Raszuk, R., Haas, J., Lange, A., Decraene, B., Amante, S., 797 and P. 
Jakma, "BGP Community Container Attribute", Work in
798               Progress, Internet-Draft, draft-ietf-idr-wide-bgp-
799               communities-05, 2 July 2018.

803    [RFC1997]  Chandra, R., Traina, P., and T. Li, "BGP Communities
804               Attribute", RFC 1997, DOI 10.17487/RFC1997, August 1996.

807    [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
808               Requirement Levels", BCP 14, RFC 2119,
809               DOI 10.17487/RFC2119, March 1997.

812    [RFC4271]  Rekhter, Y., Ed., Li, T., Ed., and S. Hares, Ed., "A
813               Border Gateway Protocol 4 (BGP-4)", RFC 4271,
814               DOI 10.17487/RFC4271, January 2006.

817    [RFC4760]  Bates, T., Chandra, R., Katz, D., and Y. Rekhter,
818               "Multiprotocol Extensions for BGP-4", RFC 4760,
819               DOI 10.17487/RFC4760, January 2007.

822    [RFC5492]  Scudder, J. and R. Chandra, "Capabilities Advertisement
823               with BGP-4", RFC 5492, DOI 10.17487/RFC5492, February
824               2009.

826    [RFC5575]  Marques, P., Sheth, N., Raszuk, R., Greene, B., Mauch, J.,
827               and D. McPherson, "Dissemination of Flow Specification
828               Rules", RFC 5575, DOI 10.17487/RFC5575, August 2009.

831    [RFC8126]  Cotton, M., Leiba, B., and T. Narten, "Guidelines for
832               Writing an IANA Considerations Section in RFCs", BCP 26,
833               RFC 8126, DOI 10.17487/RFC8126, June 2017.

836    [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
837               2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
838               May 2017.

840 10.2.  Informative References

842    [I-D.ietf-idr-long-lived-gr]
843               Uttaro, J., Chen, E., Decraene, B., and J. G. Scudder,
844               "Support for Long-lived BGP Graceful Restart", Work in
845               Progress, Internet-Draft, draft-ietf-idr-long-lived-gr-00,
846               5 September 2019.

849    [I-D.ietf-idr-registered-wide-bgp-communities]
850               Raszuk, R. and J. Haas, "Registered Wide BGP Community
851               Values", Work in Progress, Internet-Draft, draft-ietf-idr-
852               registered-wide-bgp-communities-02, 31 May 2016.

856    [RFC4724]  Sangli, S., Chen, E., Fernando, R., Scudder, J., and Y.
857               Rekhter, "Graceful Restart Mechanism for BGP", RFC 4724,
858               DOI 10.17487/RFC4724, January 2007.

861    [RFC7606]  Chen, E., Ed., Scudder, J., Ed., Mohapatra, P., and K.
862               Patel, "Revised Error Handling for BGP UPDATE Messages",
863               RFC 7606, DOI 10.17487/RFC7606, August 2015.

866 Authors' Addresses

867    Zhenbin Li
868    Huawei
869    Huawei Bld., No.156 Beiqing Rd.
870    Beijing
871    100095
872    China

874    Email: lizhenbin@huawei.com

876    Liang Ou
877    China Telcom Co., Ltd.
878    109 West Zhongshan Ave, Tianhe District
879    Guangzhou
880    510630
881    China

883    Email: ouliang@chinatelecom.cn

885    Yujia Luo
886    China Telcom Co., Ltd.
887    109 West Zhongshan Ave, Tianhe District
888    Guangzhou
889    510630
890    China

892    Email: luoyuj@sdu.edu.cn

894    Sujian Lu
895    Tencent
896    Tengyun Building, Tower A, No. 397 Tianlin Road
897    Shanghai
898    Xuhui District, 200233
899    China

901    Email: jasonlu@tencent.com

903    Gyan S. Mishra
904    Verizon Inc.
905    13101 Columbia Pike
906    Silver Spring, MD 20904
907    United States of America

909    Phone: 301 502-1347
910    Email: gyan.s.mishra@verizon.com

911    Huaimo Chen
912    Futurewei
913    Boston, MA
914    United States of America

916    Email: Huaimo.chen@futurewei.com

918    Shunwan Zhuang
919    Huawei
920    Huawei Bld., No.156 Beiqing Rd.
921    Beijing
922    100095
923    China

925    Email: zhuangshunwan@huawei.com

927    Haibo Wang
928    Huawei
929    Huawei Bld., No.156 Beiqing Rd.
930    Beijing
931    100095
932    China

934    Email: rainsword.wang@huawei.com