Network Working Group                                              Z. Li
Internet-Draft                                                    Huawei
Intended status: Standards Track                                   L. Ou
Expires: November 23, 2021                                        Y. Luo
                                                  China Telcom Co., Ltd.
                                                                   S. Lu
                                                                 Tencent
                                                               G. Mishra
                                                            Verizon Inc.
                                                                 H. Chen
                                                               Futurewei
                                                               S. Zhuang
                                                                 H. Wang
                                                                  Huawei
                                                            May 22, 2021


          BGP Extensions for Routing Policy Distribution (RPD)
                         draft-ietf-idr-rpd-11

Abstract

   It is hard to adjust traffic and optimize traffic paths in a
   traditional IP network from time to time through manual
   configurations. It is desirable to have a mechanism for setting up
   routing policies, which adjusts traffic and optimizes traffic paths
   automatically. This document describes BGP Extensions for Routing
   Policy Distribution (BGP RPD) to support this.

Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119] [RFC8174]
   when, and only when, they appear in all capitals, as shown here.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on November 23, 2021.

Copyright Notice

   Copyright (c) 2021 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Terminology
   3.  Problem Statement
     3.1.  Inbound Traffic Control
     3.2.  Outbound Traffic Control
   4.  Protocol Extensions
     4.1.  Using a New AFI and SAFI
     4.2.  BGP Wide Community and Atoms
       4.2.1.  RouteAttr TLV/sub-TLV
       4.2.2.  Sub-TLVs of the Parameters TLV
     4.3.  Capability Negotiation
   5.  Operations
     5.1.  Application Scenario
     5.2.  About Failure
   6.  Contributors
   7.  Security Considerations
   8.  Acknowledgements
   9.  IANA Considerations
     9.1.  Existing Assignments
     9.2.  Routing Policy Type Registry
     9.3.  RouteAttr Atom Type
     9.4.  Route Attributes Sub-TLV Registry
     9.5.  Attribute Change Sub-TLV Registry
   10. References
     10.1.  Normative References
     10.2.  Informative References
   Authors' Addresses

1.  Introduction

   It is difficult to optimize traffic paths in a traditional IP network
   because of the following:

   o  Complex and error-prone configuration. Traffic can only be
      adjusted device by device. The configurations on all the routers
      that the traffic traverses need to be changed or added. There are
      already many policies configured on the routers in an
      operational network. There are different types of policies, which
      include security, management and control policies. These policies
      are relatively stable. However, the policies for adjusting
      traffic are dynamic. Whenever the traffic through a route is not
      as expected, the policies to adjust the traffic for that route are
      configured on the related routers. It is complex and error-prone
      to dynamically add policies to, or change, the existing policies
      on the affected routers to adjust the traffic.

   o  Difficult maintenance. The routing policies used to control
      network routes are dynamic, which makes subsequent maintenance
      difficult. A high level of maintenance skill is required.

   It is desirable to have an automatic mechanism for setting up routing
   policies, which can simplify routing policy configuration. This
   document describes extensions to BGP for Routing Policy Distribution
   to resolve these issues.

2.  Terminology

   The following terminology is used in this document.

   o  ACL: Access Control List

   o  BGP: Border Gateway Protocol [RFC4271]

   o  FS: Flow Specification

   o  NLRI: Network Layer Reachability Information [RFC4271]

   o  PBR: Policy-Based Routing

   o  RPD: Routing Policy Distribution

   o  VPN: Virtual Private Network

3.  Problem Statement

   Providers need to adjust their business traffic routing policies
   from time to time because of the following:

   o  Business development or network failure introduces link congestion
      and overload.

   o  Business changes or network additions produce unused resources
      such as idle links.

   o  Network transmission quality degrades as a result of delay or
      loss, and providers need to move traffic to other paths.

   o  To control OPEX and CAPEX, they may prefer the transit provider
      with the lower price.

3.1.  Inbound Traffic Control

   In Figure 1, for the reasons above, the provider P of AS100 may wish
   the inbound traffic from AS200 to enter AS100 through link L3 instead
   of the others. Since P doesn't have any administrative control over
   AS200, there is no way for P to directly modify the route selection
   criteria inside AS200.

                   Traffic from PE1 to Prefix1
               ----------------------------------->

    +-----------------+            +-------------------------+
    |   +---------+   |     L1     | +----+      +----------+|
    |   |Speaker1 |   +------------+ |IGW1|      |policy    ||
    |   +---------+   |**      L2**| +----+      |controller||
    |                 |  **    **  |             +----------+|
    | +---+           |    ****    |                         |
    | |PE1|           |    ****    |                         |
    | +---+           |  **    **  |                         |
    |   +---------+   |**      L3**| +----+                  |
    |   |Speaker2 |   +------------+ |IGW2|     AS100        |
    |   +---------+   |     L4     | +----+                  |
    |                 |            |                         |
    |     AS200       |            |                         |
    |                 |            |   ...                   |
    |                 |            |                         |
    |   +---------+   |            | +----+      +-------+   |
    |   |Speakern |   |            | |IGWn|      |Prefix1|   |
    |   +---------+   |            | +----+      +-------+   |
    +-----------------+            +-------------------------+

               Prefix1 advertised from AS100 to AS200
               <----------------------------------------

                  Figure 1: Inbound Traffic Control case

3.2.  Outbound Traffic Control

   In Figure 2, the provider P of AS100 prefers link L3 for the traffic
   to the destination Prefix2 among multiple exits and links to AS200.
   This preference can be dynamic and might change frequently for
   the reasons above. Provider P therefore expects an efficient and
   convenient solution.

                   Traffic from PE2 to Prefix2
               ----------------------------------->

    +-------------------------+            +-----------------+
    |+----------+      +----+ |L1          |   +---------+   |
    ||policy    |      |IGW1| +------------+   |Speaker1 |   |
    ||controller|      +----+ |**        **|   +---------+   |
    |+----------+             |L2**    **  |         +-------+|
    |                         |    ****    |         |Prefix2||
    |                         |    ****    |         +-------+|
    |                         |L3**    **  |                 |
    |     AS100        +----+ |**        **|   +---------+   |
    |                  |IGW2| +------------+   |Speaker2 |   |
    |                  +----+ |L4          |   +---------+   |
    |                         |            |                 |
    |+---+                    |            |     AS200       |
    ||PE2|     ...            |            |                 |
    |+---+                    |            |                 |
    |                  +----+ |            |   +---------+   |
    |                  |IGWn| |            |   |Speakern |   |
    |                  +----+ |            |   +---------+   |
    +-------------------------+            +-----------------+

               Prefix2 advertised from AS200 to AS100
               <----------------------------------------

                  Figure 2: Outbound Traffic Control case

4.  Protocol Extensions

   This document specifies a solution using a new AFI and SAFI with the
   BGP Wide Community for encoding a routing policy.

4.1.  Using a New AFI and SAFI

   A new AFI and SAFI are defined: the Routing Policy AFI, whose
   codepoint 16398 has been assigned by IANA, and the Routing Policy
   SAFI, whose codepoint 75 has been assigned by IANA.

   The AFI and SAFI pair uses a new NLRI, which is defined as follows:

      0                   1                   2                   3
      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
     +-+-+-+-+-+-+-+-+
     |  NLRI Length  |
     +-+-+-+-+-+-+-+-+
     |  Policy Type  |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                   Distinguisher (4 octets)                    |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                     Peer IP (4/16 octets)                     ~
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Where:

   NLRI Length: 1 octet giving the length of the NLRI, that is, of the
      octets that follow the NLRI Length field. If the Length is
      anything other than 9 or 21, the NLRI is corrupt and the
      enclosing UPDATE message MUST be ignored.

   Policy Type: 1 octet indicating the type of a policy. 1 is for
      Export policy. 2 is for Import policy. If the Policy Type is any
      other value, the NLRI is corrupt and the enclosing UPDATE message
      MUST be ignored.

   Distinguisher: 4-octet value that uniquely identifies the content/
      policy. It is used to order the policies from the lower to the
      higher distinguisher, and the policies are applied in that order:
      the policy with a lower/smaller distinguisher is applied before
      the policies with higher/larger distinguishers.

   Peer IP: 4/16-octet value that indicates an IPv4/IPv6 peer. Its
      default value is 0, which indicates that when receiving a BGP
      UPDATE message with the NLRI, a BGP speaker will apply the policy
      in the message to all its IPv4/IPv6 peers.

   Under RPD AFI/SAFI, the RPD routes are stored and ordered according
   to their keys. Under IPv4/IPv6 Unicast AFI/SAFI, there are IPv4/IPv6
   unicast routes learned and various static policies configured. In
   addition, there are dynamic RPD policies from the RPD AFI/SAFI when
   RPD is enabled.

   Before advertising an IPv4/IPv6 Unicast AFI/SAFI route, the
   configured policies are applied to it first, and then the RPD Export
   policies are applied.
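
   As an illustration only, the NLRI layout and the validity rules
   above can be sketched in Python. The names RpdNlri, parse_rpd_nlri,
   applies_to_peer and in_application_order are hypothetical and are
   not defined by this document. The sketch assumes network byte order
   for the Distinguisher, signals a corrupt NLRI with ValueError so
   that a caller can ignore the enclosing UPDATE, and assumes that a
   Peer IP of 0 selects every peer regardless of address family.

```python
import ipaddress
import struct
from dataclasses import dataclass
from typing import List, Union

IPAddress = Union[ipaddress.IPv4Address, ipaddress.IPv6Address]

POLICY_TYPE_EXPORT = 1
POLICY_TYPE_IMPORT = 2


@dataclass(frozen=True)
class RpdNlri:
    """Hypothetical in-memory form of one RPD NLRI."""
    policy_type: int
    distinguisher: int
    peer_ip: IPAddress


def parse_rpd_nlri(data: bytes) -> RpdNlri:
    """Decode one RPD NLRI; raise ValueError if it is corrupt."""
    if len(data) < 1:
        raise ValueError("missing NLRI Length octet")
    length = data[0]
    # Policy Type (1) + Distinguisher (4) + Peer IP (4 or 16) = 9 or 21.
    if length not in (9, 21):
        raise ValueError(f"bad NLRI Length {length}")
    if len(data) != 1 + length:
        raise ValueError("NLRI is truncated or has trailing octets")
    policy_type = data[1]
    if policy_type not in (POLICY_TYPE_EXPORT, POLICY_TYPE_IMPORT):
        raise ValueError(f"bad Policy Type {policy_type}")
    # Assumption: the Distinguisher is sent in network byte order.
    (distinguisher,) = struct.unpack_from("!I", data, 2)
    raw_peer = data[6:]
    if len(raw_peer) == 4:
        peer_ip: IPAddress = ipaddress.IPv4Address(raw_peer)
    else:
        peer_ip = ipaddress.IPv6Address(raw_peer)
    return RpdNlri(policy_type, distinguisher, peer_ip)


def applies_to_peer(nlri: RpdNlri, peer: IPAddress) -> bool:
    """A Peer IP of 0 selects all peers; otherwise only the named peer."""
    return nlri.peer_ip.is_unspecified or nlri.peer_ip == peer


def in_application_order(nlris: List[RpdNlri]) -> List[RpdNlri]:
    """Lower Distinguisher values are applied first."""
    return sorted(nlris, key=lambda n: n.distinguisher)
```

   A caller would typically catch ValueError from parse_rpd_nlri and
   ignore the whole UPDATE, then apply the surviving Export policies in
   the order returned by in_application_order after the locally
   configured policies.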

   The NLRI containing the Routing Policy is carried in the
   MP_REACH_NLRI and MP_UNREACH_NLRI path attributes in a BGP UPDATE
   message, which MUST also contain the BGP mandatory attributes and
   MAY contain some BGP optional attributes.

   When receiving a BGP UPDATE message with a routing policy, a BGP
   speaker processes it as follows:

   o  If the peer IP in the NLRI is 0, then the routing policy is
      applied to all the remote peers of this BGP speaker.

   o  If the peer IP in the NLRI is non-zero, then the IP address
      indicates a remote peer of this BGP speaker and the routing policy
      is applied to that peer.

   The content of the Routing Policy is encoded in a BGP Wide Community.

4.2.  BGP Wide Community and Atoms

   The BGP wide community is defined in
   [I-D.ietf-idr-wide-bgp-communities]. It can be used to facilitate
   the delivery of new network services and can be extended easily for
   distributing different kinds of routing policies.

   A wide community Atom is a TLV (or sub-TLV), which may be included in
   a BGP wide community container (or BGP wide community for short)
   containing some BGP Wide Community TLVs. Three BGP Wide Community
   TLVs are defined in [I-D.ietf-idr-wide-bgp-communities]: the
   BGP Wide Community Target(s) TLV, Exclude Target(s) TLV, and
   Parameter(s) TLV. The value of each of these TLVs comprises a series
   of Atoms, each of which is a TLV (or sub-TLV). A new wide community
   Atom is defined for the BGP Wide Community Target(s) TLV and a few
   new Atoms are defined for the BGP Wide Community Parameter(s) TLV.
   For reference, the format of the TLV is illustrated below:

      0                   1                   2                   3
      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |     Type      |            Length             |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                       Value (variable)                        ~
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                   Format of Wide Community Atom TLV

4.2.1.  RouteAttr TLV/sub-TLV

   A RouteAttr Atom TLV (or RouteAttr TLV/sub-TLV for short) is defined
   and may be included in a Target TLV. It has the following format.

      0                   1                   2                   3
      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |  Type (TBD1)  |       Length (variable)       |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                           sub-TLVs                            ~
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                      Format of RouteAttr Atom TLV

   The Type for RouteAttr is TBD1. In the RouteAttr TLV, four sub-TLVs
   are defined: the IPv4 Prefix, IPv6 Prefix, AS-Path, and Community
   sub-TLVs.

   An IPv4 Prefix sub-TLV gives matching criteria on IPv4 prefixes. Its
   format is illustrated below:

      0                   1                   2                   3
      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |    Type 1     |        Length (N x 8)         |M-Type | Flags |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                         IPv4 Address                          |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |     Mask      |    GeMask     |    LeMask     |M-Type | Flags |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     ~                             . . .                             ~
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                         IPv4 Address                          |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |     Mask      |    GeMask     |    LeMask     |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                     Format of IPv4 Prefix sub-TLV

   Type: 1 for IPv4 Prefix.

   Length: N x 8, where N is the number of (M-Type, Flags, IPv4
      Address, Mask, GeMask, LeMask) tuples. If Length is not a
      multiple of 8, the Atom is corrupt and the enclosing UPDATE
      message MUST be ignored.

   M-Type: 4 bits for match types, four of which are defined:

      M-Type = 0: Exact match.

      M-Type = 1: Match prefix greater than or equal to the given
         masks.

      M-Type = 2: Match prefix less than or equal to the given masks.

      M-Type = 3: Match prefix within the range of the given masks.

   Flags: 4 bits. No flags are currently defined.

   IPv4 Address: 4 octets for an IPv4 address.

   Mask: 1 octet for the mask length.

   GeMask: 1 octet for the lower bound of the match range. It must
      either be 0 or be no less than Mask.

   LeMask: 1 octet for the upper bound of the match range. It must
      either be 0 or be greater than Mask.

   For example, a tuple with M-Type = 0, IPv4 Address = 1.1.0.0 and
   Mask = 22 represents an exact IP prefix match for 1.1.0.0/22.

   A tuple with M-Type = 1, IPv4 Address = 1.1.0.0, Mask = 24 and
   GeMask = 24 represents match IP prefix 1.1.0.0/24 greater-equal 24.

   A tuple with M-Type = 2, IPv4 Address = 17.1.0.0, Mask = 24 and
   LeMask = 26 represents match IP prefix 17.1.0.0/24 less-equal 26.

   A tuple with M-Type = 3, IPv4 Address = 18.1.0.0, Mask = 24,
   GeMask = 24 and LeMask = 32 represents match IP prefix 18.1.0.0/24
   greater-equal 24 and less-equal 32.

   Similarly, an IPv6 Prefix sub-TLV represents match criteria on IPv6
   prefixes. Its format is illustrated below:

      0                   1                   2                   3
      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |    Type 4     |        Length (N x 20)        |M-Type | Flags |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                   IPv6 Address (16 octets)                    ~
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |     Mask      |    GeMask     |    LeMask     |M-Type | Flags |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     ~                             . . .                             ~
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                   IPv6 Address (16 octets)                    ~
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |     Mask      |    GeMask     |    LeMask     |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                     Format of IPv6 Prefix sub-TLV

   An AS-Path sub-TLV represents a match criterion expressed as a
   regular expression string. Its format is illustrated below:

      0                   1                   2                   3
      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |    Type 2     |       Length (Variable)       |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                     AS-Path Regex String                      |
     :                                                               :
     |                                                               ~
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                       Format of AS Path sub-TLV

   Type: 2 for AS-Path.

   Length: Variable, maximum is 1024.

   AS-Path Regex String: AS-Path regular expression string.

   A Community sub-TLV represents a list of communities, all of which
   must be matched. Its format is illustrated below:

      0                   1                   2                   3
      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |    Type 3     |      Length (N x 4 + 1)       |     Flags     |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                       Community 1 Value                       |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     ~                             . . .                             ~
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                       Community N Value                       |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                      Format of Community sub-TLV

   Type: 3 for Community.

   Length: N x 4 + 1, where N is the number of communities. If Length
      is not a multiple of 4 plus 1, the Atom is corrupt and the
      enclosing UPDATE MUST be ignored.

   Flags: 1 octet. No flags are currently defined. These bits MUST be
      sent as zero and ignored on receipt.

4.2.2.  Sub-TLVs of the Parameters TLV

   This document introduces 2 community values:

   MATCH AND SET ATTR: If the IPv4/IPv6 unicast routes to a remote peer
      match the specific conditions defined in the routing policy
      extracted from the RPD route, then the attributes of the IPv4/IPv6
      unicast routes will be modified when sending to the remote peer
      per the actions defined in the RPD route.

   MATCH AND NOT ADVERTISE: If the IPv4/IPv6 unicast routes to a remote
      peer match the specific conditions defined in the routing policy
      extracted from the RPD route, then the IPv4/IPv6 unicast routes
      will not be advertised to the remote peer.

   For the Parameter(s) TLV, two action sub-TLVs are defined: the MED
   change sub-TLV and the AS-Path change sub-TLV. When the community in
   the container is MATCH AND SET ATTR, the Parameter(s) TLV can include
   these sub-TLVs. When the community is MATCH AND NOT ADVERTISE, the
   Parameter(s) TLV's value is empty.

   A MED change sub-TLV indicates an action to change the MED. Its
   format is illustrated below:

      0                   1                   2                   3
      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |    Type 1     |          Length (5)           |      OP       |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                             Value                             |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                      Format of MED Change sub-TLV

   Type: 1 for MED Change.

   Length: 5. If Length is any other value, the sub-TLV is corrupt and
      the enclosing UPDATE MUST be ignored.

   OP: 1 octet. Three are defined:

      OP = 0: assign the Value to the existing MED.

      OP = 1: add the Value to the existing MED. If the sum is greater
         than the maximum value for MED, assign the maximum value to
         MED.

      OP = 2: subtract the Value from the existing MED. If the
         existing MED minus the Value is less than 0, assign 0 to MED.

      If OP is any other value, the sub-TLV is ignored.

   Value: 4 octets.

   An AS-Path change sub-TLV indicates an action to change the AS-Path.
   Its format is illustrated below:

      0                   1                   2                   3
      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |    Type 2     |        Length (n x 5)         |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                              AS1                              |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |    Count1     |
     +-+-+-+-+-+-+-+-+
     ~                             . . .
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                              ASn                              |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |    Countn     |
     +-+-+-+-+-+-+-+-+

                    Format of AS-Path Change sub-TLV

   Type: 2 for AS-Path Change.

   Length: n x 5. If Length is not a multiple of 5, the sub-TLV is
      corrupt and the enclosing UPDATE MUST be ignored.

   ASi: 4 octets. An AS number.

   Counti: 1 octet. ASi repeats Counti times.

   The sequence of AS numbers is added to the existing AS Path.

4.3.  Capability Negotiation

   It is necessary to negotiate the capability to support BGP Extensions
   for Routing Policy Distribution (RPD). The BGP RPD Capability is a
   new BGP capability [RFC5492]. The Capability Code for this
   capability is 72, assigned by IANA. The Capability Length field
   of this capability is variable.
   The Capability Value field consists of one or more of the following
   tuples:

            +--------------------------------------------------+
            |  Address Family Identifier (2 octets)            |
            +--------------------------------------------------+
            |  Subsequent Address Family Identifier (1 octet)  |
            +--------------------------------------------------+
            |  Send/Receive (1 octet)                          |
            +--------------------------------------------------+

                           BGP RPD Capability

   The meaning and use of the fields are as follows:

   Address Family Identifier (AFI): This field is the same as the one
      used in [RFC4760].

   Subsequent Address Family Identifier (SAFI): This field is the same
      as the one used in [RFC4760].

   Send/Receive: This field indicates whether the sender (a) is willing
      to receive Routing Policies from its peer (value 1), (b) would
      like to send Routing Policies to its peer (value 2), or (c) both
      (value 3) for the <AFI, SAFI>. If Send/Receive is any other
      value, that tuple is ignored but any other tuples present are
      still used.

5.  Operations

   This section presents a typical application scenario and some details
   about handling a related failure.

5.1.  Application Scenario

   Figure 3 illustrates a typical scenario, where RPD is used by a
   controller with a Route Reflector (RR) to adjust traffic dynamically.

              +--------------+
              |  Controller  |
              +-------+------+
                       \
                        \ RPD
                      .--\._.+--+                        ___...__
                   __(    \    '.---...                (         )
                  /     RR o -------- A o) ----------  (o X  AS2 )
                 (o E      |\             )     _____//(___   ___)
                  (        | \_______ B o) ____/     /     '''
                   (o F     \            )      ____/
                    (        \_____ C o) ______/         ___...__
                     '   AS1         _) \_____         (         )
                      '---._.-.     )         \_______ (o Y  AS3 )
                               '---'                   (___   ___)
                                                           '''

              Figure 3: Controller with RR Adjusts Traffic

   The controller connects to the RR through a BGP session. There is a
   BGP session between the RR and each of routers A, B and C in AS1, as
   shown in the figure.
   Other sessions in AS1 are not shown in the figure.

   There is router X in AS2. There is a BGP session between X and each
   of routers A, B and C in AS1.

   There is router Y in AS3. There is a BGP session between Y and
   router C in AS1.

   The controller sends an RPD route to the RR. After receiving the RPD
   route from the controller, the RR reflects the RPD route to routers
   A, B and C. After receiving the RPD route from the RR, routers A, B
   and C extract the routing policy from the RPD route. If the peer IP
   in the NLRI of the RPD route is 0, then the routing policy is applied
   to all the remote peers of routers A, B and C. If the peer IP in the
   NLRI of the RPD route is non-zero, then the IP address indicates a
   remote peer of routers A, B and C and the routing policy is applied
   to that specific remote peer. The IPv4/IPv6 unicast routes towards
   router X in AS2 and router Y in AS3 will be adjusted based on the
   routing policy sent by the controller via an RPD route.

   The controller uses the RT extended community to notify a router
   whether to receive an RPD policy. For example, if there is no
   adjustment on router B, the controller sends RPD routes with the RTs
   for A and C. B will not receive the routes.

   The process of adjusting traffic in a network is a closed loop. The
   loop starts from the controller with some traffic expectations on a
   set of routes. The controller obtains the information about traffic
   flows for the related routes. It analyzes the traffic and checks
   whether the current traffic flows meet the expectations. If the
   expectations are not met, the controller adjusts the traffic. The
   loop then returns to its first step (the controller obtains the
   information about traffic flows for the related routes).

5.2.  About Failure

   An RPD route is not a configuration. When it is sent to a router, no
   ack is needed from the router.
   The existing BGP mechanisms are reused for delivering an RPD route.
   After the route is delivered to a router, the delivery is considered
   successful. This is guaranteed by the BGP protocol.

   If the router fails to install the route locally, this failure is a
   bug of the router. The bug needs to be fixed.

   The errors described in [RFC7606] are handled according to
   [RFC7606]. These errors are bugs, which need to be resolved.

   If the controller fails, existing mechanisms such as BGP GR
   [RFC4724] and BGP Long-lived Graceful Restart (LLGR) can be used to
   let the router keep the routes from the controller for some time.

   With support of the "Long-lived Graceful Restart Capability"
   [I-D.ietf-idr-long-lived-gr], the routes can be retained for a longer
   time after the controller fails.

   In the worst case, the controller fails and the RPD routes for
   adjusting the traffic are withdrawn. The adjusted/redirected traffic
   may then take its old path. This should be acceptable.

6.  Contributors

   The following people have substantially contributed to the definition
   of the BGP-FS RPD and to the editing of this document:

   Peng Zhou
   Huawei
   Email: Jewpon.zhou@huawei.com

7.  Security Considerations

   Protocol extensions defined in this document do not affect BGP
   security other than as discussed in the Security Considerations
   section of [RFC5575].

8.  Acknowledgements

   The authors would like to thank Acee Lindem, Jeff Haas, Jie Dong,
   Lucy Yong, Qiandeng Liang, Zhenqiang Li, Robert Raszuk, Donald
   Eastlake, Ketan Talaulikar, and Jakob Heitz for their comments on
   this work.

9.  IANA Considerations

9.1.  Existing Assignments

   IANA has assigned a new AFI of value 16398 from the registry "Address
   Family Numbers" for Routing Policy.

   IANA has assigned a new SAFI of value 75 from the registry
   "Subsequent Address Family Identifiers (SAFI) Parameters" for Routing
   Policy.

   IANA has assigned a new Code Point of value 72 from the registry
   "Capability Codes" for Routing Policy Distribution.

9.2.  Routing Policy Type Registry

   IANA is requested to create a new registry called "Routing Policy
   Type". The allocation policy of this registry is "First Come First
   Served (FCFS)".

   The initial code points are as follows:

   +-------------+-----------------------------------+-------------+
   | Code Point  | Description                       | Reference   |
   +-------------+-----------------------------------+-------------+
   | 0           | Reserved                          |             |
   +-------------+-----------------------------------+-------------+
   | 1           | Export Policy                     |This document|
   +-------------+-----------------------------------+-------------+
   | 2           | Import Policy                     |This document|
   +-------------+-----------------------------------+-------------+
   | 3 - 255     | Available                         |             |
   +-------------+-----------------------------------+-------------+

9.3.  RouteAttr Atom Type

   IANA is requested to assign a code-point from the registry "BGP
   Community Container Atom Types" as follows:

   +---------------------+------------------------------+-------------+
   | TLV Code Point      | Description                  | Reference   |
   +---------------------+------------------------------+-------------+
   | TBD1 (48 suggested) | RouteAttr Atom               |This document|
   +---------------------+------------------------------+-------------+

9.4.  Route Attributes Sub-TLV Registry

   IANA is requested to create a new registry called "Route Attributes
   Sub-TLV" under RouteAttr Atom TLV. The allocation policy of this
   registry is "First Come First Served (FCFS)".

770     The initial code points are as follows:

772     +-------------+-----------------------------------+-------------+
773     | Code Point  | Description                       | Reference   |
774     +-------------+-----------------------------------+-------------+
775     | 0           | Reserved                          |             |
776     +-------------+-----------------------------------+-------------+
777     | 1           | IPv4 Prefix Sub-TLV               |This document|
778     +-------------+-----------------------------------+-------------+
779     | 2           | AS-Path Sub-TLV                   |This document|
780     +-------------+-----------------------------------+-------------+
781     | 3           | Community Sub-TLV                 |This document|
782     +-------------+-----------------------------------+-------------+
783     | 4           | IPv6 Prefix Sub-TLV               |This document|
784     +-------------+-----------------------------------+-------------+
785     | 5 - 255     | Available                         |             |
786     +-------------+-----------------------------------+-------------+

788  9.5.  Attribute Change Sub-TLV Registry

790     IANA is requested to create a new registry called "Attribute Change
791     Sub-TLV" under Parameter(s) TLV.  The allocation policy of this
792     registry is "First Come First Served (FCFS)".

794     The initial code points are as follows:

796     +-------------+-----------------------------------+-------------+
797     | Code Point  | Description                       | Reference   |
798     +-------------+-----------------------------------+-------------+
799     | 0           | Reserved                          |             |
800     +-------------+-----------------------------------+-------------+
801     | 1           | MED Change Sub-TLV                |This document|
802     +-------------+-----------------------------------+-------------+
803     | 2           | AS-Path Change Sub-TLV            |This document|
804     +-------------+-----------------------------------+-------------+
805     | 3 - 255     | Available                         |             |
806     +-------------+-----------------------------------+-------------+

808  10.  References

810  10.1.  Normative References

812     [I-D.ietf-idr-wide-bgp-communities]
813                Raszuk, R., Haas, J., Lange, A., Decraene, B., Amante, S.,
814                and P. Jakma, "BGP Community Container Attribute", draft-
815                ietf-idr-wide-bgp-communities-05 (work in progress), July
816                2018.

818     [RFC1997]  Chandra, R., Traina, P., and T. Li, "BGP Communities
819                Attribute", RFC 1997, DOI 10.17487/RFC1997, August 1996,
820                <https://www.rfc-editor.org/info/rfc1997>.

822     [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
823                Requirement Levels", BCP 14, RFC 2119,
824                DOI 10.17487/RFC2119, March 1997,
825                <https://www.rfc-editor.org/info/rfc2119>.

827     [RFC4271]  Rekhter, Y., Ed., Li, T., Ed., and S. Hares, Ed., "A
828                Border Gateway Protocol 4 (BGP-4)", RFC 4271,
829                DOI 10.17487/RFC4271, January 2006,
830                <https://www.rfc-editor.org/info/rfc4271>.

832     [RFC4760]  Bates, T., Chandra, R., Katz, D., and Y. Rekhter,
833                "Multiprotocol Extensions for BGP-4", RFC 4760,
834                DOI 10.17487/RFC4760, January 2007,
835                <https://www.rfc-editor.org/info/rfc4760>.

837     [RFC5492]  Scudder, J. and R. Chandra, "Capabilities Advertisement
838                with BGP-4", RFC 5492, DOI 10.17487/RFC5492, February
839                2009, <https://www.rfc-editor.org/info/rfc5492>.

841     [RFC5575]  Marques, P., Sheth, N., Raszuk, R., Greene, B., Mauch, J.,
842                and D. McPherson, "Dissemination of Flow Specification
843                Rules", RFC 5575, DOI 10.17487/RFC5575, August 2009,
844                <https://www.rfc-editor.org/info/rfc5575>.

846     [RFC8126]  Cotton, M., Leiba, B., and T. Narten, "Guidelines for
847                Writing an IANA Considerations Section in RFCs", BCP 26,
848                RFC 8126, DOI 10.17487/RFC8126, June 2017,
849                <https://www.rfc-editor.org/info/rfc8126>.

851     [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
852                2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
853                May 2017, <https://www.rfc-editor.org/info/rfc8174>.

855  10.2.  Informative References

857     [I-D.ietf-idr-long-lived-gr]
858                Uttaro, J., Chen, E., Decraene, B., and J. G. Scudder,
859                "Support for Long-lived BGP Graceful Restart", draft-ietf-
860                idr-long-lived-gr-00 (work in progress), September 2019.

862     [I-D.ietf-idr-registered-wide-bgp-communities]
863                Raszuk, R. and J. Haas, "Registered Wide BGP Community
864                Values", draft-ietf-idr-registered-wide-bgp-communities-02
865                (work in progress), May 2016.

867     [RFC4724]  Sangli, S., Chen, E., Fernando, R., Scudder, J., and Y.
868                Rekhter, "Graceful Restart Mechanism for BGP", RFC 4724,
869                DOI 10.17487/RFC4724, January 2007,
870                <https://www.rfc-editor.org/info/rfc4724>.

872     [RFC7606]  Chen, E., Ed., Scudder, J., Ed., Mohapatra, P., and K.
873                Patel, "Revised Error Handling for BGP UPDATE Messages",
874                RFC 7606, DOI 10.17487/RFC7606, August 2015,
875                <https://www.rfc-editor.org/info/rfc7606>.

877  Authors' Addresses

879     Zhenbin Li
880     Huawei
881     Huawei Bld., No.156 Beiqing Rd.
882     Beijing  100095
883     China

885     Email: lizhenbin@huawei.com

886     Liang Ou
887     China Telecom Co., Ltd.
888     109 West Zhongshan Ave, Tianhe District
889     Guangzhou  510630
890     China

892     Email: ouliang@chinatelecom.cn

894     Yujia Luo
895     China Telecom Co., Ltd.
896     109 West Zhongshan Ave, Tianhe District
897     Guangzhou  510630
898     China

900     Email: luoyuj@sdu.edu.cn

902     Sujian Lu
903     Tencent
904     Tengyun Building, Tower A, No. 397 Tianlin Road
905     Shanghai, Xuhui District  200233
906     China

908     Email: jasonlu@tencent.com

910     Gyan S. Mishra
911     Verizon Inc.
912     13101 Columbia Pike
913     Silver Spring  MD 20904
914     USA

916     Phone: 301 502-1347
917     Email: gyan.s.mishra@verizon.com

919     Huaimo Chen
920     Futurewei
921     Boston, MA
922     USA

924     Email: Huaimo.chen@futurewei.com

925     Shunwan Zhuang
926     Huawei
927     Huawei Bld., No.156 Beiqing Rd.
928     Beijing  100095
929     China

931     Email: zhuangshunwan@huawei.com

933     Haibo Wang
934     Huawei
935     Huawei Bld., No.156 Beiqing Rd.
936     Beijing  100095
937     China

939     Email: rainsword.wang@huawei.com