idnits 2.17.1

draft-ietf-idr-performance-routing-03.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (December 21, 2020) is 1219 days in the past. Is this
     intentional?

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Missing Reference: 'RFC7311' is mentioned on line 98, but not defined

  == Unused Reference: 'I-D.ietf-idr-bgp-optimal-route-reflection' is defined
     on line 397, but no explicit reference was found in the text

  == Unused Reference: 'I-D.ietf-spring-segment-routing-policy' is defined on
     line 403, but no explicit reference was found in the text

  == Unused Reference: 'RFC3630' is defined on line 413, but no explicit
     reference was found in the text

  == Unused Reference: 'RFC5305' is defined on line 418, but no explicit
     reference was found in the text

  ** Obsolete normative reference: RFC 3107 (Obsoleted by RFC 8277)

  == Outdated reference: A later version (-28) exists of
     draft-ietf-idr-bgp-optimal-route-reflection-21

  == Outdated reference: A later version (-22) exists of
     draft-ietf-spring-segment-routing-policy-09

  -- Obsolete informational reference (is this
     intentional?): RFC 2679 (Obsoleted by RFC 7679)

  -- Obsolete informational reference (is this intentional?): RFC 7810
     (Obsoleted by RFC 8570)


     Summary: 1 error (**), 0 flaws (~~), 8 warnings (==), 3 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

2    Network Working Group                                              X. Xu
3    Internet-Draft                                              Alibaba, Inc
4    Intended status: Standards Track                                S. Hegde
5    Expires: June 24, 2021                                           Juniper
6                                                               K. Talaulikar
7                                                                       Cisco
8                                                                M. Boucadair
9                                                                C. Jacquenet
10                                                             France Telecom
11                                                          December 21, 2020

13                   Performance-based BGP Routing Mechanism
14                    draft-ietf-idr-performance-routing-03

16   Abstract

18      The current BGP specification doesn't use network performance metrics
19      (e.g., network latency) in the route selection decision process.
20      This document describes a performance-based BGP routing mechanism in
21      which the network latency metric is taken as one of the route selection
22      criteria. This routing mechanism is useful for those service
23      providers with global reach to deliver low-latency network
24      connectivity services to their customers.

26   Requirements Language

28      The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
29      "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
30      document are to be interpreted as described in RFC 2119 [RFC2119].

32   Status of This Memo

34      This Internet-Draft is submitted in full conformance with the
35      provisions of BCP 78 and BCP 79.

37      Internet-Drafts are working documents of the Internet Engineering
38      Task Force (IETF). Note that other groups may also distribute
39      working documents as Internet-Drafts. The list of current Internet-
40      Drafts is at https://datatracker.ietf.org/drafts/current/.

42      Internet-Drafts are draft documents valid for a maximum of six months
43      and may be updated, replaced, or obsoleted by other documents at any
44      time.
        It is inappropriate to use Internet-Drafts as reference
45      material or to cite them other than as "work in progress."

47      This Internet-Draft will expire on June 24, 2021.

49   Copyright Notice

51      Copyright (c) 2020 IETF Trust and the persons identified as the
52      document authors. All rights reserved.

54      This document is subject to BCP 78 and the IETF Trust's Legal
55      Provisions Relating to IETF Documents
56      (https://trustee.ietf.org/license-info) in effect on the date of
57      publication of this document. Please review these documents
58      carefully, as they describe your rights and restrictions with respect
59      to this document. Code Components extracted from this document must
60      include Simplified BSD License text as described in Section 4.e of
61      the Trust Legal Provisions and are provided without warranty as
62      described in the Simplified BSD License.

64   Table of Contents

66      1.  Introduction
67      2.  Terminology
68      3.  Performance Route Advertisement
69      4.  Capability Advertisement
70      5.  Performance Route Selection
71      6.  Deployment Considerations
72      7.  Contributors
73      8.  Acknowledgements
74      9.  IANA Considerations
75      10. Security Considerations
76      11. References
77        11.1.  Normative References
78        11.2.  Informative References
79      Authors' Addresses

81   1.  Introduction

83      Network latency is widely recognized as one of the major obstacles in
84      migrating business applications to the cloud since cloud-based
85      applications usually have very clearly defined and stringent network
86      latency requirements. Service providers with global reach aim at
87      delivering low-latency network connectivity services to their cloud
88      service customers as a competitive advantage. Sometimes, the network
89      connectivity may travel across more than one Autonomous System (AS)
90      under their administration. However, BGP [RFC4271], which is used
91      for path selection across ASes, doesn't use network latency in the
92      route selection process. As such, the best route selected based upon
93      the existing BGP route selection criteria may not be the best from
94      the customer experience perspective.

96      This document describes a performance-based BGP routing paradigm in
97      which the network latency metric is disseminated via a new TLV of the
98      AIGP attribute [RFC7311] and that metric is used as an input to the
99      route selection process. This mechanism is useful for those service
100     providers with global reach, which usually own more than one AS, to
101     deliver low-latency network connectivity services to their customers.

103     Furthermore, in order to be backward compatible with existing BGP
104     implementations and have no impact on the stability of the overall
105     routing system, it's expected that the performance routing paradigm
106     could coexist with the vanilla routing paradigm. As such, service
107     providers could provide low-latency routing services while still
108     offering the vanilla routing services depending on customers'
109     requirements.

111     For the sake of simplicity, this document considers only one network
112     performance metric: the network latency metric. The support of
113     multiple network performance metrics is out of scope of this
114     document.
        In addition, this document focuses exclusively on BGP
115     matters; all matters unrelated to BGP, such as the
116     mechanisms for measuring network latency, are outside the scope of
117     this document.

119     A variant of this performance-based BGP routing has been implemented (see
120     http://www.ist-mescal.org/roadmap/qbgp-demo.avi).

122  2.  Terminology

124     This memo makes use of the terms defined in [RFC4271].

126     Network latency indicates the amount of time it takes for a packet to
127     traverse a given network path [RFC2679]. If a packet is
128     forwarded along a path that contains multiple links and routers, the
129     network latency is the sum of the transmission latency of each
130     link (i.e., link latency), plus the sum of the internal delay
131     incurred within each router (i.e., router latency), which includes
132     queuing latency and processing latency. The sum of the link latencies
133     is also known as the cumulative link latency. In today's service
134     provider networks, which usually span a wide geographical area,
135     the cumulative link latency becomes the major part of the network
136     latency since the total internal latency incurred within the
137     high-capacity routers seems trivial compared to the cumulative link
138     latency. In other words, the cumulative link latency could
139     approximately represent the network latency in the above networks.

141     Furthermore, since the link latency is more stable than the router
142     latency, such approximate network latency represented by the
143     cumulative link latency is more stable. Therefore, if there is a
144     way to calculate the cumulative link latency of a given network path,
145     it is strongly recommended to use such cumulative link latency to
146     approximately represent the network latency. Otherwise, the network
147     latency would have to be measured frequently by some means (e.g.,
148     PING or other measurement tools).

150  3.  Performance Route Advertisement

152     Performance (i.e., low latency) routes SHOULD be exchanged between
153     BGP peers by means of a specific Subsequent Address Family Identifier
154     (SAFI) of TBD (see IANA Section) and also be carried as labeled
155     routes as per [RFC3107]. In other words, performance routes can
156     be regarded as specific labeled routes that are associated with a
157     network latency metric.

159     A BGP speaker SHOULD NOT advertise performance routes to a particular
160     BGP peer unless that peer indicates, through BGP capability
161     advertisement (see Section 4), that it can process update messages
162     with that specific SAFI field.

164     The network latency metric is attached to the performance routes via a
165     new TLV of the AIGP attribute, referred to as the NETWORK_LATENCY TLV.
166     The value of this TLV indicates the network latency in microseconds
167     from the BGP speaker depicted by the NEXT_HOP path attribute to the
168     address depicted by the NLRI prefix. The type code of this TLV is
169     TBD (see IANA Section), and the value field is 4 octets in length.
170     In some abnormal cases, if the cumulative link latency exceeds the
171     maximum value of 0xFFFFFFFF, the value field SHOULD be set to
172     0xFFFFFFFF. Note that the NETWORK_LATENCY TLV MUST NOT co-exist
173     with the AIGP TLV within the same AIGP attribute.

175     A BGP speaker SHOULD be configurable to enable or disable the
176     origination of performance routes. If enabled, a local latency value
177     for a given to-be-originated performance route MUST be configured on
178     the BGP speaker so that it can be filled into the NETWORK_LATENCY TLV
179     of that performance route.

181     A BGP speaker that is enabled to process the NETWORK_LATENCY TLV but has
182     not been provisioned with the local latency value SHOULD remove the
183     NETWORK_LATENCY TLV when it advertises the corresponding route
184     downstream.
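
As a non-normative illustration of the encoding rules in this section, the following Python sketch packs and unpacks a NETWORK_LATENCY TLV. It assumes that the TLV follows the general AIGP TLV layout of RFC 7311 (a 1-octet Type, a 2-octet Length that covers the whole TLV, then the Value). The type code 0xFE is a hypothetical placeholder, since the real value is TBD, and the function names are illustrative only.

```python
import struct

# Hypothetical placeholder: the real type code is TBD (see IANA Section).
NETWORK_LATENCY_TLV_TYPE = 0xFE
# Assumed layout: Type (1 octet) + Length (2 octets) + Value (4 octets).
NETWORK_LATENCY_TLV_LENGTH = 7
MAX_LATENCY_US = 0xFFFFFFFF  # largest value the 4-octet field can carry


def encode_network_latency_tlv(latency_us: int) -> bytes:
    """Pack a NETWORK_LATENCY TLV, saturating the value at 0xFFFFFFFF."""
    if latency_us < 0:
        raise ValueError("latency must be non-negative")
    value = min(latency_us, MAX_LATENCY_US)
    # "!" selects network byte order with no padding between fields.
    return struct.pack("!BHI", NETWORK_LATENCY_TLV_TYPE,
                       NETWORK_LATENCY_TLV_LENGTH, value)


def decode_network_latency_tlv(tlv: bytes) -> int:
    """Return the latency in microseconds carried by the TLV."""
    tlv_type, length, value = struct.unpack("!BHI", tlv)
    if (tlv_type != NETWORK_LATENCY_TLV_TYPE
            or length != NETWORK_LATENCY_TLV_LENGTH):
        raise ValueError("not a NETWORK_LATENCY TLV")
    return value
```

Clamping inside the encoder keeps the "set to 0xFFFFFFFF on overflow" rule in one place, so callers can pass an arbitrarily large cumulative latency without special-casing it.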
186     When distributing a performance route learnt from a BGP peer, if this
187     BGP speaker has set itself as the NEXT_HOP of such route, the value
188     of the NETWORK_LATENCY TLV SHOULD be increased by adding the network
189     latency from itself to the previous NEXT_HOP of such route.
190     Otherwise, the NETWORK_LATENCY TLV of such route MUST NOT be
191     modified.

193     How to obtain the network latency to a given BGP NEXT_HOP is
194     outside the scope of this document. However, note that the path
195     latency to the NEXT_HOP SHOULD approximately represent the network
196     latency of the exact forwarding path towards the NEXT_HOP. For
197     example, if a BGP speaker uses a Traffic Engineering (TE) Label
198     Switched Path (LSP) from itself to the NEXT_HOP, rather than the
199     shortest path calculated by the Interior Gateway Protocol (IGP), the
200     latency to the NEXT_HOP SHOULD reflect the network latency of that TE
201     LSP, rather than the IGP shortest path. In the case where the
202     latency to the NEXT_HOP cannot be obtained for some reason,
203     that latency SHOULD be set to 0xFFFFFFFF by default.

205     To keep performance routes stable enough, a BGP speaker SHOULD use a
206     configurable threshold for network latency fluctuation to avoid
207     sending any update which would otherwise be triggered by a minor
208     network latency fluctuation below that threshold.

210  4.  Capability Advertisement

212     A BGP speaker that uses multiprotocol extensions to advertise
213     performance routes SHOULD use the Capabilities Optional Parameter, as
214     defined in [RFC5492], to inform its peers about this capability.

216     The MP_EXT Capability Code, as defined in [RFC4760], is used to
217     advertise the (AFI, SAFI) pairs available on a particular connection.

219     A BGP speaker that implements the Performance Routing Capability MUST
220     support the BGP Labeled Route Capability, as defined in [RFC3107]. A
221     BGP speaker that advertises the Performance Routing Capability to a
222     peer using BGP Capabilities advertisement [RFC5492] does not have to
223     advertise the BGP Labeled Route Capability to that peer.

225  5.  Performance Route Selection

227     Performance route selection only requires the following modification
228     to the tie-breaking procedures of the BGP route selection decision
229     (phase 2) described in [RFC4271]: network latency metric comparison
230     SHOULD be executed just ahead of the AS-Path Length comparison step.
231     Prior to executing the network latency metric comparison, the value
232     of the NETWORK_LATENCY TLV SHOULD be increased by adding the network
233     latency from the BGP speaker to the NEXT_HOP of that route.

235     The Loc-RIB of the performance routing paradigm is independent from
236     that of the vanilla routing paradigm. Accordingly, the routing table
237     of the performance routing paradigm is independent from that of the
238     vanilla routing paradigm. Whether the performance routing paradigm
239     or the vanilla routing paradigm would be applied to a given packet is
240     a local policy issue which is outside the scope of this document.

242     For example, by leveraging the CoS-Based Forwarding (CBF) capability
243     which allows routers to have distinct routing and forwarding tables
244     for each type of traffic, the selected performance routes could be
245     installed in the routing and forwarding tables corresponding to high-
246     priority traffic.

248  6.  Deployment Considerations

250     This section is not normative.

252     Enabling the performance-based BGP routing at large (i.e., among
253     domains that do not belong to the same administrative entity) may be
254     conditioned by other administrative settlement considerations that
255     are out of scope of this document. Nevertheless, this document
256     neither requires nor excludes activating the proposed route selection
257     scheme between domains that are managed by distinct administrative
258     entities.
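
As a further non-normative illustration, the latency comparison of Section 5 and the update threshold of Section 3 can be sketched in Python as follows. The sketch models only two steps of the decision process (latency comparison, then AS-Path length) and omits every other step of [RFC4271]. The class, function and parameter names are hypothetical, and the saturating addition is an assumption that extends the 0xFFFFFFFF overflow rule to sums.

```python
from dataclasses import dataclass

MAX_LATENCY_US = 0xFFFFFFFF  # saturation value of the 4-octet field


@dataclass(frozen=True)
class PerfRoute:
    prefix: str
    next_hop: str
    advertised_latency_us: int  # value received in the NETWORK_LATENCY TLV
    as_path_len: int


def add_latency(a_us: int, b_us: int) -> int:
    """Saturating addition, so a sum never exceeds 0xFFFFFFFF."""
    return min(a_us + b_us, MAX_LATENCY_US)


def select_best(routes, latency_to_next_hop_us):
    """Pick the lowest-latency route; AS-Path length breaks ties."""
    def key(route):
        # An unknown latency to the NEXT_HOP defaults to 0xFFFFFFFF.
        local = latency_to_next_hop_us.get(route.next_hop, MAX_LATENCY_US)
        total = add_latency(route.advertised_latency_us, local)
        return (total, route.as_path_len)
    return min(routes, key=key)


def should_send_update(last_advertised_us: int, current_us: int,
                       threshold_us: int) -> bool:
    """Suppress updates for fluctuations below the configured threshold."""
    return abs(current_us - last_advertised_us) >= threshold_us
```

For instance, given one route advertising 3000 microseconds through a NEXT_HOP that is 500 microseconds away and another advertising 1000 microseconds through a NEXT_HOP that is 4000 microseconds away, the first route is preferred, because the comparison uses the sum rather than the advertised value alone.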
260     The main deployment case targeted by this specification is where
261     involved domains are managed by the same administrative entity.
262     Concretely, this performance-based BGP routing mechanism can
263     advantageously be enabled in a multi-domain environment, where all
264     the involved domains are operated by the same administrative entity
265     so that the processing of the low latency routes can be consistent
266     throughout the domains. Besides security considerations that may
267     arise (and which are further discussed in Section 10), there is indeed
268     a need to consistently enforce a low-latency-based BGP routing policy
269     within a set of domains that belong to the same administrative
270     entity. This is motivated by the processing of traffic which is of
271     very different nature and which may have different QoS requirements.
272     Moreover, the combined use of BGP-inferred low latency information
273     with traffic engineering tools that would lead to the computation and
274     the establishment of traffic-engineered LSPs between "low
275     latency"-enabled BGP peers based upon the manipulation of the
276     Unidirectional Link delay sub-TLV [RFC7810] [RFC7471] would
277     help guarantee the overall consistency of the low latency
278     information within each domain. Furthermore, a BGP color extended
279     community could be attached to the performance routes so as to
280     associate a low-latency Segment Routing (SR) LSP towards the BGP
281     NEXT_HOP with these low-latency BGP routes. In this way, the
282     traffic matching the low-latency BGP routes would be forwarded to the
283     BGP NEXT_HOP via the low-latency SR LSP towards that BGP NEXT_HOP.

285     In network environments where route reflectors are deployed but
286     next-hop-self is disabled on them, route reflectors usually reflect
287     those received routes which are optimal (i.e., lowest latency) from
288     their perspectives but may not be optimal from the receivers'
289     perspectives. Some existing solutions as described in [RFC7911],
290     [I-D.ietf-idr-bgp-optimal-route-reflection] and [RFC6774] can be used to
291     address this issue.

293     From a network provider perspective, the ability to manipulate low
294     latency routes may lead to different, presumably service-specific
295     designs. In particular, there is a need to assess the impact of
296     using such capability on the overall performance of the BGP peers
297     from a route computation and selection procedure as a function of the
298     tie-breaking operation. A typical use case would consist in
299     selecting low latency routes for traffic that, for example, pertains to
300     VoIP, or whose nature demands the selection of the lowest latency
301     route in the Adj-RIB-Out database of the corresponding BGP peers.
302     Typically, live broadcasting services or some e-health services could
303     certainly take advantage of such capability. It is out of scope of
304     this document to exhaustively elaborate on such service-specific
305     designs that are obviously deployment-specific.

307  7.  Contributors

309     Ning So
310     Reliance
311     Email: Ning.So@ril.com

313     Yimin Shen
314     Juniper
315     Email: yshen@juniper.net

317     Uma Chunduri
318     Huawei
319     Email: uma.chunduri@huawei.com

321     Hui Ni
322     Huawei
323     Email: nihui@huawei.com

325     Yongbing Fan
326     China Telecom
327     Email: fanyb@gsta.com

329     Luis M. Contreras
330     Telefonica I+D
331     Email: luismiguel.contrerasmurillo@telefonica.com

333  8.  Acknowledgements

335     Thanks to Joel Halpern, Alvaro Retana, Jim Uttaro, Robert Raszuk,
336     Eric Rosen, Bruno Decraene, Qing Zeng, Jie Dong, Mach Chen, Saikat
337     Ray, Wes George, Jeff Haas, John Scudder, Stephane Litkowski and
338     Sriganesh Kini for their valuable comments on this document. Special
339     thanks should be given to Jim Uttaro and Eric Rosen for their
340     proposal of using a new TLV of the AIGP attribute to convey the
341     network latency metric.

343  9.  IANA Considerations

345     A new BGP Capability Code for the Performance Routing Capability, a
346     new SAFI specific for performance routing and a new type code for
347     the NETWORK_LATENCY TLV of the AIGP attribute are required to be
348     allocated by IANA.

350  10.  Security Considerations

352     In addition to the considerations discussed in [RFC4271], the
353     following items should be considered as well:

355     a.  Tweaking the value of the NETWORK_LATENCY TLV by an illegitimate
356         party may influence the route selection results. Therefore, the
357         Performance Routing Capability negotiation between BGP peers
358         which belong to different administrative domains MUST be disabled
359         by default. Furthermore, a BGP speaker MUST discard all
360         performance routes received from the BGP peer for which the
361         Performance Routing Capability negotiation has been disabled.

363     b.  Frequent updates of the NETWORK_LATENCY TLV may have a severe
364         impact on the stability of the routing system. Such practice
365         SHOULD be avoided by setting a reasonable threshold for network
366         latency fluctuation.

368  11.  References

370  11.1.  Normative References

372     [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
373                Requirement Levels", BCP 14, RFC 2119,
374                DOI 10.17487/RFC2119, March 1997.

377     [RFC3107]  Rekhter, Y. and E. Rosen, "Carrying Label Information in
378                BGP-4", RFC 3107, DOI 10.17487/RFC3107, May 2001.

381     [RFC4271]  Rekhter, Y., Ed., Li, T., Ed., and S. Hares, Ed., "A
382                Border Gateway Protocol 4 (BGP-4)", RFC 4271,
383                DOI 10.17487/RFC4271, January 2006.

386     [RFC4760]  Bates, T., Chandra, R., Katz, D., and Y. Rekhter,
387                "Multiprotocol Extensions for BGP-4", RFC 4760,
388                DOI 10.17487/RFC4760, January 2007.

391     [RFC5492]  Scudder, J. and R. Chandra, "Capabilities Advertisement
392                with BGP-4", RFC 5492, DOI 10.17487/RFC5492, February
393                2009.

395  11.2.  Informative References

397     [I-D.ietf-idr-bgp-optimal-route-reflection]
398                Raszuk, R., Cassar, C., Aman, E., Decraene, B., and K.
399                Wang, "BGP Optimal Route Reflection (BGP-ORR)", draft-
400                ietf-idr-bgp-optimal-route-reflection-21 (work in
401                progress), June 2020.

403     [I-D.ietf-spring-segment-routing-policy]
404                Filsfils, C., Talaulikar, K., Voyer, D., Bogdanov, A., and
405                P. Mattes, "Segment Routing Policy Architecture", draft-
406                ietf-spring-segment-routing-policy-09 (work in progress),
407                November 2020.

409     [RFC2679]  Almes, G., Kalidindi, S., and M. Zekauskas, "A One-way
410                Delay Metric for IPPM", RFC 2679, DOI 10.17487/RFC2679,
411                September 1999.

413     [RFC3630]  Katz, D., Kompella, K., and D. Yeung, "Traffic Engineering
414                (TE) Extensions to OSPF Version 2", RFC 3630,
415                DOI 10.17487/RFC3630, September 2003.

418     [RFC5305]  Li, T. and H. Smit, "IS-IS Extensions for Traffic
419                Engineering", RFC 5305, DOI 10.17487/RFC5305, October
420                2008.

422     [RFC6774]  Raszuk, R., Ed., Fernando, R., Patel, K., McPherson, D.,
423                and K. Kumaki, "Distribution of Diverse BGP Paths",
424                RFC 6774, DOI 10.17487/RFC6774, November 2012.

427     [RFC7471]  Giacalone, S., Ward, D., Drake, J., Atlas, A., and S.
428                Previdi, "OSPF Traffic Engineering (TE) Metric
429                Extensions", RFC 7471, DOI 10.17487/RFC7471, March 2015.

432     [RFC7810]  Previdi, S., Ed., Giacalone, S., Ward, D., Drake, J., and
433                Q. Wu, "IS-IS Traffic Engineering (TE) Metric Extensions",
434                RFC 7810, DOI 10.17487/RFC7810, May 2016.

437     [RFC7911]  Walton, D., Retana, A., Chen, E., and J. Scudder,
438                "Advertisement of Multiple Paths in BGP", RFC 7911,
439                DOI 10.17487/RFC7911, July 2016.

442  Authors' Addresses

444     Xiaohu Xu
445     Alibaba, Inc

447     Email: 13910161692@qq.com

449     Shraddha Hegde
450     Juniper

452     Email: shraddha@juniper.net

454     Ketan Talaulikar
455     Cisco

457     Email: ketant@cisco.com

459     Mohamed Boucadair
460     France Telecom

462     Email: mohamed.boucadair@orange.com

464     Christian Jacquenet
465     France Telecom

467     Email: christian.jacquenet@orange.com