idnits 2.17.1

draft-ietf-idr-performance-routing-01.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (February 26, 2015) is 3347 days in the past.  Is this
     intentional?

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Unused Reference: 'RFC3630' is defined on line 385, but no explicit
     reference was found in the text

  == Unused Reference: 'RFC5305' is defined on line 393, but no explicit
     reference was found in the text

  == Outdated reference: A later version (-15) exists of
     draft-ietf-idr-add-paths-10

  == Outdated reference: A later version (-28) exists of
     draft-ietf-idr-bgp-optimal-route-reflection-08

  == Outdated reference: A later version (-11) exists of
     draft-ietf-isis-te-metric-extensions-04

  -- Obsolete informational reference (is this intentional?): RFC 2679
     (Obsoleted by RFC 7679)

  -- Obsolete informational reference (is this intentional?): RFC 3107
     (Obsoleted by RFC 8277)

     Summary: 0 errors (**), 0 flaws (~~), 6 warnings (==), 3 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.
--------------------------------------------------------------------------------

2    Network Working Group                                              X. Xu
3    Internet-Draft                                                    Huawei
4    Intended status: Standards Track                            M. Boucadair
5    Expires: August 30, 2015                                    C. Jacquenet
6                                                              France Telecom
7                                                                       N. So
8                                                               Vinci Systems
9                                                                     Y. Shen
10                                                                    Juniper
11                                                                U. Chunduri
12                                                                   Ericsson
13                                                                      H. Ni
14                                                                     Huawei
15                                                                     Y. Fan
16                                                              China Telecom
17                                                               L. Contreras
18                                                             Telefonica I+D
19                                                          February 26, 2015

21                   Performance-based BGP Routing Mechanism
22                    draft-ietf-idr-performance-routing-01

24   Abstract

26      The current BGP specification doesn't use network performance metrics
27      (e.g., network latency) in the route selection decision process.
28      This document describes a performance-based BGP routing mechanism in
29      which the network latency metric is taken as one of the route selection
30      criteria.  This routing mechanism is useful for those service
31      providers with global reach to deliver low-latency network
32      connectivity services to their customers.

34   Status of This Memo

36      This Internet-Draft is submitted in full conformance with the
37      provisions of BCP 78 and BCP 79.

39      Internet-Drafts are working documents of the Internet Engineering
40      Task Force (IETF).  Note that other groups may also distribute
41      working documents as Internet-Drafts.  The list of current Internet-
42      Drafts is at http://datatracker.ietf.org/drafts/current/.

44      Internet-Drafts are draft documents valid for a maximum of six months
45      and may be updated, replaced, or obsoleted by other documents at any
46      time.  It is inappropriate to use Internet-Drafts as reference
47      material or to cite them other than as "work in progress."

48      This Internet-Draft will expire on August 30, 2015.

50   Copyright Notice

52      Copyright (c) 2015 IETF Trust and the persons identified as the
53      document authors.  All rights reserved.

55      This document is subject to BCP 78 and the IETF Trust's Legal
56      Provisions Relating to IETF Documents
57      (http://trustee.ietf.org/license-info) in effect on the date of
58      publication of this document.  Please review these documents
59      carefully, as they describe your rights and restrictions with respect
60      to this document.  Code Components extracted from this document must
61      include Simplified BSD License text as described in Section 4.e of
62      the Trust Legal Provisions and are provided without warranty as
63      described in the Simplified BSD License.

65   Table of Contents

67      1.  Introduction
68        1.1.  Requirements Language
69      2.  Terminology
70      3.  Performance Route Advertisement
71      4.  Capability Advertisement
72      5.  Performance Route Selection
73      6.  Deployment Considerations
74      7.  IANA Considerations
75      8.  Security Considerations
76      9.  Acknowledgements
77      10. References
78        10.1.  Normative References
79        10.2.  Informative References
80      Authors' Addresses

82   1.  Introduction

84      Network latency is widely recognized as one of the major obstacles in
85      migrating business applications to the cloud since cloud-based
86      applications usually have very clearly defined and stringent network
87      latency requirements.  Service providers with global reach aim at
88      delivering low-latency network connectivity services to their cloud
89      service customers as a competitive advantage.  Sometimes, the network
90      connectivity may travel across more than one Autonomous System (AS)
91      under their administration.  However, BGP [RFC4271], which is used
92      for path selection across ASes, doesn't use network latency in the
93      route selection process.  As such, the best route selected based upon
94      the existing BGP route selection criteria may not be the best from
95      the customer experience perspective.

97      This document describes a performance-based BGP routing paradigm in
98      which the network latency metric is disseminated via a new TLV of the
99      AIGP attribute [RFC7311] and that metric is used as an input to the
100     route selection process.  This mechanism is useful for those service
101     providers with global reach, which usually own more than one AS, to
102     deliver low-latency network connectivity services to their customers.

104     Furthermore, in order to be backward compatible with existing BGP
105     implementations and have no impact on the stability of the overall
106     routing system, it's expected that the performance routing paradigm
107     could coexist with the vanilla routing paradigm.  As such, service
108     providers could provide low-latency routing services while still
109     offering the vanilla routing services, depending on customers'
110     requirements.

112     For the sake of simplicity, this document considers only one network
113     performance metric, namely the network latency metric.  Support for
114     multiple network performance metrics is out of scope of this
115     document.  In addition, this document focuses exclusively on BGP
116     matters; matters unrelated to BGP, such as the
117     mechanisms for measuring network latency, are outside the scope of
118     this document.

120     A variant of this performance-based BGP routing has been implemented (see
121     http://www.ist-mescal.org/roadmap/qbgp-demo.avi).

123  1.1.  Requirements Language

125     The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
126     "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
127     document are to be interpreted as described in RFC 2119 [RFC2119].

129  2.  Terminology

131     This memo makes use of the terms defined in [RFC4271].

133     Network latency indicates the amount of time it takes for a packet to
134     traverse a given network path [RFC2679].  If a packet is
135     forwarded along a path that contains multiple links and routers, the
136     network latency is the sum of the transmission latency of each
137     link (i.e., link latency) plus the sum of the internal delay
138     incurred within each router (i.e., router latency), which includes
139     queuing latency and processing latency.  The sum of the link latencies
140     is also known as the cumulative link latency.  In today's service
141     provider networks, which usually span a wide geographical area,
142     the cumulative link latency is the major part of the network
143     latency, since the total internal latency incurred within
144     high-capacity routers is usually trivial compared to the cumulative link
145     latency.  In other words, the cumulative link latency can
146     approximately represent the network latency in such networks.

148     Furthermore, since the link latency is more stable than the router
149     latency, the approximate network latency represented by the
150     cumulative link latency is more stable.  Therefore, if there is a
151     way to calculate the cumulative link latency of a given network path,
152     it is strongly recommended to use that cumulative link latency to
153     approximate the network latency.  Otherwise, the network
154     latency has to be measured frequently by some means (e.g.,
155     PING or other measurement tools).

157  3.  Performance Route Advertisement

159     Performance (i.e., low latency) routes SHOULD be exchanged between
160     BGP peers by means of a specific Subsequent Address Family Identifier
161     (SAFI) of TBD (see Section 7) and also be carried as labeled
162     routes as per [RFC3107].  In other words, performance routes can
163     be regarded as specific labeled routes that are associated with a
164     network latency metric.

166     A BGP speaker SHOULD NOT advertise performance routes to a particular
167     BGP peer unless that peer indicates, through BGP capability
168     advertisement (see Section 4), that it can process update messages
169     with that specific SAFI field.

171     The network latency metric is attached to the performance routes via a
172     new TLV of the AIGP attribute, referred to as the NETWORK_LATENCY TLV.
173     The value of this TLV indicates the network latency in microseconds
174     from the BGP speaker depicted by the NEXT_HOP path attribute to the
175     address depicted by the NLRI prefix.  The type code of this TLV is
176     TBD (see Section 7), and the value field is 4 octets in length.
177     In some abnormal cases, if the cumulative link latency exceeds the
178     maximum value of 0xFFFFFFFF, the value field SHOULD be set to
179     0xFFFFFFFF.  Note that the NETWORK_LATENCY TLV MUST NOT co-exist
180     with the AIGP TLV within the same AIGP attribute.

182     A BGP speaker SHOULD be configurable to enable or disable the
183     origination of performance routes.  If enabled, a local latency value
184     for a given to-be-originated performance route MUST be configured on
185     the BGP speaker so that it can be filled into the NETWORK_LATENCY TLV
186     of that performance route.

188     A BGP speaker that is enabled to process the NETWORK_LATENCY TLV but is
189     not provisioned with the local latency value SHOULD remove the
190     NETWORK_LATENCY TLV when it advertises the corresponding route
191     downstream.
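
The NETWORK_LATENCY TLV described earlier in this section can be sketched in Python as shown here. This is only an illustration under stated assumptions, not a normative encoding: the type code 0xFE is a hypothetical placeholder (the real value is TBD and to be assigned by IANA), and the 1-octet Type, 2-octet Length (covering the whole TLV) layout is assumed to follow the generic AIGP TLV convention of [RFC7311]. The function names are invented for this example.

```python
import struct

# Hypothetical placeholder: the draft leaves the real type code TBD (IANA).
NETWORK_LATENCY_TLV_TYPE = 0xFE
AIGP_TLV_TYPE = 1              # type of the AIGP TLV defined in RFC 7311
MAX_LATENCY_US = 0xFFFFFFFF    # largest value the 4-octet field can carry

# Assumed layout (RFC 7311 convention): Type (1 octet), Length (2 octets,
# counting the Type and Length fields too), Value (4 octets, microseconds).
TLV_FORMAT = "!BHI"
TLV_LENGTH = struct.calcsize(TLV_FORMAT)  # 1 + 2 + 4 octets


def encode_network_latency_tlv(latency_us):
    """Encode a latency in microseconds, saturating at 0xFFFFFFFF."""
    if latency_us < 0:
        raise ValueError("latency must be non-negative")
    value = min(latency_us, MAX_LATENCY_US)
    return struct.pack(TLV_FORMAT, NETWORK_LATENCY_TLV_TYPE, TLV_LENGTH, value)


def decode_network_latency_tlv(tlv):
    """Return the latency in microseconds carried by an encoded TLV."""
    if len(tlv) != TLV_LENGTH:
        raise ValueError("unexpected TLV size")
    tlv_type, length, latency_us = struct.unpack(TLV_FORMAT, tlv)
    if tlv_type != NETWORK_LATENCY_TLV_TYPE or length != TLV_LENGTH:
        raise ValueError("not a NETWORK_LATENCY TLV")
    return latency_us


def check_tlv_types(tlv_types):
    """Reject an AIGP attribute carrying both the AIGP and latency TLVs."""
    if NETWORK_LATENCY_TLV_TYPE in tlv_types and AIGP_TLV_TYPE in tlv_types:
        raise ValueError("NETWORK_LATENCY TLV MUST NOT co-exist with AIGP TLV")
```

For instance, encode_network_latency_tlv(1500) yields a 7-octet TLV whose last four octets carry 1500 microseconds, and any input above 0xFFFFFFFF is clamped to 0xFFFFFFFF before encoding.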

193     When distributing a performance route learnt from a BGP peer, if this
194     BGP speaker has set itself as the NEXT_HOP of that route, the value
195     of the NETWORK_LATENCY TLV SHOULD be increased by adding the network
196     latency from itself to the previous NEXT_HOP of that route.
197     Otherwise, the NETWORK_LATENCY TLV of that route MUST NOT be
198     modified.

200     How to obtain the network latency to a given BGP NEXT_HOP is
201     outside the scope of this document.  However, note that the path
202     latency to the NEXT_HOP SHOULD approximately represent the network
203     latency of the exact forwarding path towards the NEXT_HOP.  For
204     example, if a BGP speaker uses a Traffic Engineering (TE) Label
205     Switched Path (LSP) from itself to the NEXT_HOP, rather than the
206     shortest path calculated by the Interior Gateway Protocol (IGP), the
207     latency to the NEXT_HOP SHOULD reflect the network latency of that TE
208     LSP, rather than the IGP shortest path.  In the case where the
209     latency to the NEXT_HOP cannot be obtained for some reason,
210     that latency SHOULD be set to 0xFFFFFFFF by default.

212     To keep performance routes stable enough, a BGP speaker SHOULD use a
213     configurable threshold for network latency fluctuation to avoid
214     sending any update which would otherwise be triggered by a minor
215     network latency fluctuation below that threshold.

217  4.  Capability Advertisement

219     A BGP speaker that uses multiprotocol extensions to advertise
220     performance routes SHOULD use the Capabilities Optional Parameter, as
221     defined in [RFC5492], to inform its peers about this capability.

223     The MP_EXT Capability Code, as defined in [RFC4760], is used to
224     advertise the (AFI, SAFI) pairs available on a particular connection.

226     A BGP speaker that implements the Performance Routing Capability MUST
227     support the BGP Labeled Route Capability, as defined in [RFC3107].  A
228     BGP speaker that advertises the Performance Routing Capability to a
229     peer using BGP Capabilities advertisement [RFC5492] does not have to
230     advertise the BGP Labeled Route Capability to that peer.

232  5.  Performance Route Selection

234     Performance route selection only requires the following modification
235     to the tie-breaking procedures of the BGP route selection decision
236     (phase 2) described in [RFC4271]: network latency metric comparison
237     SHOULD be executed just ahead of the AS-Path Length comparison step.
238     Prior to executing the network latency metric comparison, the value
239     of the NETWORK_LATENCY TLV SHOULD be increased by adding the network
240     latency from the BGP speaker to the NEXT_HOP of that route.

242     The Loc-RIB of the performance routing paradigm is independent from
243     that of the vanilla routing paradigm.  Accordingly, the routing table
244     of the performance routing paradigm is independent from that of the
245     vanilla routing paradigm.  Whether the performance routing paradigm
246     or the vanilla routing paradigm is applied to a given packet is
247     a local policy issue which is outside the scope of this document.

249  6.  Deployment Considerations

251     This section is not normative.

253     Enabling the performance-based BGP routing at large (i.e., among
254     domains that do not belong to the same administrative entity) may be
255     conditioned by other administrative settlement considerations that
256     are out of scope of this document.  Nevertheless, this document does
257     not require nor exclude activating the proposed route selection
258     scheme between domains that are managed by distinct administrative
259     entities.

261     The main deployment case targeted by this specification is where
262     involved domains are managed by the same administrative entity.
263     Concretely, this performance-based BGP routing mechanism can
264     advantageously be enabled in a multi-domain environment, where all
265     the involved domains are operated by the same administrative entity
266     so that the processing of the low latency routes can be consistent
267     throughout the domains.  Besides security considerations that may
268     arise (and which are further discussed in Section 8), there is indeed
269     a need to consistently enforce a low-latency-based BGP routing policy
270     within a set of domains that belong to the same administrative
271     entity.  This is motivated by the processing of traffic which is of
272     very different nature and which may have different QoS requirements.
273     Moreover, the combined use of BGP-inferred low latency information
274     with traffic engineering tools that would lead to the computation and
275     the establishment of traffic-engineered LSPs between "low
276     latency"-enabled BGP peers based upon the manipulation of the
277     Unidirectional Link delay sub-TLV
278     [I-D.ietf-isis-te-metric-extensions]
279     [I-D.ietf-ospf-te-metric-extensions] would help guarantee
280     the overall consistency of the low latency information within each
281     domain.

283     In network environments where route reflectors are deployed but
284     next-hop-self is disabled on them, route reflectors usually reflect
285     those received routes which are optimal (i.e., lowest latency) from
286     their perspectives but may not be optimal from the receivers'
287     perspectives.  Some existing solutions as described in
288     [I-D.ietf-idr-add-paths], [I-D.ietf-idr-bgp-optimal-route-reflection]
289     and [RFC6774] can be used to address this issue.

291     From a network provider perspective, the ability to manipulate low
292     latency routes may lead to different, presumably service-specific
293     designs.  In particular, there is a need to assess the impact of
294     using such capability on the overall performance of the BGP peers
295     with respect to the route computation and selection procedure, as a
296     function of the tie-breaking operation.  A typical use case would consist in
297     selecting low latency routes for traffic that, for example, pertains to
298     VoIP, or whose nature demands the selection of the lowest latency
299     route in the Adj-RIB-Out database of the corresponding BGP peers.
300     Typically, live broadcasting services or some e-health services could
301     take advantage of such capability.  It is out of scope of
302     this document to exhaustively elaborate on such service-specific
303     designs, which are deployment-specific.

305  7.  IANA Considerations

307     A new BGP Capability Code for the Performance Routing Capability, a
308     new SAFI specific for performance routing and a new type code for the
309     NETWORK_LATENCY TLV of the AIGP attribute are required to be
310     allocated by IANA.

312  8.  Security Considerations

314     In addition to the considerations discussed in [RFC4271], the
315     following items should be considered as well:

317     a.  Tweaking the value of the NETWORK_LATENCY TLV by an illegitimate
318         party may influence the route selection results.  Therefore, the
319         Performance Routing Capability negotiation between BGP peers
320         which belong to different administrative domains MUST be disabled
321         by default.  Furthermore, a BGP speaker MUST discard all
322         performance routes received from the BGP peer for which the
323         Performance Routing Capability negotiation has been disabled.

325     b.  Frequent updates of the NETWORK_LATENCY TLV may have a severe
326         impact on the stability of the routing system.  Such practice
327         SHOULD be avoided by setting a reasonable threshold for network
328         latency fluctuation.

330  9.  Acknowledgements

332     Thanks to Joel Halpern, Alvaro Retana, Jim Uttaro, Robert Raszuk,
333     Eric Rosen, Bruno Decraene, Qing Zeng, Jie Dong, Mach Chen, Saikat
334     Ray, Wes George, Jeff Haas, John Scudder, Stephane Litkowski and
335     Sriganesh Kini for their valuable comments on this document.  Special
336     thanks should be given to Jim Uttaro and Eric Rosen for their
337     proposal of using a new TLV of the AIGP attribute to convey the
338     network latency metric.

340  10.  References

342  10.1.  Normative References

344     [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
345                Requirement Levels", BCP 14, RFC 2119, March 1997.

347     [RFC4271]  Rekhter, Y., Li, T., and S. Hares, "A Border Gateway
348                Protocol 4 (BGP-4)", RFC 4271, January 2006.

350     [RFC7311]  Mohapatra, P., Fernando, R., Rosen, E., and J. Uttaro,
351                "The Accumulated IGP Metric Attribute for BGP", RFC 7311,
352                August 2014.

354  10.2.  Informative References

356     [I-D.ietf-idr-add-paths]
357                Walton, D., Retana, A., Chen, E., and J. Scudder,
358                "Advertisement of Multiple Paths in BGP", draft-ietf-idr-
359                add-paths-10 (work in progress), October 2014.

361     [I-D.ietf-idr-bgp-optimal-route-reflection]
362                Raszuk, R., Cassar, C., Aman, E., Decraene, B., and S.
363                Litkowski, "BGP Optimal Route Reflection (BGP-ORR)",
364                draft-ietf-idr-bgp-optimal-route-reflection-08 (work in
365                progress), October 2014.

367     [I-D.ietf-isis-te-metric-extensions]
368                Previdi, S., Giacalone, S., Ward, D., Drake, J., Atlas,
369                A., Filsfils, C., and W. Wu, "IS-IS Traffic Engineering
370                (TE) Metric Extensions", draft-ietf-isis-te-metric-
371                extensions-04 (work in progress), October 2014.

373     [I-D.ietf-ospf-te-metric-extensions]
374                Giacalone, S., Ward, D., Drake, J., Atlas, A., and S.
375                Previdi, "OSPF Traffic Engineering (TE) Metric
376                Extensions", draft-ietf-ospf-te-metric-extensions-11 (work
377                in progress), January 2015.

379     [RFC2679]  Almes, G., Kalidindi, S., and M. Zekauskas, "A One-way
380                Delay Metric for IPPM", RFC 2679, September 1999.

382     [RFC3107]  Rekhter, Y. and E. Rosen, "Carrying Label Information in
383                BGP-4", RFC 3107, May 2001.

385     [RFC3630]  Katz, D., Kompella, K., and D. Yeung, "Traffic Engineering
386                (TE) Extensions to OSPF Version 2", RFC 3630, September
387                2003.

389     [RFC4760]  Bates, T., Chandra, R., Katz, D., and Y. Rekhter,
390                "Multiprotocol Extensions for BGP-4", RFC 4760, January
391                2007.

393     [RFC5305]  Li, T. and H. Smit, "IS-IS Extensions for Traffic
394                Engineering", RFC 5305, October 2008.

396     [RFC5492]  Scudder, J. and R. Chandra, "Capabilities Advertisement
397                with BGP-4", RFC 5492, February 2009.

399     [RFC6774]  Raszuk, R., Fernando, R., Patel, K., McPherson, D., and K.
400                Kumaki, "Distribution of Diverse BGP Paths", RFC 6774,
401                November 2012.

403  Authors' Addresses

405     Xiaohu Xu
406     Huawei

408     Email: xuxiaohu@huawei.com

410     Mohamed Boucadair
411     France Telecom

413     Email: mohamed.boucadair@orange.com

415     Christian Jacquenet
416     France Telecom

418     Email: christian.jacquenet@orange.com

420     Ning So
421     Vinci Systems

423     Email: ning.so@vinci-systems.com

425     Yimin Shen
426     Juniper

428     Email: yshen@juniper.net

429     Uma Chunduri
430     Ericsson

432     Email: uma.chunduri@ericsson.com

434     Hui Ni
435     Huawei

437     Email: nihui@huawei.com

439     Yongbing Fan
440     China Telecom

442     Email: fanyb@gsta.com

444     Luis M. Contreras
445     Telefonica I+D
446     Ronda de la Comunicacion, s/n
447     Sur-3 building, 3rd floor
448     Madrid, 28050
449     Spain

451     Email: luismiguel.contrerasmurillo@telefonica.com
452     URI:   http://people.tid.es/LuisM.Contreras/