Network Working Group                                              X. Xu
Internet-Draft                                                    Huawei
Intended status: Standards Track                            M. Boucadair
Expires: July 11, 2015                                      C. Jacquenet
                                                          France Telecom
                                                                   N. So
                                                           Vinci Systems
                                                                 Y. Shen
                                                                 Juniper
                                                             U. Chunduri
                                                                Ericsson
                                                                   H. Ni
                                                                  Huawei
                                                                  Y. Fan
                                                           China Telecom
                                                         January 7, 2015

                Performance-based BGP Routing Mechanism
                 draft-ietf-idr-performance-routing-00

Abstract

   The current BGP specification doesn't use network performance metrics
   (e.g., network latency) in the route selection decision process.
   This document describes a performance-based BGP routing mechanism in
   which the network latency metric is taken as one of the route
   selection criteria.  This routing mechanism is useful for service
   providers with global reach that want to deliver low-latency network
   connectivity services to their customers.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on July 11, 2015.

Copyright Notice

   Copyright (c) 2015 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.
   Code Components extracted from this document must include Simplified
   BSD License text as described in Section 4.e of the Trust Legal
   Provisions and are provided without warranty as described in the
   Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Requirements Language
   2.  Terminology
   3.  Performance Route Advertisement
   4.  Capability Advertisement
   5.  Performance Route Selection
   6.  Deployment Considerations
   7.  Acknowledgements
   8.  IANA Considerations
   9.  Security Considerations
   10. References
     10.1.  Normative References
     10.2.  Informative References
   Authors' Addresses

1.  Introduction

   Network latency is widely recognized as one of the major obstacles to
   migrating business applications to the cloud, since cloud-based
   applications usually have clearly defined and stringent network
   latency requirements.  Service providers with global reach aim at
   delivering low-latency network connectivity services to their cloud
   service customers as a competitive advantage.  Sometimes, the network
   connectivity may travel across more than one Autonomous System (AS)
   under their administration.  However, BGP [RFC4271], which is used
   for path selection across ASes, doesn't use network latency in the
   route selection process.
   As such, the best route selected based upon the existing BGP route
   selection criteria may not be the best from the customer experience
   perspective.

   This document describes a performance-based BGP routing paradigm in
   which the network latency metric is disseminated via a new TLV of the
   AIGP attribute [RFC7311] and that metric is used as an input to the
   route selection process.  This mechanism is useful for service
   providers with global reach, which usually own more than one AS, to
   deliver low-latency network connectivity services to their customers.

   Furthermore, in order to be backward compatible with existing BGP
   implementations and have no impact on the stability of the overall
   routing system, it's expected that the performance routing paradigm
   could coexist with the vanilla routing paradigm.  As such, service
   providers could provide low-latency routing services while still
   offering the vanilla routing services, depending on customers'
   requirements.

   For the sake of simplicity, this document considers only one network
   performance metric: network latency.  The support of multiple network
   performance metrics is out of scope of this document.  In addition,
   this document focuses exclusively on BGP matters; matters unrelated
   to BGP, such as the mechanisms for measuring network latency, are
   outside the scope of this document.

   A variant of this performance-based BGP routing has been implemented
   (see http://www.ist-mescal.org/roadmap/qbgp-demo.avi).

1.1.  Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

2.  Terminology

   This memo makes use of the terms defined in [RFC4271].

   Network latency indicates the amount of time it takes for a packet to
   traverse a given network path [RFC2679].  When a packet is forwarded
   along a path that contains multiple links and routers, the network
   latency is the sum of the transmission latency of each link (i.e.,
   link latency) plus the sum of the internal delay incurred within each
   router (i.e., router latency), which includes queuing latency and
   processing latency.  The sum of the link latencies is also known as
   the cumulative link latency.  In today's service provider networks,
   which usually span a wide geographical area, the cumulative link
   latency is the major part of the network latency, since the total
   internal latency incurred within the high-capacity routers is small
   compared to the cumulative link latency.  In other words, the
   cumulative link latency can approximately represent the network
   latency in such networks.

   Furthermore, since the link latency is more stable than the router
   latency, the approximate network latency represented by the
   cumulative link latency is more stable.  Therefore, if there is a way
   to calculate the cumulative link latency of a given network path, it
   is strongly recommended to use that cumulative link latency to
   approximately represent the network latency.  Otherwise, the network
   latency would have to be measured frequently by some means (e.g.,
   PING or other measurement tools).

3.  Performance Route Advertisement

   Performance (i.e., low latency) routes SHOULD be exchanged between
   BGP peers by means of a specific Subsequent Address Family Identifier
   (SAFI) of TBD (see Section 8) and also be carried as labeled routes
   as per [RFC3107].  In other words, performance routes can be viewed
   as specific labeled routes that are associated with a network latency
   metric.

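   As a non-normative illustration, the NETWORK_LATENCY TLV specified in
   this section could be encoded as sketched below.  The sketch assumes
   the generic AIGP TLV layout of [RFC7311] (a 1-octet type, a 2-octet
   length that covers the whole TLV, then the value).  The type code
   0xFE and the function names are hypothetical placeholders, since the
   actual type code is still TBD.

```python
import struct

# Hypothetical placeholder: the real type code is TBD (to be assigned by IANA).
NETWORK_LATENCY_TLV_TYPE = 0xFE
# Largest latency, in microseconds, that fits the 4-octet value field.
MAX_LATENCY_US = 0xFFFFFFFF
# Assumed RFC 7311 layout: type (1) + length (2) + value (4) octets.
TLV_LENGTH = 1 + 2 + 4


def encode_network_latency_tlv(latency_us: int) -> bytes:
    """Encode a latency in microseconds, saturating at 0xFFFFFFFF."""
    if latency_us < 0:
        raise ValueError("latency must be non-negative")
    value = min(latency_us, MAX_LATENCY_US)
    # "!" selects network byte order with no padding between fields.
    return struct.pack("!BHI", NETWORK_LATENCY_TLV_TYPE, TLV_LENGTH, value)


def decode_network_latency_tlv(data: bytes) -> int:
    """Return the latency in microseconds carried in the TLV."""
    if len(data) < TLV_LENGTH:
        raise ValueError("truncated TLV")
    tlv_type, length, value = struct.unpack("!BHI", data[:TLV_LENGTH])
    if tlv_type != NETWORK_LATENCY_TLV_TYPE or length != TLV_LENGTH:
        raise ValueError("not a NETWORK_LATENCY TLV")
    return value
```

   Clamping the value at 0xFFFFFFFF mirrors the rule given in this
   section for abnormally large cumulative latencies.
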
   A BGP speaker SHOULD NOT advertise performance routes to a particular
   BGP peer unless that peer indicates, through BGP capability
   advertisement (see Section 4), that it can process update messages
   with that specific SAFI field.

   The network latency metric is attached to the performance routes via
   a new TLV of the AIGP attribute, referred to as the NETWORK_LATENCY
   TLV.  The value of this TLV indicates the network latency in
   microseconds from the BGP speaker depicted by the NEXT_HOP path
   attribute to the address depicted by the NLRI prefix.  The type code
   of this TLV is TBD (see Section 8), and the value field is 4 octets
   in length.  In some abnormal cases, if the cumulative link latency
   exceeds the maximum value of 0xFFFFFFFF, the value field SHOULD be
   set to 0xFFFFFFFF.  Note that the NETWORK_LATENCY TLV MUST NOT
   coexist with the AIGP TLV within the same AIGP attribute.

   A BGP speaker SHOULD be configurable to enable or disable the
   origination of performance routes.  If enabled, a local latency value
   for a given to-be-originated performance route MUST be configured on
   the BGP speaker so that it can be placed in the NETWORK_LATENCY TLV
   of that performance route.

   When distributing a performance route learnt from a BGP peer, if this
   BGP speaker has set itself as the NEXT_HOP of such route, the value
   of the NETWORK_LATENCY TLV SHOULD be increased by adding the network
   latency from itself to the previous NEXT_HOP of such route.
   Otherwise, the NETWORK_LATENCY TLV of such route MUST NOT be
   modified.

   How to obtain the network latency to a given BGP NEXT_HOP is outside
   the scope of this document.  However, note that the path latency to
   the NEXT_HOP SHOULD approximately represent the network latency of
   the exact forwarding path towards the NEXT_HOP.
   For example, if a BGP speaker uses a Traffic Engineering (TE) Label
   Switched Path (LSP) from itself to the NEXT_HOP, rather than the
   shortest path calculated by the Interior Gateway Protocol (IGP), the
   latency to the NEXT_HOP SHOULD reflect the network latency of that TE
   LSP, rather than that of the IGP shortest path.  In the case where
   the latency to the NEXT_HOP cannot be obtained for some reason, that
   latency SHOULD be set to 0xFFFFFFFF by default.

   To keep performance routes stable enough, a BGP speaker SHOULD use a
   configurable threshold for network latency fluctuation to avoid
   sending any update which would otherwise be triggered by a minor
   network latency fluctuation below that threshold.

4.  Capability Advertisement

   A BGP speaker that uses multiprotocol extensions to advertise
   performance routes SHOULD use the Capabilities Optional Parameter, as
   defined in [RFC5492], to inform its peers about this capability.

   The MP_EXT Capability Code, as defined in [RFC4760], is used to
   advertise the (AFI, SAFI) pairs available on a particular connection.

   A BGP speaker that implements the Performance Routing Capability MUST
   support the BGP Labeled Route Capability, as defined in [RFC3107].  A
   BGP speaker that advertises the Performance Routing Capability to a
   peer using BGP Capabilities advertisement [RFC5492] does not have to
   advertise the BGP Labeled Route Capability to that peer.

5.  Performance Route Selection

   Performance route selection only requires the following modification
   to the tie-breaking procedures of the BGP route selection decision
   (phase 2) described in [RFC4271]: the network latency metric
   comparison SHOULD be executed just ahead of the AS-Path Length
   comparison step.

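   The modified decision process can be sketched as follows.  This is a
   simplified, non-normative example: the PerformanceRoute type, its
   field names, and the latency_to_next_hop table are assumptions made
   for illustration, the degree of preference is reduced to LOCAL_PREF,
   and the RFC 4271 tie-breakers that follow the AS-Path Length step are
   omitted.  The sketch adds the local latency to the NEXT_HOP before
   comparing and treats an unobtainable latency as 0xFFFFFFFF, as this
   document specifies for those cases.

```python
from dataclasses import dataclass
from typing import Dict, Iterable

MAX_LATENCY_US = 0xFFFFFFFF  # saturation value of the 4-octet field


@dataclass(frozen=True)
class PerformanceRoute:
    """Hypothetical, minimal view of a received performance route."""
    next_hop: str
    advertised_latency_us: int  # value carried in the NETWORK_LATENCY TLV
    local_pref: int
    as_path_len: int


def effective_latency(route: PerformanceRoute,
                      latency_to_next_hop: Dict[str, int]) -> int:
    """Advertised latency plus local latency to the NEXT_HOP, saturated."""
    # An unknown latency to the NEXT_HOP defaults to the maximum value.
    local = latency_to_next_hop.get(route.next_hop, MAX_LATENCY_US)
    return min(route.advertised_latency_us + local, MAX_LATENCY_US)


def select_best(routes: Iterable[PerformanceRoute],
                latency_to_next_hop: Dict[str, int]) -> PerformanceRoute:
    """Pick the best route: preference, then latency, then AS-path length."""
    return min(
        routes,
        key=lambda r: (
            -r.local_pref,                              # higher preference wins
            effective_latency(r, latency_to_next_hop),  # lower latency wins
            r.as_path_len,                              # shorter AS path wins
        ),
    )
```

   Because the latency comparison runs before the AS-Path Length
   comparison, a route with a longer AS path can be selected when its
   accumulated latency is lower.
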
   Prior to executing the network latency metric comparison, the value
   of the NETWORK_LATENCY TLV SHOULD be increased by adding the network
   latency from the BGP speaker to the NEXT_HOP of that route.  In the
   case where a route reflector is deployed without next-hop-self
   enabled when reflecting routes received from one IBGP peer to other
   IBGP peers, it is RECOMMENDED to enable such route reflector to
   reflect all received performance routes by using some mechanism such
   as [I-D.ietf-idr-add-paths], rather than reflecting only the
   performance route which is the best from its own perspective.
   Otherwise, it may result in a non-optimal choice by its clients
   and/or its IBGP peers.

   The Loc-RIB of the performance routing paradigm is independent from
   that of the vanilla routing paradigm.  Accordingly, the routing table
   of the performance routing paradigm is independent from that of the
   vanilla routing paradigm.  Whether the performance routing paradigm
   or the vanilla routing paradigm is used for a given packet is a local
   policy issue which is outside the scope of this document.

6.  Deployment Considerations

   It is strongly RECOMMENDED to deploy this performance-based BGP
   routing mechanism across multiple ASes which belong to a single
   administrative domain.  Within each AS, it is RECOMMENDED to deliver
   a packet from a BGP speaker to the BGP NEXT_HOP via tunnels,
   typically TE LSP tunnels.  Furthermore, if a TE LSP is used between
   IBGP peers, it is RECOMMENDED to use the latency metric carried in
   the Unidirectional Link Delay Sub-TLV
   [I-D.ietf-isis-te-metric-extensions]
   [I-D.ietf-ospf-te-metric-extensions] if possible, rather than the TE
   metric [RFC3630][RFC5305], to calculate the cumulative link latency
   associated with the TE LSP and use that cumulative link latency to
   approximately represent the network latency.
   Thus, there is no need for frequent measurement of network latency
   between IBGP peers.

7.  Acknowledgements

   Thanks to Joel Halpern, Alvaro Retana, Jim Uttaro, Robert Raszuk,
   Eric Rosen, Qing Zeng, Jie Dong, Mach Chen, Saikat Ray, Wes George,
   Jeff Haas, John Scudder, Stephane Litkowski and Sriganesh Kini for
   their valuable comments on the initial idea of this document.
   Special thanks should be given to Jim Uttaro and Eric Rosen for their
   proposal of using a new TLV of the AIGP attribute to convey the
   network latency metric.

8.  IANA Considerations

   A new BGP Capability Code for the Performance Routing Capability, a
   new SAFI specific for performance routing and a new type code for the
   NETWORK_LATENCY TLV of the AIGP attribute are required to be
   allocated by IANA.

9.  Security Considerations

   In addition to the considerations discussed in [RFC4271], the
   following items should be considered as well:

   a.  Tweaking the value of the NETWORK_LATENCY TLV by an illegitimate
       party may influence the route selection results.  Therefore, the
       Performance Routing Capability negotiation between BGP peers
       which belong to different administrative domains MUST be disabled
       by default.  Furthermore, a BGP speaker MUST discard all
       performance routes received from a BGP peer for which the
       Performance Routing Capability negotiation has been disabled.

   b.  Frequent updates of the NETWORK_LATENCY TLV may have a severe
       impact on the stability of the routing system.  Such practice
       SHOULD be avoided by setting a reasonable threshold for network
       latency fluctuation.

10.  References

10.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC4271]  Rekhter, Y., Li, T., and S. Hares, "A Border Gateway
              Protocol 4 (BGP-4)", RFC 4271, January 2006.

   [RFC7311]  Mohapatra, P., Fernando, R., Rosen, E., and J. Uttaro,
              "The Accumulated IGP Metric Attribute for BGP", RFC 7311,
              August 2014.

10.2.  Informative References

   [I-D.ietf-idr-add-paths]
              Walton, D., Retana, A., Chen, E., and J. Scudder,
              "Advertisement of Multiple Paths in BGP", draft-ietf-idr-
              add-paths-10 (work in progress), October 2014.

   [I-D.ietf-isis-te-metric-extensions]
              Previdi, S., Giacalone, S., Ward, D., Drake, J., Atlas,
              A., Filsfils, C., and W. Wu, "IS-IS Traffic Engineering
              (TE) Metric Extensions", draft-ietf-isis-te-metric-
              extensions-04 (work in progress), October 2014.

   [I-D.ietf-ospf-te-metric-extensions]
              Giacalone, S., Ward, D., Drake, J., Atlas, A., and S.
              Previdi, "OSPF Traffic Engineering (TE) Metric
              Extensions", draft-ietf-ospf-te-metric-extensions-10 (work
              in progress), January 2015.

   [RFC2679]  Almes, G., Kalidindi, S., and M. Zekauskas, "A One-way
              Delay Metric for IPPM", RFC 2679, September 1999.

   [RFC3107]  Rekhter, Y. and E. Rosen, "Carrying Label Information in
              BGP-4", RFC 3107, May 2001.

   [RFC3630]  Katz, D., Kompella, K., and D. Yeung, "Traffic Engineering
              (TE) Extensions to OSPF Version 2", RFC 3630, September
              2003.

   [RFC4760]  Bates, T., Chandra, R., Katz, D., and Y. Rekhter,
              "Multiprotocol Extensions for BGP-4", RFC 4760, January
              2007.

   [RFC5305]  Li, T. and H. Smit, "IS-IS Extensions for Traffic
              Engineering", RFC 5305, October 2008.

   [RFC5492]  Scudder, J. and R. Chandra, "Capabilities Advertisement
              with BGP-4", RFC 5492, February 2009.

Authors' Addresses

   Xiaohu Xu
   Huawei

   Email: xuxiaohu@huawei.com


   Mohamed Boucadair
   France Telecom

   Email: mohamed.boucadair@orange.com


   Christian Jacquenet
   France Telecom

   Email: christian.jacquenet@orange.com


   Ning So
   Vinci Systems

   Email: ning.so@vinci-systems.com


   Yimin Shen
   Juniper

   Email: yshen@juniper.net


   Uma Chunduri
   Ericsson

   Email: uma.chunduri@ericsson.com


   Hui Ni
   Huawei

   Email: nihui@huawei.com


   Yongbing Fan
   China Telecom

   Email: fanyb@gsta.com