Network Working Group                                              X. Xu
Internet-Draft                                                    Huawei
Intended status: Standards Track                            M. Boucadair
Expires: March 15, 2015                                     C. Jacquenet
                                                          France Telecom
                                                                   N. So
                                                           Vinci Systems
                                                                 Y. Shen
                                                                 Juniper
                                                             U. Chunduri
                                                                Ericsson
                                                                   H. Ni
                                                                  Huawei
                                                                  Y. Fan
                                                           China Telecom
                                                      September 11, 2014


                Performance-based BGP Routing Mechanism
                  draft-xu-idr-performance-routing-01

Abstract

   The current BGP specification doesn't use network performance metrics
   (e.g., network latency) in the route selection decision process.
   This document describes a performance-based BGP routing mechanism in
   which the network latency metric is taken as one of the route
   selection criteria.  This routing mechanism is useful for service
   providers with global reach that want to deliver low-latency network
   connectivity services to their customers.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on March 15, 2015.

Copyright Notice

   Copyright (c) 2014 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Requirements Language
   2.  Terminology
   3.  Performance Route Advertisement
   4.  Capability Advertisement
   5.  Performance Route Selection
   6.  Deployment Considerations
   7.  Acknowledgements
   8.  IANA Considerations
   9.  Security Considerations
   10. References
     10.1.  Normative References
     10.2.  Informative References
   Authors' Addresses

1.  Introduction

   Network latency is widely recognized as one of the major obstacles to
   migrating business applications to the cloud, since cloud-based
   applications usually have clearly defined and stringent network
   latency requirements.  Service providers with global reach aim to
   deliver low-latency network connectivity services to their cloud
   service customers as a competitive advantage.  Sometimes the network
   connectivity spans more than one Autonomous System (AS) under their
   administration.  However, BGP [RFC4271], which is used for path
   selection across ASes, doesn't use network latency in the route
   selection process.  As such, the best route selected by the existing
   BGP route selection criteria may not be the best from the customer
   experience perspective.

   This document describes a performance-based BGP routing paradigm in
   which the network latency metric is disseminated via a new TLV of the
   AIGP attribute [RFC7311] and that metric is used as an input to the
   route selection process.  This mechanism is useful for service
   providers with global reach, which usually own more than one AS, to
   deliver low-latency network connectivity services to their customers.

   Furthermore, in order to be backward compatible with existing BGP
   implementations and have no impact on the stability of the overall
   routing system, it's expected that the performance routing paradigm
   could coexist with the vanilla routing paradigm.  Service providers
   could thus provide low-latency routing services while still offering
   the vanilla routing services, depending on customers' requirements.

   For the sake of simplicity, this document considers only one network
   performance metric: network latency.  The support of multiple network
   performance metrics is out of scope of this document.  In addition,
   this document focuses exclusively on BGP matters; matters unrelated
   to BGP, such as the mechanisms for measuring network latency, are
   outside the scope of this document.

   A variant of this performance-based BGP routing has been implemented
   (see http://www.ist-mescal.org/roadmap/qbgp-demo.avi).

1.1.  Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

2.  Terminology

   This memo makes use of the terms defined in [RFC4271].

   Network latency indicates the amount of time it takes for a packet to
   traverse a given network path [RFC2679].  If a packet is forwarded
   along a path that contains multiple links and routers, the network
   latency is the sum of the transmission latency of each link (i.e.,
   link latency), plus the sum of the internal delay incurred within
   each router (i.e., router latency), which includes queuing latency
   and processing latency.  The sum of the link latencies is also known
   as the cumulative link latency.  In today's service provider
   networks, which usually span a wide geographical area, the cumulative
   link latency is the major part of the network latency, since the
   total internal latency incurred within high-capacity routers is
   usually small compared to the cumulative link latency.  In other
   words, the cumulative link latency can approximately represent the
   network latency in such networks.

   Furthermore, since the link latency is more stable than the router
   latency, the approximate network latency represented by the
   cumulative link latency is also more stable.  Therefore, if there is
   a way to calculate the cumulative link latency of a given network
   path, it is strongly recommended to use that cumulative link latency
   to approximately represent the network latency.  Otherwise, the
   network latency would have to be measured frequently by some means
   (e.g., PING or other measurement tools).

3.  Performance Route Advertisement

   Performance (i.e., low latency) routes SHOULD be exchanged between
   BGP peers by means of a specific Subsequent Address Family Identifier
   (SAFI) of TBD (see IANA Section) and also be carried as labeled
   routes as per [RFC3107].  In other words, performance routes can be
   regarded as specific labeled routes that are associated with a
   network latency metric.
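
   The latency metric associated with such a route is, per Section 2,
   approximated by the cumulative link latency of the forwarding path.
   The following is a minimal sketch of that approximation and is not
   part of the specification.  The function name and the per-link delay
   values are hypothetical; delays are expressed in microseconds, the
   unit this document uses for the metric.

```python
def cumulative_link_latency_us(link_delays_us):
    """Sum per-link one-way delays (in microseconds) along a path.

    Router latency (queuing and processing) is deliberately ignored,
    following the approximation described in Section 2.
    """
    if any(delay < 0 for delay in link_delays_us):
        raise ValueError("link delay must be non-negative")
    return sum(link_delays_us)


# Hypothetical three-link path: 12 ms, 3.5 ms and 40 ms links.
path = [12_000, 3_500, 40_000]
total_us = cumulative_link_latency_us(path)
print(total_us)  # 55500
```

   Because only link delays are summed, the result changes only when
   the path or a link delay changes, which is the stability property
   that Section 2 relies on.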

   A BGP speaker SHOULD NOT advertise performance routes to a particular
   BGP peer unless that peer indicates, through BGP capability
   advertisement (see Section 4), that it can process update messages
   with that specific SAFI field.

   The network latency metric is attached to performance routes via a
   new TLV of the AIGP attribute, referred to as the NETWORK_LATENCY
   TLV.  The value of this TLV indicates the network latency in
   microseconds from the BGP speaker depicted by the NEXT_HOP path
   attribute to the address depicted by the NLRI prefix.  The type code
   of this TLV is TBD (see IANA Section), and the value field is 4
   octets in length.  In abnormal cases where the cumulative link
   latency exceeds the maximum value of 0xFFFFFFFF, the value field
   SHOULD be set to 0xFFFFFFFF.

   A BGP speaker SHOULD be configurable to enable or disable the
   origination of performance routes.  If enabled, a local latency value
   for a given to-be-originated performance route MUST be configured on
   the BGP speaker so that it can be placed in the NETWORK_LATENCY TLV
   of that performance route.

   When distributing a performance route learnt from a BGP peer, if the
   BGP speaker has set itself as the NEXT_HOP of that route, the value
   of the NETWORK_LATENCY TLV SHOULD be increased by adding the network
   latency from itself to the previous NEXT_HOP of that route.
   Otherwise, the NETWORK_LATENCY TLV of that route MUST NOT be
   modified.

   How to obtain the network latency to a given BGP NEXT_HOP is outside
   the scope of this document.  However, note that the path latency to
   the NEXT_HOP SHOULD approximately represent the network latency of
   the exact forwarding path towards the NEXT_HOP.  For example, if a
   BGP speaker uses a Traffic Engineering (TE) Label Switched Path (LSP)
   from itself to the NEXT_HOP, rather than the shortest path calculated
   by the Interior Gateway Protocol (IGP), the latency to the NEXT_HOP
   SHOULD reflect the network latency of that TE LSP, rather than that
   of the IGP shortest path.

   To keep performance routes stable enough, a BGP speaker SHOULD use a
   configurable threshold for network latency fluctuation to avoid
   sending any update which would otherwise be triggered by a minor
   network latency fluctuation below that threshold.

4.  Capability Advertisement

   A BGP speaker that uses multiprotocol extensions to advertise
   performance routes SHOULD use the Capabilities Optional Parameter, as
   defined in [RFC5492], to inform its peers about this capability.

   The MP_EXT Capability Code, as defined in [RFC4760], is used to
   advertise the (AFI, SAFI) pairs available on a particular connection.

   A BGP speaker that implements the Performance Routing Capability MUST
   support the BGP Labeled Route Capability, as defined in [RFC3107].  A
   BGP speaker that advertises the Performance Routing Capability to a
   peer using BGP Capabilities advertisement [RFC5492] does not have to
   advertise the BGP Labeled Route Capability to that peer.

5.  Performance Route Selection

   Performance route selection only requires the following modification
   to the tie-breaking procedures of the BGP route selection decision
   (phase 2) described in [RFC4271]: network latency metric comparison
   SHOULD be executed just ahead of the AS-Path Length comparison step.

   Prior to executing the network latency metric comparison, the value
   of the NETWORK_LATENCY TLV SHOULD be increased by adding the network
   latency from the BGP speaker to the NEXT_HOP of that route.
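
   The two rules above can be sketched as follows.  This is an
   illustrative sketch under stated assumptions, not a definitive
   implementation: the Route class, its field names, and the
   latency_to_next_hop_us mapping are hypothetical.  Only the latency
   and AS-path-length steps are modeled; the degree-of-preference
   comparison and the remaining tie-breakers of [RFC4271] are omitted.
   Capping the locally computed sum at 0xFFFFFFFF mirrors the 4-octet
   limit of the TLV value field and is likewise an assumption.

```python
from dataclasses import dataclass

MAX_LATENCY_US = 0xFFFFFFFF  # largest value a 4-octet field can carry


@dataclass(frozen=True)
class Route:
    """Hypothetical, minimal view of a received performance route."""
    prefix: str
    next_hop: str
    as_path: tuple            # sequence of AS numbers
    network_latency_us: int   # value carried in the NETWORK_LATENCY TLV


def effective_latency_us(route, latency_to_next_hop_us):
    """TLV value plus the local latency to the route's NEXT_HOP.

    The cap at MAX_LATENCY_US is an assumption made for this sketch.
    """
    total = route.network_latency_us + latency_to_next_hop_us[route.next_hop]
    return min(total, MAX_LATENCY_US)


def select_best(routes, latency_to_next_hop_us):
    """Prefer the lowest effective latency, then the shortest AS path."""
    return min(
        routes,
        key=lambda r: (effective_latency_us(r, latency_to_next_hop_us),
                       len(r.as_path)),
    )


# Hypothetical candidates for the same prefix.
candidates = [
    Route("203.0.113.0/24", "192.0.2.1", (64500,), 30_000),
    Route("203.0.113.0/24", "192.0.2.2", (64501, 64502), 10_000),
]
local_latency = {"192.0.2.1": 2_000, "192.0.2.2": 5_000}
best = select_best(candidates, local_latency)
print(best.next_hop)  # 192.0.2.2
```

   In this example the second route wins with an effective latency of
   15000 microseconds against 32000, even though its AS path is longer,
   because latency is compared before AS-path length.
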

   In the case where a route reflector is deployed without next-hop-self
   enabled when reflecting routes received from one IBGP peer to other
   IBGP peers, it is RECOMMENDED to enable such a route reflector to
   reflect all received performance routes by using some mechanism such
   as [I-D.ietf-idr-add-paths], rather than reflecting only the
   performance route which is the best from its own perspective.
   Otherwise, it may result in a non-optimal choice by its clients and/
   or its IBGP peers.

   The Loc-RIB of the performance routing paradigm is independent from
   that of the vanilla routing paradigm.  Accordingly, the routing table
   of the performance routing paradigm is independent from that of the
   vanilla routing paradigm.  Whether the performance routing paradigm
   or the vanilla routing paradigm is used for a given packet is a local
   policy issue which is outside the scope of this document.

6.  Deployment Considerations

   It is strongly RECOMMENDED to deploy this performance-based BGP
   routing mechanism across multiple ASes which belong to a single
   administrative domain.  Within each AS, it is RECOMMENDED to deliver
   a packet from a BGP speaker to the BGP NEXT_HOP via tunnels,
   typically TE LSP tunnels.  Furthermore, if a TE LSP is used between
   IBGP peers, it is RECOMMENDED to use the latency metric carried in
   the Unidirectional Link Delay Sub-TLV
   [I-D.ietf-isis-te-metric-extensions]
   [I-D.ietf-ospf-te-metric-extensions] if possible, rather than the TE
   metric [RFC3630][RFC5305], to calculate the cumulative link latency
   associated with the TE LSP and to use that cumulative link latency to
   approximately represent the network latency.  Thus, there is no need
   for frequent measurement of network latency between IBGP peers.

7.  Acknowledgements

   Thanks to Joel Halpern, Alvaro Retana, Jim Uttaro, Robert Raszuk,
   Eric Rosen, Qing Zeng, Jie Dong, Mach Chen, Saikat Ray, Wes George,
   Jeff Haas, John Scudder and Sriganesh Kini for their valuable
   comments on the initial idea of this document.  Special thanks should
   be given to Jim Uttaro and Eric Rosen for their proposal of using a
   new TLV of the AIGP attribute to convey the network latency metric.

8.  IANA Considerations

   A new BGP Capability Code for the Performance Routing Capability, a
   new SAFI specific to performance routing, and a new type code for the
   NETWORK_LATENCY TLV of the AIGP attribute are required to be
   allocated by IANA.

9.  Security Considerations

   In addition to the considerations discussed in [RFC4271], the
   following items should be considered as well:

   a.  Tweaking the value of the NETWORK_LATENCY TLV by an illegitimate
       party may influence the route selection results.  Therefore,
       Performance Routing Capability negotiation MUST be disabled
       between BGP peers which belong to different administrative
       domains.

       Furthermore, a BGP speaker MUST discard all performance routes
       received from a BGP peer for which the Performance Routing
       Capability negotiation has been disabled.

   b.  Frequent updates of the NETWORK_LATENCY TLV may have a severe
       impact on the stability of the routing system.  Such practice
       SHOULD be avoided by setting a reasonable threshold for network
       latency fluctuation.

10.  References

10.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC4271]  Rekhter, Y., Li, T., and S. Hares, "A Border Gateway
              Protocol 4 (BGP-4)", RFC 4271, January 2006.

   [RFC7311]  Mohapatra, P., Fernando, R., Rosen, E., and J. Uttaro,
              "The Accumulated IGP Metric Attribute for BGP", RFC 7311,
              August 2014.

10.2.  Informative References

   [I-D.ietf-idr-add-paths]
              Walton, D., Retana, A., Chen, E., and J. Scudder,
              "Advertisement of Multiple Paths in BGP", draft-ietf-idr-
              add-paths-09 (work in progress), October 2013.

   [I-D.ietf-isis-te-metric-extensions]
              Previdi, S., Giacalone, S., Ward, D., Drake, J., Atlas,
              A., Filsfils, C., and W. Wu, "IS-IS Traffic Engineering
              (TE) Metric Extensions", draft-ietf-isis-te-metric-
              extensions-03 (work in progress), April 2014.

   [I-D.ietf-ospf-te-metric-extensions]
              Giacalone, S., Ward, D., Drake, J., Atlas, A., and S.
              Previdi, "OSPF Traffic Engineering (TE) Metric
              Extensions", draft-ietf-ospf-te-metric-extensions-05 (work
              in progress), December 2013.

   [RFC2679]  Almes, G., Kalidindi, S., and M. Zekauskas, "A One-way
              Delay Metric for IPPM", RFC 2679, September 1999.

   [RFC3107]  Rekhter, Y. and E. Rosen, "Carrying Label Information in
              BGP-4", RFC 3107, May 2001.

   [RFC3630]  Katz, D., Kompella, K., and D. Yeung, "Traffic Engineering
              (TE) Extensions to OSPF Version 2", RFC 3630, September
              2003.

   [RFC4760]  Bates, T., Chandra, R., Katz, D., and Y. Rekhter,
              "Multiprotocol Extensions for BGP-4", RFC 4760, January
              2007.

   [RFC5305]  Li, T. and H. Smit, "IS-IS Extensions for Traffic
              Engineering", RFC 5305, October 2008.

   [RFC5492]  Scudder, J. and R. Chandra, "Capabilities Advertisement
              with BGP-4", RFC 5492, February 2009.

Authors' Addresses

   Xiaohu Xu
   Huawei

   Email: xuxiaohu@huawei.com


   Mohamed Boucadair
   France Telecom

   Email: mohamed.boucadair@orange.com


   Christian Jacquenet
   France Telecom

   Email: christian.jacquenet@orange.com


   Ning So
   Vinci Systems

   Email: ning.so@vinci-systems.com


   Yimin Shen
   Juniper

   Email: yshen@juniper.net


   Uma Chunduri
   Ericsson

   Email: uma.chunduri@ericsson.com


   Hui Ni
   Huawei

   Email: nihui@huawei.com


   Yongbing Fan
   China Telecom

   Email: fanyb@gsta.com