idnits 2.17.1 draft-wang-teas-pce-native-ip-03.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- -- The document has an IETF Trust Provisions (28 Dec 2009) Section 6.c(ii) Publication Limitation clause. If this document is intended for submission to the IESG for publication, this constitutes an error. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** There is 1 instance of too long lines in the document, the longest one being 1 character in excess of 72. ** The abstract seems to contain references ([I-D.draft-ietf-teas-pcecc-use-cases], [I-D.draft-ietf-teas-pce-control-function]), which it shouldn't. Please replace those with straight textual mentions of the documents in question. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year == Line 141 has weird spacing: '...traffic betwe...' == Line 196 has weird spacing: '... that illus...' == Line 210 has weird spacing: '...routers respe...' == The document doesn't use any RFC 2119 keywords, yet seems to have RFC 2119 boilerplate text. -- The document date (March 13, 2017) is 2601 days in the past. Is this intentional? -- Found something which looks like a code comment -- if you have code sections in the document, please surround them with '' and '' lines. 
Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) == Missing Reference: 'I-D.draft-ietf-teas-pcecc-use-cases' is mentioned on line 73, but not defined == Missing Reference: 'RFC2119' is mentioned on line 131, but not defined == Unused Reference: 'RFC4655' is defined on line 414, but no explicit reference was found in the text ** Downref: Normative reference to an Informational RFC: RFC 4655 -- No information found for draft-ietf-teas-pce-control-function - is the name correct? Summary: 3 errors (**), 0 flaws (~~), 8 warnings (==), 4 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 1 TEAS Working Group A.Wang 2 Internet Draft China Telecom 3 Quintin Zhao 4 Boris Khasanov 5 Huawei Technologies 6 Penghui Mi 7 Tencent Company 8 Raghavendra Mallya 9 Juniper Networks 10 Shaofu Peng 11 ZTE Corporation 13 Intended status: Standard Track March 13, 2017 14 Expires: September 12, 2017 16 PCE in Native IP Network 17 draft-wang-teas-pce-native-ip-03.txt 19 Status of this Memo 21 This Internet-Draft is submitted in full conformance with the 22 provisions of BCP 78 and BCP 79. 24 This Internet-Draft is submitted in full conformance with the 25 provisions of BCP 78 and BCP 79. This document may not be modified, 26 and derivative works of it may not be created, and it may not be 27 published except as an Internet-Draft. 29 This Internet-Draft is submitted in full conformance with the 30 provisions of BCP 78 and BCP 79. This document may not be modified, 31 and derivative works of it may not be created, except to publish it 32 as an RFC and to translate it into languages other than English. 
34 it for publication as an RFC or to translate it into languages other 35 than English. 37 Internet-Drafts are working documents of the Internet Engineering 38 Task Force (IETF), its areas, and its working groups. Note that 39 other groups may also distribute working documents as Internet- 40 Drafts. 42 Internet-Drafts are draft documents valid for a maximum of six 43 months and may be updated, replaced, or obsoleted by other documents 44 at any time. It is inappropriate to use Internet-Drafts as 45 reference material or to cite them other than as "work in progress." 47 The list of current Internet-Drafts can be accessed at 48 http://www.ietf.org/ietf/1id-abstracts.txt 49 The list of Internet-Draft Shadow Directories can be accessed at 50 http://www.ietf.org/shadow.html 52 This Internet-Draft will expire on September 13, 2017. 54 Copyright Notice 56 Copyright (c) 2017 IETF Trust and the persons identified as the 57 document authors. All rights reserved. 59 This document is subject to BCP 78 and the IETF Trust's Legal 60 Provisions Relating to IETF Documents 61 (http://trustee.ietf.org/license-info) in effect on the date of 62 publication of this document. Please review these documents 63 carefully, as they describe your rights and restrictions with 64 respect to this document. 66 Abstract 68 This document defines the scenario and solution for traffic 69 engineering within a Native IP network, using a Dual/Multi-BGP session 70 strategy and a PCE-based central control architecture. The proposed 71 central mode control solution conforms to the concept defined 72 in draft [I-D.draft-ietf-teas-pce-control-function]. Together 73 with draft [I-D.draft-ietf-teas-pcecc-use-cases], the solution 74 portfolio for traffic engineering in MPLS and Native IP networks is 75 almost complete. 77 Table of Contents 79 1. Introduction 80 2. Conventions used in this document 81 3.
Dual-BGP solution for simple topology 82 4. Dual-BGP in large Scale Topology 83 5. Multi-BGP for Extended Traffic Differentiation 84 6. PCE based solution for Multi-BGP strategy deployment 85 7. PCEP extension for key parameters delivery 86 8. Deployment Consideration 87 9. Security Considerations 88 10. IANA Considerations 89 11. Conclusions 90 12. References 91 12.1. Normative References 92 12.2. Informative References 93 13. Acknowledgments 95 1. Introduction 97 Currently, PCE based traffic assurance requires that the underlying 98 network devices support MPLS and that the network deploy multiple 99 LSPs to assure end-to-end traffic performance. LDP/RSVP-TE or 100 Segment Routing should be enabled within the network to establish 101 the various MPLS paths. Such solutions certainly work, but they do 102 not cover the needs of legacy Native IP networks, which demand fewer 103 signaling protocols and less complex traffic steering policies. 105 Within a Native IP network, traffic engineering is 106 generally achieved by hop-by-hop differentiated treatment. To achieve end-to-end 107 QoS performance assurance, one can only deploy dedicated links 108 statically, but such a solution is not feasible in a service provider 109 network, because of the complexity of the underlying network and the 110 variation of application traffic over time. 112 In summary, the requirements for traffic engineering in a Native IP 113 network are the following: 114 1) No complex MPLS signaling procedure.
115 2) End-to-end traffic assurance with deterministic QoS behavior. 116 3) Flexible deployment and automated control. 118 This document defines the solution for traffic engineering within 119 a Native IP network, using a Dual/Multi-BGP session strategy and a PCE- 120 based central control architecture, to meet the above requirements in 121 a dynamic, centrally controlled mode. Future PCEP protocol extensions 122 to transfer the key parameters between the PCE and the underlying network 123 devices (PCC) are provided in draft [draft-wang-pcep-extension-native- 124 IP]. 126 2. Conventions used in this document 128 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 129 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 130 document are to be interpreted as described in RFC 2119 [RFC2119]. 132 3. Dual-BGP solution for simple topology. 134 This section introduces the Dual-BGP solution for the simple topology 135 illustrated in Fig.1, which comprises SW1, SW2, R1 and R2. 136 There are multiple physical links between R1 and R2. Let's assume 137 traffic between IP11 and IP21 is normal traffic, and traffic between IP12 138 and IP22 is priority traffic that should be treated differently. 140 Only native IGP/BGP protocols are deployed between R1 and R2. The 141 traffic between each address pair may change over time, and the 142 corresponding source/destination addresses of the traffic may also 143 change dynamically. 145 The key idea of the Dual-BGP solution for this simple topology is the 146 following: 147 1) Build two BGP sessions between R1 and R2, via the different 148 loopback addresses lo0 and lo1 on these routers. 149 2) Send different prefixes via the two BGP sessions. (For example, 150 IP11/IP21 via the BGP pair 1 and IP12/IP22 via the BGP pair 2). 151 3) Set explicit peer routes on R1 and R2 respectively, so that the BGP next 152 hops lo0 and lo1 are reached via different physical link addresses between R1 and 153 R2.
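The three steps above can be sketched as a small Python model of R1's view. This is only an illustrative sketch under stated assumptions: the loopback addresses, the documentation prefixes standing in for IP21 and IP22, the link names and the helper `egress_link` are all hypothetical and are not defined by this draft.

```python
import ipaddress
from dataclasses import dataclass, field


@dataclass
class BgpSession:
    """One BGP session between R1 and R2, as seen from R1."""
    name: str
    peer_loopback: str  # R2's lo0 or lo1; also the BGP next hop
    prefixes: list = field(default_factory=list)  # prefixes received on this session


# Step 1: two sessions, one per loopback (hypothetical addresses).
pair1 = BgpSession("pair1", peer_loopback="10.0.0.2")  # R2 lo0
pair2 = BgpSession("pair2", peer_loopback="10.0.1.2")  # R2 lo1

# Step 2: different prefixes on each session (stand-ins for IP21 and IP22).
pair1.prefixes.append("192.0.2.0/25")    # normal traffic
pair2.prefixes.append("192.0.2.128/25")  # priority traffic

# Step 3: explicit peer routes on R1, BGP next hop -> physical link (hypothetical names).
peer_routes = {"10.0.0.2": "link-A", "10.0.1.2": "link-B"}


def egress_link(dst, sessions, routes):
    """Return the physical link R1 uses for dst, or None if no prefix matches."""
    addr = ipaddress.ip_address(dst)
    best_net, best_session = None, None
    for session in sessions:
        for prefix in session.prefixes:
            net = ipaddress.ip_network(prefix)
            # Longest-prefix match across all sessions.
            if addr in net and (best_net is None or net.prefixlen > best_net.prefixlen):
                best_net, best_session = net, session
    if best_session is None:
        return None
    # The BGP next hop is resolved through the explicit peer route.
    return routes[best_session.peer_loopback]
```

In this model, `egress_link("192.0.2.10", [pair1, pair2], peer_routes)` resolves through R2's lo0 and `egress_link("192.0.2.200", [pair1, pair2], peer_routes)` through lo1, so the prefix-to-session mapping and the next-hop-to-link mapping can be changed independently of each other.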
155 As a result, the traffic between IP11 and IP21 and the traffic between 156 IP12 and IP22 will go through different physical links between R1 and 157 R2, with each type of traffic occupying its own dedicated physical 158 links. 160 If there is more traffic between IP12 and IP22 that needs to be 161 assured, one can add more physical links between R1 and R2 to reach the 162 loopback address lo1 (also the next hop for BGP Peer Pair2). In this 163 case the prefixes advertised by the two BGP peers need not be 164 changed. 166 If, for example, there is traffic from another address pair that 167 needs to be assured (for example IP13/IP23), but the total volume of 168 assured traffic does not exceed the capacity of the previously 169 appointed physical links, then one need only advertise the newly 170 added source/destination prefixes via BGP Peer Pair2; the 171 traffic between IP13/IP23 will then go through the same dedicated 172 physical links as the traffic between IP12/IP22. 174 This decoupling gives the network operator more flexible 175 control over the network traffic and delivers the deterministic QoS 176 assurance needed to meet the application's requirements. No complex 177 MPLS signaling procedure is introduced; the routers need only support 178 native IP protocols. 180 | BGP Peer Pair2 | 181 +------------------+ 182 |lo1 lo1 | 183 | | 184 | BGP Peer Pair1 | 185 +------------------+ 186 IP12 |lo0 lo0 | IP22 187 IP11 | | IP21 188 SW1-------R1-----------------R2-------SW2 189 Links Group 191 Fig.1 Design Philosophy for Dual-BGP Solution 193 4. Dual-BGP in large Scale Topology 195 When the assured traffic spans a large scale network, such as 196 that illustrated in Fig.2, the dual BGP sessions cannot be 197 established hop by hop, especially for iBGP within one AS.
For 198 such a scenario, we should consider using a Route Reflector (RR) to 199 achieve a similar Dual-BGP effect. That is to say, select one 200 router to perform the role of RR (for example R3 in Fig.2 - Dual- 201 BGP Solution using Route Reflector for large scale network); every 202 other edge router will establish two BGP peer sessions with the RR, 203 using its different loopback addresses respectively (each inner 204 router will establish one BGP session with the RR). The other two steps 205 for traffic differentiation are the same as those described for the Dual-BGP 206 simple topology use case. 208 For the example shown in Fig.2, if we select R1-R2-R4-R7 as the 209 dedicated path, then we should set the explicit peer routes on these 210 routers respectively, pointing the BGP next hop (the loopback 211 addresses of R1 and R7, which are used to send the prefixes of the 212 assured traffic) to the actual address of the physical link. 214 +------------R3--------------+ 215 | | 216 SW1-------R1-------R5---------R6-------R7--------SW2 217 | | | | 218 +-------R2---------R4--------+ 220 Fig.2 Dual-BGP solution for large scale network 222 5. Multi-BGP for Extended Traffic Differentiation 224 In general, several additional traffic differentiation 225 criteria exist, including: 226 o Traffic that requires low latency links and is not sensitive to 227 packet loss 228 o Traffic that requires low packet loss but can endure higher latency 229 o Traffic that requires the lowest jitter path 230 o Traffic that requires high bandwidth links 232 These different traffic requirements can be summarized in the 233 following table: 235 +----------+-------------+---------------+-----------------+ 236 | Flow No.
| Latency | Packet Loss | Jitter | 237 +----------+-------------+---------------+-----------------+ 238 | 1 | Low | Normal | Don't care | 239 +----------+-------------+---------------+-----------------+ 240 | 2 | Normal | Low | Don't care | 241 +----------+-------------+---------------+-----------------+ 242 | 3 | Normal | Normal | Low | 243 +----------+-------------+---------------+-----------------+ 244 Table 1. Traffic Requirement Criteria 246 For Flow No.1, we can select the shortest distance path to carry the 247 traffic; for Flow No.2, we can select the idle links to form its end 248 to end path; for Flow No.3, we can let all the traffic pass over one 249 single path, with no ECMP distribution across parallel links. 251 It is difficult, and almost impossible, to provide an end-to-end (E2E) 252 path with latency, latency variation, packet loss, and bandwidth 253 utilization constraints to meet the above requirements in a large scale 254 IP-based network via traditional distributed routing protocols, 255 but these requirements can be met using the PCE-based architecture, 256 since the PCE has the overall network view, can collect real network 257 topology and network performance information about the underlying 258 network, and can select the appropriate path to meet the various network 259 performance requirements of different traffic types. 261 6. PCE based solution for Multi-BGP strategy deployment. 263 With the advent of SDN concepts towards pure IP networks, it is 264 possible to deploy the PCE related technology into the underlying 265 native IP network, to accomplish the central and dynamic control of 266 network traffic according to the application's various requirements. 268 The procedure to implement the dynamic deployment of the Multi-BGP 269 strategy is the following: 270 1) PCE gets topology and link utilization information from the 271 underlying network and calculates the appropriate link path according to the 272 application's requirements.
273 2) PCE sends the key parameters to the edge/RR routers (R1, R7 and R3 in 274 Fig.3) to build multi-BGP peer relations and advertise different 275 prefixes via them. 276 3) PCE sends the route information to the routers (R1, R2, R4, R7 in 277 Fig.3) on the forwarding path via PCEP, to build the path to the BGP 278 next-hop of the advertised prefixes. 279 4) If the assured traffic prefixes change but the total volume 280 of assured traffic does not exceed the physical capacity of the 281 previous end-to-end path, then PCE need only change the related 282 information on the edge routers (R1, R7 in Fig.3). 283 5) If the volume of the assured traffic exceeds the capacity of the previously 284 calculated path, PCE must recalculate the appropriate path to 285 accommodate the excess traffic via some new end-to-end physical 286 link. After that, PCE needs to update the on-path routers to build such a 287 path hop by hop. 289 +----+ 290 ***********+PCE +************* 291 * +--*-+ * 292 * / * \ * 293 * * * 294 PCEP* *BGP-LS/SNMP *PCEP 295 * * * 296 * * \ * / 297 \ * / * \ */ 298 \*/-----------R3--------------* 299 | | 300 | | 301 SW1-------R1-------R5---------R6-------R7--------SW2 302 | | | | 303 | | | | 304 +-------R2---------R4--------+ 306 Fig.3 PCE based solution for Multi-BGP deployment 308 7. PCEP extension for key parameters delivery. 310 We need to extend the PCEP protocol to transfer the following key 311 parameters: 312 1) BGP peer address and advertised prefixes. 313 2) Explicit route information to BGP next hop of advertised prefixes. 315 Once the router receives such information, it should establish the 316 BGP session with the peer appointed in the PCEP message, advertise 317 the prefixes contained in the corresponding PCEP message, and 318 build the end-to-end dedicated path hop by hop.
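The two key parameters, and the decision the PCE makes in steps 4 and 5 of the procedure, might be modeled as follows. This is a rough sketch only: the class names, field names and the function `routers_to_update` are hypothetical illustrations, not a PCEP encoding, and the path computation itself is not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BgpPeerInfo:
    """Key parameter 1: BGP peer address and the prefixes to advertise."""
    peer_address: str
    prefixes: tuple


@dataclass(frozen=True)
class ExplicitPeerRoute:
    """Key parameter 2: explicit route towards the BGP next hop."""
    bgp_next_hop: str
    via_link_address: str


def routers_to_update(assured_volume, path_capacity, edge_routers, recomputed_path):
    """Steps 4 and 5: decide which routers the PCE must reprogram.

    recomputed_path stands for the output of the PCE's path computation,
    which is assumed to be available and is not modeled here.
    """
    if assured_volume <= path_capacity:
        # Step 4: the existing path still fits, so only the edge routers change.
        return list(edge_routers)
    # Step 5: a new path is needed, so every on-path router is updated.
    return list(recomputed_path)
```

For instance, with edge routers R1 and R7, a volume within the path capacity returns only those two routers, while a volume above it returns every router on the recomputed path.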
Details of 319 communications between the PCEP and BGP subsystems in the router's control 320 plane are out of scope of this draft and will be described in a 321 separate draft [draft-wang-pce-extension for native IP]. 323 We selected PCEP rather than OpenFlow as the southbound protocol 324 because PCEP is well suited to changes in the control 325 plane of network devices, whereas OpenFlow dramatically changes the 326 forwarding plane. We also think that the level of centralization 327 required by OpenFlow is hardly achievable in many of today's SP networks, 328 so a hybrid BGP+PCEP approach looks much more interesting. 330 8. Deployment Consideration 332 This solution requires the parallel work of 2 subsystems in the router's 333 control plane, PCE (PCEP) and BGP, as well as coordination between 334 them, so it might require additional planning work before deployment. 336 8.1 Scalability 338 In the current solution, PCE need only influence the edge routers for 339 prefix differentiation via the multi-BGP deployment. The route 340 information for these prefixes within the on-path routers is 341 distributed via the traditional BGP protocol. Unlike the BGP Flowspec 342 solution, the on-path routers need only keep the specific 343 policy routes to the BGP next-hop of the differentiated prefixes, not 344 the specific routes to the prefixes themselves. This lessens the 345 burden on the policy based route table size of the on-path 346 routers, and scales better than the solutions 347 based on BGP Flowspec or OpenFlow. 349 8.2 High Availability 351 The current solution is based on traditional distributed IP protocols, 352 so if the central control PCE fails, the forwarding plane will not 353 be impacted, as the BGP sessions between all devices will not flap, 354 and the forwarding table will remain the same. If one node on the 355 optimal path fails, the assured traffic will fall over to the 356 best-effort forwarding path.
One can even design several assurance 357 paths to load balance, or act as hot standby for, the assured traffic in case of 358 path failure, as done in MPLS FRR. 359 For PCE/SDN-controller HA, we will rely on existing HA solutions 360 of SDN controllers such as clustering. 362 8.3 Incremental deployment 364 Not every router within the network will support the PCEP 365 extension defined in [draft-wang-pce-extension-native-IP] 366 simultaneously. In such situations, the routers on the edge of a sub domain 367 can be upgraded first, and then the traffic can be assured between 368 different sub domains. Within each sub domain, the traffic will be 369 forwarded along the best-effort path. The service provider can 370 selectively upgrade the routers in each sub domain in sequence. 372 8.4 Deployment within Pure IGP network 374 For some small underlying networks where the routers support only 375 pure IGP protocols, we can use EVPN/VxLAN technology and procedures similar 376 to those described within this draft to differentiate the 377 forwarding paths for different applications: 379 1) PCE instructs the IGP edge router (ABR) to build different BGP 380 sessions. 382 2) PCE instructs the IGP edge router (ABR) to redistribute external 383 prefixes via different BGP sessions under the EVPN address family, 384 so that different external prefixes will be associated with 385 different VTEP addresses. 387 3) PCE calculates the optimal path and instructs the on-path routers 388 to build the explicit peer routes to the different VTEP addresses 389 (also the different loopback addresses on the ABR). 391 The traffic will then be forwarded via VxLAN encapsulation, and its 392 route path will be determined by the outer tunnel address, 393 which is calculated and programmed by PCE. 395 The details of the deployment scenario and the corresponding PCEP 396 extension will be explored further later. 398 9. Security Considerations 400 TBD 402 10. IANA Considerations 404 TBD 406 11.
Conclusions 408 TBD 410 12. References 412 12.1. Normative References 414 [RFC4655] Farrel, A., Vasseur, J.-P., and J. Ash, "A Path 416 Computation Element (PCE)-Based Architecture", RFC 418 4655, August 2006. 420 [RFC5440] Vasseur, JP., Ed., and JL. Le Roux, Ed., "Path 422 Computation Element (PCE) Communication Protocol 424 (PCEP)", RFC 5440, March 2009. 428 12.2. Informative References 430 [I-D.draft-ietf-teas-pce-control-function] 432 A.Farrel, Q.Zhao et al. "An Architecture for use of PCE and PCEP in 433 a Network with Central Control" 435 https://datatracker.ietf.org/doc/draft-ietf-teas-pce-central- 436 control/ September, 2016 438 [I-D. draft-ietf-teas-pcecc-use-cases] 440 Quintin Zhao, Robin Li, Boris Khasanov et al. "The Use Cases for 441 Using PCE as the Central Controller(PCECC) of LSPs" 443 https://tools.ietf.org/html/draft-ietf-teas-pcecc-use-cases-00 445 March, 2017 447 [draft-wang-pcep-extension for native IP] 449 Aijun Wang, Boris Khasanov et al. "PCEP Extension for Native IP 450 Network" https://datatracker.ietf.org/doc/draft-wang-pce-extension- 451 native-ip/ 453 13. Acknowledgments 455 The authors would like to thank George Swallow, Xia Chen, Jeff 456 Tantsura, Daniele Ceccarelli and Dhruv Dhody for their valuable 457 comments and suggestions. 459 The authors would also like to thank Lou Berger, Adrian Farrel and King 460 Daniel for their suggestions to put forward this draft.
462 Authors' Addresses 464 Aijun Wang 465 China Telecom 466 Beiqijia Town, Changping District 467 Beijing,China 469 Email: wangaj.bri@chinatelecom.cn 470 Quintin Zhao 471 Huawei Technologies 472 125 Nagog Technology Park 473 Acton, MA 01719 474 US 476 EMail: quintin.zhao@huawei.com 478 Boris Khasanov 479 Huawei Technologies 480 Moskovskiy Prospekt 97A 481 St.Petersburg 196084 482 Russia 484 EMail: khasanov.boris@huawei.com 486 Penghui Mi 487 Tencent 488 Tencent Building, Kejizhongyi Avenue, 489 Hi-techPark, Nanshan District,Shenzhen 518057, P.R.China 491 Email kevinmi@tencent.com 493 Raghavendra Mallya 494 Juniper Networks 495 1133 Innovation Way 496 Sunnyvale, California 94089 USA 498 Email: rmallya@juniper.net 500 Shaofu Peng 501 ZTE Corporation 502 No.68 Zijinghua Road,Yuhuatai District 503 Nanjing 210012 504 China 506 Email: peng.shaofu@zte.com.cn