TEAS Working Group                                               A. Wang
Internet-Draft                                             China Telecom
Intended status: Informational                               B. Khasanov
Expires: June 12, 2021                                        Yandex LLC
                                                                 Q. Zhao
                                                        Etheric Networks
                                                                 H. Chen
                                                               Futurewei
                                                        December 9, 2020

 Path Computation Element (PCE) based Traffic Engineering (TE) in Native
                              IP Networks
                    draft-ietf-teas-pce-native-ip-15

Abstract

   This document defines an architecture for providing traffic
   engineering in a native IP network using multiple BGP sessions and a
   Path Computation Element (PCE)-based central control mechanism.
   It defines the Central Control Dynamic Routing (CCDR) procedures and
   identifies needed extensions for the Path Computation Element
   Communication Protocol (PCEP).

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on June 12, 2021.

Copyright Notice

   Copyright (c) 2020 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Terminology
   3.  CCDR Architecture in Simple Topology
   4.  CCDR Architecture in Large Scale Topology
   5.  CCDR Multiple BGP Sessions Strategy
   6.
       PCEP Extension for Critical Parameters Delivery
   7.  Deployment Consideration
     7.1.  Scalability
     7.2.  High Availability
     7.3.  Incremental deployment
     7.4.  Loop Avoidance
   8.  Security Considerations
   9.  IANA Considerations
   10. Acknowledgement
   11. References
     11.1.  Normative References
     11.2.  Informative References
   Authors' Addresses

1.  Introduction

   [RFC8283], based on an extension of the PCE (Path Computation
   Element) architecture described in [RFC4655], introduced broader
   applicability for a PCE as a central controller. PCEP (PCE
   Protocol) continues to be used as the protocol between PCE and PCC
   (Path Computation Client). Building on that work, this document
   describes a solution using a PCE for centralized control in a native
   IP network to provide End-to-End (E2E) performance assurance and QoS
   for traffic. The solution combines the use of distributed routing
   protocols and a centralized controller, referred to as Centralized
   Control Dynamic Routing (CCDR).

   [RFC8735] describes the scenarios and simulation results for traffic
   engineering in a native IP network based on use of a CCDR
   architecture. Per [RFC8735], the architecture for traffic
   engineering in a native IP network should meet the following
   criteria:

   o  Same solution for native IPv4 and IPv6 traffic.

   o  Support for intra-domain and inter-domain scenarios.

   o  Achieve End-to-End traffic assurance, with determined QoS
      behavior, for traffic requiring a service assurance (prioritized
      traffic).

   o  No changes in a router's forwarding behavior.

   o  Based on centralized control through a distributed network control
      plane.

   o  Support different network requirements such as high traffic volume
      and prefix scaling.

   o  Ability to adjust the optimal path dynamically upon changes in
      network status. No need to reserve physical link resources in
      advance.

   Building on the above documents, this document defines an
   architecture meeting these requirements by using a multiple BGP
   session strategy and a PCE as the centralized controller. The
   architecture depends on the central control (PCE) element to compute
   the optimal path, and utilizes the dynamic routing behavior of
   IGP/BGP protocols for forwarding the traffic.

   The related PCEP extensions are provided in
   [I-D.ietf-pce-pcep-extension-native-ip].

2.  Terminology

   This document uses the following terms defined in [RFC5440]:

   o  PCE: Path Computation Element

   o  PCEP: PCE Protocol

   o  PCC: Path Computation Client

   Other terms are defined in this document:

   o  CCDR: Central Control Dynamic Routing

   o  E2E: End to End

   o  ECMP: Equal-Cost Multipath

   o  RR: Route Reflector

   o  SDN: Software Defined Network

3.  CCDR Architecture in Simple Topology

   Figure 1 illustrates the CCDR architecture for traffic engineering in
   a simple topology. The topology comprises four devices: SW1, SW2,
   R1, and R2. There are multiple physical links between R1 and R2.
   Traffic between prefix PF11 (on SW1) and prefix PF21 (on SW2) is
   normal traffic; traffic between prefix PF12 (on SW1) and prefix PF22
   (on SW2) is priority traffic that should be treated accordingly.

                               +-----+
                    +----------+ PCE +--------+
                    |          +-----+        |
                    |                         |
                    | BGP Session 1(lo11/lo21)|
                    +-------------------------+
                    |                         |
                    | BGP Session 2(lo12/lo22)|
                    +-------------------------+
PF12                |                         |                     PF22
PF11                |                         |                     PF21
+---+         +-----+-----+             +-----+-----+              +---+
|SW1+---------+(lo11/lo12)+-------------+(lo21/lo22)+--------------+SW2|
+---+         |    R1     +-------------+    R2     |              +---+
              +-----------+             +-----------+

             Figure 1: CCDR architecture in simple topology

   In the intra-AS scenario, IGP and BGP combined with a PCE are
   deployed between R1 and R2. In the inter-AS scenario, only the
   native BGP protocol is deployed. The traffic between each address
   pair may change in real time and the corresponding source/destination
   addresses of the traffic may also change dynamically.

   The key ideas of the CCDR architecture for this simple topology are
   the following:

   o  Build two BGP sessions between R1 and R2, via the different
      loopback addresses on these routers.

   o  Using the PCE, set the explicit peer route on R1 and R2 for the
      BGP next hop to different physical link addresses between R1 and
      R2. The explicit peer route can be set in the format of a static
      route, which is different from the route learned from the IGP
      protocol.

   o  Send different prefixes via the established BGP sessions. For
      example, PF11/PF21 via BGP session 1 and PF12/PF22 via BGP
      session 2.

   After the above actions, the bi-directional traffic between PF11 and
   PF21, and the bi-directional traffic between PF12 and PF22, will go
   through different physical links between R1 and R2.

   If there is more traffic between PF12 and PF22 that needs assured
   transport, one can add more physical links between R1 and R2 to reach
   the next hop for BGP session 2. In this case, the prefixes that are
   advertised by the BGP peers need not be changed.
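
   The three key ideas above can be sketched as a small data model.
   This is an illustrative sketch only and not part of the architecture:
   the link names (link-A, link-B, link-C), the dictionaries, and the
   function names are assumptions made for the example, and only R1's
   view toward R2 is modeled.

```python
# Illustrative model of R1's view in Figure 1 (all names are hypothetical).

# Idea 1: two BGP sessions to R2, each identified by R2's loopback,
# which is also the BGP next hop of prefixes learned over that session.
SESSION_NEXT_HOP = {"session1": "lo21", "session2": "lo22"}

# Idea 2: explicit peer routes set by the PCE on R1, mapping each BGP
# next hop to the physical link(s) used to reach it.
explicit_peer_routes = {"lo21": ["link-A"], "lo22": ["link-B"]}

# Idea 3: prefixes advertised by R2 over each session.
prefix_session = {"PF21": "session1", "PF22": "session2"}


def links_for_prefix(prefix):
    """Return the physical links R1 uses to forward traffic to prefix."""
    next_hop = SESSION_NEXT_HOP[prefix_session[prefix]]
    return explicit_peer_routes[next_hop]


def add_link(next_hop, link):
    """Add capacity toward a BGP next hop without touching BGP state."""
    explicit_peer_routes[next_hop] = explicit_peer_routes[next_hop] + [link]


normal = links_for_prefix("PF21")
priority_before = links_for_prefix("PF22")
advertised_before = dict(prefix_session)

# More priority traffic: add a link for session 2's next hop only.
add_link("lo22", "link-C")
priority_after = links_for_prefix("PF22")
```

   In this sketch, adding link-C changes only the explicit peer route
   for lo22; the prefix-to-session mapping is left untouched, which
   mirrors the statement that the advertised prefixes need not change.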

   If, for example, there is bi-directional priority traffic from
   another address pair (for example, prefix PF13/PF23), and the total
   volume of priority traffic does not exceed the capacity of the
   previously provisioned physical links, one need only advertise the
   newly added source/destination prefixes via BGP session 2. The
   bi-directional traffic between PF13/PF23 will go through the same
   assigned dedicated physical links as the traffic between PF12/PF22.

   Decoupling the IGP/BGP traffic link from the physical link in this
   way gives flexible control over the network traffic and provides the
   QoS assurance needed to meet the application's requirements. The
   router needs only to support native IP and multiple BGP sessions set
   up via different loopback addresses.

4.  CCDR Architecture in Large Scale Topology

   When the priority traffic spans a large scale network, such as that
   illustrated in Figure 2, the multiple BGP sessions cannot be
   established hop by hop (for example, iBGP sessions within one AS).

   For such a scenario, we propose using a Route Reflector (RR)
   [RFC4456] to achieve a similar effect. Every edge router will
   establish two BGP sessions with the RR, each via a different
   loopback address. The other steps for traffic differentiation are
   the same as those described in the CCDR architecture for the simple
   topology.

   As shown in Figure 2, if we select R3 as the RR, every edge router
   (R1 and R7 in this example) will build two BGP sessions with the RR.
   If the PCE selects R1-R2-R4-R7 as the dedicated path, then the
   operator should set the explicit peer routes via PCEP on each of
   these routers, pointing the BGP next hop (the loopback addresses of
   R1 and R7, which are used to send the prefixes of the priority
   traffic) to the selected forwarding address.
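
   The explicit peer routes implied by a selected dedicated path can be
   derived mechanically. The following is a minimal sketch under stated
   assumptions: the function name, the tuple layout, and the loopback
   names (R1-lo2, R7-lo2) are hypothetical, and the neighboring router
   name stands in for the selected forwarding address.

```python
def explicit_peer_routes(path, head_next_hop, tail_next_hop):
    """Return (router, bgp_next_hop, forward_to) tuples for a path.

    Each router forwards traffic for the tail's BGP next hop to its
    successor on the path, and traffic for the head's BGP next hop to
    its predecessor, so both directions follow the same path.
    """
    routes = []
    for router, successor in zip(path, path[1:]):
        routes.append((router, tail_next_hop, successor))
    for predecessor, router in zip(path, path[1:]):
        routes.append((router, head_next_hop, predecessor))
    return routes


# Dedicated path from the text; the loopback names are hypothetical.
routes = explicit_peer_routes(["R1", "R2", "R4", "R7"], "R1-lo2", "R7-lo2")
for route in routes:
    print(route)
```

   Under these assumptions, only the on-path routers (R1, R2, R4, and
   R7) receive explicit peer routes; R3, R5, and R6 are not touched.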

                              +-----+
             +----------------+ PCE +------------------+
             |                +--+--+                  |
             |                   |                     |
             |                   |                     |
             |                  ++-+                   |
             +------------------+R3+-------------------+
PF12         |                  +--+                   |          PF22
PF11         |                                         |          PF21
+---+       ++-+          +--+          +--+         +-++        +---+
|SW1+-------+R1+----------+R5+----------+R6+---------+R7+--------+SW2|
+---+       ++-+          +--+          +--+         +-++        +---+
             |                                         |
             |                                         |
             |            +--+          +--+           |
             +------------+R2+----------+R4+-----------+
                          +--+          +--+

           Figure 2: CCDR architecture in large scale network

5.  CCDR Multiple BGP Sessions Strategy

   Generally, different applications may require different QoS criteria,
   which may include:

   o  Traffic that requires low latency and is not sensitive to packet
      loss.

   o  Traffic that requires low packet loss and can endure higher
      latency.

   o  Traffic that requires low jitter.

   These different traffic requirements can be summarized in the
   following table:

   +----------------+-------------+---------------+-----------------+
   | Prefix Set No. |   Latency   |  Packet Loss  |     Jitter      |
   +----------------+-------------+---------------+-----------------+
   |       1        |     Low     |    Normal     |   Don't care    |
   +----------------+-------------+---------------+-----------------+
   |       2        |   Normal    |      Low      |   Don't care    |
   +----------------+-------------+---------------+-----------------+
   |       3        |   Normal    |    Normal     |       Low       |
   +----------------+-------------+---------------+-----------------+

                 Table 1: Traffic Requirement Criteria

   For Prefix Set No.1, we can select the shortest distance path to
   carry the traffic; for Prefix Set No.2, we can select the path whose
   links are underloaded end to end; for Prefix Set No.3, we can let
   traffic pass over a determined single path, as no Equal Cost
   Multipath (ECMP) distribution on the parallel links is desired.
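
   The per-prefix-set selection rules above can be illustrated with a
   short sketch. Everything numeric here is an assumption made for the
   example: the per-link latency and utilization values are invented,
   the two candidate paths are taken from Figure 2, and using the worst
   link utilization as the measure of an "underloaded" path is one
   possible reading rather than a rule stated in this document.

```python
# Hypothetical per-link metrics: (latency in ms, utilization from 0 to 1).
LINKS = {
    ("R1", "R5"): (2, 0.80), ("R5", "R6"): (2, 0.70), ("R6", "R7"): (2, 0.75),
    ("R1", "R2"): (3, 0.20), ("R2", "R4"): (4, 0.30), ("R4", "R7"): (3, 0.25),
}

CANDIDATES = [["R1", "R5", "R6", "R7"], ["R1", "R2", "R4", "R7"]]


def hops(path):
    """Return the list of (from, to) links along a path."""
    return list(zip(path, path[1:]))


def total_latency(path):
    return sum(LINKS[hop][0] for hop in hops(path))


def worst_utilization(path):
    return max(LINKS[hop][1] for hop in hops(path))


def select_path(prefix_set, candidates):
    """Pick exactly one path for a prefix set from Table 1."""
    if prefix_set == 1:   # low latency: shortest path
        return min(candidates, key=total_latency)
    if prefix_set == 2:   # low packet loss: least loaded path
        return min(candidates, key=worst_utilization)
    if prefix_set == 3:   # low jitter: one pinned path, no ECMP split
        return min(candidates, key=lambda p: (total_latency(p), p))
    raise ValueError("unknown prefix set")


path_set1 = select_path(1, CANDIDATES)
path_set2 = select_path(2, CANDIDATES)
path_set3 = select_path(3, CANDIDATES)
```

   For Prefix Set No.3 the sketch breaks latency ties by path name so
   that a single path is always returned, which stands in for the
   requirement that the traffic is not spread over parallel links.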

   It is almost impossible to provide an End-to-End (E2E) path
   efficiently with latency, jitter, and packet loss constraints to meet
   the above requirements in a large scale IP-based network using only a
   distributed routing protocol, but these requirements can be met with
   the assistance of a PCE, as described in [RFC4655] and [RFC8283].
   The PCE will have the overall network view, the ability to collect
   the real-time network topology, and the network performance
   information about the underlying network. The PCE can select the
   appropriate path to meet the various network performance requirements
   for different traffic.

   The architecture to implement the CCDR Multiple BGP sessions strategy
   is as follows:

   The PCE will be responsible for the optimal path computation for the
   different priority classes of traffic:

   o  PCE collects topology information via BGP-LS [RFC7752] and link
      utilization information via the existing Network Monitoring System
      (NMS) from the underlying network.

   o  PCE calculates the appropriate path based upon the application's
      requirements, and sends the key parameters to edge/RR routers (R1,
      R7, and R3 in Figure 3) to establish multiple BGP sessions. The
      loopback addresses used for the BGP sessions should be planned in
      advance and distributed in the domain.

   o  PCE sends the route information to the routers (R1, R2, R4, and R7
      in Figure 3) on the forwarding path via PCEP
      [I-D.ietf-pce-pcep-extension-native-ip], to build the path to the
      BGP next hop of the advertised prefixes.

   o  PCE sends the prefix information to the PCC for advertising
      different prefixes via the specified BGP session.

   o  If the priority traffic prefixes were changed but the total volume
      of priority traffic does not exceed the physical capacity of the
      previous E2E path, the PCE needs only change the prefixes
      advertised via the edge routers (R1 and R7 in Figure 3).

   o  If the volume of priority traffic exceeds the capacity of the
      previously calculated path, the PCE can recalculate and add the
      appropriate paths to accommodate the excess traffic. After
      that, the PCE needs to update the on-path routers to build the
      forwarding path hop by hop.

                          +------------+
                          | Application|
                          +------+-----+
                                 |
                        +--------+---------+
             +----------+SDN Controller/PCE+-----------+
             |          +--------^---------+           |
             |                   |                     |
             |                   |                     |
        PCEP |             BGP-LS|PCEP                 | PCEP
             |                   |                     |
             |                  +v-+                   |
             +------------------+R3+-------------------+
PF12         |                  +--+                   |          PF22
PF11         |                                         |          PF21
+---+       +v-+          +--+          +--+         +-v+        +---+
|SW1+-------+R1+----------+R5+----------+R6+---------+R7+--------+SW2|
+---+       ++-+          +--+          +--+         +-++        +---+
             |                                         |
             |                                         |
             |            +--+          +--+           |
             +------------+R2+----------+R4+-----------+
                          +--+          +--+

      Figure 3: CCDR architecture for Multi-BGP sessions deployment

6.  PCEP Extension for Critical Parameters Delivery

   PCEP needs to be extended to transfer the following critical
   parameters:

   o  Peer information that is used to build the BGP session.

   o  Explicit route information for the BGP next hop of advertised
      prefixes.

   o  Advertised prefixes and their associated BGP session.

   Once the router receives such information, it should establish the
   BGP session with the peer specified in the PCEP message, build the
   end-to-end dedicated path hop by hop, and advertise the prefixes that
   are contained in the corresponding PCEP message.

   The dedicated path is preferred by making sure that the explicit
   route created by the PCE has a higher priority (lower route
   preference value) than the route information created by other dynamic
   protocols.
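
   The preference rule above can be shown with a minimal route
   selection sketch. The Route class, the preference values (5 and
   110), and the next-hop strings are assumptions made for the example;
   actual preference values are implementation specific and are not
   defined by this document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    destination: str
    next_hop: str
    origin: str
    preference: int  # lower value means higher priority


def best_route(routes, destination):
    """Pick the route to destination with the lowest preference value."""
    candidates = [r for r in routes if r.destination == destination]
    if not candidates:
        return None
    return min(candidates, key=lambda r: r.preference)


# Two routes to the same BGP next hop (lo22); values are hypothetical.
rib = [
    Route("lo22", "link-A address", "IGP", preference=110),
    Route("lo22", "link-B address", "PCE explicit route", preference=5),
]

selected = best_route(rib, "lo22")

# Without the explicit route, the dynamically learned route is used.
without_explicit = [r for r in rib if r.origin != "PCE explicit route"]
fallback = best_route(without_explicit, "lo22")
```

   In this sketch the explicit route wins while it is installed, and
   the route learned from the dynamic protocol takes over once it is
   removed.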

   All of the above dynamically created states (BGP sessions, explicit
   routes, and advertised prefixes) will be cleared on expiration of
   the state timeout interval, which is based on the existing Stateful
   PCE [RFC8231] and PCECC [RFC8283] mechanisms.

   A BGP session established this way is no different from one
   configured manually or via NETCONF/YANG. Different BGP sessions are
   used mainly to distinguish the network prefixes, which can be
   differentiated via the different BGP next hops. Based on this
   strategy, if we manipulate the path to the BGP next hop, then the
   path to the prefixes that were advertised with the BGP sessions will
   be changed accordingly. Details of communications between PCEP and
   BGP subsystems in the router's control plane are out of scope of this
   draft and will be described in a separate document
   [I-D.ietf-pce-pcep-extension-native-ip].

7.  Deployment Consideration

7.1.  Scalability

   In the CCDR architecture, only the edge routers that connect with the
   PCE are responsible for prefix advertisement via the multiple BGP
   sessions. The route information for these prefixes within the
   on-path routers is distributed via the BGP protocol.

   For multiple domain deployment, the PCE, or the pool of PCEs
   responsible for these domains, needs only to control the edge router
   to build the multiple EBGP sessions; all other procedures are the
   same as within one domain.

   Unlike the solution from BGP Flowspec [I-D.ietf-idr-rfc5575bis], the
   on-path router needs only to keep the specific policy routes for the
   BGP next hop of the differentiated prefixes, not the specific routes
   to the prefixes themselves. This lessens the burden of the table
   size of policy-based routes for the on-path routers, and scales
   better than BGP Flowspec or OpenFlow solutions.
   For example, if we want to differentiate 1000 prefixes from the
   normal traffic, CCDR needs only one explicit peer route in every
   on-path router, whereas the BGP Flowspec or OpenFlow solutions need
   1000 policy routes on them.

7.2.  High Availability

   The CCDR architecture is based on the use of the native IP protocol.
   If the PCE fails, the forwarding plane will not be impacted, as the
   BGP sessions between all the devices will not flap and the forwarding
   table remains unchanged.

   If one node on the optimal path fails, the priority traffic will fall
   over to the best-effort forwarding path. One can even design several
   paths to provide load balancing or hot standby for the priority
   traffic in case of a path failure.

   To ensure high availability of a PCE/SDN-controller architecture,
   an operator should rely on existing high availability solutions for
   SDN controllers, such as clustering technology and deployment.

7.3.  Incremental deployment

   Not every router within the network needs to support the PCEP
   extension defined in [I-D.ietf-pce-pcep-extension-native-ip]
   simultaneously.

   For such situations, routers on the edge of a domain can be upgraded
   first, and then the traffic can be prioritized between different
   domains. Within each domain, the traffic will be forwarded along the
   best-effort path. A service provider can selectively upgrade the
   routers in each domain in sequence.

7.4.  Loop Avoidance

   A PCE needs to ensure that the E2E path is calculated in real time
   based on the status of the network and the service requirements.

   The PCE needs to consider the explicit route deployment order (for
   example, from tail router to head router) to eliminate any possible
   transient traffic loop.

8.  Security Considerations

   The setup of BGP sessions, prefix advertisement, and explicit peer
   route establishment are all controlled by the PCE.
   See [RFC7454] for BGP security considerations. To prevent a bogus
   PCE from sending harmful messages to the network nodes, the network
   devices should authenticate the validity of the PCE and ensure a
   secure communication channel between them. Mechanisms described in
   [RFC8253] should be used.

   The CCDR architecture does not require changes to the forwarding
   behavior of the underlay devices. There will be no additional
   security impacts on these devices.

9.  IANA Considerations

   This document does not require any IANA actions.

10.  Acknowledgement

   The authors would like to thank Deborah Brungard, Adrian Farrel,
   Vishnu Beeram, Lou Berger, Dhruv Dhody, Raghavendra Mallya, Mike
   Koldychev, Haomian Zheng, Penghui Mi, Shaofu Peng, Donald Eastlake,
   and Jessica Chen for their support and comments on this draft.

11.  References

11.1.  Normative References

   [RFC4456]  Bates, T., Chen, E., and R. Chandra, "BGP Route
              Reflection: An Alternative to Full Mesh Internal BGP
              (IBGP)", RFC 4456, DOI 10.17487/RFC4456, April 2006.

   [RFC4655]  Farrel, A., Vasseur, J., and J. Ash, "A Path Computation
              Element (PCE)-Based Architecture", RFC 4655,
              DOI 10.17487/RFC4655, August 2006.

   [RFC5440]  Vasseur, JP., Ed. and JL. Le Roux, Ed., "Path Computation
              Element (PCE) Communication Protocol (PCEP)", RFC 5440,
              DOI 10.17487/RFC5440, March 2009.

   [RFC7454]  Durand, J., Pepelnjak, I., and G. Doering, "BGP Operations
              and Security", BCP 194, RFC 7454, DOI 10.17487/RFC7454,
              February 2015.

   [RFC7752]  Gredler, H., Ed., Medved, J., Previdi, S., Farrel, A., and
              S. Ray, "North-Bound Distribution of Link-State and
              Traffic Engineering (TE) Information Using BGP", RFC 7752,
              DOI 10.17487/RFC7752, March 2016.

   [RFC8231]  Crabbe, E., Minei, I., Medved, J., and R.
              Varga, "Path Computation Element Communication Protocol
              (PCEP) Extensions for Stateful PCE", RFC 8231,
              DOI 10.17487/RFC8231, September 2017.

   [RFC8253]  Lopez, D., Gonzalez de Dios, O., Wu, Q., and D. Dhody,
              "PCEPS: Usage of TLS to Provide a Secure Transport for the
              Path Computation Element Communication Protocol (PCEP)",
              RFC 8253, DOI 10.17487/RFC8253, October 2017.

   [RFC8283]  Farrel, A., Ed., Zhao, Q., Ed., Li, Z., and C. Zhou, "An
              Architecture for Use of PCE and the PCE Communication
              Protocol (PCEP) in a Network with Central Control",
              RFC 8283, DOI 10.17487/RFC8283, December 2017.

   [RFC8735]  Wang, A., Huang, X., Kou, C., Li, Z., and P. Mi,
              "Scenarios and Simulation Results of PCE in a Native IP
              Network", RFC 8735, DOI 10.17487/RFC8735, February 2020.

11.2.  Informative References

   [I-D.ietf-idr-rfc5575bis]
              Loibl, C., Hares, S., Raszuk, R., McPherson, D., and M.
              Bacher, "Dissemination of Flow Specification Rules",
              draft-ietf-idr-rfc5575bis-27 (work in progress), October
              2020.

   [I-D.ietf-pce-pcep-extension-native-ip]
              Wang, A., Khasanov, B., Fang, S., Tan, R., and C. Zhu,
              "PCEP Extension for Native IP Network",
              draft-ietf-pce-pcep-extension-native-ip-09 (work in
              progress), October 2020.

Authors' Addresses

   Aijun Wang
   China Telecom
   Beiqijia Town, Changping District
   Beijing 102209
   China

   Email: wangaj3@chinatelecom.cn


   Boris Khasanov
   Yandex LLC
   Ulitsa Lva Tolstogo 16
   Moscow
   Russia

   Email: bhassanov@yahoo.com


   Quintin Zhao
   Etheric Networks
   1009 S CLAREMONT ST
   SAN MATEO, CA 94402
   USA

   Email: qzhao@ethericnetworks.com


   Huaimo Chen
   Futurewei
   Boston, MA
   USA

   Email: huaimo.chen@futurewei.com