TEAS Working Group                                               Q. Zhao
Internet-Draft                                                     Z. Li
Intended status: Informational                               B. Khasanov
Expires: April 21, 2019                                         D. Dhody
                                                     Huawei Technologies
                                                                   K. Ke
                                                   Tencent Holdings Ltd.
                                                                 L. Fang
                                                           Expedia, Inc.
                                                                 C. Zhou
                                                           Cisco Systems
                                                                B. Zhang
                                                    Telus Communications
                                                           A. Rachitskiy
                                                 Mobile TeleSystems JLLC
                                                               A. Gulida
                                                          LLC "Lifetech"
                                                        October 18, 2018


The Use Cases for Path Computation Element (PCE) as a Central Controller
                                (PCECC).
                   draft-ietf-teas-pcecc-use-cases-02

Abstract

   The Path Computation Element (PCE) is a core component of
   Software-Defined Networking (SDN) systems. It can compute optimal
   paths for traffic across a network and can also update the paths to
   reflect changes in the network or traffic demands. PCE was developed
   to derive paths for MPLS Label Switched Paths (LSPs), which are
   supplied to the head end of the LSP using the Path Computation
   Element Communication Protocol (PCEP).

   SDN has a broader applicability than signaled MPLS traffic-engineered
   (TE) networks, and the PCE may be used to determine paths in a range
   of use cases including static LSPs, segment routing, Service Function
   Chaining (SFC), and most forms of a routed or switched network. It
   is, therefore, reasonable to consider PCEP as a control protocol for
   use in these environments to allow the PCE to be fully enabled as a
   central controller.

   This document describes general considerations for PCECC deployment
   and examines its applicability and benefits, as well as its
   challenges and limitations, through a number of use cases. PCEP
   extensions required for stateful PCE usage are covered in separate
   documents.

Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in BCP
   14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on April 21, 2019.

Copyright Notice

   Copyright (c) 2018 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Terminology
   3.  Application Scenarios
     3.1.  Use Cases of PCECC for Label Management
     3.2.  Using PCECC for SR
       3.2.1.  PCECC SID Allocation
       3.2.2.  Use Cases of PCECC for SR Best Effort (BE) Path
       3.2.3.  Use Cases of PCECC for SR Traffic Engineering (TE) Path
     3.3.  Use Cases of PCECC for TE LSP
       3.3.1.  PCECC Load Balancing (LB) Use Case
       3.3.2.  PCECC and Inter-AS TE
     3.4.  Use Cases of PCECC for Multicast LSPs
       3.4.1.  Using PCECC for P2MP/MP2MP LSPs' Setup
       3.4.2.  Use Cases of PCECC for the Resiliency of P2MP/MP2MP LSPs
     3.5.  Use Cases of PCECC for LSP in the Network Migration
     3.6.  Use Cases of PCECC for L3VPN and PWE3
     3.7.  Using PCECC for Traffic Classification Information
     3.8.  Use Cases of PCECC for SRv6
     3.9.  Use Cases of PCECC for SFC
     3.10. Use Cases of PCECC for Native IP
     3.11. Use Cases of PCECC for Local Protection (RSVP-TE)
   4.  IANA Considerations
   5.  Security Considerations
   6.  Acknowledgments
   7.  References
     7.1.  Normative References
     7.2.  Informative References
   Appendix A.  Using reliable P2MP TE based multicast delivery for
                distributed computations (MapReduce-Hadoop)
   Authors' Addresses

1.  Introduction

   An Architecture for Use of PCE and PCEP [RFC5440] in a Network with
   Central Control [RFC8283] describes an SDN architecture in which the
   Path Computation Element (PCE) determines paths for a variety of use
   cases, with PCEP as a general southbound communication protocol to
   all the nodes along the path.

   [I-D.zhao-pce-pcep-extension-for-pce-controller] introduces the
   procedures and extensions for PCEP to support the PCECC architecture
   [RFC8283].

   This document describes various use cases for the PCECC architecture.

2.  Terminology

   The following terminology is used in this document.

   IGP:  Interior Gateway Protocol. Either of the two routing
      protocols, Open Shortest Path First (OSPF) or Intermediate System
      to Intermediate System (IS-IS).

   PCC:  Path Computation Client. Any client application requesting a
      path computation to be performed by a Path Computation Element.

   PCE:  Path Computation Element. An entity (component, application,
      or network node) that is capable of computing a network path or
      route based on a network graph and applying computational
      constraints.

   PCECC:  PCE as a central controller. Extension of PCE to support SDN
      functions as per [RFC8283].

   TE:  Traffic Engineering.

3.  Application Scenarios

   In the following sections, several use cases are described,
   showcasing scenarios that benefit from the deployment of PCECC.

3.1.  Use Cases of PCECC for Label Management

   As per [RFC8283], in some cases, the PCE-based controller can take
   responsibility for managing some part of the MPLS label space for
   each of the routers that it controls, and it may take wider
   responsibility for partitioning the label space for each router and
   allocating different parts for different uses, communicating the
   ranges to the router using PCEP.

   [I-D.zhao-pce-pcep-extension-for-pce-controller] describes a mode
   where LSPs are provisioned as explicit label instructions at each hop
   on the end-to-end path. Each router along the path must be told what
   label forwarding instructions to program and what resources to
   reserve. The controller uses PCEP to communicate with each router
   along the path of the end-to-end LSP. For this to work, the
   PCE-based controller will take responsibility for managing some part
   of the MPLS label space for each of the routers that it controls. An
   extension to PCEP could be defined to allow a PCC to inform the PCE
   of such a label space to control.
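
   The bookkeeping implied by the paragraph above can be illustrated
   with a minimal sketch. It is not taken from any PCEP specification:
   the class names, method names, and label values are hypothetical,
   and the sketch only assumes that each router hands the controller
   one contiguous label range from which the controller then allocates.

```python
from dataclasses import dataclass, field


@dataclass
class DelegatedLabelSpace:
    """Label range that one router (PCC) has handed to the controller.

    Hypothetical structure for illustration; `high` is inclusive.
    """
    low: int
    high: int
    in_use: set = field(default_factory=set)

    def allocate(self) -> int:
        # Hand out the lowest free label in the delegated range.
        for label in range(self.low, self.high + 1):
            if label not in self.in_use:
                self.in_use.add(label)
                return label
        raise RuntimeError("delegated label space exhausted")

    def release(self, label: int) -> None:
        self.in_use.discard(label)


class LabelController:
    """Controller-side view of the label space delegated by each router."""

    def __init__(self) -> None:
        self.spaces = {}

    def learn_label_space(self, router: str, low: int, high: int) -> None:
        # Stands in for the PCC informing the PCE of the range it controls.
        if low > high:
            raise ValueError("empty label range")
        self.spaces[router] = DelegatedLabelSpace(low, high)

    def allocate(self, router: str) -> int:
        return self.spaces[router].allocate()

    def release(self, router: str, label: int) -> None:
        self.spaces[router].release(label)
```

   A controller built this way never hands out a label outside the
   range a router delegated, and a released label becomes available
   again for a later LSP on the same router.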

   [I-D.ietf-pce-segment-routing] specifies extensions to PCEP that
   allow a stateful PCE to compute, update, or initiate SR-TE paths.
   [I-D.zhao-pce-pcep-extension-pce-controller-sr] describes the
   mechanism for the PCECC to allocate and provision the node/prefix/
   adjacency label (SID) via PCEP. To make such an allocation, the PCE
   needs to be aware of the label space from the Segment Routing Global
   Block (SRGB) or Segment Routing Local Block (SRLB) [RFC8402] of the
   node that it controls. A mechanism for a PCC to inform the PCE of
   such a label space to control is needed within PCEP. The full
   SRGB/SRLB of a node could also be learned via existing IGP or BGP-LS
   mechanisms.

   [I-D.li-pce-controlled-id-space] defines a PCEP extension to support
   advertisement of the MPLS label space to the PCE to control.

   There have been various proposals for Global Labels. The PCECC
   architecture could be used as a means to learn the label space of
   nodes, and could also be used to determine and provision the global
   label range.

 +------------------------------+    +------------------------------+
 |         PCE DOMAIN 1         |    |         PCE DOMAIN 2         |
 |          +--------+          |    |          +--------+          |
 |          |        |          |    |          |        |          |
 |          | PCECC1 |  ---------PCEP---------- | PCECC2 |          |
 |          |        |          |    |          |        |          |
 |          |        |          |    |          |        |          |
 |          +--------+          |    |          +--------+          |
 |         ^          ^         |    |         ^          ^         |
 |        /            \  PCEP  |    |  PCEP  /            \        |
 |       V              V       |    |       V              V       |
 | +--------+        +--------+ |    | +--------+        +--------+ |
 | |NODE 11 |        | NODE 1n| |    | |NODE 21 |        | NODE 2n| |
 | |        | ...... |        | |    | |        | ...... |        | |
 | | PCECC  |        |  PCECC | |    | | PCECC  |        |  PCECC | |
 | |Enabled |        | Enabled| |    | |Enabled |        | Enabled| |
 | +--------+        +--------+ |    | +--------+        +--------+ |
 |                              |    |                              |
 +------------------------------+    +------------------------------+

                 Figure 1: PCECC for Label Management

   o  The PCC would advertise the PCECC capability to the PCE (central
      controller-PCECC)
      [I-D.zhao-pce-pcep-extension-for-pce-controller].

   o  The PCECC could also learn the label range set aside by the PCC
      ([I-D.li-pce-controlled-id-space]).

   o  Optionally, the PCECC could determine the shared MPLS global label
      range for the network.

   o  If the shared global label range needs to be negotiated across
      multiple domains, the central controllers of these domains would
      also need to negotiate a common global label range across
      domains.

   o  The PCECC would need to set the shared global label range on all
      PCC nodes in the network.

3.2.  Using PCECC for SR

   Segment Routing (SR) leverages the source routing paradigm. Using
   SR, a source node steers a packet through a path without relying on
   hop-by-hop signaling protocols such as LDP or RSVP-TE. Each path is
   specified as an ordered list of instructions called "segments". Each
   segment is an instruction to route the packet to a specific place in
   the network, or to perform a specific service on the packet. A
   database of segments can be distributed through the network using a
   routing protocol (such as IS-IS or OSPF) or by any other means. PCEP
   (and PCECC) could be one such means.

   [I-D.ietf-pce-segment-routing] specifies the SR-specific PCEP
   extensions. The PCECC may further use PCEP to distribute SR Segment
   Identifiers (SIDs) to the SR nodes (PCCs). Having the PCECC allocate
   and maintain the SIDs for the nodes and adjacencies in the network,
   and distribute them to the SR nodes directly over the PCEP session,
   has some advantages over configuring each SR node and flooding via
   the IGP, especially in an SDN environment.

   When the PCECC is used for the distribution of the node segment ID
   and adjacency segment ID, the node segment ID is allocated from the
   SRGB of the node. The adjacency segment ID is allocated from the
   SRLB of the node, as described in
   [I-D.zhao-pce-pcep-extension-pce-controller-sr].

   [RFC8355] identifies various protection and resiliency use cases for
   SR. Path protection lets the ingress node be in charge of the
   failure recovery (used for SR-TE). Protection can also be performed
   by the node adjacent to the failed component, commonly referred to as
   local protection techniques or fast-reroute (FRR) techniques. In the
   case of PCECC, the protection paths can be pre-computed and set up by
   the PCE.

   The following example illustrates the use case where the node SID and
   adjacency SID are allocated by the PCECC.

                     192.0.2.1/32
                     +----------+
                     | R1(1001) |
                     +----------+
                          |
                     +----------+
                     | R2(1002) |  192.0.2.2/32
                     +----------+
                    *   |   *    *
                   *    |   *     *
                  *link1|   *      *
   192.0.2.4/32  *      |   *link2  *  192.0.2.5/32
      +-----------+ 9001|   *     +-----------+
      | R4(1004)  |     |   *     | R5(1005)  |
      +-----------+     |   *     +-----------+
                 *      |   *9003  *   +
                  *     |   *     *    +
                   *    |   *    *     +
                  +-----------+  +-----------+
   192.0.2.3/32   | R3(1003)  |  |R6(1006)   |192.0.2.6/32
                  +-----------+  +-----------+
                       |
                  +-----------+
                  | R8(1008)  |  192.0.2.8/32
                  +-----------+

3.2.1.  PCECC SID Allocation

   Each node (PCC) is allocated a node-SID by the PCECC. The PCECC
   needs to distribute the label map of each node to all the nodes in
   the domain. On receiving the label map, each node (PCC) uses its
   local routing information to determine the next hop and downloads the
   label forwarding instructions accordingly. The forwarding behavior
   and the end result are the same as for an IGP-based Node-SID in SR.
   Thus, from anywhere in the domain, it enforces the ECMP-aware
   shortest-path forwarding of the packet towards the related node.

   For each adjacency in the network, the PCECC can allocate an Adj-SID.
   The PCECC sends a PCInitiate message to update the label map of each
   adjacency on the corresponding nodes in the domain. Each node (PCC)
   downloads the label forwarding instructions accordingly. The
   forwarding behavior and the end result are similar to an IGP-based
   "Adj-SID" in SR.

   The various mechanisms are described in
   [I-D.zhao-pce-pcep-extension-pce-controller-sr].

3.2.2.  Use Cases of PCECC for SR Best Effort (BE) Path

   In this mode of the solution, the PCECC just needs to allocate the
   node segment ID and adjacency ID (without calculating an explicit
   path for the SR path). The ingress of the forwarding path just needs
   to encapsulate the destination node segment ID on top of the packet.
   All the intermediate nodes will forward the packet based on the
   destination node SID. This is similar to an LDP LSP.

   R1 may send a packet to R8 simply by pushing an SR header with
   segment list {1008} (Node SID for R8). The path would be based on
   the routing/next-hop calculation on the routers.

3.2.3.  Use Cases of PCECC for SR Traffic Engineering (TE) Path

   SR-TE paths may not follow an IGP SPT. Such paths may be chosen by a
   PCECC and provisioned on the ingress node of the SR-TE path. The SR
   header consists of a list of SIDs (or MPLS labels). The header has
   all the information necessary to guide the packets from the ingress
   node to the egress node of the path; hence, there is no need for any
   signaling protocol. Where a strict traffic engineering path is
   needed, all the adjacency SIDs are stacked; otherwise, a combination
   of node-SIDs and adj-SIDs can be used for the SR-TE path.

   Note that bandwidth reservation is only guaranteed through the
   enforcement of bandwidth admission control. In the RSVP-TE LSP case,
   the control plane signaling also performs the link bandwidth
   reservation at each hop of the path.
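
   The way node SIDs and adjacency SIDs combine into a segment list can
   be sketched as follows. This is an illustration only: the SID values
   are read off the topology figure in Section 3.2 (taking 9001 as the
   adjacency SID of link1 and 9003 as that of link2, both at R2 towards
   R3), and the function names are hypothetical.

```python
# Node SIDs as shown in parentheses in the Section 3.2 topology figure.
NODE_SID = {
    "R1": 1001, "R2": 1002, "R3": 1003, "R4": 1004,
    "R5": 1005, "R6": 1006, "R8": 1008,
}

# Adjacency SIDs at R2 for its two parallel links towards R3
# (assumed reading of the figure: link1 -> 9001, link2 -> 9003).
ADJ_SID = {
    ("R2", "R3", "link1"): 9001,
    ("R2", "R3", "link2"): 9003,
}


def loose_segment_list(waypoints):
    """Node SIDs only: each hop-to-waypoint leg follows the IGP shortest path."""
    return [NODE_SID[node] for node in waypoints]


def pinned_link_segment_list(head, tail, link, egress):
    """Reach `head` by node SID, cross one specific link by its adjacency
    SID, then continue to `egress` by node SID."""
    return [NODE_SID[head], ADJ_SID[(head, tail, link)], NODE_SID[egress]]


if __name__ == "__main__":
    print(loose_segment_list(["R8"]))                          # best-effort style
    print(loose_segment_list(["R4", "R8"]))                    # via waypoint R4
    print(pinned_link_segment_list("R2", "R3", "link1", "R8")) # pin link1
```

   A fully strict path would carry one adjacency SID per hop; the sketch
   pins only a single link because the figure gives adjacency SIDs only
   for the two R2-R3 links.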

   The SR traffic engineering path examples are explained below.

   Note that the node SID for each node is allocated from the SRGB, and
   the adjacency SID for each link is allocated from the SRLB of the
   corresponding node.

   Example 1:

   R1 may send a packet P1 to R8 simply by pushing an SR header with
   segment list {1008}. Based on the best path, it could be:
   R1-R2-R3-R8.

   Example 2:

   R1 may send a packet P2 to R8 by pushing an SR header with segment
   list {1002, 9001, 1008}. The path should be: R1-R2-(link1)-R3-R8.

   Example 3:

   R1 may send a packet P3 to R8 via R4 by pushing an SR header with
   segment list {1004, 1008}. The path could be: R1-R2-R4-R3-R8.

   The local protection examples for the SR-TE path are explained below.

   Example 4: local link protection:

   o  R1 may send a packet P4 to R8 by pushing an SR header with segment
      list {1002, 9001, 1008}. The path should be:
      R1-R2-(link1)-R3-R8.

   o  When node R2 receives the packet from R1 with the header
      R2-(link1)-R3-R8 and finds that link1 has failed, R2 can steer the
      traffic over the bypass and send out the packet with the header
      R3-R8 through link2.

   Example 5: local node protection:

   o  R1 may send a packet P5 to R8 by pushing an SR header with segment
      list {1004, 1008}. The path should be: R1-R2-R4-R3-R8.

   o  When node R2 receives the packet from R1 with the header
      {1004, 1008} and finds that node R4 has failed, it can steer the
      traffic over the bypass and send out the packet with the header
      {1005, 1008} to R5 instead of R4.

3.3.  Use Cases of PCECC for TE LSP

   The previous sections discussed the cases where the SR path is set up
   through the PCECC. Although those cases offer simplicity and
   scalability, there are existing traffic engineering functionalities,
   such as bandwidth guarantees and monitoring, for which SR-based
   solutions are complex. There are also cases where the depth of the
   label stack is an issue for existing deployments and certain vendors.

   To address these issues, the PCECC architecture also supports the TE
   LSP functionalities. To achieve this, the existing PCEP can be used
   to communicate between the PCECC and the nodes along the path. This
   is similar to static LSPs, where LSPs can be provisioned as explicit
   label instructions at each hop on the end-to-end path. Each router
   along the path must be told what label forwarding instructions to
   program and what resources to reserve. The PCE-based controller
   keeps a view of the network and determines the paths of the
   end-to-end LSPs, and the controller uses PCEP to communicate with
   each router along the path of the end-to-end LSP.

                     192.0.2.1/32
                     +----------+
                     |    R1    |
                     +----------+
                      |       |
                      |link1  |
                      |       |link2
                     +----------+
                     |    R2    |  192.0.2.2/32
                     +----------+
               link3 *   |   *    * link4
                    *    |   *     *
                   *link5|   *      *
   192.0.2.4/32   *      |   *link6  *  192.0.2.5/32
      +-----------+      |   *     +-----------+
      |    R4     |      |   *     |    R5     |
      +-----------+      |   *     +-----------+
                 *       |   *      *     +
           link10 *      |   *     *link7 +
                   *     |   *    *       +
                  +-----------+  +-----------+
   192.0.2.3/32   |    R3     |  |    R6     |192.0.2.6/32
                  +-----------+  +-----------+
                    |       |
                    |link8  |
                    |       |link9
                  +-----------+
                  |    R8     |  192.0.2.8/32
                  +-----------+

                 Figure 2: PCECC TE LSP Setup Example

   o  Based on a path computation request / delegation or PCE
      initiation, the PCECC receives the PCECC request with constraints
      and optimization criteria.

   o  The PCECC would calculate the optimal path according to the given
      constraints (e.g., bandwidth).

   o  The PCECC would provision each node along the path and assign
      incoming and outgoing labels from R1 to R8 with the path: {R1,
      link1, 1001}, {1001, R2, link3, 2003}, {2003, R4, link10, 4010},
      {4010, R3, link8, 3008}, {3008, R8}.

   o  For end-to-end protection, the PCECC programs each node along the
      path from R1 to R8 with the secondary path: {R1, link2, 1002},
      {1002, R2, link4, 2004}, {2004, R5, link7, 5007}, {5007, R3,
      link9, 3009}, {3009, R8}.

   o  It is also possible to have a bypass path for local protection set
      up by the PCECC. For example, with the primary path as above, to
      protect the node R4 locally, the PCECC can program the bypass path
      like this: {R2, link5, 2005}, {2005, R3}. By doing this, the node
      R4 is locally protected at R2.

3.3.1.  PCECC Load Balancing (LB) Use Case

   Many service providers use TE tunnels to solve issues with
   non-deterministic paths in their networks. One example of such an
   application is the use of TE tunnels in the mobile backhaul (MBH).
   Consider the following typical topology.

                             TE1 -------------->
+---------+    +--------+    +--------+    +--------+    +------+  +---+
| Access  |----| Access |----| AGG 1  |----| AGG N-1|----|Core 1|--|SR1|
| SubNode1|    | Node 1 |    +--------+    +--------+    +------+  +---+
+---------+    +--------+        |             |           | ^       |
     |  Access     |   Access    |  AGG Ring 1 |           | |       |
     |  SubRing 1  |   Ring 1    |             |           | |       |
+---------+    +--------+    +--------+        |           | |       |
| Access  |    | Access |    | AGG 2  |        |           | |       |
| SubNode2|    | Node 2 |    +--------+        |           | |       |
+---------+    +--------+        |             |           | |       |
     |             |             |             |           | |       |
     |             |             |             +----TE2----|-+       |
+---------+    +--------+    +--------+    +--------+    +------+  +---+
| Access  |    | Access |----| AGG 3  |----| AGG N  |----|Core N|--|SRn|
| SubNodeN|----| Node N |    +--------+    +--------+    +------+  +---+
+---------+    +--------+

   This MBH architecture uses L2 access rings and subrings. L3 starts
   at aggregation.
   For the sake of simplicity, only one access subring, one access
   ring, and one aggregation ring (AGG1...AGGN), connected by Nx10GE
   interfaces, are shown here. The aggregation domain runs its own IGP.
   There are two egress routers (AGG N-1, AGG N) that are connected to
   the Core domain via L2 interfaces. The Core also has connections to
   service routers. RSVP-TE tunnels are used for MPLS transport inside
   the ring. There could be at least 2 tunnels (one way) from each AGG
   router to the egress AGG routers. There are also many L2 access
   rings connected to AGG routers.

   Services are deployed by means of either L2VPNs (VPLS) or L3VPNs.
   Those services use MPLS TE as transport towards the egress AGG
   routers.

   TE tunnels could also be used as transport towards service routers in
   the case of a seamless MPLS based architecture in the future.

   There is a need to solve the following tasks:

   o  Perform automatic LB amongst TE tunnels according to the current
      traffic load.

   o  TE bandwidth (BW) management: provide guaranteed BW for specific
      services (HSI, IPTV, etc.) and provide time-based BW reservation
      (BoD) for other services.

   o  Simplify development of TE tunnels by automation without any
      manual intervention.

   o  Provide flexibility for Service Router placement (anywhere in the
      network by creation of transport LSPs to them).

   Since the other tasks are considered in the PCECC use cases above,
   the remainder of this section focuses only on the load balancing (LB)
   task. The LB task could be solved by means of PCECC in the following
   way:

   o  An application, network service, or operator can ask the SDN
      controller (PCECC) for LSP-based LB between AGG X and AGG N/AGG
      N-1 (the egress AGG routers that have connections to the core) via
      the North Bound Interface (NBI). Each of these requests would
      have associated constraints (e.g., LSP type: traditional CR-LSP or
      SR-TE LSP, bandwidth, inclusion or exclusion of specific links or
      nodes, number of paths, shortest path or minimum cost tree, need
      for disjoint LSP paths, etc.).

   o  The PCECC could calculate multiple (say N) LSPs according to the
      given constraints. The calculation is based on the results of the
      Objective Function (OF) [RFC5541], constraints, endpoints, same or
      different bandwidth (BW), different links (in the case of disjoint
      paths), and other constraints.

   o  Depending on the given LSP Path Setup Type (PST), the PCECC would
      download instructions to the PCC. At this stage it is assumed
      that the PCECC is aware of the label space it controls and, in the
      case of SR, that the SID allocation and distribution is already
      done.

   o  The PCECC would send a PCInitiate PCEP message [RFC8281] towards
      the ingress AGG X router (PCC) for each of the N LSPs and receive
      a PCRpt PCEP message [RFC8231] back from the PCCs. If the PST is
      PCECC-SR, the PCECC would include the SID stack as per
      [I-D.ietf-pce-segment-routing]. If the PST is PCECC (basic), the
      PCECC would assign labels along the calculated path, set up the
      path by sending central controller instructions in PCEP messages
      to each node along the path of the LSP as per
      [I-D.zhao-pce-pcep-extension-for-pce-controller], and then send a
      PCUpd message to the ingress AGG X router with information about
      the new LSP. AGG X (PCC) would respond with a PCRpt carrying the
      LSP status.

   o  AGG X as the ingress router now has N LSPs towards AGG N and AGG
      N-1, which are available for installation in the router's
      forwarding table and for LB of traffic between them. Traffic
      distribution between those LSPs depends on the particular
      realization of the hash function on that router.

   o  Since the PCECC knows both the LSDB and the TEDB (TE state), it
      can manage and prevent possible oversubscription and limit the
      number of available LB states. Via the PCECC mechanism, the
      controller can take quick action in the network by directly
      provisioning the central control instructions.

3.3.2.  PCECC and Inter-AS TE

   There are various signaling options for establishing an Inter-AS TE
   LSP: contiguous TE LSP [RFC5151], stitched TE LSP [RFC5150], and
   nested TE LSP [RFC4206].

   Requirements for PCE-based Inter-AS setup [RFC5376] describe the
   approach and PCEP functionality that are needed for establishing
   Inter-AS TE LSPs.

   [RFC5376] also gives an Inter- and Intra-AS PCE Reference Model,
   which is reproduced below in shortened form for the sake of
   simplicity.

             Inter-AS       Inter-AS
       PCC <-->PCE1<--------->PCE2
        ::      ::             ::
        ::      ::             ::
        R1----ASBR1====ASBR3---R3---ASBR5
        |  AS1  |        |    PCC     |
        |       |        |    AS2     |
        R2----ASBR2====ASBR4---R4---ASBR6
        ::                     ::
        ::                     ::
     Intra-AS               Intra-AS
       PCE3                   PCE4

   Shortened form of Inter- and Intra-AS PCE Reference Model [RFC5376]

   PCECCs belonging to different domains can cooperate to set up an
   inter-AS TE LSP. The stateful H-PCE [I-D.ietf-pce-stateful-hpce]
   mechanism could be used to first establish a per-domain PCECC LSP.

   These could be stitched together to form an inter-AS TE LSP as
   described in [I-D.dugeon-pce-stateful-interdomain].

   The remainder of this section focuses on a simplified Inter-AS case
   in which both AS1 and AS2 belong to the same service provider
   administration. In that case, the Inter-AS and Intra-AS PCEs could
   be combined into one single PCE if the performance of such a combined
   PCE is sufficient for handling all Path Computation Requests.
   Moreover, in that particular case a single PCE could potentially be
   used for both ASes if scalability and performance are sufficient;
   this requires interfaces (PCEP and BGP-LS) to both domains. PCECC
   redundancy mechanisms are described in [RFC8283]. Thus, routers in
   AS1 and AS2 (PCCs) can send Path Computation messages towards the
   same PCECC.
628 +----BGP-LS------+ +------BGP-LS-----+ 629 | | | | 630 +-PCEP-|----++-+-------PCECC-----PCEP--++-+-|-------+ 631 +-:------|----::-:-+ +--::-:-|-------:---+ 632 | : | :: : | | :: : | : | 633 | : RR1 :: : | | :: : RR2 : | 634 | v v: : | LSP1 | :: v v | 635 | R1---------ASBR1=======================ASBR3--------R3 | 636 | | v : | | :v | | 637 | +----------ASBR2=======================ASBR4---------+ | 638 | | Region 1 : | | : Region 1 | | 639 |----------------:-| |--:-------------|--| 640 | | v | LSP2 | v | | 641 | +----------ASBR5=======================ASBR6---------+ | 642 | Region 2 | | Region 2 | 643 +------------------+ <--------------> +-------------------+ 644 MPLS Domain 1 Inter-AS MPLS Domain 2 645 <=======AS1=======> <========AS2=======> 647 Particular case of Inter-AS PCE 649 In a case of PCECC Inter-AS TE scenario where service provider 650 controls both domains (AS1 and AS2), each of them have own IGP and 651 MPLS transport. There is a need is to setup Inter-AS LSPs for 652 transporting different services on top of them (Voice, L3VPN etc.). 653 Inter-AS links with different capacity exist in several regions. The 654 task is not only to provision those Inter-AS LSPs with given 655 constrains but also calculate the path and pre-setup the backup 656 Inter-AS LSPs that will be used if primary LSP fails. 658 For the figure above it would be that LSP1 from R1 to R3 may go via 659 ASBR1 and ASBR3, and it is the primary Inter-AS LSP. R1-R3 LSP2 that 660 go via ASBR5 and ASBR6 is the backup one. In addition there could be 661 bypass LSP setup to protect against ASBR or inter-AS link failure. 663 After the addition of PCECC functionality to PCE (SDN controller), 664 PCECC based Inter-AS TE model SHOULD follow as PCECC usecase for TE 665 LSP as requirements of [RFC5376] with the following details: 667 o Since PCECC needs to know the topology of both domains AS1 and 668 AS2, PCECC could use BGP-LS peering with routers (or RRs) in both 669 domains. 
671 o PCECC needs to PCEP connectivity towards all routers in both 672 domains (see also section 4 in [RFC5376]) in a similar manner as a 673 SDN controller. 675 o After operator's application or service orchestrator will create 676 request for tunnel creation of specific service, PCECC SHOULD 677 receive that request via NBI (NBI type is implementation 678 dependent, MAY be NETCONF/Yang, REST etc.). Then PCECC would 679 calculate the optimal path based on Objective Function (OF) and 680 given constrains (i.e. path setup type, bandwidth etc.), including 681 those from [RFC5376]: priority, AS sequence, preferred ASBR, 682 disjoint paths, protection. On this step we would have two paths: 683 R1-ASBR1-ASBR3-R3, R1-ASBR5-ASBR6-R3 685 o Depending on given LSP PST (PCECC or PCECC-SR), PCECC would use 686 download instructions to the PCC. At this stage it is assumed the 687 PCECC is aware of the label space it controls and in case of SR 688 the SID allocation and distribution is already done. 690 o PCECC would send PCInitiate PCEP message [RFC8281] towards ingress 691 router R1 (PCC) in AS1 and receives PCRpt PCEP message [RFC8231] 692 back from PCC. If the PST is PCECC-SR, the PCECC would include 693 the SID stack as per [I-D.ietf-pce-segment-routing]. It may also 694 include binding SID based on AS boundary. The backup SID stack 695 could also be installed at ingress but more importantly each node 696 along the SR path could also do local protection just based on the 697 top segement. If the PST is PCECC (basic), then the PCECC would 698 assigns labels along the calculated paths (R1-ASBR1-ASBR3-R3, 699 R1-ASBR5-ASBR6-R3); and set up the path by sending central 700 controller instructions in PCEP message to each node along the 701 path of the LSPs as per 702 [I-D.zhao-pce-pcep-extension-for-pce-controller] and then send 703 PCUpd message to the ingress R1 router with information about new 704 LSPs and R1 would respond with PCRpt with LSP(s) status. 
706 o AGG X as ingress router now has N LSPs towards AGG N and AGG N-1 707 which are available for installation in the router's forwarding table and for load balancing 708 of traffic between them. Traffic distribution between those LSPs 709 depends on the particular implementation of the hash function on that router. 711 o After that step, R1 has primary and backup TE LSPs (LSP1 and LSP2) 712 towards R3. It is up to the router implementation how to 713 switch over to the backup LSP2 if LSP1 fails. 715 3.4. Use Cases of PCECC for Multicast LSPs 717 Current multicast LSPs are set up using either the RSVP-TE P2MP or 718 mLDP protocols. The setup of these LSPs may require manual 719 configuration and complex signaling when protection is 720 considered. By using the PCECC solution, the multicast LSP can be 721 computed and set up through a centralized controller which has the full 722 picture of the topology and bandwidth usage for each link. It not 723 only reduces the complex configuration compared to distributed 724 RSVP-TE P2MP or mLDP signaling, but it can also compute the disjoint 725 primary and secondary P2MP paths efficiently. 727 3.4.1. Using PCECC for P2MP/MP2MP LSPs' Setup 729 It is assumed the PCECC is aware of the label space it controls for 730 all nodes and makes allocations accordingly. 732 +----------+ 733 | R1 | Root node of the multicast LSP 734 +----------+ 735 |6000 736 +----------+ 737 Transit Node | R2 | 738 branch +----------+ 739 * | * * 740 9001* | * *9002 741 * | * * 742 +-----------+ | * +-----------+ 743 | R4 | | * | R5 | Transit Nodes 744 +-----------+ | * +-----------+ 745 * | * * + 746 9003* | * * +9004 747 * | * * + 748 +-----------+ +-----------+ 749 | R3 | | R6 | Leaf Node 750 +-----------+ +-----------+ 751 9005| 752 +-----------+ 753 | R8 | Leaf Node 754 +-----------+ 756 The P2MP examples are explained here, where R1 is the root and R8 and R6 757 are the leaves.
759 o Based on the P2MP path computation request / delegation or PCE 760 initiation, the PCECC receives the PCECC request with constraints 761 and optimization criteria. 763 o PCECC would calculate the optimal P2MP path according to the given 764 constraints (e.g., bandwidth). 766 o PCECC would provision each node along the path and assign incoming 767 and outgoing labels from R1 to {R6, R8} with the path: {R1, 6000}, 768 {6000, R2, {9001,9002}}, {9001, R4, 9003}, {9002, R5, 9004}, {9003, 769 R3, 9005}, {9004, R6}, {9005, R8}. The main difference is in the 770 branch node instruction at R2, where two copies of the packet are sent 771 towards R4 and R5 with labels 9001 and 9002 respectively. 773 The packet forwarding involves - 775 Step1: R1 may send a packet P1 to R2 simply by pushing a label of 776 6000 onto the packet. 778 Step2: After R2 receives the packet with label 6000, it will 779 forward one copy to R4 by swapping the label to 9001 and another copy 780 to R5 by swapping the label to 9002. 782 Step3: After R4 receives the packet with label 9001, it will 783 forward it to R3 by swapping to 9003. After R5 receives the 784 packet with label 9002, it will forward it to R6 by swapping to 785 9004. 787 Step4: After R3 receives the packet with label 9003, it will 788 forward it to R8 by swapping to 9005. 791 Step5: The packet is received at R8 and label 9005 is popped; the packet is received 792 at R6 and label 9004 is popped. 794 3.4.2. Use Cases of PCECC for the Resiliency of P2MP/MP2MP LSPs 796 3.4.2.1. PCECC for the End-to-End Protection of the P2MP/MP2MP LSPs 798 In this section we describe the end-to-end managed path protection 799 service as well as the local protection with the operation management 800 in the PCECC network for the P2MP/MP2MP LSP. 802 An end-to-end protection principle can be applied for computing 803 backup P2MP or MP2MP LSPs.
During computation of the primary 804 multicast trees, the PCECC server may also take the computation of a 805 secondary tree into consideration. A PCE may compute the primary and 806 backup P2MP (or MP2MP) LSPs together or sequentially. 808 +----+ +----+ 809 Root node of LSP | R1 |--| R11| 810 +----+ +----+ 811 / + 812 10/ +20 813 / + 814 +----------+ +-----------+ 815 Transit Node | R2 | | R3 | 816 +----------+ +-----------+ 817 | \ + + 818 | \ + + 819 10| 10\ +20 20+ 820 | \ + + 821 | \ + 822 | + \ + 823 +-----------+ +-----------+ Leaf Nodes 824 | R4 | | R5 | (Downstream LSR) 825 +-----------+ +-----------+ 827 In the example above, when the PCECC sets up the primary multicast tree 828 from the root node R1 to the leaves, which is R1->R2->{R4, R5}, it can at 829 the same time set up the backup tree, which is R1->R11->R3->{R4, 830 R5}. Both the primary forwarding tree and the secondary 831 forwarding tree will be downloaded to each router along the primary 832 path and the secondary path. The traffic will normally be forwarded through 833 the R1->R2->{R4, R5} path, and when a node in the 834 primary tree fails (say R2), the root node R1 will switch the 835 flow to the backup tree, which is R1->R11->R3->{R4, R5}. By using 836 the PCECC, the path computation and forwarding path downloading can 837 all be done without the complex signaling used in P2MP RSVP-TE or 838 mLDP. 840 3.4.2.2. PCECC for the Local Protection of the P2MP/MP2MP LSPs 842 In this section we describe the local protection service in the PCECC 843 network for the P2MP/MP2MP LSP. 845 While the PCECC sets up the primary multicast tree, it can also build 846 the backup LSP among the Point of Local Repair (PLR), the protected node, and the Merge Points (MPs, the downstream 847 nodes of the protected node). In cases where the number of 848 downstream nodes is large, this mechanism can avoid unnecessary 849 packet duplication on the PLR and protect the network from the risk of traffic 850 congestion.
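As a rough sketch of the bypass computation described above, the following Python fragment finds, for each merge point, a path from the PLR that avoids the protected node. It is an illustration only: the function name `bypass_paths`, the adjacency-map representation and the use of breadth-first search are assumptions made here and are not procedures defined for PCECC or PCEP. The node names follow the example used in this section (PLR R10, protected node R20, bypass node R30, merge points R40 and R50).

```python
from collections import deque


def bypass_paths(adj, plr, protected, merge_points):
    """Return {merge_point: [plr, ..., merge_point]} avoiding `protected`.

    `adj` maps a node name to the list of its downstream neighbours.
    Merge points that cannot be reached without the protected node are
    left out of the result.
    """
    parent = {plr: None}
    queue = deque([plr])
    while queue:
        node = queue.popleft()
        for neighbour in adj.get(node, ()):
            # Never traverse the protected node, and visit each node once.
            if neighbour == protected or neighbour in parent:
                continue
            parent[neighbour] = node
            queue.append(neighbour)

    paths = {}
    for mp in merge_points:
        if mp not in parent:
            continue  # no bypass exists for this merge point
        hops = []
        node = mp
        while node is not None:
            hops.append(node)
            node = parent[node]
        paths[mp] = hops[::-1]
    return paths


# Hypothetical adjacency map for the local-protection example.
topology = {
    "R10": ["R20", "R30"],
    "R20": ["R40", "R50"],
    "R30": ["R40", "R50"],
}
paths = bypass_paths(topology, plr="R10", protected="R20",
                     merge_points=["R40", "R50"])
# Set of next hops the PLR itself has to send a copy to.
first_hops = {path[1] for path in paths.values()}
```

In this topology every bypass path leaves the PLR over the same link towards R30, so the PLR sends one copy of each packet and the replication towards R40 and R50 happens at R30.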
852 +------------+ 853 | R1 | Root Node 854 +------------+ 855 . 856 . 857 . 858 +------------+ Point of Local Repair/ 859 | R10 | Switchover Point 860 +------------+ (Upstream LSR) 861 / + 862 10/ +20 863 / + 864 +----------+ +-----------+ 865 Protected Node | R20 | | R30 | 866 +----------+ +-----------+ 867 | \ + + 868 | \ + + 869 10| 10\ +20 20+ 870 | \ + + 871 | \ + 872 | + \ + 873 +-----------+ +-----------+ Merge Point 874 | R40 | | R50 | (Downstream LSR) 875 +-----------+ +-----------+ 876 . . 877 . . 879 In the example above, when the PCECC sets up the primary multicast path 880 through the PLR node R10 and the protected node R20, which is R10->R20->{R40, 881 R50}, it can at the same time set up the backup path R10->R30->{R40, 882 R50}. Both the primary forwarding path and the secondary 883 bypass forwarding path will be downloaded to each router along the 884 primary path and the secondary bypass path. The traffic will normally be 885 forwarded through the R10->R20->{R40, R50} path, and when 886 node R20 fails, the PLR node R10 will 887 switch the flow to the backup path, which is R10->R30->{R40, R50}. 888 By using the PCECC, the path computation and forwarding path 889 downloading can all be done without the complex signaling used in 890 P2MP RSVP-TE or mLDP. 892 3.5. Use Cases of PCECC for LSP in the Network Migration 894 One of the main advantages of the PCECC solution is that it is naturally backward 895 compatible, since the PCE server itself can function as a 896 proxy node of the MPLS network for all the new nodes which may no longer 897 support the signaling protocols. 899 As illustrated in the following example, the current network 900 could gradually migrate to a fully PCECC-controlled network by 901 replacing the legacy nodes.
During the migration, the legacy nodes 902 still need to signal using the existing MPLS protocols such as LDP and 903 RSVP-TE, and the new nodes set up their portion of the forwarding path 904 through PCECC directly. With the PCECC functioning as the proxy of 905 these new nodes, MPLS signaling can propagate through the network as 906 normal. 908 The example described in this section is based on the network configuration 909 illustrated in the following figure: 911 +------------------------------------------------------------------+ 912 | PCE DOMAIN | 913 | +-----------------------------------------------------+ | 914 | | PCECC | | 915 | +-----------------------------------------------------+ | 916 | ^ ^ ^ ^ | 917 | | PCEP | | PCEP | | 918 | V V V V | 919 | +--------+ +--------+ +--------+ +--------+ +--------+ | 920 | | NODE 1 | | NODE 2 | | NODE 3 | | NODE 4 | | NODE 5 | | 921 | | |...| |...| |...| |...| | | 922 | | Legacy |if1| Legacy |if2|Legacy |if3| PCECC |if4| PCECC | | 923 | | Node | | Node | |Enabled | |Enabled | | Enabled| | 924 | +--------+ +--------+ +--------+ +--------+ +--------+ | 925 | | 926 +------------------------------------------------------------------+ 928 Example: PCECC Initiated LSP Setup In the Network Migration 930 In this example, there are five nodes for the TE LSP from the head end 931 (Node1) to the tail end (Node5). Node4 and Node5 are 932 centrally controlled and the other nodes are legacy nodes. 934 o Node1 sends a path request message for the setup of an LSP 935 destined for Node5. 937 o PCECC sends Node1 a reply message for LSP setup with the path: 938 (Node1, if1), (Node2, if2), (Node3, if3), (Node4, if4), Node5. 940 o Node1, Node2 and Node3 will set up the LSP to Node5 using the local 941 labels as usual. Node3, with the help of the PCECC, could proxy the 942 signaling. 944 o Then the PCECC will program the out-segment of Node3, the in- 945 segment/ out-segment of Node4, and the in-segment for Node5. 947 3.6.
Use Cases of PCECC for L3VPN and PWE3 949 As described in [RFC8283], various network services may be offered 950 over a network. These include protection services, Virtual 951 Private Network (VPN) services (such as Layer 3 VPNs [RFC4364] or 952 Ethernet VPNs [RFC7432]), or Pseudowires [RFC3985]. Delivering 953 services over a network in an optimal way requires coordination in 954 the way that network resources are allocated to support the services. 955 A PCE-based central controller can consider the whole network and all 956 components of a service at once when planning how to deliver the 957 service. It can then use PCEP to manage the network resources and to 958 install the necessary associations between those resources. 960 In the case of L3VPN, VPN labels can be assigned and distributed 961 among the PE routers through PCEP by the PCECC instead of using 962 BGP. 964 The example described in this section is based on the network configuration 965 illustrated in the following figure: 967 +-------------------------------------------+ 968 | PCE DOMAIN | 969 | +-----------------------------------+ | 970 | | PCECC | | 971 | +-----------------------------------+ | 972 | ^ ^ ^ | 973 |PWE3/L3VPN | PCEP PCEP|LSP PWE3/L3VPN|PCEP | 974 | V V V | 975 +--------+ | +--------+ +--------+ +--------+ | +--------+ 976 | CE | | | PE1 | | NODE x | | PE2 | | | CE | 977 | |......
| |...| |...| |.....| | 978 | Legacy | |if1 | PCECC |if2|PCECC |if3| PCECC |if4 | Legacy | 979 | Node | | | Enabled| |Enabled | |Enabled | | | Node | 980 +--------+ | +--------+ +--------+ +--------+ | +--------+ 981 | | 982 +-------------------------------------------+ 984 Example: Using PCECC for L3VPN and PWE3 986 In the case of PWE3, instead of using the LDP signaling protocol, the 987 label and port pairs for each pseudowire can be assigned 988 through the PCECC among the PE routers, and the corresponding forwarding 989 entries will be distributed to each PE router through the extended 990 PCEP and the PCECC mechanism. 992 3.7. Using PCECC for Traffic Classification Information 994 As described in [RFC8283], traffic classification is an important 995 part of traffic engineering. It is the process of looking at a 996 packet to determine how it should be treated as it is forwarded 997 through the network. It applies in many scenarios including MPLS 998 traffic engineering (where it determines what traffic is forwarded 999 onto which LSPs); segment routing (where it is used to select which 1000 set of forwarding instructions to add to a packet); and SFC (where it 1001 indicates along which service function path a packet should be 1002 forwarded). In conjunction with traffic engineering, traffic 1003 classification is an important enabler for load balancing. Traffic 1004 classification is closely linked to the computational elements of 1005 planning for the network functions just listed because it determines 1006 how traffic load is balanced and distributed through the network. 1007 Therefore, selecting what traffic classification should be performed 1008 by a router is an important part of the work done by a PCECC. 1010 Instructions can be passed from the controller to the routers using 1011 PCEP. These instructions tell the routers how to map traffic to 1012 paths or connections. Refer to [I-D.ietf-pce-pcep-flowspec].
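As a rough illustration of the kind of mapping such instructions convey, the following Python sketch models a router-side classifier that maps packets to paths by destination prefix and DSCP value. The `ClassifierRule` fields, the `classify` function, the tie-breaking policy (longest prefix first, then a rule with an explicit DSCP) and the path names are all hypothetical; they are not taken from [I-D.ietf-pce-pcep-flowspec] and do not reflect its encoding. The addresses are from the IPv4 documentation ranges.

```python
import ipaddress
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ClassifierRule:
    """One hypothetical rule a controller might install on a router."""
    prefix: ipaddress.IPv4Network   # destination prefix to match
    dscp: Optional[int]             # None matches any DSCP value
    path: str                       # name of the LSP or path to use


def classify(rules: List[ClassifierRule], dst: str, dscp: int) -> Optional[str]:
    """Return the path for a packet, or None if no rule matches."""
    addr = ipaddress.ip_address(dst)
    best_key = None
    best_path = None
    for rule in rules:
        if addr not in rule.prefix:
            continue
        if rule.dscp is not None and rule.dscp != dscp:
            continue
        # Prefer the longest prefix; on a tie prefer an explicit DSCP match.
        key = (rule.prefix.prefixlen, rule.dscp is not None)
        if best_key is None or key > best_key:
            best_key, best_path = key, rule.path
    return best_path


# Example rule set: all traffic to 192.0.2.0/24 uses LSP1, except that
# DSCP 46 traffic to the upper half of the prefix is steered onto LSP2.
rules = [
    ClassifierRule(ipaddress.ip_network("192.0.2.0/24"), None, "LSP1"),
    ClassifierRule(ipaddress.ip_network("192.0.2.128/25"), 46, "LSP2"),
]
```

With these rules, a packet to 192.0.2.200 marked with DSCP 46 is mapped to LSP2, the same destination with any other DSCP falls back to LSP1, and a destination outside 192.0.2.0/24 matches nothing.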
1014 Along with traffic classification, there are a few more questions - 1016 o How to use it 1018 o Whether it is a virtual link 1020 o Whether to advertise it in the IGP 1022 o What bits of this information to signal to the tail end 1024 3.8. Use Cases of PCECC for SRv6 1026 As per [RFC8402], with Segment Routing (SR), a node steers a packet 1027 through an ordered list of instructions, called segments. Segment 1028 Routing can be applied to the IPv6 architecture with the Segment 1029 Routing Header (SRH) [I-D.ietf-6man-segment-routing-header]. A 1030 segment is encoded as an IPv6 address. An ordered list of segments 1031 is encoded as an ordered list of IPv6 addresses in the routing 1032 header. The active segment is indicated by the Destination Address 1033 of the packet. Upon completion of a segment, a pointer in the new 1034 routing header is incremented and indicates the next segment. 1036 As per [I-D.ietf-6man-segment-routing-header], an SRv6 Segment is a 1037 128-bit value. "SRv6 SID" or simply "SID" are often used as a 1038 shorter reference for "SRv6 Segment". An 1039 illustration is provided in 1041 [I-D.filsfils-spring-srv6-network-programming], where an SRv6 SID is 1042 represented as LOC:FUNCT. 1044 [I-D.negi-pce-segment-routing-ipv6] extends 1045 [I-D.ietf-pce-segment-routing] to support SR for the IPv6 data plane. 1046 Further, a PCECC could be extended to support SRv6 SID allocation and 1047 distribution. 1049 [Editor's Note - more details to be added] 1051 3.9. Use Cases of PCECC for SFC 1053 Service Function Chaining (SFC) is described in [RFC7665]. It is the 1054 process of directing traffic in a network such that it passes through 1055 specific hardware devices or virtual machines (known as service 1056 function nodes) that can perform particular desired functions on the 1057 traffic. The set of functions to be performed and the order in which 1058 they are to be performed is known as a service function chain.
The 1059 chain is enhanced with the locations at which the service functions 1060 are to be performed to derive a Service Function Path (SFP). Each 1061 packet is marked as belonging to a specific SFP, and that marking 1062 lets each successive service function node know which functions to 1063 perform and to which service function node to send the packet next. 1064 To operate an SFC network, the service function nodes must be 1065 configured to understand the packet markings, and the edge nodes must 1066 be told how to mark packets entering the network. Additionally, it 1067 may be necessary to establish tunnels between service function nodes 1068 to carry the traffic. Planning an SFC network requires load 1069 balancing between service function nodes and traffic engineering 1070 across the network that connects them. As per [RFC8283], these are 1071 operations that can be performed by a PCE-based controller, and that 1072 controller can use PCEP to program the network and install the 1073 service function chains and any required tunnels. 1075 PCECC can play the role of setting the traffic classification rules 1076 at the classifier as well as downloading the forwarding instructions 1077 to the SFFs so that they can process the NSH and forward 1078 accordingly. 1080 [Editor's Note - more details to be added] 1082 3.10. Use Cases of PCECC for Native IP 1084 [I-D.ietf-teas-native-ip-scenarios] describes the scenarios and 1085 suggestions for the "Centrally Control Dynamic Routing (CCDR)" 1086 architecture, which integrates the merits of traditional distributed 1087 protocols (IGP/BGP) and the power of central control technologies 1088 (PCE/SDN) to provide a feasible traffic engineering solution in 1089 various complex scenarios for service providers. 1090 [I-D.ietf-teas-pce-native-ip] defines the framework for CCDR traffic 1091 engineering within a Native IP network, using a Dual/Multi-BGP session 1092 strategy and the CCDR architecture.
PCEP can be used to 1093 transfer the key parameters between the PCE and the underlying network 1094 devices (PCCs) using the PCECC technique. The central control 1095 instructions from the PCECC identify which prefix should be advertised 1096 on which BGP session. 1098 3.11. Use Cases of PCECC for Local Protection (RSVP-TE) 1100 [I-D.cbrt-pce-stateful-local-protection] describes the need for the 1101 PCE to maintain and associate the local protection paths for the 1102 RSVP-TE LSP. Local protection requires the setup of a bypass at the 1103 PLR. This bypass can be PCC-initiated and delegated, or PCE- 1104 initiated. In either case, the PLR MUST maintain a PCEP session to 1105 the PCE. The Bypass LSPs need to be mapped to the primary LSP. This 1106 could be done locally at the PLR based on a local policy, but there is 1107 a need for a PCE to do the mapping as well to exert greater control. 1109 This mapping can be done via PCECC procedures where the PCE could 1110 instruct the PLR to do the mapping and identify the primary LSP for 1111 which the bypass should be used. 1113 4. IANA Considerations 1115 This document does not require any action from IANA. 1117 5. Security Considerations 1119 TBD. 1121 6. Acknowledgments 1123 We would like to thank Adrian Farrel, Aijun Wang, Robert Tao, 1124 Changjiang Yan, Tieying Huang, Sergio Belotti, Dieter Beller, Andrey 1125 Elperin and Evgeniy Brodskiy for their useful comments and 1126 suggestions. 1128 7. References 1130 7.1. Normative References 1132 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 1133 Requirement Levels", BCP 14, RFC 2119, 1134 DOI 10.17487/RFC2119, March 1997, 1135 . 1137 [RFC5440] Vasseur, JP., Ed. and JL. Le Roux, Ed., "Path Computation 1138 Element (PCE) Communication Protocol (PCEP)", RFC 5440, 1139 DOI 10.17487/RFC5440, March 2009, 1140 . 1142 [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 1143 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, 1144 May 2017, .
1146 [RFC8283] Farrel, A., Ed., Zhao, Q., Ed., Li, Z., and C. Zhou, "An 1147 Architecture for Use of PCE and the PCE Communication 1148 Protocol (PCEP) in a Network with Central Control", 1149 RFC 8283, DOI 10.17487/RFC8283, December 2017, 1150 . 1152 7.2. Informative References 1154 [RFC3985] Bryant, S., Ed. and P. Pate, Ed., "Pseudo Wire Emulation 1155 Edge-to-Edge (PWE3) Architecture", RFC 3985, 1156 DOI 10.17487/RFC3985, March 2005, 1157 . 1159 [RFC4206] Kompella, K. and Y. Rekhter, "Label Switched Paths (LSP) 1160 Hierarchy with Generalized Multi-Protocol Label Switching 1161 (GMPLS) Traffic Engineering (TE)", RFC 4206, 1162 DOI 10.17487/RFC4206, October 2005, 1163 . 1165 [RFC4364] Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private 1166 Networks (VPNs)", RFC 4364, DOI 10.17487/RFC4364, February 1167 2006, . 1169 [RFC5150] Ayyangar, A., Kompella, K., Vasseur, JP., and A. Farrel, 1170 "Label Switched Path Stitching with Generalized 1171 Multiprotocol Label Switching Traffic Engineering (GMPLS 1172 TE)", RFC 5150, DOI 10.17487/RFC5150, February 2008, 1173 . 1175 [RFC5151] Farrel, A., Ed., Ayyangar, A., and JP. Vasseur, "Inter- 1176 Domain MPLS and GMPLS Traffic Engineering -- Resource 1177 Reservation Protocol-Traffic Engineering (RSVP-TE) 1178 Extensions", RFC 5151, DOI 10.17487/RFC5151, February 1179 2008, . 1181 [RFC5541] Le Roux, JL., Vasseur, JP., and Y. Lee, "Encoding of 1182 Objective Functions in the Path Computation Element 1183 Communication Protocol (PCEP)", RFC 5541, 1184 DOI 10.17487/RFC5541, June 2009, 1185 . 1187 [RFC5376] Bitar, N., Zhang, R., and K. Kumaki, "Inter-AS 1188 Requirements for the Path Computation Element 1189 Communication Protocol (PCECP)", RFC 5376, 1190 DOI 10.17487/RFC5376, November 2008, 1191 . 1193 [RFC7432] Sajassi, A., Ed., Aggarwal, R., Bitar, N., Isaac, A., 1194 Uttaro, J., Drake, J., and W. Henderickx, "BGP MPLS-Based 1195 Ethernet VPN", RFC 7432, DOI 10.17487/RFC7432, February 1196 2015, . 
1198 [RFC7665] Halpern, J., Ed. and C. Pignataro, Ed., "Service Function 1199 Chaining (SFC) Architecture", RFC 7665, 1200 DOI 10.17487/RFC7665, October 2015, 1201 . 1203 [RFC8231] Crabbe, E., Minei, I., Medved, J., and R. Varga, "Path 1204 Computation Element Communication Protocol (PCEP) 1205 Extensions for Stateful PCE", RFC 8231, 1206 DOI 10.17487/RFC8231, September 2017, 1207 . 1209 [RFC8281] Crabbe, E., Minei, I., Sivabalan, S., and R. Varga, "Path 1210 Computation Element Communication Protocol (PCEP) 1211 Extensions for PCE-Initiated LSP Setup in a Stateful PCE 1212 Model", RFC 8281, DOI 10.17487/RFC8281, December 2017, 1213 . 1215 [RFC8355] Filsfils, C., Ed., Previdi, S., Ed., Decraene, B., and R. 1216 Shakir, "Resiliency Use Cases in Source Packet Routing in 1217 Networking (SPRING) Networks", RFC 8355, 1218 DOI 10.17487/RFC8355, March 2018, 1219 . 1221 [RFC8402] Filsfils, C., Ed., Previdi, S., Ed., Ginsberg, L., 1222 Decraene, B., Litkowski, S., and R. Shakir, "Segment 1223 Routing Architecture", RFC 8402, DOI 10.17487/RFC8402, 1224 July 2018, . 1226 [I-D.ietf-pce-segment-routing] 1227 Sivabalan, S., Filsfils, C., Tantsura, J., Henderickx, W., 1228 and J. Hardwick, "PCEP Extensions for Segment Routing", 1229 draft-ietf-pce-segment-routing-14 (work in progress), 1230 October 2018. 1232 [I-D.ietf-pce-stateful-hpce] 1233 Dhody, D., Lee, Y., Ceccarelli, D., Shin, J., King, D., 1234 and O. Dios, "Hierarchical Stateful Path Computation 1235 Element (PCE).", draft-ietf-pce-stateful-hpce-05 (work in 1236 progress), June 2018. 1238 [I-D.ietf-pce-pcep-flowspec] 1239 Dhody, D., Farrel, A., and Z. Li, "PCEP Extension for Flow 1240 Specification", draft-ietf-pce-pcep-flowspec-02 (work in 1241 progress), October 2018. 1243 [I-D.zhao-pce-pcep-extension-for-pce-controller] 1244 Zhao, Q., Li, Z., Dhody, D., Karunanithi, S., Farrel, A., 1245 and C. 
Zhou, "PCEP Procedures and Protocol Extensions for 1246 Using PCE as a Central Controller (PCECC) of LSPs", draft- 1247 zhao-pce-pcep-extension-for-pce-controller-08 (work in 1248 progress), June 2018. 1250 [I-D.zhao-pce-pcep-extension-pce-controller-sr] 1251 Zhao, Q., Li, Z., Dhody, D., Karunanithi, S., Farrel, A., 1252 and C. Zhou, "PCEP Procedures and Protocol Extensions for 1253 Using PCE as a Central Controller (PCECC) of SR-LSPs", 1254 draft-zhao-pce-pcep-extension-pce-controller-sr-03 (work 1255 in progress), June 2018. 1257 [I-D.li-pce-controlled-id-space] 1258 Li, C., Chen, M., Dong, J., Li, Z., and D. Dhody, "PCE 1259 Controlled ID Space", draft-li-pce-controlled-id-space-00 1260 (work in progress), June 2018. 1262 [I-D.dugeon-pce-stateful-interdomain] 1263 Dugeon, O., Meuric, J., Lee, Y., Dhody, D., and D. 1264 Ceccarelli, "PCEP Extension for Stateful Inter-Domain 1265 Tunnels", draft-dugeon-pce-stateful-interdomain-01 (work 1266 in progress), July 2018. 1268 [I-D.cbrt-pce-stateful-local-protection] 1269 Barth, C. and R. Torvi, "PCEP Extensions for RSVP-TE 1270 Local-Protection with PCE-Stateful", draft-cbrt-pce- 1271 stateful-local-protection-01 (work in progress), June 1272 2018. 1274 [I-D.filsfils-spring-srv6-network-programming] 1275 Filsfils, C., Camarillo, P., Leddy, J., 1276 daniel.voyer@bell.ca, d., Matsushima, S., and Z. Li, "SRv6 1277 Network Programming", draft-filsfils-spring-srv6-network- 1278 programming-05 (work in progress), July 2018. 1280 [I-D.negi-pce-segment-routing-ipv6] 1281 Negi, M., Kaladharan, P., Dhody, D., and S. Sivabalan, 1282 "PCEP Extensions for Segment Routing leveraging the IPv6 1283 data plane", draft-negi-pce-segment-routing-ipv6-02 (work 1284 in progress), June 2018. 1286 [I-D.ietf-6man-segment-routing-header] 1287 Filsfils, C., Previdi, S., Leddy, J., Matsushima, S., and 1288 d. daniel.voyer@bell.ca, "IPv6 Segment Routing Header 1289 (SRH)", draft-ietf-6man-segment-routing-header-14 (work in 1290 progress), June 2018. 
1292 [I-D.ietf-teas-pce-native-ip] 1293 Wang, A., Zhao, Q., Khasanov, B., Chen, H., Mi, P., 1294 Mallya, R., and S. Peng, "PCE in Native IP Network", 1295 draft-ietf-teas-pce-native-ip-01 (work in progress), June 1296 2018. 1298 [I-D.ietf-teas-native-ip-scenarios] 1299 Wang, A., Huang, X., Qou, C., Li, Z., Huang, L., and P. 1300 Mi, "CCDR Scenario, Simulation and Suggestion", draft- 1301 ietf-teas-native-ip-scenarios-01 (work in progress), June 1302 2018. 1304 [MAP-REDUCE] 1305 Lee, K., Choi, T., Ganguly, A., Wolinsky, D., Boykin, P., 1306 and R. Figueiredo, "Parallel Processing Framework on a P2P 1307 System Using Map and Reduce Primitives", , May 2011, 1308 . 1310 [MPLS-DC] Afanasiev, D. and D. Ginsburg, "MPLS in DC and inter-DC 1311 networks: the unified forwarding mechanism for network 1312 programmability at scale", , March 2014, 1313 . 1316 7.3. URIs 1318 [1] https://hadoop.apache.org/ 1320 Appendix A. Using reliable P2MP TE based multicast delivery for 1321 distributed computations (MapReduce-Hadoop) 1323 The MapReduce model of distributed computation in computing clusters is 1324 widely deployed. In the Hadoop [1] 1.0 architecture, MapReduce operations 1325 on big data are performed by means of a Master-Slave architecture in the 1326 Hadoop Distributed File System (HDFS), where the NameNode has 1327 knowledge about the resources of the cluster and where the actual data 1328 (chunks) for a particular task are located (on which DataNode). Each 1329 chunk of data (64MB or more) should have 3 saved copies on different 1330 DataNodes based on their proximity. 1332 The proximity level currently has semi-manual allocation and is based on 1333 Rack IDs (the assumption is that closer data are better because of access 1334 speed/smaller latency). 1336 The JobTracker node is responsible for computation tasks and scheduling 1337 across DataNodes, and is also rack-aware. Currently, transport 1338 protocols between NameNode/JobTracker and DataNodes are based on IP 1339 unicast.
Its advantage is simplicity, but it has numerous drawbacks 1340 related to its flat approach. 1342 It is clear that we should go beyond one DC for Hadoop cluster 1343 creation and move towards distributed clusters. In that case we need 1344 to handle performance and latency issues. Latency depends on the speed 1345 of light in fiber links and also on the latency introduced by the intermediate 1346 devices in between. The latter is closely correlated with network 1347 device architecture and performance. The current performance of NPU 1348 based routers should be enough for creating distributed Hadoop 1349 clusters with predictable latency. The performance of SW based routers 1350 (mainly as VNFs) together with additional HW features such as DPDK is 1351 promising but requires additional research and testing. 1353 The main question is how we can create a simple but effective architecture 1354 for a distributed Hadoop cluster. 1356 There is research [MAP-REDUCE] which shows how the use of a multicast tree 1357 could improve the speed of resource or cluster member discovery inside 1358 the cluster as well as increase redundancy in communications between 1359 cluster nodes. 1361 Is traditional IP based multicast enough for that? We doubt it, 1362 because it requires an additional control plane (IGMP, PIM) and a lot of 1363 signaling, which is not suitable for high performance computations 1364 that are very sensitive to latency. 1366 P2MP TE tunnels look much more suitable as a potential solution for 1367 creating multicast based communications between Master and Slave 1368 nodes inside the cluster. Obviously these P2MP tunnels should be 1369 dynamically created and torn down (no manual intervention). Here, 1370 the PCECC comes into play with the main objective of creating an optimal 1371 topology for each particular MapReduce computation request and 1372 also creating P2MP tunnels with the needed parameters such as bandwidth and 1373 delay.
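A minimal sketch of how a controller might derive such a tree is given below. It prunes every link whose available bandwidth is below the requested value, runs Dijkstra's algorithm on link delay from the root, and takes the union of the resulting shortest paths to the leaves. This is a simple heuristic rather than an optimal (Steiner) tree computation, and it is not a procedure defined for PCECC; the function name `p2mp_tree`, the link attribute layout, the node names and all numeric values are hypothetical.

```python
import heapq


def p2mp_tree(links, root, leaves, min_bw):
    """Compute a delay-based shortest-path tree under a bandwidth constraint.

    `links` maps a directed link (u, v) to a (delay, bandwidth) tuple.
    Returns (set of tree links, {leaf: end-to-end delay}).
    Raises ValueError if some leaf cannot be reached.
    """
    # Keep only links that can carry the requested bandwidth.
    adj = {}
    for (u, v), (delay, bw) in links.items():
        if bw >= min_bw:
            adj.setdefault(u, []).append((v, delay))

    # Dijkstra on delay from the root.
    dist = {root: 0}
    parent = {}
    heap = [(0, root)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, delay in adj.get(u, []):
            nd = d + delay
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                parent[v] = u
                heapq.heappush(heap, (nd, v))

    # The tree is the union of the shortest paths to all leaves.
    tree = set()
    for leaf in leaves:
        if leaf not in dist:
            raise ValueError(f"no feasible path to {leaf}")
        node = leaf
        while node != root:
            tree.add((parent[node], node))
            node = parent[node]
    return tree, {leaf: dist[leaf] for leaf in leaves}


# Hypothetical cluster fabric: (delay in ms, available bandwidth in Gbit/s).
links = {
    ("client", "S1"): (1, 10),
    ("S1", "DN1"): (1, 10),
    ("S1", "DN2"): (2, 10),
    ("client", "S2"): (1, 1),
    ("S2", "DN2"): (1, 1),
}
tree, delays = p2mp_tree(links, "client", ["DN1", "DN2"], min_bw=5)
```

With a 5 Gbit/s request the links through S2 are pruned and both DataNodes are reached through S1; with a request of 1 Gbit/s the lower-delay path to DN2 through S2 would be chosen instead.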
1375 This solution would require the use of MPLS label based forwarding inside 1376 the cluster. The use of label based forwarding inside the DC was proposed 1377 by Yandex [MPLS-DC]. Technically it is already possible because MPLS 1378 on switches is already supported by some vendors, and MPLS also exists on 1379 Linux and OVS. 1381 The following framework can accomplish this task: 1383 +--------+ 1384 | APP | 1385 +--------+ 1386 | NBI (REST API,...) 1387 | 1388 PCEP +----------+ REST API 1389 +---------+ +---| PCECC |----------+ 1390 | Client |---|---| | | 1391 +---------+ | +----------+ | 1392 | | | | | | 1393 +-----|---+ |PCEP| | 1394 +--------+ | | | | | 1395 | | | | | | 1396 | REST API | | | | | 1397 | | | | | | 1398 +-------------+ | | | | +----------+ 1399 | Job Tracker | | | | | | NameNode | 1400 | | | | | | | | 1401 +-------------+ | | | | +----------+ 1402 +------------------+ | +-----------+ 1403 | | | | 1404 |---+-----P2MP TE--+-----|-----------| | 1405 +----------+ +----------+ +----------+ 1406 | DataNode1| | DataNode2| | DataNodeN| 1407 |TaskTraker| |TaskTraker| .... |TaskTraker| 1408 +----------+ +----------+ +----------+ 1410 Communication between the Master nodes (JobTracker and NameNode) and 1411 the PCECC via the REST API MAY be done either directly or via a cluster manager 1412 such as Mesos. 1414 Phase 1: Distributed cluster resource discovery. During this phase, 1415 Master nodes SHOULD identify and find available Slave nodes according 1416 to the computing request from the application (APP). The NameNode SHOULD query 1417 the PCECC about available DataNodes; the NameNode MAY provide additional 1418 constraints to the PCECC such as topological proximity and redundancy level. 1420 The PCECC SHOULD analyze the topology of the distributed cluster and perform a 1421 constraint-based path calculation from the client towards the most suitable 1422 NameNodes. The PCECC SHOULD reply to the NameNode with the list of the most suitable 1423 DataNodes and their resource capabilities.
A topology discovery 1424 mechanism for the PCECC will be added to that framework later. 1426 Phase 2: The PCECC SHOULD create a P2MP LSP from the client towards those 1427 DataNodes by means of PCEP messages, following the previously calculated 1428 path. 1430 Phase 3: The NameNode SHOULD send this information to the client, and the PCECC 1431 informs the client about the optimal P2MP path towards the DataNodes via a PCEP 1432 message. 1434 Phase 4: The client sends data blocks to those DataNodes for writing via 1435 the created P2MP tunnel. 1437 When this task is finished, the P2MP tunnel could be torn down. 1439 Authors' Addresses 1441 Quintin Zhao 1442 Huawei Technologies 1443 125 Nagog Technology Park 1444 Acton, MA 01719 1445 US 1447 Email: quintin.zhao@huawei.com 1449 Zhenbin (Robin) Li 1450 Huawei Technologies 1451 Huawei Bld., No.156 Beiqing Rd. 1452 Beijing 100095 1453 China 1455 Email: lizhenbin@huawei.com 1456 Boris Khasanov 1457 Huawei Technologies 1458 Moskovskiy Prospekt 97A 1459 St.Petersburg 196084 1460 Russia 1462 Email: khasanov.boris@huawei.com 1464 Dhruv Dhody 1465 Huawei Technologies 1466 Divyashree Techno Park, Whitefield 1467 Bangalore, Karnataka 560066 1468 India 1470 Email: dhruv.ietf@gmail.com 1472 King Ke 1473 Tencent Holdings Ltd. 1474 Shenzhen 1475 China 1477 Email: kinghe@tencent.com 1479 Luyuan Fang 1480 Expedia, Inc. 1481 USA 1483 Email: luyuanf@gmail.com 1485 Chao Zhou 1486 Cisco Systems 1488 Email: chao.zhou@cisco.com 1490 Boris Zhang 1491 Telus Communications 1493 Email: Boris.zhang@telus.com 1494 Artem Rachitskiy 1495 Mobile TeleSystems JLLC 1496 Nezavisimosti ave., 95 1497 Minsk 220043 1498 Belarus 1500 Email: arachitskiy@mts.by 1502 Anton Gulida 1503 LLC "Lifetech" 1504 Krasnoarmeyskaya str., 24 1505 Minsk 220030 1506 Belarus 1508 Email: anton.gulida@life.com.by