TEAS Working Group                                                 Z. Li
Internet-Draft                                                  D. Dhody
Intended status: Informational                       Huawei Technologies
Expires: 16 December 2022                                        Q. Zhao
                                                        Etheric Networks
                                                                   K. He
                                                   Tencent Holdings Ltd.
                                                             B. Khasanov
                                                              Yandex LLC
                                                            14 June 2022

The Use Cases for Path Computation Element (PCE) as a Central Controller
                                (PCECC).
                   draft-ietf-teas-pcecc-use-cases-10

Abstract

   The Path Computation Element (PCE) is a core component of a
   Software-Defined Networking (SDN) system. It can compute optimal
   paths for traffic across a network and can also update the paths to
   reflect changes in the network or traffic demands. PCE was developed
   to derive paths for MPLS Label Switched Paths (LSPs), which are
   supplied to the head end of the LSP using the Path Computation
   Element Communication Protocol (PCEP).

   SDN has a broader applicability than signaled MPLS traffic-engineered
   (TE) networks, and the PCE may be used to determine paths in a range
   of use cases including static LSPs, segment routing (SR), Service
   Function Chaining (SFC), and most forms of a routed or switched
   network. It is, therefore, reasonable to consider PCEP as a control
   protocol for use in these environments to allow the PCE to be fully
   enabled as a central controller.

   A PCE as a Central Controller (PCECC) can simplify the processing of
   a distributed control plane by blending it with elements of SDN and
   without necessarily completely replacing it. This document describes
   general considerations for PCECC deployment and examines its
   applicability and benefits, as well as its challenges and
   limitations, through a number of use cases. PCEP extensions required
   for stateful PCE usage are covered in separate documents.

Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in BCP
   14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 16 December 2022.

Copyright Notice

   Copyright (c) 2022 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Revised BSD License text as described in Section 4.e of the
   Trust Legal Provisions and are provided without warranty as described
   in the Revised BSD License.

Table of Contents

   1.  Introduction
   2.  Terminology
   3.  Application Scenarios
     3.1.  Use Cases of PCECC for Label Management
     3.2.  Using PCECC for SR
       3.2.1.  PCECC SID Allocation
       3.2.2.  Use Cases of PCECC for SR Best Effort (BE) Path
       3.2.3.  Use Cases of PCECC for SR Traffic Engineering (TE) Path
       3.2.4.  SR Policy
     3.3.  Use Cases of PCECC for TE LSP
       3.3.1.  PCECC Load Balancing (LB) Use Case
       3.3.2.  PCECC and Inter-AS TE
     3.4.  Use Cases of PCECC for Multicast LSPs
       3.4.1.  Using PCECC for P2MP/MP2MP LSPs' Setup
       3.4.2.  Use Cases of PCECC for the Resiliency of P2MP/MP2MP LSPs
     3.5.  Using PCECC for Traffic Classification Information
     3.6.  Use Cases of PCECC for SRv6
     3.7.  Use Cases of PCECC for SFC
     3.8.  Use Cases of PCECC for Native IP
     3.9.  Use Cases of PCECC for BIER
   4.  IANA Considerations
   5.  Security Considerations
   6.  Acknowledgments
   7.  References
     7.1.  Normative References
     7.2.  Informative References
   Appendix A.  Other Use Cases of PCECC
     A.1.  Use Cases of PCECC for LSP in the Network Migration
     A.2.  Use Cases of PCECC for L3VPN and PWE3
     A.3.  Use Cases of PCECC for Local Protection (RSVP-TE)
     A.4.  Using reliable P2MP TE based multicast delivery for
           distributed computations (MapReduce-Hadoop)
   Appendix B.  Contributor Addresses
   Authors' Addresses

1.  Introduction

   The Path Computation Element (PCE) [RFC4655] was developed to offload
   the path computation function from routers in an MPLS
   traffic-engineered (TE) network. It can compute optimal paths for
   traffic across a network and can also update the paths to reflect
   changes in the network or traffic demands. Since then, the role and
   function of the PCE have grown to cover a number of other uses (such
   as GMPLS [RFC7025]) and to allow delegated control [RFC8231] and
   PCE-initiated use of network resources [RFC8281].

   According to [RFC7399], Software-Defined Networking (SDN) refers to a
   separation between the control elements and the forwarding components
   so that software running in a centralized system, called a
   controller, can act to program the devices in the network to behave
   in specific ways. A required element in an SDN architecture is a
   component that plans how the network resources will be used and how
   the devices will be programmed. It is possible to view this
   component as performing specific computations to place traffic flows
   within the network given knowledge of the availability of network
   resources, how other forwarding devices are programmed, and the way
   that other flows are routed. This is the function and purpose of a
   PCE, and the way that a PCE integrates into a wider network control
   system (including an SDN system) is presented in [RFC7491].

   [RFC8283] introduces the architecture for the PCE as a central
   controller as an extension to the architecture described in [RFC4655]
   and assumes the continued use of PCEP as the protocol used between
   the PCE and PCC.
   [RFC8283] further examines the motivations and applicability for
   PCEP as a Southbound Interface (SBI) and introduces the implications
   for the protocol.

   [RFC9050] introduces the procedures and extensions for PCEP to
   support the PCECC architecture [RFC8283].

   This document describes various other use cases for the PCECC
   architecture.

2.  Terminology

   The following terminology is used in this document.

   IGP:  Interior Gateway Protocol. Either of the two routing
      protocols, Open Shortest Path First (OSPF) or Intermediate System
      to Intermediate System (IS-IS).

   PCC:  Path Computation Client. Any client application requesting a
      path computation to be performed by a Path Computation Element.

   PCE:  Path Computation Element. An entity (component, application,
      or network node) that is capable of computing a network path or
      route based on a network graph and applying computational
      constraints.

   PCECC:  PCE as a Central Controller. Extension of PCE to support SDN
      functions as per [RFC8283].

   TE:  Traffic Engineering.

3.  Application Scenarios

   In the following sections, several use cases are described,
   showcasing scenarios that benefit from the deployment of PCECC.

3.1.  Use Cases of PCECC for Label Management

   As per [RFC8283], in some cases, the PCE-based controller can take
   responsibility for managing some part of the MPLS label space for
   each of the routers that it controls, and it may take wider
   responsibility for partitioning the label space for each router and
   allocating different parts for different uses, communicating the
   ranges to the router using PCEP.

   [RFC9050] describes a mode where LSPs are provisioned as explicit
   label instructions at each hop on the end-to-end path. Each router
   along the path must be told what label forwarding instructions to
   program and what resources to reserve.
   The controller uses PCEP to communicate with each router along the
   path of the end-to-end LSP. For this to work, the PCE-based
   controller will take responsibility for managing some part of the
   MPLS label space for each of the routers that it controls. PCEP
   could be extended to allow a PCC to inform the PCE of the label space
   that the PCE is to control. (See [I-D.li-pce-controlled-id-space]
   for a possible PCEP extension to support advertisement of the MPLS
   label space to the PCE to control.)

   [RFC8664] specifies extensions to PCEP that allow a stateful PCE to
   compute, update, or initiate SR-TE paths.
   [I-D.ietf-pce-pcep-extension-pce-controller-sr] describes the
   mechanism for a PCECC to allocate and provision the
   node/prefix/adjacency label (SID) via PCEP. To make such an
   allocation, the PCE needs to be aware of the label space from the
   Segment Routing Global Block (SRGB) or Segment Routing Local Block
   (SRLB) [RFC8402] of the node that it controls. A mechanism for a PCC
   to inform the PCE of such a label space to control is needed within
   PCEP. The full SRGB/SRLB of a node could also be learned via
   existing IGP or BGP-LS mechanisms.

   Further, there have been various proposals for Global Labels in MPLS.
   The PCECC architecture could be used as a means to learn the label
   space of nodes and could also be used to determine and provision the
   global label range.

   +------------------------------+    +------------------------------+
   |         PCE DOMAIN 1         |    |         PCE DOMAIN 2         |
   |          +--------+          |    |          +--------+          |
   |          |        |          |    |          |        |          |
   |          | PCECC1 |  ---------PCEP---------- | PCECC2 |          |
   |          |        |          |    |          |        |          |
   |          |        |          |    |          |        |          |
   |          +--------+          |    |          +--------+          |
   |         ^          ^         |    |         ^          ^         |
   |        /            \  PCEP  |    |  PCEP  /            \        |
   |       V              V       |    |       V              V       |
   | +--------+        +--------+ |    | +--------+        +--------+ |
   | |NODE 11 |        | NODE 1n| |    | |NODE 21 |        | NODE 2n| |
   | |        | ...... |        | |    | |        | ...... |        | |
   | | PCECC  |        |  PCECC | |    | | PCECC  |        |PCECC   | |
   | |Enabled |        | Enabled| |    | |Enabled |        |Enabled | |
   | +--------+        +--------+ |    | +--------+        +--------+ |
   |                              |    |                              |
   +------------------------------+    +------------------------------+

                  Figure 1: PCECC for Label Management

   *  The PCC would advertise the PCECC capability to the PCE (central
      controller-PCECC) [RFC9050].

   *  The PCECC could also learn the label range set aside by the PCC
      ([I-D.li-pce-controlled-id-space]).

   *  Optionally, the PCECC could determine the shared MPLS global label
      range for the network.

      -  In the case that the shared global label range needs to be
         negotiated across multiple domains, the central controllers of
         these domains would also need to negotiate a common global
         label range across domains.

      -  The PCECC would need to set the shared global label range on
         all PCC nodes in the network.

3.2.  Using PCECC for SR

   Segment Routing (SR) leverages the source routing paradigm. Using
   SR, a source node steers a packet through a path without relying on
   hop-by-hop signaling protocols such as LDP or RSVP-TE. Each path is
   specified as an ordered list of instructions called "segments". Each
   segment is an instruction to route the packet to a specific place in
   the network, or to perform a specific service on the packet. A
   database of segments can be distributed through the network using a
   routing protocol (such as IS-IS or OSPF) or by any other means. PCEP
   (and PCECC) could be one such means.

   [RFC8664] specifies the SR-specific PCEP extensions. A PCECC may
   further use PCEP to distribute SR Segment Identifiers (SIDs) to the
   SR nodes (PCCs), with some benefits.
   Having the PCECC allocate and maintain the SIDs for the nodes and
   adjacencies in the network, and distribute them to the SR nodes
   directly over the PCEP session, has some advantages over configuring
   each SR node and flooding via the IGP, especially in an SDN
   environment.

   When the PCECC is used for the distribution of the node SID and
   adjacency SID, the node SID is allocated from the SRGB of the node.
   The adjacency SID is allocated from the SRLB of the node, as
   described in [I-D.ietf-pce-pcep-extension-pce-controller-sr].

   [RFC8355] identifies various protection and resiliency use cases for
   SR. Path protection lets the ingress node be in charge of the
   failure recovery (used for SR-TE). Protection can also be performed
   by the node adjacent to the failed component, commonly referred to as
   local protection techniques or fast-reroute (FRR) techniques. In the
   case of PCECC, the protection paths can be pre-computed and set up by
   the PCE.

   The following example illustrates the use case where the node SID and
   adjacency SID are allocated by the PCECC.

                          192.0.2.1/32
                          +----------+
                          | R1(1001) |
                          +----------+
                               |
                          +----------+
                          | R2(1002) |  192.0.2.2/32
                          +----------+
                         *   |   *    *
                        *    |   *     *
                       *link1|   *      *
        192.0.2.4/32  *      |   *link2  *  192.0.2.5/32
           +-----------+ 9001|   *     +-----------+
           | R4(1004)  |     |   *     | R5(1005)  |
           +-----------+     |   *     +-----------+
                      *      |   *9003  *  +
                       *     |   *     *   +
                        *    |   *    *    +
                       +-----------+  +-----------+
        192.0.2.3/32   | R3(1003)  |  |R6(1006)   |192.0.2.6/32
                       +-----------+  +-----------+
                            |
                       +-----------+
                       | R8(1008)  |  192.0.2.8/32
                       +-----------+

3.2.1.  PCECC SID Allocation

   Each node (PCC) is allocated a node SID by the PCECC. The PCECC
   needs to update the label map of each node to all the nodes in the
   domain.
   On receiving the label map, each node (PCC) uses the local routing
   information to determine the next hop and downloads the label
   forwarding instructions accordingly. The forwarding behavior and the
   end result are the same as with an IGP-based node SID in SR. Thus,
   from anywhere in the domain, it enforces the ECMP-aware shortest-path
   forwarding of the packet towards the related node.

   For each adjacency in the network, the PCECC can allocate an Adj-SID.
   The PCECC sends a PCInitiate message to update the label map of each
   adjacency to the corresponding nodes in the domain. Each node (PCC)
   downloads the label forwarding instructions accordingly. The
   forwarding behavior and the end result are similar to those of an
   IGP-based Adj-SID in SR.

   These mechanisms are described in
   [I-D.ietf-pce-pcep-extension-pce-controller-sr].

3.2.2.  Use Cases of PCECC for SR Best Effort (BE) Path

   In this mode of the solution, the PCECC only needs to allocate the
   node SID (without calculating an explicit path for the SR path).
   The ingress of the forwarding path only needs to encapsulate the
   destination node SID on top of the packet. All the intermediate
   nodes will forward the packet based on the destination node SID. It
   is similar to the LDP LSP.

   R1 may send a packet to R8 simply by pushing an SR header with
   segment list {1008} (the node SID for R8). The path would be based
   on the routing/next-hop calculation on the routers.

3.2.3.  Use Cases of PCECC for SR Traffic Engineering (TE) Path

   SR-TE paths may not follow an IGP SPT. Such paths may be chosen by a
   PCECC and provisioned on the ingress node of the SR-TE path. The SR
   header consists of a list of SIDs (or MPLS labels). The header has
   all the necessary information so that packets can be guided from the
   ingress node to the egress node of the path; hence, there is no need
   for any signaling protocol.
   Where a strict traffic-engineered path is needed, all the adjacency
   SIDs are stacked; otherwise, a combination of node SIDs and adjacency
   SIDs can be used for the SR-TE paths.

   Note that bandwidth reservation is guaranteed only at the controller,
   through enforcement of bandwidth admission control. In the RSVP-TE
   LSP case, by contrast, the control-plane signaling also reserves link
   bandwidth at each hop of the path.

   The SR traffic engineering path examples are explained below.

   Note that the node SID for each node is allocated from the SRGB, and
   the adjacency SID for each link is allocated from the SRLB of each
   node.

   Example 1:

   R1 may send a packet P1 to R8 simply by pushing an SR header with
   segment list {1008}. Based on the best path, it could be:
   R1-R2-R3-R8.

   Example 2:

   R1 may send a packet P2 to R8 by pushing an SR header with segment
   list {1002, 9001, 1008}. The path should be: R1-R2-link1-R3-R8.

   Example 3:

   R1 may send a packet P3 to R8 via R4 by pushing an SR header with
   segment list {1004, 1008}. The path could be: R1-R2-R4-R3-R8.

   The local protection examples for the SR-TE path are explained below:

   Example 4: local link protection:

   *  R1 may send a packet P4 to R8 by pushing an SR header with segment
      list {1002, 9001, 1008}. The path should be: R1-R2-link1-R3-R8.

   *  When node R2 receives the packet from R1, with the remaining
      header link1-R3-R8, and finds that link1 has failed, R2 can steer
      the traffic over the bypass and send out the packet with the
      header R3-R8 through link2.

   Example 5: local node protection:

   *  R1 may send a packet P5 to R8 by pushing an SR header with segment
      list {1004, 1008}. The path could be: R1-R2-R4-R3-R8.

   *  When node R2 receives the packet from R1, with the header
      {1004, 1008}, and finds that node R4 has failed, it can steer the
      traffic over the bypass and send out the packet with the header
      {1005, 1008} to R5 instead of R4.

3.2.4.  SR Policy

   [RFC8402] defines the Segment Routing architecture, which uses an SR
   Policy to steer packets from a node through an ordered list of
   segments. The SR Policy could be configured on the headend or
   instantiated by an SR controller. The SR architecture does not
   restrict how the controller programs the network. The options are
   Network Configuration Protocol (NETCONF), PCEP, and BGP. An SR
   Policy can be based on either the SR-MPLS or the SRv6 data plane.

   An SR Policy architecture is described in
   [I-D.ietf-spring-segment-routing-policy]. An SR Policy is a
   framework that enables the instantiation of an ordered list of
   segments on a node for implementing a source routing policy for the
   steering of traffic for a specific purpose (e.g., for a specific SLA)
   from that node.

   An SR Policy is identified through the tuple <headend, color,
   endpoint>. In the context of a specific headend, one may identify an
   SR Policy by the <color, endpoint> tuple.

   The headend is the node where the policy is instantiated/implemented.
   The endpoint indicates the destination of the policy. The color is a
   32-bit numerical value that associates the SR Policy with an intent
   or objective.

   An SR Policy should have one or more candidate paths. A candidate
   path is the unit for signaling of an SR Policy to a headend via
   protocol extensions like [I-D.ietf-pce-segment-routing-policy-cp] or
   BGP SR Policy [I-D.ietf-idr-segment-routing-te-policy]. Each
   candidate path must have one or more Segment-Lists. A Segment-List
   represents a specific source-routed path to send traffic from the
   headend to the endpoint of the corresponding SR Policy.
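
   As an illustration only, the relationship between an SR Policy, its
   candidate paths, and their Segment-Lists can be sketched as a small
   data model. The sketch is not part of any PCEP specification: the
   class and field names are hypothetical, the color value 100 is
   arbitrary, and the "preference" field with its highest-value-wins
   rule is an assumption borrowed from the SR Policy architecture
   rather than something defined in this section. The SIDs reuse the
   node and adjacency SIDs of the example topology in Section 3.2.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass(frozen=True)
class SegmentList:
    """One source-routed path, as an ordered tuple of SIDs."""
    sids: Tuple[int, ...]


@dataclass
class CandidatePath:
    """Unit in which an SR Policy is signaled to a headend."""
    preference: int  # assumption: not defined in this section
    segment_lists: List[SegmentList]

    def __post_init__(self) -> None:
        # Each candidate path must have one or more Segment-Lists.
        if not self.segment_lists:
            raise ValueError("candidate path needs a Segment-List")


@dataclass
class SRPolicy:
    """SR Policy keyed by the (headend, color, endpoint) tuple."""
    headend: str
    color: int
    endpoint: str
    candidate_paths: List[CandidatePath] = field(default_factory=list)

    def __post_init__(self) -> None:
        # The color is a 32-bit numerical value.
        if not 0 <= self.color < 2**32:
            raise ValueError("color must fit in 32 bits")

    @property
    def key(self) -> Tuple[str, int, str]:
        return (self.headend, self.color, self.endpoint)

    def active_path(self) -> CandidatePath:
        # Assumed selection rule: the highest preference wins.
        # Raises ValueError if no candidate path has been added yet.
        return max(self.candidate_paths, key=lambda cp: cp.preference)


# Hypothetical policy on headend R1 towards endpoint R8.
policy = SRPolicy(headend="192.0.2.1", color=100, endpoint="192.0.2.8")
policy.candidate_paths.append(
    CandidatePath(100, [SegmentList((1002, 9001, 1008))]))
policy.candidate_paths.append(
    CandidatePath(200, [SegmentList((1004, 1008))]))
print(policy.key, policy.active_path().segment_lists[0].sids)
```

   In this sketch, a PCECC-style controller would own the SRPolicy
   objects and push the selected candidate path to the headend. How
   that is encoded in PCEP is left to the referenced extensions.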

   A candidate path is either dynamic, explicit, or composite. For the
   PCECC use case, a candidate path should be either dynamic (i.e., the
   PCE provides the path according to a specific optimization objective)
   or composite (a composite candidate path construct enables the
   combination of SR Policies, each with explicit candidate paths and/or
   dynamic candidate paths with potentially different optimization
   objectives and constraints).

   [I-D.ietf-pce-segment-routing-policy-cp] defines a new ASSOCIATION
   type that binds previously separate PCEP LSPs (candidate paths) into
   a common SR Policy hierarchy. This is applicable in the PCECC
   scenario as well.

   Further, one could also use the PCECC mechanism directly to create an
   SR Policy container at the PCC by defining a new CCI (Central
   Controller Instructions [RFC9050]) for it. The advantage of that
   approach would be to allow an SR Policy to be created without
   signaling candidate paths.

3.3.  Use Cases of PCECC for TE LSP

   Section 3.2 discusses SR paths set up via the PCECC. Although those
   cases offer simplicity and scalability, some existing traffic
   engineering functions, such as bandwidth guarantees and monitoring,
   are complex to provide with SR-based solutions. There are also cases
   where the depth of the label stack is an issue for existing
   deployments and certain vendors.

   To address these issues, the PCECC architecture also supports TE LSP
   functionality. To achieve this, the existing PCEP can be used to
   communicate between the PCECC and the nodes along the path. This is
   similar to static LSPs, where LSPs can be provisioned as explicit
   label instructions at each hop on the end-to-end path. Each router
   along the path must be told what label-forwarding instructions to
   program and what resources to reserve.
   The PCE-based controller keeps a view of the network and determines
   the paths of the end-to-end LSPs, and the controller uses PCEP to
   communicate with each router along the path of the end-to-end LSP.

                          192.0.2.1/32
                          +----------+
                          |    R1    |
                          +----------+
                           |        |
                           |link1   |
                           |        |link2
                          +----------+
                          |    R2    |  192.0.2.2/32
                          +----------+
                   link3 *   |   *    *  link4
                        *    |   *     *
                       *link5|   *      *
        192.0.2.4/32  *      |   *link6  *  192.0.2.5/32
           +-----------+     |   *     +-----------+
           |    R4     |     |   *     |    R5     |
           +-----------+     |   *     +-----------+
                      *      |   *      *     +
              link10   *     |   *     *link7 +
                        *    |   *    *       +
                       +-----------+  +-----------+
        192.0.2.3/32   |    R3     |  |R6         |192.0.2.6/32
                       +-----------+  +-----------+
                        |         |
                        |link8    |
                        |         |link9
                       +-----------+
                       |    R8     |  192.0.2.8/32
                       +-----------+

                  Figure 2: PCECC TE LSP Setup Example

   *  Based on a path computation request, a delegation, or PCE
      initiation, the PCECC receives a request with constraints and
      optimization criteria.

   *  The PCECC would calculate the optimal path according to the given
      constraints (e.g., bandwidth).

   *  The PCECC would provision each node along the path and assign
      incoming and outgoing labels from R1 to R8 with the path: {R1,
      link1, 1001}, {1001, R2, link3, 2003}, {2003, R4, link10, 4010},
      {4010, R3, link8, 3008}, {3008, R8}.

   *  For end-to-end protection, the PCECC programs each node along the
      path from R1 to R8 with the secondary path: {R1, link2, 1002},
      {1002, R2, link4, 2004}, {2004, R5, link7, 5007}, {5007, R3,
      link9, 3009}, {3009, R8}.

   *  It is also possible for the PCECC to set up a bypass path for
      local protection. For example, with the primary path as above, to
      protect the node R4 locally, the PCECC can program the bypass path
      like this: {R2, link5, 2005}, {2005, R3}. By doing this, the node
      R4 is locally protected at R2.

3.3.1.  PCECC Load Balancing (LB) Use Case

   Service providers often use TE tunnels to address non-deterministic
   paths in their networks. One example is the use of TE tunnels in the
   mobile backhaul (MBH). Consider the following topology:

                             TE1 -------------->
+---------+    +--------+    +--------+    +--------+    +------+  +---+
| Access  |----| Access |----| AGG 1  |----| AGG N-1|----|Core 1|--|SR1|
| SubNode1|    | Node 1 |    +--------+    +--------+    +------+  +---+
+---------+    +--------+        |             |           | ^       |
     | Access      | Access      | AGG Ring 1  |             |       |
     | SubRing 1   | Ring 1      |             |             |       |
+---------+    +--------+    +--------+        |             |       |
| Access  |    | Access |    | AGG 2  |        |             |       |
| SubNode2|    | Node 2 |    +--------+        |             |       |
+---------+    +--------+        |             |           | |       |
     |             |             |             |           | |       |
     |             |             |             +----TE2----|-+       |
+---------+    +--------+    +--------+    +--------+    +------+  +---+
| Access  |    | Access |----| AGG 3  |----| AGG N  |----|Core N|--|SRn|
| SubNodeN|----| Node N |    +--------+    +--------+    +------+  +---+
+---------+    +--------+

   This MBH architecture uses L2 access rings and sub-rings. L3 starts
   at the aggregation layer. For the sake of simplicity, the figure
   shows only one access sub-ring, access ring, and aggregation ring
   (AGG1...AGGN), connected by Nx10GE interfaces. The aggregation
   domain runs its own IGP. There are two egress routers (AGG N-1 and
   AGG N) that are connected to the core domain via L2 interfaces. The
   core also has connections to service routers. RSVP-TE is used for
   MPLS transport inside the ring. There could be at least two tunnels
   (one-way) from each AGG router to the egress AGG routers. There are
   also many L2 access rings connected to AGG routers.

   Services are deployed by means of either L2VPNs (VPLS) or L3VPNs.
   Those services use MPLS TE as transport towards the egress AGG
   routers.
   TE tunnels could also be used as transport towards service routers in
   a future seamless-MPLS-based architecture.

   There is a need to solve the following tasks:

   *  Perform automatic load balancing among TE tunnels according to the
      current traffic load.

   *  TE bandwidth (BW) management: provide guaranteed BW for specific
      services (High-Speed Internet (HSI), IPTV, etc.) and time-based BW
      reservation (Bandwidth on Demand (BoD)) for other services.

   *  Simplify development of TE tunnels by automation without any
      manual intervention.

   *  Provide flexibility for service router placement (anywhere in the
      network by creation of transport LSPs to them).

   Since the other tasks are already covered by other PCECC use cases,
   this section focuses on the load balancing (LB) task. The LB task
   could be solved by means of the PCECC in the following way:

   *  An application, network service, or operator can ask the SDN
      controller (PCECC) for LSP-based load balancing between AGG X and
      AGG N/AGG N-1 (the egress AGG routers that have connections to the
      core). Each of these requests would have associated constraints,
      such as bandwidth, inclusion or exclusion of specific links or
      nodes, number of paths, objective function (OF), and the need for
      disjoint LSP paths.

   *  The PCECC could calculate multiple (say N) LSPs according to the
      given constraints. The calculation is based on the results of the
      Objective Function (OF) [RFC5541], the endpoints, the same or
      different bandwidth (BW), different links (in the case of disjoint
      paths), and other constraints.

   *  Depending on the given LSP Path Setup Type (PST), the PCECC would
      download instructions to the PCC. At this stage, it is assumed
      that the PCECC is aware of the label space it controls and, in the
      case of SR, that the SID allocation and distribution is already
      done.

   *  The PCECC would send a PCInitiate message [RFC8281] towards the
      ingress AGG X router (PCC) for each of the N LSPs and receive a
      PCRpt message [RFC8231] back from the PCC.
      If the PST is PCECC-SR, the PCECC would include the SID stack as
      per [RFC8664]. If the PST is PCECC (basic), the PCECC would
      assign labels along the calculated path and set up the path by
      sending central controller instructions in PCEP messages to each
      node along the path of the LSP, as per [RFC9050]. It would then
      send a PCUpd message to the ingress AGG X router with information
      about the new LSP, and AGG X (PCC) would respond with a PCRpt
      message carrying the LSP status.

   *  AGG X, as the ingress router, now has N LSPs towards AGG N and AGG
      N-1, which are available for installation in the router's
      forwarding table and for load balancing traffic between them.
      Traffic distribution between those LSPs depends on the particular
      hash function implemented on that router.

   *  Since the PCECC is aware of the TEDB (TE state) and the LSP-DB, it
      can manage and prevent possible over-subscriptions and limit the
      number of available LB states. Via the PCECC mechanism, the
      controller can act quickly on the network by directly provisioning
      the central controller instructions.

3.3.2.  PCECC and Inter-AS TE

   There are various signaling options for establishing Inter-AS TE
   LSPs: contiguous TE LSP [RFC5151], stitched TE LSP [RFC5150], and
   nested TE LSP [RFC4206].

   Requirements for PCE-based Inter-AS setup [RFC5376] describe the
   approach and PCEP functionality that are needed for establishing
   Inter-AS TE LSPs.

   [RFC5376] also gives an Inter- and Intra-AS PCE Reference Model,
   which is reproduced below in shortened form for the sake of
   simplicity.

                 Inter-AS       Inter-AS
           PCC <-->PCE1<--------->PCE2
            ::      ::             ::
            ::      ::             ::
            R1----ASBR1====ASBR3---R3---ASBR5
            |  AS1   |        |    PCC     |
            |        |        |    AS2     |
            R2----ASBR2====ASBR4---R4---ASBR6
            ::                     ::
            ::                     ::
         Intra-AS               Intra-AS
           PCE3                   PCE4

    Figure 3: Shortened form of the Inter- and Intra-AS PCE Reference
                             Model [RFC5376]

   PCECCs belonging to different domains can cooperate to set up
   inter-AS TE LSPs.
The stateful H-PCE [RFC8751] mechanism could also 677 be used to first establish a per-domain PCECC LSP. These could be 678 stitched together to form inter-AS TE LSP as described in 679 [I-D.ietf-pce-stateful-interdomain]. 681 For the sake of simplicity, here after the focus is on a simplified 682 Inter-AS case when both AS1 and AS2 belong to the same service 683 provider administration. In that case Inter and Intra-AS PCEs could 684 be combined in one single PCE if such combined PCE performance is 685 enough for handling all path computation request and setup. There is 686 a potential to use a single PCE for both ASes if the scalability and 687 performance are enough. The PCE would require interfaces (PCEP and 688 BGP-LS) to both domains. PCECC redundancy mechanisms are described 689 in [RFC8283]. Thus routers in AS1 and AS2 (PCCs) can send PCEP 690 messages towards same PCECC. 692 +----BGP-LS------+ +------BGP-LS-----+ 693 | | | | 694 +-PCEP-|----++-+-------PCECC-----PCEP--++-+-|-------+ 695 +-:------|----::-:-+ +--::-:-|-------:---+ 696 | : | :: : | | :: : | : | 697 | : RR1 :: : | | :: : RR2 : | 698 | v v: : | LSP1 | :: v v | 699 | R1---------ASBR1=======================ASBR3--------R3 | 700 | | v : | | :v | | 701 | +----------ASBR2=======================ASBR4---------+ | 702 | | Region 1 : | | : Region 1 | | 703 |----------------:-| |--:-------------|--| 704 | | v | LSP2 | v | | 705 | +----------ASBR5=======================ASBR6---------+ | 706 | Region 2 | | Region 2 | 707 +------------------+ <--------------> +-------------------+ 708 MPLS Domain 1 Inter-AS MPLS Domain 2 709 <=======AS1=======> <========AS2=======> 711 Figure 4: Particular case of Inter-AS PCE 713 In a case of PCECC Inter-AS TE scenario where service provider 714 controls both domains (AS1 and AS2), each of them have own IGP and 715 MPLS transport. There is a need is to setup Inter-AS LSPs for 716 transporting different services on top of them (Voice, L3VPN etc.). 
Inter-AS links with different capacities exist in several regions.
The task is not only to provision those Inter-AS LSPs with the given
constraints but also to calculate the path and pre-set up the backup
Inter-AS LSPs that will be used if the primary LSP fails.

As per Figure 4, LSP1 from R1 to R3 goes via ASBR1 and ASBR3, and it
is the primary Inter-AS LSP.  LSP2 from R1 to R3, which goes via
ASBR5 and ASBR6, is the backup.  In addition, there could also be a
bypass LSP set up to protect against ASBR or inter-AS link failure.

After the addition of PCECC functionality to the PCE (SDN
controller), the PCECC-based Inter-AS TE model SHOULD follow the
PCECC use case for TE LSPs and the requirements of [RFC5376], with
the following details:

*  Since PCECC needs to know the topology of both domains AS1 and
   AS2, PCECC could use BGP-LS peering with routers (or RRs) in both
   domains.

*  PCECC needs PCEP connectivity towards all routers in both domains
   (see also Section 4 of [RFC5376]), in a similar manner to an SDN
   controller.

*  After the operator's application or service orchestrator creates a
   request for a tunnel for a specific service, PCECC should receive
   that request via the NBI (the NBI type is implementation
   dependent; it could be NETCONF/YANG, REST, etc.).  PCECC would
   then calculate the optimal path based on the Objective Function
   (OF) and the given constraints (e.g., path setup type, bandwidth,
   etc.), including those from [RFC5376]: priority, AS sequence,
   preferred ASBR, disjoint paths, and protection.  At this step,
   there would be two paths: R1-ASBR1-ASBR3-R3 and R1-ASBR5-ASBR6-R3.

*  Depending on the given LSP PST (PCECC or PCECC-SR), PCECC would
   download central control instructions to the PCC.  At this stage,
   it is assumed that the PCECC is aware of the label space it
   controls and, in the case of SR, that the SID allocation and
   distribution is already done.
*  PCECC would send a PCInitiate message [RFC8281] towards the ingress
   router R1 (PCC) in AS1 and receive a PCRpt PCEP message [RFC8231]
   back from the PCC.  If the PST is PCECC-SR, the PCECC would
   include the SID stack as per [RFC8664].  It may also include a
   binding SID based on the AS boundary.  The backup SID stack could
   also be installed at the ingress but, more importantly, each node
   along the SR path could also do local protection based only on the
   top segment.  If the PST is PCECC (basic), the PCECC would assign
   labels along the calculated paths (R1-ASBR1-ASBR3-R3,
   R1-ASBR5-ASBR6-R3), set up the paths by sending central controller
   instructions in PCEP messages to each node along the path of the
   LSPs as per [RFC9050], and then send a PCUpd message to the
   ingress router R1 with information about the new LSPs.  R1 would
   respond with a PCRpt message carrying the LSP status.

*  After that step, R1 has primary and backup TE LSPs (LSP1 and LSP2)
   towards R3.  How to switch over to the backup LSP2 if LSP1 fails
   is up to the router implementation.

3.4.  Use Cases of PCECC for Multicast LSPs

Multicast LSPs can be set up via the RSVP-TE P2MP or mLDP protocols.
The setup of these LSPs may require manual configuration and complex
signaling when protection is considered.  By using the PCECC
solution, the multicast LSP can be computed and set up through a
centralized controller that has the full picture of the topology and
bandwidth usage for each link.  This not only reduces the
configuration complexity compared to distributed RSVP-TE P2MP or mLDP
signaling, but also allows the disjoint primary and secondary P2MP
paths to be computed efficiently.

3.4.1.  Using PCECC for P2MP/MP2MP LSPs' Setup

It is assumed that the PCECC is aware of the label space it controls
for all nodes and makes allocations accordingly.
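As an informal illustration of the per-node label instructions used in the P2MP example of this section (root R1, branch node R2, leaves R6 and R8), the following Python sketch models the label operations as a lookup table and follows a packet from the root.  The table layout and all function and variable names are hypothetical and chosen for illustration only; they are not an encoding defined by PCEP or by [RFC9050].

```python
# Hypothetical per-node label instructions, using the label values from
# the P2MP example in this section. The layout is an illustrative
# assumption, not a PCEP encoding.
# node -> {incoming_label: [(outgoing_label, next_hop), ...]}
# An empty action list means the label is popped at a leaf.
INGRESS = ("R1", 6000, "R2")  # root R1 pushes 6000 and sends to R2

LABEL_TABLE = {
    "R2": {6000: [(9001, "R4"), (9002, "R5")]},  # branch node: two copies
    "R4": {9001: [(9003, "R3")]},
    "R5": {9002: [(9004, "R6")]},
    "R3": {9003: [(9005, "R8")]},
    "R6": {9004: []},  # leaf: pop
    "R8": {9005: []},  # leaf: pop
}


def forward(node, label, table):
    """Return the set of leaves reached by a packet arriving at node."""
    actions = table[node][label]
    if not actions:
        return {node}  # label popped, packet delivered here
    leaves = set()
    for out_label, next_hop in actions:
        # One copy of the packet per action (swap label, send to next hop).
        leaves |= forward(next_hop, out_label, table)
    return leaves


def deliver(ingress, table):
    """Push the ingress label at the root and replicate down the tree."""
    _root, pushed_label, first_hop = ingress
    return forward(first_hop, pushed_label, table)


print(sorted(deliver(INGRESS, LABEL_TABLE)))  # ['R6', 'R8']
```

The only entry with more than one action is the one at R2, which reflects the branch node instruction that distinguishes a P2MP LSP from a point-to-point LSP.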
                     +----------+
                     |    R1    | Root node of the multicast LSP
                     +----------+
                          |6000
                     +----------+
      Transit Node   |    R2    |
      branch         +----------+
                     *  |   *  *
                 9001*  |   *   *9002
                     *  |   *    *
         +-----------+  |   *     +-----------+
         |    R4     |  |   *     |    R5     | Transit Nodes
         +-----------+  |   *     +-----------+
                     *  |   *      *     +
                 9003*  |   *      *     +9004
                     *  |   *      *     +
                     +-----------+  +-----------+
                     |    R3     |  |    R6     | Leaf Node
                     +-----------+  +-----------+
                      9005|
                     +-----------+
                     |    R8     | Leaf Node
                     +-----------+

The P2MP example is explained here, where R1 is the root and R8 and
R6 are the leaves.

*  Based on the P2MP path computation request / delegation or PCE
   initiation, the PCECC receives the PCECC request with constraints
   and optimization criteria.

*  PCECC would calculate the optimal P2MP path according to the given
   constraints (e.g., bandwidth).

*  PCECC would provision each node along the path and assign incoming
   and outgoing labels from R1 to {R6, R8} with the path: {R1, 6000},
   {6000, R2, {9001,9002}}, {9001, R4, 9003}, {9002, R5, 9004},
   {9003, R3, 9005}, {9004, R6}, {9005, R8}.  The main difference is
   in the branch node instruction at R2, where two copies of the
   packet are sent towards R4 and R5 with labels 9001 and 9002,
   respectively.

The packet forwarding involves the following steps:

Step 1:  R1 sends a packet P1 to R2 simply by pushing a label of
   6000 onto the packet.

Step 2:  After R2 receives the packet with label 6000, it forwards
   one copy to R4, swapping the label to 9001, and another copy to
   R5, swapping the label to 9002.

Step 3:  After R4 receives the packet with label 9001, it forwards
   the packet to R3, swapping the label to 9003.  After R5 receives
   the packet with label 9002, it forwards the packet to R6, swapping
   the label to 9004.

Step 4:  After R3 receives the packet with label 9003, it forwards
   the packet to R8, swapping the label to 9005.  The copy that R5
   sent with label 9004 is delivered to R6.
Step 5:  When the packet is received at R8, label 9005 is popped;
   when the packet is received at R6, label 9004 is popped.

3.4.2.  Use Cases of PCECC for the Resiliency of P2MP/MP2MP LSPs

3.4.2.1.  PCECC for the End-to-End Protection of the P2MP/MP2MP LSPs

This section describes the end-to-end managed path protection service
as well as the local protection with the operation management in the
PCECC network for the P2MP/MP2MP LSP.

An end-to-end protection principle can be applied for computing
backup P2MP or MP2MP LSPs.  During computation of the primary
multicast trees, the PCECC server may also take the computation of a
secondary tree into consideration.  A PCE may compute the primary and
backup P2MP (or MP2MP) LSPs together or sequentially.

                        +----+  +----+
       Root node of LSP | R1 |--| R11|
                        +----+  +----+
                          /         +
                       10/           +20
                        /             +
              +----------+        +-----------+
 Transit Node |    R2    |        |     R3    |
              +----------+        +-----------+
                |        \       +         +
                |         \     +          +
              10|        10\   +20       20+
                |           \ +            +
                |            \             +
                |           + \            +
              +-----------+      +-----------+ Leaf Nodes
              |    R4     |      |    R5     | (Downstream LSR)
              +-----------+      +-----------+

In the example above, when the PCECC sets up the primary multicast
tree from the root node R1 to the leaves, which is R1->R2->{R4, R5},
it can set up the backup tree, which is R1->R11->R3->{R4, R5}, at the
same time.  Both the primary and the secondary forwarding trees will
be downloaded to each router along the primary path and the secondary
path.  The traffic will normally be forwarded through the
R1->R2->{R4, R5} path.  When a node in the primary tree fails (say
R2), the root node R1 will switch the flow to the backup tree, which
is R1->R11->R3->{R4, R5}.  By using the PCECC, the path computation
and forwarding path downloading can all be done without the complex
signaling used in P2MP RSVP-TE or mLDP.

3.4.2.2.
PCECC for the Local Protection of the P2MP/MP2MP LSPs

This section describes the local protection service in the PCECC
network for the P2MP/MP2MP LSP.

While the PCECC sets up the primary multicast tree, it can also build
the backup LSP among the PLR, the protected node, and the MPs (the
downstream nodes of the protected node).  In cases where the number
of downstream nodes is large, this mechanism can avoid unnecessary
packet duplication on the PLR and protect the network from the risk
of traffic congestion.

                       +------------+
                       |     R1     | Root Node
                       +------------+
                              .
                              .
                              .
                       +------------+ Point of Local Repair/
                       |     R10    | Switchover Point
                       +------------+ (Upstream LSR)
                          /         +
                       10/           +20
                        /             +
              +----------+        +-----------+
Protected Node|    R20   |        |    R30    |
              +----------+        +-----------+
                |        \       +         +
                |         \     +          +
              10|        10\   +20       20+
                |           \ +            +
                |            \             +
                |           + \            +
              +-----------+      +-----------+ Merge Point
              |    R40    |      |    R50    | (Downstream LSR)
              +-----------+      +-----------+
                    .                  .
                    .                  .

In the example above, when the PCECC sets up the primary multicast
path through the PLR node R10 and the protected node R20, which is
R10->R20->{R40, R50}, it can set up the backup path
R10->R30->{R40, R50} at the same time.  Both the primary forwarding
path and the secondary bypass forwarding path will be downloaded to
each router along the primary path and the secondary bypass path.
The traffic will normally be forwarded through the
R10->R20->{R40, R50} path.  When node R20 fails, the PLR node R10
will switch the flow to the backup path, which is
R10->R30->{R40, R50}.  By using the PCECC, the path computation and
forwarding path downloading can all be done without the complex
signaling used in P2MP RSVP-TE or mLDP.

3.5.
Using PCECC for Traffic Classification Information

As described in [RFC8283], traffic classification is an important
part of traffic engineering.  It is the process of looking at a
packet to determine how it should be treated as it is forwarded
through the network.  It applies in many scenarios including MPLS
traffic engineering (where it determines what traffic is forwarded
onto which LSPs); segment routing (where it is used to select which
set of forwarding instructions to add to a packet); and SFC (where it
indicates along which service function path a packet should be
forwarded).  In conjunction with traffic engineering, traffic
classification is an important enabler for load balancing.  Traffic
classification is closely linked to the computational elements of
planning for the network functions just listed because it determines
how traffic load is balanced and distributed through the network.
Therefore, selecting what traffic classification should be performed
by a router is an important part of the work done by a PCECC.

Instructions can be passed from the controller to the routers using
PCEP.  These instructions tell the routers how to map traffic to
paths or connections.  Refer to [RFC9168].

Along with traffic classification, there are a few more questions
that need to be considered once the path is set up:

*  How to use it

*  Whether it is a virtual link

*  Whether to advertise it in the IGP as a virtual link

*  What bits of this information to signal to the tail end

These are out of scope of this document.

3.6.  Use Cases of PCECC for SRv6

As per [RFC8402], with Segment Routing (SR), a node steers a packet
through an ordered list of instructions, called segments.  Segment
Routing can be applied to the IPv6 architecture with the Segment
Routing Header (SRH) [RFC8754].  A segment is encoded as an IPv6
address.
An ordered list of segments is encoded as an ordered list of IPv6
addresses in the routing header.  The active segment is indicated by
the Destination Address of the packet.  Upon completion of a segment,
a pointer in the new routing header is incremented and indicates the
next segment.

As per [RFC8754], an SRv6 Segment is a 128-bit value.  "SRv6 SID" or
simply "SID" are often used as a shorter reference for "SRv6
Segment".  Further details and an illustration are provided in
[RFC8986], where an SRv6 SID is represented as LOC:FUNCT.

[I-D.ietf-pce-segment-routing-ipv6] extends [RFC8664] to support SR
for the IPv6 data plane.  Further, a PCECC could be extended to
support SRv6 SID allocation and distribution.

                         2001:db8::1
                        +----------+
                        |    R1    |
                        +----------+
                             |
                        +----------+
                        |    R2    |  2001:db8::2
                        +----------+
                       *   |   *   *
                      *    |   *    *
                     *link1|   *     *
       2001:db8::4  *      |   *link2 *  2001:db8::5
           +-----------+   |   *     +-----------+
           |    R4     |   |   *     |    R5     |
           +-----------+   |   *     +-----------+
                      *    |   *      *      +
                       *   |   *       *     +
                        *  |   *        *    +
                        +-----------+  +-----------+
           2001:db8::3  |    R3     |  |R6         |2001:db8::6
                        +-----------+  +-----------+
                             |
                        +-----------+
                        |    R8     |  2001:db8::8
                        +-----------+

In this case, PCECC could assign the SRv6 SIDs (in the form of IPv6
addresses) to be used for nodes and adjacencies.  Later, an SRv6 path
in the form of a list of SRv6 SIDs could be used at the ingress.
Some examples:

*  SRv6 SID-List={2001:db8::8} - The best path towards R8

*  SRv6 SID-List={2001:db8::5, 2001:db8::8} - The path towards R8 via
   R5

3.7.  Use Cases of PCECC for SFC

Service Function Chaining (SFC) is described in [RFC7665].  It is the
process of directing traffic in a network such that it passes through
specific hardware devices or virtual machines (known as service
function nodes) that can perform particular desired functions on the
traffic.
The set of functions to be performed and the order in which they are
to be performed is known as a service function chain.  The chain is
enhanced with the locations at which the service functions are to be
performed to derive a Service Function Path (SFP).  Each packet is
marked as belonging to a specific SFP, and that marking lets each
successive service function node know which functions to perform and
to which service function node to send the packet next.

To operate an SFC network, the service function nodes must be
configured to understand the packet markings, and the edge nodes must
be told how to mark packets entering the network.  Additionally, it
may be necessary to establish tunnels between service function nodes
to carry the traffic.  Planning an SFC network requires load
balancing between service function nodes and traffic engineering
across the network that connects them.  As per [RFC8283], these are
operations that can be performed by a PCE-based controller, and that
controller can use PCEP to program the network and install the
service function chains and any required tunnels.

A possible mechanism would be to add support for an SFC-based central
control instruction that would be able to convey the following to
each SFF along the SFP:

*  Service Path Identifier (SPI): Uniquely identifies an SFP.

*  Service Index (SI): Provides location within the SFP.

*  SFC Proxy handling

PCECC can play a role in setting the traffic classification rules at
the classifier to impose the NSH, as well as in downloading the
forwarding instructions to each SFF along the way so that they can
process the NSH and forward accordingly.  Instructions to the service
classifier handle the context header, metadata, etc.
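As an informal illustration of how SPI and SI values could drive forwarding once they have been downloaded, the following Python sketch keys a forwarding table on (SPI, SI) pairs and walks one SFP, decrementing the SI after each service function as NSH processing does.  The table contents, the service function names, and the function name are hypothetical and chosen for illustration only; no PCEP encoding for such an instruction is implied.

```python
# Hypothetical SFF forwarding state that a controller could install.
# Keys are (SPI, SI) pairs; values name the next service function.
# All names and values are illustrative assumptions.
SFF_TABLE = {
    (100, 255): "firewall",
    (100, 254): "nat",
    (100, 253): "dpi",
}


def walk_sfp(spi, initial_si, table):
    """Follow one SFP: look up (SPI, SI), visit the function, decrement SI."""
    visited = []
    si = initial_si
    while (spi, si) in table:
        visited.append(table[(spi, si)])
        si -= 1  # each service function decrements the Service Index
    return visited


print(walk_sfp(100, 255, SFF_TABLE))  # ['firewall', 'nat', 'dpi']
```

A lookup that finds no entry for the current (SPI, SI) pair ends the walk, which in this sketch stands in for the end of the SFP.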
It is also possible to support SFC with SR, with or without NSH, as
in [I-D.ietf-spring-nsh-sr] and
[I-D.ietf-spring-sr-service-programming].  The PCECC technique can
also be used for service-function-related segments and SR service
policies.

3.8.  Use Cases of PCECC for Native IP

[RFC8735] describes the scenarios and suggestions for the
"Centralized Control Dynamic Routing (CCDR)" architecture, which
integrates the merits of traditional distributed protocols (IGP/BGP)
and the power of centralized control technologies (PCE/SDN) to
provide a feasible traffic engineering solution in various complex
scenarios for the service provider.  [RFC8821] defines the framework
for CCDR traffic engineering within a Native IP network, using a
Dual/Multi-BGP session strategy and the CCDR architecture.  PCEP can
be used to transfer the key parameters between the PCE and the
underlying network devices (PCCs) using the PCECC technique.  The
central control instructions from the PCECC identify which prefix
should be advertised on which BGP session.

3.9.  Use Cases of PCECC for BIER

Bit Index Explicit Replication (BIER) [RFC8279] defines an
architecture where all intended multicast receivers are encoded as a
bitmask in the multicast packet header within different
encapsulations.  A router that receives such a packet will forward
the packet based on the bit position in the packet header towards the
receiver(s) following a precomputed tree for each of the bits in the
packet.  Each receiver is represented by a unique bit in the bitmask.

BIER-TE [I-D.ietf-bier-te-arch] shares architecture and packet
formats with BIER.  BIER-TE forwards and replicates packets based on
a BitString in the packet header, but every BitPosition of the
BitString of a BIER-TE packet indicates one or more adjacencies.
A BIER-TE path can be derived from a PCE and used at the ingress as
described in [I-D.chen-pce-bier].

Further, the PCECC mechanism could be used for the allocation of bits
for the BIER routers for BIER as well as for the adjacencies for
BIER-TE.  A PCECC-based controller can use PCEP to inform the
BIER-capable routers of the meaning of the bits as well as other
fields needed for BIER encapsulation.  The PCECC could be used to
program the BIER router with various parameters used in the BIER
encapsulation, such as the BIER subdomain-ID, BFR-ID, and BIER
Encapsulation, for both node and adjacency.

4.  IANA Considerations

This document does not require any action from IANA.

5.  Security Considerations

[RFC8283] describes how the security considerations for a PCE-based
controller are little different from those for any other PCE system.
That is, the operation relies heavily on the use and security of
PCEP, so due consideration should be given to the security features
discussed in [RFC5440] and the additional mechanisms described in
[RFC8253].  It further lists the vulnerabilities of a central
controller architecture, such as a central point of failure, denial
of service, and a focus for interception and modification of messages
sent to individual Network Elements (NEs).

As per [RFC9050], the use of Transport Layer Security (TLS) in PCEP
is recommended, as it provides support for peer authentication,
message encryption, and integrity.  It further provides mechanisms
for associating peer identities with different levels of access
and/or authoritativeness via an attribute in X.509 certificates or a
local policy with a specific accept-list of X.509 certificates.  This
can be used to check the authority for the PCECC operations.
1154 It is expected that each new document that is produced for a specific 1155 use case will also include considerations of the security impacts of 1156 the use of a PCE-based central controller on the network type and 1157 services being managed. 1159 6. Acknowledgments 1161 We would like to thank Adrian Farrel, Aijun Wang, Robert Tao, 1162 Changjiang Yan, Tieying Huang, Sergio Belotti, Dieter Beller, Andrey 1163 Elperin and Evgeniy Brodskiy for their useful comments and 1164 suggestions. 1166 7. References 1168 7.1. Normative References 1170 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 1171 Requirement Levels", BCP 14, RFC 2119, 1172 DOI 10.17487/RFC2119, March 1997, 1173 . 1175 [RFC5440] Vasseur, JP., Ed. and JL. Le Roux, Ed., "Path Computation 1176 Element (PCE) Communication Protocol (PCEP)", RFC 5440, 1177 DOI 10.17487/RFC5440, March 2009, 1178 . 1180 [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 1181 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, 1182 May 2017, . 1184 [RFC8283] Farrel, A., Ed., Zhao, Q., Ed., Li, Z., and C. Zhou, "An 1185 Architecture for Use of PCE and the PCE Communication 1186 Protocol (PCEP) in a Network with Central Control", 1187 RFC 8283, DOI 10.17487/RFC8283, December 2017, 1188 . 1190 [RFC8253] Lopez, D., Gonzalez de Dios, O., Wu, Q., and D. Dhody, 1191 "PCEPS: Usage of TLS to Provide a Secure Transport for the 1192 Path Computation Element Communication Protocol (PCEP)", 1193 RFC 8253, DOI 10.17487/RFC8253, October 2017, 1194 . 1196 7.2. Informative References 1198 [RFC3985] Bryant, S., Ed. and P. Pate, Ed., "Pseudo Wire Emulation 1199 Edge-to-Edge (PWE3) Architecture", RFC 3985, 1200 DOI 10.17487/RFC3985, March 2005, 1201 . 1203 [RFC4206] Kompella, K. and Y. Rekhter, "Label Switched Paths (LSP) 1204 Hierarchy with Generalized Multi-Protocol Label Switching 1205 (GMPLS) Traffic Engineering (TE)", RFC 4206, 1206 DOI 10.17487/RFC4206, October 2005, 1207 . 1209 [RFC4364] Rosen, E. and Y. 
Rekhter, "BGP/MPLS IP Virtual Private 1210 Networks (VPNs)", RFC 4364, DOI 10.17487/RFC4364, February 1211 2006, . 1213 [RFC4655] Farrel, A., Vasseur, J.-P., and J. Ash, "A Path 1214 Computation Element (PCE)-Based Architecture", RFC 4655, 1215 DOI 10.17487/RFC4655, August 2006, 1216 . 1218 [RFC5150] Ayyangar, A., Kompella, K., Vasseur, JP., and A. Farrel, 1219 "Label Switched Path Stitching with Generalized 1220 Multiprotocol Label Switching Traffic Engineering (GMPLS 1221 TE)", RFC 5150, DOI 10.17487/RFC5150, February 2008, 1222 . 1224 [RFC5151] Farrel, A., Ed., Ayyangar, A., and JP. Vasseur, "Inter- 1225 Domain MPLS and GMPLS Traffic Engineering -- Resource 1226 Reservation Protocol-Traffic Engineering (RSVP-TE) 1227 Extensions", RFC 5151, DOI 10.17487/RFC5151, February 1228 2008, . 1230 [RFC5541] Le Roux, JL., Vasseur, JP., and Y. Lee, "Encoding of 1231 Objective Functions in the Path Computation Element 1232 Communication Protocol (PCEP)", RFC 5541, 1233 DOI 10.17487/RFC5541, June 2009, 1234 . 1236 [RFC5376] Bitar, N., Zhang, R., and K. Kumaki, "Inter-AS 1237 Requirements for the Path Computation Element 1238 Communication Protocol (PCECP)", RFC 5376, 1239 DOI 10.17487/RFC5376, November 2008, 1240 . 1242 [RFC7025] Otani, T., Ogaki, K., Caviglia, D., Zhang, F., and C. 1243 Margaria, "Requirements for GMPLS Applications of PCE", 1244 RFC 7025, DOI 10.17487/RFC7025, September 2013, 1245 . 1247 [RFC7399] Farrel, A. and D. King, "Unanswered Questions in the Path 1248 Computation Element Architecture", RFC 7399, 1249 DOI 10.17487/RFC7399, October 2014, 1250 . 1252 [RFC7432] Sajassi, A., Ed., Aggarwal, R., Bitar, N., Isaac, A., 1253 Uttaro, J., Drake, J., and W. Henderickx, "BGP MPLS-Based 1254 Ethernet VPN", RFC 7432, DOI 10.17487/RFC7432, February 1255 2015, . 1257 [RFC7491] King, D. and A. Farrel, "A PCE-Based Architecture for 1258 Application-Based Network Operations", RFC 7491, 1259 DOI 10.17487/RFC7491, March 2015, 1260 . 1262 [RFC7665] Halpern, J., Ed. and C. 
Pignataro, Ed., "Service Function 1263 Chaining (SFC) Architecture", RFC 7665, 1264 DOI 10.17487/RFC7665, October 2015, 1265 . 1267 [RFC8231] Crabbe, E., Minei, I., Medved, J., and R. Varga, "Path 1268 Computation Element Communication Protocol (PCEP) 1269 Extensions for Stateful PCE", RFC 8231, 1270 DOI 10.17487/RFC8231, September 2017, 1271 . 1273 [RFC8279] Wijnands, IJ., Ed., Rosen, E., Ed., Dolganow, A., 1274 Przygienda, T., and S. Aldrin, "Multicast Using Bit Index 1275 Explicit Replication (BIER)", RFC 8279, 1276 DOI 10.17487/RFC8279, November 2017, 1277 . 1279 [RFC8281] Crabbe, E., Minei, I., Sivabalan, S., and R. Varga, "Path 1280 Computation Element Communication Protocol (PCEP) 1281 Extensions for PCE-Initiated LSP Setup in a Stateful PCE 1282 Model", RFC 8281, DOI 10.17487/RFC8281, December 2017, 1283 . 1285 [RFC8355] Filsfils, C., Ed., Previdi, S., Ed., Decraene, B., and R. 1286 Shakir, "Resiliency Use Cases in Source Packet Routing in 1287 Networking (SPRING) Networks", RFC 8355, 1288 DOI 10.17487/RFC8355, March 2018, 1289 . 1291 [RFC8402] Filsfils, C., Ed., Previdi, S., Ed., Ginsberg, L., 1292 Decraene, B., Litkowski, S., and R. Shakir, "Segment 1293 Routing Architecture", RFC 8402, DOI 10.17487/RFC8402, 1294 July 2018, . 1296 [RFC8664] Sivabalan, S., Filsfils, C., Tantsura, J., Henderickx, W., 1297 and J. Hardwick, "Path Computation Element Communication 1298 Protocol (PCEP) Extensions for Segment Routing", RFC 8664, 1299 DOI 10.17487/RFC8664, December 2019, 1300 . 1302 [RFC8735] Wang, A., Huang, X., Kou, C., Li, Z., and P. Mi, 1303 "Scenarios and Simulation Results of PCE in a Native IP 1304 Network", RFC 8735, DOI 10.17487/RFC8735, February 2020, 1305 . 1307 [RFC8751] Dhody, D., Lee, Y., Ceccarelli, D., Shin, J., and D. King, 1308 "Hierarchical Stateful Path Computation Element (PCE)", 1309 RFC 8751, DOI 10.17487/RFC8751, March 2020, 1310 . 1312 [RFC8754] Filsfils, C., Ed., Dukes, D., Ed., Previdi, S., Leddy, J., 1313 Matsushima, S., and D. 
Voyer, "IPv6 Segment Routing Header 1314 (SRH)", RFC 8754, DOI 10.17487/RFC8754, March 2020, 1315 . 1317 [RFC8821] Wang, A., Khasanov, B., Zhao, Q., and H. Chen, "PCE-Based 1318 Traffic Engineering (TE) in Native IP Networks", RFC 8821, 1319 DOI 10.17487/RFC8821, April 2021, 1320 . 1322 [RFC8986] Filsfils, C., Ed., Camarillo, P., Ed., Leddy, J., Voyer, 1323 D., Matsushima, S., and Z. Li, "Segment Routing over IPv6 1324 (SRv6) Network Programming", RFC 8986, 1325 DOI 10.17487/RFC8986, February 2021, 1326 . 1328 [RFC9050] Li, Z., Peng, S., Negi, M., Zhao, Q., and C. Zhou, "Path 1329 Computation Element Communication Protocol (PCEP) 1330 Procedures and Extensions for Using the PCE as a Central 1331 Controller (PCECC) of LSPs", RFC 9050, 1332 DOI 10.17487/RFC9050, July 2021, 1333 . 1335 [RFC9168] Dhody, D., Farrel, A., and Z. Li, "Path Computation 1336 Element Communication Protocol (PCEP) Extension for Flow 1337 Specification", RFC 9168, DOI 10.17487/RFC9168, January 1338 2022, . 1340 [I-D.ietf-pce-pcep-extension-pce-controller-sr] 1341 Li, Z., Peng, S., Negi, M. S., Zhao, Q., and C. Zhou, 1342 "Path Computation Element Communication Protocol (PCEP) 1343 Procedures and Extensions for Using PCE as a Central 1344 Controller (PCECC) for Segment Routing (SR) MPLS Segment 1345 Identifier (SID) Allocation and Distribution.", Work in 1346 Progress, Internet-Draft, draft-ietf-pce-pcep-extension- 1347 pce-controller-sr-04, 6 March 2022, 1348 . 1351 [I-D.li-pce-controlled-id-space] 1352 Li, C., Shi, H., Wang, A., Cheng, W., and C. Zhou, "PCE 1353 Controlled ID Space", Work in Progress, Internet-Draft, 1354 draft-li-pce-controlled-id-space-11, 20 March 2022, 1355 . 1358 [I-D.ietf-pce-stateful-interdomain] 1359 Dugeon, O., Meuric, J., Lee, Y., and D. Ceccarelli, "PCEP 1360 Extension for Stateful Inter-Domain Tunnels", Work in 1361 Progress, Internet-Draft, draft-ietf-pce-stateful- 1362 interdomain-03, 4 March 2022, 1363 . 
1366 [I-D.cbrt-pce-stateful-local-protection] 1367 Barth, C. and R. Torvi, "PCEP Extensions for RSVP-TE 1368 Local-Protection with PCE-Stateful", Work in Progress, 1369 Internet-Draft, draft-cbrt-pce-stateful-local-protection- 1370 01, 29 June 2018, . 1373 [I-D.ietf-pce-segment-routing-ipv6] 1374 Li, C., Negi, M., Sivabalan, S., Koldychev, M., 1375 Kaladharan, P., and Y. Zhu, "PCEP Extensions for Segment 1376 Routing leveraging the IPv6 data plane", Work in Progress, 1377 Internet-Draft, draft-ietf-pce-segment-routing-ipv6-13, 1 1378 April 2022, . 1381 [I-D.ietf-bier-te-arch] 1382 Eckert, T., Cauchie, G., and M. Menth, "Tree Engineering 1383 for Bit Index Explicit Replication (BIER-TE)", Work in 1384 Progress, Internet-Draft, draft-ietf-bier-te-arch-10, 9 1385 July 2021, . 1388 [I-D.chen-pce-bier] 1389 Chen, R., Zhang, Z., Chen, H., Dhanaraj, S., Qin, F., and 1390 A. Wang, "PCEP Extensions for BIER-TE", Work in Progress, 1391 Internet-Draft, draft-chen-pce-bier-09, 12 July 2021, 1392 . 1395 [I-D.ietf-spring-sr-service-programming] 1396 Clad, F., Xu, X., Filsfils, C., Bernier, D., Li, C., 1397 Decraene, B., Ma, S., Yadlapalli, C., Henderickx, W., and 1398 S. Salsano, "Service Programming with Segment Routing", 1399 Work in Progress, Internet-Draft, draft-ietf-spring-sr- 1400 service-programming-06, 9 June 2022, 1401 . 1404 [I-D.ietf-spring-nsh-sr] 1405 Guichard, J. N. and J. Tantsura, "Integration of Network 1406 Service Header (NSH) and Segment Routing for Service 1407 Function Chaining (SFC)", Work in Progress, Internet- 1408 Draft, draft-ietf-spring-nsh-sr-11, 20 April 2022, 1409 . 1412 [I-D.ietf-idr-segment-routing-te-policy] 1413 Previdi, S., Filsfils, C., Talaulikar, K., Mattes, P., 1414 Jain, D., and S. Lin, "Advertising Segment Routing 1415 Policies in BGP", Work in Progress, Internet-Draft, draft- 1416 ietf-idr-segment-routing-te-policy-17, 14 April 2022, 1417 . 
1420 [I-D.ietf-spring-segment-routing-policy] 1421 Filsfils, C., Talaulikar, K., Voyer, D., Bogdanov, A., and 1422 P. Mattes, "Segment Routing Policy Architecture", Work in 1423 Progress, Internet-Draft, draft-ietf-spring-segment- 1424 routing-policy-22, 22 March 2022, 1425 . 1428 [I-D.ietf-pce-segment-routing-policy-cp] 1429 Koldychev, M., Sivabalan, S., Barth, C., Peng, S., and H. 1430 Bidgoli, "PCEP extension to support Segment Routing Policy 1431 Candidate Paths", Work in Progress, Internet-Draft, draft- 1432 ietf-pce-segment-routing-policy-cp-07, 21 April 2022, 1433 . 1436 [MAP-REDUCE] 1437 Lee, K., Choi, T., Ganguly, A., Wolinsky, D., Boykin, P., 1438 and R. Figueiredo, "Parallel Processing Framework on a P2P 1439 System Using Map and Reduce Primitives", , May 2011, 1440 . 1442 [MPLS-DC] Afanasiev, D. and D. Ginsburg, "MPLS in DC and inter-DC 1443 networks: the unified forwarding mechanism for network 1444 programmability at scale", , March 2014, 1445 . 1448 Appendix A. Other Use Cases of PCECC 1450 This section list some more advanced use cases of PCECC that were 1451 discussed and could be worked on in future. 1453 A.1. Use Cases of PCECC for LSP in the Network Migration 1455 One of the main advantages for PCECC solution is that it has backward 1456 compatibility naturally since the PCE server itself can function as a 1457 proxy node of MPLS network for all the new nodes which may no longer 1458 support the signaling protocols. 1460 As it is illustrated in the following example, the current network 1461 could migrate to a total PCECC controlled network gradually by 1462 replacing the legacy nodes. During the migration, the legacy nodes 1463 still need to signal using the existing MPLS protocol such as LDP and 1464 RSVP-TE, and the new nodes setup their portion of the forwarding path 1465 through PCECC directly. With the PCECC function as the proxy of 1466 these new nodes, MPLS signaling can populate through network as 1467 normal. 

   The example described in this section is based on the network
   configuration illustrated in the following figure:

   +------------------------------------------------------------------+
   |                         PCE DOMAIN                               |
   |    +-----------------------------------------------------+       |
   |    |                       PCECC                         |       |
   |    +-----------------------------------------------------+       |
   |     ^              ^                      ^            ^         |
   |     |      PCEP    |                      |   PCEP     |         |
   |     V              V                      V            V         |
   | +--------+   +--------+   +--------+   +--------+   +--------+   |
   | | NODE 1 |   | NODE 2 |   | NODE 3 |   | NODE 4 |   | NODE 5 |   |
   | |        |...|        |...|        |...|        |...|        |   |
   | | Legacy |if1| Legacy |if2|Legacy  |if3| PCECC  |if4| PCECC  |   |
   | |  Node  |   |  Node  |   |Enabled |   |Enabled |   | Enabled|   |
   | +--------+   +--------+   +--------+   +--------+   +--------+   |
   |                                                                  |
   +------------------------------------------------------------------+

        Example: PCECC-Initiated LSP Setup in the Network Migration

   In this example, there are five nodes for the TE LSP from the head
   end (Node1) to the tail end (Node5), where Node4 and Node5 are
   centrally controlled and the other nodes are legacy nodes.

   *  Node1 sends a path request message for the setup of an LSP
      destined for Node5.

   *  The PCECC sends a reply message to Node1 for LSP setup with the
      path: (Node1, if1), (Node2, if2), (Node3, if3), (Node4, if4),
      Node5.

   *  Node1, Node2, and Node3 set up the LSP to Node5 using local
      labels as usual.  Node3, with the help of the PCECC, could proxy
      the signaling.

   *  The PCECC then programs the out-segment of Node3, the in-segment
      and out-segment of Node4, and the in-segment of Node5.

A.2.  Use Cases of PCECC for L3VPN and PWE3

   As described in [RFC8283], various network services may be offered
   over a network.  These include protection services, Virtual Private
   Network (VPN) services (such as Layer 3 VPNs [RFC4364] or Ethernet
   VPNs [RFC7432]), and Pseudowires [RFC3985].
   Delivering services over a network in an optimal way requires
   coordination in the way that network resources are allocated to
   support the services.  A PCE-based central controller can consider
   the whole network and all components of a service at once when
   planning how to deliver the service.  It can then use PCEP to manage
   the network resources and to install the necessary associations
   between those resources.

   In the case of L3VPN, VPN labels can be assigned by the PCECC and
   distributed among the PE routers through PCEP, instead of using
   BGP.

   The example described in this section is based on the network
   configuration illustrated in the following figure:

              +-------------------------------------------+
              |                PCE DOMAIN                 |
              |   +-----------------------------------+   |
              |   |               PCECC               |   |
              |   +-----------------------------------+   |
              |           ^          ^              ^     |
              |PWE3/L3VPN | PCEP PCEP|LSP PWE3/L3VPN|PCEP |
              |           V          V              V     |
   +--------+ |     +--------+   +--------+   +--------+  |  +--------+
   |   CE   | |     |  PE1   |   | NODE x |   |  PE2   |  |  |   CE   |
   |        |...... |        |...|        |...|        |.....|        |
   | Legacy | |if1  | PCECC  |if2| PCECC  |if3| PCECC  |if4  | Legacy |
   |  Node  | |     | Enabled|   |Enabled |   |Enabled |  |  |  Node  |
   +--------+ |     +--------+   +--------+   +--------+  |  +--------+
              |                                           |
              +-------------------------------------------+

                 Example: Using PCECC for L3VPN and PWE3

   In the case of PWE3, instead of using LDP signaling, the label and
   port pairs for each pseudowire can be assigned by the PCECC among
   the PE routers, and the corresponding forwarding entries are
   distributed to each PE router through the extended PCEP and the
   PCECC mechanism.

A.3.  Use Cases of PCECC for Local Protection (RSVP-TE)

   [I-D.cbrt-pce-stateful-local-protection] describes the need for the
   PCE to maintain and associate the local protection paths for the
   RSVP-TE LSP.
   Local protection requires the setup of a bypass at the Point of
   Local Repair (PLR).  This bypass can be PCC-initiated and delegated,
   or PCE-initiated.  In either case, the PLR MUST maintain a PCEP
   session to the PCE.  The bypass LSPs need to be mapped to the
   primary LSP.  This could be done locally at the PLR based on a local
   policy, but there is a need for a PCE to do the mapping as well to
   exert greater control.

   This mapping can be done via PCECC procedures, where the PCE could
   instruct the PLR to perform the mapping and identify the primary LSP
   for which the bypass should be used.

A.4.  Using Reliable P2MP TE-Based Multicast Delivery for Distributed
      Computations (MapReduce-Hadoop)

   The MapReduce model of distributed computation in computing clusters
   is widely deployed.  In the Hadoop (https://hadoop.apache.org/) 1.0
   architecture, MapReduce operations run on big data stored in the
   Hadoop Distributed File System (HDFS), where the NameNode knows the
   resources of the cluster and where the actual data (chunks) for a
   particular task are located (on which DataNode).  Each chunk of data
   (64 MB or more) should have three saved copies in different
   DataNodes based on their proximity.

   The proximity level is currently allocated semi-manually and is
   based on Rack IDs (the assumption is that closer data is better
   because of access speed and lower latency).

   The JobTracker node is responsible for computation tasks and for
   scheduling across DataNodes, and it also has rack awareness.
   Currently, transport protocols between the NameNode/JobTracker and
   the DataNodes are based on IP unicast.  This has the advantage of
   simplicity but has numerous drawbacks related to its flat approach.

   It is clear that Hadoop cluster creation should go beyond a single
   DC and move towards distributed clusters.  In that case, performance
   and latency issues need to be handled.
   Latency depends on the speed of light in the fiber links and on the
   latency introduced by the intermediate devices in between.  The
   latter is closely correlated with network device architecture and
   performance.  The current performance of NPU-based routers should be
   sufficient for creating distributed Hadoop clusters with predictable
   latency.  The performance of software-based routers (mainly as
   VNFs), together with additional HW features such as DPDK, is
   promising but requires additional research and testing.

   The main question is how to create a simple but effective
   architecture for a distributed Hadoop cluster.

   Research [MAP-REDUCE] shows how the use of a multicast tree could
   speed up the discovery of resources or cluster members inside the
   cluster, as well as increase redundancy in communications between
   cluster nodes.

   Is traditional IP-based multicast enough for that?  We doubt it,
   because it requires an additional control plane (IGMP, PIM) and a
   lot of signaling, which is not suitable for high-performance
   computations that are very sensitive to latency.

   P2MP TE tunnels look much more suitable as a potential solution for
   creating multicast-based communications between the NameNode as the
   root and the DataNodes as leaves inside the cluster.  These P2MP
   tunnels should be created and torn down dynamically, without manual
   intervention.  Here, the PCECC comes into play, with the main
   objective of creating an optimal topology for each particular
   MapReduce computation request and creating P2MP tunnels with the
   needed parameters, such as bandwidth and delay.

   This solution would require the use of MPLS label-based forwarding
   inside the cluster.  The use of label-based forwarding inside the DC
   was proposed by Yandex [MPLS-DC].  Technically, it is already
   possible because MPLS on switches is already supported by some
   vendors, and MPLS also exists on Linux and OVS.
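
   To make the idea of a P2MP tunnel "with the needed parameters, such
   as bandwidth and delay" more concrete, the following Python sketch
   shows one simple way a controller might derive such a tree.  It is
   an assumption-laden illustration, not the PCECC algorithm: it prunes
   links below the requested bandwidth, runs Dijkstra's algorithm on
   delay from the root, and merges the per-leaf paths into a
   shortest-path tree (which is not necessarily a minimum-cost Steiner
   tree).  The function name, the link dictionary format, and the
   units are all hypothetical.

```python
import heapq


def constrained_p2mp_tree(links, root, leaves, min_bw, max_delay):
    """Build a delay-based shortest-path tree from root to leaves.

    links: {(u, v): (bandwidth, delay)}, treated as bidirectional
           (hypothetical input format).
    Returns (tree_edges, leaf_delays); raises ValueError if a leaf
    cannot be reached within max_delay over links with >= min_bw.
    """
    # Step 1: keep only links that can carry the requested bandwidth.
    adjacency = {}
    for (u, v), (bandwidth, delay) in links.items():
        if bandwidth >= min_bw:
            adjacency.setdefault(u, []).append((v, delay))
            adjacency.setdefault(v, []).append((u, delay))

    # Step 2: Dijkstra on accumulated delay from the root.
    dist = {root: 0}
    parent = {}
    heap = [(0, root)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, delay in adjacency.get(u, []):
            candidate = d + delay
            if candidate < dist.get(v, float("inf")):
                dist[v] = candidate
                parent[v] = u
                heapq.heappush(heap, (candidate, v))

    # Step 3: merge the root-to-leaf paths into one set of tree edges.
    tree_edges = set()
    for leaf in leaves:
        if dist.get(leaf, float("inf")) > max_delay:
            raise ValueError(f"no feasible path to {leaf}")
        node = leaf
        while node != root:
            tree_edges.add((parent[node], node))
            node = parent[node]
    return tree_edges, {leaf: dist[leaf] for leaf in leaves}
```

   With the NameNode as the root and the DataNodes as leaves, the
   returned edge set corresponds to the branches of the P2MP tunnel
   that would then be instantiated.  A production controller would
   typically also account for reserved bandwidth and tree cost.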

   The following framework can accomplish this task:

                          +--------+
                          |  APP   |
                          +--------+
                              | NBI (REST API,...)
                              |
                    PCEP +----------+  REST API
       +---------+   +---|  PCECC   |----------+
       | Client  |---|---|          |          |
       +---------+   |   +----------+          |
               |     |   |     |    |          |
               +-----|---+     |PCEP|          |
      +--------+     |   |     |    |          |
      |              |   |     |    |          |
      | REST API     |   |     |    |          |
      |              |   |     |    |          |
   +-------------+   |   |     |    |     +----------+
   | Job Tracker |   |   |     |    |     | NameNode |
   |             |   |   |     |    |     |          |
   +-------------+   |   |     |    |     +----------+
     +-------------------+     |    +-----------+
     |                   |     |                |
     |---+-----P2MP TE---+-----|-----------|    |
   +-----------+      +-----------+      +-----------+
   | DataNode1 |      | DataNode2 |      | DataNodeN |
   |TaskTracker|      |TaskTracker| .... |TaskTracker|
   +-----------+      +-----------+      +-----------+

   Communication between the JobTracker, NameNode, and PCECC can be
   done via a REST API directly or via a cluster manager such as Mesos.

   Phase 1: Distributed cluster resource discovery.  During this phase,
   the JobTracker and NameNode SHOULD identify and find available
   DataNodes according to the computing request from the application
   (APP).  The NameNode SHOULD query the PCECC about available
   DataNodes.  The NameNode MAY provide additional constraints to the
   PCECC, such as topological proximity and redundancy level.

   The PCECC SHOULD analyze the topology of the distributed cluster and
   perform a constraint-based path calculation from the client towards
   the most suitable DataNodes.  The PCECC SHOULD reply to the NameNode
   with the list of the most suitable DataNodes and their resource
   capabilities.  A topology discovery mechanism for the PCECC will be
   added later to this framework.

   Phase 2: The PCECC SHOULD create a P2MP LSP from the client towards
   those DataNodes by means of PCEP messages, following the previously
   calculated path.

   Phase 3: The NameNode SHOULD send this information to the client.
   The PCECC informs the client about the optimal P2MP path towards the
   DataNodes via a PCEP message.

   Phase 4: The client sends data blocks to those DataNodes for writing
   via the created P2MP tunnel.

   When this task is finished, the P2MP tunnel can be torn down.

Appendix B.  Contributor Addresses

   The following authors contributed text for this document and should
   be considered as co-authors:

   Luyuan Fang
   United States of America

   Email: luyuanf@gmail.com

   Chao Zhou
   HPE

   Email: chaozhou_us@yahoo.com

   Boris Zhang
   Amazon

   Email: zhangyud@amazon.com

   Artsiom Rachytski
   Belarus

   Email: arachyts@gmail.com

   Anton Gulida
   EPAM Systems, Inc.
   Belarus

   Email: Anton_Hulida@epam.com

Authors' Addresses

   Zhenbin (Robin) Li
   Huawei Technologies
   Huawei Bld., No.156 Beiqing Rd.
   Beijing
   100095
   China
   Email: lizhenbin@huawei.com

   Dhruv Dhody
   Huawei Technologies
   Divyashree Techno Park, Whitefield
   Bangalore
   Karnataka 560066
   India
   Email: dhruv.ietf@gmail.com

   Quintin Zhao
   Etheric Networks
   1009 S CLAREMONT ST
   SAN MATEO, CA 94402
   United States of America
   Email: qzhao@ethericnetworks.com

   King He
   Tencent Holdings Ltd.
   Shenzhen
   China
   Email: kinghe@tencent.com

   Boris Khasanov
   Yandex LLC
   Ulitsa Lva Tolstogo 16
   Moscow
   Email: bhassanov@yahoo.com