PCE Working Group                                          X. Zhang, Ed.
Internet-Draft                                       Huawei Technologies
Intended status: Informational                             I. Minei, Ed.
Expires: May 4, 2017                                        Google, Inc.
                                                        October 31, 2016

       Applicability of a Stateful Path Computation Element (PCE)
                   draft-ietf-pce-stateful-pce-app-08

Abstract

   A stateful Path Computation Element (PCE) maintains information about
   Label Switched Path (LSP) characteristics and resource usage within a
   network in order to provide traffic engineering calculations for its
   associated Path Computation Clients (PCCs).
   This document describes general considerations for a stateful PCE
   deployment and examines its applicability and benefits, as well as
   its challenges and limitations through a number of use cases. PCE
   Communication Protocol (PCEP) extensions required for stateful PCE
   usage are covered in separate documents.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on May 4, 2017.

Copyright Notice

   Copyright (c) 2016 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1. Introduction
   2. Terminology
   3. Application Scenarios
     3.1. Optimization of LSP Placement
       3.1.1. Throughput Maximization and Bin Packing
       3.1.2. Deadlock
       3.1.3. Minimum Perturbation
       3.1.4. Predictability
     3.2. Auto-bandwidth Adjustment
     3.3. Bandwidth Scheduling
     3.4. Recovery
       3.4.1. Protection
       3.4.2. Restoration
       3.4.3. SRLG Diversity
     3.5. Maintenance of Virtual Network Topology (VNT)
     3.6. LSP Re-optimization
     3.7. Resource Defragmentation
     3.8. Point-to-Multi-Point Applications
     3.9. Impairment-Aware Routing and Wavelength Assignment (IA-RWA)
   4. Deployment Considerations
     4.1. Multi-PCE Deployments
     4.2. LSP State Synchronization
     4.3. PCE Survivability
   5. Security Considerations
   6. IANA Considerations
   7. Contributing Authors
   8. Acknowledgements
   9. References
     9.1. Normative References
     9.2. Informative References
   Authors' Addresses

1. Introduction

   [RFC4655] defines the architecture for a Path Computation Element
   (PCE)-based model for the computation of Multiprotocol Label
   Switching (MPLS) and Generalized MPLS (GMPLS) Traffic Engineering
   Label Switched Paths (TE LSPs). To perform such a constrained
   computation, a PCE stores the network topology (i.e., TE links and
   nodes) and resource information (i.e., TE attributes) in its TE
   Database (TED). [RFC5440] describes the Path Computation Element
   Protocol (PCEP) for interaction between a Path Computation Client
   (PCC) and a PCE, or between two PCEs, enabling computation of TE
   LSPs.

   As per [RFC4655], a PCE can be either stateful or stateless. A
   stateful PCE maintains two sets of information for use in path
   computation. The first is the Traffic Engineering Database (TED)
   which includes the topology and resource state in the network. This
   information can be obtained by a stateful PCE using the same
   mechanisms as a stateless PCE (see [RFC4655]). The second is the LSP
   State Database (LSP-DB), in which a PCE stores attributes of all
   active LSPs in the network, such as their paths through the network,
   bandwidth/resource usage, switching types and LSP constraints. This
   state information allows the PCE to compute constrained paths while
   considering individual LSPs and their inter-dependency. However,
   this requires reliable state synchronization mechanisms between the
   PCE and the network, between the PCE and the PCCs, and between
   cooperating PCEs, with potentially significant control plane overhead
   and maintenance of a large amount of state data, as explained in
   [RFC4655].

   This document describes how a stateful PCE can be used to solve
   various problems for MPLS-TE and GMPLS networks, and the benefits it
   brings to such deployments.
   Note that alternative solutions relying on stateless PCEs may also be
   possible for some of these use cases, and will be mentioned for
   completeness where appropriate.

2. Terminology

   This document uses the following terms defined in [RFC5440]: PCC,
   PCE, PCEP peer.

   This document defines the following terms:

   Stateful PCE: a PCE that has access to not only the network state,
      but also to the set of active paths and their reserved resources
      for its computations. A stateful PCE might also retain
      information regarding LSPs under construction in order to reduce
      churn and resource contention. The additional state allows the
      PCE to compute constrained paths while considering individual LSPs
      and their interactions. Note that this requires reliable state
      synchronization mechanisms between the PCE and the network,
      between the PCE and the PCCs, and between cooperating PCEs.

   Passive Stateful PCE: a PCE that uses LSP state information learned
      from PCCs to optimize path computations. It does not actively
      update LSP state. A PCC maintains synchronization with the PCE.

   Active Stateful PCE: a PCE that may issue recommendations to the
      network. For example, an Active Stateful PCE may utilize the
      Delegation mechanism to update LSP parameters in those PCCs that
      delegated control over their LSPs to the PCE.

   Delegation: an operation to grant a PCE temporary rights to modify a
      subset of LSP parameters on one or more LSPs of a PCC. LSPs are
      delegated from a PCC to a PCE, and are referred to as delegated
      LSPs. The PCC that owns the PCE state for the LSP has the right
      to delegate it. An LSP is owned by a single PCC at any given
      point in time. For intra-domain LSPs, this PCC should be the LSP
      head end.

   LSP State Database: information about all LSPs and their attributes.

   PCE Initiation: a PCE, assuming LSP delegation granted by default,
      can issue recommendations to the network.
   Minimum Cut Set: the minimum set of links for a specific source
      destination pair which, when removed from the network, results in
      the specific source being completely isolated from the specific
      destination. The summed capacity of these links is equivalent to
      the maximum capacity from the source to the destination by the
      max-flow min-cut theorem.

3. Application Scenarios

   In the following sections, several use cases are described,
   showcasing scenarios that benefit from the deployment of a stateful
   PCE.

3.1. Optimization of LSP Placement

   The following use cases demonstrate a need for visibility into global
   LSP states in PCE path computations, and for PCE control of sequence
   and timing in altering LSP path characteristics within and across
   PCEP sessions. Reference topologies for the use cases described
   later in this section are shown in Figures 1 and 2.

   Some of the use cases below are focused on MPLS-TE deployments, but
   may also apply to GMPLS. Unless otherwise cited, use cases assume
   that all LSPs listed exist at the same LSP priority.

   The main benefit in the cases below comes from moving away from an
   asynchronous PCC-driven mode of operation to a model that allows for
   central control over LSP computations and maintenance, and focuses
   specifically on the active stateful PCE model of operation.

            +-----+
            |  A  |
            +-----+
                  \
                   +-----+                      +-----+
                   |  C  |----------------------|  E  |
                   +-----+                      +-----+
                  /        \      +-----+      /
            +-----+         +-----|  D  |-----+
            |  B  |               +-----+
            +-----+

                     Figure 1: Reference topology 1

            +-----+        +-----+        +-----+
            |  A  |        |  B  |        |  C  |
            +--+--+        +--+--+        +--+--+
               |              |              |
               |              |              |
            +--+--+        +--+--+        +--+--+
            |  E  +--------+  F  +--------+  G  |
            +-----+        +-----+        +-----+

                     Figure 2: Reference topology 2

3.1.1. Throughput Maximization and Bin Packing

   Because LSP attribute changes in [RFC5440] are driven by Path
   Computation Request (PCReq) messages under control of a PCC's local
   timers, the sequence of resource reservation arrivals occurring in
   the network will be randomized. This, coupled with a lack of global
   LSP state visibility on the part of a stateless PCE, may result in
   suboptimal throughput in a given network topology, as will be shown
   in the example below.

   Reference topology 2 in Figure 2 and Tables 1 and 2 show an example
   in which throughput is at 50% of optimal as a result of lack of
   visibility and synchronized control across PCCs. In this scenario,
   the decision must be made as to whether to route any portion of the
   E-G demand, as any demand routed for this source and destination will
   decrease system throughput.

                      +------+--------+----------+
                      | Link | Metric | Capacity |
                      +------+--------+----------+
                      | A-E  | 1      | 10       |
                      | B-F  | 1      | 10       |
                      | C-G  | 1      | 10       |
                      | E-F  | 1      | 10       |
                      | F-G  | 1      | 10       |
                      +------+--------+----------+

            Table 1: Link parameters for Throughput use case

         +------+-----+-----+-----+--------+----------+-------+
         | Time | LSP | Src | Dst | Demand | Routable | Path  |
         +------+-----+-----+-----+--------+----------+-------+
         | 1    | 1   | E   | G   | 10     | Yes      | E-F-G |
         | 2    | 2   | A   | B   | 10     | No       | ---   |
         | 3    | 1   | F   | C   | 10     | No       | ---   |
         +------+-----+-----+-----+--------+----------+-------+

            Table 2: Throughput use case demand time series

   In many cases throughput maximization becomes a bin packing problem.
   While bin packing itself is an NP-hard problem, a number of common
   heuristics that run in polynomial time can provide significant
   improvements in throughput over random reservation event
   distribution, especially when traversing links which are members of
   the minimum cut set for a large subset of source destination pairs.
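   As a rough illustration of the effect of reservation order, the
   sketch below replays the demands of Table 2 on the links of Table 1
   using a simple constrained shortest-path computation. It is not part
   of PCEP or of any implementation: the names `cspf` and
   `place_in_order`, the data layout, and the simplifying assumption
   that a link's capacity is shared by both directions are all
   hypothetical.

```python
import heapq

# Links of Table 1: (node, node) -> (metric, capacity).
TOPOLOGY_2 = {
    ("A", "E"): (1, 10),
    ("B", "F"): (1, 10),
    ("C", "G"): (1, 10),
    ("E", "F"): (1, 10),
    ("F", "G"): (1, 10),
}


def cspf(links, src, dst, demand):
    """Lowest-metric path that only uses links with enough free capacity."""
    adj = {}
    for key, link in links.items():
        if link["free"] >= demand:
            u, v = tuple(key)
            adj.setdefault(u, []).append((v, link["metric"]))
            adj.setdefault(v, []).append((u, link["metric"]))
    heap = [(0, src, [src])]
    seen = set()
    while heap:
        cost, node, path = heapq.heappop(heap)
        if node == dst:
            return path
        if node in seen:
            continue
        seen.add(node)
        for nxt, metric in adj.get(node, []):
            if nxt not in seen:
                heapq.heappush(heap, (cost + metric, nxt, path + [nxt]))
    return None  # demand is not routable


def place_in_order(topology, demands):
    """Route (src, dst, bandwidth) demands one by one, reserving as we go."""
    links = {
        frozenset(key): {"metric": metric, "free": capacity}
        for key, (metric, capacity) in topology.items()
    }
    carried, paths = 0, []
    for src, dst, bandwidth in demands:
        path = cspf(links, src, dst, bandwidth)
        if path is not None:
            for u, v in zip(path, path[1:]):
                links[frozenset((u, v))]["free"] -= bandwidth
            carried += bandwidth
        paths.append(path)
    return carried, paths


if __name__ == "__main__":
    arrival_order = [("E", "G", 10), ("A", "B", 10), ("F", "C", 10)]
    coordinated_order = [("A", "B", 10), ("F", "C", 10), ("E", "G", 10)]
    print(place_in_order(TOPOLOGY_2, arrival_order))
    print(place_in_order(TOPOLOGY_2, coordinated_order))
```

   In this sketch the arrival order of Table 2 carries 10 units of
   demand, while an order that places the A-B and F-C demands first
   carries 20, which corresponds to the 50% figure quoted above.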
   Tables 3 and 4 show a simple use case using Reference Topology 1 in
   Figure 1, where LSP state visibility and control of reservation order
   across PCCs would result in significant improvement in total
   throughput.

                      +------+--------+----------+
                      | Link | Metric | Capacity |
                      +------+--------+----------+
                      | A-C  | 1      | 10       |
                      | B-C  | 1      | 10       |
                      | C-E  | 10     | 5        |
                      | C-D  | 1      | 10       |
                      | D-E  | 1      | 10       |
                      +------+--------+----------+

           Table 3: Link parameters for Bin Packing use case

        +------+-----+-----+-----+--------+----------+---------+
        | Time | LSP | Src | Dst | Demand | Routable | Path    |
        +------+-----+-----+-----+--------+----------+---------+
        | 1    | 1   | A   | E   | 5      | Yes      | A-C-D-E |
        | 2    | 2   | B   | E   | 10     | No       | ---     |
        +------+-----+-----+-----+--------+----------+---------+

            Table 4: Bin Packing use case demand time series

3.1.2. Deadlock

   This section discusses a use case of cross-LSP impact under degraded
   operation. Most existing RSVP-TE implementations will not tear down
   established LSPs in the event of the failure of the bandwidth
   increase procedure detailed in [RFC3209]. This behavior is directly
   implied to be correct in [RFC3209] and is often desirable from an
   operator's perspective, because either a) the destination prefixes
   are not reachable via any means other than MPLS or b) tearing down
   the LSP would result in significant packet loss as demand is shifted
   to other LSPs in the overlay mesh.

   In addition, there are currently few implementations offering dynamic
   ingress admission control (policing of the traffic volume mapped onto
   an LSP) at the label edge router (LER).
   Having ingress admission control on a per LSP basis is not
   necessarily desirable from an operational perspective, as a) one must
   over-provision tunnels significantly in order to avoid deleterious
   effects resulting from stacked transport and flow control systems
   (for example, for tunnels that are dynamically resized based on
   current traffic) and b) there is currently no efficient commonly
   available northbound interface for dynamic configuration of per LSP
   ingress admission control.

   Lack of ingress admission control coupled with the behavior in
   [RFC3209] may result in LSPs operating out of profile for significant
   periods of time. It is reasonable to expect that these out-of-profile
   LSPs will be operating in a degraded state and experience traffic
   loss. However, because they end up sharing common network interfaces
   with other LSPs operating within their bandwidth reservations, they
   also impact the operation of the in-profile LSPs, even when there is
   unused network capacity elsewhere in the network. Furthermore, this
   behavior will cause information loss in the TED with regard to the
   actual available bandwidth on the links used by the out-of-profile
   LSPs, as the reservations on the links no longer reflect the capacity
   used.

   Reference Topology 1 in Figure 1 and Tables 5 and 6 show a use case
   that demonstrates this behavior. Two LSPs, LSP 1 and LSP 2, are
   signaled with demand 2 and routed along paths A-C-D-E and B-C-D-E
   respectively. At a later time, the demand of LSP 1 increases to 20.
   Under such a demand, the LSP cannot be resignaled. However, the
   existing LSP will not be torn down. In the absence of ingress
   policing, traffic on LSP 1 will cause degradation for traffic of LSP
   2 (due to oversubscription on the links C-D and D-E), as well as
   information loss in the TED with regard to the actual network state.
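   The gap between what the TED records and what the links actually
   carry can be worked through numerically. The sketch below is only an
   illustration under stated assumptions: it uses the link capacities
   listed in Table 5, treats the offered load of LSP 1 as 20 while its
   reservation stays at 2, and the function name `link_report` and the
   dictionary layout are hypothetical.

```python
# Link capacities as listed in Table 5.
CAPACITY = {"A-C": 10, "B-C": 10, "C-E": 5, "C-D": 10, "D-E": 10}

# "reserved" is what was signaled; "offered" is the traffic actually sent
# (assumption: LSP 1 keeps its reservation of 2 but now offers 20).
LSPS = [
    {"name": "LSP 1", "path": "A-C-D-E", "reserved": 2, "offered": 20},
    {"name": "LSP 2", "path": "B-C-D-E", "reserved": 2, "offered": 2},
]


def link_report(capacity, lsps):
    """Per link: reserved and offered load, TED view, and real overload."""
    report = {link: {"reserved": 0, "offered": 0} for link in capacity}
    for lsp in lsps:
        hops = lsp["path"].split("-")
        for a, b in zip(hops, hops[1:]):
            entry = report[f"{a}-{b}"]
            entry["reserved"] += lsp["reserved"]
            entry["offered"] += lsp["offered"]
    for link, entry in report.items():
        # What the TED advertises as available, based on reservations only.
        entry["ted_free"] = capacity[link] - entry["reserved"]
        # Traffic in excess of the physical link capacity.
        entry["overload"] = max(0, entry["offered"] - capacity[link])
    return report


if __name__ == "__main__":
    for link, entry in link_report(CAPACITY, LSPS).items():
        print(link, entry)
```

   Under these assumptions, links C-D and D-E each show 6 units free in
   the TED while being offered 22 units against a capacity of 10, so the
   in-profile LSP 2 shares a congested link that the TED reports as
   mostly idle.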
   The problem could be easily ameliorated by global visibility of LSP
   state coupled with PCC-external demand measurements and placement of
   the two LSPs on disjoint links. Note that while the demand of 20 for
   LSP 1 could never be satisfied in the given topology, what could be
   achieved would be isolation from the ill-effects of the
   (unsatisfiable) increased demand.

                      +------+--------+----------+
                      | Link | Metric | Capacity |
                      +------+--------+----------+
                      | A-C  | 1      | 10       |
                      | B-C  | 1      | 10       |
                      | C-E  | 10     | 5        |
                      | C-D  | 1      | 10       |
                      | D-E  | 1      | 10       |
                      +------+--------+----------+

      Table 5: Link parameters for the 'Degraded operation' example

        +------+-----+-----+-----+--------+----------+---------+
        | Time | LSP | Src | Dst | Demand | Routable | Path    |
        +------+-----+-----+-----+--------+----------+---------+
        | 1    | 1   | A   | E   | 2      | Yes      | A-C-D-E |
        | 2    | 2   | B   | E   | 2      | Yes      | B-C-D-E |
        | 3    | 1   | A   | E   | 20     | No       | ---     |
        +------+-----+-----+-----+--------+----------+---------+

             Table 6: Degraded operation demand time series

3.1.3. Minimum Perturbation

   As a result of both the lack of visibility into global LSP state and
   the lack of control over event ordering across PCE sessions,
   unnecessary perturbations may be introduced into the network by a
   stateless PCE. Tables 7 and 8 show an example of an unnecessary
   network perturbation using Reference Topology 1 in Figure 1. In this
   case an unimportant (high LSP priority value) LSP (LSP1) is first set
   up along the shortest path. At time 2, which is assumed to be
   relatively close to time 1, a second more important (lower LSP
   priority value) LSP (LSP2) is established, preempting LSP1 and
   potentially causing traffic loss. LSP1 is then reestablished on the
   longer A-C-E path.
                      +------+--------+----------+
                      | Link | Metric | Capacity |
                      +------+--------+----------+
                      | A-C  | 1      | 10       |
                      | B-C  | 1      | 10       |
                      | C-E  | 10     | 10       |
                      | C-D  | 1      | 10       |
                      | D-E  | 1      | 10       |
                      +------+--------+----------+

     Table 7: Link parameters for the 'Minimum-Perturbation' example

   +------+-----+-----+-----+--------+----------+----------+---------+
   | Time | LSP | Src | Dst | Demand | LSP Prio | Routable | Path    |
   +------+-----+-----+-----+--------+----------+----------+---------+
   | 1    | 1   | A   | E   | 7      | 7        | Yes      | A-C-D-E |
   | 2    | 2   | B   | E   | 7      | 0        | Yes      | B-C-D-E |
   | 3    | 1   | A   | E   | 7      | 7        | Yes      | A-C-E   |
   +------+-----+-----+-----+--------+----------+----------+---------+

        Table 8: Minimum-Perturbation LSP and demand time series

   A stateful PCE can help in this scenario by computing both routes at
   the same time. The advantages of using a stateful PCE over
   exploiting a stateless PCE via Global Concurrent Optimization (GCO)
   are threefold. First is the ability to accommodate concurrent path
   computation from different PCCs. Second is the reduction of control
   plane overhead, since the stateful PCE has the route information of
   the affected LSPs. Third, the stateful PCE can use the LSP-DB to
   further optimize the placement of LSPs. This will ensure placement
   of the more important LSP along the shortest path, avoiding the setup
   and subsequent preemption of the lower priority LSP. Similarly, when
   a new higher priority LSP requires preemption of existing lower
   priority LSP(s), a stateful PCE can determine the minimum number of
   lower priority LSP(s) to reroute using the make-before-break (MBB)
   mechanism without disrupting any service and then set up the higher
   priority LSP.

3.1.4. Predictability

   Randomization of reservation events caused by lack of control over
   event ordering across PCE sessions results in poor predictability in
   LSP routing.
   An offline system applying a consistent optimization method will
   produce predictable results to within either the boundary of forecast
   error (when reservations are over-provisioned by reasonable margins)
   or the variability of the signal and the forecast error (when
   applying some hysteresis in order to minimize churn). Predictable
   results are valuable for being able to simulate the network and
   reliably test it under various scenarios, especially under various
   failure modes and planned maintenance when predictable path
   characteristics are desired under contention for network resources.

   Reference Topology 1 and Tables 9, 10 and 11 show the impact of event
   ordering on the predictability of LSP routing.

                      +------+--------+----------+
                      | Link | Metric | Capacity |
                      +------+--------+----------+
                      | A-C  | 1      | 10       |
                      | B-C  | 1      | 10       |
                      | C-E  | 1      | 10       |
                      | C-D  | 1      | 10       |
                      | D-E  | 1      | 10       |
                      +------+--------+----------+

        Table 9: Link parameters for the 'Predictability' example

        +------+-----+-----+-----+--------+----------+---------+
        | Time | LSP | Src | Dst | Demand | Routable | Path    |
        +------+-----+-----+-----+--------+----------+---------+
        | 1    | 1   | A   | E   | 7      | Yes      | A-C-E   |
        | 2    | 2   | B   | E   | 7      | Yes      | B-C-D-E |
        +------+-----+-----+-----+--------+----------+---------+

          Table 10: Predictability LSP and demand time series 1

        +------+-----+-----+-----+--------+----------+---------+
        | Time | LSP | Src | Dst | Demand | Routable | Path    |
        +------+-----+-----+-----+--------+----------+---------+
        | 1    | 2   | B   | E   | 7      | Yes      | B-C-E   |
        | 2    | 1   | A   | E   | 7      | Yes      | A-C-D-E |
        +------+-----+-----+-----+--------+----------+---------+

          Table 11: Predictability LSP and demand time series 2

   As can be seen in the example, both LSPs are routed in both cases,
   but along very different paths.
   This would be a challenge if reliable simulation of the network is
   attempted. An active stateful PCE can solve this through control
   over LSP ordering. Based on triggers such as a failure or an
   optimization trigger, the PCE can order the computations and path
   setup in a deterministic way.

3.2. Auto-bandwidth Adjustment

   The bandwidth requirements of LSPs often change over time, requiring
   resizing of the LSP. In most implementations available today, the
   head-end node performs this function by monitoring the actual
   bandwidth usage, triggering a recomputation and resignaling when a
   threshold is reached. This operation is referred to as
   auto-bandwidth adjustment. The head-end node either recomputes the
   path locally, or it requests a recomputation from a PCE by sending a
   PCReq message. In the latter case, the PCE computes a new path and
   provides the new route suggestion. Upon receiving the reply from the
   PCE, the PCC re-signals the LSP in Shared-Explicit (SE) mode along
   the newly computed path. With a stateless PCE, the head-end node
   needs to provide the currently used bandwidth and the route
   information via path computation request messages. Note that in this
   scenario, the head-end node is the one that drives the LSP resizing
   based on local information, and that the difference between using a
   stateless and a passive stateful PCE is in the level of optimization
   of the LSP placement as discussed in the previous section.

   A more interesting smart bandwidth adjustment case is one where the
   LSP resizing decision is made by an external entity, with access to
   additional information such as historical trending data,
   application-specific information about expected demands or policy
   information, as well as knowledge of the actual desired flow volumes.
   In this case an active stateful PCE provides an advantage both in the
   computation with knowledge of all LSPs in the domain and in the
   ability to trigger bandwidth modification of the LSP.

3.3. Bandwidth Scheduling

   Bandwidth scheduling allows network operators to reserve resources in
   advance according to the agreements with their customers, and allows
   them to transmit data with specified starting time and duration, for
   example for a scheduled bulk data replication between data centers.

   Traditionally, this can be supported by network management system
   (NMS) operation through path pre-establishment and activation at the
   agreed starting time. However, this does not provide efficient
   network usage since the established paths cannot be used by other
   services even when they are not carrying any service. It can also be
   accomplished through GMPLS protocol extensions by carrying the
   related request information (e.g., starting time and duration) across
   the network. Nevertheless, this method inevitably increases the
   complexity of the signaling and routing processes.

   A passive stateful PCE can support this application with better
   efficiency since it can alleviate the burden of processing on network
   elements. This requires the PCE to maintain the scheduled LSPs and
   their associated resource usage, as well as the ability of head-ends
   to trigger signaling for LSP setup/deletion at the correct time.
   This approach requires coarse time synchronization between PCEs and
   PCCs. With PCE initiation capability, a PCE can trigger the setup
   and deletion of scheduled requests in a centralized manner, without
   modification of existing head-end behaviors, by notifying the PCCs to
   set up or tear down the paths.

3.4. Recovery

   The recovery use cases discussed in the following sections show how
   leveraging a stateful PCE can simplify the computation of recovery
   path(s). In particular, two characteristics of a stateful PCE are
   used: 1) using information stored in the LSP-DB for determining
   shared protection resources and 2) performing computations with
   knowledge of all LSPs in a domain.

3.4.1. Protection

   If a PCC can specify in a request whether the computation is for a
   working path or for protection, and a PCC can report the resource as
   a working or protection path, then the following text applies. A PCC
   can send multiple requests to the PCE, asking for two LSPs, and use
   them as working and backup paths separately. Either way, the
   resources bound to backup paths can be shared by different LSPs to
   improve the overall network efficiency, such as m:n protection or
   pre-configured shared mesh recovery techniques as specified in
   [RFC4427]. If resource sharing is supported for LSP protection, the
   information relating to existing LSPs is required to avoid allocation
   of shared protection resources to two LSPs that might fail together
   and cause protection contention issues. A stateless PCE can
   accommodate this use case by having the PCC pass this information as
   a constraint in the path computation request. A passive stateful PCE
   can more easily accommodate this need using the information stored in
   its LSP-DB. Furthermore, an active stateful PCE can help with
   (re-)optimization of protection resource sharing as well as LSP
   maintenance operation with less impact on protection resources.
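   The contention check described above can be sketched as a lookup in
   the LSP-DB: two LSPs may share protection resources only if their
   working paths cannot fail together. The sketch below is a minimal
   illustration, assuming that "fail together" means "share at least one
   link" (a real PCE would also consider SRLGs and nodes). The LSP-DB
   layout, the entry names and the helper names `links_of` and
   `may_share_backup` are hypothetical; the paths use the node names of
   the Figure 3 example.

```python
# Hypothetical LSP-DB content: working paths only, written hop by hop.
LSP_DB = {
    "LSP1": {"working": "A-E"},
    "LSP2_via_A": {"working": "B-A-E"},
    "LSP2_via_C": {"working": "B-C-D-E"},
}


def links_of(path):
    """Return the set of undirected links traversed by a path like "B-A-E"."""
    hops = path.split("-")
    return {frozenset(pair) for pair in zip(hops, hops[1:])}


def may_share_backup(lsp_db, lsp_a, lsp_b):
    """True if the two working paths have no link in common.

    When this holds, a single link failure cannot hit both working
    paths, so their backup paths can share protection resources.
    """
    links_a = links_of(lsp_db[lsp_a]["working"])
    links_b = links_of(lsp_db[lsp_b]["working"])
    return links_a.isdisjoint(links_b)


if __name__ == "__main__":
    print(may_share_backup(LSP_DB, "LSP1", "LSP2_via_A"))
    print(may_share_backup(LSP_DB, "LSP1", "LSP2_via_C"))
```

   With a stateless PCE the same check is possible only if the PCC
   supplies the working paths of the other LSPs in its request; a
   stateful PCE can read them from its own LSP-DB.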
                           +----+
                           |PCE |
                           +----+

             +------+          +------+          +------+
             |   A  +----------+   B  +----------+   C  |
             +--+---+          +---+--+          +---+--+
                |                  |                 |
                |        +---------+                 |
                |        |                           |
                |     +--+---+          +------+     |
                +-----+   E  +----------+   D  +-----+
                      +------+          +------+

                     Figure 3: Reference topology 3

   For example, in the network depicted in Figure 3, suppose there
   exists LSP1 with working path LSP1_working following A->E and with
   backup path LSP1_backup following A->B->E. A request arrives asking
   for a working and backup path pair to be computed for LSP2 from B to
   E. If the PCE decides LSP2_working follows B->A->E, then the backup
   path LSP2_backup should not share the same protection resource with
   LSP1 since LSP2 shares part of its resource (specifically A->E) with
   LSP1 (i.e., these two LSPs are in the same shared risk group). There
   is no such constraint if B->C->D->E is chosen for LSP2_working.

   If a stateless PCE is used, the head node B needs to be aware of the
   existence of LSPs which share the route of LSP2_working and of the
   details of their protection resources. B must pass this information
   to the PCE as a constraint so as to request a path with diversity.
   Alternatively, a stateless PCE may be able to compute Shared Risk
   Link Group (SRLG)-diversified paths if the TED is extended so that it
   includes the SRLG information that is protected by a given backup
   resource, but at the expense of high complexity in routing. On the
   other hand, a stateful PCE can get the LSP information by itself
   given the LSP identifier(s) and can achieve the goal of finding
   SRLG-diversified protection paths for both LSPs. This is made
   possible by comparing the LSP resource usage, exploiting the LSP-DB
   accessible by the stateful PCE.

3.4.2. Restoration

   In case of a link failure, such as a fiber cut, multiple LSPs may
   fail at the same time.
   Thus, the source nodes of the affected LSPs will be informed of the
   failure by the nodes detecting the failure. These source nodes will
   send requests to a PCE for rerouting. In order to reuse the resource
   taken by an existing LSP, the source node can send a PCReq message
   including the Exclude Route Object (XRO) with Fail (F) bit set,
   together with the Record Route Object (RRO) containing the current
   route information, as specified in [RFC5521].

   If a stateless PCE is used, it might respond to the rerouting
   requests separately if they arrive at different times. Thus, it
   might result in sub-optimal resource usage. Even worse, it might
   unnecessarily block some of the rerouting requests due to
   insufficient resources for later-arrived rerouting messages. If a
   passive stateful PCE is used to fulfill this task, the procedure can
   be simplified. The PCCs reporting the failures can include LSP
   identifiers instead of detailed information and the PCE can find
   relevant LSP information by inspecting the LSP-DB. Moreover, the PCE
   can re-compute the affected LSPs concurrently while reusing part of
   the existing LSPs' resources when it is informed of the failed link
   identifier provided by the first request. This is made possible
   since the passive stateful PCE can check what other LSPs are affected
   by the failed link and their route information by inspecting its
   LSP-DB. As a result, better performance can be achieved, such as
   better resource usage or minimal probability of blocking upcoming new
   rerouting requests sent as a result of the link failure.

   If the target is to avoid resource contention within a time window
   with a high number of LSP rerouting requests, a stateful PCE can
   retain the under-construction LSP resource usage information for a
   given time and exclude it from being used for forthcoming LSP
   requests.
   In this way, it can ensure that the resource will not be
   double-booked and thus the issue of resource contention and
   computation crankbacks can be alleviated.

3.4.3.  SRLG Diversity

   An alternative way to achieve efficient resilience is to maintain
   SRLG disjointness between LSPs, irrespective of whether these LSPs
   share the source and destination nodes or not. This can be achieved
   at provisioning time, if the routes of all the LSPs are requested
   together, using a synchronized computation of the different LSPs with
   an SRLG disjointness constraint. If the LSPs need to be provisioned
   at different times, the PCC can specify, as constraints to the path
   computation, a set of SRLGs using the Exclude Route Object [RFC5521].
   However, for the latter to be effective, the entity that requests the
   route from the PCE must maintain updated SRLG information for all the
   LSPs with which it must maintain disjointness. A stateless PCE can
   compute an SRLG-disjoint path by inspecting the TED and precluding
   the links with the same SRLG values specified in the PCReq message
   sent by a PCC.

   A passive stateful PCE maintains the updated SRLG information of the
   established LSPs in a centralized manner. Therefore, the PCC can
   specify as constraints to the path computation the SRLG disjointness
   of a set of already established LSPs by only providing the LSP
   identifiers. Similarly, a passive stateful PCE can also accommodate
   disjointness using other constraints, such as link, node, or path
   segment.

3.5.  Maintenance of Virtual Network Topology (VNT)

   In Multi-Layer Networks (MLNs), a Virtual Network Topology (VNT)
   [RFC5212] consists of a set of one or more TE LSPs in the lower layer
   that provide TE links to the upper layer. In [RFC5623], the
   PCE-based architecture is proposed to support path computation in
   MLNs in order to achieve inter-layer TE.
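
   To make the VNT definition above concrete, the following is a minimal
   sketch of how lower-layer TE LSPs could be mapped to the TE links
   they provide to the upper layer. It is illustrative only: the names
   LowerLayerLsp and build_vnt, the node names, the bandwidth unit, and
   the choice to aggregate parallel LSPs into one TE link are all
   assumptions made for this example and are not defined by [RFC5212],
   [RFC5623], or this document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LowerLayerLsp:
    """A TE LSP established in the lower layer (hypothetical model)."""
    name: str
    ingress: str       # upper-layer node at which the LSP starts
    egress: str        # upper-layer node at which the LSP ends
    bandwidth: float   # capacity offered upward (assumed unit: Gb/s)


def build_vnt(lsps):
    """Return the upper-layer TE links provided by the lower-layer LSPs.

    The result maps (ingress, egress) to the total bandwidth of that
    TE link. Summing parallel LSPs into one link is a simplification
    chosen for this sketch, not a requirement of any specification.
    """
    vnt = {}
    for lsp in lsps:
        key = (lsp.ingress, lsp.egress)
        vnt[key] = vnt.get(key, 0.0) + lsp.bandwidth
    return vnt


lsps = [
    LowerLayerLsp("opt-1", "R1", "R2", 10.0),
    LowerLayerLsp("opt-2", "R2", "R3", 10.0),
    LowerLayerLsp("opt-3", "R1", "R2", 40.0),
]
vnt = build_vnt(lsps)
print(vnt)  # {('R1', 'R2'): 50.0, ('R2', 'R3'): 10.0}
```

   In this model, setting up or tearing down a lower-layer LSP changes
   the set of TE links seen by the upper layer, which is why such
   decisions benefit from knowledge of the LSPs in both layers.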

   The establishment/teardown of a TE link in the VNT needs to take into
   consideration the state of existing LSPs and/or new LSP request(s) in
   the higher layer. Hence, when a stateless PCE cannot find the route
   for a request based on the upper layer topology information, it does
   not have enough information to decide whether to set up or remove a
   TE link, which can result in non-optimal resource usage. On the
   other hand, a passive stateful PCE can make a better decision about
   when and how to modify the VNT either to accommodate new LSP requests
   or to re-optimize resource usage across layers, irrespective of the
   PCE models described in [RFC5623]. Furthermore, given the active
   capability, the stateful PCE can issue VNT modification suggestions
   in order to accommodate path setup requests or re-optimize resource
   usage across layers.

3.6.  LSP Re-optimization

   In order to make efficient use of network resources, it is sometimes
   desirable to re-optimize one or more LSPs dynamically. In the case
   of a stateless PCE, in order to optimize network resource usage
   dynamically through online planning, a PCC must send a request to the
   PCE together with detailed path/bandwidth information of the LSPs
   that need to be concurrently optimized. This means the PCC must be
   able to determine when and which LSPs should be optimized. In the
   case of a passive stateful PCE, given the LSP state information in
   the LSP database, the process of dynamic optimization of network
   resources can be simplified without requiring the PCC to supply
   detailed LSP state information.
   Moreover, an active stateful PCE can even automate the process by
   triggering the request. Since a stateful PCE can maintain
   information for all LSPs that are in the process of being set up, and
   it may have the ability to control the timing and sequence of LSP
   setup/deletion, the optimization procedures can be performed more
   intelligently and effectively. A stateful PCE can also determine
   which LSP should be re-optimized based on network events. For
   example, when an LSP is torn down, its resources are freed. This can
   trigger the stateful PCE to automatically determine which LSP should
   be re-optimized so that the recently freed resources may be allocated
   to it.

   A special case of LSP re-optimization is Global Concurrent
   Optimization (GCO) [RFC5557]. Global control of LSP operation
   sequence in [RFC5557] is predicated on the use of what is effectively
   a stateful (or semi-stateful) NMS. The NMS can be either not local
   to the network nodes, in which case another northbound interface is
   required for LSP attribute changes, or local/collocated, in which
   case there are significant issues with efficiency in resource usage.
   A stateful PCE adds a few features that:

   o  Roll the NMS visibility into the PCE and remove the requirement
      for an additional northbound interface

   o  Allow the PCE to determine when re-optimization is needed, and at
      which level (GCO or a more incremental optimization)

   o  Allow the PCE to determine which LSPs should be re-optimized

   o  Allow a PCE to control the sequence of events across multiple
      PCCs, allowing for bulk (and truly global) optimization, LSP
      shuffling, etc.

3.7.  Resource Defragmentation

   If LSPs are dynamically allocated and released over time, resources
   become fragmented.
   In networks with link bundles, the overall available resource on a
   (bundle) link might be sufficient for a new LSP request, but if the
   available resource is not contiguous, the request is rejected. In
   order to perform the defragmentation procedure, stateful PCEs can be
   used, since global visibility of LSPs in the network is required to
   accurately assess resources on the LSPs and perform defragmentation
   while ensuring minimal disruption of the network. This use case
   cannot be accommodated by a stateless PCE since it does not possess
   the detailed information of existing LSPs in the network.

   Another case of particular interest is optical spectrum
   defragmentation in flexible grid networks. In flexible grid networks
   [RFC7698], LSPs with different optical spectrum sizes (such as
   12.5 GHz, 25 GHz, etc.) can co-exist so as to accommodate services
   with different bandwidth requests. Therefore, even if the overall
   spectrum size can meet the service request, it may not be usable if
   the available spectrum resource is not contiguous, but rather
   fragmented into smaller pieces. Thus, with the help of existing LSP
   state information, a stateful PCE can regroup the resource so that it
   becomes usable. Moreover, a stateful PCE can proactively choose
   routes for upcoming path requests to reduce the chance of spectrum
   fragmentation.

3.8.  Point-to-Multi-Point Applications

   PCE has been identified as an appropriate technology for the
   determination of the paths of point-to-multipoint (P2MP) TE LSPs
   [RFC5671]. The application scenarios and use cases described in
   Section 3.1, Section 3.4 and Section 3.6 are also applicable to P2MP
   TE LSPs.

   In addition to these, the stateful nature of a PCE simplifies the
   information conveyed in PCEP messages since it is possible to refer
   to the LSPs via an identifier.
   For P2MP, where the PCEP messages are much larger, this is an added
   advantage. In the case of stateless PCEs, modification of a P2MP
   tree requires encoding of all leaves along with the paths in the
   PCReq message. With a stateful PCE with P2MP capability, however,
   the PCEP message can be used to convey only the modifications (the
   other information can be retrieved from the identifier via the
   LSP-DB).

3.9.  Impairment-Aware Routing and Wavelength Assignment (IA-RWA)

   In Wavelength Switched Optical Networks (WSONs) [RFC6163], a
   wavelength-switched LSP traverses one or more fiber links. The bit
   rates of the client signals carried by the wavelength LSPs may be the
   same or different. Hence, a fiber link may transmit a number of
   wavelength LSPs with equal or mixed bit rate signals. For example, a
   fiber link may multiplex the wavelengths with only 10Gb/s signals,
   mixed 10Gb/s and 40Gb/s signals, or mixed 40Gb/s and 100Gb/s signals.

   IA-RWA in WSONs refers to the process (i.e., lightpath computation)
   that takes into account the optical layer/transmission imperfections
   by considering them as additional (i.e., physical layer) constraints.
   To be more specific, linear and non-linear effects associated with
   the optical network elements should be incorporated into the route
   and wavelength assignment procedure. For example, physical
   imperfections can result in interference between two adjacent
   lightpaths. Thus, a guard band should be reserved between them to
   alleviate these effects. The width of the guard band between two
   adjacent wavelengths depends on their characteristics, such as
   modulation formats and bit rates. Two adjacent wavelengths with
   different characteristics (e.g., different bit rates) may need a
   wider guard band, while those with the same characteristics may need
   a narrower guard band.
   For example, 50GHz spacing may be acceptable for two adjacent
   wavelengths with 40G signals. But for two adjacent wavelengths with
   different bit rates (e.g., 10G and 40G), a larger spacing, such as
   300GHz, may be needed. Hence, the characteristics (states) of the
   existing wavelength LSPs should be considered for a new RWA request
   in a WSON.

   In summary, when stateful PCEs are used to perform the IA-RWA
   procedure, they need to know the characteristics of the existing
   wavelength LSPs. The impairment information relating to existing and
   to-be-established LSPs can be obtained by nodes in WSONs via external
   configuration or other means such as monitoring or estimation based
   on a vendor-specific impairment model. However, WSON-related routing
   protocols, i.e., [RFC7688] and [RFC7580], only advertise limited
   information (i.e., availability) of the existing wavelengths, without
   defining the supported client bit rates. It will incur a substantial
   amount of control plane overhead if routing protocols are extended to
   support dissemination of the new information relevant for the IA-RWA
   process. In this scenario, stateful PCE(s) would be a more
   appropriate mechanism to solve this problem. Stateful PCE(s) can
   exploit impairment information of LSPs stored in the LSP-DB to
   provide accurate RWA calculation.

4.  Deployment Considerations

   This section discusses general issues with stateful PCE deployments,
   and identifies areas where additional protocol extensions and
   procedures are needed to address them. Definitions of protocol
   mechanisms are beyond the scope of this document.

4.1.  Multi-PCE Deployments

   Stateless and stateful PCEs can co-exist in the same network and be
   in charge of path computation of different types. To solve the
   problem of distinguishing between the two types of PCEs, either
   discovery or configuration may be used.

   Multiple stateful PCEs can co-exist in the same network. These PCEs
   may provide redundancy for load sharing, resilience, or partitioning
   of computation features. Regardless of the reason for multiple PCEs,
   an LSP is only delegated to one of the PCEs at any given point in
   time. However, an LSP can be re-delegated between PCEs, for example
   when a PCE fails. [RFC7399] discusses various approaches for
   synchronizing state among the PCEs when multiple PCEs are used for
   load sharing or backup and compute LSPs for the same network.

4.2.  LSP State Synchronization

   The LSP-DB is populated using information received from the PCC. The
   accuracy of the computations depends on the accuracy of the databases
   used, so it is worth noting that the PCE's view lags behind the true
   state of the network, because the updates must reach the PCE from the
   network. Thus, the use of a stateful PCE reduces but cannot
   eliminate the possibility of crankbacks, nor can it guarantee optimal
   computations all the time. [RFC7399] discusses these limitations and
   potential ways to alleviate them.

   In case of multiple PCEs with different capabilities co-existing in
   the same network, such as a passive stateful PCE and an active
   stateful PCE, it is useful to refer to an LSP, be it delegated or
   not, by a unique identifier instead of providing detailed information
   (e.g., route, bandwidth, etc.) associated with it, when these PCEs
   cooperate on path computation, such as for load sharing.

4.3.  PCE Survivability

   For a stateful PCE, an important issue is to get the LSP state
   information resynchronized after a restart. LSP state
   synchronization procedures can be applied equally to a network node
   or another PCE, allowing multiple ways of re-acquiring the LSP
   database on a restart.
   Synchronization may also be skipped: if a PCE implementation has the
   means to retrieve its database in a different way (for example, from
   a backup copy stored locally), the state can be restored without
   further overhead in the network. A hybrid approach where the bulk of
   the state is recovered locally, and a small amount of state is
   reacquired from the network, is also possible. Note that locally
   recovering the state would still require some degree of
   resynchronization to ensure that the recovered state is indeed
   up-to-date. Depending on the resynchronization mechanism used, there
   may be an additional load on the PCE, and there may be a delay in
   reaching the synchronized state, which may negatively affect
   survivability. Different resynchronization methods are suited for
   different deployments and objectives.

5.  Security Considerations

   This document describes general considerations for a stateful PCE
   deployment and examines its applicability and benefits, as well as
   its challenges and limitations through a number of use cases. No new
   protocol extensions to PCEP are defined in this document.

   The PCEP extensions in support of the stateful PCE and the delegation
   of path control ability can result in more information and control
   being available for a hypothetical adversary and a number of
   additional attack surfaces which must be protected. This includes,
   but is not limited to, the authentication and encryption of PCEP
   sessions and snooping of the state of the LSPs active in the network.
   Therefore, documents where the PCEP protocol extensions are defined
   need to consider the issues and risks associated with a stateful PCE.

6.  IANA Considerations

   This document does not require any IANA action.

7.  Contributing Authors

   The following people all contributed significantly to this document
   and are listed below in alphabetical order:

   Ramon Casellas
   CTTC - Centre Tecnologic de Telecomunicacions de Catalunya
   Av. Carl Friedrich Gauss n7
   Castelldefels, Barcelona 08860
   Spain
   Email: ramon.casellas@cttc.es

   Edward Crabbe
   Email: edward.crabbe@gmail.com

   Dhruv Dhody
   Huawei Technology
   Leela Palace
   Bangalore, Karnataka 560008
   INDIA
   Email: dhruv.dhody@huawei.com

   Oscar Gonzalez de Dios
   Telefonica Investigacion y Desarrollo
   Emilio Vargas 6
   Madrid, 28045
   Spain
   Phone: +34 913374013
   Email: ogondio@tid.es

   Young Lee
   Huawei
   1700 Alma Drive, Suite 100
   Plano, TX 75075
   US
   Phone: +1 972 509 5599 x2240
   Fax: +1 469 229 5397
   Email: leeyoung@huawei.com

   Jan Medved
   Cisco Systems, Inc.
   170 West Tasman Dr.
   San Jose, CA 95134
   US
   Email: jmedved@cisco.com

   Robert Varga
   Pantheon Technologies LLC
   Mlynske Nivy 56
   Bratislava 821 05
   Slovakia
   Email: robert.varga@pantheon.sk

   Fatai Zhang
   Huawei Technologies
   F3-5-B R&D Center, Huawei Base
   Bantian, Longgang District
   Shenzhen 518129 P.R.China
   Phone: +86-755-28972912
   Email: zhangfatai@huawei.com

   Xiaobing Zi
   Email: unknown

8.  Acknowledgements

   We would like to thank Cyril Margaria, Adrian Farrel, JP Vasseur and
   Ravi Torvi for the useful comments and discussions.

9.  References

9.1.  Normative References

   [RFC4655]  Farrel, A., Vasseur, J., and J. Ash, "A Path Computation
              Element (PCE)-Based Architecture", RFC 4655,
              DOI 10.17487/RFC4655, August 2006.

   [RFC5440]  Vasseur, JP., Ed. and JL. Le Roux, Ed., "Path Computation
              Element (PCE) Communication Protocol (PCEP)", RFC 5440,
              DOI 10.17487/RFC5440, March 2009.

   [RFC7399]  Farrel, A. and D.
              King, "Unanswered Questions in the Path
              Computation Element Architecture", RFC 7399,
              DOI 10.17487/RFC7399, October 2014.

9.2.  Informative References

   [RFC3209]  Awduche, D., Berger, L., Gan, D., Li, T., Srinivasan, V.,
              and G. Swallow, "RSVP-TE: Extensions to RSVP for LSP
              Tunnels", RFC 3209, DOI 10.17487/RFC3209, December 2001.

   [RFC4427]  Mannie, E., Ed. and D. Papadimitriou, Ed., "Recovery
              (Protection and Restoration) Terminology for Generalized
              Multi-Protocol Label Switching (GMPLS)", RFC 4427,
              DOI 10.17487/RFC4427, March 2006.

   [RFC4657]  Ash, J., Ed. and J. Le Roux, Ed., "Path Computation
              Element (PCE) Communication Protocol Generic
              Requirements", RFC 4657, DOI 10.17487/RFC4657, September
              2006.

   [RFC5212]  Shiomoto, K., Papadimitriou, D., Le Roux, JL., Vigoureux,
              M., and D. Brungard, "Requirements for GMPLS-Based Multi-
              Region and Multi-Layer Networks (MRN/MLN)", RFC 5212,
              DOI 10.17487/RFC5212, July 2008.

   [RFC5521]  Oki, E., Takeda, T., and A. Farrel, "Extensions to the
              Path Computation Element Communication Protocol (PCEP) for
              Route Exclusions", RFC 5521, DOI 10.17487/RFC5521, April
              2009.

   [RFC5557]  Lee, Y., Le Roux, JL., King, D., and E. Oki, "Path
              Computation Element Communication Protocol (PCEP)
              Requirements and Protocol Extensions in Support of Global
              Concurrent Optimization", RFC 5557, DOI 10.17487/RFC5557,
              July 2009.

   [RFC5623]  Oki, E., Takeda, T., Le Roux, JL., and A. Farrel,
              "Framework for PCE-Based Inter-Layer MPLS and GMPLS
              Traffic Engineering", RFC 5623, DOI 10.17487/RFC5623,
              September 2009.

   [RFC5671]  Yasukawa, S. and A. Farrel, Ed., "Applicability of the
              Path Computation Element (PCE) to Point-to-Multipoint
              (P2MP) MPLS and GMPLS Traffic Engineering (TE)", RFC 5671,
              DOI 10.17487/RFC5671, October 2009.

   [RFC6163]  Lee, Y., Ed., Bernstein, G., Ed., and W. Imajuku,
              "Framework for GMPLS and Path Computation Element (PCE)
              Control of Wavelength Switched Optical Networks (WSONs)",
              RFC 6163, DOI 10.17487/RFC6163, April 2011.

   [RFC7580]  Zhang, F., Lee, Y., Han, J., Bernstein, G., and Y. Xu,
              "OSPF-TE Extensions for General Network Element
              Constraints", RFC 7580, DOI 10.17487/RFC7580, June 2015.

   [RFC7688]  Lee, Y., Ed. and G. Bernstein, Ed., "GMPLS OSPF
              Enhancement for Signal and Network Element Compatibility
              for Wavelength Switched Optical Networks", RFC 7688,
              DOI 10.17487/RFC7688, November 2015.

   [RFC7698]  Gonzalez de Dios, O., Ed., Casellas, R., Ed., Zhang, F.,
              Fu, X., Ceccarelli, D., and I. Hussain, "Framework and
              Requirements for GMPLS-Based Control of Flexi-Grid Dense
              Wavelength Division Multiplexing (DWDM) Networks",
              RFC 7698, DOI 10.17487/RFC7698, November 2015.

Authors' Addresses

   Xian Zhang (editor)
   Huawei Technologies
   F3-5-B R&D Center, Huawei Industrial Base, Bantian, Longgang District
   Shenzhen, Guangdong 518129
   P.R.China

   Email: zhang.xian@huawei.com

   Ina Minei (editor)
   Google, Inc.
   1600 Amphitheatre Parkway
   Mountain View, CA 94043
   US

   Email: inaminei@google.com