idnits 2.17.1

draft-ietf-pals-endpoint-fast-protection-04.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (November 13, 2016) is 2714 days in the past.  Is this
     intentional?

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  ** Obsolete normative reference: RFC 4447 (Obsoleted by RFC 8077)

     Summary: 1 error (**), 0 flaws (~~), 1 warning (==), 1 comment (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.
--------------------------------------------------------------------------------

Internet Engineering Task Force                               Yimin Shen
Internet-Draft                                          Juniper Networks
Intended status: Standards Track                          Rahul Aggarwal
Expires: May 17, 2017                                        Arktan, Inc
                                                          Wim Henderickx
                                                          Alcatel-Lucent
                                                          Yuanlong Jiang
                                                     Huawei Technologies
                                                       November 13, 2016

                  PW Endpoint Fast Failure Protection
              draft-ietf-pals-endpoint-fast-protection-04

Abstract

   This document specifies a fast mechanism for protecting pseudowires
   against egress endpoint failures, including egress attachment circuit
   failure, egress PE failure, multi-segment PW terminating PE failure,
   and multi-segment PW switching PE failure.  Operating on the basis of
   multi-homed CE, redundant PWs, upstream label assignment and context
   specific label switching, the mechanism enables local repair to be
   performed by the router upstream adjacent to a failure.  The router
   can restore a PW in the order of tens of milliseconds, by rerouting
   traffic around the failure to a protector through a pre-established
   bypass tunnel.  Therefore, the mechanism can be used to reduce
   traffic loss before global repair reacts to the failure and the
   network converges on the topology changes due to the failure.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on May 17, 2017.

Copyright Notice

   Copyright (c) 2016 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Specification of Requirements
   3.  Reference Models for Egress Endpoint Failures
     3.1.  Single-Segment PW
     3.2.  Multi-Segment PW
   4.  Theory of Operation
     4.1.  Applicability
     4.2.  Local Repair and Protector
     4.3.  Context Identifier
       4.3.1.  Semantics
       4.3.2.  FEC
       4.3.3.  IGP Advertisement and Path Computation
     4.4.  Protection Models
       4.4.1.  Co-located Protector
       4.4.2.  Centralized Protector
     4.5.  Transport Tunnel
     4.6.  Bypass Tunnel
     4.7.  Examples of Forwarding State
       4.7.1.  Co-located Protector Model
       4.7.2.  Centralized Protector Model
   5.  Revertive Behavior
   6.  LDP Extensions
     6.1.  Egress Protection Capability TLV
     6.2.  PW Label Distribution from Primary PE to Protector
     6.3.  PW Label Distribution from Backup PE to Protector
     6.4.  Protection FEC Element TLV
       6.4.1.  Encoding Format for PWid
       6.4.2.  Encoding Format for Generalized PWid
   7.  IANA Considerations
   8.  Security Considerations
   9.  Acknowledgements
   10. References
     10.1.  Normative References
     10.2.  Informative References
   Authors' Addresses

1.  Introduction

   Per [RFC3985, RFC4447, RFC5659], a pseudowire (PW) or PW segment can
   be thought of as a connection between a pair of forwarders hosted by
   two PEs, carrying an emulated layer-2 service over a packet switched
   network (PSN).  In the single-segment PW (SS-PW) case, a forwarder
   binds a PW to an attachment circuit (AC).  In the multi-segment PW
   (MS-PW) case, a forwarder on a terminating PE (T-PE) binds a PW
   segment to an AC, while a forwarder on a switching PE (S-PE) binds
   one PW segment to another PW segment.  In each direction between the
   PEs, PW packets are transported by a PSN tunnel, which is also called
   a transport tunnel.

   In order to protect the PW service against network failures, it is
   necessary to protect every link and node along the entire data path.
   For the traffic in a given direction, this includes the ingress AC,
   ingress (T-)PE, intermediate routers of the transport tunnel, S-PEs,
   egress (T-)PE, and egress AC.  To minimize service disruption upon a
   failure, it is also desirable that each of these components is
   protected by a fast protection mechanism based on local repair.  Such
   a mechanism generally involves a bypass path that is pre-computed and
   pre-installed in the data plane on the router upstream adjacent to an
   anticipated failure.  This router is referred to as a "point of local
   repair" (PLR).  The bypass path has the property that it can guide
   traffic around the failure, while remaining unaffected by the
   topology changes resulting from the failure.  When the failure
   occurs, the PLR can invoke the bypass path to achieve fast
   restoration for the service.

   Today, fast protection against ingress AC failure and ingress (T-)PE
   failure can be achieved by using a multi-homed CE and redundant ACs,
   such as multi-chassis link aggregation group (MC-LAG).  Fast
   protection against the failure of an intermediate router of the
   transport tunnel can be achieved through RSVP fast-reroute [RFC4090]
   or IP/LDP fast-reroute [RFC5714, RFC5286].  However, there is no
   equivalent mechanism that can be used against an egress AC failure,
   an egress (T-)PE failure, or an S-PE failure.  For these failures,
   service restoration has to rely on global repair or control plane
   repair.  Global repair normally involves the ingress CE or the
   ingress (T-)PE switching traffic to an alternative path, based on
   remote failure detection via PW status notification, end-to-end OAM,
   etc.  Control plane repair relies on control protocols to converge on
   the topology changes due to a failure.  Compared to local repair,
   these mechanisms are relatively slow in reacting to a failure and
   restoring traffic.

   This document is intended to serve the above need.
   It specifies a fast protection mechanism based on local repair to
   protect PWs against the following endpoint failures.

   a.  Egress AC failure.

   b.  Egress PE failure: Link or node failure of an egress PE of an SS-
       PW, or a T-PE of an MS-PW.

   c.  Switching PE (S-PE) failure: Link or node failure of an S-PE of
       an MS-PW.

   The mechanism is applicable to LDP signaled PWs.  It is relevant to
   networks with redundant PWs and multi-homed CEs.  It is designed on
   the basis of MPLS upstream label assignment and context-specific
   label switching [RFC5331].  Fast protection refers to its ability to
   restore traffic in the order of tens of milliseconds.  Compared with
   global repair and control plane repair, this mechanism can provide
   faster service restoration.  However, it is intended to complement
   those mechanisms, rather than replacing them.

2.  Specification of Requirements

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

3.  Reference Models for Egress Endpoint Failures

   This document refers to the following topologies to describe egress
   endpoint failures and protection procedures.

3.1.  Single-Segment PW

            |<-------------- PW1 --------------->|

        - PE1 -------------- P1 ---------------- PE2 -
       /                                              \
      /                                                \
   CE1                                                  CE2
      \                                                /
       \                                              /
        - PE3 -------------- P2 ---------------- PE4 -

            |<-------------- PW2 --------------->|

                               Figure 1

   In Figure 1, the IP/MPLS network consists of PE and P routers.  It
   provides a PW service between CE1 and CE2.  Each CE is multi-homed
   via two ACs to two PEs.  This forms two divergent paths between the
   CEs.  The first path uses PW1 between PE1 and PE2, and the second
   path uses PW2 between PE3 and PE4.
   The transport tunnels of the PWs and other links between the routers
   are not shown in this figure for clarity.

   In general, a CE may operate the ACs in two modes when sending
   traffic to the remote CE, i.e. active-standby mode and active-active
   mode.

   o  In the active-standby mode, the CE chooses one AC as active AC and
      the corresponding path as active path, and uses the other AC as
      standby AC and the corresponding path as standby path.  The CE
      only sends traffic on the active AC as long as the active path is
      operational.  The CE will only send traffic on the standby AC
      after it detects a failure of the active path.  Note that the CE
      may receive traffic on the active or standby AC, depending on
      whether the remote CE chooses the same active path for the traffic
      of the reverse direction.  In this document, even if both CEs
      choose the same active path, each CE should still anticipate
      receiving traffic on a standby AC, because the traffic may be
      redirected to the standby path by the fast protection mechanism.

   o  In the active-active mode, the CE treats both ACs and their
      corresponding paths as active, and sends traffic on both ACs in a
      load-balancing fashion.  In the reverse direction, the CE may
      receive traffic on both ACs.

   The above modes assume the traffic to be data traffic which is not
   bound to a specific AC.  This does not include control protocol
   traffic between the CEs, when the CE-CE control protocol sessions or
   adjacencies established on the two ACs are considered as distinct,
   rather than having a primary and backup relationship.  In general, a
   dual-homed CE should not make any explicit or implicit assumptions
   regarding the specific AC from which it receives packets from the
   remote CE.

   For either mode, when considering the traffic flowing in a given
   direction over an active path, this document views the ACs, PEs and
   PWs as serving primary or backup roles.
   In particular, the ACs, PEs and PW along this active path are
   primary, while those along the other path are backup.  Note that in
   the active-active mode, the backup path is an active path by itself,
   carrying its own share of traffic while protecting the other active
   path.

   For Figure 1, the following roles are assumed for the traffic going
   from CE1 to CE2 via PW1.

      Primary ingress AC: CE1-PE1

      Primary ingress PE: PE1

      Primary PW: PW1

      Primary egress PE: PE2

      Primary egress AC: PE2-CE2

      Backup ingress AC: CE1-PE3

      Backup ingress PE: PE3

      Backup PW: PW2

      Backup egress PE: PE4

      Backup egress AC: PE4-CE2

   Based on this schema, this document describes egress endpoint
   failures and the fast protection mechanism on a per-active-path and
   per-direction basis.  In this case, an egress AC failure refers to
   the failure of the AC PE2-CE2, and an egress node failure refers to
   the failure of PE2.  The ultimate goal is that when a failure occurs,
   the traffic should be locally repaired, so that it can eventually
   reach CE2 via the backup egress PE (PE4) and the backup egress AC
   (PE4-CE2).

   Subsequent to the local repair, either the current active path should
   heal after the control plane converges on the new topology, or the
   ingress CE should switch traffic from the primary path to the backup
   path, depending on the failure scenario.  In the latter case, the
   ingress CE may perform the path switchover triggered by end-to-end
   OAM (in-band or out-band), PW status notification, CE-PE control
   protocols (e.g.  LACP), etc.  In the active-standby mode, this will
   promote the standby path to the new active path.  In the active-
   active mode, it will make the other active path carry all the traffic
   between the two CEs.  In any case, this phase of restoration falls
   into the control plane repair and global repair category, and hence
   is out of the scope of this document.
   The purpose of the fast protection mechanism in this document is to
   reduce traffic loss before this phase of restoration takes place.

   Note that in Figure 1, if the traffic in the reverse direction (i.e.
   from CE2 to CE1) traverses the AC CE2-PE2 and PE2 as active path, the
   failure of PE2 and the failure of the AC PE2-CE2 will be considered
   as ingress failures of the traffic.  If CE2 can detect the failures,
   it may protect the traffic by switching it to the backup path via the
   AC CE2-PE4 and PE4.  However, this is categorized as ingress endpoint
   failure protection, and hence is not handled by the mechanism
   described in this document.

   Figure 2 shows another possible scenario, where CE1 is single-homed
   to PE1, while CE2 remains multi-homed to PE2 and PE4.  From the
   perspective of egress endpoint protection for the traffic going from
   CE1 to CE2 over PW1, this scenario is the same as the scenario shown
   in Figure 1.

               |<-------------- PW1 --------------->|

               ------------- P1 ---------------- PE2 -
              /                                       \
             /                                         \
   CE1 -- PE1                                           CE2
             \                                         /
              \                                       /
               ------------- P2 ---------------- PE4 -

               |<-------------- PW2 --------------->|

                               Figure 2

   For clarity, primary egress AC, primary egress PE, backup egress AC,
   and backup egress PE may simply be referred to as primary AC, primary
   PE, backup AC, and backup PE, respectively, when the context of a
   discussion is the egress endpoint.

3.2.  Multi-Segment PW

             |<--------------- PW1 --------------->|
             |<----- SEG1 ----->|<----- SEG2 ----->|

        - TPE1 -------------- SPE1 --------------- TPE2 -
       /                                                 \
      /                                                   \
   CE1                                                     CE2
      \                                                   /
       \                                                 /
        - TPE3 -------------- SPE2 --------------- TPE4 -

             |<----- SEG3 ----->|<----- SEG4 ----->|
             |<--------------- PW2 --------------->|

                               Figure 3

   Figure 3 shows a topology that is similar to Figure 1 but in an MS-PW
   environment.  PW1 and PW2 are both MS-PWs.
   PW1 is established between TPE1 and TPE2, and switched between
   segments SEG1 and SEG2 at SPE1.  PW2 is established between TPE3 and
   TPE4, and switched between segments SEG3 and SEG4 at SPE2.  CE1 is
   multi-homed to TPE1 and TPE3.  CE2 is multi-homed to TPE2 and TPE4.
   The transport tunnels of the PW segments are not shown in this figure
   for clarity.

   In this document, the following primary and backup roles are assigned
   for the traffic going from CE1 to CE2:

      Primary ingress AC: CE1-TPE1

      Primary ingress T-PE: TPE1

      Primary PW: PW1

      Primary S-PE: SPE1

      Primary egress T-PE: TPE2

      Primary egress AC: TPE2-CE2

      Backup ingress AC: CE1-TPE3

      Backup ingress T-PE: TPE3

      Backup PW: PW2

      Backup S-PE: SPE2

      Backup egress T-PE: TPE4

      Backup egress AC: TPE4-CE2

   In this case, an egress AC failure refers to the failure of the AC
   TPE2-CE2.  An egress node failure refers to the failure of TPE2.  An
   S-PE failure refers to the failure of SPE1.

   For consistency with the SS-PW scenario, primary T-PEs and primary
   S-PEs may simply be referred to as primary PEs in this document,
   where specifics are not required.  Similarly, backup T-PEs and backup
   S-PEs may be referred to as backup PEs.

4.  Theory of Operation

   The fast protection mechanism in this document provides three types
   of protection for PWs, corresponding to the three types of failures
   described in Section 1.

   a.  Egress AC protection

   b.  Egress (T-)PE node protection

   c.  S-PE node protection

4.1.  Applicability

   The mechanism is applicable to LDP signaled PWs in an environment
   where an egress CE is multi-homed to a primary PE and a backup PE and
   there exists a backup PW, as described in Section 3.  The procedure
   for S-PE node protection is applicable when there exists a backup
   S-PE on the backup PW.

   The mechanism assumes IP/MPLS transport tunnels.
   In a network where transport tunnels may provide ECMP to primary PEs,
   care should be taken to prevent misordered packet delivery during
   local repair.  Imagine a scenario where the transport tunnel of a PW
   traverses a router with an ECMP set to a primary PE, and the ECMP set
   includes a direct link to the primary PE.  Normally the router will
   attempt to forward PW packets in a load-balancing fashion over the
   ECMP set.  When the link fails, if the router reroutes only the
   portion of traffic originally traversing the link while letting the
   rest of traffic remain on the other ECMP branches, it will create a
   situation where the egress CE receives traffic from both the primary
   PE and the backup PE.  This is considered undesirable if the PW or
   some flows within the PW are sensitive to packet misordering.
   Therefore, it is RECOMMENDED that the router treat the link failure
   as a node failure of the primary PE, and reroute the entire traffic
   of the ECMP set.  The goal is to ensure that a PW or flow traverses a
   single path in steady state and is rerouted over a single path during
   local repair.

   It is also RECOMMENDED that the mechanism be used in conjunction with
   global repair and control plane repair, in such a manner that the
   mechanism temporarily repairs a failed path by using a bypass tunnel,
   and global repair and control plane repair eventually move traffic to
   a fully functional alternative path.

4.2.  Local Repair and Protector

   The fast protection ability of the mechanism comes from local repair
   performed by routers upstream adjacent to failures.  Each of these
   routers is referred to as a "point of local repair" (PLR).  A PLR
   MUST be able to detect a failure by using a rapid mechanism, such as
   physical layer failure detection, Bidirectional Forwarding Detection
   (BFD) [RFC5880], etc.
   In anticipation of the failure, the PLR MUST also pre-establish a
   bypass tunnel to a "protector", and pre-install a bypass route in the
   data plane.  The bypass tunnel MUST have the property that it will
   not be affected by the topology changes due to the failure.
   Specifically, it MUST NOT traverse the primary PE or the penultimate
   link of the protected transport tunnel, or share any shared risk link
   group (SRLG) with the penultimate link.  Upon detecting the failure,
   the PLR invokes the bypass route in the data plane, and reroutes PW
   traffic to the protector through the bypass tunnel.  The protector in
   turn sends the traffic to the target CE.  This procedure is referred
   to as local repair.

   Different routers may serve as PLR and protector in different
   scenarios.

   o  In egress AC protection, the PLR is the primary PE, and the
      protector is the backup PE (Figure 4).

            |<-------------- PW1 --------------->|

        - PE1 -------------- P1 ---------------- PE2 -
       /                                         PLR  \
      /                                           |    \
   CE1                                      bypass|     CE2
      \                                           |    /
       \                                          |   /
        - PE3 -------------- P2 ---------------- PE4 -
                                              protector

            |<-------------- PW2 --------------->|

                               Figure 4

   o  In egress PE node protection, the PLR is the penultimate hop
      router of the transport tunnel of the primary PW, and the
      protector is the backup PE (Figure 5).

            |<-------------- PW1 --------------->|

        - PE1 -------------- P1 ------- P3 ----- PE2 -
       /                               PLR \          \
      /                                     \          \
   CE1                                 bypass\          CE2
      \                                       \        /
       \                                       \      /
        - PE3 -------------- P2 ---------------- PE4 -
                                              protector

            |<-------------- PW2 --------------->|

                               Figure 5

   o  In S-PE node protection, the PLR is the penultimate hop router of
      the transport tunnel of the primary PW segment, and the protector
      is the backup S-PE (Figure 6).

             |<--------------- PW1 --------------->|
             |<----- SEG1 ----->|<----- SEG2 ----->|

        - TPE1 ----- P1 ----- SPE1 -------------- TPE2 -
       /            PLR \                                \
      /                  \                                \
   CE1              bypass\                                CE2
      \                    \                              /
       \                    \                            /
        - TPE3 --------------- SPE2 -------------- TPE4 -
                            protector

             |<----- SEG3 ----->|<----- SEG4 ----->|
             |<--------------- PW2 --------------->|

                               Figure 6

   In egress AC protection, a PLR realizes its role based on
   configuration of a "context identifier" introduced in this document
   (Section 4.3).  The PLR establishes a bypass tunnel to the protector
   in the same fashion as a normal PSN tunnel.

   In egress PE and S-PE node protection, a PLR is a transit router on
   the transport tunnel, and it normally does not have knowledge of the
   PW(s) carried by the transport tunnel.  In this document, the PLR
   simply computes and establishes a node protection bypass tunnel in
   the same fashion as the normal IP/MPLS node protection, except that
   with the notion of context identifier, the bypass tunnel will be
   established from the PLR to the protector (Section 4.6).  Conversely,
   when the router is no longer a PLR for egress PE or S-PE node
   protection due to a change in network topology or the transport
   tunnel's path, the router should revert to the role of regular
   transit router, including PLR for normal IP/MPLS link or node
   protection.

   In local repair, a PLR simply switches all the traffic received on
   the transport tunnel to the bypass tunnel.  This requires that the
   protector given by the bypass tunnel MUST be intended for all the PWs
   carried by the transport tunnel.  This is achieved by the ingress PE
   using a context identifier to associate a PW with the specific pair
   of {primary PE, protector} and map the PW to a transport tunnel
   destined for the same {primary PE, protector}.
   The ingress PE MAY map multiple PWs to the transport tunnel, if they
   share the {primary PE, protector} in common.

   In local repair, the PLR keeps the PW label intact in packets.  This
   obviates the need for the PLR to maintain bypass routes on a per-PW
   basis, and allows bypass tunnel sharing between PWs.  On the other
   hand, this imposes a requirement on the protector that it MUST be
   able to forward the packets based on a PW label that is assigned by
   the primary PE, and ensure that the traffic eventually reaches the
   target CE.  From the protector's perspective, this PW label is an
   upstream assigned label [RFC5331].  To achieve this, the protector
   MUST learn the PW label from the primary PE prior to the failure, and
   install proper forwarding state for the PW label in a dedicated label
   space associated with the primary PE.  During local repair, the
   protector MUST perform PW label lookup in this label space.

   The previous examples have shown the scenarios where the protectors
   are backup (T/S-)PEs.  It is also possible that a protector is a
   dedicated router serving this role, separate from the backup (T/
   S-)PE.  During local repair, the PLR still reroutes traffic to the
   protector through a bypass tunnel.  The protector then forwards the
   traffic to the backup (T/S-)PE, which further forwards the traffic to
   the target CE via a backup AC or a backup PW segment.  More detail
   will be described in Section 4.4.

4.3.  Context Identifier

   A protector may protect multiple primary PEs.  The protector MUST
   maintain a separate label space for each primary PE.  Likewise, the
   PWs terminated on a primary PE may be protected by multiple
   protectors, each for a subset of the PWs.  In any case, a given PW
   MUST be associated with one and only one pair of {primary PE,
   protector}.

   This document introduces the notion of "context identifier" to
   facilitate protection establishment.  A context identifier is an
A context identifier is an 567 IPv4/v6 address assigned to each ordered pair of {primary PE, 568 protector}. The address MUST be globally unique, or unique in the 569 address space of the network where the primary PE and the protector 570 reside. 572 4.3.1. Semantics 574 The semantics of a context identifier is twofold. 576 o A context identifier identifies a primary PE and an associated 577 protector. It represents the primary PE as PW destination on a 578 per protector basis. A given primary PE may be protected by 579 multiple protectors, each for a subset of the PWs terminated on 580 the primary PE. A distinct context identifier MUST be assigned to 581 the primary PE and each protector. 583 The ingress PE of a PW learns the context identifier of the PW's 584 {primary PE, protector} from the primary PE via Interface_ID TLV 586 [RFC3471, RFC3472] in the LDP Label Mapping message of the PW. 587 The ingress PE then sets up or resolves a transport tunnel with 588 the context identifier, rather than a private IP address of the 589 primary PE, as destination. This destination not only makes the 590 transport tunnel reach the primary PE, but also conveys the 591 identity of the protector to the PLR, which MUST use the context 592 identifier as destination for the bypass tunnel to the protector. 593 The ingress PE MUST map only the PWs terminated by the exact 594 primary PE and protected by the exact protector to the transport 595 tunnel. 597 o A context identifier indicates the primary PE's label space on the 598 protector. The protector may protect PWs for multiple primary 599 PEs. For each primary PE, it MUST maintain a separate label space 600 to store the PW labels assigned by that primary PE. It associates 601 a PW label with a label space via the context identifier of the 602 {primary PE, protector}, as below. 

      In addition to the normal LDP PW signaling, the primary PE MUST
      have a targeted LDP session with the protector, and advertise PW
      labels to the protector via LDP Label Mapping messages
      (Section 6).  The primary PE MUST attach the context identifier to
      each message.  Upon receiving the message, the protector MUST
      install the advertised PW label in the label space identified by
      the context identifier.

      When a PLR sets up or resolves a bypass tunnel to the protector,
      it MUST use the context identifier rather than a private IP
      address of the protector as destination.  The protector MUST use
      the bypass tunnel, either the MPLS tunnel label or IP tunnel
      destination address, as the pointer to the corresponding label
      space.  The protector MUST forward PW packets received on the
      bypass tunnel based on label lookup in that label space.

4.3.2.  FEC

   In an MPLS network, a context identifier represents a FEC (Forwarding
   Equivalence Class) for transport tunnels and bypass tunnels destined
   for it.  For example, it may be encoded in an LDP Prefix FEC
   Element, or in the "tunnel end point address" of an RSVP Session
   object.  The FEC is associated with unique forwarding state on PLRs
   and the protector, which cannot be shared with other FECs.  Some MPLS
   protocols (e.g.  LDP) support FEC aggregation [RFC3031].  In this
   case, FEC aggregation MUST NOT be applied to a context identifier's
   FEC, and every router MUST assign a unique label to the FEC.

4.3.3.  IGP Advertisement and Path Computation

   Using a context identifier as destination for both transport tunnel
   and bypass tunnel requires coordination between the primary PE and
   the protector in IGP advertisement of the context identifier in the
   routing domain and the TE domain.  The context identifier should be
   advertised in such a way that all the routers on the tunnels MUST be
   able to independently reach the following common view of paths.

   o  The transport tunnel MUST have the primary PE as path endpoint.

   o  The bypass tunnel MUST have the protector as path endpoint.  In
      egress PE and S-PE node protection, the path MUST avoid the
      primary PE.

   There are generally two categories of approaches to achieve the
   above.

   o  The first category does not require an ingress PE or a PLR to have
      knowledge of the PW egress endpoint protection schema.  It does
      not require any IGP extension for context identifier
      advertisement.  A context identifier is advertised by the primary
      PE and the protector as an address reachable via both routers.
      The ingress PE and the PLR can compute paths by using a normal
      method, such as Dijkstra, CSPF (constrained shortest path first),
      LFA [RFC5286] and MRT [RFC7812].  One example is to advertise a
      context identifier as a virtual proxy node connected to the
      primary PE and the protector, with the link between the proxy node
      and the primary PE having a more preferable IGP and TE metric than
      the link between the proxy node and the protector.  The transport
      tunnel will follow the shortest path or a TE path to the primary
      PE, and be terminated by the primary PE.  The PLR will no longer
      view itself as a penultimate hop of the transport tunnel, but
      rather two hops away from the proxy node, via the primary PE.
      Hence, a node protection bypass tunnel will be available via the
      protector to the proxy node, but actually be terminated by the
      protector.

   o  The second category requires a PLR to have knowledge of the PW
      egress endpoint protection schema.  The primary PE advertises the
      context identifier as a regular IP address, while the protector
      advertises it by using an explicit "context identifier" object,
      which MUST be understood by the PLR.  The "context identifier"
      object requires an IGP extension.
In both the routing domain and 676 the TE domain, the context identifier is only reachable via the 677 primary PE. This ensures that the transport tunnel is terminated 678 by the primary PE. The PLR views itself as the penultimate hop of 679 the transport tunnel, and based on the IGP "context identifier" 680 object, it establishes or resolves a bypass tunnel to the 681 advertiser (i.e. the protector), while avoiding the primary PE. 683 The mechanism in this document intends to be flexible on the approach 684 used by a network, as long as it satisfies the above requirements for 685 transport tunnel path and bypass tunnel path. In theory, the network 686 can use one approach for context ID X and another approach for 687 context ID Y. For a given context ID, all relevant routers, 688 including primary PE, protector, and PLR, must support and agree on 689 the chosen approach. The coordination between the routers can be 690 achieved by configuration. 692 4.4. Protection Models 694 There are two protection models based on the location of a protector. 695 A network MAY use either model or both. 697 4.4.1. Co-located Protector 699 In this model, the protector is a backup PE that is directly 700 connected to the target CE via a backup AC, or it is a backup S-PE on 701 a backup PW. That is, the protector is co-located with the backup 702 (S-)PE. Examples of this model have been shown in Figure 4, Figure 5 703 and Figure 6 in Section 4.2. 705 In egress AC protection and egress PE node protection, when a 706 protector receives traffic from the PLR, it forwards the traffic to 707 the CE via the backup AC. This is shown in Figure 7, where PE2 is 708 the PLR for egress AC failure, P3 is the PLR for PE2 failure, and PE4 709 (backup PE) is the protector. 
711 |<-------------- PW1 --------------->| 713 - PE1 -------------- P1 ------- P3 ----- PE2 ---- 714 / PLR \ PLR \ 715 / \ | \ 716 CE1 bypass\ |bypass CE2 717 \ \ | / 718 \ \ | / 719 - PE3 -------------- P2 ---------------- PE4 ---- 720 protector 722 |<-------------- PW2 --------------->| 724 Figure 7 726 In S-PE node protection, when a protector receives traffic from the 727 PLR, it forwards the traffic over the next segment of the backup PW. 729 The T-PE of the backup PW in turn forwards the traffic to the CE via 730 a backup AC. This is shown in Figure 8, where P1 is the PLR for SPE1 731 failure, and SPE2 (backup S-PE) is the protector for SPE1. SPE2 732 receives traffic from P1, swaps SEG1's label to SEG4's label, and 733 forwards the traffic over a transport tunnel to TPE4. 735 |<--------------- PW1 --------------->| 736 |<----- SEG1 ----->|<----- SEG2 ----->| 738 - TPE1 ----- P1 ----- SPE1 -------------- TPE2 - 739 / PLR \ \ 740 / \ \ 741 CE1 bypass\ CE2 742 \ \ / 743 \ \ / 744 - TPE3 --------------- SPE2 -------------- TPE4 - 745 protector 747 |<----- SEG3 ----->|<----- SEG4 ----->| 748 |<--------------- PW2 --------------->| 750 Figure 8 752 In the co-located protector model, the number of context identifiers 753 needed by a network is the number of distinct {primary PE, backup PE} 754 pairs. From the perspective of scalability, the model is suitable 755 for networks where the number of primary PEs and the average number 756 of backup PEs per primary PE are both relatively low. 758 4.4.2. Centralized Protector 760 In this model, the protector is a dedicated P router or PE router 761 that serves the role. In egress AC protection and egress PE node 762 protection, the protector may or may not be a backup PE directly 763 connected to the target CE. In S-PE node protection, the protector 764 may or may not be a backup S-PE on the backup PW. 
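   For comparison with the co-located model of Section 4.4.1, the
   context identifier count stated there (one per distinct {primary PE,
   backup PE} pair) can be sketched as below. This is only an
   illustrative sketch. The router names PE6, PE8 and PR1, the helper
   function names, and the assumption that each primary PE uses exactly
   one centralized protector are hypothetical and not part of this
   specification.

```python
from typing import Dict, Iterable, Set, Tuple

# A PW is described here as (primary PE, backup PE). Names are illustrative.
PW = Tuple[str, str]


def colocated_context_ids(pws: Iterable[PW]) -> Set[Tuple[str, str]]:
    """Co-located model: one context ID per distinct {primary PE, backup PE}."""
    return set(pws)


def centralized_context_ids(
    pws: Iterable[PW], protector_of: Dict[str, str]
) -> Set[Tuple[str, str]]:
    """Centralized model: one context ID per distinct {primary PE, protector}.

    Assumes (hypothetically) a single protector per primary PE.
    """
    return {(primary, protector_of[primary]) for primary, _backup in pws}


# Four PWs on two primary PEs (PE2, PE8) with two backup PEs (PE4, PE6).
pws = [("PE2", "PE4"), ("PE2", "PE4"), ("PE2", "PE6"), ("PE8", "PE4")]
protector_of = {"PE2": "PR1", "PE8": "PR1"}  # hypothetical shared protector

colocated = colocated_context_ids(pws)
centralized = centralized_context_ids(pws, protector_of)
print(len(colocated))    # 3
print(len(centralized))  # 2
```

   In this toy network the co-located model needs three context
   identifiers, while a shared centralized protector needs one per
   primary PE.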
766 In egress AC protection and egress PE node protection, if the 767 protector is not directly connected to the CE, it forwards the 768 traffic to a backup PE, which in turn forwards the traffic to the CE 769 via a backup AC. This is shown in Figure 9, where the protector 770 receives traffic from P3 (PLR for egress PE failure) or PE2 (PLR for 771 egress AC failure), swaps PW1's label to PW2's label, and forwards 772 the traffic via a transport tunnel to PE4 (backup PE). The protector 773 may be protecting other PWs and other primary PEs as well, which is 774 not shown in this figure for clarity. 776 |<------------- PW1 --------------->| 778 - PE1 ------------- P1 ------- P3 ----- PE2 -- 779 / PLR \ PLR \ 780 / \ / \ 781 / bypass\ /bypass \ 782 / \ / \ 783 CE1 protector CE2 784 \ \ / 785 \ transport\ / 786 \ tunnel \ / 787 \ \ / 788 - PE3 ------------- P2 -----------------PE4 -- 790 |<------------- PW2 --------------->| 792 Figure 9 794 In S-PE node protection, if the protector is not a backup S-PE, it 795 forwards the traffic to the backup S-PE, which in turn forwards the 796 traffic over the next segment of the backup PW. Finally, the T-PE of 797 the backup PW forwards the traffic to the CE via the backup AC. This 798 is shown in Figure 10, where the protector receives traffic from P1 799 (PLR), swaps SEG1's label to SEG3's label, and forwards the traffic 800 via a transport tunnel to SPE2 (backup S-PE). SPE2 in turn performs 801 MS-PW switching from SEG3's label to SEG4's label, and forwards the 802 traffic over a transport tunnel to TPE4 (backup T-PE). The protector 803 may be protecting other PW segments and other primary S-PEs as well, 804 which is not shown in this figure for clarity. 
806 |<--------------- PW1 --------------->| 807 |<----- SEG1 ----->|<----- SEG2 ----->| 809 - TPE1 ----- P1 ----- SPE1 -------------- TPE2 - 810 / PLR \ \ 811 / \ \ 812 / bypass\ \ 813 / \ \ 814 CE1 protector CE2 815 \ \ / 816 \ transport\ / 817 \ tunnel \ / 818 \ \ / 819 - TPE3 --------------- SPE2 -------------- TPE4 - 821 |<----- SEG3 ----->|<----- SEG4 ----->| 822 |<--------------- PW2 --------------->| 824 Figure 10 826 The centralized protector model allows multiple primary PEs to share 827 one protector. Each primary PE may need only one protector. 828 Therefore, the number of context identifiers needed by a network may 829 be bound to the number of primary PEs. 831 4.5. Transport Tunnel 833 A PW is associated with a pair of {primary PE, protector}, which is 834 represented by a unique context identifier. The ingress PE of the PW 835 sets up or resolves a transport tunnel by using the context 836 identifier rather than a private IP address of the primary PE as 837 destination. This not only ensures that the PW is transported to the 838 primary PE, but also facilitates bypass tunnel establishment at PLR, 839 because the context identifier contains the identity of the protector 840 as well. This is also the case for a multi-segment PW, where the 841 ingress PE and egress PE are T/S-PEs. 843 An ingress PE learns the association between a PW and a context 844 identifier from the primary PE, which MUST advertise the context 845 identifier as a "third party next hop" via the IPv4/v6 Interface_ID 846 TLV [RFC3471, RFC3472] in the LDP Label Mapping message of the PW. 848 In an ECMP scenario, a transport tunnel may have multiple penultimate 849 hop routers. Each of them SHOULD act as a PLR independently. Also 850 in an ECMP scenario, a penultimate hop router of a transport tunnel 851 may have an ECMP set to the primary PE, and forward PW traffic in a 852 load balance fashion. 
   At least one ECMP branch must be a direct link
   to the primary PE, qualifying the router as penultimate hop. The
   other ECMP branches may be direct links or indirect paths to the
   primary PE. In egress PE node protection and S-PE node protection,
   when a node failure is detected on any ECMP branch, the penultimate
   hop router SHOULD act as a PLR to reroute all the traffic of the ECMP
   set to the protector.

4.6. Bypass Tunnel

   A PLR may protect multiple PWs associated with one or multiple pairs
   of {primary PE, protector}. The PLR MUST establish a bypass tunnel to
   each protector for each context identifier associated with that
   protector. The destination of the bypass tunnel MUST be the context
   identifier (Section 4.3.1). Since the PLR is a transit router of the
   transport tunnel, it SHOULD derive the context identifier from the
   destination of the transport tunnel.

   For example, in Figure 7 and Figure 9, a bypass tunnel is
   established from PE2 (PLR for egress AC failure) to the protector,
   and another bypass tunnel is established from P3 (PLR for egress node
   failure) to the protector. In Figure 8 and Figure 10, a bypass
   tunnel is established from P1 (PLR for S-PE failure) to the
   protector.

   In local repair, a PLR reroutes traffic to the protector through a
   bypass tunnel, with the PW label intact in the packets. This normally
   involves pushing a label onto the label stack, if the bypass tunnel
   is an MPLS tunnel, or pushing an IP header onto the packets, if the
   bypass tunnel is an IP tunnel. Upon receipt of the packets, the
   protector forwards them based on the PW label. Specifically, the
   protector uses the bypass tunnel as a context to determine the
   primary PE's label space. If the bypass tunnel is an MPLS tunnel, the
   protector should have assigned a non-reserved label to the bypass
   tunnel, and hence this label can serve as the context.
   This label is also called
   a "context label", as it is actually bound to the context identifier.
   If the bypass tunnel is an IP tunnel, the context identifier should
   be the destination address of the IP header.

   To be useful for local repair, a bypass tunnel MUST have the property
   that it is not affected by any topology changes caused by the
   failure. It MUST NOT traverse the primary PE or the penultimate link
   of the transport tunnel, or share any SRLG with the penultimate link.
   It should remain effective during local repair, until the traffic is
   moved to an alternative path, i.e. either the same PW over a fully
   functional transport tunnel, or another fully functional PW.

   A bypass tunnel SHOULD NOT need to be further protected against a
   transit link failure, transit node failure, or egress node failure.

4.7. Examples of Forwarding State

   This section provides some detailed examples of forwarding state on
   the PLR, the protector, and other relevant routers.

   A protector learns PW labels from all the primary PEs that it
   protects (Section 6.2), and maintains the PW labels in separate label
   spaces on a per primary PE basis. In the control plane, each label
   space is identified by the context identifier of the corresponding
   {primary PE, protector}. In the forwarding plane, it is indicated by
   the bypass tunnel(s) destined for the context identifier.

4.7.1. Co-located Protector Model

   In Figure 11, PE4 is a co-located protector that protects PW1 against
   egress AC failure and egress node failure. It maintains a label
   space for PE2, which is identified by the context identifier of {PE2,
   PE4}. It learns PW1's label from PE2, and installs a forwarding
   entry for the label in that label space. The nexthop of the
   forwarding entry indicates a label pop with outgoing interface
   pointing to the backup AC PE4-CE2.
924 |<-------------- PW1 --------------->| 926 - PE1 -------------- P1 ------- P3 ----- PE2 ------ 927 / PLR \ PLR \ 928 / \ | \ 929 / \ | \ 930 CE1 bypass P4 P5 bypass CE2 931 \ \ | / 932 \ \ | / 933 \ \ | / 934 - PE3 -------------- P2 ---------------- PE4 ------ 935 protector 937 |<-------------- PW2 --------------->| 939 PW1's label assigned by PE2: 100 940 PW2's label assigned by PE4: 200 941 On P3: 942 Incoming label of transport tunnel to PE2: 1000 943 Outgoing label of transport tunnel to PE2: implicit null 944 Outgoing label of bypass tunnel to PE4: 2000 945 On PE2: 946 Outgoing label of bypass tunnel to PE4: 3000 947 On PE4: 948 Context label (incoming label of bypass tunnels): 999 950 Forwarding state on P3: 951 label 1000 -- primary nexthop: pop, to PE2 952 backup nexthop: swap 2000, to P4 954 Forwarding state on PE2: 955 label 100 -- primary nexthop: pop, to CE2 956 backup nexthop: push 3000, to P5 958 Forwarding state on PE4: 959 label 200 -- nexthop: pop, to CE2 960 label 999 -- nexthop: label table of PE2's label space 962 Label table of PE2's label space on PE4: 963 label 100 -- nexthop: pop, to CE2 965 Figure 11 967 In Figure 12, SPE2 is a co-located protector that protects PW1 968 against S-PE failure. It maintains a label space for SPE1, which is 969 identified by the context identifier of {SPE1, SPE2}. It learns 970 SEG1's label from SPE1, and installs a forwarding entry in the label 971 space. The nexthop of the forwarding entry indicates a label swap to 972 SEG4's label and a label push with the label of a transport tunnel to 973 TPE4. 
975 |<--------------- PW1 --------------->| 976 |<----- SEG1 ----->|<----- SEG2 ----->| 978 - TPE1 ----- P1 ----- SPE1 --- P3 ------- TPE2 - 979 / PLR \ \ 980 / \ \ 981 CE1 bypass P2 CE2 982 \ \ / 983 \ \ / 984 - TPE3 --------------- SPE2 --- P4 ------- TPE4 - 985 protector 987 |<----- SEG3 ----->|<----- SEG4 ----->| 988 |<--------------- PW2 --------------->| 990 SEG1's label assigned by SPE1: 100 991 SEG2's label assigned by TPE2: 200 992 SEG3's label assigned by SPE2: 300 993 SEG4's label assigned by TPE4: 400 994 On P1: 995 Incoming label of transport tunnel to SPE1: 1000 996 Outgoing label of transport tunnel to SPE1: implicit null 997 Outgoing label of bypass tunnel to SPE2: 2000 998 On SPE1: 999 Outgoing label of transport tunnel to TPE2: 3000 1000 On SPE2: 1001 Outgoing label of transport tunnel to TPE4: 4000 1002 Context label (incoming label of bypass tunnel): 999 1004 Forwarding state on P1: 1005 label 1000 -- primary nexthop: pop, to SPE1 1006 backup nexthop: swap 2000, to P2 1008 Forwarding state on SPE1: 1009 label 100 -- nexthop: swap 200, push 3000, to P3 1011 Forwarding state on SPE2: 1012 label 300 -- nexthop: swap 400, push 4000, to P4 1013 label 999 -- nexthop: label table of SPE1's label space 1015 Label table of SPE1's label space on SPE2: 1016 label 100 -- nexthop: swap 400, push 4000, to P4 1018 Figure 12 1020 4.7.2. Centralized Protector Model 1022 In the centralized protector model, for each primary PW of which the 1023 protector is not a backup (S-)PE, the protector MUST also learn the 1024 label of the backup PW from the backup (S-)PE (Section 6.3). This is 1025 the backup (S-)PE that the protector will forward traffic to. The 1026 protector MUST install a forwarding entry with a label swap from the 1027 primary PW's label to the backup PW's label and a label push with the 1028 label of a transport tunnel to the backup (S-)PE. 
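   The context-specific label lookup performed by a protector can be
   sketched as follows. This is a minimal illustration only, not a
   definitive implementation. The class names, field names and the
   list-based label stack (top label first) are assumptions made for
   this example. It also assumes the bypass tunnel is an MPLS tunnel
   whose context label arrives at the top of the stack. The label values
   are those of SPE2 in Figure 12.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class NextHop:
    """Hypothetical forwarding action installed for one PW label."""
    swap_to: Optional[int]  # new PW label, or None to pop the PW label
    push: Optional[int]     # transport tunnel label to push, or None
    out: str                # next-hop router or backup AC


@dataclass
class Protector:
    # One label table per protected primary PE, keyed by context label.
    label_spaces: Dict[int, Dict[int, NextHop]] = field(default_factory=dict)

    def install(self, context_label: int, pw_label: int, nh: NextHop) -> None:
        """Install a PW label in the label space of the given context label."""
        self.label_spaces.setdefault(context_label, {})[pw_label] = nh

    def forward(self, stack: List[int]) -> Optional[Tuple[List[int], str]]:
        """Forward a packet received on a bypass tunnel.

        stack[0] is the context label and stack[1] is the PW label.
        Returns (outgoing label stack, next hop), or None to drop.
        """
        if len(stack) < 2:
            return None
        context_label, pw_label, rest = stack[0], stack[1], stack[2:]
        table = self.label_spaces.get(context_label)
        if table is None or pw_label not in table:
            return None  # unknown context label or PW label
        nh = table[pw_label]
        out_stack = rest if nh.swap_to is None else [nh.swap_to] + rest
        if nh.push is not None:
            out_stack = [nh.push] + out_stack
        return out_stack, nh.out


# Figure 12: context label 999 selects SPE1's label space on SPE2, where
# SEG1's label 100 maps to "swap 400, push 4000, to P4".
spe2 = Protector()
spe2.install(999, 100, NextHop(swap_to=400, push=4000, out="P4"))
result = spe2.forward([999, 100])
print(result)  # ([4000, 400], 'P4')
```

   For a centralized protector, the same structure applies, with the
   swap target being the backup PW's label learned from the backup
   (S-)PE and the pushed label being that of the transport tunnel to the
   backup (S-)PE.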
1030 In Figure 13, the protector is a centralized protector that protects 1031 PW1 against egress AC failure and egress node failure. It maintains 1032 a label space for PE2, which is identified by the context identifier 1033 of {PE2, protector}. It learns PW1's label from PE2, and PW2's label 1034 from PE4. It installs a forwarding entry for PW1's label in the 1035 label space. The nexthop of the forwarding entry indicates a label 1036 swap to PW2's label and a label push with the label of a transport 1037 tunnel to PE4. 1039 |<-------------- PW1 --------------->| 1041 - PE1 ------------- P1 ------- P3 ------ PE2 ---- 1042 / PLR \ PLR \ 1043 / \ / \ 1044 / bypass P5 P6 bypass \ 1045 / \ / \ 1046 / \/ \ 1047 CE1 protector CE2 1048 \ \ / 1049 \ transport \ / 1050 \ tunnel P7 / 1051 \ \ / 1052 \ \ / 1053 - PE3 ------------- P2 ----------------- PE4 ---- 1055 |<-------------- PW2 --------------->| 1057 PW1's label assigned by PE2: 100 1058 PW2's label assigned by PE4: 200 1059 On P3: 1060 Incoming label of transport tunnel to PE2: 1000 1061 Outgoing label of transport tunnel to PE2: implicit null 1062 Outgoing label of bypass tunnel to protector: 2000 1063 On PE2: 1064 Outgoing label of bypass tunnel to protector: 3000 1065 On protector: 1066 Context label (incoming label of bypass tunnels): 999 1067 Outgoing label of transport tunnel to PE4: 4000 1069 Forwarding state on P3: 1070 label 1000 -- primary nexthop: pop, to PE2 1071 backup nexthop: swap 2000, to P5 1073 Forwarding state on PE2: 1074 label 100 -- primary nexthop: pop, to CE2 1075 backup nexthop: push 3000, to P6 1077 Forwarding state on PE4: 1078 label 200 -- nexthop: pop, to CE2 1080 Forwarding state on protector: 1081 label 999 -- nexthop: label table of PE2's label space 1083 Label table of PE2's label space on protector: 1084 label 100 -- nexthop: swap 200, push 4000, to P7 1086 Figure 13 1088 In Figure 14, the protector is a centralized protector that protects 1089 the PW segment SEG1 of PW1 against 
   the node failure of SPE1. It
   maintains a label space for SPE1, which is identified by the context
   identifier of {SPE1, protector}. It learns SEG1's label from SPE1,
   and learns SEG3's label from SPE2. It installs a forwarding entry
   for SEG1's label in the label space. The nexthop of the forwarding
   entry indicates a label swap to SEG3's label and a label push with
   the label of a transport tunnel to SPE2.

                |<--------------- PW1 --------------->|
                |<----- SEG1 ----->|<----- SEG2 ----->|

           - TPE1 ----- P1 ----- SPE1 --- P2 -------- TPE2 -
          /             PLR \                               \
         /                   \                               \
        /              bypass P4                              \
       /                        \                              \
      /                          \                              \
   CE1                        protector                          CE2
      \                         \                               /
       \                         \                             /
        \               transport P5                          /
         \               tunnel     \                        /
          \                          \                      /
           - TPE3 -------------- SPE2 --- P3 -------- TPE4 -

                |<----- SEG3 ----->|<----- SEG4 ----->|
                |<--------------- PW2 --------------->|

   SEG1's label assigned by SPE1: 100
   SEG2's label assigned by TPE2: 200
   SEG3's label assigned by SPE2: 300
   SEG4's label assigned by TPE4: 400
   On P1:
     Incoming label of transport tunnel to SPE1: 1000
     Outgoing label of transport tunnel to SPE1: implicit null
     Outgoing label of bypass tunnel to protector: 2000
   On SPE1:
     Outgoing label of transport tunnel to TPE2: 3000
   On SPE2:
     Outgoing label of transport tunnel to TPE4: 4000
   On protector:
     Context label (incoming label of bypass tunnel): 999
     Outgoing label of transport tunnel to SPE2: 5000

   Forwarding state on P1:
     label 1000 -- primary nexthop: pop, to SPE1
                   backup nexthop: swap 2000, to P4

   Forwarding state on SPE1:
     label 100 -- nexthop: swap 200, push 3000, to P2

   Forwarding state on SPE2:
     label 300 -- nexthop: swap 400, push 4000, to P3

   Forwarding state on protector:
     label 999 -- nexthop: label table of SPE1's label space

   Label table of SPE1's label space on protector:
     label 100 -- nexthop: swap 300, push 5000, to P5
                               Figure 14

5. Revertive Behavior

   Subsequent to local repair, there are three strategies for a network
   to restore traffic to a fully functional alternative path.

   o Global revertive mode

      If the ingress CE is multi-homed (Figure 1), it MAY switch the
      traffic to the backup AC which is bound to the backup PW.
      Alternatively, if the ingress PE hosts a backup PW (Figure 2), the
      ingress PE MAY switch the traffic to the backup PW. These
      procedures are referred to as global repair. Possible triggers of
      global repair include PW status notification, VCCV, BFD,
      end-to-end OAM between CEs, etc.

   o Control plane revertive mode

      In egress PE node protection and S-PE node protection, it is
      possible that the failure is limited to the link between the PLR
      and the primary PE, whereas the primary PE is still operational.
      In this case, the PLR or an upstream router on the transport
      tunnel MAY reroute the tunnel around the link via an alternative
      path to the primary PE. Thus, the transport tunnel can heal and
      continue to carry the PW to the primary PE. This procedure is
      driven by control plane convergence on the new topology, and is
      referred to as control plane repair.

   o Local revertive mode

      The PLR MAY move traffic back to the primary PW, after the failure
      is resolved. In egress AC protection, upon detecting that the
      primary AC is restored, the PLR MAY start forwarding traffic over
      the AC again. Likewise, in egress PE node protection and S-PE
      node protection, upon detecting that the primary PE is restored,
      the PLR MAY re-establish the transport tunnel to the primary PE,
      and move the traffic from the bypass tunnel back to the transport
      tunnel. These procedures are referred to as local reversion.

   It is RECOMMENDED that the fast protection mechanism be used
   in conjunction with the global revertive mode.
   Particularly in the
   case of egress PE and S-PE node failures, if the ingress PE or the
   protector loses communication with the (S-)PE for an extended period
   of time, the LDP session may go down. Consequently, the ingress PE
   may bring down the primary PW completely, or the protector may remove
   the forwarding entry of the primary PW label. In either case, the
   service will be disrupted. In other words, although the mechanism
   can temporarily repair traffic, control plane state may eventually
   expire if the failure persists. Therefore, the global revertive mode
   SHOULD take place in a timely manner to move traffic to a fully
   functional alternative path.

   The control plane revertive mode may automatically happen as part of
   the convergence of control plane protocols. However, it is only
   applicable to the specific link failure scenario described above.

   The local revertive mode is optional. In the circumstances where the
   failure is caused by resource flapping, local reversion MAY be
   dampened to limit potential disruption. Local revertive mode MAY be
   disabled completely by configuration.

6. LDP Extensions

   As described in previous sections, a targeted LDP session MUST be
   established between each pair of primary PE and protector. The
   primary PE sends Label Mapping messages over this session to
   advertise primary PW labels to the protector. In the centralized
   protector model, a targeted LDP session MUST also be established
   between a backup (S-)PE and the protector. The backup PE sends Label
   Mapping messages over this session to advertise backup PW labels to
   the protector.

   To facilitate the procedures, this document defines a new "Protection
   FEC Element" TLV. The Label Mapping messages of both the LDP
   sessions above MUST carry this TLV to identify a primary PW.
1225 Specifically, in the centralized protector model, the Protection FEC 1226 Element TLV advertised by a backup (S-)PE MUST match the one 1227 advertised by the primary PE, so that the protector can associate the 1228 primary PW's label with the backup PW's label, and perform a label 1229 swap. The backup (S-)PE builds such a Protection FEC Element TLV 1230 based on local configuration. 1232 This document also defines a new "Egress Protection Capability" TLV 1233 as a new type of Capability Parameter TLV [RFC5561], to allow a 1234 protector to announce its capability of processing the above 1235 Protection FEC Element TLV and performing context specific label 1236 switching for PW labels. 1238 The procedures in this section are only applicable, if the protector 1239 advertises the Egress Protection Capability TLV, the primary PE 1240 supports the advertisement of the Protection FEC Element TLV, and in 1241 the centralized protector model, the backup PE also supports the 1242 advertisement of the Protection FEC Element TLV. 1244 6.1. Egress Protection Capability TLV 1246 A protector MUST advertise the Egress Protection Capability TLV in 1247 its Initialization message and Capability message, over the LDP 1248 session with a primary PE. In the centralized protector model, the 1249 protector MUST also advertise the TLV over the LDP session with a 1250 backup PE. The TLV carries one or multiple context identifiers. To 1251 the primary PE, the TLV MUST carry the context identifier of the 1252 {primary PE, protector}. In the centralized protector model, the TLV 1253 MUST carry to the backup PE multiple context identifiers, one for 1254 each {primary PE, protector} where the backup PE serves as a backup 1255 for the primary PE. This TLV MUST NOT be advertised by the primary 1256 PE or the backup PE to the protector. 1258 The processing of the Egress Protection Capability TLV by a receiving 1259 router MUST follow the procedures defined in [RFC5561]. 
In 1260 particular, the router MUST advertise PW information to the protector 1261 by using the Protection FEC Element TLV, only after it has received 1262 the Egress Protection Capability TLV from the protector. It MUST 1263 validate each context identifier included in the TLV, and advertise 1264 the information of only the PWs that are associated with the context 1265 identifier. It MUST withdraw previously advertised Protection FEC 1266 TLVs, when the protector has withdrawn a previously advertised 1267 context identifier or the entire Egress Protection Capability TLV via 1268 Capability message. 1270 The encoding of the Egress Protection Capability TLV is defined as 1271 below. It conforms to the format of Capability Parameter TLV 1272 specified in [RFC5561]. 1274 0 1 2 3 1275 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1276 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1277 |U|F| Egress Protection (TBD) | Length | 1278 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1279 |S| Reserved | | 1280 +-+-+-+-+-+-+-+-+ | 1281 | | 1282 ~ Capability Data = context identifier(s) ~ 1283 | | 1284 | +-+-+-+-+-+-+-+-+ 1285 | | 1286 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1288 Figure 15 1290 The U-bit MUST be set to 1 so that a receiver MUST silently ignore 1291 this TLV if unknown to it, and continue processing the rest of the 1292 message. 1294 The F-bit MUST be set to 0 since this TLV is sent only in 1295 Initialization and Capability messages, which are not forwarded. 1297 The TLV Code Point is TBD. It needs to be assigned by IANA. 1299 The S-bit indicates whether the sender is advertising (S=1) or 1300 withdrawing (S=0) the capability. 1302 The "Capability Data" is encoded with the context identifier of the 1303 {primary PE, protector}. 1305 6.2. PW Label Distribution from Primary PE to Protector 1307 A primary PE MUST advertise a primary PW's label to a protector by 1308 sending a Label Mapping message. 
The message includes a Protection 1309 FEC Element TLV (see Section 6.4 for encoding), and an Upstream- 1310 Assigned Label TLV [RFC6389] encoded with the PW's label. The 1311 combination of the Protection FEC Element TLV and the PW label 1312 represents the primary PE's forwarding state for the PW. The Label 1313 Mapping message MUST also carry an IPv4/v6 Interface_ID TLV [RFC6389, 1314 RFC3471] encoded with the context identifier of the {primary PE, 1315 protector}. 1317 The protector that receives this Label Mapping message MUST install a 1318 forwarding entry for the PW label in the label space identified by 1319 the context identifier. The nexthop of the forwarding entry MUST 1320 ensure packets to be sent towards the target CE via a backup AC or a 1321 backup (S-)PE, depending on the protection scenario. The protector 1322 MUST silently discard a Label Mapping message if the included context 1323 identifier is unknown to it. 1325 6.3. PW Label Distribution from Backup PE to Protector 1327 In the centralized protector model, a backup PE MUST advertise a 1328 backup PW's label to the protector by sending a Label Mapping 1329 message. The message includes a Protection FEC Element TLV and a 1330 Generic Label TLV encoded with the backup PW's label. This 1331 Protection FEC Element MUST be identical to the Protection FEC 1332 Element TLV that the primary PE advertises to the protector 1333 (Section 6.2). This is achieved through configuration on the backup 1334 PE. The context identifier MUST NOT be encoded in Interface_ID TLV 1335 in this message. 1337 The protector that receives this Label Mapping message MUST associate 1338 the backup PW with the primary PW, based on the common Protection FEC 1339 Element TLV. It MUST distinguish between the Label Mapping message 1340 from the primary PE and the Label Mapping message from the backup PE 1341 based on the respective presence and absence of context identifier in 1342 Interface_ID TLV. 
It MUST install a forwarding entry for the primary 1343 PW's label in the label space identified by the context identifier. 1344 The nexthop of the forwarding entry MUST indicate a label swap to the 1345 backup PW's label, followed by a label push or IP header push for a 1346 transport tunnel to the backup PE. 1348 6.4. Protection FEC Element TLV 1350 The Protection FEC Element TLV has type 0x83. Its format is defined 1351 as below: 1353 0 1 2 3 1354 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1355 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1356 | Type(0x83) | Reserved | Encoding Type | Length | 1357 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1358 | | 1359 | | 1360 ~ PW Information ~ 1361 | | 1362 | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1363 | | 1364 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1366 Figure 16 1368 - Encoding Type 1369 Type of format that PW Information field is encoded. 1371 - Length 1373 Length of PW Information field in octets. 1375 - PW Information 1377 Field of variable length that specifies a PW 1379 For Encoding Type, 1 is defined for the PWid FEC Element format, and 1380 2 is defined for the Generalized PWid FEC Element format [RFC4447]. 1382 6.4.1. 
Encoding Format for PWid 1384 0 1 2 3 1385 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1386 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1387 | Type(0x83) | Reserved | Enc Type(1) | Length(20) | 1388 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1389 | Ingress PE Address | 1390 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1391 | Egress PE Address | 1392 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1393 | Group ID | 1394 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1395 | PW ID | 1396 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1397 |C| PW Type | Reserved | 1398 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1400 Figure 17 1402 - Ingress PE Address 1404 IP address of the ingress PE of PW. 1406 - Egress PE Address 1408 IP address of the egress PE of PW. 1410 - Group ID 1412 An arbitrary 32-bit value that represents a group of PWs and that 1413 is used to create groups in the PW space. 1415 - PW ID 1416 A non-zero 32-bit connection ID that, together with the PW Type 1417 field, identifies a particular PW. 1419 - Control word bit (C) 1421 A bit that flags the presence of a control word on this PW. If C 1422 = 1, control word is present; If C = 0, control word is not 1423 present. 1425 - PW Type 1427 A 15-bit quantity that represents the type of PW. 1429 6.4.2. 
Encoding Format for Generalized PWid 1431 0 1 2 3 1432 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1433 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1434 | Type(0x83) | Reserved | Enc Type(2) | Length | 1435 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1436 | Ingress PE Address | 1437 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1438 | Egress PE Address | 1439 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1440 |C| PW Type | Reserved | 1441 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1442 | AGI Type | Length | Value | 1443 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1444 ~ AGI Value (contd.) ~ 1445 | | 1446 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1447 | AII Type | Length | Value | 1448 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1449 ~ SAII Value (contd.) ~ 1450 | | 1451 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1452 | AII Type | Length | Value | 1453 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1454 ~ TAII Value (contd.) ~ 1455 | | 1456 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1458 Figure 18 1460 - Ingress PE Address 1462 IP address of the ingress PE of PW. 1464 - Egress PE Address 1466 IP address of the egress PE of PW. 1468 - Control word bit (C) 1470 A bit that flags the presence of a control word on this PW. If C 1471 = 1, control word is present; If C = 0, control word is not 1472 present. 1474 - PW Type 1476 A 15-bit quantity that represents the type of PW. 1478 - AGI Type, Length, Value, AGI Value 1480 Attachment Group Identifier of PW. 1482 - SAII Type, Length, Value, SAII Value 1484 Source Attachment Individual Identifier of PW. 1486 - TAII Type, Length, Value, TAII Value 1488 Target Attachment Individual Identifier of PW. 1490 7. 
IANA Considerations

   This document defines a new "Egress Protection Capability" TLV in
   Section 6.  The document requests IANA to assign an LDP TLV code
   point for the TLV.

   This document uses the LDP Protection FEC Type Name Space value
   0x83.  The LDP Protection FEC Type Name Space entry for this type
   value references this document, and IANA is requested to update
   this reference to the RFC number of this document.

   Value  Hex   Name                    Label Advertisement Discipline
   -------------------------------------------------------------------
   131    0x83  Protection FEC Element  DU

8.  Security Considerations

   In this document, PW traffic can be temporarily rerouted to a
   protector.  In the centralized protector scenario, the traffic can
   be further rerouted to a backup PE.  In the control plane, there is
   a targeted LDP session between a primary PE and a protector.  In the
   centralized protector scenario, there is also a targeted LDP session
   between a backup PE and a protector.  In all scenarios, the role of
   the protector is entirely managed by the network operator, and
   backup PEs can in any case be used to host PWs and LDP sessions.
   Hence, the rerouted traffic and the LDP sessions introduced in this
   document should not be viewed as a new security threat.

   In general, [RFC5920] describes the security framework for MPLS
   networks.  [RFC3209] describes the security considerations for RSVP
   LSPs.  [RFC5036] describes the security considerations for the base
   LDP specification.  [RFC5561] describes the security considerations
   that apply when using the LDP capability mechanism.  All of these
   security frameworks and considerations apply to this document as
   well.

9.  Acknowledgements

   This document leverages work done by Hannes Gredler, Yakov Rekhter,
   Minto Jeyananth, Kevin Wang, and several others on MPLS edge
   protection.  Thanks to Nischal Sheth and Bhupesh Kothari for their
   contribution.
   Thanks to John E Drake, Andrew G Malis, Alexander Vainshtein, Stewart
   Bryant, and Mach Chen for valuable comments that helped shape this
   document and improve its clarity.

10.  References

10.1.  Normative References

   [RFC4447]  Martini, L., Ed., Rosen, E., El-Aawar, N., Smith, T., and
              G. Heron, "Pseudowire Setup and Maintenance Using the
              Label Distribution Protocol (LDP)", RFC 4447,
              DOI 10.17487/RFC4447, April 2006.

   [RFC5331]  Aggarwal, R., Rekhter, Y., and E. Rosen, "MPLS Upstream
              Label Assignment and Context-Specific Label Space",
              RFC 5331, DOI 10.17487/RFC5331, August 2008.

   [RFC5561]  Thomas, B., Raza, K., Aggarwal, S., Aggarwal, R., and JL.
              Le Roux, "LDP Capabilities", RFC 5561,
              DOI 10.17487/RFC5561, July 2009.

   [RFC3471]  Berger, L., Ed., "Generalized Multi-Protocol Label
              Switching (GMPLS) Signaling Functional Description",
              RFC 3471, DOI 10.17487/RFC3471, January 2003.

   [RFC3472]  Ashwood-Smith, P., Ed. and L. Berger, Ed., "Generalized
              Multi-Protocol Label Switching (GMPLS) Signaling
              Constraint-based Routed Label Distribution Protocol
              (CR-LDP) Extensions", RFC 3472, DOI 10.17487/RFC3472,
              January 2003.

   [RFC6389]  Aggarwal, R. and JL. Le Roux, "MPLS Upstream Label
              Assignment for LDP", RFC 6389, DOI 10.17487/RFC6389,
              November 2011.

   [RFC4090]  Pan, P., Ed., Swallow, G., Ed., and A. Atlas, Ed., "Fast
              Reroute Extensions to RSVP-TE for LSP Tunnels", RFC 4090,
              DOI 10.17487/RFC4090, May 2005.

   [RFC5286]  Atlas, A., Ed. and A. Zinin, Ed., "Basic Specification for
              IP Fast Reroute: Loop-Free Alternates", RFC 5286,
              DOI 10.17487/RFC5286, September 2008.

   [RFC7812]  Atlas, A., Bowers, C., and G. Enyedi, "An Architecture for
              IP/LDP Fast Reroute Using Maximally Redundant Trees
              (MRT-FRR)", RFC 7812, DOI 10.17487/RFC7812, June 2016.
   [RFC3031]  Rosen, E., Viswanathan, A., and R. Callon, "Multiprotocol
              Label Switching Architecture", RFC 3031,
              DOI 10.17487/RFC3031, January 2001.

10.2.  Informative References

   [RFC5920]  Fang, L., Ed., "Security Framework for MPLS and GMPLS
              Networks", RFC 5920, DOI 10.17487/RFC5920, July 2010.

   [RFC3985]  Bryant, S., Ed. and P. Pate, Ed., "Pseudo Wire Emulation
              Edge-to-Edge (PWE3) Architecture", RFC 3985,
              DOI 10.17487/RFC3985, March 2005.

   [RFC3209]  Awduche, D., Berger, L., Gan, D., Li, T., Srinivasan, V.,
              and G. Swallow, "RSVP-TE: Extensions to RSVP for LSP
              Tunnels", RFC 3209, DOI 10.17487/RFC3209, December 2001.

   [RFC5036]  Andersson, L., Ed., Minei, I., Ed., and B. Thomas, Ed.,
              "LDP Specification", RFC 5036, DOI 10.17487/RFC5036,
              October 2007.

   [RFC5659]  Bocci, M. and S. Bryant, "An Architecture for
              Multi-Segment Pseudowire Emulation Edge-to-Edge",
              RFC 5659, DOI 10.17487/RFC5659, October 2009.

   [RFC5714]  Shand, M. and S. Bryant, "IP Fast Reroute Framework",
              RFC 5714, DOI 10.17487/RFC5714, January 2010.

   [RFC5880]  Katz, D. and D. Ward, "Bidirectional Forwarding Detection
              (BFD)", RFC 5880, DOI 10.17487/RFC5880, June 2010.

Authors' Addresses

   Yimin Shen
   Juniper Networks
   10 Technology Park Drive
   Westford, MA 01886
   USA

   Phone: +1 9785890722
   Email: yshen@juniper.net


   Rahul Aggarwal
   Arktan, Inc

   Email: raggarwa_1@yahoo.com


   Wim Henderickx
   Alcatel-Lucent
   Copernicuslaan 50
   2018 Antwerp
   Belgium

   Email: wim.henderickx@alcatel-lucent.be


   Yuanlong Jiang
   Huawei Technologies

   Email: jiangyuanlong@huawei.com
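
   As an informal illustration of the PWid encoding shown in Figure 17,
   the following Python sketch packs and unpacks a Protection FEC
   Element.  It is not part of the specification.  The function names
   pack_pwid_fec and unpack_pwid_fec are hypothetical, the PE addresses
   are assumed to be IPv4 (which is consistent with the Length value of
   20), and the Length field is assumed to count the octets that follow
   the first 32-bit word.

```python
import socket
import struct

FEC_TYPE_PROTECTION = 0x83  # Type field from Figure 17
ENC_TYPE_PWID = 1           # Enc Type field for the PWid encoding
PWID_BODY_LEN = 20          # Assumption: octets after the first 32-bit word

# Network byte order: Type, Reserved, Enc Type, Length, two IPv4 addresses,
# Group ID, PW ID, C bit + 15-bit PW Type, 16 reserved bits.
_LAYOUT = "!BBBB4s4sIIHH"


def pack_pwid_fec(ingress_pe, egress_pe, group_id, pw_id, pw_type, control_word):
    """Pack a Protection FEC Element with PWid encoding (IPv4 assumed)."""
    if not 0 < pw_id <= 0xFFFFFFFF:
        raise ValueError("PW ID must be a non-zero 32-bit value")
    if not 0 <= pw_type <= 0x7FFF:
        raise ValueError("PW Type must fit in 15 bits")
    # The C bit is the most significant bit of the 16-bit C/PW Type field.
    c_and_type = (0x8000 if control_word else 0) | pw_type
    return struct.pack(
        _LAYOUT,
        FEC_TYPE_PROTECTION, 0, ENC_TYPE_PWID, PWID_BODY_LEN,
        socket.inet_aton(ingress_pe), socket.inet_aton(egress_pe),
        group_id, pw_id, c_and_type, 0,
    )


def unpack_pwid_fec(data):
    """Unpack the 24-octet element produced by pack_pwid_fec."""
    (fec_type, _rsvd, enc_type, length, ingress, egress,
     group_id, pw_id, c_and_type, _rsvd2) = struct.unpack(_LAYOUT, data)
    if fec_type != FEC_TYPE_PROTECTION or enc_type != ENC_TYPE_PWID:
        raise ValueError("not a Protection FEC Element with PWid encoding")
    if length != PWID_BODY_LEN:
        raise ValueError("unexpected Length for PWid encoding")
    return {
        "ingress_pe": socket.inet_ntoa(ingress),
        "egress_pe": socket.inet_ntoa(egress),
        "group_id": group_id,
        "pw_id": pw_id,
        "pw_type": c_and_type & 0x7FFF,
        "control_word": bool(c_and_type & 0x8000),
    }


# Example values only; the addresses come from the documentation range.
element = pack_pwid_fec("192.0.2.1", "192.0.2.2", group_id=0, pw_id=100,
                        pw_type=5, control_word=True)
print(len(element), element.hex())
```

   Keeping the C bit and the PW Type in one 16-bit integer mirrors the
   figure, where both share the upper half of the last word.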