idnits 2.17.1 draft-ietf-mpls-tp-1ton-protection-02.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** There is 1 instance of too long lines in the document, the longest one being 1 character in excess of 72. ** The abstract seems to contain references ([LinProt], [SurvivFwk]), which it shouldn't. Please replace those with straight textual mentions of the documents in question. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year == Line 601 has weird spacing: '...rts and selec...' == The document seems to contain a disclaimer for pre-RFC5378 work, but was first submitted on or after 10 November 2008. The disclaimer is usually necessary only for documents that revise or obsolete older RFCs, and that take significant amounts of text from those RFCs. If you can contact all authors of the source material and they are willing to grant the BCP78 rights to the IETF Trust, you can and should remove the disclaimer. Otherwise, the disclaimer is needed and you can ignore this comment. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- The document date (August 6, 2013) is 3916 days in the past. Is this intentional? -- Found something which looks like a code comment -- if you have code sections in the document, please surround them with '' and '' lines. 
Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) == Missing Reference: 'W1' is mentioned on line 752, but not defined == Missing Reference: 'W2' is mentioned on line 752, but not defined == Missing Reference: 'W3' is mentioned on line 752, but not defined == Missing Reference: 'W4' is mentioned on line 520, but not defined -- Looks like a reference, but probably isn't: '5' on line 1540 -- Looks like a reference, but probably isn't: '1' on line 1532 -- Looks like a reference, but probably isn't: '2' on line 1534 -- Looks like a reference, but probably isn't: '6' on line 1543 -- Looks like a reference, but probably isn't: '3' on line 1536 -- Looks like a reference, but probably isn't: '7' on line 1546 -- Looks like a reference, but probably isn't: '4' on line 1538 -- Looks like a reference, but probably isn't: '8' on line 1549 -- Looks like a reference, but probably isn't: '9' on line 1551 -- Looks like a reference, but probably isn't: '10' on line 1553 -- Looks like a reference, but probably isn't: '16' on line 1567 -- Looks like a reference, but probably isn't: '11' on line 1555 -- Looks like a reference, but probably isn't: '12' on line 1557 -- Looks like a reference, but probably isn't: '14' on line 1561 -- Looks like a reference, but probably isn't: '15' on line 1564 -- Looks like a reference, but probably isn't: '13' on line 1559 -- Looks like a reference, but probably isn't: '17' on line 1571 -- Looks like a reference, but probably isn't: '18' on line 1574 == Outdated reference: A later version (-09) exists of draft-ietf-mpls-tp-security-framework-07 Summary: 2 errors (**), 0 flaws (~~), 8 warnings (==), 20 comments (--). Run idnits with the --verbose option for more detailed information about the items above. 
--------------------------------------------------------------------------------
  2  Network Working Group                                         E. Osborne
  3  Internet-Draft                                                     Cisco
  4  Intended status: Standards Track                                F. Zhang
  5  Expires: February 7, 2014                                            ZTE
  6                                                             Y. Weingarten
  7                                                            August 6, 2013

  9                          MPLS-TP 1toN Protection
 10                 draft-ietf-mpls-tp-1ton-protection-02.txt

 12  Abstract

 14     There is a requirement for Multiprotocol Label Switching Transport
 15     Profile (MPLS-TP) to support 1:n linear protection for transport
 16     paths. This requirement is further elaborated in RFC6372
 17     [SurvivFwk]. The basic protocol for linear protection, specified in
 18     RFC6378 [LinProt], is limited to 1+1 and 1:1 protection. This
 19     document extends that protocol to address the additional
 20     functionality necessary to support scenarios where a single
 21     protection path is preconfigured to provide protection of multiple
 22     transport paths between two joint endpoints.

 24     This document is a product of a joint Internet Engineering Task Force
 25     (IETF) / International Telecommunications Union Telecommunications
 26     Standardization Sector (ITU-T) effort to include an MPLS Transport
 27     Profile within the IETF MPLS and PWE3 architectures to support the
 28     capabilities and functionalities of a packet transport network as
 29     defined by the ITU-T.

 31  Status of this Memo

 33     This Internet-Draft is submitted in full conformance with the
 34     provisions of BCP 78 and BCP 79.

 36     Internet-Drafts are working documents of the Internet Engineering
 37     Task Force (IETF). Note that other groups may also distribute
 38     working documents as Internet-Drafts. The list of current Internet-
 39     Drafts is at http://datatracker.ietf.org/drafts/current/.

 41     Internet-Drafts are draft documents valid for a maximum of six months
 42     and may be updated, replaced, or obsoleted by other documents at any
 43     time. It is inappropriate to use Internet-Drafts as reference
 44     material or to cite them other than as "work in progress."

 46     This Internet-Draft will expire on February 7, 2014.
48 Copyright Notice 49 Copyright (c) 2013 IETF Trust and the persons identified as the 50 document authors. All rights reserved. 52 This document is subject to BCP 78 and the IETF Trust's Legal 53 Provisions Relating to IETF Documents 54 (http://trustee.ietf.org/license-info) in effect on the date of 55 publication of this document. Please review these documents 56 carefully, as they describe your rights and restrictions with respect 57 to this document. Code Components extracted from this document must 58 include Simplified BSD License text as described in Section 4.e of 59 the Trust Legal Provisions and are provided without warranty as 60 described in the Simplified BSD License. 62 This document may contain material from IETF Documents or IETF 63 Contributions published or made publicly available before November 64 10, 2008. The person(s) controlling the copyright in some of this 65 material may not have granted the IETF Trust the right to allow 66 modifications of such material outside the IETF Standards Process. 67 Without obtaining an adequate license from the person(s) controlling 68 the copyright in such materials, this document may not be modified 69 outside the IETF Standards Process, and derivative works of it may 70 not be created outside the IETF Standards Process, except to format 71 it for publication as an RFC or to translate it into languages other 72 than English. 74 Table of Contents 76 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4 77 1.1. 1:n Protection architecture . . . . . . . . . . . . . . . 4 78 1.2. Locking operation . . . . . . . . . . . . . . . . . . . . 6 79 1.3. Non-Locking . . . . . . . . . . . . . . . . . . . . . . . 7 80 1.4. Path priority . . . . . . . . . . . . . . . . . . . . . . 7 81 1.5. Preemption . . . . . . . . . . . . . . . . . . . . . . . . 8 82 1.6. Contributing authors . . . . . . . . . . . . . . . . . . . 8 83 2. Conventions used in this document . . . . . . . . . . . . . . 8 84 2.1. Acronyms . . . 
. . . . . . . . . . . . . . . . . . . . . . 9 85 2.2. Definitions and Terminology . . . . . . . . . . . . . . . 9 86 3. Use cases and scenarios . . . . . . . . . . . . . . . . . . . 9 87 3.1. Non-locking use case: Per-node label space . . . . . . . . 9 88 3.2. Locking use-case: . . . . . . . . . . . . . . . . . . . . 10 89 3.3. PSC Scenarios . . . . . . . . . . . . . . . . . . . . . . 12 90 3.3.1. Unidirectional failure cases . . . . . . . . . . . . . 13 91 3.3.2. Bidirectional fault scenarios . . . . . . . . . . . . 15 92 3.3.3. Preemption scenarios . . . . . . . . . . . . . . . . . 17 93 4. Changes to PSC . . . . . . . . . . . . . . . . . . . . . . . . 22 94 4.1. PSC . . . . . . . . . . . . . . . . . . . . . . . . . . . 23 95 4.2. Changes to PSC Payload . . . . . . . . . . . . . . . . . . 23 96 4.2.1. Locking (L) flag . . . . . . . . . . . . . . . . . . . 24 97 4.2.2. Fault path (FPath) field . . . . . . . . . . . . . . . 24 98 4.2.3. Data path (Path) field . . . . . . . . . . . . . . . . 24 99 4.3. Changes to PSC Operation . . . . . . . . . . . . . . . . . 25 100 4.3.1. Basic operation . . . . . . . . . . . . . . . . . . . 25 101 4.3.2. Two-phased operation . . . . . . . . . . . . . . . . . 25 102 4.3.3. Acknowledge message . . . . . . . . . . . . . . . . . 26 103 4.3.4. Wait for Acknowledge (WFA) timer . . . . . . . . . . . 27 104 4.3.5. Additional PSC State . . . . . . . . . . . . . . . . . 27 105 5. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 31 106 6. Security Considerations . . . . . . . . . . . . . . . . . . . 31 107 7. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 31 108 8. References . . . . . . . . . . . . . . . . . . . . . . . . . . 31 109 8.1. Normative References . . . . . . . . . . . . . . . . . . . 31 110 8.2. Informative References . . . . . . . . . . . . . . . . . . 32 111 Appendix A. PSC state machine tables . . . . . . . . . . . . . . 32 112 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 
36

114  1.  Introduction

116     The MPLS Transport Profile (MPLS-TP) Requirements document [TPReq]
117     includes requirements for the necessary survivability tools required
118     for MPLS based transport networks. Network survivability is the
119     ability of a network to recover traffic delivery following failure,
120     or degradation of network resources. Requirement 67 lists various
121     types of 1:n protection architectures that are required for MPLS-TP.
122     The MPLS-TP Survivability Framework [SurvivFwk] is a framework for
123     survivability in MPLS-TP networks, and describes recovery elements,
124     types, methods, and topological considerations, focusing on
125     mechanisms for recovering MPLS-TP Label Switched Paths (LSPs).

127     Linear protection in mesh networks - networks with arbitrary
128     interconnectivity between nodes - is described in Section 4.7 of
129     [SurvivFwk]. Linear protection provides rapid and simple protection
130     switching. In a mesh network, linear protection provides a very
131     suitable protection mechanism because it can operate between any pair
132     of points within the network. It can protect against a defect in an
133     intermediate node, a span, a transport path segment, or an end-to-end
134     transport path.

136     [LinProt] defines a Protection State Coordination (PSC) protocol that
137     supports the different 1+1 and 1:1 architectures described in
138     [SurvivFwk]. The PSC protocol is a single-phased protocol that
139     allows the two endpoints of the protection domain to coordinate the
140     protection switching operation when a switching condition is detected
141     on the transport paths of the protection domain.

143     This document extends the PSC protocol to support a protection domain
144     that includes multiple working transport paths, between common end
145     points, protected by a single protection transport path. The
146     protection transport path is pre-allocated with resources to
147     transport the traffic normally carried by any one of the working
148     transport paths. This is the architecture described in [SurvivFwk]
149     as 1:n protection, and is the generalization of the 1:1 protection
150     architecture already supported by PSC.

152  1.1.  1:n Protection architecture

154     Linear protection switching is a fully allocated survivability
155     mechanism in the sense that the route and bandwidth of the protection
156     path is reserved for a set of working paths. For 1:n protection the
157     protection path is allocated to protect any one of n working paths
158     between the two endpoints of the protection domain.

160        +-----+                             +-----+
161        |     |=============================|     |
162        |LER-A|       Working Path #1       |LER-Z|
163        |     |                             |     |
164        |     |=============================|     |
165        |     |       Working Path #2       |     |
166        |     |                             |     |
167        |     |=============================|     |
168        |     |       Working Path #3       |     |
169        |     |                             |     |
170        |     |             ooo             |     |
171        |     |                             |     |
172        |     |=============================|     |
173        |     |       Working Path #N       |     |
174        |     |                             |     |
175        |     |       Protection Path       |     |
176        |     |*****************************|     |
177        |     |                             |     |
178        +-----+                             +-----+

180            |--------Protection Domain--------|

182                      Figure 1: 1:n Protection domain

184     Figure 1 shows a protection domain with N working transport paths and
185     a single protection path. In 1:n protection, the protection path may
186     transport the traffic of only a single working path at any particular
187     time. The identity of the working path that is being protected must
188     be communicated between the two endpoints.

190     Unless otherwise specified, all examples will be based on the network
191     topology in Figure 1, with the working paths referenced as Wi (for
192     1<=i<=N) and the protection path referenced as P. The end-points of
193     the protection domain will be referred to as LER-A and LER-Z.
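
The constraint stated for Figure 1, that P may carry the traffic of only one working path at a time, can be sketched as a small data model. This is an illustrative sketch only: the class and method names (`ProtectionDomain`, `protect`, `release`) are hypothetical and are not defined by this document or by PSC, and preemption between working paths is deliberately left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProtectionDomain:
    """Hypothetical model of Figure 1: working paths W1..Wn and one path P."""

    n_working: int
    protected: Optional[int] = None  # index i of the Wi carried on P, if any

    def protect(self, i: int) -> None:
        """Place the traffic of working path Wi on P."""
        if not 1 <= i <= self.n_working:
            raise ValueError(f"W{i} is not part of this protection domain")
        if self.protected is not None and self.protected != i:
            # P may transport the traffic of only one working path at a time.
            raise RuntimeError(f"P already carries W{self.protected}")
        self.protected = i

    def release(self) -> None:
        """Return P to the idle state."""
        self.protected = None


domain = ProtectionDomain(n_working=4)
domain.protect(1)
print(domain.protected)
```

A real implementation would also have to signal the identity of the protected path to the far end, which is what the PSC extensions in this document address.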
195     The different working paths may be disjoint at the intermediary
196     points on the path between LER-A and LER-Z and may also have
197     different resource requirements. In addition, each of the working
198     paths may be assigned a priority that could be used to decide which
199     working path would be protected in cases of conflict (see more on
200     this topic in Section 1.5). It is usually advised to arrange these
201     protection groups in a way that would minimize any potential conflict
202     situation.

204     1:n protection in MPLS supports two modes of operation: locking and
205     non-locking. The locking mode mirrors the behavior that is used by
206     many transport protection mechanisms, and is necessary in some cases
207     but may incur increased latency (and thus packet loss), as a result
208     of prolonged switching time, in comparison to the non-locking case.
209     Non-locking 1:n can be used in many MPLS networks and affords a lower
210     rate of packet loss as compared to locking mode, but must be used
211     with care, since incorrect use of non-locking can lead to
212     misconnectivity.

214  1.2.  Locking operation

216     At a high level, the locking operation mode of 1:n protection
217     follows these basic steps:

219     o  LER-A detects a unidirectional failure of W1 and stops sending
220        traffic on W1.

222     o  LER-A transmits a PSC SF message to LER-Z indicating that W1 has
223        failed and its traffic should be redirected to P. No traffic is
224        sent on P at this point.

226     o  LER-Z receives the PSC message from LER-A and begins transmitting
227        W1 traffic in P, and sends a PSC message to LER-A indicating that
228        W1 is now being protected by P. LER-A receives the normal data
229        traffic intended for W1 from P; LER-Z receives the W1 data traffic
230        from P and also bridges W1 data traffic into P.

232     o  LER-A receives the PSC message from LER-Z and begins transporting
233        W1 traffic in P -- that is, LER-A bridges W1 into P.
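
The four locking steps above can be laid out on a timeline. The sketch below is illustrative only: the function name `locking_timeline` and its parameters are hypothetical, and it assumes negligible message processing time and the same one-way delay in both directions.

```python
def locking_timeline(owd_ms: float, failed_path: int = 1):
    """Return (time_ms, event) pairs for the locking steps listed above.

    Assumes negligible processing time and an identical one-way delay
    (owd_ms) in both directions; both are simplifications.
    """
    w = f"W{failed_path}"
    return [
        (0.0, f"LER-A detects failure of {w} and stops sending on {w}"),
        (0.0, f"LER-A sends PSC SF for {w}; P carries no traffic yet"),
        (owd_ms, f"LER-Z receives SF, bridges {w} into P and replies"),
        (2 * owd_ms, f"LER-A receives the reply and bridges {w} into P"),
    ]


for time_ms, event in locking_timeline(owd_ms=5.0):
    print(f"t={time_ms:5.1f} ms  {event}")
```

Under these assumptions LER-A's bridge onto P is in place only after two one-way delays, which is why locking mode trades switching time for certainty about what is on P.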
235     It should be clear from this description that no traffic is sent over
236     P until LER-Z processes the PSC message from LER-A, and that traffic
237     is only sent unidirectionally (Z->A) until LER-A processes the
238     "reply" PSC message from LER-Z. As the message processing time is
239     expected to be dwarfed by the propagation delay between LER-A and
240     LER-Z, it can be said that there is complete traffic loss between the
241     endpoints for the duration of the one-way propagation delay from
242     LER-A to LER-Z, and bidirectional traffic flow is not fully
243     restored until after 1xRTT of the protection path.

245     This operation mode is referred to as "locking" because the sequence
246     of processing the PSC messages includes periods where the protection
247     path is locked from carrying protected traffic, while the two
248     end-points verify that both are ready to process the W1 traffic that
249     is received on P. More detailed information on this mode of operation
250     will be supplied later in the document when considering different
251     scenarios.

253  1.3.  Non-Locking

255     In non-locking protection operation mode, LER-A switches data traffic
256     onto P immediately upon failure detection. This minimizes traffic
257     loss, but at the cost of temporary asymmetry of packet flow. At a
258     high level, it works like this:

260     o  LER-A detects the failure of W1 and stops sending traffic on W1.

262     o  LER-A immediately begins to transport W1's data traffic over the
263        protection path P.

265     o  Simultaneously, LER-A transmits a PSC message to LER-Z indicating
266        that W1 has failed and is currently being protected in P.

268     o  LER-Z receives the PSC message from LER-A, switches all W1 data
269        traffic to P, and transmits a PSC message to LER-A indicating that
270        W1 is now protected in P.

272     o  LER-A receives the PSC message from LER-Z and needs to take no
273        action, as the protection switch had already been completed.
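
The temporary asymmetry mentioned for non-locking mode can be made visible by replaying the steps above and recording which path each endpoint sends on. This is a minimal sketch with hypothetical names (`Endpoint`, `tx_path`, `non_locking_switch`); it tracks only the transmit side and ignores timing.

```python
from dataclasses import dataclass


@dataclass
class Endpoint:
    name: str
    tx_path: str = "W1"  # path this endpoint uses to send W1's traffic


def non_locking_switch():
    """Replay the non-locking steps and record where each end is sending."""
    a, z = Endpoint("LER-A"), Endpoint("LER-Z")
    history = []

    def snapshot(step: str) -> None:
        asymmetric = a.tx_path != z.tx_path
        history.append((step, a.tx_path, z.tx_path, asymmetric))

    snapshot("before failure")
    a.tx_path = "P"  # LER-A switches immediately on failure detection
    snapshot("LER-A detects failure, switches to P, sends PSC")
    z.tx_path = "P"  # LER-Z switches when the PSC message arrives
    snapshot("LER-Z receives PSC, switches to P, replies")
    snapshot("LER-A receives reply, no action needed")
    return history


for step, a_tx, z_tx, asymmetric in non_locking_switch():
    print(f"{step}: A sends on {a_tx}, Z sends on {z_tx}, asymmetric={asymmetric}")
```

Only the second snapshot is asymmetric: LER-A is already sending on P while LER-Z is still sending on W1.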
275     In the non-locking case, the packet loss between the endpoints is
276     minimized. Packet loss may occur in the A->Z direction only for the
277     duration of the failure detection time, which is assumed, for this
278     document, to be negligible. Packet loss in the Z->A direction is
279     almost entirely the result of the one-way propagation delay of the
280     PSC message from LER-A to LER-Z. Assuming the transport path from
281     A->Z has the same delay as that from Z->A, it can be said that the
282     packet loss in the non-locking case is roughly half that of the
283     locking case.

285  1.4.  Path priority

287     As the 1:n architecture requires the ability for one working path to
288     preempt the traffic of another in the event of multiple failures (see
289     Section 1.5), there must be an indication of priority between the
290     different working paths so that an implementation can decide whether
291     a new failure should be allowed to preempt a protection switch
292     already in place. The priority for a given Working path is
293     determined by the value used to represent that path in the FPath
294     field of the PSC packet. When comparing two Working paths to
295     determine priority, the numerically lower FPath value is the winner.
296     That is, Wi>Wj if i<j. [...]

        W2 etc. There is a single protection path, P. These
522     examples use the notation "B = x" to indicate the working LSP whose
523     contents are bridged into the protection LSP. For example, if W3 has
524     failed and is currently protected, B = 3. If no protection is in
525     place, B = n/a. All examples end with the REQ(FPath, Path) and B
526     values for each node in each example.

528     The non-locking cases assume that both LER-A and LER-Z have
529     preestablished per-node label spaces, as per the use case above.

531     All cases assume that the time required to perform on-box operations
532     such as bridging or selecting is negligible. The one-way delay
533     between nodes is abbreviated OWD, and the round trip time is RTT
534     (i.e. RTT = 2 x OWD).
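
The REQ(FPath, Path) and "B = x" notation used in the scenario tables, together with the lower-FPath-wins priority rule, can be captured in a few helper functions. This is only a sketch for reading the tables: `PscRequest`, `parse_request`, `bridge_label` and `higher_priority` are hypothetical names, and the string form is the notation of this document, not the PSC wire format.

```python
import re
from typing import NamedTuple, Optional


class PscRequest(NamedTuple):
    req: str    # request type as written in the tables, e.g. "NR" or "SF"
    fpath: int  # FPath value
    path: int   # Path value


_REQ_PATTERN = re.compile(r"([A-Z]+)\((\d+),(\d+)\)")


def parse_request(text: str) -> PscRequest:
    """Parse table notation such as 'SF(1,1)' or 'NR(0, 0)'."""
    match = _REQ_PATTERN.fullmatch(text.replace(" ", ""))
    if match is None:
        raise ValueError(f"not in REQ(FPath, Path) form: {text!r}")
    req, fpath, path = match.groups()
    return PscRequest(req, int(fpath), int(path))


def bridge_label(bridged: Optional[int]) -> str:
    """Render the bridge state the way the tables do."""
    return "B = n/a" if bridged is None else f"B = {bridged}"


def higher_priority(i: int, j: int) -> bool:
    """True if Wi outranks Wj: the numerically lower FPath value wins."""
    return i < j


print(parse_request("SF(1,1)"), bridge_label(None), higher_priority(1, 2))
```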
536  3.3.1.  Unidirectional failure cases

538     The examples in this section provide the message flow between LER-A
539     and LER-Z for the scenario where a unidirectional fault is detected
540     by LER-A on working path W1. The message flow is described as a
541     sequence along a timeline.

543  3.3.1.1.  Non-locking

545     For a protection domain operating in non-locking mode, the event
546     timeline is as follows:

548  +--------------------------------------------------------------------+
549  |Time|             Event Description             |LER-A PSC|LER-Z PSC|
550  |    |                                           | Bridge  | Bridge  |
551  +----+-------------------------------------------+---------+---------+
552  | t0 | Traffic is being transported on W1, P is  | NR(0,0) | NR(0,0) |
553  |    | not carrying any traffic. Both LER-A and  | B = n/a | B = n/a |
554  |    | LER-Z transmitting PSC NR(0,0) message.   |         |         |
555  +----+-------------------------------------------+---------+---------+
556  | t1 | LER-A detects SF on W1, bridges W1 into P | SF(1,1) | NR(0,0) |
557  |    | and sends SF(1,1). LER-A enters the WFA   | B = 1   | B = n/a |
558  |    | (Waiting for Acknowledgement) state. LER-A|         |         |
559  |    | still selects the traffic from W1. This is|         |         |
560  |    | admittedly of little use when LER-A sees  |         |         |
561  |    | SF, but may be useful when LER-A sees a   |         |         |
562  |    | partial failure such as SD.               |         |         |
563  +----+-------------------------------------------+---------+---------+
564  | t2 | LER-Z receives SF(1,1). LER-Z enters      | SF(1,1) | NR(0,1) |
565  |    | PF:W:R state. LER-Z switches W1 onto P    | B = 1   | B = 1   |
566  |    | and sends NR(0,1). At this point traffic  |         |         |
567  |    | for W1 is protected in both directions.   |         |         |
568  +----+-------------------------------------------+---------+---------+
569  | t3 | LER-A receives NR(0,1), which it takes as | SF(1,1) | NR(0,1) |
570  |    | an ACK from LER-Z. LER-A transits from    | B = 1   | B = 1   |
571  |    | WFA to PF:W:L state. Switch is complete.  |         |         |
572  +----+-------------------------------------------+---------+---------+

574                   Figure 2: Unidirectional non-locking

576     Note: Between t1 and t2, LER-A transports the data traffic on P while
577     LER-Z continues transporting it on W1, and there is temporary path
578     asymmetry. After t2, the data traffic is in P in both directions.

580     In this case, LER-A loses traffic for the OWD time, as it does not
581     receive any traffic from LER-Z on P until LER-Z bridges W1 into P.
582     LER-Z does not lose any traffic due to the immediate bridging on
583     LER-A.

585  3.3.1.2.  Locking

587     For the same scenario in a protection domain that uses the locking
588     mode of operation, the time sequence is as follows:

591  +--------------------------------------------------------------------+
592  |Time|             Event Description             |LER-A PSC|LER-Z PSC|
593  |    |                                           | Bridge  | Bridge  |
594  +----+-------------------------------------------+---------+---------+
595  | t0 | Traffic is being transported on W1, P is  | NR(0,0) | NR(0,0) |
596  |    | not carrying any traffic. Both LER-A and  | B = n/a | B = n/a |
597  |    | LER-Z transmitting PSC NR(0,0) message.   |         |         |
598  +----+-------------------------------------------+---------+---------+
599  | t1 | LER-A detects SF on W1, enters the WFA    | SF(1,0) | NR(0,0) |
600  |    | state and sends SF(1,0). LER-A still      | B = n/a | B = n/a |
601  |    | transports and selects the traffic from   |         |         |
602  |    | W1. This allows traffic to get through if |         |         |
603  |    | the failure is truly unidirectional.      |         |         |
604  +----+-------------------------------------------+---------+---------+
605  | t2 | LER-Z receives SF(1,0). LER-Z enters      | SF(1,0) | NR(0,1) |
606  |    | PF:W:R state. LER-Z bridges W1 into P and | B = 1   | B = 1   |
607  |    | sends NR(0,1) but continues to select     |         |         |
608  |    | traffic from W1.                          |         |         |
609  +----+-------------------------------------------+---------+---------+
610  | t3 | LER-A receives NR(0,1), which it takes as | SF(1,1) | NR(0,1) |
611  |    | an ACK from LER-Z. LER-A completely       | B = 1   | B = 1   |
612  |    | switches W1 traffic onto P. LER-A transits|         |         |
613  |    | from WFA to PF:W:L state. Switch complete.|         |         |
614  +----+-------------------------------------------+---------+---------+
615  | t4 | LER-Z receives SF(1,1). LER-Z selects W1  | SF(1,1) | NR(0,1) |
616  |    | traffic from P and sends NR(0,1).         | B = 1   | B = 1   |
617  +----+-------------------------------------------+---------+---------+

619                     Figure 3: Unidirectional locking

621     Note: At t1, LER-A stops sending traffic to LER-Z. At t3, it
622     resumes. Since the delay between t1 and t2, and between t2 and t3,
623     is dominated by the one-way transmission delay between LER-A and
624     LER-Z, there is a total of 1xRTT of traffic loss at both endpoints.

626  3.3.2.  Bidirectional fault scenarios

628     The examples above focused on unidirectional failures in order to
629     illustrate the basic principles of 1:n protection. However, most
630     failures in carrier networks are bidirectional in nature.
631     Bidirectionality includes not only the failure of both the tx and rx
632     physical paths (e.g. a fiber cut) but also a unidirectional failure
633     made bidirectional by mechanisms outside of PSC such as CC-V or LDI.

635     The two ends of a protection domain may not see a bidirectional
636     failure at the same instant. In the case of a true bidirectional
637     fiber cut, the cut may be physically closer to one end of the domain
638     than the other, and thus the end which is farther away takes longer
639     to notice the failure. This is referred to as "asymmetric
640     notification delay" in this document.
        Similarly, a unidirectional
641     failure seen by one endpoint which triggers an LDI notification to
642     the far endpoint will not be recognized by this far end until after
643     it has been noticed at the near endpoint.

645     There are a number of scenarios that constitute bidirectional
646     failure, and the variety of triggers and notification delays mean
647     that it is impossible to document them all here. The scenario used
648     in this case is of a true bidirectional failure, on working path W1,
649     with asymmetric notification delay, as described above. Both the
650     non-locking and locking modes of operation are presented.

652     It is perhaps important to understand that a node, when reacting to a
653     failure, simply reacts either to its local LSP status (e.g. SF on
654     the underlying fiber) or the status of the remote node (e.g. the
655     remote node sending SF(x,y)). A node neither knows nor cares whether
656     the failure is bidirectional; it simply reacts to inputs to its local
657     state machine. It can easily be observed that there are no special
658     states needed for unidirectional vs. bidirectional error handling.

660  3.3.2.1.  Non-Locking

662     First we present the scenario when operating in non-locking mode:

664  +--------------------------------------------------------------------+
665  |Time|             Event Description             |LER-A PSC|LER-Z PSC|
666  |    |                                           | Bridge  | Bridge  |
667  +----+-------------------------------------------+---------+---------+
668  | t0 | Traffic is being transported on W1, P is  | NR(0,0) | NR(0,0) |
669  |    | not carrying any traffic. Both LER-A and  | B = n/a | B = n/a |
670  |    | LER-Z transmitting PSC NR(0,0) message.   |         |         |
671  +----+-------------------------------------------+---------+---------+
672  | t1 | LER-A detects SF on W1, bridges W1 into P | SF(1,1) | NR(0,0) |
673  |    | and sends SF(1,1). LER-A enters the WFA   | B = 1   | B = n/a |
674  |    | state and continues to select the traffic |         |         |
675  |    | from W1.                                  |         |         |
676  +----+-------------------------------------------+---------+---------+
677  | t2 | LER-Z detects the SF on W1. LER-Z enters  | SF(1,1) | SF(1,1) |
678  |    | the WFA state, bridges W1 into P and      | B = 1   | B = 1   |
679  |    | transmits SF(1,1). At this point          |         |         |
680  |    | traffic for W1 is protected in both       |         |         |
681  |    | directions; however, the endpoints are    |         |         |
682  |    | still not coordinated.                    |         |         |
683  +----+-------------------------------------------+---------+---------+
684  | t3 | LER-Z receives the SF(1,1) from LER-A and | SF(1,1) | SF(1,1) |
685  |    | considers it an Ack and transits from WFA | B = 1   | B = 1   |
686  |    | to PF:W:L state.                          |         |         |
687  +----+-------------------------------------------+---------+---------+
688  | t4 | LER-A receives SF(1,1), which it takes as | SF(1,1) | SF(1,1) |
689  |    | an Ack from LER-Z and transits from WFA   | B = 1   | B = 1   |
690  |    | to PF:W:L state. Switch is complete.      |         |         |
691  +----+-------------------------------------------+---------+---------+

693                    Figure 4: Bidirectional non-locking

695     It is perhaps instructive to note that the only differences between
696     the unidirectional non-locking and bidirectional non-locking
697     scenarios are the trigger at t2 which causes Z to send SF(1,1) and
698     the state Z finally enters (PF:W:L rather than PF:W:R). All other
699     actions before and after this point are identical between the two
700     cases.

702  3.3.2.2.  Locking

704     We now follow the scenario for the locking mode of operation:

706  +--------------------------------------------------------------------+
707  |Time|             Event Description             |LER-A PSC|LER-Z PSC|
708  |    |                                           | Bridge  | Bridge  |
709  +----+-------------------------------------------+---------+---------+
710  | t0 | Traffic is being transported on W1, P is  | NR(0,0) | NR(0,0) |
711  |    | not carrying any traffic. Both LER-A and  | B = n/a | B = n/a |
712  |    | LER-Z transmitting PSC NR(0,0) message.   |         |         |
713  +----+-------------------------------------------+---------+---------+
714  | t1 | LER-A detects SF on W1 and sends SF(1,0). | SF(1,0) | NR(0,0) |
715  |    | LER-A enters WFA and continues to bridge  | B = n/a | B = n/a |
716  |    | and select the traffic from W1. This      |         |         |
717  |    | allows traffic to get through if the      |         |         |
718  |    | failure is really unidirectional.         |         |         |
719  +----+-------------------------------------------+---------+---------+
720  | t2 | LER-Z detects the SF on W1. LER-Z enters  | SF(1,0) | SF(1,0) |
721  |    | WFA state and continues to bridge and     | B = n/a | B = n/a |
722  |    | select traffic from W1 while transmitting |         |         |
723  |    | SF(1,0).                                  |         |         |
724  +----+-------------------------------------------+---------+---------+
725  | t3 | LER-Z receives the SF(1,0) from LER-A and | SF(1,0) | SF(1,1) |
726  |    | bridges traffic from W1 to P, remaining in| B = n/a | B = 1   |
727  |    | WFA state and now transmitting SF(1,1).   |         |         |
728  +----+-------------------------------------------+---------+---------+
729  | t4 | LER-A receives the SF(1,0) from LER-Z and | SF(1,1) | SF(1,1) |
730  |    | bridges traffic from W1 to P, remaining in| B = 1   | B = 1   |
731  |    | WFA state and now transmitting SF(1,1).   |         |         |
732  +----+-------------------------------------------+---------+---------+
733  | t5 | LER-A receives the SF(1,1) from LER-Z and | SF(1,1) | SF(1,1) |
734  |    | considers it an Ack and transits from WFA | B = 1   | B = 1   |
735  |    | to PF:W:L state.                          |         |         |
736  +----+-------------------------------------------+---------+---------+
737  | t6 | LER-Z receives SF(1,1), which it takes as | SF(1,1) | SF(1,1) |
738  |    | an Ack from LER-A and transits from WFA   | B = 1   | B = 1   |
739  |    | to PF:W:L state. Switch is complete.      |         |         |
740  +----+-------------------------------------------+---------+---------+

742                      Figure 5: Bidirectional locking

744     As with non-locking, the major differences between the unidirectional
745     and bidirectional scenarios of this failure are the alarm which
746     causes LER-Z to take action and the final state LER-Z enters as a
747     result.

749  3.3.3.  Preemption scenarios

751     In addition to a bidirectional failure, it is also necessary to
752     consider preemption. When protecting n entities, e.g., [W1, W2, W3], it
753     is possible for multiple working LSPs to simultaneously fail.
754     Consider the case where LSP W1 fails and starts to use the protection
755     LSP. After this failure, LSP W2 fails before W1 has been restored.
756     If W2 is of a lower relative priority than W1, there is no
757     preemption. However, if W2 has a higher priority than W1, when W2
758     fails it preempts W1 from the protection LSP. Preemption is not an
759     issue in 1:1 or 1+1, as with only a single working LSP there's
760     nothing to preempt.

762     There are multiple scenarios of preemption depending on where the
763     failures were detected. In addition to the combinations of failure
764     directionality and preemption, it is also necessary to consider how
765     these combinations behave in both the locking and non-locking modes
766     of operation.

768     First, consider the two flavors of preemption due to multiple
769     unidirectional failures.

771     The difference between Locking and Non-Locking modes is that a node
772     can continue to send traffic on the P-LSP during the preemption
773     process, when operating in Non-Locking mode. The P-LSP contents may
774     momentarily disagree (A may send W1 on P, Z may send W2 on P) but in
775     the non-locking case there is no risk of misconnectivity as explained
776     in the previous discussion. For this reason, the identity of the
777     path that the endpoints are selecting incoming traffic from is
778     irrelevant.
In a sense there is no selector; each node is able to 779 properly process arbitrary data on the P-LSP. 781 However, WFA state is still necessary in order to ensure that the 782 endpoints converge on the identity of the working path whose traffic 783 is being transported on the P-LSP. Failure to converge is a problem 784 that should be flagged to the operator. 786 The scenarios start after the two endpoints have converged on 787 protecting a unidirectional SF condition that was detected on W2, 788 when a new SF condition is detected on W1 (with higher priority): 790 3.3.3.1. Unidirectional non-locking 792 First, consider the event sequence for unidirectional faults in a 793 domain in non-locking mode: 795 +--------------------------------------------------------------------+ 796 |Time| Event Description |LER-A PSC|LER-Z PSC| 797 | | | Bridge | Bridge | 798 +----+-------------------------------------------+---------+---------+ 799 | t0 | Traffic from W2 is being transported on P | SF(2,2) | NR(0,2) | 800 | | and both endpoints are coordinated | B = 2 | B = 2 | 801 +----+-------------------------------------------+---------+---------+ 802 | t1 | LER-A detects SF on W1 and sends SF(1,1). | SF(1,1) | NR(0,2) | 803 | | LER-A enters into WFA, blocks the W2 | B = 1 | B = 2 | 804 | | traffic and begins transporting W1 traffic| | | 805 | | on P. (Since W1 has higher priority) | | | 806 +----+-------------------------------------------+---------+---------+ 807 | t2 | LER-Z receives the SF(1,1) from LER-A and | SF(1,1) | NR(0,1) | 808 | | bridges traffic from W1 to P remaining in | B = 1 | B = 1 | 809 | | PF:W:R now transmitting a NR(0,1) | | | 810 +----+-------------------------------------------+---------+---------+ 811 | t3 | LER-A receives the NR(0,1) from LER-Z and | SF(1,1) | NR(0,1) | 812 | | considers it an Ack and transits from WFA | B = 1 | B = 1 | 813 | | to PF:W:L state. 
Coordination complete | | | 814 +----+-------------------------------------------+---------+---------+ 816 Figure 6: Preemption unidirectional non-locking 818 As mentioned, in steady state LER-A is sending SF(2,2) and LER-Z is 819 sending NR(0,2). If LER-A detects an SF on W1, W1 must preempt W2 in 820 its use of the protection LSP. What the network subsequently does 821 with W2 is outside the scope of PSC, but likely recovery actions may 822 include rerouting W2, alerting W2's clients as to the unprotected 823 failure status of W2, and so forth. 825 3.3.3.2. Unidirectional locking 827 In locking operation mode, when A detects an SF on W1, it needs to 828 alert the far-end, LER-Z, that the W2 traffic must be preempted. 829 LER-A does this by indicating an SF on the higher priority LSP and by 830 emptying the protection LSP. The following table presents the 831 sequence for this scenario (we include the indication of the working 832 path that is expected by each endpoint to be on the protection path, 833 shown as "S = n") 835 +--------------------------------------------------------------------+ 836 |Time| Event Description |LER-A PSC|LER-Z PSC| 837 | | | Bridge | Bridge | 838 | | |Selector | Selector| 839 +----+-------------------------------------------+---------+---------+ 840 | t0 | Traffic from W2 is being transported on P | SF(2,2) | NR(0,2) | 841 | | and both endpoints are coordinated | B = 2 | B = 2 | 842 | | | S = 2 | S = 2 | 843 +----+-------------------------------------------+---------+---------+ 844 | t1 | LER-A detects SF on W1 and sends SF(1,0). 
| SF(1,0) | NR(0,2) | 845 | | LER-A enters into WFA blocks all traffic | B = n/a | B = 2 | 846 | | on the protection path | S = n/a | S = 2 | 847 +----+-------------------------------------------+---------+---------+ 848 | t2 | LER-Z receives the SF(1,0) from LER-A and | SF(1,0) | NR(0,1) | 849 | | bridges traffic from W1 to P (higher | B = n/a | B = 1 | 850 | | priority), and begins transmitting NR(0,1)| S = n/a | S = 2 | 851 | | At this point W1 traffic is flowing Z->A | | | 852 | | but not A->Z | | | 853 +----+-------------------------------------------+---------+---------+ 854 | t3 | LER-A receives NR(0,1) from LER-Z and | SF(1,1) | NR(0,1) | 855 | | considers it an Ack and transits from WFA | B = 1 | B = 1 | 856 | | to PF:W:L state and transmits SF(1,1) | S = 1 | S = 2 | 857 +----+-------------------------------------------+---------+---------+ 858 | t4 | LER-Z receives SF(1,1), and begins | SF(1,1) | NR(0,1) | 859 | | selecting the protected traffic as W1 data| B = 1 | B = 1 | 860 | | Switch is complete. | S = 1 | S = 1 | 861 +----+-------------------------------------------+---------+---------+ 863 Figure 7: Preemption unidirectional locking 865 Traffic loss is asymmetric. Loss A->Z starts at t1 and ends at t4, 866 roughly 1.5xRTT. Loss Z->A starts at t1 and ends at t3, roughly 867 0.5xRTT. 869 3.3.3.3. Bidirectional non-locking 871 Looking, similarly, at the implications of preemption on the basic 872 scenarios of bidirectional faults in multiple working paths. Both of 873 the operating modes, i.e. non-locking and locking, are presented. 874 The scenarios begin at the point where W2 traffic is being 875 transported on the protection path in a coordinated fashion, when a 876 SF is detected by both endpoints of the 1:n protection domain. W1 877 traffic has a higher priority than that of W2 traffic and, therefore, 878 will preempt the current protected traffic. 
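The priority rule that drives all of the preemption scenarios in this section can be sketched in a few lines of Python. This is only an illustration, not part of the protocol definition: the function name `should_preempt`, the `priority` table, and the convention that a smaller number means a higher priority are assumptions made for the example. The treatment of equal priorities is likewise an assumption, since the text only covers the strictly higher and strictly lower cases.

```python
from typing import Mapping


def should_preempt(current: int, candidate: int,
                   priority: Mapping[int, int]) -> bool:
    """Decide whether `candidate` may take over the protection path.

    current   -- index of the working path now carried on P (0 = none)
    candidate -- index of the working path that has just failed
    priority  -- hypothetical configuration table; a smaller value
                 means a higher priority (an assumption of this sketch)
    """
    if current == 0:
        # The protection path is idle, so there is nothing to preempt.
        return True
    # Only a strictly higher priority preempts; equal priorities are
    # not covered by the text and are assumed not to preempt.
    return priority[candidate] < priority[current]


# W1 outranks W2, as in the scenarios of this section.
PRIORITY = {1: 0, 2: 1}
print(should_preempt(current=2, candidate=1, priority=PRIORITY))  # True
print(should_preempt(current=1, candidate=2, priority=PRIORITY))  # False
```

With this table, a failure of W1 while W2 occupies the protection path triggers preemption, whereas a failure of W2 while W1 is protected does not.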
 880     The following presents the scenario in non-locking operation:

 882     +--------------------------------------------------------------------+
 883     |Time|             Event Description             |LER-A PSC|LER-Z PSC|
 884     |    |                                           | Bridge  | Bridge  |
 885     +----+-------------------------------------------+---------+---------+
 886     | t0 | Traffic from W2 is being transported on P | SF(2,2) | NR(0,2) |
 887     |    | and both endpoints are coordinated        | B = 2   | B = 2   |
 888     +----+-------------------------------------------+---------+---------+
 889     | t1 | LER-A detects SF on W1, bridges W1 into P | SF(1,1) | NR(0,2) |
 890     |    | and sends SF(1,1). LER-A enters into WFA  | B = 1   | B = 2   |
 891     |    | state and continues to select the         |         |         |
 892     |    | protected traffic from P that is for W2.  |         |         |
 893     +----+-------------------------------------------+---------+---------+
 894     | t2 | LER-Z detects the SF on W1. LER-Z enters  | SF(1,1) | SF(1,1) |
 895     |    | WFA state and bridges W1 into P and       | B = 1   | B = 1   |
 896     |    | transmits SF(1,1). At this point          |         |         |
 897     |    | traffic for W1 is protected in both       |         |         |
 898     |    | directions, however the endpoints are     |         |         |
 899     |    | still not coordinated                     |         |         |
 900     +----+-------------------------------------------+---------+---------+
 901     | t3 | LER-Z receives the SF(1,1) from LER-A and | SF(1,1) | SF(1,1) |
 902     |    | considers it an Ack and transits from WFA | B = 1   | B = 1   |
 903     |    | to PF:W:L state                           |         |         |
 904     +----+-------------------------------------------+---------+---------+
 905     | t4 | LER-A receives SF(1,1), which it takes as | SF(1,1) | SF(1,1) |
 906     |    | an Ack from LER-Z and transits from WFA   | B = 1   | B = 1   |
 907     |    | to PF:W:L state. Switch is complete.      |         |         |
 908     +----+-------------------------------------------+---------+---------+

 910              Figure 8: Preemption bidirectional non-locking

 912  3.3.3.4. Bidirectional locking

 914     When considering the locking mode of operation, we must consider that
 915     the protection path, P, must be cleared of all traffic during the
 916     transition of traffic caused by preemption. The bidirectional case
 917     will be similar to the scenario for a unidirectional fault with the
 918     major difference being the final state of the two endpoints. The
 919     following would be the sequence of events:

 921     +--------------------------------------------------------------------+
 922     |Time|             Event Description             |LER-A PSC|LER-Z PSC|
 923     |    |                                           | Bridge  | Bridge  |
 924     |    |                                           |Selector | Selector|
 925     +----+-------------------------------------------+---------+---------+
 926     | t0 | Traffic from W2 is being transported on P | SF(2,2) | NR(0,2) |
 927     |    | and both endpoints are coordinated        | B = 2   | B = 2   |
 928     |    |                                           | S = 2   | S = 2   |
 929     +----+-------------------------------------------+---------+---------+
 930     | t1 | LER-A detects SF on W1 and sends SF(1,0). | SF(1,0) | NR(0,2) |
 931     |    | LER-A enters into WFA, blocks all traffic | B = n/a | B = 2   |
 932     |    | on the protection path                    | S = n/a | S = 2   |
 933     +----+-------------------------------------------+---------+---------+
 934     | t2 | LER-Z detects the SF on W1. LER-Z enters  | SF(1,0) | SF(1,0) |
 935     |    | WFA state and blocks all traffic on the   | B = n/a | B = n/a |
 936     |    | protection path while transmitting SF(1,0)| S = n/a | S = n/a |
 937     +----+-------------------------------------------+---------+---------+
 938     | t3 | LER-Z receives the SF(1,0) from LER-A and | SF(1,0) | SF(1,1) |
 939     |    | bridges traffic from W1 to P (higher      | B = n/a | B = 1   |
 940     |    | priority). At this point W1 traffic is    | S = n/a | S = n/a |
 941     |    | flowing Z->A but not A->Z                 |         |         |
 942     +----+-------------------------------------------+---------+---------+
 943     | t4 | LER-A receives SF(1,1) from LER-Z and     | SF(1,1) | SF(1,1) |
 944     |    | considers it an Ack and transits from WFA | B = 1   | B = 1   |
 945     |    | to PF:W:L state                           | S = 1   | S = n/a |
 946     +----+-------------------------------------------+---------+---------+
 947     | t5 | LER-Z receives SF(1,1), and begins        | SF(1,1) | SF(1,1) |
 948     |    | selecting the protected traffic as W1 data| B = 1   | B = 1   |
 949     |    | Switch is complete.                       | S = 1   | S = 1   |
 950     +----+-------------------------------------------+---------+---------+

 952                Figure 9: Preemption bidirectional locking

 954  4. Changes to PSC

 956     The Protection State Coordination protocol (PSC) is defined in
 957     [LinProt]. This includes both the format of the G-ACh based message
 958     as well as a description of the operations and the state transition
 959     logic of the protocol. The extension to cover 1:n protection
 960     includes changes to both aspects of PSC.

 962     The changes to the message structure include both the addition of
 963     new information and extension of the semantics of some of the
 964     existing fields of the message. These changes will be described in
 965     Section 4.2.

 967     The changes relative to the behavior of the base PSC protocol will be
 968     described in Section 4.3.

 970  4.1. PSC

 972     Base PSC (as defined in [LinProt]) is a single-phased protocol, i.e.,
 973     the endpoints perform protection switching without waiting for
 974     acknowledgement from the far end LER. The protocol messages are
 975     transmitted using the G-ACh and the format is described in Figure 10.

 977      0                   1                   2                   3
 978      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
 979     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 980     |0 0 0 1|Version|   Reserved    |        PSC-CT = 0x0024        |
 981     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 982     |Ver|Request|PT |R|  Reserved1  |     FPath     |     Path      |
 983     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 984     |         TLV Length            |         Reserved2             |
 985     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 986     ~                         Optional TLVs                         ~
 987     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

 989         Figure 10: Format of basic PSC packet with a G-ACh header

 991     With regard to the G-ACh header, no changes are suggested in the
 992     extensions for 1:n protection, i.e., the channel type field will
 993     continue to use the PSC-CT value defined in [LinProt]. The PSC
 994     payload fields affected by this document are the Ver field, Reserved1
 995     field, and the FPath and Path fields.

 997  4.2. Changes to PSC Payload

 999     In order to support 1:n protection there is a need to make one small
1000     change to the format of the PSC payload (see Figure 11). In
1001     particular, we have added a new flag (L), taken from the Reserved1
1002     space, that is used to indicate whether the protection domain is
1003     operating in locking or non-locking mode. In addition, the semantics
1004     of the FPath and Path fields are adjusted to indicate an index of the
1005     multiple working paths. The details of these changes are supplied in
1006     the following subsections.

1008     Due to the significance of these changes, the value of the Ver field
1009     (in the PSC payload) for 1:n protection domain MUST be set to 2.
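As an illustration of the payload change, the following Python sketch packs and parses the first 32-bit word of the 1:n PSC payload. It assumes the bit positions drawn in Figure 11 (Ver in the two most significant bits, then Request, PT, R, the new L flag, six reserved bits, FPath, and Path) and network byte order. The function names `pack_psc_word` and `unpack_psc_word` are hypothetical, and the Request value used in the example is an arbitrary 4-bit number; the actual request codes are defined in [LinProt] and are not reproduced here.

```python
import struct


def pack_psc_word(ver: int, request: int, pt: int, r: int, l: int,
                  fpath: int, path: int) -> bytes:
    """Pack the first 32-bit word of the PSC payload (layout per Figure 11)."""
    word = (
        ((ver & 0x3) << 30)        # Ver:      bits 0-1
        | ((request & 0xF) << 26)  # Request:  bits 2-5
        | ((pt & 0x3) << 24)       # PT:       bits 6-7
        | ((r & 0x1) << 23)        # R:        bit 8
        | ((l & 0x1) << 22)        # L:        bit 9 (new locking flag)
        # Reserved1 (bits 10-15) is left as zero.
        | ((fpath & 0xFF) << 8)    # FPath:    bits 16-23
        | (path & 0xFF)            # Path:     bits 24-31
    )
    return struct.pack("!I", word)  # "!" selects network byte order


def unpack_psc_word(data: bytes) -> dict:
    """Parse the first 32-bit word of the PSC payload into named fields."""
    (word,) = struct.unpack("!I", data[:4])
    return {
        "ver": (word >> 30) & 0x3,
        "request": (word >> 26) & 0xF,
        "pt": (word >> 24) & 0x3,
        "r": (word >> 23) & 0x1,
        "l": (word >> 22) & 0x1,
        "fpath": (word >> 8) & 0xFF,
        "path": word & 0xFF,
    }


# Ver = 2 as required for 1:n; locking mode; fault on working path 1;
# nothing carried on the protection path yet. Request 10 is arbitrary.
raw = pack_psc_word(ver=2, request=10, pt=0, r=0, l=1, fpath=1, path=0)
print(raw.hex())             # a8400100
print(unpack_psc_word(raw))
```

Keeping L adjacent to R means a receiver that only understands the base format would see the flag as part of Reserved1, which is one reason the Ver field is changed for 1:n domains.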
1011      0                   1                   2                   3
1012      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
1013     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1014     |Ver|Request|PT |R|L| Reserved1 |     FPath     |     Path      |
1015     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1016     |         TLV Length            |         Reserved2             |
1017     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1018     ~                         Optional TLVs                         ~
1019     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

1021               Figure 11: Format of 1:n PSC message payload

1023  4.2.1. Locking (L) flag

1025     The Locking flag is used to indicate that the endpoint is configured
1026     for Locking mode (see Section 1.2).

1028     If the value is 1, the protection domain is operating in locking
1029     mode; otherwise, it is operating in non-locking mode.

1031     The Locking flag must be the same on both ends; if the two endpoints
1032     of a protection domain have different L-flag settings, an error MUST
1033     be raised to the network operator.

1035  4.2.2. Fault path (FPath) field

1037     The FPath field indicates which path is identified to be in a fault
1038     condition or affected by an administrative command. The following
1039     are the possible values:

1041     o  0: indicates that the anomaly condition is on the protection path.

1043     o  1-128: indicates that the anomaly condition is on a working path
1044        whose index is indicated.

1046     o  129-255: for future extensions or experimental use.

1048  4.2.3. Data path (Path) field

1050     The Path field indicates which data is being transmitted on the
1051     protection path. Under normal conditions, the protection path does
1052     not need to carry any user data traffic, but may carry extra traffic.
1053     If there is a failure/degrade condition on one of the working paths,
1054     then that working path's data traffic will be transmitted over the
1055     protection path. The following are the possible values:

1057     o  0: indicates that the protection path is not transporting user
1058        data traffic.

1060     o  1-128: indicates that the protection path is transmitting user
1061        traffic replacing the use of the working path indexed.

1063     o  129-255: for future extensions or experimental use.

1065  4.3. Changes to PSC Operation

1067     In all of the following subsections, assume a protection domain
1068     between LER-A and LER-Z, using working paths 1-N and the protection
1069     path as shown in Figure 1.

1071     A basic premise of this protection architecture is that both
1072     endpoints of the protection domain MUST be configured to associate
1073     the indices of the working paths with the proper LSP identifiers. If
1074     this condition is not met then the protection scheme will cause
1075     inconsistencies in traffic transmission.

1077  4.3.1. Basic operation

1079     Protection of the N working paths is based on the operational
1080     principles outlined in [LinProt] and will employ the same basic
1081     Protection State Coordination Protocol (PSC) outlined in that
1082     document. However, as can be expected, due to certain basic
1083     differences in the architecture of the protection domain, a small set
1084     of differences in operation is necessary. The following
1085     subsections will highlight these differences and explain their effects
1086     on the PSC state machine.

1088  4.3.2. Two-phased operation

1090     PSC, as presented in [LinProt], is a single-phased protocol. This
1091     means that when an endpoint receives a trigger to perform a
1092     protection switch, the LER switches traffic and then notifies the far
1093     end of the switch, without waiting for acknowledgement. When
1094     addressing the situation in a 1:n protection domain, the endpoint
1095     that receives the trigger must first verify that the protection path
1096     is available to transmit the protected traffic. This may involve
1097     interrupting the traffic that is currently being transmitted on the
1098     protection path by both endpoints.

1100     In general, after the LER has detected a trigger for protection
1101     switching, e.g., an FS operator command or an SF indication for one of
1102     the working paths, the LER SHALL transmit the appropriate PSC message
1103     as described in [LinProt] with the following changes:

1105     o  If the protection domain is currently in either Protecting
1106        administrative or Protecting failure state, then the endpoint
1107        SHALL verify that the new trigger has a higher priority than the
1108        currently protected traffic. If the new trigger has a lower
1109        priority then it MUST be ignored.

1111     o  The PSC message SHALL set the FPath value to the index of the
1112        working path that generated the trigger. The Path value SHOULD be
1113        set to 0, unless the protection path was previously transporting
1114        traffic from another working path (as indicated by the value of
1115        the Path field).

1117     o  If the protection path is currently transporting protected traffic
1118        and the protection domain is operating in locking mode, then the
1119        endpoint SHALL block all traffic of the protected working path.

1121     o  The endpoint SHALL transit to WFA state (see below).

1123     o  Upon reception of the switching PSC message, the far end LER SHALL
1124        verify that the received request is of higher priority than the
1125        known current traffic on the protection path, and if so SHALL
1126        interrupt the current traffic on the protection path, perform the
1127        switch to the requested protected traffic, and send a PSC message
1128        with the Path field set to the index of the current protected
1129        working path.

1131     o  Upon reception of the PSC message, the initiating LER SHALL verify
1132        that the Path field is set to the index of the working path of the
1133        highest priority. If the Path field matches the highest priority
1134        path, the LER SHALL perform the protection switch and transmit the
1135        appropriate PSC message, with the FPath field indicating the index
1136        of the working path that triggered the protection switch and the
1137        Path field set to the index of the working path whose traffic is
1138        being transported on the protection path.

1140  4.3.3. Acknowledge message

1142     As stated above, the endpoint that detected a switching trigger MUST
1143     wait for an Acknowledge message before performing the protection
1144     switch. There are two types of message that will be considered as
1145     an Acknowledge message:

1147     1. A reply message with the Request field reflecting the state of
1148        the far end, and the Path field set to the index of the working
1149        path that triggered the switching condition. For example, if
1150        there is a Forced Switch command detected by LER-Z on working
1151        path W4, then LER-Z will have sent an FS(4,0) message to LER-A.
1152        Then when LER-Z receives a message such as NR(0,4)Ack this should
1153        be considered acknowledgement of the switching and that the
1154        protection path is available to switch the traffic from working
1155        path W4.

1157     2. A remote message with the same Request field and FPath field as
1158        that transmitted by the LER in the WFA state. For example, if
1159        there is a bidirectional Signal Fail detected by LER-A on
1160        working path W4, then LER-A will enter WFA state and transmit an
1161        SF(4,0) message. When it receives the SF(4,0) message from
1162        LER-Z, which has also detected the SF condition, it should be
1163        considered an acknowledgement of the switching and that the
1164        protection path is available to switch the traffic from working
1165        path W4.

1167  4.3.4. Wait for Acknowledge (WFA) timer

1169     The protection system MUST include a timer called the Wait for
1170     Acknowledge (WFA) timer that SHALL be started when the LER enters WFA
1171     state and reset when the Acknowledge message is received. The length
1172     of the WFA timer SHOULD be configured to allow protection switching
1173     within the normal time constraints. The WFA timer will expire only
1174     if no Acknowledge message was received by the LER in WFA state. The
1175     WFA Expires local input should have a priority just below that of the
1176     WTRExpires signal.

1178  4.3.5. Additional PSC State

1180     As described above and demonstrated in the scenarios in Section 3.3,
1181     there is a need, in some scenarios, for the endpoint that is
1182     reporting a trigger for protection-switching to delay the actual
1183     switch-over until an acknowledgement is received from the far end LER.
1184     In order to facilitate this wait period it is necessary to define a
1185     new PSC State - Wait for Acknowledge (WFA) state. WFA is used in
1186     both the Locking and Non-Locking cases. It is more essential to the
1187     Locking mode of operation, as agreement is the mechanism to establish
1188     and release the lock on the protection LSP. However, it is necessary
1189     for the Non-Locking mode as a persistent disagreement on the contents
1190     of the protection LSP indicates an error in the network devices and
1191     WFA is the method used to detect this error.

1193     In the locking mode, WFA comes into play when a failed LSP preempts
1194     another LSP. This is highlighted in the scenarios presented in
1195     Figure 7 and Figure 9.

1197     When a working path is preempted, the protection domain must
1198     transition the contents of the protecting path from the preempted
1199     working path to the preempting working path. In the locking case,
1200     the protecting path must temporarily be blocked (that is, nothing is
1201     being protected) in order to ensure that there is no misconnectivity.
1202     In the case where W1 preempts W2, the contents of the protection path
1203     transition from transporting W2 traffic to not carrying any traffic
1204     before beginning to transport W1 traffic.

1206     The following sub-section will describe the actions to be taken when
1207     an LER is in the WFA state.

1209  4.3.5.1. Wait for Acknowledge (WFA) State

1211     An LER will enter the Wait for Acknowledge state before transitioning
1212     into a protection state, i.e., either Protecting administrative or
1213     Protecting failure state. The LER SHALL remain in this state until
1214     either receiving an Acknowledge message, or until a WFA timer
1215     expires. Normally, the Acknowledge message will be a remote PSC
1216     input. The following describes how the LER, in WFA state, should
1217     react to a new local input:

1219     o  A local Clear SHALL cause the LER to go into Normal state if the
1220        LER is in WFA state due to either a FS or MS trigger and transmit
1221        an NR(0,0) PSC message. If the LER is in WFA state due to a SF
1222        trigger then the local Clear SHALL be ignored.

1224     o  A local LO SHALL cause the LER to go into Unavailable state and
1225        begin to transmit LO(x, 0) [where x indicates the index of the
1226        working path that triggered the WFA state].

1228     o  A local FS SHALL cause the LER to remain in WFA state and transmit
1229        the FS(x, 0) message [where x indicates the index of the protected
1230        working path]. If the LER is in WFA state due to a FS from a
1231        different working path, then the working path with the higher
1232        priority SHALL be the protected working path. If the LER is in
1233        WFA state due to any other switching trigger, then the working
1234        path that is identified in this FS will be the protected working
1235        path.

1237     o  A local SF SHALL cause the LER to remain in WFA state. If the LER
1238        is in WFA state due to an existing FS trigger, then ignore the
1239        local SF and continue to transmit the FS(x, 0) PSC message. If
1240        the LER is in WFA state due to an existing SF trigger then
1241        transmit the SF(x, 0) PSC message [where x indicates the index of
1242        the protected working path, i.e., the highest priority working path
1243        indicating an SF condition]. If the LER is in WFA state due to
1244        any other trigger, then begin transmitting a SF(x, 0) PSC message
1245        [where x indicates the index of the working path that is
1246        generating the SF condition].

1248     o  A local ClearSF indication where the working path is the same as
1249        the path that triggered the LER into WFA state SHALL cause the LER
1250        to go into WTR state (note: 1:N protection is always revertive)
1251        and to transmit the WTR(0, 0) message. If the ClearSF indicates a
1252        different index from the protected working path or indicates the
1253        protection path then the indication SHALL be ignored.

1255     o  A local MS operator command SHALL cause the LER to remain in WFA
1256        state. If the LER is in WFA state due to an existing MS trigger,
1257        then the node continues to transmit MS(x, 0) messages [where x
1258        indicates the index of the protected working path, i.e., the
1259        highest priority working path indicating the MS condition]. If
1260        the LER is in WFA state due to any other trigger, ignore the MS
1261        command and continue transmitting the current message.

1263     o  If the WFA timer expires, i.e., the LER did not receive the
1264        Acknowledge message from the far end in a timely manner, then the
1265        LER SHALL go to Unavailable state, i.e., it assumes that there is a
1266        problem on the protection path (where all PSC traffic is
1267        transmitted) and send an error notification to the management
1268        system. The LER SHALL continue transmitting the current PSC
1269        message with Path field set to 0.

1271     o  All other local indications SHALL be ignored.
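The local-input rules listed above can be read as a small transition function. The Python sketch below is one possible reading of those bullets and is not a normative state machine: the names `Wfa`, `higher`, and `on_local_input`, the event strings, the tuple representation of PSC messages, and the priority table (smaller value means higher priority) are all assumptions made for illustration. Side effects such as the error notification on timer expiry are omitted.

```python
from dataclasses import dataclass
from typing import Mapping, Tuple

Message = Tuple[str, int, int]  # (Request, FPath, Path)


@dataclass(frozen=True)
class Wfa:
    trigger: str  # "FS", "SF" or "MS": the request that caused WFA
    path: int     # index of the protected working path


def higher(a: int, b: int, prio: Mapping[int, int]) -> int:
    """Return the index with the higher priority (ties keep `a`)."""
    return b if prio[b] < prio[a] else a


def on_local_input(st: Wfa, event: str, path: int,
                   prio: Mapping[int, int]) -> Tuple[str, Wfa, Message]:
    """Return (next state, updated context, PSC message to transmit)."""
    current: Message = (st.trigger, st.path, 0)
    if event == "Clear":
        if st.trigger in ("FS", "MS"):
            return "Normal", st, ("NR", 0, 0)
        return "WFA", st, current            # SF trigger: Clear is ignored
    if event == "LO":
        return "Unavailable", st, ("LO", st.path, 0)
    if event == "FS":
        # Competing FS requests: the higher-priority path wins.
        x = higher(st.path, path, prio) if st.trigger == "FS" else path
        return "WFA", Wfa("FS", x), ("FS", x, 0)
    if event == "SF":
        if st.trigger == "FS":
            return "WFA", st, current        # FS outranks a local SF
        x = higher(st.path, path, prio) if st.trigger == "SF" else path
        return "WFA", Wfa("SF", x), ("SF", x, 0)
    if event == "ClearSF":
        if path == st.path:
            return "WTR", st, ("WTR", 0, 0)
        return "WFA", st, current
    if event == "MS":
        if st.trigger == "MS":
            x = higher(st.path, path, prio)
            return "WFA", Wfa("MS", x), ("MS", x, 0)
        return "WFA", st, current            # MS ignored under FS or SF
    if event == "WFAExpires":
        # Keep sending the current request with the Path field set to 0.
        return "Unavailable", st, current
    return "WFA", st, current                # all other inputs are ignored


prio = {1: 0, 2: 1, 4: 2}
print(on_local_input(Wfa("SF", 2), "SF", 1, prio))
# ('WFA', Wfa(trigger='SF', path=1), ('SF', 1, 0))
```

Returning the updated context alongside the state name keeps the function pure, which makes each bullet easy to check in isolation.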
1273     The following details the reactions of the LER in WFA state to remote
1274     messages:

1276     o  Any remote message with the Acknowledge flag set to 1 and the Path
1277        field set to the index of the protected working path SHALL cause
1278        the LER to change state. If the trigger was either FS or MS
1279        command, the LER enters Protecting administrative state. The LER
1280        transmits the appropriate message according to the trigger (i.e.,
1281        FS(x,x) for FS command and MS(x,x) for the MS command). If the
1282        trigger was a SF condition, then the LER enters the Protecting
1283        failure state and begins to transmit the appropriate SF(x, x)
1284        message. A remote message with the Acknowledge flag set to 1 but
1285        where the Path field does not match, according to the description
1286        above, SHALL be ignored.

1288     o  A remote LO message SHALL cause the LER to go into Unavailable
1289        state and transmit the appropriate message for the trigger that
1290        caused the WFA state.

1292     o  A remote FS message indicating the same working path as the local
1293        FS command that triggered the WFA state SHALL be considered an
1294        Acknowledge message, even if the Acknowledge flag is not set. The
1295        LER SHALL perform the protection switch, and begin transmitting
1296        the FS(x, x) message [where x indicates the index of the protected
1297        working path]. If the remote FS message indicates a different
1298        index than the one indicated in the local FS and if the remote FS
1299        message indicates a lower priority working path than the working
1300        path in the local FS trigger then the LER SHALL ignore the remote
1301        FS message and remain in WFA state. If the remote FS message
1302        indicates an index of higher priority or the LER is in WFA state
1303        as a result of a SF or MS trigger, then the LER SHALL perform the
1304        protection switch for the protected working path indicated by the
1305        remote FS message, and SHALL go to Protecting administrative state
1306        and transmit the appropriate message for the local trigger with
1307        the Path field set to the index of the remote message and the
1308        Acknowledge flag set to 1.

1310     o  A remote SF message indicating an error on the protection path
1311        SHALL cause the LER to go into Unavailable state and transmit the
1312        appropriate message for the trigger that caused the WFA state.

1314     o  A remote SF message indicating an error on the same working path
1315        as the local SF condition that triggered the WFA state SHALL be
1316        considered an Acknowledge message (even if the Acknowledge flag is
1317        not set). The LER SHALL perform the protection switch, go to
1318        Protecting failure state and transmit the SF(x, x) message [where
1319        x is the index of the protected working path]. If the remote SF
1320        message indicates a different index than the one indicated in the
1321        local SF, then if the local command indicates a higher priority
1322        working path the LER SHALL ignore the remote SF message and remain
1323        in WFA state. If the remote SF message indicates an index of
1324        higher priority or the LER is in WFA state as a result of a MS
1325        trigger, then the LER SHALL perform the protection switch for the
1326        protected working path indicated by the remote SF message, and
1327        SHALL go to Protecting failure state and transmit the appropriate
1328        message for the local trigger with the Path field set to the index
1329        of the remote message and the Acknowledge flag set to 1. If the
1330        LER is in WFA state due to a local FS command, then it SHALL
1331        ignore the remote message and remain in WFA state.
1333     o  A remote MS message indicating an error on the same working path
1334        as the local MS that triggered the WFA state SHALL be considered
1335        an Acknowledge message (even if the Acknowledge flag is not set).
1336        The LER SHALL perform the protection switch, go to Protecting
1337        administrative state and transmit the MS(x, x) message [where x is
1338        the index of the protected working path]. If the remote MS
1339        message indicates a different index than the one indicated in the
1340        local MS, then if the local command indicates a higher priority
1341        working path or the LER is in WFA due to either a FS or SF
1342        trigger, the LER SHALL ignore the remote MS message and remain in
1343        WFA state. If the remote MS message indicates an index of higher
1344        priority, then the LER SHALL perform the protection switch for the
1345        protected working path indicated by the remote MS message, and
1346        SHALL go to Protecting administrative state and transmit an NR(0,
1347        y) with the Path field set to the index of the remote message and
1348        the Acknowledge flag set to 1.

1350     o  All other remote messages SHOULD be ignored.

1352  5. IANA Considerations

1354     This document does not include any required IANA considerations.

1356  6. Security Considerations

1358     The generic security considerations for the data-plane of MPLS-TP are
1359     described in the security framework document [SecureFwk] together
1360     with the required mechanisms needed to address them. The security
1361     considerations for the generic associated control channel are
1362     described in [RFC5586]. The security considerations for protection
1363     and recovery aspects of MPLS-TP are addressed in [SurvivFwk].

1365     The extensions to the protocol described in this document are
1366     extensions to the protocol defined in [LinProt] and do not
1367     introduce any new security risks.

1369  7. Acknowledgements

1371     The authors would like to thank everyone involved in the definition
1372     and specification of protection mechanisms for MPLS Transport Profile
1373     (MPLS-TP).

1375  8. References

1377  8.1. Normative References

1379     [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
1380                Requirement Levels", BCP 14, RFC 2119, March 1997.

1382     [TPReq]    Niven-Jenkins, B., Brungard, D., Betts, M., Sprecher, N.,
1383                and S. Ueno, "Requirements of an MPLS Transport Profile",
1384                RFC 5654, September 2009.

1386     [LinProt]  Bryant, S., Sprecher, N., Osborne, E., Fulignoli, A., and
1387                Y. Weingarten, "Multi-protocol Label Switching Transport
1388                Profile Linear Protection", RFC 6378, October 2011.

1390  8.2. Informative References

1392     [RFC5586]  Vigoureux, M., Bocci, M., Swallow, G., Aggarwal, R., and
1393                D. Ward, "MPLS Generic Associated Channel", RFC 5586,
1394                June 2009.

1396     [RFC4427]  Mannie, E. and D. Papadimitriou, "Recovery Terminology for
1397                Generalized Multi-Protocol Label Switching", RFC 4427,
1398                March 2006.

1400     [RFC3031]  Rosen, E., Viswanathan, A., and R. Callon,
1401                "Multiprotocol Label Switching Architecture", RFC 3031,
1402                January 2001.

1404     [SurvivFwk]
1405                Sprecher, N., Farrel, A., and H. Shah, "Multi-protocol
1406                Label Switching Transport Profile Survivability
1407                Framework", RFC 6372, September 2011.

1409     [SecureFwk]
1410                Fang, L., Niven-Jenkins, B., Mansfield, S., Zhang, R.,
1411                Bitar, N., Daikoku, M., and L. Wang, "MPLS-TP Security
1412                Framework",
1413                ID draft-ietf-mpls-tp-security-framework-07.txt, Jan 2013.

1415  Appendix A. PSC state machine tables

1417     Note/Disclaimer: This state machine is not currently in sync with the
1418     text of the document and will be updated in a future revision.

1420     The full PSC state machine is described in [LinProt], both in textual
1421     and tabular form. This appendix highlights the changes to the basic
1422     PSC state machine. In the event of a mismatch between these tables
1423     and the text either in [LinProt] or in this document, the text is
1424     authoritative. Note that this appendix is intended to be a
1425     functional description, not an implementation specification.

1427     The tables here use the same format and state descriptions used in
1428     the Linear Protection document with the addition of the WFA state,
1429     WFA Expires, and the changes in the behavior that are noted.

1431     Each state corresponds to the transmission of a particular set of
1432     Request, FPath and Path bits. The table below lists the message that
1433     is generally sent in each particular state. If the message to be
1434     sent in a particular state deviates from the table below, it is noted
1435     in the footnotes to the state-machine table.

1437        State     REQ(FP,P)
1438        -------   ---------
1439        N         NR(0,0)
1440        UA:LO:L   LO(0,0)
1441        UA:P:L    SF(0,0)
1442        UA:LO:R   NR(0,0)
1443        UA:P:R    NR(0,0)
1444        PF:W:L    SF(1,1)
1445        PF:W:R    NR(0,1)
1446        PA:F:L    FS(1,1)
1447        PA:M:L    MS(1,1)
1448        PA:F:R    NR(0,1)
1449        PA:M:R    NR(0,1)
1450        WTR       WTR(0,1)
1451        DNR       DNR(0,1)

1453     The top row in each table is the list of possible inputs. The local
1454     inputs are:

1456        NR      No Request
1457        OC      Operator Clear
1458        LO      Lockout of protection
1459        SF-P    Signal Fail on protection path
1460        SF-W    Signal Fail on working path
1461        FS      Forced Switch
1462        SFc     Clear Signal Fail
1463        MS      Manual Switch
1464        WTRExp  WTR Expired

1466     and the remote inputs are:

1468        LO      remote LO message
1469        SF-P    remote SF message indicating protection path
1470        SF-W    remote SF message indicating working path
1471        FS      remote FS message
1472        MS      remote MS message
1473        WTR     remote WTR message
1474        DNR     remote DNR message
1475        NR      remote NR message

1477     Section 4.3.3 refers to some states as 'remote' and some as 'local'.
1478     By definition, all states listed in the table of local sources are
1479     local states, and all states listed in the table of remote sources
1480     are remote states. For example, section 4.3.3.1 says "A local
1481     Lockout of protection input SHALL cause the LER to go into local
1482     Unavailable State". As the trigger for this state change is a local
1483     one, 'local Unavailable State' is by definition displayed in the
1484     table of local sources. Similarly, "A remote Lockout of protection
1485     message SHALL cause the LER to go into remote Unavailable state"
1486     means that the state represented in the Unavailable rows in the table
1487     of remote sources is by definition a remote Unavailable state.

1489     Each cell in the table below contains either a state, a footnote, or
1490     the letter 'i'. 'i' stands for Ignore, and is an indication to
1491     continue with the current behavior. See section 4.3.3. The
1492     footnotes are listed below the table.

1494     Part 1: Local input state machine

1496             | OC  | LO    | SF-P | FS   | SF-W | SFc  | MS   | WTRExp
1497     --------+-----+-------+------+------+------+------+------+-------
1498     N       | i   |UA:LO:L|UA:P:L|PA:F:L|PF:W:L| i    |PA:M:L| i
1499     UA:LO:L | N   | i     | i    | i    | i    | i    | i    | i
1500     UA:P:L  | i   |UA:LO:L| i    | i    | i    | [5]  | i    | i
1501     UA:LO:R | i   |UA:LO:L| [1]  | i    | [2]  | [6]  | i    | i
1502     UA:P:R  | i   |UA:LO:L|UA:P:L| i    | [3]  | [6]  | i    | i
1503     PF:W:L  | i   |UA:LO:L|UA:P:L|PA:F:L| i    | [7]  | i    | i
1504     PF:W:R  | i   |UA:LO:L|UA:P:L|PA:F:L|PF:W:L| i    | i    | i
1505     PA:F:L  | N   |UA:LO:L|UA:P:L| i    | i    | i    | i    | i
1506     PA:M:L  | N   |UA:LO:L|UA:P:L|PA:F:L|PF:W:L| i    | i    | i
1507     PA:F:R  | i   |UA:LO:L|UA:P:L|PA:F:L| [4]  | [8]  | i    | i
1508     PA:M:R  | i   |UA:LO:L|UA:P:L|PA:F:L|PF:W:L| i    |PA:M:L| i
1509     WTR     | i   |UA:LO:L|UA:P:L|PA:F:L|PF:W:L| i    |PA:M:L| [9]
1510     DNR     | i   |UA:LO:L|UA:P:L|PA:F:L|PF:W:L| i    |PA:M:L| i

1512     Part 2: Remote messages state machine

1514             | LO    | SF-P | FS   | SF-W | MS   | WTR  | DNR  | NR
1515     --------+-------+------+------+------+------+------+------+------
1516     N       |UA:LO:R|UA:P:R|PA:F:R|PF:W:R|PA:M:R| i    | i    | i
1517     UA:LO:L | i     | i    | i    | i    | i    | i    | i    | i
1518     UA:P:L  | [10]  | i    | i    | i    | i    | i    | i    | i
1519     UA:LO:R | i     |
i | i | i | i | i | i | [16] 1520 UA:P:R |UA:LO:R| i | i | i | i | i | i | [16] 1521 PF:W:L | [11] | [12] |PA:F:R| i | i | i | i | i 1522 PF:W:R |UA:LO:R|UA:P:R|PA:F:R| i | i | [14] | [15] | N 1523 PA:F:L |UA:LO:R|UA:P:R| i | i | i | i | i | i 1524 PA:M:L |UA:LO:R|UA:P:R|PA:F:R| [13] | i | i | i | i 1525 PA:F:R |UA:LO:R|UA:P:R| i | i | i | i | i | [17] 1526 PA:M:R |UA:LO:R|UA:P:R|PA:F:R| [13] | i | i | i | N 1527 WTR |UA:LO:R|UA:P:R|PA:F:R|PF:W:R|PA:M:R| i | i | [18] 1528 DNR |UA:LO:R|UA:P:R|PA:F:R|PF:W:R|PA:M:R| i | i | i 1530 The following are the footnotes for the table: 1532 [1] Remain in the current state (UA:LO:R) and transmit SF(0,0) 1534 [2] Remain in the current state (UA:LO:R) and transmit SF(1,0) 1536 [3] Remain in the current state (UA:P:R) and transmit SF(1,0) 1538 [4] Remain in the current state (PA:F:R) and transmit SF(1,1) 1540 [5] If the SF being cleared is SF-P, Transition to N. If it's SF-W, 1541 ignore the clear. 1543 [6] Remain in current state (UA:x:R), if the SFc corresponds to a 1544 previous SF then begin transmitting NR(0,0). 1546 [7] If domain configured for revertive behavior transition to WTR, 1547 else transition to DNR 1549 [8] Remain in PA:F:R and transmit NR(0,1) 1551 [9] Remain in WTR, send NR(0,1) 1553 [10] Transition to UA:LO:R continue sending SF(0,0) 1555 [11] Transition to UA:LO:R and send SF(1,0) 1557 [12] Transition to UA and send SF(1,0) 1559 [13] Transition to PF:W:R and send NR(0,1) 1561 [14] Transition to WTR state and continue to send the current 1562 message. 1564 [15] Transition to DNR state and continue to send the current 1565 message. 1567 [16] If the local input is SF-P then transition to UA:P:L. If the 1568 local input is SF-W then transition to PF:W:L. Else - transition to N 1569 state and continue to send the current message. 1571 [17] If the local input is SF-W then transition to PF:W:L. Else - 1572 transition to N state and continue to send the current message. 
1574    [18] If the receiving LER's WTR timer is running, maintain current
1575    state and message.  If the WTR timer is stopped, transition to N.

1577 Authors' Addresses

1579    Eric Osborne
1580    Cisco
1581    United States

1583    Email: eosborne@cisco.com

1585    Fei Zhang
1586    ZTE
1587    China

1589    Email: zhang.fei3@zte.com.cn

1591    Yaacov Weingarten
1592    34 Hagefen St
1593    Karnei Shomron, 4485500
1594    Israel

1596    Email: wyaacov@gmail.com
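
Illustrative note on Appendix A (not part of the specification): the
Part 1 table can be read as a lookup keyed by the current state and
the local input.  The sketch below, in Python, shows one possible way
to do that lookup.  The names LOCAL_INPUTS, LOCAL_TABLE, parse_table
and lookup_local are hypothetical and do not appear in this document
or in [LinProt].  The sketch covers only the states tabulated in
Part 1 (so the WFA state is not included), and it does not carry out
footnote actions; it only reports which footnote applies.

```python
# Illustrative sketch only; all names here are hypothetical.
# Cells are copied from "Part 1: Local input state machine".
LOCAL_INPUTS = ["OC", "LO", "SF-P", "FS", "SF-W", "SFc", "MS", "WTRExp"]

LOCAL_TABLE = """
N       | i   |UA:LO:L|UA:P:L|PA:F:L|PF:W:L| i    |PA:M:L| i
UA:LO:L | N   | i     | i    | i    | i    | i    | i    | i
UA:P:L  | i   |UA:LO:L| i    | i    | i    | [5]  | i    | i
UA:LO:R | i   |UA:LO:L| [1]  | i    | [2]  | [6]  | i    | i
UA:P:R  | i   |UA:LO:L|UA:P:L| i    | [3]  | [6]  | i    | i
PF:W:L  | i   |UA:LO:L|UA:P:L|PA:F:L| i    | [7]  | i    | i
PF:W:R  | i   |UA:LO:L|UA:P:L|PA:F:L|PF:W:L| i    | i    | i
PA:F:L  | N   |UA:LO:L|UA:P:L| i    | i    | i    | i    | i
PA:M:L  | N   |UA:LO:L|UA:P:L|PA:F:L|PF:W:L| i    | i    | i
PA:F:R  | i   |UA:LO:L|UA:P:L|PA:F:L| [4]  | [8]  | i    | i
PA:M:R  | i   |UA:LO:L|UA:P:L|PA:F:L|PF:W:L| i    |PA:M:L| i
WTR     | i   |UA:LO:L|UA:P:L|PA:F:L|PF:W:L| i    |PA:M:L| [9]
DNR     | i   |UA:LO:L|UA:P:L|PA:F:L|PF:W:L| i    |PA:M:L| i
"""


def parse_table(text, inputs):
    """Turn the ASCII rows into {state: {input: cell}}."""
    table = {}
    for line in text.strip().splitlines():
        cells = [cell.strip() for cell in line.split("|")]
        state, entries = cells[0], cells[1:]
        if len(entries) != len(inputs):
            raise ValueError("row %r has %d cells" % (state, len(entries)))
        table[state] = dict(zip(inputs, entries))
    return table


LOCAL_STATE_MACHINE = parse_table(LOCAL_TABLE, LOCAL_INPUTS)


def lookup_local(state, local_input, table=LOCAL_STATE_MACHINE):
    """Classify the table cell for (state, local_input).

    Returns one of:
      ("stay", state)      cell is 'i' (Ignore)
      ("goto", new_state)  cell names a state
      ("footnote", n)      cell is '[n]'; the footnote text decides
    """
    cell = table[state][local_input]
    if cell == "i":
        return ("stay", state)
    if cell.startswith("[") and cell.endswith("]"):
        return ("footnote", int(cell[1:-1]))
    return ("goto", cell)


if __name__ == "__main__":
    print(lookup_local("N", "FS"))
    print(lookup_local("WTR", "WTRExp"))
```

Keeping the rows as text in the same layout as the appendix makes the
sketch easy to compare against the table by eye.  As stated in the
appendix, the text of the document remains authoritative if the two
disagree.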