Network Working Group                                   A. Csaszar (Ed.)
Internet Draft                                                 G. Enyedi
Intended status: Standards Track                             J. Tantsura
Expires: December 6, 2012                                        S. Kini
                                                                Ericsson

                                                                J. Sucec
                                                                  S. Das
                                                               Telcordia

                                                            June 6, 2012


                 IP Fast Re-Route with Fast Notification
                      draft-csaszar-ipfrr-fn-03.txt

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.  It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

   This Internet-Draft will expire on December 6, 2012.

Copyright Notice

   Copyright (c) 2012 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document.

Abstract

   This document describes the benefits and main applications of
   sending explicit fast notification (FN) packets to routers in an
   area.
   FN packets are generated and processed in the dataplane, and a
   single FN service can replace existing OAM methods for remote
   failure detection, such as a full mesh of multi-hop BFD sessions.
   The FN service, therefore, decreases network overhead considerably.
   The main application is fast reroute in pure IP and in IP/LDP-MPLS
   networks, called IPFRR-FN.  The detour paths used when IPFRR-FN is
   active are in most cases identical to those used after Interior
   Gateway Protocol (IGP) convergence.  The proposed mechanism can
   address all single link, node, and SRLG failures in an area;
   moreover, it is an efficient solution to protect against BGP ASBR
   failures as well as VPN PE router failures.  IPFRR-FN can be a
   supplemental tool to provide FRR when LFA cannot repair a failure
   case, and it can replace existing ASBR/PE protection mechanisms by
   overcoming their scalability and complexity issues.

Table of Contents

   1. Introduction
   2. Overview of current IPFRR Proposals based on Local Repair
   3. Requirements of an Explicit Failure Signaling Mechanism
   4. Conceptual Operation of IPFRR relying on Fast Notification
      4.1. Preparation Phase
      4.2. Failure Reaction Phase
         4.2.1. Activating Failure Specific Backups
         4.2.2. SRLG Handling
      4.3. Example and Timing
      4.4. Scoping FN Messages with TTL
   5. Operation Details
      5.1. Transport of Fast Notification Messages
      5.2. Message Handling and Encoding
         5.2.1. Failure Identification Message for OSPF
         5.2.2. Failure Identification Message for ISIS
      5.3. Protecting External Prefixes
         5.3.1. Failure on the Intra-Area Path Leading to the ASBR
         5.3.2. Protecting ASBR Failures: BGP-FRR
            5.3.2.1. Primary and Backup ASBR in the Same Area
            5.3.2.2. Primary and Backup ASBR in Different Areas
      5.4. Application to LDP
      5.5. Application to VPN PE Protection
      5.6. Bypassing Legacy Nodes
      5.7. Capability Advertisement
      5.8. Constraining the Dissemination Scope of Fast Notification
           Packets
         5.8.1. Pre-Configured FN TTL Setting
         5.8.2. Advanced FN Scoping
   6. Protection against Replay Attacks
      6.1. Calculating LSDB Digest
   7. Security Considerations
   8. IANA Considerations
   9. References
      9.1. Normative References
      9.2. Informative References
   10. Acknowledgments
   Appendix A. Memory Needs of a Naive Implementation
      A.1. An Example Implementation
      A.2. Estimation of Memory Requirements
      A.3. Estimation of Failover Time
   Appendix B. Impact Scope of Fast Notification
1. Introduction

   Convergence of link-state IGPs, such as OSPF or IS-IS, after a link
   or node failure is known to be relatively slow.  While this may be
   sufficient for many applications, some network SLAs and applications
   require faster reaction to network failures.

   IGP convergence time is composed mainly of:

   1. Failure detection at nodes adjacent to the failure

   2. Advertisement of the topology change

   3. Calculation of new routes

   4. Installing new routes to linecards

   Traditional Hello-based failure detection methods of link-state IGPs
   are relatively slow; hence a new, optimized Hello protocol has been
   standardized [BFD], which can reduce failure detection times to the
   range of 10ms even if no lower layer notices the failure quickly
   (like loss of signal, etc.).

   Even with fast failure detection, reaction times of IGPs may take
   several seconds, and even with a tuned configuration they may take
   at least a couple of hundred milliseconds.

   To decrease fail-over time even further, IPFRR techniques [RFC5714]
   can be introduced.  IPFRR solutions compliant with [RFC5714] target
   fail-over time reduction in steps 2-4 with the following design
   principles:

         IGP                              IPFRR

   2. Advertisement of the      ==>  No explicit advertisement,
      topology change                only local repair

   3. Calculation of new routes ==>  Pre-computation of new
                                     routes

   4. Installing new routes     ==>  Pre-installation of backup
      to linecards                   routes

   Pre-computing means that the way of bypassing a failed resource is
   computed before any failure occurs.  In order to limit complexity,
   IPFRR techniques typically prepare for single link, single node and
   single Shared Risk Link Group (SRLG) failures, which are undoubtedly
   the most common failure types.  The pre-calculated backup routes are
   also downloaded to linecards in preparation for the failure, thus
   sparing the lengthy communication between control plane and data
   plane when a failure happens.

   The principle of local rerouting requires forwarding a packet along
   a detour even if only the immediate neighbors of the failed resource
   know about the failure.  IPFRR methods observing the local rerouting
   principle do not explicitly propagate the failure information.
   Unfortunately, packets on detours must be handled differently from
   normal packets, as otherwise they might get returned to the failed
   resource.  Rephrased: a node not having *any* sort of information
   about the failure may loop the packet back to the node from which it
   was rerouted, simply because its default routing/forwarding
   configuration dictates that.  As an example, see the following
   figure.  Assuming a link failure between A and Dst, A needs to drop
   packets heading to Dst.  If node A forwarded packets to Src, and if
   the latter had absolutely no knowledge of the failure, a loop would
   be formed between Src and A.
          +---+             +---+
          | B |-------------| C |
          +---+             +---+
            /                   \
           /                     \
          /                       \
   +---+            +---+ failure +---+
   |Src|------------| A |----X----|Dst|
   +---+            +---+         +---+
     =========>==============>=========>
                Primary path

   Figure 1 Forwarding inconsistency in case of local repair: the path
   from Src to Dst leads through A

   The basic problem that previous IPFRR solutions struggle to solve
   is, therefore, to provide consistent hop-by-hop routing without
   explicit signaling of the failure.

   To provide protection for all single failure cases in arbitrary
   topologies, the information about the failure must be given in
   *some* way to other nodes.  That is, IPFRR solutions targeting full
   failure coverage need to signal the fact, and to some extent the
   identity, of the failure within the data packet, as no explicit
   signaling is allowed.  Such solutions have turned out to be
   considerably complex and hard or impossible to implement
   practically.  The Loop Free Alternates (LFA) solution [RFC5286] does
   not give the failure information in any way to other routers, and so
   it cannot repair all failure cases, such as the one in Figure 1.

   As discussed in Section 2, solutions that address full failure
   coverage and rely on local repair, i.e. carrying some failure
   information within the data packets, present an overly complex and
   therefore often impractical alternative to LFA.  This draft,
   therefore, suggests that relaxing the local re-routing principle
   with carefully engineered explicit failure signaling is an effective
   approach.

   The idea of using explicit failure notification for IPFRR has been
   proposed before for Remote LFA Paths [RLFAP].  RLFAP sends explicit
   notifications and can limit the radius within which the notification
   is propagated to enhance scalability.  Design, implementation and
   enhancements for the remote LFAP concept are reported in [Hok2007],
   [Hok2008] and [Cev2010].

   This draft attempts to work out in more detail what kind of failure
   dissemination mechanism is required to facilitate remote repair
   efficiently.  Requirements for explicit signaling are given in
   Section 3.  Unlike RLFAP, this draft does not limit the failure
   advertisement radius.  As a result, the detour paths remain stable
   in most cases, since they are identical to those that the IGP will
   calculate after convergence.  Hence, micro-loops will not occur
   after IGP convergence.

   A key contribution of this memo is to recognize that a Fast
   Notification service is not only an enabler for a new IPFRR approach
   but also a replacement for various OAM remote connectivity
   verification procedures such as multi-hop BFD.  These previous
   methods posed considerable overhead to the network: (i) management
   of many OAM sessions; (ii) careful configuration of connectivity
   verification packet intervals so that no false alarm is raised for
   network-internal failures that are handled by other mechanisms; and
   (iii) packet processing overhead, since connectivity verification
   packets have to be transmitted continuously through the network in a
   mesh, even in fault-free conditions.
2. Overview of current IPFRR Proposals based on Local Repair

   The only practically feasible solution so far, Loop Free
   Alternates [RFC5286], offers the simplest resolution of the hop-by-
   hop routing consistency problem: a node performing fail-over may
   only use a next-hop as backup if it is guaranteed that that neighbor
   does not send the packets back.  Such neighbors are called Loop-Free
   Alternates (LFA).  LFAs, however, do not always exist, as shown in
   Figure 1 above, where node A has no LFA with respect to Dst.  While
   it is true that tweaking the network configuration may boost LFA
   failure case coverage considerably [Ret2011], LFAs cannot protect
   all failure cases in arbitrary network topologies.
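   For illustration only (this is not part of the specification), the
   following Python sketch evaluates the loop-free condition of
   [RFC5286], dist(N,D) < dist(N,S) + dist(S,D), on the topology of
   Figure 1 with assumed unit link costs; all names are hypothetical:

      import heapq

      def dist(graph, src):
          """Single-source shortest-path distances (Dijkstra)."""
          d = {src: 0}
          pq = [(0, src)]
          while pq:
              du, u = heapq.heappop(pq)
              if du > d.get(u, float("inf")):
                  continue                      # stale queue entry
              for v, w in graph[u].items():
                  if du + w < d.get(v, float("inf")):
                      d[v] = du + w
                      heapq.heappush(pq, (d[v], v))
          return d

      # Figure 1 topology, unit costs.
      graph = {
          "Src": {"A": 1, "B": 1},
          "A":   {"Src": 1, "Dst": 1},
          "B":   {"Src": 1, "C": 1},
          "C":   {"B": 1, "Dst": 1},
          "Dst": {"A": 1, "C": 1},
      }

      def lfas(graph, s, dest):
          """Neighbours N of s with dist(N,dest) < dist(N,s) +
          dist(s,dest); the destination itself (here also the primary
          next-hop) is excluded."""
          d_s = dist(graph, s)
          out = []
          for n in graph[s]:
              if n == dest:
                  continue
              d_n = dist(graph, n)
              if d_n[dest] < d_n[s] + d_s[dest]:
                  out.append(n)
          return out

      print(lfas(graph, "A", "Dst"))   # [] -> A has no LFA towards Dst

   For node A, the only candidate is Src, and dist(Src,Dst) = 2 is not
   smaller than dist(Src,A) + dist(A,Dst) = 2, confirming the failure
   case of Figure 1.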
   The exact way of adding extra information to data packets and its
   usage for forwarding is the most important property differentiating
   the existing IPFRR proposals.

   Packets can be marked "implicitly", when they are not altered in any
   way but some extra information owned by the router helps decide the
   correct way of forwarding.  Such extra information can be, for
   instance, the direction of the packet, i.e. the incoming interface,
   as in [FIFR].  Such solutions require what is called interface-based
   or interface-specific forwarding.

   Interface-based forwarding significantly changes the well-
   established nature of IP's destination-based forwarding principle,
   where the IP destination address alone determines the next hop.  One
   embodiment would need to download different FIBs for each physical
   or virtual IP interface - not a very compelling idea.  Another
   embodiment would alter the next-hop selection process by adding the
   incoming interface id to the lookup fields, which would impact
   forwarding performance considerably.

   Other solutions mark data packets explicitly.  Some proposals
   suggest using free bits in the IP header [MRC], which unfortunately
   do not exist in the IPv4 header.  Other proposals resort to
   encapsulating re-routed packets with an additional IP header, as in
   e.g. [NotVia], [Eny2009] or [MRT-ARCH].  Encapsulation raises the
   problem of fragmentation and reassembly, which could be a
   performance bottleneck if many packets are sent at MTU size.
   Another significant problem is the additional management complexity
   of the encapsulation addresses, which have their own semantics and
   require cumbersome routing calculations, see e.g. [MRT-ALG].
   Encapsulation in the IP header translates to label stacking in LDP-
   MPLS.  The above mentioned mechanisms either encode the active
   topology ID in a label on the stack or encode the failure point in a
   label, and also require an increasing mesh of targeted LDP sessions
   to acquire a valid label at the detour endpoint, which is another
   level of complexity.

3. Requirements of an Explicit Failure Signaling Mechanism

   All local repair mechanisms touched on above try to avoid explicit
   notification of the failure via signaling, and instead try to
   squeeze some failure-related information into data packets.  This is
   mainly due to the relatively low signaling performance of legacy
   hardware.  Failure notification, therefore, should fulfill the
   following properties to be practically feasible:

   1. The signaling mechanism should be reliable.  The mechanism needs
      to propagate the failure information to all interested nodes even
      in a network where a single link or a node is down.

   2. The mechanism should be fast in the sense that getting the
      notification packet to remote nodes through possibly multiple
      hops should not require (considerably) more processing at each
      hop than plain fast path packet forwarding.

   3. The mechanism should involve simple and efficient processing to
      be feasible for implementation in the dataplane.  This goal
      manifests itself in three ways:

      a. Origination of a notification should be very easy, e.g.
         creating a simple IP packet, the payload of which can be
         filled easily.

      b. When receiving the packet, it should be easy for dataplane
         linecards to recognize, so that processing can commence after
         forwarding.

      c. No complex operations should be required to extract the
         information from the packet needed to activate the correct
         backup routes.

   4. The mechanism should be trustable; that is, it should provide
      means to verify the authenticity of the notifications without a
      significant increase of the processing burden in the dataplane.

   5. Duplication of notification packets should be either strictly
      bounded or handled without significant dataplane processing
      burden.

   These requirements present a trade-off.  A proper balance needs to
   be found that offers good enough authentication and reliability
   while keeping processing complexity sufficiently low to be feasible
   for data plane implementation.  One such solution is proposed in
   [fn-transport], which is the assumed notification protocol in the
   following.

4. Conceptual Operation of IPFRR relying on Fast Notification

   This section outlines the operation of an IPFRR mechanism relying on
   Fast Notification.

4.1. Preparation Phase

   Like any other IPFRR solution, IPFRR-FN requires quick failure
   detection mechanisms in place, such as lower layer upcalls or BFD.
   The FN service needs to be activated and configured so that, once
   triggered by a local failure detection method, FN disseminates the
   information identifying the failure to the area.

   Based on the detailed topology database obtained by a link state
   IGP, the node should pre-calculate alternative paths considering
   *relevant* link or node failures in the area.  Failure specific
   alternative path computation should typically be executed at lower
   priority than other routing processing.  Note that the calculation
   can be done "offline", while the network is intact and the CP has
   few things to do.

   Also note the word *relevant* above: a node does not need to compute
   all the shortest paths with respect to each possible failure; only
   those link failures that lie on the shortest path tree rooted at the
   node need to be taken into consideration.

   To provide protection for Autonomous System Border Router (ASBR)
   failures, the node will need information not only from the IGP but
   also from BGP.  This is described in detail in Section 5.3.

   After calculating the failure specific alternative next-hops, only
   those that represent a change compared to the primary next-hop
   should be pre-installed to the linecards, together with the
   identifier of the failure that triggers the switch-over.  In order
   to preserve scalability, external prefixes are handled through the
   FIB indirection available in most routers already.  Due to
   indirection, backup routes need to be installed only for egress
   routers.  (The resource needs of an example implementation are
   briefly discussed in Appendix A.)
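   A minimal Python sketch of this pre-computation is given below.  It
   is illustrative only: it assumes a link-state graph given as nested
   cost dictionaries, considers only the link failures on the node's
   own shortest path tree, and stores only the next-hops that differ
   from the primary ones.

      import heapq

      def spf(graph, src, removed=frozenset()):
          """Dijkstra from src, skipping failed links given as
          frozenset({u, v}) pairs in `removed`.  Returns (first_hop,
          tree_links): the first hop towards every destination and the
          links of the shortest path tree."""
          dist = {src: 0.0}
          done, first, tree = set(), {}, set()
          pq = [(0.0, src, None, None)]   # (dist, node, first hop, parent)
          while pq:
              d, u, fh, p = heapq.heappop(pq)
              if u in done:
                  continue
              done.add(u)
              if p is not None:
                  first[u] = fh
                  tree.add(frozenset((p, u)))
              for v, w in graph[u].items():
                  if v in done or frozenset((u, v)) in removed:
                      continue
                  nd = d + w
                  if nd < dist.get(v, float("inf")):
                      dist[v] = nd
                      heapq.heappush(pq, (nd, v, v if p is None else fh, u))
          return first, tree

      def precompute_backups(graph, me):
          """Only link failures on our own SPT are relevant; keep only
          failure specific next-hops differing from the primary ones,
          keyed by the failure that activates them (Section 4.1)."""
          primary, tree = spf(graph, me)
          backups = {}
          for link in tree:
              alt, _ = spf(graph, me, removed={link})
              changed = {dst: nh for dst, nh in alt.items()
                         if primary.get(dst) != nh}
              if changed:
                  backups[link] = changed
          return backups

   On the Figure 1 graph, precompute_backups(graph, "A") would record,
   for the failure of link A-Dst, the changed next-hop Src for the
   destinations Dst and C only.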
4.2. Failure Reaction Phase

   The main steps to be taken after a failure are the following:

   1. Quick dataplane failure detection

   2. Send information about the failure using the FN service right
      from the dataplane.

   3. Forward the received notification as defined by the actually used
      FN protocol, such as the one in [fn-transport]

   4. After learning about a local or remote failure, extract the
      failure identifier and activate failure specific backups, if
      needed, directly within the dataplane

   5. Start forwarding data traffic using the updated FIB

   After a node detects the loss of connectivity to another node, it
   should decide whether the failure can be handled locally.  If local
   repair is not possible or not configured, for example because LFA is
   not configured or there are destinations for which no LFA exists,
   the failure should trigger the FN service to disseminate the failure
   description.  For instance, if BFD detects a dataplane failure, it
   not only should invoke routines to notify the control plane but
   should first trigger FN before notifying the CP.

   After receiving the trigger, without any DP-CP communication
   involved, FN constructs a packet and adds the description of the
   failure (described in Section 5.2) to the payload.  The notification
   describes that

   o  a node X has lost connectivity

   o  to a node Z

   o  via a link L.

   The proposed encoding of the IPFRR-FN packet is described in
   Section 5.2.

   The packet is then disseminated by the FN service in the routing
   area.  Note the synergy between BFD and IGP Hellos on the one hand,
   and between FN and IGP link state advertisements on the other: BFD
   is a dataplane-optimized implementation of the routing protocol's
   Hello mechanism, while Fast Notification is a dataplane-optimized
   implementation of the link state advertisement flooding mechanism of
   IGPs.

   At each hop, the recipient node needs to perform a "punt and
   forward".  That is, the FN packet not only needs to be forwarded to
   the FN neighbors as the specific FN mechanism dictates, but a
   replica also needs to be detached and, after forwarding, processed
   by the dataplane card.

4.2.1. Activating Failure Specific Backups

   After the forwarding element has extracted the contents of the
   notification packet, it knows that a node X has lost connectivity to
   a node Z via a link L.  The recipient now needs to decide whether
   the failure was a link or a node failure.  Two approaches can be
   thought of.  Both options are based on the property that
   notifications advance in the network as fast as possible.

   In the first option, the router does not immediately make the
   decision, but instead starts a timer set to fire after a couple of
   milliseconds.  If the failure was a node failure, the node will
   receive further notifications saying that another node Y has lost
   connectivity to node Z through another link M.  That is, if node Z
   is common in multiple notifications, the recipient can conclude that
   it is a node failure and already knows which node it is (Z).  If
   link L is common, then the recipient can decide for a link failure
   (L).  If further inconclusive notifications arrive, then there are
   multiple failures, a case that is out of scope for IPFRR and left
   for regular IGP convergence.  A sketch of this option is given
   below.
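   The following Python sketch (hypothetical, illustrative only)
   implements the first option: notifications are held for a short
   timer, after which a common link implies a link/SRLG failure and a
   common node implies a node failure.  The activate() callback, which
   would swap in the pre-installed failure specific next-hops, is
   assumed and not shown; locking is omitted for brevity.

      import threading

      HOLD_MS = 5   # illustrative; the draft suggests a few milliseconds

      class FailureCorrelator:
          def __init__(self, activate):
              self.activate = activate
              self.pending = []      # (originator X, neighbour Z, link L)
              self.timer = None

          def on_notification(self, x, z, link):
              self.pending.append((x, z, link))
              if self.timer is None:   # first FN: arm the decision timer
                  self.timer = threading.Timer(HOLD_MS / 1000.0,
                                               self.decide)
                  self.timer.start()

          def decide(self):
              nodes = {z for _, z, _ in self.pending}
              links = {l for _, _, l in self.pending}
              if len(links) == 1:
                  self.activate("link", links.pop())   # common link/SRLG
              elif len(nodes) == 1:
                  self.activate("node", nodes.pop())   # common node Z
              else:
                  pass   # inconclusive: multiple failures, left to the IGP
              self.pending, self.timer = [], None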
   After concluding on the exact failure, the data plane element needs
   to check in its pre-installed IPFRR database whether this particular
   failure results in any route changes.  If yes, the linecard replaces
   the next-hops impacted by that failure with their failure specific
   backups, which were pre-installed in the preparation phase.

   In the second option, the first received notification is immediately
   handled as a link failure, hence the router may start replacing its
   next-hops.  In many cases this is a good decision, as it has been
   shown before that most network failures are link failures.  If,
   however, another notification arrives a couple of milliseconds later
   that points to a node failure, the router then needs to start
   replacing its next-hops again.  This may cause a route flap, but due
   to the quick dissemination mechanism the routing inconsistency is
   very short lived and likely takes only a couple of milliseconds.

4.2.2. SRLG Handling

   The above conceptual solution is easily extensible to support pre-
   configured SRLGs.  Namely, if the failed link is part of an SRLG,
   then the disseminated link ID should identify the SRLG itself.  As a
   result, possible notifications describing other link failures of the
   same SRLG will identify the same resource.

   If the control plane knows about SRLGs, it can prepare for failures
   of these, e.g. by calculating a path that avoids all links in that
   SRLG.  SRLG identifiers may have been pre-configured or obtained by
   automated mechanisms such as [RFC4203].

4.3. Example and Timing

   The main message of this section is that high-delay links do not
   represent a problem for IPFRR-FN.  The FN message of course
   propagates more slowly on long-haul links, but the same delay is
   incurred by normal data packets as well.  Packet loss only takes
   place as long as a node forwards traffic to an incorrect or
   inconsistent next-hop.  This may happen in two cases:

   First, as long as the failure is not detected, the node adjacent to
   the failure only has the failed next-hop installed.

   Secondly, when a node (A) selects a new next-hop (B) after detecting
   the failure locally or by receiving an FN, the question is whether
   the routing in the new next-hop (B) is consistent by the time the
   first data packets get from A to B.  The following timeline depicts
   the situation:

   Legend:   X : period with packet loss
             FN forwarding delay: |--|

        |--|--------|
   A:----1XX2XXXXXXXX3-----------------------------------------------
        |----|      |----|
   B:------------4--5-----6XX7---------------------------------------
                    |--|--------|

   (a) Link delay is |----|   FIB update delay is |--------|

        |--|--------|
   A:----1XX2XXXXXXXX3-----------------------------------------------
        |---------------|
                    |---------------|
   B:-----------------------4--5-----6XX7-----------------------------
                            |--|--------|

   (b) Link delay is |---------------|   FIB update delay is |--------|

   Figure 2 Timing of FN and data packet forwarding

   As can be seen above, the outage time is only influenced by the FN
   forwarding delay and the FIB update time.  The link delay is not a
   factor.  Node A forwards the first re-routed packets from time
   instance 3 to node B.  These reach node B at time instance 6.  Node
   B performs incorrect/inconsistent forwarding as long as it tries to
   forward packets back to A that A has already put onto the detour.
   This is the interval between time instances 6 and 7.
4.4. Scoping FN Messages with TTL

   In a large routing area it is often the case that a failure (i.e. a
   topology change) causes next-hop changes only in routers relatively
   close to the failure.  Analysis of certain random topologies and two
   example ISP topologies revealed that, for the particular topologies
   under study, a single link failure event generated routing table
   changes only in routers not more than 2 hops away from the failure
   site [Hok2008].  Based on this analysis, it is anticipated that in
   practice the TTL for failure notification messages can be set to a
   relatively small radius, perhaps as small as 2 or 3 hops.

   A chief benefit of TTL scoping is that it reduces the overhead on
   routers that have no use for the information (i.e. that do not need
   to re-route).  Another benefit (that is particularly important for
   links with scarce capacity) is that it helps to constrain the
   control overhead incurred on network links.  Determining a suitable
   TTL value for each locally originated event, and controlling failure
   notification dissemination in general, is discussed further in
   Section 5.8.

5. Operation Details

5.1. Transport of Fast Notification Messages

   This draft recommends that, out of the several FN delivery options
   defined in [fn-transport], the flooding transport option be
   preferred: it ensures that any event can reach each node from any
   source with any single failure present in the network area, as long
   as this is theoretically possible.  Flooding also ensures that FN
   messages reach each node on the shortest (delay) path, and as a side
   effect failure notifications always reach *each* node *before* re-
   routed data packets could reach that node.  This means that looping
   is minimized.

   [fn-transport] describes that the dataplane flooding procedure
   requires routers to perform duplicate checking before forwarding the
   notifications to other interfaces, to avoid duplicating
   notifications.  [fn-transport] describes that the duplicate check
   can be performed with a simple storage queue, where previously
   received notification packets or their signatures are stored.

   IPFRR-FN enables another duplicate check process, based on its
   internal state machine.  Routers, after receiving a notification but
   before forwarding it to other peers, check the authenticity of the
   message, if authentication is used.  The router may then compare the
   stored event with the event described by the received notification.

   Two variables and a bit describe the currently known failure state:

   o  Suspected failed node ID (denoted by N)

   o  Suspected link/SRLG ID (denoted by S)

   o  Bit indicating the type of the failure, i.e. link/SRLG failure or
      node failure (denoted by T)

   Recall that the incoming notification describes that a node X has
   lost connectivity to a node Z via a link L.
   The state machine can then be described with the following pseudo-
   code:

   //current state:
   //   N: ID of suspected failed node
   //   S: ID of suspected failed link/SRLG
   //   T: bit indicating the type of the failure
   //        T=0 indicates link/SRLG
   //        T=1 indicates node
   //
   Proc notification_received(Node X, Node Z, SRLG L) {
     // node X reports lost connectivity to node Z via link/SRLG L
     if (N == NULL) {
       // this is a new event: store it and forward it
       N=Z;
       S=L;
       T=0; //which is the default anyway
       Forward_notification;
     }
     else if (S == L AND T == 0) {
       // same link or SRLG as before; nothing to do
       Discard_notification;
     }
     else if (N == Z) {
       // the suspected node is common: this is a node failure
       if (T == 0) {
         // it has just turned out to be a node failure
         T=1;
         Forward_notification;
       }
       else {
         // already known to be a node failure; no need to forward
         Discard_notification;
       }
     }
     else {
       // inconclusive: multiple failures, left to IGP convergence
     }
   }

   Figure 3 Pseudo-code of the state machine for FN forwarding

5.2. Message Handling and Encoding

   A failure identifier is needed that unambiguously describes the
   failed resource, consistently among the nodes in the area.  The
   semantics of the identifiers are defined by the IGP used to pre-
   calculate and pre-install the backup forwarding entries, e.g. OSPF
   or ISIS.

   This draft defines a Failure Identification message class.  Members
   of this class represent a routing protocol specific Failure
   Identification message to be carried by the Fast Notification
   transport protocol.  Each message within the Failure Identification
   message class shall contain the following fields, the lengths of
   which are routing protocol specific.  The exact values shall be
   aligned with the WG of the routing protocol:

   o  Originator Router ID: the identifier of the router advertising
      the failure;

   o  Neighbour Router ID: the identifier of the neighbour node to
      which the originator lost connectivity.

   o  Link ID: the identifier of the link through which connectivity
      was lost to the neighbour.  The routing protocol should assign
      the same Link ID to bidirectional, broadcast or multi-access
      links from each access point, consistently.

   o  Sequence Number: [fn-transport] expects applications of the FN
      service that require replay attack protection to create and
      verify a sequence number in FN messages.  This is described in
      Section 6.

   Routers forwarding the FN packets should ensure that Failure
   Identification messages are not lost, e.g. due to congestion.  FN
   packets can be put into a high precedence traffic class (e.g. the
   Network Control class).  If the network environment is known to be
   lossy, the FN sender should repeat the same notification a couple of
   times, like salvo fire.

   After the forwarding element has processed the FN packet and
   extracted the Failure Identification message, it should decide what
   backups need to be activated, if any, as described in Section 4.2.1.
5.2.1. Failure Identification Message for OSPF

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |           FN Length           |  FN App Type  |AuType |unused |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                     Originator Router ID                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                      Neighbour Router ID                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                            Link ID                            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                        Sequence Number                        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    Sequence Number (cont'd)                   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   FN Header fields:

   FN Length
      The length of the Failure Identification message for OSPF is 16
      bytes.

   FN App Type
      The exact values are to be assigned by IANA for the Failure
      Identification message class.  For example, FN App Type values
      between 0x0008 and 0x000F could represent Failure Identification
      messages, of which 0x0008 could mean OSPF and 0x0009 ISIS.

   AuType
      IPFRR-FN relies on the authentication options offered by the FN
      transport service.  Cryptographic authentication is recommended.

   Originator Router ID
      If the routing protocol is OSPF, the value can take the OSPF
      Router ID of the advertising router.

   Neighbour Router ID
      The OSPF Router ID of the neighbour router to which connectivity
      was lost.

   Link ID
      If the link is a LAN, the Link ID takes the LSA ID of the Network
      LSA representing it.
      If the link is a point-to-point link, the Link ID can take the
      minimum or the maximum of the two interface IDs.  The requirement
      is that this is done consistently.

   Sequence Number
      This field stores a digest of the LSDB of the routing protocol,
      as described in Section 6.
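   As an illustration only, a Python sketch of serializing this message
   is shown below.  The exact field widths of the first 32-bit word are
   not pinned down by the figure; this sketch assumes FN Length:16
   bits, FN App Type:8 bits, AuType:4 bits, unused:4 bits, and an
   App Type value of 0x08 for OSPF pending IANA assignment.  These
   assumptions, and all names, are the author's, not the protocol's.

      import struct

      FN_APP_TYPE_OSPF = 0x08   # hypothetical, pending IANA assignment
      FN_LENGTH_OSPF = 16       # fixed length stated in Section 5.2.1

      def build_failure_identification(originator_id, neighbour_id,
                                       link_id, lsdb_digest, autype=0):
          """Serialize an OSPF Failure Identification message.  The IDs
          are 32-bit OSPF router/LSA IDs; lsdb_digest fills the 64-bit
          Sequence Number field described in Section 6."""
          word0 = ((FN_LENGTH_OSPF << 16) |
                   (FN_APP_TYPE_OSPF << 8) |
                   ((autype & 0xF) << 4))
          return struct.pack("!IIIIQ", word0, originator_id,
                             neighbour_id, link_id, lsdb_digest)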
5.2.2. Failure Identification Message for ISIS

   TBA.

5.3. Protecting External Prefixes

5.3.1. Failure on the Intra-Area Path Leading to the ASBR

   Installing failure specific backup next-hops for each external
   prefix would be a scalability problem, as the number of these
   prefixes may be one or two orders of magnitude higher than the
   number of intra-area destinations.  To avoid this, it is suggested
   to make use of the indirection already offered by router vendors.

   Indirection means that when a packet needs to be forwarded to an
   external destination, the IP address lookup in the FIB does not
   return a direct result but a pointer to another FIB entry, i.e. to
   the FIB entry of the ASBR.  In LDP/MPLS this means that all prefixes
   reachable through the same ASBR constitute the same FEC.

   As an example, consider that in an area ASBR1 is the primary BGP
   route for prefixes P1, P2, P3 and P4, and ASBR2 is the primary route
   for prefixes P5, P6 and P7.  A FIB arrangement for this scenario
   could be the one shown in the following figure.  Prefixes using the
   same ASBR could be resolved to the same pointer that references the
   next-hop leading to the ASBR.  Prefixes resolved to the same pointer
   are said to be part of the same "prefix group" or FEC.

       FIB lookup            |        FIB lookup
                             |
   ASBR2 ========> NH2       |    ASBR2 ========> NH2 <----+
   ASBR1 ========> NH1       |    ASBR1 ========> NH1 <-+  |
                             |                          |  |
   P1    ========> NH1       |    P1    ========> Ptr1 -+  |
   P2    ========> NH1       |    P2    ========> Ptr1 -+  |
   P3    ========> NH1       |    P3    ========> Ptr1 -+  |
   P4    ========> NH1       |    P4    ========> Ptr1 -+  |
                             |                             |
   P5    ========> NH2       |    P5    ========> Ptr2 ----+
   P6    ========> NH2       |    P6    ========> Ptr2 ----+
   P7    ========> NH2       |    P7    ========> Ptr2 ----+
                             |

   Figure 4 FIB without (left) and with (right) indirection

   If the next-hop to an ASBR changes, it is enough to update the next-
   hop of the ASBR route in the FIB.  In the above example, this means
   that if the next-hop of ASBR1 changes, it is enough to update the
   route entry for ASBR1; due to indirection through pointer Ptr1, this
   updates several prefixes at the same time.
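   The indirection of Figure 4 can be pictured with a shared mutable
   entry, as in the following illustrative Python sketch (names are
   hypothetical):

      class FibGroup:
          """Shared next-hop container: the 'pointer' of Figure 4.  All
          prefixes of one prefix group (FEC) reference one instance."""
          def __init__(self, next_hop):
              self.next_hop = next_hop

      # Right-hand FIB of Figure 4: one group per egress ASBR.
      ptr1 = FibGroup("NH1")        # prefixes whose primary is ASBR1
      ptr2 = FibGroup("NH2")        # prefixes whose primary is ASBR2
      fib = {"P1": ptr1, "P2": ptr1, "P3": ptr1, "P4": ptr1,
             "P5": ptr2, "P6": ptr2, "P7": ptr2}

      # If the next-hop towards ASBR1 changes, one write updates P1..P4:
      ptr1.next_hop = "NH3"
      assert fib["P2"].next_hop == "NH3"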
5.3.2. Protecting ASBR Failures: BGP-FRR

   IPFRR-FN can make use of alternative BGP routes advertised in an AS
   by new extensions of BGP such as [BGPAddPaths], [DiverseBGP] or
   [BGPBestExt].  Using these extensions, for each destination prefix,
   a node may learn a "backup" ASBR besides the primary ASBR learnt
   through normal BGP operation.

5.3.2.1. Primary and Backup ASBR in the Same Area

   If the failed ASBR is inside the area, all nodes within that area
   get notified by FN.  Grouping prefixes into FECs, however, needs to
   be done carefully.  Prefixes now constitute a common group (i.e. are
   resolved to the same pointer) only if *both* their primary AND their
   backup ASBRs are the same.  This is due to the fact that even if two
   prefixes use the same ASBR by default, they may use different ASBRs
   when their common default ASBR fails.

   Considering the previous example, let us assume that the backup ASBR
   of prefixes P1 and P2 is ASBR2 but that the backup ASBR of P3 and P4
   is ASBR3.  Let us further assume that P5 has ASBR3 as its backup
   ASBR, but P6 and P7 have ASBR4 as their backup ASBR.  The resulting
   FIB structure is shown in the following figure:

       FIB lookup
   ASBR4 ========> NH4
   ASBR2 ========> NH2
   ASBR3 ========> NH3
   ASBR1 ========> NH1

   P1    ========> Ptr1 -+-> NH1
   P2    ========> Ptr1 -+

   P3    ========> Ptr2 -+-> NH1
   P4    ========> Ptr2 -+

   P5    ========> Ptr3 ---> NH2

   P6    ========> Ptr4 -+-> NH2
   P7    ========> Ptr4 -+

   Figure 5 Indirect FIB for ASBR protection

   If, for example, ASBR1 goes down, this affects prefixes P1 through
   P4.  In order to set the correct backup routes, the container
   referenced by Ptr1 needs to be updated to NH2 (next-hop of ASBR2)
   but the location referenced by Ptr2 needs to be updated to NH3
   (next-hop of ASBR3).  This means that P1 and P2 may constitute one
   FEC, but P3 and P4 need to be another FEC, so that their backups can
   be set independently.

   Note that the routes towards ASBR2 or ASBR3 may have changed, too.
   For example, if after the failure ASBR3 used a new next-hop NH5,
   then the container referenced by Ptr2 should be updated to NH5.  A
   resulting detour FIB is shown in the following figure.

       FIB lookup
   ASBR4 ========> NH4
   ASBR2 ========> NH2
   ASBR3 ========> NH5
   ASBR1 ========> X

   P1    ========> Ptr1 -+-> NH2
   P2    ========> Ptr1 -+

   P3    ========> Ptr2 -+-> NH5
   P4    ========> Ptr2 -+

   P5    ========> Ptr3 ---> NH2

   P6    ========> Ptr4 -+-> NH2
   P7    ========> Ptr4 -+

   Figure 6 Indirect "detour" FIB in case of ASBR1 failure

   During pre-calculation, the control plane pre-downloaded the failure
   identifier of ASBR1, assigned NH5 as the failure specific backup for
   the routes of ASBR3 and pointer Ptr2, and assigned NH2 as the
   failure specific backup for the route referenced by Ptr1.

5.3.2.2. Primary and Backup ASBR in Different Areas

   By default, the scope of FN messages is limited to a single routing
   area.

   The IPFRR-FN application of FN may, however, need to redistribute
   some specific notifications across areas in a limited manner.

   If an ASBR1 in Area1 goes down and some prefixes need to use ASBR2
   in another Area2, then, besides Area1, routers in Area2 need to know
   about this failure.  Since communication between non-backbone areas
   goes through the backbone area, the backbone may also need the
   information.

   Naturally, if ASBR2 resides in the backbone area, then the FN about
   ASBR1's failure needs to be leaked only into the backbone area.

   Leaking is facilitated by area border routers (ABRs).  During the
   failure preparation phase, the routing engine of an ABR can
   determine that, for an intra-area ASBR, the backup ASBR is in a
   different area to which it is the ABR.  In that case, the routing
   engine installs such intra-area ASBRs in an "FN redistribution list"
   at the dataplane cards.

   The ABR, after receiving FN messages, may conclude in its state
   machine that a node failure has happened.  If the failed node is in
   the redistribution list, the ABR will generate an FN with the
   following data:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |               16              |     0x0008    |AuType |unused |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                         ABR Router ID                         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                         ASBR Router ID                        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                              0x0                              |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                        Sequence Number                        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    Sequence Number (cont'd)                   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   This message is then distributed into the neighbour area specified
   in the redistribution list as a regular FN message.  A Link ID of
   0x0 specifically signals in the neighbour area that this failure is
   a known node failure of the node specified by the "Neighbour Router
   ID" field (which was set to the failed ASBR's ID).

   ABRs in a non-backbone area need to prepare to redistribute ASBR
   failure notifications from within their area into the backbone area.

   ABRs in the backbone area need to prepare to redistribute an ASBR
   failure notification from the backbone area into the area where a
   backup ASBR resides.
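   For illustration, the redistribution hook at an ABR could look like
   the following hypothetical Python sketch, reusing the message
   builder sketched in Section 5.2.1; the redistribution list and all
   identifiers are assumptions:

      # Maps an intra-area ASBR's router ID to the neighbour area(s)
      # hosting its backup ASBR (illustrative values).
      redistribution_list = {0x0A0000AA: ["Area0"]}

      def on_node_failure_concluded(abr_id, failed_node_id,
                                    lsdb_digest, send_fn):
          """If the failed node is a listed ASBR, leak a node-failure FN
          into the neighbour area with Link ID 0x0, the 'known node
          failure' marker."""
          for area in redistribution_list.get(failed_node_id, []):
              msg = build_failure_identification(abr_id, failed_node_id,
                                                 0x0, lsdb_digest)
              send_fn(area, msg)   # inject as a regular FN in that area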
   Consider the previous example, but now let us assume that the
   current area is Area0, that ASBR2 and ASBR3 reside in Area1
   (reachable through ABR1), and that ASBR4 resides in Area2 (reachable
   through ABR2).  The resulting FIBs are shown in the following
   figures: in case of ASBR2 failure, only Ptr4 needs an update.

       FIB lookup
   ABR1  ========> NH6
   ABR2  ========> NH7

   (ASBR4 ========> NH7)   //may or may not be in the FIB
   (ASBR2 ========> NH6)   //may or may not be in the FIB
   (ASBR3 ========> NH6)   //may or may not be in the FIB
   (ASBR1 ========> NH1)   //may or may not be in the FIB

   P1    ========> Ptr1 -+-> NH1
   P2    ========> Ptr1 -+

   P3    ========> Ptr2 -+-> NH1
   P4    ========> Ptr2 -+

   P5    ========> Ptr3 ---> NH6

   P6    ========> Ptr4 -+-> NH6
   P7    ========> Ptr4 -+

   Figure 7 Indirect FIB for inter-area ASBR protection

       FIB lookup
   ABR1  ========> NH6
   ABR2  ========> NH7

   (ASBR4 ========> NH7)   //may or may not be in the FIB
   (ASBR2 ========> X  )   //may or may not be in the FIB
   (ASBR3 ========> NH6)   //may or may not be in the FIB
   (ASBR1 ========> NH1)   //may or may not be in the FIB

   P1    ========> Ptr1 -+-> NH1
   P2    ========> Ptr1 -+

   P3    ========> Ptr2 -+-> NH1
   P4    ========> Ptr2 -+

   P5    ========> Ptr3 ---> NH6

   P6    ========> Ptr4 -+-> NH7
   P7    ========> Ptr4 -+

   Figure 8 Indirect "detour" FIB for inter-area ASBR protection, ASBR2
   failure

5.4. Application to LDP

   It is possible for LDP traffic to follow paths other than those
   indicated by the IGP.  To do so, it is necessary for LDP to have the
   appropriate labels available for the alternate, so that the
   appropriate out-segments can be installed in the forwarding plane
   before the failure occurs.

   This means that a Label Switching Router (LSR) running LDP must
   distribute its labels for the Forwarding Equivalence Classes (FECs)
   it can provide to all its neighbours, regardless of whether or not
   they are upstream.  Additionally, LDP must be acting in liberal
   label retention mode, so that the labels corresponding to neighbours
   that are not currently the primary neighbour are stored.  Similarly,
   LDP should be in downstream unsolicited mode, so that the labels for
   the FEC are distributed other than along the SPT.

   The above criteria are identical to those defined in [RFC5286].

   In IP, a received FN message may result in rewriting the next-hop in
   the FIB.  If LDP is applied, the label FIB also needs to be updated
   in accordance with the new next-hop; in the LFIB, however, not only
   the outgoing interface needs to be replaced but also the label, with
   the one valid for this non-default next-hop.  The latter is
   available thanks to liberal label retention and unsolicited
   downstream mode.
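   A minimal, illustrative Python sketch of such an LFIB update follows
   (the FEC name, labels and neighbour names are hypothetical):

      # Liberal label retention: for each FEC we keep the label
      # advertised by every neighbour, not just the current next-hop.
      lfib = {
          "FEC-D": {"next_hop": "C", "out_label": 301,
                    "labels": {"C": 301, "G": 507, "A": 112}},
      }

      def reroute_fec(fec, new_next_hop):
          """On an FN-driven next-hop change, rewrite both the next-hop
          and the label valid for that neighbour."""
          entry = lfib[fec]
          entry["next_hop"] = new_next_hop
          entry["out_label"] = entry["labels"][new_next_hop]

      reroute_fec("FEC-D", "G")
      assert lfib["FEC-D"]["out_label"] == 507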
5.5. Application to VPN PE Protection

   Protecting against (egress) PE router failures in VPN scenarios is
   conceptually similar to protecting against ASBR failures for
   Internet traffic.  The difference is that in the ASBR protection
   case core routers are normally aware of external prefixes through
   iBGP, while in VPN cases P routers can only route inside the domain.
   In the VPN case, tunnels running between ingress PE and egress PE
   decrease the burden on P routers.  The task here is to redirect
   traffic to a backup egress PE.

   Egress PE protection effectively calls for an explicit failure
   notification, yet existing proposals try to avoid it.

   [I-D.bashandy-bgp-edge-node-frr] proposes that the P routers
   adjacent to the primary PE maintain the necessary routing state and
   perform the tunnel decaps/re-encaps to the backup PE, thereby
   imposing considerable complexity on P routers.

   [I-D.ietf-pwe3-redundancy] describes a mechanism for pseudowire
   redundancy where PE routers need to run multi-hop BFD sessions to
   detect the loss of a primary egress PE.  This leads to a potentially
   full mesh of multi-hop BFD sessions, which is a considerable
   complexity.  In addition, in some cases the egress PE of the
   secondary PW might need to explicitly set the PW state from standby
   to active.

   FN provides the needed mechanism to actively inform all nodes,
   including PE routers, that a failure has happened, and also
   identifies that it was a node failure.  Furthermore, since both the
   ingress PE and the secondary egress PE are informed, all information
   is available for a proper switch-over - without a full mesh of BFD
   sessions running all the time between PE routers.

5.6. Bypassing Legacy Nodes

   Legacy nodes, while they can neither originate fast notifications
   nor process them, can be assumed to be able to forward the
   notifications.  As [fn-transport] discusses, FN forwarding is based
   on multicast.  It is safe to assume that legacy routers' multicast
   configuration can be set up statically so that they propagate fast
   notifications as needed.

   When calculating failure specific alternative routes, IPFRR-FN
   capable nodes must treat legacy nodes as fixed directed links, since
   legacy nodes do not change packet forwarding in the case of a
   failure.  There are situations when an IPFRR-FN capable node can,
   nevertheless, bypass a non-IPFRR-FN capable node in order to handle
   a remote failure.

   As an example consider the topology depicted in Figure 9, where the
   link between C and D fails.  C cannot locally repair the failure.

   +---+    +---+    +---+    +---+
   | E |----| F |----| G |----| H |
   +---+    +---+    +---+    +---+
     |              /           |
     |             /            |
     |            /             |
   +---+    +---+    +---+    +---+
   | A |----| B |----| C |-X--| D |
   +---+    +---+    +---+    +---+
     >========>=========>
      Traffic from A to D

   Figure 9 Example for bypassing legacy nodes

   First, let us assume that each node is IPFRR-FN capable.  C would
   advertise the failure information using FN.  Each node learns that
   the link between C and D has failed, as a result of which C changes
   its forwarding table to send any traffic destined to D via B.  B
   also makes a change, replacing its default next-hop (C) with G.
   Note that the other nodes do not need to modify their forwarding at
   all.

   Now, let us assume that B is a legacy router not supporting IPFRR-FN
   but statically configured to multicast fast notifications as needed.
   As such, A will receive the notification.  A's pre-calculations have
   been done knowing that B is unable to correct the failure.  Node A,
   therefore, has pre-calculated E as the failure specific next-hop.
   Traffic entering at A and heading to D can thus be repaired.  A
   sketch of the consistency check behind such a pre-calculation is
   given below.
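   The following illustrative Python sketch (names assumed) walks a
   packet across the post-failure forwarding tables, with legacy nodes
   pinned to their unchanged primary tables, and reports whether the
   destination is reached without a loop:

      def detour_reaches(dest, start, repaired, legacy, primary,
                         max_hops=64):
          """Walk a packet from start to dest after a failure.  IPFRR-FN
          capable nodes forward along their failure specific tables
          (`repaired`), legacy nodes keep their unchanged primary tables
          (`primary`); both map node -> {dest: next_hop}.  Returns True
          if dest is reached without looping -- the check behind
          treating legacy nodes as fixed directed links."""
          node, seen = start, set()
          while node != dest:
              if node in seen or len(seen) >= max_hops:
                  return False
              seen.add(node)
              table = primary if node in legacy else repaired
              node = table[node][dest]
          return True

   In Figure 9 with B legacy, a candidate detour from A via B would
   cycle between B and C (B still forwards D-traffic to C, while C
   reroutes it back to B), so A's pre-calculation keeps E as its
   failure specific next-hop instead.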
5.7. Capability Advertisement

   The solution requires nodes to know which other nodes in the area
   are capable of IPFRR-FN.  The most straightforward way to achieve
   this is to rely on the Router Capability TLVs available both in
   OSPF [RFC4970] and in IS-IS [RFC4971].

5.8. Constraining the Dissemination Scope of Fast Notification Packets

   As discussed earlier in Section 4.4, it is desirable to constrain
   the dissemination scope of failure notification messages.  This
   section presents three candidate methods for controlling the scope
   of failure notification: (1) pre-configure the TTL for FN messages
   in routers based on best current practices and related studies of
   available ISP and enterprise network topologies; (2) dynamically
   calculate the minimum TTL value needed to ensure 100% remote LFAP
   coverage; and (3) dynamically calculate the set of neighbours to
   which an FN message should be forwarded, given the identity of the
   link that has failed.

   These candidate dissemination options are mechanisms with different
   levels of optimality and complexity.  The intent here is to present
   some options that will generate further discussion on the trade-offs
   between different FN message scoping methods.

5.8.1. Pre-Configured FN TTL Setting

   As discussed earlier in Section 4.4, studies of various network
   topologies suggest that a fixed TTL setting of 2 hops may be
   sufficient to ensure failure notification coverage for typical OSPF
   area topologies.  Therefore, a potentially simple solution for
   constraining FN message dissemination is for network managers to
   configure their routers with a fixed TTL setting (e.g., TTL=2 hops)
   for FN messages.  This TTL setting can be adjusted by network
   managers to account for implementation-specific details of the
   topology, such as configuring a larger TTL setting for topologies
   containing, say, large ring sub-graph structures.

   In terms of performance trade-offs, pre-configuring the FN TTL,
   since it is fixed at configuration time, incurs no computational
   overhead for the router.  On the other hand, it represents a
   configurable router parameter that network administrators must
   manage.  Furthermore, the fixed, pre-configured FN TTL approach is
   sub-optimal in terms of constraining the FN dissemination, as most
   single link events will not require FN messages to be sent up to TTL
   hops away from the failure site.

5.8.2. Advanced FN Scoping

   While a static pre-configured setting of the FN TTL will likely work
   in practice for a wide range of OSPF area topologies, it has at
   least two weaknesses: (1) there may be certain topologies for which
   the TTL setting happens to be insufficient to provide the needed
   failure coverage; and (2) as discussed above, it tends to result in
   FN being disseminated over a larger radius than needed to facilitate
   re-routing.

   The solution to these drawbacks is for routers to dynamically
   compute the FN TTL radius needed for each of the local links they
   monitor.  Doing so addresses the two weaknesses of a pre-configured
   TTL setting by computing a custom TTL setting for each local link
   that matches exactly the FN message radius for the given topology.
   The drawback, of course, is the additional computation.  However,
   given a quasi-static network topology, this dynamic FN TTL
   computation can be performed infrequently and, therefore, on average
   incurs relatively small computation overhead.  A sketch of such a
   computation follows.
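   For illustration only, the Python sketch below derives the minimum
   TTL for a given link failure as the largest hop distance from the
   failure site to any router whose next-hops change.  It reuses the
   spf() helper sketched in Section 4.1 and assumes a connected area;
   all names are the author's.

      from collections import deque

      def hop_distances(graph, src):
          """Hop counts from src (BFS over the intact topology)."""
          d, q = {src: 0}, deque([src])
          while q:
              u = q.popleft()
              for v in graph[u]:
                  if v not in d:
                      d[v] = d[u] + 1
                      q.append(v)
          return d

      def min_fn_ttl(graph, link):
          """Smallest TTL that still reaches every router whose next-
          hops change when `link` (a frozenset({u, v})) fails."""
          affected = [r for r in graph
                      if spf(graph, r)[0] !=
                         spf(graph, r, removed={link})[0]]
          u, v = tuple(link)
          du, dv = hop_distances(graph, u), hop_distances(graph, v)
          return max((min(du[r], dv[r]) for r in affected), default=0)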
   While a pre-configured TTL eliminates computation overhead at the
   expense of FN dissemination overhead, and dynamic updates of the TTL
   settings achieve better dissemination efficiency by incurring some
   computational complexity, directed FN message forwarding attempts to
   minimize the FN dissemination scope by leveraging additional
   computation power.  Here, rather than computing an FN TTL setting
   for each local link, a network employing directed forwarding has
   each router instance R compute the sets of one-hop neighbours to
   which an FN message must be forwarded for every possible failure
   event in the routing area.  This has the beneficial effect of
   constraining the FN scope to the directions where there are nodes
   that require the FN update, as opposed to disseminating over the
   entire TTL hop radius around a failure site.  The trade-off here, of
   course, is the additional computation complexity incurred and the
   maintenance of forwarding state for each possible failure case.
   Reference [Cev2010] gives an algorithm for finding, for each failure
   event, the direct neighbours to which the notification should be
   forwarded.

6. Protection against Replay Attacks

   To defend against replay attacks, recipients should be able to
   ignore a re-sent recording of a previously sent FN packet.  This
   suggests that some sort of sequence number should be included in the
   FN packet, the verification of which should not need control plane
   involvement.  Since the solution should be simple to implement in
   the dataplane, maintaining and verifying per-source sequence numbers
   is not the best option.

   We propose, therefore, that messages be stamped with a digest of the
   current routing configuration, i.e., a digest of the link state
   database of the link state routing protocol.  The digest has to be
   picked carefully, so that if two LSDBs describe the same
   connectivity information, their digests are identical as well, while
   different LSDBs result in different digest values with high
   probability.

   The conceptual way of handling these digests could be the following:

   o  When the LSDB changes, the IGP re-calculates the digest and
      downloads the new value to the dataplane element(s), in a secure
      way.

   o  When an FN packet is originated, the digest is put into the
      Sequence Number field of the FN message.

   o  Network nodes distribute (forward) the FN packet.

   o  When processing, the dataplane element first performs an
      authentication check of the FN packet, as described in
      [fn-transport].

   o  Finally, before processing the failure notification, the
      dataplane element checks whether its own known LSDB digest is
      identical to the one in the message.

   If, due to a failure event, a node disseminates a failure
   notification with FN, an attacker might capture the whole packet and
   re-send it later.  If it resends the packet after the IGP has re-
   converged on the new topology, the active LSDB digest is different,
   so the packet can be ignored.  If the packet is replayed to a
   recipient who still has the same LSDB digest, then the original
   failure notification was already processed but the IGP has not yet
   finished converging; the IPFRR detour is already active, so the
   replica has no impact.
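   The receive-side flow can be condensed into a few lines of
   illustrative Python (the packet attributes and the authenticate()
   and process() hooks are assumptions):

      def handle_fn(packet, local_digest, authenticate, process):
          """Authenticate first, per [fn-transport], then discard stale
          replicas whose stamped digest no longer matches our own LSDB
          digest (Section 6)."""
          if not authenticate(packet):
              return                  # fails FN transport authentication
          if packet.sequence_number != local_digest:
              return                  # replayed against a different LSDB
          process(packet)             # fresh notification: act on it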
7. Security Considerations

   The IPFRR application of Fast Notification does not raise further
   known security considerations in addition to those already present
   in Fast Notification itself. If an attacker could send false
   Failure Identification Messages or could hinder the transmission of
   legitimate messages, the network would exhibit undesired routing
   behaviour. These issues are, however, to be solved in
   [fn-transport].

   IPFRR-FN relies on the authentication mechanism provided by the
   Fast Notification transport protocol [fn-transport]. The
   specification of the FN transport protocol requires applications to
   protect against replay attacks with application-specific sequence
   numbers. This draft, therefore, describes its own proposed sequence
   number in Section 6.

8. IANA Considerations

   The Failure Identification message types need to be allocated a
   value in the FN App Type field.

   An IPFRR-FN capability needs to be allocated within the Router
   Capability TLVs of both OSPF [RFC4970] and IS-IS [RFC4971].

9. References

9.1. Normative References

   [RFC5286]
      A. Atlas, A. Zinin, "Basic Specification for IP Fast-Reroute:
      Loop-Free Alternates", RFC 5286, 2008.

   [fn-transport]
      W. Lu, S. Kini, A. Csaszar, G. Enyedi, J. Tantsura, A. Tian,
      "Transport of Fast Notification Messages", draft-lu-fn-transport
      (work in progress), 2011.

   [RFC4970]
      A. Lindem et al., "Extensions to OSPF for Advertising Optional
      Router Capabilities", RFC 4970, 2007.

   [RFC4971]
      JP. Vasseur et al., "Intermediate System to Intermediate System
      (IS-IS) Extensions for Advertising Router Information",
      RFC 4971, 2007.

   [RFC4203]
      K. Kompella, Y. Rekhter, "OSPF Extensions in Support of
      Generalized Multi-Protocol Label Switching (GMPLS)", RFC 4203,
      2005.

9.2. Informative References

   [BFD]
      D. Katz, D. Ward, "Bidirectional Forwarding Detection",
      RFC 5880, 2010.

   [RFC5714]
      M. Shand, S. Bryant, "IP Fast Reroute Framework", RFC 5714,
      2010.

   [Cev2010]
      S. Cevher, T. Chen, I. Hokelek, J. Kang, V. Kaul, Y.J. Lin, M.
      Pang, R. Rodoper, S. Samtani, C. Shah, J. Bowcock, G. B. Rucker,
      J. L. Simbol, A. Staikos, "An Integrated Soft Handoff Approach
      in IP Fast Reroute in Wireless Mobile Networks", in Proceedings
      of IEEE COMSNETS, 2011.

   [Eny2009]
      G. Enyedi, P. Szilagyi, G. Retvari, A. Csaszar, "IP Fast
      ReRoute: Lightweight Not-Via without Additional Addresses", IEEE
      INFOCOM Mini-Conference, Rio de Janeiro, Brazil, 2009.

   [FIFR]
      J. Wang, S. Nelakuditi, "IP Fast Reroute with Failure
      Inferencing", in Proceedings of the ACM SIGCOMM Workshop on
      Internet Network Management - The Five-Nines Workshop, 2007.

   [Hok2007]
      I. Hokelek, M. A. Fecko, P. Gurung, S. Samtani, J. Sucec, A.
      Staikos, J. Bowcock, Z. Zhang, "Seamless Soft Handoff in
      Wireless Battlefield Networks Using Local and Remote LFAPs", in
      Proceedings of IEEE MILCOM, 2007.
   [Hok2008]
      I. Hokelek, S. Cevher, M. A. Fecko, P. Gurung, S. Samtani, Z.
      Zhang, A. Staikos, J. Bowcock, "Testbed Implementation of
      Loop-Free Soft Handoff in Wireless Battlefield Networks", in
      Proceedings of the 26th Army Science Conference, December 1-4,
      2008.

   [MRC]
      T. Cicic, A. F. Hansen, A. Kvalbein, M. Hartmann, R. Martin, M.
      Menth, S. Gjessing, O. Lysne, "Relaxed Multiple Routing
      Configurations: IP Fast Reroute for Single and Correlated
      Failures", IEEE Transactions on Network and Service Management,
      available online: http://www3.informatik.uni-
      wuerzburg.de/staff/menth/Publications/papers/Menth08-Sub-4.pdf,
      September 2010.

   [NotVia]
      S. Bryant, M. Shand, S. Previdi, "IP Fast Reroute Using Not-via
      Addresses", draft-ietf-rtgwg-ipfrr-notvia-addresses (work in
      progress), 2010.

   [RLFAP]
      I. Hokelek, M. Fecko, P. Gurung, S. Samtani, S. Cevher, J.
      Sucec, "Loop-Free IP Fast Reroute Using Local and Remote LFAPs",
      draft-hokelek-rlfap-01 (expired), 2008.

   [Ret2011]
      G. Retvari, J. Tapolcai, G. Enyedi, A. Csaszar, "IP Fast
      ReRoute: Loop Free Alternates Revisited", in Proceedings of IEEE
      INFOCOM, 2011.

   [ISISDigest]
      J. Chiabaut, D. Fedyk, "IS-IS Multicast Synchronization Digest",
      available online: http://www.ieee802.org/1/files/public/
      docs2008/aq-fedyk-ISIS-digest-1108-v1.pdf, November 2008.

   [BGPAddPaths]
      D. Walton, A. Retana, E. Chen, J. Scudder, "Advertisement of
      Multiple Paths in BGP", draft-ietf-idr-add-paths (work in
      progress).

   [DiverseBGP]
      R. Raszuk et al., "Distribution of Diverse BGP Paths",
      draft-ietf-grow-diverse-bgp-path-dist (work in progress).

   [BGPBestExt]
      P. Marques, R. Fernando, E. Chen, P. Mohapatra, H. Gredler,
      "Advertisement of the Best External Route in BGP",
      draft-ietf-idr-best-external (work in progress).

   [BRITE]
      O. Heckmann et al., "How to Use Topology Generators to Create
      Realistic Topologies", Technical Report, December 2002.

   [MRT-ARCH]
      A. Atlas et al., "An Architecture for IP/LDP Fast-Reroute Using
      Maximally Redundant Trees",
      draft-ietf-rtgwg-mrt-frr-architecture-01 (work in progress),
      2012.

   [MRT-ALG]
      A. Atlas, G. Enyedi, A. Csaszar, "Algorithms for Computing
      Maximally Redundant Trees for IP/LDP Fast-Reroute",
      draft-enyedi-rtgwg-mrt-frr-algorithm-01 (work in progress),
      2012.

   [I-D.ietf-pwe3-redundancy]
      P. Muley, M. Aissaoui, M. Bocci, "Pseudowire Redundancy",
      draft-ietf-pwe3-redundancy (work in progress), May 2012.

   [I-D.bashandy-bgp-edge-node-frr]
      A. Bashandy, B. Pithawala, K. Patel, "Scalable BGP FRR
      Protection against Edge Node Failure",
      draft-bashandy-bgp-edge-node-frr (work in progress), March 2012.

10. Acknowledgments

   The authors would like to thank Albert Tian, Wenhu Lu, Acee Lindem
   and Ibrahim Hokelek for the continuous discussions and comments on
   the topic, as well as Joel Halpern for his comments and review.

Appendix A. Memory Needs of a Naive Implementation

   Practical experience might suggest that storing and maintaining
   backup next hops for many potential remote failures could overwhelm
   the resources of router linecards.
   This section attempts to provide a calculation describing the
   approximate memory needs in reasonably sized networks for one
   possible implementation.

A.1. An Example Implementation

   Let us suppose that, for exterior destinations, the forwarding
   engine uses recursive lookup or indirection in order to improve
   update time, as described in Section 5.3. We also suppose that the
   concept of "prefix groups" is applied, i.e., there is an internal
   entity for the prefixes using exactly the same primary and backup
   ASBRs, and the next-hop entry for such a prefix points to the next
   hop towards this entity; see, e.g., Figure 7.

   In the sequel, the term "area" refers to an extended area, made up
   of the OSPF or IS-IS area containing the router, with the prefix
   groups added to the area as virtual nodes. Naturally, a prefix
   group is connected to the egress routers (ASBRs) through which it
   can be reached. We only need to react to the failure ID of an ASBR
   for all the prefix groups connected to that ASBR; technically, we
   assume that one of the virtual links of each affected prefix group
   goes down.

   Here we show a simple, naive implementation, which real routers can
   easily improve upon. This implementation uses an array covering all
   the nodes in the area (the "node array" in the sequel), including
   both real routers and virtual nodes representing prefix groups,
   with two pointers and a length field (an integer) per record. One
   of the pointers points to another array (called the "alternative
   array"). This second array is essentially an enumeration of the IDs
   of those failures that influence a shortest path towards that node,
   each paired with an alternative neighbour that can be used when
   such a failure occurs. When a failure is detected (either locally
   or by FN), we can easily find the proper record in all the lists.
   Moreover, since these arrays can be sorted by failure ID, we can
   even use binary search to find the needed record. The length of the
   alternative array is stored in the node array record pointing to
   it.

   In addition, we only need to know which records in the FIB should
   be updated; therefore, a second pointer in the node array points to
   that record.

   +-------+-------+-------+--   --+-------+
   |  r1   |  r2   |  r3   |  ...  |  rk   |
   +-------+-------+-------+--   --+-------+
       |       |       |               |
       |       |       |               |
      \|/     \|/     \|/             \|/
       *       *       *               *
   +-------+-------+-------+--   --+-------+
   | fail1 | fail2 | fail3 |       | failk |
   | alt.1 | alt.2 | alt.3 |  ...  | alt.k |
   +-------+-------+-------+--   --+-------+
   | fail4 |       | fail5 |
   | alt.4 |       | alt.5 |
   +-------+       +-------+
   | fail6 |
   | alt.6 |
   +-------+

   Figure 10 The way of storing alternatives
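   The following minimal sketch mirrors the structure of Figure 10;
   the names NodeRecord and react_to_failure are hypothetical and
   chosen only for illustration.

   from bisect import bisect_left

   class NodeRecord:
       def __init__(self, fib_index, alternatives):
           # alternatives: list of (failure_id, alternative_neighbour)
           # pairs, kept sorted by failure ID so that lookup is
           # O(log n) via binary search.
           self.alternatives = sorted(alternatives)
           self.fib_index = fib_index   # which FIB record to rewrite

   def react_to_failure(node_array, fib, failure_id):
       # On a locally detected or FN-reported failure, switch every
       # affected destination to its precomputed alternative next hop.
       for rec in node_array:
           i = bisect_left(rec.alternatives, (failure_id,))
           if (i < len(rec.alternatives) and
                   rec.alternatives[i][0] == failure_id):
               fib[rec.fib_index] = rec.alternatives[i][1]

   In a real linecard the same layout would of course be realised as
   packed arrays in DRAM, which is what the estimates below assume.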
A.2. Estimation of Memory Requirements

   Now, suppose that there are V nodes in the extended area, the
   network diameter is D, a neighbour descriptor takes X bytes, a
   failure ID takes Y bytes, and a pointer takes Z bytes. We suppose
   that lookups for external prefixes use indirection, so we only need
   to deal with destinations inside the extended area. In this way, if
   there is no ECMP, this data structure takes

      (2*Z + Y)*(V-1) + 2*(X+Y)*D*(V-1)

   bytes altogether. The first term is the memory consumption of the
   node array. The second term is the memory needed by the alternative
   arrays: any path can contain at most D nodes and D links, each
   record needs X+Y bytes, and there are records for all the other
   nodes in the area (V-1 nodes). Observe that this is a very rough
   overestimate, since most of the possible failures influencing a
   path will not change the next hop.

   To put numbers on the memory consumption, suppose that neighbour
   descriptors, failure IDs and pointers each take 4 bytes, there are
   10,000 nodes in the extended area (real routers and virtual nodes
   representing prefix groups combined), and the network diameter is
   20 hops. In this case, the node array needs about 120KB and the
   alternative arrays need about 3.2MB, so altogether 3.4MB if there
   is no ECMP. Observe that the number of external prefixes does not
   matter.

   If, however, there are paths with equal costs, the size of the
   alternative arrays increases. Suppose that there are 10 equal-cost
   paths between ANY two nodes in the network. This would make the
   alternative lists 10 times bigger, so they would need a bit less
   than 32MB. Observe that the node array still needs only about
   120KB, so 32MB is a good overestimate, which is likely acceptable
   for modern linecards with gigabytes of DRAM. Moreover, we stress
   again that this is an extremely rough overestimate, so in reality
   much less memory will suffice. Furthermore, usually only protecting
   external prefixes is needed, in which case we only need to protect
   the paths towards the prefix groups, which further decreases both
   the size of the node array and the number of alternative lists.

A.3. Estimation of Failover Time

   After a failure is detected, either locally or via FN, the nodes
   need to change the entries in their FIBs. Here we give a rough
   estimate to show that the previous implementation can do this in at
   most a few milliseconds.

   We suppose that we have the data structure described in the
   previous section. When a failure happens, we need to decide for
   each node in the node array whether the shortest path towards that
   destination was influenced by the failure. Since the alternative
   list is sorted, we can use binary search, which needs
   ceil(log2(2D)) memory accesses in the worst case. We need one more
   access to get the node array entry and another to rewrite the FIB.

   We assume DDR3 SDRAM with a 64-byte cache line, which means that up
   to 8 entries of the alternative list can be fetched from RAM at a
   time, so the previous formula becomes ceil(log2(D/4)) + 2
   transactions per destination. In this way, for D=20 and V=10,000 we
   need (3+2)*10,000 = 50,000 transactions. If we suppose 10 ECMP
   paths as previously, the alternative lists become 10 times longer
   and we need (6+2)*10,000 = 80,000 transactions.

   We can make a very conservative estimate by assuming a recent DDR3
   SDRAM module that can perform 5 MT/s with completely random access,
   so performing 50,000 or 80,000 transactions takes 10ms or 16ms.
   Keep in mind that we assumed that there is only one memory
   controller, that the result of the search is always obtained with
   the last read, and that all the alternative lists are full.
   Moreover, internal system latencies (e.g., multiple outstanding
   memory requests) were seriously overestimated, since DDR3 SDRAM can
   reach even 6 times this speed with random access.
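   The two estimates above can be reproduced directly; the constants
   below simply mirror the example figures in the text and are not
   normative.

   from math import ceil, log2

   X = Y = Z = 4         # neighbour descriptor, failure ID, pointer
   V, D = 10_000, 20     # nodes in extended area, network diameter
   ECMP = 10

   node_array = (2*Z + Y) * (V - 1)          # ~120KB
   alt_arrays = 2 * (X + Y) * D * (V - 1)    # ~3.2MB without ECMP
   alt_ecmp   = ECMP * alt_arrays            # a bit less than 32MB

   entries_per_line = 64 // (X + Y)          # 8 entries per cache line
   lines = ECMP * 2 * D // entries_per_line  # cache lines per list
   txns  = (ceil(log2(lines)) + 2) * V       # (6+2)*10,000 = 80,000
   print(node_array, alt_arrays, alt_ecmp, txns / 5e6)  # s at 5 MT/s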
Appendix B. Impact Scope of Fast Notification

   The memory and failover time calculations presented in Appendix A
   are based on worst-case estimates: they assume that, in a network
   with a diameter of 20 hops, each failure has a route-changing
   consequence on all routers across the full diameter.

   This section provides experimental results on realistic topologies,
   showing that 100% failure coverage can already be achieved within a
   2-hop radius around the failure.

   We performed the coverage analysis of the fast reroute mechanism
   presented here on realistic topologies, which were generated by the
   BRITE topology generator in bottom-up mode [BRITE]. The coverage
   percentage is defined here as the ratio (expressed as a percentage)
   of the number of usable backup paths protecting primary paths that
   failed because of link failures to the number of all failed primary
   paths.

   The realistic topologies include AT&T and DFN, using pre-determined
   BRITE parameter values from [BRITE], and various random topologies
   with different numbers of nodes and varying network connectivity.
   For example, the numbers of nodes for AT&T and DFN are 154 and 30,
   respectively, while the number of nodes for the other random
   topologies is varied from 20 to 100. The BRITE parameters used in
   our topology generation process are summarised in Figure 11 (see
   [BRITE] for the details of each parameter). In summary, m
   represents the average number of edges per node and is set to
   either 2 or 3. A uniform bandwidth distribution in the range
   100-1024 Mbps is selected, and the link cost is obtained
   deterministically from the link bandwidth (i.e., inversely
   proportional to the link bandwidth, as used by many vendors). Since
   the values of p(add) and beta determine the number of edges in the
   generated topologies, their values are varied to obtain network
   topologies with varying connectivity (e.g., sparse and dense).

   |----------------------------|-----------------------|
   | Mode                       | Bottom up             |
   |----------------------------|-----------------------|
   | Grouping Model             | Random pick           |
   | Model                      | GLP                   |
   | Node Placement             | Random                |
   | Growth Type                | Incremental           |
   | Preferential Connectivity  | On                    |
   | BW Distribution            | Uniform               |
   | Minimum BW (Mbps)          | 100                   |
   | Maximum BW (Mbps)          | 1024                  |
   | m                          | 2-3                   |
   | Number of Nodes (N)        | 20,30,50,100,154      |
   | p(add)                     | 0.01,0.05,0.10,0.42   |
   | beta                       | 0.01,0.05,0.15,0.62   |
   |----------------------------|-----------------------|

   Figure 11 BRITE topology generator parameters

   The coverage percentage of our fast reroute method is reported for
   different network topologies (i.e., different numbers of nodes and
   varying network connectivity) using neighbourhood depths of 0, 1,
   and 2 (i.e., X=0, 1, and 2). For a particular failure, backup
   routes protecting the failed primary paths are calculated only by
   those nodes that are within the selected radius of the failure.
   These nodes are determined by the parameter X as follows: for X=0,
   the two nodes directly connected to the failed link; for X=1, the
   two nodes directly connected to the failed link plus the
   neighbouring nodes adjacent to one of the outgoing links of these
   two nodes; and so on.

   The coverage percentage for a given topology is computed by the
   following formula:

      Coverage Percentage = N_backupsexist * 100 / N_fpp

   where N_backupsexist is the number of source-destination pairs
   whose primary paths failed because of link failures and that have
   backup paths protecting these failed paths, and N_fpp is the number
   of source-destination pairs whose primary paths failed because of
   link failures. Source-destination pairs in which the source and
   destination nodes have no physical connectivity after a failure are
   excluded from N_fpp. Note that the network-wide coverage percentage
   is calculated by averaging the coverage results obtained by
   individually failing every edge of a given network topology.
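   As a sketch, the formula above transcribes directly to code; the
   helper name coverage_percentage is hypothetical, and deciding
   whether a usable backup path exists within the X-hop neighbourhood
   of a failure is the simulation step described in the text.

   def coverage_percentage(backup_exists):
       # backup_exists: one boolean per source-destination pair whose
       # primary path failed and that remains physically connected;
       # True if a usable backup path was found within the radius.
       outcomes = list(backup_exists)
       n_fpp = len(outcomes)            # all failed primary paths
       n_backupsexist = sum(outcomes)   # those with a usable backup
       return 100.0 * n_backupsexist / n_fpp if n_fpp else 100.0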
   Figure 12 shows the coverage percentage results for random
   topologies with different numbers of nodes (N) and different
   network connectivity, and Figure 13 shows these results for the
   AT&T and DFN topologies. In these figures, E_mean represents the
   average number of edges per node for a given topology. Note that
   the average number of edges per node is determined by the
   parameters m, p(add), and beta; we observed that E_mean increases
   when the p(add) and beta values increase. For each configuration,
   the coverage analysis is repeated over 10 topologies generated
   randomly using the same BRITE parameters. E_mean and the coverage
   percentage are obtained by averaging the results of these ten
   experiments.

   |------------|-----|------|------|------|------|
   | Case       |N    |E_mean|X=0   |X=1   |X=2   |
   |------------|-----|------|------|------|------|
   |p(add)=0.01 |20   |3.64  |82.39 |98.85 |100.0 |
   |beta=0.01   |50   |3.86  |82.10 |98.69 |100.0 |
   |            |100  |3.98  |83.21 |98.04 |100.0 |
   |------------|-----|------|------|------|------|
   |p(add)=0.05 |20   |3.70  |85.60 |99.14 |100.0 |
   |beta=0.05   |50   |4.01  |84.17 |99.09 |100.0 |
   |            |100  |4.08  |83.35 |98.01 |100.0 |
   |------------|-----|------|------|------|------|
   |p(add)=0.1  |20   |5.52  |93.24 |100.0 |100.0 |
   |beta=0.15   |50   |6.21  |91.46 |99.87 |100.0 |
   |            |100  |6.39  |91.17 |99.86 |100.0 |
   |------------|-----|------|------|------|------|

   Figure 12 Coverage percentage results for random topologies

   |------------|-----------|------|------|------|------|
   | Case       |N          |E_mean|X=0   |X=1   |X=2   |
   |------------|-----------|------|------|------|------|
   |p(add)=0.42 |154 (AT&T) |6.88  |91.04 |99.81 |100.0 |
   |beta=0.62   |30 (DFN)   |8.32  |93.76 |100.0 |100.0 |
   |------------|-----------|------|------|------|------|

   Figure 13 Coverage percentage results for AT&T and DFN topologies

   There are two main observations from these results:

   1. As the neighbourhood depth (X) increases, the coverage
   percentage increases, and complete coverage is obtained with a low
   neighbourhood depth value (i.e., X=2). This result is significant,
   since the failure notification message needs to be sent only to
   nodes that are two hops away from the point of failure to achieve
   complete coverage. It supports the claim that our method provides
   fast convergence while introducing minimal signalling overhead,
   confined to the two-hop neighbourhood.

   2. Topologies with higher connectivity (i.e., higher E_mean values)
   have better coverage than topologies with lower connectivity (i.e.,
   lower E_mean values).
   This is an intuitive result, since the number of possible alternate
   next hops in dense network topologies is higher than in sparse
   topologies, which increases the likelihood of finding backup paths
   and therefore the coverage percentage.

Authors' Addresses

   Andras Csaszar
   Ericsson
   Irinyi J utca 4-10, Budapest, Hungary, 1117
   Email: Andras.Csaszar@ericsson.com

   Gabor Sandor Enyedi
   Ericsson
   Irinyi J utca 4-10, Budapest, Hungary, 1117
   Email: Gabor.Sandor.Enyedi@ericsson.com

   Jeff Tantsura
   Ericsson
   300 Holger Way, San Jose, CA 95134
   Email: jeff.tantsura@ericsson.com

   Sriganesh Kini
   Ericsson
   300 Holger Way, San Jose, CA 95134
   Email: sriganesh.kini@ericsson.com

   John Sucec
   Telcordia Technologies
   One Telcordia Drive, Piscataway, NJ 08854
   Email: sucecj@telcordia.com

   Subir Das
   Telcordia Technologies
   One Telcordia Drive, Piscataway, NJ 08854
   Email: sdas2@telcordia.com