Network Working Group                                     O. Bonaventure
Internet-Draft                                               P. Francois
Intended status: Experimental                                  D. Saucez
Expires: January 7, 2010                                       UCLouvain
                                                            July 6, 2009


      Preserving the reachability of LISP ETRs in case of failures
                 draft-bonaventure-lisp-preserve-00.txt

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79. This document may not be modified,
   and derivative works of it may not be created, and it may not be
   published except as an Internet-Draft.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.
   It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on January 7, 2010.

Copyright Notice

   Copyright (c) 2009 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents in effect on the date of
   publication of this document (http://trustee.ietf.org/license-info).
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.

Abstract

   Maintaining reachability of an EID prefix despite the failures of
   ETRs is a key concern in the LISP architecture. In this document, we
   first analyse this problem in comparison with traditional routing
   protocols. Then, we explain how Internet Service Providers could
   offer a service that preserves the reachability of the LISP ETRs of
   their customers in case of failures.

Table of Contents

   1. Introduction
   2. Using anycast to preserve reachability of EID prefixes in
      case of failure
   3. Rewriting to preserve the reachability of EID prefixes
     3.1. Rewriting interface
     3.2. Link and ETR failures
     3.3. PE failures
   4. Protocol issues
     4.1. Verifying the reachability of ETRs
     4.2. Advertising the backup ETR
     4.3. Destination RLOC rewriting
       4.3.1. Which packets should be rewritten?
       4.3.2. After a failure, for how long should packets be
              rewritten?
   5. Security Considerations
   6. Conclusion
   7. Acknowledgements
   8. References
     8.1. Normative References
     8.2. Informative References
   Authors' Addresses

1. Introduction

   Measurements performed in ISP networks indicate that link and node
   failures are frequent events [FAILURES][BGPFRR]. Fortunately, most
   of these failures have a short duration. However, the more and more
   stringent Service Level Agreements (SLAs) requested by users of IP
   networks have forced researchers and router vendors to develop
   various kinds of fast reroute techniques that allow a network to
   quickly recover after a node or link failure [RFC4090]
   [I-D.ietf-rtgwg-ipfrr-framework] [RECOVERY].

   The Locator/Identifier Separation Protocol (LISP) [I-D.ietf-lisp] is
   being developed within the LISP working group of the IETF. LISP
   relies on two principles. First, Endpoint Identifiers (EIDs) are
   allocated to hosts while Routing Locators (RLOCs) are allocated to
   LISP Ingress/Egress Tunnel Routers (xTRs). The EIDs are not directly
   routable on the global Internet; only the RLOCs are routable.
   Second, LISP relies on map-and-encap. Hosts are located on sites
   and are served by xTRs. When host A.1 in site A needs to send a
   packet to host B.2 in site B, its packet is intercepted by the
   Ingress Tunnel Router (ITR) that serves its site.
   This ITR will query a mapping system to find the RLOC of the Egress
   Tunnel Router (ETR) that serves EID B.2. Once the RLOC of the ETR
   serving B's site is known, the ITR will encapsulate the packet using
   the encapsulation defined in [I-D.ietf-lisp] so that it can reach
   B's ETR. B's ETR will decapsulate the packet and forward it to host
   B.2.

   Recovery in case of failures is also one of the problems being
   discussed within the LISP working group. More precisely, the working
   group is working on techniques to verify the reachability of the
   destination ETRs for a given EID prefix. The current draft,
   [I-D.ietf-lisp], uses several locator reachability bits in the header
   of all data encapsulated packets to allow an ITR to indicate to a
   remote ETR the xTRs on the ITR's site that are known to be reachable
   and unreachable. For another discussion of the reachability problem,
   see [I-D.meyer-loc-id-implications].

   This reachability problem can be better understood by comparing it
   with the operation of traditional routing protocols in the network
   shown in Figure 1. In this picture, the stars indicate domain
   boundaries.

        +----+      +----+      +----+
        | R1 |------| R2 |------| R3 |
        +----+      +----+      +----+
          |           |            |
   *******|************************|*******
          |           |            |
        +----+      +----+      +----+
        | R6 |------| R5 |------| R4 |
        +----+      +----+      +----+
          |                        |
   *******|************************|*******
          |                        |
        +----+                  +----+
        | E1 |                  | E2 |
        +----+                  +----+
           \                      /
            \                    /
              ==================
                   Prefix P

                      Figure 1: A simple network

   Figure 1 shows a simple network with 8 routers and one LAN containing
   a single prefix P. With traditional routing protocols, the prefix P
   will be advertised by both E1 and E2 via BGP. If E1 and E2 are up, P
   will be reachable via both routers. If E1 (resp. E2) fails, then all
   the packets destined to P will be sent via E2 (resp. E1).
   In such a network, the reachability of P is maintained despite the
   failures of E1 or E2 because:

   o  routers E1 and E2 send messages about the reachability of P in the
      entire network

   o  all routers of the network have an entry for prefix P inside their
      Forwarding Information Base (FIB)

                     +----+
                     |ITR1|
                     +----+
                        |
   ******************|*****************
      +----+      +----+      +----+
      | R1 |------| R2 |------| R3 |
      +----+      +----+      +----+
        |            |           |
        |            |           |
      +----+      +----+      +----+
      | R6 |------| R5 |------| R4 |
      +----+      +----+      +----+
        |                        |
   *****|************************|*****
      +----+                  +----+
      |ETR1|                  |ETR2|
      +----+                  +----+
         \                      /
          \                    /
            ==================
               EID Prefix P

             Figure 2: A simple network with LISP routers

   Now, let us assume that E1 and E2 are LISP ETRs (ETR1 and ETR2 in
   Figure 2) and that P is an EID prefix. We also add an ITR connected
   to R2 as shown in Figure 2. Since the networks of Figure 1 and
   Figure 2 have essentially the same topology, they should be able to
   maintain reachability even in case of failures. Unfortunately, there
   are several important differences:

   1. The routers are managed by three different autonomous entities
      and different IGPs are used: one for R1-R6, another one for ETR1
      and ETR2 and a third for the network that contains ITR1. Three
      different routing protocols are used and only aggregated RLOCs
      are advertised across the boundaries represented by stars in the
      figure.

   2. The packets sent towards EID prefix P are encapsulated in packets
      destined to ETR1 or ETR2. There is no entry for prefix P in the
      FIB of routers R1-R6. ITR1 has one entry for P inside its LISP
      mapping cache. Only ETR1 and ETR2 can directly reach EID prefix
      P.

   We assume that the middle network uses an IGP to advertise the
   reachability of all the routers (R1-R6) and of the directly attached
   customers (i.e. ITR1, ETR1 and ETR2). This is a very common design.
   For the routers R1-R6, ETR1, ETR2 and ITR1 are different RLOCs and
   none of these routers is aware of the fact that LISP data
   encapsulated packets sent to ETR1 can also be sent to ETR2.

   The network of Figure 2 is sufficiently redundant to preserve the
   reachability of EID prefix P in case of the failure of ETR1, ETR2, R6
   or R4. Let us analyse how LISP would react to the failures of ETR1
   and R6; the failures of ETR2 and R4 are symmetric:

   o  Failure of ETR1. In this case, ETR2 can notice the failure by
      either having an iBGP or BFD session with ETR1 or participating in
      the same IGP. Once ETR2 has detected the failure of ETR1, it
      changes its locator reachability bits so that ITR1 is also
      informed and can redirect the packets destined to EID prefix P via
      ETR2. The time required to inform ITR1 will depend on both the
      local failure detection time and the current packet transmission
      rate between ETR2 and ITR1. This only works, of course, if
      traffic is bidirectional.

   o  Failure of R6. To detect such failures, since ETR1 does not
      participate in the ISP's IGP, it needs to use a mechanism to
      verify that its upstream router is alive. This can be achieved
      for example by having a BGP session between ETR1 and R6, possibly
      coupled with a fast failure detection mechanism such as BFD
      [I-D.ietf-bfd-base]. Once ETR1 has detected the failure of R6, it
      must inform ETR2. The method used to inform ETR2 is not specified
      by LISP, but is important from a deployment viewpoint. For
      example, ETR1 could withdraw the default route learned from R6
      from the site's IGP. ETR2 can then update the loc-reach bits of
      the LISP encapsulated packets that it sends. ITR1 will stop
      sending LISP data encapsulated packets to ETR1 as soon as it has
      received the updated loc-reach bits.

   In practice, the time required to detect and recover from such
   failures can be longer than a round-trip time.
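
   The use of locator reachability bits described in the two failure
   cases above can be illustrated with a small Python sketch. It is
   only an illustration under stated assumptions: the bit ordering
   (bit i stands for the i-th locator of the mapping), the locator
   names and the function names are hypothetical and are not taken
   from [I-D.ietf-lisp].

```python
from typing import List, Optional

# Ordered locator set that ITR1 caches for EID prefix P (hypothetical).
LOCATORS = ["ETR1", "ETR2"]


def reachable_locators(loc_reach_bits: int, locators: List[str]) -> List[str]:
    """Return the locators whose loc-reach bit is set.

    Assumption of this sketch: bit i of loc_reach_bits describes locators[i].
    """
    return [loc for i, loc in enumerate(locators) if (loc_reach_bits >> i) & 1]


def select_etr(loc_reach_bits: int, locators: List[str]) -> Optional[str]:
    """Pick the first locator advertised as reachable, or None if there is none."""
    up = reachable_locators(loc_reach_bits, locators)
    return up[0] if up else None


# Both ETRs are up: ITR1 keeps sending to ETR1.
print(select_etr(0b11, LOCATORS))
# ETR2 has cleared the bit of ETR1 in the packets it sends to ITR1.
print(select_etr(0b10, LOCATORS))
```

   In this sketch ITR1 only changes its choice when it receives a
   packet carrying the updated bits, which is why the recovery time
   depends on the packet rate between ETR2 and ITR1.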
   It would be desirable in some environments to have a shorter
   recovery time. Unfortunately, the classical techniques [RECOVERY]
   deployed in IP and MPLS networks are not directly applicable to
   preserve the reachability of the EIDs behind the unreachable ETR.

   In this document, we first analyse several solutions based on anycast
   that can be used by an ISP to preserve the reachability to LISP ETRs
   in case of failures and discuss their advantages and drawbacks.
   Then, we propose a rewriting technique that can be deployed by ISPs
   to ensure that the EIDs of their customers remain reachable even
   when some of their LISP ETRs are unreachable.

2. Using anycast to preserve reachability of EID prefixes in case of
   failure

   A first possible approach to preserve the reachability of EID
   prefixes in case of link or node failures in the service provider
   network to which the ETR is attached is to use anycast routing. The
   figure below shows a simplified network using the terminology used by
   BGP/MPLS VPNs [RFC2547]. The ISP network contains three Provider (P)
   routers, three Provider Edge (PE) routers and two LISP ETRs. The two
   LISP ETRs are responsible for the same EID prefix P.

       +----+      +----+      +----+
       | P1 |------| P3 |------| P2 |
       +----+      +----+      +----+
         |           |            |
         |           |            |
       +----+      +----+      +----+
       | PE1|------| PE3|------| PE2|
       +----+      +----+      +----+
         |                        |
   ******|************************|******
       +----+                  +----+
       |ETR1|                  |ETR2|
       +----+                  +----+
          \                      /
           \                    /
             ==================
                EID Prefix P

               Figure 3: A simple network with two ETRs

   A first solution to ensure that ETR2 remains reachable when ETR1
   becomes unreachable is to use an anycast address for the RLOC used by
   both ETR1 and ETR2. For example, with IPv4 a single anycast /32
   would be allocated to both ETR1 and ETR2.
   This solution clearly ensures that all LISP data encapsulated
   packets will reach an ETR attached to EID prefix P as long as either
   ETR remains reachable. However, it has several important drawbacks:

   o  As ETR1 and ETR2 use the same anycast address, the site cannot
      engineer the incoming traffic toward EID prefix P by tuning its
      mapping replies.

   o  Anycast cannot be used if ETR1 and ETR2 are attached to two
      different ISPs. Unfortunately, it can be expected that owners of
      sites will often attach their ETRs to different ISP networks to
      have technical and economical redundancy. Anycast could probably
      be used if ETR1 and ETR2 were located in the same IGP area (often
      equivalent to the same POP in large ISP networks).

   To allow a site to continue to engineer its incoming traffic, an
   alternative could be to use two anycast addresses as RLOCs for the
   site's ETRs. PE1 (resp. PE2) would advertise in the ISP's IGP two
   addresses for ETR1 (resp. ETR2): ETR1's RLOC (resp. ETR2's RLOC)
   with a low IGP distance and ETR2's RLOC (resp. ETR1's RLOC) with a
   very high IGP distance. With those advertisements, ETR1 and ETR2 are
   both used when they are up. If ETR1 becomes unreachable, the
   provider's IGP will converge and all packets sent to its RLOC will be
   automatically rerouted to ETR2 which also supports the same RLOC.
   Unfortunately, this solution has the following drawbacks:

   o  It increases the size of the IGP, especially when ETR1 and ETR2
      are not in the same POP/area.

   o  It cannot be used when ETR1 and ETR2 are attached to two different
      ISPs.

   For these reasons, anycast cannot be considered as a technique that
   totally fulfills the role of preserving the reachability of
   multihomed EID prefixes.

3. Rewriting to preserve the reachability of EID prefixes

   To preserve the reachability of EID prefixes in case of failures of
   either the link or the router that connects an ETR to its provider,
   we need to ensure that the packets destined to the RLOC of an ETR
   that became unreachable can be rerouted efficiently by routers in the
   provider's network. We consider three reference environments where
   our solution must be applicable:

   o  A network where the two ETRs are attached to the same POP of one
      ISP

   o  A network where the two ETRs are attached to different POPs of the
      same ISP

   o  A network where the two ETRs are attached to different ISPs

   The most general case is the third one. In the remainder of this
   section, we will mainly discuss the topology shown in Figure 4.

   A solution to preserve the reachability of these ETRs in case of
   link/router failures must be applicable to these three deployment
   scenarios. We consider two different types of failures:

   o  Failure of the link between an ETR and its PE router, such as
      PE1-E1 in Figure 4. From the viewpoint of the ISP network, the
      failure of a link between a PE and an ETR is equivalent to the
      failure of the ETR itself.

   o  Failure of the PE router to which an ETR is attached, such as PE1
      in Figure 4. In this case, all the ETRs attached to the PE router
      become unreachable.

                     Internet
                    /         \
               ISP1             ISP2
             /       |             |
        +----+      +----+      +----+
        | P1 |------| P2 |      | P3 |
        +----+      +----+      +----+
          |           |            |
          |           |            |
        +----+      +----+      +----+
        | PE1|------| PE2|      | PE3|
        +----+      +----+      +----+
          |                        |
          |                        |
        +----+                  +----+
        | E1 |                  | E2 |
        +----+                  +----+
           \                      /
            \                    /
              ==================
                   Prefix P
            -- POP1 --        -- POP3 --

   Figure 4: A network with two LISP ETRs attached to different ISPs

3.1. Rewriting interface

   Our technique to preserve the reachability of EID prefixes despite
   link and node failures relies on a new type of virtual interface that
   we call a rewriting interface. Besides real physical interfaces,
   routers often have virtual interfaces such as tunnel interfaces.
   When the nexthop of a packet is a tunnel interface, this packet is
   encapsulated and the encapsulated packet is sent towards the tunnel
   destination.

   A rewriting virtual interface is configured with:

   o  a primary address

   o  one or more alternate addresses

   A rewriting interface can only be used by packets whose destination
   address is equal to the primary address of the rewriting interface.
   When such a packet is to be forwarded by the rewriting interface, its
   destination address is replaced by one of the alternate addresses
   known for this interface. Of course, the IP and UDP checksums of the
   rewritten packets are updated. When selecting an alternate address,
   the router should prefer an alternate address that it knows (e.g.
   based on its own routing table or thanks to other information) to be
   reachable. The rewritten packet is then forwarded towards its new
   destination.

   Instead of using a rewriting interface, another solution could have
   been to encapsulate the packet destined to the failed address towards
   the alternate. However, using a second level of encapsulation would
   likely cause MTU problems. For this reason, we chose to rewrite part
   of the LISP header. From an implementation viewpoint, rewriting part
   of a LISP header is similar to the operation performed by a Network
   Address Translator. Given the current interest in carrier-grade NAT,
   it can be expected that efficient hardware-based NAT implementations
   will appear.

   The operation of the rewriting interface is discussed in more detail
   in Section 4.3.

3.2. Link and ETR failures

   In this section, we describe informally the principle of our
   solution. The details are discussed later. To maintain reachability
   of an EID prefix when the link between one of its ETRs and the
   associated PE fails, we propose to install a rewriting interface on
   the upstream PE. Consider for example Figure 4 and assume that E1 is
   the ETR whose reachability needs to be preserved. This can be
   achieved as follows:

   o  PE1 is configured with a rewriting interface having E1's RLOC as
      primary address and E2's RLOC as alternate address. A static
      route for this rewriting interface is configured on PE1, but this
      route has a high administrative distance so that the route is not
      installed in the FIB when E1 is up.

   o  When the link between PE1 and E1 fails, PE1's rewriting interface
      is still up. Thus, PE1 continues to announce E1's RLOC as being
      reachable in the IGP. This ensures that packets destined to E1
      still reach PE1. However, the rewriting interface replaces the
      physical interface as the nexthop for E1 in PE1's FIB.

   o  When a LISP data encapsulated packet destined to E1 arrives while
      E1 is unreachable, PE1 forwards this packet over its rewriting
      interface. This interface rewrites the destination RLOC of this
      LISP data encapsulated packet with E2's RLOC as destination
      address and the packet is forwarded to E2.

   o  When E1 becomes reachable again, the physical interface towards E1
      replaces the rewriting interface as the nexthop for E1 in PE1's
      FIB and the rewriting stops. Rewriting could also stop by
      removing the rewriting interface, e.g. after the expiration of a
      timer.

   It should be noted that this solution is purely local on the PE
   router attached to the ETR responsible for the EID prefix whose
   reachability must be preserved in case of failures. No additional
   prefix needs to be advertised in the IGP.
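
   The behaviour of PE1 described in this section can be sketched in
   Python as follows. This is a minimal illustration, not a router
   implementation. The class and function names, the RLOC values
   (taken from documentation address ranges) and the administrative
   distance values are assumptions of this sketch. Falling back to the
   first alternate when none is known to be reachable is also an
   assumption, and the checksum updates are omitted.

```python
from dataclasses import dataclass, replace
from ipaddress import IPv4Address
from typing import List, Optional, Set, Tuple

E1_RLOC = IPv4Address("192.0.2.1")     # hypothetical RLOC of E1
E2_RLOC = IPv4Address("198.51.100.2")  # hypothetical RLOC of E2

CONNECTED_DISTANCE = 0         # route over the physical link to E1
STATIC_REWRITE_DISTANCE = 250  # static route to the rewriting interface


@dataclass(frozen=True)
class Packet:
    dst: IPv4Address  # outer destination address, i.e. the destination RLOC
    payload: bytes = b""


@dataclass
class RewritingInterface:
    primary: IPv4Address
    alternates: List[IPv4Address]

    def rewrite(self, pkt: Packet,
                reachable: Set[IPv4Address]) -> Optional[Packet]:
        """Return the rewritten packet, or None if the interface cannot be used."""
        if pkt.dst != self.primary or not self.alternates:
            return None
        # Prefer an alternate known to be reachable; otherwise fall back to
        # the first configured alternate (an assumption of this sketch).
        known = [a for a in self.alternates if a in reachable]
        new_dst = known[0] if known else self.alternates[0]
        # A real router would also update the IP and UDP checksums here.
        return replace(pkt, dst=new_dst)


def next_hop_for_e1(link_to_e1_up: bool) -> str:
    """Select the FIB next hop for E1's RLOC: lowest administrative distance."""
    candidates: List[Tuple[int, str]] = [(STATIC_REWRITE_DISTANCE, "rewriting")]
    if link_to_e1_up:
        candidates.append((CONNECTED_DISTANCE, "physical"))
    return min(candidates)[1]


def pe1_forward(pkt: Packet, iface: RewritingInterface, link_to_e1_up: bool,
                reachable: Set[IPv4Address]) -> Optional[Packet]:
    """Forward a packet on PE1, rewriting it only while E1 is unreachable."""
    if pkt.dst == iface.primary and next_hop_for_e1(link_to_e1_up) == "rewriting":
        return iface.rewrite(pkt, reachable)
    return pkt


iface = RewritingInterface(primary=E1_RLOC, alternates=[E2_RLOC])
pkt = Packet(dst=E1_RLOC)
print(pe1_forward(pkt, iface, True, {E2_RLOC}).dst)   # link up: unchanged
print(pe1_forward(pkt, iface, False, {E2_RLOC}).dst)  # link down: rewritten
```

   The sketch keeps the static route permanently in the candidate list
   and lets the lower distance of the connected route hide it, which
   mirrors the administrative distance arrangement described above.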
   This solution therefore raises no scalability issues for the IGP.

3.3. PE failures

   To maintain reachability of an EID prefix when the PE attached to one
   ETR fails, we cannot use the solution described above as the PE is
   not reachable anymore. To solve this problem, we introduce a
   rewriting PE. A rewriting PE is a PE router that is configured with
   a rewriting interface whose primary address is the address of an ETR
   attached to another PE router. The rewriting PE will usually be
   located in the same POP as the PE that must be protected. For
   example, let us consider the failure of PE1 in Figure 4 and assume
   that PE2 is the rewriting PE:

   o  PE2 is configured with one rewriting interface having:

      *  E1's RLOC as primary address

      *  E2's RLOC as alternate address

   o  E1's RLOC is advertised as an anycast address by both PE1 and
      PE2, the latter acting as a rewriting router. PE2's advertisement
      has a high IGP distance such that PE1's advertisement is always
      preferred inside the ISP network. Furthermore, the rewriting
      interface has a high administrative distance and thus PE2 does
      not install a FIB entry towards this rewriting interface.

   o  When PE1 becomes unreachable, the IGP converges and PE2 becomes
      the only router that advertises E1's RLOC. It thus receives all
      packets destined to E1's RLOC. These packets are rewritten by the
      rewriting interface and forwarded to E2's RLOC.

   o  When PE1 comes back, it readvertises the reachability of E1's
      RLOC. PE2 prefers PE1's advertisement and stops receiving
      packets destined to E1's RLOC.

4. Protocol issues

   In this section, we discuss in more detail the protocols and
   mechanisms that are required to implement the solution described
   informally in the previous section. We first discuss how a PE can
   verify the reachability of ETRs.
   Then we discuss how a rewriting router can learn the rewriting
   address that it should use when an ETR becomes unreachable. Finally
   we explain how the RLOC of the unreachable ETR needs to be rewritten
   and propose a small change to the LISP header for this.

4.1. Verifying the reachability of ETRs

   The first router that needs to detect the unreachability of a LISP
   ETR is the PE router directly connected to it. Several mechanisms
   can be used to detect this unreachability: physical-layer
   information (if available), BFD, or a single-hop eBGP session
   established between the PE and the ETR. No prefix will be advertised
   by the ETR on this eBGP session, but the PE may advertise a default
   route or its full BGP (RLOC) routing table.

   However, the rewriting PE router could also need to verify the
   reachability of the alternate ETR, i.e. the ETR that owns the RLOC
   towards which it will rewrite packets if the primary ETR becomes
   unreachable due to the failure of its attached PE. This is
   especially important when the rewriting PE knows several alternate
   ETR routers. If it only knows a single alternate ETR and the primary
   fails, the only solution is to rewrite the packets towards the only
   alternate ETR. This alternate ETR can be located in the same POP, in
   another POP or in another ISP. Thus, the rewriting PE cannot always
   rely on its routing table to verify the reachability of such a
   distant ETR.

   To allow a PE to know which of the alternate addresses for a given
   primary address are alive, we propose to use multihop eBGP sessions
   to distribute the reachability information of each ETR. Reachability
   information could be distributed as follows:

   o  Each LISP site, containing at least one EID prefix and several
      ETRs, is allocated a unique route target.

   o  Each ETR has a single-hop BGP session with its attached PE router.
      On this eBGP session, the ETR advertises only its own RLOC with
      the allocated route target.

   o  The PE routers and the routers with rewriting interfaces are part
      of an iBGP mesh (e.g. based on route reflectors) where the routes
      received from the ETRs are distributed with their route target.

   o  The route reflectors of different ASes that host LISP ETRs can
      exchange the routes received from their ETRs by using multihop
      eBGP sessions.

   o  A rewriting router only needs to receive reachability information
      for alternate addresses that it supports. This can be achieved by
      requesting in the iBGP mesh all the routes with a list of route
      targets.

   The next version of this document will analyse this problem in more
   detail.

4.2. Advertising the backup ETR

   In the previous section, we have assumed that the PE and the
   rewriting router were configured with several pieces of information.
   Such a manual configuration is possible, but in practice it would be
   useful to allow some of these routers to automatically learn some of
   this information. For example, it would be useful for a PE router to
   learn automatically the backup RLOCs to be used in case of failure of
   one of its directly attached ETRs. This can be achieved by either:

   o  developing a new protocol to advertise these backup RLOCs to be
      rewritten

   o  using BGP and defining a new address family that allows BGP to
      carry this kind of information

   o  extending the Map-Request/Map-Reply and allowing the PE to query
      the ETR for its alternate ETR

   The next version of this document will analyse in more detail the
   advantages and drawbacks of each of these approaches.

4.3. Destination RLOC rewriting

   Our solution rewrites the destination RLOC of LISP packets once the
   destination of this packet has been found unreachable. This
   rewriting raises several questions as discussed in the following
   sections.

4.3.1. Which packets should be rewritten?

   A LISP ETR will receive different types of packets and we need to
   define which packets should be rewritten by the rewriting router.
   LISP encapsulated data packets should be rewritten. However, we need
   to ensure that when multiple failures occur LISP encapsulated data
   packets do not loop between rewriting routers. This can be achieved
   by reserving one bit in the LISP header, called the Deflection (D)
   bit. When an ITR sends a data encapsulated packet, it sets the D bit
   to false. When a rewriting router receives a LISP data encapsulated
   packet with the D bit set to false, it can rewrite the destination
   address of the packet, setting the D bit to true when it does so. If
   the D bit is already set to true, the packet must be dropped. LISP
   control packets, i.e. Map-Request and Map-Reply packets, do not need
   to be rewritten as they are targeted at the ETR itself and not at
   hosts behind the ETR. Non-LISP packets destined to the ETR do not
   need to be rewritten either.

   Upon reception of packets with the D bit set, the ETR knows that the
   packets have been deflected by upstream routers, likely due to an
   upstream failure. This ETR will soon detect the failure by other
   means (e.g. the primary ETR stops advertising its default route in
   the site's IGP).

4.3.2. After a failure, for how long should packets be rewritten?

   In theory, the ITR which is sending packets to the ETR could have
   learned the mapping up to TTL minutes ago, where TTL is the mapping
   lifetime. Thus, the rewriting entry should remain in the rewriting
   router for a duration at least equal to the lifetime of the mapping
   entries if we do not want to lose encapsulated packets. With a
   default mapping lifetime of 24 hours, this duration can be large. In
   practice however, most of the failures have a short duration and the
   ETR will become reachable again well before the expiration of the
   lifetime of its mapping entries.

5. Security Considerations

   To be written once the details of the protocols have been specified.

6. Conclusion

   In this document, we have first compared the LISP reachability
   problem with the traditional reachability problem with routing
   protocols. We have then shown the drawbacks of using anycast to
   preserve the reachability of LISP ETRs in case of failures. Then, we
   have proposed to allow PE routers to rewrite the destination address
   of LISP encapsulated packets to preserve the reachability of the EID
   prefix in case of failure of one of the responsible ETRs. Further
   work is required to define the protocols and mechanisms that are
   necessary to allow ISPs to preserve the reachability of the ETRs of
   their customers.

7. Acknowledgements

   We would like to thank Dave Meyer for his comments on the first
   version of this draft. This work was partially supported by a Cisco
   URP grant.

8. References

8.1. Normative References

   [RFC4090]  Pan, P., Swallow, G., and A. Atlas, "Fast Reroute
              Extensions to RSVP-TE for LSP Tunnels", RFC 4090,
              May 2005.

8.2. Informative References

   [BGPFRR]   Bonaventure, O., Filsfils, C., and P. Francois,
              "Achieving Sub-50 Milliseconds Recovery Upon BGP Peering
              Link Failures", CoNEXT 2005.

   [FAILURES]
              Markopoulou, A., Iannaccone, G., Bhattacharyya, S.,
              Chuah, C., and C. Diot, "Characterization of Failures in
              an IP Backbone", INFOCOM 2004.

   [I-D.ietf-bfd-base]
              Katz, D. and D. Ward, "Bidirectional Forwarding
              Detection", draft-ietf-bfd-base-09 (work in progress),
              February 2009.

   [I-D.ietf-bfd-multihop]
              Katz, D. and D. Ward, "BFD for Multihop Paths",
              draft-ietf-bfd-multihop-07 (work in progress),
              February 2009.

   [I-D.ietf-lisp]
              Farinacci, D., Fuller, V., Meyer, D., and D. Lewis,
              "Locator/ID Separation Protocol (LISP)",
              draft-ietf-lisp-01 (work in progress), May 2009.

   [I-D.ietf-rtgwg-ipfrr-framework]
              Shand, M. and S. Bryant, "IP Fast Reroute Framework",
              draft-ietf-rtgwg-ipfrr-framework-10 (work in progress),
              February 2009.

   [I-D.meyer-loc-id-implications]
              Meyer, D. and D. Lewis, "Architectural Implications of
              Locator/ID Separation", draft-meyer-loc-id-implications-01
              (work in progress), January 2009.

   [RECOVERY]
              Vasseur, J., Demeester, P., and M. Pickavet, "Network
              Recovery: Protection and Restoration of Optical,
              SONET-SDH, IP, and MPLS", Elsevier Science & Technology
              Books 2004.

   [RFC2547]  Rosen, E. and Y. Rekhter, "BGP/MPLS VPNs", RFC 2547,
              March 1999.

   [RFC4984]  Meyer, D., Zhang, L., and K. Fall, "Report from the IAB
              Workshop on Routing and Addressing", RFC 4984,
              September 2007.

Authors' Addresses

   Olivier Bonaventure
   UCLouvain
   Universite catholique de Louvain, Place Sainte Barbe 2
   Louvain-la-Neuve, 1348
   Belgium

   Email: olivier.bonaventure@uclouvain.be
   URI:   http://inl.info.ucl.ac.be

   Pierre Francois
   UCLouvain
   Universite catholique de Louvain, Place Sainte Barbe 2
   Louvain-la-Neuve, 1348
   Belgium

   Email: pierre.francois@uclouvain.be
   URI:   http://inl.info.ucl.ac.be

   Damien Saucez
   UCLouvain
   Universite catholique de Louvain, Place Sainte Barbe 2
   Louvain-la-Neuve, 1348
   Belgium

   Email: damien.saucez@uclouvain.be
   URI:   http://inl.info.ucl.ac.be