idnits 2.17.1 

draft-bonaventure-lisp-preserve-00.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** The document seems to lack a License Notice according IETF Trust
     Provisions of 28 Dec 2009, Section 6.b.ii or Provisions of 12 Sep 2009
     Section 6.b -- however, there's a paragraph with a matching beginning.
     Boilerplate error?

     (You're using the IETF Trust Provisions' Section 6.b License Notice from
     12 Feb 2009 rather than one of the newer Notices.  See
     https://trustee.ietf.org/license-info/.)

  -- The document has an IETF Trust Provisions (28 Dec 2009) Section 6.c(ii)
     Publication Limitation clause.  If this document is intended for
     submission to the IESG for publication, this constitutes an error.


  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack an IANA Considerations section.  (See Section
     2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
     when there are no actions for IANA.)


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (July 6, 2009) is 5402 days in the past.  Is this
     intentional?


  Checking references for intended status: Experimental
  ----------------------------------------------------------------------------

  == Unused Reference: 'I-D.ietf-bfd-multihop' is defined on line 665, but no
     explicit reference was found in the text

  == Unused Reference: 'RFC4984' is defined on line 694, but no explicit
     reference was found in the text

  == Outdated reference: A later version (-11) exists of
     draft-ietf-bfd-base-09

  == Outdated reference: A later version (-09) exists of
     draft-ietf-bfd-multihop-07

  == Outdated reference: A later version (-24) exists of draft-ietf-lisp-01

  == Outdated reference: A later version (-13) exists of
     draft-ietf-rtgwg-ipfrr-framework-10

  -- Obsolete informational reference (is this intentional?): RFC 2547
     (Obsoleted by RFC 4364)


     Summary: 2 errors (**), 0 flaws (~~), 7 warnings (==), 3 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	Network Working Group                                     O. Bonaventure
3	Internet-Draft                                               P. Francois
4	Intended status: Experimental                                  D. Saucez
5	Expires: January 7, 2010                                       UCLouvain
6	                                                            July 6, 2009

8	      Preserving the reachability of LISP ETRs in case of failures
9	                 draft-bonaventure-lisp-preserve-00.txt

11	Status of this Memo

13	   This Internet-Draft is submitted to IETF in full conformance with the
14	   provisions of BCP 78 and BCP 79.  This document may not be modified,
15	   and derivative works of it may not be created, and it may not be
16	   published except as an Internet-Draft.

18	   Internet-Drafts are working documents of the Internet Engineering
19	   Task Force (IETF), its areas, and its working groups.  Note that
20	   other groups may also distribute working documents as Internet-
21	   Drafts.

23	   Internet-Drafts are draft documents valid for a maximum of six months
24	   and may be updated, replaced, or obsoleted by other documents at any
25	   time.  It is inappropriate to use Internet-Drafts as reference
26	   material or to cite them other than as "work in progress."

28	   The list of current Internet-Drafts can be accessed at
29	   http://www.ietf.org/ietf/1id-abstracts.txt.

31	   The list of Internet-Draft Shadow Directories can be accessed at
32	   http://www.ietf.org/shadow.html.

34	   This Internet-Draft will expire on January 7, 2010.

36	Copyright Notice

38	   Copyright (c) 2009 IETF Trust and the persons identified as the
39	   document authors.  All rights reserved.

41	   This document is subject to BCP 78 and the IETF Trust's Legal
42	   Provisions Relating to IETF Documents in effect on the date of
43	   publication of this document (http://trustee.ietf.org/license-info).
44	   Please review these documents carefully, as they describe your rights
45	   and restrictions with respect to this document.

47	Abstract

49	   Maintaining reachability of an EID prefix despite the failures of
50	   ETRs is a key concern in the LISP architecture.  In this document, we
51	   first analyse this problem in comparison with traditional routing
52	   protocols.  Then, we explain how Internet Service Providers could
53	   offer a service that preserves the reachability of the LISP ETRs of
54	   their customers in case of failures.

56	Table of Contents

58	   1.  Introduction
59	   2.  Using anycast to preserve reachability of EID prefixes in
60	       case of failure
61	   3.  Rewriting to preserve the reachability of EID prefixes
62	     3.1.  Rewriting interface
63	     3.2.  Link and ETR failures
64	     3.3.  PE failures
65	   4.  Protocol issues
66	     4.1.  Verifying the reachability of ETRs
67	     4.2.  Advertising the backup ETR
68	     4.3.  Destination RLOC rewriting
69	       4.3.1.  Which packets should be rewritten ?
70	       4.3.2.  After a failure, for how long should packets be
71	               rewritten ?
72	   5.  Security Considerations
73	   6.  Conclusion
74	   7.  Acknowledgements
75	   8.  References
76	     8.1.  Normative References
77	     8.2.  Informative References
78	   Authors' Addresses

80	1.  Introduction

82	   Measurements performed in ISP networks indicate that link and node
83	   failures are frequent events [FAILURES][BGPFRR].  Fortunately, most
84	   of these failures have a short duration.  However, the increasingly
85	   stringent Service Level Agreements (SLAs) requested by users of IP
86	   networks have forced researchers and router vendors to develop
87	   various kinds of fast reroute techniques that allow a network to
88	   quickly recover after a node or link failure [RFC4090]
89	   [I-D.ietf-rtgwg-ipfrr-framework] [RECOVERY].

91	   The Locator/Identifier Separation Protocol (LISP) [I-D.ietf-lisp] is
92	   being developed within the LISP working group of the IETF.  LISP
93	   relies on two principles.  First, Endpoint Identifiers (EIDs) are
94	   allocated to hosts while Routing Locators (RLOCs) are allocated to
95	   LISP Ingress/Egress Tunnel Routers (xTRs).  The EIDs are not directly
96	   routable on the global Internet; only the RLOCs are routable.
97	   Second, LISP relies on map and encaps.  Hosts are located on sites
98	   and are served by xTRs.  When host A.1 in site A needs to send a
99	   packet to host B.2 in site B, its packet is intercepted by the
100	   Ingress Tunnel Router (ITR) that serves its site.  This ITR will
101	   query a mapping system to find the RLOC of the Egress Tunnel Router
102	   (ETR) that serves EID B.2.  Once the RLOC of the ETR serving B's site
103	   is known, the ITR will encapsulate the packet using the encapsulation
104	   defined in [I-D.ietf-lisp] so that it can reach B's ETR.  B's ETR
105	   will decapsulate the packet and forward it to host B.2.
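
   The map-and-encap steps described above can be sketched as follows.
   This is only an illustration under simplifying assumptions: the
   mapping system is modelled as a plain dictionary, the class and
   field names (Itr, EncapsulatedPacket, map_cache) are hypothetical,
   and no real LISP header is built.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class EncapsulatedPacket:
    outer_src: str  # RLOC of the ITR
    outer_dst: str  # RLOC of the selected ETR
    inner_src: str  # source EID
    inner_dst: str  # destination EID
    payload: bytes


class Itr:
    def __init__(self, rloc, mapping_system):
        self.rloc = rloc
        # Hypothetical stand-in for the mapping system: EID prefix -> RLOCs.
        self.mapping_system = {
            ipaddress.ip_network(p): list(r) for p, r in mapping_system.items()
        }
        self.map_cache = {}

    def lookup(self, dst_eid):
        addr = ipaddress.ip_address(dst_eid)
        cached = [p for p in self.map_cache if addr in p]
        if not cached:
            # Cache miss: "query" the mapping system and cache the answer.
            for prefix, rlocs in self.mapping_system.items():
                if addr in prefix:
                    self.map_cache[prefix] = rlocs
            cached = [p for p in self.map_cache if addr in p]
        if not cached:
            return None
        # Longest-prefix match among the cached EID prefixes.
        return self.map_cache[max(cached, key=lambda p: p.prefixlen)]

    def encapsulate(self, src_eid, dst_eid, payload):
        rlocs = self.lookup(dst_eid)
        if not rlocs:
            return None  # no mapping known for this EID
        # The first RLOC is used here; real ITRs apply priorities and weights.
        return EncapsulatedPacket(self.rloc, rlocs[0], src_eid, dst_eid, payload)
```

   With a mapping of an EID prefix to two RLOCs, encapsulate() returns
   a packet whose outer destination is the first RLOC of the mapping,
   and the map cache then holds a single entry for that prefix.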

107	   Recovery in case of failures is also one of the problems being
108	   discussed within the LISP working group.  More precisely, the working
109	   group is working on techniques to verify the reachability of the
110	   destination ETRs for a given EID prefix.  The current draft,
111	   [I-D.ietf-lisp], uses several locator reachability bits in the header
112	   of all data encapsulated packets to allow an ITR to indicate to a
113	   remote ETR the xTRs on the ITR's site that are known to be reachable
114	   and unreachable.  For another discussion of the reachability problem,
115	   see [I-D.meyer-loc-id-implications].

117	   This reachability problem can be better understood by comparing it
118	   with the operation of traditional routing protocols in the network
119	   shown in Figure 1.  In this picture, the stars indicate domain
120	   boundaries.

122	                       +----+      +----+      +----+
123	                       | R1 |------| R2 |------| R3 |
124	                       +----+      +----+      +----+
125	                         |            |           |
126	                  *******|************************|*******
127	                         |            |           |
128	                       +----+      +----+      +----+
129	                       | R6 |------| R5 |------| R4 |
130	                       +----+      +----+      +----+
131	                         |                        |
132	                  *******|************************|*******
133	                         |                        |
134	                       +----+                  +----+
135	                       | E1 |                  | E2 |
136	                       +----+                  +----+
137	                             \                 /
138	                               \             /
139	                             ==================
140	                                    Prefix P

142	                        Figure 1: A simple network

144	   Figure 1 shows a simple network with 8 routers and one LAN containing
145	   a single prefix P. With traditional routing protocols, the prefix P
146	   will be advertised by both E1 and E2 via BGP.  If E1 and E2 are up, P
147	   will be reachable via both routers.  If E1 (resp. E2) fails, then all
148	   the packets destined to P will be sent via E2 (resp. E1).  In such a
149	   network, the reachability of P is maintained despite the failures of
150	   E1 or E2 because :

152	   o  routers E1 and E2 send messages about the reachability of P in the
153	      entire network

155	   o  all routers of the network have an entry for prefix P inside their
156	      Forwarding Information Base (FIB)
157	                                    +----+
158	                                    |ITR1|
159	                                    +----+
160	                                       |
161	                     ******************|*****************
162	                        +----+      +----+      +----+
163	                        | R1 |------| R2 |------| R3 |
164	                        +----+      +----+      +----+
165	                          |            |           |
166	                          |            |           |
167	                        +----+      +----+      +----+
168	                        | R6 |------| R5 |------| R4 |
169	                        +----+      +----+      +----+
170	                          |                        |
171	                     *****|************************|*****
172	                        +----+                  +----+
173	                        |ETR1|                  |ETR2|
174	                        +----+                  +----+
175	                              \                 /
176	                                \             /
177	                              ==================
178	                                   EID Prefix P

180	               Figure 2: A simple network with LISP routers

182	   Now, let us assume that E1 and E2 are LISP ETRs and that P is an EID
183	   prefix.  We also add an ITR connected to R2 as shown in Figure 2.
184	   Since both the network of Figure 1 and of Figure 2 have the same
185	   topology, they should be able to maintain reachability even in case
186	   of failures.  Unfortunately, there are several important differences
187	   :

189	   1.  The routers are managed by three different autonomous entities
190	       and different IGPs are used : one for R1-R6, another one for ETR1
191	       and ETR2 and a third for the network that contains ITR1.  Three
192	       different routing protocols are used and only aggregated RLOCs
193	       are advertised across the boundaries represented by stars in the
194	       figure.

196	   2.  The packets sent towards EID prefix P are encapsulated in packets
197	       destined to ETR1 or ETR2.  There is no entry for prefix P in the
198	       FIB of routers R1-R6.  ITR1 has one entry for P inside its LISP
199	       mapping cache.  Only ETR1 and ETR2 can directly reach EID prefix
200	       P.

202	   We assume that the middle network uses an IGP to advertise the
203	   reachability of all the routers (R1-R6) and of the directly attached
204	   customers (i.e.  ITR1, ETR1 and ETR2).  This is a very common design.
205	   For the routers R1-R6, ETR1, ETR2 and ITR1 are different RLOCs and
206	   none of these routers is aware of the fact that LISP data
207	   encapsulated packets sent to ETR1 can also be sent to ETR2.

209	   The network of Figure 2 is sufficiently redundant to preserve the
210	   reachability of EID prefix P in case of the failure of ETR1, ETR2, R6
211	   or R4.  Let us analyse how LISP would react to these four failures :

213	   o  Failure of ETR1.  In this case, ETR2 can notice the failure by
214	      either having an iBGP or BFD session with ETR1 or participating in
215	      the same IGP.  Once ETR2 has detected the failure of ETR1, it
216	      changes its locator reachability bits so that ITR1 is also
217	      informed and can redirect the packets destined to EID prefix P via
218	      ETR2.  The time required to inform ITR1 will depend on both the
219	      local failure detection time and the current packet transmission
220	      rate between ETR2 and ITR1.  This only works, of course, if
221	      traffic is bidirectional.

223	   o  Failure of R6.  To detect such failures, since ETR1 does not
224	      participate in the ISP's IGP, it needs to use a mechanism to
225	      verify that its upstream router is alive.  This can be achieved
226	      for example by having a BGP session between ETR1 and R6 possibly
227	      coupled with a fast failure detection mechanism such as BFD
228	      [I-D.ietf-bfd-base].  Once ETR1 has detected the failure of R6, it
229	      must inform ETR2.  The method used to inform ETR2 is not specified
230	      by LISP, but is important from a deployment viewpoint.  For
231	      example, ETR1 could withdraw the default route learned from R6
232	      from the site's IGP.  ETR2 can then update the loc-reach bits of
233	      the LISP encapsulated packets that it sends.  ITR1 will stop
234	      sending LISP data encapsulated packets to ETR1 as soon as it has
235	      received the updated loc-reach bits.
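
   The loc-reach update described in the two cases above can be
   illustrated with the following sketch.  It models the locator
   reachability bits as an integer bitmap.  The bit ordering (bit i
   stands for the i-th RLOC of the mapping) and the function names are
   assumptions made for this example; they are not the wire format of
   [I-D.ietf-lisp].

```python
def set_reachable(bits, index, reachable):
    """Return the loc-reach bitmap with the bit for one locator updated."""
    if reachable:
        return bits | (1 << index)
    return bits & ~(1 << index)


def usable_rlocs(rlocs, bits):
    """RLOCs that the ITR may still use, given the last bitmap it received."""
    return [rloc for i, rloc in enumerate(rlocs) if bits & (1 << i)]


# Site with two locators, both initially reachable.
SITE_RLOCS = ["ETR1", "ETR2"]
bits_sent_by_etr2 = 0b11

# ETR2 learns that ETR1 is unreachable and clears the matching bit in
# the packets it sends; ITR1 then stops using ETR1.
bits_sent_by_etr2 = set_reachable(bits_sent_by_etr2, 0, False)
remaining = usable_rlocs(SITE_RLOCS, bits_sent_by_etr2)
```

   Note that ITR1 only learns the new bitmap when ETR2 sends it a
   packet, which is why the recovery time depends on the packet rate
   from ETR2 to ITR1.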

237	   In practice, the time required to detect and recover from such
238	   failures can be longer than a round-trip-time.  It would be desirable
239	   in some environments to have a shorter recovery time.  Unfortunately,
240	   the classical techniques [RECOVERY] deployed in IP and MPLS networks
241	   are not directly applicable to preserve the reachability of the EIDs
242	   behind the unreachable ETR.

244	   In this document, we first analyse several solutions based on anycast
245	   that can be used by an ISP to preserve the reachability to LISP ETRs
246	   in case of failures and discuss their advantages and drawbacks.
247	   Then, we propose a rewriting technique that can be deployed by ISPs
248	   to ensure that the EIDs of their customers remain reachable even
249	   when some of their LISP ETRs are unreachable.

251	2.  Using anycast to preserve reachability of EID prefixes in case of
252	    failure

254	   A first possible approach to preserve the reachability of EID
255	   prefixes in case of link or node failures in the service provider
256	   network to which the ETR is attached is to use anycast routing.  The
257	   figure below shows a simplified network using the terminology used by
258	   BGP/MPLS VPNs [RFC2547].  The ISP network contains three Provider (P)
259	   routers, three Provider Edge (PE) routers and two LISP ETRs.  The two
260	   LISP ETRs are responsible for the same EID prefix P.

262	                        +----+      +----+      +----+
263	                        | P1 |------| P3 |------| P2 |
264	                        +----+      +----+      +----+
265	                          |            |           |
266	                          |            |           |
267	                        +----+      +----+      +----+
268	                        | PE1|------| PE3|------| PE2|
269	                        +----+      +----+      +----+
270	                          |                        |
271	                    ******|************************|******
272	                        +----+                  +----+
273	                        |ETR1|                  |ETR2|
274	                        +----+                  +----+
275	                              \                 /
276	                                \             /
277	                              ==================
278	                                  EID Prefix P

280	                 Figure 3: A simple network with two ETRs

282	   A first solution to ensure that ETR2 remains reachable when ETR1
283	   becomes unreachable is to use an anycast address for the RLOC used by
284	   both ETR1 and ETR2.  For example, with IPv4 a single anycast /32
285	   would be allocated to both ETR1 and ETR2.  This solution clearly
286	   ensures that all LISP data encapsulated packets will reach an ETR
287	   attached to EID prefix P as long as either ETR remains reachable.
288	   However, it has several important drawbacks :

290	   o  As ETR1 and ETR2 use the same anycast address, the site cannot
291	      engineer the incoming traffic toward EID prefix P by tuning its
292	      mapping replies.

294	   o  Anycast cannot be used if ETR1 and ETR2 are attached to two
295	      different ISPs.  Unfortunately, it can be expected that owners of
296	      sites will often attach their ETRs to different ISP networks to
297	      have technical and economic redundancy.  Anycast could probably
298	      be used if ETR1 and ETR2 were located in the same IGP area (often
299	      equivalent to the same POP in large ISP networks).

301	   To allow a site to continue to engineer its incoming traffic, an
302	   alternative could be to use two anycast addresses as RLOCs for the
303	   site's ETRs.  PE1 (resp. PE2) would advertise in the ISP's IGP two
304	   addresses for ETR1 (resp. ETR2) : ETR1's RLOC (resp. ETR2's RLOC)
305	   with a low IGP distance and ETR2's RLOC (resp. ETR1's RLOC) with a
306	   very high IGP distance.  With those advertisements, ETR1 and ETR2 are
307	   both used when they are up.  If ETR1 becomes unreachable, the
308	   provider's IGP will converge and all packets sent to its RLOC will be
309	   automatically rerouted to ETR2 which also supports the same RLOC.
310	   Unfortunately, this solution has the following drawbacks :

312	   o  It increases the size of the IGP, especially when ETR1 and ETR2
313	      are not in the same POP/area.

315	   o  It cannot be used when ETR1 and ETR2 are attached to two different
316	      ISPs.

318	   For these reasons, anycast cannot be considered as a technique that
319	   totally fulfills the role of preserving the reachability of
320	   multihomed EID prefixes.
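
   The behaviour of the two-anycast-address variant discussed in this
   section can be sketched as below.  The sketch is a simplification:
   it ignores the internal IGP topology and compares only the distance
   attached to each advertisement.  The distance values (10 and 10000)
   and the names Advertisement and next_hop_pe are assumptions chosen
   for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Advertisement:
    rloc: str        # advertised anycast RLOC
    advertiser: str  # PE router injecting the RLOC in the IGP
    distance: int    # IGP distance attached to the advertisement


def next_hop_pe(ads, rloc, advertising_pes):
    """PE preferred by the IGP for `rloc`, among PEs still advertising."""
    candidates = [
        ad for ad in ads
        if ad.rloc == rloc and ad.advertiser in advertising_pes
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda ad: ad.distance).advertiser


ADS = [
    Advertisement("RLOC-ETR1", "PE1", 10),      # primary path to ETR1
    Advertisement("RLOC-ETR2", "PE1", 10000),   # backup for ETR2's RLOC
    Advertisement("RLOC-ETR2", "PE2", 10),      # primary path to ETR2
    Advertisement("RLOC-ETR1", "PE2", 10000),   # backup for ETR1's RLOC
]
```

   While both PEs advertise, each RLOC is reached via its own PE.  Once
   PE1 withdraws its advertisements because ETR1 is unreachable,
   packets sent to ETR1's RLOC follow PE2's high-distance advertisement
   and reach ETR2.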

322	3.  Rewriting to preserve the reachability of EID prefixes

324	   To preserve the reachability of EID prefixes in case of failures of
325	   either the link or the router that connects an ETR to its provider,
326	   we need to ensure that the packets destined to the RLOC of an ETR
327	   that became unreachable can be rerouted efficiently by routers in the
328	   provider's network.  We consider three reference environments where
329	   our solution must be applicable :

331	   o  A network where the two ETRs are attached to the same POP of one
332	      ISP

334	   o  A network where the two ETRs are attached to different POPs of the
335	      same ISP

337	   o  A network where the two ETRs are attached to different ISPs

339	   The more general case is the third one.  In the remainder of this
340	   section, we will mainly discuss the topology shown in Figure 4.

342	   A solution to preserve the reachability of these ETRs in case of
343	   link/router failures must be applicable to these three deployment
344	   scenarios.  We consider two different types of failures :

346	   o  Failure of the link between an ETR and its PE router, such as
347	      PE1-E1 in Figure 4.  From the viewpoint of the ISP network, the
348	      failure of a link between a PE and an ETR is equivalent to the
349	      failure of the ETR itself.

351	   o  Failure of the PE router to which an ETR is attached, such as PE1
352	      in Figure 4.  In this case, all the ETRs attached to the PE router
353	      become unreachable.

355	                                        Internet
356	                                      /          \
357	                                  ISP1            ISP2
358	                              /         |           |
359	                         +----+      +----+      +----+
360	                         | P1 |------| P2 |      | P3 |
361	                         +----+      +----+      +----+
362	                           |            |           |
363	                           |            |           |
364	                         +----+      +----+      +----+
365	                         | PE1|------| PE2|      | PE3|
366	                         +----+      +----+      +----+
367	                           |                        |
368	                           |                        |
369	                         +----+                  +----+
370	                         | E1 |                  | E2 |
371	                         +----+                  +----+
372	                               \                 /
373	                                 \             /
374	                               ==================
375	                                      Prefix P
376	                           -- POP1 --          -- POP3 --

378	     Figure 4: A network with two LISP ETRs attached to different ISPs

380	3.1.  Rewriting interface

382	   Our technique to preserve the reachability of EID prefixes despite
383	   link and node failures relies on a new type of virtual interface that
384	   we call a rewriting interface.  Besides real physical interfaces,
385	   routers often have virtual interfaces such as tunnel interfaces.
386	   When the nexthop of a packet is a tunnel interface, this packet is
387	   encapsulated and the encapsulated packet is sent towards the tunnel
388	   destination.

390	   A rewriting virtual interface is configured with :

392	   o  a primary address

394	   o  a (set of) alternate addresses

396	   A rewriting interface can only be used by packets whose destination
397	   address is equal to the primary address of the rewriting interface.
398	   When such a packet is to be forwarded by the rewriting interface, its
399	   destination address is replaced by one of the alternate addresses
400	   known for this interface.  Of course, the IP and UDP checksums of the
401	   rewritten packets are updated.  When selecting an alternate address,
402	   the router should prefer an alternate address that it knows (e.g.
403	   based on its own routing table or thanks to other information) to be
404	   reachable.  The rewritten packet is then forwarded towards its new
405	   destination.
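
   A minimal sketch of such a rewriting interface is given below.  It
   only handles the outer IPv4 header, assumes a 20-byte header without
   options, and recomputes the IPv4 header checksum after replacing the
   destination address.  Updating the UDP checksum is left out.  The
   class and method names are hypothetical, and falling back to the
   first alternate when none is known to be reachable is an assumption
   of this example.

```python
import socket
import struct


def ipv4_checksum(header: bytes) -> int:
    """One's-complement checksum over the 16-bit words of the header."""
    total = 0
    for (word,) in struct.iter_unpack("!H", header):
        total += word
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


class RewritingInterface:
    def __init__(self, primary, alternates):
        self.primary = primary              # address accepted by the interface
        self.alternates = list(alternates)  # candidate replacement addresses

    def select_alternate(self, reachable):
        # Prefer an alternate that is known to be reachable.
        for alt in self.alternates:
            if alt in reachable:
                return alt
        # Assumption: otherwise fall back to the first configured alternate.
        return self.alternates[0] if self.alternates else None

    def rewrite(self, header: bytes, reachable) -> bytes:
        if len(header) != 20:
            raise ValueError("sketch only handles headers without options")
        if socket.inet_ntoa(header[16:20]) != self.primary:
            raise ValueError("destination is not the primary address")
        alt = self.select_alternate(reachable)
        if alt is None:
            raise ValueError("no alternate address configured")
        new = bytearray(header)
        new[16:20] = socket.inet_aton(alt)  # replace the destination address
        new[10:12] = b"\x00\x00"            # zero the checksum field
        struct.pack_into("!H", new, 10, ipv4_checksum(bytes(new)))
        return bytes(new)
```

   A router would more likely update the checksum incrementally rather
   than recompute it over the whole header; the full recomputation is
   used here only to keep the example short.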

407	   Instead of using a rewriting interface, another solution could have
408	   been to encapsulate the packet destined to the failed address towards
409	   the alternate.  However, using a second level of encapsulation would
410	   likely cause MTU problems.  For this reason, we chose to rewrite part
411	   of the LISP header.  From an implementation viewpoint, rewriting part
412	   of a LISP header is similar to the operation performed by a Network
413	   Address Translator.  Given the current interest in carrier-grade NAT,
414	   it can be expected that efficient hardware-based NAT implementations
415	   will appear.

417	   The operation of the rewriting interface is discussed in more detail
418	   in Section 4.3.

420	3.2.  Link and ETR failures

422	   In this section, we describe informally the principle of our
423	   solution.  The details are discussed later.  To maintain reachability
424	   of an EID prefix when the link between one of its ETRs and the
425	   associated PE fails, we propose to install a rewriting interface on
426	   the upstream PE.  Consider for example Figure 4 and assume that E1 is
427	   the ETR whose reachability needs to be preserved.  This can be
428	   achieved as follows :

430	   o  PE1 is configured with a rewriting interface having E1's RLOC as
431	      primary address and E2's RLOC as alternate address.  A static
432	      route for this rewriting interface is configured on PE1, but this
433	      route has a high administrative distance so that the route is not
434	      installed in the FIB when E1 is up.

436	   o  When the link between PE1 and E1 fails, PE1's rewriting interface
437	      is still up.  Thus, PE1 continues to announce E1's RLOC as being
438	      reachable in the IGP.  This ensures that packets destined to E1
439	      still reach PE1.  However, the rewriting interface replaces the
440	      physical interface as the nexthop for E1 in PE1's FIB.

442	   o  When a LISP data encapsulated packet destined to E1 arrives while
443	      E1 is unreachable, PE1 forwards this packet over its rewriting
444	      interface.  This interface rewrites the destination RLOC of this
445	      LISP data encapsulated packet with E2's RLOC as destination
446	      address and the packet is forwarded to E2.

448	   o  When E1 becomes reachable again, the physical interface towards E1
449	      replaces the rewriting interface as the nexthop for E1 in PE1's
450	      FIB and the rewriting stops.  Rewriting could also stop by
451	      removing the rewriting interface e.g. after the expiration of a
452	      timer.
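
   The route selection performed on PE1 in the steps above can be
   sketched as follows.  Among the routes whose outgoing interface is
   up, the one with the lowest administrative distance is installed in
   the FIB.  The interface names and the distance values (0 for the
   connected route, 250 for the static route via the rewriting
   interface) are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    prefix: str          # destination, here E1's RLOC
    interface: str       # outgoing (physical or rewriting) interface
    admin_distance: int  # lower values are preferred


def install_fib(routes, up_interfaces):
    """Keep, per prefix, the usable route with the lowest distance."""
    fib = {}
    for route in routes:
        if route.interface not in up_interfaces:
            continue  # routes via a down interface are not candidates
        best = fib.get(route.prefix)
        if best is None or route.admin_distance < best.admin_distance:
            fib[route.prefix] = route
    return fib


PE1_ROUTES = [
    Route("E1-RLOC/32", "link-to-E1", 0),   # connected route towards E1
    Route("E1-RLOC/32", "rewrite0", 250),   # static route, rewriting interface
]
```

   As long as the link to E1 is up, the FIB entry points to the
   physical interface.  When that link fails, the only remaining
   candidate is the route via the rewriting interface, which is always
   up.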

454	   It should be noted that this solution is purely local on the PE
455	   router attached to the ETR responsible for the EID prefix whose
456	   reachability must be preserved in case of failures.  No additional
457	   prefix needs to be advertised in the IGP.  Thus, there are no
458	   scalability issues with this solution.

460	3.3.  PE failures

462	   To maintain reachability of an EID prefix when the PE attached to one
463	   ETR fails, we cannot use the solution described above as the PE is
464	   not reachable anymore.  To solve this problem, we introduce a
465	   rewriting PE.  A rewriting PE is a PE router that is configured with
466	   a rewriting interface whose primary address is the address of an ETR
467	   attached to another PE router.  The rewriting PE will usually be
468	   located in the same POP as the PE that must be protected.  For
469	   example, let us consider the failure of PE1 in Figure 4 and assume
470	   that PE2 is the rewriting PE :

472	   o  PE2 is configured with one rewriting interface having :

474	      *  E1's RLOC as primary address

476	      *  E2's RLOC as alternate address

478	   o  E1's RLOC is advertised as an anycast address by both PE1 and PE2,
479	      which acts as a rewriting router.  PE2's advertisement has a high
480	      IGP distance such that PE1's advertisement is always preferred
481	      inside the ISP network.  Furthermore, the rewriting interface has
482	      a high administrative distance and thus PE2 does not install a FIB
483	      entry towards this rewriting interface.

485	   o  When PE1 becomes unreachable, the IGP converges and PE2 becomes
486	      the only router that advertises E1's RLOC.  It thus receives all
487	      packets destined to E1's RLOC.  These packets are rewritten by the
488	      rewriting interface and forwarded to E2's RLOC.

490	   o  When PE1 comes back, it readvertises the reachability of E1's
491	      RLOC.  PE2 prefers PE1's advertisement and stops receiving
492	      packets destined to E1's RLOC.

494	4.  Protocol issues

496	   In this section, we discuss in more detail the protocols and
497	   mechanisms that are required to implement the solution described
498	   informally in the previous section.  We first discuss how a PE can
499	   verify the reachability of ETRs.  Then we discuss how a rewriting
500	   router can learn the rewriting address that it should use when an ETR
501	   becomes unreachable.  Finally we explain how the RLOC of the
502	   unreachable ETR needs to be rewritten and propose a small change to
503	   the LISP header for this.

505	4.1.  Verifying the reachability of ETRs

507	   The first router that needs to detect the unreachability of a LISP
508	   ETR is the PE router directly connected to it.  Several mechanisms
509	   can be used to detect this unreachability : physical layer
510	   information (if available), BFD or a single hop eBGP session could be
511	   established between the PE and the ETR.  No prefix will be advertised
512	   by the ETR on this eBGP session, but the PE may advertise a default
513	   route or its full BGP (RLOC) routing table.

515	   However, the rewriting PE router could also need to verify the
516	   reachability of the ETR that owns the RLOC that it will rewrite if
517	   the primary ETR becomes unreachable due to the failure of its
518	   attached PE.  This is especially important when the rewriting PE
519	   knows several alternate ETR routers.  If it only knows a single
520	   alternate ETR and the primary fails, the only solution is to rewrite
521	   the packets towards the only alternate ETR.  This alternate ETR can
522	   be located in the same POP, in another POP or in another ISP.  Thus,
523	   the rewriting PE cannot always rely on its routing table to verify
524	   the reachability of such a distant ETR.

526	   To allow a PE to know which of the alternate addresses for a given
527	   primary address are alive, we propose to use multihop eBGP sessions
528	   to distribute the reachability information of each ETR.  Reachability
529	   information could be distributed as follows :

531	   o  Each LISP site, containing at least one EID prefix and several
532	      ETRs, is allocated a unique route target.

534	   o  Each ETR has a single-hop BGP session with its attached PE router.
535	      On this eBGP session, the ETR advertises only its own RLOC with
536	      the allocated route target.

538	   o  The PE routers and the routers with rewriting interfaces are part
539	      of an iBGP mesh (e.g. based on route reflectors) where the routes
540	      received from the ETRs are distributed with their route target.

542	   o  The route reflectors of different ASes that host LISP ETRs can
543	      exchange the routes received from their ETRs by using multihop
544	      eBGP sessions.

546	   o  A rewriting router only needs to receive reachability information
547	      for alternate addresses that it supports.  This can be achieved by
548	      requesting from the iBGP mesh only the routes that carry one of a
549	      configured list of route targets.
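
   The filtering described in the last item could be sketched as
   follows.  This is only an illustration: the RlocRoute type, the
   function name and the route target values are hypothetical, and a
   real deployment would rely on the filtering mechanisms of the BGP
   implementation rather than on code of this kind.

```python
from dataclasses import dataclass
from typing import Iterable, Set


@dataclass(frozen=True)
class RlocRoute:
    """One route learned from an ETR (hypothetical representation)."""
    rloc: str          # RLOC advertised by the ETR
    route_target: str  # route target allocated to the ETR's LISP site


def alive_rlocs_for(routes: Iterable[RlocRoute],
                    wanted_targets: Set[str]) -> Set[str]:
    """Keep only the RLOCs whose route target the rewriting router requested."""
    return {r.rloc for r in routes if r.route_target in wanted_targets}


# Illustrative values only.
routes = [
    RlocRoute("192.0.2.1", "64500:1"),
    RlocRoute("198.51.100.1", "64500:2"),
]
site_one_rlocs = alive_rlocs_for(routes, {"64500:1"})
```

   An RLOC whose route is withdrawn no longer appears in the result,
   so the rewriting router stops considering it as an alive alternate.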

551	   The next version of this document will analyse this problem in more
552	   detail.

554	4.2.  Advertising the backup ETR

556	   In the previous section, we have assumed that the PE and the
557	   rewriting router were configured with several pieces of information.
558	   Such manual configuration is possible, but in practice it would be
559	   useful to allow some of these routers to automatically learn some of
560	   this information.  For example, it would be useful for a PE router to
561	   learn automatically the backup RLOCs to be used in case of failure of
562	   one of its directly attached ETRs.  This can be achieved by:

564	   o  developing a new protocol to advertise these backup RLOCs to be
565	      rewritten

567	   o  using BGP and defining a new address family that allows BGP to
568	      carry this kind of information

570	   o  extending the Map-Request/Map-Reply messages to allow the PE to
571	      query the ETR for its alternate ETRs

573	   The next version of this document will analyse in more detail the
574	   advantages and drawbacks of each of these three approaches.

576	4.3.  Destination RLOC rewriting

578	   Our solution rewrites the destination RLOC of a LISP packet once the
579	   destination of this packet has been found unreachable.  This
580	   rewriting raises several questions, which are discussed in the
581	   following sections.

583	4.3.1.  Which packets should be rewritten ?

585	   A LISP ETR will receive different types of packets and we need to
586	   define which packets should be rewritten by the rewriting router.
587	   LISP encapsulated data packets should be rewritten.  However, we need
588	   to ensure that, when multiple failures occur, these packets do not
589	   loop between rewriting routers.  This can be achieved by reserving
590	   one bit in the LISP header, called the Deflection (D) bit.  When an
591	   ITR sends a LISP encapsulated data packet, it sets the D bit to
592	   false.  When a rewriting router receives such a packet with the D bit
593	   set to false, it can rewrite the destination address of the packet
594	   and set the D bit to true.  If the D bit is already set to true, the
595	   packet must be dropped.  LISP control packets, i.e., Map-Request and
596	   Map-Reply packets, do not need to be rewritten as they are targeted
597	   at the ETR itself and not at hosts behind the ETR.  Non-LISP packets
598	   destined to the ETR do not need to be rewritten either.

600	   Upon reception of packets with the D bit set, the alternate ETR knows
601	   that the packets have been deflected by upstream routers, likely due
602	   to an upstream failure.  This ETR will soon detect the failure by
603	   other means (e.g. the primary ETR stops advertising its default route
604	   in the site's IGP).
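
   The rules of this section could be sketched as follows for a packet
   whose destination RLOC has been found unreachable.  The Packet type
   and the deflect function are hypothetical names introduced for this
   sketch.  The sketch also assumes, in line with the ETR behaviour
   described above, that the rewriting router sets the D bit when it
   rewrites a packet.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Packet:
    """Minimal, hypothetical view of a packet at a rewriting router."""
    dest_rloc: str
    is_lisp_data: bool   # True for LISP encapsulated data packets
    d_bit: bool = False  # Deflection bit; an ITR sends it set to false


def deflect(packet: Packet, alternate_rloc: str) -> Optional[Packet]:
    """Apply the rewriting rule to a packet whose destination RLOC failed.

    Returns the rewritten packet, or None when the packet is not
    rewritten (LISP control or non-LISP traffic) or must be dropped
    (it has already been deflected once).
    """
    if not packet.is_lisp_data:
        return None  # Map-Request, Map-Reply and non-LISP packets
    if packet.d_bit:
        return None  # already deflected: drop to avoid loops
    # Assumption: the D bit is set at the time of rewriting.
    return replace(packet, dest_rloc=alternate_rloc, d_bit=True)
```

   A packet deflected once and then meeting a second failure is
   dropped by the second rewriting router, which bounds the number of
   deflections to one per packet.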

606	4.3.2.  After a failure, for how long should packets be rewritten ?

608	   In theory, the ITR which is sending packets to the ETR could have
609	   learned the mapping up to TTL minutes ago if TTL is the mapping
610	   lifetime.  Thus, the rewriting entry should remain in the rewriting
611	   router for a duration at least equal to the lifetime of the mapping
612	   entries if we do not want to lose encapsulated packets.  With a
613	   default mapping lifetime of 24 hours, this duration can be large.  In
614	   practice however, most of the failures have a short duration and the
615	   ETR will become reachable again well before the expiration of the
616	   lifetime of its mapping entries.
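
   The worst-case duration discussed above can be made concrete with a
   short sketch.  The function names and the use of seconds as the
   time unit are assumptions made for this illustration; the 24-hour
   value is the default mapping lifetime mentioned above.

```python
DEFAULT_MAPPING_TTL_MINUTES = 24 * 60  # default mapping lifetime (24 hours)


def rewriting_entry_expiry(failure_time_s: float,
                           mapping_ttl_minutes: int = DEFAULT_MAPPING_TTL_MINUTES) -> float:
    """Earliest time, in seconds, at which the entry could be removed.

    An ITR may have learned the mapping just before the failure, so it
    may keep using it for a full mapping lifetime after the failure.
    """
    return failure_time_s + mapping_ttl_minutes * 60


def must_keep_entry(now_s: float, failure_time_s: float,
                    mapping_ttl_minutes: int = DEFAULT_MAPPING_TTL_MINUTES) -> bool:
    """True while some ITR may still hold a mapping learned before the failure."""
    return now_s < rewriting_entry_expiry(failure_time_s, mapping_ttl_minutes)
```

   With the default lifetime, a failure at time 0 requires the entry
   to be kept for 24 * 60 * 60 = 86400 seconds, unless the ETR becomes
   reachable again earlier.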

618	5.  Security Considerations

620	   To be written once the details of the protocols have been specified.

622	6.  Conclusion

624	   In this document, we have first compared the LISP reachability
625	   problem with the traditional reachability problem in routing
626	   protocols.  We have then shown the drawbacks of using anycast to
627	   preserve the reachability of LISP ETRs in case of failures.  Then, we
628	   have proposed to allow PE routers to rewrite the destination address
629	   of LISP encapsulated packets to preserve the reachability of the EID
630	   prefix in case of failure of one of the responsible ETRs.  Further
631	   work is required to define the protocols and mechanisms that are
632	   necessary to allow ISPs to preserve the reachability of the ETRs of
633	   their customers.

635	7.  Acknowledgements

637	   We would like to thank Dave Meyer for his comments on the first
638	   version of this draft.  This work was partially supported by a Cisco
639	   URP grant.

641	8.  References

643	8.1.  Normative References

645	   [RFC4090]  Pan, P., Swallow, G., and A. Atlas, "Fast Reroute
646	              Extensions to RSVP-TE for LSP Tunnels", RFC 4090,
647	              May 2005.

649	8.2.  Informative References

651	   [BGPFRR]   Bonaventure, O., Filsfils, C., and P. Francois,
652	              "Achieving Sub-50 Milliseconds Recovery Upon BGP Peering
653	              Link Failures", CoNEXT 2005.

655	   [FAILURES]
656	              Markopoulou, A., Iannaccone, G., Bhattacharyya, S.,
657	              Chuah, C., and C. Diot, "Characterization of Failures in
658	              an IP Backbone", INFOCOM 2004.

660	   [I-D.ietf-bfd-base]
661	              Katz, D. and D. Ward, "Bidirectional Forwarding
662	              Detection", draft-ietf-bfd-base-09 (work in progress),
663	              February 2009.

665	   [I-D.ietf-bfd-multihop]
666	              Katz, D. and D. Ward, "BFD for Multihop Paths",
667	              draft-ietf-bfd-multihop-07 (work in progress),
668	              February 2009.

670	   [I-D.ietf-lisp]
671	              Farinacci, D., Fuller, V., Meyer, D., and D. Lewis,
672	              "Locator/ID Separation Protocol (LISP)",
673	              draft-ietf-lisp-01 (work in progress), May 2009.

675	   [I-D.ietf-rtgwg-ipfrr-framework]
676	              Shand, M. and S. Bryant, "IP Fast Reroute Framework",
677	              draft-ietf-rtgwg-ipfrr-framework-10 (work in progress),
678	              February 2009.

680	   [I-D.meyer-loc-id-implications]
681	              Meyer, D. and D. Lewis, "Architectural Implications of
682	              Locator/ID Separation", draft-meyer-loc-id-implications-01
683	              (work in progress), January 2009.

685	   [RECOVERY]
686	              Vasseur, J., Demeester, P., and M. Pickavet, "Network
687	              Recovery: Protection and Restoration of Optical, SONET-
688	              SDH, IP, and MPLS", Elsevier Science & Technology
689	              Books 2004.

691	   [RFC2547]  Rosen, E. and Y. Rekhter, "BGP/MPLS VPNs", RFC 2547,
692	              March 1999.

694	   [RFC4984]  Meyer, D., Zhang, L., and K. Fall, "Report from the IAB
695	              Workshop on Routing and Addressing", RFC 4984,
696	              September 2007.

698	Authors' Addresses

700	   Olivier Bonaventure
701	   UCLouvain
702	   Universite catholique de Louvain, Place Sainte Barbe 2
703	   Louvain-la-Neuve,   1348
704	   Belgium

706	   Email: olivier.bonaventure@uclouvain.be
707	   URI:   http://inl.info.ucl.ac.be

709	   Pierre Francois
710	   UCLouvain
711	   Universite catholique de Louvain, Place Sainte Barbe 2
712	   Louvain-la-Neuve,   1348
713	   Belgium

715	   Email: pierre.francois@uclouvain.be
716	   URI:   http://inl.info.ucl.ac.be

718	   Damien Saucez
719	   UCLouvain
720	   Universite catholique de Louvain, Place Sainte Barbe 2
721	   Louvain-la-Neuve,   1348
722	   Belgium

724	   Email: damien.saucez@uclouvain.be
725	   URI:   http://inl.info.ucl.ac.be