idnits 2.17.1 

draft-shand-remote-lfa-01.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (June 1, 2012) is 4347 days in the past.  Is this
     intentional?


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Unused Reference: 'ISOCORE2010' is defined on line 497, but no explicit
     reference was found in the text


     Summary: 0 errors (**), 0 flaws (~~), 2 warnings (==), 1 comment (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.
--------------------------------------------------------------------------------


2	Network Working Group                                          S. Bryant
3	Internet-Draft                                               C. Filsfils
4	Intended status: Standards Track                           Cisco Systems
5	Expires: December 3, 2012                                       M. Shand
6	                                                 Independent Contributor
7	                                                                   N. So
8	                                                            Verizon Inc.
9	                                                            June 1, 2012

11	                             Remote LFA FRR
12	                       draft-shand-remote-lfa-01

14	Abstract

16	   This draft describes an extension to the basic IP fast re-route
17	   mechanism described in RFC 5286 that provides additional backup
18	   connectivity when none can be provided by the basic mechanisms.

20	Requirements Language

22	   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
23	   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
24	   document are to be interpreted as described in RFC2119 [RFC2119].

26	Status of this Memo

28	   This Internet-Draft is submitted in full conformance with the
29	   provisions of BCP 78 and BCP 79.

31	   Internet-Drafts are working documents of the Internet Engineering
32	   Task Force (IETF).  Note that other groups may also distribute
33	   working documents as Internet-Drafts.  The list of current Internet-
34	   Drafts is at http://datatracker.ietf.org/drafts/current/.

36	   Internet-Drafts are draft documents valid for a maximum of six months
37	   and may be updated, replaced, or obsoleted by other documents at any
38	   time.  It is inappropriate to use Internet-Drafts as reference
39	   material or to cite them other than as "work in progress."

41	   This Internet-Draft will expire on December 3, 2012.

43	Copyright Notice

45	   Copyright (c) 2012 IETF Trust and the persons identified as the
46	   document authors.  All rights reserved.

48	   This document is subject to BCP 78 and the IETF Trust's Legal
49	   Provisions Relating to IETF Documents
50	   (http://trustee.ietf.org/license-info) in effect on the date of
51	   publication of this document.  Please review these documents
52	   carefully, as they describe your rights and restrictions with respect
53	   to this document.  Code Components extracted from this document must
54	   include Simplified BSD License text as described in Section 4.e of
55	   the Trust Legal Provisions and are provided without warranty as
56	   described in the Simplified BSD License.

58	1.  Terminology

60	   This draft uses the terms defined in [RFC5714].  This section defines
61	   additional terms used in this draft.

63	   Extended P-space

65	                  The union of the P-space of the neighbours of a
66	                  specific router with respect to the protected link.

68	   P-space        P-space is the set of routers reachable from a
69	                  specific router without any path (including equal cost
70	                  path splits) transiting the protected link.

72	                  For example, the P-space of S, is the set of routers
73	                  that S can reach without using the protected link S-E.

75	   PQ node        A node which is a member of both the extended P-space
76	                  and the Q-space.

78	   Q-space        Q-space is the set of routers from which a specific
79	                  router can be reached without any path (including
80	                  equal cost path splits) transiting the protected link.

82	   Repair tunnel  A tunnel established for the purpose of providing a
83	                  virtual neighbor which is a Loop Free Alternate.

85	   Remote LFA     The tail-end of a repair tunnel.  This tail-end is a
86	                  member of both the extended P-space and the Q-space.
87	                  It is also termed a "PQ" node.

89	2.  Introduction

91	   RFC 5714 [RFC5714] describes a framework for IP Fast Re-route and
92	   provides a summary of various proposed IPFRR solutions.  A basic
93	   mechanism using loop-free alternates (LFAs) is described in [RFC5286]
94	   that provides good repair coverage in many topologies
95	   [I-D.filsfils-rtgwg-lfa-applicability], especially those
96	   that are highly meshed.  However, some topologies, notably ring-based
97	   topologies, are not well protected by LFAs alone.  This is illustrated
98	   in Figure 1 below.

100	             S---E
101	            /     \
102	           A       D
103	            \     /
104	             B---C

106	                     Figure 1: A simple ring topology

108	   If all link costs are equal, the link S-E cannot be fully protected
109	   by LFAs.  The destination C is reached from S via ECMP, and so can be
110	   protected when S-E fails, but D and E are not protectable using LFAs.

112	   This draft describes extensions to the basic repair mechanism in
113	   which tunnels are used to provide additional logical links which can
114	   then be used as loop free alternates where none exist in the original
115	   topology.  For example, if a tunnel is provided between S and C as
116	   shown in Figure 2, then C, now being a direct neighbor of S, would
117	   become an LFA for D and E.  The non-failure traffic distribution is
118	   not disrupted by the provision of such a tunnel since it is only used
119	   for repair traffic and MUST NOT be used for normal traffic.

121	             S---E
122	            / \   \
123	           A   \   D
124	            \   \ /
125	             B---C

127	                    Figure 2: The addition of a tunnel

129	   The use of this technique is not restricted to ring based topologies,
130	   but is a general mechanism which can be used to enhance the
131	   protection provided by LFAs.

133	3.  Repair Paths

135	   As with LFA FRR, when a router detects an adjacent link failure, it
136	   uses one or more repair paths in place of the failed link.  Repair
137	   paths are pre-computed in anticipation of later failures so they can
138	   be promptly activated when a failure is detected.
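
   As an illustration only, the pre-computation described above can be
   pictured as a forwarding entry that already holds its repair next
   hop, so that activation is a simple lookup.  The Route class, its
   field names and the "tunnel-to-C" label are hypothetical and are not
   defined by this draft.

```python
from dataclasses import dataclass

@dataclass
class Route:
    primary_link: tuple      # protected link, e.g. ("S", "E")
    primary_next_hop: str    # used while the link is up
    repair_next_hop: str     # computed before any failure occurs

def next_hop(route, failed_links):
    """Switch to the pre-computed repair as soon as the link is down."""
    if route.primary_link in failed_links:
        return route.repair_next_hop
    return route.primary_next_hop

# Hypothetical entry on router S for destination D (Figure 2).
route_to_d = Route(("S", "E"), "E", "tunnel-to-C")
print(next_hop(route_to_d, set()))           # E
print(next_hop(route_to_d, {("S", "E")}))    # tunnel-to-C
```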

140	   A tunneled repair path tunnels traffic to some staging point in the
141	   network from which it is assumed that, in the absence of multiple
142	   failures, it will travel to its destination using normal forwarding
143	   without looping back.  This is equivalent to providing a virtual
144	   loop-free alternate to supplement the physical loop-free alternates.
145	   Hence the name "Remote LFA FRR".  When a link cannot be entirely
146	   protected with local LFA neighbors, the protecting router seeks the
147	   help of a remote LFA staging point.

149	3.1.  Tunnels as Repair Paths

151	   Consider an arbitrary protected link S-E.  In LFA FRR, if a path to
152	   the destination from a neighbor N of S does not cause a packet to
153	   loop back over the link S-E (i.e., N is a loop-free alternate), then
154	   S can send the packet to N and the packet will be delivered to the
155	   destination using the pre-failure forwarding information.  If there
156	   is no such LFA neighbor, then S may be able to create a virtual LFA
157	   by using a tunnel to carry the packet to a point in the network which
158	   is not a direct neighbor of S from which the packet will be delivered
159	   to the destination without looping back to S. In this document such a
160	   tunnel is termed a repair tunnel.  The tail-end of this tunnel is
161	   called a "remote LFA" or a "PQ node".

163	   Note that the repair tunnel terminates at some intermediate router
164	   between S and E, and not E itself.  This is clearly the case, since
165	   if it were possible to construct a tunnel from S to E then a
166	   conventional LFA would have been sufficient to effect the repair.

168	3.2.  Tunnel Requirements

170	   There are a number of IP in IP tunnel mechanisms that may be used to
171	   fulfil the requirements of this design, such as IP-in-IP [RFC1853]
172	   and GRE [RFC1701].

174	   In an MPLS enabled network using LDP [RFC5036], a simple label
175	   stack [RFC3032] may be used to provide the required repair tunnel.  In
176	   this case the outer label is S's neighbor's label for the repair
177	   tunnel end point, and the inner label is the repair tunnel end
178	   point's label for the packet destination.  In order for S to obtain
179	   the correct inner label it is necessary to establish a directed LDP
180	   session [RFC5036] to the tunnel end point.
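
   The two-label construction can be sketched as follows.  This is an
   illustration under assumptions, not an LDP implementation: the
   function name, the dictionaries standing in for label bindings and
   the label values are all hypothetical.

```python
def repair_label_stack(neighbor_bindings, pq_bindings, pq_node, destination):
    """Return [outer, inner] labels for a repair via pq_node.

    neighbor_bindings: labels advertised by S's repair next-hop neighbor,
                       keyed by the node each label leads to.
    pq_bindings:       labels learned from the PQ node over the directed
                       LDP session, keyed by destination.
    """
    outer = neighbor_bindings[pq_node]   # carries the packet to the PQ node
    inner = pq_bindings[destination]     # PQ node's label for the destination
    return [outer, inner]

# Hypothetical label values for repairing destination D via C (Figure 2).
labels_from_a = {"C": 2001}
labels_from_c = {"D": 3007}
stack = repair_label_stack(labels_from_a, labels_from_c, "C", "D")
print(stack)   # [2001, 3007]
```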

182	   The selection of the specific tunnelling mechanism (and any necessary
183	   enhancements) used to provide a repair path is outside the scope of
184	   this document.  The authors simply note that deployment in an MPLS/
185	   LDP environment is simple and straightforward, as an LDP
186	   LSP from S to the PQ node is readily available, and hence does not
187	   require any new protocol extension or design change.  This LSP is
188	   automatically established as a basic property of LDP behavior.  The
189	   performance of the encapsulation and decapsulation is also excellent
190	   as encapsulation is just a push of one label (like conventional MPLS
191	   TE FRR) and the decapsulation occurs naturally at the penultimate hop
192	   before the PQ node.

194	   When a failure is detected, it is necessary to immediately redirect
195	   traffic to the repair path.  Consequently, the repair tunnel used
196	   must be provisioned beforehand in anticipation of the failure.  Since
197	   the location of the repair tunnels is dynamically determined it is
198	   necessary to establish the repair tunnels without management action.
199	   Multiple repairs may share a tunnel end point.

201	4.  Construction of Repair Paths

203	4.1.  Identifying Required Tunneled Repair Paths

205	   Not all links will require protection using a tunneled repair path.
206	   If E can already be protected via an LFA, S-E does not need to be
207	   protected using a repair tunnel, since all destinations normally
208	   reachable through E must therefore also be protectable by an LFA.
209	   Such an LFA is frequently termed a "link LFA".  Tunneled repair paths
210	   are only required for links which do not have a link LFA.
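
   Whether a link has a link LFA can be checked with the basic
   loop-free condition of RFC 5286, Distance(N, D) < Distance(N, S) +
   Distance(S, D).  The sketch below applies it to Figure 1 with all
   link costs equal to 1.  The helper names (spf, is_lfa) and the
   dictionary encoding of the topology are assumptions made for this
   example.

```python
import heapq

def spf(graph, root):
    """Dijkstra: shortest-path cost from root to every reachable node."""
    dist = {root: 0}
    heap = [(0, root)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, cost in graph[u].items():
            if d + cost < dist.get(v, float("inf")):
                dist[v] = d + cost
                heapq.heappush(heap, (d + cost, v))
    return dist

def is_lfa(graph, s, n, dest):
    """RFC 5286 loop-free condition for neighbor n of s towards dest."""
    from_n = spf(graph, n)
    return from_n[dest] < from_n[s] + spf(graph, s)[dest]

# Figure 1 ring, all link costs 1 (illustrative encoding).
ring = ["S", "E", "D", "C", "B", "A"]
graph = {node: {} for node in ring}
for u, v in zip(ring, ring[1:] + ring[:1]):
    graph[u][v] = graph[v][u] = 1

# A is S's only neighbor other than E; test it for each destination.
protected = {d: is_lfa(graph, "S", "A", d) for d in ("C", "D", "E")}
print(protected)   # {'C': True, 'D': False, 'E': False}

# E is not protected by an LFA, so S-E has no link LFA and needs a tunnel.
link_needs_tunnel = not protected["E"]
```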

212	4.2.  Determining Tunnel End Points

214	   The repair tunnel endpoint needs to be a node in the network
215	   reachable from S without traversing S-E.  In addition, the repair
216	   tunnel end point needs to be a node from which packets will normally
217	   flow towards their destination without being attracted back to the
218	   failed link S-E.

220	   Note that once released from the tunnel, the packet will be
221	   forwarded, as normal, on the shortest path from the release point to
222	   its destination.  This may result in the packet traversing the router
223	   E at the far end of the protected link S-E, but this is not
224	   required.

226	   The properties that are required of repair tunnel end points are
227	   therefore:

229	   o  The repair tunnel end point MUST be reachable from the tunnel
230	      source without traversing the failed link; and

232	   o  When released, tunneled packets MUST proceed towards their
233	      destination without being attracted back over the failed link.

235	   Provided both these requirements are met, packets forwarded over the
236	   repair tunnel will reach their destination and will not loop.

238	   In some topologies it will not be possible to find a repair tunnel
239	   endpoint that exhibits both the required properties.  For example, if
240	   the ring topology illustrated in Figure 1 had a cost of 4 for the
241	   link B-C, while the remaining links were cost 1, then it would not be
242	   possible to establish a tunnel from S to C (without resorting to some
243	   form of source routing).
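
   The cost-4 example can be checked numerically.  The sketch below is
   one possible formulation, assuming symmetric link costs: it uses
   all-pairs distances (Floyd-Warshall) and treats a node as usable
   when its shortest-path cost is strictly lower than the cost of any
   path over the protected link.  It uses the extended P-space and
   Q-space as defined in Section 1.  All function names are
   hypothetical.

```python
from itertools import product

def ring(bc_cost):
    """Figure 1 ring; every link costs 1 except B-C."""
    links = [("S", "E", 1), ("E", "D", 1), ("D", "C", 1),
             ("C", "B", bc_cost), ("B", "A", 1), ("A", "S", 1)]
    graph = {}
    for u, v, cost in links:
        graph.setdefault(u, {})[v] = cost
        graph.setdefault(v, {})[u] = cost
    return graph

def all_pairs(graph):
    """Floyd-Warshall shortest-path costs between every pair of nodes."""
    nodes = list(graph)
    d = {(u, v): 0 if u == v else graph[u].get(v, float("inf"))
         for u, v in product(nodes, repeat=2)}
    for k, i, j in product(nodes, repeat=3):   # k is the outermost loop
        d[i, j] = min(d[i, j], d[i, k] + d[k, j])
    return d

def avoids_link(d, cost, x, y, s, e):
    """True if no shortest x->y path (ECMP included) crosses link s-e."""
    via_link = min(d[x, s] + cost + d[e, y], d[x, e] + cost + d[s, y])
    return d[x, y] < via_link

def pq_nodes(graph, s, e):
    d, cost = all_pairs(graph), graph[s][e]
    candidates = set(graph) - {s, e}
    # Extended P-space: reachable from some neighbor of S other than E.
    extended_p = {y for y in candidates for n in graph[s]
                  if n != e and avoids_link(d, cost, n, y, s, e)}
    # Q-space of E: nodes that reach E without crossing S-E.
    q = {y for y in candidates if avoids_link(d, cost, y, e, s, e)}
    return extended_p & q

print(pq_nodes(ring(1), "S", "E"))   # {'C'}
print(pq_nodes(ring(4), "S", "E"))   # set()
```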

245	4.2.1.  Computing Repair Paths

247	   The set of routers which can be reached from S without traversing S-E
248	   is termed the P-space of S with respect to the link S-E.  The P-space
249	   can be obtained by computing a shortest path tree (SPT) rooted at S
250	   and excising the sub-tree reached via the link S-E (including those
251	   which are members of an ECMP).  In the case of Figure 1 the P-space
252	   comprises nodes A and B only.
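
   A minimal sketch of this computation follows, assuming symmetric,
   positive link costs.  Rather than literally pruning a tree, it marks
   every node that has at least one shortest path over the protected
   link, which also covers the ECMP case.  The function names and the
   dictionary encoding of Figure 1 are assumptions of the example.

```python
import heapq

def spf(graph, root):
    """Dijkstra: shortest-path cost from root to every reachable node."""
    dist = {root: 0}
    heap = [(0, root)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, cost in graph[u].items():
            if d + cost < dist.get(v, float("inf")):
                dist[v] = d + cost
                heapq.heappush(heap, (d + cost, v))
    return dist

def p_space(graph, s, link):
    """Nodes s reaches with no shortest path (ECMP included) over link."""
    dist = spf(graph, s)
    uses_link = {}
    # Visit nodes by increasing cost: with positive link costs every
    # shortest-path predecessor is classified before the node itself.
    for v in sorted(dist, key=dist.get):
        uses_link[v] = any(
            u in dist and dist[u] + cost == dist[v]
            and ({u, v} == set(link) or uses_link[u])
            for u, cost in graph[v].items()
        )
    return {v for v in dist if v != s and not uses_link[v]}

# Figure 1 ring, all link costs 1.
ring = ["S", "E", "D", "C", "B", "A"]
graph = {node: {} for node in ring}
for u, v in zip(ring, ring[1:] + ring[:1]):
    graph[u][v] = graph[v][u] = 1

print(sorted(p_space(graph, "S", ("S", "E"))))   # ['A', 'B']
```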

254	   The set of routers from which the node E can be reached, by normal
255	   forwarding, without traversing the link S-E is termed the Q-space of
256	   E with respect to the link S-E.  The Q-space can be obtained by
257	   computing a reverse shortest path tree (rSPT) rooted at E, with the
258	   sub-tree which traverses the failed link excised (including those
259	   which are members of an ECMP).  The rSPT uses the cost towards the
260	   root rather than from it and yields the best paths towards the root
261	   from other nodes in the network.  In the case of Figure 1 the Q-space
262	   comprises nodes C and D only.
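
   The reverse SPT can be sketched by running Dijkstra over reversed
   edges, so that each cost is the cost towards the root.  The example
   stores a cost per direction to make that distinction visible, even
   though Figure 1 is symmetric.  Function names and the topology
   encoding are assumptions of this sketch.

```python
import heapq

def reverse_spf(graph, root):
    """Cost from every node towards root (Dijkstra on reversed edges)."""
    incoming = {node: {} for node in graph}
    for u, edges in graph.items():
        for v, cost in edges.items():
            incoming[v][u] = cost          # edge u->v, seen from v
    dist = {root: 0}
    heap = [(0, root)]
    while heap:
        d, v = heapq.heappop(heap)
        if d > dist[v]:
            continue  # stale heap entry
        for u, cost in incoming[v].items():
            if d + cost < dist.get(u, float("inf")):
                dist[u] = d + cost
                heapq.heappush(heap, (d + cost, u))
    return dist

def q_space(graph, e, link):
    """Nodes that reach e with no shortest path (ECMP included) over link."""
    to_e = reverse_spf(graph, e)
    uses_link = {}
    # Nodes nearer to e are classified first, so each next hop towards
    # e is already known when a node is examined (costs are positive).
    for v in sorted(to_e, key=to_e.get):
        uses_link[v] = any(
            w in to_e and cost + to_e[w] == to_e[v]
            and ({v, w} == set(link) or uses_link[w])
            for w, cost in graph[v].items()
        )
    return {v for v in to_e if v != e and not uses_link[v]}

# Figure 1 ring, cost 1 in each direction of every link.
ring = ["S", "E", "D", "C", "B", "A"]
graph = {node: {} for node in ring}
for u, v in zip(ring, ring[1:] + ring[:1]):
    graph[u][v] = graph[v][u] = 1

print(sorted(q_space(graph, "E", ("S", "E"))))   # ['C', 'D']
```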

264	   The intersection of E's Q-space with S's P-space defines the set
265	   of viable repair tunnel end-points, known as "PQ nodes".  As can be
266	   seen, for the case of Figure 1 there is no common node and hence no
267	   viable repair tunnel end-point.

269	   Note that the Q-space calculation could be conducted for each
270	   individual destination and a per-destination repair tunnel end point
271	   determined.  However this would, in the worst case, require an SPF
272	   computation per destination which is not considered to be scalable.
273	   We therefore use the Q-space of E as a proxy for the Q-space of each
274	   destination.  This approximation is obviously correct since the
275	   repair is only used for the set of destinations which were, prior to
276	   the failure, routed through node E. This is analogous to the use of
277	   link-LFAs rather than per-prefix LFAs.

279	4.2.2.  Extended P-space

281	   The description in Section 4.2.1 calculated router S's P-space rooted
282	   at S itself.  However, since router S will only use a repair path
283	   when it has detected the failure of the link S-E, the initial hop of
284	   the repair path need not be subject to S's normal forwarding decision
285	   process.  Thus we introduce the concept of extended P-space.  Router
286	   S's extended P-space is the union of the P-spaces of each of S's
287	   neighbours.  The use of extended P-space may allow router S to reach
288	   potential repair tunnel end points that were otherwise unreachable.

290	   Another way to describe extended P-space is that it is the union of
291	   (un-extended) P-space and the set of destinations for which S has a
292	   per-prefix LFA protecting the link S-E, i.e., the repair tunnel end
293	   point can be reached either directly or using a per-prefix LFA.

295	   Since in the case of Figure 1 node A is a per-prefix LFA for the
296	   destination node C, the set of extended P-space nodes comprises nodes
297	   A, B and C. Since node C is also in E's Q-space, there is now a node
298	   common to both extended P-space and Q-space which can be used as a
299	   repair tunnel end-point to protect the link S-E.
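
   The extended P-space and the resulting PQ node for Figure 1 can be
   reproduced with the sketch below.  It assumes symmetric, positive
   link costs (so an SPT rooted at E stands in for the reverse SPT) and
   excludes E from the neighbors considered, since the first hop of the
   repair must avoid the failed link.  The function names are
   hypothetical.

```python
import heapq

def spf(graph, root):
    """Dijkstra: shortest-path cost from root to every reachable node."""
    dist = {root: 0}
    heap = [(0, root)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, cost in graph[u].items():
            if d + cost < dist.get(v, float("inf")):
                dist[v] = d + cost
                heapq.heappush(heap, (d + cost, v))
    return dist

def space(graph, root, link):
    """Nodes whose shortest paths from root (ECMP included) avoid link."""
    dist = spf(graph, root)
    uses_link = {}
    for v in sorted(dist, key=dist.get):   # predecessors come first
        uses_link[v] = any(
            u in dist and dist[u] + cost == dist[v]
            and ({u, v} == set(link) or uses_link[u])
            for u, cost in graph[v].items()
        )
    return {v for v in dist if v != root and not uses_link[v]}

def pq_nodes(graph, s, e):
    """Return (extended P-space of s, PQ nodes) for protected link s-e."""
    link = (s, e)
    extended_p = set()
    for n in graph[s]:
        if n != e:                          # first hop avoids the failed link
            extended_p |= {n} | space(graph, n, link)
    extended_p.discard(s)
    q = space(graph, e, link)               # symmetric costs assumed
    return extended_p, extended_p & q

# Figure 1 ring, all link costs 1.
ring = ["S", "E", "D", "C", "B", "A"]
graph = {node: {} for node in ring}
for u, v in zip(ring, ring[1:] + ring[:1]):
    graph[u][v] = graph[v][u] = 1

extended_p, pq = pq_nodes(graph, "S", "E")
print(sorted(extended_p), sorted(pq))   # ['A', 'B', 'C'] ['C']
```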

301	4.2.3.  Selecting Repair Paths

303	   The mechanisms described above will identify all the possible repair
304	   tunnel end points that can be used to protect a particular link.  In
305	   a well-connected network there are likely to be multiple possible
306	   release points for each protected link.  All will deliver the packets
307	   correctly so, arguably, it does not matter which is chosen.  However,
308	   one repair tunnel end point may be preferred over the others on the
309	   basis of path cost or some other selection criteria.

311	   In general there are advantages in choosing the repair tunnel end
312	   point closest (shortest metric) to S. Choosing the closest maximises
313	   the opportunity for the traffic to be load balanced once it has been
314	   released from the tunnel.
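
   One possible selection rule is sketched below: pick the candidate
   with the lowest metric from S.  The tie-break on node name is an
   assumption added only to make the result deterministic, and the
   candidate names and costs are hypothetical.

```python
def select_end_point(candidates):
    """Pick the PQ node with the lowest metric from S.

    candidates maps each PQ node to its path cost from S.  Ties are
    broken by node name (an assumption of this sketch).
    """
    if not candidates:
        return None   # no viable repair tunnel end point
    return min(candidates, key=lambda node: (candidates[node], node))

# Hypothetical candidate set for one protected link.
choice = select_end_point({"R7": 30, "R3": 20, "R5": 20})
print(choice)   # R3
```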

316	   There is no technical requirement for the selection criteria to be
317	   consistent across all routers, but such consistency may be desirable
318	   from an operational point of view.

320	5.  Example Application of Remote LFAs

322	   An example of a commonly deployed topology which is not fully
323	   protected by LFAs alone is shown in Figure 3.  PE1 and PE2 are
324	   connected in the same site.  P1 and P2 may be geographically
325	   separated (inter-site).  In order to guarantee the lowest latency
326	   path from/to all other remote PEs, normally the shortest path follows
327	   the geographical distance of the site locations.  Therefore, to
328	   ensure this, a lower IGP metric (5) is assigned between PE1 and PE2.
329	   A high metric (1000) is set on the P-PE links to prevent the PEs
330	   being used for transit traffic.  The PEs are not individually dual-
331	   homed in order to reduce costs.

333	   This is a common topology in SP networks.

335	   When a failure occurs on the link between PE1 and P1, PE1 does not
336	   have an LFA for traffic reachable via P1.  Similarly, by symmetry, if
337	   the link between PE2 and P2 fails, PE2 does not have an LFA for
338	   traffic reachable via P2.

340	   Increasing the metric between PE1 and PE2 to allow the LFA would
341	   impact the normal traffic performance by potentially increasing the
342	   latency.

343	             |    100    |
344	            -P1---------P2-
345	              \         /
346	          1000 \       / 1000
347	               PE1---PE2
348	                   5

350	                       Figure 3: Example SP topology

352	   Clearly, full protection can be provided, using the techniques
353	   described in this draft, by PE1 choosing P2 as a PQ node, and PE2
354	   choosing P1 as a PQ node.

356	6.  Historical Note

358	   The basic concepts behind Remote LFA were invented in 2002 and were
359	   later included in draft-bryant-ipfrr-tunnels, submitted in 2004.

361	   draft-bryant-ipfrr-tunnels targeted 100% protection coverage and
362	   hence included additional mechanisms on top of the Remote LFA concept.
363	   The addition of these mechanisms made the proposal very complex and
364	   computationally intensive and it was therefore not pursued as a
365	   working group item.

367	   As explained in [I-D.filsfils-rtgwg-lfa-applicability], the purpose
368	   of the LFA FRR technology is not to provide coverage at any cost.  A
369	   solution for this already exists with MPLS TE FRR.  MPLS TE FRR is a
370	   mature technology which is able to provide protection in any topology
371	   thanks to the explicit routing capability of MPLS TE.

373	   The purpose of LFA FRR technology is to provide for a simple FRR
374	   solution when such a solution is possible.  The first step along this
375	   simplicity approach was "local" LFA [RFC5286].  We propose "Remote
376	   LFA" as a natural second step.  The following section motivates its
377	   benefits in terms of simplicity, incremental deployment and
378	   significant coverage increase.

380	7.  Benefits

382	   Remote LFAs preserve the benefits of RFC5286: simplicity, incremental
383	   deployment and good protection coverage.

385	7.1.  Simplicity

387	   The remote LFA algorithm is simple to compute.

389	   o  The extended P space does not require any new computation (it is
390	      known once per-prefix LFA computation is completed).

392	   o  The Q-space is a single reverse SPF rooted at the neighbor.

394	   o  The directed LDP session is automatically computed and
395	      established.

397	   In edge topologies (square, ring), the directed LDP session position
398	   and number are deterministic and hence troubleshooting is simple.

400	   In core topologies, our simulation indicates that the 90th percentile
401	   number of LDP sessions per node to achieve the significant Remote LFA
402	   coverage observed in section 7.3 is <= 6.  This is insignificant
403	   compared to the number of LDP sessions commonly deployed per router,
404	   which is frequently in the several hundreds.

406	7.2.  Incremental Deployment

408	   The establishment of the directed LDP session to the PQ node does not
409	   require any new technology on the PQ node.  Indeed, routers commonly
410	   support the ability to accept a remote request to open a directed LDP
411	   session.  The new capability is restricted to the Remote-LFA
412	   computing node (the originator of the LDP session).

414	7.3.  Significant Coverage Extension

416	   The previous sections have already explained how Remote LFAs provide
417	   protection for frequently occurring edge topologies: square and ring.
418	   In the core, we extend the analysis framework in section 4.3 of
419	   [I-D.filsfils-rtgwg-lfa-applicability] and provide hereafter the
420	   Remote LFA coverage results for the 11 topologies:

422	               +----------+--------------+----------------+------------+
423	               | Topology | Per-link LFA | Per-prefix LFA | Remote LFA |
424	               +----------+--------------+----------------+------------+
425	               |    T1    |      45%     |       77%      |    78%     |
426	               |    T2    |      49%     |       99%      |   100%     |
427	               |    T3    |      88%     |       99%      |    99%     |
428	               |    T4    |      68%     |       84%      |    92%     |
429	               |    T5    |      75%     |       94%      |    99%     |
430	               |    T6    |      87%     |       99%      |   100%     |
431	               |    T7    |      16%     |       67%      |    96%     |
432	               |    T8    |      87%     |      100%      |   100%     |
433	               |    T9    |      67%     |       80%      |    98%     |
434	               |    T10   |      98%     |      100%      |   100%     |
435	               |    T11   |      59%     |       77%      |    95%     |
436	               |  Average |      67%     |       89%      |    96%     |
437	               |  Median  |      68%     |       94%      |    99%     |
438	               +----------+--------------+----------------+------------+

440	   Another study [ISOCORE2010] confirms the significant coverage increase
441	   provided by Remote LFAs.

443	8.  Complete Protection

445	   As shown in the previous table, Remote LFA provides for 96% average
446	   (99% median) protection in the 11 analyzed SP topologies.
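
   The summary rows of the table can be reproduced from the per-topology
   figures.  The sketch below copies the percentages for T1 to T11 from
   the table in Section 7.3 and rounds the mean to the nearest integer;
   the variable names are arbitrary.

```python
from statistics import mean, median

# Coverage percentages for T1..T11, copied from the table in Section 7.3.
coverage = {
    "Per-link LFA":   [45, 49, 88, 68, 75, 87, 16, 87, 67, 98, 59],
    "Per-prefix LFA": [77, 99, 99, 84, 94, 99, 67, 100, 80, 100, 77],
    "Remote LFA":     [78, 100, 99, 92, 99, 100, 96, 100, 98, 100, 95],
}

# (rounded average, median) per column.
summary = {name: (round(mean(values)), median(values))
           for name, values in coverage.items()}

for name, (avg, med) in summary.items():
    print(f"{name}: average {avg}%, median {med}%")
# Per-link LFA: average 67%, median 68%
# Per-prefix LFA: average 89%, median 94%
# Remote LFA: average 96%, median 99%
```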

448	   In an MPLS network, this is achieved without any scalability impact
449	   as the tunnels to the PQ nodes are always present as a property of an
450	   LDP-based deployment.

452	   In the very few cases where P and Q spaces have an empty
453	   intersection, one could select the closest node in the Q space (i.e.,
454	   Qc) and signal an explicitly routed RSVP TE LSP to Qc.  A directed
455	   LDP session is then established with Qc and the rest of the solution
456	   is identical.

458	   The drawbacks of this solution are:

460	   1.  it is only available for MPLS networks; and

462	   2.  it adds LSPs to the SP infrastructure.

464	   This extension is described for completeness.  In practice, the
465	   "Remote LFA" solution should be preferred for three reasons: its
466	   simplicity, its excellent coverage in the analyzed backbones and its
467	   complete coverage in the most frequent access/aggregation topologies
468	   (box or ring).

470	9.  IANA Considerations

472	   There are no IANA considerations that arise from this architectural
473	   description of IPFRR.

475	10.  Security Considerations

477	   The security considerations of RFC 5286 also apply.

479	   To prevent their use as an attack vector the repair tunnel endpoints
480	   SHOULD be assigned from a set of addresses that are not reachable
481	   from outside the routing domain.
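
   As an illustration, an implementation might verify that a candidate
   repair tunnel endpoint address falls inside a block that is filtered
   at the domain boundary.  The prefix below and the function name are
   hypothetical; how the block is chosen and filtered is a deployment
   matter.

```python
import ipaddress

# Hypothetical block reserved for repair tunnel end points and not
# reachable from outside the routing domain (assumption of this sketch).
REPAIR_ENDPOINT_BLOCK = ipaddress.ip_network("10.255.0.0/24")

def is_protected_endpoint(address):
    """True if address is drawn from the reserved, internal-only block."""
    return ipaddress.ip_address(address) in REPAIR_ENDPOINT_BLOCK

print(is_protected_endpoint("10.255.0.7"))   # True
print(is_protected_endpoint("192.0.2.1"))    # False
```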

483	11.  Acknowledgments

485	   The authors acknowledge the technical contributions made to this work
486	   by Stefano Previdi.

488	12.  Informative References

490	   [I-D.filsfils-rtgwg-lfa-applicability]
491	              Filsfils, C., Francois, P., Shand, M., Decraene, B.,
492	              Uttaro, J., Leymann, N., and M. Horneffer, "LFA
493	              applicability in SP networks",
494	              draft-filsfils-rtgwg-lfa-applicability-00 (work in
495	              progress), March 2010.

497	   [ISOCORE2010]
498	              So, N., Lin, T., and C. Chen, "LFA (Loop Free Alternates)
499	              Case Studies in Verizon's LDP Network", 2010.

501	   [RFC1701]  Hanks, S., Li, T., Farinacci, D., and P. Traina, "Generic
502	              Routing Encapsulation (GRE)", RFC 1701, October 1994.

504	   [RFC1853]  Simpson, W., "IP in IP Tunneling", RFC 1853, October 1995.

506	   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
507	              Requirement Levels", BCP 14, RFC 2119, March 1997.

509	   [RFC3032]  Rosen, E., Tappan, D., Fedorkow, G., Rekhter, Y.,
510	              Farinacci, D., Li, T., and A. Conta, "MPLS Label Stack
511	              Encoding", RFC 3032, January 2001.

513	   [RFC5036]  Andersson, L., Minei, I., and B. Thomas, "LDP
514	              Specification", RFC 5036, October 2007.

516	   [RFC5286]  Atlas, A. and A. Zinin, "Basic Specification for IP Fast
517	              Reroute: Loop-Free Alternates", RFC 5286, September 2008.

519	   [RFC5714]  Shand, M. and S. Bryant, "IP Fast Reroute Framework",
520	              RFC 5714, January 2010.

522	Authors' Addresses

524	   Stewart Bryant
525	   Cisco Systems
526	   250, Longwater, Green Park,
527	   Reading  RG2 6GB
528	   UK

530	   Email: stbryant@cisco.com

532	   Clarence Filsfils
533	   Cisco Systems
534	   De Kleetlaan 6a
535	   1831 Diegem
536	   Belgium

538	   Email: cfilsfil@cisco.com

540	   Mike Shand
541	   Independent Contributor

543	   Email: imc.shand@gmail.com

545	   Ning So
546	   Verizon Inc.

548	   Email: ningso@yahoo.com