2 RTGWG C. Villamizar, Ed. 3 Internet-Draft Infinera Corporation 4 Intended status: Informational D. McDysan, Ed. 5 Expires: April 14, 2011 S. Ning 6 A. Malis 7 Verizon 8 L. Yong 9 Huawei USA 10 October 11, 2010 12 Requirements for MPLS Over a Composite Link 13 draft-ietf-rtgwg-cl-requirement-02 15 Abstract 17 There is often a need to provide large aggregates of bandwidth that 18 are best provided using parallel links between routers or MPLS LSR. 19 In core networks there is often no alternative since the aggregate 20 capacities of core networks today far exceed the capacity of a single 21 physical link or single packet processing element. 23 The presence of parallel links, with each link potentially comprised 24 of multiple layers has resulted in additional requirements. Certain 25 services may benefit from being restricted to a subset of the 26 component links or a specific component link, where component link 27 characteristics, such as latency, differ. Certain services require 28 that an LSP be treated as atomic and avoid reordering. Other 29 services will continue to require only that reordering not occur 30 within a microflow as is current practice. 32 Current practice related to multipath is described briefly in an 33 appendix. 35 Status of this Memo 37 This Internet-Draft is submitted in full conformance with the 38 provisions of BCP 78 and BCP 79. 40 Internet-Drafts are working documents of the Internet Engineering 41 Task Force (IETF). Note that other groups may also distribute 42 working documents as Internet-Drafts. The list of current Internet- 43 Drafts is at http://datatracker.ietf.org/drafts/current/. 45 Internet-Drafts are draft documents valid for a maximum of six months 46 and may be updated, replaced, or obsoleted by other documents at any 47 time.
It is inappropriate to use Internet-Drafts as reference 48 material or to cite them other than as "work in progress." 49 This Internet-Draft will expire on April 14, 2011. 51 Copyright Notice 53 Copyright (c) 2010 IETF Trust and the persons identified as the 54 document authors. All rights reserved. 56 This document is subject to BCP 78 and the IETF Trust's Legal 57 Provisions Relating to IETF Documents 58 (http://trustee.ietf.org/license-info) in effect on the date of 59 publication of this document. Please review these documents 60 carefully, as they describe your rights and restrictions with respect 61 to this document. Code Components extracted from this document must 62 include Simplified BSD License text as described in Section 4.e of 63 the Trust Legal Provisions and are provided without warranty as 64 described in the Simplified BSD License. 66 Table of Contents 68 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4 69 1.1. Requirements Language . . . . . . . . . . . . . . . . . . 4 70 2. Assumptions . . . . . . . . . . . . . . . . . . . . . . . . . 4 71 3. Definitions . . . . . . . . . . . . . . . . . . . . . . . . . 4 72 4. Network Operator Functional Requirements . . . . . . . . . . . 5 73 4.1. Availability, Stability and Transient Response . . . . . . 5 74 4.2. Component Links Provided by Lower Layer Networks . . . . . 6 75 4.3. Parallel Component Links with Different Characteristics . 7 76 5. Derived Requirements . . . . . . . . . . . . . . . . . . . . . 9 77 6. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 10 78 7. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 10 79 8. Security Considerations . . . . . . . . . . . . . . . . . . . 10 80 9. References . . . . . . . . . . . . . . . . . . . . . . . . . . 11 81 9.1. Normative References . . . . . . . . . . . . . . . . . . . 11 82 9.2. Informative References . . . . . . . . . . . . . . . . . . 11 83 9.3. Appendix References . . . . . . . . . . . . . . . . . . 
. 12 84 Appendix A. More Details on Existing Network Operator 85 Practices and Protocol Usage . . . . . . . . . . . . 13 86 Appendix B. Existing Multipath Standards and Techniques . . . . . 15 87 B.1. Common Multipath Load Splitting Techniques . . . . . . . . 16 88 B.2. Simple and Adaptive Load Balancing Multipath . . . . . . . 17 89 B.3. Traffic Split over Parallel Links . . . . . . . . . . . . 18 90 B.4. Traffic Split over Multiple Paths . . . . . . . . . . . . 18 91 Appendix C. ITU-T G.800 Composite Link Definitions and 92 Terminology . . . . . . . . . . . . . . . . . . . . . 18 93 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 19 95 1. Introduction 97 The purpose of this document is to describe why network operators 98 require certain functions in order to solve certain business problems 99 (Section 2). The intent is to first describe why things need to be 100 done in terms of functional requirements that are as independent as 101 possible of protocol specifications (Section 4). For certain 102 functional requirements this document describes a set of derived 103 protocol requirements (Section 5). Three appendices provide 104 supporting details as a summary of existing/prior operator approaches 105 (Appendix A), a summary of implementation techniques and relevant 106 protocol standards (Appendix B), and a summary of G.800 terminology 107 used to define a composite link (Appendix C). 109 1.1. Requirements Language 111 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 112 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 113 document are to be interpreted as described in RFC 2119 [RFC2119]. 115 2.
Assumptions 117 The services supported include L3VPN RFC 4364 [RFC4364], RFC 4797 118 [RFC4797], L2VPN RFC 4664 [RFC4664] (VPWS, VPLS (RFC 4761 [RFC4761], 119 RFC 4762 [RFC4762]), and VPMS (VPMS Framework 120 [I-D.ietf-l2vpn-vpms-frmwk-requirements])), Internet traffic 121 encapsulated by at least one MPLS label, and dynamically signaled 122 MPLS or MPLS-TP LSPs and pseudowires. The MPLS LSPs supporting these 123 services may be pt-pt, pt-mpt, or mpt-mpt. 125 The locations in a network where these requirements apply are a Label 126 Edge Router (LER) or a Label Switch Router (LSR) as defined in RFC 127 3031 [RFC3031]. 129 The IP DSCP cannot be used for flow identification since L3VPN 130 requires Diffserv transparency (see Section 5.5.2 of RFC 4031 131 [RFC4031]), and in general network operators do not rely on the DSCP 132 of Internet packets. 134 3. Definitions 136 ITU-T G.800 Based Composite and Component Link Definitions: 137 Section 6.9.2 of ITU-T-G.800 [ITU-T.G.800] defines composite and 138 component links as summarized in Appendix C. The following 139 definitions for composite and component links are derived from 140 and intended to be consistent with the cited ITU-T G.800 141 terminology. 143 Composite Link: A composite link is a logical link composed of a 144 set of parallel point-to-point component links, where all 145 links in the set share the same endpoints. A composite link 146 may itself be a component of another composite link, but only 147 a strict hierarchy of links is allowed. 149 Component Link: A point-to-point physical or logical link that 150 preserves ordering in the steady state. A component link may 151 have transient out of order events, but such events must not 152 exceed the network's specific NPO. Examples of a physical 153 link are: Lambda, Ethernet PHY, and OTN. Examples of a 154 logical link are: MPLS LSP, Ethernet VLAN, and MPLS-TP LSP. 156 Flow: A sequence of packets that must be transferred in order.
158 Flow identification: The label stack and other information that 159 uniquely identifies a flow. Other information in flow 160 identification may include an IP header, PW control word, 161 Ethernet MAC address, etc. Note that an LSP may contain one or 162 more Flows, or an LSP may be equivalent to a Flow. Flow 163 identification is used to locally select a component link, or a 164 path through the network toward the destination. 166 Network Performance Objective (NPO): Numerical values for 167 performance measures, principally availability, latency, and 168 delay variation. See Appendix A for more details. 170 4. Network Operator Functional Requirements 172 The Functional Requirements in this section are grouped in 173 subsections starting with the highest priority. 175 4.1. Availability, Stability and Transient Response 177 Limiting the period of unavailability in response to failures or 178 transient events, as well as maintaining stability, is extremely 179 important. The transient period between some service disrupting 180 event and the convergence of the routing and/or signaling protocols 181 MUST fall within a time frame specified by NPO values. Appendix A 182 provides references and a summary of service types requiring a range 183 of restoration times. 185 FR#1 The solution SHALL provide a means to summarize routing 186 advertisements regarding the characteristics of a composite 187 link such that the routing protocol converges within the 188 timeframe needed to meet the network performance objective. 190 FR#2 The solution SHALL ensure that all possible restoration 191 operations happen within the timeframe needed to meet the NPO. 192 The solution may need to specify a means for aggregating 193 signaling to meet this requirement.
195 FR#3 The solution SHALL provide a mechanism to select a path for a 196 flow across a network that contains a number of paths comprised 197 of pairs of nodes connected by composite links in such a way as 198 to automatically distribute the load over the network nodes 199 connected by composite links while meeting all of the other 200 mandatory requirements stated above. The solution SHOULD work 201 in a manner similar to that of current networks without any 202 composite link protocol enhancements when the characteristics 203 of the individual component links are advertised. 205 FR#4 If extensions to existing protocols are specified and/or new 206 protocols are defined, then the solution SHOULD provide a means 207 for a network operator to migrate an existing deployment in a 208 minimally disruptive manner. 210 FR#5 Any automatic LSP routing and/or load balancing solutions MUST 211 NOT oscillate such that the performance observed by users 212 changes to the point that an NPO is violated. Since oscillation 213 may cause reordering, there MUST be means to control the frequency 214 of changing the component link over which a flow is placed. 216 FR#6 Management and diagnostic protocols MUST be able to operate 217 over composite links. 219 4.2. Component Links Provided by Lower Layer Networks 221 Case 3 as defined in [ITU-T.G.800] involves a component link 222 supporting an MPLS layer network over another lower layer network 223 (e.g., circuit switched or another MPLS network (e.g., MPLS-TP)). 224 The lower layer network may change the latency (and/or other 225 performance parameters) seen by the MPLS layer network. Network 226 Operators have NPOs of which some components are based on performance 227 parameters. Currently, there is no protocol for the lower layer 228 network to inform the higher layer network of a change in a 229 performance parameter. Communication of the latency performance 230 parameter is a very important requirement.
Communication of other 231 performance parameters (e.g., delay variation) is desirable. 233 FR#7 In order to support network NPOs and provide acceptable user 234 experience, the solution SHALL specify a protocol means to 235 allow a lower layer server network to communicate latency to 236 the higher layer client network. 238 FR#8 The precision of latency reporting SHOULD be at least 10% of 239 the one-way latency, for latencies of 1 ms or more. 241 FR#9 The solution SHALL provide a means to limit the latency on a 242 per LSP basis between nodes within a network to meet an NPO 243 target when the path between these nodes contains one or more 244 pairs of nodes connected via a composite link. 246 The NPOs differ across the services, and some services have 247 different NPOs for different QoS classes; for example, one QoS 248 class may have a much larger latency bound than another. 249 Overload can occur which would violate an NPO parameter (e.g., 250 loss), and some remedy to handle this case for a composite link 251 is required. 253 FR#10 If the total demand offered by traffic flows exceeds the 254 capacity of the composite link, the solution SHOULD define a 255 means to cause the LSPs for some traffic flows to move to some 256 other point in the network that is not congested. These 257 "preempted LSPs" may not be restored if there is no 258 uncongested path in the network.
When the path for 268 a flow can be chosen from a set of candidate nodes connected via 269 composite links, other techniques have been developed (See 270 Appendix B.4). 272 FR#11 The solution SHALL measure traffic on a labeled traffic flow 273 and dynamically select the component link on which to place 274 this flow in order to balance the load so that no component 275 link in the composite link between a pair of nodes is 276 overloaded. 278 FR#12 When a traffic flow is moved from one component link to 279 another in the same composite link between a set of nodes (or 280 sites), it MUST be done in a minimally disruptive manner. 282 When a flow is moved from a current link to a target link with 283 different latency, reordering can occur if the target link 284 latency is less than that of the current link, or clumping can 285 occur if the target link latency is greater than that of the 286 current link. Some flows (e.g., timing distribution, PW circuit 287 emulation) are quite sensitive to these effects, which may be 288 specified in an NPO or are needed to meet a user experience 289 objective (e.g. jitter buffer under/overrun). 291 FR#13 The solution SHALL provide a means to identify flows whose 292 rearrangement frequency needs to be bounded by a configured 293 value. 295 FR#14 The solution SHALL provide a means that communicates whether 296 the flows within an LSP can be split across multiple component 297 links. The solution SHOULD provide a means to indicate the 298 flow identification field(s) that can be used along the flow 299 path to perform this function. 301 FR#15 The solution SHALL provide a means to indicate that a traffic 302 flow shall select a component link with the minimum latency 303 value. 305 FR#16 The solution SHALL provide a means to indicate that a traffic 306 flow shall select a component link with a maximum acceptable 307 latency value as specified by protocol.
309 FR#17 The solution SHALL provide a means to indicate that a traffic 310 flow shall select a component link with a maximum acceptable 311 delay variation value as specified by protocol. 313 FR#18 The solution SHALL provide a means local to a node that 314 automatically distributes flows across the component links in 315 the composite link such that NPOs are met. 317 FR#19 The solution SHALL provide a means to distribute flows from a 318 single LSP across multiple component links to handle at least 319 the case where the traffic carried in an LSP exceeds that of 320 any component link in the composite link. As defined in 321 section 3, a flow is a sequence of packets that must be 322 transferred in order, and hence kept on one component link. 324 FR#20 The solution SHOULD support the use case where a composite 325 link itself is a component link for a higher order composite 326 link. For example, a composite link comprised of MPLS-TP 327 bidirectional tunnels viewed as logical links could then be used 328 as a component link in yet another composite link that 329 connects MPLS routers. 331 5. Derived Requirements 333 This section takes the next step and derives high-level requirements 334 on protocol specification from the functional requirements. 336 DR#1 The solution SHOULD attempt to extend existing protocols 337 wherever possible, developing a new protocol only if this adds 338 a significant set of capabilities. 340 The vast majority of network operators have provisioned L3VPN 341 services over LDP. Many have deployed L2VPN services over LDP 342 as well. TE extensions to IGP and RSVP-TE are viewed as being 343 overly complex by some operators. 345 DR#2 A solution SHOULD extend LDP capabilities to meet functional 346 requirements (without using TE methods as decided in 347 [RFC3468]). 349 DR#3 Coexistence of LDP and RSVP-TE signaled LSPs MUST be supported 350 on a composite link.
Other functional requirements should be 351 supported as independently of signaling protocol as possible. 353 DR#4 When the nodes connected via a composite link are in the same 354 MPLS network topology, the solution MAY define extensions to 355 the IGP. 357 DR#5 When the nodes connected via a composite link are in 358 different MPLS network topologies, the solution SHALL NOT rely 359 on extensions to the IGP. 361 DR#6 The solution SHALL support composite link IGP advertisement 362 that results in convergence time better than that of 363 advertising the individual component links. The solution SHALL 364 be designed so that it represents the range of capabilities of 365 the individual component links such that functional 366 requirements are met, and also minimizes the frequency of 367 advertisement updates which may cause IGP convergence to occur. 369 One solution approach is to summarize the characteristics of 370 the component links in IGP advertisements; however, the intent 371 of the above requirement is not to specify the form of a 372 solution. Examples of advertisement update triggering events 373 to be considered include: LSP establishment/release, changes in 374 component link characteristics (e.g., latency, up/down state), 375 and/or bandwidth utilization. 377 DR#7 When a worst case failure scenario occurs, the resulting number 378 of links advertised in the IGP causes IGP convergence to occur, 379 causing a period of unavailability as perceived by users. The 380 convergence time of the solution MUST meet the SLA objective 381 for the duration of unavailability. 383 DR#8 When a worst case failure scenario occurs, the number of 384 RSVP-TE LSPs to be resignaled will cause a period of 385 unavailability as perceived by users. The resignaling time of 386 the solution MUST meet the NPO objective for the duration of 387 unavailability. The resignaling time of the solution MUST NOT 388 increase significantly as compared with current methods. 390 6.
Acknowledgements 392 Frederic Jounay of France Telecom and Yuji Kamite of NTT 393 Communications Corporation co-authored a version of this document. 395 A rewrite of this document occurred after the IETF77 meeting. 396 Dimitri Papadimitriou, Lou Berger, Tony Li, the WG chairs John Scudder 397 and Alex Zinin, and others provided valuable guidance prior to and at 398 the IETF77 RTGWG meeting. 400 Tony Li and John Drake have made numerous valuable comments on the 401 RTGWG mailing list that are reflected in versions following the 402 IETF77 meeting. 404 7. IANA Considerations 406 This memo includes no request to IANA. 408 8. Security Considerations 410 This document specifies a set of requirements. The requirements 411 themselves do not pose a security threat. If these requirements are 412 met using MPLS signaling as commonly practiced today with 413 authenticated but unencrypted OSPF-TE, ISIS-TE, and RSVP-TE or LDP, 414 then the additional information carried in this 415 communication could conceivably 416 be gathered in a man-in-the-middle confidentiality breach. Such an 417 attack would require a capability to monitor this signaling either 418 through a provider breach or access to provider physical transmission 419 infrastructure. A provider breach already poses a threat of numerous 420 types of attacks which are of far more serious consequence. Encryption 421 of the signaling can prevent or render more difficult any 422 confidentiality breach that otherwise might occur by means of access 423 to provider physical transmission infrastructure. 425 9. References 427 9.1. Normative References 429 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 430 Requirement Levels", BCP 14, RFC 2119, March 1997. 432 9.2. Informative References 434 [I-D.ietf-l2vpn-vpms-frmwk-requirements] 435 Kamite, Y., JOUNAY, F., Niven-Jenkins, B., Brungard, D., 436 and L.
Jin, "Framework and Requirements for Virtual 437 Private Multicast Service (VPMS)", 438 draft-ietf-l2vpn-vpms-frmwk-requirements-03 (work in 439 progress), July 2010. 441 [ITU-T.G.800] 442 ITU-T, "Unified functional architecture of transport 443 networks", 2007, . 446 [RFC2702] Awduche, D., Malcolm, J., Agogbua, J., O'Dell, M., and J. 447 McManus, "Requirements for Traffic Engineering Over MPLS", 448 RFC 2702, September 1999. 450 [RFC3031] Rosen, E., Viswanathan, A., and R. Callon, "Multiprotocol 451 Label Switching Architecture", RFC 3031, January 2001. 453 [RFC3468] Andersson, L. and G. Swallow, "The Multiprotocol Label 454 Switching (MPLS) Working Group decision on MPLS signaling 455 protocols", RFC 3468, February 2003. 457 [RFC3809] Nagarajan, A., "Generic Requirements for Provider 458 Provisioned Virtual Private Networks (PPVPN)", RFC 3809, 459 June 2004. 461 [RFC4031] Carugi, M. and D. McDysan, "Service Requirements for Layer 462 3 Provider Provisioned Virtual Private Networks (PPVPNs)", 463 RFC 4031, April 2005. 465 [RFC4364] Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private 466 Networks (VPNs)", RFC 4364, February 2006. 468 [RFC4664] Andersson, L. and E. Rosen, "Framework for Layer 2 Virtual 469 Private Networks (L2VPNs)", RFC 4664, September 2006. 471 [RFC4665] Augustyn, W. and Y. Serbest, "Service Requirements for 472 Layer 2 Provider-Provisioned Virtual Private Networks", 473 RFC 4665, September 2006. 475 [RFC4761] Kompella, K. and Y. Rekhter, "Virtual Private LAN Service 476 (VPLS) Using BGP for Auto-Discovery and Signaling", 477 RFC 4761, January 2007. 479 [RFC4762] Lasserre, M. and V. Kompella, "Virtual Private LAN Service 480 (VPLS) Using Label Distribution Protocol (LDP) Signaling", 481 RFC 4762, January 2007. 483 [RFC4797] Rekhter, Y., Bonica, R., and E. Rosen, "Use of Provider 484 Edge to Provider Edge (PE-PE) Generic Routing 485 Encapsulation (GRE) or IP in BGP/MPLS IP Virtual Private 486 Networks", RFC 4797, January 2007. 
488 [RFC5254] Bitar, N., Bocci, M., and L. Martini, "Requirements for 489 Multi-Segment Pseudowire Emulation Edge-to-Edge (PWE3)", 490 RFC 5254, October 2008. 492 9.3. Appendix References 494 [I-D.ietf-pwe3-fat-pw] 495 Bryant, S., Filsfils, C., Drafz, U., Kompella, V., Regan, 496 J., and S. Amante, "Flow Aware Transport of Pseudowires 497 over an MPLS PSN", draft-ietf-pwe3-fat-pw-03 (work in 498 progress), January 2010. 500 [IEEE-802.1AX] 501 IEEE Standards Association, "IEEE Std 802.1AX-2008 IEEE 502 Standard for Local and Metropolitan Area Networks - Link 503 Aggregation", 2006, . 506 [ITU-T.Y.1540] 507 ITU-T, "Internet protocol data communication service - IP 508 packet transfer and availability performance parameters", 509 2007, . 511 [ITU-T.Y.1541] 512 ITU-T, "Network performance objectives for IP-based 513 services", 2006, . 515 [RFC1717] Sklower, K., Lloyd, B., McGregor, G., and D. Carr, "The 516 PPP Multilink Protocol (MP)", RFC 1717, November 1994. 518 [RFC2475] Blake, S., Black, D., Carlson, M., Davies, E., Wang, Z., 519 and W. Weiss, "An Architecture for Differentiated 520 Services", RFC 2475, December 1998. 522 [RFC2615] Malis, A. and W. Simpson, "PPP over SONET/SDH", RFC 2615, 523 June 1999. 525 [RFC2991] Thaler, D. and C. Hopps, "Multipath Issues in Unicast and 526 Multicast Next-Hop Selection", RFC 2991, November 2000. 528 [RFC2992] Hopps, C., "Analysis of an Equal-Cost Multi-Path 529 Algorithm", RFC 2992, November 2000. 531 [RFC3260] Grossman, D., "New Terminology and Clarifications for 532 Diffserv", RFC 3260, April 2002. 534 [RFC4201] Kompella, K., Rekhter, Y., and L. Berger, "Link Bundling 535 in MPLS Traffic Engineering (TE)", RFC 4201, October 2005. 537 [RFC4301] Kent, S. and K. Seo, "Security Architecture for the 538 Internet Protocol", RFC 4301, December 2005. 540 [RFC4385] Bryant, S., Swallow, G., Martini, L., and D. McPherson, 541 "Pseudowire Emulation Edge-to-Edge (PWE3) Control Word for 542 Use over an MPLS PSN", RFC 4385, February 2006. 
544 [RFC4928] Swallow, G., Bryant, S., and L. Andersson, "Avoiding Equal 545 Cost Multipath Treatment in MPLS Networks", BCP 128, 546 RFC 4928, June 2007. 548 Appendix A. More Details on Existing Network Operator Practices and 549 Protocol Usage 551 Often, network operators have a contractual Service Level Agreement 552 (SLA) with customers for services that are comprised of numerical 553 values for performance measures, principally availability, latency, 554 and delay variation. Additionally, network operators may have a Service 555 Level Specification (SLS) that is for internal use by the operator. 556 See [ITU-T.Y.1540], [ITU-T.Y.1541], and RFC 3809, Section 4.9 [RFC3809] 557 for examples of the form of such SLA and SLS specifications. In this 558 document we use the term Network Performance Objective (NPO) as 559 defined in section 5 of [ITU-T.Y.1541] since the SLA and SLS measures 560 have network operator and service specific implications. Note that 561 the numerical NPO values of Y.1540 and Y.1541 span multiple networks 562 and may be looser than network operator SLA or SLS objectives. 563 Applications and acceptable user experience have an important 564 relationship to these performance parameters. 566 Consider latency as an example. In some cases, minimizing latency 567 relates directly to the best customer experience (e.g., in TCP closer 568 is faster). In other cases, user experience is relatively insensitive 569 to latency, up to a specific limit at which point user perception of 570 quality degrades significantly (e.g., interactive human voice and 571 multimedia conferencing). A number of NPOs have a bound on point-to- 572 point latency, and as long as this bound is met, the NPO is met -- 573 decreasing the latency is not necessary. In some NPOs, if the 574 specified latency is not met, the user considers the service as 575 unavailable.
An unprotected LSP can be manually provisioned on a set 576 of links to meet this type of NPO, but this lowers availability since an 577 alternate route that meets the latency NPO cannot be determined. 579 Historically, when an IP/MPLS network was operated over a lower layer 580 circuit switched network (e.g., SONET rings), a change in latency 581 caused by the lower layer network (e.g., due to a maintenance action 582 or failure) was not known to the MPLS network. This resulted in 583 latency affecting end user experience, sometimes violating NPOs or 584 resulting in user complaints. 586 A response to this problem was to provision IP/MPLS networks over 587 unprotected circuits and set the metric and/or TE-metric proportional 588 to latency. This resulted in traffic being directed over the least 589 latency path, even if this was not needed to meet an NPO or meet user 590 experience objectives. This resulted in reduced flexibility and 591 increased cost for network operators. Using lower layer networks to 592 provide restoration and grooming is expected to be more efficient, 593 but the inability to communicate performance parameters, in 594 particular latency, from the lower layer network to the higher layer 595 network is an important problem to be solved before this can be done. 597 Latency NPOs for pt-pt services are often tied closely to geographic 598 locations, while latency for multipoint services may be based upon a 599 worst case within a region. 601 Section 7 of [ITU-T.Y.1540] defines availability for an IP service in 602 terms of loss exceeding a threshold for a period on the order of 5 603 minutes.
However, the timeframes for restoration (i.e., as 604 implemented by pre-determined protection, convergence of routing 605 protocols and/or signaling) for services range from on the order of 606 100 ms or less (e.g., for VPWS to emulate classical SDH/SONET 607 protection switching), to several minutes (e.g., to allow BGP to 608 reconverge for L3VPN) and may differ among the set of customers 609 within a single service. 611 The presence of only three Traffic Class (TC) bits (previously known 612 as EXP bits) in the MPLS shim header is limiting when a network 613 operator needs to support QoS classes for multiple services (e.g., 614 L2VPN VPWS, VPLS, L3VPN and Internet), each of which has a set of QoS 615 classes that need to be supported. In some cases one bit is used to 616 indicate conformance to some ingress traffic classification, leaving 617 only two bits for indicating the service QoS classes. The approach 618 that has been taken is to aggregate these QoS classes into similar 619 sets on LER-LSR and LSR-LSR links. 621 Labeled LSPs and use of link layer encapsulation have been 622 standardized in order to provide a means to meet these needs. 624 The IP DSCP cannot be used for flow identification since RFC 4301 625 Section 5.5 [RFC4301] requires Diffserv transparency, and in general 626 network operators do not rely on the DSCP of Internet packets. 628 A label pushed onto Internet packets when they are carried along 629 with L2/L3VPN packets on the same link or lower layer network 630 provides a means to distinguish between the QoS classes for these 631 packets. 633 Operating an MPLS-TE network involves a different paradigm from 634 operating an IGP metric-based LDP signaled MPLS network. The mpt-pt 635 LDP signaled MPLS LSPs occur automatically, and balancing across 636 parallel links occurs if the IGP metrics are set "equally" (with 637 equality a locally definable relation).
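As a rough illustration (not part of the draft), the equal-metric balancing just described can be sketched as follows. The link names, metrics, and the use of SHA-256 as the flow hash are illustrative assumptions; real implementations use vendor-specific hashes over packet header fields.

```python
import hashlib

def ecmp_next_hops(links):
    """Return the parallel links sharing the minimum IGP metric,
    i.e., the links whose metrics are set "equally"."""
    best = min(metric for _name, metric in links)
    return [name for name, metric in links if metric == best]

def pick_link(flow_key, candidates):
    """Hash an opaque flow key and pick one equal-cost link.  The
    hash is stable, so every packet of a flow takes the same link
    and per-flow packet order is preserved."""
    digest = hashlib.sha256(flow_key.encode()).digest()
    return candidates[int.from_bytes(digest[:4], "big") % len(candidates)]

# Hypothetical parallel links with IGP metrics; only the two
# metric-10 links are eligible for balancing.
links = [("ge-0/0/1", 10), ("ge-0/0/2", 10), ("ge-0/0/3", 20)]
eligible = ecmp_next_hops(links)
```

Because the per-flow hash is deterministic, no per-packet state is needed, but a single large flow still lands entirely on one link.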
639 Traffic is typically comprised of a few large (some very large) flows 640 and many small flows. In some cases, separate LSPs are established 641 for very large flows. This can occur even if the IP header 642 information is inspected by a router, for example an IPsec tunnel 643 that carries a large amount of traffic. An important example of 644 large flows is that of a L2/L3 VPN customer who has an access line 645 bandwidth comparable to a client-client composite link bandwidth -- 646 there could be flows that are on the order of the access line 647 bandwidth. 649 Appendix B. Existing Multipath Standards and Techniques 651 Today the requirement to handle large aggregations of traffic, much 652 larger than a single component link, can be handled by a number of 653 techniques which we will collectively call multipath. Multipath 654 applied to parallel links between the same set of nodes includes 655 Ethernet Link Aggregation [IEEE-802.1AX], link bundling [RFC4201], or 656 other aggregation techniques some of which may be vendor specific. 657 Multipath applied to diverse paths rather than parallel links 658 includes Equal Cost MultiPath (ECMP) as applied to OSPF, ISIS, or 659 even BGP, and equal cost LSP, as described in Appendix B.4. Various 660 multipath techniques have strengths and weaknesses. 662 The term composite link is more general than terms such as link 663 aggregate, which is generally considered to be specific to Ethernet, 664 and its use here is consistent with the broad definition in 665 [ITU-T.G.800]. The term multipath excludes inverse multiplexing and 666 refers to techniques which only solve the problem of large 667 aggregations of traffic, without addressing the other requirements 668 outlined in this document. 670 B.1. Common Multipath Load Splitting Techniques 672 Identical load balancing techniques are used for multipath both over 673 parallel links and over diverse paths.
675 Large aggregates of IP traffic do not provide explicit signaling to 676 indicate the expected traffic loads. Large aggregates of MPLS 677 traffic are carried in MPLS tunnels supported by MPLS LSP. LSP which 678 are signaled using RSVP-TE extensions do provide explicit signaling 679 which includes the expected traffic load for the aggregate. LSP 680 which are signaled using LDP do not provide an expected traffic load. 682 MPLS LSP may contain other MPLS LSP arranged hierarchically. When an 683 MPLS LSR serves as a midpoint LSR in an LSP carrying other LSP as 684 payload, there is no signaling associated with these inner LSP. 685 Therefore, even when using RSVP-TE signaling, there may be insufficient 686 information provided by signaling to adequately distribute load 687 across a composite link. 689 Generally a set of label stack entries that is unique across the 690 ordered set of label numbers can safely be assumed to contain a group 691 of flows. The reordering of traffic can therefore be considered to 692 be acceptable unless reordering occurs within traffic containing a 693 common unique set of label stack entries. Existing load splitting 694 techniques take advantage of this property in addition to looking 695 beyond the bottom of the label stack and determining if the payload 696 is IPv4 or IPv6 to load balance traffic accordingly. 698 MPLS-TP OAM violates the assumption that it is safe to reorder 699 traffic within an LSP. If MPLS-TP OAM is to be accommodated, then 700 existing multipath techniques must be modified. Such modifications 701 are outside the scope of this document. 703 For example a large aggregate of IP traffic may be subdivided into a 704 large number of groups of flows using a hash on the IP source and 705 destination addresses. This is as described in [RFC2475] and 706 clarified in [RFC3260]. For MPLS traffic carrying IP, a similar hash 707 can be performed on the set of labels in the label stack.
These 708 techniques are both examples of means to subdivide traffic into 709 groups of flows for the purpose of load balancing traffic across 710 aggregated link capacity. The means of identifying a flow should not 711 be confused with the definition of a flow. 713 Discussion of whether a hash based approach provides a sufficiently 714 even load balance using any particular hashing algorithm or method of 715 distributing traffic across a set of component links is outside of 716 the scope of this document. 718 The current load balancing techniques are referenced in [RFC4385] and 719 [RFC4928]. The use of three hash based approaches is described in 720 [RFC2991] and [RFC2992]. A mechanism to identify flows within PW is 721 described in [I-D.ietf-pwe3-fat-pw]. The use of hash based 722 approaches is mentioned as an example of an existing set of 723 techniques to distribute traffic over a set of component links. 724 Other techniques are not precluded. 726 B.2. Simple and Adaptive Load Balancing Multipath 728 Simple multipath generally relies on the mathematical probability 729 that given a very large number of small microflows, these microflows 730 will tend to be distributed evenly across a hash space. A common 731 simple multipath implementation assumes that all members (component 732 links) are of equal capacity and performs a modulo operation on 733 the hashed value. An alternate simple multipath technique uses a 734 table, generally with a power of two size, and distributes the table 735 entries proportionally among members according to the capacity of 736 each member. 738 Simple load balancing works well if there are a very large number of 739 small microflows (i.e., microflow rate is much less than component 740 link capacity). However, the case where there are even a few large 741 microflows is not handled well by simple load balancing.
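The table-based technique described above can be sketched as follows. This is an illustrative example only, not an implementation from the draft; the member names, capacities, table size of 256, and SHA-256 flow hash are all assumptions.

```python
import hashlib

def build_table(members, size=256):
    """Build a load balancing table (power-of-two size) whose
    entries are distributed proportionally to each member link's
    capacity, as in the simple table-based technique."""
    total = sum(capacity for _name, capacity in members)
    table = []
    for i, (name, capacity) in enumerate(members):
        if i == len(members) - 1:
            share = size - len(table)  # last member absorbs rounding error
        else:
            share = round(size * capacity / total)
        table.extend([name] * share)
    return table

def select_member(flow_key, table):
    """Hash the flow key and index the table; a flow always maps to
    the same member, so packets within a flow stay in order."""
    digest = hashlib.sha256(flow_key.encode()).digest()
    return table[int.from_bytes(digest[:4], "big") % len(table)]

# Hypothetical members with 10, 10 and 40 units of capacity; the
# third link receives roughly two thirds of the table entries.
members = [("link-a", 10), ("link-b", 10), ("link-c", 40)]
table = build_table(members)
```

With a uniform hash, each member's share of traffic tracks its share of table entries, which illustrates why a few large microflows break the scheme: one heavy flow occupies a single table entry and cannot be subdivided.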
743 An adaptive multipath technique is one where the traffic bound to 744 each member (component link) is measured and the load split is 745 adjusted accordingly. As long as the adjustment is done within a 746 single network element, then no protocol extensions are required and 747 there are no interoperability issues. 749 Note that if the load balancing algorithm and/or its parameters are 750 adjusted, then packets in some flows may be delivered out of 751 sequence. 753 B.3. Traffic Split over Parallel Links 755 The load splitting techniques defined in Appendix B.1 and Appendix B.2 756 are both used in splitting traffic over parallel links between the 757 same pair of nodes. The best known technique, though far from being 758 the first, is Ethernet Link Aggregation [IEEE-802.1AX]. This same 759 technique had been applied much earlier using OSPF or ISIS Equal Cost 760 MultiPath (ECMP) over parallel links between the same nodes. 761 Multilink PPP [RFC1717] uses a technique that provides inverse 762 multiplexing; however, a number of vendors had provided proprietary 763 extensions to PPP over SONET/SDH [RFC2615] that predated Ethernet 764 Link Aggregation but are no longer used. 766 Link bundling [RFC4201] provides yet another means of handling 767 parallel LSP. RFC4201 explicitly allows a special value of all ones 768 to indicate a split across all members of the bundle. 770 B.4. Traffic Split over Multiple Paths 772 OSPF or ISIS Equal Cost MultiPath (ECMP) is a well known form of 773 traffic split over multiple paths that may traverse intermediate 774 nodes. ECMP is often incorrectly equated to only this case, and 775 multipath over multiple diverse paths is often incorrectly equated to 776 ECMP. 778 Many implementations are able to create more than one LSP between a 779 pair of nodes, where these LSP are routed diversely to better make 780 use of available capacity. The load on these LSP can be distributed 781 proportionally to the reserved bandwidth of the LSP.
These multiple 782 LSP may be advertised as a single PSC FA and any LSP making use of 783 the FA may be split over these multiple LSP. 785 Link bundling [RFC4201] component links may themselves be LSP. When 786 this technique is used, any LSP which specifies the link bundle may 787 be split across the multiple paths of the LSP that comprise the 788 bundle. 790 Appendix C. ITU-T G.800 Composite Link Definitions and Terminology 791 Composite Link: 792 Section 6.9.2 of ITU-T-G.800 [ITU-T.G.800] defines composite link 793 in terms of three cases, of which the following two are relevant 794 (the one describing inverse (TDM) multiplexing does not apply). 795 Note that these case definitions are taken verbatim from section 796 6.9, "Layer Relationships". 798 Case 1: "Multiple parallel links between the same subnetworks 799 can be bundled together into a single composite link. Each 800 component of the composite link is independent in the sense 801 that each component link is supported by a separate server 802 layer trail. The composite link conveys communication 803 information using different server layer trails thus the 804 sequence of symbols crossing this link may not be preserved. 805 This is illustrated in Figure 14." 807 Case 3: "A link can also be constructed by a concatenation of 808 component links and configured channel forwarding 809 relationships. The forwarding relationships must have a 1:1 810 correspondence to the link connections that will be provided 811 by the client link. In this case, it is not possible to 812 fully infer the status of the link by observing the server 813 layer trails visible at the ends of the link. This is 814 illustrated in Figure 16." 816 Subnetwork: A set of one or more nodes (i.e., LER or LSR) and links. 817 As a special case it can represent a site comprised of multiple 818 nodes. 820 Forwarding Relationship: Configured forwarding between ports on a 821 subnetwork. 
It may be connectionless (e.g., IP, not considered 822 in this draft), or connection oriented (e.g., MPLS signaled or 823 configured). 825 Component Link: A topological relationship between subnetworks 826 (i.e., a connection between nodes), which may be a wavelength, 827 circuit, virtual circuit or an MPLS LSP. 829 Authors' Addresses 831 Curtis Villamizar (editor) 832 Infinera Corporation 833 169 W. Java Drive 834 Sunnyvale, CA 94089 836 Email: cvillamizar@infinera.com 837 Dave McDysan (editor) 838 Verizon 839 22001 Loudoun County PKWY 840 Ashburn, VA 20147 842 Email: dave.mcdysan@verizon.com 844 So Ning 845 Verizon 846 2400 N. Glenville Ave. 847 Richardson, TX 75082 849 Phone: +1 972-729-7905 850 Email: ning.so@verizonbusiness.com 852 Andrew Malis 853 Verizon 854 117 West St. 855 Waltham, MA 02451 857 Phone: +1 781-466-2362 858 Email: andrew.g.malis@verizon.com 860 Lucy Yong 861 Huawei USA 862 1700 Alma Dr. Suite 500 863 Plano, TX 75075 865 Phone: +1 469-229-5387 866 Email: lucyyong@huawei.com