RTGWG                                                            S. Ning
Internet-Draft                                       Tata Communications
Intended status: Informational                                  A. Malis
Expires: May 17, 2014                                          Consultant
                                                               D. McDysan
                                                                  Verizon
                                                                  L. Yong
                                                               Huawei USA
                                                            C. Villamizar
                                       Outer Cape Cod Network Consulting
                                                        November 13, 2013


        Advanced Multipath Use Cases and Design Considerations
                    draft-ietf-rtgwg-cl-use-cases-05

Abstract

Advanced Multipath is a formalization of multipath techniques currently
in use in IP and MPLS networks and a set of extensions to existing
multipath techniques.

This document provides a set of use cases and design considerations for
Advanced Multipath.  Existing practices are described, as are use cases
made possible through Advanced Multipath extensions.

Status of This Memo

This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task
Force (IETF).  Note that other groups may also distribute working
documents as Internet-Drafts.  The list of current Internet-Drafts is
at http://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time.  It is inappropriate to use Internet-Drafts as reference material
or to cite them other than as "work in progress."

This Internet-Draft will expire on May 17, 2014.

Copyright Notice

Copyright (c) 2013 IETF Trust and the persons identified as the
document authors.  All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document.  Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document.
Code Components extracted from this document must include Simplified
BSD License text as described in Section 4.e of the Trust Legal
Provisions and are provided without warranty as described in the
Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Assumptions
   3.  Terminology
   4.  Multipath Foundation Use Cases
   5.  Advanced Multipath Use Cases
     5.1.  Delay Sensitive Applications
     5.2.  Large Volume of IP and LDP Traffic
     5.3.  Multipath and Packet Ordering
       5.3.1.  MPLS-TP in network edges only
       5.3.2.  Multipath at core LSP ingress/egress
       5.3.3.  MPLS-TP as an MPLS client
   6.  IANA Considerations
   7.  Security Considerations
   8.  Acknowledgments
   9.  Informative References
   Appendix A.  Network Operator Practices and Protocol Usage
   Appendix B.  Existing Multipath Standards and Techniques
     B.1.  Common Multipath Load Splitting Techniques
     B.2.  Static and Dynamic Load Balancing Multipath
     B.3.  Traffic Split over Parallel Links
     B.4.  Traffic Split over Multiple Paths
   Appendix C.  Characteristics of Transport in Core Networks
   Authors' Addresses

1.  Introduction

Advanced Multipath requirements are specified in
[I-D.ietf-rtgwg-cl-requirement].  An Advanced Multipath framework is
defined in [I-D.ietf-rtgwg-cl-framework].

Multipath techniques have been widely used in IP networks for over two
decades.  The use of MPLS began more than a decade ago.  Multipath has
been widely used in IP/MPLS networks for over a decade with very
little protocol support dedicated to the effective use of multipath.

The state of the art in multipath prior to Advanced Multipath is
documented in Appendix B.

Both Ethernet Link Aggregation [IEEE-802.1AX] and MPLS link bundling
[RFC4201] have been widely used in today's MPLS networks.  Advanced
Multipath differs in the following characteristics:

1.  Advanced Multipath allows bundling of non-homogeneous links
    together as a single logical link.

2.  Advanced Multipath provides more information in the TE-LSDB and
    supports more explicit control over placement of LSP.

2.  Assumptions

The supported services include, but are not limited to, pseudowire
(PW) based services [RFC3985], including Virtual Private Network (VPN)
services, Internet traffic encapsulated by at least one MPLS label
[RFC3032], and dynamically signaled MPLS ([RFC3209] or [RFC5036]) or
MPLS-TP Label Switched Paths (LSPs) [RFC5921].

The MPLS LSPs supporting these services may be point-to-point,
point-to-multipoint, or multipoint-to-multipoint.  The MPLS LSPs may
be signaled using RSVP-TE [RFC3209] or LDP [RFC5036].  With RSVP-TE,
extensions to Interior Gateway Protocols (IGPs) may be used,
specifically OSPF-TE [RFC3630] or ISIS-TE [RFC5305].

The locations in a network where these requirements apply are a Label
Edge Router (LER) or a Label Switching Router (LSR) as defined in
[RFC3031].

The IP DSCP field [RFC2474] [RFC2475] cannot be used for flow
identification, since L3VPN requires Diffserv transparency (see
Section 5.5.2 of [RFC4031]), and in general network operators do not
rely on the DSCP of Internet packets.

3.  Terminology

Terminology defined in [I-D.ietf-rtgwg-cl-requirement] and
[I-D.ietf-mpls-multipath-use] is used in this document.

In addition, the following terms are used:

classic multipath:
   Classic multipath refers to the most common current practice in the
   implementation and deployment of multipath (see Appendix B).  The
   most common current practice, when applied to MPLS traffic, makes
   use of a hash on the MPLS label stack and, if IPv4 or IPv6 is
   indicated under the label stack, makes use of the IP source and
   destination addresses [RFC4385] [RFC4928] (a sketch of this form of
   selection appears at the end of this section).

classic link bundling:
   Classic link bundling refers to the use of [RFC4201] where the "all
   ones" component is not used.  Where the "all ones" component is
   used, link bundling behaves as classic multipath does.  Classic
   link bundling selects a single component link to carry all of the
   traffic for a given LSP.

Among the important distinctions between classic multipath or classic
link bundling and Advanced Multipath are:

1.  Classic multipath has no provision to retain packet order within
    any specific LSP.  Classic link bundling retains packet order
    within any given LSP but as a result does a poor job of splitting
    load among components and therefore is rarely (if ever) deployed.
    Advanced Multipath allows per-LSP control of load split
    characteristics.

2.  Classic multipath and classic link bundling do not provide a means
    to put some LSP on component links with lower delay.  Advanced
    Multipath does.

3.  Classic multipath will provide a load balance for IP and LDP
    traffic.  Classic link bundling will not.  Neither classic
    multipath nor classic link bundling will measure IP and LDP
    traffic and reduce the RSVP-TE advertised "Available Bandwidth" as
    a result of that measurement.  Advanced Multipath better supports
    RSVP-TE used with significant traffic levels of native IP and
    native LDP.

4.  Classic link bundling cannot support an LSP that is greater in
    capacity than any single component link.  Classic multipath
    supports this capability but may reorder traffic on such an LSP.
    Advanced Multipath can retain the order of an LSP that is carried
    within an LSP that is greater in capacity than any single
    component link, if the contained LSP has such a requirement.

None of these techniques, classic multipath, classic link bundling, or
Advanced Multipath, will reorder traffic among IP microflows.  None of
these techniques will reorder traffic within a PW, if a PWE3 Control
Word is used [RFC4385].
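As an illustration of the classic multipath term defined above, the
following Python sketch shows one plausible form of hash-based
component link selection.  The choice of fields follows the definition
above; the CRC-32 hash, the byte layout, and the component table are
simplifying assumptions for illustration, not any particular
implementation.

   import zlib

   def classic_multipath_select(label_stack, ip_addrs, components):
       """Pick a component link for a packet, per the classic
       multipath definition above: hash the MPLS label stack and, if
       an IPv4 or IPv6 payload is indicated below the bottom of
       stack, include the IP source and destination addresses.  The
       IP DSCP is deliberately excluded (see Section 2)."""
       key = b"".join(label.to_bytes(3, "big") for label in label_stack)
       if ip_addrs is not None:          # (src, dst) if payload is IP
           src, dst = ip_addrs
           key += src + dst
       # Any stable hash works for illustration; real LSR use
       # hardware-specific hash functions.
       bucket = zlib.crc32(key)
       return components[bucket % len(components)]

   # Example: three component links, a packet with a two-label stack
   # carrying IPv4.
   components = ["link-1", "link-2", "link-3"]
   link = classic_multipath_select(
       [16001, 299776],
       (bytes([192, 0, 2, 1]), bytes([198, 51, 100, 7])),
       components)

All packets of a microflow produce the same key and therefore the same
component link, which is why no technique above reorders traffic
within an IP microflow.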
4.  Multipath Foundation Use Cases

A simple multipath composed entirely of physical links is illustrated
in Figure 1, where a multipath is configured between LSR1 and LSR2.
This multipath has three component links.  Individual component links
in a multipath may be supported by different transport technologies
such as SONET, OTN, or Ethernet.  Even if the transport technology
implementing the component links is identical, the characteristics
(e.g., bandwidth, latency) of the component links may differ.

The multipath in Figure 1 may carry LSP traffic flows and control
plane packets.  Control plane packets may appear as IP packets or may
be carried within a generic associated channel (G-ACh) [RFC5586].  An
LSP may be established over the link by either the RSVP-TE [RFC3209]
or LDP [RFC5036] signaling protocols.  All component links in a
multipath are summarized in the same forwarding adjacency LSP (FA-LSP)
routing advertisement [RFC3945].  The multipath is summarized as one
TE link advertised into the IGP by the multipath end points (the LER
if the multipath is MPLS based).  This information is used in path
computation when a full MPLS control plane is in use.

If Advanced Multipath techniques are used, then the individual
component links or groups of component links may optionally be
advertised into the IGP as sub-TLVs of the multipath FA advertisement
to indicate the capacity available with various characteristics, such
as a delay range (an illustration follows at the end of this section).

                Management Plane
       Configuration and Measurement <------------+
                  ^                               |
                  |                               |
          +-------+-+                       +-+-------+
          |       | |                       | |       |
   CP Packets     V |                       | V CP Packets
    |  V    |       | |   Component Link 1    | |       |    ^  |
    |  |    |       |=|=======================|=|       |    |  |
    |  +----|       | |   Component Link 2    | |       |----+  |
    |       |       |=|=======================|=|       |       |
   Aggregated LSPs  | |                       | |       |       |
   ~|~~~~~~>|       | |   Component Link 3    | |       |~~~~>~~|~~
    |       |       |=|=======================|=|       |       |
          |       | |                       | |       |
          |  LSR1   |                       |  LSR2   |
          +---------+                       +---------+
          !                                           !
          !                                           !
          !<-------------- Multipath ---------------->!

       Figure 1: A multipath constructed with multiple physical
                        links between two LSR

[I-D.ietf-rtgwg-cl-requirement] specifies that component links may
themselves be multipath.  This is true for most implementations even
prior to the Advanced Multipath work in
[I-D.ietf-rtgwg-cl-requirement].  For example, a component of a
pre-Advanced-Multipath MPLS link bundle, or of ISIS or OSPF ECMP,
could be an Ethernet LAG.  In some implementations many other
combinations, or even arbitrary combinations, could be supported.
Figure 2 shows three forms of component links which may be deployed in
a network.

    +-------+                1. Physical Link               +-------+
    |       |-|---------------------------------------------|-|     |
    |       | |                                             | |     |
    |       | |    +------+                     +------+    | |     |
    |       | |    | MPLS |   2. Logical Link   | MPLS |    | |     |
    |       |.|....|......|.....................|......|....|.|     |
    |       | |----| LSR3 |---------------------| LSR4 |----| |     |
    |       | |    +------+                     +------+    | |     |
    |       | |                                             | |     |
    |       | |                                             | |     |
    |       | |    +------+                     +------+    | |     |
    |       | |    |GMPLS |   3. Logical Link   |GMPLS |    | |     |
    |       |.|....|......|.....................|......|....|.|     |
    |       | |----| LSR5 |---------------------| LSR6 |----| |     |
    |       |      +------+                     +------+      |     |
    | LSR1  |                                                 | LSR2|
    +-------+                                                 +-----+
    |<----------------------- Multipath --------------------->|

         Figure 2: Illustration of Various Component Link Types

The three forms of component link shown in Figure 2 are:

1.  The first component link is configured with direct physical media
    plus a link layer protocol.  This case also includes emulated
    physical links, for example using pseudowire emulation.

2.  The second component link is a TE tunnel that traverses LSR3 and
    LSR4, where LSR3 and LSR4 are nodes supporting MPLS but supporting
    few or no GMPLS extensions.

3.  The third component link is formed by a lower layer network that
    has GMPLS enabled.  In this case, LSR5 and LSR6 are not nodes
    controlled by MPLS but provide the connectivity for the component
    link.

A multipath forms one logical link between the connected LSR (LSR1 and
LSR2 in Figure 1 and Figure 2) and is used to carry aggregated
traffic.  A multipath relies on its component links to carry the
traffic but must distribute or load balance the traffic.  The
endpoints of the multipath map incoming traffic onto the set of
component links.

For example, LSR1 in Figure 1 distributes the set of traffic flows,
including control plane packets, among the set of component links.
LSR2 in Figure 1 receives the packets from its component links and
sends them to the MPLS forwarding engine with no attempt to reorder
packets arriving on different component links.  The traffic in the
opposite direction, from LSR2 to LSR1, is distributed across the set
of component links by LSR2.

These three forms of component link are a limited set of very simple
examples.  Many other examples are possible.  A component link may
itself be a multipath.  A segment of an LSP (a single hop for that
LSP) may be a multipath.
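The following sketch illustrates the kind of information the optional
sub-TLV advertisement described above could convey: per-delay-range
available capacity summarized from the component links.  The record
layout, field names, and numbers are illustrative assumptions, not a
proposed encoding.

   from collections import defaultdict

   # Component links of the multipath in Figure 1, with illustrative
   # (capacity in Gb/s, one-way delay in microseconds) attributes.
   components = [
       {"name": "Component Link 1", "capacity": 100, "delay": 300},
       {"name": "Component Link 2", "capacity": 100, "delay": 350},
       {"name": "Component Link 3", "capacity": 40,  "delay": 9000},
   ]

   # Configured delay ranges (min_us, max_us), as in "capacity
   # available with various characteristics, such as a delay range".
   delay_ranges = [(0, 1000), (1000, 20000)]

   def summarize(components, delay_ranges):
       """Summarize the multipath as one TE link: total capacity,
       plus capacity available within each configured delay range
       (the information a sub-TLV of the FA advertisement could
       carry)."""
       advert = {"total_capacity":
                     sum(c["capacity"] for c in components),
                 "per_delay_range": defaultdict(int)}
       for c in components:
           for lo, hi in delay_ranges:
               if lo <= c["delay"] < hi:
                   advert["per_delay_range"][(lo, hi)] += c["capacity"]
       return advert

   advertisement = summarize(components, delay_ranges)
   # -> 240 Gb/s total: 200 Gb/s below 1 ms delay, 40 Gb/s above.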
5.  Advanced Multipath Use Cases

The following subsections provide some uses of the Advanced Multipath
extensions.  These are not the only uses, simply a set of examples.

5.1.  Delay Sensitive Applications

Most applications benefit from lower delay.  Some types of
applications are far more sensitive than others.  For example, real
time bidirectional applications such as voice communication or two-way
video conferencing are far more sensitive to delay than unidirectional
streaming audio or video.  Non-interactive bulk transfer is almost
insensitive to delay if a large enough TCP window is used.

Some applications are sensitive to delay, but users of those
applications are unwilling to pay extra to ensure lower delay.  For
example, many SIP end users are willing to accept the delay offered to
best effort services as long as call quality is good most of the time.

Other applications are sensitive to delay, and their users are willing
to pay extra to ensure lower delay.  For example, financial trading
applications are extremely sensitive to delay, and with a lot at
stake, their operators are willing to go to great lengths to reduce
delay.

Among the requirements of Advanced Multipath are requirements to
support non-homogeneous links.  One solution in support of lower delay
links is to advertise the capacity available within configured ranges
of delay within a given multipath, and then support the ability to
place an LSP only on component links that meet that LSP's delay
requirements (a placement check of this kind is sketched at the end of
this subsection).

The Advanced Multipath requirements to accommodate delay sensitive
applications are analogous to Diffserv requirements to accommodate
applications requiring higher quality of service on the same
infrastructure as applications with less demanding requirements.  The
ability to share capacity with less demanding applications, with best
effort applications generally being the least demanding, can greatly
reduce the cost of delivering service to the more demanding
applications.
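As a minimal sketch of the placement behavior described above,
assuming per-component delay and unreserved capacity are known (for
example, from the kind of advertisement sketched in Section 4), an
ingress could filter candidate component links as follows.  The names
and structures are illustrative only.

   def eligible_components(components, lsp_bw, delay_bound_us):
       """Return component links able to carry the LSP: each must
       meet the LSP's delay requirement and have enough unreserved
       capacity.  A None delay bound means the LSP is
       delay-insensitive and may use any component with sufficient
       capacity."""
       out = []
       for c in components:
           if delay_bound_us is not None and c["delay"] > delay_bound_us:
               continue  # component too slow for this LSP
           if c["unreserved"] < lsp_bw:
               continue  # not enough capacity left
           out.append(c)
       return out

   components = [
       {"name": "Component Link 1", "delay": 300,  "unreserved": 60},
       {"name": "Component Link 2", "delay": 350,  "unreserved": 10},
       {"name": "Component Link 3", "delay": 9000, "unreserved": 40},
   ]

   # A 20 Gb/s LSP requiring no more than 1 ms of delay can only be
   # placed on Component Link 1; a delay-insensitive 20 Gb/s LSP
   # could also use Component Link 3.
   low_delay = eligible_components(components, 20, 1000)
   any_delay = eligible_components(components, 20, None)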
5.2.  Large Volume of IP and LDP Traffic

IP and LDP do not support traffic engineering.  Both make use of a
shortest (lowest routing metric) path, with an option to use equal
cost multipath (ECMP).  Note that though ECMP is prohibited in LDP
specifications, it is widely implemented.  Where implemented for LDP,
ECMP is generally disabled by default for standards compliance, but
often enabled in LDP deployments.

Without a traffic engineering capability, there must be sufficient
capacity to accommodate the IP and LDP traffic.  If there is not,
persistent queuing delay and loss will occur.  Unlike RSVP-TE, a
subset of the traffic cannot be routed using constraint-based routing
to avoid a congested portion of the infrastructure.

In existing networks which accommodate IP and/or LDP along with
RSVP-TE, either the IP and LDP traffic can be carried over RSVP-TE,
or, where the traffic contribution of IP and LDP is small, IP and LDP
can be carried native and the effect on RSVP-TE can be ignored.
Ignoring the traffic contribution of IP is valid on high capacity
networks where a very low volume of native IP is used primarily for
control and network management and customer IP is carried within
RSVP-TE.

Where it is desirable to carry native IP and/or LDP and the IP and/or
LDP traffic volumes are not negligible, RSVP-TE needs improvement.  An
enhancement offered by Advanced Multipath is the ability to measure
the IP and LDP traffic, filter the measurements, and reduce the
capacity available to RSVP-TE to avoid congestion.  The treatment
given to the IP or LDP traffic is similar to the treatment when using
the "auto-bandwidth" feature available in some RSVP-TE implementations
on that same traffic and giving a higher priority (numerically lower
setup priority and holding priority value) to the "auto-bandwidth"
LSP.  The difference is that the measurement is made at each hop and
the reduction in advertised bandwidth is made more directly.
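A minimal sketch of the per-hop behavior described above follows,
assuming a link's native IP and LDP traffic can be measured
periodically.  The low-pass filter form and its parameter are
illustrative assumptions, not a specified algorithm.

   class AdvertisedBandwidth:
       """Reduce the RSVP-TE advertised "Available Bandwidth" on a
       link by a filtered measurement of the native IP and LDP
       traffic carried on that link, as described above."""

       def __init__(self, link_capacity, alpha=0.1):
           self.link_capacity = link_capacity
           self.alpha = alpha      # small gain tracks long term
           self.filtered = 0.0     # trends, ignoring short bursts

       def update(self, measured_ip_ldp):
           # Exponentially weighted moving average of the measured
           # native IP/LDP traffic.
           self.filtered += self.alpha * (measured_ip_ldp - self.filtered)
           return self.available()

       def available(self):
           # Capacity left for RSVP-TE after subtracting the
           # filtered native traffic (never negative).
           return max(0.0, self.link_capacity - self.filtered)

   # Example: a 100 Gb/s link observing roughly 20 Gb/s of native
   # IP/LDP traffic converges toward advertising about 80 Gb/s.
   adv = AdvertisedBandwidth(100.0)
   for sample in [18.0, 22.0, 21.0, 19.0, 20.0]:
       adv.update(sample)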
5.3.  Multipath and Packet Ordering

A strong motivation for multipath is the need to provide LSP capacity
in IP backbones that exceeds the capacity of single wavelengths
provided by transport equipment and exceeds the practical capacity
limits achievable through inverse multiplexing.  Appendix C describes
characteristics and limitations of transport systems today.  Section 3
defines the terms "classic multipath" and "classic link bundling" used
in this section.

For the purpose of discussion, consider two very large cities, city A
and city Z.  For example, in the US, high traffic cities might be New
York and Los Angeles; in Europe, high traffic cities might be London
and Amsterdam.  Two other high volume cities, city B and city Y, may
share common provider core network infrastructure.  Using the same
examples, cities B and Y may be Washington, DC and San Francisco, or
Paris and Stockholm.  In the US, the common infrastructure may span
Denver, Chicago, Detroit, and Cleveland.  Other major traffic
contributors on either US coast include Boston and northern Virginia
on the east coast, and Seattle and San Diego on the west coast.  The
capacities of IP/MPLS links within the shared infrastructure, for
example the city-to-city links in the Denver, Chicago, Detroit, and
Cleveland path in the US example, have for most of the 2000s decade
greatly exceeded single circuits available in transport networks.

For a case with four large traffic sources on either side of the
shared infrastructure, up to sixteen core-city to core-city traffic
flows in excess of transport circuit capacity may be accommodated on
the shared infrastructure.

Today the most common IP/MPLS core network design makes use of very
large links which consist of many smaller component links, but uses
classic multipath techniques.  A component link typically corresponds
to the largest circuit that the transport system is capable of
providing (or the largest cost effective circuit).  IP source and
destination address hashing is used to distribute flows across the set
of component links as described in Appendix B.3.

Classic multipath can handle large LSP up to the total capacity of the
multipath (within limits, see Appendix B.2).  A disadvantage of
classic multipath is the reordering of traffic within a given core
city to core city LSP.  While there is no reordering within any
microflow, and therefore no customer visible issue, MPLS-TP cannot be
used across an infrastructure where classic multipath is in use,
except within pseudowires.

Capacity issues force the use of classic multipath today.  Classic
multipath excludes a direct use of MPLS-TP.  The desire for OAM,
offered by MPLS-TP, is therefore in conflict with the use of classic
multipath.  There are a number of alternatives that satisfy both
requirements.  Some alternatives are described below.

MPLS-TP in network edges only

   A simple approach which requires no change to the core is to
   disallow MPLS-TP across the core unless carried within a pseudowire
   (PW).  MPLS-TP may be used within edge domains where classic
   multipath is not used.  PW may be signaled end to end using single
   segment PW (SS-PW), or stitched across domains using multisegment
   PW (MS-PW).  The PW and anything carried within the PW may use OAM
   as long as fat-PW [RFC6391] load splitting is not used by the PW.

Advanced Multipath at core LSP ingress/egress

   The interior of the core network may use classic link bundling,
   with the limitation that no LSP can exceed the capacity of a single
   circuit.  Larger non-MPLS-TP LSP can be configured using multiple
   ingress-to-egress component MPLS-TP LSP.  This can be accomplished
   using existing IP source and destination address hashing configured
   at the LSP ingress and egress.  Each component LSP, if constrained
   to be no larger than the capacity of a single circuit, can make use
   of MPLS-TP and offer OAM for all top level LSP across the core.

MPLS-TP as an MPLS client

   A third approach involves making use of Entropy Labels [RFC6790] on
   all MPLS-TP LSP such that the entire MPLS-TP LSP is treated as a
   microflow by midpoint LSR, even if further encapsulated in very
   large server layer MPLS LSP.

The above list of alternatives allows packet ordering within an LSP to
be maintained in some circumstances while allowing very large LSP
capacities.  Each of these alternatives is discussed further in the
following subsections.

5.3.1.  MPLS-TP in network edges only

Classic MPLS link bundling is defined in [RFC4201] and has existed
since early in the 2000s decade.  Classic MPLS link bundling places
any given LSP entirely on a single component link.  Classic MPLS link
bundling is not in widespread use as the means to accommodate large
link capacities in core networks, due to the simplicity, better
multiplexing gain, and therefore lower network cost of classic
multipath.

If MPLS-TP OAM capability in the IP/MPLS network core LSP is not
required, then there is no need to change existing network designs
which use classic multipath and both label stack and IP source and
destination address based hashing as a basis for load splitting.

If MPLS-TP is needed for a subset of LSP, then those LSP can be
carried within pseudowires.  The pseudowire adds a thin layer of
encapsulation and therefore a small overhead.  If only a subset of LSP
need MPLS-TP OAM, then some LSP must make use of the pseudowires while
other LSP avoid them.  A straightforward way to accomplish this is
with administrative attributes [RFC3209].

5.3.2.  Multipath at core LSP ingress/egress

Multipath can be configured for large LSP that are made up of smaller
MPLS-TP component LSP.  Some implementations already support this
capability, though until Advanced Multipath no IETF document required
it.  This approach is capable of supporting MPLS-TP OAM over the
entire set of component link LSP, and therefore over the entire set of
top level LSP traversing the core.

There are two primary disadvantages to this approach.  One is that the
number of top level LSP traversing the core can be dramatically
increased.  The other disadvantage is the loss of multiplexing gain
that results from the use of classic link bundling within the interior
of the core network.

If component LSP use MPLS-TP, then no component LSP can exceed the
capacity of a single circuit.  For a given multipath LSP there can
either be a number of equal capacity component LSP, or some number of
full capacity component LSP plus one LSP carrying the excess.  For
example, a 350 Gb/s multipath LSP over a 100 Gb/s infrastructure may
use five 70 Gb/s component LSP, or three 100 Gb/s LSP plus one
50 Gb/s LSP.  Classic MPLS link bundling is needed to support MPLS-TP
and suffers from a bin packing problem even if LSP traffic is
completely predictable, which it never is in practice.

The common means of setting very large LSP bandwidth parameters uses
long term statistical measures.  For example, at one time many
providers based their LSP bandwidth parameters on the 95th percentile
of carried traffic as measured over the prior one week period.  It is
common to add 10-30% to the 95th percentile value measured over the
prior week and adjust the bandwidth parameters of LSP weekly.  It is
also possible to measure traffic flow at the LSR and adjust bandwidth
parameters somewhat more dynamically.  This is less common in
deployments, and where deployed makes use of filtering to track very
long term trends in traffic levels.  In either case, short term
variations of traffic levels relative to signaled LSP capacity are
common.  Allowing a large over-allocation of LSP bandwidth parameters
(i.e., adding 30% or more) avoids over-utilization of any given LSP,
but increases unused network capacity and increases network cost.
Allowing a small over-allocation of LSP bandwidth parameters (i.e.,
10-20% or less) results in both under-utilization and
over-utilization, but statistically results in a total utilization
within the core that is under capacity most or all of the time.
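A worked sketch of the bandwidth-parameter computation described
above, assuming a week of per-interval traffic samples is available.
The percentile, headroom, and sample values are the example figures
from the text, not a recommendation.

   def lsp_bandwidth_parameter(samples, percentile=95, headroom=0.20):
       """Compute an LSP bandwidth parameter from measured traffic:
       take the given percentile of the samples over the measurement
       period (e.g., one week) and add headroom (e.g., 10-30%)."""
       ordered = sorted(samples)
       # Index of the sample at or below which `percentile` percent
       # of the measurements fall.
       idx = min(len(ordered) - 1, int(len(ordered) * percentile / 100))
       return ordered[idx] * (1.0 + headroom)

   # Example: 5-minute samples (Gb/s) over a week would give 2016
   # values; a short list is used here for illustration.
   week = [61.0, 70.5, 66.2, 80.1, 74.9, 69.3, 77.8, 64.0]
   signaled = lsp_bandwidth_parameter(week, percentile=95,
                                      headroom=0.20)
   # The 95th percentile of these samples is 80.1 Gb/s, so the LSP
   # bandwidth parameter would be set to about 96 Gb/s.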
The classic multipath solution accommodates the situation in which
some very large LSP are under-utilizing their signaled capacity and
others are over-utilizing their signaled capacity, with the need for
far less unused network capacity to accommodate the variation in
actual traffic levels.  If the actual traffic levels of LSP can be
described by a probability distribution, the variation of the sum of
the LSP is less than the variation of any given LSP for all but a
constant traffic level (where the variation of the sum and the
variation of the components are both zero).

Splitting very large LSP at the ingress, carrying those large LSP
within smaller MPLS-TP component LSP, and then using classic link
bundling to carry the MPLS-TP LSP is a viable approach.  However, this
approach loses the statistical gain discussed in the prior paragraph.
Losing this statistical gain drives up the network cost necessary to
achieve the same very low probability of only mild congestion that is
expected of provider networks.

There are two situations which can motivate the use of this approach.
This design is favored if the provider values MPLS-TP OAM across the
core more than efficiency (or is unaware of the efficiency issue).
This design can also make sense if transport equipment or very low
cost core LSR are available which support only classic link bundling
and, regardless of the loss of multiplexing gain, are more cost
effective at carrying transit traffic than equipment which supports IP
source and destination address hashing.

5.3.3.  MPLS-TP as an MPLS client

Accommodating MPLS-TP as an MPLS client requires the small change to
forwarding behavior necessary to support [RFC6790] and is therefore
most applicable to major network overbuilds or new deployments.  This
approach is described in [I-D.ietf-mpls-multipath-use] and makes use
of Entropy Labels [RFC6790] to prevent reordering of MPLS-TP LSP or
any other LSP which requires that its traffic not be reordered, for
OAM or other reasons.

The advantage of this approach is the ability to accommodate MPLS-TP
as a client LSP while retaining the high multiplexing gain, and
therefore the efficiency and low network cost, of a pure MPLS
deployment.  The disadvantage is the need for a small change in
forwarding to support [RFC6790].
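The following sketch illustrates the load balancing behavior this
approach relies on, loosely following [RFC6790] and
[I-D.ietf-mpls-multipath-use]: a midpoint LSR hashes only labels down
to and including the entropy label that follows an Entropy Label
Indicator (ELI, label 7), and does not inspect below it, so an MPLS-TP
client LSP carried with a single fixed entropy label stays on one
component link.  The stack layout and hash are simplified assumptions.

   import zlib

   ELI = 7  # Entropy Label Indicator, per RFC 6790

   def midpoint_hash_key(label_stack):
       """Collect the labels a midpoint LSR hashes on: everything
       down to and including the entropy label (EL) following an
       ELI.  Nothing below the ELI/EL pair (such as an MPLS-TP
       client LSP or its OAM) is inspected."""
       key = []
       labels = iter(label_stack)
       for label in labels:
           key.append(label)
           if label == ELI:
               key.append(next(labels))  # the EL itself
               break
       return tuple(key)

   def select(label_stack, components):
       key = midpoint_hash_key(label_stack)
       bucket = zlib.crc32(bytes(b for label in key
                                 for b in label.to_bytes(3, "big")))
       return components[bucket % len(components)]

   # A server layer LSP (label 16001) carrying an MPLS-TP client
   # whose ingress inserted ELI plus one fixed EL (here 4242): every
   # packet of the client LSP hashes identically, preserving order.
   components = ["link-1", "link-2", "link-3"]
   stack = [16001, ELI, 4242, 13001]  # 13001: MPLS-TP client label
   assert select(stack, components) == select(stack, components)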
6.  IANA Considerations

This memo includes no request to IANA.

7.  Security Considerations

This document is a use cases document.  Existing protocols, such as
MPLS, are referenced.  Existing techniques, such as MPLS link bundling
and multipath techniques, are referenced.  These protocols and
techniques are documented elsewhere and contain security
considerations which are unchanged by this document.

This document also describes use cases for multipath and Advanced
Multipath.  Advanced Multipath requirements are defined in
[I-D.ietf-rtgwg-cl-requirement].  [I-D.ietf-rtgwg-cl-framework]
defines a framework for Advanced Multipath.  Advanced Multipath bears
many similarities to MPLS link bundling and the multipath techniques
used with MPLS.  Additional security considerations, if any, beyond
those already identified for MPLS, MPLS link bundling, and multipath
techniques will be documented in the framework document if specific to
the overall framework of Advanced Multipath, or in protocol extensions
if specific to a given protocol extension defined later to support
Advanced Multipath.

8.  Acknowledgments

In the interest of full disclosure of affiliation and in the interest
of acknowledging sponsorship, past affiliations of authors are noted.
Much of the work done by Ning So occurred while Ning was at Verizon.
Much of the work done by Curtis Villamizar occurred while at Infinera.
Much of the work done by Andy Malis occurred while Andy was at
Verizon.

9.  Informative References

[I-D.ietf-mpls-multipath-use]
           Villamizar, C., "Use of Multipath with MPLS-TP and MPLS",
           draft-ietf-mpls-multipath-use-00 (work in progress),
           February 2013.

[I-D.ietf-rtgwg-cl-framework]
           Ning, S., McDysan, D., Osborne, E., Yong, L., and C.
           Villamizar, "Composite Link Framework in Multi Protocol
           Label Switching (MPLS)", draft-ietf-rtgwg-cl-framework-03
           (work in progress), June 2013.

[I-D.ietf-rtgwg-cl-requirement]
           Villamizar, C., McDysan, D., Ning, S., Malis, A., and L.
           Yong, "Requirements for Advanced Multipath in MPLS
           Networks", draft-ietf-rtgwg-cl-requirement-11 (work in
           progress), July 2013.

[IEEE-802.1AX]
           IEEE Standards Association, "IEEE Std 802.1AX-2008, IEEE
           Standard for Local and Metropolitan Area Networks - Link
           Aggregation", 2008.

[ITU-T.G.694.1]
           ITU-T, "Spectral grids for WDM applications: DWDM frequency
           grid", 2012.

[RFC1717]  Sklower, K., Lloyd, B., McGregor, G., and D. Carr, "The PPP
           Multilink Protocol (MP)", RFC 1717, November 1994.

[RFC2474]  Nichols, K., Blake, S., Baker, F., and D. Black,
           "Definition of the Differentiated Services Field (DS Field)
           in the IPv4 and IPv6 Headers", RFC 2474, December 1998.

[RFC2475]  Blake, S., Black, D., Carlson, M., Davies, E., Wang, Z.,
           and W. Weiss, "An Architecture for Differentiated
           Services", RFC 2475, December 1998.

[RFC2597]  Heinanen, J., Baker, F., Weiss, W., and J. Wroclawski,
           "Assured Forwarding PHB Group", RFC 2597, June 1999.

[RFC2615]  Malis, A. and W. Simpson, "PPP over SONET/SDH", RFC 2615,
           June 1999.

[RFC2991]  Thaler, D. and C. Hopps, "Multipath Issues in Unicast and
           Multicast Next-Hop Selection", RFC 2991, November 2000.

[RFC2992]  Hopps, C., "Analysis of an Equal-Cost Multi-Path
           Algorithm", RFC 2992, November 2000.

[RFC3031]  Rosen, E., Viswanathan, A., and R. Callon, "Multiprotocol
           Label Switching Architecture", RFC 3031, January 2001.

[RFC3032]  Rosen, E., Tappan, D., Fedorkow, G., Rekhter, Y.,
           Farinacci, D., Li, T., and A. Conta, "MPLS Label Stack
           Encoding", RFC 3032, January 2001.

[RFC3209]  Awduche, D., Berger, L., Gan, D., Li, T., Srinivasan, V.,
           and G. Swallow, "RSVP-TE: Extensions to RSVP for LSP
           Tunnels", RFC 3209, December 2001.

[RFC3260]  Grossman, D., "New Terminology and Clarifications for
           Diffserv", RFC 3260, April 2002.

[RFC3270]  Le Faucheur, F., Wu, L., Davie, B., Davari, S., Vaananen,
           P., Krishnan, R., Cheval, P., and J. Heinanen, "Multi-
           Protocol Label Switching (MPLS) Support of Differentiated
           Services", RFC 3270, May 2002.

[RFC3630]  Katz, D., Kompella, K., and D. Yeung, "Traffic Engineering
           (TE) Extensions to OSPF Version 2", RFC 3630, September
           2003.

[RFC3809]  Nagarajan, A., "Generic Requirements for Provider
           Provisioned Virtual Private Networks (PPVPN)", RFC 3809,
           June 2004.

[RFC3945]  Mannie, E., "Generalized Multi-Protocol Label Switching
           (GMPLS) Architecture", RFC 3945, October 2004.

[RFC3985]  Bryant, S. and P. Pate, "Pseudo Wire Emulation Edge-to-Edge
           (PWE3) Architecture", RFC 3985, March 2005.

[RFC4031]  Carugi, M. and D. McDysan, "Service Requirements for Layer
           3 Provider Provisioned Virtual Private Networks (PPVPNs)",
           RFC 4031, April 2005.

[RFC4124]  Le Faucheur, F., "Protocol Extensions for Support of
           Diffserv-aware MPLS Traffic Engineering", RFC 4124, June
           2005.

[RFC4201]  Kompella, K., Rekhter, Y., and L. Berger, "Link Bundling in
           MPLS Traffic Engineering (TE)", RFC 4201, October 2005.

[RFC4385]  Bryant, S., Swallow, G., Martini, L., and D. McPherson,
           "Pseudowire Emulation Edge-to-Edge (PWE3) Control Word for
           Use over an MPLS PSN", RFC 4385, February 2006.

[RFC4928]  Swallow, G., Bryant, S., and L. Andersson, "Avoiding Equal
           Cost Multipath Treatment in MPLS Networks", BCP 128, RFC
           4928, June 2007.

[RFC5036]  Andersson, L., Minei, I., and B. Thomas, "LDP
           Specification", RFC 5036, October 2007.

[RFC5305]  Li, T. and H. Smit, "IS-IS Extensions for Traffic
           Engineering", RFC 5305, October 2008.

[RFC5586]  Bocci, M., Vigoureux, M., and S. Bryant, "MPLS Generic
           Associated Channel", RFC 5586, June 2009.

[RFC5921]  Bocci, M., Bryant, S., Frost, D., Levrau, L., and L.
           Berger, "A Framework for MPLS in Transport Networks", RFC
           5921, July 2010.

[RFC6391]  Bryant, S., Filsfils, C., Drafz, U., Kompella, V., Regan,
           J., and S. Amante, "Flow-Aware Transport of Pseudowires
           over an MPLS Packet Switched Network", RFC 6391, November
           2011.

[RFC6790]  Kompella, K., Drake, J., Amante, S., Henderickx, W., and L.
           Yong, "The Use of Entropy Labels in MPLS Forwarding", RFC
           6790, November 2012.

Appendix A.  Network Operator Practices and Protocol Usage

Often, network operators have a contractual Service Level Agreement
(SLA) with customers for services that is comprised of numerical
values for performance measures, principally availability, latency,
and delay variation.  Additionally, network operators may have
performance objectives for internal use by the operator.  See
Section 4.9 of [RFC3809] for examples of the form of such SLA and
performance objective specifications.  In this document we use the
term Performance Objective as defined in
[I-D.ietf-rtgwg-cl-requirement].  Applications and acceptable user
experience have an important relationship to these performance
parameters.

Consider latency as an example.  In some cases, minimizing latency
relates directly to the best customer experience (for example, in
interactive applications, closer is faster).  In other cases, user
experience is relatively insensitive to latency up to a specific
limit, at which point user perception of quality degrades
significantly (e.g., interactive human voice and multimedia
conferencing).  A number of Performance Objectives have a bound on
point-to-point latency, and as long as this bound is met, the
Performance Objective is met; decreasing the latency further is not
necessary.
In some Performance Objectives, if the specified latency is not met,
the user considers the service to be unavailable.  An unprotected LSP
can be manually provisioned on a set of links to meet this type of
Performance Objective, but this lowers availability, since an
alternate route that meets the latency Performance Objective cannot be
determined.

Historically, when an IP/MPLS network was operated over a lower layer
circuit switched network (e.g., SONET rings), a change in latency
caused by the lower layer network (e.g., due to a maintenance action
or failure) was not known to the MPLS network.  This resulted in
latency affecting end user experience, sometimes violating Performance
Objectives or resulting in user complaints.

A response to this problem was to provision IP/MPLS networks over
unprotected circuits and set the metric and/or TE-metric proportional
to latency.  This resulted in traffic being directed over the least
latency path, even when this was not needed to meet a Performance
Objective or to meet user experience objectives.  This results in
reduced flexibility and increased cost for network operators.  Some
providers prefer to use lower layer networks to provide restoration
and grooming, but the inability to communicate performance parameters,
in particular latency, from the lower layer network to the higher
layer network is an important problem to be solved before this can be
done.

Latency Performance Objectives for point-to-point services are often
tied closely to geographic locations, while latency for multipoint
services may be based upon a worst case within a region.

The time frames for restoration (i.e., as implemented by predetermined
protection, convergence of routing protocols, and/or signaling) for
services range from on the order of 100 ms or less (e.g., for VPWS to
emulate classical SDH/SONET protection switching) to several minutes
(e.g., to allow BGP to reconverge for L3VPN), and may differ among the
set of customers within a single service.

The presence of only three Traffic Class (TC) bits (previously known
as EXP bits) in the MPLS shim header is limiting when a network
operator needs to support QoS classes for multiple services (e.g.,
L2VPN VPWS, VPLS, L3VPN, and Internet), each of which has a set of QoS
classes that need to be supported, and where the operator prefers to
use only E-LSP [RFC3270].  In some cases one bit is used to indicate
conformance to some ingress traffic classification, leaving only two
bits for indicating the service QoS classes.  One approach that has
been taken is to aggregate these QoS classes into similar sets on
LER-LSR and LSR-LSR links and continue to use only E-LSP (a mapping of
this kind is sketched below).  Another approach is to use L-LSP as
defined in [RFC3270], or to use the Class-Type defined in [RFC4124],
to support up to eight mappings of TC into Per-Hop Behavior (PHB).

The IP DSCP cannot be used for flow identification.  The use of the IP
DSCP for flow identification is incompatible with Assured Forwarding
services [RFC2597] or any other service which may use more than one
DSCP code point to carry traffic for a given microflow.  In general,
network operators do not rely on the DSCP of Internet packets in core
networks, but must preserve DSCP values for use closer to network
edges.
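To make the E-LSP aggregation approach above concrete, here is a small
illustrative mapping.  The service classes, the aggregation choices,
and the TC values are hypothetical examples, not recommendations.

   # Hypothetical per-service QoS classes: more combinations than
   # three TC bits (8 values, or 4 if one bit marks conformance)
   # can express directly.
   service_classes = [
       ("L3VPN", "voice"), ("L3VPN", "video"), ("L3VPN", "data"),
       ("VPWS", "high"),   ("VPWS", "low"),
       ("VPLS", "high"),   ("VPLS", "low"),
       ("Internet", "best-effort"),
   ]

   # Aggregate similar classes so they share a TC value (and hence a
   # PHB) on LER-LSR and LSR-LSR links, continuing to use only E-LSP.
   tc_of_aggregate = {"realtime": 5, "priority": 3, "default": 0}

   aggregate_of = {
       ("L3VPN", "voice"): "realtime", ("L3VPN", "video"): "realtime",
       ("L3VPN", "data"): "priority",  ("VPWS", "high"): "priority",
       ("VPLS", "high"): "priority",   ("VPWS", "low"): "default",
       ("VPLS", "low"): "default",
       ("Internet", "best-effort"): "default",
   }

   def tc_bits(service, qos_class):
       """Return the 3-bit TC value carried in the MPLS shim
       header for this service's QoS class."""
       return tc_of_aggregate[aggregate_of[(service, qos_class)]]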
A label is pushed onto Internet packets when they are carried along
with L2VPN or L3VPN packets on the same link or lower layer network;
this provides a means to distinguish between the QoS classes for these
packets.

Operating an MPLS-TE network involves a different paradigm from
operating an IGP metric-based LDP signaled MPLS network.  The
multipoint-to-point LDP signaled MPLS LSPs occur automatically, and
balancing across parallel links occurs if the IGP metrics are set
"equally" (with equality a locally definable relation) and if ECMP is
enabled for LDP, which network operators generally do in large
networks.

Traffic is typically comprised of a number of large (some very large)
flows and a much larger number of small flows.  In some cases,
separate LSPs are established for very large flows.  Very large
microflows can occur even if the IP header information is inspected by
an LSR.  For example, an IPsec tunnel that carries a large amount of
traffic must be carried as a single large flow.  An important example
of large flows is that of an L2VPN or L3VPN customer who has an access
line bandwidth comparable to a client-client component link bandwidth
-- there could be flows that are on the order of the access line
bandwidth.

Appendix B.  Existing Multipath Standards and Techniques

Today the requirement to handle large aggregations of traffic, much
larger than a single component link, can be handled by a number of
techniques which we will collectively call multipath.  Multipath
applied to parallel links between the same set of nodes includes
Ethernet Link Aggregation [IEEE-802.1AX], link bundling [RFC4201], or
other aggregation techniques, some of which may be vendor specific.
Multipath applied to diverse paths rather than parallel links includes
Equal Cost MultiPath (ECMP) as applied to OSPF, ISIS, LDP, or even
BGP, and equal cost LSP, as described in Appendix B.4.  The various
multipath techniques have strengths and weaknesses.

Existing multipath techniques solve the problem of large aggregations
of traffic, without addressing the other requirements outlined in this
document, particularly those described in Section 5.

B.1.  Common Multipath Load Splitting Techniques

Identical load balancing techniques are used for multipath both over
parallel links and over diverse paths.

Large aggregates of IP traffic do not provide explicit signaling to
indicate the expected traffic loads.  Large aggregates of MPLS traffic
are carried in MPLS tunnels supported by MPLS LSP.  LSP which are
signaled using RSVP-TE extensions do provide explicit signaling which
includes the expected traffic load for the aggregate.  LSP which are
signaled using LDP do not provide an expected traffic load.

MPLS LSP may contain other MPLS LSP arranged hierarchically.  When an
MPLS LSR serves as a midpoint LSR in an LSP carrying client LSP as
payload, there is no signaling associated with these client LSP.
Therefore, even when using RSVP-TE signaling, there may be
insufficient information provided by signaling to adequately
distribute load based solely on signaling.

Generally a set of label stack entries that is unique across the
ordered set of label numbers in the label stack can safely be assumed
to contain a group of flows.  The reordering of traffic can therefore
be considered to be acceptable unless reordering occurs within traffic
containing a common unique set of label stack entries.  Existing load
splitting techniques take advantage of this property, in addition to
looking beyond the bottom of the label stack and determining whether
the payload is IPv4 or IPv6, to load balance traffic accordingly.
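The following sketch shows the sort of heuristic payload inspection
described above, using the first nibble below the bottom of stack to
guess at an IPv4 or IPv6 payload as discussed in [RFC4385] and
[RFC4928].  The parsing is simplified and illustrative.

   BOTTOM_OF_STACK = 0x0100  # S bit in a 32-bit label stack entry

   def hash_fields(packet):
       """Walk the label stack, then peek at the first nibble below
       the bottom of stack: 4 suggests IPv4 and 6 suggests IPv6, in
       which case IP addresses are added to the hash key.  A PW
       control word [RFC4385] begins with 0000 precisely so it is
       not mistaken for an IP header by this heuristic."""
       labels, offset = [], 0
       while True:
           entry = int.from_bytes(packet[offset:offset + 4], "big")
           labels.append(entry >> 12)    # 20-bit label value
           offset += 4
           if entry & BOTTOM_OF_STACK:
               break
       nibble = packet[offset] >> 4
       if nibble == 4:                   # looks like IPv4
           return labels, packet[offset + 12:offset + 20]  # src+dst
       if nibble == 6:                   # looks like IPv6
           return labels, packet[offset + 8:offset + 40]   # src+dst
       return labels, b""  # e.g., PW control word: labels only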
MPLS-TP OAM violates the assumption that it is safe to reorder traffic
within an LSP.  If MPLS-TP OAM is to be accommodated, then existing
multipath techniques must be modified.  [RFC6790] and
[I-D.ietf-mpls-multipath-use] provide a solution, but require a small
forwarding change.

For example, a large aggregate of IP traffic may be subdivided into a
large number of groups of flows using a hash on the IP source and
destination addresses.  This is as described in [RFC2475] and
clarified in [RFC3260].  For MPLS traffic carrying IP, a similar hash
can be performed on the set of labels in the label stack.  These
techniques are both examples of means to subdivide traffic into groups
of flows for the purpose of load balancing traffic across aggregated
link capacity.  The means of identifying a group of flows should not
be confused with the definition of a flow.

Discussion of whether a hash based approach provides a sufficiently
even load balance using any particular hashing algorithm or method of
distributing traffic across a set of component links is outside of the
scope of this document.

The current load balancing techniques are referenced in [RFC4385] and
[RFC4928].  The use of three hash based approaches is described in
[RFC2991] and [RFC2992].  A mechanism to identify flows within a PW is
described in [RFC6391].  The use of hash based approaches is mentioned
as an example of an existing set of techniques to distribute traffic
over a set of component links.  Other techniques are not precluded.

B.2.  Static and Dynamic Load Balancing Multipath

Static multipath generally relies on the mathematical probability
that, given a very large number of small microflows, these microflows
will tend to be distributed evenly across a hash space.  Early static
multipath implementations assumed that all component links are of
equal capacity and performed a modulo operation across the hashed
value.  An alternate static multipath technique uses a table,
generally with a power of two size, and distributes the table entries
proportionally among component links according to the capacity of each
component link (a table of this kind is sketched at the end of this
subsection).

Static load balancing works well if there are a very large number of
small microflows (i.e., the microflow rate is much less than the
component link capacity).  However, the case where there are even a
few large microflows is not handled well by static load balancing.

A dynamic load balancing multipath technique is one where the traffic
bound to each component link is measured and the load split is
adjusted accordingly.  As long as the adjustment is done within a
single network element, no protocol extensions are required and there
are no interoperability issues.

Note that if the load balancing algorithm and/or its parameters are
adjusted, then packets in some flows may be briefly delivered out of
sequence; however, in practice such adjustments can be made very
infrequently.
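A minimal sketch of the table-based static technique described above,
under the assumption of a power-of-two table indexed by a flow-group
hash.  The proportional fill and the rebalancing note are simplified
illustrations.

   import zlib

   def build_table(components, table_bits=8):
       """Distribute 2**table_bits entries among component links in
       proportion to their capacities, as described above."""
       size = 1 << table_bits
       total = sum(capacity for _, capacity in components)
       table, filled = [], 0
       for i, (name, capacity) in enumerate(components):
           if i == len(components) - 1:
               share = size - filled  # last component absorbs rounding
           else:
               share = round(size * capacity / total)
           table.extend([name] * share)
           filled += share
       return table

   def select(table, flow_key: bytes):
       return table[zlib.crc32(flow_key) % len(table)]

   # Three components of 100, 100, and 40 Gb/s: the 40 Gb/s link
   # gets roughly 1/6 of the 256 table entries.
   table = build_table([("link-1", 100), ("link-2", 100),
                        ("link-3", 40)])

   # A dynamic variant would periodically measure per-component load
   # and move a few table entries from overloaded to underloaded
   # components, briefly reordering only the flow groups whose
   # entries moved.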
B.3.  Traffic Split over Parallel Links

The load splitting techniques defined in Appendix B.1 and Appendix B.2
are both used in splitting traffic over parallel links between the
same pair of nodes.  The best known technique, though far from being
the first, is Ethernet Link Aggregation [IEEE-802.1AX].  This same
technique had been applied much earlier using OSPF or ISIS Equal Cost
MultiPath (ECMP) over parallel links between the same nodes.
Multilink PPP [RFC1717] uses a technique that provides inverse
multiplexing; in addition, a number of vendors had provided
proprietary extensions to PPP over SONET/SDH [RFC2615] that predated
Ethernet Link Aggregation but are no longer used.

Link bundling [RFC4201] provides yet another means of handling
parallel LSP.  [RFC4201] explicitly allows a special value of all ones
to indicate a split across all members of the bundle.  This "all ones"
component link is signaled in the RSVP-TE Resv to indicate that the
link bundle is making use of classic multipath techniques.

B.4.  Traffic Split over Multiple Paths

OSPF or ISIS Equal Cost MultiPath (ECMP) is a well known form of
traffic split over multiple paths that may traverse intermediate
nodes.  ECMP is often incorrectly equated to only this case, and
multipath over multiple diverse paths is often incorrectly equated to
ECMP.

Many implementations are able to create more than one LSP between a
pair of nodes, where these LSP are routed diversely to better make use
of available capacity.  The load on these LSP can be distributed
proportionally to the reserved bandwidth of the LSP.  These multiple
LSP may be advertised as a single PSC FA, and any LSP making use of
the FA may be split over these multiple LSP.

Link bundling [RFC4201] component links may themselves be LSP.  When
this technique is used, any LSP which specifies the link bundle may be
split across the multiple paths of the component LSP that comprise the
bundle.

Appendix C.  Characteristics of Transport in Core Networks

The characteristics of primary interest are the capacity of a single
circuit and the use of wavelength division multiplexing (WDM) to
provide a large number of parallel circuits.

Wavelength division multiplexing (WDM) supports multiple independent
channels (independent ignoring crosstalk noise) at slightly different
wavelengths of light, multiplexed onto a single fiber.  Typical in the
early 2000s was 40 wavelengths of 10 Gb/s capacity per wavelength.
These wavelengths are in the C-band range, which is about
1530-1565 nm, though some work has been done using the L-band,
1565-1625 nm.

The C-band has been carved up using a 100 GHz spacing from 191.7 THz
to 196.1 THz by [ITU-T.G.694.1].  This yields 44 channels.  If the
outermost channels are not used, due to poorer transmission
characteristics, then typically 40 are used.  For practical reasons, a
50 GHz or 25 GHz spacing is used by more recent equipment, yielding 80
or 160 channels in practice.

The early optical modulation techniques used within a single channel
yielded 2.5 Gb/s and 10 Gb/s capacity per channel.  As modulation
techniques have improved, 40 Gb/s and 100 Gb/s per channel have been
achieved.

The 40 channels of 10 Gb/s common in the mid 2000s yield a total of
400 Gb/s.  Tighter spacing and better modulations are yielding up to
8 Tb/s or more in more recent systems.
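A small worked check of the channel and capacity arithmetic above; the
modulation mix in the final line is an illustrative assumption.

   # 100 GHz spacing across 191.7-196.1 THz (all units in GHz).
   spacings = (196100 - 191700) // 100        # 44 channel slots
   usable = 40                                # outermost slots unused

   early_2000s = usable * 10                  # 40 x 10 Gb/s = 400 Gb/s

   # 50 GHz spacing doubles the channel count; with 100 Gb/s
   # modulation an 80-channel system reaches the roughly 8 Tb/s
   # cited above.
   recent = 80 * 100                          # 8000 Gb/s = 8 Tb/s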
Over the optical modulation is an electrical encoding.  In the 1990s
this was typically Synchronous Optical Networking (SONET) or
Synchronous Digital Hierarchy (SDH), with a maximum defined circuit
capacity of 40 Gb/s (OC-768), though the 10 Gb/s OC-192 is more
common.  More recently, the low level electrical encoding has been
Optical Transport Network (OTN), defined by the ITU-T.  OTN currently
defines circuit capacities up to a nominal 100 Gb/s (ODU4).  Both
SONET/SDH and OTN make use of time division multiplexing (TDM), where
a higher capacity circuit, such as a 100 Gb/s ODU4 in OTN, may be
subdivided into lower fixed capacity circuits, such as ten 10 Gb/s
ODU2.

In the 1990s, all IP, and later IP/MPLS, networks either used a
fraction of the maximum circuit capacity or, at most, the full circuit
capacity toward the end of the decade, when the full circuit capacity
was 2.5 Gb/s or 10 Gb/s.  Beyond 2000, the TDM circuit multiplexing
capability of SONET/SDH or OTN was rarely used.

Early in the 2000s, both transport equipment and core LSR offered
40 Gb/s SONET OC-768.  However, 10 Gb/s transport equipment was
predominantly deployed throughout the decade, partially because LSR
10GbE ports were far more cost effective than either OC-192 or OC-768
ports, and 10GbE became practical in the second half of the decade.

Entering the 2010 decade, LSR 40GbE and 100GbE ports are expected to
become widely available and cost effective.  Slightly preceding this,
transport equipment making use of 40 Gb/s and 100 Gb/s modulations has
become available.  This transport equipment is capable of carrying
40 Gb/s ODU3 and 100 Gb/s ODU4 circuits.

Early in the 2000s decade, IP/MPLS core networks were making use of
single 10 Gb/s circuits.  Capacity grew quickly in the first half of
the decade, but most IP/MPLS core networks had only a small number of
IP/MPLS links requiring 4-8 parallel 10 Gb/s circuits.  However, the
use of multipath was necessary, was deemed the simplest and most cost
effective alternative, and became thoroughly entrenched.  By the end
of the 2000s decade, nearly all major IP/MPLS core service provider
networks, and a few content provider networks, had IP/MPLS links which
exceeded 100 Gb/s, long before 40GbE was available and 40 Gb/s
transport was in widespread use.

It is less clear when IP/MPLS LSP exceeded 10 Gb/s, 40 Gb/s, and
100 Gb/s.  By 2010, many service providers had LSP in excess of
100 Gb/s, but few are willing to disclose how many LSP have reached
this capacity.

By 2012, 40GbE and 100GbE LSR products had become available, but were
mostly still being evaluated or in trial use by service providers and
content providers.  The cost of the components required to deliver
100GbE products remained high, making these products less cost
effective.  This is expected to change within a few years.

The important point is that IP/MPLS core network links long ago
exceeded 100 Gb/s, some may have already exceeded 1 Tb/s, and a small
number of IP/MPLS LSP exceed 100 Gb/s.  By the time 100 Gb/s circuits
are widely deployed, many IP/MPLS core network links are likely to
exceed 1 Tb/s and many IP/MPLS LSP capacities are likely to exceed
100 Gb/s.  The growth in service provider traffic has consistently
outpaced the growth in DWDM channel capacities and the growth in
capacity of single interfaces, and it is expected to continue to do
so.  Therefore, multipath techniques are likely here to stay.

Authors' Addresses

So Ning
Tata Communications

Email: ning.so@tatacommunications.com


Andrew Malis
Consultant

Email: agmalis@gmail.com


Dave McDysan
Verizon
22001 Loudoun County PKWY
Ashburn, VA  20147
USA

Email: dave.mcdysan@verizon.com


Lucy Yong
Huawei USA
5340 Legacy Dr.
Plano, TX  75025
USA

Phone: +1 469-277-5837
Email: lucy.yong@huawei.com


Curtis Villamizar
Outer Cape Cod Network Consulting

Email: curtis@occnc.com