idnits 2.17.1 

draft-ietf-mpls-te-scaling-analysis-05.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** The document seems to lack a License Notice according IETF Trust
     Provisions of 28 Dec 2009, Section 6.b.i or Provisions of 12 Sep 2009
     Section 6.b -- however, there's a paragraph with a matching beginning.
     Boilerplate error?

  -- It seems you're using the 'non-IETF stream' Licence Notice instead


  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer. 
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- Couldn't find a document date in the document -- date freshness check
     skipped.


  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  == Missing Reference: '4X' is mentioned on line 879, but not defined


     Summary: 1 error (**), 0 flaws (~~), 2 warnings (==), 3 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

1	Network Working Group                                        S. Yasukawa
2	Internet-Draft                                                       NTT
3	Intended Status: Informational                                 A. Farrel
4	Created: December 14, 2008                            Old Dog Consulting
5	Expires: June 14, 2009                                       O. Komolafe
6	                                                           Cisco Systems

8	        An Analysis of Scaling Issues in MPLS-TE Core Networks

10	              draft-ietf-mpls-te-scaling-analysis-05.txt

12	Status of this Memo

14	   This Internet-Draft is submitted to IETF in full conformance with
15	   the provisions of BCP 78 and BCP 79.

17	   Internet-Drafts are working documents of the Internet Engineering
18	   Task Force (IETF), its areas, and its working groups.  Note that
19	   other groups may also distribute working documents as Internet-
20	   Drafts.

22	   Internet-Drafts are draft documents valid for a maximum of six months
23	   and may be updated, replaced, or obsoleted by other documents at any
24	   time.  It is inappropriate to use Internet-Drafts as reference
25	   material or to cite them other than as "work in progress."

27	   The list of current Internet-Drafts can be accessed at
28	   http://www.ietf.org/ietf/1id-abstracts.txt.

30	   The list of Internet-Draft Shadow Directories can be accessed at
31	   http://www.ietf.org/shadow.html.

33	Abstract

35	   Traffic engineered Multiprotocol Label Switching (MPLS-TE) is
36	   deployed in providers' core networks. As providers plan to grow these
37	   networks, they need to understand whether existing protocols and
38	   implementations can support the network sizes that they are planning.

40	   This document presents an analysis of some of the scaling concerns
41	   for the number of Label Switching Paths (LSPs) in MPLS-TE core
42	   networks, and examines the value of two techniques (LSP hierarchies,
43	   and multipoint-to-point LSPs) for improving scaling. The intention is
44	   to motivate the development of appropriate deployment techniques and
45	   protocol extensions to enable the application of MPLS-TE in large
46	   networks.

48	   This document only considers the question of achieving scalability
49	   for the support of point-to-point MPLS-TE LSPs. Point-to-multipoint
50	   MPLS-TE LSPs are for future study.

52	Contents
53	   1. Introduction ..................................................  3
54	   1.1 Overview .....................................................  3
55	   1.2 Glossary of Notation .........................................  4
56	   2. Issues of Concern for Scaling .................................  5
57	   2.1 LSP State ...................................................   5
58	   2.2 Processing Overhead ..........................................  5
59	   2.3 RSVP-TE Implications .........................................  6
60	   2.4 Management ...................................................  7
61	   3. Network Topologies ............................................  7
62	   3.1 The Snowflake Network Topology ...............................  8
63	   3.2 The Ladder Network Topology .................................. 10
64	   3.3 Commercial Drivers for Selected Configurations ............... 12
65	   3.4 Other Network Topologies ..................................... 13
66	   4. Required Network Sizes ........................................ 14
67	   4.1 Practical Numbers ............................................ 14
68	   5. Scaling in Flat Networks ...................................... 14
69	   5.1 Snowflake Networks ........................................... 15
70	   5.2 Ladder Networks .............................................. 16
71	   6. Scaling Snowflake Networks with Forwarding Adjacencies ........ 19
72	   6.1 Two Layer Hierarchy .......................................... 20
73	   6.1.1 Tuning the Network Topology to Suit the Two Layer Hierarchy  21
74	   6.2 Alternative Two Layer Hierarchy .............................. 21
75	   6.3 Three Layer Hierarchy ........................................ 22
76	   6.4 Issues with Hierarchical LSPs  ............................... 23
77	   7. Scaling Ladder Networks with Forwarding Adjacencies............ 24
78	   7.1 Two Layer Hierarchy .......................................... 24
79	   7.2 Three Layer Hierarchy ........................................ 25
80	   7.3 Issues with Hierarchical LSPs ................................ 26
81	   8. Scaling Improvements Through Multipoint-to-Point LSPs ......... 26
82	   8.1 Overview of MP2P LSPs ........................................ 27
83	   8.2 LSP State: A Better Measure of Scalability ................... 27
84	   8.3 Scaling Improvements for Snowflake Networks .................. 28
85	   8.3.1 Comparison with Other Scenarios ............................ 29
86	   8.4 Scaling Improvements for Ladder Networks ..................... 30
87	   8.4.1 Comparison with Other Scenarios ............................ 31
88	   8.4.2 LSP State Compared with LSP Numbers ........................ 33
89	   8.5 Issues with MP2P LSPs ........................................ 33
90	   9. Combined Models ............................................... 34
91	   10. An Alternate Solution ........................................ 35
92	   10.1 Pros and Cons of the Alternate Solution ..................... 36
93	   11. Management Considerations .................................... 37
94	   12. Security Considerations ...................................... 37
95	   13. Recommendations .............................................. 38
96	   14. IANA Considerations .......................................... 38
97	   15. Acknowledgements ............................................. 38
98	   16. Intellectual Property Consideration .......................... 39
99	   17. Normative References ......................................... 40
100	   18. Informative References ....................................... 40
101	   19. Authors' Addresses ........................................... 41

103	1. Introduction

105	   Network operators and service providers are examining scaling issues
106	   as they look to deploy ever-larger traffic engineered Multiprotocol
107	   Label Switching (MPLS-TE) networks. Concerns have been raised about
108	   the number of Label Switched Paths (LSPs) that need to be supported
109	   at the edge and at the core of the network. The impact on control
110	   plane and management plane resources threatens to outweigh the
111	   benefits and popularity of MPLS-TE, while the physical limitations of
112	   the routers may constrain the deployment options.

114	   Historically, it has been assumed that all MPLS-TE scaling issues
115	   can be addressed using hierarchical LSP [RFC4206]. However, analysis
116	   shows that the improvement gained by LSP hierarchies is not as
117	   significant in all topologies and at all points in the network as
118	   might have been presumed. Further, additional management issues are
119	   introduced to determine the end-points of the hierarchical LSPs and
120	   to operate them. Although this does not invalidate the benefits of
121	   LSP hierarchies, it does indicate that additional techniques may be
122	   desirable in order to fully scale MPLS-TE networks.

124	   This document examines the scaling properties of two generic MPLS-TE
125	   network topologies, and investigates the benefits of two scaling
126	   techniques.

128	1.1 Overview

130	   Physical topology scaling concerns are addressed by building networks
131	   that are not fully meshed. Network topologies tend to be meshed in
132	   the core, but tree-shaped at the edges giving rise to a snow-flake
133	   design. Alternatively, the core may be more of a ladder shape with
134	   tree-shaped edges.

136	   MPLS-TE, however, establishes a logical full mesh between all edge
137	   points in the network, and this is where the scaling problems arise
138	   since the structure of the network tends to focus a large number of
139	   LSPs within the core of the network.

141	   This document presents two generic network topologies: the snow-flake
142	   and the ladder, and attempts to parameterize the networks by making
143	   some generalities. It introduces terminology for the different
144	   scaling parameters and examines how many LSPs might be required to be
145	   carried within the core of a network.

147	   Two techniques (hierarchical LSPs, and multipoint-to-point LSPs) are
148	   introduced and an examination is made of the scaling benefits that
149	   they offer as well as of some of the concerns with using these
150	   techniques.

152	   Of necessity, this document makes many generalizations. Not least
153	   among these are a set of assumptions about the symmetry and
154	   connectivity of the physical network. It is hoped that these
155	   generalizations will not impinge on the usefulness of the overview of
156	   the scaling properties that this document attempts to give. Indeed,
157	   the symmetry of the example topologies tends to highlight the
158	   scaling issues of the different solution models, and this may be
159	   useful in exposing the worst case scenarios.

161	   Although protection mechanisms like Fast Re-route (FRR) [RFC4090] are
162	   briefly discussed, the main body of this document considers stable
163	   network cases. It should be noted that make-before-break
164	   re-optimisation after link failure may result in a significant number
165	   of 'duplicate' LSPs. This issue is not addressed in this document.

167	   It should also be understood that certain deployment models where
168	   separate traffic engineered LSPs are used to provide different
169	   services such as layer 3 VPNs [RFC4110] or pseudowires [RFC3985], or
170	   different classes of service [RFC3270], may result in 'duplicate' or
171	   'parallel' LSPs running between any pair of provider edge nodes
172	   (PEs). This scaling factor is also not considered in this document,
173	   but may be easily applied as a linear factor by the reader.

175	   The operation of security mechanisms in MPLS-TE networks [MPLS-SEC]
176	   may have an impact on the ability of the network to scale. For
177	   example, they may increase both the size and number of control plane
178	   messages. Additionally, they may increase the processing overhead as
179	   control plane messages are subject to processing algorithms (such as
180	   encryption), and security keys need to be managed. Deployers will
181	   need to consider the trade-offs between scaling objectives and
182	   security objectives in thier networks and should resist the
183	   temptation to respond to a degradation of scaling performance by
184	   turning off security techniques that have previously been deemed as
185	   necessary. Further analysis of the effects of security measures on
186	   scalability are not considered further in this document.

188	   This document is designed to help service providers discover whether
189	   existing protocols and implementations can support the network sizes
190	   that they are planning. To do this, it presents an analysis of some
191	   of the scaling concerns for MPLS-TE core networks, and examines the
192	   value of two techniques for improving scaling. This should motivate
193	   the development of appropriate deployment techniques and protocol
194	   extensions to enable the application of MPLS-TE in large networks.

196	   This document only considers the question of achieving scalability
197	   for the support of point-to-point MPLS-TE LSPs. Point-to-multipoint
198	   MPLS-TE LSPs are for future study.

200	1.2 Glossary of Notation

202	   This document applies consistent notation to define various
203	   parameters of the networks that are analyzed. These terms are defined
204	   as they are introduced though-out the document, but are grouped
205	   together here for quick reference. Refer to the full definitions in
206	   the text for detailed explanations.

208	   n      A network level. n = 1 is the core of the network.
209	          See section 3, for more details of the definition of a level.
210	   P(n)   A node at level n in the network.
211	   S(n)   The number of nodes at level n. That is, the number of P(n)
212	          nodes.
213	   L(n)   The number of LSPs seen by a P(n) node.
214	   X(n)   The number of LSP segment states held by a P(n) node.
215	   M(n)   The number of P(n+1) nodes subtended to a P(n) node.
216	   R      The number of rungs in a ladder network.
217	   E      The number of edge nodes (PEs) subtended below (directly or
218	          indirectly) a spar node in a ladder network.
219	   K      The cost-effectiveness of the network expressed in terms of
220	          the ratio of the number of PEs to the number of network nodes.

222	2. Issues of Concern for Scaling

224	   This section presents some of the issues associated with the support
225	   of LSPs at a Label Switching Router (LSR) or within the network.
226	   These issues may mean that there is a limit to the number LSPs that
227	   can be supported.

229	2.1 LSP State

231	   LSP state is the data (information) that must be stored at an LSR in
232	   order to maintain an LSP. Here we refer to the information that is
233	   necessary to maintain forwarding plane state and the additional
234	   information required when LSPs are established through control
235	   plane protocols. While the size of the LSP state is implementation-
236	   dependent, it is clear that any implementation will require some data
237	   in order to maintain LSP state.

239	   Thus LSP state becomes a scaling concern because, as the number of
240	   LSPs at an LSR increases, so the amount of memory required to
241	   maintain the LSPs increases in direct proportion. Since the memory
242	   capacity of an LSR is limited, there is a related limit placed on the
243	   number LSPs that can be supported.

245	   Note that techniques to reduce the memory requirements (such as data
246	   compression) may serve to increase the number of LSPs that can be
247	   supported, but this will only achieve a moderate multiplier and may
248	   significantly decrease the ability to process the state rapidly.

250	   In this document we define X(n) as "The number of LSP segment states
251	   held by a P(n) node." This definition observes that an LSR at the end
252	   of an LSP only has to maintain state in one direction (i.e., into the
253	   network), while a transit LSR must maintain state in both directions
254	   (i.e., toward both ends of the LSP). Furthermore, in multipoint-to-
255	   point (MP2P) LSPs (see Section 8) a transit LSR may need to maintain
256	   LSP state for one downstream segment (toward the destination) and
257	   multiple upstream segments (from multiple sources). That is, we
258	   define LSP segment state as the state necessary to maintain an LSP in
259	   one direction to one adjacent node.

261	2.2 Processing Overhead

263	   Depending largely on implementation issues, the number of LSPs
264	   passing through an LSR may impact the processing speed for each LSP.
265	   For example, control block search times can increase with the number
266	   of control blocks to be searched, and even excellent implementations
267	   cannot completely mitigate this fact. Thus, since CPU power is
268	   constrained in any LSR, there may be a practical limit to the number
269	   of LSPs that can be supported.

271	   Further processing overhead considerations depend on issues specific
272	   to the control plane protocols, and are discussed in the next
273	   section.

275	2.3 RSVP-TE Implications

277	   Like many connection-oriented signaling protocols, RSVP-TE requires
278	   that state is held within the network in order to maintain LSPs. The
279	   impact of this is described in Section 2.1. Note that RSVP-TE
280	   requires that separate information is maintained for upstream and
281	   downstream relationships, but does not require any specific
282	   implementation of that state.

284	   RSVP-TE is a soft-state protocol which means that protocol messages
285	   (refresh messages) must be regularly exchanged between signaling
286	   neighbors in order to maintain the state for each LSP that runs
287	   between the neighbors. A common period for the transmission (and
288	   receipt) of refresh messages is 30 seconds meaning that each LSR must
289	   send and receive one message in each direction (upstream and
290	   downstream) for every LSP it supports every 30 seconds. This has the
291	   potential to be a significant constraint on the scaling of the
292	   network, but various improvements [RFC2961] mean that this refresh
293	   processing can be significantly reduced allowing an implementation to
294	   be optimized to nearly remove all concerns about soft state scaling
295	   in a stable network.

297	   Observations of existing implementations indicate that there may be a
298	   threshold of around 50,000 LSPs above which an LSR struggles to
299	   achieve sufficient processing to maintain LSP state. Although refresh
300	   reduction [RFC2961] may substantially improve this situation,
301	   it has also been observed that under these circumstances the size of
302	   the Srefresh may become very large and the processing required may
303	   still cause significant disruption to an LSR.

305	   Another approach is to increase the refresh time. There is a
306	   correlation between the percentage increase in refresh time and the
307	   improvement in performance for the LSR. However, it should be noted
308	   that RSVP-TE's soft-state nature depends on regular refresh messages,
309	   thus a degree of functionality is lost by increasing the refresh
310	   time. This loss may be partially mitigated by the use of the RSVP-TE
311	   Hello message, and can also be reduced by the use of various GMPLS
312	   extensions [RFC3473] such as the use of [RFC2961] message
313	   acknowledgements on all messages.

315	   RSVP-TE also requires that signaling adjacencies are maintained
316	   through the use of Hello message exchanges. Although [RFC3209]
317	   suggests that Hello messages should be retransmitted every 5ms, in
318	   practice values of around 3 seconds are more common. Nevertheless,
319	   the support of Hello messages can represent a scaling limitation on
320	   an RSVP-TE implementation since one message must be sent and received
321	   to/from each signaling adjacency every time period. This can impose
322	   limits on the number of neighbors (physical or logical) that an LSR
323	   supports, but does not impact the number of LSPs that the LSR can
324	   handle.

326	2.4 Management

328	   Another practical concern for the scalability of large MPLS-TE
329	   networks is the ability to manage the network. This may be
330	   constrained by the available tools, the practicality of managing
331	   large numbers of LSPs, and the management protocols in use.

333	   Management tools are software implementations. Although such
334	   implementations should not constrain the control plane protocols, it
335	   is realistic to appreciate that network deployments will be limited
336	   by the scalability of the available tools. In practice, most existing
337	   tools have a limit to the number of LSPs that they can support. While
338	   a Network Management System (NMS) may be able to support a large
339	   number of LSPs, the number that can be supported by anElement
340	   Management System (EMS) (or the number supported by an NMS per-LSR)
341	   is more likely to be limited.

343	   Similarly, practical constraints may be imposed by the operation of
344	   management protocols. For example, an LSR may be swamped by
345	   management protocol requests to read information about the LSPs that
346	   it supports, and this might impact its ability to sustain those LSPs
347	   in the control plane. OAM, alarms, and notifications can further add
348	   to the burden placed on an LSR and limit the number of LSPs it can
349	   support.

351	   All of these consideration encourage a reduction in the number of
352	   LSPs supported within the network and at any particular LSR.

354	3. Network Topologies

356	   In order to provide some generic analysis of the potential scaling
357	   issues for MPLS-TE networks, this document explores two network
358	   topology models. These topologies are selected partly because of
359	   their symmetry which makes them more tractable to a formulaic
360	   approach, and partly because they represent generalizations of
361	   real deployment models. Section 3.3 provides a discussion of the
362	   commercial drivers for deployed topologies and gives more analysis
363	   of why it is reasonable to consider these two topologies.

365	   The first topology is the snowflake model. In this type of network,
366	   only the very core of the network is meshed. The edges of the network
367	   are formed as trees rooted in the core.

369	   The second network topology considered is the ladder model. In this
370	   type of network the core of the network is shaped and meshed in the
371	   form of a ladder and trees are attached rooted to the edge of the
372	   ladder.

374	   The sections that follow examine these topologies in detail in order
375	   to parameterize them.

377	3.1 The Snowflake Network Topology

379	   The snowflake topologies considered in this document are based on a
380	   hierarchy of connectivity within the core network. PE nodes have
381	   connectivity to P-nodes as shown in Figure 1. There is no direct
382	   connectivity between the PEs. Dual homing of PEs to multiple P
383	   nodes is not considered in this document although it may be a
384	   valuable addition to a network configuration.

386	         P
387	        /|\
388	       / | \
389	      /  |  \
390	     /   |   \
391	   PE    PE   PE

393	   Figure 1 : PE to P Node Connectivity

395	   The relationship between P-nodes is also structured in a hierarchical
396	   way. Thus, as shown in Figure 2, multiple P-nodes at one level are
397	   connected to a P-node at a higher level. We number the levels such
398	   that level 1 is the top level (top in our figure, and nearest to the
399	   core of the network) and level (n) is immediately above level (n+1),
400	   and we denote a P-node at level n as a P(n).

402	   As with PEs, there is no direct connectivity between P(n+1) nodes.
403	   Again, dual homing of P(n+1) nodes to multiple P(n) nodes is not
404	   considered in this document although it may be a valuable addition
405	   to a network configuration.

407	           P(n)
408	           /|\
409	          / | \
410	         /  |  \
411	        /   |   \
412	   P(n+1) P(n+1) P(n+1)

414	   Figure 2 : Relationship Between P-Nodes

416	   At the top level, P(1) nodes are connected in a full mesh. In
417	   reality, the level 1 part of the network may be slightly less well
418	   connected than this, but assuming a full mesh provides for
419	   generality. Thus, the snowflake topology comprises a clique with
420	   topologically equivalent trees subtended from each node in the
421	   clique.

423	   The key multipliers for scalability are the number of P(1) nodes, and
424	   the multiplier relationship between P(n) and P(n+1) at each level
425	   including down to PEs.

427	   We define the multiplier M(n) as the number of P(n+1) nodes at level
428	   (n+1) attached to any one P(n). Assume that M(n) is constant for all
429	   nodes at level n. Since nodes at the same level are not
430	   interconnected (except at the top level), and since each P(n+1) node
431	   is connected to precisely one P(n) node, M(n) is one less than the
432	   degree of the node at level n (that is, the P(n) node is attached to
433	   M(n) nodes at level (n+1) and 1 node at level (n-1)).

435	   We define S(n) as the number of nodes at level (n).

437	   Thus:  S(n) = S(1)*M(1)*M(2)*...*M(n-1)

439	   So the number of PEs can be expressed as:

441	      S(PE) = S(1)*M(1)*M(2)*...*M(n)

443	   where the network has (n) layers of P-nodes.

445	   Thus we may depict an example snowflake network as shown in Figure 3.
446	   In this case:

448	     S(1) = 3
449	     M(1) = 3
450	     S(2) = S(1)*M(1) = 9
451	     M(2) = 2
452	     S(PE) = S(1)*M(1)*M(2) = 18
453	     PE      PE  PE     PE  PE      PE
454	        \      \/         \/       /
455	     PE--P(2)  P(2)      P(2)  P(2)--PE
456	             \ |            | /
457	              \|            |/
458	    PE--P(2)---P(1)------P(1)---P(2)--PE
459	       /           \    /           \
460	     PE             \  /             PE
461	                     \/
462	                     P(1)
463	                     /|\
464	                    / | \
465	                   /  |  \
466	           PE--P(2)  P(2) P(2)--PE
467	               /      /\      \
468	             PE     PE  PE     PE

470	   Figure 3 : An Example Snowflake Network

472	3.2 The Ladder Network Topology

474	   The ladder networks considered in this network are based on an
475	   arrangement of routers in the core network that resembles a ladder.

477	   Ladder networks typically have long and thin cores that are arranged
478	   as conventional ladders. That is, they have one or more spars
479	   connected by rungs. Each node on a spar may have:

481	   - connection to one or more other spars
482	   - connection to a tree of other core nodes
483	   - connection to customer nodes.

485	   Figure 4 shows a simplified example of a ladder network. A core of
486	   twelve nodes make up two spars connected by six rungs.

488	                PE    PE           PE   PE
489	       PE PE PE | PE  | PE  PE  PE |  PE | PE
490	         \|    \|/    |/    |     \|    \|/
491	       PE-P-----P-----P-----P------P-----P--PE
492	          |     |     |     |      |     |\
493	          |     |     |     |      |     | PE
494	          |     |     |     |      |     |
495	       PE-P-----P-----P-----P------P-----P
496	         /|    /|\    |\    |\     |\     \
497	       PE PE PE | PE  | PE  | PE   | PE    PE
498	                PE    PE    PE     PE

500	      Figure 4 : A Simplified Ladder Network.

502	   In practice, not all nodes on a spar (call them Spar-Nodes) need to
503	   have subtended PEs. That is, they can exist simply to give
504	   connectivity along the spar to other spar nodes, or across a rung to
505	   another spar. Similarly, the connectivity between spars can be more
506	   complex with multiple connections from one spar-node to another spar.
507	   Lastly, the network may be complicated by the inclusion of more than
508	   two spars (or simplified by reduction to a single spar).

510	   These variables make the ladder network non-trivial to model. For the
511	   sake of simplicity we will make the following restrictions:

513	   - There are precisely two spars in the core network.

515	   - Every spar-node connects to precisely one spar-node on the other
516	     spar. That is, each spar-node is attached to precisely one rung.

518	   - Each spar-node connects to either one (end-spar) or two (core-spar)
519	     other spar-nodes on the same spar.

521	   - Every spar-node has the same number of PEs subtended. This does not
522	     mean that there are no P-nodes subtended to the spar-nodes, but
523	     does mean that the edge tree subtended to each spar-node is
524	     identical.

526	   From these restrictions we are able to quantify a ladder network as
527	   follows:

529	      R    - The number of rungs. That is, the number of spar-nodes on
530	             each spar.
531	      S(1) - The number of spar-nodes in the network. S(1)=2*R.
532	      E    - The number of subtended edge nodes (PEs) to each spar-node.

534	   The number of rungs may vary considerably. A number less than 3 is
535	   unlikely (since that would not be a significantly connected network),
536	   and a number greater than 100 seems improbable (because that would
537	   represent a very long, thin network).

539	   E can be treated as for the snowflake network. That is, we can
540	   consider a number of levels of attachment from P(1) nodes which are
541	   the spar-nodes, through P(i) down to P(n) which are the PEs.
542	   Practically, we need to only consider n=2 (PEs attached direct to the
543	   spar-nodes) and n=3 (one level of P-nodes between the PEs and the
544	   spar-nodes).

546	   Let M(i) be the ratio of P(i) nodes to P(i-1) nodes, i.e. the
547	   connectivity between levels of P-node as defined for the snowflake
548	   topology. Hence, the number of nodes at any level (n) is:

550	      S(n) = S(1)*M(1)*M(2)*...*M(n-1)

552	   So number of PEs subtended to a spar node is:

554	      E = M(1)*M(2)*...*M(n)

556	   And the number of PEs can be expressed as:

558	      S(PE) = S(1)*M(1)*M(2)*...*M(n)
559	            = S(1)*E

561	   Thus we may depict an example ladder network as shown in Figure 5.
562	   In this case:

564	     R = 5
565	     S(1) = 10
566	     M(1) = 2
567	     S(2) = S(1)*M(1) = 20
568	     M(2) = 2
569	     E = M(1)*M(2) = 4
570	     S(PE) = S(1)*E = 40

572	     PE PE  PE PE PE PE PE PE PE PE PE PE PE PE  PE PE
573	       \|     \|    \|    \|   |/    |/    |/     |/
574	        P(2)   P(2) P(2) P(2) P(2) P(2) P(2)     P(2)
575	            \      \  |   \    /    |  /        /
576	     PE      \      \ |    \  /     | /        /       PE
577	       \      \      \|     \/      |/        /       /
578	     PE-P(2)---P(1)---P(1)---P(1)---P(1)---P(1)---P(2)-PE
579	               |      |      |      |      |
580	               |      |      |      |      |
581	               |      |      |      |      |
582	     PE-P(2)---P(1)---P(1)---P(1)---P(1)---P(1)---P(2)-PE
583	       /      /     / |     /\      |\        \       \
584	     PE      /     /  |    /  \     | \        \       PE
585	            /     /   |   /    \    |  \        \
586	        P(2)   P(2) P(2) P(2) P(2) P(2) P(2)     P(2)
587	       /|     /|    /|    /|   |\    |\    |\     |\
588	     PE PE  PE PE PE PE PE PE PE PE PE PE PE PE  PE PE

590	   Figure 5 : An Example Ladder Network

592	3.3 Commercial Drivers for Selected Configurations

594	   It is reasonable to ask why these two particular network topologies
595	   have been chosen.

597	   The most important consideration is physical scalability. Each node
598	   (label switching router - LSR) is only able to support a limited
599	   number of physical interfaces. This necessarily reduces the ability
600	   to fully mesh a network and leads to the tree-like structure of the
601	   network toward the PEs.

603	   A realistic commercial consideration for an operator is the fact that
604	   the only revenue-generating nodes in the network are the PEs. Other
605	   nodes are needed only to support connectivity and scalability.
606	   Therefore, there is a desire to maximize S(PE) while minimizing the
607	   sum of S(n) for all values of (n). This could be achieved by
608	   minimizing the number of levels, and by maximizing the connectivity
609	   at each layer, M(n). Ultimately, however, this would produce a
610	   network of just interconnected PEs, which is clearly in conflict with
611	   the physical scaling situation.

613	   Therefore, the solution calls for a "few" levels with "relatively
614	   large" connectivity at each level. We might say that the
615	   cost-effectiveness of the network can be stated as:

617	   K = S(PE)/(S(1)+S(2) + ... + S(n)) where n is the level above the PEs

619	   We should observe, however, that this equation may be naive in that
620	   the cost of a network is not actually a function of the number of
621	   routers (since a router chasis is often free or low cost), but is
622	   really a function of the cost of the line cards which is, itself, a
623	   product of the capacity of the line cards. Thus, the relatively high
624	   connectivity decreases the cost-effectiveness, while a topology that
625	   tends to channel data through a network core tends to demand higher
626	   capacity (so more expensive) line cards.

628	   A further consideration is the availability of connectivity (usually
629	   fibers) between LSR sites. Although it is always possible to lay new
630	   fiber, this may not be cost-effective or timely. As much of a problem
631	   is likely to be the physical shape and topography of the country in
632	   which the network is laid. If the country is 'long and thin' then a
633	   ladder network is likely to be used.

635	   This document examines the implications for control plane and data
636	   plane scalability of this type of network when MPLS-TE LSPs are used
637	   to provide full connectivity between all PEs.

639	3.4 Other Network Topologies

641	   As explained in Section 1, this document is using two symmetrical
642	   and generalized network topologies for simplicity of modelling. In
643	   practice there are two other topological considerations.

645	   a. Multihoming
646	      It is relatively common for a node at level (n) to be attached to
647	      more than one node at level (n-1). This is particularly common at
648	      PEs that may be connected to more than one P(n).

650	   b. Meshing within a level
651	      A level in the network will often include links between P-nodes at
652	      the same level including the possibility of links between PEs.

654	      This may result in a network that looks like a series of
655	      concentric circles with spokes.

657	   Both of these features are likely to have some impact on the scaling
658	   of the networks. However, for the purposes of establishing the
659	   ground-rules for scaling, this document restricts itself to the
660	   consideration of the symmetrical networks described in Sections 2.1
661	   and 2.2. Discussion of other network formats is for future study.

663	4. Required Network Sizes

665	   An important question for this evaluation and analysis is the size of
666	   the network that operators require. How many PEs are required? What
667	   ratio of P to PE is acceptable. How many ports do devices have for
668	   physical connectivity? What type of MPLS-TE connectivity between PEs
669	   is required?

671	   Although presentation of figures for desired network sizes must be
672	   treated with caution because history shows that networks grow beyond
673	   all projections, it is useful to set some acceptable lower bounds.
674	   That is, we can state that we are interested in networks of at least
675	   a certain size.

677	   The most important features are:

679	   - The network should have at least 1000 PEs.
680	   - Each pair of PEs should be connected by at least one LSP in each
681	     direction.

683	4.1 Practical Numbers

685	   In practice, reasonable target numbers are as follows.

687	   S(PE) >= 1000
688	   Number of levels is 3. That is: 1, 2 and PE.
689	   M(2) <= 20
690	   M(1) <= 20
691	   S(1) <= 100

693	5. Scaling in Flat Networks

695	   Before proceeding to examine potential scaling improvements, we need
696	   to examine how well the flat networks described in the previous
697	   sections scale.

699	   Consider the requirement for a full mesh of LSPs linking all PEs.
700	   That is, each PE has an LSP to and from each other LSP. Thus, if
701	   there are S(PE) PEs in the network, there are S(PE)*(S(PE) - 1) LSPs.

703	   Define L(n) as the number of LSPs handled by a level (n) LSR.

705	   L(PE) = 2*(S(PE) - 1)

707	5.1 Snowflake Networks

709	   There are a total of S(PE) PEs in the network and, since each PE
710	   establishes an LSP with every other PE, it would be expected that
711	   there are S(PE) - 1 LSPs incoming to each PE and the same number of
712	   LSPs outgoing from the same PE, giving a total of  2(S(PE) - 1) on
713	   the incident link. Hence, in a snowflake topology (see Figure 3),
714	   since there are M(2) PEs attached to each P(2) node, it may tempting
715	   to think that L(2), the number of LSPs traversing each P(2) node, is
716	   simply 2*(S(PE) - 1)*M(2). However, it should be noted that of the
717	   S(PE) - 1 LSPs incoming to each PE, M(2) - 1 originated from nodes
718	   attached to the same P(2) node and so this value would count the LSPs
719	   between the between the M(2) PEs attached to each P(2) node twice;
720	   once when outgoing from the M(2) - 1 other nodes and once when
721	   incoming into a particular PE.

723	   There are a total of M(2)*(M(2) - 1) LSPs between these M(2) PEs and
724	   since this value is erroneously included twice in 2*(S(PE) - 1)*M(2),
725	   the correct value is:

727	   L(2) = 2*M(2)*(S(PE) - 1) - M(2)*(M(2) - 1)
728	        = M(2)*(2*S(PE) - M(2) - 1)

730	   An alternative way of looking at this, that proves extensible for the
731	   calculation of L(1), is to observe that each PE subtended to a P(2)
732	   node has an LSP in each direction to all S(PE) - M(2) PEs in the rest
733	   of the system, and there are M(2) such locally subtended PEs; thus
734	   2*M(2)*(S(PE) - M(2)). Additionally there are M(2)*(M(2) - 1) LSPs
735	   between the locally subtended PEs. So:

737	   L(2) = 2*M(2)*(S(PE) - M(2)) + M(2)*(M(2) - 1)
738	        = M(2)*(2*S(PE) - M(2) - 1)

740	   L(1) can be computed in the same way as this second evaluation of
741	   L(2). Each PE subtended below a P(1) node has an LSP in each
742	   direction to all PEs not below the P(1) node. There are M(1)*M(2) PEs
743	   below the P(1) node, so this accounts for
744	   2*M(1)*M(2)*(S(PE) - M(1)*M(2)) LSPs. To this, we need to add the
745	   number of LSPs that pass through the P(1) node and that run between
746	   the PEs subtended below the P(1). Consider each P(2): it has M(2) PEs
747	   that each has an LSP going to all of the PEs subtended to the other
748	   P(2) nodes subtended to the P(1). There are M(1) - 1 such other P(2)
749	   nodes, and so M(2)*(M(1) - 1) other such PEs. So the number of LSPs
750	   from the PEs below a P(2) node is M(2)*M(2)*(M(1) - 1). And there
751	   are M(1) P(2) nodes below the P(1), giving rise to a total of
752	   M(2)*M(2)*M(1)*(M(1) - 1) LSPs. Thus:

754	   L(1) = 2*M(1)*M(2)*(S(PE) - M(1)*M(2)) + M(2)*M(2)*M(1)*(M(1) - 1)
755	        = M(1)*M(2)*(2*S(PE) - M(2)*(M(1) + 1))

757	   So, for example, with S(1) = 5, M(1) = 10, and M(2) = 20, we see:

759	      S(PE) = 1000
760	      L(PE) = 1998
761	      L(2)  = 39580
762	      L(1)  = 356000

764	   Alternatively, with S(1) = 10, M(1) = 10, and M(2) = 20, we see:

766	      S(PE) = 2000
767	      L(PE) = 3998
768	      L(2)  = 79580
769	      L(1)  = 756000

771	   In both examples, the number of LSPs at the core (P(1)) nodes is
772	   probably unacceptably large even though there are only a relatively
773	   modest number of PEs. In fact, L(2) may even be too large in the
774	   second example.

776	5.2 Ladder Networks

778	   In ladder networks L(PE) remains the same at 2*(S(PE) - 1).

780	   L(2) can be computed using the same mechanism as for the snowflake
781	   topology because the subtended tree is the same format. Hence,

783	   L(2) = 2*M(2)*(S(PE) - 1) - M(2)*(M(2) - 1)

785	   But L(1) requires a different computation because each P(1) not only
786	   sees LSPs for the subtended PEs, but is also a transit node for some
787	   of the LSPs that cross the core (the core is not fully meshed).

789	   Each P(1) sees:

791	   o  all of the LSPs between locally attached PEs
792	   o  less the those LSPs between locally attached PEs that can be
793	      served exclusively by the attached P(2) nodes
794	   o  all LSPs between locally attached PEs and remote PEs
795	   o  LSPs in transit that pass through the P(1).

797	   The first three numbers are easily determined and match what we have
798	   seen from the snowflake network. They are:

800	   o  E*(E-1)
801	   o  M(1)*M(2)*(M(2)-1) = E*(M(2) - 1)
802	   o  2*E*E*(S(1) - 1)
803	   The number of LSPs in transit is more complicated to compute. It is
804	   simplified by not considering the ends of the ladders, but examining
805	   an arbitrary segment of the middle of the ladder such as shown in
806	   Figure 6. We look to compute and generalize the number of LSPs
807	   traversing each core link (labeled a and b in Figure 6) and so
808	   determine the number of transit LSPs seen by each P(1).

810	          :    :    :    :    :    :
811	          :    :    :    :    :    :
812	        P(2) P(2) P(2) P(2) P(2) P(2)
813	            \  |   \    /    |  /
814	             \ |    \  /     | /
815	              \|     \/      |/
816	         ......P(1)---P(1)---P(1)......
817	               |   a  |      |
818	               |      |b     |
819	               |      |      |
820	         ......P(1)---P(1)---P(1)......
821	              /|     /\      |\
822	             / |    /  \     | \
823	            /  |   /    \    |  \
824	        P(2) P(2) P(2) P(2) P(2) P(2)
825	          :    :    :    :    :    :
826	          :    :    :    :    :    :

828	   Figure 6 : An Arbitrary Section of a Ladder Network

830	   Of course, the number of LSPs carried on links a and b in the Figure
831	   depends on how LSPs are routed through the core network. But if we
832	   assume a symmetrical routing policy and an even distribution of LSPs
833	   across all shortest paths, the result is the same.

835	   Now we can see that each P(1) sees half of 2a+b LSPs (since each LSP
836	   would otherwise be counted twice as it passed through the P(1))
837	   except that some of the LSPs are locally terminated so are only
838	   included once in the sum 2a+b.

840	   So L(1) = a + b/2 -
841	             (locally terminated transit LSPs)/2 +
842	             (locally contained LSPs)

844	   Thus:

846	   L(1) = a + b/2 -
847	          2*E*E*(S(1) - 1)/2 +
848	          E*(E-1) - E*(M(2) - 1)
849	        = a + b/2 +
850	          E*E*(2 - S(1)) - E*M(2)

852	   So all we have to do is work out a and b.

854	   Recall that the ladder length R = S(1)/2, and define X = E*E.

856	   Consider the contribution made by all of the LSPs that make n hops on
857	   the ladder to the totals of each of a and b. If the ladder was
858	   unbounded then we could say that in the case of a, there are n*2X
859	   LSPs along the spar only, and n(n-1)*2X/n = 2X(n-1) LSPs use a rung
860	   and the spar. Thus, the LSPs that make n hops on the ladder
861	   contribute (4n-2)X LSPs to a. Note that the edge cases are special
862	   because LSPs that make only one hop on the ladder cannot transit a
863	   P(1) but only start or end there.

865	   So with a ladder of length R = S(1)/2 we could say:

867	         R
868	   a = SUM[(4i-2)*X] + 2RX
869	       i=2

871	     = 2*X*R*(R+1)

873	   And similarly, considering b in an unbounded ladder, the LSPs that
874	   only travel one hop on the LSP are a special case, contributing 2X
875	   LSPs, and each other LSP that traverses n hops on the ladder
876	   contributes 2n*2X/n = 4X LSPs. So:

878	            R+1
879	   b = 2X + SUM[4X]
880	            i=2

882	     = 2*X + 4*X*R

884	   In fact the ladders are bounded and so the number of LSPs is reduced
885	   because of the effect of the ends of the ladders. The links that see
886	   the most LSPs are in the middle of the ladder. Consider a ladder of
887	   length R; a node in the middle of the ladder is R/2 hops away from
888	   the end of the ladder. So we see that the formula for the
889	   contribution to the count of spar-only LSPs for a is only valid up to
890	   n=R/2, and for spar-and-rung LSPs for n=1+R/2. Above these limits the
891	   contribution made by spar-only LSPs decays as (n-R/2)*2X. However,
892	   for a first order approximation, we will use the values of a and b as
893	   computed above. This gives us an upper bound of the number of LSPs
894	   without using a more complex formula for the reduction made by the
895	   effect of the ends of the ladder.

897	   From this:

899	   L(1) = a + b/2 +
900	          E*E*(2 - S(1)) - E*M(2)
901	        = 2*X*R*(R+1) +
902	          X + 2*X*R +
903	          E*E*(2 - S(1)) - E*M(2)
904	        = E*E*S(1)*(1 + S(1)/2) +
905	          E*E + E*E*S(1) +
906	          2*E+E - E*E*S(1) - E*M(2)
907	        = E*E*S(1)*(1 + S(1)/2) + 3*E+E - E*M(2)
908	        = E*E*S(1)*S(1)/2 + E*E*S(1) + 3*E*E - E*M(2)

910	   So, for example, with S(1) = 6, M(1) = 10, and M(2) = 17, we see:
911	      E     = 170
912	      S(PE) = 1020
913	      L(PE) = 2038
914	      L(2)  = 34374
915	      L(1)  = 777410

917	   Alternatively, with S(1) = 10, M(1) = 10, and M(2) = 20, we see:
918	      E     = 200
919	      S(PE) = 2000
920	      L(PE) = 3998
921	      L(2)  = 79580
922	      L(1)  = 2516000

924	   In both examples, the number of LSPs at the core (P(1)) nodes is
925	   probably unacceptably large even though there are only a relatively
926	   modest number of PEs. In fact, L(2) may even be too large in the
927	   second example.

929	   Compare the L(1) values with the total number of LSPs in the sytem
930	   S(PE)*(S(PE) - 1) which is 1039380 and 3998000 respectively.

932	6. Scaling Snowflake Networks with Forwarding Adjacencies

934	   One of the purposes of LSP hierarchies [RFC4206] is to improve the
935	   scaling properties of MPLS-TE networks. LSP tunnels (sometimes known
936	   as Forwarding Adjacencies (FAs)) may be established to provide
937	   connectivity over the core of the network and multiple edge-to-edge
938	   LSPs may be tunneled down a single FA LSP.

940	   In our network we consider a mesh of FA LSPs between all core nodes
941	   at the same level. We consider two possibilities here. In the first
942	   all P(2) nodes are connected to all other P(2) nodes by LSP tunnels,
943	   and the PE-to-PE LSPs are tunneled across the core of the network. In
944	   the second, an extra layer of LSP hierarchy is introduced by
945	   connecting all P(1) nodes in an LSP mesh and tunneling the P(2)-to-
946	   P(2) tunnels through these.

948	6.1 Two Layer Hierarchy

950	   In this hierarchy model, the P(2) nodes are connected by a mesh of
951	   tunnels. This means that the P(1) nodes do not see the PE-to-PE LSPs.

953	   It remains the case that:
954	      L(PE) = 2*(S(PE) - 1)

956	   L(2) is slightly increased. It can be computed as the sum of all LSPs
957	   for all attached PEs including the LSPs between the attached PE (this
958	   figure is unchanged from Section 5.1: i.e. M(2)*(2*S(PE) - M(2) - 1))
959	   plus the number of FA LSPs providing a mesh to the other P(2) nodes.
960	   Since the number of P(2) nodes is S(2), each P(2) node sees
961	   2*(S(2) - 1) FA LSPs. Thus:
962	      L(2) = M(2)*(2*S(PE) - M(2) - 1) + 2*(S(2) - 1)

964	   L(1), however, is significantly reduced and can be computed as the
965	   sum of the number of FA LSPs to and from each attached P(2) to each
966	   other P(2) in the network, including (but counting only once) the FA
967	   LSPs between attached P(2) nodes. In fact, the problem is identical
968	   to the L(2) computation in Section 5.1. So:

970	   L(1) = M(1)*(2*S(2) - M(1) - 1)

972	   So, for example, with S(1) = 5, M(1) = 10, and M(2) = 20, we see:
973	      S(PE) = 1000
974	      S(2)  = 50
975	      L(PE) = 1998
976	      L(2)  = 39678
977	      L(1)  = 890

979	   Alternatively, with S(1) = 10, M(1) = 10, and M(2) = 20, we see:
980	      S(PE) = 2000
981	      S(2)  = 100
982	      L(PE) = 3998
983	      L(2)  = 79778
984	      L(1)  = 1890

986	   So, in both examples, potential problems at the core (P(1)) nodes
987	   caused by an excessive number of LSPs can be avoided, but any problem
988	   with L(2) is made slightly worse as can be seen from the table below.

990	   Example| Count | Unmodified    | 2-Layer
991	          |       | (Section 5.1) | Hierarchy
992	   -------+-------+---------------+----------
993	   A      | L(2)  |      39580    |   39678
994	          | L(1)  |     356000    |     890
995	   -------+-------+---------------+----------
996	   B      | L(2)  |      79580    |   79778
997	          | L(1)  |     756000    |    1890

999	6.1.1 Tuning the Network Topology to Suit the Two Layer Hierarchy

1001	   Clearly we can reduce L(2) by selecting appropriate values of S(1),
1002	   M(1), and M(2). We can do this without negative consequences, since
1003	   no change will affect L(PE), and since a large percentage increase in
1004	   L(1) is sustainable because L(1) is now so small.

1006	   Observe that:
1007	      L(2) = M(2)*(2*S(PE) - M(2) - 1) + 2*(S(2) - 1)

1009	   where S(PE) = S(1)*M(1)*M(2) and S(2) = S(1)*M(1). So L(2) scales
1010	   with M(2)^2 and we can have the most impact by reducing M(2) while
1011	   keeping S(PE) constant.

1013	   For example, with S(1) = 10, M(1) = 10, and M(2) = 10, we see:
1014	      S(PE) = 1000
1015	      S(2)  = 100
1016	      L(PE) = 1998
1017	      L(2)  = 20088
1018	      L(1)  = 1890

1020	   And similarly, with S(1) = 20, M(1) = 20, and M(2) = 5, we see:
1021	      S(PE) = 2000
1022	      S(2)  = 400
1023	      L(PE) = 3998
1024	      L(2)  = 20768
1025	      L(1)  = 15580

1027	   These considerable scaling benefits must be offset against the cost-
1028	   effectiveness of the network. Recall from Section 3.3 that

1030	      K = S(PE)/(S(1)+S(2) ... + S(n))

1032	   where n is the level above the PEs, so that for our network:

1034	      K = S(PE) / (S(1) + S(2))

1036	   Thus, in the first example the cost-effectiveness has been halved
1037	   from 1000/55 to 1000/110. And in the second example it has been
1038	   reduced to roughly one quarter, changing from 2000/110 to 2000/420.

1040	   So, although the tuning changes may be necessary to reach the desired
1041	   network size, they come at a considerable cost to the operator.

1043	6.2 Alternative Two Layer Hierarchy

1045	   An alternative to the two layer hierarchy presented in section 6.1 is
1046	   to provide a full mesh of FA LSPs between P(1) nodes. This technique
1047	   is only of benefit to any nodes in the core of the level 1 network.
1048	   It makes no difference to the PE and P(2) nodes since they continue
1049	   to see only the PE-to-PE LSPs. Furthermore, this approach increases
1050	   the burden at the P(1) nodes since they have to support all of the
1051	   PE-to-PE LSPs as in the flat model, plus the additional 2*(S(1) - 1)
1052	   P(1)-to-P(1) FA LSPs. Thus, this approach should only be considered
1053	   where there is a mesh of P-nodes within the ring of P(1) nodes, and
1054	   it is not considered further in this document.

1056	6.3 Three Layer Hierarchy

1058	   As demonstrated by section 6.2, introducing a mesh of FA LSPs at the
1059	   top level (P(1)) has no benefit, but if we introduce an additional
1060	   level in the network (P(3) between P(2) and PE) to make a four level
1061	   snowflake, we can introduce a new layer of FA LSPs so that we have a
1062	   full mesh of FA LSPs between all P(3) nodes to carry the PE-to-PE
1063	   LSPs, and a full mesh of FA LSPs between all P(2) nodes to carry the
1064	   P(3)-to-P(3) LSPs.

1066	   The number of PEs is S(PE) = S(1)*M(1)*M(2)*M(3), and the number of
1067	   PE-to-PE LSPs at a PE remains as L(PE) = 2*(S(PE) - 1).

1069	   The number of LSPs at a P(3) can be deduced from section 6.1. It is
1070	   the sum of all LSPs for all attached PEs including the LSPs between
1071	   the attached PE plus the number of FA LSPs providing a mesh to the
1072	   other P(3) nodes.

1074	   L(3) = M(3)*(2*S(PE) - M(3) - 1) + 2*(S(3) - 1)

1076	   The number of LSPs at P(2) can also be deduced from section 6.1 since
1077	   it is the sum of all LSPs for all attached P(3) nodes including the
1078	   LSPs between the attached PE plus the number of FA LSPs providing a
1079	   mesh to the other P(2) nodes.

1081	   L(2) = M(2)*(2*S(3) - M(2) - 1) + 2*(S(2) - 1)

1083	   Finally, L(1) can be copied straight from 6.1.

1085	   L(1) = M(1)*(2*S(2) - M(1) - 1)

1087	   For example, with S(1) = 5, M(1) = 5, M(2) = 5, and M(3) = 8 we see:
1088	      S(PE) = 1000
1089	      S(3)  = 125
1090	      S(2)  = 25
1091	      L(PE) = 1998
1092	      L(3)  = 16176
1093	      L(2)  = 1268
1094	      L(1)  = 220

1096	   Similarly, with S(1) = 5, M(1) = 5, M(2) = 8, and M(3) = 10 we see:
1097	      S(PE) = 2000
1098	      S(3)  = 200
1099	      S(2)  = 25
1100	      L(PE) = 3998
1101	      L(3)  = 40038
1102	      L(2)  = 3184
1103	      L(1)  = 220

1105	   Clearly, there are considerable scaling improvements with this three-
1106	   layer hierarchy, and all of the numbers (even L(3) in the second
1107	   example) are manageable.

1109	   Of course, the extra level in the network tends to reduce the cost-
1110	   effectiveness of the networks with values of K = 1000/155 and
1111	   K = 2000/230 (from 1000/55 and 2000/110) for the examples above.
1112	   That is, a reduction by a factor of 3 in the first case and 2 in the
1113	   second case. Such a change in cost-effectiveness has to be weighed
1114	   against the the desire to deploy such a large network. If LSP
1115	   hierarchies are the only scaling tool available, and networks this
1116	   size are required, the cost-effectiveness may need to be sacrificed.

1118	6.4 Issues with Hierarchical LSPs

1120	   A basic observation for hierarchical scaling techniques is that it is
1121	   hard to have any impact on the number of LSPs that must be supported
1122	   by the level of P(n) nodes adjacent to the PEs (for example, it is
1123	   hard to reduce L(3) in section 6.3). In fact, the only way we can
1124	   change the number of LSPs supported by these nodes is to change the
1125	   scaling ratio M(n) in the network, in other words to change the
1126	   number of PEs subtended to any P(n). But such a change has a direct
1127	   effect on the number of PEs in the network and so the
1128	   cost-effectiveness is impacted.

1130	   Another concern with the hierarchical approach is that it must be
1131	   configured and managed. This may not seem like a large burden, but it
1132	   must be recalled that the P(n) nodes are not at the edge of the
1133	   network - they are a set of nodes that must be identified so that the
1134	   FA LSPs can be configured and provisioned. Effectively, the operator
1135	   must plan and construct a layered network with a ring of P(n) nodes
1136	   giving access to the level (n) network. This design activity is open
1137	   to considerable risk as failing to close the ring (i.e. allowing a
1138	   node to be at both level (n+1) and at level (n)) may cause
1139	   operational confusion.

1141	   Protocol techniques (such as IGP auto-mesh [RFC4972]) have been
1142	   developed to reduce the configuration necessary to build this type of
1143	   multi-level network. In the case of auto-mesh, the routing protocol
1144	   is used to advertise the membership of a 'mesh group', and all
1145	   members of the mesh can discover each other and connect with LSP
1146	   tunnels. Thus the P(n) nodes giving access to level (n) can advertise
1147	   their existence to each other, and it is not necessary to configure
1148	   each with information about all of the others. Although this process
1149	   can help to reduce the configuration overhead, it does not eliminate
1150	   it as each member of the mesh group must still be planned and
1151	   configured for membership.

1153	   An important consideration for the use of hierarchical LSPs is how
1154	   they can be protected using MPLS Fast Re-Route (FRR) [RFC4090]. FRR
1155	   may provide link protection either by protecting the tunnels as they
1156	   traverse a broken link, or by treating each level (n) tunnel LSP as a
1157	   link in level (n+1), and providing protection for the level (n+1)
1158	   LSPs (although in this model fault detection and propagation time may
1159	   be an issue). Node protection may be performed in a similar way, but
1160	   protection of the first and last nodes of a hierarchical LSP is
1161	   particularly difficult. Additionally, the whole notion of scaling
1162	   with regard to FRR gives rise to separate concerns that are outside
1163	   the scope of this document as currently formulated.

1165	   Finally, observe that we have been explaining these techniques using
1166	   conveniently symmetrical networks. Consider how we would arrange the
1167	   hierarchical LSPs in a network where some PEs are connected closer to
1168	   the center of the network than others.

1170	7. Scaling Ladder Networks with Forwarding Adjacencies

1172	7.1 Two Layer Hierarchy

1174	   In Section 6.2, we observed that there is no value to placing FA LSPs
1175	   between the P(1) nodes of our example snowflake topologies. This is
1176	   because those LSPs would be just one hop long and would, in fact,
1177	   only serve to increase the burden at the P(1) nodes. However, in the
1178	   ladder model, there is value to this approach. The P(1) nodes are the
1179	   spar nodes of the ladder, and they are not all mutually adjacent.
1180	   That is, the P(1)-to-P(1) hierarchical LSPs can create a full mesh of
1181	   P(1) nodes where one does not exist in the physical topology.

1183	   The number of LSPs seen by a P(1) node is then:

1185	   o all of the tunnels terminating at the P(1) node
1186	   o any transit tunnels
1187	   o all of the LSPs due to subtended PEs.

1189	   This is a substantial reduction because all of the transit LSPs are
1190	   reduced to just one per remote P(1) that causes any transit LSP. So:

1192	   L(1) = 2*(S(1) - 1) +
1193	          O(S(1)*S(1)/2) +
1194	          2*E*E*(S(1) - 1) + E*(E-1) - E*(M(2) - 1)

1196	   where O(S(1)*S(1)/2) gives an upper bound order of magnitude. So:

1198	   L(1) = S(1)*S(1)/2 + 2*S(1) + 2*E*E*(S(1) - 1) - E*M(2) - 2

1200	   So, in our two examples:

1202	   With S(1) = 6, M(1) = 10, and M(2) = 17, we see:
1203	      E     = 170
1204	      S(PE) = 1020
1205	      L(PE) = 2038
1206	      L(2)  = 34374
1207	      L(1)  = 286138

1209	   Alternatively, with S(1) = 10, M(1) = 10, and M(2) = 20, we see:
1210	      E     = 200
1211	      S(PE) = 2000
1212	      L(PE) = 3998
1213	      L(2)  = 79580
1214	      L(1)  = 716060

1216	   Both of these show significant improvements over the previous L(1)
1217	   figures of 777410 and 2516000. But the numbers are still too large to
1218	   manage, and there is no improvement in the L(2) figures.

1220	7.2 Three Layer Hierarchy

1222	   We can also apply the three layer hierarchy to the ladder model. In
1223	   this case the number of LSPs between P(1) nodes is not reduced, but
1224	   tunnels are also set up between all P(2) nodes. Thus the number of
1225	   LSPs seen by a P(1) node is:

1227	   o all of the tunnels terminating at the P(1) node
1228	   o any transit tunnels between P(1) nodes
1229	   o all of the LSPs due to subtended P(2) nodes.

1231	   No PE-to-PE LSPs are seen at the P(1) nodes.

1233	   L(1) = 2*(S(1) - 1) +
1234	          O(S(1)*S(1)/2) +
1235	          2*(S(1) - 1)*M(1)*M(1) + M(1)*(M(1) - 1)

1237	   where O(S(1)*S(1)/2) gives an upper bound order of magnitude. So:

1239	   L(1) = S(1)*S(1)/2 + 2*S(1) + 2*M(1)*M(1)*S(1) - M(1)(M(1) + 1) - 2

1241	   Unfortunately, there is a small increase in the number of LSPs seen
1242	   by the P(2) nodes. Each P(2) now sees all of the PE-to-PE LSPs that
1243	   it saw before, and it is also an end point for a set of P(2)-to-P(2)
1244	   tunnels. Thus L(2) increases to:

1246	   L(2) = 2*M(2)*(S(PE) - 1) - M(2)*(M(2) - 1) + 2*(S(1)*M(1) - 1)

1248	   So, in our two examples:

1250	   With S(1) = 6, M(1) = 10, and M(2) = 17, we see:
1251	      E     = 170
1252	      S(PE) = 1020
1253	      L(PE) = 2038
1254	      L(2)  = 34492
1255	      L(1)  = 1118

1257	   Alternatively, with S(1) = 10, M(1) = 10, and M(2) = 20, we see:
1258	      E     = 200
1259	      S(PE) = 2000
1260	      L(PE) = 3998
1261	      L(2)  = 79778
1262	      L(1)  = 1958

1264	   This represents a very dramatic decrease in LSPs across the core.

1266	7.3 Issues with Hierarchical LSPs

1268	   The same issues exist for hierarchical LSPs as described in Section
1269	   6.4. Although dramatic improvements can be made to the scaling
1270	   numbers for the number of LSPs at core nodes, this can only be done
1271	   at the cost of configuring P(2) to P(2) tunnels. The mesh of P(1)
1272	   tunnels is not enough.

1274	   But the sheer number of P(2) to P(2) tunnels that must be configured
1275	   is a significant management burden that can only be eased by using a
1276	   technique like automesh [RFC4972].

1278	   It is significant, however, that the scaling problem at the P(2)
1279	   nodes cannot be improved by using tunnels, and the only solution to
1280	   ease this in the hierarchical approach would be to institute another
1281	   layer of hierarchy (that is, P(3) nodes) between the P(2) nodes and
1282	   the PEs. This is, of course, a significant expense.

1284	8. Scaling Improvements Through Multipoint-to-Point LSPs

1286	   An alternative (or complementary) scaling technique has been proposed
1287	   using multipoint-to-point (MP2P) LSPs. The fundamental improvement in
1288	   this case is achieved by reducing the number of LSPs toward the
1289	   destination as LSPs toward the same destination are merged.

1291	   This section presents an overview of MP2P LSPs and describes their
1292	   applicability and scaling benefits.

1294	8.1 Overview of MP2P LSPs

1296	   Note that the MP2P LSPs discussed here are for MPLS-TE and are not
1297	   the same concept familiar in the Label Distribution Protocol (LDP)
1298	   described in [RFC5036].

1300	   Traffic flows generally converge toward their destination and this
1301	   can be utilized by MPLS in constructing an MP2P LSP. With such an
1302	   LSP, the LFIB mappings at each LSR are many-to-one so that multiple
1303	   pairs {incoming interface, incoming label} are mapped to a single
1304	   pair {outgoing interface, outgoing label}. Obviously, if per-platform
1305	   labels are used, this mapping may be optimized within an
1306	   implementation.

1308	   It is important to note that with MP2P MPLS-TE LSPs the traffic flows
1309	   are merged. That is, some additional form of identifier is required
1310	   if demerging is required. For example, if the payload is IP traffic
1311	   belonging to the same client network, no additional demerging
1312	   information is required since the IP packet contains sufficient data.
1313	   On the other hand, if the data comes, for example, from a variety of
1314	   VPN client networks, then the flows will need to be labeled in their
1315	   own right as point-to-point (P2P) flows, so that traffic can be
1316	   disambiguated at the egress of the MP2P LSPs.

1318	   Techniques for establishing MP2P MPLS-TE LSPs and for assigning the
1319	   correct bandwidth downstream of LSP merge points are out of the scope
1320	   of this document.

1322	8.2 LSP State: A Better Measure of Scalability

1324	   Consider the network topology shown in figure 3. Suppose that we
1325	   establish MP2P LSP tunnels such that there is one tunnel terminating
1326	   at each PE, and that that tunnel has every other PE as an ingress.
1327	   Thus, a PE-to-PE MP2P LSP tunnel would have S(PE)-1 ingresses and one
1328	   egress, and there would be S(PE) such tunnels.

1330	   Note that there still remain 2*(S(PE) - 1) PE-to-PE P2P LSPs that are
1331	   carried through these tunnels.

1333	   Let's consider the number of LSPs handled at each node in the
1334	   network.

1336	   The PEs continue to handle the same number of PE-to-PE P2P LSPs, and
1337	   must also handle the MP2P LSPs. So:

1339	   L(PE) = 2*(S(PE) - 1) + S(PE)

1341	   But all P(n) nodes in the network only handle the MP2P LSP tunnels.
1342	   Nominally, this means that L(n) = S(PE) for all values of n. This
1343	   would appear to be a great success with the number of LSPs cut to
1344	   completely manageable levels.

1346	   But the number of LSPs is not the only issue (although it may have
1347	   some impact for some of the scaling concerns listed in section 4). We
1348	   are more interested in the amount of LSP state that is maintained by
1349	   an LSR. This reflects the amount of storage required at the LSR, the
1350	   amount of protocol processing, and the amount of information that
1351	   needs to be managed.

1353	   In fact, we were also interested in this measure of scalability in
1354	   the earlier sections of this document, but in those cases we could
1355	   see a direct correlation between the number of LSPs and the amount of
1356	   LSP state since transit LSPs had two pieces of state information (one
1357	   on the incoming and one on the outgoing interface), and ingress or
1358	   egress LSPs had just one piece of state.

1360	   We can quantify the amount of LSP state according to the number of
1361	   LSP segments managed by an LSR. So (as above), in the case of a P2P
1362	   LSP, an ingress or egress has one segment to maintain, while a
1363	   transit has two segments. Similarly, for an MP2P LSP, an LSR must
1364	   maintain one set of state information for each upstream segment
1365	   (which, we can assume, is in a one-to-one relationship with the
1366	   number of upstream neighbors), and exactly one downstream segment -
1367	   ingresses obviously have no upstream neighbors, and egresses have no
1368	   downstream segments.

1370	   So we can start again on our examination of the scaling properties of
1371	   MP2P LSPs using X(n) to represent the amount of LSP state held at
1372	   each P(n) node.

1374	8.3 Scaling Improvements for Snowflake Networks

1376	   At the PEs, there is only connectivity to one other network node, the
1377	   P(2) node. But note that if P2P LSPs need to be used to allow
1378	   disambiguation of data at the MP2P LSP egresses, then these P2P LSPs
1379	   are tunneled within the MP2P LSPs . So X(PE) is:

1381	   X(PE) = 2*(S(PE) - 1) if no disambiguation is required

1383	   and

1385	   X(PE) = 4*(S(PE) - 1) if disambiguation is required

1387	   Each P(2) node has M(2) downstream PEs. The P(2) sees a single MP2P
1388	   LSP targeted at each downstream PE with one downstream segment (to
1389	   that PE) and M(2) - 1 upstream segments from the other subtended PEs.
1390	   Additionally, each of these LSPs has an upstream segment from the one
1391	   upstream P(1). This gives a total of M(2)*(1 + M(2)) LSP segments.

1393	   There are also LSPs running from the subtended PEs to each other PE
1394	   in the network. There are S(PE) - M(2) such PEs, and the P(2) sees
1395	   one upstream segment for each of these from each subtended PE. It
1396	   also has one downstream segment for each of these LSPs. This gives
1397	   (M(2) + 1)*(S(PE) - M(2)) LSP segments.

1399	   Thus:

1401	   X(2) = M(2)*(1 + M(2)) + (M(2) + 1)*(S(PE) - M(2))
1402	        = S(PE)*(M(2) + 1)

1404	   Similarly, at each P(1) node there are M(1) downstream P(2) nodes and
1405	   so a total of M(1)*M(2) downstream PEs. Each P(1) is connected in a
1406	   full mesh with the other P(1) nodes so has (S(1) - 1) neighbors.

1408	   The P(1) sees a single MP2P LSP targeted at each downstream PE. This
1409	   has one downstream segment (to the P(2) to which the PE is connected)
1410	   and M(1) - 1 upstream segments from the other subtended P(2) nodes.
1411	   Additionally, each of these LSPs has an upstream segment from each of
1412	   the P(1) neighbors. This gives a total number of LSP segments of
1413	   M(1)*M(2)*(M(1) + S(1) - 1).

1415	   There are also LSPs running from each of the subtended PEs to each
1416	   other PE in the network. There are S(PE) - M(1)M(2) such PEs, and the
1417	   P(1) sees one upstream segment for each of these from each subtended
1418	   P(2) (since the aggregation form the subtended PEs has already
1419	   happened at the P(2) nodes). It also has one downstream segment to
1420	   appropriate next hop P(1) neighbor for each of these LSPs. This gives
1421	   (M(1) + 1)*(S(PE) - M(1)*M(2))  LSP segments.

1423	   Thus:

1425	   X(1) = M(1)*M(2)*(M(1) + S(1) - 1) +
1426	          (M(1) + 1)*(S(PE) - M(1)*M(2))
1427	        = M(1)*M(2)*(S(1) - 2) + S(PE)*(M(1) + 1)

1429	   So, for example, with S(1) = 10, M(1) = 10, and M(2) = 10, we see:
1430	      S(PE) = 1000
1431	      S(2)  = 100
1432	      X(PE) = 3996   (or 1998)
1433	      X(2)  = 11000
1434	      X(1)  = 11800

1436	   And similarly, with S(1) = 20, M(1) = 20, and M(2) = 5, we see:
1437	      S(PE) = 2000
1438	      S(2)  = 400
1439	      X(PE) = 5996   (or 2998)
1440	      X(2)  = 12000
1441	      X(1)  = 39800

1443	8.3.1 Comparison with Other Scenarios

1445	   For comparison with the examples in sections 5 and 6, we need to
1446	   convert those LSP-based figures to our new measure of LSP state.

1448	   Observe that each LSP in sections 5 and 6 generates two state units
1449	   at a transit LSR and one at an ingress or egress. So we can provide
1450	   conversions as follows:

1452	   Section 5 (flat snowflake network)
1453	     L(PE) = 2*(S(PE) - 1)
1454	     L(2)  = M(2)*(2*S(PE) - M(2) - 1)
1455	     L(1)  = M(1)*M(2)*(2*S(PE) - M(2)*(M(1) + 1))
1456	     X(PE) = 2*(S(PE) - 1)
1457	     X(2)  = 2*M(2)*(2*S(PE) - M(2) - 1)
1458	     X(1)  = 2*M(1)*M(2)*(2*S(PE) - M(2)*(M(1) + 1))

1460	     For the example with S(1) = 10, M(1) = 10, and M(2) = 10, this
1461	     gives a comparison table as follows:

1463	        Count | Unmodified  |  MP2P
1464	        ------+-------------+----------
1465	        X(PE) |     1998    |   3996
1466	        X(2)  |    39780    |  11000
1467	        X(1)  |   378000    |  11800

1469	     Clearly this technique is a significant improvement over the flat
1470	     network within the core of the network, although the PEs are more
1471	     heavily stressed if disambiguation is required.

1473	   Section 6.1 (Two-layer hierarchy snowflake network)
1474	     L(PE) = 2*(S(PE) - 1)
1475	     L(2)  = M(2)*(2*S(PE) - M(2) - 1) + 2*(S(2) - 1)
1476	     L(1)  = M(1)*(2*S(2) - M(1) - 1)
1477	     X(PE) = 2*(S(PE) - 1)
1478	     X(2)  = 2*M(2)*(2*S(PE) - M(2) - 1) + 2*(S(2) - 1)
1479	     X(1)  = 2*M(1)*(2*S(2) - M(1) - 1)

1481	     Note that in the computation of X(2) the hierarchical LSPs only add
1482	     one state at each P(2) node.

1484	     For the same example with S(1) = 10, M(1) = 10, and M(2) = 10, this
1485	     gives a comparison table as follows:

1487	        Count |   2-Layer   |  MP2P
1488	              |  Hierarchy  |
1489	        ------+-------------+----------
1490	        X(PE) |     1998    |   3996
1491	        X(2)  |    39978    |  11000
1492	        X(1)  |     3780    |  11800

1494	     And we can observe that the MP2P model is better at P(2), but the
1495	     hierarchical model is better at P(1).

1497	   In fact, this comparison can be generalized to observe that the MP2P
1498	   model produces best effects toward the edge of the network, while the
1499	   hierarchical model makes most impression at the core. However, the
1500	   requirement for disambiguation of P2P LSPs tunneled within the MP2P
1501	   LSPs does cause a double burden at the PEs.

1503	8.4 Scaling Improvements for Ladder Networks

1505	   MP2P LSPs applied just within the ladder will not make a significant
1506	   difference, but applying MP2P for all LSPs and at all nodes makes a
1507	   very big difference without requiring any further configuration.

1509	   LSP state at a spar node may be divided into those LSPs segments that
1510	   enter or leave the spar node due to subtended PEs (local LSP
1511	   segments), and those that enter or leave the spar node due to remote
1512	   PEs (remote segments).

1514	   The local segments may be counted as:

1516	   o  E LSPs targeting local PEs
1517	   o  (S(1)-1)*E*M(1) LSPs targeting remote PEs

1519	   The remote segments may be counted as:

1521	   o  (S(1)-1)*E outgoing LSPs targeting remote PEs
1522	   o  <= 3*S(1)*E incoming LSPs targeting any PE (there are precisely
1523	      P(1) nodes attached to any other P(1) node).

1525	   Hence, using X(1) as a measure of LSP state rather than a count of
1526	   LSPs, we get:

1528	   X(1) <= E + (S(1)-1)*E*M(1) + (S(1)-1)*E + 3*S(1)*E
1529	        <= (4 + M(1))*S(1)*E - M(1)*E

1531	   The number of LSPs at the P(2) nodes is also improved. We may also
1532	   count the LSP state in the same way so that there are:

1534	   o  M(2) LSPs targeting local PEs
1535	   o  M(2)*(S(1)*E) LSPs from local PEs to all other PEs
1536	   o  S(1)*E - M(2) LSPs to remote PEs.

1538	   So using X(2) as a measure of LSP state and not a count of LSPs, we
1539	   have:

1541	   X(2) = M(2) + M(2)*(S(1)*E) + S(1)*E - M(2)
1542	        = (M(2) + 1)*S(1)*E

1544	   Our examples from Section 5.2 give us the following numbers:

1546	   With S(1) = 6, M(1) = 10, and M(2) = 17, we see:
1547	      E     = 170
1548	      S(PE) = 1020
1549	      X(PE) = 2038
1550	      X(2)  = 18360
1551	      X(1) <= 12580

1553	   Alternatively, with S(1) = 10, M(1) = 10, and M(2) = 20, we see:
1554	      E     = 200
1555	      S(PE) = 2000
1556	      X(PE) = 3998
1557	      X(2)  = 42000
1558	      X(1) <= 26000

1560	8.4.1 Comparison with Other Scenarios

1562	   The use of MP2P compares very favorably with all scaling scenarios.
1563	   It is the only technique able to reduce the value of X(2), and it
1564	   does this by a factor of almost two. The impact on X(1) is better
1565	   than everything except the three level hierarchy.

1567	   The following table provides a quick cross-reference for the figures
1568	   for the example ladder networks. Note that the previous figures are
1569	   modified to provide counts of LSP state rather than LSP numbers.
1570	   Again, each LSP contributes one state at its end points, and two
1571	   states at transit nodes.

1573	   Thus, for the all cases we have

1575	     X(PE) = 2*(S(PE) - 1) or
1576	     X(PE) = 4*(S(PE) - 1) if disambiguation is required.

1578	   In the unmodified (flat) case, we have:

1580	     X(2) = 2*(M(2)*(2*S(PE) - M(2) - 1))
1581	     X(1) = 2*(M(1)*M(2)*(2*S(PE) - M(2)*(M(1) + 1)))

1583	   In the 2-level hierarchy, we have:

1585	     X(2) = 2*(2*M(2)*(S(PE) - 1) - M(2)*(M(2) - 1))
1586	     X(1) = S(1)*S(1) + 2*S(1) + 4*E*E*(S(1) - 1) - 2*E*M(2) - 2

1588	   In the 3-level hierarchy, we have:

1590	     X(2) = 2*(2*M(2)*(S(PE) - 1) - M(2)*(M(2) - 1)) + 2*(S(1)*M(1) - 1)
1591	     X(1) = S(1)*S(1) + 2*S(1) + 4*M(1)*M(1)*S(1) - 2*M(1)(M(1) + 1) - 2

1593	   Example A: S(1) = 6, M(1) = 10, and M(2) = 17
1594	   Example B: S(1) = 10, M(1) = 10, and M(2) = 20

1596	   Example| Count | Unmodified |  2-Level   |  3-Level    |  MP2P
1597	          |       |            | Hierarchy  | Hierarchy   |
1598	   -------+-------+------------+------------+-------------+-------
1599	   A      | X(2)  |     68748  |    68748   |    68866    |  18360
1600	          | X(1)  |   1554820  |   572266   |     2226    |  12580
1601	   -------+-------+------------+------------+-------------+-------
1602	   B      | X(2)  |    159160  |   159160   |   159358    |  42000
1603	          | X(1)  |   5032000  |  1433998   |     3898    |  26000

1605	8.4.2 LSP State Compared with LSP Numbers

1607	   Recall that in Section 8.3, the true benefit of MP2P was analyzed
1608	   with respect to the LSP segment state required, rather than the
1609	   actual number of LSPs. This proved to be a more accurate comparison
1610	   of the techniques because the MP2P LSPs require state on each branch
1611	   of the LSP so the saving is not linear with the reduced number of
1612	   LSPs.

1614	   A similar analysis could be performed here for the ladder network.
1615	   The net effect is that it increases the state by an order of two for
1616	   all transit LSPs in the P2P models, and by a multiplier equal to the
1617	   degree of a node in the MP2P model.

1619	   A rough estimate shows that, as with snowflake networks, MP2P
1620	   provides better scaling than the 1-level hierarchical model, and is
1621	   considerably better at the core. But MP2P compares less will with the
1622	   2-level hierarchy especially in the core.

1624	8.5 Issues with MP2P LSPs

1626	   The biggest challenges for MP2P LSPs are the provision of support in
1627	   the control and data planes. To some extent support must also be
1628	   provided in the management plane.

1630	   Control plane support is just a matter of defining the protocols and
1631	   procedures [MP2P-RSVP], although it must be clearly understood that
1632	   this will introduce some complexity to the control plane.

1634	   Hardware issues may be a little more tricky. For example, the
1635	   capacity of the upstream segments must never (allowing for
1636	   statistical over-subscription) exceed the capacity of the downstream
1637	   segment. Similarly, data planes must be equipped with sufficient
1638	   buffers to handle incoming packet collisions.

1640	   The management plane will be impacted in several ways. Firstly the
1641	   management applications will need to handle LSPs with multiple
1642	   senders. This means that, although the applications need to process
1643	   fewer LSPs, they will be more complicated and will, in fact, need to
1644	   process the same number of ingresses and egresses. Other issues like
1645	   diagnostics and OAM would also need to be enhanced to support MP2P,
1646	   but might be borrowed heavily from LDP networks.

1648	   Lastly, note that when the MP2P solution is used, the receiver (the
1649	   single egress PE of an MP2P tunnel) cannot use the incoming label as
1650	   an indicator of the source of the data. Contrast this with P2P LSPs.
1651	   Depending on deployment, this might not be an issue since the PE-PE
1652	   connectivity may in any case be a tunnel with inner labels to
1653	   discriminate the data flows.

1655	   In other deployments, it may be considered necessary to include
1656	   additional PE-PE P2P LSPs and tunnel these through the MP2P LSPs.
1657	   This would require the PEs to support twice as many LSPs. Since PEs
1658	   are not usually as fully specified as P-routers, this may cause some
1659	   concern although the use of penultimate hop popping on the MP2P LSPs
1660	   might help to reduce this issue.

1662	   In all cases, care must be taken not to confuse the reduction in the
1663	   number of LSPs with a reduction in the LSP state that is required. In
1664	   fact, the discussion in Section 8.3 is slightly optimistic since
1665	   LSP state toward the destination will probably need to include
1666	   sender information and so will increase depending on the number of
1667	   senders for the MP2P LSP. Section 8.4, on the other hand, counts LSP
1668	   state rather than LSPs. This issue is clearly dependent on the
1669	   protocol solution for MP2P RSVP-TE which is out of scope for this
1670	   document.

1672	   MPLS Fast Re-route (FRR) [RFC4090] is an attractive scheme for
1673	   providing rapid local protection from node or link failures. Such a
1674	   scheme has, however, not been designed for MP2P at the time of
1675	   writing, so it remains to be seen how practical it would be,
1676	   especially in the case of the failure of a merge node. Initial
1677	   examination of this case suggests that FRR would not be a problem
1678	   for MP2P given that each flow can be handled separately.

1680	   As a final note, observe that the MP2P scenario presented in this
1681	   document may be optimistic. MP2P LSP merging may be hard to achieve
1682	   between LSPs with significantly different traffic and QoS parameters.
1683	   Therefore, it may be necessary to increase the number of MP2P LSPs
1684	   arriving at an egress.

1686	9. Combined Models

1688	   There is nothing to prevent the combination of hierarchical and MP2P
1689	   solutions within a network.

1691	   Note that if MP2P LSPs are tunneled through P2P FA LSPs across the
1692	   core, none of the benefit of LSP merging is seen for the hops during
1693	   which the MP2P LSPs are tunneled.

1695	   On the other hand, it is possible to construct solutions where MP2P
1696	   FA LSPs are constructed within the network resulting in savings from
1697	   both modes of operation.

1699	10. An Alternate Solution

1701	   A simple solution to reducing the number of LSP tunnels handled by
1702	   any node in the network has been proposed. In this solution it is
1703	   observed that part of the problem is caused purely by the total
1704	   number of LSP in the network, and that this is a function of the
1705	   number of PEs since a full mesh of PE-PE LSPs is required. The
1706	   conclusion of this observation is to move the tunnel end-points
1707	   further into the network so that, instead of having a full mesh of
1708	   PE-PE tunnels, we have only a full mesh of P(n)-P(n) tunnels.

1710	   Obviously, there is no change in the physical network topology, so
1711	   the PEs remain subtended to the P(n) nodes, and the consequence is
1712	   that there is no TE on the links between PEs and P(n) nodes.

1714	   In this case we have already done the hard work for computing the
1715	   number of LSP in the previous sections. The power of the analysis in
1716	   the earlier sections is demonstrated by its applicability to this new
1717	   model - all we need to do is make minor changes to the formulae. This
1718	   is most simply done by removing a layer from the network. We
1719	   introduce the term "tunnel end-point" (TEP), and replace the P(n)
1720	   nodes with TEPs. Thus, the example of a flat snowflake network in
1721	   Figure 3 becomes as shown in Figure 7. Corresponding changes can be
1722	   made to all of the sample topologies.

1724	       PE    PE  PE     PE  PE     PE
1725	         \     \/         \/      /
1726	      PE--TEP  TEP        TEP  TEP--PE
1727	             \ |            | /
1728	              \|            |/
1729	     PE--TEP---P(1)------P(1)---TEP--PE
1730	        /          \    /          \
1731	      PE            \  /            PE
1732	                     \/
1733	                     P(1)
1734	                     /|\
1735	                    / | \
1736	                   /  |  \
1737	            PE--TEP  TEP  TEP--PE
1738	               /      /\     \
1739	             PE     PE  PE    PE

1741	   Figure 7 : An Example Snowflake Network With Tunnel End-Points
1742	   To perform the scaling calculations we need only replace the PE
1743	   counts in the formulae with TEP counts, and observe that there is
1744	   one fewer layer in the network. For example, in the flat snowflake
1745	   network shown in Figure 7 we can see that the number of LSPs seen
1746	   at a TEP is:

1748	   L(TEP) = 2*(S(TPE) - 1)

1750	   In our sample networks S(TPE) is typically of the order of 50 or 100
1751	   (the original values of S(2)), so L(TEP) is less than 200 which is
1752	   quite manageable.

1754	   Similarly, the number of LSPs handled by a P(1) node can be derived
1755	   from the original formula for the number of LSPs seen at a P(2) node
1756	   since all we have done is reduce n in P(n) from 2 to 1. So our new
1757	   formula is:

1759	   L(1) = M(1)*(2*S(TEP) - M(1) - 1)

1761	   With figures for M(1) = 10 and S(TEP) = 100, this gives us
1762	   L(1) = 1890. This is also very manageable.

1764	10.1 Pros and Cons of the Alternate Solution

1766	   On the face of it, this alternate solution seems very attractive.
1767	   Simply by contracting the edges of the tunnels into the network, we
1768	   have shown a dramatic reduction in the number of tunnels needed, and
1769	   there is no requirement to apply any additional scaling techniques.

1771	   But what of the PE-P(n) links? In the earlier sections of this
1772	   document we have assumed that there was some requirement for PE-PE
1773	   LSPs with TE properties that extended to the PE-P(n) links at both
1774	   ends of each LSP. That means that there was a requirement to provide
1775	   reservation-based QoS on those links, to be able to discriminate
1776	   traffic flows for priority-based treatment, and to be able to
1777	   distinguish applications and sources that send data based on the LSPs
1778	   that carry the data.

1780	   It might be argued that, since the PE-P(n) links do not offer any
1781	   routing options (each such link provides the only access to the
1782	   network for a PE), most of the benefits of tunnels are lost on these
1783	   peripheral links. However, TE is not just about routing. Just as
1784	   important are the abilities to make resource reservations, to
1785	   prioritize traffic, and to discriminate between traffic from
1786	   different applications, customers, or VPNs.

1788	   Furthermore, in multihoming scenarios where each PE is connected to
1789	   more than one P(n) or where a PE has multiple links to a single P(n),
1790	   there may be a desire to preselect the link to be used and to direct
1791	   the traffic to that link using a PE-PE LSP. Note that multihoming
1792	   has not been considered in this document.

1794	   Operationally, P(n)-P(n) LSPs offer the additional management
1795	   overhead that is seen for hierarchical LSPs described in Section 6.
1796	   That is, the LSPs have to be configured and established through
1797	   additional configuration or management operations that are not
1798	   carried out at the PEs. As described in Section 6, automesh [RFC4972]
1799	   could be used to ease this task. But it must be noted that, as
1800	   mentioned above, some of the key uses of tunnels require that traffic
1801	   is classified and placed in an appropriate tunnel according to its
1802	   traffic class, end-points, originating application, and customer
1803	   (such as client VPN). This information may not be readily available
1804	   for each packet at the P(n) nodes since it is PE-based information.
1805	   Of course, it is possible to conceive of techniques to make this
1806	   information available such as assigning a different label for each
1807	   class of traffic, but this gives rise to the original problems of
1808	   larger numbers of LSPs.

1810	   Our conclusion is, therefore, that this alternate technique may be
1811	   suitable for the general distribution of traffic based solely on the
1812	   destination, or on a combination of the destination and key fields
1813	   carried in the IP header. In this case it can provide a very
1814	   satisfactory answer to the scaling issues in an MPLS-TE network. But
1815	   if more sophisticated packet classification and discrimination is
1816	   required, this technique will make the desired function hard to
1817	   achieve, and the trade-off between scaling and feature-level will
1818	   swing too far towards solving the scaling issue at the expense of
1819	   delivery of function to the customer.

1821	11. Management Considerations

1823	   The management issues of the models presented in this document
1824	   have been discussed in line. No one solution is without its
1825	   management overhead.

1827	   Note, however, that scalability of management tools is one of the
1828	   motivators for this work and that network scaling solutions that
1829	   reduce the active management of LSPs at the cost of additional effort
1830	   to manage the more static elements of the network represent a
1831	   benefit. That is, it is worth the additional effort to set up MP2P or
1832	   FA LSPs if it means that the network can be scaled to a larger size
1833	   without being constrained by the management tools.

1835	   The MP2P technique may prove harder to debug through OAM methods than
1836	   the FA LSP approach.

1838	12. Security Considerations

1840	   The techniques described in this document use existing or
1841	   yet-to-be-defined signaling protocol extensions and are subject to
1842	   the security provided by those extensions. Note that we are talking
1843	   about tunneling techniques used within the network and both
1844	   approaches are vulnerable to the creation of bogus tunnels that
1845	   deliver data to an egress or consume network resources.

1847	   The fact that the MP2P technique may prove harder to debug through
1848	   OAM methods than the FA LSP approach is a security concern since it
1849	   is important to be able to detect misconnections.

1851	   General issues of the realtionship between scaling and security are
1852	   covered in Section 1.1, but the details are beyond the scope of this
1853	   document. Readers are refered to [MPLS-SEC] for details of MPLS
1854	   security techniques.

1856	13. Recommendations

1858	   The analysis in this document suggests that the ability to signal
1859	   MP2P MPLS-TE LSPs is a desirable addition to the operator's MPLS-TE
1860	   toolkit.

1862	   At this stage, no further recommendations are made, but it would be
1863	   valuable to consult more widely to discover:

1865	   - The concerns of other service providers with respect to network
1866	     scalability.

1868	   - More opinions on the realistic constraints to the network
1869	     parameters listed in Section 4.

1871	   - Desirable values for the cost-effectiveness of the network
1872	     (parameter K)

1874	   - The applicability, manageability and support for the two techniques
1875	     described

1877	   - The feasibility of combining the two techniques as discussed in
1878	     Section 9.

1880	   - The level of concern over the loss of functionality that would
1881	     occur if the alternate solution described in Section 10 was
1882	     adopted.

1884	14. IANA Considerations

1886	   This document makes no requests for IANA action.

1888	15. Acknowledgements

1890	   The authors are grateful to Jean-Louis Le Roux for discussions and
1891	   review input. Thanks to Ben Niven-Jenkins, JP Vasseur, Loa Andersson,
1892	   Anders Gavler, and Tom Polk for their comments. Thanks to Dave Allen
1893	   for useful discussion of the math.

1895	16. Intellectual Property Consideration

1897	   The IETF Trust takes no position regarding the validity or scope of
1898	   any Intellectual Property Rights or other rights that might be
1899	   claimed to pertain to the implementation or use of the technology
1900	   described in any IETF Document or the extent to which any license
1901	   under such rights might or might not be available; nor does it
1902	   represent that it has made any independent effort to identify any
1903	   such rights.

1905	   Copies of Intellectual Property disclosures made to the IETF
1906	   Secretariat and any assurances of licenses to be made available, or
1907	   the result of an attempt made to obtain a general license or
1908	   permission for the use of such proprietary rights by implementers or
1909	   users of this specification can be obtained from the IETF on-line IPR
1910	   repository at http://www.ietf.org/ipr

1912	   The IETF invites any interested party to bring to its attention any
1913	   copyrights, patents or patent applications, or other proprietary
1914	   rights that may cover technology that may be required to implement
1915	   any standard or specification contained in an IETF Document. Please
1916	   address the information to the IETF at ietf-ipr@ietf.org.

1918	   The definitive version of an IETF Document is that published by, or
1919	   under the auspices of, the IETF. Versions of IETF Documents that are
1920	   published by third parties, including those that are translated into
1921	   other languages, should not be considered to be definitive versions
1922	   of IETF Documents. The definitive version of these Legal Provisions
1923	   is that published by, or under the auspices of, the IETF. Versions of
1924	   these Legal Provisions that are published by third parties, including
1925	   those that are translated into other languages, should not be
1926	   considered to be definitive versions of these Legal Provisions.

1928	   For the avoidance of doubt, each Contributor to the IETF Standards
1929	   Process licenses each Contribution that he or she makes as part of
1930	   the IETF Standards Process to the IETF Trust pursuant to the
1931	   provisions of RFC 5378. No language to the contrary, or terms,
1932	   conditions or rights that differ from or are inconsistent with the
1933	   rights and licenses granted under RFC 5378, shall have any effect and
1934	   shall be null and void, whether published or posted by such
1935	   Contributor, or included with or in such Contribution.

1937	17. Normative References

1939	   [RFC4206]   Kompella, K. and Y. Rekhter, "Label Switched Paths (LSP)
1940	               Hierarchy with Generalized Multi-Protocol Label
1941	               Switching (GMPLS) Traffic Engineering (TE)", RFC 4206,
1942	               October 2005.

1944	18. Informative References

1946	   [RFC2961]   Berger, L., Gan, D., Swallow, G., Pan, P., Tommasi, F.,
1947	               and S. Molendini, "RSVP Refresh Overhead Reduction
1948	               Extensions", RFC 2961, April 2001.

1950	   [RFC3209]   Awduche, D., Berger, L., Gan, D., Li, T., Srinivasan, V.,
1951	               and G. Swallow, "RSVP-TE: Extensions to RSVP for LSP
1952	               Tunnels", RFC 3209, December 2001.

1954	   [RFC3270]   Le Faucheur, F., et al., "Multi-Protocol Label Switching
1955	               (MPLS) Support of Differentiated Services", RFC 3270, May
1956	               2002.

1958	   [RFC3473]   Berger, L., et al., Editor, "Generalized Multi-Protocol
1959	               Label Switching (GMPLS) Signaling Resource ReserVation
1960	               Protocol-Traffic Engineering (RSVP-TE) Extensions",
1961	               RFC 3473, January 2003.

1963	   [RFC3985]   Bryant, S., and Pate, P., "Pseudo Wire Emulation Edge-to-
1964	               Edge (PWE3) Architecture", RFC 3985, March 2005.

1966	   [RFC4090]   P. Pan, G. Swallow, A. Atlas (Editors), "Fast Reroute
1967	               Extensions to RSVP-TE for LSP Tunnels", RFC 4090, May
1968	               2005.

1970	   [RFC4110]   Callon, R., and Suzuki, M., "A Framework for Layer 3
1971	               Provider-Provisioned Virtual Private Networks (PPVPNs)",
1972	               RFC 4110, July 2005.

1974	   [RFC4972]   Vasseur, JP., and Le Roux, JL., "Routing Extensions for
1975	               Discovery of Multiprotocol (MPLS) Label Switch Router
1976	               (LSR) Traffic Engineering (TE) Mesh Membership",
1977	               RFC 4972, July 2007.

1979	   [RFC5036]   Andersson, L., Doolan, P., Feldman, N., Fredette, A., and
1980	               B. Thomas, "LDP Specification", RFC 5036, October 2007.

1982	   [MP2P-RSVP] Yasukawa, Y., "Supporting Multipoint-to-Point Label
1983	               Switched Paths in Multiprotocol Label Switching Traffic
1984	               Engineering", draft-yasukawa-mpls-mp2p-rsvpte, work in
1985	               progress.

1987	   [MPLS-SEC]  Fang, L. (Ed.), "Security Framework for MPLS and GMPLS
1988	               Networks", draft-ietf-mpls-mpls-and-gmpls-security-
1989	               framework, work in progress.

1991	19. Authors' Addresses

1993	   Seisho Yasukawa
1994	   NTT Corporation
1995	   9-11, Midori-Cho 3-Chome
1996	   Musashino-Shi, Tokyo 180-8585 Japan
1997	   Phone: +81 422 59 4769
1998	   EMail: s.yasukawa@hco.ntt.co.jp

2000	   Adrian Farrel
2001	   Old Dog Consulting
2002	   EMail: adrian@olddog.co.uk

2004	   Olufemi Komolafe
2005	   Cisco Systems
2006	   96 Commercial Street
2007	   Edinburgh
2008	   EH6 6LX
2009	   United Kingdom
2010	   EMail: femi@cisco.com

2012	20. Disclaimer of Validity

2014	   All IETF Documents and the information contained therein are provided
2015	   on an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE
2016	   REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE
2017	   IETF TRUST AND THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL
2018	   WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY
2019	   WARRANTY THAT THE USE OF THE INFORMATION THEREIN WILL NOT INFRINGE
2020	   ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS
2021	   FOR A PARTICULAR PURPOSE.

2023	21. Full Copyright Statement

2025	   Copyright (c) 2008 IETF Trust and the persons identified as the
2026	   document authors. All rights reserved.

2028	   This document is subject to BCP 78 and the IETF Trust's Legal
2029	   Provisions Relating to IETF Documents
2030	   (http://trustee.ietf.org/license-info) in effect on the date of
2031	   publication of this document. Please review these documents
2032	   carefully, as they describe your rights and restrictions with respect
2033	   to this document.