idnits 2.17.1 draft-ietf-mpls-te-scaling-analysis-05.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- ** The document seems to lack a License Notice according IETF Trust Provisions of 28 Dec 2009, Section 6.b.i or Provisions of 12 Sep 2009 Section 6.b -- however, there's a paragraph with a matching beginning. Boilerplate error? -- It seems you're using the 'non-IETF stream' Licence Notice instead Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- No issues found here. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year -- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. If not, you may need to add the pre-RFC5378 disclaimer. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- Couldn't find a document date in the document -- date freshness check skipped. Checking references for intended status: Informational ---------------------------------------------------------------------------- == Missing Reference: '4X' is mentioned on line 879, but not defined Summary: 1 error (**), 0 flaws (~~), 2 warnings (==), 3 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 1 Network Working Group S. Yasukawa 2 Internet-Draft NTT 3 Intended Status: Informational A. Farrel 4 Created: December 14, 2008 Old Dog Consulting 5 Expires: June 14, 2009 O. Komolafe 6 Cisco Systems 8 An Analysis of Scaling Issues in MPLS-TE Core Networks 10 draft-ietf-mpls-te-scaling-analysis-05.txt 12 Status of this Memo 14 This Internet-Draft is submitted to IETF in full conformance with 15 the provisions of BCP 78 and BCP 79. 17 Internet-Drafts are working documents of the Internet Engineering 18 Task Force (IETF), its areas, and its working groups. Note that 19 other groups may also distribute working documents as Internet- 20 Drafts. 22 Internet-Drafts are draft documents valid for a maximum of six months 23 and may be updated, replaced, or obsoleted by other documents at any 24 time. It is inappropriate to use Internet-Drafts as reference 25 material or to cite them other than as "work in progress." 27 The list of current Internet-Drafts can be accessed at 28 http://www.ietf.org/ietf/1id-abstracts.txt. 30 The list of Internet-Draft Shadow Directories can be accessed at 31 http://www.ietf.org/shadow.html. 33 Abstract 35 Traffic engineered Multiprotocol Label Switching (MPLS-TE) is 36 deployed in providers' core networks. As providers plan to grow these 37 networks, they need to understand whether existing protocols and 38 implementations can support the network sizes that they are planning. 40 This document presents an analysis of some of the scaling concerns 41 for the number of Label Switching Paths (LSPs) in MPLS-TE core 42 networks, and examines the value of two techniques (LSP hierarchies, 43 and multipoint-to-point LSPs) for improving scaling. The intention is 44 to motivate the development of appropriate deployment techniques and 45 protocol extensions to enable the application of MPLS-TE in large 46 networks. 48 This document only considers the question of achieving scalability 49 for the support of point-to-point MPLS-TE LSPs. Point-to-multipoint 50 MPLS-TE LSPs are for future study. 52 Contents 53 1. Introduction .................................................. 3 54 1.1 Overview ..................................................... 3 55 1.2 Glossary of Notation ......................................... 4 56 2. Issues of Concern for Scaling ................................. 5 57 2.1 LSP State ................................................... 5 58 2.2 Processing Overhead .......................................... 5 59 2.3 RSVP-TE Implications ......................................... 6 60 2.4 Management ................................................... 7 61 3. Network Topologies ............................................ 7 62 3.1 The Snowflake Network Topology ............................... 8 63 3.2 The Ladder Network Topology .................................. 10 64 3.3 Commercial Drivers for Selected Configurations ............... 12 65 3.4 Other Network Topologies ..................................... 13 66 4. Required Network Sizes ........................................ 14 67 4.1 Practical Numbers ............................................ 14 68 5. Scaling in Flat Networks ...................................... 14 69 5.1 Snowflake Networks ........................................... 15 70 5.2 Ladder Networks .............................................. 16 71 6. Scaling Snowflake Networks with Forwarding Adjacencies ........ 19 72 6.1 Two Layer Hierarchy .......................................... 20 73 6.1.1 Tuning the Network Topology to Suit the Two Layer Hierarchy 21 74 6.2 Alternative Two Layer Hierarchy .............................. 21 75 6.3 Three Layer Hierarchy ........................................ 22 76 6.4 Issues with Hierarchical LSPs ............................... 23 77 7. Scaling Ladder Networks with Forwarding Adjacencies............ 24 78 7.1 Two Layer Hierarchy .......................................... 24 79 7.2 Three Layer Hierarchy ........................................ 25 80 7.3 Issues with Hierarchical LSPs ................................ 26 81 8. Scaling Improvements Through Multipoint-to-Point LSPs ......... 26 82 8.1 Overview of MP2P LSPs ........................................ 27 83 8.2 LSP State: A Better Measure of Scalability ................... 27 84 8.3 Scaling Improvements for Snowflake Networks .................. 28 85 8.3.1 Comparison with Other Scenarios ............................ 29 86 8.4 Scaling Improvements for Ladder Networks ..................... 30 87 8.4.1 Comparison with Other Scenarios ............................ 31 88 8.4.2 LSP State Compared with LSP Numbers ........................ 33 89 8.5 Issues with MP2P LSPs ........................................ 33 90 9. Combined Models ............................................... 34 91 10. An Alternate Solution ........................................ 35 92 10.1 Pros and Cons of the Alternate Solution ..................... 36 93 11. Management Considerations .................................... 37 94 12. Security Considerations ...................................... 37 95 13. Recommendations .............................................. 38 96 14. IANA Considerations .......................................... 38 97 15. Acknowledgements ............................................. 38 98 16. Intellectual Property Consideration .......................... 39 99 17. Normative References ......................................... 40 100 18. Informative References ....................................... 40 101 19. Authors' Addresses ........................................... 41 103 1. Introduction 105 Network operators and service providers are examining scaling issues 106 as they look to deploy ever-larger traffic engineered Multiprotocol 107 Label Switching (MPLS-TE) networks. Concerns have been raised about 108 the number of Label Switched Paths (LSPs) that need to be supported 109 at the edge and at the core of the network. The impact on control 110 plane and management plane resources threatens to outweigh the 111 benefits and popularity of MPLS-TE, while the physical limitations of 112 the routers may constrain the deployment options. 114 Historically, it has been assumed that all MPLS-TE scaling issues 115 can be addressed using hierarchical LSP [RFC4206]. However, analysis 116 shows that the improvement gained by LSP hierarchies is not as 117 significant in all topologies and at all points in the network as 118 might have been presumed. Further, additional management issues are 119 introduced to determine the end-points of the hierarchical LSPs and 120 to operate them. Although this does not invalidate the benefits of 121 LSP hierarchies, it does indicate that additional techniques may be 122 desirable in order to fully scale MPLS-TE networks. 124 This document examines the scaling properties of two generic MPLS-TE 125 network topologies, and investigates the benefits of two scaling 126 techniques. 128 1.1 Overview 130 Physical topology scaling concerns are addressed by building networks 131 that are not fully meshed. Network topologies tend to be meshed in 132 the core, but tree-shaped at the edges giving rise to a snow-flake 133 design. Alternatively, the core may be more of a ladder shape with 134 tree-shaped edges. 136 MPLS-TE, however, establishes a logical full mesh between all edge 137 points in the network, and this is where the scaling problems arise 138 since the structure of the network tends to focus a large number of 139 LSPs within the core of the network. 141 This document presents two generic network topologies: the snow-flake 142 and the ladder, and attempts to parameterize the networks by making 143 some generalities. It introduces terminology for the different 144 scaling parameters and examines how many LSPs might be required to be 145 carried within the core of a network. 147 Two techniques (hierarchical LSPs, and multipoint-to-point LSPs) are 148 introduced and an examination is made of the scaling benefits that 149 they offer as well as of some of the concerns with using these 150 techniques. 152 Of necessity, this document makes many generalizations. Not least 153 among these are a set of assumptions about the symmetry and 154 connectivity of the physical network. It is hoped that these 155 generalizations will not impinge on the usefulness of the overview of 156 the scaling properties that this document attempts to give. Indeed, 157 the symmetry of the example topologies tends to highlight the 158 scaling issues of the different solution models, and this may be 159 useful in exposing the worst case scenarios. 161 Although protection mechanisms like Fast Re-route (FRR) [RFC4090] are 162 briefly discussed, the main body of this document considers stable 163 network cases. It should be noted that make-before-break 164 re-optimisation after link failure may result in a significant number 165 of 'duplicate' LSPs. This issue is not addressed in this document. 167 It should also be understood that certain deployment models where 168 separate traffic engineered LSPs are used to provide different 169 services such as layer 3 VPNs [RFC4110] or pseudowires [RFC3985], or 170 different classes of service [RFC3270], may result in 'duplicate' or 171 'parallel' LSPs running between any pair of provider edge nodes 172 (PEs). This scaling factor is also not considered in this document, 173 but may be easily applied as a linear factor by the reader. 175 The operation of security mechanisms in MPLS-TE networks [MPLS-SEC] 176 may have an impact on the ability of the network to scale. For 177 example, they may increase both the size and number of control plane 178 messages. Additionally, they may increase the processing overhead as 179 control plane messages are subject to processing algorithms (such as 180 encryption), and security keys need to be managed. Deployers will 181 need to consider the trade-offs between scaling objectives and 182 security objectives in thier networks and should resist the 183 temptation to respond to a degradation of scaling performance by 184 turning off security techniques that have previously been deemed as 185 necessary. Further analysis of the effects of security measures on 186 scalability are not considered further in this document. 188 This document is designed to help service providers discover whether 189 existing protocols and implementations can support the network sizes 190 that they are planning. To do this, it presents an analysis of some 191 of the scaling concerns for MPLS-TE core networks, and examines the 192 value of two techniques for improving scaling. This should motivate 193 the development of appropriate deployment techniques and protocol 194 extensions to enable the application of MPLS-TE in large networks. 196 This document only considers the question of achieving scalability 197 for the support of point-to-point MPLS-TE LSPs. Point-to-multipoint 198 MPLS-TE LSPs are for future study. 200 1.2 Glossary of Notation 202 This document applies consistent notation to define various 203 parameters of the networks that are analyzed. These terms are defined 204 as they are introduced though-out the document, but are grouped 205 together here for quick reference. Refer to the full definitions in 206 the text for detailed explanations. 208 n A network level. n = 1 is the core of the network. 209 See section 3, for more details of the definition of a level. 210 P(n) A node at level n in the network. 211 S(n) The number of nodes at level n. That is, the number of P(n) 212 nodes. 213 L(n) The number of LSPs seen by a P(n) node. 214 X(n) The number of LSP segment states held by a P(n) node. 215 M(n) The number of P(n+1) nodes subtended to a P(n) node. 216 R The number of rungs in a ladder network. 217 E The number of edge nodes (PEs) subtended below (directly or 218 indirectly) a spar node in a ladder network. 219 K The cost-effectiveness of the network expressed in terms of 220 the ratio of the number of PEs to the number of network nodes. 222 2. Issues of Concern for Scaling 224 This section presents some of the issues associated with the support 225 of LSPs at a Label Switching Router (LSR) or within the network. 226 These issues may mean that there is a limit to the number LSPs that 227 can be supported. 229 2.1 LSP State 231 LSP state is the data (information) that must be stored at an LSR in 232 order to maintain an LSP. Here we refer to the information that is 233 necessary to maintain forwarding plane state and the additional 234 information required when LSPs are established through control 235 plane protocols. While the size of the LSP state is implementation- 236 dependent, it is clear that any implementation will require some data 237 in order to maintain LSP state. 239 Thus LSP state becomes a scaling concern because, as the number of 240 LSPs at an LSR increases, so the amount of memory required to 241 maintain the LSPs increases in direct proportion. Since the memory 242 capacity of an LSR is limited, there is a related limit placed on the 243 number LSPs that can be supported. 245 Note that techniques to reduce the memory requirements (such as data 246 compression) may serve to increase the number of LSPs that can be 247 supported, but this will only achieve a moderate multiplier and may 248 significantly decrease the ability to process the state rapidly. 250 In this document we define X(n) as "The number of LSP segment states 251 held by a P(n) node." This definition observes that an LSR at the end 252 of an LSP only has to maintain state in one direction (i.e., into the 253 network), while a transit LSR must maintain state in both directions 254 (i.e., toward both ends of the LSP). Furthermore, in multipoint-to- 255 point (MP2P) LSPs (see Section 8) a transit LSR may need to maintain 256 LSP state for one downstream segment (toward the destination) and 257 multiple upstream segments (from multiple sources). That is, we 258 define LSP segment state as the state necessary to maintain an LSP in 259 one direction to one adjacent node. 261 2.2 Processing Overhead 263 Depending largely on implementation issues, the number of LSPs 264 passing through an LSR may impact the processing speed for each LSP. 265 For example, control block search times can increase with the number 266 of control blocks to be searched, and even excellent implementations 267 cannot completely mitigate this fact. Thus, since CPU power is 268 constrained in any LSR, there may be a practical limit to the number 269 of LSPs that can be supported. 271 Further processing overhead considerations depend on issues specific 272 to the control plane protocols, and are discussed in the next 273 section. 275 2.3 RSVP-TE Implications 277 Like many connection-oriented signaling protocols, RSVP-TE requires 278 that state is held within the network in order to maintain LSPs. The 279 impact of this is described in Section 2.1. Note that RSVP-TE 280 requires that separate information is maintained for upstream and 281 downstream relationships, but does not require any specific 282 implementation of that state. 284 RSVP-TE is a soft-state protocol which means that protocol messages 285 (refresh messages) must be regularly exchanged between signaling 286 neighbors in order to maintain the state for each LSP that runs 287 between the neighbors. A common period for the transmission (and 288 receipt) of refresh messages is 30 seconds meaning that each LSR must 289 send and receive one message in each direction (upstream and 290 downstream) for every LSP it supports every 30 seconds. This has the 291 potential to be a significant constraint on the scaling of the 292 network, but various improvements [RFC2961] mean that this refresh 293 processing can be significantly reduced allowing an implementation to 294 be optimized to nearly remove all concerns about soft state scaling 295 in a stable network. 297 Observations of existing implementations indicate that there may be a 298 threshold of around 50,000 LSPs above which an LSR struggles to 299 achieve sufficient processing to maintain LSP state. Although refresh 300 reduction [RFC2961] may substantially improve this situation, 301 it has also been observed that under these circumstances the size of 302 the Srefresh may become very large and the processing required may 303 still cause significant disruption to an LSR. 305 Another approach is to increase the refresh time. There is a 306 correlation between the percentage increase in refresh time and the 307 improvement in performance for the LSR. However, it should be noted 308 that RSVP-TE's soft-state nature depends on regular refresh messages, 309 thus a degree of functionality is lost by increasing the refresh 310 time. This loss may be partially mitigated by the use of the RSVP-TE 311 Hello message, and can also be reduced by the use of various GMPLS 312 extensions [RFC3473] such as the use of [RFC2961] message 313 acknowledgements on all messages. 315 RSVP-TE also requires that signaling adjacencies are maintained 316 through the use of Hello message exchanges. Although [RFC3209] 317 suggests that Hello messages should be retransmitted every 5ms, in 318 practice values of around 3 seconds are more common. Nevertheless, 319 the support of Hello messages can represent a scaling limitation on 320 an RSVP-TE implementation since one message must be sent and received 321 to/from each signaling adjacency every time period. This can impose 322 limits on the number of neighbors (physical or logical) that an LSR 323 supports, but does not impact the number of LSPs that the LSR can 324 handle. 326 2.4 Management 328 Another practical concern for the scalability of large MPLS-TE 329 networks is the ability to manage the network. This may be 330 constrained by the available tools, the practicality of managing 331 large numbers of LSPs, and the management protocols in use. 333 Management tools are software implementations. Although such 334 implementations should not constrain the control plane protocols, it 335 is realistic to appreciate that network deployments will be limited 336 by the scalability of the available tools. In practice, most existing 337 tools have a limit to the number of LSPs that they can support. While 338 a Network Management System (NMS) may be able to support a large 339 number of LSPs, the number that can be supported by anElement 340 Management System (EMS) (or the number supported by an NMS per-LSR) 341 is more likely to be limited. 343 Similarly, practical constraints may be imposed by the operation of 344 management protocols. For example, an LSR may be swamped by 345 management protocol requests to read information about the LSPs that 346 it supports, and this might impact its ability to sustain those LSPs 347 in the control plane. OAM, alarms, and notifications can further add 348 to the burden placed on an LSR and limit the number of LSPs it can 349 support. 351 All of these consideration encourage a reduction in the number of 352 LSPs supported within the network and at any particular LSR. 354 3. Network Topologies 356 In order to provide some generic analysis of the potential scaling 357 issues for MPLS-TE networks, this document explores two network 358 topology models. These topologies are selected partly because of 359 their symmetry which makes them more tractable to a formulaic 360 approach, and partly because they represent generalizations of 361 real deployment models. Section 3.3 provides a discussion of the 362 commercial drivers for deployed topologies and gives more analysis 363 of why it is reasonable to consider these two topologies. 365 The first topology is the snowflake model. In this type of network, 366 only the very core of the network is meshed. The edges of the network 367 are formed as trees rooted in the core. 369 The second network topology considered is the ladder model. In this 370 type of network the core of the network is shaped and meshed in the 371 form of a ladder and trees are attached rooted to the edge of the 372 ladder. 374 The sections that follow examine these topologies in detail in order 375 to parameterize them. 377 3.1 The Snowflake Network Topology 379 The snowflake topologies considered in this document are based on a 380 hierarchy of connectivity within the core network. PE nodes have 381 connectivity to P-nodes as shown in Figure 1. There is no direct 382 connectivity between the PEs. Dual homing of PEs to multiple P 383 nodes is not considered in this document although it may be a 384 valuable addition to a network configuration. 386 P 387 /|\ 388 / | \ 389 / | \ 390 / | \ 391 PE PE PE 393 Figure 1 : PE to P Node Connectivity 395 The relationship between P-nodes is also structured in a hierarchical 396 way. Thus, as shown in Figure 2, multiple P-nodes at one level are 397 connected to a P-node at a higher level. We number the levels such 398 that level 1 is the top level (top in our figure, and nearest to the 399 core of the network) and level (n) is immediately above level (n+1), 400 and we denote a P-node at level n as a P(n). 402 As with PEs, there is no direct connectivity between P(n+1) nodes. 403 Again, dual homing of P(n+1) nodes to multiple P(n) nodes is not 404 considered in this document although it may be a valuable addition 405 to a network configuration. 407 P(n) 408 /|\ 409 / | \ 410 / | \ 411 / | \ 412 P(n+1) P(n+1) P(n+1) 414 Figure 2 : Relationship Between P-Nodes 416 At the top level, P(1) nodes are connected in a full mesh. In 417 reality, the level 1 part of the network may be slightly less well 418 connected than this, but assuming a full mesh provides for 419 generality. Thus, the snowflake topology comprises a clique with 420 topologically equivalent trees subtended from each node in the 421 clique. 423 The key multipliers for scalability are the number of P(1) nodes, and 424 the multiplier relationship between P(n) and P(n+1) at each level 425 including down to PEs. 427 We define the multiplier M(n) as the number of P(n+1) nodes at level 428 (n+1) attached to any one P(n). Assume that M(n) is constant for all 429 nodes at level n. Since nodes at the same level are not 430 interconnected (except at the top level), and since each P(n+1) node 431 is connected to precisely one P(n) node, M(n) is one less than the 432 degree of the node at level n (that is, the P(n) node is attached to 433 M(n) nodes at level (n+1) and 1 node at level (n-1)). 435 We define S(n) as the number of nodes at level (n). 437 Thus: S(n) = S(1)*M(1)*M(2)*...*M(n-1) 439 So the number of PEs can be expressed as: 441 S(PE) = S(1)*M(1)*M(2)*...*M(n) 443 where the network has (n) layers of P-nodes. 445 Thus we may depict an example snowflake network as shown in Figure 3. 446 In this case: 448 S(1) = 3 449 M(1) = 3 450 S(2) = S(1)*M(1) = 9 451 M(2) = 2 452 S(PE) = S(1)*M(1)*M(2) = 18 453 PE PE PE PE PE PE 454 \ \/ \/ / 455 PE--P(2) P(2) P(2) P(2)--PE 456 \ | | / 457 \| |/ 458 PE--P(2)---P(1)------P(1)---P(2)--PE 459 / \ / \ 460 PE \ / PE 461 \/ 462 P(1) 463 /|\ 464 / | \ 465 / | \ 466 PE--P(2) P(2) P(2)--PE 467 / /\ \ 468 PE PE PE PE 470 Figure 3 : An Example Snowflake Network 472 3.2 The Ladder Network Topology 474 The ladder networks considered in this network are based on an 475 arrangement of routers in the core network that resembles a ladder. 477 Ladder networks typically have long and thin cores that are arranged 478 as conventional ladders. That is, they have one or more spars 479 connected by rungs. Each node on a spar may have: 481 - connection to one or more other spars 482 - connection to a tree of other core nodes 483 - connection to customer nodes. 485 Figure 4 shows a simplified example of a ladder network. A core of 486 twelve nodes make up two spars connected by six rungs. 488 PE PE PE PE 489 PE PE PE | PE | PE PE PE | PE | PE 490 \| \|/ |/ | \| \|/ 491 PE-P-----P-----P-----P------P-----P--PE 492 | | | | | |\ 493 | | | | | | PE 494 | | | | | | 495 PE-P-----P-----P-----P------P-----P 496 /| /|\ |\ |\ |\ \ 497 PE PE PE | PE | PE | PE | PE PE 498 PE PE PE PE 500 Figure 4 : A Simplified Ladder Network. 502 In practice, not all nodes on a spar (call them Spar-Nodes) need to 503 have subtended PEs. That is, they can exist simply to give 504 connectivity along the spar to other spar nodes, or across a rung to 505 another spar. Similarly, the connectivity between spars can be more 506 complex with multiple connections from one spar-node to another spar. 507 Lastly, the network may be complicated by the inclusion of more than 508 two spars (or simplified by reduction to a single spar). 510 These variables make the ladder network non-trivial to model. For the 511 sake of simplicity we will make the following restrictions: 513 - There are precisely two spars in the core network. 515 - Every spar-node connects to precisely one spar-node on the other 516 spar. That is, each spar-node is attached to precisely one rung. 518 - Each spar-node connects to either one (end-spar) or two (core-spar) 519 other spar-nodes on the same spar. 521 - Every spar-node has the same number of PEs subtended. This does not 522 mean that there are no P-nodes subtended to the spar-nodes, but 523 does mean that the edge tree subtended to each spar-node is 524 identical. 526 From these restrictions we are able to quantify a ladder network as 527 follows: 529 R - The number of rungs. That is, the number of spar-nodes on 530 each spar. 531 S(1) - The number of spar-nodes in the network. S(1)=2*R. 532 E - The number of subtended edge nodes (PEs) to each spar-node. 534 The number of rungs may vary considerably. A number less than 3 is 535 unlikely (since that would not be a significantly connected network), 536 and a number greater than 100 seems improbable (because that would 537 represent a very long, thin network). 539 E can be treated as for the snowflake network. That is, we can 540 consider a number of levels of attachment from P(1) nodes which are 541 the spar-nodes, through P(i) down to P(n) which are the PEs. 542 Practically, we need to only consider n=2 (PEs attached direct to the 543 spar-nodes) and n=3 (one level of P-nodes between the PEs and the 544 spar-nodes). 546 Let M(i) be the ratio of P(i) nodes to P(i-1) nodes, i.e. the 547 connectivity between levels of P-node as defined for the snowflake 548 topology. Hence, the number of nodes at any level (n) is: 550 S(n) = S(1)*M(1)*M(2)*...*M(n-1) 552 So number of PEs subtended to a spar node is: 554 E = M(1)*M(2)*...*M(n) 556 And the number of PEs can be expressed as: 558 S(PE) = S(1)*M(1)*M(2)*...*M(n) 559 = S(1)*E 561 Thus we may depict an example ladder network as shown in Figure 5. 562 In this case: 564 R = 5 565 S(1) = 10 566 M(1) = 2 567 S(2) = S(1)*M(1) = 20 568 M(2) = 2 569 E = M(1)*M(2) = 4 570 S(PE) = S(1)*E = 40 572 PE PE PE PE PE PE PE PE PE PE PE PE PE PE PE PE 573 \| \| \| \| |/ |/ |/ |/ 574 P(2) P(2) P(2) P(2) P(2) P(2) P(2) P(2) 575 \ \ | \ / | / / 576 PE \ \ | \ / | / / PE 577 \ \ \| \/ |/ / / 578 PE-P(2)---P(1)---P(1)---P(1)---P(1)---P(1)---P(2)-PE 579 | | | | | 580 | | | | | 581 | | | | | 582 PE-P(2)---P(1)---P(1)---P(1)---P(1)---P(1)---P(2)-PE 583 / / / | /\ |\ \ \ 584 PE / / | / \ | \ \ PE 585 / / | / \ | \ \ 586 P(2) P(2) P(2) P(2) P(2) P(2) P(2) P(2) 587 /| /| /| /| |\ |\ |\ |\ 588 PE PE PE PE PE PE PE PE PE PE PE PE PE PE PE PE 590 Figure 5 : An Example Ladder Network 592 3.3 Commercial Drivers for Selected Configurations 594 It is reasonable to ask why these two particular network topologies 595 have been chosen. 597 The most important consideration is physical scalability. Each node 598 (label switching router - LSR) is only able to support a limited 599 number of physical interfaces. This necessarily reduces the ability 600 to fully mesh a network and leads to the tree-like structure of the 601 network toward the PEs. 603 A realistic commercial consideration for an operator is the fact that 604 the only revenue-generating nodes in the network are the PEs. Other 605 nodes are needed only to support connectivity and scalability. 606 Therefore, there is a desire to maximize S(PE) while minimizing the 607 sum of S(n) for all values of (n). This could be achieved by 608 minimizing the number of levels, and by maximizing the connectivity 609 at each layer, M(n). Ultimately, however, this would produce a 610 network of just interconnected PEs, which is clearly in conflict with 611 the physical scaling situation. 613 Therefore, the solution calls for a "few" levels with "relatively 614 large" connectivity at each level. We might say that the 615 cost-effectiveness of the network can be stated as: 617 K = S(PE)/(S(1)+S(2) + ... + S(n)) where n is the level above the PEs 619 We should observe, however, that this equation may be naive in that 620 the cost of a network is not actually a function of the number of 621 routers (since a router chasis is often free or low cost), but is 622 really a function of the cost of the line cards which is, itself, a 623 product of the capacity of the line cards. Thus, the relatively high 624 connectivity decreases the cost-effectiveness, while a topology that 625 tends to channel data through a network core tends to demand higher 626 capacity (so more expensive) line cards. 628 A further consideration is the availability of connectivity (usually 629 fibers) between LSR sites. Although it is always possible to lay new 630 fiber, this may not be cost-effective or timely. As much of a problem 631 is likely to be the physical shape and topography of the country in 632 which the network is laid. If the country is 'long and thin' then a 633 ladder network is likely to be used. 635 This document examines the implications for control plane and data 636 plane scalability of this type of network when MPLS-TE LSPs are used 637 to provide full connectivity between all PEs. 639 3.4 Other Network Topologies 641 As explained in Section 1, this document is using two symmetrical 642 and generalized network topologies for simplicity of modelling. In 643 practice there are two other topological considerations. 645 a. Multihoming 646 It is relatively common for a node at level (n) to be attached to 647 more than one node at level (n-1). This is particularly common at 648 PEs that may be connected to more than one P(n). 650 b. Meshing within a level 651 A level in the network will often include links between P-nodes at 652 the same level including the possibility of links between PEs. 654 This may result in a network that looks like a series of 655 concentric circles with spokes. 657 Both of these features are likely to have some impact on the scaling 658 of the networks. However, for the purposes of establishing the 659 ground-rules for scaling, this document restricts itself to the 660 consideration of the symmetrical networks described in Sections 2.1 661 and 2.2. Discussion of other network formats is for future study. 663 4. Required Network Sizes 665 An important question for this evaluation and analysis is the size of 666 the network that operators require. How many PEs are required? What 667 ratio of P to PE is acceptable. How many ports do devices have for 668 physical connectivity? What type of MPLS-TE connectivity between PEs 669 is required? 671 Although presentation of figures for desired network sizes must be 672 treated with caution because history shows that networks grow beyond 673 all projections, it is useful to set some acceptable lower bounds. 674 That is, we can state that we are interested in networks of at least 675 a certain size. 677 The most important features are: 679 - The network should have at least 1000 PEs. 680 - Each pair of PEs should be connected by at least one LSP in each 681 direction. 683 4.1 Practical Numbers 685 In practice, reasonable target numbers are as follows. 687 S(PE) >= 1000 688 Number of levels is 3. That is: 1, 2 and PE. 689 M(2) <= 20 690 M(1) <= 20 691 S(1) <= 100 693 5. Scaling in Flat Networks 695 Before proceeding to examine potential scaling improvements, we need 696 to examine how well the flat networks described in the previous 697 sections scale. 699 Consider the requirement for a full mesh of LSPs linking all PEs. 700 That is, each PE has an LSP to and from each other LSP. Thus, if 701 there are S(PE) PEs in the network, there are S(PE)*(S(PE) - 1) LSPs. 703 Define L(n) as the number of LSPs handled by a level (n) LSR. 705 L(PE) = 2*(S(PE) - 1) 707 5.1 Snowflake Networks 709 There are a total of S(PE) PEs in the network and, since each PE 710 establishes an LSP with every other PE, it would be expected that 711 there are S(PE) - 1 LSPs incoming to each PE and the same number of 712 LSPs outgoing from the same PE, giving a total of 2(S(PE) - 1) on 713 the incident link. Hence, in a snowflake topology (see Figure 3), 714 since there are M(2) PEs attached to each P(2) node, it may tempting 715 to think that L(2), the number of LSPs traversing each P(2) node, is 716 simply 2*(S(PE) - 1)*M(2). However, it should be noted that of the 717 S(PE) - 1 LSPs incoming to each PE, M(2) - 1 originated from nodes 718 attached to the same P(2) node and so this value would count the LSPs 719 between the between the M(2) PEs attached to each P(2) node twice; 720 once when outgoing from the M(2) - 1 other nodes and once when 721 incoming into a particular PE. 723 There are a total of M(2)*(M(2) - 1) LSPs between these M(2) PEs and 724 since this value is erroneously included twice in 2*(S(PE) - 1)*M(2), 725 the correct value is: 727 L(2) = 2*M(2)*(S(PE) - 1) - M(2)*(M(2) - 1) 728 = M(2)*(2*S(PE) - M(2) - 1) 730 An alternative way of looking at this, that proves extensible for the 731 calculation of L(1), is to observe that each PE subtended to a P(2) 732 node has an LSP in each direction to all S(PE) - M(2) PEs in the rest 733 of the system, and there are M(2) such locally subtended PEs; thus 734 2*M(2)*(S(PE) - M(2)). Additionally there are M(2)*(M(2) - 1) LSPs 735 between the locally subtended PEs. So: 737 L(2) = 2*M(2)*(S(PE) - M(2)) + M(2)*(M(2) - 1) 738 = M(2)*(2*S(PE) - M(2) - 1) 740 L(1) can be computed in the same way as this second evaluation of 741 L(2). Each PE subtended below a P(1) node has an LSP in each 742 direction to all PEs not below the P(1) node. There are M(1)*M(2) PEs 743 below the P(1) node, so this accounts for 744 2*M(1)*M(2)*(S(PE) - M(1)*M(2)) LSPs. To this, we need to add the 745 number of LSPs that pass through the P(1) node and that run between 746 the PEs subtended below the P(1). Consider each P(2): it has M(2) PEs 747 that each has an LSP going to all of the PEs subtended to the other 748 P(2) nodes subtended to the P(1). There are M(1) - 1 such other P(2) 749 nodes, and so M(2)*(M(1) - 1) other such PEs. So the number of LSPs 750 from the PEs below a P(2) node is M(2)*M(2)*(M(1) - 1). And there 751 are M(1) P(2) nodes below the P(1), giving rise to a total of 752 M(2)*M(2)*M(1)*(M(1) - 1) LSPs. Thus: 754 L(1) = 2*M(1)*M(2)*(S(PE) - M(1)*M(2)) + M(2)*M(2)*M(1)*(M(1) - 1) 755 = M(1)*M(2)*(2*S(PE) - M(2)*(M(1) + 1)) 757 So, for example, with S(1) = 5, M(1) = 10, and M(2) = 20, we see: 759 S(PE) = 1000 760 L(PE) = 1998 761 L(2) = 39580 762 L(1) = 356000 764 Alternatively, with S(1) = 10, M(1) = 10, and M(2) = 20, we see: 766 S(PE) = 2000 767 L(PE) = 3998 768 L(2) = 79580 769 L(1) = 756000 771 In both examples, the number of LSPs at the core (P(1)) nodes is 772 probably unacceptably large even though there are only a relatively 773 modest number of PEs. In fact, L(2) may even be too large in the 774 second example. 776 5.2 Ladder Networks 778 In ladder networks L(PE) remains the same at 2*(S(PE) - 1). 780 L(2) can be computed using the same mechanism as for the snowflake 781 topology because the subtended tree is the same format. Hence, 783 L(2) = 2*M(2)*(S(PE) - 1) - M(2)*(M(2) - 1) 785 But L(1) requires a different computation because each P(1) not only 786 sees LSPs for the subtended PEs, but is also a transit node for some 787 of the LSPs that cross the core (the core is not fully meshed). 789 Each P(1) sees: 791 o all of the LSPs between locally attached PEs 792 o less the those LSPs between locally attached PEs that can be 793 served exclusively by the attached P(2) nodes 794 o all LSPs between locally attached PEs and remote PEs 795 o LSPs in transit that pass through the P(1). 797 The first three numbers are easily determined and match what we have 798 seen from the snowflake network. They are: 800 o E*(E-1) 801 o M(1)*M(2)*(M(2)-1) = E*(M(2) - 1) 802 o 2*E*E*(S(1) - 1) 803 The number of LSPs in transit is more complicated to compute. It is 804 simplified by not considering the ends of the ladders, but examining 805 an arbitrary segment of the middle of the ladder such as shown in 806 Figure 6. We look to compute and generalize the number of LSPs 807 traversing each core link (labeled a and b in Figure 6) and so 808 determine the number of transit LSPs seen by each P(1). 810 : : : : : : 811 : : : : : : 812 P(2) P(2) P(2) P(2) P(2) P(2) 813 \ | \ / | / 814 \ | \ / | / 815 \| \/ |/ 816 ......P(1)---P(1)---P(1)...... 817 | a | | 818 | |b | 819 | | | 820 ......P(1)---P(1)---P(1)...... 821 /| /\ |\ 822 / | / \ | \ 823 / | / \ | \ 824 P(2) P(2) P(2) P(2) P(2) P(2) 825 : : : : : : 826 : : : : : : 828 Figure 6 : An Arbitrary Section of a Ladder Network 830 Of course, the number of LSPs carried on links a and b in the Figure 831 depends on how LSPs are routed through the core network. But if we 832 assume a symmetrical routing policy and an even distribution of LSPs 833 across all shortest paths, the result is the same. 835 Now we can see that each P(1) sees half of 2a+b LSPs (since each LSP 836 would otherwise be counted twice as it passed through the P(1)) 837 except that some of the LSPs are locally terminated so are only 838 included once in the sum 2a+b. 840 So L(1) = a + b/2 - 841 (locally terminated transit LSPs)/2 + 842 (locally contained LSPs) 844 Thus: 846 L(1) = a + b/2 - 847 2*E*E*(S(1) - 1)/2 + 848 E*(E-1) - E*(M(2) - 1) 849 = a + b/2 + 850 E*E*(2 - S(1)) - E*M(2) 852 So all we have to do is work out a and b. 854 Recall that the ladder length R = S(1)/2, and define X = E*E. 856 Consider the contribution made by all of the LSPs that make n hops on 857 the ladder to the totals of each of a and b. If the ladder was 858 unbounded then we could say that in the case of a, there are n*2X 859 LSPs along the spar only, and n(n-1)*2X/n = 2X(n-1) LSPs use a rung 860 and the spar. Thus, the LSPs that make n hops on the ladder 861 contribute (4n-2)X LSPs to a. Note that the edge cases are special 862 because LSPs that make only one hop on the ladder cannot transit a 863 P(1) but only start or end there. 865 So with a ladder of length R = S(1)/2 we could say: 867 R 868 a = SUM[(4i-2)*X] + 2RX 869 i=2 871 = 2*X*R*(R+1) 873 And similarly, considering b in an unbounded ladder, the LSPs that 874 only travel one hop on the LSP are a special case, contributing 2X 875 LSPs, and each other LSP that traverses n hops on the ladder 876 contributes 2n*2X/n = 4X LSPs. So: 878 R+1 879 b = 2X + SUM[4X] 880 i=2 882 = 2*X + 4*X*R 884 In fact the ladders are bounded and so the number of LSPs is reduced 885 because of the effect of the ends of the ladders. The links that see 886 the most LSPs are in the middle of the ladder. Consider a ladder of 887 length R; a node in the middle of the ladder is R/2 hops away from 888 the end of the ladder. So we see that the formula for the 889 contribution to the count of spar-only LSPs for a is only valid up to 890 n=R/2, and for spar-and-rung LSPs for n=1+R/2. Above these limits the 891 contribution made by spar-only LSPs decays as (n-R/2)*2X. However, 892 for a first order approximation, we will use the values of a and b as 893 computed above. This gives us an upper bound of the number of LSPs 894 without using a more complex formula for the reduction made by the 895 effect of the ends of the ladder. 897 From this: 899 L(1) = a + b/2 + 900 E*E*(2 - S(1)) - E*M(2) 901 = 2*X*R*(R+1) + 902 X + 2*X*R + 903 E*E*(2 - S(1)) - E*M(2) 904 = E*E*S(1)*(1 + S(1)/2) + 905 E*E + E*E*S(1) + 906 2*E+E - E*E*S(1) - E*M(2) 907 = E*E*S(1)*(1 + S(1)/2) + 3*E+E - E*M(2) 908 = E*E*S(1)*S(1)/2 + E*E*S(1) + 3*E*E - E*M(2) 910 So, for example, with S(1) = 6, M(1) = 10, and M(2) = 17, we see: 911 E = 170 912 S(PE) = 1020 913 L(PE) = 2038 914 L(2) = 34374 915 L(1) = 777410 917 Alternatively, with S(1) = 10, M(1) = 10, and M(2) = 20, we see: 918 E = 200 919 S(PE) = 2000 920 L(PE) = 3998 921 L(2) = 79580 922 L(1) = 2516000 924 In both examples, the number of LSPs at the core (P(1)) nodes is 925 probably unacceptably large even though there are only a relatively 926 modest number of PEs. In fact, L(2) may even be too large in the 927 second example. 929 Compare the L(1) values with the total number of LSPs in the sytem 930 S(PE)*(S(PE) - 1) which is 1039380 and 3998000 respectively. 932 6. Scaling Snowflake Networks with Forwarding Adjacencies 934 One of the purposes of LSP hierarchies [RFC4206] is to improve the 935 scaling properties of MPLS-TE networks. LSP tunnels (sometimes known 936 as Forwarding Adjacencies (FAs)) may be established to provide 937 connectivity over the core of the network and multiple edge-to-edge 938 LSPs may be tunneled down a single FA LSP. 940 In our network we consider a mesh of FA LSPs between all core nodes 941 at the same level. We consider two possibilities here. In the first 942 all P(2) nodes are connected to all other P(2) nodes by LSP tunnels, 943 and the PE-to-PE LSPs are tunneled across the core of the network. In 944 the second, an extra layer of LSP hierarchy is introduced by 945 connecting all P(1) nodes in an LSP mesh and tunneling the P(2)-to- 946 P(2) tunnels through these. 948 6.1 Two Layer Hierarchy 950 In this hierarchy model, the P(2) nodes are connected by a mesh of 951 tunnels. This means that the P(1) nodes do not see the PE-to-PE LSPs. 953 It remains the case that: 954 L(PE) = 2*(S(PE) - 1) 956 L(2) is slightly increased. It can be computed as the sum of all LSPs 957 for all attached PEs including the LSPs between the attached PE (this 958 figure is unchanged from Section 5.1: i.e. M(2)*(2*S(PE) - M(2) - 1)) 959 plus the number of FA LSPs providing a mesh to the other P(2) nodes. 960 Since the number of P(2) nodes is S(2), each P(2) node sees 961 2*(S(2) - 1) FA LSPs. Thus: 962 L(2) = M(2)*(2*S(PE) - M(2) - 1) + 2*(S(2) - 1) 964 L(1), however, is significantly reduced and can be computed as the 965 sum of the number of FA LSPs to and from each attached P(2) to each 966 other P(2) in the network, including (but counting only once) the FA 967 LSPs between attached P(2) nodes. In fact, the problem is identical 968 to the L(2) computation in Section 5.1. So: 970 L(1) = M(1)*(2*S(2) - M(1) - 1) 972 So, for example, with S(1) = 5, M(1) = 10, and M(2) = 20, we see: 973 S(PE) = 1000 974 S(2) = 50 975 L(PE) = 1998 976 L(2) = 39678 977 L(1) = 890 979 Alternatively, with S(1) = 10, M(1) = 10, and M(2) = 20, we see: 980 S(PE) = 2000 981 S(2) = 100 982 L(PE) = 3998 983 L(2) = 79778 984 L(1) = 1890 986 So, in both examples, potential problems at the core (P(1)) nodes 987 caused by an excessive number of LSPs can be avoided, but any problem 988 with L(2) is made slightly worse as can be seen from the table below. 990 Example| Count | Unmodified | 2-Layer 991 | | (Section 5.1) | Hierarchy 992 -------+-------+---------------+---------- 993 A | L(2) | 39580 | 39678 994 | L(1) | 356000 | 890 995 -------+-------+---------------+---------- 996 B | L(2) | 79580 | 79778 997 | L(1) | 756000 | 1890 999 6.1.1 Tuning the Network Topology to Suit the Two Layer Hierarchy 1001 Clearly we can reduce L(2) by selecting appropriate values of S(1), 1002 M(1), and M(2). We can do this without negative consequences, since 1003 no change will affect L(PE), and since a large percentage increase in 1004 L(1) is sustainable because L(1) is now so small. 1006 Observe that: 1007 L(2) = M(2)*(2*S(PE) - M(2) - 1) + 2*(S(2) - 1) 1009 where S(PE) = S(1)*M(1)*M(2) and S(2) = S(1)*M(1). So L(2) scales 1010 with M(2)^2 and we can have the most impact by reducing M(2) while 1011 keeping S(PE) constant. 1013 For example, with S(1) = 10, M(1) = 10, and M(2) = 10, we see: 1014 S(PE) = 1000 1015 S(2) = 100 1016 L(PE) = 1998 1017 L(2) = 20088 1018 L(1) = 1890 1020 And similarly, with S(1) = 20, M(1) = 20, and M(2) = 5, we see: 1021 S(PE) = 2000 1022 S(2) = 400 1023 L(PE) = 3998 1024 L(2) = 20768 1025 L(1) = 15580 1027 These considerable scaling benefits must be offset against the cost- 1028 effectiveness of the network. Recall from Section 3.3 that 1030 K = S(PE)/(S(1)+S(2) ... + S(n)) 1032 where n is the level above the PEs, so that for our network: 1034 K = S(PE) / (S(1) + S(2)) 1036 Thus, in the first example the cost-effectiveness has been halved 1037 from 1000/55 to 1000/110. And in the second example it has been 1038 reduced to roughly one quarter, changing from 2000/110 to 2000/420. 1040 So, although the tuning changes may be necessary to reach the desired 1041 network size, they come at a considerable cost to the operator. 1043 6.2 Alternative Two Layer Hierarchy 1045 An alternative to the two layer hierarchy presented in section 6.1 is 1046 to provide a full mesh of FA LSPs between P(1) nodes. This technique 1047 is only of benefit to any nodes in the core of the level 1 network. 1048 It makes no difference to the PE and P(2) nodes since they continue 1049 to see only the PE-to-PE LSPs. Furthermore, this approach increases 1050 the burden at the P(1) nodes since they have to support all of the 1051 PE-to-PE LSPs as in the flat model, plus the additional 2*(S(1) - 1) 1052 P(1)-to-P(1) FA LSPs. Thus, this approach should only be considered 1053 where there is a mesh of P-nodes within the ring of P(1) nodes, and 1054 it is not considered further in this document. 1056 6.3 Three Layer Hierarchy 1058 As demonstrated by section 6.2, introducing a mesh of FA LSPs at the 1059 top level (P(1)) has no benefit, but if we introduce an additional 1060 level in the network (P(3) between P(2) and PE) to make a four level 1061 snowflake, we can introduce a new layer of FA LSPs so that we have a 1062 full mesh of FA LSPs between all P(3) nodes to carry the PE-to-PE 1063 LSPs, and a full mesh of FA LSPs between all P(2) nodes to carry the 1064 P(3)-to-P(3) LSPs. 1066 The number of PEs is S(PE) = S(1)*M(1)*M(2)*M(3), and the number of 1067 PE-to-PE LSPs at a PE remains as L(PE) = 2*(S(PE) - 1). 1069 The number of LSPs at a P(3) can be deduced from section 6.1. It is 1070 the sum of all LSPs for all attached PEs including the LSPs between 1071 the attached PE plus the number of FA LSPs providing a mesh to the 1072 other P(3) nodes. 1074 L(3) = M(3)*(2*S(PE) - M(3) - 1) + 2*(S(3) - 1) 1076 The number of LSPs at P(2) can also be deduced from section 6.1 since 1077 it is the sum of all LSPs for all attached P(3) nodes including the 1078 LSPs between the attached PE plus the number of FA LSPs providing a 1079 mesh to the other P(2) nodes. 1081 L(2) = M(2)*(2*S(3) - M(2) - 1) + 2*(S(2) - 1) 1083 Finally, L(1) can be copied straight from 6.1. 1085 L(1) = M(1)*(2*S(2) - M(1) - 1) 1087 For example, with S(1) = 5, M(1) = 5, M(2) = 5, and M(3) = 8 we see: 1088 S(PE) = 1000 1089 S(3) = 125 1090 S(2) = 25 1091 L(PE) = 1998 1092 L(3) = 16176 1093 L(2) = 1268 1094 L(1) = 220 1096 Similarly, with S(1) = 5, M(1) = 5, M(2) = 8, and M(3) = 10 we see: 1097 S(PE) = 2000 1098 S(3) = 200 1099 S(2) = 25 1100 L(PE) = 3998 1101 L(3) = 40038 1102 L(2) = 3184 1103 L(1) = 220 1105 Clearly, there are considerable scaling improvements with this three- 1106 layer hierarchy, and all of the numbers (even L(3) in the second 1107 example) are manageable. 1109 Of course, the extra level in the network tends to reduce the cost- 1110 effectiveness of the networks with values of K = 1000/155 and 1111 K = 2000/230 (from 1000/55 and 2000/110) for the examples above. 1112 That is, a reduction by a factor of 3 in the first case and 2 in the 1113 second case. Such a change in cost-effectiveness has to be weighed 1114 against the the desire to deploy such a large network. If LSP 1115 hierarchies are the only scaling tool available, and networks this 1116 size are required, the cost-effectiveness may need to be sacrificed. 1118 6.4 Issues with Hierarchical LSPs 1120 A basic observation for hierarchical scaling techniques is that it is 1121 hard to have any impact on the number of LSPs that must be supported 1122 by the level of P(n) nodes adjacent to the PEs (for example, it is 1123 hard to reduce L(3) in section 6.3). In fact, the only way we can 1124 change the number of LSPs supported by these nodes is to change the 1125 scaling ratio M(n) in the network, in other words to change the 1126 number of PEs subtended to any P(n). But such a change has a direct 1127 effect on the number of PEs in the network and so the 1128 cost-effectiveness is impacted. 1130 Another concern with the hierarchical approach is that it must be 1131 configured and managed. This may not seem like a large burden, but it 1132 must be recalled that the P(n) nodes are not at the edge of the 1133 network - they are a set of nodes that must be identified so that the 1134 FA LSPs can be configured and provisioned. Effectively, the operator 1135 must plan and construct a layered network with a ring of P(n) nodes 1136 giving access to the level (n) network. This design activity is open 1137 to considerable risk as failing to close the ring (i.e. allowing a 1138 node to be at both level (n+1) and at level (n)) may cause 1139 operational confusion. 1141 Protocol techniques (such as IGP auto-mesh [RFC4972]) have been 1142 developed to reduce the configuration necessary to build this type of 1143 multi-level network. In the case of auto-mesh, the routing protocol 1144 is used to advertise the membership of a 'mesh group', and all 1145 members of the mesh can discover each other and connect with LSP 1146 tunnels. Thus the P(n) nodes giving access to level (n) can advertise 1147 their existence to each other, and it is not necessary to configure 1148 each with information about all of the others. Although this process 1149 can help to reduce the configuration overhead, it does not eliminate 1150 it as each member of the mesh group must still be planned and 1151 configured for membership. 1153 An important consideration for the use of hierarchical LSPs is how 1154 they can be protected using MPLS Fast Re-Route (FRR) [RFC4090]. FRR 1155 may provide link protection either by protecting the tunnels as they 1156 traverse a broken link, or by treating each level (n) tunnel LSP as a 1157 link in level (n+1), and providing protection for the level (n+1) 1158 LSPs (although in this model fault detection and propagation time may 1159 be an issue). Node protection may be performed in a similar way, but 1160 protection of the first and last nodes of a hierarchical LSP is 1161 particularly difficult. Additionally, the whole notion of scaling 1162 with regard to FRR gives rise to separate concerns that are outside 1163 the scope of this document as currently formulated. 1165 Finally, observe that we have been explaining these techniques using 1166 conveniently symmetrical networks. Consider how we would arrange the 1167 hierarchical LSPs in a network where some PEs are connected closer to 1168 the center of the network than others. 1170 7. Scaling Ladder Networks with Forwarding Adjacencies 1172 7.1 Two Layer Hierarchy 1174 In Section 6.2, we observed that there is no value to placing FA LSPs 1175 between the P(1) nodes of our example snowflake topologies. This is 1176 because those LSPs would be just one hop long and would, in fact, 1177 only serve to increase the burden at the P(1) nodes. However, in the 1178 ladder model, there is value to this approach. The P(1) nodes are the 1179 spar nodes of the ladder, and they are not all mutually adjacent. 1180 That is, the P(1)-to-P(1) hierarchical LSPs can create a full mesh of 1181 P(1) nodes where one does not exist in the physical topology. 1183 The number of LSPs seen by a P(1) node is then: 1185 o all of the tunnels terminating at the P(1) node 1186 o any transit tunnels 1187 o all of the LSPs due to subtended PEs. 1189 This is a substantial reduction because all of the transit LSPs are 1190 reduced to just one per remote P(1) that causes any transit LSP. So: 1192 L(1) = 2*(S(1) - 1) + 1193 O(S(1)*S(1)/2) + 1194 2*E*E*(S(1) - 1) + E*(E-1) - E*(M(2) - 1) 1196 where O(S(1)*S(1)/2) gives an upper bound order of magnitude. So: 1198 L(1) = S(1)*S(1)/2 + 2*S(1) + 2*E*E*(S(1) - 1) - E*M(2) - 2 1200 So, in our two examples: 1202 With S(1) = 6, M(1) = 10, and M(2) = 17, we see: 1203 E = 170 1204 S(PE) = 1020 1205 L(PE) = 2038 1206 L(2) = 34374 1207 L(1) = 286138 1209 Alternatively, with S(1) = 10, M(1) = 10, and M(2) = 20, we see: 1210 E = 200 1211 S(PE) = 2000 1212 L(PE) = 3998 1213 L(2) = 79580 1214 L(1) = 716060 1216 Both of these show significant improvements over the previous L(1) 1217 figures of 777410 and 2516000. But the numbers are still too large to 1218 manage, and there is no improvement in the L(2) figures. 1220 7.2 Three Layer Hierarchy 1222 We can also apply the three layer hierarchy to the ladder model. In 1223 this case the number of LSPs between P(1) nodes is not reduced, but 1224 tunnels are also set up between all P(2) nodes. Thus the number of 1225 LSPs seen by a P(1) node is: 1227 o all of the tunnels terminating at the P(1) node 1228 o any transit tunnels between P(1) nodes 1229 o all of the LSPs due to subtended P(2) nodes. 1231 No PE-to-PE LSPs are seen at the P(1) nodes. 1233 L(1) = 2*(S(1) - 1) + 1234 O(S(1)*S(1)/2) + 1235 2*(S(1) - 1)*M(1)*M(1) + M(1)*(M(1) - 1) 1237 where O(S(1)*S(1)/2) gives an upper bound order of magnitude. So: 1239 L(1) = S(1)*S(1)/2 + 2*S(1) + 2*M(1)*M(1)*S(1) - M(1)(M(1) + 1) - 2 1241 Unfortunately, there is a small increase in the number of LSPs seen 1242 by the P(2) nodes. Each P(2) now sees all of the PE-to-PE LSPs that 1243 it saw before, and it is also an end point for a set of P(2)-to-P(2) 1244 tunnels. Thus L(2) increases to: 1246 L(2) = 2*M(2)*(S(PE) - 1) - M(2)*(M(2) - 1) + 2*(S(1)*M(1) - 1) 1248 So, in our two examples: 1250 With S(1) = 6, M(1) = 10, and M(2) = 17, we see: 1251 E = 170 1252 S(PE) = 1020 1253 L(PE) = 2038 1254 L(2) = 34492 1255 L(1) = 1118 1257 Alternatively, with S(1) = 10, M(1) = 10, and M(2) = 20, we see: 1258 E = 200 1259 S(PE) = 2000 1260 L(PE) = 3998 1261 L(2) = 79778 1262 L(1) = 1958 1264 This represents a very dramatic decrease in LSPs across the core. 1266 7.3 Issues with Hierarchical LSPs 1268 The same issues exist for hierarchical LSPs as described in Section 1269 6.4. Although dramatic improvements can be made to the scaling 1270 numbers for the number of LSPs at core nodes, this can only be done 1271 at the cost of configuring P(2) to P(2) tunnels. The mesh of P(1) 1272 tunnels is not enough. 1274 But the sheer number of P(2) to P(2) tunnels that must be configured 1275 is a significant management burden that can only be eased by using a 1276 technique like automesh [RFC4972]. 1278 It is significant, however, that the scaling problem at the P(2) 1279 nodes cannot be improved by using tunnels, and the only solution to 1280 ease this in the hierarchical approach would be to institute another 1281 layer of hierarchy (that is, P(3) nodes) between the P(2) nodes and 1282 the PEs. This is, of course, a significant expense. 1284 8. Scaling Improvements Through Multipoint-to-Point LSPs 1286 An alternative (or complementary) scaling technique has been proposed 1287 using multipoint-to-point (MP2P) LSPs. The fundamental improvement in 1288 this case is achieved by reducing the number of LSPs toward the 1289 destination as LSPs toward the same destination are merged. 1291 This section presents an overview of MP2P LSPs and describes their 1292 applicability and scaling benefits. 1294 8.1 Overview of MP2P LSPs 1296 Note that the MP2P LSPs discussed here are for MPLS-TE and are not 1297 the same concept familiar in the Label Distribution Protocol (LDP) 1298 described in [RFC5036]. 1300 Traffic flows generally converge toward their destination and this 1301 can be utilized by MPLS in constructing an MP2P LSP. With such an 1302 LSP, the LFIB mappings at each LSR are many-to-one so that multiple 1303 pairs {incoming interface, incoming label} are mapped to a single 1304 pair {outgoing interface, outgoing label}. Obviously, if per-platform 1305 labels are used, this mapping may be optimized within an 1306 implementation. 1308 It is important to note that with MP2P MPLS-TE LSPs the traffic flows 1309 are merged. That is, some additional form of identifier is required 1310 if demerging is required. For example, if the payload is IP traffic 1311 belonging to the same client network, no additional demerging 1312 information is required since the IP packet contains sufficient data. 1313 On the other hand, if the data comes, for example, from a variety of 1314 VPN client networks, then the flows will need to be labeled in their 1315 own right as point-to-point (P2P) flows, so that traffic can be 1316 disambiguated at the egress of the MP2P LSPs. 1318 Techniques for establishing MP2P MPLS-TE LSPs and for assigning the 1319 correct bandwidth downstream of LSP merge points are out of the scope 1320 of this document. 1322 8.2 LSP State: A Better Measure of Scalability 1324 Consider the network topology shown in figure 3. Suppose that we 1325 establish MP2P LSP tunnels such that there is one tunnel terminating 1326 at each PE, and that that tunnel has every other PE as an ingress. 1327 Thus, a PE-to-PE MP2P LSP tunnel would have S(PE)-1 ingresses and one 1328 egress, and there would be S(PE) such tunnels. 1330 Note that there still remain 2*(S(PE) - 1) PE-to-PE P2P LSPs that are 1331 carried through these tunnels. 1333 Let's consider the number of LSPs handled at each node in the 1334 network. 1336 The PEs continue to handle the same number of PE-to-PE P2P LSPs, and 1337 must also handle the MP2P LSPs. So: 1339 L(PE) = 2*(S(PE) - 1) + S(PE) 1341 But all P(n) nodes in the network only handle the MP2P LSP tunnels. 1342 Nominally, this means that L(n) = S(PE) for all values of n. This 1343 would appear to be a great success with the number of LSPs cut to 1344 completely manageable levels. 1346 But the number of LSPs is not the only issue (although it may have 1347 some impact for some of the scaling concerns listed in section 4). We 1348 are more interested in the amount of LSP state that is maintained by 1349 an LSR. This reflects the amount of storage required at the LSR, the 1350 amount of protocol processing, and the amount of information that 1351 needs to be managed. 1353 In fact, we were also interested in this measure of scalability in 1354 the earlier sections of this document, but in those cases we could 1355 see a direct correlation between the number of LSPs and the amount of 1356 LSP state since transit LSPs had two pieces of state information (one 1357 on the incoming and one on the outgoing interface), and ingress or 1358 egress LSPs had just one piece of state. 1360 We can quantify the amount of LSP state according to the number of 1361 LSP segments managed by an LSR. So (as above), in the case of a P2P 1362 LSP, an ingress or egress has one segment to maintain, while a 1363 transit has two segments. Similarly, for an MP2P LSP, an LSR must 1364 maintain one set of state information for each upstream segment 1365 (which, we can assume, is in a one-to-one relationship with the 1366 number of upstream neighbors), and exactly one downstream segment - 1367 ingresses obviously have no upstream neighbors, and egresses have no 1368 downstream segments. 1370 So we can start again on our examination of the scaling properties of 1371 MP2P LSPs using X(n) to represent the amount of LSP state held at 1372 each P(n) node. 1374 8.3 Scaling Improvements for Snowflake Networks 1376 At the PEs, there is only connectivity to one other network node, the 1377 P(2) node. But note that if P2P LSPs need to be used to allow 1378 disambiguation of data at the MP2P LSP egresses, then these P2P LSPs 1379 are tunneled within the MP2P LSPs . So X(PE) is: 1381 X(PE) = 2*(S(PE) - 1) if no disambiguation is required 1383 and 1385 X(PE) = 4*(S(PE) - 1) if disambiguation is required 1387 Each P(2) node has M(2) downstream PEs. The P(2) sees a single MP2P 1388 LSP targeted at each downstream PE with one downstream segment (to 1389 that PE) and M(2) - 1 upstream segments from the other subtended PEs. 1390 Additionally, each of these LSPs has an upstream segment from the one 1391 upstream P(1). This gives a total of M(2)*(1 + M(2)) LSP segments. 1393 There are also LSPs running from the subtended PEs to each other PE 1394 in the network. There are S(PE) - M(2) such PEs, and the P(2) sees 1395 one upstream segment for each of these from each subtended PE. It 1396 also has one downstream segment for each of these LSPs. This gives 1397 (M(2) + 1)*(S(PE) - M(2)) LSP segments. 1399 Thus: 1401 X(2) = M(2)*(1 + M(2)) + (M(2) + 1)*(S(PE) - M(2)) 1402 = S(PE)*(M(2) + 1) 1404 Similarly, at each P(1) node there are M(1) downstream P(2) nodes and 1405 so a total of M(1)*M(2) downstream PEs. Each P(1) is connected in a 1406 full mesh with the other P(1) nodes so has (S(1) - 1) neighbors. 1408 The P(1) sees a single MP2P LSP targeted at each downstream PE. This 1409 has one downstream segment (to the P(2) to which the PE is connected) 1410 and M(1) - 1 upstream segments from the other subtended P(2) nodes. 1411 Additionally, each of these LSPs has an upstream segment from each of 1412 the P(1) neighbors. This gives a total number of LSP segments of 1413 M(1)*M(2)*(M(1) + S(1) - 1). 1415 There are also LSPs running from each of the subtended PEs to each 1416 other PE in the network. There are S(PE) - M(1)M(2) such PEs, and the 1417 P(1) sees one upstream segment for each of these from each subtended 1418 P(2) (since the aggregation form the subtended PEs has already 1419 happened at the P(2) nodes). It also has one downstream segment to 1420 appropriate next hop P(1) neighbor for each of these LSPs. This gives 1421 (M(1) + 1)*(S(PE) - M(1)*M(2)) LSP segments. 1423 Thus: 1425 X(1) = M(1)*M(2)*(M(1) + S(1) - 1) + 1426 (M(1) + 1)*(S(PE) - M(1)*M(2)) 1427 = M(1)*M(2)*(S(1) - 2) + S(PE)*(M(1) + 1) 1429 So, for example, with S(1) = 10, M(1) = 10, and M(2) = 10, we see: 1430 S(PE) = 1000 1431 S(2) = 100 1432 X(PE) = 3996 (or 1998) 1433 X(2) = 11000 1434 X(1) = 11800 1436 And similarly, with S(1) = 20, M(1) = 20, and M(2) = 5, we see: 1437 S(PE) = 2000 1438 S(2) = 400 1439 X(PE) = 5996 (or 2998) 1440 X(2) = 12000 1441 X(1) = 39800 1443 8.3.1 Comparison with Other Scenarios 1445 For comparison with the examples in sections 5 and 6, we need to 1446 convert those LSP-based figures to our new measure of LSP state. 1448 Observe that each LSP in sections 5 and 6 generates two state units 1449 at a transit LSR and one at an ingress or egress. So we can provide 1450 conversions as follows: 1452 Section 5 (flat snowflake network) 1453 L(PE) = 2*(S(PE) - 1) 1454 L(2) = M(2)*(2*S(PE) - M(2) - 1) 1455 L(1) = M(1)*M(2)*(2*S(PE) - M(2)*(M(1) + 1)) 1456 X(PE) = 2*(S(PE) - 1) 1457 X(2) = 2*M(2)*(2*S(PE) - M(2) - 1) 1458 X(1) = 2*M(1)*M(2)*(2*S(PE) - M(2)*(M(1) + 1)) 1460 For the example with S(1) = 10, M(1) = 10, and M(2) = 10, this 1461 gives a comparison table as follows: 1463 Count | Unmodified | MP2P 1464 ------+-------------+---------- 1465 X(PE) | 1998 | 3996 1466 X(2) | 39780 | 11000 1467 X(1) | 378000 | 11800 1469 Clearly this technique is a significant improvement over the flat 1470 network within the core of the network, although the PEs are more 1471 heavily stressed if disambiguation is required. 1473 Section 6.1 (Two-layer hierarchy snowflake network) 1474 L(PE) = 2*(S(PE) - 1) 1475 L(2) = M(2)*(2*S(PE) - M(2) - 1) + 2*(S(2) - 1) 1476 L(1) = M(1)*(2*S(2) - M(1) - 1) 1477 X(PE) = 2*(S(PE) - 1) 1478 X(2) = 2*M(2)*(2*S(PE) - M(2) - 1) + 2*(S(2) - 1) 1479 X(1) = 2*M(1)*(2*S(2) - M(1) - 1) 1481 Note that in the computation of X(2) the hierarchical LSPs only add 1482 one state at each P(2) node. 1484 For the same example with S(1) = 10, M(1) = 10, and M(2) = 10, this 1485 gives a comparison table as follows: 1487 Count | 2-Layer | MP2P 1488 | Hierarchy | 1489 ------+-------------+---------- 1490 X(PE) | 1998 | 3996 1491 X(2) | 39978 | 11000 1492 X(1) | 3780 | 11800 1494 And we can observe that the MP2P model is better at P(2), but the 1495 hierarchical model is better at P(1). 1497 In fact, this comparison can be generalized to observe that the MP2P 1498 model produces best effects toward the edge of the network, while the 1499 hierarchical model makes most impression at the core. However, the 1500 requirement for disambiguation of P2P LSPs tunneled within the MP2P 1501 LSPs does cause a double burden at the PEs. 1503 8.4 Scaling Improvements for Ladder Networks 1505 MP2P LSPs applied just within the ladder will not make a significant 1506 difference, but applying MP2P for all LSPs and at all nodes makes a 1507 very big difference without requiring any further configuration. 1509 LSP state at a spar node may be divided into those LSPs segments that 1510 enter or leave the spar node due to subtended PEs (local LSP 1511 segments), and those that enter or leave the spar node due to remote 1512 PEs (remote segments). 1514 The local segments may be counted as: 1516 o E LSPs targeting local PEs 1517 o (S(1)-1)*E*M(1) LSPs targeting remote PEs 1519 The remote segments may be counted as: 1521 o (S(1)-1)*E outgoing LSPs targeting remote PEs 1522 o <= 3*S(1)*E incoming LSPs targeting any PE (there are precisely 1523 P(1) nodes attached to any other P(1) node). 1525 Hence, using X(1) as a measure of LSP state rather than a count of 1526 LSPs, we get: 1528 X(1) <= E + (S(1)-1)*E*M(1) + (S(1)-1)*E + 3*S(1)*E 1529 <= (4 + M(1))*S(1)*E - M(1)*E 1531 The number of LSPs at the P(2) nodes is also improved. We may also 1532 count the LSP state in the same way so that there are: 1534 o M(2) LSPs targeting local PEs 1535 o M(2)*(S(1)*E) LSPs from local PEs to all other PEs 1536 o S(1)*E - M(2) LSPs to remote PEs. 1538 So using X(2) as a measure of LSP state and not a count of LSPs, we 1539 have: 1541 X(2) = M(2) + M(2)*(S(1)*E) + S(1)*E - M(2) 1542 = (M(2) + 1)*S(1)*E 1544 Our examples from Section 5.2 give us the following numbers: 1546 With S(1) = 6, M(1) = 10, and M(2) = 17, we see: 1547 E = 170 1548 S(PE) = 1020 1549 X(PE) = 2038 1550 X(2) = 18360 1551 X(1) <= 12580 1553 Alternatively, with S(1) = 10, M(1) = 10, and M(2) = 20, we see: 1554 E = 200 1555 S(PE) = 2000 1556 X(PE) = 3998 1557 X(2) = 42000 1558 X(1) <= 26000 1560 8.4.1 Comparison with Other Scenarios 1562 The use of MP2P compares very favorably with all scaling scenarios. 1563 It is the only technique able to reduce the value of X(2), and it 1564 does this by a factor of almost two. The impact on X(1) is better 1565 than everything except the three level hierarchy. 1567 The following table provides a quick cross-reference for the figures 1568 for the example ladder networks. Note that the previous figures are 1569 modified to provide counts of LSP state rather than LSP numbers. 1570 Again, each LSP contributes one state at its end points, and two 1571 states at transit nodes. 1573 Thus, for the all cases we have 1575 X(PE) = 2*(S(PE) - 1) or 1576 X(PE) = 4*(S(PE) - 1) if disambiguation is required. 1578 In the unmodified (flat) case, we have: 1580 X(2) = 2*(M(2)*(2*S(PE) - M(2) - 1)) 1581 X(1) = 2*(M(1)*M(2)*(2*S(PE) - M(2)*(M(1) + 1))) 1583 In the 2-level hierarchy, we have: 1585 X(2) = 2*(2*M(2)*(S(PE) - 1) - M(2)*(M(2) - 1)) 1586 X(1) = S(1)*S(1) + 2*S(1) + 4*E*E*(S(1) - 1) - 2*E*M(2) - 2 1588 In the 3-level hierarchy, we have: 1590 X(2) = 2*(2*M(2)*(S(PE) - 1) - M(2)*(M(2) - 1)) + 2*(S(1)*M(1) - 1) 1591 X(1) = S(1)*S(1) + 2*S(1) + 4*M(1)*M(1)*S(1) - 2*M(1)(M(1) + 1) - 2 1593 Example A: S(1) = 6, M(1) = 10, and M(2) = 17 1594 Example B: S(1) = 10, M(1) = 10, and M(2) = 20 1596 Example| Count | Unmodified | 2-Level | 3-Level | MP2P 1597 | | | Hierarchy | Hierarchy | 1598 -------+-------+------------+------------+-------------+------- 1599 A | X(2) | 68748 | 68748 | 68866 | 18360 1600 | X(1) | 1554820 | 572266 | 2226 | 12580 1601 -------+-------+------------+------------+-------------+------- 1602 B | X(2) | 159160 | 159160 | 159358 | 42000 1603 | X(1) | 5032000 | 1433998 | 3898 | 26000 1605 8.4.2 LSP State Compared with LSP Numbers 1607 Recall that in Section 8.3, the true benefit of MP2P was analyzed 1608 with respect to the LSP segment state required, rather than the 1609 actual number of LSPs. This proved to be a more accurate comparison 1610 of the techniques because the MP2P LSPs require state on each branch 1611 of the LSP so the saving is not linear with the reduced number of 1612 LSPs. 1614 A similar analysis could be performed here for the ladder network. 1615 The net effect is that it increases the state by an order of two for 1616 all transit LSPs in the P2P models, and by a multiplier equal to the 1617 degree of a node in the MP2P model. 1619 A rough estimate shows that, as with snowflake networks, MP2P 1620 provides better scaling than the 1-level hierarchical model, and is 1621 considerably better at the core. But MP2P compares less will with the 1622 2-level hierarchy especially in the core. 1624 8.5 Issues with MP2P LSPs 1626 The biggest challenges for MP2P LSPs are the provision of support in 1627 the control and data planes. To some extent support must also be 1628 provided in the management plane. 1630 Control plane support is just a matter of defining the protocols and 1631 procedures [MP2P-RSVP], although it must be clearly understood that 1632 this will introduce some complexity to the control plane. 1634 Hardware issues may be a little more tricky. For example, the 1635 capacity of the upstream segments must never (allowing for 1636 statistical over-subscription) exceed the capacity of the downstream 1637 segment. Similarly, data planes must be equipped with sufficient 1638 buffers to handle incoming packet collisions. 1640 The management plane will be impacted in several ways. Firstly the 1641 management applications will need to handle LSPs with multiple 1642 senders. This means that, although the applications need to process 1643 fewer LSPs, they will be more complicated and will, in fact, need to 1644 process the same number of ingresses and egresses. Other issues like 1645 diagnostics and OAM would also need to be enhanced to support MP2P, 1646 but might be borrowed heavily from LDP networks. 1648 Lastly, note that when the MP2P solution is used, the receiver (the 1649 single egress PE of an MP2P tunnel) cannot use the incoming label as 1650 an indicator of the source of the data. Contrast this with P2P LSPs. 1651 Depending on deployment, this might not be an issue since the PE-PE 1652 connectivity may in any case be a tunnel with inner labels to 1653 discriminate the data flows. 1655 In other deployments, it may be considered necessary to include 1656 additional PE-PE P2P LSPs and tunnel these through the MP2P LSPs. 1657 This would require the PEs to support twice as many LSPs. Since PEs 1658 are not usually as fully specified as P-routers, this may cause some 1659 concern although the use of penultimate hop popping on the MP2P LSPs 1660 might help to reduce this issue. 1662 In all cases, care must be taken not to confuse the reduction in the 1663 number of LSPs with a reduction in the LSP state that is required. In 1664 fact, the discussion in Section 8.3 is slightly optimistic since 1665 LSP state toward the destination will probably need to include 1666 sender information and so will increase depending on the number of 1667 senders for the MP2P LSP. Section 8.4, on the other hand, counts LSP 1668 state rather than LSPs. This issue is clearly dependent on the 1669 protocol solution for MP2P RSVP-TE which is out of scope for this 1670 document. 1672 MPLS Fast Re-route (FRR) [RFC4090] is an attractive scheme for 1673 providing rapid local protection from node or link failures. Such a 1674 scheme has, however, not been designed for MP2P at the time of 1675 writing, so it remains to be seen how practical it would be, 1676 especially in the case of the failure of a merge node. Initial 1677 examination of this case suggests that FRR would not be a problem 1678 for MP2P given that each flow can be handled separately. 1680 As a final note, observe that the MP2P scenario presented in this 1681 document may be optimistic. MP2P LSP merging may be hard to achieve 1682 between LSPs with significantly different traffic and QoS parameters. 1683 Therefore, it may be necessary to increase the number of MP2P LSPs 1684 arriving at an egress. 1686 9. Combined Models 1688 There is nothing to prevent the combination of hierarchical and MP2P 1689 solutions within a network. 1691 Note that if MP2P LSPs are tunneled through P2P FA LSPs across the 1692 core, none of the benefit of LSP merging is seen for the hops during 1693 which the MP2P LSPs are tunneled. 1695 On the other hand, it is possible to construct solutions where MP2P 1696 FA LSPs are constructed within the network resulting in savings from 1697 both modes of operation. 1699 10. An Alternate Solution 1701 A simple solution to reducing the number of LSP tunnels handled by 1702 any node in the network has been proposed. In this solution it is 1703 observed that part of the problem is caused purely by the total 1704 number of LSP in the network, and that this is a function of the 1705 number of PEs since a full mesh of PE-PE LSPs is required. The 1706 conclusion of this observation is to move the tunnel end-points 1707 further into the network so that, instead of having a full mesh of 1708 PE-PE tunnels, we have only a full mesh of P(n)-P(n) tunnels. 1710 Obviously, there is no change in the physical network topology, so 1711 the PEs remain subtended to the P(n) nodes, and the consequence is 1712 that there is no TE on the links between PEs and P(n) nodes. 1714 In this case we have already done the hard work for computing the 1715 number of LSP in the previous sections. The power of the analysis in 1716 the earlier sections is demonstrated by its applicability to this new 1717 model - all we need to do is make minor changes to the formulae. This 1718 is most simply done by removing a layer from the network. We 1719 introduce the term "tunnel end-point" (TEP), and replace the P(n) 1720 nodes with TEPs. Thus, the example of a flat snowflake network in 1721 Figure 3 becomes as shown in Figure 7. Corresponding changes can be 1722 made to all of the sample topologies. 1724 PE PE PE PE PE PE 1725 \ \/ \/ / 1726 PE--TEP TEP TEP TEP--PE 1727 \ | | / 1728 \| |/ 1729 PE--TEP---P(1)------P(1)---TEP--PE 1730 / \ / \ 1731 PE \ / PE 1732 \/ 1733 P(1) 1734 /|\ 1735 / | \ 1736 / | \ 1737 PE--TEP TEP TEP--PE 1738 / /\ \ 1739 PE PE PE PE 1741 Figure 7 : An Example Snowflake Network With Tunnel End-Points 1742 To perform the scaling calculations we need only replace the PE 1743 counts in the formulae with TEP counts, and observe that there is 1744 one fewer layer in the network. For example, in the flat snowflake 1745 network shown in Figure 7 we can see that the number of LSPs seen 1746 at a TEP is: 1748 L(TEP) = 2*(S(TPE) - 1) 1750 In our sample networks S(TPE) is typically of the order of 50 or 100 1751 (the original values of S(2)), so L(TEP) is less than 200 which is 1752 quite manageable. 1754 Similarly, the number of LSPs handled by a P(1) node can be derived 1755 from the original formula for the number of LSPs seen at a P(2) node 1756 since all we have done is reduce n in P(n) from 2 to 1. So our new 1757 formula is: 1759 L(1) = M(1)*(2*S(TEP) - M(1) - 1) 1761 With figures for M(1) = 10 and S(TEP) = 100, this gives us 1762 L(1) = 1890. This is also very manageable. 1764 10.1 Pros and Cons of the Alternate Solution 1766 On the face of it, this alternate solution seems very attractive. 1767 Simply by contracting the edges of the tunnels into the network, we 1768 have shown a dramatic reduction in the number of tunnels needed, and 1769 there is no requirement to apply any additional scaling techniques. 1771 But what of the PE-P(n) links? In the earlier sections of this 1772 document we have assumed that there was some requirement for PE-PE 1773 LSPs with TE properties that extended to the PE-P(n) links at both 1774 ends of each LSP. That means that there was a requirement to provide 1775 reservation-based QoS on those links, to be able to discriminate 1776 traffic flows for priority-based treatment, and to be able to 1777 distinguish applications and sources that send data based on the LSPs 1778 that carry the data. 1780 It might be argued that, since the PE-P(n) links do not offer any 1781 routing options (each such link provides the only access to the 1782 network for a PE), most of the benefits of tunnels are lost on these 1783 peripheral links. However, TE is not just about routing. Just as 1784 important are the abilities to make resource reservations, to 1785 prioritize traffic, and to discriminate between traffic from 1786 different applications, customers, or VPNs. 1788 Furthermore, in multihoming scenarios where each PE is connected to 1789 more than one P(n) or where a PE has multiple links to a single P(n), 1790 there may be a desire to preselect the link to be used and to direct 1791 the traffic to that link using a PE-PE LSP. Note that multihoming 1792 has not been considered in this document. 1794 Operationally, P(n)-P(n) LSPs offer the additional management 1795 overhead that is seen for hierarchical LSPs described in Section 6. 1796 That is, the LSPs have to be configured and established through 1797 additional configuration or management operations that are not 1798 carried out at the PEs. As described in Section 6, automesh [RFC4972] 1799 could be used to ease this task. But it must be noted that, as 1800 mentioned above, some of the key uses of tunnels require that traffic 1801 is classified and placed in an appropriate tunnel according to its 1802 traffic class, end-points, originating application, and customer 1803 (such as client VPN). This information may not be readily available 1804 for each packet at the P(n) nodes since it is PE-based information. 1805 Of course, it is possible to conceive of techniques to make this 1806 information available such as assigning a different label for each 1807 class of traffic, but this gives rise to the original problems of 1808 larger numbers of LSPs. 1810 Our conclusion is, therefore, that this alternate technique may be 1811 suitable for the general distribution of traffic based solely on the 1812 destination, or on a combination of the destination and key fields 1813 carried in the IP header. In this case it can provide a very 1814 satisfactory answer to the scaling issues in an MPLS-TE network. But 1815 if more sophisticated packet classification and discrimination is 1816 required, this technique will make the desired function hard to 1817 achieve, and the trade-off between scaling and feature-level will 1818 swing too far towards solving the scaling issue at the expense of 1819 delivery of function to the customer. 1821 11. Management Considerations 1823 The management issues of the models presented in this document 1824 have been discussed in line. No one solution is without its 1825 management overhead. 1827 Note, however, that scalability of management tools is one of the 1828 motivators for this work and that network scaling solutions that 1829 reduce the active management of LSPs at the cost of additional effort 1830 to manage the more static elements of the network represent a 1831 benefit. That is, it is worth the additional effort to set up MP2P or 1832 FA LSPs if it means that the network can be scaled to a larger size 1833 without being constrained by the management tools. 1835 The MP2P technique may prove harder to debug through OAM methods than 1836 the FA LSP approach. 1838 12. Security Considerations 1840 The techniques described in this document use existing or 1841 yet-to-be-defined signaling protocol extensions and are subject to 1842 the security provided by those extensions. Note that we are talking 1843 about tunneling techniques used within the network and both 1844 approaches are vulnerable to the creation of bogus tunnels that 1845 deliver data to an egress or consume network resources. 1847 The fact that the MP2P technique may prove harder to debug through 1848 OAM methods than the FA LSP approach is a security concern since it 1849 is important to be able to detect misconnections. 1851 General issues of the realtionship between scaling and security are 1852 covered in Section 1.1, but the details are beyond the scope of this 1853 document. Readers are refered to [MPLS-SEC] for details of MPLS 1854 security techniques. 1856 13. Recommendations 1858 The analysis in this document suggests that the ability to signal 1859 MP2P MPLS-TE LSPs is a desirable addition to the operator's MPLS-TE 1860 toolkit. 1862 At this stage, no further recommendations are made, but it would be 1863 valuable to consult more widely to discover: 1865 - The concerns of other service providers with respect to network 1866 scalability. 1868 - More opinions on the realistic constraints to the network 1869 parameters listed in Section 4. 1871 - Desirable values for the cost-effectiveness of the network 1872 (parameter K) 1874 - The applicability, manageability and support for the two techniques 1875 described 1877 - The feasibility of combining the two techniques as discussed in 1878 Section 9. 1880 - The level of concern over the loss of functionality that would 1881 occur if the alternate solution described in Section 10 was 1882 adopted. 1884 14. IANA Considerations 1886 This document makes no requests for IANA action. 1888 15. Acknowledgements 1890 The authors are grateful to Jean-Louis Le Roux for discussions and 1891 review input. Thanks to Ben Niven-Jenkins, JP Vasseur, Loa Andersson, 1892 Anders Gavler, and Tom Polk for their comments. Thanks to Dave Allen 1893 for useful discussion of the math. 1895 16. Intellectual Property Consideration 1897 The IETF Trust takes no position regarding the validity or scope of 1898 any Intellectual Property Rights or other rights that might be 1899 claimed to pertain to the implementation or use of the technology 1900 described in any IETF Document or the extent to which any license 1901 under such rights might or might not be available; nor does it 1902 represent that it has made any independent effort to identify any 1903 such rights. 1905 Copies of Intellectual Property disclosures made to the IETF 1906 Secretariat and any assurances of licenses to be made available, or 1907 the result of an attempt made to obtain a general license or 1908 permission for the use of such proprietary rights by implementers or 1909 users of this specification can be obtained from the IETF on-line IPR 1910 repository at http://www.ietf.org/ipr 1912 The IETF invites any interested party to bring to its attention any 1913 copyrights, patents or patent applications, or other proprietary 1914 rights that may cover technology that may be required to implement 1915 any standard or specification contained in an IETF Document. Please 1916 address the information to the IETF at ietf-ipr@ietf.org. 1918 The definitive version of an IETF Document is that published by, or 1919 under the auspices of, the IETF. Versions of IETF Documents that are 1920 published by third parties, including those that are translated into 1921 other languages, should not be considered to be definitive versions 1922 of IETF Documents. The definitive version of these Legal Provisions 1923 is that published by, or under the auspices of, the IETF. Versions of 1924 these Legal Provisions that are published by third parties, including 1925 those that are translated into other languages, should not be 1926 considered to be definitive versions of these Legal Provisions. 1928 For the avoidance of doubt, each Contributor to the IETF Standards 1929 Process licenses each Contribution that he or she makes as part of 1930 the IETF Standards Process to the IETF Trust pursuant to the 1931 provisions of RFC 5378. No language to the contrary, or terms, 1932 conditions or rights that differ from or are inconsistent with the 1933 rights and licenses granted under RFC 5378, shall have any effect and 1934 shall be null and void, whether published or posted by such 1935 Contributor, or included with or in such Contribution. 1937 17. Normative References 1939 [RFC4206] Kompella, K. and Y. Rekhter, "Label Switched Paths (LSP) 1940 Hierarchy with Generalized Multi-Protocol Label 1941 Switching (GMPLS) Traffic Engineering (TE)", RFC 4206, 1942 October 2005. 1944 18. Informative References 1946 [RFC2961] Berger, L., Gan, D., Swallow, G., Pan, P., Tommasi, F., 1947 and S. Molendini, "RSVP Refresh Overhead Reduction 1948 Extensions", RFC 2961, April 2001. 1950 [RFC3209] Awduche, D., Berger, L., Gan, D., Li, T., Srinivasan, V., 1951 and G. Swallow, "RSVP-TE: Extensions to RSVP for LSP 1952 Tunnels", RFC 3209, December 2001. 1954 [RFC3270] Le Faucheur, F., et al., "Multi-Protocol Label Switching 1955 (MPLS) Support of Differentiated Services", RFC 3270, May 1956 2002. 1958 [RFC3473] Berger, L., et al., Editor, "Generalized Multi-Protocol 1959 Label Switching (GMPLS) Signaling Resource ReserVation 1960 Protocol-Traffic Engineering (RSVP-TE) Extensions", 1961 RFC 3473, January 2003. 1963 [RFC3985] Bryant, S., and Pate, P., "Pseudo Wire Emulation Edge-to- 1964 Edge (PWE3) Architecture", RFC 3985, March 2005. 1966 [RFC4090] P. Pan, G. Swallow, A. Atlas (Editors), "Fast Reroute 1967 Extensions to RSVP-TE for LSP Tunnels", RFC 4090, May 1968 2005. 1970 [RFC4110] Callon, R., and Suzuki, M., "A Framework for Layer 3 1971 Provider-Provisioned Virtual Private Networks (PPVPNs)", 1972 RFC 4110, July 2005. 1974 [RFC4972] Vasseur, JP., and Le Roux, JL., "Routing Extensions for 1975 Discovery of Multiprotocol (MPLS) Label Switch Router 1976 (LSR) Traffic Engineering (TE) Mesh Membership", 1977 RFC 4972, July 2007. 1979 [RFC5036] Andersson, L., Doolan, P., Feldman, N., Fredette, A., and 1980 B. Thomas, "LDP Specification", RFC 5036, October 2007. 1982 [MP2P-RSVP] Yasukawa, Y., "Supporting Multipoint-to-Point Label 1983 Switched Paths in Multiprotocol Label Switching Traffic 1984 Engineering", draft-yasukawa-mpls-mp2p-rsvpte, work in 1985 progress. 1987 [MPLS-SEC] Fang, L. (Ed.), "Security Framework for MPLS and GMPLS 1988 Networks", draft-ietf-mpls-mpls-and-gmpls-security- 1989 framework, work in progress. 1991 19. Authors' Addresses 1993 Seisho Yasukawa 1994 NTT Corporation 1995 9-11, Midori-Cho 3-Chome 1996 Musashino-Shi, Tokyo 180-8585 Japan 1997 Phone: +81 422 59 4769 1998 EMail: s.yasukawa@hco.ntt.co.jp 2000 Adrian Farrel 2001 Old Dog Consulting 2002 EMail: adrian@olddog.co.uk 2004 Olufemi Komolafe 2005 Cisco Systems 2006 96 Commercial Street 2007 Edinburgh 2008 EH6 6LX 2009 United Kingdom 2010 EMail: femi@cisco.com 2012 20. Disclaimer of Validity 2014 All IETF Documents and the information contained therein are provided 2015 on an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE 2016 REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE 2017 IETF TRUST AND THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL 2018 WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY 2019 WARRANTY THAT THE USE OF THE INFORMATION THEREIN WILL NOT INFRINGE 2020 ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS 2021 FOR A PARTICULAR PURPOSE. 2023 21. Full Copyright Statement 2025 Copyright (c) 2008 IETF Trust and the persons identified as the 2026 document authors. All rights reserved. 2028 This document is subject to BCP 78 and the IETF Trust's Legal 2029 Provisions Relating to IETF Documents 2030 (http://trustee.ietf.org/license-info) in effect on the date of 2031 publication of this document. Please review these documents 2032 carefully, as they describe your rights and restrictions with respect 2033 to this document.