idnits 2.17.1 draft-ietf-teas-interconnected-te-info-exchange-03.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- No issues found here. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year -- The document date (October 15, 2015) is 3087 days in the past. Is this intentional? Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) -- Obsolete informational reference (is this intentional?): RFC 5246 (Obsoleted by RFC 8446) Summary: 0 errors (**), 0 flaws (~~), 1 warning (==), 2 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 1 Network Working Group A. Farrel (Ed.) 2 Internet-Draft J. Drake 3 Intended status: Standards Track Juniper Networks 4 Expires: April 15, 2016 5 N. Bitar 6 Verizon Networks 8 G. Swallow 9 Cisco Systems, Inc. 11 D. Ceccarelli 12 Ericsson 14 X. 
Zhang 15 Huawei 16 October 15, 2015 18 Problem Statement and Architecture for Information Exchange 19 Between Interconnected Traffic Engineered Networks 21 draft-ietf-teas-interconnected-te-info-exchange-03.txt 23 Abstract 25 In Traffic Engineered (TE) systems, it is sometimes desirable to 26 establish an end-to-end TE path with a set of constraints (such as 27 bandwidth) across one or more networks from a source to a destination. 28 TE information is the data relating to nodes and TE links that is 29 used in the process of selecting a TE path. TE information is 30 usually only available within a network. We call such a zone of 31 visibility of TE information a domain. An example of a domain may be 32 an IGP area or an Autonomous System. 34 In order to determine the potential to establish a TE path through a 35 series of connected networks, it is necessary to have available a 36 certain amount of TE information about each network. This need not 37 be the full set of TE information available within each network, but 38 does need to express the potential of providing TE connectivity. This 39 subset of TE information is called TE reachability information. 41 This document sets out the problem statement and architecture for the 42 exchange of TE information between interconnected TE networks in 43 support of end-to-end TE path establishment. For reasons that are 44 explained in the document, this work is limited to simple TE 45 constraints and information that determine TE reachability. 47 Status of This Memo 49 This Internet-Draft is submitted in full conformance with the 50 provisions of BCP 78 and BCP 79. 52 Internet-Drafts are working documents of the Internet Engineering 53 Task Force (IETF). Note that other groups may also distribute 54 working documents as Internet-Drafts. The list of current Internet- 55 Drafts is at http://datatracker.ietf.org/drafts/current/.
57 Internet-Drafts are draft documents valid for a maximum of six months 58 and may be updated, replaced, or obsoleted by other documents at any 59 time. It is inappropriate to use Internet-Drafts as reference 60 material or to cite them other than as "work in progress." 62 Copyright Notice 64 Copyright (c) 2015 IETF Trust and the persons identified as the 65 document authors. All rights reserved. 67 This document is subject to BCP 78 and the IETF Trust's Legal 68 Provisions Relating to IETF Documents 69 (http://trustee.ietf.org/license-info) in effect on the date of 70 publication of this document. Please review these documents 71 carefully, as they describe your rights and restrictions with respect 72 to this document. Code Components extracted from this document must 73 include Simplified BSD License text as described in Section 4.e of 74 the Trust Legal Provisions and are provided without warranty as 75 described in the Simplified BSD License. 77 Table of Contents 79 1. Introduction ................................................. 5 80 1.1. Terminology ................................................ 6 81 1.1.1. TE Paths and TE Connections .............................. 6 82 1.1.2. TE Metrics and TE Attributes ............................. 6 83 1.1.3. TE Reachability .......................................... 6 84 1.1.4. Domain ................................................... 7 85 1.1.5. Aggregation .............................................. 7 86 1.1.6. Abstraction .............................................. 7 87 1.1.7. Abstract Link ............................................ 7 88 1.1.8. Abstraction Layer Network ................................ 8 89 2. Overview of Use Cases ........................................ 8 90 2.1. Peer Networks .............................................. 8 91 2.2. Client-Server Networks ..................................... 10 92 2.3. Dual-Homing ................................................ 12 93 2.4. 
Requesting Connectivity .................................... 13 94 2.4.1. Discovering Server Network Information ................... 15 95 3. Problem Statement ............................................ 15 96 3.1. Policy and Filters ......................................... 15 97 3.2. Confidentiality ............................................ 15 98 3.3. Information Overload ....................................... 17 99 3.4. Issues of Information Churn ................................ 17 100 3.5. Issues of Aggregation ...................................... 18 101 4. Architecture ................................................. 19 102 4.1. TE Reachability ............................................ 19 103 4.2. Abstraction not Aggregation ................................ 20 104 4.2.1. Abstract Links ........................................... 21 105 4.2.2. The Abstraction Layer Network ............................ 21 106 4.2.3. Abstraction in Client-Server Networks..................... 24 107 4.2.4. Abstraction in Peer Networks ............................. 29 108 4.3. Considerations for Dynamic Abstraction ..................... 32 109 4.4. Requirements for Advertising Links and Nodes ............... 32 110 4.5. Addressing Considerations .................................. 33 111 5. Building on Existing Protocols ............................... 33 112 5.1. BGP-LS ..................................................... 34 113 5.2. IGPs ....................................................... 34 114 5.3. RSVP-TE .................................................... 34 115 5.4. Notes on a Solution ........................................ 35 116 6. Applicability to Optical Domains and Networks ................. 36 117 7. Modeling the User-to-Network Interface ....................... 40 118 8. Abstraction in L3VPN Multi-AS Environments ................... 42 119 9. Scoping Future Work .......................................... 43 120 9.1. 
Not Solving the Internet ................................... 43 121 9.2. Working With "Related" Domains ............................. 43 122 9.3. Not Finding Optimal Paths in All Situations ................ 44 123 9.4. Sanity and Scaling ......................................... 44 124 10. Manageability Considerations ................................ 44 125 10.1. Managing the Abstraction Layer Network .................... 44 126 10.2. Managing Interactions of Client and Abstraction Layer Networks 127 45 128 10.3. Managing Interactions of Abstraction Layer and Server Networks 129 46 130 11. IANA Considerations ......................................... 47 131 12. Security Considerations ..................................... 47 132 13. Acknowledgements ............................................ 47 133 14. References .................................................. 48 134 14.1. Informative References .................................... 48 135 Authors' Addresses ............................................... 52 136 Contributors ..................................................... 52 137 A. Existing Work ................................................ 54 138 A.1. Per-Domain Path Computation ................................ 54 139 A.2. Crankback .................................................. 54 140 A.3. Path Computation Element ................................... 55 141 A.4. GMPLS UNI and Overlay Networks ............................. 57 142 A.5. Layer One VPN .............................................. 57 143 A.6. Policy and Link Advertisement .............................. 58 144 B. Additional Features .......................................... 59 145 B.1. Macro Shared Risk Link Groups .............................. 59 146 B.2. Mutual Exclusivity ......................................... 60 148 1. 
Introduction 150 Traffic Engineered (TE) systems such as MPLS-TE [RFC2702] and GMPLS 151 [RFC3945] offer a way to establish paths through a network in a 152 controlled way that reserves network resources on specified links. 153 TE paths are computed by examining the Traffic Engineering Database 154 (TED) and selecting a sequence of links and nodes that are capable of 155 meeting the requirements of the path to be established. The TED is 156 constructed from information distributed by the IGP running in the 157 network, for example OSPF-TE [RFC3630] or ISIS-TE [RFC5305]. 159 It is sometimes desirable to establish an end-to-end TE path that 160 crosses more than one network or administrative domain as described 161 in [RFC4105] and [RFC4216]. In these cases, the availability of TE 162 information is usually limited to within each network. Such networks 163 are often referred to as Domains [RFC4726] and we adopt that 164 definition in this document: viz. 166 For the purposes of this document, a domain is considered to be any 167 collection of network elements within a common sphere of address 168 management or path computational responsibility. Examples of such 169 domains include IGP areas and Autonomous Systems. 171 In order to determine the potential to establish a TE path through a 172 series of connected domains and to choose the appropriate domain 173 connection points through which to route a path, it is necessary to 174 have available a certain amount of TE information about each domain. 175 This need not be the full set of TE information available within each 176 domain, but does need to express the potential of providing TE 177 connectivity. This subset of TE information is called TE 178 reachability information. The TE reachability information can be 179 exchanged between domains based on the information gathered from the 180 local routing protocol, filtered by configured policy, or statically 181 configured. 
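As a rough illustration of the single-domain path computation described at the start of this section, the following Python sketch prunes links that cannot meet a bandwidth constraint and then selects the lowest TE-metric path over what remains. It is only a minimal sketch under stated assumptions: the TED contents, the attribute names ("metric", "bandwidth"), and the function name compute_te_path are hypothetical and are not taken from this document or from any protocol specification.

```python
import heapq

# Hypothetical TED for one domain: directed link -> TE attributes.
# Node names and values are invented for illustration only.
TED = {
    ("A", "B"): {"metric": 1, "bandwidth": 10},
    ("B", "D"): {"metric": 1, "bandwidth": 2},
    ("A", "C"): {"metric": 2, "bandwidth": 10},
    ("C", "D"): {"metric": 2, "bandwidth": 10},
}


def compute_te_path(ted, src, dst, min_bandwidth):
    """Return (total_metric, [nodes]) for the lowest-metric path whose
    links all offer at least min_bandwidth, or None if there is none."""
    # Step 1: prune links that cannot satisfy the bandwidth constraint.
    adjacency = {}
    for (u, v), attrs in ted.items():
        if attrs["bandwidth"] >= min_bandwidth:
            adjacency.setdefault(u, []).append((v, attrs["metric"]))

    # Step 2: shortest path on TE metric over the pruned topology
    # (Dijkstra's algorithm with a priority queue).
    queue = [(0, src, [src])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == dst:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbor, metric in adjacency.get(node, []):
            if neighbor not in visited:
                heapq.heappush(queue, (cost + metric, neighbor, path + [neighbor]))
    return None
```

With this toy TED, a request for 1 unit of bandwidth can use the path through B, a request for 5 units is forced through C, and a request for 20 units finds no path. The computation only works because the full TED is visible, which is exactly what is missing once the path must leave the domain.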
183 This document sets out the problem statement and architecture for the 184 exchange of TE information between interconnected TE domains in 185 support of end-to-end TE path establishment. The scope of this 186 document is limited to the simple TE constraints and information 187 (such as TE metrics, hop count, bandwidth, delay, shared risk) 188 necessary to determine TE reachability: discussion of multiple 189 additional constraints that might qualify the reachability can 190 significantly complicate aggregation of information and the stability 191 of the mechanism used to present potential connectivity as is 192 explained in the body of this document. 194 An Appendix to this document summarizes relevant existing 195 work that is used to route TE paths across multiple domains. 197 1.1. Terminology 199 This section introduces some key terms that need to be understood to 200 arrive at a common understanding of the problem space. Some of the 201 terms are defined in more detail in the sections that follow (in 202 which case forward pointers are provided) and some terms are taken 203 from definitions that already exist in other RFCs (in which case 204 references are given, but no apology is made for repeating or 205 summarizing the definitions here). 207 1.1.1. TE Paths and TE Connections 209 A TE connection is a Label Switched Path (LSP) through an MPLS-TE or 210 GMPLS network that directs traffic along a particular path (the TE 211 path) in order to provide a specific service such as bandwidth 212 guarantee, separation of traffic, or resilience between a well-known 213 pair of end points. 215 1.1.2. TE Metrics and TE Attributes 217 TE metrics and TE attributes are terms applied to parameters of links 218 (and possibly nodes) in a network that is traversed by TE 219 connections. The TE metrics and TE attributes are used by path 220 computation algorithms to select the TE paths that the TE connections 221 traverse.
Provisioning a TE connection through a network may result 222 in dynamic changes to the TE metrics and TE attributes of the links 223 and nodes in the network. 225 These terms are also sometimes used to describe the end-to-end 226 characteristics of a TE connection and can be derived according to a 227 formula from the metrics and attributes of the links and nodes that 228 the TE connection traverses. Thus, for example, the end-to-end delay 229 for a TE connection is usually considered to be the sum of the delay 230 on each link that the connection traverses. 232 1.1.3. TE Reachability 234 In an IP network, reachability is the ability to deliver a packet to 235 a specific address or prefix. That is, the existence of an IP path 236 to that address or prefix. TE reachability is the ability to reach a 237 specific address along a TE path. More specifically, it is the 238 ability to establish a TE connection in an MPLS-TE or GMPLS sense. 239 Thus we talk about TE reachability as the potential of providing TE 240 connectivity. 242 TE reachability may be unqualified (there is a TE path, but no 243 information about available resources or other constraints is 244 supplied) which is helpful especially in determining a path to a 245 destination that lies in an unknown domain, or may be qualified by TE 246 attributes and TE metrics such as hop count, available bandwidth, 247 delay, shared risk, etc. 249 1.1.4. Domain 251 As defined in [RFC4726], a domain is any collection of network 252 elements within a common sphere of address management or path 253 computational responsibility. Examples of such domains include 254 Interior Gateway Protocol (IGP) areas and Autonomous Systems (ASes). 256 1.1.5. Aggregation 258 The concept of aggregation is discussed in Section 3.5. In 259 aggregation, multiple network resources from a domain are represented 260 outside the domain as a single entity. 
Thus multiple links and nodes 261 forming a TE connection may be represented as a single link, or a 262 collection of nodes and links (perhaps the whole domain) may be 263 represented as a single node with its attachment links. 265 1.1.6. Abstraction 267 Section 4.2 introduces the concept of abstraction and distinguishes 268 it from aggregation. Abstraction may be viewed as "policy-based 269 aggregation" where the policies are applied to overcome the issues 270 with aggregation as identified in Section 3 of this document. 272 Abstraction is the process of applying policy to the available TE 273 information within a domain, to produce selective information that 274 represents the potential ability to connect across the domain. Thus, 275 abstraction does not necessarily offer all possible connectivity 276 options, but presents a general view of potential connectivity 277 according to the policies that determine how the domain's 278 administrator wants to allow the domain resources to be used. 280 1.1.7. Abstract Link 282 An abstract link is the representation of the characteristics of a 283 path between two nodes in a domain produced by abstraction. The 284 abstract link is advertised outside that domain as a TE link for use 285 in signaling in other domains. Thus, an abstract link represents 286 the potential to connect between a pair of nodes. 288 More details of abstract links are provided in Section 4.2.1. 290 1.1.8. Abstraction Layer Network 292 The abstraction layer network is introduced in Section 4.2.2. It may 293 be seen as a brokerage layer network between one or more server 294 networks and one or more client networks. The abstraction layer 295 network is the collection of abstract links that provide potential 296 connectivity across the server network(s) and on which path 297 computation can be performed to determine edge-to-edge paths that 298 provide connectivity as links in the client network.
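To make the abstract link of Section 1.1.7 more concrete, the following Python sketch summarizes a path of TE links inside a domain as one link that could be presented outside the domain. It is an illustrative sketch only. The summed delay follows the rule of thumb given in Section 1.1.2. Taking the minimum bandwidth along the path and summing the TE metric are assumptions made here for illustration, and the names TELink and abstract_link are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TELink:
    """A TE link with a few illustrative attributes (names are hypothetical)."""
    src: str
    dst: str
    delay_ms: float
    bandwidth: float
    te_metric: int


def abstract_link(path):
    """Summarize a contiguous path of TE links as a single abstract link.

    Delay is summed (Section 1.1.2). Bandwidth is the bottleneck value and
    the TE metric is summed: both are assumptions of this sketch.
    """
    if not path:
        raise ValueError("path must contain at least one link")
    for first, second in zip(path, path[1:]):
        if first.dst != second.src:
            raise ValueError("path links are not contiguous")
    return TELink(
        src=path[0].src,
        dst=path[-1].dst,
        delay_ms=sum(link.delay_ms for link in path),
        bandwidth=min(link.bandwidth for link in path),
        te_metric=sum(link.te_metric for link in path),
    )


# Hypothetical path across a server domain between edge nodes X1 and X2.
server_path = [
    TELink("X1", "P1", delay_ms=2.0, bandwidth=10.0, te_metric=5),
    TELink("P1", "P2", delay_ms=3.0, bandwidth=4.0, te_metric=5),
    TELink("P2", "X2", delay_ms=1.0, bandwidth=10.0, te_metric=5),
]
edge_to_edge = abstract_link(server_path)
```

The resulting link names only the two edge nodes, so the interior nodes P1 and P2 are not exposed. Which paths are summarized in this way, and which attributes are reported, is a matter of the domain's policy.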
300 In the simplest case, the abstraction layer network is just a set of 301 edge-to-edge connections (i.e., abstract links), but to make the use 302 of server resources more flexible, the abstract links might not all 303 extend from edge to edge, but might offer connectivity between server 304 nodes to form a more complex network. 306 2. Overview of Use Cases 308 2.1. Peer Networks 310 The peer network use case can be most simply illustrated by the 311 example in Figure 1. A TE path is required between the source (Src) 312 and destination (Dst), which are located in different domains. There 313 are two points of interconnection between the domains, and selecting 314 the wrong point of interconnection can lead to a sub-optimal path, or 315 even fail to make a path available. Note that peer networks are 316 assumed to have the same technology type: that is, the same 317 "switching capability" to use the term from GMPLS [RFC3945]. 319 For example, when Domain A attempts to select a path, it may 320 determine that adequate bandwidth is available from Src through both 321 interconnection points x1 and x2. It may pick the path through x1 322 for local policy reasons: perhaps the TE metric is smaller. However, 323 if there is no connectivity in Domain Z from x1 to Dst, the path 324 cannot be established. Techniques such as crankback (see Appendix 325 A.2) may be used to alleviate this situation, but do not lead to 326 rapid setup or guaranteed optimality. Furthermore, RSVP signaling 327 creates state in the network that is immediately removed by the 328 crankback procedure. Frequent events of such a kind impact 329 scalability in a non-deterministic manner. 331 -------------- -------------- 332 | Domain A | x1 | Domain Z | 333 | ----- +----+ ----- | 334 | | Src | +----+ | Dst | | 335 | ----- | x2 | ----- | 336 -------------- -------------- 338 Figure 1 : Peer Networks 340 There are countless more complicated examples of the problem of peer 341 networks.
Figure 2 shows the case where there is a simple mesh of 342 domains. Clearly, to find a TE path from Src to Dst, Domain A must 343 not select a path leaving through interconnect x1 since Domain B has 344 no connectivity to Domain Z. Furthermore, in deciding whether to 345 select interconnection x2 (through Domain C) or interconnection x3 346 through Domain D, Domain A must be sensitive to the TE connectivity 347 available through each of Domains C and D, as well as the TE 348 connectivity from each of interconnections x4 and x5 to Dst within 349 Domain Z. The problem may be further complicated when the source 350 domain does not know in which domain the destination node is located, 351 since the choice of a domain path clearly depends on the knowledge of 352 the destination domain: this issue is obviously mitigated in IP 353 networks by inter-domain routing [RFC4271]. 355 -------------- 356 | Domain B | 357 | | 358 | | 359 /-------------- 360 / 361 / 362 /x1 363 --------------/ -------------- 364 | Domain A | | Domain Z | 365 | | -------------- | | 366 | ----- | x2| Domain C | x4| ----- | 367 | | Src | +---+ +---+ | Dst | | 368 | ----- | | | | ----- | 369 | | -------------- | | 370 --------------\ /-------------- 371 \x3 / 372 \ / 373 \ /x5 374 \--------------/ 375 | Domain D | 376 | | 377 | | 378 -------------- 380 Figure 2 : Peer Networks in a Mesh 382 Of course, many network interconnection scenarios are going to be a 383 combination of the situations expressed in these two examples. There 384 may be a mesh of domains, and the domains may have multiple points of 385 interconnection. 387 2.2. Client-Server Networks 389 Two major classes of use case relate to the client-server 390 relationship between networks. These use cases have sometimes been 391 referred to as overlay networks.
In both cases, the client and 392 server network may have the same switching capability, or may be 393 built from nodes and links that have different technology types in 394 the client and server networks. 396 The first group of use cases, shown in Figure 3, occurs when domains 397 belonging to one network are connected by a domain belonging to 398 another network. In this scenario, once connectivity is formed 399 across the lower layer network, the domains of the upper layer 400 network can be merged into a single domain by running IGP adjacencies 401 and by treating the server layer connectivity as links in the higher 402 layer network. The TE relationship between the domains (higher and 403 lower layer) in this case is reduced to determining what server layer 404 connectivity to establish, how to trigger it, how to route it in the 405 server layer, and what resources and capacity to assign within the 406 server layer. As the demands in the higher layer network vary, the 407 connectivity in the server layer may need to be modified. Section 408 2.4 explains in a little more detail how connectivity may be 409 requested. 411 -------------- -------------- 412 | Domain A | | Domain Z | 413 | | | | 414 | ----- | | ----- | 415 | | Src | | | | Dst | | 416 | ----- | | ----- | 417 | | | | 418 --------------\ /-------------- 419 \x1 x2/ 420 \ / 421 \ / 422 \---------------/ 423 | Server Domain | 424 | | 425 | | 426 --------------- 428 Figure 3 : Client-Server Networks 430 The second class of use case of client-server networking is for 431 Virtual Private Networks (VPNs). In this case, as opposed to the 432 former one, it is assumed that the client network has a different 433 address space than that of the server layer where non-overlapping IP 434 addresses between the client and the server networks cannot be 435 guaranteed. A simple example is shown in Figure 4. The VPN sites 436 comprise a set of domains that are interconnected over a core domain, 437 the provider network. 
439 -------------- -------------- 440 | Domain A | | Domain Z | 441 | (VPN site) | | (VPN site) | 442 | | | | 443 | ----- | | ----- | 444 | | Src | | | | Dst | | 445 | ----- | | ----- | 446 | | | | 447 --------------\ /-------------- 448 \x1 x2/ 449 \ / 450 \ / 451 \---------------/ 452 | Core Domain | 453 | | 454 | | 455 /---------------\ 456 / \ 457 / \ 458 /x3 x4\ 459 --------------/ \-------------- 460 | Domain B | | Domain C | 461 | (VPN site) | | (VPN site) | 462 | | | | 463 | | | | 464 -------------- -------------- 466 Figure 4 : A Virtual Private Network 468 Note that in the use cases shown in Figures 3 and 4 the client layer 469 domains may (and, in fact, probably do) operate as a single connected 470 network. 472 Both use cases in this section become "more interesting" when 473 combined with the use case in Section 2.1. That is, when the 474 connectivity between higher layer domains or VPN sites is provided 475 by a sequence or mesh of lower layer domains. Figure 5 shows how 476 this might look in the case of a VPN. 478 ------------ ------------ 479 | Domain A | | Domain Z | 480 | (VPN site) | | (VPN site) | 481 | ----- | | ----- | 482 | | Src | | | | Dst | | 483 | ----- | | ----- | 484 | | | | 485 ------------\ /------------ 486 \x1 x2/ 487 \ / 488 \ / 489 \---------- ----------/ 490 | Domain X |x5 | Domain Y | 491 | (core) +---+ (core) | 492 | | | | 493 | +---+ | 494 | |x6 | | 495 /---------- ----------\ 496 / \ 497 / \ 498 /x3 x4\ 499 ------------/ \------------ 500 | Domain B | | Domain C | 501 | (VPN site) | | (VPN site) | 502 | | | | 503 ------------ ------------ 505 Figure 5 : A VPN Supported Over Multiple Server Domains 507 2.3. Dual-Homing 509 A further complication may be added to the client-server relationship 510 described in Section 2.2 by considering what happens when a client 511 domain is attached to more than one server domain, or has two points 512 of attachment to a server domain. Figure 6 shows an example of this 513 for a VPN. 
515 ------------ 516 | Domain A | 517 | (VPN site) | 518 ------------ | ----- | 519 | Domain B | | | Src | | 520 | (VPN site) | | ----- | 521 | | | | 522 ------------\ -+--------+- 523 \x1 | | 524 \ x2| |x3 525 \ | | ------------ 526 \--------+- -+-------- | Domain Z | 527 | Domain X | x8 | Domain Y | x4 | (VPN site) | 528 | (core) +----+ (core) +----+ ----- | 529 | | | | | | Dst | | 530 | +----+ +----+ ----- | 531 | | x9 | | x5 | | 532 /---------- ----------\ ------------ 533 / \ 534 / \ 535 /x6 x7\ 536 ------------/ \------------ 537 | Domain C | | Domain D | 538 | (VPN site) | | (VPN site) | 539 | | | | 540 ------------ ------------ 542 Figure 6 : Dual-Homing in a Virtual Private Network 544 2.4. Requesting Connectivity 546 This relationship between domains can be entirely under the control 547 of management processes, dynamically triggered by the client network, 548 or some hybrid of these cases. In the management case, the server 549 network may be requested to establish a set of LSPs to provide client 550 layer connectivity. In the dynamic case, the client may make a 551 request to the server network exerting a range of controls over the 552 paths selected in the server network. This range extends from no 553 control (i.e., a simple request for connectivity), through a set of 554 constraints (such as latency, path protection, etc.), up to and 555 including full control of the path and resources used in the server 556 network (i.e., the use of explicit paths with label subobjects). 558 There are various models by which a server network can be requested 559 to set up the connections that support a service provided to the 560 client network. These requests may come from management systems, 561 directly from the client network control plane, or through an 562 intermediary broker such as the Virtual Network Topology Manager 563 (VNTM) [RFC5623]. 565 The trigger that causes the request to the server layer is also 566 flexible. 
It could be that the client layer discovers a pressing 567 need for server layer resources (such as the desire to provision an 568 end-to-end connection in the client layer, or severe congestion on 569 a specific path), or it might be that a planning application has 570 considered how best to optimize traffic in the client network or 571 how to handle a predicted traffic demand. 573 In all cases, the relationship between client and server networks is 574 subject to policy so that server resources are under the 575 administrative control of the operator of the server layer network 576 and are only used to support a client layer network in ways that the 577 server layer operator approves. 579 As just noted, connectivity requests issued to a server network may 580 include varying degrees of constraint upon the choice of path that 581 the server network can implement. 583 o Basic Provisioning is a simple request for connectivity. The only 584 constraints are the end points of the connection and the capacity 585 (bandwidth) that the connection will support for the client layer. 586 In the case of some server networks, even the bandwidth component 587 of a basic provisioning request is superfluous because the server 588 layer has no facility to vary bandwidth, but can offer connectivity 589 only at a default capacity. 591 o Basic Provisioning with Optimization is a service request that 592 indicates one or more metrics that the server layer must optimize 593 in its selection of a path. Metrics may be hop count, path length, 594 summed TE metric, jitter, delay, or any number of technology- 595 specific constraints. 597 o Basic Provisioning with Optimization and Constraints enhances the 598 optimization process to apply absolute constraints to functions of 599 the path metrics. For example, a connection may be requested that 600 optimizes for the shortest path, but in any case requests that the 601 end-to-end delay be less than a certain value.
Equally, 602 optimization may be expressed in terms of the impact on the network. 603 For example, a service may be requested in order to leave maximal 604 flexibility to satisfy future service requests. 606 o Fate Diversity requests ask for the server layer to provide a path 607 that does not use any network resources (usually links and nodes) 608 that share fate (i.e., can fail as the result of a single event) as 609 the resources used by another connection. This allows the client 610 layer to construct protection services over the server layer 611 network, for example by establishing links that are known to be 612 fate diverse. The connections that have diverse paths need not 613 share end points. 615 o Provisioning with Fate Sharing is the exact opposite of Fate 616 Diversity. In this case two or more connections are requested to 617 follow the same path in the server network. This may be requested, 618 for example, to create a bundled or aggregated link in the client 619 layer where each component of the client layer composite link is 620 required to have the same server layer properties (metrics, delay, 621 etc.) and the same failure characteristics. 623 o Concurrent Provisioning enables the inter-related connection 624 requests described in the previous two bullets to be enacted 625 through a single, compound service request. 627 o Service Resilience requests the server layer to provide 628 connectivity for which the server layer takes responsibility to 629 recover from faults. The resilience may be achieved through the 630 use of link-level protection, segment protection, end-to-end 631 protection, or recovery mechanisms. 633 2.4.1. Discovering Server Network Information 635 Although the topology and resource availability information of a 636 server network may be hidden from the client network, the service 637 request interface may support features that report details about the 638 services and potential services that the server network supports.
640 o Reporting of path details, service parameters, and issues such as 641 path diversity of LSPs that support deployed services allows the 642 client network to understand to what extent its requests were 643 satisfied. This is particularly important when the requests were 644 made as "best effort". 646 o A server network may support requests of the form "if I was to ask 647 you for this service, would you be able to provide it?" That is, 648 a service request that does everything except actually provision 649 the service. 651 3. Problem Statement 653 The problem statement presented in this section is as much about the 654 issues that may arise in any solution (and so have to be avoided) 655 and the features that are desirable within a solution, as it is about 656 the actual problem to be solved. 658 The problem can be stated very simply and with reference to the use 659 cases presented in the previous section. 661 A mechanism is required that allows TE-path computation in one 662 domain to make informed choices about the TE-capabilities and exit 663 points from the domain when signaling an end-to-end TE path that 664 will extend across multiple domains. 666 Thus, the problem is one of information collection and presentation, 667 not about signaling. Indeed, the existing signaling mechanisms for 668 TE LSP establishment are likely to prove adequate [RFC4726] with the 669 possibility of minor extensions. Similarly, TE information may 670 currently be distributed in a domain by TE extensions to one of the 671 two IGPs as described in OSPF-TE [RFC3630] and ISIS-TE [RFC5305], 672 and TE information may be exported from a domain (for example, 673 northbound) using link state extensions to BGP [I-D.ietf-idr-ls- 674 distribution]. 676 An interesting annex to the problem is how the path is made available 677 for use. 
For example, in the case of a client-server network, the 678 path established in the server network needs to be made available as 679 a TE link to provide connectivity in the client network. 681 3.1. Policy and Filters 683 A solution must be amenable to the application of policy and filters. 684 That is, the operator of a domain that is sharing information with 685 another domain must be able to apply controls to what information is 686 shared. Furthermore, the operator of a domain that has information 687 shared with it must be able to apply policies and filters to the 688 received information. 690 Additionally, the path computation within a domain must be able to 691 weight the information received from other domains according to local 692 policy such that the resultant computed path meets the local 693 operator's needs and policies rather than those of the operators of 694 other domains. 696 3.2. Confidentiality 698 A feature of the policy described in Section 3.1 is that an operator 699 of a domain may desire to keep confidential the details about its 700 internal network topology and loading. This information could be 701 construed as commercially sensitive. 703 Although it is possible that TE information exchange will take place 704 only between parties that have significant trust, there are also use 705 cases (such as the VPN supported over multiple server domains 706 described in Section 2.4) where information will be shared between 707 domains that have a commercial relationship, but a low level of 708 trust. 710 Thus, it must be possible for a domain to limit the information shared 711 to just that which the computing domain needs to know, with the 712 understanding that the less information that is made available, the more 713 likely it is that the result will be a less optimal path and/or more 714 crankback events. 716 3.3.
Information Overload 718 One reason that networks are partitioned into separate domains is to 719 reduce the set of information that any one router has to handle. 720 This also applies to the volume of information that routing protocols 721 have to distribute. 723 Over the years routers have become more sophisticated with greater 724 processing capabilities and more storage, the control channels on 725 which routing messages are exchanged have become higher capacity, and 726 the routing protocols (and their implementations) have become more 727 robust. Thus, some of the arguments in favor of dividing a network 728 into domains may have been reduced. Conversely, however, the size of 729 networks continues to grow dramatically with a consequent increase in 730 the total amount of routing-related information available. 731 Additionally, in this case, the problem space spans two or more 732 networks. 734 Any solution to the problems voiced in this document must be aware of 735 the issues of information overload. If the solution was to simply 736 share all TE information between all domains in the network, the 737 effect from the point of view of the information load would be to 738 create one single flat network domain. Thus the solution must 739 deliver enough information to make the computation practical (i.e., 740 to solve the problem), but not so much as to overload the receiving 741 domain. Furthermore, the solution cannot simply rely on the policies 742 and filters described in Section 3.1 because such filters might not 743 always be enabled. 745 3.4. Issues of Information Churn 747 As LSPs are set up and torn down, the available TE resources on links 748 in the network change. In order to reliably compute a TE path 749 through a network, the computation point must have an up-to-date view 750 of the available TE resources. However, collecting this information 751 may result in considerable load on the distribution protocol and 752 churn in the stored information. 
In order to deal with this problem 753 even in a single domain, updates are sent at periodic intervals or 754 whenever there is a significant change in resources, whichever 755 happens first. 757 Consider, for example, that a TE LSP may traverse ten links in a 758 network. When the LSP is set up or torn down, the resources 759 available on each link will change resulting in a new advertisement 760 of the link's capabilities and capacity. If the arrival rate of new 761 LSPs is relatively fast, and the hold times relatively short, the 762 network may be in a constant state of flux. Note that the 763 problem here is not limited to churn within a single domain, since 764 the information shared between domains will also be changing. 765 Furthermore, the information that one domain needs to share with 766 another may change as the result of LSPs that are contained within or 767 cross the first domain but which are of no direct relevance to the 768 domain receiving the TE information. 770 In packet networks, where the capacity of an LSP is often a small 771 fraction of the resources available on any link, this issue is 772 partially addressed by the advertising routers. They can apply a 773 threshold so that they do not bother to update the advertisement of 774 available resources on a link if the change is less than a configured 775 percentage of the total (or alternatively, the remaining) resources. 776 The updated information in that case will be disseminated based on an 777 update interval rather than a resource change event. 779 In non-packet networks, where link resources are physical switching 780 resources (such as timeslots or wavelengths) the capacity of an LSP 781 may more frequently be a significant percentage of the available link 782 resources. 
Furthermore, in some switching environments, it is 783 necessary to achieve end-to-end resource continuity (such as using 784 the same wavelength on the whole length of an LSP), so it is far more 785 desirable to keep the TE information held at the computation points 786 up-to-date. Fortunately, non-packet networks tend to be quite a bit 787 smaller than packet networks, the arrival rates of non-packet LSPs 788 are much lower, and the hold times considerably longer. Thus the 789 information churn may be sustainable. 791 3.5. Issues of Aggregation 793 One possible solution to the issues raised in other sub-sections of 794 this section is to aggregate the TE information shared between 795 domains. Two aggregation mechanisms are often considered: 797 - Virtual node model. In this view, the domain is aggregated as if 798 it was a single node (or router / switch). Its links to other 799 domains are presented as real TE links, but the model assumes that 800 any LSP entering the virtual node through a link can be routed to 801 leave the virtual node through any other link (although recent work 802 on "limited cross-connect switches" may help with this problem 803 [RFC7579]). 805 - Virtual link model. In this model, the domain is reduced to a set 806 of edge-to-edge TE links. Thus, when computing a path for an LSP 807 that crosses the domain, a computation point can see which domain 808 entry points can be connected to which other and with what TE 809 attributes. 811 It is of the nature of aggregation that information is removed from 812 the system. This can cause inaccuracies and failed path computation. 813 For example, in the virtual node model there might not actually be a 814 TE path available between a pair of domain entry points, but the 815 model lacks the sophistication to represent this "limited cross- 816 connect capability" within the virtual node. 
On the other hand, in 817 the virtual link model it may prove very hard to aggregate multiple 818 link characteristics: for example, there may be one path available 819 with high bandwidth, and another with low delay, but this does not 820 mean that the connectivity should be assumed or advertised as having 821 both high bandwidth and low delay. 823 The trick to this multidimensional problem, therefore, is to 824 aggregate in a way that retains as much useful information as 825 possible while removing the data that is not needed. An important 826 part of this trick is a clear understanding of what information is 827 actually needed. 829 It should also be noted in the context of Section 3.4 that changes in 830 the information within a domain may have a bearing on what aggregated 831 data is shared with another domain. Thus, while the data shared is 832 reduced, the aggregation algorithm (operating on the routers 833 responsible for sharing information) may be heavily exercised. 835 4. Architecture 837 4.1. TE Reachability 839 As described in Section 1.1, TE reachability is the ability to reach 840 a specific address along a TE path. The knowledge of TE reachability 841 enables an end-to-end TE path to be computed. 843 In a single network, TE reachability is derived from the Traffic 844 Engineering Database (TED), which is the collection of all TE 845 information about all TE links in the network. The TED is usually 846 built from the data exchanged by the IGP, although it can be 847 supplemented by configuration and inventory details, especially in 848 transport networks. 850 In multi-network scenarios, TE reachability information can be 851 described as "You can get from node X to node Y with the following 852 TE attributes." For transit cases, nodes X and Y will be edge nodes 853 of the transit network, but it is also important to consider the 854 information about the TE connectivity between an edge node and a 855 specific destination node.
TE reachability may be qualified by TE 856 attributes such as TE metrics, hop count, available bandwidth, delay, 857 shared risk, etc. 859 TE reachability information can be exchanged between networks so that 860 nodes in one network can determine whether they can establish TE 861 paths across or into another network. Such exchanges are subject to 862 a range of policies imposed by the advertiser (for security and 863 administrative control) and by the receiver (for scalability and 864 stability). 866 4.2. Abstraction not Aggregation 868 Aggregation is the process of synthesizing from available 869 information. Thus, the virtual node and virtual link models 870 described in Section 3.5 rely on processing the information available 871 within a network to produce the aggregate representations of links 872 and nodes that are presented to the consumer. As described in 873 Section 3, dynamic aggregation is subject to a number of pitfalls. 875 In order to distinguish the architecture described in this document 876 from the previous work on aggregation, we use the term "abstraction" 877 in this document. The process of abstraction is one of applying 878 policy to the available TE information within a domain, to produce 879 selective information that represents the potential ability to 880 connect across the domain. 882 Abstraction does not offer all possible connectivity options (refer 883 to Section 3.5), but does present a general view of potential 884 connectivity. Abstraction may have a dynamic element, but is not 885 intended to keep pace with the changes in TE attribute availability 886 within the network. 888 Thus, when relying on an abstraction to compute an end-to-end path, 889 the process might not deliver a usable path. That is, there is no 890 actual guarantee that the abstractions are current or feasible. 892 While abstraction uses available TE information, it is subject to 893 policy and management choices. 
Thus, not all potential connectivity 894 will be advertised to each client. The filters may depend on 895 commercial relationships, the risk of disclosing confidential 896 information, and concerns about what use is made of the connectivity 897 that is offered. 899 4.2.1. Abstract Links 901 An abstract link is a measure of the potential to connect a pair of 902 points with certain TE parameters. That is, it is a path and its 903 characteristics in the server network. An abstract link represents 904 the possibility of setting up an LSP, and LSPs may be set up over the 905 abstract link. 907 When looking at a network such as that in Figure 7, the link from CN1 908 to CN4 may be an abstract link. It is easy to advertise it as a link 909 by abstracting the TE information in the server network subject to 910 policy. 912 The path (i.e., the abstract link) represents the possibility of 913 establishing an LSP from client edge to client edge across the server 914 network. There is not necessarily a one-to-one relationship between 915 abstract link and LSP because more than one LSP could be set up over 916 the path. 918 Since the client nodes do not have visibility into the core network, 919 they must rely on abstraction information delivered to them by the 920 core network. That is, the core network will report on the potential 921 for connectivity. 923 4.2.2. The Abstraction Layer Network 925 Figure 7 introduces the abstraction layer network. This construct 926 separates the client layer resources (nodes C1, C2, C3, and C4, and 927 the corresponding links), and the server layer resources (nodes CN1, 928 CN2, CN3, and CN4 and the corresponding links). Additionally, the 929 architecture introduces an intermediary layer called the abstraction 930 layer. The abstraction layer contains the client layer edge nodes 931 (C2 and C3), the server layer edge nodes (CN1 and CN4), the client- 932 server links (C2-CN1 and CN4-C3) and the abstract link CN1-CN4. 
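The composition of the abstraction layer network described above can be sketched as a small graph. The following Python is only an illustrative sketch and not part of the architecture: the names LINKS and find_path are hypothetical, and a breadth-first search stands in for whatever path computation a real deployment would use. It shows that, with only the client-server links and the abstract link CN1-CN4 visible, a path between the client edge nodes C2 and C3 can be found across the abstraction layer.

```python
from collections import deque

# Links visible in the abstraction layer network of Figure 7.
# "direct" marks a client-server link; "abstract" marks the abstract link.
LINKS = {
    ("C2", "CN1"): "direct",
    ("CN1", "CN4"): "abstract",
    ("CN4", "C3"): "direct",
}


def find_path(links, src, dst):
    """Return a fewest-hop path from src to dst as a list, or None."""
    # Treat every link as bidirectional for this illustration.
    neighbors = {}
    for a, b in links:
        neighbors.setdefault(a, []).append(b)
        neighbors.setdefault(b, []).append(a)
    queue = deque([[src]])
    seen = {src}
    while queue:
        path = queue.popleft()
        if path[-1] == dst:
            return path
        for nxt in neighbors.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


path = find_path(LINKS, "C2", "C3")
print("-".join(path))  # prints C2-CN1-CN4-C3
```

Nodes that are not part of the abstraction layer (C1, for example) are simply unreachable in this view, so find_path returns None for them.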
934 The client layer network is able to operate as normal. Connectivity 935 across the network can either be found or not found based on links 936 that appear in the client layer TED. If connectivity cannot be 937 found, end-to-end LSPs cannot be set up. This failure may be 938 reported but no dynamic action is taken by the client layer. 940 The server network layer also operates as normal. LSPs across the 941 server layer between client edges are set up in response to 942 management commands or in response to signaling requests. 944 The abstraction layer consists of the physical links between the 945 two networks, and also the abstract links. The abstract links are 946 created by the server network according to local policy and represent 947 the potential connectivity that could be created across the server 948 network and which the server network is willing to make available for 949 use by the client network. Thus, in this example, the diameter of 950 the abstraction layer network is only three hops, but an instance of 951 an IGP could easily be run so that all nodes participating in the 952 abstraction layer (and in particular the client network edge nodes) 953 can see the TE connectivity in the layer. 955 -- -- -- -- 956 |C1|--|C2| |C3|--|C4| Client Network 957 -- | | | | -- 958 | | | | . . . . . . . . . . . 959 | | | | 960 | | | | 961 | | --- --- | | Abstraction 962 | |---|CN1|================|CN4|---| | Layer Network 963 -- | | | | -- 964 | | | | . . . . . . . . . . . . . . 965 | | | | 966 | | | | 967 | | --- --- | | Server Network 968 | |--|CN2|--|CN3|--| | 969 --- --- --- --- 971 Key 972 --- Direct connection between two nodes 973 === Abstract link 975 Figure 7 : Architecture for Abstraction Layer Network 977 When the client layer needs additional connectivity it can make a 978 request to the abstraction layer network. For example, the operator 979 of the client network may want to create a link from C2 to C3. 
The 980 abstraction layer can see the potential path C2-CN1-CN4-C3 and can 981 set up an LSP C2-CN1-CN4-C3 across the server network and make the 982 LSP available as a link in the client network. 984 Sections 4.2.3 and 4.2.4 show how this model is used to satisfy the 985 requirements for connectivity in client-server networks and in peer 986 networks. 988 4.2.2.1. Nodes in the Abstraction Layer Network 990 Figure 7 shows a very simplified network diagram and the reader would 991 be forgiven for thinking that only client network edge nodes and 992 server network edge nodes may appear in the abstraction layer 993 network. But this is not the case: other nodes from the server 994 network may be present. This allows the abstraction layer network 995 to be more complex than a full mesh with access spokes. 997 Thus, as shown in Figure 8, a transit node in the server network 998 (here the node is CN3) can be exposed as a node in the abstraction 999 layer network with abstract links connecting it to other nodes in 1000 the abstraction layer network. Of course, in the network shown in 1001 Figure 8, there is little if any value in exposing CN3, but if it 1002 had other abstract links to other nodes in the abstraction layer 1003 network and/or direct connections to client network nodes, then the 1004 resulting network would be richer. 1006 -- -- -- -- Client 1007 |C1|--|C2| |C3|--|C4| Network 1008 -- | | | | -- 1009 | | | | . . . . . . . . . 1010 | | | | 1011 | | | | 1012 | | --- --- --- | | Abstraction 1013 | |--|CN1|========|CN3|========|CN5|--| | Layer Network 1014 -- | | | | | | -- 1015 | | | | | | . . . . . . . . . . . . 
1016 | | | | | | 1017 | | | | | | Server 1018 | | --- | | --- | | Network 1019 | |--|CN2|-| |-|CN4|--| | 1020 --- --- --- --- --- 1022 Figure 8 : Abstraction Layer Network with Additional Node 1024 It should be noted that the nodes included in the abstraction layer 1025 network in this way are not "abstract nodes" in the sense of a 1026 virtual node described in Section 3.5. While it is the case that 1027 the policy point responsible for advertising server network resources 1028 into the abstraction layer network could choose to advertise abstract 1029 nodes in place of real physical nodes, it is believed that doing so 1030 would introduce significant complexity in terms of: 1032 - Coordination between all of the external interfaces of the abstract 1033 node 1035 - Management of changes in the server network that lead to limited 1036 capabilities to reach (cross-connect) across the abstract node. It 1037 may be noted that recent work on limited cross-connect capabilities 1038 such as exist in asymmetrical switches could be used to represent 1039 the limitations in an abstract node [RFC7579], [RFC7580]. 1041 4.2.3. Abstraction in Client-Server Networks 1043 Figure 9 shows the basic architectural concepts for a client-server 1044 network. The client network nodes are C1, C2, CE1, CE2, C3, and C4. 1045 The core network nodes are CN1, CN2, CN3, and CN4. The interfaces 1046 CE1-CN1 and CE2-CN4 are the interfaces between the client and core 1047 networks. 1049 The technologies (switching capabilities) of the client and server 1050 networks may be the same or different. If they are different, the 1051 client layer traffic must be tunneled over a server layer LSP. If 1052 they are the same, the client LSP may be routed over the server layer 1053 links, tunneled over a server layer LSP, or constructed from the 1054 concatenation (stitching) of client layer and server layer LSP 1055 segments.
1057 : : 1058 Client Network : Core Network : Client Network 1059 : : 1060 -- -- --- --- -- -- 1061 |C1|--|C2|--|CE1|................................|CE2|--|C3|--|C4| 1062 -- -- | | --- --- | | -- -- 1063 | |===|CN1|================|CN4|===| | 1064 | |---| | | |---| | 1065 --- | | --- --- | | --- 1066 | |--|CN2|--|CN3|--| | 1067 --- --- --- --- 1069 Key 1070 --- Direct connection between two nodes 1071 ... CE-to-CE LSP tunnel 1072 === Potential path across the core (abstract link) 1074 Figure 9 : Architecture for Client-Server Network 1076 The objective is to be able to support an end-to-end connection, 1077 C1-to-C4, in the client network. This connection may support TE or 1078 normal IP forwarding. To achieve this, CE1 is to be connected to CE2 1079 by a link in the client layer. This enables the client network to 1080 view itself as connected and to select an end-to-end path. 1082 As shown in the figure, three abstraction layer links are formed: 1083 CE1-CN1, CN1-CN4, and CN4-CE2. A three-hop LSP is then established 1084 from CE1 to CE2 that can be presented as a link in the client layer. 1086 The practicalities of how the CE1-CE2 LSP is carried across the core 1087 LSP may depend on the switching and signaling options available in 1088 the core network. The LSP may be tunneled down the core LSP using 1089 the mechanisms of a hierarchical LSP [RFC4206], or the LSP segments 1090 CE1-CN1 and CN4-CE2 may be stitched to the core LSP as described in 1091 [RFC5150]. 1093 Section 4.2.2 has already introduced the concept of the abstraction 1094 layer network through an example of a simple layered network. But it 1095 may be helpful to expand on the example using a slightly more complex 1096 network. 1098 Figure 10 shows a multi-layer network comprising client nodes 1099 (labeled as Cn for n = 0 to 9) and server nodes (labeled as Sn for 1100 n = 1 to 9).
1102 -- -- 1103 |C3|---|C4| 1104 /-- --\ 1105 -- -- -- -- --/ \-- 1106 |C1|---|C2|---|S1|---|S2|----|S3| |C5| 1107 -- /-- --\ --\ --\ /-- 1108 / \-- \-- \-- --/ -- 1109 / |S4| |S5|----|S6|---|C6|---|C7| 1110 / /-- --\ /-- /-- -- 1111 --/ -- --/ -- \--/ --/ 1112 |C8|---|C9|---|S7|---|S8|----|S9|---|C0| 1113 -- -- -- -- -- -- 1115 Figure 10 : An example Multi-Layer Network 1117 If the network in Figure 10 is operated as separate client and server 1118 networks then the client layer topology will appear as shown in 1119 Figure 11. As can be clearly seen, the network is partitioned and 1120 there is no way to set up an LSP from a node on the left hand side 1121 (say C1) to a node on the right hand side (say C7). 1123 -- -- 1124 |C3|---|C4| 1125 -- --\ 1126 -- -- \-- 1127 |C1|---|C2| |C5| 1128 -- /-- /-- 1129 / --/ -- 1130 / |C6|---|C7| 1131 / /-- -- 1132 --/ -- --/ 1133 |C8|---|C9| |C0| 1134 -- -- -- 1136 Figure 11 : Client Layer Topology Showing Partitioned Network 1138 For reference, Figure 12 shows the corresponding server layer 1139 topology. 1141 -- -- -- 1142 |S1|---|S2|----|S3| 1143 --\ --\ --\ 1144 \-- \-- \-- 1145 |S4| |S5|----|S6| 1146 /-- --\ /-- 1147 --/ -- \--/ 1148 |S7|---|S8|----|S9| 1149 -- -- -- 1151 Figure 12 : Server Layer Topology 1153 Operating on the TED for the server layer, a management entity or a 1154 software component may apply policy and consider what abstract links 1155 it might offer for use by the client layer. To do this it obviously 1156 needs to be aware of the connections between the layers (there is no 1157 point in offering an abstract link S2-S8 since this could not be of 1158 any use in this example). 1160 In our example, after consideration of which LSPs could be set up in 1161 the server layer, four abstract links are offered: S1-S3, S3-S6, 1162 S1-S9, and S7-S9. These abstract links are shown as double lines on 1163 the resulting topology of the abstraction layer network in Figure 13. 
1164 As can be seen, two of the links must share part of a path (S1-S9 1165 must share with either S1-S3 or with S7-S9). This could be achieved 1166 using distinct resources (for example, separate lambdas) where the 1167 paths are common, but it could also be done using resource sharing. 1169 That would mean that when both paths S1-S3 and S7-S9 carry client- 1170 edge to client-edge LSPs the resources on the path S1-S9 are used and 1171 might be depleted to the point that the path is resource constrained 1172 and cannot be used. 1174 -- 1175 |C3| 1176 /-- 1177 -- -- --/ 1178 |C2|---|S1|==========|S3| 1179 -- --\\ --\\ 1180 \\ \\ 1181 \\ \\-- -- 1182 \\ |S6|---|C6| 1183 \\ -- -- 1184 -- -- \\-- -- 1185 |C9|---|S7|=====|S9|---|C0| 1186 -- -- -- -- 1188 Figure 13 : Abstraction Layer Network with Abstract Links 1190 The separate IGP instance running in the abstraction layer network 1191 means that this topology is visible at the edge nodes (C2, C3, C6, 1192 C9, and C0) as well as at a PCE if one is present. 1194 Now the client layer is able to make requests to the abstraction 1195 layer network to provide connectivity. In our example, it requests 1196 that C2 is connected to C3 and that C2 is connected to C0. This 1197 results in several actions: 1199 1. The management component for the abstraction layer network asks 1200 its PCE to compute the paths necessary to make the connections. 1201 This yields C2-S1-S3-C3 and C2-S1-S9-C0. 1203 2. The management component for the abstraction layer network 1204 instructs C2 to start the signaling process for the new LSPs in 1205 the abstraction layer. 1207 3. C2 signals the LSPs for setup using the explicit routes 1208 C2-S1-S3-C3 and C2-S1-S9-C0. 1210 4. 
When the signaling messages reach S1 (in our example, both LSPs 1211 traverse S1) the server layer network may support them by a 1212 number of means including establishing server layer LSPs as 1213 tunnels depending on the mismatch of technologies between the 1214 client and server networks. For example, S1-S2-S3 and S1-S2-S5-S9 1215 might be traversed via an LSP tunnel, using LSPs stitched 1216 together, or simply by routing the client layer LSP through the 1217 server network. If server layer LSPs are needed, they can be 1218 signaled at this point. 1220 5. Once any server layer LSPs that are needed have been established, 1221 S1 can continue to signal the client-edge to client-edge LSP 1222 across the abstraction layer either using the server layer LSPs as 1223 tunnels or as stitching segments, or simply routing through the 1224 server layer network. 1226 -- -- 1227 |C3|-|C4| 1228 /-- --\ 1229 / \-- 1230 -- --/ |C5| 1231 |C1|---|C2| /-- 1232 -- /--\ --/ -- 1233 / \ |C6|---|C7| 1234 / \ /-- -- 1235 / \--/ 1236 --/ -- |C0| 1237 |C8|---|C9| -- 1238 -- -- 1240 Figure 14 : Connected Client Layer Network with Additional Links 1242 6. Finally, once the client-edge to client-edge LSPs have been set 1243 up, the client layer can be informed and can start to advertise 1244 the new TE links C2-C3 and C2-C0. The resulting client layer 1245 topology is shown in Figure 14. 1247 7. Now the client layer can compute an end-to-end path from C1 to C7. 1249 4.2.3.1 A Server with Multiple Clients 1251 A single server network may support multiple client networks. This 1252 is not an uncommon state of affairs, for example, when the server 1253 network provides connectivity for multiple customers. 1255 In this case, the abstraction provided by the server layer may vary 1256 considerably according to the policies and commercial relationships 1257 with each customer. This variance would lead to a separate 1258 abstraction layer network being maintained to support each client network.
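As a rough illustration of how per-customer policy could lead to separate abstraction layer networks, the sketch below filters one set of candidate abstract links differently for each client. Everything named here is hypothetical: the CANDIDATES list reuses the abstract link end points of Figure 13 with invented bandwidth values, and the POLICY table and abstraction_for function are assumptions made for illustration, not anything defined by this document.

```python
# Candidate abstract links the server network could offer (end points taken
# from Figure 13); the bandwidth values are invented for illustration.
CANDIDATES = [
    {"ends": ("S1", "S3"), "bandwidth": 10},
    {"ends": ("S3", "S6"), "bandwidth": 40},
    {"ends": ("S1", "S9"), "bandwidth": 10},
    {"ends": ("S7", "S9"), "bandwidth": 100},
]

# Hypothetical per-client policy: which server edge nodes the client may
# see, and the largest bandwidth the server is willing to advertise to it.
POLICY = {
    "client-a": {"visible": {"S1", "S3", "S6", "S7", "S9"}, "max_bw": 100},
    "client-b": {"visible": {"S1", "S3", "S9"}, "max_bw": 10},
}


def abstraction_for(client):
    """Apply the client's policy to the candidate abstract links."""
    rules = POLICY[client]
    offered = []
    for link in CANDIDATES:
        # Offer a link only if both of its end points are visible.
        if set(link["ends"]) <= rules["visible"]:
            offered.append({
                "ends": link["ends"],
                "bandwidth": min(link["bandwidth"], rules["max_bw"]),
            })
    return offered


for name in POLICY:
    print(name, [link["ends"] for link in abstraction_for(name)])
```

With these made-up policies, client-a is offered all four abstract links while client-b is offered only S1-S3 and S1-S9, so the two clients would see different abstraction layer networks.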
1260 On the other hand, it may be that multiple clients are subject to the 1261 same policies and the abstraction can be identical. In this case, a 1262 single abstraction layer network can support more than one client. 1264 The choices here are made as an operational issue by the server layer 1265 network. 1267 4.2.3.2 A Client with Multiple Servers 1269 A single client network may be supported by multiple server networks. 1270 The server networks may provide connectivity between different parts 1271 of the client network or may provide parallel (redundant) 1272 connectivity for the client network. 1274 In this case the abstraction layer network should contain the 1275 abstract links from all server networks so that it can make suitable 1276 computations and create the correct TE links in the client network. 1277 That is, the relationship between client network and abstraction 1278 layer network should be one-to-one. 1280 4.2.4. Abstraction in Peer Networks 1282 Figure 15 shows the basic architectural concepts for connecting 1283 across peer networks. Nodes from four networks are shown: A1 and A2 1284 come from one network; B1, B2, and B3 from another network; etc. The 1285 interfaces between the networks (sometimes known as External Network- 1286 to-Network Interfaces - ENNIs) are A2-B1, B3-C1, and C3-D1. 1288 The objective is to be able to support an end-to-end connection A1- 1289 to-D2. This connection is for TE connectivity. 1291 As shown in the figure, abstract links that span the transit networks 1292 are used to achieve the required connectivity. These links form the 1293 key building blocks of the end-to-end connectivity. An end-to-end 1294 LSP uses these links as part of its path. 
If the stitching 1295 capabilities of the networks are homogeneous then the end-to-end LSP 1297 : : : 1298 Network A : Network B : Network C : Network D 1299 : : : 1300 -- -- -- -- -- -- -- -- -- -- 1301 |A1|--|A2|---|B1|--|B2|--|B3|---|C1|--|C2|--|C3|---|D1|--|D2| 1302 -- -- | | -- | | | | -- | | -- -- 1303 | |========| | | |========| | 1304 -- -- -- -- 1306 Key 1307 --- Direct connection between two nodes 1308 === Abstract link across transit network 1310 Figure 15 : Architecture for Peering 1312 may simply traverse the path defined by the abstract links across the 1313 various peer networks or may utilize stitching of LSP segments that 1314 each traverse a network along the path of an abstract link. If the 1315 network switching technologies support or necessitate the use of LSP 1316 hierarchies, the end-to-end LSP may be tunneled across each network 1317 using hierarchical LSPs that each traverse a network along the 1318 path of an abstract link. 1320 Peer networks exist in many situations in the Internet. Packet 1321 networks may peer as IGP areas (levels) or as ASes. Transport 1322 networks (such as optical networks) may peer to provide 1323 concatenations of optical paths through single vendor environments 1324 (see Section 6). Figure 16 shows a simple example of three peer 1325 networks (A, B, and C) each comprising a few nodes. 1327 Network A : Network B : Network C 1328 : : 1329 -- -- -- : -- -- -- : -- -- 1330 |A1|---|A2|----|A3|---|B1|---|B2|---|B3|---|C1|---|C2| 1331 -- --\ /-- : -- /--\ -- : -- -- 1332 \--/ : / \ : 1333 |A4| : / \ : 1334 --\ : / \ : 1335 -- \-- : --/ \-- : -- -- 1336 |A5|---|A6|---|B4|----------|B6|---|C3|---|C4| 1337 -- -- : -- -- : -- -- 1338 : : 1339 : : 1341 Figure 16 : A Network Comprising Three Peer Networks 1343 As discussed in Section 2, peered networks do not share visibility of 1344 their topologies or TE capabilities for scaling and confidentiality 1345 reasons.
That means, in our example, that computing a path from A1 1346 to C4 can be impossible without the aid of cooperating PCEs or some 1347 form of crankback. 1349 But it is possible to produce abstract links for reachability across 1350 transit peer networks and to create an abstraction layer network. 1351 That network can be enhanced with specific reachability information 1352 if a destination network is partitioned as is the case with Network C 1353 in Figure 16. 1355 Suppose Network B decides to offer three abstract links B1-B3, B4-B3, 1356 and B4-B6. The abstraction layer network could then be constructed 1357 to look like the network in Figure 17. 1359 -- -- -- -- 1360 |A3|---|B1|====|B3|----|C1| 1361 -- -- //-- -- 1362 // 1363 // 1364 // 1365 -- --// -- -- 1366 |A6|---|B4|=====|B6|---|C3| 1367 -- -- -- -- 1369 Figure 17 : Abstraction Layer Network for the Peer Network Example 1371 Using a process similar to that described in Section 4.2.3, Network A 1372 can request connectivity to Network C and abstract links can be 1373 advertised that connect the edges of the two networks and that can be 1374 used to carry LSPs that traverse both networks. Furthermore, if 1375 Network C is partitioned, reachability information can be exchanged 1376 to allow Network A to select the correct abstract link as shown in 1377 Figure 18. 1379 Network A : Network C 1380 : 1381 -- -- -- : -- -- 1382 |A1|---|A2|----|A3|=========|C1|.....|C2| 1383 -- --\ /-- : -- -- 1384 \--/ : 1385 |A4| : 1386 --\ : 1387 -- \-- : -- -- 1388 |A5|---|A6|=========|C3|.....|C4| 1389 -- -- : -- -- 1391 Figure 18 : Tunnel Connections to Network C with TE Reachability 1393 Peer networking cases can be made far more complex by dual homing 1394 between network peering nodes (for example, A3 might connect to B1 1395 and B4 in Figure 17) and by the networks themselves being arranged in 1396 a mesh (for example, A6 might connect to B4 and C1 in Figure 17). 
   These additional complexities can be handled gracefully by the
   abstraction layer network model.

   Further examples of abstraction in peer networks can be found in
   Sections 6 and 8.

4.3. Considerations for Dynamic Abstraction

   It is possible to consider a highly dynamic system where the server
   network adaptively suggests new abstract links into the abstraction
   layer, and where the abstraction layer proactively deploys new
   client-edge to client-edge LSPs to provide new links in the client
   network. Such fluidity is, however, to be treated with caution,
   especially in the case of client-server networks of differing
   technologies where hierarchical server layer LSPs are used. There
   are two reasons for this: connections in some server networks have
   longer turn-up times, and the server networks are likely to be
   sparsely connected, with expensive physical resources deployed only
   where there is believed to be a need for them. More significantly,
   the complex commercial, policy, and administrative relationships
   that may exist between client and server network operators mean that
   stability is more likely to be the desired operational practice.

   Thus, proposals for fully automated multi-layer networks based on
   this architecture may be regarded as forward-looking topics for
   research both in terms of network stability and with regard to
   economic impact.

   However, some elements of automation should not be discarded. A
   server network may automatically apply policy to determine the best
   set of abstract links to offer and the most suitable way for the
   server network to support them. And a client network may dynamically
   observe congestion, lack of connectivity, or predicted changes in
   traffic demand, and may use this information to request additional
   links from the abstraction layer.
   And, once policies have been configured, the whole system should be
   able to operate independently of operator control (which is not to
   say that the operator will not have the option of exerting control
   at every step in the process).

4.4. Requirements for Advertising Links and Nodes

   The abstraction layer network is "just another network layer". The
   links and nodes in the network need to be advertised along with their
   associated TE information (metrics, bandwidth, etc.) so that the
   topology is disseminated and so that routing decisions can be made.

   This requires a routing protocol running between the nodes in the
   abstraction layer network. Note that this routing information
   exchange could be piggy-backed on an existing routing protocol
   instance, or use a new instance (or even a new protocol). Clearly,
   the information exchanged is only that which has been created as
   part of the abstraction function according to policy.

   It should be noted that in many cases the abstract link represents
   the potential for connectivity across the server network but that no
   such connectivity exists. In this case we may ponder how the routing
   protocol in the abstraction layer will advertise topology information
   for and over a link that has no underlying connectivity. In other
   words, there must be a communication channel between the abstraction
   layer nodes so that the routing protocol messages can flow. The
   answer is that control plane connectivity already exists in the
   server network and on the client-server edge links, and this can be
   used to carry the routing protocol messages for the abstraction layer
   network.
   The same consideration applies to the advertisement, in the client
   network, of the potential connectivity that the abstraction layer
   network can provide, although it may be more normal to establish
   that connectivity before advertising a link in the client network.

4.5. Addressing Considerations

   The network layers in this architecture should be able to operate
   with separate address spaces and these may overlap without any
   technical issues. That is, one address may mean one thing in the
   client network, yet the same address may have a different meaning in
   the abstraction layer network or the server network. In other words
   there is complete address separation between networks.

   However, this will require some care both because human operators may
   well become confused, and because mapping between address spaces is
   needed at the interfaces between the network layers. That mapping
   requires configuration. For example, when the server network
   announces an abstract link from A to B, the abstraction layer network
   must recognize that A and B are server network addresses and must map
   them to abstraction layer addresses (say P and Q) before including
   the link in its own topology. And similarly, when the abstraction
   layer network informs the client network that a new link is available
   from S to T, it must map those addresses from its own address space
   to that of the client network.

   This form of address mapping will become particularly important in
   cases where one abstraction layer network is constructed from
   connectivity in multiple server layer networks, or where one
   abstraction layer network provides connectivity for multiple client
   networks.

5. Building on Existing Protocols

   This section is not intended to prejudge a solutions framework or any
   applicability work.
   It does, however, briefly note existing protocols that could be
   examined for their applicability to realizing the model described in
   this document.

   The general principle of protocol re-use is preferred over the
   invention of new protocols or additional protocol extensions, and it
   would be advantageous to make use of an existing protocol that is
   commonly implemented on network nodes and is currently deployed, or
   to use existing computational elements such as Path Computation
   Elements (PCEs). This has many benefits in network stability, time
   to deployment, and operator training.

   It is recognized, however, that existing protocols are unlikely to be
   immediately suitable to this problem space without some protocol
   extensions. Extending protocols must be done with care and with
   consideration for the stability of existing deployments. In extreme
   cases, a new protocol can be preferable to a messy hack of an
   existing protocol.

5.1. BGP-LS

   BGP-LS is a set of extensions to BGP described in
   [I-D.ietf-idr-ls-distribution]. Its purpose is to announce topology
   information from one network to a "north-bound" consumer.
   Application of BGP-LS to date has focused on a mechanism to build a
   TED for a PCE. However, BGP's mechanisms would also serve well to
   advertise abstract links from a server network into the abstraction
   layer network, or to advertise potential connectivity from the
   abstraction layer network to the client network.

5.2. IGPs

   Both OSPF and IS-IS have been extended through a number of RFCs to
   advertise TE information. Additionally, both protocols are capable
   of running in a multi-instance mode either as ships that pass in the
   night (i.e., completely separate instances using different address
   spaces) or as dual instances on the same address space.
   This means that either IGP could probably be used as the routing
   protocol in the abstraction layer network.

5.3. RSVP-TE

   RSVP-TE signaling can be used to set up all traffic engineered LSPs
   demanded by this model without the need for any protocol extensions.

   If necessary, LSP hierarchy [RFC4206] or LSP stitching [RFC5150] can
   be used to carry LSPs over the server layer network, again without
   needing any protocol extensions.

   Furthermore, the procedures in [RFC6107] allow the dynamic signaling
   of the purpose of any LSP that is established. This means that
   when an LSP tunnel is set up, the two ends can coordinate into which
   routing protocol instance it should be advertised, and can also agree
   on the addressing to be used to identify the link that will be
   created.

5.4. Notes on a Solution

   This section is not intended to be prescriptive or dictate the
   protocol solutions that may be used to satisfy the architecture
   described in this document, but it does show how the existing
   protocols listed in the previous sections can be combined to provide
   a solution with only minor modifications.

   A server network can be operated using GMPLS routing and signaling
   protocols. Using information gathered from the routing protocol, a
   TED can be constructed containing resource availability information
   and Shared Risk Link Group (SRLG) details. A policy-based process
   can then determine which nodes and abstract links it wishes to
   advertise to form the abstraction layer network.

   The server network can now use BGP-LS to advertise a topology of
   links and nodes to form the abstraction layer network. This
   information would most likely be advertised from a single point of
   control that made all of the abstraction decisions, but the function
   could be distributed to multiple server network edge nodes.
   The information can be advertised by BGP-LS to multiple points within
   the abstraction layer (such as all client network edge nodes) or to a
   single controller.

   Multiple server networks may advertise information that is used to
   construct an abstraction layer network, and one server network may
   advertise different information in different instances of BGP-LS to
   form different abstraction layer networks. Furthermore, in the case
   of one controller constructing multiple abstraction layer networks,
   BGP-LS uses the route target mechanism defined in [RFC4364] to
   distinguish the different applications (effectively abstraction layer
   network VPNs) of the exported information.

   Extensions may be made to BGP-LS to allow advertisement of Macro
   Shared Risk Link Groups (MSRLGs) per Appendix B, mutually exclusive
   links, and to indicate whether the abstract link has been pre-
   established or not. Such extensions are valid options, but do not
   form a core component of this architecture.

   The abstraction layer network may operate under central control or
   use a distributed control plane. Since the links may be a mix of
   physical and abstract links, and since the nodes may have diverse
   cross-connect capabilities, it is most likely that a GMPLS routing
   protocol will be beneficial for collecting and correlating the
   routing information and for distributing updates. No special
   additional features are needed beyond adding those extra parameters
   just described for BGP-LS, but it should be noted that the control
   plane of the abstraction layer network must run in an out of band
   control network because the data-bearing links might not yet have
   been established via connections in the server layer network.

   The abstraction layer network is also able to determine potential
   connectivity from client network edge to client network edge.
   It will determine which client network links to create according to
   policy and subject to requests from the client network, and will
   take four steps:

   - First it will compute a path across the abstraction layer
     network.
   - Then, if the support of the abstract links requires the use of
     server layer LSPs for tunneling or stitching, and if those LSPs are
     not already established, it will ask the server layer to set them
     up.
   - Then, it will signal the client-edge to client-edge LSP.
   - Finally, the abstraction layer network will inform the client
     network of the existence of the new client network link.

   This last step can be achieved either by coordination of the end
   points of the LSPs that span the abstraction layer (these points are
   client network edge nodes) using mechanisms such as those described
   in [RFC6107], or using BGP-LS from a central controller.

   Once the client network edge nodes are aware of a new link, they will
   automatically advertise it using their routing protocol and it will
   become available for use by traffic in the client network.

   Sections 6, 7, and 8 discuss the applicability of this architecture
   to different network types and problem spaces, while Section 9 gives
   some advice about scoping future work. Section 10 on manageability
   considerations is particularly relevant in the context of this
   section because it contains a discussion of the policies and
   mechanisms for indicating connectivity and link availability between
   network layers in this architecture.

6. Applicability to Optical Domains and Networks

   Many optical networks are arranged as a set of small domains. Each
   domain is a cluster of nodes, usually from the same equipment vendor
   and with the same properties. The domain may be constructed as a
   mesh or a ring, or maybe as an interconnected set of rings.

   The network operator seeks to provide end-to-end connectivity across
   a network constructed from multiple domains, and so (of course) the
   domains are interconnected. In a network under management control
   such as through an Operations Support System (OSS), each domain is
   under the operational control of a Network Management System (NMS).

   In this way, an end-to-end path may be commissioned by the OSS
   instructing each NMS, and the NMSes setting up the path fragments
   across the domains.

   However, in a system that uses a control plane, there is a need for
   integration between the domains.

   Consider a simple domain, D1, as shown in Figure 19. In this case,
   the nodes A through F are arranged in a topological ring. Suppose
   that there is a control plane in use in this domain, and that OSPF is
   used as the TE routing protocol.

                         -----------------
                        |  D1             |
                        |      B---C      |
                        |     /     \     |
                        |    /       \    |
                        |   A         D   |
                        |    \       /    |
                        |     \     /     |
                        |      F---E      |
                        |                 |
                         -----------------

                  Figure 19 : A Simple Optical Domain

   Now consider that the operator's network is built from a mesh of such
   domains, D1 through D7, as shown in Figure 20.

                ------   ------   ------   ------
                |    |   |    |   |    |   |    |
                | D1 |---| D2 |---| D3 |---| D4 |
                |    |   |    |   |    |   |    |
                ------\  ------\  ------\  ------
                       \   |    \   |    \   |
                        \------  \------  \------
                         |    |   |    |   |    |
                         | D5 |---| D6 |---| D7 |
                         |    |   |    |   |    |
                         ------   ------   ------

              Figure 20 : A Mesh of Simple Optical Domains

   It is possible that these domains share a single, common instance of
   OSPF in which case there is nothing further to say because that OSPF
   instance will distribute sufficient information to build a single
   TED spanning the whole network, and an end-to-end path can be
   computed. A more likely scenario is that each domain is running its
   own OSPF instance.
   In this case, each instance is able to handle the peculiarities (or
   rather, advanced functions) of its vendor's equipment.

   The question now is how to combine the multiple sets of information
   distributed by the different OSPF instances. Three possible models
   suggest themselves based on pre-existing routing practices.

   o In the first model (the Area-Based model) each domain is treated as
     a separate OSPF area. The end-to-end path will be specified to
     traverse multiple areas, and each area will be left to determine
     the path across the nodes in the area. The feasibility of an end-
     to-end path (and, thus, the selection of the sequence of areas and
     their interconnections) can be derived using hierarchical PCE.

     This approach, however, fits poorly with established use of the
     OSPF area: in this form of optical network, the interconnection
     points between domains are likely to be links; and the mesh of
     domains is far more interconnected and unstructured than we are
     used to seeing in the normal area-based routing paradigm.

     Furthermore, while hierarchical PCE may be able to solve this type
     of network, the effort involved may be considerable for more than a
     small collection of domains.

   o Another approach (the AS-Based model) treats each domain as a
     separate Autonomous System (AS). The end-to-end path will be
     specified to traverse multiple ASes, and each AS will be left to
     determine the path across the AS.

     This model sits more comfortably with the established routing
     paradigm, but causes a massive escalation of ASes in the global
     Internet. It would, in practice, require that the operator use
     private AS numbers [RFC6996] of which there are plenty.

     Then, as suggested in the Area-Based model, hierarchical PCE
     could be used to determine the feasibility of an end-to-end path
     and to derive the sequence of domains and the points of
     interconnection to use. But, just as in that other model, the
     scalability of this model using a hierarchical PCE must be
     questioned given the sheer number of ASes and their
     interconnectivity.

     Furthermore, determining the mesh of domains (i.e., the inter-AS
     connections) conventionally requires the use of BGP as an inter-
     domain routing protocol. However, not only is BGP not normally
     available on optical equipment, but this approach indicates that
     the TE properties of the inter-domain links would need to be
     distributed and updated using BGP: something for which it is not
     well suited.

   o The third approach (the ASON model) follows the architectural
     model set out by the ITU-T [G.8080] and uses the routing protocol
     extensions described in [RFC6827]. In this model the concept of
     "levels" is introduced to OSPF. Referring back to Figure 20, each
     OSPF instance running in a domain would be construed as a "lower
     level" OSPF instance and would leak routes into a "higher level"
     instance of the protocol that runs across the whole network.

     This approach handles the awkwardness of representing the domains
     as areas or ASes by simply considering them as domains running
     distinct instances of OSPF. Routing advertisements flow "upward"
     from the domains to the high level OSPF instance giving it a full
     view of the whole network and allowing end-to-end paths to be
     computed. Routing advertisements may also flow "downward" from the
     network-wide OSPF instance to any one domain so that it has
     visibility of the connectivity of the whole network.

     While architecturally satisfying, this model suffers from having to
     handle the different characteristics of different equipment
     vendors. The advertisements coming from each low level domain
     would be meaningless when distributed into the other domains, and
     the high level domain would need to be kept up-to-date with the
     semantics of each new release of each vendor's equipment.
     Additionally, the scaling issues associated with a well-meshed
     network of domains, each with many entry and exit points and each
     with network resources that are continually being updated, reduce
     to the same problem as noted in the virtual link model.
     Furthermore, in the event that the domains are under control of
     different administrations, the domains would not want to distribute
     the details of their topologies and TE resources.

   Practically, this third model turns out to be very close to the
   methodology described in this document. As noted in Section 6.1 of
   [RFC6827], there are policy rules that can be applied to define
   exactly what information is exported from or imported to a low level
   OSPF instance. The document even notes that some forms of
   aggregation may be appropriate. Thus, we can apply the following
   simplifications to the mechanisms defined in RFC 6827:

   - Zero information is imported to low level domains.

   - Low level domains export only abstracted links as defined in this
     document and according to local abstraction policy and with
     appropriate removal of vendor-specific information.

   - There is no need to formally define routing levels within OSPF.

   - Export of abstracted links from the domains to the network-wide
     routing instance (the abstraction routing layer) can take place
     through any mechanism including BGP-LS or direct interaction
     between OSPF implementations.
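
   The first two simplifications above can be illustrated with a
   minimal sketch. It is not a protocol definition: the AbstractLink
   type, the "vendor:" attribute prefix, and the min_bandwidth policy
   parameter are all hypothetical names chosen for illustration, and a
   real abstraction policy is likely to be considerably richer.

```python
from dataclasses import dataclass, field


@dataclass
class AbstractLink:
    """A cross-domain link a domain might offer (hypothetical type)."""
    src: str                  # domain border node
    dst: str                  # domain border node
    bandwidth: float          # simple TE constraint, arbitrary units
    attrs: dict = field(default_factory=dict)


# Assumption for this sketch: vendor-specific attributes are recognizable
# by a key prefix. Real equipment will not necessarily follow this rule.
VENDOR_PREFIX = "vendor:"


def export_links(candidates, border_nodes, min_bandwidth=0.0):
    """Apply a local abstraction policy to candidate links.

    Only links between domain border nodes that meet the bandwidth
    threshold are exported, and vendor-specific attributes are removed.
    """
    exported = []
    for link in candidates:
        if link.src not in border_nodes or link.dst not in border_nodes:
            continue  # not a cross-domain (border-to-border) link
        if link.bandwidth < min_bandwidth:
            continue  # rejected by local policy
        clean = {k: v for k, v in link.attrs.items()
                 if not k.startswith(VENDOR_PREFIX)}
        exported.append(AbstractLink(link.src, link.dst,
                                     link.bandwidth, clean))
    return exported


def import_links(advertisements):
    """Zero information is imported to a low level domain."""
    return []
```

   In this sketch the export function copies each link rather than
   modifying it, so the domain's own view (including vendor-specific
   details) is left untouched.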

   With these simplifications, it can be seen that the framework defined
   in this document can be constructed from the architecture discussed
   in RFC 6827, but without needing any of the protocol extensions that
   that document defines. Thus, using the terminology and concepts
   already established, the problem may be solved as shown in Figure 21.
   The abstraction layer network is constructed from the inter-domain
   links, the domain border nodes, and the abstracted (cross-domain)
   links.

                                                      Abstraction Layer
      --             --    --             --    --             --
     |  |===========|  |--|  |===========|  |--|  |===========|  |
     |  |           |  |  |  |           |  |  |  |           |  |
   ..|  |...........|  |..|  |...........|  |..|  |...........|  |......
     |  |           |  |  |  |           |  |  |  |           |  |
     |  |  --   --  |  |  |  |  --   --  |  |  |  |  --   --  |  |
     |  |_|  |_|  |_|  |  |  |_|  |_|  |_|  |  |  |_|  |_|  |_|  |
     |  | |  | |  | |  |  |  | |  | |  | |  |  |  | |  | |  | |  |
      --   --   --   --    --   --   --   --    --   --   --   --
          Domain 1             Domain 2             Domain 3
   Key                                                    Optical Layer
     ...  Layer separation
     ---  Physical link
     ===  Abstract link

       Figure 21 : The Optical Network Implemented Through the
                      Abstraction Layer Network

7. Modeling the User-to-Network Interface

   The User-to-Network Interface (UNI) is an important architectural
   concept in many implementations and deployments of client-server
   networks especially those where the client and server network have
   different technologies. The UNI is described in [G.8080], and the
   GMPLS approach to the UNI is documented in [RFC4208]. Other
   GMPLS-related documents describe the application of GMPLS to specific
   UNI scenarios: for example, [RFC6005] describes how GMPLS can support
   a UNI that provides access to Ethernet services.

   Figure 1 of [RFC6005] is reproduced here as Figure 22. It shows the
   Ethernet UNI reference model, and that figure can serve as an example
   for all similar UNIs.
   In this case, the UNI is an interface between client network edge
   nodes and the server network. It should be noted that neither the
   client network nor the server network need be an Ethernet switching
   network.

   There are three network layers in this model: the client network, the
   "Ethernet service network", and the server network. The so-called
   Ethernet service network consists of links comprising the UNI links
   and the tunnels across the server network, and nodes comprising the
   client network edge nodes and various server nodes. That is, the
   Ethernet service network is equivalent to the abstraction layer
   network with the UNI links being the physical links between the
   client and server networks, and the client edge nodes taking the
   role of UNI Client-side (UNI-C) and the server edge nodes acting as
   the UNI Network-side (UNI-N) nodes.

      [Figure 22 reproduces Figure 1 of [RFC6005]. It shows a client
      network at each side containing client edge nodes (EN), the
      server network(s) in the center containing server nodes (CN),
      and a UNI between each client network and the server network(s).]

      Legend:   EN  -  Client Edge Node
                CN  -  Server Node

                Figure 22 : Ethernet UNI Reference Model

   An issue that is often raised concerns how a dual-homed client edge
   node (such as that shown at the bottom left-hand corner of Figure 22)
   can make determinations about how it connects across the UNI. This
   can be particularly important when reachability across the server
   network is limited or when two diverse paths are desired (for
   example, to provide protection). However, in the model described in
   this document, the edge node (the UNI-C) is part of the abstraction
   layer network and can see sufficient topology information to make
   these decisions. If the approach introduced in this document is used
   to model the UNI as described in this section, there is no need to
   enhance the signaling protocols at the GMPLS UNI nor to add routing
   exchanges at the UNI.

8. Abstraction in L3VPN Multi-AS Environments

   Serving layer-3 VPNs (L3VPNs) across a multi-AS or multi-operator
   environment currently presents a significant planning challenge.
   Figure 6 shows the general case of the problem that needs to be
   solved. This section shows how the abstraction layer network can
   address this problem.

   In the VPN architecture, the CE nodes are the client network edge
   nodes, and the PE nodes are the server network edge nodes. The
   abstraction layer network is made up of the CE nodes, the CE-PE
   links, the PE nodes, and PE-PE tunnels that are the abstract links.

   In the multi-AS or multi-operator case, the abstraction layer network
   also includes the PEs (maybe ASBRs) at the edges of the multiple
   server networks, and the PE-PE (maybe inter-AS) links. This gives
   rise to the architecture shown in Figure 23.

   The policy for adding abstract links to the abstraction layer network
   will be driven substantially by the needs of the VPN.
   Thus, when a new VPN site is added and the existing abstraction layer
   network cannot support the required connectivity, a new abstract link
   will be created out of the underlying network.

   It is important to note that each VPN instance can have a separate
   abstraction layer network. This means that the server network
   resources can be partitioned and that traffic can be kept separate.
   This can be achieved even when VPN sites from different VPNs connect
   at the same PE. Alternatively, multiple VPNs can share the same
   abstraction layer network if that is operationally preferable.

   Lastly, just as for the UNI discussed in Section 7, the issue of
   dual-homing of VPN sites is a function of the abstraction layer
   network and so is just a normal routing problem in that network.

     ...........                                     .............
     VPN Site  :                                     :  VPN Site
      --   --  :                                     :  --   --
     |C1|-|CE| :                                     : |CE|-|C2|
      --  |  | :                                     : |  |  --
          |  | :                                     : |  |
          |  | :                                     : |  |
          |  | :                                     : |  |
          |  | :   --           --     --       --   : |  |
          |  |----|PE|=========|PE|---|PE|=====|PE|----|  |
           --  :  |  |         |  |   |  |     |  |  :  --
     ...........  |  |         |  |   |  |     |  |  ............
                  |  |         |  |   |  |     |  |
                  |  |         |  |   |  |     |  |
                  |  |         |  |   |  |     |  |
                  |  |  -   -  |  |   |  |  -  |  |
                  |  |-|P|-|P|-|  |   |  |-|P|-|  |
                   --   -   -   --     --   -   --

      Figure 23 : The Abstraction Layer Network for a Multi-AS VPN

9. Scoping Future Work

   This section is provided to help guide the work on this problem and
   to ensure that oceans are not knowingly boiled.

9.1. Not Solving the Internet

   The scope of the use cases and problem statement in this document is
   limited to "some small set of interconnected domains." In
   particular, it is not the objective of this work to turn the whole
   Internet into one large, interconnected TE network.

9.2. Working With "Related" Domains

   Following on from Section 9.1, the intention of this work is to solve
   the TE interconnectivity for only "related" domains. Such domains
   may be under common administrative operation (such as IGP areas
   within a single AS, or ASes belonging to a single operator), or may
   have a direct commercial arrangement for the sharing of TE
   information to provide specific services. Thus, in both cases, there
   is a strong opportunity for the application of policy.

9.3. Not Finding Optimal Paths in All Situations

   As has been well described in this document, abstraction necessarily
   involves compromises and removal of information. That means that it
   is not possible to guarantee that an end-to-end path over
   interconnected TE domains follows the absolute optimal path (by any
   measure of optimality). This is taken as understood, and future work
   should not attempt to achieve such paths, which can only be found by
   a full examination of all network information across all connected
   networks.

9.4. Sanity and Scaling

   All of the above points play into a final observation. This work is
   intended to bite off a small problem for some relatively simple use
   cases as described in Section 2. It is not intended that this work
   will be immediately (or even soon) extended to cover many large
   interconnected domains. Obviously the solution should as far as
   possible be designed to be extensible and scalable; however, it is
   also reasonable to make trade-offs in favor of utility and
   simplicity.

10. Manageability Considerations

   Manageability should not be a significant additional burden. Each
   layer in the network model can and should be managed independently.

   That is, each client network will run its own management systems and
   tools to manage the nodes and links in the client network: each
   client network link that uses an abstract link will still be
   available for management in the client network as any other link.

   Similarly, each server network will run its own management systems
   and tools to manage the nodes and links in that network just as
   normal.

   Three issues remain for consideration:

   - How is the abstraction layer network managed?
   - How is the interface between the client network and the abstraction
     layer network managed?
   - How is the interface between the abstraction layer network and the
     server network managed?

10.1. Managing the Abstraction Layer Network

   Management of the abstraction layer network differs from the client
   and server networks because not all of the links that are visible in
   the TED are real links. That is, it is not possible to run OAM on
   the links that constitute the potential of a link.

   Other than that, however, the management should be essentially the
   same. Routing and signaling protocols can be run in the abstraction
   layer (using out of band channels for links that have not yet been
   established), and a centralized TED can be constructed and used to
   examine the availability and status of the links and nodes in the
   network.

   Note that different deployment models will place the "ownership" of
   the abstraction layer network differently. In some cases the
   abstraction layer network will be constructed by the operator of the
   server layer and run by that operator as a service for one or more
   client networks. In other cases, one or more server networks will
   present the potential of links to an abstraction layer network run
   by the operator of the client network.
And it is feasible that a 2042 business model could be built where a third-party operator manages 2043 the abstraction layer network, constructing it from the connectivity 2044 available in multiple server networks, and facilitating connectivity 2045 for multiple client networks. 2047 10.2. Managing Interactions of Client and Abstraction Layer Networks 2049 The interaction between the client network and the abstraction layer 2050 network is a management task. It might be automated (software 2051 driven) or it might require manual intervention. 2053 This is a two-way interaction: 2055 - The client network can express the need for additional 2056 connectivity. For example, the client layer may try and fail to 2057 find a path across the client network and may request additional, 2058 specific connectivity (this is similar to the situation with 2059 Virtual Network Topology Manager (VNTM) [RFC5623]). Alternatively, 2060 a more proactive client layer management system may monitor traffic 2061 demands (current and predicted), network usage, and network "hot 2062 spots" and may request changes in connectivity by both releasing 2063 unused links and by requesting new links. 2065 - The abstraction layer network can make links available to the 2066 client network or can withdraw them. These actions can be in 2067 response to requests from the client layer, or can be driven by 2068 processes within the abstraction layer (perhaps reorganizing the 2069 use of server layer resources). In any case, the presentation of 2070 new links to the client layer is heavily subject to policy since 2071 this is both operationally key to the success of this architecture 2072 and the central plank of the commercial model described in this 2073 document. Such policies belong to the operator of the abstraction 2074 layer network and are expected to be fully configurable. 
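The two-way interaction described above can be illustrated with a small sketch. This is not part of the architecture: the class name AbstractionLayer, its methods, and the shape of the policy callback are all assumptions made for illustration. It only shows that client requests are gated by operator policy, that either side can remove a presented link, and that every exchange is recorded so that both operators can track requests and responses.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Set, Tuple

Link = Tuple[str, str]  # pair of client edge node names (hypothetical naming)


@dataclass
class AbstractionLayer:
    """Hypothetical manager of the links presented to one client network."""

    potential: Set[Link]            # connectivity the server layer could support
    policy: Callable[[Link], bool]  # operator-configured policy check (assumption)
    presented: Set[Link] = field(default_factory=set)
    log: List[Tuple[str, Link, bool]] = field(default_factory=list)

    def request_link(self, link: Link) -> bool:
        """Client asks for a link; present it only if feasible and allowed."""
        granted = link in self.potential and self.policy(link)
        if granted:
            self.presented.add(link)
        self.log.append(("request", link, granted))
        return granted

    def release_link(self, link: Link) -> bool:
        """Client hands back a link that it no longer needs."""
        released = link in self.presented
        self.presented.discard(link)
        self.log.append(("release", link, released))
        return released

    def withdraw_link(self, link: Link) -> bool:
        """Abstraction layer removes a link on its own initiative."""
        withdrawn = link in self.presented
        self.presented.discard(link)
        self.log.append(("withdraw", link, withdrawn))
        return withdrawn
```

The log kept here is only a stand-in for whatever tracking facility an operator would actually deploy; a real system would also need to signal each change to the client network.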
2076 Once the abstraction layer network has decided to make a link 2077 available to the client network it will install it at the link end 2078 points (which are nodes in the client network) such that it appears 2079 and can be advertised as a link in the client network. 2081 In all cases, it is important that the operators of both networks are 2082 able to track the requests and responses, and the operator of the 2083 client network should be able to see which links in that network are 2084 "real" physical links, and which are presented by the abstraction 2085 layer network. 2087 10.3. Managing Interactions of Abstraction Layer and Server Networks 2089 The interactions between the abstraction layer network and the server 2090 network are similar to those described in Section 10.2, but there is a 2091 difference in that the server layer is more likely to offer up 2092 connectivity, and the abstraction layer network is less likely to ask 2093 for it. 2095 That is, the server network will, according to policy that may 2096 include commercial relationships, offer the abstraction layer network 2097 a set of potential connectivity that the abstraction layer network 2098 can treat as links. This server network policy will include: 2099 - how much connectivity to offer 2100 - what level of server layer redundancy to include 2101 - how to support the use of the abstraction links. 2103 This process of offering links from the server network may include a 2104 mechanism to indicate which links have been pre-established in the 2105 server network, and can include other properties such as: 2106 - link-level protection ([RFC4202]) 2107 - SRLG and MSRLG (see Appendix B.1) 2108 - mutual exclusivity (see Appendix B.2).
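As an illustration only, the properties listed above could be carried in a record such as the following. The names OfferedLink and select_offers, the field layout, and the selection rule (offer pre-established links first, up to a configured maximum) are assumptions of this sketch and are not defined by this document. The protection strings follow the Link Protection Type names of [RFC4202].

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class OfferedLink:
    """Hypothetical record for one potential link offered by a server network."""

    src: str
    dst: str
    pre_established: bool = False              # server layer LSP already set up?
    protection: Optional[str] = None           # e.g. "Unprotected", "Dedicated 1+1"
    srlgs: FrozenSet[int] = frozenset()        # SRLG identifiers
    msrlgs: FrozenSet[int] = frozenset()       # Macro SRLG identifiers
    exclusive_with: FrozenSet[str] = frozenset()  # names of mutually exclusive links


def select_offers(candidates: List[OfferedLink], max_links: int,
                  require_protection: bool = False) -> List[OfferedLink]:
    """Apply an assumed server policy: filter, prefer pre-established, cap."""
    eligible = [
        link for link in candidates
        if not require_protection
        or link.protection not in (None, "Unprotected")
    ]
    # Stable sort: pre-established links (key False) come before the rest.
    eligible.sort(key=lambda link: not link.pre_established)
    return eligible[:max_links]
```

The cap max_links stands for "how much connectivity to offer", and require_protection is one possible reading of "what level of server layer redundancy to include"; an operator's real policy would be richer than this.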
2110 The abstraction layer network needs a mechanism to tell the server 2111 layer which links it is using. This mechanism could also include the ability to request additional 2112 connectivity from the server layer, although it seems most likely 2113 that the server layer will already have presented as much 2114 connectivity as it is physically capable of, subject to the 2115 constraints of policy. 2117 Finally, the server layer will need to confirm the establishment of 2118 connectivity, withdraw links if they are no longer feasible, and 2119 report failures. 2121 Again, it is important that the operators of both networks are able 2122 to track the requests and responses, and the operator of the server 2123 network should be able to see which links are in use. 2125 11. IANA Considerations 2127 This document makes no requests for IANA action. The RFC Editor may 2128 safely remove this section. 2130 12. Security Considerations 2132 Security of signaling and routing protocols is usually administered 2133 and achieved within the boundaries of a domain. Thus, for 2134 example, a domain with a GMPLS control plane [RFC3945] would apply 2135 the security mechanisms and considerations that are appropriate to 2136 GMPLS [RFC5920]. Furthermore, domain-based security relies strongly 2137 on ensuring that control plane messages are not allowed to enter the 2138 domain from outside. Thus, the mechanisms in this document for 2139 inter-domain exchange of control plane messages and information 2140 naturally raise additional questions of security. 2142 In this context, additional security considerations arising from this 2143 document relate to the exchange of control plane information between 2144 domains. Messages are passed between domains using control plane 2145 protocols operating between peers that have predictable relationships 2146 (for example, UNI-C to UNI-N, between BGP-LS speakers, or between 2147 peer domains).
Thus, the security that needs to be given additional 2148 attention for inter-domain TE concentrates on authentication of 2149 peers, assertion that messages have not been tampered with, and to a 2150 lesser extent protecting the content of the messages from inspection 2151 since that might give away sensitive information about the networks. 2152 The protocols described in Appendix A, which are likely to provide 2153 the foundation for solutions to this architecture, already include 2154 such protection and can further be run over protected transports 2155 such as IPsec [RFC6701], TLS [RFC5246], and the TCP Authentication 2156 Option (TCP-AO) [RFC5925]. 2158 It is worth noting that the control plane of the abstraction layer 2159 network is likely to be out of band. That is, control plane messages 2160 will be exchanged over network links that are not the links to which 2161 they apply. This models the facilities of GMPLS (but not of MPLS-TE) 2162 and the security mechanisms can be applied to the protocols operating 2163 in the out of band network. 2165 13. Acknowledgements 2167 Thanks to Igor Bryskin for useful discussions in the early stages of 2168 this work. 2170 Thanks to Gert Grammel for discussions on the extent of aggregation 2171 in abstract nodes and links. 2173 Thanks to Deborah Brungard, Dieter Beller, Dhruv Dhody, Vallinayakam 2174 Somasundaram, and Hannes Gredler for review and input. 2176 Particular thanks to Vishnu Pavan Beeram for detailed discussions and 2177 white-board scribbling that made many of the ideas in this document 2178 come to life. 2180 Text in Section 4.2.3 is freely adapted from the work of Igor 2181 Bryskin, Wes Doonan, Vishnu Pavan Beeram, John Drake, Gert Grammel, 2182 Manuel Paul, Ruediger Kunze, Friedrich Armbruster, Cyril Margaria, 2183 Oscar Gonzalez de Dios, and Daniele Ceccarelli in 2184 [I-D.beeram-ccamp-gmpls-enni] for which the authors of this document 2185 express their thanks. 2187 14. References 2189 14.1.
Informative References 2191 [G.8080] ITU-T, "Architecture for the automatically switched optical 2192 network (ASON)", Recommendation G.8080. 2194 [I-D.beeram-ccamp-gmpls-enni] 2195 Bryskin, I., Beeram, V. P., Drake, J. et al., "Generalized 2196 Multiprotocol Label Switching (GMPLS) External Network 2197 Network Interface (E-NNI): Virtual Link Enhancements for 2198 the Overlay Model", draft-beeram-ccamp-gmpls-enni, work in 2199 progress. 2201 [I-D.ietf-ccamp-rsvp-te-srlg-collect] 2202 Zhang, F. (Ed.) and O. Gonzalez de Dios (Ed.), "RSVP-TE 2203 Extensions for Collecting SRLG Information", draft-ietf- 2204 ccamp-rsvp-te-srlg-collect, work in progress. 2206 [I-D.ietf-idr-ls-distribution] 2207 Gredler, H., Medved, J., Previdi, S., Farrel, A., and Ray, 2208 S., "North-Bound Distribution of Link-State and TE 2209 Information using BGP", draft-ietf-idr-ls-distribution, 2210 work in progress. 2212 [RFC2702] Awduche, D., Malcolm, J., Agogbua, J., O'Dell, M., and 2213 McManus, J., "Requirements for Traffic Engineering Over 2214 MPLS", RFC 2702, September 1999. 2216 [RFC3209] Awduche, D., Berger, L., Gan, D., Li, T., Srinivasan, V., 2217 and G. Swallow, "RSVP-TE: Extensions to RSVP for LSP 2218 Tunnels", RFC 3209, December 2001. 2220 [RFC3473] L. Berger, "Generalized Multi-Protocol Label Switching 2221 (GMPLS) Signaling Resource ReserVation Protocol-Traffic 2222 Engineering (RSVP-TE) Extensions", RFC 3473, January 2003. 2224 [RFC3630] Katz, D., Kompella, K., and D. Yeung, "Traffic Engineering 2225 (TE) Extensions to OSPF Version 2", RFC 3630, September 2226 2003. 2228 [RFC3945] Mannie, E., (Ed.), "Generalized Multi-Protocol Label 2229 Switching (GMPLS) Architecture", RFC 3945, October 2004. 2231 [RFC4105] Le Roux, J.-L., Vasseur, J.-P., and Boyle, J., 2232 "Requirements for Inter-Area MPLS Traffic Engineering", 2233 RFC 4105, June 2005. 2235 [RFC4202] Kompella, K. and Y.
Rekhter, "Routing Extensions in Support 2236 of Generalized Multi-Protocol Label Switching (GMPLS)", 2237 RFC 4202, October 2005. 2239 [RFC4206] Kompella, K. and Y. Rekhter, "Label Switched Paths (LSP) 2240 Hierarchy with Generalized Multi-Protocol Label Switching 2241 (GMPLS) Traffic Engineering (TE)", RFC 4206, October 2005. 2243 [RFC4208] Swallow, G., Drake, J., Ishimatsu, H., and Y. Rekhter, 2244 "User-Network Interface (UNI): Resource ReserVation 2245 Protocol-Traffic Engineering (RSVP-TE) Support for the 2246 Overlay Model", RFC 4208, October 2005. 2248 [RFC4216] Zhang, R., and Vasseur, J.-P., "MPLS Inter-Autonomous 2249 System (AS) Traffic Engineering (TE) Requirements", 2250 RFC 4216, November 2005. 2252 [RFC4271] Rekhter, Y., Li, T., and Hares, S., "A Border Gateway 2253 Protocol 4 (BGP-4)", RFC 4271, January 2006. 2255 [RFC4364] Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private 2256 Networks (VPNs)", RFC 4364, February 2006. 2258 [RFC4655] Farrel, A., Vasseur, J., and J. Ash, "A Path Computation 2259 Element (PCE)-Based Architecture", RFC 4655, August 2006. 2261 [RFC4726] Farrel, A., Vasseur, J.-P., and Ayyangar, A., "A Framework 2262 for Inter-Domain Multiprotocol Label Switching Traffic 2263 Engineering", RFC 4726, November 2006. 2265 [RFC4847] T. Takeda (Ed.), "Framework and Requirements for Layer 1 2266 Virtual Private Networks," RFC 4847, April 2007. 2268 [RFC4874] Lee, CY., Farrel, A., and S. De Cnodder, "Exclude Routes - 2269 Extension to Resource ReserVation Protocol-Traffic 2270 Engineering (RSVP-TE)", RFC 4874, April 2007. 2272 [RFC4920] Farrel, A., Satyanarayana, A., Iwata, A., Fujita, N., and 2273 Ash, G., "Crankback Signaling Extensions for MPLS and GMPLS 2274 RSVP-TE", RFC 4920, July 2007. 2276 [RFC5150] Ayyangar, A., Kompella, K., Vasseur, JP., and A. Farrel, 2277 "Label Switched Path Stitching with Generalized 2278 Multiprotocol Label Switching Traffic Engineering (GMPLS 2279 TE)", RFC 5150, February 2008. 
2281 [RFC5152] Vasseur, JP., Ayyangar, A., and Zhang, R., "A Per-Domain 2282 Path Computation Method for Establishing Inter-Domain 2283 Traffic Engineering (TE) Label Switched Paths (LSPs)", 2284 RFC 5152, February 2008. 2286 [RFC5195] Ould-Brahim, H., Fedyk, D., and Y. Rekhter, "BGP-Based 2287 Auto-Discovery for Layer-1 VPNs", RFC 5195, June 2008. 2289 [RFC5246] Dierks, T. and E. Rescorla, "The Transport Layer Security 2290 (TLS) Protocol Version 1.2", RFC 5246, August 2008. 2292 [RFC5251] Fedyk, D., Rekhter, Y., Papadimitriou, D., Rabbat, R., and 2293 L. Berger, "Layer 1 VPN Basic Mode", RFC 5251, July 2008. 2295 [RFC5252] Bryskin, I. and L. Berger, "OSPF-Based Layer 1 VPN Auto- 2296 Discovery", RFC 5252, July 2008. 2298 [RFC5305] Li, T., and Smit, H., "IS-IS Extensions for Traffic 2299 Engineering", RFC 5305, October 2008. 2301 [RFC5440] Vasseur, JP. and Le Roux, JL., "Path Computation Element 2302 (PCE) Communication Protocol (PCEP)", RFC 5440, March 2009. 2304 [RFC5441] Vasseur, JP., Zhang, R., Bitar, N., and Le Roux, JL., "A 2305 Backward-Recursive PCE-Based Computation (BRPC) Procedure 2306 to Compute Shortest Constrained Inter-Domain Traffic 2307 Engineering Label Switched Paths", RFC 5441, April 2009. 2309 [RFC5523] L. Berger, "OSPFv3-Based Layer 1 VPN Auto-Discovery", RFC 2310 5523, April 2009. 2312 [RFC5553] Farrel, A., Bradford, R., and JP. Vasseur, "Resource 2313 Reservation Protocol (RSVP) Extensions for Path Key 2314 Support", RFC 5553, May 2009. 2316 [RFC5623] Oki, E., Takeda, T., Le Roux, JL., and A. Farrel, 2317 "Framework for PCE-Based Inter-Layer MPLS and GMPLS Traffic 2318 Engineering", RFC 5623, September 2009. 2320 [RFC5920] L. Fang, Ed., "Security Framework for MPLS and GMPLS 2321 Networks", RFC 5920, July 2010. 2323 [RFC5925] Touch, J., Mankin, A., and R. Bonica, "The TCP 2324 Authentication Option", RFC 5925, June 2010. 2326 [RFC6005] Berger, L., and D.
Fedyk, "Generalized MPLS (GMPLS) Support 2327 for Metro Ethernet Forum and G.8011 User Network Interface 2328 (UNI)", RFC 6005, October 2010. 2330 [RFC6107] Shiomoto, K., and A. Farrel, "Procedures for Dynamically 2331 Signaled Hierarchical Label Switched Paths", RFC 6107, 2332 February 2011. 2334 [RFC6701] Frankel, S. and S. Krishnan, "IP Security (IPsec) and 2335 Internet Key Exchange (IKE) Document Roadmap", RFC 6701, 2336 February 2011. 2338 [RFC6805] King, D., and A. Farrel, "The Application of the Path 2339 Computation Element Architecture to the Determination of a 2340 Sequence of Domains in MPLS and GMPLS", RFC 6805, November 2341 2012. 2343 [RFC6827] Malis, A., Lindem, A., and D. Papadimitriou, "Automatically 2344 Switched Optical Network (ASON) Routing for OSPFv2 2345 Protocols", RFC 6827, January 2013. 2347 [RFC6996] J. Mitchell, "Autonomous System (AS) Reservation for 2348 Private Use", BCP 6, RFC 6996, July 2013. 2350 [RFC7399] Farrel, A. and D. King, "Unanswered Questions in the Path 2351 Computation Element Architecture", RFC 7399, October 2014. 2353 [RFC7579] Bernstein, G., Lee, Y., et al., "General Network Element 2354 Constraint Encoding for GMPLS-Controlled Networks", RFC 2355 7579, June 2015. 2357 [RFC7580] Zhang, F., Lee, Y., Han, J., Bernstein, G., and Xu, Y., 2358 "OSPF-TE Extensions for General Network Element 2359 Constraints", RFC 7580, June 2015. 2361 Authors' Addresses 2363 Adrian Farrel 2364 Juniper Networks 2365 EMail: adrian@olddog.co.uk 2367 John Drake 2368 Juniper Networks 2369 EMail: jdrake@juniper.net 2371 Nabil Bitar 2372 Verizon 2373 40 Sylvan Road 2374 Waltham, MA 02145 2375 EMail: nabil.bitar@verizon.com 2377 George Swallow 2378 Cisco Systems, Inc. 2379 1414 Massachusetts Ave 2380 Boxborough, MA 01719 2381 EMail: swallow@cisco.com 2383 Xian Zhang 2384 Huawei Technologies 2385 Email: zhang.xian@huawei.com 2387 Daniele Ceccarelli 2388 Ericsson 2389 Via A.
Negrone 1/A 2390 Genova - Sestri Ponente 2391 Italy 2392 EMail: daniele.ceccarelli@ericsson.com 2394 Contributors 2396 Gert Grammel 2397 Juniper Networks 2398 Email: ggrammel@juniper.net 2400 Vishnu Pavan Beeram 2401 Juniper Networks 2402 Email: vbeeram@juniper.net 2404 Oscar Gonzalez de Dios 2405 Email: ogondio@tid.es 2407 Fatai Zhang 2408 Email: zhangfatai@huawei.com 2409 Zafar Ali 2410 Email: zali@cisco.com 2412 Rajan Rao 2413 Email: rrao@infinera.com 2415 Sergio Belotti 2416 Email: sergio.belotti@alcatel-lucent.com 2418 Diego Caviglia 2419 Email: diego.caviglia@ericsson.com 2421 Jeff Tantsura 2422 Email: jeff.tantsura@ericsson.com 2424 Khuzema Pithewan 2425 Email: kpithewan@infinera.com 2427 Cyril Margaria 2428 Email: cyril.margaria@googlemail.com 2430 Victor Lopez 2431 Email: vlopez@tid.es 2433 Appendix A. Existing Work 2435 This appendix briefly summarizes relevant existing work that is used 2436 to route TE paths across multiple domains. 2438 A.1. Per-Domain Path Computation 2440 The per-domain mechanism of path establishment is described in 2441 [RFC5152] and its applicability is discussed in [RFC4726]. In 2442 summary, this mechanism assumes that each domain entry point is 2443 responsible for computing the path across the domain, but that 2444 details of the path in the next domain are left to the next domain 2445 entry point. The computation may be performed directly by the entry 2446 point or may be delegated to a computation server. 2448 This basic mode of operation can run into many of the issues 2449 described alongside the use cases in Section 2. However, in practice 2450 it can be used effectively with a little operational guidance. 2452 For example, RSVP-TE [RFC3209] includes the concept of a "loose hop" 2453 in the explicit path that is signaled. This allows the original 2454 request for an LSP to list the domains or even domain entry points to 2455 include on the path. 
Thus, in the example in Figure 1, the source 2456 can be told to use the interconnection x2. Then the source computes 2457 the path from itself to x2, and initiates the signaling. When the 2458 signaling message reaches Domain Z, the entry point to the domain 2459 computes the remaining path to the destination and continues the 2460 signaling. 2462 Another alternative suggested in [RFC5152] is to make TE routing 2463 attempt to follow inter-domain IP routing. Thus, in the example 2464 shown in Figure 2, the source would examine the BGP routing 2465 information to determine the correct interconnection point for 2466 forwarding IP packets, and would use that to compute and then signal 2467 a path for Domain A. Each domain in turn would apply the same 2468 approach so that the path is progressively computed and signaled 2469 domain by domain. 2471 Although the per-domain approach has many issues and drawbacks in 2472 terms of achieving optimal (or, indeed, any) paths, it has been the 2473 mainstay of inter-domain LSP set-up to date. 2475 A.2. Crankback 2477 Crankback addresses one of the main issues with per-domain path 2478 computation: what happens when an initial path is selected that 2479 cannot be completed toward the destination? For example, what 2480 happens if, in Figure 2, the source attempts to route the path 2481 through interconnection x2, but Domain C does not have the right TE 2482 resources or connectivity to route the path further? 2484 Crankback for MPLS-TE and GMPLS networks is described in [RFC4920] 2485 and is based on a concept similar to the Acceptable Label Set 2486 mechanism described for GMPLS signaling in [RFC3473]. When a node 2487 (i.e., a domain entry point) is unable to compute a path further 2488 across the domain, it returns an error message in the signaling 2489 protocol that states where the blockage occurred (link identifier, 2490 node identifier, domain identifier, etc.) 
and gives some clues about 2491 what caused the blockage (bad choice of label, insufficient bandwidth 2492 available, etc.). This information allows a previous computation 2493 point to select an alternative path, or to aggregate crankback 2494 information and return it upstream to a previous computation point. 2496 Crankback is a very powerful mechanism and can be used to find an 2497 end-to-end path in a multi-domain network if one exists. 2499 On the other hand, crankback can be quite resource-intensive as 2500 signaling messages and path setup attempts may "wander around" in the 2501 network attempting to find the correct path for a long time. Since 2502 RSVP-TE signaling ties up network resources for partially 2503 established LSPs, since network conditions may be in flux, and most 2504 particularly since LSP setup within well-known time limits is highly 2505 desirable, crankback is not a popular mechanism. 2507 Furthermore, even if crankback can always find an end-to-end path, it 2508 does not guarantee to find the optimal path. (Note that there have 2509 been some academic proposals to use signaling-like techniques to 2510 explore the whole network in order to find optimal paths, but these 2511 tend to place even greater burdens on network processing.) 2513 A.3. Path Computation Element 2515 The Path Computation Element (PCE) is introduced in [RFC4655]. It is 2516 an abstract functional entity that computes paths. Thus, in the 2517 example of per-domain path computation (see A.1) the source node and 2518 each domain entry point is a PCE. On the other hand, the PCE can 2519 also be realized as a separate network element (a server) to which 2520 computation requests can be sent using the Path Computation Element 2521 Communication Protocol (PCEP) [RFC5440]. 2523 Each PCE has responsibility for computations within a domain, and has 2524 visibility of the attributes within that domain.
This immediately 2525 enables per-domain path computation with the opportunity to off-load 2526 complex, CPU-intensive, or memory-intensive computation functions 2527 from routers in the network. But the use of PCE in this way does not 2528 solve any of the problems articulated in A.1 and A.2. 2530 Two significant mechanisms for cooperation between PCEs have been 2531 described. These mechanisms are intended to specifically address the 2532 problems of computing optimal end-to-end paths in multi-domain 2533 environments. 2535 - The Backward-Recursive PCE-Based Computation (BRPC) mechanism 2536 [RFC5441] involves cooperation between the set of PCEs along the 2537 inter-domain path. Each one computes the possible paths from 2538 domain entry point (or source node) to domain exit point (or 2539 destination node) and shares the information with its upstream 2540 neighbor PCE which is able to build a tree of possible paths 2541 rooted at the destination. The PCE in the source domain can 2542 select the optimal path. 2544 BRPC is sometimes described as "crankback at computation time". It 2545 is capable of determining the optimal path in a multi-domain 2546 network, but depends on knowing the domain that contains the 2547 destination node. Furthermore, the mechanism can become quite 2548 complicated and involve a lot of data in a mesh of interconnected 2549 domains. Thus, BRPC is most often proposed for a simple mesh of 2550 domains and specifically for a path that will cross a known 2551 sequence of domains, but where there may be a choice of domain 2552 interconnections. In this way, BRPC would only be applied to 2553 Figure 2 if a decision had been made (externally) to traverse 2554 Domain C rather than Domain D (notwithstanding that it could 2555 functionally be used to make that choice itself), but BRPC could be 2556 used very effectively to select between interconnections x1 and x2 2557 in Figure 1. 
2559 - Hierarchical PCE (H-PCE) [RFC6805] offers a parent PCE that is 2560 responsible for navigating a path across the domain mesh and for 2561 coordinating intra-domain computations by the child PCEs 2562 responsible for each domain. This approach makes computing an end- 2563 to-end path across a mesh of domains far more tractable. However, 2564 it still leaves unanswered the issue of determining the location of 2565 the destination (i.e., discovering the destination domain) as 2566 described in Section 2.1.1. Furthermore, it raises the question of 2567 who operates the parent PCE especially in networks where the 2568 domains are under different administrative and commercial control. 2570 It should also be noted that [RFC5623] discusses how PCE is used in a 2571 multi-layer network with coordination between PCEs operating at each 2572 network layer. Further issues and considerations of the use of PCE 2573 can be found in [RFC7399]. 2575 A.4. GMPLS UNI and Overlay Networks 2577 [RFC4208] defines the GMPLS User-to-Network Interface (UNI) to 2578 present a routing boundary between an overlay network and the core 2579 network, i.e. the client-server interface. In the client network, 2580 the nodes connected directly to the core network are known as edge 2581 nodes, while the nodes in the server network are called core nodes. 2583 In the overlay model defined by [RFC4208] the core nodes act as a 2584 closed system and the edge nodes do not participate in the routing 2585 protocol instance that runs among the core nodes. Thus the UNI 2586 allows access to and limited control of the core nodes by edge nodes 2587 that are unaware of the topology of the core nodes. This respects 2588 the operational and layer boundaries while scaling the network. 2590 [RFC4208] does not define any routing protocol extension for the 2591 interaction between core and edge nodes but allows for the exchange 2592 of reachability information between them. 
In terms of a VPN, the 2593 client network can be considered as the customer network comprised 2594 of a number of disjoint sites, and the edge nodes match the VPN CE 2595 nodes. Similarly, the provider network in the VPN model is 2596 equivalent to the server network. 2598 [RFC4208] is, therefore, a signaling-only solution that allows edge 2599 nodes to request connectivity across the core network, and leaves the 2600 core network to select the paths for the LSPs as they traverse the 2601 core (setting up hierarchical LSPs if necessitated by the 2602 technology). This solution is supplemented by a number of signaling 2603 extensions such as [RFC4874], [RFC5553], [I-D.ietf-ccamp-xro-lsp- 2604 subobject], [I-D.ietf-ccamp-rsvp-te-srlg-collect], and [I-D.ietf- 2605 ccamp-te-metric-recording] to give the edge node more control over 2606 the path within the core network and by allowing the edge nodes to supply 2607 additional constraints on the path used in the core network. 2608 Nevertheless, in this UNI/overlay model, the edge node has limited 2609 information of precisely what LSPs could be set up across the core, 2610 and what TE services (such as diverse routes for end-to-end 2611 protection, end-to-end bandwidth, etc.) can be supported. 2613 A.5. Layer One VPN 2615 A Layer One VPN (L1VPN) is a service offered by a core layer 1 2616 network to provide layer 1 connectivity (TDM, LSC) between two or 2617 more customer networks in an overlay service model [RFC4847]. 2619 As in the UNI case, the customer edge has some control over the 2620 establishment and type of the connectivity. In the L1VPN context 2621 three different service models have been defined classified by the 2622 semantics of information exchanged over the customer interface: 2624 Management Based, Signaling Based (a.k.a. basic), and Signaling and 2625 Routing service model (a.k.a. enhanced).
2627 In the management based model, all edge-to-edge connections are set 2628 up using configuration and management tools. This is not a dynamic 2629 control plane solution and need not concern us here. 2631 In the signaling based service model [RFC5251] the CE-PE interface 2632 allows only for signaling message exchange, and the provider network 2633 does not export any routing information about the core network. VPN 2634 membership is known a priori (presumably through configuration) or is 2635 discovered using a routing protocol [RFC5195], [RFC5252], [RFC5523], 2636 as is the relationship between CE nodes and ports on the PE. This 2637 service model is much in line with GMPLS UNI as defined in [RFC4208]. 2639 In the enhanced model there is an additional limited exchange of 2640 routing information over the CE-PE interface between the provider 2641 network and the customer network. The enhanced model considers four 2642 different types of service models, namely: Overlay Extension, Virtual 2643 Node, Virtual Link and Per-VPN service models. All of these 2644 represent particular cases of the TE information aggregation and 2645 representation. 2647 A.6. Policy and Link Advertisement 2649 Inter-domain networking relies on policy and management input to 2650 coordinate the allocation of resources under different administrative 2651 control. [RFC5623] introduces a functional component called the 2652 Virtual Network Topology Manager (VNTM) for this purpose. 2654 An important companion to this function is determining how 2655 connectivity across the abstraction layer network is made available 2656 as a TE link in the client network. Obviously, if the connectivity 2657 is established using management intervention, the consequent client 2658 network TE link can also be configured manually. 
However, if 2659 connectivity from client edge to client edge is achieved using 2660 dynamic signaling then there is a need for the end points to exchange 2661 the link properties that they should advertise within the client 2662 network, and in the case of support for more than one client network, 2663 it will be necessary to indicate which client or clients can use the 2664 link. This capability is provided in [RFC6107]. 2666 Appendix B. Additional Features 2668 This Appendix describes additional features that may be desirable and 2669 that can be achieved within this architecture. 2671 B.1. Macro Shared Risk Link Groups 2673 Network links often share fate with one or more other links. That 2674 is, a scenario that may cause a link to fail could cause one or more 2675 other links to fail. This may occur, for example, if the links are 2676 supported by the same fiber bundle, or if some links are routed down 2677 the same duct or in a common piece of infrastructure such as a 2678 bridge. A common way to identify the links that may share fate is to 2679 label them as belonging to a Shared Risk Link Group (SRLG) [RFC4202]. 2681 TE links created from LSPs in lower layers may also share fate, and 2682 it can be hard for a client network to know about this problem 2683 because it does not know the topology of the server network or the 2684 path of the server layer LSPs that are used to create the links in 2685 the client network. 2687 For example, looking at the example used in Section 4.2.3 and 2688 considering the two abstract links S1-S3 and S1-S9 there is no way 2689 for the client layer to know whether the links C2-C0 and C2-C3 share 2690 fate. Clearly, if the client layer uses these links to provide a 2691 link-diverse end-to-end protection scheme, it needs to know that the 2692 links actually share a piece of network infrastructure (the server 2693 layer link S1-S2).
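The fate-sharing check that the client layer cannot perform, but that an entity with server layer visibility can, is sketched below. The function names are hypothetical, and the server layer paths in the usage note are assumptions: only the shared hop S1-S2 is taken from the text above, and the remaining hops are invented for illustration.

```python
from typing import FrozenSet, List, Set


def path_links(path: List[str]) -> Set[FrozenSet[str]]:
    """Return the undirected server layer links traversed by a node path."""
    return {frozenset(pair) for pair in zip(path, path[1:])}


def shared_risk(path_a: List[str], path_b: List[str]) -> Set[FrozenSet[str]]:
    """Return the server layer links common to two server layer paths.

    A non-empty result means that the two abstract links supported by
    these paths share fate and are unsuitable as a link-diverse pair.
    """
    return path_links(path_a) & path_links(path_b)
```

For instance, if abstract link S1-S3 were routed as S1, S2, S3 and abstract link S1-S9 as S1, S2, S5, S9 (hypothetical routes), shared_risk would return the single link S1-S2, which is the information the client layer needs but cannot derive for itself.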
2695 Per [RFC4202], an SRLG represents a shared physical network resource 2696 upon which the normal functioning of a link depends. Multiple SRLGs 2697 can be identified and advertised for every TE link in a network. 2698 However, this can produce a scalability problem in a multi-layer 2699 network that equates to advertising in the client layer the server 2700 layer route of each TE link. 2702 Macro SRLGs (MSRLGs) address this scaling problem and are a form of 2703 abstraction performed at the same time that the abstract links are 2704 derived. In this way, links that actually share resources in the 2705 server layer are advertised as having the same MSRLG, rather than 2706 advertising each SRLG for each resource on each path in the server 2707 layer. This saving is possible because the abstract links are 2708 formulated on behalf of the server layer by a central management 2709 agency that is aware of all of the link abstractions being offered. 2711 It may be noted that a less optimal alternative path for the abstract 2712 link S1-S9 exists in the server layer (S1-S4-S7-S8-S9). It would be 2713 possible for the client layer request for connectivity C2-C0 to ask 2714 that the path be maximally disjoint from the path C2-C3. While 2715 nothing can be done about the shared link C2-S1, the abstraction 2716 layer could request to use the link S1-S9 in a way that is diverse 2717 from use of the link S1-S3, and this request could be honored if the 2718 server layer policy allows. 2720 Note that SRLGs and MSRLGs may be very hard to describe in the case 2721 of multiple server layer networks because the abstraction points will 2722 not know whether the resources in the various server layers share 2723 physical locations. 2725 B.2. Mutual Exclusivity 2727 As noted in the discussion of Figure 13, it is possible that some 2728 abstraction layer links cannot be used at the same time.
This 2729 arises when the potentiality of the links is indicated by the server 2730 layer, but the use of the links would actually compete for server layer 2731 resources. In Figure 13 this arose when both link S1-S3 and link 2732 S7-S9 were used to carry LSPs: in that case the link S1-S9 could no 2733 longer be used. 2735 Such a situation need not be an issue when client-edge to client-edge 2736 LSPs are set up one by one because the use of one abstraction layer 2737 link and the corresponding use of server layer resources will cause 2738 the server layer to withdraw the availability of the other 2739 abstraction layer links, and these will become unavailable for 2740 further abstraction layer path computations. 2742 Furthermore, in deployments where abstraction layer links are only 2743 presented as available after server layer LSPs have been established 2744 to support them, the problem is unlikely to exist. 2746 However, when the server layer is constrained, but chooses to 2747 advertise the potential of multiple abstraction layer links even 2748 though they compete for resources, and when multiple client-edge to 2749 client-edge LSPs are computed simultaneously (perhaps to provide 2750 protection services) there may be contention for server layer 2751 resources. In the case that protected abstraction layer LSPs are 2752 being established, this situation would be avoided through the use of 2753 SRLGs and/or MSRLGs since the two abstraction layer links that 2754 compete for server layer resources must also fate share across those 2755 resources. But in the case where the multiple client-edge to client- 2756 edge LSPs do not care about fate sharing, it may be necessary to flag 2757 the mutually exclusive links in the abstraction layer TED so that 2758 path computation can avoid accidentally attempting to utilize two of 2759 a set of such links at the same time.
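One way that such a flag might be consulted during path computation is sketched below. This is an illustration under stated assumptions rather than a mechanism defined by this document: the ExclusiveGroup record, the conflicts function, and the per-group limit are all hypothetical. A limit of 1 models the "two of a set" wording above, while a limit of 2 over three links is one possible way to model the Figure 13 case in which using S1-S3 and S7-S9 together removes S1-S9.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List


@dataclass(frozen=True)
class ExclusiveGroup:
    """Hypothetical TED annotation: links that compete for server resources."""

    links: FrozenSet[str]
    limit: int = 1  # how many links of the group may be in use at once


def conflicts(in_use: Iterable[str], candidate: str,
              groups: List[ExclusiveGroup]) -> bool:
    """Return True if adding candidate would exceed the limit of any group."""
    used = set(in_use) | {candidate}
    return any(
        candidate in group.links and len(used & group.links) > group.limit
        for group in groups
    )
```

A path computation that handles several client-edge to client-edge LSPs at once could call conflicts before adding each abstraction layer link to a tentative path, and prune the link when the result is True.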