Network Working Group                                    A. Farrel (Ed.)
Internet-Draft                                                  J. Drake
Intended status: Standards Track                        Juniper Networks
Expires: September 7, 2015                                      N. Bitar
                                                        Verizon Networks
                                                              G. Swallow
                                                     Cisco Systems, Inc.
                                                           D. Ceccarelli
                                                                Ericsson
                                                                X. Zhang
                                                                  Huawei
                                                           March 7, 2015

       Problem Statement and Architecture for Information Exchange
         Between Interconnected Traffic Engineered Networks

          draft-ietf-teas-interconnected-te-info-exchange-02.txt

Abstract

   In Traffic Engineered (TE) systems, it is sometimes desirable to
   establish an end-to-end TE path with a set of constraints (such as
   bandwidth) across one or more networks from a source to a
   destination.  TE information is the data relating to nodes and TE
   links that is used in the process of selecting a TE path.  TE
   information is usually only available within a network.  We call
   such a zone of visibility of TE information a domain.  An example of
   a domain may be an IGP area or an Autonomous System.

   In order to determine the potential to establish a TE path through a
   series of connected networks, it is necessary to have available a
   certain amount of TE information about each network.  This need not
   be the full set of TE information available within each network, but
   does need to express the potential of providing TE connectivity.
   This subset of TE information is called TE reachability information.

   This document sets out the problem statement and architecture for
   the exchange of TE information between interconnected TE networks in
   support of end-to-end TE path establishment.  For reasons that are
   explained in the document, this work is limited to simple TE
   constraints and information that determine TE reachability.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.
   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.  It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

Copyright Notice

   Copyright (c) 2015 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document.  Code Components extracted from this
   document must include Simplified BSD License text as described in
   Section 4.e of the Trust Legal Provisions and are provided without
   warranty as described in the Simplified BSD License.

Table of Contents

   1. Introduction
   1.1. Terminology
   1.1.1. TE Paths and TE Connections
   1.1.2. TE Metrics and TE Attributes
   1.1.3. TE Reachability
   1.1.4. Domain
   1.1.5. Aggregation
   1.1.6. Abstraction
   1.1.7. Abstract Link
   1.1.8. Abstraction Layer Network
   2. Overview of Use Cases
   2.1. Peer Networks
   2.1.1. Where is the Destination?
   2.2. Client-Server Networks
   2.3. Dual-Homing
   2.4. Requesting Connectivity
   2.4.1. Discovering Server Network Information
   3. Problem Statement
   3.1. Use of Existing Protocol Mechanisms
   3.2. Policy and Filters
   3.3. Confidentiality
   3.4. Information Overload
   3.5. Issues of Information Churn
   3.6. Issues of Aggregation
   3.7. Virtual Network Topology
   4. Existing Work
   4.1. Per-Domain Path Computation
   4.2. Crankback
   4.3. Path Computation Element
   4.4. GMPLS UNI and Overlay Networks
   4.5. Layer One VPN
   4.6. VNT Manager and Link Advertisement
   4.7. What Else is Needed and Why?
   5. Architectural Concepts
   5.1. Basic Components
   5.1.1. Peer Interconnection
   5.1.2. Client-Server Interconnection
   5.2. TE Reachability
   5.3. Abstraction not Aggregation
   5.3.1. Abstract Links
   5.3.2. The Abstraction Layer Network
   5.3.3. Abstraction in Client-Server Networks
   5.3.4. Abstraction in Peer Networks
   5.4. Considerations for Dynamic Abstraction
   5.5. Requirements for Advertising Links and Nodes
   5.6. Addressing Considerations
   6. Building on Existing Protocols
   6.1. BGP-LS
   6.2. IGPs
   6.3. RSVP-TE
   6.4. Notes on a Solution
   7. Applicability to Optical Domains and Networks
   8. Modeling the User-to-Network Interface
   9. Abstraction in L3VPN Multi-AS Environments
   10. Scoping Future Work
   10.1. Not Solving the Internet
   10.2. Working With "Related" Domains
   10.3. Not Finding Optimal Paths in All Situations
   10.4. Not Breaking Existing Protocols
   10.5. Sanity and Scaling
   11. Manageability Considerations
   11.1. Managing the Abstraction Layer Network
   11.2. Managing Interactions of Client and Abstraction Layer Networks
   11.3. Managing Interactions of Abstraction Layer and Server Networks
   12. IANA Considerations
   13. Security Considerations
   14. Acknowledgements
   15. References
   15.1. Informative References
   Authors' Addresses
   Contributors

1. Introduction

   Traffic Engineered (TE) systems such as MPLS-TE [RFC2702] and GMPLS
   [RFC3945] offer a way to establish paths through a network in a
   controlled way that reserves network resources on specified links.
   TE paths are computed by examining the Traffic Engineering Database
   (TED) and selecting a sequence of links and nodes that are capable
   of meeting the requirements of the path to be established.  The TED
   is constructed from information distributed by the IGP running in
   the network, for example OSPF-TE [RFC3630] or ISIS-TE [RFC5305].

   It is sometimes desirable to establish an end-to-end TE path that
   crosses more than one network or administrative domain as described
   in [RFC4105] and [RFC4216].  In these cases, the availability of TE
   information is usually limited to within each network.  Such
   networks are often referred to as domains [RFC4726], and we adopt
   that definition in this document:

      For the purposes of this document, a domain is considered to be
      any collection of network elements within a common sphere of
      address management or path computational responsibility.
      Examples of such domains include IGP areas and Autonomous
      Systems.

   In order to determine the potential to establish a TE path through a
   series of connected domains and to choose the appropriate domain
   connection points through which to route a path, it is necessary to
   have available a certain amount of TE information about each domain.
   This need not be the full set of TE information available within
   each domain, but does need to express the potential of providing TE
   connectivity.  This subset of TE information is called TE
   reachability information.
   The TE reachability information can be exchanged between domains
   based on the information gathered from the local routing protocol,
   filtered by configured policy, or statically configured.

   This document sets out the problem statement and architecture for
   the exchange of TE information between interconnected TE domains in
   support of end-to-end TE path establishment.  The scope of this
   document is limited to the simple TE constraints and information
   (such as TE metrics, hop count, bandwidth, delay, and shared risk)
   necessary to determine TE reachability: as explained in the body of
   this document, applying multiple additional constraints to qualify
   reachability can significantly complicate the aggregation of
   information and the stability of the mechanism used to present
   potential connectivity.

1.1. Terminology

   This section introduces some key terms that need to be understood to
   arrive at a common understanding of the problem space.  Some of the
   terms are defined in more detail in the sections that follow (in
   which case forward pointers are provided), and some terms are taken
   from definitions that already exist in other RFCs (in which case
   references are given, but no apology is made for repeating or
   summarizing the definitions here).

1.1.1. TE Paths and TE Connections

   A TE connection is a Label Switched Path (LSP) through an MPLS-TE or
   GMPLS network that directs traffic along a particular path (the TE
   path) in order to provide a specific service such as bandwidth
   guarantee, separation of traffic, or resilience between a well-known
   pair of end points.

1.1.2. TE Metrics and TE Attributes

   TE metrics and TE attributes are terms applied to parameters of
   links (and possibly nodes) in a network that is traversed by TE
   connections.
   The TE metrics and TE attributes are used by path computation
   algorithms to select the TE paths that the TE connections traverse.
   Provisioning a TE connection through a network may result in dynamic
   changes to the TE metrics and TE attributes of the links and nodes
   in the network.

   These terms are also sometimes used to describe the end-to-end
   characteristics of a TE connection and can be derived according to a
   formula from the metrics and attributes of the links and nodes that
   the TE connection traverses.  Thus, for example, the end-to-end
   delay for a TE connection is usually considered to be the sum of the
   delay on each link that the connection traverses.

1.1.3. TE Reachability

   In an IP network, reachability is the ability to deliver a packet to
   a specific address or prefix, that is, the existence of an IP path
   to that address or prefix.  TE reachability is the ability to reach
   a specific address along a TE path.  More specifically, it is the
   ability to establish a TE connection in an MPLS-TE or GMPLS sense.
   Thus, we talk about TE reachability as the potential of providing TE
   connectivity.

   TE reachability may be unqualified (there is a TE path, but no
   information about available resources or other constraints is
   supplied), which is especially helpful in determining a path to a
   destination that lies in an unknown domain, or it may be qualified
   by TE attributes and TE metrics such as hop count, available
   bandwidth, delay, shared risk, etc.

1.1.4. Domain

   As defined in [RFC4726], a domain is any collection of network
   elements within a common sphere of address management or path
   computational responsibility.  Examples of such domains include
   Interior Gateway Protocol (IGP) areas and Autonomous Systems (ASes).

1.1.5. Aggregation

   The concept of aggregation is discussed in Section 3.6.
   In aggregation, multiple network resources from a domain are
   represented outside the domain as a single entity.  Thus, multiple
   links and nodes forming a TE connection may be represented as a
   single link, or a collection of nodes and links (perhaps the whole
   domain) may be represented as a single node with its attachment
   links.

1.1.6. Abstraction

   Section 5.3 introduces the concept of abstraction and distinguishes
   it from aggregation.  Abstraction may be viewed as "policy-based
   aggregation" where the policies are applied to overcome the issues
   with aggregation as identified in Section 3 of this document.

   Abstraction is the process of applying policy to the available TE
   information within a domain to produce selective information that
   represents the potential ability to connect across the domain.
   Thus, abstraction does not necessarily offer all possible
   connectivity options, but it presents a general view of potential
   connectivity according to the policies that determine how the
   domain's administrator wants to allow the domain resources to be
   used.

1.1.7. Abstract Link

   An abstract link is the representation of the characteristics of a
   path between two nodes in a domain produced by abstraction.  The
   abstract link is advertised outside that domain as a TE link for use
   in signaling in other domains.  Thus, an abstract link represents
   the potential to connect between a pair of nodes.

   More details of abstract links are provided in Section 5.3.1.

1.1.8. Abstraction Layer Network

   The abstraction layer network is introduced in Section 5.3.2.  It
   may be seen as a brokerage layer network between one or more server
   networks and one or more client networks.
   The abstraction layer network is the collection of abstract links
   that provide potential connectivity across the server network(s) and
   on which path computation can be performed to determine edge-to-edge
   paths that provide connectivity as links in the client network.

   In the simplest case, the abstraction layer network is just a set of
   edge-to-edge connections (i.e., abstract links), but to make the use
   of server resources more flexible, the abstract links might not all
   extend from edge to edge; they might instead offer connectivity
   between server nodes to form a more complex network.

2. Overview of Use Cases

2.1. Peer Networks

   The peer network use case can be most simply illustrated by the
   example in Figure 1.  A TE path is required between the source (Src)
   and destination (Dst), which are located in different domains.
   There are two points of interconnection between the domains, and
   selecting the wrong point of interconnection can lead to a
   sub-optimal path, or it can even fail to make a path available.

   For example, when Domain A attempts to select a path, it may
   determine that adequate bandwidth is available from Src through both
   interconnection points x1 and x2.  It may pick the path through x1
   for local policy reasons: perhaps the TE metric is smaller.
   However, if there is no connectivity in Domain Z from x1 to Dst, the
   path cannot be established.  Techniques such as crankback (see
   Section 4.2) may be used to alleviate this situation, but they do
   not lead to rapid setup or guaranteed optimality.  Furthermore, RSVP
   signaling creates state in the network that is immediately removed
   by the crankback procedure.  Frequent events of this kind impact
   scalability in a non-deterministic manner.
    --------------          --------------
   |  Domain A    |   x1   |  Domain Z    |
   |   -----      +--------+      -----   |
   |  | Src |     +--------+     | Dst |  |
   |   -----      |   x2   |      -----   |
    --------------          --------------

                  Figure 1 : Peer Networks

   There are countless more complicated examples of the problem of
   peer networks.  Figure 2 shows the case where there is a simple
   mesh of domains.  Clearly, to find a TE path from Src to Dst,
   Domain A must not select a path leaving through interconnect x1,
   since Domain B has no connectivity to Domain Z.  Furthermore, in
   deciding whether to select interconnection x2 (through Domain C) or
   interconnection x3 (through Domain D), Domain A must be sensitive
   to the TE connectivity available through each of Domains C and D,
   as well as the TE connectivity from each of interconnections x4 and
   x5 to Dst within Domain Z.

                       --------------
                      |  Domain B    |
                      |              |
                      |              |
                      /--------------
                     /
                    /
                   /x1
    --------------/                            --------------
   |  Domain A    |                           |  Domain Z    |
   |              |       --------------      |              |
   |   -----      |  x2  |  Domain C    |  x4 |      -----   |
   |  | Src |     +------+              +------+    | Dst |  |
   |   -----      |      |              |      |     -----   |
   |              |       --------------      |              |
    --------------\                            /--------------
                   \x3                        /
                    \                        /
                     \                      /x5
                      \--------------------/
                      |  Domain D          |
                      |                    |
                      |                    |
                       --------------------

                 Figure 2 : Peer Networks in a Mesh

   Of course, many network interconnection scenarios are going to be a
   combination of the situations expressed in these two examples.
   There may be a mesh of domains, and the domains may have multiple
   points of interconnection.

2.1.1. Where is the Destination?

   A variation of the problems expressed in Section 2.1 arises when
   the source domain (Domain A in both figures) does not know where
   the destination is located.  That is, when the domain in which the
   destination node is located is not known to the source domain.
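The interconnection-selection pitfall described in Section 2.1 can be made concrete with a short sketch.  The following is purely illustrative: the metric values, the reachability data, and the function names are invented here, and no protocol encoding or real implementation is implied.

```python
# Domain A's local view (illustrative values): TE metric from Src to
# each point of interconnection with Domain Z (per Figure 1).
local_metric = {"x1": 5, "x2": 10}

# Hypothetical TE reachability information from Domain Z: whether Dst
# can be reached along a TE path from each interconnection point.
dst_reachable = {"x1": False, "x2": True}

def pick_exit_local_only(metric):
    """Choose the exit with the smallest local TE metric.

    With no view into Domain Z, the choice may fall on an exit (here,
    x1) from which the destination cannot be reached, leading to setup
    failure and crankback, as described in the text above."""
    return min(metric, key=metric.get)

def pick_exit_with_reachability(metric, reachable):
    """Consider only exits from which Dst is TE-reachable."""
    candidates = {x: m for x, m in metric.items() if reachable.get(x)}
    # Returning None avoids doomed signaling attempts altogether.
    return min(candidates, key=candidates.get) if candidates else None
```

With the data above, the local-only choice is x1 (and LSP setup would fail inside Domain Z), while the reachability-aware choice is x2; this is exactly the information-exchange gap that the rest of this document addresses.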
   This is most easily seen by considering Figure 2, where the decision
   about which interconnection to select needs to be based on building
   a path toward the destination domain.  Yet this can only be achieved
   if it is known in which domain the destination node lies, or at
   least if there is some indication of the direction in which the
   destination lies.  In IP networks, this function is provided by
   inter-domain routing [RFC4271].

2.2. Client-Server Networks

   Two major classes of use case relate to the client-server
   relationship between networks.  These use cases have sometimes been
   referred to as overlay networks.

   The first group of use cases, shown in Figure 3, occurs when domains
   belonging to one network are connected by a domain belonging to
   another network.  In this scenario, once connections (or tunnels)
   are formed across the lower layer network, the domains of the upper
   layer network can be merged into a single domain by running IGP
   adjacencies over the tunnels and treating the tunnels as links in
   the higher layer network.  The TE relationship between the domains
   (higher and lower layer) in this case is reduced to determining
   which tunnels to set up, how to trigger them, how to route them,
   and what capacity to assign them.  As the demands in the higher
   layer network vary, these tunnels may need to be modified.  Section
   2.4 explains in a little more detail how connectivity may be
   requested.

    --------------                          --------------
   |  Domain A    |                        |  Domain Z    |
   |              |                        |              |
   |   -----      |                        |      -----   |
   |  | Src |     |                        |     | Dst |  |
   |   -----      |                        |      -----   |
   |              |                        |              |
    --------------\                        /--------------
                   \x1                  x2/
                    \                    /
                     \                  /
                      \----------------/
                      |  Server Domain |
                      |                |
                      |                |
                       ----------------

                Figure 3 : Client-Server Networks

   The second class of use case of client-server networking is for
   Virtual Private Networks (VPNs).
   In this case, as opposed to the former one, it is assumed that the
   client network has a different address space than that of the
   server network, where non-overlapping IP addresses between the
   client and the server networks cannot be guaranteed.  A simple
   example is shown in Figure 4.  The VPN sites comprise a set of
   domains that are interconnected over a core domain, the provider
   network.

    --------------                          --------------
   |  Domain A    |                        |  Domain Z    |
   |  (VPN site)  |                        |  (VPN site)  |
   |              |                        |              |
   |   -----      |                        |      -----   |
   |  | Src |     |                        |     | Dst |  |
   |   -----      |                        |      -----   |
   |              |                        |              |
    --------------\                        /--------------
                   \x1                  x2/
                    \                    /
                     \                  /
                      \----------------/
                      |  Core Domain   |
                      |                |
                      |                |
                      /----------------\
                     /                  \
                    /                    \
                   /x3                  x4\
    --------------/                        \--------------
   |  Domain B    |                        |  Domain C    |
   |  (VPN site)  |                        |  (VPN site)  |
   |              |                        |              |
   |              |                        |              |
    --------------                          --------------

              Figure 4 : A Virtual Private Network

   Note that in the use cases shown in Figures 3 and 4 the client
   layer domains may (and, in fact, probably do) operate as a single
   connected network.

   Both use cases in this section become "more interesting" when
   combined with the use case in Section 2.1.  That is, when the
   connectivity between higher layer domains or VPN sites is provided
   by a sequence or mesh of lower layer domains.  Figure 5 shows how
   this might look in the case of a VPN.
    ------------                                ------------
   | Domain A   |                              | Domain Z   |
   | (VPN site) |                              | (VPN site) |
   |   -----    |                              |    -----   |
   |  | Src |   |                              |   | Dst |  |
   |   -----    |                              |    -----   |
   |            |                              |            |
    ------------\                              /------------
                 \x1                        x2/
                  \                          /
                   \                        /
                    \------------      ------------/
                    | Domain X   | x5 | Domain Y   |
                    | (core)     +----+ (core)     |
                    |            |    |            |
                    |            +----+            |
                    |            | x6 |            |
                    /------------      ------------\
                   /                                \
                  /                                  \
                 /x3                                x4\
    ------------/                              \------------
   | Domain B   |                              | Domain C   |
   | (VPN site) |                              | (VPN site) |
   |            |                              |            |
    ------------                                ------------

        Figure 5 : A VPN Supported Over Multiple Server Domains

2.3. Dual-Homing

   A further complication may be added to the client-server
   relationship described in Section 2.2 by considering what happens
   when a client domain is attached to more than one server domain, or
   has two points of attachment to a server domain.  Figure 6 shows an
   example of this for a VPN.

                              ------------
                             | Domain A   |
                             | (VPN site) |
    ------------             |   -----    |
   | Domain B   |            |  | Src |   |
   | (VPN site) |            |   -----    |
   |            |            |            |
    ------------\             -+--------+-
                 \x1           |        |
                  \          x2|        |x3
                   \           |        |                 ------------
                    \---------+--      -+-----------     | Domain Z   |
                    | Domain X   | x8 | Domain Y   | x4  | (VPN site) |
                    | (core)     +----+ (core)     +----+   -----     |
                    |            |    |            |    |  | Dst |    |
                    |            +----+            +----+   -----     |
                    |            | x9 |            | x5 |             |
                    /------------      ------------\      ------------
                   /                                \
                  /                                  \
                 /x6                                x7\
    ------------/                              \------------
   | Domain C   |                              | Domain D   |
   | (VPN site) |                              | (VPN site) |
   |            |                              |            |
    ------------                                ------------

         Figure 6 : Dual-Homing in a Virtual Private Network

2.4. Requesting Connectivity

   This relationship between domains can be entirely under the control
   of management processes, dynamically triggered by the client
   network, or some hybrid of these cases.  In the management case,
   the server network may be requested to establish a set of LSPs to
   provide client layer connectivity.
   In the dynamic case, the client may make a request to the server
   network exerting a range of controls over the paths selected in the
   server network.  This range extends from no control (i.e., a simple
   request for connectivity), through a set of constraints (such as
   latency, path protection, etc.), up to and including full control
   of the path and resources used in the server network (i.e., the use
   of explicit paths with label subobjects).

   There are various models by which a server network can be requested
   to set up the connections that support a service provided to the
   client network.  These requests may come from management systems,
   directly from the client network control plane, or through some
   intermediary broker such as the Virtual Network Topology Manager
   discussed in Section 4.6.

   The trigger that causes the request to the server layer is also
   flexible.  It could be that the client layer discovers a pressing
   need for server layer resources (such as the desire to provision an
   end-to-end connection in the client layer, or severe congestion on
   a specific path), or it might be that a planning application has
   considered how best to optimize traffic in the client network or
   how to handle a predicted traffic demand.

   In all cases, the relationship between client and server networks
   is subject to policy so that server resources are under the
   administrative control of the operator of the server layer network
   and are only used to support a client layer network in ways that
   the server layer operator approves.

   As just noted, connectivity requests issued to a server network may
   include varying degrees of constraint upon the choice of path that
   the server network can implement.

   o  Basic Provisioning is a simple request for connectivity.
      The only constraints are the end points of the connection and
      the capacity (bandwidth) that the connection will support for
      the client layer.  In the case of some server networks, even the
      bandwidth component of a basic provisioning request is
      superfluous because the server layer has no facility to vary
      bandwidth and can offer connectivity only at a default capacity.

   o  Basic Provisioning with Optimization is a service request that
      indicates one or more metrics that the server layer must
      optimize in its selection of a path.  Metrics may be hop count,
      path length, summed TE metric, jitter, delay, or any number of
      technology-specific constraints.

   o  Basic Provisioning with Optimization and Constraints enhances
      the optimization process to apply absolute constraints to
      functions of the path metrics.  For example, a connection may be
      requested that optimizes for the shortest path, but in any case
      requests that the end-to-end delay be less than a certain value.
      Equally, optimization may be expressed in terms of the impact on
      the network.  For example, a service may be requested in order
      to leave maximal flexibility to satisfy future service requests.

   o  Fate Diversity requests ask for the server layer to provide a
      path that does not use any network resources (usually links and
      nodes) that share fate (i.e., can fail as the result of a single
      event) with the resources used by another connection.  This
      allows the client layer to construct protection services over
      the server layer network, for example by establishing virtual
      links that are known to be fate diverse.  The connections that
      have diverse paths need not share end points.

   o  Provisioning with Fate Sharing is the exact opposite of Fate
      Diversity.  In this case, two or more connections are requested
      to follow the same path in the server network.
      This may be requested, for example, to create a bundled or
      aggregated link in the client layer where each component of the
      client layer composite link is required to have the same server
      layer properties (metrics, delay, etc.) and the same failure
      characteristics.

   o  Concurrent Provisioning enables the inter-related connection
      requests described in the previous two bullets to be enacted
      through a single, compound service request.

   o  Service Resilience requests the server layer to provide
      connectivity for which the server layer takes responsibility to
      recover from faults.  The resilience may be achieved through the
      use of link-level protection, segment protection, end-to-end
      protection, or recovery mechanisms.

2.4.1. Discovering Server Network Information

   Although the topology and resource availability information of a
   server network may be hidden from the client network, the service
   request interface may support features that report details about
   the services and potential services that the server network
   supports.

   o  Reporting of path details, service parameters, and issues such
      as path diversity of LSPs that support deployed services allows
      the client network to understand to what extent its requests
      were satisfied.  This is particularly important when the
      requests were made as "best effort".

   o  A server network may support requests of the form "if I were to
      ask you for this service, would you be able to provide it?"
      That is, a service request that does everything except actually
      provision the service.

3. Problem Statement

   The problem statement presented in this section is as much about
   the issues that may arise in any solution (and so have to be
   avoided) and the features that are desirable within a solution as
   it is about the actual problem to be solved.
   The problem can be stated very simply and with reference to the use
   cases presented in the previous section.

      A mechanism is required that allows TE-path computation in one
      domain to make informed choices about the TE capabilities and
      exit points from the domain when signaling an end-to-end TE path
      that will extend across multiple domains.

   Thus, the problem is one of information collection and
   presentation, not about signaling.  Indeed, the existing signaling
   mechanisms for TE LSP establishment are likely to prove adequate
   [RFC4726], with the possibility of minor extensions.

   An interesting annex to the problem is how the path is made
   available for use.  For example, in the case of a client-server
   network, the path established in the server network needs to be
   made available as a TE link to provide connectivity in the client
   network.

3.1. Use of Existing Protocol Mechanisms

   TE information may currently be distributed in a domain by TE
   extensions to one of the two IGPs as described in OSPF-TE [RFC3630]
   and ISIS-TE [RFC5305].  TE information may be exported from a
   domain (for example, northbound) using link state extensions to BGP
   [I-D.ietf-idr-ls-distribution].

   It is desirable that a solution to the problem described in this
   document does not require the implementation of a new, network-wide
   protocol.  Instead, it would be advantageous to make use of an
   existing protocol that is commonly implemented on network nodes and
   is currently deployed, or to use existing computational elements
   such as Path Computation Elements (PCEs).  This has many benefits
   in network stability, time to deployment, and operator training.

   It is recognized, however, that existing protocols are unlikely to
   be immediately suitable to this problem space without some protocol
   extensions.
Extending protocols must be done with care and with 701 consideration for the stability of existing deployments. In extreme 702 cases, a new protocol can be preferable to a messy hack of an 703 existing protocol. 705 3.2. Policy and Filters 707 A solution must be amenable to the application of policy and filters. 708 That is, the operator of a domain that is sharing information with 709 another domain must be able to apply controls to what information is 710 shared. Furthermore, the operator of a domain that has information 711 shared with it must be able to apply policies and filters to the 712 received information. 714 Additionally, the path computation within a domain must be able to 715 weight the information received from other domains according to local 716 policy such that the resultant computed path meets the local 717 operator's needs and policies rather than those of the operators of 718 other domains. 720 3.3. Confidentiality 722 A feature of the policy described in Section 3.2 is that an operator 723 of a domain may desire to keep confidential the details about its 724 internal network topology and loading. This information could be 725 construed as commercially sensitive. 727 Although it is possible that TE information exchange will take place 728 only between parties that have significant trust, there are also use 729 cases (such as the VPN supported over multiple server domains 730 described in Section 2.4) where information will be shared between 731 domains that have a commercial relationship, but a low level of 732 trust. 734 Thus, it must be possible for a domain to limit the information shared 735 to just that which the computing domain needs to know, with the 736 understanding that the less information that is made available, the more 737 likely it is that the result will be a less optimal path and/or more 738 crankback events. 740 3.4.
Information Overload 742 One reason that networks are partitioned into separate domains is to 743 reduce the set of information that any one router has to handle. 744 This also applies to the volume of information that routing protocols 745 have to distribute. 747 Over the years routers have become more sophisticated with greater 748 processing capabilities and more storage, the control channels on 749 which routing messages are exchanged have become higher capacity, and 750 the routing protocols (and their implementations) have become more 751 robust. Thus, some of the arguments in favor of dividing a network 752 into domains may have been reduced. Conversely, however, the size of 753 networks continues to grow dramatically with a consequent increase in 754 the total amount of routing-related information available. 755 Additionally, in this case, the problem space spans two or more 756 networks. 758 Any solution to the problems voiced in this document must be aware of 759 the issues of information overload. If the solution was to simply 760 share all TE information between all domains in the network, the 761 effect from the point of view of the information load would be to 762 create one single flat network domain. Thus the solution must 763 deliver enough information to make the computation practical (i.e., 764 to solve the problem), but not so much as to overload the receiving 765 domain. Furthermore, the solution cannot simply rely on the policies 766 and filters described in Section 3.2 because such filters might not 767 always be enabled. 769 3.5. Issues of Information Churn 771 As LSPs are set up and torn down, the available TE resources on links 772 in the network change. In order to reliably compute a TE path 773 through a network, the computation point must have an up-to-date view 774 of the available TE resources. However, collecting this information 775 may result in considerable load on the distribution protocol and 776 churn in the stored information. 
In order to deal with this problem 777 even in a single domain, updates are sent at periodic intervals or 778 whenever there is a significant change in resources, whichever 779 happens first. 781 Consider, for example, that a TE LSP may traverse ten links in a 782 network. When the LSP is set up or torn down, the resources 783 available on each link will change resulting in a new advertisement 784 of the link's capabilities and capacity. If the arrival rate of new 785 LSPs is relatively fast, and the hold times relatively short, the 786 network may be in a constant state of flux. Note that the 787 problem here is not limited to churn within a single domain, since 788 the information shared between domains will also be changing. 789 Furthermore, the information that one domain needs to share with 790 another may change as the result of LSPs that are contained within or 791 cross the first domain but which are of no direct relevance to the 792 domain receiving the TE information. 794 In packet networks, where the capacity of an LSP is often a small 795 fraction of the resources available on any link, this issue is 796 partially addressed by the advertising routers. They can apply a 797 threshold so that they do not bother to update the advertisement of 798 available resources on a link if the change is less than a configured 799 percentage of the total (or alternatively, the remaining) resources. 800 The updated information in that case will be disseminated based on an 801 update interval rather than a resource change event. 803 In non-packet networks, where link resources are physical switching 804 resources (such as timeslots or wavelengths) the capacity of an LSP 805 may more frequently be a significant percentage of the available link 806 resources. 
Furthermore, in some switching environments, it is 807 necessary to achieve end-to-end resource continuity (such as using 808 the same wavelength on the whole length of an LSP), so it is far more 809 desirable to keep the TE information held at the computation points 810 up-to-date. Fortunately, non-packet networks tend to be quite a bit 811 smaller than packet networks, the arrival rates of non-packet LSPs 812 are much lower, and the hold times considerably longer. Thus the 813 information churn may be sustainable. 815 3.6. Issues of Aggregation 817 One possible solution to the issues raised in other sub-sections of 818 this section is to aggregate the TE information shared between 819 domains. Two aggregation mechanisms are often considered: 821 - Virtual node model. In this view, the domain is aggregated as if 822 it were a single node (or router / switch). Its links to other 823 domains are presented as real TE links, but the model assumes that 824 any LSP entering the virtual node through a link can be routed to 825 leave the virtual node through any other link (although recent work 826 on "limited cross-connect switches" may help with this problem 827 [I-D.ietf-ccamp-general-constraint-encode]). 829 - Virtual link model. In this model, the domain is reduced to a set 830 of edge-to-edge TE links. Thus, when computing a path for an LSP 831 that crosses the domain, a computation point can see which domain 832 entry points can be connected to which others and with what TE 833 attributes. 835 It is of the nature of aggregation that information is removed from 836 the system. This can cause inaccuracies and failed path computation. 837 For example, in the virtual node model there might not actually be a 838 TE path available between a pair of domain entry points, but the 839 model lacks the sophistication to represent this "limited cross- 840 connect capability" within the virtual node.
On the other hand, in 841 the virtual link model it may prove very hard to aggregate multiple 842 link characteristics: for example, there may be one path available 843 with high bandwidth, and another with low delay, but this does not 844 mean that the connectivity should be assumed or advertised as having 845 both high bandwidth and low delay. 847 The trick to this multidimensional problem, therefore, is to 848 aggregate in a way that retains as much useful information as 849 possible while removing the data that is not needed. An important 850 part of this trick is a clear understanding of what information is 851 actually needed. 853 It should also be noted in the context of Section 3.5 that changes in 854 the information within a domain may have a bearing on what aggregated 855 data is shared with another domain. Thus, while the data shared is 856 reduced, the aggregation algorithm (operating on the routers 857 responsible for sharing information) may be heavily exercised. 859 3.7. Virtual Network Topology 861 The terms "virtual topology" and "virtual network topology" have 862 become overloaded in a relatively short time. We draw on [RFC5212] 863 and [RFC5623] for inspiration to provide a definition for use in this 864 document. Our definition is based on the fact that a topology at the 867 client network layer is constructed of nodes and links. Typically, 868 the nodes are routers in the client layer, and the links are data 869 links. However, a layered network provides connectivity through the 870 lower layer as LSPs, and these LSPs can provide links in the client 871 layer. Furthermore, those LSPs may have been established in advance, 872 or might be LSPs that could be set up if required. This leads to the 873 definition: 875 A Virtual Network Topology (VNT) is made up of links in a network 876 layer.
Those links may be realized as direct data links or as 877 multi-hop connections (LSPs) in a lower network layer. Those 878 underlying LSPs may be established in advance or created on demand. 880 The creation and management of a VNT requires interaction with 881 management and policy. Activity is needed in both the client and 882 server layer: 884 - In the server layer, LSPs need to be set up either in advance in 885 response to management instructions or in answer to dynamic 886 requests subject to policy considerations. 888 - In the server layer, evaluation of available TE resources can lead 889 to the announcement of potential connectivity (i.e., LSPs that 890 could be set up on demand). 892 - In the client layer, connectivity (lower layer LSPs or potential 893 LSPs) needs to be announced in the IGP as a normal TE link. Such 894 links may or may not be made available to IP routing; but they are 895 never made available to IP routing until fully instantiated. 897 - In the client layer, requests to establish lower layer LSPs need to 898 be made either when links supported by potential LSPs are about to 899 be used (i.e., when a higher layer LSP is signalled to cross the 900 link, the setup of the lower layer LSP is triggered), or when the 901 client layer determines it needs more connectivity or capacity. 903 It is fundamental to the use of a VNT that there is a policy point 904 at the lower-layer node responsible for the instantiation of a lower- 905 layer LSP. At the moment that the setup of a lower-layer LSP is 906 triggered, whether from a client-layer management tool or from 907 signaling in the client layer, the server layer must be able to apply 908 policy to determine whether to actually set up the LSP.
Thus, fears 909 that a micro-flow in the client layer might cause the activation of 910 100G optical resources in the server layer can be completely 911 controlled by the policy of the server layer network's operator (and 912 could even be subject to commercial terms). 914 These activities require an architecture and protocol elements as 915 well as management components and policy elements. 917 4. Existing Work 919 This section briefly summarizes relevant existing work that is used 920 to route TE paths across multiple domains. 922 4.1. Per-Domain Path Computation 924 The per-domain mechanism of path establishment is described in 925 [RFC5152] and its applicability is discussed in [RFC4726]. In 926 summary, this mechanism assumes that each domain entry point is 927 responsible for computing the path across the domain, but that 928 details of the path in the next domain are left to the next domain 929 entry point. The computation may be performed directly by the entry 930 point or may be delegated to a computation server. 932 This basic mode of operation can run into many of the issues 933 described alongside the use cases in Section 2. However, in practice 934 it can be used effectively with a little operational guidance. 936 For example, RSVP-TE [RFC3209] includes the concept of a "loose hop" 937 in the explicit path that is signaled. This allows the original 938 request for an LSP to list the domains or even domain entry points to 939 include on the path. Thus, in the example in Figure 1, the source 940 can be told to use the interconnection x2. Then the source computes 941 the path from itself to x2, and initiates the signaling. When the 942 signaling message reaches Domain Z, the entry point to the domain 943 computes the remaining path to the destination and continues the 944 signaling. 946 Another alternative suggested in [RFC5152] is to make TE routing 947 attempt to follow inter-domain IP routing. 
Thus, in the example 948 shown in Figure 2, the source would examine the BGP routing 949 information to determine the correct interconnection point for 950 forwarding IP packets, and would use that to compute and then signal 951 a path for Domain A. Each domain in turn would apply the same 952 approach so that the path is progressively computed and signaled 953 domain by domain. 955 Although the per-domain approach has many issues and drawbacks in 956 terms of achieving optimal (or, indeed, any) paths, it has been the 957 mainstay of inter-domain LSP set-up to date. 959 4.2. Crankback 961 Crankback addresses one of the main issues with per-domain path 962 computation: what happens when an initial path is selected that 963 cannot be completed toward the destination? For example, what 964 happens if, in Figure 2, the source attempts to route the path 965 through interconnection x2, but Domain C does not have the right TE 966 resources or connectivity to route the path further? 968 Crankback for MPLS-TE and GMPLS networks is described in [RFC4920] 969 and is based on a concept similar to the Acceptable Label Set 970 mechanism described for GMPLS signaling in [RFC3473]. When a node 971 (i.e., a domain entry point) is unable to compute a path further 972 across the domain, it returns an error message in the signaling 973 protocol that states where the blockage occurred (link identifier, 974 node identifier, domain identifier, etc.) and gives some clues about 975 what caused the blockage (bad choice of label, insufficient bandwidth 976 available, etc.). This information allows a previous computation 977 point to select an alternative path, or to aggregate crankback 978 information and return it upstream to a previous computation point. 980 Crankback is a very powerful mechanism and can be used to find an 981 end-to-end path in a multi-domain network if one exists. 
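The compute-signal-fail-recompute loop that crankback implies can be illustrated with a short, non-normative sketch. The graph model, the representation of a blockage, and all names below are illustrative assumptions; the real mechanism carries blockage information in signaling protocol error messages as defined in [RFC4920].

```python
from collections import deque

def compute_path(graph, src, dst, excluded):
    """Breadth-first search for a path from src to dst that avoids
    links already reported as blocked via crankback."""
    queue = deque([[src]])
    seen = {src}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == dst:
            return path
        for nxt in graph.get(node, []):
            if nxt not in seen and (node, nxt) not in excluded:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None

def setup_with_crankback(graph, src, dst, blocked_links):
    """Repeatedly compute and 'signal' a path; each failure reports
    the blocking link, which is excluded from the next computation."""
    excluded = set()
    while True:
        path = compute_path(graph, src, dst, excluded)
        if path is None:
            return None  # no end-to-end path exists at all
        # Simulated signaling: setup fails at the first blocked link.
        for hop in zip(path, path[1:]):
            if hop in blocked_links:
                excluded.add(hop)  # crankback: record blockage, retry
                break
        else:
            return path  # setup succeeded end-to-end
```

In a real network the exclusions would be derived from the error information returned in the signaling protocol (link, node, or domain identifiers and the cause of the blockage) rather than from a precomputed set of blocked links.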
983 On the other hand, crankback can be quite resource-intensive as 984 signaling messages and path setup attempts may "wander around" in the 985 network attempting to find the correct path for a long time. Since 986 RSVP-TE signaling ties up network resources for partially 987 established LSPs, since network conditions may be in flux, and most 988 particularly since LSP setup within well-known time limits is highly 989 desirable, crankback is not a popular mechanism. 991 Furthermore, even if crankback can always find an end-to-end path, it 992 does not guarantee to find the optimal path. (Note that there have 993 been some academic proposals to use signaling-like techniques to 994 explore the whole network in order to find optimal paths, but these 995 tend to place even greater burdens on network processing.) 997 4.3. Path Computation Element 999 The Path Computation Element (PCE) is introduced in [RFC4655]. It is 1000 an abstract functional entity that computes paths. Thus, in the 1001 example of per-domain path computation (Section 4.1) the source node 1002 and each domain entry point is a PCE. On the other hand, the PCE can 1003 also be realized as a separate network element (a server) to which 1004 computation requests can be sent using the Path Computation Element 1005 Communication Protocol (PCEP) [RFC5440]. 1007 Each PCE has responsibility for computations within a domain, and has 1008 visibility of the attributes within that domain. This immediately 1009 enables per-domain path computation with the opportunity to off-load 1010 complex, CPU-intensive, or memory-intensive computation functions 1011 from routers in the network. But the use of PCE in this way does not 1012 solve any of the problems articulated in Sections 4.1 and 4.2. 1014 Two significant mechanisms for cooperation between PCEs have been 1015 described.
These mechanisms are intended to specifically address the 1016 problems of computing optimal end-to-end paths in multi-domain 1017 environments. 1019 - The Backward-Recursive PCE-Based Computation (BRPC) mechanism 1020 [RFC5441] involves cooperation between the set of PCEs along the 1021 inter-domain path. Each one computes the possible paths from 1022 domain entry point (or source node) to domain exit point (or 1023 destination node) and shares the information with its upstream 1024 neighbor PCE which is able to build a tree of possible paths 1025 rooted at the destination. The PCE in the source domain can 1026 select the optimal path. 1028 BRPC is sometimes described as "crankback at computation time". It 1029 is capable of determining the optimal path in a multi-domain 1030 network, but depends on knowing the domain that contains the 1031 destination node. Furthermore, the mechanism can become quite 1032 complicated and involve a lot of data in a mesh of interconnected 1033 domains. Thus, BRPC is most often proposed for a simple mesh of 1034 domains and specifically for a path that will cross a known 1035 sequence of domains, but where there may be a choice of domain 1036 interconnections. In this way, BRPC would only be applied to 1037 Figure 2 if a decision had been made (externally) to traverse 1038 Domain C rather than Domain D (notwithstanding that it could 1039 functionally be used to make that choice itself), but BRPC could be 1040 used very effectively to select between interconnections x1 and x2 1041 in Figure 1. 1043 - Hierarchical PCE (H-PCE) [RFC6805] offers a parent PCE that is 1044 responsible for navigating a path across the domain mesh and for 1045 coordinating intra-domain computations by the child PCEs 1046 responsible for each domain. This approach makes computing an end- 1047 to-end path across a mesh of domains far more tractable. 
However, 1048 it still leaves unanswered the issue of determining the location of 1049 the destination (i.e., discovering the destination domain) as 1050 described in Section 2.1.1. Furthermore, it raises the question of 1051 who operates the parent PCE especially in networks where the 1052 domains are under different administrative and commercial control. 1054 It should also be noted that [RFC5623] discusses how PCE is used in a 1055 multi-layer network with coordination between PCEs operating at each 1056 network layer. Further issues and considerations of the use of PCE 1057 can be found in [RFC7399]. 1059 4.4. GMPLS UNI and Overlay Networks 1061 [RFC4208] defines the GMPLS User-to-Network Interface (UNI) to 1062 present a routing boundary between an overlay network and the core 1063 network, i.e. the client-server interface. In the client network, 1064 the nodes connected directly to the core network are known as edge 1065 nodes, while the nodes in the server network are called core nodes. 1067 In the overlay model defined by [RFC4208] the core nodes act as a 1068 closed system and the edge nodes do not participate in the routing 1069 protocol instance that runs among the core nodes. Thus the UNI 1070 allows access to and limited control of the core nodes by edge nodes 1071 that are unaware of the topology of the core nodes. This respects 1072 the operational and layer boundaries while scaling the network. 1074 [RFC4208] does not define any routing protocol extension for the 1075 interaction between core and edge nodes but allows for the exchange 1076 of reachability information between them. In terms of a VPN, the 1077 client network can be considered as the customer network comprised 1078 of a number of disjoint sites, and the edge nodes match the VPN CE 1079 nodes. Similarly, the provider network in the VPN model is 1080 equivalent to the server network. 
1082 [RFC4208] is, therefore, a signaling-only solution that allows edge 1083 nodes to request connectivity across the core network, and leaves the 1084 core network to select the paths and set up the core LSPs. This 1085 solution is supplemented by a number of signaling extensions such as 1086 [RFC4874], [RFC5553], [I-D.ietf-ccamp-xro-lsp-subobject], 1087 [I-D.ietf-ccamp-rsvp-te-srlg-collect], and 1088 [I-D.ietf-ccamp-te-metric-recording] to give the edge node more 1089 control over the LSP that the core network will set up by exchanging 1090 information about core LSPs that have been established and by 1091 allowing the edge nodes to supply additional constraints on the core 1092 LSPs that are to be set up. 1094 Nevertheless, in this UNI/overlay model, the edge node has limited 1095 information of precisely what LSPs could be set up across the core, 1096 and what TE services (such as diverse routes for end-to-end 1097 protection, end-to-end bandwidth, etc.) can be supported. 1099 4.5. Layer One VPN 1101 A Layer One VPN (L1VPN) is a service offered by a core layer 1 1102 network to provide layer 1 connectivity (TDM, LSC) between two or 1103 more customer networks in an overlay service model [RFC4847]. 1105 As in the UNI case, the customer edge has some control over the 1106 establishment and type of the connectivity. In the L1VPN context 1107 three different service models have been defined, classified by the 1108 semantics of information exchanged over the customer interface: 1109 Management Based, Signaling Based (a.k.a. basic), and Signaling and 1110 Routing service model (a.k.a. enhanced). 1112 In the management based model, all edge-to-edge connections are set 1113 up using configuration and management tools. This is not a dynamic 1114 control plane solution and need not concern us here.
1116 In the signaling based service model [RFC5251] the CE-PE interface 1117 allows only for signaling message exchange, and the provider network 1118 does not export any routing information about the core network. VPN 1119 membership is known a priori (presumably through configuration) or is 1120 discovered using a routing protocol [RFC5195], [RFC5252], [RFC5523], 1121 as is the relationship between CE nodes and ports on the PE. This 1122 service model is much in line with GMPLS UNI as defined in [RFC4208]. 1124 In the enhanced model there is an additional limited exchange of 1125 routing information over the CE-PE interface between the provider 1126 network and the customer network. The enhanced model considers four 1127 different types of service models, namely: Overlay Extension, Virtual 1128 Node, Virtual Link and Per-VPN service models. All of these 1129 represent particular cases of the TE information aggregation and 1130 representation. 1132 4.6. VNT Manager and Link Advertisement 1134 As discussed in Section 3.7, operation of a VNT requires policy and 1135 management input. In order to handle this, [RFC5623] introduces the 1136 concept of the Virtual Network Topology Manager (VNTM). This is a 1137 functional component that applies policy to requests from client 1138 networks (or agents of the client network, such as a PCE) for the 1139 establishment of LSPs in the server network to provide connectivity 1140 in the client network. 1142 The VNTM would, in fact, form part of the provisioning path for all 1143 server network LSPs whether they are set up ahead of client network 1144 demand or triggered by end-to-end client network LSP signaling. 1146 An important companion to this function is determining how the LSP 1147 set up across the server network is made available as a TE link in 1148 the client network. Obviously, if the LSP is established using 1149 management intervention, the subsequent client network TE link can 1150 also be configured manually. 
However, if the LSP is signaled 1151 dynamically there is a need for the end points to exchange the link 1152 properties that they should advertise within the client network, and 1153 in the case of a server network that supports more than one client, 1154 it will be necessary to indicate which client or clients can use the 1155 link. This capability is provided in [RFC6107]. 1157 Note that a potential server network LSP that is advertised as a TE 1158 link in the client network might need to be determined dynamically by 1159 the edge nodes. In this case there will need to be some effort to 1160 ensure that both ends of the link have the same view of the available 1161 TE resources, or else the advertised link will be asymmetrical. 1163 4.7. What Else is Needed and Why? 1165 As can be seen from Sections 4.1 through 4.6, a lot of effort has 1166 focused on client-server networks as described in Figure 3. Far less 1167 consideration has been given to network peering or the combination of 1168 the two use cases. 1170 Various work has been suggested to extend the definition of the UNI 1171 such that routing information can be passed across the interface. 1172 However, this approach seems to break the architectural concept of 1173 network separation that the UNI facilitates. 1175 Other approaches are working toward a flattening of the network with 1176 complete visibility into the server networks being made available in 1177 the client network. These approaches, while functional, ignore the 1178 main reasons for introducing network separation in the first place. 1180 The remainder of this document introduces a new approach based on 1181 network abstraction that allows a server network to use its own 1182 knowledge of its resources and topology combined with its own 1183 policies to determine what edge-to-edge connectivity capabilities it 1184 will inform the client networks about. 1186 5. Architectural Concepts 1188 5.1.
Basic Components 1190 This section revisits the use cases from Section 2 to present the 1191 basic architectural components that provide connectivity in the 1192 peer and client-server cases. These component models can then be 1193 used in later sections to enable discussion of a solution 1194 architecture. 1196 5.1.1. Peer Interconnection 1198 Figure 7 shows the basic architectural concepts for connecting across 1199 peer networks. Nodes from four networks are shown: A1 and A2 come 1200 from one network; B1, B2, and B3 from another network; etc. The 1201 interfaces between the networks (sometimes known as External Network- 1202 to-Network Interfaces - ENNIs) are A2-B1, B3-C1, and C3-D1. 1204 The objective is to be able to support an end-to-end connection A1- 1205 to-D2. This connection is for TE connectivity. 1207 As shown in the figure, LSP tunnels that span the transit networks 1208 are used to achieve the required connectivity. These transit LSPs 1209 form the key building blocks of the end-to-end connectivity. 1211 The transit tunnels can be used as hierarchical LSPs [RFC4206] to 1212 carry the end-to-end LSP, or can become stitching segments [RFC5150] 1213 of the end-to-end LSP. The transit tunnels B1-B3 and C1-C3 can be 1214 seen as abstract links, as discussed in Section 5.3. 1216 : : : 1217 Network A : Network B : Network C : Network D 1218 : : : 1219 -- -- -- -- -- -- -- -- -- -- 1220 |A1|--|A2|---|B1|--|B2|--|B3|---|C1|--|C2|--|C3|---|D1|--|D2| 1221 -- -- | | -- | | | | -- | | -- -- 1222 | |========| | | |========| | 1223 -- -- -- -- 1225 Key 1226 --- Direct connection between two nodes 1227 === LSP tunnel across transit network 1229 Figure 7 : Architecture for Peering 1231 5.1.2. Client-Server Interconnection 1233 Figure 8 shows the basic architectural concepts for a client-server 1234 network. The client network nodes are C1, C2, CE1, CE2, C3, and C4. 1235 The core network nodes are CN1, CN2, CN3, and CN4.
The interfaces 1236 CE1-CN1 and CE2-CN4 are the interfaces between the client and core 1237 networks. 1239 The objective is to be able to support an end-to-end connection, 1240 C1-to-C4, in the client network. This connection may support TE or 1241 normal IP forwarding. To achieve this, CE1 is to be connected to CE2 1242 by a link in the client layer that is supported by a core network 1243 LSP. 1245 As shown in the figure, two LSPs are used to achieve the required 1246 connectivity. One LSP is set up across the core from CN1 to CN4. 1247 This core LSP then supports a three-hop LSP from CE1 to CE2 with its 1248 middle hop being the core LSP. It is this LSP that is presented as a 1249 link in the client network. 1251 The practicalities of how the CE1-CE2 LSP is carried across the core 1252 LSP may depend on the switching and signaling options available in 1253 the core network. The LSP may be tunneled down the core LSP using 1254 the mechanisms of a hierarchical LSP [RFC4206], or the LSP segments 1255 CE1-CN1 and CN4-CE2 may be stitched to the core LSP as described in 1256 [RFC5150]. 1258 : : 1259 Client Network : Core Network : Client Network 1260 : : 1261 -- -- --- --- -- -- 1262 |C1|--|C2|--|CE1|................................|CE2|--|C3|--|C4| 1263 -- -- | | --- --- | | -- -- 1264 | |---|CN1|================|CN4|---| | 1265 --- | | --- --- | | --- 1266 | |--|CN2|--|CN3|--| | 1267 --- --- --- --- 1269 Key 1270 --- Direct connection between two nodes 1271 ... CE-to-CE LSP tunnel 1272 === LSP tunnel across the core 1274 Figure 8 : Architecture for Client-Server Network 1276 5.2. TE Reachability 1278 As described in Section 1.1, TE reachability is the ability to reach 1279 a specific address along a TE path. The knowledge of TE reachability 1280 enables an end-to-end TE path to be computed.
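As a non-normative illustration, a TE reachability entry of the form "you can get from node X to node Y with the following TE attributes", which may be unqualified or qualified by attributes, might be modelled as the record below. The field names and units are illustrative assumptions, not anything defined by this document.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class TEReachability:
    """One entry of exchanged TE reachability information.

    All qualifying attributes are optional: an entry with none of
    them set expresses unqualified reachability (a TE path exists)."""
    from_node: str
    to_node: str
    te_metric: Optional[int] = None
    hop_count: Optional[int] = None
    available_bandwidth: Optional[float] = None  # e.g., Mbit/s (assumed unit)
    delay_ms: Optional[float] = None

    def is_qualified(self) -> bool:
        """True if any TE attribute qualifies this reachability."""
        return any(v is not None for v in
                   (self.te_metric, self.hop_count,
                    self.available_bandwidth, self.delay_ms))
```

Attributes such as shared risk could be added in the same way; the point is only that each entry pairs two nodes with an optional set of TE qualifiers.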
1282 In a single network, TE reachability is derived from the Traffic 1283 Engineering Database (TED) that is the collection of all TE 1284 information about all TE links in the network. The TED is usually 1285 built from the data exchanged by the IGP, although it can be 1286 supplemented by configuration and inventory details especially in 1287 transport networks. 1289 In multi-network scenarios, TE reachability information can be 1290 described as "You can get from node X to node Y with the following 1291 TE attributes." For transit cases, nodes X and Y will be edge nodes 1292 of the transit network, but it is also important to consider the 1293 information about the TE connectivity between an edge node and a 1294 specific destination node. 1296 TE reachability may be unqualified (there is a TE path), or may be 1297 qualified by TE attributes such as TE metrics, hop count, available 1298 bandwidth, delay, shared risk, etc. 1300 TE reachability information can be exchanged between networks so that 1301 nodes in one network can determine whether they can establish TE 1302 paths across or into another network. Such exchanges are subject to 1303 a range of policies imposed by the advertiser (for security and 1304 administrative control) and by the receiver (for scalability and 1305 stability). 1307 5.3. Abstraction not Aggregation 1309 Aggregation is the process of synthesizing from available 1310 information. Thus, the virtual node and virtual link models 1311 described in Section 3.6 rely on processing the information available 1312 within a network to produce the aggregate representations of links 1313 and nodes that are presented to the consumer. As described in 1314 Section 3, dynamic aggregation is subject to a number of pitfalls. 1316 In order to distinguish the architecture described in this document 1317 from the previous work on aggregation, we use the term "abstraction" 1318 in this document. 
The process of abstraction is one of applying 1319 policy to the available TE information within a domain, to produce 1320 selective information that represents the potential ability to 1321 connect across the domain. 1323 Abstraction does not offer all possible connectivity options (refer 1324 to Section 3.6), but does present a general view of potential 1325 connectivity. Abstraction may have a dynamic element, but is not 1326 intended to keep pace with the changes in TE attribute availability 1327 within the network. 1329 Thus, when relying on an abstraction to compute an end-to-end path, 1330 the process might not deliver a usable path. That is, there is no 1331 actual guarantee that the abstractions are current or feasible. 1333 While abstraction uses available TE information, it is subject to 1334 policy and management choices. Thus, not all potential connectivity 1335 will be advertised to each client. The filters may depend on 1336 commercial relationships, the risk of disclosing confidential 1337 information, and concerns about what use is made of the connectivity 1338 that is offered. 1340 5.3.1. Abstract Links 1342 An abstract link is a measure of the potential to connect a pair of 1343 points with certain TE parameters. An abstract link may be realized 1344 by an existing LSP, or may represent the possibility of setting up an 1345 LSP. 1347 When looking at a network such as that in Figure 8, the link from CN1 1348 to CN4 may be an abstract link. If the LSP has already been set up, 1349 it is easy to advertise it as a link with known TE attributes: policy 1350 will have been applied in the server network to decide what LSP to 1351 set up. If the LSP has not yet been established, the potential for 1352 an LSP can be abstracted from the TE information in the core network 1353 subject to policy, and the resultant potential LSP can be advertised. 
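The two ways of producing an abstract link described above (from an existing LSP, or from the potential for one) can be sketched as follows. The function and field names are illustrative assumptions, and the policy hook stands in for the server network's local policy.

```python
# Illustrative sketch (names are assumptions): an abstract link is
# advertised either from an established LSP or from the potential to
# set one up, after applying the advertiser's local policy.
def derive_abstract_link(a, b, existing_lsp=None, potential_path=None,
                         policy=lambda link: True):
    """Return an advertisable abstract link, or None if policy
    withholds it or no supporting information is given."""
    if existing_lsp is not None:
        # The LSP is already set up: advertise its known TE attributes.
        link = {"ends": (a, b), "realized": True,
                "bandwidth": existing_lsp["bandwidth"]}
    elif potential_path is not None:
        # Potential LSP abstracted from server network TE information:
        # the advertisable bandwidth is bounded by the narrowest hop.
        link = {"ends": (a, b), "realized": False,
                "bandwidth": min(h["bandwidth"] for h in potential_path)}
    else:
        return None
    return link if policy(link) else None

# Advertising the CN1-CN4 link of Figure 8 from an existing LSP.
adv = derive_abstract_link("CN1", "CN4", existing_lsp={"bandwidth": 10.0})
```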
1355 Since the client nodes do not have visibility into the core network, 1356 they must rely on abstraction information delivered to them by the 1357 core network. That is, the core network will report on the potential 1358 for connectivity. 1360 5.3.2. The Abstraction Layer Network 1362 Figure 9 introduces the abstraction layer network. This construct 1363 separates the client layer resources (nodes C1, C2, C3, and C4, and 1364 the corresponding links), and the server layer resources (nodes CN1, 1365 CN2, CN3, and CN4 and the corresponding links). Additionally, the 1366 architecture introduces an intermediary layer called the abstraction 1367 layer. The abstraction layer contains the client layer edge nodes 1368 (C2 and C3), the server layer edge nodes (CN1 and CN4), the client- 1369 server links (C2-CN1 and CN4-C3) and the abstract link CN1-CN4. 1371 -- -- -- -- 1372 |C1|--|C2| |C3|--|C4| Client Network 1373 -- | | | | -- 1374 | | | | . . . . . . . . . . . 1375 | | | | 1376 | | | | 1377 | | --- --- | | Abstraction 1378 | |---|CN1|================|CN4|---| | Layer Network 1379 -- | | | | -- 1380 | | | | . . . . . . . . . . . . . . 1381 | | | | 1382 | | | | 1383 | | --- --- | | Server Network 1384 | |--|CN2|--|CN3|--| | 1385 --- --- --- --- 1387 Key 1388 --- Direct connection between two nodes 1389 === Abstract link 1391 Figure 9 : Architecture for Abstraction Layer Network 1393 The client layer network is able to operate as normal. Connectivity 1394 across the network can either be found or not found based on links 1395 that appear in the client layer TED. If connectivity cannot be 1396 found, end-to-end LSPs cannot be set up. This failure may be 1397 reported but no dynamic action is taken by the client layer. 1399 The server network layer also operates as normal. LSPs across the 1400 server layer are set up in response to management commands or in 1401 response to signaling requests. 
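The three-layer construct of Figure 9 can be captured in a small illustrative structure. The node names follow the figure; the dictionary representation itself is an assumption made for this sketch.

```python
# A minimal sketch of the three layers of Figure 9. Node names follow
# the figure; the dictionary representation itself is an assumption.
layers = {
    "client": {"nodes": ["C1", "C2", "C3", "C4"],
               "links": [("C1", "C2"), ("C3", "C4")]},
    "abstraction": {"nodes": ["C2", "C3", "CN1", "CN4"],
                    "links": [("C2", "CN1"), ("CN4", "C3"),
                              ("CN1", "CN4")]},   # CN1-CN4 is abstract
    "server": {"nodes": ["CN1", "CN2", "CN3", "CN4"],
               "links": [("CN1", "CN2"), ("CN2", "CN3"),
                         ("CN3", "CN4")]},
}

def edge_nodes(upper, lower):
    """Nodes present in both layers: the edge points between them."""
    return sorted(set(layers[upper]["nodes"]) &
                  set(layers[lower]["nodes"]))
```

As the text describes, the abstraction layer shares its edge nodes with the layers above and below it (C2 and C3 with the client layer; CN1 and CN4 with the server layer).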
1403 The abstraction layer consists of the physical links between the 1404 two networks, and also the abstract links. The abstract links are 1405 created by the server network according to local policy and represent 1406 the potential connectivity that could be created across the server 1407 network and which the server network is willing to make available for 1408 use by the client network. Thus, in this example, the diameter of 1409 the abstraction layer network is only three hops, but an instance of 1410 an IGP could easily be run so that all nodes participating in the 1411 abstraction layer (and in particular the client network edge nodes) 1412 can see the TE connectivity in the layer. 1414 When the client layer needs additional connectivity it can make a 1415 request to the abstraction layer network. For example, the operator 1416 of the client network may want to create a link from C2 to C3. The 1417 abstraction layer can see the potential path C2-CN1-CN4-C3, and asks 1418 the server layer to realize the abstract link CN1-CN4. The server 1419 layer provisions the LSP CN1-CN2-CN3-CN4 and makes the LSP available 1420 as a hierarchical LSP to turn the abstract link into a link that can 1421 be used in the client network. The abstraction layer can then set up 1422 an LSP C2-CN1-CN4-C3 using stitching or tunneling, and make the LSP 1423 available as a virtual link in the client network. 1425 Sections 5.3.3 and 5.3.4 show how this model is used to satisfy the 1426 requirements for connectivity in client-server networks and in peer 1427 networks. 1429 5.3.2.1. Nodes in the Abstraction Layer Network 1431 Figure 9 shows a very simplified network diagram and the reader would 1432 be forgiven for thinking that only client network edge nodes and 1433 server network edge nodes may appear in the abstraction layer 1434 network. But this is not the case: other nodes from the server 1435 network may be present. 
This allows the abstraction layer network 1436 to be more complex than a full mesh with access spokes. 1438 Thus, as shown in Figure 10, a transit node in the server network 1439 (here the node is CN3) can be exposed as a node in the abstraction 1440 layer network with abstract links connecting it to other nodes in 1441 the abstraction layer network. Of course, in the network shown in 1442 Figure 10, there is little if any value in exposing CN3, but if it 1443 had other abstract links to other nodes in the abstraction layer 1444 network and/or direct connections to client network nodes, then the 1445 resulting network would be richer. 1447 -- -- -- -- Client 1448 |C1|--|C2| |C3|--|C4| Network 1449 -- | | | | -- 1450 | | | | . . . . . . . . . 1451 | | | | 1452 | | | | 1453 | | --- --- --- | | Abstraction 1454 | |--|CN1|========|CN3|========|CN5|--| | Layer Network 1455 -- | | | | | | -- 1456 | | | | | | . . . . . . . . . . . . 1457 | | | | | | 1458 | | | | | | Server 1459 | | --- | | --- | | Network 1460 | |--|CN2|-| |-|CN4|--| | 1461 --- --- --- --- --- 1463 Figure 10 : Abstraction Layer Network with Additional Node 1465 It should be noted that the nodes included in the abstraction layer 1466 network in this way are not "abstract nodes" in the sense of a 1467 virtual node described in Section 3.6. While it is the case that 1468 the policy point responsible for advertising server network resources 1469 into the abstraction layer network could choose to advertise abstract 1470 nodes in place of real physical nodes, it is believed that doing so 1471 would introduce significant complexity in terms of: 1473 - Coordination between all of the external interfaces of the abstract 1474 node 1476 - Management of changes in the server network that lead to limited 1477 capabilities to reach (cross-connect) across the Abstract Node. 
It 1478 may be noted that recent work on limited cross-connect capabilities 1479 such as exist in asymmetrical switches could be used to represent 1480 the limitations in an abstract node 1481 [I-D.ietf-ccamp-general-constraint-encode], 1482 [I-D.ietf-ccamp-gmpls-general-constraints-ospf-te]. 1484 5.3.3. Abstraction in Client-Server Networks 1486 Section 5.3.2 has already introduced the concept of the abstraction 1487 layer network through an example of a simple layered network. But it 1488 may be helpful to expand on the example using a slightly more complex 1489 network. 1491 Figure 11 shows a multi-layer network comprising client nodes 1492 (labeled as Cn for n= 0 to 9) and server nodes (labeled as Sn for 1493 n = 1 to 9). 1495 -- -- 1496 |C3|---|C4| 1497 /-- --\ 1498 -- -- -- -- --/ \-- 1499 |C1|---|C2|---|S1|---|S2|----|S3| |C5| 1500 -- /-- --\ --\ --\ /-- 1501 / \-- \-- \-- --/ -- 1502 / |S4| |S5|----|S6|---|C6|---|C7| 1503 / /-- --\ /-- /-- -- 1504 --/ -- --/ -- \--/ --/ 1505 |C8|---|C9|---|S7|---|S8|----|S9|---|C0| 1506 -- -- -- -- -- -- 1508 Figure 11 : An example Multi-Layer Network 1510 If the network in Figure 11 is operated as separate client and server 1511 networks then the client layer topology will appear as shown in 1512 Figure 12. As can be clearly seen, the network is partitioned and 1513 there is no way to set up an LSP from a node on the left hand side 1514 (say C1) to a node on the right hand side (say C7). 1516 -- -- 1517 |C3|---|C4| 1518 -- --\ 1519 -- -- \-- 1520 |C1|---|C2| |C5| 1521 -- /-- /-- 1522 / --/ -- 1523 / |C6|---|C7| 1524 / /-- -- 1525 --/ -- --/ 1526 |C8|---|C9| |C0| 1527 -- -- -- 1529 Figure 12 : Client Layer Topology Showing Partitioned Network 1531 For reference, Figure 13 shows the corresponding server layer 1532 topology. 
1534 -- -- -- 1535 |S1|---|S2|----|S3| 1536 --\ --\ --\ 1537 \-- \-- \-- 1538 |S4| |S5|----|S6| 1539 /-- --\ /-- 1540 --/ -- \--/ 1541 |S7|---|S8|----|S9| 1542 -- -- -- 1544 Figure 13 : Server Layer Topology 1546 Operating on the TED for the server layer, a management entity or a 1547 software component may apply policy and consider what abstract links 1548 it might offer for use by the client layer. To do this it obviously 1549 needs to be aware of the connections between the layers (there is no 1550 point in offering an abstract link S2-S8 since this could not be of 1551 any use in this example). 1553 In our example, after consideration of which LSPs could be set up in 1554 the server layer, four abstract links are offered: S1-S3, S3-S6, 1555 S1-S9, and S7-S9. These abstract links are shown as double lines on 1556 the resulting topology of the abstraction layer network in Figure 14. 1557 As can be seen, two of the links must share part of a path (S1-S9 1558 must share with either S1-S3 or with S7-S9). This could be achieved 1559 using distinct resources (for example, separate lambdas) where the 1560 paths are common, but it could also be done using resource sharing. 1562 That would mean that when both S1-S3 and S7-S9 are realized as links 1563 carrying abstraction layer LSPs, the link S1-S9 can no longer be 1564 used. 1566 -- 1567 |C3| 1568 /-- 1569 -- -- --/ 1570 |C2|---|S1|==========|S3| 1571 -- --\\ --\\ 1572 \\ \\ 1573 \\ \\-- -- 1574 \\ |S6|---|C6| 1575 \\ -- -- 1576 -- -- \\-- -- 1577 |C9|---|S7|=====|S9|---|C0| 1578 -- -- -- -- 1580 Figure 14 : Abstraction Layer Network with Abstract Links 1582 The separate IGP instance running in the abstraction layer network 1583 means that this topology is visible at the edge nodes (C2, C3, C6, 1584 C9, and C0) as well as at a PCE if one is present. 1586 Now the client layer is able to make requests to the abstraction 1587 layer network to provide connectivity. 
In our example, it requests 1588 that C2 is connected to C3 and that C2 is connected to C0. This 1589 results in several actions: 1591 1. The management component for the abstraction layer network asks 1592 its PCE to compute the paths necessary to make the connections. 1593 This yields C2-S1-S3-C3 and C2-S1-S9-C0. 1595 2. The management component for the abstraction layer network 1596 instructs C2 to start the signaling process for the new LSPs in 1597 the abstraction layer. 1599 3. C2 signals the LSPs for setup using the explicit routes 1600 C2-S1-S3-C3 and C2-S1-S9-C0. 1602 4. When the signaling messages reach S1 (in our example, both LSPs 1603 traverse S1), the abstraction layer network may find that the 1604 necessary underlying LSPs (S1-S2-S3 and S1-S2-S5-S9) have not 1605 been established since it is not a requirement that an abstract 1606 link be backed up by a real LSP. In this case, S1 computes the 1607 paths of the underlying LSPs and signals them. 1609 5. Once the server layer LSPs have been established, S1 can continue 1610 to signal the abstraction layer LSPs either using the server layer 1611 LSPs as tunnels or as stitching segments. 1613 -- -- 1614 |C3|-|C4| 1615 /-- --\ 1616 / \-- 1617 -- --/ |C5| 1618 |C1|---|C2| /-- 1619 -- /--\ --/ -- 1620 / \ |C6|---|C7| 1621 / \ /-- -- 1622 / \--/ 1623 --/ -- |C0| 1624 |C8|---|C9| -- 1625 -- -- 1627 Figure 15 : Connected Client Layer Network with Additional Links 1629 6. Finally, once the abstraction layer LSPs have been set up, the 1630 client layer can be informed and can start to advertise the 1631 new TE links C2-C3 and C2-C0. The resulting client layer topology 1632 is shown in Figure 15. 1634 7. Now the client layer can compute an end-to-end path from C1 to C7. 1636 5.3.3.1 Macro Shared Risk Link Groups 1638 Network links often share fate with one or more other links. That 1639 is, a scenario that may cause a link to fail could cause one or more 1640 other links to fail.
This may occur, for example, if the links are 1641 supported by the same fiber bundle, or if some links are routed down 1642 the same duct or in a common piece of infrastructure such as a 1643 bridge. A common way to identify the links that may share fate is to 1644 label them as belonging to a Shared Risk Link Group (SRLG) [RFC4202]. 1646 TE links created from LSPs in lower layers may also share fate, and 1647 it can be hard for a client network to know about this problem 1648 because it does not know the topology of the server network or the 1649 path of the server layer LSPs that are used to create the links in 1650 the client network. 1652 For example, looking at the example used in Section 5.3.3 and 1653 considering the two abstract links S1-S3 and S1-S9, there is no way 1654 for the client layer to know whether the links C2-C0 and C2-C3 share 1655 fate. Clearly, if the client layer uses these links to provide a 1656 link-diverse end-to-end protection scheme, it needs to know that the 1657 links actually share a piece of network infrastructure (the server 1658 layer link S1-S2). 1660 Per [RFC4202], an SRLG represents a shared physical network resource 1661 upon which the normal functioning of a link depends. Multiple SRLGs 1662 can be identified and advertised for every TE link in a network. 1663 However, this can produce a scalability problem in a multi-layer 1664 network that equates to advertising in the client layer the server 1665 layer route of each TE link. 1667 Macro SRLGs (MSRLGs) address this scaling problem and are a form of 1668 abstraction performed at the same time that the abstract links are 1669 derived. In this way, only the links that actually share fate in the 1670 server layer need to be advertised rather than every link that 1671 potentially shares resources.
This saving is possible because the 1672 abstract links are formulated on behalf of the server layer by a 1673 central management agency that is aware of all of the link 1674 abstractions being offered. 1676 It may be noted that a less optimal alternative path for the abstract 1677 link S1-S9 exists in the server layer (S1-S4-S7-S8-S9). It would 1678 be possible for the client layer request for connectivity C2-C0 to 1679 ask that the path be maximally disjoint from the path C2-C3. While 1680 nothing can be done about the shared link C2-S1, the abstraction 1681 layer could request that the server layer instantiate the link S1-S9 1682 to be diverse from the link S1-S3, and this request could be honored 1683 if the server layer policy allows. 1685 5.3.3.2 Mutual Exclusivity 1687 As noted in the discussion of Figure 14, it is possible that some 1688 abstraction layer links cannot be realized and/or used at the same 1689 time. This arises when the potentiality of the links is indicated by 1690 the server layer, but the realization of the links by LSPs in the 1691 server layer network would actually compete for server layer 1692 resources. In Figure 14 this arose when both link S1-S3 and link 1693 S7-S9 were realized as links carrying abstraction layer LSPs: in that 1694 case the link S1-S9 could no longer be used. 1696 Such a situation need not be an issue when abstraction layer LSPs are 1697 set up one by one because the use of one abstraction layer link and 1698 the corresponding use of server layer resources will cause the server 1699 layer to withdraw the availability of the other abstraction layer 1700 links, and these will become unavailable for further abstraction 1701 layer path computations. 1703 Furthermore, in deployments where abstraction layer links are only 1704 presented as available after server layer LSPs have been established 1705 to support them, the problem is unlikely to exist.
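The withdrawal behaviour described above can be sketched as follows, using the abstract links of Figure 14. The data model and function name are hypothetical and serve only to illustrate the mechanism.

```python
# Hedged sketch of the withdrawal behaviour described above, using the
# abstract links of Figure 14; the data model is an assumption.
def realize_link(link, available, competes_with):
    """Realize `link` and withdraw links that compete with it for the
    same server layer resources."""
    if link not in available:
        raise ValueError("link %s is not available" % link)
    available.remove(link)
    for other in competes_with.get(link, ()):
        available.discard(other)   # withdrawn, not failed
    return available

# S1-S9 shares server layer resources with both S1-S3 and S7-S9.
available = {"S1-S3", "S1-S9", "S7-S9"}
competes = {"S1-S3": {"S1-S9"}, "S7-S9": {"S1-S9"},
            "S1-S9": {"S1-S3", "S7-S9"}}
remaining = realize_link("S1-S3", available, competes)
```

After S1-S3 is realized, S1-S9 is withdrawn and only S7-S9 remains available for further abstraction layer path computations.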
1707 However, when the server layer is constrained but chooses to 1708 advertise the potential of multiple abstraction layer links even 1709 though they compete for resources, and when multiple abstraction 1710 layer LSPs are computed simultaneously (perhaps to provide protection 1711 services), there may be contention for server layer resources. In the 1712 case that protected abstraction layer LSPs are being established, 1713 this situation would be avoided through the use of SRLGs and/or 1714 MSRLGs since the two abstraction layer links that compete for server 1715 layer resources must also fate share across those resources. But in 1716 the case where the multiple abstraction layer LSPs do not care about 1717 fate sharing, it may be necessary to flag the mutually exclusive 1718 links in the abstraction layer TED so that path computation can avoid 1719 accidentally attempting to utilize two of a set of such links at the 1720 same time. 1722 5.3.3.3 A Server with Multiple Clients 1724 A single server network may support multiple client networks. This 1725 is not an uncommon state of affairs, for example, when the server 1726 network provides connectivity for multiple customers. 1728 In this case, the abstraction provided by the server layer may vary 1729 considerably according to the policies and commercial relationships 1730 with each customer. This variance would lead to a separate 1731 abstraction layer network maintained to support each client network. 1733 On the other hand, it may be that multiple clients are subject to the 1734 same policies and the abstraction can be identical. In this case, a 1735 single abstraction layer network can support more than one client. 1737 The choices here are made as an operational issue by the server layer 1738 network. 1740 5.3.3.4 A Client with Multiple Servers 1742 A single client network may be supported by multiple server networks.
1743 The server networks may provide connectivity between different parts 1744 of the client network or may provide parallel (redundant) 1745 connectivity for the client network. 1747 In this case the abstraction layer network should contain the 1748 abstract links from all server networks so that it can make suitable 1749 computations and create the correct TE links in the client network. 1750 That is, the relationship between client network and abstraction 1751 layer network should be one-to-one. 1753 Note that SRLGs and MSRLGs may be very hard to describe in the case 1754 of multiple server layer networks because the abstraction points will 1755 not know whether the resources in the various server layers share 1756 physical locations. 1758 5.3.4. Abstraction in Peer Networks 1760 Peer networks exist in many situations in the Internet. Packet 1761 networks may peer as IGP areas (levels) or as ASes. Transport 1762 networks (such as optical networks) may peer to provide 1763 concatenations of optical paths through single vendor environments 1764 (see Section 7). Figure 16 shows a simple example of three peer 1765 networks (A, B, and C), each comprising a few nodes. 1767 Network A : Network B : Network C 1768 : : 1769 -- -- -- : -- -- -- : -- -- 1770 |A1|---|A2|----|A3|---|B1|---|B2|---|B3|---|C1|---|C2| 1771 -- --\ /-- : -- /--\ -- : -- -- 1772 \--/ : / \ : 1773 |A4| : / \ : 1774 --\ : / \ : 1775 -- \-- : --/ \-- : -- -- 1776 |A5|---|A6|---|B4|----------|B6|---|C3|---|C4| 1777 -- -- : -- -- : -- -- 1778 : : 1779 : : 1781 Figure 16 : A Network Comprising Three Peer Networks 1783 As discussed in Section 2, peered networks do not share visibility of 1784 their topologies or TE capabilities for scaling and confidentiality 1785 reasons. That means, in our example, that computing a path from A1 1786 to C4 can be impossible without the aid of cooperating PCEs or some 1787 form of crankback.
1789 But it is possible to produce abstract links for the reachability 1790 across transit peer networks and instantiate an abstraction layer 1791 network. That network can be enhanced with specific reachability 1792 information if a destination network is partitioned as is the case 1793 with Network C in Figure 16. 1795 Suppose Network B decides to offer three abstract links B1-B3, B4-B3, 1796 and B4-B6. The abstraction layer network could then be constructed 1797 to look like the network in Figure 17. 1799 -- -- -- -- 1800 |A3|---|B1|====|B3|----|C1| 1801 -- -- //-- -- 1802 // 1803 // 1804 // 1805 -- --// -- -- 1806 |A6|---|B4|=====|B6|---|C3| 1807 -- -- -- -- 1809 Figure 17 : Abstraction Layer Network for the Peer Network Example 1811 Using a process similar to that described in Section 5.3.3, Network A 1812 can request connectivity to Network C and the abstract links can be 1813 instantiated as tunnels across the transit network, and edge-to-edge 1814 LSPs can be set up to join the two networks. Furthermore, if Network 1815 C is partitioned, reachability information can be exchanged to allow 1816 Network A to select the correct edge-to-edge LSP as shown in Figure 1817 18. 1819 Network A : Network C 1820 : 1821 -- -- -- : -- -- 1822 |A1|---|A2|----|A3|=========|C1|.....|C2| 1823 -- --\ /-- : -- -- 1824 \--/ : 1825 |A4| : 1826 --\ : 1827 -- \-- : -- -- 1828 |A5|---|A6|=========|C3|.....|C4| 1829 -- -- : -- -- 1831 Figure 18 : Tunnel Connections to Network C with TE Reachability 1833 Peer networking cases can be made far more complex by dual homing 1834 between network peering nodes (for example, A3 might connect to B1 1835 and B4 in Figure 17) and by the networks themselves being arranged in 1836 a mesh (for example, A6 might connect to B4 and C1 in Figure 17). 1838 These additional complexities can be handled gracefully by the 1839 abstraction layer network model. 1841 Further examples of abstraction in peer networks can be found in 1842 Sections 7 and 9. 
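As an illustration of how the abstract links of Figure 17 stitch the peer networks together, the following sketch checks reachability across the transit network. The link-list representation and the helper function are assumptions made for this example.

```python
# Illustrative sketch: the abstraction layer of Figure 17 as a single
# graph, checking that the abstract links give reachability across the
# transit network. The list representation is an assumption.
from collections import deque

def reachable(links, src, dst):
    """Breadth-first reachability over undirected links."""
    adj = {}
    for a, b in links:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    seen, queue = {src}, deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            return True
        for nxt in adj.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False

# Edge links plus Network B's abstract links B1-B3, B4-B3, and B4-B6.
links = [("A3", "B1"), ("B1", "B3"), ("B3", "C1"),
         ("A6", "B4"), ("B4", "B3"), ("B4", "B6"), ("B6", "C3")]
```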
1844 5.4. Considerations for Dynamic Abstraction 1846 It is possible to consider a highly dynamic system where the server 1847 network adaptively suggests new abstract links into the abstraction 1848 layer, and where the abstraction layer proactively deploys new LSPs 1849 to provide new connections in the client network. Such fluidity is, 1850 however, to be treated with caution because of the longer turn-up 1851 times of connections in server networks, because the server networks 1852 are likely to be sparsely connected and expensive physical resources 1853 will only be deployed where there is believed to be a need for them, 1854 and because of the complex commercial or administrative relationships 1855 that may exist between client and server network operators. 1857 Thus, proposals for fully automated multi-layer networks based on 1858 this architecture may be regarded as forward-looking topics for 1859 research both in terms of network stability and with regard to 1860 economic impact. 1862 However, some elements of automation should not be discarded. A 1863 server network may automatically apply policy to determine the best 1864 set of abstract links to offer and the most suitable server network 1865 LSPs to realize those links. And a client network may dynamically 1866 observe congestion, lack of connectivity, or predicted changes in 1867 traffic demand, and may use this information to request additional 1868 links from the abstraction layer. And, once policies have been 1869 configured, the whole system should be able to operate independently 1870 of operator control (which is not to say that the operator will not have 1871 the option of exerting control at every step in the process). 1873 But it is important, in this discussion, to rule out most processes 1874 of dynamic abstraction.
As the available resources in the server 1875 layer fluctuate because of newly provisioned server layer LSPs or due 1876 to failed resources, it would significantly destabilize the system to 1877 continually update the advertised abstract links. Thus, and 1878 notwithstanding the discussion of mutually exclusive links in Section 1879 5.3.3.2, a server network will mainly plan and advertise abstract 1880 links that are stable in the event of establishment of other abstract 1881 links. 1883 As an example of this last point, consider two abstract links that 1884 can be realized by a pair of server network LSPs that share a single 1885 server network link at some point in their paths. Those abstract 1886 links could be advertised each with the full capacity of the server 1887 network link. But if this were done, then the establishment of one 1888 abstract link would immediately preclude the other, causing some small 1889 degree of flap in the abstraction layer network topology. A server 1890 network might instead choose to split the shared resources between 1891 the two server network LSPs and so make the abstraction layer network 1892 stable. This ability is, of course, highly dependent on the network 1893 technology in the server network. 1895 5.5. Requirements for Advertising Links and Nodes 1897 The abstraction layer network is "just another network layer". The 1898 links and nodes in the network need to be advertised along with their 1899 associated TE information (metrics, bandwidth, etc.) so that the 1900 topology is disseminated and so that routing decisions can be made. 1902 This requires a routing protocol running between the nodes in the 1903 abstraction layer network. Note that this routing information 1904 exchange could be piggy-backed on an existing routing protocol 1905 instance, or use a new instance (or even a new protocol).
Clearly, 1906 the information exchanged is only that which has been created as 1907 part of the abstraction function according to policy. 1909 It should be noted that in some cases abstract link enablement is on- 1910 demand and all that is advertised in the topology for the abstraction 1911 layer network is the potential for an abstract link to be set up. In 1912 this case, we may ponder how the routing protocol will advertise 1913 topology information over a link that is not yet established. In 1914 other words, there must be a communication channel between the 1915 participating nodes so that the routing protocol messages can flow. 1916 The answer is that control plane connectivity exists in the server 1917 network and on the client-server edge links, and this can be used to 1918 carry the routing protocol messages for the abstraction layer 1919 network. The same consideration applies to the advertisement, in the 1920 client network, of the potential connectivity that the abstraction 1921 layer network can provide. 1923 5.6. Addressing Considerations 1925 The network layers in this architecture should be able to operate 1926 with separate address spaces and these may overlap without any 1927 technical issues. That is, one address may mean one thing in the 1928 client network, yet the same address may have a different meaning in 1929 the abstraction layer network or the server network. In other words, 1930 there is complete address separation between networks. 1932 However, this will require some care both because human operators may 1933 well become confused, and because mapping between address spaces is 1934 needed at the interfaces between the network layers.
That mapping 1935 requires configuration so that, for example, when the server network 1936 announces an abstract link from A to B, the abstraction layer network 1937 must recognize that A and B are server network addresses and must map 1938 them to abstraction layer addresses (say P and Q) before including 1939 the link in its own topology. And similarly, when the abstraction 1940 layer network informs the client network that a new link is available 1941 from S to T, it must map those addresses from its own address space 1942 to that of the client network. 1944 This form of address mapping will become particularly important in 1945 cases where one abstraction layer network is constructed from 1946 connectivity in multiple server layer networks, or where one 1947 abstraction layer network provides connectivity for multiple client 1948 networks. 1950 6. Building on Existing Protocols 1952 This section is not intended to prejudge a solutions framework or any 1953 applicability work. It does, however, very briefly serve to note the 1954 existence of protocols that could be examined for applicability to 1955 serve in realizing the model described in this document. 1957 The general principle of protocol re-use is preferred over the 1958 invention of new protocols or additional protocol extensions, as 1959 mentioned in Section 3.1. 1961 6.1. BGP-LS 1963 BGP-LS is a set of extensions to BGP described in 1964 [I-D.ietf-idr-ls-distribution]. Its purpose is to announce topology 1965 information from one network to a "north-bound" consumer. 1966 Application of BGP-LS to date has focused on a mechanism to build a 1967 TED for a PCE. However, BGP's mechanisms would also serve well to 1968 advertise abstract links from a server network into the abstraction 1969 layer network, or to advertise potential connectivity from the 1970 abstraction layer network to the client network. 1972 6.2.
IGPs 1974 Both OSPF and IS-IS have been extended through a number of RFCs to 1975 advertise TE information. Additionally, both protocols are capable 1976 of running in a multi-instance mode either as ships that pass in the 1977 night (i.e., completely separate instances using different addresses) 1978 or as dual instances on the same address space. This means that 1979 either IGP could probably be used as the routing protocol in the 1980 abstraction layer network. 1982 6.3. RSVP-TE 1984 RSVP-TE signaling can be used to set up traffic engineered LSPs to 1985 serve as hierarchical LSPs in the core network providing abstract 1986 links for the abstraction layer network as described in [RFC4206]. 1987 Similarly, the CE-to-CE LSP tunnel across the abstraction layer 1988 network can be established using RSVP-TE without any protocol 1989 extensions. 1991 Furthermore, the procedures in [RFC6107] allow the dynamic signaling 1992 of the purpose of any LSP that is established. This means that 1993 when an LSP tunnel is set up, the two ends can coordinate into which 1994 routing protocol instance it should be advertised, and can also agree 1995 on the addressing to be used to identify the link that will be 1996 created. 1998 6.4. Notes on a Solution 2000 This section is not intended to be prescriptive or to dictate the 2001 protocol solutions that may be used to satisfy the architecture 2002 described in this document, but it does show how the existing 2003 protocols listed in the previous sections can be combined to provide 2004 a solution with only minor modifications. 2006 A server network can be operated using GMPLS routing and signaling 2007 protocols. Using information gathered from the routing protocol, a 2008 TED can be constructed containing resource availability information 2009 and SRLG details. A policy-based process can then determine which 2010 nodes and abstract links it wishes to advertise to form the abstraction 2011 layer network.
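The policy-based selection step just described might be sketched as follows. The candidate-link representation and the example policy predicate are assumptions made for illustration; they are not part of any protocol encoding.

```python
# A minimal sketch of the policy step described above: filter the
# candidate abstract links derived from the server network TED down to
# those that will be advertised (e.g., via BGP-LS). The candidate
# representation and the example policy are assumptions.
def select_advertisements(candidates, policy):
    """Apply an advertisement policy to candidate abstract links."""
    return [link for link in candidates if policy(link)]

candidates = [
    {"ends": ("S1", "S3"), "bandwidth": 10.0},
    {"ends": ("S2", "S8"), "bandwidth": 2.5},
]
# Example policy: only advertise links with at least 5 units of
# available bandwidth.
adverts = select_advertisements(candidates,
                                lambda link: link["bandwidth"] >= 5.0)
```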
2013 The server network can now use BGP-LS to advertise a topology of 2014 links and nodes to form the abstraction layer network. This 2015 information would most likely be advertised from a single point of 2016 control that makes all of the abstraction decisions, but the function 2017 could be distributed to multiple server network edge nodes. The 2018 information can be advertised by BGP-LS to multiple points within the 2019 abstraction layer (such as all client network edge nodes) or to a 2020 single controller. 2022 Multiple server networks may advertise information that is used to 2023 construct an abstraction layer network, and one server network may 2024 advertise different information in different instances of BGP-LS to 2025 form different abstraction layer networks. Furthermore, in the case 2026 of one controller constructing multiple abstraction layer networks, 2027 BGP-LS uses the route target mechanism defined in [RFC4364] to 2028 distinguish the different applications (effectively abstraction layer 2029 network VPNs) of the exported information. 2031 Extensions may be made to BGP-LS to allow advertisement of MSRLGs and 2032 mutually exclusive links, and to indicate whether the abstract link 2033 has been pre-established or not. 2035 The abstraction layer network may operate under central control or 2036 use a distributed control plane. Since the links and nodes may be a 2037 mix of physical and abstract links, and since the nodes may have 2038 diverse cross-connect capabilities, it is most likely that a GMPLS 2039 routing protocol will be beneficial for collecting and correlating 2040 the routing information and for distributing updates.
No special 2041 additional features are needed beyond adding those extra parameters 2042 just described for BGP-LS, but it should be noted that the control 2043 plane of the abstraction layer network must run in an out of band 2044 control network because the data-bearing links might not yet have 2045 been established via connections in the server layer network. 2047 The abstraction layer network is also able to determine potential 2048 connectivity from client network edge to client network edge. It 2049 will determine which client network links to create according to 2050 policy and subject to requests from the client network, and will 2051 take four steps: 2053 - First, it will compute a path for an abstraction layer LSP that 2054 will realize the link for the client network. 2055 - Next, it will request the server layer network to realize any 2056 abstraction layer links that the LSP traverses and that are not 2057 yet enabled. 2058 - Then, once those links have been realized, it will signal the 2059 abstraction layer LSP. 2060 - Finally, the abstraction layer network will inform the client 2061 network of the existence of the new client network link. 2063 This last step can be achieved either by coordination of the end 2064 points of the abstraction layer LSP (these points are client network 2065 edge nodes) using mechanisms such as those described in [RFC6107], 2066 or using BGP-LS from a central controller. 2068 Once the client network edge nodes are aware of a new link, they will 2069 automatically advertise it in their routing protocol and it will become 2070 available for use by traffic in the client network. 2072 Sections 7, 8, and 9 discuss the applicability of this architecture 2073 to different network types and problem spaces, while Section 10 gives 2074 some advice about scoping future work.
Section 11 on manageability 2075 considerations is particularly relevant in the context of this 2076 section because it contains a discussion of the policies and 2077 mechanisms for indicating connectivity and link availability between 2078 network layers in this architecture. 2080 7. Applicability to Optical Domains and Networks 2082 Many optical networks are arranged as a set of small domains. Each 2083 domain is a cluster of nodes, usually from the same equipment vendor 2084 and with the same properties. The domain may be constructed as a 2085 mesh or a ring, or perhaps as an interconnected set of rings. 2087 The network operator seeks to provide end-to-end connectivity across 2088 a network constructed from multiple domains, and so (of course) the 2089 domains are interconnected. In a network under management control 2090 such as through an Operations Support System (OSS), each domain is 2091 under the operational control of a Network Management System (NMS). 2092 In this way, an end-to-end path may be commissioned by the OSS 2093 instructing each NMS, and the NMSes setting up the path fragments 2094 across the domains. 2096 However, in a system that uses a control plane, there is a need for 2097 integration between the domains. 2099 Consider a simple domain, D1, as shown in Figure 19. In this case, 2100 the nodes A through F are arranged in a topological ring. Suppose 2101 that there is a control plane in use in this domain, and that OSPF is 2102 used as the TE routing protocol. 2104 ----------------- 2105 | D1 | 2106 | B---C | 2107 | / \ | 2108 | / \ | 2109 | A D | 2110 | \ / | 2111 | \ / | 2112 | F---E | 2113 | | 2114 ----------------- 2116 Figure 19 : A Simple Optical Domain 2118 Now consider that the operator's network is built from a mesh of such 2119 domains, D1 through D7, as shown in Figure 20.
It is possible that 2120 these domains share a single, common instance of OSPF in which case 2121 there is nothing further to say because that OSPF instance will 2122 distribute sufficient information to build a single TED spanning the 2123 whole network, and an end-to-end path can be computed. A more likely 2124 scenario is that each domain is running its own OSPF instance. In 2125 this case, each is able to handle the peculiarities (or rather, 2126 advanced functions) of each vendor's equipment capabilities. 2128 ------ ------ ------ ------ 2129 | | | | | | | | 2130 | D1 |---| D2 |---| D3 |---| D4 | 2131 | | | | | | | | 2132 ------\ ------\ ------\ ------ 2133 \ | \ | \ | 2134 \------ \------ \------ 2135 | | | | | | 2136 | D5 |---| D6 |---| D7 | 2137 | | | | | | 2138 ------ ------ ------ 2140 Figure 20 : A Mesh of Simple Optical Domains 2142 The question now is how to combine the multiple sets of information 2143 distributed by the different OSPF instances. Three possible models 2144 suggest themselves based on pre-existing routing practices. 2146 o In the first model (the Area-Based model) each domain is treated as 2147 a separate OSPF area. The end-to-end path will be specified to 2148 traverse multiple areas, and each area will be left to determine 2149 the path across the nodes in the area. The feasibility of an end- 2150 to-end path (and, thus, the selection of the sequence of areas and 2151 their interconnections) can be derived using hierarchical PCE. 2153 This approach, however, fits poorly with established use of the 2154 OSPF area: in this form of optical network, the interconnection 2155 points between domains are likely to be links; and the mesh of 2156 domains is far more interconnected and unstructured than we are 2157 used to seeing in the normal area-based routing paradigm. 2159 Furthermore, while hierarchical PCE may be able to solve this type 2160 of network, the effort involved may be considerable for more than a 2161 small collection of domains.
2163 o Another approach (the AS-Based model) treats each domain as a 2164 separate Autonomous System (AS). The end-to-end path will be 2165 specified to traverse multiple ASes, and each AS will be left to 2166 determine the path across the AS. 2168 This model sits more comfortably with the established routing 2169 paradigm, but causes a massive escalation of ASes in the global 2170 Internet. It would, in practice, require that the operator use 2171 private AS numbers [RFC6996], of which there are plenty. 2173 Then, as suggested in the Area-Based model, hierarchical PCE 2174 could be used to determine the feasibility of an end-to-end path 2175 and to derive the sequence of domains and the points of 2176 interconnection to use. But, just as in that other model, the 2177 scalability of the hierarchical PCE approach must be questioned. 2179 Furthermore, determining the mesh of domains (i.e., the inter-AS 2180 connections) conventionally requires the use of BGP as an inter- 2181 domain routing protocol. However, not only is BGP not normally 2182 available on optical equipment, but this approach indicates that 2183 the TE properties of the inter-domain links would need to be 2184 distributed and updated using BGP: something for which it is not 2185 well suited. 2187 o The third approach (the ASON model) follows the architectural 2188 model set out by the ITU-T [G.8080] and uses the routing protocol 2189 extensions described in [RFC6827]. In this model the concept of 2190 "levels" is introduced to OSPF. Referring back to Figure 20, each 2191 OSPF instance running in a domain would be construed as a "lower 2192 level" OSPF instance and would leak routes into a "higher level" 2193 instance of the protocol that runs across the whole network. 2195 This approach handles the awkwardness of representing the domains 2196 as areas or ASes by simply considering them as domains running 2197 distinct instances of OSPF.
Routing advertisements flow "upward" 2198 from the domains to the high level OSPF instance giving it a full 2199 view of the whole network and allowing end-to-end paths to be 2200 computed. Routing advertisements may also flow "downward" from the 2201 network-wide OSPF instance to any one domain so that it has 2202 visibility of the connectivity of the whole network. 2204 While architecturally satisfying, this model suffers from having to 2205 handle the different characteristics of different equipment 2206 vendors. The advertisements coming from each low level domain 2207 would be meaningless when distributed into the other domains, and 2208 the high level domain would need to be kept up-to-date with the 2209 semantics of each new release of each vendor's equipment. 2210 Additionally, the scaling issues associated with a well-meshed 2211 network of domains each with many entry and exit points and each 2212 with network resources that are continually being updated reduce 2213 to the same problem as noted in the virtual link model. 2214 Furthermore, in the event that the domains are under control of 2215 different administrations, the domains would not want to distribute 2216 the details of their topologies and TE resources. 2218 Practically, this third model turns out to be very close to the 2219 methodology described in this document. As noted in Section 7.1 of 2220 [RFC6827], there are policy rules that can be applied to define 2221 exactly what information is exported from or imported to a low level 2222 OSPF instance. The document even notes that some forms of 2223 aggregation may be appropriate. Thus, we can apply the following 2224 simplifications to the mechanisms defined in RFC 6827: 2226 - Zero information is imported to low level domains. 2228 - Low level domains export only abstracted links as defined in this 2229 document and according to local abstraction policy and with 2230 appropriate removal of vendor-specific information.
2232 - There is no need to formally define routing levels within OSPF. 2234 - Export of abstracted links from the domains to the network-wide 2235 routing instance (the abstraction routing layer) can take place 2236 through any mechanism including BGP-LS or direct interaction 2237 between OSPF implementations. 2239 With these simplifications, it can be seen that the framework defined 2240 in this document can be constructed from the architecture discussed 2241 in RFC 6827, but without needing any of the protocol extensions that 2242 that document defines. Thus, using the terminology and concepts 2243 already established, the problem may be solved as shown in Figure 21. 2244 The abstraction layer network is constructed from the inter-domain 2245 links, the domain border nodes, and the abstracted (cross-domain) 2246 links. 2248 Abstraction Layer 2249 -- -- -- -- -- -- 2250 | |===========| |--| |===========| |--| |===========| | 2251 | | | | | | | | | | | | 2252 ..| |...........| |..| |...........| |..| |...........| |...... 2253 | | | | | | | | | | | | 2254 | | -- -- | | | | -- -- | | | | -- -- | | 2255 | |_| |_| |_| | | |_| |_| |_| | | |_| |_| |_| | 2256 | | | | | | | | | | | | | | | | | | | | | | | | 2257 -- -- -- -- -- -- -- -- -- -- -- -- 2258 Domain 1 Domain 2 Domain 3 2259 Key Optical Layer 2260 ... Layer separation 2261 --- Physical link 2262 === Abstract link 2264 Figure 21 : The Optical Network Implemented Through the 2265 Abstraction Layer Network 2267 8. Modeling the User-to-Network Interface 2269 The User-to-Network Interface (UNI) is an important architectural 2270 concept in many implementations and deployments of client-server 2271 networks especially those where the client and server network have 2272 different technologies. The UNI is described in [G.8080], 2273 and the GMPLS approach to the UNI is documented in [RFC4208].
Other 2274 GMPLS-related documents describe the application of GMPLS to specific 2275 UNI scenarios: for example, [RFC6005] describes how GMPLS can support 2276 a UNI that provides access to Ethernet services. 2278 Figure 1 of [RFC6005] is reproduced here as Figure 22. It shows the 2279 Ethernet UNI reference model, and that figure can serve as an example 2280 for all similar UNIs. In this case, the UNI is an interface between 2281 client network edge nodes and the server network. It should be noted 2282 that neither the client network nor the server network need be an 2283 Ethernet switching network. 2285 There are three network layers in this model: the client network, the 2286 "Ethernet service network", and the server network. The so-called 2287 Ethernet service network consists of links comprising the UNI links 2288 and the tunnels across the server network, and nodes comprising the 2289 client network edge nodes and various server nodes. That is, the 2290 Ethernet service network is equivalent to the abstraction layer 2291 network with the UNI links being the physical links between the 2292 client and server networks, and the client edge nodes taking the 2293 role of UNI Client-side (UNI-C) and the server edge nodes acting as 2294 the UNI Network-side (UNI-N) nodes. 
2296 Client Client 2297 Network +----------+ +-----------+ Network 2298 -------------+ | | | | +------------- 2299 +----+ | | +-----+ | | +-----+ | | +----+ 2300 ------+ | | | | | | | | | | | | +------ 2301 ------+ EN +-+-----+--+ CN +-+----+--+ CN +--+-----+-+ EN +------ 2302 | | | +--+--| +-+-+ | | +--+-----+-+ | 2303 +----+ | | | +--+--+ | | | +--+--+ | | +----+ 2304 | | | | | | | | | | 2305 -------------+ | | | | | | | | +------------- 2306 | | | | | | | | 2307 -------------+ | | | | | | | | +------------- 2308 | | | +--+--+ | | | +--+--+ | | 2309 +----+ | | | | | | +--+--+ | | | +----+ 2310 ------+ +-+--+ | | CN +-+----+--+ CN | | | | +------ 2311 ------+ EN +-+-----+--+ | | | | +--+-----+-+ EN +------ 2312 | | | | +-----+ | | +-----+ | | | | 2313 +----+ | | | | | | +----+ 2314 | +----------+ +-----------+ | 2315 -------------+ Server Network(s) +------------- 2316 Client UNI UNI Client 2317 Network <-----> <-----> Network 2318 Scope of This Document 2320 Legend: EN - Client Edge Node 2321 CN - Server Node 2323 Figure 22 : Ethernet UNI Reference Model 2325 An issue that is often raised concerns how a dual-homed client edge 2326 node (such as that shown at the bottom left-hand corner of Figure 22) 2327 can make determinations about how it connects across the UNI. This 2328 can be particularly important when reachability across the server 2329 network is limited or when two diverse paths are desired (for 2330 example, to provide protection). However, in the model described in 2331 this document, the edge node (the UNI-C) is part of the abstraction 2332 layer network and can see sufficient topology information to make 2333 these decisions. If the approach introduced in this document is used 2334 to model the UNI as described in this section, there is no need to 2335 enhance the signaling protocols at the GMPLS UNI nor to add routing 2336 exchanges at the UNI. 2338 9.
Abstraction in L3VPN Multi-AS Environments 2340 Serving layer-3 VPNs (L3VPNs) across a multi-AS or multi-operator 2341 environment currently provides a significant planning challenge. 2342 Figure 6 shows the general case of the problem that needs to be 2343 solved. This section shows how the abstraction layer network can 2344 address this problem. 2346 In the VPN architecture, the CE nodes are the client network edge 2347 nodes, and the PE nodes are the server network edge nodes. The 2348 abstraction layer network is made up of the CE nodes, the CE-PE 2349 links, the PE nodes, and PE-PE tunnels that are the abstract links. 2351 In the multi-AS or multi-operator case, the abstraction layer network 2352 also includes the PEs (maybe ASBRs) at the edges of the multiple 2353 server networks, and the PE-PE (maybe inter-AS) links. This gives 2354 rise to the architecture shown in Figure 23. 2356 ........... ............. 2357 VPN Site : : VPN Site 2358 -- -- : : -- -- 2359 |C1|-|CE| : : |CE|-|C2| 2360 -- | | : : | | -- 2361 | | : : | | 2362 | | : : | | 2363 | | : : | | 2364 | | : -- -- -- -- : | | 2365 | |----|PE|=========|PE|---|PE|=====|PE|----| | 2366 -- : | | | | | | | | : -- 2367 ........... | | | | | | | | ............ 2368 | | | | | | | | 2369 | | | | | | | | 2370 | | | | | | | | 2371 | | - - | | | | - | | 2372 | |-|P|-|P|-| | | |-|P|-| | 2373 -- - - -- -- - -- 2375 Figure 23 : The Abstraction Layer Network for a Multi-AS VPN 2377 The policy for adding abstract links to the abstraction layer network 2378 will be driven substantially by the needs of the VPN. Thus, when a 2379 new VPN site is added and the existing abstraction layer network 2380 cannot support the required connectivity, a new abstract link will be 2381 created out of the underlying network. 2383 It is important to note that each VPN instance can have a separate 2384 abstraction layer network. This means that the server network 2385 resources can be partitioned and that traffic can be kept separate.
2386 This can be achieved even when VPN sites from different VPNs connect 2387 at the same PE. Alternatively, multiple VPNs can share the same 2388 abstraction layer network if that is operationally preferable. 2390 Lastly, just as for the UNI discussed in Section 8, the issue of 2391 dual-homing of VPN sites is a function of the abstraction layer 2392 network and so is just a normal routing problem in that network. 2394 10. Scoping Future Work 2396 This section is provided to help guide the work on this problem and to 2397 ensure that oceans are not knowingly boiled. 2399 10.1. Not Solving the Internet 2401 The scope of the use cases and problem statement in this document is 2402 limited to "some small set of interconnected domains." In 2403 particular, it is not the objective of this work to turn the whole 2404 Internet into one large, interconnected TE network. 2406 10.2. Working With "Related" Domains 2408 Following on from Section 10.1, the intention of this work is to solve 2409 the TE interconnectivity for only "related" domains. Such domains 2410 may be under common administrative operation (such as IGP areas 2411 within a single AS, or ASes belonging to a single operator), or may 2412 have a direct commercial arrangement for the sharing of TE 2413 information to provide specific services. Thus, in both cases, there 2414 is a strong opportunity for the application of policy. 2416 10.3. Not Finding Optimal Paths in All Situations 2418 As has been well described in this document, abstraction necessarily 2419 involves compromises and removal of information. That means that it 2420 is not possible to guarantee that an end-to-end path over 2421 interconnected TE domains follows the absolute optimal (by any measure 2422 of optimality) path. This is taken as understood, and future work 2423 should not attempt to achieve such paths, which can only be found by a 2424 full examination of all network information across all connected 2425 networks. 2427 10.4.
Not Breaking Existing Protocols 2429 It is a clear objective of this work to not break existing protocols. 2430 The Internet relies on the stability of a few key routing protocols, 2431 and so it is critical that any new work must not make these protocols 2432 brittle or unstable. 2434 10.5. Sanity and Scaling 2436 All of the above points play into a final observation. This work is 2437 intended to bite off a small problem for some relatively simple use 2438 cases as described in Section 2. It is not intended that this work 2439 will be immediately (or even soon) extended to cover many large 2440 interconnected domains. Obviously the solution should as far as 2441 possible be designed to be extensible and scalable; however, it is 2442 also reasonable to make trade-offs in favor of utility and 2443 simplicity. 2445 11. Manageability Considerations 2447 Manageability should not be a significant additional burden. Each 2448 layer in the network model can and should be managed independently. 2450 That is, each client network will run its own management systems and 2451 tools to manage the nodes and links in the client network: each 2452 client network link that is realized by means of an abstract link will 2453 still be available for management in the client network as any other 2454 link. 2456 Similarly, each server network will run its own management systems 2457 and tools to manage the nodes and links in that network just as 2458 normal. 2460 Three issues remain for consideration: 2462 - How is the abstraction layer network managed? 2463 - How is the interface between the client network and the abstraction 2464 layer network managed? 2465 - How is the interface between the abstraction layer network and the 2466 server network managed? 2468 11.1. Managing the Abstraction Layer Network 2470 Management of the abstraction layer network differs from the client 2471 and server networks because not all of the links that are visible in 2472 the TED have been realized.
That is, it is not possible to run OAM 2473 on the links that constitute the potential of a link that could be 2474 realized by an LSP in the server network, but that have not yet been 2475 established. 2477 Other than that, however, the management should be essentially the 2478 same. Routing and signaling protocols can be run in the abstraction 2479 layer (using out of band channels for links that have not yet been 2480 established), and a centralized TED can be constructed and used to 2481 examine the availability and status of the links and nodes in the 2482 network. 2484 Note that different deployment models will place the "ownership" of 2485 the abstraction layer network differently. In some cases the 2486 abstraction layer network will be constructed by the operator of the 2487 server layer and run by that operator as a service for one or more 2488 client networks. In other cases, one or more server networks will 2489 present the potential of links to an abstraction layer network run 2490 by the operator of the client network. And it is feasible that a 2491 business model could be built where a third-party operator manages 2492 the abstraction layer network, constructing it from the connectivity 2493 available in multiple server networks, and facilitating connectivity 2494 for multiple client networks. 2496 11.2. Managing Interactions of Client and Abstraction Layer Networks 2498 The interaction between the client network and the abstraction layer 2499 network is a management task. It might be automated (software 2500 driven) or it might require manual intervention. 2502 This is a two-way interaction: 2504 - The client network can express the need for additional 2505 connectivity. For example, the client layer may try and fail to 2506 find a path across the client network and may request additional, 2507 specific connectivity (this is similar to the situation with 2508 Virtual Network Topology Manager (VNTM) [RFC5623]).
Alternatively, 2509 a more proactive client layer management system may monitor traffic 2510 demands (current and predicted), network usage, and network "hot 2511 spots" and may request changes in connectivity by both releasing 2512 unused links and by requesting new links. 2514 - The abstraction layer network can make links available to the 2515 client network or can withdraw them. These actions can be in 2516 response to requests from the client layer, or can be driven by 2517 processes within the abstraction layer (perhaps reorganizing the 2518 use of server layer resources). In any case, the presentation of 2519 new links to the client layer is heavily subject to policy since 2520 this is both operationally key to the success of this architecture 2521 and the central plank of the commercial model described in this 2522 document. Such policies belong to the operator of the abstraction 2523 layer network and are expected to be fully configurable. 2525 Once the abstraction layer network has decided to make a link 2526 available to the client network it will install it at the link end 2527 points (which are nodes in the client network) such that it appears 2528 and can be advertised as a link in the client network. 2530 In all cases, it is important that the operators of both networks are 2531 able to track the requests and responses, and the operator of the 2532 client network should be able to see which links in that network are 2533 "real" physical links, and which are presented by the abstraction 2534 layer network. 2536 11.3. Managing Interactions of Abstraction Layer and Server Networks 2538 The interactions between the abstraction layer network and the server 2539 network are similar to those described in Section 11.2, but there is a 2540 difference in that the server layer is more likely to offer up 2541 connectivity, and the abstraction layer network is less likely to ask 2542 for it.
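This offer-driven interaction might be sketched as follows. All class names, fields, and the realize-on-offer policy flag are illustrative assumptions; the actual exchanges would be realized by the protocols discussed in Section 6, subject to the operator policies described in this section.

```python
# Hypothetical sketch: a server network offers potential links to the
# abstraction layer network, which may later ask for them to be realized.

class OfferedLink:
    def __init__(self, src, dst, pre_established=False):
        self.src = src
        self.dst = dst
        self.pre_established = pre_established  # already set up as an LSP?
        self.realized = pre_established

class ServerNetwork:
    def __init__(self, realize_on_offer=False):
        # Policy choice: establish the backing LSP when the link is
        # offered, or wait until the abstraction layer asks to use it.
        self.realize_on_offer = realize_on_offer
        self.offered = []

    def offer(self, src, dst):
        """Offer potential connectivity according to server policy."""
        link = OfferedLink(src, dst, pre_established=self.realize_on_offer)
        self.offered.append(link)
        return link

    def realize(self, link):
        """Establish the LSP backing an offered link, on request."""
        link.realized = True  # i.e., signal the server network LSP

server = ServerNetwork(realize_on_offer=False)
link = server.offer("P", "Q")  # offered but not yet established
server.realize(link)           # abstraction layer requests realization
```

The sketch captures only the direction of the exchange (offer from below, realization request from above); a fuller model would also carry the link properties listed below, such as protection, SRLG/MSRLG, and mutual exclusivity.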
2544 That is, the server network will, according to policy that may 2545 include commercial relationships, offer the abstraction layer network 2546 a set of potential connectivity that the abstraction layer network 2547 can treat as links. This server network policy will include: 2548 - how much connectivity to offer 2549 - what level of server layer redundancy to include 2550 - whether to realize the connectivity when it is offered, or to wait 2551 until the abstraction layer network asks to use a link. 2553 This process of offering links from the server network may include a 2554 mechanism to indicate which links have been pre-established in the 2555 server network, and can include other properties such as: 2556 - link-level protection ([RFC4202]) 2557 - SRLG and MSRLG (Section 5.3.3.1) 2558 - mutual exclusivity (Section 5.3.3.2). 2560 The abstraction layer network needs a mechanism to request that a 2561 link is realized if it has not already been established as an LSP in 2562 the server network. This mechanism could also include the ability 2563 to request additional connectivity from the server layer, although 2564 it seems most likely that the server layer will already have 2565 presented as much connectivity as it is physically capable of, 2566 subject to the constraints of policy. 2568 Finally, the server layer will need to confirm the establishment of 2569 connectivity, withdraw links if they are no longer feasible, and 2570 report failures. 2572 Again, it is important that the operators of both networks are able 2573 to track the requests and responses, and the operator of the server 2574 network should be able to see which links are in use. 2576 12. IANA Considerations 2578 This document makes no requests for IANA action. The RFC Editor may 2579 safely remove this section. 2581 13. Security Considerations 2583 Security of signaling and routing protocols is usually administered 2584 and achieved within the boundaries of a domain.
Thus, for 2585 example, a domain with a GMPLS control plane [RFC3945] would apply 2586 the security mechanisms and considerations that are appropriate to 2587 GMPLS [RFC5920]. Furthermore, domain-based security relies strongly 2588 on ensuring that control plane messages are not allowed to enter the 2589 domain from outside. Thus, the mechanisms in this document for 2590 inter-domain exchange of control plane messages and information 2591 naturally raise additional questions of security. 2593 In this context, additional security considerations arising from this 2594 document relate to the exchange of control plane information between 2595 domains. Messages are passed between domains using control plane 2596 protocols operating between peers that have predictable relationships 2597 (for example, UNI-C to UNI-N, between BGP-LS speakers, or between 2598 peer domains). Thus, the security that needs to be given additional 2599 attention for inter-domain TE concentrates on authentication of 2600 peers, assertion that messages have not been tampered with, and to a 2601 lesser extent protecting the content of the messages from inspection 2602 since that might give away sensitive information about the networks. 2603 The protocols described in Section 6 and which are likely to provide 2604 the foundation to solutions to this architecture already include 2605 such protection and further can be run over protected transports 2606 such as IPsec [RFC6071], TLS [RFC5246], and the TCP Authentication 2607 Option (TCP-AO) [RFC5925]. 2609 It is worth noting that the control plane of the abstraction layer 2610 network is likely to be out of band. That is, control plane messages 2611 will be exchanged over network links that are not the links to which 2612 they apply. This models the facilities of GMPLS (but not of MPLS-TE) 2613 and the security mechanisms can be applied to the protocols operating 2614 in the out of band network. 2616 14.
Acknowledgements 2618 Thanks to Igor Bryskin for useful discussions in the early stages of 2619 this work. 2621 Thanks to Gert Grammel for discussions on the extent of aggregation 2622 in abstract nodes and links. 2624 Thanks to Deborah Brungard, Dieter Beller, Dhruv Dhody, Vallinayakam 2625 Somasundaram, and Hannes Gredler for review and input. 2627 Particular thanks to Vishnu Pavan Beeram for detailed discussions and 2628 white-board scribbling that made many of the ideas in this document 2629 come to life. 2631 Text in Section 5.3.3 is freely adapted from the work of Igor 2632 Bryskin, Wes Doonan, Vishnu Pavan Beeram, John Drake, Gert Grammel, 2633 Manuel Paul, Ruediger Kunze, Friedrich Armbruster, Cyril Margaria, 2634 Oscar Gonzalez de Dios, and Daniele Ceccarelli in 2635 [I-D.beeram-ccamp-gmpls-enni] for which the authors of this document 2636 express their thanks. 2638 15. References 2640 15.1. Informative References 2642 [G.8080] ITU-T, "Architecture for the automatically switched optical 2643 network (ASON)", Recommendation G.8080. 2645 [I-D.beeram-ccamp-gmpls-enni] 2646 Bryskin, I., Beeram, V. P., Drake, J. et al., "Generalized 2647 Multiprotocol Label Switching (GMPLS) External Network 2648 Network Interface (E-NNI): Virtual Link Enhancements for 2649 the Overlay Model", draft-beeram-ccamp-gmpls-enni, work in 2650 progress. 2652 [I-D.ietf-ccamp-general-constraint-encode] 2653 Bernstein, G., Lee, Y., Li, D., and Imajuku, W., "General 2654 Network Element Constraint Encoding for GMPLS Controlled 2655 Networks", draft-ietf-ccamp-general-constraint-encode, work 2656 in progress. 2658 [I-D.ietf-ccamp-gmpls-general-constraints-ospf-te] 2659 Zhang, F., Lee, Y,. Han, J, Bernstein, G., and Xu, Y., 2660 "OSPF-TE Extensions for General Network Element 2661 Constraints", draft-ietf-ccamp-gmpls-general-constraints- 2662 ospf-te, work in progress. 2664 [I-D.ietf-ccamp-rsvp-te-srlg-collect] 2665 Zhang, F. (Ed.) and O. 
Gonzalez de Dios (Ed.), "RSVP-TE 2666 Extensions for Collecting SRLG Information", draft-ietf- 2667 ccamp-rsvp-te-srlg-collect, work in progress. 2669 [I-D.ietf-ccamp-te-metric-recording] 2670 Z. Ali, et al., "Resource ReserVation Protocol-Traffic 2671 Engineering (RSVP-TE) extension for recording TE Metric of 2672 a Label Switched Path," draft-ali-ccamp-te-metric- 2673 recording, work in progress. 2675 [I-D.ietf-ccamp-xro-lsp-subobject] 2676 Z. Ali, et al., "Resource ReserVation Protocol-Traffic 2677 Engineering (RSVP-TE) LSP Route Diversity using Exclude 2678 Routes," draft-ali-ccamp-xro-lsp-subobject, work in 2679 progress. 2681 [I-D.ietf-idr-ls-distribution] 2682 Gredler, H., Medved, J., Previdi, S., Farrel, A., and Ray, 2683 S., "North-Bound Distribution of Link-State and TE 2684 Information using BGP", draft-ietf-idr-ls-distribution, 2685 work in progress. 2687 [RFC2702] Awduche, D., Malcolm, J., Agogbua, J., O'Dell, M., and 2688 McManus, J., "Requirements for Traffic Engineering Over 2689 MPLS", RFC 2702, September 1999. 2691 [RFC3209] Awduche, D., Berger, L., Gan, D., Li, T., Srinivasan, V., 2692 and G. Swallow, "RSVP-TE: Extensions to RSVP for LSP 2693 Tunnels", RFC 3209, December 2001. 2695 [RFC3473] L. Berger, "Generalized Multi-Protocol Label Switching 2696 (GMPLS) Signaling Resource ReserVation Protocol-Traffic 2697 Engineering (RSVP-TE) Extensions", RC 3473, January 2003. 2699 [RFC3630] Katz, D., Kompella, and K., Yeung, D., "Traffic Engineering 2700 (TE) Extensions to OSPF Version 2", RFC 3630, September 2701 2003. 2703 [RFC3945] Mannie, E., (Ed.), "Generalized Multi-Protocol Label 2704 Switching (GMPLS) Architecture", RFC 3945, October 2004. 2706 [RFC4105] Le Roux, J.-L., Vasseur, J.-P., and Boyle, J., 2707 "Requirements for Inter-Area MPLS Traffic Engineering", 2708 RFC 4105, June 2005. 2710 [RFC4202] Kompella, K. and Y. Rekhter, "Routing Extensions in Support 2711 of Generalized Multi-Protocol Label Switching (GMPLS)", 2712 RFC 4202, October 2005. 
2714 [RFC4206] Kompella, K. and Y. Rekhter, "Label Switched Paths (LSP) 2715 Hierarchy with Generalized Multi-Protocol Label Switching 2716 (GMPLS) Traffic Engineering (TE)", RFC 4206, October 2005. 2718 [RFC4208] Swallow, G., Drake, J., Ishimatsu, H., and Y. Rekhter, 2719 "User-Network Interface (UNI): Resource ReserVation 2720 Protocol-Traffic Engineering (RSVP-TE) Support for the 2721 Overlay Model", RFC 4208, October 2005. 2723 [RFC4216] Zhang, R., and Vasseur, J.-P., "MPLS Inter-Autonomous 2724 System (AS) Traffic Engineering (TE) Requirements", 2725 RFC 4216, November 2005. 2727 [RFC4271] Rekhter, Y., Li, T., and Hares, S., "A Border Gateway 2728 Protocol 4 (BGP-4)", RFC 4271, January 2006. 2730 [RFC4364] Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private 2731 Networks (VPNs)", RFC 4364, February 2006. 2733 [RFC4655] Farrel, A., Vasseur, J., and J. Ash, "A Path Computation 2734 Element (PCE)-Based Architecture", RFC 4655, August 2006. 2736 [RFC4726] Farrel, A., Vasseur, J.-P., and Ayyangar, A., "A Framework 2737 for Inter-Domain Multiprotocol Label Switching Traffic 2738 Engineering", RFC 4726, November 2006. 2740 [RFC4847] T. Takeda (Ed.), "Framework and Requirements for Layer 1 2741 Virtual Private Networks," RFC 4847, April 2007. 2743 [RFC4874] Lee, CY., Farrel, A., and S. De Cnodder, "Exclude Routes - 2744 Extension to Resource ReserVation Protocol-Traffic 2745 Engineering (RSVP-TE)", RFC 4874, April 2007. 2747 [RFC4920] Farrel, A., Satyanarayana, A., Iwata, A., Fujita, N., and 2748 Ash, G., "Crankback Signaling Extensions for MPLS and GMPLS 2749 RSVP-TE", RFC 4920, July 2007. 2751 [RFC5150] Ayyangar, A., Kompella, K., Vasseur, JP., and A. Farrel, 2752 "Label Switched Path Stitching with Generalized 2753 Multiprotocol Label Switching Traffic Engineering (GMPLS 2754 TE)", RFC 5150, February 2008. 
2756 [RFC5152] Vasseur, JP., Ayyangar, A., and Zhang, R., "A Per-Domain 2757 Path Computation Method for Establishing Inter-Domain 2758 Traffic Engineering (TE) Label Switched Paths (LSPs)", 2759 RFC 5152, February 2008. 2761 [RFC5195] Ould-Brahim, H., Fedyk, D., and Y. Rekhter, "BGP-Based 2762 Auto-Discovery for Layer-1 VPNs", RFC 5195, June 2008. 2764 [RFC5212] Shiomoto, K., Papadimitriou, D., Le Roux, JL., Vigoureux, 2765 M., and D. Brungard, "Requirements for GMPLS-Based Multi- 2766 Region and Multi-Layer Networks (MRN/MLN)", RFC 5212, July 2767 2008. 2769 [RFC5246] Dierks, T. and E. Rescorla, "The Transport Layer Security 2770 (TLS) Protocol Version 1.2", RFC 5246, August 2008. 2772 [RFC5251] Fedyk, D., Rekhter, Y., Papadimitriou, D., Rabbat, R., and 2773 L. Berger, "Layer 1 VPN Basic Mode", RFC 5251, July 2008. 2775 [RFC5252] Bryskin, I. and L. Berger, "OSPF-Based Layer 1 VPN Auto- 2776 Discovery", RFC 5252, July 2008. 2778 [RFC5305] Li, T., and Smit, H., "IS-IS Extensions for Traffic 2779 Engineering", RFC 5305, October 2008. 2781 [RFC5440] Vasseur, JP. and Le Roux, JL., "Path Computation Element 2782 (PCE) Communication Protocol (PCEP)", RFC 5440, March 2009. 2784 [RFC5441] Vasseur, JP., Zhang, R., Bitar, N, and Le Roux, JL., "A 2785 Backward-Recursive PCE-Based Computation (BRPC) Procedure 2786 to Compute Shortest Constrained Inter-Domain Traffic 2787 Engineering Label Switched Paths", RFC 5441, April 2009. 2789 [RFC5523] L. Berger, "OSPFv3-Based Layer 1 VPN Auto-Discovery", RFC 2790 5523, April 2009. 2792 [RFC5553] Farrel, A., Bradford, R., and JP. Vasseur, "Resource 2793 Reservation Protocol (RSVP) Extensions for Path Key 2794 Support", RFC 5553, May 2009. 2796 [RFC5623] Oki, E., Takeda, T., Le Roux, JL., and A. Farrel, 2797 "Framework for PCE-Based Inter-Layer MPLS and GMPLS Traffic 2798 Engineering", RFC 5623, September 2009. 2800 [RFC5920] L. Fang, Ed., "Security Framework for MPLS and GMPLS 2801 Networks", RFC 5920, July 2010. 
2803 [RFC5925] Touch, J., Mankin, A., and R. Bonica, "The TCP 2804 Authentication Option", RFC 5925, June 2010. 2806 [RFC6005] Nerger, L., and D. Fedyk, "Generalized MPLS (GMPLS) Support 2807 for Metro Ethernet Forum and G.8011 User Network Interface 2808 (UNI)", RFC 6005, October 2010. 2810 [RFC6107] Shiomoto, K., and A. Farrel, "Procedures for Dynamically 2811 Signaled Hierarchical Label Switched Paths", RFC 6107, 2812 February 2011. 2814 [RFC6701] Frankel, S. and S. Krishnan, "IP Security (IPsec) and 2815 Internet Key Exchange (IKE) Document Roadmap", RFC 6701, 2816 February 2011. 2818 [RFC6805] King, D., and A. Farrel, "The Application of the Path 2819 Computation Element Architecture to the Determination of a 2820 Sequence of Domains in MPLS and GMPLS", RFC 6805, November 2821 2012. 2823 [RFC6827] Malis, A., Lindem, A., and D. Papadimitriou, "Automatically 2824 Switched Optical Network (ASON) Routing for OSPFv2 2825 Protocols", RFC 6827, January 2013. 2827 [RFC6996] J. Mitchell, "Autonomous System (AS) Reservation for 2828 Private Use", BCP 6, RFC 6996, July 2013. 2830 [RFC7399] Farrel, A. and D. King, "Unanswered Questions in the Path 2831 Computation Element Architecture", RFC 7399, October 2014. 2833 Authors' Addresses 2835 Adrian Farrel 2836 Juniper Networks 2837 EMail: adrian@olddog.co.uk 2839 John Drake 2840 Juniper Networks 2841 EMail: jdrake@juniper.net 2843 Nabil Bitar 2844 Verizon 2845 40 Sylvan Road 2846 Waltham, MA 02145 2847 EMail: nabil.bitar@verizon.com 2849 George Swallow 2850 Cisco Systems, Inc. 2851 1414 Massachusetts Ave 2852 Boxborough, MA 01719 2853 EMail: swallow@cisco.com 2855 Xian Zhang 2856 Huawei Technologies 2857 Email: zhang.xian@huawei.com 2859 Daniele Ceccarelli 2860 Ericsson 2861 Via A. 
Negrone 1/A 2862 Genova - Sestri Ponente 2863 Italy 2864 EMail: daniele.ceccarelli@ericsson.com 2866 Contributors 2868 Gert Grammel 2869 Juniper Networks 2870 Email: ggrammel@juniper.net 2872 Vishnu Pavan Beeram 2873 Juniper Networks 2874 Email: vbeeram@juniper.net 2876 Oscar Gonzalez de Dios 2877 Email: ogondio@tid.es 2879 Fatai Zhang 2880 Email: zhangfatai@huawei.com 2882 Zafar Ali 2883 Email: zali@cisco.com 2885 Rajan Rao 2886 Email: rrao@infinera.com 2888 Sergio Belotti 2889 Email: sergio.belotti@alcatel-lucent.com 2891 Diego Caviglia 2892 Email: diego.caviglia@ericsson.com 2894 Jeff Tantsura 2895 Email: jeff.tantsura@ericsson.com 2897 Khuzema Pithewan 2898 Email: kpithewan@infinera.com 2900 Cyril Margaria 2901 Email: cyril.margaria@googlemail.com 2903 Victor Lopez 2904 Email: vlopez@tid.es