Network Working Group                                    A. Farrel (Ed.)
Internet-Draft                                                  J. Drake
Intended status: Best Current Practice                  Juniper Networks
Expires: September 20, 2016
                                                                N. Bitar
                                                          Nuage Networks

                                                              G. Swallow
                                                     Cisco Systems, Inc.

                                                           D. Ceccarelli
                                                                Ericsson

                                                                X. Zhang
                                                                  Huawei
                                                          March 20, 2016


       Problem Statement and Architecture for Information Exchange
           Between Interconnected Traffic Engineered Networks

           draft-ietf-teas-interconnected-te-info-exchange-04.txt

Abstract

   In Traffic Engineered (TE) systems, it is sometimes desirable to
   establish an end-to-end TE path with a set of constraints (such as
   bandwidth) across one or more networks from a source to a
   destination.  TE information is the data relating to nodes and TE
   links that is used in the process of selecting a TE path.  TE
   information is usually only available within a network.  We call
   such a zone of visibility of TE information a domain.  An example of
   a domain may be an IGP area or an Autonomous System.

   In order to determine the potential to establish a TE path through a
   series of connected networks, it is necessary to have available a
   certain amount of TE information about each network.  This need not
   be the full set of TE information available within each network, but
   does need to express the potential of providing TE connectivity.
   This subset of TE information is called TE reachability information.

   This document sets out the problem statement and architecture for
   the exchange of TE information between interconnected TE networks in
   support of end-to-end TE path establishment.  For reasons that are
   explained in the document, this work is limited to simple TE
   constraints and information that determine TE reachability.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.
   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.  It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

Copyright Notice

   Copyright (c) 2016 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document.  Code Components extracted from this
   document must include Simplified BSD License text as described in
   Section 4.e of the Trust Legal Provisions and are provided without
   warranty as described in the Simplified BSD License.

Table of Contents

   1. Introduction
   1.1. Terminology
   1.1.1. TE Paths and TE Connections
   1.1.2. TE Metrics and TE Attributes
   1.1.3. TE Reachability
   1.1.4. Domain
   1.1.5. Aggregation
   1.1.6. Abstraction
   1.1.7. Abstract Link
   1.1.8. Abstract Node or Virtual Node
   1.1.9. Abstraction Layer Network
   2. Overview of Use Cases
   2.1. Peer Networks
   2.2. Client-Server Networks
   2.3. Dual-Homing
   2.4. Requesting Connectivity
   2.4.1. Discovering Server Network Information
   3. Problem Statement
   3.1. Policy and Filters
   3.2. Confidentiality
   3.3. Information Overload
   3.4. Issues of Information Churn
   3.5. Issues of Aggregation
   4. Architecture
   4.1. TE Reachability
   4.2. Abstraction not Aggregation
   4.2.1. Abstract Links
   4.2.2. The Abstraction Layer Network
   4.2.3. Abstraction in Client-Server Networks
   4.2.4. Abstraction in Peer Networks
   4.3. Considerations for Dynamic Abstraction
   4.4. Requirements for Advertising Links and Nodes
   4.5. Addressing Considerations
   5. Building on Existing Protocols
   5.1. BGP-LS
   5.2. IGPs
   5.3. RSVP-TE
   5.4. Notes on a Solution
   6. Applicability to Optical Domains and Networks
   7. Modeling the User-to-Network Interface
   8. Abstraction in L3VPN Multi-AS Environments
   9. Scoping Future Work
   9.1. Not Solving the Internet
   9.2. Working With "Related" Domains
   9.3. Not Finding Optimal Paths in All Situations
   9.4. Sanity and Scaling
   10. Manageability Considerations
   10.1. Managing the Abstraction Layer Network
   10.2. Managing Interactions of Client and Abstraction Layer Networks
   10.3. Managing Interactions of Abstraction Layer and Server Networks
   11. IANA Considerations
   12. Security Considerations
   13. Acknowledgements
   14. References
   14.1. Informative References
   Authors' Addresses
   Contributors
   A. Existing Work
   A.1. Per-Domain Path Computation
   A.2. Crankback
   A.3. Path Computation Element
   A.4. GMPLS UNI and Overlay Networks
   A.5. Layer One VPN
   A.6. Policy and Link Advertisement
   B. Additional Features
   B.1. Macro Shared Risk Link Groups
   B.2. Mutual Exclusivity

1. Introduction

   Traffic Engineered (TE) systems such as MPLS-TE [RFC2702] and GMPLS
   [RFC3945] offer a way to establish paths through a network in a
   controlled way that reserves network resources on specified links.
   TE paths are computed by examining the Traffic Engineering Database
   (TED) and selecting a sequence of links and nodes that are capable
   of meeting the requirements of the path to be established.  The TED
   is constructed from information distributed by the IGP running in
   the network, for example OSPF-TE [RFC3630] or ISIS-TE [RFC5305].

   It is sometimes desirable to establish an end-to-end TE path that
   crosses more than one network or administrative domain as described
   in [RFC4105] and [RFC4216].  In these cases, the availability of TE
   information is usually limited to within each network.  Such
   networks are often referred to as Domains [RFC4726] and we adopt
   that definition in this document: viz.

      For the purposes of this document, a domain is considered to be
      any collection of network elements within a common sphere of
      address management or path computational responsibility.
      Examples of such domains include IGP areas and Autonomous
      Systems.

   In order to determine the potential to establish a TE path through a
   series of connected domains and to choose the appropriate domain
   connection points through which to route a path, it is necessary to
   have available a certain amount of TE information about each domain.
   This need not be the full set of TE information available within
   each domain, but does need to express the potential of providing TE
   connectivity.  This subset of TE information is called TE
   reachability information.  The TE reachability information can be
   exchanged between domains based on the information gathered from the
   local routing protocol, filtered by configured policy, or statically
   configured.
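   As a purely illustrative, non-normative sketch (not part of any
   protocol defined or referenced by this document), the derivation of
   TE reachability information from a local TED, filtered by configured
   policy, might look as follows.  The data model, field names, and
   policy shown here are hypothetical.

```python
# Hypothetical sketch: deriving TE reachability information for export
# by applying a configured policy filter to a local TED.  All names
# and structures are illustrative only.

# One TED entry per TE link: full detail is visible inside the domain.
ted = [
    {"link": ("A1", "A2"), "bandwidth": 10.0, "delay_ms": 2, "srlg": [101]},
    {"link": ("A2", "X1"), "bandwidth": 40.0, "delay_ms": 1, "srlg": [102]},
    {"link": ("A1", "X2"), "bandwidth": 2.5,  "delay_ms": 7, "srlg": [101]},
]

# Example policy: only links touching inter-domain border nodes are
# exported, and only a coarse subset of their attributes is advertised.
BORDER_NODES = {"X1", "X2"}
EXPORTED_ATTRS = ("bandwidth", "delay_ms")

def export_te_reachability(ted, border_nodes, exported_attrs):
    """Return the filtered subset of TE information for other domains."""
    exported = []
    for entry in ted:
        if not border_nodes.intersection(entry["link"]):
            continue  # purely internal link: kept confidential
        exported.append({"link": entry["link"],
                         **{a: entry[a] for a in exported_attrs}})
    return exported

advertised = export_te_reachability(ted, BORDER_NODES, EXPORTED_ATTRS)
```

   Note how the SRLG detail is withheld by policy: the exporting
   operator trades path optimality in other domains for
   confidentiality, a tension discussed further in Section 3.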
   This document sets out the problem statement and architecture for
   the exchange of TE information between interconnected TE domains in
   support of end-to-end TE path establishment.  The scope of this
   document is limited to the simple TE constraints and information
   (such as TE metrics, hop count, bandwidth, delay, shared risk)
   necessary to determine TE reachability: discussion of multiple
   additional constraints that might qualify the reachability can
   significantly complicate aggregation of information and the
   stability of the mechanism used to present potential connectivity,
   as is explained in the body of this document.

   An appendix to this document summarizes relevant existing work that
   is used to route TE paths across multiple domains.

1.1. Terminology

   This section introduces some key terms that need to be understood to
   arrive at a common understanding of the problem space.  Some of the
   terms are defined in more detail in the sections that follow (in
   which case forward pointers are provided) and some terms are taken
   from definitions that already exist in other RFCs (in which case
   references are given, but no apology is made for repeating or
   summarizing the definitions here).

1.1.1. TE Paths and TE Connections

   A TE connection is a Label Switched Path (LSP) through an MPLS-TE or
   GMPLS network that directs traffic along a particular path (the TE
   path) in order to provide a specific service such as bandwidth
   guarantee, separation of traffic, or resilience between a well-known
   pair of end points.

1.1.2. TE Metrics and TE Attributes

   TE metrics and TE attributes are terms applied to parameters of
   links (and possibly nodes) in a network that is traversed by TE
   connections.  The TE metrics and TE attributes are used by path
   computation algorithms to select the TE paths that the TE
   connections traverse.
   Provisioning a TE connection through a network may result in dynamic
   changes to the TE metrics and TE attributes of the links and nodes
   in the network.

   These terms are also sometimes used to describe the end-to-end
   characteristics of a TE connection and can be derived according to a
   formula from the metrics and attributes of the links and nodes that
   the TE connection traverses.  Thus, for example, the end-to-end
   delay for a TE connection is usually considered to be the sum of the
   delay on each link that the connection traverses.

1.1.3. TE Reachability

   In an IP network, reachability is the ability to deliver a packet to
   a specific address or prefix.  That is, the existence of an IP path
   to that address or prefix.  TE reachability is the ability to reach
   a specific address along a TE path.  More specifically, it is the
   ability to establish a TE connection in an MPLS-TE or GMPLS sense.
   Thus we talk about TE reachability as the potential of providing TE
   connectivity.

   TE reachability may be unqualified (there is a TE path, but no
   information about available resources or other constraints is
   supplied), which is especially helpful in determining a path to a
   destination that lies in an unknown domain, or may be qualified by
   TE attributes and TE metrics such as hop count, available bandwidth,
   delay, shared risk, etc.

1.1.4. Domain

   As defined in [RFC4726], a domain is any collection of network
   elements within a common sphere of address management or path
   computational responsibility.  Examples of such domains include
   Interior Gateway Protocol (IGP) areas and Autonomous Systems (ASes).

1.1.5. Aggregation

   The concept of aggregation is discussed in Section 3.5.  In
   aggregation, multiple network resources from a domain are
   represented outside the domain as a single entity.
   Thus multiple links and nodes forming a TE connection may be
   represented as a single link, or a collection of nodes and links
   (perhaps the whole domain) may be represented as a single node with
   its attachment links.

1.1.6. Abstraction

   Section 4.2 introduces the concept of abstraction and distinguishes
   it from aggregation.  Abstraction may be viewed as "policy-based
   aggregation" where the policies are applied to overcome the issues
   with aggregation as identified in Section 3 of this document.

   Abstraction is the process of applying policy to the available TE
   information within a domain, to produce selective information that
   represents the potential ability to connect across the domain.
   Thus, abstraction does not necessarily offer all possible
   connectivity options, but presents a general view of potential
   connectivity according to the policies that determine how the
   domain's administrator wants to allow the domain resources to be
   used.

1.1.7. Abstract Link

   An abstract link is the representation of the characteristics of a
   path between two nodes in a domain produced by abstraction.  The
   abstract link is advertised outside that domain as a TE link for use
   in signaling in other domains.  Thus, an abstract link represents
   the potential to connect between a pair of nodes.

   More details of abstract links are provided in Section 4.2.1.

1.1.8. Abstract Node or Virtual Node

   An abstract node was defined in [RFC3209] as a group of nodes whose
   internal topology is opaque to an ingress node of the LSP.  More
   generally, an abstract node or virtual node is the representation as
   a single node in a TE topology of one or more nodes and the links
   that connect them.  An abstract node may be advertised outside the
   domain as a TE node for use in path computation and signaling in
   other domains.
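   As a rough, non-normative illustration of the abstract-node concept
   (a hypothetical data model, not a protocol definition), a group of
   nodes and the links that connect them can be collapsed into a single
   advertised node that retains only its external attachment links:

```python
# Hypothetical sketch: collapsing a group of nodes and their internal
# links into one abstract node.  Links wholly inside the group become
# opaque; links crossing the group boundary are kept as attachment
# links of the abstract node.  Names here are illustrative only.

def abstract_node(links, group, name):
    """Replace every endpoint in 'group' with the abstract node 'name',
    discarding links that are wholly internal to the group."""
    attachment_links = []
    for a, b in links:
        if a in group and b in group:
            continue  # internal topology is hidden from other domains
        a2 = name if a in group else a
        b2 = name if b in group else b
        attachment_links.append((a2, b2))
    return attachment_links

links = [("P1", "P2"), ("P2", "P3"),        # internal to the domain core
         ("CE1", "P1"), ("P3", "CE2")]      # boundary links
print(abstract_node(links, {"P1", "P2", "P3"}, "AN1"))
# -> [('CE1', 'AN1'), ('AN1', 'CE2')]
```

   Other domains then see only AN1 and its two attachment links; the
   internal P1-P2-P3 topology remains opaque.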
   Sections 3.5 and 4.2.2.1 provide more information about the uses and
   issues of abstract nodes and virtual nodes.

1.1.9. Abstraction Layer Network

   The abstraction layer network is introduced in Section 4.2.2.  It
   may be seen as a brokerage layer network between one or more server
   networks and one or more client networks.  The abstraction layer
   network is the collection of abstract links that provide potential
   connectivity across the server network(s) and on which path
   computation can be performed to determine edge-to-edge paths that
   provide connectivity as links in the client network.

   In the simplest case, the abstraction layer network is just a set of
   edge-to-edge connections (i.e., abstract links), but to make the use
   of server resources more flexible, the abstract links might not all
   extend from edge to edge, but might offer connectivity between
   server nodes to form a more complex network.

2. Overview of Use Cases

2.1. Peer Networks

   The peer network use case can be most simply illustrated by the
   example in Figure 1.  A TE path is required between the source (Src)
   and destination (Dst), which are located in different domains.
   There are two points of interconnection between the domains, and
   selecting the wrong point of interconnection can lead to a sub-
   optimal path, or even fail to make a path available.  Note that peer
   networks are assumed to have the same technology type: that is, the
   same "switching capability" to use the term from GMPLS [RFC3945].

   For example, when Domain A attempts to select a path, it may
   determine that adequate bandwidth is available from Src through both
   interconnection points x1 and x2.  It may pick the path through x1
   for local policy reasons: perhaps the TE metric is smaller.
   However, if there is no connectivity in Domain Z from x1 to Dst, the
   path cannot be established.
   Techniques such as crankback may be used to alleviate this
   situation, but do not lead to rapid setup or guaranteed optimality.
   Furthermore, RSVP signaling creates state in the network that is
   immediately removed by the crankback procedure.  Frequent events of
   such a kind impact scalability in a non-deterministic manner.  More
   details of crankback can be found in Section A.2.

      --------------       --------------
      |  Domain A  | x1    |  Domain Z  |
      |   -----    +-------+   -----    |
      |  | Src |   +-------+  | Dst |   |
      |   -----    |  x2   |   -----    |
      --------------       --------------

                   Figure 1 : Peer Networks

   There are countless more complicated examples of the problem of peer
   networks.  Figure 2 shows the case where there is a simple mesh of
   domains.  Clearly, to find a TE path from Src to Dst, Domain A must
   not select a path leaving through interconnect x1 since Domain B has
   no connectivity to Domain Z.  Furthermore, in deciding whether to
   select interconnection x2 (through Domain C) or interconnection x3
   through Domain D, Domain A must be sensitive to the TE connectivity
   available through each of Domains C and D, as well as the TE
   connectivity from each of interconnections x4 and x5 to Dst within
   Domain Z.

                      --------------
                      |  Domain B  |
                      |            |
                      |            |
                     /--------------
                    /
                   /
                  /x1
   --------------/                        --------------
   |  Domain A  |                         |  Domain Z  |
   |            |    --------------       |            |
   |   -----    | x2 |  Domain C  |  x4  |   -----    |
   |  | Src |   +----+            +------+  | Dst |   |
   |   -----    |    |            |      |   -----    |
   |            |    --------------       |            |
   --------------\                       /--------------
                  \x3                   /
                   \                   /
                    \                 /x5
                     \--------------/
                      |  Domain D  |
                      |            |
                      |            |
                      --------------

                Figure 2 : Peer Networks in a Mesh
   The problem may be further complicated when the source domain does
   not know in which domain the destination node is located, since the
   choice of a domain path clearly depends on the knowledge of the
   destination domain: this issue is obviously mitigated in IP networks
   by inter-domain routing [RFC4271].

   Of course, many network interconnection scenarios are going to be a
   combination of the situations expressed in these two examples.
   There may be a mesh of domains, and the domains may have multiple
   points of interconnection.

2.2. Client-Server Networks

   Two major classes of use case relate to the client-server
   relationship between networks.  These use cases have sometimes been
   referred to as overlay networks.  In both cases, the client and
   server network may have the same switching capability, or may be
   built from nodes and links that have different technology types in
   the client and server networks.

   The first group of use cases, shown in Figure 3, occurs when domains
   belonging to one network are connected by a domain belonging to
   another network.  In this scenario, once connectivity is formed
   across the lower layer network, the domains of the upper layer
   network can be merged into a single domain by running IGP
   adjacencies and by treating the server layer connectivity as links
   in the higher layer network.

    --------------                         --------------
    |  Domain A  |                         |  Domain Z  |
    |            |                         |            |
    |   -----    |                         |   -----    |
    |  | Src |   |                         |  | Dst |   |
    |   -----    |                         |   -----    |
    |            |                         |            |
    --------------\                       /--------------
                   \x1                   x2/
                    \                   /
                     \                 /
                      \---------------/
                      | Server Domain |
                      |               |
                      |               |
                       ---------------

                Figure 3 : Client-Server Networks
   The TE relationship between the domains (higher and lower layer) in
   this case is reduced to determining what server layer connectivity
   to establish, how to trigger it, how to route it in the server
   layer, and what resources and capacity to assign within the server
   layer.  As the demands in the higher layer network vary, the
   connectivity in the server layer may need to be modified.  Section
   2.4 explains in a little more detail how connectivity may be
   requested.

   The second class of use case of client-server networking is for
   Virtual Private Networks (VPNs).  In this case, as opposed to the
   former one, it is assumed that the client network has a different
   address space than that of the server layer, where non-overlapping
   IP addresses between the client and the server networks cannot be
   guaranteed.  A simple example is shown in Figure 4.  The VPN sites
   comprise a set of domains that are interconnected over a core
   domain, the provider network.

    --------------                         --------------
    |  Domain A  |                         |  Domain Z  |
    | (VPN site) |                         | (VPN site) |
    |            |                         |            |
    |   -----    |                         |   -----    |
    |  | Src |   |                         |  | Dst |   |
    |   -----    |                         |   -----    |
    |            |                         |            |
    --------------\                       /--------------
                   \x1                   x2/
                    \                   /
                     \                 /
                      \---------------/
                      |  Core Domain  |
                      |               |
                      |               |
                      /---------------\
                     /                 \
                    /                   \
                   /x3                 x4\
    --------------/                       \--------------
    |  Domain B  |                         |  Domain C  |
    | (VPN site) |                         | (VPN site) |
    |            |                         |            |
    |            |                         |            |
    --------------                         --------------

              Figure 4 : A Virtual Private Network

   Note that in the use cases shown in Figures 3 and 4 the client layer
   domains may (and, in fact, probably do) operate as a single
   connected network.

   Both use cases in this section become "more interesting" when
   combined with the use case in Section 2.1.  That is, when the
   connectivity between higher layer domains or VPN sites is provided
   by a sequence or mesh of lower layer domains.
   Figure 5 shows how this might look in the case of a VPN.

   --------------                                      --------------
   |  Domain A  |                                      |  Domain Z  |
   | (VPN site) |                                      | (VPN site) |
   |   -----    |                                      |   -----    |
   |  | Src |   |                                      |  | Dst |   |
   |   -----    |                                      |   -----    |
   |            |                                      |            |
   --------------\                                    /--------------
                  \x1                                x2/
                   \                                /
                    \-------------    -------------/
                    |  Domain X  |x5  |  Domain Y  |
                    |   (core)   +----+   (core)   |
                    |            |    |            |
                    |            +----+            |
                    |            | x6 |            |
                    /-------------    -------------\
                   /                                \
                  /                                  \
                 /x3                                x4\
   --------------/                                    \--------------
   |  Domain B  |                                      |  Domain C  |
   | (VPN site) |                                      | (VPN site) |
   |            |                                      |            |
   --------------                                      --------------

        Figure 5 : A VPN Supported Over Multiple Server Domains

2.3. Dual-Homing

   A further complication may be added to the client-server
   relationship described in Section 2.2 by considering what happens
   when a client domain is attached to more than one server domain, or
   has two points of attachment to a server domain.  Figure 6 shows an
   example of this for a VPN.

                          --------------
                          |  Domain A  |
                          | (VPN site) |
   --------------         |   -----    |
   |  Domain B  |         |  | Src |   |
   | (VPN site) |         |   -----    |
   |            |         |            |
   --------------\        -+----------+-
                  \x1      |          |
                   \     x2|          |x3
                    \      |          |                --------------
                     \-----+------   +-------------   |  Domain Z  |
                     |  Domain X  | x8 |  Domain Y  | x4 | (VPN site) |
                     |   (core)   +----+   (core)   +----+   -----    |
                     |            |    |            |    |  | Dst |   |
                     |            +----+            +----+   -----    |
                     |            | x9 |            | x5 |            |
                     /-------------    -------------\    --------------
                    /                                 \
                   /                                   \
                  /x6                                 x7\
   --------------/                                      \--------------
   |  Domain C  |                                        |  Domain D  |
   | (VPN site) |                                        | (VPN site) |
   |            |                                        |            |
   --------------                                        --------------

         Figure 6 : Dual-Homing in a Virtual Private Network

2.4. Requesting Connectivity

   This relationship between domains can be entirely under the control
   of management processes, dynamically triggered by the client
   network, or some hybrid of these cases.
   In the management case, the server network may be requested to
   establish a set of LSPs to provide client layer connectivity.  In
   the dynamic case, the client may make a request to the server
   network exerting a range of controls over the paths selected in the
   server network.  This range extends from no control (i.e., a simple
   request for connectivity), through a set of constraints (such as
   latency, path protection, etc.), up to and including full control of
   the path and resources used in the server network (i.e., the use of
   explicit paths with label subobjects).

   There are various models by which a server network can be requested
   to set up the connections that support a service provided to the
   client network.  These requests may come from management systems,
   directly from the client network control plane, or through an
   intermediary broker such as the Virtual Network Topology Manager
   (VNTM) [RFC5623].

   The trigger that causes the request to the server layer is also
   flexible.  It could be that the client layer discovers a pressing
   need for server layer resources (such as the desire to provision an
   end-to-end connection in the client layer, or severe congestion on a
   specific path), or it might be that a planning application has
   considered how best to optimize traffic in the client network or how
   to handle a predicted traffic demand.

   In all cases, the relationship between client and server networks is
   subject to policy so that server resources are under the
   administrative control of the operator of the server layer network
   and are only used to support a client layer network in ways that the
   server layer operator approves.

   As just noted, connectivity requests issued to a server network may
   include varying degrees of constraint upon the choice of path that
   the server network can implement.

   o  Basic Provisioning is a simple request for connectivity.
      The only constraints are the end points of the connection and the
      capacity (bandwidth) that the connection will support for the
      client layer.  In the case of some server networks, even the
      bandwidth component of a basic provisioning request is
      superfluous because the server layer has no facility to vary
      bandwidth, but can offer connectivity only at a default capacity.

   o  Basic Provisioning with Optimization is a service request that
      indicates one or more metrics that the server layer must optimize
      in its selection of a path.  Metrics may be hop count, path
      length, summed TE metric, jitter, delay, or any number of
      technology-specific constraints.

   o  Basic Provisioning with Optimization and Constraints enhances the
      optimization process to apply absolute constraints to functions
      of the path metrics.  For example, a connection may be requested
      that optimizes for the shortest path, but in any case requests
      that the end-to-end delay be less than a certain value.  Equally,
      optimization may be expressed in terms of the impact on the
      network.  For example, a service may be requested in order to
      leave maximal flexibility to satisfy future service requests.

   o  Fate Diversity requests ask for the server layer to provide a
      path that does not use any network resources (usually links and
      nodes) that share fate (i.e., can fail as the result of a single
      event) with the resources used by another connection.  This
      allows the client layer to construct protection services over the
      server layer network, for example by establishing links that are
      known to be fate diverse.  The connections that have diverse
      paths need not share end points.

   o  Provisioning with Fate Sharing is the exact opposite of Fate
      Diversity.  In this case two or more connections are requested to
      follow the same path in the server network.
      This may be requested, for example, to create a bundled or
      aggregated link in the client layer where each component of the
      client layer composite link is required to have the same server
      layer properties (metrics, delay, etc.) and the same failure
      characteristics.

   o  Concurrent Provisioning enables the inter-related connection
      requests described in the previous two bullets to be enacted
      through a single, compound service request.

   o  Service Resilience requests the server layer to provide
      connectivity for which the server layer takes responsibility to
      recover from faults.  The resilience may be achieved through the
      use of link-level protection, segment protection, end-to-end
      protection, or recovery mechanisms.

2.4.1. Discovering Server Network Information

   Although the topology and resource availability information of a
   server network may be hidden from the client network, the service
   request interface may support features that report details about the
   services and potential services that the server network supports.

   o  Reporting of path details, service parameters, and issues such as
      path diversity of LSPs that support deployed services allows the
      client network to understand to what extent its requests were
      satisfied.  This is particularly important when the requests were
      made as "best effort".

   o  A server network may support requests of the form "if I was to
      ask you for this service, would you be able to provide it?"
      That is, a service request that does everything except actually
      provision the service.

3. Problem Statement

   The problem statement presented in this section is as much about the
   issues that may arise in any solution (and so have to be avoided)
   and the features that are desirable within a solution, as it is
   about the actual problem to be solved.
675 The problem can be stated very simply and with reference to the use 676 cases presented in the previous section. 678 A mechanism is required that allows TE-path computation in one 679 domain to make informed choices about the TE-capabilities and exit 680 points from the domain when signaling an end-to-end TE path that 681 will extend across multiple domains. 683 Thus, the problem is one of information collection and presentation, 684 not about signaling. Indeed, the existing signaling mechanisms for 685 TE LSP establishment are likely to prove adequate [RFC4726] with the 686 possibility of minor extensions. Similarly, TE information may 687 currently be distributed in a domain by TE extensions to one of the 688 two IGPs as described in OSPF-TE [RFC3630] and ISIS-TE [RFC5305], 689 and TE information may be exported from a domain (for example, 690 northbound) using link state extensions to BGP [RFC7752]. 692 An interesting annex to the problem is how the path is made available 693 for use. For example, in the case of a client-server network, the 694 path established in the server network needs to be made available as 695 a TE link to provide connectivity in the client network. 697 3.1. Policy and Filters 699 A solution must be amenable to the application of policy and filters. 700 That is, the operator of a domain that is sharing information with 701 another domain must be able to apply controls to what information is 702 shared. Furthermore, the operator of a domain that has information 703 shared with it must be able to apply policies and filters to the 704 received information. 706 Additionally, the path computation within a domain must be able to 707 weight the information received from other domains according to local 708 policy such that the resultant computed path meets the local 709 operator's needs and policies rather than those of the operators of 710 other domains. 712 3.2. 
Confidentiality 714 A feature of the policy described in Section 3.1 is that an operator 715 of a domain may desire to keep confidential the details about its 716 internal network topology and loading. This information could be 717 construed as commercially sensitive. 719 Although it is possible that TE information exchange will take place 720 only between parties that have significant trust, there are also use 721 cases (such as the VPN supported over multiple server domains 722 described in Section 2.4) where information will be shared between 723 domains that have a commercial relationship, but a low level of 724 trust. 726 Thus, it must be possible for a domain to limit the information shared 727 to just that which the computing domain needs to know, with the 728 understanding that the less information that is made available, the more 729 likely it is that the result will be a less optimal path and/or more 730 crankback events. 732 3.3. Information Overload 734 One reason that networks are partitioned into separate domains is to 735 reduce the set of information that any one router has to handle. 736 This also applies to the volume of information that routing protocols 737 have to distribute. 739 Over the years routers have become more sophisticated with greater 740 processing capabilities and more storage, the control channels on 741 which routing messages are exchanged have become higher capacity, and 742 the routing protocols (and their implementations) have become more 743 robust. Thus, some of the arguments in favor of dividing a network 744 into domains may have been reduced. Conversely, however, the size of 745 networks continues to grow dramatically with a consequent increase in 746 the total amount of routing-related information available. 747 Additionally, in this case, the problem space spans two or more 748 networks. 750 Any solution to the problems voiced in this document must be aware of 751 the issues of information overload.
If the solution were simply to 752 share all TE information between all domains in the network, the 753 effect from the point of view of the information load would be to 754 create one single flat network domain. Thus the solution must 755 deliver enough information to make the computation practical (i.e., 756 to solve the problem), but not so much as to overload the receiving 757 domain. Furthermore, the solution cannot simply rely on the policies 758 and filters described in Section 3.1 because such filters might not 759 always be enabled. 761 3.4. Issues of Information Churn 763 As LSPs are set up and torn down, the available TE resources on links 764 in the network change. In order to reliably compute a TE path 765 through a network, the computation point must have an up-to-date view 766 of the available TE resources. However, collecting this information 767 may result in considerable load on the distribution protocol and 768 churn in the stored information. In order to deal with this problem 769 even in a single domain, updates are sent at periodic intervals or 770 whenever there is a significant change in resources, whichever 771 happens first. 773 Consider, for example, that a TE LSP may traverse ten links in a 774 network. When the LSP is set up or torn down, the resources 775 available on each link will change resulting in a new advertisement 776 of the link's capabilities and capacity. If the arrival rate of new 777 LSPs is relatively fast, and the hold times relatively short, the 778 network may be in a constant state of flux. Note that the 779 problem here is not limited to churn within a single domain, since 780 the information shared between domains will also be changing. 781 Furthermore, the information that one domain needs to share with 782 another may change as the result of LSPs that are contained within or 783 cross the first domain but which are of no direct relevance to the 784 domain receiving the TE information.
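The damping behavior described above (re-advertise at periodic intervals or on a significant resource change, whichever happens first) can be sketched as follows. This is an illustrative model only; the class name, the 10% threshold, and the 30-second refresh interval are invented for the example and are not mandated by any protocol.

```python
import time

class TeLinkAdvertiser:
    """Decide when to re-advertise a TE link's available bandwidth.

    An update is flooded when the change since the last advertisement
    exceeds a configured fraction of the link capacity, or when the
    periodic refresh interval has expired -- whichever happens first.
    All parameter values here are illustrative.
    """

    def __init__(self, capacity, threshold=0.10, interval=30.0,
                 clock=time.monotonic):
        self.capacity = capacity
        self.threshold = threshold   # fraction of total capacity
        self.interval = interval     # periodic refresh time (seconds)
        self.clock = clock
        self.advertised = capacity   # last advertised available bandwidth
        self.last_update = clock()

    def on_resource_change(self, available):
        """Return True if a new advertisement should be flooded now."""
        now = self.clock()
        significant = (abs(available - self.advertised)
                       >= self.threshold * self.capacity)
        refresh_due = (now - self.last_update) >= self.interval
        if significant or refresh_due:
            self.advertised = available
            self.last_update = now
            return True
        return False
```

With a 10% threshold on a 100-unit link, a drop from 100 to 95 is suppressed, while a drop to 85 triggers an immediate re-advertisement; small changes are still disseminated once the refresh interval expires.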
786 In packet networks, where the capacity of an LSP is often a small 787 fraction of the resources available on any link, this issue is 788 partially addressed by the advertising routers. They can apply a 789 threshold so that they do not bother to update the advertisement of 790 available resources on a link if the change is less than a configured 791 percentage of the total (or alternatively, the remaining) resources. 792 The updated information in that case will be disseminated based on an 793 update interval rather than a resource change event. 795 In non-packet networks, where link resources are physical switching 796 resources (such as timeslots or wavelengths) the capacity of an LSP 797 may more frequently be a significant percentage of the available link 798 resources. Furthermore, in some switching environments, it is 799 necessary to achieve end-to-end resource continuity (such as using 800 the same wavelength on the whole length of an LSP), so it is far more 801 desirable to keep the TE information held at the computation points 802 up-to-date. Fortunately, non-packet networks tend to be quite a bit 803 smaller than packet networks, the arrival rates of non-packet LSPs 804 are much lower, and the hold times considerably longer. Thus the 805 information churn may be sustainable. 807 3.5. Issues of Aggregation 809 One possible solution to the issues raised in other sub-sections of 810 this section is to aggregate the TE information shared between 811 domains. Two aggregation mechanisms are often considered: 813 - Virtual node model. In this view, the domain is aggregated as if 814 it was a single node (or router / switch). Its links to other 815 domains are presented as real TE links, but the model assumes that 816 any LSP entering the virtual node through a link can be routed to 817 leave the virtual node through any other link (although recent work 818 on "limited cross-connect switches" may help with this problem 820 [RFC7579]). 
822 - Virtual link model. In this model, the domain is reduced to a set 823 of edge-to-edge TE links. Thus, when computing a path for an LSP 824 that crosses the domain, a computation point can see which domain 825 entry points can be connected to which others and with what TE 826 attributes. 828 It is of the nature of aggregation that information is removed from 829 the system. This can cause inaccuracies and failed path computation. 830 For example, in the virtual node model there might not actually be a 831 TE path available between a pair of domain entry points, but the 832 model lacks the sophistication to represent this "limited cross- 833 connect capability" within the virtual node. On the other hand, in 834 the virtual link model it may prove very hard to aggregate multiple 835 link characteristics: for example, there may be one path available 836 with high bandwidth, and another with low delay, but this does not 837 mean that the connectivity should be assumed or advertised as having 838 both high bandwidth and low delay. 840 The trick to this multidimensional problem, therefore, is to 841 aggregate in a way that retains as much useful information as 842 possible while removing the data that is not needed. An important 843 part of this trick is a clear understanding of what information is 844 actually needed. 846 It should also be noted in the context of Section 3.4 that changes in 847 the information within a domain may have a bearing on what aggregated 848 data is shared with another domain. Thus, while the data shared is 849 reduced, the aggregation algorithm (operating on the routers 850 responsible for sharing information) may be heavily exercised. 852 4. Architecture 854 4.1. TE Reachability 856 As described in Section 1.1, TE reachability is the ability to reach 857 a specific address along a TE path. The knowledge of TE reachability 858 enables an end-to-end TE path to be computed.
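The notion of TE reachability just defined can be captured as a simple record: the ability to reach one point from another, qualified by TE attributes. The sketch below is illustrative only; the field names (metric, hop count, bandwidth, delay, shared risk) are chosen for the example and do not correspond to any protocol encoding.

```python
from dataclasses import dataclass, field
from typing import FrozenSet

@dataclass(frozen=True)
class TeReachability:
    """One entry of TE reachability information exchanged between
    networks: 'you can get from src to dst with these TE attributes'.
    Field names are illustrative, not drawn from a protocol encoding."""
    src: str                     # edge node (e.g., a transit-network entry point)
    dst: str                     # edge node, or a specific destination
    te_metric: int
    hop_count: int
    available_bandwidth: float   # e.g., in Mbit/s
    delay_ms: float
    shared_risk_groups: FrozenSet[int] = field(default_factory=frozenset)

    def satisfies(self, min_bandwidth=0.0, max_delay_ms=float("inf")):
        """Check whether this entry can support a request carrying
        simple TE constraints (bandwidth floor, delay ceiling)."""
        return (self.available_bandwidth >= min_bandwidth
                and self.delay_ms <= max_delay_ms)
```

A receiving domain would evaluate such records, subject to its own policies, when deciding whether a TE path can be established across or into the advertising network.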
860 In a single network, TE reachability is derived from the Traffic 861 Engineering Database (TED) that is the collection of all TE 862 information about all TE links in the network. The TED is usually 863 built from the data exchanged by the IGP, although it can be 864 supplemented by configuration and inventory details especially in 865 transport networks. 867 In multi-network scenarios, TE reachability information can be 868 described as "You can get from node X to node Y with the following 869 TE attributes." For transit cases, nodes X and Y will be edge nodes 870 of the transit network, but it is also important to consider the 871 information about the TE connectivity between an edge node and a 872 specific destination node. TE reachability may be qualified by TE 873 attributes such as TE metrics, hop count, available bandwidth, delay, 874 shared risk, etc. 876 TE reachability information can be exchanged between networks so that 877 nodes in one network can determine whether they can establish TE 878 paths across or into another network. Such exchanges are subject to 879 a range of policies imposed by the advertiser (for security and 880 administrative control) and by the receiver (for scalability and 881 stability). 883 4.2. Abstraction not Aggregation 885 Aggregation is the process of synthesizing from available 886 information. Thus, the virtual node and virtual link models 887 described in Section 3.5 rely on processing the information available 888 within a network to produce the aggregate representations of links 889 and nodes that are presented to the consumer. As described in 890 Section 3, dynamic aggregation is subject to a number of pitfalls. 892 In order to distinguish the architecture described in this document 893 from the previous work on aggregation, we use the term "abstraction" 894 in this document. 
The process of abstraction is one of applying 895 policy to the available TE information within a domain, to produce 896 selective information that represents the potential ability to 897 connect across the domain. 899 Abstraction does not offer all possible connectivity options (refer 900 to Section 3.5), but does present a general view of potential 901 connectivity. Abstraction may have a dynamic element, but is not 902 intended to keep pace with the changes in TE attribute availability 903 within the network. 905 Thus, when relying on an abstraction to compute an end-to-end path, 906 the process might not deliver a usable path. That is, there is no 907 actual guarantee that the abstractions are current or feasible. 909 While abstraction uses available TE information, it is subject to 910 policy and management choices. Thus, not all potential connectivity 911 will be advertised to each client. The filters may depend on 912 commercial relationships, the risk of disclosing confidential 913 information, and concerns about what use is made of the connectivity 914 that is offered. 916 4.2.1. Abstract Links 918 An abstract link is a measure of the potential to connect a pair of 919 points with certain TE parameters. That is, it is a path and its 920 characteristics in the server network. An abstract link represents 921 the possibility of setting up an LSP, and LSPs may be set up over the 922 abstract link. 924 When looking at a network such as that in Figure 7, the link from CN1 925 to CN4 may be an abstract link. It is easy to advertise it as a link 926 by abstracting the TE information in the server network subject to 927 policy. 929 The path (i.e., the abstract link) represents the possibility of 930 establishing an LSP from client edge to client edge across the server 931 network. There is not necessarily a one-to-one relationship between 932 abstract link and LSP because more than one LSP could be set up over 933 the path. 
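The abstraction process described above (apply policy to the TE information within a domain and expose only selective potential connectivity) might be sketched as follows. The path records and the policy rules (a minimum-bandwidth floor and a confidentiality rule hiding one node) are invented for illustration.

```python
def abstract_links(candidate_paths, policy):
    """Apply a policy to candidate server-network paths and emit
    abstract links: (entry, exit, attributes) tuples describing the
    potential to connect across the domain.  Internal hops are
    deliberately not exposed to the consumer."""
    links = []
    for path in candidate_paths:
        if policy(path):
            links.append((path["entry"], path["exit"],
                          {"bandwidth": path["bandwidth"],
                           "delay_ms": path["delay_ms"]}))
    return links

# Toy policy: only advertise paths with at least 1 Gbit/s spare
# capacity, and never expose connectivity toward node "CN9"
# (a hypothetical confidentiality rule).
policy = lambda p: p["bandwidth"] >= 1000.0 and p["exit"] != "CN9"

paths = [
    {"entry": "CN1", "exit": "CN4", "hops": ["CN1", "CN2", "CN3", "CN4"],
     "bandwidth": 2500.0, "delay_ms": 4.0},
    {"entry": "CN1", "exit": "CN9", "hops": ["CN1", "CN9"],
     "bandwidth": 5000.0, "delay_ms": 1.0},
]
```

Note that the emitted abstract links carry only end points and selected attributes: the internal hop list is dropped, which is precisely the information hiding that distinguishes abstraction from full topology sharing.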
935 Since the client nodes do not have visibility into the core network, 936 they must rely on abstraction information delivered to them by the 937 core network. That is, the core network will report on the potential 938 for connectivity. 940 4.2.2. The Abstraction Layer Network 942 Figure 7 introduces the abstraction layer network. This construct 943 separates the client layer resources (nodes C1, C2, C3, and C4, and 944 the corresponding links), and the server layer resources (nodes CN1, 945 CN2, CN3, and CN4 and the corresponding links). Additionally, the 946 architecture introduces an intermediary layer called the abstraction 947 layer. The abstraction layer contains the client layer edge nodes 948 (C2 and C3), the server layer edge nodes (CN1 and CN4), the client- 949 server links (C2-CN1 and CN4-C3) and the abstract link CN1-CN4. 951 The client layer network is able to operate as normal. Connectivity 952 across the network can either be found or not found based on links 953 that appear in the client layer TED. If connectivity cannot be 954 found, end-to-end LSPs cannot be set up. This failure may be 955 reported but no dynamic action is taken by the client layer. 957 The server network layer also operates as normal. LSPs across the 958 server layer between client edges are set up in response to 959 management commands or in response to signaling requests. 961 The abstraction layer consists of the physical links between the 962 two networks, and also the abstract links. The abstract links are 963 created by the server network according to local policy and represent 964 the potential connectivity that could be created across the server 965 network and which the server network is willing to make available for 966 use by the client network. 
Thus, in this example, the diameter of 967 the abstraction layer network is only three hops, but an instance of 968 an IGP could easily be run so that all nodes participating in the 969 abstraction layer (and in particular the client network edge nodes) 970 can see the TE connectivity in the layer. 972 -- -- -- -- 973 |C1|--|C2| |C3|--|C4| Client Network 974 -- | | | | -- 975 | | | | . . . . . . . . . . . 976 | | | | 977 | | | | 978 | | --- --- | | Abstraction 979 | |---|CN1|================|CN4|---| | Layer Network 980 -- | | | | -- 981 | | | | . . . . . . . . . . . . . . 982 | | | | 983 | | | | 984 | | --- --- | | Server Network 985 | |--|CN2|--|CN3|--| | 986 --- --- --- --- 988 Key 989 --- Direct connection between two nodes 990 === Abstract link 992 Figure 7 : Architecture for Abstraction Layer Network 994 When the client layer needs additional connectivity it can make a 995 request to the abstraction layer network. For example, the operator 996 of the client network may want to create a link from C2 to C3. The 997 abstraction layer can see the potential path C2-CN1-CN4-C3 and can 998 set up an LSP C2-CN1-CN4-C3 across the server network and make the 999 LSP available as a link in the client network. 1001 Sections 4.2.3 and 4.2.4 show how this model is used to satisfy the 1002 requirements for connectivity in client-server networks and in peer 1003 networks. 1005 4.2.2.1. Nodes in the Abstraction Layer Network 1007 Figure 7 shows a very simplified network diagram and the reader would 1008 be forgiven for thinking that only client network edge nodes and 1009 server network edge nodes may appear in the abstraction layer 1010 network. But this is not the case: other nodes from the server 1011 network may be present. This allows the abstraction layer network 1012 to be more complex than a full mesh with access spokes. 
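The potential path C2-CN1-CN4-C3 noted above can be found by a simple search over the abstraction layer topology of Figure 7, standing in for the IGP and path computation that would actually run over that layer. The node names follow the figure; the search routine itself is an illustrative sketch.

```python
from collections import deque

def find_path(links, src, dst):
    """Breadth-first search over an undirected link list; returns a
    node list or None.  A stand-in for the IGP dissemination and path
    computation that would run over the abstraction layer network."""
    adj = {}
    for a, b in links:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    seen, queue = {src}, deque([[src]])
    while queue:
        path = queue.popleft()
        if path[-1] == dst:
            return path
        for nxt in adj.get(path[-1], ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None

# Abstraction layer of Figure 7: the client-server links plus the
# abstract link CN1-CN4 offered by the server network.
abstraction_layer = [("C2", "CN1"), ("CN1", "CN4"), ("CN4", "C3")]
```

Additional abstract links or exposed transit nodes (as in Figure 8) simply become extra entries in the link list, enriching the paths the search can find.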
1014 Thus, as shown in Figure 8, a transit node in the server network 1015 (here the node is CN3) can be exposed as a node in the abstraction 1016 layer network with abstract links connecting it to other nodes in 1017 the abstraction layer network. Of course, in the network shown in 1018 Figure 8, there is little if any value in exposing CN3, but if it 1019 had other abstract links to other nodes in the abstraction layer 1020 network and/or direct connections to client network nodes, then the 1021 resulting network would be richer. 1023 -- -- -- -- Client 1024 |C1|--|C2| |C3|--|C4| Network 1025 -- | | | | -- 1026 | | | | . . . . . . . . . 1027 | | | | 1028 | | | | 1029 | | --- --- --- | | Abstraction 1030 | |--|CN1|========|CN3|========|CN5|--| | Layer Network 1031 -- | | | | | | -- 1032 | | | | | | . . . . . . . . . . . . 1033 | | | | | | 1034 | | | | | | Server 1035 | | --- | | --- | | Network 1036 | |--|CN2|-| |-|CN4|--| | 1037 --- --- --- --- --- 1039 Figure 8 : Abstraction Layer Network with Additional Node 1041 It should be noted that the nodes included in the abstraction layer 1042 network in this way are not "abstract nodes" in the sense of a 1043 virtual node described in Section 3.5. While it is the case that 1044 the policy point responsible for advertising server network resources 1045 into the abstraction layer network could choose to advertise abstract 1046 nodes in place of real physical nodes, it is believed that doing so 1047 would introduce significant complexity in terms of: 1049 - Coordination between all of the external interfaces of the abstract 1050 node 1052 - Management of changes in the server network that lead to limited 1053 capabilities to reach (cross-connect) across the Abstract Node. It 1054 may be noted that recent work on limited cross-connect capabilities 1055 such as exist in asymmetrical switches could be used to represent 1056 the limitations in an abstract node [RFC7579], [RFC7580]. 1058 4.2.3. 
Abstraction in Client-Server Networks 1060 Figure 9 shows the basic architectural concepts for a client-server 1061 network. The client network nodes are C1, C2, CE1, CE2, C3, and C4. 1062 The core network nodes are CN1, CN2, CN3, and CN4. The interfaces 1063 CE1-CN1 and CE2-CN4 are the interfaces between the client and core 1064 networks. 1066 The technologies (switching capabilities) of the client and server 1067 networks may be the same or different. If they are different, the 1068 client layer traffic must be tunneled over a server layer LSP. If 1069 they are the same, the client LSP may be routed over the server layer 1070 links, tunneled over a server layer LSP, or constructed from the 1071 concatenation (stitching) of client layer and server layer LSP 1072 segments. 1074 : : 1075 Client Network : Core Network : Client Network 1076 : : 1077 -- -- --- --- -- -- 1078 |C1|--|C2|--|CE1|................................|CE2|--|C3|--|C4| 1079 -- -- | | --- --- | | -- -- 1080 | |===|CN1|================|CN4|===| | 1081 | |---| | | |---| | 1082 --- | | --- --- | | --- 1083 | |--|CN2|--|CN3|--| | 1084 --- --- --- --- 1086 Key 1087 --- Direct connection between two nodes 1088 ... CE-to-CE LSP tunnel 1089 === Potential path across the core (abstract link) 1091 Figure 9 : Architecture for Client-Server Network 1093 The objective is to be able to support an end-to-end connection, 1094 C1-to-C4, in the client network. This connection may support TE or 1095 normal IP forwarding. To achieve this, CE1 is to be connected to CE2 1096 by a link in the client layer. This enables the client network to 1097 view itself as connected and to select an end-to-end path. 1099 As shown in the figure, three abstraction layer links are formed: 1100 CE1-CN1, CN1-CN4, and CN4-CE2. A three-hop LSP is then established 1101 from CE1 to CE2 that can be presented as a link in the client layer.
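Presenting the CE1-to-CE2 LSP as a single TE link in the client layer amounts to collapsing the LSP's path to its end points while deriving the link's attributes from the hops traversed. The sketch below assumes additive delay and bottleneck (minimum) bandwidth; both are modeling assumptions for the example, and the attribute values are invented.

```python
def lsp_to_client_link(lsp_hops, link_attrs):
    """Collapse a server-network LSP into a single client-layer TE link.

    lsp_hops:   node sequence of the LSP, e.g. ["CE1", "CN1", "CN4", "CE2"]
    link_attrs: per-hop attributes keyed by (a, b) node pair

    Delay is assumed additive along the path; bandwidth is taken as the
    bottleneck (minimum) across the hops.  Both are assumptions made
    for this sketch, not rules stated by the architecture.
    """
    hops = list(zip(lsp_hops, lsp_hops[1:]))
    delay = sum(link_attrs[h]["delay_ms"] for h in hops)
    bandwidth = min(link_attrs[h]["bandwidth"] for h in hops)
    return {"src": lsp_hops[0], "dst": lsp_hops[-1],
            "delay_ms": delay, "bandwidth": bandwidth}
```

The resulting record is what the client layer would install in its TED as an ordinary link, allowing it to view itself as connected.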
1103 The practicalities of how the CE1-CE2 LSP is carried across the core 1104 LSP may depend on the switching and signaling options available in 1105 the core network. The LSP may be tunneled down the core LSP using 1106 the mechanisms of a hierarchical LSP [RFC4206], or the LSP segments 1107 CE1-CN1 and CN4-CE2 may be stitched to the core LSP as described in 1108 [RFC5150]. 1110 Section 4.2.2 has already introduced the concept of the abstraction 1111 layer network through an example of a simple layered network. But it 1112 may be helpful to expand on the example using a slightly more complex 1113 network. 1115 Figure 10 shows a multi-layer network comprising client nodes 1116 (labeled as Cn for n = 0 to 9) and server nodes (labeled as Sn for 1117 n = 1 to 9). 1119 -- -- 1120 |C3|---|C4| 1121 /-- --\ 1122 -- -- -- -- --/ \-- 1123 |C1|---|C2|---|S1|---|S2|----|S3| |C5| 1124 -- /-- --\ --\ --\ /-- 1125 / \-- \-- \-- --/ -- 1126 / |S4| |S5|----|S6|---|C6|---|C7| 1127 / /-- --\ /-- /-- -- 1128 --/ -- --/ -- \--/ --/ 1129 |C8|---|C9|---|S7|---|S8|----|S9|---|C0| 1130 -- -- -- -- -- -- 1132 Figure 10 : An example Multi-Layer Network 1134 If the network in Figure 10 is operated as separate client and server 1135 networks then the client layer topology will appear as shown in 1136 Figure 11. As can be clearly seen, the network is partitioned and 1137 there is no way to set up an LSP from a node on the left hand side 1138 (say C1) to a node on the right hand side (say C7). 1140 -- -- 1141 |C3|---|C4| 1142 -- --\ 1143 -- -- \-- 1144 |C1|---|C2| |C5| 1145 -- /-- /-- 1146 / --/ -- 1147 / |C6|---|C7| 1148 / /-- -- 1149 --/ -- --/ 1150 |C8|---|C9| |C0| 1151 -- -- -- 1153 Figure 11 : Client Layer Topology Showing Partitioned Network 1155 For reference, Figure 12 shows the corresponding server layer 1156 topology.
1158 -- -- -- 1159 |S1|---|S2|----|S3| 1160 --\ --\ --\ 1161 \-- \-- \-- 1162 |S4| |S5|----|S6| 1163 /-- --\ /-- 1164 --/ -- \--/ 1165 |S7|---|S8|----|S9| 1166 -- -- -- 1168 Figure 12 : Server Layer Topology 1170 Operating on the TED for the server layer, a management entity or a 1171 software component may apply policy and consider what abstract links 1172 it might offer for use by the client layer. To do this it obviously 1173 needs to be aware of the connections between the layers (there is no 1174 point in offering an abstract link S2-S8 since this could not be of 1175 any use in this example). 1177 In our example, after consideration of which LSPs could be set up in 1178 the server layer, four abstract links are offered: S1-S3, S3-S6, 1179 S1-S9, and S7-S9. These abstract links are shown as double lines on 1180 the resulting topology of the abstraction layer network in Figure 13. 1181 As can be seen, two of the links must share part of a path (S1-S9 1182 must share with either S1-S3 or with S7-S9). This could be achieved 1183 using distinct resources (for example, separate lambdas) where the 1184 paths are common, but it could also be done using resource sharing. 1186 That would mean that when both paths S1-S3 and S7-S9 carry client- 1187 edge to client-edge LSPs the resources on the path S1-S9 are used and 1188 might be depleted to the point that the path is resource constrained 1189 and cannot be used. 1191 -- 1192 |C3| 1193 /-- 1194 -- -- --/ 1195 |C2|---|S1|==========|S3| 1196 -- --\\ --\\ 1197 \\ \\ 1198 \\ \\-- -- 1199 \\ |S6|---|C6| 1200 \\ -- -- 1201 -- -- \\-- -- 1202 |C9|---|S7|=====|S9|---|C0| 1203 -- -- -- -- 1205 Figure 13 : Abstraction Layer Network with Abstract Links 1207 The separate IGP instance running in the abstraction layer network 1208 means that this topology is visible at the edge nodes (C2, C3, C6, 1209 C9, and C0) as well as at a PCE if one is present. 
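The observation above that an abstract link such as S2-S8 would be of no use (neither end point has an attachment to the client layer) suggests a simple pruning rule when selecting which abstract links to offer. The attachment set below follows the Figure 10 example; the function name is illustrative.

```python
def useful_abstract_links(candidates, attached):
    """Keep only candidate abstract links whose end points both have an
    attachment to the client layer.  Anything else (e.g., S2-S8 in the
    example) cannot carry client-edge to client-edge traffic and need
    not be advertised."""
    return [(a, b) for a, b in candidates
            if a in attached and b in attached]

# Server nodes with client-layer attachments in the Figure 10 example.
attached = {"S1", "S3", "S6", "S7", "S9"}
candidates = [("S1", "S3"), ("S3", "S6"), ("S1", "S9"),
              ("S7", "S9"), ("S2", "S8")]
```

In practice this pruning would be one step of the policy applied by the management entity operating on the server layer TED, ahead of any commercial or confidentiality filtering.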
1211 Now the client layer is able to make requests to the abstraction 1212 layer network to provide connectivity. In our example, it requests 1213 that C2 is connected to C3 and that C2 is connected to C0. This 1214 results in several actions: 1216 1. The management component for the abstraction layer network asks 1217 its PCE to compute the paths necessary to make the connections. 1218 This yields C2-S1-S3-C3 and C2-S1-S9-C0. 1220 2. The management component for the abstraction layer network 1221 instructs C2 to start the signaling process for the new LSPs in 1222 the abstraction layer. 1224 3. C2 signals the LSPs for setup using the explicit routes 1225 C2-S1-S3-C3 and C2-S1-S9-C0. 1227 4. When the signaling messages reach S1 (in our example, both LSPs 1228 traverse S1) the server layer network may support them by a 1229 number of means including establishing server layer LSPs as 1230 tunnels depending on the mismatch of technologies between the 1231 client and server networks. For example, S1-S2-S3 and S1-S2-S5-S9 1232 might be traversed via an LSP tunnel, using LSPs stitched 1233 together, or simply by routing the client layer LSP through the 1234 server network. If server layer LSPs are needed, they can be 1235 signaled at this point. 1237 5. Once any server layer LSPs that are needed have been established, 1238 S1 can continue to signal the client-edge to client-edge LSP 1239 across the abstraction layer either using the server layer LSPs as 1240 tunnels or as stitching segments, or simply routing through the 1241 server layer network. 1243 -- -- 1244 |C3|-|C4| 1245 /-- --\ 1246 / \-- 1247 -- --/ |C5| 1248 |C1|---|C2| /-- 1249 -- /--\ --/ -- 1250 / \ |C6|---|C7| 1251 / \ /-- -- 1252 / \--/ 1253 --/ -- |C0| 1254 |C8|---|C9| -- 1255 -- -- 1257 Figure 14 : Connected Client Layer Network with Additional Links 1259 6.
Finally, once the client-edge to client-edge LSPs have been set 1260 up, the client layer can be informed and can start to advertise 1261 the new TE links C2-C3 and C2-C0. The resulting client layer 1262 topology is shown in Figure 14. 1264 7. Now the client layer can compute an end-to-end path from C1 to C7. 1266 4.2.3.1 A Server with Multiple Clients 1268 A single server network may support multiple client networks. This 1269 is not an uncommon state of affairs, for example, when the server 1270 network provides connectivity for multiple customers. 1272 In this case, the abstraction provided by the server layer may vary 1273 considerably according to the policies and commercial relationships 1274 with each customer. This variance would lead to a separate 1275 abstraction layer network maintained to support each client network. 1277 On the other hand, it may be that multiple clients are subject to the 1278 same policies and the abstraction can be identical. In this case, a 1279 single abstraction layer network can support more than one client. 1281 The choices here are made as an operational issue by the server layer 1282 network. 1284 4.2.3.2 A Client with Multiple Servers 1286 A single client network may be supported by multiple server networks. 1287 The server networks may provide connectivity between different parts 1288 of the client network or may provide parallel (redundant) 1289 connectivity for the client network. 1291 In this case, the abstraction layer network should contain the 1292 abstract links from all server networks so that it can make suitable 1293 computations and create the correct TE links in the client network. 1294 That is, the relationship between client network and abstraction 1295 layer network should be one-to-one. 1297 4.2.4. Abstraction in Peer Networks 1299 Figure 15 shows the basic architectural concepts for connecting 1300 across peer networks.
Nodes from four networks are shown: A1 and A2 1301 come from one network; B1, B2, and B3 from another network; etc. The 1302 interfaces between the networks (sometimes known as External Network- 1303 to-Network Interfaces - ENNIs) are A2-B1, B3-C1, and C3-D1. 1305 The objective is to be able to support an end-to-end connection A1- 1306 to-D2. This connection is for TE connectivity. 1308 As shown in the figure, abstract links that span the transit networks 1309 are used to achieve the required connectivity. These links form the 1310 key building blocks of the end-to-end connectivity. An end-to-end 1311 LSP uses these links as part of its path. If the stitching 1312 capabilities of the networks are homogeneous then the end-to-end LSP 1314 : : : 1315 Network A : Network B : Network C : Network D 1316 : : : 1317 -- -- -- -- -- -- -- -- -- -- 1318 |A1|--|A2|---|B1|--|B2|--|B3|---|C1|--|C2|--|C3|---|D1|--|D2| 1319 -- -- | | -- | | | | -- | | -- -- 1320 | |========| | | |========| | 1321 -- -- -- -- 1323 Key 1324 --- Direct connection between two nodes 1325 === Abstract link across transit network 1327 Figure 15 : Architecture for Peering 1329 may simply traverse the path defined by the abstract links across the 1330 various peer networks or may utilize stitching of LSP segments that 1331 each traverse a network along the path of an abstract link. If the 1332 network switching technologies support or necessitate the use of LSP 1333 hierarchies, the end-to-end LSP may be tunneled across each network 1334 using hierarchical LSPs that each traverse a network along the 1335 path of an abstract link. 1337 Peer networks exist in many situations in the Internet. Packet 1338 networks may peer as IGP areas (levels) or as ASes. Transport 1339 networks (such as optical networks) may peer to provide 1340 concatenations of optical paths through single vendor environments 1341 (see Section 6).
Figure 16 shows a simple example of three peer 1342 networks (A, B, and C) each comprising a few nodes. 1344 Network A : Network B : Network C 1345 : : 1346 -- -- -- : -- -- -- : -- -- 1347 |A1|---|A2|----|A3|---|B1|---|B2|---|B3|---|C1|---|C2| 1348 -- --\ /-- : -- /--\ -- : -- -- 1349 \--/ : / \ : 1350 |A4| : / \ : 1351 --\ : / \ : 1352 -- \-- : --/ \-- : -- -- 1353 |A5|---|A6|---|B4|----------|B6|---|C3|---|C4| 1354 -- -- : -- -- : -- -- 1355 : : 1356 : : 1358 Figure 16 : A Network Comprising Three Peer Networks 1360 As discussed in Section 2, peered networks do not share visibility of 1361 their topologies or TE capabilities for scaling and confidentiality 1362 reasons. That means, in our example, that computing a path from A1 1363 to C4 can be impossible without the aid of cooperating PCEs or some 1364 form of crankback. 1366 But it is possible to produce abstract links for reachability across 1367 transit peer networks and to create an abstraction layer network. 1368 That network can be enhanced with specific reachability information 1369 if a destination network is partitioned as is the case with Network C 1370 in Figure 16. 1372 Suppose Network B decides to offer three abstract links B1-B3, B4-B3, 1373 and B4-B6. The abstraction layer network could then be constructed 1374 to look like the network in Figure 17. 1376 -- -- -- -- 1377 |A3|---|B1|====|B3|----|C1| 1378 -- -- //-- -- 1379 // 1380 // 1381 // 1382 -- --// -- -- 1383 |A6|---|B4|=====|B6|---|C3| 1384 -- -- -- -- 1386 Figure 17 : Abstraction Layer Network for the Peer Network Example 1388 Using a process similar to that described in Section 4.2.3, Network A 1389 can request connectivity to Network C and abstract links can be 1390 advertised that connect the edges of the two networks and that can be 1391 used to carry LSPs that traverse both networks.
Furthermore, if 1392 Network C is partitioned, reachability information can be exchanged 1393 to allow Network A to select the correct abstract link as shown in 1394 Figure 18. 1396 Network A : Network C 1397 : 1398 -- -- -- : -- -- 1399 |A1|---|A2|----|A3|=========|C1|.....|C2| 1400 -- --\ /-- : -- -- 1401 \--/ : 1402 |A4| : 1403 --\ : 1404 -- \-- : -- -- 1405 |A5|---|A6|=========|C3|.....|C4| 1406 -- -- : -- -- 1408 Figure 18 : Tunnel Connections to Network C with TE Reachability 1410 Peer networking cases can be made far more complex by dual homing 1411 between network peering nodes (for example, A3 might connect to B1 1412 and B4 in Figure 17) and by the networks themselves being arranged in 1413 a mesh (for example, A6 might connect to B4 and C1 in Figure 17). 1415 These additional complexities can be handled gracefully by the 1416 abstraction layer network model. 1418 Further examples of abstraction in peer networks can be found in 1419 Sections 6 and 8. 1421 4.3. Considerations for Dynamic Abstraction 1423 It is possible to consider a highly dynamic system where the server 1424 network adaptively suggests new abstract links into the abstraction 1425 layer, and where the abstraction layer proactively deploys new 1426 client-edge to client-edge LSPs to provide new links in the client 1427 network. Such fluidity is, however, to be treated with caution, 1428 especially in the case of client-server networks of differing 1429 technologies where hierarchical server layer LSPs are used, because of 1430 the longer turn-up times of connections in some server networks, 1431 and because the server networks are likely to be sparsely connected, with 1432 expensive physical resources deployed only where there is 1433 believed to be a need for them.
More significantly, the complex 1434 commercial, policy, and administrative relationships that may exist 1435 between client and server network operators mean that stability is 1436 more likely to be the desired operational practice. 1438 Thus, proposals for fully automated multi-layer networks based on 1439 this architecture may be regarded as forward-looking topics for 1440 research both in terms of network stability and with regard to 1441 economic impact. 1443 However, some elements of automation should not be discarded. A 1444 server network may automatically apply policy to determine the best 1445 set of abstract links to offer and the most suitable way for the 1446 server network to support them. And a client network may dynamically 1447 observe congestion, lack of connectivity, or predicted changes in 1448 traffic demand, and may use this information to request additional 1449 links from the abstraction layer. And, once policies have been 1450 configured, the whole system should be able to operate autonomously 1451 without operator control (which is not to say that the operator will not have 1452 the option of exerting control at every step in the process). 1454 4.4. Requirements for Advertising Links and Nodes 1456 The abstraction layer network is "just another network layer". The 1457 links and nodes in the network need to be advertised along with their 1458 associated TE information (metrics, bandwidth, etc.) so that the 1459 topology is disseminated and so that routing decisions can be made. 1461 This requires a routing protocol running between the nodes in the 1462 abstraction layer network. Note that this routing information 1463 exchange could be piggy-backed on an existing routing protocol 1464 instance, or use a new instance (or even a new protocol). Clearly, 1465 the information exchanged is only that which has been created as 1466 part of the abstraction function according to policy.
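As a rough illustration of that policy-driven filtering, the following sketch shows a server network selecting which candidate abstract links to expose to the abstraction layer routing instance. All names, fields, and the bandwidth threshold are invented for this sketch; nothing here is mandated by the architecture.

```python
# Hypothetical sketch: policy-based selection of abstract links for
# advertisement into the abstraction layer network. Field names and the
# bandwidth policy threshold are illustrative assumptions only.

from dataclasses import dataclass

@dataclass
class CandidateLink:
    src: str            # server network edge node
    dst: str            # server network edge node
    bandwidth: int      # offerable bandwidth (arbitrary units)
    te_metric: int      # TE metric to advertise
    vendor_detail: str  # internal data that must not leak across layers

def links_to_advertise(candidates, min_bandwidth):
    """Apply local policy: advertise only links meeting the bandwidth
    policy, and expose only the agreed subset of TE information."""
    advertised = []
    for link in candidates:
        if link.bandwidth >= min_bandwidth:
            # The vendor-specific detail is deliberately not exported.
            advertised.append({"src": link.src, "dst": link.dst,
                               "bandwidth": link.bandwidth,
                               "te-metric": link.te_metric})
    return advertised

candidates = [CandidateLink("B1", "B3", 100, 10, "internal"),
              CandidateLink("B4", "B3", 3, 20, "internal"),
              CandidateLink("B4", "B6", 40, 15, "internal")]
print(links_to_advertise(candidates, min_bandwidth=10))
```

The point of the sketch is only that the exported view is a policy-filtered subset of the server network's own TE database, not a copy of it.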
1468 It should be noted that in many cases the abstract link represents the 1469 potential for connectivity across the server network but that no such 1470 connectivity yet exists. In this case, we may ponder how the routing 1471 protocol in the abstraction layer will advertise topology information 1472 for and over a link that has no underlying connectivity. In other 1473 words, there must be a communication channel between the abstraction 1474 layer nodes so that the routing protocol messages can flow. The 1475 answer is that control plane connectivity already exists in the 1476 server network and on the client-server edge links, and this can be 1477 used to carry the routing protocol messages for the abstraction layer 1478 network. The same consideration applies to the advertisement, in the 1479 client network, of the potential connectivity that the abstraction 1480 layer network can provide, although it may be more normal to establish 1481 that connectivity before advertising a link in the client network. 1483 4.5. Addressing Considerations 1485 The network layers in this architecture should be able to operate 1486 with separate address spaces, and these may overlap without any 1487 technical issues. That is, one address may mean one thing in the 1488 client network, yet the same address may have a different meaning in 1489 the abstraction layer network or the server network. In other words, 1490 there is complete address separation between networks. 1492 However, this will require some care both because human operators may 1493 well become confused, and because mapping between address spaces is 1494 needed at the interfaces between the network layers.
That mapping 1495 requires configuration so that, for example, when the server network 1496 announces an abstract link from A to B, the abstraction layer network 1497 must recognize that A and B are server network addresses and must map 1498 them to abstraction layer addresses (say P and Q) before including 1499 the link in its own topology. And similarly, when the abstraction 1500 layer network informs the client network that a new link is available 1501 from S to T, it must map those addresses from its own address space 1502 to that of the client network. 1504 This form of address mapping will become particularly important in 1505 cases where one abstraction layer network is constructed from 1506 connectivity in multiple server layer networks, or where one 1507 abstraction layer network provides connectivity for multiple client 1508 networks. 1510 5. Building on Existing Protocols 1512 This section is not intended to prejudge a solutions framework or any 1513 applicability work. It does, however, very briefly serve to note the 1514 existence of protocols that could be examined for applicability to 1515 serve in realizing the model described in this document. 1517 The general principle of protocol re-use is preferred over the 1518 invention of new protocols or additional protocol extensions, and it 1519 would be advantageous to make use of an existing protocol that is 1520 commonly implemented on network nodes and is currently deployed, or 1521 to use existing computational elements such as Path Computation 1522 Elements (PCEs). This has many benefits in network stability, time 1523 to deployment, and operator training. 1525 It is recognized, however, that existing protocols are unlikely to be 1526 immediately suitable to this problem space without some protocol 1527 extensions. Extending protocols must be done with care and with 1528 consideration for the stability of existing deployments. 
In extreme 1529 cases, a new protocol can be preferable to a messy hack of an 1530 existing protocol. 1532 5.1. BGP-LS 1534 BGP-LS is a set of extensions to BGP described in [RFC7752]. Its 1535 purpose is to announce topology information from one network to a 1536 "north-bound" consumer. Application of BGP-LS to date has focused on 1537 a mechanism to build a TED for a PCE. However, BGP's mechanisms 1538 would also serve well to advertise abstract links from a server 1539 network into the abstraction layer network, or to advertise potential 1540 connectivity from the abstraction layer network to the client 1541 network. 1543 5.2. IGPs 1545 Both OSPF and IS-IS have been extended through a number of RFCs to 1546 advertise TE information. Additionally, both protocols are capable 1547 of running in a multi-instance mode either as ships that pass in the 1548 night (i.e., completely separate instances using different addresses) 1549 or as dual instances on the same address space. This means that 1550 either IGP could probably be used as the routing protocol in the 1551 abstraction layer network. 1553 5.3. RSVP-TE 1555 RSVP-TE signaling can be used to set up all traffic engineered LSPs 1556 demanded by this model without the need for any protocol extensions. 1558 If necessary, LSP hierarchy [RFC4206] or LSP stitching [RFC5150] can 1559 be used to carry LSPs over the server layer network, again without 1560 needing any protocol extensions. 1562 Furthermore, the procedures in [RFC6107] allow the dynamic signaling 1563 of the purpose of any LSP that is established. This means that 1564 when an LSP tunnel is set up, the two ends can coordinate into which 1565 routing protocol instance it should be advertised, and can also agree 1566 on the addressing to be used to identify the link that will be 1567 created. 1569 5.4.
Notes on a Solution 1571 This section is not intended to be prescriptive or dictate the 1572 protocol solutions that may be used to satisfy the architecture 1573 described in this document, but it does show how the existing 1574 protocols listed in the previous sections can be combined to provide 1575 a solution with only minor modifications. 1577 A server network can be operated using GMPLS routing and signaling 1578 protocols. Using information gathered from the routing protocol, a 1579 TED can be constructed containing resource availability information 1580 and Shared Risk Link Group (SRLG) details. A policy-based process 1581 can then determine which nodes and abstract links it wishes to 1582 advertise to form the abstraction layer network. 1584 The server network can now use BGP-LS to advertise a topology of 1585 links and nodes to form the abstraction layer network. This 1586 information would most likely be advertised from a single point of 1587 control that makes all of the abstraction decisions, but the function 1588 could be distributed to multiple server network edge nodes. The 1589 information can be advertised by BGP-LS to multiple points within the 1590 abstraction layer (such as all client network edge nodes) or to a 1591 single controller. 1593 Multiple server networks may advertise information that is used to 1594 construct an abstraction layer network, and one server network may 1595 advertise different information in different instances of BGP-LS to 1596 form different abstraction layer networks. Furthermore, in the case 1597 of one controller constructing multiple abstraction layer networks, 1598 BGP-LS uses the route target mechanism defined in [RFC4364] to 1599 distinguish the different applications (effectively abstraction layer 1600 network VPNs) of the exported information.
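The route target separation just described might be pictured with a small sketch. The route target strings and link names below are purely illustrative assumptions; real BGP-LS advertisements carry this information in protocol-specific attributes rather than Python dictionaries.

```python
# Illustrative sketch: route targets keep advertisements for different
# abstraction layer networks apart when one controller exports several
# of them (the RFC 4364 mechanism referenced in the text). All strings
# and link names are invented for this example.

def import_for(advertisements, route_target):
    """A consumer imports only the advertisements tagged with its target."""
    return [adv["link"] for adv in advertisements if route_target in adv["rts"]]

exported = [
    {"link": ("PE1", "PE2"), "rts": {"target:65000:1"}},
    {"link": ("PE3", "PE4"), "rts": {"target:65000:2"}},
    # A link may be exported to more than one abstraction layer network.
    {"link": ("PE1", "PE4"), "rts": {"target:65000:1", "target:65000:2"}},
]

print(import_for(exported, "target:65000:1"))
print(import_for(exported, "target:65000:2"))
```

Each abstraction layer network thus sees only its own "VPN" of the exported topology, even though a single controller produced all of the advertisements.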
1602 Extensions may be made to BGP-LS to allow advertisement of Macro 1603 Shared Risk Link Groups (MSRLGs) per Appendix B, mutually exclusive 1604 links, and to indicate whether the abstract link has been pre- 1605 established or not. Such extensions are valid options, but do not 1606 form a core component of this architecture. 1608 The abstraction layer network may operate under central control or 1609 use a distributed control plane. Since the links may be a 1610 mix of physical and abstract links, and since the nodes may have 1611 diverse cross-connect capabilities, it is most likely that a GMPLS 1612 routing protocol will be beneficial for collecting and correlating 1613 the routing information and for distributing updates. No special 1614 additional features are needed beyond adding those extra parameters 1615 just described for BGP-LS, but it should be noted that the control 1616 plane of the abstraction layer network must run in an out-of-band 1617 control network because the data-bearing links might not yet have 1618 been established via connections in the server layer network. 1620 The abstraction layer network is also able to determine potential 1621 connectivity from client network edge to client network edge. It 1622 will determine which client network links to create according to 1623 policy and subject to requests from the client network, and will 1624 take four steps: 1626 - First, it will compute a path across the abstraction layer 1627 network. 1628 - Then, if the support of the abstract links requires the use of 1629 server layer LSPs for tunneling or stitching, and if those LSPs are 1630 not already established, it will ask the server layer to set them 1631 up. 1632 - Then, it will signal the client-edge to client-edge LSP. 1633 - Finally, the abstraction layer network will inform the client 1634 network of the existence of the new client network link.
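The four steps above can be sketched in outline as follows. Every name is an illustrative stand-in for the protocol machinery described in this document (path computation, server layer LSP setup, signaling, and client advertisement), not a prescribed interface.

```python
# Schematic outline of the four-step procedure for creating a new client
# network link across the abstraction layer. All names are assumptions
# for illustration; each stub stands in for real protocol machinery.

def create_client_network_link(src_edge, dst_edge, lsps_established):
    actions = []
    # Step 1: compute a path across the abstraction layer network.
    path = [src_edge, dst_edge]  # stand-in for constraint-based computation
    actions.append("computed path")
    # Step 2: if the abstract links need server layer LSPs (for tunneling
    # or stitching) that are not already in place, request their setup.
    if not lsps_established:
        actions.append("requested server layer LSPs")
    # Step 3: signal the client-edge to client-edge LSP.
    actions.append("signaled edge-to-edge LSP")
    # Step 4: inform the client network of the new client network link.
    actions.append("advertised link to client network")
    return path, actions

path, actions = create_client_network_link("A3", "C1", lsps_established=False)
print(actions)
```

Note that the second step is conditional: when the server layer LSPs supporting the abstract links already exist, only three of the four actions are taken.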
1636 This last step can be achieved either by coordination of the end 1637 points of the LSPs that span the abstraction layer (these points are 1638 client network edge nodes) using mechanisms such as those described 1639 in [RFC6107], or using BGP-LS from a central controller. 1641 Once the client network edge nodes are aware of a new link, they will 1642 automatically advertise it using their routing protocol and it will 1643 become available for use by traffic in the client network. 1645 Sections 6, 7, and 8 discuss the applicability of this architecture 1646 to different network types and problem spaces, while Section 9 gives 1647 some advice about scoping future work. Section 10 on manageability 1648 considerations is particularly relevant in the context of this 1649 section because it contains a discussion of the policies and 1650 mechanisms for indicating connectivity and link availability between 1651 network layers in this architecture. 1653 6. Applicability to Optical Domains and Networks 1655 Many optical networks are arranged as a set of small domains. Each 1656 domain is a cluster of nodes, usually from the same equipment vendor 1657 and with the same properties. The domain may be constructed as a 1658 mesh or a ring, or perhaps as an interconnected set of rings. 1660 The network operator seeks to provide end-to-end connectivity across 1661 a network constructed from multiple domains, and so (of course) the 1662 domains are interconnected. In a network under management control 1663 such as through an Operations Support System (OSS), each domain is 1664 under the operational control of a Network Management System (NMS). 1666 In this way, an end-to-end path may be commissioned by the OSS 1667 instructing each NMS, and the NMSes setting up the path fragments 1668 across the domains. 1670 However, in a system that uses a control plane, there is a need for 1671 integration between the domains. 1673 Consider a simple domain, D1, as shown in Figure 19.
In this case, 1674 the nodes A through F are arranged in a topological ring. Suppose 1675 that there is a control plane in use in this domain, and that OSPF is 1676 used as the TE routing protocol. 1678 ----------------- 1679 | D1 | 1680 | B---C | 1681 | / \ | 1682 | / \ | 1683 | A D | 1684 | \ / | 1685 | \ / | 1686 | F---E | 1687 | | 1688 ----------------- 1690 Figure 19 : A Simple Optical Domain 1692 Now consider that the operator's network is built from a mesh of such 1693 domains, D1 through D7, as shown in Figure 20. It is possible that 1695 ------ ------ ------ ------ 1696 | | | | | | | | 1697 | D1 |---| D2 |---| D3 |---| D4 | 1698 | | | | | | | | 1699 ------\ ------\ ------\ ------ 1700 \ | \ | \ | 1701 \------ \------ \------ 1702 | | | | | | 1703 | D5 |---| D6 |---| D7 | 1704 | | | | | | 1705 ------ ------ ------ 1707 Figure 20 : A Mesh of Simple Optical Domains 1709 these domains share a single, common instance of OSPF, in which case 1710 there is nothing further to say because that OSPF instance will 1711 distribute sufficient information to build a single TED spanning the 1712 whole network, and an end-to-end path can be computed. A more likely 1713 scenario is that each domain is running its own OSPF instance. In 1714 this case, each is able to handle the peculiarities (or rather, 1715 advanced functions) of each vendor's equipment capabilities. 1717 The question now is how to combine the multiple sets of information 1718 distributed by the different OSPF instances. Three possible models 1719 suggest themselves based on pre-existing routing practices. 1721 o In the first model (the Area-Based model) each domain is treated as 1722 a separate OSPF area. The end-to-end path will be specified to 1723 traverse multiple areas, and each area will be left to determine 1724 the path across the nodes in the area.
The feasibility of an end- 1725 to-end path (and, thus, the selection of the sequence of areas and 1726 their interconnections) can be derived using hierarchical PCE. 1728 This approach, however, fits poorly with established use of the 1729 OSPF area: in this form of optical network, the interconnection 1730 points between domains are likely to be links; and the mesh of 1731 domains is far more interconnected and unstructured than we are 1732 used to seeing in the normal area-based routing paradigm. 1734 Furthermore, while hierarchical PCE may be able to solve this type 1735 of network, the effort involved may be considerable for more than a 1736 small collection of domains. 1738 o Another approach (the AS-Based model) treats each domain as a 1739 separate Autonomous System (AS). The end-to-end path will be 1740 specified to traverse multiple ASes, and each AS will be left to 1741 determine the path across the AS. 1743 This model sits more comfortably with the established routing 1744 paradigm, but causes a massive escalation of ASes in the global 1745 Internet. It would, in practice, require that the operator used 1746 private AS numbers [RFC6996] of which there are plenty. 1748 Then, as suggested in the Area-Based model, hierarchical PCE 1749 could be used to determine the feasibility of an end-to-end path 1750 and to derive the sequence of domains and the points of 1751 interconnection to use. But, just as in that other model, the 1752 scalability of this model using a hierarchical PCE must be 1753 questioned given the sheer number of ASes and their 1754 interconnectivity. 1756 Furthermore, determining the mesh of domains (i.e., the inter-AS 1757 connections) conventionally requires the use of BGP as an inter- 1758 domain routing protocol. 
However, not only is BGP not normally 1759 available on optical equipment, but this approach indicates that 1760 the TE properties of the inter-domain links would need to be 1761 distributed and updated using BGP: something for which it is not 1762 well suited. 1764 o The third approach (the ASON model) follows the architectural 1765 model set out by the ITU-T [G.8080] and uses the routing protocol 1766 extensions described in [RFC6827]. In this model the concept of 1767 "levels" is introduced to OSPF. Referring back to Figure 20, each 1768 OSPF instance running in a domain would be construed as a "lower 1769 level" OSPF instance and would leak routes into a "higher level" 1770 instance of the protocol that runs across the whole network. 1772 This approach handles the awkwardness of representing the domains 1773 as areas or ASes by simply considering them as domains running 1774 distinct instances of OSPF. Routing advertisements flow "upward" 1775 from the domains to the high level OSPF instance giving it a full 1776 view of the whole network and allowing end-to-end paths to be 1777 computed. Routing advertisements may also flow "downward" from the 1778 network-wide OSPF instance to any one domain so that it has 1779 visibility of the connectivity of the whole network. 1781 While architecturally satisfying, this model suffers from having to 1782 handle the different characteristics of different equipment 1783 vendors. The advertisements coming from each low level domain 1784 would be meaningless when distributed into the other domains, and 1785 the high level domain would need to be kept up-to-date with the 1786 semantics of each new release of each vendor's equipment. 1787 Additionally, the scaling issues associated with a well-meshed 1788 network of domains each with many entry and exit points and each 1789 with network resources that are continually being updated reduces 1790 to the same problem as noted in the virtual link model. 
1791 Furthermore, in the event that the domains are under control of 1792 different administrations, the domains would not want to distribute 1793 the details of their topologies and TE resources. 1795 Practically, this third model turns out to be very close to the 1796 methodology described in this document. As noted in Section 6.1 of 1797 [RFC6827], there are policy rules that can be applied to define 1798 exactly what information is exported from or imported to a low level 1799 OSPF instance. The document even notes that some forms of 1800 aggregation may be appropriate. Thus, we can apply the following 1801 simplifications to the mechanisms defined in RFC 6827: 1803 - Zero information is imported to low level domains. 1805 - Low level domains export only abstracted links as defined in this 1806 document and according to local abstraction policy and with 1807 appropriate removal of vendor-specific information. 1809 - There is no need to formally define routing levels within OSPF. 1811 - Export of abstracted links from the domains to the network-wide 1812 routing instance (the abstraction routing layer) can take place 1813 through any mechanism including BGP-LS or direct interaction 1814 between OSPF implementations. 1816 With these simplifications, it can be seen that the framework defined 1817 in this document can be constructed from the architecture discussed 1818 in RFC 6827, but without needing any of the protocol extensions that 1819 that document defines. Thus, using the terminology and concepts 1820 already established, the problem may be solved as shown in Figure 21. 1821 The abstraction layer network is constructed from the inter-domain 1822 links, the domain border nodes, and the abstracted (cross-domain) 1823 links. 1825 Abstraction Layer 1826 -- -- -- -- -- -- 1827 | |===========| |--| |===========| |--| |===========| | 1828 | | | | | | | | | | | | 1829 ..| |...........| |..| |...........| |..| |...........| |......
1830 | | | | | | | | | | | | 1831 | | -- -- | | | | -- -- | | | | -- -- | | 1832 | |_| |_| |_| | | |_| |_| |_| | | |_| |_| |_| | 1833 | | | | | | | | | | | | | | | | | | | | | | | | 1834 -- -- -- -- -- -- -- -- -- -- -- -- 1835 Domain 1 Domain 2 Domain 3 1836 Key Optical Layer 1837 ... Layer separation 1838 --- Physical link 1839 === Abstract link 1841 Figure 21 : The Optical Network Implemented Through the 1842 Abstraction Layer Network 1844 7. Modeling the User-to-Network Interface 1846 The User-to-Network Interface (UNI) is an important architectural 1847 concept in many implementations and deployments of client-server 1848 networks, especially those where the client and server networks have 1849 different technologies. The UNI is described in [G.8080], 1850 and the GMPLS approach to the UNI is documented in [RFC4208]. Other 1851 GMPLS-related documents describe the application of GMPLS to specific 1852 UNI scenarios: for example, [RFC6005] describes how GMPLS can support 1853 a UNI that provides access to Ethernet services. 1855 Figure 1 of [RFC6005] is reproduced here as Figure 22. It shows the 1856 Ethernet UNI reference model, and that figure can serve as an example 1857 for all similar UNIs. In this case, the UNI is an interface between 1858 client network edge nodes and the server network. It should be noted 1859 that neither the client network nor the server network need be an 1860 Ethernet switching network. 1862 There are three network layers in this model: the client network, the 1863 "Ethernet service network", and the server network. The so-called 1864 Ethernet service network consists of links comprising the UNI links 1865 and the tunnels across the server network, and nodes comprising the 1866 client network edge nodes and various server nodes.
That is, the 1867 Ethernet service network is equivalent to the abstraction layer 1868 network, with the UNI links being the physical links between the 1869 client and server networks, and the client edge nodes taking the 1871 Client Client 1872 Network +----------+ +-----------+ Network 1873 -------------+ | | | | +------------- 1874 +----+ | | +-----+ | | +-----+ | | +----+ 1875 ------+ | | | | | | | | | | | | +------ 1876 ------+ EN +-+-----+--+ CN +-+----+--+ CN +--+-----+-+ EN +------ 1877 | | | +--+--| +-+-+ | | +--+-----+-+ | 1878 +----+ | | | +--+--+ | | | +--+--+ | | +----+ 1879 | | | | | | | | | | 1880 -------------+ | | | | | | | | +------------- 1881 | | | | | | | | 1882 -------------+ | | | | | | | | +------------- 1883 | | | +--+--+ | | | +--+--+ | | 1884 +----+ | | | | | | +--+--+ | | | +----+ 1885 ------+ +-+--+ | | CN +-+----+--+ CN | | | | +------ 1886 ------+ EN +-+-----+--+ | | | | +--+-----+-+ EN +------ 1887 | | | | +-----+ | | +-----+ | | | | 1888 +----+ | | | | | | +----+ 1889 | +----------+ +-----------+ | 1890 -------------+ Server Network(s) +------------- 1891 Client UNI UNI Client 1892 Network <-----> <-----> Network 1893 Scope of This Document 1895 Legend: EN - Client Edge Node 1896 CN - Server Node 1898 Figure 22 : Ethernet UNI Reference Model 1900 role of UNI Client-side (UNI-C) and the server edge nodes acting as 1901 the UNI Network-side (UNI-N) nodes. 1903 An issue that is often raised concerns how a dual-homed client edge 1904 node (such as that shown at the bottom left-hand corner of Figure 22) 1905 can make determinations about how it connects across the UNI. This 1906 can be particularly important when reachability across the server 1907 network is limited or when two diverse paths are desired (for 1908 example, to provide protection).
However, in the model described in 1909 this document, the edge node (the UNI-C) is part of the abstraction 1910 layer network and can see sufficient topology information to make 1911 these decisions. If the approach introduced in this document is used 1912 to model the UNI as described in this section, there is no need to 1913 enhance the signaling protocols at the GMPLS UNI nor to add routing 1914 exchanges at the UNI. 1916 8. Abstraction in L3VPN Multi-AS Environments 1918 Serving layer-3 VPNs (L3VPNs) across a multi-AS or multi-operator 1919 environment currently presents a significant planning challenge. 1920 Figure 6 shows the general case of the problem that needs to be 1921 solved. This section shows how the abstraction layer network can 1922 address this problem. 1924 In the VPN architecture, the CE nodes are the client network edge 1925 nodes, and the PE nodes are the server network edge nodes. The 1926 abstraction layer network is made up of the CE nodes, the CE-PE 1927 links, the PE nodes, and PE-PE tunnels that are the abstract links. 1929 In the multi-AS or multi-operator case, the abstraction layer network 1930 also includes the PEs (maybe ASBRs) at the edges of the multiple 1931 server networks, and the PE-PE (maybe inter-AS) links. This gives 1932 rise to the architecture shown in Figure 23. 1934 The policy for adding abstract links to the abstraction layer network 1935 will be driven substantially by the needs of the VPN. Thus, when a 1936 new VPN site is added and the existing abstraction layer network 1937 cannot support the required connectivity, a new abstract link will be 1938 created out of the underlying network. 1940 It is important to note that each VPN instance can have a separate 1941 abstraction layer network. This means that the server network 1942 resources can be partitioned and that traffic can be kept separate. 1943 This can be achieved even when VPN sites from different VPNs connect 1944 at the same PE.
Alternatively, multiple VPNs can share the same 1945 abstraction layer network if that is operationally preferable. 1947 Lastly, just as for the UNI discussed in Section 7, the issue of 1948 dual-homing of VPN sites is a function of the abstraction layer 1949 network and so is just a normal routing problem in that network. 1951 ........... ............. 1952 VPN Site : : VPN Site 1953 -- -- : : -- -- 1954 |C1|-|CE| : : |CE|-|C2| 1955 -- | | : : | | -- 1956 | | : : | | 1957 | | : : | | 1958 | | : : | | 1959 | | : -- -- -- -- : | | 1960 | |----|PE|=========|PE|---|PE|=====|PE|----| | 1961 -- : | | | | | | | | : -- 1962 ........... | | | | | | | | ............ 1963 | | | | | | | | 1964 | | | | | | | | 1965 | | | | | | | | 1966 | | - - | | | | - | | 1967 | |-|P|-|P|-| | | |-|P|-| | 1968 -- - - -- -- - -- 1970 Figure 23 : The Abstraction Layer Network for a Multi-AS VPN 1972 9. Scoping Future Work 1974 This section is provided to help guide the work on this problem and to 1975 ensure that oceans are not knowingly boiled. 1977 9.1. Not Solving the Internet 1979 The scope of the use cases and problem statement in this document is 1980 limited to "some small set of interconnected domains." In 1981 particular, it is not the objective of this work to turn the whole 1982 Internet into one large, interconnected TE network. 1984 9.2. Working With "Related" Domains 1986 Following on from Section 9.1, the intention of this work is to solve 1987 the TE interconnectivity for only "related" domains. Such domains 1988 may be under common administrative operation (such as IGP areas 1989 within a single AS, or ASes belonging to a single operator), or may 1990 have a direct commercial arrangement for the sharing of TE 1991 information to provide specific services. Thus, in both cases, there 1992 is a strong opportunity for the application of policy. 1994 9.3.
Not Finding Optimal Paths in All Situations 1996 As has been well described in this document, abstraction necessarily 1997 involves compromises and removal of information. That means that it 1998 is not possible to guarantee that an end-to-end path over 1999 interconnected TE domains follows the absolute optimal (by any measure 2000 of optimality) path. This is taken as understood, and future work 2001 should not attempt to achieve such paths, which can only be found by a 2002 full examination of all network information across all connected 2003 networks. 2005 9.4. Sanity and Scaling 2007 All of the above points play into a final observation. This work is 2008 intended to bite off a small problem for some relatively simple use 2009 cases as described in Section 2. It is not intended that this work 2010 will be immediately (or even soon) extended to cover many large 2011 interconnected domains. Obviously, the solution should, as far as 2012 possible, be designed to be extensible and scalable; however, it is 2013 also reasonable to make trade-offs in favor of utility and 2014 simplicity. 2016 10. Manageability Considerations 2018 Manageability should not be a significant additional burden. Each 2019 layer in the network model can and should be managed independently. 2021 That is, each client network will run its own management systems and 2022 tools to manage the nodes and links in the client network: each 2023 client network link that uses an abstract link will still be 2024 available for management in the client network as any other link. 2026 Similarly, each server network will run its own management systems 2027 and tools to manage the nodes and links in that network just as 2028 normal. 2030 Three issues remain for consideration: 2032 - How is the abstraction layer network managed? 2033 - How is the interface between the client network and the abstraction 2034 layer network managed?
2035 - How is the interface between the abstraction layer network and the 2036 server network managed? 2038 10.1. Managing the Abstraction Layer Network 2040 Management of the abstraction layer network differs from the client 2041 and server networks because not all of the links that are visible in 2042 the TED are real links. That is, it is not possible to run OAM on 2043 a link that exists only as the potential for connectivity. 2045 Other than that, however, the management should be essentially the 2046 same. Routing and signaling protocols can be run in the abstraction 2047 layer (using out-of-band channels for links that have not yet been 2048 established), and a centralized TED can be constructed and used to 2049 examine the availability and status of the links and nodes in the 2050 network. 2052 Note that different deployment models will place the "ownership" of 2053 the abstraction layer network differently. In some cases, the 2054 abstraction layer network will be constructed by the operator of the 2055 server layer and run by that operator as a service for one or more 2056 client networks. In other cases, one or more server networks will 2057 present the potential of links to an abstraction layer network run 2058 by the operator of the client network. And it is feasible that a 2059 business model could be built where a third-party operator manages 2060 the abstraction layer network, constructing it from the connectivity 2061 available in multiple server networks, and facilitating connectivity 2062 for multiple client networks. 2064 10.2. Managing Interactions of Client and Abstraction Layer Networks 2066 The interaction between the client network and the abstraction layer 2067 network is a management task. It might be automated (software 2068 driven) or it might require manual intervention. 2070 This is a two-way interaction: 2072 - The client network can express the need for additional 2073 connectivity.
For example, the client layer may try and fail to 2074 find a path across the client network and may request additional, 2075 specific connectivity (this is similar to the situation with 2076 Virtual Network Topology Manager (VNTM) [RFC5623]). Alternatively, 2077 a more proactive client layer management system may monitor traffic 2078 demands (current and predicted), network usage, and network "hot 2079 spots" and may request changes in connectivity by both releasing 2080 unused links and by requesting new links. 2082 - The abstraction layer network can make links available to the 2083 client network or can withdraw them. These actions can be in 2084 response to requests from the client layer, or can be driven by 2085 processes within the abstraction layer (perhaps reorganizing the 2086 use of server layer resources). In any case, the presentation of 2087 new links to the client layer is heavily subject to policy since 2088 this is both operationally key to the success of this architecture 2089 and the central plank of the commercial model described in this 2090 document. Such policies belong to the operator of the abstraction 2091 layer network and are expected to be fully configurable. 2093 Once the abstraction layer network has decided to make a link 2094 available to the client network it will install it at the link end 2095 points (which are nodes in the client network) such that it appears 2096 and can be advertised as a link in the client network. 2098 In all cases, it is important that the operators of both networks are 2099 able to track the requests and responses, and the operator of the 2100 client network should be able to see which links in that network are 2101 "real" physical links, and which are presented by the abstraction 2102 layer network. 2104 10.3. 
Managing Interactions of Abstraction Layer and Server Networks 2106 The interactions between the abstraction layer network and the server 2107 network are similar to those described in Section 10.2, but there is a 2108 difference in that the server layer is more likely to offer up 2109 connectivity, and the abstraction layer network is less likely to ask 2110 for it. 2112 That is, the server network will, according to policy that may 2113 include commercial relationships, offer the abstraction layer network 2114 a set of potential connectivity that the abstraction layer network 2115 can treat as links. This server network policy will include: 2116 - how much connectivity to offer 2117 - what level of server layer redundancy to include 2118 - how to support the use of the abstraction links. 2120 This process of offering links from the server network may include a 2121 mechanism to indicate which links have been pre-established in the 2122 server network, and can include other properties such as: 2123 - link-level protection ([RFC4202]) 2124 - SRLG and MSRLG (see Appendix B) 2125 - mutual exclusivity (see Appendix B). 2127 The abstraction layer network needs a mechanism to tell the server network which of the offered links it wishes to use. 2128 This mechanism could also include the ability to request additional 2129 connectivity from the server layer, although it seems most likely 2130 that the server layer will already have presented as much 2131 connectivity as it is physically capable of, subject to the 2132 constraints of policy. 2134 Finally, the server layer will need to confirm the establishment of 2135 connectivity, withdraw links if they are no longer feasible, and 2136 report failures. 2138 Again, it is important that the operators of both networks are able 2139 to track the requests and responses, and the operator of the server 2140 network should be able to see which links are in use. 2142 11. IANA Considerations 2144 This document makes no requests for IANA action.
The RFC Editor may 2145 safely remove this section. 2147 12. Security Considerations 2149 Security of signaling and routing protocols is usually administered 2150 and achieved within the boundaries of a domain. Thus, for 2151 example, a domain with a GMPLS control plane [RFC3945] would apply 2152 the security mechanisms and considerations that are appropriate to 2153 GMPLS [RFC5920]. Furthermore, domain-based security relies strongly 2154 on ensuring that control plane messages are not allowed to enter the 2155 domain from outside. Thus, the mechanisms in this document for 2156 inter-domain exchange of control plane messages and information 2157 naturally raise additional questions of security. 2159 In this context, additional security considerations arising from this 2160 document relate to the exchange of control plane information between 2161 domains. Messages are passed between domains using control plane 2162 protocols operating between peers that have predictable relationships 2163 (for example, UNI-C to UNI-N, between BGP-LS speakers, or between 2164 peer domains). Thus, the security that needs to be given additional 2165 attention for inter-domain TE concentrates on authentication of 2166 peers, assertion that messages have not been tampered with, and, to a 2167 lesser extent, protection of the content of the messages from inspection 2168 since that might give away sensitive information about the networks. 2169 The protocols described in Appendix A, which are likely to provide 2170 the foundation of solutions to this architecture, already include 2171 such protection and can further be run over protected transports 2172 such as IPsec [RFC6071], TLS [RFC5246], and the TCP Authentication 2173 Option (TCP-AO) [RFC5925]. 2175 It is worth noting that the control plane of the abstraction layer 2176 network is likely to be out of band. That is, control plane messages 2177 will be exchanged over network links that are not the links to which 2178 they apply.
This models the facilities of GMPLS (but not of MPLS-TE) 2179 and the security mechanisms can be applied to the protocols operating 2180 in the out of band network. 2182 13. Acknowledgements 2184 Thanks to Igor Bryskin for useful discussions in the early stages of 2185 this work. 2187 Thanks to Gert Grammel for discussions on the extent of aggregation 2188 in abstract nodes and links. 2190 Thanks to Deborah Brungard, Dieter Beller, Dhruv Dhody, Vallinayakam 2191 Somasundaram, and Hannes Gredler for review and input. 2193 Particular thanks to Vishnu Pavan Beeram for detailed discussions and 2194 white-board scribbling that made many of the ideas in this document 2195 come to life. 2197 Text in Section 4.2.3 is freely adapted from the work of Igor 2198 Bryskin, Wes Doonan, Vishnu Pavan Beeram, John Drake, Gert Grammel, 2199 Manuel Paul, Ruediger Kunze, Friedrich Armbruster, Cyril Margaria, 2200 Oscar Gonzalez de Dios, and Daniele Ceccarelli in 2201 [I-D.beeram-ccamp-gmpls-enni] for which the authors of this document 2202 express their thanks. 2204 14. References 2206 14.1. Informative References 2208 [G.8080] ITU-T, "Architecture for the automatically switched optical 2209 network (ASON)", Recommendation G.8080. 2211 [I-D.beeram-ccamp-gmpls-enni] 2212 Bryskin, I., Beeram, V. P., Drake, J. et al., "Generalized 2213 Multiprotocol Label Switching (GMPLS) External Network 2214 Network Interface (E-NNI): Virtual Link Enhancements for 2215 the Overlay Model", draft-beeram-ccamp-gmpls-enni, work in 2216 progress. 2218 [I-D.ietf-ccamp-rsvp-te-srlg-collect] 2219 Zhang, F. (Ed.) and O. Gonzalez de Dios (Ed.), "RSVP-TE 2220 Extensions for Collecting SRLG Information", draft-ietf- 2221 ccamp-rsvp-te-srlg-collect, work in progress. 2223 [RFC7752] Gredler, H., Medved, J., Previdi, S., Farrel, A., and Ray, 2224 S., "North-Bound Distribution of Link-State and Traffic 2225 Engineering (TE) Information Using BGP", RFC 7752, March 2226 2016. 
2228 [RFC2702] Awduche, D., Malcolm, J., Agogbua, J., O'Dell, M., and 2229 McManus, J., "Requirements for Traffic Engineering Over 2230 MPLS", RFC 2702, September 1999. 2232 [RFC3209] Awduche, D., Berger, L., Gan, D., Li, T., Srinivasan, V., 2233 and G. Swallow, "RSVP-TE: Extensions to RSVP for LSP 2234 Tunnels", RFC 3209, December 2001. 2236 [RFC3473] L. Berger, "Generalized Multi-Protocol Label Switching 2237 (GMPLS) Signaling Resource ReserVation Protocol-Traffic 2238 Engineering (RSVP-TE) Extensions", RFC 3473, January 2003. 2240 [RFC3630] Katz, D., Kompella, K., and Yeung, D., "Traffic Engineering 2241 (TE) Extensions to OSPF Version 2", RFC 3630, September 2242 2003. 2244 [RFC3945] Mannie, E., (Ed.), "Generalized Multi-Protocol Label 2245 Switching (GMPLS) Architecture", RFC 3945, October 2004. 2247 [RFC4105] Le Roux, J.-L., Vasseur, J.-P., and Boyle, J., 2248 "Requirements for Inter-Area MPLS Traffic Engineering", 2249 RFC 4105, June 2005. 2251 [RFC4202] Kompella, K. and Y. Rekhter, "Routing Extensions in Support 2252 of Generalized Multi-Protocol Label Switching (GMPLS)", 2253 RFC 4202, October 2005. 2255 [RFC4206] Kompella, K. and Y. Rekhter, "Label Switched Paths (LSP) 2256 Hierarchy with Generalized Multi-Protocol Label Switching 2257 (GMPLS) Traffic Engineering (TE)", RFC 4206, October 2005. 2259 [RFC4208] Swallow, G., Drake, J., Ishimatsu, H., and Y. Rekhter, 2260 "User-Network Interface (UNI): Resource ReserVation 2261 Protocol-Traffic Engineering (RSVP-TE) Support for the 2262 Overlay Model", RFC 4208, October 2005. 2264 [RFC4216] Zhang, R., and Vasseur, J.-P., "MPLS Inter-Autonomous 2265 System (AS) Traffic Engineering (TE) Requirements", 2266 RFC 4216, November 2005. 2268 [RFC4271] Rekhter, Y., Li, T., and Hares, S., "A Border Gateway 2269 Protocol 4 (BGP-4)", RFC 4271, January 2006. 2271 [RFC4364] Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private 2272 Networks (VPNs)", RFC 4364, February 2006. 2274 [RFC4655] Farrel, A., Vasseur, J., and J.
Ash, "A Path Computation 2275 Element (PCE)-Based Architecture", RFC 4655, August 2006. 2277 [RFC4726] Farrel, A., Vasseur, J.-P., and Ayyangar, A., "A Framework 2278 for Inter-Domain Multiprotocol Label Switching Traffic 2279 Engineering", RFC 4726, November 2006. 2281 [RFC4847] T. Takeda (Ed.), "Framework and Requirements for Layer 1 2282 Virtual Private Networks," RFC 4847, April 2007. 2284 [RFC4874] Lee, CY., Farrel, A., and S. De Cnodder, "Exclude Routes - 2285 Extension to Resource ReserVation Protocol-Traffic 2286 Engineering (RSVP-TE)", RFC 4874, April 2007. 2288 [RFC4920] Farrel, A., Satyanarayana, A., Iwata, A., Fujita, N., and 2289 Ash, G., "Crankback Signaling Extensions for MPLS and GMPLS 2290 RSVP-TE", RFC 4920, July 2007. 2292 [RFC5150] Ayyangar, A., Kompella, K., Vasseur, JP., and A. Farrel, 2293 "Label Switched Path Stitching with Generalized 2294 Multiprotocol Label Switching Traffic Engineering (GMPLS 2295 TE)", RFC 5150, February 2008. 2297 [RFC5152] Vasseur, JP., Ayyangar, A., and Zhang, R., "A Per-Domain 2298 Path Computation Method for Establishing Inter-Domain 2299 Traffic Engineering (TE) Label Switched Paths (LSPs)", 2300 RFC 5152, February 2008. 2302 [RFC5195] Ould-Brahim, H., Fedyk, D., and Y. Rekhter, "BGP-Based 2303 Auto-Discovery for Layer-1 VPNs", RFC 5195, June 2008. 2305 [RFC5246] Dierks, T. and E. Rescorla, "The Transport Layer Security 2306 (TLS) Protocol Version 1.2", RFC 5246, August 2008. 2308 [RFC5251] Fedyk, D., Rekhter, Y., Papadimitriou, D., Rabbat, R., and 2309 L. Berger, "Layer 1 VPN Basic Mode", RFC 5251, July 2008. 2311 [RFC5252] Bryskin, I. and L. Berger, "OSPF-Based Layer 1 VPN Auto- 2312 Discovery", RFC 5252, July 2008. 2314 [RFC5305] Li, T., and Smit, H., "IS-IS Extensions for Traffic 2315 Engineering", RFC 5305, October 2008. 2317 [RFC5440] Vasseur, JP. and Le Roux, JL., "Path Computation Element 2318 (PCE) Communication Protocol (PCEP)", RFC 5440, March 2009. 
2320 [RFC5441] Vasseur, JP., Zhang, R., Bitar, N., and Le Roux, JL., "A 2321 Backward-Recursive PCE-Based Computation (BRPC) Procedure 2322 to Compute Shortest Constrained Inter-Domain Traffic 2323 Engineering Label Switched Paths", RFC 5441, April 2009. 2325 [RFC5523] L. Berger, "OSPFv3-Based Layer 1 VPN Auto-Discovery", RFC 2326 5523, April 2009. 2328 [RFC5553] Farrel, A., Bradford, R., and JP. Vasseur, "Resource 2329 Reservation Protocol (RSVP) Extensions for Path Key 2330 Support", RFC 5553, May 2009. 2332 [RFC5623] Oki, E., Takeda, T., Le Roux, JL., and A. Farrel, 2333 "Framework for PCE-Based Inter-Layer MPLS and GMPLS Traffic 2334 Engineering", RFC 5623, September 2009. 2336 [RFC5920] L. Fang, Ed., "Security Framework for MPLS and GMPLS 2337 Networks", RFC 5920, July 2010. 2339 [RFC5925] Touch, J., Mankin, A., and R. Bonica, "The TCP 2340 Authentication Option", RFC 5925, June 2010. 2342 [RFC6005] Berger, L., and D. Fedyk, "Generalized MPLS (GMPLS) Support 2343 for Metro Ethernet Forum and G.8011 User Network Interface 2344 (UNI)", RFC 6005, October 2010. 2346 [RFC6107] Shiomoto, K., and A. Farrel, "Procedures for Dynamically 2347 Signaled Hierarchical Label Switched Paths", RFC 6107, 2348 February 2011. 2350 [RFC6071] Frankel, S. and S. Krishnan, "IP Security (IPsec) and 2351 Internet Key Exchange (IKE) Document Roadmap", RFC 6071, 2352 February 2011. 2354 [RFC6805] King, D., and A. Farrel, "The Application of the Path 2355 Computation Element Architecture to the Determination of a 2356 Sequence of Domains in MPLS and GMPLS", RFC 6805, November 2357 2012. 2359 [RFC6827] Malis, A., Lindem, A., and D. Papadimitriou, "Automatically 2360 Switched Optical Network (ASON) Routing for OSPFv2 2361 Protocols", RFC 6827, January 2013. 2363 [RFC6996] J. Mitchell, "Autonomous System (AS) Reservation for 2364 Private Use", BCP 6, RFC 6996, July 2013. 2366 [RFC7399] Farrel, A. and D.
King, "Unanswered Questions in the Path 2367 Computation Element Architecture", RFC 7399, October 2014. 2369 [RFC7579] Bernstein, G., Lee, Y., et al., "General Network Element 2370 Constraint Encoding for GMPLS-Controlled Networks", RFC 2371 7579, June 2015. 2373 [RFC7580] Zhang, F., Lee, Y., Han, J., Bernstein, G., and Xu, Y., 2374 "OSPF-TE Extensions for General Network Element 2375 Constraints", RFC 7580, June 2015. 2377 Authors' Addresses 2379 Adrian Farrel 2380 Juniper Networks 2381 EMail: adrian@olddog.co.uk 2383 John Drake 2384 Juniper Networks 2385 EMail: jdrake@juniper.net 2387 Nabil Bitar 2388 Nuage Networks 2389 EMail: nbitar40@gmail.com 2391 George Swallow 2392 Cisco Systems, Inc. 2393 1414 Massachusetts Ave 2394 Boxborough, MA 01719 2395 EMail: swallow@cisco.com 2397 Xian Zhang 2398 Huawei Technologies 2399 Email: zhang.xian@huawei.com 2401 Daniele Ceccarelli 2402 Ericsson 2403 Via A. Negrone 1/A 2404 Genova - Sestri Ponente 2405 Italy 2406 EMail: daniele.ceccarelli@ericsson.com 2408 Contributors 2410 Gert Grammel 2411 Juniper Networks 2412 Email: ggrammel@juniper.net 2414 Vishnu Pavan Beeram 2415 Juniper Networks 2416 Email: vbeeram@juniper.net 2418 Oscar Gonzalez de Dios 2419 Email: ogondio@tid.es 2421 Fatai Zhang 2422 Email: zhangfatai@huawei.com 2423 Zafar Ali 2424 Email: zali@cisco.com 2426 Rajan Rao 2427 Email: rrao@infinera.com 2429 Sergio Belotti 2430 Email: sergio.belotti@alcatel-lucent.com 2432 Diego Caviglia 2433 Email: diego.caviglia@ericsson.com 2435 Jeff Tantsura 2436 Email: jeff.tantsura@ericsson.com 2438 Khuzema Pithewan 2439 Email: kpithewan@infinera.com 2441 Cyril Margaria 2442 Email: cyril.margaria@googlemail.com 2444 Victor Lopez 2445 Email: vlopez@tid.es 2447 Appendix A. Existing Work 2449 This appendix briefly summarizes relevant existing work that is used 2450 to route TE paths across multiple domains. 2452 A.1.
Per-Domain Path Computation 2454 The per-domain mechanism of path establishment is described in 2455 [RFC5152] and its applicability is discussed in [RFC4726]. In 2456 summary, this mechanism assumes that each domain entry point is 2457 responsible for computing the path across the domain, but that 2458 details of the path in the next domain are left to the next domain 2459 entry point. The computation may be performed directly by the entry 2460 point or may be delegated to a computation server. 2462 This basic mode of operation can run into many of the issues 2463 described alongside the use cases in Section 2. However, in practice 2464 it can be used effectively with a little operational guidance. 2466 For example, RSVP-TE [RFC3209] includes the concept of a "loose hop" 2467 in the explicit path that is signaled. This allows the original 2468 request for an LSP to list the domains or even domain entry points to 2469 include on the path. Thus, in the example in Figure 1, the source 2470 can be told to use the interconnection x2. Then the source computes 2471 the path from itself to x2, and initiates the signaling. When the 2472 signaling message reaches Domain Z, the entry point to the domain 2473 computes the remaining path to the destination and continues the 2474 signaling. 2476 Another alternative suggested in [RFC5152] is to make TE routing 2477 attempt to follow inter-domain IP routing. Thus, in the example 2478 shown in Figure 2, the source would examine the BGP routing 2479 information to determine the correct interconnection point for 2480 forwarding IP packets, and would use that to compute and then signal 2481 a path for Domain A. Each domain in turn would apply the same 2482 approach so that the path is progressively computed and signaled 2483 domain by domain. 2485 Although the per-domain approach has many issues and drawbacks in 2486 terms of achieving optimal (or, indeed, any) paths, it has been the 2487 mainstay of inter-domain LSP set-up to date. 
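The loose-hop procedure described above can be sketched in code: each domain entry point computes only its own segment of the path and leaves the rest to the next domain's entry point. This is a minimal illustrative sketch, not a protocol implementation; the topologies, node names, and helper functions are invented for the example.

```python
# Illustrative sketch of per-domain path computation (A.1).
# Each domain entry point computes only its own segment; details of
# the path in the next domain are left to that domain's entry point.
from heapq import heappush, heappop

def shortest_path(links, src, dst):
    """Dijkstra within a single domain. links: {node: [(neighbor, cost)]}."""
    dist, prev, heap = {src: 0}, {}, [(0, src)]
    while heap:
        d, u = heappop(heap)
        if u == dst:
            break
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, c in links.get(u, []):
            if d + c < dist.get(v, float("inf")):
                dist[v], prev[v] = d + c, u
                heappush(heap, (d + c, v))
    if dst not in dist:
        return None  # no path across this domain
    path, node = [dst], dst
    while node != src:
        node = prev[node]
        path.append(node)
    return path[::-1]

def per_domain_path(domains, hops):
    """hops is the list of loose hops [source, x1, ..., destination];
    domains[i] is the topology used to compute the segment from
    hops[i] to hops[i+1], as each signaling message enters a domain."""
    full = [hops[0]]
    for topo, (entry, exit_) in zip(domains, zip(hops, hops[1:])):
        segment = shortest_path(topo, entry, exit_)
        if segment is None:
            return None  # in practice this would trigger crankback (A.2)
        full.extend(segment[1:])
    return full
```

For example, with a source domain and Domain Z joined at interconnection x2, `per_domain_path([domain_a, domain_z], ["S", "x2", "D"])` concatenates the two independently computed segments, mirroring how signaling progresses domain by domain.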
2489 A.2. Crankback 2491 Crankback addresses one of the main issues with per-domain path 2492 computation: what happens when an initial path is selected that 2493 cannot be completed toward the destination? For example, what 2494 happens if, in Figure 2, the source attempts to route the path 2495 through interconnection x2, but Domain C does not have the right TE 2496 resources or connectivity to route the path further? 2498 Crankback for MPLS-TE and GMPLS networks is described in [RFC4920] 2499 and is based on a concept similar to the Acceptable Label Set 2500 mechanism described for GMPLS signaling in [RFC3473]. When a node 2501 (i.e., a domain entry point) is unable to compute a path further 2502 across the domain, it returns an error message in the signaling 2503 protocol that states where the blockage occurred (link identifier, 2504 node identifier, domain identifier, etc.) and gives some clues about 2505 what caused the blockage (bad choice of label, insufficient bandwidth 2506 available, etc.). This information allows a previous computation 2507 point to select an alternative path, or to aggregate crankback 2508 information and return it upstream to a previous computation point. 2510 Crankback is a very powerful mechanism and can be used to find an 2511 end-to-end path in a multi-domain network if one exists. 2513 On the other hand, crankback can be quite resource-intensive as 2514 signaling messages and path setup attempts may "wander around" in the 2515 network attempting to find the correct path for a long time. Since 2516 RSVP-TE signaling ties up network resources for partially 2517 established LSPs, since network conditions may be in flux, and most 2518 particularly since LSP setup within well-known time limits is highly 2519 desirable, crankback is not a popular mechanism. 2521 Furthermore, even if crankback can always find an end-to-end path, it 2522 does not guarantee to find the optimal path.
(Note that there have 2523 been some academic proposals to use signaling-like techniques to 2524 explore the whole network in order to find optimal paths, but these 2525 tend to place even greater burdens on network processing.) 2527 A.3. Path Computation Element 2529 The Path Computation Element (PCE) is introduced in [RFC4655]. It is 2530 an abstract functional entity that computes paths. Thus, in the 2531 example of per-domain path computation (see A.1) the source node and 2532 each domain entry point is a PCE. On the other hand, the PCE can 2533 also be realized as a separate network element (a server) to which 2534 computation requests can be sent using the Path Computation Element 2535 Communication Protocol (PCEP) [RFC5440]. 2537 Each PCE has responsibility for computations within a domain, and has 2538 visibility of the attributes within that domain. This immediately 2539 enables per-domain path computation with the opportunity to off-load 2540 complex, CPU-intensive, or memory-intensive computation functions 2541 from routers in the network. But the use of PCE in this way does not 2542 solve any of the problems articulated in A.1 and A.2. 2544 Two significant mechanisms for cooperation between PCEs have been 2545 described. These mechanisms are intended to specifically address the 2546 problems of computing optimal end-to-end paths in multi-domain 2547 environments. 2549 - The Backward-Recursive PCE-Based Computation (BRPC) mechanism 2550 [RFC5441] involves cooperation between the set of PCEs along the 2551 inter-domain path. Each one computes the possible paths from 2552 domain entry point (or source node) to domain exit point (or 2553 destination node) and shares the information with its upstream 2554 neighbor PCE which is able to build a tree of possible paths 2555 rooted at the destination. The PCE in the source domain can 2556 select the optimal path. 2558 BRPC is sometimes described as "crankback at computation time". 
It 2559 is capable of determining the optimal path in a multi-domain 2560 network, but depends on knowing the domain that contains the 2561 destination node. Furthermore, the mechanism can become quite 2562 complicated and involve a lot of data in a mesh of interconnected 2563 domains. Thus, BRPC is most often proposed for a simple mesh of 2564 domains and specifically for a path that will cross a known 2565 sequence of domains, but where there may be a choice of domain 2566 interconnections. In this way, BRPC would only be applied to 2567 Figure 2 if a decision had been made (externally) to traverse 2568 Domain C rather than Domain D (notwithstanding that it could 2569 functionally be used to make that choice itself), but BRPC could be 2570 used very effectively to select between interconnections x1 and x2 2571 in Figure 1. 2573 - Hierarchical PCE (H-PCE) [RFC6805] offers a parent PCE that is 2574 responsible for navigating a path across the domain mesh and for 2575 coordinating intra-domain computations by the child PCEs 2576 responsible for each domain. This approach makes computing an end- 2577 to-end path across a mesh of domains far more tractable. However, 2578 it still leaves unanswered the issue of determining the location of 2579 the destination (i.e., discovering the destination domain) as 2580 described in Section 2.1.1. Furthermore, it raises the question of 2581 who operates the parent PCE especially in networks where the 2582 domains are under different administrative and commercial control. 2584 It should also be noted that [RFC5623] discusses how PCE is used in a 2585 multi-layer network with coordination between PCEs operating at each 2586 network layer. Further issues and considerations of the use of PCE 2587 can be found in [RFC7399]. 2589 A.4. GMPLS UNI and Overlay Networks 2591 [RFC4208] defines the GMPLS User-to-Network Interface (UNI) to 2592 present a routing boundary between an overlay network and the core 2593 network, i.e. 
the client-server interface. In the client network, 2594 the nodes connected directly to the core network are known as edge 2595 nodes, while the nodes in the server network are called core nodes. 2597 In the overlay model defined by [RFC4208] the core nodes act as a 2598 closed system and the edge nodes do not participate in the routing 2599 protocol instance that runs among the core nodes. Thus the UNI 2600 allows access to and limited control of the core nodes by edge nodes 2601 that are unaware of the topology of the core nodes. This respects 2602 the operational and layer boundaries while scaling the network. 2604 [RFC4208] does not define any routing protocol extension for the 2605 interaction between core and edge nodes but allows for the exchange 2606 of reachability information between them. In terms of a VPN, the 2607 client network can be considered as the customer network comprised 2608 of a number of disjoint sites, and the edge nodes match the VPN CE 2609 nodes. Similarly, the provider network in the VPN model is 2610 equivalent to the server network. 2612 [RFC4208] is, therefore, a signaling-only solution that allows edge 2613 nodes to request connectivity across the core network, and leaves the 2614 core network to select the paths for the LSPs as they traverse the 2615 core (setting up hierarchical LSPs if necessitated by the 2616 technology). This solution is supplemented by a number of signaling 2617 extensions such as [RFC4874], [RFC5553], [I-D.ietf-ccamp-xro-lsp- 2618 subobject], [I-D.ietf-ccamp-rsvp-te-srlg-collect], and [I-D.ietf- 2619 ccamp-te-metric-recording] to give the edge node more control over 2620 the path within the core network and by allowing the edge nodes to supply 2621 additional constraints on the path used in the core network.
2622 Nevertheless, in this UNI/overlay model, the edge node has limited 2623 information about precisely what LSPs could be set up across the core, 2624 and what TE services (such as diverse routes for end-to-end 2625 protection, end-to-end bandwidth, etc.) can be supported. 2627 A.5. Layer One VPN 2629 A Layer One VPN (L1VPN) is a service offered by a core layer 1 2630 network to provide layer 1 connectivity (TDM, LSC) between two or 2631 more customer networks in an overlay service model [RFC4847]. 2633 As in the UNI case, the customer edge has some control over the 2634 establishment and type of the connectivity. In the L1VPN context 2635 three different service models have been defined, classified by the 2636 semantics of information exchanged over the customer interface: 2638 Management Based, Signaling Based (a.k.a. basic), and Signaling and 2639 Routing service model (a.k.a. enhanced). 2641 In the management based model, all edge-to-edge connections are set 2642 up using configuration and management tools. This is not a dynamic 2643 control plane solution and need not concern us here. 2645 In the signaling based service model [RFC5251] the CE-PE interface 2646 allows only for signaling message exchange, and the provider network 2647 does not export any routing information about the core network. VPN 2648 membership is known a priori (presumably through configuration) or is 2649 discovered using a routing protocol [RFC5195], [RFC5252], [RFC5523], 2650 as is the relationship between CE nodes and ports on the PE. This 2651 service model is much in line with GMPLS UNI as defined in [RFC4208]. 2653 In the enhanced model there is an additional limited exchange of 2654 routing information over the CE-PE interface between the provider 2655 network and the customer network. The enhanced model considers four 2656 different types of service models, namely: Overlay Extension, Virtual 2657 Node, Virtual Link and Per-VPN service models.
All of these 2658 represent particular cases of TE information aggregation and 2659 representation. 2661 A.6. Policy and Link Advertisement 2663 Inter-domain networking relies on policy and management input to 2664 coordinate the allocation of resources under different administrative 2665 control. [RFC5623] introduces a functional component called the 2666 Virtual Network Topology Manager (VNTM) for this purpose. 2668 An important companion to this function is determining how 2669 connectivity across the abstraction layer network is made available 2670 as a TE link in the client network. Obviously, if the connectivity 2671 is established using management intervention, the consequent client 2672 network TE link can also be configured manually. However, if 2673 connectivity from client edge to client edge is achieved using 2674 dynamic signaling, then there is a need for the end points to exchange 2675 the link properties that they should advertise within the client 2676 network, and in the case of support for more than one client network, 2677 it will be necessary to indicate which client or clients can use the 2678 link. This capability is provided in [RFC6107]. 2680 Appendix B. Additional Features 2682 This Appendix describes additional features that may be desirable and 2683 that can be achieved within this architecture. 2685 B.1. Macro Shared Risk Link Groups 2687 Network links often share fate with one or more other links. That 2688 is, a scenario that may cause a link to fail could cause one or more 2689 other links to fail. This may occur, for example, if the links are 2690 supported by the same fiber bundle, or if some links are routed down 2691 the same duct or in a common piece of infrastructure such as a 2692 bridge. A common way to identify the links that may share fate is to 2693 label them as belonging to a Shared Risk Link Group (SRLG) [RFC4202].
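The fate-sharing test implied by SRLGs can be shown with a small sketch: two paths are considered diverse only if the unions of the SRLGs advertised for their links do not intersect. The link names and SRLG values below are invented for illustration and are not taken from any real advertisement.

```python
# Illustrative SRLG-disjointness check. srlgs maps each link to the
# set of SRLG identifiers advertised for it [RFC4202]; the links and
# SRLG numbers here are invented example data.

def path_srlgs(path_links, srlgs):
    """Union of the SRLGs of all links on a path."""
    groups = set()
    for link in path_links:
        groups |= srlgs.get(link, set())
    return groups

def srlg_disjoint(path1, path2, srlgs):
    """True only if the two paths share no SRLG (no common fate)."""
    return not (path_srlgs(path1, srlgs) & path_srlgs(path2, srlgs))

# Example: two client links that ride the same server layer resource
# advertise the same group value, so they are not usable as a
# link-diverse protection pair.
srlgs = {
    ("C2", "C0"): {17},  # e.g., carried over server link S1-S2
    ("C2", "C3"): {17},  # also carried over server link S1-S2
    ("C2", "C4"): {42},  # carried over unrelated resources
}
```

The same check applies unchanged when the group values are MSRLGs: the abstraction collapses the many per-resource SRLGs of the server layer path into one value, so the client layer comparison stays cheap.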
2695 TE links created from LSPs in lower layers may also share fate, and 2696 it can be hard for a client network to know about this problem 2697 because it does not know the topology of the server network or the 2698 path of the server layer LSPs that are used to create the links in 2699 the client network. 2701 For example, looking at the example used in Section 4.2.3 and 2702 considering the two abstract links S1-S3 and S1-S9, there is no way 2703 for the client layer to know whether the links C2-C0 and C2-C3 share 2704 fate. Clearly, if the client layer uses these links to provide a 2705 link-diverse end-to-end protection scheme, it needs to know that the 2706 links actually share a piece of network infrastructure (the server 2707 layer link S1-S2). 2709 Per [RFC4202], an SRLG represents a shared physical network resource 2710 upon which the normal functioning of a link depends. Multiple SRLGs 2711 can be identified and advertised for every TE link in a network. 2712 However, this can produce a scalability problem in a multi-layer 2713 network that equates to advertising in the client layer the server 2714 layer route of each TE link. 2716 Macro SRLGs (MSRLGs) address this scaling problem and are a form of 2717 abstraction performed at the same time that the abstract links are 2718 derived. In this way, links that actually share resources in the 2719 server layer are advertised as having the same MSRLG, rather than 2720 advertising each SRLG for each resource on each path in the server 2721 layer. This saving is possible because the abstract links are 2722 formulated on behalf of the server layer by a central management 2723 agency that is aware of all of the link abstractions being offered. 2725 It may be noted that a less optimal alternative path for the abstract 2726 link S1-S9 exists in the server layer (S1-S4-S7-S8-S9).
It would be 2727 possible for the client layer request for connectivity C2-C0 to ask 2728 that the path be maximally disjoint from the path C2-C3. While 2729 nothing can be done about the shared link C2-S1, the abstraction 2730 layer could request to use the link S1-S9 in a way that is diverse 2731 from use of the link S1-S3, and this request could be honored if the 2732 server layer policy allows. 2734 Note that SRLGs and MSRLGs may be very hard to describe in the case 2735 of multiple server layer networks because the abstraction points will 2736 not know whether the resources in the various server layers share 2737 physical locations. 2739 B.2. Mutual Exclusivity 2741 As noted in the discussion of Figure 13, it is possible that some 2742 abstraction layer links cannot be used at the same time. This 2743 arises when the potentiality of the links is indicated by the server 2744 layer, but the use of the links would actually compete for server layer 2745 resources. In Figure 13 this arose when both link S1-S3 and link 2746 S7-S9 were used to carry LSPs: in that case the link S1-S9 could no 2747 longer be used. 2749 Such a situation need not be an issue when client-edge to client-edge 2750 LSPs are set up one by one because the use of one abstraction layer 2751 link and the corresponding use of server layer resources will cause 2752 the server layer to withdraw the availability of the other 2753 abstraction layer links, and these will become unavailable for 2754 further abstraction layer path computations. 2756 Furthermore, in deployments where abstraction layer links are only 2757 presented as available after server layer LSPs have been established 2758 to support them, the problem is unlikely to exist.
2760 However, when the server layer is constrained, but chooses to 2761 advertise the potential of multiple abstraction layer links even 2762 though they compete for resources, and when multiple client-edge to 2763 client-edge LSPs are computed simultaneously (perhaps to provide 2764 protection services) there may be contention for server layer 2765 resources. In the case that protected abstraction layer LSPs are 2766 being established, this situation would be avoided through the use of 2767 SRLGs and/or MSRLGs since the two abstraction layer links that 2768 compete for server layer resources must also fate share across those 2769 resources. But in the case where the multiple client-edge to client- 2770 edge LSPs do not care about fate sharing, it may be necessary to flag 2771 the mutually exclusive links in the abstraction layer TED so that 2772 path computation can avoid accidentally attempting to utilize two of 2773 a set of such links at the same time.
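The flagging of mutually exclusive links described above can be sketched as a simple check applied after simultaneous path computation: reject any combined result that uses two or more links from the same exclusive set. The link names echo the Figure 13 discussion, but the grouping into pairwise exclusive sets is an illustrative assumption about how the server layer's constraints might be recorded in the abstraction layer TED, not a statement of the Figure 13 server topology.

```python
# Illustrative mutual-exclusivity check (B.2). exclusive_sets lists
# groups of abstraction layer links that compete for the same server
# layer resources; the pairwise grouping below is assumed example data.

def violates_exclusivity(paths, exclusive_sets):
    """True if the links used by the simultaneously computed paths
    include two or more links from any mutually exclusive set."""
    used = {link for path in paths for link in path}
    return any(len(used & exclusive) > 1 for exclusive in exclusive_sets)

# Assumed data: link S1-S9 competes for server layer resources with
# each of S1-S3 and S7-S9, so it may not be used alongside either.
exclusive_sets = [{"S1-S3", "S1-S9"}, {"S7-S9", "S1-S9"}]
```

With this data, a pair of paths using S1-S3 and S7-S9 passes the check, while any combination that also tries to use S1-S9 is rejected, matching the behavior described for Figure 13.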