Network Working Group                                    A. Farrel (Ed.)
Internet-Draft                                                  J. Drake
Intended status: Best Current Practice                  Juniper Networks
Expires: November 21, 2016                                      N. Bitar
                                                                   Nokia
                                                              G. Swallow
                                                     Cisco Systems, Inc.
                                                           D. Ceccarelli
                                                                Ericsson
                                                                X. Zhang
                                                                  Huawei
                                                            May 21, 2016


       Problem Statement and Architecture for Information Exchange
          Between Interconnected Traffic Engineered Networks

         draft-ietf-teas-interconnected-te-info-exchange-07.txt

Abstract

   In Traffic Engineered (TE) systems, it is sometimes desirable to
   establish an end-to-end TE path with a set of constraints (such as
   bandwidth) across one or more networks from a source to a
   destination.  TE information is the data relating to nodes and TE
   links that is used in the process of selecting a TE path.  TE
   information is usually only available within a network.  We call
   such a zone of visibility of TE information a domain.  An example of
   a domain may be an IGP area or an Autonomous System.

   In order to determine the potential to establish a TE path through a
   series of connected networks, it is necessary to have available a
   certain amount of TE information about each network.  This need not
   be the full set of TE information available within each network, but
   does need to express the potential of providing TE connectivity.
   This subset of TE information is called TE reachability information.

   This document sets out the problem statement for the exchange of TE
   information between interconnected TE networks in support of end-to-
   end TE path establishment and describes the best current practice
   architecture to meet this problem statement.  For reasons that are
   explained in the document, this work is limited to simple TE
   constraints and information that determine TE reachability.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.
   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.  It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

Copyright Notice

   Copyright (c) 2016 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document.  Code Components extracted from this
   document must include Simplified BSD License text as described in
   Section 4.e of the Trust Legal Provisions and are provided without
   warranty as described in the Simplified BSD License.

Table of Contents

   1. Introduction ................................................. 5
   1.1. Terminology ................................................ 6
   1.1.1. TE Paths and TE Connections .............................. 6
   1.1.2. TE Metrics and TE Attributes ............................. 6
   1.1.3. TE Reachability .......................................... 7
   1.1.4. Domain ................................................... 7
   1.1.5. Server Network ........................................... 7
   1.1.6. Client Network ........................................... 7
   1.1.7. Aggregation .............................................. 7
   1.1.8. Abstraction .............................................. 8
   1.1.9. Abstract Link ............................................ 8
   1.1.10. Abstract Node or Virtual Node ........................... 8
   1.1.11. Abstraction Layer Network ............................... 9
   2. Overview of Use Cases ........................................ 9
   2.1. Peer Networks .............................................. 9
   2.2. Client-Server Networks ..................................... 11
   2.3. Dual-Homing ................................................ 13
   2.4. Requesting Connectivity .................................... 14
   2.4.1. Discovering Server Network Information ................... 16
   3. Problem Statement ............................................ 16
   3.1. Policy and Filters ......................................... 17
   3.2. Confidentiality ............................................ 17
   3.3. Information Overload ....................................... 18
   3.4. Issues of Information Churn ................................ 18
   3.5. Issues of Aggregation ...................................... 19
   4. Architecture ................................................. 20
   4.1. TE Reachability ............................................ 20
   4.2. Abstraction not Aggregation ................................ 21
   4.2.1. Abstract Links ........................................... 22
   4.2.2. The Abstraction Layer Network ............................ 22
   4.2.3. Abstraction in Client-Server Networks .................... 25
   4.2.4. Abstraction in Peer Networks ............................. 30
   4.3. Considerations for Dynamic Abstraction ..................... 32
   4.4. Requirements for Advertising Links and Nodes ............... 33
   4.5. Addressing Considerations .................................. 34
   5. Building on Existing Protocols ............................... 34
   5.1. BGP-LS ..................................................... 35
   5.2. IGPs ....................................................... 35
   5.3. RSVP-TE .................................................... 35
   5.4. Notes on a Solution ........................................ 35
   6. Application of the Architecture to Optical Domains and
      Networks .................................................... 37
   7. Application of the Architecture to the User-to-Network
      Interface ................................................... 41
   8. Application of the Architecture to L3VPN Multi-AS
      Environments ................................................ 43
   9. Scoping Future Work .......................................... 44
   9.1. Not Solving the Internet ................................... 44
   9.2. Working With "Related" Domains ............................. 44
   9.3. Not Finding Optimal Paths in All Situations ................ 44
   9.4. Sanity and Scaling ......................................... 44
   10. Manageability Considerations ................................ 45
   10.1. Managing the Abstraction Layer Network .................... 45
   10.2. Managing Interactions of Client and Abstraction Layer
         Networks ................................................. 46
   10.3. Managing Interactions of Abstraction Layer and Server
         Networks ................................................. 46
   11. IANA Considerations ......................................... 47
   12. Security Considerations ..................................... 47
   13. Acknowledgements ............................................ 48
   14. References .................................................. 49
   14.1. Informative References .................................... 49
   Authors' Addresses .............................................. 52
   Contributors .................................................... 53
   A. Existing Work ................................................ 55
   A.1. Per-Domain Path Computation ................................ 55
   A.2. Crankback .................................................. 55
   A.3. Path Computation Element ................................... 56
   A.4. GMPLS UNI and Overlay Networks ............................. 58
   A.5. Layer One VPN .............................................. 58
   A.6. Policy and Link Advertisement .............................. 59
   B. Additional Features .......................................... 60
   B.1. Macro Shared Risk Link Groups .............................. 60
   B.2. Mutual Exclusivity ......................................... 61

1. Introduction

   Traffic Engineered (TE) systems such as MPLS-TE [RFC2702] and GMPLS
   [RFC3945] offer a way to establish paths through a network in a
   controlled way that reserves network resources on specified links.
   TE paths are computed by examining the Traffic Engineering Database
   (TED) and selecting a sequence of links and nodes that are capable
   of meeting the requirements of the path to be established.  The TED
   is constructed from information distributed by the IGP running in
   the network, for example OSPF-TE [RFC3630] or ISIS-TE [RFC5305].

   It is sometimes desirable to establish an end-to-end TE path that
   crosses more than one network or administrative domain as described
   in [RFC4105] and [RFC4216].  In these cases, the availability of TE
   information is usually limited to within each network.  Such
   networks are often referred to as Domains [RFC4726] and we adopt
   that definition in this document: viz.

      For the purposes of this document, a domain is considered to be
      any collection of network elements within a common sphere of
      address management or path computational responsibility.
      Examples of such domains include IGP areas and Autonomous
      Systems.

   In order to determine the potential to establish a TE path through a
   series of connected domains and to choose the appropriate domain
   connection points through which to route a path, it is necessary to
   have available a certain amount of TE information about each domain.
   This need not be the full set of TE information available within
   each domain, but does need to express the potential of providing TE
   connectivity.  This subset of TE information is called TE
   reachability information.
   The TE reachability information can be exchanged between domains
   based on the information gathered from the local routing protocol,
   filtered by configured policy, or statically configured.

   This document sets out the problem statement for the exchange of TE
   information between interconnected TE networks in support of end-to-
   end TE path establishment and describes the best current practice
   architecture to meet this problem statement.  The scope of this
   document is limited to the simple TE constraints and information
   (such as TE metrics, hop count, bandwidth, delay, shared risk)
   necessary to determine TE reachability: discussion of multiple
   additional constraints that might qualify the reachability can
   significantly complicate aggregation of information and the
   stability of the mechanism used to present potential connectivity,
   as is explained in the body of this document.

   An Appendix to this document summarizes relevant existing work that
   is used to route TE paths across multiple domains.

1.1. Terminology

   This section introduces some key terms that need to be understood to
   arrive at a common understanding of the problem space.  Some of the
   terms are defined in more detail in the sections that follow (in
   which case forward pointers are provided), and some terms are taken
   from definitions that already exist in other RFCs (in which case
   references are given, but no apology is made for repeating or
   summarizing the definitions here).

1.1.1. TE Paths and TE Connections

   A TE connection is a Label Switched Path (LSP) through an MPLS-TE or
   GMPLS network that directs traffic along a particular path (the TE
   path) in order to provide a specific service such as bandwidth
   guarantee, separation of traffic, or resilience between a well-known
   pair of end points.

1.1.2. TE Metrics and TE Attributes

   TE metrics and TE attributes are terms applied to parameters of
   links (and possibly nodes) in a network that is traversed by TE
   connections.  The TE metrics and TE attributes are used by path
   computation algorithms to select the TE paths that the TE
   connections traverse.  A TE metric is a quantifiable value
   (including measured characteristics) describing some property of a
   link or node that can be used as part of TE routing or planning,
   while a TE attribute is a wider term (i.e., including the concept of
   a TE metric) that refers to any property or characteristic of a link
   or node that can be used as part of TE routing or planning.  Thus,
   the delay introduced by transmission of a packet on a link is an
   example of a TE metric, while the geographic location of a router is
   an example of a more general attribute.

   Provisioning a TE connection through a network may result in dynamic
   changes to the TE metrics and TE attributes of the links and nodes
   in the network.

   These terms are also sometimes used to describe the end-to-end
   characteristics of a TE connection and can be derived according to a
   formula from the TE metrics and TE attributes of the links and nodes
   that the TE connection traverses.  Thus, for example, the end-to-end
   delay for a TE connection is usually considered to be the sum of the
   delay on each link that the connection traverses.

1.1.3. TE Reachability

   In an IP network, reachability is the ability to deliver a packet to
   a specific address or prefix.  That is, the existence of an IP path
   to that address or prefix.  TE reachability is the ability to reach
   a specific address along a TE path.  More specifically, it is the
   ability to establish a TE connection in an MPLS-TE or GMPLS sense.
   Thus, we talk about TE reachability as the potential of providing TE
   connectivity.
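   This notion of reachability, a statement of potential connectivity
   that may or may not carry further qualification, can be sketched as
   a simple record.  The sketch below is purely illustrative: the field
   names and types are assumptions made for this example and are not
   taken from any IETF protocol encoding.

```python
from dataclasses import dataclass
from typing import Optional

# Illustrative only: these names do not come from any protocol
# specification; they sketch the idea that a TE reachability entry
# states "a TE path exists" and may optionally carry TE metrics.
@dataclass
class TEReachability:
    prefix: str                                   # reachable address or prefix
    hop_count: Optional[int] = None               # optional qualifying metrics;
    available_bandwidth: Optional[float] = None   # when all are absent, the
    delay_ms: Optional[float] = None              # entry is "unqualified"

    def is_qualified(self) -> bool:
        # An entry is "qualified" when at least one TE metric or
        # attribute accompanies the bare statement of connectivity.
        return any(v is not None for v in
                   (self.hop_count, self.available_bandwidth, self.delay_ms))

# An unqualified entry: a TE path exists, nothing more is said.
unqualified = TEReachability("192.0.2.0/24")

# A qualified entry: the same potential connectivity, with metrics.
qualified = TEReachability("192.0.2.0/24", hop_count=3, delay_ms=20.0)
```

   The address used is from the documentation range of RFC 5737, so the
   example remains valid in published text.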
   TE reachability may be unqualified (there is a TE path, but no
   information about available resources or other constraints is
   supplied), which is especially helpful in determining a path to a
   destination that lies in an unknown domain, or it may be qualified
   by TE attributes and TE metrics such as hop count, available
   bandwidth, delay, shared risk, etc.

1.1.4. Domain

   As defined in [RFC4726], a domain is any collection of network
   elements within a common sphere of address management or path
   computational responsibility.  Examples of such domains include
   Interior Gateway Protocol (IGP) areas and Autonomous Systems (ASes).

1.1.5. Server Network

   A Server Network is a network that provides connectivity for another
   network (the Client Network) in a client-server relationship.  A
   Server Network is sometimes referred to as an underlay network.

1.1.6. Client Network

   A Client Network is a network that uses the connectivity provided by
   a Server Network.  A Client Network is sometimes referred to as an
   overlay network.

1.1.7. Aggregation

   The concept of aggregation is discussed in Section 3.5.  In
   aggregation, multiple network resources from a domain are
   represented outside the domain as a single entity.  Thus, multiple
   links and nodes forming a TE connection may be represented as a
   single link, or a collection of nodes and links (perhaps the whole
   domain) may be represented as a single node with its attachment
   links.

1.1.8. Abstraction

   Section 4.2 introduces the concept of abstraction and distinguishes
   it from aggregation.  Abstraction may be viewed as "policy-based
   aggregation" where the policies are applied to overcome the issues
   with aggregation as identified in Section 3 of this document.
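   As a minimal sketch of "policy-based aggregation", the fragment
   below filters a domain's TE links through an operator policy before
   they are exposed outside the domain.  Everything here is invented
   for illustration: the minimum-bandwidth threshold stands in for an
   arbitrary operator policy, and real abstraction would summarize as
   well as filter.

```python
# A made-up policy: expose only links with at least a given amount of
# available bandwidth.  Any predicate expressing operator intent could
# be substituted here.
def abstract_links(te_links, min_bandwidth):
    """Return only the TE links that policy allows to be advertised
    outside the domain."""
    return [link for link in te_links if link["bandwidth"] >= min_bandwidth]

# A toy TED (Traffic Engineering Database) for one domain.
ted = [
    {"id": "A-B", "bandwidth": 10.0},
    {"id": "B-C", "bandwidth": 1.0},
    {"id": "A-C", "bandwidth": 40.0},
]

# Only links satisfying the policy appear in the abstracted topology
# offered to other domains; B-C is withheld.
exposed = abstract_links(ted, min_bandwidth=10.0)
```

   The point of the sketch is that the advertised view is shaped by the
   domain administrator's policy, not by a mechanical summary of every
   resource.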
   Abstraction is the process of applying policy to the available TE
   information within a domain to produce selective information that
   represents the potential ability to connect across the domain.
   Thus, abstraction does not necessarily offer all possible
   connectivity options, but presents a general view of potential
   connectivity according to the policies that determine how the
   domain's administrator wants to allow the domain resources to be
   used.

1.1.9. Abstract Link

   An abstract link is the representation of the characteristics of a
   path between two nodes in a domain produced by abstraction.  The
   abstract link is advertised outside that domain as a TE link for use
   in signaling in other domains.  Thus, an abstract link represents
   the potential to connect between a pair of nodes.

   More details of abstract links are provided in Section 4.2.1.

1.1.10. Abstract Node or Virtual Node

   An abstract node was defined in [RFC3209] as a group of nodes whose
   internal topology is opaque to an ingress node of the LSP.  More
   generally, an abstract node is the representation as a single node
   in a TE topology of some or all of the resources of one or more
   nodes and the links that connect them.  An abstract node may be
   advertised outside the domain as a TE node for use in path
   computation and signaling in other domains.

   The term virtual node has typically been applied to the aggregation
   of a domain (that is, a collection of nodes and links that operate
   as a single administrative entity for TE purposes) into a single
   entity that is treated as a node for the purposes of end-to-end
   traffic engineering.  Virtual nodes are often considered a way to
   present islands of single-vendor equipment in an optical network.

   Sections 3.5 and 4.2.2.1 provide more information about the uses
   and issues of abstract nodes and virtual nodes.

1.1.11. Abstraction Layer Network

   The abstraction layer network is introduced in Section 4.2.2.  It
   may be seen as a brokerage-layer network between one or more server
   networks and one or more client networks.  The abstraction layer
   network is the collection of abstract links that provide potential
   connectivity across the server networks and on which path
   computation can be performed to determine edge-to-edge paths that
   provide connectivity as links in the client network.

   In the simplest case, the abstraction layer network is just a set of
   edge-to-edge connections (i.e., abstract links), but to make the use
   of server network resources more flexible, the abstract links might
   not all extend from edge to edge, but might offer connectivity
   between server network nodes to form a more complex network.

2. Overview of Use Cases

2.1. Peer Networks

   The peer network use case can be most simply illustrated by the
   example in Figure 1.  A TE path is required between the source (Src)
   and destination (Dst), which are located in different domains.
   There are two points of interconnection between the domains, and
   selecting the wrong point of interconnection can lead to a
   sub-optimal path, or even fail to make a path available.  Note that
   peer networks are assumed to have the same technology type: that is,
   the same "switching capability", to use the term from GMPLS
   [RFC3945].

         --------------          --------------
         |  Domain A  |    x1    |  Domain Z  |
         |   -----    +----------+   -----    |
         |  | Src |   +----------+  | Dst |   |
         |   -----    |    x2    |   -----    |
         --------------          --------------

                   Figure 1 : Peer Networks

   For example, when Domain A attempts to select a path, it may
   determine that adequate bandwidth is available from Src through both
   interconnection points x1 and x2.  It may pick the path through x1
   for local policy reasons: perhaps the TE metric is smaller.
   However, if there is no connectivity in Domain Z from x1 to Dst, the
   path cannot be established.  Techniques such as crankback may be
   used to alleviate this situation, but they do not lead to rapid
   setup or guaranteed optimality.  Furthermore, RSVP signaling creates
   state in the network that is immediately removed by the crankback
   procedure.  Frequent events of this kind impact scalability in a
   non-deterministic manner.  More details of crankback can be found in
   Section A.2.

   There are countless more complicated examples of the problem of peer
   networks.  Figure 2 shows the case where there is a simple mesh of
   domains.  Clearly, to find a TE path from Src to Dst, Domain A must
   not select a path leaving through interconnect x1, since Domain B
   has no connectivity to Domain Z.  Furthermore, in deciding whether
   to select interconnection x2 (through Domain C) or interconnection
   x3 (through Domain D), Domain A must be sensitive to the TE
   connectivity available through each of Domains C and D, as well as
   to the TE connectivity from each of interconnections x4 and x5 to
   Dst within Domain Z.  The problem may be further complicated when
   the source domain does not know in which domain the destination node
   is located, since the choice of a domain path clearly depends on
   knowledge of the destination domain: this issue is obviously
   mitigated in IP networks by inter-domain routing [RFC4271].

   Of course, many network interconnection scenarios are going to be a
   combination of the situations expressed in these two examples.
   There may be a mesh of domains, and the domains may have multiple
   points of interconnection.
                          --------------
                          |  Domain B  |
                          |            |
                          |            |
                         /--------------
                        /
                       /x1
         --------------/                    --------------
         |  Domain A  |                     |  Domain Z  |
         |            |   --------------    |            |
         |   -----    |x2 |  Domain C  | x4 |   -----    |
         |  | Src |   +---+            +---+  | Dst |    |
         |   -----    |   |            |    |   -----    |
         |            |   --------------    |            |
         --------------\                   /--------------
                        \x3               /
                         \               /
                          \             /x5
                           \-----------/
                           |  Domain D |
                           |           |
                           |           |
                           -------------

                 Figure 2 : Peer Networks in a Mesh

2.2. Client-Server Networks

   Two major classes of use case relate to the client-server
   relationship between networks.  These use cases have sometimes been
   referred to as overlay networks.  In both cases, the client and
   server networks may have the same switching capability, or they may
   be built from nodes and links that have different technology types
   in the client and server networks.

   The first group of use cases, shown in Figure 3, occurs when domains
   belonging to one network are connected by a domain belonging to
   another network.  In this scenario, once connectivity is formed
   across the lower-layer network, the domains of the upper-layer
   network can be merged into a single domain by running IGP
   adjacencies and by treating the server network layer connectivity as
   links in the higher-layer network.  The TE relationship between the
   domains (higher and lower layers) in this case is reduced to
   determining what server network connectivity to establish, how to
   trigger it, how to route it in the server network, and what
   resources and capacity to assign within the server network layer.
   As the demands in the higher-layer (client) network vary, the
   connectivity in the server network may need to be modified.
   Section 2.4 explains in a little more detail how connectivity may be
   requested.
      ------------------                  ------------------
      | Client Network |                  | Client Network |
      |    Domain A    |                  |    Domain B    |
      |                |                  |                |
      |     -----      |                  |     -----      |
      |    | Src |     |                  |    | Dst |     |
      |     -----      |                  |     -----      |
      |                |                  |                |
      ------------------\                /------------------
                         \x1          x2/
                          \            /
                           \          /
                          ------------------
                          | Server Network |
                          |     Domain     |
                          |                |
                          ------------------

                  Figure 3 : Client-Server Networks

   The second class of use case of client-server networking is for
   Virtual Private Networks (VPNs).  In this case, as opposed to the
   former one, it is assumed that the client network has a different
   address space from that of the server network, and that
   non-overlapping IP addresses between the client and server networks
   cannot be guaranteed.  A simple example is shown in Figure 4.  The
   VPN sites comprise a set of domains that are interconnected over a
   core domain, the provider network, which is the server network in
   our model.

   Note that in the use cases shown in Figures 3 and 4 the client
   network domains may (and, in fact, probably do) operate as a single
   connected network.

      --------------                      --------------
      |  Domain A  |                      |  Domain Z  |
      | (VPN site) |                      | (VPN site) |
      |            |                      |            |
      |   -----    |                      |   -----    |
      |  | Src |   |                      |  | Dst |   |
      |   -----    |                      |   -----    |
      |            |                      |            |
      --------------\                    /--------------
                     \x1              x2/
                      \                /
                       \              /
                       ----------------
                       |  Core Domain |
                       |              |
                       |              |
                       ----------------
                      /                \
                     /                  \
                    /x3                x4\
      --------------/                    \--------------
      |  Domain B  |                      |  Domain C  |
      | (VPN site) |                      | (VPN site) |
      |            |                      |            |
      |            |                      |            |
      --------------                      --------------

                Figure 4 : A Virtual Private Network

   Both use cases in this section become "more interesting" when
   combined with the use case in Section 2.1.  That is, when the
   connectivity between higher-layer domains or VPN sites is provided
   by a sequence or mesh of lower-layer domains.
   Figure 5 shows how this might look in the case of a VPN.

      ------------                            ------------
      | Domain A  |                           | Domain Z  |
      | (VPN site)|                           | (VPN site)|
      |   -----   |                           |   -----   |
      |  | Src |  |                           |  | Dst |  |
      |   -----   |                           |   -----   |
      |           |                           |           |
      ------------\                          /------------
                   \x1                    x2/
                    \                      /
                     \                    /
                     ----------      ----------
                     | Domain X | x5 | Domain Y |
                     |  (core)  +----+  (core)  |
                     |          |    |          |
                     |          +----+          |
                     |          | x6 |          |
                     ----------      ----------
                    /                          \
                   /                            \
                  /x3                          x4\
      ------------/                              \------------
      | Domain B  |                              | Domain C  |
      | (VPN site)|                              | (VPN site)|
      |           |                              |           |
      ------------                               ------------

        Figure 5 : A VPN Supported Over Multiple Server Domains

2.3. Dual-Homing

   A further complication may be added to the client-server
   relationship described in Section 2.2 by considering what happens
   when a client network domain is attached to more than one domain in
   the server network, or has two points of attachment to a server
   network domain.  Figure 6 shows an example of this for a VPN.

                                    ------------
                                    | Domain B  |
                                    | (VPN site)|
      ------------                  |   -----   |
      | Domain A  |                 |  | Src |  |
      | (VPN site)|                 |   -----   |
      |           |                 |           |
      ------------\                 -+--------+-
                   \x1               |        |
                    \              x2|        |x3
                     \               |        |       ------------
                     -+--------     -+--------        | Domain C  |
                     | Domain X | x8 | Domain Y |  x4 | (VPN site)|
                     |  (core)  +----+  (core)  +----+   -----    |
                     |          |    |          |    |  | Dst |   |
                     |          +----+          +----+   -----    |
                     |          | x9 |          | x5 |            |
                     ----------      ----------       ------------
                    /                          \
                   /                            \
                  /x6                          x7\
      ------------/                              \------------
      | Domain D  |                              | Domain E  |
      | (VPN site)|                              | (VPN site)|
      |           |                              |           |
      ------------                               ------------

          Figure 6 : Dual-Homing in a Virtual Private Network

2.4. Requesting Connectivity

   The relationship between domains can be entirely under the control
   of management processes, dynamically triggered by the client
   network, or some hybrid of these cases.
   In the management case, the server network may be requested to
   establish a set of LSPs to provide client network connectivity.  In
   the dynamic case, the client network may make a request to the
   server network exerting a range of controls over the paths selected
   in the server network.  This range extends from no control (i.e., a
   simple request for connectivity), through a set of constraints (such
   as latency, path protection, etc.), up to and including full control
   of the path and resources used in the server network (i.e., the use
   of explicit paths with label subobjects).

   There are various models by which a server network can be requested
   to set up the connections that support a service provided to the
   client network.  These requests may come from management systems,
   directly from the client network control plane, or through an
   intermediary broker such as the Virtual Network Topology Manager
   (VNTM) [RFC5623].

   The trigger that causes the request to the server network is also
   flexible.  It could be that the client network discovers a pressing
   need for server network resources (such as the desire to provision
   an end-to-end connection in the client network or severe congestion
   on a specific path), or it might be that a planning application has
   considered how best to optimize traffic in the client network or how
   to handle a predicted traffic demand.

   In all cases, the relationship between client and server networks is
   subject to policy so that server network resources are under the
   administrative control of the operator of the server network and are
   only used to support a client network in ways that the server
   network operator approves.

   As just noted, connectivity requests issued to a server network may
   include varying degrees of constraint upon the choice of path that
   the server network can implement.
   o  Basic Provisioning is a simple request for connectivity.  The
      only constraints are the end points of the connection and the
      capacity (bandwidth) that the connection will support for the
      client network.  In the case of some server networks, even the
      bandwidth component of a basic provisioning request is
      superfluous because the server network has no facility to vary
      bandwidth, but can offer connectivity only at a default capacity.

   o  Basic Provisioning with Optimization is a service request that
      indicates one or more metrics that the server network must
      optimize in its selection of a path.  Metrics may be hop count,
      path length, summed TE metric, jitter, delay, or any number of
      technology-specific constraints.

   o  Basic Provisioning with Optimization and Constraints enhances the
      optimization process to apply absolute constraints to functions
      of the path metrics.  For example, a connection may be requested
      that optimizes for the shortest path, but in any case requests
      that the end-to-end delay be less than a certain value.  Equally,
      optimization may be expressed in terms of the impact on the
      network.  For example, a service may be requested in order to
      leave maximal flexibility to satisfy future service requests.

   o  Fate Diversity requests ask the server network to provide a path
      that does not use any network resources (usually links and nodes)
      that share fate (i.e., can fail as the result of a single event)
      with the resources used by another connection.  This allows the
      client network to construct protection services over the server
      network, for example by establishing links that are known to be
      fate diverse.  The connections that have diverse paths need not
      share end points.

   o  Provisioning with Fate Sharing is the exact opposite of Fate
      Diversity.  In this case two or more connections are requested to
      follow the same path in the server network.
This may be requested, 669 for example, to create a bundled or aggregated link in the client 670 network where each component of the client layer composite link is 671 required to have the same server network properties (metrics, 672 delay, etc.) and the same failure characteristics. 674 o Concurrent Provisioning enables the inter-related connection 675 requests described in the previous two bullets to be enacted 676 through a single, compound service request. 678 o Service Resilience requests the server network to provide 679 connectivity for which the server network takes responsibility to 680 recover from faults. The resilience may be achieved through the 681 use of link-level protection, segment protection, end-to-end 682 protection, or recovery mechanisms. 684 2.4.1. Discovering Server Network Information 686 Although the topology and resource availability information of a 687 server network may be hidden from the client network, the service 688 request interface may support features that report details about the 689 services and potential services that the server network supports. 691 o Reporting of path details, service parameters, and issues such as 692 path diversity of LSPs that support deployed services allows the 693 client network to understand to what extent its requests were 694 satisfied. This is particularly important when the requests were 695 made as "best effort". 697 o A server network may support requests of the form "if I were to ask 698 you for this service, would you be able to provide it?" That is, 699 a service request that does everything except actually provision 700 the service. 702 3. Problem Statement 704 The problem statement presented in this section is as much about the 705 issues that may arise in any solution (and so have to be avoided) 706 and the features that are desirable within a solution, as it is about 707 the actual problem to be solved.
709 The problem can be stated very simply and with reference to the use 710 cases presented in the previous section. 712 A mechanism is required that allows TE-path computation in one 713 domain to make informed choices about the TE-capabilities and exit 714 points from the domain when signaling an end-to-end TE path that 715 will extend across multiple domains. 717 Thus, the problem is one of information collection and presentation, 718 not about signaling. Indeed, the existing signaling mechanisms for 719 TE LSP establishment are likely to prove adequate [RFC4726] with the 720 possibility of minor extensions. Similarly, TE information may 721 currently be distributed in a domain by TE extensions to one of the 722 two IGPs as described in OSPF-TE [RFC3630] and ISIS-TE [RFC5305], 723 and TE information may be exported from a domain (for example, 724 northbound) using link state extensions to BGP [RFC7752]. 726 An interesting annex to the problem is how the path is made available 727 for use. For example, in the case of a client-server network, the 728 path established in the server network needs to be made available as 729 a TE link to provide connectivity in the client network. 731 3.1. Policy and Filters 733 A solution must be amenable to the application of policy and filters. 734 That is, the operator of a domain that is sharing information with 735 another domain must be able to apply controls to what information is 736 shared. Furthermore, the operator of a domain that has information 737 shared with it must be able to apply policies and filters to the 738 received information. 740 Additionally, the path computation within a domain must be able to 741 weight the information received from other domains according to local 742 policy such that the resultant computed path meets the local 743 operator's needs and policies rather than those of the operators of 744 other domains. 746 3.2. 
Confidentiality 748 A feature of the policy described in Section 3.1 is that an operator 749 of a domain may desire to keep confidential the details about its 750 internal network topology and loading. This information could be 751 construed as commercially sensitive. 753 Although it is possible that TE information exchange will take place 754 only between parties that have significant trust, there are also use 755 cases (such as the VPN supported over multiple server network domains 756 described in Section 2.4) where information will be shared between 757 domains that have a commercial relationship, but a low level of 758 trust. 760 Thus, it must be possible for a domain to limit the information shared 761 to just that which the computing domain needs to know, with the 762 understanding that the less information that is made available, the 763 more likely it is that the result will be a less optimal path and/or 764 more crankback events. 766 3.3. Information Overload 768 One reason that networks are partitioned into separate domains is to 769 reduce the set of information that any one router has to handle. 770 This also applies to the volume of information that routing protocols 771 have to distribute. 773 Over the years routers have become more sophisticated with greater 774 processing capabilities and more storage, the control channels on 775 which routing messages are exchanged have become higher capacity, and 776 the routing protocols (and their implementations) have become more 777 robust. Thus, some of the arguments in favor of dividing a network 778 into domains may have been reduced. Conversely, however, the size of 779 networks continues to grow dramatically with a consequent increase in 780 the total amount of routing-related information available. 781 Additionally, in this case, the problem space spans two or more 782 networks. 784 Any solution to the problems voiced in this document must be aware of 785 the issues of information overload.
If the solution were simply to 786 share all TE information between all domains in the network, the 787 effect from the point of view of the information load would be to 788 create one single flat network domain. Thus, the solution must 789 deliver enough information to make the computation practical (i.e., 790 to solve the problem), but not so much as to overload the receiving 791 domain. Furthermore, the solution cannot simply rely on the policies 792 and filters described in Section 3.1 because such filters might not 793 always be enabled. 795 3.4. Issues of Information Churn 797 As LSPs are set up and torn down, the available TE resources on links 798 in the network change. In order to reliably compute a TE path 799 through a network, the computation point must have an up-to-date view 800 of the available TE resources. However, collecting this information 801 may result in considerable load on the distribution protocol and 802 churn in the stored information. In order to deal with this problem 803 even in a single domain, updates are sent at periodic intervals or 804 whenever there is a significant change in resources, whichever 805 happens first. 807 Consider, for example, that a TE LSP may traverse ten links in a 808 network. When the LSP is set up or torn down, the resources 809 available on each link will change, resulting in a new advertisement 810 of the link's capabilities and capacity. If the arrival rate of new 811 LSPs is relatively fast, and the hold times relatively short, the 812 network may be in a constant state of flux. Note that the 813 problem here is not limited to churn within a single domain, since 814 the information shared between domains will also be changing. 815 Furthermore, the information that one domain needs to share with 816 another may change as the result of LSPs that are contained within or 817 cross the first domain but which are of no direct relevance to the 818 domain receiving the TE information.
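The update strategy just described (send an update on a periodic interval or on a significant change in resources, whichever happens first) can be sketched in a few lines of Python. All names and values here are illustrative only and are not drawn from any protocol specification:

```python
# Illustrative sketch of the advertisement strategy described above:
# a router re-advertises a link either when a periodic refresh timer
# expires or when the change since the last advertisement is
# "significant", whichever happens first.  All names and values are
# hypothetical, not taken from any IGP specification.
import time

REFRESH_INTERVAL = 30 * 60     # periodic refresh timer, in seconds
SIGNIFICANT_FRACTION = 0.10    # 10% of total link capacity

class LinkAdvertiser:
    def __init__(self, total_bandwidth):
        self.total = total_bandwidth
        self.last_advertised = total_bandwidth
        self.last_refresh = time.monotonic()

    def on_resource_change(self, available_now):
        """Return True if a new advertisement should be flooded."""
        timer_expired = (time.monotonic() - self.last_refresh
                         >= REFRESH_INTERVAL)
        significant = (abs(available_now - self.last_advertised)
                       >= SIGNIFICANT_FRACTION * self.total)
        if timer_expired or significant:
            self.last_advertised = available_now
            self.last_refresh = time.monotonic()
            return True
        return False  # suppressed: small change and timer not yet expired
```

A larger significance threshold reduces flooding at the cost of a staler view at the computation point, which is exactly the tension this section describes.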
820 In packet networks, where the capacity of an LSP is often a small 821 fraction of the resources available on any link, this issue is 822 partially addressed by the advertising routers. They can apply a 823 threshold so that they do not bother to update the advertisement of 824 available resources on a link if the change is less than a configured 825 percentage of the total (or alternatively, the remaining) resources. 826 The updated information in that case will be disseminated based on an 827 update interval rather than a resource change event. 829 In non-packet networks, where link resources are physical switching 830 resources (such as timeslots or wavelengths), the capacity of an LSP 831 may more frequently be a significant percentage of the available link 832 resources. Furthermore, in some switching environments, it is 833 necessary to achieve end-to-end resource continuity (such as using 834 the same wavelength on the whole length of an LSP), so it is far more 835 desirable to keep the TE information held at the computation points 836 up-to-date. Fortunately, non-packet networks tend to be quite a bit 837 smaller than packet networks, the arrival rates of non-packet LSPs 838 are much lower, and the hold times considerably longer. Thus, the 839 information churn may be sustainable. 841 3.5. Issues of Aggregation 843 One possible solution to the issues raised in other sub-sections of 844 this section is to aggregate the TE information shared between 845 domains. Two aggregation mechanisms are often considered: 847 - Virtual node model. In this view, the domain is aggregated as if 848 it were a single node (or router / switch). Its links to other 849 domains are presented as real TE links, but the model assumes that 850 any LSP entering the virtual node through a link can be routed to 851 leave the virtual node through any other link (although recent work 852 on "limited cross-connect switches" may help with this problem 854 [RFC7579]).
856 - Virtual link model. In this model, the domain is reduced to a set 857 of edge-to-edge TE links. Thus, when computing a path for an LSP 858 that crosses the domain, a computation point can see which domain 859 entry points can be connected to which others, and with what TE 860 attributes. 862 It is in the nature of aggregation that information is removed from 863 the system. This can cause inaccuracies and failed path computation. 864 For example, in the virtual node model there might not actually be a 865 TE path available between a pair of domain entry points, but the 866 model lacks the sophistication to represent this "limited cross- 867 connect capability" within the virtual node. On the other hand, in 868 the virtual link model it may prove very hard to aggregate multiple 869 link characteristics: for example, there may be one path available 870 with high bandwidth, and another with low delay, but this does not 871 mean that the connectivity should be assumed or advertised as having 872 both high bandwidth and low delay. 874 The trick to this multidimensional problem, therefore, is to 875 aggregate in a way that retains as much useful information as 876 possible while removing the data that is not needed. An important 877 part of this trick is a clear understanding of what information is 878 actually needed. 880 It should also be noted in the context of Section 3.4 that changes in 881 the information within a domain may have a bearing on what aggregated 882 data is shared with another domain. Thus, while the data shared is 883 reduced, the aggregation algorithm (operating on the routers 884 responsible for sharing information) may be heavily exercised. 886 4. Architecture 888 4.1. TE Reachability 890 As described in Section 1.1, TE reachability is the ability to reach 891 a specific address along a TE path. The knowledge of TE reachability 892 enables an end-to-end TE path to be computed.
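As a purely illustrative rendering of this concept, a TE reachability entry can be modeled as a record that names a reachable destination and qualifies it with TE attributes such as metric, hop count, bandwidth, and delay. The Python sketch below is hypothetical; the field names are not drawn from any protocol encoding:

```python
# Hypothetical sketch of a TE reachability entry: "you can reach this
# address along a TE path with at least these attributes".  All field
# names are illustrative, not taken from any protocol specification.
from dataclasses import dataclass

@dataclass(frozen=True)
class TEReachability:
    destination: str             # address reachable along a TE path
    te_metric: int               # summed TE metric of the path
    hop_count: int
    available_bandwidth: float   # e.g., in Mbps
    delay_ms: float

    def satisfies(self, min_bandwidth=0.0, max_delay_ms=float("inf")):
        """Can this entry support a request with the given constraints?"""
        return (self.available_bandwidth >= min_bandwidth
                and self.delay_ms <= max_delay_ms)
```

A path computation element consulting such entries can quickly discard destinations whose advertised attributes cannot satisfy the constraints of a request.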
894 In a single network, TE reachability is derived from the Traffic 895 Engineering Database (TED) that is the collection of all TE 896 information about all TE links in the network. The TED is usually 897 built from the data exchanged by the IGP, although it can be 898 supplemented by configuration and inventory details especially in 899 transport networks. 901 In multi-network scenarios, TE reachability information can be 902 described as "You can get from node X to node Y with the following 903 TE attributes." For transit cases, nodes X and Y will be edge nodes 904 of the transit network, but it is also important to consider the 905 information about the TE connectivity between an edge node and a 906 specific destination node. TE reachability may be qualified by TE 907 attributes such as TE metrics, hop count, available bandwidth, delay, 908 shared risk, etc. 910 TE reachability information can be exchanged between networks so that 911 nodes in one network can determine whether they can establish TE 912 paths across or into another network. Such exchanges are subject to 913 a range of policies imposed by the advertiser (for security and 914 administrative control) and by the receiver (for scalability and 915 stability). 917 4.2. Abstraction not Aggregation 919 Aggregation is the process of synthesizing from available 920 information. Thus, the virtual node and virtual link models 921 described in Section 3.5 rely on processing the information available 922 within a network to produce the aggregate representations of links 923 and nodes that are presented to the consumer. As described in 924 Section 3, dynamic aggregation is subject to a number of pitfalls. 926 In order to distinguish the architecture described in this document 927 from the previous work on aggregation, we use the term "abstraction" 928 in this document. 
The process of abstraction is one of applying 929 policy to the available TE information within a domain, to produce 930 selective information that represents the potential ability to 931 connect across the domain. 933 Abstraction does not offer all possible connectivity options (refer 934 to Section 3.5), but does present a general view of potential 935 connectivity. Abstraction may have a dynamic element, but is not 936 intended to keep pace with the changes in TE attribute availability 937 within the network. 939 Thus, when relying on an abstraction to compute an end-to-end path, 940 the process might not deliver a usable path. That is, there is no 941 actual guarantee that the abstractions are current or feasible. 943 While abstraction uses available TE information, it is subject to 944 policy and management choices. Thus, not all potential connectivity 945 will be advertised to each client network. The filters may depend on 946 commercial relationships, the risk of disclosing confidential 947 information, and concerns about what use is made of the connectivity 948 that is offered. 950 4.2.1. Abstract Links 952 An abstract link is a measure of the potential to connect a pair of 953 points with certain TE parameters. That is, it is a path and its 954 characteristics in the server network. An abstract link represents 955 the possibility of setting up an LSP, and LSPs may be set up over the 956 abstract link. 958 When looking at a network such as that in Figure 7, the link from CN1 959 to CN4 may be an abstract link. It is easy to advertise it as a link 960 by abstracting the TE information in the server network subject to 961 policy. 963 The path (i.e., the abstract link) represents the possibility of 964 establishing an LSP from client network edge to client network edge 965 across the server network. There is not necessarily a one-to-one 966 relationship between abstract link and LSP because more than one LSP 967 could be set up over the path. 
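A hypothetical sketch of this idea in Python follows: policy acts as a filter over candidate edge-to-edge paths in the server network, and each path that survives is exposed as an abstract link that reveals its end points and selected TE parameters but not its internal hops. All names here are illustrative:

```python
# Illustrative sketch (all names hypothetical) of abstraction as policy
# applied to server-network TE information: from the set of candidate
# edge-to-edge paths, advertise only those that policy permits, and
# expose each survivor as a single abstract link that reveals its end
# points and capacity but hides the server network's internal hops.

def abstract_links(candidate_paths, policy):
    """candidate_paths: iterable of dicts with 'ends', 'hops', 'bandwidth'.
    policy: predicate deciding what may be exposed to this client."""
    links = []
    for path in candidate_paths:
        if policy(path):
            links.append({
                "ends": path["ends"],            # e.g., ("CN1", "CN4")
                "bandwidth": path["bandwidth"],  # advertised capacity
                # internal hops deliberately omitted: the client sees
                # only the potential to connect, not the server topology
            })
    return links
```

Different clients may be given different policy predicates, which is how commercial relationships and confidentiality concerns shape what each client is shown.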
969 Since the client network nodes do not have visibility into the server 970 network, they must rely on abstraction information delivered to them 971 by the server network. That is, the server network will report on 972 the potential for connectivity. 974 4.2.2. The Abstraction Layer Network 976 Figure 7 introduces the abstraction layer network. This construct 977 separates the client network resources (nodes C1, C2, C3, and C4, and 978 the corresponding links), and the server network resources (nodes 979 CN1, CN2, CN3, and CN4 and the corresponding links). Additionally, 980 the architecture introduces an intermediary network layer called the 981 abstraction layer. The abstraction layer contains the client network 982 edge nodes (C2 and C3), the server network edge nodes (CN1 and CN4), 983 the client-server links (C2-CN1 and CN4-C3) and the abstract link 984 CN1-CN4. 986 The client network is able to operate as normal. Connectivity across 987 the network can either be found or not found based on links that 988 appear in the client network TED. If connectivity cannot be found, 989 end-to-end LSPs cannot be set up. This failure may be reported, but 990 no dynamic action is taken by the client network. 992 The server network also operates as normal. LSPs across the server 993 network between client network edges are set up in response to 994 management commands or in response to signaling requests. 996 The abstraction layer consists of the physical links between the 997 two networks, and also the abstract links. The abstract links are 998 created by the server network according to local policy and represent 999 the potential connectivity that could be created across the server 1000 network and which the server network is willing to make available for 1001 use by the client network. 
Thus, in this example, the diameter of 1002 the abstraction layer network is only three hops, but an instance of 1003 an IGP could easily be run so that all nodes participating in the 1004 abstraction layer (and in particular the client network edge nodes) 1005 can see the TE connectivity in the layer. 1007 -- -- -- -- 1008 |C1|--|C2| |C3|--|C4| Client Network 1009 -- | | | | -- 1010 | | | | . . . . . . . . . . . 1011 | | | | 1012 | | | | 1013 | | --- --- | | Abstraction 1014 | |---|CN1|================|CN4|---| | Layer Network 1015 -- | | | | -- 1016 | | | | . . . . . . . . . . . . . . 1017 | | | | 1018 | | | | 1019 | | --- --- | | Server Network 1020 | |--|CN2|--|CN3|--| | 1021 --- --- --- --- 1023 Key 1024 --- Direct connection between two nodes 1025 === Abstract link 1027 Figure 7 : Architecture for Abstraction Layer Network 1029 When the client network needs additional connectivity it can make a 1030 request to the abstraction layer network. For example, the operator 1031 of the client network may want to create a link from C2 to C3. The 1032 abstraction layer can see the potential path C2-CN1-CN4-C3 and can 1033 set up an LSP C2-CN1-CN4-C3 across the server network and make the 1034 LSP available as a link in the client network. 1036 Sections 4.2.3 and 4.2.4 show how this model is used to satisfy the 1037 requirements for connectivity in client-server networks and in peer 1038 networks. 1040 4.2.2.1. Nodes in the Abstraction Layer Network 1042 Figure 7 shows a very simplified network diagram and the reader would 1043 be forgiven for thinking that only client network edge nodes and 1044 server network edge nodes may appear in the abstraction layer 1045 network. But this is not the case: other nodes from the server 1046 network may be present. This allows the abstraction layer network 1047 to be more complex than a full mesh with access spokes. 
1049 Thus, as shown in Figure 8, a transit node in the server network 1050 (here the node is CN3) can be exposed as a node in the abstraction 1051 layer network with abstract links connecting it to other nodes in 1052 the abstraction layer network. Of course, in the network shown in 1053 Figure 8, there is little if any value in exposing CN3, but if it 1054 had other abstract links to other nodes in the abstraction layer 1055 network and/or direct connections to client network nodes, then the 1056 resulting network would be richer. 1058 -- -- -- -- Client 1059 |C1|--|C2| |C3|--|C4| Network 1060 -- | | | | -- 1061 | | | | . . . . . . . . . 1062 | | | | 1063 | | | | 1064 | | --- --- --- | | Abstraction 1065 | |--|CN1|========|CN3|========|CN5|--| | Layer Network 1066 -- | | | | | | -- 1067 | | | | | | . . . . . . . . . . . . 1068 | | | | | | 1069 | | | | | | Server 1070 | | --- | | --- | | Network 1071 | |--|CN2|-| |-|CN4|--| | 1072 --- --- --- --- --- 1074 Figure 8 : Abstraction Layer Network with Additional Node 1076 It should be noted that the nodes included in the abstraction layer 1077 network in this way are not "abstract nodes" in the sense of a 1078 virtual node described in Section 3.5. While it is the case that 1079 the policy point responsible for advertising server network resources 1080 into the abstraction layer network could choose to advertise abstract 1081 nodes in place of real physical nodes, it is believed that doing so 1082 would introduce significant complexity in terms of: 1084 - Coordination between all of the external interfaces of the abstract 1085 node 1087 - Management of changes in the server network that lead to limited 1088 capabilities to reach (cross-connect) across the Abstract Node. It 1089 may be noted that recent work on limited cross-connect capabilities 1090 such as exist in asymmetrical switches could be used to represent 1091 the limitations in an abstract node [RFC7579], [RFC7580]. 1093 4.2.3. 
Abstraction in Client-Server Networks 1095 Figure 9 shows the basic architectural concepts for a client-server 1096 network. The client network nodes are C1, C2, CE1, CE2, C3, and C4. 1097 The server (core) network nodes are CN1, CN2, CN3, and CN4. The 1098 interfaces CE1-CN1 and CE2-CN4 are the interfaces between the client 1099 and server networks. 1101 The technologies (switching capabilities) of the client and server 1102 networks may be the same or different. If they are different, the 1103 client network traffic must be tunneled over a server network LSP. 1104 If they are the same, the client network LSP may be routed over the 1105 server network links, tunneled over a server network LSP, or 1106 constructed from the concatenation (stitching) of client network and 1107 server network LSP segments. 1109 : : 1110 Client Network : Server Network : Client Network 1111 : : 1112 -- -- --- --- -- -- 1113 |C1|--|C2|--|CE1|................................|CE2|--|C3|--|C4| 1114 -- -- | | --- --- | | -- -- 1115 | |===|CN1|================|CN4|===| | 1116 | |---| | | |---| | 1117 --- | | --- --- | | --- 1118 | |--|CN2|--|CN3|--| | 1119 --- --- --- --- 1121 Key 1122 --- Direct connection between two nodes 1123 ... CE-to-CE LSP tunnel 1124 === Potential path across the server network (abstract link) 1126 Figure 9 : Architecture for Client-Server Network 1128 The objective is to be able to support an end-to-end connection, 1129 C1-to-C4, in the client network. This connection may support TE or 1130 normal IP forwarding. To achieve this, CE1 is to be connected to CE2 1131 by a link in the client network. This enables the client network to 1132 view itself as connected and to select an end-to-end path. 1134 As shown in the figure, three abstraction layer links are formed: 1135 CE1-CN1, CN1-CN4, and CN4-CE2. A three-hop LSP is then established 1136 from CE1 to CE2 that can be presented as a link in the client 1137 network.
1139 The practicalities of how the CE1-CE2 LSP is carried across the 1140 server network LSP may depend on the switching and signaling options 1141 available in the server network. The LSP may be tunneled down the 1142 server network LSP using the mechanisms of a hierarchical LSP 1143 [RFC4206], or the LSP segments CE1-CN1 and CN4-CE2 may be stitched to 1144 the server network LSP as described in [RFC5150]. 1146 Section 4.2.2 has already introduced the concept of the abstraction 1147 layer network through an example of a simple layered network. But it 1148 may be helpful to expand on the example using a slightly more complex 1149 network. 1151 Figure 10 shows a multi-layer network comprising client network nodes 1152 (labeled as Cn for n = 0 to 9) and server network nodes (labeled as Sn 1153 for n = 1 to 9). 1155 -- -- 1156 |C3|---|C4| 1157 /-- --\ 1158 -- -- -- -- --/ \-- 1159 |C1|---|C2|---|S1|---|S2|----|S3| |C5| 1160 -- /-- --\ --\ --\ /-- 1161 / \-- \-- \-- --/ -- 1162 / |S4| |S5|----|S6|---|C6|---|C7| 1163 / /-- --\ /-- /-- -- 1164 --/ -- --/ -- \--/ --/ 1165 |C8|---|C9|---|S7|---|S8|----|S9|---|C0| 1166 -- -- -- -- -- -- 1168 Figure 10 : An example Multi-Layer Network 1170 If the network in Figure 10 is operated as separate client and server 1171 networks, then the client network topology will appear as shown in 1173 -- -- 1174 |C3|---|C4| 1175 -- --\ 1176 -- -- \-- 1177 |C1|---|C2| |C5| 1178 -- /-- /-- 1179 / --/ -- 1180 / |C6|---|C7| 1181 / /-- -- 1182 --/ -- --/ 1183 |C8|---|C9| |C0| 1184 -- -- -- 1186 Figure 11 : Client Network Topology Showing Partitioned Network 1188 Figure 11. As can be clearly seen, the network is partitioned and 1189 there is no way to set up an LSP from a node on the left hand side 1190 (say C1) to a node on the right hand side (say C7). 1192 For reference, Figure 12 shows the corresponding server network 1193 topology.
1195 -- -- -- 1196 |S1|---|S2|----|S3| 1197 --\ --\ --\ 1198 \-- \-- \-- 1199 |S4| |S5|----|S6| 1200 /-- --\ /-- 1201 --/ -- \--/ 1202 |S7|---|S8|----|S9| 1203 -- -- -- 1205 Figure 12 : Server Network Topology 1207 Operating on the TED for the server network, a management entity or a 1208 software component may apply policy and consider what abstract links 1209 it might offer for use by the client network. To do this it 1210 obviously needs to be aware of the connections between the layers 1211 (there is no point in offering an abstract link S2-S8 since this 1212 could not be of any use in this example). 1214 In our example, after consideration of which LSPs could be set up in 1215 the server network, four abstract links are offered: S1-S3, S3-S6, 1216 S1-S9, and S7-S9. These abstract links are shown as double lines on 1217 the resulting topology of the abstraction layer network in Figure 13. 1218 As can be seen, two of the links must share part of a path (S1-S9 1219 must share with either S1-S3 or with S7-S9). This could be achieved 1221 -- 1222 |C3| 1223 /-- 1224 -- -- --/ 1225 |C2|---|S1|==========|S3| 1226 -- --\\ --\\ 1227 \\ \\ 1228 \\ \\-- -- 1229 \\ |S6|---|C6| 1230 \\ -- -- 1231 -- -- \\-- -- 1232 |C9|---|S7|=====|S9|---|C0| 1233 -- -- -- -- 1235 Figure 13 : Abstraction Layer Network with Abstract Links 1237 using distinct resources (for example, separate lambdas) where the 1238 paths are common, but it could also be done using resource sharing. 1240 That would mean that when both paths S1-S3 and S7-S9 carry client- 1241 edge to client-edge LSPs the resources on the path S1-S9 are used and 1242 might be depleted to the point that the path is resource constrained 1243 and cannot be used. 1245 The separate IGP instance running in the abstraction layer network 1246 means that this topology is visible at the edge nodes (C2, C3, C6, 1247 C9, and C0) as well as at a PCE if one is present. 
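The abstraction layer topology of Figure 13 is small enough to compute over directly. The following Python sketch is illustrative only (a real PCE would run a constraint-aware computation on the abstraction layer TED); the node and link names are taken from the example, and a plain breadth-first search stands in for the path computation:

```python
# Sketch of path computation over the abstraction layer network of
# Figure 13.  Node names come from the example; the breadth-first
# search is illustrative only -- a real PCE would apply TE constraints
# (bandwidth, metrics, diversity) rather than minimize hop count.
from collections import deque

# Client-server links plus the four abstract links offered by the
# server network (S1-S3, S3-S6, S1-S9, and S7-S9).
LINKS = [
    ("C2", "S1"), ("C3", "S3"), ("C6", "S6"), ("C9", "S7"), ("C0", "S9"),
    ("S1", "S3"), ("S3", "S6"), ("S1", "S9"), ("S7", "S9"),  # abstract
]

def shortest_path(src, dst):
    """Return the fewest-hop path from src to dst, or None."""
    adj = {}
    for a, b in LINKS:
        adj.setdefault(a, []).append(b)
        adj.setdefault(b, []).append(a)
    queue, seen = deque([[src]]), {src}
    while queue:
        path = queue.popleft()
        if path[-1] == dst:
            return path
        for nxt in adj.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None
```

Running this over the example topology yields the paths C2-S1-S3-C3 and C2-S1-S9-C0 used in the connectivity requests discussed in this section.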
1249 Now the client network is able to make requests to the abstraction 1250 layer network to provide connectivity. In our example, it requests 1251 that C2 is connected to C3 and that C2 is connected to C0. This 1252 results in several actions: 1254 1. The management component for the abstraction layer network asks 1255 its PCE to compute the paths necessary to make the connections. 1256 This yields C2-S1-S3-C3 and C2-S1-S9-C0. 1258 2. The management component for the abstraction layer network 1259 instructs C2 to start the signaling process for the new LSPs in 1260 the abstraction layer. 1262 3. C2 signals the LSPs for setup using the explicit routes 1263 C2-S1-S3-C3 and C2-S1-S9-C0. 1265 4. When the signaling messages reach S1 (in our example, both LSPs 1266 traverse S1), the server network may support them by a number of 1267 means, including establishing server network LSPs as tunnels, 1268 depending on the mismatch of technologies between the client and 1269 server networks. For example, S1-S2-S3 and S1-S2-S5-S9 might be 1270 traversed via an LSP tunnel, using LSPs stitched together, or 1271 simply by routing the client network LSP through the server 1272 network. If server network LSPs are needed, they can be 1273 signaled at this point. 1275 5. Once any server network LSPs that are needed have been 1276 established, S1 can continue to signal the client-edge to client- 1277 edge LSP across the abstraction layer either using the server 1278 network LSPs as tunnels or as stitching segments, or simply 1279 routing through the server network. 1281 6. Finally, once the client-edge to client-edge LSPs have been set 1282 up, the client network can be informed and can start to advertise 1283 the new TE links C2-C3 and C2-C0. The resulting client network 1284 topology is shown in Figure 14.
1286 -- -- 1287 |C3|-|C4| 1288 /-- --\ 1289 / \-- 1290 -- --/ |C5| 1291 |C1|---|C2| /-- 1292 -- /--\ --/ -- 1293 / \ |C6|---|C7| 1294 / \ /-- -- 1295 / \--/ 1296 --/ -- |C0| 1297 |C8|---|C9| -- 1298 -- -- 1300 Figure 14 : Connected Client Network with Additional Links 1302 7. Now the client network can compute an end-to-end path from C1 to 1303 C7. 1305 4.2.3.1 A Server with Multiple Clients 1307 A single server network may support multiple client networks. This 1308 is not an uncommon state of affairs, for example, when the server 1309 network provides connectivity for multiple customers. 1311 In this case, the abstraction provided by the server network may vary 1312 considerably according to the policies and commercial relationships 1313 with each customer. This variance would lead to a separate 1314 abstraction layer network maintained to support each client network. 1316 On the other hand, it may be that multiple client networks are 1317 subject to the same policies and the abstraction can be identical. 1318 In this case, a single abstraction layer network can support more 1319 than one client. 1321 The choices here are made as an operational issue by the server 1322 network. 1324 4.2.3.2 A Client with Multiple Servers 1326 A single client network may be supported by multiple server networks. 1327 The server networks may provide connectivity between different parts 1328 of the client network or may provide parallel (redundant) 1329 connectivity for the client network. 1331 In this case, the abstraction layer network should contain the 1332 abstract links from all server networks so that it can make suitable 1333 computations and create the correct TE links in the client network. 1335 That is, the relationship between client network and abstraction 1336 layer network should be one-to-one. 1338 4.2.4. Abstraction in Peer Networks 1340 Figure 15 shows the basic architectural concepts for connecting 1341 across peer networks.
Nodes from four networks are shown: A1 and A2 1342 come from one network; B1, B2, and B3 from another network; etc. The 1343 interfaces between the networks (sometimes known as External Network- 1344 to-Network Interfaces - ENNIs) are A2-B1, B3-C1, and C3-D1. 1346 The objective is to be able to support an end-to-end connection A1- 1347 to-D2. This connection is for TE connectivity. 1349 As shown in the figure, abstract links that span the transit networks 1350 are used to achieve the required connectivity. These links form the 1351 key building blocks of the end-to-end connectivity. An end-to-end 1352 LSP uses these links as part of its path. If the stitching 1353 capabilities of the networks are homogeneous, then the end-to-end LSP 1354 may simply traverse the path defined by the abstract links across the 1355 various peer networks or may utilize stitching of LSP segments that 1356 each traverse a network along the path of an abstract link. If the 1357 network switching technologies support or necessitate the use of LSP 1358 hierarchies, the end-to-end LSP may be tunneled across each network 1359 using hierarchical LSPs that each traverse a network along the 1360 path of an abstract link. 1362 : : : 1363 Network A : Network B : Network C : Network D 1364 : : : 1365 -- -- -- -- -- -- -- -- -- -- 1366 |A1|--|A2|---|B1|--|B2|--|B3|---|C1|--|C2|--|C3|---|D1|--|D2| 1367 -- -- | | -- | | | | -- | | -- -- 1368 | |========| | | |========| | 1369 -- -- -- -- 1371 Key 1372 --- Direct connection between two nodes 1373 === Abstract link across transit network 1375 Figure 15 : Architecture for Peering 1377 Peer networks exist in many situations in the Internet. Packet 1378 networks may peer as IGP areas (levels) or as ASes. Transport 1379 networks (such as optical networks) may peer to provide 1380 concatenations of optical paths through single vendor environments 1381 (see Section 6).
Figure 16 shows a simple example of three peer 1382 networks (A, B, and C) each comprising a few nodes. 1384 Network A : Network B : Network C 1385 : : 1386 -- -- -- : -- -- -- : -- -- 1387 |A1|---|A2|----|A3|---|B1|---|B2|---|B3|---|C1|---|C2| 1388 -- --\ /-- : -- /--\ -- : -- -- 1389 \--/ : / \ : 1390 |A4| : / \ : 1391 --\ : / \ : 1392 -- \-- : --/ \-- : -- -- 1393 |A5|---|A6|---|B4|----------|B6|---|C3|---|C4| 1394 -- -- : -- -- : -- -- 1395 : : 1396 : : 1398 Figure 16 : A Network Comprising Three Peer Networks 1400 As discussed in Section 2, peered networks do not share visibility of 1401 their topologies or TE capabilities for scaling and confidentiality 1402 reasons. That means, in our example, that computing a path from A1 1403 to C4 can be impossible without the aid of cooperating PCEs or some 1404 form of crankback. 1406 But it is possible to produce abstract links for reachability across 1407 transit peer networks and to create an abstraction layer network. 1408 That network can be enhanced with specific reachability information 1409 if a destination network is partitioned as is the case with Network C 1410 in Figure 16. 1412 Suppose Network B decides to offer three abstract links B1-B3, B4-B3, 1413 and B4-B6. The abstraction layer network could then be constructed 1414 to look like the network in Figure 17. 1416 -- -- -- -- 1417 |A3|---|B1|====|B3|----|C1| 1418 -- -- //-- -- 1419 // 1420 // 1421 // 1422 -- --// -- -- 1423 |A6|---|B4|=====|B6|---|C3| 1424 -- -- -- -- 1426 Figure 17 : Abstraction Layer Network for the Peer Network Example 1428 Using a process similar to that described in Section 4.2.3, Network A 1429 can request connectivity to Network C and abstract links can be 1430 advertised that connect the edges of the two networks and that can be 1431 used to carry LSPs that traverse both networks. 
Furthermore, if 1432 Network C is partitioned, reachability information can be exchanged 1433 to allow Network A to select the correct abstract link as shown in 1434 Figure 18. 1436 Network A : Network C 1437 : 1438 -- -- -- : -- -- 1439 |A1|---|A2|----|A3|=========|C1|.....|C2| 1440 -- --\ /-- : -- -- 1441 \--/ : 1442 |A4| : 1443 --\ : 1444 -- \-- : -- -- 1445 |A5|---|A6|=========|C3|.....|C4| 1446 -- -- : -- -- 1448 Figure 18 : Tunnel Connections to Network C with TE Reachability 1450 Peer networking cases can be made far more complex by dual homing 1451 between network peering nodes (for example, A3 might connect to B1 1452 and B4 in Figure 17) and by the networks themselves being arranged in 1453 a mesh (for example, A6 might connect to B4 and C1 in Figure 17). 1455 These additional complexities can be handled gracefully by the 1456 abstraction layer network model. 1458 Further examples of abstraction in peer networks can be found in 1459 Sections 6 and 8. 1461 4.3. Considerations for Dynamic Abstraction 1463 It is possible to consider a highly dynamic system where the server 1464 network adaptively suggests new abstract links into the abstraction 1465 layer, and where the abstraction layer proactively deploys new 1466 client-edge to client-edge LSPs to provide new links in the client 1467 network. Such fluidity is, however, to be treated with caution, 1468 especially in the case of client-server networks of differing 1469 technologies where hierarchical server network LSPs are used: 1470 connections in some server networks have long turn-up times, and the 1471 server networks are likely to be sparsely connected, with expensive 1472 physical resources deployed only where there is 1473 believed to be a need for them.
More significantly, the complex 1474 commercial, policy, and administrative relationships that may exist 1475 between client and server network operators mean that stability is 1476 more likely to be the desired operational practice. 1478 Thus, proposals for fully automated multi-layer networks based on 1479 this architecture may be regarded as forward-looking topics for 1480 research both in terms of network stability and with regard to 1481 economic impact. 1483 However, some elements of automation should not be discarded. A 1484 server network may automatically apply policy to determine the best 1485 set of abstract links to offer and the most suitable way for the 1486 server network to support them. And a client network may dynamically 1487 observe congestion, lack of connectivity, or predicted changes in 1488 traffic demand, and may use this information to request additional 1489 links from the abstraction layer. And, once policies have been 1490 configured, the whole system should be able to operate autonomously, 1491 without operator control (which is not to say that the operator will not have 1492 the option of exerting control at every step in the process). 1494 4.4. Requirements for Advertising Links and Nodes 1496 The abstraction layer network is "just another network layer". The 1497 links and nodes in the network need to be advertised along with their 1498 associated TE information (metrics, bandwidth, etc.) so that the 1499 topology is disseminated and so that routing decisions can be made. 1501 This requires a routing protocol running between the nodes in the 1502 abstraction layer network. Note that this routing information 1503 exchange could be piggy-backed on an existing routing protocol 1504 instance (subject to different switching capabilities applying to the 1505 links in the different networks, or to adequate address space 1506 separation), or use a new instance (or even a new protocol).
1507 Clearly, the information exchanged is only that which has been 1508 created as part of the abstraction function according to policy. 1510 It should be noted that in many cases the abstract link represents the 1511 potential for connectivity across the server network but that no such 1512 connectivity exists. In this case, we may ponder how the routing 1513 protocol in the abstraction layer will advertise topology information 1514 for and over a link that has no underlying connectivity. In other 1515 words, there must be a communication channel between the abstraction 1516 layer nodes so that the routing protocol messages can flow. The 1517 answer is that control plane connectivity already exists in the 1518 server network and on the client-server edge links, and this can be 1519 used to carry the routing protocol messages for the abstraction layer 1520 network. The same consideration applies to the advertisement, in the 1521 client network, of the potential connectivity that the abstraction 1522 layer network can provide, although it may be more normal to establish 1523 that connectivity before advertising a link in the client network. 1525 4.5. Addressing Considerations 1527 The network layers in this architecture should be able to operate 1528 with separate address spaces, and these may overlap without any 1529 technical issues. That is, one address may mean one thing in the 1530 client network, yet the same address may have a different meaning in 1531 the abstraction layer network or the server network. In other words, 1532 there is complete address separation between networks. 1534 However, this will require some care both because human operators may 1535 well become confused, and because mapping between address spaces is 1536 needed at the interfaces between the network layers.
That mapping 1537 requires configuration so that, for example, when the server network 1538 announces an abstract link from A to B, the abstraction layer network 1539 must recognize that A and B are server network addresses and must map 1540 them to abstraction layer addresses (say P and Q) before including 1541 the link in its own topology. And similarly, when the abstraction 1542 layer network informs the client network that a new link is available 1543 from S to T, it must map those addresses from its own address space 1544 to that of the client network. 1546 This form of address mapping will become particularly important in 1547 cases where one abstraction layer network is constructed from 1548 connectivity in multiple server networks, or where one abstraction 1549 layer network provides connectivity for multiple client networks. 1551 5. Building on Existing Protocols 1553 This section is non-normative and is not intended to prejudge a 1554 solutions framework or any applicability work. It does, however, 1555 very briefly serve to note the existence of protocols that could be 1556 examined for applicability to serve in realizing the model described 1557 in this document. 1559 The general principle of protocol re-use is preferred over the 1560 invention of new protocols or additional protocol extensions, and it 1561 would be advantageous to make use of an existing protocol that is 1562 commonly implemented on network nodes and is currently deployed, or 1563 to use existing computational elements such as Path Computation 1564 Elements (PCEs). This has many benefits in network stability, time 1565 to deployment, and operator training. 1567 It is recognized, however, that existing protocols are unlikely to be 1568 immediately suitable to this problem space without some protocol 1569 extensions. Extending protocols must be done with care and with 1570 consideration for the stability of existing deployments. 
In extreme 1571 cases, a new protocol can be preferable to a messy hack of an 1572 existing protocol. 1574 5.1. BGP-LS 1576 BGP-LS is a set of extensions to BGP described in [RFC7752]. Its 1577 purpose is to announce topology information from one network to a 1578 "north-bound" consumer. Application of BGP-LS to date has focused on 1579 a mechanism to build a TED for a PCE. However, BGP's mechanisms 1580 would also serve well to advertise abstract links from a server 1581 network into the abstraction layer network, or to advertise potential 1582 connectivity from the abstraction layer network to the client 1583 network. 1585 5.2. IGPs 1587 Both OSPF and IS-IS have been extended through a number of RFCs to 1588 advertise TE information. Additionally, both protocols are capable 1589 of running in a multi-instance mode either as ships that pass in the 1590 night (i.e., completely separate instances using different address 1591 spaces) or as dual instances on the same address space. This means 1592 that either IGP could probably be used as the routing protocol in the 1593 abstraction layer network. 1595 5.3. RSVP-TE 1597 RSVP-TE signaling can be used to set up all traffic engineered LSPs 1598 demanded by this model without the need for any protocol extensions. 1600 If necessary, LSP hierarchy [RFC4206] or LSP stitching [RFC5150] can 1601 be used to carry LSPs over the server network, again without needing 1602 any protocol extensions. 1604 Furthermore, the procedures in [RFC6107] allow the dynamic signaling 1605 of the purpose of any LSP that is established. This means that 1606 when an LSP tunnel is set up, the two ends can coordinate into which 1607 routing protocol instance it should be advertised, and can also agree 1608 on the addressing to be used to identify the link that will be 1609 created. 1611 5.4.
Notes on a Solution 1613 This section is not intended to be prescriptive or dictate the 1614 protocol solutions that may be used to satisfy the architecture 1615 described in this document, but it does show how the existing 1616 protocols listed in the previous sections can be combined to provide 1617 a solution with only minor modifications. 1619 A server network can be operated using GMPLS routing and signaling 1620 protocols. Using information gathered from the routing protocol, a 1621 TED can be constructed containing resource availability information 1622 and Shared Risk Link Group (SRLG) details. A policy-based process 1623 can then determine which nodes and abstract links it wishes to 1624 advertise to form the abstraction layer network. 1626 The server network can now use BGP-LS to advertise a topology of 1627 links and nodes to form the abstraction layer network. This 1628 information would most likely be advertised from a single point of 1629 control that makes all of the abstraction decisions, but the function 1630 could be distributed to multiple server network edge nodes. The 1631 information can be advertised by BGP-LS to multiple points within the 1632 abstraction layer (such as all client network edge nodes) or to a 1633 single controller. 1635 Multiple server networks may advertise information that is used to 1636 construct an abstraction layer network, and one server network may 1637 advertise different information in different instances of BGP-LS to 1638 form different abstraction layer networks. Furthermore, in the case 1639 of one controller constructing multiple abstraction layer networks, 1640 BGP-LS uses the route target mechanism defined in [RFC4364] to 1641 distinguish the different applications (effectively abstraction layer 1642 network VPNs) of the exported information.
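The policy-based export step just described can be illustrated with a short sketch. This is a hypothetical illustration only: the data structures, policy fields, and route target string are invented for the example, and the BGP-LS encoding itself is not shown.

```python
# Hypothetical sketch: a server network applies per-client policy to
# choose which abstract links to export, and tags the export with a
# route target (cf. the RFC 4364 mechanism mentioned above) so that a
# consumer can demultiplex per-client abstraction layer networks.
# All names and policy fields here are invented for illustration.

from dataclasses import dataclass

@dataclass
class AbstractLink:
    src: str              # server network edge node
    dst: str              # server network edge node
    bandwidth_gbps: int
    preestablished: bool  # has the supporting server LSP been set up?

def select_links_for_client(links, policy):
    """Apply a per-client policy to pick the abstract links to export."""
    return [l for l in links
            if l.bandwidth_gbps >= policy["min_bw_gbps"]
            and (l.preestablished or policy["allow_potential"])]

def export_with_route_target(links, route_target):
    """Tag each exported link so receivers can tell apart the
    different abstraction layer networks being advertised."""
    return [{"rt": route_target, "link": (l.src, l.dst),
             "bw": l.bandwidth_gbps} for l in links]

links = [AbstractLink("B1", "B3", 100, True),
         AbstractLink("B4", "B3", 40, False),
         AbstractLink("B4", "B6", 10, True)]

# A client that requires at least 40 Gbps and only pre-established links:
policy = {"min_bw_gbps": 40, "allow_potential": False}
exports = export_with_route_target(
    select_links_for_client(links, policy), "RT:65000:1")
```

Under this example policy only the B1-B3 link is exported; a different client with a more permissive policy would receive a different abstraction layer topology from the same server network.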
1644 Extensions may be made to BGP-LS to allow advertisement of Macro 1645 Shared Risk Link Groups (MSRLGs) per Appendix B, mutually exclusive 1646 links, and to indicate whether the abstract link has been pre- 1647 established or not. Such extensions are valid options, but do not 1648 form a core component of this architecture. 1650 The abstraction layer network may operate under central control or 1651 use a distributed control plane. Since the links and nodes may be a 1652 mix of physical and abstract links, and since the nodes may have 1653 diverse cross-connect capabilities, it is most likely that a GMPLS 1654 routing protocol will be beneficial for collecting and correlating 1655 the routing information and for distributing updates. No special 1656 additional features are needed beyond adding those extra parameters 1657 just described for BGP-LS, but it should be noted that the control 1658 plane of the abstraction layer network must run in an out-of-band 1659 control network because the data-bearing links might not yet have 1660 been established via connections in the server network. 1662 The abstraction layer network is also able to determine potential 1663 connectivity from client network edge to client network edge. It 1664 will determine which client network links to create according to 1665 policy and subject to requests from the client network, and will 1666 take four steps: 1668 - First, it will compute a path across the abstraction layer 1669 network. 1671 - Then, if the support of the abstract links requires the use of 1672 server network LSPs for tunneling or stitching, and if those LSPs 1673 are not already established, it will ask the server layer to set 1674 them up. 1675 - Then, it will signal the client-edge to client-edge LSP. 1676 - Finally, the abstraction layer network will inform the client 1677 network of the existence of the new client network link.
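These four steps can be sketched in code. The sketch below is purely illustrative: every function, callback, and field name is a hypothetical placeholder for whatever path computation, server layer provisioning, signaling, and advertisement interfaces a deployment actually provides.

```python
# Illustrative sketch of the four-step procedure above; none of these
# names is a defined API.

from dataclasses import dataclass

@dataclass
class Link:
    name: str
    is_abstract: bool   # abstract link supported by the server network?
    established: bool   # does the supporting server LSP already exist?

def provision_client_link(ted, src, dst, compute_path,
                          ensure_server_lsp, signal_lsp,
                          advertise_to_client):
    # Step 1: compute a path across the abstraction layer network.
    path = compute_path(ted, src, dst)
    # Step 2: where abstract links on the path need server network LSPs
    # (for tunneling or stitching) that are not yet established, ask the
    # server layer to set them up.
    for link in path:
        if link.is_abstract and not link.established:
            ensure_server_lsp(link)
            link.established = True
    # Step 3: signal the client-edge to client-edge LSP.
    lsp = signal_lsp(path)
    # Step 4: inform the client network of the new client network link.
    advertise_to_client(src, dst, lsp)
    return lsp

# Example wiring with trivial stand-in callbacks:
events = []
ted = [Link("CE1-PE1", False, True),
       Link("PE1==PE2", True, False),   # abstract link, not yet set up
       Link("PE2-CE2", False, True)]
lsp = provision_client_link(
    ted, "CE1", "CE2",
    compute_path=lambda t, s, d: t,     # pretend the whole TED is the path
    ensure_server_lsp=lambda l: events.append(("server-lsp", l.name)),
    signal_lsp=lambda p: "client-LSP-1",
    advertise_to_client=lambda s, d, l: events.append(("advertise", s, d, l)))
```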
1679 This last step can be achieved either by coordination of the end 1680 points of the LSPs that span the abstraction layer (these points are 1681 client network edge nodes) using mechanisms such as those described 1682 in [RFC6107], or using BGP-LS from a central controller. 1684 Once the client network edge nodes are aware of a new link, they will 1685 automatically advertise it using their routing protocol and it will 1686 become available for use by traffic in the client network. 1688 Sections 6, 7, and 8 discuss the applicability of this architecture 1689 to different network types and problem spaces, while Section 9 gives 1690 some advice about scoping future work. Section 10 on manageability 1691 considerations is particularly relevant in the context of this 1692 section because it contains a discussion of the policies and 1693 mechanisms for indicating connectivity and link availability between 1694 network layers in this architecture. 1696 6. Application of the Architecture to Optical Domains and Networks 1698 Many optical networks are arranged as a set of small domains. Each 1699 domain is a cluster of nodes, usually from the same equipment vendor 1700 and with the same properties. The domain may be constructed as a 1701 mesh or a ring, or maybe as an interconnected set of rings. 1703 The network operator seeks to provide end-to-end connectivity across 1704 a network constructed from multiple domains, and so (of course) the 1705 domains are interconnected. In a network under management control 1706 such as through an Operations Support System (OSS), each domain is 1707 under the operational control of a Network Management System (NMS). 1709 In this way, an end-to-end path may be commissioned by the OSS 1710 instructing each NMS, and the NMSes setting up the path fragments 1711 across the domains. 1713 However, in a system that uses a control plane, there is a need for 1714 integration between the domains.
1716 Consider a simple domain, D1, as shown in Figure 19. In this case, 1717 the nodes A through F are arranged in a topological ring. Suppose 1718 that there is a control plane in use in this domain, and that OSPF is 1719 used as the TE routing protocol. 1721 ----------------- 1722 | D1 | 1723 | B---C | 1724 | / \ | 1725 | / \ | 1726 | A D | 1727 | \ / | 1728 | \ / | 1729 | F---E | 1730 | | 1731 ----------------- 1733 Figure 19 : A Simple Optical Domain 1735 Now consider that the operator's network is built from a mesh of such 1736 domains, D1 through D7, as shown in Figure 20. It is possible that 1737 these domains share a single, common instance of OSPF, in which case 1738 there is nothing further to say because that OSPF instance will 1739 distribute sufficient information to build a single TED spanning the 1740 whole network, and an end-to-end path can be computed. A more likely 1741 scenario is that each domain is running its own OSPF instance. In 1742 this case, each is able to handle the peculiarities (or rather, 1743 advanced functions) of each vendor's equipment capabilities. 1745 ------ ------ ------ ------ 1746 | | | | | | | | 1747 | D1 |---| D2 |---| D3 |---| D4 | 1748 | | | | | | | | 1749 ------\ ------\ ------\ ------ 1750 \ | \ | \ | 1751 \------ \------ \------ 1752 | | | | | | 1753 | D5 |---| D6 |---| D7 | 1754 | | | | | | 1755 ------ ------ ------ 1757 Figure 20 : A Mesh of Simple Optical Domains 1759 The question now is how to combine the multiple sets of information 1760 distributed by the different OSPF instances. Three possible models 1761 suggest themselves based on pre-existing routing practices. 1763 o In the first model (the Area-Based model) each domain is treated as 1764 a separate OSPF area. The end-to-end path will be specified to 1765 traverse multiple areas, and each area will be left to determine 1766 the path across the nodes in the area.
The feasibility of an end- 1767 to-end path (and, thus, the selection of the sequence of areas and 1768 their interconnections) can be derived using hierarchical PCE. 1770 This approach, however, fits poorly with established use of the 1771 OSPF area: in this form of optical network, the interconnection 1772 points between domains are likely to be links; and the mesh of 1773 domains is far more interconnected and unstructured than we are 1774 used to seeing in the normal area-based routing paradigm. 1776 Furthermore, while hierarchical PCE may be able to solve this type 1777 of network, the effort involved may be considerable for more than a 1778 small collection of domains. 1780 o Another approach (the AS-Based model) treats each domain as a 1781 separate Autonomous System (AS). The end-to-end path will be 1782 specified to traverse multiple ASes, and each AS will be left to 1783 determine the path across the AS. 1785 This model sits more comfortably with the established routing 1786 paradigm, but causes a massive escalation of ASes in the global 1787 Internet. It would, in practice, require that the operator used 1788 private AS numbers [RFC6996] of which there are plenty. 1790 Then, as suggested in the Area-Based model, hierarchical PCE 1791 could be used to determine the feasibility of an end-to-end path 1792 and to derive the sequence of domains and the points of 1793 interconnection to use. But, just as in that other model, the 1794 scalability of this model using a hierarchical PCE must be 1795 questioned given the sheer number of ASes and their 1796 interconnectivity. 1798 Furthermore, determining the mesh of domains (i.e., the inter-AS 1799 connections) conventionally requires the use of BGP as an inter- 1800 domain routing protocol. 
However, not only is BGP not normally 1801 available on optical equipment, but this approach indicates that 1802 the TE properties of the inter-domain links would need to be 1803 distributed and updated using BGP: something for which it is not 1804 well suited. 1806 o The third approach (the ASON model) follows the architectural 1807 model set out by the ITU-T [G.8080] and uses the routing protocol 1808 extensions described in [RFC6827]. In this model the concept of 1809 "levels" is introduced to OSPF. Referring back to Figure 20, each 1810 OSPF instance running in a domain would be construed as a "lower 1811 level" OSPF instance and would leak routes into a "higher level" 1812 instance of the protocol that runs across the whole network. 1814 This approach handles the awkwardness of representing the domains 1815 as areas or ASes by simply considering them as domains running 1816 distinct instances of OSPF. Routing advertisements flow "upward" 1817 from the domains to the high level OSPF instance giving it a full 1818 view of the whole network and allowing end-to-end paths to be 1819 computed. Routing advertisements may also flow "downward" from the 1820 network-wide OSPF instance to any one domain so that it has 1821 visibility of the connectivity of the whole network. 1823 While architecturally satisfying, this model suffers from having to 1824 handle the different characteristics of different equipment 1825 vendors. The advertisements coming from each low level domain 1826 would be meaningless when distributed into the other domains, and 1827 the high level domain would need to be kept up-to-date with the 1828 semantics of each new release of each vendor's equipment. 1829 Additionally, the scaling issues associated with a well-meshed 1830 network of domains each with many entry and exit points and each 1831 with network resources that are continually being updated reduces 1832 to the same problem as noted in the virtual link model. 
1833 Furthermore, in the event that the domains are under control of 1834 different administrations, the domains would not want to distribute 1835 the details of their topologies and TE resources. 1837 Practically, this third model turns out to be very close to the 1838 methodology described in this document. As noted in Section 6.1 of 1839 [RFC6827], there are policy rules that can be applied to define 1840 exactly what information is exported from or imported to a low level 1841 OSPF instance. The document even notes that some forms of 1842 aggregation may be appropriate. Thus, we can apply the following 1843 simplifications to the mechanisms defined in RFC 6827: 1845 - Zero information is imported to low level domains. 1847 - Low level domains export only abstracted links as defined in this 1848 document and according to local abstraction policy and with 1849 appropriate removal of vendor-specific information. 1851 - There is no need to formally define routing levels within OSPF. 1853 - Export of abstracted links from the domains to the network-wide 1854 routing instance (the abstraction routing layer) can take place 1855 through any mechanism including BGP-LS or direct interaction 1856 between OSPF implementations. 1858 With these simplifications, it can be seen that the framework defined 1859 in this document can be constructed from the architecture discussed 1860 in RFC 6827, but without needing any of the protocol extensions that 1861 that document defines. Thus, using the terminology and concepts 1862 already established, the problem may be solved as shown in Figure 21. 1863 The abstraction layer network is constructed from the inter-domain 1864 links, the domain border nodes, and the abstracted (cross-domain) 1865 links. 1867 Abstraction Layer 1868 -- -- -- -- -- -- 1869 | |===========| |--| |===========| |--| |===========| | 1870 | | | | | | | | | | | | 1871 ..| |...........| |..| |...........| |..| |...........| |......
1872 | | | | | | | | | | | | 1873 | | -- -- | | | | -- -- | | | | -- -- | | 1874 | |_| |_| |_| | | |_| |_| |_| | | |_| |_| |_| | 1875 | | | | | | | | | | | | | | | | | | | | | | | | 1876 -- -- -- -- -- -- -- -- -- -- -- -- 1877 Domain 1 Domain 2 Domain 3 1878 Key Optical Layer 1879 ... Layer separation 1880 --- Physical link 1881 === Abstract link 1883 Figure 21 : The Optical Network Implemented Through the 1884 Abstraction Layer Network 1886 7. Application of the User-to-Network Interface 1888 The User-to-Network Interface (UNI) is an important architectural 1889 concept in many implementations and deployments of client-server 1890 networks especially those where the client and server network have 1891 different technologies. The UNI can be seen described in [G.8080], 1892 and the GMPLS approach to the UNI is documented in [RFC4208]. Other 1893 GMPLS-related documents describe the application of GMPLS to specific 1894 UNI scenarios: for example, [RFC6005] describes how GMPLS can support 1895 a UNI that provides access to Ethernet services. 1897 Figure 1 of [RFC6005] is reproduced here as Figure 22. It shows the 1898 Ethernet UNI reference model, and that figure can serve as an example 1899 for all similar UNIs. In this case, the UNI is an interface between 1900 client network edge nodes and the server network. It should be noted 1901 that neither the client network nor the server network need be an 1902 Ethernet switching network. 1904 There are three network layers in this model: the client network, the 1905 "Ethernet service network", and the server network. The so-called 1906 Ethernet service network consists of links comprising the UNI links 1907 and the tunnels across the server network, and nodes comprising the 1908 client network edge nodes and various server network nodes. 
That is, 1909 the Ethernet service network is equivalent to the abstraction layer 1910 network with the UNI links being the physical links between the 1911 client and server networks, and the client edge nodes taking the 1912 role of UNI Client-side (UNI-C) and the server edge nodes acting as 1913 the UNI Network-side (UNI-N) nodes. 1915 Client Client 1916 Network +----------+ +-----------+ Network 1917 -------------+ | | | | +------------- 1918 +----+ | | +-----+ | | +-----+ | | +----+ 1919 ------+ | | | | | | | | | | | | +------ 1920 ------+ EN +-+-----+--+ CN +-+----+--+ CN +--+-----+-+ EN +------ 1921 | | | +--+--| +-+-+ | | +--+-----+-+ | 1922 +----+ | | | +--+--+ | | | +--+--+ | | +----+ 1923 | | | | | | | | | | 1924 -------------+ | | | | | | | | +------------- 1925 | | | | | | | | 1926 -------------+ | | | | | | | | +------------- 1927 | | | +--+--+ | | | +--+--+ | | 1928 +----+ | | | | | | +--+--+ | | | +----+ 1929 ------+ +-+--+ | | CN +-+----+--+ CN | | | | +------ 1930 ------+ EN +-+-----+--+ | | | | +--+-----+-+ EN +------ 1931 | | | | +-----+ | | +-----+ | | | | 1932 +----+ | | | | | | +----+ 1933 | +----------+ +-----------+ | 1934 -------------+ Server Networks +------------- 1935 Client UNI UNI Client 1936 Network <-----> <-----> Network 1937 Scope of This Document 1939 Legend: EN - Client Network Edge Node 1940 CN - Server Network (Core) Node 1942 Figure 22 : Ethernet UNI Reference Model 1944 An issue that is often raised concerns how a dual-homed client 1945 network edge node (such as that shown at the bottom left-hand corner 1946 of Figure 22) can make determinations about how it connects across 1947 the UNI. This can be particularly important when reachability across 1948 the server network is limited or when two diverse paths are desired 1949 (for example, to provide protection).
However, in the model 1950 described in this document, the edge node (the UNI-C) is part of the 1951 abstraction layer network and can see sufficient topology information 1952 to make these decisions. If the approach introduced in this document 1953 is used to model the UNI as described in this section, there is no 1954 need to enhance the signaling protocols at the GMPLS UNI nor to add 1955 routing exchanges at the UNI. 1957 8. Application of the Architecture to L3VPN Multi-AS Environments 1959 Serving layer-3 VPNs (L3VPNs) across a multi-AS or multi-operator 1960 environment currently provides a significant planning challenge. 1961 Figure 6 shows the general case of the problem that needs to be 1962 solved. This section shows how the abstraction layer network can 1963 address this problem. 1965 In the VPN architecture, the CE nodes are the client network edge 1966 nodes, and the PE nodes are the server network edge nodes. The 1967 abstraction layer network is made up of the CE nodes, the CE-PE 1968 links, the PE nodes, and PE-PE tunnels that are the abstract links. 1970 In the multi-AS or multi-operator case, the abstraction layer network 1971 also includes the PEs (maybe ASBRs) at the edges of the multiple 1972 server networks, and the PE-PE (maybe inter-AS) links. This gives 1973 rise to the architecture shown in Figure 23. 1975 The policy for adding abstract links to the abstraction layer network 1976 will be driven substantially by the needs of the VPN. Thus, when a 1977 new VPN site is added and the existing abstraction layer network 1978 cannot support the required connectivity, a new abstract link will be 1979 created out of the underlying network. 1981 ........... ............. 1982 VPN Site : : VPN Site 1983 -- -- : : -- -- 1984 |C1|-|CE| : : |CE|-|C2| 1985 -- | | : : | | -- 1986 | | : : | | 1987 | | : : | | 1988 | | : : | | 1989 | | : -- -- -- -- : | | 1990 | |----|PE|=========|PE|---|PE|=====|PE|----| | 1991 -- : | | | | | | | | : -- 1992 ...........
| | | | | | | | ............ 1993 | | | | | | | | 1994 | | | | | | | | 1995 | | | | | | | | 1996 | | - - | | | | - | | 1997 | |-|P|-|P|-| | | |-|P|-| | 1998 -- - - -- -- - -- 2000 Figure 23 : The Abstraction Layer Network for a Multi-AS VPN 2002 It is important to note that each VPN instance can have a separate 2003 abstraction layer network. This means that the server network 2004 resources can be partitioned and that traffic can be kept separate. 2006 This can be achieved even when VPN sites from different VPNs connect 2007 at the same PE. Alternatively, multiple VPNs can share the same 2008 abstraction layer network if that is operationally preferable. 2010 Lastly, just as for the UNI discussed in Section 7, the issue of 2011 dual-homing of VPN sites is a function of the abstraction layer 2012 network and so is just a normal routing problem in that network. 2014 9. Scoping Future Work 2016 This section is provided to help guide the work on this problem and to 2017 ensure that oceans are not knowingly boiled. This guidance is non- 2018 normative for this architecture description. 2020 9.1. Not Solving the Internet 2022 The scope of the use cases and problem statement in this document is 2023 limited to "some small set of interconnected domains." In 2024 particular, it is not the objective of this work to turn the whole 2025 Internet into one large, interconnected TE network. 2027 9.2. Working With "Related" Domains 2029 Subsequent to Section 9.1, the intention of this work is to solve 2030 the TE interconnectivity for only "related" domains. Such domains 2031 may be under common administrative operation (such as IGP areas 2032 within a single AS, or ASes belonging to a single operator), or may 2033 have a direct commercial arrangement for the sharing of TE 2034 information to provide specific services. Thus, in both cases, there 2035 is a strong opportunity for the application of policy. 2037 9.3.
Not Finding Optimal Paths in All Situations 2039 As has been well described in this document, abstraction necessarily 2040 involves compromises and removal of information. That means that it 2041 is not possible to guarantee that an end-to-end path over 2042 interconnected TE domains follows the absolute optimal (by any measure 2043 of optimality) path. This is taken as understood, and future work 2044 should not attempt to achieve such paths, which can only be found by a 2045 full examination of all network information across all connected 2046 networks. 2048 9.4. Sanity and Scaling 2050 All of the above points play into a final observation. This work is 2051 intended to bite off a small problem for some relatively simple use 2052 cases as described in Section 2. It is not intended that this work 2053 will be immediately (or even soon) extended to cover many large 2054 interconnected domains. Obviously the solution should as far as 2055 possible be designed to be extensible and scalable; however, it is 2056 also reasonable to make trade-offs in favor of utility and 2057 simplicity. 2059 10. Manageability Considerations 2061 Manageability should not be a significant additional burden. Each 2062 layer in the network model can and should be managed independently. 2064 That is, each client network will run its own management systems and 2065 tools to manage the nodes and links in the client network: each 2066 client network link that uses an abstract link will still be 2067 available for management in the client network just like any other 2068 link. 2069 Similarly, each server network will run its own management systems 2070 and tools to manage the nodes and links in that network just as 2071 normal. 2073 Three issues remain for consideration: 2075 - How is the abstraction layer network managed? 2076 - How is the interface between the client network and the abstraction 2077 layer network managed?
2078 - How is the interface between the abstraction layer network and the 2079 server network managed? 2081 10.1. Managing the Abstraction Layer Network 2083 Management of the abstraction layer network differs from the client 2084 and server networks because not all of the links that are visible in 2085 the TED are real links. That is, it is not possible to run OAM on 2086 links that are only the potential of a link. 2088 Other than that, however, the management should be essentially the 2089 same. Routing and signaling protocols can be run in the abstraction 2090 layer (using out of band channels for links that have not yet been 2091 established), and a centralized TED can be constructed and used to 2092 examine the availability and status of the links and nodes in the 2093 network. 2095 Note that different deployment models will place the "ownership" of 2096 the abstraction layer network differently. In some cases the 2097 abstraction layer network will be constructed by the operator of the 2098 server network and run by that operator as a service for one or more 2099 client networks. In other cases, one or more server networks will 2100 present the potential of links to an abstraction layer network run 2101 by the operator of the client network. And it is feasible that a 2102 business model could be built where a third-party operator manages 2103 the abstraction layer network, constructing it from the connectivity 2104 available in multiple server networks, and facilitating connectivity 2105 for multiple client networks. 2107 10.2. Managing Interactions of Client and Abstraction Layer Networks 2109 The interaction between the client network and the abstraction layer 2110 network is a management task. It might be automated (software 2111 driven) or it might require manual intervention. 2113 This is a two-way interaction: 2115 - The client network can express the need for additional 2116 connectivity.
For example, the client network may try and fail to 2117 find a path across the client network and may request additional, 2118 specific connectivity (this is similar to the situation with 2119 Virtual Network Topology Manager (VNTM) [RFC5623]). Alternatively, 2120 a more proactive client network management system may monitor 2121 traffic demands (current and predicted), network usage, and network 2122 "hot spots" and may request changes in connectivity by both 2123 releasing unused links and by requesting new links. 2125 - The abstraction layer network can make links available to the 2126 client network or can withdraw them. These actions can be in 2127 response to requests from the client network, or can be driven by 2128 processes within the abstraction layer (perhaps reorganizing the 2129 use of server network resources). In any case, the presentation of 2130 new links to the client network is heavily subject to policy since 2131 this is both operationally key to the success of this architecture 2132 and the central plank of the commercial model described in this 2133 document. Such policies belong to the operator of the abstraction 2134 layer network and are expected to be fully configurable. 2136 Once the abstraction layer network has decided to make a link 2137 available to the client network it will install it at the link end 2138 points (which are nodes in the client network) such that it appears 2139 and can be advertised as a link in the client network. 2141 In all cases, it is important that the operators of both networks are 2142 able to track the requests and responses, and the operator of the 2143 client network should be able to see which links in that network are 2144 "real" physical links, and which are presented by the abstraction 2145 layer network. 2147 10.3. 
Managing Interactions of Abstraction Layer and Server Networks 2149 The interactions between the abstraction layer network and the server 2150 network are similar to those described in Section 10.2, but there is a 2151 difference in that the server network is more likely to offer up 2152 connectivity, and the abstraction layer network is less likely to ask 2153 for it. 2155 That is, the server network will, according to policy that may 2156 include commercial relationships, offer the abstraction layer network 2157 a set of potential connectivity that the abstraction layer network 2158 can treat as links. This server network policy will include: 2159 - how much connectivity to offer 2160 - what level of server network redundancy to include 2161 - how to support the use of the abstract links. 2163 This process of offering links from the server network may include a 2164 mechanism to indicate which links have been pre-established in the 2165 server network, and can include other properties such as: 2166 - link-level protection ([RFC4202]) 2167 - SRLG and MSRLG (see Appendix B.1) 2168 - mutual exclusivity (see Appendix B.2). 2170 The abstraction layer network needs a mechanism to tell the server 2171 network which links it is making use of. This mechanism could also 2172 include the ability to request additional connectivity from the 2173 server network, although it seems most likely that the server network 2174 will already have presented as much connectivity as it is physically 2175 capable of, subject to the constraints of policy. 2177 Finally, the server network will need to confirm the establishment of 2178 connectivity, withdraw links if they are no longer feasible, and 2179 report failures. 2181 Again, it is important that the operators of both networks are able 2182 to track the requests and responses, and the operator of the server 2183 network should be able to see which links are in use. 2185 11.
IANA Considerations 2187 This document makes no requests for IANA action. The RFC Editor may 2188 safely remove this section. 2190 12. Security Considerations 2192 Security of signaling and routing protocols is usually administered 2193 and achieved within the boundaries of a domain. Thus, for 2194 example, a domain with a GMPLS control plane [RFC3945] would apply 2195 the security mechanisms and considerations that are appropriate to 2196 GMPLS [RFC5920]. Furthermore, domain-based security relies strongly 2197 on ensuring that control plane messages are not allowed to enter the 2198 domain from outside. 2200 In this context, additional security considerations arising from this 2201 document relate to the exchange of control plane information between 2202 domains. Messages are passed between domains using control plane 2203 protocols operating between peers that have predictable relationships 2204 (for example, UNI-C to UNI-N, between BGP-LS speakers, or between 2205 peer domains). Thus, the security that needs to be given additional 2206 attention for inter-domain TE concentrates on authentication of 2207 peers, assertion that messages have not been tampered with, and, to a 2208 lesser extent, protection of the content of the messages from 2209 inspection, since that might give away sensitive information about the networks. 2210 The protocols described in Appendix A, which are likely to provide 2211 the foundation of solutions to this architecture, already include 2212 such protection and can further be run over protected transports 2213 such as IPsec [RFC6071], TLS [RFC5246], and the TCP Authentication 2214 Option (TCP-AO) [RFC5925]. 2216 It is worth noting that the control plane of the abstraction layer 2217 network is likely to be out of band. That is, control plane messages 2218 will be exchanged over network links that are not the links to which 2219 they apply.
This models the facilities of GMPLS (but not of MPLS-TE) 2220 and the security mechanisms can be applied to the protocols operating 2221 in the out of band network. 2223 13. Acknowledgements 2225 Thanks to Igor Bryskin for useful discussions in the early stages of 2226 this work and to Gert Grammel for discussions on the extent of 2227 aggregation in abstract nodes and links. 2229 Thanks to Deborah Brungard, Dieter Beller, Dhruv Dhody, Vallinayakam 2230 Somasundaram, Hannes Gredler, Stewart Bryant, Brian Carpenter, and 2231 Hilarie Orman for review and input. 2233 Particular thanks to Vishnu Pavan Beeram for detailed discussions and 2234 white-board scribbling that made many of the ideas in this document 2235 come to life. 2237 Text in Section 4.2.3 is freely adapted from the work of Igor 2238 Bryskin, Wes Doonan, Vishnu Pavan Beeram, John Drake, Gert Grammel, 2239 Manuel Paul, Ruediger Kunze, Friedrich Armbruster, Cyril Margaria, 2240 Oscar Gonzalez de Dios, and Daniele Ceccarelli in 2241 [I-D.beeram-ccamp-gmpls-enni] for which the authors of this document 2242 express their thanks. 2244 14. References 2246 14.1. Informative References 2248 [G.8080] ITU-T, "Architecture for the automatically switched optical 2249 network (ASON)", Recommendation G.8080. 2251 [I-D.beeram-ccamp-gmpls-enni] 2252 Bryskin, I., Beeram, V. P., Drake, J. et al., "Generalized 2253 Multiprotocol Label Switching (GMPLS) External Network 2254 Network Interface (E-NNI): Virtual Link Enhancements for 2255 the Overlay Model", draft-beeram-ccamp-gmpls-enni, work in 2256 progress. 2258 [I-D.ietf-ccamp-rsvp-te-srlg-collect] 2259 Zhang, F. (Ed.) and O. Gonzalez de Dios (Ed.), "RSVP-TE 2260 Extensions for Collecting SRLG Information", draft-ietf- 2261 ccamp-rsvp-te-srlg-collect, work in progress. 2263 [RFC7752] Gredler, H., Medved, J., Previdi, S., Farrel, A., and Ray, 2264 S., "North-Bound Distribution of Link-State and Traffic 2265 Engineering (TE) Information Using BGP", RFC 7752, March 2266 2016. 
2268 [RFC2702] Awduche, D., Malcolm, J., Agogbua, J., O'Dell, M., and 2269 McManus, J., "Requirements for Traffic Engineering Over 2270 MPLS", RFC 2702, September 1999. 2272 [RFC3209] Awduche, D., Berger, L., Gan, D., Li, T., Srinivasan, V., 2273 and G. Swallow, "RSVP-TE: Extensions to RSVP for LSP 2274 Tunnels", RFC 3209, December 2001. 2276 [RFC3473] L. Berger, "Generalized Multi-Protocol Label Switching 2277 (GMPLS) Signaling Resource ReserVation Protocol-Traffic 2278 Engineering (RSVP-TE) Extensions", RFC 3473, January 2003. 2280 [RFC3630] Katz, D., Kompella, K., and Yeung, D., "Traffic Engineering 2281 (TE) Extensions to OSPF Version 2", RFC 3630, September 2282 2003. 2284 [RFC3945] Mannie, E., (Ed.), "Generalized Multi-Protocol Label 2285 Switching (GMPLS) Architecture", RFC 3945, October 2004. 2287 [RFC4105] Le Roux, J.-L., Vasseur, J.-P., and Boyle, J., 2288 "Requirements for Inter-Area MPLS Traffic Engineering", 2289 RFC 4105, June 2005. 2291 [RFC4202] Kompella, K. and Y. Rekhter, "Routing Extensions in Support 2292 of Generalized Multi-Protocol Label Switching (GMPLS)", 2293 RFC 4202, October 2005. 2295 [RFC4206] Kompella, K. and Y. Rekhter, "Label Switched Paths (LSP) 2296 Hierarchy with Generalized Multi-Protocol Label Switching 2297 (GMPLS) Traffic Engineering (TE)", RFC 4206, October 2005. 2299 [RFC4208] Swallow, G., Drake, J., Ishimatsu, H., and Y. Rekhter, 2300 "User-Network Interface (UNI): Resource ReserVation 2301 Protocol-Traffic Engineering (RSVP-TE) Support for the 2302 Overlay Model", RFC 4208, October 2005. 2304 [RFC4216] Zhang, R., and Vasseur, J.-P., "MPLS Inter-Autonomous 2305 System (AS) Traffic Engineering (TE) Requirements", 2306 RFC 4216, November 2005. 2308 [RFC4271] Rekhter, Y., Li, T., and Hares, S., "A Border Gateway 2309 Protocol 4 (BGP-4)", RFC 4271, January 2006. 2311 [RFC4364] Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private 2312 Networks (VPNs)", RFC 4364, February 2006. 2314 [RFC4655] Farrel, A., Vasseur, J., and J.
Ash, "A Path Computation 2315 Element (PCE)-Based Architecture", RFC 4655, August 2006. 2317 [RFC4726] Farrel, A., Vasseur, J.-P., and Ayyangar, A., "A Framework 2318 for Inter-Domain Multiprotocol Label Switching Traffic 2319 Engineering", RFC 4726, November 2006. 2321 [RFC4847] T. Takeda (Ed.), "Framework and Requirements for Layer 1 2322 Virtual Private Networks," RFC 4847, April 2007. 2324 [RFC4874] Lee, CY., Farrel, A., and S. De Cnodder, "Exclude Routes - 2325 Extension to Resource ReserVation Protocol-Traffic 2326 Engineering (RSVP-TE)", RFC 4874, April 2007. 2328 [RFC4920] Farrel, A., Satyanarayana, A., Iwata, A., Fujita, N., and 2329 Ash, G., "Crankback Signaling Extensions for MPLS and GMPLS 2330 RSVP-TE", RFC 4920, July 2007. 2332 [RFC5150] Ayyangar, A., Kompella, K., Vasseur, JP., and A. Farrel, 2333 "Label Switched Path Stitching with Generalized 2334 Multiprotocol Label Switching Traffic Engineering (GMPLS 2335 TE)", RFC 5150, February 2008. 2337 [RFC5152] Vasseur, JP., Ayyangar, A., and Zhang, R., "A Per-Domain 2338 Path Computation Method for Establishing Inter-Domain 2339 Traffic Engineering (TE) Label Switched Paths (LSPs)", 2340 RFC 5152, February 2008. 2342 [RFC5195] Ould-Brahim, H., Fedyk, D., and Y. Rekhter, "BGP-Based 2343 Auto-Discovery for Layer-1 VPNs", RFC 5195, June 2008. 2345 [RFC5246] Dierks, T. and E. Rescorla, "The Transport Layer Security 2346 (TLS) Protocol Version 1.2", RFC 5246, August 2008. 2348 [RFC5251] Fedyk, D., Rekhter, Y., Papadimitriou, D., Rabbat, R., and 2349 L. Berger, "Layer 1 VPN Basic Mode", RFC 5251, July 2008. 2351 [RFC5252] Bryskin, I. and L. Berger, "OSPF-Based Layer 1 VPN Auto- 2352 Discovery", RFC 5252, July 2008. 2354 [RFC5305] Li, T., and Smit, H., "IS-IS Extensions for Traffic 2355 Engineering", RFC 5305, October 2008. 2357 [RFC5440] Vasseur, JP. and Le Roux, JL., "Path Computation Element 2358 (PCE) Communication Protocol (PCEP)", RFC 5440, March 2009. 
2360 [RFC5441] Vasseur, JP., Zhang, R., Bitar, N., and Le Roux, JL., "A 2361 Backward-Recursive PCE-Based Computation (BRPC) Procedure 2362 to Compute Shortest Constrained Inter-Domain Traffic 2363 Engineering Label Switched Paths", RFC 5441, April 2009. 2365 [RFC5523] L. Berger, "OSPFv3-Based Layer 1 VPN Auto-Discovery", RFC 2366 5523, April 2009. 2368 [RFC5553] Farrel, A., Bradford, R., and JP. Vasseur, "Resource 2369 Reservation Protocol (RSVP) Extensions for Path Key 2370 Support", RFC 5553, May 2009. 2372 [RFC5623] Oki, E., Takeda, T., Le Roux, JL., and A. Farrel, 2373 "Framework for PCE-Based Inter-Layer MPLS and GMPLS Traffic 2374 Engineering", RFC 5623, September 2009. 2376 [RFC5920] L. Fang, Ed., "Security Framework for MPLS and GMPLS 2377 Networks", RFC 5920, July 2010. 2379 [RFC5925] Touch, J., Mankin, A., and R. Bonica, "The TCP 2380 Authentication Option", RFC 5925, June 2010. 2382 [RFC6005] Berger, L., and D. Fedyk, "Generalized MPLS (GMPLS) Support 2383 for Metro Ethernet Forum and G.8011 User Network Interface 2384 (UNI)", RFC 6005, October 2010. 2386 [RFC6107] Shiomoto, K., and A. Farrel, "Procedures for Dynamically 2387 Signaled Hierarchical Label Switched Paths", RFC 6107, 2388 February 2011. 2390 [RFC6071] Frankel, S. and S. Krishnan, "IP Security (IPsec) and 2391 Internet Key Exchange (IKE) Document Roadmap", RFC 6071, 2392 February 2011. 2394 [RFC6805] King, D., and A. Farrel, "The Application of the Path 2395 Computation Element Architecture to the Determination of a 2396 Sequence of Domains in MPLS and GMPLS", RFC 6805, November 2397 2012. 2399 [RFC6827] Malis, A., Lindem, A., and D. Papadimitriou, "Automatically 2400 Switched Optical Network (ASON) Routing for OSPFv2 2401 Protocols", RFC 6827, January 2013. 2403 [RFC6996] J. Mitchell, "Autonomous System (AS) Reservation for 2404 Private Use", BCP 6, RFC 6996, July 2013. 2406 [RFC7399] Farrel, A. and D.
King, "Unanswered Questions in the Path 2407 Computation Element Architecture", RFC 7399, October 2014. 2409 [RFC7579] Bernstein, G., Lee, Y., et al., "General Network Element 2410 Constraint Encoding for GMPLS-Controlled Networks", RFC 2411 7579, June 2015. 2413 [RFC7580] Zhang, F., Lee, Y., Han, J., Bernstein, G., and Xu, Y., 2414 "OSPF-TE Extensions for General Network Element 2415 Constraints", RFC 7580, June 2015. 2417 Authors' Addresses 2419 Adrian Farrel 2420 Juniper Networks 2421 EMail: adrian@olddog.co.uk 2423 John Drake 2424 Juniper Networks 2425 EMail: jdrake@juniper.net 2427 Nabil Bitar 2428 Nokia 2429 EMail: nbitar40@gmail.com 2430 George Swallow 2431 Cisco Systems, Inc. 2432 1414 Massachusetts Ave 2433 Boxborough, MA 01719 2434 EMail: swallow@cisco.com 2436 Xian Zhang 2437 Huawei Technologies 2438 Email: zhang.xian@huawei.com 2440 Daniele Ceccarelli 2441 Ericsson 2442 Via A. Negrone 1/A 2443 Genova - Sestri Ponente 2444 Italy 2445 EMail: daniele.ceccarelli@ericsson.com 2447 Contributors 2449 Gert Grammel 2450 Juniper Networks 2451 Email: ggrammel@juniper.net 2453 Vishnu Pavan Beeram 2454 Juniper Networks 2455 Email: vbeeram@juniper.net 2457 Oscar Gonzalez de Dios 2458 Email: ogondio@tid.es 2460 Fatai Zhang 2461 Email: zhangfatai@huawei.com 2463 Zafar Ali 2464 Email: zali@cisco.com 2466 Rajan Rao 2467 Email: rrao@infinera.com 2469 Sergio Belotti 2470 Email: sergio.belotti@alcatel-lucent.com 2472 Diego Caviglia 2473 Email: diego.caviglia@ericsson.com 2475 Jeff Tantsura 2476 Email: jeff.tantsura@ericsson.com 2477 Khuzema Pithewan 2478 Email: kpithewan@infinera.com 2480 Cyril Margaria 2481 Email: cyril.margaria@googlemail.com 2483 Victor Lopez 2484 Email: vlopez@tid.es 2486 Appendix A. Existing Work 2488 This appendix briefly summarizes relevant existing work that is used 2489 to route TE paths across multiple domains. It is non-normative. 2491 A.1.
Per-Domain Path Computation 2493 The per-domain mechanism of path establishment is described in 2494 [RFC5152] and its applicability is discussed in [RFC4726]. In 2495 summary, this mechanism assumes that each domain entry point is 2496 responsible for computing the path across the domain, but that 2497 details of the path in the next domain are left to the next domain 2498 entry point. The computation may be performed directly by the entry 2499 point or may be delegated to a computation server. 2501 This basic mode of operation can run into many of the issues 2502 described alongside the use cases in Section 2. However, in practice 2503 it can be used effectively with a little operational guidance. 2505 For example, RSVP-TE [RFC3209] includes the concept of a "loose hop" 2506 in the explicit path that is signaled. This allows the original 2507 request for an LSP to list the domains or even domain entry points to 2508 include on the path. Thus, in the example in Figure 1, the source 2509 can be told to use the interconnection x2. Then the source computes 2510 the path from itself to x2, and initiates the signaling. When the 2511 signaling message reaches Domain Z, the entry point to the domain 2512 computes the remaining path to the destination and continues the 2513 signaling. 2515 Another alternative suggested in [RFC5152] is to make TE routing 2516 attempt to follow inter-domain IP routing. Thus, in the example 2517 shown in Figure 2, the source would examine the BGP routing 2518 information to determine the correct interconnection point for 2519 forwarding IP packets, and would use that to compute and then signal 2520 a path for Domain A. Each domain in turn would apply the same 2521 approach so that the path is progressively computed and signaled 2522 domain by domain. 2524 Although the per-domain approach has many issues and drawbacks in 2525 terms of achieving optimal (or, indeed, any) paths, it has been the 2526 mainstay of inter-domain LSP set-up to date. 
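As a purely illustrative, non-normative sketch, the per-domain expansion with a pinned "loose hop" described above can be shown in Python. The two-domain topology, node names (which loosely follow the Figure 1 style), and helper functions here are all invented for this example; a real entry point would run CSPF against its TED rather than the plain BFS used as a stand-in.

```python
from collections import deque

# Toy topology: Domain A holds source S and interconnections x1/x2;
# Domain Z holds destination D.  Each domain sees only its own graph.
DOMAINS = {
    "A": {"S": ["a1", "a2"], "a1": ["S", "x2"], "a2": ["S", "x1"],
          "x1": ["a2"], "x2": ["a1"]},
    "Z": {"x2": ["z1"], "z1": ["x2", "D"], "D": ["z1"]},
}

def intra_domain_path(graph, src, dst):
    """Breadth-first search standing in for the entry point's CSPF."""
    prev, queue = {src: None}, deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:
                path.append(node)
                node = prev[node]
            return path[::-1]
        for nbr in graph.get(node, []):
            if nbr not in prev:
                prev[nbr] = node
                queue.append(nbr)
    return None  # no path within this domain

def per_domain_expand(segments):
    """Expand a path domain by domain.

    'segments' lists (domain, entry, exit) triples: the interconnection
    points are pinned as loose hops, and each domain entry point computes
    only its own strict segment, as in per-domain computation [RFC5152].
    """
    full_path = []
    for i, (domain, entry, exit_) in enumerate(segments):
        seg = intra_domain_path(DOMAINS[domain], entry, exit_)
        if seg is None:
            raise ValueError(f"no path across domain {domain}")
        full_path.extend(seg if i == 0 else seg[1:])  # drop repeated hop
    return full_path
```

For instance, telling the source to use interconnection x2 corresponds to calling per_domain_expand([("A", "S", "x2"), ("Z", "x2", "D")]); each domain fills in only its own portion of the route.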
2528 A.2. Crankback 2530 Crankback addresses one of the main issues with per-domain path 2531 computation: what happens when an initial path is selected that 2532 cannot be completed toward the destination? For example, what 2533 happens if, in Figure 2, the source attempts to route the path 2534 through interconnection x2, but Domain C does not have the right TE 2535 resources or connectivity to route the path further? 2537 Crankback for MPLS-TE and GMPLS networks is described in [RFC4920] 2538 and is based on a concept similar to the Acceptable Label Set 2539 mechanism described for GMPLS signaling in [RFC3473]. When a node 2540 (i.e., a domain entry point) is unable to compute a path further 2541 across the domain, it returns an error message in the signaling 2542 protocol that states where the blockage occurred (link identifier, 2543 node identifier, domain identifier, etc.) and gives some clues about 2544 what caused the blockage (bad choice of label, insufficient bandwidth 2545 available, etc.). This information allows a previous computation 2546 point to select an alternative path, or to aggregate crankback 2547 information and return it upstream to a previous computation point. 2549 Crankback is a very powerful mechanism and can be used to find an 2550 end-to-end path in a multi-domain network if one exists. 2552 On the other hand, crankback can be quite resource-intensive as 2553 signaling messages and path setup attempts may "wander around" in the 2554 network attempting to find the correct path for a long time. Since 2555 RSVP-TE signaling ties up network resources for partially 2556 established LSPs, since network conditions may be in flux, and most 2557 particularly since LSP setup within well-known time limits is highly 2558 desirable, crankback is not a popular mechanism. 2560 Furthermore, even if crankback can always find an end-to-end path, it 2561 does not guarantee to find the optimal path.
(Note that there have 2562 been some academic proposals to use signaling-like techniques to 2563 explore the whole network in order to find optimal paths, but these 2564 tend to place even greater burdens on network processing.) 2566 A.3. Path Computation Element 2568 The Path Computation Element (PCE) is introduced in [RFC4655]. It is 2569 an abstract functional entity that computes paths. Thus, in the 2570 example of per-domain path computation (see A.1) the source node and 2571 each domain entry point is a PCE. On the other hand, the PCE can 2572 also be realized as a separate network element (a server) to which 2573 computation requests can be sent using the Path Computation Element 2574 Communication Protocol (PCEP) [RFC5440]. 2576 Each PCE has responsibility for computations within a domain, and has 2577 visibility of the attributes within that domain. This immediately 2578 enables per-domain path computation with the opportunity to off-load 2579 complex, CPU-intensive, or memory-intensive computation functions 2580 from routers in the network. But the use of PCE in this way does not 2581 solve any of the problems articulated in A.1 and A.2. 2583 Two significant mechanisms for cooperation between PCEs have been 2584 described. These mechanisms are intended to specifically address the 2585 problems of computing optimal end-to-end paths in multi-domain 2586 environments. 2588 - The Backward-Recursive PCE-Based Computation (BRPC) mechanism 2589 [RFC5441] involves cooperation between the set of PCEs along the 2590 inter-domain path. Each one computes the possible paths from 2591 domain entry point (or source node) to domain exit point (or 2592 destination node) and shares the information with its upstream 2593 neighbor PCE which is able to build a tree of possible paths 2594 rooted at the destination. The PCE in the source domain can 2595 select the optimal path. 2597 BRPC is sometimes described as "crankback at computation time". 
It 2598 is capable of determining the optimal path in a multi-domain 2599 network, but depends on knowing the domain that contains the 2600 destination node. Furthermore, the mechanism can become quite 2601 complicated and involve a lot of data in a mesh of interconnected 2602 domains. Thus, BRPC is most often proposed for a simple mesh of 2603 domains and specifically for a path that will cross a known 2604 sequence of domains, but where there may be a choice of domain 2605 interconnections. In this way, BRPC would only be applied to 2606 Figure 2 if a decision had been made (externally) to traverse 2607 Domain C rather than Domain D (notwithstanding that it could 2608 functionally be used to make that choice itself), but BRPC could be 2609 used very effectively to select between interconnections x1 and x2 2610 in Figure 1. 2612 - Hierarchical PCE (H-PCE) [RFC6805] offers a parent PCE that is 2613 responsible for navigating a path across the domain mesh and for 2614 coordinating intra-domain computations by the child PCEs 2615 responsible for each domain. This approach makes computing an end- 2616 to-end path across a mesh of domains far more tractable. However, 2617 it still leaves unanswered the issue of determining the location of 2618 the destination (i.e., discovering the destination domain) as 2619 described in Section 2.1. Furthermore, it raises the question of 2620 who operates the parent PCE especially in networks where the 2621 domains are under different administrative and commercial control. 2623 It should also be noted that [RFC5623] discusses how PCE is used in a 2624 multi-layer network with coordination between PCEs operating at each 2625 network layer. Further issues and considerations of the use of PCE 2626 can be found in [RFC7399]. 2628 A.4. GMPLS UNI and Overlay Networks 2630 [RFC4208] defines the GMPLS User-to-Network Interface (UNI) to 2631 present a routing boundary between an overlay (client) network and 2632 the server network, i.e. 
the client-server interface. In the client 2633 network, the nodes connected directly to the server network are known 2634 as edge nodes, while the nodes in the server network are called core 2635 nodes. 2637 In the overlay model defined by [RFC4208] the core nodes act as a 2638 closed system and the edge nodes do not participate in the routing 2639 protocol instance that runs among the core nodes. Thus the UNI 2640 allows access to and limited control of the core nodes by edge nodes 2641 that are unaware of the topology of the core nodes. This respects 2642 the operational and layer boundaries while scaling the network. 2644 [RFC4208] does not define any routing protocol extension for the 2645 interaction between core and edge nodes but allows for the exchange 2646 of reachability information between them. In terms of a VPN, the 2647 client network can be considered as the customer network comprised 2648 of a number of disjoint sites, and the edge nodes match the VPN CE 2649 nodes. Similarly, the provider network in the VPN model is 2650 equivalent to the server network. 2652 [RFC4208] is, therefore, a signaling-only solution that allows edge 2653 nodes to request connectivity across the server network, and leaves 2654 the server network to select the paths for the LSPs as they traverse 2655 the core nodes (setting up hierarchical LSPs if necessitated by the 2656 technology). This solution is supplemented by a number of signaling 2657 extensions such as [RFC4874], [RFC5553], [I-D.ietf-ccamp-xro-lsp- 2658 subobject], [I-D.ietf-ccamp-rsvp-te-srlg-collect], and [I-D.ietf- 2659 ccamp-te-metric-recording] to give the edge node more control over 2660 the path within the server network by allowing the edge nodes to 2661 supply additional constraints on the path used in the server network.
2662 Nevertheless, in this UNI/overlay model, the edge node has limited 2663 information of precisely what LSPs could be set up across the server 2664 network, and what TE services (such as diverse routes for end-to-end 2665 protection, end-to-end bandwidth, etc.) can be supported. 2667 A.5. Layer One VPN 2669 A Layer One VPN (L1VPN) is a service offered by a layer 1 server 2670 network to provide layer 1 connectivity (TDM, LSC) between two or 2671 more customer networks in an overlay service model [RFC4847]. 2673 As in the UNI case, the customer edge has some control over the 2674 establishment and type of the connectivity. In the L1VPN context, 2675 three different service models have been defined, classified by the 2676 semantics of information exchanged over the customer interface: 2677 Management Based, Signaling Based (a.k.a. basic), and Signaling and 2678 Routing service model (a.k.a. enhanced). 2680 In the management based model, all edge-to-edge connections are set 2681 up using configuration and management tools. This is not a dynamic 2682 control plane solution and need not concern us here. 2684 In the signaling based service model [RFC5251] the CE-PE interface 2685 allows only for signaling message exchange, and the provider network 2686 does not export any routing information about the server network. 2687 VPN membership is known a priori (presumably through configuration) 2688 or is discovered using a routing protocol [RFC5195], [RFC5252], 2689 [RFC5523], as is the relationship between CE nodes and ports on the 2690 PE. This service model is much in line with the GMPLS UNI as defined in 2691 [RFC4208]. 2693 In the enhanced model there is an additional limited exchange of 2694 routing information over the CE-PE interface between the provider 2695 network and the customer network. The enhanced model considers four 2696 different types of service models, namely: Overlay Extension, Virtual 2697 Node, Virtual Link and Per-VPN service models.
All of these 2698 represent particular cases of the TE information aggregation and 2699 representation. 2701 A.6. Policy and Link Advertisement 2703 Inter-domain networking relies on policy and management input to 2704 coordinate the allocation of resources under different administrative 2705 control. [RFC5623] introduces a functional component called the 2706 Virtual Network Topology Manager (VNTM) for this purpose. 2708 An important companion to this function is determining how 2709 connectivity across the abstraction layer network is made available 2710 as a TE link in the client network. Obviously, if the connectivity 2711 is established using management intervention, the consequent client 2712 network TE link can also be configured manually. However, if 2713 connectivity from client edge to client edge is achieved using 2714 dynamic signaling then there is a need for the end points to exchange 2715 the link properties that they should advertise within the client 2716 network, and in the case of support for more than one client network, 2717 it will be necessary to indicate which client network or networks can 2718 use the link. This capability is provided in [RFC6107]. 2720 Appendix B. Additional Features 2722 This Appendix describes additional features that may be desirable and 2723 that can be achieved within this architecture. It is non-normative. 2725 B.1. Macro Shared Risk Link Groups 2727 Network links often share fate with one or more other links. That 2728 is, a scenario that may cause a link to fail could cause one or more 2729 other links to fail. This may occur, for example, if the links are 2730 supported by the same fiber bundle, or if some links are routed down 2731 the same duct or in a common piece of infrastructure such as a 2732 bridge. A common way to identify the links that may share fate is to 2733 label them as belonging to a Shared Risk Link Group (SRLG) [RFC4202].
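As a non-normative illustration of how SRLG labels are consumed, the following Python sketch checks whether two paths are SRLG-disjoint. The client link names echo the C2-C0/C2-C3 example used elsewhere in this document, but the SRLG identifiers and the mapping itself are invented for the example; in practice SRLGs are advertised per TE link as described in [RFC4202].

```python
# Hypothetical SRLG database: link (endpoints sorted) -> set of SRLG ids.
SRLGS = {
    ("C0", "C2"): {17},        # e.g., rides server network link S1-S2
    ("C2", "C3"): {17, 42},    # also rides S1-S2, plus another resource
    ("C2", "C5"): {42},
}

def path_srlgs(path):
    """Collect the SRLGs of every link along a path of node names."""
    return set().union(*(SRLGS.get(tuple(sorted(hop)), set())
                         for hop in zip(path, path[1:])))

def srlg_disjoint(path_a, path_b):
    """True iff the two paths share no SRLG (no single shared risk)."""
    return not (path_srlgs(path_a) & path_srlgs(path_b))
```

A client computing a link-diverse protection pair would reject the combination of C2-C0 and C2-C3 here, because both carry SRLG 17 (the shared server resource), while C2-C5 remains an acceptable alternative.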
2735 TE links created from LSPs in lower layers may also share fate, and
2736 it can be hard for a client network to know about this problem
2737 because it does not know the topology of the server network or the
2738 paths of the server network LSPs that are used to create the links in
2739 the client network.

2741 For example, looking at the example used in Section 4.2.3 and
2742 considering the two abstract links S1-S3 and S1-S9, there is no way
2743 for the client network to know whether the links C2-C0 and C2-C3
2744 share fate. Clearly, if the client layer uses these links to provide
2745 a link-diverse end-to-end protection scheme, it needs to know that
2746 the links actually share a piece of network infrastructure (the
2747 server network link S1-S2).

2749 Per [RFC4202], an SRLG represents a shared physical network resource
2750 upon which the normal functioning of a link depends. Multiple SRLGs
2751 can be identified and advertised for every TE link in a network.
2752 However, this can produce a scalability problem in a multi-layer
2753 network that equates to advertising in the client network the server
2754 network route of each TE link.

2756 Macro SRLGs (MSRLGs) address this scaling problem and are a form of
2757 abstraction performed at the same time that the abstract links are
2758 derived. In this way, links that actually share resources in the
2759 server network are advertised as having the same MSRLG, rather than
2760 advertising each SRLG for each resource on each path in the server
2761 network. This saving is possible because the abstract links are
2762 formulated on behalf of the server network by a central management
2763 agency that is aware of all of the link abstractions being offered.

2765 It may be noted that a less optimal alternative path for the abstract
2766 link S1-S9 exists in the server network (S1-S4-S7-S8-S9).
It would
2767 be possible for the client network request for connectivity C2-C0 to
2768 ask that the path be maximally disjoint from the path C2-C3. While
2769 nothing can be done about the shared link C2-S1, the abstraction
2770 layer could request to use the link S1-S9 in a way that is diverse
2771 from the use of the link S1-S3, and this request could be honored if
2772 the server network policy allows.

2774 Note that SRLGs and MSRLGs may be very hard to describe in the case
2775 of multiple server networks because the abstraction points will not
2776 know whether the resources in the various server layers share
2777 physical locations.

2779 B.2. Mutual Exclusivity

2781 As noted in the discussion of Figure 13, it is possible that some
2782 abstraction layer links cannot be used at the same time. This
2783 arises when the potential of the links is indicated by the server
2784 network, but the use of the links would actually compete for server
2785 network resources. In Figure 13 this arose when both link S1-S3 and
2786 link S7-S9 were used to carry LSPs: in that case the link S1-S9 could
2787 no longer be used.

2789 Such a situation need not be an issue when client-edge to client-edge
2790 LSPs are set up one by one because the use of one abstraction layer
2791 link and the corresponding use of server network resources will cause
2792 the server network to withdraw the availability of the other
2793 abstraction layer links, and these will become unavailable for
2794 further abstraction layer path computations.

2796 Furthermore, in deployments where abstraction layer links are only
2797 presented as available after server network LSPs have been
2798 established to support them, the problem is unlikely to exist.
2800 However, when the server network is constrained but chooses to
2801 advertise the potential of multiple abstraction layer links even
2802 though they compete for resources, and when multiple client-edge to
2803 client-edge LSPs are computed simultaneously (perhaps to provide
2804 protection services), there may be contention for server network
2805 resources. In the case that protected abstraction layer LSPs are
2806 being established, this situation would be avoided through the use of
2807 SRLGs and/or MSRLGs since the two abstraction layer links that
2808 compete for server network resources must also share fate across
2809 those resources. But in the case where the multiple client-edge to
2810 client-edge LSPs do not care about fate sharing, it may be necessary
2811 to flag the mutually exclusive links in the abstraction layer TED so
2812 that path computation can avoid accidentally attempting to use
2813 two of a set of such links at the same time.
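One simplified way to model such a flag in the abstraction layer TED is the sketch below, in which each flagged link carries a mutual-exclusion group identifier, and a set of simultaneously computed paths is rejected if it uses more than one link from any group. This is a hypothetical pairwise model, not a proposal for a protocol encoding; the link names echo the Figure 13 example, but the data structures are invented for illustration.

```python
# Hypothetical sketch: detecting mutually exclusive abstraction layer
# links during simultaneous path computation. Links that compete for the
# same server network resources share a mutual-exclusion group id.

from collections import Counter

def violates_mutex(paths, link_mutex_group):
    """True if the combined paths use more than one link from any
    mutual-exclusion group."""
    counts = Counter()
    for path in paths:
        for link in path:
            group = link_mutex_group.get(link)
            if group is not None:
                counts[group] += 1
    return any(n > 1 for n in counts.values())

# S1-S3 and S1-S9 compete for the same server network resources, so
# they are flagged with the same (invented) group id.
link_mutex_group = {"S1-S3": "g1", "S1-S9": "g1"}
print(violates_mutex([["S1-S3"], ["S1-S9"]], link_mutex_group))  # True
print(violates_mutex([["S1-S3"], ["S7-S8"]], link_mutex_group))  # False
```

A path computation that produces a set of client-edge to client-edge paths failing this check would need to re-route one of them over links outside the contested group.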