Network Working Group
Internet Draft
Expiration Date: May 2002
Document: draft-ietf-ipo-optical-inter-domain-00.txt

G. Bernstein, L. Ong
Ciena
B. Rajagopalan
D. Pendarakis
Tellium
Angela Chiu
Celion
Frank Hujber
---
John Strand
AT&T
V. Sharma
Metanoia
Sudheer Dharanikota
Nayna Networks
Dean Cheng
Polaris
Rauf Izmailov
NEC

November 2001

Optical Inter Domain Routing Considerations

Status of this Memo

This document is an Internet-Draft and is in full conformance with all provisions of Section 10 of RFC2026 [1]. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

Abstract

This draft investigates the requirements for general inter-domain and inter-area routing in optical networks and reviews the applicability of existing routing protocols in various optical routing applications.

Bernstein, G.                                                  [Page 1]
draft-ipo-optical-inter-domain-00.txt                     November 2001

Table of Contents:

1 Introduction
   1.1 Specification of Requirements
   1.2 Abbreviations
2 Background
   2.1 Basic Concept of Domains and Network Partitioning
   2.2 Major Differences between Optical and IP datagram Routing
   2.3 Diversity in Optical Routing
      2.3.1 Generalizing Link Diversity
      2.3.2 Generalizing Node Diversity
   2.4 Routing Information Categorization
      2.4.1 Link and Topology Related Information
      2.4.2 Domain and Node Related Information
3 Applications of Inter Domain Optical Routing
   3.1 Intra Carrier Applications of Optical Inter Domain Routing
      3.1.1 Intra-Carrier Scalability
      3.1.2 Intra-Carrier Inter-vendor
      3.1.3 Inter-Layer Partitioning
      3.1.4 Interaction with IP Layer Routing
      3.1.5 Inter-Business Unit
   3.2 Inter-Carrier Inter-Domain Optical Routing
   3.3 Multi-Domain Connection Control
4 Multiple Layers of Routing
   4.1 Layers in Transport Networks
   4.2 Layer Integration
   4.3 Interaction with IP Layer Routing
5 Existing Routing Protocol Applicability
   5.1 OSPF
      5.1.1 Terminology
      5.1.2 Neighbor/Adjacency Discovery
      5.1.3 Addressing & Reachability
      5.1.4 Topology Discovery & Dissemination
      5.1.5 Resources
      5.1.6 General Protocol Properties
         5.1.6.1 System Overhead
         5.1.6.2 Network Resource Overhead
         5.1.6.3 Reliability
      5.1.7 Scaling Capability
      5.1.8 Interworking Capability
      5.1.9 References
   5.2 IS-IS and Integrated IS-IS
      5.2.1 Terminology
      5.2.2 Neighbor/Adjacency Discovery
      5.2.3 Addressing & Reachability
         5.2.3.1 CLNP/NSAP addressing and routing
         5.2.3.2 IP addressing and routing
      5.2.4 Topology Discovery & Dissemination
      5.2.5 Resources
      5.2.6 Interface types and network Medium Support
      5.2.7 General Protocol Properties
         5.2.7.1 System Overhead
         5.2.7.2 Network Resource Overhead
         5.2.7.3 Reliability
      5.2.8 Scaling Capability
      5.2.9 Interworking Capability
      5.2.10 Inter Domain Routing Protocols
      5.2.11 References
   5.3 BGP
      5.3.1 Terminology
      5.3.2 Neighbor/Adjacency Discovery
      5.3.3 Addressing & Reachability
      5.3.4 Topology Discovery & Dissemination
      5.3.5 Scaling Capabilities
      5.3.6 Interworking Capability
      5.3.7 References
   5.4 PNNI routing
      5.4.1 Terminology
      5.4.2 Neighbor/Adjacency Discovery
      5.4.3 Addressing & Reachability
      5.4.4 Topology Discovery & Dissemination
      5.4.5 Resources
      5.4.6 General Protocol Properties
         5.4.6.1 System Overhead
         5.4.6.2 Network Resource Overhead
         5.4.6.3 Reliability
      5.4.7 Scaling Capability
      5.4.8 References
6 Conclusion
7 Security Considerations
      7.1.1
8 References
9 Acknowledgments
10 Author's Addresses

1 Introduction

Multi Protocol Label Switching (MPLS) has received much attention recently for use as a control plane for non-packet switched technologies. In particular, optical technologies have a need to upgrade their control plane as reviewed in reference [2]. Many different optical switching and multiplexing technologies exist and more are sure to come. For the purposes of this draft we only consider non-packet (i.e., circuit switching) forms of optical switching.

As the requirements for and extensions to interior gateway protocols such as OSPF and IS-IS have begun to be investigated in the single area case, e.g., reference [3], we consider the requirements that optical networking and switching impose in the inter-domain case.
By inter-domain in this draft we mean not only administrative (inter-carrier) partitioning of routing but also inter-area, inter-layer, and inter-vendor partitioning, and possibly other ways of partitioning routing. These requirements are compared with existing functionality in BGP, OSPF, IS-IS and ATM's PNNI routing protocol. In particular, optical routing needs to provide for path diversity, switching capabilities, transport capabilities and impairments, and bandwidth/resource status reporting.

To add to the concreteness of these considerations we try to illustrate them with one or more specific examples from a particular optical networking layer or technology. This is not to reduce the generality of the requirement but to facilitate the understanding of the requirement or concept.

1.1 Specification of Requirements

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.

1.2 Abbreviations

LSP      Label Switched Path (MPLS terminology)
LSR      Label Switched Router (MPLS terminology)
MPLS     Multiprotocol Label Switching
SDH      Synchronous Digital Hierarchy (ITU standard)
SONET    Synchronous Optical NETwork (ANSI standard)
STM(-N)  Synchronous Transport Module (-N)
STS(-N)  Synchronous Transport Signal-Level N (SONET)
TU-n     Tributary Unit-n (SDH)
TUG(-n)  Tributary Unit Group (-n) (SDH)
VC-n     Virtual Container-n (SDH)
VTn      Virtual Tributary-n (SONET)

2 Background

2.1 Basic Concept of Domains and Network Partitioning

In this draft we use the term domain in a very general sense, i.e., beyond the BGP interpretation of Administrative Domain. We consider domains as the result of partitioning a network into subnetworks, as shown in the network of Figure 1.
A network may be partitioned for a variety of reasons, such as:

* Administrative boundaries
* Scalability of routing or signaling
* Isolation of partitions for security or reliability reasons
* Technology differences in the systems in different domains

The Inter-Domain interface is likely to have different characteristics than the Intra-Domain interface, as the domain boundary exists for the purpose of hiding some aspect within the domain from the outside world.

Examples of the use of Domains include BGP Autonomous Systems (AS) and OSPF Areas. An Administrative Domain AS is a subnetwork under the control of a single administration as viewed from outside of the domain. In the BGP case this is used to denote an interface between two separate carriers or ISPs. In the terminology used here we would call this an inter-carrier domain interface. An OSPF Area, on the other hand, is a subnetwork within an administrative domain (OSPF specific) and results from partitioning of domains within a single carrier, i.e., the interface to an Area is an intra-carrier domain interface.

[Figure 1: the ASCII diagram is not recoverable from this copy. It shows three domains, A, B, and C, each containing network elements (NEs), with labels marking an internal node, the border nodes, the inter-domain (external) links, and the intra-domain (internal) links.]

Figure 1. Partitioning a network into domains.

The domain concept as used here is orthogonal to the transport network concept of layering. When the term layer is used with respect to the transport network we are not referring to the 7-layer OSI model, which includes application, presentation, etc., layers. With regard to this model all the optical transport layers would lie at the "physical layer". In the transport network, layers are used for multiplexing, performance monitoring and fault management purposes. Layers tend to be very technology specific. At some point an optical routing protocol must include information particular to the technology layer for which it is being used to acquire/disseminate topology and resource status information. For more information on layering and domain concepts see reference [G.805].

2.2 Major Differences between Optical and IP datagram Routing

Let us first review the major differences between routing for optical (circuit switched) networks and IP datagram networks. In IP datagram networks packet forwarding is done on a hop-by-hop basis (no connection is established ahead of time). In circuit switched optical networks, by contrast, end-to-end connections must be explicitly established based on network topology and resource status information. This topology and resource status information can be obtained via routing protocols. Note that the routing protocols in the circuit switched case are not involved with data (or bit) forwarding, i.e., they are not "service impacting", while in the IP datagram case the routing protocols are explicitly involved with data plane forwarding decisions and hence are very much service impacting.

This does not imply routing is unimportant in the optical case, only that its service impacting effect is secondary.
For example, topology and resource status inaccuracies will affect whether a new connection can be established (or a restoration connection can be established) but will not (and should not) cause an existing connection to be torn down.

This tends to lead to a slightly different view towards incorporating new information fields (objects, LSA, etc.) into optical routing protocols versus IP routing protocols. In the optical circuit case, any information that can potentially aid in route computations or be used in service differentiation may be incorporated into the routing protocol, as either a standard element or a vendor specific extension. Whether a route computation algorithm uses this information, and whether two route computation algorithms use this information in the same way, does not matter since the optical connections are explicitly routed (although perhaps loosely).

The optical route computation problem is really a constraint-based routing problem. The basic route calculation is an atomic service that occurs, for a given connection, in a single network element. (In the case of loose explicit routing some details may be filled in by other NEs.) This means that, even in a heterogeneous optical network, NEs from different vendors need not use the same algorithm.

Another difference is that hard blocking prevails in the optical world, while some level of overloading is acceptable in the IP world; i.e., statistical multiplexing is not available with optical circuits. This also manifests itself in the commitment of the protection (or restoration) bandwidth. In a packet-based network, although the protection path can be set up prior to any fault, the resources along the protection path are not used until the failure occurs. In circuit-based networks a protection path generally implies a committed resource.
Such a basic difference restricts the direct applicability of some of the traffic engineering mechanisms used in a packet-based network to a circuit-based network.

2.3 Diversity in Optical Routing

There are two basic demands that drive the need to discover diverse routes for establishing optical paths:

1. Reliability/Robustness
2. Bandwidth capacity.

Many times multiple optical connections are set up between the same end points. An important constraint on these connections is that they must be diversely routed in some way [4]. In particular they could be routed over paths that are link diverse, i.e., the two connections do not share any common link. A more stringent constraint is that the two paths be node diverse, i.e., the two paths do not traverse any common node. Additionally, insufficient bandwidth may exist to set up all the desired connections across the same path (set of links), and hence we need to know about alternative (diverse) ways of reaching the destination that may still have unused capacity.

"Diversity" is a relationship between lightpaths. Two lightpaths are said to be diverse if they have no single point of failure in common. In traditional telephony the dominant transport failure mode is a failure in the interoffice plant, such as a fiber cut inflicted by a backhoe.

Data network operators have relied on their private line providers to ensure diversity, and so IP routing protocols have not had to deal directly with the problem. GMPLS makes the complexities handled by the private line provisioning process, including diversity, part of the common control plane and so visible to all.

Diversity is discussed in the IPO WG document [5]. A key associated concept, "Shared Risk Link Groups", is discussed in a number of other IETF (refs) and OIF (refs) documents. Some implications for routing that are drawn in [5] are:

* Dealing with diversity is an unavoidable requirement for routing in the optical layer. It requires dealing with constraints in the routing process but, most importantly, requires additional state information (the SRLG relationships and also the routings of any existing circuits from which the new circuit is to be diverse) to be available to the routing process.

* At present SRLG information cannot be self-discovered. Indeed, in a large network it is very difficult to maintain accurate SRLG information. The problem becomes particularly daunting whenever multiple administrative domains are involved, for instance after the acquisition of one network by another, because diversity violations between the domains are likely. It is very unlikely that diversity relationships between carriers will be known any time in the near future.

* Considerable variation in what different customers will mean by acceptable diversity should be anticipated. Consequently we suggest that an SRLG should be defined as follows: (i) it is a relationship between two or more links, and (ii) it is characterized by two parameters, the type of compromise (shared conduit, shared ROW, shared optical ring, etc.) and the extent of the compromise (e.g., the number of miles over which the compromise persists). This will allow the SRLGs appropriate to a particular routing request to be easily identified.

2.3.1 Generalizing Link Diversity

Optical networks may possess a number of hierarchical signaling layers. For example, two routers interconnected across an optical network may communicate with IP packets encapsulated within an STS-48c SONET path layer signal. Within the optical network this STS-48c signal may be multiplexed at the SONET line layer into an OC-192 line layer signal. In addition this OC-192 may be wavelength division multiplexed onto a fiber with other OC-192 signals at different wavelengths (lambdas).
These WDM signals can then be either lambda switched, wave band switched or fiber switched. Hence when we talk about diversity we need to specify the layer to which we are referring. In the previous example we can talk about diversity with respect to the SONET line layer, wave bands, and/or optical fibers. A similar situation arises when we consider the definition of node diversity. For example, are we talking with respect to a SONET path layer switch or an optical switch or multiplexer?

The Shared Risk Link Group concept in reference [6] generalizes the notion of link diversity (general list of numbers). First, with respect to major outages (cable cuts, natural disasters), it is useful to have a few more types of diversity defined:

1. Cable (conduit) diversity. This allows us to know which fibers are in the same cable (conduit), and helps avoid sending signals over routes that are most vulnerable to "ordinary" cable cuts (technically known as backhoe fades).

2. Right of Way (ROW) diversity. This helps avoid sending signals over routes that are subject to larger scale disasters such as ship anchor drags, train derailments, etc.

3. Geographic Route diversity. This type of diversity can help one avoid sending signals over routes that are subject to various larger scale disasters such as earthquakes, floods, tornadoes, hurricanes, etc. A route could be approximately described by a piecewise set of latitude/longitude or UTM coordinate pairs.

We also have a form of link abstraction/summarization via the link bundling concept [7].

2.3.2 Generalizing Node Diversity

The concept of a node abstraction associated with GMPLS appears in reference [11], where it is used to generalize the concept of an explicitly routed path. In this case an abstract node can be a set of IP addresses or an AS number. From the point of view of node diverse routing, specific concepts of interest include:

1. Nodes, i.e., individual switching elements.
2. Switching centers, i.e., a central office or exchange site.
3. Cities, or towns that contain more than one switching center.
4. Metro areas, or counties
5. States,
6. Countries, or
7. Geographic Regions

2.4 Routing Information Categorization

Different applications of inter-domain optical routing call for different types of information to be shared or hidden between domains. In the following we decompose the information that can be transferred via a routing protocol broadly into link/topology information and node/domain information. We further subdivide these categories and will use this taxonomy of routing information when discussing the routing applications.

2.4.1 Link and Topology Related Information

-Internal topology- information is information concerning the nodes and links and their connectivity within a domain. This type of information is traditionally shared within a domain via an intra-domain (interior gateway) routing protocol such as OSPF or IS-IS. For example, the existence of nodes that only have links to other nodes within the domain, i.e., do not have links to other domains, would be strictly internal topology information. These nodes are known as internal nodes, while nodes with links to other domains are known as border nodes. Also included in this information is link/port property information such as whether the link is protected and what type of protection is being used, e.g., linear 1+1, linear 1:N, or some type of ring such as a 4F-BLSR [cite T1 document].

-Internal Resource- Information is concerned with the bandwidth available on links within a domain and possibly other resource related information. This information plays an important role in path selection within a domain.

-Inter-Domain Topology- Information is concerned with how the domains are interconnected.
This information can be key in inter-domain path selection, for example, in determining diverse routes. For the network in Figure 1 this information would let us know that domain A has two distinct links to domain B, domain A has two distinct links to domain C, but that domains B and C are not directly connected via any links.

-Inter-Domain Resource- Information is concerned with the available bandwidth on inter-domain links. This information is important for inter-domain path selection and inter-domain traffic engineering purposes. For example, in Figure 1 this information would give us some kind of bandwidth or capacity measure on the links between domain A and domain B, and the links between domains A and C. The exact nature of this information may be application/context dependent.

2.4.2 Domain and Node Related Information

-Reachability- information tells us what addresses are directly reachable via a particular domain. These systems can be end systems (clients) to the network or nodes within the network, depending upon the application/context. Suppose in domain B of Figure 1 each of the network elements, NE1-NE3, has subtending end systems, and that NE1-NE3 do not represent a valid final destination for a path. Under this assumption the collection of the addresses of all these subtending end systems would form the reachability information for domain B.

-Subnetwork Capability- information is concerned with the capabilities or features offered by the domain as a whole. This information is used in some applications where sharing the internal topology and resource information is inappropriate. This information can include: (a) Switching capabilities, (b) Protection capabilities, (c) Some kind of overall available capacity measure, (d) Reliability measures. Examples:

1. In the SONET realm, one subnetwork may switch down to an STS-3c granularity while another switches down to an STS-1 granularity. Understanding what types of signals within a SDH/SONET multiplex structure can be switched by a subnetwork is important. Similar examples of granularity in switching apply to the waveband case.

2. Some networking technologies, particularly SONET/SDH, provide a wide range of standardized protection technologies. But not all domains will offer all protection options. For example, a 2/4-F BLSR based subnetwork could offer extra data traffic, ring protected traffic and non-preemptible unprotected traffic (NUT) [8], while a mesh network might offer shared SONET line layer linear protection and some form of mesh protection.

3. Some domains may be in locations that have lower incidences of link failure. Such information could be helpful in computing routes to statistically "share the pain".

-End System Capabilities- information: While properties of the subnetwork are very important when trying to decide which domain to use to access a system (in the case of multi-homing), end systems also possess a wide variety of capabilities. Putting end system capabilities, such as a system's ability to support SONET/SDH virtual concatenation, into a routing protocol for distribution may not be that advantageous, since it somewhat counters the ability to summarize reachability information. Detailed end-system information may alternatively be obtained via a directory service or some type of direct query between the end systems.

3 Applications of Inter Domain Optical Routing

3.1 Intra Carrier Applications of Optical Inter Domain Routing

Intra Carrier inter domain routing refers to a situation where the network that is to be partitioned into areas is under the control of one administrative entity. The main reasons for this partitioning in optical networks stem from scalability, inter-vendor interoperability, legacy equipment interoperability, and inter-layer partitioning.

3.1.1 Intra-Carrier Scalability

As networks grow it is useful to partition a carrier's network into separate optical routing domains which share limited or summarized information amongst each other. This reduces the overhead of information exchange across the network as a whole, and reduces the convergence time of routing protocols within a particular area. Hence in the intra-carrier scalability application we will hide or summarize internal topology and resource information, while completely sharing inter-domain topology and resource information so that diverse paths can still be calculated. Note that general domain capabilities/capacity as well as reachability information would tend to be shared as completely as possible.

For example, the network shown in Figure 1 can be approximately represented as shown in Figure 2. This summarized network topology has only 4 links whose state needs to be advertised in a routing protocol, versus 17 links in the original network. Note that we may also advertise the capabilities of the three domains in Figure 2 as opposed to the 12 nodes of Figure 1.

Note that this partitioning into domains can recurse, i.e., we can have multiple levels of routing hierarchy to permit larger and larger networks. Such was the motivation behind the extensive hierarchical routing capability within ATM's PNNI routing protocol.

The trade-off to partitioning into domains for scalability is that less information is available for use in the route selection process, which can lead to inefficient utilization of network resources. On the other hand, this partitioning frequently occurs on somewhat "natural" boundaries, and as such the potential inefficiencies can be minimized.
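The summarization described above can be illustrated with a minimal sketch. It assumes a small hypothetical topology (not the network of Figure 1), represents each node as a (domain, NE) pair, and uses illustrative function names (summarize, node_diverse_pairs) that do not come from any routing protocol. Intra-domain links are hidden, every inter-domain link is kept, and the kept links are then checked for pairs that share no border node.

```python
from itertools import combinations

# Each node is a (domain, NE) pair; each link joins two nodes.
# Hypothetical topology for illustration only (not the network of Figure 1).
LINKS = [
    (("A", "NE1"), ("A", "NE2")),
    (("A", "NE2"), ("A", "NE3")),
    (("A", "NE1"), ("A", "NE3")),
    (("B", "NE1"), ("B", "NE2")),
    (("C", "NE1"), ("C", "NE2")),
    (("A", "NE2"), ("B", "NE1")),
    (("A", "NE3"), ("B", "NE2")),
    (("A", "NE1"), ("C", "NE1")),
    (("A", "NE1"), ("C", "NE2")),
]


def summarize(links):
    """Hide intra-domain links; keep every inter-domain link in full."""
    return [(u, v) for u, v in links if u[0] != v[0]]


def node_diverse_pairs(inter_links, dom_x, dom_y):
    """Return pairs of links between two domains that share no border node."""
    between = [
        link for link in inter_links
        if {link[0][0], link[1][0]} == {dom_x, dom_y}
    ]
    return [
        (p, q) for p, q in combinations(between, 2)
        if not set(p) & set(q)
    ]


summary = summarize(LINKS)
print(len(LINKS), "links before,", len(summary), "advertised after")
print(len(node_diverse_pairs(summary, "A", "B")), "node-diverse A-B link pair(s)")
print(len(node_diverse_pairs(summary, "A", "C")), "node-diverse A-C link pair(s)")
```

In this hypothetical network both A-C links terminate on the same border node in domain A, so they are link diverse but not node diverse. That distinction is only visible if the summary retains the border node identity of each inter-domain link end point.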
[Figure 2: the ASCII diagram is not recoverable from this copy. It shows domains A, B, and C of Figure 1 each collapsed into a single node, joined by the inter-domain links, with each link end point labeled by NE and port number.]

Figure 2. Summarized topology for the network of Figure 1.

Also in Figure 2 we show the end points of the links between domains being identified by the triple of (domain, NE address, NE port number). This information is available via the discovery process. Though not strictly necessary, identifying the border nodes of a domain is valuable in setting up diverse inter-domain paths, because it allows other nodes to understand whether these links terminate on the same or different nodes.

In current intra-domain IP routing protocols, such as OSPF, a multiple-area capability provides for intra-carrier scalability. However, this is currently done by sharing reachability information and using a distance-vector method to obtain routes. This does not discover and propagate inter-domain topology information, and hence insufficient information is available for diverse route calculations.

When the topology within the domain is approximated or hidden, signaling and call processing at the domain border will receive an approximated (loose) route, and the border node or signaling entity must then translate this to a precise route through the domain. Hence there is some linkage between multi-domain connection control and inter-area/inter-domain routing.

3.1.2 Intra-Carrier Inter-vendor

An important application of intra-carrier optical routing is the intra-carrier inter-vendor scenario.
From a carrier's perspective, the use of domains provides a clean way to isolate clouds of equipment belonging to different vendors, while at the same time allowing for interoperability between the vendors. An advantage of this method is that it allows the vendors complete freedom to use any combination of routing protocols or traditional management-based methods to propagate topology and resources internal to their domains. In other words, the routing entity in each domain could obtain this information either by participating in a routing protocol like OSPF, or by querying each NE via an EMS, or by simply having the required information manually configured into it.

Note that the routing entity shown in each vendor's cloud in Figure 3 is in reality an abstract representation of the routing intelligence within the vendor cloud. This intelligence may either be implemented in a distributed way, by having a routing protocol running at each NE, or in a centralized way, through the use of an intelligent, centralized routing entity that communicates with the individual NEs (either via a protocol or by querying individual elements) to retrieve connectivity and resource information that it uses to build a complete topology and resource map of the domain.

Therefore, vendors may use different protocols as the primary option between their own devices, adding specialized features or optimizing their performance based on their choice of protocol.

[Figure 3: the ASCII diagram is not recoverable from this copy. It shows Domain A (Vendor 1) and Domain B (Vendor 2), each containing NE1-NE3 and a Routing Entity. The links between the two domains are drawn, with one path labeled "Neighbor discovery between NEs in the same domain" and a separate path between the two routing entities labeled "Exchange of routing information between the domains' routing entities".]

Figure 3. Intra-Carrier Inter-vendor routing domains

Even if it is a centralized entity, the routing entity could still be run on a given NE in the vendor cloud. In other words, these entities could be distributed, or centralized onto a single node, or independent of any of the nodes. In ATM's PNNI protocol, for example, this was centralized on a node elected as the "peer group leader". In the inter-vendor case, it can be particularly advantageous to centralize this so that the flow of information can be monitored. A centralized routing entity could apply flooding and summarization mechanisms as if it were a switching system. Since this is optical rather than IP routing, signaling would be carried by a control channel between the routing entity and the neighboring system, rather than being carried over the data links.

The functions of the routing entity include: (a) direct reachability exchange (that is, which NEs can be directly reached from this domain), (b) verification of area connectedness (that is, understanding how the two domains are interconnected), (c) area/domain topology (possibly summarized) exchange and updates, and (d) topology updates concerning other domains/areas.

When a carrier partitions its network for inter-vendor interoperability as described above, it may still share information about the internal topology of the domains in some standardized form that has been agreed upon between the vendors. Although one option is to force both vendors to adopt a new common protocol, another is to require that only a minimum subset of reachability/topology information be shared between the vendor clouds.
The latter option helps during fault situations, by providing fault isolation at the domain boundaries. It prevents an outage in a domain composed of one vendor's equipment from causing a reaction in an adjacent domain composed of another vendor's equipment, thus preventing a situation that would typically degenerate into a process known as "finger pointing" between the two vendors.

The setup described above takes into account the three most important sub-cases of intra-carrier inter-vendor partitioning:

a. The first is where both vendor domains run distributed routing protocols. This is the most flexible case, and is the situation when new equipment capable of running such protocols is deployed.

b. The second is where the optical subnetworks or domains (which include a large number of existing installations) do not run any internal routing protocol (because the NEs are not capable of doing so), relying instead on EMS-based topology discovery/resource management. In this case, interoperability with other vendor clouds can be realized by having the routing entity run as a separate software entity with access to the appropriate information. These entities may exchange routing proxy addresses through the neighbor discovery protocol, and then exchange routing information (proxying for the entire domain) with each other. The basic advantage here is that even though the vendor specific element management system (EMS) knows the topology of its subnetwork, it is far easier to get an inter-domain routing protocol to share information than to get the separate vendors' management systems to communicate.

c. The third is where one domain has a centralized routing entity, while the other runs a distributed routing protocol. Once again, the neighbor discovery process between the area border NEs could be used to advertise the address of the routing entity.
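The three sub-cases above differ only in how a domain's routing entity learns its internal topology; what it advertises to its peer can take the same form in each. The following is a minimal sketch of that idea, not a protocol definition. The class name RoutingEntity, its fields, the advertisement layout, and the two stand-in topology sources are all hypothetical, and the addresses are documentation placeholders.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set, Tuple

Link = Tuple[str, str]


@dataclass
class RoutingEntity:
    """Hypothetical per-domain routing entity (names are illustrative)."""
    domain: str
    address: str                              # advertised via neighbor discovery
    learn_topology: Callable[[], List[Link]]  # protocol, EMS query, or manual config
    border_nodes: Set[str]

    def advertisement(self) -> Dict[str, object]:
        """Summarize the domain for a peer: reachability only, no internal links."""
        nodes = {n for link in self.learn_topology() for n in link}
        return {
            "domain": self.domain,
            "routing_entity": self.address,
            "reachable": sorted(nodes | self.border_nodes),
            "border_nodes": sorted(self.border_nodes),
        }


def from_link_state_database() -> List[Link]:
    # Stand-in for topology learned by a distributed routing protocol.
    return [("NE1", "NE2"), ("NE2", "NE3"), ("NE1", "NE3")]


def from_ems() -> List[Link]:
    # Stand-in for topology read from a vendor EMS.
    return [("NE1", "NE2"), ("NE2", "NE3")]


# Sub-case (c): one domain runs a distributed protocol, the other relies on an EMS.
entity_a = RoutingEntity("A", "192.0.2.1", from_link_state_database, {"NE1", "NE2"})
entity_b = RoutingEntity("B", "198.51.100.1", from_ems, {"NE1"})
adv_a, adv_b = entity_a.advertisement(), entity_b.advertisement()
print(adv_a)
print(adv_b)
```

Because the topology source is passed in as a callable, the peer sees the same advertisement structure whichever sub-case applies, which is one way to read the fault isolation argument made above: nothing vendor-specific crosses the domain boundary.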
3.1.3 Inter-Layer Partitioning

In transport networks, layering is a part of the multiplex and OA&M structure of the signals, playing a role in multiplexing, monitoring and general link management. Layering in the transport network is defined in fairly abstract terms in [G.805] and the concepts are applied to SDH in [G.803]. As explained in a recent ITU SG15 document (WD45 Q.14/15), not all the layers in the transport network are of interest to the control plane, or to routing in particular. Some layers may not contain active switching elements; however, this does not mean that information flow concerning a non-switching layer is not valuable in routing. For example, in [GB-WDM-SRLG] static WDM layer information was used to set the SRLGs for SONET lines (i.e., information passed around by a link state protocol operating at the SONET line layer). It should be noted that much of the information available from non-switching layers relates to performance monitoring and fault management.

In this situation the network is partitioned into sub-networks that operate at different switching layers. One reason for doing this is that not all the information from one layer is necessary or relevant to another layer. For example, between transparent optical switches and SDH/SONET path (VC) layer switches, the optical switches have no direct use for the SONET layer information. In addition, optical networks may keep a lot more physical layer information (such as the properties of every optical amplifier on a WDM span) that is of no use to the SONET layer. Once again this promotes scalability, but it also simplifies the implementation by reducing inter-layer information transfer to that which is actually useful.

In network planning it is very useful to have a view of the current higher layer traffic matrix [9] being satisfied and of higher layer traffic trend measurements over time.
Although we can see this to some extent in higher layer resource status changes over time, this represents a link level view, when what we really desire is the trend (change in time) of the traffic matrices between sites. How this information gets distributed is an open issue. Currently, individual nodes in a GMPLS network know only about connections that they source or sink. Network planning generally has a longer time horizon than even traffic engineering; hence it is an open question whether this would ever be a useful function to incorporate into a network element.

Looking the other way is initially simpler, i.e., it is easier to ask what a higher layer can use from a lower layer for path selection/computation. The first item that springs to mind is diversity information. For example, in setting up a SONET STS-1 path we can use information from a WDM system concerning which SONET lines share the same WDM fiber. This information, however, is already abstracted into routing protocols via the SRLG concept. Other information from a lower layer is of questionable value, since it tends to be technology specific and puts more and more burden on the upper layer to understand and use it effectively.

3.1.4 Interaction with IP Layer Routing

The applicability of IP-based routing protocols has, over the years, been constantly expanded to increasingly circuit-oriented layers. The community began with pure datagram routing, gradually expanded to cover virtual-circuit switched packet routing (e.g., MPLS), and is finally looking at the application of routing protocols to real circuit switching, e.g., the optical layer. However, as pointed out earlier in this document, it is not clear that the different layers should necessarily share the same instance of the routing protocols.
Indeed, there may be significant reasons for not doing so, and many carriers tend to partition their networks along switching layer boundaries. For example, IP-layer reachability information is not particularly useful for the optical layer, so it seems overkill to burden the optical equipment with storing and distributing that information. (It is an extra expense in memory and processing for information that the optical layer does not really care about, so there is little incentive for a vendor to want to do so.) Likewise, information on physical plant (fibers, conduits, ducts) diversity, which is crucial at the optical transport layer, is very unlikely to be used directly by the IP layer. So, it would be quite wasteful of resources to burden the IP layer routing with distributing and manipulating this information. Thus, the extent of interaction or integration with IP layer routing (if any) requires careful consideration.

3.1.5 Inter-Business Unit

A slightly different but interesting application of intra-carrier optical routing occurs in the intra-carrier inter-business unit scenario. This arises because a carrier often has multiple administrative domains, with groups of administrative domains being under the purview of independent business units (BUs) within the carrier. Note that different BUs represent independent cost centers with their own profit objectives and sales targets. As a result, while the BUs can profitably share topology information and would like to do so, they may not be so inclined to advertise the details of their resource usage into domains belonging to other BUs. Since each BU has its own revenue targets, advertising detailed resource availability information to other, potentially competing, BUs can have a negative impact on a BU's revenue generation. This is because the knowledge of available resources in one BU may enable other BUs within the carrier to requisition capacity from this BU. This would force the BU in question to yield to their request, possibly at the expense of selling capacity to more profitable, external, revenue generating customers.

Thus, this scenario is likely to have an additional dimension of information sharing, namely, policy-based information sharing, which does not apply to the other cases that we have discussed so far.

Two examples where the inter-business unit scenario could become important are metro-core-metro networks within a given carrier, and regional networks within a carrier. In the metro-core-metro situation, such as the one depicted in Figure 4, the metro network domains could be under the jurisdiction of one BU, while the core network domains belong to a different BU.
In this case, for example, it is possible that the metro BU, armed with resource availability information about the core BU's domains, could requisition capacity from the core network when needed. This may harm the core network's profit goals, because the core BU may not be able to charge an internal customer the same rates that it could charge an external customer for the same capacity, thus motivating the need for selective, policy-based resource sharing between them.

  /------------------------------------\
 /                                      \
/             /-\                        \
|  Domain    |NE2|                       |  Metro BU
|    B       /\-/\<-- Metro Ring         |  Network
|           /     \                      |
|       /-\/       \/-\                  |
|      |NE1|-------|NE3|                 /
\       \-/         \-/                  /
 \       |           |                  /
  \------+-----------+-----------------/
         |           |
  /------+-----------+-----------------\
 /       |           |                  \
/        |    /-\    |          /-\      \
| Domain |   |NE2|---+---------|NE3|     |  CORE BU
|   A    |   /\-/\   |   ------/\-/      |  Network
|        |  /     \  |  /    |           |
|       /-\/       \/-\/     |<--Core    |
|      |NE1|-------|NE5|---- |   Mesh    /
\       \-/         \-/ \   /-\          /
 \       |           |   --|NE4|        /
  \      |           |      \-/        /
   \-----+-----------+----------------/
         |           |
  /------+-----------+-----------------\
 /       |           |                  \
/        /-\       /-\                   \
|       |NE1|-----|NE2|                  |  Metro BU
| Domain \-/       \-/                   |  Network
|   C     |         |<--- Metro Ring     |
|         |         |                    |
|        /-\       /-\                   |
|       |NE3|-----|NE3|                  |
|        \-/       \-/                   |
 \                                      /
  \------------------------------------/

Figure 4. Intra-carrier inter-business unit routing domains.

3.2 Inter-Carrier Inter-Domain Optical Routing

In this case we are dealing with outside entities, i.e., routing between service providers. There may be a range of levels of trust here; for example, there might be some level of trust between two providers that have formed a marketing alliance or have some other form of business relationship. In general, however, trust cannot be assumed. In this case, all the concerns about revealing too much information about one's network come into play. However, not revealing enough, say about diversity capabilities, may also lead customers elsewhere.
There are also some other security issues not seen before. For example, in route distribution one carrier might not be inclined to pass on routing information that could point the way to competitive alternatives. This impacts the methods for route updates, etc.

With the interest in bandwidth trading [10] we can also look at this as an advertisement of network connectivity and capability, with of course any "warts" covered up. This would include reliance on other carriers for fibers or lambdas. Also, a fair amount of detail such as "unused capacity" would not be advertised, since this may be financially sensitive information.

Private line pricing today is based primarily on the service itself (bandwidth, end-points, etc.) and the holding time, and there is no reason to expect that this will change. When multiple service providers are involved, the algorithm for dividing up the revenue stream (which can be quite large even for a single connection) must be explicit by connect time. This could be done off-line or could be done at connect time. In either case, the entity or entities doing the routing will need to take provider pricing structures into account whenever a choice between providers needs to be made. The routing logic could do this explicitly if the prices are captured in the advertised metrics or some other advertised data; alternatively, it could be done by some sort of policy control, as it is done today with BGP.

The essence of bandwidth trading is the existence of competing price structures that are known to the entity deciding which competitor to use. It is possible to create plausible bandwidth trading scenarios involving the UNI, the NNI, or both. If the NNI is involved, these price structures will need to be established across it. The situation is further complicated by the fact that bandwidth trading could be realized using any one of a number of business models, each with its own information requirements.
To give two examples: if an auction model were used, the buyer might repeatedly broadcast the lowest bid received to date and solicit lower bids from the competing providers. On the other hand, if there were a more formal market, the providers might post their asking prices in some public fashion and a buyer would be matched by some third party with the lowest offer.

In the inter-carrier case notions of hierarchy seem rather sensitive, i.e., whoever controls the summarization and advertisement may have an undue advantage over competitors. In addition, a "bandwidth aggregator" may want to advertise capabilities that it has put together via deals with multiple carriers.

Notes: We can attempt to extend the SRLG concept to links between ASs, but we will need the two ASs to agree on the meaning and number of the list of 32-bit integers that comprise the SRLG, i.e., previously the SRLG concept was one of AS scope. This is also where things get tricky, since it may not be possible to distinguish diverse routes based upon differing path vectors (i.e., AS number traversal lists). The reason is that many carriers "fill out" their networks by renting either dark fiber or "lambdas" from a WDM system; hence, although the path vectors may be AS diverse, they may not even be fiber diverse. Hence there is a need for sharing of diversity information or constraints between ASs when setting up diverse connections across multiple ASs. This gets us somewhat into a quandary over which information needs to be public and how to coordinate its distribution. In this sense geographic link information may be the simplest and least contentious for various players to disclose and standardize.

Notes:

(1) The real issue is consistency between the clouds/ASs, since in many cases they share conduit, right-of-way (ROW), etc. Getting this to happen could be very problematic.
It would be preferable to see a diversity option that does not require this. For example, ensure that there is diversity within each cloud and then do restoration separately within each cloud.

(2) See the definition of SRLG in the Carrier Requirements: an equivalence class of links, the extent of violation, and the level.

(3) Flexibility in defining the level of violation seems very desirable; these definitions have historically drifted over time. There are many other considerations, e.g., if the shared resources are SPRING protected, that is less of a problem than otherwise.

Notes: Participation in the inter-domain network carries constraints on the carriers. First, in order to participate, each provider network MUST be willing to advertise the destinations that are reachable through its network at each entry point and to advertise the formats available. Without such information, there is little motivation to participate, since it is unlikely that others will be able to access services of which they are not aware. Second, every participating carrier MUST agree to fairly include the information made available by every other carrier, so that each carrier has an equal opportunity to provide services. There may be specific exceptions, but the carrier claiming those exceptions MUST advertise the exceptions themselves. In this manner, other carriers that might otherwise be unaware of distant services can be prompted to seek those services manually.

Note that a combination of minimal required information transfer, deferral to the originating subnetwork, and some basic security mechanisms such as integrity and non-repudiation may be useful in helping organizations to "play nice".

3.3 Multi-Domain Connection Control

MPLS' loose routing capability allows one to specify a route for an optical connection in terms of a sequence of optical AS numbers.
This, for example, is handled via RSVP-TE's abstract node concept [11]. Currently there is nothing in the GMPLS signaling specification that differentiates between intra-AS boundaries, i.e., between two neighbor optical LSRs in the same AS, and inter-AS boundaries, i.e., between two neighbor optical LSRs in different ASs. Note that these same notions can apply to separate routing domains within an AS. There may, however, be some useful reasons for differentiating these two cases:

1. Separation of signaling domains,

2. Separation of protection domains.

While routing protocols (used for their topology information) in the optical case are not "service impacting", signaling protocols most certainly are. It is desirable to build some type of "wall" between optical ASs so that faults in one that lead to "signaling storms" do not get propagated to other ASs. Note that the same motivation applies to isolating other kinds of clouds, such as vendor-specific ones.

The natural situation where "signaling storms" would be most likely to arise is during network restoration signaling, i.e., signaling to recover connections during major network outages, e.g., natural disasters. In this case it may be very advantageous to break up general source reroute forms of restoration into per-domain segments, or to start reroute at domain boundaries rather than all the way back at the originating node. Note that this has the advantage of reducing the need for globally consistent SRLGs. (See the earlier SRLG comment.) Such a capability requires some loose coordination between the local, intermediate and global protection mechanisms [12]. This is typically implemented via hold-off timers, i.e., one layer of protection will not attempt restoration until a more fundamental (local) form has been given a chance to recover the connection [12].
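The hold-off timer coordination can be illustrated with a small Python sketch. It is only a model under stated assumptions: the level names, the timer values and the function first_to_recover are hypothetical, and real implementations are event driven rather than a single function call.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class ProtectionLevel:
    name: str
    hold_off_ms: int   # time to wait after the fault before acting
    can_recover: bool  # whether this level is able to repair this fault


def first_to_recover(levels: List[ProtectionLevel]) -> Optional[Tuple[str, int]]:
    """Return (level name, start time in ms) of the level that restores.

    Levels act in order of increasing hold-off time, so a more local
    mechanism always gets its chance before a wider one starts signaling.
    """
    for level in sorted(levels, key=lambda lv: lv.hold_off_ms):
        if level.can_recover:
            return (level.name, level.hold_off_ms)
    return None  # no level can restore the connection


# Hypothetical example: local protection fails, so the per-domain
# reroute acts at 50 ms and the end-to-end reroute never has to start.
levels = [
    ProtectionLevel("end-to-end reroute", 200, True),
    ProtectionLevel("local span protection", 0, False),
    ProtectionLevel("per-domain reroute", 50, True),
]
winner = first_to_recover(levels)
```

In this example the restoration signaling stays inside one domain, which is the effect the per-domain segmentation is meant to achieve.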
Prevention of restoration-related signaling storms may therefore require breaking up a large network into multiple signaling (and hence routing) domains. These domains could be within the same AS.

4 Multiple Layers of Routing

4.1 Layers in Transport Networks

In transport networks, layering is a part of the multiplex and OA&M structure of the signals, playing a role in multiplexing, monitoring and general link management. Layering in the transport network is defined in fairly abstract terms in [G.805] and the concepts are applied to SDH in [G.803]. As explained in a recent ITU SG15 document (WD45 Q.14/15), not all the layers in the transport network are of interest to the control plane, or to routing in particular. Some layers may not contain active switching elements; however, this does not mean that information flow concerning a non-switching layer is not valuable in routing. For example, in [GB-WDM-SRLG] static WDM layer information was used to set the SRLGs for SONET lines (i.e., information passed around by a link state protocol operating at the SONET line layer). It should be noted that much of the information available from non-switching layers relates to performance monitoring and fault management. Hence work in this area within CCAMP should take this layered approach into account.

Note that this is distinct from the layer idea used in the 7-layer OSI model or the IP layer model. In the IP model, the term Layer means that, for example, the Application Layer entity requests services for delivering a message to an entity on another computer and contacts the Transport Layer service, which in turn contacts the Internet Layer. Lower layers are successively contacted until an end-to-end service is provided. A key concept is that the Application Layer cannot (or rather should not) contact the Internet Layer directly.
In this model all the "layers" discussed in this document would lie in the "physical layer" (from an IP perspective).

4.2 Layer Integration

As previously discussed, there are multiple layers of signals included in what the IP model would call the Physical Layer. One could separate the layers by creating sublayers in the Physical Layer. For example, sublayers in the Physical Layer might be, top to bottom: LOVCs, HOVCs, and Lambdas. If a system supports only one of the three, then isolation of the sublayers is a given; it's geographical. But there are systems which will support more than one physical sublayer; therefore, it is necessary to establish whether or not there is a need to isolate the sublayers in the same manner. Put another way, is there a reason to "integrate" the sublayers for the purposes of routing (topology dissemination)?

If they are isolated, then there will be separate topological models for each sublayer: one mesh for the LOVC, one for the HOVC, one for the Lambda, and possibly others. The appropriate way to access a sublayer is via the use of sublayer SAPs (service access points). For example, in this way, one may find that use of Lambdas is more efficient because each sublayer can assess the availability of services at its own layer before searching for coarser-granularity services. On the other hand, the control plane must accommodate three separate routing protocols, or at least three separate instances of the same routing protocol, all operating at both the intra- and inter-domain level.

Section 4.4.2, herein, states "For transport across a SONET network, the lower order signals must be multiplexed into a non-concatenated higher order signal." Given that this is true, LOVCs are not routed independently, but only as tributaries of HOVCs. In addition, in the SDH hierarchy there is a signal, VC3, that can be treated (multiplexed) as either a LOVC or a HOVC.
With this tight and somewhat confused coupling of these layers, it may sometimes be beneficial to combine them into the same routing protocol instance.

Use of the terms LOVC and HOVC implies that all of the services to be supported by inter-domain routing are those formally associated with the terms in the SONET and SDH standards. However, among the optical systems emerging in today's market are rate and format independent systems, which claim to offer services that do not rely on SONET/SDH framing. Their intent is to support Ethernet, ATM, and OTN framing without the need for electronics specifically targeted at the signal of interest. The question arises whether or not to include these "clear channel" services as a separate sublayer of the Physical Layer.

The alternative to separate routing protocols per sublayer is the original notion behind GMPLS routing and the forwarding adjacency concept [13]. Rather than separating the routing protocols into separate layers (or sublayers) with distinct topologies, each ONE (optical network element) would advertise the services it can provide, along with its topology information. For example, a ONE might advertise that it carries a route to node A with STS-N service and clear-channel lambda service, and carries multiple routes to node B with STS-N service. It might, alternatively, advertise its entire network with summarized link capacity information for every included link. Neighboring carriers would, implicitly, be allowed to summarize that information for internal advertisement via their IGPs.

Further consideration could be given to a query service, where a carrier advertises the geographical area it serves without detailed reachability or capacity information. A second carrier desiring service could query the first carrier as to reachability for a specific destination, and the first carrier would respond with availability and capacity information.
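As an illustration of the integrated alternative, the following Python sketch models the example advertisement for nodes A and B and a simple query over it. The record layout and the names RouteAdvert, reachable_with and answer_query are hypothetical, and "multiple routes" is arbitrarily taken to mean two.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class RouteAdvert:
    """Hypothetical per-destination advertisement made by one ONE."""
    destination: str
    services: FrozenSet[str]  # service types offered toward the destination
    route_count: int          # number of distinct routes carried


# The example from the text; the count of 2 for node B is an assumption.
ADVERTS = [
    RouteAdvert("A", frozenset({"STS-N", "clear-channel lambda"}), 1),
    RouteAdvert("B", frozenset({"STS-N"}), 2),
]


def reachable_with(adverts: List[RouteAdvert], service: str) -> List[str]:
    """Destinations that can be reached with the requested service type."""
    return sorted(a.destination for a in adverts if service in a.services)


def answer_query(adverts: List[RouteAdvert], destination: str) -> Optional[RouteAdvert]:
    """Query-service style lookup: answer only when asked about a destination."""
    for advert in adverts:
        if advert.destination == destination:
            return advert
    return None
```

Here reachable_with models the advertise-everything approach, while answer_query models the query service, in which nothing is disclosed until a specific destination is requested.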
Integrating multiple layers into the same routing protocol instance leaves us fewer routing protocols to manage. The downside is that more information must be exchanged via this routing protocol and more network elements participate in this single instance of the routing protocol, which can lead to scalability concerns. If the equipment working on the different sublayers comes from different vendors, there would be little incentive to integrate multiple layers into the routing protocol for a single-layer product.

Regardless of whether multiple layers are integrated into the same routing protocol instance, it can be very useful to share information between layers, as illustrated by the following examples:

o Drop side links between layers: Capabilities of the links that are between the (client and server) layers need to be propagated into the routing protocol.

o Summarize link capabilities: Summarizing the server layer capabilities in the client layer will reduce the amount of information required for multi-layer constraint-based path computation.

o Send only what is required: Send only the capabilities that are useful for constraint-based path computation in the client layer.

4.3 Interaction with IP Layer Routing

The applicability of IP-based routing protocols has, over the years, been constantly expanded to increasingly circuit-oriented layers. The community began with pure datagram routing, gradually expanded to cover virtual-circuit switched packet routing (e.g., MPLS), and is finally looking at the application of routing protocols to real circuit switching, e.g., the optical layer. However, as pointed out earlier in this document, it is not clear that the different layers should necessarily share the same instance of the IP routing protocols. Indeed, there may be significant reasons for not doing so.
For example, IP-layer reachability information is not particularly useful for the optical layer, so it seems overkill to burden the optical equipment with storing and distributing that information. (It is an extra expense in memory and processing for information that the optical layer does not really care about, so there is little incentive for a vendor to want to do so.) Likewise, information on physical plant (fibers, conduits, ducts) diversity, which is crucial at the optical transport layer, is very unlikely to be used directly by the IP layer. So, it would be quite wasteful of resources to burden the IP layer routing with distributing and manipulating this information. Thus, the extent of interaction or integration with IP layer routing (if any) requires careful consideration.

5 Existing Routing Protocol Applicability

Here we look at the applicability of OSPF, IS-IS, PNNI and BGP to various aspects of the general optical inter-domain routing problem. All of these protocols provide reachability information. The questions to be investigated are how they deal with partitioning the network, diverse routing, summarized/abstracted topology information sharing, and suitability for the inter-domain environment.

5.1 OSPF

OSPF stands for Open Shortest Path First, an IP interior routing protocol defined by the IETF and documented in RFC 2328. OSPF uses the link state algorithm. OSPF was originally defined to construct a shortest path tree in IP networks in order to forward IP datagrams in a hop-by-hop manner in the IPv4 addressing space.

OSPF is capable of scaling to very large networks by partitioning networks into areas. Each OSPF area is identified by a unique identifier called the Area ID. The default area is called the backbone area and has an Area ID of zero. A router with interfaces in multiple OSPF areas is called an Area Border Router (ABR).
The topology information, including the nodes, links and addresses belonging to a single area, is advertised throughout that area but not beyond. Only the reachability of addresses and address prefixes from one area may be exported to other areas, possibly with summarization or policy-based suppression. A shortest path tree is constructed at each router for each attached OSPF area. IP datagrams destined to other areas are forwarded via the ABR.

An OSPF router that has an interface outside of an OSPF domain is called an Autonomous System Boundary Router (ASBR). An ASBR may import reachability information from outside the AS, and IP datagrams enter and leave an OSPF domain via the ASBR.

There have been numerous RFCs and IETF drafts that propose extensions and enhancements to RFC 2328, including the following:

1. The OSPF NSSA Option (RFC 1587)

2. OSPF Database Overflow (RFC 1765)

3. The OSPF Opaque LSA Option (RFC 2370)

4. OSPF for IPv6 (RFC 2740)

5. IETF draft, "Traffic Engineering Extensions to OSPF"

6. IETF draft, "OSPF Extensions in Support of Generalized MPLS"

In particular, extensions have been added to OSPF to support traffic engineering for MPLS and GMPLS based networks.

5.1.1 Terminology

ABR
   Area Border Router. An ABR has interfaces in more than one OSPF area.

Area
   A group of OSPF nodes that are inter-connected with each other and whose connecting interfaces all belong to the same OSPF area.

AS
   Autonomous System. Specifies an IP administrative domain.

ASBR
   Autonomous System Boundary Router. An ASBR is an OSPF node that has an interface with routers in another AS, or is capable of importing IP routes from non-OSPF sources.

Backup Designated Router (BDR)
   There is a single BDR elected in OSPF broadcast or NBMA networks. It takes the role of the DR when the original DR fails.

Designated Router (DR)
   There is a single DR elected in OSPF broadcast or NBMA networks.
LSA
   Link State Advertisement.

NBMA
   Non-broadcast multi-access network.

NSSA
   Not-So-Stubby Area. An OSPF stub area that allows AS external LSAs to be imported under policy-based conditions.

Stub Area
   In OSPF, AS external LSAs are flooded throughout all areas that belong to the same AS. An OSPF area may optionally be specified as a stub area so that no AS external LSAs are flooded into that area.

Virtual Link
   A logical point-to-point link that connects two OSPF ABRs.

5.1.2 Neighbor/Adjacency Discovery

Neighbor and adjacency mean different things in OSPF, although both describe the relationship between two inter-connected OSPF nodes. The inter-connection between OSPF nodes can be either physical or logical. In particular, the inter-connection between two adjacent OSPF neighbors can be realized by either a physical cable or a virtual link. The neighbor relationship is at the IP layer for OSPF. If OSPF or an OSPF-like IP-based routing protocol is used for optical networks, a layer-3 control plane is still needed.

The OSPF neighbor relationship describes the communication, using the OSPF Hello protocol, between a pair of inter-connected OSPF nodes. Note that the peer relationship is independent of the underlying interface type; e.g., on a point-to-point interface there is a single pair of OSPF nodes communicating using the Hello protocol, and on a broadcast network interface such as Ethernet the neighbor relationship applies to each pair of OSPF nodes on that Ethernet. One of the uses of the OSPF neighbor relationship is to monitor the bi-directional communication status between a pair of OSPF nodes, although on multi-access networks the neighbor relationship is also used for the election of the OSPF Designated Router and Backup Designated Router. Note that this is an "IP layer peer" relationship.
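The bi-directional check performed by the Hello protocol can be sketched as follows. In OSPF, a router considers communication with a neighbor to be two-way once it finds its own Router ID in the list of neighbors carried in that neighbor's Hello packet. The Python below is a simplified model rather than a packet parser; the class and field names are illustrative.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class Hello:
    """Simplified model of an OSPF Hello: the sender and whom it has heard."""
    router_id: str
    neighbors_seen: FrozenSet[str]


def is_two_way(my_router_id: str, received: Hello) -> bool:
    """True when the sender of the Hello has also heard from this router."""
    return my_router_id in received.neighbors_seen


# Router 1.1.1.1 receives a Hello in which 2.2.2.2 lists it as a neighbor.
hello = Hello("2.2.2.2", frozenset({"1.1.1.1", "3.3.3.3"}))
```

With this Hello, is_two_way("1.1.1.1", hello) holds, whereas a router not listed in neighbors_seen has only one-way communication with the sender.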
One important item to note is that in the optical world we typically have much faster and more accurate methods to monitor the health of the communications between the two nodes. This is also true for some non-optical networks, even on Ethernet. However, all existing routing protocols (OSPF, PNNI, IS-IS, etc.) have their own link health checking mechanisms. If the link layer fails, the routing protocol is notified and handles the failure before its own mechanism kicks in. If the failure is not due to the lower layer, the protocol's own mechanism is used.

The OSPF adjacency specifies whether a pair of OSPF nodes that have a 2-way neighbor relationship exchange OSPF link state information with each other. Note that an adjacency is always formed between the two inter-connected nodes on a point-to-point network interface, and is also formed between any pair of nodes on a multi-access network, except for pairs in which neither node is the Designated Router or the Backup Designated Router.

The OSPF link state information includes the addresses, nodes, links, and traffic-engineering parameters that belong to all nodes in a single OSPF area. Information exchanged between separate OSPF areas is limited to the addresses. The IETF may decide if and how to exchange information related to the traffic-engineering parameters between OSPF areas.

Note that neighbor and adjacency are required elements that commonly exist in any routing protocol based on the link-state algorithm. Although they are associated with OSPF nodes one hop away (note that the "hop" is a logical one), the link state information that is exchanged using the protocol mechanism is associated with a much larger network segment, which is called an Area in OSPF.
This is important because optical networks may require other neighbor information to be exchanged between interconnected nodes, such as nodes on the same optical ring. It is recommended that separate protocol(s) be used for that purpose, for two reasons. First, it is desirable to separate specific technologies such as SONET, DWDM, and ADMs from a routing protocol that operates at the network layer. Second, information related to the underlying technology tends to have local rather than network-wide significance.

5.1.3 Addressing & Reachability

OSPF is an interior routing protocol used for IP networks. OSPF version 2 supports IPv4 addressing only; OSPF version 3 adds support for IPv6 addressing. A network mask is associated with each IP address.

In IP networks, an address or address prefix is associated with one of the following entities:

   1. A host-oriented device (router, switch, workstation, etc.); the address is called a host address.

   2. A network segment, called a subnet; the address is called an IP subnet address.

   3. A network interface associated with a network element (router, switch, etc.); the address is called an interface IP address.

OSPF is capable of advertising the IP addresses associated with all of the above. Within a single OSPF area, the addresses associated with nodes, interfaces, and subnets in that area are advertised throughout the area without summarization, along with other link state information. This matters because it allows the routing path to any reachable address to be calculated optimally. On the other hand, OSPF is capable of summarizing the addresses associated with one area before advertising them to other areas.
In addition, OSPF can summarize addresses and selectively (based on administrative policies) advertise them from one Autonomous System (AS) to others via inter-domain routing protocols such as BGP-4. For addresses reachable across OSPF area or AS boundaries, the routing path may not be optimal, because OSPF link state advertisements do not cross those boundaries and the necessary topology information is therefore lacking.

5.1.4 Topology Discovery & Dissemination

The information contained in link state advertisements falls into three categories:

   1. Reachability of IP addresses and IP subnets (discussed in Section 5.1.3)

   2. Traffic engineering parameters (discussed in Section 5.1.5)

   3. Network topology, including OSPF nodes, links, and their connectivity (discussed in this section)

Topology-related link state advertisements flow only within a single OSPF area. Within an area, each node knows all the other nodes, their links, and their connectivity, without any aggregation. This is the basis for routing path optimization within a single OSPF area.

Between OSPF areas, no topology-related link state advertisements are exchanged. The only information that crosses an OSPF area boundary is the reachability of IP addresses and IP subnets via OSPF Area Border Routers. The same holds across OSPF domains, where reachability is via OSPF Autonomous System Boundary Routers. Therefore, while OSPF is a routing protocol based on a link-state algorithm, this is true only within an OSPF area; at the next higher level of the hierarchy it can perform only distance-vector based routing. This will most likely be one of the major shortcomings of OSPF for optical networks: if an optical network needs to scale to a very large size and is partitioned into OSPF areas, no topology information at all is available across the area boundary.
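The area-boundary behaviour described in Sections 5.1.3 and 5.1.4 can be illustrated with a small sketch. It assumes, purely for illustration, that an Area Border Router collapses an area's prefixes into the fewest covering prefixes before advertising them to other areas; in practice OSPF summarization is driven by configured address ranges rather than computed automatically. The function name summarize_area and the example prefixes are hypothetical.

```python
import ipaddress

def summarize_area(prefixes):
    """Collapse an area's prefixes into the fewest prefixes that cover
    exactly the same addresses (illustrative stand-in for ABR summarization)."""
    nets = [ipaddress.ip_network(p) for p in prefixes]
    return [str(n) for n in ipaddress.collapse_addresses(nets)]

# Hypothetical prefixes advertised, unsummarized, inside one area.
area_prefixes = [
    "10.1.0.0/24",
    "10.1.1.0/24",
    "10.1.2.0/24",
    "10.1.3.0/24",
    "10.2.0.0/16",
]

# What would cross the area boundary: reachability only, no topology.
print(summarize_area(area_prefixes))  # ['10.1.0.0/22', '10.2.0.0/16']
```

Five prefixes inside the area become two outside it. Other areas learn that the addresses are reachable through the border router, but nothing about the nodes and links behind it.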
5.1.5 Resources

OSPF was originally defined for best-effort routing of IP packets. The MPLS/GMPLS extensions to OSPF add the ability to advertise traffic engineering parameters using opaque Link State Advertisements.

The traffic engineering parameters advertised by OSPF are associated with OSPF links that can carry user traffic. The current set of traffic engineering parameters includes the following:

   1. Maximum link bandwidth

   2. Available link bandwidth (with 8 priority levels)

   3. Traffic engineering metric

   4. Administrative class

   5. Protection type

   6. Shared risk link group

Like topology link state advertisements, the advertisement of traffic engineering information is restricted to a single OSPF area. This makes constraint-based routing using OSPF very effective within a single OSPF area but not beyond it. This is another major shortcoming that needs to be considered when using OSPF in optical networks.

5.1.6 General Protocol Properties

This section briefly discusses the general protocol properties of OSPF in the specific areas covered by the following subsections.

5.1.6.1 System Overhead

The memory required to support OSPF operation varies with several factors, including the following:

   1. Whether the node is an ABR. An ABR usually consumes more memory, since it needs to maintain a database for multiple OSPF areas.

   2. Whether the node is an ASBR. An ASBR may also consume more memory, since it needs to store routes obtained from other routing protocols such as BGP-4.

   3. The number of nodes and links within each OSPF area.

   4. The number of reachable addresses within a single OSPF area, together with those imported from other areas and other ASs.

   5. The amount of traffic engineering information within an OSPF area.
Most OSPF routing messages are retransmitted under the control of associated timers. The timers are on the order of seconds and their values are configurable.

5.1.6.2 Network Resource Overhead

OSPF routing messages can be carried in-band, out-of-band, or out-of-fiber. The consumption of network resources depends on factors including the following:

   1. The amount of routing messages that need to be put on the wire, which depends on factors similar to those listed above for memory consumption (Section 5.1.6.1).

   2. Interface type. During transient periods (election of the Designated Router or Backup Designated Router, topology changes, etc.), the volume of OSPF routing messages on multi-access network interfaces (Ethernet, etc.) is very large. The actual amount depends on the implementation, the number of routers on the subnet, and so on.

   3. OSPF packets. Most of the OSPF packets defined in RFC 2328 [OSPF1] are very compact, which saves bandwidth on the wire. However, the OSPF address-advertising LSAs (types 3 and 5) can each contain only a single reachable address, which is uneconomical. Some of the traffic engineering parameters carried in OSPF opaque LSAs, such as per-priority bandwidth, are also not encoded compactly.

This needs to be taken into consideration because in optical networks the control channels are usually out-of-band or out-of-fiber, and a large amount of bandwidth may not always be available for routing messages.

5.1.6.3 Reliability

OSPF originally had no restart capability other than rebuilding its routing database from scratch upon restart. This is unacceptable for most carrier-class networks today.

Note, however, that routing protocols for the optical network are not service impacting in the way that they are for IP datagram routing.
Hence, for connection/circuit based networks, the routing protocol's restart capability may not be as critical as it is in datagram networks. The one scenario in which it matters is when a signaling protocol is used to set up circuits and must also (sometimes) be used to re-route failed circuits; routing is then heavily loaded, and the routing protocol's restart capability becomes important.

The IETF has a few proposals to improve this. In particular, the proposal described in [OSPF5] allows a graceful restart along with backward compatibility.

5.1.7 Scaling Capability

OSPF is capable of scaling and can handle large-scale IP networks. The scalability of a routing protocol is achieved by reducing the amount of routing information exchanged between network segments. This reduction inevitably affects routing efficiency and optimization. The kinds of information reduction in OSPF, and their impact, are as follows:

   1. Address summarization. As described in Section 5.1.3, OSPF is capable of performing address summarization at the area boundary and the Autonomous System boundary. As a result, the number of addresses advertised throughout the IP network is greatly reduced. This reduction does not affect reachability, since the routing path across OSPF areas and Autonomous Systems is via the OSPF Area Border Router and Autonomous System Boundary Router, respectively.

   2. Topology aggregation. As described in Section 5.1.4, OSPF performs topology aggregation, but in the extreme: no topology information is exchanged between OSPF areas or Autonomous Systems at all. This greatly reduces the amount of advertisements throughout the IP network, but at a high price: the routing path across OSPF areas or Autonomous Systems is unlikely to be optimal.

   3. Traffic engineering information aggregation. As described in Section 5.1.5, the current OSPF TE extensions define no exchange of traffic engineering information between OSPF areas or Autonomous Systems. As a result, the scaling goal is achieved, but routing with traffic engineering requirements across OSPF areas or Autonomous Systems is very inefficient or ineffective.

   4. Stub areas. OSPF has another feature that limits the flooding of Link State Advertisements (LSAs): an OSPF area can be configured as a Stub Area, in which case no AS-scoped LSA is allowed to flow into it. At present the only AS-scoped LSA is the OSPF type 5 LSA, which contains reachable addresses from other Autonomous Systems. Communication between other ASs and a stub area is via the ABR(s) attached to the stub area.

The address summarization capability of OSPF is very valuable. The topology aggregation behavior of OSPF is poorly suited to multi-area operation, especially for traffic engineering based applications.

5.1.8 Interworking Capability

OSPF cannot interwork with any other protocol by means of common protocol messages. However, as an IP routing protocol, OSPF is capable of interworking by sharing and exchanging IP reachability information, as follows:

   1. OSPF can operate together with any other IP routing protocol on the same router/switch, where the IP forwarding table (FIB) is shared among all the IP routing protocols.

   2. OSPF is capable of exchanging IP reachability information with any other IP routing protocol. The most common scenario is the exchange of reachability information between OSPF and BGP-4.
The NNI routing protocol, if not OSPF, needs to be capable of interworking with OSPF and IS-IS. The minimum requirement is the exchange of reachability information, although the exchange of topology and resource information is also highly desirable.

5.1.9 References

[OSPF1] RFC 2328, "OSPF Version 2".

[OSPF2] IETF draft, "Traffic Engineering Extensions to OSPF", draft-katz-yeung-ospf-traffic-05.txt

[OSPF3] IETF draft, "OSPF Extensions in Support of Generalized MPLS", draft-ietf-ccamp-ospf-gmpls-extensions-00.txt

[OSPF4] RFC 2370, "The OSPF Opaque LSA Option"

[OSPF5] IETF draft, "Hitless OSPF Restart", draft-ietf-ospf-hitless-restart-01.txt

[OSPF6] IETF draft, "OSPF Version 2 Management Information Base", draft-ietf-ospf-mib-update-06.txt

[OSPF7] RFC 1587, "The OSPF NSSA Option"

5.2 IS-IS and Integrated IS-IS

IS-IS was originally developed at Digital for DECnet Phase V. IS-IS is a link state routing protocol originally specified to route OSI CLNS packets between Intermediate Systems (ISO's term for routers). The specification for the protocol can be found in ISO document 10589. RFC 1195 defines Integrated IS-IS, an adaptation of this protocol capable of routing both IP and OSI packets.

5.2.1 Terminology

IS-IS                 Intermediate System to Intermediate System routing exchange protocol

Intermediate system   ISO term for a router

CLNP                  Connection-Less Network Protocol: ISO's counterpart of IP

LSP                   Link State PDU (Protocol Data Unit)

ISO                   International Organization for Standardization

SPF                   Dijkstra's Shortest Path First algorithm

5.2.2 Neighbor/Adjacency Discovery

Neighboring routers are discovered through the periodic transmission and reception of IS-IS Hello packets. The Hello packet provides the neighbor with the router's network layer address and a holding time.
The holding time is the number of seconds for which the neighbor should continue to consider the sending router reachable without receiving further IS-IS Hello packets. The holding time is set so that the neighbor connection remains active even if some Hello packets are dropped.

There are two versions of the IS-IS Hello packet: one for point-to-point links and one for LANs. The LAN version contains additional information, such as the IDs of other routers, to ensure that connectivity between neighbors is bi-directional.

5.2.3 Addressing & Reachability

Integrated IS-IS supports NSAP, IPv4, and IPv6 addressing.

In IS-IS routing, the network is partitioned into routing "domains" (equivalent to an AS in OSPF). Routing domain boundaries are defined by network management, which marks some links as "exterior links". If a link is marked as exterior, no IS-IS routing messages are sent on that link.

5.2.3.1 CLNP/NSAP addressing and routing

SDH/SONET, DECnet Phase V, ATM, and cellular digital packet data use CLNP addressing, sometimes known as NSAP addressing.

Routing boundaries between routers in an area, between areas in a domain, and between domains are determined by sections of the NSAP address.

A "router" is any entity that routes CLNP packets on the basis of their layer 3 addresses. A router in this context can therefore be a dedicated CLNP router or a controller card in a SONET ADM.

ISO 10589 specifies that up to three area addresses may be configured within an area. A number of implementations allow more than that. For two neighboring level 1 routers to be in the same area, any one of a router's area addresses must match any one of the area addresses in a received packet.

The NSAP address contains an area section and an ID section; for routers to be within the same area, the area portions of their NSAP addresses must be identical.
Level 1 routes on the basis of the ID field; there is no hierarchy in the ID portion of the address. The ID field is assumed to be a flat address space with no topological significance.

Level 2 routes according to the longest prefix of the area portion of the address. Level 2 routing is similar to IP routing.

In order to support OSI end systems, a router must run the ES-IS protocol. This is not the same as IS-IS.

OSI IS-IS routing uses a two-level routing hierarchy. Level 1 routers know the topology of their area, including all the routers and end systems in it. However, level 1 routers do not know the identity of routers outside their area. Level 1 routers forward all traffic destined outside their area to a level 2 router in their area. Level 2 routers do not need to know the topology within any level 1 area, except that in some cases a level 2 router may also be a level 1 router.

5.2.3.2 IP addressing and routing

Level 1 routers within an area exchange LSPs that identify the IP addresses reachable by each router. Specifically, zero or more [IP address, subnet mask, metric] combinations may be included in each LSP.

A level 1 router routes as follows:

   1. If a specified destination address matches an [IP address, subnet mask, metric] combination reachable within the area, the packet is routed using level 1 routing.

   2. If a specified destination address does not match any [IP address, subnet mask, metric] combination listed as reachable within the area, the packet is routed towards the nearest level 2 router. This holds only if that level 2 router is attached to another area.

Level 2 routers include in their level 2 LSPs a complete list of [IP address, subnet mask, metric] combinations specifying all IP addresses reachable in their area. This information may be obtained from the level 1 LSPs (received from level 1 routers in the same area), by manual configuration, or by a combination of the two.
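The level 1 routing decision described above can be sketched as follows. This is a minimal illustration, not the procedure specified in RFC 1195: the function name level1_next_step, the use of longest-prefix matching among in-area entries, and the representation of attached level 2 routers as a name-to-cost mapping are all assumptions made for the example.

```python
import ipaddress

def level1_next_step(dest, area_entries, attached_l2_routers):
    """Decide how a level 1 router handles a destination address.

    area_entries:        list of (prefix, metric) pairs reachable in the area
    attached_l2_routers: dict mapping a level 2 router name to the cost of
                         reaching it; only routers attached to another area
    """
    addr = ipaddress.ip_address(dest)
    matches = [
        ipaddress.ip_network(prefix)
        for prefix, _metric in area_entries
        if addr in ipaddress.ip_network(prefix)
    ]
    if matches:
        # Rule 1: destination is reachable within the area (assumed
        # tie-break: the most specific matching prefix wins).
        best = max(matches, key=lambda net: net.prefixlen)
        return ("level1", str(best))
    if attached_l2_routers:
        # Rule 2: no in-area match, so head for the nearest level 2 router.
        nearest = min(attached_l2_routers, key=attached_l2_routers.get)
        return ("level2", nearest)
    return ("unreachable", None)

# Hypothetical area contents.
entries = [("192.0.2.0/24", 10), ("192.0.2.128/25", 5)]
l2_routers = {"R7": 30, "R9": 20}

print(level1_next_step("192.0.2.200", entries, l2_routers))
print(level1_next_step("198.51.100.1", entries, l2_routers))
```

The first call stays within the area; the second finds no in-area match and is sent towards R9, the closer of the two attached level 2 routers.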
In addition, level 2 routers may report external reachability information, corresponding to addresses that can be reached via routers in other routing domains. Some implementations allow this in level 1 LSPs as well as level 2 LSPs. Where supported, the option is configurable.

5.2.4 Topology Discovery & Dissemination

Topology discovery and dissemination is achieved by means of LSPs (Link State PDUs).

Each router constructs a packet known as a link state packet or LSP (OSPF uses the term Link State Advertisement, or LSA), which contains a list of reachable address prefixes, areas, and routers. An IS-IS router generates an LSP periodically, when it has a new neighbor, when the cost of the link to an existing neighbor has changed, or when the link to a neighbor has gone down. The LSP is then transmitted to all other routers. Each router stores the information from the most recently generated LSP of each other router. The LSPs provide each router with a complete map of the topology. LSPs are exchanged only between routers of the same type (i.e., level 1 or level 2), though as stated earlier, some routers can operate as both level 1 and level 2 routers.

Based on the topological information obtained from LSPs, each router can use SPF to calculate the optimal (least cost) path to a destination. However, whilst an optimal route can be chosen within an area, the extent to which an optimal path is chosen between areas depends on the level of summarization used for address prefixes in level 2 routing tables. As with OSPF, there is a balance between fewer table entries (to minimize router memory usage) and an optimal route.

5.2.5 Resources

Work is currently being undertaken at the IETF to add extensions to Integrated IS-IS LSPs that allow IS-IS to advertise traffic engineering parameters, in parallel with similar efforts for OSPF.
A set of parameters taken from the Internet Draft draft-ietf-isis-gmpls-extensions-04.txt includes the following:

   1. Maximum link bandwidth

   2. Reservable link bandwidth

   3. Unreserved bandwidth

   4. TE Default metric

   5. Link Protection Type

   6. Interface Switching Capability Descriptor

Many other extensions are being proposed, in particular those in support of circuit provisioning of SDH, SONET, and G.709 equipment.

5.2.6 Interface types and network Medium Support

Whilst Integrated IS-IS can route both CLNP and IP packets, the routing protocol itself is based on CLNP. This requires that all infrastructure used in an IS-IS routing domain be able to pass IS-IS/CLNP packets. In practice CLNP can be carried on a variety of data-link layer technologies such as Ethernet (all types), Token Ring, FDDI, and Frame Relay. It can also be carried over Packet over SONET (POS), in which case OSICP is used.

5.2.7 General Protocol Properties

This section briefly discusses the general protocol properties of Integrated IS-IS.

5.2.7.1 System Overhead

The amount of system overhead (memory and processing) required for Integrated IS-IS or IS-IS routing depends on the following:

   - Whether the router participates in both level 1 and level 2 routing. A level 1/level 2 router needs to maintain a detailed database for its own area and a summarized database for all other areas.

   - The number of nodes within each IS-IS area.

   - The total number of reachable addresses in every IS-IS area, which affects the size of level 2 routing tables.

   - The level of address summarization used in level 2 routing. As mentioned before, this summarization comes at the expense of calculating optimal routes.

   - The amount of traffic engineering information disseminated within an IS-IS area. GMPLS provides for the dissemination of resource information using IS-IS extensions.
     Such information will typically include available ports, timeslots (SONET), and wavelengths (optical layer equipment). The level of connection churn will have a serious impact on the amount of system overhead required to process such IS-IS information, particularly if the information is to be processed in "real time". Such concerns are equally applicable to the OSPF extensions.

5.2.7.2 Network Resource Overhead

IS-IS routing messages can be carried in-band, out-of-band, or out-of-fiber. From the perspective of the transport network elements under consideration in the OIF, in-band could mean the DCC bytes available within the SONET and SDH overheads. This is currently how standards-based SDH and SONET configuration messages and alarms are routed between network elements and their respective element managers. Out-of-band routing messages could be carried on a supervisory wavelength, whilst out-of-fiber routing messages could be carried on a physically independent network such as an Ethernet network.

Where out-of-fiber networks are used to transport routing messages, such networks often rely on routers to route packets end-to-end. Unless a different routing instance or static routing is used, the topology information that is shared will include both the topology of the transport network and that of the out-of-fiber network. This affects the amount of routing information that passes through transport network elements where both in-band and out-of-fiber resources are used. In-band channels such as SDH/SONET DCC channels represent a "free" resource. However, such resources are scarce (192 kbps for the Section DCC and 576 kbps for the Line DCC), and it is important to avoid unnecessary consumption.
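The DCC figures quoted above follow from the SONET/SDH frame structure: the Section DCC uses three overhead bytes (D1-D3) and the Line DCC nine (D4-D12), each sent once per frame at 8000 frames per second. The sketch below reproduces that arithmetic and then estimates how long a burst of routing information would occupy each channel. The burst size (500 LSPs of 1000 bytes each) and the function names are hypothetical, chosen only to give a sense of scale; framing and protocol overhead on the DCC are ignored.

```python
FRAMES_PER_SECOND = 8000   # SONET/SDH frame rate
SECTION_DCC_BYTES = 3      # D1-D3 overhead bytes per frame
LINE_DCC_BYTES = 9         # D4-D12 overhead bytes per frame

def dcc_rate_kbps(bytes_per_frame):
    """Raw channel rate in kbps for a DCC of the given width."""
    return bytes_per_frame * 8 * FRAMES_PER_SECOND / 1000

def transfer_seconds(total_bytes, rate_kbps):
    """Time to send total_bytes over a channel, ignoring all overhead."""
    return total_bytes * 8 / (rate_kbps * 1000)

section = dcc_rate_kbps(SECTION_DCC_BYTES)   # 192.0 kbps
line = dcc_rate_kbps(LINE_DCC_BYTES)         # 576.0 kbps

# Hypothetical burst: 500 LSPs of 1000 bytes each.
burst_bytes = 500 * 1000
print(f"Section DCC: {section} kbps, burst takes "
      f"{transfer_seconds(burst_bytes, section):.1f} s")
print(f"Line DCC:    {line} kbps, burst takes "
      f"{transfer_seconds(burst_bytes, line):.1f} s")
```

Under these assumptions the burst occupies the Section DCC for roughly 21 seconds and the Line DCC for roughly 7 seconds, which is why the volume of routing information carried in-band deserves attention.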
Consumption of network resources depends on factors including the following:

   - The amount of routing messages that need to be put on the wire, which depends on factors similar to those listed above (Section 5.2.7.1).

   - IS-IS LSPs containing traffic engineering parameters are likely to consume a large amount of bandwidth. Work needs to be done to quantify this.

A single IS-IS LSP can contain multiple reachable addresses, which at first sight is an improvement over OSPF. However, it is the total bandwidth that matters, that is, the total amount of information carried in all the LSPs; the LSP count by itself is not informative. It is therefore not clear that IS-IS represents an improvement over OSPF here. IS-IS is limited in that a single logical LSP can be composed of at most 256 physical LSPs, each constrained to the local MTU. This limitation makes it impossible for IS-IS to carry the number of routes that OSPF can support. Furthermore, on a LAN, the designated router in IS-IS sends out a Complete Sequence Number Packet (CSNP) at short, regular intervals, and may need to send multiple CSNPs each time if there are a large number of LSPs in its database. OSPF is therefore more efficient in its use of bandwidth on the wire. That efficiency comes at a price, however: OSPF is a more complex protocol.

5.2.7.3 Reliability

Like OSPF, IS-IS has no restart capability other than rebuilding its routing database from scratch upon restart. This is unacceptable for most carrier-class networks today.

When a large number of network elements are brought back on-line after a significant failure within a network, the exchange of LSPs could exceed the traffic limits of the network carrying the LSPs. Such congestion can be caused by insufficient link capacity or by insufficient throughput in intermediate routers.
The result of such congestion can be a prolonged time to populate route tables (through loss of LSPs), or even a complete failure to re-establish connectivity. Such failures are inherent in all networks that use link state routing protocols, and work is being done in the IETF and the ATM Forum to modify IS-IS, OSPF, and PNNI to ensure that appropriate mechanisms are in place to prevent such occurrences. This will ultimately have an impact on scalability.

IS-IS already has a mechanism in place to deal with router congestion (OSPF has something similar as an option). When the IS-IS routing database exceeds its allocated memory, a router that cannot fit a new LSP into its database refuses to acknowledge the LSP. The neighbor continues trying to transmit the LSP. A router that is forced to refuse an LSP sets a flag in its own LSP indicating that its LSP database is full. Other routers use paths through that router only if no path through non-overloaded routers exists. This behavior means that a temporary overload situation heals itself without human intervention.

5.2.8 Scaling Capability

Scaling issues have already been addressed in other parts of this document. The scaling capability of IS-IS is no different from that of OSPF, except in one regard: the large address space and hierarchical structure of the NSAP format often used by IS-IS routers (20 bytes versus 4 bytes for IPv4) allows the creation of worldwide networks without the need for address translation or private addressing. As noted in Section 5.2.7.2, however, OSPF can carry more routes than IS-IS.

5.2.9 Interworking Capability

Integrated IS-IS can route between network elements regardless of whether IP or NSAP addressing is used. Where it becomes necessary to connect IP network elements to NSAP based network elements, Integrated IS-IS provides an easy way of doing so.
Information can be learnt from other Interior Gateway Protocols such as RIP and OSPF (sometimes known as route redistribution). Interworking with inter domain routing protocols is described in the next section.

5.2.10 Inter Domain Routing Protocols

Inter domain routing protocol information (IRPI) is included in level 2 LSPs and serves to provide information only to inter domain routing protocols. The IRPI field allows interdomain routers to find each other: the interdomain routers within a routing domain can discover which of the level 2 routers within the domain are also interdomain routers.

Integrated IS-IS allows IP addresses reachable via inter-domain routing to be reported in level 2 LSPs in the "IP external reachability information" field. This includes routes learned from OSPF, RIP, or any other external routing protocol.

5.2.11 References

- RFC 1195, "Use of OSI IS-IS for Routing in TCP/IP and Dual Environments"

- Radia Perlman, "Interconnections"

- John T. Moy, "OSPF, Anatomy of an Internet Routing Protocol"

- IETF Internet Draft, draft-ietf-isis-gmpls-extensions-04.txt

- ISO 10589, "Intermediate system to intermediate system Intra-Domain routing information exchange protocol for use in conjunction with the protocol for providing the Connectionless-mode Network Service"

5.3 BGP

From RFC 1771: "The primary function of a BGP speaking system is to exchange network reachability information with other BGP systems. This network reachability information includes information on the list of Autonomous Systems (ASs) that reachability information traverses. This information is sufficient to construct a graph of AS connectivity from which routing loops may be pruned and some policy decisions at the AS level may be enforced."
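The loop pruning mentioned in the quotation can be sketched in a few lines. This is an illustrative model only: a path is represented as a plain Python list of AS numbers, and the function names and AS numbers (taken from the private-use range) are hypothetical. A speaker prepends its own AS number when advertising a route to an external peer, and refuses any route whose path already contains its own AS number.

```python
def advertise(local_as, as_path):
    """Path as sent to an external peer: local AS prepended."""
    return [local_as] + list(as_path)

def accept_update(local_as, as_path):
    """A route whose path already contains the local AS would loop."""
    return local_as not in as_path

# AS 65001 originates a route; it travels 65001 -> 65002 -> 65003.
path_at_65002 = advertise(65001, [])             # [65001]
path_at_65003 = advertise(65002, path_at_65002)  # [65002, 65001]

# AS 65003 re-advertises; the path now lists three ASs.
path_from_65003 = advertise(65003, path_at_65003)  # [65003, 65002, 65001]

# A new AS may accept it, but the originating AS must not.
print(accept_update(65004, path_from_65003))  # True
print(accept_update(65001, path_from_65003))  # False
```

The same list of AS numbers that prevents loops is also the only view of AS-level connectivity that the protocol provides.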
5.3.1 Terminology

Autonomous System

BGP Speaker

External Peer

Internal Peer

NLRI   Network Layer Reachability Information

5.3.2 Neighbor/Adjacency Discovery

BGP does not include a neighbor discovery protocol. In fact, BGP uses the term "peer", since communication is not always limited to directly attached neighbors. All peers must be configured. For an inter-domain protocol this makes sense, since one does not necessarily want any two boxes in different domains that happen to be connected together to start exchanging routing information. BGP distinguishes two types of peers. An external peer is a BGP speaker in a different AS, while an internal peer is a BGP speaker within the same AS. There is a working assumption that external peers are "directly connected", although this is not strictly required. Internal peers are frequently not directly connected.

5.3.3 Addressing & Reachability

BGP works with IPv4 and IPv6 addresses. Much of BGP's strength lies in its very general mechanisms for dealing with reachability information. In particular, this information can be aggregated (summarized), filtered on input, and filtered on output (for controlled information dissemination).

BGP is "the" reachability protocol. The Update message contains a path (AS_PATH) that furnishes at least one possible route to reach the destinations summarized (via prefixes) in the Network Layer Reachability Information (NLRI) field. Note that the NEXT_HOP attribute can be interpreted as the next optical hop (rather than the next IP hop). Hence, as it stands, BGP can be used for arbitrary optical reachability. The BGP sessions are set up via the IPCC addresses (which are IP routable), but the information exchanged pertains to the optical network, not to the IP control channel network.

BGP-4 [BGP1] provides a number of policy mechanisms that govern how routing information is used and disseminated.
In particular, the E-BGP border router model keeps three bodies of routing information distinct: the routing information received from each of a border router's external peers (Adj-RIBs-In, Adjacent Routing Information Base In), the routing information that the Autonomous System (AS) itself is using (Loc-RIB, Local Routing Information Base), and the routing information that the AS forwards on to its external peers (Adj-RIBs-Out, Adjacent Routing Information Base Out).

With this model one can develop policies governing which routes are chosen for use in the AS, i.e., which routes from the Adj-RIBs-In are chosen to populate the Loc-RIB. One can also develop policies governing what routing information is advertised to external peers, i.e., which routes from the Loc-RIB are exported to each of the Adj-RIBs-Out.

The choice of which routes to import for local use is generally concerned with the "quality" of those advertising the routes, since not much else is known (besides the AS path vector). In deciding which routes to advertise to external peers, the prime consideration is "transit policy", i.e., whose traffic is allowed to transit this AS.

In the MPLS case, and in particular the explicitly routed optical case, there is a very strong additional policy mechanism: connection admission control (CAC). Although an optical AS probably should not advertise transit capabilities that it does not wish to support, CAC during connection establishment will be the final arbiter of any transit policy. In addition, some matters that are addressed by policy in the IP datagram case, such as load balancing, are much easier to implement via CAC and/or explicit routing.

5.3.4 Topology Discovery & Dissemination

BGP is a path vector protocol, which can be thought of as a variant of a distance vector protocol.
For each set of NLRI a path vector is kept to indicate the sequence (or collection) of autonomous systems that are traversed to reach these summarized destinations. The main reason for keeping this information is to avoid routing loops. Although the path vector information can give us some notion of AS connectivity, this is by necessity extremely limited. As part of the BGP route selection process, only one route to a set of destinations can be selected, and only routes that are actually used can be advertised via BGP to other ASs.

5.3.5 Scaling Capabilities

BGP-4 was originally defined as an inter-domain routing protocol for traditional IP networks, where in the common scenario BGP-4 is used as a point-to-point communication protocol between ASBRs, which are not very numerous in IP networks. When BGP-4 is used in scenarios where this does not hold, for example in a BGP/MPLS based VPN network, there appears to be a serious scaling problem: there tends to be a large number of end-to-end BGP-4 sessions in the network. A mechanism exists to address this scalability problem using BGP route reflectors [BGP4]. However, a route reflector uses a server-client model, which creates a single point of failure.

5.3.6 Interworking Capability

Several implementations exist in which BGP interoperates with popular IGPs (Interior Gateway Protocols) such as OSPF and IS-IS.

5.3.7 References

[BGP1] Rekhter, Y., and T. Li, "A Border Gateway Protocol 4 (BGP-4)", RFC 1771, T.J. Watson Research Center, IBM Corp., cisco Systems, March 1995.

[BGP2] Rekhter, Y., and P. Gross, "Application of the Border Gateway Protocol in the Internet", RFC 1772, T.J. Watson Research Center, IBM Corp., MCI, March 1995.

[BGP3] Rekhter, Y., and E. Rosen, "Carrying Label Information in BGP-4", RFC 3107.

[BGP4] Bates, T., et al., "BGP Route Reflection - An Alternative to Full Mesh IBGP", RFC 2796.

5.4 PNNI routing

PNNI stands for Private Network-to-Network (or Node-to-Node) Interface, used for ATM networks. The PNNI routing protocol is defined by the ATM Forum. It is based on a link state algorithm, and many of its elements are similar to those of other routing protocols such as OSPF for IP networks and IS-IS for OSI networks. The PNNI routing protocol is capable of scaling to very large networks through its hierarchical architecture. The addressing used by the PNNI routing protocol is called the ATM End System Address (AESA), which is similar to NSAP addressing.

The companion signaling protocol used in PNNI networks is the PNNI signaling protocol, which is similar to the UNI 4.0 signaling protocol but has a symmetric interface, among other differences. The PNNI signaling protocol is also defined by the ATM Forum and is contained in the same specification as the PNNI routing protocol.

5.4.1 Terminology

   ABR
      Available Bit Rate

   Adjacency
      The relationship between two communicating PNNI neighboring peer nodes.

   AESA
      ATM End System Address

   Aggregation Token
      A number assigned to an outside link by the border nodes at the ends of the outside link. The same number is associated with all uplinks and induced uplinks associated with the outside link. In the parent and all higher-level peer groups, all uplinks with the same aggregation token are aggregated.

   Border Node
      A logical node that is in a specified peer group and has at least one link that crosses the peer group boundary.

   CBR
      Constant Bit Rate

   Complex Node Representation
      A collection of nodal state parameters that provides the detailed state information associated with a logical node.

   DTL
      Designated Transit List

   Entry Border Node
      The node that receives a call from an outside link. This is the first node within a peer group to see such a call.
   Exit Border Node
      The node that will progress a call to an outside link. This is the last node within a peer group to see such a call.

   Horizontal Link
      A link between two PNNI nodes within the same peer group.

   Inside Link
      Synonymous with horizontal link.

   Level Indicator
      PNNI hierarchical level indicator, from 0 to 104 inclusive.

   LGN
      Logical Group Node

   Outside Link
      A link to a lowest level outside node. In contrast to an inside link or an uplink, an outside link is not included as part of the topology information.

   PTSE
      PNNI Topology State Element

   Peer Group
      A set of logical nodes which are grouped for purposes of creating a routing hierarchy.

   PGL
      Peer Group Leader

   RCC
      Routing Control Channel

   Service Category
      Specifies CBR, real-time VBR, non real-time VBR, ABR, or UBR.

   Simple Node Representation
      An aggregation at the maximum level for a logical node, in which only the logical node itself is advertised, in contrast to the complex node representation.

   SVCC
      ATM SVC-based Connection

   UBR
      Unspecified Bit Rate

   Uplink
      Represents the connectivity between a border node and an up node.

   VBR
      Variable Bit Rate

   VPC
      ATM Virtual Path Connection

   Up node
      The node that represents the outside neighbor of a border node in the common peer group. The up node must be a neighboring peer of one of the border node's ancestor nodes.

5.4.2 Neighbor/Adjacency Discovery

The neighbor relationship exists when two PNNI nodes can communicate with each other directly on a PNNI link. The adjacency specifies the relationship between two PNNI neighboring peer nodes or peer groups. A PNNI link can be one of the following three:

   1. A physical link that inter-connects two ATM switches.
   2. A VPC-based PNNI logical link.
   3. An SVCC-based PNNI logical link.

Note that in PNNI networks, all inter-connections between PNNI nodes are point-to-point.
All PNNI routing messages are exchanged between PNNI neighbors. PNNI routing messages are carried on the Routing Control Channel (RCC), which is out-of-band but always in the same fiber as the data channels. Note that the definition of "out of band" in the packet switching case is a bit different than in the circuit switching case.

A Hello protocol is used to monitor the bi-directional communication status between a pair of PNNI nodes, to determine whether the two neighbors belong to the same peer group, etc. If the two neighbors belong to the same peer group, the link is called an Inside Link; otherwise it is called an Outside Link.

PNNI neighbors on an Inside Link also exchange PNNI topology information in packets called PNNI Topology State Elements (PTSEs). PTSEs carry the topology information, including nodes, links, addresses, resources, etc., within that peer group. The operational model of a PNNI peer group is similar to that of an OSPF area, in that all routing information within the peer group is known to all the nodes in that peer group.

In addition, a peer group elects a single node as the Peer Group Leader (PGL). A separate PNNI node instance at the higher hierarchical level may run on the same ATM switch platform on which the PGL of the lower hierarchical level resides. The PNNI node instance at the higher level is called a Logical Group Node (LGN). The LGN then forms a neighbor relationship with the LGNs of other peer groups over an SVCC-based RCC, which allows routing messages, including Hellos, to be exchanged between the two LGNs. This scenario is recursive in nature.

PNNI neighbors on an Outside Link also exchange hierarchical information if available, including the identity of the higher level PNNI nodes, peer groups, etc. PNNI nodes with Outside Links are called Border Nodes.

One obvious similarity between ATM networks and optical networks is the circuit-based interconnection.
The other similarity is that each network is dedicated to a single underlying networking technology, i.e., ATM and optics, respectively. These similarities suggest that the routing protocols, as one part of the networking mechanism, may also have much in common. Note, however, that "optics" is actually quite varied, and we want to separate technology-specific routing information from more general information. What is similar across all the optical technologies that we will be dealing with is that they are circuit oriented. This tends to make bandwidth accounting/allocation quite simple, hence we spend almost no time worrying about it (except for compact representations on the wire), while ATM spent a lot of time on packet QoS issues. Technology-specific routing information (such as regeneration requirements in "all-optical" networks) may create essentially different QoS issues.

5.4.3 Addressing & Reachability

PNNI is an interior routing protocol used for ATM networks. The addresses used in PNNI networks are ATM End System Addresses (AESAs). An AESA is 20 bytes in length. The common portion of several AESAs, consisting of their shared most significant bits, is called an ATM address prefix. The ATM address prefix is used to summarize AESAs and other address prefixes.

An AESA is associated with an ATM host. In a PNNI network, each PNNI node has a unique AESA in the associated routing domain. Other ATM devices register their AESAs with their attached ATM switches via the ILMI protocol [PNNI2], and these AESAs are then advertised by the associated PNNI node at the lowest hierarchical level on that ATM switch throughout the associated PNNI peer group, along with the identity of the advertising node itself. The advertised AESAs are therefore reachable by other nodes in the same peer group.
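To illustrate how an ATM address prefix relates to a set of AESAs, the following sketch computes the longest run of leading bits shared by several 20-byte addresses. The function name and the sample byte values are hypothetical. How a real PNNI node selects the prefixes it advertises is a matter of configuration and is not modeled here.

```python
def common_prefix_bits(addresses):
    """Return (prefix, bit_length) shared by all 20-byte addresses.

    The prefix is returned as 20 bytes with every bit beyond the
    common leading bits set to zero.
    """
    if not addresses:
        raise ValueError("need at least one address")
    if any(len(a) != 20 for a in addresses):
        raise ValueError("an AESA is 20 bytes long")
    as_ints = [int.from_bytes(a, "big") for a in addresses]
    first = as_ints[0]
    # Accumulate every bit position where any address differs from the first.
    diff = 0
    for value in as_ints[1:]:
        diff |= first ^ value
    total_bits = 160
    # The highest differing bit ends the common prefix.
    length = total_bits - diff.bit_length()
    mask = ((1 << length) - 1) << (total_bits - length)
    return (first & mask).to_bytes(20, "big"), length
```

For two addresses that agree only in their first byte, the function reports an 8-bit prefix. A single address yields the full 160 bits.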
The AESAs registered with a PNNI node may be summarized where possible, in which case only the ATM address prefix is advertised. This is one address scaling mechanism in PNNI networks.

All the AESAs and ATM address prefixes that belong to a single PNNI peer group are stored on each node in that peer group. The PGL in that peer group may summarize these AESAs and address prefixes and then deliver them to the LGN at the next higher hierarchical level, which advertises them to its peer nodes and peer groups, so that these AESAs and address prefixes are reachable from other peer groups. Note that the PGL and LGN perform a special role here in passing address reachability information along the PNNI hierarchical levels, but the actual data communication between peer groups is via the border nodes.

Currently no ATM inter-domain routing protocol exists; as such, reachability for AESAs and address prefixes across domains is only possible via static configuration.

An ATM address follows the NSAP format in the sense that it is 20 bytes in length, although there are different encoding rules for ATM addresses (NSAP format, DCC format, etc.). The PNNI routing protocol itself does not have the ability to carry IP addresses, but "PNNI Augmented Routing" (PAR), also defined by the ATM Forum, is capable of doing so. It is not clear, however, that any vendor has implemented it yet. PAR was expected to be implemented on ATM switches and edge IP routers, so that it runs PNNI in the ATM core for ATM networks but interfaces with IP routers for IP routing.

5.4.4 Topology Discovery & Dissemination

The routing information is contained in PNNI Topology State Elements (PTSEs) and falls into three categories, as follows:

   1. Reachability of AESAs and ATM address prefixes (discussed in Section 5.4.3)

   2. Traffic parameters, referred to as Resource Availability information (discussed in Section 5.4.5)

   3. Network topology, including PNNI nodes, links, their connectivity and hierarchy (discussed in this section)

The nodes and links within a single PNNI peer group are advertised throughout that peer group. Note that this behavior is the same for all peer groups at each of the hierarchical levels. As in an OSPF area, the routing path within a single PNNI peer group can be highly optimized: within a single peer group, the detailed topology information, including nodes, links, and addresses along with resource information, is known by all nodes, so constraint-based routing can take advantage of it to optimize the routing path. Note that the routing messages are only exchanged on Inside Links.

Two types of PNNI nodes, the LGN/PGL nodes and the border nodes, need to perform additional duties, as described below.

Each LGN at a higher hierarchical level needs to pass the topology information down to the PNNI node at the next lower level, i.e., the PGL of the peer group at that lower level. The PGL then advertises that information throughout its peer group. This behavior is recursive in nature.

Each PNNI border node, while running the Hello protocol with neighbors that belong to other peer groups, additionally exchanges with those neighbors the identity of its ancestor nodes, i.e., the LGNs and peer groups at higher hierarchical levels. The information obtained is advertised via a so-called Uplink PTSE throughout the border node's own peer group. The PGL in that peer group then passes the information contained in the Uplink to one or more of the LGNs at its higher hierarchical levels. The information contained in the Uplink is used to identify the neighboring LGN; once it is identified, a PNNI logical link is established between the two LGNs at that higher hierarchical level. Note that while the logical link is one hop, it is based on an SVCC-based RCC and so may physically span multiple hops.
A logical link logically inter-connects two adjacent peer groups. It is advertised within the associated peer group at the higher hierarchical level and is also passed down to the peer groups below that level, so that PNNI nodes at lower hierarchical levels also understand the connectivity with other peer groups. This behavior is also recursive in nature.

In summary, the topology information of a single peer group is always known exactly by all nodes in that peer group, but the topology information of other peer groups is not known, except for the inter-connectivity between peer groups. Therefore, a routing path within a single peer group can be highly optimized, but a path across peer group boundaries cannot. This is not necessarily a serious drawback: one at least gets to see how the peer groups are connected and can choose a path on that basis. This is much better than not having that visibility, though of course not as good as having complete information.

5.4.5 Resources

ATM is well known for its QoS and traffic management capability, and the PNNI routing protocol is also capable of advertising various traffic engineering parameters, referred to as Resource Availability information. The Resource Availability information is contained in the following PTSEs:

   1. Reachable address PTSE
   2. Horizontal links PTSE - advertisements for PNNI Inside Links and logical links
   3. Uplinks PTSE

The set of traffic parameters currently advertised by the PNNI routing protocol is specified per service category and per direction, and is as follows:

   1. Maximum cell rate (MaxCR)
   2. Available cell rate (AvCR)
   3. Administrative weight (AW)
   4. Cell transfer delay (CTD)
   5. Cell delay variation (CDV)
   6. Cell loss ratio for CLP0 (CLR0)
   7. Cell loss ratio for CLP0+1 (CLR0+1)
   8. Cell rate margin (CRM)
   9. Variance factor (VF)

It is interesting to see that the set of traffic parameters is focused on ATM technology only, although the PNNI routing protocol uses the link state algorithm that is also used by other, non-ATM routing protocols. Perhaps this is a good example from which to speculate that a link-state routing protocol that uses IP addressing but focuses on optical networks could also be built fairly easily from scratch, on a much cleaner foundation, and more closely reflecting and interacting with the underlying technology.

5.4.6 General Protocol Properties

This section briefly discusses the general protocol properties of PNNI in the specific areas covered by the following subsections.

5.4.6.1 System Overhead

The memory required to support PNNI operation varies depending on several factors, including the following:

   1. Whether the node is a PGL, and if so, for how many hierarchical levels above. For a PGL node in a given peer group, there is a separate PNNI node instance (LGN) operating on the same ATM switch platform. This scenario is also recursive in nature. Additional PNNI nodes on the same hardware platform consume more memory and CPU resources.

   2. Whether the node is a border node. A border node may also consume more memory and CPU resources, since it needs to generate and maintain Uplinks, among other things.

   3. The number of nodes and links within each PNNI peer group.

   4. The number of reachable addresses within a single PNNI peer group, as well as those advertised from other peer groups.

   5. The amount of traffic engineering information within a PNNI peer group.

   6. The total number of hierarchical levels.

Most PNNI routing messages are retransmitted under the control of associated timers, on the order of seconds or milliseconds, and the timer values are configurable.
5.4.6.2 Network Resource Overhead

PNNI routing messages can be carried either in-band or out-of-band, but always in-fiber, as deployed in today's ATM networks. The consumption of network resources depends on factors including the following:

   1. The amount of routing messages that need to be put on the wire. This depends on factors similar to those listed above for memory consumption.

   2. The amount of Resource Availability information to be advertised. The code point for carrying Resource Availability is defined in such a way that if more than one service category has the same set of Resource Availability information, a compact format can be used. E.g., if the set of traffic parameters for CBR and rt-VBR on a link is the same, only one data block as defined by the code point is required for both services, with a one-bit flag set to indicate this. In order to reduce network resource overhead, updates of resource availability in PNNI are controlled by timers and thresholds (defining "significant change" of a resource). Optical networks lack the cell-specific resources described in Section 5.4.5, which should reduce the amount of advertised traffic.

   3. The number of reachable addresses and address prefixes to be advertised. If there is a large number of reachable addresses or address prefixes, they may be summarized before being advertised. Note that PNNI provides configurable summary addresses that can be used for address summarization.

The PNNI routing protocol defines the bandwidth and service category used for routing messages on RCCs. Default values are recommended, and both are configurable as defined in the PNNI routing MIB. This is a very good feature that needs to be taken into account for the routing protocol used in optical networks.
One of the reasons is that the control channel used to carry routing messages in optical networks may be in-band, out-of-band, or out-of-fiber, and as such, the bandwidth consumption and perhaps other QoS requirements for the routing messages might need to be estimated and defined for reference.

5.4.6.3 Reliability

PNNI as originally specified has no restart capability other than rebuilding its routing database from scratch upon restart. This is not acceptable to most carrier-class networks today, and as a result vendors will most likely implement their own schemes to support high reliability requirements. Note, however, that unlike call processing, routing is not service impacting.

5.4.7 Scaling Capability

The PNNI routing protocol is capable of scaling and can handle very large ATM networks, with the following characteristics:

   - PNNI is the most scalable of the existing routing protocols, owing to its hierarchical architecture. PNNI networks can scale up to 104 hierarchical levels, and at each level there may be multiple peer groups. In practice today, each PNNI peer group has from several tens to several hundred nodes.

   - Like other routing protocols, PNNI achieves scaling capability by partitioning networks into segments (peer groups in the PNNI case) and restricting the information flow across segment boundaries. The amount of information that floods into foreign PNNI peer groups is controllable using protocol mechanisms along with configurable parameters.

   - PNNI is the only routing protocol today that achieves link state algorithm based routing both at the lowest hierarchical level and at all the higher hierarchical levels. This feature helps effective routing across peer group boundaries. Note that for both OSPF and IS-IS, link state algorithm based routing exists only at the lowest level, not beyond.
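The effect of partitioning on the amount of topology state can be illustrated with some simple arithmetic. The sketch below assumes a two-level hierarchy with equal-sized peer groups, in which each foreign peer group appears as a single logical node. The function name and the sample sizes are hypothetical and are not taken from the PNNI specification.

```python
def nodes_visible(total_nodes: int, peer_groups: int) -> int:
    """Node entries held by one lowest-level node in a two-level hierarchy.

    Assumes equal-sized peer groups and that every other peer group is
    seen as a single logical node with no internal detail advertised.
    """
    if peer_groups < 1 or total_nodes % peer_groups != 0:
        raise ValueError("total_nodes must divide evenly into peer_groups")
    own_group = total_nodes // peer_groups   # full detail of the local group
    other_groups = peer_groups - 1           # one logical node per foreign group
    return own_group + other_groups
```

Under these assumptions, a 1000-node network split into 20 peer groups leaves each node with 50 + 19 = 69 node entries, instead of the 1000 it would hold in a flat network.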
The highlights of the PNNI routing protocol's scaling mechanisms are as follows:

   1. Address summarization. As described in Section 5.4.3, PNNI is capable of performing address summarization at each node of each hierarchical level.

   2. Link aggregation. This feature is similar to the link bundling defined in GMPLS but has a much more elegant infrastructure. A set of horizontal links between two PNNI LGNs may be aggregated into any smaller number of links by configuring the so-called Aggregation Token on the Outside Links at the lowest hierarchical level. Upon aggregation, the amount of Resource Availability information is also aggregated and reduced.

   3. Nodal aggregation. This is a unique feature that no other routing protocol possesses today. A PNNI peer group is represented by an LGN at the next higher hierarchical level, and by default the topology information inside the child peer group is totally hidden from other peer groups, exactly as in an OSPF area. Therefore, a routing path across a PNNI peer group boundary has, by default, to be sent blindly to one of the Exit Border Nodes of the child peer group, resulting in inefficient or sub-optimal routing paths. An LGN that does not advertise any topology information for its child peer group uses the Simple Node Representation. Such a feature, properly extended, could come in handy in the optical arena. For example, the representation of an optical ring would be quite simple (and this is a very popular optical subnetwork).

      Optionally, a PNNI LGN can advertise some topology information for the child peer group it represents; such an LGN uses the Complex Node Representation. Usually a complex node advertises a set of so-called Nodal State PTSEs, each of which describes the set of (aggregated) traffic parameters between a pair of border nodes in the associated child peer group; this information can be used during route calculation when choosing a path traversing that peer group. The PNNI routing protocol does not specify a limit on those advertisements or how the aggregation is accomplished, which provides flexibility for vendors' own implementations as well as customers' preferences based on applications.

      The Complex Node Representation provides a powerful and flexible solution to a common topology scaling problem, such that routing efficiency and optimization may also be possible across peer groups. The PNNI Complex Node Representation is a logical star network with several "exceptions". As pointed out above, for optical networks a different Complex Node Representation may be needed (perhaps a logical ring with several "exceptions").

The routing protocol used for optical networks certainly requires address summarization capability. It also certainly requires link aggregation capability, since parallel links are very common in optical networks. The need for nodal aggregation will depend on the predicted size of the optical networks, i.e., it may not be needed if the scope of a single PNNI peer group or a single OSPF area can cover an entire given optical network. The question is: is this a correct assumption? Nodal aggregation at the peer group level may be useful for dealing with some types of "generalized node" diverse paths. This is not certain, however; link aggregation and the hierarchy already provide a great deal.

5.4.8 References

[PNNI1] ATM Forum, "Private Network to Network Interface", af-pnni-0055.000

[PNNI2] ATM Forum, "I-LMI - Local Management Interface", af-ilmi-0065.500

[PNNI3] ATM Forum, "ATM Interim Inter-Switch Signaling Protocol", af-pnni-0026.000

[PNNI4] ATM Forum, "ATM Inter-network Interface (AINI)", af-cs-0125-00

6 Conclusion

This draft highlighted some of the considerations for an inter-domain routing protocol for use in optical internetworking.
The main differences between optical routing and datagram routing were highlighted. Additional requirements to be addressed in an optical inter-domain routing protocol were discussed, and several applications of inter-domain routing were highlighted. A summary of optical sublayer specific routing information was furnished for both the transparent optical sublayer and the SONET/SDH sublayer. Finally, a review of the applicability of several existing routing protocols to the optical inter-domain routing problem was given.

7 Security Considerations

Protection of the routing information exchanged across an optical inter-domain interface is of high importance; erroneous reachability or topology information may result in connection provisioning requests that either fail or are routed across sub-optimal paths. It is also possible that failed requests may consume significant control and transport resources for a transient amount of time. It follows that erroneous routing information could result in degraded carrier network operation, or even render a carrier's network inoperable.

Security requirements are expected to be of higher importance on interfaces between different administrative domains. Therefore, an optical inter-domain routing protocol should do the following:

   1. Authenticate entities with which routing information is exchanged. For example, a carrier should authenticate the identity of other carriers it is connected to. The specific mechanisms used for authentication should provide protection against attacks; for example, they should not be based on simple clear-text password authentication schemes.

   2. Guarantee the integrity of routing information (topology, reachability and resource status) exchanged across the interface. This requirement can be satisfied using security mechanisms at different layers.
For example, each routing message could be individually authenticated using a keyed message digest, which is embedded in the message. Both OSPF and BGP provide such options. Alternatively, the two parties could establish a security association at the network layer using IPSEC, which could be used to provide security services to the optical inter-domain routing protocol.

From the point of view of routing, information integrity is likely to be the most important requirement. However, in some cases it might be necessary to provide confidentiality of the routing information as well. A possible scenario for this is when a carrier would like to advertise information privately to another carrier, but does not wish to publicly disseminate this information, due to policy constraints.

It should be noted that none of the known mechanisms that provide information integrity (such as keyed digests or IPSEC) can provide adequate protection against a compromised node participating in the inter-domain routing protocol. This is an item for further study.

8 References

[1] Bradner, S., "The Internet Standards Process -- Revision 3", BCP 9, RFC 2026, October 1996.

[2] G. Bernstein, J. Yates, D. Saha, "IP-Centric Control and Management of Optical Transport Networks", IEEE Communications Magazine, October 2000.

[3] G. Bernstein, E. Mannie, V. Sharma, "Framework for MPLS-based Control of Optical SDH/SONET Networks", , July 2001.

[4] Ramesh Bhandari, Survivable Networks: Algorithms for Diverse Routing, Kluwer Academic Publishers, 1999.

[5] Strand, J. (ed.), "Impairments And Other Constraints On Optical Layer Routing", Work in Progress, draft-ietf-ipo-impairments-00.txt, May 2001.

[6] Kompella, K., et al., "IS-IS Extensions in Support of Generalized MPLS", Work in Progress, draft-ietf-isis-gmpls-extensions-01.txt, November 2000.

[7] K. Kompella, et al., "Link Bundling in MPLS Traffic Engineering", Work in Progress, draft-kompella-mpls-bundle-05.txt, February 2001.

[8] ANSI T1.105.01-1995, Synchronous Optical Network (SONET) Automatic Protection Switching, American National Standards Institute.

[9] Robert S. Cahn, Wide Area Network Design: Concepts and Tools for Optimization, Morgan Kaufmann Publishers, Inc., 1998.

[10] Meghan Fuller, "Bandwidth trading no longer a case of 'if' but 'when' says report", Lightwave, June 2001. (www.light-wave.com)

[11] Awduche, D., et al., "RSVP-TE: Extensions to RSVP for LSP Tunnels", Work in Progress, draft-ietf-mpls-rsvp-lsp-tunnel-08.txt, February 2001.

[12] K. Owens, V. Sharma, M. Oommen, "Network Survivability Considerations for Traffic Engineered IP Networks", Work in Progress, draft-owens-te-network-survivability-01.txt, July 2001.

[13] K. Kompella and Y. Rekhter, "LSP Hierarchy with MPLS TE", Work in Progress, draft-ietf-mpls-lsp-hierarchy-02.txt, February 2001.

9 Acknowledgments

The authors would like to thank Shezad Mirza of BT and Cheryl Sanchez of Calix for their help in the routing protocol analysis sections.

10 Author's Addresses

Greg Bernstein, Lyndon Ong
Ciena Corporation
10480 Ridgeview Court
Cupertino, CA 94014
Phone: (510) 573-2237
Email: greg@ciena.com, lyong@ciena.com

Bala Rajagopalan, Dimitrios Pendarakis
Tellium, Inc
2 Crescent Place
Ocean Port, NJ 07757
Email: braja@tellium.com, dpendarakis@Tellium.com

John Strand
AT&T Labs
200 Laurel Ave., Rm A5-1D06
Middletown, NJ 07748
Phone: (732) 420-9036
Email: jls@research.att.com

Angela Chiu
Celion Networks
1 Shiela Dr., Suite 2
Tinton Falls, NJ 07724
Phone: (732) 747-9987
Email: angela.chiu@celion.com

Frank Hujber
Alphion Corporation
4 Industrial Way West
Eatontown, NJ 07724
Email: fhujber@alphion.com

Vishal Sharma
Metanoia, Inc.
335 Elan Village Lane, Unit 203
San Jose, CA 95134
Phone: +1 408 943 1794
Email: v.sharma@ieee.org

Sudheer Dharanikota
Nayna Networks Inc.
475 Sycamore Drive
Milpitas, CA 95035
Email: sudheer@nayna.com

Rauf Izmailov
NEC USA, Inc.
4 Independence Way
Princeton, NJ 08540
Email: rauf@nec-lab.com