Traffic Engineering Working Group                     Chris Liljenstolpe
Internet Draft                                          Cable & Wireless
Expiration Date: May 2003

           Offline Traffic Engineering in a Large ISP Setting

                  draft-liljenstolpe-tewg-cwbcp-02.txt

1. Status of this Memo

This document is an Internet-Draft and is in full conformance with all provisions of Section 10 of RFC2026.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as ``work in progress.''

The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

2. Introduction

This document is in response to a request made by the Traffic Engineering Working Group for a set of traffic engineering practices from a sample of the ISP engineering community. It reflects the traffic engineering principles and practices that Cable & Wireless uses for its global ``packet'' networks (including IP and MPLS) at the time of publication. It also identifies some of the history that has led to the specific principles and practices, as well as some of the trade-offs between these methods and other possible approaches. It is not intended to be a detailed engineering guide or ``how-to'' document.

Liljenstolpe                                                    [Page 1]

Internet Draft    draft-liljenstolpe-tewg-cwbcp-03.txt    November 2002

3. Overview of Traffic Engineering

This document is not intended to be an in-depth tutorial on traffic engineering, but a brief overview of the topic is warranted to ensure that the specific terminology used is defined.

3.1. IP vs. Transport TE

In an IP network, there are two layers at which traffic engineering can occur. The most obvious is at the IP layer itself.

3.1.1. IP Traffic Engineering

The IP protocol suite provides a number of tools that can be used to traffic engineer a network. Examples include direct route table manipulation (such as setting certain specific routes statically, or applying route filters to achieve the same effect in a dynamic environment) and route metric manipulation (adding or reducing ``cost'' on a specific link to bias the likelihood that a specific route will be used). We will call this IP traffic engineering.

3.1.2. Transport Traffic Engineering

Another medium that is widely used to traffic engineer IP networks is the underlying transport network. While not all transport networks used for IP networks today lend themselves to traffic engineering, many that are in common use do. For example, traffic engineering at the transport layer is very easy to implement if the transport layer is Frame Relay, ATM, MPLS, or VLAN-enabled Ethernet (basically any transport technology that allows for virtual circuits). It is possible to traffic engineer at the transport layer using nothing more than discrete physical circuits (such as SONET or SDH links), but this can become prohibitively expensive in terms of router ports and the number of physical circuits to manage, depending on the size of the overlying IP network and the completeness of the mesh (see below for a discussion of meshing).

If an IP network determines the destination of a packet solely on the basis of the destination router's adjacency to the router currently processing that packet, without the use of intermediate systems or hop-by-hop routing, then that network can be considered a transport traffic engineered network.

3.1.3. A Note About MPLS

MPLS introduces a slight perturbation into the above definitions, as an MPLS network can behave as part of the IP network (sharing control plane information) or as a transport network that the IP network overlays. Therefore, if the traffic engineering is done via MPLS or a combination of IP and MPLS, and the IP and MPLS networks share control plane information, then the IP network is considered to be IP traffic engineered. On the other hand, if the MPLS network does not share a control plane with the IP network (the IP network is a strict overlay on top of the MPLS network), and the traffic engineering is done at the MPLS layer, then the network should be considered a transport traffic engineered network.

3.2. Dynamic vs. Off-line TE

There are two ``domains'' in which a network can be traffic engineered. The first is to allow the network equipment (either IP or transport layer) to make dynamic determinations as to the best path a given packet is to take through the network. This is dynamic traffic engineering. It is most commonly exemplified by the IP IGP routing protocols such as IS-IS or OSPF, but can also be implemented in the transport layer by such things as PNNI and SPVCs for ATM.

The other ``domain'' is off-line, or static, traffic engineering. In this method, the paths are laid out either manually or with some off-line planning tool, then provisioned into the network, either by setting up circuits (real or virtual) between IP routers, or by setting specific routes into those routers. These paths are then used until a new set is calculated and provisioned into the network, replacing the previous set.

One hybrid approach is to use off-line traffic engineering for the normal operations case, but allow the network to use dynamic traffic engineering for failure modes.
In this case, the network has one set of paths provisioned for normal operations, but if one of those paths fails, the traffic engineering layer (either transport or IP) uses dynamic protocols to recover from the failure.

3.3. Full vs. Partial Mesh

One topic that is related to traffic engineering is the degree to which the IP routers are interconnected with one another. One possibility is that each router has a direct logical adjacency to every other router in the network; this is called a complete mesh. The other option is to have each router connected to one or more other routers such that a path can be drawn between any two routers, with zero or more intermediate nodes, or ``hops'', needed to complete the path.

While this is not directly a traffic engineering issue, it is very closely linked. For networks that are implemented as a complete mesh, traffic engineering at the transport layer is usually selected, while IP traffic engineering is usually selected for partial mesh IP networks. This is a general rule, and there are exceptions to it in operation today.

4. Cable & Wireless's Approach to Traffic Engineering

Cable & Wireless's AS 3561 network has been in operation for many years as one of the principal ``Internet backbone'' networks, and was one of the first layer 2 traffic engineered backbones. The team that has engineered and operated that network has built up a set of engineering guidelines for the traffic engineering of AS 3561 that this memo will discuss at a high level.

4.1. Meshing

Cable & Wireless builds a hybrid network that uses two levels of complete mesh. The network is composed of a collection of nodes. Each node has two ``core'' routers and a number of aggregation (or edge) routers.
Within each node, all routers (the two core and the multiple aggregation) are fully meshed, meaning that each router has a direct adjacency (actually two) to every other router in the node. This is referred to as the intra-node mesh.

Furthermore, each core router in the entire network has two direct adjacencies to each other core router in the network. This provides a complete mesh between the nodes and is referred to as the inter-node mesh.

Traffic flowing between the intra-node mesh and the inter-node mesh must take an IP hop on the core routers to transfer from one mesh to the other. This is the only intermediary router hop for most traffic on the network.

4.2. Off-line Traffic Engineering

In the debate between dynamic and off-line traffic engineering, Cable & Wireless comes down on the off-line side of the discussion. While this approach requires significant systems support, Cable & Wireless believes that the benefits, to its network, outweigh those costs.

4.2.1. The Approach

The basic premise is that Cable & Wireless will forecast, to some future point in time, the traffic demand that will be presented by each node pair adjacency. This forecast is then matched against a combination of factors, such as SLA requirements (e.g. latency and jitter), the amount of available capacity in the network in total, the amount of capacity available for a specific node (ingress/egress capacity), the allowable utilization for a specific node as well as the maximum allowable utilization, and the physical topography of the network (location and capacity of each circuit, diversity of circuits and facilities, etc.).

This matching is done by a tool developed internally at Cable & Wireless that calculates an optimal network map given the constraints mentioned above, for both the primary and the backup path sets (the adjacencies seen by the IP network).
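
As an illustration only, the matching step described above can be sketched in Python. This is not the Cable & Wireless tool, which is not described in enough detail here to reproduce. It is a simple greedy heuristic (largest forecast first, lowest-latency feasible path) and makes no claim of optimality. The names plan, shortest_path, and path_links, the link table layout, and the single max_util ceiling are all assumptions made for this sketch. The backup path is chosen to share no circuit with the primary, and no capacity is reserved for it.

```python
import heapq


def shortest_path(links, src, dst, demand, load, max_util, banned=frozenset()):
    """Lowest-latency path from src to dst over circuits with spare room.

    links: {(a, b): (capacity, latency)}, undirected circuits (assumed layout).
    A circuit is usable if it is not banned and
    load + demand <= max_util * capacity.
    Returns a list of nodes, or None if no feasible path exists.
    """
    adj = {}
    for key, (cap, lat) in links.items():
        if key in banned or load[key] + demand > max_util * cap:
            continue
        a, b = key
        adj.setdefault(a, []).append((b, lat))
        adj.setdefault(b, []).append((a, lat))
    best = {src: 0.0}
    prev = {}
    heap = [(0.0, src)]
    while heap:
        dist, node = heapq.heappop(heap)
        if node == dst:
            # Walk the predecessor chain back to the source.
            path = [dst]
            while path[-1] != src:
                path.append(prev[path[-1]])
            return path[::-1]
        if dist > best.get(node, float("inf")):
            continue  # stale heap entry
        for nxt, lat in adj.get(node, []):
            cand = dist + lat
            if cand < best.get(nxt, float("inf")):
                best[nxt] = cand
                prev[nxt] = node
                heapq.heappush(heap, (cand, nxt))
    return None


def path_links(links, path):
    """Map a node path back to the keys used in the links table."""
    return [(a, b) if (a, b) in links else (b, a)
            for a, b in zip(path, path[1:])]


def plan(links, forecast, max_util=0.5):
    """Greedy placement of forecast demands, largest first (a heuristic)."""
    load = {key: 0.0 for key in links}
    result = {}
    for (src, dst), demand in sorted(forecast.items(), key=lambda kv: -kv[1]):
        primary = shortest_path(links, src, dst, demand, load, max_util)
        if primary is None:
            result[(src, dst)] = (None, None)
            continue
        used = path_links(links, primary)
        for key in used:
            load[key] += demand
        # Backup must avoid every circuit used by the primary path.
        backup = shortest_path(links, src, dst, demand, load, max_util,
                               banned=frozenset(used))
        result[(src, dst)] = (primary, backup)
    return result, load


# Hypothetical four-node topology: (capacity, latency) per circuit.
links = {("A", "B"): (100, 5), ("B", "D"): (100, 5),
         ("A", "C"): (100, 10), ("C", "D"): (100, 10)}
paths, load = plan(links, {("A", "D"): 40, ("C", "D"): 20})
```

In this hypothetical topology the A-D demand is placed on A-B-D with A-C-D as its backup. The C-D demand gets a primary path but no backup, because the only alternative would push circuit A-B past the assumed utilization ceiling; that is the kind of constraint violation the forecast matching is meant to surface before anything is provisioned.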
The tool's results are checked against both the operational path sets and commercial software, and a spot analysis is performed, all to ensure sanity. If all agree, the new configurations are laid in on the network and traffic starts using the new paths.

This process is repeated when the ending horizon of the forecast is near, when the forecast is proven invalid, or when a major network event occurs (such as adding major new circuits, a long-duration circuit or node failure, etc.).

4.2.2. Costs of Approach

First, Cable & Wireless must design, maintain, and support a very complex system of in-house software: the network topology tool mentioned above. This entails developing an in-house software development team dedicated to this software and to associated and similar network tools.

Second, Cable & Wireless must be able to generate accurate, medium-range forecasts of demand, such that ``emergency'' invocations of the process are the exception, not the norm. This is done by comparing historical trends for each node pair to the forecasts for each node provided by the commercial side of the company.

Third, Cable & Wireless must maintain a set of network design engineers who can evaluate the data that the network design tool produces and check it for completeness and sanity.

4.2.3. Reasoning

The reason that Cable & Wireless takes this approach is three-fold.

The first is historical. Back in approximately 1996, Cable & Wireless switched from a DS-3 based router-to-router backbone to an OC-3 ATM switch-to-switch backbone. These switches then had routers connected to them, and customers connected to those routers. At that time, the only mechanism for provisioning virtual circuits on ATM was manually configured PVCs. As the network grew, it became very time consuming and error-prone to provision the entire ATM network by hand, so a tool was created to automate the process.
That tool has, over time, developed into the traffic engineering tool mentioned above.

The second reason has to do with prior experience. Running on a large, dedicated server, this tool takes approximately two to four hours to calculate an optimal set of primary and backup paths where the size of the network approaches thirty thousand paths. Therefore, it is Cable & Wireless's belief that a router or switch processor, burdened by other tasks (such as route updates, housekeeping, SNMP, etc.), cannot hope to provide the same level of optimal path routing in real time. While each processor would not be directly calculating all thirty thousand paths, each would be working on some sub-set, without having a view of the whole. Reaching that final, complete set takes convergence, which brings us to the third reason.

Convergence becomes a significant issue in a large network. As the number of independent views of the network increases (the number of switches or routers), the number of locally optimal, but globally sub-optimal, solution sets also increases. Therefore, these local solution sets must be converged into some semblance of a globally optimal solution. This takes time, and it takes more time as the network gets larger.

4.3. Transport Layer Traffic Engineering

Cable & Wireless performs its traffic engineering at the transport layer. The two principal traffic engineering transports are ATM PVCs and MPLS LSPs. The MPLS network control plane is isolated from the IP control plane, and therefore the MPLS traffic engineering is considered transport layer, as discussed above.

4.3.1. Reasoning

There are three reasons why Cable & Wireless has chosen to traffic engineer at the transport, rather than the IP, layer.

The first is purely historical. Cable & Wireless built a complete mesh over its ATM network when it first deployed ATM, and, by default, traffic engineered at the ATM layer.
It has never had major problems with that methodology, so a certain degree of ``if it isn't broke...'' exists.

The other two reasons are analytical in nature, and are detailed below.

First, traffic engineering at the IP layer is a global exercise. As IP routes in a hop-by-hop manner, where each router makes its own routing decisions with no consideration of where the traffic has been, there is no way to segregate traffic on a flow-by-flow basis. With transport layer engineering, it is possible to take a flow that is routed A-B-C-D and route it A-B-E-D, while traffic flowing B-C-D stays on the B-C-D route (in these examples, A, B, C, D, and E are nodes in a network, not source or destination hosts). When an IP network reaches a sufficient size, it is desirable to have that kind of control over flows, as some become quite large. While it is now possible to do this at wire rate on some routers, that is a relatively new development. It will also become more difficult as the number of discrete IP addresses attached to ``A'' becomes large, and/or the number of these cases in the network becomes large.

The second reason is the ``ripples of water'' effect (taken from the imagery of a rock hitting a calm pond). This describes what happens when a routing metric is changed on an IP network. Once the metric is changed on the desired router, that router floods the information to all the other routers adjacent to it, and so on, until the information reaches an administrative boundary of the network. Each new router that receives this information recalculates all of its routes affected by the change, and floods all of that information out, and so on. This has two undesirable qualities. First, it can take a long time for the network to re-stabilize, or converge, after such a change on a very interconnected, large network.
In the meantime, the network can behave in an unstable manner. Second, these changes can introduce unintended byproducts of the original change. The change could have been made to alter the flow of traffic from A-Z, but could also have affected traffic flowing from B-D, even though that was not the original intent.

This problem gets more insidious when traffic engineering extensions are used on the IGP (such as OSPF-TE or IS-IS-TE), or when MPLS is used with constrained LSPs. In this case, A-Z may have had its path changed, taking bandwidth from part of the path carrying B-D, so B-D now has to change paths, even though it may not have shared any resources with the old path for A-Z.

The ``ripples of water'' effect and the very selective way of routing individual node-to-node adjacencies are the principal reasons that Cable & Wireless has chosen to do traffic engineering at the transport layer.

4.3.2. Costs

Cable & Wireless believes that the only additional cost of using transport layer traffic engineering is the cost of the underlying transport network that the IP network overlays. While this can be expensive, it is, if the right technology is used, a much cheaper way of providing a complete mesh network. Again, given the ``ripples of water'' effects of router metric manipulation, the cost is seen as easily justifiable.

4.3.3. What About IP IGP?

While Cable & Wireless does its traffic engineering at the transport layer, it still needs to deploy an IP IGP (IS-IS in particular) to announce adjacency availability to the IP routers. The IGP is therefore used for adjacency discovery rather than route distribution. Route distribution is present, but its use is deprecated (by default, all routers will prefer a directly connected route over an IGP announced multi-hop route).

5. Security

This draft makes no changes to the security requirements of any of the discussed protocols.

6. Author

Chris Liljenstolpe
Cable & Wireless
11700 Plaza America Drive
Reston, VA 20190
USA
+1.703.292.2232
chris@cw.net