Network Working Group                                          A. Retana
Internet-Draft                                       Cisco Systems, Inc.
Intended status: Informational                                  R. White
Expires: March 03, 2014                                             IETF
                                                         August 30, 2013


          Network Design Complexity Measurement and Tradeoffs
             draft-irtf-ncrg-network-design-complexity-00

Abstract

   Network architecture revolves around the concept of fitting the
   design of a network to its purpose; of asking the question, "what
   network will best fit these needs?"  A part of fitting network
   design to requirements is the problem of complexity, an idea often
   informally measured using intuition and subjective experience.  When
   would adding a particular protocol, policy, or configuration be "too
   complex?"  This document suggests a series of continuums along which
   network complexity might be measured.  No suggestions are made on
   how to measure complexity for each of these continuums; this is left
   for future documents.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.

   It is inappropriate to use Internet-Drafts as reference material or
   to cite them other than as "work in progress."

   This Internet-Draft will expire on March 03, 2014.

Copyright Notice

   Copyright (c) 2013 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document.  Code Components extracted from this
   document must include Simplified BSD License text as described in
   Section 4.e of the Trust Legal Provisions and are provided without
   warranty as described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Control Plane State versus Optimal Forwarding Paths (Stretch)
   3.  Configuration State versus Failure Domain Separation
   4.  Policy Centralization versus Optimal Policy Application
   5.  Configuration State versus Per Hop Forwarding Optimization
   6.  Reactivity versus Stability
   7.  Conclusion
   8.  Security Considerations
   9.  Acknowledgements
   10. References
   Authors' Addresses

1.  Introduction

   Network complexity is a systemic, rather than component level,
   problem; complexity must be measured in terms of the multiple moving
   parts of a system, and the complexity of the system may be greater
   than the complexity of the individual pieces, examined in isolation,
   would suggest.

   There are two basic ways in which systemic level problems might be
   addressed: interfaces and continuums.  In addressing a systemic
   problem through interfaces, we seek to treat each piece of the
   system as a "black box," and to develop a complete understanding of
   the interfaces between these black boxes.  In addressing a systemic
   problem as a continuum, we seek to understand the impact of a single
   change or element on the entire system as a set of tradeoffs.

   While network complexity can profitably be approached from either of
   these perspectives, in this document we have chosen to approach the
   systemic impacts of network complexity from the perspective of
   continuums of tradeoffs.  In theory, modifying the network to
   resolve one particular problem (or class of problems) will add
   complexity which results in the increased likelihood (or appearance)
   of another class of problems.  Discovering these continuums of
   tradeoffs, and then determining how to measure each one, become the
   key steps in understanding and measuring systemic complexity in this
   view.

   This document proposes five such continuums; more may be possible.

   o  Control Plane State versus Optimal Forwarding Paths (or its
      opposite measure, stretch)

   o  Configuration State versus Failure Domain Separation

   o  Policy Centralization versus Optimal Policy Application

   o  Configuration State versus Per Hop Forwarding Optimization

   o  Reactivity versus Stability

   Each of these continuums is described in a separate section of this
   document.

2.  Control Plane State versus Optimal Forwarding Paths (Stretch)

   Control plane state is the aggregate amount of information carried
   by the control plane through the network in order to produce the
   forwarding table at each device.  Each additional piece of
   information added to the control plane --such as more specific
   reachability information, policy information, additional control
   planes for virtualization and tunneling, or more precise topology
   information-- adds to the complexity of the control plane.  This
   added complexity, in turn, adds to the burden of monitoring,
   understanding, troubleshooting, and managing the network.

   Removing control plane state, however, is not always a net positive
   gain for the network as a system; removing control plane state
   almost always results in decreased optimality in the forwarding and
   handling of packets travelling through the network.  This decreased
   optimality can be termed stretch, which is defined as the difference
   between the absolute shortest (or best) path traffic could take
   through the network and the path the traffic actually takes.
   Stretch is expressed as the difference between the optimal and the
   actual path.  The figure below provides an example of this tradeoff.

                  R1-------+
                  |        |
                  R2       R3
                  |        |
                  R4-------R5
                  |
                  R6

   Assume each link is of equal cost in this figure, and:

   o  R4 is advertising 192.0.2.1/32 as a reachable destination not
      shown on the diagram

   o  R5 is advertising 192.0.2.2/32 as a reachable destination not
      shown on the diagram

   o  R6 is advertising 192.0.2.3/32 as a reachable destination not
      shown on the diagram

   For R1, the shortest path to 192.0.2.3/32, advertised by R6, is
   along the path [R1,R2,R4,R6].

   Assume, however, the network administrator decides to aggregate
   reachability information at R2 and R3, advertising 192.0.2.0/24
   towards R1 from both of these points.  This reduces the overall
   complexity of the control plane by reducing the amount of
   information carried past these two routers (at R1 only in this
   case).

   Aggregating reachability information at R2 and R3, however, may have
   the impact of making both routes towards 192.0.2.3/32 appear as
   equal cost paths to R1; there is no particular reason R1 should
   choose the shortest path through R2 over the longer path through R3.
   This, in effect, increases the stretch of the network.  The shortest
   path from R1 to R6 is 3 hops, a path that will always be chosen
   before aggregation is configured.  Assuming half of the traffic will
   be forwarded along the path through R2 (3 hops), and half through R3
   (4 hops), the network is stretched by ((3+4)/2) - 3, or 0.5: half a
   hop.

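   As a rough illustration (not part of the original text), the
   following Python sketch reproduces the stretch arithmetic above.
   The adjacency list, the breadth-first search helper, and the even
   traffic split across the two apparently equal-cost paths are
   assumptions made only for this example.

   # Sketch: computing stretch in the example topology, assuming
   # equal-cost links and a 50/50 traffic split after aggregation.
   from collections import deque

   links = [("R1", "R2"), ("R1", "R3"), ("R2", "R4"),
            ("R3", "R5"), ("R4", "R5"), ("R4", "R6")]

   adj = {}
   for a, b in links:
       adj.setdefault(a, set()).add(b)
       adj.setdefault(b, set()).add(a)

   def hops(src, dst, banned=frozenset()):
       """Breadth-first search hop count; 'banned' forces traffic away
       from a node to model the path taken through the other exit."""
       seen, queue = {src}, deque([(src, 0)])
       while queue:
           node, dist = queue.popleft()
           if node == dst:
               return dist
           for nxt in adj[node]:
               if nxt not in seen and nxt not in banned:
                   seen.add(nxt)
                   queue.append((nxt, dist + 1))
       return None

   optimal = hops("R1", "R6")                   # R1-R2-R4-R6: 3 hops
   via_r3 = hops("R1", "R6", banned={"R2"})     # R1-R3-R5-R4-R6: 4 hops

   # Half of the traffic takes each path once the aggregates appear
   # equal cost at R1.
   actual_average = (optimal + via_r3) / 2
   print("stretch =", actual_average - optimal)  # 0.5, "half a hop"
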
   Traffic engineering through various tunneling mechanisms is, at a
   broad level, adding control plane state to provide more optimal
   forwarding (or network utilization).  Optimizing network utilization
   may require detuning stretch (intentionally increasing stretch) to
   increase overall network utilization and efficiency; this is simply
   an alternate instance of control plane state (and hence complexity)
   weighed against optimal forwarding through the network.

3.  Configuration State versus Failure Domain Separation

   A failure domain, within the context of a network control plane, can
   be defined as the set of devices impacted by a change in the network
   topology or configuration.  A network with larger failure domains is
   more prone to cascading failures, so smaller failure domains are
   normally preferred over larger ones.

   The primary means used to limit the size of a failure domain within
   a network's control plane is information hiding; the two primary
   types of information hidden in a network control plane are
   reachability information and topology information.  An example of
   aggregating reachability information is summarizing the routes
   192.0.2.1/32, 192.0.2.2/32, and 192.0.2.3/32 into the single route
   192.0.2.0/24, along with the aggregation of the metric information
   associated with each of the component routes.  Note that aggregation
   is a "natural" part of IP networks, starting with the aggregation of
   individual hosts into a subnet at the network edge.  An example of
   topology aggregation is the summarization of routes at a link state
   flooding domain boundary, or the lack of topology information in a
   distance-vector protocol.

   While limiting the size of failure domains appears to be an absolute
   good in terms of network complexity, there is a definite tradeoff in
   configuration complexity.  The more failure domain edges created in
   a network, the more complex the configuration will become.  This is
   particularly true if redistribution of routing information between
   multiple control plane processes is used to create failure domain
   boundaries; moving between different types of control planes causes
   a loss of the consistent metrics most control planes rely on to
   build loop-free paths.  Redistribution, in particular, opens the
   door to very destructive positive feedback loops within the control
   plane.  Examples of control plane complexity caused by the creation
   of failure domain boundaries include route filters, routing
   aggregation configuration, and metric modifications to engineer
   traffic across failure domain boundaries.

   Returning to the network described in the previous section,
   aggregating routing information at R2 and R3 will divide the network
   into two failure domains: (R1,R2,R3) and (R2,R3,R4,R5).  A failure
   at R5 should have no impact on the forwarding information at R1.

   A false failure domain separation occurs, however, when the metric
   of the aggregate route advertised by R2 and R3 is dependent on one
   of the routes within the aggregate.  For instance, if the metric of
   the 192.0.2.0/24 aggregate is taken from the metric of the component
   192.0.2.1/32, then a failure of this one component will cause
   changes in the forwarding table at R1 --in this case, the control
   plane has not truly been separated into two distinct failure
   domains.  The added complexity in the illustration network would be
   the management of the configuration required to aggregate the
   control plane information, and the management of the metrics to
   ensure the control plane is truly separated into two distinct
   failure domains.

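   The following sketch (an illustration added here, not text from the
   draft) shows why the choice of aggregate metric matters.  The metric
   values and the two policy names, "lowest-component" and "fixed", are
   hypothetical; they stand in for whatever mechanism a given protocol
   uses to derive the metric of an aggregate route.

   # Sketch: how the aggregate metric policy determines whether the
   # failure domains are truly separated.
   def aggregate_metric(components, policy, fixed_metric=10):
       """Metric advertised with 192.0.2.0/24 by R2 and R3."""
       if policy == "lowest-component":
           return min(components.values())   # tied to a specific /32
       if policy == "fixed":
           return fixed_metric               # independent of components
       raise ValueError(policy)

   # Component routes behind the aggregation boundary (metrics assumed).
   before = {"192.0.2.1/32": 5, "192.0.2.2/32": 20, "192.0.2.3/32": 30}
   after = dict(before)
   del after["192.0.2.1/32"]   # the component with the lowest metric fails

   for policy in ("lowest-component", "fixed"):
       was = aggregate_metric(before, policy)
       now = aggregate_metric(after, policy)
       leak = "failure leaks to R1" if was != now else "R1 is unaffected"
       print(f"{policy:17}: metric {was} -> {now} ({leak})")
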
   Replacing aggregation with redistribution adds the complexity of
   managing the feedback of routing information redistributed between
   the failure domains.  For instance, if R1, R2, and R3 were
   configured to run one routing protocol, while R2, R3, R4, R5, and R6
   were configured to run another protocol, R2 and R3 could be
   configured to redistribute reachability information between these
   two control planes.  This can split the control plane into multiple
   failure domains (depending on how, specifically, redistribution is
   configured), but at the cost of creating and managing the
   redistribution configuration.  Further, R3 must be configured to
   block routing information redistributed at R2 towards R1 from being
   redistributed (again) towards R4 and R5.

4.  Policy Centralization versus Optimal Policy Application

   Another broad area where control plane complexity interacts with
   optimal network utilization is Quality of Service (QoS).  Two
   specific actions are required to optimize the flow of traffic
   through a network: marking and Per Hop Behaviors (PHBs).  Rather
   than examining each packet at each forwarding device in a network,
   packets are often marked, or classified, in some way (typically
   through Type of Service bits) so they can be handled consistently at
   all forwarding devices.

   Packet marking policies must be configured on specific forwarding
   devices throughout the network.  Distributing marking closer to the
   edge of the network necessarily means configuring and managing more
   devices, but produces optimal forwarding at a larger number of
   network devices.  Moving marking towards the network core means
   packets are marked for proper handling across a smaller number of
   devices.  In the same way, each device through which a packet passes
   with the correct PHBs configured represents an increase in the
   consistency of packet handling through the network as well as an
   increase in the number of devices which must be configured and
   managed for the correct PHBs.  The network below is used to
   illustrate this concept.

                     +----R1----+
                     |          |
                  +--R2--+   +--R3--+
                  |      |   |      |
                  R4     R5  R6     R7

   In this network, marking and PHB configuration may be configured on
   any device, R1 through R7.

   Assume marking is configured at the network edge; in this case, four
   devices, (R4,R5,R6,R7), must be configured, including ongoing
   configuration management, to mark packets.  Moving packet marking to
   R2 and R3 will halve the number of devices on which packet marking
   configuration must be managed, but at the cost of inconsistent
   packet handling at the inbound interfaces of R2 and R3 themselves.

   Thus, reducing the number of devices on which packet marking
   configuration must be managed also reduces the optimality of packet
   flow through the network.  Assuming packet marking is actually
   configured along the edge of this network, configuring PHBs on
   different devices presents the same tradeoff of managed
   configuration versus optimal traffic flow.  If the correct PHBs are
   configured on R1, R2, and R3, then packets passing through the
   network will be handled correctly at each hop.  The cost involved
   will be the management of PHB configuration on three devices.
   Configuring a single device for the correct PHBs (R1, for instance)
   will decrease the amount of configuration management required, at
   the cost of less than optimal packet handling along the entire path.

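   A small sketch of this tradeoff (added as an illustration; the
   counting method and helper names are assumptions, not from the
   original text) tallies, for each placement of the marking policy in
   the sample tree, how many devices must be configured and how many
   hops traffic crosses before it has been classified.

   # Sketch: marking-placement tradeoff in the sample tree topology.
   # Upstream paths from each edge router toward R1 (from the figure).
   paths_to_core = {
       "R4": ["R4", "R2", "R1"],
       "R5": ["R5", "R2", "R1"],
       "R6": ["R6", "R3", "R1"],
       "R7": ["R7", "R3", "R1"],
   }

   def tradeoff(marking_devices):
       """Return (devices to manage, hops crossed before marking)."""
       unmarked_hops = 0
       for path in paths_to_core.values():
           for hop in path:
               if hop in marking_devices:
                   break              # classified from this device onward
               unmarked_hops += 1     # handled without consistent marking
       return len(marking_devices), unmarked_hops

   placements = [("edge (R4-R7)", {"R4", "R5", "R6", "R7"}),
                 ("distribution (R2,R3)", {"R2", "R3"}),
                 ("core (R1)", {"R1"})]
   for label, devices in placements:
       managed, unmarked = tradeoff(devices)
       print(f"{label:21}  managed={managed}  unmarked hops={unmarked}")
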
5.  Configuration State versus Per Hop Forwarding Optimization

   The number of PHBs configured along a forwarding path exhibits the
   same complexity versus optimality tradeoff described in the section
   above.  The more types of service (or queues) traffic is divided
   into, the more optimally traffic will be managed as it passes
   through the network.  At the same time, each class of service must
   be managed, both in terms of configuration and in its interaction
   with other classes of service configured in the network.

6.  Reactivity versus Stability

   The speed at which the network's control plane can react to a change
   in configuration or topology is an area of widespread study.
   Control plane convergence can be broken down into four essential
   parts:

   o  Detecting the change

   o  Propagating information about the change

   o  Determining the best path(s) through the network after the change

   o  Changing the forwarding path at each network element along the
      modified paths

   Each of these areas can be addressed in an effort to improve network
   convergence speeds; some of these improvements come at the cost of
   increased complexity.

   Changes in network topology can be detected much more quickly
   through faster echo (or hello) mechanisms, lower layer physical
   detection, and other methods.  Each of these mechanisms, however,
   can only be used at the cost of evaluating and managing false
   positives and high rates of topology change.

   If a change in the state of a link can be detected in 10ms, for
   instance, the link could theoretically change state 50 times in a
   second --it would be impossible to tune a network control plane to
   react to topology changes at this rate.  Injecting topology change
   information into the control plane at this rate can destabilize the
   control plane, and hence the network itself.  To counter this, most
   fast down detection techniques include some form of dampening
   mechanism; these dampening mechanisms represent an added complexity
   that must be configured and managed.

   Changes in network topology must also be propagated throughout the
   network, so each device along the path can compute new forwarding
   tables.  In high speed network environments, propagation of routing
   information changes can take place in tens of milliseconds, opening
   the possibility of multiple changes being propagated per second.
   Injecting information at this rate into the control plane creates
   the risk of overloading the processes and devices participating in
   the control plane, as well as creating destructive positive feedback
   loops in the network.  To avoid these consequences, most control
   plane protocols regulate the speed at which information about
   network changes can be transmitted by any individual device.  A
   recent innovation in this area is using exponential backoff
   techniques to manage the rate at which information is advertised
   into the control plane; the first change is transmitted quickly,
   while subsequent changes are transmitted more slowly.  These
   techniques all control the destabilizing effects of rapid
   information flows through the control plane at the cost of the added
   complexity of configuring and managing the rate at which the control
   plane can propagate information about network changes.

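   The sketch below (an added illustration; the class name, parameter
   names, and timer values are assumptions rather than values taken
   from any particular protocol) shows the general shape of such an
   exponential backoff: the first event is handled almost immediately,
   subsequent events are delayed progressively up to a ceiling, and the
   delay resets once the network has been quiet for a while.

   # Sketch: exponential backoff of the kind used to pace route
   # advertisements or best-path computations.  Values are illustrative.
   class ExponentialBackoff:
       def __init__(self, initial=0.05, maximum=5.0, quiet=10.0):
           self.initial = initial    # delay for the first event (seconds)
           self.maximum = maximum    # ceiling on the delay
           self.quiet = quiet        # idle time before the delay resets
           self.delay = initial
           self.last_event = None

       def next_delay(self, now):
           """Delay applied to an event occurring at time 'now'."""
           if self.last_event is not None and \
                   now - self.last_event >= self.quiet:
               self.delay = self.initial        # stable again; reset
           current = self.delay
           self.delay = min(self.delay * 2, self.maximum)  # back off
           self.last_event = now
           return current

   backoff = ExponentialBackoff()
   # A burst of changes at t = 0, 1, 2, and 3 seconds, then one more
   # change after a quiet period.
   for t in (0.0, 1.0, 2.0, 3.0, 20.0):
       print(f"t={t:5.1f}s  hold for {backoff.next_delay(t):.2f}s")
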
   All control planes require some form of algorithmic calculation to
   find the best path through the network to any given destination.
   These algorithms are often lightweight, but they still require some
   amount of memory and computational power to execute.  Rapid changes
   in the network can overwhelm the devices on which these algorithms
   run, particularly if changes are presented more quickly than the
   algorithm can run.  Once the devices running these algorithms become
   processor or memory bound, they could experience a computational
   failure altogether, causing a more general network outage.  To
   prevent computational overloading, control plane protocols are
   designed with timers limiting how often they can compute the best
   path through a network; often these timers are exponential in
   nature, allowing the first computation to run quickly, while
   delaying subsequent computations.  Configuring and managing these
   timers is another source of complexity within the network.

   Another option to improve the speed at which the control plane
   reacts to changes in the network is to precompute alternate paths at
   each device, and possibly preinstall forwarding information into
   local forwarding tables.  Additional state is often needed to
   precompute alternate paths, and additional algorithms and techniques
   are often configured and deployed.  This additional state, and these
   additional algorithms, add some amount of complexity to the
   configuration and management of the network.

   In some situations (for some topologies), a tunnel is required to
   pass traffic around a network failure or topology change.  These
   tunnels, while not manually configured, represent additional
   complexity at the forwarding and control planes.

7.  Conclusion

   This document describes various areas of network design where
   complexity is traded off against some optimization in the operation
   of the network.  This is (by its nature) not an exhaustive list, but
   it can serve to guide the measurement of network complexity and the
   search for other areas where these tradeoffs exist.

8.  Security Considerations

   None.

9.  Acknowledgements

   The authors would like to thank Michael Behringer and Dave Meyer for
   their comments.

10.  References

Authors' Addresses

   Alvaro Retana
   Cisco Systems, Inc.
   7025 Kit Creek Rd.
   Research Triangle Park, NC  27709
   USA

   Email: aretana@cisco.com


   Russ White
   IETF

   Email: russw@riw.us