Internet Research Task Force                                M. Behringer
Internet-Draft                                                     Cisco
Intended status: Informational                                 G. Huston
Expires: May 07, 2014            Asia Pacific Network Information Centre
                                                       November 03, 2013

              A Framework for Defining Network Complexity
              draft-irtf-ncrg-complexity-framework-01.txt

Abstract

   Complexity is a widely used parameter in network design, yet there is
   no generally accepted definition of the term.
   Complexity metrics exist in a wide range of research papers, but most
   of these address only a particular aspect of a network, for example
   the complexity of a graph or software. There is a desire to define
   the complexity of a network as a whole, as deployed today to provide
   Internet services. This document provides a framework to guide
   research on the topic of network complexity.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on May 07, 2014.

Copyright Notice

   Copyright (c) 2013 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.

Table of Contents

   1. Introduction
   2. General Considerations
     2.1. The Behavior of a Complex Network
     2.2. Robust Yet Fragile
     2.3. The Complexity Cube
     2.4. Related Concepts
     2.5. Technical Debt
     2.6. Layering considerations
   3. Tradeoffs
   4. Structural Complexity
   5. Components of Complexity
     5.1. The Physical Network (Hardware)
     5.2. State in the Network
     5.3. Churn
     5.4. Algorithms
   6. Location of Complexity
     6.1. Topological Location
     6.2. Logical Location
     6.3. Layering Considerations
   7. Dependencies
     7.1. Local Dependencies
     7.2. Network Wide Dependencies
     7.3. Network External Dependencies
   8. Management Interactions
     8.1. Configuration Complexity
     8.2. Troubleshooting Complexity
     8.3. Monitoring Complexity
     8.4. Complexity of System Integration
   9. External Interactions
     9.1. User Interactions
     9.2. Interactions on End Systems
     9.3. Inter-Network Interactions
   10. Examples
   11. Security Considerations
   12. Acknowledgements
   13. Informative References
   Authors' Addresses

1. Introduction

   Complexity plays a key role during the design phase of a network.
   Network designers generally seek to find the simplest design that
   fulfils a set of requirements. As no objective definition of network
   complexity exists, subjective measures are used to come to a
   conclusion. The resulting diverging views on what constitutes
   complexity subsequently lead to conflicts in design teams. While
   most people would agree that complexity is an important factor in
   network design, today's design decisions are based on a rough
   estimation of the network's complexity rather than on a solid
   understanding.

   The goal of this document is to define a framework for network
   complexity research. This framework describes related research and
   the current understanding of the topic, and outlines some ways in
   which research could be taken forward. Specifically, contributions
   are invited in all of the areas mentioned.

   Many references to existing research in the area of network
   complexity are listed on the Network Complexity Wiki [wiki]. This
   wiki also contains background information on previous meetings on the
   subject, previous research, etc.

2. General Considerations

2.1. The Behavior of a Complex Network

   While there is no generally accepted definition of network
   complexity, there is some understanding of the behavior of a complex
   network. It has some or all of the following properties:

   o  Self-Organization: A network runs some protocols and processes
      without external control, for example a routing process or
      failover mechanisms. The interaction of those mechanisms can
      lead to complex behaviour.

   o  Unpredictability: In a complex network, the effect of a local
      change on the behaviour of the global network may be
      unpredictable.
   o  Emergence: A network has an emergent property if a small local
      change produces a large-scale, seemingly unrelated state or
      result.

   o  Non-linearity: An input into the network produces a non-linear
      result.

   o  Fragility: A small local input can break the entire system.

2.2. Robust Yet Fragile

   Networks typically follow the "robust yet fragile" paradigm: They are
   designed to be robust against a set of failures, yet they are very
   vulnerable to other failures. Doyle [Doyle] explains the concept
   with an example: The Internet is robust against single component
   failure, but fragile to targeted attacks. The "robust yet fragile"
   property also touches on the fact that all network designs
   necessarily make trade-offs between different design goals. The
   simplest one is articulated in "The Twelve Networking Truths"
   [RFC1925]: "Good, Fast, Cheap: Pick any two (you can't have all
   three)." In real network design, trade-offs between many aspects
   have to be made, including, for example, issues of scope, time and
   cost in the network cycle of planning, design, implementation and
   management of a network platform. Trade-offs between various
   parameters are discussed in Section 3.

2.3. The Complexity Cube

   Complex tasks on a network can be done in different components of the
   network. For example, routing can be controlled by central
   algorithms, and the result distributed (e.g., OpenFlow model); the
   routing algorithm can also run completely distributed (e.g., routing
   protocols such as OSPF or ISIS), or a human operator could calculate
   routing tables and statically configure routing. Behringer
   [Behringer] describes these three locations as the axes of a
   "complexity cube": network elements, central systems, and human
   operators. When functions are shifted between these axes of the
   network, the overall complexity may change.

2.4. Related Concepts

   When discussing network complexity, a large number of influencing
   factors have to be taken into account to arrive at a full picture,
   for example:

   o  State in the network: Contains the network elements, such as
      routers, switches (with their OS, including protocols), lines,
      central systems, etc. This also includes, for example, the number
      and algorithmic complexity of the protocols on network devices.

   o  Human operators: Complexity often manifests itself in a network
      that is not completely understood by its human operators. Human
      error is a primary source of catastrophic failures, and therefore
      must be taken into account.

   o  Classes / templates: Rather than counting the number of lines in a
      configuration, or the number of hardware elements, it is more
      important to count the number of classes from which those can be
      derived. In other words, it is probably less complex to have 1000
      interfaces which are identically configured than 5 that are
      configured completely differently.

   o  Dependencies and interactions: The number of dependencies between
      elements, as well as the interactions between them, influence the
      complexity of the network.

   o  TCO (Total Cost of Ownership): TCO could be a good metric for
      network complexity, if the TCO calculation takes into account all
      influencing factors, for example training time for staff to be
      able to maintain a network.

   o  BUC (Benchmark Unit Cost): A related metric that indicates the
      cost of operating a certain component. If calculated well, it
      reflects at least parts of the complexity of this component.
      Therefore, the way TCO or BUC are calculated can help to derive a
      complexity metric.

   o  Churn / rate of change: The change rate in a network itself can
      contribute to complexity, especially if a number of components of
      the overall network interact.
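
   The "Classes / templates" consideration above can be illustrated
   with a minimal sketch. It assumes, purely for illustration, that an
   interface configuration is a flat mapping of setting names to values
   and that two interfaces belong to the same class when all of their
   settings are identical. The function name config_classes and the
   sample data are hypothetical and are not part of this framework.

```python
from collections import Counter


def config_classes(configs):
    """Group interface configurations into classes of identical settings.

    Assumption: `configs` maps an interface name to a flat dict of
    hashable setting values. Returns a Counter whose keys are the
    distinct classes and whose values are the number of members.
    """
    return Counter(frozenset(cfg.items()) for cfg in configs.values())


# Hypothetical data: 1000 identically configured interfaces ...
uniform = {
    f"eth{i}": {"mtu": 1500, "vlan": 10, "speed": "1G"} for i in range(1000)
}
# ... versus 5 interfaces that each differ from all the others.
varied = {
    f"eth{i}": {"mtu": 1500 + i, "vlan": 10 + i, "speed": "1G"}
    for i in range(5)
}

uniform_classes = len(config_classes(uniform))
varied_classes = len(config_classes(varied))
print(uniform_classes, varied_classes)
```

   Under these assumptions the 1000 uniform interfaces collapse into a
   single class, while the 5 varied interfaces form 5 classes. This
   matches the intuition that the number of classes, rather than the
   number of elements, drives this aspect of complexity.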

   Networks differ in terms of their intended purpose (such as is found
   in the differences between enterprise and public carriage network
   platforms) and in their intended role (such as is found in the
   differences between so-called "access" networks and "core" transit
   networks). The differences in terms of role and purpose can often
   lead to differences in the tolerance for, and even the metrics of,
   complexity within such different network scenarios. This is not
   necessarily a space where a single methodology for measuring
   complexity, and defining a single threshold value of acceptability of
   complexity, is appropriate.

2.5. Technical Debt

   Many changes in a network are made with a dependency on the existing
   network. Often, a suboptimal decision is made because the optimal
   decision is hard or impossible to realise at the time. Over time,
   the accumulation of suboptimal changes in itself causes significant
   complexity, which would not have been there had the optimal solution
   been implemented.

   The term "technical debt" refers to the accumulated complexity of
   suboptimal changes over time. As with financial debt, the idea is
   that technical debt, too, must be repaid one day by cleaning up the
   network or software.

2.6. Layering considerations

   In considering the larger space of applications, transport services,
   network services and media services, it is feasible to engineer
   responses for certain types of desired application responses in many
   different ways, involving different layers of the so-called
   network protocol stack. For example, Quality of Service could be
   engineered at any of these layers, or even in a number of
   combinations of different layers.

   Considerations of complexity arise when mutually incompatible
   measures are used in combination (such as error detection and
   retransmission at the media layer in conjunction with the use of the
   TCP transport protocol), or when assumptions used in one layer are
   violated by another layer. This produces surprising outcomes that
   may result in complex interactions. This has led to the perspective
   that increased layering frequently increases complexity [RFC3439].

   While this research work is focused on network complexity, the
   interactions of the network with the end-to-end transport protocols,
   application layer protocols and media properties are relevant
   considerations here.

3. Tradeoffs

   [I-D.irtf-ncrg-network-design-complexity] describes a set of
   tradeoffs in network design to illustrate the practical choices
   network operators have to make. The number of parameters to consider
   in such tradeoff scenarios is very large, so a complete listing may
   not be possible. The dependencies between the various metrics are
   themselves very complex and require further study. This document
   attempts to define a methodology and an overall high-level structure.

   To analyse tradeoffs it is necessary to formalise them. The list of
   parameters for such tradeoffs is long, and the parameters can be
   complex in themselves. For example, "cost" can be a simple
   unidimensional metric, but "extensibility" or "optimal forwarding
   state" are harder to define in detail.

   A list of parameters to trade off contains metrics such as:

   o  Cost: How much does the network cost to build (capex) and run
      (opex)?

   o  Bandwidth / delay / jitter: Traffic characteristics between two
      points (average, max, ...)

   o  Configuration complexity: How hard is it to configure the network
      and maintain the configuration?

   o  Susceptibility to Denial-of-Service: How easy is it to attack the
      service?

   o  Security (confidentiality / integrity): How easy is it to sniff /
      modify / insert the data flow?

   o  Scalability: To what size can I grow the network / service?

   o  Extensibility: Can I use the network for other services in the
      future?

   o  Ease of troubleshooting: How hard is it to find and correct
      problems?

   o  Predictability: If I change a parameter, what will happen?

   o  Clean failure: When a problem arises, does the root cause lead to
      deterministic failure?

   The above criteria can be seen as forming an n-dimensional design
   space, where each network is represented by one point at the
   intersection of all parameter values.

4. Structural Complexity

   tbc

5. Components of Complexity

   Complexity can be found in various components of a networked system.
   For example, the configuration of a network element reflects some of
   the complexity contained in this system, and an algorithm used by a
   protocol may be more or less complex. When classifying complexity,
   the first question to ask is "WHAT is complex?". This section offers
   a method to answer this question.

5.1. The Physical Network (Hardware)

   tbc

5.2. State in the Network

   tbc

5.3. Churn

   The frequency of change in a network intuitively contributes to its
   complexity: A network which is not subjected to change tends to be
   more stable [need ref here]. While a certain base complexity is
   always present in such a network, this complexity is "under control"
   and does not lead to negative side effects.

   [I-D.sircar-complexity-entropy] describes how entropy metrics can be
   used to describe changing complexity in a network. The fundamental
   thesis is that change itself constitutes complexity.
   When a network undergoes change, its entropy and complexity increase.
   This is also true when the change has simplification as a goal. The
   entropy increases during change, and decreases in periods of
   stability. It can therefore be used to measure the impact of change
   on complexity.

5.4. Algorithms

   tbc

6. Location of Complexity

   The previous section discussed the forms in which complexity may be
   perceived. This section focuses on where this complexity is located
   in a network. For example, an algorithm can run centrally,
   distributed, or even in the head of a network administrator. In
   classifying the complexity of a network, the location of a component
   may have an impact on overall complexity. This section offers a
   methodology to answer the question "WHERE is the complex component?"

6.1. Topological Location

   tbc

6.2. Logical Location

   tbc

6.3. Layering Considerations

   tbc

7. Dependencies

   Dependencies are generally regarded as related to overall complexity.
   A system with fewer dependencies is generally considered less
   complex. This section proposes a way to analyse dependencies in a
   network.

   For example, [Chun] states: "We conjecture that the complexity
   particular to networked systems arises from the need to ensure state
   is kept in sync with its distributed dependencies."

   In this document we distinguish three types of dependencies: local
   dependencies, network wide dependencies, and network external
   dependencies.

7.1. Local Dependencies

   tbc

7.2. Network Wide Dependencies

   tbc

7.3. Network External Dependencies

   tbc

8. Management Interactions

   A static network generally is relatively stable; conversely, changes
   introduce a degree of uncertainty and therefore need to be examined
   in detail. Also, troubleshooting a network intuitively exposes
   the complexity of the network.
   This section proposes a methodology to classify management
   interactions with regard to their relationship to network complexity.

8.1. Configuration Complexity

   tbc

8.2. Troubleshooting Complexity

   tbc

8.3. Monitoring Complexity

   tbc

8.4. Complexity of System Integration

   tbc

9. External Interactions

   The user experience of a network also illustrates a form of
   complexity. A network can expose certain tasks to the user, or deal
   with them internally, hidden from the user. This section describes
   how user interactions can be analysed to expose complexity.

9.1. User Interactions

   tbc

9.2. Interactions on End Systems

   tbc

9.3. Inter-Network Interactions

   tbc

10. Examples

   It is unlikely that a single, objective metric that includes all the
   relevant aspects of complexity can be defined in the foreseeable
   future. In the absence of such a global metric, a comparative
   approach could be easier.

   For example, it is possible to compare the complexity of a
   centralised system, where algorithms run centrally and the results
   are distributed to the network nodes, with that of a distributed
   algorithm. The type of algorithm may be similar, but the location is
   different, and a different dependency graph would result. The
   supporting hardware may be the same, and could thus be ignored for
   this exercise. Layering is also likely to be the same. The
   management interactions, though, would differ significantly between
   the two cases.

   The classification in this document also makes it easier to survey
   existing research with regard to which area of complexity is
   covered. This could help in identifying open areas for research.

11. Security Considerations

   This document does not discuss any specific security considerations.

12. Acknowledgements

   The motivations and framework of this overview of studies into
   network complexity are the result of many meetings and discussions,
   with too many people to provide a full list here. However, key
   contributions have been made by: John Doyle, Jon Crowcroft, Mark
   Handley, Fred Baker, Paul Vixie, Lars Eggert, Bob Briscoe, Keith
   Jones, Bruno Klauser, Steve Youell, Joel Obstfeld.

   The authors would like to acknowledge the contributions of Rana
   Sircar, Ken Carlberg and Luca Caviglione in the preparation of this
   Research Group document.

13. Informative References

   [Behringer]
              Behringer, M., "Classifying Network Complexity",
              Proceedings of the ACM Re-Arch'09, December 2009.

   [Chun]     Chun, B-G., Ratnasamy, S., and E. Kohler, "NetComplex: A
              Complexity Metric for Networked System Design", 5th Usenix
              Symposium on Networked Systems Design and Implementation
              NSDI 2008, April 2008.

   [Doyle]    Doyle, J., "The 'robust yet fragile' nature of the
              Internet", PNAS vol. 102 no. 41 14497-14502, October 2005.

   [I-D.irtf-ncrg-network-design-complexity]
              Retana, A. and R. White, "Network Design Complexity
              Measurement and Tradeoffs",
              draft-irtf-ncrg-network-design-complexity-00 (work in
              progress), August 2013.

   [I-D.sircar-complexity-entropy]
              Sircar, R. and M. Behringer, "Using Entropy as a Measure
              for Changes in Network Complexity",
              draft-sircar-complexity-entropy-00 (work in progress),
              October 2013.

   [RFC1925]  Callon, R., "The Twelve Networking Truths", RFC 1925,
              April 1996.

   [RFC3439]  Bush, R. and D. Meyer, "Some Internet Architectural
              Guidelines and Philosophy", RFC 3439, December 2002.

   [wiki]     "Network Complexity Wiki".

Authors' Addresses

   Michael H. Behringer
   Cisco

   Email: mbehring@cisco.com

   Geoff Huston
   Asia Pacific Network Information Centre

   Email: gih@apnic.net