ANIMA WG                                               M. Behringer, Ed.
Internet-Draft                                              S. Bjarnason
Intended status: Standards Track                              Balaji. BL
Expires: April 8, 2016                                         T. Eckert
                                                           Cisco Systems
                                                         October 6, 2015


                       An Autonomic Control Plane
              draft-ietf-anima-autonomic-control-plane-01

Abstract

   Autonomic functions need a control plane to communicate, which
   depends on some addressing and routing.
   This Autonomic Control Plane should ideally be self-managing, and as
   independent as possible of configuration.  This document defines an
   "Autonomic Control Plane", with the primary use as a control plane
   for autonomic functions.  It also serves as a "virtual out of band
   channel" for OAM communications over a network that is not
   configured, or mis-configured.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on April 8, 2016.

Copyright Notice

   Copyright (c) 2015 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Use Cases for an Autonomic Control Plane
     2.1.  An Infrastructure for Autonomic Functions
     2.2.  Secure Bootstrap over an Unconfigured Network
     2.3.  Data Plane Independent Permanent Reachability
   3.  Requirements
   4.  Overview
   5.  Self-Creation of an Autonomic Control Plane
     5.1.  Preconditions
     5.2.  Candidate ACP Neighbor Selection
     5.3.  Capability Negotiation
     5.4.  Channel Establishment
     5.5.  Context Separation
     5.6.  Addressing inside the ACP
     5.7.  Routing in the ACP
   6.  Workarounds for Non-Autonomic Nodes
     6.1.  Connecting a Non-Autonomic Controller / NMS system
     6.2.  ACP through Non-Autonomic L3 Clouds
   7.  The Negotiation Protocol
   8.  The Channel Type
   9.  Self-Healing Properties
   10. Self-Protection Properties
   11. The Administrator View
   12. Security Considerations
   13. IANA Considerations
   14. Acknowledgements
   15. Change log [RFC Editor: Please remove]
     15.1.  Initial version
     15.2.  draft-behringer-anima-autonomic-control-plane-00
     15.3.  draft-behringer-anima-autonomic-control-plane-01
     15.4.  draft-behringer-anima-autonomic-control-plane-02
     15.5.  draft-behringer-anima-autonomic-control-plane-03
     15.6.  draft-ietf-anima-autonomic-control-plane-00
     15.7.  draft-ietf-anima-autonomic-control-plane-01
   16. References
   Appendix A.  Background on the choice of routing protocol
   Appendix B.  Alternative: An ACP without Separation
   Authors' Addresses

1.  Introduction

   Autonomic Networking is a concept of self-management: Autonomic
   functions self-configure, and negotiate parameters and settings
   across the network.  [RFC7575] defines the fundamental ideas and
   design goals of Autonomic Networking.  A gap analysis of Autonomic
   Networking is given in [RFC7576].  The reference architecture for
   Autonomic Networking in the IETF is currently being defined in the
   document [I-D.behringer-anima-reference-model].

   Autonomic functions need a stable and robust infrastructure to
   communicate on.  This infrastructure should be as robust as possible,
   and it should be re-usable by all autonomic functions.  [RFC7575]
   calls it the "Autonomic Control Plane".  This document defines the
   Autonomic Control Plane.

   Today, the management and control plane of networks typically runs in
   the global routing table, which is dependent on correct configuration
   and routing.  Misconfigurations or routing problems can therefore
   disrupt management and control channels.  Traditionally, an out of
   band network has been used to recover from such problems, or
   personnel are sent on site to access devices through console ports.
   However, both options are operationally expensive.

   In increasingly automated networks, either controllers or distributed
   autonomic service agents in the network require a control plane which
   is independent of the network they manage, to avoid impacting their
   own operations.

   This document describes options for a self-forming, self-managing and
   self-protecting "Autonomic Control Plane" (ACP) which is inband on
   the network, yet as independent as possible of configuration,
   addressing and routing problems (for details on how this is achieved,
   see Section 5).  It therefore remains operational even in the
   presence of configuration errors, addressing or routing issues, or
   where policy could inadvertently affect control plane connectivity.
   The Autonomic Control Plane serves several purposes at the same time:

   o  Autonomic functions communicate over the ACP.  The ACP therefore
      directly supports Autonomic Networking functions, as described in
      [I-D.behringer-anima-reference-model].  For example, GRASP
      [I-D.ietf-anima-grasp] can run inside the ACP.

   o  An operator can use it to log into remote devices, even if the
      data plane is misconfigured or unconfigured.

   o  A controller or network management system can use it to securely
      bootstrap network devices in remote locations, even if the network
      in between is not yet configured; no data-plane dependent
      bootstrap configuration is required.  An example of such a secure
      bootstrap process is described in
      [I-D.ietf-anima-bootstrapping-keyinfra].

   This document describes some use cases for the ACP in Section 2 and
   defines the requirements in Section 3.  Section 4 gives an overview
   of how an Autonomic Control Plane is constructed, and Section 5
   explains the detailed process.  Section 6 explains how non-autonomic
   nodes and networks can be integrated, Section 7 defines the
   negotiation protocol, and Section 8 the first channel types for the
   ACP.

   The document "Autonomic Network Stable Connectivity"
   [I-D.eckert-anima-stable-connectivity] describes how the ACP can be
   used to provide stable connectivity for OAM applications.
   It also explains how existing management solutions can leverage the
   ACP in parallel with traditional management models, when to use the
   ACP versus the data plane, how to integrate IPv4 based management,
   etc.

2.  Use Cases for an Autonomic Control Plane

2.1.  An Infrastructure for Autonomic Functions

   Autonomic Functions need a stable infrastructure to run on, and all
   autonomic functions should use the same infrastructure to minimise
   the complexity of the network.  This way, the network needs only a
   single discovery mechanism, a single security mechanism, and a single
   set of the other processes that distributed functions require.

2.2.  Secure Bootstrap over an Unconfigured Network

   Today, bootstrapping a new device typically requires all devices
   between a controlling node (such as an SDN controller) and the new
   device to be completely and correctly addressed, configured and
   secured.  Therefore, bootstrapping a network happens in layers around
   the controller.  Without console access (for example through an out
   of band network) it is not possible today to make devices securely
   reachable before the entire network in between has been configured.

   With the ACP, secure bootstrap of new devices can happen without
   requiring any configuration on the network.  A new device can
   automatically be bootstrapped in a secure fashion and be deployed
   with a domain certificate.  This does not require any configuration
   on intermediate nodes, because they can communicate through the ACP.

2.3.  Data Plane Independent Permanent Reachability

   Today, most critical control plane protocols and network management
   protocols are running in the data plane (global routing table) of the
   network.  This leads to undesirable dependencies between control and
   management plane on one side and the data plane on the other: The
   other planes work as expected only if the data plane is operational.

   Data plane connectivity can be affected by errors and faults.  For
   example, certain AAA misconfigurations can lock an administrator out
   of a device; routing or addressing issues can make a device
   unreachable; shutting down interfaces over which a current management
   session is running can lock an admin irreversibly out of the device.
   Traditionally only console access can help recover from such issues.

   Data plane dependencies also affect NOC/SDN controller applications:
   Certain network changes are hard to carry out today, because the
   change itself may affect reachability of the devices.  Examples are
   address or mask changes, routing changes, or security policies.
   Today such changes require precise hop-by-hop planning.

   The ACP provides reachability that is largely independent of the data
   plane, which allows control plane and management plane to operate
   more robustly:

   o  For management plane protocols, the ACP provides the functionality
      of a "Virtual-out-of-band (VooB) channel", by providing
      connectivity to all devices regardless of their configuration or
      global routing table.

   o  For control plane protocols, the ACP allows their operation even
      when the data plane is temporarily faulty, or during transitional
      events, such as routing changes, which may affect the control
      plane at least temporarily.  This is specifically important for
      autonomic service agents, which could affect data plane
      connectivity.

   The document "Autonomic Network Stable Connectivity"
   [I-D.eckert-anima-stable-connectivity] explains the use cases for the
   ACP in significantly more detail and explains how the ACP can be used
   in practical network operations.

3.  Requirements

   The Autonomic Control Plane has the following requirements:

   1.  The ACP SHOULD provide robust connectivity: As far as possible,
       it should be independent of configured addressing, configuration
       and routing.
       Requirements 2 and 3 build on this requirement, but also have
       value on their own.

   2.  The ACP MUST have a separate address space from the data plane.
       Reason: traceability, debug-ability, separation from data plane,
       security (can block easily at edge).

   3.  The ACP MUST use autonomically managed address space.  Reason:
       easy bootstrap and setup ("autonomic"); robustness (admin can't
       mess things up so easily).  This document suggests using ULA
       addressing for this purpose.

   4.  The ACP MUST be generic: usable by all the functions and
       protocols of the AN infrastructure.  It MUST NOT be tied to a
       particular protocol.

   5.  The ACP MUST provide security: Messages coming through the ACP
       MUST be authenticated to be from a trusted node, and SHOULD (very
       strong SHOULD) be encrypted.

   The default mode of operation of the ACP is hop-by-hop, because this
   interaction can be built on IPv6 link local addressing, which is
   autonomic, and has no dependency on configuration (requirement 1).
   It may be necessary to have end-to-end connectivity in some cases,
   for example to provide an end-to-end security association for some
   protocols.  This is possible, but then has a dependency on routable
   address space.

4.  Overview

   The Autonomic Control Plane is constructed in the following way (for
   details, see Section 5):

   o  An autonomic node creates a virtual routing and forwarding (VRF)
      instance, or a similar virtual context.

   o  It determines, following a policy, a candidate peer list.  This is
      the list of nodes to which it should establish an autonomic
      control plane.  Default policy is: To all adjacent nodes in the
      same domain.  Intent can override this default policy.

   o  For each node in the candidate peer list, it authenticates that
      node and negotiates a mutually acceptable channel type.

   o  It then establishes a secure tunnel of the negotiated channel
      type.
      These tunnels are placed into the previously set up VRF.  This
      creates an overlay network with hop-by-hop tunnels.

   o  Inside the ACP VRF, each node sets up a virtual interface with its
      ULA IPv6 address.

   o  Each node runs a lightweight routing protocol, to announce
      reachability of the virtual addresses inside the ACP.

   o  Non-autonomic NMS systems or controllers have to be manually
      connected into the ACP.

   o  Connecting over non-autonomic Layer-3 clouds initially requires a
      tunnel between autonomic nodes.

   o  None of the above operations (except manual ones) is reflected in
      the configuration of the device.

   The following figure illustrates the ACP.

           autonomic node 1                  autonomic node 2
          ...................               ...................
   secure .                 .   secure      .                 .  secure
   tunnel :  +-----------+  :   tunnel      :  +-----------+  :  tunnel
   ..--------| ACP VRF   |---------------------| ACP VRF   |---------..
          : / \         / \   <--routing-->   / \         / \ :
          : \ /         \ /                   \ /         \ / :
   ..--------| virtual   |---------------------| virtual   |---------..
          :  | interface |  :               :  | interface |  :
          :  +-----------+  :               :  +-----------+  :
          :                 :               :                 :
          :   data plane    :...............:   data plane    :
          :                 :     link      :                 :
          :.................:               :.................:

                                Figure 1

   The resulting overlay network is normally based exclusively on
   hop-by-hop tunnels.  This is because addressing used on links is IPv6
   link local addressing, which does not require any prior set-up.  This
   way the ACP can be built even if there is no configuration on the
   devices, or if the data plane has issues such as addressing or
   routing problems.

5.  Self-Creation of an Autonomic Control Plane

   This section describes the steps to set up an Autonomic Control
   Plane, and highlights the key properties which make it
   "indestructible" against many inadvertent changes to the data plane,
   for example caused by misconfigurations.

5.1.  Preconditions

   An autonomic node can be a router, switch, controller, NMS host, or
   any other IP device.  We assume an autonomic node has:

   o  A globally unique domain certificate, with which it can
      cryptographically assert its membership of the domain.  The
      document [I-D.ietf-anima-bootstrapping-keyinfra] describes how a
      domain certificate can be automatically and securely derived from
      a vendor specific Unique Device Identifier (UDI) or IDevID
      certificate.  (Note the UDI used in this document is NOT the UUID
      specified in [RFC4122].)

   o  An adjacency table, which contains information about adjacent
      autonomic nodes, at a minimum: node-ID, IP address, domain,
      certificate.  An autonomic device keeps this adjacency table up to
      date.  Where the next autonomic device is not directly adjacent,
      the information in the adjacency table can be supplemented by
      configuration.  For example, the node-ID and IP address could be
      configured.

   The adjacency table MAY contain information about the validity and
   trust of the adjacent autonomic node's certificate.  However,
   subsequent steps MUST always start with authenticating the peer.

   The adjacency table contains information about adjacent autonomic
   nodes in general, independently of their domain and trust status.
   The next step determines to which of those autonomic nodes an ACP
   connection should be established.

5.2.  Candidate ACP Neighbor Selection

   An autonomic node must determine to which other autonomic nodes in
   the adjacency table it should build an ACP connection.

   The ACP is by default established exclusively between nodes in the
   same domain.

   Intent can change this default behaviour.  The precise format for
   this Intent needs to be defined outside this document.  Example
   Intent policies are:

   o  The ACP should be built between all sub-domains for a given parent
      domain.
      For example: For domain "example.com", nodes of "example.com",
      "access.example.com", "core.example.com" and
      "city.core.example.com" should all establish one single ACP.

   o  Two domains should build one single ACP between themselves, for
      example "example1.com" should establish the ACP also with nodes
      from "example2.com".  For this case, the two domains must be able
      to validate their trust, typically by cross-signing their
      certificate infrastructure.

   The result of the candidate ACP neighbor selection process is a list
   of adjacent or configured autonomic neighbors to which an ACP channel
   should be established.  The next step begins that channel
   establishment.

5.3.  Capability Negotiation

   Autonomic devices may have different capabilities based on the type
   of device, OS version, etc.  To establish a trusted secure ACP
   channel, devices must first negotiate their mutual capabilities in
   the data plane.  This allows for the support of different channel
   types in the future.

   For each node on the candidate ACP neighbor list, capabilities need
   to be exchanged.  The capability negotiation is based on GRASP
   [I-D.ietf-anima-grasp].  The relevant protocol details are defined in
   Section 7.  This negotiation MUST be secure: The identity of the
   other node MUST be validated during capability negotiation, and the
   exchange MUST be authenticated.

   The first parameter to be negotiated is the ACP Channel type.  The
   channel types are defined in Section 8.  Other parameters may be
   added later.

   Intent may also influence the capability negotiation.  For example,
   Intent may require a minimum ACP tunnel security.  This is out of
   scope for this document.

5.4.  Channel Establishment

   After authentication and capability negotiation, autonomic nodes
   establish a secure channel towards the AN neighbors with the above
   negotiated parameters.
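
   The candidate selection of Section 5.2 and the channel type choice of
   Section 5.3 can be illustrated with a minimal Python sketch.  It is
   not part of the protocol definition: the names Adjacency,
   same_domain, within_parent, select_candidates and
   negotiate_channel_type, as well as the channel type labels used with
   them, are hypothetical.  Authentication of the peer and the GRASP
   message exchange, which the text above requires, are deliberately not
   modelled.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional, Sequence


@dataclass(frozen=True)
class Adjacency:
    """One adjacency table entry (Section 5.1); field names are illustrative."""
    node_id: str
    ip_address: str
    domain: str


# A policy decides whether an ACP should be built towards one adjacency.
Policy = Callable[[str, Adjacency], bool]


def same_domain(local_domain: str, adj: Adjacency) -> bool:
    """Default policy: build the ACP only between nodes of the same domain."""
    return adj.domain == local_domain


def within_parent(parent: str) -> Policy:
    """Example Intent: one ACP across a parent domain and all its sub-domains."""
    def inside(domain: str) -> bool:
        return domain == parent or domain.endswith("." + parent)

    def policy(local_domain: str, adj: Adjacency) -> bool:
        return inside(local_domain) and inside(adj.domain)

    return policy


def select_candidates(local_domain: str,
                      table: Iterable[Adjacency],
                      policy: Policy = same_domain) -> List[Adjacency]:
    """Return the adjacency entries to which an ACP channel should be built."""
    return [adj for adj in table if policy(local_domain, adj)]


def negotiate_channel_type(local_prefs: Sequence[str],
                           peer_types: Iterable[str]) -> Optional[str]:
    """Pick the first locally preferred channel type the peer also supports.

    Returns None when there is no mutually acceptable type.
    """
    offered = set(peer_types)
    for channel_type in local_prefs:
        if channel_type in offered:
            return channel_type
    return None
```

   With the default policy, a node in "example.com" would select only
   neighbors whose domain is exactly "example.com"; passing
   within_parent("example.com") models the first example Intent policy
   of Section 5.2.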

   The channel establishment MUST be authenticated.  Whether or not, and
   how, a channel is encrypted is part of the capability negotiation,
   potentially controlled by Intent.

   In order to be independent of configured link addresses, channels
   SHOULD use IPv6 link local addresses between adjacent neighbors
   wherever possible.  This way, the ACP tunnels are independent of
   correct network wide routing.

   Since channels are by default established between adjacent neighbors,
   the resulting overlay network provides hop-by-hop encryption.  Each
   node decrypts incoming traffic from the ACP, and encrypts outgoing
   traffic to its neighbors in the ACP.  Routing is discussed in
   Section 5.7.

   If two nodes are connected via several links, the ACP SHOULD be
   established on every link, but it is possible to establish the ACP
   only on a sub-set of links.  Having an ACP channel on every link has
   a number of advantages, for example it allows for a faster failover
   in case of link failure, and it reflects the physical topology more
   closely.  Using a subset of links (for example, a single link)
   reduces resource consumption on the devices, because state needs to
   be kept per ACP channel.

5.5.  Context Separation

   The ACP is in a separate context from the normal data plane of the
   device.  This context includes the ACP channels, IPv6 forwarding and
   routing, as well as any required higher layer ACP functions.

   In classical network device platforms, a dedicated so-called "Virtual
   routing and forwarding instance" (VRF) is one logical implementation
   option for the ACP.  If the platform software architecture permits,
   separation options that minimize shared components are preferred.
   The context for the ACP needs to be established automatically during
   bootstrap of a device.  As much as possible it should be protected
   from being modified unintentionally by data plane configuration.

   Context separation improves security, because the ACP is not
   reachable from the global routing table.  Also, configuration errors
   from the data plane setup do not affect the ACP.

   [EDNOTE: Previous versions of this document also discussed an option
   where the ACP runs in the data plane without logical separation.
   Consensus is to focus only on the separated ACP now, and to remove
   the ACP in the data plane from this document.  See Appendix B for the
   reasons for this decision.]

5.6.  Addressing inside the ACP

   The channels explained above typically only establish communication
   between two adjacent nodes.  In order for communication to happen
   across multiple hops, the autonomic control plane requires internal
   network wide valid addresses and routing.  Each autonomic node must
   create a virtual interface with a network wide unique address inside
   the ACP context mentioned in Section 5.5.

   The ACP is based exclusively on IPv6 addressing, for a variety of
   reasons:

   o  Simplicity, reliability and scale: If other network layer
      protocols were supported, each would have to have its own set of
      security associations, routing table and process, etc.

   o  Autonomic functions do not require IPv4: Autonomic functions and
      autonomic service agents are new concepts.  They can be
      exclusively built on IPv6 from day one.  There is no need for
      backward compatibility.

   o  OAM protocols do not require IPv4: The ACP may carry OAM
      protocols.  All relevant protocols (SNMP, TFTP, SSH, SCP, RADIUS,
      Diameter, ...) are available in IPv6.

   Once an autonomic node is enrolled in a domain, it automatically
   creates a network wide Unique Local Address (ULA) in accordance with
   [RFC4193] with the following algorithm:

   o  Prefix FD00::/8, defining locally assigned unique local addresses.
      See Section 3.1 of [RFC4193].

   o  Global ID: an MD5 hash of the domain ID, using the 40 least
      significant bits.  This results in a pseudo-random global ID, in
      accordance with Section 3.2 of [RFC4193].

   o  Subnet ID and interface ID:
      [I-D.behringer-anima-autonomic-addressing] defines how these
      fields can be constructed and used.

   With this algorithm, all autonomic devices in the same domain have
   the same /48 prefix.  Conversely, global IDs from different domains
   are unlikely to clash, such that two networks can be merged, as long
   as the policy allows that merge.  See also Section 9 for a discussion
   on merging domains.

   Links inside the ACP only use link-local IPv6 addressing, such that
   each node only requires one routable virtual address.

5.7.  Routing in the ACP

   Once ULA addresses are set up, all autonomic entities should run a
   routing protocol within the autonomic control plane context.  This
   routing protocol distributes the ULA created in the previous section
   for reachability.  The use of the autonomic control plane specific
   context eliminates possible clashes with the global routing table and
   also protects the ACP from interference caused by configuration
   mismatches or incorrect routing updates.

   The establishment of the routing plane and its parameters are
   automatic and strictly within the confines of the autonomic control
   plane.  Therefore, no manual configuration is required.

   All routing updates are automatically secured in transit as the
   channels of the autonomic control plane are by default secured.

   The routing protocol inside the ACP should be light weight and highly
   scalable to ensure that the ACP does not become a limiting factor in
   network scalability.  We suggest the use of RPL [RFC6550] as one such
   protocol which is light weight and scales well for the control plane
   traffic.  See Appendix A for more details on the choice of RPL.

6.  Workarounds for Non-Autonomic Nodes

6.1.  Connecting a Non-Autonomic Controller / NMS system

   The Autonomic Control Plane can be used by management systems, such
   as controllers or network management system (NMS) hosts (henceforth
   called simply "NMS hosts"), to connect to devices through it.  For
   this, an NMS host must have access to the ACP.  By default, the ACP
   is a self-protecting overlay network, which allows access only to
   trusted systems.  Therefore, a traditional, non-autonomic NMS system
   does not have access to the ACP by default, just like any other
   external device.

   If the NMS host is not autonomic, i.e., it does not support autonomic
   negotiation of the ACP, then it can be brought into the ACP by
   explicit configuration.  On an adjacent autonomic node with ACP, the
   interface with the NMS host can be configured to be part of the ACP.
   In this case, this interface places the NMS host entirely and
   exclusively inside the ACP.  The NMS host would likely require a
   second interface for connections between the NMS host and
   administrators, or Internet based services.  This mode of connecting
   an NMS host has security consequences: All systems and processes
   connected to this implicitly trusted interface have access to all
   autonomic nodes on the entire ACP, without further authentication.
   Thus, this connection must be physically controlled.

   The non-autonomic NMS host must be routed in the ACP.  This involves
   two parts: 1) the NMS host must point a route (for example, its
   default route) for the ULA prefix used inside the ACP to the AN
   device, and 2) the prefix used between AN node and NMS host must be
   announced into the ACP, and distributed there.

   The document "Autonomic Network Stable Connectivity"
   [I-D.eckert-anima-stable-connectivity] explains in more detail how
   the ACP can be integrated in a mixed NOC environment.

6.2.  ACP through Non-Autonomic L3 Clouds

   Not all devices in a network may be autonomic.
   If non-autonomic Layer-2 devices are between autonomic nodes, the
   communications described in this document should work, since they are
   IP based.  However, non-autonomic Layer-3 devices do not forward link
   local autonomic messages, and thus break the Autonomic Control Plane.

   One workaround is to manually configure IP tunnels between autonomic
   nodes across a non-autonomic Layer-3 cloud.  The tunnels are
   represented on each autonomic node as virtual interfaces, and all
   autonomic transactions work across such tunnels.

   Such manually configured tunnels are less "indestructible" than an
   automatically created ACP based on link local addressing, since they
   depend on correct data plane operations, such as routing and
   addressing.

7.  The Negotiation Protocol

   This section describes the negotiation exchange in detail.  It is
   based on GRASP [I-D.ietf-anima-grasp].  Since at the time of
   establishing the ACP channel there is no ACP yet, this negotiation
   protocol must run in the data plane.  This negotiation MUST be
   authenticated, to avoid downgrade attacks, where an attacker injects
   bogus negotiation messages demanding a less secure ACP channel type.
   The negotiation MAY be encrypted.

   [The detailed negotiation flow and mapping into GRASP messages is to
   be completed.]

8.  The Channel Type

   Two adjacent nodes negotiate an ACP channel.  This channel MUST be
   authenticated and SHOULD be encrypted.

   The nodes negotiate a parameter called "ACP channel type".  This
   document defines a single, MUST implement channel type: GRE with
   IPsec transport mode.  See IANA Considerations (Section 13) for the
   formal definition of this parameter.

9.  Self-Healing Properties

   The ACP is self-healing:

   o  New neighbors will automatically join the ACP after successful
      validation and will become reachable using their unique ULA
      address across the ACP.
   o  When any changes happen in the topology, the routing protocol used
      in the ACP will automatically adapt to the changes and will
      continue to provide reachability to all devices.

   o  If an existing device gets revoked, it will automatically be
      denied access to the ACP as its domain certificate will be
      validated against a Certificate Revocation List during
      authentication. Since the revocation check is only done at the
      establishment of a new security association, existing ones are not
      automatically torn down. If an immediate disconnect is required,
      existing sessions to a freshly revoked device can be reset.

   The ACP can also sustain network partitions and mergers. Practically
   all ACP operations are link local, where a network partition has no
   impact. Devices authenticate each other using the domain
   certificates to establish the ACP locally. Addressing inside the ACP
   remains unchanged, and the routing protocol inside both parts of the
   ACP will lead to two working (although partitioned) ACPs.

   There are few central dependencies: A certificate revocation list
   (CRL) may not be available during a network partition; a suitable
   policy to not immediately disconnect neighbors when no CRL is
   available can address this issue. Also, a registrar or Certificate
   Authority might not be available during a partition. This may delay
   renewal of certificates that are to expire in the future, and it may
   prevent the enrolment of new devices during the partition.

   After a network partition, a re-merge will simply re-establish the
   previous status: certificates can be renewed, the CRL is available,
   and new devices can be enrolled everywhere. Since all devices use
   the same trust anchor, a re-merge will be smooth.

   Merging two networks with different trust anchors requires the trust
   anchors to mutually trust each other (for example, by cross-signing).
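   The revocation handling described in this section can be illustrated
   with a small sketch. It is only one possible reading and not part of
   this specification: the function name, the modelling of a CRL as a
   set of certificate serial numbers, and the allow_without_crl flag are
   all assumptions made for this example. It shows the check being
   applied only when a new security association is set up, with a policy
   choice for the case where no CRL is reachable during a partition.

```python
from enum import Enum
from typing import Optional, Set


class Decision(Enum):
    ACCEPT = "accept"
    REJECT = "reject"


def check_new_association(
    peer_serial: int,
    crl: Optional[Set[int]],
    allow_without_crl: bool = True,
) -> Decision:
    """Decide whether a new security association may be established.

    Hypothetical model: `crl` is the set of revoked certificate serial
    numbers, or None when no CRL can be fetched (for example, during a
    network partition).
    """
    if crl is None:
        # Policy choice suggested by the text: do not cut off neighbors
        # merely because the CRL is unreachable.
        return Decision.ACCEPT if allow_without_crl else Decision.REJECT
    # The check runs only at establishment time; associations that
    # already exist are not touched by this function.
    return Decision.REJECT if peer_serial in crl else Decision.ACCEPT
```

   In this sketch, tearing down existing sessions to a freshly revoked
   device would be a separate, explicit action, matching the text above.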
   When two such networks are merged, the addressing will not overlap
   as long as their domain names are different (see Section 5.6).

10. Self-Protection Properties

   As explained in Section 5, the ACP is based on channels being built
   between devices which have been previously authenticated based on
   their domain certificates. The channels themselves are protected
   using standard encryption technologies such as DTLS or IPsec, which
   provide additional authentication during channel establishment,
   integrity and confidentiality protection for data inside the ACP,
   and replay protection.

   An attacker will therefore not be able to join the ACP without a
   valid domain certificate. Packet injection and traffic sniffing are
   also prevented by the security that the encryption protocol
   provides.

   The remaining attack vector would be to attack the underlying AN
   protocols themselves, either via directed attacks or by
   denial-of-service attacks. However, as the ACP is built using
   link-local IPv6 addresses, remote attacks are impossible. The ULA
   addresses are only reachable inside the ACP context, and are
   therefore unreachable from the data plane. Also, the ACP protocols
   should be implemented to be attack resistant and not consume
   unnecessary resources even while under attack.

11. The Administrator View

   An ACP is self-forming, self-managing and self-protecting, and
   therefore has minimal dependencies on the administrator of the
   network. Specifically, since it is independent of configuration,
   there is no scope for configuration errors on the ACP itself. The
   administrator may have the option to enable or disable the entire
   approach, but detailed configuration is not possible. This means
   that the ACP must not be reflected in the running configuration of
   devices, except for a possible on/off switch.
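   As a side illustration of the address scopes that Section 10 relies
   on, the following sketch classifies an address as link-local, ULA
   (the fc00::/7 prefix of [RFC4193]), or other. The function name and
   the returned labels are assumptions made for this example; the
   sketch only uses Python's standard ipaddress module and does not
   describe any ACP implementation.

```python
import ipaddress

# Unique Local IPv6 Unicast Addresses, as defined in RFC 4193.
ULA_PREFIX = ipaddress.ip_network("fc00::/7")


def acp_address_scope(addr: str) -> str:
    """Return a coarse scope label for an address (hypothetical helper)."""
    ip = ipaddress.ip_address(addr)
    if ip.version != 6:
        return "not-ipv6"
    if ip.is_link_local:
        # fe80::/10: never forwarded by routers, so on-link only.
        return "link-local"
    if ip in ULA_PREFIX:
        # In the ACP design these are reachable only inside the ACP.
        return "ula"
    return "other"
```

   For example, acp_address_scope("fe80::1") yields "link-local" and
   acp_address_scope("fd12:3456:789a::1") yields "ula".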
   While configuration is not possible, an administrator must have full
   visibility of the ACP and all its parameters, to be able to do
   troubleshooting. Therefore, an ACP must support all show and debug
   options, as for any other network function. Specifically, a network
   management system or controller must be able to discover the ACP, and
   monitor its health. This visibility of ACP operations must be
   clearly separated from visibility of the data plane, so that
   automated systems never have to deal with ACP aspects unless they
   explicitly desire to do so.

   Since an ACP is self-protecting, a device not supporting the ACP, or
   without a valid domain certificate, cannot connect to it. This means
   that by default a traditional controller or network management system
   cannot connect to an ACP. See Section 6.1 for more details on how to
   connect an NMS host into the ACP.

12. Security Considerations

   An ACP is self-protecting and there is no need to apply configuration
   to make it secure. Its security therefore does not depend on
   configuration.

   However, the security of the ACP depends on a number of other
   factors:

   o  The usage of domain certificates depends on a valid supporting
      PKI. If the chain of trust of this PKI is compromised, the
      security of the ACP is also compromised. This is typically under
      the control of the network administrator.

   o  Security can be compromised by implementation errors (bugs), as in
      all products.

   Fundamentally, security depends on correct operation, implementation
   and architecture. Autonomic approaches such as the ACP largely
   eliminate the dependency on correct operation; implementation and
   architectural mistakes are still possible, as in all networking
   technologies.

13. IANA Considerations

   Section 8 describes an option for the channel negotiation, the
   channel type.
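   The channel type registry requested in this section (an 8-bit
   unsigned value, with 0 assigned to GRE protected with IPsec transport
   mode and 1-255 reserved) could be modelled by an implementation
   roughly as follows. This is a minimal sketch: the class name, member
   name and parsing function are hypothetical and are not defined by
   this document.

```python
from enum import IntEnum


class AcpChannelType(IntEnum):
    """Assigned ACP channel types (only value 0 is assigned so far)."""

    GRE_IPSEC_TRANSPORT = 0


def parse_channel_type(value: int) -> AcpChannelType:
    """Validate a received channel type value (hypothetical helper)."""
    if not 0 <= value <= 0xFF:
        raise ValueError(f"channel type {value} does not fit in 8 bits")
    try:
        return AcpChannelType(value)
    except ValueError:
        # Values 1-255 are reserved for future channel types.
        raise ValueError(f"channel type {value} is reserved") from None
```

   Rejecting reserved values, rather than silently falling back to the
   default, is a design choice of this sketch; how unknown values are
   handled during negotiation is not specified here.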
   We request IANA to create a registry for ACP channel types.

   The ACP channel type is an 8-bit unsigned integer. This document only
   assigns the first value.

      Number | Channel Type                      | RFC
    ---------+-----------------------------------+---------------
           0 | GRE tunnel protected with         | this document
             | IPsec transport mode              |
       1-255 | reserved for future channel types |

14. Acknowledgements

   This work originated from an Autonomic Networking project at Cisco
   Systems, which started in early 2010. Many people contributed to
   this project and the idea of the Autonomic Control Plane, among them
   (in alphabetical order): Ignas Bagdonas, Parag Bhide, Alex Clemm,
   Yves Hertoghs, Bruno Klauser, Max Pritikin, Ravi Kumar Vadapalli.

   Further input and suggestions were received from: Rene Struik, Brian
   Carpenter, Benoit Claise.

15. Change log [RFC Editor: Please remove]

15.1. Initial version

   First version of this document:
   [I-D.behringer-autonomic-control-plane]

15.2. draft-behringer-anima-autonomic-control-plane-00

   Initial version of the anima document; only minor edits.

15.3. draft-behringer-anima-autonomic-control-plane-01

   o  Clarified that the ACP should be based on, and support only IPv6.

   o  Clarified in intro that ACP is for both communication between
      devices and access from a central entity, such as an NMS.

   o  Added a section on how to connect an NMS system.

   o  Clarified the hop-by-hop crypto nature of the ACP.

   o  Added several references to GDNP as a candidate protocol.

   o  Added a discussion on network split and merge, although this
      should probably go into the certificate management story longer
      term.

15.4. draft-behringer-anima-autonomic-control-plane-02

   Addresses (numerous) comments from Brian Carpenter. See mailing list
   for details.
   The most important changes are:

   o  Introduced a new section "overview", to ease the understanding of
      the approach.

   o  Merged the previous "problem statement" and "use case" sections
      into a mostly re-written "use cases" section, since they were
      overlapping.

   o  Clarified the relationship with
      draft-eckert-anima-stable-connectivity

15.5. draft-behringer-anima-autonomic-control-plane-03

   o  Took out requirement for IPv6 --> that's in the reference doc.

   o  Added requirement section.

   o  Changed focus: more focus on autonomic functions, not only virtual
      out of band. This goes a bit throughout the document, starting
      with a changed abstract and intro.

15.6. draft-ietf-anima-autonomic-control-plane-00

   No changes; re-submitted as WG document.

15.7. draft-ietf-anima-autonomic-control-plane-01

   o  Added some paragraphs in addressing section on "why IPv6 only", to
      reflect the discussion on the list.

   o  Moved the data-plane ACP out of the main document, into an
      appendix. The focus is now the virtually separated ACP, since it
      has significant advantages, and isn't much harder to do.

   o  Changed the self-creation algorithm: Part of the initial steps go
      into the reference document. This document now assumes an
      adjacency table, and domain certificate. How those get onto the
      device is outside scope for this document.

   o  Created a new section 6 "workarounds for non-autonomic nodes", and
      put the previous controller section (5.9) into this new section.
      Now, section 5 is "autonomic only", and section 6 explains what to
      do with non-autonomic stuff. Much cleaner now.

   o  Added an appendix explaining the choice of RPL as a routing
      protocol.

   o  Formalised the creation process a bit more. Now, we create a
      "candidate peer list" from the adjacency table, and form the ACP
      with those candidates.
      The document now also explains better that policy (Intent) can
      influence the peer selection (sections 4 and 5).

   o  Introduced a section for the capability negotiation protocol
      (section 7). This needs to be worked out in more detail. This
      will likely be based on GRASP.

   o  Introduced a new parameter, the ACP tunnel type, and defined it
      in the IANA considerations section. Suggested GRE protected with
      IPsec transport mode as the default tunnel type.

   o  Updated links, lots of small edits.

16. References

   [I-D.behringer-anima-autonomic-addressing]
              Behringer, M., "An Autonomic IPv6 Addressing Scheme",
              draft-behringer-anima-autonomic-addressing-01 (work in
              progress), June 2015.

   [I-D.behringer-anima-reference-model]
              Behringer, M., Carpenter, B., Eckert, T., Ciavaglia, L.,
              Liu, B., Jeff, J., and J. Strassner, "A Reference Model
              for Autonomic Networking",
              draft-behringer-anima-reference-model-03 (work in
              progress), June 2015.

   [I-D.behringer-autonomic-control-plane]
              Behringer, M., Bjarnason, S., BL, B., and T. Eckert, "An
              Autonomic Control Plane",
              draft-behringer-autonomic-control-plane-00 (work in
              progress), June 2014.

   [I-D.eckert-anima-stable-connectivity]
              Eckert, T. and M. Behringer, "Using Autonomic Control
              Plane for Stable Connectivity of Network OAM",
              draft-eckert-anima-stable-connectivity-01 (work in
              progress), March 2015.

   [I-D.ietf-anima-bootstrapping-keyinfra]
              Pritikin, M., Richardson, M., Behringer, M., and S.
              Bjarnason, "Bootstrapping Key Infrastructures",
              draft-ietf-anima-bootstrapping-keyinfra-00 (work in
              progress), August 2015.

   [I-D.ietf-anima-grasp]
              Carpenter, B. and B. Liu, "A Generic Autonomic Signaling
              Protocol (GRASP)", draft-ietf-anima-grasp-00 (work in
              progress), August 2015.

   [RFC4122]  Leach, P., Mealling, M., and R.
              Salz, "A Universally Unique IDentifier (UUID) URN
              Namespace", RFC 4122, DOI 10.17487/RFC4122, July 2005.

   [RFC4193]  Hinden, R. and B. Haberman, "Unique Local IPv6 Unicast
              Addresses", RFC 4193, DOI 10.17487/RFC4193, October 2005.

   [RFC6550]  Winter, T., Ed., Thubert, P., Ed., Brandt, A., Hui, J.,
              Kelsey, R., Levis, P., Pister, K., Struik, R., Vasseur,
              JP., and R. Alexander, "RPL: IPv6 Routing Protocol for
              Low-Power and Lossy Networks", RFC 6550,
              DOI 10.17487/RFC6550, March 2012.

   [RFC7575]  Behringer, M., Pritikin, M., Bjarnason, S., Clemm, A.,
              Carpenter, B., Jiang, S., and L. Ciavaglia, "Autonomic
              Networking: Definitions and Design Goals", RFC 7575,
              DOI 10.17487/RFC7575, June 2015.

   [RFC7576]  Jiang, S., Carpenter, B., and M. Behringer, "General Gap
              Analysis for Autonomic Networking", RFC 7576,
              DOI 10.17487/RFC7576, June 2015.

Appendix A. Background on the choice of routing protocol

   In a pre-standard implementation, the "IPv6 Routing Protocol for
   Low-Power and Lossy Networks" (RPL) [RFC6550] was chosen. This
   Appendix explains the reasoning behind that decision.

   Requirements for routing in the ACP are:

   o  Self-management: The ACP must build automatically, without human
      intervention. Therefore, the routing protocol must also work
      completely automatically. RPL is a simple, self-managing
      protocol, which does not require zones or areas; it is also
      self-configuring, since configuration is carried as part of the
      protocol (see Section 6.7.6 of [RFC6550]).

   o  Scale: The ACP builds over an entire domain, which could be a
      large enterprise or service provider network. The routing
      protocol must therefore support domains of 100,000 nodes or more,
      ideally without the need for zoning or separation into areas. RPL
      has this scale property. This is based on extensive use of
      default routing.
      RPL also has other scalability improvements, such as selecting
      only a subset of peers instead of all possible ones, and trickle
      support for information synchronisation.

   o  Low resource consumption: The ACP supports traditional network
      infrastructure, thus runs in addition to traditional protocols.
      The ACP, and specifically the routing protocol, must have low
      resource consumption both in terms of memory and CPU requirements.
      Specifically, at edge nodes, where memory and CPU are scarce,
      consumption should be minimal. RPL builds a destination-oriented
      directed acyclic graph (DODAG), where the main resource
      consumption is at the root of the DODAG. The closer to the edge
      of the network, the less state needs to be maintained. This
      adapts nicely to the typical network design. Also, all changes
      below a common parent node are kept below that parent node.

   o  Support for unstructured address space: In the Autonomic
      Networking Infrastructure, node addresses are identifiers, and may
      not be assigned in a topological way. Also, nodes may move
      topologically, without changing their address. Therefore, the
      routing protocol must support completely unstructured address
      space. RPL is specifically made for mobile ad-hoc networks, with
      no assumptions on topologically aligned addressing.

   o  Modularity: To keep the initial implementation small, yet allow
      later for more complex methods, it is highly desirable that the
      routing protocol has a simple base functionality, but can import
      new functional modules if needed. RPL has this property with the
      concept of "objective function", which is a plugin to modify
      routing behaviour.

   o  Extensibility: Since the Autonomic Networking Infrastructure is a
      new concept, it is likely that changes in the way of operation
      will happen over time.
      RPL allows for new objective functions to be introduced later,
      which allow changes to the way the routing protocol creates the
      DAGs.

   o  Multi-topology support: It may become necessary in the future to
      support more than one DODAG for different purposes, using
      different objective functions. RPL allows for the creation of
      several parallel DODAGs, should this be required. This could be
      used to create different topologies to reach different roots.

   o  No need for path optimisation: RPL does not necessarily compute
      the optimal path between any two nodes. However, the ACP does not
      require this today, since it carries mainly non-delay-sensitive
      feedback loops. It is possible that different optimisation
      schemes become necessary in the future, but RPL can be expanded
      (see point "Extensibility" above).

Appendix B. Alternative: An ACP without Separation

   Section 5 explains how the ACP is constructed as a virtually
   separated overlay network. An alternative ACP design can be achieved
   without the VRFs. In this case, the autonomic virtual addresses are
   part of the data plane, and subject to routing, filtering, QoS, etc.
   on the data plane. The secure tunnels are in this case used by
   traffic to and from the autonomic address space. They are still
   required to provide the authentication function for all autonomic
   packets.

   At IETF 93 in Prague, the suggestion was made to not advance with the
   data plane ACP, and only continue with the virtually separate ACP.
   The reason for this decision is that the contextual separation of the
   ACP provides a range of benefits (more robustness, fewer potential
   interactions with user configurations), while it is not much harder
   to achieve.

   This appendix serves to explain the decision; it will be removed in
   the next version of the draft.

Authors' Addresses

   Michael H. Behringer (editor)
   Cisco Systems
   Building D, 45 Allee des Ormes
   Mougins 06250
   France

   Email: mbehring@cisco.com

   Steinthor Bjarnason
   Cisco Systems

   Email: sbjarnas@cisco.com

   Balaji BL
   Cisco Systems

   Email: blbalaji@cisco.com

   Toerless Eckert
   Cisco Systems

   Email: eckert@cisco.com