idnits 2.17.1 draft-ietf-anima-autonomic-control-plane-07.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** The document seems to lack separate sections for Informative/Normative References. All references will be assumed normative when checking for downward references. ** The document seems to lack a both a reference to RFC 2119 and the recommended RFC 2119 boilerplate, even if it appears to use RFC 2119 keywords. RFC 2119 keyword, line 258: '... ACP1: The ACP SHOULD provide robust...' RFC 2119 keyword, line 263: '... ACP2: The ACP MUST have a separate ...' RFC 2119 keyword, line 267: '... ACP3: The ACP MUST use autonomicall...' RFC 2119 keyword, line 272: '... ACP4: The ACP MUST be generic. Usa...' RFC 2119 keyword, line 273: '...rastructure. It MUST NOT be tied to a...' (27 more instances...) Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year == Line 434 has weird spacing: '... called anim...' == Using lowercase 'not' together with uppercase 'MUST', 'SHALL', 'SHOULD', or 'RECOMMENDED' is not an accepted usage according to RFC 2119. Please use uppercase 'NOT' together with RFC 2119 keywords (if that is what you mean). Found 'MUST not' in this paragraph: The Certificate Authority in an ANIMA network MUST not change this, and create the respective subjectAltName / rfc822Name in the certificate. 
-- The document date (July 3, 2017) is 2482 days in the past. Is this intentional? Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) == Unused Reference: 'I-D.richardson-anima-6join-discovery' is defined on line 1746, but no explicit reference was found in the text == Unused Reference: 'RFC4122' is defined on line 1751, but no explicit reference was found in the text == Unused Reference: 'RFC5082' is defined on line 1765, but no explicit reference was found in the text == Unused Reference: 'RFC6762' is defined on line 1797, but no explicit reference was found in the text == Unused Reference: 'RFC6763' is defined on line 1801, but no explicit reference was found in the text == Outdated reference: A later version (-03) exists of draft-carpenter-anima-ani-objectives-02 ** Downref: Normative reference to an Informational draft: draft-carpenter-anima-ani-objectives (ref. 'I-D.carpenter-anima-ani-objectives') == Outdated reference: A later version (-45) exists of draft-ietf-anima-bootstrapping-keyinfra-06 == Outdated reference: A later version (-15) exists of draft-ietf-anima-grasp-14 == Outdated reference: A later version (-10) exists of draft-ietf-anima-reference-model-04 ** Downref: Normative reference to an Informational draft: draft-ietf-anima-reference-model (ref. 'I-D.ietf-anima-reference-model') == Outdated reference: A later version (-10) exists of draft-ietf-anima-stable-connectivity-02 ** Downref: Normative reference to an Informational draft: draft-ietf-anima-stable-connectivity (ref. 'I-D.ietf-anima-stable-connectivity') ** Downref: Normative reference to an Informational draft: draft-richardson-anima-6join-discovery (ref. 
'I-D.richardson-anima-6join-discovery') ** Obsolete normative reference: RFC 4941 (Obsoleted by RFC 8981) ** Obsolete normative reference: RFC 6347 (Obsoleted by RFC 9147) ** Downref: Normative reference to an Informational RFC: RFC 7404 ** Downref: Normative reference to an Informational RFC: RFC 7575 ** Downref: Normative reference to an Informational RFC: RFC 7576 Summary: 11 errors (**), 0 flaws (~~), 13 warnings (==), 1 comment (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 ANIMA WG M. Behringer, Ed. 3 Internet-Draft 4 Intended status: Standards Track T. Eckert, Ed. 5 Expires: January 4, 2018 Huawei 6 S. Bjarnason 7 Arbor Networks 8 July 3, 2017 10 An Autonomic Control Plane 11 draft-ietf-anima-autonomic-control-plane-07 13 Abstract 15 Autonomic functions need a control plane to communicate, which 16 depends on some addressing and routing. This Autonomic Control Plane 17 should ideally be self-managing, and as independent as possible of 18 configuration. This document defines an "Autonomic Control Plane", 19 with the primary use as a control plane for autonomic functions. It 20 also serves as a "virtual out of band channel" for OAM communications 21 over a network that is not configured, or mis-configured. 23 Status of This Memo 25 This Internet-Draft is submitted in full conformance with the 26 provisions of BCP 78 and BCP 79. 28 Internet-Drafts are working documents of the Internet Engineering 29 Task Force (IETF). Note that other groups may also distribute 30 working documents as Internet-Drafts. The list of current Internet- 31 Drafts is at http://datatracker.ietf.org/drafts/current/. 33 Internet-Drafts are draft documents valid for a maximum of six months 34 and may be updated, replaced, or obsoleted by other documents at any 35 time. 
It is inappropriate to use Internet-Drafts as reference 36 material or to cite them other than as "work in progress." 38 This Internet-Draft will expire on January 4, 2018. 40 Copyright Notice 42 Copyright (c) 2017 IETF Trust and the persons identified as the 43 document authors. All rights reserved. 45 This document is subject to BCP 78 and the IETF Trust's Legal 46 Provisions Relating to IETF Documents 47 (http://trustee.ietf.org/license-info) in effect on the date of 48 publication of this document. Please review these documents 49 carefully, as they describe your rights and restrictions with respect 50 to this document. Code Components extracted from this document must 51 include Simplified BSD License text as described in Section 4.e of 52 the Trust Legal Provisions and are provided without warranty as 53 described in the Simplified BSD License. 55 Table of Contents 57 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 58 2. Use Cases for an Autonomic Control Plane . . . . . . . . . . 4 59 2.1. An Infrastructure for Autonomic Functions . . . . . . . . 5 60 2.2. Secure Bootstrap over an Unconfigured Network . . . . . . 5 61 2.3. Data Plane Independent Permanent Reachability . . . . . . 5 62 3. Requirements . . . . . . . . . . . . . . . . . . . . . . . . 6 63 4. Overview . . . . . . . . . . . . . . . . . . . . . . . . . . 7 64 5. Self-Creation of an Autonomic Control Plane . . . . . . . . . 8 65 5.1. Preconditions . . . . . . . . . . . . . . . . . . . . . . 8 66 5.1.1. Domain Certificate with ACP information . . . . . . . 8 67 5.1.2. AN Adjacency Table . . . . . . . . . . . . . . . . . 10 68 5.2. Neighbor discovery . . . . . . . . . . . . . . . . . . . 11 69 5.2.1. L2 topology considerations . . . . . . . . . . . . . 11 70 5.2.2. CDP/LLDP/mDNS considerations . . . . . . . . . . . . 12 71 5.2.3. Discovery with GRASP . . . . . . . . . . . . . . . . 12 72 5.3. Candidate ACP Neighbor Selection . . . . . . . . . . . . 15 73 5.4. Channel Selection . . . 
. . . . . . . . . . . . . . . . . 15 74 5.5. Candidate ACP Neighbor certificate verification . . . . . 17 75 5.6. Security Association protocols . . . . . . . . . . . . . 17 76 5.6.1. ACP via IKEv2 . . . . . . . . . . . . . . . . . . . . 17 77 5.6.2. ACP via dTLS . . . . . . . . . . . . . . . . . . . . 18 78 5.6.3. ACP Security Profiles . . . . . . . . . . . . . . . . 19 79 5.7. GRASP instance details . . . . . . . . . . . . . . . . . 19 80 5.8. Context Separation . . . . . . . . . . . . . . . . . . . 19 81 5.9. Addressing inside the ACP . . . . . . . . . . . . . . . . 20 82 5.9.1. Fundamental Concepts of Autonomic Addressing . . . . 20 83 5.9.2. The ACP Addressing Base Scheme . . . . . . . . . . . 21 84 5.9.3. ACP Addressing Sub-Scheme . . . . . . . . . . . . . . 22 85 5.9.4. Usage of the Zone Field . . . . . . . . . . . . . . . 23 86 5.9.5. Other ACP Addressing Sub-Schemes . . . . . . . . . . 24 87 5.10. Routing in the ACP . . . . . . . . . . . . . . . . . . . 24 88 5.10.1. RPL Profile for the ACP . . . . . . . . . . . . . . 25 89 5.11. General ACP Considerations . . . . . . . . . . . . . . . 25 90 6. Workarounds for Non-Autonomic Nodes . . . . . . . . . . . . . 26 91 6.1. Non-Autonomic Controller / NMS system (ACP connect) . . . 26 92 6.2. ACP through Non-Autonomic L3 Clouds . . . . . . . . . . . 28 93 7. Self-Healing Properties . . . . . . . . . . . . . . . . . . . 28 94 8. Self-Protection Properties . . . . . . . . . . . . . . . . . 29 95 9. The Administrator View . . . . . . . . . . . . . . . . . . . 30 96 10. Security Considerations . . . . . . . . . . . . . . . . . . . 30 97 11. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 31 98 12. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 31 99 13. Change log [RFC Editor: Please remove] . . . . . . . . . . . 31 100 13.1. Initial version . . . . . . . . . . . . . . . . . . . . 31 101 13.2. draft-behringer-anima-autonomic-control-plane-00 . . . . 31 102 13.3. 
draft-behringer-anima-autonomic-control-plane-01 . . . . 32 103 13.4. draft-behringer-anima-autonomic-control-plane-02 . . . . 32 104 13.5. draft-behringer-anima-autonomic-control-plane-03 . . . . 32 105 13.6. draft-ietf-anima-autonomic-control-plane-00 . . . . . . 32 106 13.7. draft-ietf-anima-autonomic-control-plane-01 . . . . . . 33 107 13.8. draft-ietf-anima-autonomic-control-plane-02 . . . . . . 33 108 13.9. draft-ietf-anima-autonomic-control-plane-03 . . . . . . 34 109 13.10. draft-ietf-anima-autonomic-control-plane-04 . . . . . . 34 110 13.11. draft-ietf-anima-autonomic-control-plane-05 . . . . . . 34 111 13.12. draft-ietf-anima-autonomic-control-plane-06 . . . . . . 35 112 13.13. draft-ietf-anima-autonomic-control-plane-07 . . . . . . 35 113 14. References . . . . . . . . . . . . . . . . . . . . . . . . . 37 114 Appendix A. Background on the choice of routing protocol . . . . 39 115 Appendix B. Extending ACP channel negotiation (via GRASP) . . . 41 116 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 43 118 1. Introduction 120 Autonomic Networking is a concept of self-management: Autonomic 121 functions self-configure, and negotiate parameters and settings 122 across the network. [RFC7575] defines the fundamental ideas and 123 design goals of Autonomic Networking. A gap analysis of Autonomic 124 Networking is given in [RFC7576]. The reference architecture for 125 Autonomic Networking in the IETF is currently being defined in the 126 document [I-D.ietf-anima-reference-model] 128 Autonomic functions need a stable and robust infrastructure to 129 communicate on. This infrastructure should be as robust as possible, 130 and it should be re-usable by all autonomic functions. [RFC7575] 131 calls it the "Autonomic Control Plane". This document defines the 132 Autonomic Control Plane. 134 Today, the management and control plane of networks typically runs in 135 the global routing table, which is dependent on correct configuration 136 and routing. 
Misconfigurations or routing problems can therefore 137 disrupt management and control channels. Traditionally, an out of 138 band network has been used to recover from such problems, or 139 personnel is sent on site to access devices through console ports 140 (craft ports). However, both options are operationally expensive. 142 In increasingly automated networks either controllers or distributed 143 autonomic service agents in the network require a control plane which 144 is independent of the network they manage, to avoid impacting their 145 own operations. 147 This document describes options for a self-forming, self-managing and 148 self-protecting "Autonomic Control Plane" (ACP) which is inband on 149 the network, yet as independent as possible of configuration, 150 addressing and routing problems (for details on how this is achieved, see 151 Section 5). It therefore remains operational even in the presence of 152 configuration errors, addressing or routing issues, or where policy 153 could inadvertently affect control plane connectivity. The Autonomic 154 Control Plane serves several purposes at the same time: 156 o Autonomic functions communicate over the ACP. The ACP therefore 157 directly supports Autonomic Networking functions, as described in 158 [I-D.ietf-anima-reference-model]. For example, GRASP 159 [I-D.ietf-anima-grasp] can run securely inside the ACP and depends 160 on the ACP as its "security substrate". 162 o An operator can use it to log into remote devices, even if the 163 data plane is misconfigured or unconfigured. 165 o A controller or network management system can use it to securely 166 bootstrap network devices in remote locations, even if the network 167 in between is not yet configured; no data-plane dependent 168 bootstrap configuration is required. 
An example of such a secure 169 bootstrap process is described in 170 [I-D.ietf-anima-bootstrapping-keyinfra]. 172 This document describes some use cases for the ACP in Section 2 and 173 defines the requirements in Section 3. Section 4 gives an overview of 174 how an Autonomic Control Plane is constructed, and Section 5 explains the 175 detailed process. Section 6 explains how non-autonomic 176 nodes and networks can be integrated, and Section 5.6 describes the first 177 channel types for the ACP. 179 The document "Autonomic Network Stable Connectivity" 180 [I-D.ietf-anima-stable-connectivity] describes how the ACP can be 181 used to provide stable connectivity for OAM applications. It also 182 explains how existing management solutions can leverage the ACP in 183 parallel with traditional management models, when to use the ACP 184 versus the data plane, how to integrate IPv4 based management, etc. 186 2. Use Cases for an Autonomic Control Plane 187 2.1. An Infrastructure for Autonomic Functions 189 Autonomic Functions need a stable infrastructure to run on, and all 190 autonomic functions should use the same infrastructure to minimise 191 the complexity of the network. This way, there is only need for a 192 single discovery mechanism, a single security mechanism, and other 193 processes that distributed functions require. 195 2.2. Secure Bootstrap over an Unconfigured Network 197 Today, bootstrapping a new device typically requires all devices 198 between a controlling node (such as an SDN controller) and the new 199 device to be completely and correctly addressed, configured and 200 secured. Therefore, bootstrapping a network happens in layers around 201 the controller. Without console access (for example through an out 202 of band network) it is not possible today to make devices securely 203 reachable before having configured the entire network in between. 
205 With the ACP, secure bootstrap of new devices can happen without 206 requiring any configuration on the network. A new device can 207 automatically be bootstrapped in a secure fashion and be deployed 208 with a domain certificate. This does not require any configuration 209 on intermediate nodes, because they can communicate through the ACP. 211 2.3. Data Plane Independent Permanent Reachability 213 Today, most critical control plane protocols and network management 214 protocols are running in the data plane (global routing table) of the 215 network. This leads to undesirable dependencies between control and 216 management plane on one side and the data plane on the other: Only if 217 the data plane is operational, will the other planes work as 218 expected. 220 Data plane connectivity can be affected by errors and faults, for 221 example certain AAA misconfigurations can lock an administrator out 222 of a device; routing or addressing issues can make a device 223 unreachable; shutting down interfaces over which a current management 224 session is running can lock an admin irreversibly out of the device. 225 Traditionally only console access can help recover from such issues. 227 Data plane dependencies also affect NOC/SDN controller applications: 228 Certain network changes are today hard to operate, because the change 229 itself may affect reachability of the devices. Examples are address 230 or mask changes, routing changes, or security policies. Today such 231 changes require precise hop-by-hop planning. 233 The ACP provides reachability that is largely independent of the data 234 plane, which allows control plane and management plane to operate 235 more robustly: 237 o For management plane protocols, the ACP provides the functionality 238 of a "Virtual-out-of-band (VooB) channel", by providing 239 connectivity to all devices regardless of their configuration or 240 global routing table. 
242 o For control plane protocols, the ACP allows their operation even 243 when the data plane is temporarily faulty, or during transitional 244 events, such as routing changes, which may affect the control 245 plane at least temporarily. This is specifically important for 246 autonomic service agents, which could affect data plane 247 connectivity. 249 The document "Autonomic Network Stable Connectivity" 250 [I-D.ietf-anima-stable-connectivity] explains the use cases for the 251 ACP in significantly more detail and explains how the ACP can be used 252 in practical network operations. 254 3. Requirements 256 The Autonomic Control Plane has the following requirements: 258 ACP1: The ACP SHOULD provide robust connectivity: As far as 259 possible, it should be independent of configured addressing, 260 configuration and routing. Requirements 2 and 3 build on this 261 requirement, but also have value on their own. 263 ACP2: The ACP MUST have a separate address space from the data 264 plane. Reason: traceability, debug-ability, separation from 265 data plane, security (can block easily at edge). 267 ACP3: The ACP MUST use autonomically managed address space. Reason: 268 easy bootstrap and setup ("autonomic"); robustness (admin 269 can't mess things up so easily). This document suggests to 270 use ULA addressing for this purpose. 272 ACP4: The ACP MUST be generic. Usable by all the functions and 273 protocols of the AN infrastructure. It MUST NOT be tied to a 274 particular protocol. 276 ACP5: The ACP MUST provide security: Messages coming through the ACP 277 MUST be authenticated to be from a trusted node, and SHOULD 278 (very strong SHOULD) be encrypted. 280 The default mode of operation of the ACP is hop-by-hop, because this 281 interaction can be built on IPv6 link local addressing, which is 282 autonomic, and has no dependency on configuration (requirement 1). 
283 It may be necessary to have ACP connectivity over non-autonomic 284 nodes, for example to link autonomic nodes over the general Internet. 285 This is possible, but then has a dependency on routing over the non- 286 autonomic hops. 288 4. Overview 290 The Autonomic Control Plane is constructed in the following way (for 291 details, see Section 5): 293 1. An autonomic node creates a virtual routing and forwarding (VRF) 294 instance, or a similar virtual context. 296 2. It determines, following a policy, a candidate peer list. This 297 is the list of nodes to which it should establish an Autonomic 298 Control Plane. Default policy is: To all adjacent nodes in the 299 same domain. 301 3. For each node in the candidate peer list, it authenticates that 302 node and negotiates a mutually acceptable channel type. 304 4. It then establishes a secure tunnel of the negotiated channel 305 type. These tunnels are placed into the previously set up VRF. 306 This creates an overlay network with hop-by-hop tunnels. 308 5. Inside the ACP VRF, each node sets up a virtual interface with 309 its ULA IPv6 address. 311 6. Each node runs a lightweight routing protocol, to announce 312 reachability of the virtual addresses inside the ACP. 314 Note: 316 o Non-autonomic NMS systems or controllers have to be manually 317 connected into the ACP. 319 o Connecting over non-autonomic Layer-3 clouds initially requires a 320 tunnel between autonomic nodes. 322 o None of the above operations (except manual ones) is reflected in 323 the configuration of the device. 325 The following figure illustrates the ACP. 327 autonomic node 1 autonomic node 2 328 ................... ................... 329 secure . . secure . . secure 330 tunnel : +-----------+ : tunnel : +-----------+ : tunnel 331 ..--------| ACP VRF |---------------------| ACP VRF |---------.. 332 : / \ / \ <--routing--> / \ / \ : 333 : \ / \ / \ / \ / : 334 ..--------| virtual |---------------------| virtual |---------.. 
335 : | interface | : : | interface | : 336 : +-----------+ : : +-----------+ : 337 : : : : 338 : data plane :...............: data plane : 339 : : link : : 340 :.................: :.................: 342 Figure 1 344 The resulting overlay network is normally based exclusively on hop- 345 by-hop tunnels. This is because addressing used on links is IPv6 346 link local addressing, which does not require any prior set-up. This 347 way the ACP can be built even if there is no configuration on the 348 devices, or if the data plane has issues such as addressing or 349 routing problems. 351 5. Self-Creation of an Autonomic Control Plane 353 This section describes the steps to set up an Autonomic Control 354 Plane, and highlights the key properties which make it 355 "indestructible" against many inadvertent changes to the data plane, for 356 example caused by misconfigurations. 358 5.1. Preconditions 360 An autonomic node can be a router, switch, controller, NMS host, or 361 any other IP device. We assume an autonomic node has a globally 362 unique domain certificate (LDevID), as well as an adjacency table. 364 5.1.1. Domain Certificate with ACP information 366 To establish an ACP securely, an Autonomic Node MUST have a globally 367 unique domain certificate (LDevID), with which it can 368 cryptographically assert its membership in the domain. The document 369 [I-D.ietf-anima-bootstrapping-keyinfra] (BRSKI) describes how a 370 domain certificate can automatically and securely be derived from a 371 vendor specific Unique Device Identifier (UDI) or IDevID certificate. 373 The domain certificate (LDevID) of an autonomic node MUST contain 374 ANIMA specific information, specifically the domain name and the address 375 of the device in the ACP with the Zone-ID set to zero ("ACP 376 address"). 
This information MUST be encoded in the LDevID in the 377 subjectAltName / rfc822Name field in the following way: 379 anima.acp+<ACP address>{+<extensions>}@<domain> 381 Example: 383 anima.acp+FDA3:79A6:F6EE:0:200:0:6400:1@example.com 385 The ACP address MUST be specified in its canonical form, as specified 386 in [RFC5952], to allow for easy textual comparisons. 388 The optional <extensions> field is used for future extensions to this 389 specification. It MUST be ignored unless otherwise specified. 391 The subjectAltName / rfc822Name encoding of the ACP domain name and 392 ACP address is used for the following reasons: 394 o The LDevID assigned by BRSKI should be reusable for other 395 purposes besides authentication for the ACP. 397 o There is a wide range of pre-existing protocols/services where 398 authentication with LDevID is desirable. Enrolling and 399 maintaining separate LDevIDs for each of these protocols/services 400 is often undesirable overhead. Therefore it is beneficial if the 401 BRSKI enrolled LDevID can also be used for other protocols/ 402 services besides the ACP. 404 o The elements in the LDevID required for the ACP should therefore 405 not cause incompatibilities with any pre-existing ASN.1 code 406 potentially in use in those other pre-existing SW systems. 408 o subjectAltName / rfc822Name is a pre-existing element that must be 409 supported by all existing ASN.1 parsers for LDevID. 411 o The elements in the LDevID required for the ACP should also not be 412 misinterpreted by any pre-existing protocol/service that might use 413 the LDevID. If the elements used for the ACP are interpreted by 414 other protocols/services, then the impact should be benign. 
416 o Using an IP address format encoding could result in non-benign 417 misinterpretation of the ACP information, for example other 418 protocol/services unaware of the ACP could try to do something 419 with the ACP address that would fail to work correctly (because it 420 is in a different VRF than what they expect), or that could cause 421 security issues. 423 o At minimum, both the AN domain name and the non-domain name 424 derived part of the ACP address need to be encoded in one or more 425 appropriate fields of the certificate, so there are not many 426 alternatives with pre-existing fields where the only possible 427 conflicts would likely be beneficial. 429 o rfc822Name encoding is quite flexible. We choose to encode the 430 full ACP address AND the domain name, so that it is easier to 431 examine/use the encoded "ACP information". 433 o The format of the rfc822Name is chosen so that an operator can 434 set up a mailbox called anima.acp@<domain> that would receive 435 emails sent towards the rfc822Name of any AN device inside a 436 domain. This is possible because components behind a plus symbol 437 are considered part of a single mailbox. In other words, it is 438 not necessary to set up a separate mailbox for every autonomic 439 device's ACP information, but only one for the whole domain. 441 o As a result, if any unexpected use of the ACP addressing information 442 in a certificate happens, it is benign and detectable: it would be 443 mail to that mailbox. 445 In the BRSKI bootstrap process in an ANIMA network, the registrar 446 (acting as an EST server) MUST include the subjectAltName / 447 rfc822Name encoded ACP address and domain name to the enrolling 448 device (called pledge) via its response to the pledge's EST CSR 449 Attribute request that is mandatory in BRSKI. 451 The Certificate Authority in an ANIMA network MUST NOT change this, 452 and MUST create the respective subjectAltName / rfc822Name in the 453 certificate. 
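The rfc822Name encoding described above can be decoded mechanically. The following is a minimal Python sketch and not part of this specification. It assumes the field layout is the literal prefix "anima.acp", a plus sign, the ACP address, an optional plus-separated extension part, and the domain after the "@" sign. The names AcpInfo and parse_acp_rfc822name are hypothetical. The sketch compares addresses by value rather than by text, and it keeps the extension part without interpreting it, since it is to be ignored unless otherwise specified.

```python
import ipaddress
from typing import NamedTuple, Optional


class AcpInfo(NamedTuple):
    """ACP information carried in the rfc822Name (hypothetical container)."""
    address: ipaddress.IPv6Address
    extensions: Optional[str]
    domain: str


def parse_acp_rfc822name(name: str) -> AcpInfo:
    """Split an ACP rfc822Name into ACP address, extensions and domain."""
    local, sep, domain = name.rpartition("@")
    if not sep or not local or not domain:
        raise ValueError("not an rfc822Name: %r" % name)
    # The local part is "anima.acp", then the address, then optional extensions.
    parts = local.split("+", 2)
    if len(parts) < 2 or parts[0] != "anima.acp":
        raise ValueError("no ACP information in: %r" % name)
    # IPv6Address raises AddressValueError (a ValueError) on bad input.
    address = ipaddress.IPv6Address(parts[1])
    extensions = parts[2] if len(parts) == 3 else None
    return AcpInfo(address, extensions, domain)


info = parse_acp_rfc822name(
    "anima.acp+FDA3:79A6:F6EE:0:200:0:6400:1@example.com")
print(info.address, info.domain)
```

Parsing the address into an IPv6Address object sidesteps differences in letter case when comparing two certificates. An implementation that relies on textual comparison would instead check that the encoded string is already in the canonical form of [RFC5952].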
455 ANIMA nodes can therefore find the ACP address and domain using this 456 field in the domain certificate, both for themselves and for 457 other nodes. 459 See section 4.2.1.6 of [RFC5280] for details on the subjectAltName 460 field. 462 5.1.2. AN Adjacency Table 464 To know to which nodes to establish an ACP channel, every autonomic 465 node maintains an adjacency table. The adjacency table contains 466 information about adjacent autonomic nodes, at a minimum: node-ID, IP 467 address, domain, certificate. An autonomic device MUST keep this 468 adjacency table up to date. This table is used to determine to which 469 neighbor an ACP connection is established. 471 Where the next autonomic device is not directly adjacent, the 472 information in the adjacency table can be supplemented by 473 configuration. For example, the node-ID and IP address could be 474 configured. 476 The adjacency table MAY contain information about the validity and 477 trust of the adjacent autonomic node's certificate. However, 478 subsequent steps MUST always start with authenticating the peer. 480 The adjacency table contains information about adjacent autonomic 481 nodes in general, independently of their domain and trust status. 482 The next step determines to which of those autonomic nodes an ACP 483 connection should be established. 485 5.2. Neighbor discovery 487 5.2.1. L2 topology considerations 489 ANrtr1 ------ ANswitch1 --- ANswitch2 ------- ANrtr2 490 .../ \ \ ... 491 ANrtrM ------ \ ------- ANrtrN 492 ANswitchM ... 494 Figure 2 496 Consider a large L2 LAN with ANrtr1...ANrtrN connected via some 497 topology of L2 switches (e.g., in a large enterprise campus or IoT 498 environment using large L2 LANs). If the discovery protocol used for 499 the ACP is operating at the subnet level, every AN router will see 500 all other AN routers on the LAN as neighbors and a full mesh of ACP 501 channels will be built. 
If some or all of the AN switches are 502 autonomic with the same discovery protocol, then the full mesh would 503 include those switches as well. 505 A full mesh of ACP connections like this can create fundamental 506 challenges. The number of security associations of the secure 507 channel protocols will not scale arbitrarily, especially when they 508 leverage platform accelerated encryption/decryption. Likewise, any 509 other ACP operations need to scale to the number of direct ACP 510 neighbors. An AN router with just 4 interfaces might be deployed into 511 a LAN with hundreds of neighbors connected via switches. Introducing 512 such a new unpredictable scaling factor requirement makes it harder 513 to support the ACP on arbitrary platforms and in arbitrary 514 deployments. 516 Predictable scaling requirements for ACP neighbors can most easily be 517 achieved if, in topologies like these, AN capable L2 switches can 518 ensure that discovery messages terminate on them so that neighboring 519 AN routers and switches will only find the physically connected AN L2 520 switches as their candidate ACP neighbors. With such a discovery 521 mechanism in place, the ACP and its security associations will only 522 need to scale to the number of physical interfaces instead of a 523 potentially much larger number of "LAN-connected" neighbors. And the 524 ACP topology will directly follow the physical topology, something 525 which can then also be leveraged in management operations or by ASAs. 527 In the example above, assume that ANswitch1 and ANswitchM are AN 528 capable, and ANswitch2 is not AN capable. The desired ACP topology 529 is therefore that ANrtr1 and ANrtrM only have an ACP connection to 530 ANswitch1, and that ANswitch1, ANrtr2, and ANrtrN have a full mesh of ACP 531 connections amongst each other. ANswitch1 also has an ACP connection 532 with ANswitchM and ANswitchM has ACP connections to anything else 533 behind it. 535 5.2.2. 
CDP/LLDP/mDNS considerations 537 LLDP (and Cisco's CDP) are examples of L2 discovery protocols that 538 terminate their messages on L2 ports. If those protocols were 539 chosen for ACP neighbor discovery, ACP neighbor discovery would 540 therefore also terminate on L2 ports. This would prevent ACP 541 construction over non-ANIMA switches. 543 mDNS operates at the subnet level, and is also used on L2 switches. 544 The authors of this document are not aware of mDNS implementations 545 that terminate their messages on L2 ports instead of the subnet 546 level. If mDNS was used as the ACP discovery mechanism on an ACP 547 capable L2 switch, then this would be necessary to implement. It is 548 likely that termination of mDNS messages could only be applied to all 549 mDNS messages from a port, which would then make it necessary to 550 software forward any non-ACP related mDNS messages to maintain prior 551 non-ACP mDNS functionality. With low performance of software 552 forwarding in many L2 switches, this could easily make the ACP 553 unsupportable on such L2 switches. 555 5.2.3. Discovery with GRASP 557 Because of the above considerations, the ACP uses DULL (Discovery 558 Unsolicited Link-Local) insecure instances of GRASP for discovery of 559 ACP neighbors (see Section 3.5.2.2 of [I-D.ietf-anima-grasp]). These 560 can easily be set up to match the aforementioned requirements without 561 affecting other uses of GRASP. Note that each such DULL instance of 562 GRASP is also used for the discovery of a bootstrap proxy when the 563 device is not yet enrolled into the autonomic domain. Because the 564 discovery of ACP neighbors only happens after the device is enrolled 565 into the autonomic domain, it never needs to discover a bootstrap 566 proxy and ACP neighbor at the same time. 568 An autonomic node announces itself to potential ACP peers by use of 569 the "AN_ACP" objective. 
This is a synchronization objective intended 570 to be flooded on a single link using the GRASP Flood Synchronization 571 (M_FLOOD) message. In accordance with the design of the Flood 572 message, a locator consisting of a specific link-local IP address, IP 573 protocol number and port number will be distributed with the flooded 574 objective. An informal example of the message is: 576 [M_FLOOD, 12340815, h'fe80000000000000c0011001FEEF0000', 1, 577 ["AN_ACP", SYNCH-FLAG, 1, "IKEv2"], 578 [O_IPv6_LOCATOR, 579 h'fe80000000000000c0011001FEEF0000', UDP, 15000] 580 ] 582 The formal CDDL definition is: 584 flood-message = [M_FLOOD, session-id, initiator, ttl, 585 +[objective, (locator-option / [])]] 587 objective = ["AN_ACP", objective-flags, loop-count, 588 objective-value] 590 objective-flags = ; as in the GRASP specification 591 loop-count = 1 ; limit to link-local operation 592 objective-value = text ; name of the (list of) secure 593 ; channel negotiation protocol(s) 595 The objective-flags field is set to indicate synchronization. 597 The ttl and loop-count are fixed at 1 since this is a link-local 598 operation. 600 The session-id is a random number used for loop prevention 601 (distinguishing a message from a prior instance of the same message). 602 In DULL this field is irrelevant but must still be set according to 603 the GRASP specification. 605 The initiator MUST be the IPv6 link local address of the originating 606 autonomic node on the sending interface. 608 The 'objective-value' parameter is (normally) a string indicating the 609 secure channel protocol available at the specified or implied 610 locator. 612 The locator is optional and only required when the secure channel 613 protocol is not offered at a well-defined port number, or if there is 614 no well-defined port number.
For example, "IKEv2" has a well-defined 615 port number 500, but in the above example, the candidate ACP neighbor 616 is offering ACP secure channel negotiation via IKEv2 on port 15000 617 (for the sake of creating the example). 619 If a locator is included, it MUST be an O_IPv6_LOCATOR, and the IPv6 620 address MUST be the same as the initiator address (these are DULL 621 requirements to minimize third party DoS attacks). 623 The secure channel methods defined in this document use the objective 624 values of "IKEv2" and "dTLS". There is no distinction between IKEv2 625 native and GRE-IKEv2 because this is purely negotiated via IKEv2. 627 A node that supports more than one secure channel protocol needs to 628 flood multiple versions of the "AN_ACP" objective, each accompanied 629 by its own locator. This can be in a single GRASP M_FLOOD packet. 631 If multiple secure channel protocols are supported that all are run 632 on well-defined ports, then they can be announced via a single AN_ACP 633 objective using a list of string names as the objective value without 634 a following locator-option. 636 Note that a node serving both as an ACP node and BRSKI Join Proxy may 637 choose to distribute the "AN_ACP" objective and "AN_join_proxy" 638 objective in the same flood message, since GRASP allows multiple 639 objectives in one Flood message. This may be impractical though if 640 ACP and BRSKI operations are implemented via separate software 641 modules / ASAs. 643 As explained above, in an ACP enabled L2 switch, each of these GRASP 644 instances would actually need to be per-L2-port. The result of the 645 discovery is the IPv6 link-local address of the neighbor as well as 646 its supported secure channel protocols (and any non-standard ports they 647 are running on). It is stored in the AN Adjacency Table (see 648 Section 5.1.2), which then drives the further building of the ACP to 649 that neighbor.
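The receiver-side handling of a flooded AN_ACP objective described above can be sketched as follows. This is only an illustration under stated assumptions, not a normative procedure: the message is assumed to be already CBOR-decoded into Python lists that follow the CDDL (pairs of objective and locator-option), the names M_FLOOD and O_IPV6_LOCATOR are symbolic placeholders for the code points defined in the GRASP specification, and the function name parse_an_acp_flood and the table WELL_KNOWN_PORTS are hypothetical. The only well-defined port taken from this document is 500 for "IKEv2".

```python
import ipaddress

# Symbolic placeholders (assumption); real code points come from the GRASP spec.
M_FLOOD = "M_FLOOD"
O_IPV6_LOCATOR = "O_IPv6_LOCATOR"

# Ports usable when no locator is sent; only IKEv2/500 is named in the text.
WELL_KNOWN_PORTS = {"IKEv2": 500}


def parse_an_acp_flood(message):
    """Return (peer link-local address, [(protocol, port), ...])."""
    msg_type, _session_id, initiator, ttl, *pairs = message
    if msg_type != M_FLOOD or ttl != 1:
        raise ValueError("not a link-local M_FLOOD message")
    peer = ipaddress.IPv6Address(initiator)
    if not peer.is_link_local:
        raise ValueError("initiator must be an IPv6 link-local address")

    offers = []
    for objective, locator in pairs:
        name, _flags, loop_count, value = objective
        if name != "AN_ACP":
            continue  # other objectives may share the same flood message
        if loop_count != 1:
            raise ValueError("loop-count must be 1")
        # The objective value is one protocol name or a list of names.
        protocols = [value] if isinstance(value, str) else list(value)
        if locator:
            option, address, _transport, port = locator
            # DULL rule: locator must be IPv6 and equal to the initiator.
            if option != O_IPV6_LOCATOR or ipaddress.IPv6Address(address) != peer:
                raise ValueError("locator must match the initiator address")
            offers.extend((proto, port) for proto in protocols)
        else:
            for proto in protocols:
                if proto not in WELL_KNOWN_PORTS:
                    raise ValueError("no well-defined port for " + proto)
                offers.append((proto, WELL_KNOWN_PORTS[proto]))
    return peer, offers


addr = bytes.fromhex("fe80000000000000c0011001feef0000")
example = [M_FLOOD, 12340815, addr, 1,
           [["AN_ACP", "SYNCH-FLAG", 1, "IKEv2"],
            [O_IPV6_LOCATOR, addr, "UDP", 15000]]]
peer, offers = parse_an_acp_flood(example)
```

The returned pair corresponds to what would be recorded for the neighbor in the AN Adjacency Table; on an ACP enabled L2 switch one such result would exist per L2 port.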
651 For example, ANswitch1 would run separate DULL GRASP instances on its 652 ports to ANrtr1, ANswitch2 and ANswitchI, even though all three of those 653 ports may be in the data plane in the same (V)LAN. This is easily 654 achieved by extracting native GRASP multicast messages by their MAC 655 multicast destination address. None of the other types of GRASP 656 instances (eg: as used inside the ACP) use GRASP messages that would 657 be affected by such extraction, because all other GRASP messages have 658 encrypted encapsulations. 660 5.3. Candidate ACP Neighbor Selection 662 An autonomic node must determine to which other autonomic nodes in 663 the adjacency table it should build an ACP connection. This is based 664 on the information in the AN Adjacency Table. 666 The ACP is by default established exclusively between nodes in the 667 same domain. 669 Intent can change this default behaviour. Since Intent is 670 transported over the ACP, the first ACP connection a node establishes 671 always follows the default behaviour. The precise format for 672 this Intent needs to be defined outside this document. Example 673 Intent policies which need to be supported include: 675 o The ACP should be built between all sub-domains for a given parent 676 domain. For example: For domain "example.com", nodes of 677 "example.com", "access.example.com", "core.example.com" and 678 "city.core.example.com" should all establish one single ACP. 680 o Two domains should build one single ACP between themselves, for 681 example "example1.com" should establish the ACP also with nodes 682 from "example2.com". For this case, the two domains must be able 683 to validate their trust, typically by cross-signing their 684 certificate infrastructure. 686 The result of the candidate ACP neighbor selection process is a list 687 of adjacent or configured autonomic neighbors to which an ACP channel 688 should be established. The next step begins that channel 689 establishment. 691 5.4.
Channel Selection 693 To avoid attacks, initial discovery of candidate ACP peers cannot 694 include any non-protected negotiation. To avoid re-inventing and 695 validating security association mechanisms, the next step after 696 discovering the address of a candidate neighbor can only be to try 697 first to establish a security association with that neighbor using a 698 well-known security association method. 700 At this time in the lifecycle of autonomic devices, it is unclear 701 whether it is feasible to even decide on a single MTI (mandatory to 702 implement) security association protocol across all autonomic 703 devices: 705 From the use-cases it seems clear that not all types of autonomic 706 devices can or need to connect directly to each other or are able to 707 support or prefer all possible mechanisms. For example, code space 708 limited IoT devices may only support dTLS (because that code exists 709 already on them for end-to-end security use-cases), but low-end in- 710 ceiling L2 switches may only want to support MacSec because that is 711 also supported in HW, and only a more flexible gateway device may 712 need to support both of these mechanisms and potentially more. 714 To support extensible secure channel protocol selection without a 715 single common MTI protocol, an autonomic device must try all the ACP 716 secure channel protocols it supports and that are feasible because 717 the candidate ACP neighbor also announced them via its AN_ACP GRASP 718 parameters (these are called the "feasible" ACP secure channel 719 protocols). 721 To ensure that the selection of the secure channel protocols always 722 succeeds in a predictable fashion without blocking, the following 723 rules apply: 725 An autonomic device may choose to attempt to initiate the different 726 feasible ACP secure channel protocols it supports according to its 727 local policies sequentially or in parallel, but it MUST support 728 acting as a responder to all of them in parallel.
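The notion of "feasible" ACP secure channel protocols can be illustrated with a short sketch. It assumes, purely for illustration, that the local device keeps its supported protocols as a list in order of local preference and that the neighbor's AN_ACP announcement has already been reduced to (protocol name, port) pairs; the function name feasible_protocols and both data layouts are hypothetical and not defined by this document.

```python
def feasible_protocols(local_preference, peer_offers):
    """Protocols supported locally AND announced by the peer.

    local_preference: protocol names, most preferred first.
    peer_offers: (protocol name, port) pairs from the peer's AN_ACP objective.
    Returns (protocol, port) pairs in local preference order.
    """
    offered = dict(peer_offers)
    return [(proto, offered[proto]) for proto in local_preference
            if proto in offered]


# A gateway preferring dTLS, facing a peer announcing IKEv2 and dTLS.
candidates = feasible_protocols(
    ["dTLS", "IKEv2", "MacSec"],
    [("IKEv2", 500), ("dTLS", 15001)],
)
```

Whether the resulting candidates are then initiated sequentially or in parallel is a matter of local policy; the sketch only computes the set to try and does not model the responder side, which has to accept all supported protocols in parallel.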
730 Once the first secure channel protocol succeeds, the two peers know 731 each other's certificates (because certificates must be used by all secure 732 channel protocols for mutual authentication). The device with the 733 lower Device-ID in the ACP address of its certificate becomes Bob, the one with the 734 higher Device-ID becomes Alice. 736 Bob becomes passive: he does not attempt to further initiate ACP 737 secure channel protocols with Alice and does not consider it to be an 738 error when Alice closes secure channels. Alice becomes the active 739 party and continues to attempt setting up secure channel protocols with 740 Bob until she arrives at the best one (from her view) that also works 741 with Bob. 743 For example, originally Bob could have been the initiator of one ACP 744 secure channel protocol that Bob preferred and the security 745 association succeeded. The roles of Bob and Alice are then assigned. 746 At this stage, the protocol may not even have completed negotiating 747 a common security profile. The protocol could for example have been 748 IPsec. It is now up to Alice to decide how to proceed. Even if the 749 IPsec connection determined a working profile with Bob, Alice might 750 prefer some other secure protocol (eg: dTLS) and try to set that up 751 with Bob. If that succeeds, she would close the IPsec connection. If 752 no better protocol attempt succeeds, she would keep the IPsec 753 connection. 755 All this negotiation is in the context of an "L2 interface". Alice 756 and Bob will build ACP connections to each other on every "L2 757 interface" that they both connect to. An autonomic device must not 758 assume that neighbors with the same L2 or link-local IPv6 addresses 759 on different L2 interfaces are the same device. This can only be 760 determined after examining the certificate after a successful 761 security association attempt. 763 5.5.
Candidate ACP Neighbor certificate verification 765 Independent of the security association protocol chosen, candidate 766 ACP neighbors need to be authenticated based on their autonomic 767 domain certificate. This implies that any security association 768 protocol MUST support certificate based authentication that can 769 support the following verification steps: 771 o The certificate is valid as proven by the security association 772 protocol exchanges. 774 o The peer's certificate is signed by the same CA as the device's 775 domain certificate. 777 o The peer's certificate has a valid ACP information field 778 (subjectAltName / rfc822Name) and the domain name in that peer's 779 ACP information field is the same as in the device's certificate. 781 o The peer's certificate is valid according to the CRL or OCSP method 782 indicated in the device's certificate. If the peer's certificate 783 fails any of these checks, the connection attempt is aborted and 784 an error is logged (with throttling). 786 This document does not mandate specific support for CRL or OCSP 787 options. If CRL or OCSP URLs are specified in the device's 788 certificate then the device SHOULD connect to the URL via the ACP if 789 it has an IPv6 address that is reachable via the ACP. Better 790 mechanisms to locate CRL or OCSP server(s), for example via GRASP, are 791 subject to future documents. 793 5.6. Security Association protocols 795 The following sections define the security association protocols that 796 we consider to be important and feasible to specify in this document: 798 5.6.1. ACP via IKEv2 800 An autonomic device announces its ability to support IKEv2 as the ACP 801 secure channel protocol in GRASP as "IKEv2". 803 5.6.1.1. Native IPsec 805 To run ACP via IPsec transport mode, no further IANA assignments/ 806 definitions are required.
All autonomic devices supporting IPsec 807 MUST support IPsec security setup via IKEv2, transport mode 808 encapsulation via the device and peer link-local IPv6 addresses, 809 AES256 encryption and SHA256 hash. 811 In terms of IKEv2, this means the initiator will offer to support 812 IPsec transport mode with next protocol equal to 41 (IPv6). 814 5.6.1.2. IPsec with GRE encapsulation 816 In network devices it is often easier to provide virtual interfaces 817 on top of GRE encapsulation than natively on top of a simple IPsec 818 association. On those devices it may be necessary to run the ACP 819 secure channel on top of a GRE connection protected by the IPsec 820 association. The requirements for the IPsec association are the same 821 as in the native IPsec case, but instead of directly carrying the ACP 822 IPv6 packets, the payload is an ACP IPv6 packet inside GRE/IPv6. The 823 mandatory security profile is the same as for native IPsec: peer 824 link-local IPv6 addresses, AES256 encryption, SHA256 hash. 826 In terms of IKEv2 negotiation, this means the initiator must offer to 827 support IPsec transport mode with next protocol equal to GRE (47), 828 followed by 41 (IPv6) (because native IPsec is required to be 829 supported, see below). 831 If both the IKEv2 initiator and responder support GRE, it will be selected. 832 The version of GRE to be used must be according to [RFC7676]. 834 5.6.2. ACP via dTLS 836 We define the use of ACP via dTLS on the assumption that it is likely 837 the first transport encryption code basis supported in some classes 838 of constrained devices. 840 To run ACP via UDP and dTLS v1.2 [RFC6347] a locally assigned UDP 841 port is used that is announced as a parameter in the GRASP AN_ACP 842 objective to candidate neighbors. All autonomic devices supporting 843 ACP via dTLS must use AES256 encryption. 845 There is no additional session setup or other security association 846 besides this simple dTLS setup.
As soon as the dTLS session is 847 functional, the ACP peers will exchange ACP IPv6 packets as the 848 payload of the dTLS transport connection. Any dTLS defined security 849 association mechanisms such as re-keying are used as they would be 850 for any transport application relying solely on dTLS. 852 5.6.3. ACP Security Profiles 854 A baseline autonomic device MUST support IPsec. A constrained 855 autonomic device MUST support dTLS. Autonomic edge devices connecting 856 constrained areas with baseline areas MUST therefore support IPsec 857 and dTLS. 859 The MTU for ACP secure channels must be derived locally from the 860 underlying link MTU minus the security encapsulation overhead. Given 861 how ACP channels are built across layer2 connections only, the 862 probability for MTU mismatch is low. For additional reliability, 863 applications to be run across the ACP should only assume that the 864 minimum MTU (1280) is available. 866 Autonomic devices need to specify in documentation the set of secure 867 ACP mechanisms they support. 869 5.7. GRASP instance details 871 Received GRASP packets are assigned to an instance of GRASP by the 872 context they are received on: 874 o GRASP packets received on ACP (virtual) interfaces are assigned 875 to the ACP instance of GRASP 877 o GRASP/UDP packets received on L2 interfaces/ports where the device 878 is willing to run ACP are assigned to a DULL instance of GRASP for 879 that interface/port. 881 5.8. Context Separation 883 The ACP is in a separate context from the normal data plane of the 884 device. This context includes the ACP channels' IPv6 forwarding and 885 routing as well as any required higher layer ACP functions. 887 In classical network device platforms, a dedicated so called "Virtual 888 routing and forwarding instance" (VRF) is one logical implementation 889 option for the ACP.
If possible by the platform SW architecture, 890 separation options that minimize shared components are preferred, 891 such as a logical container or virtual machine instance. The context 892 for the ACP needs to be established automatically during bootstrap of 893 a device. As much as possible it should be protected from being 894 modified unintentionally by data plane configuration. 896 Context separation improves security, because the ACP is not 897 reachable from the global routing table. Also, configuration errors 898 from the data plane setup do not affect the ACP. 900 5.9. Addressing inside the ACP 902 The channels explained above typically only establish communication 903 between two adjacent nodes. In order for communication to happen 904 across multiple hops, the autonomic control plane requires internal 905 network wide valid addresses and routing. Each autonomic node must 906 create a virtual interface with a network wide unique address inside 907 the ACP context mentioned in Section 5.8. This address may be used 908 also in other virtual contexts. 910 With the algorithm introduced here, all autonomic devices in the same 911 domain have the same /48 prefix. Conversely, global IDs from 912 different domains are unlikely to clash, such that two networks can 913 be merged, as long as the policy allows that merge. See also 914 Section 7 for a discussion on merging domains. 916 Links inside the ACP only use link-local IPv6 addressing, such that 917 each node only requires one routable virtual address. 919 5.9.1. Fundamental Concepts of Autonomic Addressing 921 o Usage: Autonomic addresses are exclusively used for self- 922 management functions inside a trusted domain. They are not used 923 for user traffic. Communications with entities outside the 924 trusted domain use another address space, for example normally 925 managed routable address space. 
927 o Separation: Autonomic address space is used separately from user 928 address space and other address realms. This supports the 929 robustness requirement. 931 o Loopback-only: Only loopback interfaces of autonomic nodes carry a 932 routable address; all other interfaces exclusively use IPv6 link 933 local for autonomic functions. The usage of IPv6 link local 934 addressing is discussed in [RFC7404]. 936 o Use-ULA: For loopback interfaces of autonomic nodes, we use Unique 937 Local Addresses (ULA), as specified in [RFC4193]. An alternative 938 scheme was discussed, using assigned ULA addressing. The 939 consensus was to use ULA-random [[RFC4193] with L=1], because it 940 was deemed to be sufficient. 942 o No external connectivity: They do not provide access to the 943 Internet. If a node requires further reaching connectivity, it 944 should use another, traditionally managed address scheme in 945 parallel. 947 o Addresses in the ACP are permanent, and do not support temporary 948 addresses as defined in [RFC4941]. 950 The ACP is based exclusively on IPv6 addressing, for a variety of 951 reasons: 953 o Simplicity, reliability and scale: If other network layer 954 protocols were supported, each would have to have its own set of 955 security associations, routing table and process, etc. 957 o Autonomic functions do not require IPv4: Autonomic functions and 958 autonomic service agents are new concepts. They can be 959 exclusively built on IPv6 from day one. There is no need for 960 backward compatibility. 962 o OAM protocols do not require IPv4: The ACP may carry OAM 963 protocols. All relevant protocols (SNMP, TFTP, SSH, SCP, Radius, 964 Diameter, ...) are available in IPv6. 966 5.9.2.
The ACP Addressing Base Scheme 968 The Base ULA addressing scheme for autonomic nodes has the following 969 format: 971 8 40 3 77 972 +--+--------------+------+------------------------------------------+ 973 |FD| hash(domain) | Type | (sub-scheme) | 974 +--+--------------+------+------------------------------------------+ 976 Figure 3: ACP Addressing Base Scheme 978 The first 48 bits follow the ULA scheme, as defined in [RFC4193], to 979 which a type field is added: 981 o "FD" identifies a locally defined ULA address. 983 o The ULA "global ID" is set here to be a hash of the domain name, 984 which results in a pseudo-random 40 bit value. It is calculated 985 as the first 40 bits of the SHA256 hash of the domain name, in the 986 example "example.com". 988 o To allow for extensibility, the fact that the ULA "global ID" is 989 such a hash MUST NOT be assumed by any autonomic device during 990 normal operations but only by registrars during the creation of a 991 response to the CSR Attribute request, eg: when the certificate is 992 created in which the address is inserted via the ACP information 993 attribute. 995 o Type: This field allows different address sub-schemes in the 996 future. The goal is to start with a single sub-scheme, but to 997 allow for extensions later if and when required. This addresses 998 the "upgradability" requirement. Assignment of types for this 999 field should be maintained by IANA. 1001 5.9.3. ACP Addressing Sub-Scheme 1003 The sub-scheme defined here is defined by the Type value 0 (zero) in 1004 the base scheme. 1006 51 13 63 1 1007 +------------------------+---------+----------------------------+---+ 1008 | (base scheme) | Zone-ID | Device-ID | V | 1009 +------------------------+---------+----------------------------+---+ 1011 Figure 4: ACP Addressing Sub-Scheme 1013 The fields are defined as follows: [Editor's note: The lengths of the 1014 fields are for discussion.]
1016 o Zone-ID: If set to all zero bits: The Device-ID bits are used as 1017 an identifier (as opposed to a locator). This results in a non- 1018 hierarchical, flat addressing scheme. Any other value indicates a 1019 zone. See Section 5.9.4 on how this field is used in 1020 detail. 1022 o Device-ID: A unique value for each device. 1024 o V: Virtualization bit: 0: autonomic node base system; 1: a virtual 1025 context on an autonomic node. 1027 The Device-ID is derived as follows: In an Autonomic Network, a 1028 registrar is enrolling new devices. As part of the enrolment process 1029 the registrar assigns a number to the device, which is unique for 1030 this registrar, but not necessarily unique in the domain. The 63 bit 1031 Device-ID is then composed as: 1033 o 48 bit: Registrar ID, a number unique inside the domain that 1034 identifies the registrar which assigned the number to the device. A 1035 MAC address of the registrar can be used for this purpose. 1037 o 15 bit: Device number, a number which is unique for a given 1038 registrar, to identify the device. This can be a sequentially 1039 assigned number. 1041 The "Device-ID" itself is unique in a domain (i.e., the Zone-ID is 1042 not required for uniqueness). Therefore, a device can be addressed 1043 either as part of a flat hierarchy (zone ID = 0), or with an 1044 aggregation scheme (any other zone ID). An address with zone-ID = 0 1045 is an identifier; with any other zone-ID it is a locator. See 1046 Section 5.9.4 for a description of the zone bits. 1048 This addressing sub-scheme allows the direct addressing of specific 1049 virtual containers / VMs on an autonomic node. An increasing number 1050 of hardware platforms have a distributed architecture, with a base OS 1051 for the node itself, and the support for hardware blades with 1052 potentially different OSs.
The VMs on the blades could be considered 1053 as separate autonomic nodes, in which case it would make sense to be 1054 able to address them directly. Autonomic Service Agents (ASAs) could 1055 be instantiated in either the base OS, or one of the VMs on a blade. 1056 This addressing scheme allows for the easy separation of the hardware 1057 context. 1059 The location of the V bit(s) at the end of the address allows to 1060 announce a single prefix for each autonomic node, while having 1061 separate virtual contexts addressable directly. 1063 [EDNOTE: various suggestions from mcr in his mail from 30 Nov 2016 to 1064 be considered (https://mailarchive.ietf.org/arch/msg/anima/ 1065 nZpEphrTqDCBdzsKMpaIn2gsIzI).] 1067 5.9.4. Usage of the Zone Field 1069 The "Zone-ID" allows for the introduction of structure in the 1070 addressing scheme. 1072 Zone = zero is the default addressing scheme in an autonomic domain. 1073 Every autonomic node MUST respond to its ACP address with zone=0. 1074 Used on its own this leads to a non-hierarchical address scheme, 1075 which is suitable for networks up to a certain size. In this case, 1076 the addresses primarily act as identifiers for the nodes, and 1077 aggregation is not possible. 1079 If aggregation is required, the 13 bit value allows for up to 8191 1080 zones. The allocation of zone numbers may either happen 1081 automatically through a to-be-defined algorithm; or it could be 1082 configured and maintained manually. 1084 If a device learns through an autonomic method or through 1085 configuration that it is part of a zone, it MUST also respond to its 1086 ACP address with that zone number. In this case the ACP loopback is 1087 configured with two ACP addresses: One for zone 0 and one for the 1088 assigned zone. This method allows for a smooth transition between a 1089 flat addressing scheme and an hierarchical one. 
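The composition of an ACP address from the base scheme (Figure 3) and the Type 0 sub-scheme (Figure 4) can be sketched as below. This is a non-normative illustration: the function name acp_address and its parameters are hypothetical, and the byte encoding of the domain name fed into SHA256 (plain ASCII, exactly as written) is an assumption, since this document does not specify it. The field widths are those of the figures: 8 + 40 + 3 + 13 + 48 + 15 + 1 = 128 bits. Per Section 5.9.2, only a registrar would perform the hash step; other devices take the address from their certificate.

```python
import hashlib
import ipaddress


def acp_address(domain, registrar_id, device_number,
                zone_id=0, v=0, addr_type=0):
    """Build a 128-bit ACP ULA address (Type 0 sub-scheme layout)."""
    fields = [(registrar_id, 48), (device_number, 15),
              (zone_id, 13), (v, 1), (addr_type, 3)]
    for value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError("field value does not fit in %d bits" % width)

    # ULA global ID: first 40 bits of SHA256(domain); encoding is assumed.
    digest = hashlib.sha256(domain.encode("ascii")).digest()
    global_id = int.from_bytes(digest[:5], "big")

    value = 0xFD                               # 8 bit ULA prefix, L=1
    value = (value << 40) | global_id          # 40 bit hash(domain)
    value = (value << 3) | addr_type           # 3 bit Type
    value = (value << 13) | zone_id            # 13 bit Zone-ID
    value = (value << 48) | registrar_id       # Device-ID: Registrar ID
    value = (value << 15) | device_number      # Device-ID: device number
    value = (value << 1) | v                   # virtualization bit
    return ipaddress.IPv6Address(value)


# Flat (zone 0) address and the additional address for an assigned zone.
flat = acp_address("example.com", registrar_id=0x020000006400, device_number=1)
zoned = acp_address("example.com", 0x020000006400, 1, zone_id=5)
```

A node that learns it belongs to zone 5 would, per the text above, configure both addresses on its ACP loopback; the two values differ only in the 13 Zone-ID bits.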
1091 (Theoretically, the 13 bits for the Zone-ID would allow also for two 1092 levels of zones, introducing a sub-hierarchy. We do not think this 1093 is required at this point, but a new type could be used in the future 1094 to support such a scheme.) 1096 Note: Another way to introduce hierarchy is to use sub-domains in the 1097 naming scheme. The node names "node17.subdomainA.example.com" and 1098 "node4.subdomainB.example.com" would automatically lead to different 1099 ULA prefixes, which can be used to introduce a routing hierarchy in 1100 the network, assuming that the subdomains are aligned with routing 1101 areas. Because the domain name in the ACP information field of the 1102 certificate is used to authenticate an ACP peer's certificate, care 1103 must be taken when using such an approach though: To allow for 1104 devices in separate subdomains to have mutually permitted 1105 certificates, the domain part of the ACP information cannot carry 1106 the subdomain. Instead it should be carried as an extension to the 1107 address part. This part will be ignored and instead only the address 1108 field using the different subdomain hash based ULA prefix will be 1109 used. Example: 1110 anima.acp+FDA3:79A6:F6EE:0:200:0:6400:1+sub:subdomainA@example.com 1112 5.9.5. Other ACP Addressing Sub-Schemes 1114 Other ACP addressing sub-schemes can be defined if and when required. 1115 IANA would need to assign a new "type" for each new addressing sub- 1116 scheme. 1118 5.10. Routing in the ACP 1120 Once ULA addresses are set up, all autonomic entities should run a 1121 routing protocol within the autonomic control plane context. This 1122 routing protocol distributes the ULA created in the previous section 1123 for reachability. The use of the autonomic control plane specific 1124 context eliminates a possible clash with the global routing table 1125 and also protects the ACP from interference caused by configuration 1126 mismatches or incorrect routing updates.
1128 The establishment of the routing plane and its parameters are 1129 automatic and strictly within the confines of the autonomic control 1130 plane. Therefore, no manual configuration is required. 1132 All routing updates are automatically secured in transit as the 1133 channels of the autonomic control plane are by default secured, and 1134 this routing runs only inside the ACP. 1136 The routing protocol inside the ACP is RPL ([RFC6550]) with the 1137 following profile. See Appendix A for more details on the choice of 1138 RPL. 1140 5.10.1. RPL Profile for the ACP 1142 o RPL Mode of Operations (MOP): mode 3 "Storing Mode of Operations 1143 with multicast support". Implementations should support also 1144 other modes. Note: Root indicates mode in DIO flow. 1146 o Objective Function (OF): Use OF0 [RFC6552]. No use of metric 1147 containers, Default RPLInstanceID = 0. 1149 * stretch_rank: none provided ("not stretched"). 1151 * rank_factor: Derived from link speed: <= 100Mbps: 1152 LOW_SPEED_FACTOR(5), else HIGH_SPEED_FACTOR(1) 1154 o Trickle: Not used; Data Path Validation: Not used. 1156 o Proactive, aggressive DAO state maintenance: 1158 * Use K-flag in unsolicited DAO indicating change from previous 1159 information (to require DAO-ACK). 1161 * Retry such DAO DAO-RETRIES(3) times with DAO- 1162 ACK_TIME_OUT(256ms) in between. 1164 o Administrative Preference ([RFC6552], 3.2.6 - to become root): 1165 Indicated in DODAGPreference field of DIO message. 1167 * Explicit configured "root": 0b100 1169 * Registrar (Default): 0b011 1171 * AN-connect (non registrar): 0b010 1173 * Default: 0b001. 1175 The RPL root can create additional RPL instances with other OF and 1176 metrics as desired, eg: via intent. 1178 5.11. General ACP Considerations 1180 In order to be independent of configured link addresses, channels 1181 SHOULD use IPv6 link local addresses between adjacent neighbors 1182 wherever possible. 
This way, the ACP tunnels are independent of 1183 correct network wide routing. 1185 Since channels are by default established between adjacent neighbors, 1186 the resulting overlay network performs hop-by-hop encryption. Each node 1187 decrypts incoming traffic from the ACP, and encrypts outgoing traffic 1188 to its neighbors in the ACP. Routing is discussed in Section 5.10. 1190 If two nodes are connected via several links, the ACP SHOULD be 1191 established on every link, but it is possible to establish the ACP 1192 only on a sub-set of links. Having an ACP channel on every link has 1193 a number of advantages, for example it allows for a faster failover 1194 in case of link failure, and it reflects the physical topology more 1195 closely. Using a subset of links (for example, a single link) 1196 reduces resource consumption on the devices, because state needs to 1197 be kept per ACP channel. 1199 6. Workarounds for Non-Autonomic Nodes 1201 6.1. Non-Autonomic Controller / NMS system (ACP connect) 1203 The Autonomic Control Plane can be used by management systems, such 1204 as controllers or network management system (NMS) hosts (henceforth 1205 called simply "NMS hosts"), to connect to devices through it. For 1206 this, an NMS host must have access to the ACP. The ACP is a self- 1207 protecting overlay network, which allows by default access only to 1208 trusted, autonomic systems. Therefore, a traditional, non-autonomic 1209 NMS system does not have access to the ACP by default, just like any 1210 other external device. 1212 If the NMS host is not autonomic, i.e., it does not support autonomic 1213 negotiation of the ACP, then it can be brought into the ACP by 1214 explicit configuration. To support connections to adjacent non- 1215 autonomic nodes, an autonomic node with ACP must support "ACP 1216 connect" (sometimes also called "autonomic connect"): 1218 "ACP connect" is a function on an autonomic device that we call an 1219 "ACP edge device".
With "ACP connect", interfaces on the device can 1220 be configured to be put into the ACP VRF. The ACP is then accessible 1221 to other (NOC) systems on such an interface without those systems 1222 having to support any ACP discovery or ACP channel setup. This is 1223 also called "native" access to the ACP because to those (NOC) systems 1224 the interface looks like a normal network interface (without any 1225 encryption/novel-signaling). 1227 data-plane "native" (no ACP) 1228 . 1229 +-----------+ +-----------+ . +-------------+ 1230 | | | Autonomic | v | |+ 1231 | | | Device |-----------------| |+ 1232 | Autonomic |-----------|"ACP edge | | NOC Device || 1233 | Device | ^ | device" O-----------------| "NMS hosts" || 1234 | | . | | . ^ | || 1235 +-----------+ . +-----------+ . . +-------------+| 1236 . . . +-------------+ 1237 data-plane "native" . ACP "native" (unencrypted) 1238 + ACP auto-negotiated . 1239 and encrypted ACP connect interface 1240 eg: "vrf ACP native" (config) 1242 Figure 5: ACP connect 1244 ACP connect has security consequences: All systems and processes 1245 connected via ACP connect have access to all autonomic nodes on the 1246 entire ACP, without further authentication. Thus, the ACP connect 1247 interface and (NOC) systems connected to it must be physically 1248 controlled/secured. 1250 The ACP connect interface must be configured with some IPv6 address 1251 prefix. This prefix could use the ACP address prefix or could be 1252 different. It must be distributed into the ACP routing protocol 1253 unless the ACP device is the root of the ACP routing protocol (eg: 1254 when all other autonomic devices have a default route in the ACP 1255 towards it). The NOC hosts must route the ACP address prefix to the 1256 ACP edge devices address on the ACP connect interface. 1258 An ACP connect interface provides exclusively access to only the ACP. 1259 This is likely insufficient for many NOC hosts. 
Instead, they would 1260 likely require a second interface outside the ACP for connections 1261 between the NMS host and administrators, or Internet based services, 1262 or even for direct access to the data plane. The document "Autonomic 1263 Network Stable Connectivity" [I-D.ietf-anima-stable-connectivity] 1264 explains in more detail how the ACP can be integrated in a mixed NOC 1265 environment. 1267 Note: If an NMS host is autonomic itself, it negotiates access to the 1268 ACP with its neighbor, like any other autonomic node and then runs a 1269 normal (encrypted) ACP connection to the neighbor. 1271 6.2. ACP through Non-Autonomic L3 Clouds 1273 Not all devices in a network may be autonomic. If non-autonomic 1274 Layer-2 devices are between autonomic nodes, the communications 1275 described in this document should work, since it is IP based. 1276 However, non-autonomic Layer-3 devices do not forward link local 1277 autonomic messages, and thus break the Autonomic Control Plane. 1279 One workaround is to manually configure IP tunnels between autonomic 1280 nodes across a non-autonomic Layer-3 cloud. The tunnels are 1281 represented on each autonomic node as virtual interfaces, and all 1282 autonomic transactions work across such tunnels. 1284 Such manually configured tunnels are less "indestructible" than an 1285 automatically created ACP based on link local addressing, since they 1286 depend on correct data plane operations, such as routing and 1287 addressing. 1289 Future work should envisage an option where the edge device of the L3 1290 cloud is configured to automatically forward ACP discovery messages 1291 to the right exit point. This optimisation is not considered in this 1292 document. 1294 7. Self-Healing Properties 1296 The ACP is self-healing: 1298 o New neighbors will automatically join the ACP after successful 1299 validation and will become reachable using their unique ULA 1300 address across the ACP. 
1302 o When any changes happen in the topology, the routing protocol used 1303 in the ACP will automatically adapt to the changes and will 1304 continue to provide reachability to all devices. 1306 o If an existing device gets revoked, it will automatically be 1307 denied access to the ACP as its domain certificate will be 1308 validated against a Certificate Revocation List during 1309 authentication. Since the revocation check is only done at the 1310 establishment of a new security association, existing ones are not 1311 automatically torn down. If an immediate disconnect is required, 1312 existing sessions to a freshly revoked device can be re-set. 1314 The ACP can also sustain network partitions and mergers. Practically 1315 all ACP operations are link local, where a network partition has no 1316 impact. Devices authenticate each other using the domain 1317 certificates to establish the ACP locally. Addressing inside the ACP 1318 remains unchanged, and the routing protocol inside both parts of the 1319 ACP will lead to two working (although partitioned) ACPs. 1321 There are few central dependencies: A certificate revocation list 1322 (CRL) may not be available during a network partition; a suitable 1323 policy to not immediately disconnect neighbors when no CRL is 1324 available can address this issue. Also, a registrar or Certificate 1325 Authority might not be available during a partition. This may delay 1326 renewal of certificates that are to expire in the future, and it may 1327 prevent the enrolment of new devices during the partition. 1329 After a network partition, a re-merge will just establish the 1330 previous status, certificates can be renewed, the CRL is available, 1331 and new devices can be enrolled everywhere. Since all devices use 1332 the same trust anchor, a re-merge will be smooth. 1334 Merging two networks with different trust anchors requires the trust 1335 anchors to mutually trust each other (for example, by cross-signing). 
1336 As long as the domain names are different, the addressing will not 1337 overlap (see Section 5.9). 1339 It is also highly desirable for implementations of the ACP to be able 1340 to run it over interfaces that are administratively down. If this is 1341 not feasible, then it might instead be possible to request explicit 1342 operator override upon administrative actions that would 1343 administratively bring down an interface across which the ACP is 1344 running, especially if bringing down the ACP is known to disconnect 1345 the operator from the device. For example, any such administrative 1346 down action could perform a dependency check to see if the 1347 transport connection across which this action is performed is 1348 affected by the down action (with default RPL routing used, packet 1349 forwarding will be symmetric, so this is actually possible to check). 1351 8. Self-Protection Properties 1353 As explained in Section 5, the ACP is based on secure channels built 1354 between devices that have mutually authenticated each other with 1355 their domain certificates. The channels themselves are protected 1356 using standard encryption technologies like DTLS or IPsec which 1357 provide additional authentication during channel establishment, data 1358 integrity and data confidentiality protection of data inside the ACP 1359 and in addition, provide replay protection. 1361 An attacker will therefore not be able to join the ACP without 1362 a valid domain certificate. Packet injection and traffic 1363 sniffing will also not be possible due to the security provided by the 1364 encryption protocol. 1366 The remaining attack vector would be to attack the underlying AN 1367 protocols themselves, either via directed attacks or by denial-of- 1368 service attacks. However, as the ACP is built using link-local IPv6 1369 addresses, remote attacks are impossible.
The ULA addresses are only 1370 reachable inside the ACP context, therefore unreachable from the data 1371 plane. Also, the ACP protocols should be implemented to be attack 1372 resistant and not consume unnecessary resources even while under 1373 attack. 1375 9. The Administrator View 1377 An ACP is self-forming, self-managing and self-protecting, therefore 1378 has minimal dependencies on the administrator of the network. 1379 Specifically, since it is independent of configuration, there is no 1380 scope for configuration errors on the ACP itself. The administrator 1381 may have the option to enable or disable the entire approach, but 1382 detailed configuration is not possible. This means that the ACP must 1383 not be reflected in the running configuration of devices, except a 1384 possible on/off switch. 1386 While configuration is not possible, an administrator must have full 1387 visibility of the ACP and all its parameters, to be able to do 1388 trouble-shooting. Therefore, an ACP must support all show and debug 1389 options, as for any other network function. Specifically, a network 1390 management system or controller must be able to discover the ACP, and 1391 monitor its health. This visibility of ACP operations must clearly 1392 be separated from visibility of data plane so automated systems will 1393 never have to deal with ACP aspect unless they explicitly desire to 1394 do so. 1396 Since an ACP is self-protecting, a device not supporting the ACP, or 1397 without a valid domain certificate cannot connect to it. This means 1398 that by default a traditional controller or network management system 1399 cannot connect to an ACP. See Section 6.1 for more details on how to 1400 connect an NMS host into the ACP. 1402 10. Security Considerations 1404 An ACP is self-protecting and there is no need to apply configuration 1405 to make it secure. Its security therefore does not depend on 1406 configuration. 
1408 However, the security of the ACP depends on a number of other 1409 factors: 1411 o The usage of domain certificates depends on a valid supporting 1412 PKI. If the chain of trust of this PKI 1413 is compromised, the security of the ACP is also compromised. This 1414 is typically under the control of the network administrator. 1416 o Security can be compromised by implementation errors (bugs), as in 1417 all products. 1419 There is no prevention of source-address spoofing inside the ACP. 1420 This implies that if an attacker gains access to the ACP, (s)he can 1421 spoof all addresses inside the ACP and fake messages from any other 1422 device. 1424 Fundamentally, security depends on correct operation, implementation 1425 and architecture. Autonomic approaches such as the ACP largely 1426 eliminate the dependency on correct operation; implementation and 1427 architectural mistakes are still possible, as in all networking 1428 technologies. 1430 11. IANA Considerations 1432 12. Acknowledgements 1434 This work originated from an Autonomic Networking project at Cisco 1435 Systems, which started in early 2010. Many people contributed to 1436 this project and the idea of the Autonomic Control Plane, amongst 1437 whom (in alphabetical order): Ignas Bagdonas, Parag Bhide, Balaji 1438 BL, Alex Clemm, Yves Hertoghs, Bruno Klauser, Max Pritikin, Ravi 1439 Kumar Vadapalli. 1441 Special thanks to Pascal Thubert for providing the details for the 1442 recommendations of the RPL profile to use in the ACP. 1444 Further input and suggestions were received from: Rene Struik, Brian 1445 Carpenter, Benoit Claise. 1447 13. Change log [RFC Editor: Please remove] 1449 13.1. Initial version 1451 First version of this document: draft-behringer-autonomic-control- 1452 plane 1454 13.2. draft-behringer-anima-autonomic-control-plane-00 1456 Initial version of the anima document; only minor edits. 1458 13.3.
draft-behringer-anima-autonomic-control-plane-01 1460 o Clarified that the ACP should be based on, and support only IPv6. 1462 o Clarified in intro that ACP is for both, between devices, as well 1463 as for access from a central entity, such as an NMS. 1465 o Added a section on how to connect an NMS system. 1467 o Clarified the hop-by-hop crypto nature of the ACP. 1469 o Added several references to GDNP as a candidate protocol. 1471 o Added a discussion on network split and merge. Although, this 1472 should probably go into the certificate management story longer 1473 term. 1475 13.4. draft-behringer-anima-autonomic-control-plane-02 1477 Addresses (numerous) comments from Brian Carpenter. See mailing list 1478 for details. The most important changes are: 1480 o Introduced a new section "overview", to ease the understanding of 1481 the approach. 1483 o Merged the previous "problem statement" and "use case" sections 1484 into a mostly re-written "use cases" section, since they were 1485 overlapping. 1487 o Clarified the relationship with draft-ietf-anima-stable- 1488 connectivity 1490 13.5. draft-behringer-anima-autonomic-control-plane-03 1492 o Took out requirement for IPv6 --> that's in the reference doc. 1494 o Added requirement section. 1496 o Changed focus: more focus on autonomic functions, not only virtual 1497 out of band. This goes a bit throughout the document, starting 1498 with a changed abstract and intro. 1500 13.6. draft-ietf-anima-autonomic-control-plane-00 1502 No changes; re-submitted as WG document. 1504 13.7. draft-ietf-anima-autonomic-control-plane-01 1506 o Added some paragraphs in addressing section on "why IPv6 only", to 1507 reflect the discussion on the list. 1509 o Moved the data-plane ACP out of the main document, into an 1510 appendix. The focus is now the virtually separated ACP, since it 1511 has significant advantages, and isn't much harder to do. 
1513 o Changed the self-creation algorithm: Part of the initial steps go 1514 into the reference document. This document now assumes an 1515 adjacency table, and domain certificate. How those get onto the 1516 device is outside scope for this document. 1518 o Created a new section 6 "workarounds for non-autonomic nodes", and 1519 put the previous controller section (5.9) into this new section. 1520 Now, section 5 is "autonomic only", and section 6 explains what to 1521 do with non-autonomic stuff. Much cleaner now. 1523 o Added an appendix explaining the choice of RPL as a routing 1524 protocol. 1526 o Formalised the creation process a bit more. Now, we create a 1527 "candidate peer list" from the adjacency table, and form the ACP 1528 with those candidates. Also it explains now better that policy 1529 (Intent) can influence the peer selection. (section 4 and 5) 1531 o Introduce a section for the capability negotiation protocol 1532 (section 7). This needs to be worked out in more detail. This 1533 will likely be based on GRASP. 1535 o Introduce a new parameter: ACP tunnel type. And defines it in the 1536 IANA considerations section. Suggest GRE protected with IPSec 1537 transport mode as the default tunnel type. 1539 o Updated links, lots of small edits. 1541 13.8. draft-ietf-anima-autonomic-control-plane-02 1543 o Added explicitly text for the ACP channel negotiation. 1545 o Merged draft-behringer-anima-autonomic-addressing-02 into this 1546 document, as suggested by WG chairs. 1548 13.9. draft-ietf-anima-autonomic-control-plane-03 1550 o Changed Neighbor discovery protocol from GRASP to mDNS. Bootstrap 1551 protocol team decided to go with mDNS to discover bootstrap proxy, 1552 and ACP should be consistent with this. Reasons to go with mDNS 1553 in bootstrap were a) Bootstrap should be reuseable also outside of 1554 full anima solutions and introduce as few as possible new 1555 elements. 
mDNS was considered well-known and very likely even pre- 1556 existing in low-end devices (IoT). b) Using GRASP both for the 1557 insecure neighbor discovery and secure ACP operations raises the 1558 risk of introducing security issues through implementation issues/ 1559 non-isolation between those two instances of GRASP. 1561 o Shortened the section on GRASP instances, because with mDNS being 1562 used for discovery, there is no insecure GRASP session any longer, 1563 simplifying the GRASP considerations. 1565 o Added certificate requirements for ANIMA in section 5.1.1, 1566 specifically how the ANIMA information is encoded in 1567 subjectAltName. 1569 o Deleted the appendix on "ACP without separation", as originally 1570 planned, and the paragraph in the main text referring to it. 1572 o Deleted one sub-addressing scheme, focusing on a single scheme 1573 now. 1575 o Included information on how ANIMA information must be encoded in 1576 the domain certificate in Section 5.1. 1578 o Editorial changes, updated draft references, etc. 1580 13.10. draft-ietf-anima-autonomic-control-plane-04 1582 Changed discovery of ACP neighbor back from mDNS to GRASP after 1583 revisiting the L2 problem. Described problem in discovery section 1584 itself to justify. Added text to explain how ACP discovery relates 1585 to BRSKI (bootstrap) discovery and pointed to Michael Richardson's 1586 draft detailing it. Removed appendix section that contained the 1587 original explanations why GRASP would be useful (current text is 1588 meant to be better). 1590 13.11. draft-ietf-anima-autonomic-control-plane-05 1592 o Section 5.3 (candidate ACP neighbor selection): Add that Intent 1593 can override only AFTER an initial default ACP establishment. 1595 o Section 5.9.1 (addressing): State that addresses in the ACP are 1596 permanent, and do not support temporary addresses as defined in 1597 RFC4941.
1599 o Modified Section 5.2.3 to point to the GRASP objective defined in 1600 [I-D.carpenter-anima-ani-objectives]. (and added that reference) 1602 o Section 5.9.2: changed from MD5 for calculating the first 40 bits 1603 to SHA256; reason is MD5 should not be used any more. 1605 o Added address sub-scheme to the IANA section. 1607 o Made the routing section more prescriptive. 1609 o Clarified in Section 6.1 the ACP Connect port, and defined that 1610 term "ACP Connect". 1612 o Section 6.2: Added some thoughts (from mcr) on how traversing an L3 1613 cloud could be automated. 1615 o Added a CRL check in Section 5.6. 1617 o Added a note on the possibility of source-address spoofing into 1618 the security considerations section. 1620 o Other editorial changes, including those proposed by Michael 1621 Richardson on 30 Nov 2016 (see ANIMA list). 1623 13.12. draft-ietf-anima-autonomic-control-plane-06 1625 o Added proposed RPL profile. 1627 o Detailed dTLS profile - dTLS with any additional negotiation/ 1628 signaling channel. 1630 o Fixed up text for ACP/GRE encap. Removed text claiming it's 1631 incompatible with non-GRE IPsec and detailed it. 1633 o Added text to suggest admin down interfaces should still run ACP. 1635 13.13. draft-ietf-anima-autonomic-control-plane-07 1637 o Changed author association. 1639 o Improved ACP connect section (after confusion about term came up in 1640 the stable connectivity draft review). Added picture, defined 1641 complete terminology. 1643 o Moved ACP channel negotiation from normative section to appendix 1644 because it can in the timeline of this document not be fully 1645 specified to be implementable. Aka: work for future document. 1646 That work would also need to include analysing IKEv2 and describing 1647 the difference of a proposed GRASP/TLS solution to it. 1649 o Removed IANA request to allocate registry for GRASP/TLS. This 1650 would come with future draft (see above).
1652 o Gave the name "ACP information" to the field in the certificate 1653 carrying the ACP address and domain name. 1655 o Changed the rules for mutual authentication of certificates to 1656 rely on the domain in the ACP information of the certificate 1657 instead of the OU in the certificate. Also renewed the text 1658 pointing out that the ACP information in the certificate is meant 1659 to be in a form that it does not disturb other uses of the 1660 certificate. As long as the ACP was expected to rely on a common OU 1661 across all certificates in a domain, this was not really true: 1662 Other uses of the certificates might require different OUs for 1663 different areas/types of devices. With the rules in this draft 1664 version, the ACP authentication does not rely on any other fields 1665 in the certificate. 1667 o Added an extension field to the ACP information so that in the 1668 future additional fields like a subdomain could be inserted. An 1669 example using such a subdomain field was added to the pre-existing 1670 text suggesting sub-domains. This approach is necessary so that 1671 there can be a single (main) domain in the ACP information, 1672 because that is used for mutual authentication of the certificate. 1673 Also clarified that only the registrar(s) SHOULD/MUST use that the 1674 ACP address was generated from the domain name - so that we can 1675 more easily change this in extensions. 1677 o Took the text for the GRASP discovery of ACP neighbors from Brian's 1678 grasp-ani-objectives draft. Alas, that draft was behind the 1679 latest GRASP draft, so I had to overhaul. The major change is to 1680 describe in the ACP draft the whole format of the M_FLOOD message 1681 (and not only the actual objective). This should make it a lot 1682 easier to read (without having to go back and forth to the GRASP 1683 RFC/draft). It was also necessary because the locator in the 1684 M_FLOOD messages has an important role and it's not coded inside 1685 the objective.
The specification of how to format the M_FLOOD 1686 message should now be complete, the text may be some duplicate 1687 with the DULL specification in GRASP, but no contradiction. 1689 o One of the main outcomes of reworking the GRASP section was the 1690 notion that GRASP announces both the candidate peer's IPv6 link 1691 local address but also the supported ACP security protocol including 1692 the port it is running on. In the past we shied away from using 1693 this information because it is not secured, but I think the 1694 additional attack vectors possible by using this information are 1695 negligible: If an attacker on an L2 subnet can fake another 1696 device's GRASP message then it can already provide a similar amount 1697 of attack by purely faking the link-local address. 1699 o Removed the section on discovery and BRSKI. This can be revived 1700 in the BRSKI document, but it seems moot given how we did remove 1701 mDNS from the latest BRSKI document (aka: this section discussed 1702 discrepancies between GRASP and mDNS discovery which should not 1703 exist anymore with latest BRSKI). 1705 o Tried to resolve the EDNOTE about CRL vs. OCSP by pointing out we 1706 do not specify which one is to be used but that the ACP should be 1707 used to reach the URL included in the certificate to get to the 1708 CRL storage or OCSP server. 1710 o Changed ACP via IPsec to ACP via IKEv2 and restructured the 1711 sections to make IPsec native and IPsec via GRE subsections. 1713 o No need for any assigned dTLS port if ACP is run across dTLS 1714 because it is signalled via GRASP. 1716 14. References 1718 [I-D.carpenter-anima-ani-objectives] 1719 Carpenter, B. and B. Liu, "Technical Objective Formats for 1720 the Autonomic Network Infrastructure", draft-carpenter- 1721 anima-ani-objectives-02 (work in progress), June 2017. 1723 [I-D.ietf-anima-bootstrapping-keyinfra] 1724 Pritikin, M., Richardson, M., Behringer, M., Bjarnason, 1725 S., and K.
Watsen, "Bootstrapping Remote Secure Key 1726 Infrastructures (BRSKI)", draft-ietf-anima-bootstrapping- 1727 keyinfra-06 (work in progress), May 2017. 1729 [I-D.ietf-anima-grasp] 1730 Bormann, C., Carpenter, B., and B. Liu, "A Generic 1731 Autonomic Signaling Protocol (GRASP)", draft-ietf-anima- 1732 grasp-14 (work in progress), July 2017. 1734 [I-D.ietf-anima-reference-model] 1735 Behringer, M., Carpenter, B., Eckert, T., Ciavaglia, L., 1736 Pierre, P., Liu, B., Nobre, J., and J. Strassner, "A 1737 Reference Model for Autonomic Networking", draft-ietf- 1738 anima-reference-model-04 (work in progress), July 2017. 1740 [I-D.ietf-anima-stable-connectivity] 1741 Eckert, T. and M. Behringer, "Using Autonomic Control 1742 Plane for Stable Connectivity of Network OAM", draft-ietf- 1743 anima-stable-connectivity-02 (work in progress), February 1744 2017. 1746 [I-D.richardson-anima-6join-discovery] 1747 Richardson, M., "GRASP discovery of Registrar by Join 1748 Assistant", draft-richardson-anima-6join-discovery-00 1749 (work in progress), October 2016. 1751 [RFC4122] Leach, P., Mealling, M., and R. Salz, "A Universally 1752 Unique IDentifier (UUID) URN Namespace", RFC 4122, 1753 DOI 10.17487/RFC4122, July 2005, 1754 . 1756 [RFC4193] Hinden, R. and B. Haberman, "Unique Local IPv6 Unicast 1757 Addresses", RFC 4193, DOI 10.17487/RFC4193, October 2005, 1758 . 1760 [RFC4941] Narten, T., Draves, R., and S. Krishnan, "Privacy 1761 Extensions for Stateless Address Autoconfiguration in 1762 IPv6", RFC 4941, DOI 10.17487/RFC4941, September 2007, 1763 . 1765 [RFC5082] Gill, V., Heasley, J., Meyer, D., Savola, P., Ed., and C. 1766 Pignataro, "The Generalized TTL Security Mechanism 1767 (GTSM)", RFC 5082, DOI 10.17487/RFC5082, October 2007, 1768 . 1770 [RFC5280] Cooper, D., Santesson, S., Farrell, S., Boeyen, S., 1771 Housley, R., and W. 
Polk, "Internet X.509 Public Key 1772 Infrastructure Certificate and Certificate Revocation List 1773 (CRL) Profile", RFC 5280, DOI 10.17487/RFC5280, May 2008, 1774 . 1776 [RFC5952] Kawamura, S. and M. Kawashima, "A Recommendation for IPv6 1777 Address Text Representation", RFC 5952, 1778 DOI 10.17487/RFC5952, August 2010, 1779 . 1781 [RFC6347] Rescorla, E. and N. Modadugu, "Datagram Transport Layer 1782 Security Version 1.2", RFC 6347, DOI 10.17487/RFC6347, 1783 January 2012, . 1785 [RFC6550] Winter, T., Ed., Thubert, P., Ed., Brandt, A., Hui, J., 1786 Kelsey, R., Levis, P., Pister, K., Struik, R., Vasseur, 1787 JP., and R. Alexander, "RPL: IPv6 Routing Protocol for 1788 Low-Power and Lossy Networks", RFC 6550, 1789 DOI 10.17487/RFC6550, March 2012, 1790 . 1792 [RFC6552] Thubert, P., Ed., "Objective Function Zero for the Routing 1793 Protocol for Low-Power and Lossy Networks (RPL)", 1794 RFC 6552, DOI 10.17487/RFC6552, March 2012, 1795 . 1797 [RFC6762] Cheshire, S. and M. Krochmal, "Multicast DNS", RFC 6762, 1798 DOI 10.17487/RFC6762, February 2013, 1799 . 1801 [RFC6763] Cheshire, S. and M. Krochmal, "DNS-Based Service 1802 Discovery", RFC 6763, DOI 10.17487/RFC6763, February 2013, 1803 . 1805 [RFC7404] Behringer, M. and E. Vyncke, "Using Only Link-Local 1806 Addressing inside an IPv6 Network", RFC 7404, 1807 DOI 10.17487/RFC7404, November 2014, 1808 . 1810 [RFC7575] Behringer, M., Pritikin, M., Bjarnason, S., Clemm, A., 1811 Carpenter, B., Jiang, S., and L. Ciavaglia, "Autonomic 1812 Networking: Definitions and Design Goals", RFC 7575, 1813 DOI 10.17487/RFC7575, June 2015, 1814 . 1816 [RFC7576] Jiang, S., Carpenter, B., and M. Behringer, "General Gap 1817 Analysis for Autonomic Networking", RFC 7576, 1818 DOI 10.17487/RFC7576, June 2015, 1819 . 1821 [RFC7676] Pignataro, C., Bonica, R., and S. Krishnan, "IPv6 Support 1822 for Generic Routing Encapsulation (GRE)", RFC 7676, 1823 DOI 10.17487/RFC7676, October 2015, 1824 . 1826 Appendix A. 
Background on the choice of routing protocol 1828 In a pre-standard implementation, the "IPv6 Routing Protocol for Low- 1829 Power and Lossy Networks" (RPL, [RFC6550]) was chosen. This 1830 Appendix explains the reasoning behind that decision. 1832 Requirements for routing in the ACP are: 1834 o Self-management: The ACP must build automatically, without human 1835 intervention. Therefore, the routing protocol must also work 1836 completely automatically. RPL is a simple, self-managing 1837 protocol, which does not require zones or areas; it is also self- 1838 configuring, since configuration is carried as part of the 1839 protocol (see Section 6.7.6 of [RFC6550]). 1841 o Scale: The ACP builds over an entire domain, which could be a 1842 large enterprise or service provider network. The routing 1843 protocol must therefore support domains of 100,000 nodes or more, 1844 ideally without the need for zoning or separation into areas. RPL 1845 has this scale property. This is based on extensive use of 1846 default routing. RPL also has other scalability improvements, 1847 such as selecting only a subset of peers instead of all possible 1848 ones, and trickle support for information synchronisation. 1850 o Low resource consumption: The ACP supports traditional network 1851 infrastructure, thus runs in addition to traditional protocols. 1852 The ACP, and specifically the routing protocol, must have low 1853 resource consumption both in terms of memory and CPU requirements. 1854 Specifically, at edge nodes, where memory and CPU are scarce, 1855 consumption should be minimal. RPL builds a destination-oriented 1856 directed acyclic graph (DODAG), where the main resource 1857 consumption is at the root of the DODAG. The closer to the edge 1858 of the network, the less state needs to be maintained. This 1859 adapts nicely to the typical network design. Also, all changes 1860 below a common parent node are kept below that parent node.
1862 o Support for unstructured address space: In the Autonomic 1863 Networking Infrastructure, node addresses are identifiers, and may 1864 not be assigned in a topological way. Also, nodes may move 1865 topologically, without changing their address. Therefore, the 1866 routing protocol must support completely unstructured address 1867 space. RPL is specifically made for mobile ad-hoc networks, with 1868 no assumptions on topologically aligned addressing. 1870 o Modularity: To keep the initial implementation small, yet allow 1871 later for more complex methods, it is highly desirable that the 1872 routing protocol has a simple base functionality, but can import 1873 new functional modules if needed. RPL has this property with the 1874 concept of "objective function", which is a plugin to modify 1875 routing behaviour. 1877 o Extensibility: Since the Autonomic Networking Infrastructure is a 1878 new concept, it is likely that changes in the way of operation 1879 will happen over time. RPL allows for new objective functions to 1880 be introduced later, which allow changes to the way the routing 1881 protocol creates the DAGs. 1883 o Multi-topology support: It may become necessary in the future to 1884 support more than one DODAG for different purposes, using 1885 different objective functions. RPL allows for the creation of 1886 several parallel DODAGs, should this be required. This could be 1887 used to create different topologies to reach different roots. 1889 o No need for path optimisation: RPL does not necessarily compute 1890 the optimal path between any two nodes. However, the ACP does not 1891 require this today, since it carries mainly non-delay-sensitive 1892 feedback loops. It is possible that different optimisation 1893 schemes become necessary in the future, but RPL can be expanded 1894 (see point "Extensibility" above). 1896 Appendix B.
Extending ACP channel negotiation (via GRASP) 1898 The mechanism described in the normative part of this document to 1899 support multiple different ACP secure channel protocols without a 1900 single network-wide MTI protocol is important to allow extending 1901 secure ACP channel protocols beyond what is specified in this 1902 document, but it will run into problems if it is used for 1903 multiple protocols: 1905 The need to potentially have multiple of these security associations 1906 even temporarily run in parallel to determine which of them works 1907 best does not support the most lightweight implementation options. 1909 The simple policy of letting one side (Alice) decide what is best may 1910 not lead to the mutual best result. 1912 The two limitations can be solved more easily if the solution were more 1913 modular and as few as possible initial secure channel negotiation 1914 protocols would be used, and these protocols would then take on the 1915 responsibility to support more flexible objectives to negotiate the 1916 mutually preferred ACP security channel protocol. 1918 IKEv2 is the IETF standard protocol to negotiate network security 1919 associations. It is meant to be extensible, but it is unclear 1920 whether it would be feasible to extend IKEv2 to support possible 1921 future requirements for ACP secure channel negotiation: 1923 Consider the simple case where the use of native IPsec vs. IPsec via 1924 GRE is to be negotiated and the objective is the maximum throughput. 1925 Both sides would indicate some agreed upon performance metric and the 1926 preferred encapsulation is the one with the higher performance of the 1927 slower side. IKEv2 does not support negotiation with this objective. 1929 Consider that dTLS and some form of 802.1AE (MacSEC) are to be added as 1930 negotiation options - and the performance objective should work 1931 across all IPsec, dTLS and 802.1AE options.
In the case of MacSEC, 1932 the negotiation would also need to determine a key for the peering. 1933 It is unclear if it would be even appropriate to consider extending 1934 the scope of negotiation in IKEv2 to those cases. Even if feasible 1935 to define, it is unclear if implementations of IKEv2 would be eager 1936 to adopt those types of extensions given the long cycles of security 1937 testing that necessarily go along with core security protocols such 1938 as IKEv2 implementations. 1940 A more modular alternative to extending IKEv2 could be to layer a 1941 modular negotiation mechanism on top of the multitude of existing or 1942 possible future secure channel protocols. For this, GRASP over TLS 1943 could be considered as a first ACP secure channel negotiation 1944 protocol. The following are initial considerations for such an 1945 approach. A full specification is subject to a separate document: 1947 To explicitly allow negotiation of the ACP channel protocol, GRASP 1948 over a TLS connection using the GRASP_LISTEN_PORT and the device's and 1949 peer's link-local IPv6 addresses is used. When Alice and Bob support 1950 GRASP negotiation, they prefer it over any other non-explicitly 1951 negotiated security association protocol and should defer trying any 1952 non-negotiated ACP channel protocol until after it is clear that 1953 GRASP/TLS will not work to the peer. 1955 When Alice and Bob successfully establish the GRASP/TLS session, they 1956 will negotiate the channel mechanism to use using objectives such as 1957 performance and perceived quality of the security. After agreeing on 1958 a channel mechanism, Alice and Bob start the selected channel 1959 protocol. Once the secure channel protocol is successfully running, 1960 the GRASP/TLS connection can be kept alive or timed out as long as 1961 the selected channel protocol has a secure association between Alice 1962 and Bob. When it terminates, it needs to be re-negotiated via GRASP/ 1963 TLS.
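The throughput objective described earlier in this appendix (prefer the encapsulation with the higher performance of the slower side) can be illustrated with a small, non-normative Python sketch. Everything in it is an assumption made for illustration: the function name select_channel, the protocol labels, and the idea that each peer advertises one numeric performance value per channel protocol are not defined by this document or by GRASP.

```python
from typing import Dict, Optional


def select_channel(alice: Dict[str, float],
                   bob: Dict[str, float]) -> Optional[str]:
    """Pick the channel protocol whose slower side is fastest (max-min).

    `alice` and `bob` map a hypothetical channel protocol label to the
    performance value that peer advertises for it (higher is better).
    Returns None when the peers share no channel protocol.
    """
    common = set(alice) & set(bob)
    if not common:
        return None
    # Sort the labels first so that both peers break ties identically:
    # max() returns the first maximal element it encounters.
    return max(sorted(common),
               key=lambda name: min(alice[name], bob[name]))


# Hypothetical advertisements; the numbers are illustrative only.
alice_offer = {"ipsec-native": 10.0, "ipsec-gre": 8.0, "dtls": 2.0}
bob_offer = {"ipsec-native": 4.0, "ipsec-gre": 6.0}

# The slower side limits each option: ipsec-native -> 4.0, ipsec-gre -> 6.0.
print(select_channel(alice_offer, bob_offer))
```

Because the function is symmetric in its two arguments, both peers arrive at the same answer when they evaluate it on the same pair of advertisements, which avoids the limitation noted above of letting one side (Alice) decide alone.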
1965 Notes: 1967 o Negotiation of a channel type may require IANA assignments of code 1968 points. 1970 o TLS is subject to reset attacks, which IKEv2 is not. Normally, 1971 ACP connections (as specified in this document) will be over link- 1972 local addresses so the attack surface for this one issue in TCP is 1973 highly reduced. 1975 o GRASP packets received inside a TLS connection established for 1976 GRASP/TLS ACP negotiation are assigned to a separate GRASP domain 1977 unique to that TLS connection. 1979 Authors' Addresses 1981 Michael H. Behringer (editor) 1983 Email: mchael.h.behringer@gmail.com 1985 Toerless Eckert (editor) 1986 Futurewei Technologies Inc. 1987 2330 Central Expy 1988 Santa Clara 95050 1989 USA 1991 Email: tte+ietf@cs.fau.de 1993 Steinthor Bjarnason 1994 Arbor Networks 1995 2727 South State Street, Suite 200 1996 Ann Arbor MI 48104 1997 United States 1999 Email: sbjarnason@arbor.net