ANIMA                                                     T. Eckert, Ed.
Internet-Draft                                                    Huawei
Intended status: Informational                              M. Behringer
Expires: February 3, 2018                                 August 2, 2017

  Using Autonomic Control Plane for Stable Connectivity of Network OAM
                draft-ietf-anima-stable-connectivity-05

Abstract

   OAM (Operations, Administration and Maintenance, as per BCP 161,
   RFC 6291) processes for data networks are often subject to the
   problem of circular dependencies when relying on connectivity
   provided by the network that is to be managed for the OAM purposes.
   Provisioning while bringing up devices and networks tends to be more
   difficult to automate than service provisioning later on; changes in
   core network functions impacting reachability cannot be automated
   because of ongoing connectivity requirements for the OAM equipment
   itself; and widely used OAM protocols are not secure enough to be
   carried across the network without security concerns.

   This document describes how to integrate OAM processes with the
   Autonomic Control Plane (ACP) in Autonomic Networks (AN) in order to
   provide stable and secure connectivity for those OAM processes.
   This connectivity is not subject to the aforementioned circular
   dependencies.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.  It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

   This Internet-Draft will expire on February 3, 2018.

Copyright Notice

   Copyright (c) 2017 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document.
   Code Components extracted from this document must include Simplified
   BSD License text as described in Section 4.e of the Trust Legal
   Provisions and are provided without warranty as described in the
   Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Self-Dependent OAM Connectivity
     1.2.  Data Communication Networks (DCNs)
     1.3.  Leveraging the ACP
   2.  Solutions
     2.1.  Stable Connectivity for Centralized OAM
       2.1.1.  Simple Connectivity for Non-ACP-capable NMS Hosts
       2.1.2.  Challenges and Limitations of Simple Connectivity
       2.1.3.  Simultaneous ACP and Data Plane Connectivity
       2.1.4.  IPv4-only NMS Hosts
       2.1.5.  Path Selection Policies
       2.1.6.  Autonomic NOC Device/Applications
       2.1.7.  Encryption of Data-Plane Connections
       2.1.8.  Long Term Direction of the Solution
     2.2.  Stable Connectivity for Distributed Network/OAM
   3.  Architectural Considerations
     3.1.  No IPv4 for ACP
   4.  Security Considerations
   5.  IANA Considerations
   6.  Acknowledgements
   7.  Change log [RFC Editor: Please remove]
   8.  References
   Authors' Addresses

1.  Introduction

1.1.
Self-Dependent OAM Connectivity

   OAM (Operations, Administration and Maintenance, as per BCP 161,
   [RFC6291]) for data networks is often subject to the problem of
   circular dependencies when relying on the connectivity service
   provided by the network to be managed.  OAM can easily but
   unintentionally break the connectivity required for its own
   operations.  Avoiding these problems can lead to complexity in OAM.
   This document describes this problem and how to use the Autonomic
   Control Plane (ACP) to solve it without further OAM complexity:

   The ability to perform OAM on a network device first requires the
   execution of the OAM necessary to create network connectivity to
   that device in all intervening devices.  This typically leads to a
   sequential, 'expanding ring' configuration from a NOC (Network
   Operations Center).  It also leads to tight dependencies between
   provisioning tools and the security enrollment of devices.  Any
   process that wants to enroll multiple devices along a newly deployed
   network topology needs to tightly interlock with the provisioning
   process that creates connectivity before the enrollment can move on
   to the next device.

   When performing change operations on a network, it is likewise
   necessary to ensure at every step of the process that there is no
   interruption of connectivity that could lead to loss of connectivity
   to remote devices.  This especially includes change provisioning of
   routing, forwarding, security and addressing policies in the
   network, which often occurs through mergers and acquisitions, the
   introduction of IPv6, or other major overhauls of the infrastructure
   design.  Examples include changing an IGP or its areas, moving from
   PA (Provider-Aggregatable) to PI (Provider-Independent) addressing,
   or systematic topology changes (such as L2 to L3 changes).

   All these circular dependencies make OAM complex and potentially
   fragile.
   When automation is used, for example through provisioning systems,
   this complexity extends into that automation software.

1.2.  Data Communication Networks (DCNs)

   In the late 1990s and early 2000s, IP networks became the method of
   choice for building separate OAM networks for the communications
   infrastructure of network providers.  This concept was standardized
   in ITU-T G.7712/Y.1703 [ITUT] and called "Data Communications
   Networks" (DCN).  These were (and still are) physically separate
   IP(/MPLS) networks that provide access to the OAM interfaces of all
   equipment that has to be managed, from PSTN (Public Switched
   Telephone Network) switches and optical equipment to today's
   Ethernet and IP/MPLS production network equipment.

   Such DCNs provide stable connectivity that is not subject to the
   aforementioned problems because they are entirely separate networks:
   configuration changes to the production IP network are performed via
   the DCN but never affect the DCN's own configuration.  Of course,
   this approach comes at the cost of buying and operating a separate
   network, a cost that is not feasible for many providers, most
   notably smaller providers, most enterprises and typical IoT
   (Internet of Things) networks.

1.3.  Leveraging the ACP

   One of the goals of the Autonomic Networks Autonomic Control Plane
   (ACP, as defined in [I-D.ietf-anima-autonomic-control-plane]) is to
   provide stable connectivity similar to a DCN, but without having to
   build a separate DCN.  It is clear that such an 'in-band' approach
   can never fully achieve the same level of separation, but the goal
   is to get as close to it as possible.

   This solution approach has several aspects.
   One aspect is designing the implementation of the ACP in network
   devices so that it actually keeps performing without interruption by
   changes in what we will call in this document the "data-plane",
   a.k.a. the operator- or controller-configured service planes of the
   network equipment.  This aspect is not currently covered in this
   document.

   Another aspect is how to leverage the stable IPv6 connectivity
   provided by the ACP for OAM purposes.  This is the current scope of
   this document.

2.  Solutions

2.1.  Stable Connectivity for Centralized OAM

   The ANI is the "Autonomic Networking Infrastructure", consisting of
   secure zero-touch Bootstrap (BRSKI -
   [I-D.ietf-anima-bootstrapping-keyinfra]), the GeneRic Autonomic
   Signaling Protocol (GRASP - [I-D.ietf-anima-grasp]), and the
   Autonomic Control Plane (ACP -
   [I-D.ietf-anima-autonomic-control-plane]).  Refer to
   [I-D.ietf-anima-reference-model] for an overview of the ANI and how
   its components interact, and to [RFC7575] for concepts and
   terminology of the ANI and autonomic networks.

   This section describes stable connectivity for centralized OAM via
   ACP/ANI, starting with what we expect to be the easiest option to
   deploy in the short term.  It then describes the limitations and
   challenges of that approach and their solutions/workarounds,
   finishing with the preferred target option of autonomic NOC devices
   in Section 2.1.6.

   This order was chosen because it helps to explain how simple initial
   use of the ACP can be, how difficult workarounds can become (and
   therefore what to avoid), and finally because one very promising
   long-term solution alternative is exactly like the easiest short-
   term solution, only virtualized and automated.

   In the most common case, OAM will be performed by one or more
   applications running on a variety of centralized NOC systems that
   communicate with network devices.
   We describe approaches of increasing sophistication for leveraging
   the ACP for stable connectivity.  There is a wide range of options,
   some of which are simple, some more complex.

   Three stages can be considered:

   o  There are simple options described in Sections 2.1.1 through
      2.1.3 that we consider to be good starting points to
      operationalize the use of the ACP for stable connectivity today.
      These options require only network and OAM/NOC device
      configuration.

   o  There are workarounds to connect the ACP to non-IPv6-capable NOC
      devices through the use of IPv4/IPv6 NAT (Network Address
      Translation) as described in Section 2.1.4.  These workarounds
      are not recommended, but if such non-IPv6-capable NOC devices
      need to be used longer term, then this is the only option to
      connect them to the ACP.

   o  Near- to long-term options can provide all the desired
      operational, zero-touch and security benefits of an autonomic
      network, but a range of details for this still has to be worked
      out, and development work on NOC/OAM equipment is necessary.
      These options are discussed in Sections 2.1.5 through 2.1.8.

2.1.1.  Simple Connectivity for Non-ACP-capable NMS Hosts

   In the simplest candidate deployment case, the ACP extends all the
   way into the NOC via one or more "ACP edge devices" as defined in
   Section 6.1 of [I-D.ietf-anima-autonomic-control-plane].  These
   devices "leak" the (otherwise encrypted) ACP natively to NMS hosts.
   They act as the default router for those NMS hosts and provide them
   with IPv6 connectivity into the ACP.  NMS hosts with this setup need
   to support IPv6 (see e.g. [RFC6434]) but require no other
   modifications to leverage the ACP.
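To illustrate how little is required on the NMS host side, recall that ACP addresses are ordinary IPv6 Unique Local Addresses (ULAs), so any IPv6-capable tooling handles them unmodified. A minimal Python sketch (the address below is purely illustrative, not a real ACP assignment):

```python
import ipaddress

# Illustrative ACP address; ACP addressing is drawn from IPv6 ULA space
# (fc00::/7), so it parses and routes like any other IPv6 ULA.
acp_addr = ipaddress.IPv6Address("fd89:b714:f3db:0:8000::2")

# Standard IPv6 stacks and tools need no ACP-specific changes:
assert acp_addr.is_private and not acp_addr.is_global
print(acp_addr.compressed)
```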
   Note that even though the ACP only uses IPv6, it can of course
   support OAM for any type of network deployment as long as the
   network devices support the ACP: the Data Plane can be IPv4-only,
   dual-stack or IPv6-only.  It is always separate from the ACP, so
   there is no dependency between the ACP and the IP version(s) used in
   the Data Plane.

   This setup is sufficient for troubleshooting tasks such as SSH into
   network devices, NMS that performs SNMP read operations for status
   checking, software downloads into autonomic devices, provisioning of
   devices via NETCONF, and so on.  In conjunction with otherwise
   unmodified OAM via separate NMS hosts, it can provide a good subset
   of the stable connectivity goals.  The limitations of this approach
   are discussed in the next section.

   Because the ACP provides 'only' IPv6 connectivity, and because the
   addressing provided by the ACP does not include any topological
   addressing structure that operations in a NOC often rely on to
   recognize where devices are on the network, it is likely highly
   desirable to set up DNS (Domain Name System - see [RFC1034]) so that
   the ACP IPv6 addresses of autonomic devices are known via domain
   names that include the desired structure.  For example, if DNS in
   the network was set up with names for network devices as
   devicename.noc.example.com, and the well-known structure of the Data
   Plane IPv4 address space was used by operators to infer the region
   where a device is located, then the ACP address of that device could
   be set up as devicename_<region>.acp.noc.example.com, and
   devicename.acp.noc.example.com could be a CNAME to
   devicename_<region>.acp.noc.example.com.  Note that many networks
   already use names for network equipment that include topological
   information, even without an ACP.

2.1.2.
Challenges and Limitations of Simple Connectivity

   This simple connectivity for non-autonomic NMS hosts suffers from a
   range of challenges (that is, operators may not be able to do it
   this way) and limitations (that is, operators cannot achieve the
   desired goals with this setup).  The following list summarizes these
   challenges and limitations.  The following sections describe
   additional mechanisms to overcome them.

   Note that these challenges and limitations exist because the ACP is
   primarily designed to support distributed ASAs (Autonomic Service
   Agents, pieces of autonomic software) in the most lightweight
   fashion, without mandating support for additional mechanisms that
   would best support centralized NOC operations.  It is this document
   that describes such additional (short-term) workarounds and (long-
   term) extensions.

   1.  (Limitation) NMS hosts cannot directly probe whether the desired
       so-called 'data-plane' network connectivity works because they
       do not have direct access to it.  This problem is similar to
       probing connectivity for other services (such as VPN services)
       that they do not have direct access to, so the NOC may already
       employ appropriate mechanisms to deal with this issue (probing
       proxies).  See Section 2.1.3 for candidate solutions.

   2.  (Challenge) NMS hosts need to support IPv6, which is often still
       not possible in enterprise networks.  See Section 2.1.4 for some
       workarounds.

   3.  (Limitation) Performance of the ACP will be limited compared to
       normal 'data-plane' connectivity.  The setup of the ACP will
       often support only non-hardware-accelerated forwarding.  Running
       a large amount of traffic through the ACP, especially for tasks
       where this is not necessary, will reduce its performance/
       effectiveness for those operations where it is necessary or
       highly desirable.  See Section 2.1.5 for candidate solutions.

   4.
   (Limitation) Security of the ACP is reduced by exposing the ACP
       natively (and unencrypted) into a subnet in the NOC where the
       NOC devices are attached to it.  See Section 2.1.7 for candidate
       solutions.

   These four problems can be tackled independently of each other by
   solution improvements.  Combining some of these solution
   improvements can lead towards a candidate long-term solution.

2.1.3.  Simultaneous ACP and Data Plane Connectivity

   Simultaneous connectivity to both the ACP and the data-plane can be
   achieved in a variety of ways.  If the data-plane is IPv4-only, then
   any method for dual-stack attachment of the NOC device/application
   will suffice: IPv6 connectivity from the NOC provides access via the
   ACP, IPv4 provides access via the data-plane.  If, as explained
   above in the simple case, an autonomic device supports native
   attachment to the ACP, and the existing NOC setup is IPv4-only, then
   it could be sufficient to attach the ACP device(s) as the IPv6
   default router to the NOC subnet and keep the existing IPv4 default
   router setup unchanged.

   If the data-plane of the network also supports IPv6, then the most
   compatible setup for NOC devices is to have two IPv6 interfaces:
   one virtual (e.g. via IEEE 802.1Q [IEEE802.1Q]) or physical
   interface connecting to a data-plane subnet, and another connecting
   into an ACP connect subnet as specified in the ACP connect section
   of [I-D.ietf-anima-autonomic-control-plane].  That document also
   specifies how the NOC devices can receive autoconfigured addressing
   and routes towards the ACP connect subnet if they support [RFC6724]
   and [RFC4191].

   Configuring a second interface on a NOC host may be impossible or
   may be seen as undesired complexity.
   In that case, the ACP edge device needs to provide support for a
   "Combined ACP and Data Plane interface" as also described in the ACP
   connect section of [I-D.ietf-anima-autonomic-control-plane].  This
   setup may not work with autoconfiguration and all NOC host network
   stacks due to limitations in those network stacks: they need to be
   able to perform RFC 6724 source address selection rule 5.5,
   including caching of next-hop information.  See the ACP document for
   more details.

   For security reasons, the ACP document does not consider it
   appropriate to connect a non-ACP router to an ACP connect interface.
   The reason is that the ACP is a secured network domain, and all NOC
   devices connecting via ACP connect interfaces are also part of that
   secure domain - the main difference is that the physical link
   between the ACP edge device and the NOC devices is not
   authenticated/encrypted and therefore needs to be physically
   secured.  If the secure ACP could be extended via untrusted routers,
   it would be a lot more difficult to verify the secure-domain
   assertion.  Therefore, ACP edge devices are not supposed to
   redistribute routes from non-ACP routers into the ACP.

2.1.4.  IPv4-only NMS Hosts

   The ACP does not support IPv4: the target is single-stack IPv6
   management of the network via the ACP and (as needed) the data
   plane, independent of whether the data plane is dual-stack, offers
   IPv4 as a service, or is single-stack IPv6.  Dual-plane management -
   IPv6 for the ACP, IPv4 for the data plane - is likewise an
   architecturally simple option.

   The implication of this architectural decision is the potential need
   for short-term workarounds when the operational practices in a
   network do not yet meet these target expectations.  This section
   explains when and why these workarounds may be operationally
   necessary and describes them.
   However, the long-term goal is to upgrade all NMS hosts to native
   IPv6, so the workarounds described in this section should not be
   considered permanent.

   Most network equipment today supports IPv6, but IPv6 is by far not
   ubiquitously supported in NOC backend solutions (hardware/software),
   especially not in the product space for enterprises.  Even when it
   is supported, there are often additional limitations or issues when
   using it in a dual-stack setup, or the operator mandates single
   stack for all operations for the sake of simplicity.  For these
   reasons, an IPv4-only management plane is still required and common
   practice in many enterprises.  Without the desire to leverage the
   ACP, this required and common practice is not a problem for those
   enterprises, even when they run dual stack in the network.  We
   discuss these workarounds here because they are a short-term
   deployment challenge specific to the operation of the ACP.

   To connect IPv4-only management-plane devices/applications with the
   ACP, some form of IP/ICMP translation of packets between IPv4 and
   IPv6 is necessary.  The basic mechanisms for this are defined in
   SIIT ([RFC7915]).  There are multiple solutions using these
   mechanisms.  To understand the possible solutions, we consider the
   requirements:

   1.  NMS hosts need to be able to initiate connections to any ACP
       device for management purposes.  Examples include provisioning
       via NETCONF (over SSH), SNMP poll operations, or just
       diagnostics via SSH connections from operators.  Every ACP
       device/function that needs to be reachable from NMS hosts needs
       to have a separate IPv4 address.

   2.  ACP devices need to be able to initiate connections to NMS
       hosts, for example to initiate NTP or RADIUS/Diameter
       connections, to send syslog messages or SNMP traps, or to
       initiate NETCONF Call Home connections after bootstrap.  Every
       NMS host needs to have a separate IPv6 address reachable from
       the ACP.
       When connections from ACP devices are made to NMS hosts, the
       IPv4 source address of these connections as seen by the NMS host
       must also be unique per ACP device and be the same address as in
       (1), to maintain the same addressing simplicity as in a native
       IPv4 deployment.  For example, in syslog, the source IP address
       of a logging device is used to identify it, and if the device
       shows problems, an operator might want to SSH into the device to
       diagnose it.

   Because of these requirements, the necessary and sufficient set of
   solutions are those that provide a 1:1 mapping of ACP IPv6 addresses
   into IPv4 space and a 1:1 mapping of IPv4 NMS host space into IPv6
   (for use in the ACP).  This means that stateless SIIT-based
   solutions are sufficient and preferred.

   Note that ACP devices may use multiple IPv6 addresses in the ACP,
   depending on which Sub-Scheme they use.  For example, in the Zone
   Sub-Scheme, an ACP device could use two addresses, one with the last
   address bit (V-bit) set to 0 and one with it set to 1.  Both
   addresses may need to be reachable through the IPv6/IPv4 address
   translation.

   The need to allocate one or multiple IPv4 addresses for every ACP
   device should not be a problem if - as we assume - the NMS hosts can
   use private IPv4 address space ([RFC1918]).  Nevertheless, even with
   RFC 1918 address space, it is important that the ACP IPv6 addresses
   can be mapped efficiently into IPv4 address space without too much
   waste.

   The currently most flexible mapping scheme to achieve this is
   [RFC7757] because it allows configured IPv4 <-> IPv6 prefix
   mappings.  Assume the ACP uses the Zone Addressing Sub-Scheme and
   there are 3 registrars.  In the Zone Addressing Sub-Scheme, there is
   for each registrar a constant /112 prefix for which an RFC 7757 EAM
   (Explicit Address Mapping) into a /16 (e.g. RFC 1918) IPv4 prefix
   can be configured.
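To make such an EAM-style mapping concrete, the following Python sketch statelessly translates between a hypothetical registrar /112 ACP prefix and an RFC 1918 /16 by carrying the low-order 16 bits (Device-Number plus V-bit) across. Both prefixes are assumptions for illustration; in a real deployment this mapping would be configured on the SIIT translator at the ACP edge, not implemented in application code.

```python
import ipaddress

# Hypothetical EAM (RFC 7757 style) entry: one registrar's constant
# Zone Sub-Scheme /112 prefix mapped to an RFC 1918 /16 prefix.
ACP_PREFIX = ipaddress.ip_network("fd89:b714:f3db:0:8000::/112")  # illustrative
IPV4_PREFIX = ipaddress.ip_network("10.1.0.0/16")                 # illustrative

def acp_to_ipv4(acp_addr: str) -> ipaddress.IPv4Address:
    """Map an ACP IPv6 address to IPv4 by carrying over the low 16 bits
    (15-bit Device-Number plus V-bit)."""
    addr = ipaddress.IPv6Address(acp_addr)
    if addr not in ACP_PREFIX:
        raise ValueError("address outside the mapped ACP /112 prefix")
    return ipaddress.IPv4Address(
        int(IPV4_PREFIX.network_address) | (int(addr) & 0xFFFF))

def ipv4_to_acp(v4_addr: str) -> ipaddress.IPv6Address:
    """Inverse, stateless direction of the same 1:1 mapping."""
    v4 = ipaddress.IPv4Address(v4_addr)
    if v4 not in IPV4_PREFIX:
        raise ValueError("address outside the mapped IPv4 /16 prefix")
    return ipaddress.IPv6Address(
        int(ACP_PREFIX.network_address) | (int(v4) & 0xFFFF))
```

Because the mapping is a pure function of the address bits, both directions stay consistent without any per-connection state, which is what makes stateless SIIT sufficient here.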
   Within a registrar's /112 prefix, Device-Numbers for devices are
   assigned sequentially; with the V-bit, effectively two numbers are
   assigned per ACP device.  This also means that if IPv4 address space
   is even more constrained, and it is known that a registrar will
   never need the full 15-bit range of Device-Numbers, then a prefix
   longer than /112 can be configured into the EAM to use less IPv4
   space.

   When using the Vlong Addressing Sub-Scheme, it is unlikely that one
   wants or needs to translate the full 8 or 16 bits of per-device
   addressing space into IPv4.  In this case, the EAM rules for
   dropping trailing bits can be used to map only N of the V-bits into
   IPv4.  This does imply, though, that only V-addresses that differ in
   those high-order N V-bits can be distinguished on the IPv4 side.

   Likewise, the IPv4 address space used for NMS hosts can easily be
   mapped into an ACP prefix assigned to an ACP connect interface.

   A full specification of a solution to perform SIIT in conjunction
   with ACP connect following the considerations below is outside the
   scope of this document.

   To comply with security expectations, SIIT has to happen on the ACP
   edge device itself so that ACP security considerations can be taken
   into account, e.g. so that IPv4-only NMS hosts can be dealt with
   exactly like IPv6 hosts connected to an ACP connect interface.

   Note that prior solutions such as NAT64 ([RFC6146]) may be equally
   usable to translate between ACP IPv6 address space and NMS host IPv4
   address space, and that as a workaround this can also be done on
   non-ACP edge devices connected to an ACP connect interface.  The
   details vary depending on the implementation because the options to
   configure address mappings vary widely.
   Outside of EAM, there are no standardized solutions that allow for
   the mapping of prefixes, so it will most likely be necessary to
   explicitly map every individual (/128) ACP device address to an IPv4
   address.  Such an approach should use automation/scripting, in which
   these address translation entries are created dynamically whenever
   an ACP device is enrolled or first connected to the ACP network.

   Overall, the use of NAT is especially subject to ROI (Return on
   Investment) considerations, but the methods described here may not
   be too different from the same problems encountered entirely
   independently of AN/ACP when some parts of the network are to
   introduce IPv6 but NMS hosts are not (yet) upgradeable.

2.1.5.  Path Selection Policies

   As mentioned above, the ACP is not expected to have high performance
   because its primary goals are connectivity and security, and for
   existing network device platforms this often means that it is a lot
   more effort to implement that additional connectivity with hardware
   acceleration than without - especially because of the desire to
   support full encryption across the ACP to achieve the desired
   security.

   Some of these issues may go away in the future with further adoption
   of the ACP and network device designs that better cater to the needs
   of a separate OAM plane, but it is wise to plan even for long-term
   designs of the solution that do NOT depend on high performance of
   the ACP.  This is the opposite of the expectation that future NMS
   hosts will have IPv6, which makes any considerations for IPv4/NAT in
   this solution temporary.

   To overcome the expected performance limitations of the ACP, we do
   expect to have the above-described dual connectivity via both ACP
   and data-plane between NOC application devices and AN devices with
   ACP.
   The ACP connectivity is expected to always be there (as soon as a
   device is enrolled), but the data-plane connectivity is only present
   under normal operations and will not be present during e.g. early
   stages of device bootstrap, failures, provisioning mistakes or
   network configuration changes.

   The desired policy is therefore as follows: in the absence of
   further security considerations (see below), traffic between NMS
   hosts and AN devices should prefer data-plane connectivity and
   resort to using the ACP only when necessary, unless the operation is
   known to be so closely tied to the cases where the ACP is necessary
   that it makes no sense to try using the data-plane.  An example is
   of course the SSH connection from the NOC into a network device to
   troubleshoot network connectivity; this could easily always rely on
   the ACP.  Likewise, if an NMS host is known to transmit large
   amounts of data, and it uses the ACP, then its performance needs to
   be controlled so that it will not overload the ACP.  Typical
   examples of this are software downloads.

   There is a wide range of methods to build up these policies.  We
   describe a few:

   Ideally, a NOC system would learn and keep track of all addresses of
   a device (the ACP address and the various data-plane addresses).
   Every action of the NOC system would indicate via a "path policy"
   what type of connection it needs (e.g. data-plane only, ACP only,
   default to data-plane with fallback to ACP, ...).  A connection-
   policy manager would then build the connection to the target using
   the right address(es).  In the shorter term, a common practice is to
   identify different paths to a device via different names (e.g.
   loopback vs. interface addresses).  This approach can be expanded to
   ACP use, whether with NOC-system-local names or DNS.
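The per-action path policy described above can be sketched as a trivial address selector; device names, policy labels and addresses here are invented for illustration only:

```python
# Hypothetical NOC address inventory: ACP and data-plane address per device.
DEVICE_ADDRS = {
    "rtr1": {"data_plane": "192.0.2.10", "acp": "fd89:b714:f3db::10"},
}

def select_address(device: str, path_policy: str, data_plane_up: bool) -> str:
    """Pick the destination address for an OAM action based on its policy."""
    addrs = DEVICE_ADDRS[device]
    if path_policy == "acp-only":           # e.g. connectivity troubleshooting
        return addrs["acp"]
    if path_policy == "data-plane-only":    # e.g. bulk software download
        return addrs["data_plane"]
    # Default policy: prefer the data-plane, fall back to the always-on ACP.
    return addrs["data_plane"] if data_plane_up else addrs["acp"]
```

A real connection-policy manager would additionally track liveness of both paths and rate-limit ACP usage, but the selection logic reduces to this kind of per-action preference table.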
   We describe example schemes using DNS:

   DNS can be used to set up names for the same network devices but
   with different addresses assigned: one name (name.noc.example.com)
   with only the data-plane address(es) (IPv4 and/or IPv6), to be used
   for probing connectivity or performing routine software downloads
   that may stall/fail when there are connectivity issues; one name
   (name-acp.noc.example.com) with only the ACP-reachable address of
   the device, for troubleshooting and for probing/discovery that
   should always use only the ACP; and one name with both data-plane
   and ACP addresses (name-both.noc.example.com).

   Traffic policing and/or shaping at the ACP edge in the NOC can be
   used to throttle applications such as software downloads into the
   ACP.

   MPTCP (Multipath TCP - see [RFC6824]) is a very attractive candidate
   for automating the use of both data-plane and ACP and minimizing or
   fully avoiding the need for the above-mentioned logical names to
   preselect the desired connectivity (data-plane only, ACP only,
   both).  For example, a setup for non-MPTCP-aware applications would
   be as follows:

   DNS naming is set up to provide the ACP IPv6 address of network
   devices.  Unbeknownst to the application, MPTCP is used.  MPTCP
   mutually discovers between the NOC and the network device the data-
   plane address and carries all traffic across it once that MPTCP
   subflow across the data-plane can be built.

   In autonomic network devices where the data-plane and the ACP are in
   separate VRFs, it is clear that this type of MPTCP subflow creation
   across different VRFs is new/added functionality.  Likewise, the
   policy of preferring a particular address (NOC device) or VRF (AN
   device) for the traffic is potentially also a policy not provided as
   standard functionality.

2.1.6.
Autonomic NOC Device/Applications 566 Setting up connectivity between the NOC and autonomic devices when 567 the NOC device itself is non-autonomic is, as mentioned in the 568 beginning, a security issue. It also results, as shown in the previous 569 paragraphs, in a range of connectivity considerations, some of which 570 may be quite undesirable or complex to operationalize. 572 Making NMS hosts autonomic and having them participate in the ACP is 573 therefore not only a highly desirable solution to the security 574 issues, but can also provide a likely easier operationalization of 575 the ACP because it minimizes NOC-special edge considerations - the 576 ACP is simply built all the way automatically, even inside the NOC, 577 and only authorized and authenticated NOC devices/applications will 578 have access to it. 580 Supporting the ACP all the way into an application device requires 581 implementing the following aspects in it: AN bootstrap/enrollment 582 mechanisms, the secure channel for the ACP and at least the host side 583 of IPv6 routing setup for the ACP. Minimally, this could all be 584 implemented as an application and be made available to the host OS 585 via e.g. a tap driver to make the ACP show up as another IPv6-enabled 586 interface. 588 Having said this: If the structure of NMS hosts is transformed 589 through virtualization anyhow, then it may be considered equally 590 secure and appropriate to construct a (physical) NMS host system by 591 combining a virtual AN/ACP-enabled router with non-AN/ACP-enabled 592 NOC-application VMs via a hypervisor, leveraging the configuration 593 options described in the previous sections but just virtualizing 594 them. 596 2.1.7.
Encryption of data-plane connections 598 When combining ACP and data-plane connectivity for availability and 599 performance reasons, this too has an impact on security: When using 600 the ACP, the traffic will be mostly encryption protected, especially 601 when considering the above-described use of AN application devices. 602 If instead the data-plane is used, then this is not the case anymore 603 unless it is done by the application. 605 The simplest solution for this problem exists when using AN-capable 606 NMS hosts, because in that case the communicating AN-capable NMS host 607 and the AN network device have certificates through the AN enrollment 608 process that they can mutually trust (same AN domain). As a result, 609 data-plane connectivity that does support this can simply leverage 610 TLS/DTLS ([RFC5246]/[RFC6347]) with mutual AN-domain certificate 611 authentication - and does not incur new key management. 613 If this automatic security benefit is seen as most important, but a 614 "full" ACP stack into the NMS host is unfeasible, then it would still 615 be possible to design a stripped-down version of AN functionality for 616 such NOC hosts that only provides enrollment of the NOC host into the 617 AN domain to the extent that the host receives an AN domain 618 certificate, but without directly participating in the ACP 619 afterwards. Instead, the host would just leverage TLS/DTLS using its 620 AN certificate via the data-plane with AN network devices as well as 621 indirectly via the ACP with the above-mentioned in-NOC network edge 622 connectivity into the ACP. 624 When using the ACP itself, TLS/DTLS for the transport layer between 625 NMS hosts and network devices is somewhat of a double price to pay 626 (the ACP also encrypts) and could potentially be optimized away, but 627 given the assumed lower performance of the ACP, it seems that this is 628 an unnecessary optimization. 630 2.1.8.
Long Term Direction of the Solution 632 If we consider what potentially could be the most lightweight and 633 autonomic long-term solution based on the technologies described 634 above, we see the following direction: 636 1. NMS hosts should at least support IPv6. IPv4/IPv6 NAT in the 637 network to enable use of the ACP is long-term undesirable. Having 638 IPv4-only applications automatically leverage IPv6 connectivity 639 via host-stack translation may be an option, but this has not been 640 investigated yet. 642 2. Build the ACP as a lightweight application for NMS hosts so the ACP 643 extends all the way into the actual NMS hosts. 645 3. Leverage, and as necessary enhance, MPTCP with automatic dual- 646 connectivity: If an MPTCP-unaware application is using ACP 647 connectivity, the policies used should add subflow(s) via the 648 data-plane and prefer them. 650 4. Consider how to best map NMS host desires to underlying transport 651 mechanisms: With the three points above, not all options 652 are covered. Depending on the OAM, one may still want only ACP, 653 only data-plane, or automatically prefer one over the other and/ 654 or use the ACP at low or high performance (for 655 emergency OAM such as countering DDoS). As of today, it is not 656 clear what the simplest set of tools is to enable explicitly the 657 choice of desired behavior of each OAM. The use of the above 658 mentioned DNS and MPTCP mechanisms is a start, but this will 659 require additional thought. This is likely a specific case of 660 the more generic scope of TAPS. 662 2.2. Stable Connectivity for Distributed Network/OAM 664 The ANI (ACP, Bootstrap, GRASP) can provide via the GRASP protocol 665 common direct-neighbor discovery and capability negotiation (GRASP 666 via ACP and/or data-plane) and stable and secure connectivity for 667 functions running distributed across network devices (GRASP via ACP).
It 668 can therefore eliminate the need to re-implement similar functions in 669 each distributed function in the network. Today, every distributed 670 protocol does this with functional elements usually called "Hello" 671 mechanisms and with often protocol-specific security mechanisms. 673 KARP (Keying and Authentication for Routing Protocols, see [RFC6518]) 674 has tried to start providing common directions and therefore reduce the 675 re-invention of at least some of the security aspects, but it only 676 covers routing protocols and it is unclear how well it is applicable to 677 a potentially wider range of distributed network agents such as those 678 performing distributed OAM. The ACP can help in these cases. 680 3. Architectural Considerations 682 3.1. No IPv4 for ACP 684 The ACP is targeted to be IPv6 only, and the prior explanations in 685 this document show that this can lead to some complexity when having 686 to connect IPv4-only NOC solutions, and that it will be impossible to 687 leverage the ACP when the OAM agents on an ACP network device do not 688 support IPv6. Therefore, the question was raised whether the ACP 689 should optionally also support IPv4. 691 The decision not to include IPv4 for the ACP in the use cases 692 considered in this document was made for the 693 following reasons: 695 In SP networks that have started to support IPv6, often the next 696 planned step is to consider moving IPv4 out of native transport, 697 offering it just as a service on the edge. There is no benefit/need for multiple 698 parallel transport families within the network, and standardizing on 699 one reduces OPEX and improves reliability. This evolution in the 700 data plane makes it highly unlikely that investing development cycles 701 into IPv4 support for the ACP will have a longer-term benefit or enough 702 critical short-term use-cases. Supporting only IPv6 for the ACP is 703 purely a strategic choice to focus on the known important long-term 704 goals.
706 In other types of networks as well, we think that the effort to support 707 autonomic networking is better spent ensuring that one address 708 family will be supported, so that all use cases will work with it long term, 709 instead of duplicating effort into IPv4. Especially because auto- 710 addressing for the ACP with IPv4 would be more complex than in IPv6 711 due to the limited IPv4 addressing space. 713 4. Security Considerations 715 In this section, we discuss only security considerations not covered 716 in the appropriate sub-sections of the solutions described. 718 Even though ACPs are meant to be isolated, explicit operator 719 misconfiguration to connect to insecure OAM equipment and/or bugs in 720 ACP devices may cause leakage into places where it is not expected. 721 Mergers/Acquisitions and other complex network reconfigurations 722 affecting the NOC are typical examples. 724 ACP prefix addresses are ULA addresses. Using these addresses also 725 for NOC devices as proposed in this document is not only necessary 726 for the simple routing functionality explained above, but is also more 727 secure than global IPv6 addresses. ULA addresses are not routed in 728 the global Internet and will therefore be subject to more filtering 729 even in places where specific ULA addresses are being used. Packets 730 are therefore less likely to leak or to be successfully injected into 731 the isolated ACP environment. 733 The random nature of a ULA prefix provides strong protection against 734 address collision even though there is no central assignment 735 authority. This is helped by the expectation that ACPs are never 736 expected to connect all together; only a few ACPs may ever need to 737 connect together, e.g. when mergers and acquisitions occur. 739 The ACP specification demands that only packets from configured ACP 740 prefixes are permitted from ACP connect interfaces.
It also requires 741 that RPL root ACP devices need to be able to diagnose unknown ACP 742 destination addresses. 744 To help diagnose packets that unexpectedly leaked, for example, from 745 another ACP (that was meant to be deployed separately), it can be 746 useful to voluntarily list your own ULA ACP prefixes on one of 747 the sites on the Internet, for example 748 https://www.sixxs.net/tools/grh/ula/. Note that this does not 749 constitute registration, and if you want to ensure that your leaked 750 ACP packets can be recognized to come from you, you may need to list 751 your prefixes on several of those sites. 753 Note that there is a provision in [RFC4193] for non-locally assigned 754 address space (L bit = 0), but there is no existing standardization 755 for this, so these ULA prefixes must not be used. 757 According to RFC4193 section 4.4, PTR records for ULA addresses 758 should not be installed into the global DNS (no guaranteed 759 ownership). Hence also the need to rely on voluntary lists (as in the 760 prior paragraph) to make the use of a ULA prefix globally known. 762 Nevertheless, some legacy OAM applications running across the ACP may 763 rely on reverse DNS lookup for authentication of requests (e.g. TFTP 764 for download of network firmware/config/software). Operators may 765 therefore use split-horizon DNS to provide global PTR records for 766 their own ULA prefixes only into their own domain to continue relying 767 on this method. Given the security of the ACP, this may even 768 increase the security of such legacy methods. 770 Any current and future protocols must rely on secure end-to-end 771 communications (TLS/DTLS) and identification and authentication via 772 the certificates assigned to both ends. This is enabled by the 773 certificate mechanisms of the ACP.
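As a minimal sketch of such certificate-based mutual authentication, assuming Python's standard ssl module (the certificate file names are hypothetical; the AN domain CA and per-host certificates would come from the AN enrollment process described earlier):

```python
import ssl

def make_an_domain_tls_context(ca_file=None, cert_file=None, key_file=None):
    # Sketch: a client-side TLS context that requires the peer to present
    # a certificate, so both ends authenticate with AN-domain certificates.
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    # Peer identity is established by the AN domain certificate itself,
    # not by matching a DNS hostname:
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_REQUIRED
    if ca_file:
        ctx.load_verify_locations(ca_file)        # trust anchor: AN domain CA
    if cert_file:
        ctx.load_cert_chain(cert_file, key_file)  # this host's AN domain cert
    return ctx
```

Because both certificates chain to the same AN domain CA, no new key management is introduced, matching the "does not incur new key management" observation above.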
775 If DNS and especially reverse DNS are set up, then they should be set 776 up in an automated fashion, linked to the autonomic registrar backend, 777 so that the DNS and reverse DNS records are actually derived from the 778 subject name elements of the ACP device certificates in the same way 779 as the autonomic devices themselves derive their ULA addresses 780 from their certificates. This ensures correct and consistent DNS entries. 782 If an operator feels that reverse DNS records are beneficial to its 783 own operations but that they should not be made available publicly 784 for "security by concealment" reasons, then the case of ACP DNS 785 entries is probably one of the least problematic use cases for 786 split-DNS: The ACP DNS names are only needed for the NMS hosts intending to 787 use the ACP - but not network-wide across the enterprise. 789 5. IANA Considerations 791 This document requests no action by IANA. 793 6. Acknowledgements 795 This work originated from an Autonomic Networking project at Cisco 796 Systems, which started in early 2010, including customers involved in 797 the design and early testing. Many people contributed to the aspects 798 described in this document, including, in alphabetical order: BL 799 Balaji, Steinthor Bjarnason, Yves Herthoghs, Sebastian Meissner, Ravi 800 Kumar Vadapalli. The author would also like to thank Michael 801 Richardson, James Woodyatt and Brian Carpenter for their review and 802 comments. Special thanks to Sheng Jiang and Mohamed Boucadair for 803 their thorough review. 805 7. Change log [RFC Editor: Please remove] 807 05: Integrated fixes from Brian Carpenter's review. Details on 808 semantic/structural changes: 810 * Folded most comments back into draft-ietf-anima-autonomic- 811 control-plane-09 because this stable connectivity draft was 812 suggesting things that are better written out and standardized 813 in the ACP document.
815 * Section on simultaneous ACP and data plane connectivity 816 reduced/rewritten because of this. 818 * Re-emphasized security model of ACP - ACP-connect cannot 819 arbitrarily extend the ACP routing domain. 821 * Re-wrote much of the NMS section to be less suggestive and more 822 descriptive, avoiding the term NAT and referring to relevant 823 RFCs (SIIT etc.). 825 * Main additional text in the IPv4 section is about explaining how we 826 suggest to use EAM (Explicit Address Mapping), which actually 827 would work well with the Zone and Vlong Addressing Sub-Schemes 828 of ACP. 830 * Moved, but did not change, the section on "why no IPv4 in ACP" before 831 IANA considerations to make the structure of the document more logical. 833 * Refined security considerations: explained how optional ULA 834 prefix listing on the Internet is only for diagnostic purposes. 835 Explained how this is useful because DNS must not be used. 836 Explained how split-horizon DNS can be used nevertheless. 838 04: Integrated fixes from Mohamed Boucadair's review (thorough 839 textual review). 841 03: Integrated fixes from thorough Shepherd review (Sheng Jiang). 843 02: Refresh timeout. Stable document, change in author 844 association. 846 01: Refresh timeout. Stable document, no changes. 848 00: Changed title/dates. 850 individual-02: Updated references. 852 individual-03: Modified ULA text to not suggest ULA-C as much 853 better anymore, but still mention it. 855 individual-02: Added explanation why no IPv4 for ACP. 857 individual-01: Added security section discussing the role of 858 address prefix selection and DNS for ACP. Title change to 859 emphasize focus on OAM. Expanded abstract. 861 individual-00: Initial version. 863 8. References 865 [I-D.ietf-anima-autonomic-control-plane] 866 Behringer, M., Eckert, T., and S. Bjarnason, "An Autonomic 867 Control Plane (ACP)", draft-ietf-anima-autonomic-control- 868 plane-08 (work in progress), July 2017.
870 [I-D.ietf-anima-bootstrapping-keyinfra] 871 Pritikin, M., Richardson, M., Behringer, M., Bjarnason, 872 S., and K. Watsen, "Bootstrapping Remote Secure Key 873 Infrastructures (BRSKI)", draft-ietf-anima-bootstrapping- 874 keyinfra-07 (work in progress), July 2017. 876 [I-D.ietf-anima-grasp] 877 Bormann, C., Carpenter, B., and B. Liu, "A Generic 878 Autonomic Signaling Protocol (GRASP)", draft-ietf-anima- 879 grasp-15 (work in progress), July 2017. 881 [I-D.ietf-anima-reference-model] 882 Behringer, M., Carpenter, B., Eckert, T., Ciavaglia, L., 883 Peloso, P., Liu, B., Nobre, J., and J. Strassner, "A 884 Reference Model for Autonomic Networking", draft-ietf- 885 anima-reference-model-04 (work in progress), July 2017. 887 [IEEE802.1Q] 888 Institute of Electrical and Electronics Engineers, "802.1Q-2014 - IEEE 889 Standard for Local and metropolitan area networks - 890 Bridges and Bridged Networks", 2014. 892 [ITUT] International Telecommunication Union, "Architecture and 893 specification of data communication network", 894 ITU-T Recommendation G.7712/Y.1703, November 2001. 896 This is the earliest but superseded version of the 897 series. See the REC-G.7712 Home Page [1] for current 898 versions. 900 [RFC1034] Mockapetris, P., "Domain names - concepts and facilities", 901 STD 13, RFC 1034, DOI 10.17487/RFC1034, November 1987, 902 . 904 [RFC1918] Rekhter, Y., Moskowitz, B., Karrenberg, D., de Groot, G., 905 and E. Lear, "Address Allocation for Private Internets", 906 BCP 5, RFC 1918, DOI 10.17487/RFC1918, February 1996, 907 . 909 [RFC4191] Draves, R. and D. Thaler, "Default Router Preferences and 910 More-Specific Routes", RFC 4191, DOI 10.17487/RFC4191, 911 November 2005, . 913 [RFC4193] Hinden, R. and B. Haberman, "Unique Local IPv6 Unicast 914 Addresses", RFC 4193, DOI 10.17487/RFC4193, October 2005, 915 . 917 [RFC5246] Dierks, T. and E. Rescorla, "The Transport Layer Security 918 (TLS) Protocol Version 1.2", RFC 5246, 919 DOI 10.17487/RFC5246, August 2008, 920 .
922 [RFC6146] Bagnulo, M., Matthews, P., and I. van Beijnum, "Stateful 923 NAT64: Network Address and Protocol Translation from IPv6 924 Clients to IPv4 Servers", RFC 6146, DOI 10.17487/RFC6146, 925 April 2011, . 927 [RFC6291] Andersson, L., van Helvoort, H., Bonica, R., Romascanu, 928 D., and S. Mansfield, "Guidelines for the Use of the "OAM" 929 Acronym in the IETF", BCP 161, RFC 6291, 930 DOI 10.17487/RFC6291, June 2011, 931 . 933 [RFC6347] Rescorla, E. and N. Modadugu, "Datagram Transport Layer 934 Security Version 1.2", RFC 6347, DOI 10.17487/RFC6347, 935 January 2012, . 937 [RFC6418] Blanchet, M. and P. Seite, "Multiple Interfaces and 938 Provisioning Domains Problem Statement", RFC 6418, 939 DOI 10.17487/RFC6418, November 2011, 940 . 942 [RFC6434] Jankiewicz, E., Loughney, J., and T. Narten, "IPv6 Node 943 Requirements", RFC 6434, DOI 10.17487/RFC6434, December 944 2011, . 946 [RFC6518] Lebovitz, G. and M. Bhatia, "Keying and Authentication for 947 Routing Protocols (KARP) Design Guidelines", RFC 6518, 948 DOI 10.17487/RFC6518, February 2012, 949 . 951 [RFC6724] Thaler, D., Ed., Draves, R., Matsumoto, A., and T. Chown, 952 "Default Address Selection for Internet Protocol Version 6 953 (IPv6)", RFC 6724, DOI 10.17487/RFC6724, September 2012, 954 . 956 [RFC6824] Ford, A., Raiciu, C., Handley, M., and O. Bonaventure, 957 "TCP Extensions for Multipath Operation with Multiple 958 Addresses", RFC 6824, DOI 10.17487/RFC6824, January 2013, 959 . 961 [RFC7575] Behringer, M., Pritikin, M., Bjarnason, S., Clemm, A., 962 Carpenter, B., Jiang, S., and L. Ciavaglia, "Autonomic 963 Networking: Definitions and Design Goals", RFC 7575, 964 DOI 10.17487/RFC7575, June 2015, 965 . 967 [RFC7757] Anderson, T. and A. Leiva Popper, "Explicit Address 968 Mappings for Stateless IP/ICMP Translation", RFC 7757, 969 DOI 10.17487/RFC7757, February 2016, 970 . 972 [RFC7915] Bao, C., Li, X., Baker, F., Anderson, T., and F. 
Gont, 973 "IP/ICMP Translation Algorithm", RFC 7915, 974 DOI 10.17487/RFC7915, June 2016, 975 . 977 Authors' Addresses 979 Toerless Eckert (editor) 980 Futurewei Technologies Inc. 981 2330 Central Expy 982 Santa Clara 95050 983 USA 985 Email: tte+ietf@cs.fau.de 987 Michael H. Behringer 989 Email: michael.h.behringer@gmail.com