idnits 2.17.1 draft-ietf-anima-stable-connectivity-04.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** The document seems to lack separate sections for Informative/Normative References. All references will be assumed normative when checking for downward references. ** The abstract seems to contain references ([RFC6291]), which it shouldn't. Please replace those with straight textual mentions of the documents in question. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year -- The document date (July 27, 2017) is 2462 days in the past. Is this intentional? 
Checking references for intended status: Informational ---------------------------------------------------------------------------- == Unused Reference: 'RFC5246' is defined on line 822, but no explicit reference was found in the text == Unused Reference: 'RFC6347' is defined on line 833, but no explicit reference was found in the text == Unused Reference: 'RFC6418' is defined on line 837, but no explicit reference was found in the text == Outdated reference: A later version (-30) exists of draft-ietf-anima-autonomic-control-plane-08 == Outdated reference: A later version (-45) exists of draft-ietf-anima-bootstrapping-keyinfra-07 == Outdated reference: A later version (-10) exists of draft-ietf-anima-reference-model-04 ** Obsolete normative reference: RFC 5246 (Obsoleted by RFC 8446) ** Obsolete normative reference: RFC 6347 (Obsoleted by RFC 9147) ** Obsolete normative reference: RFC 6434 (Obsoleted by RFC 8504) ** Obsolete normative reference: RFC 6824 (Obsoleted by RFC 8684) Summary: 6 errors (**), 0 flaws (~~), 7 warnings (==), 1 comment (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 ANIMA T. Eckert, Ed. 3 Internet-Draft Huawei 4 Intended status: Informational M. Behringer 5 Expires: January 28, 2018 July 27, 2017 7 Using Autonomic Control Plane for Stable Connectivity of Network OAM 8 draft-ietf-anima-stable-connectivity-04 10 Abstract 12 OAM (Operations, Administration and Maintenance - as per BCP161, 13 [RFC6291]) processes for data networks are often subject to the 14 problem of circular dependencies when relying on connectivity 15 provided by the network to be managed for the OAM purposes. 
16 Provisioning during device/network bring-up tends to be far less easy 17 to automate than service provisioning later on, changes in core 18 network functions impacting reachability may not be easy to 19 automate either because of ongoing connectivity requirements for the 20 OAM, and widely used OAM protocols are not secure enough to be 21 carried across the network without security concerns. 23 This document describes how to integrate OAM with the autonomic 24 control plane (ACP) in Autonomic Networks (AN) to provide stable and 25 secure connectivity for conducting OAM. This connectivity is not 26 subject to the aforementioned circular dependencies. 28 Status of This Memo 30 This Internet-Draft is submitted in full conformance with the 31 provisions of BCP 78 and BCP 79. 33 Internet-Drafts are working documents of the Internet Engineering 34 Task Force (IETF). Note that other groups may also distribute 35 working documents as Internet-Drafts. The list of current Internet- 36 Drafts is at http://datatracker.ietf.org/drafts/current/. 38 Internet-Drafts are draft documents valid for a maximum of six months 39 and may be updated, replaced, or obsoleted by other documents at any 40 time. It is inappropriate to use Internet-Drafts as reference 41 material or to cite them other than as "work in progress." 43 This Internet-Draft will expire on January 28, 2018. 45 Copyright Notice 47 Copyright (c) 2017 IETF Trust and the persons identified as the 48 document authors. All rights reserved. 50 This document is subject to BCP 78 and the IETF Trust's Legal 51 Provisions Relating to IETF Documents 52 (http://trustee.ietf.org/license-info) in effect on the date of 53 publication of this document. Please review these documents 54 carefully, as they describe your rights and restrictions with respect 55 to this document.
Code Components extracted from this document must 56 include Simplified BSD License text as described in Section 4.e of 57 the Trust Legal Provisions and are provided without warranty as 58 described in the Simplified BSD License. 60 Table of Contents 62 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2 63 1.1. Self dependent OAM Connectivity . . . . . . . . . . . . . 2 64 1.2. Data Communication Networks (DCNs) . . . . . . . . . . . 3 65 1.3. Leveraging the ACP . . . . . . . . . . . . . . . . . . . 4 66 2. Solutions . . . . . . . . . . . . . . . . . . . . . . . . . . 4 67 2.1. Stable Connectivity for Centralized OAM . . . . . . . . . 4 68 2.1.1. Simple Connectivity for Non-ACP capable NMS Hosts . . 5 69 2.1.2. Challenges and Limitation of Simple Connectivity . . 6 70 2.1.3. Simultaneous ACP and Data Plane Connectivity . . . . 7 71 2.1.4. IPv4-only NMS Hosts . . . . . . . . . . . . . . . . . 9 72 2.1.5. Path Selection Policies . . . . . . . . . . . . . . . 10 73 2.1.6. Autonomic NOC Device/Applications . . . . . . . . . . 12 74 2.1.7. Encryption of data-plane connections . . . . . . . . 12 75 2.1.8. Long Term Direction of the Solution . . . . . . . . . 13 76 2.2. Stable Connectivity for Distributed Network/OAM . . . . . 14 77 3. Security Considerations . . . . . . . . . . . . . . . . . . . 14 78 4. No IPv4 for ACP . . . . . . . . . . . . . . . . . . . . . . . 16 79 5. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 16 80 6. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 16 81 7. Change log [RFC Editor: Please remove] . . . . . . . . . . . 17 82 8. References . . . . . . . . . . . . . . . . . . . . . . . . . 17 83 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 19 85 1. Introduction 87 1.1. 
Self dependent OAM Connectivity 89 OAM (Operations, Administration and Maintenance - as per BCP161, 90 [RFC6291]) for data networks is often subject to the problem of 91 circular dependencies when relying on the connectivity service 92 provided by the network to be managed. OAM can easily but 93 unintentionally break the connectivity required for its own 94 operations. Avoiding these problems can lead to complexity in OAM. 95 This document describes this problem and how to use the Autonomic 96 Control Plane (ACP) to solve it without further OAM complexity: 98 The ability to perform OAM on a network device requires first the 99 execution of OAM necessary to create network connectivity to that 100 device in all intervening devices. This typically leads to 101 sequential, 'expanding ring configuration' from a NOC (Network 102 Operations Center). It also leads to tight dependencies between 103 provisioning tools and security enrollment of devices. Any process 104 that wants to enroll multiple devices along a newly deployed network 105 topology needs to tightly interlock with the provisioning process 106 that creates connectivity before the enrollment can move on to the 107 next device. 109 When performing change operations on a network, it is likewise 110 necessary to ensure at every step of that process that there is no 111 interruption of connectivity that could lead to loss of 112 connectivity to remote devices. This especially includes change 113 provisioning of routing, forwarding, security and addressing policies 114 in the network that often occur through mergers and acquisitions, the 115 introduction of IPv6 or other major overhauls of the infrastructure 116 design. Examples include change of an IGP or its areas, PA (Provider 117 Aggregatable) to PI (Provider Independent) addressing, or systematic 118 topology changes (such as L2 to L3 changes). 120 All these circular dependencies make OAM complex and potentially 121 fragile.
When automation is being used, for example through 122 provisioning systems, this complexity extends into that automation 123 software. 125 1.2. Data Communication Networks (DCNs) 127 In the late 1990s and early 2000s, IP networks became the method of 128 choice to build separate OAM networks for the communications 129 infrastructure within Network Providers. This concept was 130 standardized in ITU-T G.7712/Y.1703 [ITUT] and called "Data 131 Communications Networks" (DCN). These were (and still are) 132 physically separate IP(/MPLS) networks that provide access to OAM 133 interfaces of all equipment that had to be managed, from PSTN (Public 134 Switched Telephone Network) switches and optical equipment to 135 today's Ethernet and IP/MPLS production network equipment. 137 Such DCNs provide stable connectivity that is not subject to the aforementioned 138 problems because they are entirely separate networks, so changing the 139 configuration of the production IP network is done via the DCN but 140 never affects the DCN configuration. Of course, this approach comes 141 at the cost of buying and operating a separate network, and this cost is 142 not feasible for many providers, most notably smaller providers, most 143 enterprises and typical IoT networks (Internet of Things). 145 1.3. Leveraging the ACP 147 One of the goals of the Autonomic Networks Autonomic Control Plane 148 (ACP as defined in [I-D.ietf-anima-autonomic-control-plane]) is to 149 provide stable connectivity similar to a DCN, but without having to 150 build a separate DCN. It is clear that such an 'in-band' approach can 151 never fully achieve the same level of separation, but the goal is to 152 get as close to it as possible. 154 This solution approach has several aspects.
One aspect is designing 155 the implementation of the ACP in network devices to make it actually 156 perform without interruption by changes in what we will call in this 157 document the "data-plane", i.e., the operator- or controller- 158 configured service planes of the network equipment. This aspect is 159 not currently covered in this document. 161 Another aspect is how to leverage the stable IPv6 connectivity 162 provided by the ACP for OAM purposes. This is the current scope of 163 this document. 165 2. Solutions 167 2.1. Stable Connectivity for Centralized OAM 169 The ANI is the "Autonomic Networking Infrastructure" consisting of 170 secure zero touch Bootstrap (BRSKI - 171 [I-D.ietf-anima-bootstrapping-keyinfra]), GeneRic Autonomic Signaling 172 Protocol (GRASP - [I-D.ietf-anima-grasp]), and Autonomic Control 173 Plane (ACP - [I-D.ietf-anima-autonomic-control-plane]). Refer to 174 [I-D.ietf-anima-reference-model] for an overview of the ANI and how 175 its components interact and [RFC7575] for concepts and terminology of 176 ANI and autonomic networks. 178 This section describes stable connectivity for centralized OAM via 179 ACP/ANI, starting with what we expect to be the easiest short-term 180 option to deploy. It then describes limitations and challenges of 181 that approach and their solutions/workarounds, finishing with the 182 preferred target option of autonomic NOC devices in Section 2.1.6. 184 This order was chosen because it helps to explain how simple initial 185 use of the ACP can be, how difficult workarounds can become (and 186 therefore what to avoid), and finally because one very promising 187 long-term solution alternative is exactly like the easiest short- 188 term solution, only virtualized and automated. 190 In the most common case, OAM will be performed by one or more 191 applications running on a variety of centralized NOC systems that 192 communicate with network devices.
We describe increasingly advanced 193 approaches to leveraging the ACP for stable connectivity. There is a 194 wide range of options, some of which are simple, some more complex. 196 Three stages can be considered: 198 o There are simple options described in sections Section 2.1.1 199 through Section 2.1.3 that we consider to be good starting points 200 to operationalize the use of the ACP for stable connectivity 201 today. These options require only network and OAM/NOC device 202 configuration. 204 o There are workarounds to connect the ACP to non-IPv6 capable NOC 205 devices through the use of IPv4/IPv6 NAT (Network Address 206 Translation) as described in section Section 2.1.4. These 207 workarounds are not recommended, but if such non-IPv6 capable NOC 208 devices need to be used longer term, then this is the only option 209 to connect them to the ACP. 211 o Near- to long-term options can provide all the desired operational, 212 zero touch and security benefits of an autonomic network, but a 213 range of details for this still have to be worked out and 214 development work on NOC/OAM equipment is necessary. These options 215 are discussed in sections Section 2.1.5 through Section 2.1.8. 217 2.1.1. Simple Connectivity for Non-ACP capable NMS Hosts 219 In the simplest candidate deployment case, the ACP extends all the 220 way into the NOC via one or more "ACP edge devices" as defined in 221 section 6.1 of [I-D.ietf-anima-autonomic-control-plane]. These 222 devices "leak" the (otherwise encrypted) ACP natively to NMS hosts. 223 They act as the default router to those NMS hosts and provide them 224 with IPv6 connectivity into the ACP. NMS hosts with this setup need 225 to support IPv6 (see e.g. [RFC6434]) but require no other 226 modifications to leverage the ACP.
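Since the ACP draws its addressing from the IPv6 ULA range (RFC 4193), an NMS host attached to an ACP edge device can at least sanity-check which of its learned addresses are ACP candidates. The following is a minimal sketch using Python's ipaddress module; the concrete fd89:... address is a hypothetical example, not a real ACP assignment:

```python
import ipaddress

def is_acp_candidate(addr: str) -> bool:
    """True if addr falls within fc00::/7, the IPv6 ULA range
    (RFC 4193) from which ACP addresses are taken."""
    return ipaddress.IPv6Address(addr) in ipaddress.ip_network("fc00::/7")

print(is_acp_candidate("fd89:b714:f3db:0:200:0:6400:1"))  # True (hypothetical ACP ULA)
print(is_acp_candidate("2001:db8::1"))                    # False (global data-plane address)
```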
228 Note that even though the ACP only uses IPv6, it can of course 229 support OAM for any type of network deployment as long as the network 230 devices support the ACP: The Data Plane can be IPv4 only, dual-stack 231 or IPv6 only. It is always separate from the ACP, therefore there is 232 no dependency between the ACP and the IP version(s) used in the Data 233 Plane. 235 This setup is sufficient for troubleshooting such as SSH into network 236 devices, NMS that performs SNMP read operations for status checking, 237 software downloads into autonomic devices, provisioning of devices 238 via NETCONF and so on. In conjunction with otherwise unmodified OAM 239 via separate NMS hosts it can provide a good subset of the stable 240 connectivity goals. The limitations of this approach are discussed 241 in the next section. 243 Because the ACP provides 'only' IPv6 connectivity, and because 244 addressing provided by the ACP does not include any topological 245 addressing structure that operations in a NOC often rely on to 246 recognize where devices are on the network, it is likely highly 247 desirable to set up DNS (Domain Name System - see [RFC1034]) so that 248 the ACP IPv6 addresses of autonomic devices are known via domain 249 names that include the desired structure. For example, if DNS in the 250 network was set up with names for network devices as 251 devicename.noc.example.com, and the well known structure of the Data 252 Plane IPv4 address space was used by operators to infer the region 253 where a device is located, then the ACP address of that device 254 could be set up as devicename_.acp.noc.example.com, and 255 devicename.acp.noc.example.com could be a CNAME to 256 devicename_.acp.noc.example.com. Note that many networks 257 already use names for network equipment where topological information 258 is included, even without an ACP. 260 2.1.2.
Challenges and Limitation of Simple Connectivity 262 This simple connectivity of non-autonomic NMS hosts suffers from a 263 range of challenges (that is, operators may not be able to do it this 264 way) or limitations (that is, operators cannot achieve desired goals 265 with this setup). The following list summarizes these challenges and 266 limitations. The following sections describe additional mechanisms 267 to overcome them. 269 Note that these challenges and limitations exist because the ACP is 270 primarily designed to support distributed ASA in the most lightweight 271 fashion, but does not mandate support for additional 272 mechanisms to best support centralized NOC operations. It is this 273 document that describes additional (short term) workarounds and (long 274 term) extensions. 276 1. (Limitation) NMS hosts cannot directly probe whether the desired 277 so-called 'data-plane' network connectivity works because they do 278 not directly have access to it. This problem is similar to 279 probing connectivity for other services (such as VPN services) 280 that they do not have direct access to, so the NOC may already 281 employ appropriate mechanisms to deal with this issue (probing 282 proxies). See Section 2.1.3 for candidate solutions. 284 2. (Challenge) NMS hosts need to support IPv6, which often is still 285 not possible in enterprise networks. See Section 2.1.4 for some 286 workarounds. 288 3. (Limitation) Performance of the ACP will be limited versus normal 289 'data-plane' connectivity. The setup of the ACP will often 290 support only non-hardware-accelerated forwarding. Running a 291 large amount of traffic through the ACP, especially for tasks 292 where it is not necessary, will reduce its performance/ 293 effectiveness for those operations where it is necessary or 294 highly desirable. See Section 2.1.5 for candidate solutions. 296 4.
(Limitation) Security of the ACP is reduced by exposing the ACP 297 natively (and unencrypted) into a LAN in the NOC where the NOC 298 devices are attached to it. See Section 2.1.7 for candidate 299 solutions. 301 These four problems can be tackled independently of each other by 302 solution improvements. Combining some of these solution 303 improvements together can lead towards a candidate long term solution. 305 2.1.3. Simultaneous ACP and Data Plane Connectivity 307 Simultaneous connectivity to both ACP and data-plane can be achieved 308 in a variety of ways. If the data-plane is IPv4-only, then any 309 method for dual-stack attachment of the NOC device/application will 310 suffice: IPv6 connectivity from the NOC provides access via the ACP, 311 IPv4 will provide access via the data-plane. If, as explained above 312 in the simple case, an autonomic device supports native attachment to 313 the ACP, and the existing NOC setup is IPv4 only, then it could be 314 sufficient to attach the ACP device(s) as the IPv6 default router to 315 the NOC LANs and keep the existing IPv4 default router setup 316 unchanged. 318 If the data-plane of the network also supports IPv6, then the 319 NOC devices that need access to the ACP should have a dual-homing 320 IPv6 setup. One option is to make the NOC devices multi-homed with 321 one logical or physical IPv6 interface connecting to the data-plane, 322 and another into the ACP. The LAN that provides access to the ACP 323 should then be given an IPv6 prefix that shares a common prefix with 324 the IPv6 ULA (see [RFC4193]) of the ACP so that the standard IPv6 325 interface selection rules on the NOC host would result in the desired 326 automatic selection of the right interface: towards the ACP facing 327 interface for connections to ACP addresses, and towards the data- 328 plane interface for anything else. If this cannot be achieved 329 automatically, then it needs to be done via IPv6 static routes in the 330 NOC host.
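The interface selection described above rests on the longest-matching-prefix rule of IPv6 default address selection (rule 8 of RFC 6724): a source address sharing the ACP ULA prefix wins for ACP destinations. The effect can be illustrated with a small sketch; all addresses are hypothetical:

```python
import ipaddress

def common_prefix_len(a: str, b: str) -> int:
    """Number of leading bits two IPv6 addresses share."""
    diff = int(ipaddress.IPv6Address(a)) ^ int(ipaddress.IPv6Address(b))
    return 128 - diff.bit_length()

def pick_source(candidates, dest):
    """Pick the candidate source address with the longest prefix
    match against the destination (a sketch of RFC 6724 rule 8)."""
    return max(candidates, key=lambda c: common_prefix_len(c, dest))

# Hypothetical NOC host addresses: one on the ACP-access LAN (a ULA
# sharing the ACP prefix), one on the data-plane LAN.
sources = ["fd89:b714:f3db:1::100", "2001:db8:42::100"]
print(pick_source(sources, "fd89:b714:f3db:0:200:0:6400:1"))  # ACP-side ULA wins
print(pick_source(sources, "2001:db8:7::1"))                  # data-plane address wins
```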
332 Providing two virtual (e.g. dot1q subnet) connections into NOC hosts 333 may be seen as an undesired complexity. In that case the routing 334 policy to provide access to both ACP and data-plane via IPv6 needs to 335 happen in the NOC network itself: The NMS host gets a single 336 attachment interface but still with the same two IPv6 addresses as 337 before - one for use towards the ACP, one towards the data-plane. 338 The first-hop router connecting to the NMS host would then have 339 separate interfaces: one towards the data-plane, one towards the ACP. 340 Routing of traffic from NMS hosts would then have to be based on the 341 source IPv6 address of the host: Traffic from the address designated 342 for ACP use would get routed towards the ACP, traffic from the 343 designated data-plane address towards the data-plane. 345 In the simple case, we get the following topology: Existing NMS hosts 346 connect via an existing NOClan and existing first hop Rtr1 to the 347 data-plane. Rtr1 is not made autonomic, but instead the edge router 348 of the Autonomic network ANrtr is attached via a separate interface 349 to Rtr1 and ANrtr provides access to the ACP via ACPaccessLan. Rtr1 350 is configured with the above described IPv6 source routing policies 351 and the NOC-app-devices are given the secondary IPv6 address for 352 connectivity into the ACP.

   354                                    --... (data-plane)
   355   NOC-app-device(s) -- NOClan -- Rtr1
   356                                    --- ACPaccessLan -- ANrtr ... (ACP)

   358                             Figure 1

360 If Rtr1 was to be upgraded to also implement Autonomic Networking and 361 the ACP, the picture would change as follows:

   363                                       ---- ... (data-plane)
   364   NOC-app-device(s) ---- NOClan --- ANrtr1
   365                                  . . ---- ... (ACP)
   366                                  \-/
   367                         (ACP to data-plane loopback)

   369                             Figure 2

371 In this case, ANrtr1 would have to implement some more advanced 372 routing such as cross-VRF routing because the data-plane and ACP are 373 most likely run via separate VRFs.
A workaround without additional 374 software functionality could be a physical external loopback cable 375 into two ports of ANrtr1 to connect the data-plane and ACP VRF as 376 shown in the picture. A (virtual) software loopback between the ACP 377 and data plane VRF would of course be the better solution. 379 2.1.4. IPv4-only NMS Hosts 381 The ACP does not support IPv4: the target is single-stack IPv6 382 management of the network via the ACP and (as needed) the data plane, 383 independent of whether the data plane is dual-stack, has IPv4 as a 384 service or is single-stack IPv6. Dual-plane management (IPv6 for the 385 ACP, IPv4 for the data plane) is likewise an architecturally simple option. 387 The downside of this architectural decision is the potential need for 388 short-term workarounds when the operational practices in a network 389 cannot meet these target expectations. This section motivates 390 when and why these workarounds may be necessary and describes them. 391 All the workarounds described in this section are HIGHLY UNDESIRABLE. 392 The only recommended solution is to enable IPv6 on NMS hosts. 394 Most network equipment today supports IPv6, but it is by far not 395 ubiquitously supported in NOC backend solutions (HW/SW), especially 396 not in the product space for enterprises. Even when it is supported, 397 there are often additional limitations or issues using it in a dual 398 stack setup, or the operator mandates single stack for simplicity in 399 all operations. For these reasons an IPv4-only management plane is 400 still required and common practice in many enterprises. Without the 401 desire to leverage the ACP, this required and common practice is not 402 a problem for those enterprises even when they run dual stack in the 403 network. We document these workarounds here because they are a short 404 term deployment challenge specific to the operations of the ACP. 406 To bridge an IPv4-only management plane with the ACP, IPv4 to IPv6 407 NAT can be used.
This NAT setup could for example be done in Rtr1 408 in the above picture to also support IPv4-only NMS hosts connected to 409 NOClan. 411 To support connections initiated from IPv4-only NMS hosts towards the 412 ACP of network devices, it is necessary to create a static mapping of 413 ACP IPv6 addresses into an unused IPv4 address space and a dynamic or 414 static mapping of the IPv4 NOC application device address (prefix) 415 into an IPv6 prefix routed in the ACP. The main issue in this setup is the 416 mapping of all ACP IPv6 addresses to IPv4. Without further network 417 intelligence, this needs to be a 1:1 address mapping because the 418 prefix used for ACP IPv6 addresses is too long to be mapped directly 419 into IPv4 on a prefix basis. 421 One could implement dynamic mappings in router software by leveraging 422 DNS, but it seems highly undesirable to implement such complex 423 technologies for something that ultimately is a temporary problem 424 (IPv4-only NMS hosts). With today's operational directions it is 425 likely preferable to automate the setup of 1:1 NAT mappings in 426 that NAT router as part of the automation process of network device 427 enrollment into the ACP. 429 The ACP can also be used for connections initiated by the network 430 device into the NMS hosts. For example, syslog from autonomic 431 devices. In this case, static mappings of the NMS hosts' IPv4 432 addresses are required. This can easily be done with a static prefix 433 mapping into IPv6. 435 Overall, the use of NAT is especially subject to ROI (Return On 436 Investment) considerations, but the methods described here may not be 437 too different from the problems encountered, entirely independent 438 of AN/ACP, when some parts of the network introduce IPv6 but 439 NMS hosts are not (yet) upgradeable. 441 2.1.5.
Path Selection Policies 443 As mentioned above, the ACP is not expected to have high performance 444 because its primary goal is connectivity and security, and for 445 existing network device platforms this often means that it is a lot 446 more effort to implement that additional connectivity with hardware 447 acceleration than without - especially because of the desire to 448 support full encryption across the ACP to achieve the desired 449 security. 451 Some of these issues may go away in the future with further adoption 452 of the ACP and network device designs that better cater to the needs 453 of a separate OAM plane, but it is wise to plan even for long-term 454 designs of the solution that do NOT depend on high performance of 455 the ACP. This is in contrast to the expectation that future NMS hosts 456 will have IPv6, so that any considerations for IPv4/NAT in this 457 solution are temporary. 459 To solve the expected performance limitations of the ACP, we do 460 expect to have the above-described dual connectivity via both ACP and 461 data-plane between NOC application devices and AN devices with ACP. 462 The ACP connectivity is expected to always be there (as soon as a 463 device is enrolled), but the data-plane connectivity is only present 464 under normal operations and will not be present during e.g. early 465 stages of device bootstrap, failures, provisioning mistakes or during 466 network configuration changes. 468 The desired policy is therefore as follows: In the absence of further 469 security considerations (see below), traffic between NMS hosts and AN 470 devices should prefer data-plane connectivity and resort only to 471 using the ACP when necessary, unless it is an operation known to be 472 so tied to the cases where the ACP is necessary that it makes no 473 sense to try using the data plane. An example here is of course the 474 SSH connection from the NOC into a network device to troubleshoot 475 network connectivity.
This could easily always rely on the ACP. 476 Likewise, if an NMS host is known to transmit large amounts of data, 477 and it uses the ACP, then its performance needs to be controlled so 478 that it will not overload the ACP performance. Typical examples of 479 this are software downloads. 481 There is a wide range of methods to build up these policies. We 482 describe a few: 484 Ideally, a NOC system would learn and keep track of all addresses of 485 a device (ACP and the various data plane addresses). Every action of 486 the NOC system would indicate via a "path-policy" what type of 487 connection it needs (e.g. only data-plane, ACP-only, default to data- 488 plane, fallback to ACP,...). A connection policy manager would then 489 build connections to the target using the right address(es). Shorter 490 term, a common practice is to identify different paths to a device 491 via different names (e.g. loopback vs. interface addresses). This 492 approach can be expanded to ACP uses, whether it uses NOC system 493 local names or DNS. We describe example schemes using DNS: 495 DNS can be used to set up names for the same network devices but with 496 different addresses assigned: One name (name.noc.example.com) with 497 only the data-plane address(es) (IPv4 and/or IPv6) to be used for 498 probing connectivity or performing routine software downloads that 499 may stall/fail when there are connectivity issues. One name (name- 500 acp.noc.example.com) with only the ACP reachable address of the 501 device for troubleshooting and probing/discovery that is desired to 502 always only use the ACP. One name with data plane and ACP addresses 503 (name-both.noc.example.com). 505 Traffic policing and/or shaping at the ACP edge in the NOC can be 506 used to throttle applications such as software download into the ACP.
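The three-name DNS scheme above amounts to a per-device address book keyed by path policy, which a NOC tool could also hold locally instead of in DNS. A sketch with hypothetical device names and addresses:

```python
# Hypothetical per-device address book mirroring the three-name DNS
# scheme from the text (names and addresses are illustrative only).
DEVICES = {
    "rtr1": {
        "data_plane": ["192.0.2.10", "2001:db8::10"],
        "acp": ["fd89:b714:f3db:0:200:0:6400:1"],
    },
}

def resolve(device: str, path_policy: str):
    """Return candidate addresses for a device under a path policy:
    'data-plane' for probing and bulk transfers, 'acp' for operations
    that must survive data-plane outages, 'prefer-data-plane' for
    data-plane first with the ACP as fallback."""
    entry = DEVICES[device]
    if path_policy == "data-plane":
        return entry["data_plane"]
    if path_policy == "acp":
        return entry["acp"]
    if path_policy == "prefer-data-plane":
        return entry["data_plane"] + entry["acp"]
    raise ValueError(f"unknown path policy: {path_policy}")

print(resolve("rtr1", "acp"))                # ACP address only
print(resolve("rtr1", "prefer-data-plane"))  # data-plane first, ACP last
```

A connection policy manager as described in the text would try the returned addresses in order.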
508 MPTCP (Multipath TCP - see [RFC6824]) is a very attractive candidate 509 to automate the use of both data-plane and ACP and minimize or fully 510 avoid the need for the above-mentioned logical names to pre-set the 511 desired connectivity (data-plane-only, ACP only, both). For example, 512 a set-up for non-MPTCP-aware applications would be as follows: 514 DNS naming is set up to provide the ACP IPv6 address of network 515 devices. Unbeknownst to the application, MPTCP is used. MPTCP 516 mutually discovers between the NOC and network device the data-plane 517 address and carries all traffic across it when that MPTCP subflow 518 across the data-plane can be built. 520 In the Autonomic network devices where data-plane and ACP are in 521 separate VRFs, it is clear that this type of MPTCP subflow creation 522 across different VRFs is new/added functionality. Likewise, the 523 policies of preferring a particular address (NOC-device) or VRF (AN 524 device) for the traffic is potentially also a policy not provided as 525 a standard. 527 2.1.6. Autonomic NOC Device/Applications 529 Setting up connectivity between the NOC and autonomic devices when 530 the NOC device itself is non-autonomic is, as mentioned in the 531 beginning, a security issue. It also results, as shown in the previous 532 paragraphs, in a range of connectivity considerations, some of which 533 may be quite undesirable or complex to operationalize. 535 Making NMS hosts autonomic and having them participate in the ACP is 536 therefore not only a highly desirable solution to the security 537 issues, but can also provide a likely easier operationalization of 538 the ACP because it minimizes NOC-special edge considerations - the 539 ACP is simply built all the way automatically, even inside the NOC, 540 and only authorized and authenticated NOC devices/applications will 541 have access to it.
543 Supporting the ACP all the way into an application device requires 544 implementing the following aspects in it: AN bootstrap/enrollment 545 mechanisms, the secure channel for the ACP and at least the host side 546 of IPv6 routing setup for the ACP. Minimally this could all be 547 implemented as an application and be made available to the host OS 548 via e.g. a tap driver to make the ACP show up as another IPv6-enabled 549 interface. 551 Having said this: If the structure of NMS hosts is transformed 552 through virtualization anyhow, then it may be considered equally 553 secure and appropriate to construct a (physical) NMS host system by 554 combining a virtual AN/ACP-enabled router with non-AN/ACP-enabled 555 NOC-application VMs via a hypervisor, leveraging the configuration 556 options described in the previous sections but just virtualizing 557 them. 559 2.1.7. Encryption of data-plane connections 561 When combining ACP and data-plane connectivity for availability and 562 performance reasons, this too has an impact on security: When using 563 the ACP, the traffic will be mostly protected by encryption, especially 564 when considering the above-described use of AN application devices. 565 If instead the data-plane is used, then this is not the case anymore 566 unless it is done by the application. 568 The simplest solution for this problem exists when using AN-capable 569 NMS hosts, because in that case the communicating AN-capable NMS host 570 and the AN network device have certificates through the AN enrollment 571 process that they can mutually trust (same AN domain). As a result, 572 data-plane connectivity that does support this can simply leverage 573 TLS/DTLS ([RFC5246]/[RFC6347]) with mutual AN-domain certificate 574 authentication - and does not incur new key management.
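The mutual TLS authentication described above can be sketched with Python's ssl module: a client context that requires the peer (the network device) to present a certificate and trusts only the AN domain CA. The file-path parameters and the choice to disable DNS-hostname checking (the peer's identity being carried in its AN domain certificate rather than a DNS name) are illustrative assumptions, not part of any specified API:

```python
import ssl

def an_domain_tls_context(ca_file=None, cert_file=None, key_file=None):
    """Client-side TLS context for mutual AN-domain certificate
    authentication; all file paths are placeholders."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    # Identity is the peer's AN domain certificate, not a DNS name,
    # so hostname checking is disabled while certificate verification
    # remains mandatory.
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_REQUIRED
    if ca_file:
        ctx.load_verify_locations(ca_file)        # trust only the AN domain CA
    if cert_file:
        ctx.load_cert_chain(cert_file, key_file)  # present our own AN domain cert
    return ctx

ctx = an_domain_tls_context()
print(ctx.verify_mode == ssl.CERT_REQUIRED)  # True
```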
If this automatic security benefit is seen as most important, but a
"full" ACP stack in the NMS host is unfeasible, then it would still
be possible to design a stripped-down version of AN functionality for
such NOC hosts that only provides enrollment of the NOC host into the
AN domain to the extent that the host receives an AN domain
certificate, but without directly participating in the ACP
afterwards.  Instead, the host would just leverage TLS/DTLS using its
AN certificate via the data-plane with AN network devices, as well as
indirectly via the ACP with the above mentioned in-NOC network edge
connectivity into the ACP.

When using the ACP itself, TLS/DTLS for the transport layer between
NMS hosts and network devices is somewhat of a double price to pay
(the ACP also encrypts) and could potentially be optimized away, but
given the assumed lower performance of the ACP, this seems to be an
unnecessary optimization.

2.1.8.  Long Term Direction of the Solution

If we consider what potentially could be the most lightweight and
autonomic long-term solution based on the technologies described
above, we see the following direction:

1.  NMS hosts should at least support IPv6.  IPv4/IPv6 NAT in the
    network to enable use of the ACP is long-term undesirable.
    Having IPv4-only applications automatically leverage IPv6
    connectivity via host-stack translation may be an option, but the
    operational viability of this approach is not well enough
    understood.

2.  Build the ACP as a lightweight application for NMS hosts so that
    the ACP extends all the way into the actual NMS hosts.

3.  Leverage and, as necessary, enhance MPTCP with automatic dual-
    connectivity: if an MPTCP-unaware application is using ACP
    connectivity, the policies used should add subflow(s) via the
    data-plane and prefer them.

4.  Consider how to best map NMS host desires to underlying transport
    mechanisms: with the above mentioned three points, not all
    options are covered.  Depending on the OAM, one may still want
    only ACP, only data-plane, or automatically prefer one over the
    other and/or use the ACP with low performance or high performance
    (for emergency OAM such as countering DDoS).  It is as of today
    not clear what the simplest set of tools is to enable explicit
    choice of the desired behavior of each OAM.  The use of the above
    mentioned DNS and MPTCP mechanisms is a start, but this will
    require additional thought.  This is likely a specific case of
    the more generic scope of TAPS.

2.2.  Stable Connectivity for Distributed Network/OAM

The ANI (ACP, Bootstrap, GRASP) can provide via the GRASP protocol
common direct-neighbor discovery and capability negotiation (GRASP
via ACP and/or data-plane) and stable and secure connectivity for
functions running distributed in network devices (GRASP via ACP).  It
can therefore eliminate the need to re-implement similar functions in
each distributed function in the network.  Today, every distributed
protocol does this with functional elements usually called "Hello"
mechanisms and with often protocol-specific security mechanisms.

KARP (Keying and Authentication for Routing Protocols, see [RFC6518])
has tried to provide common directions and thereby reduce the
re-invention of at least some of the security aspects, but it only
covers routing protocols and it is unclear how well it is applicable
to a potentially wider range of network distributed agents such as
those performing distributed OAM.  The ACP can help in these cases.

3.  Security Considerations

In this section, we discuss only security considerations not covered
in the appropriate sub-sections of the solutions described.
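Several of the considerations in this section concern ULA addressing
per [RFC4193].  As background, a ULA /48 prefix consists of the
fd00::/8 prefix (fc00::/7 with the L bit set) followed by a 40-bit
pseudo-random Global ID.  A minimal illustrative sketch of generating
one (the ACP itself derives its addresses from certificates, so this
is background only, and secrets.randbits stands in for the Global ID
algorithm RFC 4193 suggests):

```python
import ipaddress
import secrets

def random_ula_prefix():
    """Generate a random ULA /48 per RFC 4193: fd00::/8 plus a
    40-bit pseudo-random Global ID, remaining 80 bits zero."""
    global_id = secrets.randbits(40)
    prefix_int = (0xfd << 120) | (global_id << 80)
    return ipaddress.IPv6Network((prefix_int, 48))

prefix = random_ula_prefix()
```

With 40 random bits, the probability of two independently generated
prefixes colliding is negligible unless a very large number of ACPs
are ever interconnected, which motivates the collision discussion
below.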
Even though ACPs are meant to be isolated, explicit operator
misconfiguration to connect to insecure OAM equipment and/or bugs in
ACP devices may cause leakage into places where it is not expected.
Mergers/acquisitions and other complex network reconfigurations
affecting the NOC are typical examples.

ULA addressing as proposed in this document is preferred over
globally reachable addresses because it is not routed in the global
Internet and will therefore be subject to more filtering even in
places where specific ULA addresses are being used.

Random ULA addressing provides more than sufficient protection
against address collision even though there is no central assignment
authority.  This is helped by the expectation that ACPs are never
expected to all connect together; only a few ACPs may ever need to
connect together, e.g. when mergers and acquisitions occur.

If packets with unexpected ULA addresses are seen and one expects
them to be from another network's ACP from which they leaked, then
some form of ULA prefix registration (not allocation) can be
beneficial.  Some voluntary registries exist, for example
https://www.sixxs.net/tools/grh/ula/, although none of them is
preferable because none is operated by a recognized authority.  If an
operator wants to make its ULA prefix known, it might need to
register it with multiple existing registries.

Centrally assigned ULA addresses (ULA-C) were an attempt to introduce
centralized registration of randomly assigned addresses and
potentially even carve out a different ULA prefix for such addresses.
This proposal is currently not proceeding, and it is questionable
whether the stable connectivity use case provides sufficient
motivation to revive this effort.

Using current registration options implies that there will not be
reverse DNS mapping for ACP addresses.
For that, one will have to rely on looking up the unknown/unexpected
network prefix in the registry to determine the owner of these
addresses.

Reverse DNS resolution may be beneficial for specific already
deployed insecure legacy protocols on NOC OAM systems that intend to
communicate via the ACP (e.g. TFTP) and leverage reverse DNS for
authentication.  Given how the ACP provides path security except
potentially for the last hop in the NOC, the ACP does make it easier
to extend the lifespan of such protocols in a secure fashion as far
as just the transport is concerned.  The ACP does not make reverse
DNS lookup a secure authentication method though.  Any current and
future protocols must rely on secure end-to-end communications (TLS/
DTLS) and identification and authentication via the certificates
assigned to both ends.  This is enabled by the certificate mechanisms
of the ACP.

If DNS and especially reverse DNS are set up, then they should be set
up in an automated fashion, linked to the autonomic registrar
backend, so that the DNS and reverse DNS records are actually derived
from the subject name elements of the ACP device certificates, in the
same way as the autonomic devices themselves will derive their ULA
addresses from their certificates, to ensure correct and consistent
DNS entries.

If an operator feels that reverse DNS records are beneficial to its
own operations but that they should not be made available publicly
for "security by concealment" reasons, then the case of ACP DNS
entries is probably one of the least problematic use cases for split-
DNS: the ACP DNS names are only needed for the NMS hosts intending to
use the ACP, but not network-wide across the enterprise.

4.
No IPv4 for ACP

The ACP is targeted to be IPv6 only, and the prior explanations in
this document show that this can lead to some complexity when having
to connect IPv4-only NOC solutions, and that it will be impossible to
leverage the ACP when the OAM agents on an ACP network device do not
support IPv6.  Therefore, the question was raised whether the ACP
should optionally also support IPv4.

The decision not to include IPv4 for the ACP in the use cases
considered in this document is based on the following reasons:

In SP networks that have started to support IPv6, often the next
planned step is to consider moving IPv4 out of the native transport
and supporting it as just a service on the edge.  There is no
benefit/need for multiple parallel transport families within the
network, and standardizing on one reduces OPEX and improves
reliability.  This evolution in the data plane makes it highly
unlikely that investing development cycles into IPv4 support for the
ACP will have a longer-term benefit or enough critical short-term
use-cases.  Supporting only IPv6 for the ACP is purely a strategic
choice to focus on the known important long-term goals.

In other types of networks as well, we think that effort to support
autonomic networking is better spent on ensuring that one address
family will be supported, so that all use cases will long-term work
with it, instead of duplicating the effort for IPv4.  Especially
because auto-addressing for the ACP with IPv4 would be more complex
than with IPv6 due to the smaller IPv4 addressing space.

5.  IANA Considerations

This document requests no action by IANA.

6.  Acknowledgements

This work originated from an Autonomic Networking project at Cisco
Systems, which started in early 2010, including customers involved in
the design and early testing.
Many people contributed to the aspects described in this document,
including, in alphabetical order: BL Balaji, Steinthor Bjarnason,
Yves Herthoghs, Sebastian Meissner, Ravi Kumar Vadapalli.  The author
would also like to thank Michael Richardson, James Woodyatt and Brian
Carpenter for their review and comments.  Special thanks to Sheng
Jiang and Mohamed Boucadair for their thorough review.

7.  Change log [RFC Editor: Please remove]

04: Integrated fixes from Mohamed Boucadair's review.

03: Integrated fixes from Shepherd review (Sheng Jiang).

02: Refresh timeout.  Stable document, change in author association.

01: Refresh timeout.  Stable document, no changes.

00: Changed title/dates.

individual-02: Updated references.

individual-03: Modified ULA text to no longer suggest ULA-C as much
better, but still mention it.

individual-02: Added explanation why no IPv4 for ACP.

individual-01: Added security section discussing the role of address
prefix selection and DNS for ACP.  Title change to emphasize focus on
OAM.  Expanded abstract.

individual-00: Initial version.

8.  References

[I-D.ietf-anima-autonomic-control-plane]
           Behringer, M., Eckert, T., and S. Bjarnason, "An Autonomic
           Control Plane (ACP)", draft-ietf-anima-autonomic-control-
           plane-08 (work in progress), July 2017.

[I-D.ietf-anima-bootstrapping-keyinfra]
           Pritikin, M., Richardson, M., Behringer, M., Bjarnason,
           S., and K. Watsen, "Bootstrapping Remote Secure Key
           Infrastructures (BRSKI)", draft-ietf-anima-bootstrapping-
           keyinfra-07 (work in progress), July 2017.

[I-D.ietf-anima-grasp]
           Bormann, C., Carpenter, B., and B. Liu, "A Generic
           Autonomic Signaling Protocol (GRASP)", draft-ietf-anima-
           grasp-15 (work in progress), July 2017.

[I-D.ietf-anima-reference-model]
           Behringer, M., Carpenter, B., Eckert, T., Ciavaglia, L.,
           Pierre, P., Liu, B., Nobre, J., and J.
           Strassner, "A Reference Model for Autonomic Networking",
           draft-ietf-anima-reference-model-04 (work in progress),
           July 2017.

[ITUT]     International Telecommunication Union, "Architecture and
           specification of data communication network",
           ITU-T Recommendation G.7712/Y.1703, June 2008.

[RFC1034]  Mockapetris, P., "Domain names - concepts and facilities",
           STD 13, RFC 1034, DOI 10.17487/RFC1034, November 1987,
           <https://www.rfc-editor.org/info/rfc1034>.

[RFC4193]  Hinden, R. and B. Haberman, "Unique Local IPv6 Unicast
           Addresses", RFC 4193, DOI 10.17487/RFC4193, October 2005,
           <https://www.rfc-editor.org/info/rfc4193>.

[RFC5246]  Dierks, T. and E. Rescorla, "The Transport Layer Security
           (TLS) Protocol Version 1.2", RFC 5246,
           DOI 10.17487/RFC5246, August 2008,
           <https://www.rfc-editor.org/info/rfc5246>.

[RFC6291]  Andersson, L., van Helvoort, H., Bonica, R., Romascanu,
           D., and S. Mansfield, "Guidelines for the Use of the "OAM"
           Acronym in the IETF", BCP 161, RFC 6291,
           DOI 10.17487/RFC6291, June 2011,
           <https://www.rfc-editor.org/info/rfc6291>.

[RFC6347]  Rescorla, E. and N. Modadugu, "Datagram Transport Layer
           Security Version 1.2", RFC 6347, DOI 10.17487/RFC6347,
           January 2012, <https://www.rfc-editor.org/info/rfc6347>.

[RFC6418]  Blanchet, M. and P. Seite, "Multiple Interfaces and
           Provisioning Domains Problem Statement", RFC 6418,
           DOI 10.17487/RFC6418, November 2011,
           <https://www.rfc-editor.org/info/rfc6418>.

[RFC6434]  Jankiewicz, E., Loughney, J., and T. Narten, "IPv6 Node
           Requirements", RFC 6434, DOI 10.17487/RFC6434, December
           2011, <https://www.rfc-editor.org/info/rfc6434>.

[RFC6518]  Lebovitz, G. and M. Bhatia, "Keying and Authentication for
           Routing Protocols (KARP) Design Guidelines", RFC 6518,
           DOI 10.17487/RFC6518, February 2012,
           <https://www.rfc-editor.org/info/rfc6518>.

[RFC6824]  Ford, A., Raiciu, C., Handley, M., and O. Bonaventure,
           "TCP Extensions for Multipath Operation with Multiple
           Addresses", RFC 6824, DOI 10.17487/RFC6824, January 2013,
           <https://www.rfc-editor.org/info/rfc6824>.

[RFC7575]  Behringer, M., Pritikin, M., Bjarnason, S., Clemm, A.,
           Carpenter, B., Jiang, S., and L.
           Ciavaglia, "Autonomic Networking: Definitions and Design
           Goals", RFC 7575, DOI 10.17487/RFC7575, June 2015,
           <https://www.rfc-editor.org/info/rfc7575>.

Authors' Addresses

Toerless Eckert (editor)
Futurewei Technologies Inc.
2330 Central Expy
Santa Clara  95050
USA

Email: tte+ietf@cs.fau.de

Michael H. Behringer

Email: michael.h.behringer@gmail.com