idnits 2.17.1

draft-ietf-anima-stable-connectivity-03.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack separate sections for Informative/Normative
     References.  All references will be assumed normative when checking for
     downward references.

  == There are 1 instance of lines with non-RFC2606-compliant FQDNs in the
     document.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (July 3, 2017) is 2486 days in the past.  Is this
     intentional?

  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  == Outdated reference: A later version (-30) exists of
     draft-ietf-anima-autonomic-control-plane-06

  == Outdated reference: A later version (-45) exists of
     draft-ietf-anima-bootstrapping-keyinfra-06

  == Outdated reference: A later version (-15) exists of
     draft-ietf-anima-grasp-14

  == Outdated reference: A later version (-10) exists of
     draft-ietf-anima-reference-model-04

  ** Obsolete normative reference: RFC 6824 (Obsoleted by RFC 8684)

     Summary: 2 errors (**), 0 flaws (~~), 6 warnings (==), 1 comment (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

ANIMA                                                          T. Eckert
Internet-Draft                                                    Huawei
Intended status: Informational                              M. Behringer
Expires: January 4, 2018                                    July 3, 2017

    Using Autonomic Control Plane for Stable Connectivity of Network OAM
                 draft-ietf-anima-stable-connectivity-03

Abstract

OAM (Operations, Administration and Management) processes for data networks often suffer from circular dependencies when they rely on connectivity through the very network that is being managed. Provisioning during device/network bring-up tends to be much harder to automate than service provisioning later on; changes in core network functions that impact reachability cannot be automated either, because of the ongoing connectivity requirements of the OAM equipment itself; and widely used OAM protocols are not secure enough to be carried across the network without security concerns.

This document describes how to integrate OAM processes with the autonomic control plane (ACP) in Autonomic Networks (AN) to provide stable and secure connectivity for those OAM processes.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on January 4, 2018.

Copyright Notice

Copyright (c) 2017 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

Table of Contents

   1.  Introduction
       1.1.  Self-dependent OAM connectivity
       1.2.  Data Communication Networks (DCNs)
       1.3.  Leveraging the ACP
   2.  Solutions
       2.1.  Stable connectivity for centralized OAM operations
             2.1.1.  Simple connectivity for non-autonomic NMS hosts
             2.1.2.  Challenges and limitations of simple connectivity
             2.1.3.  Simultaneous ACP and data-plane connectivity
             2.1.4.  IPv4-only NMS hosts
             2.1.5.  Path selection policies
             2.1.6.  Autonomic NOC devices/applications
             2.1.7.  Encryption of data-plane connections
             2.1.8.  Long-term direction of the solution
       2.2.  Stable connectivity for distributed network/OAM functions
   3.  Security Considerations
   4.  No IPv4 for the ACP
   5.  IANA Considerations
   6.  Acknowledgements
   7.  Change log [RFC Editor: Please remove]
   8.  References
   Authors' Addresses

1.  Introduction

1.1.  Self-dependent OAM connectivity

OAM (Operations, Administration and Management) processes for data networks often suffer from circular dependencies when they rely on connectivity through the very network that is being managed:

The ability to perform OAM operations on a network device first requires the execution of the OAM procedures that create network connectivity to that device in all intervening devices. This typically leads to a sequential, 'expanding ring' style of configuration from a NOC (Network Operations Center). It also leads to tight dependencies between provisioning tools and the security enrollment of devices: any process that wants to enroll multiple devices along a newly deployed network topology needs to interlock tightly with the provisioning process that creates connectivity before the enrollment can move on to the next device.
When performing change operations on a network, it is likewise necessary to ensure at every step of the process that connectivity to remote devices is not interrupted. This especially includes change provisioning of routing, security and addressing policies in the network, which often becomes necessary through mergers and acquisitions, the introduction of IPv6, or other major overhauls of the infrastructure design. Examples include changes of IGP protocols or areas, moves from PD (Provider Dependent) to PI (Provider Independent) addressing, or systematic topology changes.

All these circular dependencies make OAM processes complex and potentially fragile. When automation is used, for example through provisioning systems or network controllers, this complexity extends into that automation software.

1.2.  Data Communication Networks (DCNs)

In the late 1990s and early 2000s, IP networks became the method of choice for building separate OAM networks for the communications infrastructure of service providers. This concept was standardized in G.7712/Y.1703 and called "Data Communications Networks" (DCN). These were (and still are) physically separate IP(/MPLS) networks that provide access to the OAM interfaces of all the equipment that has to be managed, from PSTN (Public Switched Telephone Network) switches over optical equipment to today's Ethernet and IP/MPLS production network equipment.

Such DCNs provide stable connectivity that is not subject to the aforementioned problems because they are entirely separate networks: configuration changes of the production IP network are performed via the DCN but never affect the DCN configuration. Of course, this approach comes at the cost of buying and operating a separate network, and that cost is not feasible for many networks, most notably smaller service providers, most enterprises and typical IoT networks.

1.3.  Leveraging the ACP

One goal of the Autonomic Control Plane (ACP, as defined in [I-D.ietf-anima-autonomic-control-plane]) in Autonomic Networks is to provide stable connectivity similar to a DCN, but without having to build a separate DCN. It is clear that such an 'in-band' approach can never fully achieve the same level of separation, but the goal is to get as close to it as possible.

This solution approach has several aspects. One aspect is designing the implementation of the ACP in network devices so that it actually operates without interruption by changes in what we call in this document the "data-plane", i.e. the operator- or controller-configured services planes of the network equipment. This aspect is not covered in this document.

Another aspect is how to leverage the stable IPv6 connectivity provided by the ACP to build actual OAM solutions. This is the scope of this document.

2.  Solutions

2.1.  Stable connectivity for centralized OAM operations

The ANI is the "Autonomic Networking Infrastructure", consisting of secure zero-touch bootstrap (BRSKI - [I-D.ietf-anima-bootstrapping-keyinfra]), generic signaling (GRASP - [I-D.ietf-anima-grasp]) and the Autonomic Control Plane (ACP - [I-D.ietf-anima-autonomic-control-plane]).
See [I-D.ietf-anima-reference-model] for an overview of the ANI and how its components interact, and [RFC7575] for concepts and terminology of the ANI and autonomic networks.

This section describes stable connectivity for centralized OAM operations via the ACP/ANI, starting with what we expect to be the easiest short-term deployment option. It then describes the limitations and challenges of that approach and their solutions/workarounds, finishing with the preferred target option of autonomic NOC devices in Section 2.1.6.

This order was chosen because it helps to explain how simple initial use of the ACP can be, how difficult workarounds can become (and therefore what to avoid), and finally because one very promising long-term solution alternative is exactly like the easiest short-term solution, only virtualized and automated.

In the most common case, OAM operations will be performed by one or more applications running on a variety of centralized NOC systems that communicate with network devices. We describe approaches of increasing sophistication to leverage the ACP for stable connectivity. The descriptions show that there is a wide range of options, some of which are simple, some more complex.

We see three stages of interest:

o  There are simple options, described first, that we consider to be good starting points to operationalize the use of the ACP for stable connectivity.

o  There are more advanced intermediate options that try to establish backward compatibility with existing deployed approaches, such as leveraging NAT (Network Address Translation). Selection and deployment of these approaches needs to be carefully vetted to ensure that they provide a positive RoI (Return on Investment). This very much depends on the operational processes of the network operator.

o  It seems clearly feasible to build towards a long-term configuration that provides all the desired operational, zero-touch and security benefits of an autonomic network, but a range of details for this still have to be worked out.

2.1.1.  Simple connectivity for non-autonomic NMS hosts

In the most simple deployment case, the ACP extends all the way into the NOC via an autonomic device set up as an ACP edge device that provides native access to the ACP for NMS hosts (as defined in Section 6.1 of [I-D.ietf-anima-autonomic-control-plane]). It acts as the default router for those hosts and provides them with IPv6 connectivity into the ACP only - no IPv4 connectivity. NMS hosts in this setup need to support IPv6 but require no other modifications to leverage the ACP.

Note that even though the ACP only uses IPv6, it can and should be used to provide stable connectivity for the management of any network: IPv4-only, dual-stack or IPv6-only.

This setup is sufficient for troubleshooting OAM operations such as SSH into network devices, NMS hosts that perform SNMP read operations for status checking, software downloads into autonomic devices, and so on. In conjunction with otherwise unmodified OAM operations via separate NMS hosts it can provide a good subset of the desired stable connectivity goals of the ACP.
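As a simple illustration of this stage, the following Python sketch shows how an IPv6-enabled NMS host could verify that a device's SSH port answers via its ACP address before an operator connects to it. The ACP ULA address is a made-up example; in practice it would come from an inventory system or from DNS names as described next.

   import socket

   # Hypothetical ACP ULA address of a managed device; in practice this
   # would come from an inventory system or from a DNS name such as
   # devicename-acp.noc.example.com (see below).
   ACP_ADDR = "fd89:b714:f3db:0:200:0:6400:1"

   def acp_ssh_reachable(addr, timeout=3.0):
       # Return True if TCP port 22 (SSH) of the device answers on its
       # ACP address from this IPv6-enabled NMS host.
       try:
           with socket.create_connection((addr, 22), timeout=timeout):
               return True
       except OSError:
           return False

   print("SSH via ACP reachable:", acp_ssh_reachable(ACP_ADDR))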
Because the ACP provides 'only' IPv6 connectivity, and because the addressing used by the ACP does not include any addressing structure that operations in a NOC often rely on to recognize where devices are in the network, it is likely highly desirable to set up DNS (Domain Name System - see [RFC1034]) so that the ACP IPv6 addresses of autonomic devices are reachable via logical domain names. For example, if DNS in the network was set up with names for network devices such as devicename.noc.example.com, then the ACP address of that device could be mapped to devicename-acp.noc.example.com.

2.1.2.  Challenges and limitations of simple connectivity

This simple connectivity for non-autonomic NMS hosts suffers from a range of possible challenges (operators may not be able to deploy it this way) and limitations (operators cannot achieve the desired goals with this setup). The following list summarizes them, and the following sections describe additional mechanisms to overcome them.

Note that these challenges and limitations exist because the ACP is primarily designed to support distributed ASAs (Autonomic Service Agents) in the most lightweight fashion, and does not mandate support for additional mechanisms that would best support centralized NOC operations. It is this document that describes such additional (short-term) workarounds and (long-term) extensions.

1.  Limitation: NMS hosts cannot directly probe whether the desired so-called 'data-plane' network connectivity works, because they do not have direct access to it. This problem is not dissimilar to probing connectivity for other services (such as VPN services) to which they do not have direct access, so the NOC may already employ appropriate mechanisms to deal with this issue (probing proxies). See Section 2.1.3 for solutions.

2.  Challenge: NMS hosts need to support IPv6, which is still often not possible in enterprise networks. See Section 2.1.4 for (highly undesirable) workarounds.

3.  Limitation: Performance of the ACP will be limited compared to normal 'data-plane' connectivity. The setup of the ACP will often support only non-hardware-accelerated forwarding. Running a large amount of traffic through the ACP, especially for tasks where it is not necessary, will reduce its performance and effectiveness for those operations where it is necessary or highly desirable. See Section 2.1.5 for solutions.

4.  Limitation: Security of the ACP is reduced by exposing the ACP natively (and unencrypted) onto a LAN in the NOC where the NOC devices are attached to it. See Section 2.1.7 for solutions.

These four problems can be tackled independently of each other by solution improvements. Combining those improvements ultimately leads towards the target long-term solution.

2.1.3.  Simultaneous ACP and data-plane connectivity

Simultaneous connectivity to both the ACP and the data-plane can be achieved in a variety of ways. If the data-plane is IPv4-only, then any method for dual-stack attachment of the NOC device/application will suffice: IPv6 connectivity from the NOC provides access via the ACP, IPv4 provides access via the data-plane.
If, as explained above in the most simple case, an autonomic device supports native attachment to the ACP and the existing NOC setup is IPv4-only, then it can be sufficient to simply attach the ACP edge device(s) as the IPv6 default router to the NOC LANs and keep the existing IPv4 default-router setup unchanged.

If the data-plane of the network also supports IPv6, then the NOC devices that need access to the ACP should have a dual-homed IPv6 setup. One option is to make the NOC devices multi-homed, with one logical or physical IPv6 interface connecting to the data-plane and another into the ACP. The LAN that provides access to the ACP should then be given an IPv6 prefix that shares a common prefix with the IPv6 ULA (see [RFC4193]) of the ACP, so that the standard IPv6 interface selection rules on the NOC host result in the desired automatic selection of the right interface: the ACP-facing interface for connections to ACP addresses, and the data-plane interface for anything else. If this cannot be achieved automatically, it needs to be done via simple static IPv6 routes on the NOC host.

Providing two virtual (e.g. dot1q subinterface) connections into NOC hosts may be seen as undesirable complexity. In that case the routing policy that provides access to both the ACP and the data-plane via IPv6 needs to be implemented in the NOC network itself: the NMS host gets a single attachment interface, but still with the same two IPv6 addresses as before - one for use towards the ACP, one towards the data-plane. The first-hop router connecting to the NMS host would then have separate interfaces: one towards the data-plane, one towards the ACP. Routing of traffic from NMS hosts would then have to be based on the source IPv6 address of the host: traffic from the address designated for ACP use would be routed towards the ACP, traffic from the designated data-plane address towards the data-plane.

In the most simple case, we get the following topology: existing NMS hosts connect via an existing NOClan and the existing first-hop router Rtr1 to the data-plane. Rtr1 is not made autonomic; instead, the edge router of the autonomic network, ANrtr, is attached via a separate interface to Rtr1 and provides access to the ACP via ACPaccessLan. Rtr1 is configured with the above-described IPv6 source-routing policies, and the NOC-app-devices are given the secondary IPv6 address for connectivity into the ACP.

                                      --... (data-plane)
       NOC-app-device(s) -- NOClan -- Rtr1
                                      --- ACPaccessLan -- ANrtr ... (ACP)

                                 Figure 1

If Rtr1 were upgraded to also implement Autonomic Networking and the ACP, the picture would change as follows:

                                         ---- ... (data-plane)
       NOC-app-device(s) ---- NOClan --- ANrtr1
                                         .  . ---- ... (ACP)
                                          \-/
                                 (ACP to data-plane loopback)

                                 Figure 2

In this case, ANrtr1 would have to implement some more advanced routing, such as cross-VRF routing, because the data-plane and ACP are most likely run in separate VRFs. A workaround without additional software functionality could be a physical external loopback cable between two ports of ANrtr1 to connect the data-plane and ACP VRFs, as shown in the figure. A (virtual) software loopback between the ACP and data-plane VRFs would of course be the better solution.
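The interface/source-address selection behavior described above - the host preferring its ACP-facing ULA address for ACP destinations because it shares the longest prefix - can be approximated with a simple longest-matching-prefix comparison. The following Python sketch only illustrates the idea with made-up addresses; real hosts apply their built-in default address selection rules rather than application code.

   import ipaddress

   # Hypothetical addresses of a dual-homed NOC host: one on the data-plane
   # LAN, one on the ACP access LAN (sharing the ACP ULA prefix).
   HOST_ADDRESSES = [
       ipaddress.IPv6Address("2001:db8:10::21"),        # data-plane facing
       ipaddress.IPv6Address("fd89:b714:f3db:2::21"),   # ACP facing (ULA)
   ]

   def pick_source(destination):
       # Crude stand-in for the host's default source address selection:
       # choose the local address sharing the longest prefix with the
       # destination, so ACP (ULA) targets use the ACP-facing address.
       dst = int(ipaddress.IPv6Address(destination))
       return max(HOST_ADDRESSES,
                  key=lambda a: 128 - (int(a) ^ dst).bit_length())

   print(pick_source("fd89:b714:f3db:0:200:0:6400:1"))  # ACP-facing source
   print(pick_source("2001:db8:20::5"))                 # data-plane source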
2.1.4.  IPv4-only NMS hosts

The ACP does not support IPv4, in order to ensure long-term simplicity: single-stack IPv6 management of the network via the ACP and (as needed) the data-plane, independent of whether the data-plane is dual-stack, has IPv4 as a service, or is single-stack IPv6. Dual-plane management - IPv6 for the ACP, IPv4 for the data-plane - is likewise an architecturally simple option.

The downside of this architectural decision is the potential need for short-term workarounds when the operational practices in a network cannot meet these target expectations. This section motivates when and why these workarounds may be necessary and describes them. All the workarounds described in this section are HIGHLY UNDESIRABLE. The only long-term solution is to enable IPv6 on NMS hosts.

Most network equipment today supports IPv6, but it is by far not ubiquitously supported in NOC backend solutions (hardware/software), especially not in the product space for enterprises. Even when it is supported, there are often additional limitations or issues with using it in a dual-stack setup, or the operator mandates single-stack for simplicity of all operations. For these reasons an IPv4-only management plane is still required and common practice in many enterprises. Without the desire to leverage the ACP, this required and common practice is not a problem for those enterprises, even when they run dual-stack in the network. We therefore document these workarounds here because they address a short-term deployment challenge specific to the operations of the ACP.

To bridge an IPv4-only management plane with the ACP, IPv4-to-IPv6 NAT can be used. This NAT setup could, for example, be done in Rtr1 in the above picture to also support IPv4-only NMS hosts connected to NOClan.

To support connections initiated from IPv4-only NMS hosts towards the ACP of network devices, it is necessary to create a static mapping of ACP IPv6 addresses into an unused IPv4 address space and a dynamic or static mapping of the IPv4 NOC application device address (prefix) into IPv6, routed in the ACP. The main issue in this setup is the mapping of all ACP IPv6 addresses to IPv4. Without further network intelligence, this needs to be a 1:1 address mapping, because the prefix used for ACP IPv6 addresses is too long to be mapped directly into IPv4 on a prefix basis.

One could implement dynamic mappings in router software by leveraging DNS, but it seems highly undesirable to implement such complex technologies for something that is ultimately a temporary problem (IPv4-only NMS hosts). With today's operational directions it is likely preferable to automate the setup of 1:1 NAT mappings in that NAT router as part of the automation process of network device enrollment into the ACP; a sketch of such an automation step is shown at the end of this section.

The ACP can also be used for connections initiated by the network device towards the NMS hosts, for example syslog from autonomic devices. In this case, static mappings of the NMS hosts' IPv4 addresses are required. This can easily be done with a static prefix mapping into IPv6.

Overall, the use of NAT is especially subject to RoI (Return on Investment) considerations, but the methods described here may not be too different from the same problems encountered entirely independently of AN/ACP when some parts of the network are to introduce IPv6 but NMS hosts are not (yet) upgradeable.
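The following Python sketch illustrates what such an enrollment-time automation step could compute: a stable 1:1 mapping table between ACP IPv6 addresses and a reserved, otherwise unused IPv4 range. The addresses and the pool are purely illustrative assumptions, and pushing the resulting mappings to the actual NAT router (CLI, NETCONF, etc.) is outside the scope of the sketch.

   import ipaddress

   # Hypothetical, otherwise unused IPv4 range reserved in the NOC for the
   # 1:1 mappings (documentation space used here purely for illustration).
   IPV4_POOL = ipaddress.ip_network("192.0.2.0/24")

   def build_nat_table(acp_addresses):
       # Assign each ACP IPv6 address a stable IPv4 address from the pool;
       # the result would then be pushed to the NAT router by whatever
       # configuration mechanism the operator already uses.
       pool = IPV4_POOL.hosts()
       return {ipaddress.IPv6Address(a): next(pool)
               for a in sorted(acp_addresses)}

   table = build_nat_table([
       "fd89:b714:f3db:0:200:0:6400:1",
       "fd89:b714:f3db:0:200:0:6400:2",
   ])
   for v6, v4 in table.items():
       print(v4, "<->", v6)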
2.1.5.  Path selection policies

As mentioned above, the ACP is not expected to have high performance, because its primary goals are connectivity and security, and for existing network device platforms this often means that it is significantly more effort to implement that additional connectivity with hardware acceleration than without - especially because of the desire to support full encryption across the ACP to achieve the desired security.

Some of these issues may go away in the future with further adoption of the ACP and with network device designs that better cater to the needs of a separate OAM plane, but it is wise to plan even long-term designs of the solution so that they do NOT depend on high performance of the ACP. This is the opposite of the expectation that future NMS hosts will have IPv6, which makes any considerations for IPv4/NAT in this solution temporary.

To work around the expected performance limitations of the ACP, we expect to have the above-described dual connectivity via both ACP and data-plane between NOC application devices and AN devices with ACP. The ACP connectivity is expected to always be there (as soon as a device is enrolled), but the data-plane connectivity is only present under normal operation and will not be present during, e.g., early stages of device bootstrap, failures, provisioning mistakes or network configuration changes.

The desired policy is therefore as follows: in the absence of further security considerations (see below), traffic between NMS hosts and AN devices should prefer data-plane connectivity and resort to the ACP only when necessary - unless the operation is known to be so closely tied to the cases where the ACP is necessary that it makes no sense to try the data-plane. An example is of course the SSH connection from the NOC into a network device to troubleshoot network connectivity; this could easily always rely on the ACP. Likewise, if an NMS host is known to transmit large amounts of data and it uses the ACP, then its performance needs to be controlled so that it does not overload the ACP. Typical examples of this are software downloads.

There is a wide range of methods to build up these policies. We describe a few:

Ideally, a NOC system would learn and keep track of all addresses of a device (ACP and the various data-plane addresses). Every action of the NOC system would indicate via a "path-policy" what type of connection it needs (e.g. data-plane only, ACP only, default to data-plane with fallback to ACP, ...). A connection policy manager would then build the connection to the target using the right address(es). In the shorter term, a common practice is to identify different paths to a device via different names (e.g. loopback vs. interface addresses). This approach can be expanded to ACP use, whether with NOC-system-local names or DNS. We describe example schemes using DNS:

DNS can be used to set up names for the same network devices but with different addresses assigned: one name (name.noc.example.com) with only the data-plane address(es) (IPv4 and/or IPv6), to be used for probing connectivity or performing routine software downloads that may stall/fail when there are connectivity issues; one name (name-acp.noc.example.com) with only the ACP-reachable address of the device, for troubleshooting and for probing/discovery that should always use only the ACP; and one name with both data-plane and ACP addresses (name-both.noc.example.com).
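A minimal sketch of such a connection policy manager, assuming the hypothetical per-path naming scheme above, is shown below: it defaults to the data-plane name and falls back to the ACP name when the data-plane connection cannot be established. The policy names and domain are assumptions for the example only.

   import socket

   def connect_with_policy(device, port, policy="data-plane-preferred"):
       # Build a TCP connection according to a simple path policy, using
       # the per-path DNS names described above.
       if policy == "acp-only":
           names = [device + "-acp.noc.example.com"]
       elif policy == "data-plane-only":
           names = [device + ".noc.example.com"]
       else:  # default to the data-plane, fall back to the ACP
           names = [device + ".noc.example.com",
                    device + "-acp.noc.example.com"]
       last_error = None
       for name in names:
           try:
               return socket.create_connection((name, port), timeout=5)
           except OSError as error:
               last_error = error
       raise last_error

   # A routine NETCONF session prefers the data-plane, while a
   # troubleshooting SSH session can be forced onto the ACP:
   #   connect_with_policy("device1", 830)
   #   connect_with_policy("device1", 22, policy="acp-only")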
Traffic policing and/or shaping at the ACP edge in the NOC can be used to throttle applications such as software downloads into the ACP.

MP-TCP (Multipath TCP - see [RFC6824]) is a very attractive candidate for automating the use of both data-plane and ACP and minimizing or fully avoiding the need for the above-mentioned logical names to pre-select the desired connectivity (data-plane only, ACP only, both). For example, a setup for non-MP-TCP-aware applications would be as follows:

DNS naming is set up to provide the ACP IPv6 address of network devices. Unbeknownst to the application, MP-TCP is used. MP-TCP mutually discovers, between the NOC and the network device, the data-plane address and carries all traffic across it whenever that MP-TCP subflow across the data-plane can be built.

In autonomic network devices where data-plane and ACP are in separate VRFs, it is clear that this type of MP-TCP subflow creation across different VRFs is new/added functionality. Likewise, the policies for preferring a particular address (NOC device) or VRF (AN device) for the traffic are potentially also policies not provided as a standard.

2.1.6.  Autonomic NOC devices/applications

Setting up connectivity between the NOC and autonomic devices when the NOC device itself is non-autonomic is, as mentioned in the beginning, a security issue. As shown in the previous sections, it also results in a range of connectivity considerations, some of which may be quite undesirable or complex to operationalize.

Making NMS hosts autonomic and having them participate in the ACP is therefore not only a highly desirable solution to the security issues, it can also make the ACP easier to operationalize, because it minimizes NOC-specific edge considerations - the ACP is simply built all the way automatically, even inside the NOC, and only authorized and authenticated NOC devices/applications will have access to it.

Supporting the ACP all the way into an application device requires implementing the following aspects in it: the AN bootstrap/enrollment mechanisms, the secure channel for the ACP, and at least the host side of IPv6 routing setup for the ACP. Minimally, this could all be implemented as an application and be made available to the host OS via, e.g., a tap driver, to make the ACP show up as another IPv6-enabled interface.

Having said this: if the structure of NMS hosts is transformed through virtualization anyhow, then it may be considered equally secure and appropriate to construct a (physical) NMS host system by combining a virtual AN/ACP-enabled router with non-AN/ACP-enabled NOC application VMs via a hypervisor, leveraging the configuration options described in the previous sections but simply virtualizing them.
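To give a rough idea of the 'tap driver' option mentioned above, the following Linux-specific Python sketch creates a tap interface on the NMS host. It is only the very first step of such an application: the code that would enroll the host, build the ACP secure channel and copy ACP traffic through this interface is not shown, and the interface name is an arbitrary assumption.

   import fcntl
   import os
   import struct

   # Linux-specific constants for the tun/tap driver (see <linux/if_tun.h>).
   TUNSETIFF = 0x400454ca
   IFF_TAP = 0x0002
   IFF_NO_PI = 0x1000

   def create_acp_tap(name=b"acp0"):
       # Create a tap interface (here arbitrarily named "acp0") on the NMS
       # host; requires CAP_NET_ADMIN.  The ACP application would assign
       # its ACP ULA address to this interface and copy packets between
       # this file descriptor and its ACP secure channel(s).
       fd = os.open("/dev/net/tun", os.O_RDWR)
       ifreq = struct.pack("16sH", name, IFF_TAP | IFF_NO_PI)
       fcntl.ioctl(fd, TUNSETIFF, ifreq)
       return fd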
2.1.7.  Encryption of data-plane connections

When combining ACP and data-plane connectivity for availability and performance reasons, this too has an impact on security: when using the ACP, the traffic will mostly be protected by encryption, especially when considering the above-described use of AN application devices. If instead the data-plane is used, this is no longer the case unless it is done by the application.

The simplest solution to this problem exists when using AN-capable NMS hosts, because in that case the communicating AN-capable NMS host and the AN network device have certificates from the AN enrollment process that they can mutually trust (same AN domain). As a result, data-plane connectivity that supports it can simply leverage TLS/dTLS with mutual AN-domain certificate authentication - and does not incur new key management.

If this automatic security benefit is seen as most important, but a "full" ACP stack on the NMS host is unfeasible, then it would still be possible to design a stripped-down version of AN functionality for such NOC hosts that only provides enrollment of the NOC host into the AN domain, to the extent that the host receives an AN domain certificate, but without the host directly participating in the ACP afterwards. Instead, the host would just leverage TLS/dTLS using its AN certificate via the data-plane with AN network devices, as well as indirectly via the ACP through the above-mentioned in-NOC network edge connectivity into the ACP.

When using the ACP itself, TLS/dTLS on the transport layer between NMS hosts and network devices is somewhat of a double price to pay (the ACP also encrypts) and could potentially be optimized away, but given the assumed lower performance of the ACP, this seems to be an unnecessary optimization.

2.1.8.  Long-term direction of the solution

If we consider what could be the most lightweight and autonomic long-term solution based on the technologies described above, we see the following direction:

1.  NMS hosts should at least support IPv6. IPv4/IPv6 NAT in the network to enable use of the ACP is undesirable in the long term. Having IPv4-only applications automatically leverage IPv6 connectivity via host-stack options is likely not feasible (NOTE: this still has to be vetted more).

2.  Build the ACP as a lightweight application for NMS hosts so that the ACP extends all the way into the actual NMS hosts.

3.  Leverage, and as necessary enhance, MP-TCP with automatic dual connectivity: if an MP-TCP-unaware application is using ACP connectivity, the policies used should add subflow(s) via the data-plane and prefer them.

4.  Consider how to best map NMS host desires to underlying transport mechanisms: with the above three points, not all options are covered. Depending on the OAM operation, one may still want only ACP, only data-plane, or automatically prefer one over the other, and/or use the ACP with low or high performance (for emergency OAM actions such as countering DDoS). It is not clear today what the simplest set of tools is to enable explicitly choosing the desired behavior of each OAM operation. The use of the above-mentioned DNS and MP-TCP mechanisms is a start, but this will require additional thought. This is likely a specific case of the more generic scope of TAPS.

2.2.  Stable connectivity for distributed network/OAM functions

The ANI (ACP, Bootstrap, GRASP) can provide, via the GRASP protocol, common direct-neighbor discovery and capability negotiation (GRASP via ACP and/or data-plane) and stable and secure connectivity for functions running distributed in network devices (GRASP via ACP).
It can therefore eliminate the need to re-implement similar functions in each distributed function in the network. Today, every distributed protocol does this with functional elements usually called "Hello" mechanisms and with often protocol-specific security mechanisms.

KARP (Keying and Authentication for Routing Protocols, see [RFC6518]) has started to provide common directions and therefore to reduce the re-invention of at least some of the security aspects, but it only covers routing protocols, and it is unclear how well it is applicable to a potentially wider range of distributed network agents such as those performing distributed OAM functions. The ACP can help in these cases.

3.  Security Considerations

In this section, we discuss only security considerations not covered in the appropriate sub-sections of the solutions described.

Even though ACPs are meant to be isolated, explicit operator misconfiguration to connect to insecure OAM equipment and/or bugs in ACP devices may cause leakage into places where it is not expected. Mergers/acquisitions and other complex network reconfigurations affecting the NOC are typical examples.

The ULA addressing proposed in this document is preferred over globally reachable addresses because it is not routed in the global Internet and will therefore be subject to more filtering even in places where specific ULA addresses are being used.

Random ULA addressing provides more than sufficient protection against address collision, even though there is no central assignment authority. This is helped by the expectation that ACPs will never all be connected to each other; only a few ACPs may ever need to connect, e.g. when mergers and acquisitions occur.

If packets with unexpected ULA addresses are seen and one expects them to come from another network's ACP from which they leaked, then some form of ULA prefix registration (not allocation) can be beneficial. Some voluntary registries exist, for example https://www.sixxs.net/tools/grh/ula/, although none of them is preferred, because none is operated by a recognized authority. If an operator wants to make its ULA prefix known, it might need to register it with multiple existing registries.

Centrally assigned ULA addresses (ULA-C) were an attempt to introduce centralized registration of randomly assigned addresses and potentially even to carve out a different ULA prefix for such addresses. This proposal is currently not proceeding, and it is questionable whether the stable connectivity use case provides sufficient motivation to revive this effort.

Using current registration options implies that there will be no reverse DNS mapping for ACP addresses. Instead, one will have to rely on looking up the unknown/unexpected network prefix in the registry to determine the owner of these addresses.

Reverse DNS resolution may be beneficial for specific, already deployed, insecure legacy protocols on NOC OAM systems that intend to communicate via the ACP (e.g. TFTP) and that leverage reverse DNS for authentication. Given how the ACP provides path security, except potentially for the last hop in the NOC, the ACP does make it easier to extend the lifespan of such protocols in a secure fashion, as far as just the transport is concerned. The ACP does not, however, make reverse DNS lookup a secure authentication method. Any current and future protocols must rely on secure end-to-end communications (TLS, dTLS) and on identification and authentication via the certificates assigned to both ends. This is enabled by the certificate mechanisms of the ACP.
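As an illustration only: with Python's standard ssl module, such a mutually authenticated session over either the ACP or the data-plane could look roughly like the sketch below. The file names and the choice to disable host-name checking (relying on the AN-domain certificate content instead) are assumptions made for the example, not requirements of the ACP.

   import socket
   import ssl

   # Hypothetical file names; the CA and the host certificate/key are
   # assumed to come from the host's enrollment into the AN/ACP domain.
   DOMAIN_CA = "acp-domain-ca.pem"
   HOST_CERT = "nms-host-cert.pem"
   HOST_KEY = "nms-host-key.pem"

   def open_domain_authenticated_session(device_addr, port):
       # TLS client connection in which both ends present certificates
       # issued by the same AN domain CA, so no additional key management
       # is needed for OAM sessions over the data-plane or the ACP.
       context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
       context.load_verify_locations(DOMAIN_CA)      # trust the domain CA
       context.load_cert_chain(HOST_CERT, HOST_KEY)  # present our own cert
       context.check_hostname = False                # identity comes from
       context.verify_mode = ssl.CERT_REQUIRED       # the domain cert itself
       raw = socket.create_connection((device_addr, port), timeout=5)
       return context.wrap_socket(raw)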
If DNS, and especially reverse DNS, are set up, then they should be set up in an automated fashion, linked to the autonomic registrar backend, so that the DNS and reverse DNS records are actually derived from the subject name elements of the ACP device certificates - in the same way as the autonomic devices themselves derive their ULA addresses from their certificates - to ensure correct and consistent DNS entries.

If an operator feels that reverse DNS records are beneficial to its own operations but that they should not be made available publicly for reasons of "security" by concealment, then the case of ACP DNS entries is probably one of the least problematic use cases for split DNS: the ACP DNS names are only needed by the NMS hosts intending to use the ACP - not network-wide across the enterprise.

4.  No IPv4 for the ACP

The ACP is targeted to be IPv6-only, and the prior explanations in this document show that this can lead to some complexity when having to connect IPv4-only NOC solutions, and that it will be impossible to leverage the ACP when the OAM agents on an ACP network device do not support IPv6. Therefore, the question was raised whether the ACP should optionally also support IPv4.

The decision not to include IPv4 for the ACP in the use cases considered in this document was made for the following reasons:

In SP networks that have started to support IPv6, the next planned step is often to move IPv4 from being a native transport to being just a service on the edge. There is no benefit or need for multiple parallel transport families within the network, and standardizing on one reduces OPEX and improves reliability. This evolution in the data-plane makes it highly unlikely that investing development cycles into IPv4 support for the ACP will have a longer-term benefit or enough critical short-term use cases. Supporting only IPv6 for the ACP is purely a strategic choice to focus on the known important long-term goals.

In other types of networks as well, we think that the effort to support autonomic networking is better spent on ensuring that the one address family will be supported, so that all use cases will work with it long-term, instead of duplicating the effort for IPv4. This holds especially because auto-addressing for the ACP would be more complex with IPv4 than with IPv6, due to the size of the IPv4 address space.

5.  IANA Considerations

This document requests no action by IANA.

6.  Acknowledgements

This work originated from an Autonomic Networking project at Cisco Systems, which started in early 2010, including customers involved in the design and early testing. Many people contributed to the aspects described in this document, including, in alphabetical order: BL Balaji, Steinthor Bjarnason, Yves Herthoghs, Sebastian Meissner, Ravi Kumar Vadapalli. The authors would also like to thank Michael Richardson, James Woodyatt and Brian Carpenter for their review and comments. Special thanks to Sheng Jiang for his thorough review.
7.  Change log [RFC Editor: Please remove]

   03: Integrated fixes from the Shepherd review (Sheng Jiang).

   02: Refresh timeout.  Stable document, change in author association.

   01: Refresh timeout.  Stable document, no changes.

   00: Changed title/dates.

   individual-02: Updated references.

   individual-03: Modified ULA text to no longer suggest ULA-C as clearly better, but still mention it.

   individual-02: Added explanation why there is no IPv4 for the ACP.

   individual-01: Added security section discussing the role of address prefix selection and DNS for the ACP.  Title change to emphasize focus on OAM.  Expanded abstract.

   individual-00: Initial version.

8.  References

   [I-D.ietf-anima-autonomic-control-plane]
              Behringer, M., Eckert, T., and S. Bjarnason, "An Autonomic
              Control Plane", draft-ietf-anima-autonomic-control-plane-06
              (work in progress), March 2017.

   [I-D.ietf-anima-bootstrapping-keyinfra]
              Pritikin, M., Richardson, M., Behringer, M., Bjarnason, S.,
              and K. Watsen, "Bootstrapping Remote Secure Key
              Infrastructures (BRSKI)",
              draft-ietf-anima-bootstrapping-keyinfra-06 (work in
              progress), May 2017.

   [I-D.ietf-anima-grasp]
              Bormann, C., Carpenter, B., and B. Liu, "A Generic Autonomic
              Signaling Protocol (GRASP)", draft-ietf-anima-grasp-14 (work
              in progress), July 2017.

   [I-D.ietf-anima-reference-model]
              Behringer, M., Carpenter, B., Eckert, T., Ciavaglia, L.,
              Pierre, P., Liu, B., Nobre, J., and J. Strassner, "A
              Reference Model for Autonomic Networking",
              draft-ietf-anima-reference-model-04 (work in progress),
              July 2017.

   [RFC1034]  Mockapetris, P., "Domain names - concepts and facilities",
              STD 13, RFC 1034, DOI 10.17487/RFC1034, November 1987,
              <https://www.rfc-editor.org/info/rfc1034>.

   [RFC4193]  Hinden, R. and B. Haberman, "Unique Local IPv6 Unicast
              Addresses", RFC 4193, DOI 10.17487/RFC4193, October 2005,
              <https://www.rfc-editor.org/info/rfc4193>.

   [RFC6518]  Lebovitz, G. and M. Bhatia, "Keying and Authentication for
              Routing Protocols (KARP) Design Guidelines", RFC 6518,
              DOI 10.17487/RFC6518, February 2012,
              <https://www.rfc-editor.org/info/rfc6518>.

   [RFC6824]  Ford, A., Raiciu, C., Handley, M., and O. Bonaventure, "TCP
              Extensions for Multipath Operation with Multiple Addresses",
              RFC 6824, DOI 10.17487/RFC6824, January 2013,
              <https://www.rfc-editor.org/info/rfc6824>.

   [RFC7575]  Behringer, M., Pritikin, M., Bjarnason, S., Clemm, A.,
              Carpenter, B., Jiang, S., and L. Ciavaglia, "Autonomic
              Networking: Definitions and Design Goals", RFC 7575,
              DOI 10.17487/RFC7575, June 2015,
              <https://www.rfc-editor.org/info/rfc7575>.

Authors' Addresses

   Toerless Eckert
   Futurewei Technologies Inc.
   2330 Central Expy
   Santa Clara  95050
   USA

   Email: tte+ietf@cs.fau.de

   Michael H. Behringer

   Email: michael.h.behringer@gmail.com