idnits 2.17.1 draft-ietf-anima-stable-connectivity-04.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** The document seems to lack separate sections for Informative/Normative References. All references will be assumed normative when checking for downward references. ** The abstract seems to contain references ([RFC6291]), which it shouldn't. Please replace those with straight textual mentions of the documents in question. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year -- The document date (July 27, 2017) is 2462 days in the past. Is this intentional? 
Checking references for intended status: Informational ---------------------------------------------------------------------------- == Unused Reference: 'RFC5246' is defined on line 822, but no explicit reference was found in the text == Unused Reference: 'RFC6347' is defined on line 833, but no explicit reference was found in the text == Unused Reference: 'RFC6418' is defined on line 837, but no explicit reference was found in the text == Outdated reference: A later version (-30) exists of draft-ietf-anima-autonomic-control-plane-08 == Outdated reference: A later version (-45) exists of draft-ietf-anima-bootstrapping-keyinfra-07 == Outdated reference: A later version (-10) exists of draft-ietf-anima-reference-model-04 ** Obsolete normative reference: RFC 5246 (Obsoleted by RFC 8446) ** Obsolete normative reference: RFC 6347 (Obsoleted by RFC 9147) ** Obsolete normative reference: RFC 6434 (Obsoleted by RFC 8504) ** Obsolete normative reference: RFC 6824 (Obsoleted by RFC 8684) Summary: 6 errors (**), 0 flaws (~~), 7 warnings (==), 1 comment (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 ANIMA T. Eckert, Ed. 3 Internet-Draft Huawei 4 Intended status: Informational M. Behringer 5 Expires: January 28, 2018 July 27, 2017 7 Using Autonomic Control Plane for Stable Connectivity of Network OAM 8 draft-ietf-anima-stable-connectivity-04 10 Abstract 12 OAM (Operations, Administration and Maintenance - as per BCP161, 13 [RFC6291]) processes for data networks are often subject to the 14 problem of circular dependencies when relying on connectivity 15 provided by the network to be managed for the OAM purposes. 
16 Provisioning during device/network bring-up tends to be far less easy 17 to automate than service provisioning later on, changes in core 18 network functions impacting reachability may not be easy to 19 automate either because of ongoing connectivity requirements for the 20 OAM, and widely used OAM protocols are not secure enough to be 21 carried across the network without security concerns. 23 This document describes how to integrate OAM with the autonomic 24 control plane (ACP) in Autonomic Networks (AN) to provide stable and 25 secure connectivity for conducting OAM. This connectivity is not 26 subject to the aforementioned circular dependencies. 28 Status of This Memo 30 This Internet-Draft is submitted in full conformance with the 31 provisions of BCP 78 and BCP 79. 33 Internet-Drafts are working documents of the Internet Engineering 34 Task Force (IETF). Note that other groups may also distribute 35 working documents as Internet-Drafts. The list of current Internet- 36 Drafts is at http://datatracker.ietf.org/drafts/current/. 38 Internet-Drafts are draft documents valid for a maximum of six months 39 and may be updated, replaced, or obsoleted by other documents at any 40 time. It is inappropriate to use Internet-Drafts as reference 41 material or to cite them other than as "work in progress." 43 This Internet-Draft will expire on January 28, 2018. 45 Copyright Notice 47 Copyright (c) 2017 IETF Trust and the persons identified as the 48 document authors. All rights reserved. 50 This document is subject to BCP 78 and the IETF Trust's Legal 51 Provisions Relating to IETF Documents 52 (http://trustee.ietf.org/license-info) in effect on the date of 53 publication of this document. Please review these documents 54 carefully, as they describe your rights and restrictions with respect 55 to this document.
Code Components extracted from this document must 56 include Simplified BSD License text as described in Section 4.e of 57 the Trust Legal Provisions and are provided without warranty as 58 described in the Simplified BSD License. 60 Table of Contents 62 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2 63 1.1. Self dependent OAM Connectivity . . . . . . . . . . . . . 2 64 1.2. Data Communication Networks (DCNs) . . . . . . . . . . . 3 65 1.3. Leveraging the ACP . . . . . . . . . . . . . . . . . . . 4 66 2. Solutions . . . . . . . . . . . . . . . . . . . . . . . . . . 4 67 2.1. Stable Connectivity for Centralized OAM . . . . . . . . . 4 68 2.1.1. Simple Connectivity for Non-ACP capable NMS Hosts . . 5 69 2.1.2. Challenges and Limitation of Simple Connectivity . . 6 70 2.1.3. Simultaneous ACP and Data Plane Connectivity . . . . 7 71 2.1.4. IPv4-only NMS Hosts . . . . . . . . . . . . . . . . . 9 72 2.1.5. Path Selection Policies . . . . . . . . . . . . . . . 10 73 2.1.6. Autonomic NOC Device/Applications . . . . . . . . . . 12 74 2.1.7. Encryption of data-plane connections . . . . . . . . 12 75 2.1.8. Long Term Direction of the Solution . . . . . . . . . 13 76 2.2. Stable Connectivity for Distributed Network/OAM . . . . . 14 77 3. Security Considerations . . . . . . . . . . . . . . . . . . . 14 78 4. No IPv4 for ACP . . . . . . . . . . . . . . . . . . . . . . . 16 79 5. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 16 80 6. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 16 81 7. Change log [RFC Editor: Please remove] . . . . . . . . . . . 17 82 8. References . . . . . . . . . . . . . . . . . . . . . . . . . 17 83 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 19 85 1. Introduction 87 1.1. 
Self dependent OAM Connectivity 89 OAM (Operations, Administration and Maintenance - as per BCP161, 90 [RFC6291]) for data networks is often subject to the problem of 91 circular dependencies when relying on the connectivity service 92 provided by the network to be managed. OAM can easily but 93 unintentionally break the connectivity required for its own 94 operations. Avoiding these problems can lead to complexity in OAM. 95 This document describes this problem and how to use the Autonomic 96 Control Plane (ACP) to solve it without further OAM complexity: 98 The ability to perform OAM on a network device requires first the 99 execution of OAM necessary to create network connectivity to that 100 device in all intervening devices. This typically leads to 101 sequential, 'expanding ring configuration' from a NOC (Network 102 Operations Center). It also leads to tight dependencies between 103 provisioning tools and security enrollment of devices. Any process 104 that wants to enroll multiple devices along a newly deployed network 105 topology needs to tightly interlock with the provisioning process 106 that creates connectivity before the enrollment can move on to the 107 next device. 109 When performing change operations on a network, it is likewise 110 necessary to ensure at every step of that process that there is no 111 interruption of connectivity that could lead to loss of 112 connectivity to remote devices. This especially includes change 113 provisioning of routing, forwarding, security and addressing policies 114 in the network that often occur through mergers and acquisitions, the 115 introduction of IPv6 or other major overhauls of the infrastructure 116 design. Examples include change of an IGP or its areas, PA (Provider 117 Aggregatable) to PI (Provider Independent) addressing, or systematic 118 topology changes (such as L2 to L3 changes). 120 All these circular dependencies make OAM complex and potentially 121 fragile.
When automation is being used, for example through 122 provisioning systems, this complexity extends into that automation 123 software. 125 1.2. Data Communication Networks (DCNs) 127 In the late 1990s and early 2000s, IP networks became the method of 128 choice to build separate OAM networks for the communications 129 infrastructure within Network Providers. This concept was 130 standardized in ITU-T G.7712/Y.1703 [ITUT] and called "Data 131 Communications Networks" (DCN). These were (and still are) 132 physically separate IP(/MPLS) networks that provide access to OAM 133 interfaces of all equipment that had to be managed, from PSTN (Public 134 Switched Telephone Network) switches and optical equipment to 135 today's Ethernet and IP/MPLS production network equipment. 137 Such DCNs provide stable connectivity that is not subject to the aforementioned 138 problems because they are entirely separate networks, so changing the 139 configuration of the production IP network is done via the DCN but 140 never affects the DCN configuration. Of course, this approach comes 141 at the cost of buying and operating a separate network, and this cost is 142 not feasible for many providers, most notably smaller providers, most 143 enterprises and typical IoT networks (Internet of Things). 145 1.3. Leveraging the ACP 147 One of the goals of the Autonomic Networks Autonomic Control Plane 148 (ACP as defined in [I-D.ietf-anima-autonomic-control-plane]) is to 149 provide stable connectivity similar to a DCN, but without having to 150 build a separate DCN. It is clear that such an 'in-band' approach can 151 never fully achieve the same level of separation, but the goal is to 152 get as close to it as possible. 154 This solution approach has several aspects.
One aspect is designing 155 the implementation of the ACP in network devices to make it actually 156 perform without interruption by changes in what we will call in this 157 document the "data-plane", i.e., the operator- or controller- 158 configured service planes of the network equipment. This aspect is 159 not currently covered in this document. 161 Another aspect is how to leverage the stable IPv6 connectivity 162 provided by the ACP for OAM purposes. This is the current scope of 163 this document. 165 2. Solutions 167 2.1. Stable Connectivity for Centralized OAM 169 The ANI is the "Autonomic Networking Infrastructure" consisting of 170 secure zero touch Bootstrap (BRSKI - 171 [I-D.ietf-anima-bootstrapping-keyinfra]), GeneRic Autonomic Signaling 172 Protocol (GRASP - [I-D.ietf-anima-grasp]), and Autonomic Control 173 Plane (ACP - [I-D.ietf-anima-autonomic-control-plane]). Refer to 174 [I-D.ietf-anima-reference-model] for an overview of the ANI and how 175 its components interact and [RFC7575] for concepts and terminology of 176 ANI and autonomic networks. 178 This section describes stable connectivity for centralized OAM via 179 ACP/ANI, starting with what we expect to be the easiest short-term 180 option to deploy. It then describes limitations and challenges of 181 that approach and their solutions/workarounds, finishing with the 182 preferred target option of autonomic NOC devices in Section 2.1.6. 184 This order was chosen because it helps to explain how simple initial 185 use of the ACP can be, how difficult workarounds can become (and 186 therefore what to avoid), and finally because one very promising 187 long-term solution alternative is exactly like the easiest short- 188 term solution, only virtualized and automated. 190 In the most common case, OAM will be performed by one or more 191 applications running on a variety of centralized NOC systems that 192 communicate with network devices.
We describe increasingly advanced 193 approaches to leveraging the ACP for stable connectivity. There is a 194 wide range of options, some of which are simple, some more complex. 196 Three stages can be considered: 198 o There are simple options described in sections Section 2.1.1 199 through Section 2.1.3 that we consider to be good starting points 200 to operationalize the use of the ACP for stable connectivity 201 today. These options require only network and OAM/NOC device 202 configuration. 204 o There are workarounds to connect the ACP to non-IPv6 capable NOC 205 devices through the use of IPv4/IPv6 NAT (Network Address 206 Translation) as described in section Section 2.1.4. These 207 workarounds are not recommended, but if such non-IPv6 capable NOC 208 devices need to be used longer term, then this is the only option 209 to connect them to the ACP. 211 o Near- to long-term options can provide all the desired operational, 212 zero touch and security benefits of an autonomic network, but a 213 range of details for this still have to be worked out and 214 development work on NOC/OAM equipment is necessary. These options 215 are discussed in sections Section 2.1.5 through Section 2.1.8. 217 2.1.1. Simple Connectivity for Non-ACP capable NMS Hosts 219 In the simplest candidate deployment case, the ACP extends all the 220 way into the NOC via one or more "ACP edge devices" as defined in 221 section 6.1 of [I-D.ietf-anima-autonomic-control-plane]. These 222 devices "leak" the (otherwise encrypted) ACP natively to NMS hosts. 223 They act as the default router to those NMS hosts and provide them 224 with IPv6 connectivity into the ACP. NMS hosts with this setup need 225 to support IPv6 (see e.g. [RFC6434]) but require no other 226 modifications to leverage the ACP.
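Since the ACP draws its addressing from the IPv6 ULA range (RFC 4193), an NMS host attached to an ACP edge device can at least sanity-check which of its learned addresses are ACP candidates. The following is a minimal sketch using Python's ipaddress module; the concrete fd89:... address is a hypothetical example, not a real ACP assignment:

```python
import ipaddress

def is_acp_candidate(addr: str) -> bool:
    """True if addr falls within fc00::/7, the IPv6 ULA range
    (RFC 4193) from which ACP addresses are taken."""
    return ipaddress.IPv6Address(addr) in ipaddress.ip_network("fc00::/7")

print(is_acp_candidate("fd89:b714:f3db:0:200:0:6400:1"))  # True (hypothetical ACP ULA)
print(is_acp_candidate("2001:db8::1"))                    # False (global data-plane address)
```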
228 Note that even though the ACP only uses IPv6, it can of course 229 support OAM for any type of network deployment as long as the network 230 devices support the ACP: The Data Plane can be IPv4 only, dual-stack 231 or IPv6 only. It is always separate from the ACP, therefore there is 232 no dependency between the ACP and the IP version(s) used in the Data 233 Plane. 235 This setup is sufficient for troubleshooting such as SSH into network 236 devices, NMS that performs SNMP read operations for status checking, 237 software downloads into autonomic devices, provisioning of devices 238 via NETCONF and so on. In conjunction with otherwise unmodified OAM 239 via separate NMS hosts it can provide a good subset of the stable 240 connectivity goals. The limitations of this approach are discussed 241 in the next section. 243 Because the ACP provides 'only' IPv6 connectivity, and because 244 addressing provided by the ACP does not include any topological 245 addressing structure that operations in a NOC often rely on to 246 recognize where devices are on the network, it is likely highly 247 desirable to set up DNS (Domain Name System - see [RFC1034]) so that 248 the ACP IPv6 addresses of autonomic devices are known via domain 249 names that include the desired structure. For example, if DNS in the 250 network was set up with names for network devices as 251 devicename.noc.example.com, and the well known structure of the Data 252 Plane IPv4 address space was used by operators to infer the region 253 where a device is located, then the ACP address of that device 254 could be set up as devicename_.acp.noc.example.com, and 255 devicename.acp.noc.example.com could be a CNAME to 256 devicename_.acp.noc.example.com. Note that many networks 257 already use names for network equipment where topological information 258 is included, even without an ACP. 260 2.1.2.
Challenges and Limitation of Simple Connectivity 262 This simple connectivity of non-autonomic NMS hosts suffers from a 263 range of challenges (that is, operators may not be able to do it this 264 way) or limitations (that is, operators cannot achieve desired goals 265 with this setup). The following list summarizes these challenges and 266 limitations. The following sections describe additional mechanisms 267 to overcome them. 269 Note that these challenges and limitations exist because the ACP is 270 primarily designed to support distributed ASA in the most lightweight 271 fashion, but does not mandate support for additional 272 mechanisms to best support centralized NOC operations. It is this 273 document that describes additional (short term) workarounds and (long 274 term) extensions. 276 1. (Limitation) NMS hosts cannot directly probe whether the desired 277 so-called 'data-plane' network connectivity works because they do 278 not directly have access to it. This problem is similar to 279 probing connectivity for other services (such as VPN services) 280 that they do not have direct access to, so the NOC may already 281 employ appropriate mechanisms to deal with this issue (probing 282 proxies). See Section 2.1.3 for candidate solutions. 284 2. (Challenge) NMS hosts need to support IPv6, which often is still 285 not possible in enterprise networks. See Section 2.1.4 for some 286 workarounds. 288 3. (Limitation) Performance of the ACP will be limited versus normal 289 'data-plane' connectivity. The setup of the ACP will often 290 support only non-hardware-accelerated forwarding. Running a 291 large amount of traffic through the ACP, especially for tasks 292 where it is not necessary, will reduce its performance/ 293 effectiveness for those operations where it is necessary or 294 highly desirable. See Section 2.1.5 for candidate solutions. 296 4.
(Limitation) Security of the ACP is reduced by exposing the ACP 297 natively (and unencrypted) into a LAN in the NOC where the NOC 298 devices are attached to it. See Section 2.1.7 for candidate 299 solutions. 301 These four problems can be tackled independently of each other by 302 solution improvements. Combining some of these solution 303 improvements together can lead towards a candidate long term solution. 305 2.1.3. Simultaneous ACP and Data Plane Connectivity 307 Simultaneous connectivity to both ACP and data-plane can be achieved 308 in a variety of ways. If the data-plane is IPv4-only, then any 309 method for dual-stack attachment of the NOC device/application will 310 suffice: IPv6 connectivity from the NOC provides access via the ACP, 311 IPv4 will provide access via the data-plane. If, as explained above 312 in the simple case, an autonomic device supports native attachment to 313 the ACP, and the existing NOC setup is IPv4 only, then it could be 314 sufficient to attach the ACP device(s) as the IPv6 default router to 315 the NOC LANs and keep the existing IPv4 default router setup 316 unchanged. 318 If the data-plane of the network also supports IPv6, then the 319 NOC devices that need access to the ACP should have a dual-homing 320 IPv6 setup. One option is to make the NOC devices multi-homed with 321 one logical or physical IPv6 interface connecting to the data-plane, 322 and another into the ACP. The LAN that provides access to the ACP 323 should then be given an IPv6 prefix that shares a common prefix with 324 the IPv6 ULA (see [RFC4193]) of the ACP so that the standard IPv6 325 interface selection rules on the NOC host would result in the desired 326 automatic selection of the right interface: towards the ACP facing 327 interface for connections to ACP addresses, and towards the data- 328 plane interface for anything else. If this cannot be achieved 329 automatically, then it needs to be done via IPv6 static routes in the 330 NOC host.
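The interface selection described above rests on the longest-matching-prefix rule of IPv6 default address selection (rule 8 of RFC 6724): a source address sharing the ACP ULA prefix wins for ACP destinations. The effect can be illustrated with a small sketch; all addresses are hypothetical:

```python
import ipaddress

def common_prefix_len(a: str, b: str) -> int:
    """Number of leading bits two IPv6 addresses share."""
    diff = int(ipaddress.IPv6Address(a)) ^ int(ipaddress.IPv6Address(b))
    return 128 - diff.bit_length()

def pick_source(candidates, dest):
    """Pick the candidate source address with the longest prefix
    match against the destination (a sketch of RFC 6724 rule 8)."""
    return max(candidates, key=lambda c: common_prefix_len(c, dest))

# Hypothetical NOC host addresses: one on the ACP-access LAN (a ULA
# sharing the ACP prefix), one on the data-plane LAN.
sources = ["fd89:b714:f3db:1::100", "2001:db8:42::100"]
print(pick_source(sources, "fd89:b714:f3db:0:200:0:6400:1"))  # ACP-side ULA wins
print(pick_source(sources, "2001:db8:7::1"))                  # data-plane address wins
```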
332 Providing two virtual (e.g. dot1q subnet) connections into NOC hosts 333 may be seen as an undesired complexity. In that case the routing 334 policy to provide access to both ACP and data-plane via IPv6 needs to 335 happen in the NOC network itself: The NMS host gets a single 336 attachment interface but still with the same two IPv6 addresses as 337 before - one for use towards the ACP, one towards the data-plane. 338 The first-hop router connecting to the NMS host would then have 339 separate interfaces: one towards the data-plane, one towards the ACP. 340 Routing of traffic from NMS hosts would then have to be based on the 341 source IPv6 address of the host: Traffic from the address designated 342 for ACP use would get routed towards the ACP, traffic from the 343 designated data-plane address towards the data-plane. 345 In the simple case, we get the following topology: Existing NMS hosts 346 connect via an existing NOClan and existing first hop Rtr1 to the 347 data-plane. Rtr1 is not made autonomic, but instead the edge router 348 of the Autonomic network ANrtr is attached via a separate interface 349 to Rtr1 and ANrtr provides access to the ACP via ACPaccessLan. Rtr1 350 is configured with the above described IPv6 source routing policies 351 and the NOC-app-devices are given the secondary IPv6 address for 352 connectivity into the ACP.

   354                                    --... (data-plane)
   355   NOC-app-device(s) -- NOClan -- Rtr1
   356                                    --- ACPaccessLan -- ANrtr ... (ACP)

   358                             Figure 1

360 If Rtr1 was to be upgraded to also implement Autonomic Networking and 361 the ACP, the picture would change as follows:

   363                                       ---- ... (data-plane)
   364   NOC-app-device(s) ---- NOClan --- ANrtr1
   365                                  . . ---- ... (ACP)
   366                                  \-/
   367                         (ACP to data-plane loopback)

   369                             Figure 2

371 In this case, ANrtr1 would have to implement some more advanced 372 routing such as cross-VRF routing because the data-plane and ACP are 373 most likely run via separate VRFs.
A workaround without additional 374 software functionality could be a physical external loopback cable 375 into two ports of ANrtr1 to connect the data-plane and ACP VRF as 376 shown in the picture. A (virtual) software loopback between the ACP 377 and data plane VRF would of course be the better solution. 379 2.1.4. IPv4-only NMS Hosts 381 The ACP does not support IPv4: the target is single-stack IPv6 382 management of the network via the ACP and (as needed) the data plane, 383 independent of whether the data plane is dual-stack, has IPv4 as a 384 service or is single-stack IPv6. Dual-plane management (IPv6 for the 385 ACP, IPv4 for the data plane) is likewise an architecturally simple option. 387 The downside of this architectural decision is the potential need for 388 short-term workarounds when the operational practices in a network 389 cannot meet these target expectations. This section motivates 390 when and why these workarounds may be necessary and describes them. 391 All the workarounds described in this section are HIGHLY UNDESIRABLE. 392 The only recommended solution is to enable IPv6 on NMS hosts. 394 Most network equipment today supports IPv6, but it is by far not 395 ubiquitously supported in NOC backend solutions (HW/SW), especially 396 not in the product space for enterprises. Even when it is supported, 397 there are often additional limitations or issues using it in a dual 398 stack setup, or the operator mandates single stack for simplicity in 399 all operations. For these reasons an IPv4-only management plane is 400 still required and common practice in many enterprises. Without the 401 desire to leverage the ACP, this required and common practice is not 402 a problem for those enterprises even when they run dual stack in the 403 network. We document these workarounds here because they are a short 404 term deployment challenge specific to the operations of the ACP. 406 To bridge an IPv4-only management plane with the ACP, IPv4 to IPv6 407 NAT can be used.
This NAT setup could for example be done in Rtr1 408 in the above picture to also support IPv4-only NMS hosts connected to 409 NOClan. 411 To support connections initiated from IPv4-only NMS hosts towards the 412 ACP of network devices, it is necessary to create a static mapping of 413 ACP IPv6 addresses into an unused IPv4 address space and a dynamic or 414 static mapping of the IPv4 NOC application device address (prefix) 415 into an IPv6 prefix routed in the ACP. The main issue in this setup is the 416 mapping of all ACP IPv6 addresses to IPv4. Without further network 417 intelligence, this needs to be a 1:1 address mapping because the 418 prefix used for ACP IPv6 addresses is too long to be mapped directly 419 into IPv4 on a prefix basis. 421 One could implement dynamic mappings in router software by leveraging 422 DNS, but it seems highly undesirable to implement such complex 423 technologies for something that ultimately is a temporary problem 424 (IPv4-only NMS hosts). With today's operational directions it is 425 likely preferable to automate the setup of 1:1 NAT mappings in 426 that NAT router as part of the automation process of network device 427 enrollment into the ACP. 429 The ACP can also be used for connections initiated by the network 430 device into the NMS hosts. For example, syslog from autonomic 431 devices. In this case, static mappings of the NMS hosts' IPv4 432 addresses are required. This can easily be done with a static prefix 433 mapping into IPv6. 435 Overall, the use of NAT is especially subject to ROI (Return On 436 Investment) considerations, but the methods described here may not be 437 too different from the problems encountered, entirely independent 438 of AN/ACP, when some parts of the network introduce IPv6 but 439 NMS hosts are not (yet) upgradeable. 441 2.1.5.
Path Selection Policies 443 As mentioned above, the ACP is not expected to have high performance 444 because its primary goal is connectivity and security, and for 445 existing network device platforms this often means that it is a lot 446 more effort to implement that additional connectivity with hardware 447 acceleration than without - especially because of the desire to 448 support full encryption across the ACP to achieve the desired 449 security. 451 Some of these issues may go away in the future with further adoption 452 of the ACP and network device designs that better cater to the needs 453 of a separate OAM plane, but it is wise to plan even for long-term 454 designs of the solution that do NOT depend on high performance of 455 the ACP. This is in contrast to the expectation that future NMS hosts 456 will have IPv6, so that any considerations for IPv4/NAT in this 457 solution are temporary. 459 To solve the expected performance limitations of the ACP, we do 460 expect to have the above-described dual connectivity via both ACP and 461 data-plane between NOC application devices and AN devices with ACP. 462 The ACP connectivity is expected to always be there (as soon as a 463 device is enrolled), but the data-plane connectivity is only present 464 under normal operations and will not be present during e.g. early 465 stages of device bootstrap, failures, provisioning mistakes or during 466 network configuration changes. 468 The desired policy is therefore as follows: In the absence of further 469 security considerations (see below), traffic between NMS hosts and AN 470 devices should prefer data-plane connectivity and resort only to 471 using the ACP when necessary, unless it is an operation known to be 472 so tied to the cases where the ACP is necessary that it makes no 473 sense to try using the data plane. An example here is of course the 474 SSH connection from the NOC into a network device to troubleshoot 475 network connectivity.
This could easily always rely on the ACP. 476 Likewise, if an NMS host is known to transmit large amounts of data, 477 and it uses the ACP, then its performance needs to be controlled so 478 that it will not overload the ACP performance. Typical examples of 479 this are software downloads. 481 There is a wide range of methods to build up these policies. We 482 describe a few: 484 Ideally, a NOC system would learn and keep track of all addresses of 485 a device (ACP and the various data plane addresses). Every action of 486 the NOC system would indicate via a "path-policy" what type of 487 connection it needs (e.g. only data-plane, ACP-only, default to data- 488 plane, fallback to ACP,...). A connection policy manager would then 489 build connections to the target using the right address(es). Shorter 490 term, a common practice is to identify different paths to a device 491 via different names (e.g. loopback vs. interface addresses). This 492 approach can be expanded to ACP uses, whether it uses NOC system 493 local names or DNS. We describe example schemes using DNS: 495 DNS can be used to set up names for the same network devices but with 496 different addresses assigned: One name (name.noc.example.com) with 497 only the data-plane address(es) (IPv4 and/or IPv6) to be used for 498 probing connectivity or performing routine software downloads that 499 may stall/fail when there are connectivity issues. One name (name- 500 acp.noc.example.com) with only the ACP reachable address of the 501 device for troubleshooting and probing/discovery that is desired to 502 always only use the ACP. One name with data plane and ACP addresses 503 (name-both.noc.example.com). 505 Traffic policing and/or shaping at the ACP edge in the NOC can be 506 used to throttle applications such as software download into the ACP.
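The three-name DNS scheme above amounts to a per-device address book keyed by path policy, which a NOC tool could also hold locally instead of in DNS. A sketch with hypothetical device names and addresses:

```python
# Hypothetical per-device address book mirroring the three-name DNS
# scheme from the text (names and addresses are illustrative only).
DEVICES = {
    "rtr1": {
        "data_plane": ["192.0.2.10", "2001:db8::10"],
        "acp": ["fd89:b714:f3db:0:200:0:6400:1"],
    },
}

def resolve(device: str, path_policy: str):
    """Return candidate addresses for a device under a path policy:
    'data-plane' for probing and bulk transfers, 'acp' for operations
    that must survive data-plane outages, 'prefer-data-plane' for
    data-plane first with the ACP as fallback."""
    entry = DEVICES[device]
    if path_policy == "data-plane":
        return entry["data_plane"]
    if path_policy == "acp":
        return entry["acp"]
    if path_policy == "prefer-data-plane":
        return entry["data_plane"] + entry["acp"]
    raise ValueError(f"unknown path policy: {path_policy}")

print(resolve("rtr1", "acp"))                # ACP address only
print(resolve("rtr1", "prefer-data-plane"))  # data-plane first, ACP last
```

A connection policy manager as described in the text would try the returned addresses in order.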
508 MPTCP (Multipath TCP - see [RFC6824]) is a very attractive candidate 509 to automate the use of both data-plane and ACP and minimize or fully 510 avoid the need for the above-mentioned logical names to pre-set the 511 desired connectivity (data-plane-only, ACP only, both). For example, 512 a set-up for non-MPTCP-aware applications would be as follows: 514 DNS naming is set up to provide the ACP IPv6 address of network 515 devices. Unbeknownst to the application, MPTCP is used. MPTCP 516 mutually discovers between the NOC and network device the data-plane 517 address and carries all traffic across it when that MPTCP subflow 518 across the data-plane can be built. 520 In the Autonomic network devices where data-plane and ACP are in 521 separate VRFs, it is clear that this type of MPTCP subflow creation 522 across different VRFs is new/added functionality. Likewise, the 523 policies of preferring a particular address (NOC-device) or VRF (AN 524 device) for the traffic is potentially also a policy not provided as 525 a standard. 527 2.1.6. Autonomic NOC Device/Applications 529 Setting up connectivity between the NOC and autonomic devices when 530 the NOC device itself is non-autonomic is, as mentioned in the 531 beginning, a security issue. It also results, as shown in the previous 532 paragraphs, in a range of connectivity considerations, some of which 533 may be quite undesirable or complex to operationalize. 535 Making NMS hosts autonomic and having them participate in the ACP is 536 therefore not only a highly desirable solution to the security 537 issues, but can also provide a likely easier operationalization of 538 the ACP because it minimizes NOC-special edge considerations - the 539 ACP is simply built all the way automatically, even inside the NOC, 540 and only authorized and authenticated NOC devices/applications will 541 have access to it.
543 Supporting the ACP all the way into an application device requires 544 implementing the following aspects in it: AN bootstrap/enrollment 545 mechanisms, the secure channel for the ACP and at least the host side 546 of IPv6 routing setup for the ACP. Minimally this could all be 547 implemented as an application and be made available to the host OS 548 via e.g. a tap driver to make the ACP show up as another IPv6-enabled 549 interface. 551 Having said this: If the structure of NMS hosts is transformed 552 through virtualization anyhow, then it may be considered equally 553 secure and appropriate to construct a (physical) NMS host system by 554 combining a virtual AN/ACP-enabled router with non-AN/ACP-enabled 555 NOC-application VMs via a hypervisor, leveraging the configuration 556 options described in the previous sections but just virtualizing 557 them. 559 2.1.7. Encryption of data-plane connections 561 When combining ACP and data-plane connectivity for availability and 562 performance reasons, this too has an impact on security: When using 563 the ACP, the traffic will be mostly protected by encryption, especially 564 when considering the above-described use of AN application devices. 565 If instead the data-plane is used, then this is not the case anymore 566 unless it is done by the application. 568 The simplest solution for this problem exists when using AN-capable 569 NMS hosts, because in that case the communicating AN-capable NMS host 570 and the AN network device have certificates through the AN enrollment 571 process that they can mutually trust (same AN domain). As a result, 572 data-plane connectivity that does support this can simply leverage 573 TLS/DTLS ([RFC5246]/[RFC6347]) with mutual AN-domain certificate 574 authentication - and does not incur new key management.
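The mutual TLS authentication described above can be sketched with Python's ssl module: a client context that requires the peer (the network device) to present a certificate and trusts only the AN domain CA. The file-path parameters and the choice to disable DNS-hostname checking (the peer's identity being carried in its AN domain certificate rather than a DNS name) are illustrative assumptions, not part of any specified API:

```python
import ssl

def an_domain_tls_context(ca_file=None, cert_file=None, key_file=None):
    """Client-side TLS context for mutual AN-domain certificate
    authentication; all file paths are placeholders."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    # Identity is the peer's AN domain certificate, not a DNS name,
    # so hostname checking is disabled while certificate verification
    # remains mandatory.
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_REQUIRED
    if ca_file:
        ctx.load_verify_locations(ca_file)        # trust only the AN domain CA
    if cert_file:
        ctx.load_cert_chain(cert_file, key_file)  # present our own AN domain cert
    return ctx

ctx = an_domain_tls_context()
print(ctx.verify_mode == ssl.CERT_REQUIRED)  # True
```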
If this automatic security benefit is seen as most important, but a
"full" ACP stack in the NMS host is unfeasible, then it would still
be possible to design a stripped-down version of AN functionality for
such NOC hosts that only provides enrollment of the NOC host into the
AN domain to the extent that the host receives an AN domain
certificate, but without directly participating in the ACP
afterwards.  Instead, the host would just leverage TLS/DTLS using its
AN certificate via the data-plane with AN network devices, as well as
indirectly via the ACP with the above mentioned in-NOC network edge
connectivity into the ACP.

When using the ACP itself, TLS/DTLS for the transport layer between
NMS hosts and network devices is somewhat of a double price to pay
(the ACP also encrypts) and could potentially be optimized away, but
given the assumed lower performance of the ACP, this seems to be an
unnecessary optimization.

2.1.8.  Long Term Direction of the Solution

If we consider what potentially could be the most lightweight and
autonomic long-term solution based on the technologies described
above, we see the following direction:

1.  NMS hosts should at least support IPv6.  IPv4/IPv6 NAT in the
    network to enable use of the ACP is long-term undesirable.
    Having IPv4-only applications automatically leverage IPv6
    connectivity via host-stack translation may be an option, but the
    operational viability of this approach is not well enough
    understood.

2.  Build the ACP as a lightweight application for NMS hosts so that
    the ACP extends all the way into the actual NMS hosts.

3.  Leverage and, as necessary, enhance MPTCP with automatic dual-
    connectivity: if an MPTCP-unaware application is using ACP
    connectivity, the policies used should add subflow(s) via the
    data-plane and prefer them.

4.  Consider how to best map NMS host desires to underlying transport
    mechanisms: with the above mentioned three points, not all
    options are covered.  Depending on the OAM, one may still want
    only ACP, only data-plane, or automatically prefer one over the
    other and/or use the ACP with low performance or high performance
    (for emergency OAM such as countering DDoS).  It is as of today
    not clear what the simplest set of tools is to enable explicit
    choice of the desired behavior of each OAM.  The use of the above
    mentioned DNS and MPTCP mechanisms is a start, but this will
    require additional thought.  This is likely a specific case of
    the more generic scope of TAPS.

2.2.  Stable Connectivity for Distributed Network/OAM

The ANI (ACP, Bootstrap, GRASP) can provide via the GRASP protocol
common direct-neighbor discovery and capability negotiation (GRASP
via ACP and/or data-plane) and stable and secure connectivity for
functions running distributed in network devices (GRASP via ACP).  It
can therefore eliminate the need to re-implement similar functions in
each distributed function in the network.  Today, every distributed
protocol does this with functional elements usually called "Hello"
mechanisms and with often protocol-specific security mechanisms.

KARP (Keying and Authentication for Routing Protocols, see [RFC6518])
has tried to provide common directions and thereby reduce the
re-invention of at least some of the security aspects, but it only
covers routing protocols and it is unclear how well it is applicable
to a potentially wider range of network distributed agents such as
those performing distributed OAM.  The ACP can help in these cases.

3.  Security Considerations

In this section, we discuss only security considerations not covered
in the appropriate sub-sections of the solutions described.
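Several of the considerations in this section concern ULA addressing
per [RFC4193].  As background, a ULA /48 prefix consists of the
fd00::/8 prefix (fc00::/7 with the L bit set) followed by a 40-bit
pseudo-random Global ID.  A minimal illustrative sketch of generating
one (the ACP itself derives its addresses from certificates, so this
is background only, and secrets.randbits stands in for the Global ID
algorithm RFC 4193 suggests):

```python
import ipaddress
import secrets

def random_ula_prefix():
    """Generate a random ULA /48 per RFC 4193: fd00::/8 plus a
    40-bit pseudo-random Global ID, remaining 80 bits zero."""
    global_id = secrets.randbits(40)
    prefix_int = (0xfd << 120) | (global_id << 80)
    return ipaddress.IPv6Network((prefix_int, 48))

prefix = random_ula_prefix()
```

With 40 random bits, the probability of two independently generated
prefixes colliding is negligible unless a very large number of ACPs
are ever interconnected, which motivates the collision discussion
below.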
Even though ACPs are meant to be isolated, explicit operator
misconfiguration to connect to insecure OAM equipment and/or bugs in
ACP devices may cause leakage into places where it is not expected.
Mergers/acquisitions and other complex network reconfigurations
affecting the NOC are typical examples.

ULA addressing as proposed in this document is preferred over
globally reachable addresses because it is not routed in the global
Internet and will therefore be subject to more filtering even in
places where specific ULA addresses are being used.

Random ULA addressing provides more than sufficient protection
against address collision even though there is no central assignment
authority.  This is helped by the expectation that ACPs are never
expected to all connect together; only a few ACPs may ever need to
connect together, e.g. when mergers and acquisitions occur.

If packets with unexpected ULA addresses are seen and one expects
them to be from another network's ACP from which they leaked, then
some form of ULA prefix registration (not allocation) can be
beneficial.  Some voluntary registries exist, for example
https://www.sixxs.net/tools/grh/ula/, although none of them is
preferable because none is operated by a recognized authority.  If an
operator wants to make its ULA prefix known, it might need to
register it with multiple existing registries.

Centrally assigned ULA addresses (ULA-C) were an attempt to introduce
centralized registration of randomly assigned addresses and
potentially even carve out a different ULA prefix for such addresses.
This proposal is currently not proceeding, and it is questionable
whether the stable connectivity use case provides sufficient
motivation to revive this effort.

Using current registration options implies that there will not be
reverse DNS mapping for ACP addresses.
For that, one will have to rely on looking up the unknown/unexpected
network prefix in the registry to determine the owner of these
addresses.

Reverse DNS resolution may be beneficial for specific already
deployed insecure legacy protocols on NOC OAM systems that intend to
communicate via the ACP (e.g. TFTP) and leverage reverse DNS for
authentication.  Given how the ACP provides path security except
potentially for the last hop in the NOC, the ACP does make it easier
to extend the lifespan of such protocols in a secure fashion as far
as just the transport is concerned.  The ACP does not make reverse
DNS lookup a secure authentication method though.  Any current and
future protocols must rely on secure end-to-end communications (TLS/
DTLS) and identification and authentication via the certificates
assigned to both ends.  This is enabled by the certificate mechanisms
of the ACP.

If DNS and especially reverse DNS are set up, then they should be set
up in an automated fashion, linked to the autonomic registrar
backend, so that the DNS and reverse DNS records are actually derived
from the subject name elements of the ACP device certificates, in the
same way as the autonomic devices themselves will derive their ULA
addresses from their certificates, to ensure correct and consistent
DNS entries.

If an operator feels that reverse DNS records are beneficial to its
own operations but that they should not be made available publicly
for "security by concealment" reasons, then the case of ACP DNS
entries is probably one of the least problematic use cases for split-
DNS: the ACP DNS names are only needed for the NMS hosts intending to
use the ACP, but not network-wide across the enterprise.

4.
No IPv4 for ACP

The ACP is targeted to be IPv6 only, and the prior explanations in
this document show that this can lead to some complexity when having
to connect IPv4-only NOC solutions, and that it will be impossible to
leverage the ACP when the OAM agents on an ACP network device do not
support IPv6.  Therefore, the question was raised whether the ACP
should optionally also support IPv4.

The decision not to include IPv4 for the ACP in the use cases
considered in this document is based on the following reasons:

In SP networks that have started to support IPv6, often the next
planned step is to consider moving IPv4 out of the native transport
and supporting it as just a service on the edge.  There is no
benefit/need for multiple parallel transport families within the
network, and standardizing on one reduces OPEX and improves
reliability.  This evolution in the data plane makes it highly
unlikely that investing development cycles into IPv4 support for the
ACP will have a longer-term benefit or enough critical short-term
use-cases.  Supporting only IPv6 for the ACP is purely a strategic
choice to focus on the known important long-term goals.

In other types of networks as well, we think that effort to support
autonomic networking is better spent on ensuring that one address
family will be supported, so that all use cases will long-term work
with it, instead of duplicating the effort for IPv4.  Especially
because auto-addressing for the ACP with IPv4 would be more complex
than with IPv6 due to the smaller IPv4 addressing space.

5.  IANA Considerations

This document requests no action by IANA.

6.  Acknowledgements

This work originated from an Autonomic Networking project at Cisco
Systems, which started in early 2010, including customers involved in
the design and early testing.
Many people contributed to the aspects described in this document,
including, in alphabetical order: BL Balaji, Steinthor Bjarnason,
Yves Herthoghs, Sebastian Meissner, Ravi Kumar Vadapalli.  The author
would also like to thank Michael Richardson, James Woodyatt and Brian
Carpenter for their review and comments.  Special thanks to Sheng
Jiang and Mohamed Boucadair for their thorough review.

7.  Change log [RFC Editor: Please remove]

04: Integrated fixes from Mohamed Boucadair's review.

03: Integrated fixes from Shepherd review (Sheng Jiang).

02: Refresh timeout.  Stable document, change in author association.

01: Refresh timeout.  Stable document, no changes.

00: Changed title/dates.

individual-02: Updated references.

individual-03: Modified ULA text to no longer suggest ULA-C as much
better, but still mention it.

individual-02: Added explanation why no IPv4 for ACP.

individual-01: Added security section discussing the role of address
prefix selection and DNS for ACP.  Title change to emphasize focus on
OAM.  Expanded abstract.

individual-00: Initial version.

8.  References

[I-D.ietf-anima-autonomic-control-plane]
           Behringer, M., Eckert, T., and S. Bjarnason, "An Autonomic
           Control Plane (ACP)", draft-ietf-anima-autonomic-control-
           plane-08 (work in progress), July 2017.

[I-D.ietf-anima-bootstrapping-keyinfra]
           Pritikin, M., Richardson, M., Behringer, M., Bjarnason,
           S., and K. Watsen, "Bootstrapping Remote Secure Key
           Infrastructures (BRSKI)", draft-ietf-anima-bootstrapping-
           keyinfra-07 (work in progress), July 2017.

[I-D.ietf-anima-grasp]
           Bormann, C., Carpenter, B., and B. Liu, "A Generic
           Autonomic Signaling Protocol (GRASP)", draft-ietf-anima-
           grasp-15 (work in progress), July 2017.

[I-D.ietf-anima-reference-model]
           Behringer, M., Carpenter, B., Eckert, T., Ciavaglia, L.,
           Pierre, P., Liu, B., Nobre, J., and J.
           Strassner, "A Reference Model for Autonomic Networking",
           draft-ietf-anima-reference-model-04 (work in progress),
           July 2017.

[ITUT]     International Telecommunication Union, "Architecture and
           specification of data communication network",
           ITU-T Recommendation G.7712/Y.1703, June 2008.

[RFC1034]  Mockapetris, P., "Domain names - concepts and facilities",
           STD 13, RFC 1034, DOI 10.17487/RFC1034, November 1987,
           <https://www.rfc-editor.org/info/rfc1034>.

[RFC4193]  Hinden, R. and B. Haberman, "Unique Local IPv6 Unicast
           Addresses", RFC 4193, DOI 10.17487/RFC4193, October 2005,
           <https://www.rfc-editor.org/info/rfc4193>.

[RFC5246]  Dierks, T. and E. Rescorla, "The Transport Layer Security
           (TLS) Protocol Version 1.2", RFC 5246,
           DOI 10.17487/RFC5246, August 2008,
           <https://www.rfc-editor.org/info/rfc5246>.

[RFC6291]  Andersson, L., van Helvoort, H., Bonica, R., Romascanu,
           D., and S. Mansfield, "Guidelines for the Use of the "OAM"
           Acronym in the IETF", BCP 161, RFC 6291,
           DOI 10.17487/RFC6291, June 2011,
           <https://www.rfc-editor.org/info/rfc6291>.

[RFC6347]  Rescorla, E. and N. Modadugu, "Datagram Transport Layer
           Security Version 1.2", RFC 6347, DOI 10.17487/RFC6347,
           January 2012, <https://www.rfc-editor.org/info/rfc6347>.

[RFC6418]  Blanchet, M. and P. Seite, "Multiple Interfaces and
           Provisioning Domains Problem Statement", RFC 6418,
           DOI 10.17487/RFC6418, November 2011,
           <https://www.rfc-editor.org/info/rfc6418>.

[RFC6434]  Jankiewicz, E., Loughney, J., and T. Narten, "IPv6 Node
           Requirements", RFC 6434, DOI 10.17487/RFC6434, December
           2011, <https://www.rfc-editor.org/info/rfc6434>.

[RFC6518]  Lebovitz, G. and M. Bhatia, "Keying and Authentication for
           Routing Protocols (KARP) Design Guidelines", RFC 6518,
           DOI 10.17487/RFC6518, February 2012,
           <https://www.rfc-editor.org/info/rfc6518>.

[RFC6824]  Ford, A., Raiciu, C., Handley, M., and O. Bonaventure,
           "TCP Extensions for Multipath Operation with Multiple
           Addresses", RFC 6824, DOI 10.17487/RFC6824, January 2013,
           <https://www.rfc-editor.org/info/rfc6824>.

[RFC7575]  Behringer, M., Pritikin, M., Bjarnason, S., Clemm, A.,
           Carpenter, B., Jiang, S., and L.
           Ciavaglia, "Autonomic Networking: Definitions and Design
           Goals", RFC 7575, DOI 10.17487/RFC7575, June 2015,
           <https://www.rfc-editor.org/info/rfc7575>.

Authors' Addresses

Toerless Eckert (editor)
Futurewei Technologies Inc.
2330 Central Expy
Santa Clara  95050
USA

Email: tte+ietf@cs.fau.de

Michael H. Behringer

Email: michael.h.behringer@gmail.com