idnits 2.17.1

draft-ietf-anima-stable-connectivity-03.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack separate sections for Informative/Normative
     References.  All references will be assumed normative when checking for
     downward references.

  == There are 1 instance of lines with non-RFC2606-compliant FQDNs in the
     document.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (July 3, 2017) is 2486 days in the past.  Is this
     intentional?

  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  == Outdated reference: A later version (-30) exists of
     draft-ietf-anima-autonomic-control-plane-06

  == Outdated reference: A later version (-45) exists of
     draft-ietf-anima-bootstrapping-keyinfra-06

  == Outdated reference: A later version (-15) exists of
     draft-ietf-anima-grasp-14

  == Outdated reference: A later version (-10) exists of
     draft-ietf-anima-reference-model-04

  ** Obsolete normative reference: RFC 6824 (Obsoleted by RFC 8684)

     Summary: 2 errors (**), 0 flaws (~~), 6 warnings (==), 1 comment (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

ANIMA                                                          T. Eckert
Internet-Draft                                                    Huawei
Intended status: Informational                              M. Behringer
Expires: January 4, 2018                                    July 3, 2017

    Using Autonomic Control Plane for Stable Connectivity of Network OAM
                 draft-ietf-anima-stable-connectivity-03

Abstract

OAM (Operations, Administration and Management) processes for data networks often suffer from circular dependencies when they rely on connectivity through the very network that is being managed. Provisioning during device/network bring-up tends to be much harder to automate than service provisioning later on; changes in core network functions that impact reachability cannot be automated either, because of the ongoing connectivity requirements of the OAM equipment itself; and widely used OAM protocols are not secure enough to be carried across the network without security concerns.

This document describes how to integrate OAM processes with the autonomic control plane (ACP) in Autonomic Networks (AN) to provide stable and secure connectivity for those OAM processes.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on January 4, 2018.

Copyright Notice

Copyright (c) 2017 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

Table of Contents

   1.  Introduction
       1.1.  Self-dependent OAM connectivity
       1.2.  Data Communication Networks (DCNs)
       1.3.  Leveraging the ACP
   2.  Solutions
       2.1.  Stable connectivity for centralized OAM operations
             2.1.1.  Simple connectivity for non-autonomic NMS hosts
             2.1.2.  Challenges and limitations of simple connectivity
             2.1.3.  Simultaneous ACP and data-plane connectivity
             2.1.4.  IPv4-only NMS hosts
             2.1.5.  Path selection policies
             2.1.6.  Autonomic NOC devices/applications
             2.1.7.  Encryption of data-plane connections
             2.1.8.  Long-term direction of the solution
       2.2.  Stable connectivity for distributed network/OAM functions
   3.  Security Considerations
   4.  No IPv4 for the ACP
   5.  IANA Considerations
   6.  Acknowledgements
   7.  Change log [RFC Editor: Please remove]
   8.  References
   Authors' Addresses

1.  Introduction

1.1.  Self-dependent OAM connectivity

OAM (Operations, Administration and Management) processes for data networks often suffer from circular dependencies when they rely on connectivity through the very network that is being managed:

The ability to perform OAM operations on a network device first requires the execution of the OAM procedures that create network connectivity to that device in all intervening devices. This typically leads to a sequential, 'expanding ring' style of configuration from a NOC (Network Operations Center). It also leads to tight dependencies between provisioning tools and the security enrollment of devices: any process that wants to enroll multiple devices along a newly deployed network topology needs to interlock tightly with the provisioning process that creates connectivity before the enrollment can move on to the next device.
When performing change operations on a network, it is likewise necessary to ensure at every step of the process that connectivity to remote devices is not interrupted. This especially includes change provisioning of routing, security and addressing policies in the network, which often becomes necessary through mergers and acquisitions, the introduction of IPv6, or other major overhauls of the infrastructure design. Examples include changes of IGP protocols or areas, moves from PD (Provider Dependent) to PI (Provider Independent) addressing, or systematic topology changes.

All these circular dependencies make OAM processes complex and potentially fragile. When automation is used, for example through provisioning systems or network controllers, this complexity extends into that automation software.

1.2.  Data Communication Networks (DCNs)

In the late 1990s and early 2000s, IP networks became the method of choice for building separate OAM networks for the communications infrastructure of service providers. This concept was standardized in G.7712/Y.1703 and called "Data Communications Networks" (DCN). These were (and still are) physically separate IP(/MPLS) networks that provide access to the OAM interfaces of all the equipment that has to be managed, from PSTN (Public Switched Telephone Network) switches over optical equipment to today's Ethernet and IP/MPLS production network equipment.

Such DCNs provide stable connectivity that is not subject to the aforementioned problems because they are entirely separate networks: configuration changes of the production IP network are performed via the DCN but never affect the DCN configuration. Of course, this approach comes at the cost of buying and operating a separate network, and that cost is not feasible for many networks, most notably smaller service providers, most enterprises and typical IoT networks.

1.3.  Leveraging the ACP

One goal of the Autonomic Control Plane (ACP, as defined in [I-D.ietf-anima-autonomic-control-plane]) in Autonomic Networks is to provide stable connectivity similar to a DCN, but without having to build a separate DCN. It is clear that such an 'in-band' approach can never fully achieve the same level of separation, but the goal is to get as close to it as possible.

This solution approach has several aspects. One aspect is designing the implementation of the ACP in network devices so that it actually operates without interruption by changes in what we call in this document the "data-plane", i.e. the operator- or controller-configured services planes of the network equipment. This aspect is not covered in this document.

Another aspect is how to leverage the stable IPv6 connectivity provided by the ACP to build actual OAM solutions. This is the scope of this document.

2.  Solutions

2.1.  Stable connectivity for centralized OAM operations

The ANI is the "Autonomic Networking Infrastructure", consisting of secure zero-touch bootstrap (BRSKI - [I-D.ietf-anima-bootstrapping-keyinfra]), generic signaling (GRASP - [I-D.ietf-anima-grasp]) and the Autonomic Control Plane (ACP - [I-D.ietf-anima-autonomic-control-plane]).
See [I-D.ietf-anima-reference-model] for an overview of the ANI and how its components interact, and [RFC7575] for concepts and terminology of the ANI and autonomic networks.

This section describes stable connectivity for centralized OAM operations via the ACP/ANI, starting with what we expect to be the easiest short-term deployment option. It then describes the limitations and challenges of that approach and their solutions/workarounds, finishing with the preferred target option of autonomic NOC devices in Section 2.1.6.

This order was chosen because it helps to explain how simple initial use of the ACP can be, how difficult workarounds can become (and therefore what to avoid), and finally because one very promising long-term solution alternative is exactly like the easiest short-term solution, only virtualized and automated.

In the most common case, OAM operations will be performed by one or more applications running on a variety of centralized NOC systems that communicate with network devices. We describe approaches of increasing sophistication to leverage the ACP for stable connectivity. The descriptions show that there is a wide range of options, some of which are simple, some more complex.

We see three stages of interest:

o  There are simple options, described first, that we consider to be good starting points to operationalize the use of the ACP for stable connectivity.

o  There are more advanced intermediate options that try to establish backward compatibility with existing deployed approaches, such as leveraging NAT (Network Address Translation). Selection and deployment of these approaches needs to be carefully vetted to ensure that they provide a positive RoI (Return on Investment). This very much depends on the operational processes of the network operator.

o  It seems clearly feasible to build towards a long-term configuration that provides all the desired operational, zero-touch and security benefits of an autonomic network, but a range of details for this still have to be worked out.

2.1.1.  Simple connectivity for non-autonomic NMS hosts

In the most simple deployment case, the ACP extends all the way into the NOC via an autonomic device set up as an ACP edge device that provides native access to the ACP for NMS hosts (as defined in Section 6.1 of [I-D.ietf-anima-autonomic-control-plane]). It acts as the default router for those hosts and provides them with IPv6 connectivity into the ACP only - no IPv4 connectivity. NMS hosts in this setup need to support IPv6 but require no other modifications to leverage the ACP.

Note that even though the ACP only uses IPv6, it can and should be used to provide stable connectivity for the management of any network: IPv4-only, dual-stack or IPv6-only.

This setup is sufficient for troubleshooting OAM operations such as SSH into network devices, NMS hosts that perform SNMP read operations for status checking, software downloads into autonomic devices, and so on. In conjunction with otherwise unmodified OAM operations via separate NMS hosts it can provide a good subset of the desired stable connectivity goals of the ACP.
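As a simple illustration of this stage, the following Python sketch shows how an IPv6-enabled NMS host could verify that a device's SSH port answers via its ACP address before an operator connects to it. The ACP ULA address is a made-up example; in practice it would come from an inventory system or from DNS names as described next.

   import socket

   # Hypothetical ACP ULA address of a managed device; in practice this
   # would come from an inventory system or from a DNS name such as
   # devicename-acp.noc.example.com (see below).
   ACP_ADDR = "fd89:b714:f3db:0:200:0:6400:1"

   def acp_ssh_reachable(addr, timeout=3.0):
       # Return True if TCP port 22 (SSH) of the device answers on its
       # ACP address from this IPv6-enabled NMS host.
       try:
           with socket.create_connection((addr, 22), timeout=timeout):
               return True
       except OSError:
           return False

   print("SSH via ACP reachable:", acp_ssh_reachable(ACP_ADDR))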
Because the ACP provides 'only' IPv6 connectivity, and because the addressing used by the ACP does not include any addressing structure that operations in a NOC often rely on to recognize where devices are in the network, it is likely highly desirable to set up DNS (Domain Name System - see [RFC1034]) so that the ACP IPv6 addresses of autonomic devices are reachable via logical domain names. For example, if DNS in the network was set up with names for network devices such as devicename.noc.example.com, then the ACP address of that device could be mapped to devicename-acp.noc.example.com.

2.1.2.  Challenges and limitations of simple connectivity

This simple connectivity for non-autonomic NMS hosts suffers from a range of possible challenges (operators may not be able to deploy it this way) and limitations (operators cannot achieve the desired goals with this setup). The following list summarizes them, and the following sections describe additional mechanisms to overcome them.

Note that these challenges and limitations exist because the ACP is primarily designed to support distributed ASAs (Autonomic Service Agents) in the most lightweight fashion, and does not mandate support for additional mechanisms that would best support centralized NOC operations. It is this document that describes such additional (short-term) workarounds and (long-term) extensions.

1.  Limitation: NMS hosts cannot directly probe whether the desired so-called 'data-plane' network connectivity works, because they do not have direct access to it. This problem is not dissimilar to probing connectivity for other services (such as VPN services) to which they do not have direct access, so the NOC may already employ appropriate mechanisms to deal with this issue (probing proxies). See Section 2.1.3 for solutions.

2.  Challenge: NMS hosts need to support IPv6, which is still often not possible in enterprise networks. See Section 2.1.4 for (highly undesirable) workarounds.

3.  Limitation: Performance of the ACP will be limited compared to normal 'data-plane' connectivity. The setup of the ACP will often support only non-hardware-accelerated forwarding. Running a large amount of traffic through the ACP, especially for tasks where it is not necessary, will reduce its performance and effectiveness for those operations where it is necessary or highly desirable. See Section 2.1.5 for solutions.

4.  Limitation: Security of the ACP is reduced by exposing the ACP natively (and unencrypted) onto a LAN in the NOC where the NOC devices are attached to it. See Section 2.1.7 for solutions.

These four problems can be tackled independently of each other by solution improvements. Combining those improvements ultimately leads towards the target long-term solution.

2.1.3.  Simultaneous ACP and data-plane connectivity

Simultaneous connectivity to both the ACP and the data-plane can be achieved in a variety of ways. If the data-plane is IPv4-only, then any method for dual-stack attachment of the NOC device/application will suffice: IPv6 connectivity from the NOC provides access via the ACP, IPv4 provides access via the data-plane.
If, as explained above in the most simple case, an autonomic device supports native attachment to the ACP and the existing NOC setup is IPv4-only, then it can be sufficient to simply attach the ACP edge device(s) as the IPv6 default router to the NOC LANs and keep the existing IPv4 default-router setup unchanged.

If the data-plane of the network also supports IPv6, then the NOC devices that need access to the ACP should have a dual-homed IPv6 setup. One option is to make the NOC devices multi-homed, with one logical or physical IPv6 interface connecting to the data-plane and another into the ACP. The LAN that provides access to the ACP should then be given an IPv6 prefix that shares a common prefix with the IPv6 ULA (see [RFC4193]) of the ACP, so that the standard IPv6 interface selection rules on the NOC host result in the desired automatic selection of the right interface: the ACP-facing interface for connections to ACP addresses, and the data-plane interface for anything else. If this cannot be achieved automatically, it needs to be done via simple static IPv6 routes on the NOC host.

Providing two virtual (e.g. dot1q subinterface) connections into NOC hosts may be seen as undesirable complexity. In that case the routing policy that provides access to both the ACP and the data-plane via IPv6 needs to be implemented in the NOC network itself: the NMS host gets a single attachment interface, but still with the same two IPv6 addresses as before - one for use towards the ACP, one towards the data-plane. The first-hop router connecting to the NMS host would then have separate interfaces: one towards the data-plane, one towards the ACP. Routing of traffic from NMS hosts would then have to be based on the source IPv6 address of the host: traffic from the address designated for ACP use would be routed towards the ACP, traffic from the designated data-plane address towards the data-plane.

In the most simple case, we get the following topology: existing NMS hosts connect via an existing NOClan and the existing first-hop router Rtr1 to the data-plane. Rtr1 is not made autonomic; instead, the edge router of the autonomic network, ANrtr, is attached via a separate interface to Rtr1 and provides access to the ACP via ACPaccessLan. Rtr1 is configured with the above-described IPv6 source-routing policies, and the NOC-app-devices are given the secondary IPv6 address for connectivity into the ACP.

                                      --... (data-plane)
       NOC-app-device(s) -- NOClan -- Rtr1
                                      --- ACPaccessLan -- ANrtr ... (ACP)

                                 Figure 1

If Rtr1 were upgraded to also implement Autonomic Networking and the ACP, the picture would change as follows:

                                         ---- ... (data-plane)
       NOC-app-device(s) ---- NOClan --- ANrtr1
                                         .  . ---- ... (ACP)
                                          \-/
                                 (ACP to data-plane loopback)

                                 Figure 2

In this case, ANrtr1 would have to implement some more advanced routing, such as cross-VRF routing, because the data-plane and ACP are most likely run in separate VRFs. A workaround without additional software functionality could be a physical external loopback cable between two ports of ANrtr1 to connect the data-plane and ACP VRFs, as shown in the figure. A (virtual) software loopback between the ACP and data-plane VRFs would of course be the better solution.
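The interface/source-address selection behavior described above - the host preferring its ACP-facing ULA address for ACP destinations because it shares the longest prefix - can be approximated with a simple longest-matching-prefix comparison. The following Python sketch only illustrates the idea with made-up addresses; real hosts apply their built-in default address selection rules rather than application code.

   import ipaddress

   # Hypothetical addresses of a dual-homed NOC host: one on the data-plane
   # LAN, one on the ACP access LAN (sharing the ACP ULA prefix).
   HOST_ADDRESSES = [
       ipaddress.IPv6Address("2001:db8:10::21"),        # data-plane facing
       ipaddress.IPv6Address("fd89:b714:f3db:2::21"),   # ACP facing (ULA)
   ]

   def pick_source(destination):
       # Crude stand-in for the host's default source address selection:
       # choose the local address sharing the longest prefix with the
       # destination, so ACP (ULA) targets use the ACP-facing address.
       dst = int(ipaddress.IPv6Address(destination))
       return max(HOST_ADDRESSES,
                  key=lambda a: 128 - (int(a) ^ dst).bit_length())

   print(pick_source("fd89:b714:f3db:0:200:0:6400:1"))  # ACP-facing source
   print(pick_source("2001:db8:20::5"))                 # data-plane source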
2.1.4.  IPv4-only NMS hosts

The ACP does not support IPv4, in order to ensure long-term simplicity: single-stack IPv6 management of the network via the ACP and (as needed) the data-plane, independent of whether the data-plane is dual-stack, has IPv4 as a service, or is single-stack IPv6. Dual-plane management - IPv6 for the ACP, IPv4 for the data-plane - is likewise an architecturally simple option.

The downside of this architectural decision is the potential need for short-term workarounds when the operational practices in a network cannot meet these target expectations. This section motivates when and why these workarounds may be necessary and describes them. All the workarounds described in this section are HIGHLY UNDESIRABLE. The only long-term solution is to enable IPv6 on NMS hosts.

Most network equipment today supports IPv6, but it is by far not ubiquitously supported in NOC backend solutions (hardware/software), especially not in the product space for enterprises. Even when it is supported, there are often additional limitations or issues with using it in a dual-stack setup, or the operator mandates single-stack for simplicity of all operations. For these reasons an IPv4-only management plane is still required and common practice in many enterprises. Without the desire to leverage the ACP, this required and common practice is not a problem for those enterprises, even when they run dual-stack in the network. We therefore document these workarounds here because they address a short-term deployment challenge specific to the operations of the ACP.

To bridge an IPv4-only management plane with the ACP, IPv4-to-IPv6 NAT can be used. This NAT setup could, for example, be done in Rtr1 in the above picture to also support IPv4-only NMS hosts connected to NOClan.

To support connections initiated from IPv4-only NMS hosts towards the ACP of network devices, it is necessary to create a static mapping of ACP IPv6 addresses into an unused IPv4 address space and a dynamic or static mapping of the IPv4 NOC application device address (prefix) into IPv6, routed in the ACP. The main issue in this setup is the mapping of all ACP IPv6 addresses to IPv4. Without further network intelligence, this needs to be a 1:1 address mapping, because the prefix used for ACP IPv6 addresses is too long to be mapped directly into IPv4 on a prefix basis.

One could implement dynamic mappings in router software by leveraging DNS, but it seems highly undesirable to implement such complex technologies for something that is ultimately a temporary problem (IPv4-only NMS hosts). With today's operational directions it is likely preferable to automate the setup of 1:1 NAT mappings in that NAT router as part of the automation process of network device enrollment into the ACP; a sketch of such an automation step is shown at the end of this section.

The ACP can also be used for connections initiated by the network device towards the NMS hosts, for example syslog from autonomic devices. In this case, static mappings of the NMS hosts' IPv4 addresses are required. This can easily be done with a static prefix mapping into IPv6.

Overall, the use of NAT is especially subject to RoI (Return on Investment) considerations, but the methods described here may not be too different from the same problems encountered entirely independently of AN/ACP when some parts of the network are to introduce IPv6 but NMS hosts are not (yet) upgradeable.
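The following Python sketch illustrates what such an enrollment-time automation step could compute: a stable 1:1 mapping table between ACP IPv6 addresses and a reserved, otherwise unused IPv4 range. The addresses and the pool are purely illustrative assumptions, and pushing the resulting mappings to the actual NAT router (CLI, NETCONF, etc.) is outside the scope of the sketch.

   import ipaddress

   # Hypothetical, otherwise unused IPv4 range reserved in the NOC for the
   # 1:1 mappings (documentation space used here purely for illustration).
   IPV4_POOL = ipaddress.ip_network("192.0.2.0/24")

   def build_nat_table(acp_addresses):
       # Assign each ACP IPv6 address a stable IPv4 address from the pool;
       # the result would then be pushed to the NAT router by whatever
       # configuration mechanism the operator already uses.
       pool = IPV4_POOL.hosts()
       return {ipaddress.IPv6Address(a): next(pool)
               for a in sorted(acp_addresses)}

   table = build_nat_table([
       "fd89:b714:f3db:0:200:0:6400:1",
       "fd89:b714:f3db:0:200:0:6400:2",
   ])
   for v6, v4 in table.items():
       print(v4, "<->", v6)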
2.1.5.  Path selection policies

As mentioned above, the ACP is not expected to have high performance, because its primary goals are connectivity and security, and for existing network device platforms this often means that it is significantly more effort to implement that additional connectivity with hardware acceleration than without - especially because of the desire to support full encryption across the ACP to achieve the desired security.

Some of these issues may go away in the future with further adoption of the ACP and with network device designs that better cater to the needs of a separate OAM plane, but it is wise to plan even long-term designs of the solution so that they do NOT depend on high performance of the ACP. This is the opposite of the expectation that future NMS hosts will have IPv6, which makes any considerations for IPv4/NAT in this solution temporary.

To work around the expected performance limitations of the ACP, we expect to have the above-described dual connectivity via both ACP and data-plane between NOC application devices and AN devices with ACP. The ACP connectivity is expected to always be there (as soon as a device is enrolled), but the data-plane connectivity is only present under normal operation and will not be present during, e.g., early stages of device bootstrap, failures, provisioning mistakes or network configuration changes.

The desired policy is therefore as follows: in the absence of further security considerations (see below), traffic between NMS hosts and AN devices should prefer data-plane connectivity and resort to the ACP only when necessary - unless the operation is known to be so closely tied to the cases where the ACP is necessary that it makes no sense to try the data-plane. An example is of course the SSH connection from the NOC into a network device to troubleshoot network connectivity; this could easily always rely on the ACP. Likewise, if an NMS host is known to transmit large amounts of data and it uses the ACP, then its performance needs to be controlled so that it does not overload the ACP. Typical examples of this are software downloads.

There is a wide range of methods to build up these policies. We describe a few:

Ideally, a NOC system would learn and keep track of all addresses of a device (ACP and the various data-plane addresses). Every action of the NOC system would indicate via a "path-policy" what type of connection it needs (e.g. data-plane only, ACP only, default to data-plane with fallback to ACP, ...). A connection policy manager would then build the connection to the target using the right address(es). In the shorter term, a common practice is to identify different paths to a device via different names (e.g. loopback vs. interface addresses). This approach can be expanded to ACP use, whether with NOC-system-local names or DNS. We describe example schemes using DNS:

DNS can be used to set up names for the same network devices but with different addresses assigned: one name (name.noc.example.com) with only the data-plane address(es) (IPv4 and/or IPv6), to be used for probing connectivity or performing routine software downloads that may stall/fail when there are connectivity issues; one name (name-acp.noc.example.com) with only the ACP-reachable address of the device, for troubleshooting and for probing/discovery that should always use only the ACP; and one name with both data-plane and ACP addresses (name-both.noc.example.com).
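A minimal sketch of such a connection policy manager, assuming the hypothetical per-path naming scheme above, is shown below: it defaults to the data-plane name and falls back to the ACP name when the data-plane connection cannot be established. The policy names and domain are assumptions for the example only.

   import socket

   def connect_with_policy(device, port, policy="data-plane-preferred"):
       # Build a TCP connection according to a simple path policy, using
       # the per-path DNS names described above.
       if policy == "acp-only":
           names = [device + "-acp.noc.example.com"]
       elif policy == "data-plane-only":
           names = [device + ".noc.example.com"]
       else:  # default to the data-plane, fall back to the ACP
           names = [device + ".noc.example.com",
                    device + "-acp.noc.example.com"]
       last_error = None
       for name in names:
           try:
               return socket.create_connection((name, port), timeout=5)
           except OSError as error:
               last_error = error
       raise last_error

   # A routine NETCONF session prefers the data-plane, while a
   # troubleshooting SSH session can be forced onto the ACP:
   #   connect_with_policy("device1", 830)
   #   connect_with_policy("device1", 22, policy="acp-only")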
Traffic policing and/or shaping at the ACP edge in the NOC can be used to throttle applications such as software downloads into the ACP.

MP-TCP (Multipath TCP - see [RFC6824]) is a very attractive candidate for automating the use of both data-plane and ACP and minimizing or fully avoiding the need for the above-mentioned logical names to pre-select the desired connectivity (data-plane only, ACP only, both). For example, a setup for non-MP-TCP-aware applications would be as follows:

DNS naming is set up to provide the ACP IPv6 address of network devices. Unbeknownst to the application, MP-TCP is used. MP-TCP mutually discovers, between the NOC and the network device, the data-plane address and carries all traffic across it whenever that MP-TCP subflow across the data-plane can be built.

In autonomic network devices where data-plane and ACP are in separate VRFs, it is clear that this type of MP-TCP subflow creation across different VRFs is new/added functionality. Likewise, the policies for preferring a particular address (NOC device) or VRF (AN device) for the traffic are potentially also policies not provided as a standard.

2.1.6.  Autonomic NOC devices/applications

Setting up connectivity between the NOC and autonomic devices when the NOC device itself is non-autonomic is, as mentioned in the beginning, a security issue. As shown in the previous sections, it also results in a range of connectivity considerations, some of which may be quite undesirable or complex to operationalize.

Making NMS hosts autonomic and having them participate in the ACP is therefore not only a highly desirable solution to the security issues, it can also make the ACP easier to operationalize, because it minimizes NOC-specific edge considerations - the ACP is simply built all the way automatically, even inside the NOC, and only authorized and authenticated NOC devices/applications will have access to it.

Supporting the ACP all the way into an application device requires implementing the following aspects in it: the AN bootstrap/enrollment mechanisms, the secure channel for the ACP, and at least the host side of IPv6 routing setup for the ACP. Minimally, this could all be implemented as an application and be made available to the host OS via, e.g., a tap driver, to make the ACP show up as another IPv6-enabled interface.

Having said this: if the structure of NMS hosts is transformed through virtualization anyhow, then it may be considered equally secure and appropriate to construct a (physical) NMS host system by combining a virtual AN/ACP-enabled router with non-AN/ACP-enabled NOC application VMs via a hypervisor, leveraging the configuration options described in the previous sections but simply virtualizing them.
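To give a rough idea of the 'tap driver' option mentioned above, the following Linux-specific Python sketch creates a tap interface on the NMS host. It is only the very first step of such an application: the code that would enroll the host, build the ACP secure channel and copy ACP traffic through this interface is not shown, and the interface name is an arbitrary assumption.

   import fcntl
   import os
   import struct

   # Linux-specific constants for the tun/tap driver (see <linux/if_tun.h>).
   TUNSETIFF = 0x400454ca
   IFF_TAP = 0x0002
   IFF_NO_PI = 0x1000

   def create_acp_tap(name=b"acp0"):
       # Create a tap interface (here arbitrarily named "acp0") on the NMS
       # host; requires CAP_NET_ADMIN.  The ACP application would assign
       # its ACP ULA address to this interface and copy packets between
       # this file descriptor and its ACP secure channel(s).
       fd = os.open("/dev/net/tun", os.O_RDWR)
       ifreq = struct.pack("16sH", name, IFF_TAP | IFF_NO_PI)
       fcntl.ioctl(fd, TUNSETIFF, ifreq)
       return fd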
2.1.7.  Encryption of data-plane connections

When combining ACP and data-plane connectivity for availability and performance reasons, this too has an impact on security: when using the ACP, the traffic will mostly be protected by encryption, especially when considering the above-described use of AN application devices. If instead the data-plane is used, this is no longer the case unless it is done by the application.

The simplest solution to this problem exists when using AN-capable NMS hosts, because in that case the communicating AN-capable NMS host and the AN network device have certificates from the AN enrollment process that they can mutually trust (same AN domain). As a result, data-plane connectivity that supports it can simply leverage TLS/dTLS with mutual AN-domain certificate authentication - and does not incur new key management.

If this automatic security benefit is seen as most important, but a "full" ACP stack on the NMS host is unfeasible, then it would still be possible to design a stripped-down version of AN functionality for such NOC hosts that only provides enrollment of the NOC host into the AN domain, to the extent that the host receives an AN domain certificate, but without the host directly participating in the ACP afterwards. Instead, the host would just leverage TLS/dTLS using its AN certificate via the data-plane with AN network devices, as well as indirectly via the ACP through the above-mentioned in-NOC network edge connectivity into the ACP.

When using the ACP itself, TLS/dTLS on the transport layer between NMS hosts and network devices is somewhat of a double price to pay (the ACP also encrypts) and could potentially be optimized away, but given the assumed lower performance of the ACP, this seems to be an unnecessary optimization.

2.1.8.  Long-term direction of the solution

If we consider what could be the most lightweight and autonomic long-term solution based on the technologies described above, we see the following direction:

1.  NMS hosts should at least support IPv6. IPv4/IPv6 NAT in the network to enable use of the ACP is undesirable in the long term. Having IPv4-only applications automatically leverage IPv6 connectivity via host-stack options is likely not feasible (NOTE: this still has to be vetted more).

2.  Build the ACP as a lightweight application for NMS hosts so that the ACP extends all the way into the actual NMS hosts.

3.  Leverage, and as necessary enhance, MP-TCP with automatic dual connectivity: if an MP-TCP-unaware application is using ACP connectivity, the policies used should add subflow(s) via the data-plane and prefer them.

4.  Consider how to best map NMS host desires to underlying transport mechanisms: with the above three points, not all options are covered. Depending on the OAM operation, one may still want only ACP, only data-plane, or automatically prefer one over the other, and/or use the ACP with low or high performance (for emergency OAM actions such as countering DDoS). It is not clear today what the simplest set of tools is to enable explicitly choosing the desired behavior of each OAM operation. The use of the above-mentioned DNS and MP-TCP mechanisms is a start, but this will require additional thought. This is likely a specific case of the more generic scope of TAPS.

2.2.  Stable connectivity for distributed network/OAM functions

The ANI (ACP, Bootstrap, GRASP) can provide, via the GRASP protocol, common direct-neighbor discovery and capability negotiation (GRASP via ACP and/or data-plane) and stable and secure connectivity for functions running distributed in network devices (GRASP via ACP).
It can therefore eliminate the need to re-implement similar functions in each distributed function in the network. Today, every distributed protocol does this with functional elements usually called "Hello" mechanisms and with often protocol-specific security mechanisms.

KARP (Keying and Authentication for Routing Protocols, see [RFC6518]) has started to provide common directions and therefore to reduce the re-invention of at least some of the security aspects, but it only covers routing protocols, and it is unclear how well it is applicable to a potentially wider range of distributed network agents such as those performing distributed OAM functions. The ACP can help in these cases.

3.  Security Considerations

In this section, we discuss only security considerations not covered in the appropriate sub-sections of the solutions described.

Even though ACPs are meant to be isolated, explicit operator misconfiguration to connect to insecure OAM equipment and/or bugs in ACP devices may cause leakage into places where it is not expected. Mergers/acquisitions and other complex network reconfigurations affecting the NOC are typical examples.

The ULA addressing proposed in this document is preferred over globally reachable addresses because it is not routed in the global Internet and will therefore be subject to more filtering even in places where specific ULA addresses are being used.

Random ULA addressing provides more than sufficient protection against address collision, even though there is no central assignment authority. This is helped by the expectation that ACPs will never all be connected to each other; only a few ACPs may ever need to connect, e.g. when mergers and acquisitions occur.

If packets with unexpected ULA addresses are seen and one expects them to come from another network's ACP from which they leaked, then some form of ULA prefix registration (not allocation) can be beneficial. Some voluntary registries exist, for example https://www.sixxs.net/tools/grh/ula/, although none of them is preferred, because none is operated by a recognized authority. If an operator wants to make its ULA prefix known, it might need to register it with multiple existing registries.

Centrally assigned ULA addresses (ULA-C) were an attempt to introduce centralized registration of randomly assigned addresses and potentially even to carve out a different ULA prefix for such addresses. This proposal is currently not proceeding, and it is questionable whether the stable connectivity use case provides sufficient motivation to revive this effort.

Using current registration options implies that there will be no reverse DNS mapping for ACP addresses. Instead, one will have to rely on looking up the unknown/unexpected network prefix in the registry to determine the owner of these addresses.

Reverse DNS resolution may be beneficial for specific, already deployed, insecure legacy protocols on NOC OAM systems that intend to communicate via the ACP (e.g. TFTP) and that leverage reverse DNS for authentication. Given how the ACP provides path security, except potentially for the last hop in the NOC, the ACP does make it easier to extend the lifespan of such protocols in a secure fashion, as far as just the transport is concerned. The ACP does not, however, make reverse DNS lookup a secure authentication method. Any current and future protocols must rely on secure end-to-end communications (TLS, dTLS) and on identification and authentication via the certificates assigned to both ends. This is enabled by the certificate mechanisms of the ACP.
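As an illustration only: with Python's standard ssl module, such a mutually authenticated session over either the ACP or the data-plane could look roughly like the sketch below. The file names and the choice to disable host-name checking (relying on the AN-domain certificate content instead) are assumptions made for the example, not requirements of the ACP.

   import socket
   import ssl

   # Hypothetical file names; the CA and the host certificate/key are
   # assumed to come from the host's enrollment into the AN/ACP domain.
   DOMAIN_CA = "acp-domain-ca.pem"
   HOST_CERT = "nms-host-cert.pem"
   HOST_KEY = "nms-host-key.pem"

   def open_domain_authenticated_session(device_addr, port):
       # TLS client connection in which both ends present certificates
       # issued by the same AN domain CA, so no additional key management
       # is needed for OAM sessions over the data-plane or the ACP.
       context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
       context.load_verify_locations(DOMAIN_CA)      # trust the domain CA
       context.load_cert_chain(HOST_CERT, HOST_KEY)  # present our own cert
       context.check_hostname = False                # identity comes from
       context.verify_mode = ssl.CERT_REQUIRED       # the domain cert itself
       raw = socket.create_connection((device_addr, port), timeout=5)
       return context.wrap_socket(raw)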
If DNS, and especially reverse DNS, are set up, then they should be set up in an automated fashion, linked to the autonomic registrar backend, so that the DNS and reverse DNS records are actually derived from the subject name elements of the ACP device certificates - in the same way as the autonomic devices themselves derive their ULA addresses from their certificates - to ensure correct and consistent DNS entries.

If an operator feels that reverse DNS records are beneficial to its own operations but that they should not be made available publicly for reasons of "security" by concealment, then the case of ACP DNS entries is probably one of the least problematic use cases for split DNS: the ACP DNS names are only needed by the NMS hosts intending to use the ACP - not network-wide across the enterprise.

4.  No IPv4 for the ACP

The ACP is targeted to be IPv6-only, and the prior explanations in this document show that this can lead to some complexity when having to connect IPv4-only NOC solutions, and that it will be impossible to leverage the ACP when the OAM agents on an ACP network device do not support IPv6. Therefore, the question was raised whether the ACP should optionally also support IPv4.

The decision not to include IPv4 for the ACP in the use cases considered in this document was made for the following reasons:

In SP networks that have started to support IPv6, the next planned step is often to move IPv4 from being a native transport to being just a service on the edge. There is no benefit or need for multiple parallel transport families within the network, and standardizing on one reduces OPEX and improves reliability. This evolution in the data-plane makes it highly unlikely that investing development cycles into IPv4 support for the ACP will have a longer-term benefit or enough critical short-term use cases. Supporting only IPv6 for the ACP is purely a strategic choice to focus on the known important long-term goals.

In other types of networks as well, we think that the effort to support autonomic networking is better spent on ensuring that the one address family will be supported, so that all use cases will work with it long-term, instead of duplicating the effort for IPv4. This holds especially because auto-addressing for the ACP would be more complex with IPv4 than with IPv6, due to the size of the IPv4 address space.

5.  IANA Considerations

This document requests no action by IANA.

6.  Acknowledgements

This work originated from an Autonomic Networking project at Cisco Systems, which started in early 2010, including customers involved in the design and early testing. Many people contributed to the aspects described in this document, including, in alphabetical order: BL Balaji, Steinthor Bjarnason, Yves Herthoghs, Sebastian Meissner, Ravi Kumar Vadapalli. The authors would also like to thank Michael Richardson, James Woodyatt and Brian Carpenter for their review and comments. Special thanks to Sheng Jiang for his thorough review.
7.  Change log [RFC Editor: Please remove]

   03: Integrated fixes from the Shepherd review (Sheng Jiang).

   02: Refresh timeout.  Stable document, change in author association.

   01: Refresh timeout.  Stable document, no changes.

   00: Changed title/dates.

   individual-02: Updated references.

   individual-03: Modified ULA text to no longer suggest ULA-C as clearly better, but still mention it.

   individual-02: Added explanation why there is no IPv4 for the ACP.

   individual-01: Added security section discussing the role of address prefix selection and DNS for the ACP.  Title change to emphasize focus on OAM.  Expanded abstract.

   individual-00: Initial version.

8.  References

   [I-D.ietf-anima-autonomic-control-plane]
              Behringer, M., Eckert, T., and S. Bjarnason, "An Autonomic
              Control Plane", draft-ietf-anima-autonomic-control-plane-06
              (work in progress), March 2017.

   [I-D.ietf-anima-bootstrapping-keyinfra]
              Pritikin, M., Richardson, M., Behringer, M., Bjarnason, S.,
              and K. Watsen, "Bootstrapping Remote Secure Key
              Infrastructures (BRSKI)",
              draft-ietf-anima-bootstrapping-keyinfra-06 (work in
              progress), May 2017.

   [I-D.ietf-anima-grasp]
              Bormann, C., Carpenter, B., and B. Liu, "A Generic Autonomic
              Signaling Protocol (GRASP)", draft-ietf-anima-grasp-14 (work
              in progress), July 2017.

   [I-D.ietf-anima-reference-model]
              Behringer, M., Carpenter, B., Eckert, T., Ciavaglia, L.,
              Pierre, P., Liu, B., Nobre, J., and J. Strassner, "A
              Reference Model for Autonomic Networking",
              draft-ietf-anima-reference-model-04 (work in progress),
              July 2017.

   [RFC1034]  Mockapetris, P., "Domain names - concepts and facilities",
              STD 13, RFC 1034, DOI 10.17487/RFC1034, November 1987,
              <https://www.rfc-editor.org/info/rfc1034>.

   [RFC4193]  Hinden, R. and B. Haberman, "Unique Local IPv6 Unicast
              Addresses", RFC 4193, DOI 10.17487/RFC4193, October 2005,
              <https://www.rfc-editor.org/info/rfc4193>.

   [RFC6518]  Lebovitz, G. and M. Bhatia, "Keying and Authentication for
              Routing Protocols (KARP) Design Guidelines", RFC 6518,
              DOI 10.17487/RFC6518, February 2012,
              <https://www.rfc-editor.org/info/rfc6518>.

   [RFC6824]  Ford, A., Raiciu, C., Handley, M., and O. Bonaventure, "TCP
              Extensions for Multipath Operation with Multiple Addresses",
              RFC 6824, DOI 10.17487/RFC6824, January 2013,
              <https://www.rfc-editor.org/info/rfc6824>.

   [RFC7575]  Behringer, M., Pritikin, M., Bjarnason, S., Clemm, A.,
              Carpenter, B., Jiang, S., and L. Ciavaglia, "Autonomic
              Networking: Definitions and Design Goals", RFC 7575,
              DOI 10.17487/RFC7575, June 2015,
              <https://www.rfc-editor.org/info/rfc7575>.

Authors' Addresses

   Toerless Eckert
   Futurewei Technologies Inc.
   2330 Central Expy
   Santa Clara  95050
   USA

   Email: tte+ietf@cs.fau.de

   Michael H. Behringer

   Email: michael.h.behringer@gmail.com