NETCONF Data Modeling Language Working Group (netmod)            E. Voit
Internet-Draft                                                  A. Clemm
Intended status: Informational                             Cisco Systems
Expires: September 10, 2015                                   S. Mertens
                                                                Prismtech
                                                            March 9, 2015

 Requirements for Peer Mounting of YANG subtrees from Remote Datastores
               draft-voit-netmod-peer-mount-requirements-02

Abstract

Network integrated applications want simple ways to access YANG
objects and subtrees which might be distributed across the network.
Performance requirements may dictate that it is unaffordable for a
subset of these applications to go through existing centralized
management brokers.  For such applications, development complexity
must be minimized.  Specific aspects of complexity developers want to
ignore include:

o  whether authoritative information is actually sourced from remote
   datastores (as well as how to get to those datastores),

o  whether such information has been locally cached or not,

o  whether there are zero, one, or more controllers asserting
   ownership of information, and

o  whether there are interactions with other applications
   concurrently running elsewhere.

The solution requirements described in this document detail what is
needed to support application access to authoritative network YANG
objects from controllers (star) or peering network devices (mesh) in
such a way as to meet these goals.

Status of This Memo

This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF).  Note that other groups may also distribute
working documents as Internet-Drafts.  The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time.  It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."

This Internet-Draft will expire on September 10, 2015.

Copyright Notice

Copyright (c) 2015 IETF Trust and the persons identified as the
document authors.  All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document.  Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document.  Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.

Table of Contents

1.  Business Problem
2.  Terminology
3.  Solution Context
    3.1.  Peer Mount
    3.2.  Eventual Consistency and YANG 1.1
4.  Example Use Cases
    4.1.  Cloud Policer
    4.2.  DDoS Thresholding
    4.3.  Service Chain Classification, Load Balancing and Capacity
          Management
5.  Requirements
    5.1.  Application Simplification
    5.2.  Caching
    5.3.  Subscribing to Remote Object Updates
    5.4.  Lifecycle of the Mount Topology
          5.4.1.  Discovery and Creation of Mount Topology
          5.4.2.  Restrictions on the Mount Topology
    5.5.  Mount Filter
    5.6.  Auto-Negotiation of Peer Mount Client QoS
    5.7.  Datastore Qualification
    5.8.  Local Mounting
    5.9.  Mount Cascades
    5.10. Transport
    5.11. Security Considerations
    5.12. High Availability
          5.12.1.  Reliability
          5.12.2.  Alignment to late joining peers
          5.12.3.  Liveliness
          5.12.4.  Merging of datasets
          5.12.5.  Distributed Mount Servers
    5.13. Configuration
    5.14. Assurance and Monitoring
6.  IANA Considerations
7.  Acknowledgements
8.  References
    8.1.  Normative References
    8.2.  Informative References
    8.3.  URIs
Authors' Addresses
1.  Business Problem

Instrumenting Physical and Virtual Network Elements purely along
device boundaries is insufficient for today's requirements.  Instead,
users, applications, and operators are asking for the ability to
interact with varying subsets of network information at the highest
viable level of abstraction.  Likewise, applications that run locally
on devices may require access to data that transcends the boundaries
of the device on which they are deployed.  Achieving this can be
difficult since a running network comprises a distributed mesh of
object ownership.  (I.e., the authoritative device owning a particular
object will vary.)  Solutions require the transparent assembly of
different objects from across a network in order to provide the
consolidated, time synchronized, and consistent views required for
that abstraction.

Recent approaches have focused on a Network Controller as the arbiter
of new network-wide abstractions.  Controller based solutions are
supported by the requirements outlined in this document.  However,
this is not the only deployment model covered by this document.
Equally valid are deployment models where Network Elements exchange
information in a way which allows one or more of those Elements to
provide the desired network level abstraction.  This is not a new
idea.  Examples of Network Element based protocols which already
provide network level abstractions include VRRP [RFC3768],
mLACP/ICCP [ICCP], and Anycast-RP [RFC4610].  As network elements
increase their compute power and support Linux based compute
virtualization, we should expect additional local applications to
emerge as well (such as Distributed Analytics [1]).

Ultimately network application programming must be simplified.  To do
this:

o  we must provide APIs to both controller and network element based
   applications in a way which allows access to network objects as if
   they were coming from a cloud,

o  we must enable these local applications to interact with network
   level abstractions,

o  we must hide the mesh of interdependencies and consistency
   enforcement mechanisms between devices which will underpin a
   particular abstraction,

o  we must enable flexible deployment models, in which applications
   are able to run not only on controller and OSS frameworks but also
   on network devices without requiring heavy middleware with large
   footprints, and

o  we need to maintain clear authoritative ownership of individual
   data items while not burdening applications with the need to
   reconcile and synchronize information replicated in different
   systems, nor needing to maintain redundant data models that operate
   on the same underlying data.

These steps will eliminate much unnecessary overhead currently
required of today's network programmer.

2.  Terminology

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [RFC2119].

Authoritative Datastore - A datastore containing the authoritative
copy of an object, i.e. the source and the "owner" of the object.
Client Datastore - A datastore containing an object whose source and
"owner" is a remote datastore.

Data Node - An instance of management information in a YANG
datastore.

Datastore - A conceptual store of instantiated information, with
individual data items represented by data nodes which are arranged in
a hierarchical manner.

Data Subtree - An instantiated data node and the data nodes that are
hierarchically contained within it.

Mount Client - The system at which the mount point resides, into
which one or more remote subtrees may be mounted.

Mount Binding - An instance of mounting from a specific Mount Point
to a remote datastore.  Types include:

o  On-demand: the Mount Client only pulls information when an
   application requests it

o  Periodic: the Mount Server pushes current state at a pre-defined
   interval

o  Unsolicited: the Mount Server maintains active bindings and sends
   updates to the client cache upon change

Mount Point - A point in the local datastore which may reference a
single remote subtree.

Mount Server - The server with which the Mount Client communicates
and which provides the Mount Client with access to the mounted
information.  Can be used synonymously with Mount Target.

Peer Mount - The act of representing remote objects in the local
datastore.

Target Data Node - The Data Node on the Mount Server against which a
Mount Binding is established.

3.  Solution Context

YANG modeling has emerged as a preferred way to offer network
abstractions.  The requirements in this document can be enabled by
expanding the syntax of the YANG capabilities embodied within RFC 6020
[RFC6020] and YANG 1.1 [rfc6020bis].  A companion draft, which details
a potential set of YANG technology extensions that can support key
requirements within this document, is [draft-clemm-mount].

To date, systems built upon YANG models have been missing two
capabilities:

1.  Peer Datastore Mount: Datastores have not been able to proxy
    objects located elsewhere.  This puts additional burden upon
    applications which then need to find and access multiple
    (potentially remote) systems.

2.  Eventual Consistency: YANG Datastore implementations have
    typically assumed ACID [2] transaction models.  There is nothing
    inherent in YANG itself which demands ACID transactional
    guarantees.  YANG models can also expose information which might
    be in the process of undergoing convergence.  Since IP networking
    has been designed with convergence in mind, this is a useful
    capability since some types of applications must participate
    where there is dynamically changing state.

3.1.  Peer Mount

First this document will dive deeper into Peer Datastore Mount
(a.k.a., "Peer Mount").  Contrary to existing YANG datastores, where
hierarchical data trees are local in scope and only include data that
is "owned" by the local system, we need an agent or interface on one
system which is able to refer to managed resources that reside on
another system.  This allows applications on the same system as the
YANG datastore server, as well as remote clients that access the
datastore through a management protocol such as NETCONF, to access
all data as if it were local to that same server.  This must be done
in a manner that is transparent to users and applications, without
requiring them to be aware that some data resides in a different
location or to directly access that other system.  In this way, the
user is projected an image of one virtual consolidated datastore.
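As a purely illustrative sketch of the concept, the fragment below
shows how a local module might declare an attachment point under
which a subtree owned by a remote peer appears.  The module name, the
"mountpoint" and "target" extensions, and the target path are all
hypothetical and are defined inline only to keep the sketch
self-contained; they do not represent the syntax defined in
[draft-clemm-mount].

   module example-peer-mount {
     namespace "urn:example:peer-mount";
     prefix epm;

     // Hypothetical extensions marking a Peer Mount attachment point.
     extension mountpoint {
       argument name;
       description "Declares an attachment point for a remote subtree.";
     }
     extension target {
       argument path;
       description "Path to the Target Data Node on the Mount Server.";
     }

     container network-view {
       description
         "Locally optimized hierarchy projected to applications.";

       container edge-interfaces {
         description
           "Interface counters whose authoritative copies reside on a
            remote edge router; applications read them as local data.";
         epm:mountpoint "edge-if-counters" {
           epm:target "/interfaces-state/interface/statistics";
         }
       }
     }
   }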
The value in such a datastore comes from its under-the-covers
federation.  The datastore transparently exposes information from
multiple systems across the network.  The user does not need to be
aware of the precise distribution and ownership of the data, nor is
there a need for the application to discover those data sources,
maintain separate associations with them, and partition its
operations to fit along remote system boundaries.  The effect is that
a network device can broaden and customize the information available
for local access.  Life for the application is easier.

Any object type can be included in such a datastore.  This can
include configuration data that is either persistent or ephemeral,
and which is valid within only a single device or across a domain of
devices.  This can include operational data that represents state
across a single device or across multiple devices.

Another useful aspect of "Peer Mount" is its ability to embed
information from external YANG models which haven't necessarily been
normalized.  Normalization is a good thing.  But the massive human
efforts invested in uber-data-models have never gained industry
traction due to the resulting models' brittle nature and complexity.
By mounting remote trees/objects into local datastores it is possible
to expose remote objects under a locally optimized hierarchy without
having to transpose remote objects into a separate local model.  Once
this exists, object translation and normalization become optional
capabilities which may also be hidden.

Another useful aspect of "Peer Mount" is its ability to mount remote
trees where the local datastore does not know the full subtree being
installed.  In fact, the remote datastore might be dynamically
changing the mounted tree.  These dynamic changes can be reflected as
needed under the "attachment points" within the namespace hierarchy
where the data subtrees from remote systems have been mounted.  In
this case, the precise details of what these subtrees exactly contain
do not need to be understood by the system implementing the
attachment point; it simply acts as a single point of entry and
"proxy" for the attached data.

3.2.  Eventual Consistency and YANG 1.1

The CAP theorem [3] states that it is impossible for a distributed
computer system to simultaneously provide Consistency, Availability,
and Partition tolerance.  (I.e., distributed network state management
is hard.)  Mostly for this reason, YANG implementations have shied
away from distributed datastore implementations where ACID
transactional guarantees cannot be given.  This of course limits the
universe of applicability for YANG technology.

Leveraging YANG concepts, syntax, and models for objects which might
be undergoing network convergence is valuable.  Such reuse greatly
expands the universe of information visible to networking
applications.  The good news is that there is nothing in YANG 1.1
syntax that prohibits its reapplication for distributed datastores.
Extensions are needed, however.
Requirements described within this document can be used to define
technology extensions to YANG 1.1 for remote datastore mounting.
Because of the CAP theorem, it must be recognized that systems built
upon these extensions MAY choose to support eventual consistency
rather than ACID guarantees.  Some applications do not demand ACID
guarantees (examples are contained in this document's Use Case
section).  Therefore, for certain classes of applications, eventual
consistency [4] should be viewed as a cornerstone feature capability
rather than a bug.

Other industries have been able to identify and realize the value in
such a model.  The Object Management Group Data-Distribution Service
for Real-Time Systems has even standardized these capabilities for
non-YANG deployments [OMG-DDS].  Commercial deployments exist.

4.  Example Use Cases

Many types of applications can benefit from the simple and quick
availability of objects from peer network devices.  Because network
management and orchestration systems have been fulfilling a subset of
the requirements for decades, it is important to focus on what has
changed.  Changes include:

o  SDN applications wish to interact with local datastore(s) as if
   they represent the real-time state of the distributed network.

o  Independent sets of applications and SDN controllers might care
   about the same authoritative data node or subtree.

o  Changes in the real-time state of objects can announce themselves
   to subscribing applications.

o  The union of an ever increasing number of abstractions provided
   from different layers of the network are assumed to be consistent
   with each other (at least once a reasonable convergence time has
   been factored in).

o  CPU and VM improvements make running Linux based applications on
   network elements viable.

Such changes can enable a new class of applications.  These
applications are built upon fast-feedback-loops which dynamically
tune the network based on iterative interactions upon a distributed
datastore.

4.1.  Cloud Policer

A Cloud Policer enables a single aggregated data rate for tenants/
users of a data center cloud, a rate that applies across their VMs
independent of where specific VMs are physically hosted.  This works
by having edge router based traffic counters available to a
centralized application, which can then maintain an aggregate across
those counters.  Based on the sum of the counters across the set of
edge routers, new values for each device based Policer can be
recalculated and installed.  Effectively, policing rates are
continuously rebalanced based on the most recent traffic offered to
the aggregate set of edge devices.

The cloud policer provides a very simple cloud QoS model.  Many other
QoS models could also be implemented.  Example extensions include:

o  CIR/PIR guarantees for a tenant,

o  hierarchical QoS treatment,

o  providing traffic delivery guarantees for specific enterprise
   branch offices, and

o  adjusting the prioritization of one application based on the
   activity of another application which perhaps is in a completely
   different location.

It is possible to implement such a cloud policer application with
maximum application developer simplicity using peer mount.  To do
this, the application accesses a local datastore which in turn peer
mounts, from the edge routers, the objects which house current
traffic counter statistics.  These counters are accessed as if they
were part of the local datastore structures, without concern for the
fact that the actual authoritative copies reside on remote systems.
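The following hypothetical module sketches what the controller-side
datastore for such a cloud policer might look like: the per-edge-
router counters appear under an attachment point whose contents are
peer mounted from the edge routers, while the recalculated per-device
rates are locally owned and can in turn be mounted back by those
routers.  All names are illustrative only and are not defined by any
referenced draft.

   module example-cloud-policer {
     namespace "urn:example:cloud-policer";
     prefix ecp;

     container tenant-traffic {
       config false;
       description
         "Attachment point: per edge router byte counters whose
          authoritative copies are peer mounted from each edge router.";
       list edge-router {
         key "name";
         leaf name       { type string; }
         leaf in-octets  { type uint64; }
         leaf out-octets { type uint64; }
       }
     }

     container policer-rates {
       description
         "Locally owned results of the central recalculation.  Each
          edge router peer mounts its own entry and applies it as if
          it were local configuration.";
       list device {
         key "name";
         leaf name           { type string; }
         leaf committed-rate { type uint64; units "bits/second"; }
         leaf peak-rate      { type uint64; units "bits/second"; }
       }
     }
   }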
Beyond this centralized counter collection peer mount, it is also
possible to have distributed edge routers mount information in the
reverse direction.  In this case, local edge routers can peer mount
the centrally calculated policer rates for the device, and access
these objects as if they were locally configured.

For both directions of mounting, the authoritative copy resides in a
single system and is mounted by peers.  Therefore issues with regards
to inconsistent configuration of the same redundant data across the
network are avoided.  As can also be seen in this use case, the same
system can act as a mount client of some objects while acting as a
server for other objects.

4.2.  DDoS Thresholding

Another extension of the "Cloud Policer" application is the creation
of additional action thresholds at bandwidth rates far greater than
might be expected.  If these higher thresholds are hit, it is
possible to connect in DDoS scrubbers to ingress traffic.  This can
be done in seconds after a bandwidth spike.  This can also be done if
non-bandwidth counters are available.  For example, if TCP flag
counts are available it is possible to look for changes in SYN/ACK
ratios which might signal a different type of attack.  In all cases,
when network counters indicate a return to normal traffic profiles
the DDoS scrubbers can be automatically disconnected.

Benefits of only connecting a DDoS scrubber in the rare event an
attack might be underway include:

o  marking down traffic for an out-of-profile tenant so that a
   potential attack doesn't adversely impact others,

o  applying DDoS scrubbing across many devices when an attack is
   detected in one,

o  reducing DDoS scrubber CPU, power, and licensing requirements
   (during the vast majority of time, spikes are not occurring), and

o  dynamic management and allocation of scarce platform resources
   (such as optimizing span port usage, or limiting IP-FIX reporting
   to levels where devices can do full flow detail exporting).

4.3.  Service Chain Classification, Load Balancing and Capacity
      Management

Service chains will dynamically change ingress classification filters
and allocate paths from many ingress devices across shared resources.
This information needs to be updated in real time as available
capacity is allocated or failures are discovered.  It is possible to
simplify service chain configuration and dynamic topology maintenance
by transparently updating remote cached topologies when an
authoritative object is changed within a central repository.  For
example, if the CPU in one VM spikes, you might want to recalculate
and adjust many chained paths to relieve the pressure.  Or perhaps
after the recalculation you want to spin up a new VM, and then adjust
chains when that capacity is on-line.

A key value here is central calculation and transparent auto-
distribution.
In other words, a change need only be made by an application in a
single location, and the infrastructure will automatically
synchronize changes across any number of subscribing devices without
application involvement.  In fact, the application need not even know
how many devices are monitoring the object which has been changed.

Beyond 1:n policy distribution, applications can step back from
aspects of failure recovery.  What happens if a device is rebooting
or simply misses a distribution of new information?  With peer mount
there is no doubt as to where the authoritative information resides
if things get out of synch.

While this ability is certainly useful for dynamic service chain
filtering classification and next hop mapping, this use case has more
general applicability.  With a distributed datastore, diverse
applications and hosts can locally access a single device's current
VM CPU and bandwidth values.  They can do it without needing to
explicitly query that remote machine.  Updates from a device would
come either from a periodic push of statistics to a transparent cache
to which others have subscribed, or via an unsolicited update which
is only sent when these values exceed established norms.

5.  Requirements

To achieve the objectives described above, the network needs to
support a number of requirements.

5.1.  Application Simplification

A major obstacle to network programmability is any requirement which
forces applications to use abstractions more complicated than the
developer cares to touch.  To simplify application development and
reduce unnecessary code, the following needs must be met.

Applications MUST be able to access a local datastore which includes
objects whose authoritative source is located in a remote datastore
hosted on a different server.

Local datastores MUST be able to provide a hierarchical view of
objects assembled from objects whose authoritative source may
originate from potentially different and overlapping namespaces.

Applications MUST be able to access all objects of a datastore
without concern for where the actual object is located, i.e. whether
the authoritative copy of the object is hosted on the same system as
the local datastore or whether it is hosted in a remote datastore.

With two exceptions, a datastore's application facing interfaces MUST
make no differentiation as to whether individual objects exposed are
authoritatively owned by the datastore or mounted from a remote
datastore.  This includes NETCONF and RESTCONF as well as other,
possibly proprietary interfaces (such as CLI generated from
corresponding YANG data models).  The two exceptions, for which it is
acceptable to make a distinction between an object authoritatively
owned by the datastore and a remote object, are as follows:

o  Object updates: editing, creation, and deletion.  E.g., via edit-
   config, conditions and constraints are assessed at the
   authoritative datastore when the update/create/delete is
   conducted.  Any conditions or constraints at remote client
   datastores are NOT assessed.
o  Locks obtained at a client datastore: it is conceivable for the
   interface to distinguish between two lock modes: locking the
   entire subtree including remote data (in which case the
   datastore's mount client needs to explicitly obtain and release
   locks from mounted authoritative datastores), or locking only
   authoritatively owned data, excluding remote data from the lock.

These exceptions should not be very problematic, as non-authoritative
copies will typically be marked as read-only.  This will not violate
any considerations of "no differentiation" between local and remote.

When a change is made to an object, that change will be reflected in
any datastore in which the object is included.  This means that a
change made to the object through a remote datastore will affect the
object in the authoritative datastore.  Likewise, changes to an
object in the authoritative datastore will be reflected at any client
datastores.

The distributed datastore MUST be able to include objects from
multiple remote datastores.  The same object may be included in
multiple remote datastores; in other words, an object's authoritative
datastore MUST support multiple clients.

The distributed datastore infrastructure MUST enable access to some
subset of the same objects on different devices.  (This includes
multiple controllers as well as multiple physical and virtual peer
devices.)

Applications SHOULD be able to extract a time synchronized set of
operational data from the datastore.  (In other words, the
application asks for a subset of network state at time-stamp or time-
range "X".  The datastore would then deliver time synchronized
snapshots of the network state per the request.  The datastore may
work with NTP and operational counters to optimize the
synchronization results of such a query.  It is understood that some
types of data might be undergoing convergence conditions.)

Authoritative datastores retain full ownership of "their" objects.
This means that while remote datastores may access the data, any
modifications to objects that are initiated at those remote
datastores need to be authorized by the authoritative owner of the
data.  Likewise, the authoritative owner of the data may make changes
to objects, including modifications, additions, and deletions,
without needing to first ask for permission from remote clients.

Applications MUST be designed to deal with incomplete data if remote
objects are not accessible, e.g. due to temporary connectivity issues
preventing access to the authoritative source.  (This will be true
for many protocols and programming languages.  Mount is unlikely to
add anything new here beyond the need for extra error handling
routines which deal with cases where there is no response from a
remote system.)

5.2.  Caching

Remote objects in a datastore can be accessed "on demand", when the
application interacting with the datastore demands it.  In that case,
a request made to the local datastore is forwarded to the remote
system.  The response from the remote system, e.g. the retrieved
data, is subsequently merged and collated with the other data to
return a consolidated response to the invoking application.

A downside of a datastore which is distributed across devices can be
the latency induced when remote object acquisition is necessary.
There are plenty of applications which have requirements which simply
cannot be served when latency is introduced.  The good news is that
the concept of caching lends itself well to distributed datastores.
It is possible to transparently store some types of objects locally
even when the authoritative copy is remote.  Instead of fetching data
on demand when an application requests it, the application is simply
provided with the local copy.  It is then up to the datastore
infrastructure to keep selected replicated information in sync, e.g.
by prefetching information, or by having the remote system publish
updates which are then locally stored.  At this point, it is expected
that a preferred method of subscribing to and publishing updates will
be accomplished via [yang-pub-sub-reqts] and
[draft-clemm-datastore-push].  Other methods could work equally well.

This is not a new idea.  Caching and Content Delivery Networks (CDN)
have sped read access for objects within the Internet for years.
This has enabled greater performance and scale for certain content.
Just as important, these technologies have been employed without end
user applications being explicitly aware of their involvement.  Such
concepts are applicable for scaling the performance of a distributed
datastore.

Where caching occurs, it MUST be possible for the Mount Client to
store object copies of a remote data node or subtree in such a way
that applications are unaware that any caching is occurring.
However, the interface to a datastore MAY provide applications with a
special mode/flag to allow them to force a read-through.

Where caching occurs, system administration facilities SHOULD allow
the flushing of either the entire cache, or of information associated
with selected Mount Points.
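As a purely illustrative sketch of the kind of administrative knobs
these caching requirements imply (none of the names below are defined
by any referenced draft), a Mount Client might expose per Mount Point
cache controls along the following lines:

   module example-mount-cache {
     yang-version 1.1;
     namespace "urn:example:mount-cache";
     prefix emc;

     container mount-cache {
       list mount-point {
         key "id";
         leaf id { type string; }

         leaf mode {
           type enumeration {
             enum on-demand;   // no caching; fetch per request
             enum prefetch;    // infrastructure refreshes periodically
             enum subscribe;   // Mount Server publishes changes
           }
           default on-demand;
         }

         leaf read-through {
           type boolean;
           default false;
           description
             "When true, application reads bypass the cache and are
              forwarded to the authoritative datastore.";
         }
       }

       action flush {
         description
           "Flush the entire cache, or only the entry named by
            'mount-point-id' if it is supplied.";
         input {
           leaf mount-point-id { type string; }
         }
       }
     }
   }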
5.3.  Subscribing to Remote Object Updates

When caching occurs, data can go stale.  [draft-clemm-datastore-push]
provides a mechanism where changes in an authoritative data node or
subtree can be monitored.  If changes occur, these changes can be
delivered to any subscribing datastores.  In this way remote caches
can be kept up-to-date, and remote applications directly monitoring
the data can quickly receive notifications without continuous
polling.

A Mount Server SHOULD support [draft-clemm-datastore-push] Periodic
or On-Change pub/sub capabilities in which one or more remote clients
subscribe to updates of a target data node / subtree, which are then
automatically published by the Mount Server.

It MUST be possible for applications to bind to subscribed Data Nodes
/ Subtrees so that upon Mount Client receipt of subscribed
information, it is immediately passed to the application.

It MUST be possible for a Target Data Node to support 1:n Mount
Bindings to many subscribed Mount Points.

5.4.  Lifecycle of the Mount Topology

Mount can drive a dynamic and richly interconnected mesh of peer-to-
peer object relationships.  Each of these Mounts will be
independently established by a Mount Client.

It MUST be possible to bootstrap the Mount Client by providing the
YANG paths to resources on the Mount Server.

There SHOULD be the ability to add Mount Client bindings during run-
time.

A Mount Client MUST be able to create, delete, and time out Mount
Bindings.

Any Subscription MUST be able to inform the Mount Client of an
intentional/graceful disconnect.

A Mount Client MUST be able to verify the status of Subscriptions,
and drive re-establishment if one has disappeared.
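To make the lifecycle requirements above more concrete, the following
sketch shows one possible client-side configuration of Mount
Bindings, covering bootstrap of the Mount Server address and target
path, the binding types from the Terminology section, the Datastore
Qualification discussed in Section 5.7, and a timeout.  The module
and all of its nodes are hypothetical; an implementation could choose
a very different structure.

   module example-mount-bindings {
     namespace "urn:example:mount-bindings";
     prefix emb;

     container mount-bindings {
       list binding {
         key "mount-point";

         leaf mount-point {
           type string;
           description "Local Mount Point this binding populates.";
         }
         leaf mount-server {
           type string;
           description "Address or name of the Mount Server.";
         }
         leaf target-path {
           type string;
           description
             "YANG path of the Target Data Node on the Mount Server.";
         }
         leaf datastore {
           type string;
           default "running";
           description
             "Datastore Qualification: which remote datastore to mount
              when the server exposes several (see Section 5.7).";
         }
         leaf type {
           type enumeration {
             enum on-demand;
             enum periodic;
             enum unsolicited;
           }
           default on-demand;
         }
         leaf period {
           type uint32;
           units "seconds";
           description "Push interval when type is 'periodic'.";
         }
         leaf idle-timeout {
           type uint32;
           units "seconds";
           description
             "Time without liveliness after which the Mount Client may
              tear down the binding.";
         }
       }
     }
   }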
5.4.1.  Discovery and Creation of Mount Topology

Application visibility into an ever-changing set of network objects
is not trivial.  While some applications can be easily configured to
know the Devices and available Mount Points of interest, other
applications will have to balance many aspects of dynamic device
availability, capabilities, and interconnectedness.  For the most
part, maintenance of these dynamic elements can be done on the YANG
objects themselves without anything new needed for Peer Mount.  Such
technologies are covered in other standards initiatives [reference
needed].  Therefore this draft does not delve deeply into the needs
for auto-discovery of YANG objects which may be advertised.

However, it will likely become interesting for a network element to
limit the Data Subtrees which might be subscribed for Unsolicited and
Periodic Updates.  It is assumed these capabilities will be included
as part of [draft-clemm-datastore-push].

5.4.2.  Restrictions on the Mount Topology

Mount Clients MUST NOT create recursive Mount Bindings (i.e., the
Mount Client should not load any object or subtree which it has
already delivered to another in the role of a Mount Server).  Note:
objects mounted from a controller as part of orchestration are *not*
considered the same objects as those which might be mounted back from
a network device showing the actual running config.

5.5.  Mount Filter

The Mount Server default MUST be to deliver the same Data Node /
Subtree that would have been delivered via direct YANG access.

It SHOULD be possible for a Mount Client to request something less
than the full subtree of a target node, as defined in
[yang-pub-sub-reqts].

5.6.  Auto-Negotiation of Peer Mount Client QoS

The interest that a Mount Client expresses in a particular subtree
SHOULD include the non-functional data delivery requirements (QoS) on
the data that is being mounted.  Additionally, Mount Servers SHOULD
advertise their data delivery capabilities.  With this information
the Mount Client can decide whether the quality of the delivered data
is sufficient to serve the applications residing above the Mount
Client.

An example here is reliability.  A reliable protocol might be
overkill for state that is republished with high frequency.
Therefore a Mount Server may sometimes choose not to provide a
reliable method of communication for certain objects.  It is up to
the Mount Client to determine whether what is offered is sufficiently
reliable for its application.  Only when the Mount Server offers data
delivery QoS better than or equal to what is requested shall a mount
binding be established.

Another example is where subscribed objects must be pushed from the
Mount Server within a certain interval from when an object change is
identified.  In such a scenario the interval period of the Mount
Server must be equal to or smaller than what is requested by a Mount
Client.  If this "deadline" is not met by the Mount Server, the
infrastructure MAY take action to notify clients.
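A hypothetical sketch of the QoS parameters that both a Mount Client
request and a Mount Server advertisement might carry follows; the
names are invented for illustration, and a mount binding would only
be established when the advertised values satisfy the requested ones.

   module example-mount-qos {
     namespace "urn:example:mount-qos";
     prefix emq;

     grouping data-delivery-qos {
       description
         "Non-functional data delivery characteristics; used both in a
          Mount Client request and in a Mount Server advertisement.";
       leaf reliable-delivery {
         type boolean;
         description
           "Whether updates are delivered over a reliable method.";
       }
       leaf max-push-interval {
         type uint32;
         units "milliseconds";
         description
           "Longest acceptable (requested) or guaranteed (advertised)
            delay between an object change and the corresponding push.";
       }
     }
   }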
5.7.  Datastore Qualification

It is conceivable to differentiate between different datastores on
the remote server, that is, to designate the name of the actual
datastore to mount, e.g. "running" or "startup".  If there are
multiple datastores available on the target node, but no specific
datastore has been identified by the Mount Client, then the running
or "effective" datastore is the assumed target.

It is conceivable to use such Datastore Qualification in conjunction
with ephemeral datastores, to address requirements being worked in
the I2RS WG [draft-haas].

5.8.  Local Mounting

It is conceivable that the mount target does not reside in a remote
datastore, but that data nodes in the same datastore as the
mountpoint are targeted for mounting.  This amounts to introducing an
"aliasing" capability in a datastore.  While this is not the scenario
that is primarily targeted, it is supported and there may be valid
use cases for it.

5.9.  Mount Cascades

It is possible for the mounted subtree to in turn contain a
mountpoint.  However, circular mount relationships MUST NOT be
introduced.  For this reason, a mounted subtree MUST NOT contain a
mountpoint that refers back to the mounting system with a mount
target that directly or indirectly contains the originating
mountpoint.  As part of a mount operation, the mount points of the
mounted system need to be checked accordingly.

5.10.  Transport

Many secured transports are viable assuming transport, data security,
scale, and performance objectives are met.  NETCONF is recommended as
a starting point.  Other transports may be proposed over time.

It MUST be possible to support NETCONF transport of subscribed Nodes
and Subtrees.

5.11.  Security Considerations

Many security mechanisms exist to protect data access for CLI and API
on network devices.  To the degree possible, these mechanisms should
transparently protect data when performing a Peer Mount.

The same mechanisms used to determine whether a remote host has
access to a particular YANG Data Node or Subtree MUST be invoked to
determine whether a Mount Client has access to that information.

The same traditional transport level security mechanisms used for
YANG over a particular transport MUST be used for the delivery of
objects from a Mount Server to a Mount Client.

A Mount Server implementation MUST NOT change any credentials passed
by the Mount Client system for any Mount Binding request.

The Mount Server MUST deliver no more objects from a Data Node or
Subtree than allowable based on the security credentials provided by
the Mount Client.

To ensure that maximum scale limits are respected, it MUST be
possible for a Mount Server to limit the number of bindings and to
impose transactional limits.

It SHOULD be possible to prioritize which Mount Binding instances
should be serviced first if there are CPU, bandwidth, or other
capacity constraints.

5.12.  High Availability

A key intent for Peer Mount is to allow access to an authoritative
copy of an object for a particular domain.  Of course, system and
software failures or scheduled upgrades might mean that the primary
copy is not consistently accessible from a single device.  In
addition, system failovers might mean that the authoritative copy
might be housed on a different device than the one where the binding
was originally established.
Peer Mount architectures must be built to enable Mount Clients to
transparently provide access to objects where the authoritative copy
moves due to dynamic network reconfigurations.

A Peer Mount architecture MUST guarantee that mount bindings between
a Mount Server and Mount Clients are eventually consistent.  The
infrastructure providing this level of consistency MUST be able to
operate in scenarios where a system is (temporarily) not fully
connected.  Furthermore, Mount Clients MAY have various requirements
on the boundaries under which eventual consistency is allowed to take
place.  This subject can be decomposed into the following items:

5.12.1.  Reliability

Eventual consistency can only be guaranteed when peers are
communicating using a reliable method of data delivery.  A scenario
that deserves attention in particular is when only a subset of Mount
Clients has received a pushed subscription update.  If a Mount Server
loses connectivity, cross network element consistency can be lost.
In such a scenario Mount Clients MAY elect a new designated Mount
Server from the set of Mount Clients which have received the latest
state.

5.12.2.  Alignment to late joining peers

When a mount binding is established, a Mount Server SHOULD provide
the Mount Client with the latest state of the requested data.  In
order to increase availability and fault tolerance, an infrastructure
MAY support the capability to have multiple alignment sources.  In
the (temporary) absence of a Mount Server, Mount Clients MAY elect a
temporary Mount Server to service late joining Mount Clients.

5.12.3.  Liveliness

Upon losing liveliness and being unable to refresh cached data
provided from a Mount Server, a Mount Client MAY decide to purge the
mount bindings of that server.  Purging mount bindings under such
conditions however makes a system vulnerable to losing network-wide
consistency.  A Mount Client can take proactive action based on the
assumption that the Mount Server is no longer available.  When
connectivity is only temporarily lost, this assumption could be false
for other datastores.  This can introduce a potential for decision-
making based on semantic disagreement.  To properly handle these
scenarios, application behavior MUST be designed accordingly and
timeouts with regards to liveliness detection MUST be carefully
determined.

5.12.4.  Merging of datasets

A traditional problem with merging replicated datasets during the
failover and recovery of Mount Servers is handling the corresponding
target data node lifecycle management.  When two replicas of a
dataset experience a prolonged loss of connectivity, a merge between
the two is required upon re-establishing connectivity.  A replica
might have been modifying the contents of the set, including deletion
of objects.  A naive merge of the two replicas would discard these
deletes by re-aligning the now stale, deleted objects onto the
replica that deleted them.

Authoritative ownership is an elegant solution to this problem since
modifications of content can only take place at the owner.  Therefore
a Mount Client SHOULD, upon reestablishing connectivity with a newly
authoritative Mount Server, replace any existing cache contents from
a mount binding with the latest version.
5.12.5.  Distributed Mount Servers

For selected objects, Mount Bindings SHOULD be allowed to use Anycast
addresses so that a Distributed Mount Server implementation can
transparently provide (a) availability during failure events to Mount
Clients, and (b) load balancing on behalf of Mount Clients.

5.13.  Configuration

At the Mount Client, it MUST be possible to configure all Mount
Bindings such that applications need no knowledge of the underlying
details.  This configuration will include a diverse list of elements,
such as the YANG URI path to the remote subtree.

5.14.  Assurance and Monitoring

API usage for YANG should be tracked via existing mechanisms.  There
is no intent to require more transaction tracking than would normally
be provided.  However, there are additional requirements which should
allow the state of existing and historical bindings to be provided.

A Mount Client MUST be able to poll a Mount Server for the state of
Subscriptions maintained between the two devices.

A Mount Server MUST be able to publish the set of Subscriptions which
are currently established on or below any identified data node.

6.  IANA Considerations

This document makes no request of IANA.

7.  Acknowledgements

We wish to acknowledge the helpful contributions, comments, and
suggestions that were received from Ambika Prasad Tripathy, Shashi
Kumar Bansal, Prabhakara Yellai, Dinkar Kunjikrishnan, Harish
Gumaste, Rohit M., Shruthi V., Sudarshan Ganapathi, and Swaroop
Shastri.

8.  References

8.1.  Normative References

[RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
           Requirement Levels", BCP 14, RFC 2119, March 1997.

[RFC3768]  Hinden, R., "Virtual Router Redundancy Protocol (VRRP)",
           RFC 3768, April 2004.

[RFC4610]  Farinacci, D. and Y. Cai, "Anycast-RP Using Protocol
           Independent Multicast (PIM)", RFC 4610, August 2006.

[RFC6020]  Bjorklund, M., "YANG - A Data Modeling Language for the
           Network Configuration Protocol (NETCONF)", RFC 6020,
           October 2010.

8.2.  Informative References

[ICCP]     Martini, L., "Inter-Chassis Communication Protocol for
           L2VPN PE Redundancy", March 2014.

[OMG-DDS]  "Data Distribution Service for Real-time Systems, version
           1.2", January 2007.

[draft-clemm-datastore-push]
           Clemm, A., "Subscribing to datastore push updates", March
           2015.

[draft-clemm-mount]
           Clemm, A., "Mounting YANG-Defined Information from Remote
           Datastores", October 2014.

[draft-haas]
           Haas, J., "I2RS requirements for netmod/netconf
           draft-haas-i2rs-netmod-netconf-requirements-00", September
           2014.

[rfc6020bis]
           Bjorklund, M., "YANG - A Data Modeling Language for the
           Network Configuration Protocol (NETCONF)", January 2015.

[yang-pub-sub-reqts]
           Voit, E., Clemm, A., and A. Gonzalez Prieto, "Requirements
           for Subscription to YANG Datastores", March 2015.

8.3.  URIs

[1] http://thomaswdinsmore.com/2014/05/01/distributed-analytics-
    primer/

[2] http://en.wikipedia.org/wiki/ACID

[3] http://robertgreiner.com/2014/08/cap-theorem-revisited/

[4] http://guide.couchdb.org/draft/consistency.html

Authors' Addresses

Eric Voit
Cisco Systems

Email: evoit@cisco.com

Alex Clemm
Cisco Systems

Email: alex@cisco.com

Sander Mertens
Prismtech

Email: sander.mertens@prismtech.com