NETCONF Data Modeling Language Working Group (netmod)            E. Voit
Internet-Draft                                                  A. Clemm
Intended status: Informational                                 S. Bansal
Expires: March 29, 2015                                      A. Tripathy
                                                               P. Yellai
                                                           Cisco Systems
                                                      September 25, 2014

 Requirements for Peer Mounting of YANG subtrees from Remote Datastores
               draft-voit-netmod-peer-mount-requirements-00

Abstract

   Network integrated applications want simple ways to access YANG
   objects and subtrees which might be distributed across the network.
   Performance requirements may dictate that it is unaffordable for a
   subset of these applications to go through existing centralized
   management brokers.  For such applications, development complexity
   must be minimized.  Specific aspects of complexity developers want
   to ignore include:

   o  whether authoritative information is actually sourced from remote
      datastores (as well as how to get to those datastores),

   o  whether such information has been locally cached or not,

   o  whether there are zero, one, or more controllers asserting
      ownership of information, and

   o  whether there are interactions with other applications
      concurrently running elsewhere.

   The solution requirements described in this document detail what is
   needed to support application access to authoritative network YANG
   objects from controllers (star) or peering network devices (mesh) in
   such a way as to meet these goals.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.  It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

   This Internet-Draft will expire on March 29, 2015.

Copyright Notice

   Copyright (c) 2014 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document.  Code Components extracted from this
   document must include Simplified BSD License text as described in
   Section 4.e of the Trust Legal Provisions and are provided without
   warranty as described in the Simplified BSD License.

Table of Contents

   1.  Business Problem
   2.  Terminology
   3.  Solution Context
     3.1.  Peer Mount
     3.2.  Eventual Consistency and YANG 1.1
   4.  Example Use Cases
     4.1.  Cloud Policer
     4.2.  DDoS Thresholding
     4.3.  Service Chain Classification, Load Balancing and Capacity
           Management
   5.  Requirements
     5.1.  Application Simplification
     5.2.  Caching Considerations
       5.2.1.  Caching Overview
       5.2.2.  Pub/Sub of Object Updates
     5.3.  Lifecycle of the Mount Topology
       5.3.1.  Discovery and Creation of Mount Topology
       5.3.2.  Restrictions on the Mount Topology
     5.4.  Mount Filter
     5.5.  Transport
     5.6.  Security Considerations
     5.7.  High Availability
     5.8.  Configuration
     5.9.  Assurance and Monitoring
   6.  IANA Considerations
   7.  Acknowledgements
   8.  References
     8.1.  Normative References
     8.2.  Informative References
     8.3.  URIs
   Authors' Addresses

1.  Business Problem

   Instrumenting Physical and Virtual Network Elements purely along
   device boundaries is insufficient for today's requirements.
   Instead, users, applications, and operators are asking for the
   ability to interact with varying subsets of network information at
   the highest viable level of abstraction.
   Likewise, applications that run locally on devices may require
   access to data that transcends the boundaries of the device on
   which they are deployed.  Achieving this can be difficult since a
   running network is composed of a distributed mesh of object
   ownership.  (I.e., the authoritative device owning a particular
   object will vary.)  Solutions require the transparent assembly of
   different objects from across a network in order to provide the
   consolidated, time-synchronized, and consistent views required for
   that abstraction.

   Recent approaches have focused on a Network Controller as the
   arbiter of new network-wide abstractions.  Controller-based
   solutions are supported by the requirements outlined in this
   document.  However, this is not the only deployment model covered.
   Equally valid are deployment models where Network Elements exchange
   information in a way which allows one or more of those Elements to
   provide the desired network-level abstraction.  This is not a new
   idea.  Examples of Network Element based protocols which already
   provide network-level abstractions include VRRP [RFC3768],
   mLACP/ICCP [ICCP], and Anycast-RP [RFC4610].  As network elements
   increase their compute power and support Linux-based compute
   virtualization, we should expect additional local applications to
   emerge as well (such as Distributed Analytics [1]).

   Ultimately, network application programming must be simplified.  To
   do this:

   o  we must provide APIs to both controller and network element
      based applications in a way which allows access to network
      objects as if they were coming from a cloud,

   o  we must enable these local applications to interact with network
      level abstractions,

   o  we must hide the mesh of interdependencies and consistency
      enforcement mechanisms between devices which will underpin a
      particular abstraction,

   o  we must enable flexible deployment models, in which applications
      are able to run not only on controller and OSS frameworks but
      also on network devices without requiring heavy middleware with
      large footprints, and

   o  we need to maintain clear authoritative ownership of individual
      data items while not burdening applications with the need to
      reconcile and synchronize information replicated in different
      systems, nor requiring them to maintain redundant data models
      that operate on the same underlying data.

   These steps will eliminate much of the unnecessary overhead
   currently required of today's network programmer.

2.  Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in
   this document are to be interpreted as described in RFC 2119
   [RFC2119].

   Authoritative Datastore - A datastore containing the authoritative
   copy of an object, i.e. the source and the "owner" of the object.

   Client Datastore - A datastore containing an object whose source
   and "owner" is a remote datastore.

   Data Node - An instance of management information in a YANG
   datastore.

   Datastore - A conceptual store of instantiated information, with
   individual data items represented by data nodes which are arranged
   in a hierarchical manner.

   Data Subtree - An instantiated data node and the data nodes that
   are hierarchically contained within it.

   Mount Client - The system at which the mount point resides, into
   which one or more remote subtrees may be mounted.

   Mount Binding - An instance of mounting from a specific Mount Point
   to a remote datastore.  Types include:

   o  On-demand: the Mount Client only pulls information when an
      application requests it

   o  Periodic: the Mount Server pushes current state at a pre-defined
      interval

   o  Unsolicited: the Mount Server maintains active bindings and
      sends updates to the client cache upon change

   Mount Point - A point in the local datastore which may reference a
   single remote subtree.

   Mount Server - The server with which the Mount Client communicates
   and which provides the Mount Client with access to the mounted
   information.  Can be used synonymously with Mount Target.

   Peer Mount - The act of representing remote objects in the local
   datastore.

   Target Data Node - The Data Node on the Mount Server against which
   a Mount Binding is established.

3.  Solution Context

   YANG modeling has emerged as a preferred way to offer network
   abstractions.  The requirements in this document can be enabled by
   expanding the syntax of YANG capabilities embodied within RFC 6020
   [RFC6020] and YANG 1.1 [rfc6020bis].  A companion draft, which
   details a potential set of YANG technology extensions that can
   support key requirements within this document, is
   [draft-clemm-mount].  A "-02" release of that draft, which includes
   specifications to support many additional concepts, will be posted
   in the coming days.

   To date, systems built upon YANG models have been missing two
   capabilities:

   1.  Peer Datastore Mount: Datastores have not been able to proxy
       objects located elsewhere.  This puts an additional burden upon
       applications, which then need to find and access multiple
       (potentially remote) systems.

   2.  Eventual Consistency: YANG Datastore implementations have
       typically assumed ACID [2] transaction models.  There is
       nothing inherent in YANG itself which demands ACID
       transactional guarantees.  YANG models can also expose
       information which might be in the process of undergoing
       convergence.  Since IP networking has been designed with
       convergence in mind, this is a useful capability: some types of
       applications must participate where there is dynamically
       changing state.

3.1.  Peer Mount

   First, this document will dive deeper into Peer Datastore Mount
   (a.k.a. "Peer Mount").  Contrary to existing YANG datastores, whose
   hierarchical data tree(s) are local in scope and only include data
   that is "owned" by the local system, we need an agent or interface
   on one system which is able to refer to managed resources that
   reside on another system.  This allows applications on the same
   system as the YANG datastore server, as well as remote clients that
   access the datastore through a management protocol such as NETCONF,
   to access all data as if it were local to that same server.  This
   must be done in a manner that is transparent to users and
   applications: neither needs to be aware of the fact that some data
   resides in a different location, nor to directly access that other
   system.  In this way, the user is presented with an image of one
   virtual, consolidated datastore.

   The value in such a datastore comes from its under-the-covers
   federation.  The datastore transparently exposes information from
   multiple systems across the network.  The user does not need to be
   aware of the precise distribution and ownership of the data, nor is
   there a need for the application to discover those data sources,
   maintain separate associations with them, and partition its
   operations to fit along remote system boundaries.  The effect is
   that a network device can broaden and customize the information
   available for local access.  Life for the application is easier.

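   For illustration only, the following YANG fragment sketches how a
   local module might expose a mount point under which a remote
   device's interface state appears.  The "mountpoint" extension, its
   argument, and all module and node names below are hypothetical
   placeholders; the actual extension syntax is the subject of
   [draft-clemm-mount] and is not defined by this document.

     module example-network-view {
       namespace "urn:example:network-view";
       prefix nv;

       // Hypothetical extension marking a node under which a remote
       // subtree is represented.  Placeholder only; see
       // [draft-clemm-mount] for an actual proposal.
       extension mountpoint {
         argument name;
       }

       container network-view {
         list device {
           key "device-id";
           leaf device-id { type string; }

           // Everything below this point is served from the remote
           // device's authoritative datastore, yet appears to local
           // applications as an ordinary part of this datastore.
           nv:mountpoint "interfaces-state";
         }
       }
     }

   An application reading below such a mount point simply sees the
   mounted data; whether that data was fetched on demand or served
   from a local cache is an infrastructure concern.
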
   Any Object type can be included in such a datastore.  This can
   include configuration data that is either persistent or ephemeral,
   and which is valid within only a single device or across a domain
   of devices.  This can include operational data that represents
   state across a single device or across multiple devices.

   Another useful aspect of "Peer Mount" is its ability to embed
   information from external YANG models which haven't necessarily
   been normalized.  Normalization is a good thing.  But the massive
   human effort invested in uber-data-models has never gained industry
   traction due to the resulting models' brittle nature and
   complexity.  By mounting remote trees/objects into local datastores
   it is possible to expose remote objects under a locally optimized
   hierarchy without having to transpose remote objects into a
   separate local model.  Once this exists, object translation and
   normalization become optional capabilities which may also be
   hidden.

   A further useful aspect of "Peer Mount" is its ability to mount
   remote trees where the local datastore does not know the full
   subtree being installed.  In fact, the remote datastore might be
   dynamically changing the mounted tree.  These dynamic changes can
   be reflected as needed under the "attachment points" within the
   namespace hierarchy where the data subtrees from remote systems
   have been mounted.  In this case, the precise details of what these
   subtrees contain do not need to be understood by the system
   implementing the attachment point; it simply acts as a single point
   of entry and "proxy" for the attached data.

3.2.  Eventual Consistency and YANG 1.1

   The CAP theorem [3] states that it is impossible for a distributed
   computer system to simultaneously provide Consistency,
   Availability, and Partition tolerance.  (I.e., distributed network
   state management is hard.)  Mostly for this reason, YANG
   implementations have shied away from distributed datastore
   implementations where ACID transactional guarantees cannot be
   given.  This of course limits the universe of applicability for
   YANG technology.

   Leveraging YANG concepts, syntax, and models for objects which
   might be undergoing network convergence is valuable.  Such reuse
   greatly expands the universe of information visible to networking
   applications.  The good news is that there is nothing in YANG 1.1
   syntax that prohibits its reapplication for distributed datastores.
   Extensions are needed, however.

   Requirements described within this document can be used to define
   technology extensions to YANG 1.1 for remote datastore mounting.
   Because of the CAP theorem, it must be recognized that systems
   built upon these extensions MAY choose to support eventual
   consistency rather than ACID guarantees.  Some applications do not
   demand ACID guarantees (examples are contained in this document's
   Use Case section).
   Therefore, for certain classes of applications, eventual
   consistency [4] should be viewed as a cornerstone capability rather
   than a bug.

4.  Example Use Cases

   Many types of applications can benefit from the simple and quick
   availability of objects from peer network devices.  Because network
   management and orchestration systems have been fulfilling a subset
   of the requirements for decades, it is important to focus on what
   has changed.  Changes include:

   o  SDN applications wish to interact with local datastore(s) as if
      they represent the real-time state of the distributed network.

   o  Independent sets of applications and SDN controllers might care
      about the same authoritative data node or subtree.

   o  Changes in the real-time state of objects can announce
      themselves to subscribing applications.

   o  The union of an ever increasing number of abstractions provided
      from different layers of the network is assumed to be consistent
      with each other (at least once a reasonable convergence time has
      been factored in).

   o  CPU and VM improvements make running Linux-based applications on
      network elements viable.

   Such changes can enable a new class of applications.  These
   applications are built upon fast feedback loops which dynamically
   tune the network based on iterative interactions with a distributed
   datastore.

4.1.  Cloud Policer

   A Cloud Policer enforces a single aggregated data rate for tenants/
   users of a data center cloud, a rate that applies across their VMs
   independent of where specific VMs are physically hosted.  This
   works by making edge-router-based traffic counters available to a
   centralized application, which can then maintain an aggregate
   across those counters.  Based on the sum of the counters across the
   set of edge routers, new values for each device-based Policer can
   be recalculated and installed.  Effectively, policing rates are
   continuously rebalanced based on the most recent traffic offered to
   the aggregate set of edge devices.

   The cloud policer provides a very simple cloud QoS model.  Many
   other QoS models could also be implemented.  Example extensions
   include:

   o  CIR/PIR guarantees for a tenant,

   o  hierarchical QoS treatment,

   o  providing traffic delivery guarantees for specific enterprise
      branch offices, and

   o  adjusting the prioritization of one application based on the
      activity of another application which perhaps is in a completely
      different location.

   It is possible to implement such a cloud policer application with
   maximum application developer simplicity using peer mount.  To do
   this, the application accesses a local datastore which in turn peer
   mounts, from the edge routers, the objects which house current
   traffic counter statistics.  These counters are accessed as if they
   were part of the local datastore structures, without concern for
   the fact that the actual authoritative copies reside on remote
   systems.

   Beyond this centralized counter collection peer mount, it is also
   possible to have distributed edge routers mount information in the
   reverse direction.  In this case, local edge routers can peer mount
   the centrally calculated policer rates for that device, and access
   these objects as if they were locally configured.

   For both directions of mounting, the authoritative copy resides in
   a single system and is mounted by peers.  Therefore, issues with
   regard to inconsistent configuration of the same redundant data
   across the network are avoided.  Also, as can be seen in this use
   case, the same system can act as a mount client for some objects
   while acting as a server for other objects.

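   As a sketch only, the YANG fragment below shows how both directions
   of mounting in this use case might look from the central
   application's point of view.  All module, node, and extension names
   are hypothetical; the mount mechanism itself is the subject of
   [draft-clemm-mount].

     module example-cloud-policer {
       namespace "urn:example:cloud-policer";
       prefix cp;

       // Hypothetical extension; an actual syntax is proposed in
       // [draft-clemm-mount].
       extension mountpoint {
         argument name;
       }

       list edge-router {
         key "name";
         leaf name { type string; }

         // Authoritative traffic counters remain on each edge router;
         // they are mounted here so the central application can sum
         // them without contacting each device itself.
         cp:mountpoint "tenant-traffic-counters";

         // Per-device policer rate calculated centrally.  This
         // subtree is authoritative here and is peer mounted in the
         // reverse direction by the edge router, which applies it as
         // if it were local configuration.
         container calculated-policer {
           leaf tenant { type string; }
           leaf rate   { type uint64; units "bits/second"; }
         }
       }
     }

   The rebalancing loop then reduces to reading the mounted counters,
   recomputing each calculated policer rate, and writing it back
   locally; distribution to the edge routers is handled by the mount
   infrastructure.
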
4.2.  DDoS Thresholding

   Another extension of the "Cloud Policer" application is the
   creation of additional action thresholds at bandwidth rates far
   greater than might be expected.  If these higher thresholds are
   hit, it is possible to connect DDoS scrubbers to ingress traffic.
   This can be done within seconds of a bandwidth spike.  This can
   also be done if non-bandwidth counters are available.  For example,
   if TCP flag counts are available, it is possible to look for
   changes in SYN/ACK ratios which might signal a different type of
   attack.  In all cases, when network counters indicate a return to
   normal traffic profiles, the DDoS scrubbers can be automatically
   disconnected.

   Benefits of only connecting a DDoS scrubber in the rare event an
   attack might be underway include:

   o  marking down traffic for an out-of-profile tenant so that a
      potential attack doesn't adversely impact others,

   o  applying DDoS scrubbing across many devices when an attack is
      detected in one,

   o  reducing DDoS scrubber CPU, power, and licensing requirements
      (the vast majority of the time, spikes are not occurring), and

   o  dynamic management and allocation of scarce platform resources
      (such as optimizing span port usage, or limiting IP-FIX
      reporting to levels where devices can do full flow detail
      exporting).

4.3.  Service Chain Classification, Load Balancing and Capacity
      Management

   Service Chains will dynamically change ingress classification
   filters and allocate paths from many ingress devices across shared
   resources.  This information needs to be updated in real time as
   available capacity is allocated or failures are discovered.  It is
   possible to simplify service chain configuration and dynamic
   topology maintenance by transparently updating remote cached
   topologies when an authoritative object is changed within a central
   repository.  For example, if the CPU in one VM spikes, you might
   want to recalculate and adjust many chained paths to relieve the
   pressure.  Or perhaps, after the recalculation, you want to spin up
   a new VM and then adjust chains when that capacity is on-line.

   A key value here is central calculation and transparent auto-
   distribution.  In other words, a change need only be made by an
   application in a single location, and the infrastructure will
   automatically synchronize it across any number of subscribing
   devices without application involvement.  In fact, the application
   need not even know how many devices are monitoring the object which
   has been changed.

   Beyond 1:n policy distribution, applications can step back from
   aspects of failure recovery.  What happens if a device is rebooting
   or simply misses a distribution of new information?  With peer
   mount there is no doubt as to where the authoritative information
   resides if things get out of sync.

   While this ability is certainly useful for dynamic service chain
   classification filtering and next-hop mapping, this use case has
   more general applicability.  With a distributed datastore, diverse
   applications and hosts can locally access a single device's current
   VM CPU and bandwidth values.
   They can do so without needing to explicitly query that remote
   machine.  Updates from a device would come either from a periodic
   push of statistics to a transparent cache at the subscribers, or
   via an unsolicited update which is only sent when these values
   exceed established norms.

5.  Requirements

   To achieve the objectives described above, the network needs to
   support a number of requirements.

5.1.  Application Simplification

   A major obstacle to network programmability is any requirement
   which forces applications to use abstractions more complicated than
   the developer cares to touch.  To simplify application development
   and reduce unnecessary code, the following needs must be met.

   Applications MUST be able to access a local datastore which
   includes objects whose authoritative source is located in a remote
   datastore hosted on a different server.

   Local datastores MUST be able to provide a hierarchical view
   assembled from objects whose authoritative sources may originate
   from potentially different and overlapping namespaces.

   Applications MUST be able to access all objects of a datastore
   without concern for where the actual object is located, i.e.,
   whether the authoritative copy of the object is hosted on the same
   system as the local datastore or whether it is hosted in a remote
   datastore.

   With two exceptions, a datastore's application-facing interfaces
   MUST make no differentiation as to whether individual objects
   exposed are authoritatively owned by the datastore or mounted from
   a remote datastore.  This includes NETCONF and RESTCONF as well as
   other, possibly proprietary, interfaces (such as CLI generated from
   corresponding YANG data models).  The two exceptions, where it is
   acceptable to make a distinction between an object authoritatively
   owned by the datastore and a remote object, are as follows:

   o  Object updates: editing, creation, and deletion.  E.g., via
      edit-config, conditions and constraints are assessed at the
      authoritative datastore when the update/create/delete is
      conducted.  Any conditions or constraints at remote client
      datastores are NOT assessed.

   o  Locks obtained at a client datastore: It is conceivable for the
      interface to distinguish between two lock modes: locking the
      entire subtree including remote data (in which case the
      datastore's mount client needs to explicitly obtain and release
      locks from mounted authoritative datastores), or locking only
      authoritatively owned data, excluding remote data from the lock.

   These exceptions should not be very problematic, as non-
   authoritative copies will typically be marked as read-only.  This
   does not violate the "no differentiation" principle between local
   and remote.

   When a change is made to an object, that change will be reflected
   in any datastore in which the object is included.  This means that
   a change made to the object through a remote datastore will affect
   the object in the authoritative datastore.  Likewise, changes to an
   object in the authoritative datastore will be reflected at any
   client datastores.

   The distributed datastore MUST be able to include objects from
   multiple remote datastores.  The same object may be included in
   multiple client datastores; in other words, an object's
   authoritative datastore MUST support multiple clients.

   The distributed datastore infrastructure MUST enable access to some
   subset of the same objects on different devices.  (This includes
   multiple controllers as well as multiple physical and virtual peer
   devices.)

   Applications SHOULD be able to extract a time-synchronized set of
   operational data from the datastore.  (In other words, the
   application asks for a subset of network state at time-stamp or
   time-range "X".  The datastore would then deliver time-synchronized
   snapshots of the network state per the request.  The datastore may
   work with NTP and operational counters to optimize the
   synchronization results of such a query.  It is understood that
   some types of data might be undergoing convergence conditions.)

   Authoritative datastores retain full ownership of "their" objects.
   This means that while remote datastores may access the data, any
   modifications to objects that are initiated at those remote
   datastores need to be authorized by the authoritative owner of the
   data.  Likewise, the authoritative owner of the data may make
   changes to objects, including modifications, additions, and
   deletions, without needing to first ask for permission from remote
   clients.

   Applications MUST be designed to deal with incomplete data if
   remote objects are not accessible, e.g., due to temporary
   connectivity issues preventing access to the authoritative source.
   (This is true for many protocols and programming languages.  Mount
   is unlikely to add anything new here, unless applications add extra
   error-handling routines for the case where there is no response
   from a remote system.)

5.2.  Caching Considerations

5.2.1.  Caching Overview

   Remote objects in a datastore can be accessed "on demand", when the
   application interacting with the datastore demands it.  In that
   case, a request made to the local datastore is forwarded to the
   remote system.  The response from the remote system, e.g., the
   retrieved data, is subsequently merged and collated with the other
   data to return a consolidated response to the invoking application.

   A downside of a datastore which is distributed across devices can
   be the latency induced when remote object acquisition is necessary.
   There are plenty of applications with requirements which simply
   cannot be served when such latency is introduced.  The good news is
   that the concept of caching lends itself well to distributed
   datastores.  It is possible to transparently store some types of
   objects locally even when the authoritative copy is remote.
   Instead of fetching data on demand when an application requests it,
   the application is simply provided with the local copy.  It is then
   up to the datastore infrastructure to keep selected replicated
   information in sync, e.g., by prefetching information, or by having
   the remote system publish updates which are then stored locally.

   This is not a new idea.  Caching and Content Delivery Networks
   (CDNs) have sped up read access for objects within the Internet for
   years.  This has enabled greater performance and scale for certain
   content.  Just as important, these technologies have been employed
   without end user applications being explicitly aware of their
   involvement.  Such concepts are applicable for scaling the
   performance of a distributed datastore.

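   As an informal sketch only (the grouping and all leaf names below
   are hypothetical and are not part of any defined model), per-mount-
   point caching behavior on a Mount Client might be described along
   the following lines:

     module example-mount-cache {
       namespace "urn:example:mount-cache";
       prefix mc;

       // Hypothetical description of how a Mount Client handles
       // caching for one mount point.  Illustrative only.
       grouping cache-policy {
         leaf mode {
           type enumeration {
             enum none;        // always fetch on demand from the server
             enum prefetch;    // Mount Client pre-populates the cache
             enum subscribed;  // Mount Server pushes updates into cache
           }
           default "none";
         }
         leaf allow-read-through {
           type boolean;
           default "true";
           description
             "Whether an application may force a fetch from the
              authoritative datastore, bypassing the cache.";
         }
       }
     }
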
   Where caching occurs, it MUST be possible for the Mount Client to
   store object copies of a remote data node or subtree in such a way
   that applications are unaware that any caching is occurring.
   However, the interface to a datastore MAY provide applications with
   a special mode/flag to allow them to force a read-through and
   perhaps even a write-through.

   Where caching occurs, system administration facilities SHOULD allow
   flushing either the entire cache, or the information associated
   with selected Mount Points.

5.2.2.  Pub/Sub of Object Updates

   When caching occurs, data can go stale.  Pub/Sub provides a
   mechanism where changes in an authoritative data node or subtree
   can be monitored.  If changes occur, these changes can be delivered
   to any subscribing datastores.  In this way, remote caches can be
   kept up-to-date, and remote applications that directly monitor the
   data can quickly receive notifications without continuous polling.

5.2.2.1.  General Pub/Sub Update Requirements

   A Mount Client SHOULD be able to take advantage of pub/sub
   capabilities offered by a Mount Server.  However, not every Mount
   Server offers such capabilities.

   A Mount Client SHOULD be able to fall back to retrieving objects
   "On Demand" and/or to pre-fetching objects by request.

   A Mount Server MAY support a pub/sub capability in which one or
   more remote clients subscribe to updates of a target data node /
   subtree, which are then automatically published by the Mount
   Server.

   One or more of the following pub/sub policies MUST be supported:

   o  On Demand (i.e., no pub/sub) - default

   o  Periodic (with a specified time interval)

   o  On change, immediately as the change occurs

   o  On change, at the end of fixed intervals if a change has
      occurred

   Further modifications are possible: e.g., on change, whether to
   publish only the object that has changed or the entire subtree that
   had been subscribed to.  (Effectively, the latter is aggregate
   replication at the tree level, not at the object level.)

   Pub/sub is applicable to other applications as well, not limited to
   peer mounting.  For example, a pub/sub capability can greatly
   facilitate monitoring, as applications no longer have to "poll" for
   data but can simply choose to subscribe to a stream of the most
   current data.  Accordingly, servers that offer pub/sub capabilities
   for their YANG datastores SHOULD NOT limit subscribers to Mount
   Clients, but allow other applications to subscribe as well.

   It MUST be possible for Applications to subscribe to Data Nodes /
   Subtrees so that, upon Mount Client receipt of subscribed
   information, it is immediately passed to the application.

   It MUST be possible for the Mount Client to subscribe to Data Nodes
   / Subtrees so that, upon Mount Client receipt of subscribed
   information, it is cached and thereby awaits local application
   requests.

   If there are no applications subscribing to a Data Node / Subtree,
   a server SHOULD cease to publish the corresponding data.

   It MUST be possible for a Subscription to include a timestamp
   indicating when the Subscription will expire.

   It MUST be possible to identify a specific time at which a Mount
   Binding will return the current value(s) of a mounted Data Node /
   Subtree.  (Such times can be in the very near future in order to
   support a snapshot of network state or counters across many
   devices.)

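   A configuration sketch of the subscription parameters above
   (binding type, period, anchor time, and expiration) might look as
   follows.  The module and node names are hypothetical and are not
   taken from [draft-clemm-mount] or any existing model:

     module example-mount-subscription {
       namespace "urn:example:mount-subscription";
       prefix ms;

       import ietf-yang-types { prefix yang; }

       // Hypothetical Mount Client view of the bindings it has
       // requested from Mount Servers.
       list mount-binding {
         key "id";
         leaf id { type uint32; }

         // Path to the Target Data Node on the Mount Server.
         leaf target-path { type string; }

         leaf type {
           type enumeration {
             enum on-demand;    // no pub/sub; fetch when asked
             enum periodic;     // Mount Server pushes at each interval
             enum unsolicited;  // Mount Server pushes on change
           }
           default "on-demand";
         }

         leaf period {
           when "../type = 'periodic'";
           type uint32;
           units "seconds";
         }

         // Time at which the Mount Server should sample and return
         // current values, allowing snapshots to be synchronized
         // across many devices.
         leaf anchor-time { type yang:date-and-time; }

         // Subscriptions do not remain active forever; the client
         // re-subscribes before this time if it still needs the data.
         leaf expires { type yang:date-and-time; }
       }
     }
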
   A publisher is not responsible for monitoring whether its
   subscribers are still active.  It MAY do so, but is not obliged to.
   Subscriptions upon a Target Data Node do not remain active forever
   but MUST be periodically renewed.  The reason for this is to avoid
   "waste", for example in cases when subscribers "die".  If a
   subscriber restarts, it is the subscriber's responsibility to check
   whether its subscriptions are still intact, and to resubscribe if
   needed.

   It MUST be possible for a Target Data Node to support 1:1 Mount
   Bindings to a single subscribed Mount Point.

   It MUST be possible for a Target Data Node to support 1:n Mount
   Bindings to many subscribed Mount Points.

5.2.2.2.  Periodic Pub/Sub Updates

   Especially with network-based counters or operational data, there
   need be no recurring request for the next instance of data; it is
   simply released to subscribers on schedule.

   It MUST be possible for a Periodic Mount Point to identify a
   specific time at which a Mount Target will return the current
   value(s) of a mounted Data Node / Subtree.  This allows
   calculations upon objects delivered from many Mount Bindings to
   local applications to be synchronized.

   It MUST be possible for a Periodic Mount Point to identify the
   desired start and stop timestamps for any replicated objects
   associated with a duration.  This allows the time periods of the
   source data for objects delivered from many Mount Bindings to local
   applications to be synchronized.

5.2.2.3.  Change-trigger Pub/Sub Updates

   For an Unsolicited Mount Point, if a data node or subtree changes,
   the Mount Target MUST provide updated objects to the Mount Client.

   For an Unsolicited Mount Point, if a data node or subtree changes,
   the Mount Target SHOULD be able to provide just the updated objects
   to the Mount Client.  Note: If there is a Mount Filter in place,
   then only the updated objects based on the filter will be
   delivered.  It is possible that a filter will result in no update
   needing to be sent.

   It SHOULD be possible to provide criteria, per Mount Binding, on
   the characteristics of changes to a Target Data Node's monitored
   objects before an update is sent to the subscribing system.
   (Effectively, this becomes a "threshold trigger" for change
   notification to remote caches.)

5.3.  Lifecycle of the Mount Topology

   Mount can drive a dynamic and richly interconnected mesh of peer-
   to-peer object relationships.  Each of these Mounts will be
   independently established by a Mount Client.

   It MUST be possible to bootstrap the Mount Client by providing the
   YANG paths to resources on the Mount Server.

   There SHOULD be the ability to add Mount Client bindings during
   run-time.

   A Mount Client MUST be able to create, delete, and time out Mount
   Bindings.

   A Mount Server maintaining a periodic or unsolicited Mount Binding
   MUST be able to inform the Mount Client of an intentional graceful
   disconnection of that binding.

   A Mount Client must be able to verify the existence of a periodic
   or unsolicited Mount Binding which has successfully been
   established on a Mount Server, and re-establish it if it has
   disappeared.

5.3.1.  Discovery and Creation of Mount Topology

   Application visibility into an ever-changing set of network objects
   is not trivial.
   While some applications can be easily configured to know the
   Devices and available Mount Points of interest, other applications
   will have to balance many aspects of dynamic device availability,
   capabilities, and interconnectedness.  For the most part,
   maintenance of these dynamic elements can be done on the YANG
   objects themselves without anything new needed for Peer Mount.
   Such technologies are covered in other standards initiatives
   [reference needed].  Therefore this draft does not delve deeply
   into the needs for auto-discovery of YANG objects which may be
   advertised.

   However, it will likely become interesting for a network element to
   limit the Data Subtrees which might be subscribed to for
   Unsolicited and Periodic updates.

   It SHOULD be possible for a Mount Server to advertise potential
   Target Data Nodes which can support unsolicited and periodic
   binding types.

5.3.2.  Restrictions on the Mount Topology

   Mount Clients MUST NOT create recursive Mount Bindings (i.e., a
   Mount Client should not load any object or subtree which it has
   already delivered to another system in the role of a Mount Server).
   Note: Objects mounted from a controller as part of orchestration
   are *not* considered the same objects as those which might be
   mounted back from a network device showing the actual running
   config.

5.4.  Mount Filter

   The Mount Server default MUST be to deliver the same Data Node /
   Subtree that would have been delivered via direct YANG access.

   It SHOULD be possible for a Mount Client to request something less
   than the full subtree of a target node.  This will be valuable when
   the number or size of objects under a Target Data Node is large.

5.5.  Transport

   Many secured transports are viable, assuming transport, data
   security, scale, and performance objectives are met.  NETCONF is
   recommended as a starting point.  Other transports may be proposed
   over time.  Additional study is needed to assess how aspects of
   locking will be supported in parallel with eventual consistency for
   different object writes.

   It MUST be possible to support NETCONF transport of subscribed
   Nodes and Subtrees.

   RESTCONF [RESTconf] must be examined as well, especially as its
   Section 1.2 studies a possible mix of locking.

5.6.  Security Considerations

   Many security mechanisms exist to protect read/write access for CLI
   and API on network devices.  To the degree possible, these
   mechanisms should transparently protect data reads and writes when
   performing a Peer Mount.  The text below starts with a subset of
   those requirements.  Additional ones should be added.

   The same mechanisms used to determine whether a remote host has
   access to a particular YANG Data Node or Subtree MUST be invoked to
   determine whether a Mount Client has access to that information.

   The same traditional transport-level security mechanisms used for
   YANG over a particular transport MUST be used for the delivery of
   objects from a Mount Server to a Mount Client.

   A Mount Server implementation MUST NOT change any credentials
   passed by the Mount Client system for any Mount Binding request.

   The Mount Server MUST deliver no more objects from a Data Node or
   Subtree than allowable based on the security credentials provided
   by the Mount Client.

   To ensure that maximum scale limits are respected, it MUST be
   possible for a Mount Server to limit the number of bindings and to
   impose transactional limits.

   It SHOULD be possible to prioritize which Mount Binding instances
   should be serviced first if there are CPU, bandwidth, or other
   capacity constraints.

5.7.  High Availability

   A key intent for Peer Mount is to allow access to an authoritative
   copy of an object for a particular domain.  Of course, system and
   software failures or scheduled upgrades might mean that the primary
   copy is not consistently accessible from a single device.  In
   addition, system failovers might mean that the authoritative copy
   is housed on a different device than the one where the binding was
   originally established.  Peer Mount architectures must be built to
   enable Mount Clients to transparently provide access to objects
   whose authoritative copy moves due to dynamic network
   reconfigurations.

   For selected objects, Mount Bindings SHOULD be allowed to Anycast
   or ECMP (Equal Cost Multi-Path) addresses so that a distributed
   Mount Server implementation can transparently provide (a)
   availability to Mount Clients during failure events, and (b) load
   balancing on behalf of Mount Clients.

   Where unsolicited or periodic bindings are allowed to Anycast
   addresses, the real-time state of Mount Server bindings MUST be
   coordinated across the set of Anycast-addressed devices.  In this
   way, the state of periodic and unsolicited Mount Bindings will not
   be lost during a failover.

   The Mount Client and Mount Server MUST either have a heartbeat
   mechanism OR use a connection-oriented transport to detect each
   other's failures.

   When a Mount Server detects the disappearance of a Mount Client, it
   SHOULD purge all the mount bindings from the failed Mount Client.

   When a failover occurs on the Mount Client side, the new instance
   of the Mount Client SHOULD re-establish the mount bindings with the
   Mount Server(s).

   When a failover occurs on the Mount Server side, the new owner of
   an unsolicited mount binding SHOULD send out the current state of
   the object to subscribed Mount Clients.

5.8.  Configuration

   At the Mount Client, it MUST be possible to configure everything
   needed for a Mount Binding such that the application needs no
   knowledge of it.  This includes a diverse list of elements, such as
   the YANG URI path to the remote subtree.

5.9.  Assurance and Monitoring

   API usage for YANG should be tracked via existing mechanisms.
   There is no intent to require more transaction tracking than would
   be provided normally.  However, there are additional requirements
   which should allow the state of existing and historical bindings to
   be provided.

   A Mount Client MUST be able to poll a Mount Server for the state of
   the unsolicited and periodic Mount Bindings maintained between the
   two devices.

   A Mount Server MUST be able to publish the set of unsolicited and
   periodic Mount Bindings which are currently established on or below
   any identified data node.

   A Mount Server MUST be able to publish the set of unsolicited and
   periodic Mount Bindings which are going to a specific Mount Client.

   A Mount Server MUST be able to publish the set of fulfilled Mount
   Bindings which are going to a specific Mount Client.

   A Mount Server MUST be able to publish a list of the Mount Binding
   transactions successfully completed.

   A Mount Server MUST be able to publish a list of the Mount Bindings
   which failed, along with the reasons that they failed.  These
   reasons might include:

   o  Improper security credentials provided for the Mount Client to
      access the target node

   o  Target node referenced does not exist

   o  Binding type requested not available for the target node

   o  Mount Server out of resources or resources not available

   o  Connection from client lost before binding complete

   A Mount Client MUST be able to publish a list of the Mount Binding
   transactions successfully completed.

   A Mount Client MUST be able to publish a list of the Mount Bindings
   which failed, along with the reasons that they failed.  These
   reasons might include:

   o  No response from Mount Server

   o  Connection could not be established with Mount Server

   o  Security credentials provided to Mount Server rejected

   o  Target node referenced does not exist

   o  Binding type requested not available for the target node

   o  Mount Server out of resources or resources not available

   o  Connection from client lost before binding complete

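   A sketch of how such binding state might be exposed for monitoring
   is shown below.  The structure and all names are hypothetical,
   intended only to illustrate that the requirements above map onto
   ordinary YANG operational data:

     module example-mount-binding-state {
       namespace "urn:example:mount-binding-state";
       prefix mbs;

       import ietf-yang-types { prefix yang; }

       // Hypothetical operational data published by a Mount Server
       // or a Mount Client.
       container mount-binding-state {
         config false;

         list binding {
           key "id";
           leaf id          { type uint32; }
           leaf target-path { type string; }  // Target Data Node
           leaf peer        { type string; }  // the other system
           leaf type {
             type enumeration {
               enum on-demand;
               enum periodic;
               enum unsolicited;
             }
           }
           leaf status {
             type enumeration {
               enum established;
               enum failed;
               enum torn-down;
             }
           }
           // Populated when status is 'failed'; values mirror the
           // reasons listed in this section.
           leaf failure-reason {
             type enumeration {
               enum improper-credentials;
               enum target-node-missing;
               enum binding-type-unavailable;
               enum out-of-resources;
               enum connection-lost;
               enum no-response;
             }
           }
           leaf last-change { type yang:date-and-time; }
         }
       }
     }
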
6.  IANA Considerations

   This document makes no request of IANA.

   Note to RFC Editor: this section may be removed on publication as
   an RFC.

7.  Acknowledgements

   We wish to acknowledge the helpful contributions, comments, and
   suggestions that were received from Dinkar Kunjikrishnan, Harish
   Gumaste, Rohit M., Shruthi V., Sudarshan Ganapathi, and Swaroop
   Shastri.

8.  References

8.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC3768]  Hinden, R., "Virtual Router Redundancy Protocol (VRRP)",
              RFC 3768, April 2004.

   [RFC4610]  Farinacci, D. and Y. Cai, "Anycast-RP Using Protocol
              Independent Multicast (PIM)", RFC 4610, August 2006.

   [RFC6020]  Bjorklund, M., "YANG - A Data Modeling Language for the
              Network Configuration Protocol (NETCONF)", RFC 6020,
              October 2010.

8.2.  Informative References

   [ICCP]     Martini, L., Ed., "Inter-Chassis Communication Protocol
              for L2VPN PE Redundancy", March 2014.

   [RESTconf] Bierman, A., Ed., "RESTCONF Protocol", July 2014.

   [draft-clemm-mount]
              Clemm, A., Ed., "Mounting YANG-Defined Information from
              Remote Datastores", September 2013.

   [rfc6020bis]
              Bjorklund, M., "YANG - A Data Modeling Language for the
              Network Configuration Protocol (NETCONF)", July 2014.

8.3.  URIs

   [1] http://thomaswdinsmore.com/2014/05/01/distributed-analytics-
       primer/

   [2] http://en.wikipedia.org/wiki/ACID

   [3] http://robertgreiner.com/2014/08/cap-theorem-revisited/

   [4] http://guide.couchdb.org/draft/consistency.html

Authors' Addresses

   Eric Voit
   Cisco Systems

   Email: evoit@cisco.com

   Alex Clemm
   Cisco Systems

   Email: alex@cisco.com

   Shashi Kumar Bansal
   Cisco Systems

   Email: shabansa@cisco.com

   Ambika Tripathy
   Cisco Systems

   Email: ambtripa@cisco.com

   Prabhakara Yellai
   Cisco Systems

   Email: pyellai@cisco.com