APPSAWG                                                         R. Alimi
Internet-Draft                                                    Google
Intended status: Informational                                 A. Rahman
Expires: December 12, 2013              InterDigital Communications, LLC
                                                             D. Kutscher
                                                                     NEC
                                                                 Y. Yang
                                                         Yale University
                                                                 H. Song
                                                          K. Pentikousis
                                                     Huawei Technologies
                                                           June 10, 2013


               DECADE: DECoupled Application Data Enroute
                        draft-alimi-protocol-01

Abstract

   Content distribution applications, such as those employing
   peer-to-peer (P2P) technologies, are widely used on the Internet and
   make up a large portion of the traffic in many networks.
   Often, however, content distribution applications use network
   resources in a counter-productive manner. One way to improve
   efficiency is to introduce storage capabilities within the network
   and enable cooperation between end-host and in-network content
   distribution mechanisms. This is the capability provided by a
   DECADE-compatible system, which is introduced in this document.
   DECADE enables applications to take advantage of in-network storage
   when distributing data objects as opposed to using solely end-to-end
   resources. This document presents the underlying principles and key
   functionalities of such a system and illustrates operation through a
   set of examples.

Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on December 12, 2013.

Copyright Notice

   Copyright (c) 2013 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.
   Please review these documents carefully, as they describe your
   rights and restrictions with respect to this document. Code
   Components extracted from this document must include Simplified BSD
   License text as described in Section 4.e of the Trust Legal
   Provisions and are provided without warranty as described in the
   Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Architectural Principles
     2.1.  Data and Control/Metadata Plane Decoupling
     2.2.  Immutable Data Objects
     2.3.  Data Object Identifiers
     2.4.  Explicit Control
     2.5.  Resource and Data Access Control through Delegation
   3.  System Components
     3.1.  Content Distribution Application
     3.2.  DECADE Client
     3.3.  DECADE Server
     3.4.  Data Sequencing and Naming
     3.5.  Token-based Authorization and Resource Control
     3.6.  Discovery
   4.  DECADE Protocol Design
     4.1.  Naming
     4.2.  Resource Protocol
     4.3.  Data Transfer
     4.4.  Server-to-Server Protocols
   5.  In-Network Storage Components Mapping to DECADE
   6.  Security Considerations
     6.1.  Threat: System Denial of Service Attacks
     6.2.  Threat: Authorization Mechanisms Compromised
     6.3.  Threat: Data Object Spoofing
   7.  IANA Considerations
   8.  Acknowledgments
   9.  Informative References
   Authors' Addresses

1.  Introduction

   Content distribution applications, such as peer-to-peer (P2P)
   applications, are widely used on the Internet to distribute data
   objects, and comprise a large portion of the traffic in many
   networks. Said applications can often introduce performance
   bottlenecks in otherwise well-provisioned networks. In some cases,
   operators are forced to invest substantially in infrastructure to
   accommodate the use of such applications. For instance, in many
   subscriber networks, it can be expensive to upgrade network equipment
   in the "last-mile", because it can involve replacing equipment and
   upgrading wiring and devices at individual homes, businesses,
   DSLAMs (Digital Subscriber Line Access Multiplexers) and CMTSs (Cable
   Modem Termination Systems) in remote locations. It may be more
   practical and economical to upgrade the core infrastructure, instead
   of the edge part of the network, as this involves fewer components
   that are shared by many subscribers. See [RFC6646] and [RFC6392] for
   a more complete discussion of the problem domain and general
   discussions of the capabilities envisioned for a DECADE system.

   This document presents mechanisms for providing in-network storage
   that can be integrated into content distribution applications. The
   primary focus is P2P-based content distribution, but DECADE may be
   useful to other applications with similar characteristics and
   requirements. The approach we adopt in this document is to define
   the core functionalities and protocol functions that are needed to
   support a DECADE system.
   This document provides illustrative examples so that implementers
   can understand the main concepts in DECADE, but it is generally
   assumed that readers are familiar with the terms and concepts used
   in [RFC6646] and [RFC6392].

   Figure 1 is a schematic of a simple DECADE system with two DECADE
   clients and two DECADE servers. As illustrated, a client uses the
   DECADE Resource Protocol (DRP) to convey to a server information
   related to access control and resource scheduling policies. DRP can
   also be used between servers for exchanging this type of information.
   A DECADE system employs standard data transfer (SDT) protocol(s) to
   transfer data objects to and from a server, as we will explain later.

                         Native Application
        .-------------.     Protocol(s)      .-------------.
        | Application | <------------------> | Application |
        |  End-Point  |                      |  End-Point  |
        |             |                      |             |
        | .--------.  |                      | .--------.  |
        | | DECADE |  |                      | | DECADE |  |
        | | Client |  |                      | | Client |  |
        | `--------'  |                      | `--------'  |
        `-------------'                      `-------------'
            |     ^                              |     ^
    DECADE  |     | Standard                     |     |
   Resource |     |   Data                   DRP |     | SDT
   Protocol |     | Transfer                     |     |
    (DRP)   |     |  (SDT)                       |     |
            |     |                              |     |
            |     |                              |     |
            |     |                              |     |
            |     |                              |     |
            |     |                              |     |
            |     |                              |     |
            v     V                              v     V
        .=============.         DRP          .=============.
        |   DECADE    | <------------------> |   DECADE    |
        |   Server    | <------------------> |   Server    |
        `============='         SDT          `============='

                       Figure 1: DECADE Overview

   With Figure 1 at hand, assume that Application End-Point B requests a
   data object from Application End-Point A. In this case, End-Point A
   will act as the sender and End-Point B as the receiver for said data
   object. Let S(A) denote the DECADE storage server to which A has
   access.
   Figure 2 illustrates the four steps involved in the request,
   starting with the initial contact between B and A during which the
   former requests a data object using their native application protocol
   (see Section 3.1). Next, A uses DRP to obtain a token corresponding
   to the data object that was requested by B. There may be several ways
   for A to obtain such a token, e.g., compute it locally or request one
   from its DECADE storage server, S(A); see Section 4.2.1 for more
   details. Once obtained, A then provides the token to B (again, using
   their native application protocol). Finally, B provides the received
   token to S(A) via DRP, and subsequently requests and downloads the
   data object via SDT.

                              .----------.
        2. Obtain   --------> |   S(A)   | <------
           Token  /           `----------'         \  4. Request and
          (DRP) /                                    \  Download Data
      Locally /                                        \  Object
    or From /                                            \  (DRP + SDT)
     S(A) v                 1. App Request                 v
    .-------------. <------------------------------ .-------------.
    | Application |                                 | Application |
    | End-Point A |                                 | End-Point B |
    `-------------' ------------------------------> `-------------'
                        3. App Response (token)

                 Figure 2: Download from Storage Server

2.  Architectural Principles

   This section presents the key principles followed by any DECADE
   system.

2.1.  Data and Control/Metadata Plane Decoupling

   A DECADE system aims to be application-independent and SHOULD support
   multiple content distribution applications. Typically, a complete
   content distribution application implements a set of control plane
   functions including content search, indexing and collection, access
   control, replication, request routing, and QoS scheduling.
   Implementers of different content distribution applications may have
   unique considerations when designing the control plane functions.
   For example, with respect to the metadata management scheme,
   traditional file systems provide a standard metadata abstraction: a
   recursive structure of directories to offer namespace management
   where each file is an opaque byte stream. Content distribution
   applications may use different metadata management schemes. For
   instance, one application might use a sequence of blocks (e.g., for
   file sharing), while another application might use a sequence of
   frames (with different sizes) indexed by time.

   With respect to resource scheduling algorithms, a major advantage of
   many successful P2P systems is their substantial expertise in
   achieving efficient utilization of peer resources. For instance,
   many streaming P2P systems include optimization algorithms for
   constructing overlay topologies that can support low-latency, high-
   bandwidth streaming. The research community as well as implementers
   of such systems continuously fine-tune existing algorithms and invent
   new ones. A DECADE system should be able to accommodate and benefit
   from all new developments.

   In short, given the diversity of control plane functions, a DECADE
   system should allow for as much flexibility as possible to the
   control plane to implement specific policies. This conforms to the
   end-to-end systems principle and allows innovation and satisfaction
   of specific performance goals. Decoupling the control plane from the
   data plane is not new, of course. For example, OpenFlow is an
   implementation of this principle for Internet routing, where the
   computation of the forwarding table and the application of the
   forwarding table are separated. The Google File System
   [GoogleFileSystem] applies the same principle to file system design
   by utilizing a Master to handle meta-data management and several
   Chunk servers to handle data plane functions (i.e., read and write of
   chunks of data).
   Finally, NFSv4.1's pNFS extension [RFC5661] also adheres to this
   principle.

2.2.  Immutable Data Objects

   A common property of bulk content to be broadly distributed is that
   it is immutable -- once content is generated, it is typically not
   modified. For example, once a movie has been edited and released for
   distribution it is very uncommon that the corresponding video frames
   and images need to be modified. The same applies to document
   distribution, such as RFCs, audio files, such as podcasts, and
   program patches. Focusing on immutable data can substantially
   simplify data plane design, since consistency requirements can be
   relaxed. It also simplifies data reuse and implementation of de-
   duplication.

   Depending on its specific requirements, an application may store
   immutable data objects in DECADE servers such that each data object
   is completely self-contained (e.g., a complete, independently
   decodable video segment). An application may also divide data into
   data objects that require application level assembly. Many content
   distribution applications divide bulk content into data objects for
   multiple reasons, including (a) fetching different data objects from
   different sources in parallel; and (b) faster recovery and
   verification as individual data objects might be recovered and
   verified. Typically, applications use a data object size larger than
   a single packet in order to reduce control overhead.

   A DECADE system SHOULD be agnostic to the nature of the data objects
   and SHOULD NOT specify a fixed size for them. A protocol
   specification based on this architecture MAY prescribe requirements
   on minimum and maximum sizes for compliant implementations.

   Note that immutable data objects can still be deleted.
   Applications can support modification of existing data stored at a
   DECADE server through a combination of storing new data objects and
   deleting existing data objects. For example, a meta-data management
   function of the control plane might associate a name with a sequence
   of immutable data objects. If one of the data objects is modified,
   the meta-data management function changes the mapping of the name to
   a new sequence of immutable data objects.

   Throughout this document, all data objects are assumed to be
   immutable.

2.3.  Data Object Identifiers

   A data object stored in a DECADE server SHALL be accessed by content
   consumers via a data object identifier. Each content consumer may be
   able to access more than one storage server. A data object that is
   replicated across different storage servers managed by a DECADE
   Storage Provider MAY be accessed through a single identifier. Since
   data objects are immutable, it SHALL be possible to support
   persistent identifiers for data objects.

   Data object identifiers SHOULD be created by content providers when
   uploading the corresponding objects to a DECADE server. The scheme
   for the assignment/derivation of the data object identifier to a data
   object depends on the data object naming scheme and is out of scope
   of this document. One possibility is to name data objects using
   hashes as described in [RFC6920]. Note that this document describes
   naming schemes on a semantic level only, but specific SDTs and DRPs
   will use specific representations.

   In particular, for some applications it is important that clients and
   servers are able to validate the name-object binding, i.e., by
   verifying that a received object really corresponds to the name
   (identifier) that was used for requesting it (or that was provided by
   a sender).
   Data object identifiers can support name-object binding validation
   by providing message digests or so-called self-certifying naming
   information -- if a specific application has this requirement.

   Different name-object binding validation mechanisms MAY be supported
   in a single DECADE system. Content distribution applications can
   decide what mechanism to use, or to not provide name-object
   validation (e.g., if authenticity and integrity can be ascertained by
   alternative means). We expect that applications may be able to
   construct unique names (with high probability) without requiring a
   registry or other forms of coordination. Names may be self-
   describing so that a receiving entity (i.e., the content consumer)
   understands, for example, which hash function to use for validating
   name-object binding.

   Some content distribution applications will derive the name of a data
   object from the hash over the data object, which is made possible by
   the fact that DECADE objects are immutable. But there may be other
   applications such as live streaming where object names will not be
   based on hashes but rather on an enumeration scheme. The naming
   scheme will also enable those applications to construct unique names.

   In order to enable the uniqueness, flexibility and self-describing
   properties, the naming scheme used in a DECADE system SHOULD provide
   a "type" field that indicates the name-object validation function
   type (for example, "sha-256") and the cryptographic data (such as an
   object hash) that corresponds to the type information. Moreover, the
   naming scheme MAY additionally provide application or publisher
   information.

   The specific format of the name (e.g., encoding, hash algorithms,
   etc.) is out of scope of this document.

2.4.  Explicit Control

   To support the functions of an application's control plane,
   applications SHOULD be able to keep track of and coordinate which
   data is stored at particular servers. Thus, in contrast with
   traditional caches, applications are given explicit control over the
   placement (selection of a DECADE server), deletion (or expiration
   policy), and access control for stored data objects. Consider
   deletion/expiration policy as a simple example. An application might
   require that a DECADE server stores data objects for a relatively
   short period of time (e.g., for live-streaming data). Another
   application might need to store data objects for a longer duration
   (e.g., for video-on-demand), and so on.

2.5.  Resource and Data Access Control through Delegation

   A DECADE system provides a shared infrastructure to be used by
   multiple content consumers and content providers spanning multiple
   content distribution applications. Thus, it needs to provide both
   resource and data access control, as discussed in the following
   subsections.

2.5.1.  Resource Allocation

   There are two primary interacting entities in a DECADE system.
   First, in-network storage providers coordinate DECADE server
   provisioning, including their total available resources; see
   Section 4.2.1. Second, applications coordinate data transfers
   amongst available DECADE servers and between servers and clients. A
   form of isolation is required to enable concurrently-running
   applications to each explicitly manage its own data objects and share
   of resources at the available servers. Therefore, a storage provider
   should delegate resource management on a DECADE server to content
   providers, enabling them to explicitly and independently manage their
   own share of resources on a server.

2.5.2.  User Delegation

   In-network storage providers will have the ability to explicitly
   manage the entities allowed to utilize the resources available on a
   DECADE server. This is needed for reasons such as capacity-planning
   and legal considerations in certain deployment scenarios. The DECADE
   server SHOULD grant a share of the resources to the DECADE client of
   a content provider or content consumer. The client can in turn share
   the granted resources amongst its (possibly) multiple applications.
   The share of resources granted by a server is called a User
   Delegation. As a simple example, a DECADE server operated by an ISP
   might be configured to grant each ISP subscriber 1.5 Mb/s of network
   capacity. The ISP subscriber might in turn divide this share of
   resources amongst a video streaming application and file-sharing
   application which are running concurrently.

3.  System Components

   As noted earlier, the primary focus of this document is the
   architectural principles and the system components that implement
   them. While specific system components might differ between
   implementations, this document details the major components and their
   overall roles in the architecture. To keep the scope narrow, we only
   discuss the primary components related to protocol development.
   Particular deployments will require additional components (e.g.,
   monitoring and accounting at a server), but they are intentionally
   omitted from this document.

3.1.  Content Distribution Application

   Content distribution applications have many functional components.
   For example, many P2P applications have components and algorithms to
   manage overlay topology, rate allocation, piece selection, and so on.
   In this document, we focus on the components directly engaged in a
   DECADE system. Figure 3 illustrates the components discussed in this
   section from the perspective of a single Application End-Point.

                                           Native Protocol(s)
                                   (with other Application End-Points)
                                   .--------------------->
                                   |
                                   |
   .----------------------------------------------------------------.
   | Application End-Point                                          |
   |  .-------------------.                 .-------------------.   |
   |  | Application-Layer |       ...       | App Data Assembly |   |
   |  |    Algorithms     |                 |    Sequencing     |   |
   |  `-------------------'                 `-------------------'   |
   |                                                                |
   |  .==========================================================.  |
   |  |                      DECADE Client                       |  |
   |  | .-------------------------. .--------------------------. |  |
   |  | |   Resource Controller   | |     Data Controller      | |  |
   |  | | .--------. .----------. | | .------------. .-------. | |  |
   |  | | |  Data  | | Resource | | | |    Data    | | Data  | | |  |
   |  | | | Access | | Sharing  | | | | Scheduling | | Index | | |  |
   |  | | | Policy | |  Policy  | | | |            | |       | | |  |
   |  | | `--------' `----------' | | `------------' `-------' | |  |
   |  | `-------------------------' `--------------------------' |  |
   |  |   |                                ^                     |  |
   |  `== | ============================== | ===================='  |
   `----- | ------------------------------ | -----------------------'
          |                                |
          | DECADE Resource Protocol (DRP) | Standard Data Transfer
          v                                V

           Figure 3: Application and DECADE Client Components

   A DECADE system is geared towards supporting applications that can
   distribute content using data objects. To accomplish this,
   applications can include a component responsible for creating the
   individual data objects before distribution and then re-assembling
   data objects at the content consumer. We call this component
   Application Data Assembly. In producing and assembling data objects,
   two important considerations are sequencing and naming. A DECADE
   system assumes that applications implement this functionality
   themselves. See Section 4.1 for further discussion.
   In addition to DECADE DRP/SDT, applications will most likely also
   support other, native application protocols (e.g., P2P control and
   data transfer protocols).

3.2.  DECADE Client

   The DECADE client provides the local support to an application, and
   can be implemented standalone, embedded into the application, or
   integrated in other entities such as network devices themselves. In
   general, applications may have different Resource Sharing Policies
   and Data Access Policies to control their resource and data in DECADE
   servers. These policies may be existing policies of applications or
   custom policies. The specific implementation is decided by the
   application.

   Recall that DECADE decouples the control and the data transfer of
   applications. A Data Scheduling component schedules data transfers
   according to network conditions, available servers, and/or available
   server resources. The Data Index indicates data available at remote
   servers. The Data Index (or a subset of it) can be advertised to
   other clients. A common use case for this is to provide the ability
   to locate data amongst distributed Application End-Points (i.e., a
   data search mechanism such as a Distributed Hash Table).

3.3.  DECADE Server

   Figure 4 illustrates the primary components of a DECADE server. Note
   that the description below does not assume a single-host or
   centralized implementation: a DECADE server is not necessarily a
   single physical machine but can also be implemented in a distributed
   manner on a cluster of machines.

      |  DECADE Resource  |  Standard Data Transfer
      |  Protocol (DRP)   |
      |                   |
   .= | ================= | ===========================.
   |  |                   v              DECADE Server |
   |  |      .----------------.                        |
   |  |----> | Access Control | <--------.             |
   |  |      `----------------'          |             |
   |  |              ^                   |             |
   |  |              |                   |             |
   |  |              v                   |             |
   |  |   .---------------------.        |             |
   |  `-> | Resource Scheduling | <------|             |
   |      `---------------------'        |             |
   |                 ^                   |             |
   |                 |                   |             |
   |                 v           .-----------------.   |
   |       .-----------------.   | User Delegation |   |
   |       |   Data Store    |   |   Management    |   |
   |       `-----------------'   `-----------------'   |
   `==================================================='

                   Figure 4: DECADE Server Components

   Provided sufficient authorization, a client SHALL be able to access
   its own data or other clients' data in a DECADE server. Clients MAY
   also authorize other clients to store data. If access is authorized
   by a client, the server SHOULD provide access. Applications may
   apply resource sharing policies or use a custom policy. DECADE
   Servers will then perform resource scheduling according to the
   resource sharing policies indicated by the client as well as any
   other previously configured User Delegations. Data from applications
   will be stored at a DECADE server. Data may be deleted from storage
   either explicitly or automatically (e.g., after a TTL expiration).

3.4.  Data Sequencing and Naming

   The DECADE naming scheme implies no sequencing or grouping of
   objects, even if this is done at the application layer. To
   illustrate these properties, this section presents several
   illustrative examples of use.

3.4.1.  Application with Fixed-Size Chunks

   Similar to the example in Section 3.1, consider an application in
   which each individual application-layer segment of data is called a
   "chunk" and has a name of the form: "CONTENT_ID:SEQUENCE_NUMBER".
   Furthermore, assume that the application's native protocol uses
   chunks of size 16 KB. Now, assume that this application wishes to
   store data in a DECADE server in data objects of size 64 KB. To
   accomplish this, it can map a sequence of 4 chunks into a single data
   object, as shown in Figure 5.
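
   As a non-normative illustration, the chunk-to-object mapping just
   described can be sketched as follows. The chunk and object sizes
   come from the example; the function names and the zero-based object
   index are assumptions made only for this sketch and are not defined
   by DECADE.

```python
from typing import List

CHUNK_SIZE = 16 * 1024    # native application chunk size (from the example)
OBJECT_SIZE = 64 * 1024   # DECADE data object size (from the example)
CHUNKS_PER_OBJECT = OBJECT_SIZE // CHUNK_SIZE  # 4 chunks per data object


def object_index_for_chunk(sequence_number: int) -> int:
    """Return the (hypothetical) index of the object holding a chunk."""
    return sequence_number // CHUNKS_PER_OBJECT


def chunk_names_in_object(content_id: str, object_index: int) -> List[str]:
    """List the CONTENT_ID:SEQUENCE_NUMBER names packed into one object."""
    first = object_index * CHUNKS_PER_OBJECT
    return [f"{content_id}:{n}" for n in range(first, first + CHUNKS_PER_OBJECT)]
```

   The index computed here is application-local bookkeeping only; the
   name under which the object is stored is still chosen by the
   application's naming scheme.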

                           Application Chunks
   .---------.---------.---------.---------.---------.---------.--------
   |         |         |         |         |         |         |
   | Chunk_0 | Chunk_1 | Chunk_2 | Chunk_3 | Chunk_4 | Chunk_5 | Chunk_6
   |         |         |         |         |         |         |
   `---------`---------`---------`---------`---------`---------`--------

                           DECADE Data Objects
   .---------------------------------------.----------------------------
   |                                       |
   |               Object_0                |          Object_1
   |                                       |
   `---------------------------------------`----------------------------

      Figure 5: Mapping Application Chunks to DECADE Data Objects

   In this example, the application maintains a logical mapping that is
   able to determine the name of a DECADE data object given the chunks
   contained within that data object. The name may be conveyed from
   either the original content provider, another End-Point with which
   the application is communicating, etc. As long as the data contained
   within each sequence of chunks is globally unique, the corresponding
   data objects have globally unique names.

3.4.2.  Application with Continuous Streaming Data

   Consider an application whose native protocol retrieves a continuous
   data stream (e.g., an MPEG2 stream) instead of downloading and
   redistributing chunks of data. Such an application could segment the
   continuous data stream to produce either fixed-sized or variable-
   sized data objects. Figure 6 depicts how a video streaming
   application might produce variable-sized data objects such that each
   data object contains 10 seconds of video data. As in the previous
   example, the application may maintain a mapping that is able to
   determine the name of a data object given the time offset of the
   video chunk.
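
   A minimal, non-normative sketch of such a mapping is shown below.
   It uses the 10-second segment duration and the object sizes from
   Figure 6; the function names and the table of start offsets are
   assumptions made for illustration, not part of DECADE.

```python
from bisect import bisect_right
from itertools import accumulate

SEGMENT_SECONDS = 10                      # each object holds 10 s of video
OBJECT_SIZES_KB = [400, 500, 300, 300]    # Object_0..Object_3 in Figure 6

# Offset (in KB) within the stream at which each object starts.
OBJECT_STARTS_KB = [0] + list(accumulate(OBJECT_SIZES_KB))[:-1]


def object_index_for_time(offset_seconds: float) -> int:
    """Map a playback time offset to the index of its data object."""
    return int(offset_seconds // SEGMENT_SECONDS)


def object_index_for_offset_kb(offset_kb: int) -> int:
    """Map a stream offset in KB to the variable-sized object containing it."""
    return bisect_right(OBJECT_STARTS_KB, offset_kb) - 1
```

   Because the objects are variable-sized, the lookup by stream offset
   needs the table of start offsets, whereas the lookup by time needs
   only the fixed segment duration.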

                       Application's Video Stream
   .--------------------------------------------------------------------
   |
   |
   |
   `--------------------------------------------------------------------
   ^              ^              ^              ^              ^
   |              |              |              |              |
   0 Seconds      10 Seconds     20 Seconds     30 Seconds     40 Seconds
   0 B            400 KB         900 KB         1200 KB        1500 KB

                           DECADE Data Objects
   .--------------.--------------.--------------.--------------.--------
   |              |              |              |              |
   |   Object_0   |   Object_1   |   Object_2   |   Object_3   |
   |   (400 KB)   |   (500 KB)   |   (300 KB)   |   (300 KB)   |
   `--------------`--------------`--------------`--------------`--------

   Figure 6: Mapping a Continuous Data Stream to DECADE Data Objects

3.5.  Token-based Authorization and Resource Control

   A key feature of a DECADE system is that an application endpoint can
   authorize other application endpoints to store or retrieve data
   objects from in-network storage. This is accomplished using an OAuth
   [RFC6749] based authorization scheme. A separate OAuth flow can be
   used for this purpose: a client authenticates with the application
   server or the P2P application peer, and requests the trusted by the
   client, and the token contains particular self-contained properties
   (see Section 4.2.1 for details). The client then uses the token when
   sending requests to the DECADE server. Upon receiving a token, the
   server validates the signature and the operation being performed.

   This is a simple scheme, but has some important advantages over an
   alternative approach, for example, in which a client explicitly
   manipulates an Access Control List (ACL) associated with each data
   object. In particular, it has the following advantages when applied
   to DECADE target applications. First, authorization policies are
   implemented within the application, thus it explicitly controls when
   tokens are generated and to whom they are distributed and for how
   long they will be valid.
Second, fine-grained access and resource control can be applied to data objects; see Section 4.2.1 for the list of restrictions that can be enforced with a token. Third, there is no messaging between a client and server to manipulate data object permissions. This particularly simplifies applications that share data objects with many dynamic peers and need to adjust the access control policies attached to data objects frequently. Finally, tokens can provide anonymous access, in which a server does not need to know the identity of each client that accesses it. This enables a client to send tokens to clients belonging to other storage providers and to allow them to read or write data objects from the storage of its own storage provider. In addition to clients applying access control policies to data objects, the server MAY be configured to apply additional policies based on user, object properties, geographic location, etc. A client might thus be denied access even though it possesses a valid token.

There are existing protocols (e.g., OAuth [RFC6749]) that implement similar referral mechanisms using tokens. A protocol specification for a DECADE system SHOULD endeavor to use existing mechanisms wherever possible.

3.6.  Discovery

A DECADE system SHOULD include a discovery mechanism through which clients locate an appropriate server. A discovery mechanism SHOULD allow a client to determine an IP address or some other identifier that can be resolved to locate the server for which the client will be authorized to generate tokens (via DRP). (The discovery mechanism might also result in an error if no such servers can be located.)
After discovering one or more servers, a client can distribute load and requests across them (subject to resource limitations and policies of the servers themselves) according to the policies of the Application End-Point in which it is embedded. The discovery mechanism outlined here does not provide the ability to locate arbitrary DECADE servers for which a client might obtain tokens from others. Doing so requires application-level knowledge, and it is assumed that this functionality is implemented in the content distribution application.

The particular protocol used for discovery is out of scope of this document, but any specification SHOULD re-use standard protocols wherever possible.

4.  DECADE Protocol Design

This section presents the DRP and the SDT protocol in terms of abstract protocol interactions that are intended to be mapped to specific protocols in an implementation. In general, the DRP/SDT functionality between a DECADE client and a server is very similar to the DRP/SDT functionality between two servers. Any differences are highlighted below. DRP is used by a DECADE client to configure the resources and authorization used to satisfy requests (reading, writing, and management operations concerning data objects) at a server. SDT will be used to transport data between a client and a server, as illustrated in Figure 1.

4.1.  Naming

A DECADE system SHOULD use [RFC6920] as the recommended and default naming scheme. Other naming schemes that meet the guidelines in Section 2.3 may alternatively be used. In order to provide a simple and generic interface, the DECADE server will be responsible only for storing and retrieving individual data objects.

The DECADE naming format SHOULD NOT attempt to replace any naming or sequencing of data objects already performed by an Application.
Instead, naming is intended to apply only to data objects referenced for DECADE-specific purposes. An application using a DECADE client may use a naming and sequencing scheme independent of DECADE names. The DECADE client SHOULD maintain a mapping from its own data objects and their names to the DECADE-specific data objects and names. Furthermore, the DECADE naming scheme implies no sequencing or grouping of objects, even if this is done at the application layer.

4.2.  Resource Protocol

DRP will provide configuration of access control and resource sharing policies on DECADE servers. A content distribution application, e.g., a live P2P streaming session, can have permission to manage data at several servers, for instance, servers belonging to different storage providers. DRP allows one instance of such an application, i.e., an Application End-Point, to apply access control and resource sharing policies on each of them.

On a single DECADE server, the following resources SHOULD be managed: a) communication resources in terms of bandwidth (upload/download) and also in terms of number of active clients (simultaneous connections); and b) storage resources.

4.2.1.  Access and Resource Control Token

In a DECADE system, the resource owner agent is always the same entity as, or co-located with, the authorization server. A separate OAuth 2.0 request and response flow is therefore used for the access and resource control token.

An OAuth request to access the data objects MUST include the following fields:

   response_type: REQUIRED. Value MUST be set to "token".

   client_id: the client_id indicates either the application that is using the DECADE service or the end user who is using the DECADE service from a DECADE storage service provider.
   DECADE storage service providers MUST provide the ID distribution and management function, which is out of the scope of this document.

   scope: data object names that are requested.

An OAuth response includes the following information:

   token_type: "Bearer"?

   expires_in: The lifetime in seconds of the access token.

   access_token: a token that denotes the following information:

      service URI: the server address or URI which is providing the service;

      Permitted operations (e.g., read, write) and objects (e.g., names of data objects that might be read or written);

      Priority: optional. If it is present, the value MUST be set to either "Urgent", "High", "Normal" or "Low".

      Bandwidth: the bandwidth given to the requested operation, expressed either as a weight value used in a weighted bandwidth sharing scheme or as an integer number of bits per second (bps);

      Amount: data size in number of bytes that might be read or written.

   token_signature: the signature of the access token.

The tokens SHOULD be generated by an entity trusted by both the DECADE client and the server at the request of a DECADE client. For example, this entity could be the client, a server trusted by the client, or another server managed by a storage provider and trusted by the client. It is important for a server to trust the entity generating the tokens since each token may incur a resource cost on the server when used. Likewise, it is important for a client to trust the entity generating the tokens since the tokens grant access to the data stored at the server.

Upon generating a token, a client can distribute it to another client (e.g., via their native application protocol). The receiving client can then connect to the server specified in the token and perform any operation permitted by the token. The token SHOULD be sent along with the operation.
The server SHOULD validate the token to identify the client that issued it and to determine whether the requested operation is permitted by the contents of the token. If the token is successfully validated, the server SHOULD apply the resource control policies indicated in the token while performing the operation.

Tokens SHOULD include a unique identifier to allow a server to detect when a token is used multiple times and reject the additional usage attempts. Since usage of a token incurs resource costs to a server (e.g., bandwidth and storage) and a Content Provider may have a limited budget (see Section 2.5), the Content Provider should be able to indicate whether a token may be used multiple times.

It SHOULD be possible to revoke tokens after they are generated. This could be accomplished by supplying the server the unique identifiers of the tokens which are to be revoked.

4.2.2.  Status Information

DRP SHOULD provide a status request service that clients can use to request status information about a server. Access to such status information SHOULD require client authorization; that is, clients need to be authorized to access the requested status information. This authorization is based on the user delegation concept as described in Section 2.5. The following status information elements SHOULD be obtained: a) list of associated data objects (with properties); and b) resources used/available. In addition, the following information elements MAY be available: c) list of servers to which data objects have been distributed (in a certain time-frame); and d) list of clients to which data objects have been distributed (in a certain time-frame).
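One way to picture the status elements listed above is as a single record in which a) and b) are always present and c) and d) are optional. The sketch below is purely illustrative: the `StatusReport` type, its field names, the choice of storage bytes as the reported resource, and the `status_for` helper are all assumptions, since this document does not define a concrete DRP message format.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class StatusReport:
    """Hypothetical container for the status elements a) through d)."""

    # a) associated data objects, keyed by name, with their properties
    data_objects: Dict[str, Dict[str, str]]
    # b) resources used/available (storage in bytes, as an illustration)
    storage_used_bytes: int
    storage_available_bytes: int
    # c) and d) are optional; None means the server does not report them
    servers_distributed_to: Optional[List[str]] = None
    clients_distributed_to: Optional[List[str]] = None
    # time frame, in seconds, covered by c) and d)
    time_frame_seconds: Optional[int] = None


def status_for(authorized: bool, report: StatusReport) -> StatusReport:
    """Release the report only to a client that has been authorized."""
    if not authorized:
        raise PermissionError("client is not authorized for status information")
    return report
```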

For the list of servers/clients to which data objects have been distributed, the server SHOULD be able to decide on time bounds for which this information is stored and specify the corresponding time frame in the response to such requests. Some of this information may be used for accounting purposes, e.g., the list of clients to which data objects have been distributed.

Access information MAY be provided for accounting purposes, for example, when content providers are interested in access statistics for resources and/or to perform accounting per user. Again, access to such information requires client authorization and SHOULD be based on the delegation concept as described in Section 2.5. The following types of access information elements MAY be requested: a) which data objects have been accessed, by whom, and how many times; and b) access tokens that a server has seen for a given data object.

The server SHOULD decide on time bounds for which this information is stored and specify the corresponding time frame in the response to such requests.

4.2.3.  Data Object Attributes

Data Objects that are stored on a DECADE server SHOULD have associated attributes (in addition to the object identifier and data object) that relate to the data storage and its management. These attributes may be used by the server (and possibly the underlying storage system) to perform specialized processing or handling for the data object, or to attach related server or storage-layer properties to the data object. These attributes have a scope local to a server. In particular, these attributes SHOULD NOT be applied to a server or client to which a data object is copied.

Depending on authorization, clients SHOULD be permitted to get or set such attributes. This authorization is based on the delegation concept as per Section 2.5.
DECADE does not limit the set of permissible attributes, but rather specifies a set of baseline attributes that SHOULD be supported:

   Expiration Time: Time at which the data object can be deleted;

   Data Object size: In bytes;

   Media type: Labelling of type as per [RFC6838];

   Access statistics: How often the data object has been accessed (and what tokens have been used).

The data object attributes defined here are distinct from application metadata (see Section 2.1). Application metadata is custom information that an application might wish to associate with a data object to understand its semantic meaning (e.g., whether it is video and/or audio, its playback length in time, or its index in a stream). If an application wishes to store such metadata persistently, it can be stored within data objects themselves.

4.3.  Data Transfer

A DECADE server will provide a data access interface, and SDT will be used to write data objects to a server and to read (download) data objects from a server. Semantically, SDT is a client-server protocol; that is, the server always responds to client requests.

To write a data object, a client first generates the object's name (see Section 4.1), and then uploads the object to a server and supplies the generated name. The name can be used to access (download) the object later; for example, the client can pass the name as a reference to other clients that can then refer to the object. Data objects can be self-contained objects such as multimedia resources, files, etc., but also chunks, such as chunks of a P2P distribution protocol that can be part of a containing object or a stream. If supported, a server can verify the integrity and other security properties of uploaded objects.

A client can request named data objects from a server.
In a corresponding request message, a client specifies the object name and a suitable access and resource control token. The server checks the validity of the received token and its associated resource usage-related properties. If the named data object exists on the server and the token can be validated, the server delivers the requested object in a response message. If the data object cannot be delivered, the server provides corresponding status/reason information in a response message. Specifics regarding error handling, including additional error conditions (e.g., overload), precedence for returned errors, and their relation to server policy, are deferred to an eventual protocol specification.

4.4.  Server-to-Server Protocols

An important feature of a DECADE system is the capability for one server to directly download data objects from another server. This capability allows applications to directly replicate data objects between servers without requiring end-hosts to use uplink capacity to upload data objects to a different server.

DRP and SDT SHOULD support operations directly between servers. Servers are neither assumed to trust each other nor configured to do so. All data operations are performed on behalf of clients via explicit instruction. However, the objects being processed do not necessarily have to originate or terminate at the client (i.e., the data object might be limited to being exchanged between servers even if the instruction is triggered by the client). Clients thus will be able to indicate to a server which remote server(s) to access, what operation is to be performed, the content provider at the remote server from which to retrieve the data object, or in which the object is to be stored, and the credentials indicating access and resource control to perform the operation at the remote server.
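The parameters a client hands to its server for such an operation can be summarized in a small record. The following is a sketch under stated assumptions, not a wire format: the `RemoteInstruction` and `Operation` names are hypothetical, the explicit `object_name` field is an assumption (the text implies it but does not list it), and the token is treated as an opaque string.

```python
from dataclasses import dataclass
from enum import Enum


class Operation(Enum):
    RETRIEVE = "retrieve"  # fetch the object from the remote server
    STORE = "store"        # store the object to the remote server


@dataclass(frozen=True)
class RemoteInstruction:
    """Hypothetical parameters of a client-triggered server-to-server request."""

    remote_server: str     # which remote server to access
    operation: Operation   # what operation is to be performed
    content_provider: str  # content provider at the remote server
    object_name: str       # name of the data object (assumed field)
    token: str             # credentials for access and resource control

    def __post_init__(self) -> None:
        # Reject instructions that leave a required parameter empty.
        for field_name in ("remote_server", "content_provider",
                           "object_name", "token"):
            if not getattr(self, field_name):
                raise ValueError(f"{field_name} must not be empty")
```

Making the record immutable reflects that the server acts only on the explicit instruction it received; it has no reason to alter the parameters afterwards.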

Server-to-server support is focused on reading and writing data objects between servers. The data object referred to at the remote server is the same as the original data object requested by the client. Object attributes (see Section 4.2.3) might also be specified in the request to the remote server. In this way, a server acts as a proxy for a client, and a client can instantiate requests via that proxy. The operations will be performed as if the original requester had its own client co-located with the server. When a client sends a request to a server with these additional parameters, it is giving the server permission to act (proxy) on its behalf. Thus, it would be prudent for the supplied token to have narrow privileges (e.g., limited to only the necessary data objects) or validity time (e.g., a small expiration time).

In the case of a retrieval operation, the server is to retrieve the data object from the remote server using the specified credentials, and then optionally return the object to a client. In the case of a storage operation, the server is to store the object to the remote server using the specified credentials. The object might optionally be uploaded from the client or might already exist at the server.

5.  In-Network Storage Components Mapping to DECADE

This section evaluates how the basic components of an in-network storage system (see Section 3 of [RFC6392]) map into a DECADE system.

With respect to Data Access Interface, DECADE clients can read and write objects of arbitrary size through the client's Data Controller, making use of standard data transfer (SDT). With respect to Data Management Operations, clients can move or delete previously stored objects via the client's Data Controller, making use of SDT.
Clients can enumerate or search contents of servers to find objects matching desired criteria through services provided by the Content Distribution Application (e.g., buffer-map exchanges, a DHT, or peer-exchange). In doing so, Application End-Points might consult their local Data Index in the client's Data Controller (Data Search Capability).

With respect to Access Control Authorization, all methods of access control are supported: public-unrestricted, public-restricted and private. Access Control Policies are generated by a content distribution application and provided to the client's Resource Controller. The server is responsible for implementing the access control checks. Clients can manage the resources (e.g., bandwidth) on the DECADE server that can be used by other Application End-Points (Resource Control Interface). Resource Sharing Policies are generated by a content distribution application and provided to the client's Resource Controller. The server is responsible for implementing the resource sharing policies.

Although the particular protocol used for discovery is outside the scope of this document, different options and considerations have been discussed in Section 3.6. Finally, with respect to the storage mode, DECADE servers provide an object-based storage mode. Immutable data objects might be stored at a server. Applications might consider existing blocks as data objects, or they might adjust block sizes before storing in a server.

6.  Security Considerations

In general, the security considerations mentioned in [RFC6646] apply to this document as well. A DECADE system provides a distributed storage service for content distribution and similar applications. The system consists of servers and clients that use these servers to upload data objects, to request distribution of data objects, and to download data objects.
Such a system is employed in an overall application context (for example, in a P2P application), and it is expected that DECADE clients take part in application-specific communication sessions. The security considerations here focus on threats related to the DECADE system and its communication services, i.e., the DRP/SDT protocols that have been described in an abstract fashion in this document.

6.1.  Threat: System Denial of Service Attacks

A DECADE network might be used to distribute data objects from one client to a set of servers using the server-to-server communication feature that a client can request when uploading an object; see Section 4.4. Multiple clients uploading many objects at different servers at the same time and requesting server-to-server distribution for them could thus mount massive distributed denial of service (DDoS) attacks, overloading a network of servers. This threat is addressed by the server's access control and resource control framework. Servers can require Application End-Points to be authorized to store and to download objects, and Application End-Points can delegate authorization to other Application End-Points using the token mechanism. The effective security of this approach depends on the strength of the token mechanism. See below for a discussion of this and related communication security threats.

Denial of Service Attacks against a single server (directing many requests to that server) might still lead to considerable load for processing requests and invalidating tokens. SDT therefore MUST provide a redirection mechanism.

6.2.  Threat: Authorization Mechanisms Compromised

A DECADE system does not require Application End-Points to authenticate in order to access a server for downloading objects, since authorization is not based on End-Point or user identities but on a delegation-based authorization mechanism. Hence, most protocol security threats are related to the authorization scheme. The security of the authorization scheme depends on the strength of the token mechanism and on the secrecy of the tokens. A token can represent authorization to store a certain amount of data, to download certain objects, to download a certain amount of data per unit of time, etc. If it is possible for an attacker to guess, construct or simply obtain tokens, the integrity of the data maintained by the servers is compromised.

This is a general security threat that applies to authorization delegation schemes. Specifications of existing delegation schemes such as [RFC6749] discuss these general threats in detail. The DRP has to specify appropriate algorithms for token generation. Moreover, authorization tokens should have a limited validity period that should be specified by the application. Token confidentiality should be provided by application protocols that carry tokens, and the SDT and DRP should provide secure (confidential) communication modes.

6.3.  Threat: Data Object Spoofing

In a DECADE system, an Application End-Point refers other Application End-Points to servers to download specified data objects. An attacker could "inject" a faked version of the object into this process, so that the downloading End-Point effectively receives a different object (compared to what the uploading End-Point provided).
As a result, the downloading End-Point believes that it has received an object that corresponds to the name it was provided earlier, whereas in fact it is a faked object. Corresponding attacks could be mounted against the application protocol (that is used for referring other End-Points to servers), servers themselves (and their storage sub-systems), and the SDT by which the object is uploaded, distributed and downloaded.

A DECADE system's fundamental mechanism against object spoofing is name-object binding validation, i.e., the ability of a receiver to check whether the name it was provided, and that it used to request an object, actually corresponds to the bits it received. As described above, this allows for different forms of name-object binding, for example using hashes of data objects, with different hash functions (different algorithms, different digest lengths). For those application scenarios where hashes of data objects are not applicable (for example, live streaming), other forms of name-object binding can be used (see Section 4.1). This flexibility also addresses cryptographic algorithm evolution: hash functions might get deprecated, better alternatives might be invented, etc., so that applications can choose appropriate mechanisms meeting their security requirements.

DECADE servers MAY perform name-object binding validation on stored objects, but Application End-Points MUST NOT rely on that. In other words, Application End-Points SHOULD perform name-object binding validation on received objects.

7.  IANA Considerations

This document does not have any IANA considerations.

8.  Acknowledgments

We thank the following people for their contributions to and/or detailed reviews of this or earlier versions of this document: Carsten Bormann, David Bryan, Dave Crocker, Yingjie Gu, David Harrington, Hongqiang (Harry) Liu, David McDysan, Borje Ohlman, Martin Stiemerling, Richard Woundy, and Ning Zong.

9.  Informative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC5661]  Shepler, S., Eisler, M., and D. Noveck, "Network File System (NFS) Version 4 Minor Version 1 Protocol", RFC 5661, January 2010.

   [RFC6392]  Alimi, R., Rahman, A., and Y. Yang, "A Survey of In-Network Storage Systems", RFC 6392, October 2011.

   [RFC6646]  Song, H., Zong, N., Yang, Y., and R. Alimi, "DECoupled Application Data Enroute (DECADE) Problem Statement", RFC 6646, July 2012.

   [RFC6749]  Hardt, D., "The OAuth 2.0 Authorization Framework", RFC 6749, October 2012.

   [RFC6838]  Freed, N., Klensin, J., and T. Hansen, "Media Type Specifications and Registration Procedures", BCP 13, RFC 6838, January 2013.

   [RFC6920]  Farrell, S., Kutscher, D., Dannewitz, C., Ohlman, B., Keranen, A., and P. Hallam-Baker, "Naming Things with Hashes", RFC 6920, April 2013.

   [GoogleFileSystem]  Ghemawat, S., Gobioff, H., and S. Leung, "The Google File System", SOSP 2003, October 2003.

Authors' Addresses

   Richard Alimi
   Google

   Email: ralimi@google.com

   Akbar Rahman
   InterDigital Communications, LLC

   Email: akbar.rahman@interdigital.com

   Dirk Kutscher
   NEC

   Email: dirk.kutscher@neclab.eu

   Y. Richard Yang
   Yale University

   Email: yry@cs.yale.edu

   Haibin Song
   Huawei Technologies

   Email: haibin.song@huawei.com

   Kostas Pentikousis
   Huawei Technologies

   Email: k.pentikousis@huawei.com