DECADE                                                          R. Alimi
Internet-Draft                                                    Google
Intended status: Informational                                   Y. Yang
Expires: January 12, 2012                                Yale University
                                                               A. Rahman
                                        InterDigital Communications, LLC
                                                             D. Kutscher
                                                                     NEC
                                                                  H. Liu
                                                         Yale University
                                                           July 11, 2011


                          DECADE Architecture
                       draft-ietf-decade-arch-02

Abstract

   Content Distribution Applications (e.g., P2P applications) are widely
   used on the Internet and make up a large portion of the traffic in
   many networks. One technique to improve the network efficiency of
   these applications is to introduce storage capabilities within the
   networks. This document presents an architecture, discusses the
   underlying principles, and identifies core components and protocols
   for supporting in-network storage functionality for these
   applications.

Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on January 12, 2012.

Copyright Notice

   Copyright (c) 2011 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the BSD License.

Table of Contents

   1.  Introduction
   2.  Functional Entities
     2.1.  DECADE Server
     2.2.  DECADE Client
     2.3.  DECADE Storage Provider
     2.4.  DECADE Content Provider
     2.5.  DECADE Content Consumer
     2.6.  Content Distribution Application
       2.6.1.  Application End-Point
   3.  Protocol Flow
     3.1.  Overview
     3.2.  An Example
   4.  Architectural Principles
     4.1.  Decoupled Control/Metadata and Data Planes
     4.2.  Immutable Data Objects
     4.3.  Data Object Identifiers
     4.4.  Explicit Control
     4.5.  Resource and Data Access Control through User Delegation
       4.5.1.  Resource Allocation
       4.5.2.  User Delegations
   5.  System Components
     5.1.  Content Distribution Application
       5.1.1.  Data Assembly
       5.1.2.  Native Protocols
       5.1.3.  DECADE Client
     5.2.  DECADE Server
       5.2.1.  Access Control
       5.2.2.  Resource Scheduling
       5.2.3.  Data Store
     5.3.  Data Sequencing and Naming
       5.3.1.  DECADE Data Object Naming Scheme
       5.3.2.  Application Usage
       5.3.3.  Application Usage Example
     5.4.  Token-based Authentication and Resource Control
     5.5.  Discovery
   6.  DECADE Protocols
     6.1.  DECADE Resource Protocol (DRP)
       6.1.1.  Controlled Resources
       6.1.2.  Access and Resource Control Token
       6.1.3.  Status Information
       6.1.4.  Object Properties
     6.2.  Standard Data Transport (SDT)
       6.2.1.  Writing/Uploading Objects
       6.2.2.  Downloading Objects
   7.  Server-to-Server Protocols
     7.1.  Operational Overview
   8.  Potential Optimizations
     8.1.  Pipelining to Avoid Store-and-Forward Delays
     8.2.  Deduplication
       8.2.1.  Traffic Deduplication
       8.2.2.  Cross-Server Storage Deduplication
   9.  Security Considerations
   10. IANA Considerations
   11. References
     11.1.  Normative References
     11.2.  Informative References
   Appendix A.  Appendix: Evaluation of Some Candidate Existing
                Protocols for DECADE DRP and SDT
     A.1.  HTTP
       A.1.1.  HTTP Support for DECADE Resource Protocol Primitives
       A.1.2.  HTTP Support for DECADE Standard Data Transport
               Protocol Primitives
       A.1.3.  Traffic De-duplication Primitives
       A.1.4.  Other Operations
       A.1.5.  Conclusions
     A.2.  WEBDAV
       A.2.1.  WEBDAV Support for DECADE Resource Protocol Primitives
       A.2.2.  WebDAV Support for DECADE Standard Transport Protocol
               Primitives
       A.2.3.  Other Operations
       A.2.4.  Conclusions
   Appendix B.  In-Network Storage Components Mapped to DECADE
                Architecture
     B.1.  Data Access Interface
     B.2.  Data Management Operations
     B.3.  Data Search Capability
     B.4.  Access Control Authorization
     B.5.  Resource Control Interface
     B.6.  Discovery Mechanism
     B.7.  Storage Mode
   Authors' Addresses

1. Introduction

   Content Distribution Applications are widely used on the Internet
   today to distribute data, and they contribute a large portion of the
   traffic in many networks. The DECADE architecture described in this
   document enables such applications to leverage in-network storage to
   achieve more efficient content distribution. Specifically, in many
   subscriber networks, it can be expensive to upgrade network equipment
   in the "last-mile", because it can involve replacing equipment and
   upgrading wiring at individual homes, businesses, and devices such as
   DSLAMs (Digital Subscriber Line Access Multiplexers) and CMTSs (Cable
   Modem Termination Systems) in remote locations. Therefore, it can be
   cheaper to upgrade the core infrastructure, which involves fewer
   components that are shared by many subscribers. See
   [I-D.ietf-decade-problem-statement] for a more complete discussion of
   the problem domain and general discussions of the capabilities to be
   provided by DECADE.

   This document presents an architecture for providing in-network
   storage that can be integrated into Content Distribution
   Applications. The primary focus is P2P-based content distribution,
   but the architecture may be useful to other applications with similar
   characteristics and requirements. See [I-D.ietf-decade-reqs] for a
   definition of the target applications supported by DECADE.

   The design philosophy of the DECADE architecture is to provide only
   the core functionalities that are needed for applications to make use
   of in-network storage.
   With such core functionalities, the protocol may be simpler and
   easier for storage providers to support. If more complex
   functionalities are needed by a certain application or class of
   applications, they may be layered on top of the DECADE protocol.

   The DECADE protocol will leverage existing data transport and
   application layer protocols. The design is to work with a small set
   of alternative IETF protocols. In this document, we use "data
   transport" to refer to a protocol that is used to read data from and
   write data into DECADE in-network storage.

   This document proceeds in two steps. First, it details the core
   architectural principles that we use to guide the DECADE design.
   Next, given these core principles, this document presents the core
   components of the DECADE architecture and identifies the usage of
   existing protocols and where there is a need for new protocol
   development.

2. Functional Entities

   This section defines the functional entities involved in a DECADE
   system. Functional entities can be classified as follows:

   o  A physical or logical component in the DECADE architecture: DECADE
      Client, DECADE Server, Content Distribution Application, and
      Application End-Point;

   o  Operator of a physical or logical component in the DECADE
      architecture: DECADE Storage Provider; and

   o  Source or sink of content distributed via the DECADE architecture:
      DECADE Content Provider and DECADE Content Consumer.

2.1. DECADE Server

   A DECADE server stores DECADE data inside the network, and thereafter
   manages both the stored data and access to that data. To reinforce
   that these servers are responsible for storage of raw data, this
   document also refers to them as storage servers.

2.2. DECADE Client

   A DECADE client stores and retrieves data at DECADE Servers.

2.3. DECADE Storage Provider

   A DECADE storage provider deploys and/or manages DECADE storage
   server(s) within a network. A storage provider may also own or
   manage the network in which the DECADE servers are deployed, but this
   is not mandatory.

   A DECADE storage provider, possibly in cooperation with one or more
   network providers, determines deployment locations for DECADE servers
   and determines the available resources for each.

2.4. DECADE Content Provider

   A DECADE content provider accesses DECADE storage servers (by way of
   a DECADE client) to upload and manage data. A content provider can
   access one or more storage servers. A content provider may be a
   single process or a distributed application (e.g., in a P2P
   scenario), and may either be fixed or mobile.

2.5. DECADE Content Consumer

   A DECADE content consumer accesses storage servers (by way of a
   DECADE client) to download data that has previously been stored by a
   DECADE content provider. A content consumer can access one or more
   storage servers. A content consumer may be a single process or a
   distributed application (e.g., in a P2P scenario), and may either be
   fixed or mobile. An instance of a distributed application, such as a
   P2P application, may both provide content to and consume content from
   DECADE storage servers.

2.6. Content Distribution Application

   A content distribution application (as a target application for
   DECADE as described in [I-D.ietf-decade-reqs]) is a distributed
   application designed for dissemination of a possibly large data set
   to multiple consumers. Content Distribution Applications typically
   divide content into smaller blocks for dissemination.

   The term Application Developer refers to the developer of a
   particular Content Distribution Application.

2.6.1. Application End-Point

   An Application End-Point is an instance of a Content Distribution
   Application that makes use of DECADE server(s). A particular
   Application End-Point may be a DECADE Content Provider, a DECADE
   Content Consumer, or both. For example, an Application End-Point may
   be an instance of a video streaming client, or it may be the source
   providing the video to a set of clients.

   An Application End-Point need not be actively transferring data with
   other Application End-Points to interact with the DECADE storage
   system. That is, an End-Point may interact with the DECADE storage
   servers as an offline activity.

3. Protocol Flow

3.1. Overview

   The DECADE Architecture uses two protocols, as shown in Figure 1.
   First, the DECADE Resource Protocol (DRP) is responsible for
   communication of access control and resource scheduling policies from
   DECADE Client to DECADE Server, as well as between DECADE Servers.
   The DECADE Architecture includes exactly one DRP for interoperability
   and a common format through which these policies can be communicated.

                          Native Application
         .-------------.     Protocol(s)      .-------------.
         | Application | <------------------> | Application |
         |  End-Point  |                      |  End-Point  |
         |             |                      |             |
         | .--------.  |                      | .--------.  |
         | | DECADE |  |                      | | DECADE |  |
         | | Client |  |                      | | Client |  |
         | `--------'  |                      | `--------'  |
         `-------------'                      `-------------'
             |     ^                              |     ^
     DECADE  |     | Standard                     |     |
    Resource |     | Data                     DRP |     | SDT
    Protocol |     | Transport                    |     |
      (DRP)  |     | (SDT)                        |     |
             |     |                              |     |
             |     |                              |     |
             |     |                              |     |
             |     |                              |     |
             |     |                              |     |
             |     |                              |     |
             v     V                              v     V
         .=============.         DRP          .=============.
         |   DECADE    | <------------------> |   DECADE    |
         |   Server    | <------------------> |   Server    |
         `============='         SDT          `============='

                     Figure 1: Generic Protocol Flow

   Second, Standard Data Transport (SDT) protocols (e.g., WebDAV, NFS,
   or HTTP(S)) are used to transfer data objects to and from a DECADE
   Server. The DECADE architecture may be used with multiple standard
   data transports.

   Decoupling the protocols in this way allows DECADE to directly
   utilize existing standard data transports, as well as allowing both
   DECADE and DRP to evolve independently from data transports.

   It is also important to note that the two protocols do not need to be
   separate on the wire. For example, DRP messages may be piggybacked
   within some extension fields provided by certain data transport
   protocols. In such a scenario, DRP is technically a data structure
   (transported by other protocols), but it can still be considered as a
   logical protocol that provides the services of configuring DECADE
   resource usage. Hence, this document considers SDT and DRP as two
   separate, logical functional components for clarity.

3.2. An Example

   Before discussing details of the architecture, this section provides
   an example data transfer scenario to illustrate how the DECADE
   Architecture can be applied.

   In this example, we assume that Application End-Point B (the
   receiver) is requesting a data object from Application End-Point A
   (the sender). Let S(A) denote A's DECADE storage server. There are
   multiple usage scenarios (by choice of the Content Distribution
   Application). For simplicity of introduction, we design the example
   to use only a single DECADE Server; Section 7 details a case when
   both A and B wish to employ DECADE Servers.

   When an Application End-Point wishes to use its DECADE storage
   server, it provides a token (see Section 6.1.2 for details) to the
   other Application End-Point.
   The token is sent using the Content Distribution Application's
   native protocol.

   The steps of the example are illustrated in Figure 2. First, B
   requests a data object from A using their native protocol. Next, A
   uses the DECADE Resource Protocol (DRP) to obtain a token from its
   DECADE storage server, S(A). A then provides the token to B (again,
   using their native protocol). Finally, B provides the token to S(A)
   via DRP, and requests and downloads the data object via a Standard
   Data Transport (SDT).

                            .----------.
               ---------->  |   S(A)   |  <------
     2. Obtain /            `----------'         \  4. Request and
        Token /                                   \    Download Object
       (DRP) /                                     \   (DRP + SDT)
            v            1. App request             v
      .-------------. <--------------------------- .-------------.
      | End-Point A |                              | End-Point B |
      `-------------' ---------------------------> `-------------'
                        3. App response (token)

                 Figure 2: Download from Storage Server

4. Architectural Principles

   We identify the following key principles.

4.1. Decoupled Control/Metadata and Data Planes

   The DECADE infrastructure is intended to support multiple content
   distribution applications. A complete content distribution
   application implements a set of control and management functions
   including content search, indexing and collection, access control, ad
   insertion, replication, request routing, and QoS scheduling. An
   observation of DECADE is that different content distribution
   applications can have unique considerations in designing the control
   and signaling functions:

   o  Metadata Management Scheme: Traditional file systems provide a
      standard metadata abstraction: a recursive structure of
      directories to offer namespace management; each file is an opaque
      byte stream. In content distribution, applications may use
      different metadata management schemes.
      For example, one application may use a sequence of blocks (e.g.,
      for file sharing), while another application may use a sequence
      of frames (with different sizes) indexed by time.

   o  Resource Scheduling Algorithms: A major competitive advantage of
      many successful P2P systems is their substantial expertise in
      achieving highly efficient utilization of peer and infrastructural
      resources. For instance, many live P2P systems have their own
      specific algorithms for constructing topologies to achieve
      low-latency, high-bandwidth streaming. They continue to fine-tune
      such algorithms.

   Given the diversity of control-plane functions, in-network storage
   should export basic mechanisms and allow as much flexibility as
   possible to the control planes to implement specific policies. This
   conforms to the end-to-end systems principle and allows innovation
   and satisfaction of specific business goals.

   Decoupling control plane and data plane is not new. For example,
   OpenFlow is an implementation of this principle for Internet routing,
   where the computation of the forwarding table and the application of
   the forwarding table are separated. Google File System applies the
   principle to file system design, by utilizing the Master to handle
   the meta-data management, and the Chunk Servers to handle the data
   plane functions (i.e., read and write of chunks of data). NFSv4 also
   implements this principle.

   Note that applications may have different Data Plane implementations
   in order to support particular requirements (e.g., low latency). In
   order to provide interoperability, the DECADE architecture does not
   intend to enable arbitrary data transport protocols. However, the
   architecture may allow for more than one data transport protocol to
   be used.
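
   As a non-normative illustration of the decoupling described in this
   section, the following Python sketch shows two applications with
   different metadata management schemes (a block sequence for file
   sharing, and time-indexed frames for streaming) sharing one data
   plane that only stores opaque objects. All class and method names
   are hypothetical and are not defined by DECADE, and the
   content-derived object identifier is an assumption of the sketch,
   not the naming scheme of the architecture.

```python
import hashlib


class DataPlaneStore:
    """Hypothetical data plane: stores opaque, immutable objects by id."""

    def __init__(self):
        self._objects = {}

    def put(self, data: bytes) -> str:
        # Assumption for illustration only: derive the id from the content.
        object_id = hashlib.sha256(data).hexdigest()
        self._objects.setdefault(object_id, data)
        return object_id

    def get(self, object_id: str) -> bytes:
        return self._objects[object_id]


class FileSharingIndex:
    """Control plane A: a file is an ordered sequence of fixed-size blocks."""

    def __init__(self, store: DataPlaneStore, block_size: int = 4):
        self.store = store
        self.block_size = block_size
        self.block_ids = []

    def publish(self, content: bytes) -> None:
        for i in range(0, len(content), self.block_size):
            block = content[i:i + self.block_size]
            self.block_ids.append(self.store.put(block))

    def assemble(self) -> bytes:
        return b"".join(self.store.get(b) for b in self.block_ids)


class StreamingIndex:
    """Control plane B: variable-size frames indexed by timestamp."""

    def __init__(self, store: DataPlaneStore):
        self.store = store
        self.frames = {}  # timestamp in ms -> object id

    def publish_frame(self, timestamp_ms: int, frame: bytes) -> None:
        self.frames[timestamp_ms] = self.store.put(frame)

    def frame_at(self, timestamp_ms: int) -> bytes:
        return self.store.get(self.frames[timestamp_ms])


# Both applications use the same store; only their metadata differs.
store = DataPlaneStore()
files = FileSharingIndex(store)
files.publish(b"hello world!")
stream = StreamingIndex(store)
stream.publish_frame(0, b"key-frame")
stream.publish_frame(40, b"delta")
```

   The store never interprets the objects it holds. Each application
   keeps its own index, which is the kind of flexibility the control
   plane is meant to retain.
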

   Also note that although an application's existing control plane
   functions remain implemented within the application, the particular
   implementation may need to be adjusted to support DECADE.

4.2. Immutable Data Objects

   A property of bulk contents to be broadly distributed is that they
   typically are immutable -- once a piece of content is generated, it
   is typically not modified. It is not common that bulk contents such
   as video frames and images need to be modified after distribution.

   Many content distribution applications divide content objects into
   blocks for two reasons: (1) multipath: different blocks may be
   fetched from different content sources in parallel, and (2) faster
   recovery and verification: individual blocks may be recovered and
   verified. Typically, applications use a block size larger than a
   single packet in order to reduce control overhead.

   Common applications using the aforementioned data model include P2P
   streaming (live and video-on-demand) and P2P file-sharing. However,
   other types of applications may also match this model.

   DECADE adopts a design in which immutable data objects may be stored
   at a storage server. Applications may consider existing blocks as
   DECADE data objects, or they may adjust block sizes before storing in
   a DECADE server.

   Focusing on immutable data blocks in the data plane can substantially
   simplify the data plane design, since consistency requirements can be
   relaxed. It also allows effective reuse of data blocks and
   de-duplication of redundant data.

   Depending on its specific requirements, an application may store data
   in DECADE servers such that each data object is completely
   self-contained (e.g., a complete, independently decodable video
   segment). An application may also divide data into chunks that
   require application-level assembly.
   The DECADE architecture and protocols are agnostic to the nature of
   the data objects and do not specify a fixed size for them.

   Note that immutable content may still be deleted. Also note that
   immutable data blocks do not imply that contents cannot be modified.
   For example, a meta-data management function of the control plane may
   associate a name with a sequence of immutable blocks. If one of the
   blocks is modified, the meta-data management function changes the
   mapping of the name to a new sequence of immutable blocks.

   Throughout this document, all the data objects/blocks are referred to
   as immutable data objects/blocks.

4.3. Data Object Identifiers

   Objects that are stored in a DECADE storage server can be accessed by
   DECADE content consumers by a resource identifier that has been
   assigned within a certain application context.

   Because a DECADE content consumer can access more than one storage
   server within a single application context, a data object that is
   replicated across different storage servers managed by a DECADE
   storage provider can be accessed by a single identifier.

   Note that since data objects are immutable, it is possible to support
   persistent identifiers for data objects.

4.4. Explicit Control

   To support the functions of an application's control plane,
   applications must be able to know and control which data is stored at
   particular locations. Thus, in contrast with content caches,
   applications are given explicit control over the placement (selection
   of a DECADE server), deletion (or expiration policy), and access
   control for stored data.

   Consider deletion/expiration policy as a simple example. An
   application may require that a DECADE server store content for a
   relatively short period of time (e.g., for live-streaming data).
   Another application may need to store content for a longer duration
   (e.g., for video-on-demand).

4.5. Resource and Data Access Control through User Delegation

   DECADE provides a shared infrastructure to be used by multiple
   tenants of multiple content distribution applications. Thus, it
   needs to provide both resource and data access control.

4.5.1. Resource Allocation

   There are two primary interacting entities in the DECADE
   architecture. First, Storage Providers control where DECADE storage
   servers are provisioned and their total available resources. Second,
   Applications control data transfers amongst available DECADE servers
   and between DECADE servers and end-points. A form of isolation is
   required to enable concurrently running Applications to each
   explicitly manage its own content and share of resources at the
   available servers.

   The Storage Provider delegates the management of the resources at a
   DECADE server to one or more applications. Applications are able to
   explicitly and independently manage their own shares of resources.

4.5.2. User Delegations

   Storage providers have the ability to explicitly manage the entities
   allowed to utilize the resources at a DECADE server. This capability
   is needed for reasons such as capacity-planning and legal
   considerations in certain deployment scenarios.

   To provide a scalable way to manage applications granted resources at
   a DECADE server, we consider an architecture that adds a layer of
   indirection. Instead of granting resources to an application, the
   DECADE server grants a share of the resources to a user. The user
   may in turn share the granted resources amongst multiple
   applications. The share of resources granted by a storage provider
   is called a User Delegation.

   As a simple example, a DECADE Server operated by an ISP may be
   configured to grant each ISP Subscriber 1.5 Mbps of bandwidth.
   The ISP Subscriber may in turn divide this share of resources between
   a video streaming application and a file-sharing application that are
   running concurrently.

   In general, a User Delegation may be granted to an end-user (e.g., an
   ISP subscriber), a Content Provider, or an Application Provider. A
   particular instance of an application may make use of the storage
   resources:

   o  granted to the end-user (with the end-user's permission),

   o  granted to the Content Provider (with the Content Provider's
      permission), and/or

   o  granted to the Application Provider.

5. System Components

   The primary focus of this document is the architectural principles
   and the system components that implement them. While certain system
   components might differ amongst implementations, the document details
   the major components and their overall roles in the architecture.

   To keep the scope narrow, we only discuss the primary components
   related to protocol development. Particular deployments may require
   additional components (e.g., monitoring and accounting at a DECADE
   server), but they are intentionally omitted from this document.

5.1. Content Distribution Application

   Content Distribution Applications have many functional components.
   For example, many P2P applications have components and algorithms for
   overlay topology management, piece selection, etc. In supporting
   DECADE, it may be advantageous for an application developer to
   consider DECADE in the implementation of these components. However,
   in this architecture document, we focus on the components directly
   employed to support DECADE.

   Figure 3 illustrates the components discussed in this section from
   the perspective of a single Application End-Point and their relation
   to the DECADE protocols.

                               Native Protocol(s)
                       (with other Application End-Points)
                                 .--------------------->
                                 |
                                 |
   .----------------------------------------------------------.
   | Application End-Point                                    |
   | .------------.                  .-------------------.    |
   | | App-Layer  |        ...       | App Data Assembly |    |
   | | Algorithms |                  |    Sequencing     |    |
   | `------------'                  `-------------------'    |
   |                                                          |
   | .------------------------------------------------------. |
   | | DECADE Client                                        | |
   | |                                                      | |
   | | .-------------------------. .----------------------. | |
   | | | Resource Controller     | | Data Controller      | | |
   | | | .--------. .----------. | | .--------. .-------. | | |
   | | | | Data   | | Resource | | | | Data   | | Data  | | | |
   | | | | Access | | Sharing  | | | | Sched. | | Index | | | |
   | | | | Policy | | Policy   | | | |        | |       | | | |
   | | | '--------' `----------' | | `--------' `-------' | | |
   | | `-------------------------' `----------------------' | |
   | |             |                   ^                    | |
   | `------------ | ----------------- | -------------------' |
   `-------------- | ----------------- | ---------------------'
                   |                   |
                   | DECADE            | Standard
                   | Resource          | Data
                   | Protocol          | Transport
                   | (DRP)             | (SDT)
                   v                   V

                    Figure 3: Application Components

5.1.1.  Data Assembly

   DECADE is primarily designed to support applications that can divide
   distributed content into data objects.  To accomplish this,
   applications include a component responsible for creating the
   individual data objects before distribution and then re-assembling
   the data objects at the Content Consumer.  We call this component
   Application Data Assembly.  The specific implementation is entirely
   decided by the application.

   In producing and assembling the data objects, two important
   considerations are sequencing and naming.  The DECADE architecture
   assumes that applications implement this functionality themselves.
   See Section 5.3 for further discussion.

5.1.2.  Native Protocols

   Applications may still use existing protocols.  In particular, an
   application may reuse its existing protocols for control and
   signaling.  An application may also retain its existing data
   transport protocols alongside DECADE as a data transport protocol.
   This can be important for applications that are designed to be
   highly robust (e.g., so that they keep working if DECADE servers are
   unavailable).

5.1.3.  DECADE Client

   An application may be modified to support DECADE.  We call the layer
   providing the DECADE support to an application the DECADE Client.  It
   is important to note that a DECADE Client need not be embedded into
   an application.  It could be implemented as a standalone entity, or
   it could be integrated into other entities such as network devices.

5.1.3.1.  Resource Controller

   Applications may have different Resource Sharing Policies and Data
   Access Policies to control their resources and data in DECADE
   servers.  These policies can be existing policies of applications
   (e.g., tit-for-tat) or custom policies adapted for DECADE.  The
   specific implementation is decided by the application.

5.1.3.2.  Data Controller

   DECADE is designed to decouple the control and the data transport of
   applications.  Data transport between applications and DECADE servers
   uses standard data transport protocols.  A Data Scheduling component
   schedules data transfers according to network conditions, available
   DECADE Servers, and/or available DECADE Server resources.  The Data
   Index indicates data available at remote DECADE servers.  The Data
   Index (or a subset of it) may be advertised to other Application
   End-Points.  A common use case for this is to provide the ability to
   locate data amongst a distributed set of Application End-Points
   (i.e., a data search mechanism).

5.2.  DECADE Server

   A DECADE Server stores data from Application End-Points, and provides
   Application End-Points with control of and access to that data.  Note
   that a DECADE Server is not necessarily a single physical machine; it
   could also be implemented as a cluster of machines.

            |                   |
            | DECADE            | Standard
            | Resource          | Data
            | Protocol          | Transport
            | (DRP)             | (SDT)
            |                   |
         .= | ================= | ======================.
         |  |                   v                       |
         |  |           .----------------.              |
         |  |---->      | Access Control | <--------.   |
         |  |           `----------------'          |   |
         |  |                   ^                   |   |
         |  |                   |                   |   |
         |  |                   v                   |   |
         |  |        .---------------------.        |   |
         |  `->      | Resource Scheduling | <------|   |
         |           `---------------------'        |   |
         |                      ^                   |   |
         |                      |                   |   |
         |                      v        .------------. |
         |           .-----------------. |    User    | |
         |           |   Data Store    | | Delegation | |
         |           `-----------------' | Management | |
         |  DECADE Server                `------------' |
         `=============================================='

                   Figure 4: DECADE Server Components

5.2.1.  Access Control

   An Application End-Point can access its own data or other Application
   End-Points' data (provided sufficient authorization) in DECADE
   servers.  Application End-Points may also authorize other End-Points
   to store data.  If an access is authorized by an Application
   End-Point, the DECADE Server will provide access.

   Note that even if a request is authorized, it may still fail to
   complete due to insufficient resources at the requesting Application
   End-Point, the providing Application End-Point, or the DECADE Server
   itself.

5.2.2.  Resource Scheduling

   Applications may apply their existing resource sharing policies or
   use a custom policy for DECADE.  DECADE servers perform resource
   scheduling according to the resource sharing policies indicated by
   Application End-Points as well as configured User Delegations.

5.2.3.  Data Store

   Data from applications may be stored at a DECADE Server.  Data can be
   deleted from storage either explicitly or automatically (e.g., after
   a TTL expiration).  It may be possible to perform optimizations in
   certain cases, such as avoiding writing temporary data (e.g., live
   streaming) to persistent storage, if appropriate storage hints are
   supported by the SDT.

5.3.  Data Sequencing and Naming

   In order to provide a simple and generic interface, the DECADE Server
   is only responsible for storing and retrieving individual data
   objects.  Furthermore, DECADE uses its own simple naming scheme that
   provides uniqueness (with high probability) between data objects,
   even across multiple applications.

5.3.1.  DECADE Data Object Naming Scheme

   The name of a data object is derived from the hash over the data
   object's content (the raw bytes), which is made possible by the fact
   that DECADE objects are immutable.  This scheme has multiple
   appealing properties:

   o  Simple integrity verification

   o  Unique names (with high probability)

   o  Application independent, without a new IANA-maintained registry

   The DECADE naming scheme also includes a "type" field.  The "type"
   identifier indicates that the name is the hash of the data object's
   content and identifies the particular hashing algorithm used.  This
   allows the DECADE protocol to evolve by either changing the hashing
   algorithm (e.g., if security vulnerabilities in an existing hashing
   algorithm are discovered) or moving to a different naming scheme
   altogether.

   The specific format of the name (e.g., encoding, hash algorithms,
   etc.) is out of scope of this document, and is left for protocol
   specification.

   Another advantage of this scheme is that a DECADE client knows the
   name of a data object before it is completely stored at the DECADE
   server.
   This allows for particular optimizations, such as advertising
   a data object while it is still being stored, which removes
   store-and-forward delays.  For example, a DECADE client A may
   simultaneously begin storing an object to a DECADE server and
   advertise that the object is available to DECADE client B.  If it is
   supported by the DECADE server, client B may begin downloading the
   object before A has finished storing it.

5.3.2.  Application Usage

   Recall from Section 5.1.1 that an Application typically includes its
   own naming and sequencing scheme.  It is important to note that the
   DECADE naming format does not attempt to replace any naming or
   sequencing of data objects already performed by an Application;
   instead, the DECADE naming is intended to apply only to data objects
   referenced at the DECADE layer.

   DECADE names are not necessarily correlated with the naming or
   sequencing used by the Application using a DECADE client.  The DECADE
   client is expected to maintain a mapping from its own data objects
   and their names to the DECADE data objects and names.  Furthermore,
   the DECADE naming scheme implies no sequencing or grouping of
   objects, even if this is done at the application layer.

   Not only does an Application retain its own naming scheme, it may
   also decide the sizes of data objects to be distributed via DECADE.
   This is desirable since sizes of data objects may impact Application
   performance (e.g., overhead vs. data distribution delay), and the
   particular tradeoff is application-dependent.

5.3.3.  Application Usage Example

   To illustrate these properties, this section presents multiple
   examples.

5.3.3.1.  Application with Fixed-Size Chunks

   Similar to the example in Section 5.1.1, consider an Application in
   which each individual application-layer segment of data is called a
   "chunk" and has a name of the form: "CONTENT_ID:SEQUENCE_NUMBER".
   Furthermore, assume that the application's native protocol uses
   chunks of size 16 KB.

   Now, assume that this application wishes to make use of DECADE, and
   assume that it wishes to store data to DECADE servers in data objects
   of size 64 KB.  To accomplish this, it can map a sequence of 4 chunks
   into a single DECADE object, as shown in Figure 5.

                           Application Chunks
   .---------.---------.---------.---------.---------.---------.--------
   |         |         |         |         |         |         |
   | Chunk_0 | Chunk_1 | Chunk_2 | Chunk_3 | Chunk_4 | Chunk_5 | Chunk_6
   |         |         |         |         |         |         |
   `---------`---------`---------`---------`---------`---------`--------

                           DECADE Data Objects
   .---------------------------------------.----------------------------
   |                                       |
   |               Object_0                |          Object_1
   |                                       |
   `---------------------------------------`----------------------------

       Figure 5: Mapping Application Chunks to DECADE Data Objects

   In this example, the Application might maintain a logical mapping
   that is able to determine the name of a DECADE data object given the
   chunks contained within that data object.  The name might be learned
   from the original source, another endpoint with which it is
   communicating, a tracker, etc.

   It is important to note that as long as the data contained within
   each sequence of chunks is unique, the corresponding DECADE data
   objects have unique names.  This is desired, and happens
   automatically if a particular Application segments the same stream of
   data in a different way, for example by using different chunk sizes
   or different padding schemes.

5.3.3.2.  Application with Continuous Streaming Data

   Next, consider an Application whose native protocol retrieves a
   continuous data stream (e.g., an MPEG2 stream) instead of downloading
   and redistributing chunks of data.  Such an application could segment
   the continuous data stream to produce either fixed-sized or
   variable-sized DECADE data objects.
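
   The mapping between application-level names and DECADE data object
   names described in this section can be sketched as follows.  This is
   only an illustration under stated assumptions: the name format
   "<type>:<hex digest>", the choice of SHA-256, and the helper names
   decade_name and build_index are hypothetical, since the actual name
   format is left for protocol specification.

```python
import hashlib


def decade_name(data: bytes, algorithm: str = "sha256") -> str:
    """Name an immutable object as '<type>:<hex digest>' (assumed format)."""
    return f"{algorithm}:{hashlib.new(algorithm, data).hexdigest()}"


def build_index(content_id: str, stream: bytes,
                chunk_size: int, chunks_per_object: int) -> dict:
    """Map each application chunk name to the DECADE object holding it."""
    object_size = chunk_size * chunks_per_object
    index = {}
    for obj_no, start in enumerate(range(0, len(stream), object_size)):
        obj = stream[start:start + object_size]
        name = decade_name(obj)
        # Ceiling division: the final object may hold fewer chunks.
        chunks_in_obj = -(-len(obj) // chunk_size)
        for k in range(chunks_in_obj):
            seq = obj_no * chunks_per_object + k
            # Application-level name of the form CONTENT_ID:SEQUENCE_NUMBER.
            index[f"{content_id}:{seq}"] = name
    return index
```

   With 16 KB chunks and 4 chunks per object, chunks 0 through 3 of a
   content item resolve to the same DECADE object name, while chunk 4
   resolves to the name of the next object.  Because the name depends
   only on the object's bytes, a different segmentation of the same
   stream yields different names without any coordination.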

   Figure 6 shows how a video streaming application might produce
   variable-sized DECADE data objects such that each DECADE data object
   contains 10 seconds of video data.

                       Application's Video Stream
   .--------------------------------------------------------------------
   |
   |
   |
   `--------------------------------------------------------------------
   ^              ^              ^              ^              ^
   |              |              |              |              |
   0 Seconds      10 Seconds     20 Seconds     30 Seconds     40 Seconds
   0 B            400 KB         900 KB         1200 KB        1500 KB

                          DECADE Data Objects
   .--------------.--------------.--------------.--------------.--------
   |              |              |              |              |
   |   Object_0   |   Object_1   |   Object_2   |   Object_3   |
   |   (400 KB)   |   (500 KB)   |   (300 KB)   |   (300 KB)   |
   `--------------`--------------`--------------`--------------`--------

    Figure 6: Mapping a Continuous Data Stream to DECADE Data Objects

   Similar to the previous example, the Application might maintain a
   mapping that is able to determine the name of a DECADE data object
   given the time offset of the video chunk.

5.4.  Token-based Authentication and Resource Control

   A primary use case for DECADE is a DECADE Client authorizing other
   DECADE Clients to store or retrieve data objects from its DECADE
   storage.  To support this, DECADE uses a token-based authentication
   scheme.

   In particular, an entity trusted by a DECADE Client generates a
   digitally-signed token with particular properties (see Section 6.1.2
   for details).  The DECADE Client distributes this token to other
   DECADE Clients, which then use the token when sending requests to the
   DECADE Server.  Upon receiving a token, the DECADE Server validates
   the signature and the operation being performed.

   This is a simple scheme, but has multiple important advantages over
   an alternative approach in which a DECADE Client explicitly
   manipulates an Access Control List (ACL) associated with each DECADE
   data object.
   In particular, it has the following advantages when applied to
   DECADE's target applications:

   o  Authorization policies are implemented within the Application; an
      Application explicitly controls when tokens are generated and to
      whom they are distributed.

   o  Fine-grained access and resource control can be applied to data
      objects; see Section 6.1.2 for the list of restrictions that can
      be enforced with a token.

   o  There is no messaging between a DECADE Client and DECADE Server to
      manipulate data object permissions.  This can simplify, in
      particular, Applications which share data objects with many
      dynamic peers and need to frequently adjust access control
      policies attached to DECADE data objects.

   o  Tokens can provide anonymous access, in which a DECADE Server does
      not need to know the identity of each DECADE Client that accesses
      it.  This enables a DECADE Client to send tokens to DECADE Clients
      in other administrative or security domains, and allow them to
      read data objects from, or write data objects to, its DECADE
      storage.

   It is important to note that, in addition to DECADE Clients applying
   access control policies to DECADE data objects, the DECADE Server may
   be configured to apply additional policies based on user, object,
   geographic location, etc.  Defining such policies is out of scope of
   the DECADE Working Group, but in such a case, a DECADE Client may be
   denied access even though it possesses a valid token.

5.5.  Discovery

   DECADE includes a discovery mechanism through which DECADE clients
   locate an appropriate DECADE Server.  [I-D.ietf-decade-reqs] details
   specific requirements of the discovery mechanism; this section
   discusses how they relate to other principles outlined in this
   document.

   A discovery mechanism allows a DECADE client to determine an IP
   address or some other identifier that can be resolved to locate the
   server for which the client will be authorized to generate tokens
   (via DRP).  (Note that the discovery mechanism may also result in an
   error if no such DECADE servers can be located.)  After discovering
   one or more DECADE servers, a DECADE client may distribute load and
   requests across them (subject to resource limitations and policies of
   the DECADE servers themselves) according to the policies of the
   Application End-Point in which it is embedded.

   The particular protocol used for discovery is out of scope of this
   document, but any specification will re-use standard protocols
   wherever possible.

   It is important to note that the discovery mechanism outlined here
   does not provide the ability to locate arbitrary DECADE servers for
   which a DECADE client might obtain tokens from others.  To do so
   requires application-level knowledge, and it is assumed that this
   functionality is implemented in the Content Distribution Application,
   or if desired and needed, as an extension to this DECADE
   architecture.

6.  DECADE Protocols

   This section specifies the DECADE Resource Protocol (DRP) and the
   Standard Data Transport (SDT) in terms of abstract protocol
   interactions that are intended to be mapped to specific protocols.
   Note that while the protocols are logically separate, DRP is
   specified as being carried through extension fields within an SDT
   (e.g., HTTP headers).

   The DRP is the protocol used by a DECADE client to configure the
   resources and authorization used to satisfy requests (reading,
   writing, and management operations concerning DECADE objects) at a
   DECADE server.  The SDT is used to send the operations to the DECADE
   server.
   Necessary DRP metadata is supplied using mechanisms in the
   SDT that are provided for extensibility (e.g., additional request
   parameters or extension headers).

6.1.  DECADE Resource Protocol (DRP)

   DRP provides configuration of access control and resource sharing
   policies on DECADE servers.  A content distribution application
   (e.g., a live P2P streaming session) MAY employ several DECADE
   servers, for instance, servers in different operator domains.  DRP
   allows one instance of such an application (e.g., an application
   endpoint) to apply access control and resource sharing policies on
   each of them.

6.1.1.  Controlled Resources

   On a single DECADE server, the following resources may be managed:

   communication resources:  DECADE servers have limited communication
      resources, both in terms of bandwidth (upload/download) and in
      terms of the number of clients connected at a time (connections).

   storage resources:  DECADE servers have limited storage resources.

6.1.2.  Access and Resource Control Token

   A token includes the following fields:

      Permitted operations (e.g., read, write)

      Permitted objects (e.g., names of data objects that may be read or
      written)

      Permitted clients (e.g., as indicated by IP address or other
      identifier) that may use the token

      Expiration time

      Priority for bandwidth given to requested operation (e.g., a
      weight used in a weighted bandwidth sharing scheme)

      Amount of data that may be read or written

   The particular format for the token is out of scope of this document.

   The tokens are generated by a trusted entity at the request of a
   DECADE Client.  It is out of scope of this document to identify which
   entity serves this purpose, but examples include the DECADE Client
   itself, a DECADE Server trusted by the DECADE Client, or another
   server managed by a Storage Provider trusted by the DECADE Client.
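
   As an illustration of the token fields listed above, the following
   minimal sketch models a token and the checks a DECADE Server might
   apply to it.  Everything about the encoding is an assumption, since
   the token format is out of scope of this document: the field names,
   the JSON serialization, and the helpers sign and permits are
   hypothetical.  In particular, an HMAC over a shared key stands in
   for the digital signature here; it is not the signature scheme of
   DECADE.

```python
import hashlib
import hmac
import json
import time
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Token:
    operations: tuple   # permitted operations, e.g. ("read",)
    objects: tuple      # permitted data object names
    clients: tuple      # permitted client identifiers
    expires_at: float   # expiration time (seconds since the epoch)
    priority: int       # weight for bandwidth sharing
    max_bytes: int      # amount of data that may be read or written


def sign(token: Token, key: bytes) -> str:
    """Compute a keyed tag over a canonical encoding of the token."""
    encoded = json.dumps(asdict(token), sort_keys=True).encode()
    return hmac.new(key, encoded, hashlib.sha256).hexdigest()


def permits(token: Token, tag: str, key: bytes, *, operation: str,
            obj: str, client: str, nbytes: int, now=None) -> bool:
    """Check the tag first, then every restriction carried by the token."""
    now = time.time() if now is None else now
    return (hmac.compare_digest(tag, sign(token, key))
            and now < token.expires_at
            and operation in token.operations
            and obj in token.objects
            and client in token.clients
            and nbytes <= token.max_bytes)
```

   The priority field is carried but not checked in this sketch; a
   server would use it when scheduling the operation rather than when
   deciding whether to admit it.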

   Upon generating a token, a DECADE Client may distribute it to another
   DECADE Client (e.g., via their native Application protocol).  The
   receiving DECADE Client may then connect to the sending DECADE
   Client's DECADE Server and perform any operation permitted by the
   token.  The token must be sent along with the operation.  The DECADE
   Server validates the token to identify the DECADE Client that issued
   it and to determine whether the requested operation is permitted by
   the contents of the token.  If the token is successfully validated,
   the DECADE Server applies the resource control policies indicated in
   the token while performing the operation.

   It is possible for DRP to allow tokens to apply to a batch of
   operations to reduce communication overhead required between DECADE
   Clients.

   DRP may also define tokens to include a unique identifier to allow a
   DECADE Server to detect when a token is used multiple times.

6.1.3.  Status Information

   DRP provides a request service for status information that DECADE
   clients can use to request information from a DECADE server.

   status information per application context on a specific server:
      Access to such status information requires client authorization,
      i.e., DECADE clients need to be authorized to access status
      information for a specific application context.  This
      authorization (and the mapping to application contexts) is based
      on the user delegation concept as described in Section 4.5.  The
The 1065 following status information elements can be obtained: 1067 * list of associated objects (with properties) 1069 * resources used/available 1071 * list of servers to which objects have been distributed (in a 1072 certain time-frame) 1074 * list of clients to which objects have been distributed (in a 1075 certain time-frame) 1077 For the list of servers/clients to which objects have been 1078 distributed to, the DECADE server can decide on time bounds for 1079 which this information is stored and specify the corresponding 1080 time frame in the response to such requests. Some of this 1081 information can be used for accounting purposes, e.g., the list of 1082 clients to which objects have been distributed. 1084 access information per application context on a specific server: 1085 Access information can be provided for accounting purposes, for 1086 example, when application service providers are interested to 1087 maintain access statistics for resources and/or to perform 1088 accounting per user. Again, access to such information requires 1089 client authorization based on the user delegation concept as 1090 described in Section 4.5. The following access information 1091 elements can be requested: 1093 * what objects have been accessed how many times 1095 * access tokens that a server as seen for a given object 1097 The DECADE server can decide on time bounds for which this 1098 information is stored and specify the corresponding time frame in 1099 the response to such requests. 1101 6.1.4. Object Properties 1103 Objects that are stored on a DECADE server can provide properties (in 1104 addition to the object identifier and the actual content). Depending 1105 on authorization, DECADE clients may get or set such properties. 1106 This authorization (and the mapping to application contexts) is based 1107 on the user delegation concept as described in Section 4.5. 
The 1108 DECADE architecture does not limit the set of permissible properties, 1109 but rather specifies a set of baseline properties that SHOULD be 1110 supported by implementations. 1112 TTL: TTL of the object as an absolute time value 1114 object size: in bytes 1116 MIME type 1118 access statistics: how often the object has been accessed (and what 1119 tokens have been used) 1121 6.2. Standard Data Transport (SDT) 1123 A DECADE server provide a data access interface, and SDT is used to 1124 write objects to a server and to read (download) objects from a 1125 server. Semantically, SDT is a client-server protocol, i.e., the 1126 DECADE server always responds to client requests. 1128 An SDT used in DECADE SHOULD offer a transport mode that provides 1129 confidentiality and integrity. 1131 6.2.1. Writing/Uploading Objects 1133 For writing objects, a client uploads an object to a DECADE server. 1134 The object on the server will be named (associated to an identifier), 1135 and this name can be used to access (download) the object later, 1136 e.g., the client can pass the name as a reference to other client 1137 that can then refer to the object. 1139 DECADE objects can be self-contained objects such as multimedia 1140 resources, files etc., but also chunks, such as chunks of a P2P 1141 distribution protocol that can be part of a containing object or a 1142 stream. 1144 A server MUST accept download requests for an object that is still 1145 being uploaded. 1147 The application that originates the objects MUST generate DECADE 1148 object names according to the naming specification in Section 5.3. 1150 The naming scheme provides that the name is unique. DECADE clients 1151 (as parts of application entities) upload a named object to a server, 1152 and a DECADE server MUST NOT change the name. It MUST be possible 1153 for downloading clients, to access the object using the original 1154 name. 
   A DECADE server MAY verify the integrity and other security
   properties of uploaded objects.

   In the following, we provide an abstract specification of the upload
   operation that we name 'PUT METHOD'.  See Appendix A.1 for an example
   of how this could be mapped to HTTP.

   Method PUT:

   Parameters:

      NAME:  The name of the object according to Section 5.3.

      OBJECT:  The object itself.  The protocol MUST provide transparent
         binary object transport.

   Description:  The PUT method is used by a DECADE client to upload an
      object with an associated name 'NAME' to a DECADE server.

   RESPONSES:  The DECADE server MUST respond with one of the following
      response messages:

      CREATED:  The object has been uploaded successfully and is now
         available under the specified name.

      ERRORs:  There was an error uploading the content.

6.2.2.  Downloading Objects

   A DECADE client can request named objects from a DECADE server.  In
   the following, we provide an abstract specification of the download
   operation that we name 'GET METHOD'.  See Appendix A.1 for an example
   of how this could be mapped to HTTP.

   Method GET:

   Parameters:

      NAME:  The name of the object according to Section 5.3.

   Description:  The GET method is used by a DECADE client to download
      an object with an associated name 'NAME' from a DECADE server.

   RESPONSES:  The DECADE server MUST respond with one of the following
      response messages:

      OK:  The request has succeeded, and an entity corresponding to the
         requested resource is sent in the response.

      ERRORs:

         NOTFOUND:  The DECADE server has not found anything matching
            the requested object name.

         Other Errors:  TBD in a future version of this document

7.  Server-to-Server Protocols

   An important feature of DECADE is the capability for one DECADE
   server to directly download data objects from another DECADE server.
   This capability allows Applications to directly replicate data
   objects between servers without requiring end-hosts to use uplink
   capacity to upload data objects to a different DECADE server.
   Similar to other operations in DRP and SDT, replicating data objects
   between DECADE servers is an explicit operation.

   To support this functionality, DECADE re-uses the already-specified
   protocols to support operations directly between servers.  DECADE
   servers are neither assumed to trust each other nor configured to do
   so.  All data operations are performed on behalf of DECADE clients
   via explicit instruction, so additional capabilities are needed in
   the DECADE client-server protocols.  DECADE clients must be able to
   indicate to a DECADE server the following additional parameters:

   o  which remote DECADE server(s) to access;

   o  the operation to be performed (PUT or GET); and

   o  credentials indicating permission to perform the operation at the
      remote DECADE server.

   In this way, a DECADE server also acts as a DECADE client, and an
   incoming request may cause it to issue requests via that client.  The
   operations are performed as if the original requester had its own
   DECADE client co-located with the DECADE server.  It is this mode of
   operation that provides substantial savings in uplink capacity.

7.1.  Operational Overview

   DECADE's server-to-server support is focused on reading and writing
   data objects between DECADE servers.  A DECADE GET or PUT request MAY
   supply the following additional parameters:

   REMOTE_SERVER:  Address of the remote DECADE server.  The format of
      the address is out of scope of this document.

   REMOTE_USER:  The account at the remote server from which to retrieve
      the object (for a GET), or in which the object is to be stored
      (for a PUT).

   TOKEN:  Credentials to be used at the remote server.
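
   To make these parameters concrete, the following sketch models two
   in-memory DECADE servers, where a GET or PUT that carries the
   additional parameters causes the receiving server to issue the same
   operation at the remote server on the client's behalf.  It is a
   simplified illustration, not a protocol definition: the class names
   Server and Remote are hypothetical, an object reference stands in
   for the REMOTE_SERVER address, and tokens are reduced to opaque
   strings checked against a set.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Remote:
    server: "Server"   # REMOTE_SERVER (object reference instead of address)
    user: str          # REMOTE_USER
    token: str         # TOKEN: credentials to use at the remote server


@dataclass
class Server:
    valid_tokens: set = field(default_factory=set)
    store: dict = field(default_factory=dict)   # (user, name) -> bytes

    def _check(self, token: str) -> None:
        if token not in self.valid_tokens:
            raise PermissionError("token rejected")

    def put(self, user: str, name: str, data: bytes, token: str,
            remote: Optional[Remote] = None) -> None:
        """Store locally, then store to the remote server if requested."""
        self._check(token)
        self.store[(user, name)] = data
        if remote is not None:
            remote.server.put(remote.user, name, data, remote.token)

    def get(self, user: str, name: str, token: str,
            remote: Optional[Remote] = None) -> bytes:
        """Fetch from the remote server if requested, else serve locally."""
        self._check(token)
        if remote is not None:
            return remote.server.get(remote.user, name, remote.token)
        return self.store[(user, name)]
```

   In this sketch the object name and the operation at the remote
   server are always the same as in the original request, and each
   server checks only the token presented to it.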

   These parameters are used by the DECADE server to instantiate a
   request to the specified remote server.  It is assumed that the data
   object referred to at the remote server is the same as in the
   original request.  It is also assumed that the operation performed at
   the remote server is the same as the operation in the original
   request.  Though explicitly supplying these may provide additional
   freedom, it is not clear what benefit they might provide.

   Note that when a DECADE client invokes a request on a DECADE server
   with these additional parameters, it is giving the DECADE server
   permission to act on its behalf.  Thus, it would be wise for the
   supplied token to have narrow privileges (e.g., limited to only the
   necessary data objects) or a limited validity time (e.g., a small
   expiration time).

   In the case of a GET operation, the DECADE server is to retrieve the
   data object from the remote server using the specified credentials
   (via a GET request to the remote server), and then return the object
   to the client.  In the case of a PUT operation, the DECADE server is
   to store the object from the client, and then store the object to the
   remote server using the specified credentials (via a PUT request to
   the remote server).

8.  Potential Optimizations

   As suggestions for the protocol design and eventual implementations,
   we discuss particular optimizations that are enabled by the DECADE
   Architecture discussed in this document.

8.1.  Pipelining to Avoid Store-and-Forward Delays

   A DECADE server may choose to not fully store an object before
   beginning to serve it.  For example, when serving a GET request,
   instead of waiting for the complete data to arrive from a remote
   server or DECADE client, a DECADE server may forward received data
   bytes as they come in.
   This pipelining mode reduces store-and-forward delays, which
   could be substantial for large objects.  A similar behavior could be
   used for PUT.

8.2.  Deduplication

   A common concern amongst Storage Providers is the total volume of
   data that needs to be stored.  An optimization frequently applied in
   existing storage systems is de-duplication, which attempts to avoid
   storing identical data multiple times.  A DECADE Server
   implementation may internally perform de-duplication of data on disk.
   The DECADE architecture enables additional forms of de-duplication.

   Note that these techniques may impact protocol design.  Discussion of
   whether or not they should be adopted is out of scope of this
   document.

8.2.1.  Traffic Deduplication

8.2.1.1.  Rationale

   When a DECADE client (A) instructs its DECADE account on a DECADE
   server (S) to fetch an object from a remote entity (R) (a DECADE
   server or DECADE client), and the object is already stored locally in
   S, S may perform Traffic Deduplication.  This means that S does not
   download the object from R, in order to save network traffic.  In
   particular, S performs a challenge to make sure that the remote
   entity R actually has the object and then replies with its local
   object copy directly.

8.2.1.2.  An Example

   As shown in Figure 7, without Traffic Deduplication, an unnecessary
   transfer of an object from R to S may happen if the server S already
   has the object requested by A.  If Traffic Deduplication is enabled,
   S only needs to challenge R to verify that it does have the data, in
   order to avoid data-stealing attacks.
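
   One way such a challenge could work is sketched below.  This
   construction is an assumption made for illustration and is not
   specified by DECADE: S sends a fresh random nonce, R returns a keyed
   hash of the object computed with that nonce, and S compares the
   result with the same computation over its local copy.  A plain hash
   of the object would not suffice, because the object name is itself
   derived from that hash and may be known to entities that do not hold
   the data.  The function names respond and verify are hypothetical.

```python
import hashlib
import hmac
import os


def respond(nonce: bytes, obj: bytes) -> bytes:
    """Computed by R: a keyed hash proving possession of the object bytes."""
    return hmac.new(nonce, obj, hashlib.sha256).digest()


def verify(nonce: bytes, local_copy: bytes, response: bytes) -> bool:
    """Computed by S: recompute over the local copy and compare."""
    return hmac.compare_digest(respond(nonce, local_copy), response)


# S picks a fresh nonce per challenge so that old responses cannot be replayed.
nonce = os.urandom(16)
```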

        A                       S                       R
   +----------+  obj req  +------------+  obj req  +----------+
   |  DECADE  |=========> |    A's     |==========>|  Remote  |
   |  CLIENT  |<========= |  Account   |<==========|  Entity  |
   +----------+  obj rsp  +------------+  obj rsp  +----------+

                 (a) Without Traffic Deduplication

        A                       S                       R
   +----------+  obj req  +------------+ challenge +----------+
   |  DECADE  |=========> |    A's     |---------->|  Remote  |
   |  CLIENT  |<========= |  Account   |<----------|  Entity  |
   +----------+  obj rsp  +------------+ obj hash  +----------+

                 (b) With Traffic Deduplication

                               Figure 7

8.2.1.3.  HTTP Compatibility of Challenge

   How to integrate traffic deduplication with HTTP is shown in
   Appendix A.1.3.

8.2.2.  Cross-Server Storage Deduplication

   The same object might be uploaded multiple times to different DECADE
   servers.  For storage efficiency, storage providers may desire that a
   single object be stored on one or a few servers.  They might
   implement an internal mechanism to achieve this goal, for example, by
   redirecting requests to the proper servers.  The DECADE protocol
   supports the redirection of DECADE client requests to support further
   cross-server storage deduplication.

9.  Security Considerations

   In general, the security considerations mentioned in
   [I-D.ietf-decade-problem-statement] apply to this document as well.

   In addition, it should be noted that the token-based approach
   (Section 5.4) provides authorization through token delegation.  The
   strength of this authorization depends on several factors:

   1.  the uniqueness of tokens: tokens should be constructed in a way
       that minimizes the possibility of collisions;

   2.  validity of tokens: applications/users should not re-use tokens;
       and

   3.  secrecy of tokens: if tokens are disclosed to unauthorized
       entities, access control for the associated resources cannot be
       provided.

   Depending on the specific application, DECADE can be used to access
   confidential information.  Hence, DECADE implementations SHOULD
   provide a secure transport mode that allows for encryption.

10.  IANA Considerations

   This document does not have any IANA considerations.

11.  References

11.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

11.2.  Informative References

   [RFC2616]  Fielding, R., Gettys, J., Mogul, J., Frystyk, H.,
              Masinter, L., Leach, P., and T. Berners-Lee, "Hypertext
              Transfer Protocol -- HTTP/1.1", RFC 2616, June 1999.

   [RFC3744]  Clemm, G., Reschke, J., Sedlar, E., and J. Whitehead, "Web
              Distributed Authoring and Versioning (WebDAV)
              Access Control Protocol", RFC 3744, May 2004.

   [RFC4331]  Korver, B. and L. Dusseault, "Quota and Size Properties
              for Distributed Authoring and Versioning (DAV)
              Collections", RFC 4331, February 2006.

   [RFC4709]  Reschke, J., "Mounting Web Distributed Authoring and
              Versioning (WebDAV) Servers", RFC 4709, October 2006.

   [RFC4918]  Dusseault, L., "HTTP Extensions for Web Distributed
              Authoring and Versioning (WebDAV)", RFC 4918, June 2007.

   [I-D.ietf-decade-problem-statement]
              Song, H., Zong, N., Yang, Y., and R. Alimi, "DECoupled
              Application Data Enroute (DECADE) Problem Statement",
              draft-ietf-decade-problem-statement-03 (work in progress),
              March 2011.

   [I-D.ietf-decade-survey]
              Alimi, R., Rahman, A., and Y. Yang, "A Survey of In-
              network Storage Systems", draft-ietf-decade-survey-04
              (work in progress), March 2011.

   [I-D.ietf-decade-reqs]
              Yingjie, G., Bryan, D., Yang, Y., and R. Alimi, "DECADE
              Requirements", draft-ietf-decade-reqs-02 (work in
              progress), May 2011.

   [GoogleStorageDevGuide]
              "Google Storage Developer Guide".

Appendix A.  Appendix: Evaluation of Some Candidate Existing Protocols
             for DECADE DRP and SDT

   In this section we evaluate how well the abstract protocol
   interactions specified in Section 6 for DECADE DRP and SDT can be
   fulfilled by existing protocols such as HTTP and WebDAV.

A.1.  HTTP

   HTTP [RFC2616] is a key protocol for the Internet in general and
   especially for the World Wide Web.  HTTP is a request-response
   protocol.  A typical transaction involves a client (e.g., a web
   browser) requesting content (resources) from a web server.  Another
   example is when a client stores or deletes content on a server.

A.1.1.  HTTP Support for DECADE Resource Protocol Primitives

   DRP provides configuration of access control and resource sharing
   policies on DECADE servers.

A.1.1.1.  Access Control Primitives

   Access control requires mechanisms for defining the access policies
   for the server, and then checking the authorization of a user before
   it stores or retrieves content.  HTTP supports rudimentary access
   control via "HTTP Secure" (HTTPS).  HTTPS is a combination of HTTP
   with SSL/TLS.  The main use of HTTPS is to authenticate the server
   and encrypt all traffic between the client and the server.  There is
   also a mode to support client authentication, though this is less
   frequently used.

A.1.1.2.  Communication Resource Controls Primitives

   Communications resources include bandwidth (upload/download) and
   number of simultaneous connected clients (connections).  HTTP
   supports bandwidth control indirectly through "persistent" HTTP
   connections.  Persistent HTTP connections allow a client to keep
   open the underlying TCP connection to the server to allow streaming
   and pipelining (multiple simultaneous requests for a given client).

   HTTP does not define a protocol operation for limiting the
   communication resources available to a client.
   However, servers typically perform this function via implementation
   algorithms.

A.1.1.3.  Storage Resource Control Primitives

   Storage resources include amount of memory and lifetime of storage.
   HTTP does not allow direct control of storage at the server end
   point.  However, HTTP supports caching at intermediate points such
   as a web proxy.  For this purpose, HTTP defines cache control
   mechanisms that define how long and in what situations the
   intermediate point may store and use the content.

A.1.2.  HTTP Support for DECADE Standard Data Transport Protocol
        Primitives

   SDT is used to write objects and read (download) objects from a
   DECADE server.  The object can be either a self-contained object
   such as a multimedia file or a chunk from a P2P system.

A.1.2.1.  Writing Primitives

   Writing involves uploading objects to the server.  HTTP supports two
   methods of writing called PUT and POST.  In HTTP the object is
   called a resource and is identified by a URI.  PUT uploads a
   resource to a specific location on the server.  POST, on the other
   hand, submits the object to the server and the server decides
   whether to update an existing resource or to create a new resource.

   For DECADE, the choice of whether to use PUT or POST will be
   influenced by which entity is responsible for the naming.  If the
   client performs the naming, then PUT is appropriate.  If the server
   performs the naming, then POST should be used (to allow the server
   to define the URI).

A.1.2.2.  Downloading Primitives

   Downloading involves fetching of an object from the server.  HTTP
   supports downloading through the GET and HEAD methods.  GET fetches
   a specific resource as identified by the URL.  HEAD is similar but
   only fetches the metadata ("header") associated with the resource,
   not the resource itself.

A.1.3.  Traffic De-duplication Primitives

   To challenge a remote entity for an object, the DECADE server should
   provide a seed number, which is generated by the server randomly,
   and ask the remote entity to return a hash calculated from the seed
   number and the content of the object.  The server MAY also specify
   the hash function which the remote entity should use.  HTTP supports
   the challenge message through the GET method.  The message type
   ("challenge"), the seed number, and the hash function name are
   carried in the URL.  In the reply, the hash is sent in an ETag
   header.

A.1.4.  Other Operations

   HTTP supports deleting of content on the server through the DELETE
   method.

A.1.5.  Conclusions

   HTTP can provide a rudimentary DRP and SDT for some aspects of
   DECADE, but will not be able to satisfy all the DECADE requirements.
   For example, HTTP does not provide a complete access control
   mechanism, nor does it support storage resource controls at the end
   point server.

   It is possible, however, to envision combining HTTP with a custom
   suite of other protocols to fulfill most of the DECADE requirements
   for DRP and SDT.  For example, Google Storage for Developers is
   built using HTTP (with extensive proprietary extensions such as
   custom HTTP headers).  Google Storage also uses OAuth 2.0 (for
   access control) in combination with HTTP [GoogleStorageDevGuide].

A.2.  WebDAV

   WebDAV [RFC4918] is a protocol for enhanced Web content creation and
   management.  It was developed as an extension to HTTP (Appendix
   A.1).  WebDAV supports traditional operations for reading/writing
   from storage, as well as more advanced features such as locking and
   namespace management, which are important when multiple users
   collaborate to author or edit a set of documents.  HTTP is a subset
   of WebDAV functionality.
   Therefore, all the points noted above in Appendix A.1 apply
   implicitly to WebDAV as well.

A.2.1.  WebDAV Support for DECADE Resource Protocol Primitives

   DRP provides configuration of access control and resource sharing
   policies on DECADE servers.

A.2.1.1.  Access Control Primitives

   Access control requires mechanisms for defining the access policies
   for the server, and then checking the authorization of a user before
   it stores or retrieves content.  WebDAV has an Access Control
   Protocol defined in [RFC3744].

   The goal of WebDAV access control is to provide an interoperable
   mechanism for handling discretionary access control for content and
   metadata managed by WebDAV servers.  WebDAV defines an Access
   Control List (ACL) per resource.  An ACL contains a set of Access
   Control Entries (ACEs), where each ACE specifies a principal (i.e.,
   a user or group of users) and a set of privileges that are granted
   to that principal.  When a principal tries to perform an operation
   on that resource, the server evaluates the ACEs in the ACL to
   determine if the principal has permission for that operation.

   WebDAV also requires that an authentication mechanism be available
   for the server to validate the identity of a principal.  At a
   minimum, all WebDAV compliant implementations are required to
   support HTTP Digest Authentication.

A.2.1.2.  Communication Resource Controls Primitives

   Communications resources include bandwidth (upload/download) and
   number of simultaneous connected clients (connections).  WebDAV
   supports communication resource control as described in
   Appendix A.1.1.2.

A.2.1.3.  Storage Resource Control Primitives

   Storage resources include amount of memory and lifetime of storage.
   WebDAV supports the concept of properties (which are metadata for a
   resource).  A property is either "live" or "dead".
   Live properties include cases where a) the value of a property is
   protected and maintained by the server, and b) the value of the
   property is maintained by the client, but the server performs syntax
   checking on submitted values.  A dead property has its syntax and
   semantics enforced by the client; the server merely records the
   value of the property.

   WebDAV supports a list of standardized properties [RFC4918] that are
   useful for storage resource control.  These include the self-
   explanatory "creationdate" and "getcontentlength" properties.  There
   is also an operation called PROPFIND to retrieve all the properties
   defined for the requested URI.

   WebDAV also has a Quota and Size Properties mechanism defined in
   [RFC4331] that can be used for storage control.  Specifically, two
   key properties are defined per resource: "quota-available-bytes" and
   "quota-used-bytes".

   WebDAV does not define a protocol operation for storage resource
   control.  However, servers typically perform this function via
   implementation algorithms in conjunction with the storage-related
   properties discussed above.

A.2.2.  WebDAV Support for DECADE Standard Data Transport Protocol
        Primitives

   SDT is used to write objects and read (download) objects from a
   DECADE server.  The object can be either a self-contained object
   such as a multimedia file or a chunk from a P2P system.

A.2.2.1.  Writing Primitives

   Writing involves uploading objects to the server.  WebDAV supports
   PUT and POST as described in Appendix A.1.2.1.  WebDAV LOCK/UNLOCK
   functionality is not needed as DECADE assumes immutable data
   objects.  Therefore, resources cannot be edited and so do not need
   to be locked.  This approach should help to greatly simplify DECADE
   implementations as the LOCK/UNLOCK functionality is quite complex.

A.2.2.2.  Downloading Primitives

   Downloading involves fetching of an object from the server.  WebDAV
   supports GET and HEAD as described in Appendix A.1.2.2.  WebDAV
   LOCK/UNLOCK functionality is not needed as DECADE assumes immutable
   data objects.

A.2.3.  Other Operations

   WebDAV supports DELETE as described in Appendix A.1.4.  In addition,
   WebDAV supports COPY and MOVE methods.  The COPY operation creates a
   duplicate of the source resource identified by the Request-URI, in
   the destination resource identified by the URI in the Destination
   header.

   The MOVE operation on a resource is the logical equivalent of a
   COPY, followed by consistency maintenance processing, followed by a
   delete of the source, where all three actions are performed in a
   single operation.  The consistency maintenance step allows the
   server to perform updates caused by the move, such as updating all
   URLs, other than the Request-URI that identifies the source
   resource, to point to the new destination resource.

   WebDAV also supports the concept of "collections" of resources to
   support joint operations on related objects (e.g., file system
   directories) within a server's namespace.  For example, GET and HEAD
   may be done on a single resource (as in HTTP) or on a collection.
   The MKCOL operation is used to create a new collection.  DECADE may
   find the concept of collections to be useful if there is a need to
   support directory-like structures in DECADE.

   WebDAV servers can be interfaced from an HTML-based user interface
   in a web browser.  However, it is frequently desirable to be able to
   switch from an HTML-based view to a presentation provided by a
   native WebDAV client, directly supporting WebDAV features.  A
   platform-neutral mechanism for doing so is specified in the WebDAV
   protocol for "mounting WebDAV servers" [RFC4709].
   This type of feature may also be attractive for DECADE clients.

A.2.4.  Conclusions

   WebDAV has a rich array of features that can provide a good base for
   DRP and SDT for DECADE.  An initial analysis finds that the
   following WebDAV features will be useful for DECADE:

   -  access control

   -  properties (and PROPFIND operation)

   -  COPY/MOVE operations

   -  collections

   -  mounting WebDAV servers

   It is recommended that the following WebDAV features NOT be used for
   DECADE:

   -  LOCK/UNLOCK

   Finally, some extensions to WebDAV may still be required to meet all
   DECADE requirements.  For example, defining a new WebDAV "time-to-
   live" property may be useful for DECADE.  Further analysis is
   required to fully define the potential extensions to WebDAV to meet
   all DECADE requirements.

Appendix B.  In-Network Storage Components Mapped to DECADE
             Architecture

   In this section we evaluate how the basic components of an in-
   network storage system identified in Section 3 of
   [I-D.ietf-decade-survey] map into the DECADE architecture.

   Note that, wherever possible, complex and/or application-specific
   behavior is delegated to applications rather than handled by tuning
   the storage system.

B.1.  Data Access Interface

   Users can read and write objects of arbitrary size through the
   DECADE Client's Data Controller, making use of a standard data
   transport.

B.2.  Data Management Operations

   Users can move or delete previously stored objects via the DECADE
   Client's Data Controller, making use of a standard data transport.

B.3.  Data Search Capability

   Users can enumerate or search contents of DECADE servers to find
   objects matching desired criteria through services provided by the
   Content Distribution Application (e.g., buffer-map exchanges, a DHT,
   or peer-exchange).
   In doing so, End-Points may consult their local Data Index in the
   DECADE Client's Data Controller.

B.4.  Access Control Authorization

   All methods of access control are supported: public-unrestricted,
   public-restricted and private.  Access Control Policies are
   generated by a Content Distribution Application and provided to the
   DECADE Client's Resource Controller.  The DECADE Server is
   responsible for implementing the access control checks.

B.5.  Resource Control Interface

   Users can manage the resources (e.g., bandwidth) on the DECADE
   server that can be used by other Application End-Points.  Resource
   Sharing Policies are generated by a Content Distribution Application
   and provided to the DECADE Client's Resource Controller.  The DECADE
   Server is responsible for implementing the resource sharing
   policies.

B.6.  Discovery Mechanism

   The particular protocol used for discovery is outside the scope of
   this document.  However, options and considerations have been
   discussed in Section 5.5.

B.7.  Storage Mode

   DECADE Servers provide an object-based storage mode.  Immutable data
   objects may be stored at a DECADE server.  Applications may consider
   existing blocks as DECADE data objects, or they may adjust block
   sizes before storing in a DECADE server.

Authors' Addresses

   Richard Alimi
   Google

   Email: ralimi@google.com


   Y. Richard Yang
   Yale University

   Email: yry@cs.yale.edu


   Akbar Rahman
   InterDigital Communications, LLC

   Email: akbar.rahman@interdigital.com


   Dirk Kutscher
   NEC

   Email: dirk.kutscher@neclab.eu


   Hongqiang Liu
   Yale University

   Email: hongqiang.liu@yale.edu