DECADE                                                          R. Alimi
Internet-Draft                                                    Google
Intended status: Informational                                   Y. Yang
Expires: November 22, 2011                               Yale University
                                                               A. Rahman
                                        InterDigital Communications, LLC
                                                             D. Kutscher
                                                                     NEC
                                                                  H. Liu
                                                         Yale University
                                                            May 21, 2011

                          DECADE Architecture
                       draft-ietf-decade-arch-01

Abstract

   Peer-to-peer (P2P) applications have become widely used on the
   Internet today and make up a large portion of the traffic in many
   networks. One technique to improve the network efficiency of P2P
   applications is to introduce storage capabilities within the
   networks. The DECADE Working Group has been formed with the goal of
   developing an architecture to provide this capability.
   This document presents an architecture, discusses the underlying
   principles, and identifies core components and protocols supporting
   the architecture.

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on November 22, 2011.

Copyright Notice

   Copyright (c) 2011 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the BSD License.

Table of Contents

   1.  Introduction
   2.  Entities
     2.1.  DECADE Storage Servers
     2.2.  DECADE Storage Provider
     2.3.  DECADE Content Providers
     2.4.  DECADE Content Consumers
     2.5.  Content Distribution Application
     2.6.  Application End-Point
   3.  Architectural Principles
     3.1.  Decoupled Control/Metadata and Data Planes
     3.2.  Immutable Data Objects
     3.3.  Data Object Identifiers
     3.4.  Explicit Control
     3.5.  Resource and Data Access Control through User Delegation
       3.5.1.  Resource Allocation
       3.5.2.  User Delegations
   4.  System Components
     4.1.  Content Distribution Application
       4.1.1.  Data Assembly
       4.1.2.  Native Protocols
       4.1.3.  DECADE Client
     4.2.  DECADE Server
       4.2.1.  Access Control
       4.2.2.  Resource Scheduling
       4.2.3.  Data Store
     4.3.  Protocols
       4.3.1.  DECADE Resource Protocol
       4.3.2.  Standard Data Transports
     4.4.  Data Sequencing and Naming
       4.4.1.  DECADE Data Object Naming Scheme
       4.4.2.  Application Usage
       4.4.3.  Application Usage Example
     4.5.  Token-based Authentication and Resource Control
     4.6.  In-Network Storage Components Mapped to DECADE Architecture
       4.6.1.  Data Access Interface
       4.6.2.  Data Management Operations
       4.6.3.  Data Search Capability
       4.6.4.  Access Control Authorization
       4.6.5.  Resource Control Interface
       4.6.6.  Discovery Mechanism
       4.6.7.  Storage Mode
   5.  DECADE Protocols
     5.1.  DECADE Resource Protocol (DRP)
       5.1.1.  Controlled Resources
       5.1.2.  Token-based Authentication and Resource Control
       5.1.3.  Status Information
       5.1.4.  Object Properties
     5.2.  Standard Data Transport (SDT)
       5.2.1.  Writing/Uploading Objects
       5.2.2.  Downloading Objects
   6.  Server-to-Server Protocols
     6.1.  Operational Overview
   7.  Potential Optimizations
     7.1.  Pipelining to Avoid Store-and-Forward Delays
     7.2.  Deduplication
       7.2.1.  Traffic Deduplication
       7.2.2.  Cross-Server Storage Deduplication
   8.  Security Considerations
   9.  IANA Considerations
   10. Informative References
   Appendix A.  Appendix: Evaluation of Some Candidate Existing
                Protocols for DECADE DRP and SDT
     A.1.  HTTP
       A.1.1.  HTTP Support for DECADE Resource Protocol Primitives
       A.1.2.  HTTP Support for DECADE Standard Transport Protocol
               Primitives
       A.1.3.  Traffic Deduplication Primitives
       A.1.4.  Other Operations
       A.1.5.  Conclusions
     A.2.  WEBDAV
       A.2.1.  WEBDAV Support for DECADE Resource Protocol Primitives
       A.2.2.  WebDAV Support for DECADE Standard Transport Protocol
               Primitives
       A.2.3.  Other Operations
       A.2.4.  Conclusions
   Authors' Addresses

1.  Introduction

   Peer-to-peer (P2P) applications have become widely used on the
   Internet today to distribute content, and they contribute a large
   portion of the traffic in many networks. The DECADE Working Group
   has been formed with the goal of developing an architecture to
   introduce in-network storage to be used by such applications, to
   achieve more efficient content distribution. In many subscriber
   networks, it is typically more expensive to upgrade network
   equipment in the "last mile", because it can involve replacing
   equipment and upgrading wiring at individual homes, businesses, and
   devices such as DSLAMs and CMTSs. On the other hand, it can be
   cheaper to upgrade the core infrastructure, which involves fewer
   components that are shared by many subscribers.
   See [I-D.ietf-decade-problem-statement] for a more complete
   discussion of the problem domain and of the capabilities to be
   provided by DECADE.

   This document presents a potential architecture for providing
   in-network storage that can be integrated into content distribution
   applications. The primary focus is P2P-based content distribution,
   but the architecture may be useful to other applications with similar
   characteristics and requirements. In particular, content
   distribution applications that split data into smaller pieces for
   distribution may be able to utilize DECADE.

   The design philosophy of the DECADE architecture is to provide only
   the core functionalities that are needed for applications to make use
   of in-network storage. With such core functionalities, the protocol
   may be simpler and easier to support by storage providers. If more
   complex functionalities are needed by a certain application or class
   of applications, they may be layered on top of the DECADE protocol.

   The DECADE protocol will leverage existing transport and application
   layer protocols and will be designed to work with a small set of
   alternative IETF protocols.

   This document proceeds in two steps. First, it details the core
   architectural principles that can guide the DECADE design. Next,
   given these core principles, this document presents the core
   components of the DECADE architecture and identifies usage of
   existing protocols and where there is a need for new protocol
   development.

   This document will be updated to track the progress of the DECADE
   survey [I-D.ietf-decade-survey] and requirements
   [I-D.ietf-decade-reqs] drafts.

2.  Entities

2.1.  DECADE Storage Servers

   DECADE storage servers are operated by DECADE storage providers and
   provide the DECADE functionality as specified in this document,
   including mechanisms to store, retrieve and manage data. A storage
   provider typically operates multiple storage servers.

2.2.  DECADE Storage Provider

   A DECADE in-network storage provider deploys and/or manages DECADE
   servers within a network. A storage provider may also own or manage
   the network in which the DECADE servers are deployed.

   A DECADE storage provider, possibly in cooperation with one or more
   network providers, determines deployment locations for DECADE servers
   and determines the available resources for each.

2.3.  DECADE Content Providers

   DECADE content providers access DECADE storage servers (by way of a
   DECADE client) to upload and manage data. A content provider can
   access one or more storage servers. A content provider may be a
   single process or a distributed application (e.g., in a P2P
   scenario).

2.4.  DECADE Content Consumers

   DECADE content consumers access storage servers (by way of a DECADE
   client) to download data that has previously been stored by a content
   provider. A content consumer can access one or more storage servers.
   A content consumer may be a single process or a distributed
   application (e.g., in a P2P scenario). An instance of a distributed
   application, such as a P2P application, may both provide content to
   and consume content from DECADE storage servers.

2.5.  Content Distribution Application

   A content distribution application is a distributed application
   designed for dissemination of possibly large data to multiple
   consumers. Content Distribution Applications typically divide
   content into smaller immutable blocks for dissemination.

   The term Application Developer refers to the developer of a
   particular Content Distribution Application.

2.6.  Application End-Point

   An Application End-Point is an instance of a Content Distribution
   Application that makes use of DECADE server(s). A particular
   Application End-Point may be a DECADE Content Provider, a DECADE
   Content Consumer, or both.

   An Application End-Point need not be an active member of a "swarm" to
   interact with the DECADE storage system. That is, an End-Point may
   interact with the DECADE storage servers as an offline activity.

3.  Architectural Principles

   We identify the following key principles.

3.1.  Decoupled Control/Metadata and Data Planes

   The DECADE infrastructure is intended to support multiple content
   distribution applications. A complete content distribution
   application implements a set of control and management functions
   including content search, indexing and collection, access control, ad
   insertion, replication, request routing, and QoS scheduling. An
   observation of DECADE is that different content distribution
   applications can have unique considerations in designing the control
   and signaling functions:

   o  Metadata Management: Traditional file systems provide a standard
      metadata abstraction: a recursive structure of directories to
      offer namespace management; each file is an opaque byte stream.
      In content distribution, applications may use different metadata
      management schemes. For example, one application may use a
      sequence of blocks (e.g., for file sharing), while another
      application may use a sequence of frames (with different sizes)
      indexed by time. As a further example, Apple's HTTP Live
      Streaming uses a dynamic playlist to allow switching of frames
      encoded at different encoding rates.

   o  Resource and Access Control: For example, a major competitive
      advantage of many successful P2P systems is their substantial
      expertise in achieving highly efficient utilization of peer and
      infrastructural resources.
      For instance, many live P2P systems have their own algorithms for
      constructing topologies to achieve low-latency, high-bandwidth
      streaming. They continue to fine-tune such algorithms.

   Given the diversity of control-plane functions, in-network storage
   should export basic mechanisms and allow as much flexibility as
   possible to the control planes to implement specific policies. This
   conforms to the end-to-end systems principle and allows innovation
   and satisfaction of specific business goals.

   Specifically, in the DECADE architecture, the control plane focuses
   on the application-specific, complex, and/or processing-intensive
   functions while the data plane provides storage and data transport
   functions.

   o  Control plane: Signals details of where the data is to be
      downloaded from. The control signals may also include the time,
      quality of service, and receivers of the download. It also
      provides higher-layer metadata management functions such as
      defining the sequence of data blocks forming a higher-layer
      content object. These are behaviors designed and implemented by
      the Application. The term Application is used here in a broad
      sense that includes other control plane protocols.

   o  Data plane: Stores and transfers basic data objects as instructed
      by the Application's Control Plane.

   Decoupling the control plane from the data plane is not new. For
   example, OpenFlow is an implementation of this principle for Internet
   routing, where the computation of the forwarding table and the
   application of the forwarding table are separated. Google File
   System applies the principle to file system design, by utilizing the
   Master to handle the metadata management, and the Chunk Servers to
   handle the data plane functions (i.e., read and write of chunks of
   data). NFSv4 also implements this principle.
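
   As a rough illustration of this split (a sketch only, not part of the
   DECADE specification), the Python fragment below models a data plane
   that stores opaque blocks under names chosen by the application, and
   a control plane that records the ordered list of block names making
   up a content object. The names DataPlane, ControlPlane, publish and
   fetch, the "id:sequence" block names, and the 4-byte block size are
   all assumptions made for this example.

```python
class DataPlane:
    """Hypothetical in-network store: holds opaque, immutable blocks by name."""

    def __init__(self):
        self._blocks = {}

    def put(self, name, data):
        # Immutable objects: a name is written once and never overwritten.
        if name in self._blocks:
            raise ValueError("object already exists: " + name)
        self._blocks[name] = bytes(data)

    def get(self, name):
        return self._blocks[name]


class ControlPlane:
    """Hypothetical application logic: owns the metadata, not the data."""

    def __init__(self, store, block_size=4):
        self.store = store
        self.block_size = block_size
        self.sequences = {}  # content id -> ordered list of block names

    def publish(self, content_id, payload):
        # Split the payload into blocks and remember their order.
        names = []
        for seq, start in enumerate(range(0, len(payload), self.block_size)):
            name = "%s:%d" % (content_id, seq)
            self.store.put(name, payload[start:start + self.block_size])
            names.append(name)
        self.sequences[content_id] = names
        return names

    def fetch(self, content_id):
        # Reassembly is driven entirely by control-plane metadata.
        return b"".join(self.store.get(n) for n in self.sequences[content_id])


store = DataPlane()
app = ControlPlane(store)
block_names = app.publish("movie42", b"hello decade")
restored = app.fetch("movie42")
```

   The store never learns that the three blocks belong together; only
   the application's metadata ties them into one content object.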

   Note that applications may have different Data Plane implementations
   in order to support particular requirements (e.g., low latency). In
   order to provide interoperability, the DECADE architecture does not
   intend to enable arbitrary data transport protocols. However, the
   architecture may allow for more than one data transport protocol to
   be used.

   Also note that although an application's existing control plane
   functions remain implemented within the application, the particular
   implementation may need to be adjusted to support DECADE.

3.2.  Immutable Data Objects

   Bulk content that is to be broadly distributed is typically
   immutable: once a piece of content is generated, it is rarely
   modified. It is uncommon for bulk content such as video frames and
   images to need modification after distribution.

   Many content distribution applications divide content objects into
   blocks for two reasons: (1) multipath: different blocks may be
   fetched from different content sources in parallel, and (2) faster
   recovery and verification: individual blocks may be recovered and
   verified. Typically, applications use a block size larger than a
   single packet in order to reduce control overhead.

   Common applications whose content matches this model include P2P
   streaming (live and video-on-demand) and P2P file-sharing content.
   However, other types of applications may also match this model.

   DECADE adopts a design in which immutable data objects may be stored
   at a storage server. Applications may consider existing blocks as
   DECADE data objects, or they may adjust block sizes before storing in
   a DECADE server.

   Focusing on immutable data blocks in the data plane can substantially
   simplify the data plane design, since consistency requirements can be
   relaxed. It also allows effective reuse of data blocks and
   de-duplication of redundant data.

   Depending on its specific requirements, an application may store data
   in DECADE servers such that each data object is completely
   self-contained (e.g., a complete, independently decodable video
   segment). An application may also divide data into chunks that
   require application-level assembly. The DECADE architecture and
   protocols are agnostic to the nature of the data objects and do not
   specify a fixed size for them.

   Note that immutable content may still be deleted. Also note that
   immutable data blocks do not imply that contents cannot be modified.
   For example, a metadata management function of the control plane may
   associate a name with a sequence of immutable blocks. If one of the
   blocks is modified, the metadata management function changes the
   mapping of the name to a new sequence of immutable blocks.

3.3.  Data Object Identifiers

   Objects that are stored in a DECADE storage server can be accessed by
   DECADE content consumers by a resource identifier that has been
   assigned within a certain application context.

   Because a DECADE content consumer can access more than one storage
   server within a single application context, a data object that is
   replicated across different storage servers managed by a DECADE
   storage provider can be accessed by a single identifier.

   Note that since data objects are immutable, it is possible to support
   persistent identifiers for data objects.

3.4.  Explicit Control

   To support the functions of an application's control plane,
   applications must be able to know and control which data is stored at
   particular locations. Thus, in contrast with content caches,
   applications are given explicit control over the placement (selection
   of a DECADE server), deletion (or expiration policy), and access
   control for stored data.

   Consider deletion/expiration policy as a simple example.
   An application may require that a DECADE server store content for a
   relatively short period of time (e.g., for live-streaming data).
   Another application may need to store content for a longer duration
   (e.g., for video-on-demand).

3.5.  Resource and Data Access Control through User Delegation

   DECADE provides a shared infrastructure to be used by multiple
   tenants of multiple content distribution applications. Thus, it
   needs to provide both resource and data access control.

3.5.1.  Resource Allocation

   There are two primary interacting entities in the DECADE
   architecture. First, Storage Providers control where DECADE storage
   servers are provisioned and their total available resources. Second,
   Applications control data transfers amongst available DECADE servers
   and between DECADE servers and end-points. A form of isolation is
   required to enable concurrently-running Applications to each
   explicitly manage their own content and share of resources at the
   available servers.

   The Storage Provider delegates the management of the resources at a
   DECADE server to one or more applications. Applications are able to
   explicitly and independently manage their own shares of resources.

3.5.2.  User Delegations

   Storage providers have the ability to explicitly manage the entities
   allowed to utilize the resources at a DECADE server. This capability
   is needed for reasons such as capacity-planning and legal
   considerations in certain deployment scenarios.

   To provide a scalable way to manage applications granted resources at
   a DECADE server, we consider an architecture that adds a layer of
   indirection. Instead of granting resources to an application, the
   DECADE server grants a share of the resources to a user. The user
   may in turn share the granted resources amongst multiple
   applications. The share of resources granted by a storage provider
   is called a User Delegation.
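
   The two-level grant described above could be modelled roughly as
   follows. This is a minimal sketch under stated assumptions: the
   classes StorageServer and UserDelegation, their methods, and the use
   of storage bytes as the only resource are hypothetical and are not
   defined by this document.

```python
class UserDelegation:
    """Hypothetical record of the resources a provider grants to one user."""

    def __init__(self, user, storage_bytes):
        self.user = user
        self.storage_bytes = storage_bytes
        self.app_shares = {}  # application name -> bytes assigned by the user

    def unassigned(self):
        return self.storage_bytes - sum(self.app_shares.values())

    def assign(self, app, nbytes):
        # The user, not the provider, decides how the grant is divided.
        current = self.app_shares.get(app, 0)
        if nbytes - current > self.unassigned():
            raise ValueError("share exceeds the user's delegation")
        self.app_shares[app] = nbytes


class StorageServer:
    """Hypothetical server that only tracks per-user grants."""

    def __init__(self, capacity_bytes):
        self.capacity_bytes = capacity_bytes
        self.delegations = {}  # user -> UserDelegation

    def delegate(self, user, nbytes):
        if user in self.delegations:
            raise ValueError("user already has a delegation")
        granted = sum(d.storage_bytes for d in self.delegations.values())
        if granted + nbytes > self.capacity_bytes:
            raise ValueError("server capacity exceeded")
        self.delegations[user] = UserDelegation(user, nbytes)
        return self.delegations[user]


server = StorageServer(capacity_bytes=1000)
alice = server.delegate("alice", 400)
alice.assign("p2p-streaming", 300)
alice.assign("file-sharing", 100)
remaining = alice.unassigned()
```

   The server keeps one entry per user regardless of how many
   applications that user runs, which is the scalability benefit of the
   extra layer of indirection.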

   A User Delegation may be granted to an end-user (e.g., an ISP
   subscriber), a Content Provider, or an Application Provider. A
   particular instance of an application may make use of the storage
   resources:

   o  granted to the end-user (with the end-user's permission),

   o  granted to the Content Provider (with the Content Provider's
      permission), and/or

   o  granted to the Application Provider.

4.  System Components

   The primary focus of the current version of this document is on the
   architectural principles. The detailed system components will be
   discussed in the next document revision.

   This section presents an overview of the components in the DECADE
   architecture.

   [Figure 1 (diagram not reproduced here). It shows an Application
   End-Point containing App-Layer Algorithms, App Data Assembly and
   Sequencing, and a DECADE Client. The DECADE Client comprises a
   Resource Controller (Data Access Policy, Resource Sharing Policy)
   and a Data Controller (Data Scheduling, Data Index). The
   Application End-Point communicates with other Application
   End-Points over native application protocol(s), and with DECADE
   Server(s) over the DECADE Resource Protocol (DRP) and a Standard
   Data Transport (SDT). A DECADE Server comprises Access Control,
   Resource Scheduling, a Data Store, and User Delegation Management,
   and communicates with other DECADE Servers over DRP and SDT.]

                Figure 1: DECADE Architecture Components

   A component diagram of the DECADE architecture is displayed in
   Figure 1. The diagram illustrates the major components of a Content
   Distribution Application related to DECADE, as well as the functional
   components of a DECADE Server.

   To keep the scope narrow, we only discuss the primary components
   related to protocol development. Particular deployments may require
   additional components (e.g., monitoring and accounting at a DECADE
   server), but they are intentionally omitted from the current version
   of this document.

4.1.  Content Distribution Application

   Content Distribution Applications have many functional components.
   For example, many P2P applications have components for overlay
   topology management, piece selection, etc. In supporting DECADE, it
   may be advantageous to consider DECADE within some of these
   components.
   However, in this architecture document, we focus on the components
   directly employed to support DECADE.

4.1.1.  Data Assembly

   DECADE is primarily designed to support applications that can divide
   distributed contents into immutable data objects. To accomplish
   this, applications include a component responsible for creating the
   individual data objects before distribution and then re-assembling
   data objects at the Content Consumer. We call this component
   Application Data Assembly. The specific implementation is entirely
   decided by the application.

   In producing and assembling the data objects, two important
   considerations are sequencing and naming. The DECADE architecture
   assumes that applications implement this functionality themselves.

   For example, a Content Distribution Application might divide a single
   piece of content (e.g., a finite-length file or a live stream) into
   multiple data objects with names of the form
   "CONTENT_ID:SEQUENCE_NUMBER" where CONTENT_ID identifies the
   particular content (e.g., a particular movie or TV channel
   distributed by the application), and SEQUENCE_NUMBER both identifies
   an individual data object and determines its order when a client
   reconstructs individual data objects into the full content.

4.1.2.  Native Protocols

   Applications may still use existing protocols. In particular, an
   application may reuse existing protocols primarily for
   control/signaling. An application may also retain its existing data
   transport protocols in addition to using DECADE for data transport.
   This can be important for applications that are designed to be
   highly robust (e.g., if DECADE servers are unavailable).

4.1.3.  DECADE Client

   An application may be modified to support DECADE. We call the layer
   providing the DECADE support to an application the DECADE Client.
   It is important to note that a DECADE Client need not be embedded
   into an application. It could be implemented as a standalone
   component, or could be integrated into other entities such as
   network devices themselves.

4.1.3.1.  Resource Controller

   Applications may have different resource sharing policies and data
   access policies to control their resources and data in DECADE
   servers. These policies can be existing policies of applications
   (e.g., tit-for-tat) or custom policies adapted for DECADE. The
   specific implementation is decided by the application.

4.1.3.2.  Data Controller

   DECADE is designed to decouple the control and the data transport of
   applications. Data transport between applications and DECADE servers
   uses standard data transport protocols. The Data Controller may need
   to schedule the data being transferred according to network
   conditions, available DECADE Servers, and/or available DECADE Server
   resources. An index indicates data available at remote DECADE
   servers. The index (or a subset of it) may be advertised to other
   Application End-Points.

4.2.  DECADE Server

   The DECADE server is an important functional component of DECADE. It
   stores data from Application End-Points, and provides control of and
   access to that data to Application End-Points. Note that a DECADE
   server is not necessarily a single physical machine; it could also be
   implemented as a cluster of machines.

4.2.1.  Access Control

   An Application End-Point can access its own data or another
   Application End-Point's data (provided sufficient authorization) in
   DECADE servers. Application End-Points may also authorize other
   End-Points to store data. If an access is authorized by an
   Application End-Point, the DECADE Server will provide access.
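
   A minimal sketch of this authorization check is shown below. The
   architecture's own mechanism is token-based (Section 4.5); the
   in-memory grant table used here, along with the class AccessControl
   and the "read"/"write" operation names, is an assumption chosen only
   to keep the illustration short.

```python
class AccessControl:
    """Hypothetical per-server table of grants made by data owners."""

    OPERATIONS = ("read", "write")

    def __init__(self):
        self._grants = set()  # (owner, grantee, operation)

    def grant(self, owner, grantee, operation):
        # Only the owning End-Point hands out access to its data.
        if operation not in self.OPERATIONS:
            raise ValueError("unknown operation: " + operation)
        self._grants.add((owner, grantee, operation))

    def is_authorized(self, requester, owner, operation):
        # End-Points can always reach their own data.
        if requester == owner:
            return True
        return (owner, requester, operation) in self._grants


acl = AccessControl()
acl.grant("alice", "bob", "read")
bob_can_read = acl.is_authorized("bob", "alice", "read")
bob_can_write = acl.is_authorized("bob", "alice", "write")
```

   Here "bob" may download objects stored by "alice" but may not store
   objects on her behalf until she grants "write" as well.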
606 Note that even if a request is authorized, it may still fail to 607 complete due to insufficient resources by either the requesting 608 Application End-Point or the providing Application End-Point. 610 4.2.2. Resource Scheduling 612 Applications may apply their existing resource sharing policies or 613 use a custom policy for DECADE. DECADE servers perform resource 614 scheduling according to the resource sharing policies indicated by 615 Application End-Points as well as configured User Delegations. 617 Access control and resource control are separated in DECADE server. 618 It is possible that an Application End-Point provides only access to 619 its data without any resources. In order to access this data, 620 another Application End-Point may use the granted access along with 621 its own available resources to store or retrieve data from a DECADE 622 Server. 624 4.2.3. Data Store 626 Data from applications may be stored into disks. Data can be deleted 627 from disks either explicitly or automatically (e.g., after a TTL). 628 It may be possible to perform optimizations in certain cases, such as 629 avoiding writing temporary data (e.g., live streaming) to a disk. 631 4.3. Protocols 633 The DECADE Architecture uses two protocols. First, the DECADE 634 Resource Protocol is responsible for communicating access control and 635 resource scheduling policies to the DECADE Server. Second, standard 636 data transport protocols (e.g., WebDAV or NFS) are used to transfer 637 data objects to and from a DECADE Server. The DECADE architecture 638 will specify a small number of Standard Data Transport instances. 640 Decoupling the protocols in this way allows DECADE to both directly 641 utilize existing standard data transports and to evolve 642 independently. 644 It is also important to note that the two protocols do not need to be 645 separate on the wire. 
For example, the DECADE Resource Protocol 646 messages may be piggybacked within the extension fields provided by 647 certain data transport protocols. However, this document considers 648 them as two separate, logical functional components for clarity. 650 4.3.1. DECADE Resource Protocol 652 The DECADE Resource Protocol is responsible for communicating both 653 access control and resource sharing policies to DECADE Servers used 654 for data transport. 656 The DECADE architecture specification will provide exactly one DECADE 657 Resource Protocol. 659 4.3.2. Standard Data Transports 661 Existing data transport protocols are used to read data from and write data 662 to a DECADE Server. Protocols under consideration are WebDAV and 663 NFS. 665 4.4. Data Sequencing and Naming 667 In order to provide a simple and generic interface, the DECADE Server 668 is only responsible for storing and retrieving individual data 669 objects. Furthermore, DECADE uses its own simple naming scheme that 670 provides uniqueness (with high probability) between data objects, 671 even across multiple applications. 673 4.4.1. DECADE Data Object Naming Scheme 675 The name of a data object is derived from the hash over the data 676 object's content (the raw bytes), which is made possible by the fact 677 that DECADE objects are immutable. This scheme has multiple appealing 678 properties: 680 o Simple integrity verification 682 o Unique names (with high probability) 684 o Application independent, without a new IANA-maintained registry 686 The DECADE naming scheme also includes a "type" field. The "type" 687 identifier indicates that the name is the hash of the data object's 688 content and the particular hashing algorithm used. This allows the 689 DECADE protocol to evolve by either changing the hashing algorithm 690 (e.g., if security vulnerabilities with an existing hashing algorithm 691 are discovered), or moving to a different naming scheme altogether.
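The hash-based naming described above can be sketched as follows. This is only an illustration under stated assumptions: the "TYPE:HEXDIGEST" layout, the type identifiers "sha256" and "sha1", and the function names decade_name and verify_name are hypothetical, since this document leaves the concrete name format to the protocol specification.

```python
import hashlib

# Hypothetical "type" identifiers mapped to hash constructors.
# The actual identifiers and encoding are not defined by this document.
HASH_TYPES = {
    "sha256": hashlib.sha256,
    "sha1": hashlib.sha1,
}


def decade_name(content: bytes, hash_type: str = "sha256") -> str:
    """Derive a name from the raw bytes of an immutable data object.

    The assumed layout is "<type>:<hex digest>", so the type field tells
    a reader which hashing algorithm produced the digest.
    """
    digest = HASH_TYPES[hash_type](content).hexdigest()
    return f"{hash_type}:{digest}"


def verify_name(name: str, content: bytes) -> bool:
    """Check integrity by recomputing the name from the received bytes."""
    hash_type, _, _ = name.partition(":")
    if hash_type not in HASH_TYPES:
        return False  # unknown type field: cannot verify
    return decade_name(content, hash_type) == name
```

Because the name depends only on the content, any party holding the bytes can recompute it, which is what makes the integrity check simple. Changing the hashing algorithm only changes the type field and the digest, not the calling code.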
693 The specific format of the name (e.g., encoding, hash algorithms, 694 etc) is out of scope of this document, and left for protocol 695 specification. 697 Another advantage of this scheme is that a DECADE client knows the 698 name of a data object before it is completely stored at the DECADE 699 server. This allows for particular optimizations, such as 700 advertising data object while the data object is being stored, 701 removing store-and-forward delays. For example, a DECADE client A 702 may simultaneously begin storing an object to a DECADE server, and 703 advertise that the object is available to DECADE client B. If it is 704 supported by the DECADE server, client B may begin downloading the 705 object before A is finished storing the object. 707 4.4.2. Application Usage 709 Recall from Section 4.1.1 that an Application typically includes its 710 own naming and sequencing scheme. It is important to note that the 711 DECADE naming format does not attempt to replace any naming or 712 sequencing of data objects already performed by an Application; 713 instead, the DECADE naming is intended to apply only to data objects 714 referenced at the DECADE layer. 716 DECADE names are not necessarily correlated with the naming or 717 sequencing used by the Application using a DECADE client. The DECADE 718 client is expected to maintain a mapping from its own data objects 719 and their names to the DECADE data objects and names. Furthermore, 720 the DECADE naming scheme implies no sequencing or grouping of 721 objects, even if this is done at the application layer. 723 Not only does an Application retain its own naming scheme, it may 724 also decide the sizes of data objects to be distributed via DECADE. 725 This is desirable since sizes of data objects may impact Application 726 performance (e.g., overhead vs. data distribution delay), and the 727 particular tradeoff is application-dependent. 729 4.4.3. 
Application Usage Example 731 To illustrate these properties, this section presents multiple 732 examples. 734 4.4.3.1. Application with Fixed-Size Chunks 736 Similar to the example in Section 4.1.1, consider an Application in 737 which each individual application-layer segment of data is called a 738 "chunk" and has a name of the form: "CONTENT_ID:SEQUENCE_NUMBER". 739 Furthermore, assume that the application's native protocol uses 740 chunks of size 16KB. 742 Now, assume that this application wishes to make use of DECADE, and 743 assume that it wishes to store data to DECADE servers in data objects 744 of size 64KB. To accomplish this, it can map a sequence of 4 chunks 745 into a single DECADE object, as shown in Figure 2. 747 Application Chunks 748 .---------.---------.---------.---------.---------.---------.---------.-- 749 | | | | | | | | 750 | Chunk_0 | Chunk_1 | Chunk_2 | Chunk_3 | Chunk_4 | Chunk_5 | Chunk_6 | 751 | | | | | | | | 752 `---------`---------`---------`---------`---------`---------`---------`-- 754 DECADE Data Objects 755 .---------------------------------------.-------------------------------- 756 | | 757 | Object_0 | Object_1 758 | | 759 `---------------------------------------`-------------------------------- 761 Figure 2: Mapping Application Chunks to DECADE Data Objects 763 In this example, the Application might maintain a logical mapping 764 that is able to determine the name of a DECADE data object given the 765 chunks contained within that data object. The name might be learned 766 from either the original source, another endpoint with which it 767 is communicating, a tracker, etc. 769 It is important to note that as long as the data contained within 770 each sequence of chunks is unique, the corresponding DECADE data 771 objects have unique names.
This is desired, and happens 772 automatically if a particular Application segments the same stream of 773 data in a different way, including different chunk sizes or 774 different padding schemes. 776 4.4.3.2. Application with Continuous Streaming Data 778 Next, consider an Application whose native protocol retrieves a 779 continuous data stream (e.g., an MPEG2 stream) instead of downloading 780 and redistributing chunks of data. Such an application could segment 781 the continuous data stream to produce either fixed-sized or variable- 782 sized DECADE data objects. 784 Figure 3 shows how a video streaming application might produce 785 variable-sized DECADE data objects such that each DECADE data object 786 contains 10 seconds of video data. 788 Application's Video Stream 789 .------------------------------------------------------------------------ 790 | 791 | 792 | 793 `------------------------------------------------------------------------ 794 ^ ^ ^ ^ ^ 795 | | | | | 796 0 Seconds 10 Seconds 20 Seconds 30 Seconds 40 Seconds 797 0 B 400 KB 900 KB 1200 KB 1500 KB 799 DECADE Data Objects 800 .--------------.--------------.--------------.--------------.------------ 801 | | | | | 802 | Object_0 | Object_1 | Object_2 | Object_3 | 803 | (400 KB) | (500 KB) | (300 KB) | (300 KB) | 804 `--------------`--------------`--------------`--------------`------------ 806 Figure 3: Mapping a Continuous Data Stream to DECADE Data Objects 808 Similar to the previous example, the Application might maintain a 809 mapping that is able to determine the name of a DECADE data object 810 given the time offset of the video chunk. 812 4.5. Token-based Authentication and Resource Control 814 A primary use case for DECADE is a DECADE Client authorizing other 815 DECADE Clients to store or retrieve data objects from its DECADE 816 storage. To support this, DECADE uses a token-based authentication 817 scheme.
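As a rough illustration of such a token-based scheme, the sketch below issues a token and validates it on the server side. Everything concrete in it is an assumption made for the example: the field names ("ops", "objects", "expires") loosely follow the token fields listed in Section 5.1.2, the JSON encoding is arbitrary, and an HMAC over a shared key stands in for the digital signature, whose actual form this document does not define. The function names issue_token and validate_token are hypothetical.

```python
import hashlib
import hmac
import json


def _payload(body: dict) -> bytes:
    # Canonical encoding so issuer and server sign/verify identical bytes.
    return json.dumps(body, sort_keys=True).encode("utf-8")


def issue_token(key: bytes, operations, objects, expires_at: float) -> dict:
    """Create a token restricting operations, objects and lifetime.

    HMAC-SHA256 is a stand-in for the digital signature (assumption).
    """
    body = {
        "ops": sorted(operations),
        "objects": sorted(objects),
        "expires": expires_at,
    }
    sig = hmac.new(key, _payload(body), hashlib.sha256).hexdigest()
    return {"body": body, "sig": sig}


def validate_token(key: bytes, token: dict, operation: str,
                   object_name: str, now: float) -> bool:
    """Server-side check: signature first, then the requested operation."""
    expected = hmac.new(key, _payload(token["body"]), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, token["sig"]):
        return False  # forged or modified token
    body = token["body"]
    return (now < body["expires"]
            and operation in body["ops"]
            and object_name in body["objects"])
```

In this sketch the server keeps no per-object permission state; the decision is made entirely from the token presented with the request.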
819 In particular, an entity trusted by a DECADE Client generates a 820 digitally-signed token with particular properties (see Section 5.1.2 821 for details). The DECADE Client distributes this token to other 822 DECADE Clients which then use the token when sending requests to the 823 DECADE Server. Upon receiving a token, the DECADE Server validates 824 the signature and the operation being performed. 826 This is a simple scheme, but has multiple important advantages over 827 an alternate approach in which a DECADE Client explicitly manipulates 828 an Access Control List (ACL) associated with each DECADE data object. 829 In particular, it has the following advantages when applied to 830 DECADE's target applications: 832 o Authorization policies are implemented within the Application; an 833 Application explicitly controls when tokens are generated and to 834 whom they are distributed. 836 o Fine-grained access and resource control can be applied to data 837 objects; see Section 5.1.2 for the list of restrictions that can 838 be enforced with a token. 840 o There is no messaging between a DECADE Client and DECADE Server to 841 manipulate data object permissions. This can simplify, in 842 particular, Applications which share data objects with many 843 dynamic peers and need to frequently adjust access control 844 policies attached to DECADE data objects. 846 o Tokens can provide anonymous access, in which a DECADE Server does 847 not need to know the identity of each DECADE Client that accesses 848 it. This enables a DECADE Client to send tokens to DECADE Clients 849 in other administrative or security domains, and allow them to 850 read or write data objects from its DECADE storage. 852 It is important to note that, in addition to DECADE Clients applying 853 access control policies to DECADE data objects, the DECADE Server may 854 be configured to apply additional policies based on user, object, 855 geographic location, etc. 
Defining such policies is out of scope of 856 the DECADE Working Group, but in such a case, a DECADE Client may be 857 denied access even though it possesses a valid token. 859 4.6. In-Network Storage Components Mapped to DECADE Architecture 861 In this section we evaluate how the basic components of an in-network 862 storage system identified in Section 3 of [I-D.ietf-decade-survey] 863 map into the DECADE architecture. 865 It is important to note that, wherever possible, complex and/or application-specific 866 behavior is delegated to applications rather than handled by tuning the storage 867 system. 869 4.6.1. Data Access Interface 871 Users can read and write objects of arbitrary size through the DECADE 872 Client's Data Controller, making use of a standard data transport. 874 4.6.2. Data Management Operations 876 Users can move or delete previously stored objects via the DECADE 877 Client's Data Controller, making use of a standard data transport. 879 4.6.3. Data Search Capability 881 Users can enumerate or search contents of DECADE servers to find 882 objects matching desired criteria through services provided by the 883 Content Distribution Application (e.g., buffer-map exchanges, a DHT, 884 or peer-exchange). In doing so, End-Points may consult their local 885 data index in the DECADE Client's Data Controller. 887 4.6.4. Access Control Authorization 889 All methods of access control are supported: public-unrestricted, 890 public-restricted and private. Access Control Policies are generated 891 by a Content Distribution Application and provided to the DECADE 892 Client's Resource Controller. The DECADE Server is responsible for 893 implementing the access control checks. 895 4.6.5. Resource Control Interface 897 Users can manage the resources (e.g., bandwidth) on the DECADE server 898 that can be used by other Application End-Points. Resource Sharing 899 Policies are generated by a Content Distribution Application and 900 provided to the DECADE Client's Resource Controller.
The DECADE 901 Server is responsible for implementing the resource sharing policies. 903 4.6.6. Discovery Mechanism 905 This is outside the scope of the DECADE architecture. However, it is 906 expected that DNS or some other well-known protocol will be used by 907 users to discover the DECADE servers. 909 4.6.7. Storage Mode 911 DECADE Servers provide an object-based storage mode. Immutable data 912 objects may be stored at a DECADE server. Applications may consider 913 existing blocks as DECADE data objects, or they may adjust block 914 sizes before storing in a DECADE server. 916 5. DECADE Protocols 918 This section specifies the DECADE Resource Protocol (DRP) and the 919 Standard Data Transport (SDT) in terms of abstract protocol 920 interactions that are intended to be mapped to specific protocols. Note 921 that while the protocols are logically separate, DRP is specified as 922 being carried through extension fields within an SDT (e.g., HTTP 923 headers). 925 The DRP is the protocol used by a DECADE client to configure the 926 resources and authorization used to satisfy requests (reading, 927 writing, and management operations concerning DECADE objects) at a 928 DECADE server. The SDT is used to send the operations to the DECADE 929 server. Necessary DRP metadata is supplied using mechanisms in the 930 SDT that are provided for extensibility (e.g., additional request 931 parameters or extension headers). 933 5.1. DECADE Resource Protocol (DRP) 935 DRP provides configuration of access control and resource sharing 936 policies on DECADE servers. A content distribution application, 937 e.g., a live P2P streaming session, MAY employ several DECADE 938 servers, for instance, servers in different operator domains, and DRP 939 allows one instance of such an application, e.g., an application 940 endpoint, to configure access control and resource sharing policies 941 on a set of servers. 943 5.1.1.
Controlled Resources 945 On a single DECADE server, the following resources may be managed: 947 communication resources: DECADE servers have limited communication 948 resources in terms of bandwidth (upload/download) but also in 949 terms of number of connected clients (connections) at a time. 951 storage resources: DECADE servers have limited storage resources. 953 5.1.2. Token-based Authentication and Resource Control 955 DECADE uses a token-based scheme that allows a DECADE Client to 956 authorize other DECADE Clients to perform certain actions (e.g., read 957 or write data objects) on the client's DECADE Server. The token 958 includes the following fields: 960 Permitted operations (e.g., read, write) 962 Permitted objects (e.g., names of data objects that may be read or 963 written) 965 Permitted clients (e.g., as indicated by IP address or other 966 identifier) that may use the token 968 Expiration time 970 Priority for bandwidth given to requested operation 972 Amount of data that may be read or written 974 The particular format for the token is out of scope of this document. 976 The tokens are generated by a trusted entity at the request of a 977 DECADE Client. It is out of scope of this document to identify which 978 entity serves this purpose, but examples include the DECADE Client 979 itself, a DECADE Server trusted by the DECADE Client, or another 980 server managed by a Storage Provider trusted by the DECADE Client. 982 Upon generating a token, a DECADE Client may distribute it to another 983 DECADE Client (e.g., via their native Application protocol). The 984 receiving DECADE Client may then connect to the sending DECADE 985 Client's DECADE Server and perform any operation permitted by the 986 token. The token must be sent along with the operation. The DECADE 987 Server validates the token to identify the DECADE Client that issued 988 it and whether the requested operation is permitted by the contents 989 of the token. 
If the token is successfully validated, the DECADE 990 Server applies the resource control policies indicated in the token 991 while performing the operation. 993 It is possible for DRP to allow tokens to apply to a batch of 994 operations to reduce communication overhead required between DECADE 995 Clients. 997 DRP may also define tokens to include a unique identifier to allow a 998 DECADE Server to detect when a token is used multiple times. 1000 5.1.3. Status Information 1002 DRP provides a request service for status information that DECADE 1003 clients can use to request information from a DECADE server. 1005 status information per application context on a specific server: 1006 Access to such status information requires client authorization, 1007 i.e., DECADE clients need to be authorized to access status 1008 information for a specific application context. This 1009 authorization (and the mapping to application contexts) is based 1010 on the user delegation concept as described in Section 3.5. The 1011 following status information elements can be obtained: 1013 * list of associated objects (with properties) 1015 * resources used/available 1017 * list of servers to which objects have been distributed (in a 1018 certain time-frame) 1020 * list of clients to which objects have been distributed (in a 1021 certain time-frame) 1023 For the list of servers/clients to which objects have been 1024 distributed, the DECADE server can decide on time bounds for 1025 which this information is stored and specify the corresponding 1026 time frame in the response to such requests. Some of this 1027 information can be used for accounting purposes, e.g., the list of 1028 clients to which objects have been distributed.
1030 access information per application context on a specific server: 1031 Access information can be provided for accounting purposes, for 1032 example, when application service providers are interested in 1033 maintaining access statistics for resources and/or performing 1034 accounting per user. Again, access to such information requires 1035 client authorization based on the user delegation concept as 1036 described in Section 3.5. The following access information 1037 elements can be requested: 1039 * which objects have been accessed and how many times 1041 * access tokens that a server has seen for a given object 1043 The DECADE server can decide on time bounds for which this 1044 information is stored and specify the corresponding time frame in 1045 the response to such requests. 1047 5.1.4. Object Properties 1049 Objects that are stored on a DECADE server can provide properties (in 1050 addition to the object identifier and the actual content). Depending 1051 on authorization, DECADE clients may get or set such properties. 1052 This authorization (and the mapping to application contexts) is based 1053 on the user delegation concept as described in Section 3.5. The 1054 DECADE architecture does not limit the set of permissible properties, 1055 but rather specifies a set of baseline properties that SHOULD be 1056 supported by implementations. 1058 TTL: TTL of the object as an absolute time value 1060 object size: in bytes 1062 MIME type 1064 access statistics: how often the object has been accessed (and what 1065 tokens have been used) 1067 5.2. Standard Data Transport (SDT) 1069 A DECADE server provides a data access interface, and SDT is used to 1070 write objects to a server and to read (download) objects from a 1071 server. Semantically, SDT is a client-server protocol, i.e., the 1072 DECADE server always responds to client requests. 1074 5.2.1. Writing/Uploading Objects 1076 For writing objects, a client uploads an object to a DECADE server.
1077 The object on the server will be named (associated with an identifier), 1078 and this name can be used to access (download) the object later, 1079 e.g., the client can pass the name as a reference to other clients 1080 that can then refer to the object. 1082 DECADE objects can be self-contained objects such as multimedia 1083 resources, files etc., but also chunks, such as chunks of a P2P 1084 distribution protocol that can be part of a containing object or a 1085 stream. 1087 A server MUST accept download requests for an object that is still 1088 being uploaded. 1090 The application that originates the objects MUST generate DECADE 1091 object names according to the naming specification in Section 4.4. 1092 The naming scheme ensures that the name is unique (with high probability). DECADE clients 1093 (as parts of application entities) upload a named object to a server, 1094 and a DECADE server MUST NOT change the name. It MUST be possible 1095 for downloading clients to access the object using the original 1096 name. A DECADE server MAY verify the integrity and other security 1097 properties of uploaded objects. 1099 In the following we provide an abstract specification of the upload 1100 operation that we name 'PUT METHOD'. See Appendix A.1 for an example 1101 of how this could be mapped to HTTP. 1103 Method PUT: 1105 Parameters: 1107 NAME: The name of the object according to Section 4.4 1109 OBJECT: The object itself. The protocol MUST provide transparent 1110 binary object transport. 1112 Description: The PUT method is used by a DECADE client to upload an 1113 object with an associated name 'NAME' to a DECADE server. 1115 RESPONSES: The DECADE server MUST respond with one of the following 1116 response messages: 1118 OK: The object has been uploaded successfully and has replaced an 1119 existing object with the same name. 1121 CREATED: The object has been uploaded successfully and is now 1122 available under the specified name.
1124 ERRORs: possible error codes will be specified in a later 1125 version of this document 1127 5.2.2. Downloading Objects 1129 A DECADE client can request named objects from a DECADE server. In 1130 the following, we provide an abstract specification of the download 1131 operation that we name 'GET METHOD'. See Appendix A.1 for an example 1132 of how this could be mapped to HTTP. 1134 Method GET: 1136 Parameters: 1138 NAME: The name of the object according to Section 4.4. 1140 Description: The GET method is used by a DECADE client to download 1141 an object with an associated name 'NAME' from a DECADE server. 1143 RESPONSES: The DECADE server MUST respond with one of the following 1144 response messages: 1146 OK: The request has succeeded, and an entity corresponding to the 1147 requested resource is sent in the response. 1149 ERRORs: 1151 NOTFOUND: The DECADE server has not found anything matching 1152 the requested object name. 1154 Other Errors: TBD in a future version of this document 1156 6. Server-to-Server Protocols 1158 An important feature of DECADE is the capability for one DECADE 1159 server to directly download data objects from another DECADE server. 1160 This capability allows Applications to directly replicate data 1161 objects between servers without requiring end-hosts to use uplink 1162 capacity to upload data objects to a different DECADE server. 1163 Similar to other operations in DRP and SDT, replicating data objects 1164 between DECADE servers is an explicit operation. 1166 To support this functionality, DECADE re-uses the already-specified 1167 protocols to support operations directly between servers. DECADE 1168 servers are not assumed to trust each other, nor are they configured to do 1169 so.
All data operations are performed on behalf of DECADE clients 1170 via explicit instruction, so additional capabilities are needed in 1171 the DECADE client-server protocols. DECADE clients must be able to 1172 indicate to a DECADE server the following additional parameters: 1174 o which remote DECADE server(s) to access; 1176 o the operation to be performed (PUT or GET); and 1178 o credentials indicating permission to perform the operation at the 1179 remote DECADE server. 1181 In this way, a DECADE server is also a DECADE client, and incoming requests 1182 may trigger further requests via that client. The operations are 1183 performed as if the original requestor had its own DECADE client co- 1184 located with the DECADE server. It is this mode of operation that 1185 provides substantial savings in uplink capacity. 1187 6.1. Operational Overview 1189 DECADE's server-to-server support is focused on reading and writing 1190 data objects between DECADE servers. A DECADE GET or PUT request MAY 1191 supply the following additional parameters: 1193 REMOTE_SERVER: Address of the remote DECADE server. The format of 1194 the address is out of scope of this document. 1196 REMOTE_USER: The account at the remote server from which to retrieve 1197 the object (for a GET), or in which the object is to be stored 1198 (for a PUT). 1200 TOKEN: Credentials to be used at the remote server. 1202 These parameters are used by the DECADE server to instantiate a 1203 request to the specified remote server. It is assumed that the data 1204 object referred to at the remote server is the same as in the original 1205 request. It is also assumed that the operation performed at the 1206 remote server is the same as the operation in the original request. 1207 Though explicitly supplying these may provide additional freedom, it 1208 is not clear what benefit they might provide.
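The use of REMOTE_SERVER, REMOTE_USER and TOKEN can be sketched with a small in-memory model. This is a hypothetical illustration, not a protocol mapping: the class DecadeServer, its peers directory, and the plain-string tokens are all assumptions made for the example, and real token validation (Section 5.1.2) is reduced to a dictionary lookup.

```python
class DecadeServer:
    """Toy in-memory server that can act as a client toward a peer."""

    def __init__(self, address, peers):
        self.address = address
        self.peers = peers    # shared directory: address -> DecadeServer
        self.store = {}       # (account, object name) -> bytes
        self.tokens = {}      # token string -> account it grants access to
        peers[address] = self

    def _check(self, user, token):
        # Stand-in for real token validation (assumption).
        if token not in self.tokens or self.tokens[token] != user:
            raise PermissionError("token not valid for this account")

    def get(self, user, name, token,
            remote_server=None, remote_user=None, remote_token=None):
        self._check(user, token)
        if remote_server is None:
            return self.store[(user, name)]
        # Same operation and same object name, issued on the client's behalf.
        remote = self.peers[remote_server]
        return remote.get(remote_user, name, remote_token)

    def put(self, user, name, data, token,
            remote_server=None, remote_user=None, remote_token=None):
        self._check(user, token)
        self.store[(user, name)] = data
        if remote_server is not None:
            # Replicate server-to-server; the end host uploads only once.
            remote = self.peers[remote_server]
            remote.put(remote_user, name, data, remote_token)
```

In this model the remote server sees an ordinary GET or PUT carrying the supplied credentials, which matches the assumption above that the object name and the operation are the same as in the original request.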
1210 Note that when a DECADE client invokes a request on a DECADE server with 1211 these additional parameters, it is giving the DECADE server 1212 permission to act on its behalf. Thus, it would be wise for the 1213 supplied token to have narrow privileges (e.g., limited to only the 1214 necessary data objects) or validity time (e.g., a small expiration 1215 time). 1217 In the case of a GET operation, the DECADE server is to retrieve the 1218 data object from the remote server using the specified credentials 1219 (via a GET request to the remote server), and then return the object 1220 to the client. In the case of a PUT operation, the DECADE server is 1221 to store the object from the client, and then store the object to the 1222 remote server using the specified credentials (via a PUT request to 1223 the remote server). 1225 7. Potential Optimizations 1227 As suggestions for the protocol design and eventual implementations, 1228 we discuss particular optimizations that are enabled by the DECADE 1229 Architecture discussed in this document. 1231 7.1. Pipelining to Avoid Store-and-Forward Delays 1233 A DECADE server may choose to not fully store an object before 1234 beginning to serve it. For example, in the case of a GET request, a 1235 DECADE server may begin to receive a data object from a remote server 1236 or DECADE Client, and immediately begin returning it to the DECADE 1237 client. This pipelining mode avoids store-and-forward delays, which 1238 could be substantial for large objects. A similar behavior could be 1239 used for PUT. 1241 7.2. Deduplication 1243 A common concern amongst Storage Providers is the total volume of 1244 data that needs to be stored. An optimization frequently applied in 1245 existing storage systems is de-duplication, which attempts 1246 to avoid storing identical data multiple times.
DECADE Server 1247 implementations may internally perform de-duplication of data on 1248 disk, but the DECADE architecture enables other forms of de- 1249 duplication. 1251 Note that these techniques may impact protocol design. Discussion of 1252 whether or not they should be adopted is out of scope of this 1253 document. 1255 7.2.1. Traffic Deduplication 1257 7.2.1.1. Rationale 1259 When a DECADE client (A) indicates its DECADE account on a DECADE 1260 server (S) to fetch an object from a remote entity (R) (a DECADE 1261 server or DECADE client) and if the object is already stored locally 1262 in S, S may perform Traffic Deduplication. This means that S does 1263 not download the object from R, which saves network traffic. 1264 Instead, it performs a challenge to make sure that the remote entity 1265 R actually has the object and then replies with its local object copy 1266 directly. 1268 7.2.1.2. Example 1270 As shown in Figure 4 , without Traffic Deduplication, redundant 1271 traffic flows between S and R will be issued if the server already 1272 has the object requested by A. If Traffic Deduplication is enabled, S 1273 only needs to challenge R to verify that it does have the data to 1274 avoid data-stealing attacks. 1276 A S R 1277 +----------+ obj req +------------+ obj req +----------+ 1278 | DECADE |=========>| A's |==========>| Remote | 1279 | CLIENT |<=========| Account |<==========| Entity | 1280 +----------+ obj rsp +------------+ obj rsp +----------+ 1282 (a) Without Traffic Deduplication 1284 A S R 1285 +----------+ obj req +------------+ challenge +----------+ 1286 | DECADE |=========>| A's |---------->| Remote | 1287 | CLIENT |<=========| Account |<----------| Entity | 1288 +----------+ obj rsp +------------+ obj hash +----------+ 1290 (b) With Traffic Deduplication 1292 Figure 4 1294 7.2.1.3. HTTP Compatibility of Challenge 1296 How to integrate traffic deduplication with HTTP is shown in 1297 Appendix A.1.3. 1299 7.2.2. 
Cross-Server Storage Deduplication 1301 The same object might be uploaded multiple times to different 1302 DECADE servers. For storage efficiency, storage providers may desire 1303 a single object to be stored on one or a few servers. They might 1304 design their internal system architecture to achieve that, or simply 1305 redirect the requests to the proper servers. The DECADE protocol supports 1306 redirection of DECADE client requests to further support cross-server 1307 storage deduplication. 1309 8. Security Considerations 1311 This document currently does not contain any security considerations 1312 beyond those mentioned in [I-D.ietf-decade-problem-statement]. 1314 9. IANA Considerations 1316 This document does not have any IANA considerations. 1318 10. Informative References 1320 [RFC2616] Fielding, R., Gettys, J., Mogul, J., Frystyk, H., 1321 Masinter, L., Leach, P., and T. Berners-Lee, "Hypertext 1322 Transfer Protocol -- HTTP/1.1", RFC 2616, June 1999. 1324 [RFC3744] Clemm, G., Reschke, J., Sedlar, E., and J. Whitehead, "Web 1325 Distributed Authoring and Versioning (WebDAV) 1326 Access Control Protocol", RFC 3744, May 2004. 1328 [RFC4331] Korver, B. and L. Dusseault, "Quota and Size Properties for 1329 Distributed Authoring and Versioning (DAV) Collections", 1330 RFC 4331, February 2006. 1332 [RFC4709] Reschke, J., "Mounting Web Distributed Authoring and 1333 Versioning (WebDAV) Servers", RFC 4709, October 2006. 1335 [RFC4918] Dusseault, L., "HTTP Extensions for Web Distributed 1336 Authoring and Versioning (WebDAV)", RFC 4918, June 2007. 1338 [I-D.ietf-decade-problem-statement] 1339 Song, H., Zong, N., Yang, Y., and R. Alimi, "DECoupled 1340 Application Data Enroute (DECADE) Problem Statement", 1341 draft-ietf-decade-problem-statement-03 (work in progress), 1342 March 2011. 1344 [I-D.ietf-decade-survey] 1345 Alimi, R., Rahman, A., and Y. Yang, "A Survey of In- 1346 network Storage Systems", draft-ietf-decade-survey-04 1347 (work in progress), March 2011.
   [I-D.ietf-decade-reqs]
              Yingjie, G., Bryan, D., Yang, Y., and R. Alimi, "DECADE
              Requirements", draft-ietf-decade-reqs-02 (work in
              progress), May 2011.

   [GoogleStorageDevGuide]
              "Google Storage Developer Guide", .

Appendix A. Evaluation of Some Candidate Existing Protocols for DECADE
            DRP and SDT

   In this section we evaluate how well the abstract protocol
   interactions specified in Section 5 for DECADE DRP and SDT can be
   fulfilled by existing protocols such as HTTP and WebDAV.

A.1. HTTP

   HTTP [RFC2616] is a key protocol for the Internet in general and
   especially for the World Wide Web. HTTP is a request-response
   protocol. A typical transaction involves a client (e.g., a web
   browser) requesting content (resources) from a web server. Another
   example is a client storing content on, or deleting content from, a
   server.

A.1.1. HTTP Support for DECADE Resource Protocol Primitives

   DRP provides configuration of access control and resource sharing
   policies on DECADE servers.

A.1.1.1. Access Control Primitives

   Access control requires mechanisms for defining the access policies
   for the server, and then checking the authorization of a user before
   it stores or retrieves content. HTTP supports rudimentary access
   control via "HTTP Secure" (HTTPS). HTTPS is a combination of HTTP
   with SSL/TLS. The main use of HTTPS is to authenticate the server
   and encrypt all traffic between the client and the server. There is
   also a mode that supports client authentication, though this is less
   frequently used.

A.1.1.2. Communication Resource Controls Primitives

   Communications resources include bandwidth (upload/download) and
   number of simultaneous connected clients (connections). HTTP
   supports bandwidth control indirectly through "persistent" HTTP
   connections.
   Persistent HTTP connections allow a client to keep the underlying
   TCP connection to the server open, which permits streaming and
   pipelining (multiple outstanding requests from a given client).

   HTTP does not define a protocol operation for limiting the
   communication resources available to a client. However, servers
   typically perform this function through implementation-specific
   algorithms.

A.1.1.3. Storage Resource Control Primitives

   Storage resources include amount of memory and lifetime of storage.
   HTTP does not allow direct control of storage at the server end
   point. However, HTTP supports caching at intermediate points such as
   a web proxy. For this purpose, HTTP defines cache control mechanisms
   that specify how long and in what situations the intermediate point
   may store and use the content.

A.1.2. HTTP Support for DECADE Standard Transport Protocol Primitives

   SDT is used to write objects to and read (download) objects from a
   DECADE server. The object can be either a self-contained object such
   as a multimedia file or a chunk from a P2P system.

A.1.2.1. Writing Primitives

   Writing involves uploading objects to the server. HTTP supports two
   methods of writing, PUT and POST. In HTTP the object is called a
   resource and is identified by a URI. PUT uploads a resource to a
   specific location on the server. POST, on the other hand, submits
   the object to the server, and the server decides whether to update an
   existing resource or to create a new resource.

   For DECADE, the choice of whether to use PUT or POST will be
   influenced by which entity is responsible for the naming. If the
   client performs the naming, then PUT is appropriate. If the server
   performs the naming, then POST should be used (to allow the server to
   define the URI).

A.1.2.2. Downloading Primitives

   Downloading involves fetching an object from the server.
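
   As a rough illustration of such a fetch, the sketch below models a
   server that answers an HTTP GET with the object plus its headers and
   an HTTP HEAD with the headers only. It is a toy in-memory model, not
   a real HTTP stack and not part of DECADE: the ToyObjectServer class,
   the object name, and the single Content-Length header are all
   assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Response:
    status: int
    headers: dict = field(default_factory=dict)
    body: bytes = b""


class ToyObjectServer:
    """Hypothetical in-memory stand-in for a server reached over HTTP."""

    def __init__(self):
        self._objects = {}

    def store(self, name, data):
        self._objects[name] = data

    def handle(self, method, name):
        data = self._objects.get(name)
        if data is None:
            return Response(404)
        headers = {"Content-Length": str(len(data))}
        if method == "GET":
            # GET returns the metadata and the object itself.
            return Response(200, headers, data)
        if method == "HEAD":
            # HEAD returns the same metadata but no body.
            return Response(200, headers)
        return Response(405)


server = ToyObjectServer()
server.store("/objects/chunk-1", b"example chunk payload")  # assumed name
full = server.handle("GET", "/objects/chunk-1")
meta = server.handle("HEAD", "/objects/chunk-1")
```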
   HTTP supports downloading through the GET and HEAD methods. GET
   fetches a specific resource as identified by the URL. HEAD is
   similar, but fetches only the metadata ("headers") associated with
   the resource and not the resource itself.

A.1.3. Traffic Deduplication Primitives

   To challenge a remote entity for an object, the DECADE server should
   provide a seed number, which the server generates randomly, and ask
   the remote entity to return a hash calculated from the seed number
   and the content of the object. The server MAY also specify the hash
   function that the remote entity should use. HTTP supports the
   challenge message through the GET method. The message type
   ("challenge"), the seed number, and the hash function name are
   placed in the URL. In the reply, the hash is sent in an ETag header.

A.1.4. Other Operations

   HTTP supports deleting content on the server through the DELETE
   method.

A.1.5. Conclusions

   HTTP can provide a rudimentary DRP and SDT for some aspects of
   DECADE, but will not be able to satisfy all the DECADE requirements.
   For example, HTTP does not provide a complete access control
   mechanism, nor does it support storage resource controls at the end
   point server.

   It is possible, however, to envision combining HTTP with a custom
   suite of other protocols to fulfill most of the DECADE requirements
   for DRP and SDT. For example, Google Storage for Developers is built
   using HTTP (with extensive proprietary extensions such as custom HTTP
   headers). Google Storage also uses OAuth 2.0 (for access control) in
   combination with HTTP [GoogleStorageDevGuide].

A.2. WebDAV

   WebDAV [RFC4918] is a protocol for enhanced Web content creation and
   management. It was developed as an extension to HTTP (Appendix A.1).
   WebDAV supports traditional operations for reading from and writing
   to storage, as well as more advanced features such as locking and
   namespace management, which are important when multiple users
   collaborate to author or edit a set of documents. HTTP is a subset
   of WebDAV functionality. Therefore, all the points noted above in
   Appendix A.1 apply implicitly to WebDAV as well.

A.2.1. WebDAV Support for DECADE Resource Protocol Primitives

   DRP provides configuration of access control and resource sharing
   policies on DECADE servers.

A.2.1.1. Access Control Primitives

   Access control requires mechanisms for defining the access policies
   for the server, and then checking the authorization of a user before
   it stores or retrieves content. WebDAV has an Access Control
   Protocol defined in [RFC3744].

   The goal of WebDAV access control is to provide an interoperable
   mechanism for handling discretionary access control for content and
   metadata managed by WebDAV servers. WebDAV defines an Access Control
   List (ACL) per resource. An ACL contains a set of Access Control
   Entries (ACEs), where each ACE specifies a principal (i.e., a user or
   group of users) and a set of privileges that are granted to that
   principal. When a principal tries to perform an operation on that
   resource, the server evaluates the ACEs in the ACL to determine if
   the principal has permission for that operation.

   WebDAV also requires that an authentication mechanism be available
   for the server to validate the identity of a principal. At a
   minimum, all WebDAV compliant implementations are required to support
   HTTP Digest Authentication.

A.2.1.2. Communication Resource Controls Primitives

   Communications resources include bandwidth (upload/download) and
   number of simultaneous connected clients (connections).
   WebDAV supports communication resource control as described in
   Appendix A.1.1.2.

A.2.1.3. Storage Resource Control Primitives

   Storage resources include amount of memory and lifetime of storage.
   WebDAV supports the concept of properties (which are metadata for a
   resource). A property is either "live" or "dead". Live properties
   include cases where a) the value of a property is protected and
   maintained by the server, and b) the value of the property is
   maintained by the client, but the server performs syntax checking on
   submitted values. A dead property has its syntax and semantics
   enforced by the client; the server merely records the value of the
   property.

   WebDAV supports a list of standardized properties [RFC4918] that are
   useful for storage resource control. These include the
   self-explanatory "creationdate" and "getcontentlength" properties.
   There is also an operation called PROPFIND to retrieve all the
   properties defined for the requested URI.

   WebDAV also has a Quota and Size Properties mechanism defined in
   [RFC4331] that can be used for storage control. Specifically, two
   key properties are defined per resource: "quota-available-bytes" and
   "quota-used-bytes".

   WebDAV does not define a protocol operation for storage resource
   control. However, servers typically perform this function through
   implementation-specific algorithms in conjunction with the
   storage-related properties discussed above.

A.2.2. WebDAV Support for DECADE Standard Transport Protocol Primitives

   SDT is used to write objects to and read (download) objects from a
   DECADE server. The object can be either a self-contained object such
   as a multimedia file or a chunk from a P2P system.

A.2.2.1. Writing Primitives

   Writing involves uploading objects to the server. WebDAV supports
   PUT and POST as described in Appendix A.1.2.1.
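
   The PUT/POST choice described in Appendix A.1.2.1 (PUT when the
   client names the object, POST when the server does) can be sketched
   as follows. This is only an illustration under stated assumptions:
   the "/objects" path and both function names are hypothetical, and a
   SHA-256 digest stands in for the DECADE naming scheme, which this
   sketch does not attempt to reproduce.

```python
import hashlib


def client_side_name(data: bytes) -> str:
    # Hypothetical stand-in for a client-generated object name.
    return hashlib.sha256(data).hexdigest()


def build_write_request(data: bytes, client_names_object: bool,
                        base_path: str = "/objects"):
    """Return the (method, target) pair for uploading one object."""
    if client_names_object:
        # The client already knows the name, so it PUTs to that exact URI.
        return ("PUT", f"{base_path}/{client_side_name(data)}")
    # The server assigns the name, so the client POSTs to the collection
    # and lets the server define the resulting URI.
    return ("POST", base_path)


put_request = build_write_request(b"example chunk", client_names_object=True)
post_request = build_write_request(b"example chunk", client_names_object=False)
```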
   WebDAV LOCK/UNLOCK functionality is not needed, as DECADE assumes
   immutable data objects. Resources therefore cannot be edited and do
   not need to be locked. This approach should greatly simplify DECADE
   implementations, as the LOCK/UNLOCK functionality is quite complex.

A.2.2.2. Downloading Primitives

   Downloading involves fetching an object from the server. WebDAV
   supports GET and HEAD as described in Appendix A.1.2.2. WebDAV
   LOCK/UNLOCK functionality is not needed, as DECADE assumes immutable
   data objects.

A.2.3. Other Operations

   WebDAV supports DELETE as described in Appendix A.1.4. In addition,
   WebDAV supports COPY and MOVE methods. The COPY operation creates a
   duplicate of the source resource identified by the Request-URI, in
   the destination resource identified by the URI in the Destination
   header.

   The MOVE operation on a resource is the logical equivalent of a COPY,
   followed by consistency maintenance processing, followed by a delete
   of the source, where all three actions are performed in a single
   operation. The consistency maintenance step allows the server to
   perform updates caused by the move, such as updating all URLs, other
   than the Request-URI that identifies the source resource, to point to
   the new destination resource.

   WebDAV also supports the concept of "collections" of resources to
   support joint operations on related objects (e.g., file system
   directories) within a server's namespace. For example, GET and HEAD
   may be done on a single resource (as in HTTP) or on a collection.
   The MKCOL operation is used to create a new collection. DECADE may
   find the concept of collections useful if there is a need to support
   directory-like structures in DECADE.

   WebDAV servers can be accessed through an HTML-based user interface
   in a web browser.
   However, it is frequently desirable to be able to switch from an
   HTML-based view to a presentation provided by a native WebDAV
   client, which directly supports WebDAV features. A platform-neutral
   mechanism for doing so is specified in the WebDAV protocol for
   "mounting WebDAV servers" [RFC4709]. This type of feature may also
   be attractive for DECADE clients.

A.2.4. Conclusions

   WebDAV has a rich array of features that can provide a good base for
   DRP and SDT for DECADE. An initial analysis finds that the following
   WebDAV features will be useful for DECADE:

   - access control

   - properties (and PROPFIND operation)

   - COPY/MOVE operations

   - collections

   - mounting WebDAV servers

   It is recommended that the following WebDAV features NOT be used for
   DECADE:

   - LOCK/UNLOCK

   Finally, some extensions to WebDAV may still be required to meet all
   DECADE requirements. For example, defining a new WebDAV
   "time-to-live" property may be useful for DECADE. Further analysis
   is required to fully define the potential extensions to WebDAV
   needed to meet all DECADE requirements.

Authors' Addresses

   Richard Alimi
   Google

   Email: ralimi@google.com

   Y. Richard Yang
   Yale University

   Email: yry@cs.yale.edu

   Akbar Rahman
   InterDigital Communications, LLC

   Email: akbar.rahman@interdigital.com

   Dirk Kutscher
   NEC

   Email: dirk.kutscher@neclab.eu

   Hongqiang Liu
   Yale University

   Email: hongqiang.liu@yale.edu