[apps-discuss] APPSDIR review of draft-ietf-decade-arch-04
Carsten Bormann <cabo@tzi.org> Mon, 23 January 2012 01:44 UTC
From: Carsten Bormann <cabo@tzi.org>
Date: Mon, 23 Jan 2012 02:43:47 +0100
Content-Transfer-Encoding: quoted-printable
Message-Id: <A7D68D42-9FCC-4C84-ADC4-62F03696558B@tzi.org>
To: IETF Apps Discuss <apps-discuss@ietf.org>, draft-ietf-decade-arch-04.all@tools.ietf.org
X-Mailer: Apple Mail (2.1251.1)
Cc: decade@ietf.org, SM <sm+ietf@elandsys.com>
Subject: [apps-discuss] APPSDIR review of draft-ietf-decade-arch-04
I have been selected as the Applications Area Directorate reviewer for this draft (for background on APPSDIR, please see http://trac.tools.ietf.org/area/app/trac/wiki/ApplicationsAreaDirectorate).

Please resolve these comments along with any other Last Call comments you may receive. Please wait for direction from your document shepherd or AD before posting a new version of the draft.

Regards, Carsten

---------------------------------

Document: draft-ietf-decade-arch-04
Title: DECADE Architecture
Reviewer: Carsten Bormann
Review Date: 2012-01-22

** Summary: This draft is not ready for publication as an Informational RFC and should be revised before publication.

Note: I decided to review this by reading the architecture document only, to see whether it is able to stand alone. Note that this implies that the review is likely incomplete. Given the cluster of entangled documents this is a part of, I recommend a concerted review of the next version(s).

** Major Issues:

A1) General: Although this is not explicitly said in the introduction, the objective of this document appears to be both:
-- to provide an architecture that will constrain and guide the further work of DECADE;
-- to present the architecture in an introductory, reasonably accessible way, which will facilitate understanding the specific protocol specifications envisaged.

These two (prescriptive vs. descriptive) objectives of this document do conflict, and the conflict is not always managed. In particular, the document goes into considerable detail in describing the protocols, but it is not clear whether this is just illustrating the architecture (as I would expect in an architecture document) or actually constraining the protocol design. E.g.,
-- for the write-through PUT (section 7.1), it is specified that just one target server can be given to the intermediary. Is this an accident or deliberate?
-- For GET, returning the data is optional (section 7.1)?
-- "DRP is specified as being carried through extension fields within an SDT (e.g., HTTP headers)." (section 6). Is it always extension fields or is it sometimes the body? (Well, the HTTP body could be called an extension field of HTTP, too.) I think the point is that the DRP data are mostly piggy-backed on SDT. Why not say that?

A1a) There are a number of places where the architecture is not yet explicit about the role of entities and data objects that it requires to function. Again, the document needs to decide for itself whether these entities and objects are illustrative only or part of the prescriptive elements of the architecture. E.g.,
-- is the "abstract specification of ... operation" in 6.2.1 and 6.2.2 only provided for illustration, or is the architecture limiting itself to exactly these two operations?
-- There appear to be some implicit parameters such as application context?
-- Or, for a PUT, how are metadata such as the expiration time established?
-- Is the introductory sentence of 7 intended to limit the server-to-server interaction to a pull model ("download")?
-- What are the semantics of a third-party (client-to-server-to-server) GET with respect to the middle server? Is the initiating server supposed to execute a local PUT with the result? Or what is its role?

A2) Terminology: The architecture defines a number of terms quite deliberately (section 2), but misses out on a few important ones. Some important roles in the architecture (such as the ticket generating server) are only introduced cursorily, without considering the implications of their existence for the architecture.

A2a) "user" (4.5.2) appears to be a central concept of the architecture, but is fleshed out only very thinly. A related concept might or might not be "account", which is only touched on, or "principal" (used in the appendices only).

A2b) 4.5.2 introduces an "Application Provider" that is used nowhere else. What is that? Is that an important functional entity?

A2c) The capability architecture (the "token" as a data structure, and its interaction with various functional elements) is a central element of the DECADE architecture.
-- See RFC 4949 with respect to the usage of the term "token".
-- The "token generating server" appears to be important, but is not called out in the list of functional elements in 2. How does a client select/find one?
-- The document repeatedly (5.4, 6.1.2) states that a DECADE client must trust the token generating server, but never indicates why.
-- Obviously, the DECADE servers need to trust the tokens. This is not discussed.
-- The token is said to contain data object names, but then it is also meant to be useful for a "batch of operations", some of which may concern data the names of which we don't know yet.
-- How is it useful to "allow a DECADE Server to detect when a token is used multiple times" (what is the server supposed to do when it has detected that?)?
-- Do tokens need a revocation mechanism?

A2d) Sections 4.3 and 6.1.3 use a concept called "application context". Apparently application contexts are quite important for DECADE operations (e.g., 6.1.3 makes clear that "objects" are always associated with an application context); what are these application contexts? Who creates, deletes them? Resource control, access control for them? (Some operations seem to have an application context as an implicit parameter. Assumptions like these need to be spelled out.)

A2e) 3.1: "Let S(A) denote A's DECADE storage server." This concept of ownership is never explained. Is it important?

A3) The appendices provide relatively raw existence proofs that are likely to be overtaken by events in a year from now. Much of these are (overly) brief mini-tutorials for the relevant protocols. Appendix A.3 is about a protocol that itself does not seem to be fully cooked at this point. This is certainly useful material to collect for the WG, but it is not clear that these should be part of this document.
There are lots of additional issues in these appendices, e.g.:

A.1.1.1) HTTPS (where is the reference?) is a security protocol, but does not provide access control.

A.1.1.3) This would need to (at least briefly) examine the interactions between HTTP caching and DECADE protocol operation.

A.1.3) This specifies (?) "In the reply, the hash is sent in an ETAG header." What kind of response are we talking about? 304? Is this really part of the architecture?

A.1.5) Why should the transfer protocol provide the complete access control mechanism? Access control is a local function. Transfer protocols just have to make sure the necessary parameters are in place (and/or may be used for transferring the parameters in the first place). When talking about OAuth 2, add the relevant reference(s).

I have not undertaken to review the appendices in any detail.

A4) Has the architectural thinking converged enough on issues of naming? E.g.,
-- 4.3 seems to imply "resource identifiers" are being used that are the same between different servers.
-- 5.3.1 seems to support this by building names in a predictable way out of hashes. In particular, "a DECADE client knows the name of a data object before it is completely stored at the DECADE server."
-- However, if DECADE is to be used for real-time interactions, some thought needs to be given to the point in time when hash-based data identifiers/names can be generated. A DECADE client that PUTs video to a DECADE server may not have the complete byte-string of a slice in hand when it starts sending, so it can't send a hash-based name at the start. This is likely to have some impact on the protocol mappings possible. (It also makes it less clear that there is a good reason not to support name generation by the server right from the start.)
-- A.2.3 says "DECADE may find the concept of collections to be useful if there is a need to support directory like structures in DECADE." It also discusses WebDAV's MOVE and COPY operations.
What is the point when the name uniquely follows from the content?
-- 6.1.2 says tokens include "Permitted objects (e.g., names of data objects that may be read or written)" and "It is possible for DRP to allow tokens to apply to a batch of operations to reduce communication overhead required between DECADE Clients." Does this require prescience on what the hash values of future slices will be?

A5) Authorization based on IP addresses (6.1.2 "permitted clients") is rarely appropriate.

A6) Much of the information discussed in 6.1.3 will be PII. The architecture must discuss how the protocols will provide the flexibility to cope with different data protection and surveillance regulations. For instance, the level of logging performed by a server may be an important parameter that must be indicated to the client before it starts operation, or some of it may conversely be clandestine.

A7) Please rewrite section 9 from scratch. There is no need to explain fundamentals of cryptographic data structures (assuming that the next version will use terms that can be referenced properly). Instead, actual security considerations of the DECADE architecture must be discussed, e.g., the cache discovery attack mentioned below under M1. More importantly, there needs to be a discussion of the threat model, the trust relationships envisaged, etc. Please see RFC 3552.

** Minor Issues:

M1) Terminology

Beyond the problems listed above, the draft needs an overhaul in its terminology. E.g.:
-- it uses "TTL" as a term for an object expiration time, without ever explaining the term. (What is actually meant is an expiration time, *NOT* a lifetime/duration or hop count that would be analogous to IPv4's use of the abbreviation.)
-- using "data object" as the term for the things saved in a DECADE server is highly confusing. It is not always clear whether the raw byte string or the combination of this and certain metadata is meant. Do NOT use "contents" in its plural form as a synonym for "data objects" (4.2).
Indeed, the document would improve by using "content" very sparingly, only in the overview sections, and being precise about data objects otherwise. (It would be preferable to have a name for the "data objects" that is distinctive from the plain English meaning of that term. E.g., slice.) E.g., while we learn about data objects that they are immutable and not all of the same size, we need consistent terms for the various kinds of metadata used, such as the DECADE metadata that are used in managing the localized storage vs. those metadata that would be visible in the SDT. "If an application wishes to store such metadata persistently within DECADE, it can be stored within data objects themselves." (What does that mean? New, separate objects? Within the existing ones? In the slice byte-string itself?)
-- "data transport protocol" contains the term "transport protocol" which means something different in the IETF. We tend to use "transfer protocol" for the purpose intended.
-- 4.4 introduces a "location". What is that? A DECADE server?
-- "Traffic De-duplication" is a seriously misleading term for validated cache access. The whole point of the validation protocol in 8.2.1.2 appears to be to protect the cache at S against a colluding pair of A and R, under the assumption that A is not authorized to access S' copy of the object but compensates by being authorized to access R's copy. Since R can (1) indicate authorization and (2) prove to S it does have the data, both using the challenge-response protocol, S can fulfill the request for R. If that is the point, please say that. Please note that, from this exchange, A and R can still extract the fact that S had a copy. Discuss security implications of this discovery.

4.1) "However, the architecture may allow for more-than-one data transport protocols to be used." This *is* the architecture. It either allows it or not. (BTW, shouldn't the architecture also say something about negotiation/capability discovery?)
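
To make the reading of the 8.2.1.2 validation exchange given under M1 concrete, here is a minimal sketch of a nonce-based proof of possession. It is not the draft's protocol: the function names, the 16-byte nonce, and the use of HMAC-SHA-256 keyed by the nonce are all assumptions chosen for illustration.

```python
import hashlib
import hmac
import secrets


def make_challenge() -> bytes:
    """S picks a fresh random nonce for each validation attempt (size is an assumption)."""
    return secrets.token_bytes(16)


def prove_possession(nonce: bytes, data: bytes) -> bytes:
    """R answers with a digest that depends on the nonce and on every byte of the data."""
    return hmac.new(nonce, data, hashlib.sha256).digest()


def verify_possession(nonce: bytes, cached_copy: bytes, response: bytes) -> bool:
    """S recomputes the digest over its own cached copy and compares in constant time."""
    expected = hmac.new(nonce, cached_copy, hashlib.sha256).digest()
    return hmac.compare_digest(expected, response)
```

Even in this toy form, a successful verification can only happen if S holds the same bytes, so the outcome itself tells A and R that S had a copy. That is the discovery issue raised above.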

4.5.1) "The Storage Provider delegates the management of the resources at a DECADE server to one or more applications." What does that really mean? (And are the latter "Content Distribution Applications"?)

5.4) Is this really a digital signature? (Please reserve the term "digitally signed" for actual signatures, as opposed to including a kind of peer entity authentication that is directed towards a specific recipient. See RFC 4949.)

6.1) "...DRP allows one instance of such an application, e.g., an application endpoint, to apply access control and resource sharing policies on each of them." (them = DECADE servers.) That last sentence is rather ominous. Is this completely trivial, or does it actually mean anything? Is DRP maybe a reliable multicast protocol for control data?

6.1.4) The term "MIME type" has been superseded by "media type" (please also reference the relevant RFCs here). It is also not clear to me what that media type means in case of a slice of a larger resource representation. Why is a media type not copied with the object?

7.1) "It is also assumed that the operation performed at the remote server is the same as the operation in the original request." Explain "the same" -- are all parameters identical? Or is it just GET vs. PUT?

** Nits: [list editorial issues such as typographical errors, preferably by section number]

1) "Content Distribution Applications" in the first sentence is not defined. Point to 2.6.

4.2) "are referred as" -> "are referred to as"

4.3) "Objects that are stored in a DECADE storage server can be accessed by DECADE content consumers by a resource identifier" second by -> via

4.3) "Because a DECADE content consumer can access more than one storage server within a single application context, a data object that is replicated across different storage servers managed by a DECADE storage provider, can be accessed by a single identifier." Non sequitur.
Change to:
>>
A DECADE content consumer may be able to access more than one storage server within a single application context. A data object that is replicated across different storage servers managed by a DECADE storage provider can still be accessed by a single identifier.
<<
[Now, it is still not quite clear from that sentence whether that is a MUST (i.e., whether the architecture mandates that all replicated copies MUST have the same identifier).]

4.5.2) "applications granted resources"? applications being granted resources? resources granted by applications?

5) s/principals/principles/ (Just once in the first paragraph; otherwise, principle vs. principal has been used correctly.)

6.2.2) defered -> deferred

7.1) "Note that when a DECADE client invokes a request a DECADE server with these additional parameters" -- syntax.

8.2.1.1) "When a DECADE client (A) indicates its DECADE account on a DECADE server (S) to fetch an object from a remote entity (R) (a DECADE server or DECADE client)..." What? The "account" is asked to fetch from a "client"?

Ceterum censeo) RFCs, as any kind of formal technical publication, should use units in accordance with ISO/IEC 80000, in particular IEC 80000-13. Replace Mbps by Mbit/s, KB by KiB.

** Random observations:

O1) The proto writeup says:

> The document was reviewed by DECADE WG members, the WG Chairs, and
> key non-WG contributors, particularly by David E Mcdysan, Borje
> Ohlman, Akbar Rahman, Ning Zong and Dirk Kutscher.

Akbar Rahman and Dirk Kutscher are co-authors of this document, so I sure hope they have reviewed this document.

O2) The architecture does not give an argument why multiple SDTs are needed when all of them are just HTTP anyway. (Binding the SDT to multiple underlying protocols creates a lot of headaches that may be completely unnecessary. At least they aren't motivated.) But maybe it is not the job of the architecture document to actually motivate this highly complexity-inducing generality.
E.g., A.2 alludes to a mapping to WebDAV, but then seems to go on suggesting modifications to WebDAV to enable that layering. This doesn't seem consistent. Indeed, it seems unlikely that DECADE can layer cleanly on top of either WebDAV or CDMI. A more productive view may be to treat these protocols as a toolkit from which to take certain parts that HTTP does not have and that DECADE does not want to re-invent. Special care must be taken not to create a chimera, though.
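
As a footnote to the naming concern in A4, the timing problem with hash-based names can be sketched in a few lines. This is only an illustration under assumptions: SHA-256 stands in for whatever hash the DECADE naming scheme would use, and `put_streaming` and its `send` callback are hypothetical names, not anything defined by the draft.

```python
import hashlib
from typing import Callable, Iterable


def put_streaming(chunks: Iterable[bytes], send: Callable[[bytes], None]) -> str:
    """Send chunks as they arrive; the hash-based name is only known at the end."""
    hasher = hashlib.sha256()  # assumption: stands in for the naming hash
    for chunk in chunks:
        hasher.update(chunk)
        send(chunk)  # bytes leave the client before any name can exist
    # Only after the last chunk has been consumed is the digest (the name) final.
    return hasher.hexdigest()
```

A client that produces the slice incrementally (e.g., live video) can therefore supply the name only after the data, for instance in a trailer or a follow-up message, or must leave name generation to the server.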