IETF conneg working group                                  Graham Klyne
Internet draft                                 5GM/Content Technologies
Category: Work-in-progress                             12 February 1999
                                                   Expires: August 1999


                 Identifying composite media features


Status of this memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other
   documents at any time.  It is inappropriate to use Internet-Drafts
   as reference material or to cite them other than as "work in
   progress."

   To view the list of Internet-Draft Shadow Directories, see
   http://www.ietf.org/shadow.html.

Copyright Notice

   Copyright (C) The Internet Society 1999.  All Rights Reserved.

Abstract

   In "A syntax for describing media feature sets" [1], an expression
   format is presented for describing media feature capabilities as a
   combination of simple media feature tags [2].

   This document proposes an abbreviated format for a composite media
   feature set, based upon a hash of the feature expression describing
   that composite.

Internet Draft        Identifying composite media features
                                                        12 February 1999

Table of contents

   1. Introduction
      1.1 Organization of this document
      1.2 Terminology and document conventions
   2. Motivation and goals
   3. Composite feature representation
      3.1 Feature set reference format
      3.2 Hash value calculation
      3.3 Dereferencing feature set expressions
         3.3.1 Inline feature set details
         3.3.2 URI reference
      3.4 The birthday problem
   4. Feature set resolution and matching
   5. Examples
   6. Internationalization considerations
   7. Security considerations
   8. Full copyright statement
   9. Acknowledgements
   10. References
   11. Author's address
   Appendix A: Revision history

1. Introduction

   In "A syntax for describing media feature sets" [1], an expression
   format is presented for describing media feature capabilities as a
   combination of simple media feature tags [2].

   This document proposes an abbreviated format for a composite media
   feature set, based upon a hash of the feature expression describing
   that composite.

1.1 Organization of this document

   Section 2 sets out some of the background and goals for feature
   set references.

   Section 3 presents a syntax for feature set references, and
   describes how they are related to feature set expressions.

   Section 4 discusses how feature set references are used in
   conjunction with feature set matching.

1.2 Terminology and document conventions

   This section defines a number of terms and other document
   conventions, which are used with specific meaning in this memo.
   The terms are listed in alphabetical order.

   dereference
             the act of replacing a feature set reference with its
             corresponding feature set expression.

   feature set
             some set of media features described by a media feature
             assertion, as described in "A syntax for describing media
             feature sets" [1].  (See that memo for a more formal
             definition of this term.)

   feature set expression
             a string that describes some feature set, formulated
             according to the rules in "A syntax for describing media
             feature sets" [1] (and possibly extended by other
             specifications).

   feature set reference
             a brief construct that references some feature set.
             (See also: "dereference".)

   This specification uses syntax notation and conventions described
   in RFC2234 "Augmented BNF for Syntax Specifications: ABNF" [3].

      NOTE: Comments like this provide additional nonessential
      information about the rationale behind this document.
      Such information is not needed for building a conformant
      implementation, but may help those who wish to understand
      the design in greater depth.

2. Motivation and goals

   The range of media feature capabilities of a message handling
   system can be quite extensive, and the corresponding feature set
   expression [1] can reach a significant size.

   A requirement has been identified to allow recurring feature sets
   to be identified by a single reference value, which can be combined
   with other elements in a feature set expression.  It is anticipated
   that mechanisms will be provided that allow the recipient of such a
   feature set reference to discover the corresponding feature set
   expression.

   Thus, the goals for this proposal are:

   o  to provide an abbreviated form for referencing an arbitrary
      feature set expression.

   o  to make the meaning of a feature set reference (i.e. the
      corresponding feature set expression) independent of any
      particular mechanism that may be used to dereference it.

   o  to be able to verify whether a given feature set expression
      corresponds to some feature set reference without having to
      perform an explicit dereferencing operation (i.e. without
      incurring additional network traffic).

   o  for protocol processors that conform to [1] to be able to
      sensibly handle a feature set reference without explicit
      knowledge of its meaning (i.e. the introduction of feature set
      references should not break existing feature expression
      processors).

   o  to allow, but not require, some indication of how to dereference
      a feature set reference to be included in a feature set
      expression.

   This proposal does not attempt to address the "override" or
   "default" problem.  (Also called "delegation", where a feature set
   may be referenced and selectively overridden.)

3. Composite feature representation

   This specification hinges on two central ideas:

   o  the use of auxiliary predicates (introduced in [1]) to form the
      basis of a feature set reference, and

   o  the use of a token based on a hash function computed over the
      referenced feature set expression.

3.1 Feature set reference format

   This specification introduces a special form of auxiliary
   predicate name with the following syntax:

      fname = "h." 1*HEXDIG

   The sequence of hexadecimal digits is the value of a hash function
   calculated over the corresponding feature set expression (see next
   section), represented as a hexadecimal number.
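
   As an illustration only, the reference form above can be checked
   and produced as in the following Python sketch.  The function names
   is_feature_set_reference and reference_from_digest are hypothetical
   and not part of this specification.  The sketch assumes that, as is
   usual for ABNF literals, the "h." prefix and the hexadecimal digits
   are matched without regard to case.

```python
import re

# fname = "h." 1*HEXDIG
# ABNF literal strings are case-insensitive, so both "h." / "H." and
# the digits a-f / A-F are accepted here.
_FNAME = re.compile(r"h\.[0-9a-f]+", re.IGNORECASE)


def is_feature_set_reference(name: str) -> bool:
    """Return True if name is "h." followed by one or more hex digits."""
    return _FNAME.fullmatch(name) is not None


def reference_from_digest(digest: bytes) -> str:
    """Format raw hash bytes as a parenthesised feature set reference."""
    return "(h." + digest.hex() + ")"
```

   A caller would pass the predicate name without its enclosing
   parentheses to is_feature_set_reference, and the raw bytes of a
   hash value to reference_from_digest.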

   Thus, within a feature set expression, a feature set reference
   would have the following form:

      (h.123456789abcdef0123456789abcdef0)

      NOTE: Base64 representation (per MIME [4]) would be more
      compact (22 rather than 32 characters for the hash
      value), but an auxiliary predicate name is defined (by
      [1]) to have the same syntax as a feature tag, and the
      feature tag matching rules (per [2]) state that feature
      tag matching is case insensitive.

3.2 Hash value calculation

   The hash value is calculated using the MD5 algorithm [6] over the
   text of the referenced feature set expression subjected to certain
   normalizations.  The feature expression must conform to the syntax
   given in "A syntax for describing media feature sets" [1] for
   'filter':

      filter = "(" filtercomp ")" *( ";" parameter )

   The steps for calculating a hash value are:

   1. Whitespace normalization:  all spaces, CR, LF, TAB and any other
      layout control characters that may be embedded in the feature
      expression string are removed (or ignored for the purpose of
      hash value computation).

   2. Case normalization:  all lower case letters in the feature
      expression, other than those contained within quoted strings,
      are converted to upper case.  That is, unquoted characters with
      values 97 to 122 (decimal) are changed to corresponding
      characters in the range 65 to 90.

   3. Hash computation:  the MD5 algorithm [6] is applied to the
      normalized feature expression string.

   The result obtained in step 3 is a 128-bit number that is converted
   to a hexadecimal representation to form the feature set reference.

      NOTE: under some circumstances, removal of ALL whitespace
      may result in an invalid feature expression string.  This
      should not be a problem as significantly different
      feature expressions are expected to differ in ways other
      than their whitespace.

      NOTE: case normalization is deemed appropriate since
      feature tag and token matching is case insensitive.

3.3 Dereferencing feature set expressions

   This memo does not mandate any particular mechanism for
   dereferencing a feature set reference.  It is expected that
   specific dereferencing mechanisms will be specified for any
   application that uses them.

   The following sections describe two specific ways that feature set
   dereferencing information may be incorporated into a feature set
   expression.  Both of these mechanisms are based on auxiliary
   predicate definitions within a "where" clause [1].

      NOTE: both of the forms described below may be used with
      feature set references that are not constructed as
      "h." values described above.  The consequence of
      not using hash-based reference values is that feature set
      differences, changes or other errors may be undetectable.

3.3.1 Inline feature set details

   The feature set expression associated with a reference value may be
   specified directly in a "where" clause, using the auxiliary
   predicate definition syntax [1]; e.g.

      (& (dpi=100) (h.1234567890) )
      where
      (h.1234567890) :- (& (pix-x<=200) (pix-y<=150) )

   This form might be used on request (where the request mechanism is
   defined by the invoking application protocol), or when the
   originator believes the recipient may not understand the reference.

3.3.2 URI reference

   This form associates a URI with a feature set reference.

      NOTE: How a calling application interprets the URI is
      not specified here.  For URIs that are URLs, one
      reasonable approach would be to use the URL scheme
      protocol to access the corresponding feature set
      expression.  But other mechanisms are possible.

      [[[e.g. RESCAP?]]]

   An auxiliary predicate name is defined to be a feature tag [1], and
   one allowable form for a feature tag is 'u.' [2].  Thus a
   standard form of auxiliary predicate definition can be used to
   associate a URI with a feature set reference:

      (h.1234567890) :- (u.http://www.acme.com/widget-feature/modelT)

   [[[The range of URI forms allowed by [2] is restricted, and that
   restriction would apply to the above proposal.  Another approach
   would be to introduce some new syntax...

   A new form of auxiliary predicate definition is introduced,
   extending the feature expression syntax [1]:

      named-pred =/ "(" fname ")" ":-" "<" URI ">"
      URI        =

   An example predicate definition using this form is:

      (h.1234567890) :-

   ...]]]

3.4 The birthday problem

      NOTE: this entire section is commentary, and does not
      affect the feature set reference specification in any
      way.

   The use of a hash value to represent an arbitrary feature set is
   based on a presumption that no two distinct feature sets will yield
   the same hash value.

   There is clearly a small but non-zero possibility that two
   different feature sets will indeed yield the same hash value.

   We assume that the hash function distributes hash values for
   feature sets with even very small differences randomly and evenly
   through the range of 2^128 (approximately 3.4 x 10^38) possible
   values.  This is a fundamental property of a good digest algorithm
   like MD5.  Thus, the chance that any given pair of distinct feature
   set expressions yields the same hash is roughly 1 in 3.4 x 10^38.
   This is negligible when compared with, say, the probability that a
   receiving system will fail having received data conforming to a
   negotiated feature set.
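
   As a rough check on these figures, the following Python sketch
   estimates the chance of at least one clash among n feature set
   expressions, using the standard birthday approximation
   p = 1 - exp(-n(n-1)/2N) with N = 2^128.  The function name
   collision_probability is hypothetical, and the sketch relies on the
   assumption stated above that hash values are distributed randomly
   and evenly.

```python
import math

HASH_SPACE = 2 ** 128  # number of possible 128-bit hash values


def collision_probability(n: int, space: int = HASH_SPACE) -> float:
    """Approximate chance that at least two of n items share a hash.

    Uses the birthday approximation 1 - exp(-n(n-1)/(2*space)), which
    assumes hash values are uniformly and independently distributed.
    """
    if n < 2:
        return 0.0
    exponent = n * (n - 1) / (2 * space)
    # expm1 keeps precision when the exponent is very small
    return -math.expm1(-exponent)


if __name__ == "__main__":
    # Population sizes here are illustrative powers of ten.
    for power in (3, 6, 9, 12, 15, 18):
        p = collision_probability(10 ** power)
        print(f"10^{power} feature sets: {p:.3e}")
```

   For n = 2 the estimate reduces to 1/N, the single-pair figure given
   above.  For larger n it grows approximately as the square of n.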

   But when the number of distinct feature sets in circulation
   increases, the probability of clashing hash values increases
   surprisingly quickly.  This is illustrated by the "birthday
   paradox":  given a random collection of just 23 people, there is a
   greater than even chance that there exists some pair with the same
   birthday.  This topic is discussed further in sections 7.4 and 7.5
   of Bruce Schneier's "Applied Cryptography" [7].

   [[[TODO:  Include some numbers to illustrate actual probabilities
   of clash with 10^3, 10^6, 10^9, 10^12, 10^15, 10^18 feature sets
   in circulation.]]]

   If original feature set expressions are generated manually, or only
   in response to some manually constrained process, the total number
   of feature sets in circulation is likely to remain very small in
   relation to the total number of possible hash values.

   The outcome of all this is:  assuming that the feature sets are
   manually generated, even taking account of the birthday paradox
   effect, the probability of incorrectly identifying a feature set
   using a hash value is still negligibly small when compared with
   other possible failure modes.

4. Feature set resolution and matching

   This section discusses the use of feature references in conjunction
   with feature set matching [1].

   The definitive position on matching feature sets containing feature
   set references is given by dereferencing all of the references;
   i.e. every feature set reference is replaced by the corresponding
   expression.

   Sometimes, it may be desirable to process feature sets without
   performing dereferencing.  The rules below may facilitate this
   while achieving results that are consistent with the definitive
   position.

      (& ... (h.) (h.) ... )        -->  (& ... (h.) ... )
      (| ... (h.) (h.) ... )        -->  (| ... (h.) ... )
      (& ... (h.) (! (h.) ) ... )   -->  FALSE
      (| ... (h.) (! (h.) ) ... )   -->  TRUE

   If some referenced feature set is known to be TRUE or FALSE, then
   the corresponding references may be replaced by the corresponding
   TRUE or FALSE value.

   [[[Can more be said?]]]

5. Examples

   [[[TODO]]]

6. Internationalization considerations

   Feature set expressions are currently defined to consist of only
   characters from the US-ASCII repertoire;  under these circumstances
   this specification is not impacted by internationalization
   considerations.

   But, if future revisions of the feature set syntax permit non-US-
   ASCII characters (e.g. within quoted strings), then some canonical
   representation must be defined for the purposes of calculating hash
   values.  One choice might be to use a UTF-8 equivalent
   representation as the basis for calculating the feature set hash.
   Another choice might be to leave this as an application protocol
   issue (but this could lead to non-interoperable feature sets
   between different protocols).

   Another conceivable issue is that of up-casing the feature
   expression in preparation for computing a hash value.  This does
   not apply to the content of strings so is not likely to be an
   issue.  But if changes are made that do permit non-US-ASCII
   characters in feature tags or token strings, consideration must be
   given to properly defining how case conversion is to be performed.

7. Security considerations

   <<>>

8. Full copyright statement

   Copyright (C) The Internet Society 1999.  All Rights Reserved.

   This document and translations of it may be copied and furnished to
   others, and derivative works that comment on or otherwise explain
   it or assist in its implementation may be prepared, copied,
   published and distributed, in whole or in part, without restriction
   of any kind, provided that the above copyright notice and this
   paragraph are included on all such copies and derivative works.
   However, this document itself may not be modified in any way, such
   as by removing the copyright notice or references to the Internet
   Society or other Internet organizations, except as needed for the
   purpose of developing Internet standards in which case the
   procedures for copyrights defined in the Internet Standards process
   must be followed, or as required to translate it into languages
   other than English.

   The limited permissions granted above are perpetual and will not be
   revoked by the Internet Society or its successors or assigns.

   This document and the information contained herein is provided on
   an "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET
   ENGINEERING TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR
   IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
   THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

9. Acknowledgements

   This proposal is developed from a suggestion by Larry Masinter.
   Some of the ideas have been honed in early discussions with Martin
   Duerst, Al Gilman, Ted Hardie and Bill Newman.

10. References

   [1] "A syntax for describing media feature sets"
       Graham Klyne, 5GM/Content Technologies
       Internet draft:
       Work in progress, September 1998.

   [2] "Media Feature Tag Registration Procedure"
       Koen Holtman, TUE
       Andrew Mutz, Hewlett-Packard
       Ted Hardie, NASA
       Internet draft:
       Work in progress, July 1998.

   [3] RFC 2234, "Augmented BNF for Syntax Specifications: ABNF"
       D. Crocker (editor), Internet Mail Consortium
       P. Overell, Demon Internet Ltd.
       November 1997.

   [4] RFC 2045, "Multipurpose Internet Mail Extensions (MIME)
       Part 1: Format of Internet message bodies"
       N. Freed, Innosoft
       N. Borenstein, First Virtual
       November 1996.

   [5] RFC 2396, "Uniform Resource Identifiers (URI): Generic Syntax"
       Tim Berners-Lee, World Wide Web Consortium/MIT
       Roy T. Fielding, University of California, Irvine
       Larry Masinter, Xerox PARC
       August 1998.

   [6] RFC 1321, "The MD5 Message-Digest Algorithm"
       R. Rivest, MIT Laboratory for Computer Science and RSA Data
       Security, Inc.
       April 1992.

   [7] "Applied Cryptography"
       Bruce Schneier
       John Wiley and Sons, 1996 (second edition)
       ISBN 0-471-12845-7 (cloth)
       ISBN 0-471-11709-9 (paper)

11. Author's address

   Graham Klyne
   5th Generation Messaging Ltd.    Content Technologies Ltd.
   5 Watlington Street              Forum 1, Station Road
   Nettlebed                        Theale
   Henley-on-Thames, RG9 5AB        Reading, RG7 4RA
   United Kingdom                   United Kingdom.
   Telephone: +44 1491 641 641      +44 118 930 1300
   Facsimile: +44 1491 641 611      +44 118 930 1301
   E-mail: GK@ACM.ORG

Appendix A: Revision history

   00a  10-Feb-1999  Initial draft.