idnits 2.17.1 draft-ietf-impp-cpim-msgfmt-03.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- ** Looks like you're using RFC 2026 boilerplate. This must be updated to follow RFC 3978/3979, as updated by RFC 4748. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- == No 'Intended status' indicated for this document; assuming Proposed Standard == The page length should not exceed 58 lines per page, but there was 26 longer pages, the longest (page 2) being 59 lines Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** The document seems to lack separate sections for Informative/Normative References. All references will be assumed normative when checking for downward references. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the RFC 3978 Section 5.4 Copyright Line does not match the current year -- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. If not, you may need to add the pre-RFC5378 disclaimer. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- The document date (13 June 2001) is 8353 days in the past. Is this intentional? 
Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) == Unused Reference: '5' is defined on line 1045, but no explicit reference was found in the text == Unused Reference: '8' is defined on line 1056, but no explicit reference was found in the text == Unused Reference: '9' is defined on line 1059, but no explicit reference was found in the text == Unused Reference: '13' is defined on line 1074, but no explicit reference was found in the text == Unused Reference: '20' is defined on line 1101, but no explicit reference was found in the text ** Obsolete normative reference: RFC 822 (ref. '2') (Obsoleted by RFC 2822) ** Obsolete normative reference: RFC 2048 (ref. '5') (Obsoleted by RFC 4288, RFC 4289) ** Downref: Normative reference to an Informational RFC: RFC 2130 (ref. '6') ** Obsolete normative reference: RFC 3066 (ref. '7') (Obsoleted by RFC 4646, RFC 4647) ** Obsolete normative reference: RFC 2633 (ref. '8') (Obsoleted by RFC 3851) ** Obsolete normative reference: RFC 2440 (ref. '9') (Obsoleted by RFC 4880) ** Obsolete normative reference: RFC 2396 (ref. '10') (Obsoleted by RFC 3986) -- Possible downref: Non-RFC (?) normative reference: ref. '11' -- Possible downref: Non-RFC (?) normative reference: ref. '12' -- Possible downref: Non-RFC (?) normative reference: ref. '13' -- No information found for draft-thenine-im-common - is the name correct? -- Possible downref: Normative reference to a draft: ref. '14' ** Downref: Normative reference to an Informational RFC: RFC 2779 (ref. '15') ** Obsolete normative reference: RFC 2234 (ref. '17') (Obsoleted by RFC 4234) ** Obsolete normative reference: RFC 2616 (ref. '18') (Obsoleted by RFC 7230, RFC 7231, RFC 7232, RFC 7233, RFC 7234, RFC 7235) ** Obsolete normative reference: RFC 2278 (ref. 
'20') (Obsoleted by RFC 2978) ** Obsolete normative reference: RFC 2279 (ref. '21') (Obsoleted by RFC 3629) == Outdated reference: A later version (-14) exists of draft-mealling-iana-urn-00 == Outdated reference: A later version (-04) exists of draft-ietf-impp-datetime-03 Summary: 14 errors (**), 0 flaws (~~), 10 warnings (==), 7 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 1 Network Working Group D. Atkins, Telcordia Technologies 2 Internet Draft G. Klyne, Baltimore Technologies 3 13 June 2001 4 Expires: December 2001 6 Common Presence and Instant Messaging: Message Format 7 9 Status of this memo 11 This document is an Internet-Draft and is in full conformance with 12 all provisions of Section 10 of RFC 2026. 14 Internet-Drafts are working documents of the Internet Engineering 15 Task Force (IETF), its areas, and its working groups. Note that 16 other groups may also distribute working documents as Internet- 17 Drafts. 19 Internet-Drafts are draft documents valid for a maximum of six months 20 and may be updated, replaced, or obsoleted by other documents at any 21 time. It is inappropriate to use Internet-Drafts as reference 22 material or to cite them other than as "work in progress". 24 The list of current Internet-Drafts can be accessed at 25 http://www.ietf.org/1id-abstracts.html 27 The list of Internet-Draft Shadow Directories can be accessed at 28 http://www.ietf.org/shadow.html. 30 To view the entire list of current Internet-Drafts, please check the 31 "1id-abstracts.txt" listing contained in the Internet-Drafts Shadow 32 Directories on ftp.is.co.za (Africa), ftp.nordu.net (Northern 33 Europe), ftp.nis.garr.it (Southern Europe), munnari.oz.au (Pacific 34 Rim), ftp.ietf.org (US East Coast), or ftp.isi.edu (US West Coast). 36 Copyright Notice 38 Copyright (C) The Internet Society 2001. All Rights Reserved. 
40 Abstract 42 This memo defines the MIME type 'message/cpim', a message format for 43 protocols that conform to the Common Profile for Instant Messaging 44 (CPIM) specification. 46 Discussion of this document 48 Please send comments to: . To subscribe: send a 49 message with the body 'subscribe' to . The 50 mailing list archive is at . 52 Table of Contents 54 1. INTRODUCTION 55 1.1 Motivation 56 1.2 Background 57 1.3 Goals 58 1.4 Terminology and conventions 59 2. OVERALL MESSAGE STRUCTURE 60 2.1 Message/cpim MIME headers 61 2.2 Message headers 62 2.3 Character escape mechanism 63 2.4 Message content 64 3. MESSAGE HEADER SYNTAX 65 3.1 Header names 66 3.2 Header Value 67 3.3 Language Tagging 68 3.4 Namespaces for header name extensibility 69 3.5 Mandatory-to-recognize features 70 3.6 Collected message header syntax 71 4. HEADER DEFINITIONS 72 4.1 The 'From' header 73 4.2 The 'To' header 74 4.3 The 'cc' header 75 4.4 The 'DateTime' header 76 4.5 The 'Subject' header 77 4.6 The 'NS' header 78 4.7 The 'Require' header 79 5. EXAMPLES 80 5.1 An example message/cpim message 81 5.2 An example using MIME multipart/signed 82 6. APPLICATION DESIGN CONSIDERATIONS 83 7. IANA CONSIDERATIONS 84 8. INTERNATIONALIZATION CONSIDERATIONS 85 9. SECURITY CONSIDERATIONS 86 10. ACKNOWLEDGEMENTS 87 11. REFERENCES 88 12. AUTHORS' ADDRESSES 89 Appendix A: Amendment history 90 Full copyright statement 92 1. INTRODUCTION 94 This memo defines the MIME content-type 'message/cpim'. This is a 95 common message format for CPIM-compliant messaging protocols [14]. 97 While being prepared for CPIM, this format is quite general and may 98 be reused by other applications with similar requirements. 99 Application specifications that adopt this as a base format should 100 answer the questions raised in section 6 of this document. 
102 1.1 Motivation 104 The Common Profile for Instant Messaging (CPIM) [14] specification 105 defines a number of operations to be supported and criteria to be 106 satisfied for interworking diverse instant messaging protocols. The 107 intent is to allow a variety of different protocols interworking 108 through gateways to support cross-protocol messaging that meets the 109 requirements of RFC 2779 [15]. 111 To adequately meet the security requirements of RFC 2779, a common 112 message format is needed so that end-to-end signatures and encryption 113 may be applied. This document describes a common canonical message 114 format that must be used by any CPIM-compliant message transfer 115 protocol, and over which signatures are calculated for end-to-end 116 security. 118 1.2 Background 120 RFC 2779 requires that an instant message can carry a MIME payload 121 [3,4]; thus some level of support for MIME will be a common element 122 of any CPIM compliant protocol. Therefore it seems reasonable that a 123 common message format should use a MIME/RFC822 syntax, as protocol 124 implementations must already contain code to parse this. 126 Unfortunately, using pure RFC822/MIME [2] can be problematic: 128 o Irregular lexical structure -- RFC822 allows a number of optional 129 encodings and multiple ways to encode a particular value. For 130 example RFC822 comments may be encoded in multiple ways. For 131 security purposes, a single encoding method must be defined as a 132 basis for computing message digest values. Protocols that 133 transmit data in a different format would otherwise lose 134 information needed to verify a signature. 136 o Weak internationalization -- RFC822 requires header values to use 137 7-bit ASCII, which is problematic for encoding international 138 character sets. Mechanisms for language tagging in RFC822 headers 139 [16] are awkward to use and have limited applicability. 141 o Mutability -- addition, modification or removal of header 142 information. 
Because it is not explicitly forbidden, many 143 applications that process MIME content (e.g. MIME gateways) 144 rebuild or restructure messages in transit. This obliterates most 145 attempts at achieving security (e.g. signatures), leaving receiving 146 applications unable to verify the received data. 148 o Message and payload separation -- there is not a clear syntactic 149 distinction between message metadata and message content. 151 o Limited extensibility (X-headers are problematic). 153 o No support for structured information (text string values only). 155 o Some processors impose line length limitations. The message format 156 defined by this memo overcomes some of these difficulties by 157 having a syntax that is generally compatible with the format 158 accepted by MIME/RFC822 parsers, but simplified, and having a 159 stricter syntax. It also defines mechanisms to support some 160 desired features not covered by the RFC822/MIME format 161 specifications. 163 1.3 Goals 165 This specification aims to satisfy the following goals: 167 o a securable end-to-end format for a message (a canonical message 168 format for signature calculation) 170 o independent of any specific application 172 o capable of conveying a range of different address types 174 o assumes an 8-bit clean message-transfer protocol 176 o evolvable: extensible by multiple parties 178 o to clearly separate message metadata from message content 180 o a simple, regular, easily parsed syntax 182 o a compact, low-overhead format for simple messages 184 1.4 Terminology and conventions 186 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 187 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 188 document are to be interpreted as described in RFC 2119 [1]. 190 NOTE: Comments like this provide additional nonessential 191 information about the rationale behind this document. 
192 Such information is not needed for building a conformant 193 implementation, but may help those who wish to understand 194 the design in greater depth. 196 [[[Editorial comments and questions about outstanding issues are 197 provided in triple brackets like this. These working comments should 198 be resolved and removed prior to final publication.]]] 200 2. OVERALL MESSAGE STRUCTURE 202 The message/cpim format encapsulates an arbitrary MIME message 203 content, together with message- and content-related metadata. This 204 can optionally be signed or encrypted using MIME security multiparts 205 in conjunction with an appropriate security scheme. 207 A message/cpim object is a multipart entity, where the first part 208 contains the message metadata and the second part is the message 209 content. The two parts are syntactically separated by a blank line, 210 to keep the message header information (with its more stringent 211 syntax rules) separate from the MIME message content headers. 213 Thus, the complete message looks something like this: 215 m: Content-type: message/cpim 216 s: 217 h: (message-metadata-headers) 218 s: 219 e: (encapsulated MIME message-body) 221 The end of the message body is defined by the framing mechanism of 222 the protocol used. The tags 'm:', 's:', 'h:', 'e:', and 'x:' are not 223 part of the message format and are used here to indicate the 224 different parts of the message, thus: 226 m: MIME headers for the overall message 227 s: a blank separator line 228 h: message headers 229 e: encapsulated MIME object containing the message content 230 x: MIME security multipart message wrapper 232 2.1 Message/cpim MIME headers 234 The message MIME headers identify the message as a CPIM-formatted 235 message. The only required header is: 237 Content-type: message/cpim 239 Other MIME headers may be used as appropriate for the message 240 transfer environment. 
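As an illustration of the layout described in section 2, the following minimal Python sketch splits a complete message/cpim object into its MIME headers, its message headers and its encapsulated content. It is not part of the format definition. The function name 'split_cpim' is hypothetical, and the sketch assumes that the framing mechanism of the transfer protocol has already delivered the whole object as a byte string and that no MIME security multipart wrapper is present.

```python
def split_cpim(raw: bytes):
    """Split a message/cpim object into its three parts.

    Returns (mime_header_lines, message_header_lines, content), where
    content is the still-encoded encapsulated MIME object.
    Hypothetical helper: assumes the transport has already framed the
    complete object and that lines end with CR,LF.
    """
    # The MIME headers end at the first blank line (CR,LF CR,LF).
    mime_part, sep, rest = raw.partition(b"\r\n\r\n")
    if not sep:
        raise ValueError("no blank line after the MIME headers")
    # The message headers end at the next blank line.
    header_part, sep, content = rest.partition(b"\r\n\r\n")
    if not sep:
        raise ValueError("no blank line after the message headers")
    return mime_part.split(b"\r\n"), header_part.split(b"\r\n"), content


raw = (
    b"Content-type: message/cpim\r\n"
    b"\r\n"
    b"Subject: the weather will be fine today\r\n"
    b"\r\n"
    b"Content-type: text/plain; charset=utf-8\r\n"
    b"\r\n"
    b"Here is the text of my message."
)
mime_headers, message_headers, content = split_cpim(raw)
```

The sketch works on octets rather than decoded text so that the message header lines are passed on unchanged, which matters when a signature has been calculated over them.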
242 2.2 Message headers 244 Message headers carry information relevant to the end-to-end transfer 245 of the message from sender to receiver. Message headers MUST NOT be 246 modified, reformatted or reordered in transit, but in some 247 circumstances they MAY be examined by a CPIM message transfer 248 protocol. 250 The message headers serve a similar purpose to RFC822 message headers 251 in email [2], and have a similar but restricted allowable syntax. 253 The basic header syntax is: 255 Key: Value 257 where "Key" is a header name and "Value" is the corresponding header 258 value. The following considerations apply: 260 o The entire header MUST be contained on a single line. The line 261 terminator is not considered part of the header value. 263 o Only one header per line. Multiple headers MUST NOT be included 264 on a single line. 266 o Processors SHOULD NOT impose any line-length limitations. 268 o There MUST NOT be any whitespace at the beginning or end of a 269 line. 271 o UTF-8 character encoding [21] MUST be used throughout. 273 o The character sequence CR,LF (13,10) MUST be used to terminate 274 each line. 276 o The header name contains only US-ASCII characters (see section 3.1 for 277 the specific syntax). 279 o The header MUST NOT contain any control characters (0-31). If a 280 header value needs to represent control characters then the escape 281 mechanism described below MUST be used. 283 o There MUST be a single space character (32) following the header 284 name and colon. 286 o Multiple headers using the same key (header name) are allowed. 287 (Specific header semantics may dictate only one occurrence of any 288 particular header.) 290 o Header names MUST match exactly (i.e. "From:" and "from:" are 291 different headers). 293 o If a header name is not recognized or not understood, the header 294 should be ignored. But see also the "Require:" header (section 3.5). 296 o Interpretation (e.g. 
equivalence) of header values is dependent on 297 the particular header definition. Message processors MUST 298 preserve exactly all octets of all headers (both name and value). 300 o Message processors MUST NOT change the order of message headers. 302 Examples: 304 To: Pooh Bear 305 From: 306 DateTime: 2001-02-02T10:48:54-05:00 308 2.3 Character escape mechanism 310 This mechanism MUST be used to code control characters in a header, 311 having Unicode code points in the range U+0000 to U+001f or U+007f. 312 (The escape mechanism is as used by the Java programming language.) 313 Note that the escape mechanism is applied to a UCS-2 character, NOT 314 to the octets of its UTF-8 coding. Mapping from/to UTF-8 coding is 315 performed without regard for escape sequences or character coding. 316 (The header syntax is defined so that octets corresponding to control 317 characters other than CR and LF do not appear in the output.) 318 An arbitrary UCS-2 character is escaped using the form: 320 \uxxxx 322 where: 324 \ is U+005c (backslash) 325 u is U+0075 (lower case letter U) 326 xxxx is a sequence of exactly four hexadecimal digits 327 (0-9, a-f or A-F) or 328 (U+0030-U+0039, U+0041-U+0046, or U+0061-U+0066) 330 The hexadecimal number 'xxxx' is the UCS code-point value of the 331 escaped character. 333 Further, the following special sequences introduced by "\" are used: 335 \\ for \ (backslash, U+005c) 336 \" for " (double quote, U+0022) 337 \' for ' (single quote, U+0027) 338 \b for backspace (U+0008) 339 \t for tab (U+0009) 340 \n for linefeed (U+000a) 341 \r for carriage return (U+000d) 343 2.3.1 Escape mechanism usage 345 When generating messages conformant with this specification: 347 o The special sequences listed above MUST be used to encode any 348 occurrence of the following characters that appear anywhere in a 349 header: backslash (U+005c), backspace (U+0008), tab (U+0009), 350 linefeed (U+000a) or carriage return (U+000d). 
352 o The special sequence \' MUST be used for any occurrence of a 353 single quote (U+0027) that appears within a string delimited by 354 single quotes. 356 o The special sequence \" MUST be used for any occurrence of a 357 double quote (U+0022) that appears within a string delimited by 358 double quotes. 360 o Quote characters that delimit a string value MUST NOT be escaped. 362 o The general escape sequence \uxxxx MUST be used for any other 363 control character (U+0000 to U+0007, U+000b to U+000c, U+000e to 364 U+001f or U+007f) that appears anywhere in a header. 366 o All other characters MUST NOT be represented using an escape 367 sequence. 369 When processing a message based on this specification, the escape 370 sequence usage described above MUST be recognized. 372 Further, any other occurrence of any escape sequence described above 373 SHOULD be recognized and treated as an occurrence of the 374 corresponding Unicode character. 376 Any backslash ('\') character SHOULD be interpreted as introducing an 377 escape sequence. Any unrecognized escape sequence SHOULD be treated 378 as an instance of the character following the backslash character. 379 An isolated backslash that is the last character of a header SHOULD 380 be ignored. 382 2.4 Message content 384 The final section of a message/cpim is the MIME-encapsulated message 385 content, which follows standard MIME formatting rules [3,4]. 387 The MIME content headers MUST include at least a Content-Type header. 388 The content may be any MIME type. 390 Example: 392 e: Content-Type: text/plain; charset=utf-8 393 e: Content-ID: <1234567890@foo.com> 394 e: 395 e: This is my encapsulated text message content 397 3. MESSAGE HEADER SYNTAX 399 A header is made of two parts, a name and a value, separated by a 400 colon character (':') followed by a single space (32), and terminated 401 by a sequence of CR,LF (13,10). 403 Headers use UTF-8 character encoding throughout, per RFC 2279 [21]. 
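The receiving-side rules of section 2.3.1 can be illustrated with a short Python sketch. It is offered only as one possible reading of those rules, not as a reference implementation. The function name 'unescape' and the helper tables are hypothetical, and the sketch assumes that the header value has already been decoded from UTF-8 into a character string, since the escape mechanism applies to characters and not to octets.

```python
import re

# Special two-character sequences from section 2.3 (backslash + key).
_SPECIAL = {"\\": "\\", '"': '"', "'": "'",
            "b": "\b", "t": "\t", "n": "\n", "r": "\r"}
_HEX4 = re.compile(r"[0-9a-fA-F]{4}")


def unescape(value: str) -> str:
    """Replace escape sequences in a decoded header value.

    Hypothetical helper; operates on characters, not UTF-8 octets.
    """
    out = []
    i, n = 0, len(value)
    while i < n:
        ch = value[i]
        if ch != "\\":
            out.append(ch)
            i += 1
            continue
        if i + 1 == n:
            # An isolated backslash ending the header is ignored.
            break
        nxt = value[i + 1]
        if nxt == "u" and _HEX4.fullmatch(value, i + 2, i + 6):
            # \uxxxx: exactly four hexadecimal digits give the code point.
            out.append(chr(int(value[i + 2:i + 6], 16)))
            i += 6
        else:
            # Known special sequence, or an unrecognized one, which is
            # treated as the character following the backslash.
            out.append(_SPECIAL.get(nxt, nxt))
            i += 2
    return "".join(out)
```

For instance, under these assumptions the value 'a\tb' (a, backslash, t, b) yields the three characters a, tab, b, and a malformed '\u00zz' is treated as an unrecognized escape, so the 'u' is kept as an ordinary character.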
405 3.1 Header names 407 The header name is a sequence of US-ASCII characters, excluding 408 control characters, SPACE or separator characters. Use of the 409 character "." in a header name is reserved for a namespace prefix 410 separator. 412 Separator characters are: 414 SEPARATORS = "(" / ")" / "<" / ">" / "@" 415 / "," / ";" / ":" / "\" / <"> 416 / "/" / "[" / "]" / "?" / "=" 417 / "{" / "}" / SP 419 NOTE: the range of allowed characters was determined by 420 examination of HTTP and RFC822 header name formats and 421 choosing the more restricted. The intent is to allow CPIM 422 headers to follow a syntax that is compatible with the 423 allowed syntax for both RFC 822 [2] and HTTP [18] 424 (including HTTP-derived protocols such as SIP). 426 3.2 Header Value 428 A header value has a structure defined by the corresponding header 429 specification. Implementations that use a particular header must 430 adhere to the format and usage rules thus defined when creating or 431 processing a message containing that header. 433 The other general constraints on header formats MUST also be followed 434 (one line, UTF-8 character encoding, no control characters, etc.). 436 3.3 Language Tagging 438 Full internationalization of a protocol requires that a language can 439 be indicated for any human-readable text [6,19]. 441 A message header may indicate a language for its value by including 442 ';lang=tag' after the header name and colon, where 'tag' is a 443 language identifying token per RFC 3066 [7]. 445 Example: 447 Subject:;lang=fr Objet de message 449 If the language parameter is not applied to a header, any human- 450 readable text is assumed to use the language identified as 451 'i-default' [19]. 453 3.4 Namespaces for header name extensibility 455 NOTE: this section defines a framework for header 456 extensibility whose use is optional. If no header 457 extensions are allowed by an application then these 458 structures may never be used. 
460 An application that uses this message format is expected to define 461 the set of headers that are required and allowed for that 462 application. This section defines a header extensibility framework 463 that can be used with any application. 465 The extensibility framework is based on that provided for XML [11] by 466 XML namespaces [12]. All headers are associated with a "namespace", 467 which is in turn associated with a globally unique URI. 469 Within a particular message instance, header names are associated 470 with a particular namespace through the presence or absence of a 471 namespace prefix, which is a leading part of the header name followed 472 by a period ("."); e.g. 474 prefix.header-name: header-value 476 Here, 'prefix' is the header name prefix, 'header-name' is the header 477 name within the namespace associated with 'prefix', and 478 'header-value' is the value for this header. 480 header-name: header-value 482 In this case, the header name prefix is absent, and the given 483 'header-name' is associated with a default namespace. 485 An application that uses this format designates a default namespace 486 for any headers that are not more explicitly associated with any 487 namespace. In many cases, the default namespace may be all that is 488 needed. 490 A namespace is identified by a URI. In this usage, the URI is used 491 simply as a globally unique identifier, and there is no requirement 492 that it can be used for any other purpose. Any legal globally unique 493 URI MAY be used to identify a namespace. (By "globally unique", we 494 mean constructed according to some set of rules so that it is 495 reasonable to expect that nobody else will use the same URI for a 496 different purpose.) A URI used as an identifier MUST be a full 497 absolute-URI, per RFC 2396 [10]. (Relative URIs and URI- references 498 containing fragment identifiers MUST NOT be used for this purpose.) 
499 Within a specific message, a 'NS' header is used to declare a 500 namespace prefix and associate it with a URI that identifies a 501 namespace. Following that declaration, within the scope of that 502 message, the combination of namespace prefix and header name 503 indicates a globally unique identifier for the header (consisting of 504 the namespace URI and header name). For example: 506 NS: MyFeatures 507 MyFeatures.WackyMessageOption: Use-silly-font 509 This defines a namespace prefix 'MyFeatures' associated with the 510 namespace identifier 'mid:MessageFeatures@id.foo.com'. Subsequently 511 the prefix indicates that the WackyMessageOption header name 512 referenced is associated with the identified namespace. 514 A namespace prefix declaration MUST precede any use of that prefix. 516 With the exception of any application-specific predefined namespace 517 prefixes (see section 6), a namespace prefix is strictly local to the 518 message in which it occurs. The actual prefix used has no global 519 significance. This means that the headers: 521 xxx.name: value 522 yyy.name: value 524 in two different messages may have exactly the same effect if 525 namespace prefixes 'xxx' and 'yyy' are associated with the same 526 namespace URI. Thus the following have exactly the same meaning: 528 NS: acme 529 acme.runner-trap: set 531 and 533 NS: widget 534 widget.runner-trap: set 536 A 'NS' header without a header prefix name specifies a default 537 namespace for subsequent headers; that is a namespace that is 538 associated with header names not having a prefix. For example: 540 NS: 541 runner-trap: set 543 has the same meaning as the previous examples. 545 This framework allows different implementers to create extension 546 headers without the worry of header name duplication; each defines 547 headers within their own namespace. 
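The prefix handling described in section 3.4 can be sketched in Python as follows. This is an illustration under stated assumptions rather than a definitive implementation. The function name 'resolve_headers' is hypothetical, the URIs 'urn:example:traps' and 'urn:example:default' are invented for the example, the NS value is assumed to follow the form given in section 4.6 (an optional prefix followed by a URI in angle brackets), and header lines are assumed to be decoded strings with the CR,LF terminator already removed.

```python
import re

# Optional prefix, optional whitespace, then a URI in angle brackets.
_NS_VALUE = re.compile(r"(?:([^\s<>]+)\s*)?<([^<>]+)>")


def resolve_headers(lines, default_ns):
    """Map each header to (namespace URI, local name, value).

    Hypothetical helper. 'default_ns' is the default namespace that the
    application designates for headers without a prefix.
    """
    prefixes = {}
    resolved = []
    for line in lines:
        name, sep, rest = line.partition(":")
        if not sep:
            raise ValueError(f"not a header line: {line!r}")
        # Skip any ';param' list and the single space before the value
        # (assumes parameter values contain no spaces).
        value = rest.partition(" ")[2]
        if name == "NS":
            match = _NS_VALUE.fullmatch(value)
            if match is None:
                raise ValueError(f"malformed NS header: {line!r}")
            prefix, uri = match.groups()
            if prefix is None:
                default_ns = uri        # new default for later headers
            else:
                prefixes[prefix] = uri  # prefix is local to this message
            continue
        prefix, dot, local = name.partition(".")
        if dot:
            # A declaration MUST precede any use of the prefix.
            if prefix not in prefixes:
                raise ValueError(f"undeclared namespace prefix: {prefix!r}")
            resolved.append((prefixes[prefix], local, value))
        else:
            resolved.append((default_ns, name, value))
    return resolved


a = resolve_headers(["NS: acme <urn:example:traps>",
                     "acme.runner-trap: set"], "urn:example:default")
b = resolve_headers(["NS: widget <urn:example:traps>",
                     "widget.runner-trap: set"], "urn:example:default")
c = resolve_headers(["NS: <urn:example:traps>",
                     "runner-trap: set"], "urn:example:default")
```

With these inputs the three results are equal, which mirrors the statement above that the choice of prefix has no global significance: only the namespace URI and the local header name identify the header.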
549 3.5 Mandatory-to-recognize features 551 Sometimes it is necessary for the sender of a message to insist that 552 some functionality is understood by the recipient. By using the 553 mandatory-to-recognize indicator, a sender is notifying the recipient 554 that it MUST understand the named header or feature in order to 555 properly understand the message. 557 A header or feature is indicated as being mandatory-to-recognize by a 558 'Require:' header. For example: 560 Require: MyFeatures.VitalMessageOption 561 MyFeatures.VitalMessageOption: Confirmation-requested 563 Multiple required header names may be listed in a single 'Require' 564 header, separated by commas. 566 NOTE: indiscriminate use of 'Require:' headers could 567 harm interoperability. It is suggested that any 568 implementer who defines required headers also publish the 569 header specifications so other implementations can 570 successfully interoperate. 572 The 'Require:' header MAY also be used to indicate that some non- 573 header semantics must be implemented by the recipient, even when it 574 does not appear as a header. For example: 576 Require: Locale.MustRenderKanji 578 might be used to indicate that message content includes characters 579 from the Kanji repertoire, which must be rendered for proper 580 understanding of the message. In this case, the header name is just 581 a token (using header name syntax and namespace association) that 582 indicates some desired behaviour. 584 3.6 Collected message header syntax 586 The following description of message header syntax uses ABNF, per RFC 587 2234 [17]. Most of this syntax can be interpreted as defining UCS 588 character sequences or UTF-8 octet sequences. Alternate productions 589 at the end allow for either interpretation. 591 Header = Header-name ":" *( ";" Parameter ) SP 592 Header-value 593 CRLF 595 Header-name = [ Name-prefix "." 
] Name 596 Name-prefix = Name 598 Parameter = Lang-param / Ext-param 599 Lang-param = "lang=" Language-tag 600 Ext-param = Param-name "=" Param-value 601 Param-name = Name 602 Param-value = Token / Number / String 604 Header-value = *HEADERCHAR 606 Name = 1*NAMECHAR 607 Token = 1*TOKENCHAR 608 Number = 1*DIGIT 609 String = DQUOTE *( Str-char / Escape ) DQUOTE 610 Str-char = %x20-21 / %x23-5B / %x5D-7E / UCS-high 611 Escape = "\" ( "u" 4(HEXDIG) ; UCS codepoint 612 / "b" ; Backspace 613 / "t" ; Tab 614 / "n" ; Linefeed 615 / "r" ; Return 616 / DQUOTE ; Double quote 617 / "'" ; Single quote 618 / "\" ) ; Backslash 620 Formal-name = 1*( Token SP ) / String 621 URI = 622 Language-tag = 624 ; Any UCS character except CTLs, or escape 625 HEADERCHAR = UCS-no-CTL / Escape 627 ; Any US-ASCII char except ".", CTLs or SEPARATORS: 628 NAMECHAR = %21 / %23-26 / %2a-2b / %2d / %5e-60 / %7c / %7e 629 / ALPHA / DIGIT 631 ; Any UCS char except CTLs or SEPARATORS: 632 TOKENCHAR = NAMECHAR / "." / UCS-high 633 SEPARATORS = "(" / ")" / "<" / ">" / "@" ; 28/29/3c/3e/40 634 / "," / ";" / ":" / "\" / <"> ; 2c/3b/3a/5c/22 635 / "/" / "[" / "]" / "?" / "=" ; 2f/5b/5d/3f/3d 636 / "{" / "}" / SP ; 7b/7d/20 637 CTL = 638 CRLF = 639 SP = 640 DIGIT = 641 HEXDIG = 642 ALPHA = 643 DQUOTE = 645 To interpret the syntax in a general UCS character environment, use 646 the following productions: 648 UCS-no-CTL = %x20-7e / UCS-high 649 UCS-high = %x80-ffffffff 651 To interpret the syntax as defining UTF-8 coded octet sequences, use 652 the following productions: 654 UCS-no-CTL = UTF8-no-CTL 655 UCS-high = UTF8-multi 656 UTF8-no-CTL = %x20-7e / UTF8-multi 657 UTF8-multi = %xC0-DF %x80-BF 658 / %xE0-EF %x80-BF %x80-BF 659 / %xF0-F7 %x80-BF %x80-BF %x80-BF 660 / %xF8-FB %x80-BF %x80-BF %x80-BF %x80-BF 661 / %xFC-FD %x80-BF %x80-BF %x80-BF %x80-BF %x80-BF 663 4. 
HEADER DEFINITIONS 665 This specification defines a core set of headers that are defined and 666 available for use by applications: the application specification 667 must indicate the headers that may be used, those that must be 668 recognized and those that must appear in any message (see section 6). 670 The header definitions that follow fall into two categories: 672 (a) those that are part of the CPIM format extensibility framework, 673 and 675 (b) some that have been based on similar headers in RFC 822, 676 specified here with corresponding semantics. 678 Header names and syntax are given without a namespace qualification, 679 and the associated namespace URI is listed as part of the header 680 description. Any of the namespace associations already mentioned 681 (implied default namespace, explicit default namespace or implied 682 namespace prefix or explicit namespace prefix declaration) may be 683 used to identify the namespace. 685 All headers defined here are associated with the namespace URI 686 <[[[urn:iana:cpim-headers]]]>, which is defined according to [22]. 688 4.1 The 'From' header 690 Indicates the sender of a message. 692 Header name: From 694 Namespace URI: <[[[urn:iana:cpim-headers]]]> 696 Syntax: (see also section 3.6) 698 From-header = "From" ": " [ Formal-name ] "<" URI ">" 700 Description: 702 Indicates the sender or originator of a message. 704 If present, the 'Formal-name' identifies the person or "real 705 world" name for the originator. 707 The URI indicates an address for the originator. 709 Examples: 711 From: Winnie the Pooh 713 From: 715 4.2 The 'To' header 717 Specifies an intended recipient of a message. 719 Header name: To 721 Namespace URI: <[[[urn:iana:cpim-headers]]]> 723 Syntax: (see also section 3.6) 725 To-header = "To" ": " [ Formal-name ] "<" URI ">" 727 Description: 729 Indicates the recipient of a message. 731 If present, the 'Formal-name' identifies the person or "real 732 world" name for the recipient. 
734 The URI indicates an address for the recipient. 736 Multiple recipients may be indicated by including multiple 'To' 737 headers. 739 Examples: 741 To: Winnie the Pooh 743 To: 745 4.3 The 'cc' header 747 Specifies a non-primary recipient ("courtesy copy") for a message. 749 Header name: cc 751 Namespace URI: <[[[urn:iana:cpim-headers]]]> 753 Syntax: (see also section 3.6) 755 Cc-header = "cc" ": " [ Formal-name ] "<" URI ">" 757 Description: 759 Indicates a courtesy copy recipient of a message. 761 If present, the 'Formal-name' identifies the person 762 or "real world" name for the recipient. 764 The URI indicates an address for the recipient. 766 Multiple courtesy copy recipients may be indicated by including 767 multiple 'cc' headers. 769 Examples: 771 cc: Winnie the Pooh 773 cc: 775 4.4 The 'DateTime' header 777 Specifies the date and time a message was sent. 779 Header name: DateTime 781 Namespace URI: <[[[urn:iana:cpim-headers]]]> 783 Syntax: 785 DateTime-header = "DateTime" ": " date-time 787 (where the syntax of 'date-time' is a profile of ISO8601, defined 788 in "Date and Time on the Internet" [23]) 790 Description: 792 The 'DateTime' header supplies the date and time at which the 793 sender sent the message. 795 One purpose of this header is to provide for protection 796 against a replay attack, by allowing the recipient to know when 797 the message was intended to be sent. The value of the DateTime header 798 is the current time at the sender when the message was 799 transmitted, using ISO 8601 date and time format as profiled in 800 "Date and Time on the Internet: Timestamps" [23]. 802 Example: 804 DateTime: 2001-02-01T12:16:49-05:00 806 4.5 The 'Subject' header 808 Contains a description of the topic of the message. 
810 Header name: Subject

812 Namespace URI: <[[[urn:iana:cpim-headers]]]>

814 Syntax: (see also section 3.6)

816 Subject-header = "Subject" ":" [ lang-param ] SP *HEADERCHAR

818 Description:

820 The 'Subject' header supplies the sender's description of the
821 topic or content of the message.

823 The sending agent should specify the language parameter if it has
824 any reasonable knowledge of the language used by the sender to
825 describe the message.

827 Example:

829 Subject:;lang=en Eeyore's feeling very depressed today

831 4.6 The 'NS' header

833 The "NS" header is used to declare a local namespace prefix.

835 Header name: NS

837 Namespace URI: <[[[urn:iana:cpim-headers]]]>

839 Syntax: (see also section 3.6)

841 NS-header = "NS" ": " [ Name-prefix ] "<" URI ">"

843 Description:

845 Declares a namespace prefix that may be used in subsequent header
846 names. See section 3.4 for more details.

848 Example:

850 NS: MyAlias
851 MyAlias.MyHeader: private-extension-data

853 4.7 The 'Require' header

855 Specifies a header or feature that must be implemented by the receiver
856 for correct message processing.

858 Header name: Require

860 Namespace URI: <[[[urn:iana:cpim-headers]]]>

862 Syntax: (see also section 3.6)

864 Require-header = "Require" ": " Header-name *( "," Header-name )

866 Description:

868 Indicates a header or feature that must be implemented by the
869 receiver for correct message processing. See section 3.5 for more details.

871 Note that there is no requirement that the required header
872 actually be used, but for brevity it is recommended that an
873 implementation not issue a 'Require' header for unused headers.

875 Example:

877 Require: MyAlias.VitalHeader

879 5. EXAMPLES

881 The examples in the following sections use the following per-line
882 tags to indicate different parts of the overall message format:

884 m: MIME headers for the overall message
885 s: a blank separator line
886 h: message headers
887 e: encapsulated MIME object containing the message content
888 x: MIME security multipart message wrapper

890 The following examples also assume that
891 <[[[urn:iana:cpim-headers]]]> is the implied default namespace for the application
892 concerned.

894 5.1 An example message/cpim message

896 The following example shows a message/cpim message:

898 m: Content-type: message/cpim
899 s:
900 h: From: MR SANDERS
901 h: To: Depressed Donkey
902 h: DateTime: 2000-12-13T13:40:00-08:00
903 h: Subject: the weather will be fine today
904 h: Subject:;lang=fr beau temps prevu pour aujourd'hui
905 h: NS: MyFeatures
906 h: Require: MyFeatures.VitalMessageOption
907 h: MyFeatures.VitalMessageOption: Confirmation-requested
908 h: MyFeatures.WackyMessageOption: Use-silly-font
909 s:
910 e: Content-type: text/xml; charset=utf-8
911 e: Content-ID: <1234567890@foo.com>
912 e:
913 e:
914 e: Here is the text of my message.
915 e:

917 5.2 An example using MIME multipart/signed

919 In order to secure a message/cpim message, an application or implementation
920 should use the MIME security multiparts of RFC 1847 with an appropriate cryptographic scheme.

922 Using S/MIME and pkcs7, the above message would look like this:

924 x: Content-Type: multipart/signed; boundary=next;
925       micalg=sha1; protocol=application/pkcs7-signature
926 x:
927 x: --next
928 m: Content-Type: message/cpim
929 s:
930 h: From: MR SANDERS
931 h: To: Depressed Donkey
932 h: DateTime: 2000-12-13T13:40:00-08:00
933 h: Subject: the weather will be fine today
934 h: Subject:;lang=fr beau temps prevu pour aujourd'hui
935 h: NS: MyFeatures
936 h: Require: MyFeatures.VitalMessageOption
937 h: MyFeatures.VitalMessageOption: Confirmation-requested
938 h: MyFeatures.WackyMessageOption: Use-silly-font
939 s:

941 e: Content-type: text/xml; charset=utf-8
942 e: Content-ID: <1234567890@foo.com>
943 e:
944 e:
945 e: Here is the text of my message.
946 e:
947 x: --next
948 x: Content-Type: application/pkcs7-signature
949 x:
950 x: (signature stuff)
951       :
952 x: --next--

954 6. APPLICATION DESIGN CONSIDERATIONS

956 Applications using this specification must specify:

958 o a default namespace URI for messages created and processed by that
959 application

961 o any namespace prefixes that are implicitly defined for messages
962 created and processed by that application

964 o all headers that must be recognized by implementations of the
965 application

967 o any headers that must be present in messages created by that
968 application

970 o any headers that may appear more than once in a message, and how
971 they are to be interpreted (e.g. how to interpret multiple
972 'Subject' headers with different language parameter values).

974 Within a network of message transfer agents, an intermediate gateway
975 MUST NOT change the message/cpim content in any way. This implies
976 that headers cannot be changed or reordered, transfer encoding cannot
977 be changed, languages cannot be changed, etc.

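The rule that an intermediate gateway must not change message/cpim content can be checked mechanically by comparing the octets received with the octets forwarded. The following is a minimal sketch only, not part of this specification: the function names, the use of SHA-256, the CRLF line endings and the sample message (with addresses omitted) are all assumptions made for illustration.

```python
import hashlib


def cpim_digest(octets: bytes) -> str:
    """Return a SHA-256 hex digest over the exact message/cpim octets."""
    return hashlib.sha256(octets).hexdigest()


def forwarded_unchanged(received: bytes, forwarded: bytes) -> bool:
    """True only if the content was passed through octet for octet."""
    return cpim_digest(received) == cpim_digest(forwarded)


# Hypothetical message/cpim content (illustrative; CRLF line ends assumed).
original = (
    b"Subject: the weather will be fine today\r\n"
    b"Subject:;lang=fr beau temps prevu pour aujourd'hui\r\n"
    b"\r\n"
    b"Content-type: text/plain; charset=utf-8\r\n"
    b"\r\n"
    b"Here is the text of my message.\r\n"
)

# A gateway that merely swaps the two Subject headers has altered the message.
lines = original.split(b"\r\n")
reordered = b"\r\n".join([lines[1], lines[0]] + lines[2:])

print(forwarded_unchanged(original, original))   # True
print(forwarded_unchanged(original, reordered))  # False
```

Comparing digests rather than parsed headers is deliberate in this sketch: any reordering or re-encoding changes the octets, which is exactly what would invalidate a signature computed over the message.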
979 Because message/cpim messages are immutable, any transfer agent that
980 wants to modify the message should create a new message/cpim message
981 with the modified headers and containing the original message as its
982 content. (This approach is similar to real-world bill-of-lading
983 handling, where each person in the chain attaches a new sheet to the
984 message. Then anyone can validate the original message and see what
985 was changed and who changed it by following the trail of amendments.
986 Another metaphor is including the old message in a new envelope.)

988 7. IANA CONSIDERATIONS

990 [[[Registration template for message/cpim content type]]]

992 [[[Registration of namespace URN for CPIM headers]]]

994 8. INTERNATIONALIZATION CONSIDERATIONS

996 Message headers use UTF-8 character encoding throughout, so they can
997 convey the full UCS-4 (Unicode, ISO/IEC 10646) character repertoire.

999 Language tagging is provided for message headers using the "lang"
1000 parameter.

1002 Message content is any MIME-encapsulated content, and normal MIME
1003 content internationalization considerations apply.

1005 9. SECURITY CONSIDERATIONS

1007 The message/cpim format is designed with security in mind. In
1008 particular it is designed to be used with MIME security multiparts
1009 for signatures and encryption. To this end, message/cpim messages
1010 must be considered immutable once created.

1012 Because message/cpim messages are binary messages (due to UTF-8
1013 encoding), if they are transmitted across non-8-bit-clean transports
1014 then the transfer agent must tunnel the entire message. Changing the
1015 message data encoding is not an allowable option. This implies that
1016 the message/cpim must be encapsulated by the message transfer system
1017 and unencapsulated at the receiving end of the tunnel.

1019 The resulting message must have no data loss due to the encoding and
1020 decoding of the message. For example, an application may choose to
1021 apply the MIME base64 content-transfer-encoding to the message/cpim
1022 object to meet this requirement.

1024 10. ACKNOWLEDGEMENTS

1026 The authors thank the following for their helpful comments: Harald
1027 Alvestrand, Walter Houser, Leslie Daigle, [[[....]]]

1029 11. REFERENCES

1031 [1] Bradner, S., "Key words for use in RFCs to Indicate Requirement
1032 Levels", RFC 2119, March 1997.

1034 [2] Crocker, D., "Standard for the format of ARPA Internet text
1035 messages", RFC 822, STD 11, August 1982.

1037 [3] Freed, N. and N. Borenstein, "Multipurpose Internet Mail
1038 Extensions (MIME) Part One: Format of Internet Message Bodies",
1039 RFC 2045, November 1996.

1041 [4] Freed, N. and N. Borenstein, "Multipurpose Internet Mail
1042 Extensions (MIME) Part Two: Media Types", RFC 2046, November
1043 1996.

1045 [5] Freed, N., Klensin, J., and J. Postel, "Multipurpose Internet
1046 Mail Extensions (MIME) Part Four: Registration Procedures", RFC
1047 2048, BCP 13, November 1996.

1049 [6] Weider, C., Preston, C., Simonsen, K., Alvestrand, H., Atkinson,
1050 R., Crispin, M. and P. Svanberg, "Report from the IAB Character
1051 Set Workshop", RFC 2130, April 1997.

1053 [7] Alvestrand, H., "Tags for the Identification of Languages", RFC
1054 3066, January 2001. (Defines Content-language header.)

1056 [8] Ramsdell, B., "S/MIME Version 3 Message Specification", RFC
1057 2633, June 1999.

1059 [9] Callas, J., Donnerhacke, L., Finney, H. and R. Thayer, "OpenPGP
1060 Message Format", RFC 2440, November 1998.

1062 [10] Berners-Lee, T., Fielding, R.T. and L. Masinter, "Uniform
1063 Resource Identifiers (URI): Generic Syntax", RFC 2396, August
1064 1998.

1066 [11] Bray, T., Paoli, J. and C. M. Sperberg-McQueen, "Extensible
1067 Markup Language (XML) 1.0", W3C recommendation:
1068 , 10 February 1998.

1070 [12] Bray, T., Hollander, D. and A. Layman, "Namespaces in XML",
1071 W3C recommendation: , 14
1072 January 1999.

1074 [13] "Data elements and interchange formats - Information interchange
1075 - Representation of dates and times", ISO 8601:1988(E),
1076 International Organization for Standardization, June 1988.

1078 [14] Crocker, D.H., Diacakis, A., Mazzoldi, F., Huitema, C., Klyne,
1079 G., Rose, M.T., Rosenberg, J., Sparks, R. and H. Sugano, "A
1080 Common Profile for Instant Messaging (CPIM)",
1081 draft-thenine-im-common-00 (work in progress), August 2000.

1083 [15] Day, M., Aggarwal, S., Mohr, G. and J. Vincent, "Instant
1084 Messaging / Presence Protocol Requirements", RFC 2779, February
1085 2000.

1087 [16] Freed, N. and K. Moore, "MIME Parameter Value and Encoded Word
1088 Extensions: Character Sets, Languages, and Continuations", RFC
1089 2231, November 1997.

1091 [17] Crocker, D. and P. Overell, "Augmented BNF for Syntax Specifications:
1092 ABNF", RFC 2234, November 1997.

1094 [18] Fielding, R., Gettys, J., Mogul, J., Frystyk, H., Masinter, L.,
1095 Leach, P. and T. Berners-Lee, "Hypertext Transfer Protocol -- HTTP/1.1",
1096 RFC 2616, June 1999.

1098 [19] Alvestrand, H., "IETF Policy on Character Sets and Languages",
1099 RFC 2277, BCP 18, January 1998.

1101 [20] Freed, N. and J. Postel, "IANA Charset Registration
1102 Procedures", BCP 19, RFC 2278, January 1998.

1104 [21] Yergeau, F., "UTF-8, a transformation format of ISO 10646", RFC
1105 2279, January 1998.

1107 [22] Mealling, M., "A URN Namespace for IANA Registered Protocol
1108 Elements", draft-mealling-iana-urn-00.txt (work in progress),
1109 November 2000.

1111 [23] Newman, C. and G. Klyne, "Date and Time on the Internet: Timestamps",
1112 draft-ietf-impp-datetime-03.txt (work in progress), May 2001.

1114 12. AUTHORS' ADDRESSES

1116 Derek Atkins
1117 Telcordia Technologies
1118 6 Farragut Ave
1119 Somerville, MA 02144
1120 USA.
1121 Telephone: +1 617 623 3745
1122 E-mail: warlord@research.telcordia.com
1123 E-mail: warlord@alum.mit.edu
1124 Graham Klyne
1125 Baltimore Technologies - Content Security Group,
1126 1310 Waterside,
1127 Arlington Business Park
1128 Theale
1129 Reading, RG7 4SA
1130 United Kingdom.
1131 Telephone: +44 118 903 8000
1132 Facsimile: +44 118 903 9000
1133 E-mail: GK@ACM.ORG

1135 Appendix A: Amendment history

1137 00a 01-Feb-2001 Memo initially created.

1139 00b 06-Feb-2001 Editorial review. Reworked namespace framework
1140 description. Deferred specification of mandatory
1141 headers to the application specification, allowing
1142 this document to be less application-dependent.
1143 Expanded references. Replaced some text with ABNF
1144 syntax descriptions. Reordered some major sections.

1146 00c 07-Feb-2001 Folded in some review comments. Fix up some syntax
1147 problems. Other small editorial changes. Add some
1148 references.

1150 01a 29-Mar-2001 Incorporate review comments. State (simply) that
1151 this is a canonical end-to-end format for the purpose
1152 of signature calculation. Defined escape mechanism
1153 for control characters. Header name parameters
1154 placed after the ":". Changed name of Date: header
1155 to DateTime:. Revised syntax to separate
1156 character-level syntax from UTF-8 octet-level syntax.

1158 01b 30-Mar-2001 State explicitly that unrecognized header names
1159 should be ignored. Remove text about
1160 (non)significance of header order: simply say that
1161 order must be preserved.

1163 02a 30-May-2001 Updated reference to date/time draft. Editorial
1164 changes.

1166 03a 13-Jun-2001 Tighten up application of escape sequences.

1168 TODO:

1170 o confirm urn namespace for headers (currently depends on a
1171 work-in-progress).

1173 o Complete IANA considerations

1175 REVIEW CHECKLIST:

1177 (Points to be checked or considered more widely on or before final
1178 review.)

1180 o The desirability of a completely rigid syntax.

1182 o Escape mechanism details.

1184 Full copyright statement

1186 Copyright (C) The Internet Society 2001. All Rights Reserved. This
1187 document and translations of it may be copied and furnished to
1188 others, and derivative works that comment on or otherwise explain it
1189 or assist in its implementation may be prepared, copied, published
1190 and distributed, in whole or in part, without restriction of any
1191 kind, provided that the above copyright notice and this paragraph are
1192 included on all such copies and derivative works.

1194 However, this document itself may not be modified in any way, such as
1195 by removing the copyright notice or references to the Internet
1196 Society or other Internet organizations, except as needed for the
1197 purpose of developing Internet standards in which case the procedures
1198 for copyrights defined in the Internet Standards process must be
1199 followed, or as required to translate it into languages other than
1200 English.

1202 The limited permissions granted above are perpetual and will not be
1203 revoked by the Internet Society or its successors or assigns. This
1204 document and the information contained herein is provided on an "AS
1205 IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK
1206 FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT
1207 LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL
1208 NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY
1209 OR FITNESS FOR A PARTICULAR PURPOSE.