idnits 2.17.1 draft-lindsey-usefor-signed-01.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- ** Looks like you're using RFC 2026 boilerplate. This must be updated to follow RFC 3978/3979, as updated by RFC 4748. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- ** Missing expiration date. The document expiration date should appear on the first and last page. == No 'Intended status' indicated for this document; assuming Proposed Standard Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** The document seems to lack an IANA Considerations section. (See Section 2.2 of https://www.ietf.org/id-info/checklist for how to handle the case when there are no actions for IANA.) ** The document seems to lack an Authors' Addresses Section. ** The document seems to lack separate sections for Informative/Normative References. All references will be assumed normative when checking for downward references. ** There is 1 instance of too long lines in the document, the longest one being 2 characters in excess of 72. == There are 1 instance of lines with non-RFC2606-compliant FQDNs in the document. Miscellaneous warnings: ---------------------------------------------------------------------------- == The document seems to lack the recommended RFC 2119 boilerplate, even if it appears to use RFC 2119 keywords. (The document does seem to have the reference to RFC 2119 which the ID-Checklist requires). -- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. 
If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. If not, you may need to add the pre-RFC5378 disclaimer. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- The document date (September 2001) is 8257 days in the past. Is this intentional? -- Found something which looks like a code comment -- if you have code sections in the document, please surround them with '' and '' lines. Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) == Missing Reference: 'CFWS' is mentioned on line 481, but not defined == Missing Reference: 'FWS' is mentioned on line 274, but not defined -- Looks like a reference, but probably isn't: '0' on line 1498 == Unused Reference: 'RFC 2234' is defined on line 1018, but no explicit reference was found in the text -- Possible downref: Non-RFC (?) normative reference: ref. 'PGPMOOSE' -- Possible downref: Non-RFC (?) normative reference: ref. 'PGPVERIFY' ** Obsolete normative reference: RFC 1036 (Obsoleted by RFC 5536, RFC 5537) ** Obsolete normative reference: RFC 1327 (Obsoleted by RFC 2156) ** Obsolete normative reference: RFC 2234 (Obsoleted by RFC 4234) ** Obsolete normative reference: RFC 2440 (Obsoleted by RFC 4880) ** Obsolete normative reference: RFC 2821 (Obsoleted by RFC 5321) ** Obsolete normative reference: RFC 2822 (Obsoleted by RFC 5322) -- No information found for draft-ietf-usefor-article-format - is the name correct? -- Possible downref: Normative reference to a draft: ref. 'USEFOR' Summary: 12 errors (**), 0 flaws (~~), 6 warnings (==), 8 comments (--). Run idnits with the --verbose option for more detailed information about the items above. 
-------------------------------------------------------------------------------- 2 Network Working Group Charles H. Lindsey 3 Internet-Draft University of Manchester 4 September 2001 6 Signed Headers in Mail and Netnews 8 draft-lindsey-usefor-signed-01.txt 10 Status of this Memo 12 This document is an Internet-Draft and is in full conformance with 13 all provisions of Section 10 of RFC2026. 15 Internet-Drafts are working documents of the Internet Engineering 16 Task Force (IETF), its areas, and its working groups. Note that 17 other groups may also distribute working documents as Internet- 18 Drafts. 20 Internet-Drafts are draft documents valid for a maximum of six 21 months and may be updated, replaced, or obsoleted by other 22 documents at any time. It is inappropriate to use Internet- 23 Drafts as reference material or to cite them other than as "work 24 in progress." 26 The list of current Internet-Drafts can be accessed at 27 http://www.ietf.org/ietf/1id-abstracts.txt. 29 The list of Internet-Draft Shadow Directories can be accessed at 30 http://www.ietf.org/shadow.html. 32 Abstract 34 The huge growth of Netnews/Usenet in recent years has been 35 accompanied by many attempts to abuse the system by various forms 36 of malpractice, particularly the forging of various headers, 37 causing it to appear that articles came from parties other than 38 those that actually injected them or conveyed some Approval that 39 the real poster was not entitled to give. Insofar as Netnews is 40 regularly gatewayed to and from Email systems, these problems also 41 extend to the Email domain. 43 This document provides a cryptographically secure means whereby it 44 can be established beyond doubt that relevant headers of a Netnews 45 article or an Email message have not been tampered with in 46 transit, and that they were indeed originated by the person 47 purporting to have done so.
It seeks to supplement, rather than to 48 supplant, the existing protocols for signing the bodies of 49 articles and messages. 51 [This proposal arises from the activities of the Usenet Format Working 52 Group, which is charged with updating the Netnews standards. Comments 53 are invited, preferably sent to the mailing list of the Group at 54 usenet-format@landfield.com.] 55 Table of Contents 57 1. Introduction .................................................. 3 58 1.1. Scope and Objectives ...................................... 3 59 1.2. Notations and Conventions ................................. 4 60 1.2.1. Requirements notation ................................. 4 61 1.2.2. Syntactic notation .................................... 4 62 1.3. Overview .................................................. 4 63 2. Basic Structure of Authenticating Headers ..................... 5 64 2.1. Syntax of the Signed header ............................... 5 65 2.2. Semantics of the Signed header ............................ 7 66 2.3. Syntax of the Verified header ............................. 10 67 2.4. Semantics of the Verified header .......................... 10 68 3. Protocol definition ........................................... 11 69 3.1. Requirements for canonicalization algorithms .............. 11 70 3.2. The PGP-Head-1 protocol ................................... 12 71 3.2.1. The PGP-Head-1 canonicalization algorithm ............. 13 72 3.2.2. The PGP-Head-1 cryptographic algorithm ................ 15 73 3.2.3. The PGP-Head-1 Macro Definition ....................... 16 74 4. Applications .................................................. 16 75 5. Examples ...................................................... 17 76 5.1. Newgroup Control message .................................. 17 77 5.2. Mail message re-signed by mailing list owner .............. 18 78 6. Security ...................................................... 19 79 7. 
References .................................................... 20 80 8. Acknowledgements .............................................. 21 81 9. Contact Address ............................................... 21 82 10. Intellectual Property Rights ................................. 21 83 Appendix A - Model implementation ................................. 21 84 Appendix A.1 - The PGP-Head-1 canonicalization .................... 21 85 Appendix A.2 - Parsing of the Signed header ....................... 25 86 Appendix A.3 - The Signing program ................................ 28 87 Appendix A.4 - The Verification program ........................... 29 88 Appendix B - Test cases ........................................... 30 89 Appendix C - PGP Public Key ....................................... 32 90 1. Introduction 92 [Remarks enclosed in square brackets and aligned with the left margin, 93 such as this one, are not part of this draft, but are editorial notes to 94 explain matters amongst ourselves, or to point out alternatives, or to 95 indicate work yet to be done.] 97 1.1. Scope and Objectives 99 [This is a Draft of a Draft, for discussion within the USEFOR mailing 100 list until the best format for putting it forward has been decided on. 101 It remains to be decided whether it should be aimed towards an 102 Experimental Protocol or the Standards track. ] 104 "Netnews" is a set of protocols [USEFOR] that enables news "articles" 105 to be broadcast to potentially-large audiences, using a flooding 106 algorithm which propagates copies throughout a network of 107 participating hosts. 
The huge growth in the use of this protocol in 108 recent years has been accompanied by many attempts to abuse the 109 system by causing it to appear that articles came from parties other 110 than those that actually injected them, or that they had been posted 111 with some Approval that the real poster was not entitled to give, or 112 that they otherwise appeared to be different from what they actually 113 were. The effects of such abuse are particularly acute in the case 114 of "Control" articles which can cause newsgroups to be created or 115 removed on hosts worldwide, or which can cause unauthorized deletion 116 of articles already received and stored on such hosts. It is 117 therefore considered essential to provide a cryptographically secure 118 means whereby it can be established beyond doubt that the source and 119 structure of articles are exactly as they purport to be. 121 "Electronic Mail" is a system for routing "messages" [RFC 2822] 122 between individual computer users, usually on a one-to-one basis. The 123 formats of Email messages and News articles have deliberately been 124 made to be similar, so that messages may be gatewayed to news systems 125 and vice-versa. In order that the same protection may be provided 126 end-to-end for articles passing through such gateways, the protocol 127 described here has been designed so that it will also work in the 128 Email environment. If it should be found to have further applications 129 in the Email environment, then that would be an added bonus. 131 An existing experimental protocol "pgpverify" [PGPVERIFY] is already 132 in widespread use for authenticating Control messages for creating 133 and removing newsgroups within Usenet, and has proven itself very 134 successful in mitigating the effects of malicious attacks against the 135 integrity of Usenet.
This present proposal is largely based upon 136 pgpverify; however, pgpverify is unsuitable for more widespread use 137 as it stands because it is unable to cope with folded headers and 138 with the changes that mail messages in particular are likely to 139 undergo during transport. A second similar experimental protocol 140 "pgpmoose" [PGPMOOSE] is also currently in use for protecting 141 moderated newsgroups against unauthorized postings. 143 There also exist protocols for the cryptographic signature of bodies 144 of articles, notably S/Mime and PGP/Mime [RFC 2015] and [RFC 2015bis] 145 , and it is moreover common to sign such bodies using PGP alone 146 without the use of Mime [RFC 2045] et seq at all. However, these 147 protocols cannot, by their nature, be used to sign headers. Moreover, 148 since the signature is applied after any Content-Transfer-Encoding 149 [RFC 2045], it may be impossible to verify the signature if the 150 Content-Transfer-Encoding should be changed as the message passes 151 through a succession of sites during transport. Nevertheless, this 152 present proposal does not attempt to usurp those protocols, but 153 merely provides the means to sign headers, both of complete messages 154 and of headers embedded in Mime messages and multiparts. 156 [This document has been designed to fit on top of the draft currently in 157 preparation for News [USEFOR]. If it is thought wise to issue this 158 document before [USEFOR] is complete, then that reference will have to 159 be to [RFC 1036] instead.] 161 1.2. Notations and Conventions 163 1.2.1. Requirements notation 165 Certain words, when capitalized, are used to define the significance 166 of individual requirements. The key words "MUST", "SHOULD", "MAY" and 167 the same words followed by "NOT" are to be interpreted as described 168 in [RFC 2119]. 170 1.2.2. Syntactic notation 172 This document uses the Augmented Backus Naur Form described in [RFC 173 2234]. 
A discussion of this is outside the bounds of this document, 174 but it is expected that implementors will be able to quickly 175 understand it with reference to the defining document. 177 1.3. Overview 179 This proposal makes provision for Signed headers to be included in 180 news articles and in Mime messages and multiparts. A Signed header 181 provides a cryptographic signature over a named set of other headers, 182 including lower level headers contained in Mime messages and 183 multiparts below the current level. Such signatures can give 184 assurance to a recipient who verifies them that those headers have 185 not been changed or added to in transit, and/or that the article was 186 indeed sent by its purported originator. 188 The bodies of articles, Mime messages and multiparts are not directly 189 included in the Signature. Rather, the intention is that each such 190 body part should have a Content-MD5 (or similar) header computed for 191 it, and that header should then be included in the Signature instead. 193 There is also provision for Verified headers which may be added by 194 agents that have checked a Signed header. Verified headers may 195 themselves be included in further Signed headers; this may be 196 especially useful in the case of gateways which find it necessary to 197 change an article in ways that invalidate an original signature. 199 Every effort has been made to ensure that signatures remain 200 verifiable in spite of all reasonable (and even unreasonable) changes 201 to which they may be subjected in transit. These include changes to 202 the Content-Transfer-Encoding of body parts (a principal reason for 203 including them only via the Content-MD5 header), changes in the order 204 of headers and of their layout, and encodings and re-encodings of 205 unusual character sets. This is to be achieved by converting headers 206 into a canonical form before they are signed.
New headers, yet to be 207 invented, need pose no problem, and there is no commitment to any 208 particular character set (provided field-names remain in US-ASCII, as 209 at present). 211 Provision is made for different protocols which may be required in 212 the future. However, this proposal defines just one, recommended 213 protocol, and it is not desirable that other protocols should be 214 defined unless and until serious deficiencies in the existing ones 215 have been revealed. 217 2. Basic Structure of Authenticating Headers 219 A Signed or a Verified header may appear in the headers of a news 220 article or a mail message, or in the headers of a Mime multipart 221 sub-part or of a Mime message/rfc822 object (or indeed of any similar 222 Mime object yet to be invented). In all cases, the term "current 223 level" encompasses the entire set of headers in that same object. 224 Where the headers at the current level include a "Content-Type: 225 multipart/*" or "Content-Type: message/*" header, lower-level 226 headers can arise within its sub-parts. 228 Examples of Netnews articles and Email messages containing these 229 headers may be found in section 5 below. 231 2.1.
Syntax of the Signed header 233 Signed = "Signed" ["-" DIGIT9] ":" 1*SP header-ref-list 234 1*( ";" header-parameter ) CRLF 235 DIGIT9 = %x31-39 ; 1..9 236 header-ref-list= header-ref *( [CFWS] "," [CFWS] header-ref ) 237 header-ref = [ "+" / "-" ] [ subpart-indicator ] field-name / 238 [ subpart-indicator ] macro-name 239 subpart-indicator 240 = 1*( DIGIT9 *DIGIT ":" ) 241 field-name ; see section [RFC 2822] 242 CFWS ; see [RFC 2822] 243 FWS ; see [RFC 2822] 244 macro-name = '$' 247 header-parameter 248 = signed-header-parameter / other-header-parameter 249 signed-header-parameter 250 = signed-token "=" value 252 signed-token = [CFWS] ( "protocol" / "key" / "sig" ) [CFWS] / 253 254 other-header-parameter 255 = attribute "=" value 256 attribute = iana-token / x-token 257 iana-token = 260 value = token / quoted-string 261 x-token = [CFWS] "x-" token-core [CFWS] 262 token = [CFWS] token-core [CFWS] 263 token-core = 1* 265 tspecials = "(" / ")" / "<" / ">" / "@" / 266 "," / ";" / ":" / " 267 "/" / "[" / "]" / "?" / "=" 268 quoted-string ; see [RFC 2822] 269 protocol-value = ietf-token / x-token 270 ietf-token = 273 key-id-value = token 274 signature-value= [CFWS] DQUOTE [FWS] 1*( btext [FWS] ) DQUOTE [CFWS] 275 btext = %x41-5A / %x61-7A / %x30-39 / "+" / "/" / "=" 276 ; base 64 chars 278 The header-parameters MUST include a "protocol" parameter and a "sig" 279 parameter, of which the "sig" parameter MUST be the last parameter 280 and MUST NOT be followed by CFWS (though it MAY be followed by WS). 281 Any other-header-parameter that is present SHOULD be ignored. 283 NOTE: The requirement for an explicit SP after the ":" is to 284 ensure compatibility with the syntax of Netnews [USEFOR]; it is 285 not strictly necessary for Email. 287 The use of a DIGIT9 in the Signed header allows for 10 distinct such 288 headers at any one level.
This is more than sufficient for the 289 intended usage (it would be most unusual to get beyond Signed-2) 290 whilst still permitting implementations to check field-names against 291 a fixed list of valid names. There MUST NOT be more than one Signed 292 header with no DIGIT9, or the same DIGIT9, within one set of headers. 294 The header-ref-list indicates those header-refs, at or below the 295 current level, which are covered by the signature. The ordering of 296 this list is significant. A header-ref prefixed by a "+", or not 297 prefixed at all, indicates a header-ref to be added to the list 298 defined by those preceding it (unless already present therein), and a 299 header-ref prefixed by "-" indicates a header-ref to be removed from 300 the header-refs defined by the list preceding it. Any macro-names in 301 the header-ref-list MUST be defined, in the definition of the 302 protocol, to expand into a header-ref-list which does not itself 303 contain any subpart-indicators or further macro-names. 305 NOTE: If some header-ref in the list matches no header in the 306 actual article, then it comprises an assertion that no such 307 header was present when the article was signed. Headers which 308 are routinely added to or altered as the article progresses 309 through transports (such as Path, Received and Xref, and even 310 Sender and Content-Transfer-Encoding) SHOULD NOT be included in 311 a header-ref-list, and neither should any header which appears 312 twice in the set of headers. A header-ref prefixed by "-" may be 313 used to exclude any header-ref arising from the expansion of a 314 macro-name. 316 Tokens are case-insensitive. "PGP-Head-1" (section 3.2) is the 317 preferred protocol defined by this proposal. 
It is desirable to keep 318 the number of recognized protocols to an absolute minimum, and it is 319 anticipated that further protocols would only be needed in the event 320 that serious cryptographic deficiencies were to be found in the 321 existing ones. 323 The "key" parameter identifies the key used to generate the signature 324 in a notation dependent upon the protocol (but commonly "0x" followed 325 by hexadecimal digits). The CFWS following it MAY include a comment 326 containing an identification of the person or entity which owns that 327 key. 329 2.2. Semantics of the Signed header 331 Where the headers at the current level include a "Content-Type: 332 multipart/*" or "Content-Type: message/*" header, lower-level headers 333 within its sub-parts may be referenced as follows: 335 (i) A header-ref not containing any subpart-indicator references the 336 header of that name, if any, at the current level. Header-refs 337 are, for this purpose, considered as case-insensitive. 339 (ii) A header-ref of the form ":XXXX/" (or "::...XXXX"), 340 where and are numbers and the current level contains a 341 "Content-Type: multipart/*" header, references the header that 342 would be referenced by "XXXX" alone (or by ":...XXXX") in the 343 th sub-part of that multipart, that sub-part now being 344 regarded as the current level. 346 (iii)A header-ref of the form "1:XXXX", where the current level 347 contains a "Content-Type: message/rfc822" header (or any other 348 message type which provides for its own set of headers), 349 references the header that would be referenced by "XXXX" alone 350 in that message object. 352 (iv) A header-ref that does not match up with multipart or message 353 Content-Type headers as indicated above MUST NOT be used. 355 (v) For example "3:2:Content-MD5" references the Content-MD5 header 356 of the second part of a multipart, which is itself the third 357 part of a multipart established at the current level. 
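The referencing rules (i) to (v) above can be sketched in code. The following Python fragment is only an illustration and is not part of this proposal: the names HeaderRef, parse_header_ref and resolve, and the dictionary layout used to model an article (a "headers" mapping keyed by lower-cased field-name, plus an optional "parts" list), are assumptions made for this example. It further assumes that field-names consist solely of letters, digits and hyphens, it treats a message/rfc822 object as having exactly one sub-part, and it does not handle macro-names.

```python
import re
from typing import NamedTuple, Optional, Tuple


class HeaderRef(NamedTuple):
    """One parsed header-ref (hypothetical helper type for this sketch)."""
    sign: str               # "+", "-" or "" when no prefix was given
    path: Tuple[int, ...]   # subpart numbers, outermost first
    name: str               # field-name, lower-cased (matching is case-insensitive)


# [ "+" / "-" ] [ subpart-indicator ] field-name
# Assumption: field-names use only letters, digits and hyphens.
_HEADER_REF = re.compile(r"([+-]?)((?:[1-9][0-9]*:)*)([A-Za-z0-9-]+)")


def parse_header_ref(text: str) -> HeaderRef:
    """Split a header-ref into its prefix, subpart path and field-name."""
    match = _HEADER_REF.fullmatch(text.strip())
    if match is None:
        raise ValueError("not a header-ref handled by this sketch: %r" % text)
    sign, indicators, name = match.groups()
    path = tuple(int(n) for n in indicators.split(":") if n)
    return HeaderRef(sign, path, name.lower())


def resolve(part: dict, ref: HeaderRef) -> Optional[str]:
    """Return the referenced header value, or None if no such header exists.

    Each number in the path descends into that sub-part, which then
    becomes the current level (rules (ii) and (iii)). A path that does
    not match the structure is rejected, since rule (iv) forbids it.
    """
    for number in ref.path:
        subparts = part.get("parts", [])
        if not 1 <= number <= len(subparts):
            raise LookupError("header-ref does not match the Mime structure")
        part = subparts[number - 1]
    return part["headers"].get(ref.name)
```

With these definitions, parse_header_ref("3:2:Content-MD5") yields the path (3, 2) and the name "content-md5", which resolve then follows into the second part of the third part, as in rule (v).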
359 A protocol, as established by this proposal or by any extension to 360 it, comprises three parts: a "canonicalization algorithm", a 361 "cryptographic algorithm", and possibly a "macro definition". 363 The signature of a Signed header is constructed in accordance with a 364 given header-ref-list as follows: 366 1. A partial Signed header is constructed from that header-ref-list 367 and such header-parameters (excluding "sig") as are required by 368 the protocol, including at least a "protocol" parameter and, most 369 likely, a "key" parameter identifying the cryptographic key used 370 (possibly followed by a comment indicating the person or entity 371 responsible), all followed by a CRLF. 373 2. A reduced header-ref-list is obtained by first expanding any 374 macro-names defined in the macro definition of the protocol, 375 duplicating any subpart-indicator in front of each header-ref in 376 the expansion; then any "+" prefixes are stripped, and finally, 377 working from left to right, if a header-ref duplicates a preceding 378 one the second copy is removed, and if a header-ref is prefixed by 379 "-", that copy and any previous ones (without that prefix) are 380 removed. 382 3. The partial Signed header (with its original header-ref-list) 383 followed by all the headers referenced by the reduced header-ref- 384 list (being headers at the current level or encapsulated within 385 multiparts at any lower level and taken in their order within the 386 reduced header-ref-list) are concatenated to produce a list of 387 headers to be signed. 389 4. The list of headers to be signed is subjected to the 390 canonicalization algorithm of the protocol to produce a 391 canonicalized list. 393 5. The canonicalized list is subjected to the cryptographic algorithm 394 of the protocol to produce a character stream representing the 395 signature encoded in base64 [RFC 2045]. 397 6. 
A "sig" parameter is appended to the partial Signed header, its 398 value consisting of a quoted-string containing the base64-encoded 399 stream, split into convenient lines by the insertion of FWS. 401 7. The Signed header thus constructed is then incorporated into the 402 set of headers at the current level. 404 The signature of a Signed header is verified as follows: 406 1. The "sig" parameter is removed from the Signed header to give a 407 partial Signed header. 409 2-4.The corresponding steps of the process that constructed the 410 header are taken, producing a canonicalized list. 412 5. The public key identified according to the "protocol" parameter is 413 now used by the cryptographic algorithm of that protocol to verify 414 that canonicalized list. This may result in a simple pass-fail, or 415 it may return some indication of the privileges (such as the 416 authority to issue certain news control messages or to manage some 417 mailing list) enjoyed by the owner of that key. 419 The purpose of a Signed header is solely to establish that the 420 headers referenced in it were present in an article when that article 421 passed through the hands of the person or entity that generated the 422 signature (and hence that it did indeed pass through those hands). It 423 SHOULD NOT be taken as an endorsement of whatever is contained in the 424 body of the article. If the contents of the body require such 425 endorsement, then the body SHOULD be signed separately, for example 426 in accordance with PGP/Mime [RFC 2015] and [RFC 2015bis]. 428 Signatures will typically be generated by the originators of articles 429 (to prove the origin), by moderators of moderated newsgroups (to 430 testify to their Approved header), by managers of mailing lists, and 431 occasionally by gateways. They SHOULD NOT be generated by 432 intermediate transports and relayers through which the article might 433 pass. 
This is intended to be an end-to-end protocol, and signatures 434 SHOULD ONLY be added when new, hitherto unsigned, information is 435 added. Moreover, the set of headers included within the signature 436 SHOULD be no more than is necessary to achieve the security desired. 438 NOTE: It will be observed that no provision has been made to 439 include the bodies of an article or of its sub-parts in the 440 signature. If (as will indeed often be the case) it is required 441 to attest that the body (or sub-part) dispatched along with the 442 set of headers is the same as the body that was delivered at the 443 far end, then the proper procedure is to construct a Content-MD5 444 header [RFC 1864] for that body (or sub-part) and to include 445 that Content-MD5 amongst the headers that are signed. Doing it 446 this way confers three advantages: 447 a) The Content-MD5 header is constructed in such a way that it 448 is immune to changes of Content-Transfer-Encoding to which an 449 article, or its sub-parts, may be subjected during transport. 450 b) Given that many user agents already routinely construct a 451 Content-MD5 header, and verify it on receipt (a practice much to 452 be commended), it should be possible to generate a Signed header 453 without an extra pass through the entire body (especially in the 454 common case where there are no sub-parts). This applies 455 particularly in the case of additional signatures by moderators 456 or mailing list managers, who may not need to examine the body 457 at all. 458 c) If a Content-MD5 header should fail to verify (perhaps 459 because of some transmission error) the verification of a Signed 460 header might still succeed, giving the recipient at least some 461 partial information as to where any problem might lie. 463 NOTE: If, at some future time, a Content-SHA1 header (or any 464 similar header based upon a different hashing algorithm) should 465 be invented, it could equally well be used for this purpose. 467 2.3. 
Syntax of the Verified header 469 Verified = "Verified" ["-" DIGIT9] ":" 1*SP name-addr 470 *( ";" header-parameter ) CRLF 471 name-addr ; see [RFC 2822] 472 header-parameter 473 =/ verified-header-parameter 474 verified-header-parameter 475 = signature-token "=" signature-value / 476 hashcheck-token "=" hashcheck-value 477 signature-token= [CFWS] "signature" [CFWS] 478 hashcheck-token= [CFWS] "hashcheck" [CFWS] 479 signature-value= [CFWS] ( "good" / "FAILED" ) [CFWS] 480 hashcheck-value= [CFWS] DQUOTE ( "good" / "FAILED" ) 481 FWS header-ref-list DQUOTE [CFWS] 483 The Verified header is a "variant header" (i.e. it may be present or 484 not, and in a different form, in different copies of the same article 485 or message). 487 The use of a DIGIT9 in the Verified header allows for 10 distinct 488 such headers in one article. Each Verified header MUST match some 489 Signed header with the same DIGIT9 in that same set of headers. There 490 MAY be more than one Verified header with the same DIGIT9 within one 491 set of headers (but observe that it would not then be possible to 492 include those headers in a further Signed header). 494 Tokens used for attributes are case-insensitive. The only parameters 495 defined by this proposal are the "signature" and "hashcheck" 496 parameters. Any other-header-parameter that is present SHOULD be 497 ignored. The absence of a "signature" parameter should be taken as 498 indicating that the verification had succeeded. The "hashcheck" 499 parameter is to indicate that a Content-MD5 (or similar) header 500 identified in the header-ref-list had been verified, or not as the 501 case may be. 502 [Do we also want a "confidence" parameter for the verifier to express 503 his certainty of the identity of the original Signer, and if so, what 504 notation to use?] 506 2.4. 
Semantics of the Verified header 508 The Verified header is intended to be added to an article by an agent 509 through which the article passes, and serves as an assertion that the 510 corresponding Signed header has been cryptographically verified by 511 the person or entity identified in the name-addr (or otherwise if the 512 "FAILED" value is present). The addr-spec contained in that name- 513 addr MUST be a valid email address by which that person or entity may 514 be contacted. The original Signed header MUST NOT be removed from the 515 article. The Verified header (supposing it is the only one present 516 with that particular DIGIT9, if any) MAY itself be included in a 517 further Signed header added at the same time. 519 NOTE: The purpose of a Verified header is to save the ultimate 520 recipient the trouble of verifying the cryptographic signature 521 himself (which can be time consuming, and may require knowledge 522 of public keys not in his possession). Such a verification, if 523 performed close to the ultimate recipient (such as by the news 524 or mail server to which he connects) could normally be regarded 525 as adequate evidence of authenticity, even if not signed itself. 526 It would be hard (certainly in the case of Netnews) for a 527 malicious interloper to cause such a verification to appear 528 bearing the identity of the local server of each ultimate 529 recipient. 531 NOTE: The Verified header is also useful in the case that a 532 gateway (or a moderator) makes some change to an article that 533 renders an original Signed header invalid. Such a gateway can 534 therefore certify that the original form of the Signed header 535 had been verified, and can then re-sign the article (including 536 the added Verified header). Likewise, a site (such as the 537 originator's own server) with a well known public key can verify 538 and re-sign an article whose originator's public key may be less 539 well known.
However, Verified headers SHOULD NOT be added as 540 routine by other intermediate sites. 542 It is normally the business of the reading agent of the ultimate 543 recipient to check the correctness of a Content-MD5 or similar 544 header. Nevertheless, an earlier agent that has added a Verified 545 header and also checked such a Content-MD5 header MAY so indicate by 546 including a "hashcheck" parameter. 548 3. Protocol definition 550 3.1. Requirements for canonicalization algorithms 552 It is a sad fact of life that those implementing agents for handling 553 Netnews and Email cannot resist the temptation to "improve" articles 554 passed through them by rewriting headers that are thought not to 555 conform to some real or supposed standard. Experience shows that, in 556 the majority of cases, such tinkering makes matters worse rather than 557 better, and for that reason [USEFOR] and, to a lesser extent, [RFC 558 2822] and [RFC 2821] try to forbid it, especially when perpetrated by 559 relaying and transport agents (there are arguments in favour of 560 allowing injecting agents and other agents close to the originator to 561 do some limited cleanups, especially where it is impractical to 562 return the article to the originator for correction). 564 Furthermore, in the case of Email it is often required for the 565 transport protocols to modify articles en route, most notably when 566 articles containing octets with the 8th bit set have to be passed 567 through a channel that permits only 7bit. 569 It is a further sad fact of life that agents which make such changes 570 are not going to go away just because some standard says so. 571 Therefore, the canonicalization algorithm SHOULD endeavour to enable 572 the headers of articles to be signed and verified in accordance with 573 this proposal in spite of such tinkerings, insofar as they can be 574 anticipated. The following list indicates some common practices which 575 are worth detecting and protecting against. 

   o Headers may be re-folded to fit within some preferred overall
     line length. This may result in the creation of whitespace where
     none existed before.
   o Trailing whitespace may be removed, and line endings changed
     to/from CRLF.
   o Field-names may be converted into some usual canonical form (e.g.
     "Mime-Version" into "MIME-Version").
   o Phrases, or parts thereof, may be converted to or from
     quoted-strings.
   o Date-times may be rewritten in some preferred format, or into
     some preferred timezone.
   o Headers with non-ASCII characters may be converted to or from the
     notation defined in [RFC 2047].
        Observe that there is no canonical way to do this conversion
        and it is, moreover, frequently performed in contexts where it
        is not strictly allowed.
   [Other contributions to this list welcomed.]

   Since the slightest change to a canonicalization algorithm will
   render it inoperable with previous versions, such an algorithm MUST
   NOT be changed once it has been defined by this proposal, or any
   extension thereof. In the event of some inadequacy being found, it
   would be necessary to devise and standardize a new algorithm, a task
   not to be undertaken lightly. For this reason, canonicalization
   algorithms SHOULD be designed to cope with the widest possible range
   of headers, including those not yet invented. Therefore, they SHOULD
   NOT, so far as possible, rely on the ability to parse any particular
   header.

      NOTE: A canonicalization algorithm is required simply to produce
      an octet stream for submission to the cryptographic algorithm.
      That stream does not have to be human readable, nor does it have
      to be a syntactically-correct header, nor does it have to be
      convertible back into the original header, or into any correct
      header at all.
      Insofar as many original headers can, in
      principle, be mapped into the same octet stream, this in no way
      reduces the utility of the algorithm, even though it might
      enable conspiracy theorists to imagine, and even implement,
      various sorts of covert channels for use by malicious
      interlopers.

3.2. The PGP-Head-1 protocol

   The "PGP-Head-1" protocol comprises a canonicalization
   algorithm, a cryptographic algorithm, and a macro definition.

3.2.1. The PGP-Head-1 canonicalization algorithm

   For the purposes of this algorithm, the headers Subject, Comments,
   Organization and Summary, and all headers starting with "X-", are to
   be considered "unstructured" and all other headers "structured"
   (whether or not they were so described in any other standard).
   Headers are considered to be constrained to the following syntax:

      structured-header
                    = header-name ":"
                      1*SP structured-header-content CRLF
      unstructured-header
                    = header-name ":"
                      1*SP unstructured-header-content CRLF
      header-name   = 1*name-character *( "-" 1*name-character )
      name-character= ALPHA / DIGIT
      structured-header-content
                    = *structured-header-zone
      unstructured-header-content
                    = unstructured-header-zone
      structured-header-zone
                    = neutral-zone / quoted-zone / sharp-zone /
                      square-zone / comment-zone
      unstructured-header-zone
                    = *( FWS / encoded-word / )
      neutral-zone  = 1*( FWS / encoded-word /
                      )
      quoted-zone   = DQUOTE *( FWS /
                      )
                      DQUOTE
      sharp-zone    = "<" *( FWS /
                      "> ) ">"
      square-zone   = "[" *( FWS /
                      ) "]"
      comment-zone  = "(" *( FWS / encoded-word / comment-zone /
                      ) ")"
      encoded-word  = "=?" pure-token "?" pure-token "?"
                      1* "?="

      pure-token    = 1*

   o where '' means any octet other than those
     representing the US-ASCII characters NUL, CR, LF, TAB and SP,
   o where 'except unquoted "x"' means except any "x" not immediately
     preceded by a "\" and thus constituting a quoted-pair, and
   o where an encoded-word does not include "(" or ")" when in a
     comment-zone, and does not include DQUOTE, "<", "[", or "(" when
     in a neutral-zone.
   Observe that certain field-names containing non-alphanumeric
   characters, and permitted by [RFC 2822] (though never used in
   practice) are excluded from this protocol. Moreover, it is not
   assumed that this protocol will work on any of the obsolete syntax
   defined by [RFC 2822].

      NOTE: All known Email and Netnews headers (and a lot more
      besides) are encompassed within this syntax. Observe that the
      various zones cannot possibly overlap, and that any encoded-word
      must be fully contained within its zone. All encoded-words
      permitted by [RFC 2047] (and more besides) are covered. The
      structure is easily parsed by a straightforward state machine
      (though the nesting of comment-zones is a nuisance, as is the
      impossibility of detecting whether a sequence beginning "=?" was
      really an encoded-word until you get to the matching "?=").

   Each header to be included in the algorithm, which will in general
   consist of several lines (those after the first commencing with
   whitespace), is processed as follows:

   1. The header-name at the start of the header is converted to
      lowercase, the whitespace following it (if any) is removed, and a
      single SP is inserted.

   2. Any date-time occurring in a Date, Resent-Date or Expires header
      (but not in any other header) is converted to UTC and written as a
      date-time in the format
           07 dec 2000 23:59:60 +0000
      Note absence of day-of-week, leading zero included in day,
      month-name in lower case, year as four digits, seconds included,
      and timezone 0000 preceded by a "+" as opposed to a "-", and
      observe that any FWS will be removed in the next step, giving
           07dec200023:59:60+0000

      NOTE: Observe that the effect is to treat "31 Dec 2000 23:59:60
      +0000" (which is a legitimate date-time as defined by
      [RFC 2822]) as being different from "1 Jan 2001 00:00:00 +0000".

   3. Within each unstructured-header-zone and each comment-zone, all
      instances of FWS are replaced by a single SP; within each
      neutral-, quoted-, sharp- or square-zone, all instances of FWS
      are omitted (thus the header has now been unfolded into a single
      line). Any whitespace at the end of the header is removed, and it
      is ensured that the header ends with a single CRLF.

   4. The DQUOTEs (ASCII '"') enclosing each quoted-zone are removed
      (but not any quoted DQUOTE or any DQUOTE within other zones so
      that, in particular, they are not removed within msg-ids).

   5. Any encoded-word (where allowed by the above syntax, and whether
      or not its length is more than 75 characters) is replaced by the
      sequence of octets obtained by decoding it. Moreover, where two
      adjacent encoded-words are separated by whitespace, that
      whitespace is removed (see [RFC 2047]).

      NOTE: The decoding of encoded-words must take place last,
      because it could produce arbitrary sequences of octets (when
      decoding into a 16-bit charset such as UTF-16, for example)
      which might then be confused with US-ASCII characters such as
      DQUOTE, etc.
      Whitespace needs
      to be removed entirely from structured headers because it is
      possible it may have been introduced by folding in unexpected
      places en route, subsequent to the original signing.

   Typical results of the use of this canonicalization algorithm can be
   found with the example in section 5.2 below. Test data for use with
   implementations of this algorithm, and containing many obscure cases,
   can be found in Appendix B.

   If, during signing, a header is found not to conform to the given
   syntax (in particular, if the closing delimiter of some zone is not
   found), then the signing MUST be aborted (and it MAY be aborted if
   the header is malformed for some other reason). When verifying a
   signature, however, an implementation MAY attempt to continue even
   when the final zone of a header has no closing delimiter.

      NOTE: If an internet mail message in the format defined by
      [RFC 2822] is converted into X.400 mail by a gateway conforming
      to [RFC 1327] and then back into internet mail, then it is likely
      that any signature made in accordance with this proposal will
      fail to verify. For example, comments in headers containing
      addresses (such as in From, Reply-To, etc.) may be converted
      into phrases and moved in front of the addr-spec, or even
      removed entirely, and thus the canonicalized form of the message
      will have been changed. This old convention, for storing the
      Real Name of the person associated with the address in a
      following comment, is now deprecated by both [RFC 2822] and
      [USEFOR], but even where phrases are used for this purpose it is
      possible that other changes to the message will still render the
      signature unverifiable. Note that there is in any case no
      expectation that an internet mail message signed according to
      this proposal will ever be able to be verified once it has been
      passed permanently into an X.400 system, nor vice versa.

3.2.2. The PGP-Head-1 cryptographic algorithm

[Open PGP is the obvious choice for this, since it is widely available
and is blessed by the IETF. My only reservation is that it comes with a
rather poor certification system as compared with, say, SPKI. So this
choice might yet have to be reviewed.]

   The stream of octets resulting from the canonicalization algorithm is
   signed, in binary mode (signature type 0x00), in accordance with Open
   PGP [RFC 2440].

      NOTE: The signature is made in binary mode just in case any
      [RFC 2047] decoding into a 16-bit charset such as UTF-16 has
      produced octets which might be mistaken for isolated CR, LF or
      trailing SP characters, which are treated specially in PGP text
      mode, and also because the treatment of trailing whitespace
      differs between Open PGP and some earlier PGP implementations.

   The output of the algorithm MUST be ASCII-armored [RFC 2440], but the
   Armor Header Line ("BEGIN PGP SIGNATURE"), the Armor Headers (e.g.
   "Version:"), the blank line following the Armor Headers, and the
   Armor Tail ("END PGP SIGNATURE") are to be omitted (thus yielding a
   sequence of base64 characters). Observe that these characters will
   include a CRC checksum, which SHOULD be on a separate line from the
   rest of the signature.

   The signature included within the ASCII armor MAY include
   certificates as evidence that the signing key has the necessary
   authorization to sign articles of that nature, but such usage is in
   general deprecated except between parties that have agreed otherwise
   or where, for some reason, an unusual signatory is signing and
   attaches a certificate from the usual signatory.

   The signature SHOULD use the DSA public-key algorithm and the SHA-1
   hashing algorithm, and be incorporated in a Version 4 Signature
   Packet in the new format.
   It MAY alternatively use the combination
   RSA/MD5 with Version 3 in the old format (for compatibility with PGP
   2.6.x) and it MAY use the combination RSA/SHA-1 with Version 4 in the
   new format. Verifiers MUST be able to verify all of these forms.

3.2.3. The PGP-Head-1 Macro Definition

   Two macro-names are defined, "$news-standard" and "$mail-standard".

   "$news-standard" is a macro representing a set of common headers that
   SHOULD normally be included when signing the headers of a Netnews
   article, and is defined to expand into the header-ref-list

      Date, Newsgroups, Distribution, Message-ID, From, Reply-To,
      Followup-To, References, Subject, Keywords, Control, Content-Type,
      Content-ID

   "$mail-standard" performs the same function for mail messages, and is
   defined to expand into the header-ref-list

      Date, From, Reply-To, To, Cc, In-Reply-To, References, Subject,
      Keywords, Content-Type, Content-ID

      NOTE: Those lists have carefully excluded those headers (such as
      Sender and Content-Transfer-Encoding) which are liable to be
      added or altered by sites downstream from the one which
      generated the Signed header.

4. Applications

   It is anticipated that protocols for specific applications of the
   signature mechanisms described in this proposal will be devised,
   whether under the auspices of the IETF or otherwise. For example, the
   need to be able to verify the origin of Control messages for creating
   and removing newsgroups and for cancelling articles was a prime
   motivation for creating this proposal.

   It is up to each such application to specify appropriate mechanisms
   for establishing a Public Key Infrastructure suited to its purpose.
   Such an infrastructure would provide for the storing, distribution
   and authorization of the necessary public keys (and for revocations
   thereof).
   This proposal establishes no preferred mechanisms in this
   regard, except to draw attention to the possible usefulness of the
   Content-Type application/pgp-keys as defined in [RFC 2015] and
   [RFC 2015bis].

5. Examples

5.1. Newgroup Control message

   A 'newgroup' control message in the format given in [USEFOR].

   Newsgroups: comp.foo
   From: "Charles Lindsey"
   Subject: cmsg newgroup comp.foo moderated
   Control: newgroup comp.foo moderated
   Approved: newgroups-request@isc.example
   Message-ID: <919190727.4918@isc.example>
   Date: Tue, 16 Feb 1999 18:45:27 -0000
   MIME-Version: 1.0
   Content-Type: multipart/mixed; boundary=88888888
   Signed: $news-standard,+1:content-md5,+1:content-type,+3:content-md5,
      +3:content-type; protocol=pgp-head-1; key="0xA336D40C"
      (DSS-example);
      sig="
      iQA/AwUAO40EkyQRKsmjNtQMEQIwGgCfSYj8sgrRgRNQIwWKKnk+M9j0o+wAn2Mp
      fS2zwmmrA/KvCXyiTFsk35pr
      =8p5V"

   This is a multipart message in MIME format.

   --88888888
   Content-Type: application/news-groupinfo
   Content-MD5: 68BGYb5+8KAVeqno7Et7Ug==

   For your newsgroups file:
   comp.foo For Foo discussions (Moderated)

   --88888888
   Content-Type: text/plain

   comp.foo a moderated newsgroup which passed its vote for creation
   by 424:8 as reported in news.announce.newgroups on 10 Feb 99.

   --88888888
   Content-Type: application/news-transmission
   Content-MD5: cjeIxiGbPsrse1G/w9cfqQ==

   Newsgroups: comp.foo
   Path: not-for-relaying
   Distribution: local
   From: "Charles Lindsey"
   Message-ID: <919190727.4918$p=1@isc.example>
   Date: Tue, 16 Feb 1999 18:45:27 -0000
   Subject: Charter for newsgroup com.foo
   Approved: newgroups-request@isc.example

   The charter, culled from the call for votes:

   Comp.foo is a moderated newsgroup for discussing all manner of
   Foos.

   Moderation submission address:
   comp-foo@bar.example

   --88888888--

5.2. Mail message re-signed by mailing list owner

   received: from house.example by bar.example (8.8.8/AL/MJK-2.0)
      id XAA10880; Sat, 13 Feb 1999 23:00:14 GMT
   Resent-From: "Example Mail Server"
   Precedence: list
   Received: (from list@localhost)
      by house.example (8.9.2/8.9.2) id OAA28279;
      Sat, 13 Feb 1999 14:59:56 -0800 (PST)
   From: <"[john]"@
      temple.example> (John Smith)
   Organization: http://www.temple.example/john
   Subject: Submission to mailing list
      in connection with foo.
   Message-ID: <19990213145946.20115@main.temple.example>
   Date: Sat, 13 Feb 1999 22:59:46 +0000
   Mime-Version: 1.0
   Content-Type: text/plain; charset=us-ascii
   Content-MD5: ayoAIdYN8PZqpOgij7VG2Q==
   Signed: $mail-standard,content-md5;
      protocol=PGP-Head-1; key="0xA336D40C" (DSS-example);
      sig="
      iQA/AwUAO40E1yQRKsmjNtQMEQLvzQCgtNnWdN2lwYtFoajEen96111IMboAn2hV
      z9edcA/oc2F6ui8nIj/X5/UW
      =buij"
   Verified: majordomo-request@com.example; signature=good;
      hashcheck="good content-md5"
   Signed-1: message-id,date,resent-from,
      verified,signed; protocol=PGP-HEAD-1; key="0xA336D40C";
      sig="
      iQA/AwUAO40GgCQRKsmjNtQMEQLsNACdFPk9gPtPq9qpWMLXlurvhBLqMbAAoLg0
      uOVRa6sHqBo2bVf+P/7qy0bF
      =FyFy"

   Text of John's message.

   --
   John's signature.
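
   As an informal illustration (not part of the protocol), the
   following Python sketch applies steps 1 to 3 of the PGP-Head-1
   canonicalization of section 3.2.1 to the simplest headers of the
   message above. It is a simplified companion to the Perl model
   implementation in Appendix A, not a replacement for it. The names
   canon_simple and canon_date are hypothetical. The sketch assumes
   headers that contain no quoted-zones, sharp-zones, square-zones,
   comment-zones, quoted-pairs or encoded-words, and it rejects
   anything else rather than handling it. It also cannot represent a
   leap second (a seconds value of 60), because Python's datetime type
   does not support one.

```python
import re
from datetime import timezone
from email.utils import parsedate_to_datetime

# Header classes as defined in section 3.2.1 (lower-cased names).
UNSTRUCTURED = {"subject", "comments", "organization", "summary"}
DATE_HEADERS = {"date", "resent-date", "expires"}
MONTHS = ("jan", "feb", "mar", "apr", "may", "jun",
          "jul", "aug", "sep", "oct", "nov", "dec")


def canon_date(value):
    """Step 2, with the FWS removal of step 3: date-time in UTC."""
    dt = parsedate_to_datetime(" ".join(value.split()))
    if dt.tzinfo is None:
        # A "-0000" zone is parsed as a naive datetime; treat as UTC.
        dt = dt.replace(tzinfo=timezone.utc)
    dt = dt.astimezone(timezone.utc)
    return "%02d%s%04d%02d:%02d:%02d+0000" % (
        dt.day, MONTHS[dt.month - 1], dt.year,
        dt.hour, dt.minute, dt.second)


def canon_simple(name, value):
    """Canonicalize one header that has no zones or encoded-words."""
    tag = name.lower()                                  # step 1
    if tag in DATE_HEADERS:
        body = canon_date(value)
    else:
        if "=?" in value:
            raise ValueError("encoded-words are outside this sketch")
        if tag in UNSTRUCTURED or tag.startswith("x-"):
            body = re.sub(r"\s+", " ", value.strip())   # FWS -> one SP
        else:
            if re.search(r'["<>\[\]()\\]', value):
                raise ValueError("zones are outside this sketch")
            body = re.sub(r"\s+", "", value)            # FWS removed
    return tag + ": " + body + "\r\n"   # single SP, single CRLF


if __name__ == "__main__":
    print(repr(canon_simple("Date", "Sat, 13 Feb 1999 22:59:46 +0000")))
    print(repr(canon_simple("Content-Type",
                            "text/plain; charset=us-ascii")))
```

   The From header of the message above contains a sharp-zone and a
   comment-zone, so this sketch refuses it; handling such headers
   needs the full state machine of Appendix A.1.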

   Passing the original form of the above message through the
   PGP-Head-1 canonicalization algorithm produces the following, in the
   case of the "Signed:" header (observe lines folded for convenience
   of this document - the true line endings being indicated by "CRLF"):

   signed: $mail-standard,content-md5;protocol=PGP-Head-1;key=0xA336
   D40C(DSS-example)CRLF
   date: 13feb199922:59:46+0000CRLF
   from: <"[john]"@temple.example>(John Smith)CRLF
   subject: Submission to mailing list in connection with foo.CRLF
   content-type: text/plain;charset=us-asciiCRLF
   content-md5: ayoAIdYN8PZqpOgij7VG2Q==CRLF

   And here is the result of canonicalizing to produce the "Signed-1:"
   header:

   signed-1: message-id,date,resent-from,verified,signed;protocol=PG
   P-HEAD-1;key=0xA336D40CCRLF
   message-id: <19990213145946.20115@main.temple.example>CRLF
   date: 13feb199922:59:46+0000CRLF
   resent-from: ExampleMailServerCRLF
   verified: majordomo-request@com.example;signature=good;hashcheck=
   goodcontent-md5CRLF
   signed: $mail-standard,content-md5;protocol=PGP-Head-1;key=0xA336
   D40C(DSS-example);sig=iQA/AwUAO40E1yQRKsmjNtQMEQLvzQCgtNnWdN2lwYt
   FoajEen96111IMboAn2hVz9edcA/oc2F6ui8nIj/X5/UW=buijCRLF

      NOTE: the second signature signed only that which it had added
      itself, plus sufficient of the original headers to identify the
      original message. It did not need to scan the body to recompute
      the MD5 hash, but effectively included it by signing the
      original "Signed:" header.

6. Security

   TBD
   [What is there to say here?]

7. References

   [PGPMOOSE] Greg Rose, [I need a URL for this], October 1995.

   [PGPVERIFY] David Lawrence,
        .

   [RFC 1036] M. Horton and R. Adams, "Standard for Interchange of
        USENET Messages", RFC 1036, December 1987.

   [RFC 1327] S. Hardcastle-Kille, "Mapping between X.400(1988) / ISO
        10021 and RFC 822", RFC 1327, May 1992.

   [RFC 1864] J. Myers and M. Rose, "The Content-MD5 Header Field",
        RFC 1864, October 1995.

   [RFC 2015] M. Elkins, "MIME Security with Pretty Good Privacy (PGP)",
        RFC 2015, October 1996.

   [RFC 2015bis] M. Elkins, D. Del Torto, R. Levien, and T. Roessler,
        "MIME Security with OpenPGP", draft-ietf-openpgp-mime-06.txt,
        April 2001.

   [RFC 2045] N. Freed and N. Borenstein, "Multipurpose Internet Mail
        Extensions (MIME) Part One: Format of Internet Message Bodies",
        RFC 2045, November 1996.

   [RFC 2047] K. Moore, "MIME (Multipurpose Internet Mail Extensions)
        Part Three: Message Header Extensions for Non-ASCII Text", RFC
        2047, November 1996.

   [RFC 2119] S. Bradner, "Key words for use in RFCs to Indicate
        Requirement Levels", RFC 2119, March 1997.

   [RFC 2234] D. Crocker and P. Overell, "Augmented BNF for Syntax
        Specifications: ABNF", RFC 2234, November 1997.

   [RFC 2440] J. Callas, L. Donnerhacke, H. Finney, and R. Thayer,
        "OpenPGP Message Format", RFC 2440, November 1998.

   [RFC 2821] John C. Klensin and Dawn P. Mann, "Simple Mail Transfer
        Protocol", RFC 2821, April 2001.

   [RFC 2822] P. Resnick, "Internet Message Format", RFC 2822, April
        2001.

   [USEFOR] Charles H. Lindsey, "News Article Format",
        draft-ietf-usefor-article-format-03.txt.

8. Acknowledgements

   The author acknowledges the work of David Lawrence, as original
   author of "pgpverify", for many of the ideas contained herein, and
   also many contributions from members of the usenet-format mailing
   list.

9. Contact Address

   Charles H. Lindsey
   5 Clerewood Avenue
   Heald Green
   Cheadle
   Cheshire SK8 3JU
   United Kingdom
   Phone: +44 161 436 6131
   Email: chl@clw.cs.man.ac.uk

   Comments on this draft should preferably be sent to the mailing list
   of the Usenet Format Working Group at

      usenet-format@landfield.com.

   This draft expires six months after the date of publication (see Page
   1) (i.e. in March 2002).

10. Intellectual Property Rights

   [The usual texts from RFC 2026 to be inserted here.]

Appendix A - Model implementation

   The following is written in Perl, with full use made of facilities
   provided by the Perl CPAN library.

Appendix A.1 - The PGP-Head-1 canonicalization

   package Canon;

   use MIME::Words qw(decode_mimewords);
   use Date::Parse;
   use Date::Format;
   use Time::Local;
   use Exporter ();
   @ISA = qw(Exporter);
   @EXPORT = qw(set_protocol $canonicalize %macros);

   my $protocol;
   my %unstructureds = ();
   my %dates = ();

   my @ph1_news_standard = qw(date newsgroups distribution message-id
                              from reply-to followup-to references
                              subject keywords control content-type
                              content-id);
   my @ph1_mail_standard = qw(date from reply-to to cc in-reply-to
                              references subject keywords content-type
                              content-id);

   sub set_protocol {
     $_ = shift;
     SWITCH: {

       # PGP-Head-1 protocol
       /^pgp-head-1$/o && do {
         %macros = (
           'news-standard' => \@ph1_news_standard,
           'mail-standard' => \@ph1_mail_standard,
         );
         %unstructureds = ('subject', 1, 'comments', 1,
                           'organization', 1, 'summary', 1);
         %dates = ('date', 1, 'resent-date', 1, 'expires', 1);
         $canonicalize = \&ph1_canon;
         $protocol = $_;
         last SWITCH;
       };

       # other protocols go in here

       die "Unknown protocol $_\n";
     }
   }

   sub ph1_canon {
     my $tag = lc shift;
     my $line = shift;
     my $signing = shift; # for more stringent checks when signing
     my ($ss,$mm,$hh,$day,$month,$year,$zone,$time,$dummy,@dateval);

     $is_structured = (not $unstructureds{$tag}) && $tag !~ m/^x-/o;
     $is_date = $dates{$tag};
     @outlist = ($tag, ': ');
     $outptr = \@outlist; # will point to @encodelist during encoding
     $state = 0;          # for the state machine
     $encoding = 0;       # part of the state machine
     $pending = 0;        # to remember the FWS between encoded-words

     do {
       # lexical split of $line into plain ($x) + next delimiter ($y)
       $line =~ m/(.*?)          # anything except the following:
                  ( \\\S         # quoted-pair
                  | [][)><("]    # various bracket delimiters
                  | =\?(?!=) | \?=\s+=\? | \?= # for encoded-words
                  | \s*$         # trailing whitespace
                  ) /sogx;
       $x = $1; $y = $2;
       # convert $x into canonical form
       if ($is_date && $state == 0) {
         $x =~ s/(\S*)\s+/$1 /sog; # reduce FWS to SP
         if ($x !~ m/^\s*$/) {     # zone not empty
           if ($signing && $x !~ m/^\s?
               ((mon|tue|wed|thu|fri|sat|sun)\s?,\s?)?
               [0-9]{1,2}\s
               (jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)\s
               [0-9]{4}\s
               [0-9]{2}:[0-9]{2}:[0-9]{2}\s
               [-+][0-9]{4}\s?
               /oix) {die "Bad Date '", $x, "'\n"}
           if (not (
             (($ss,$mm,$hh,$day,$month,$year,$zone) = strptime($x))
             && ($time = timegm(0,$mm,$hh,$day,$month,$year)) >= 0
               # $ss not used, in case it is a leap second
             )) {die "Bad Date '", $x, "'\n"}
           ($dummy,$mm,$hh,$day,$month,$year) = gmtime($time - $zone);
           @dateval = ($ss,$mm,$hh,$day,$month,$year);
               # $ss restored
           $x = lc strftime("%d%h%Y%T+0000", @dateval);
         }
       } elsif ($is_structured && $state <= 0) {
         $x =~ s/(\S*)\s+/$1/sog;  # eliminate FWS
       } else { # unstructured, or in a comment-zone
         $x =~ s/(\S*)\s+/$1 /sog; # reduce FWS to SP
       }
       push @$outptr, $x;

       # state machine to process $y
       if ($is_structured) {
         if ($state == 0) { # neutral-zone
           if ($y eq '"')
             {$state = -1; _end_encoding()}
           elsif ($y eq '<')
             {$state = -2; push @$outptr, $y; _end_encoding()}
           elsif ($y eq '[')
             {$state = -3; push @$outptr, $y; _end_encoding()}
           elsif ($y eq '(')
             {$state = 1; push @$outptr, $y; _end_encoding()}
           elsif ($y eq '=?')
             {_start_encoding(); push @$outptr, $y}
           elsif ($y =~ m/\?=/o)
             {push @$outptr, $y; _end_encoding()}
           elsif ($y =~ m/^[])>]$/o) {
             if ($signing) {die "Unbalanced '", $y, "'\n"}
             else {push @$outptr, $y}
           }
           else {$y =~ s/^\s*$/\r\n/o; push @$outptr, $y}
               # eliminate trailing WS; insert CRLF

         } else { # other zones
           if ($y =~ s/^\s*$/\r\n/o && $signing)
             {die "Unbalanced header ", $line}

           if ($state == -1) { # in quoted-zone
             if ($y eq '"') {$state = 0}
             else {push @$outptr, $y}
           }
           elsif ($state == -2) { # in sharp-zone
             if ($y eq '>') {$state = 0}
             push @$outptr, $y;
           }
           elsif ($state == -3) { # in square-zone
             if ($y eq ']') {$state = 0}
             push @$outptr, $y;
           }
           elsif ($state > 0) { # in comment-zone
             if ($y eq '(')
               {$state ++; push @$outptr, $y; _end_encoding()}
             elsif ($y eq ')')
               {$state --; push @$outptr, $y; _end_encoding()}
             elsif ($y eq '=?')
               {_start_encoding(); push @$outptr, $y}
             elsif ($y =~ m/\?=/o)
               {push @$outptr, $y; _end_encoding()}
             else {push @$outptr, $y}
           }
         }
       } else { # unstructured
         $y =~ s/^\s*$/\r\n/o; # eliminate trailing WS; insert CRLF
         if ($y eq '=?')
           {_start_encoding(); push @$outptr, $y}
         elsif ($y =~ m/\?=/o)
           {push @$outptr, $y; _end_encoding()}
         else {push @$outptr, $y}
       }

     } until $y eq "\r\n";
     if ($encoding) {_end_encoding()}
     $line = join('', @outlist);
     return $line;
   }

   sub _start_encoding { # entered at every '=?'
     @encodelist = ();
     $outptr = \@encodelist; # divert output during encoding
     $encoding = 1;
   }

   sub _end_encoding { # entered at every '?=' or unexpected delimiter
     my $token = "[^][()<>@,;:\"\?.=\x00-\x20\x7f-\xff]+";
     my $encoded_text = "[^\?\x00-\x20\x7f-\xff]+";
     if ($encoding) {
       $outptr = \@outlist; # cease output diversion
       if ($y =~ m/^\?=/o) { # '?=' as expected
         $encodelist[$#encodelist] = '?='; # in case it was '?=\s=?'
         $x = join('', @encodelist);
         $genuine = $x =~ m/^=\?$token\?$token\?$encoded_text\?=$/o;
         if ($genuine)
           {$x = decode_mimewords($x)} # dies if it fails
         if ($is_structured && $state <= 0) {
           if ($genuine) {$x =~ s/\s//go} # eliminate FWS
         } else {
           if ($pending && not $genuine) {push @$outptr, ' '}
         }
         push @$outptr, $x;
       } else { # unexpected delimiter during encoding
         if ($pending && (not $is_structured || $state > 0)) {
           push @$outptr, ' ';
         }
         push @$outptr, @encodelist;
       }
       $encoding = 0;
       if ($pending = $y =~ m/^\?=\s+=\?/o) {
         _start_encoding();
         push @$outptr, ('=?');
       }
     }
   }

Appendix A.2 - Parsing of the Signed header

   # This module must be stored in Mail/Field/Signed.pm
   # relative to the other programs in the suite
   package Mail::Field::Signed;

   use strict;
   use vars qw(@ISA);
   use MIME::Field::ParamVal;
   use Canon;
   use Carp;

   @ISA = qw(MIME::Field::ParamVal);

   INIT: {
     my $x = bless([]);

     $x->register('Signed');
     $x->register('Signed_1');
     $x->register('Signed_2');
     $x->register('Signed_3');
     $x->register('Signed_4');
     $x->register('Signed_5');
     $x->register('Signed_6');
     $x->register('Signed_7');
     $x->register('Signed_8');
     $x->register('Signed_9');
   }

   sub parse {
     my ($self, $string, $signing) = @_;
     my $clean_string = _skip_CFWS($string);
     my $macro;
     $self->set($self->parse_params($clean_string));
     $self->{string} = $string;
     $self->{header_refs} = ();
     set_protocol($self->protocol);
     do {
       if ($self->{_} =~
         m/\G(((\d+:)*)(\$)|([-+]?)((\d+:)*))([-\w]+)(?!:)/og) {
         # ((---$2--)($4) (--$5-)(---$6--))(--$8--)
         if (defined $4) {
           if ($macro = $macros{$8})
             {$self->_incorporate_header('', $2, @{$macro})}
           else { die "Unknown macro ", $8, "\n" }
         } else {$self->_incorporate_header($5, $6, ($8))}
       }
       else { die "Bad header-ref-list ", $string, "\n" }
     } while ($self->{_} =~ m/,/og);
     foreach ('protocol', 'key') {
       unless($self->param($_)) {die "$_ missing\n"};
     }
     return $self;
   }

   sub tag {
     my $self = shift;
     return Mail::Field::tag($self);
   }

   sub protocol {
     my $self = shift;
     return lc($self->param('protocol'));
   }

   sub key {
     my $self = shift;
     return lc($self->param('key'));
   }

   sub sig {
     my $self = shift;
     return lc($self->param('sig'));
   }

   sub stringify {
     my $self = shift;
     return $self->{string};
   }

   sub header_refs {
     my $self = shift;
     @{$self->{header_refs}};
   }

   sub _incorporate_header {
     my ($self, $plusminus, $level, @additions) = @_;
     my $refs = \@{$self->{header_refs}};
     foreach (@additions) {
       if ($plusminus eq '-') {
         # item to be removed from list
         for (my $i = 0; $i < @$refs; $i++)
           {if (@$refs[$i] eq $level.$_) {splice(@$refs, $i, 1)} }
       } else {
         # item to be added to list
         I: {
           for (my $i = 0; $i < @$refs; $i++)
             {if (@$refs[$i] eq $level.$_) {last I} }
           push (@$refs, $level.$_); # only if not already present
         }
       }
     }
   }

   sub _skip_CFWS {
     my $line = shift;
     my $count = 0;
     my @buf = ();
     while ($line =~ m/\G([^\s\("]*)\s*|\G(\()|\G(")/sog) {
       # (---$1----) ($2) (3)
       if ($1) {push @buf, ($1)}
       elsif ($2) { # comment
         $count += 1;
         do {
           $line =~ m/\G[^()]*([()])/sog
             or die "Unclosed comment\n";
           $count += ($1 eq '(') ? +1 : -1;
         } until ($count == 0);
       } elsif ($3) { # quoted-string
         push @buf, ('"');
         do {
           $line =~ m/\G([^\"\s]+)|\G(\s+)|\G(")/sog;
           # (---$1---) (-$2) (3)
           if ($1) {push @buf, ($1)}
           elsif ($2) {push @buf, (' ')}
           elsif ($3) {push @buf, ('"'); last}
         }
       }
     }
     return join('', @buf);
   }

   sub extract_headers {
     my ($self, $article, $FH, $proc, $signing) = @_;
     my $ref;
     my $signed = $self->stringify;
     $signed =~ s/\s*;[^;]*\bsig\b[^;]*$//io; # remove "; sig=..."
     print $FH (&$proc($self->tag, $signed, $signing));
     foreach $ref ($self->header_refs) {
       _extract_header($article, $ref, $FH, $proc, $signing);
     }

   }

   sub _extract_header {
     my ($article, $ref, $FH, $proc, $signing) = @_;
     $ref =~ m/((\d+):((\d+:)*))?([-\w]+)/o;
     # ((-$2) (---$3--)) (--$5--)
     if ($1) # $ref of the form "1:header"; call ourself recursively
       {_extract_header($article->parts($2-1), $3.$5,
                        $FH, $proc, $signing)}
     else { # $ref is a header at the current level
       if ($article->head->count($5) > 1)
         {die "Cannot sign duplicated header ", $5, "\n"}
       elsif ($article->head->count($5) == 1) {
         print $FH (&$proc($5, $article->head->get($5), $signing));
       }
     }
   }

   1;

Appendix A.3 - The Signing program

   use English;
   use Mail::Header;
   use Mail::Field;
   use Mail::Field::Signed;
   use MIME::Parser;
   use Canon;

   $signing = 1; # This is a program to sign headers

   # Read partial Signed header from file
   open SIGNED, "<".$ARGV[0];
   $signed = new Mail::Header \*SIGNED;
   @names = $signed->tags;
   $tag = $names[0];
   if ($tag !~ m/^signed(-[1-9])?$/oi || $#names != 0)
     {die "Invalid SIGNED file ", $ARGV[0], "\n"}
   $line = Mail::Field->extract($tag, $signed);

   if ($line->sig) {die "'sig' already present\n"}

   $parser = new MIME::Parser output_to_core=>'ALL';
   $parser->parse_nested_messages('NEST');
        # special treatment for message/rfc822
   $article = $parser->read(\*STDIN) or die "Malformed article\n";

   if ($article->head->count($tag))
     {die "Message already signed\n"}

   $tmp = "/tmp/sign-$$";
   open(FH, "> $tmp") or die "Cannot open $tmp: $!\n";
   $line->extract_headers($article, \*FH, $canonicalize, $signing);
   close(FH);
   # The remainder of this code is dependent upon the particular
   # implementation of OpenPGP.

   $key = $line->param('key');
   $pgp =
   "pgps -fab +verbose=0 +textmode=off -u $key <$tmp 2>/dev/null |";
   open(FH, $pgp) or die "Cannot open pipe from pgp: $!\n";

   undef $INPUT_RECORD_SEPARATOR;
   $_ = <FH>; # The OpenPGP signature record
   unlink $tmp;
   s/^.*[^\w+\/=\n].*\n|^\n//mog; # remove non-base64 lines
   s/^/   /mog; # indent by 3 spaces
   s/\A/;\n sig="\n/mo; s/\Z/"/mo; # enclose in '; sig="..."'

   $article->head->add($tag, $line->stringify . $_);
   $article->print;

Appendix A.4 - The Verification program

   use English;
   use Mail::Header;
   use Mail::Field;
   use Mail::Field::Signed;
   use MIME::Parser;
   use Canon;

   $signing = 0; # This is a program to verify signed headers
   $parser = new MIME::Parser output_to_core=>'ALL';
   $parser->parse_nested_messages('NEST');
        # special treatment for message/rfc822
   $article = $parser->read(\*STDIN) or die "Malformed article\n";

   $tag = $ARGV[0];
   unless ($tag =~ m/^Signed(-[1-9])?/io)
     {die "Bad parameter ", $tag, "\n"}

   $line = Mail::Field->extract($tag, $article);
   unless ($line) {die $tag, " header not found\n"}
   unless ($line->param('sig')) {die "Malformed Signed header\n"}

   $tmp = "/tmp/sign-$$";
   open(FH, "> $tmp") or die "Cannot open $tmp: $!\n";
   $line->extract_headers($article, \*FH, $canonicalize, $signing);
   close(FH);

   # The remainder of this code is dependent upon the particular
   # implementation of OpenPGP.

   use IPC::Open2;
   $pgp = "pgpv -f --batchmode -o $tmp 2>&1";
   open2(\*PIPEOUT, \*PIPEIN, $pgp);

   $armour = $line->param('sig');
   $armour =~ s/\s//sog;
   $armour =~ s/([\w+\/=]{64})/$1\n/sog;
   $armour =~ s/(=[\w+\/]{4}\Z)/\n$1/so;
   print PIPEIN "-----BEGIN PGP SIGNATURE-----\n",
                "Charset: noconv\n\n",
                $armour, "\n",
                "-----END PGP SIGNATURE-----\n";
   close(PIPEIN);
   undef $INPUT_RECORD_SEPARATOR;
   $result = <PIPEOUT>;
   unlink $tmp;

   $result =~ s/^This signature applies to another message\n//mo;
   $result =~ m/Key ID +([0-9a-fA-F]+)/iom;
   unless ("0x" . $1 eq $line->param('key')) {
     print "Signature was for key ", $line->param('key'),
           ", not for 0x", $1, "\n";
     $badsig = 1;
   }
   $badsig |= ($result !~ m/Good signature/iom);
   print $result;
   exit $badsig;

Appendix B - Test cases

   The following, believe it or not, is a valid email message. Note
   that there were various TABs and much trailing whitespace in it (but
   they are unlikely to have survived through to the published form of
   this document).

   Subject: Unstructured headers can contain unmatched (s and unescaped
      "s; (comments like this) and "quoted strings" are not
      treated specially.
   SUMMARY: Multiple spaces, tabs and foldings
      in unstructured headers are reduced to a single SP, and trailing
      whitespace (of which there is much in these examples)) is ignored.
   X-Header: All X headers are "treated "as unstructured")
   from: "Scooby Doo" (all FWS in
     structured headers is removed, except in comments)
   tO: "John (the Boss) Smith" ,
     "Bill \"fingers\"
     Sykes" <"#*\"~"@twist.example> (Observe unescaped \( and escaped "
     within quoted strings, and (properly matched) parentheses within
     comments)
   rEPLY-tO:"#*\"~"@twist.example
     (Observe "s elided, since not in <...>)
   Message-ID: <"*\"~and-other-grunge)(]["@[127.0.0.1"Ugh!]>
     (Yes that is a legal msg-id, including the " in the domain-literal)
   Sender: foo@[127.0.0.1"Ugh!] (another " in a domain-literal)
   Cc: foo@[127.0.0.1(this is not], bar@[a comment)127.0.0.1],
     "=?utf-8?Q?not_an_encoded_word?="
     <=?utf-8?Q?not_an_encoded_word?=@bar.example>,
     =?us-ascii?Q?Joe_D._Bloggs_=5Bwho=20else=5d?= ,
     =?us-ascii?Q?C&A?=@bar.example (treated as an encoded-word even
     though, syntactically, it isn't)
     (in comment but =?is0-8859-1?Q?not(an_encoded-word?=))
     (=?us-ascii?Q?encoded-word_split_into-?=
     =?us-ascii?b?cGFydHM=?=)
   Comments: An unstructured encoded word can have
     =?us-ascii?Q?any_characters_in_it_<>()[]"?= =?bogus_e.w?=
   Date: (pre comment) sAt, 13 fEb
     1999 14:59:56 -0800 (PST)
   Keywords: (various illegal constructs which nevertheless get through)
     \(Not a comment\), \" (naked quoted-pair), \ (not a quoted-SP)

   Comments: Various mismatches, which should be rejected.
   Foo: ) (naked \))
   Bar: ((mismatched parens)
   Baz: <"mismatch"
   Fred: ["mismatch"
   Date: Sat, 13 Feb 1999 23:00:14 GMT
   Date: 29 Feb 2001 23:00:14 +0000

   The following is the result of applying the PGP-Head-1
   canonicalization to it (lines folded for convenience, as before, and
   blank lines inserted between headers for readability).

   subject: Unstructured headers can contain unmatched (s and unesca
   ped "s; (comments like this) and "quoted strings" are not treated
    specially.CRLF

   summary: Multiple spaces, tabs and foldings in unstructured heade
   rs are reduced to a single SP, and trailing whitespace (of which
    there is much in these examples)) is ignored.CRLF

   x-header: All X headers are "treated "as unstructured")CRLF

   from: ScoobyDoo(all FWS in structured headers is
    removed, except in comments)CRLF

   to: John(theBoss)Smith,Bill\"fingers\"Sykes<"#*\
   "~"@twist.example>(Observe unescaped \( and escaped " within quot
   ed strings, and (properly matched) parentheses within comments)CRLF

   reply-to: #*\"~@twist.example(Observe "s elided, since not in <..
   .>)CRLF

   message-id: <"\*"~and-other-grunge)(]["@[127.0.0.1"Ugh!]>(Yes tha
   t is a legal msg-id, including the " in the domain-literal)CRLF

   sender: foo@[127.0.0.1"Ugh!](another " in a domain-literal)CRLF

   cc: foo@[127.0.0.1(thisisnot],bar@[acomment)127.0.0.1],=?utf-8?Q?
   not_an_encoded_word?=<=?utf-8?Q?not_an_encoded_word?=@bar.example
   >,JoeD.Bloggs[whoelse],C&A@bar.example(treated a
   s an encoded-word even though, syntactically, it isn't)(in commen
   t but =?is0-8859-1?Q?not(an_encoded-word?=))(encoded-word split i
   nto-parts)CRLF

   comments: An unstructured encoded word can have any characters in
    it <>()[]" =?bogus_e.w?=CRLF

   date: (pre comment)13feb199922:59:56+0000(PST)CRLF

   keywords: (various illegal constructs which nevertheless get thro
   ugh)\(Notacomment\),\"(naked quoted-pair),\(not a quoted-SP)CRLF

Appendix C - PGP Public Key

   For the benefit of those who would like to experiment with the
   examples given in section 5, the following is the Public Key for
   0xA336D40C (DSS-example).

   -----BEGIN PGP PUBLIC KEY BLOCK-----
   Version: PGPfreeware 5.0i for non-commercial use

   mQDiBDuB8NIRAgDeGDfg+ZlgHDZkkXDpeeaBJxIq9/pkuFL/6puw9j+k/JsKzLr9
   ktqlFgkdnDyYbWm26lWAmjliZEeIyggBlxSlAKD/lbF/4JAJox/7xqW8fuSc9sPO
   AwIA0rQJ1TEhIztyUYB5j4D9V7pHKyhbdifFEf1MwrYsnluiejd5/K623J4wQr/m
   +zMzr7lnX6ZLPkITKgfgpjoAWQIAzg9BAYwHGVgjRg82MxxlP5737ihfa0yWMeVn
   KTU1mToKMaokGMrMnvuOjvu6GmgHdbfgaFXThrnuerN8rRqVP7QLRFNTLWV4YW1w
   bGWJAEsEEBECAAsFAjuB8NIECwMBAgAKCRAkESrJozbUDIgdAKCc4eqIbAFlOB6O
   rWv8CzMPBNo2ZACeKOD6mS+GrEgQkD+cW1MytHVjFTE=
   =Zl1B
   -----END PGP PUBLIC KEY BLOCK-----

[Would it be useful to include descriptions of pgpverify and pgpmoose as
additional appendices?]
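
   Illustrative note: the unstructured-header cases in Appendix B
   (header name lower-cased, runs of spaces, tabs and foldings reduced
   to a single SP, trailing whitespace dropped, CRLF appended) can be
   tried out with a sketch such as the following.  It is inferred only
   from the Appendix B example and is not the Canon module of Appendix
   A; the name canon_unstructured_sketch is hypothetical, and decoding
   of encoded-words is deliberately left out.

```perl
use strict;
use warnings;

# Hypothetical helper, NOT part of the Canon module in Appendix A.
# Assumption: only the whitespace rules visible in the Appendix B
# example are applied; encoded-words are left untouched.
sub canon_unstructured_sketch {
    my ($name, $body) = @_;
    $body =~ s/\s+/ /g;   # runs of SP, TAB and folding become one SP
    $body =~ s/^ //;      # drop whitespace before the body
    $body =~ s/ $//;      # drop trailing whitespace
    return lc($name) . ': ' . $body . "\r\n";
}

my $out = canon_unstructured_sketch(
    'SUMMARY', "Multiple    spaces,\ttabs and\r\n foldings   ");

# Show the terminating CRLF visibly, as Appendix B does.
(my $shown = $out) =~ s/\r\n\z/CRLF/;
print $shown, "\n";
```

   Structured headers (From, To, Date and so on) need the
   token-by-token treatment shown in Appendix A and are not covered
   by this sketch.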