idnits 2.17.1 draft-ietf-usefor-article-01.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- ** Cannot find the required boilerplate sections (Copyright, IPR, etc.) in this document. Expected boilerplate is as follows today (2024-04-19) according to https://trustee.ietf.org/license-info : IETF Trust Legal Provisions of 28-dec-2009, Section 6.a: This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79. IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 2: Copyright (c) 2024 IETF Trust and the persons identified as the document authors. All rights reserved. IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 3: This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- ** Missing document type: Expected "INTERNET-DRAFT" in the upper left hand corner of the first page ** Missing expiration date. The document expiration date should appear on the first and last page. ** The document seems to lack a 1id_guidelines paragraph about Internet-Drafts being working documents. ** The document seems to lack a 1id_guidelines paragraph about the list of current Internet-Drafts. 
** The document seems to lack a 1id_guidelines paragraph about the list of Shadow Directories. == There are 2 instances of lines with non-ascii characters in the document. == No 'Intended status' indicated for this document; assuming Proposed Standard == The page length should not exceed 58 lines per page, but there was 1 longer page, the longest (page 1) being 3581 lines Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** The document seems to lack an IANA Considerations section. (See Section 2.2 of https://www.ietf.org/id-info/checklist for how to handle the case when there are no actions for IANA.) ** There are 100 instances of weird spacing in the document. Is it really formatted ragged-right, rather than justified? ** There are 68 instances of too long lines in the document, the longest one being 8 characters in excess of 72. == There are 21 instances of lines with non-RFC2606-compliant FQDNs in the document. ** The document seems to lack a both a reference to RFC 2119 and the recommended RFC 2119 boilerplate, even if it appears to use RFC 2119 keywords. RFC 2119 keyword, line 384: '...s and other specifications. It MUST be...' RFC 2119 keyword, line 427: '...ting and reading agents MAY translate...' RFC 2119 keyword, line 430: '... MUST be the English-derived ...' RFC 2119 keyword, line 485: '... [MESSFOR]. Software compliant with this standard MUST NOT...' RFC 2119 keyword, line 487: '...tax, although it MAY accept such synta...' (303 more instances...) Miscellaneous warnings: ---------------------------------------------------------------------------- == Line 43 has weird spacing: '...erested host ...' == Line 45 has weird spacing: '...central admin...' == Line 46 has weird spacing: '...tration of i...' == Line 47 has weird spacing: '... medium of c...' == Line 59 has weird spacing: '... Since then ...' == (95 more instances...) 
== Using lowercase 'not' together with uppercase 'MUST', 'SHALL', 'SHOULD', or 'RECOMMENDED' is not an accepted usage according to RFC 2119. Please use uppercase 'NOT' together with RFC 2119 keywords (if that is what you mean). Found 'SHOULD not' in this paragraph: Since the white space beginning a continuation line remains a part of the logical line, headers can be "broken" into multiple lines only at FWS or CFWS. Posting agents SHOULD not break headers unnecessarily (but see section 4.6). == Using lowercase 'not' together with uppercase 'MUST', 'SHALL', 'SHOULD', or 'RECOMMENDED' is not an accepted usage according to RFC 2119. Please use uppercase 'NOT' together with RFC 2119 keywords (if that is what you mean). Found 'MUST not' in this paragraph: Whitespace MAY be present in the Path to make it easier to represent. However, there is no requirement to do so. Whitespace MUST not be used as a delimiter. == Using lowercase 'not' together with uppercase 'MUST', 'SHALL', 'SHOULD', or 'RECOMMENDED' is not an accepted usage according to RFC 2119. Please use uppercase 'NOT' together with RFC 2119 keywords (if that is what you mean). Found 'MUST not' in this paragraph: A proto-article is one that is created by a posting agent and has not been injected into the news system by an injecting agent. Only one copy of a proto-article MUST exist. A proto-article has the same format as a normal article except that some of the compulsory headers MAY be missing. A proto-injected article MAY have the following headers missing: "Message-Id: " , "Date: " and "Path: " . These header MUST not contain invalid values, they MUST either be correct or not present at all. -- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. 
If not, you may need to add the pre-RFC5378 disclaimer. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- Couldn't find a document date in the document -- date freshness check skipped. -- Found something which looks like a code comment -- if you have code sections in the document, please surround them with '' and '' lines. Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) -- Missing reference section? 'MAIL' on line 1904 looks like a reference -- Missing reference section? 'TEST-TLDS' on line 3327 looks like a reference -- Missing reference section? 'ABNF' on line 396 looks like a reference -- Missing reference section? 'MESSFOR' on line 3355 looks like a reference -- Missing reference section? 'RFC2045' on line 460 looks like a reference -- Missing reference section? 'RFC2234' on line 491 looks like a reference -- Missing reference section? 'FWS' on line 1433 looks like a reference -- Missing reference section? 'RFC-2045' on line 3390 looks like a reference -- Missing reference section? 'RFC-2130' on line 3399 looks like a reference -- Missing reference section? 'ISO-10646' on line 3349 looks like a reference -- Missing reference section? 'UNICODE' on line 3417 looks like a reference -- Missing reference section? 'RFC 2079' on line 1000 looks like a reference -- Missing reference section? 'ISO-8859' on line 3336 looks like a reference -- Missing reference section? 'RFC-2047' on line 2769 looks like a reference -- Missing reference section? 'NEWS' on line 1904 looks like a reference -- Missing reference section? 'MIME2' on line 1840 looks like a reference -- Missing reference section? '1036BIS' on line 1866 looks like a reference -- Missing reference section? 
'MIME3' on line 2095 looks like a reference -- Missing reference section? 'NNTP' on line 3182 looks like a reference -- Missing reference section? 'CFWS' on line 2508 looks like a reference -- Missing reference section? 'RFC 2045' on line 2606 looks like a reference -- Missing reference section? 'RFC 822' on line 2610 looks like a reference -- Missing reference section? 'RFC 2046' on line 2747 looks like a reference -- Missing reference section? 'RFC 2048' on line 2617 looks like a reference -- Missing reference section? 'RFC 1896' on line 2673 looks like a reference -- Missing reference section? 'RFC 1153' on line 2707 looks like a reference -- Missing reference section? 'RFC 1847' on line 2709 looks like a reference -- Missing reference section? 'RFC 2015' on line 2709 looks like a reference -- Missing reference section? 'Originator-Info' on line 3357 looks like a reference -- Missing reference section? 'RFC-822' on line 3361 looks like a reference -- Missing reference section? 'RFC-850' on line 3365 looks like a reference -- Missing reference section? 'RFC-976' on line 3369 looks like a reference -- Missing reference section? 'RFC-977' on line 3373 looks like a reference -- Missing reference section? 'RFC-1036' on line 3378 looks like a reference -- Missing reference section? 'RFC-1036BIS' on line 3382 looks like a reference -- Missing reference section? 'RFC-1884' on line 3386 looks like a reference -- Missing reference section? 'RFC-2119' on line 3395 looks like a reference -- Missing reference section? 'RFC-2142' on line 3405 looks like a reference -- Missing reference section? 'RFC-2234' on line 3409 looks like a reference -- Missing reference section? 'RFC-2279' on line 3413 looks like a reference -- Missing reference section? 'N6AAW' on line 3458 looks like a reference Summary: 10 errors (**), 0 flaws (~~), 13 warnings (==), 44 comments (--). Run idnits with the --verbose option for more detailed information about the items above. 
-------------------------------------------------------------------------------- 2 INTERNET DRAFT to be NEWS Expires 19990101 4 News Article Format 5 draft-ietf-usefor-article-01 6 USEFOR Working Group 8 Status of this Memo 10 This document is an Internet-Draft. Internet-Drafts are working 11 documents of the Internet Engineering Task Force (IETF), its 12 areas, and its working groups. Note that other groups may also 13 distribute working documents as Internet-Drafts. 15 Internet-Drafts are draft documents valid for a maximum of six 16 months and may be updated, replaced, or obsoleted by other 17 documents at any time. It is inappropriate to use Internet- 18 Drafts as reference material or to cite them other than as 19 "work in progress." 21 To view the entire list of current Internet-Drafts, please check 22 the "1id-abstracts.txt" listing contained in the Internet-Drafts 23 Shadow Directories on ftp.is.co.za (Africa), ftp.nordu.net 24 (Northern Europe), ftp.nis.garr.it (Southern Europe), munnari.oz.au 25 (Pacific Rim), ftp.ietf.org (US East Coast), or ftp.isi.edu 26 (US West Coast). 28 It is hoped that this document will obsolete RFC 1036 and will 29 become an Internet standard. 31 This document is a successor to Henry Spencer's "Son of 1036" 32 Draft, and has been referred to as "Grandson of 1036". 34 Distribution of this memo is unlimited. 36 Abstract 38 This Draft defines the format of network news articles, and 39 defines roles and responsibilities for humans and software. 41 Network news articles resemble mail messages but are broadcast 42 to potentially large audiences, using a flooding algorithm 43 that propagates one copy to each interested host (or group 44 thereof), typically stores only one copy per host, and does 45 not require any central administration or systematic 46 registration of interested users. Network news originated 47 as the medium of communication for Usenet, circa 1980. 
49 The term "Usenet" refers to the protocols established in RFC 50 1036 and successors; the software implementing those protocols; 51 the network of hosts exchanging traffic using that software; 52 and also the traffic itself. Cooperating subnets are possible; 53 these are groups of hosts which agree to hold each other and 54 themselves to an internally adopted set of standards concerning 55 protocol details or implementations. When a cooperating subnet 56 does not exchange traffic with general Usenet hosts, then it 57 is no longer a part of Usenet, but a separate entity. 59 Since then Usenet has grown explosively, and most Internet 60 sites participate in it. In addition, the news technology 61 is now in widespread use for other purposes, on the Internet 62 and elsewhere. 64 This document is intended to provide a definitive guide to the 65 article format and interpretations thereof. Backward 66 compatibility is a major goal, but where this document and 67 earlier documents or practices collide, this document should be 68 used. 70 Table of Contents 72 1. Introduction 74 2. Definitions, Notations and Conventions 75 2.1 Definitions. 76 2.2. Textual Notations 77 2.3. Syntax Notation 78 2.4. Language 80 3. Relation To [MAIL] (RFC 822 etc.) 82 4. Basic Format 83 4.1 Overall Syntax 84 4.2. Syntax of News Articles 85 4.3. Headers 86 4.3.1. Names and Contents 87 4.3.2 Header Classes 88 4.3.3 Experimental Headers 89 4.3.4 Persistent Headers 90 4.3.5. Variant Headers 91 4.3.6. Header Classes 92 4.3.6.1 Experimental Headers 93 4.3.6.2 Persistent Headers 94 4.3.6.3 Examples 95 4.3.6.4 Comment Headers 96 4.3.6.5. Variant Headers 97 4.3.7. White Space and Continuations 98 4.3.8 Comments 99 4.3.9. Undesirable Headers 100 4.4. Body 101 4.4.1. Body Format Issues 102 4.4.2. Body Conventions 103 4.5. Characters And Character Sets 104 4.5.1. Character Sets within Article Headers 105 4.5.2 Character Sets within Article Bodies 106 4.6. Size Limits 107 4.7. Example 109 5. 
Mandatory Headers 110 5.1. Date 111 5.2. From 112 5.2.1 Examples: 113 5.3. Message-ID 114 5.4. Subject 115 5.4.1 Examples: 116 5.5. Newsgroups 117 5.5.1 Forbidden newsgroup names 118 5.6 Path 119 5.6.1 Format 120 5.6.2 Adding an entry to the Path header. 121 5.6.3 The tail Entry 122 5.6.4 The Injecting Agent Entry 123 5.6.5 Delimiter Summary 124 5.6.6 Other formatting Issues 125 5.6.6.1 Use of "!" 126 5.6.7 Suggested Verification Methods 127 5.6.8 Issues 129 6. Optional Headers 130 6.1 Followup-To 131 6.2 Expires 132 6.3. Reply-To 133 6.3.1 Examples: 134 6.4. References 135 6.4.1 Examples: 136 6.5. Control 137 6.6. Control Messages 138 6.6.1 The "newgroup" Control Message 139 6.6.1.1 multipart/newsgroupinfo 140 6.6.1.2 application/newsgroupinfo 141 6.6.1.3 Initial Named Articles 142 6.6.2 The "rmgroup" Control Message 143 6.6.3 The "mvgroup" Control Message 144 6.6.3.1 Single group 145 6.6.3.2 Multiple Groups 146 6.6.4 The "checkgroups" Control Message 147 6.6.4.1 Example: 148 6.6.5 application/newscheckgroups 149 6.6.5.1 Examples 150 6.6.6 Cancel 151 6.6.7 ihave, sendme 152 6.6.8 Obsolete control messages. 153 6.7. Distribution 154 6.8. Keywords 155 6.9. Summary 156 6.10. Approved 157 6.11 Lines 158 6.12. Xref 159 6.13. Organization 160 6.14. User-Agent 161 6.14.1 Examples: 162 6.15 MIME headers 163 6.15.1 Syntax 164 6.15.2 Content-Transfer-Encoding 165 6.15.3 Content-Type 166 6.15.3.1 Text 167 6.15.3.2 Application 168 6.15.3.4 Image, Audio and Video 169 6.15.3.5 Multipart 170 6.15.3.6 Message 171 6.15.3.7 Character Sets 172 6.15.4 MIME within headers 173 6.15. Supersedes / Replaces 174 6.15.1 Message-ID version numbers chain procedure. 175 6.15.2 Implementation and Use Note 176 6.15.3 Transition 177 6.15.4 Replaced-by 178 6.15.5.1 Examples 179 6.15.5.2 Example 180 6.15.6 Dates 181 6.15.7 Issues 182 6.16 Archive 183 6.17. Obsolete Headers 185 7. Duties of Various Agents 186 7.1 Duties of an Injecting Agent. 187 7.1.1 Proto-articles. 
188 7.1.2 Procedure followed by Injecting Agents. 189 7.1.3 Headers added by Injecting Agents. 190 7.2 Duties of a Relaying Agent 191 7.2.1 Unwanted and Invalid articles 192 7.3 Duties of a Serving Agent 193 7.3.1 Unwanted articles 194 7.4 Duties of a Posting Agent. 195 7.5 Duties of a Followup Agent 196 7.6 Duties of a Gateway 198 8. Propagation and Processing 200 9. Security And Related Issues 201 9.1 Attacks 203 10. Security Considerations 205 11. References: 207 1. Introduction 209 Network news articles resemble mail messages but are 210 broadcast to potentially-large audiences, using a flooding 211 algorithm that propagates one copy to each interested host 212 (or groups thereof), typically stores only one copy per 213 host, and does not require any central administration or 214 systematic registration of interested users. Network news 215 originated as the medium of communication for Usenet, circa 216 1980. Since then Usenet has grown explosively, and many 217 Internet sites participate in it. In addition, the news 218 technology is now in widespread use for other purposes, on the 219 Internet and elsewhere. 221 The earliest news interchange used the so-called "A News" 222 article format. Shortly thereafter, an article format 223 vaguely resembling Internet mail was devised and used 224 briefly. Both of those formats are completely obsolete; 225 they are documented in appendix A for historical reasons 226 only. With publication of RFC 850 [rrr] in 1983, news 227 articles came to closely resemble Internet mail messages, 228 with some restrictions and some additional headers. RFC 229 1036 in 1987 updated RFC 850 without making major changes. 231 A Draft popularly referred to as "Son of 1036" was written in 232 1992 by Henry Spencer. That document formed the original basis 233 for this document. Much is taken directly from Son of 1036, and 234 it is hoped that we have followed its spirit and intentions. 
236 As in this document's predecessors, the exact means used to 237 transmit articles from one host to another is not specified. 238 NNTP [rrr] is the most common transmission method on the 239 Internet, but a number of others are in use, including the 240 UUCP protocol [rrr] extensively used in the early days of 241 Usenet, FTP, tape archives, and physically delivered magnetic 242 and optical media. 244 Several of the mechanisms described in this document may seem 245 somewhat strange or even bizarre at first reading. As with 246 Internet mail, there is no reasonable possibility of updating 247 the entire installed base of news software promptly, so 248 interoperability with old software is critical and will 249 remain so. Compatibility with existing practice and 250 robustness in an imperfect world necessarily take priority 251 over elegance. Elegance is left to the implementors. 253 2. Definitions, Notations and Conventions 255 2.1 Definitions. 257 An "article" is the unit of news, analogous to a [MAIL] 258 "message". 260 A "poster" is the person or software that composes and submits 261 a possibly compliant article to an injecting agent. The poster 262 is analogous to [MAIL]'s author(s). 264 A "posting agent" is software that assists posters to prepare 265 articles, including adding required headers and determining 266 whether the final article is compliant with this standard. If 267 the article is compliant, it passes the article on to an 268 injecting agent for final checking and injection into the news 269 stream. If the article is not compliant, or is rejected by the 270 injecting agent, then the posting agent informs the poster with 271 an explanation of the error. 273 An "injecting agent" takes the finished article from the 274 posting agent (often via the NNTP "post" command), performs 275 some final checks, and passes it on to a relaying agent for 276 general distribution. 
278 A "relaying agent" is software which receives allegedly 279 compliant articles from injecting agents and/or other 280 relaying agents, and possibly passes copies on to other 281 relaying agents and serving agents. 283 A "serving agent" takes an article from a relaying agent and 284 files it in a "news database". It also provides an interface 285 for reading agents to access the news database. 287 A "reader" is the person or software reading news articles. 289 A "reading agent" is software which presents articles to a 290 reader. 292 A "newsgroup" is a single news forum, a logical bulletin 293 board, having a name and nominally intended for articles on a 294 specific topic. An article is "posted to" a single newsgroup 295 or several newsgroups. When an article is posted to more than 296 one newsgroup, it is said to be "crossposted"; note that 297 this differs from posting the same text as part of each of 298 several articles, one per newsgroup. A "hierarchy" is the 299 set of all newsgroups whose names share a first component. 301 A newsgroup may be "moderated", in which case submissions 302 are not posted directly, but mailed to a "moderator" for 303 consideration and possible posting. Moderators are typically 304 human but may be implemented partially or entirely in 305 software. 307 A "followup" is an article containing a response to the 308 contents of an earlier article (the followup's "precursor"). 310 A "followup agent" is a combination of reading agent and 311 posting agent that aids in the preparation and posting of a 312 followup. 314 A "reply agent" is a combination of reading agent and mailer 315 that aids in the preparation and sending of an email response 316 to an article. 318 A "message ID" is a unique identifier for an article, usually 319 supplied by the posting agent which posted it. It 320 distinguishes the article from every other article ever posted 321 anywhere. 
Articles with the same message ID are treated as 322 identical copies of the same article even if they are not in 323 fact identical. 325 A "gateway" is software which receives news articles and 326 converts them to messages of some other kind (e.g. mail to a 327 mailing list), or vice versa; in essence it is a translating 328 relaying agent that straddles boundaries between different 329 methods of message exchange. The most common type of gateway 330 connects newsgroup(s) to mailing list(s), either 331 unidirectionally or bidirectionally, but there are also 332 gateways between news networks using this document's news 333 format and those using other formats. 335 A "control message" is an article which is marked as 336 containing control information; a relaying or serving agent 337 receiving such an article may (subject to permissions etc.) 338 take actions beyond just filing and passing on the article. 340 An article's "reply address" is the address to which mailed 341 replies should be sent. This is the address specified in the 342 article's From header (see section 5.2), unless the article also has a 343 Reply-To header (see section 6.3). 345 2.2. Textual Notations 347 Throughout this document, [MAIL] is short for "the current RFCs 348 governing electronic mail formats, beginning with the 349 historical RFC 822 and continuing to its modern successors". 351 "ASCII" is short for "the ANSI X3.4 character set" [rrr]. 352 While "ASCII" is often misused to refer to various character 353 sets somewhat similar to X3.4, in this document, "ASCII" means 354 X3.4 and only X3.4. ASCII is a 7-bit character set. Please 355 note that this document requires that all agents be 8-bit clean; 356 that is, they must accept and transmit data without changing 357 or omitting the 8th bit. 359 Certain words used to define the significance of individual 360 requirements are capitalized. 
"MUST", "SHOULD", "MAY" and 361 the same words followed by "NOT" should be read as having the 362 same meaning as in RFC 2119. 364 This document contains explanatory notes using the following 365 format. These may be skipped by persons interested solely 366 in the content of the specification. The purpose of the 367 notes is to explain why choices were made, to place them in 368 context, or to suggest possible implementation techniques. 370 NOTE: While such explanatory notes may seem superfluous in 371 principle, they often help the less-than-omniscient reader 372 grasp the purpose of the specification and the constraints 373 involved. Given the limitations of natural language for 374 descriptive purposes, this improves the probability that 375 implementors and users will understand the true intent of 376 the specification in cases where the wording is not entirely 377 clear. 379 All numeric values are given in decimal unless otherwise 380 indicated. Octets are assumed to be unsigned values for 381 this purpose. 383 Throughout this document we will give examples of various 384 definitions, headers and other specifications. It MUST be 385 remembered that these samples are for the aid of the reader 386 only and do NOT define any specification themselves. In order 387 to prevent possible conflict with "Real World" entities and 388 people the top level domain of ".example" is used in all 389 sample domains and addresses. The hierarchy of example.* is 390 also used as a sample hierarchy. Information on the ".example" 391 top level domain is in [TEST-TLDS]. 393 2.3. Syntax Notation 395 This document uses the Augmented Backus Naur Form described in 396 [ABNF]. A discussion of this is outside the bounds of this 397 document, but it is expected that implementors will be able to 398 quickly understand it with reference to the defining document. 
400 This document is intended to be self-contained; all syntax 401 rules used in it are defined within it, and a rule with the 402 same name as one found in [MAIL] does not necessarily have the same 403 definition. The lexical layer of [MAIL] is NOT, repeat NOT, 404 used in this document, and its presence must not be 405 assumed; notably, this document spells out all places where 406 white space is permitted/required and all places where 407 constructs resembling [MAIL] comments can occur. 409 NOTE: News parsers historically have been much less 410 permissive than [MAIL] parsers. 412 Text in newsgroup names, header parameters, etc. is 413 case-sensitive unless stated otherwise. 415 NOTE: This is at variance with [MAIL], which is 416 case-insensitive unless stated otherwise, but is 417 consistent with news historical practice and 418 existing news software. See the comments on backward 419 compatibility in section 1. 421 2.4. Language 423 Various constant strings in this document, such as header names 424 and month names, are derived from English words. Despite 425 their derivation, these words do NOT change when the poster 426 or reader employing them is interacting in a language other 427 than English. Posting and reading agents MAY translate 428 as appropriate in their interaction with the poster or 429 reader, but the forms that actually appear in articles 430 MUST be the English-derived ones defined in this document. 432 3. Relation To [MAIL] (RFC 822 etc.) 434 The primary intent of this document is to completely describe 435 the news article format. News articles were once considered as 436 a subset of [MAIL]'s message format augmented by some new 437 headers; this is no longer the case. News and [MAIL] have 438 diverged. It is the intention of this document that gateways 439 between [MAIL] and news still be capable of performing this 440 function automatically. 442 [MAIL] and news do follow some of the same standards, however. 
443 In particular, the MIME standards apply equally to news
444 articles.

446 4. Basic Format

448 4.1 Overall Syntax

450 Much of the syntax of News Articles is based on the
451 corresponding syntax defined by [MESSFOR], which is deemed to
452 have been incorporated into this standard as required.
453 However, there are some important differences arising from the
454 fact that [MESSFOR] does not recognise anything other than
455 US-ASCII characters, that it does not recognise the MIME
456 headers [RFC2045], and that it includes much syntax described
457 as "obsolete".

459 The following syntactic forms supersede the corresponding
460 rules given in [MESSFOR] and [RFC2045]:

462 text      = %d1-9 /        ; all octets except
463             %d11-12 /      ; US-ASCII NUL, CR and LF
464             %d14-255
465 ctext     = NO-WS-CTL /    ; all of <text> except
466             %d33-39 /      ; SP, HTAB, "(", ")"
467             %d42-91 /      ; and "\"
468             %d93-255
469 qtext     = NO-WS-CTL /    ; all of <text> except
470             %d33 /         ; SP, HTAB, "\" and <">
471             %d35-91 /
472             %d93-255
473 ftext     = %d33-57 /      ; all octets except
474             %d59-126 /     ; CTL, SP and ":"
475             %d128-255
476 token     = 1*<any (US-ASCII) CHAR except SPACE, CTLs, or tspecials>
477 tspecials = "(" / ")" / "<" / ">" / "@" /
478             "," / ";" / ":" / "\" / <"> /
479             "/" / "[" / "]" / "?" / "="

481 Wherever in this standard the syntax is stated to be taken
482 from [MESSFOR], it is to be understood as the syntax defined
483 by [MESSFOR] after making the above changes, but NOT including
484 any syntax defined in section 4 ("Obsolete syntax") of
485 [MESSFOR]. Software compliant with this standard MUST NOT
486 generate any of the syntactic forms defined in that Obsolete
487 Syntax, although it MAY accept such syntactic forms. Certain
488 syntax from the MIME specifications [RFC2045 et seq] is also
489 considered a part of this Standard (see ...).
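
As an illustration only, the octet classes of section 4.1 can be written out as sets of octet values. The sketch below is not part of the specification. The helper names octets() and all_in() are hypothetical, and NO-WS-CTL is assumed to cover %d1-8, %d11, %d12, %d14-31 and %d127, as in the mail format rules it is taken from.

```python
def octets(*ranges):
    """Build a frozenset of octet values from inclusive (low, high) pairs."""
    values = set()
    for low, high in ranges:
        values.update(range(low, high + 1))
    return frozenset(values)

# Assumed definition of NO-WS-CTL: control characters other than
# HTAB, LF, CR (and not including SP).
NO_WS_CTL = octets((1, 8), (11, 12), (14, 31), (127, 127))

# The four classes given in section 4.1.
TEXT = octets((1, 9), (11, 12), (14, 255))                   # no NUL, CR, LF
CTEXT = NO_WS_CTL | octets((33, 39), (42, 91), (93, 255))    # no SP HTAB ( ) \
QTEXT = NO_WS_CTL | octets((33, 33), (35, 91), (93, 255))    # no SP HTAB \ "
FTEXT = octets((33, 57), (59, 126), (128, 255))              # no CTL, SP, ":"

def all_in(data: bytes, allowed) -> bool:
    """Return True if every octet of data belongs to the given class."""
    return all(octet in allowed for octet in data)
```

For example, all_in(b"Message-ID", FTEXT) holds, while any string containing a colon or a space fails the same check. Sets are used here for readability; a 256-entry lookup table would serve equally well.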
491 The following syntactic forms, taken from [RFC2234] or from
492 [MESSFOR], are repeated here for convenience only:

494 ALPHA         = %x41-5A /   ; A-Z
495                 %x61-7A     ; a-z
496 CR            = %x0D        ; carriage return
497 CRLF          = CR LF
498 DIGIT         = %x30-39     ; 0-9
499 HTAB          = %x09        ; horizontal tab
500 LF            = %x0A        ; line feed
501 SP            = %x20        ; space
502 NO-WS-CTL     = %d1-8 /     ; US-ASCII control characters
503                 %d11 /      ; which do not include the
504                 %d12 /      ; carriage return, line feed,
505                 %d14-31 /   ; and whitespace characters
506                 %d127
507 WSP           = SP / HTAB   ; Whitespace characters
508 FWS           = ([*WSP CRLF] 1*WSP)   ; Folding whitespace
509 comment       = "(" *([FWS] (ctext / quoted-pair / comment))
510                 [FWS] ")"
511 CFWS          = *([FWS] comment) (([FWS] comment) / FWS )
512 <">           = %d34        ; quote mark
513 quoted-pair   = "\" text
514 quoted-string = *CFWS <"> *(FWS (qtext / quoted-pair)) <"> *CFWS
515 unstructured  = *( [FWS] text )

517 4.2. Syntax of News Articles

519 The overall syntax of a news article is:

521 article        = 1*header separator body
522 header         = header-name ":" SP header-content CRLF
523 header-name    = 1*name-character *( "-" 1*name-character )
524 name-character = ALPHA / DIGIT
525 header-content = usenet-header-content / unstructured
526 usenet-header-content
527                = 
528 separator      = CRLF
529 body           = *( *998text CRLF )
530 nonblank-text  = 1*( [FWS] nbtext )
531 nbtext         = qtext /    ; all of <text> except
532                  "\" / <">  ; SP and HTAB

534 An article consists of some headers followed by a body. An
535 empty line separates the two. The headers contain structured
536 information about the article and its transmission. A header
537 begins with a header-name identifying it, and can be continued
538 onto subsequent lines as described in section 4.3.2. The body
539 is largely unstructured text significant only to the poster
540 and the readers.
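
The overall article syntax of section 4.2 can be sketched as a small parser. This is a minimal illustration under stated assumptions, not a conforming implementation: the function name split_article() is hypothetical, continuation lines are joined by removing only the CRLF (so the leading white space stays part of the logical line), and no per-header content checks are made.

```python
import re

# header-name = 1*name-character *( "-" 1*name-character )
HEADER_NAME = re.compile(rb"[A-Za-z0-9]+(?:-[A-Za-z0-9]+)*")

def split_article(raw: bytes):
    """Split a raw article into ([(name, content), ...], body)."""
    # The separator is a truly empty line, i.e. two consecutive CRLFs.
    head, separator, body = raw.partition(b"\r\n\r\n")
    if not separator:
        raise ValueError("no empty separator line found")
    headers = []
    for line in head.split(b"\r\n"):
        if line[:1] in (b" ", b"\t"):
            # Continuation line: append it to the previous header.
            if not headers:
                raise ValueError("continuation line before any header")
            name, content = headers[-1]
            headers[-1] = (name, content + line)
            continue
        # header = header-name ":" SP header-content CRLF
        name, colon_sp, content = line.partition(b": ")
        if not colon_sp or not HEADER_NAME.fullmatch(name):
            raise ValueError("malformed header line: %r" % line)
        headers.append((name, content))
    return headers, body
```

Everything after the first empty line is returned untouched as the body, so further empty lines stay part of the body as the text requires. A line holding only white space is treated as a continuation and never as the separator.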
542 NOTE: Terminology here follows the current custom in the news 543 community, rather than the [MESSFOR] convention of referring 544 to what is here called a "header" as a "header-field" or 545 "field". 547 Note that the separator line must be truly empty, not just a 548 line containing white space. Further empty lines following it 549 are part of the body, as are empty lines at the end of the 550 article. 552 4.3. Headers 554 4.3.1. Names and Contents 556 Despite the restrictions on header-name syntax imposed by the 557 grammar, relayers and reading agents SHOULD tolerate header 558 names containing any ASCII printable character other than 559 colon (":", ASCII 58). [That brings it into line with 560 as given in [MESSFOR].] 562 Header-names SHOULD be either those defined in this standard, 563 or those defined in [MESSFOR], or those defined in any 564 extension to either of these standards, or other names 565 beginning with "X-". Software SHOULD NOT attempt to interpret 566 headers not described in this standard or in its extensions. 567 Relaying agents MUST pass them on unaltered and reading agents 568 MUST enable them to be displayed, at least optionally. 570 Posters wishing to convey non-standard information in headers 571 SHOULD use header-names beginning with "X-". No standard 572 header name will ever be of this form. Reading agents SHOULD 573 ignore "X-" headers, or at least treat them with great care. 575 The order of headers in an article is not significant. 576 However, posting agents are encouraged to put mandatory 577 headers (see section 5) first, followed by optional headers 578 (see section 6), followed by "X-" headers and headers not 579 defined in this standard or its extensions. Relaying agents 580 MUST NOT change the order of the headers in an article. 582 Header-names are case-insensitive. 
There is a preferred case
583 convention, which posters and posting agents SHOULD use:
584 each hyphen-separated "word" has its initial letter (if any)
585 in uppercase and the rest in lowercase, except that some
586 abbreviations have all letters uppercase (e.g. "Message-ID"
587 and "MIME-Version"). The forms used in this standard are the
588 preferred forms for the headers described herein. Relaying and
589 reading agents MUST, however, tolerate articles not obeying
590 this convention.

592 4.3.2 Header Classes

594 There are four special classes of headers that may be present
595 in an article: Experimental, Persistent, Comment, and
596 Variant. All other headers are ephemeral. These classes are
597 significant in how newsreaders and servers should treat them
598 when encountered.

600 4.3.3 Experimental Headers

602 Experimental headers are headers which begin with "X-". They
603 are to be used by newsreaders proposing new headers for some
604 utility or for comments to be propagated with the article.
605 There are no established headers that are considered
606 experimental headers; an established header cannot be
607 experimental.

609 Attempts to create new headers that are to be adopted as
610 standard headers MUST begin their lives as experimental
611 headers.

613 4.3.4 Persistent Headers

615 Persistent headers are headers which begin with "P-" (or
616 "X-P-", hereafter referred to simply as "P- headers") which
617 persist across followups either identically or by simple
618 modification. Headers with this behavior include:

620 Newsgroups
621   Content is carried over into all followups. Modified by
622   content of Followup-To header.

624 Subject
625   Content is carried over into all followups. Modified by
626   prefixing with "Re: " if not already present. Also modified by
627   user, often with a "(was: )" phrase preserving the previous
628   content.

630 References
631   Content is carried over into all followups. Modified by
632   appending content of Message-ID header.
634 NOTE: Though traditionally old newsreaders would treat
635 Keywords as a persistent header, it is not a persistent
636 header. More modern newsreaders do not treat it as such.

638 4.3.5. Variant Headers

640 Variant Headers are headers that are modified on articles when
641 they are propagated. Variant headers have a "V-" prefix.
642 Variant headers may be experimental ("X-V-"), persistent
643 ("P-V-"), or both ("X-P-V-").

645 4.3.6. Header Classes

647 There are four special classes of headers that may be present
648 in an article: Experimental, Persistent, Comment, and
649 Variant. All other headers are ephemeral. These classes are
650 significant in how newsreaders and servers should treat them
651 when encountered.

653 4.3.6.1 Experimental Headers

655 Experimental headers are headers which begin with "X-". They
656 are to be used by newsreaders proposing new headers for some
657 utility or for comments to be propagated with the article.
658 There are no established headers that are considered
659 experimental headers; an established header cannot be
660 experimental.

662 Attempts to create new headers that are to be adopted as
663 standard headers MUST begin their lives as experimental
664 headers.

666 4.3.6.2 Persistent Headers

668 Persistent headers are headers which begin with "P-" (or
669 "X-P-", hereafter referred to simply as "P- headers") which
670 persist across followups either identically or by simple
671 modification. Headers with this behavior include:

673 Newsgroups

675   Content is carried over into all followups. Modified by
676   content of Followup-To header.

678 Subject

680   Content is carried over into all followups. Modified by
681   prefixing with "Re: " if not already present. Also modified by
682   user, often with a "(was: )" phrase preserving the previous
683   content.

685 References

687   Content is carried over into all followups. Modified by
688   appending content of Message-ID header.
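
The three modifications listed above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name followup_headers and the use of a plain dict keyed by header-name are choices made for this example, and special Followup-To values as well as user edits such as "(was: )" are not handled.

```python
def followup_headers(precursor):
    """Derive the persistent headers of a followup (illustrative only).

    `precursor` is assumed to be a dict mapping header-names to their
    content strings; this representation is hypothetical.
    """
    # Newsgroups: carried over, unless Followup-To supplies a replacement.
    newsgroups = precursor.get("Followup-To", precursor["Newsgroups"])

    # Subject: carried over, prefixed with "Re: " if not already present.
    subject = precursor["Subject"]
    if not subject.startswith("Re: "):
        subject = "Re: " + subject

    # References: carried over, with the precursor's Message-ID appended.
    references = precursor.get("References", "")
    references = (references + " " + precursor["Message-ID"]).strip()

    return {
        "Newsgroups": newsgroups,
        "Subject": subject,
        "References": references,
    }
```

A followup agent would normally let the poster edit the resulting Subject before posting.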
690 NOTE: Though traditionally old newsreaders would treat
691 Keywords as a persistent header, it is not a persistent
692 header. More modern newsreaders do not treat it as such.

694 4.3.6.3 Examples

696   Newsgroups: alt.test
697   Subject: Persistent Header Example
698   Message-ID: <001@news.site.example>
699   P-Author-IDs:
700   User-Agent: experimental/0.1g (P-Author-ID Compliant)

702   From: jane@site.invalid (Jane Smith)
703   Newsgroups: alt.test
704   Followup-To: misc.test
705   Subject: Re: Persistent Header Example
706   Message-ID: <002@news.site.example>
707   References: <001@news.site.example>
708   P-Author-IDs:
709   User-Agent: modern/1.2 (Author-ID non-Compliant; P- header compliant)
710   Keywords: persistence, good ideas

712   From: andrew@isp.invalid
713   Newsgroups: misc.test
714   Subject: Further example (was: Re: Persistent Header Example)
715   Message-ID: <001@news.isp.example>
716   References: <001@news.site.example> <002@news.site.example>
717   P-Author-IDs:
718   User-Agent: codeveloper/2.0b (Author-ID Compliant)

720 4.3.6.4 Comment Headers

722 Comment headers are headers that are strictly local and MUST
723 NOT be propagated outside of a restricted subnet for local
724 testing purposes. Comment headers have a prefix of "C-". Due
725 to their limited scope, they MUST NOT be combined with any
726 other prefix, such as "X-C-" headers. Headers with this
727 behavior include:

729 Xref

731   Used by servers to keep track of crossposted articles' article
732   numbers in the crossposted-to news groups in the local news
733   spool as an aid to newsreaders marking such articles as read.

735 4.3.6.5. Variant Headers

737 Variant Headers are headers that are modified on articles when
738 they are propagated. Variant headers have a "V-" prefix.
739 Variant headers may be experimental ("X-V-"), persistent
740 ("P-V-"), or both ("X-P-V-").

742 4.3.7.
White Space and Continuations 744 [The following text is taken from [MESSFOR], adapted to the 745 different terminology used for this standard.] 747 Each header is logically a single line of characters 748 comprising the header-name, the colon with its following 749 SP, and the header-content. For convenience, however, the 750 header-content can be split into a multiple line 751 representation; this is called "folding". The general rule is 752 that wherever this standard allows for FWS (which includes 753 CFWS, but not simply SP or HTAB) a CRLF followed by AT 754 LEAST one SP or HTAB may instead be inserted. For example, 755 the header: 757 Approved: modname@modsite.com(Acting Moderator of 758 comp.foo.bar) 760 can be represented as: 762 Approved: modname@modsite.com 763 (Acting Moderator of comp.foo.bar) 765 NOTE: Though header-contents are defined in such a way that 766 folding can take place between many of the lexical tokens, 767 folding SHOULD be limited to placing the CRLF at higher-level 768 syntactic breaks. For instance, if a header-content is defined 769 as comma-separated values, it is recommended that folding 770 occur after the comma separating the structured items, even if 771 it is allowed elsewhere. 773 Folding MUST NOT be carried out in such a way that any line of 774 a header is made up entirely of WSP characters and nothing 775 else. [That is taken from a rather unsatisfactory line in 776 section 3.2.4 of [MESSFOR] (which seems to allow WSP-only 777 lines to arise from FWS but not from CFWS). The situation 778 could arise where two FWS or CFWS could be adjacent, according 779 to the syntax (I believe this is possible in [MESSFOR], which 780 goes to show how sloppy their syntax is), or where FWS or CFWS 781 is allowed at the end of a line.] 783 The colon following the header name on the start-line MUST be 784 followed by white space, even if the header is empty. 
If the
785 header is not empty, at least some of the content MUST appear
786 on the start-line. Posting agents MUST enforce these
787 restrictions, but relaying agents SHOULD accept even articles
788 that violate them.

790 Posters and posting agents SHOULD use SP, not HTAB, where
791 white space is desired in headers (some existing software
792 expects this), and MUST use SP immediately following the
793 colon after a header-name (this was an RFC 1036 requirement).
794 Relaying agents SHOULD accept HTAB in all such cases, however.

796 Since the white space beginning a continuation line remains a
797 part of the logical line, headers can be "broken" into
798 multiple lines only at FWS or CFWS. Posting agents SHOULD NOT
799 break headers unnecessarily (but see section 4.6).

801 4.3.8 Comments

803 Strings of characters which are treated as comments may be
804 included in header contents wherever the syntactic element
805 CFWS occurs. They consist of characters enclosed in
806 parentheses. Such strings are considered comments so long as
807 they do not appear within a quoted-string. Comments may be
808 nested.

810 A comment is normally used to provide some human readable
811 informational text, except at the end of an
which 812 contains no , as in 814 fred@foo.bar.com (Fred Bloggs) 816 as opposed to 818 "Fred Bloggs" 820 The former is a deprecated, but commonly encountered, usage 821 and reading agents SHOULD take special note of such comments 822 as indicating the name of the person whose
it is. In 823 all other situations a comment is semantically interpreted as 824 a single SP. Since a comment is allowed to contain FWS, 825 folding is permitted within it as well as immediately 826 preceding and immediately following it. Also note that, since 827 quoted-pair is allowed in a comment, the parenthesis and 828 backslash characters may appear in a comment so long as they 829 appear as a quoted-pair. Semantically, the enclosing 830 parentheses are not part of the comment token; the token is 831 what is contained between the two parentheses. 833 Since comments have not hitherto been permitted in news 834 articles, except in a few specified places, posters and 835 posting-agents SHOULD NOT insert them except in those places. 836 However, compliant software MUST accept them in all places 837 where they are syntactically allowed. 839 4.3.9. Undesirable Headers 841 A header whose content is empty is said to be an empty header. 842 Relaying and reading agents SHOULD NOT consider presence or 843 absence of an empty header to alter the semantics of an 844 article (although syntactic rules, such as requirements that 845 certain header names appear at most once in an article, MUST 846 still be satisfied). Posting and injecting agents SHOULD 847 delete empty headers from articles before posting them; 848 relaying agents MUST pass them untouched. 850 Headers that merely state defaults explicitly (e.g., a 851 Followup-To header with the same content as the Newsgroups 852 header, or a MIME Content-Type header with contents 853 "text/plain; charset=us-ascii") or state information that 854 reading agents can typically determine easily themselves (e.g. 855 the length of the body in octets) are redundant and posters 856 and posting agents SHOULD NOT include them. 858 4.4. Body 860 4.4.1. 
Body Format Issues

862 The body of an article MAY be empty, although posting agents
863 SHOULD consider this an error condition (meriting returning
864 the article to the poster for revision). A posting or
865 injecting agent which does not reject such an article SHOULD
866 issue a warning message to the poster and supply a non-empty
867 body. Note that the separator line MUST be present even if the
868 body is empty.

870 NOTE: Some existing news software is known to react badly to
871 body-less articles, hence the request for posting and
872 injecting agents to insert a body in such cases. The sentence
873 "This article was probably generated by a buggy news reader"
874 has traditionally been used in this situation.

876 Note that an article body is a sequence of lines terminated by
877 CRLFs, not arbitrary binary data, and in particular it MUST
878 end with a CRLF. However, relaying agents SHOULD treat the
879 body of an article as an uninterpreted sequence of octets
880 (except as mandated by changes of CRLF representation and by
881 control-message processing) and SHOULD avoid imposing
882 constraints on it. See also section 4.6.

884 4.4.2. Body Conventions

886 A body is by default an uninterpreted sequence of octets for
887 most of the purposes of this standard. However, a MIME
888 Content-Type header may impose some structure or intended
889 interpretation upon it, and may also specify the character set
890 in accordance with which the octets are to be interpreted.

892 NOTE: The syntax does not permit the NUL octet to appear in a
893 body, and the octets CR and LF MUST ONLY occur together as
894 CRLF. See also section 4.6 for limits on the length of a
895 line.

897 It is a common practice for followup agents to enable the
898 incorporation of the followed-up article (the "precursor")
899 as a quotation. This SHOULD be done by prefacing each line
900 of the quoted text (even if it is empty) with the character
901 ">" (or preferably with "> ").
This will result in multiple 902 levels of ">" when quoted content itself contains quoted 903 content. The followup agent SHOULD also precede the quoted 904 content by an "attribution line" incorporating at least the 905 name of the precursor's poster. 907 The following convention for attribution lines, whilst not 908 mandated by this Standard, is intended to facilitate their 909 automatic recognition and processing by sophisticated reading 910 agents. The following fields describing the precursor should, 911 if present, be in the given order. 913 A single Newsgroup name (the one from which the followup is 914 being made) enclosed within <...> or 916 The precursor's Message-ID enclosed within <...> or 918 The precursor's poster's Name enclosed within "..." 920 The precursor's poster's Email address enclosed within <...> or 921 923 The fields may be separated by arbitrary text, they may be 924 folded in the same way as headers, and they should be 925 terminated by a ":" followed by two CRLFs. Example: 927 On in <12345678@foo.com> on 24 Dec 1997 16:40:20 +0000 928 "Joe D. Bloggs" wrote: 930 NOTE: The use of the standard character ">" facilitates 931 automatic analysis of articles. The inclusion of the 932 Message-ID in the attribution would enable reading agents to 933 retrieve the precursor by clicking on it. However, readers are 934 warned not to assume that attributions are accurate, 935 especially within multiply nested quotations. 937 NOTE: Posters SHOULD edit quoted context to trim it down to 938 the minimum necessary. However, followup agents SHOULD NOT 939 attempt to enforce this beyond issuing a warning (past 940 attempts to do so have been found to be notably 941 counter-productive). 943 A "personal signature" is a short closing text automatically 944 added to the end of articles by posting agents, identifying 945 the poster and giving his network addresses, etc. 
If a poster
946 or posting agent does append such a signature to an article,
947 it MUST be preceded with a delimiter line containing (only)
948 two hyphens (ASCII 45) followed by one SP (ASCII 32). The
949 signature is considered to extend from the last occurrence of
950 that delimiter up to the end of the article (or up to the end
951 of the part in the case of a multipart MIME body). Followup
952 agents, when incorporating quoted text from a precursor,
953 SHOULD NOT include the signature in the quotation. Posting
954 agents SHOULD discourage (at least with a warning) signatures
955 of excessive length (4 lines is a commonly accepted limit).

957 4.5. Characters And Character Sets

959 Transmission paths for news articles MUST treat news articles
960 as uninterpreted sequences of octets, excluding the values 0
961 (ASCII NUL) and 13 and 10 (ASCII CR and LF, which MUST only
962 appear in the combination which denotes a line
963 separator).

965 NOTE: This corresponds to the range of octets permitted for
966 MIME "8bit data" [RFC-2045].

968 An octet, or a sequence of octets, may represent a character
969 in some Coded Character Set (CCS) [RFC-2130] as determined by
970 some Character Encoding Scheme (CES) [RFC-2130].

972 If it comes to a relaying agent's attention that it is being
973 asked to pass an article using the Content-Transfer-Encoding
974 "8bit" to a relaying agent that does not support it, it SHOULD
975 report this error to its administrator. It MUST refuse to pass
976 the article and MUST NOT re-encode it with different MIME
977 encodings.

979 NOTE: This strategy will do little harm. The target relaying
980 agent is unlikely to be able to make use of the article on its
981 own servers, and the usual flooding algorithm will likely find
982 some alternative route to get the article to destinations
983 where it is needed.

985 4.5.1.
Character Sets within Article Headers

987 Within article headers, the CES is UTF-8 [ISO-10646 or
988 RFC-2279] and hence the CCS is the Universal Multiple-Octet
989 Coded Character Set (UCS) [ISO-10646] (which is essentially a
990 superset of Unicode [UNICODE] and expected to remain so).
991 However, interpreting the octets directly as ASCII characters
992 should ensure correct behaviour in most situations.

994 NOTE: UTF-8 is an encoding for 16bit (and even 32bit)
995 character sets with the property that any octet less than 128
996 immediately represents the corresponding ASCII character, thus
997 ensuring upwards compatibility with previous practice.
998 Non-ASCII characters from UCS are represented by sequences of
999 octets greater than 127. Only those octet sequences explicitly
1000 permitted by [RFC-2279] shall be used. UCS includes all
1001 characters from the ISO-8859 series of character sets
1002 [ISO-8859] (which includes all Greek and Arabic characters) as
1003 well as the more elaborate characters used in Japan and China.
1004 See the following section for the appropriate treatment of UCS
1005 characters by reading agents.

1007 Notwithstanding the great flexibility permitted by UTF-8,
1008 there is need for restraint in its use in order that the
1009 essential components of headers may be discerned using
1010 reading agents that cannot present the full UCS range. In
1011 particular, header-names MUST be in ASCII, and certain other
1012 components of headers, as defined elsewhere in this standard -
1013 notably s (as in s), s,
1014 s s and s - MUST be in ASCII.
1015 s, s (as in
es) and s 1016 (as in s) MAY use other character sets. For 1017 s see below. 1019 Where the use of non-ASCII characters, encoded in UTF-8, is 1020 permitted as above, they MAY also be encoded using the MIME 1021 mechanism defined in RFC-2047 [RFC-2047], but this usage is 1022 deprecated within news articles (even though it is required in 1023 mail messages) since it is less legible in older reading 1024 agents which support neither it nor UTF-8. Nevertheless, 1025 reading agents SHOULD support this usage, but only in those 1026 contexts explicitly mentioned in [RFC-2047]. 1028 4.5.2 Character Sets within Article Bodies 1030 Within article bodies, the CES and CCS implied by any 1031 Content-Transfer-Encoding and Content-Type headers [RFC-2045] 1032 SHOULD be applied by reading agents. In the absence of such 1033 headers, reading agents cannot be relied upon to display 1034 correctly more than the ASCII characters. [Observe that 1035 reading agents are not forbidden to "guess", or to interpret 1036 as UTF-8 regardless, which would be the simplest course for 1037 them to take.] 1039 NOTE: It is not expected that reading agents will necessarily 1040 be able to present characters in all possible character sets, 1041 although they MUST be able to present all ASCII characters. 1042 For example, a reading agent might be able to present only the 1043 ISO-8859-1 (Latin 1) characters [ISO-8859], in which case it 1044 SHOULD present undisplayable characters using some distinctive 1045 glyph, or by exhibiting a suitable warning. Older reading 1046 agents that do not understand MIME headers or UTF-8 should be 1047 able to display bodies in ASCII (with some loss of human 1048 comprehensibility) except possibly when the 1049 Content-Transfer-Encoding is "8bit". 1051 NOTE: Be warned that it will never be safe to send raw binary 1052 data in the body of news articles, because the presence of 1053 ASCII NUL and changes of representation will inevitably 1054 corrupt it. 
Such data MUST be encoded (e.g. by using 1055 Content-Transfer-Encoding: base64). 1057 Posters SHOULD avoid using control characters in ASCII (or 1058 other CCSs) except for tab (ASCII 9), formfeed (ASCII 12), and 1059 backspace (ASCII 8). Tab signifies sufficient horizontal white 1060 space to reach the next of a set of fixed positions; posters 1061 are warned that there is no standard set of positions, so tabs 1062 should be avoided if precise spacing is essential. Formfeed 1063 signifies a point at which a reading agent SHOULD pause and 1064 await reader interaction before displaying further text. 1065 Backspace SHOULD be used only for underlining, done by a 1066 sequence of underscores (ASCII 95) followed by an equal number 1067 of backspaces, signifying that the same number of text 1068 characters following are to be underlined. Posters are warned 1069 that underlining is not available on all output devices and is 1070 best not relied on for essential meaning. Reading agents 1071 SHOULD recognize underlining and translate it to the 1072 appropriate commands for devices that support it. Reading 1073 agents MUST NOT pass other control characters or escape 1074 sequences unaltered to the output device. 1076 Followup agents MUST be careful to apply appropriate encodings 1077 to the outbound followup. A followup to an article containing 1078 non-ASCII material is very likely to contain non-ASCII 1079 material itself. 1081 4.6. Size Limits 1083 The syntax provides for the lines of a body to be up to 998 1084 octets in length, not including the CRLF. All software 1085 compliant with this standard MUST support lines of at least 1086 that length, both in headers and in bodies, and all such 1087 software SHOULD support lines of arbitrary length. In 1088 particular, relaying agents MUST transmit lines of arbitrary 1089 length without truncation or any other modification. 1091 NOTE: The limit of 998 octets is consistent with the 1092 corresponding limit in [MESSFOR]. 
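
The 998-octet figure can be checked mechanically. The Python sketch below reports which lines of a raw article exceed a given number of octets, not counting the CRLF. The names overlong_lines and MAX_LINE_OCTETS are assumptions made for this example. Such a check is suited to diagnostics or warnings only, since relaying agents are required to pass long lines through unmodified.

```python
MAX_LINE_OCTETS = 998  # line length limit, excluding the CRLF


def overlong_lines(raw, limit=MAX_LINE_OCTETS):
    """Return 1-based numbers of lines longer than `limit` octets.

    Hypothetical diagnostic helper: `raw` holds article octets with
    CRLF line endings. Nothing is truncated or altered; the caller
    decides what, if anything, to do with the result.
    """
    lines = raw.split(b"\r\n")
    return [n for n, line in enumerate(lines, start=1) if len(line) > limit]
```

Lengths are counted in octets here, which differs from the character counts used for the 72 and 79 column recommendations.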
1094 In plain-text messages (those with no MIME headers, or those 1095 with a MIME Content-Type of text/plain) posting agents SHOULD 1096 encourage the practice of keeping the length of body lines to 1097 within 79 characters at most, and preferably to within 72 1098 characters (to allow room for quoting in followups). However, 1099 posting agents MUST permit the poster to include longer lines 1100 if he so insists. 1102 NOTE: Plain-text messages are intended to be displayed "as-is" 1103 without any special action (such as automatic line splitting) 1104 on the part of the recipient. The limit (72 or 79) is 1105 expressed as a number of characters (as they will be displayed 1106 by a reading agent) rather than as the number of octets used 1107 to encode them. 1109 Posting agents SHOULD fold headers by inserting CRLF followed 1110 by 1*WSP at positions (preferably higher-level ones - see 1111 4.3.2) where this is syntactically allowed so as to keep, so 1112 far as is possible, all header lines within 79 characters. 1113 Likewise, injecting agents SHOULD fold any headers generated 1114 automatically by themselves. Relaying agents MUST NOT fold 1115 header lines (i.e. they must pass on the folding as received). 1117 NOTE: There is NO restriction on the number of lines into 1118 which a header may be split, and hence there is NO restriction 1119 on the total length of a header (in particular it may, by 1120 suitable folding, be made to exceed the 998 octets 1121 restriction pertaining to a single header line). 1123 NOTE: This standard provides no upper bound on the overall 1124 size of a single article, but neither does it forbid relaying 1125 agents from dropping articles of excessive length. It is, 1126 however, suggested that any limits thought appropriate by 1127 particular agents would be more appropriately expressed in 1128 megabytes than in kilobytes. 1130 4.7. 
Example

1132 Here is a sample article:

1134   Path: server.example,unknown.site2.example@site2.example,
1135    relay.site.example,site.example,injector.site.example%jsmith
1136   Newsgroups: example.announce,example.chat
1137   Message-ID: <9urrt98y53@site.example>
1138   From: Ann Example
1139   Subject: Announcing a new sample article.
1140   Date: Fri, 27 Mar 1998 12:12:50 +1300
1141   Approved: example.announce moderator
1142   Followup-To: example.chat
1143   Reply-To: Ann Example
1144   Expires: Wed, 22 Apr 1998 12:12:50 -0700
1145   Organization: Site1, The Number one site for examples.
1146   User-Agent: ExampleNews/3.14 (Unix)
1147   Keywords: example, announcement, standards, RFC 1036, Usefor
1148   Summary: The URL for the next standard.

1150   Just a quick announcement that a new standard example article has been
1151   released; it is in the new USEFOR draft obtainable from ftp.ietf.org.

1153   Ann.

1155   -- 
1156   Ann Example Sample Poster to the Stars
1157   "The opinions in this article are bloody good ones" - from J Clarke.

1159 5. Mandatory Headers

1161 An article MUST have one, and only one, of each of the
1162 following headers: Date, From, Message-ID, Subject,
1163 Newsgroups, Path.

1165 NOTE: [MAIL] specifies (if read most carefully) that there
1166 must be exactly one Date header and exactly one From header,
1167 but otherwise does not restrict multiple appearances of
1168 headers. (Notably, it permits multiple Message-ID
1169 headers!) This appears singularly useless, or even
1170 harmful, in the context of news, and much current news
1171 software will not tolerate multiple appearances of mandatory
1172 headers.

1174 Note also that there are situations, discussed in the
1175 relevant parts of section 6, where References, Sender,
1176 or Approved headers are mandatory. In control articles,
1177 specific values are required for certain headers.

1179 In the discussions of the individual headers, the content of
1180 each is specified using the syntax notation.
The convention
1181 used is that the content of, for example, the Subject header
1182 is defined as .

1184 NOTE: see also Section 7.1.1

1186 5.1. Date

1188 The Date header contains the date and time that the article
1189 was submitted for transmission. The content syntax is
1190 defined in the Message Format Standard [MESSFOR].

1192   Date-content = date-time

1194 5.2. From

1196 The From header contains the electronic address(es), and
1197 possibly the full name, of the article's author(s). The
1198 format of the From header is defined in the Message Format
1199 Standard [MESSFOR].

1201 All mailboxes in the From-content field MUST either belong to the
1202 poster(s) of the article (or the poster(s) are authorized by
1203 the owners to use the mailboxes) or end in the top level
1204 domain of ".invalid".

1206   From-content = mailbox-list

1208 5.2.1 Examples:

1210   From: John Smith
1211   From: John Smith , dave@isp.example
1212   From: John Smith , andrew@isp.example,
1213    fred@site2.example
1214   From: Jan Jones
1215   From: Jan Jones
1216   From: dave@isp.example (Dave Smith)

1218 NOTE: the last example is in an obsolete syntax.

1220 5.3. Message-ID

1222 The Message-ID header contains the article's message ID, a
1223 unique identifier distinguishing the article from every
1224 other article. The format of the Message-ID header is defined
1225 in the Message Format Standard [MESSFOR]. An article's
1226 message ID MUST be unique and MUST NEVER be reused.

1228   Message-ID-content = msg-id

1230 5.4. Subject

1232 The Subject field contains a short string identifying the
1233 topic of the message. When used in a followup, the field body
1234 SHOULD start with the string "Re: " (a "back reference")
1235 followed by the contents of the pure-subject of the precursor.
1237 subject-content = [ back-reference ] pure-subject CRLF 1238 pure-subject = nonblank-text 1239 back-reference = %x52.65.3A.20 ; which is a case-sensitive 1240 "Re: " 1242 The pure-subject MUST NOT begin with "Re: ". The default 1243 subject-content of a followup is the string "Re: " followed by 1244 the contents of the pure-subject of the precursor. Any leading 1245 "Re: " in the pure-subject MUST be stripped. 1247 Followup agents SHOULD remove instances of non-standard 1248 back-reference (such as "Re(2): ", "Re:", "RE: ", or "Sv: ") 1249 from the subject-content when composing the subject of a 1250 followup and add a correct back-reference in front of the 1251 result. 1253 Followup agents MUST NOT use any other string except "Re: " as 1254 a back reference. Specifically, a translation of "Re: " into a 1255 local language or usage MUST NOT be used. 1257 Agents SHOULD NOT depend on nor enforce the use of back 1258 references by followup agents. For compatibility with legacy 1259 news software the subject-content of a control message MAY 1260 start with the string "cmsg ", non-control messages MUST NOT 1261 start with the string "cmsg ". 1263 5.4.1 Examples: 1265 In the following examples, please note that only "Re: " is 1266 mandated by this DRAFT. "was: " is a convention used by many 1267 English-speaking posters to signal a change in subject matter. 1268 Software should be able to deduce this information from 1269 References. 1271 Subject: Film at 11. 1272 Subject: Re: Film at 11 1273 Subject: Use of Godwin's law considered harmful (was: Film at 11) 1274 Subject: Godwin's law (was: Film at 11) 1275 Subject: Re: Godwin's law (was: Film at 11) 1277 5.5. Newsgroups 1279 The Newsgroups header's content specifies which newsgroup(s) 1280 the article is posted to: 1282 Newsgroups-content = newsgroup-name *( ng-delim newsgroup-name) 1283 newsgroup-name = *FWS component *( "." 
component ) *FWS
1284   component       = component-start [*component-rest component-start]
1285   component-start = lowercase / digit
1286   lowercase       = /
1287   uppercase       = /
1288   digit           = /
1289   component-rest  = component-start / "+" / "-" / "_"
1290   ng-delim        = ","

1292 where the items are as described in [UNICODE].

1294 The inclusion of folding white space within a newsgroup-name
1295 is a newly introduced feature in this standard. It MUST be
1296 accepted by all conforming implementations (relaying agents,
1297 serving agents and reading agents). On the other hand, posting
1298 agents MUST NOT generate such whitespace and injecting agents
1299 MUST NOT accept such whitespace (except for experimental
1300 postings to 'test' newsgroups or within cooperating subnets)
1301 until after AGREED IMPLEMENTATION DATE. After AGREED
1302 IMPLEMENTATION DATE such agents MAY generate such whitespace
1303 anywhere and SHOULD generate it in the form of so as
1304 to keep the length of lines in the relevant headers (notably
1305 Newsgroups and Followup-To) to no more than 79
1306 characters. Before AGREED IMPLEMENTATION DATE, injecting
1307 agents MAY reformat such headers by removing whitespace
1308 inserted by the posting agent, but relaying agents MUST NOT do
1309 so.

1311 A newsgroup name consists of one or more components.
1312 Components MAY contain non-ASCII letters, but these MUST be
1313 encoded in UTF-8 and not according to RFC-2047. A component
1314 MUST contain at least one letter (and must, according to the
1315 syntax, begin and end with a letter or digit). Components
1316 SHOULD begin with a letter. Composite characters (made by
1317 overlaying one character with another) and format characters,
1318 as allowed in certain parts of Unicode and needed by certain
1319 languages, must use whatever canonical conventions apply to
1320 those parts of Unicode (such conventions are not
1321 defined in this Standard). The use of "_" in a component is
1322 deprecated.
Serving agents MAY refuse to accept newsgroups 1323 using that component. 1325 NOTE: Components composed entirely of digits would cause 1326 problems for the commonly used implementation technique of 1327 using the component as the name of a directory, whilst also 1328 using sequential numbers to distinguish the articles within a 1329 group. 1331 NOTE: Uppercase letters MUST NOT be used. Although converting 1332 ASCII uppercase letters to their lowercase counterparts is 1333 straightforward enough, it would be unreasonable to expect 1334 software to do the same in parts of Unicode for which it was 1335 not configured (in general, a table lookup would be required). 1336 Thus software MAY attempt to convert uppercase letters 1337 according to the mappings defined by [UNICODE], but this 1338 behaviour is not required. 1340 Whilst there is no longer any technical reason to limit the 1341 length of a component (formerly, it was limited to 14 1342 characters) nor to limit the total length of a newsgroup-name, 1343 it should be noted that these names are also used in the 1344 newsgroups line (...) where an overall limit applies, and 1345 moreover excessively long names can be exceedingly 1346 inconvenient in practical use. Those responsible for the 1347 management of the various netnews hierarchies SHOULD therefore 1348 set reasonable limits for the length of a component and of a 1349 newsgroup name. In the absence of such explicit policies, 1350 figures of 30 characters and 72 characters respectively are 1351 recommended. 1353 NOTE: The newsgroup-name as encoded in UTF-8 should be 1354 regarded as the canonical form. Reading agents may convert it 1355 to whatever character set they are able to display (see 4.5.2) 1356 and serving agents may possibly need to convert it to some 1357 form more suitable as a filename. Simple algorithms for both 1358 kinds of conversion are readily available. 
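
The component rules and recommended length limits of this section can be sketched as a simple checker. The Python example below is restricted to ASCII as a simplifying assumption, because the Unicode class definitions for lowercase letters and digits are not reproduced here; the name check_newsgroup_name and its parameters are hypothetical. It checks the component syntax, the requirement for at least one letter, and the suggested limits of 30 and 72 characters.

```python
import re

# ASCII-only approximation of:
#   component = component-start [*component-rest component-start]
# with component-start = lowercase / digit
# and  component-rest  = component-start / "+" / "-" / "_"
_COMPONENT = re.compile(r"[a-z0-9](?:[a-z0-9+_-]*[a-z0-9])?")


def check_newsgroup_name(name, max_component=30, max_name=72):
    """Return a list of problems found in `name` (empty if none found).

    Illustrative sketch only; non-ASCII components, which the text
    permits, are reported as syntax problems by this simplification.
    """
    problems = []
    if len(name) > max_name:
        problems.append("name longer than %d characters" % max_name)
    for comp in name.split("."):
        if not _COMPONENT.fullmatch(comp):
            problems.append("bad component syntax: %r" % comp)
        elif not re.search(r"[a-z]", comp):
            problems.append("component has no letter: %r" % comp)
        if len(comp) > max_component:
            problems.append("component longer than %d characters: %r"
                            % (max_component, comp))
    return problems
```

The length limits are passed as parameters because hierarchy administrators may set their own values.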
1360 Posters SHOULD use only the names of existing newsgroups in 1361 the Newsgroups header, because newsgroups are not created 1362 simply by being posted to. However, it is legitimate to 1363 cross-post to newsgroup(s) which do not exist on the posting 1364 agent's host, provided that at least one of the newsgroups 1365 DOES exist there, and followup agents MUST accept this 1366 (posting agents MAY accept it, but SHOULD at least alert the 1367 poster to the situation and request confirmation). Relaying 1368 agents MUST NOT rewrite Newsgroups headers in any way, even if 1369 some or all of the newsgroups do not exist on the relaying 1370 agent's host. 1372 5.5.1 Forbidden newsgroup names 1374 The following newsgroup-names MUST NOT be used: 1376 Newsgroup-names having only one component (reserved for 1377 newsgroups whose propagation is restricted to a single host, 1378 or the administrative equivalent). 1380 "poster" (because it has special meaning in the Followup-To 1381 header (see section 6.1).) 1383 "newsgroups" (likewise) 1385 "junk" (frequently used for pseudo-newsgroups internal to 1386 serving agents) 1388 "control" (likewise) 1390 Any newsgroup-name beginning with "control." (likewise) 1392 Any newsgroup-name containing the component "ctl" (likewise) 1394 "to" or any newsgroup-name beginning with "to." (reserved for 1395 test messages sent on an essentially point-to-point basis (see 1396 also the ihave/sendme protocol described in section 7.2) 1398 Any newsgroup-name containing the component "all" (because 1399 this is used as a wildcard in some implementations) 1401 A newsgroup SHOULD NOT appear more than once in the Newsgroups 1402 header. The order of newsgroup names in the Newsgroups 1403 header is not significant. 1405 5.6 Path 1407 The Path header shows the route a message took from its entry 1408 into the USENET system to the current system. It is a list of 1409 site identifiers with the origin on the right. 
Each relaying, 1410 injecting or serving agent that processes the article adds one 1411 or more entries to this header. Aside from tracing the route 1412 articles take in moving over the network, Path is used 1413 primarily to allow relaying systems to not send articles to 1414 sites known to already have them, in particular the site they 1415 came from. This improves the efficiency of links. Path is 1416 also used for USENET statistics gathering and flow tracking. 1417 Finally the presence of a "%" delimiter in the Path header can 1418 be used to identify an article injected in conformance with 1419 this standard. 1421 5.6.1 Format 1423 path-content = old-path / new-path 1425 old-id = 1*( ALPHA / digit / "-" | "." | "_") 1426 old-path = old-id *(punctuation old-id) 1427 punctuation = LWSP / %x21-2f / %x3a-40 / %x5b-60 / %x7b-7f 1428 ; These are ! " # $ % & ' ( ) * 1429 ; + , - . / : ; < = > ? @ [ \ 1430 ; ] ^ _ ` { | } ~ DEL 1431 new-delims = [FWS] ("@" / "/" / "," ) [FWS] 1432 new-path = post-injection "%" pre-injection 1433 delim-plus-id = [FWS] "!" [FWS] old-id 1434 / new-delims site-id 1435 post-injection = *(site-id 1*new-delims) site-id 1436 pre-injection = site-id *delim-plus-id 1437 site-id = ALPHA word ; UUCP name 1438 / ALPHA ; for "x" tail entry 1439 / "." word ; other registered name 1440 / ; as per RFC 1034 1441 / ; numeric IP address rep 1442 ; specified in rfc820 etc. 1443 / "[" dotted-quad "]" 1444 / "[" "]" ; per RFC1884 1445 word = 1*(ALPHA / digit / "-" / "_") 1447 5.6.2 Adding an entry to the Path header. 1449 When a system receives a message from another system, it MUST 1450 add its own unique name (path-identity or site-id) and a 1451 delimiter to the beginning of the Path string. In addition, if 1452 needed, folding-whitespace MAY be added. 1454 The path-identity added MUST be unique. To this end it should 1455 be one of: 1457 1. 
A name registered previously in the UUCP maps database 1458 (found in the newsgroup comp.mail.maps), containing no dot 1459 character. 1461 3. The fully qualified domain name or MX record, retrievable 1462 via the Internet DNS service. 1464 4. An encoding of an IP address -- dotted quad or for IPv6 as 1465 per RFC1884. These encodings SHOULD NOT be used prior to 1466 draft-implementation-date. 1468 Whichever form is chosen, a site SHOULD use a form which can be 1469 verified using one of the schemes described below by all sites 1470 to which it will forward news articles. If all forwarding is by 1471 NNTP or other internet based protocols, then the FQDN or IP 1472 address encodings are advised. For the purposes of comparison, 1473 FQDN entries should be put in an all-lower-case canonical form. 1475 Because RFC1036 specified any punctuation or whitespace could 1476 act as delimiter, programs SHOULD accept this, with the 1477 exception that IPv6 addresses containing colons MUST be treated 1478 as a single unit. Modern programs MUST generate only the set 1479 "!,%@" plus optional additional whitespace. 1481 When a site receives an article from another site, it SHOULD 1482 (MUST after draft-implementation-date) verify the identity of 1483 the source site. When processing an article from a source, the 1484 leftmost entry of the Path line should be extracted, converted 1485 to a canonical form, and tested to see if it matches the 1486 canonical form of the verified identity of the source. If it 1487 does, a "," should be used as the delimiter, and thus the 1488 comma, and then the receiving site's path-identity MUST be 1489 prepended to the Path line. 1491 The method of verification is up to the site. Any method of 1492 suitable authenticity may be chosen, with the consideration 1493 that in the event of problems at the source site, the relaying 1494 site may be called upon to reliably identify it.
1496 If the leftmost entry does not match the verified identity of 1497 the source, then the receiving site should prepend an "@" 1498 delimiter, then a simple form of the verified identity of the 1499 source, then a "," delimiter and then the receiving site's own 1500 path-identity. This adding of two identities to the line 1501 should not be done if the provided and verified identities 1502 match. For articles received from an internet source, the 1503 unique 32 bit IPv4 address or properly verified FQDN, whichever 1504 is shorter, is encouraged for the generated ID. 1506 5.6.3 The tail Entry 1508 For historical reasons, the rightmost entry in the Path string 1509 generated by most systems is not a site name, but a "user 1510 name". However, the Path string is not an E-mail address and 1511 MUST NOT be used to contact the user. Injecting agents MAY 1512 place any string here that is not a path-identity. If no 1513 meaning is anticipated the string "x" SHOULD be used. 1515 RFC1036 suggested that the last entry could be a site name, 1516 requiring software to check it when feeding, but said it also 1517 should have a user-id for very old systems. As of this 1518 specification, systems MUST NOT treat the tail entry as a 1519 path-identity. 1521 Typically this field will be the only entry on the Path string 1522 generated by a poster, or if not generated by the 1523 posting-agent, by the injecting agent, which will prepend a "%" 1524 and then its own verifiable path-identity. The percent divides 1525 the verified part of the Path line from any entries provided 1526 prior to injection into the news network. There may be more 1527 than one entry to the left of the percent, and all but the last 1528 are to be treated as sites. 1530 Injecting Agents SHOULD use the tail entry for local 1531 authentication information on the source of an article.
For 1532 example, if they wish to store an encoding of the IP address of 1533 a source machine connecting to do the injection, and/or the UID 1534 of an invoking user or any other such information, they may 1535 encode it in the tail entry, provided they do so in a manner 1536 that will not match any site identifier (e.g. ending with a 1537 dot). 1539 5.6.4 The Injecting Agent Entry 1541 The injecting agent's path identity is a special case. This 1542 identity MUST be a FQDN which can be used as a domain for 1543 E-mail connections (i.e. it should have either an A or MX 1544 record). See the duties of an injecting agent (section 7.1) 1545 and RFC 2142. 1547 5.6.5 Delimiter Summary 1549 A summary of delimiters and the meaning they imply for the 1550 name on the right, or in addition, the name to the left. 1552 , Verified or generated identity. 1554 @ Name failed verification test. Name on left is identity 1555 generated by site further to the left. 1557 % Optional pre-injection entries followed by tail entry. 1558 Commonly just the tail entry, either "x" or an encoding 1559 of login identity. Name on left is FQDN of site that 1560 handles mail for Injecting Agent. The presence of two "%" 1561 in a path indicates a double-injected error. 1563 ! Entry is unverified. Identity on left is an old-style 1564 system not conformant with this specification. 1566 Folding Whitespace MUST NOT be used as the sole delimiter. 1568 Other Treat as "!" as per RFC1036 1570 "/" Reserved for future use, treat as "," 1572 ; Semicolon is reserved for the generation of extensible headers. 1574 : The colon is a valid delimiter for legacy systems, however, 1575 inside an IPv6 numeric address, surrounded in square brackets, 1576 it is a part of the path-identifier. 1578 _ This should not be treated as punctuation (a delimiter), 1579 contrary to RFC1036. Treat as part of identifiers. 1581 5.6.6 Other formatting Issues 1583 The Path header MUST NOT be truncated.
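The prepending procedure of section 5.6.2 and the "," and "@" delimiters summarised above can be sketched in Python as follows. This is a minimal illustration, not a normative algorithm: it assumes the caller has already verified the source identity by some local means, it canonicalises by lower-casing only, and it recognises only the delimiters "!", ",", "%", "@" and "/" plus whitespace when extracting the leftmost entry (a bracketed address literal is kept as one unit). All function names are hypothetical.

```python
import re

# Leftmost entry: either a bracketed address literal or a run of
# characters up to the next delimiter or whitespace (assumed delimiter set).
_LEFTMOST = re.compile(r"\s*(\[[^\]]*\]|[^!,%@/\s]+)")


def canonical(identity):
    # Assumption: lower-casing is a sufficient canonical form for comparison.
    return identity.strip().lower()


def leftmost_entry(path):
    match = _LEFTMOST.match(path)
    return match.group(1) if match else ""


def prepend_path(path, own_identity, verified_source):
    """Return the Path content after this site has added its entry."""
    if canonical(leftmost_entry(path)) == canonical(verified_source):
        # Provided and verified identities match: add only our own identity.
        return f"{own_identity},{path}"
    # Mismatch: record the verified identity, marked with "@", then our own.
    return f"{own_identity},{verified_source}@{path}"
```

For instance, a site news.c.example receiving "other.example,news.a.example%x" from a peer it has verified as news.b.example would produce "news.c.example,news.b.example@other.example,news.a.example%x".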
1585 Whitespace MAY be present in the Path to make it easier to 1586 represent. However, there is no requirement to do so. 1587 Whitespace MUST NOT be used as a delimiter. 1589 5.6.6.1 Use of "!" 1591 Old USENET relaying and injecting programs almost all delimit 1592 Path: entries with the "!" delimiter, and these entries are 1593 not verified. As such, the presence of "%" as a delimiter 1594 will indicate the article was injected by software conforming 1595 to this standard, and the presence of "!" as a delimiter will 1596 indicate the message passed through systems developed prior 1597 to this standard. Prior to the draft-implementation-date, 1598 messages with mixed sets of delimiters will be common. After 1599 that date, all messages should have no "!" delimiters prior 1600 to the "%" delimiter. 1602 5.6.7 Suggested Verification Methods 1604 Sites attempting to verify an incoming entry SHOULD take the 1605 following approaches for common transports. They are not 1606 required, but not following them may lead to wasteful 1607 double-entry Path additions. 1609 If the incoming article arrives through some protocol local to 1610 the site, such as UUCP, that protocol MUST include a means of 1611 verifying the article source site, and this should match. In 1612 UUCP implementations, commonly each incoming connection has a 1613 unique login name and password; that login name could be used 1614 to build a suitable verified identifier. 1616 Here is an example of a suitable verification method for an 1617 article arriving via a TCP/IP protocol such as NNTP: 1619 1. If it is an encoding of an IP address, it should be decoded 1620 into a canonical form. If that address does not match the 1621 source's IP, a reverse-DNS (in-addr.arpa PTR record) lookup 1622 should be done on the provided address, followed by a regular 1623 DNS "A" record lookup on the returned name. That A record may 1624 contain several IP addresses.
So long as one matches the IP 1625 address from the path, and another matches the source IP 1626 address, this is considered a match. 1628 2. If it is an Internet DNS-style FQDN, then the name should be 1629 looked up with DNS. The A records MUST contain an IP address 1630 that is the verified address of the source. 1632 3. (It should be noted that when generating a name after a 1633 non-match, if an FQDN is desired, simply doing a reverse DNS 1634 (PTR) lookup on the IP address is not sufficient to generate 1635 the FQDN. The returned name must be mapped back to A records 1636 to assure it matches the source's IP address.) 1638 5.6.8 Issues 1640 There is no firm way to tell a path entry generated by new 1641 software from one generated by old software that assumes any 1642 delimiter is valid. However, use of "!" by old software has 1643 become effectively universal. 1645 Sites are not strictly required to use a standard form for 1646 their path entry, but if they don't, path lines out of that 1647 site get longer due to the adding of the identity. However, 1648 groups of associated sites wanting a common identity may decide 1649 to use that and let the receiver add the specific site. 1651 6. Optional Headers 1653 The headers appearing in this section have established 1654 meanings. They MUST be interpreted according to the 1655 definitions made in this document. None of them are required to 1656 appear in every article. All of the headers appearing in this 1657 document MUST NOT appear more than once in an article. Headers 1658 not appearing in this document (i.e. X-headers, headers defined 1659 by cooperating subnets) are exempt from this requirement. See 1660 "Responsibilities of Agents" for a clear picture.
1662 6.1 Followup-To 1664 The Followup-To header contents specify which newsgroup(s) 1665 followups should be posted to: 1667 Followup-To-content = Newsgroups-content / "poster" 1669 The syntax is the same as that of the Newsgroups content, with 1670 the exception that the magic word "poster" means that 1671 followups should be mailed to the article's reply address 1672 rather than posted. In the absence of Followup-To, the default 1673 newsgroup(s) for a followup are those in the Newsgroups header 1674 and for this reason the Followup-To header should not be 1675 included if it just duplicates the Newsgroups header. 1677 6.2 Expires 1679 The Expires header content specifies a date and time when 1680 the article is deemed to be no longer useful and should be 1681 removed ("expired"). The content syntax is the same as that of 1682 the Date content which is defined in the Message Format 1683 Standard [MESSFOR]. 1685 expires-content = date-time 1687 An Expires header SHOULD only be used in an article if the 1688 requested expiry time is earlier or later than the default 1689 would normally be for that article. Local policy for each 1690 serving agent will dictate when this header is obeyed and 1691 authors SHOULD NOT depend on it being completely followed. 1693 6.3. Reply-To 1695 The Reply-To header content specifies the reply address(es) to 1696 be used for personal replies to the author(s) of the article 1697 when this is different from the author's address(es) given in 1698 the From header. The format of the Reply-To header is defined 1699 in the Message Format Standard [MESSFOR]. 1701 In the absence of Reply-To, the reply address(es) is the 1702 address(es) in the From header. For this reason a Reply-To 1703 SHOULD NOT be included if it just duplicates the From header. 1705 Use of a Reply-To header is preferable to including a similar 1706 request in the article body, because reply agents can take 1707 account of Reply-To automatically.
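The defaulting rules for Followup-To and Reply-To can be shown with a small Python sketch. It assumes, purely for illustration, that the article's headers are available as a plain dictionary keyed by the exact header names used in this document, and that header contents are already unfolded; the function names are hypothetical.

```python
def reply_address(headers):
    # Reply-To takes precedence; otherwise the From address(es) are used.
    return headers.get("Reply-To") or headers["From"]


def followup_destination(headers):
    """Return ("mail", address) or ("post", [newsgroup names])."""
    followup_to = headers.get("Followup-To", "").strip()
    if followup_to == "poster":
        # The magic word "poster": mail the reply address instead of posting.
        return ("mail", reply_address(headers))
    # No Followup-To: fall back to the Newsgroups header.
    source = followup_to or headers["Newsgroups"]
    return ("post", [g.strip() for g in source.split(",") if g.strip()])
```

A followup agent could call followup_destination() on the precursor's headers and then either open a mail composer or pre-fill the Newsgroups header of the followup.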
1709 "Reply-To: <> " MAY be used to indicate that the poster does 1710 not wish to receive email replies. 1712 Reply-To-content = From-content 1714 6.3.1 Examples: 1716 Reply-To: John Smith 1717 Reply-To: John Smith , dave@isp.example 1718 Reply-To: John Smith , andrew@isp.example, 1719 fred@site2.example 1720 Reply-To: Please do not reply <> 1722 6.4. References 1724 The References header content lists optionally CFWS-separated 1725 message ids of precursors. The format of the References header 1726 is defined in the Message Format Standard [MESSFOR]. 1728 A followup MUST have a References header, and an article that 1729 is not a followup MUST NOT have a References header. In a 1730 followup, if the precursor did not have a References header, 1731 the followup's References content MUST be formed by the 1732 message ID of the precursor. A followup to an article which 1733 had a References header MUST have a References header 1734 containing the precursor's References content, plus the 1735 precursor's message ID appended to the end of the list 1736 (separated from it by optional CFWS). 1738 Followup Agents SHOULD NOT trim message ids out of the 1739 References content unless the number of message ids exceeds 31 1740 in which case message ids SHOULD be trimmed until there are 1741 only 31. 1743 Trimming SHOULD be done by removing the sixth (6th) message-id 1744 and any incomplete or otherwise broken message-ids. If 1745 Followup Agents trim any message-ids out of the References 1746 content, then they MUST leave the first five and the last nine 1747 message ids and they SHOULD also leave any message ids 1748 mentioned in the body of the article intact. 1750 NOTE: Software writers should be aware that the number of 1751 message ids in this header may exceed 31 and software must be 1752 able to handle this without problem. 1754 References-content = msg-id [msg-id...]
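The construction and trimming rules for References can be sketched in Python as below. This is only an illustration under stated assumptions: message ids are recognised as any whitespace-free text in angle brackets, comments within CFWS are discarded, and the removal of "incomplete or otherwise broken" ids and the protection of ids mentioned in the article body are not attempted. The function name is hypothetical.

```python
import re

# Assumption: a message id is any run without "<", ">" or whitespace in angle brackets.
_MSG_ID = re.compile(r"<[^<>\s]+>")
MAX_IDS = 31


def followup_references(precursor_refs, precursor_msg_id, max_ids=MAX_IDS):
    """Build the References content for a followup."""
    ids = _MSG_ID.findall(precursor_refs or "")
    ids.append(precursor_msg_id)
    # Trim only when the limit is exceeded, always removing the sixth id,
    # so the first five and the most recent ids are left untouched.
    while len(ids) > max_ids:
        del ids[5]
    return " ".join(ids)
```

With the default limit of 31, repeatedly dropping the sixth id leaves the first five and the last 26 ids in place, which satisfies the requirement to keep the first five and the last nine.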
1756 6.4.1 Examples: 1758 References: 1759 References: 1760 References: 1761 <222@site1.example><87tfbyv@site7.example><67jimf@site666.example> 1762 References: 1763 1765 6.5. Control 1767 The Control header content marks the article as a control 1768 message, and specifies the desired actions (other than the 1769 usual ones of filing and passing on the article): 1771 Control-content = verb *( FWS argument ) verb = 1*( ALPHA / 1772 DIGIT ) argument = 1* ftext 1774 The verb indicates what action should be taken, and the 1775 argument(s) (if any) supply details. In some cases, the body 1776 of the article may also contain details. The next section 1777 describes the standard verbs. 1779 6.6. Control Messages 1781 The following sections document the group control messages. 1782 "Message" is used herein as a synonym for "article" unless 1783 context indicates otherwise. Group control messages are a 1784 special class of control messages, that request the group 1785 configuration on a server be updated. 1787 All of the group control messages MUST have an Approved header 1788 (section 6.10). They SHOULD use one of the authentication 1789 mechanisms defined in section TBD. 1791 The execution of the actions requested by control messages is 1792 subject to local administrative restrictions, which MAY deny 1793 requests or refer them to an administrator for approval. The 1794 descriptions below are generally phrased in terms suggesting 1795 mandatory actions, but any or all of these MAY be subject to 1796 local administrative approval (either as a class or 1797 case-by-case). Analogously, where the description below 1798 specifies that a message or portion thereof is to be ignored, 1799 this action MAY include reporting it to an administrator. 1801 Relaying Agents MUST propagate even control messages they do 1802 not understand. 1804 In the following sections, each type of control message is 1805 defined syntactically by defining its arguments and its body. 
1806 For example, "cancel" is defined by defining cancel-arguments 1807 and cancel-body. 1809 6.6.1 The "newgroup" Control Message 1811 newgroup-ctrl = "newgroup" FWS groupname [ FWS flags ] 1812 flags = "moderated" 1813 groupname ; defined in [NEWS] 1815 The "newgroup" control message requests the specified group be 1816 created or changed. The text "moderated" is appended to mark 1817 the group as moderated. The message contains a 1818 "multipart/newsgroupinfo" (section 6.6.1.1) body part containing 1819 machine- and human-readable information about the group. 1821 The newgroup command is also used to update the description 1822 line or moderation status of a group. 1824 NOTE: It is also possible to send newgroups for existing 1825 groups that don't change anything to ensure the group exists on 1826 all systems ("booster" newgroups). Implementations might want 1827 to test for this condition before attempting to update their 1828 configuration. 1830 6.6.1.1 multipart/newsgroupinfo 1832 The "multipart/newsgroupinfo" body structure contains 1833 information about a (new) newsgroup. 1835 The MIME content type definition of "multipart/newsgroupinfo" 1836 is: 1838 MIME type name: multipart 1839 MIME subtype name: newsgroupinfo 1840 Required parameters: boundary (see [MIME2]) 1841 Optional parameters: none 1842 Encoding considerations: "7bit" or "8bit" is sufficient and 1843 MUST be used to maintain compatibility. 1844 Security considerations: to be added 1846 A "multipart/newsgroupinfo" body part contains the following 1847 subparts: 1849 1. An "application/newsgroupinfo" part (section 6.6.1.2) 1850 containing the name and description line of the group(s). This 1851 part is mandatory. 1853 2. Other parts containing useful information about the 1854 background of the newgroup message. 1856 3. Parts containing initial named articles for the 1857 newsgroup. See section 6.6.1.3 for details.
1859 6.6.1.2 application/newsgroupinfo 1861 The "application/newsgroupinfo" body part contains brief 1862 information about a newsgroup, i.e. the group's name, its 1863 description and the moderation flag. 1865 NOTE: This part has a format that makes the whole 1866 "multipart/newsgroupinfo" structure compatible with [1036BIS]. 1868 The MIME content type definition of "application/newsgroupinfo" 1869 is: 1871 MIME type name: application 1872 MIME subtype name: newsgroupinfo 1873 Optional parameters: none 1874 Encoding considerations: "7bit" or "8bit" is sufficient and 1875 MUST be used to maintain compatibility. 1876 Note that the descriptions may use [MIME3]. 1877 Security considerations: to be added 1879 The content of the "application/newsgroupinfo" body part is 1880 defined as: 1882 groupinfo-body = descriptor-tag CRLF 1*( description-line CRLF ) 1883 descriptor-tag = %x46.6F.72 SP %x79.6F.75.72 SP 1884 %x6E.65.77.73.67.72.6F.75.70.73 SP 1885 %x66.69.6C.65.3A 1886 ; case sensitive "For your newsgroups file:" 1887 description-line = newsgroup-name [ 1*WSP description 1888 [ 1*SP group-flags ] ] 1889 description = nonblank-text / encoded-word 1890 moderation-flags = [ moderated-literal ] 1891 moderated-literal = %x28.4D.6F.64.65.72.61.74.65.64.29 1892 ; case sensitive "(Moderated)" 1894 group-flags = [ "<" addr-spec ">" 1*SP ] "(Moderated)" 1896 The "application/newsgroupinfo" is used in conjunction with the 1897 "newgroup" (section 6.6.1) and "mvgroup" control messages (section 1898 6.6.3) as part of a "multipart/newsgroupinfo" (section 6.6.1.1) MIME 1899 structure. 1901 Moderated newsgroups SHOULD be marked by appending the case 1902 sensitive text " (Moderated)" at the end. 1904 NOTE: Due to the line length limit in [MAIL], [NEWS] and 1905 [NNTP], a description line can have a maximum length of 998 1906 octets. 1908 NOTE: In some hierarchies, there exist conventions that set a 1909 far lower limit, often in characters.
1911 NOTE: Usually, the description length is limited in a way that 1912 the newsgroup name, the tab (interpreted as an 8-character tab 1913 that takes one at least to column 24) and the description 1914 without flags fit into the first 79 characters. 1916 NOTE: Servers that use a "newsgroups" file will store the 1917 group descriptions there as is, i.e. without any conversion of 1918 charsets or encoding. 1920 NOTE: The descriptions will also be used with the [NNTP] LIST 1921 NEWSGROUPS command. The descriptions will be sent as is, i.e. 1922 without any conversion of charsets or encoding. 1924 6.6.1.3 Initial Named Articles 1926 Some parts of a multipart/newsgroupinfo structure MAY contain 1927 an initial set of named articles. These parts are identified by 1928 the Article-Name header just like normal named article 1929 postings. The named articles are filed separately as single 1930 postings, where the headers of the enclosing control message 1931 are copied to every part that contains a named article except 1932 that: 1934 Content-* and Article-* headers MUST be taken from the body part. 1936 The message id MUST be changed by inserting /partX before the @ 1937 sign, where X is the number of the body part, starting with 0. 1938 The Control header of the enclosing message header MUST be 1939 stripped. It MAY be replaced by a "Control: named" header. 1940 Signatures (Auth, X-Auth...) of the enclosing message SHOULD be 1941 stripped. They MAY be replaced by a signature of the own site. 1943 The resulting articles are for internal use of the server and its 1944 users only, they MUST NOT, repeat MUST NOT be forwarded to other 1945 sites. 1947 Nested multipart/* structures are allowed, they are not 1948 recursively expanded to separate articles. 1950 6.6.2 The "rmgroup" Control Message 1952 rmgroup-ctrl = "rmgroup" FWS groupname 1954 The "rmgroup" control message requests the specified group be 1955 removed from the list of valid groups.
The Content-Type of the 1956 body is unspecified; it MAY contain anything, usually an 1957 explaining text. 1959 NOTE: It is also possible to send rmgroups for nonexisting, 1960 bogus groups to ensure the group is removed on all systems 1961 ("booster" rmgroups). Implementations might want to test for 1962 this before attempting to update their configuration. 1964 6.6.3 The "mvgroup" Control Message 1966 mvgroup-ctrl = "mvgroup" FWS ( mvgrp-groups / mvgrp-hrchy) 1967 mvgrp-groups = groupname [ FWS groupname ] 1968 mvgrp-hrchy = groupnamepart ".*" FWS groupnamepart 1969 groupnamepart = groupname ; syntactically 1971 6.6.3.1 Single group 1973 The "mvgroup" control message requests the first specified 1974 group to be moved to the second group. The message contains a 1975 "multipart/newsgroupinfo" (section 6.6.1.2) body part containing 1976 machine- and human-readable information about the new group. 1978 When this message is received, the new group SHOULD be created 1979 and all articles, including named articles, SHOULD be copied or 1980 moved to the new group, then the old, now empty group SHOULD be 1981 deleted. 1983 NOTE: For servers that use a file system directory structure to 1984 organize message storage, this operation is quite efficiently 1985 implemented as a single directory rename operation. 1987 If the old group does not exist, the message is ignored unless 1988 the new group does not exist either, in which case the new 1989 group is created just as for a "newgroup" message. 1991 An indication that the old group was replaced by the new group 1992 MAY be left back in the server's configuration and be made 1993 available to clients. 1995 NOTE: For servers that use an "active" file this means an entry 1996 in the form "oldgroup xxx yyy =newgroup" is created. 1998 NOTE: If the old group did not exist, this is considered a 1999 local configuration error. Therefore it is the best to correct 2000 this error when a mvgroup is received. 
2006 If both groups exist, the groups MAY be "merged". If this is 2007 done, it MUST be done correctly, i.e. implementations MUST take 2008 care that the messages in the group being deleted are 2009 renumbered accordingly to avoid overwriting articles in one 2010 group with those of the other and that crossposted articles 2011 don't appear twice. Otherwise, the old group is just deleted. 2013 In all cases, information transported in the 2014 "multipart/newsgroupinfo" body part is applied to the new group. 2016 Named articles are taken from the mvgroup message, the new 2017 group (if already existent) and the old group in this 2018 precedence. 2020 As a special case, the second name, i.e. the one of the new 2021 group MAY be omitted. In this case, only the information of the 2022 group is updated according to the contained 2023 "multipart/newsgroupinfo". 2025 6.6.3.2 Multiple Groups 2027 If the first name ends with the character sequence ".*", the 2028 mvgroup message requests a whole (sub)hierarchy to be moved. 2029 The same procedure as for single groups (section 6.6.3.1) applies 2030 to every matched group; however, some systems might be able to 2031 optimize the process. 2033 NOTE: For servers that use a file system directory structure to 2034 organize message storage, this process can be optimized by 2035 renaming the parent directory instead of every group's 2036 directory. 2038 To avoid recursion, the new groups' names MUST NEVER match the 2039 old groups' name pattern; i.e. moving a whole (sub)hierarchy to 2040 a subhierarchy of the original hierarchy is explicitly 2041 disallowed. 2043 6.6.4 The "checkgroups" Control Message 2045 The "checkgroups" control message contains a list of all valid 2046 groups in a complete hierarchy.
The "Control:" header has the 2047 following format: 2049 checkgroup-ctrl = "checkgroups" [ FWS chkscope ] [ FWS chksernr ] 2050 chkscope = 1*( ["!"] newsgroup-name-part ) 2051 chksernr = "#" 1*DIGIT 2053 The chkscope parameter(s) specifies the (sub)hierarchy(s) for 2054 which this "checkgroups" message applies. 2056 6.6.4.1 Example: 2057 Control: checkgroups de !de.alt #248 2059 NOTE: "Old" software is known to ignore this parameter. Thus a 2060 "checkgroups" message SHOULD also contain the groups of other 2061 subhierarchies the sender is not responsible for. "New" 2062 software MUST ignore groups which don't fall into the scope of 2063 the "checkgroups" message. 2065 If no scope for the checkgroups message is given, it applies to 2066 all hierarchies for which group statements appear in the 2067 message. 2069 "Checkgroups" messages MAY also contain a serial number, which 2070 can be any positive integer (i.e. just numbered or the date in 2071 YYYYMMDD). It SHOULD increase by an arbitrary value with every 2072 change to the group list and MUST NOT ever decrease. 2074 NOTE: This was added to circumvent security problems in 2075 situations where the Date header cannot be signed. 2077 The body of the message is an "application/newscheckgroups" part 2078 containing the list of ALL valid groups (and possibly deletion 2079 confirmations) for the specified hierarchies. 2081 6.6.5 application/newscheckgroups 2083 The "application/newscheckgroups" body part contains a complete 2084 list of all newsgroups in a top level hierarchy, their 2085 description lines and moderation status. 2087 The MIME content type definition of 2088 "application/newscheckgroups" is: 2090 MIME type name: application 2091 MIME subtype name: newscheckgroups 2092 Optional parameters: none 2093 Encoding considerations: "7bit" or "8bit" is sufficient and 2094 MUST be used to maintain compatibility. 2095 Note that the descriptions may use [MIME3].
2096 Security considerations: to be added 2098 The content of the "application/newscheckgroups" body part is 2099 defined as: 2101 checkgroups-body = *( invalidation CRLF ) 1*( valid-group CRLF ) 2102 invalidation = "!" groupname *( "," *WSP groupname ) 2103 valid-group = description-line 2104 description-line ; see section 6.6.1.2 2106 The "application/newscheckgroups" content type is used in 2107 conjunction with the "checkgroups" control message (section 2108 6.6.4). 2110 6.6.5.1 Examples 2112 A "newgroup" with bilingual charter and policy information: 2114 From: admin@example.invalid (example.all Administrator) 2115 Newsgroups: example.admin.groups,example.admin.announce 2116 Date: 27 Feb 1997 12:50:22 +14:00 (EST) 2117 Subject: Group example.admin.info created. 2118 Approved: admin@example.invalid 2119 Control: newgroup example.admin.info moderated 2120 Message-ID: 2121 Content-Type: multipart/newsgroupinfo; boundary="nxtprt" 2122 Content-Transfer-Encoding: 8bit 2124 This is a MIME control message. 2125 --nxtprt 2126 Content-Type: application/newsgroupinfo 2128 For your newsgroups file: 2129 example.admin.info Information on the example.* hierarchy 2130 (Moderated) 2132 --nxtprt 2133 Content-Type: multipart/alternative ; 2134 differences = content-language ; 2135 boundary = nxtlang 2136 Article-Name: example.admin.info: charter 2138 --nxtlang 2139 Content-Type: text/plain; charset=us-ascii 2140 Content-Transfer-Encoding: 7bit 2141 Content-Language: en 2143 The group example.admin.info contains regularly posted information on 2144 the example.* hierarchy. 2145 --nxtlang 2146 Content-Type: text/plain; charset=us-ascii 2147 Content-Transfer-Encoding: 8bit 2148 Content-Language: de 2150 Die Gruppe example.admin.info enthält regelmäßig versandte 2151 Informationen über die example.*-Hierarchie.
    --nxtlang--
    --nxtprt--

Plain "rmgroup":

    From: admin@example.invalid (example.all Administrator)
    Newsgroups: example.admin.groups, example.admin.announce
    Date: 4 Jul 1997 22:04 +02:00 (PST)
    Subject: Deletion of example.admin.obsolete
    Message-ID:
    Approved: admin@example.invalid
    Control: rmgroup example.admin.obsolete

    The group example.admin.obsolete is obsolete. Please remove it from
    your system.

Plain "mvgroup":

    From: admin@example.invalid (example.all Administrator)
    Newsgroups: example.admin.groups, example.admin.announce
    Date: 30 Jul 1997 22:04 +02:00 (CEST)
    Subject: Moving example.oldgroup to example.newgroup
    Message-ID:
    Approved: admin@example.invalid
    Control: mvgroup example.oldgroup example.newgroup
    Content-Type: multipart/newsgroupinfo; boundary=nxt

    --nxt
    Content-Type: application/newgroupinfo

    For your newsgroups file:
    example.newgroup The new replacement group.
    --nxt

    The group example.oldgroup is replaced by example.newgroup.
    Please update your configuration.
    --nxt--

More complex "mvgroup" for a whole hierarchy:

The charter of the group example.talk.jokes contained a reference to
example.talk.jokes.d, which is also being moved. So the charter is
updated.

    From: admin@example.invalid (example.all Administrator)
    Newsgroups: example.admin.groups, example.admin.announce
    Date: 30 Jul 1997 22:04 +02:00 (PST)
    Subject: Deletion of example.admin.obsolete
    Message-ID:
    Approved: admin@example.invalid
    Control: mvgroup example.talk.* example.conversation
    Content-Type: multipart/newsgroupinfo; boundary=nxt; chartas=1

    --nxt
    Content-Type: application/newgroupinfo

    For your newsgroups file:
    example.conversation.boring Boring conversations.
    example.conversation.interesting Interesting conversations.
    example.conversation.jokes Jokes and funny stuff.
    example.conversation.jokes.d Discussion about example.conversation.jokes.

    Article-Name: example.conversation.jokes: charter

    This group is to publish jokes and other funny stuff.
    Discussions about the articles posted here should be redirected
    to example.conversation.jokes.d; adding a Followup-to: header
    is recommended.
    --nxt--

6.6.6 Cancel

The cancel message requests that one or more target articles be
"canceled", i.e. withdrawn from circulation or access. This
message MAY be issued by entities which processed the target
article(s) while they were still proto-articles (i.e. posters,
posting agents, moderators and injecting agents; see also
Gateways [2.1]). Other entities MUST NOT use this method to
remove articles.

NOTE: A separate method for other entities to cancel articles
will be defined in a later draft.

    cancel-arguments = 1*( message-id CFWS )
    cancel-body      = body

The argument(s) identify the article(s) to be cancelled, by
message-id. The body SHOULD contain an indication of why the
cancellation was requested. The cancel message SHOULD be posted
to the same newsgroup(s), with the same distribution(s), as the
article(s) it is attempting to cancel.

In order for a cancel message to remove an article, one of the
following conditions must hold:

1. The mailing addresses from the From line of the cancel
message and the target article match and the target article is
otherwise unauthenticated.

2. At least one authentication method of the target article
MUST be matched by the cancel message; in addition, the mailing
addresses from the From line of the cancel message and the
target article MAY match.

NOTE: The Sender, From or Approved headers MUST NOT be used as
an "authentication method" within the meaning of the previous
paragraph.
If the above conditions are satisfied then the
relaying or serving agent SHOULD delete the target article
completely and immediately (or at the minimum make the article
unavailable for relaying or serving) and also SHOULD reject any
copies of this article that appear. See also section 7 on
duties of Serving and Relaying agents.

6.6.7 ihave, sendme

The ihave and sendme control messages implement a crude
batched predecessor of the NNTP [rrr] protocol. They are
largely obsolete in the Internet, but still see use in the UUCP
environment, especially for backup feeds that normally are
active only when a primary feed path has failed.

NOTE: The ihave and sendme messages defined here have
ABSOLUTELY NOTHING TO DO WITH NNTP, despite similarities of
terminology.

The two messages share the same syntax:

    ihave-arguments  = *( message-id space ) relayer-name
    sendme-arguments = ihave-arguments
    ihave-body       = *( message-id CRLF )
    sendme-body      = ihave-body

Message IDs MUST appear in either the arguments or the body, but
NOT both. Relayers SHOULD generate the form putting message
IDs in the body, but the other form MUST be supported for
backward compatibility.

The ihave message states that the named relaying agent has
received articles with the specified message IDs, which may be
of interest to the relaying agents receiving the ihave message.
The sendme message requests that the agent receiving it send
the articles having the specified message IDs to the named
relaying agent.

These control messages are normally sent essentially as
point-to-point messages, by using "to." newsgroups (see section
5.5.1) that are sent only to the relaying agent the messages are
intended for. The two relaying agents MUST be neighbors,
exchanging news directly with each other.
Each relaying agent
advertises its new arrivals to the other using ihave messages,
and each uses sendme messages to request the articles it lacks.

To reduce overhead, ihave and sendme messages SHOULD be sent
relatively infrequently and SHOULD contain reasonable numbers
of message IDs. If ihave and sendme are being used to implement
a backup feed, it may be desirable to insert a delay between
reception of an ihave and generation of a sendme, so that a
slightly slow primary feed will not cause large numbers of
articles to be requested unnecessarily via sendme.

6.6.8 Obsolete control messages.

The following forms of control messages are declared obsolete
by this document:

    sendsys
    version
    whogets
    senduuname

6.7. Distribution

The Distribution header specifies geographical or
organizational limits to an article's propagation:

    Distribution-content  = distribution *( dist-delim distribution )
    dist-delim            = ","
    distribution          = positive-distribution / negative-distribution
    positive-distribution = *FWS distribution-name *FWS
    negative-distribution = *FWS "!" distribution-name *FWS
    distribution-name     = 1*letter

[That is more restrictive than Henry, omitting '+', '-' and
'_', but more liberal in allowing uppercase letters, which in
fact are commonly used, and in not specifying any 14 character
limit.]

A distribution is case-insensitive (i.e. "US", "Us" and "us"
all specify the same distribution). In the absence of a
Distribution header, the default Distribution-content is
"world". However, "world" SHOULD NOT be explicitly mentioned
unless a negative-distribution is also present, as in

    Distribution: world, !us

"All" MUST NOT be used as a distribution-name.
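
The Distribution-content grammar above can be illustrated with a
small parser. This is only a sketch under stated assumptions: the
function name parse_distribution is hypothetical, "letter" is taken
to mean the ASCII letters A-Z and a-z, and names are folded to
lower case as one way of honouring case-insensitivity. It is not
part of this specification.

```python
import re

# Assumption: "letter" in distribution-name means ASCII A-Z / a-z.
_NAME = re.compile(r"[A-Za-z]+")


def parse_distribution(content: str) -> tuple[set[str], set[str]]:
    """Split a Distribution-content value into (positive, negative) sets.

    Hypothetical helper; names are lower-cased because distributions
    are case-insensitive.
    """
    positive: set[str] = set()
    negative: set[str] = set()
    for item in content.split(","):          # dist-delim is ","
        item = item.strip()                   # drop surrounding FWS
        is_negative = item.startswith("!")
        name = item[1:] if is_negative else item
        if not _NAME.fullmatch(name):
            raise ValueError(f"invalid distribution-name: {name!r}")
        if name.lower() == "all":
            raise ValueError('"all" MUST NOT be used as a distribution-name')
        (negative if is_negative else positive).add(name.lower())
    return positive, negative
```

For the example header value "world, !us" this sketch yields the
positive set {"world"} and the negative set {"us"}.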
Articles MUST NOT be passed between relaying agents unless the
sending agent has been configured to supply and the receiving
agent has requested to receive BOTH of (a) at least one of the
newsgroups in the article's Newsgroups header, and (b) at
least one of the positive-distributions in the article's
Distribution header and none of the negative-distributions.
Exceptionally, ALL relaying agents are deemed willing to
supply or accept the distribution "world", and NO relaying
agent should supply or accept the distribution "local".

Posting agents SHOULD NOT provide a default Distribution
header without giving the poster an opportunity to override
it. Followup agents SHOULD initially supply the same
Distribution header as found in the precursor.

All the two-letter country names (e.g. "us") commonly used as
top-level domain names may be used as distributions, but the
common non-country top-level domain names (such as "edu" and
"com") are NOT distributions; moreover, top-level
newsgroup-names (such as "comp" and "soc") are NOT
distributions. Apart from the above, distribution-names are a
matter for negotiation between the relaying agents or
cooperating subnets involved.

6.8. Keywords

The Keywords field contains a comma-separated list of
important words and phrases intended to describe some aspect
of the content of the article. The format of the Keywords
header is defined in the Message Format Standard [MESSFOR].

NOTE: The list is comma separated, NOT space separated.

6.9. Summary

The Summary header content is a short phrase summarizing the
article's content.

    summary-content = non-blank-text CRLF
    non-blank-text  = 1*(FWS text)

The summary SHOULD be terse.
Authors SHOULD avoid trying to
cram their entire article into the headers; even the
simplest query usually benefits from a sentence or two of
elaboration and context, and not all reading agents display
all headers. On the other hand the summary should give more
detail than the Subject.

6.10. Approved

The Approved header content indicates the mailing addresses
(and possibly the full names) of the persons or entities
approving the article for posting:

    Approved-content = From-content

An Approved header is required in all postings to moderated
newsgroups. If this header is not present then relaying and
serving agents MUST reject the article.

An Approved header is also required in certain control
messages, to reduce the probability of accidental posting of
same; see the relevant parts of section 6.6.

Please see section 7.1 on how injecting agents should treat
posts to moderated groups that do not contain this header.

6.11 Lines

The Lines header content indicates the number of lines in the
body of the article:

    Lines-content = 1*digit

The line count includes all body lines, including the
signature if any, and including empty lines (if any) at the
beginning or end of the body. (The single empty separator line
between the headers and the body is not part of the body.) The
"body" here is the body as found in the posted article as
transmitted by the posting agent.

Reading agents SHOULD NOT rely on the presence of this header,
since it is optional (and some posting agents do not supply
it). They MUST NOT rely on it being precise, since it
frequently is not.

6.12. Xref

The Xref header content indicates where an article was filed
by the last server to process it:

    Xref-content    = server 1*( CFWS location )
    server          = server-name
    location        = newsgroup-name ":" article-locator
    article-locator = 1*

The serving agent's name is included so that software can
determine which serving agent generated the header. The
locations specify what newsgroups the article was filed under
(which may differ from those in the Newsgroups header) and
where it was filed under them. The exact form of an article
locator is implementation-specific.

NOTE: The traditional form of an article locator is a decimal
number, with articles in each newsgroup numbered consecutively
starting from 1. NNTP demands that such a model be
provided, and there may be other software which expects it,
but it seems desirable to permit flexibility for unorthodox
implementations.

An agent inserting an Xref header into an article MUST delete
any previous Xref header(s). A relaying agent MUST only create
and/or relay an Xref header if it is correct on all the receiving
agents the article is forwarded to. Serving agents SHOULD
insert this header unless the information in it (apart from
the serving agent's name) is already correct, in which case it
should be left unchanged.

An agent MUST use the same name in Xref headers as it uses in
Path headers.

6.13. Organization

The Organization header content is a short phrase identifying
the author's organization:

    organization-content = non-blank-text CRLF

NOTE: Posting and injection agents are discouraged from
providing a default value for this header unless it is
acceptable to all posters using these agents. Unless this
header contains useful information (including some indication
of the author's physical location) posters are discouraged from
including it.

6.14. User-Agent

The User-Agent header contains information about the user
agent (typically a newsreader) generating the article. This is
for statistical purposes and tracing of standards violations
to specific software needing correction. Although OPTIONAL,
user agents SHOULD include this header with the articles they
generate.

The field MAY contain multiple product tokens and comments
identifying the agent and any subproducts which form a
significant part of the user agent such as external agents
used for message composition, separated injecting agents (such
as those used by offline newsreaders), and significant
libraries that are part of such agents. The products are
listed in order of their significance for identifying the
application, not necessarily in chronological order of
handling prior to injection. Injecting agents MAY include
product information for servers (such as INN/1.7.2), but
servers MUST NOT generate or modify this header to list
themselves.

User-Agent MUST NOT be modified after injection, but MAY be
stripped or have its contents replaced prior to re-injection
by another user agent such as an anonymizing gateway.

    User-Agent         = "User-Agent:" SP User-Agent-content
    User-Agent-content = product *(CFWS product) [CFWS]

At least one product MUST be present. The first token MUST NOT
be a comment. Comments relate to the previously named product,
not the product following it.

    product         = token ["/" product-version]
    product-version = token

Product tokens should be short and to the point -- they MUST
NOT be used for information beyond the canonical name of the
product and its version.
Although any token character MAY
appear in a product-version, this token SHOULD be used only
for a version identifier (i.e., successive versions of the
same product SHOULD differ only in the product-version portion
of the product value). Product tokens MUST identify products.

NOTE: Variations from RFC 1945:

1. product token is required and MUST be first,

2. text that is not a token is forbidden in the position of the
product token,

3. comment allows quoted-pair,

4. "{" and "}" are allowed in token (product and
product-version) in news,

5. octets from character sets other than ASCII are allowed.

NOTE: Comments should be restricted to information regarding
the product named to their left such as platform information
and should be concise. Use as an advertising medium (in the
mundane sense) is discouraged.

Recipients of header field TEXT containing octets outside the
US-ASCII character set may assume that they represent UTF-8
characters.

NOTE: Variation from RFC 1945: UTF-8 replaced ISO-8859-1 as
charset assumption.

6.14.1 Examples:

    User-Agent: tin/1.2-PL2
    User-Agent: tin/1.3-950621beta-PL0 (Unix)
    User-Agent: tin/unoff-1.3-BETA-970813 (UNIX) (Linux/2.0.30 (i486))
    User-Agent: tin/pre-1.4-971106 (UNIX) (Linux/2.0.30 (i486))
    User-Agent: Mozilla/4.02b7 (X11; I; en; HP-UX B.10.20 9000/712)
    User-Agent: Microsoft-Internet-News/4.70.1161
    User-Agent: Gnus/5.4.64 XEmacs/20.3beta17 ("Bucharest")
    User-Agent: Pluto/1.05h (RISC-OS/3.1) NewsHound/1.30
    User-Agent: inn/1.7.2
    User-Agent: inews
    User-Agent: telnet

NOTE: Some current web proxy applications append their product
information to the list in the User-Agent field. This is not
recommended on the web and is forbidden for news, since it
makes machine interpretation of these fields ambiguous.
User-Agent is not intended to be a total audit trail of what
software has handled the article.

NOTE: Some existing web clients fail to restrict themselves to
the product token syntax within the User-Agent field when
using this header on the web. Such abuses are forbidden for
news.

NOTE: This header supersedes the role performed redundantly by
"X-" headers such as X-Newsreader, X-Mailer, X-Posting-Agent,
X-Http-User-Agent, and other headers previously used on USENET
for this purpose. Use of these "X-" headers SHOULD be
discontinued in favor of the single, standard User-Agent
header which is to be used freely both in news and mail.

NOTE: There are slight changes to the original HTTP-defined
format of the User-Agent header as noted, but headers in
strict, common-sense compliance with RFC 1945 are compliant with
this specification. The syntax from RFC 1945 is preferred,
including the requirement that products and comments be
separated by a space.

6.15 MIME headers

6.15.1 Syntax

The following headers, as defined within [RFC 2045] and its
extensions, may be used within articles conforming to this
document.

    MIME-Version:
    Content-Type:
    Content-Transfer-Encoding:
    Content-ID:
    Content-Description:
    Content-Disposition:
    Content-MD5:

Insofar as the syntax for these headers as given in [RFC 2045]
does not specify precisely where whitespace and comments may
occur (whether in the form of WS, FWS or CFWS), the usage
defined in this Standard, and failing that in [MESSFOR], and
failing that in [RFC 822] MUST be followed. In particular,
there MUST NOT be any WS between a header-name and the
following colon and there MUST be a SPACE following that
colon.
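
The spacing rule in the preceding paragraph can be checked
mechanically. The following is a minimal sketch, not a normative
test: the function name header_spacing_ok is hypothetical, and the
header-name is assumed to consist of printable ASCII characters
other than the colon, which may be broader than the header-name
rule defined elsewhere in this document.

```python
import re

# Assumption: a header-name is one or more printable ASCII characters
# (0x21-0x7E) excluding ":"; the draft's own rule may be narrower.
_HEADER_START = re.compile(r"^[!-9;-~]+: ")


def header_spacing_ok(line: str) -> bool:
    """Return True if the header-name is followed directly by ':' and
    that colon is followed by a SPACE (hypothetical helper)."""
    return _HEADER_START.match(line) is not None
```

Under these assumptions "Content-Type: text/plain" passes, while
"Content-Type : text/plain" (WS before the colon) and
"Content-Type:text/plain" (no SPACE after the colon) are rejected.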
The meaning of the various MIME headers is as defined in [RFC
2045] and [RFC 2046], and in extensions registered in
accordance with [RFC 2048]. However, their usage is restricted
as described in the following sections.

6.15.2 Content-Transfer-Encoding

Posting agents SHOULD specify "Content-Transfer-Encoding: 8bit"
for all articles not written in pure ASCII or whose
content type implies binary. They MAY use "8bit" encoding even
when "7bit" encoding would have sufficed. They SHOULD specify
"base64" when the content type implies binary (i.e. content
intended for machine, rather than human, consumption).

Posting agents SHOULD NOT specify encoding "quoted-printable",
but reading agents MUST interpret that encoding correctly.
Encoding "binary" MUST NOT be used because this Standard does
not mandate a transport mechanism that could support it
(exceptions might be made in closed networks with alternative
transport arrangements).

Injecting and relaying agents MUST NOT change the encoding of
articles passed to them. Gateways SHOULD ONLY change the
encoding if absolutely necessary.

6.15.3 Content-Type

Network news is primarily a means of sharing textual
information amongst a wide audience in a timely manner. The
network is largely self regulating and operates by the
consensus of its membership rather than by the dictate of any
central authority (indeed, this lack of centralised control is
seen as one of the overall strengths of the system). There are
practices which, whilst being technically unexceptionable, are
politically undesirable (or contrary to established
"netiquette").
Insofar as there exist authorities empowered (by common
consent or otherwise) to define what is and is not proper in
various hierarchies or newsgroups or cooperating subnets,
those authorities ought to establish, by means of rules,
guidelines, charters or whatever else, the practices
considered acceptable within their domains. In particular they
ought to establish which of the more exotic content types are
likely to be inappropriate. In the absence of such specific
guidance, the following default recommendations are offered
as an indication of best practice at the present time.

6.15.3.1 Text

"Content-Type: text/plain" is the expected type for any news
article. Attention is drawn to the recommendations and limits
on line lengths set out in section 4.6. Indeed, in any "text"
content type the lines as transmitted (i.e. including any
formatting instructions) ought to observe the recommendations
set out in that section for the benefit of readers who can
only see it in its transmitted form.

While "Content-Type: text/enriched" [RFC 1896] can be
considered acceptable in news articles, "Content-Type:
text/HTML" is not appropriate since it relies on protocols
currently unavailable to many reading agents.

6.15.3.2 Application

Generally speaking, the application content types are
inappropriate for use outside of specialised newsgroups and
subnets, especially where vendor-specific application
subtypes are involved.

"Application/octet-stream" is only appropriate in newsgroups
where binaries are customarily accepted.

6.15.3.4 Image, Audio and Video

Likewise, these content types are only appropriate in
newsgroups where binaries are customarily accepted.

6.15.3.5 Multipart

"Content-Type: multipart/mixed" (also "multipart/parallel")
may be used freely in news articles.
The use of "Content-Type: multipart/alternative" is deprecated
(on account of the extra bandwidth consumed and the difficulty
of quoting in followups).

"Content-Type: multipart/digest" is recommended for any article
composed of multiple messages more conveniently viewed as
separate entities. The "boundary" should be composed of 28
hyphens (ASCII 45) (which makes each boundary delimiter 30
hyphens, or 32 for the final one) so as to accord with current
practice for digests [RFC 1153].

"Content-Type: multipart/signed" [RFC 1847, RFC 2015] is the
preferred method for the bodies of news articles that are to
be digitally signed. However, encryption (as in
"multipart/encrypted") is unlikely to be appropriate in a
medium normally directed at such a wide readership.

6.15.3.6 Message

The Content Types "message/rfc822" and "message/news" are
unlikely to be of much use within news articles, but attention
is drawn to the benefits of using "message/news" in gatewaying
from mail to news and vice versa. In particular, news articles
mailed to moderators SHOULD be totally encapsulated within an
email message using "message/news". Moreover, use of
"message/news" in conjunction with a suitable Transfer
Encoding forms a convenient way of "tunnelling" a news article
through a transport medium that does not support 8bit
characters.

[This paragraph needs further work. Both message/news and
application/news-transmission are recognised by IANA, but the
distinction between them is not clear and their present
definitions are out of date and omit to state that base64 etc.
encodings are permitted (RFC 2046 is silent on that issue). We
should take the opportunity to rewrite those specifications
and include them (or at least one of them) in the present
standard.]
"Content-Type: message/partial" MAY be used to split a long
news article into several smaller ones, but this usage is
deprecated on the grounds that modern transport agents should
have no difficulty in handling articles of arbitrary length.
If this feature is used, then the "id" parameter should be in
the form of a unique message-id (but different from the
Message-ID of any of the partial articles). The second and
subsequent partial articles should contain References headers
referring to all the previous parts (note that these headers
will be discarded upon reassembly of the parts). Contrary to
the requirements specified in [RFC 2046], the
Transfer-Encoding should be set to 8bit at least in each part
that requires it.

"Content-Type: message/external-body" could be appropriate for
texts which it would be uneconomic (in view of the likely
readership) to distribute to the entire network.

6.15.3.7 Character Sets

In principle, any character set may be specified in the
"charset=" parameter of a content type. However, character
sets other than "us-ascii", "iso-8859-1" (and the
corresponding parts of UTF-8) ought only to be used in
hierarchies where the language customarily used so requires
(and whose readers could be expected to possess agents capable
of displaying them).

6.15.4 MIME within headers

Since the headers of news articles are expected to use the
UTF-8 character set, they SHOULD NOT normally be encoded using
the MIME mechanism defined in RFC-2047 [RFC-2047].
Nevertheless, reading agents SHOULD support that usage.

It is to be noted, however, that RFC-2047 encoding would be
required were a news article to be transmitted as a mail
message without first encapsulating it as a "message/news" as
suggested above.

6.15. Supersedes / Replaces

These two headers take a list of message-ids (msgid-list) that
the current article is expected to replace or supersede. All
listed articles MUST be treated as though a "cancel" control
message had arrived for the message, except as detailed below.

These headers are essentially synonyms, with one change in
behavior: Replaces uses the old article's message-id for
the more recently arrived article, rather than creating a
new article.

The Supersedes header content specifies articles to be
cancelled on arrival of this one:

    Supersedes-content = message-id *( FWS message-id )

NOTE: There is no "c" in "Supersedes".

Older software supported only Supersedes, and with only one
Message-ID. Until Multi-Super-Date, software SHOULD generate
Supersedes with only one Message-ID, and cancel control
messages SHOULD be issued if needed for other IDs.

If the header is "Replaces" the new successor article SHOULD
effectively over-write the predecessor(s) so that any attempt
to read them shows the successor. Newsreaders should not show
the article as an "unread" article unless the replaced
articles were themselves all unread. A Replacement is
considered a minor change, unworthy of being brought to the
attention of a person who read one of the predecessors.
Newsreaders and database systems MAY provide access to
predecessors of articles if they wish, but this should not be
part of the course of normal newsreading, and is in fact
discouraged.

Systems MAY treat Replaces as a synonym for Supersedes, if
they do not implement the semantics of Replaces.

If the header is "Supersedes" then the old articles SHOULD
simply be deleted, as in a cancel, and the new article
inserted into the system like any new article.
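
The difference between the two headers, as seen by a reader, can
be sketched with a toy model. Everything here is illustrative and
assumed rather than specified: the class ReaderState, its method
names and the message-ids in the usage note are hypothetical, and
the authorization check required before any deletion is
deliberately left out.

```python
from dataclasses import dataclass, field


@dataclass
class ReaderState:
    """Toy per-reader view: which message-ids exist and which are unread."""

    present: set[str] = field(default_factory=set)
    unread: set[str] = field(default_factory=set)

    def post(self, msgid: str) -> None:
        """An ordinary new article arrives and is shown as unread."""
        self.present.add(msgid)
        self.unread.add(msgid)

    def read(self, msgid: str) -> None:
        """The reader has seen this article."""
        self.unread.discard(msgid)

    def arrive(self, msgid: str, header: str, predecessors: list[str]) -> None:
        """Apply a successor carrying a Supersedes or Replaces header.

        Authorization (who may cancel the predecessors) is NOT modelled.
        """
        olds = [p for p in predecessors if p in self.present]
        all_unread = all(p in self.unread for p in olds)
        for p in olds:
            # In both cases the predecessors disappear, as with a cancel.
            self.present.discard(p)
            self.unread.discard(p)
        self.present.add(msgid)
        # Supersedes: treated like any new article, hence unread.
        # Replaces: unread only if every replaced article was unread.
        if header == "Supersedes" or all_unread:
            self.unread.add(msgid)
```

In this sketch, a reader who has already read a hypothetical
article "<a@x>" is not shown its Replaces successor as unread,
whereas a Supersedes successor always appears as a new article.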
Attempts to fetch a replaced or superseded article either by
number or by Message-ID SHOULD retrieve instead the most
recent successor. Some indication that a newer version than
was asked for has been delivered MAY be provided. It is
particularly encouraged that NNTP servers implement delivery
of the successor upon requests by Message-ID so that WWW "news:"
and "msg:" URLs continue to work even when an article has a
successor.

It is expected that "Replaces" will become the common header
for routine article changes and corrections, with Supersedes
used for periodic postings (possibly every N periodic
postings) or updates that make major changes to an article.

As with a cancel, systems MUST NOT delete or replace articles
unless the poster of the successor is authorized to cancel the
predecessor.

6.15.1 Message-ID version numbers chain procedure.

NOTE: Sections 6.15.1 - 6.15.4 may be published as a separate
recommendations document.

Tools superseding or replacing messages should arrange that
the Message-ID of a replacement follows the following set of
rules, generating what are known as "version-number"
Message-IDs.

1. If the Local-Part of the predecessor's Message-ID ends in
"%v=" followed by an integer version number, the new
Message-ID should replace that number with the next integer.

    Example:
    Message-ID is replaced by
    .

2. If the Local-Part of the predecessor's Message-ID does not
end in "%v=" followed by a version number, then the string
"%v=1" should be appended to the Local-Part to generate the
successor Message-ID.

    Example:
    Message-ID is replaced by
    .

6.15.2 Implementation and Use Note

Typically a news database will store a "pointer" of some sort
between replaced/superseded articles and their immediate
successor or most recent successor.
Such pointers may be
expired along with other records in a news system's message-id
lookup database. In addition, if a "version-number" Message-ID
is found, and the "root" version (without the "%v=" tag, or
with a "%v=0" tag) is not present on the server, a pointer
from that root to the most recent successor SHOULD also be
stored, and kept so long as there is a current successor in
the system. (Systems should check for both root forms, trying
the "%v=0" form first, and the tagless form second.)

Thus when a request for an article comes in that is not
present (due to superseding or replacement) a check can be
made for a pointer record for that Message-ID, or failing
that, if the ID has a version-number, for a pointer record for
the root versionless ID. Such pointers can be followed to the
most recent successor.

6.15.3 Transition

Prior to Multi-Super-Date, a message may contain both a
Replaces field and a Supersedes field. This should be treated
as a Replaces, with the Supersedes added to assure that older
systems still at least remove the predecessor.

6.15.4 Replaced-by

This header takes a message-id as argument.

Prior to Multi-Super-Date, if there is a need to Supersede by
use of a simple Cancel control message (due to inability to
use multiple IDs in the Supersedes header) then such control
messages may contain a "Replaced-by" header indicating the
Message-ID of the successor of the message that was cancelled.
Systems maintaining pointers from predecessors to successors
should use this record to update their pointers.

Note this header goes only on the cancel control message, not
the successor. The successor should have a Replaces and/or
Supersedes listing the most immediate predecessor.

6.15.5.1 Examples

The first edition of an FAQ is posted with a Message-ID of the
form: .
The next version, a week later, has:

   Message-ID:
   Supersedes:

The next one, another week later has:

   Message-ID:
   Supersedes:

The next one, another week later has:

   Message-ID:
   Supersedes:

Note that the long spacing between issues means the multi-entry
Supersedes is there primarily to preserve pointer records at sites
not using the version-number system for message-ids.

Under the above, requests for the root (original) message-ID will
return the most recent FAQ. On systems using the version-number
system (which is optional) requests for any Message-ID in the chain
will return the most recent, for all time. As such the URL
"news:groupname-faq@faqsite.com" will always work, making it
suitable to appear in HTML.

6.15.5.2 Example

A user posts a message to the net.
She notices a typo, and 2 minutes later, posts with:

   Message-ID:
   Replaces:

3 minutes later she sees another typo, and posts:

   Message-ID:
   Replaces:

The two bad versions will be replaced with the 3rd, even if a site
never sees the 2nd due to batching or feed problems, and requests
for the original will return the 3rd.

During transition, she adds a Supersedes header to the 3rd message,
with the first (direct predecessor) ID. She issues a Cancel message
as well:

   Control: cancel Replaced-by:

6.15.6 Dates

Multi-Super-Date ... in one year. (1036-spencer required
multiple-ID supersedes, so by now just about everybody should
already support it, is this true?) "Replaces" active -- whatever
date we are putting for general compliance with this spec by news
database systems.

6.15.7 Issues

No syntax for the internals of message-ids has been declared on the
net. However, there is no harm if a conforming message-id matches
the syntax.
The syntax has been designed so that additional flags may be added
to a message-id if desired, in a general "%keyword=value" form prior
to the at-sign.

Permanent message-ids as created by this system may even be
implemented by smart NNTP servers which fetch old messages from
other servers, increasing the availability of USENET messages
considerably.

Unfortunately, it will be some time until any new feature is widely
deployed.

6.16 Archive

This optional header is a signal to automatic archival agents as to
whether this article is available for long-term storage. Agents
which see "Archive: no" MUST NOT keep the article past the Expires
date. Any other text indicates conditions that an agent SHOULD
follow in order to archive the article.

   Archive-content = "no" | CFWS

6.17. Obsolete Headers

Persons writing new agents SHOULD ignore any former meanings of
these headers.

   Also-Control
   See-Also
   Article-Names
   Article-Updates

7. Duties of Various Agents

The following section sets out the duties of various Agents involved
in the creation, relaying and serving of Usenet articles.

Agents which write to the Path header MUST conform to RFC2142 with
respect to contact addresses, especially the "usenet" and "abuse"
addresses.

7.1 Duties of an Injecting Agent.

An injecting agent is responsible for taking a proto-article from a
posting agent and either forwarding it to a moderator or injecting
it into the relaying system for access by readers.

As such an Injecting Agent is considered responsible for ensuring
that any article it injects conforms with the policies and rules of
this document and of any newsgroups that the article is posted to.

To this end injecting agents MAY cancel articles which they have
previously injected.

7.1.1 Proto-articles.
A proto-article is one that is created by a posting agent and has
not been injected into the news system by an injecting agent. Only
one copy of a proto-article MUST exist. A proto-article has the same
format as a normal article except that some of the compulsory
headers MAY be missing. A proto-injected article MAY have the
following headers missing: "Message-Id: ", "Date: " and "Path: ".
These headers MUST NOT contain invalid values; they MUST either be
correct or not present at all.

A proto-article MUST NOT contain the "!" or "%" character in the
Path header.

Proto-articles SHOULD NOT contain the Originator-Info header. See
reference [draft-newman-msgheader-originfo-x.txt] on this header for
more information.

7.1.2 Procedure followed by Injecting Agents.

An injecting agent receives proto-articles from posting and followup
agents. It verifies them, adds headers where required and then
either forwards them to a moderator or injects them by passing them
to serving or relaying agents. An injecting agent SHOULD only accept
articles from trusted agents.

An injecting agent MAY reject articles in which headers contain
"forged" email addresses, that is, addresses which are not valid for
the known source, and do not end in ".invalid".

If an injecting agent receives an otherwise valid article that has
already been injected it SHOULD either act as if it is a relaying
agent or pass the article on to a relaying agent completely
unaltered. It MUST NOT forward an already injected article to a
moderator. Articles SHOULD NOT be injected twice.
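The proto-article constraints of 7.1.1 that can be checked without knowledge of header syntax might be sketched as follows. This is only an illustration: the function name and the representation of headers as a name-to-value mapping are assumptions, and the check that Message-ID, Date and Path are "correct" when present is deliberately left out because it depends on syntax defined elsewhere in this document.

```python
from typing import Dict, List


def proto_article_problems(headers: Dict[str, str]) -> List[str]:
    """List the 7.1.1 problems found in a proto-article's headers.

    Header names are compared case-insensitively. Message-ID, Date and
    Path are allowed to be absent, so their absence is not reported.
    """
    lowered = {name.lower(): value for name, value in headers.items()}
    problems = []

    # A proto-article MUST NOT contain "!" or "%" in the Path header.
    path = lowered.get("path")
    if path is not None and ("!" in path or "%" in path):
        problems.append('Path header contains "!" or "%"')

    # Proto-articles SHOULD NOT contain the Originator-Info header.
    if "originator-info" in lowered:
        problems.append("Originator-Info header is present")

    return problems


if __name__ == "__main__":
    draft = {"From": "poster@example.invalid", "Path": "host!poster"}
    print(proto_article_problems(draft))
```

An empty list means that none of the checks shown here failed; it does not by itself establish that the proto-article is valid.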
An injecting agent accepts a proto-article, checks it, and does one
of the following:

(a) If the article is invalid, incorrectly formatted or unacceptable
due to site policy the posting agent MUST be informed (such as via a
[NNTP] 44x response code) that posting has failed and the article
MUST NOT be injected nor forwarded to a moderator.

(b) If the Newsgroups line contains one or more moderated groups and
the article does NOT contain an Approved header then the injecting
agent MUST forward the article to the moderator of the first
(leftmost) moderated group listed in the Newsgroups line via email.
The injecting agent MUST also add headers as detailed below.

(c) If the proto-article is not posted to any moderated newsgroups
or the Approved header is correctly present then the injecting agent
should convert the proto-article to an injected article (see below)
and forward it to one or more relaying or serving agents.

7.1.3 Headers added by Injecting Agents.

When an injecting agent forwards an article to a moderator or
injects it, it MUST do the following:

The Message-ID and Date headers (and their content) MUST be added if
not already present. The Path header MUST be correctly added if the
article is being injected but SHOULD NOT be added if it is being
forwarded to a moderator.

If the Originator-Info header is already present in the
proto-article then it MUST be removed if incorrect and a correct one
MAY be added.

The Injecting Agent MUST verify the poster in some way. The Path
header (section 5.6) MUST be correctly used and some other secure
standard method (such as the Originator-Info header) MAY be used.
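Cases (a) to (c) of 7.1.2, together with the Message-ID, Date and Path rule of 7.1.3, can be read as a small decision procedure. The sketch below is one possible reading, not a reference implementation. The names Action, injection_action and headers_to_add are hypothetical, and the validity check and the test for a "correctly present" Approved header are taken as inputs because they depend on site policy.

```python
from enum import Enum
from typing import Iterable, List, Optional, Sequence, Set, Tuple


class Action(Enum):
    REJECT = "reject"                        # case (a)
    FORWARD_TO_MODERATOR = "forward"         # case (b)
    INJECT = "inject"                        # case (c)


def injection_action(
    is_valid: bool,
    newsgroups: Sequence[str],
    moderated_groups: Set[str],
    has_valid_approved: bool,
) -> Tuple[Action, Optional[str]]:
    """Return the action and, for case (b), the group whose moderator
    should receive the article."""
    if not is_valid:
        return Action.REJECT, None
    if not has_valid_approved:
        # The first (leftmost) moderated group decides the moderator.
        for group in newsgroups:
            if group in moderated_groups:
                return Action.FORWARD_TO_MODERATOR, group
    return Action.INJECT, None


def headers_to_add(action: Action, present: Iterable[str]) -> List[str]:
    """Names of headers the injecting agent has to supply (7.1.3)."""
    if action is Action.REJECT:
        return []
    names = {name.lower() for name in present}
    needed = [h for h in ("Message-ID", "Date") if h.lower() not in names]
    if action is Action.INJECT:
        # Path is added on injection, not when forwarding to a moderator.
        needed.append("Path")
    return needed


if __name__ == "__main__":
    moderated = {"example.moderated"}
    action, group = injection_action(
        True, ["example.test", "example.moderated"], moderated, False
    )
    print(action, group, headers_to_add(action, ["From", "Newsgroups"]))
```

Keeping the decision separate from the header handling makes it easier to see that an article bound for a moderator never has a Path header added.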
The Injecting Agent MAY add other headers not listed in this draft
but MUST NOT alter, delete or reorder any headers already present in
the article except the Originator-Info header (see above). The
Injecting Agent MUST NOT alter the body of the article in any way.

7.2 Duties of a Relaying Agent

A relaying agent accepts injected articles from injecting and other
relaying agents and passes them on to relaying agents or serving
agents according to mutually agreed policy. Relaying Agents SHOULD
only accept articles from trusted agents.

A relaying agent MAY reject articles in which headers contain
"forged" email addresses, that is, addresses which are not valid for
the known source, and do not end in ".invalid".

A relaying agent MUST perform checks on an article to ensure it
complies with this standard. If the article is invalid, unwanted
(see below) or unacceptable due to site policy the agent that passed
the article to the relaying agent SHOULD be informed (such as via a
[NNTP] 43x response code) that relaying failed. In order to prevent
a large number of error messages being sent to one location, a
relaying agent MUST NOT inform any other external entity that an
article was not relayed UNLESS that external entity has specifically
requested that it be informed of these errors.

In order to prevent overloading, relaying agents SHOULD NOT query an
external entity (such as a key-server) in order to verify an
article.

When an article is received the relaying agent MUST verify the
previous entry in the Path header and add its own entry or entries
according to the syntax defined in section 5.6 (Path). Relaying
agents MUST NOT alter, delete or rearrange any part of an article
except for the Path and Xref headers.

Articles which match mutually agreed criteria should be passed on to
neighboring relaying and serving agents.
NOTE: It is usual for relaying and serving agents to restrict the
Newsgroups, Distributions, age and size of articles they wish to
receive.

7.2.1 Unwanted and Invalid articles

Relaying Agents MUST reject all articles that do not have all
mandatory headers present with legal contents or which have illegal
contents in optional headers.

Relaying Agents SHOULD reject any articles that have already been
sent to them (a database of message-ids of recent messages is
usually kept and matched against) or which are too old (from the
Date header) for them to determine whether they have already been
received. Relaying Agents SHOULD NOT forward articles to sites whose
path-identity is already in the Path header.

Relaying Agents SHOULD also reject any articles that have been
Canceled, Superseded or Replaced by their author or another trusted
entity.

7.3 Duties of a Serving Agent

A Serving Agent takes an article from a relaying or injecting agent
and files it in a "news database". It also provides an interface for
reading agents to access the news database. This database is
normally indexed by newsgroup, with articles in each newsgroup
numbered consecutively starting from 1. See [NNTP] for more
information on this format.

NOTE: Control messages are usually filed in the separate
pseudo-newsgroup "control" or a pseudo-newsgroup in a hierarchy
under "control." (i.e. "control.cancel"). Serving Agents SHOULD do
this if they serve articles via NNTP.

Serving Agents are encouraged to only allow access to trusted
reading agents.

Serving Agents SHOULD generate a correct Xref header for crossposted
articles and MUST prepend a correct path-identity into the Path
header of all articles.
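The duplicate and age checks of 7.2.1 depend on a database of recently seen message-ids. The following sketch shows one way such a history could behave. It is an assumption-laden illustration: the class name, the retention period and the in-memory dictionary are hypothetical, and a real agent would use persistent storage.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict


class History:
    """Remembers message-ids for a fixed retention period."""

    def __init__(self, retention: timedelta) -> None:
        self.retention = retention
        self._seen: Dict[str, datetime] = {}  # message-id -> time received

    def should_reject(
        self, message_id: str, article_date: datetime, now: datetime
    ) -> bool:
        """True if the article is a duplicate or too old to tell."""
        if now - article_date > self.retention:
            # Older than anything the history still covers, so it cannot
            # be determined whether the article was already received.
            return True
        if message_id in self._seen:
            return True
        self._seen[message_id] = now
        return False

    def expire(self, now: datetime) -> None:
        """Forget entries received longer ago than the retention period."""
        cutoff = now - self.retention
        self._seen = {m: t for m, t in self._seen.items() if t >= cutoff}


if __name__ == "__main__":
    now = datetime(1998, 7, 1, tzinfo=timezone.utc)
    history = History(retention=timedelta(days=14))
    posted = now - timedelta(days=1)
    print(history.should_reject("<a@example.invalid>", posted, now))
    print(history.should_reject("<a@example.invalid>", posted, now))
```

Rejecting everything older than the retention period is what makes it safe to expire old entries: an article that could still be accepted is always young enough for its entry to be present if it was received before.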
7.3.1 Unwanted articles

Serving Agents MUST reject all articles that do not have all
mandatory headers present with legal contents or which have illegal
contents in optional headers.

Serving Agents SHOULD reject any articles that have already been
sent to them (a database of message-ids of recent messages is
usually kept and matched against) or which are too old (from the
Date header) for them to determine whether they have already been
received.

Serving Agents SHOULD also reject any articles that have been
Canceled, Superseded or Replaced by their author or another trusted
entity and delete any of these articles that they already have in
their news database.

7.4 Duties of a Posting Agent.

A posting agent is used to assist the poster in creating a valid
proto-article and forwarding it to an injecting agent.

Posting Agents SHOULD ensure that proto-articles they create are
valid Usenet articles according to the standards of this document
and other policies.

Posting agents meant for use by ordinary posters SHOULD reject any
attempt to post an article which cancels, Supersedes or Replaces
another article if the target article is not by the poster.

7.5 Duties of a Followup Agent

A Followup Agent is a special case of a Posting Agent and as such is
bound by all the Posting Agent's requirements plus additional ones.
Followup Agents MUST create valid followups. Followups have
requirements beyond those of normal articles for the syntax of the
References and Subject headers and the body format.

Followup Agents MUST by default follow the FollowUp-To header when
deciding which newsgroups a followup is posted to; however, the
poster MAY override the default if they wish.

Followup Agents MUST NOT attempt to send email to any address ending
in ".invalid".
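A followup agent could apply the ".invalid" rule with a check along the following lines before attempting to send mail. This is a minimal sketch: the function name is hypothetical, and it uses Python's standard email.utils.parseaddr only to separate a display name from the address.

```python
from email.utils import parseaddr


def ends_in_invalid(address: str) -> bool:
    """True if the address's domain is in the ".invalid" top-level domain."""
    _display_name, addr = parseaddr(address)
    # Take everything after the last "@", ignoring case and a trailing dot.
    domain = addr.rpartition("@")[2].lower().rstrip(".")
    return domain == "invalid" or domain.endswith(".invalid")


if __name__ == "__main__":
    for candidate in (
        "Jane Doe <jane@nowhere.invalid>",
        "jane@example.com",
        "jane@invalid.example.com",
    ):
        print(candidate, ends_in_invalid(candidate))
```

Only the final label of the domain is significant, so an address such as jane@invalid.example.com is not treated as invalid.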
Followup Agents SHOULD NOT email copies of the followup to the
author of the precursor (or any other person) unless this has been
explicitly requested.

7.6 Duties of a Gateway

NOT DONE

8. Propagation and Processing

Most aspects of news propagation and processing are
implementation-specific. The basic propagation algorithms, and
certain details of how they are implemented, nevertheless need to be
standard.

There are two important principles that news implementors (and
administrators) need to keep in mind. The first is the well-known
Internet Robustness Principle:

   Be liberal in what you accept, and conservative in what you send.

However, in the case of news there is an even more important
principle, derived from a much older code of practice, the
Hippocratic Oath (we will thus call this the Hippocratic Principle):

   First, do no harm.

It is VITAL to realize that decisions which might be merely
suboptimal in a smaller context can become devastating mistakes when
amplified by the actions of thousands of hosts within a few hours.

In the case of gateways, the primary corollary to this is:

   Cause no loops.

9. Security And Related Issues

There is no security. Don't fool yourself. USENET is a prime example
of an Internet Adhocratic-Anarchy; that is, an environment in which
trust forms the basis of all agreements. It works.

Articles which are intended to have restricted distribution are
dependent on the goodwill of every site receiving them. The
"X-No-Archive: yes" header is widely recognized as a signal to
automated archivers not to file an article, but that cannot be
guaranteed.

The Distribution header makes provisions for articles which should
not be propagated beyond a cooperating subnet. The key security word
here is "cooperating".
When a machine is not configured properly, it may become
uncooperative and tend to distribute all articles.

9.1 Attacks

The two categories of attacks that news is most vulnerable to are
Denial-of-Service and exploitations of particular implementations.
Many have argued that "spam" (massively crossposted or reposted
articles) constitutes a DoS attack in its own right. This may be so.

Sending off-topic messages is a matter for individual hierarchies
and newsgroups to control. It is a violation of this DRAFT to
"forge" an email address, that is, to use a valid email address
which you are not entitled to use. All invalid email addresses used
in headers MUST end in the ".invalid" top-level-domain. This
facility is provided primarily for those who wish to remain
anonymous, but do not care to take the additional precautions of
using more sophisticated anonymity measures.

It is possible that legal penalties may apply to sending unsolicited
commercial email and/or news articles. Check with your local legal
authorities.

10. Security Considerations

Section 9 discusses security considerations.

11. References:

[TEST-TLDS]
   Eastlake, D.; Panitz, A.: Reserved Top Level DNS Names,
   draft-ietf-dnsind-test-tlds-xx.txt, May 1998.

[ANSI-X3.4] US-ASCII
   American National Standard for Information Systems - Coded
   Character Sets - 7-Bit American National Standard Code for
   Information Interchange (7-Bit ASCII). ANSI X3.4, 1986.

[ISO-8859]
   International Standard - Information Processing - 8-bit
   Single-Byte Coded Graphic Character Sets - Part 1: Latin
   alphabet No. 1, ISO 8859-1, 1987. Part 2: Latin alphabet No. 2,
   ISO 8859-2, 1987. Part 3: Latin alphabet No. 3, ISO 8859-3,
   1988. Part 4: Latin alphabet No. 4, ISO 8859-4, 1988. Part 5:
   Latin/Cyrillic alphabet, ISO 8859-5, 1988.
   Part 6: Latin/Arabic alphabet, ISO 8859-6, 1987. Part 7:
   Latin/Greek alphabet, ISO 8859-7, 1987. Part 8: Latin/Hebrew
   alphabet, ISO 8859-8, 1988. Part 9: Latin alphabet No. 5, ISO
   8859-9, 1990. Part 10: Latin alphabet No. 6, ISO 8859-10, 1992.

[ISO-10646]
   International Standard - Information technology - Universal
   Multiple-Octet Coded Character Set (UCS) - Part 1: Architecture
   and Basic Multilingual Plane. ISO/IEC 10646-1, 1993.

[MESSFOR]

[Originator-Info]
   draft-newman-msgheader-originfo-x.txt
   chris.newman@innosoft.com

[RFC-822] e-mail message format
   Crocker, David H.: Standard for the format of ARPA Internet text
   messages. RFC 822, 1982-08-13.

[RFC-850] netnews message format (obsolete)
   Horton, Mark R.: Standard for interchange of Usenet messages.
   RFC 850, 1983-06.

[RFC-976] UUCP mail interchange
   Horton, Mark R.: UUCP mail interchange format standard.
   RFC 976, 1986-02.

[RFC-977] NNTP
   Kantor, Brian; Lapsley, Phil: Network news transfer protocol - a
   proposed standard for the stream-based transmission of news.
   RFC 977, 1986-02.

[RFC-1036] netnews message format
   Horton, Mark R.; Adams, R.: Standard for interchange of Usenet
   messages. RFC 1036, 1987-12.

[RFC-1036BIS] netnews message format (memo)
   Spencer, Henry: News article format and transmission.
   1994-06-02.

[RFC-1884] IP v6
   Hinden, Robert M.; Deering, Stephen E.: IP version 6 addressing
   architecture. RFC 1884, 1995-12.

[RFC-2045] MIME, part 1
   Freed, Ned; Borenstein, Nathaniel S.: Multipurpose Internet mail
   extensions (MIME), part 1: format of Internet message bodies.
   RFC 2045, 1996-11.

[RFC-2119] MUST/SHOULD/MAY
   Bradner, Scott: Key words for use in RFCs to indicate
   requirement levels. RFC 2119, 1997-03.
[RFC-2130] character-set memo
   Weider, Chris; Preston, Cecilia; Simonsen, Keld; Alvestrand,
   Harald T.; Atkinson, Randall; Crispin, Mark; Svanberg, Peter:
   The report of the IAB character set workshop. RFC 2130, 1997-04.

[RFC-2142] standard mailbox names
   Crocker, David H.: Mailbox names for common services, roles and
   functions. RFC 2142, 1997-05.

[RFC-2234] ABNF
   Crocker, David H.; Overell, Paul: Augmented BNF for syntax
   specifications: ABNF. RFC 2234, 1997-11.

[RFC-2279] UTF-8
   Yergeau, Francois: UTF-8, a transformation format of ISO 10646.
   RFC 2279, 1998-01.

[UNICODE] Unicode
   The Unicode Consortium: The Unicode Standard - Version 2.0.
   Addison-Wesley, 1996.

Author's Address

The Usenet Format Working Group (usenet-format@clari.net)

Deliberations were archived at
http://www.landfield.com/usefor/

Chair: Simon Lyall, simon@darkmere.gen.nz
Editor: Dan Ritter, dsr@bbn.com

Members (alphabetical):

A. Deckers (Alain.Deckers@man.ac.uk)
Ade Lovett (ade@demon.net)
Andrew Gierth (andrew@erlenstar.demon.co.uk)
Bill Davidsen (davidsen@prodigy.com)
Bill McQuillan (mcquillan@mpa15ab.mv.unisys.com)
Brad Templeton (brad@clari.net)
Brian Hernacki (bhern@netscape.com)
Brian Kelly (bkelly@sulaco.com)
Bryan Ford (baford@sleepless.com)
Buddha Buck (bmbuck@acsu.buffalo.edu)
Charles Lindsey (chl@clw.cs.man.ac.uk)
Chris Newman (chris+ietf-usenet@iosoft.com)
Christian Weisgerber (naddy@mips.rhein-neckar.de)
Christopher Sedore (cmsedore@maxwell.syr.edu)
Claus Andre Farber (lists/usenet-format/clari.net@faerber.muc.de)
Clive D.W. Feather (clive@demon.net)
Curt Welch (curt@kcwc.com)
D. J.
Bernstein (djb@koobera.math.uic.edu)
Dave Barr (barr@math.psu.edu)
Dave Hayes (dave@kachina.jetcafe.org)
Dave Mack (dmack@corp.webtv.net)
David C Lawrence (tale@isc.org)
David desJardins (desj@rt.com)
Denis McKeon (DMckeon@swcp.com)
Dirk Nimmich (dirk@roxel.ms.sub.org)
Doug Royer [N6AAW] (dougr@basilisk.Eng.Sun.COM)
Egil Kvaleberg (egil@kvaleberg.no)
Eivind Tagseth (eivindt@multinet.no)
Erik van der Poel (erik@netscape.com)
Erland Sommarskog (sommar@algonet.se)
Evan Champion (evanc@synapse.net)
Fergus Henderson (fjh@cs.mu.oz.au)
Frederic SENIS (fs@caduceus.frmug.org)
Fredric Logren (Fredric.Lonngren@uab.ericsson.se)
Greg Berigan (gberigan@cse.unl.edu)
Harald Alvestrand (Harald.T.Alvestrand@uninett.no)
Heiko Schlichting (heiko@cis.fu-berlin.de)
Heiko W.Rupp (hwr@pilhuhn.de)
Hrvoje Niksic (hniksic@srce.hr)
Ian Davis (iand@fdc.co.uk)
Ian G Batten (I.G.Batten@batten.eu.org)
John Moreno (phenix@interpath.com)
John Stanley (stanley@oce.orst.edu)
Jon Ribbens (jon@oaktree.co.uk)
Jonathan Grobe (grobe@netins.net)
Kai Heingsen (kai@khms.westfalen.de)
Karl Kleinpaste (karl@jprc.com)
Keeth Herron (kherron@campus.mci.net)
Kent Landfield (kent@landfield.com)
Kristian Köhntopp (KRIS@koehntopp.de)
Lars Magne Ingebrigtsen (lmi@gnus.org)
Leonid Yegoshin (egoshin@genesyslab.com)
Mark Hittinger (bugs@freebsd.netcom.com)
Mark Sidell (Mark.Sidell@forteinc.com)
Martin Forssen (maf@math.chalmers.se)
Martin J.
Duerst (mduerst@ifi.unizh.ch)
Maurizio Codogno (mau@beatles.cselt.it)
Mick Brown (Mick.Brown@worldnet.att.net)
Mustafa Soysal MS57 (msoysal@mistik.express.net)
Oo Hovers (onno@surfer.xs4all.nl)
Paul Eggert (eggert@twinsun.com)
Paul Overell (richard@pillar.turnpike.com)
Per Abrahamsen (abraham@dina.kvl.dk)
Pete Resnick (presnick@qualcomm.com)
Peter Brülls (pb@Ecce-Terram.DE)
Peter Heirich (peter@heirich.in-berlin.de)
Ralph Babel
Richard Clayton (richard@pillar.turnpike.com)
Robert Elz (kre@muari.OZ.AU)
Russ Allbery (rra@stanford.edu)
R. Kelley Cook (kcook@ibm.net)
Seth Breidbart (sethb@panix.com)
Shmuel Metz (shmuel@os2bbs.com)
Simon Fraser (smfr@santafe.edu)
Stan Barber (sob@academ.com)
Sylvan Butler (SBUTLER@hpbs2024.boi.hp.com)
Terje Bless (link@tss.no)
Thomas Roessler (roessler+1036@sobolev.iam.uni-bo.de)
Tim Skirvin (tskirvin@math.uiuc.edu)
Todd Michel McComb (mccomb@best.com)
Tom Hughes (tom@compton.demon.co.uk)
Vera Heinau (heinau@cis.fu-berlin.de)
Wayne Davison (wayne@clari.net)
Wolfgang Schelongowski (skaranyi@xivic.ruhr.de)

Expires 19990101