<?xml version="1.0"?>
<!DOCTYPE rfc SYSTEM "rfc2629.dtd">
<?rfc symrefs='yes'?>
<?rfc toc='yes'?>
<?rfc compact='yes'?>
<?rfc subcompact='no'?>
<?rfc sortrefs='yes'?>
<!-- <?rfc strict='yes'?> -->

<rfc docName="draft-duerst-archived-at-09" ipr="full3978" category="std" xml:lang="en">
<front><title>The Archived-At Message Header Field</title>

<author initials="M.J." surname="Duerst" fullname='Martin Duerst (Note: Please write "Duerst" with u-umlaut wherever possible, for example as "D&amp;#252;rst" in XML and HTML.)'>
  <organization>Aoyama Gakuin University</organization>
  <address>
  <postal>
  <street>5-10-1 Fuchinobe</street>
  <city>Sagamihara</city>
  <region>Kanagawa</region>
  <code>229-8558</code>
  <country>Japan</country>
  </postal>
  <phone>+81 42 759 6329</phone>
  <facsimile>+81 42 759 6495</facsimile>
  <email>mailto:duerst@it.aoyama.ac.jp</email>
  <uri>http://www.sw.it.aoyama.ac.jp/D%C3%BCrst/</uri>
  </address>
</author>

<date year="2007" month="August" day="21"/>
<keyword>Electronic Mail</keyword>
<keyword>Mailing Lists</keyword>
<keyword>Archiving</keyword>
<abstract>
<t>This memo defines a new email header field, Archived-At:,
to provide a direct link to the archived form of an individual
email message.</t>
</abstract>
</front>

<middle>
<section title="Introduction">
<t><xref target="RFC2369"/> defines a number of header fields that can be added to
   Internet messages such as those sent by email distribution lists or in netnews <xref target="RFCXXXX"></xref>. One of them is the 'List-Archive'
   header field that describes how to access archives for the list. This allows to
   access the archives as a whole, but not the individual message.</t>
<t>There is often a need or desire to refer to the archived form of a single message. For more detailed usage scenarios, please see <xref target="usage"></xref>.
   This memo defines a new header, 'Archived-At', to refer to a single message at an
   archived location.
   This provides quick access to the location of a mailing list message in the list archive. It can also be used independently of mailing lists,
   for example in connection with legal requirements to archive certain messages.</t>
<t>In this document, the key words "MUST", "MUST NOT", "REQUIRED",
   "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED",  "MAY",
   and "OPTIONAL" are to be interpreted as described in <xref target="RFC2119"></xref>.</t></section>
<section title="Header Field Definition">
<section title="Syntax" anchor="Syntax">
<t>For the Archived-At header field, the field name is 'Archived-At'.
   The field body consist of a URI <xref target="STD66"/> enclosed
   in angle brackets ('&lt;', '&gt;').
   The URI MAY contain folding white space (FWS, <xref target="RFC2822"/>),
   which is ignored. MTAs MUST NOT insert whitespace within the angle
   brackets, but client applications SHOULD ignore any whitespace,
   which might have been inserted by poorly behaved MTAs.
   The URI points to an archived version of the message.
   See <xref target="format"/> for some more details.</t>
<t>This header field is subject to the encoding and character
   restrictions for mail headers as described in <xref target="RFC2822"/>.</t>
<figure><preamble>More formally, the header field is defined as follows
in ABNF according to <xref target="RFC4234"/>:</preamble>
<artwork>
   archived-at = "Archived-At:" [FWS] "&lt;" folded-URI "&gt;" CRLF
   folded-URI  = &lt;URI, but free insertion of FWS permitted&gt;
</artwork>
<postamble>where URI is defined in <xref target="STD66"/>
 and CRLF and FWS are defined in <xref target="RFC2822"/>.</postamble>
</figure>

<t>To convert folded-URI to a URI, first apply standard <xref target="RFC2822"></xref> unfolding rules (replacing FWS with a single SP), and then delete any remaining un-encoded SP characters.</t>
<t>This syntax is kept simple in that only one URI per header field is
   allowed. In this point, the syntax is different from
   <xref target="RFC2369"/>. Also, comments are not allowed.</t>
</section>
<section title="Multiple Archived-At Header Fields">
<t>Each Archived-At header field only contains a single URI. If it is desired to list multiple URIs where an archived copy of the message can be found, a separate Archived-At field per URI is required. Multiple Archived-At header
   fields with the same URI SHOULD be avoided. An Archived-At header
   field SHOULD only be created if the message is actually being made
   available at the URI given in the header field.</t>
<t>If a message is forwarded from a list to a sublist, and both lists
   support adding the Archived-At header field, then the sublist SHOULD
   add a new Archived-At header field without removing the already
   existing one(s), unless the header field is exactly the same as
   an already existing one, in which case the new header field
   SHOULD NOT be added.</t>
</section>
<section title="Interaction with Message Fragmentation and Reassembly">
<t><xref target="RFC2046"></xref> allows for the fragmentation and reassembly of messages. Archived-At header fields are to be treated in the same way as Comment header fields, i.e. copied to the first fragment message header on fragmentation and back from there to the header of the reassembled message.</t><t>This treatment has been chosen for compatibility with existing infrastructure. It means that Archived-At header fields in the first fragment message MAY refer to an archived version of the whole, unfragmented, message. To avoid confusion, Archived-At headers SHOULD NOT be added to fragment messages.</t>
</section><section title="Syntax Extension for Internationalized Message Headers"><t>There are some efforts to allow non-ASCII text directly in message header field bodies. In such contexts, the URI non-terminal in the syntax defined in <xref target="Syntax"></xref> is to be replaced by IRI as defined in <xref target="RFC3987"></xref>. The specifics of the  actual octet encoding of the IRI will follow the rules for general direct encoding of non-ASCII text. For conversion between IRIs and URIs, the procedures defined in <xref target="RFC3987"></xref> are to be applied.</t></section><section title="The X-Archived-At Header Field">
<t>For backwards compatibility, this document also describes the
X-Archived-At header field, a precursor of the Archived-At
header field. The X-Archived-At header field MAY also be parsed,
but SHOULD not be generated.</t>
<figure><preamble>The following is the syntax of the X-Archived-At header field, in ABNF according to <xref target="RFC4234"></xref> (which also defines SP): </preamble>
<artwork>
   obs-archived-at = "X-Archived-At:"  SP URI CRLF
</artwork>
</figure><t>The X-Archived-At header field does not allow white space inside URI.</t></section></section>
<section title="Implementation and Usage Considerations"><section title="Formats of Archived Message" anchor="format">
<t>There is no restriction on the format used to serve the archived message
from the URI in a 'Archived-At' header field. It is expected that in many
cases, the archived message will be served as (X)HTML, as plain text, or in its original form as
message/rfc822 <xref target="RFC2046"/>. Some forms of URIs may imply the
format in which the archived message is served, although this should not
be relied upon.</t>
<t>If the protocol used to retrieve
the message allows for content negotiation, then it is also possible to
serve the archived message in several different formats. As an example, a HTTP URI in an 'Archived-At' header may make it possible to
serve the archived message both as text/html for human consumption
in a browser and as message/rfc822
for use by a mail user agent without loss of information.</t>
</section>
<section title="Implementation Considerations" anchor="implementation">
<t>
   Mailing list expanders and email archives are often separate pieces of
   software. It may therefore be difficult to create an 'Archived-At'
   header field in the mailing list expander software.</t>
<t>One way to address this difficulty is to have the mailing list expander
   software generate an unambiguous URI, e.g. an URI based on the message identifier
   of the incoming email, and to set up the archiving system so that it
   redirects requests for such URIs to the actual messages. If the email does not contain a message identifier, a unique identifier can be
  generated.</t>
   <t>Such a system has been implemented and is in productive use at W3C.
   As an example, the URI
   <vspace/>"http://www.w3.org/mid/0I5U00G08DFGCR@mailsj-v1.corp.adobe.com",
   containing the significant part of the message identifier
   "&lt;0I5U00G08DFGCR@mailsj-v1.corp.adobe.com&gt;", is redirected to the
   URI of this message in the W3C mailing-list archive at
   http://lists.w3.org/Archives/Public/uri/2004Oct/0017.html.</t>
   <t>Source code for this implementation is available at
   http://dev.w3.org/cvsweb/search/, in particular http://dev.w3.org/cvsweb/search/cgi/mid.pl
   and http://dev.w3.org/cvsweb/search/bin/msgid-db.pl. These locations may be subject to change.</t>
<t>When using the message identifier to create an address for the archived mail,
   care has to be taken to escape characters in the message identifier that are
   not allowed in the URI, or to remove them, as done above for the "&lt;" and "&gt;" delimiters.</t><t> Implementations such as that described above can introduce a security issue. Somebody might deliberately reuse a message identifier to break the link to a message. This can be addressed by checking incoming message identifiers against those of the messages already in the archive and discarding incoming duplicates, by checking the content of incoming duplicates and discarding them if they are significantly different from the first message, by offering multiple choices in the response to the URI, or by using some authentication mechanism on incoming messages.</t>
</section><section title="Usage Considerations" anchor="usage"><t>It may at first seem strange to have a pointer to an archived form of a message in a header field of that same message. After all, if one has the message, why would one need a pointer to it? It turns out that such pointers can be extremely useful. This section describes some of the scenarios for their use.</t><t>A user may want to refer to messages in a non-message context, such as on a Web page, in an instant message, or in a phone conversation. In such a case, the user can extract the URI from the Archived-At header field,  avoiding the search for the correct message in the archive.</t><t>A user may want to refer to other messages in a message context. Referring to a single message is often done by replying to that message. However, when referring to more than one message, providing pointers to archived messages is a widespread practice. The Archived-At header field makes it easier to provide these pointers.</t><t>A user may want to find messages related to a message at hand. The user may not have received the related messages, and therefore needs to use an archive. The user may also prefer finding related messages in the archive rather than in her MUA, because messages in archives may be linked in ways not provided by the MUA. The Archived-At header field provides a link to the starting point in the archive from which to find related messages.</t><t>Please note that in the above usage scenarios, it is mostly the human reader, rather than the email client software, that makes use of the URI in the Archived-At header. However, this does not rule out the use of the URI in the Archived-At header by the email client or other software if such use is found helpful.</t></section></section>

<section title="Security Considerations">
<t>There are many potential security issues when activating and dereferencing
an URI. For more details, including some countermeasures, please see <xref target="STD66"/>.
In the context of this proposal, the following are particularly relevant:
An intruder may get access to the message transmission and be able to
insert an URI pointing to some malicious content. This can be addressed by using a secured way of message transmission. Also, somebody may be
able to construct a message that is harmless when received directly,
but that produces problems when accessed via the URI. One reason for this may be the format used in the archive, where some content was not
adequately escaped. This can be addressed by using adequate escaping.</t><t>The Archived-At header field points to some archived form of the message itself. This in turn may contain the Archived-At field. This creates a potential for a denial-of-service attack on the server pointed to by the URI in the Archived-At header field. The conditions are that the archived form of the message is downloaded automatically, and that further URIs in that message are followed and downloaded recursively without checking for already downloaded resources. However, this kind of scenario can easily be avoided by implementations. First, the URI in the Archived-At header field should not be dereferenced automatically. Second, appropriate measures for loop detection should be used.</t><t>In <xref target="implementation"></xref>, an attack is described that may break an URI to a message by introducing a new message with the same message identifier. Possible countermeasures are also discussed.</t>
</section>

<section title="IANA Considerations">

<section title="Registration of the Archive-At Header Field"><figure><preamble>IANA is herewith requested to register the
   Archived-At header field in the Message Header Fields Registry
   (<xref target="RFC3864"/>) as follows:</preamble><artwork>
   Header field name:
      Archived-At

   Applicable protocol:
      mail (RFC 2822) and netnews (RFC XXXX)

   Status:
      standard

   Author/Change controller:
      IETF

   Specification document(s):
      Internet-Draft draft-duerst-archived-at-08.txt
      (Note to IANA and RFC Editor: Replace this with
       RFC YYYY (RFC number of this specification))

   Related information:
      none</artwork></figure>
      
</section><section title="Registration of the X-Archived-At Header Field"><t>This section is non-normative (specifically, an implementation which ignores this section remains compliant with this specification).</t><figure><preamble>IANA is herewith requested to register the
   X-Archived-At header field in the Message Header Fields Registry
   (<xref target="RFC3864"/>) as follows:</preamble><artwork>
   Header field name:
      X-Archived-At

   Applicable protocol:
      mail (RFC 2822) and netnews (RFC XXXX)

   Status:
      deprecated

   Author/Change controller:
      IETF

   Specification document(s):
      Internet-Draft draft-duerst-archived-at-08.txt
      (Note to IANA and RFC Editor: Replace this with
       RFC YYYY (RFC number of this specification))

   Related information:
      none</artwork></figure></section></section>
<section title="Change Log">
      <t>Note to RFC Editor: Please remove this whole section before
      publication.</t>
   <section title="Changes from -08.txt to -09.txt"><t><list><t>Fixed some typos.</t><t>Changed from "message id" to more correct "message identifier".</t><t>Added a sentence to cover the case of messages without message identifiers.</t></list></t></section><section title="Changes from -07.txt to -08.txt"><t><list><t>Replaced 'you' by 'a user' in usage considerations section.</t><t>Added a note to say that URIs may not be permanent for pointers to example implementation in "Implementation Considerations" section.</t><t>Improved advice on countermeasures for  the security issue in <xref target="implementation"></xref>.</t><t>Removed a superfluous sentence.</t><t>Added some discussion of countermeasures to the first security issue in the security section (first paragraph).</t><t>Updated acknoledgments.</t></list></t></section><section title="Changes from -06.txt to -07.txt"><t><list><t>Changed ABNF to introduce the folded-URI rule.</t><t>Added text about a security issue to <xref target="implementation">, and mentioned it in the security section.</xref></t><t>Updated reference to netnews format.</t><t>Some wording improvements.</t></list></t></section><section title="Changes from -05.txt to -06.txt"><t>
      <list><t>Removed 'also optionally' for netnews.</t><t>Improved explanation in section about IRIs.</t><t>Some wording improvements.</t></list></t>
   </section>
   <section title="Changes from -04.txt to -05.txt"><t>
      <list><t>Small wording adjustments for readability.</t>
      <t>Separated normative and informative references, added
      reference to RFC1036.</t>
      <t>Moved X-Archived-At header into separate section and
      added registration template.</t><t>Clarified syntax: allowed FWS, provided alternate syntax  in an attempt to describe different syntactic layers, changed single quotes to double quotes for literals.</t>
      <t>Added section about message fragmentation.</t><t>Reorganized document to reduce number of top-level sections.</t><t>Added section about IRIs.</t><t>Removed reference to 'safe' URI characters (no longer a term used in URI spec).</t><t>Removed "URI not empty" comment in ABNF: URI according to STD66 has to contain a scheme  part anyway because it is absolute.</t>
      </list></t>
      </section>
<section title="Changes from -03.txt to -04.txt"><t>
   <list><t>Updated from RFC 2234 to RFC 4234.</t>
   <t>Updated author's address/affiliation.</t></list></t></section><section title="Changes from -02.txt to -03.txt"><t><list><t>Fixed syntax to allow spaces after colon.</t><t>Avoided misunderstanding with message ids ('&lt;' and '&gt;' are part of the message id)</t><t>Added reference to  RFC 2119 (keywords)</t><t>Changed "MUST" to "SHOULD" for existence of message</t></list></t></section></section>

<section title="Acknowledgments">
<t>The members of the W3C system team, in particular Gerald Oskoboiny,
   Olivier Thereaux, Jose Kahan, and Eric Prud'hommeaux, created the mid-based email
   archive lookup system and the experimental form of the Archived-At header.
   Pete Resnik provided the motivation for writing up this memo. Discussion on
   the ietf-822@imc.org mailing list, in particular contributions by Frank Ellermann, Arnt Gulbrandsen,
   Graham Klyne, Bruce Lilly, Charles Lindsey, and Keith Moore, has led to further
   improvements of the proposal. Chris Newman, Chris Lonvick, Stephane Borzmeyer, Vijay K. Gurbani, and S. Moonesamy provided additional valuable comments. </t>
</section>
</middle>
<back>
<references title="Normative References">


<reference anchor="RFC2119">
<front>
<title abbrev="RFC Key Words">Key words for use in RFCs to Indicate Requirement Levels</title>
<author initials="S." surname="Bradner" fullname="Scott Bradner"><organization/></author>
<date month="March" year="1997"/>
<area>General</area>
<keyword>keyword</keyword>
</front>
<seriesInfo name="BCP" value="14"/>
<seriesInfo name="RFC" value="2119"/>
</reference>

<reference anchor="RFC4234">
<front>
<title abbrev="ABNF">Augmented BNF for Syntax Specifications: ABNF</title>
<author initials="D." surname="Crocker" fullname="Dave Crocker"><organization/></author>
<author initials="P." surname="Overell" fullname="Paul Overell"><organization/></author>
<date month="October" year="2005"/></front>
<seriesInfo name="RFC" value="4234"/>
</reference>





<reference anchor="RFC2822">
  <front>
  <title>Internet Message Format</title>
  <author initials="P." surname="Resnick" fullname="P. Resnick"><organization/></author>
  <date month="April" year="2001"/>
  </front>
  <seriesInfo name="RFC" value="2822"/>
</reference>

<reference anchor="RFC3864">
  <front>
  <title>Registration Procedures for Message Header Fields</title>
  <author initials="G." surname="Klyne" fullname="G. Klyne"><organization/></author>
  <author initials="M." surname="Nottingham" fullname="M. Nottingham"><organization/></author>
  <author initials="J." surname="Mogul" fullname="J. Mogul"><organization/></author>
  <date year="2004" month="September"/>
  </front>
  <seriesInfo name="RFC" value="3864"/>
  <seriesInfo name="BCP" value="90"/>  
</reference><reference anchor="STD66">
<front>
<title abbrev="URI Generic Syntax">Uniform Resource Identifier (URI): Generic Syntax</title>
<author initials="T." surname="Berners-Lee" fullname="Tim Berners-Lee"><organization/></author>
<author initials="R.T." surname="Fielding" fullname="Roy T. Fielding"><organization/></author>
<author initials="L." surname="Masinter" fullname="Larry Masinter"><organization/></author>
<date month="January" year="2005"/>
</front>
<seriesInfo name="STD" value="66"/>
<seriesInfo name="RFC" value="3986"/>
</reference><reference anchor="RFC3987">
<front>
<title>Internationalized Resource Identifiers (IRIs)</title>
<author initials="M.J." surname="Duerst" fullname="Martin Duerst"><organization/></author>
<author initials="M.L." surname="Suignard" fullname="Michel Suignard"><organization/></author>
<date year="2005" month="January"/>
</front>
<seriesInfo name="RFC" value="3987"/>
</reference></references><references title="Informative References">
<reference anchor="RFC2046">
<front>
<title>Multipurpose Internet Mail Extensions (MIME) Part Two: Media Types</title>
<author initials="N." surname="Freed"><organization/></author>
<author initials="N." surname="Borenstein"><organization/></author>
<date month="November" year="1996"/>
</front>
  <seriesInfo name="RFC" value="2046"/>
</reference>





<reference anchor="RFC2369">
  <front>
  <title abbrev="URLs as Meta-Syntax">The Use of URLs as Meta-Syntax for Core Mail List Commands and their Transport through Message Header Fields</title>
  <author initials="J.D." surname="Baer" fullname="Joshua D. Baer"><organization/></author>
  <author initials="G." surname="Neufeld" fullname="Grant Neufeld"><organization/></author>
  <date month="July" year="1998"/>
  </front>
  <seriesInfo name="RFC" value="2369"/>
</reference>

<reference anchor="RFCXXXX"><front>
  <title>Netnews Article Format (Note to RFC Editor: Please update this reference with the RFC
              resulting from draft-ietf-usefor-usefor-xx.txt, and
              remove this Note.)</title>
  <author initials="K." surname="Murchison" fullname="K. Murchison"><organization/></author>
  <author initials="C." surname="Lindsey" fullname="C. Lindsey"><organization/></author>
  <author initials="D." surname="Kohn" fullname="D. Kohn"><organization/></author>
  <date month="January" year="2007"/></front>
</reference>

</references>
</back>
</rfc>
