Network Working Group                                          R. Sparks
Internet-Draft                                                   Tekelec
Intended status: Informational                              Dec 14, 2011
Expires: June 16, 2012


         IETF Email List Archiving and Search Tool Requirements
                    draft-sparks-genarea-mailarch-02

Abstract

   The IETF makes heavy use of email lists to conduct its work.
   Participants frequently need to search and browse the archives of
   these lists, and have asked for improved search capabilities.  The
   current archive mechanism could also be made more efficient.
   This memo captures the requirements for improved email list archiving
   and searching systems.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on June 16, 2012.

Copyright Notice

   Copyright (c) 2011 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  List Search and Archive Requirements
     2.1.  Search and Browsing
     2.2.  IMAP access
     2.3.  Archiving Active Lists
     2.4.  Importing Messages from Other Archives
     2.5.  Exporting messages from the Archives
     2.6.  Redundancy
     2.7.  Archive Administration
     2.8.  Transition Requirements
   3.  Internationalized Address Considerations
   4.  Security Considerations
   5.  IANA Considerations
   6.  Acknowledgements
   7.  Changelog
     7.1.  01 to 02
     7.2.  00 to 01
   8.  Informative References
   Author's Address

1.  Introduction

   The IETF makes heavy use of email lists to conduct its work.
   Participants frequently need to search the archives of these lists,
   and have asked for improved search capabilities.  The current archive
   mechanism could also be made more efficient.  This memo captures the
   requirements for improved email list archiving and searching systems.

   Discussion of this memo should take place on the ietf@ietf.org
   mailing list.

2.  List Search and Archive Requirements

2.1.  Search and Browsing

   o  The system must provide a web interface for searching and browsing
      archived messages.

   o  The system must allow browsing the entire archive of a given list
      by thread or by date.

   o  The system must allow browsing the results of a search by thread
      or by date.

         Both threading based on Message-Id/References/In-Reply-To and
         threading based on the same subject line (modulo short prefixes
         like re: and fwd:) should be taken into account.

   o  The system must allow searching across any subset of the archived
      lists (one list, a selection of lists, or all lists).

   o  The system must allow searching of any combination (using AND and
      OR operators) of the following attributes.  Richer search
      capabilities are highly desirable.

      -  string occurring in sender name

      -  date range

      -  string occurring in Subject

      -  string occurring in message body

      -  string occurring in message header (in particular, exact match
         of Message-Id)

         For instance, it would be nice to search the entire archive
         for instances of a message with a given Message-ID with a
         URL like

   o  Individual messages must be representable by a long-term stable
      URI that can be shared between users.  That is, the URI must be
      suitable for reference in an email message.

      -  It would be preferable for that URI to appear in an Archived-At
         header field in the message [RFC5064].

   o  Searches should be representable by a URI that can be shared
      between users.

      -  Such URIs should be long-term stable.

      -  The search may be re-executed when the URI is referenced.  It
         is acceptable for the same URI to produce different results if
         accessed at different times (for example, reflecting additional
         messages that match the search criteria).

2.2.  IMAP access

   Many participants would prefer to access the list archives using IMAP
   [RFC3501].  Providing this access while meeting the following
   requirements will likely require an IMAP server with specialized
   capabilities.

   o  The system should expose the archive using an IMAP interface, with
      each list represented as a mailbox.

   o  This interface must work with standard IMAP clients.

   o  The interface should allow users to each have their own read/
      unread marks for messages.  Allowing other annotation is
      desirable.

      -  If this requires the user to log in, the system should use
         datatracker login credentials.

   o  The interface must have server-side searching enabled, and should
      support multiple simultaneous extensive searches.

2.3.
Archiving Active Lists

   o  The archive system must accept messages handled by various mail
      list manager packages.

      -  Lists hosted on the IETF systems are served by mailman
         [mailman].

      -  Lists hosted at other organizations may use other packages.

         *  The archive system must accept messages through subscribing
            to such an external list.

         *  The archive system may support other mechanisms for
            accepting messages into the archive.

2.4.  Importing Messages from Other Archives

   Lists hosted at other systems are sometimes moved to the IETF
   servers, and their archive is moved with them.  The archiving system
   must be able to import these archives.

   o  At a minimum the archive system must be able to import mbox
      formatted archives [RFC4155][mbox].

   o  The archive system should be able to import maildir and maildir-
      like (the key characteristic being one-message-per-file) formatted
      archives [maildir].

   o  It is acceptable to use a separate utility to convert between
      these formats before import as long as the conversion is lossless.

2.5.  Exporting messages from the Archives

   o  The archive system must support exporting messages in the mbox
      format.

   o  The archive system should support exporting messages in maildir
      format.

   o  The archive system must support exporting the entire archive of a
      given list.

   o  The archive system must support exporting all messages from a
      given list within a given date range.

   o  The archive system should allow exporting the results of any
      supported search query.

2.6.  Redundancy

   o  The systems must facilitate providing archive, search, and browse
      functions through geographically distributed servers.

      -  The systems must support a single active and single standby
         server.  This reflects the current operating configuration and
         is expected to be the initial deployment model.

      -  The systems should support a single active and multiple standby
         servers.

      -  The systems should support multiple active servers for the
         search and browse functions.  Multiple active archive servers
         are not a requirement.

      -  The amount of data replication between servers should be on the
         order of the size of any new/changed messages in the archives.

         *  It is acceptable for replication to be part of the archival
            system itself (such as using the replication mechanisms from
            an underlying database).

         *  It is acceptable to rely on replication of the underlying
            filesystem objects (using rsync of one or more directory
            trees for example), but only if the objects in the
            underlying filesystem are formatted such that the size of
            the replication data is on the order of the size of any new/
            changed messages in the archives.

2.7.  Archive Administration

   o  The archive system must support adding and removing lists to be
      archived.

   o  The system must allow the administrator to add messages to and
      delete messages from an archived list.  The system should log such
      actions.

   o  The system must allow the administrator to delete messages from an
      archived list.

2.8.  Transition Requirements

   There are many existing archived messages containing embedded links
   into the existing MHonArc mail archive.  These links must continue to
   work, but should reach the message as archived in the new system.

3.  Internationalized Address Considerations

   The archive and search functions should anticipate internationalized
   email addresses as discussed in the following three documents
   [I-D.ietf-eai-rfc5335bis] [I-D.ietf-eai-rfc5336bis]
   [I-D.ietf-eai-5738bis].  There is no firm requirement at this time.

4.  Security Considerations

   Creating a new tool for searching and archiving IETF email lists does
   not affect the security of the Internet in any significant fashion.

5.
IANA Considerations

   This document has no actions for IANA.

6.  Acknowledgements

   The Tools Development team provided input into this initial
   brainstorm.  Text suggestions from Alexey Melnikov, Pete Resnick, and
   S. Moonesamy have been incorporated.

7.  Changelog

7.1.  01 to 02

   1.  Added request for the Archived-At header field.

   2.  Pointed to the EAI work in progress and in the RFC Editor queue.

   3.  Corrected several typos

7.2.  00 to 01

   1.  Requested ability to import maildir-like archives, not just
       maildir proper

   2.  Added a section requesting IMAP access to the archive.

8.  Informative References

   [I-D.ietf-eai-5738bis]
              Resnick, P., Newman, C., and S. Shen, "IMAP Support for
              UTF-8", draft-ietf-eai-5738bis-02 (work in progress),
              December 2011.

   [I-D.ietf-eai-rfc5335bis]
              Yang, A., Steele, S., and N. Freed, "Internationalized
              Email Headers", draft-ietf-eai-rfc5335bis-13 (work in
              progress), October 2011.

   [I-D.ietf-eai-rfc5336bis]
              Yao, J. and W. MAO, "SMTP Extension for Internationalized
              Email", draft-ietf-eai-rfc5336bis-16 (work in progress),
              November 2011.

   [RFC3501]  Crispin, M., "INTERNET MESSAGE ACCESS PROTOCOL - VERSION
              4rev1", RFC 3501, March 2003.

   [RFC4155]  Hall, E., "The application/mbox Media Type", RFC 4155,
              September 2005.

   [RFC5064]  Duerst, M., "The Archived-At Message Header Field",
              RFC 5064, December 2007.

   [maildir]  "Maildir", .

   [mailman]  "Mailman", .

   [mbox]     "Mbox", .

Author's Address

   Robert Sparks
   Tekelec
   17210 Campbell Road
   Suite 250
   Dallas, Texas  75254-4203
   USA

   Email: RjS@nostrum.com
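
   As an illustration of the subject-based threading described in
   Section 2.1, the following Python sketch strips short reply and
   forward prefixes before grouping messages.  It is a minimal sketch
   and not a requirement of this memo: the function names, the prefix
   pattern, and the choice to compare subjects case-insensitively are
   all assumptions made for the example.

```python
import re
from collections import defaultdict

# Assumed prefix pattern (hypothetical): one or more short markers such
# as "Re:", "RE:", "Fwd:" or "Fw:" at the very start of the subject.
_PREFIX = re.compile(r"^\s*((re|fwd?)\s*:\s*)+", re.IGNORECASE)


def normalize_subject(subject):
    """Strip leading reply/forward prefixes, collapse whitespace, lowercase."""
    stripped = _PREFIX.sub("", subject)
    return " ".join(stripped.split()).lower()


def group_by_subject(subjects):
    """Group subject lines by normalized form, keeping input order."""
    threads = defaultdict(list)
    for subject in subjects:
        threads[normalize_subject(subject)].append(subject)
    return dict(threads)
```

   A real archive would presumably consult Message-Id, References, and
   In-Reply-To first, and fall back to subject grouping of this kind
   only when those header fields are missing or unreliable.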