idnits 2.17.1

draft-sparks-genarea-mailarch-00.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (Nov 15, 2011) is 4539 days in the past.  Is this
     intentional?

  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

     No issues found here.

     Summary: 0 errors (**), 0 flaws (~~), 1 warning (==), 1 comment (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

Network Working Group                                          R. Sparks
Internet-Draft                                                   Tekelec
Intended status: Informational                              Nov 15, 2011
Expires: May 18, 2012

         IETF Email List Archiving and Search Tool Requirements
                    draft-sparks-genarea-mailarch-00

Abstract

   The IETF makes heavy use of email lists to conduct its work.
   Participants frequently need to search and browse the archives of
   these lists, and have asked for improved search capabilities.  The
   current archive mechanism could also be made more efficient.  This
   memo captures the requirements for improved email list archiving and
   searching systems.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on May 18, 2012.

Copyright Notice

   Copyright (c) 2011 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  List Search and Archive Requirements
     2.1.  Search and Browsing
     2.2.  Archiving Active Lists
     2.3.  Importing Messages from Other Archives
     2.4.  Exporting messages from the Archives
     2.5.  Redundancy
     2.6.  Archive Administration
     2.7.  Transition Requirements
   3.  Future Considerations
   4.  Security Considerations
   5.  IANA Considerations
   6.  Acknowledgements
   7.  Informative References
   Author's Address

1.  Introduction

   The IETF makes heavy use of email lists to conduct its work.
   Participants frequently need to search the archives of these lists,
   and have asked for improved search capabilities.  The current archive
   mechanism could also be made more efficient.  This memo captures the
   requirements for improved email list archiving and searching systems.

   The requirements captured in this version of this memo (-00) are an
   initial brainstorm.  While intended to be thorough, they are probably
   incomplete.  Please keep "What's missing?" in mind while reviewing
   them.  Discussion of this memo should take place on the ietf@ietf.org
   mailing list.

2.  List Search and Archive Requirements

2.1.  Search and Browsing

   o  The system must provide a web interface for searching and browsing
      archived messages.

   o  The system must allow browsing the entire archive of a given list
      by thread or by date.

   o  The system must allow browsing the results of a search by thread
      or by date.

         Both threading based on Message-Id/References/In-Reply-To and
         threading based on the same subject line (modulo short prefixes
         like re: and fwd:) should be taken into account.

   o  The system must allow searching across any subset of the archived
      lists (one list, a selection of lists, or all lists).

   o  The system must allow searching of any combination (using AND and
      OR operators) of the following attributes.  Richer search
      capabilities are highly desirable.

      -  string occurring in sender name

      -  date range

      -  string occurring in Subject

      -  string occurring in message body

      -  string occurring in message header (in particular, exact match
         of Message-Id)

            For instance, it would be nice to search the entire archive
            for instances of a message with a given Message-ID with a
            URL like

   o  Individual messages must be representable by a long-term stable
      URI that can be shared between users.  That is, the URI must be
      suitable for reference in an email message.

   o  Searches should be representable by a URI that can be shared
      between users.

      -  Such URIs should be long-term stable.

      -  The search may be re-executed when the URI is referenced.  It
         is acceptable for the same URI to produce different results if
         accessed at different times (for example, reflecting additional
         messages that match the search criteria).

2.2.  Archiving Active Lists

   o  The archive system must accept messages handled by various mail
      reflector packages.

      -  Lists hosted on the IETF systems are served by mailman
         [mailman].

      -  Lists hosted at other organizations may use other packages.

         *  The archive system must accept messages through subscribing
            to such an external list.

         *  The archive system may support other mechanisms for
            accepting messages into the archive.

2.3.  Importing Messages from Other Archives

   Lists hosted at other systems are sometimes moved to the IETF
   servers, and their archives are moved with them.  The archiving
   system must be able to import these archives.

   o  At a minimum, the archive system must be able to import
      mbox-formatted archives [RFC4155][mbox].

   o  The archive system should be able to import maildir-formatted
      archives [maildir].

   o  It is acceptable to use a separate utility to convert between
      these formats before import, as long as the conversion is
      lossless.

2.4.
Exporting messages from the Archives

   o  The archive system must support exporting messages in the mbox
      format.

   o  The archive system should support exporting messages in the
      maildir format.

   o  The archive system must support exporting the entire archive of a
      given list.

   o  The archive system must support exporting all messages from a
      given list within a given date range.

   o  The archive system should allow exporting the results of any
      supported search query.

2.5.  Redundancy

   o  The systems must facilitate providing archive, search, and browse
      functions through geographically distributed servers.

      -  The systems must support a single active and a single standby
         server.  This reflects the current operating configuration and
         is expected to be the initial deployment model.

      -  The systems should support a single active and multiple standby
         servers.

      -  The systems should support multiple active servers for the
         search and browse functions.  Multiple active archive servers
         are not a requirement.

      -  The amount of data replication between servers should be on the
         order of the size of any new/changed messages in the archives.

         *  It is acceptable for replication to be part of the archival
            system itself (such as using the replication mechanisms from
            an underlying database).

         *  It is acceptable to rely on replication of the underlying
            filesystem objects (using rsync of one or more directory
            trees, for example), but only if the objects in the
            underlying filesystem are formatted such that the size of
            the replication data is on the order of the size of any
            new/changed messages in the archives.

2.6.  Archive Administration

   o  The archive system must support adding and removing lists to be
      archived.

   o  The system must allow the administrator to add messages to and
      delete messages from an archived list.  The system should log such
      actions.

   o  The system must allow the administrator to delete messages from an
      archived list.

2.7.  Transition Requirements

   There are many existing archived messages containing embedded links
   into the existing MHonArc mail archive.  These links must continue to
   work, but should reach the message as archived in the new system.

3.  Future Considerations

   The archive and search functions should anticipate internationalized
   email addresses as discussed in the EAI working group.  Since the EAI
   WG has not finished its work, there is no firm requirement at this
   time.

4.  Security Considerations

   Creating a new tool for searching and archiving IETF email lists does
   not affect the security of the Internet in any significant fashion.

5.  IANA Considerations

   This document has no actions for IANA.

6.  Acknowledgements

   The Tools Development team provided input into this initial
   brainstorm.

7.  Informative References

   [RFC4155]  Hall, E., "The application/mbox Media Type", RFC 4155,
              September 2005.

   [maildir]  "Maildir".

   [mailman]  "Mailman".

   [mbox]     "Mbox".

Author's Address

   Robert Sparks
   Tekelec
   17210 Campbell Road
   Suite 250
   Dallas, Texas  75254-4203
   USA

   Email: RjS@nostrum.com
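
   The threading requirement in Section 2.1 (threading on
   Message-Id/References/In-Reply-To, combined with threading on the
   same subject line modulo short prefixes like re: and fwd:) can be
   illustrated with a minimal Python sketch.  This is not part of the
   requirements and does not describe any existing implementation.  The
   function names, the prefix pattern, and the rule that the first
   message seen with a given subject becomes the thread root are all
   assumptions made for illustration.  The sketch also assumes that
   every message carries a Message-ID header and that messages are
   supplied in date order.

```python
import re
from email import message_from_string

# Assumed prefix pattern: any run of "re:", "fw:" or "fwd:" markers.
_PREFIX = re.compile(r"^\s*((re|fwd?)\s*:\s*)+", re.IGNORECASE)


def normalize_subject(subject):
    """Strip leading reply/forward prefixes, collapse whitespace, lowercase."""
    stripped = _PREFIX.sub("", subject or "")
    return " ".join(stripped.split()).lower()


def parent_id(msg):
    """Return the Message-ID this message replies to, or None."""
    in_reply_to = msg.get("In-Reply-To", "").split()
    if in_reply_to:
        return in_reply_to[0]
    references = msg.get("References", "").split()
    # By convention the last References entry is the direct parent.
    return references[-1] if references else None


def parse_all(raw_messages):
    """Parse raw RFC 5322 message strings into email.message.Message objects."""
    return [message_from_string(raw) for raw in raw_messages]


def thread_messages(messages):
    """Group Message-IDs into threads, keyed by the root Message-ID.

    Header links are followed first; thread roots that share a
    normalized subject are then merged (the subject-line fallback).
    """
    by_id = {m["Message-ID"].strip(): m for m in messages}
    first_root_for_subject = {}
    threads = {}
    for m in messages:
        own_id = m["Message-ID"].strip()
        root = own_id
        seen = {root}
        # Walk up the reply chain as far as the archive knows it,
        # guarding against reference loops.
        while True:
            parent = parent_id(by_id[root])
            if parent is None or parent not in by_id or parent in seen:
                break
            root = parent
            seen.add(root)
        # Subject fallback: the first root seen for a subject wins.
        subject = normalize_subject(by_id[root].get("Subject", ""))
        root = first_root_for_subject.setdefault(subject, root)
        threads.setdefault(root, []).append(own_id)
    return threads
```

   Calling thread_messages(parse_all(raw_messages)) on a list of raw
   message strings returns a dictionary mapping each thread's root
   Message-ID to the Message-IDs in that thread.  One consequence of
   this simple fallback is that all messages with an empty subject end
   up in a single thread; a real system would likely need a stricter
   rule.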