Network Working Group                                      M. Nottingham
Internet-Draft                                            April 22, 2007
Intended status: Standards Track
Expires: October 24, 2007


                       Feed Paging and Archiving
                draft-nottingham-atompub-feed-history-09

Status of This Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on October 24, 2007.

Copyright Notice

   Copyright (C) The IETF Trust (2007).

Abstract

   This specification defines three types of syndicated Web feeds that
   enable publication of entries across one or more feed documents.

Table of Contents

   1.  Introduction
     1.1.  Notational Conventions
     1.2.  Terminology
   2.  Complete Feeds
   3.  Paged Feeds
   4.  Archived Feeds
     4.1.  Publishing Archived Feeds
     4.2.  Consuming Archived Feeds
   5.  IANA Considerations
   6.  Security Considerations
   7.  References
     7.1.  Normative References
     7.2.  Informative References
   Appendix A.  Acknowledgements
   Appendix B.  Use in RSS 2.0

1.  Introduction

   Syndicated Web feeds (using such formats as Atom [RFC4287]) are often
   split up into multiple documents to save bandwidth, allow "sliding
   window" access, or for other purposes.

   This specification formalizes two types of feeds that can span one or
   more feed documents: "paged" feeds and "archived" feeds.
   Additionally, it defines "complete" feeds to cover the case when a
   single feed document explicitly represents all of the feed's entries.

   Each has different properties and trade-offs:

   o  Complete feeds contain the entire set of entries in one document,
      and can be useful when it isn't desirable to "remember"
      previously-seen entries.
   o  Paged feeds split the entries among multiple temporary documents.
      This can be useful when entries in the feed are not long-lived or
      stable, and the client needs to access an arbitrary portion of
      them, usually in close succession.
   o  Archived feeds split the entries among multiple permanent
      documents, and can be useful when entries are long-lived and it
      is important for clients to see every one.

   The semantics of a feed that combines these types are undefined by
   this specification.

   Although they refer to Atom normatively, the mechanisms described
   herein can be used with similar syndication formats; see Appendix B
   for one such use.

1.1.  Notational Conventions

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in BCP 14 [RFC2119].

   This specification uses XML Namespaces [W3C.REC-xml-names-19990114]
   to uniquely identify XML element names. It uses the following
   namespace prefix for the indicated namespace URI:

   "fh": "http://purl.org/syndication/history/1.0"

1.2.  Terminology

   In this specification, "feed document" refers to an Atom Feed
   Document or similar syndication instance document. It may contain
   any number of entries, and may or may not be a complete
   representation of the logical feed.

   A "logical feed" is the complete set of entries associated with a
   feed (as contrasted with a feed document, which may contain a subset
   of them).

   "Head section" refers to a document's feed-wide metadata container;
   e.g., the child elements of the atom:feed element in an Atom Feed
   Document.
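
   As a rough illustration of the terms above, the following Python
   sketch separates the head section of an Atom Feed Document from its
   entries and checks it for an element in the "fh" namespace. The
   function names and the sample document are hypothetical, and
   treating every non-entry child of atom:feed as head-section metadata
   is an assumption made for this sketch, not a rule stated by this
   specification.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
FH_NS = "http://purl.org/syndication/history/1.0"  # the "fh" prefix


def head_section(feed_xml):
    """Return the non-entry child elements of atom:feed (assumed head section)."""
    root = ET.fromstring(feed_xml)
    entry_tag = f"{{{ATOM_NS}}}entry"
    return [child for child in root if child.tag != entry_tag]


def has_fh_element(feed_xml, local_name):
    """Report whether the head section holds fh:<local_name>."""
    wanted = f"{{{FH_NS}}}{local_name}"
    return any(child.tag == wanted for child in head_section(feed_xml))


# Hypothetical sample document, used only to exercise the functions.
SAMPLE = """<feed xmlns="http://www.w3.org/2005/Atom"
 xmlns:fh="http://purl.org/syndication/history/1.0">
 <title>NetMovies Queue</title>
 <fh:complete/>
 <entry><title>Casablanca</title></entry>
</feed>"""
```

   Matching on the expanded name (namespace URI plus local name) rather
   than on the literal prefix "fh" reflects how XML Namespaces identify
   elements; a document is free to bind a different prefix to the same
   namespace URI.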

   This specification uses terms from the XML Infoset
   [W3C.REC-xml-infoset-20040204]. However, this specification uses a
   shorthand; the phrase "Information Item" is omitted when naming
   Element Information Items. Therefore, when this specification uses
   the term "element," it is referring to an Element Information Item in
   Infoset terms.

   This specification also uses Atom link relations to identify
   different types of links; see the Atom specification [RFC4287] for
   information about their syntax, and the IANA link relation registry
   for more information about specific values.

   Note that URI references in link relation values may be relative, and
   when they are used they must be absolutised, as described in Section
   5.1 of [RFC3986].

2.  Complete Feeds

   A complete feed is a feed document that contains all of the entries
   of a logical feed; any entry not actually in the feed document SHOULD
   NOT be considered to be part of that feed.

   For example, a feed that represents a ranking that varies over time
   (such as "Top Twenty Records" or "Most Popular Items") should not
   have newer entries displayed alongside older ones. When this type of
   feed is marked as complete, old entries are discarded when it is
   refreshed.

   The fh:complete element, when present in a feed's head section,
   indicates that the feed document it occurs in is a complete
   representation of the logical feed's entries. It is an empty
   element; this specification does not define any content for it.

   Example: Atom-formatted Complete Feed

   <feed xmlns="http://www.w3.org/2005/Atom"
    xmlns:fh="http://purl.org/syndication/history/1.0">
    <title>NetMovies Queue</title>
    <subtitle>The DVDs you'll receive next.</subtitle>
    <link href="..."/>
    <fh:complete/>
    <link rel="self" href="..."/>
    <updated>2003-12-13T18:30:02Z</updated>
    <author>
      <name>John Doe</name>
    </author>
    <id>urn:uuid:60a76c80-d399-11d9-b93C-0003939e0af6</id>
    <entry>
      <title>Casablanca</title>
      <link href="..."/>
      <id>urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a</id>
      <updated>2003-12-13T18:30:02Z</updated>
      <summary>Here's looking at you, kid...</summary>
    </entry>
   </feed>

   This specification does not address duplicate entries in complete
   feeds.

3.  Paged Feeds

   A paged feed is a set of linked feed documents that together contain
   the entries of a logical feed, without any guarantees about the
   stability of the documents' contents.

   Paged feeds are lossy; that is, it is not possible to guarantee that
   clients will be able to reconstruct the contents of the logical feed
   at a particular time. Entries may be added or changed as the pages
   of the feed are accessed, without the client becoming aware of them.

   Therefore, clients SHOULD NOT present paged feeds as coherent or
   complete, or make assumptions to that effect.

   Paged feeds can be useful when the number of entries is very large,
   infinite, or indeterminate. Clients can "page" through the feed,
   only accessing a subset of the feed's entries as necessary.

   For example, a search engine might make query results available as a
   paged feed, so that queries with very large result sets do not
   overwhelm the server, the network, or the client.

   The feed documents in a paged feed are tied together with the
   following link relations:

   o  "first" - A URI that refers to the furthest preceding document in
      a series of documents.
   o  "last" - A URI that refers to the furthest following document in a
      series of documents.
   o  "previous" - A URI that refers to the immediately preceding
      document in a series of documents.
   o  "next" - A URI that refers to the immediately following document
      in a series of documents.

   Paged feed documents MUST have at least one of these link relations
   present, and should contain as many as practical and applicable.

   Example: Atom-formatted Paged Feed

   <feed xmlns="http://www.w3.org/2005/Atom">
    <title>Example Feed</title>
    <link href="..."/>
    <link rel="self" href="..."/>
    <link rel="next" href="..."/>
    <updated>2003-12-13T18:30:02Z</updated>
    <author>
      <name>John Doe</name>
    </author>
    <id>urn:uuid:60a76c80-d399-11d9-b93C-0003939e0af6</id>
    <entry>
      <title>Atom-Powered Robots Run Amok</title>
      <link href="..."/>
      <id>urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a</id>
      <updated>2003-12-13T18:30:02Z</updated>
      <summary>Some text.</summary>
    </entry>
   </feed>

   This specification does not address duplicate entries in paged feeds.

4.  Archived Feeds

   An archived feed is a set of feed documents that can be combined to
   accurately reconstruct the entries of a logical feed.

   Unlike paged feeds, archived feeds enable clients to do this without
   losing entries. This is achieved by publishing a single subscription
   document and (potentially) many archive documents.

   A subscription document is a feed document that always contains the
   most recently added or changed entries available in the logical feed.

   Archive documents are feed documents that contain less recent entries
   in the feed. The set of entries contained in an archive document
   published at a particular URI SHOULD NOT change over time. Likewise,
   the URI for a particular archive document SHOULD NOT change over
   time.

   The following link relations are used to tie subscription documents
   and archive documents together:

   o  "prev-archive" - A URI that refers to the immediately preceding
      archive document.
   o  "next-archive" - A URI that refers to the immediately following
      archive document.
   o  "current" - A URI that, when dereferenced, returns a feed document
      containing the most recent entries in the feed.

   Subscription documents and archive documents MUST have a
   "prev-archive" link relation, unless there are no preceding archives
   available. Additionally, archive documents SHOULD have
   "next-archive" and "current" link relations.

   Archive documents SHOULD also contain an fh:archive element in their
   head sections to indicate that they are archives. fh:archive is an
   empty element; this specification does not define any content for it.
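
   To make the roles of these link relations and of fh:archive more
   concrete, the following Python sketch reads them from the head
   section of an Atom feed document. It is a minimal illustration, not
   a reference implementation: the function name, the sample document,
   and its example.org URIs are hypothetical, and relative hrefs are
   returned as found rather than absolutised.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
FH_NS = "http://purl.org/syndication/history/1.0"
ARCHIVE_RELS = ("prev-archive", "next-archive", "current")


def archive_links(feed_xml):
    """Return (is_archive, links) for one Atom feed document.

    is_archive is True when the head section holds fh:archive.
    links maps each archive-related rel value found in the head
    section to its href attribute.
    """
    root = ET.fromstring(feed_xml)
    is_archive = root.find(f"{{{FH_NS}}}archive") is not None
    links = {}
    # Only direct children of atom:feed are examined, so links that
    # belong to individual entries are ignored.
    for link in root.findall(f"{{{ATOM_NS}}}link"):
        rel = link.get("rel")
        if rel in ARCHIVE_RELS:
            links[rel] = link.get("href")
    return is_archive, links


# Hypothetical archive document, used only to exercise the function.
SAMPLE = """<feed xmlns="http://www.w3.org/2005/Atom"
 xmlns:fh="http://purl.org/syndication/history/1.0">
 <title>Example Feed</title>
 <fh:archive/>
 <link rel="current" href="http://example.org/index.atom"/>
 <link rel="prev-archive" href="http://example.org/2003/10/index.atom"/>
 <entry><link rel="current" href="http://example.org/ignored"/></entry>
</feed>"""
```

   A document for which this returns no "prev-archive" link would, per
   the text above, be one with no preceding archives available.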

   Example: Atom-formatted Subscription Document

   <feed xmlns="http://www.w3.org/2005/Atom">
    <title>Example Feed</title>
    <link href="..."/>
    <link rel="self" href="..."/>
    <link rel="prev-archive" href="..."/>
    <updated>2003-12-13T18:30:02Z</updated>
    <author>
      <name>John Doe</name>
    </author>
    <id>urn:uuid:60a76c80-d399-11d9-b93C-0003939e0af6</id>
    <entry>
      <title>Atom-Powered Robots Run Amok</title>
      <link href="..."/>
      <id>urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a</id>
      <updated>2003-12-13T18:30:02Z</updated>
      <summary>Some text.</summary>
    </entry>
   </feed>

   Example: Atom-formatted Archive Document

   <feed xmlns="http://www.w3.org/2005/Atom"
    xmlns:fh="http://purl.org/syndication/history/1.0">
    <title>Example Feed</title>
    <link rel="current" href="..."/>
    <link rel="self" href="..."/>
    <fh:archive/>
    <link rel="prev-archive" href="..."/>
    <updated>2003-11-24T12:00:00Z</updated>
    <author>
      <name>John Doe</name>
    </author>
    <id>urn:uuid:60a76c80-d399-11d9-b93C-0003939e0af6</id>
    <entry>
      <title>Atom-Powered Robots Scheduled To Run Amok</title>
      <link href="..."/>
      <id>urn:uuid:cdef5c6d5-gff8-4ebb-assa-80dwe44efkjo</id>
      <updated>2003-11-24T12:00:00Z</updated>
      <summary>Some text from an old, different entry.</summary>
    </entry>
   </feed>

4.1.  Publishing Archived Feeds

   The requirement that archive documents be stable allows clients to
   safely assume that if they have retrieved one in the past, it will
   not meaningfully change in the future. As a result, if an archive
   document's contents are changed, some clients may not become aware of
   the change.

   Therefore, if a publisher requires a change to be visible to all
   users (e.g., correcting factual errors), they should consider
   publishing the revised entry in the subscription document, in
   addition to (or instead of) the appropriate archive document.
   Conversely, unimportant changes (e.g., spelling corrections) might be
   made only in archive documents.

   Publishers SHOULD construct their feed documents in such a way as to
   make duplicate removal unambiguous (see Section 4.2).

   Publishers are not required to make all archive documents available;
   they may refuse to serve (e.g., with HTTP status code 403 or 410), or
   be unable to serve (e.g., with HTTP status code 404) an archive
   document.

4.2.  Consuming Archived Feeds

   Typically, clients will "subscribe" to an archived feed by polling
   the subscription document for recent changes. If the URI contained
   in the prev-archive link relation has not been processed in the past,
   the client can "catch up" with any missed entries by dereferencing it
   and adding the contained entries to the logical feed. This process
   should be repeated recursively until the client encounters a
   prev-archive link relation that has been processed, the end of the
   archive is indicated by a missing prev-archive link relation, or an
   error is encountered.

   If duplicate entries are found, clients SHOULD consider only the most
   recently updated entry to be part of the logical feed. If duplicate
   entries have the same update time-stamp, or none is available, the
   entry sourced from the most recently updated feed document SHOULD
   replace all other duplicates of that entry.

   In Atom-formatted archived feeds, two entries are duplicates if they
   have the same atom:id element. The update time of an entry is
   determined by its atom:updated element, and likewise the update time
   of a feed document is determined by its feed-level atom:updated
   element.

   Clients SHOULD warn users when they are not able to reconstruct the
   entire logical feed (e.g., by alerting the user that an archive
   document is unavailable, or displaying pseudo-entries that inform the
   user that some entries may be missing).

5.  IANA Considerations

   The "previous", "next" and "current" link relations have been
   previously registered, and no IANA action regarding them is required.

   This specification defines the following new relations, to be added
   to the Link Relations registry:

   o  Attribute Value: prev-archive
   o  Description: A URI that refers to the immediately preceding
      archive document.
   o  Expected display characteristics: none
   o  Security considerations: See [ this document ]

   o  Attribute Value: next-archive
   o  Description: A URI that refers to the immediately following
      archive document.
   o  Expected display characteristics: none
   o  Security considerations: See [ this document ]

6.  Security Considerations

   Feeds using this mechanism have the same authentication and channel
   security concerns as explained in Atom [RFC4287].

   Feeds using these mechanisms could be crafted in such a way as to
   cause a client to initiate excessive (or even an unending sequence
   of) network requests, causing denial of service (to the client, the
   target server, and/or intervening networks). Clients can mitigate
   this risk by requiring user intervention after a certain number of
   requests, or by limiting requests either according to a hard limit,
   or with heuristics. Servers can mitigate this risk by denying
   requests that they consider abusive (e.g., by closing the
   connection, or generating an error).

   Clients should be mindful of resource limits when storing feed
   documents. To reiterate, they are not required to always store or
   reconstruct the feed when conforming to this specification; they need
   only inform the user when the reconstructed feed is not complete.

7.  References

7.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC3986]  Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform
              Resource Identifier (URI): Generic Syntax", STD 66,
              RFC 3986, January 2005.

   [RFC4287]  Nottingham, M. and R. Sayre, "The Atom Syndication
              Format", RFC 4287, December 2005.

   [W3C.REC-xml-infoset-20040204]
              Cowan, J. and R. Tobin, "XML Information Set (Second
              Edition)", W3C REC REC-xml-infoset-20040204,
              February 2004.

   [W3C.REC-xml-names-19990114]
              Bray, T., Hollander, D., and A. Layman, "Namespaces in
              XML", W3C REC REC-xml-names-19990114, January 1999.

7.2.  Informative References

   [RSS2]     Winer, D., "RSS 2.0 Specification", 2005.

Appendix A.  Acknowledgements

   The author would like to thank the following people for their
   contributions, comments and help: Danny Ayers, Thomas Broyer, Lisa
   Dusseault, Stefan Eissing, David Hall, Bill de Hora, Vidya Narayanan,
   Aristotle Pagaltzis, John Panzer, Dave Pawson, Garrett Rooney, Robert
   Sayre, James Snell, Henry Story.

   Any errors herein remain the author's, not theirs.

Appendix B.  Use in RSS 2.0

   As previously noted, while this specification's extensions are
   described in terms of the Atom feed format, they are also useful in
   similar formats. This informative appendix demonstrates how they can
   be used in an RSS 2.0-formatted [RSS2] feed.

   In RSS 2.0-formatted feeds, two entries are duplicates if they have
   the same guid element. The update time of an entry is not defined by
   RSS 2.0, but the feed-level update time can be determined by the
   pubDate element.

   RSS 2.0-formatted Complete Feed

   <rss version="2.0"
    xmlns:fh="http://purl.org/syndication/history/1.0">
    <channel>
     <title>NetMovies Queue</title>
     <link>http://netmovies.example.org/</link>
     <description>The DVDs you'll receive next.</description>
     <language>en-us</language>
     <pubDate>Tue, 10 Jun 2003 04:00:00 GMT</pubDate>
     <lastBuildDate>Tue, 10 Jun 2003 09:41:01 GMT</lastBuildDate>
     <docs>http://blogs.law.harvard.edu/tech/rss</docs>
     <generator>Weblog Editor 2.0</generator>
     <managingEditor>editor@netmovies.example.org</managingEditor>
     <webMaster>webmaster@netmovies.example.org</webMaster>
     <fh:complete/>
     <item>
      <title>Casablanca</title>
      <link>http://netmovies.example.org/movies/Casablanca</link>
      <description>Here's looking at you, kid...</description>
      <pubDate>Tue, 03 Jun 2003 09:39:21 GMT</pubDate>
      <guid>urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a</guid>
     </item>
    </channel>
   </rss>

   RSS 2.0-formatted Paged Feed

   <rss version="2.0"
    xmlns:atom="http://www.w3.org/2005/Atom">
    <channel>
     <title>Liftoff News</title>
     <link>http://liftoff.nasa.gov/</link>
     <description>Liftoff to Space Exploration.</description>
     <language>en-us</language>
     <pubDate>Tue, 10 Jun 2003 04:00:00 GMT</pubDate>
     <lastBuildDate>Tue, 10 Jun 2003 09:41:01 GMT</lastBuildDate>
     <docs>http://blogs.law.harvard.edu/tech/rss</docs>
     <generator>Weblog Editor 2.0</generator>
     <managingEditor>editor@example.com</managingEditor>
     <webMaster>webmaster@example.com</webMaster>
     <atom:link rel="next" href="..."/>
    </channel>
   </rss>

   RSS 2.0-formatted Subscription Document

   <rss version="2.0"
    xmlns:atom="http://www.w3.org/2005/Atom">
    <channel>
     <title>Liftoff News</title>
     <link>http://liftoff.nasa.gov/</link>
     <description>Liftoff to Space Exploration.</description>
     <language>en-us</language>
     <pubDate>Tue, 10 Jun 2003 04:00:00 GMT</pubDate>
     <lastBuildDate>Tue, 10 Jun 2003 09:41:01 GMT</lastBuildDate>
     <docs>http://blogs.law.harvard.edu/tech/rss</docs>
     <generator>Weblog Editor 2.0</generator>
     <managingEditor>editor@example.com</managingEditor>
     <webMaster>webmaster@example.com</webMaster>
     <atom:link rel="prev-archive" href="..."/>
     <item>
      <title>Star City</title>
      <link>http://liftoff.nasa.gov/2003/06/news-starcity</link>
      <description>How do Americans get ready to work with Russians
      aboard the International Space Station? They take a crash course
      in culture, language and protocol at Russia's</description>
      <pubDate>Tue, 03 Jun 2003 09:39:21 GMT</pubDate>
      <guid>http://liftoff.nasa.gov/2003/06/03.html#item573</guid>
     </item>
    </channel>
   </rss>

   RSS 2.0-formatted Archive Document

   <rss version="2.0"
    xmlns:atom="http://www.w3.org/2005/Atom"
    xmlns:fh="http://purl.org/syndication/history/1.0">
    <channel>
     <title>Liftoff News</title>
     <link>http://liftoff.nasa.gov/</link>
     <description>Liftoff to Space Exploration.</description>
     <language>en-us</language>
     <pubDate>Tue, 30 May 2003 08:00:00 GMT</pubDate>
     <lastBuildDate>Tue, 30 May 2003 10:31:52 GMT</lastBuildDate>
     <docs>http://blogs.law.harvard.edu/tech/rss</docs>
     <generator>Weblog Editor 2.0</generator>
     <managingEditor>editor@example.com</managingEditor>
     <webMaster>webmaster@example.com</webMaster>
     <fh:archive/>
     <atom:link rel="current" href="..."/>
     <atom:link rel="prev-archive" href="..."/>
     <item>
      <description>Sky watchers in Europe, Asia, and parts of
      Alaska and Canada will experience a partial eclipse of the Sun
      on Saturday, May 31st.</description>
      <pubDate>Fri, 30 May 2003 11:06:42 GMT</pubDate>
      <guid>http://liftoff.nasa.gov/2003/05/30.html#item572</guid>
     </item>
     <item>
      <title>The Engine That Does More</title>
      <link>http://liftoff.nasa.gov/2003/05/news-VASIMR.asp</link>
      <description>Before man travels to Mars, NASA hopes to
      design new engines that will let us fly through the Solar
      System more quickly. The proposed VASIMR engine would do
      that.</description>
      <pubDate>Tue, 27 May 2003 08:37:32 GMT</pubDate>
      <guid>http://liftoff.nasa.gov/2003/05/27.html#item571</guid>
     </item>
    </channel>
   </rss>

Author's Address

   Mark Nottingham

   EMail: mnot@pobox.com
   URI:   http://www.mnot.net/

Full Copyright Statement

   Copyright (C) The IETF Trust (2007).

   This document is subject to the rights, licenses and restrictions
   contained in BCP 78, and except as set forth therein, the authors
   retain all their rights.

   This document and the information contained herein are provided on an
   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
   THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
   OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
   THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Intellectual Property

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights. Information
   on the procedures with respect to rights in RFC documents can be
   found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use of
   such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository at
   http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard. Please address the information to the IETF at
   ietf-ipr@ietf.org.

Acknowledgement

   Funding for the RFC Editor function is provided by the IETF
   Administrative Support Activity (IASA).
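
   As an informal companion to Section 4.2 and the request limits
   suggested in Section 6, the following Python sketch shows one way a
   client might walk prev-archive links and resolve duplicate entries.
   It is a sketch under stated assumptions, not a normative algorithm:
   the names reconstruct, fetch, processed, and max_requests are
   hypothetical, fetch is assumed to return the XML text for a URI or
   raise an exception, and relative hrefs are resolved against the URI
   of the document that contains them.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from urllib.parse import urljoin

ATOM = "{http://www.w3.org/2005/Atom}"
NO_TIME = datetime.min.replace(tzinfo=timezone.utc)  # "none is available"


def parse_time(text):
    """Parse an atom:updated value; a missing value sorts earliest."""
    if not text:
        return NO_TIME
    return datetime.fromisoformat(text.strip().replace("Z", "+00:00"))


def parse_feed(feed_xml):
    """Return (feed update time, prev-archive href or None, entries)."""
    root = ET.fromstring(feed_xml)
    prev = None
    for link in root.findall(ATOM + "link"):  # head-section links only
        if link.get("rel") == "prev-archive":
            prev = link.get("href")
    entries = [
        (e.findtext(ATOM + "id"), parse_time(e.findtext(ATOM + "updated")), e)
        for e in root.findall(ATOM + "entry")
    ]
    return parse_time(root.findtext(ATOM + "updated")), prev, entries


def winners(best):
    """Strip the ranking data, leaving atom:id -> entry element."""
    return {entry_id: entry for entry_id, (_, entry) in best.items()}


def reconstruct(subscription_uri, fetch, processed=frozenset(),
                max_requests=20):
    """Return (entries keyed by atom:id, complete flag).

    complete is False when an error or the request limit stopped the
    walk, in which case the user should be warned.
    """
    best = {}  # atom:id -> ((entry time, feed time), entry element)
    visited = set()
    uri = subscription_uri
    for _ in range(max_requests):
        visited.add(uri)
        try:
            feed_time, prev, entries = parse_feed(fetch(uri))
        except Exception:
            return winners(best), False
        for entry_id, entry_time, entry in entries:
            # Newest entry wins; ties fall back to the newest document.
            rank = (entry_time, feed_time)
            if entry_id not in best or rank > best[entry_id][0]:
                best[entry_id] = (rank, entry)
        if prev is None:
            return winners(best), True  # end of the archive
        uri = urljoin(uri, prev)
        if uri in processed or uri in visited:
            return winners(best), True  # already handled earlier
    return winners(best), False  # hard request limit reached
```

   The hard limit is only one of the mitigations mentioned in Section
   6; a client could equally ask the user before continuing, or apply
   heuristics of its own.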