<?xml version="1.0" encoding="utf-8"?>

<!DOCTYPE rfc SYSTEM "http://xml.resource.org/authoring/rfc2629.dtd" [
	<!ENTITY rfc2119 SYSTEM 'http://xml.resource.org/public/rfc/bibxml/reference.RFC.2119.xml'>
  <!ENTITY rfc2616 SYSTEM 'http://xml.resource.org/public/rfc/bibxml/reference.RFC.2616.xml'> 
  <!ENTITY rfc2756 SYSTEM 'http://xml.resource.org/public/rfc/bibxml/reference.RFC.2756.xml'> 
  <!ENTITY rfc5988 SYSTEM 'http://xml.resource.org/public/rfc/bibxml/reference.RFC.5988.xml'> 
]>

<?xml-stylesheet type='text/xsl' href='rfc2629.xslt' ?>
<?rfc toc="yes" ?>
<?rfc tocdepth="3" ?>
<?rfc tocindent="yes" ?>
<?rfc symrefs="yes" ?>
<?rfc sortrefs="yes"?>
<?rfc iprnotified="no" ?>
<?rfc strict="yes" ?>
<?rfc compact="yes" ?>
<?rfc comments="yes" ?>
<?rfc inline="yes" ?>

<rfc ipr="trust200902" 
		 docName="draft-nottingham-linked-cache-inv-04" 
		 category="info"
		 submissionType="independent"
>

	<front>
	<title abbrev="">Linked Cache Invalidation</title>
	<author initials="M." surname="Nottingham" fullname="Mark Nottingham">
	  <organization></organization>
	  <address>       
	    <email>mnot@mnot.net</email> 
	    <uri>http://www.mnot.net/</uri>       
	  </address>
	</author>
	<author initials="M." surname="Kelly" fullname="Mike Kelly">
	  <organization></organization>
	  <address>       
	    <email>mike@stateless.co</email>
			<uri>http://stateless.co/</uri>
	  </address>
	</author>
	<date year="2012"/>
	<abstract>
	   <t>This memo defines two new link types that indicate relationships
      between resources in terms of cache invalidation, along with a HTTP
      cache-control extension that takes advantage of those relationships to
      use them to extend response freshness. Collectively, this is referred to
      as Linked Cache Invalidation (LCI).</t>
	</abstract>
	</front>

	<middle>

		<section title="Introduction">

			<t>In normal operation, a HTTP <xref target="RFC2616"/> cache will
         invalidate a stored response if an unsafe request (e.g., POST,
         PUT or DELETE) is made to its URI. HTTP also provides for such a
         state-changing request to invalidate related resources (using the
         Location and Content-Location headers in the response), but this is
         of limited utility, because those headers have defined semantics, and
         can only occur once each.</t>

			<t>Because of this, it is not practical to make a response that
			depends on the state of another resource cacheable. For example, an
			update to a blog entry might change several different resources, such
			as the user's summary page, the blog's "front" page, the blog's Atom
			feed, and of course the blog entry itself. If any of these resources
			is made cacheable, it will not reflect those changes, causing
			confusion if the user tries to verify that their changes have been
			correctly applied.</t>

			<t>This memo introduces new link relation types <xref
         target="RFC5988"/> that allow more fine-grained relationships between
         resources to be defined, so that caches can invalidate all related
         representations when the state of one changes. It also introduces a
         cache-control response extension, so that responses using the
         relations can be cached by implementations that understand these
         relations.</t>

			<section title="Example">
				
				<t>Taking the blog use case described above, imagine that we have
            the following related resources:</t>
					
					<t><list style="symbols">
						<t>http://example.com/blog/2012/05/04/hi [the blog
                  entry]</t>
						<t>http://example.com/blog/2012/05/04/hi/comments [full
                  comments for the entry]</t>
						<t>http://example.com/blog/ {the blog "home"}</t>
						<t>http://example.com/users/bob/ [the user page, listing his
                  entries]</t>
					</list></t>
				
				<t>When someone comments on Bob's blog entry, they might send a
            request like this:</t>
					
				<figure>
<artwork xml:space="preserve"><![CDATA[
POST /cgi-bin/blog.cgi HTTP/1.1
Host: example.com
Content-Type: application/x-www-form-urlencoded
Content-Length: 7890

[...]
]]></artwork>
				</figure>
										
				<t>When the comment is successful, it's typical to redirect the
            client back to the original blog page, with a response like
            this:</t>
					
				<figure>
<artwork xml:space="preserve"><![CDATA[
HTTP/1.1 302 Moved Temporarily
Location: http://example.com/blog/2012/05/04/hi
Content-Length: 0

]]></artwork>
				</figure>
					
				<t>Which would invalidate the blog entry URI, as per HTTP's normal
            operation.</t>
				
				<t>To invalidate the full comments page for the entry, the
            relationship can be described in that page's response headers:</t>
					
				<figure>
<artwork xml:space="preserve"><![CDATA[
HTTP/1.1 200 OK
Content-Type: text/html
Content-Length: 5555
Link: </blog/2012/05/04/hi>; rel="inv-by"
Cache-Control: no-cache, inv-maxage=600

[...]
]]></artwork>
				</figure>
		
				<t>This declares that whenever the entry page (the target of the
            link header) changes, this response (the full comments page)
            changes as well; it's invalidated by the link target.</t>
	
				<t>Note that the full comments page also carries a Cache-Control
            header that instructs "normal" caches not to reuse this response,
            but allows those caches that are aware of LCI to consider it fresh
            for ten minutes.</t>
									
				<t>To invalidate the blog home page and user page, it's
            impractical to list all of the resources that might change if a
            new entry is posted; not only are there many of them, but their
            URLs might not be known when the pages are cached. To address
            this, the POST response itself (i.e., when the comment is made)
            can nominate resources to invalidate, using the 'invalidates'
            relation, making that response:</t>
					
				<figure>
<artwork xml:space="preserve"><![CDATA[
HTTP/1.1 302 Moved Temporarily
Location: http://example.com/blog/2012/05/04/hi
Link: <http://example.com/blog/>; rel="invalidates", 
      <http://example.com/users/bob/>; rel="invalidates"
Content-Length: 0

]]></artwork>
				</figure>

				<t>Depending on how important it is to see updates on the home
					page and user page, those responses can either allow caching
					regardless of support for LCI, like this:</t>

				<figure>
<artwork xml:space="preserve"><![CDATA[
Cache-Control: max-age=300
]]></artwork>
				</figure>
					
				<t>... or they can only allow caching by LCI-aware caches, like
            this:</t>
					
				<figure>
<artwork xml:space="preserve"><![CDATA[
Cache-Control: no-cache, inv-maxage=300
]]></artwork>
				</figure>
									
				<t>Together, these techniques can be used to invalidate a variety
            of related responses.</t>
					
				<t>It is important to note that the invalidations are only
            effective in the caches that the client's request stream travels
            through. This means other caches will continue to serve the "old"
            content until the invalidation event is propagated to them
            (see below) or the cached responses become stale.</t>
					
				<t>When multiple caches are close together, the HyperText Caching
            Protocol (HTCP) <xref target="RFC2756"/> can be used to propagate
            invalidation events between them.</t>
				
			</section>

	</section>

		<section title="Notational Conventions">

			<t>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
			NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in
			this document are to be interpreted as described in RFC 2119 <xref
			target="RFC2119"/>.</t>

         <t>This document uses the Augmented Backus-Naur Form (ABNF) notation
         of <xref target="RFC2616" />, and explicitly includes the following
         rules from it: delta-seconds.</t>

		</section>

		<section title="The 'invalidates' Link Relation Type" anchor="invalidates">
			
			<t>The 'invalidates' link relation type allows a response that
         signifies a state change on the server to indicate one or more
         associated URIs whose states have also changed.</t>
			
			<t><list style="symbols">
				<t>Relation name: invalidates</t>
				<t>Description: Indicates that when the link context changes, 
					the link target also has changed.</t>
				<t>Reference: [this document]</t>
				<t>Notes:</t>
			</list></t>
			
		</section>
		
		<section title="The 'inv-by' Link Relation Type" anchor="inv-by">

   		<t>The 'inv-by' link relation type allows a response to nominate one
         or more other resources that affect the state of the resource it's
         associated with. That is, when one of the nominated resources
         changes, it also changes the state of this response's resource.</t>
			
   		<t><list style="symbols">
   			<t>Relation name: inv-by</t>
   			<t>Description: Indicates that when the link target changes, the
   				link's context has also changed.</t>
   			<t>Reference: [this document]</t>
   			<t>Notes:</t>
   		</list></t>

		</section>
		
		<section title="The 'inv-maxage' Response Cache-Control Extension"
    anchor="inv-maxage">

			<t>When present, the 'inv-maxage' cache-control extension indicates
         the number of seconds that caches who implement Linked Cache
         invalidation can consider responses fresh for, provided they are not
         invalidated.</t>

         <section title="inv-maxage Syntax">

            <t>The inv-maxage cache-control extension is parameterised with
               a numeric argument:</t>

   			<figure>
   			<artwork xml:space="preserve"><![CDATA[	
"inv-maxage" "=" ( delta-seconds | ( <"> delta-seconds <"> ) )			
]]></artwork>
   			</figure>

            <t>Note that the argument MAY occur in either token or
            quoted-string form.</t>

            <t>If the argument is missing or otherwise does not conform to 
               the BNF rule, it is invalid and MUST be ignored by caches.</t>
               
            <t>If the directive appears more than once in a response, each
            instance is invalid and MUST be ignored by caches.</t>
            
         </section>
         
         <section title="inv-maxage Semantics">

			<t>HTTP caches MAY, if they fully implement this specification,
         disregard the HTTP response cache-control directives 'no-cache',
         'max-age' and 's-maxage' when a valid 'inv-maxage' cache-control
         directive is present in a response, using its value as a replacement
         for max-age.</t>

			<t>HTTP caches using inv-maxage to calculate freshness MUST
         invalidate all stored responses whose request-URIs (after
         normalisation) are the target of a 'invalidates' link relation
         contained in a successful response to a state-changing request,
         provided that they are allowed.</t>

			<t>HTTP caches using inv-maxage to calculate freshness MUST
         invalidate all stored responses containing a 'inv-by' link relation
         whose target is the current request-URI (after normalisation) upon
         receipt of a successful response to a state-changing request.</t>

			<t>Likewise, HTTP caches using inv-maxage to calculate freshness MUST
         invalidate all stored responses containing a 'inv-by' link relation
         whose target is the content of either the Location or
         Content-Location response headers (after normalisation) upon receipt
         of a successful response to a state-changing request.</t>

			<t>Here, a response is considered to "contain" a link relation if it
         is carried in the Link HTTP header <xref target="RFC5988"/>. I.e., it
         is not necessary to look at the response body.</t>
				
			<t>"Invalidate" means that the cache will either remove all stored
         responses related to the effective request URI, or will mark these as
         "invalid" and in need of a mandatory validation before they can be
         returned in response to a subsequent request.</t>
				
			<t>A "successful" response is one with a 2xx or redirecting 3xx
         (e.g., 301, 302, 303, 307) status code.</t>
			
			<t>A "state-changing" request is one with an unsafe method (e.g.,
         POST, PUT, DELETE, PATCH), or one that is not known to be safe.</t>
			
			<t>In this context, "normalisation" means, in the case of a relative
         request-URI, that it is absolutised using the value of the Host
         request header and the appropriate protocol scheme.</t>
			
			<t>Finally, an invalidation based upon "invalidates" is "allowed" if
         the host part of the request-URI (if absolute) or Host request header
         (if the request-URI is relative) matches the host part of the target
         URI. This prevents some types of denial-of-service attacks.</t>
			
			<t>Implementations SHOULD effect invalidations when they become aware
         of changes through other means; e.g., HTCP <xref target="RFC2756"/>
         CLR messages, upon invalidations caused by other links (i.e., chained
         "cascades" of linked invalidations), or when a changed response is
         seen (such as when HTTP validation is unsuccessful).</t>

         </section>
			
		</section>

		<section title="Security Considerations">

			<t>Linked Cache Invalidation does not guarantee that invalidations
         will be effected; e.g., they can be lost due to network issues or
         cache downtime. Furthermore, it does not guarantee that all caches
         that understand LCI will be made aware of invalidations that happen,
         because of how they originate.</t>

			<t>Therefore, care should be taken that LCI invalidations are not
         relied upon (e.g., to purge sensitive content).</t>

			<t>Furthermore, while some care is taken to avoid denial-of-service
         attacks through invalidation, cache efficiency may still be impaired
         under certain circumstances (e.g., arranging for one request to
         invalidate a large number of responses), leading to a reduction in
         service quality.</t>

		</section>

		<section title="IANA Considerations">

			<t>This document registers two entries in the Link Relation Type
         Registry; see <xref target="invalidates"/> and <xref
         target="inv-by"/>.</t>

         <t>It also registers a HTTP Cache Directive, "inv-maxage"; see <xref
         target="inv-maxage"/>.</t>

		</section>
	</middle>

	<back>

		<references title="Normative References">
			&rfc2119; &rfc2616; &rfc5988;
		</references>
		
		<references title="Informative References">
			&rfc2756;
		</references>

		<section title="Acknowledgements">
			<t>Thanks to Michael Hausenblas for his input.</t>
		</section>

	</back>
</rfc>
