<?xml version="1.0"?>
<?xml-stylesheet type='text/xsl' href='./rfc2629.xslt' ?>

<?rfc toc="yes"?>
<?rfc symrefs="no"?>
<?rfc compact="no" ?>
<?rfc sortrefs="yes" ?>
<?rfc strict="yes" ?>
<?rfc linkmailto="yes" ?>
<!DOCTYPE rfc SYSTEM "rfc2629.dtd" >
    


<?rfc toc="yes"?>
<?rfc symrefs="yes"?>
<?rfc compact="yes" ?>
<?rfc subcompact="yes" ?>
<?rfc sortrefs="yes" ?>
<rfc ipr="trust200902" docName="draft-nir-websec-extended-origin-02" category="std" updates="6454">
  <front>
    <title abbrev="Extended Origin">A More Granular Web Origin Concept</title>
<!-- =====================================================================
    <author initials="O." surname="Souroujon" fullname="Oren Sourourjon">
      <organization abbrev="Check Point">Check Point Software Technologies Ltd.</organization>
      <address>
        <postal>
          <street>5 Hasolelim st.</street>
          <city>Tel Aviv</city>
          <code>67897</code>
          <country>Israel</country>
        </postal>
        <email>osourourj@checkpoint.com</email>
      </address>
    </author>
========================= -->
    <author initials="Y." surname="Nir" fullname="Yoav Nir" role="editor">
      <organization abbrev="Check Point">Check Point Software Technologies Ltd.</organization>
      <address>
        <postal>
          <street>5 Hasolelim st.</street>
          <city>Tel Aviv</city>
          <code>67897</code>
          <country>Israel</country>
        </postal>
        <email>ynir@checkpoint.com</email>
      </address>
    </author>
    <date year="2012"/>
    <area>Security Area</area>
    <keyword>Internet-Draft</keyword>
    <abstract>
      <t> This document defines an HTTP header that allows the partitioning of a single origin (as 
        defined in RFC 6454) into multiple origins, so that the same-origin policy applies among
        them.</t>
      <t> The header introduced in this document allows a portal to specify that resources that 
        appear to be from the same origin should, in fact, be treated as though they are from 
        different origins, by extending the 3-tuple of the origin to a 4-tuple. A compliant user 
        agent is expected to apply the same-origin policy according to the 4-tuple rather than the 
        3-tuple.</t>
    </abstract>
  </front>
  <middle>
    <!-- ====================================================================== -->
    <section anchor="introduction" title="Introduction">
      <t> Reverse proxies such as SSL VPN portals "flatten" part of the Web, by providing access to 
        multiple web sites through a single host. For example, a company portal may be located at 
        https://sslvpn.example.com, and allow remote access to several websites that form the 
        corporate intranet as well as web-based access to the mail server. The different services 
        are distinguished by implementation-specific manipulation of the URL. For example, the 
        following three URLs might map, respectively, to the internal mail server, the internal 
        wiki, and Wikipedia:<list style="numbers">
        <t> https://sslvpn.example.com/link/my_web_mail/inbox/index.html</t>
        <t> https://sslvpn.example.com/link/the_wiki/index.html</t>
        <t> https://sslvpn.example.com/ext/wikipedia.org</t></list></t>
      <t> The problem is that although these are separate servers, they all map to the same 
        origin as defined in <xref target="RFC6454"/>. Scripts from any of these sites can 
        therefore affect the others. In addition, the Origin header defined in Section 7 of 
        RFC 6454 can leak to a real web server the fact that it is located within the same 
        flattened domain.</t>
      <t> The HTTP header introduced in this document allows the portal to specify that URLs that 
        appear to be from the same origin are, in fact, from different origins, by extending the 
        3-tuple of the origin to a 4-tuple. The user agent would be expected to apply the 
        same-origin policy according to the 4-tuple rather than the 3-tuple.</t>
      <section anchor="mustshouldmay" title="Conventions Used in This Document">
        <t>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT",
          "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described
          in <xref target="RFC2119"/>.</t>
      </section>
    </section>
    <section anchor="fmt" title="The Extended-Origin Header">
      <t> When a web portal hides multiple actual web sites behind its own origin, it MUST add the
        new Extended-Origin header defined in <xref target="hdrfmt"/>. The name field need not be
        related to the actual web origin, and is not meant for human consumption. The only 
        requirement is that different origins MUST have different names in the header.</t>
      <t> If the response from the original web site already contains one or more Extended-Origin 
        headers, then the portal adds its own header after the rest.</t>
      <section anchor="hdrfmt" title="Header Format">
        <t> The ABNF is to be added.</t>
        <t> The header includes a name, which is not necessarily meant for human consumption, and 
          a path parameter. The general format is <figure>
            <artwork><![CDATA[
    Extended-Origin: name; path=/something
]]></artwork>
          </figure></t>
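        <t> Until the ABNF is available, the informal layout above can be parsed along the
          following lines. This is a non-normative sketch in Python. The names ExtendedOrigin
          and parse_extended_origin are illustrative and not part of this specification, and
          treating the parameter name as case-insensitive is an assumption.<figure>
            <artwork><![CDATA[
```python
from typing import NamedTuple


class ExtendedOrigin(NamedTuple):
    name: str  # opaque origin name chosen by the portal
    path: str  # path prefix that the name applies to


def parse_extended_origin(value: str) -> ExtendedOrigin:
    """Parse an Extended-Origin field value.

    Assumes the informal "name; path=/something" layout only.
    """
    name_part, sep, param_part = value.partition(";")
    name = name_part.strip()
    if not sep or not name:
        raise ValueError("expected 'name; path=/something'")
    key, eq, path = param_part.strip().partition("=")
    # Assumption: the parameter name is matched case-insensitively.
    if not eq or key.strip().lower() != "path":
        raise ValueError("missing path parameter")
    path = path.strip()
    if not path.startswith("/"):
        raise ValueError("path must start with '/'")
    return ExtendedOrigin(name, path)
```
]]></artwork>
          </figure></t>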
        <t> This means that all requests of the form "GET /something/..." are considered as 
          going to the origin defined by the combination of the RFC 6454 origin and the name. As 
          such, cookies from the portal MUST NOT be returned in requests to the extended origin,
          and vice versa. Scripts from inside the extended origin MUST be prevented from executing
          requests against the main portal and against other extended origins within the same 
          portal.</t> 
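        <t> The rule above amounts to comparing 4-tuples instead of 3-tuples. The following
          non-normative Python sketch shows one way a user agent might model this. The names
          Origin4, same_origin and may_send_cookie are hypothetical, and the tuple components
          are assumed to be already normalized as in RFC 6454.<figure>
            <artwork><![CDATA[
```python
from typing import NamedTuple, Optional


class Origin4(NamedTuple):
    scheme: str
    host: str
    port: int
    # None stands for the portal's own, non-extended origin.
    name: Optional[str]


def same_origin(a: Origin4, b: Origin4) -> bool:
    """Same-origin test over the 4-tuple, not the 3-tuple."""
    return a == b


def may_send_cookie(cookie_origin: Origin4,
                    request_origin: Origin4) -> bool:
    """Return a cookie only to the (extended) origin that set it."""
    return same_origin(cookie_origin, request_origin)
```
]]></artwork>
          </figure></t>
        <t> With this model, a cookie set by the portal itself (name absent) is not returned
          to an extended origin on the same host, and a cookie set by one extended origin is not
          returned to another.</t>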
      </section>
      <section anchor="serialization" title="Update to the Serialization Requirements">
        <t> Section 6 of RFC 6454 defines how to serialize an origin for inclusion in the "Origin"
          header defined in Section 7 of that RFC.</t>
        <t> To serialize an extended origin, follow steps 1-3 of Section 6.1 or 6.2 of RFC 6454.
          To the result, append the name from the Extended-Origin header followed by a U+002E 
          FULL STOP code point ("."). Then continue with steps 4-6.</t>
        <t> For example, if the host is sslvpn.example.com, and the name in the Extended-Origin
          header is webmail, then the serialized origin becomes 
          https://webmail.sslvpn.example.com.</t>
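        <t> The serialization steps can be sketched as follows. This Python fragment is
          illustrative only: the function name and the DEFAULT_PORTS table are assumptions, 
          step 1 of RFC 6454 (returning "null" for a non-triple origin) is omitted, and the 
          host conversion of Section 6.2 is not performed.<figure>
            <artwork><![CDATA[
```python
from typing import Optional

# Assumption: only these two schemes are of interest here.
DEFAULT_PORTS = {"http": 80, "https": 443}


def serialize_extended_origin(scheme: str, host: str, port: int,
                              name: Optional[str]) -> str:
    result = scheme + "://"        # RFC 6454 steps 2 and 3
    if name:
        result += name + "."       # name plus U+002E FULL STOP
    result += host                 # RFC 6454 step 4
    if port != DEFAULT_PORTS.get(scheme):
        result += ":" + str(port)  # RFC 6454 step 5
    return result                  # RFC 6454 step 6
```
]]></artwork>
          </figure></t>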
        <t> To avoid collisions between serialized extended origins and serialized non-extended 
          origins, servers SHOULD NOT use readable names such as "webmail". For example, 
          https://webmail.sslvpn.example.com could also be the origin of a real host. Instead, 
          servers should choose random-looking extended origin names, possibly obtained by 
          hashing an internally meaningful name.</t>
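        <t> One possible way to derive such names is shown below. This is only a sketch: the
          use of a keyed hash (HMAC-SHA-256), the portal-held secret, and the 16-character
          truncation are all assumptions made for illustration, and this document does not
          mandate any particular derivation.<figure>
            <artwork><![CDATA[
```python
import hashlib
import hmac


def opaque_origin_name(internal_name: str, secret: bytes,
                       length: int = 16) -> str:
    """Derive a stable, random-looking name from an internal label.

    Assumption: `secret` is a value known only to the portal, so
    the internal label cannot be guessed from the name.
    """
    mac = hmac.new(secret, internal_name.encode("utf-8"),
                   hashlib.sha256)
    return mac.hexdigest()[:length]
```
]]></artwork>
          </figure></t>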
      </section>
    </section>
    <section anchor="example" title="Examples">
      <t> Here's an example of a connection with both the Extended-Origin and the Origin headers.
        <figure>
            <artwork><![CDATA[
    CONNECT  https://sslvpn.example.com
    
    GET / HTTP/1.1
    
    HTTP/1.1 200 OK
    Content-Type: text/html
    Set-Cookie: session=1234
    
    <html>
      <body>
        Welcome, you can read your mail 
          <a href="/link/my_web_mail/inbox/index.html">here</a>
      </body>
    </html>
    
    GET /link/my_web_mail/inbox/index.html HTTP/1.1
    Referer: https://sslvpn.example.com/
    Cookie: session=1234

    HTTP/1.1 200 OK
    Content-Type: text/html
    Extended-Origin: d41d8cd98f00b204; path=/link/my_web_mail
    Set-Cookie: mailsession=5678
    
    <html>
      <body>
        You have 1 unread message. Jumping in 5 seconds...
        <script>...</script>
      </body>
    </html>
    
    GET /link/my_web_mail/inbox/msg0945.html HTTP/1.1
    Referer: https://sslvpn.example.com/link/my_web_mail/inbox/index.htm
    Origin: https://d41d8cd98f00b204.sslvpn.example.com
    Cookie: mailsession=5678
 
]]></artwork>
          </figure></t>
      <t> In this example, the first GET was the result of the user typing in an address, or
        following a link. Therefore it has no Origin header. It goes to the main page of the portal,
        so the response contains no Extended-Origin.</t>
      <t> The second GET also happened because the user clicked a link, not because of any action 
        of the page, so there is no need to send an Origin header. If there had been such a 
        header, it would be just as defined in RFC 6454: https://sslvpn.example.com</t>
      <t> The third GET is caused by a script running on the mail page. This page came with an
        Extended-Origin header, and so the user agent constructs the Origin header in the request
        according to the new rules in <xref target="serialization"/>.</t> 
      <t> Note that the cookie set by the main portal was not sent in the third request. The second 
        reply marked all requests beginning with "/link/my_web_mail" as belonging to the extended 
        origin, and the third request matches that pattern. Cookies from the non-extended origin
        are not forwarded to the extended origin.</t>
      <t> The second request did include the portal cookie in a request to the mail server. This is
        only an issue with the main portal cookies, not among the extended origins. Some SSL VPN
        portals strip their own cookies from requests going to the other servers, and this behavior
        is RECOMMENDED.</t>
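      <t> On the portal side, the RECOMMENDED stripping could look roughly like the following
        Python sketch. The function name is hypothetical, and the portal is assumed to know the
        names of the cookies it has set itself.<figure>
            <artwork><![CDATA[
```python
from typing import Set


def strip_portal_cookies(cookie_header: str,
                         portal_cookie_names: Set[str]) -> str:
    """Drop the portal's own cookies from a Cookie field value."""
    kept = []
    for pair in cookie_header.split(";"):
        pair = pair.strip()
        if not pair:
            continue
        cookie_name = pair.split("=", 1)[0].strip()
        if cookie_name not in portal_cookie_names:
            kept.append(pair)
    # An empty result means no Cookie header should be forwarded.
    return "; ".join(kept)
```
]]></artwork>
          </figure></t>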
    </section>
    <section anchor="conversion" title="Determining the Extended Origin based on a URL">
      <t> This section defines an algorithm for converting a URL into an origin. This section is 
        not normative, and compliant browsers may implement this in other ways.</t>
      <t> For each visited site, the browser keeps a table mapping paths to origin names. Initially,
        this table looks like this:</t>
      <texttable anchor="tbl1" title="Initial table">
        <ttcol align="center">Path</ttcol>
        <ttcol align="center">Name</ttcol>
        <c>/</c><c></c>
      </texttable>
      <t> As Extended-Origin headers are encountered, entries are added to the table. For example,
        after seeing the header in the example in <xref target="example"/>, and a similar header 
        for the path /link/SAP, the table will look like this:</t>
      <texttable anchor="tbl2" title="The table with 2 more entries">
        <ttcol align="center">Path</ttcol>
        <ttcol align="center">Name</ttcol>
        <c>/</c><c></c>
        <c>/link/my_web_mail</c><c>d41d8cd98f00b204</c>
        <c>/link/SAP</c><c>12c30f3bb3275376</c>
      </texttable>
      <t> When presented with a URL, the browser can normally determine the scheme, host and port.
        The name can be determined from the path according to the closest match in the
        table. A table path matches only on whole path segments, so /link/SAPIENCE does not 
        match the /link/SAP entry. Here are some URLs and the origins to which they map:</t>
      <texttable anchor="tbl3" title="Extended origin examples">
        <ttcol align="center">URL</ttcol>
        <ttcol align="center">Extended Origin</ttcol>
        <c>https://sslvpn.example.com/index.html</c><c>https://sslvpn.example.com</c>
        <c>https://sslvpn.example.com/link/my_web_mail/msg0005.html</c><c>https://d41d8cd98f00b204.sslvpn.example.com</c>
        <c>https://sslvpn.example.com/link/SAPIENCE/index.html</c><c>https://sslvpn.example.com</c>
        <c>https://sslvpn.example.com/link/SAP/index.html</c><c>https://12c30f3bb3275376.sslvpn.example.com</c>
        <c>https://sslvpn.example.com/ext/wikipedia.org/index.html</c><c>https://sslvpn.example.com</c>
      </texttable>
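      <t> The lookup described in this section can be sketched in Python as follows. Like the
        rest of this section, the sketch is not normative. The function names are hypothetical,
        and matching on path-segment boundaries is an assumption inferred from the /link/SAP
        and /link/SAPIENCE rows of the table above.<figure>
            <artwork><![CDATA[
```python
from typing import Dict
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def lookup_name(table: Dict[str, str], path: str) -> str:
    """Return the name for the longest table path matching `path`.

    Assumption: a table path matches only on a segment boundary,
    so "/link/SAP" does not match "/link/SAPIENCE/...".
    """
    best_path, best_name = "", ""
    for prefix, name in table.items():
        base = prefix.rstrip("/")
        on_boundary = path == base or path.startswith(base + "/")
        if on_boundary and len(prefix) > len(best_path):
            best_path, best_name = prefix, name
    return best_name


def extended_origin(table: Dict[str, str], url: str) -> str:
    """Map a URL to its serialized (extended) origin."""
    parts = urlsplit(url)
    name = lookup_name(table, parts.path or "/")
    label = name + "." if name else ""
    port = parts.port
    if port is None or port == DEFAULT_PORTS.get(parts.scheme):
        suffix = ""
    else:
        suffix = ":" + str(port)
    host = parts.hostname or ""
    return parts.scheme + "://" + label + host + suffix
```
]]></artwork>
          </figure></t>
      <t> Here the table is a mapping from path to name, with the initial "/" entry mapped to
        an empty name, which serializes to the plain RFC 6454 origin.</t>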
    </section>
    <section anchor="cors" title="CORS interaction">
      <t> The interaction between this draft and CORS (<xref target="CORS"/>) is to be added.</t>
    </section>
    <section anchor="openq" title="Open Issues">
      <section anchor="openq_ut" title="Other Methods of Encoding Server Identity">
        <t> Some SSL-VPN products and configurations do not encode the server identity as a 
          prefix in the URL, the way the example in <xref target="example"/> does. One such 
          method is this:<figure>
            <artwork><![CDATA[
 https://sslvpn.example.com/p/inb/msg0945.html,HOST=mail.example.com
]]></artwork>
          </figure></t>
        <t> The issue here is that the path parameter, as currently defined, cannot express 
          which URLs belong to the extended origin. We could replace it with a parameter 
          that accepts a regular expression, but that seems overly complex:<figure> 
            <artwork><![CDATA[
    Extended-Origin: webmail; expr=/p/*,HOST=mail.example.com
]]></artwork>
          </figure></t>
      </section>  
    </section>
    <!-- ====================================================================== -->
    <section anchor="ack" title="Acknowledgements">
      <t> Oren Souroujon contributed some of the text in this document, and also came up with the 
        original idea. Yehezkel Horowitz helped with reviewing the draft and pointing out the 
        issues with cookies and paths.</t>
      <t> Thanks to James Manger and Tobias Gondrom for reviewing the first version of this draft.</t>  
    </section>
    <section anchor="security" title="Security Considerations">
      <t> This document causes compliant clients to disallow certain actions that are allowed
        today. In that sense, it reduces the attack surface.</t>
      <t> More to be added. </t>
    </section>
    <section anchor="iana" title="IANA Considerations">
      <t> The permanent message header field registry (see <xref target="RFC3864"/>) should be 
        updated with the following registration: <list style="symbols">
        <t> Header field name: Extended-Origin</t>
        <t> Applicable protocol: http</t>
        <t> Status: Standard</t>
        <t> Author/Change controller: IETF</t>
        <t> Specification document: this specification</t></list></t>
    </section>
    <section anchor="delta" title="Changes from Previous Versions">
      <t> NOTE TO RFC EDITOR: Please remove this section before publication.</t>
      <section anchor="diff02" title="Changes in version -02">
        <t> Added <xref target="conversion"/> about converting URLs to extended origins.</t>
      </section>
      <section anchor="diff01" title="Changes in version -01">
        <t> Removed the special handling of portals behind portals.</t>
        <t> Changed the syntax of the serialized origin from fragment-like to subdomain-like.</t>
        <t> Cleaned up some grammar.</t>
      </section>
    </section>
  </middle>
  <!-- ====================================================================== -->
  <back>
    <references title="Normative References"> 
      <reference anchor='RFC6454'>
        <front>
          <title abbrev='origin'>The Web Origin Concept</title>
          <author initials='A.' surname='Barth' fullname='Adam Barth'>
            <organization>Google, Inc.</organization>
            <address>
              <email>ietf@adambarth.com</email>
            </address>
          </author>
          <date year='2011' month='December' />
          <area>Security</area>
          <keyword>keyword</keyword>
        </front>
        <seriesInfo name='RFC' value='6454' />
        <format type='TXT' target='http://www.ietf.org/rfc/rfc6454.txt' />
        <format type='HTML' target='http://tools.ietf.org/html/rfc6454' />
      </reference>
      <reference anchor='RFC2119'>
        <front>
          <title abbrev='RFC Key Words'>Key words for use in RFCs to Indicate Requirement Levels</title>
          <author initials='S.' surname='Bradner' fullname='Scott Bradner'>
            <organization>Harvard University</organization>
            <address>
              <postal>
                <street>1350 Mass. Ave.</street>
                <street>Cambridge</street>
                <street>MA 02138</street>
              </postal>
              <phone>- +1 617 495 3864</phone>
              <email>sob@harvard.edu</email>
            </address>
          </author>
          <date year='1997' month='March' />
          <area>General</area>
          <keyword>keyword</keyword>
        </front>
        <seriesInfo name='BCP' value='14' />
        <seriesInfo name='RFC' value='2119' />
        <format type='TXT' octets='4723' target='ftp://ftp.isi.edu/in-notes/rfc2119.txt' />
        <format type='HTML' octets='16553' target='http://xml.resource.org/public/rfc/html/rfc2119.html' />
        <format type='XML' octets='5703' target='http://xml.resource.org/public/rfc/xml/rfc2119.xml' />
      </reference>
    </references>
    <references title="Informative References"> 
      <reference anchor='CORS'>
        <front>
          <title abbrev='cors'>Cross-Origin Resource Sharing</title>
          <author initials='A.' surname='van Kesteren' fullname='Anne van Kesteren'>
            <organization>Opera Software ASA</organization>
            <address>
              <email>annevk@opera.com</email>
            </address>
          </author>
          <date year='2010' month='July' />
        </front>
        <seriesInfo name='W3C Working Draft' value='WD-cors-20100727' />
        <format type='HTML' target='http://www.w3.org/TR/cors/' />
      </reference>
      <reference anchor='RFC3864'>
        <front>
          <title abbrev='bcp90'>Registration Procedures for Message Header Fields</title>
          <author initials='G.' surname='Klyne' fullname='Graham Klyne'>
            <organization>Nine by Nine</organization>
            <address>
              <email>GK-IETF@ninebynine.org</email>
            </address>
          </author>
          <author initials='M.' surname='Nottingham' fullname='Mark Nottingham'>
            <organization>BEA Systems</organization>
            <address>
              <email>mnot@pobox.com</email>
            </address>
          </author>
          <author initials='J.C.' surname='Mogul' fullname='Jeffrey C. Mogul'>
            <organization>HP Labs</organization>
            <address>
              <email>JeffMogul@acm.org</email>
            </address>
          </author>
          <date year='2004' month='September' />
        </front>
        <seriesInfo name='RFC' value='3864' />
        <seriesInfo name='BCP' value='90' />
        <format type='TXT' target='http://www.ietf.org/rfc/rfc3864.txt' />
        <format type='HTML' target='http://tools.ietf.org/html/rfc3864' />
      </reference>
    </references>
    <!-- ====================================================================== -->
  </back>
</rfc>
