<?xml version="1.0" encoding="UTF-8"?>

<!-- This is built from a template for a generic Internet Draft. Suggestions for
     improvement welcome - write to Brian Carpenter, brian.e.carpenter @ gmail.com -->

<!-- This can be converted using the Web service at http://xml.resource.org/experimental.html
     (which supports the latest, sometimes undocumented and under-tested, features.) -->

<!DOCTYPE rfc SYSTEM "rfc2629.dtd" [

<!-- You need one entry like the following for each RFC referenced -->

<!ENTITY RFC2119 PUBLIC ''
  'http://xml.resource.org/public/rfc/bibxml/reference.RFC.2119.xml'>
<!ENTITY RFC2629 PUBLIC ''
  'http://xml.resource.org/public/rfc/bibxml/reference.RFC.2629.xml'>
<!ENTITY RFC2460 PUBLIC ''
  'http://xml.resource.org/public/rfc/bibxml/reference.RFC.2460.xml'>
<!ENTITY RFC6437 PUBLIC ''
  'http://xml.resource.org/public/rfc/bibxml/reference.RFC.6437.xml'>
<!ENTITY RFC6434 PUBLIC ''
  'http://xml.resource.org/public/rfc/bibxml/reference.RFC.6434.xml'>
<!ENTITY RFC6296 PUBLIC ''
  'http://xml.resource.org/public/rfc/bibxml/reference.RFC.6296.xml'>
<!ENTITY RFC4864 PUBLIC ''
  'http://xml.resource.org/public/rfc/bibxml/reference.RFC.4864.xml'>
<!ENTITY RFC2991 PUBLIC ''
  'http://xml.resource.org/public/rfc/bibxml/reference.RFC.2991.xml'>

<!-- You need one entry like the following for each I-D referenced -->


<!ENTITY DRAFT-labbal SYSTEM "http://xml.resource.org/public/rfc/bibxml3/reference.I-D.carpenter-flow-label-balancing.xml"> 

]>


<?rfc toc="yes"?>            <!-- You want a table of contents -->
<?rfc symrefs="yes"?>        <!-- Use symbolic labels for references -->
<?rfc sortrefs="yes"?>       <!-- This sorts the references -->
<?rfc iprnotified="no" ?>    <!-- Change to "yes" if someone has disclosed IPR for the draft -->
<?rfc compact="yes"?>


<!-- This defines the specific filename and version number of your draft (and inserts the appropriate IETF boilerplate -->
<rfc ipr="trust200902" docName="draft-tarreau-extend-flow-label-balancing-01" category="info" >


<front>
<title abbrev="Flow Label for Load Balancing Persistence">Extending Use of the IPv6 Flow Label for Load Balancing Persistence</title>


<author initials="B. E." surname="Carpenter" fullname="Brian Carpenter">
    <organization abbrev="Univ. of Auckland"></organization>
    <address>
      <postal>
        <street>Department of Computer Science</street>
        <street>University of Auckland</street>
        <street>PB 92019</street>
        <city>Auckland</city>
        <region></region>
        <code>1142</code>
        <country>New Zealand</country>
      </postal>
      
      <email>brian.e.carpenter@gmail.com</email>
    </address>
</author>




   <author fullname="Sheng Jiang" initials="S." surname="Jiang">
      <organization>Huawei Technologies Co., Ltd</organization>
      <address>
        <postal>
          <street>Q14, Huawei Campus</street>
          <street>No.156 Beiqing Road</street>
          <city>Hai-Dian District, Beijing</city>
          <code>100095</code>
          <country>P.R. China</country>
        </postal>
        <email>jiangsheng@huawei.com</email>
      </address>
    </author>

   <author fullname="Willy Tarreau" initials="W." surname="Tarreau">
      <organization>Exceliance</organization>
      <address>
        <postal>
          <street>R&amp;D Produits reseau</street>
          <street>3 rue du petit Robinson</street>
          <city>78350 Jouy-en-Josas</city>
          <country>France</country>
        </postal>
        <email>w@1wt.eu</email>
      </address>
    </author>

 


 <date day="5" month="December" year="2012" />

<area>Internet</area>
<workgroup></workgroup>

 


<abstract>

<t>This document describes how the IPv6 flow label could be used in an extended role to simplify
persistence mechanisms for load distribution and balancing for large server farms. 
</t>
    
</abstract>
</front>

<middle>
<section anchor="intro" title="Introduction">

<t>The IPv6 flow label has been redefined <xref target="RFC6437"/> and is
now a recommended IPv6 node requirement <xref target="RFC6434"/>.
Its use for layer 3/4 server load balancing is described in <xref target="I-D.carpenter-flow-label-balancing"/>,
and the reader is assumed to be familiar with that document.
In server load balancing, 'persistence' is defined as guaranteeing that a given session will run to completion on a single server. The present document describes extensions to the role of the flow label
to simplify and enhance persistence mechanisms. </t>

<t>Note in draft: The authors recognize that this proposal is incomplete,
that it needs considerable thought and discussion, and that other approaches might be possible.
However, current approaches to persistence in server load balancing are complicated
and pragmatic, and this new approach, even though it requires changes to
client applications and proxies as well as to load balancers themselves,
seems worth discussion. </t>

</section> <!-- intro -->

<section anchor="persist" title="Server Load Balancing and the Persistence Problem">

<t>The IPv6 flow label is a 20 bit field included in every IPv6 header <xref target="RFC2460"/>.
It is recommended to be supported in all IPv6 nodes by <xref target="RFC6434"/> and it is
defined in <xref target="RFC6437"/>. In <xref target="I-D.carpenter-flow-label-balancing"/>, it is explained how
the flow label value, set at or near the client of a server farm, may be used by a layer 3/4 load balancer
as part of a 2-tuple {source address, flow label} to efficiently identify packets belonging
to a given client's application data flow and to direct them to a specific server. </t>

<t>A layer 3/4 load balancer has to recognize incoming packets as belonging to new or existing
client sessions, and choose a target server or proxy so as to ensure persistence.
In a simple scenario, the 2-tuple {source address, flow label} will
be a sufficient label for a user data flow to guarantee persistence. However, there are
various cases where this does not apply. Sometimes, multiple
independent transport connections from the same client need to be handled
by the same server instance. This can be an extremely difficult task which
often requires ugly tricks such as pattern matching within a buffered stream,
cookie insertion, etc, which most current load balancers have to deal with every day.
</t>

<t>A common example is FTP. For a load balancer, passive mode FTP requires parsing
the entire control stream on port 21, in order to find which incoming packet will
initiate a data session on a port chosen by the server. This expensive process
is not always useful, due to the fact that sometimes clients fail to connect, or
that the session is finally not used, e.g., because no transfer needs to be performed. </t>

<t>The same issue is even more prominent with HTTP/HTTPS. While it is costly
but straightforward to insert a cookie in an HTTP stream to identify the server
to which the user was assigned, it is very difficult to do that for HTTPS, because
the stream must be deciphered first. Deciphering the stream requires
a huge amount of centralized power, since the load balancer needs to see the clear
stream; this is in fact the main reason for having specialised SSL proxies in load balancing
scenarios. </t>

<t>An additional complication that arises frequently is when a single client inadvertently
generates sessions that appear to originate from different IP addresses. This can arise, for
example, if an enterprise uses a proxy farm for outgoing traffic, or in mobile applications
where several subsequent requests come from different network cells and thus different IP
addresses (for instance, consulting a bank account in the train). When two consecutive
client requests pass through two distinct proxies, a different IP source address may be
presented to the server load balancer, which then cannot rely on address-based persistence. </t>

<t>In some application scenarios, such an inadvertent change in the client IP address may only have
consequences for performance, such as reloading transaction context into a new server. In other
cases it may be more serious and result in a transaction failure that seems inexplicable
to the user. A reliable and efficient solution to this problem is therefore needed. </t>

</section> <!-- persist -->

<section anchor="extend" title="Extended Role of Flow Label">

<t>We propose a new model in which the client application has control over the outgoing flow label, and 
assigns the same label to all transport connections related to a single application session.
It would then be both possible and desirable to use the same flow label value for multiple correlated
transport sessions from the same client. For this to work, it is also necessary for any proxies 
to be transparent to the flow label value. Thus, modifications to the network stack and
the application layer code are needed in both hosts and proxies. In particular, the network
API needs to provide a method for the application layer to set the flow label value for 
each new outgoing transport session, and to read it for each new incoming transport session. 
The former is needed for client applications, and both features are needed for HTTP proxies
in particular. </t>

<t>Such a mechanism is not the recommended default host behaviour, but it is permitted
by <xref target="RFC6437"/>, which states that "a flow is not necessarily 1:1 mapped to a
transport connection". The assignment of flow labels in this case is clearly no longer
stateless, but the requirement that they be drawn from a uniform distribution should
be retained. </t>

<t>In the case of FTP Passive mode, using a flow label, the client could generate an initial
flow label value when a file transfer is expected, and assign the same flow label
to all data connections related to the same control connection. A flow label
based load balancer would then by definition send the data traffic to the
same server as the control traffic, and would thus guarantee that the sessions
are properly associated. </t>

<t>For a multiprotocol transaction, if a web client (browser) used the same flow label for 
any protocol targetting a given host (or domain), this could be used by load balancers to 
reach the same server for several protocols (such as HTTP and HTTPS), without having to
inspect the stream payload at all nor to inspect anything beyond layer 3, which 
is unavoidable today. </t>

<t>In the case of an inadvertent change in the client IP address, most likely due
to an intervening proxy, the use of the same flow label for an entire application
session would allow a load balancer to reliably route all packets for the session to the
same server without any additional packet inspection or cookie insertion. This would
improve both efficiency and reliability for all parties concerned. </t>

<t>This proposal means that the layer 3/4 load balancer identifies a session by using
the flow label value alone, rather than the 2-tuple {source address, flow label} as
described in <xref target="I-D.carpenter-flow-label-balancing"/>. However, it is of minor importance if two
independent client sessions are directed to the same server because they happen to have
the same flow label. They will in any case be treated separately by the server, and statistically
there will be a negligible effect on load balancing, since there are a million different possible
flow labels to spread traffic across the server farm. </t>

<t>Using the flow label in this way would also greatly simplify the logging
of user sessions. A very common task is to match logs from various equipments to follow a user's activity
and decide whether it indicates a bug, user error or attack. Logging a flow label
would of course help because it's easier to find the beginning and end of
a session and decide whether it's legitimate or not. In the case of two simultaneous
application sessions using the same flow label value by chance, the two logs
would be intermingled, so the analysis would be more complex, but still quite
feasible. </t>

<t>Such extensions to the role of the flow label in load balancing are theoretically
very attractive, but would require a major refresh of client and proxy software as well
as of load balancers themselves. It amounts to considering an entire application
session, in a broad sense, as a single flow for the purposes of RFC 6437. </t>

<t>It is worth noting that what is important to save server-side resources is
wide enough adoption. Most of today's load balanced traffic is HTTP originating from
a handful of browsers which are regularly upgraded for security reasons. Thus, once
a mechanism is adopted, it can quickly be deployed across popular browsers,
and soon become the general case.</t>

<t>The difficulty of the upgrade path is then on the server side. The first step
would consist in having layer 7 load balancers be able to consider the flow label,
to avoid costly layer 7 analysis, each time it is possible. This means that if a
non-null flow label is seen, then the load balancer would consider it, otherwise
it would fall back to its default behaviour. The second step would consist in having
front layer 3/4 load balancers bypass the layer 7 load balancer farms when the flow
label is found. This point would greatly offload layer 7 load balancers. </t>

<t>Finally we observe that both clients and server farms would benefit from
this approach, in terms of performance and reliability. Thus, although both
parties (and proxy operators) would need to upgrade software, it is in their
own interest to do so. </t>

</section> <!-- extend -->


<section anchor="security" title="Security Considerations">

<t>The security considerations of <xref target="RFC6437"/>
and <xref target="I-D.carpenter-flow-label-balancing"/> apply. </t>

<t>Using the flow label on its own as a session handle
has limitations. It has no security properties and must not be used in any way
as an identifier or authenticator; it does not have enough bits to be used as a nonce.
Its value should never be used in the application layer nor where any form of resource
sharing is not desired. For instance, it is not acceptable that an application would identify
a user session by its flow label value, due to the inevitable collisions and the risk of forgery.
The flow label value on its own should only be used where resource sharing is intended 
(for instance, load balancing) and by components explicitly designed for this task,
taking into account all the risks exposed in the above security references, with solid protection
against mis-use, and acceptable fallbacks for remaining situations where the flow label
values are unusable. </t>

<t>The setting of the flow label by an application is necessarily a stateful process,
so that the application can store the label value for re-use in all transport sessions
that are part of the same application transaction. Therefore, any firewall that chooses
to rewrite the label, to avoid a perceived covert channel risk, must do so in a stateful way
such that a given incoming label value is always rewritten to the same outgoing value. </t>

</section> <!-- security -->


<section anchor="iana" title="IANA Considerations">
   <t>This document requests no action by IANA. </t>
</section> <!-- iana -->




<section anchor="ack" title="Acknowledgements">

<t>
Valuable comments and contributions were made by...
 </t>

<t>This document was produced using the xml2rfc tool
<xref target="RFC2629"/>.</t>

</section> <!-- ack -->


<section anchor ="changes" title="Change log [RFC Editor: Please remove]">

<t>draft-tarreau-extend-flow-label-balancing-01: small updates, 2012-12-05.</t>
<t>draft-tarreau-extend-flow-label-balancing-00: extended role extracted after IETF83, 2012-06-12.</t>
<t>draft-carpenter-v6ops-label-balance-02: clarified after WG discussions, 2012-03-06.</t>
<t>draft-carpenter-v6ops-label-balance-01: updated with community comments, additional author, 2012-01-17.</t>
<t>draft-carpenter-v6ops-label-balance-00: original version, 2011-10-13.</t>


</section> <!-- changes -->

</middle>

<back>

<references title="Normative References">

&RFC2460;
&RFC6434;
&RFC6437;


</references>

<references title="Informative References">

&RFC2629;
&DRAFT-labbal;



  
</references>




</back>
</rfc>

