<?xml version="1.0" encoding="US-ASCII"?>
<!DOCTYPE rfc SYSTEM "rfc2629.dtd">
<?rfc toc="yes"?>
<?rfc tocompact="yes"?>
<?rfc tocdepth="3"?>
<?rfc tocindent="yes"?>
<?rfc symrefs="yes"?>
<?rfc sortrefs="yes"?>
<?rfc comments="yes"?>
<?rfc inline="yes"?>
<?rfc compact="yes"?>
<?rfc subcompact="no"?>
<rfc category="info"
     docName="draft-farrer-softwire-lw4o6-deterministic-arch-00"
     ipr="trust200902">
  <front>
    <title>lw4over6 Deterministic Architecture</title>

    <author fullname="Ian Farrer" initials="I." surname="Farrer">
      <organization>Deutsche Telekom AG</organization>

      <address>
        <postal>
          <street>GTN-FM4</street>

          <street>Landgrabenweg 151</street>

          <!-- Reorder these if your country does things differently -->

          <city>Bonn</city>

          <code>53227</code>

          <country>Germany</country>
        </postal>

        <email>ian.farrer@telekom.de</email>

        <!-- uri and facsimile elements may also be added -->
      </address>
    </author>

    <author fullname="Alain Durand" initials="A." surname="Durand">
      <organization>Juniper Networks</organization>

      <address>
        <postal>
          <street>1194 North Mathilda Avenue</street>

          <!-- Reorder these if your country does things differently -->

          <city>Sunnyvale</city>

          <region>CA</region>

          <code>94089-1206</code>

          <country>USA</country>
        </postal>

        <email>adurand@juniper.net</email>

        <!-- uri and facsimile elements may also be added -->
      </address>
    </author>

    <date/>

    <abstract>
      <t>This memo provides an operational deterministic architecture for the deployment of
      Lightweight 4over6 <xref target="I-D.cui-softwire-b4-translated-ds-lite"/> offering scalability
      and high-availability whilst preserving the per-flow stateless nature of
      the solution.</t>
    </abstract>

    <note title="Requirements Language">
      <t>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
      "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
      document are to be interpreted as described in <xref
      target="RFC2119"/>.</t>
    </note>
  </front>

  <middle>
    <section title="Introduction">
      <t>DS-Lite <xref target="RFC6333"/> is a solution to deal with the IPv4
      exhaustion problem once an IPv6 access network is deployed. It enables
      unmodified IPv4 applications to access the IPv4 Internet over the IPv6
      access network. In the DS-Lite architecture, global IPv4 addresses are
      shared amongst subscribers as the concentrator (AFTR) performs a
      Carrier-Grade NAT (CGN) function.</t>

      <t>[I-D.cui-softwire-b4-translated-ds-lite] extends the original DS-Lite
      model so that NAT is performed by the CPE and IPv4 address sharing is
      possible through the use of source port-restrictions.</t>

      <t>This memo provides an operational architecture for the deployment of
      Lightweight 4over6 <xref
      target="I-D.cui-softwire-b4-translated-ds-lite"/> offering scalability
      and high-availability whilst preserving the per-flow stateless nature of
      the solution.</t>

      <t>The approach presented here is stateless and deterministic. It
      leverages the stateless properties of Lightweight 4over6 to offer a
      completely deterministic solution. The bindings between IPv4 addresses,
      ports and IPv6 addresses are pre-computed and stored identically in the
      AFTRs and DHCP servers. This allows for a very simple fail-over
      mechanism within a cluster of identically provisioned AFTRs.</t>
    </section>

    <section title="Deterministic Architecture">
      <section title="Distribution of the Subscriber Population">
        <t>In a large deployment, it makes sense to distribute the subscriber
        population into subscriber groups, managed by a single AFTR, or by
        many AFTRs grouped in a cluster. Topological considerations and
        geographical proximity may also be factors in the grouping of
        subscribers. The exact size of those groups depend on the capacity
        characteristics of the AFTRs and is out of scope for this memo.</t>

        <t>Each subscriber group is assigned an IPv6 anycast address and a
        pool of IPv4 addresses which are common to all AFTRs in a cluster. The
        IPv4 pool must be sized to handle the subscriber population. No
        constraints are placed upon the addresses that are used for this pool,
        in that they can be taken from a single, contiguous block, multiple
        non-contiguous blocks or single IPv4 addresses as required by the
        operator.</t>

        <t>The exact ratio subscribers to IPv4 addresses, (e.g. the average
        number of ports assigned per subscriber) is out of scope for this
        memo.</t>
      </section>

      <section title="AFTR Cluster">
        <t>All AFTRs within a cluster are configured with identical lW4o6
        parameters. In particular, they are configured with the same: <list
            style="symbols">
            <t>IPv6 AFTR tunnel end-point address</t>

            <t>IPv4 public pool</t>

            <t>IPv6 address to IPv4 address and port binding table</t>
          </list></t>
      </section>

      <section title="IPv4 Address Plus Ports to IPv6 Binding Table Considerations">
        <t>The DHCPv4 over IPv6 server will provide each IPv6 CPE an IPv4
        address and port range to use within its local NAT binding table. The
        DHCPv4 server uses the IPv6 address of the CPE as its identifier. As
        such, the DHCPv4 server contains a table for assigning a specific IPv4
        address and ports based on the IPv6 address of the requesting CPE. To
        maintain the stateless nature of the architecture, DHCPv4 reservation
        based address assignment is recommended. The lease time for the IPv4
        address is unimportant, although a long lease time (or even infinity)
        is recommended to reduce the number of DHCPv4 requests. </t>

        <t>A similar table (containing the same address/port binding
        information) is also present on all AFTRs in the cluster.</t>

        <t>The following table shows sample CPE configuration data for a
        subscriber group. In order for the system to function coherently, this
        data needs to be kept synchronised between all of the functional
        elements (AFTRs and DHCPv4o6 servers) serving the subscriber
        group.</t>

        <t/>

        <figure>
          <artwork><![CDATA[          +--------------+------------+----------+
          | IPv6 address |IPv4 address|port-range|
          +--------------+------------+----------+
          |2001:db8::1   |   1.2.3.4  | 1000-1999|
          |2001:0:1::2   |   1.2.3.4  | 2000-2999|
          |2001:0:5::1   |   2.3.4.5  | 1500-3999|
          +--------------+------------+----------+
                                             
]]></artwork>

          <postamble>Figure 1 DHCP4o6/AFTR Per-subscriber configuration table
          data example</postamble>
        </figure>

        <t/>

        <t>This memo proposes a simple architecture to guarantee the
        synchronization of those mapping tables and rely on anycast IPv4 and
        IPv6 technologies to provide failover within the AFTRs in the same
        cluster.</t>
      </section>

      <section title="IPv6 and IPv4 Anycast Considerations">
        <t>The following diagram shows the architecture for the Lightweight
        4over6 cluster deployment.</t>

        <t><figure>
            <artwork><![CDATA[
                                              ^
                                              |  To public Internet
             Public IPv4 Address              |  (IPv4)
             pool advertised into             |
             upstream routing                 |
             by all AFTRs             +-------+-------+
             in cluster               |               |
                                      |   Upstream    |
                +- - - - - - - - - - >| IPv4 Anycast  |
                |                     +--+----+----+--+
                                         |    |    |
                |                        |    |    |
                   +---------------------+    |    +---------------------+
                |  |                          |                          |
           +----+--+-------+          +-------+ ------+          +-------+-------+
           |               |          |               |          |               |
           |    LW4o6      |          |    LW4o6      |          |    LW4o6      |
           |    AFTR1      |          |    AFTR2      |  . . . . |    AFTRn      |
           |               |          |               |          |               |
           +----+--+-------+          +-------+-------+          +-------+-------+
                |  |                          |                          |
                   +---------------------+    |    +---------------------+
                |                        |    |    |
                                         |    |    |
                |                     +--+----+----+--+
                +- - - - - - - - - - >|               |
                                      |  Downstream   |
            IPv6 tunnel end-point     | IPv6 Anycast  |
            Anycast Address           +-------+-------+
            advertized into                   |
            downstream routing                |
            by all AFTRS                      | To Initiator
            in cluster                        | Access network
                                              | (IPv6 only)
                                              v
]]></artwork>

            <postamble>Figure 2 Lightweight 4over6 Cluster</postamble>
          </figure></t>

        <t>To increase service availability, A simple way to achieve fail-over
        is to configure both the IPv6 tunnel end point address and the IPv4
        address pool as anycast addresses on the AFTRs and announce these
        routes into the IGP run by the ISP, such as OSPF or IS-IS.</t>

        <t>The number of AFTRs in a cluster provides the degree of redundancy
        of the solution. In practice, two AFTRs are expected to be sufficient
        in most cases.</t>
      </section>

      <section title="Load Balancing across Multiple Concentrators">
        <t>AFTR functionality can be scaled by load balancing the
        encapsulation/decapsulation across multiple AFTRs in a cluster. Due to
        the commonality of configuration and stateless nature of the solution,
        any tunneled packet from any Initiator served by the cluster can
        arrive at any cluster member and will be processed in the same way.
        Likewise, for inbound packets originating in the IPv4 realm, a packet
        that arrives at any of the cluster member will be encapsulated and
        sent to the correct initiator.</t>

        <t>Load balancing could be achieved using specific load balancing
        infrastructure to distribute the tunnels and inbound v4 traffic across
        the cluster. It is also possible to use the Equal Cost Multipath
        inherent in some routing protocols to achieve this.</t>

        <t>In order to prevent out-of-sequence packets in the tunnelled
        traffic, a mechanism for forwarding all packets belonging to a single
        tunnel through the same cluster member should be used. An example of
        this would be a source/destination hashing algorithm such as <xref
        target="RFC2992"/> describes.</t>
      </section>

      <section title="DHCPv6 Tunnel End-point Option Considerations">
        <t>All CPEs belonging to the same group of subscribers need to receive
        the same tunnel end-point option (via DHCPv6). This will be set to the
        IPv6 anycast address of the AFTR cluster.</t>
      </section>

      <section title="CPE IPv6 Address Management">
        <t>The DHCPv4 server uses the IPv6 address of the CPE as its index. In
        order to keep the overall service architecture flexible and adaptable,
        it is preferable that the CPE is configured using DHCPv6 out of a
        specific pool reserved by the ISP.</t>
      </section>

      <section title="Binding Table Synchronization">
        <t>It is proposed that binding tables be pre-computed and stored
        statically on the AFTRs and the DHCPv4 servers. The method of creating
        the binding tables is out of the scope of this memo.</t>

        <t>These tables are not expected to change regularly. Typical reasons
        for an update include adding or removing an IPv4 address block, or
        changing the size of IPv4 ports ranges available to each CPE.</t>

        <t>To ensure continuous operation, binding tables have to be updated
        simultaneously across all AFTRs in a cluster by a mechanism such as
        netconf. It may also be necessary to reconfigure CPEs during this
        process (e.g. via a DHCPv6 reconifgure message). The details are out
        of scope for this memo.</t>
      </section>

      <section title="Subscriber Management and Growth">
        <t>It is recommended that the ISP predefines all IPv6 addresses and
        corresponding IPv4 addresses and port ranges for any given subscriber
        group.</t>

        <t>If the ISP runs out of space within a subscriber group, another
        group is then defined. Cutomer CPEs can be migrated between different
        subscriber groups by alterning the CPE configuration over DHCP.</t>
      </section>

      <section title="Privacy Extensions">
        <t>In some deployments, regulations require that IP addresses
        allocated to customers can be changed periodically or on demand to
        protect users privacy.</t>

        <t>This can be achieve by rolling over the IPv6 addresses in the
        DHCPv6 server allocating IPv6 addresses to the CPE. If all subscribers
        within the subscriber group are allocated the same number of ports in
        IPv4, then the IPv6 to IPv4 address and port binding may remain the
        same, the IPv4 address and ports will then roll over automatically at
        the same time as the IPv6 addresses do.</t>

        <t><xref target="I-D.cui-softwire-b4-translated-ds-lite"/> states that
        when the IPv6 address of the B4 is changed, then DHCPv4over6
        configuration process must be re-initiated.</t>
      </section>
    </section>

    <section title="IANA Considerations">
      <t>None.</t>
    </section>

    <section anchor="Security" title="Security Considerations">
      <t>None.</t>
    </section>

    <section anchor="Acknowledgements" title="Acknowledgements">
      <t/>
    </section>
  </middle>

  <back>
    <references title="Normative References">
      <?rfc include="reference.RFC.2119"?>

      <?rfc include='reference.RFC.2992'?>

      <?rfc include="reference.RFC.6333"?>
    </references>

    <references title="Informative References">
      <?rfc include="reference.I-D.cui-softwire-b4-translated-ds-lite"?>
    </references>
  </back>
</rfc>
