<?xml version="1.0"?>
<!DOCTYPE rfc SYSTEM "rfc2629.dtd" [
<!ENTITY rfc2119 SYSTEM
"http://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.2119.xml">
<!ENTITY rfc5245 SYSTEM
"http://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.5245.xml">
<!ENTITY I-D.ietf-tram-stunbis SYSTEM
"http://xml2rfc.ietf.org/public/rfc/bibxml3/reference.I-D.ietf-tram-stunbis">
<!ENTITY rfc5766 SYSTEM
"http://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.5766.xml">
<!ENTITY I-D.ietf-ice-trickle SYSTEM
"http://xml2rfc.ietf.org/public/rfc/bibxml3/reference.I-D.ietf-ice-trickle">
]>

<?rfc toc='yes'?>
<?rfc rfcprocack="yes" ?>
<?rfc symrefs="yes" ?>
<?rfc iprnotified="no" ?>
<?rfc strict="yes" ?>
<?rfc compact="yes" ?>
<?rfc subcompact="no" ?>
<?rfc sortrefs="yes" ?>
<?rfc colonspace='yes' ?>
<?rfc tocindent='yes' ?>

<rfc ipr="trust200902" category="std" docName="draft-williams-peer-redirect-04">
  <front>
    <title abbrev="Peer Redirect for TURN">Peer-specific Redirection for
      Traversal Using Relays around NAT (TURN)</title>
    <author initials="B." surname="Williams" fullname="Brandon Williams">
      <organization abbrev="Akamai">Akamai, Inc.</organization>
      <address>
        <postal>
          <street>8 Cambridge Center</street>
          <city>Cambridge</city>
          <region>MA</region>
          <code>02142</code>
          <country>USA</country>
        </postal>
        <email>brandon.williams@akamai.com</email>
      </address>
    </author>
    <author fullname="Tirumaleswar Reddy" initials="T." surname="Reddy">
      <organization abbrev="Cisco">Cisco Systems, Inc.</organization>
      <address>
        <postal>
          <street>Cessna Business Park, Varthur Hobli</street>
          <street>Sarjapur Marathalli Outer Ring Road</street>
          <city>Bangalore</city>
          <region>Karnataka</region>
          <code>560103</code>
          <country>India</country>
        </postal>
        <email>tireddy@cisco.com</email>
      </address>
    </author>
    <date year="2015" />
    <abstract>
      <t>This specification describes a peer-specific redirection method that
        allows the TURN server to redirect a client for the purpose of
        improving communication with a specific peer without negatively
        affecting communication with other peers.</t>
    </abstract>
  </front>

  <middle>
    <section anchor="intro" title="Introduction">

      <t>A Traversal Using Relays around NAT (TURN) <xref target="RFC5766" />
        service provider may provide multiple candidate TURN servers for use
        by a host, but it might not be possible to determine which candidate
        TURN server will provide the best performance until both peers have
        been identified. This could be true for a variety of reasons,
        including:
          <list style="symbols">
            <t>Using the selected relay for a specific peer results in a
              sub-optimal end-to-end Internet path.</t>
            <t>Load conditions on the selected relay have changed since the
              allocation was established such that it cannot support the
              new data flow.</t>
          </list>
        At the same time, the above conditions might apply to one peer but not
        another, such that it would be best to selectively use the existing
        relay allocation for peers that will receive reasonable performance
        and redirect data flows for other peers to an alternate server. These
        scenarios are discussed in greater detail below.</t>

      <t>The Session Traversal Utilities for NAT (STUN) protocol <xref
        target="I-D.ietf-tram-stunbis" /> defines an ALTERNATE-SERVER
      mechanism with which a server can redirect a client to another server by
      replying to a request message with an error response with error code 300
      (Try Alternate). The TURN protocol describes error code 300 as one of
      the possible error codes for an Allocate error response.</t>
        
      <t>This specification describes an additional use of the
        ALTERNATE-SERVER STUN attribute for TURN that allows the TURN server
        to redirect a client for the purpose of improving communication with a
        specific peer without negatively affecting communication with other
        peers.</t>

      <section title="Redirection for Performance">
        <t>Consider the following example:</t>

        <figure>
          <artwork>
                                                          Boston
                                                          Peer C
                                      Chicago              [PC]
                                       Peer B               /
TURN Relay A                  ----------[PB]-------------[TC]
San Francisco      ----------/                       TURN Relay C
    [TA]----------/                                    New York
     |
    [PA]
   Peer A
 Los Angeles
          </artwork>
        </figure>

        <t>When Peer B wishes to communicate with either Peer A or Peer C, it
          performs a DNS lookup and discovers TURN Relay C, the nearest of the
          candidate TURN servers. Peer B then sends a TURN Allocate request to
          TURN Relay C to determine the reflexive and relay candidates to
          offer. After the relay candidate has been chosen, Peer B sends a
          ChannelBind request to TURN Relay C to establish a channel for
          communication with the peer. If Peer C is the remote peer, the
          existing allocation will perform reasonably well, but if Peer A is
          the remote peer, the latency for relayed packets will be nearly
          twice as long as if TURN Relay A had been selected as the relay
          candidate. The problem is worse if Peer B wishes to communicate with
          both Peer A and Peer C, since there is no single relay candidate
          that would provide optimum performance for both peers.</t>
          
        <t>If TURN Relay C and TURN Relay A are part of a common TURN service,
          it would be possible for TURN Relay C to determine that TURN Relay A
          will provide optimal service for communication between Peer B and
          Peer A. This allows the TURN service to redirect just the data
          channel between Peer A and Peer B to TURN Relay A, thus providing
          optimal performance for both relay channels.</t>

        <t>The above example describes the problem in terms of physical
          geography instead of network geography in order to help clarify the
          discussion. However, readers should note that the problem of
          selecting a relay server to achieve optimal end-to-end routing is
          much more complicated than the above description suggests, requiring
          a detailed real-time view of network connectivity characteristics
          and the peering relationships between autonomous systems. A naive
          approach based solely on the physical location of the hosts involved
          is just as likely to produce negative results as positive ones.</t>
        
        <t>That said, a relay service provider with a broadly distributed
          system for actively monitoring network performance across the
          relevant parts of the Internet could make use of the resulting data
          set to select the optimal relay for each peer pair.</t>
      </section>

      <section title="Redirection for Load Balancing">
        <t>At the point when a relay allocation is first established, it can
          be difficult to determine how much aggregate concurrent load could
          eventually be associated with that allocation. The initiating peer
          could attempt to use that allocation for any number of peer-to-peer
          data flows over an extended period of time, during which time load
          conditions on the relay could change substantially, such that
          quality of service for already established flows would degrade if
          the relay were to accept additional flows.</t>

        <t>Under these conditions, a TURN service provider with multiple relay
          hosts and distributed capacity could improve service quality by
          redirecting data flows to a different host that has more available
          capacity.</t>
      </section>
    </section>

    <section title="Terminology">
      <t>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
        "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
        document are to be interpreted as described in
        <xref target="RFC2119" />.</t>
    </section>

    <section anchor="mechanism" title="Peer-specific Server Redirect Mechanism">
      <t>This specification describes a new STUN indication type, Redirect,
        which is used by a TURN server to notify a TURN client when better
        service could be available through an alternate TURN server. The
        Redirect indication contains an ALTERNATE-SERVER attribute to provide
        the address for the alternate TURN server.</t>
      
      <t>This specification also defines two new comprehension-optional STUN
        attributes: CHECK-ALTERNATE and XOR-OTHER-ADDRESS. The CHECK-ALTERNATE
        attribute is used by the client to request that the server perform
        peer-specific redirection. The XOR-OTHER-ADDRESS is used by the client
        to provide an alternate peer address for location identification in
        the event that the XOR-PEER-ADDRESS attribute in the CreatePermission
        or ChannelBind request is not expected to reliably serve this
        purpose.</t>

      <section anchor="allocreq" title="Forming an Allocate Request">
        <t>When forming an Allocate request, a TURN client includes a
          CHECK-ALTERNATE STUN attribute to signal to the TURN server that
          peer-specific redirection is both supported and desired.</t>

        <t>When forming a CHECK-ALTERNATE attribute, the STUN Type is TBD-CA.
          To maintain backward compatibility, this type is in the
          comprehension-optional range, which means that an <xref
            target="RFC5766" /> compliant TURN server can safely ignore
          it.</t>

        <t>The CHECK-ALTERNATE attribute has no value part and thus the
          attribute length field is 0.</t>
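
        <t>As a minimal, non-normative sketch of the encoding described
          above, the following Python fragment builds a CHECK-ALTERNATE
          attribute as a STUN type-length-value with an empty value. The
          type value 0x8FF0 is a hypothetical stand-in for TBD-CA, chosen
          only because it lies in the comprehension-optional range, and the
          function name is likewise an assumption.</t>

        <figure>
          <artwork><![CDATA[
```python
import struct

# Hypothetical placeholder for TBD-CA; the real value is IANA-assigned.
CHECK_ALTERNATE = 0x8FF0


def encode_check_alternate(attr_type: int = CHECK_ALTERNATE) -> bytes:
    """Encode CHECK-ALTERNATE: 16-bit type, 16-bit length 0, no value."""
    if not 0x8000 <= attr_type <= 0xFFFF:
        raise ValueError("type must be comprehension-optional")
    return struct.pack("!HH", attr_type, 0)
```
]]></artwork>
        </figure>

        <t>Because the value part is empty, the encoded attribute is
          exactly four bytes long and needs no padding.</t>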
      </section>

      <section title="Receiving an Allocate Request">
        <t>When a server receives an Allocate request, it first processes the
          request as per the TURN specification <xref target="RFC5766" />.
          After determining that a success response will be prepared, a TURN
          server that supports peer-specific redirection checks for a
          CHECK-ALTERNATE attribute. If one exists, the server stores this
          information as part of the allocation state. There is no need for
          the server to indicate that the attribute was accepted in the
          success response.</t>
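
        <t>The server-side bookkeeping could be sketched as follows. This
          is an illustration only: the Allocation class, the function names,
          and the 0x8FF0 placeholder for TBD-CA are assumptions, and the
          attribute walk assumes that the message body has already passed
          the normal TURN processing.</t>

        <figure>
          <artwork><![CDATA[
```python
import struct
from dataclasses import dataclass

CHECK_ALTERNATE = 0x8FF0  # hypothetical placeholder for TBD-CA


@dataclass
class Allocation:
    """Hypothetical per-allocation state kept by the server."""
    redirect_requested: bool = False


def iter_attributes(body: bytes):
    """Yield (type, value) for each STUN attribute in a message body."""
    offset = 0
    while offset + 4 <= len(body):
        attr_type, length = struct.unpack_from("!HH", body, offset)
        yield attr_type, body[offset + 4:offset + 4 + length]
        # Attribute values are padded to a multiple of 4 bytes.
        offset += 4 + ((length + 3) & ~3)


def record_check_alternate(allocation: Allocation, body: bytes) -> None:
    """Remember whether the Allocate request had CHECK-ALTERNATE."""
    allocation.redirect_requested = any(
        attr_type == CHECK_ALTERNATE
        for attr_type, _ in iter_attributes(body)
    )
```
]]></artwork>
        </figure>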
      </section>

      <section anchor="permreq" title="Forming a CreatePermission or ChannelBind Request">
        <t>When sending a CreatePermission or a ChannelBind request, the
          XOR-OTHER-ADDRESS STUN attribute allows the TURN client to provide
          an alternate peer address that can be used by the server to identify
          the network geographic location of the peer when performing the
          peer-specific redirection check.  Use of this attribute is only
          necessary if the XOR-PEER-ADDRESS already contained in the
          CreatePermission or ChannelBind request does not adequately serve
          this purpose, which should only be true when both peers require a
          TURN relay for end-to-end data flow. In this case, the TURN
          CreatePermission or ChannelBind request will provide the peer's TURN
          relay address as the XOR-PEER-ADDRESS value. If the RTT between the
          peer and its TURN relay server is very small, the TURN relay address
          might still be an appropriate address to use for the peer-specific
          redirection check. As the RTT grows, the TURN relay address will
          become less suitable for this purpose.  For this reason, it is
          generally the case that the peer's public address (i.e., its host or
          reflexive address) is a better indication of its network geographic
          location than its TURN relay address.</t>
          
        <t>When forming an XOR-OTHER-ADDRESS attribute, the STUN Type is
          TBD-XOA. To support backward compatibility, this type is in the
          comprehension-optional range, which means that an <xref
            target="RFC5766" /> compliant TURN server can safely ignore
          it.</t>

        <t>The XOR-OTHER-ADDRESS value specifies an address and port
          suitable for identification of the peer's network geographic
          location. It is encoded in the same way as XOR-MAPPED-ADDRESS
          <xref target="I-D.ietf-tram-stunbis" />.</t>
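
        <t>For illustration, the value part of such an attribute could be
          produced as in the following sketch, which applies the
          XOR-MAPPED-ADDRESS encoding (family byte, port XORed with the top
          16 bits of the magic cookie, address XORed with the magic cookie
          and, for IPv6, the transaction ID). The function name is an
          assumption, and the attribute header with the TBD-XOA type is not
          included.</t>

        <figure>
          <artwork><![CDATA[
```python
import ipaddress
import struct

MAGIC_COOKIE = 0x2112A442  # fixed STUN magic cookie


def encode_xor_address(ip: str, port: int,
                       transaction_id: bytes) -> bytes:
    """Encode the value part of an XOR-MAPPED-ADDRESS style attribute."""
    if len(transaction_id) != 12:
        raise ValueError("STUN transaction IDs are 12 bytes long")
    addr = ipaddress.ip_address(ip)
    family = 0x01 if addr.version == 4 else 0x02
    # The port is XORed with the 16 most significant cookie bits.
    x_port = port ^ (MAGIC_COOKIE >> 16)
    # IPv4 is XORed with the cookie; IPv6 with cookie + transaction ID.
    key = struct.pack("!I", MAGIC_COOKIE)
    if addr.version == 6:
        key += transaction_id
    x_addr = bytes(a ^ k for a, k in zip(addr.packed, key))
    return struct.pack("!BBH", 0, family, x_port) + x_addr
```
]]></artwork>
        </figure>

        <t>For example, the address 192.0.2.1 with port 32853 encodes to
          the eight bytes 00 01 a1 47 e1 12 a6 43.</t>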

        <t>A CreatePermission request is allowed to contain multiple
          XOR-PEER-ADDRESS attributes. When multiple peer addresses are
          provided in a CreatePermission request, it would be difficult for
          the TURN server to associate an XOR-OTHER-ADDRESS attribute with the
          correct XOR-PEER-ADDRESS. For this reason, a TURN client MUST send
          a separate CreatePermission request, containing a single
          XOR-PEER-ADDRESS attribute, for each peer address for which an
          XOR-OTHER-ADDRESS attribute will be included.</t>

        <t>The XOR-OTHER-ADDRESS attribute SHOULD NOT be included in a request
          if its value will be identical to the request's XOR-PEER-ADDRESS
          attribute. Its value would be redundant and a waste of space in the
          message.</t>
      </section>

      <section title="Receiving a CreatePermission or ChannelBind Request">
        <t>When a TURN server receives a CreatePermission or ChannelBind
          request for an allocation that included the CHECK-ALTERNATE
          attribute, it processes the request as per the TURN specification
          <xref target="RFC5766" /> plus the specific rules mentioned
          here.</t>

        <t>If an XOR-OTHER-ADDRESS attribute is present, the server validates
          the number of XOR-PEER-ADDRESS attributes. If there is more than one
          XOR-PEER-ADDRESS attribute in the request, the server MUST reject
          the request with an error response using error code 400 (Bad
          Request). If there is only one XOR-PEER-ADDRESS attribute, the
          request is accepted and the value of the XOR-OTHER-ADDRESS attribute
          is stored with the permission state for use when checking for an
          alternate server.</t>
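
        <t>The validation rule above might be sketched as follows. The
          Permission class and the check_request function are hypothetical
          names, addresses are shown as plain strings for brevity, and all
          other request processing required by TURN is omitted.</t>

        <figure>
          <artwork><![CDATA[
```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

BAD_REQUEST = 400


@dataclass
class Permission:
    """Hypothetical permission state kept by the server."""
    peer_address: str
    other_address: Optional[str] = None


def check_request(peer_addresses: List[str],
                  other_address: Optional[str],
                  ) -> Tuple[Optional[int], List[Permission]]:
    """Return (error_code, permissions); error_code None is success."""
    if other_address is not None and len(peer_addresses) > 1:
        # XOR-OTHER-ADDRESS cannot be matched to one of several peers.
        return BAD_REQUEST, []
    return None, [Permission(peer, other_address)
                  for peer in peer_addresses]
```
]]></artwork>
        </figure>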

        <t>The mechanism for deciding when and how to check for an alternate
          server is implementation dependent. This activity could be timer
          driven (e.g., check for an alternate server once every 120
          seconds), event driven (e.g., check for an alternate server on
          every permission or binding refresh), or driven by another
          mechanism appropriate for the internal implementation. Likewise,
          the decision of which specific address(es) to check for alternate
          servers is implementation dependent. A server could check for
          alternates for all active permissions, check only for permissions
          that have relayed non-ICE data, or use another selection method
          appropriate for the implementation.</t>

        <t>When checking for an alternate server for a permission where the
          XOR-OTHER-ADDRESS attribute was provided, the server SHOULD use this
          address for peer location identification. Otherwise, the server
          SHOULD use the XOR-PEER-ADDRESS value.</t>

        <t>The TURN client will retransmit a CreatePermission or ChannelBind
          request if the response is not received. <xref
            target="I-D.ietf-tram-stunbis" /> recommends that the retransmit
          timeout be greater than 500 ms, but does not require this, so it is
          important to avoid unnecessary delays in request processing. For
          this reason, the mechanism for driving alternate server checks
          SHOULD be asynchronous relative to processing of the associated
          CreatePermission or ChannelBind request and SHOULD NOT delay
          transmission of the response message.</t>
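
        <t>One way to keep the alternate server check off the response path
          is sketched below using Python's asyncio. This is only an
          illustration under stated assumptions: find_alternate is a
          hypothetical stand-in for whatever lookup the service uses, and
          the events list stands in for actually transmitting the
          response.</t>

        <figure>
          <artwork><![CDATA[
```python
import asyncio
from typing import List, Optional


def location_address(peer: str, other: Optional[str]) -> str:
    """Prefer XOR-OTHER-ADDRESS when the client supplied one."""
    return other if other is not None else peer


async def find_alternate(address: str) -> Optional[str]:
    """Hypothetical lookup; a real server would query its own data."""
    await asyncio.sleep(0.01)  # stands in for a slow measurement query
    return None


async def handle_permission_request(peer: str, other: Optional[str],
                                    events: List[str]) -> asyncio.Task:
    """Schedule the check, then answer without waiting for it."""

    async def check() -> None:
        await find_alternate(location_address(peer, other))
        events.append("check finished")

    task = asyncio.create_task(check())  # runs in the background
    events.append("response sent")       # not delayed by the check
    return task
```
]]></artwork>
        </figure>

        <t>In this sketch the response is recorded before the background
          task has had a chance to run, so the lookup time never adds to
          the request's round trip.</t>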
      </section>

      <section anchor="rediri" title="Forming a Redirect Indication">
        <t>When an alternate server for a specific permission has been
          identified, the server notifies the client using a Redirect
          indication. The server MUST NOT send Redirect indications if the
          client did not indicate support by including a CHECK-ALTERNATE
          attribute in its Allocate request.</t>

        <t>The codepoint for Redirect indication is TBD-RI.</t>

        <t>A Redirect indication MUST contain a single ALTERNATE-SERVER
          attribute to provide the address for the alternate server. The
          message MAY contain one or more XOR-PEER-ADDRESS attributes to
          indicate a subset of peer addresses for redirection. If all peer
          addresses are to be redirected, no XOR-PEER-ADDRESS attribute is
          required. The message MUST contain either a MESSAGE-INTEGRITY or
          MESSAGE-INTEGRITY-SHA256 attribute (see <xref target="security" />
          for an explanation of the rationale).</t>
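
        <t>The following sketch shows how the 16-bit STUN message type and
          the 20-byte header for a Redirect indication could be formed,
          assuming the standard STUN interleaving of method and class bits.
          The method value 0x00F is a hypothetical placeholder for TBD-RI,
          and the function names are assumptions. Attribute encoding and
          the integrity computation are not shown.</t>

        <figure>
          <artwork><![CDATA[
```python
import struct

MAGIC_COOKIE = 0x2112A442
CLASS_INDICATION = 0b01
REDIRECT_METHOD = 0x00F  # hypothetical placeholder for TBD-RI


def message_type(method: int, msg_class: int) -> int:
    """Interleave the 12 method bits with the class bits C1 and C0."""
    return ((method & 0x000F)            # M3-M0 stay in bits 3-0
            | ((msg_class & 0b01) << 4)  # C0 goes to bit 4
            | ((method & 0x0070) << 1)   # M6-M4 go to bits 7-5
            | ((msg_class & 0b10) << 7)  # C1 goes to bit 8
            | ((method & 0x0F80) << 2))  # M11-M7 go to bits 13-9


def indication_header(method: int, body_length: int,
                      transaction_id: bytes) -> bytes:
    """Build the 20-byte STUN header for an indication."""
    if len(transaction_id) != 12:
        raise ValueError("STUN transaction IDs are 12 bytes long")
    return struct.pack("!HHI",
                       message_type(method, CLASS_INDICATION),
                       body_length, MAGIC_COOKIE) + transaction_id
```
]]></artwork>
        </figure>

        <t>For comparison, the same function yields 0x0016 for the existing
          TURN Send indication (method 0x006).</t>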

        <t>Because this message codepoint is for indications only, the TURN
          client will not send a success response, and the TURN server will
          have no way to determine whether the message was received. For
          improved reliability, the TURN server MAY retransmit the indication
          multiple times, following the request retransmission semantics
          described in <xref target="I-D.ietf-tram-stunbis" />. Retransmission
          requirements for indications might differ from those for requests,
          since requests are only retransmitted if no response was received.
          For this reason, an implementation that retransmits Redirect
          indications SHOULD provide separate configuration settings to
          control the maximum number of Redirect retransmissions and the
          minimum RTO.</t>
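
        <t>A retransmission schedule driven by those two settings could be
          computed as in the sketch below, which assumes the same
          RTO-doubling behavior that STUN uses for requests over UDP. The
          function name and the default values are illustrative
          assumptions.</t>

        <figure>
          <artwork><![CDATA[
```python
from typing import List


def redirect_send_times(min_rto_ms: int = 500,
                        max_retransmissions: int = 6) -> List[int]:
    """Return send offsets in ms: first send, then doubling gaps."""
    times = [0]
    rto = min_rto_ms
    for _ in range(max_retransmissions):
        times.append(times[-1] + rto)
        rto *= 2  # each wait is twice the previous one
    return times
```
]]></artwork>
        </figure>

        <t>With the defaults shown, the indication would be sent at 0, 500,
          1500, 3500, 7500, 15500, and 31500 ms.</t>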
      </section>

      <section title="Receiving a Redirect Indication">
        <t>When a TURN client receives a Redirect indication, it checks that
          the indication contains both an ALTERNATE-SERVER attribute and
          either a MESSAGE-INTEGRITY or a MESSAGE-INTEGRITY-SHA256
          attribute, and discards the indication if it does not. If XOR-PEER-ADDRESS
          attributes are present, it checks that all specified addresses are
          recognized peers for the allocation and discards the indication if
          any addresses are not recognized. Finally, it verifies the integrity
          attribute's value and discards the message if that value is
          invalid.</t>
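
        <t>The order of these checks could be sketched as follows. The
          RedirectIndication class is a hypothetical parsed form of the
          message, addresses are shown as strings for brevity, and the
          integrity check is passed in as a callable because its details
          are defined by STUN rather than by this sketch.</t>

        <figure>
          <artwork><![CDATA[
```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Set


@dataclass
class RedirectIndication:
    """Hypothetical parsed form of a received Redirect indication."""
    alternate_server: Optional[str]
    has_integrity: bool
    peer_addresses: List[str] = field(default_factory=list)


def accept_redirect(
        msg: RedirectIndication,
        known_peers: Set[str],
        integrity_valid: Callable[[RedirectIndication], bool],
) -> bool:
    """Apply the checks in the order given above; False = discard."""
    if msg.alternate_server is None or not msg.has_integrity:
        return False
    if any(peer not in known_peers for peer in msg.peer_addresses):
        return False
    # The comparatively expensive integrity check runs last.
    return integrity_valid(msg)
```
]]></artwork>
        </figure>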

        <t>After validating the message, the ALTERNATE-SERVER value and
          associated peer addresses are delivered to the ICE implementation.
          Interactions with ICE are described below (<xref target="ice"
            />).</t>

        <t>See <xref target="security" /> below for discussion of how the
          client should respond when receiving a Redirect indication when
          redirection was not requested.</t> 
      </section>
    </section>

    <section anchor="ice" title="ICE Interactions">
      <t>With "Vanilla" ICE as defined in <xref target="RFC5245" />, candidate
        gathering is complete before the offer/answer exchange. When a client
        using standard ICE receives a valid Redirect indication, it first
        checks whether it already has an active allocation on the specified
        server. If not, it adds the new relay to its list of servers and forms
        an allocation. This generates a new relayed candidate to be added to
        the local ICE candidates list, which requires ICE restart.</t>

      <t>With trickle ICE as defined in <xref target="I-D.ietf-ice-trickle"
          />, if the end-of-candidates announcement has not yet been sent, it
        could be possible to complete the TURN Allocate request and include
        the new relayed candidate in the candidates list for the current ICE
        negotiation. However, on some implementations, it could be true that
        candidate gathering is already complete, even though the
        end-of-candidates announcement has not been sent. In other words, a
        trickle ICE agent might need to restart ICE in order to restart
        candidate gathering. If the end-of-candidates announcement has already
        been sent, ICE restart is necessary in order to add the new relayed
        candidate.</t>

      <t>It is possible for a TURN relay to send multiple Redirect indications
        on the same allocation within a short time frame, each for a different
        set of peers. For example, consider the case of a remote peer with two
        interfaces, one Wi-Fi and the other 4G on a different Internet service
        provider. It is possible that the network geography for each interface
        requires a different relay for best performance and therefore a unique
        Redirect indication. After receiving a Redirect indication that does
        not apply to all peers associated with the allocation, it might be
        beneficial for the ICE agent to delay ICE restart by a small interval
        in order to avoid restarting ICE multiple times within a short time
        frame. However, selecting a good delay interval could be difficult,
        since the time between indications could vary due to packet loss and
        retransmission timeouts.</t>

      <t>The authors have considered alternative Redirect indication formats
        that would allow all concurrent redirects to be provided in a single
        indication, which would avoid the issue described above. Feedback is
        desired on the question of whether the problem of minimizing ICE
        restarts is important enough to add greater complexity to the building
        and parsing of Redirect indications.</t>
    </section>

    <section anchor="security" title="Security Considerations">
      <t>This section considers attacks that are possible in a TURN deployment
        through the specified protocol extension, and discusses how they are
        mitigated by mechanisms in the protocol or recommended practices in
        the implementation.</t>

      <t>The specified mechanism affects the use of TURN CreatePermission
        request messages, ChannelBind request messages, and Redirect
        indication messages. Each of these TURN message types requires a STUN
        message integrity attribute (either MESSAGE-INTEGRITY or
        MESSAGE-INTEGRITY-SHA256), which limits attacks that attempt to make
        use of the specified mechanism to authenticated clients and
        servers.</t>

      <section title="Permission Flood">
        <t>A compromised TURN client could send a large number of
          CreatePermission or ChannelBind request messages with distinct peer
          address values, which would drive increased load on the TURN server.
          The mechanism described in this document does not make such an
          attack more likely, though it could make it possible to increase the
          impact of such an attack due to the additional load associated with
          determining whether an alternate server should be used by the
          client. The TURN server MAY be configured to disable or rate limit
          alternate server checks under some conditions in order to limit the
          associated load. The conditions under which it is appropriate for a
          TURN server to disable or rate limit such checks are
          implementation dependent.</t>
      </section>

      <section title="Unsolicited or Invalid Redirect Indication">
        <t>A compromised TURN server could send Redirect indications for
          allocations that did not include the CHECK-ALTERNATE attribute.  For
          a client that does not support this mechanism, receiving such
          indications is no worse than receiving messages with any other
          unrecognized message type. A client that recognizes the message type
          MUST ignore Redirect indications for allocations where
          CHECK-ALTERNATE was not specified, and in particular, to avoid
          unnecessary authentication overhead, the client SHOULD drop such
          indications before attempting to validate the message integrity
          attribute.</t>
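
        <t>The recommended ordering could be sketched as below. The
          function and parameter names are assumptions; the point of the
          illustration is only that the integrity check is never invoked
          for an allocation that did not request redirection.</t>

        <figure>
          <artwork><![CDATA[
```python
from typing import Callable


def process_redirect(redirect_requested: bool,
                     verify_integrity: Callable[[], bool]) -> bool:
    """Return True only for a solicited, authenticated indication."""
    if not redirect_requested:
        # Dropped before any integrity computation is performed.
        return False
    return verify_integrity()
```
]]></artwork>
        </figure>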

        <t>A compromised TURN server could send an invalid ALTERNATE-SERVER
          attribute value in a Redirect indication message, where the value
          refers to an unaffiliated TURN server to which the sending TURN
          server is not allowed to redirect traffic. Such an attack is already
          allowed by the use of Try Alternate errors in response to Allocate
          request messages. Use of the ALTERNATE-SERVER attribute in the
          context of peer-specific redirection does not make such an attack
          more likely, though it could make it possible to increase the scale
          of such an attack by allowing multiple ALTERNATE-SERVER attributes
          to each client, one per requested permission or binding. A client
          SHOULD ignore all future Redirect indications received from the TURN
          server after an authentication failure with any server identified
          via an ALTERNATE-SERVER attribute. A client MAY discontinue use of
          the associated TURN allocation after an authentication failure with
          any server identified via an ALTERNATE-SERVER attribute.</t>

        <t>An external attacker could send an invalid ALTERNATE-SERVER
          attribute value in a Redirect indication message. The client must
          have some way to detect when this occurs, which is the purpose of
          including a message integrity attribute in the Redirect indication.
          Without the message integrity attribute, it would be possible for an
          attacker to spoof a Redirect indication from the TURN server and
          drive the client to attempt to connect to a bad relay server.</t>
      </section>

      <section title="Replayed Redirect Indication">
        <t>An in-path attacker could capture and replay Redirect indications.
          If the client has been redirected again since the captured
          indication was first received, the replay could drive the client to carry
          out unnecessary work to establish a new allocation and restart ICE
          if the results of the previous Redirect indication have since been
          discarded. It could also drive unexpected load from the client to
          a server that has since become overloaded, potentially degrading
          performance for not only the target client but also all others now
          connected to the alternate server.</t>

        <t>Multiple potential mitigations for this attack exist. For example, a
          client that maintains a complete list of all TURN servers used
          throughout the life of the session could keep track of the
          Transaction ID for each Redirect indication received, which would
          allow the client to recognize and reject a replayed indication.
          Alternatively, a client could rate-limit its responses to Redirect
          indications, requiring a configurable interval to expire between
          Redirect indications before accepting a new one.</t>
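
        <t>The two mitigations above could be combined as in the following
          sketch. The class name, the default interval, and the use of a
          caller-supplied clock value are assumptions made for
          illustration. Note that a rate limit of this kind would also
          suppress legitimate Redirect indications that arrive close
          together for different sets of peers.</t>

        <figure>
          <artwork><![CDATA[
```python
from typing import Optional, Set


class RedirectReplayGuard:
    """Track accepted transaction IDs and rate-limit Redirects."""

    def __init__(self, min_interval_s: float = 5.0) -> None:
        self.min_interval_s = min_interval_s  # hypothetical default
        self._seen: Set[bytes] = set()
        self._last_accept: Optional[float] = None

    def accept(self, transaction_id: bytes, now: float) -> bool:
        """Return True if the indication should be acted upon."""
        if transaction_id in self._seen:
            return False  # replayed or retransmitted indication
        if (self._last_accept is not None
                and now - self._last_accept < self.min_interval_s):
            return False  # too soon after the previous Redirect
        self._seen.add(transaction_id)
        self._last_accept = now
        return True
```
]]></artwork>
        </figure>

        <t>In this sketch an indication rejected only by the rate limit is
          not recorded as seen, so a later retransmission of it can still
          be accepted.</t>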

        <t>The authors have also considered the option of adding a timestamp
          attribute to the Redirect indication message. The timestamp could be
          used to minimize the window of opportunity for a Redirect indication
          replay attack. However, such use of timestamps is fragile in the
          presence of potential clock skew problems between the client and the
          server and so has not been included in the specification.</t>
      </section>
    </section>

    <section anchor="iana" title="IANA Considerations">
      <t>[Paragraphs below in square brackets should be removed by the RFC
        Editor upon publication]</t>

      <t>[The CHECK-ALTERNATE attribute requires that IANA allocate a value in
        the "STUN Attributes Registry" from the comprehension-optional range
        (0x8000-0xFFFF), to replace TBD-CA throughout this
        document]</t>

      <t>This document defines the CHECK-ALTERNATE STUN attribute, described
        in <xref target="allocreq" />. IANA has allocated the
        comprehension-optional codepoint TBD-CA for this attribute.</t>

      <t>[The XOR-OTHER-ADDRESS attribute requires that IANA allocate a value
        in the "STUN Attributes Registry" from the comprehension-optional
        range (0x8000-0xFFFF), to replace TBD-XOA throughout this
        document]</t>

      <t>This document defines the XOR-OTHER-ADDRESS STUN attribute, described
        in <xref target="permreq" />. IANA has allocated the
        comprehension-optional codepoint TBD-XOA for this attribute.</t>

      <t>[The Redirect indication codepoint requires that IANA allocate a
        value in the "STUN Methods Registry", to replace TBD-RI
        throughout this document.]</t>

      <t>This document defines the Redirect indication method type, described
        in <xref target="rediri" />. IANA has allocated the codepoint TBD-RI
        for this method type.</t>
    </section>

    <section anchor="Acknowledgements" title="Acknowledgements">
      <t>Many thanks to J. Uberti for his suggestions regarding ICE
        interactions.</t>
    </section>
  </middle>

  <back>
    <references title="Normative References">

      &rfc2119;
      &rfc5766;
      &I-D.ietf-tram-stunbis;

    </references>
    <references title="Informative References">

      &rfc5245;
      &I-D.ietf-ice-trickle;

    </references>

    <section anchor="change_history" title="Change History">
      <t>[Note to RFC Editor: Please remove this section prior to
        publication.]</t>

      <section title="Changes from version 03 to 04">
        <t>Introduced Redirect indication to redefine the mechanism as
          push-like notification.</t>
        <t>Moved CHECK-ALTERNATE to Allocate request.</t>
        <t>Added a short section on ICE interactions.</t>
        <t>Changed STUN reference to STUNbis, since the doc now references
          STUNbis content. Left other references as they are.</t>
      </section>

      <section title="Changes from version 02 to 03">
        <t>Minor copy-editing.</t>
      </section>

      <section title="Changes from version 01 to 02">
        <t>Add warning about the difference between physical geography and
          network geography.</t>
        <t>Add load balancing use case.</t>
      </section>

      <section title="Changes from version 00 to 01">
        <t>Expand discussion of when/how to use CHECK-ALTERNATE and
          XOR-OTHER-ADDRESS.</t>
      </section>
    </section>
  </back>
</rfc>
