<?xml version="1.0" encoding="US-ASCII"?>
<!DOCTYPE rfc SYSTEM "rfc2629.dtd">
<?rfc toc="yes"?>
<?rfc tocompact="yes"?>
<?rfc tocdepth="3"?>
<?rfc tocindent="yes"?>
<?rfc symrefs="yes"?>
<?rfc sortrefs="yes"?>
<?rfc comments="yes"?>
<?rfc inline="yes"?>
<?rfc compact="yes"?>
<?rfc subcompact="no"?>
<rfc category="std" docName="draft-reddy-mmusic-stun-auth-fw-traversal-00" ipr="trust200902">
  <front>
    <title abbrev="STUN extn for FW Traversal">
	STUN Extensions for Firewall Traversal</title>

    <author fullname="Tirumaleswar Reddy" initials="T." surname="Reddy">
      <organization abbrev="Cisco">Cisco Systems, Inc.</organization>

      <address>
        <postal>
          <street>Cessna Business Park, Varthur Hobli</street>

          <street>Sarjapur Marathalli Outer Ring Road</street>

          <city>Bangalore</city>

          <region>Karnataka</region>

          <code>560103</code>

          <country>India</country>
        </postal>

        <email>tireddy@cisco.com</email>
      </address>
    </author>

    <author fullname="Muthu Arul Mozhi Perumal" initials="Muthu A M."
            surname="Perumal">
      <organization abbrev="Cisco">Cisco Systems, Inc.</organization>

      <address>
        <postal>
          <street>Cessna Business Park, Varthur Hobli</street>

          <street>Sarjapur Marathalli Outer Ring Road</street>

          <city>Bangalore</city>

          <region>Karnataka</region>

          <code>560103</code>

          <country>India</country>
        </postal>

        <email>mperumal@cisco.com</email>
      </address>
    </author>

    <author fullname="Ram Mohan Ravindranath" initials="Ram Mohan" surname="Ravindranath">
      <organization abbrev="Cisco">Cisco Systems, Inc.</organization>

      <address>
        <postal>
          <street>Cessna Business Park, Varthur Hobli</street>

          <street>Sarjapur Marathalli Outer Ring Road</street>

          <city>Bangalore</city>

          <region>Karnataka</region>

          <code>560103</code>

          <country>India</country>
        </postal>

        <email>rmohanr@cisco.com</email>
      </address>
    </author>

    <author fullname="Dan Wing" initials="D." surname="Wing">
      <organization abbrev="Cisco">Cisco Systems, Inc.</organization>

      <address>
        <postal>
          <street>170 West Tasman Drive</street>

          <city>San Jose</city>

          <region>California</region>

          <code>95134</code>

          <country>USA</country>
        </postal>

        <email>dwing@cisco.com</email>
      </address>
    </author>

    <date />

    <workgroup>MMUSIC</workgroup>

    <abstract>
      <t>Some networks deploy firewalls configured to block UDP traffic.
      When SIP user agents or WebRTC endpoints are deployed
      behind such firewalls, media cannot be sent over UDP across
      the firewall, but must be sent using TCP (which causes a different
      user experience) or through a session border controller.</t>

      <t>This draft describes an alternate model wherein extensions to
      ICE connectivity checks can be examined by the firewall to permit
      outgoing UDP flows across the firewall.</t>
    </abstract>
  </front>

  <middle>
    <section anchor="introduction" title="Introduction">
      <t>To protect networks using real-time communications, firewalls
      or session border controllers are typically deployed.</t>

      <t>Firewalls include Application Layer Gateway functionality, which 
      intercepts and analyzes the session signaling traffic such as the
      Session Initiation Protocol (SIP) traffic and creates dynamic mapping to
      permit the media traffic. In particular, firewall extracts the media
      transport addresses, transport protocol and ports from the session description and creates
      dynamic mapping for media to flow through. This model has the following
      problems:
      <list style="numbers">
          <t>It does not work if the session signaling is end-to-end encrypted
          (say, using TLS).</t>

          <t>It does not work if a non-standard session signaling is used that
          the firewall does not understand.</t>

          <t>It does not work if the session signaling and media traverse
          different firewalls.</t>
      </list></t>

      <t>When an enterprise deploys WebRTC, the above problems are relevant
      because: 
      <list style="numbers">
          <t>The session signaling between the WebRTC application running in
          the browser and the web server could be using TLS.</t>

          <t>WebRTC does not enforce a particular session signaling protocol to
          be used. So, the firewall may not be able to understand it.</t>

          <t>This session signaling and the peer-to-peer media may traverse
          different firewalls.</t>
      </list></t>

      <t>As a result the firewall may block ICE connectivity
      checks and media traffic.</t>
	  
      <t>Session Border Controllers (SBCs) are active participants with call
      signaling. Like firewalls, they also create dynamic mappings to
      permit media traffic. This forces call signaling and media through
      specific IP addresses, belonging to the SBC or an SBC-controlled media
      relay device.</t>

      <t>TURN is also used as an alternative to permit media traffic, 
       i.e. Use TCP transport between the client and TURN server because Firewalls are configured to block UDP entirely. </t>
     
     <t>The use-case is explained in Section 4.2.4.1 of <xref target="I-D.ietf-rtcweb-use-cases-and-requirements"></xref> refers to 
       deploying a TURN server to audit all media sessions from inside the company premises to any external peer. 
       </t>

     <t>Using TURN for all such communication has the following problems:<list
            style="symbols">
            <t>Single TURN server will result in single point of failure.</t>
            
            <t>TURN server could increase media latency and high-end TURN server would be needed to cater to all such calls.</t>

            <t>TURN server is just providing the 5-tuple details 
         (source IP address, destination IP address, protocol number, source port number,  and destination port number) 
         but no other details of the WebRTC server using which the call is initiated</t>

            <t>Enterprise firewalls would typically have granular policies to permit call initiated 
            using selected WebRTC servers (Dr.Good) it trusts and block others (Dr.Evil).</t>

            <t>It comes at a high cost to the provider of the TURN server, since the
   server typically needs a high-bandwidth connection to the Internet.  
   As a consequence, it is best to use a TURN server only when a direct
   communication path cannot be found. When the client and a peer use ICE
   to determine the communication path, ICE will use hole punching techniques
   to search for a direct path first and only use a TURN server when a direct
   path cannot be found.</t>

   <t>The value of the Diffserv field may not be preserved.</t>
   <t>The Explicit Congestion Notification (ECN) field may be reset.</t>

          </list></t>

   <t>This draft has a solution where an authorized server 
       (could be a Call Agent or a WebRTC server ) 
       generate a cryptographic token which
      is passed to the endpoints. The endpoint includes the token in its
      ICE connectivity checks. The firewall intercepts the ICE connectivity
      checks containing the token, validates it, and permits the ICE
      connectivity checks and the subsequent media flow through the firewall.
      </t>

    </section>

    <section anchor="notation" title="Notational Conventions">
      <t>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
      "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
      document are to be interpreted as described in <xref
      target="RFC2119"></xref>.</t>

      <t>This note uses terminology defined in <xref
      target="RFC5245"></xref>.</t>
    </section>

   <section anchor="problem_stmt" title="Problem Statement">
      <t>In the below topology, an webRTC Server is deployed in the enterprise
      Data Center. Alice makes a webRTC call to Bob. For the two endpoints to
      successfully establish media sessions, firewalls FW1 and FW2 need to
      permit the ICE connectivity checks and media traffic. In such scenarios
      the mechanism described in this draft proposes a new
      comprehension-optional FW-FLOWDATA STUN attribute to be included in STUN
      Bind requests sent during ICE connectivity checks so that firewalls will
      permit media traffic between internal peers. This STUN attribute is 
      created by the trusted WebRTC server and sent to the endpoints to be
      propagated by the respective ICE agents during ICE connectivity checks.
      </t>

      <t><figure anchor="Figure1"
          title="WebRTC service in enterprise - internal call">
          <artwork align="left"><![CDATA[               
               =========================
               |  WebRTC Server        |
               =========================
                         |  Data Center
                         |  
                         | 
                 ==================
                 |    WAN         |-----+-+-------+---+----+-+-----+
                 ==================                        |
                           |              Branch office 2  |
  Branch office 1          |                               |  
                           |                               | 
                   +-------+-------+               +----+-------+
                   | Firewall 1    |               | Firewall 2 |
                   |               |               |            |
                   +-------+-------+               +----+-------+
                           |                               |
                           |                               |
                           |                               |
   ---+-+-----+-----------+-+-----+--------         -----+-+-----+------
                           |                               |
                        +-+------+                     +--------+
                        | Alice  |                     | Bob    |
                        +--------+                     +--------+   
]]></artwork>
        </figure></t>

      <t></t>
    </section>

    <section anchor="sol_overview" title="Solution Overview">
      <t>This section gives an overview of the solution and the different
      components involved in the solution and the role of each component.</t>

      <section title="Different Components and the Trust model">
        <t>Figure 1 above shows typical components involved in a WebRTC call
        scenario. As part of the call setup, the WebRTC endpoint would have to
        gather its candidates from a STUN/TURN server, send the candidates in
        the offer to the peer endpoint. On receiving the answer from the peer
        endpoint it starts the  ICE connectivity checks. As discussed in the 
        problem statement, firewalls would typically block these ICE
        connectivity checks and media flowing there after. To allow this traffic
        a firewall needs to authorize the flow.</t>

        <t><list style="symbols">
            <t>A new comprehensive optional STUN attribute called FW-FLOWDATA
            is defined as part of this draft. This is used by WebRTC endpoints
            requiring firewall traversal.</t>

            <t>This STUN attribute FW-FLOWDATA is generated by the WebRTC
            server in co-ordination with the WebRTC endpoint.</t>

            <t>Once the WebRTC session ends the firewall's dynamic mappings are
            closed after timeout.</t>

            <t>DISCUSSION: Could we could have a FW-FLOWDATA attribute sent in a
            STUN message to close the dynamic mappings in the firewalls?</t>
          </list></t>
      </section>
    </section>

    <section anchor="usage" title="Usage and Processing">
      <t>An RTP endpoint which generates media can include the FW-FLOWDATA
      attribute in its STUN Binding requests used in ICE connectivity checks, to
      inform on-path firewalls to permit the flow.</t>

      <section title="Generating  FW-FLOWDATA Attribute">
        <t>The WebRTC server after processing the OFFER/ANSWER sends the 
        FW-FLOWDATA STUN attribute to both the peers to be included in the ICE
        connectivity checks. The Authentication Tag field in the FW-FLOWDATA
        attribute contains the digest of the FW-FLOWDATA attribute for data 
        origin authentication and integrity protection. The server first selects
        the candidate address info based on OFFER/ANSWER exchange and generates
        other fields of this attribute. The server then computes a digest for
        the FW-FLOWDATA attribute using HMAC-SHA1. The key for HMAC-SHA1 is
        provisioned using the technique in <xref target="key_prov"></xref>.
        The result of which is truncated to 96 bits (retaining the left most
        bits) to produce HMAC-SHA-1-96 and input into the Authentication Tag
        field. The mechanism to send FW-FLOWDATA attribute from the WebRTC
        server to the cient is outside the scope of this draft. But it is
        assumed that signalling protocols used for WebRTC call setup will be
        enhanced to deliver this new attribute to the WebRTC client. The WebRTC
        server MUST provide a new FW-FLOWDATA to allow the media session to
        continue before Lifetime expires.</t>
      </section>

      <section title="Sending FW-FLOWDATA Attribute in Binding Request">
        <t>Once a WebRTC endpoint receives the FW-FLOWDATA, it is responsible
        for generating the STUN message and retransmitting the transactions
        per the STUN specification. The FW-FLOWDATA attribute should be placed
        before the FINGERPRINT attribute (if present) and after the
        MESSAGE-INTEGRITY attribute. The STUN length field is adjusted to
        point to the new end of the STUN message; that is, the STUN length
        field always accurately indicates the length of the STUN message
        (including the MESSAGE-INTEGRITY, FINGERPRINT, and FW-FLOWDATA
        attributes). This does not interfere with 3rd party receivers of the
        STUN message, as they will adjust the STUN length field to point to the
        end of the MESSAGE-INTEGRITY field. Receivers that do not understand the
        FW-FLOWDATA will ignore it.</t>

        <t>FW-FLOWDATA attribute received by the WebRTC client is passed to
        the web browser's ICE agent (API to be added in in W3C WebRTC-API
        specification [I.D.w3c-webrtc]). The ICE agent includes the FW-FLOWDATA
        attribute with all ICE connectivity checks, so that on-path firewalls
        can validate and permit the ICE connectivity checks and forthcoming
        media. The token MUST included in the ICE binding indication packets (keepalive) 
        (In case the lifetime expires)</t>

        <t>For the FW-FLOWDATA attribute to be visible to the firewalls
        between the client and the TURN server, the FW-FLOWDATA should be
        included in the ALLOCATE request, channel bind or refresh messages going
        to the TURN server. This is to avoid firewalls having to look for STUN
        packets within STUN (TURN) packets.</t>
      </section>

      <section title="Firewalls processing FW-FLOWDATA Attribute">
        <t>Firewalls can reliably determine a UDP message is a STUN message
        because all STUN messages sent as ICE connectivity checks include the
        32-bit STUN magic cookie and the FINGERPRINT attribute. STUN messages
        which are authenticated also include a MESSAGE-INTEGRITY attribute
        which authenticates the fields prior to the MESSAGE-INTEGRITY.</t>

        <t>When the firewall receives a STUN binding request with FW-FLOWDATA
        attribute it stores the Authentication Tag in the FW-FLOWDATA
        attribute. The firewall then generates a digest for the FW-FLOWDATA
        attribute using HMAC-SHA1. The result of which is truncated to 96 bits
        (retaining the left most bits) to produce HMAC-SHA-1-96. If the value of
        the newly generated digest HMAC-SHA-1-96 is identical to the stored one, 
        the firewall can ensure that the FW-FLOWDATA attribute has not been
        tampered with. Otherwise the packet is discarded.</t>

        <t>To facilitate timestamp checking for replay attacks, each firewall
        should perform the following check for each message:</t>

        <t>When a message is received, the received timestamp, TSnew, is checked, and
        the packet is accepted if the timestamp is recent enough to the
        reception time of the packet, RDnew:</t>

        <t> Lifetime + Delta &gt; (RDnew - TSnew) </t>

        <t>The recommended value for the allowed Delta is 30 seconds. If the
        timestamp is NOT within the boundaries then discard the STUN message.
        </t>

        <t>The firewall also performs the following checks:<list
            style="symbols">
            <t>Ensures that the source IP address and UDP port of the packet
            matches with one of the local CAI entries in
            the payload except for peer-reflexive cases.
            </t>

            <t>Ensures the destination IP address and UDP port of the packet
            matches with one of the local CAI entries in
            the packet payload except for peer-reflexive cases.</t>

            <t>Firewall if located after NAT(peer-reflexive cases) can skip CAI
             processing (It can be configurable option). For peer-reflexive case, destination CAI MUST
             match in case of outgoing STUN packet and source CAI MUST match incase of 
             incoming STUN packet</t>

          </list></t>

        <t>If all the above checks pass then the firewall creates the 5-tuple
        dynamic mapping using the local candidate IP address, local candidate
        port, remote candidate IP address, remote candidate port, transport
        protocol. The session time of the dynamic mapping will be set to a
        short lifetime (default value of 60 seconds).</t>

        <t>If the initial ICE connectivity check includes the ICE-CONTROLLING
   attribute but does not include USE-CANDIDATE, ICE connectivity check is successful
   and a subsequent ICE connectivity check includes both these attributes, the firewall can
   determine that the ICE agent is the controlling agent using regular
   nomination and this candidate pair is nominated for media flow.  The
   firewall then sets the session time of the dynamic mapping equal to
   the Lifetime field in FW-FLOWDATA attribute.</t>

        <t>If the initial ICE connectivity check includes the ICE-CONTROLLING
   attribute and the USE-CANDIDATE attribute, firewall can determine
   that the ICE agent is the controlling agent using aggressive using
   nomination.  If the ICE connectivity check is successful It then waits for the media traffic to
   flow before setting the session time of the dynamic mapping equal to Lifetime
   field in FW-FLOWDATA attribute.</t>

       <t>DISCUSSION: If WebRTC implementations of RTP support multiplexing
        of multiple media sessions onto a single RTP session, FW-FLOWDATA
        attribute can be enhanced to carry a flag indicating the same so that
        firewall can immediately close the dynamic mapping created for other
        pairs in the ICE checklist once media starts flowing on one the
        candidate pairs. In case of multi-homing firewalls can track multiple
        host IP addresses using authentication supplicant or, for hosts
        lacking the supplicant, use address-based authentication method.</t>
      </section>
    </section>

    <section anchor="attr_format" title="STUN Attribute Format">
      <t><figure anchor="Figure3" title="FW-FLOWDATA Attribute">
          <artwork align="left"><![CDATA[
       0                   1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                          Lifetime                             |  
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                          Nonce                                |  
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+      
      |                          Timestamp                            |  
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      | LCA  Count    |  RCA  Count   |    Reserved                   |  
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                                                               |
      |                 Candidate Address Info                        |
      |                                                               |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                                                               |
      |                 Authentication Tag                            |
      |                                                               |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+                ]]></artwork>
        </figure><list style="hanging">
          <t hangText="Lifetime:">32-bit unsigned integer. The length of time in
          seconds that the STUN attribute is valid for the purpose of firewall
          creating dynamic mapping. The lifetime of the firewall dynamic
          mapping is set to this value. After the lifetime expires the mapping
          is deleted, unless the lifetime is extended using a another
          FW-FLOWDATA attribute.</t>

          <t hangText="Nonce:">96-bit unsigned integer. Random value chosen by
          the WebRTC Server that uniquely identifies the STUN attribute.</t>

          <t hangText="Timestamp:">64-bit unsigned integer field containing a
          timestamp. The value indicates the number of seconds since January
          1, 1970, 00:00 UTC, by using a fixed point format. In this format,
          the integer number of seconds is contained in the first 48 bits of
          the field, and the remaining 16 bits indicate the number of 1/64K
          fractions of a second.</t>

          <t hangText="LCA Count:">8-bit unsigned integer. Number of local
          candidate addresses. </t>

          <t hangText="RCA Count:">8-bit unsigned integer. Number of remote
          candidate addresses.</t>

          <t hangText="Reserved:">16-bit unsigned integer. An unused field. It
          MUST be initialized to zero by the sender and MUST be ignored by the
          receiver.</t>

          <t hangText="Candidate Address Info:"><figure>
              <artwork><![CDATA[       0                   1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |Family         | Protocol        |         Port                |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                                                               |
      |                 Address (32 bits or 128 bits)                 |
      |                                                               |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+]]></artwork>
            </figure>It consists of an 8-bit address family, L4 protocol (for
          example 17 for UDP, 6 for TCP) and a 16-bit port, followed by a
          fixed-length value representing the IP address. If the 16-bit port
          value is 0 it indicates "all ports". The address family can take on
          the following values: 0x01:IPv4 and 0x02:IPv6. 
          An endpoint may send Zero or more CAI in its FLOWDATA</t>

          <t hangText="Authentication Tag:">A 96-bit field that carries the
          Message Authentication Code for the FW-FLOWDATA STUN attribute.</t>
        </list></t>
    </section>

    <section anchor="key_prov" title="Key Provisioning">
      <t>Static keys are preconfigured, either manually or through a network
      management system. The simplest way to implement FW-FLOWDATA validation
      is to use static keys. The provisioning of static keys requires either
      manual operator intervention on the WebRTC Server and each firewall in
      the enterprise or a network management system performing the same
      task.</t>

      <t>Alternatively using Dynamic Group Key Distribution, group keys are
      dynamically distributed among the WebRTC server and enterprise firewalls
      using GDOI <xref target="RFC6407"></xref>. In this way, each firewall
      requests a group key from a key server as part of an encrypted and
      integrity-protected key agreement protocol. Once the key server has
      authenticated and authorized the firewalls, it distributes a group key
      to the group member. The authentication in this model can be based on
      public key mechanisms, thereby avoiding the need for static key
      provisioning.</t>
    </section>

    <section anchor="security" title="Security Considerations">
      <t>Hosts using WebRTC calls will see lot of FW-FLOWDATA attributes. They
      determine the key by trying a number of candidate keys and seeing if one
      of them is correct. The attack works when the keys have low entropy, such
      as a word from the dictionary. This attack can be mitigated by using
      strong keys with large entropy. In situations where even stronger
      mitigation is required, the keys can be dynamically changed using GDOI.
      The WebRTC server controls how long a firewall session is kept open via
      the Lifetime value and WebRTC server could use different Lifetime values
      depending on the anticipated level of trust of the device (e.g. company
      provided laptop might be trusted more than a Bring Your Own Device
      (BYOD)); the device with more trust need to obtain its authentication
      attribute less often). Firewalls in addition to timestamp checking can
      also maintain a cache of used Nonces, IP source addresses associated 
      with used Nonces as an effective countermeasure against replay attacks.</t>

      <t>All the security considerations applicable to STUN <xref
      target="RFC5389"></xref> and ICE <xref target="RFC5245"></xref> are
      applicable to this document as well.</t>
    </section>

    <section anchor="iana" title="IANA Considerations">
      <t>Allocate new STUN attribute value for FW-FLOWDATA from the <xref
      target="STUN-ATTR"></xref> registry.</t>
    </section>

    <section anchor="ack" title="Acknowledgements">
      <t>The authors would like to thank Prashanth Patil and Ramesh Nethi for
      review comments.</t>
    </section>
  </middle>

  <back>
    <references title="Normative References">
      <?rfc include="reference.RFC.2119"?>

      <?rfc include="reference.RFC.5245"?>

      <?rfc include="reference.RFC.5389"?>

       <?rfc include="reference.RFC.6407"?>
   </references>

    <references title="Informative References">
	  <?rfc include='reference.I-D.ietf-rtcweb-use-cases-and-requirements'?>
	  
      <reference anchor="STUN-ATTR"
                 target="http://www.iana.org/assignments/stun-parameters/stun-parameters.xml#stun-parameters-3">
        <front>
          <title>IANA: STUN Attributes</title>

          <author fullname="IANA" surname="IANA">
            <organization>IANA</organization>
          </author>

          <date month="December" year="2011" />
        </front>
      </reference>

      <!---->
    </references>
  </back>
</rfc>
