<?xml version="1.0"?>
<!DOCTYPE rfc SYSTEM "rfc2629.dtd">
<?xml-stylesheet type="text/xsl" href="rfc2629.xslt" ?>
<?rfc strict="yes" ?>
<?rfc toc="yes"?>
<?rfc tocdepth="4"?>
<?rfc symrefs="yes"?>
<?rfc sortrefs="yes" ?>
<?rfc compact="yes" ?>
<?rfc subcompact="no" ?>
<rfc category="std" ipr="trust200902">
  <front>
    <title abbrev="Tao of Web">
      Media API Requirements for Real-Time Interoperation of Web User Agents
    </title>

    <author initials="M" surname="Thomson" fullname="Martin Thomson">
      <organization>Microsoft</organization>
      <address>
        <postal>
          <street>3210 Porter Drive</street>
          <city>Palo Alto</city>
          <region>CA</region>
          <code>94304</code>
          <country>US</country>
        </postal>
        <phone>+1 650-353-1925</phone>
        <email>martin.thomson@skype.net</email>
      </address>
    </author>

    <author initials="B" surname="Aboba" fullname="Bernard Aboba">
      <organization>Microsoft</organization>
      <address>
        <postal>
          <street>One Microsoft Way</street>
          <city>Redmond</city>
          <region>WA</region>
          <code>98052</code>
          <country>US</country>
        </postal>
        <email>bernard_aboba@hotmail.com</email>
      </address>
    </author>

    <date month="July" year="2012"/>
    <area>RAI</area>
    <workgroup>RTCWEB</workgroup>
    <keyword>API</keyword>
    <keyword>Web</keyword>
    <keyword>Real-Time</keyword>
    <keyword>Realtime</keyword>

    <abstract>
      <t>Direct, real-time communications between web user agents (browsers) requires two points of standardization.  The first describes the on-the-wire protocol that is used.  The second is the API that controls and configures what gets put on the wire and how.  The capabilities of the protocol determines the nature of the API.  This document outlines how the constraints imposed upon the API by the protocol.
      </t>
    </abstract>
  </front>

  <middle>
    <section title="Introduction">
      <t>The exchange of media - audio and video - between peers is a basic foundation of telecommunications.  The fact that the web platform has successfully managed to avoid this fundamental feature for so long is a testament to the usefulness of its other features.  The WebRTC/rtcweb effort attempts to remedy this shortcoming.
      </t>

      <t><xref target="arch"/> shows the architecture for real-time communications on the web.
      </t>
      <figure anchor="arch" title="Architecture for the Real-Time Web">
<artwork><![CDATA[
                      _________
                    ((         ))
                 .-(( The Cloud ))-.
             ?  /   ((_________))   \ ?
               /                     \
 +------------/----+               +--\--------------+
 | +---------/---+ |               | +-\-----------+ |
 | | Javascript  | |               | | Javascript  | |
 | |             | |               | |             | |
 | +--<W3C API>--+ |     IETF      | +--<W3C API>--+ |
 |                 === protocols ===                 |
 | Web User Agent  |               | Web User Agent  |
 +-----------------+               +-----------------+
]]></artwork>
      </figure>

      <t>Two points of interoperation are necessary in this architecture: API and peer-to-peer protocols.  Critically, the points marked with a '?' are not subject to standardization.  These points are effectively internal to the application that exists both in "The Cloud" and in the Javascript virtual machine provided by the web user agent.
      </t>

      <t>The IETF rtcweb working group has settled on <xref target="RFC3550">Real-Time Transport Protocol (RTP)</xref>, <xref target="RFC3711">Secure RTP (SRTP)</xref> and <xref target="RFC5245">Interactive Connectivity Establishment (ICE)</xref> for the peer to peer protocol components.  Use of RTP is documented in <xref target="I-D.ietf-rtcweb-rtp-usage"/>; use of SRTP and ICE is described in <xref target="I-D.ietf-rtcweb-security-arch">the rtcweb security architecture</xref>.
      </t>

      <t>ICE and RTP are unable to operate without an out-of-band communications channel.  For ICE this provides bootstrapping information; for RTP, a common understanding of the key protocol parameters.  Though SRTP can operate without an out-of-band channel if certain assumptions are made, many uses depend on some form of out-of-band communication.  The out-of-band channel is provided by the web application.  The necessary information passes through the API to the browser.
      </t>

      <t>This document describes what information must pass between a web application and a web user agent in order to successfully establish peer-to-peer media communications.  No attempt is made to constrain what the API looks like.  After all:
      <list style="empty">
        <t>[...] when [a] IETF documents start telling you how to build Javascript APIs, you should run far away... quickly.  :)<vspace/> &nbsp; &nbsp; &nbsp; -- <xref target="I-D.kaplan-rtcweb-api-reqs"/>
        </t>
      </list>
      </t>

      <section title="Terminology">
	<t>In this document, the key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" are to be interpreted as described in BCP 14, <xref target="RFC2119">RFC 2119</xref> and indicate requirement levels for compliant implementations.
	</t>
      </section>
    </section>

    <section anchor="ice" title="Peer-to-Peer Transport">
      <t><xref target="RFC5245">ICE</xref> provides a mechanism for establishing peer-to-peer communications using UDP.
      </t>
      <t>The ICE process applies to a single UDP flow from a source address and port to a destination address and port.  Where multiple flows are required, the process is applied once for each flow.
      </t>

      <section title="Overview of ICE Operation">
        <t>In ICE, for each UDP flow, each peer performs the following steps:
        <list style="symbols">
          <t>Candidate gathering.  A candidate is an UDP port with associated session authentication parameters.  Candidates are attributed to the peer, but they can be collected from the local host, by querying servers about the server perspective of the peer address, or a <xref target="RFC5766">TURN</xref> relay can allocate server-based ports that forward packets to the peer.
          </t>
          <t>Candidate exchange.  Each peer learns of the candidates gathered by the other peer.
          </t>
          <t>Candidate pair testing.  Candidates are paired, one local against one remote.  A connectivity check is performed where a <xref target="RFC5389">STUN</xref> packet is sent from the local candidate to the remote.  The pair is valid if a response arrives.  Packets might be lost, so this process is repeated.
          </t>
          <t>Candidate pair selection.  As soon as one or more candidates are marked as valid, one of these can be selected and used.
          </t>
        </list>
        </t>
      </section>

      <section title="Transport ICE Requirements">
        <t>The following requirements ensure that an application is able to establish a UDP flow between peers with proper consent:
        <list style="format ICE-%d">
          <t>The application MUST be able to request the gathering of host candidates.  Candidates are comprised of an IP address (v4 or v6), a UDP port number, an ICE username fragment and an ICE password.
          </t>
          <t>The application MUST be able to request the gathering of server reflexive address information using STUN.  Input to this is the identity of a STUN server and any credentials required by the server.
          </t>
          <t>The application MUST be able to request the allocation of TURN relay candidates.  Input to this is the identity of a TURN server and any credentials required by the server.
          </t>
          <t>The application MUST be able to request that a connectivity check be sent to a remote candidate and to be informed of success.  Input to this is a choice of local candidate and details of the remote candidate.
          <vspace blankLines="1"/>
          This establishes: whether the UDP packet is able to reach the remote peer and whether the remote peer consents to the receipt of packets from this peer.
          </t>
          <t>The application MUST be able to add STUN attributes to the STUN messages that are sent for connectivity checks.
          </t>
          <t>The application MUST be able to examine STUN attributes received on successful responses to connectivity checks.
          </t>
          <t>The application MUST be able to retrieve peer reflexive addresses that are discovered when connectivity checks are successful.
          </t>
          <t>The application MUST be able to request the creation of a UDP flow on a valid candidate pair.
          </t>
          <t>The application MUST be notified when a successful connectivity check is received from remote peer.  The notification MUST include peer addressing information.
          </t>
        </list>
        </t>
      </section>
      <section title="Established Flow Requirements">
        <t>The following requirements apply to established flows:
        <list style="format UDP-%d">
          <t>The application MUST be able to send connectivity checks on an active UDP flow to test liveness of the flow.
          </t>
          <t>The application MUST be notified when consent for an active UDP flow expires.
          </t>
          <t>The application MUST be able to retrieve the bandwidth that the remote peer consents to receive for the UDP flow.
          </t>
          <t>The application MUST be able to learn the bandwidth limit that the browser has detected for the flow based on feedback from the network.
          </t>
          <t>The application MUST be notified when the browser detects changes in the available bandwidth for the UDP flow.
          </t>
          <t>The application MUST be notified when the browser detects network congestion that affects the UDP flow.
          </t>
          <t>The application MUST be notified when changes in network connectivity occur.
          </t>
          <t>The application MUST be able to pause connectivity checking on an active UDP flow.
          <vspace blankLines="1"/>
          This allows an application to conserve resources for inactive flows, especially for battery-powered devices.
          </t>
        </list>
        </t>
      </section>
    </section>

    <section anchor="srtp" title="Securing Media">
      <t><xref target="RFC3711">SRTP</xref> provides confidentiality, message authentication and replay protection for RTP media.
      </t>

      <t>SRTP requires external key negotiation.  Two methods are possible: <xref target="RFC5764">Datagram Transport Layer Security (DTLS) can be used for key negotiation (DTLS-SRTP)</xref>, or the SRTP master keys can be provided directly.  The application chooses which of these modes the application uses. [[Ed: requirement for second option not established]]
      </t>

      <t>With DTLS key negotiation, the asymmetry of the DTLS handshake means that out-of-band methods must be used to determine whether peers act in the server or client roles.  DTLS offers a means for peer authentication, allowing the application to learn of the identity of the peer through the certificate they offer during the handshake.
      </t>

      <t>Without DTLS key negotiation, the SRTP master keys are determined out-of-band and provided to the browser through the API (e.g., <xref target="RFC4568"/>).  SRTP does not offer any method for peer authentication.
      </t>

      <t>The following requirements apply:
      <list style="format SEC-%d">
        <t>The application MUST be able to select how SRTP key negotiation is performed, either DTLS-SRTP or through the API.
        </t>
        <t>Where DTLS-SRTP is chosen, the application MUST be able to specify whether the browser is to take the client or server role.
        </t>
        <t>Where DTLS-SRTP is chosen, the application MUST be able to view the certificate offered by the peer.
        </t>
        <t>Where DTLS-SRTP is chosen, the application MUST be able to discover if the peer certificate is a trusted certificate for an identified domain.
        </t>
        <t>Where DTLS-SRTP is chosen, the application MUST be able to reject communication with peers based on information presented in their certificate.
        </t>
        <t>Where SRTP is chosen without DTLS for key negotiation, the application MUST be able to set SRTP parameters, including: the encryption, key generation and authentication algorithms; key and parameter size; key and salt value; key usage lifetime; key derivation interval; and an optional master key identifier.
        </t>
      </list>
      </t>
        
    </section>

    <section anchor="rtp" title="Sending Peer-to-Peer Media">
      <t><xref target="RFC3550">RTP</xref> sends real-time data - media - from a source to a sink.
      </t>
      <t>An RTP stream is the manifestation of media as packets.  Most media is a single stream, though layered codecs (such as <xref target="RFC6190">H.264 SVC</xref>) can be split into multiple streams.  Browser need to handle RTP streams in two directions: outbound from a local source, and inbound toward a local sink.
      </t>

      <t>The following requirements apply:
      <list style="format MED-%d">
        <t>The application MUST be able to select the UDP flow that an RTP stream uses.
        </t>
        <t>The application MUST be able select the UDP flow that RTCP for a given RTP stream uses.
        </t>
        <t>The application MUST be able to move RTP streams (or associated RTCP) between UDP flows without losing packets.
        </t>
        <t>The application MUST be able to specify the SSRC for RTP streams, both outbound and inbound.
        </t>
        <t>The application MUST be able to discover the CNAME that will be used for streams that it generates.
        </t>
        <t>The application MUST be able to discover the <xref target="I-D.westerlund-avtext-rtcp-sdes-srcname">SRCNAME</xref> that will be used for streams that it generates. [[Ed: conditional on acceptance of SRCNAME]]
        </t>
        <t>The application MUST be able to specify the RTP packet type that is used to identify codecs in RTP streams, both inbound and outbound.
        </t>
        <t>The application MUST be able to select the codecs that are used for outbound RTP streams and to specify the codecs that are present in inbound RTP streams.
        </t>
        <t>The application MUST be able to limit the bandwidth allocated to a single outbound RTP stream.
        </t>
        <t>The application MUST be able to specify the number of audio channels used for an audio stream.
        </t>
        <t>The application MUST be able to specify codec parameters (such as appear in <xref target="RFC4566">SDP</xref> <spanx style="verb">a=fmtp</spanx> lines).
        </t>
        <t>The application MUST be able to mark the priority of an RTP stream.
        <vspace blankLines="1"/>
        This lets the browser choose appropriate packet marking using DiffServ and the relative sensitivity of streams to congestion.
        </t>
        <t>The application MUST be able to update the configuration for an active RTP stream.
        </t>
      </list>
      </t>
    </section>

    <section title="DTMF Tones">
      <t>Dual-Tone Multifrequency (DTMF) is commonly used by legacy telephony applications.  DTMF tones are typically interleaved/mixed with audio streams.
      </t>
      <t>The following requirements apply:
      <list style="format DTMF-%d">
        <t>The application MUST be able to insert DTMF tones into an audio stream.
        </t>
        <t>The application MUST be able to be informed of the existence of DTMF tone events on an audio stream.
        </t>
        </list>
      </t>
      
    </section>

    <section anchor="iana" title="IANA Considerations">
      <t>This document has no IANA actions.
      </t>

      <t>[RFC Editor: please remove this section prior to publication.]
      </t>
    </section>

    <section anchor="security" title="Security Considerations">
      <t>The security of your precious bodily fluids can only be achieved through purity of essence.
      </t>
    </section>
    
    <section title="Acknowledgments">
      <t>This document takes ideas from <xref target="I-D.kaplan-rtcweb-api-reqs"/>, including the short title you see on each page.
      </t>
    </section>
  </middle>

  <back>
    <references title="Normative References">
      <?rfc include="reference.RFC.2119"?>
      <?rfc include="reference.RFC.3550"?>
      <?rfc include="reference.RFC.3711"?>
      <?rfc include="reference.RFC.5245"?>
      <?rfc include="reference.RFC.5764"?>
      <?rfc include="reference.I-D.ietf-rtcweb-rtp-usage"?>
      <?rfc include="reference.I-D.ietf-rtcweb-security-arch"?>
      <?rfc include="reference.I-D.westerlund-avtext-rtcp-sdes-srcname"?>
    </references>

    <references title="Informative References">
      <?rfc include="reference.I-D.kaplan-rtcweb-api-reqs"?>
      <?rfc include="reference.RFC.4566"?>
      <?rfc include="reference.RFC.4568"?>
      <?rfc include="reference.RFC.5389"?>
      <?rfc include="reference.RFC.5766"?>
      <?rfc include="reference.RFC.6190"?>
      <?rfc include="reference.RFC.6455"?>
    </references>
  </back>

</rfc>
