<?xml version="1.0" encoding="UTF-8"?>
<!-- edited with XMLSPY v5 rel. 3 U (http://www.xmlspy.com)
     by Daniel M Kohn (private) -->
<!DOCTYPE rfc SYSTEM "rfc2629.dtd" [
<!ENTITY rfc2119 PUBLIC "" "http://xml.resource.org/public/rfc/bibxml/reference.RFC.2119.xml">
]>
<rfc category="std"
     docName="draft-wu-avtcore-multiplex-multisource-endpoint-01.txt"
     ipr="trust200902">
  <?xml-stylesheet type='text/xsl' href='rfc2629.xslt' ?>

  <?rfc toc="yes" ?>

  <?rfc symrefs="yes" ?>

  <?rfc sortrefs="yes"?>

  <?rfc iprnotified="no" ?>

  <?rfc strict="yes" ?>

  <front>
    <title abbrev="Bandwidth and RTCP Timing">Bandwidth and RTCP timing issues
    for multi-source endpoint</title>

    <author fullname="Qin Wu" initials="Q." surname="Wu">
      <organization>Huawei</organization>

      <address>
        <postal>
          <street>101 Software Avenue, Yuhua District</street>

          <city>Nanjing</city>

          <region>Jiangsu</region>

          <code>210012</code>

          <country>China</country>
        </postal>

        <email>sunseawq@huawei.com</email>
      </address>
    </author>

    <author fullname="Roni Even" initials="R." surname="Even">
      <organization>Huawei</organization>

      <address>
        <postal>
          <street>14 David Hamelech</street>

          <city>Tel</city>

          <region>Aviv</region>

          <code>64953</code>

          <country>Israel</country>
        </postal>

        <email>ron.even.tlv@gmail.com</email>
      </address>
    </author>

    <date year="2012" />

    <area>Real-time Applications and Infrastructure Area</area>

    <workgroup>Audio/Video Transport Working Group</workgroup>

    <keyword>RFC</keyword>

    <keyword>Request for Comments</keyword>

    <keyword>I-D</keyword>

    <keyword>Internet-Draft</keyword>

    <keyword>Real Time Control Protocol</keyword>

    <abstract>
      <t>This document discusses bandwidth issues and RTCP timing rule issues
      that arise when the multi-source endpoint multiplexing all the media
      type in one RTP session and follows RFC3550 timing rules. It provides
      recommendations for multi-source host sending multiple media types in
      the same session.</t>
    </abstract>
  </front>

  <middle>
    <section anchor="intro" title="Introduction">
      <t>Multiplexing is a method by which multiple streams are combined into
      one stream over a shared medium. In RTP, multiplexing is provided by the
      destination transport address (network address and port number) which is
      different for each RTP session. Given each media type having different
      requirement for bandwidth, multiple media types (e.g., Separate audio
      and video streams) are usually not carried in a single RTP session.</t>

      <t>However for some applications that use unicast transport, e.g., in
      RTCWeb application, multiple-source hosts may use a different SSRC for
      each medium but sending them in the same RTP session, which reduces
      communication failure due to NAT and firewall when using multiple RTP
      sessions or multiple transport flow. If these multi-source hosts still
      follow RFC3550 timing rules, audio and video are multiplexed onto a
      single RTP session and share a common session bandwidth, the audio flows
      sending at much lower rates will waste a large amount of bandwidth.</t>

      <t>This document provides recommendations for multi-source host sending
      multiple media types in the same session.</t>
    </section>

    <section title="Terminology">
      <section title="Standards Language">
        <t>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
        "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
        document are to be interpreted as described in <xref
        target="RFC2119">RFC 2119</xref>.<list style="hanging">
            <t hangText="Multi-Source Endpoint"><vspace blankLines="1" />End
            system with multiple sources generated from one host and running
            on that host. One example of multi-source endpoint is an RTP
            endpoint which has multiple capture devices of the same media type
            and characteristics.</t>

            <t hangText="Single-Source Endpoint"><vspace blankLines="0" />End
            system with single source generated from one host described in
            RFC3550.</t>
          </list></t>
      </section>
    </section>

    <section title="Use Cases for multi-source host communications">
      <section title="One to one Communication between multi-source endpoints">
        <figure title="Figure 1">
          <artwork>
       Multiple-Source              Multiple-Source
           Endpoint1                    Endpoint2
          +--------+                   +--------+
          |        |  Common Transport |        |
          |        |                   |        |
          |        |Transport flow1(v1,a1)      |
SSRC-V1---|        |------------------&gt;|        |--- SSRC-V2
          |        |                   |        |
          |        |                   |        |
          |        |                   |        |
          |        |                   |        |
SSRC-A1---|        |Transport flow1(v2,a2)      |
          |        |&lt;------------------|        +----SSRC-A2
          |        |                   |        |
          |        |                   |        |
          |        |                   |        |
          +--------+                   +--------+   
</artwork>
        </figure>

        <t>In the figure 1, one to one communication between multiple-source
        endpoint1 and multiple-source endpoint2 takes place. In order to
        reduce unicast transport flow to facilitate NAT/FW traversal,
        Multiple-Source endpoint 1 and multiple-source endpoint2 in one to one
        communication share the same transport. Both video stream v1 and audio
        stream a1 from multiple-source endpoint 1 are multiplexed in the RTP
        session1 and transmitted over the transport flow1 to multiple-source
        endpoint 2. Similarly, both video stream v2 and audio stream a2 from
        multiple-source endpoint2 are multiplexed in the RTP session 2 and
        transmitted over the same transmitted over the same transport flow 1
        as multiple source endpoint1.</t>

        <t>Multiple-source endpoint1 sends RTCP SR and RR packets for every
        active media stream it’s receiving (i.e.,v2,a2) from every local
        source (i.e.,v1,a1), which wastes a lot of bandwidth for redundant
        statistics reports. So does multiple-source endpoint2.</t>

        <t>Also the bandwidth usage for each stream is limited by its media
        data rate and audio stream usually consumes lower bandwidth than video
        stream (e.g.,high quality audio with 80kbps and high quality video
        with 350 kbps). When audio stream and video stream share the same
        transport, i.e., sharing the common session bandwidth between audio
        and video, audio stream will still be sent at its own media data rate
        far below the common session bandwidth and therefore waste lots of
        bandwidth provided by the transport. Further when reception reports
        are used for audio stream and follows RFC3550 timing rule, RTCP SR and
        RR packets for active audio stream will be reported more frequent than
        when audio and video are carried in separate transports since more
        RTCP bandwidth are allocated to audio stream reporting when more
        session bandwidth are allocated to audio stream.</t>
      </section>

      <section title="Multi-party Communication between multi-source endpoints">
        <figure title="Figure 2">
          <artwork>
                             +---+
                             |   |
                    Transport flow 1(v1,a1)-----------+
                    Transport flow 1(v2,a2)           |
                          +--+ c +--------&gt;           |
                          |  | e |        | Multiple  |-----SSRC-V2
                          |  | n |        |  Source   |
          +-----------+   |  | t |        | Endpoint2 |
          |           &lt;---|  | e |        |           |
SSRC-V1---|           |      | r |        |           +-----SSRC-A2
          | Multiple  |      |   |        +-----------+
          |  Source   |      | s |
          | Endpoint1 |      | e |
SSRC-A1---|           &lt;---|  | r |        +-----------+
          |           |   |  | v |        |           |
          +-----------+   |  | e |        |           |
                          |  | r |        | Multiple  |-----SSRC-V3
                          |  |               Source   |
                    Transport flow 2(v1,a1) Endpoint3 |
                    Transport flow 2(v3,a3)           |
                          ---|   +--------&gt;           +-----SSRC-A3
                             |   |        +-----------+
                             |   |
                             |   |
                             +---+                               
</artwork>
        </figure>

        <t>In figure 2, multiple-source endpoint1 communicates with
        multiple-source endpoint2 and multiple-source endpoint3 in the same
        session. In order to reduce unicast transport flow to facilitate
        NAT/FW traversal, multiple-source endpoint 1 shares the same transport
        flow 1 with multiple-source endpoint2 to send video stream v1 and
        audio stream a1 and receive video stream v2 and audio stream v2
        simultaneously. Similarly, multiple-source endpoint 1 shares the same
        transport flow 2 with multiple-source endpoint3 to send video stream
        v1 and audio stream a1 and receive video stream v3 and audio stream a3
        simultaneously.</t>

        <t>Multiple-source endpoint1 sends RTCP SR and RR packets for every
        active media stream it’s receiving (i.e.,v2,a2,v3,a3) from every local
        source (i.e.,v1,a1), which wastes a lot of bandwidth for redundant
        statistics reports. So does multiple-source endpoint2.</t>

        <t>Also the bandwidth usage for each stream is limited by its media
        data rate and audio stream usually consumes lower bandwidth than video
        stream (e.g.,high quality audio with 80kbps and high quality video
        with 350 kbps). When audio stream and video stream share the same
        transport, i.e., sharing the common session bandwidth between audio
        and video, audio stream will still be sent at its own media data rate
        far below the common session bandwidth and therefore waste lots of
        bandwidth provided by the transport. Further when reception reports
        are used for audio stream and follows RFC3550 timing rule, RTCP SR and
        RR packets for active audio stream will be reported more frequent than
        when audio and video are carried in separate transports since more
        RTCP bandwidth are allocated to audio stream reporting when more
        session bandwidth are allocated to audio stream.</t>
      </section>

      <section title="Communication between multi-source endpoint multiplexing each medium in separate RTP session and multiple-source endpoint multiplexing all mediums in one RTP session">
        <figure>
          <artwork>
       Multiple-Source              Multiple-Source
           Endpoint1                    Endpoint2
          +--------+                   +--------+
          |        | Various Transports|        |
          |        |                   |        |
          |        |Transport flow1(v1,a1)      |
SSRC-V1---|        |------------------&gt;|        |--- SSRC-V2
          |        |                   |        |
          |        |Transport flow2(v2)|        |
          |        |&lt;------------------|        |
          |        |                   |        |
SSRC-A1---|        |Transport flow3(a2)|        |
          |        |&lt;------------------|        +----SSRC-A2
          |        |                   |        |
          |        |                   |        |
          |        |                   |        |
          +--------+                   +--------+                    
</artwork>
        </figure>

        <t>In the figure 3, multiple-source endpoint1 residing outside of the
        firewall communicates with multiple-source endpoint2 residing inside
        of the firewall. Unlike one to one communication described in section
        3.1, multiple-source endpoint2 multiplexes each medium (i.e.,v2,a2) in
        separate transports (i.e.,transport flow2,transport flow3)to
        multiple-source endpoint1 while multiple-source endpoint1 multiplexes
        all mediums(ie.,v1,a1) in one unicast transport(i.e.,transport
        flow1)to multiple-source endpoint2.</t>
      </section>
    </section>

    <section title="Discuss">
      <t>The RTCP bandwidth fraction is derived from the media data rate. If
      audio and video are sent as two separate RTP sessions, they would
      naturally have different RTCP bandwidth fractions, since the two media
      types have different rates.</t>

      <t>However, if audio and video are multiplexed onto a single RTP
      session, a common session bandwidth would have to be chosen. This common
      bandwidth will likely be inappropriate for one of the media types,
      leading to the situation where some media flows use their allocation,
      while some flows send at a rate that is quite different to the session
      bandwidth. In the common case, the video sends at the session bandwidth,
      while the audio flows send at much lower rates. The RTCP bandwidth is
      derived from the session bandwidth, which means it's appropriate for the
      video, but is too high for the audio. In the worst case, the audio flows
      can have more RTCP than video flows, which is very wasteful. Fixing this
      is potentially difficult, since the session bandwidth concept is baked
      into all the RTCP timing rules.</t>
    </section>

    <section title="Recommendations">
      <t>Senders and receivers which are not multi-source endpoints are not
      affected by bandwidth issues associated with multi-source endpoint and
      should follow RFC3550 timing rules[RFC3550], no special accommodation is
      required.</t>

      <t>Senders and receivers which are multi-source endpoints and sending or
      receiving multiple media types over different transport should be
      treated in the same way as single source endpoints dealing with multiple
      streams in the same media type.</t>

      <t>Senders and receivers which are multi-source endpoints and sending or
      receiving multiple media types MAY multiplex/demultiplex each media type
      in separate RTP session or multiplex/demultiplex all media types in one
      RTP session. The multiplexing/demultiplexing mode to be employed in two
      directions between senders and receivers should be configurable. It is
      RECOMMENDED that<list style="numbers">
          <t>As the default behavior, Senders and receivers use the media
          bundling mechanism [BUNDLE] in two directions between senders and
          receivers, i.e., multiplexing/demulplexing separate media types in
          one RTP session over the same lower layer transport.</t>

          <t>Configuration or local policy on the senders or receivers can
          override the default Mechanism specified in Option 1 above in one or
          two direction. Therefore senders and receivers MAY be configured
          with the different multiplexing mode.</t>

          <t>Dynamic multiplexing negotiation mechanism [BUNDLE] can be used
          to signal which multiplexing mechanism is used between senders and
          receivers and override the default mechanism specified in Option 1
          and 2 above. The employed multiplexing negotiation mechanism is
          outside the scope of this document.</t>
        </list></t>

      <t>Multi-source endpoints multiplexing each media type in separate RTP
      session SHOULD be treated as multiple endpoints for each media type.</t>

      <t>Editor's Note: It will be useful to discuss why we recommend to have
      a specific algorithm and see how it works with mixers and translators in
      the future version.</t>

      <section title="Choosing RTCP Bandwidth">
        <t>Multi-source endpoint multiplexing multiple media types in the same
        RTP session SHOULD specify RTCP bandwidth using SDP, in a “b=RS” line
        or a “b=RR” line rather than choosing 5% of session bandwidth for RTCP
        bandwidth, especially when audio stream and video stream are sent over
        the same transport and the media data rate of audio stream is far less
        than video stream.</t>

        <t>RTCP bandwidth for video stream MAY still follow RFC3550
        recommendation, i.e., the fraction of the session bandwidth added for
        RTCP be fixed at 5%.</t>

        <t>RTCP bandwidth for video = session bandwidth *5%</t>

        <t>However given lower bandwidth usage of audio stream, RTCP bandwidth
        for audio stream is RECOMMENDED to be specified using SDP in a “b=RS”
        line or a “b=RR” line.</t>

        <t>RTCP bandwidth for audio stream SHOULD be calculated based on the
        following formula:</t>

        <t>RTCP bandwidth for audio = ((audio codec maximum bitrate*20%)+
        audio codec maximum bitrate)*5%</t>

        <t>Here a 20% RTP packet overhead is added to the data rate to
        calculate the required RTCP bandwidth.</t>
      </section>

      <section title="Maintaining the number of session members and senders">
        <t>As described in section 6.2.1 of RFC3550, Calculation of the RTCP
        packet interval mainly depends upon the number of members. For an
        endpoint with multiple sources multiplexing all media types in the
        same RTP session, it is more desirable to set the number of sender for
        such endpoint as one and the number of receiver as one. However
        according to RFC3550, each source within such endpoint is considered
        valid until multiple packets carrying the new SSRC have been received,
        or until an SDES RTCP packet containing a CNAME for that SSRC has been
        received.</t>

        <t>To achieve this, the sender which is multi-source endpoint SHOULD
        not send Reception reports (e.g., SR/RR RTCP packets) from all local
        sources to each remote source in the receiving side, Similarly the
        Receiver which is multi-source endpoint SHOULD not receive reception
        reports(e.g., SR/RR RTCP packets from each remote source in the
        sending side to all local sources. Instead, one designated local
        source from the senders within multi-source endpoint should be chosen
        as reporting source for other local sources within the sender.</t>

        <t>When multi-Source endpoints multiplexing each media type in
        separated RTP session join the session, they SHOULD treated as
        multiple single-source endpoint multiplexing for each media type and
        SHOULD be counted as multiple active senders. The number of active
        senders within multi-source endpoint is decided by the number of local
        source which is sending reception reports.</t>

        <t>For the number of session members, it depends on the number of
        multiple-source endpoints E and the number of local source in each
        multiple-source endpoint S. It can be estimated or calculated
        according to the rules in RFC 3550 section 6.2.1, based on received
        valid SSRC in the SDES packet or RTP packets.</t>
      </section>

      <section title="RTCP Reporting Interval calculation">
        <t>The RTCP packet interval calculation SHOULD consider RTCP bandwidth
        estimation and signaling in section 5.1 and active sender estimation
        in section 5.2. The RTCP reporting interval for audio stream and video
        stream should be calculated according to the rules in RFC3550 section
        6.2 respectively. When RTCP reception reports for audio stream and
        video stream are sent in different intervals, in order to decrease the
        number of RTCP packet to be sent, it is more desirable to transmit
        reception report for audio stream and reception report for video
        stream in the same compound RTCP packet and set RTCP reporting
        interval for audio stream as a multiple of RTCP reporting intervals
        for video stream.</t>

        <t>For example, if the RTCP Reporting Interval for audio stream is
        more than one RTCP Reporting Interval for video but less than two RTCP
        Reporting Intervals for video, it is RECOMMENDED the RTCP Reporting
        Interval for audio stream be chosen as one RTCP Reporting Interval for
        video.</t>

        <t>If the RTCP Reporting Interval for audio stream is more than two
        RTCP Reporting Intervals for video but less than three RTCP Reporting
        Intervals for video, it is RECOMMENDED the RTCP Reporting Interval for
        audio stream be chosen as two RTCP Reporting Intervals for video.</t>

        <t>If the RTCP Reporting Interval for audio stream is more than three
        RTCP Reporting Intervals for video but less than four RTCP Reporting
        Intervals for video, it is RECOMMENDED the RTCP Reporting Interval for
        audio stream be chosen as three RTCP Reporting Intervals for
        video.</t>

        <t>Editor's Note: The problem is not as simple as just varying the
        RTCP bandwidth fraction for the audio sources in an RTP session.
        Varying the reporting interval in this way for some participants can
        lead to timeouts. You'd also need to revise the timeout rules,
        reconsideration rules, etc.</t>
      </section>

      <section title="RTCP reception report">
        <t>Multiple-Source Endpoint SHOULD NOT send reception reports from one
        of its source about all the other local sources of its own. RTP
        application SHOULD provide a means to identify multiple-source
        endpoint as in fact being sources from the same RTP node.</t>

        <t>Multiple-Source Endpoint SHOULD combine RTCP reception reports into
        a single compound RTCP packet without exceeding the maximum
        transmission unit (MTU) of the network path.</t>
      </section>
    </section>

    <section title="Security Considerations">
      <t>TBC.</t>
    </section>

    <section title="IANA Considerations">
      <t>This document has no actions for IANA.</t>
    </section>
  </middle>

  <back>
    <references title="Normative References">
      <reference anchor="RFC2119">
        <front>
          <title abbrev="RFC Key Words">Key words for use in RFCs to Indicate
          Requirement Levels</title>

          <author fullname="Scott Bradner" initials="S." surname="Bradner">
            <organization>Harvard University</organization>

            <address>
              <postal>
                <street>1350 Mass. Ave.</street>

                <street>Cambridge</street>

                <street>MA 02138</street>
              </postal>

              <phone>- +1 617 495 3864</phone>

              <email>sob@harvard.edu</email>
            </address>
          </author>

          <date month="March" year="1997" />

          <area>General</area>

          <keyword>keyword</keyword>

          <abstract>
            <t>In many standards track documents several words are used to
            signify the requirements in the specification. These words are
            often capitalized. This document defines these words as they
            should be interpreted in IETF documents. Authors who follow these
            guidelines should incorporate this phrase near the beginning of
            their document: <list>
                <t>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL",
                "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and
                "OPTIONAL" in this document are to be interpreted as described
                in RFC 2119.</t>
              </list></t>

            <t>Note that the force of these words is modified by the
            requirement level of the document in which they are used.</t>
          </abstract>
        </front>
      </reference>

      <reference anchor="RFC3550">
        <front>
          <title>RTP: A Transport Protocol for Real-Time Applications</title>

          <author fullname="Henning Schulzrinne" initials="H."
                  surname="Schulzrinne">
            <organization>Columbia University</organization>
          </author>

          <date month="July" year="2003" />
        </front>

        <seriesInfo name="RFC" value="3550" />

        <format type="TXT" />
      </reference>

      <reference anchor="RFC3556">
        <front>
          <title>Session Description Protocol (SDP) Bandwidth Modifiers for
          RTP Control Protocol (RTCP) Bandwidth</title>

          <author fullname="S.Casner" initials="S." surname="Casner">
            <organization></organization>
          </author>

          <date month="July" year="2003" />
        </front>

        <seriesInfo name="RFC" value="3556" />

        <format type="TXT" />
      </reference>

      <reference anchor="BUNDLE">
        <front>
          <title>Extended RTP Profile for Real-time Transport Control Protocol
          (RTCP)-Based Feedback (RTP/AVPF)</title>

          <author fullname="C.Holmberg" initials="C." surname="Holmberg">
            <organization></organization>
          </author>

          <author fullname="H. Alvestrand" initials="H." surname="Alvestrand">
            <organization></organization>
          </author>

          <date month="August" year="2012" />
        </front>

        <seriesInfo name="ID"
                    value="draft-ietf-mmusic-sdp-bundle-negotiation-01" />

        <format type="TXT" />
      </reference>
    </references>

    <references title="Informative References">
      <reference anchor="I-D.ietf-avtcore-multi-media-rtp-session-00">
        <front>
          <title>Multiple Media Types in an RTP Session</title>

          <author fullname="M.Westerlund" initials="M." surname="Westerlund">
            <organization></organization>
          </author>

          <date month="October" year="2012" />
        </front>

        <seriesInfo name="ID"
                    value="draft-ietf-avtcore-multi-media-rtp-session-00" />

        <format type="TXT" />
      </reference>
    </references>
  </back>
</rfc>
