2.7.1 Audio/Video Transport (avt)

NOTE: This charter is a snapshot of the 41st IETF Meeting in Los Angeles, California. It may now be out-of-date. Last Modified: 05-Feb-98


Chair(s):

Stephen Casner <casner@precept.com>

Transport Area Director(s):

Scott Bradner <sob@harvard.edu>
Allyn Romanow <allyn@mci.net>

Transport Area Advisor:

Allyn Romanow <allyn@mci.net>

Mailing Lists:

General Discussion: rem-conf@es.net
To Subscribe: rem-conf-request@es.net
Archive: ftp://nic.es.net/pub/mailing-lists/mail-archive/rem-conf

Description of Working Group:

The Audio/Video Transport Working Group was formed to specify experimental protocols for real-time transmission of audio and video over UDP and IP multicast. The focus of this group is near-term and its purpose is to integrate and coordinate the current AVT efforts of existing research activities. No standards-track protocols are expected to be produced because UDP transmission of audio and video is only sufficient for small-scale experiments over fast portions of the Internet. However, the transport protocols produced by this working group should be useful on a larger scale in the future in conjunction with additional protocols to access network-level resource management mechanisms. Those mechanisms, research efforts now, will provide low-delay service and guard against unfair consumption of bandwidth by audio/video traffic.

Similarly, initial experiments can work without any connection establishment procedure so long as a priori agreements on port numbers and coding types have been made. To go beyond that, we will need to address simple control protocols as well. Since IP multicast traffic may be received by anyone, the control protocols must handle authentication and key exchange so that the audio/video data can be encrypted. More sophisticated connection management is also the subject of current research. It is expected that standards-track protocols integrating transport, resource management, and connection management will be the result of later working group efforts.

The AVT Working Group may design independent protocols specific to each medium, or a common, lightweight, real-time transport protocol may be extracted. Sequencing of packets and synchronization among streams are important functions, so one issue is the form of timestamps and/or sequence numbers to be used. The working group will not focus on compression or coding algorithms, which are the domain of higher layers.

Goals and Milestones:

Conduct a teleconference working group meeting using a combination of packet audio and telephone. The topic will be a discussion of issues to be resolved in the process of synthesizing a new protocol.

Define the scope of the working group, and who might contribute. The first step will be to solicit contributions of potential protocols from projects that have already developed packet audio and video. From these contributions the group will distill the appropriate protocol features.

Review contributions of existing protocols, and discuss which features should be included and tradeoffs of different methods. Make writing assignments for first-draft documents.

Post an Internet-Draft of the lightweight audio/video transport protocol.

Post a revision of the AVT protocol addressing new work and security options as an Internet-Draft.

Jun 93

Submit the AVT protocol to the IESG for consideration as an Experimental Protocol.


Request For Comments:

RTP: A Transport Protocol for Real-Time Applications

RTP Profile for Audio and Video Conferences with Minimal Control

RTP Payload Format for JPEG-compressed Video

RTP payload format for H.261 video streams

RTP Payload Format of Sun's CellB Video Encoding

RTP Payload Format for H.263 Video Streams

RTP Payload for Redundant Audio Data

RTP Payload Format for MPEG1/MPEG2 Video

Current Meeting Report

Minutes of the Audio/Video Transport (avt) Working Group

Reported by Steve Casner

1. Introduction and Status

The AVT working group met for two very full sessions at the 41st IETF meeting in Los Angeles. The biggest topic was a continuation of the discussion that began at the previous IETF regarding a "generic payload format" to handle the hundreds of existing codecs without requiring a payload format specification to be written for each. There were also two presentations on RTP for MPEG-4 which are related because the mechanisms in MPEG-4 are similar to a generic payload format.

AVT's primary outputs, the Real-time Transport Protocol and the companion RTP profile for audio/video conferencing, remain at Proposed Standard status. Internet-Draft revisions of both have been posted as part of the work toward advancement to Draft Standard status. That step was not achieved before this meeting, but it is hoped that the revisions will be completed before the next IETF. Several issues raised on the mailing list were presented and discussed in the second session.

Since the last IETF meeting, a revision of the payload format for MPEG-1 and -2 video originally published in RFC2038 was published as RFC2250, still at Proposed Standard status. The IP/UDP/RTP header compression Internet-Draft remained on hold in IESG Last Call on a chain of document dependencies ending at IPSec, but it was reported that decisions allowing IPSec to proceed were completed at this meeting. Before the meeting, IESG Last Call was requested for two additional drafts: a revision of the JPEG video payload format, and a new payload format for BT.656 video. At this meeting, there were discussions of the next round of drafts as described in the following sections.

2. New Drafts Ready for WG Last Call

Colin Perkins gave a brief presentation of the revised draft "Options for Repair of Streaming Media", draft-ietf-avt-info-repair-03.txt. This draft covers loss mitigation schemes and recommendations for when they may be used appropriately. This revision includes discussion of a "reasonable operating point" for these schemes as suggested by the TCP throughput equivalence formula. There were no objections to issuing last call for comments from the working group on this draft before requesting publication as an Informational RFC.

A second informational draft, written by Mark Handley during the last meeting, is draft-ietf-avt-rtp-format-guidelines-00.txt, .ps. This draft is a first attempt to capture the group's experience in designing payload formats, and is not considered ready for publication yet. However, the working group is requested to read the draft and contribute additional ideas.

The third draft up for consideration is the H.263+ payload format that was presented in detail at the last meeting. Joerg Ott described the revised draft-ietf-avt-rtp-h263-video-01.txt that includes a number of changes that were announced at the last meeting to reorganize the payload header for improved efficiency without changing the overall functionality. This revision also removes the controversial feedback backchannel which will be revisited in a future separate draft that might be applied to other payloads in addition to H.263+. The authors consider the current draft ready for working group last call, and there were no objections to proceeding. However, Michael Speer suggested that an appendix be added to show how to map from the current H.263 payload format in RFC2190 to the new one. Carsten Bormann countered that this should be a separate document and agreed to write it. There was some discussion of whether the new draft should obsolete RFC2190: some are concerned about having two Proposed Standard RFCs for H.263 payload formats, while others are concerned about obsoleting a spec that is referenced by the ITU and implemented in products. In a discussion on the mailing list since the meeting, it was agreed that the new RFC would be marked as "updating" the old one and would include a paragraph describing the relationship (the new one should be used for new implementations). Both RFCs will be Proposed Standards, but RFC2190 may be moved to Historic if that seems appropriate when the new RFC is ready to move to Draft Standard. The static payload type 34 will continue to be assigned to RFC2190, while the new payload format will use a dynamic payload type.

3. Format Parameters in SDP

Some encodings, such as H.263, may have optional modes or other parameters that need to be communicated out-of-band with respect to the RTP session. The "a=fmtp" attribute of the Session Description Protocol is one method. The recent draft-koskelainen-sdp263-01.txt defines a format parameter syntax for H.263. This draft is based on the 1996 version of H.263 (to which RFC2190 corresponds), but it includes some features of the 1998 version (H.263+). Given the imminent publication of H.263+, it does not make sense to move this draft forward before extending it to cover all of what's needed for H.263+. However, the authors of the H.263+ payload format assert that the converse is not true because H.263+ will be used with a variety of out-of-band mechanisms. When used with SDP, an H.263+ codec can operate "pretty well" with default parameters until the sdp263 draft is revised and published.

The encoding of format parameters in SDP is also a component of the next topic, generic payload formats, in order to describe what encoding is being carried. One question to consider is whether there should be any standardization of parameter syntax within fmtp, or whether each payload format may define its own.
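As a rough illustration of the out-of-band signalling discussed in this section, the following Python sketch collects the "a=rtpmap" and "a=fmtp" attributes of a session description by payload type. The function name, the example port, and the parameter "example-param=1" are hypothetical and are not taken from draft-koskelainen-sdp263-01.txt; the fmtp value is kept as an opaque string because, as noted above, its internal syntax is left to each payload format.

```python
def parse_media_attributes(sdp_text):
    """Collect rtpmap and fmtp attributes, keyed by RTP payload type."""
    formats = {}
    for line in sdp_text.splitlines():
        line = line.strip()
        if line.startswith("a=rtpmap:"):
            # a=rtpmap:<payload type> <encoding name>/<clock rate>[/<params>]
            pt, _, encoding = line[len("a=rtpmap:"):].partition(" ")
            name, _, rest = encoding.partition("/")
            entry = formats.setdefault(int(pt), {})
            entry["encoding"] = name
            entry["clock_rate"] = int(rest.split("/")[0]) if rest else None
        elif line.startswith("a=fmtp:"):
            # a=fmtp:<payload type> <format-specific parameters>
            pt, _, params = line[len("a=fmtp:"):].partition(" ")
            formats.setdefault(int(pt), {})["fmtp"] = params
    return formats


# Hypothetical session fragment; the fmtp parameter is a placeholder only.
EXAMPLE_SDP = """\
m=video 49170 RTP/AVP 34
a=rtpmap:34 H263/90000
a=fmtp:34 example-param=1
"""

if __name__ == "__main__":
    print(parse_media_attributes(EXAMPLE_SDP))
```

A receiver that does not understand the fmtp string can still learn the encoding name and clock rate from the rtpmap line, which is the sense in which a codec might run with default parameters.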

4. Generic Payload Formats

The biggest topic of this meeting was a continuation of the discussion that began at the previous IETF regarding a generic payload format for the hundreds of existing codecs. It is deemed impractical to create an RFC to define an optimized payload format for each of them, especially since the details that would allow optimization may be unavailable for some. If one or more generic payload formats can be defined, then the problem is reduced to defining and managing the encoding namespace plus conveying the encoding name and parameters out of band. However, one may be concerned that the introduction of a generic payload format would stifle development of optimized payload formats as new encodings are created.

4.1 draft-periyannan-generic-rtp-00.txt

The first of two proposals was presented by Alagu Periyannan. It defines a set of three packetization schemes A, B and C that can be applied as appropriate for different encodings, plus a mechanism within SDP to convey the encoding and packetization information out of band to receivers.

The first two schemes are very simple and introduce no additional packetization overhead. Scheme A follows the model of packetization for many of the audio encodings in the RFC1890 profile where the number of samples or frames may be determined from the length of the packet, and all samples or frames are smaller than the MTU so that no fragmentation is required. Since scheme A may be used for media other than audio, it was suggested that the marker bit be defined to indicate the first packet after "a gap in the timeline" rather than the start of a talkspurt.

When the size of a sample (frame) may exceed the MTU, then scheme B may be used. Following the model of most existing video payload formats, fragmentation is indicated by the RTP timestamp being the same on all fragments of the sample, and the marker bit is set on the last fragment. Since the internal structure of the data is not known to the generic payload format, fragmentation is done without respect to any internal structure of the data. Loss of a fragment implies loss of the whole sample.
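The scheme B rules described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the Packet class, the function names, and the optional expected_first_seq argument are hypothetical, and the sketch models just the three RTP fields the text mentions (sequence number, timestamp, marker bit) rather than a full RTP header.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    seq: int        # 16-bit RTP sequence number
    timestamp: int  # RTP timestamp, identical on all fragments of a sample
    marker: bool    # set only on the last fragment of a sample
    payload: bytes


def fragment_sample(sample, timestamp, first_seq, max_payload):
    """Split one sample at arbitrary byte boundaries, scheme B style."""
    if max_payload <= 0:
        raise ValueError("max_payload must be positive")
    chunks = [sample[i:i + max_payload]
              for i in range(0, len(sample), max_payload)] or [b""]
    last = len(chunks) - 1
    return [Packet((first_seq + n) & 0xFFFF, timestamp, n == last, chunk)
            for n, chunk in enumerate(chunks)]


def reassemble(packets, expected_first_seq=None):
    """Return the sample bytes, or None if any fragment appears to be lost.

    packets must hold the received fragments of one sample in sequence order.
    Loss of the first fragment can only be detected if the caller supplies
    expected_first_seq (the sequence number following the previous sample).
    """
    if not packets or not packets[-1].marker:
        return None  # last fragment missing
    if len({p.timestamp for p in packets}) != 1:
        return None  # fragments of different samples were mixed
    if expected_first_seq is not None and packets[0].seq != expected_first_seq:
        return None  # first fragment missing
    for prev, cur in zip(packets, packets[1:]):
        if (prev.seq + 1) & 0xFFFF != cur.seq:
            return None  # gap in the middle: the whole sample is discarded
    return b"".join(p.payload for p in packets)
```

Because the fragment boundaries ignore the structure of the encoded data, the receiver has no way to use a partial sample, which is why a single missing fragment makes reassemble give up on the whole sample.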

Scheme C is much more complicated; it should be used only when the assumptions of schemes A and B don't apply. To accommodate sample sizes that may vary from a fraction of the MTU to several times the MTU, as may occur with some video encodings, scheme C allows each packet to carry either multiple samples or a fragment of a sample. Each sample or fragment is preceded by a scheme C header that gives the length of a whole sample or the offset of a fragment. The header size is 4, 8 or 12 bytes and may include a relative timestamp (offset from the RTP timestamp) when multiple samples are aggregated and the spacing of samples cannot be assumed. The header may also include the sample duration and a key-sample flag when the encoding depends on external framing for this information.

To indicate the selection of scheme A/B/C, it is proposed that the syntax of the encoding name in the SDP "a=rtpmap" attribute be extended to optionally include a namespace and a packetization scheme. Additional encoding parameters would be carried in the fmtp attribute as mentioned earlier.

4.2 draft-klemets-generic-rtp-00.txt

The second proposal was presented by Anders Klemets (the draft is available from http://microsoft.com/asf/resources/). This proposal defines a single packetization scheme similar to scheme C in that multiple samples or fragments may be carried in one packet and each is prefixed with a header. To minimize overhead, 32-bit alignment is sacrificed so the size of the header can vary from 1 to 8 bytes (or more if "extension data" is included). As in scheme C, the header may include fragment offset, sample length, and timestamp delta fields. In addition, there is a fragment sequence number field that distinguishes two kinds of fragmentation, which may also be nested: codec-unaware, where the samples are fragmented at arbitrary boundaries during packetization, and codec-aware, where the fragment boundaries are chosen by the application layer, perhaps at authoring time. It is possible to include fragments from multiple frames in one packet, which the rules of scheme C do not allow.

The extension data mechanism may be used to include arbitrary additional information into the header for each sample, such as a send time timestamp, duration, or a key frame flag. Some extensions may be hints that could be safely ignored, while others are mandatory. It is proposed that the set of extensions in use be specified in SDP. This would provide a means of "specializing" the generic payload format for a particular encoding. On the other hand, Alagu Periyannan observed that defining the details of the extension mechanism might be as much work as specifying a real payload format for the codec.

This scheme has low overhead and is very flexible. The tradeoff is that decoding the set of fields included in the header requires a fairly complicated sequence of decisions based on bits in the header, the RTP marker bit, and the position in the sequence of header and data objects.

4.3 Discussion

There was an extended discussion of the issues surrounding generic payload formats. There are two primary parts of the problem:

· the packetization format and the set of features provided; one question is whether to have a single format or multiple formats, and a second question is the design of the format(s).
· encoding namespaces, registration procedures, and how to carry the encoding and/or format names and parameters out of band, for example in SDP.

The packetization formats themselves received less of the discussion this time. It was noted that there are similarities between the Klemets proposal and Scheme C in the Periyannan proposal. Eric Fleischman suggested that the four authors of the two drafts work together to produce one merged proposal that also incorporates the ideas from the discussion. The authors present all agreed.

Mark Handley liked the idea of having a set of packetization formats, but suggested that they be used as a shortcut method of specifying an RTP payload format draft. In "three lines" one could specify that for a particular encoding, a particular one of the packetization schemes would be used, and if necessary, what parameters would be carried in the fmtp attribute. Colin Perkins observed that these formats might also serve as templates to facilitate the definition of encoding-specific payload formats without having to define the whole payload format from scratch.

There were several issues related to naming. One goal is to have a single name registered in the IANA namespace to refer to any particular encoding. However, it will most likely be necessary to allow incorporation by reference of other encoding name registries sponsored by vendors, and there are already instances of multiple registrations of the same encoding in those registries. IANA can't be expected to determine which registrations are equivalent. Eric Fleischman expressed the hope that the encoding vendors, as participants in the AVT community, could go back and examine the registrations to determine which are equivalent to the preferred names in the IANA (MIME) namespace.

Mark Handley summarized the process to say that the encoding names would be entered into the common namespace. There would be a flag associated with each entry to indicate whether or not it is appropriate for transport via RTP. For those that are easily mapped to one of the generic packetization schemes, the name of the selected scheme would be included in the registration, and that's all that would be required to use the encoding over RTP. One would want that table of encodings to be actively maintained by IANA and to be available in machine readable form. Rob Lanphier remarked that keeping the selection of packetization scheme implicit in the table avoids the problem of a server choosing an inappropriate scheme and the client having to be prepared to handle all the packetization choices for all data types even when some don't make sense. However, others were concerned about having to install the table on servers and clients and keep it up-to-date. This issue requires further discussion.

To indicate the encoding in SDP, the MIME major type goes on the "m=" media selection line, and the minor type goes into the "a=rtpmap" attribute. Given that names from the vendor subspace of the MIME namespace specify the codec as a parameter of the type (see draft-fleischman-codec-subtree-02.txt), the syntax of those parameters might need to be modified for inclusion in the rtpmap. It might also be necessary to extract some of the parameters into the fmtp attribute. The details of this design are part of the work to be done in the merged draft.

5. MPEG-4

Two presentations were given on an RTP payload format and RTCP usage for MPEG-4. This work is at an early stage so there are a number of open issues to be resolved. There are some problems in fitting MPEG-4 and RTP together because of overlap in their functionality; in particular, at what level multiplexing of streams should be done. MPEG-4 is in a sense a generic payload format because the MPEG-4 Systems layer can carry data from many different codecs. Therefore, we might expect some synthesis between these two efforts.

5.1 MPEG-4 payload format: draft-ietf-avt-rtp-mpeg4-00.txt

Michael Speer presented the proposed RTP payload format for MPEG-4. This format supports packetization of both single Elementary Streams and multiple Elementary Streams bundled into a FlexMux stream. For single streams, the Access Unit Layer in the MPEG-4 architecture fragments data into AL-PDUs according to the MTU. The AL-PDU can contain separate timestamps for composition time, decode time, and wallclock time, and these are of adjustable size and resolution. The composition timestamp is to be extracted from the AL-PDU and carried as the RTP timestamp unless it is longer than 32 bits, in which case the FlexMux encapsulation should be used. FlexMux encapsulation is also called for if the Access Unit Layer cannot fragment according to the MTU, and when there are many Elementary Streams sending many small data units such that sending each stream in a separate RTP session would incur too much overhead or require too many addresses and ports. In the FlexMux encapsulation, the RTP timestamp carries the wallclock time when the packet is sent. The data-related timestamps are carried in the original AL-PDUs as fragmented into FlexMux PDUs.

Dave Oran asked why not send the FlexMux packets directly over UDP since that mode is not really using the features of RTP. The response from Vahe Balabanian and Reha Civanlar was that operation over UDP was the original design, but that it is desired to take advantage of RTCP feedback and to allow multiplexing with non-MPEG-4 streams that are carried in RTP. Henning Schulzrinne observed that we should carefully consider whether FlexMux fits into RTP or whether it is a specific instance of a more generic problem of multiplexing streams into RTP that should be solved more generically. A related issue is whether elementary streams encoded in MPEG-1 or H.263 or other encodings for which there are specific RTP payload formats defined should use those formats instead of a generic MPEG-4 payload format. These issues are to be discussed further on the mailing list.

5.2 DMIF and RTP/MPEG-4: draft-ietf-avt-rtp-mpeg4-dmif-00.txt

Vahe Balabanian described the role of DMIF (Delivery Multimedia Integration Framework) when using MPEG-4 with RTP and RTCP. DMIF defines the interface through which the MPEG-4 system will use RTP, including the signaling and QoS negotiation, which are outside the bounds of RTP. Using feedback obtained via RTCP, when the loss rate is too high to meet the requested QoS, DMIF might signal a reduction in the number of layers to be received from a layered encoding.

Unfortunately, the QoS measures provided by RTCP don't match what's defined in DMIF exactly. In the FlexMux case, the number of AL-PDUs carried in an RTP packet will generally not be one, so the AL-PDU loss probability will not match the RTP packet loss probability that RTCP provides. DMIF also wants to provide a "gap loss" measure, but RTCP would require an extension to carry this. The methods for calculating jitter and delay are different between RTCP and DMIF as well. One question is what should be done when the underlying network layers cannot meet the interface specified by DMIF?

There was some discussion of which AL-PDU timestamp should be carried in the RTP timestamp, considering the effect on jitter calculation. The jitter will be inaccurate if the timestamp represents the composition time but the data is sometimes sent early for rate smoothing, or if the timestamp is non-monotonic for interpolated video frames. Steve Casner pointed out that the jitter value in RTCP is intended as a comparative measure between streams or at different times rather than as an absolute measure. The selection of timestamps is another issue to be discussed on the mailing list.
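To make the jitter concern concrete, here is a small Python sketch of the interarrival jitter estimator defined in the RTP specification: the running estimate moves one sixteenth of the way toward each new absolute difference in transit time. The function name and the sample figures are illustrative assumptions, arrival times are assumed to be already expressed in RTP timestamp units, and 32-bit timestamp wraparound is ignored for brevity.

```python
def interarrival_jitter(packets):
    """Estimate RTCP-style interarrival jitter.

    packets is an iterable of (rtp_timestamp, arrival_time) pairs, with the
    arrival time already converted to RTP timestamp units.
    """
    jitter = 0.0
    prev_transit = None
    for rtp_ts, arrival in packets:
        transit = arrival - rtp_ts
        if prev_transit is not None:
            d = abs(transit - prev_transit)
            # Move 1/16 of the way toward the latest transit difference.
            jitter += (d - jitter) / 16.0
        prev_transit = transit
    return jitter


# Frames stamped and delivered at a constant spacing: no jitter is reported.
steady = [(0, 0), (3000, 3000), (6000, 6000), (9000, 9000)]

# Same timestamps, but every other frame is sent early (rate smoothing):
# the estimator reports jitter although the network delay never changed.
smoothed = [(0, 0), (3000, 1500), (6000, 6000), (9000, 7500)]

if __name__ == "__main__":
    print(interarrival_jitter(steady), interarrival_jitter(smoothed))
```

The second trace shows why a composition-time timestamp combined with early sending inflates the reported value, and why the figure is better read as a relative measure than as an absolute one.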

6. Revision of RTP Spec and A/V Profile

The discussion of changes to the RTP spec and A/V profile was abbreviated because of time limitations. The current drafts are available as draft-ietf-avt-rtp-new-00.txt, .ps for the main spec and draft-ietf-avt-profile-new-02.txt, .ps for the profile. The plan is to revise these drafts to incorporate changes discussed at the last meeting and then put the drafts forward for publication as revised RFCs. Steve Casner pointed out that as the specifications advance to Draft Standard, any features that have not been demonstrated to be interoperable must be dropped. This must be considered as we make changes. Most of the changes have been in algorithms such as the RTCP interval calculation rather than in the protocol on the wire, and are not expected to be problematic.

Three topics from the mailing list were discussed:

· A request was made for a means to add application-specific extensions to the RTCP RR packet. Mark Handley and Van Jacobson expressed concern about the negative impact this would have on interoperability. The general sentiment of the group was that a new RTCP packet type should be allocated instead to carry the additional information.
· There have been a number of suggestions for redefining the interval over which the RTCP RR "loss fraction" is calculated. The problem is that when the RTCP interval is long, this value does not carry much information. However, there are enough potential problems with trying to define how the loss fraction should be calculated over a shorter interval that there was no consensus for changing it.
· A request was received for a static payload type to be assigned to the Qclp audio encoding. As had been discussed in Munich, it was agreed that the RTP Profile draft should be revised to say no more static payload types would be assigned. Those that are already assigned should be considered "default" values in the dynamic payload type table, and may be overwritten once all the unassigned values have been used for dynamic types within a session. However, since the new policy has not yet been published, the request for Qclp should be allowed.
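The concern about the RR "loss fraction" in the second item above can be illustrated with a short Python sketch. It follows the calculation in the RTP specification, where the fraction is the number of packets lost since the previous report divided by the number expected in that interval, expressed as an 8-bit fixed-point value. The function name and the packet counts are illustrative assumptions.

```python
def loss_fraction(expected_interval, received_interval):
    """Return the 8-bit fixed-point loss fraction for one report interval."""
    lost = expected_interval - received_interval
    if expected_interval == 0 or lost <= 0:
        # Nothing expected, or duplicates made the loss non-positive.
        return 0
    return (lost << 8) // expected_interval


# A burst of 50 lost packets reported over a short interval of 100 packets...
short_interval = loss_fraction(100, 50)

# ...and the same burst averaged over a long interval of 5000 packets.
long_interval = loss_fraction(5000, 4950)

if __name__ == "__main__":
    print(short_interval, long_interval)
```

With a long RTCP interval the same burst shrinks to a value near zero, which is the sense in which the field "does not carry much information".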

7. Payload Format for Generic FEC

Jonathan Rosenberg presented an update on draft-ietf-avt-fec-02.txt which specifies a "meta payload format" to add forward error correction. The primary change was that the added FEC packets are sent on a separate stream rather than appearing within the sequence number space of the media stream. The means of separation may vary depending upon the application, for example using a different multicast address or port, or as a redundant coding in the same stream per RFC2198. This allows the sequence numbers and timestamps of the media stream to behave nicely so FEC-incapable receivers are not confused and so FEC-capable receivers can tell which packets are lost. It was agreed that the RTP timestamp on the FEC packets should represent the time when they are sent rather than the minimum timestamp from the covered media packets. However, the FEC timestamps must be in the past with respect to the media packets in order to use RFC2198.

To make the change to separate streams, an explicit timestamp recovery field is added to the FEC header. This field was proposed to be 8 bits, but Steve Casner pointed out that the timestamp increment between video frames is a minimum of 3003. Two possible changes were discussed: make the TS recovery field 16 bits and reduce the packet pattern mask to 8 bits, or increase the FEC header from 64 to 96 bits to allow a 32-bit TS recovery field. There were no strong opinions either way, but this "will be worked out."
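The arithmetic behind the objection to an 8-bit field can be checked with a few lines of Python. The sketch assumes that the figure 3003 corresponds to a 90 kHz RTP clock and a frame rate of 30000/1001 frames per second; the minutes state only the number, so that reading is an assumption. It also does not model how the draft actually encodes the recovery field; it only compares the largest value each candidate width can hold with one frame's timestamp increment.

```python
from fractions import Fraction

CLOCK_HZ = 90000                         # assumed video RTP clock rate
NTSC_FRAME_RATE = Fraction(30000, 1001)  # assumed frame rate (about 29.97 fps)


def timestamp_increment(clock_hz, frame_rate):
    """Timestamp ticks between consecutive frames."""
    return Fraction(clock_hz) / frame_rate


def max_frames_covered(field_bits, increment):
    """Whole frame increments that fit in an unsigned field of field_bits."""
    return ((1 << field_bits) - 1) // increment


if __name__ == "__main__":
    inc = timestamp_increment(CLOCK_HZ, NTSC_FRAME_RATE)
    print("increment per frame:", inc)
    for bits in (8, 16, 32):
        print(bits, "bit field spans", max_frames_covered(bits, inc), "frames")
```

Under these assumptions an 8-bit field cannot represent even a single frame interval, while a 16-bit field spans a few tens of frames and a 32-bit field covers the full timestamp range.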

The draft now specifies a simple recovery algorithm as an example. Applications may implement more complex algorithms to give improved performance.

8. Elevating RTP to Protocol

Jonathan Rosenberg gave (part of) a second presentation on draft-rosenberg-rtpproto-00.txt which proposes that RTP be elevated to an IP protocol type of its own, parallel to UDP and TCP. The motivation is to make RTP packets easier to identify for purposes of classification in differentiated services, header compression, firewalls, etc.

Response to this proposal was primarily negative due to concerns about interoperability problems it would impose without really solving the problems it was intended to address. Van Jacobson said that differentiated services would be driven off the TOS byte in the IP header and established according to the full "transport signature" and not just the protocol type. And for RTP header compression, the UDP traffic can also be compressed even when the following header is not RTP, so one would want to apply compression to both protocol types even if RTP did have a different one. Ross Finlayson felt that breaking away from the already supported UDP protocol would make RTP too hard to deploy in hosts, while Dave Oran expressed concern about needing to get access lists changed in routers all over the place to allow a new protocol type through. The consensus was not to proceed.

9. Wrapup

Because the session ran over time, two agenda topics were not discussed. Henning Schulzrinne has asked that the working group consider reactivating the idea of aggregation of multiple, parallel RTP streams. Jonathan Rosenberg presented this in December 1996, but it was not ready for full consideration at that point. Since then the ideas have been refined and a draft will be posted after the IETF meeting.

The second item was to note that several drafts have been recently posted regarding how RTP may be recorded and/or played back from different file formats. Should the working group take any action on these documents other than to provide feedback so they can be refined and published as informational RFCs? That is, should there be any standardization of an "RTP file format"? Please send comments on this to the mailing list.

10. Slides

The presentation slides are available as follows:

Agenda, status, H.263-SDP, RTP changes:

Options for Repair:

H.263+ payload format:

Generic payload format:

http://drogo.cselt.it/mpeg/documents/dmif-avt.zip or


RTP as protocol:

[Editor's Note: The slides listed above are also available in this working group's slides section of these proceedings.]


Audio/Video Transport Group - Agenda and Status
Elevating RTP to Protocol (PDF file)
FEC Payload Format (PDF file)
Options for Repair of Streaming Media
RTP Payload Format for H.263+
The Role of DMIF in Support of RTP MPEG-4 Payloads
Common Generic RTP Payload Format
