2.7.1 Audio/Video Transport (avt)

NOTE: This charter is a snapshot of the 53rd IETF Meeting in Minneapolis, MN USA. It may now be out-of-date. Last Modified: 16-Jan-02
Chair(s):
Stephen Casner
Colin Perkins
Transport Area Director(s):
Scott Bradner
Allison Mankin
Transport Area Advisor:
Allison Mankin
Mailing Lists:
To Subscribe:
Description of Working Group:
The Audio/Video Transport Working Group was formed to specify a protocol for real-time transmission of audio and video over UDP and IP multicast. This is the Real-time Transport Protocol, RTP, together with its associated profile for audio/video conferences and payload format documents.

The current goals of the working group are to revise the main RTP specification and the RTP profile ready for advancement to draft standard stage (including the sampling algorithms for use with very large groups, which have been broken out into a separate document), to complete the RTP MIB, to produce a guidelines document for future developers of payload formats and to continue development of new payload formats.

The payload formats currently under discussion include a number of media specific formats (MPEG-4, DTMF, PureVoice) and FEC techniques applicable to multiple formats (parity FEC, Reed-Solomon coding).

Archive before July 2001:

Goals and Milestones:
Done   Working group last call on guidelines for payload format writers (BCP)
Done   Post revised DTMF payload format draft, ready for WG last call
Done   Post revised RTP spec and audio/video profile
Done   Working group last call on parity FEC draft (standards track)
Done   Post revised RTP MIB and issue working group last call (stds track)
Done   Post RTP implementation checklist draft
Done   Post revised draft on PureVoice (qcelp) payload format to address WG last call comments
Done   Post payload format for MPEG-4 based on MPEG/IETF joint meetings
Done   Post revised RTP membership (SSRC) sampling draft
Done   Submit RTP MIB to IESG for publication as Proposed Standard RFC
Done   Submit guidelines for payload format writers for publication as a BCP
Done   New working group last call on PureVoice payload format
Done   Analysis/simulation of multiplexing payload format proposals
Done   Working group last call on revised SSRC sampling draft (experimental)
Done   Post final revision of RTP spec and A/V profile drafts
Done   Revise MPEG-4 payload format document after implementation experience
Done   Working group last call on RTP and A/V profile (for Draft Standard)
Done   Decide how to proceed with multiplexing protocol: one generic payload format or a number of application specific formats
Done   Prepare MPEG4 implementation results ready for WG last call
Done   Post final revisions of selected multiplexing protocol draft(s)
Done   Working group last call on multiplexing payload format (stds track)
Request For Comments:
RFC1889  PS  RTP: A Transport Protocol for Real-Time Applications
RFC1890  PS  RTP Profile for Audio and Video Conferences with Minimal Control
RFC2032  PS  RTP payload format for H.261 video streams
RFC2029  PS  RTP Payload Format of Sun's CellB Video Encoding
RFC2190  PS  RTP Payload Format for H.263 Video Streams
RFC2198  PS  RTP Payload for Redundant Audio Data
RFC2250  PS  RTP Payload Format for MPEG1/MPEG2 Video
RFC2343  E   RTP Payload Format for Bundled MPEG
RFC2354      Options for Repair of Streaming Media
RFC2429  PS  RTP Payload Format for the 1998 Version of ITU-T Rec. H.263 Video (H.263+)
RFC2431  PS  RTP Payload Format for BT.656 Video Encoding
RFC2435  PS  RTP Payload Format for JPEG-compressed Video
RFC2508  PS  Compressing IP/UDP/RTP Headers for Low-Speed Serial Links
RFC2733  PS  An RTP Payload Format for Generic Forward Error Correction
RFC2736      Guidelines for Writers of RTP Payload Format Specifications
RFC2762  E   Sampling of the Group Membership in RTP
RFC2793  PS  RTP Payload for Text Conversation
RFC2833  PS  RTP Payload for DTMF Digits, Telephony Tones and Telephony Signals
RFC2862  PS  RTP Payload Format for Real-Time Pointers
RFC2959  PS  Real-Time Transport Protocol Management Information Base
RFC3009  PS  Registration of parityfec MIME types
RFC3016  PS  RTP payload format for MPEG-4 Audio/Visual streams
RFC3047  PS  RTP Payload Format for ITU-T Recommendation G.722.1
RFC3119  PS  A More Loss-Tolerant RTP Payload Format for MP3 Audio
RFC3158      RTP Testing Strategies
RFC3189  PS  RTP Payload Format for DV Format Video
RFC3190  PS  RTP Payload Format for 12-bit DAT, 20- and 24-bit Linear Sampled Audio

Current Meeting Report

Audio/Video Transport Working Group Minutes

Reported by Stephen Casner and Colin Perkins

The Audio/Video Transport working group held its 10th anniversary meeting at the 53rd IETF in Minneapolis, having first met at the IETF in San Diego in March 1992. We met again for two sessions, but this time for only two hours each. Some presentations were a bit squeezed in favor of taking time for necessary discussion. In the first session, we covered document status, Secure RTP, several new and old payload format specifications, and the profile for RTCP-based feedback. In the second session, we covered MPEG-4, a new proposal for carrying telephony state signaling over RTP, RTP retransmission and RTCP extensions. A proposed discussion of multiplexing RTP based on SSRC was cut due to lack of time.

Introduction and Document Status

This meeting began with the usual document status update, but for the first time, we didn't talk about advancing the RTP spec to Draft Standard because it is already in IESG Last Call. There were a few comments on the RTP spec draft-ietf-avt-rtp-new-11.txt and profile draft-ietf-avt-profile-new-12.txt that were mentioned at the meeting. First, regarding the profile, the convention for determining the ordering of audio channels based on the number of channels has an ambiguity because two orderings are specified for 4 channels. The solution is to remove the "quadrophonic" channel order because that technology never caught on and so we presume that ordering has not been used.

On the RTP spec, the first comment is that the packet loss calculation in Appendix A.3 results in an indication of 1 lost packet before any packets have been received, but corrects to 0 packets lost when the first packet is received. The solution has not been worked out yet; suggestions are welcome. The second comment was a question of whether the session bandwidth includes the RTCP bandwidth. The intention was that it does not. One reason is that bandwidth allocation in RSVP, for example, is per port, so there is no allocation common to both RTP and RTCP. It is also expected that the data bandwidth allocation needs to allow a little headroom, and the RTCP packets will fit into that headroom. Some of the wording is not clear enough; it will be improved. The third comment was a request from Paul Jones to change the requirement level for use of an even/odd port pair from SHOULD to MUST, unless both the RTP and RTCP ports are specified explicitly. Steve Casner reasoned that to say MUST and then "MAY disregard" if explicit seems a somewhat clumsy expression. However, the problem with the proposed change is that there may be some environments in which the requirement for even/odd port pairs does not make sense. Since RTP is a general framework, we need to be careful about what we mandate. SHOULD means you really should do it unless you understand the consequences of not doing so. There were no comments from those present, but comments are welcome on the mailing list.
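The first comment can be reproduced with a small sketch of the loss arithmetic (expected packets minus received packets). This is an illustration only, not the Appendix A.3 code itself: the names SourceState, init_source, record_packet and cumulative_lost are hypothetical, the update step assumes in-order delivery, and the initialization (base and highest sequence number both set from the first packet before it is counted) is an assumption chosen to match the behavior described above.

```python
from dataclasses import dataclass

RTP_SEQ_MOD = 1 << 16  # RTP sequence numbers are 16 bits wide


@dataclass
class SourceState:
    base_seq: int      # first sequence number seen
    max_seq: int       # highest sequence number seen
    cycles: int = 0    # wrap count, already shifted left by 16 bits
    received: int = 0  # packets counted so far


def init_source(first_seq: int) -> SourceState:
    # Assumption: state is set up from the first sequence number
    # before that packet has been counted as received.
    return SourceState(base_seq=first_seq, max_seq=first_seq)


def record_packet(s: SourceState, seq: int) -> None:
    # Simplified update: assumes in-order arrival, so a smaller
    # sequence number is treated as a 16-bit wrap.
    if seq < s.max_seq:
        s.cycles += RTP_SEQ_MOD
    s.max_seq = seq
    s.received += 1


def cumulative_lost(s: SourceState) -> int:
    extended_max = s.cycles + s.max_seq
    expected = extended_max - s.base_seq + 1
    return expected - s.received


state = init_source(1000)
print(cumulative_lost(state))   # loss reported before any packet is counted
record_packet(state, 1000)
print(cumulative_lost(state))   # loss reported after the first packet
```

Under these assumptions, the "+ 1" in the expected count is what produces the transient report of one lost packet: it already accounts for the first packet, which has not yet been added to the received count.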

Two RFCs (3189 and 3190) on DV audio and video payload formats were published since the last meeting, and draft-ietf-avt-rtp-amr-13.txt is in the RFC editor queue. We have five other drafts submitted to the IESG for publication but not accepted yet. There are nine more drafts in working group last call which will go to the IESG as reviews (and in some cases updates) are completed. Five of these appear later on the agenda. For one of the others, draft-ietf-avt-dsr-01.txt, Steve Casner had two comments: 1) the payload format includes a 4-bit CRC on each frame-pair that breaks octet alignment and may not be very effective; however, the reply was that this CRC is part of the expected information transfer between the two ends; and 2) rather than having the payload format indicated by MIME subtype "dsr" and then including a MIME parameter to indicate the actual front-end encoding, the MIME subtype should indicate the front-end encoding directly because that's all there is to the payload format. This was accepted. The remaining drafts in working group last call include draft-ietf-avt-evrc-smv-00.txt which has been reviewed, but will probably require another revision to address comments. The last two, draft-ietf-avt-uxp-02.txt and draft-ietf-avt-ulp-04.txt, are waiting for reviews by working group participants, especially to assess the level of interest.

MIDI Payload Format

Colin Perkins presented a slide on the MIDI packetization protocol draft-ietf-avt-mwpp-midi-rtp-02.txt since John Lazzaro could not attend. This draft has been discussed quite a bit on the mailing list. As a result, much of the non-normative description of send/receive algorithms was removed since there may be multiple valid algorithms. We'd like to hear opinions about whether it would be valuable to include a suggested algorithm as an appendix even though it is not normative. Jason Flaks concurred. The protocol is being extended to cover the full MIDI command set, but work remains to provide resiliency for commands not covered by the recovery journal.

Secure Real Time Transport Protocol

David McGrew described the changes to the Secure Real Time Transport Protocol in draft-ietf-avt-srtp-03.txt and the feedback received. The main change was the addition of an optional Master Key Identifier that is useful to avoid synchronization problems for applications needing frequent re-keying. This is not as general purpose as the optional SPI that was removed in the previous draft revision. Steve Casner expressed a concern that the draft might not be stable yet if non-trivial changes such as this are still being made. The reply from Mark Baugher was that the SPI was removed because it was partially redundant with other mechanisms; the MKI has a more limited role and was added to allow SRTP to meet the requirements of larger-scale multicast applications in addition to the telephony applications which had previously been the primary focus. An earlier criticism of the TMMH hash algorithm was that it did not work well for large packets, so a revised TMMHv2 is specified in this draft. It is believed that this change does not impact the provable security aspects of the hash; it has been accepted in reviews by the Security Area Advisory Group. Colin Perkins expressed concern not about this change but about the inclusion of TMMH in this draft rather than as a separate draft because it is just one algorithm that may be used in SRTP. David agreed since it has already been submitted to SAAG.

Other changes to SRTP included the addition of a Master Salt with the Master Key, and tweaks to AES Counter Mode to increase efficiency in applications using the same key for multiple sessions by depending on SSRC uniqueness when that can be guaranteed by signaling. Since the draft was posted, there was quite a bit of feedback on the mailing list: it may not be reasonable to specify a single processing order for the combination of SRTP and FEC because the requirements differ among applications; need to specify the relationship to RTP padding; and need to specify what happens if there is an SSRC collision. Martin Euchner pointed out the need for re-keying when counter mode is used in very long-lived sessions; the draft needs to say more about key management requirements for this. He also expressed support for not requiring TMMH, and making it a separate draft, because some applications with tight memory requirements want to use the same underlying algorithm for all crypto services. David agreed that separation of TMMH should not be a problem, and concluded by saying that there is a long list of things to correct, but that the next revision is expected to be final.

Payload Format for JVT Video

Stephan Wenger presented a new video payload format proposal for JVT (Joint MPEG/ITU Video Team) video, also known as H.26L, in draft-wenger-avt-rtp-jvt-00.txt. He clarified that this is an individual proposal that does not yet have the support of the full JVT because the issues are still being considered. Stephan emphasized that for the first time the requirements for packet transmission are being considered during the design of the video coding, not afterwards. The video coding itself is expected to be technically frozen in May with an opportunity for minor changes in July, so we have a small time window for input from the payload format considerations to affect the video coding. The Video Coding Layer (VCL) is similar to predecessor algorithms but with enhancements like different block sizes and macroblock shapes. An important change for packet networks is that slices are independently decodable -- it is not necessary for the packetization to replicate header information. The Network Adaptation Layer (NAL) maps slices to different kinds of networks, either bit-oriented or packet-oriented, and this is where AVT comes in. There are two new features that affect the payload format. The first is Enhanced Reference Picture Selection (ERPS) in which the reference picture for motion compensation may be one of several past pictures that are stored in a buffer. Normally this is a FIFO list of completed pictures, but it is also possible to interleave slices of other reference pictures that are unrelated in time. These may be referenced when the scene changes to a commercial so no I-frame is required at that time. The second feature is Data Partitioning into different levels of importance, with the header data being the most important. This allows unequal error protection to be added.

The main issue discussed during the AVT session and at a lunch meeting between sessions was the meaning of the RTP timestamp. The draft proposes to use the packet transmission time on the theory that the inclusion of temporally unrelated pictures would preclude the use of sampling, presentation or decoding times. However, using the transmission time precludes synchronizing a JVT stream with other media streams. As reported by Stephan at the start of the second session, after much discussion during the lunch meeting it was concluded that the presentation time could be used as the RTP timestamp value, and that packets with no presentation time could be assigned a timestamp between the timestamps of the adjacent timed frames. This leads to another issue: the need to include timestamps in compound packets, to indicate the time of the aggregated packets when the temporal spacing is not uniform (this work will be pursued in the JVT). Stephan also noted a message from Steve Casner, suggesting that sampling time (which is relatively equivalent to the presentation time) needs to be used instead of presentation time in order to match the timestamps of other RTP streams. Stephan pointed out that this may be confusing for streams that are not real-time, so we agreed to discuss offline how this could be specified clearly. Stephan noted that there was no resolution on use of the marker bit, but that this was not critical. The penultimate issue was the use of PUPs and parameter set updates (competing solutions: in the bitstream and in the control protocol). The current solution in the draft was found reasonable, but it would be nice if it was a little more flexible. Stephan will look into this, and report back on the reflector. The final issue was the topic of user data: the ability of the bitstream to transport arbitrary data. We cannot prevent this from being transported -- it's included in the standard and used by a number of applications -- but the draft should include text in the security considerations to point out the potential risks of this data.
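The timestamp conclusion from the lunch meeting (presentation time where one exists, otherwise a value between the adjacent timed frames) might be sketched as follows. This is only an illustration: the function name fill_timestamps is hypothetical, linear interpolation is one assumed way of picking a value "between" the neighbours, and 32-bit timestamp wraparound is ignored.

```python
from typing import List, Optional


def fill_timestamps(ts: List[Optional[int]]) -> List[int]:
    """Give untimed packets a timestamp between the adjacent timed ones.

    `ts` holds one entry per packet in transmission order; None marks
    a packet that has no presentation time of its own.
    """
    out = list(ts)
    timed = [i for i, t in enumerate(ts) if t is not None]
    if not timed:
        raise ValueError("need at least one timed packet")
    for a, b in zip(timed, timed[1:]):
        gap = b - a
        for k in range(1, gap):
            # Assumption: spread untimed packets evenly between the
            # two timed neighbours (integer arithmetic, rounds down).
            out[a + k] = ts[a] + (ts[b] - ts[a]) * k // gap
    # Packets before the first or after the last timed packet have
    # only one timed neighbour, so this sketch reuses its timestamp.
    for i in range(timed[0]):
        out[i] = ts[timed[0]]
    for i in range(timed[-1] + 1, len(ts)):
        out[i] = ts[timed[-1]]
    return out


print(fill_timestamps([0, None, 6000]))
```

Any value strictly between the two neighbours would satisfy the description in the minutes; even spacing is simply the least surprising choice for a receiver that orders packets by timestamp.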

iLBC Speech Codec and Payload Format

Alan Duric presented a new audio payload format for iLBC Speech (draft-duric-rtp-ilbc-00.txt) along with an introduction to the "internet Low Bitrate Codec" itself. Additional details about the codec were presented by Soren Andersen. Global IP Sound has submitted an IPR statement to the IETF stating that the codec is "freeware", and the IETF is being asked to standardize this codec because it is not being considered by the other organizations that usually standardize codecs and because it is specifically targeted for packet networks. Each packet is decoded without error propagation from previous packets, and as a result the codec is claimed to provide better robustness to packet loss (higher MOS scores) than the ITU-standard codecs G.729 or G.723. Results of third-party MOS testing including loss according to actual network loss patterns will also be submitted in the future. At this point the payload format for iLBC is minimal (N integral frames with no payload header), but changes are planned to include unequal error protection sorting of the frames and this will make it more complicated. Work is also in progress to reduce the frame size from 53 to 52 octets. In addition to the RTP format, the draft specifies a storage format consisting simply of a magic number followed by a sequence of speech frames. The codec itself is specified in draft-andersen-ilbc-00.txt which stretches to 203 pages because it includes source code for a full reference implementation. See also Do any AVT participants have concerns about the IETF standardizing a codec, or about AVT being the working group home for it? Or, any suggestions about how the codec spec, and possibly the reference implementation, should best be published? In times past, the IESG insisted that we should not standardize codecs. However, now we have AD input that supporting a freeware, Internet-oriented codec is a worthwhile motivation for the IETF to consider standardizing it. Henry Sinnreich commented that he had used this codec several times over a five-network path between the US and Sweden with excellent results. Those present hummed yes for taking on iLBC as a working group item.
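The storage format described in the presentation (a magic number followed by a sequence of speech frames) could be read along the lines of the sketch below. The magic value here is a made-up placeholder, not the one defined in the draft, and the function name read_frames is hypothetical; the 53-octet frame size is the figure quoted above and would change if the planned reduction to 52 octets goes ahead.

```python
import io
from typing import BinaryIO, Iterator

# Placeholder only: the real magic number is defined in the draft.
MAGIC = b"#!hypothetical-ilbc\n"
FRAME_OCTETS = 53  # frame size quoted in the presentation


def read_frames(f: BinaryIO) -> Iterator[bytes]:
    """Yield fixed-size speech frames from a stored stream."""
    if f.read(len(MAGIC)) != MAGIC:
        raise ValueError("bad magic number")
    while True:
        frame = f.read(FRAME_OCTETS)
        if not frame:
            return  # clean end of file
        if len(frame) != FRAME_OCTETS:
            raise ValueError("truncated frame")
        yield frame


# Usage: a synthetic file holding two all-zero frames.
demo = io.BytesIO(MAGIC + bytes(FRAME_OCTETS) * 2)
print(sum(1 for _ in read_frames(demo)))
```

Because the frames carry no header and are all the same length, the reader needs no framing information beyond the frame size, which mirrors the minimal RTP payload format (N integral frames with no payload header).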

Payload Format for AC-3 Audio Streams

Jason Flaks presented an update on the payload format for AC-3 audio streams in draft-flaks-avt-rtp-ac3-01.txt with a few small changes. This revision removes the fragment counter from the payload header and adds a flag bit to indicate whether a timecode is contained within the frames in the packet to make them easier to find. Implementation is underway to verify the workability of the payload format. Most of the applications being considered are in closed environments rather than over the Internet at large. Some interesting new applications are being considered such as streaming to wireless speakers using 802.11. Jason asked about including in the header an indication of the channel format contained in the packet. Colin Perkins asked if the channel-order specification in SDP would be sufficient. It might be necessary to add a 5.1 channel order. (This draft was accepted as a WG item after the last meeting, but was not renamed in this revision.)

Payload Format for SMPTE 292M Video

Ladan Gharai described the changes to the payload format for SMPTE 292M video in the revised draft-ietf-avt-smpte292-video-04.txt. The primary change was to increase the RTP timestamp clock from 10 MHz to 148.5 MHz or 148.5/1.001 MHz so that timing can be precise to the 10-bit word when operating at either of the SMPTE 292M data rates, 1.485 Gbps or 1.485/1.001 Gbps. The second clock rate is not an integer so it can't be represented in the current syntax of the SDP rtpmap attribute. Rather than modifying the SDP syntax and causing backwards compatibility problems, the group preferred the proposal to specify that the nearest integer 148351648 should be interpreted as 148500000/1.001. Since there are only two legal values for the clock rate (148500000 and 148351648), there is no ambiguity for implementations of this payload format. Even if some unrelated device such as a third-party monitor interprets the value literally, the error is about 4 orders of magnitude less than that of a typical crystal oscillator. The second change was to use 16 previously unused bits in the payload header to carry the 11-bit line number and the V and F flags. This allows reconstruction of the start- and end-of-active-video (SAV and EAV) and the line number in case of packet loss. In addition, the packetization rules now specify that the 10-bit word sequences making up the SAV and EAV timing signals SHOULD NOT be fragmented because the decoder needs the full sequence to detect the start of a scan line; Chuck Harrison commented on the mailing list that this should be MUST NOT. The packetization also specifies that lines MUST NOT be fragmented within the sequence of Y, Cr and Cb words making up one video pixel; Chuck commented that this makes the packetization rules dependent on the video format being carried, and suggested "SHOULD NOT" instead.
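The clock-rate arithmetic can be checked with a few lines. A 1.485 Gbps stream of 10-bit words carries 148.5 million words per second, which is where the 148.5 MHz clock comes from. The sketch below uses exact rational arithmetic to confirm the nearest integer to 148500000/1.001 and the size of the error a device would incur by taking that integer literally; the variable names are illustrative only.

```python
from fractions import Fraction

NOMINAL_HZ = 148_500_000

# 148500000 / 1.001, held exactly as 148500000 * 1000 / 1001
exact = Fraction(NOMINAL_HZ * 1000, 1001)

# Integer value that would be carried in the SDP rtpmap attribute
signalled = round(exact)

# Relative error if the integer is interpreted literally
relative_error = float(abs(exact - signalled) / exact)

print(signalled)
print(f"{relative_error:.2e}")
```

The relative error comes out at a few parts in 10^9. Taking a typical crystal oscillator tolerance as being on the order of tens of parts per million (an assumed figure, not one given at the meeting), that is consistent with the "about 4 orders of magnitude" comparison above.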

Extended RTP Profile for RTCP-based Feedback

Joerg Ott presented changes in draft-ietf-avt-rtcp-feedback-02.txt. He explained that there was only one issue to be resolved as of the last meeting, but several comments from the list since then are also addressed. The open issue from last time was a concern that lack of feedback suppression for packet types that are not understood could cause an implosion, but after discussion at the last meeting it was agreed this is not a problem. This revision of the draft includes the explanation from that discussion so others won't raise the same concern, and also removes the Generic Info packet which we agreed did not help. The security considerations section was also expanded, more details were given on the modes of operation, and many editorial nits were fixed. An appendix on video considerations that had been included for motivation was deemed unnecessary and removed. The packet format for Reference Picture Selection had been inadvertently omitted and so was added, along with allowing RPSI in ACK mode, and the description of video-specific feedback formats was expanded. The most significant change was that the feedback timing rules have been simplified by the removal of RTT-based timing to use only timing based on group size and RTCP bandwidth. This was driven by a concern that, in a session in which some participants were receive-only and would therefore not know the RTT, the timing could be inconsistent. Steve Casner observed that simplification was good, but asked if the goals would still be met. Joerg replied that the feedback still provides significant improvement over operation without it; the main effect would be longer response times in a local area where a small RTT would have allowed shorter times. The simulation draft will be updated to reflect this change.
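To illustrate how timing can depend only on group size and RTCP bandwidth, the sketch below uses the general RTP scaling principle: each member spaces its reports so that the group as a whole stays within the RTCP bandwidth. It is not the feedback draft's timing algorithm, and it leaves out randomization and any minimum interval; the function name and parameters are hypothetical.

```python
def rtcp_interval(members: int,
                  avg_rtcp_octets: float,
                  rtcp_bw_octets_per_s: float) -> float:
    """Seconds between reports from one member.

    Chosen so that `members` participants, each sending packets of
    `avg_rtcp_octets` on average, together consume the RTCP bandwidth.
    """
    if members < 1:
        raise ValueError("members must be at least 1")
    if rtcp_bw_octets_per_s <= 0:
        raise ValueError("RTCP bandwidth must be positive")
    return members * avg_rtcp_octets / rtcp_bw_octets_per_s


# Doubling the group size doubles the per-member interval.
print(rtcp_interval(4, 100, 400))
print(rtcp_interval(8, 100, 400))
```

Every participant can evaluate this from the observed membership and the configured bandwidth, including receive-only participants that have no RTT estimate, which is the consistency property the simplification was after.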

There are two new open issues. The first is that the scope of this feedback specification was to convey information about problems from the receiver to the sender and let the sender take whatever action is appropriate based on all the constraints that may be known to it. Roni Even and Orit Levin explained that folks in the video conferencing community would like the spec to mandate particular actions by the sender when certain feedback is received because otherwise the decoder can't depend on this mechanism for feedback. However, Joerg explained that these requirements should be part of the video encoding specification and not the RTCP feedback spec which may be used with many video encodings. Stephan Wenger explained that mandating the sending of an I-frame, for example, may violate congestion control requirements. After considerable discussion, we concluded that the current scope is appropriate. Other communication paths, such as SIP, may be used for more explicit commands such as remote camera control which require more reliable communication and have less stringent timing requirements. The second open issue was whether there will be a combinatorial problem with profile selection, given that we have two new profiles in development. Steve Casner proposed that we continue with the plan to specify a new profile (with a new name) for useful profile combinations, taking care to address any interactions there may be in the combination. We don't expect many profiles, but if the number grows to the point that naming is a problem, then we can consider a different naming (or numbering) scheme for the set that may be freely combined. There was not a lot of discussion due to lack of time as the first session ended. These issues may not require a change to the draft, but another revision is needed to shorten the author list and the abstract.

MPEG-4 Payload Formats

Philippe Gentric gave an update on the status of the MPEG4-generic payload format (draft-ietf-avt-mpeg4-multisl-04.txt) and the companion simplified draft (draft-ietf-avt-mpeg4-simple-01.txt). He recapped the genesis of the design, with the goal of transporting any MPEG-4 SL stream, and the desire from ISMA to have a simplified version of this with less emphasis on the full capabilities of the SL. Colin Perkins asked for clarification on the role of RFC 3016; it is a subset of the Simple format for video, but is incompatible for audio. Philippe outlined the plan to move the Generic and Simple formats to RFC, with Simple being positioned as an applicability statement of Generic, but noted that there is only "weak consensus" and support of the Generic format, and not many implementations. There is, however, good consensus and strong support behind the Simple format, with quite a few implementations by the ISMA members. Because of this, he proposed to "make now what can be made now", pushing the Simple format to RFC now while leaving the Generic format for later. Philippe believes that the Simple format can go forward now, with only minor textual changes, since it is written in a manner that is largely independent from the Generic format anyway. The Generic draft remains open, and will be backwards compatible with Simple, with two main additional features (transport of SL packets, interleaving of AU fragments). At a later stage the Generic format will be moved to RFC.

The only major issue is that MPEG wants clarification that it can also transport system streams, for example BIFS, in the Simple format. Stephan expressed concern about the suggestion to transport BIFS in the Simple format; this is a significant extension to the scope, and raises a number of security and reliability issues. Philippe noted that the Generic format has a section that explains the issues, and that BIFS defines robustness features itself. Stephan is not convinced that the BIFS robustness features are sufficient: for BIFS you need reliable transport. Young-Kwon Lim and Philippe disagreed, noting that BIFS was designed for robustness. Steve Casner concluded the discussion, noting that there is controversy, and that any addition to the Simple format to carry BIFS must clearly explain the issues.

Steve Casner liked the idea of putting forward what we're sure about, but noted that the Simple draft is currently written as a subset of Generic, and uses the MIME type of Generic, with parameters to specify usage. If we do things the other way around, we have to decide if the MIME type still makes sense.

Steve Casner asked about the implementation status of RFC 3016, and if the need for LATM audio would continue. Philippe noted that 3GPP has taken RFC 3016 as part of their standards, so LATM audio will likely continue to be used there, but that MPEG are trying to promote the Simple format for audio since it offers improved performance compared to RFC 3016. There was an inconclusive discussion on the possible future of RFC 3016 audio.

Colin Perkins expressed concern that we have RFC 3016, the Simple format (which is a minor extension of that), and we may well have Generic which is another extension. Are we getting a proliferation of MPEG-4 payload formats? Philippe noted that the only alternative is a true proliferation: one for each type of MPEG elementary stream. Stephan Wenger suggested that the Generic draft was "mostly seen as a toolbox, from which you pick and choose whatever is necessary for a certain application" and should be an informational RFC so the other formats could pick-and-choose from it, since it's unlikely to make it past Proposed Standard anyway (needing multiple complete implementations).

The status of the FlexMux format was queried: will it be completed? Colin Perkins noted that we already have RTP header compression and RTP specific multiplexing, but that FlexMux is a potential alternative for MPEG-4 content.

Colin Perkins again expressed concern about the potential for proliferation of MPEG-4 payload formats. Young-Kwon Lim noted that the original MPEG aim was to produce one format, but they started with a simple format because of IETF procedural issues. They plan to make everything a subset of Generic, rather than defining new formats. Colin Perkins was concerned that we may work through several possible subsets of Generic, making each an RFC, and that this would be unfortunate. Stephan Wenger agreed that proliferation of RFCs was bad, but noted that the concern about the Generic format was that it is too complex. Given the choice between one excessively complex format, and a proliferation of simpler formats, he may choose the proliferation. Steve Casner noted that the real conflict is between RFC 3016 and Simple, rather than between Simple and Generic. If we accept this conflict, we should go forward with Simple and follow it with Generic eventually, trying to avoid intermediate RFCs between Simple and Generic.

Young-Kwon Lim discussed the framework for the carriage of MPEG-4 content (draft-singer-mpeg4-ip-04.txt). This is a part of the MPEG-4 standard, 14496-8, and is intended to be published as an informational RFC in the IETF, providing an umbrella specification with guidelines for designing new MPEG-4 payload formats, and listing the common MIME types. Colin Perkins asked about the procedural issues: how can this be part of the MPEG-4 standard, when it references an Internet-Draft? Young-Kwon replied that there will be a delay in this part of the MPEG-4 standard, until the draft is published as an RFC. Stephan Wenger supported publication of the framework as an RFC, as well as part of MPEG-4, but worried about the possibility of divergence between the RFC and MPEG-4. Steve Casner agreed that publishing a framework pointing to the other documents is a good idea, but noted that this draft also has some normative aspects, and it is this which causes the problems (e.g., the definition of the MIME types in the framework). Philippe Gentric suggested splitting the framework draft, for example putting the MIME types into a separate document. Colin Perkins noted that he thought the draft was fundamentally okay, but we really needed to make sure it's consistent with the payload formats, rather than making fundamental changes. We need to check that the draft is consistent and correct, then move it forwards.

Payload Format for State Signaling Events

Rajesh Kumar discussed a new RTP payload format proposal for in-band state signaling events (draft-rajeshkumar-avt-sse-00.txt). This format is intended for modem over IP sessions, where signaling must be synchronized with the media to complete certain operations. Steve Casner commented that the solution we would like to see is one where signaling can go through the signaling path rather than being intermingled with the media stream. However, the RTP payload format is proposed to transport the signaling events because out-of-band signaling is too slow. The question: is this appropriate? Jim Renkel noted that this is not really a traditional signaling issue, in the sense that call setup is, since media processing is significantly shifted by specific events in the media stream. He also noted that there is consensus in the ITU study group that some form of in-band signaling is needed for this, but that there is recognition that this is not the only mechanism needed (and, in some cases, out-of-band signaling will be needed).

Jean-Francois Mule noted that the proposed solution works for switching from RTP to another medium, but not for switching back. Rajesh disagreed, since multiple media formats can operate on a single port, and use of multiple ports can be specified. Jean-Francois also objected to the mingling of signaling and control, since this puts signaling data at the DSP, which is a significant problem for many implementations. Glen Parsons reiterated these points: the ITU T.38 study group would prefer to use the existing signaling methods, since the timing concerns raised in the draft are not an issue for them.

Joerg Ott asked how modem over IP was to be carried: RTP or some other protocol? Rajesh noted that it doesn't really matter, but the current specification uses a format called SPRT, which is not the same as RTP but has the payload type field in the same place, so it can be multiplexed as part of the same stream. Steve Casner strongly objected to multiplexing using the payload type field, pointing to the list of reasons in the RTP specification, and especially objected to multiplexing different protocols by fields that happen to line up in the same bits. Joerg noted that, even if using RTP, there is no timing advantage to using a different format, rather than just changing the RTP payload type. Why can't the other side just buffer media data when the payload type changes? The proposed signaling cannot improve the reaction time, so why use it? There was further discussion on this point, with Eric Burger and several others commenting, but no real conclusion.
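To make the point of contention concrete, the following minimal sketch shows where the payload type sits in the 12-octet RTP fixed header. It covers RTP only; the SPRT layout is not reproduced here, and the function name parse_rtp_fixed_header is a hypothetical helper chosen for illustration, not something taken from the draft.

```python
import struct


def parse_rtp_fixed_header(packet: bytes) -> dict:
    """Decode the 12-octet RTP fixed header (hypothetical helper name)."""
    if len(packet) < 12:
        raise ValueError("RTP fixed header is 12 octets")
    # Two single octets, 16-bit sequence number, 32-bit timestamp, 32-bit SSRC.
    b0, b1, seq, ts, ssrc = struct.unpack("!BBHII", packet[:12])
    return {
        "version": b0 >> 6,            # top 2 bits of the first octet
        "padding": bool(b0 & 0x20),
        "extension": bool(b0 & 0x10),
        "csrc_count": b0 & 0x0F,
        "marker": bool(b1 & 0x80),     # top bit of the second octet
        "payload_type": b1 & 0x7F,     # low 7 bits of the second octet
        "sequence_number": seq,
        "timestamp": ts,
        "ssrc": ssrc,
    }
```

A receiver that dispatches on payload_type alone will treat any packet with a matching value in those seven bits as RTP, whatever the rest of the packet means. That is one way to read the objection to multiplexing different protocols on fields that merely line up.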

Discussion then moved to meta issues: we should focus on the requirement rather than on specific solutions. Steve Casner noted that the cleanest solution might be a direct signaling path, following the media path but not in-band with the RTP stream, but that this might still raise issues due to bypassing the out-of-band signaling devices, QoS reservations and policy restrictions. Steve cut off the discussion at that point, with the comment that the problems seem broader than this particular issue and beyond the scope of AVT. We should step back, and take a broader look at the requirements and problem space, and decide on the appropriate course of action after discussion with the Area Directors. Jim Renkel noted that there may need to be higher level coordination between IETF and ITU on the subject of modem-over-IP, since the discussion today just touches on part of the problem. Scott Bradner (the transport Area Director in attendance) agreed that a broader look at the problem was appropriate.

RTP Retransmission Framework

David Leon was scheduled to discuss the RTP retransmission framework in draft-leon-rtp-retransmission-02.txt, but noted that he and the authors of draft-ietf-avt-rtp-retransmission-00.txt had agreed to produce a combined draft and therefore saw no need to present the original draft at this point. Steve Casner commented that he could see the requirements and advantages/disadvantages to the different approaches, but that he couldn't see how they combine. If the notion of the merger is to simply make one draft, which presents two approaches, that is not necessarily better than two drafts, one for each approach, with clear applicability statements (much in the way that was done for ULP and UXP).

Following this discussion, in which it was decided to keep the drafts separate, David outlined the changes in the draft since the last meeting. The congestion control discussion has been expanded and clarified. There is new discussion of using RFC 2198 redundancy to send retransmissions, although this does not provide an exact retransmission (since the RFC 2198 format doesn't convey certain header fields). Carsten Burmeister noted that the use of RFC 2198 here causes overlap with the other retransmission proposal, and has the disadvantages of packet expansion and incomplete repair. Magnus Westerlund agreed with the criticism, noting that the main thrust of this draft was separate transmission, and that the use of RFC 2198 didn't fit. Colin Perkins noted that the RFC 2198 format was never intended to be lossless, and if the information -- such as sequence numbers -- that it doesn't convey is important to an application, it is not appropriate to use RFC 2198.
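The remark about missing header fields can be seen in the RFC 2198 block headers themselves. The sketch below parses them under the layout given in RFC 2198: each redundant block has a 4-octet header (F bit, 7-bit block payload type, 14-bit timestamp offset, 10-bit block length), and the primary block has a 1-octet header (F bit clear, 7-bit payload type). The function name parse_red_headers is a hypothetical helper for illustration.

```python
def parse_red_headers(payload: bytes):
    """Parse RFC 2198 block headers from an RTP payload (hypothetical helper).

    Returns (blocks, data_offset), where data_offset is the index of the
    first octet of block data following the headers.
    """
    blocks = []
    i = 0
    while True:
        if i >= len(payload):
            raise ValueError("truncated redundancy headers")
        first = payload[i]
        block_pt = first & 0x7F
        if first & 0x80:
            # F bit set: a 4-octet header for a redundant block follows.
            if i + 4 > len(payload):
                raise ValueError("truncated redundant block header")
            word = int.from_bytes(payload[i:i + 4], "big")
            blocks.append({
                "pt": block_pt,
                "timestamp_offset": (word >> 10) & 0x3FFF,  # 14 bits
                "length": word & 0x3FF,                     # 10 bits
            })
            i += 4
        else:
            # F bit clear: 1-octet header for the final (primary) block.
            blocks.append({"pt": block_pt, "timestamp_offset": 0, "length": None})
            i += 1
            return blocks, i
```

Each redundant block carries only a payload type, a timestamp offset relative to the enclosing packet, and a length. There is no field for the original packet's sequence number, so a receiver cannot reconstruct that value from the redundant copy.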

Another comment from the last meeting was the suggestion that RFC 2733 could be used to retransmit packets. This is true, but the retransmission format has less overhead. The retransmission format also uses the actual media timestamp, unlike RFC 2733, and is more in line with the RTP specifications in this regard. Also, if using RFC 2733 for retransmission, the receiver would have no way of knowing if the FEC was used for retransmission or for FEC. Colin Perkins noted that the latter point may be considered an advantage of the RFC 2733 format for retransmission, although David disagreed.

David also outlined some measurements they made using the RTCP feedback profile, draft-ietf-avt-rtcp-feedback-02.txt, to compare the use of early and regular feedback. These seem to show that early retransmission requests do not help, but there was discussion that the simulations were flawed and were not in the correct mode to use early feedback. Further simulation is clearly needed, and possibly clarification of the feedback profile.

RTCP Extensions for Single-Source Multicast

The next presentation was on draft-ietf-avt-rtcpssm-00.txt by Julian Chesterfield. The security considerations section has been expanded to clarify the requirements for security and better identify threats. This is work in progress, and input from the group would be appreciated. The current focus is to identify an existing security mechanism to address the requirements. The motivation behind the summary mechanism was outlined, and it was noted that it has become more generic in this version of the draft, providing the ability to introduce new summary types. The draft suggests round-trip time distribution and SSRC distribution types, and feedback is again solicited. A brief comparison of the report frequency for standard RTP and for the summary feedback was given; for details see .
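For background on the report frequency comparison, the sketch below estimates the standard RTCP reporting interval for receivers as group size grows. It is a simplified, deterministic reading of the interval rule in the RTP specification, with assumptions labeled: RTCP takes 5% of the session bandwidth, receivers share 75% of that (the case where senders are a small fraction of members), the minimum interval is 5 seconds, and randomization is omitted. The function and parameter names are hypothetical, and the numbers are not taken from the presentation.

```python
def rtcp_receiver_interval(receivers: int,
                           session_bw_bps: float,
                           avg_rtcp_size_octets: float,
                           rtcp_fraction: float = 0.05,
                           receiver_share: float = 0.75,
                           t_min: float = 5.0) -> float:
    """Approximate seconds between RTCP reports from one receiver.

    Simplified sketch: ignores the randomization factor and any
    reduced initial minimum interval.
    """
    # RTCP bandwidth in octets per second.
    rtcp_bw = session_bw_bps * rtcp_fraction / 8.0
    # Portion of the RTCP bandwidth shared among all receivers.
    receiver_bw = rtcp_bw * receiver_share
    # All receivers together must fit their reports into receiver_bw.
    return max(t_min, receivers * avg_rtcp_size_octets / receiver_bw)
```

Under these assumptions, the interval stays at the minimum for small groups and grows linearly with the number of receivers once the shared bandwidth is exhausted, which is the scaling behavior that summary-style feedback is meant to ease.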

A feedback target identifier has been introduced: an SDP attribute determines where the feedback information is sent. There are clearly security considerations involved in the definition of this attribute which need to be worked out, in particular how to identify that the SDP comes from a trusted source. Marshal Eubanks noted that there are some applications where it is necessary to send feedback somewhere other than the source, and that this is a real problem with significant issues. Julian noted that he hoped that TESLA might be a solution here, but more work is needed. Mark Handley suggested a hello-nonce-echo exchange at the start of the session. Steve Casner noted that this is the approach used in the similar unicast RTCP mechanism implemented in IP/TV, but this doesn't solve the third-party problem. Marshal Eubanks noted some urgency for standardizing this function since his client and IP/TV both use something like this but don't interwork. Further discussion and experimentation are needed.

RTCP Reporting Extensions

The final area of discussion was RTCP Reporting Extensions, with Timur Friedman discussing draft-friedman-avt-rtcp-report-extns-02.txt. Timur outlined his draft and noted changes since it was last discussed at the Pittsburgh meeting. The draft now uses a new packet type, rather than extending SR/RR packets. It also allows data thinning to constrain bandwidth, selecting certain receivers (see papers referenced in slides). There has been deployment experience, validating the usefulness of these mechanisms. Timur requested that a working group last call be issued to move this to experimental RFC, but the chairs felt it worth considering the overlap with the related draft by Alan Clark (draft-clark-avt-rtcpvoip-00.txt) first to see if consolidation is feasible. The goal is to move to last call relatively quickly though, since the report extensions draft is stable.

Alan Clark was unable to attend the meeting, so Colin Perkins briefly presented the slides he provided. His proposal is based on Timur's draft, but uses an alternative description of the packet loss distribution and proposes several additional voice quality metrics. Three good questions were raised:

- Is the definition of a gap unambiguous, or may several different conditions result in the same value?

- Aren't these measures very dependent on a certain model?

- What is the use of having both the R-factor and the estimated MOS in the packets? They seem to be equivalent.
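The third question can be illustrated with the R-to-MOS mapping from the ITU-T G.107 E-model. This is a sketch under an assumption: that the estimated MOS in the draft is derived from R by this mapping. The draft may define it differently, and the function name is hypothetical.

```python
def r_factor_to_mos(r: float) -> float:
    """Map an E-model R-factor to an estimated MOS (ITU-T G.107 mapping)."""
    if r <= 0.0:
        return 1.0
    if r >= 100.0:
        return 4.5
    # Cubic mapping defined for 0 < R < 100.
    return 1.0 + 0.035 * r + r * (r - 60.0) * (100.0 - r) * 7.0e-6
```

If the estimated MOS is computed this way, it is a fixed function of R and carries no information that R does not already carry, which is the redundancy the question points at.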

Multiplexing RTP based on SSRC

The last item on the agenda was considered conditional on having enough time, and as usual, we didn't. However, Steve Casner brought up this topic on the mailing list before the meeting, and his slides are included with these minutes in the proceedings.

Action Items

We took two "hums" during the meeting which need confirmation on the mailing list. They were to accept the payload format for iLBC speech (draft-duric-rtp-ilbc-00.txt) and the RTP retransmission framework (draft-leon-rtp-retransmission-02.txt) as working group tasks. We solicit confirmations or objections on these actions -- we want to hear both "yeas" and "nays".

In addition, we have the new JVT video payload format proposal in draft-wenger-avt-rtp-jvt-00.txt for which we neglected to request a hum. I expect there would be wide support in AVT for including a JVT payload format as an AVT work item; please comment. We note that there may also be additional input on the payload format from the JVT itself.


Monday Agenda
Wednesday Agenda
SRTP Draft v2 to v3 Changes
RTP Payload for JVT Video
Updated RTP Payload format for AC3 Streams
RTP Payload Format for SMPTE 292M Video
RTCP-based Feedback
News about MPEG-4 RTP payloads
A framework for the carriage of MPEG-4 contents
Fast, Reliable Media State Change Signaling Slide Deck Version 2
RTP retransmission framework
RTCP extensions for Single Source Multicast sessions
RTCP Reporting Extensions
RTCP Extensions for VoIP
RTP Payload for JVT Video, Results of the Lunch Breakout
AVT Drafts in Process
Multiplexing on SSRC ID