Last Modified: 2004-09-22
|Done||Review DCCP including prototypes and API; feedback to DCCP WG|
|Done||Initial draft requirements for ECRTP over MPLS; discuss with MPLS WG|
|Done||Submit iLBC payload format for Proposed Standard|
|Done||Submit iLBC codec specification for Experimental|
|Done||Advance RTP specification and A/V profile to Full Standard|
|Mar 04||Finish requirements for ECRTP over MPLS; recharter for subsequent work|
|Jul 04||Begin update of RTP/AVPF profile for Draft Standard RFC|
|Jul 04||Begin update of SRTP profile for Draft Standard RFC|
|Jul 04||Submit RTP/SAVPF profile for Proposed Standard|
|Aug 04||Consider update of RTP MIB|
|Sep 04||Submit RTCP/SSM draft for Proposed Standard|
|Nov 04||Collect RTP/AVPF implementation reports|
|Nov 04||Collect SRTP implementation reports|
|Nov 04||Submit ULP Payload Format for Proposed Standard|
|Nov 04||Submit UXP Payload Format for Proposed Standard|
|Dec 04||Identify payload formats to classify as Historic|
|Dec 04||Submit Framing of RTP for TCP and TLS for Proposed Standard|
|Mar 05||Submit RTP/AVPF for Draft Standard|
|Mar 05||Submit SRTP for Draft Standard|
|RFC1889||PS||RTP: A Transport Protocol for Real-Time Applications|
|RFC1890||PS||RTP Profile for Audio and Video Conferences with Minimal Control|
|RFC2029||PS||RTP Payload Format of Sun's CellB Video Encoding|
|RFC2032||PS||RTP payload format for H.261 video streams|
|RFC2035||PS||RTP Payload Format for JPEG-compressed Video|
|RFC2038||PS||RTP Payload Format for MPEG1/MPEG2 Video|
|RFC2190||PS||RTP Payload Format for H.263 Video Streams|
|RFC2198||PS||RTP Payload for Redundant Audio Data|
|RFC2250||PS||RTP Payload Format for MPEG1/MPEG2 Video|
|RFC2343||E||RTP Payload Format for Bundled MPEG|
|RFC2354||I||Options for Repair of Streaming Media|
|RFC2429||PS||RTP Payload Format for the 1998 Version of ITU-T Rec. H.263 Video (H.263+)|
|RFC2431||PS||RTP Payload Format for BT.656 Video Encoding|
|RFC2435||PS||RTP Payload Format for JPEG-compressed Video|
|RFC2508||PS||Compressing IP/UDP/RTP Headers for Low-Speed Serial Links|
|RFC2733||PS||An RTP Payload Format for Generic Forward Error Correction|
|RFC2736||BCP||Guidelines for Writers of RTP Payload Format Specifications|
|RFC2762||E||Sampling of the Group Membership in RTP|
|RFC2793||PS||RTP Payload for Text Conversation|
|RFC2833||PS||RTP Payload for DTMF Digits, Telephony Tones and Telephony Signals|
|RFC2862||PS||RTP Payload Format for Real-Time Pointers|
|RFC2959||PS||Real-Time Transport Protocol Management Information Base|
|RFC3009||PS||Registration of parityfec MIME types|
|RFC3016||PS||RTP payload format for MPEG-4 Audio/Visual streams|
|RFC3047||PS||RTP Payload Format for ITU-T Recommendation G.722.1|
|RFC3119||PS||A More Loss-Tolerant RTP Payload Format for MP3 Audio|
|RFC3158||I||RTP Testing Strategies|
|RFC3189||PS||RTP Payload Format for DV Format Video|
|RFC3190||PS||RTP Payload Format for 12-bit DAT, 20- and 24-bit Linear Sampled Audio|
|RFC3267||PS||RTP payload format and file storage format for the Adaptive Multi-Rate (AMR) and Adaptive Multi-Rate Wideband (AMR-WB) audio codecs|
|RFC3389||PS||RTP Payload for Comfort Noise|
|RFC3497||PS||RTP Payload Format for Society of Motion Picture and Television Engineers (SMPTE) 292M Video|
|RFC3545||PS||Enhanced Compressed RTP (CRTP) for links with High Delay, Packet Loss and Reordering|
|RFC3550||DS||RTP: A Transport Protocol for Real-Time Applications|
|RFC3551||DS||RTP Profile for Audio and Video Conferences with Minimal Control|
|RFC3555||PS||MIME Type Registration of RTP Payload Formats|
|RFC3556||PS||Session Description Protocol (SDP) Bandwidth Modifiers for RTP Control Protocol (RTCP) Bandwidth|
|RFC3557||PS||RTP Payload Format for European Telecommunications Standards Institute (ETSI) European Standard ES 201 108 Distributed Speech Recognition Encoding|
|RFC3558||PS||RTP Payload Format for Enhanced Variable Rate Codecs (EVRC) and Selectable Mode Vocoders (SMV)|
|RFC3611||Standard||RTP Control Protocol Extended Reports (RTCP XR)|
|RFC3640||Standard||RTP Payload Format for Transport of MPEG-4 Elementary Streams|
|RFC3711||Standard||The Secure Real-time Transport Protocol|
Audio/Video Transport Working Group Minutes
Reported by Colin Perkins from notes by Marshall Eubanks
Introduction and Status Update
The Audio/Video Transport working group met twice at the 61st IETF meeting (Washington DC, 10th and 11th November 2004). Topics under discussion included the RTCP XR MIB, the new RTP Profile for TCP Friendly Rate Control, an RTP header extension for anti-shadow redundancy, and RTP payload formats for telephony tones and events, JPEG 2000, H.224, VMR-WB audio, and several variants of AMR audio. The group also discussed the issue of how the RTP timestamp should be interpreted for audio codecs where the input and output sampling rates are decoupled. The meeting was chaired by Colin Perkins and Magnus Westerlund.
The meeting started with a working group status update presented by the chairs. The group has had no new RFCs published since the previous IETF meeting, but 4 drafts are with the RFC editor awaiting publication. In addition, we have requested publication for 11 more drafts which are currently awaiting review by our area director or the full IESG. The RTCP feedback draft is expected to go to the IESG for review shortly. The chairs briefly outlined the status of the drafts that were not to be discussed in detail, and solicited feedback via the mailing list.
Colin Perkins noted that the group is slightly behind on some of its milestones, and that some of them will need to be updated. The group also needs to update its charter to include new work. The most pressing milestone to which no effort has been allocated is the updated RTP MIB; Alan Clark volunteered to work on this, but solicited help.
RTP Payload Format for MIDI
John Lazzaro couldn't be at the meeting to discuss the MIDI drafts, but sent a slide outlining the status of the work. The draft has received an expert review from Dominique Fober from the MIDI community. This found no significant issues, and we expect the draft will be completed soon. Comments from the group were solicited.
RTCP XR MIB
Alan Clark and Amy Pendleton discussed the RTCP XR MIB. This has been updated since the previous meeting, and is now a working group draft. Recent changes include a section on the relation between the RTCP XR MIB and RAQMON that will incorporate wording suggested by the RAQMON authors, plus a number of technical changes to the MIB to address comments by Magnus Westerlund and Dan Romascanu.
After Alan introduced the work and gave the status update, Amy gave a detailed response to the comments received. Discussion centered around three areas: use of the SSRC as a session identifier, the complexity of the MIB, and the need for new parameters to be signalled.
The issue with using the SSRC as a session identifier is that it is unique only within a session, not necessarily across sessions. There was much discussion, but no real conclusions were drawn: it is clear that using the SSRC isn't sufficient to correlate across sessions, but it's not clear what would be an appropriate session identifier.
Regarding the complexity of the MIB, Dan Romascanu noted that the MIB is large and may not all be implemented by everyone. It might be a good idea to split the MIB, so that implementations can claim conformance to subsets of the MIB with different tables. Jean Francois Mule also expressed interest in such a split.
There was a lot of discussion regarding the calculation of the R factor and how the MOS-LQ is to be calculated. Jean Francois Mule was keen to see information added to the MIB to specify how parameters that can be calculated in various ways were defined. Alan Clark noted that RFC3611 does recommend a minimum algorithm (G.107); however, this is an evolving area, and the state of the art a year ago has since been extended. It would be very useful to be able to specify enhanced methods in the future. It seems clear that something may need to be added to the MIB to address this: Colin Perkins suggested that Jean Francois write up a proposal for changes to send to the mailing list, and that discussion continue there.
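As one illustration of why the definition of a reported value matters, the E-model in ITU-T G.107 gives a standard mapping from an R factor to an estimated MOS. The sketch below implements that mapping only; the function name is hypothetical, and nothing here is taken from the RTCP XR MIB draft, whose metrics may be derived differently.

```python
def r_factor_to_mos(r: float) -> float:
    """Map an E-model R factor to an estimated MOS (ITU-T G.107 mapping)."""
    if r <= 0:
        return 1.0          # floor of the scale
    if r >= 100:
        return 4.5          # ceiling of the scale
    # Cubic mapping used between the two limits.
    return 1.0 + 0.035 * r + r * (r - 60.0) * (100.0 - r) * 7.0e-6


if __name__ == "__main__":
    for r in (50.0, 70.0, 93.2):
        print(f"R = {r:5.1f}  ->  MOS ~ {r_factor_to_mos(r):.2f}")
```

Two implementations that report a MOS value but start from differently computed R factors would not be directly comparable, which is the concern raised in the discussion.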
Open Issues in RFC 2833bis
Tom Taylor discussed the open issues in RFC 2833bis, starting with a review of the changes in the -04 and -05 drafts. There are a lot of changes (listed in the -05 draft), a number of which cause backwards compatibility issues: compliance no longer requires support of DTMF events, a number of trunk events have been discarded and alternative means of signalling them defined, and event codes 138 and 139 have been reassigned. Feedback was solicited on these changes, and Jean Francois Mule expressed concern over backwards compatibility of the signalling for DTMF, suggesting that the default be to assume DTMF is supported unless signalled otherwise. Jean Francois and Flemming Andreasen noted that DTMF is widely supported, but other events are not. Colin Perkins suggested splitting the draft, progressing the DTMF parts quickly. The idea of a split was well received.
The E bit is usually set to indicate the end of an event, but there is an exception for events defined as states: the E bit doesn't have to be set when moving between mutually exclusive states. This rarely happens, but complicates the code and has little real benefit. The question was raised whether this is worth the complication. Colin Perkins suggested removing this feature unless there were objections. No one spoke up for the continued presence of this feature.
There are a number of tone events listed that are not well defined. Tom suggested removing these. Stephan Wenger and others supported their removal.
Tom noted that the draft is very long, and it's not clear that all the events are needed. The decision to split out DTMF into a separate draft will help with this problem, but we may also consider other events for removal.
Tom asked if we should establish an IANA registry for the code points, or if it should be done when the draft was published as an RFC. Colin Perkins suggested this should go into the IANA considerations section now, rather than waiting for publication.
RTP Payload Format for JPEG 2000
Andrew Leung discussed the RTP payload format for JPEG2000. Since the last meeting, an IPR statement has been published listing the patent applications that cover this draft, the non-intelligent mode has been removed, the MIME types and SDP parameters have been improved, and the payload header has been slightly reordered.
After off-line discussion with the chairs at the meeting, it has been agreed to split this draft into two parts. The patent applications do not apply to the basic protocol, but to various enhanced features. It looks to be possible to split the draft into an unencumbered base and an encumbered extension draft. This will be done by the next meeting. Stephan Wenger applauded this split, and encouraged other companies with encumbered payload formats to consider similar moves.
The non-intelligent packetization has been removed. There is only one branch in the code, making better use of the payload header and using the Main Header Flag Field.
Regarding the MIME types, Colin Perkins noted that the uncompressed video draft has similar parameters to identify the colour space. It would be good if the two drafts could be made consistent.
Revised drafts addressing these issues should be available by the end of the year.
Far End Camera Control/H.224
Roni Even discussed the MIME type registration for the RTP payload format for H.224. ITU has defined, amongst other things, an approach to far end camera control in H.281 and H.224. H.323 annex Q defines how to transport this within RTP; however, it does not register an appropriate MIME subtype. This draft registers the "application/h224" MIME subtype, referencing H.323 annex Q, allowing SIP applications to interwork with existing H.323 systems that use the format.
Roni asked for feedback. Steve Casner noted that it looked fine in his view. Alan Johnston (the XCON co-chair) agreed, and noted that far end camera control is out of scope for XCON. Colin Perkins concluded by accepting this as an AVT working group draft, unless there were objections via the mailing list.
RTP Payload for Anti-Shadow Redundancy
Qiaobing Xie discussed a new draft on anti-shadow redundancy for RTP. There may be IPR issues with Motorola: the IETF has been notified and an IPR statement will be forthcoming.
This draft attempts to address the problem of shadowing in mobile radio communications, where blockage on signals can range from fractions of a second to a couple of minutes, and where traditional RTP-style FEC is not effective. The draft defines an RTP header extension, where the sender generates a base media stream and a forward-shifted anti-shadow stream sent at the same time. The amount of forward shift determines the maximum length of blockage that can be recovered without service interruption, and the amount of data that must be received before the first interruption. The anti-shadow stream may be lower quality than the original.
There was a lot of discussion, with Lorenzo Vicisano, Joerg Ott, Steve Casner, Stephan Wenger, Colin Perkins, Magnus Westerlund, Anders Klemets and Marshall Eubanks noting that this is a simple but inefficient form of FEC, which has a long buffering delay and problems with startup, is not likely to work well with single packet losses, and is difficult to deploy when the lower-quality encoding has different timing properties to the primary. It was noted that the RMT working group, and others, have effective FEC schemes that can solve this problem. It was also noted that it may be more appropriate to solve this at the link layer, or via layer/parallel transmission.
Colin Perkins concluded by stating that there is clear pushback against this draft as it stands. If there is a desire to proceed with the work, the authors need to define the problem and explain why it's something we need to solve within AVT, before working on solutions.
RTP Jitter Calculation
At the end of the first session, Steve Casner noted that there has been recent discussion on the list about jitter calculation. This is defined very specifically and a particular algorithm is required so calculations are comparable both over time and between streams, if not necessarily between, say, a video and an audio codec. People should pay more attention to the receiver reports section of RFC 3550, which describes how to make use of this.
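For reference, the interarrival jitter estimator from RFC 3550 (section 6.4.1) can be sketched as below. The class and method names are hypothetical, arrival times are assumed to be already expressed in RTP timestamp units, and 32-bit timestamp wrap-around is deliberately ignored to keep the example short.

```python
class JitterEstimator:
    """Running interarrival jitter estimate, J += (|D| - J) / 16."""

    def __init__(self) -> None:
        self.jitter = 0.0
        self._prev_transit = None  # transit time of the previous packet

    def update(self, arrival_ts: int, rtp_ts: int) -> float:
        # Relative transit time: arrival time minus RTP timestamp,
        # both in the same timestamp units.
        transit = arrival_ts - rtp_ts
        if self._prev_transit is not None:
            d = abs(transit - self._prev_transit)
            # Fixed gain of 1/16, so that reports from different
            # implementations remain comparable.
            self.jitter += (d - self.jitter) / 16.0
        self._prev_transit = transit
        return self.jitter


if __name__ == "__main__":
    est = JitterEstimator()
    # (arrival, rtp timestamp) pairs for an 8 kHz stream with 20 ms packets
    for arrival, ts in [(1000, 0), (1160, 160), (1400, 320)]:
        print(est.update(arrival, ts))
```

Because the gain and the difference calculation are fixed, two receivers observing the same packet arrivals produce the same value, which is the comparability property noted above.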
RTP Profile for TCP friendly rate control
The second session started with a presentation of the RTP Profile for TFRC (RTP/AVPCC) by Ladan Gharai. After summarising the operation of the profile, Ladan discussed the changes since the -02 draft. The RTP header previously included a 16 bit quad-RTT counter as a profile specific extension. This has been changed to a 32 bit send timestamp with a 32 bit RTT sent only when the RTT estimate changes significantly. The presence of the RTT is indicated by an "R" bit, taken from the 7 bit payload type field (as an extra marker bit, which is a profile specific change to the RTP header). The RTP/AVPCC profile uses dynamic payload types only, and does not share the static payload type space with RTP/AVP. Joerg Ott noted that the presence of the R and M bits could cause the RTP payload type to alias the RTCP payload type values - could there be confusion? Colin Perkins noted that standard RTP has the same issue, and reserves some payload types. This should also be done for RTP/AVPCC.
RTCP RR packets are extended with 4 32-bit fields, describing the timestamp of the last data packet received (t_i), the delay between receipt of the last data packet and the generation of this feedback packet (t_delay), the rate at which the receiver estimates that data was received since the last feedback report was sent (x_recv), and the loss event rate (p). Colin Perkins asked whether the p and t_delay fields could be reduced in size. Possibly, but it may also be possible to use the data in sender reports to deduce the RTT in interactive sessions, saving more bandwidth.
Another open issue is the RTCP minimum timing interval. TFRC says to send feedback once per RTT or once per packet, whichever is longer. However, for flows with a small round-trip time and data rate, this can require sending feedback packets more often than allowed by the RTCP timing rules, consuming a large amount of bandwidth. The question was raised whether the RTCP timing rules can be relaxed in these cases. Steve Casner agreed that they probably could be, since this is specifically for unicast and so will not have flash joins or leaves; however, this does not change the excessive bandwidth usage for the feedback in these cases. An alternative might be to send less than one packet of feedback per round-trip time; this is a regime where TFRC hasn't been tested and, as noted by Tom Phelan, it is not clear that TFRC is TCP-friendly. More research will be needed to determine if TFRC can operate well at that rate.
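The aliasing concern can be made concrete with a short sketch. In standard RTP the second header octet holds the marker bit followed by a 7-bit payload type, and RTCP packet types 200 to 204 occupy the same octet position. The AVPCC layout used below (marker bit, then the R bit, then a 6-bit payload type) is an assumption made for illustration, not the layout defined in the draft; all function names are hypothetical.

```python
RTCP_PACKET_TYPES = range(200, 205)  # SR, RR, SDES, BYE, APP


def rtp_second_octet(marker: int, payload_type: int) -> int:
    """Standard RTP: 1-bit marker followed by a 7-bit payload type."""
    return (marker << 7) | (payload_type & 0x7F)


def avpcc_second_octet(marker: int, r_bit: int, payload_type: int) -> int:
    """ASSUMED layout: marker bit, R bit, then a 6-bit payload type."""
    return (marker << 7) | (r_bit << 6) | (payload_type & 0x3F)


def aliasing_payload_types_rtp() -> list:
    # Payload types whose second octet can look like an RTCP packet type.
    return sorted({pt for m in (0, 1) for pt in range(128)
                   if rtp_second_octet(m, pt) in RTCP_PACKET_TYPES})


def aliasing_payload_types_avpcc() -> list:
    # Same check under the assumed AVPCC layout.
    return sorted({pt for m in (0, 1) for r in (0, 1) for pt in range(64)
                   if avpcc_second_octet(m, r, pt) in RTCP_PACKET_TYPES})


if __name__ == "__main__":
    print("standard RTP:", aliasing_payload_types_rtp())
    print("assumed AVPCC layout:", aliasing_payload_types_avpcc())
```

For standard RTP the colliding values are 72 to 76, which is the range reserved for this reason. Under the assumed layout a different set of values collides, so an equivalent reservation would have to be worked out against the actual header definition.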
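The scale of the problem can be estimated with simple arithmetic. The sketch below applies the "once per RTT or once per packet, whichever is longer" rule and reports the resulting feedback bit rate; the function name, the 80-byte feedback packet size and the example flow parameters are all assumptions chosen for illustration.

```python
def tfrc_feedback_bitrate(rtt_s: float, packet_interval_s: float,
                          feedback_packet_bytes: int) -> float:
    """Feedback bit rate when one report is sent per max(RTT, packet interval)."""
    interval = max(rtt_s, packet_interval_s)
    return feedback_packet_bytes * 8 / interval


if __name__ == "__main__":
    media_bps = 64_000          # assumed low-rate audio flow
    rtt = 0.010                 # assumed 10 ms round-trip time
    packet_interval = 0.020     # assumed one media packet every 20 ms
    feedback_bytes = 80         # assumed size of one feedback packet

    fb = tfrc_feedback_bitrate(rtt, packet_interval, feedback_bytes)
    print(f"feedback: {fb:.0f} bit/s "
          f"({100 * fb / media_bps:.0f}% of the media rate)")
```

With these assumed numbers the feedback stream is about half the size of the media stream, far above the small fraction of session bandwidth that RTCP is normally expected to use.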
The draft adopts the SRTP specification (RFC 3711) for security. There can be a NULL security operation, if there is no desire for privacy. It was pointed out that SRTP requires RTCP packets to be authenticated, which may not be desirable. Anders Klemets expressed concern about this additional requirement, wanting secure RTP/AVPCC to be a separate profile to standard RTP/AVPCC. The sense of the room was to agree with Anders, rather than forcing authentication on all implementations.
Finally, Ladan asked if the group thought it appropriate to include more introductory text on TFRC into the draft. Colin replied that this never hurts, but there are perhaps more pressing design issues to resolve with the draft.
RTP Timestamps for Variable Frequency Codecs
An issue which has generated a lot of traffic on the mailing list is how to handle the RTP timestamp for audio codecs that use different input and output sampling rates. Colin Perkins summarised the issues and presented options on how to proceed.
The RTP timestamp reflects the sampling instant of the first octet in the RTP data packet, and MUST be derived from a clock that increments monotonically and linearly in time [RFC 3550, section 5.1]. This means that we cannot vary the RTP timestamp rate within a session.
There are different conventions for use of the timestamp between audio and video formats. For audio, the RTP clock rate used for generating the RTP timestamp is independent of the number of channels and the encoding; it usually equals the number of sampling periods per second [RFC 3551, section 4.1]. There are two exceptions to this: G.722 uses an 8kHz RTP timestamp clock for a 16kHz sampling rate codec, for compatibility with a mistake in RFC 1890; and MPEG audio uses a 90kHz clock for compatibility with other MPEG systems [RFC 3551 sections 4.5.2 and 4.5.13, RFC 3119]. The implication is that using an RTP clock rate other than the sampling rate is allowed, but one must consider how it will affect the overall system design.
For audio, the convention is that the sampling frequency SHOULD be drawn from the set 8000, 11025, 16000, 22050, 24000, 32000, 44100 and 48000 Hz. However, most audio encodings are defined for a more restricted set of sampling frequencies [RFC 3551, section 4.1].
Two new audio formats violate these conventions. VMR-WB accepts 8kHz or 16kHz input and produces 16kHz output by default, irrespective of the input sampling rate. It is desired to use a fixed 16kHz clock for the VMR-WB payload format. AMR-WB+ can accept a range of input sampling rates, re-samples within the codec to one of 12 internal sampling frequencies, and can produce output at one of 8, 16, 24, 32 or 48kHz. The draft RTP payload format for AMR-WB+ uses a fixed 72kHz clock. The key points for both new codecs are that the input and output sampling rates are decoupled and that the decoder is agnostic of the input sampling rate (and that the rate can vary within a session).
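The effect of decoupling the RTP clock from the sampling rate can be seen by computing timestamp increments directly. The helper below is a hypothetical sketch: it converts a block of audio samples at a given sampling rate into ticks of a given RTP clock, using exact fractions so that non-integer results are visible. The 20 ms block duration is an assumption used for illustration.

```python
from fractions import Fraction


def ticks_for_samples(num_samples: int, sampling_hz: int,
                      rtp_clock_hz: int) -> Fraction:
    """RTP timestamp increment covering num_samples of audio."""
    duration_s = Fraction(num_samples, sampling_hz)
    return duration_s * rtp_clock_hz


if __name__ == "__main__":
    # Usual case: clock rate equals sampling rate, 20 ms at 8 kHz.
    print(ticks_for_samples(160, 8000, 8000))     # 160
    # G.722: 16 kHz sampling, 8 kHz RTP clock.
    print(ticks_for_samples(320, 16000, 8000))    # 160
    # Fixed 16 kHz clock with 8 kHz or 16 kHz input, 20 ms each.
    print(ticks_for_samples(160, 8000, 16000))    # 320
    print(ticks_for_samples(320, 16000, 16000))   # 320
```

With a fixed clock, the timestamp increment depends only on the duration of the block and not on the input sampling rate, which is why the decoder can remain agnostic of that rate.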
These codecs raise two issues: 1) the number of sampling periods per second varies in different parts of the system, so the usual definition of the RTP clock rate for audio is not sufficient; and 2) is it necessary to separately signal the input and output rates, and, if so, how? Colin suggested that consistency across codecs is desirable, to simplify the protocol and implementations that support multiple codecs.
There are four options going forward:
1) Only support codecs with one sample rate. This makes for easy backwards compatibility, but won't support certain codecs.
2) Mandate a common clock rate for all RTP codecs as described in draft-ietf-avt-variable-rate-audio-00.txt. Stephan Wenger gave an outline of this (the "uRTR" approach).
3) Allow each codec to specify its own definition for the RTP timestamp and signaling.
4) Do something else?
Finally, the question was raised whether we need a set of guidelines for developers of new audio codecs that might need transport in RTP.
After presenting option 2, Stephan noted that his experience is that once Pandora's box is opened, the issue does not go away. We'll see this with more and more codecs in the future - probably most future audio codecs. But this is good, as the ability to change sample rates means that bandwidth requirements will decline and efficiency will increase.
Stephan noted several issues with option 2: Should this apply to all future audio codecs, or just the variable clock frequency ones? What about codecs that don't fit the subset sampling rates (e.g. Magnus Westerlund noted that DVD audio uses an 88.2kHz sampling rate)? How do we cope with non-sample-exact mixing and synchronisation?
Qiaobing Xie would support the uRTR, but wondered what would happen if we pick not one but two uRTRs. Is that so bad? Colin Perkins wondered what we would gain from using two uRTRs. Stephan Casner replied that we'd gain exactness in calculations during synchronisation.
Magnus Westerlund noted that the problem he sees with uRTR is with codecs that don't fit the rate. You will wind up with frames with fractional lengths. That is a real complication. This will cause error. There was a lot of discussion of the details of this, between Magnus and Stephan.
Steve Casner asked what strong gain comes from mandating one number. We know that there are some downsides; what is the gain? Colin replied that the strong gain arises if the uRTR rate is an integer multiple of all possible sampling rates. Doubt was expressed that this is possible for a realistic common rate.
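The doubt about a realistic common rate can be checked numerically. The sketch below computes the smallest clock rate that is an integer multiple of every sampling frequency in the RFC 3551 convention set, and how quickly a 32-bit RTP timestamp would wrap at that rate. The function names are hypothetical, and the rate set covers only the conventional frequencies, not every rate a codec might use.

```python
from functools import reduce
from math import gcd

# Conventional audio sampling frequencies from RFC 3551, section 4.1.
RFC3551_RATES = [8000, 11025, 16000, 22050, 24000, 32000, 44100, 48000]


def lcm(a: int, b: int) -> int:
    return a // gcd(a, b) * b


def common_clock_rate(rates) -> int:
    """Smallest rate that is an integer multiple of every rate given."""
    return reduce(lcm, rates)


def wrap_seconds(clock_hz: int) -> float:
    """Time until a 32-bit timestamp wraps at the given clock rate."""
    return 2 ** 32 / clock_hz


if __name__ == "__main__":
    rate = common_clock_rate(RFC3551_RATES)
    print(f"common rate: {rate} Hz")
    print(f"32-bit timestamp wraps after {wrap_seconds(rate):.1f} s")
```

Mixing the 8 kHz family with the 11.025 kHz family pushes the common multiple above 14 MHz, at which a 32-bit timestamp wraps in roughly five minutes. Any lower common rate leaves some codecs with fractional frame lengths, as Magnus noted.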
The group took a "hum" on the preferred outcome. Option 3 got the majority. Options 1 and 4 got nothing, option 2 very little. Colin Perkins wrapped up the session by stating that it was clear option 3 was the consensus, and we just have to accept the complexity it causes for implementations.
RTP Payload Format and File Format for VMR-WB
Sassan Ahmadi presented the open issues with the VMR-WB payload format, now that the previous discussion has given the go-ahead to its use of a fixed 16kHz timestamp.
Since the last meeting, the draft has been split into two: one for the payload format, one for the file format. Currently there is one MIME subtype with four magic numbers for the file format, but it has been suggested that four MIME subtypes should be defined with appropriate headers and parameters (to match typical MIME usage). Colin Perkins also noted that the AMR file format was originally defined in a very similar manner to this VMR-WB file format, but that 3GPP is moving away from this towards using an ISO-based file format. It may be appropriate to consider using an ISO-based file format for VMR-WB+, defined within 3GPP2, rather than repeating a design choice that didn't work well for 3GPP. There are concerns that this is not an appropriate file format, based on experience with AMR, and the group would like 3GPP2 to look at this again and think about what their requirements are. Once they do that, we can move forward.
RTP Payload Format for AMR and AMR-WB
Magnus Westerlund presented an update to RFC 3267, intended to address its shortcomings and to move the AMR and AMR-WB payload formats to draft standard. In addition to a number of clarifications, the big change in this draft is a new offer/answer section.
After private discussion during the week, the authors have decided that this new offer/answer section needs revision: there are issues with the gateway-related parameters (mode-set, mode-change-period, and mode-change-neighbour) which may result in a new parameter (mode-change-capabilities) being defined. In addition, the mode-change-neighbour parameter may be changed to be only a hint, rather than a strict requirement on possible mode changes (as changing between arbitrary modes will usually work). A new draft will be produced to include these changes before the end of the year.
Magnus noted that Ericsson and Nokia have performed tests of about 50% of the various parameter combinations. However, help with conducting the remainder of the interoperability tests was solicited.
RTP Payload Format for AMR-WB+
Magnus Westerlund also presented the AMR-WB+ payload format. He noted that Nokia believes there is an unpublished Nokia patent application that may be relevant to this draft (an IPR statement is available on the IETF website). AMR-WB+ is one of the recommended codecs for 3GPP PSS and MMS, and the technical specifications of the codec are ready and published now. It is also likely to become a mandatory codec in 3GPP MBMS.
The payload format as written has a problem with overhead: its flexibility comes at an unacceptable price. It turns out that this flexibility isn't always needed, since the internal sampling frequency and frame type are rarely switched. By assuming that these will be constant for certain frame groupings, it's possible to reduce the overhead significantly. An outline of a new proposal that makes this change was presented (see slides). A revised version of the draft, including this proposal, will be published shortly. Steve Casner expressed approval for the new proposal, noting that this is a change in the direction that he generally favours.