2.8.1 Audio/Video Transport (avt)

In addition to this official charter maintained by the IETF Secretariat, there is additional information about this working group on the Web at:

       http://www.cs.columbia.edu/~hgs/rtp/faq.html -- RTP FAQ Page
NOTE: This charter is a snapshot of the 56th IETF Meeting in San Francisco, California USA. It may now be out-of-date.

Last Modified: 2003-01-21

Chair(s):
Stephen Casner <casner@acm.org>
Colin Perkins <csp@isi.edu>
Transport Area Director(s):
Scott Bradner <sob@harvard.edu>
Allison Mankin <mankin@psg.com>
Transport Area Advisor:
Allison Mankin <mankin@psg.com>
Mailing Lists:
General Discussion: avt@ietf.org
To Subscribe: avt-request@ietf.org
Archive: ftp://ftp.ietf.org/ietf-mail-archive/avt
Description of Working Group:
The Audio/Video Transport Working Group was formed to specify a protocol for real-time transmission of audio and video over UDP and IP multicast. This is the Real-time Transport Protocol, RTP, together with its associated profile for audio/video conferences and payload format documents.

The current goals of the working group are to revise the main RTP specification and the RTP profile ready for advancement to draft standard stage (including the sampling algorithms for use with very large groups, which have been broken out into a separate document), to complete the RTP MIB, to produce a guidelines document for future developers of payload formats and to continue development of new payload formats.

The payload formats currently under discussion include a number of media specific formats (MPEG-4, DTMF, PureVoice) and FEC techniques applicable to multiple formats (parity FEC, Reed-Solomon coding).

Archive before July 2001: ftp://ftp.es.net/pub/mail-archive/rem-conf/

Goals and Milestones:
Done  Working group last call on parity FEC draft (standards track)
Done  Post revised RTP MIB and issue working group last call (stds track)
Done  Working group last call on guidelines for payload format writers (BCP)
Done  Post revised RTP spec and audio/video profile
Done  Post revised DTMF payload format draft, ready for WG last call
Done  Post RTP implementation checklist draft
Done  Post payload format for MPEG-4 based on MPEG/IETF joint meetings
Done  Post revised RTP membership (SSRC) sampling draft
Done  Post revised draft on PureVoice (qcelp) payload format to address WG last call comments
Done  Submit RTP MIB to IESG for publication as Proposed Standard RFC
Done  Submit guidelines for payload format writers for publication as a BCP
Done  New working group last call on PureVoice payload format
Done  Working group last call on revised SSRC sampling draft (experimental)
Done  Analysis/simulation of multiplexing payload format proposals
Done  Post final revision of RTP spec and A/V profile drafts
Done  Revise MPEG-4 payload format document after implementation experience
Done  Decide how to proceed with multiplexing protocol: one generic payload format or a number of application specific formats
Done  Working group last call on RTP and A/V profile (for Draft Standard)
Done  Prepare MPEG4 implementation results ready for WG last call
Done  Post final revisions of selected multiplexing protocol draft(s)
Done  Working group last call on multiplexing payload format (stds track)
Internet-Drafts:
    draft-ietf-avt-profile-new-13.txt
    draft-ietf-avt-rtp-new-12.txt
    draft-ietf-avt-rtp-mime-06.txt
    draft-ietf-avt-rtcp-bw-05.txt
    draft-ietf-avt-tcrtp-07.txt
    draft-ietf-avt-smpte292-video-08.txt
    draft-ietf-avt-crtp-enhance-07.txt
    draft-ietf-avt-ulp-07.txt
    draft-ietf-avt-rtp-selret-05.txt
    draft-ietf-avt-uxp-05.txt
    draft-ietf-avt-srtp-05.txt
    draft-ietf-avt-rtcp-feedback-05.txt
    draft-ietf-avt-mpeg4-simple-07.txt
    draft-ietf-avt-dsr-05.txt
    draft-ietf-avt-evrc-smv-03.txt
    draft-ietf-avt-mwpp-midi-rtp-06.txt
    draft-ietf-avt-rtcpssm-03.txt
    draft-ietf-avt-rtp-retransmission-06.txt
    draft-ietf-avt-rtp-interleave-00.txt
    draft-ietf-avt-rtp-jpeg2000-02.txt
    draft-ietf-avt-rfc2833bis-02.txt
    draft-ietf-avt-rtp-ac3-00.txt
    draft-ietf-avt-ilbc-codec-01.txt
    draft-ietf-avt-rtcp-report-extns-03.txt
    draft-ietf-avt-uncomp-video-02.txt
    draft-ietf-avt-rfc3119bis-01.txt
    draft-ietf-avt-rtp-h264-01.txt
    draft-ietf-avt-rtp-ilbc-01.txt
Request For Comments:
    RFC1889 PS RTP: A Transport Protocol for Real-Time Applications
    RFC1890 PS RTP Profile for Audio and Video Conferences with Minimal Control
    RFC2035 PS RTP Payload Format for JPEG-compressed Video
    RFC2032 PS RTP payload format for H.261 video streams
    RFC2038 PS RTP Payload Format for MPEG1/MPEG2 Video
    RFC2029 PS RTP Payload Format of Sun's CellB Video Encoding
    RFC2190 PS RTP Payload Format for H.263 Video Streams
    RFC2198 PS RTP Payload for Redundant Audio Data
    RFC2250 PS RTP Payload Format for MPEG1/MPEG2 Video
    RFC2343 E RTP Payload Format for Bundled MPEG
    RFC2354 I Options for Repair of Streaming Media
    RFC2429 PS RTP Payload Format for the 1998 Version of ITU-T Rec. H.263 Video (H.263+)
    RFC2431 PS RTP Payload Format for BT.656 Video Encoding
    RFC2435 PS RTP Payload Format for JPEG-compressed Video
    RFC2508 PS Compressing IP/UDP/RTP Headers for Low-Speed Serial Links
    RFC2733 PS An RTP Payload Format for Generic Forward Error Correction
    RFC2736 BCP Guidelines for Writers of RTP Payload Format Specifications
    RFC2762 E Sampling of the Group Membership in RTP
    RFC2793 PS RTP Payload for Text Conversation
    RFC2833 PS RTP Payload for DTMF Digits, Telephony Tones and Telephony Signals
    RFC2862 PS RTP Payload Format for Real-Time Pointers
    RFC2959 PS Real-Time Transport Protocol Management Information Base
    RFC3009 PS Registration of parityfec MIME types
    RFC3016 PS RTP payload format for MPEG-4 Audio/Visual streams
    RFC3047 PS RTP Payload Format for ITU-T Recommendation G.722.1
    RFC3119 PS A More Loss-Tolerant RTP Payload Format for MP3 Audio
    RFC3158 I RTP Testing Strategies
    RFC3189 PS RTP Payload Format for DV Format Video
    RFC3190 PS RTP Payload Format for 12-bit DAT, 20- and 24-bit Linear Sampled Audio
    RFC3267 PS RTP payload format and file storage format for the Adaptive Multi-Rate (AMR) and Adaptive Multi-Rate Wideband (AMR-WB) audio codecs
    RFC3389 PS RTP Payload for Comfort Noise

    Current Meeting Report

    Audio/Video Transport Working Group Minutes
    Reported by Stephen Casner and Colin Perkins
       The avt working group met in two sessions at the 56th IETF meeting in San 
    Francisco. In the first session, the group discussed the status of 
    several documents in progress including the profile for Secure RTP, plus RTP 
    framing over TCP, multiple RTCP extensions, and two video payload 
    formats.  In the second session, the discussion covered seven payload 
    formats related to audio (including MIDI and distributed speech 
    Introduction, Document Status, and Open Issues
       This meeting began with an update by Steve Casner on document 
    publication status.  Two momentous steps were achieved in the days 
    preceding the meeting:  The revised RTP specification and A/V profile 
    (revisions of RFC 1889 and 1890) were approved by the IESG for 
    publication as Draft Standards, and the payload format for MPEG-4 was 
    submitted by the working group to the IESG with a request for 
    publication as a Proposed Standard.  The RTP spec and A/V profile had been 
    tentatively approved before the previous meeting, but there were several 
    "RFC Editor note" clarifications requested by the IESG.  In addition, a 
    number of questions regarding the text had been posted to the WG 
    mailing list during the IESG review.  Consequently, Steve Casner wrote 
    clarifications to address these questions and WG participants 
    discussed them on the mailing list after the previous meeting.  
    Ultimately, there were enough changes that the IESG agreed that the 
    drafts could be revised before submission to the RFC editor: 
    draft-ietf-avt-rtp-new-12 and 
    draft-ietf-avt-profile-new-13.  The IESG gave final approval of these 
    revised drafts and sent them to the RFC Editor.  Steve briefly reviewed the 
    list of changes at this meeting and asked those who made comments in the 
    mailing list discussion to verify that the results are 
       We continued the pattern of publishing at least one RFC since the last 
    meeting (RFC 3497 on the payload format for SMPTE-292M video), and four 
    other drafts are in the RFC editor queue (the MIME registration for the 
    payload formats in the RTP profile and the SDP bandwidth modifiers for RTCP 
    bandwidth, both waiting for the RTP specification, and the payload 
    formats for EVRC/SMV and ETSI ES 201 108 DSR).  Four drafts are with the 
    IESG (ECRTP which is in Last Call, TCRTP, SRTP, and MPEG-4).
       Several drafts are either in WG last call or may be ready.  The draft on 
    RTCP Feedback profile AVPF 
    (draft-ietf-avt-rtcp-feedback-05) has been in last call for some time and is 
    awaiting a final revision.  The companion draft which reports on 
    simulations to validate the protocol design has been revised 
    (draft-burmeister-avt-rtcp-feedback-sim-01) and will be published as 
    Informational at the same time.  The RTP Retransmission draft, which uses 
    the AVPF, has been revised based on WG comments and is ready for WG last 
    call when the AVPF draft is updated.  Two drafts on Uneven Level 
    Protection and Unequal Erasure Protection are being considered together as 
    alternative methods; a new approach is proposed for ULP to revise RFC 2733 to 
    fix its non-standard use of RTP header bits, but this was not 
    completed before the meeting.  Lastly, 
    draft-ietf-avt-rtcp-report-extns-03 on RTCP Extended Reports (XR) was 
    discussed at this meeting to consider whether it is ready for last call.
    Secure Real-time Transport Protocol
       Mark Baugher reported that SRTP 
    (draft-ietf-avt-srtp-05) was submitted to the IESG in mid-2002 and IETF 
    Last Call was issued, but the Security and Transport Area Directors 
    expressed concerns regarding the interaction between the encryption 
    ciphers and authentication.  Those concerns were discussed at the 
    previous IETF AVT meeting and subsequently with the ADs.  Mark 
    summarized the agreed modifications: authentication using an 80-bit MAC is 
    mandated for SRTCP and the default for SRTP, but the user may choose null 
    authentication for SRTP.  Precise language will be crafted for the draft 
    about when this is acceptable and to make clear the risks.  
    Discussion of error tolerance will be removed from the draft, even though it 
    was one of the original design requirements, because it has become a point of 
    controversy due to its dependence on changes at several other layers of the 
    protocol stack.  Other considerations are sufficient to justify the 
    design decisions.  Automatic key management is mandated to ensure that the 
    key streams are not repeated in order to avoid a serious failure case with 
    the default counter mode cipher.  The authors also requested 
    permission to make some small changes/clarifications reflecting recent 
    feedback from implementers; these will be sent in a separate note to the 
    list.  Mark stated the authors' intention to produce a new draft in a 
    couple of weeks.  Steve Casner reported that our AD Allison Mankin had 
    mentioned in a side conversation at this meeting that she wants this draft to 
    go ahead quickly.
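       The key stream concern can be illustrated with a minimal sketch.  It 
    does not implement SRTP or its counter mode cipher: toy_keystream is a 
    hypothetical stand-in built from SHA-256, and the key, counter, and 
    payloads are made-up values.  It only shows why a repeated key stream is a 
    serious failure: the XOR of two ciphertexts then equals the XOR of the 
    two plaintexts, with the key stream cancelled out.

```python
import hashlib


def toy_keystream(key: bytes, counter: int, length: int) -> bytes:
    """Hypothetical stand-in for a counter-mode key stream (NOT the SRTP cipher)."""
    out = b""
    block = 0
    while len(out) < length:
        material = key + counter.to_bytes(8, "big") + block.to_bytes(4, "big")
        out += hashlib.sha256(material).digest()
        block += 1
    return out[:length]


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


key = b"session-key-demo"          # assumed demo key
p1 = b"first media payload "       # two equal-length plaintexts
p2 = b"second media payload"

# Failure case: the same key stream is used for both packets.
ks = toy_keystream(key, counter=7, length=len(p1))
c1 = xor_bytes(p1, ks)
c2 = xor_bytes(p2, ks)

# An eavesdropper who XORs the ciphertexts recovers p1 XOR p2 without the key.
leak = xor_bytes(c1, c2)
assert leak == xor_bytes(p1, p2)
```

    Automatic key management avoids this case by ensuring that the same key 
    and counter combination is never used for two different packets.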
    Framing RTP over Connection-Oriented Transport
       John Lazzaro presented 
    draft-lazzaro-avt-rtp-framing-contrans-00, a new draft that restates the 
    description of RTP framing over TCP that was removed when the RTP/AVP 
    Profile (RFC 1890) was revised since no report of 
    interoperability of the feature was obtained.  It is now claimed that 
    there are implementations, so it would be appropriate to have it 
    documented.  As this draft was prepared, two related problems became 
    apparent:  1) the MMUSIC comedia draft specifies SDP session 
    descriptions using TCP transport, but does not specify a format 
    parameter to indicate RTP/AVP inside the TCP; and 2) RTSP (RFC 2326) 
    specifies a single TCP connection carrying interleaved control and data, but 
    not separate data streams carried in TCP.  This draft includes sections to 
    address these last two topics, but it is an open issue for the WG 
    whether these should be kept or abandoned.
       As an alternative to this "classical" framing of RTP in TCP using a 
    16-bit frame length field between frames, some people have suggested on the 
    mailing list that the multiplexed framing specified in RTSP be used 
    instead.  John asked for input from the group whether one or the other (or 
    both or neither) approach should be followed.  Steve Casner noted that the 
    TCP framing was removed in the profile revision due to lack of interop, and 
    asked whether there really is sufficient interest now to warrant 
    publishing a spec.  John replied that this is needed in MIDI over RTP (see 
    MWPP, later) because much of the user community is uncomfortable with UDP.  
    Ross Finlayson spoke in favor of using the RTSP framing which is 
    implemented in several RTSP servers including Apple's and his own 
    (Live.com).  Henning Schulzrinne countered that it is only really 
    applicable if there's a control protocol in the same stream, and 
    suggested that we just specify the "classical" framing now and leave for the 
    future a specification of how RTSP framing can be used separately from RTSP 
    (requiring a different format parameter in SDP to indicate that 
    framing).  One reason is that RTSP is undergoing update now.  Ross 
    suggested that specifying a default "channel ID" would suffice for using 
    RTSP framing without a control protocol, but others were concerned that 
    there would be a desire to use multiplexing with other control 
    protocols where the syntax of the RTSP framing would conflict, so it is not 
    general.  Colin Perkins summarized the discussion by saying that this 
    draft should restrict itself to the classical framing and leave the RTSP 
    framing to be specified somewhere else if there is interest or dropped.  The 
    SDP format parameter "TCP RTP/AVP" would indicate this simple case; other 
    parameters could be defined for multiplexed framing as needed.
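       The "classical" framing discussed above can be sketched as follows.  
    This is a minimal illustration, assuming the 16-bit length field is in 
    network byte order and counts only the bytes of the packet that follows 
    it; the function names are hypothetical and not taken from the draft.

```python
import struct


def frame_packet(packet: bytes) -> bytes:
    """Prefix one RTP or RTCP packet with a 16-bit length (assumed big-endian)."""
    if len(packet) > 0xFFFF:
        raise ValueError("packet too large for a 16-bit length field")
    return struct.pack("!H", len(packet)) + packet


def deframe(stream: bytes):
    """Split a received byte stream into complete packets.

    Returns (packets, remainder); the remainder holds any trailing bytes that
    do not yet form a complete frame and should be kept until more data arrives.
    """
    packets = []
    offset = 0
    while len(stream) - offset >= 2:
        (length,) = struct.unpack_from("!H", stream, offset)
        if len(stream) - offset - 2 < length:
            break  # frame body not fully received yet
        start = offset + 2
        packets.append(stream[start:start + length])
        offset = start + length
    return packets, stream[offset:]
```

    Because TCP delivers a byte stream without packet boundaries, the 
    receiver must buffer the remainder and call deframe again once more 
    data has been read from the connection.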
       Regarding the other problem of specifying how data can be carried in 
    separate TCP streams under RTSP, John concluded and Magnus Westerlund 
    agreed that this should be left to the revised RTP specification.
    RTCP Extended Reports (XR)
       Timur Friedman reviewed the changes to the RTCP reporting 
    extensions in 
    draft-ietf-avt-rtcp-report-extns-02 with minor additional updates in -03.  
    The RTCP packet type number is changed to 207 to avoid the conflict with the 
    RTCP Feedback draft, and the numbering and structure of a few of the 
    report blocks have been simplified.  There are several open issues 
    regarding the -03 revision, the first of which is that the title was 
    changed to avoid the need to spell out the RTCP acronym.  Steve Casner 
    feels that the title should remain RTCP because that is the accurate 
    topic.  The title would be "RTP Control Protocol (RTCP) Extended 
    Reports", which is not too long.  Section 4.3 specifies how packet 
    arrival timestamps in RTP timestamp format may be reported back; Magnus 
    Westerlund has suggested that these timestamps should be converted to some 
    fixed units to avoid problems if the RTP timestamp clock rate changes.  
    Steve Casner said that is not usually done because it causes a number of 
    other complications, and that you should be able to use the sequence 
    numbers that are returned with the timestamps to select the correct rate.  
    However, Steve asked for clarification in the text that these are 
    arrival timestamps, not the sender's timestamps.
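       The dependence of these timestamps on the clock rate can be shown 
    with a small sketch.  The helper names are hypothetical; the only 
    assumptions are that RTP timestamps are 32-bit values that wrap around and 
    that the receiver knows the clock rate in effect for the packets 
    concerned.

```python
def rtp_ts_delta(later: int, earlier: int) -> int:
    """Difference between two 32-bit RTP timestamps, allowing for wraparound."""
    return (later - earlier) & 0xFFFFFFFF


def rtp_units_to_seconds(units: int, clock_rate_hz: int) -> float:
    """Convert a duration in RTP timestamp units to seconds."""
    return units / clock_rate_hz


# The same number of timestamp units means different durations at
# different clock rates, which is why the correct rate must be selected.
units = rtp_ts_delta(100, 0xFFFFFF00)          # wraps past 2**32
at_8k = rtp_units_to_seconds(units, 8000)      # e.g. a typical audio clock
at_90k = rtp_units_to_seconds(units, 90000)    # e.g. a typical video clock
```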
       In Section 4.4, the definition of standard deviation and TTL will be 
    clarified.  The variable geometry of this packet type, indicated by a bit 
    field, was simplified to a single format.  Some people would prefer the 
    space savings of the variable format, while others prefer the 
    simplicity.  Steve suggested that if particular subsets are most useful, 
    then those subsets should be defined as distinct block types.
       Alan Clark discussed the changes to the VoIP metrics in Section 4.7.  In 
    -02, echo level measurements were added since a number of 
    implementations produce that data.  In -03, to support 
    sample-based codecs as well as frame-based codecs, the jitter buffer 
    metric was changed to units of time (5ms) rather than frames.  A couple of 
    people raised the question whether the 5ms unit should be 1ms instead.  
    Alan responded that 5ms seemed a better tradeoff between resolution and 
    range for the performance effects to be measured.  It could be done, but 
    would require increasing three fields from 8 to 16 bits.  Anwar ??? asked 
    about the requirement that all fields of the VoIP report block be 
    supported.  Alan responded that although the block structure is fixed, some 
    of the fields allow an "undefined" value for use when the metric does not 
    apply.  Steve Casner asked for the use of the undefined value to be 
    clarified.  Anwar also asked if there would be an XR block type for 
    one-way delay.  Timur responded that this could be added in the future if 
    someone develops a method that proves useful.  Philippe Gentric wants to get 
    reports of instantaneous fill level rather than average; Alan said the 
    collection of measures should allow this to be determined.
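       The tradeoff between resolution and range in the 5ms versus 1ms 
    question is simple arithmetic, sketched below.  The sketch assumes an 
    unsigned field in which every code value is usable and ignores any values 
    reserved for "undefined"; the function names are hypothetical.

```python
def max_representable_ms(bits: int, unit_ms: int) -> int:
    """Largest delay an unsigned field of the given width can express."""
    return (2 ** bits - 1) * unit_ms


def encode_delay(delay_ms: float, bits: int, unit_ms: int) -> int:
    """Quantize a delay to the field's unit, saturating at the field maximum."""
    code = round(delay_ms / unit_ms)
    return max(0, min(code, 2 ** bits - 1))


range_8bit_5ms = max_representable_ms(8, 5)    # 1275 ms
range_8bit_1ms = max_representable_ms(8, 1)    # 255 ms
range_16bit_1ms = max_representable_ms(16, 1)  # 65535 ms
```

    Under these assumptions an 8-bit field in 1ms units saturates at 255 ms, 
    which is why moving to 1ms resolution would mean widening the fields to 16 
    bits.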
       Steve Casner raised a few issues regarding the completeness of the 
    draft and the merging of the VoIP metrics with the other sections.  The 
    VoIP metrics include a measure of round trip time, but as specified, this is 
    only possible if data is being transmitted in both directions.  The draft 
    needs to explain that, or to explain what to do if that's not the case.  The 
    VoIP section could refer to the non-sender RTT measurement method in 
    Section 4.5, but it doesn't.  The draft needs an applicability 
    statement at the top to explain which of the tools in this toolbox should be 
    used in which situations, and how they would be used.  The draft should be 
    careful to avoid unstated assumptions about the use cases.  Another 
    problem is that Section 4.7 gives rules for how often the VoIP metric 
    blocks must be sent.  This draft is not allowed to say that.  The RTP spec 
    defines the basic RTCP packet timing, and the RTCP Feedback profile 
    defines alternate timing under a more restrictive use case.  This draft 
    could refer to use under the RTCP Feedback profile, but it can't specify 
    timing of its own.  Alan responded that this would be done, and 
    commented that in some cases the more useful information in the VoIP 
    report block would allow less frequent reporting, saving bandwidth.  Steve 
    agreed that it would be good to add a sentence pointing this out because the 
    increase in RTCP packet size caused by the inclusion of the VoIP 
    metrics will mean that the interval between RTCP packets is 
       A more complicated issue raised by Magnus Westerlund is that the draft 
    should specify SDP signaling for its use.  This is a valid request, but 
    might introduce too much delay in completing the draft.  Colin Perkins 
    pointed out the need to update the document quickly in order to meet a May 
    deadline for consideration by ITU H.460.9.  The specification of SDP 
    signaling could be a separate draft, but it appears feasible to include a 
    specification of SDP for point-to-point applications and leave more 
    complicated scenarios for a separate draft.
    RTCP Extensions for SSM with unicast feedback
       Julian Chesterfield discussed updates in 
    draft-ietf-avt-rtcpssm-03 which introduces some additional summary 
    statistics, some clarifications and examples of security 
    requirements and how the security methods should be used for RTSP and SIP 
    sessions, and updates to the IANA considerations.  The new summary 
    sub-blocks are IPv6 feedback address, BYE list, and RTCP receiver 
    bandwidth.  This revision is now aligned with the RTCP XR format that 
    Timur just presented, as was suggested in a previous AVT meeting.  Steve 
    Casner clarified that the reason for the suggestion was not to save an RTCP 
    packet type code point, rather to ask if the same reporting 
    mechanisms could be shared for both purposes rather than defining 
    different ones.  If there is no feasible merging of 
    functionality, then there is not a requirement to fit the SSM reports into 
    the XR block structure with extra overhead.  The authors will 
    reconsider this issue.
       Some open issues remain for the draft: clarification that the sender 
    must forward a group size report whenever the size changes to allow 
    correct operation of RTP timer reconsideration rules; reporting in the 
    summary what the data corresponds to in terms of sample group size and 
    receiver report age; and the alignment with XR.  The authors would like to 
    know whether there are any other implementations of this draft, and 
    whether it will be considered ready for WG last call when these open 
    issues have been addressed.
       Timur Friedman suggests using XR block type 8 rather than 10.  John 
    Lazzaro asked about summarization of the "extended highest sequence 
    number" field from RTCP reports.  MWPP uses this value; he would want the 
    summary to be the minimum value reported by receivers.  He will think 
    about the details and make a suggestion.  Steve Casner asked how the 
    summarization methods in this draft compare with the summarization 
    methods in the XR draft.  Eve Schooler responded that the nature of the 
    summary methods in the two drafts is different.  The XR draft reports 
    sampling of data, while SSM reports mathematical distributions.  The 
    authors of both drafts agreed to meet after the session to discuss this in 
    more detail.  Steve commented that the reason for raising the point was 
    that if there is no common applicability of the two methods, then 
    binding them together does not make sense.
    RTP Payload Format for JVT Video
       Stephan Wenger discussed 
    draft-ietf-avt-rtp-h264-01.  The spec for the JVT codec itself was 
    finalized in a meeting the previous week with acceptance for ISO FDIS 
    status and with "Consent" in the ITU-T due on March 28, so it is now time to 
    finalize the packetization here and Stephan hopes to accelerate the 
    process.  A -02 revision is to be produced shortly, and perhaps one more 
    revision if there are comments.  The hope is to go to WG last call before 
    the next meeting.  There were many editorial changes from -00, plus a few 
    technical additions.  Fragmentation was added, but needs more 
    description and an example.  A "decoding order number" was added to 
    facilitate some special handling of the packets without needing to decode 
    the bitstream, but the description in this revision is inadequate.  A 
    section was added on MIME registration and SDP usage, but this may also 
    need refinement.
       The main issue for this payload format is interoperability with the 
    MPEG-4 Simple payload format.  A new informational draft will be written to 
    describe how this interoperation may be done.  Colin Perkins asked 
    whether interoperation with RFC 3016 is possible and whether there is any 
    pressure for this.  Stephan replied that he thought not, but will 
    investigate for the interop draft.  As the h264-01 draft now stands, it is 
    possible to transmit a subset of JVT in MPEG-4 Simple access unit 
    fragments, although the utility of this is not clear.  It was also agreed at 
    the JVT meeting to add a new form of STAP to allow more useful 
    interoperation.  But it has also been suggested that this payload format 
    should re-model the interleaving of MPEG-4 Simple and that the 
    STAP/MTAP syntax should be aligned with MPEG-4 Simple.  This would 
    achieve syntactic alignment allowing some re-use of code, but there would be a 
    huge semantic difference.  Dave Singer expressed the position that this 
    draft should be made the best it can be for H.264 first, and then 
    consider interop.  The only solid requirement for interop is that it must be 
    possible for a video stream using this payload format to be part of an 
    MPEG-4 presentation.  That is already possible.  It was agreed not to 
    introduce mythical similarities.
       At the previous AVT meeting, Stephan was resisting the addition of 
    media-unaware fragmentation, but has added it in this revision because it 
    may be necessary to transport a NALU of size greater than 64KB which is the 
    most that IPv4 fragmentation can handle.  In addition, there may be 
    content pre-recorded with a NALU size that does not fit the MTU size of a 
    delivery network.  Doing the fragmentation at the application layer 
    allows application of tools such as RFC 2733 FEC for better 
    protection efficiency.
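       Media-unaware fragmentation at the application layer can be sketched 
    as below.  This is not the fragmentation syntax of the draft: the 
    (first, last, chunk) tuples are a hypothetical representation, and 
    max_payload stands for whatever payload budget the delivery network's 
    MTU leaves after headers.

```python
def fragment(nalu: bytes, max_payload: int):
    """Split a NALU into (is_first, is_last, chunk) tuples of bounded size."""
    if max_payload <= 0:
        raise ValueError("max_payload must be positive")
    chunks = [nalu[i:i + max_payload] for i in range(0, len(nalu), max_payload)]
    if not chunks:
        chunks = [b""]  # an empty NALU still yields one fragment
    last = len(chunks) - 1
    return [(i == 0, i == last, chunk) for i, chunk in enumerate(chunks)]


def reassemble(fragments) -> bytes:
    """Rebuild the NALU; requires the first and last fragments to be present."""
    if not fragments or not fragments[0][0] or not fragments[-1][1]:
        raise ValueError("missing first or last fragment")
    return b"".join(chunk for _, _, chunk in fragments)


# A NALU larger than 64KB, split to fit an assumed 1400-byte payload budget.
nalu = bytes(range(256)) * 300
frags = fragment(nalu, 1400)
```

    Since each fragment travels in its own RTP packet, packet-level tools 
    such as FEC can protect the fragments individually.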
       The security section of the document needs to be expanded beyond the 
    minimal text in -01.  The main issue is the vulnerability of Parameter Sets 
    when transmitted in-band.  This is addressed in the codec spec, and Colin 
    Perkins said a normative reference could be made to that.  Steve Casner 
    asked why the in-band transmission of Parameter Sets should be allowed at 
    all, given that the draft says it is a bad idea.  Stephan replied that it is 
    necessary in some scenarios involving gateways.  Philippe Gentric 
    expressed serious concern that if this practice is allowed then people will 
    use it when it shouldn't be used, and it will cause endless headaches 
    similar to those already seen with RFC 3016.  Stephan said the draft will 
    say SHOULD NOT to discourage its use.  The codec spec also includes the 
    means to transmit data of unknown type that may or may not be active 
    information, for example a set-top box software update that could 
    contain a virus.  Stephan asks for help on security language to address 
    this problem.
       Another open issue is video conferencing support.  Stephan would like to 
    specify MIME codepoints for levels of operation beyond those included in the 
    codec spec, e.g. 704x576 images at 7.5 fps.  Colin asked whether these 
    modes will be added in future revisions of the codec spec, which would 
    result in two ways to specify the same mode.  Stephan said no.  There is 
    also a question whether this payload format is the right place to 
    specify whether the response to a request for a full intraframe is 
    mandatory.  It was agreed that this payload format cannot specify that 
    because it is subject to congestion control.  Colin said this topic will be 
    discussed separately in some combination of AVT and MMUSIC.
       The last open issue is regarding timestamps for Field mode, where there 
    may be separate sampling instants for the two fields of a picture.  
    Discussion requires detailed knowledge of H.264, so was deferred.
    RTP Payload Format for uncompressed video
       Ladan Gharai reviewed the modifications, additions, and remaining open 
    issues for draft-ietf-avt-uncomp-video-02.  The payload header now 
    extends the sequence number in the RTP header to 32 bits to 
    accommodate high data rates.  Additional color codings for 4:1:1 and 4:2:0 
    interlaced and progressive were added, and separate timestamps are 
    specified on interlaced fields to accommodate reversing 3:2 pulldown (an 
    issue that was raised at the previous meeting).  The draft now 
    specifies required and optional SDP parameters (rate, pgroup, 
    color-mode, etc.) and covers congestion control in Security 
    Considerations (a serious issue for very high data rates).
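       The 32-bit sequence number extension can be illustrated with a short 
    sketch.  It assumes that the payload header carries the high-order 16 
    bits and the RTP header the low-order 16 bits; that layout and the 
    function names are assumptions for illustration, not quoted from the 
    draft.

```python
def extended_seq(high16: int, low16: int) -> int:
    """Combine an assumed 16-bit extension with the 16-bit RTP sequence number."""
    if not (0 <= high16 <= 0xFFFF and 0 <= low16 <= 0xFFFF):
        raise ValueError("both halves must fit in 16 bits")
    return (high16 << 16) | low16


def split_seq(seq32: int):
    """Split a 32-bit sequence number into (high16, low16)."""
    return (seq32 >> 16) & 0xFFFF, seq32 & 0xFFFF


def wrap_time_seconds(bits: int, packets_per_second: float) -> float:
    """Time until a sequence number of the given width wraps around."""
    return (2 ** bits) / packets_per_second
```

    At high data rates the packet rate is large enough that a 16-bit 
    sequence number wraps within a very short time, so wrap_time_seconds 
    gives a quick way to compare the 16-bit and 32-bit cases for a given 
    packet rate.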
       There are some open issues.  At the previous IETF it was suggested that a 
    planar video mode be added to the draft, i.e., sending Y, Cb and Cr 
    planes separately.  This requires a 2-bit field, which could be stolen from 
    the length field, to indicate which plane.  The main concern is that a 
    substantial portion of the draft related to pgroups and color 
    subsampling would be irrelevant to this mode.  Does it make sense to 
    combine these disparate modes in one draft?  Stephan Wenger 
    recommended keeping them separate.  Two bits may not be enough since there 
    are some color models with more than 4 planes, and the plane numbering 
    would need to be specified for all these models.  He also felt this would 
    not be used because display systems are built to receive data in that 
    form.  Philippe Gentric countered that systems doing processing, rather 
    than display, do operate on the planes separately, and since they 
    operate at gigabit rates, saving memory shuffling is important.  
    However, after some study he concluded that the implementations differ so 
    much that a single format won't help.  It was agreed to drop planar mode.
       On the other hand, the group felt it was worthwhile to keep the 
    detailed listing of pgroup values rather than just the text 
    explaining the rules for determining the values.  Ladan asked if anyone has 
    additional color subsampling or colorimetry values that should be 
    included in the draft.  Stephan Wenger said he had several that he would 
    discuss after the meeting.  Steve Casner said the draft needs to specify the 
    means for defining and registering new values after the draft is 
       The last question was whether there should be a means to 
    independently represent SMPTE timecodes, or are the RTP and RTCP SR 
    timestamps sufficient.  Dave Singer said the SMPTE timecodes are 
    required for some video applications because they can have gaps in edited 
    programs, but it is orthogonal to the data format of the video.  Colin 
    Perkins asked whether the NPT mapping in RTSP solves this problem or 
    whether some means is needed in the payload format.  Philippe Gentric said 
    this is really needed for studio work, and recommended that there be a 
    separate payload format defined for carrying just the SMPTE timecodes that 
    could then be synchronized with RTP mechanisms to any payload format.  
    Colin recalled a proposal for that a couple of years ago, and 
    suggested that we resurrect it.  Stephan advised we need input on real 
    usage cases in order to produce anything useful.  John Lazzaro said that 
    MIDI timecodes are SMPTE timecodes encoded in MIDI, and MWPP provides a 
    means to transport these with resiliency.  This would be a solution, 
    perhaps a gross one.
    RTP Payload Format for a 64 kbit/s transparent call
       The second session started with a discussion of the RTP Payload format 
    for 64 kbps transparent calls 
    (draft-kreuter-avt-rtp-clearmode-02.txt) by Ruediger Kreuter. This draft 
    describes a single 64kbps clearmode connection, and MIME type 
    registration. It is distinct from Nx64kbps transport 
    (draft-vainshtein-cesopsn-XX.txt) work which is done in the PWE3 working 
    group.
       The MIME type "audio/clearmode" is registered. Ruediger asked if this is 
    appropriate. Flemming Andreasen said yes, using "audio" as the MIME type 
    makes things easier in SDP since it can be included in the same media 
    stream as other codecs. Ruediger also noted that the ITU use "audio" for 
    this, as do existing products, so that helps compatibility. Ruediger noted 
    that Magnus Westerlund sent a number of comments on the draft. Due to 
    these he will include the "maxptime" parameter, and an explanation of the 
    mapping from MIME parameters to SDP parameters. 
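       As an illustration of the point about sharing a media stream, the
    following Python sketch builds an SDP media section that offers PCMU and
    clearmode on one "m=audio" line. The encoding name "CLEARMODE", the 8000 Hz
    clock rate, payload type 97, the port number and the function name are all
    assumptions made for this example, not values taken from the draft.

```python
def clearmode_media_section(port, clearmode_pt=97, maxptime_ms=20):
    """Return SDP lines offering PCMU and clearmode in one audio stream.

    Hypothetical helper: encoding name and clock rate are assumed.
    """
    if not 96 <= clearmode_pt <= 127:
        raise ValueError("clearmode needs a dynamic payload type (96-127)")
    return [
        # PCMU uses static payload type 0; clearmode shares the same m= line
        # because both are typed as "audio".
        f"m=audio {port} RTP/AVP 0 {clearmode_pt}",
        # Assumed mapping of the MIME subtype onto the rtpmap encoding name.
        f"a=rtpmap:{clearmode_pt} CLEARMODE/8000",
        # The "maxptime" MIME parameter mapped to an SDP attribute.
        f"a=maxptime:{maxptime_ms}",
    ]


if __name__ == "__main__":
    print("\n".join(clearmode_media_section(49170)))
```

    Had the type been registered under a different top-level media type, the
    two formats would need separate "m=" lines and could not be alternatives
    within one stream.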
       The chairs asked if there were objections to making this a work item.
    None were raised, so the next version will be submitted as an AVT work 
    item, subject to area director approval.  Colin Perkins noted that it is 
    necessary to avoid overlap with work in PWE3 (i.e. limit this to a single 
    64kbps channel) if we are to accept it as a work item.
    The MIDI Wire Protocol Packetization (MWPP)
       John Lazzaro discussed MWPP
    (draft-ietf-avt-mwpp-midi-rtp-06.txt) and its coding guidelines
    (draft-lazzaro-avt-mwpp-coding-guidelines-02.txt). He noted that he has been in
    discussion with the MMA and AMEI (the MIDI manufacturers 
    associations for the US and Japan), and is incorporating their feedback as 
    appropriate. He also noted that IEEE P1639 has related work, solving the 
    problem at layer 2, and outlined differences between the two 
       John noted that equipment today uses MIDI cables, which are asynchronous
    serial cables that do not carry timestamps. Instead, the timing of the
    signal on the wire denotes the media timing. People want pseudo-wire
    emulation in RTP, so they can use the precise packet times when operating
    on a LAN or other low-jitter environment.  John wants a parameter to
    indicate that the sender is "precision", so that the last command in the
    packet indicates the sending time.
       The other issue is synchronization between direct digital audio output 
    and synth MIDI output. Options are to use an external clock sync, and 
    slave everything to it, or to specify packet buffer latency and use 
    manual calibration.
       There was much discussion around this subject between John, Magnus
    Westerlund, Steve Casner and Colin Perkins. The discussion leant towards
    the first option: somehow convey the mapping between RTP timestamp and an
    external sync clock, and use the external clock to determine when to
    play out the media. Steve Casner gave a reference to the BBN
    synchronization protocol (J. Escobar, C. Partridge and D. Deutsch, Flow 
    Synchronization Protocol, IEEE/ACM Transactions on Networking, Volume 2, 
    Number 2, April 1994), which is an example of this approach. 
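       The first option can be sketched as below. This is only an
    illustration, assuming that a reference pair (an RTP timestamp and the
    external clock reading it corresponds to) has been conveyed by some means
    the minutes do not define. The function and parameter names are
    hypothetical.

```python
RTP_TS_MODULUS = 1 << 32  # RTP timestamps are 32-bit and wrap around


def playout_time(rtp_ts, ref_rtp_ts, ref_clock_s, clock_rate_hz, latency_s=0.0):
    """Map an RTP timestamp onto an external sync clock, in seconds.

    Assumes rtp_ts is at or after ref_rtp_ts and less than 2**32 ticks
    later, so the modular difference is the true elapsed tick count.
    """
    elapsed_ticks = (rtp_ts - ref_rtp_ts) % RTP_TS_MODULUS
    return ref_clock_s + elapsed_ticks / clock_rate_hz + latency_s


if __name__ == "__main__":
    # One second of media at a 48 kHz RTP clock after the reference point.
    print(playout_time(48000, 0, 100.0, 48000))
```

    Every receiver slaved to the same external clock computes the same playout
    instant for a given timestamp, which is what removes the need for manual
    calibration.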
       Colin Perkins noted that RTSP has a similar signalling mechanism, to 
    convey a mapping between RTP timestamps and a SMPTE timecode. John noted 
    that MWPP may also be used with SIP, so a more general parameter may be 
    useful, but he is nervous about splitting this out into a separate
    document due to feature creep. It may be appropriate for the MWPP 
    document to outline the form of the solution, but to leave the actual 
    solution to a different document (since the solution to 
    synchronizing playout from multiple sources is more general than MWPP).
    RTP Payload Format for ATRAC-X
       Matthew Romaine discussed 
    draft-hatanaka-avt-rtp-atracx-01.txt, the RTP payload format for 
    ATRAC-X. A number of questions were raised at the last meeting: 
    ambiguity in the timestamp definition, concern that the reasoning behind the 
    multiplexing was not convincing, the reason why RFC 2198 was not used for 
    redundancy, and concern about possible decoding ambiguity when 
    fragmentation is used. Since then, the timestamp and sample rate have been 
    clarified, the redundant data framework has been modified, a new method of 
    multi-channel data decomposition has been introduced, and the reasons for 
    multiplexing have been solidified.  
       The timestamp now corresponds to the presentation time in 
    milliseconds and the sample rate must now be identical for all streams. 
    Steve Casner asked why milliseconds are used for the presentation time
    rather than units of the sample clock rate. The authors will revisit this.
       The RFC 2198 redundancy format has a lot of overhead, because of the 
    payload headers, and its block length field (10 bits) is too short, so 
    redundancy is included in the payload format.
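       For reference, the RFC 2198 header for each redundant block is 32 bits:
    a 1-bit F flag, a 7-bit block payload type, a 14-bit timestamp offset and
    a 10-bit block length. The Python sketch below packs that header and shows
    where the 10-bit limit (1023 octets per redundant block) comes from. The
    function name is hypothetical and the code is not taken from the ATRAC-X
    draft.

```python
import struct

MAX_BLOCK_PT = (1 << 7) - 1       # 7-bit block payload type
MAX_TS_OFFSET = (1 << 14) - 1     # 14-bit timestamp offset
MAX_BLOCK_LENGTH = (1 << 10) - 1  # 10-bit block length: at most 1023 octets


def pack_redundant_header(block_pt, ts_offset, block_length):
    """Pack the 4-octet RFC 2198 header that precedes a redundant block."""
    if not 0 <= block_pt <= MAX_BLOCK_PT:
        raise ValueError("block PT does not fit in 7 bits")
    if not 0 <= ts_offset <= MAX_TS_OFFSET:
        raise ValueError("timestamp offset does not fit in 14 bits")
    if not 0 <= block_length <= MAX_BLOCK_LENGTH:
        raise ValueError("block length does not fit in 10 bits")
    # F=1 marks that another header follows this one.
    word = (1 << 31) | (block_pt << 24) | (ts_offset << 10) | block_length
    return struct.pack("!I", word)


if __name__ == "__main__":
    print(pack_redundant_header(5, 160, 100).hex())
```

    Each redundant block costs four header octets, and a block longer than
    1023 octets cannot be described at all, which matches the two objections
    recorded above.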
       Multi-channel decomposition is now included, allowing rate 
    adaptation at the sender, which can drop the less important channels based on 
    feedback from the receiver, and allowing for multi-channel 
    presentations to be split across multiple ATRAC-X streams if they use more 
    channels than are supported by the codec. 
       There was some confusion over the terminology, with Steve Casner and
    Colin Perkins asking whether the multiple streams come from one source
    or from multiple sources. The draft will be
    updated to clarify the distinction between ATRAC-X streams and RTP 
       Steve Casner expressed concern that the payload format is
    inflexible in the way streams are multiplexed: for example, layered coding
    is typically implemented by layering data across several RTP sessions, but
    the payload format sends the multiple layers within a single RTP stream. It
    would be appropriate to consider scenarios other than unicast
    streaming to ensure the design does not unduly constrain future
    RTP Payload Format for iLBC Speech
       Alan Duric discussed iLBC speech 
    (draft-ietf-avt-rtp-ilbc-01.txt and 
    draft-ietf-avt-ilbc-codec-01.txt). Since the last meeting, the codec and 
    payload format have been tested at the SIPit 12 interop event, with 
    several interoperable implementations; a 20ms frame size mode has been 
    introduced into the codec (with 4 blocks of 160 samples); and the PLC has 
    been enhanced.
       The SDP now includes a "mode=" attribute to indicate whether a 20ms or
    30ms frame size is preferred (the default, if the attribute is not present,
    is 30ms for backwards compatibility). Steve Casner asked if the frame size
    can be distinguished in-band. Not without breaking backwards
    compatibility. Steve asked how the zero value - support both - is used.
    This is not clear; maybe it should just be listed as "reserved". Colin
    Perkins asked if the "a=ptime" attribute can be used instead.  No, due to
    ambiguity with 60ms packets (2x30ms or 3x20ms).
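       The ambiguity can be checked with a few lines of Python. This is a
    minimal sketch that uses only the 20ms and 30ms frame sizes mentioned in
    the discussion; the function name is hypothetical.

```python
ILBC_FRAME_MS = (20, 30)  # frame durations discussed for iLBC


def possible_framings(ptime_ms):
    """List (frame_count, frame_ms) pairs that exactly fill ptime_ms."""
    return [(ptime_ms // frame_ms, frame_ms)
            for frame_ms in ILBC_FRAME_MS
            if ptime_ms % frame_ms == 0]


if __name__ == "__main__":
    for ptime in (20, 30, 40, 60):
        print(ptime, possible_framings(ptime))
```

    A packet time of 60ms fits both three 20ms frames and two 30ms frames, so
    "a=ptime" alone cannot tell a receiver which frame size is in use.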
       The current open issue is VAD and comfort noise generation. There is no
    interest at present, so they plan to drop this. Steve Casner asked whether
    comfort noise frames would need to be indicated in-band. Yes; how they can
    be distinguished needs to be considered. It may be possible to use the
    existing comfort noise payload format.
       See also http://www.iLBCfreeware.org/ for source code and more info.
    RTP Payload Format for the Speex Codec
       Greg Herlein was scheduled to discuss the Speex codec and payload 
    (draft-herlein-speex-rtp-profile-00.txt), but couldn't reach the meeting due 
    to traffic disruption caused by anti-war protests in the city. Steve 
    Casner briefly outlined the codec and payload format, and asked for any 
    comments to be sent to the mailing list.
    RTP Payload Format for RGL Codec
       Michael Ramalho discussed the RGL codec 
    (draft-ramalho-rgl-desc-01.txt) and payload format 
    (draft-ramalho-rgl-rtpformat-01.txt). He started with a brief 
    introduction to the codec, which provides lossless compression for G.711. 
    The output of the RGL codec is variable rate. Arbitrary input can be 
    compressed, and the output will be at most one octet larger than the input 
    (and usually will be much smaller).
       The first octet of the compressed output will never match one of a set
    of reserved values (0x3e, 0x5e, 0x7e, 0x9e, 0xbe, 0xde and 0xfe). These
    reserved values are used in the payload format to indicate a 
    particular bundled format: bundling of one, two or three RGL frames per 
    packet is signalled by one of these reserved values, and a less 
    efficient, but generic, format is signalled by another value, and allows an 
    arbitrary number of frames per packet. 
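       A receiver can tell a bare RGL frame from a bundled payload by looking
    at the first octet, as in the Python sketch below. The sketch only tests
    membership in the reserved set; which reserved value selects which bundled
    format is not stated in these minutes, so it is deliberately left out. The
    function name is hypothetical.

```python
# Reserved first-octet values listed in the discussion.
RESERVED_FIRST_OCTETS = frozenset({0x3E, 0x5E, 0x7E, 0x9E, 0xBE, 0xDE, 0xFE})


def starts_with_reserved_octet(payload):
    """Return True if the payload begins with a reserved format marker."""
    if not payload:
        raise ValueError("empty payload")
    return payload[0] in RESERVED_FIRST_OCTETS


if __name__ == "__main__":
    print(starts_with_reserved_octet(bytes([0x7E, 0x01, 0x02])))
    print(starts_with_reserved_octet(bytes([0x1E, 0x01, 0x02])))
```

    As an observation on the listed values only, all seven share the low five
    bits 11110 and have non-zero upper three bits.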
       Steve Casner asked if the codec is likely to switch between single- and 
    multi-frame modes? It may be possible to distinguish the bundling at 
    session setup time, to avoid the need for reserved byte codes (as is done by 
    the EVRC payload format). Magnus Westerlund also noted that one can 
    always distinguish the format using different RTP payload types, so it may 
    not be necessary to use the reserved codes.
       Magnus Westerlund asked if the generic format, or something similar with a 
    count of the number of frames in the packet, can be used in all cases? This 
    is possible, but the overhead may be too large. Magnus noted that, if the 
    type octet was removed, the size would be the same in almost all cases, and 
    the format would be simpler. 
       Steve Casner noted that this is rather complex, with several
    different formats.  That is a drawback that could be fixed by using two
    payload types (one frame per packet; multiple frames per packet).
       Michael noted that it is desirable that the number of samples in the 
    payload headers matches the "a=ptime" attribute in the SDP. Also, he noted
    that there is a missing parameter for a-law or u-law.
       More information is available at 
    RTP Payload Format for ETSI ES 202 050 DSR
       Qiaobing Xie discussed 
    draft-xie-avt-dsr-es202050-00.txt, which is a new encoding scheme for 
    distributed speech recognition. The draft uses exactly the same format as 
    that used in the previous DSR payload format RFC-to-be, although the 
    contents of the frame pairs, and the MIME type, are different.  The group
    agreed to accept this as a work item, subject to area director approval.
       Steve Casner asked whether this draft could be simplified further, to be
    just a reference to the previous DSR payload format and a registration of
    the new MIME type. Qiaobing wished to clarify a couple of points that were
    unclear in the previous draft, and so copied the text and added a 
    diagram for explanation. Steve said it would be appropriate to insert the 
    change into the other draft, as an RFC editor note, to ensure that both are 
    File format for EVRC/SMV vs QCP
       Steve Casner noted that there is a proposal to change the file format for 
    the EVRC/SMV payload format (currently with the RFC Editor) to be that 
    defined in draft-garudadri-qcp-00.txt. Steve asked for an "ego-free" 
    evaluation of which format makes most sense, or if we should accept both 
       It was asked if the QCP file format has IPR associated with it, and if 
    the implementations are independent. Randall Gellens replied, noting that he 
    is still researching these issues and cannot give a definitive answer, but 
    that he thinks the QCP implementations are independent and is not aware of 
    any IPR on QCP. 
       Randall also noted that he was unaware of the QCP format when working on 
    the EVRC/SMV payload format.  The QCP format has been in use for several 
    years, by other applications, but the MIME type has not been 
    registered.  It may be necessary to register this, for use with QCELP for 
    example, even if it is not used for the EVRC/SMV payload format. There may be 
    implementations using this with EVRC/SMV though.
       It is unclear if the EVRC/SMV format is used as currently 
    specified, but 3GPP-2 may be considering this. Steve Casner asked if we can 
    get clarification from 3GPP-2 on their plans for the file format. We may be
    able to get a liaison statement from 3GPP-2 stating their needs.
       No decision was made, but it is hoped that input from 3GPP-2 can help to
    resolve the issues.
    Other discussion
       Since there was time left at the end of the session - due to missing the 
    Speex discussion - Steve Casner asked if there were any other topics that 
    the group wished to discuss?
       It was asked if it is possible to add new items to the RTCP 
    reporting extensions draft (e.g. for video quality assessment)? Not at this 
    time, because that draft is complete and we don't wish to delay it. 
    Extensions may be written as new drafts though.
       John Lazzaro asked when the AVT group might close, since the charter is 
    open ended due to the addition of new payload formats? Steve noted that we 
    have fairly broad license to work on payload formats, but need area 
    director approval for other new work items. Colin Perkins noted that the 
    current charter lists getting RTP to Draft Standard as the main goal of the 
    group. This is now complete, so we should be considering future work 
       Stephan Wenger asked if it is appropriate to consider moving 
    existing payload formats to draft standard? Yes, or to historic if they are 
    no longer useful. There was some discussion of how the H.263 format can 
    move to draft standard. 
       Magnus Westerlund noted that NAT boxes present a challenge. Should the
    AVT group work to define "symmetric RTP"? Steve replied that it should
    probably be defined in the control protocols that use RTP, since they may
    have differing interpretations of the term. Magnus noted that there is a
    lot of commonality between the protocols, so a separate draft may be
    warranted, and that there is also the issue of address binding. This is
    something to be discussed offline.
       Colin Perkins noted that the group may also wish to consider mapping RTP 
    onto DCCP as a future work item. Transport over PR-SCTP was suggested as 
    another possible work item.  These are long term goals, not something we 
    should rush into, but something to consider as those other protocols are 


    Document Status
    Secure Real-time Transport Protocol
    RTP Framing for Connection Oriented Transport
    RTP Extended Reports
    RTCP Extension for SSM Session with Unicast Feedback
    RTP Payload for JVT Video
    RTP payload format for uncompressed video
    RTP Payload Format for a 64 kbit/s transparent call
    MIDI Wire Protocol Packetization
    RTP Payload format for ATRAC-X
    RTP Payload Format for iLBC Speech
    RTP Payload Format for the Speex Codec
    RTP Payload Format for RGL Codec
    RTP Payload Format for ETSI ES-202-050 DSR