2.8.1 Audio/Video Transport (avt)


In addition to this official charter maintained by the IETF Secretariat, there is additional information about this working group on the Web at:

       http://www.cs.columbia.edu/~hgs/rtp/faq.html -- RTP FAQ Page
NOTE: This charter is a snapshot of the 55th IETF Meeting in Atlanta, Georgia, USA. It may now be out-of-date.

Last Modified: 07/03/2002

Chair(s):
Stephen Casner <casner@acm.org>
Colin Perkins <csp@isi.edu>
Transport Area Director(s):
Scott Bradner <sob@harvard.edu>
A. Mankin <mankin@isi.edu>
Transport Area Advisor:
A. Mankin <mankin@isi.edu>
Mailing Lists:
General Discussion: avt@ietf.org
To Subscribe: avt-request@ietf.org
Archive: ftp://ftp.ietf.org/ietf-mail-archive/avt
Description of Working Group:
The Audio/Video Transport Working Group was formed to specify a protocol for real-time transmission of audio and video over UDP and IP multicast. This is the Real-time Transport Protocol, RTP, together with its associated profile for audio/video conferences and payload format documents.

The current goals of the working group are to revise the main RTP specification and the RTP profile ready for advancement to draft standard stage (including the sampling algorithms for use with very large groups, which have been broken out into a separate document), to complete the RTP MIB, to produce a guidelines document for future developers of payload formats and to continue development of new payload formats.

The payload formats currently under discussion include a number of media specific formats (MPEG-4, DTMF, PureVoice) and FEC techniques applicable to multiple formats (parity FEC, Reed-Solomon coding).

Archive before July 2001: ftp://ftp.es.net/pub/mail-archive/rem-conf/

Goals and Milestones:
Done  Working group last call on guidelines for payload format writers (BCP)
Done  Working group last call on parity FEC draft (standards track)
Done  Post revised RTP MIB and issue working group last call (stds track)
Done  Post revised DTMF payload format draft, ready for WG last call
Done  Post RTP implementation checklist draft
Done  Post revised RTP spec and audio/video profile
Done  Post payload format for MPEG-4 based on MPEG/IETF joint meetings
Done  Post revised draft on PureVoice (qcelp) payload format to address WG last call comments
Done  Post revised RTP membership (SSRC) sampling draft
Done  Submit RTP MIB to IESG for publication as Proposed Standard RFC
Done  Submit guidelines for payload format writers for publication as a BCP
Done  New working group last call on PureVoice payload format
Done  Working group last call on revised SSRC sampling draft (experimental)
Done  Post final revision of RTP spec and A/V profile drafts
Done  Analysis/simulation of multiplexing payload format proposals
Done  Revise MPEG-4 payload format document after implementation experience
Done  Decide how to proceed with multiplexing protocol: one generic payload format or a number of application specific formats
Done  Working group last call on RTP and A/V profile (for Draft Standard)
Done  Prepare MPEG4 implementation results ready for WG last call
Done  Post final revisions of selected multiplexing protocol draft(s)
Done  Working group last call on multiplexing payload format (stds track)
Internet-Drafts:
  • draft-ietf-avt-profile-new-12.txt
  • draft-ietf-avt-rtp-new-11.txt
  • draft-ietf-avt-rtp-mime-06.txt
  • draft-ietf-avt-rtcp-bw-05.txt
  • draft-ietf-avt-rtp-cn-06.txt
  • draft-ietf-avt-tcrtp-06.txt
  • draft-ietf-avt-smpte292-video-06.txt
  • draft-ietf-avt-crtp-enhance-04.txt
  • draft-ietf-avt-ulp-05.txt
  • draft-ietf-avt-rtp-selret-05.txt
  • draft-ietf-avt-srtp-05.txt
  • draft-ietf-avt-uxp-03.txt
  • draft-ietf-avt-mpeg4-multisl-04.txt
  • draft-ietf-avt-rtcp-feedback-03.txt
  • draft-ietf-avt-mpeg4-simple-04.txt
  • draft-ietf-avt-dsr-03.txt
  • draft-ietf-avt-evrc-smv-03.txt
  • draft-ietf-avt-mwpp-midi-rtp-04.txt
  • draft-ietf-avt-rtcpssm-01.txt
  • draft-ietf-avt-rtp-retransmission-02.txt
  • draft-ietf-avt-rtp-interleave-00.txt
  • draft-ietf-avt-rtp-jpeg2000-01.txt
  • draft-ietf-avt-rfc2833bis-00.txt
  • draft-ietf-avt-rtp-ac3-00.txt
Request For Comments:
    RFC Status Title
    RFC1889 PS RTP: A Transport Protocol for Real-Time Applications
    RFC1890 PS RTP Profile for Audio and Video Conferences with Minimal Control
    RFC2035 PS RTP Payload Format for JPEG-compressed Video
    RFC2032 PS RTP payload format for H.261 video streams
    RFC2029 PS RTP Payload Format of Sun's CellB Video Encoding
    RFC2038 PS RTP Payload Format for MPEG1/MPEG2 Video
    RFC2190 PS RTP Payload Format for H.263 Video Streams
    RFC2198 PS RTP Payload for Redundant Audio Data
    RFC2250 PS RTP Payload Format for MPEG1/MPEG2 Video
    RFC2343 E RTP Payload Format for Bundled MPEG
    RFC2354 I Options for Repair of Streaming Media
    RFC2429 PS RTP Payload Format for the 1998 Version of ITU-T Rec. H.263 Video (H.263+)
    RFC2431 PS RTP Payload Format for BT.656 Video Encoding
    RFC2435 PS RTP Payload Format for JPEG-compressed Video
    RFC2508 PS Compressing IP/UDP/RTP Headers for Low-Speed Serial Links
    RFC2733 PS An RTP Payload Format for Generic Forward Error Correction
    RFC2736 BCP Guidelines for Writers of RTP Payload Format Specifications
    RFC2762 E Sampling of the Group Membership in RTP
    RFC2833 PS RTP Payload for DTMF Digits, Telephony Tones and Telephony Signals
    RFC2793 PS RTP Payload for Text Conversation
    RFC2862 PS RTP Payload Format for Real-Time Pointers
    RFC2959 PS Real-Time Transport Protocol Management Information Base
    RFC3009 PS Registration of parityfec MIME types
    RFC3016 PS RTP payload format for MPEG-4 Audio/Visual streams
    RFC3047 PS RTP Payload Format for ITU-T Recommendation G.722.1
    RFC3119 PS A More Loss-Tolerant RTP Payload Format for MP3 Audio
    RFC3158 I RTP Testing Strategies
    RFC3190 PS RTP Payload Format for 12-bit DAT, 20- and 24-bit Linear Sampled Audio
    RFC3189 PS RTP Payload Format for DV Format Video
    RFC3267 PS RTP payload format and file storage format for the Adaptive Multi-Rate (AMR) and Adaptive Multi-Rate Wideband (AMR-WB) audio codecs

    Current Meeting Report

    Audio/Video Transport Working Group Minutes 
    
    Reported by Stephen Casner and Colin Perkins 
    
    The AVT working group met in two sessions at the 55th IETF meeting in 
    Atlanta. In the first session, the group discussed RTP payload formats for 
    MIDI, DTMF digits and tones, iLBC speech, ATRAC-X audio, and 
    uncompressed video. The session ended with an important discussion of the 
    issues to be resolved for IESG approval of Secure RTP. In the second 
    session, the discussion focused on RTP payload formats for MPEG-4 and JVT 
    video plus RTCP extensions for voice quality reporting and for SSM 
    sessions, and RTP retransmission. A bonus topic on the RGL codec and 
    payload format was squeezed in at the end. 
    
    Introduction, Document Status, and Open Issues 
    
    This meeting began with an update by Steve Casner on document 
    publication status, including a few issues identified for documents in the 
    queue. One RFC was published since the last meeting (RFC 3389 on Comfort 
    Noise payload format), two are in the RFC editor queue (the MIME 
    registration for the payload formats in the RTP profile and the SDP 
    bandwidth modifiers for RTCP bandwidth, both blocked on the RTP 
    specification), and seven are with the IESG. Of the latter, two are the RTP 
    specification and A/V profile (revisions of RFC 1889 and 1890) which have 
    been "tentatively approved". Final approval is pending preparation of a set 
    of "RFC Editor notes" to be passed with the drafts to the RFC Editor to 
    implement the changes requested by the IESG and the resolution of 
    comments by the working group while the documents have been under IESG 
    review. Steve Casner will prepare these notes for approval by the Area 
    Directors. 
    
    Some of those RFC Editor notes implement the resolution of an issue with the 
    RTP A/V profile that was raised just before the previous (54th) IETF 
    meeting. This was a request to change the sample packing order for G.726 
    audio encoding to be consistent with the packing order for ATM AAL2 
    transport as specified in ITU-T Recommendation I.366.2 Annex E. A 
    request for comments on the proposal to make this change was sent to at 
    least ten relevant mailing lists in IETF and ITU-T. The number of 
    comments was surprisingly small, which indicates that there may not be many 
    implementations of G.726 transport in RTP. However, the comments did 
    indicate that both packing orders are in use and that there are parties 
    opposed to making the change in addition to those who proposed the 
    change. 
    
    The conclusion reached by the chairs in consultation with the Area 
    Directors is that we need to define MIME subtypes for two payload 
    formats reflecting the two packing orders. We generally prefer not to have 
    multiple choices because of the risk of incompatibility that imposes, but we 
    are forced into it in this case by an incompatibility that already 
    exists. Furthermore, both packing orders are specified in separate areas of 
    ITU-T (AAL2 and X.400 mail). In order to make clear the 
    incompatibility between the existing G726-* payload formats and the AAL2 
    packing, we will add a note in the A/V profile section that specifies 
    those formats to note the incompatibility and say that a second set of 
    payload formats named AAL2-G726-* will be specified in a separate 
    document. Or, if the IESG agrees, the AAL2-G726-* specification will be 
    added as a new section in the profile. One problem would be that the 
    profile is to be published as a Draft Standard, which means there should 
    first be two interoperable implementations. Alternatively, a separate 
    draft can be produced quickly to be published as a Proposed Standard. 
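
    The incompatibility between the two packing orders can be sketched for
    the 4-bit codewords of the 32 kbit/s rate. This is only an illustration
    under stated assumptions: it assumes that the existing RTP format puts
    the first codeword in the low-order four bits of each octet and that the
    AAL2 order puts it in the high-order four bits. The minutes do not say
    which order is which, so the profile and I.366.2 Annex E should be
    checked before relying on it. The function name pack_g726_32 is
    hypothetical.

```python
def pack_g726_32(codewords, aal2=False):
    """Pack 4-bit G.726 codewords two per octet in one of two orders."""
    if len(codewords) % 2:
        raise ValueError("need an even number of 4-bit codewords")
    out = bytearray()
    for first, second in zip(codewords[0::2], codewords[1::2]):
        if aal2:
            # Assumed AAL2 order: first codeword in the high nibble.
            out.append(((first & 0x0F) << 4) | (second & 0x0F))
        else:
            # Assumed existing RTP order: first codeword in the low nibble.
            out.append(((second & 0x0F) << 4) | (first & 0x0F))
    return bytes(out)


samples = [0x1, 0x2, 0x3, 0x4]
rtp_order = pack_g726_32(samples)
aal2_order = pack_g726_32(samples, aal2=True)
print(rtp_order.hex(), aal2_order.hex())
```

    The same four codewords produce different octets in the two orders, so
    a receiver that assumes the wrong order decodes garbage. That is why the
    two formats need distinct names.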
    
    Flemming Andreasen asked why not make the existing G726-* names 
    indicate the AAL2 packing and make up new names for the existing payload 
    formats for RTP. The primary justification for that approach would be if 
    most implementations used the existing name to indicate a payload format 
    with the same packing as ITU-T I.366.2. That appears not to be the case. The 
    real issue is not the name but the interpretation of static payload type 2 
    which is assigned to G726-32, since most implementations are probably 
    using the 32 kbit/s rate and using the static payload type rather than the MIME 
    name. This incompatible interpretation exists and can't be avoided. 
    Consequently, we will deprecate the use of static payload type 2. All 
    systems should negotiate a dynamic payload type using the MIME subtypes 
    G726-32 or AAL2-G726-32 depending upon which packetization they want to 
    use. A longer summary of the comments and the details of the 
    conclusion was posted by the chairs to the AVT mailing list on November 14, 
    just before this meeting. 
    
    Five other drafts have been submitted to the IESG but not yet accepted for 
    publication. These include enhanced CRTP and TCRTP, the secure RTP 
    profile, the payload format for EVRC/SMV speech, and the payload format for 
    distributed speech recognition. Our Area Director Allison Mankin asked for 
    some changes on ECRTP and TCRTP; revisions were submitted. Discussion of the 
    issues for the secure RTP profile is covered later in these minutes. 
    
    Several drafts are in (extended) working group last call. The RTCP 
    feedback profile draft was updated for this meeting to address comments 
    from the last call, but the authors did not have time to complete a 
    "wording cleanup" pass they want to do, so we will wait for that and give 
    the WG a last chance to read it before passing it on to the IESG. Steve 
    Casner asked for the feedback simulation draft to be updated and 
    resubmitted so it can accompany the feedback profile as an 
    Informational RFC to help convince the IESG that congestion control 
    issues have been properly addressed. José Rey said he would try to do 
    this. The MPEG-4 payload format has been revised to address comments 
    regarding the section on interleaving; those were discussed in the second 
    AVT session. Two drafts specify unequal error protection: the ULP and UXP 
    FEC mechanisms. At the previous AVT meeting, Steve Casner requested that the 
    ULP draft be changed to update and replace RFC 2733 FEC rather than 
    extend it. The motivation is to correct an unfortunate design choice in RFC 
    2733 resulting in the X, P and CC bits in the RTP header not following the 
    usual rules (these bits are the XOR of the bits in the protected packets 
    instead) and thus requiring a special case for header validation. A new 
    draft-ietf-avt-ulp-07.txt was submitted in response to this request, but the 
    new design repeats all of the RTP header in the FEC payload so the 
    overhead is too large at 7 octets. It may be possible to just insert the 
    problem bits into the FEC header by reducing the mask size instead. This 
    will be discussed with the authors, and others are asked to comment as 
    well. Finally, the SMPTE 292 video draft completed last call in October but 
    needed a few tweaks to the security considerations and references. 
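
    The RFC 2733 header issue mentioned above can be illustrated with a
    small sketch. It takes the first octet of the RTP header of each
    protected packet (version, P, X and CC fields) and XORs them, as the
    minutes describe. The function name fec_header_bits and the sample
    octets are hypothetical.

```python
from functools import reduce


def fec_header_bits(first_octets):
    """XOR the first RTP header octet of every protected packet."""
    x = reduce(lambda a, b: a ^ b, first_octets, 0)
    return {
        "P": (x >> 5) & 0x1,   # padding bit
        "X": (x >> 4) & 0x1,   # extension bit
        "CC": x & 0x0F,        # CSRC count
    }


# Three protected packets: plain, one with a header extension (X=1),
# and one carrying two CSRC entries (CC=2).
bits = fec_header_bits([0x80, 0x90, 0x82])
print(bits)
```

    Here the XOR yields X=1 and CC=2 even though the FEC packet itself
    carries neither a header extension nor a CSRC list, so a generic RTP
    header validator would misparse it unless it special-cases FEC packets.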
    
    Steve also mentioned one new document that is not otherwise on the 
    agenda: 
    draft-kreuter-avt-rtp-clearmode-00.txt, a CLEARMODE payload format that is 
    just the same as PCMU (G.711) audio except that the bits carry ISDN data 
    rather than audio. A question is what media type should be used in an SDP 
    description since the bits are not necessarily audio. There is also the 
    possibility of charter overlap with the PWE3 working group. Comments are 
    requested. 
    
    MIDI Wire Protocol Packetization (MWPP) 
    
    Colin Perkins, sitting in for John Lazzaro, gave an update on the MIDI Wire 
    Protocol Packetization 
    (draft-ietf-avt-mwpp-midi-rtp-05.txt). This revision incorporates many 
    changes reflecting WG comments (the change log itself is 2 pages). There are 
    about 20 open issues remaining, however; John plans a -06 revision early in 
    the New Year to list those issues and proposed resolutions, and then a -07 
    revision to incorporate the consensus and be ready for working group last 
    call. This normative draft on the payload format is now accompanied by a new 
    informative draft intended as an implementer's guide for MWPP. It 
    includes a walk-through of sample coding techniques intended to help those in 
    the MIDI community who are totally unfamiliar with RTP 
    applications. The new draft is not finished; comments are requested on the 
    approach and what should be added or removed. 
    
    In parallel with the document preparation, a reference 
    implementation of MWPP in the sfront program is tracking the spec for 
    validation. The MIDI Manufacturers Association has also provided 
    comments and positive feedback on the MWPP work. John has also been 
    contacted by an IEEE WG that is forming to develop transport of MIDI 
    directly over Ethernet (without IP). He asks whether there are any 
    standards or work on using RTP, SDP, RTSP, and SIP in that mode. Anyone 
    with information should let us know. One answer would be, "Don't do 
    that." 
    
    RTP end-to-end liveness test 
    
    Henning Schulzrinne presented a topic resulting from a discussion on the 
    mailing list: Flemming Andreasen had asked whether the RFC 2833 tones 
    payload format could be extended to include an active end-to-end 
    liveness test (an RTP "ping"). The purpose is to detect problems above the IP 
    level that might be induced by NATs or firewalls; some risks are that the 
    function could be used for DoS attacks or result in multicast 
    implosion. One solution, which doesn't require anything new, is to just 
    rely on RTCP reception reports. A dummy RTP packet, perhaps with no 
    payload, can be sent if no real traffic is being sent. RTCP already 
    accommodates multicast scaling, although the consequence is that the RTCP 
    response is not immediate. The delay is probably not an 
    unreasonable wait. Not all receivers implement RTCP, but you can 
    distinguish that case from a problem in the RTP forward path by whether you 
    don't get any RTCP at all or just don't get an RR indicating receipt of the 
    RTP packet. A second solution involves signaling (e.g., in SDP) an RTP 
    "ping" capability, then sending a special type of RTP packet that would 
    elicit a response packet sent to a signaled address or to the source 
    address/port of the request packet. But this solution poses the 
    potential for DoS and implosion problems requiring complicated 
    solutions some of which are already in RTCP. That's likely a killer. 
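
    The "dummy RTP packet with no payload" of the first solution is just a
    bare 12-octet RTP header. A minimal sketch follows, assuming the fixed
    header layout of the RTP specification with no CSRC entries. The
    function name and the choice of dynamic payload type 96 are
    hypothetical, and a real sender would continue its existing sequence
    number and timestamp spaces.

```python
import struct


def build_empty_rtp_packet(payload_type, seq, timestamp, ssrc):
    """Return a 12-octet RTP packet consisting of the fixed header only."""
    first = 0x80                   # version 2, no padding, no extension, CC=0
    second = payload_type & 0x7F   # marker bit clear
    return struct.pack(
        "!BBHII",
        first,
        second,
        seq & 0xFFFF,
        timestamp & 0xFFFFFFFF,
        ssrc & 0xFFFFFFFF,
    )


packet = build_empty_rtp_packet(payload_type=96, seq=1, timestamp=0,
                                ssrc=0x12345678)
print(len(packet), packet.hex())
```

    A receiver that implements RTCP would count this packet in its next
    reception report, which is the liveness indication the sender is
    waiting for.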
    
    Flemming favors the RTCP solution, but wants faster response in the case of 
    success. Could the RTCP interval be reduced? Steve Casner responded that the 
    RTCP feedback timing rules would be appropriate. Dave Oran asked why we 
    need a dummy packet, why not just send comfort noise? Magnus 
    Westerlund pointed out that works fine for audio sessions. For others, an 
    empty payload may be needed. Flemming confirmed this because in some SIP 
    scenarios early media packets can cut off ringback tone. Dave 
    continued that this was all started by people who don't do RTCP... they 
    should just do it! Roni Even said monitoring RTCP is important because if 
    the other side dies, there may be no other indication that the packets go 
    into a black hole. Maybe this just needs a 
    hints-for-implementers document. Henning will put the discussion on his RTP 
    web page. 
    
    RTP Payload for DTMF Digits, Tones and Signals 
    
    Henning Schulzrinne discussed 
    draft-ietf-avt-rfc2833bis-02.txt, which updates the payload format for DTMF 
    tones in RFC 2833. This payload format transports DTMF and other tones in 
    the form of named events as an alternative to encoding the tone waveform with 
    low fidelity when a high-compression codec is in use. There is also a 
    second mode in which tones are specified by their component 
    frequencies. An amazing amount of email has been received with comments and 
    requests for additions, so many people must be implementing and using this 
    payload format. The changes from the -00 revision are: 
    
    • Addition of a formal notion of state to clarify that signals such as 
      on/off-hook and the ABCD bits used on T1 trunks represent sets of 
      states out of which only one can be active. Also, the notion of soft 
      state was added for signals that reset to a default value after a 
      period of time. 
    • Clarification that events longer than the maximum duration (about 8 
      seconds for an 8 kHz RTP clock) can be expressed as the concatenation 
      of multiple events. 
    • Clarification of which tones can meaningfully have a volume 
      specified. 
    • Addition of a few data tones and clarification of the meaning and 
      naming of ANS signals. 
    
    Colin Perkins expressed concern that the state additions may be 
    introducing too much application semantics into the protocol. Henning 
    responded that the concern is understood, but that for the few cases that 
    exist the semantics are already fixed. 
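
    The "about 8 seconds" figure in the change list above follows from the
    size of the duration field. The sketch below assumes the 16-bit
    duration field, counted in RTP timestamp units, that RFC 2833 defines.
    The function names are hypothetical, and the splitting shown is only
    one plausible reading of "concatenation of multiple events".

```python
def max_event_duration_seconds(clock_rate_hz, duration_field_bits=16):
    """Longest duration one event packet can express, in seconds."""
    return (2 ** duration_field_bits - 1) / clock_rate_hz


def split_event(total_units, max_units=0xFFFF):
    """Split a long event into segments that each fit the duration field."""
    segments = []
    while total_units > 0:
        chunk = min(total_units, max_units)
        segments.append(chunk)
        total_units -= chunk
    return segments


print(max_event_duration_seconds(8000))   # limit for an 8 kHz clock
print(split_event(20 * 8000))             # a 20-second tone at 8 kHz
```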
    
    There are two open issues. The first is that some signals (in 
    particular MF R1 signals) have acquired different names or 
    descriptions over the decades, some of which are not even documented well by 
    the ITU, so help is requested to supply definitive references for the 
    complete and correct text. The second issue is more significant. Some 
    (potential) users of the payload format want to pass the signals 
    required for fax setup and negotiation, but this involves a 
    non-trivial number of bits sent as 300-baud V.21 modem data. Sending these 
    bits as a sequence of tones is very inefficient at one symbol (tone) per 
    packet. This could be improved in various ways, but any significant 
    improvement would require redefining the fields of the payload format to be 
    interpreted differently. There is a real concern that we're slipping 
    down a dangerous slope of mission creep: this is not a signaling 
    protocol. The purpose of this payload format is to convey tones with more 
    fidelity than low-rate codecs can provide, and to allow the receiver to 
    avoid the need to implement tone detection for some scenarios. Do we want to 
    support a full-featured fax negotiation as a sequence of named events? Or 
    should we say that if you want to do fax you should do T.38 or whatever 
    else might be appropriate, and deprecate what is in RFC 2833 for V.21 now? 
    Either extend to do the whole job in a reasonable way, or don't do it at 
    all. 
    
    Jim Rafferty, who has participated in IP-FAX standards work, commented that 
    T.38 has its pluses and minuses. A number of people in ITU might be 
    interested in an RTP-based alternative to T.38, but he questioned 
    whether it is worth doing at this point in time. Flemming Andreasen 
    agreed that this payload format shouldn't be a new way of sending fax, but 
    there is a strong need for it in the initial phases of call 
    establishment (V.8, V.8bis, V.25), and most of these signals are sent 
    using V.21. Steve Casner took off his chairman's hat to express the 
    opinion that we should do nothing more than provide for the sending of 
    tones. If it is feasible for some applications to send each bit of V.21 
    data as a tone in one RFC 2833 packet, that's fine, but we should do 
    nothing to provide a higher-density representation. Flemming asked for a 
    review of the code points that are included; the CI signal is there, but TM 
    and JM are not, and might be useful. 
    
    Henning said the important point is to get this work completed, and that 
    requires interop testing to allow advancing to Draft Standard. The number of 
    points in the matrix is large, including features such as 
    redundancy; plus for each codepoint the matrix needs to state what it means to 
    be supported. One attendee indicated that the tones portion of the draft 
    (specified by frequencies) has been implemented, but a second would be 
    needed for interop testing. Robert Sparks has posted an initial draft of the 
    matrix and others have volunteered to help. They plan to gather as much 
    interop input as possible at SIPit, but for those who are not going to be 
    there, please send interop input to Robert Sparks (see 
    draft-sparks-avt-2833-interop-00.txt). 
    
    An attendee asked if other forms of DTMF-represented coding can be added, 
    e.g. some signaling supplementary services as defined by Telcordia 
    related to voiceband data transmission. Henning replied that there is more 
    room to add tones that fit the design of the payload format. If there is 
    something that exists now, and preferably is already implemented since we 
    want to get to Draft Standard, send the info: common name, succinct 
    description, and citable reference. However, the list is intended to be 
    extendible after the draft is published; there is an IANA 
    registration mechanism. 
    
    Payload format for iLBC Speech 
    
    Alan Duric presented an update of two drafts on the iLBC speech codec and 
    its associated payload format, in 
    draft-ietf-avt-ilbc-codec-00.txt and 
    draft-ietf-avt-rtp-ilbc-00.txt, respectively (each was preceded by two 
    revisions as individual submissions). Extensive changes were made to the 
    iLBC codec since last meeting. The number of bits per frame was reduced 
    from 416 to 399 bits to fit in 50 bytes while at the same time the 
    quality was improved and the complexity was significantly reduced (to less 
    than G.729a). A/B tests by the authors and by third parties confirmed the 
    quality improvement which derives from the addition of a 57th sample in the 
    quantized residual state and an increase in the number of bits 
    allocated to gain (utilizing bits freed elsewhere). A demo SIP client with 
    the iLBC codec is available by request from 
    alan.duric@globalipsound.com. 
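
    The frame-size arithmetic above can be checked with a short sketch. The
    bit counts come from the minutes. The 30 ms frame duration is an
    assumption that the minutes do not state, and the function name
    frame_layout is hypothetical.

```python
import math


def frame_layout(payload_bits, frame_ms):
    """Return (octets, spare_bits, bit_rate_bps) for one codec frame."""
    octets = math.ceil(payload_bits / 8)
    spare_bits = octets * 8 - payload_bits
    bit_rate_bps = octets * 8 * 1000 / frame_ms
    return octets, spare_bits, bit_rate_bps


old = frame_layout(416, 30)   # previous frame size
new = frame_layout(399, 30)   # revised frame size
print(old, new)
```

    With 399 bits the frame occupies 50 octets and leaves one spare bit,
    which is the "400th bit" raised in the discussion that follows.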
    
    Steve Casner asked why the 400th bit should not be used for something more 
    than setting it to zero. Alan replied that several ideas have been 
    proposed and that these will be sent on the mailing list. Steve also 
    commented that the codec seems to be still changing a lot. We don't want to 
    progress this until the codec format has stabilized. Alan responded that no 
    further changes are expected on the codec itself. This round of changes 
    completes the work on reduction of frame size and complexity as 
    planned. Plans for a 20ms frame option may be dropped because the need does 
    not appear strong. Comments on that are requested. Work on voice 
    activity detection is ongoing; this may be paired with the RFC 3389 
    Comfort Noise. That work is expected to be completed in time for interop 
    testing planned for the next SIPit in February. 
    
    Steve asked whether the sorting of bits for ULP is intended to be 
    applied across frames, because the payload format draft is not clear on 
    this. The answer is yes. Steve said that is appropriate (otherwise the 
    sorting does no good), but it is a lot of work which gives no 
    advantage in environments without ULP at lower layers. It may be 
    necessary to allow both sorted and unsorted modes as in the AMR codec. We'd 
    like feedback from implementers about the cost and utility of the ULP 
    sorting. 
    
    Alan asked about the possibility of adding another document giving the 
    qualification criteria for the codec. Steve replied that this would need to 
    be standards track to be effective, but the status even of 
    standardizing the codec itself is still not entirely clear. Generally IETF 
    avoids conformance testing. Stephan Wenger asked when the general issue of 
    standardizing media codecs in IETF will be resolved. Steve Casner 
    replied that, although the Transport ADs were consulted and were in favor of 
    this work before it started, we won't know the answer for sure until the 
    work is submitted to the whole IESG for approval. 
    
    RTP Payload Format for ATRAC-X 
    
    Matthew Romaine presented a new payload format for ATRAC-X audio in 
    draft-hatanaka-avt-rtp-atracx-00.txt. Sony's ATRAC family of 
    perceptual codecs is used in MDs and solid-state recorders. The -X 
    version supports multiple channels in a wide range of data rates from 
    8kbps to 1.4Mbps. The payload format supports multiplexing of multiple 
    streams and metadata within a single session, redundant data to 
    mitigate packet loss, and fragmentation. The draft details the 
    segmentation of streams into segments and the association of segments from 
    different streams in the same time slot. Two open issues were 
    identified; the first was how to manage the allocation of metadata 
    identifiers. Some appropriate body could assign static identifiers, as is done in 
    MIDI, or the assignments could be a dynamic free-for-all. There was no 
    input on this. The second issue is the determination of the RTP 
    timestamp: the draft currently specifies transmit time, but it has 
    already been pointed out that a presentation (sampling) timestamp is 
    needed to allow synchronization with other streams. The problem is that a 
    single session might carry multiple sampling rates. Steve Casner offered the 
    example of MPEG audio in which the timestamp clock rate is always 90kHz 
    synchronized to the sampling clock, which may vary in rate. Could a 
    similar arrangement be used here? Magnus Westerlund suggested that if 
    different rates are needed, perhaps different RTP sessions should be used. 
    
    Steve Casner asked why the multiplexing of streams is built into the 
    payload format rather than using multiplexing at the UDP/RTP level. Is the 
    format derived from something already in use on MD or other media and 
    therefore hard to change, or is it a new design that is part of the 
    payload format and therefore open to discussion? Matthew responded that the 
    format was developed with streaming in mind; it is supposed to be 
    extensible. Multiple bit rates are supported for scalable QoS, and they 
    have specified multi-channel configurations up to 7.1 but it could be 
    expanded to 32 channels. The benefit is payload overhead. Steve asked how 
    this would be used for QoS: keep some parts of the packet and throw away 
    others? That does not work. It might make sense for the file format to 
    contain multiple rates for scalability, but the packets should only 
    contain the rate appropriate for the receiver or you have not achieved the 
    goal of fitting the available bandwidth. If you need to deliver 
    different rates to different receivers, send different streams, or 
    layered coding for multicast. Roni Even echoed this concern; if the 
    multiplexing of streams is for redundancy, the draft needs to explain the 
    relationship between the fragments, redundant segments, etc. 
    
    Colin Perkins asked why redundancy was built into the payload format 
    rather than using RFC 2198. The authors were unaware of 2198. Steve also 
    pointed out that for redundancy to be useful the redundant copy may need to 
    be separated further in time than one slot. He also suggested that it 
    would be useful for the authors to review several of the other payload 
    formats since several of the architectural ideas commonly used in AVT have 
    been missed, such as separate streams for separate needs. 
    
    Magnus asked whether fragments can be independently decoded, or 
    whether a segment must be fully reassembled to decode it. Matthew said 
    the answer depends on the encoder and that he needs to look into this further. In 
    summary, this payload format may need quite a bit of change from what is 
    defined so far. 
    
    RTP Payload Format for Uncompressed Video 
    
    Ladan Gharai presented updates to 
    draft-ietf-avt-uncomp-video-01.txt. In addition to the correction of 
    editorial nits and the inclusion of an applicability statement and a 
    comparison to RFC 2431 (BT.656 video), some new features were added: 12- and 
    16-bit sample sizes join the 8- and 10-bit sizes specified 
    previously, and monochrome, 4:4:4:4 chrominance subsampling, and RGBA 
    color representations were added. The payload header was unchanged except 
    that the 'M' bit was renamed 'C' to avoid possible confusion with the marker 
    bit in the RTP header. The draft has established a list of mandatory SDP fmtp 
    parameters and a partial list of optional parameters. The authors are 
    still working on the representation of these parameters, but will 
    complete this work for the next draft. 
    
    Ladan identified a few open issues. Currently only packed sample formats are 
    provided; the authors are considering adding planar and 
    macro-blocked formats as well. The planar format, in which color planes are 
    sent separately, is straightforward; it would be identified by an SDP 
    parameter. However, it is unclear whether it makes sense to have packed and 
    macro-block formats in the same payload format. To accommodate 
    macro-blocks, width and length parameters would have to be added to the 
    payload header (there is room), and then the packed format would be 
    indicated by a macro-block size of 1. Stephan Wenger would like to see the 
    planar representation added, but has doubts about a 
    macro-block-based scheme. There are applications for which it would be 
    useful, but there are too many complications related to 
    interlacing. You can't assume that the shape of a macro-block will be 
    16x16 in a progressive scan or in one field. Sometimes a macro-block is a 
    different size with parts from both fields. It is also affected by 
    transcoding. 
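    
    As a rough illustration of the difference between the packed and planar 
    layouts discussed above, the following Python sketch rearranges a tiny 
    image from packed order (all components of a pixel together) into planar 
    order (one plane per color component). The function name, the 8-bit 
    samples and the three-component RGB layout are assumptions made for this 
    example; they are not taken from the draft. 

```python
def packed_to_planar(packed: bytes, components: int = 3) -> list:
    """Split packed samples (e.g. RGBRGB...) into one plane per component."""
    if len(packed) % components != 0:
        raise ValueError("packed data is not a whole number of pixels")
    # Plane i takes every `components`-th byte starting at offset i.
    return [packed[i::components] for i in range(components)]


# Two hypothetical pixels, 8 bits per sample: (R0, G0, B0), (R1, G1, B1).
packed = bytes([10, 20, 30, 11, 21, 31])
planes = packed_to_planar(packed)
print(planes)
```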
    
    A second open issue is the transport of interlaced 4:2:0 color 
    subsampling. This has been discussed on the mailing list and work is still in 
    progress. Lastly, for interlaced video, there is a question whether the two 
    fields should have distinct timestamps. A problem is that with the 
    current 90kHz timestamp clock rate, which increments by 3003 per frame for 
    29.97fps NTSC video, a fractional increment of 1501.5 would be needed for 
    the intermediate field timestamp, but the RTP timestamp is an integer. It 
    should be possible instead to derive the timestamp from header bits and the 
    frame rate. Stephan explained that you need to have a timestamp for every 
    field in order to indicate the proper mapping of fields between 24fps film 
    content and 30fps video using 3-2-pulldown because an individual field may be 
    repeated so they do not always appear in even-odd pairs. However, the 
    exact timestamp value does not matter for this purpose; it would be safe to 
    round up to the next integer. 
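    
    The arithmetic behind this issue can be checked with a short Python 
    sketch. It assumes the NTSC frame rate is exactly 30000/1001 frames per 
    second and applies the round-up suggested in the discussion; the helper 
    name field_timestamp is hypothetical. 

```python
from fractions import Fraction
from math import ceil

CLOCK_HZ = 90_000                      # RTP timestamp clock rate from the text
FRAME_RATE = Fraction(30_000, 1_001)   # assumed exact NTSC rate (about 29.97 fps)

frame_ticks = Fraction(CLOCK_HZ) / FRAME_RATE   # timestamp ticks per frame
field_ticks = frame_ticks / 2                   # timestamp ticks per field


def field_timestamp(field_index: int) -> int:
    """Timestamp offset of the n-th field, rounded up to an integer tick."""
    return ceil(field_index * field_ticks)


print(frame_ticks, float(field_ticks))
print([field_timestamp(n) for n in range(5)])
```

    With rounding up, every second field lands on an exact frame boundary and 
    the intermediate fields are at most half a tick late. 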
    
    Resolution of comments on draft-ietf-avt-srtp-05.txt 
    
    The first session ended with a discussion of IESG security concerns 
    regarding the Secure RTP profile 
    (draft-ietf-avt-srtp-05.txt). For this discussion, Allison Mankin 
    introduced herself as the Transport AD for this group, Eric Rescorla as 
    security advisor to the Transport Area, and Steve Bellovin as one of the 
    Security Area Directors. 
    
    Eric Rescorla started by noting that the SRTP profile has some unusual 
    design features: it uses AES in counter mode, rather than in CBC mode, and it 
    offers a choice of several message authentication codes (MACs), 
    including no authentication. These features, in particular the option to use 
    AES in counter mode with no authentication, don't make security folks 
    comfortable. Eric then summarized his understanding of the issues that 
    require SRTP to use these modes of operation. The first is latency, since 
    shorter packets mean less latency for voice and MACs consume 
    bandwidth. Secondly, wireless channels are noisy and packets often 
    contain bit errors. If integrity checks are used in this 
    environment, the bandwidth consumption will be excessive and the bit 
    errors may lead to unacceptable packet discard rates due to failure of the 
    integrity checks. 
    
    Eric moved on to explain that counter mode has no integrity 
    protection unless protected by a MAC. This is not obviously a problem for 
    voice, one of the key applications for SRTP, but may be a problem for 
    other types of content. From a security viewpoint, it is desirable to use 
    SRTP with a MAC, but the default MAC in SRTP is a weak 32-bit code and 
    there is the option to use SRTP without integrity protection (there is also a 
    strong MAC option). The choices lead to the threat of modified message 
    streams and forged traffic, unless the optional strong MAC is used. Two 
    solutions to this problem were proposed: 
    
    Make the MAC mandatory and add FEC after encryption to correct bit errors so 
    that the integrity check will work on somewhat corrupted packets. There was 
    considerable discussion of this in private email with the authors, who were 
    opposed on the grounds that it expands packets and makes SRTP 
    uneconomic for cellular links, which already employ link-layer FEC. Eric was 
    not convinced by these arguments, citing the qualitative nature of the 
    concerns rather than hard numbers giving performance impact. 
    
    
    Define a wireless voice profile for SRTP where the MAC protects only the 
    control data leaving the media data unprotected. The reduced MAC causes 
    limited packet expansion, but is less sensitive to bit-errors than SRTP as 
    currently specified. Other types of traffic will use a mandatory 80-bit 
    MAC. 
    
    Mark Baugher noted that SRTP has the ability to use strong integrity 
    protection now, but it's not the default. The question is whether the 
    vendors or the users should be able to decide, based on their 
    environment and application, whether they want a strong MAC or 
    not. Steve Bellovin agreed with this formulation, but noted that the IESG 
    has a strong preference for protocols that are secure by default, and a 
    protocol won't be published unless it has strong mandatory-to-implement 
    security. If a protocol has weaker security options, it needs a 
    Security Considerations section that describes the environments where the 
    weaker options may be acceptable, and explains the consequences and 
    tradeoffs of selecting those options. 
    
    Eric Rescorla asked Steve Bellovin whether it was acceptable for SRTP to 
    have the option of no authentication. Steve answered that it was permitted in 
    certain other situations, but that it would take detailed analysis to show 
    where it is safe and useful and where it isn't. 
    
    Mark Baugher asked whether changing the default mandatory transforms, adding 
    CBC mode as an option, would satisfy the concerns. Steve Bellovin answered that, 
    assuming you meet requirements for safely using counter mode, there is no 
    strong need for CBC mode; the MAC is much more critical. 
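    
    The concern that counter mode offers no integrity protection on its own 
    can be illustrated with a small Python sketch. It is not the SRTP 
    transform: the keystream below is a toy built from SHA-256 over a 
    counter, used as a stand-in for AES in counter mode, and the key and 
    message are hypothetical. The point is only that, in any cipher that XORs 
    a keystream with the data, flipping ciphertext bits flips the same 
    plaintext bits without knowledge of the key. 

```python
import hashlib


def keystream(key: bytes, length: int) -> bytes:
    """Toy counter-mode keystream: SHA-256(key || counter). NOT AES-CTR."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def xor_crypt(key: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt by XOR with the keystream (the same operation)."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


key = b"hypothetical session key"
plaintext = b"PAY 100"
ciphertext = xor_crypt(key, plaintext)

# An attacker who can guess one plaintext byte rewrites it without the key.
tampered = bytearray(ciphertext)
tampered[4] ^= ord("1") ^ ord("9")

print(xor_crypt(key, bytes(tampered)))
```

    The receiver decrypts the tampered packet without any error indication, 
    which is why a MAC is needed to detect the change. 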
    
    Mark Baugher asked whether the security folks are unhappy with the default 
    32-bit HMAC-SHA1. Steve Bellovin replied that he needs to think more on 
    that, but the group needs to better analyse the environment before he can 
    make a good decision. 
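    
    To make the tradeoff concrete, the following Python sketch computes an 
    HMAC-SHA1 tag truncated to 32 or 80 bits. It is a simplification, 
    assuming the tag is taken over the packet bytes alone and truncated to 
    its leftmost bits; the exact input to the MAC in SRTP is defined by the 
    draft, and the key, packet and function names here are hypothetical. 

```python
import hashlib
import hmac


def truncated_tag(key: bytes, packet: bytes, tag_bits: int) -> bytes:
    """HMAC-SHA1 over the packet, truncated to the leftmost tag_bits bits."""
    if tag_bits % 8 != 0 or not 0 < tag_bits <= 160:
        raise ValueError("tag_bits must be a multiple of 8 between 8 and 160")
    return hmac.new(key, packet, hashlib.sha1).digest()[: tag_bits // 8]


def verify(key: bytes, packet: bytes, tag: bytes) -> bool:
    """Constant-time comparison against a freshly computed tag."""
    return hmac.compare_digest(truncated_tag(key, packet, len(tag) * 8), tag)


key = b"hypothetical auth key"
packet = b"example RTP packet bytes"

for bits in (32, 80):
    tag = truncated_tag(key, packet, bits)
    # A single flipped bit, as on a noisy link, makes verification fail.
    corrupted = bytes([packet[0] ^ 0x01]) + packet[1:]
    print(bits, len(tag), verify(key, packet, tag), verify(key, corrupted, tag))
```

    A 32-bit tag adds 4 bytes per packet and a blind forgery succeeds with 
    probability 2^-32; an 80-bit tag adds 10 bytes and lowers that to 2^-80. 
    Either tag causes a packet with a single bit error to be rejected, which 
    is the concern raised for wireless links. 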
    
    Allison Mankin reminded the group that SRTP is for all 
    environments, and expressed her preference for a specification where the MAC 
    was mandatory in all cases, with a possible exception for cellular 
    telephony. Elisabetta Carrara reminded Allison that SRTP includes a 
    32-bit MAC by default, and that stronger options are specified. Eric 
    Rescorla again noted that it is necessary to analyse individual threats and 
    the environment, giving numbers to characterize the impact of security on 
    performance. 
    
    Elisabetta noted that the MAC cannot be used in cellular telephony, since 
    that environment cannot afford the bandwidth of the MAC. She reminded the 
    group that the requirements driving ROHC and UDPlite also apply to SRTP. 
    Steve Bellovin replied this is the sort of thing that has to go into the 
    security considerations section, explaining why the environment has these 
    requirements and how they affect security. 
    
    Allison commented that the draft is intended to be general purpose, but is 
    optimized for cellular use. The default transform needs to be suitable for 
    the general case, with a non-optional MAC if counter mode is used, and 
    justification why weaker options are present for cellular operation. 
    
    Steve Casner asked whether there was a problem with changing the defaults to 
    be more general-purpose, signaling specific settings for telephony 
    applications, and clearly documenting the rationale in the security 
    considerations section. There were no objections. 
    
    Liaison statement from MPEG 
    
    Steve Casner started the second session by reading a liaison statement the 
    group has received from the MPEG committee, stating that they have 
    revised the RTP Payload Format for MPEG-4 taking into account comments from 
    the last AVT working-group last call, and requesting publication of the 
    draft as an RFC. 
    
    MIME Type Registration for MPEG-4 
    
    Jan van der Meer, sitting in for Young-Kwon Lim, outlined 
    draft-lim-mpeg4-mime-01.txt, which specifies MIME type registrations for the 
    MPEG-4 file formats and their relation to the MPEG4-on-IP framework 
    (ISO/IEC 14496-8). 
    
    Steve Casner noted that this draft includes some discussion of RTP MIME 
    parameters, which needs to be moved to the payload format drafts. Steve 
    also expressed concern that the previous versions of the full 
    framework document, submitted to the IETF, had problems which needed to be 
    resolved but it's not clear that these have been addressed in ISO. There is a 
    need to address these issues in future, especially if this MIME 
    registration and the framework conflict. 
    
    Mike Coleman asked about the difference between streams and files, in this 
    context, since MPEG-4 streams are not well defined. Steve Casner and Colin 
    Perkins clarified that this draft should cover only the MP4 file format, and 
    that the RTP payload format drafts will contain MIME types for use with 
    RTP. 
    
    Stephan Wenger asked about the presumed existence of an 
    informational RFC, pointing to the MPEG4-on-IP framework. Colin Perkins and 
    Steve Casner explained that this was agreed in the AVT meeting at the 52nd 
    IETF (Salt Lake City). 
    
    RTP Payload Format for MPEG-4 
    
    Jan van der Meer discussed 
    draft-ietf-avt-mpeg4-simple-05.txt, the RTP Payload Format for MPEG-4. This 
    document is in working group last call and several comments, mostly 
    editorial, have been received. The main issues are the suggested 
    replacement of the "Profile" parameter with 
    "InterleaveDelay", and whether RTP timestamps should be allowed to go 
    backwards when interleaving. These have been discussed in AVT, and in MPEG 
    and ISMA, and it has been agreed to allow both features. Current 
    discussion on the mailing list is on the exact meaning of interleave delay 
    and emission rules. 
    
    This discussion continued in the meeting with Steve Casner, Stephan 
    Wenger, Colin Perkins and Andrea Basso commenting on the RTP system model 
    and how it leaves much to the discretion of the receiver when compared to 
    the MPEG buffer model. They saw no need for the emission rules, viewing 
    them as implementation details that do not need to be specified. In 
    addition, they noted that the characteristics of an IP network are such 
    that the sender cannot control the buffering at the receiver. This led 
    to discussion of the definition of the interleaving delay, with concern 
    expressed that the attempt to define the delay precisely is 
    unnecessary, since what is really needed is a hint to the receiver 
    suggesting a starting estimate of the buffering delay. Much of the 
    complexity comes from trying to tightly bound the interleaving delay, and a 
    tight bound is not necessary or feasible. 
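    
    The effect of interleaving on timestamp order and receiver buffering can 
    be sketched in Python as follows. This is a generic block interleaver 
    chosen for illustration, not the scheme defined in the draft; the 
    function names, the stride and the timestamp values are assumptions. 

```python
def interleave(units: list, stride: int) -> list:
    """Send units 0, stride, 2*stride, ..., then 1, 1+stride, ... and so on."""
    return [u for start in range(stride) for u in units[start::stride]]


def max_buffered(send_order: list) -> int:
    """Largest number of units a receiver holds while restoring timestamp order."""
    expected = sorted(send_order)
    next_index = 0
    held = set()
    worst = 0
    for unit in send_order:
        held.add(unit)
        worst = max(worst, len(held))
        # Release every unit that is now next in presentation order.
        while next_index < len(expected) and expected[next_index] in held:
            held.remove(expected[next_index])
            next_index += 1
    return worst


timestamps = [0, 1000, 2000, 3000, 4000, 5000]   # hypothetical access units
order = interleave(timestamps, stride=3)
print(order)
print(max_buffered(order))
```

    The transmitted timestamps go backwards (3000 is followed by 1000), and 
    the receiver must hold several units before it can release them in 
    order. On a real network, jitter and loss add to this, which is why the 
    sender can only offer a hint rather than a tight bound. 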
    
    Stephan Wenger asked what the impact would be of pulling 
    interleaving out of the payload format. Colin Perkins said that this is not 
    possible, but we may consider leaving the interleave delay parameter, and 
    letting the sender choose an appropriate value without saying how to do 
    that. 
    
    Mike Coleman asked about the draft status, since it is not available in the 
    archives and because parts of the MPEG committee believe it is complete, but it 
    clearly is not. Steve Casner noted that the draft will be in the 
    archives after the meeting. Steve and Colin also noted that the current 
    working group last call is not completed. There will be time to review any 
    changes introduced before the draft is advanced. 
    
    RTP Payload Format for JVT Video 
    
    Stephan Wenger discussed 
    draft-ietf-avt-rtp-h264-00.txt, the payload format for JVT video. This 
    updates draft-wenger-avt-rtp-jvt-01.txt to align with the latest JVT 
    specification and adds MTAPs with 8-, 16-, 24- and 32-bit timestamp 
    offsets (as discussed at the previous AVT meeting). Stephan is 
    considering removing the 8- and 32-bit timestamp offsets, since they are not 
    believed to be useful. 
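    
    The minutes do not record why the 8- and 32-bit offsets are considered 
    unhelpful. One plausible reading is the time span each width can cover, 
    which the Python sketch below computes. It assumes an unsigned offset 
    field counted in ticks of a 90 kHz clock; both are assumptions for this 
    example rather than statements about the draft. 

```python
CLOCK_HZ = 90_000  # assumed video timestamp clock rate


def max_offset_seconds(bits: int, clock_hz: int = CLOCK_HZ) -> float:
    """Largest time span an unsigned offset field of the given width can express."""
    return (2**bits - 1) / clock_hz


for bits in (8, 16, 24, 32):
    print(f"{bits:2d}-bit offset: up to {max_offset_seconds(bits):.4f} s")
```

    Under these assumptions an 8-bit offset spans under 3 ms, less than one 
    frame interval, while a 32-bit offset spans roughly 13 hours; the 16- 
    and 24-bit widths cover about 0.7 s and about 186 s respectively. 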
    
    The next open issue is the relation between this payload format and the 
    MPEG-4 payload format, since JVT video is referenced as part of MPEG-4. 
    Stephan believes that using the MPEG-4 format for JVT is not 
    acceptable, since MTAPs and STAPs cannot be sent efficiently with that 
    format. He also believes that full binary compatibility between the JVT 
    payload format and the MPEG-4 payload format is not achievable. 
    However, it is possible to define a common operation point, providing 
    compatibility at the expense of limited optimization. 
    
    Steve Casner noted that the draft specifies use of the latest 
    timestamp when doing AU aggregation, but that other payload formats use the 
    oldest timestamp. Stephan agreed that this is an issue, and should be 
    changed. 
    
    Mike Coleman noted that section 3 says the draft is "not intended to be 
    used with MPEG-4 systems" and asked for clarification of what is meant. It is 
    possible to use it with MPEG-4 systems, but there are some features of this 
    draft that are not compatible with the MPEG-4 payload format. Jan van der 
    Meer noted that some in MPEG will ask "what are the features offered with 
    this draft that cannot be supported by the MPEG-4 payload format?" 
    Stephan answered that the main reason is STAPs and aggregation which 
    cannot be supported efficiently, and multiple fragments of AUs are vital but 
    not supported in the MPEG-4 payload format. There was some discussion of 
    this, and it may be appropriate to clarify in a future version of the 
    draft. 
    
    RTCP Reporting Extensions 
    
    Alan Clark discussed 
    draft-ietf-avt-rtcp-report-extns-01.txt, on RTCP reporting 
    extensions. This is the combination of the various reporting 
    extensions drafts discussed in Yokohama, with the addition of loss 
    run-length encoding, updated VoIP metrics, and security and IANA 
    considerations. 
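    
    The idea behind loss run-length encoding can be sketched in Python as 
    follows. This shows generic run-length encoding of a per-packet 
    received/lost sequence; it does not reproduce the report block format in 
    the draft, and the function names and the trace are hypothetical. 

```python
from itertools import groupby


def run_length_encode(received: list) -> list:
    """Collapse a per-packet received/lost sequence into (flag, run_length) pairs."""
    return [(flag, sum(1 for _ in run)) for flag, run in groupby(received)]


def run_length_decode(runs: list) -> list:
    """Expand (flag, run_length) pairs back into the per-packet sequence."""
    return [flag for flag, length in runs for _ in range(length)]


# True = packet received, False = packet lost (hypothetical trace).
trace = [True] * 6 + [False] * 2 + [True] * 4 + [False] + [True] * 3
runs = run_length_encode(trace)
print(runs)
```

    Sixteen per-packet entries collapse to five runs here; the saving grows 
    when losses are rare or occur in bursts. 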
    
    Colin Perkins noted that the IANA considerations section needs work to 
    specify the registration in detail, and will supply detailed comments 
    offline. Colin also asked whether the jitter buffer metrics are useful and 
    match implementations. Have implementors looked at the draft to see if the 
    information is meaningful in their context? Alan Duric noted that jitter 
    buffer and PLC functions are separate in the draft, but these are 
    sometimes combined in implementations. Alan Clark said that the broad 
    intent is to provide rough info for diagnostic purposes, not an exact 
    description of an implementation. 
    
    Alan Clark noted the need to be management friendly, even if SRTP is used. 
    Accordingly, he would like to add a note to the draft indicating that the 
    SRTP E bit can be used to send extended RTCP report frames in 
    plaintext, even if encryption has been selected as the default setting. 
    Colin agreed that this might be possible, but noted that the draft 
    shouldn't specify a security policy. Steve Casner also noted that the 
    draft should talk about this issue in the security considerations 
    section. 
    
    Steve Casner also highlighted that a receive-only endpoint will not know the 
    RTT that is supposed to be included in the VoIP metrics report since the 
    RTCP mechanism works only for senders. Alan Clark replied that these 
    metrics are expected to be used for full-duplex conversations. Steve said 
    that in that case, the draft needs to make clear in what scenarios the VoIP 
    metrics report is applicable. 
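    
    The reason a receive-only endpoint cannot know the RTT follows from the 
    round-trip computation in the RTP specification: the sender of a sender 
    report subtracts the echoed "last SR" timestamp (LSR) and the "delay 
    since last SR" (DLSR) from the arrival time of the reception report. An 
    endpoint that never sends sender reports never gets these values back. A 
    minimal Python sketch, with a hypothetical function name and 
    illustrative field values: 

```python
def rtt_seconds(arrival: int, lsr: int, dlsr: int) -> float:
    """Round-trip time from a reception report, all inputs in 1/65536 s units.

    arrival: middle 32 bits of the NTP time when the report was received
    lsr:     "last SR" timestamp echoed by the receiver
    dlsr:    delay the receiver held the SR before sending its report
    """
    # Work modulo 2**32 so the result survives wrap-around of the 32-bit fields.
    return ((arrival - lsr - dlsr) & 0xFFFFFFFF) / 65536


print(rtt_seconds(0xB7108000, 0xB7052000, 0x00054000))
```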
    
    RTCP Extensions for SSM 
    
    Jörg Ott described changes to 
    draft-ietf-avt-rtcpssm-02.txt, the RTCP extensions for 
    source-specific multicast. The main changes are to the security 
    considerations. In addition, SSRC distribution has been removed from this 
    version and cumulative values are now included in the 
    distribution. There are "work-in-progress" changes to the IANA 
    considerations section and to use the XR packet formats (on this 
    subject, Jörg noted that there are several proposed RTCP extensions using 
    packet type 205, and we need to resolve this conflict). 
    
    The security considerations section has been significantly reworked, with 
    the assumptions that we need to maintain low overhead, that the session 
    parameters are securely distributed out of band, and that the security 
    weaknesses should be addressed at the transport layer and above since 
    weaknesses may exist in the SSM layer below. The threats identified are 
    denial of service, packet forgery, session replay and 
    eavesdropping. The draft also categorizes threats according to the 
    direction of the traffic flow, and discusses the trust models. 
    
    Colin Perkins approved of the security considerations section, but would 
    like discussion of specific applications and mandatory security 
    behavior for those applications in this draft (e.g. how to use SSM with 
    RTSP and SIP). 
    
    Jörg highlighted the issue of relation to other I-Ds, since this uses the 
    features of the extended RTCP reporting draft. He asked about the 
    schedule for the RTCP reporting extensions draft. Alan Clark would like to 
    get the RTCP Reporting Extensions draft done quickly, and was willing to 
    cooperate on the IANA issues, ensuring they're aligned. 
    
    Jörg asked whether future drafts relating to RTCP should include a section 
    on SSM considerations. Steve Casner was not sure if we need to establish a 
    requirement, but noted that this draft should have a section giving 
    advice to authors of RTCP extensions that might be affected by SSM. 
    
    Open issues include cumulative BYE packets, a possible revision to the 
    message format, discussion of the relation to other RTP/RTCP 
    extensions, completion of IANA considerations, etc. A revised draft is 
    expected by the end of the year. 
    
    Retransmission 
    
    The RTP retransmission format 
    (draft-ietf-avt-rtp-retransmission-03.txt) was discussed by José Rey. This is 
    the merger of the two previous drafts, as was discussed in Yokohama. The new 
    draft uses a dynamic payload format to indicate the original payload type of 
    the retransmission. It supports session multiplexing, with streams 
    associated using an a=fmtp parameter and FID, and SSRC multiplexing using an 
    a=fmtp parameter to associate the retransmission with the original 
    stream. José also outlined the RTSP considerations regarding 
    SSRC-multiplexing. There will be a minor revision shortly, which is 
    expected to be ready for last call. 
    
    Anders Klemets asked whether it is the case that one MUST NOT do session 
    multiplexing and SSRC multiplexing in the same session. It was clarified 
    that this is correct. 
    
    RGL codec and payload format 
    
    The final presentation was a brief outline of the RGL lossless G.711 
    codec, by Michael Ramalho, which was presented as a possible future work 
    item. Steve Casner noted that standardizing codecs is not entirely within 
    scope of AVT, and will need discussion, as with iLBC. Drafts will be 
    submitted shortly after the meeting. 
    
    
    

    Slides

    Agenda, document status and open issues
    The MIDI Wire Protocol Packetization (MWPP)
    RTP end-to-end liveness test
    RTP Payload for DTMF Digits, Tones and Signals
    RTP Payload Format for iLBC
    RTP Payload Format for ATRAC-X
    RTP Payload Format for Uncompressed Video
    SRTP Security Issues
    MIME Type Registration for MPEG-4
    RTP Payload Format for MPEG-4
    RTP Payload Format for JVT Video
    RTCP Reporting Extensions
    RTCP Extensions for SSM
    RTP Retransmission
    RGL codec and payload format