![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
Hi, Luca,
I have been selected as the General Area Review Team (Gen-ART) reviewer for this draft (for background on Gen-ART, please see http://www.alvestrand.no/ietf/gen/art/gen-art-FAQ.html).
Please resolve these comments along with any other Last Call comments you may receive.
Thanks,
Spencer
Document: draft-ietf-avt-rtp-vorbis-06 Reviewer: Spencer Dawkins Review Date: 26 Oct 2007 (sorry, late!) IETF LC End Date: 22 Oct 2007 IESG Telechat date:
1. Introduction
Vorbis is a general purpose perceptual audio codec intended to allow maximum encoder flexibility, thus allowing it to scale competitively over an exceptionally wide range of bitrates. At the high quality/ bitrate end of the scale (CD or DAT rate stereo, 16/24 bits), it is in the same league as AAC. Vorbis is also intended for lower and
Spencer (nit): AAC? has not been expanded previously.
higher sample rates (from 8kHz telephony to 192kHz digital masters) and a range of channel representations (monaural, polyphonic, stereo, quadraphonic, 5.1, ambisonic, or up to 255 discrete channels).
2.2. Payload Header
Vorbis Data Type (VDT): 2 bits
This field specifies the kind of Vorbis data stored in this RTP packet. There are currently three different types of Vorbis payloads. Each packet MUST contain only a single type of Vorbis payload (e.g. you MUST not aggregate configuration and comment
payload in the same packet)
0 = Raw Vorbis payload
1 = Vorbis Packed Configuration payload
2 = Legacy Vorbis Comment payload
3 = ReservedThe packets with a VDT of value 3 MUST be ignored
The last 4 bits represent the number of complete packets in this payload. This provides for a maximum number of 15 Vorbis packets in the payload. If the packet contains fragmented data the number of packets MUST be set to 0.
2.3. Payload Data
The Vorbis packet length header is the length of the Vorbis data block only and does not count the length field.
Spencer (nit): s/count/include/, I think.
The payload packing of the Vorbis data packets MUST follow the guidelines set-out in [3] where the oldest packet occurs immediately
after the RTP packet header. Subsequent packets, if any, MUST follow in temporal order.
3.1.1. Packed Configuration
A Vorbis Packed Configuration is indicated with the Vorbis Data Type field set to 1. Of the three headers defined in the Vorbis I specification [10], the identification and the setup MUST be packed as they are, while the comment header MAY be replaced with a dummy one. The packed configuration follows a generic way to store xiph codec configurations: The first field stores the number of the following packets minus one (count field), the next ones represent the size of the headers (length fields), the headers immediately follow the list of length fields. The size of the last header is implicit. The count and the length fields are encoded using the following logic: the data is in network order, every byte has the
most significant bit used as flag and the following 7 used to store the value. The first N bit are to be taken, where N is number of bits representing the value modulo 7, and stored in the first byte. If there are more bits, the flag bit is set to 1 and the subsequent 7bit are stored in the following byte, if there are remaining bits set the flag to 1 and the same procedure is repeated. The ending byte has the flag bit set to 0. In order to decode it is enough to iterate over the bytes until the flag bit set to 0, for every byte the data is added to the accumulated value multiplied by 128. The headers are packed in the same order they are present in ogg: identification, comment, setup.
3.2. Out of Band Transmission
This section, as stated above, does not cover all the possible out- of-band delivery methods since they rely on different protocols and are linked to specific applications. The following packet definition SHOULD be used in out-of-band delivery and MUST be used when
Spencer: is there an obvious reason to violate the SHOULD?
Configuration is inlined in the SDP.
5.1. Example Fragmented Vorbis Packet
The Fragment type field is set to 2 and the number of packets field is set to 0. For large Vorbis fragments there can be several of these type of payload packets. The maximum packet size SHOULD be no
Spencer (nit): s/these type/this type/?
Spencer: why is this a SHOULD?
greater than the path MTU, including all RTP and payload headers. The sequence number has been incremented by one but the timestamp field remains the same as the initial packet.
5.2. Packet Loss
As there is no error correction within the Vorbis stream, packet loss will result in a loss of signal. Packet loss is more of an issue for fragmented Vorbis packets as the client will have to cope with the handling of the Fragment Type. In case of loss of fragments the client MUST discard all the remaining fragments and decode the
Spencer (nit) - "remaining Vorbis fragments" and "incomplete Vorbis packet"?
incomplete packet. If we use the fragmented Vorbis packet example above and the first packet is lost the client MUST detect that the
Spencer (nit) - "and the first RTP packet is lost"?
next packet has the packet count field set to 0 and the Fragment type 2 and MUST drop it. The next packet, which is the final fragmented packet, MUST be dropped in the same manner. If the missing packet is the last, the received two fragments will be kept and the incomplete vorbis packet decoded.
6. IANA Considerations
configuration-uri: the URI [4] of the configuration headers in
case of out of band transmission. In the form of
"protocol://path/to/resource/", depending on the specificSpencer: isn't this "scheme://path/to/resource"?
method, a single configuration packet could be retrived by its
Ident number, or multiple packets could be aggregated in a
single stream. Non hierarchical protocols MAY point to a
resource using their specific syntax.7.1. Mapping Media Type Parameters into SDP
The information carried in the Media Type media type specification has a specific mapping to fields in the Session Description Protocol (SDP) [5], which is commonly used to describe RTP sessions. When SDP is used to specify sessions the mapping are as follows:
7.1.1. SDP Example
The following example shows a basic SDP single stream. The first configuration packet is inlined in the sdp, other configurations
could be fetched at any time from the first provided uri using or all
the known configuration could be downloaded using the second uri. The inline base64 [9] configuration string is trimmed because of the
length.
c=IN IP4 192.0.2.1
m=audio RTP/AVP 98
a=rtpmap:98 vorbis/44100/2
a=fmtp:98 delivery-method=inline;
configuration=AAAAAZ2f4g9NAh4aAXZvcmJpcwA...; delivery-
method=out_band; configuration-uri=rtsp://path/to/the/resource;
delivery-method=out_band;
configuration-uri=http://another/path/to/resource/;Note that the payload format (encoding) names are commonly shown in upper case. Media Type subtypes are commonly shown in lower case. These names are case-insensitive in both places. Similarly, parameter names are case-insensitive both in Media Type types and in the default mapping to the SDP a=fmtp attribute. The exception regarding case sensitivity is the configuration-uri URI which MUST be regarded as being case sensitive. The a=fmtp line is a single line
Spencer: "shown as multiple lines in this document for clarity"?
even if it is presented broken because of clarity.
7.2. Usage with the SDP Offer/Answer Model
The only paramenter negotiable is the delivery method. All the
Spencer (nit): c/paramenter negotiable/negotiable parameter/
others are declarative: the offer, as described in An Offer/Answer Model Session Description Protocol [8], may contain a large number of delivery methods per single fmtp attribute, the answerer MUST remove every delivery-method and configuration-uri not supported. All the parameters MUST not be altered on answer otherwise.
8. Congestion Control
Vorbis clients SHOULD send regular receiver reports detailing
congestion. A mechanism for dynamically downgrading the stream, known as bitrate peeling, will allow for a graceful backing off of the stream bitrate. This feature is not available at present so an alternative would be to redirect the client to a lower bitrate stream if one is available.
9.1. Stream Radio
This is one of the most common situation: one single server streaming content in multicast, the clients may start a session at random time. The content itself could be a mix of live stream, as the wj's voice,
Spencer (nit): "wj's"? please spell this out (I'm guessing what this means)
_______________________________________________ Ietf mailing list Ietf at ietf.org https://www1.ietf.org/mailman/listinfo/ietf
Note Well: Messages sent to this mailing list are the opinions of the senders and do not imply endorsement by the IETF.