
Re: [AVT] I-D ACTION:draft-ietf-avt-rtp-speex-01.txt



Randell Jesup wrote:
"Alfred E. Heggestad" <aeh at db.org> writes:
Section 3.3: "Sampling rate values of 8000, 16000 or 32000 Hz MUST be
used.  Any other sampling rates MUST NOT be used" is confusing. Better to
say "The sampling rate MUST be either 8000 Hz, 16000 Hz, or 32000 Hz".
Section 4.1.1: "rate" needs to be listed as a required parameter, since the
codec supports several sampling rates.
I assume that you mean "rate=sampling rate" here. Not sure how to formulate
it in the best way, but I ended up with this:

rate: The sampling rate MUST be either 8000 Hz, 16000 Hz, or 32000 Hz.

The overloading of sample rate onto the RTP timestamp rate for (most)
audio codecs causes some real problems in certain cases.


Switching (in-call) between codecs with different sample rates -
overloading means that playout time is tricky to calculate, and doubly so
if there was a lost packet at the boundary.  (It also could significantly
upset jitter buffers and the like, unless you do a full reset on codec
shift.)  It may also complicate related RFCs like 2833, since 2833 is
normally intermixed with regular codecs:

m=audio 4321 RTP/AVP 98 99 100 101
a=rtpmap:98 ILBC/8000
a=rtpmap:99  G7221/16000
a=rtpmap:100 G7221/32000
a=rtpmap:101 telephone-event/8000
(a=fmtp's would be needed too)

Ok: what's the timestamp rate? :-) And when can it change?

For this case to have a chance to work you need three telephone-event lines like this:
a=rtpmap:101 telephone-event/8000
a=rtpmap:102 telephone-event/16000
a=rtpmap:103 telephone-event/32000


That way the telephone-event packets are sent at the same timestamp rate as the main codec in use.
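A minimal sketch of that rule, for illustration only: given the audio rtpmap entries of an offer, emit one telephone-event line per distinct clock rate. The helper name `telephone_event_lines` and the choice to start numbering at payload type 101 are assumptions made for this example, not anything defined by SDP or RFC 2833.

```python
def telephone_event_lines(rtpmaps, first_pt=101):
    """Build one telephone-event rtpmap line per distinct audio clock rate.

    rtpmaps: list of (payload_type, encoding_name, clock_rate) tuples.
    first_pt: first dynamic payload type to assign (hypothetical choice).
    """
    # Collect the distinct clock rates of the real audio codecs.
    rates = sorted({rate for _, name, rate in rtpmaps
                    if name.lower() != "telephone-event"})
    return [f"a=rtpmap:{first_pt + i} telephone-event/{rate}"
            for i, rate in enumerate(rates)]


codecs = [(98, "ILBC", 8000), (99, "G7221", 16000), (100, "G7221", 32000)]
lines = telephone_event_lines(codecs)
for line in lines:
    print(line)
```

For the offer quoted earlier this prints the three lines shown above, payload types 101 to 103.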


Note that RFC 2833 encodes offsets and durations using timestamp values. The receiver *may* not be able to accurately determine the point of switching. In some cases, multiple switches could be missed within a single burst loss (even one of only 2 packets).

For example consider an application that offers and agrees to G.722.1 and
G.711 (it doesn't matter if this particular set is smart), and switches
between them based on apparent bandwidth or even based on the apparent
audio content (G.711 in quiet/silence periods, G.722.1 in
noisy/speaking/music).  You might see multiple shifts within an even fairly
short burst loss, and twice in 2 packets at the extreme.

The base problem is that timestamp in an RTP packet is both a presentation
time, and a sample time, and the rate isn't fixed (per the SDP above).
RTCP indicates current timestamp and current time, but says nothing
(directly) about rate.
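To make the ambiguity concrete, here is a small sketch (the function name `delta_ms` and the sample timestamps are made up for illustration). It converts the same RTP timestamp difference into elapsed media time under each candidate clock rate. If the packets around a codec switch are lost, the receiver cannot tell which of these rates applied to the gap.

```python
def delta_ms(ts_prev, ts_next, clock_rate):
    """Elapsed media time in milliseconds between two RTP timestamps.

    RTP timestamps are 32-bit and wrap, so the difference is taken modulo 2**32.
    """
    ticks = (ts_next - ts_prev) % (1 << 32)
    return 1000.0 * ticks / clock_rate


# The same 640-tick gap, interpreted at each rate from the SDP example.
gap = {rate: delta_ms(1000, 1640, rate) for rate in (8000, 16000, 32000)}
for rate, ms in gap.items():
    print(f"{rate} Hz: {ms} ms")
```

The one gap reads as 80 ms, 40 ms or 20 ms depending on which rate the receiver assumes.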

It probably would have made more sense if (long ago), the timestamp rate
had been set on the media line, with a suggestion that something like the
smallest value that can (sufficiently) accurately encode the presentation
time be selected.  (Example: for 8000 and 16000, choose 16000.  For 7000
and 8000, you might need to choose 56000.)  In some cases, a rate that's
not an exact multiple of the sample rate might need to be used, to avoid
needing to use a Very Large timestamp rate.
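The "smallest value that can accurately encode the presentation time" for a set of sample rates is, in the exact case, their least common multiple. A minimal sketch (the function name `common_timestamp_rate` is hypothetical) reproduces the two examples above:

```python
from functools import reduce
from math import gcd


def common_timestamp_rate(sample_rates):
    """Smallest rate that is an integer multiple of every sample rate (LCM)."""
    return reduce(lambda a, b: a * b // gcd(a, b), sample_rates)


print(common_timestamp_rate([8000, 16000]))  # 16000
print(common_timestamp_rate([7000, 8000]))   # 56000
```

With awkward rate combinations the LCM grows quickly, which is the case where a rate that is not an exact multiple might have to be accepted instead.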

So - What *should* an application do?  Full reset on codec rate shift,
which may imply a short gap in playout, but may let you ignore packet loss?
What about 2833/etc?  Only allow codecs with identical timestamp rates on a
single media line?  (Causes all sorts of problems if someone wants to use
(for example) G.722).


I agree that timestamp rate switching is problematic. If the sampling rate switching capability is integrated into the codec, I would recommend using the highest sample rate as the RTP timestamp rate and ensuring that it always provides an integer number of timestamp ticks per frame. For 8, 16 and 32 kHz that is not an issue. You would then include a separate parameter to negotiate which sampling rates are actually allowed. But if you are in any case resetting the codec on each sample rate switch, I think letting the timestamp rate follow the sampling rate would be more correct.
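A rough sketch of the fixed-timestamp-rate approach, assuming 20 ms frames (an assumption for this example; check the codec's actual frame duration) and a hypothetical helper `ticks_per_frame`:

```python
TIMESTAMP_RATE = 32000           # highest supported sampling rate, used as RTP clock
FRAME_MS = 20                    # assumed frame duration in milliseconds
ALLOWED_RATES = (8000, 16000, 32000)


def ticks_per_frame(timestamp_rate, frame_ms):
    """Timestamp increment per frame; fails if it is not a whole number of ticks."""
    ticks, remainder = divmod(timestamp_rate * frame_ms, 1000)
    if remainder:
        raise ValueError("frame duration is not an integer number of ticks")
    return ticks


# The RTP timestamp advances by the same amount whatever rate the codec runs at.
increment = ticks_per_frame(TIMESTAMP_RATE, FRAME_MS)

# Only the number of audio samples carried per frame changes with the rate.
samples = {rate: rate * FRAME_MS // 1000 for rate in ALLOWED_RATES}

print(increment)
print(samples)
```

Under these assumptions every frame advances the timestamp by 640 ticks, whether it carries 160, 320 or 640 samples, so a mid-call rate change does not disturb playout time calculation.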


Cheers

Magnus Westerlund

IETF Transport Area Director & TSVWG Chair
----------------------------------------------------------------------
Multimedia Technologies, Ericsson Research EAB/TVM/M
----------------------------------------------------------------------
Ericsson AB                | Phone +46 8 4048287
Torshamsgatan 23           | Fax   +46 8 7575550
S-164 80 Stockholm, Sweden | mailto: magnus.westerlund at ericsson.com
----------------------------------------------------------------------

_______________________________________________
Audio/Video Transport Working Group
avt at ietf.org
https://www1.ietf.org/mailman/listinfo/avt