Re: [AVT] RE: <draft-ietf-avt-rtp-vmr-wb-03.txt>: sampling rate
Hello, Magnus,
Magnus Westerlund wrote:
Hi Sassan,
Based on what you wrote in the previous mail, it seems that the only
reason for using different RTP timestamp rates, 8000 and 16000 Hz,
is to indicate the sampling rate of the source material. If the codec
does not need any indication of whether the source material is 8k or 16k,
then I think the usage of different RTP timestamp rates is creating
unnecessary interoperability barriers. The barrier is that one actually
needs to indicate the rate of the source material, and cope with RTP
timestamp switching.
Right on! You nailed the issue perfectly.
To avoid this unnecessary function, I would propose that VMR-WB define
only 16 kHz as the RTP timestamp rate.
Agreed. This would effectively remove the interoperability barrier you pointed out above.
My only concern is that this may create some interesting situations. Let's consider an
example: original speech sampled at 8k is passed to the VMR-WB encoder, and the decoder is set to
output speech at 8k.
Here, we would then have:
- source sampling rate = 8k
- actual sampling rate of the bit stream sent over RTP = 12.8k
- sampling rate output from VMR-WB = 8k
- RTP header timestamp rate = 16k!!!
I am not sure this will cause any problem, but it seems strange.
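To make the mismatch concrete, here is a rough sketch of how many samples (or timestamp ticks) one frame spans at each of the four rates listed above. It assumes a 20 ms frame duration, which this thread does not state; the names FRAME_MS, ticks_per_frame, and scenario are made up for illustration only.

```python
# Illustrative sketch only: FRAME_MS = 20 is an assumption, not taken from this thread.
FRAME_MS = 20


def ticks_per_frame(rate_hz, frame_ms=FRAME_MS):
    """Number of samples (or RTP timestamp ticks) covered by one frame."""
    return rate_hz * frame_ms // 1000


# The four rates from the example above, in Hz.
scenario = {
    "source sampling rate": 8000,
    "internal codec rate": 12800,
    "decoder output rate": 8000,
    "RTP timestamp rate": 16000,
}

per_frame = {name: ticks_per_frame(rate) for name, rate in scenario.items()}

for name, count in per_frame.items():
    print(f"{name}: {count} per {FRAME_MS} ms frame")
```

Under that assumption the RTP timestamp would advance by 320 per frame while the audio interface only sees 160 samples per frame. The two numbers differ by a fixed factor of two, so it looks odd but stays consistent.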
If there is a desire to have
knowledge about source sampling rate that will be used, then one should
define a parameter that indicates that. But I am not certain it really
is needed. Such a parameter is declarative and does not matter in
regards to any interoperability and can be ignored without consequence.
I, too, would like to first see some use case here. If we don't know how the information is
going to be used, it makes no sense to specify a mechanism in RTP or even SDP to pass it around.
regards,
-Qiaobing
Or is it something else about the codec that prevents this? I would not
think so as the file format can be fine without an explicit indication
of the source sampling rate.
Cheers
Magnus
sassan.ahmadi at nokia.com wrote:
Hi Qiaobing,
Is it true that all the coded frames output from a VMR-WB __encoder__
use the 12.8k sampling rate, independent of the original sampling
rate of the speech?
The above statement is true. However, I want to make sure that it is
not misinterpreted.
The VMR-WB encoder converts the 8 or 16 kHz sampled input speech to
12.8 kHz prior to the encoding functions. This INTERNAL sampling
frequency is transparent (hidden) to the user. The bit stream
generated by the encoder is then transmitted to the VMR-WB decoder.
The VMR-WB decoding functions are independent of the encoder input
speech sampling frequency. By default, the VMR-WB decoder generates a
wideband output, unless instructed otherwise. The internal sampling
frequency must now be converted to 16 kHz (for wideband output) and
the higher frequency band (6.4 to 7 kHz spectrum) must be
reconstructed by the decoder. If a narrowband output is desired, then
the 12.8 kHz sampling frequency must be converted to 8 kHz. Therefore, you
CANNOT use the 12.8 kHz internal sampling frequency for any purpose
other than the encoding-decoding functions.
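The conversions described above can be summarized as rational resampling ratios. The following is only a sketch of the arithmetic, not of the codec's actual resampling filters; the names INTERNAL_RATE_HZ and resample_ratio are hypothetical.

```python
from fractions import Fraction

# Internal sampling frequency described above, in Hz.
INTERNAL_RATE_HZ = 12800
# External rates mentioned in this thread, in Hz.
EXTERNAL_RATES_HZ = (8000, 16000)


def resample_ratio(from_hz, to_hz):
    """Output samples produced per input sample, as an exact fraction."""
    return Fraction(to_hz, from_hz)


# Encoder side: external input rate -> internal rate.
encoder_ratios = {r: resample_ratio(r, INTERNAL_RATE_HZ) for r in EXTERNAL_RATES_HZ}
# Decoder side: internal rate -> external output rate.
decoder_ratios = {r: resample_ratio(INTERNAL_RATE_HZ, r) for r in EXTERNAL_RATES_HZ}

for rate in EXTERNAL_RATES_HZ:
    print(f"{rate} Hz -> internal: x{encoder_ratios[rate]}, "
          f"internal -> {rate} Hz: x{decoder_ratios[rate]}")
```

The encoder and decoder ratios for a given external rate are reciprocals of each other, and the decoder picks its ratio without knowing which one the encoder used.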
Depending on the output audio interface (or the network interface),
one may wish to instruct the decoder to generate a narrowband or
wideband output.
For proper operation, the RTP timestamp clock rate must be either 8000
or 16000 Hz for narrowband or wideband operation,
respectively. The 12800 Hz internal sampling rate CANNOT be used for
the RTP timestamp clock rate. The correct timestamp or clock rate
(8000 or 16000) is required for proper buffering and other functions
in the transmitting and receiving sites.
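As a rough illustration of why the buffering depends on the clock rate: a receiver typically turns the difference between two RTP timestamps into a duration by dividing by the clock rate. The sketch below assumes the 32-bit RTP timestamp field; the function name elapsed_ms and the sample values are hypothetical.

```python
# RTP timestamps are 32-bit unsigned values that wrap around.
TIMESTAMP_MODULUS = 1 << 32


def elapsed_ms(ts_prev, ts_curr, clock_rate_hz):
    """Media time between two RTP timestamps, in milliseconds."""
    ticks = (ts_curr - ts_prev) % TIMESTAMP_MODULUS  # handles wraparound
    return ticks * 1000 / clock_rate_hz


# The same timestamp step of 320 ticks, interpreted with two clock rates.
step_at_16k = elapsed_ms(0, 320, 16000)
step_at_8k = elapsed_ms(0, 320, 8000)

# A step that crosses the 32-bit wraparound point.
wrapped = elapsed_ms(TIMESTAMP_MODULUS - 160, 160, 16000)

print(step_at_16k, step_at_8k, wrapped)
```

If the sender and receiver disagree about the clock rate, the receiver's estimate of the playout time is off by the ratio of the two rates, which is why the rate has to be agreed on in advance.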
cdma2000 Service Option 62 (VMR-WB) also recognizes only 8000 or 16000
Hz sampling frequencies.
Since VMR-WB and AMR-WB codecs share the same core technology, the
concept of 12800 Hz internal sampling frequency is used in both
codecs. As you can see in the AMR-WB RFC and 3GPP specs, there is no external
usage of the internal sampling frequency and the default RTP clock
rate for the AMR-WB codec is 16000 Hz.
Regards
-Sassan Ahmadi
_______________________________________________
Audio/Video Transport Working Group
avt at ietf.org
https://www1.ietf.org/mailman/listinfo/avt