RE: [AVT] RE: <draft-ietf-avt-rtp-vmr-wb-03.txt>: sampling rate
Hi Magnus,
Please find my response to your comments below:
> Based on what you write in the previous mail. It seems that the only
> reason for using different RTP timestamp rate between 8000 and 16000 Hz
> is to indicate the sampling rate of the source material.
This is correct. The only reason that I defined two RTP clock rates was to differentiate between narrowband and wideband media in the encoder and the decoder.
If the original media is narrowband, the decoder still generates a wideband output, since it does not depend on the input sampling frequency, unless it is explicitly informed otherwise. I initially thought that the RTP clock rate could carry this information.
> If the codec does not need any indication at all if the source material
> is 8k or 16k then, I think the usage of different RTP timestamp rates is
> creating unnecessary interoperability barriers. The barrier is that one
> actually needs to indicate the rate of the source material, and cope
> with RTP timestamp switching.
Agreed.
> To avoid the unnecessary function I would propose that VMR-WB only
> defines 16kHz as RTP timestamp rate. If there is desire to have
> knowledge about source sampling rate that will be used, then one should
> define a parameter that indicates that. But I am not certain it really
> is needed. Such a parameter is declarative and does not matter in
> regards to any interoperability and can be ignored without
> consequence.
> Or is it something else about the codec that prevents this? I would not
> think so as the file format can be fine without an explicit indication
> of the source sampling rate.
This is a good suggestion. We need to define the RTP timestamp rate as 16000 Hz to maintain compatibility with RFC 3267 and the AMR-WB interoperable mode. But it is still important to have a declarative MIME parameter "sampling-frequency" for the RTP payload to inform the encoder and decoder when narrowband media is processed. That parameter will be optional; if it is not present, the output of the decoder will be wideband regardless of the input sampling frequency.
Now if someone injects some narrowband tones or announcements, that does not affect the RTP timestamp and does not affect the decoding.
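To illustrate why the source sampling rate no longer matters for timestamping, here is a minimal sketch of how the RTP timestamp would advance with a single 16000 Hz clock. The 20 ms frame duration and the function names are assumptions made for illustration only; they are not taken from the I-D text.

```python
# Sketch only: RTP timestamp progression with a fixed 16000 Hz clock.
# FRAME_DURATION_MS = 20 is an assumption for illustration.

RTP_CLOCK_RATE_HZ = 16000
FRAME_DURATION_MS = 20


def timestamp_increment(frames=1,
                        clock_rate_hz=RTP_CLOCK_RATE_HZ,
                        frame_ms=FRAME_DURATION_MS):
    """Return the number of RTP clock ticks covered by `frames` frames."""
    return frames * clock_rate_hz * frame_ms // 1000


def next_timestamp(current, frames=1):
    """Advance a 32-bit RTP timestamp, wrapping modulo 2**32."""
    return (current + timestamp_increment(frames)) % 2**32


if __name__ == "__main__":
    # The increment is the same whether the source media was sampled
    # at 8000 Hz or at 16000 Hz, because only the RTP clock rate is used.
    print(timestamp_increment())
    print(next_timestamp(2**32 - 100))
```

Under these assumptions the per-frame increment is constant, so injecting narrowband tones or announcements requires no timestamp switching.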
If you agree, I will make the following changes to the VMR-WB I-D:
1- Only one RTP timestamp clock rate of 16000 Hz is used throughout the I-D.
2- A new optional MIME parameter "sampling-frequency" is defined with the following description:
sampling-frequency: The input/output media sampling frequency. Permissible values are 8000 (narrowband) or 16000 (wideband, default). If the value is 16000 or the parameter is not present, the input/output media sampling frequency is 16000 Hz.
I used the term "media" because the encoder input or the decoder output could be speech or other audio (e.g., music, tones).
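As a rough sketch of how a receiver might interpret the proposed parameter, the following reads "sampling-frequency" from a parameter string and applies the 16000 Hz default when it is absent. The "name=value; name=value" string format, the function names, and the choice to reject other values are assumptions for illustration, not part of the proposed text.

```python
# Sketch only: interpreting the proposed optional "sampling-frequency"
# parameter. The "name=value; name=value" input format and all function
# names here are hypothetical.

ALLOWED_FREQUENCIES = {8000: "narrowband", 16000: "wideband"}
DEFAULT_FREQUENCY_HZ = 16000


def parse_params(param_string):
    """Split a 'name=value; name=value' string into a dict."""
    params = {}
    for item in param_string.split(";"):
        item = item.strip()
        if not item:
            continue
        name, _, value = item.partition("=")
        params[name.strip().lower()] = value.strip()
    return params


def media_sampling_frequency(param_string):
    """Return 8000 or 16000; default to 16000 if the parameter is absent."""
    raw = parse_params(param_string).get("sampling-frequency")
    if raw is None:
        return DEFAULT_FREQUENCY_HZ
    try:
        value = int(raw)
    except ValueError:
        raise ValueError("sampling-frequency is not an integer: %r" % raw)
    if value not in ALLOWED_FREQUENCIES:
        raise ValueError("unsupported sampling-frequency: %d" % value)
    return value


if __name__ == "__main__":
    print(media_sampling_frequency("sampling-frequency=8000"))
    print(media_sampling_frequency("octet-align=1"))
```

Since the parameter is declarative, an implementation could equally fall back to the default instead of raising an error on an unrecognized value; the strict behavior here is only one possible choice.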
If this is acceptable, I will make the necessary changes in the I-D and submit a new draft shortly.
Regards
-Sassan Ahmadi
_______________________________________________
Audio/Video Transport Working Group
avt at ietf.org
https://www1.ietf.org/mailman/listinfo/avt