
[AVT] Re: VMR-WB RTP Payload and Storage Formats



Sassan,

I am likewise getting frustrated with the situation. However, I don't believe the arguments in favour of making the VMR-WB payload format inconsistent with other speech payload formats are compelling. The codec is not particularly special in its behaviour, and the argument that one wishes to inject 8 kHz sampled tones rather than injecting tones at native rate (i.e. you wish to save a few bytes of storage in the tone generation box) does not justify the inconsistency.

I regard a consistent and stable timing model across payload formats as fundamental to the utility of RTP. You are trying to disrupt that for a trivial cost saving in a particular application. That is not an appropriate trade-off.

Colin



On 4 Oct 2004, at 19:08, <sassan.ahmadi at nokia.com> wrote:

Dear Colin,

I am getting frustrated with the situation regarding the VMR-WB I-D.

When this I-D was about to be accepted for WGLC, a comment on the relationship between the sampling rate and the RTP clock rate left it in an ambiguous status.

On the one hand, based on the distinctive capability of VMR-WB, there are people who want to use a fixed RTP clock rate of 16000 Hz to enable processing/injection of 8000 Hz sampled media. Note that 8000 and 16000 Hz sampled media have identical VMR-WB output frames. I believe there is nothing technically wrong with this, and revision -04 of the I-D appropriately addresses this concern.

On the other hand, you persist in your opinion that the RTP clock rate must be identical to the input media sampling rate regardless of the codec capabilities.

The following excerpt from Section 4.1 of RFC 3551 (line 434)

"...
The RTP clock rate used for generating the RTP timestamp is
independent of the number of channels and the encoding; it usually
equals the number of sampling periods per second. For N-channel
encodings, each sampling period (say, 1/8,000 of a second) generates
N samples. (This terminology is standard, but somewhat confusing, as
the total number of samples generated per second is then the sampling
rate times the channel count.)
..."


indicates that the RTP clock rate "usually" equals the input media sampling rate and that it is independent of the encoding; there is no normative language here.

Also consider the following excerpt from Section 6.4.4 of RFC 3550 (line 2391):

"...
   Since that timestamp is
   independent of the clock rate for the data encoding, it is possible
   to implement encoding- and profile-independent quality monitors.
..."

Therefore, you have no technical grounds to assert that the RTP clock rate MUST be equal to the input media sampling frequency.

Please think of VMR-WB as a dual-rate system where both 8000 and 16000 Hz sampled media are supported and that decoding can proceed without knowing the input media sampling frequency.
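
A minimal sketch of the two timing models under discussion may help make the difference concrete. It assumes 20 ms frames (a frame duration not stated in this thread), and the function and constant names are hypothetical, chosen only for illustration:

```python
# Illustrative only: compares RTP timestamp increments per frame under
# a fixed 16000 Hz RTP clock versus a clock tied to the input sampling rate.
# ASSUMPTION: 20 ms frames; this value is not given in the thread.

FIXED_CLOCK_HZ = 16000  # the fixed RTP clock rate proposed in the I-D


def timestamp_increment(clock_rate_hz: int, frame_ms: int = 20) -> int:
    """Return the number of RTP timestamp ticks spanned by one frame."""
    return clock_rate_hz * frame_ms // 1000


def increments(sampling_rate_hz: int) -> dict:
    """Per-frame timestamp increment under each timing model."""
    return {
        # Same increment whatever the input sampling rate was.
        "fixed_clock": timestamp_increment(FIXED_CLOCK_HZ),
        # Increment follows the input media sampling rate.
        "clock_equals_sampling_rate": timestamp_increment(sampling_rate_hz),
    }


for rate in (8000, 16000):
    print(rate, increments(rate))
```

Under these assumptions, the fixed-clock model advances the timestamp by the same amount per frame for both 8000 and 16000 Hz input, while the other model ties the increment to the input sampling rate.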

Please remember that initially I resisted this idea; however, after considering all aspects of the requested change, I came to realize that it is technically sound.

I do want a resolution for this matter. This is definitely jeopardizing and delaying the deployment of the VMR-WB codec and its prospective applications. Please do not expect the other parties to back off from their position. We must resolve this issue.

Your understanding and consideration are appreciated in advance.

Regards

-Sassan Ahmadi


_______________________________________________
Audio/Video Transport Working Group
avt at ietf.org
https://www1.ietf.org/mailman/listinfo/avt