RE: [AVT] Re: VMR-WB RTP Payload and Storage Formats
Colin,
It seems that the 'sampling rate' issue has now crossed over to the Storage Format thread. In any case, it has been a while since my initial comments on how restricting the RTP timestamp clock rate to equal the source media sampling rate affects applications. So I would like to reiterate that the impact of this restriction extends well beyond saving a few bytes of storage in a tone generation box. As I stated previously, the restriction is also problematic for mixing and/or switching diverse media sources in streaming and conversational applications.
Consider a conference bridging application, where at least one party is a land-line circuit (which requires 8 kHz sampling) and at least one other party is a wideband VMR-WB terminal (which may employ 16 kHz sampling). Conference bridging for these media streams requires switching and/or mixing the input streams into like output streams. Under your restricted method, using the VMR-WB sample rate adaptation function in this application requires end-to-end session renegotiation (e.g., an SDP offer-answer cycle for each leg) to achieve a common sampling rate, and hence a common timestamp interval, for all RTP terminations involved in the conference bridge. The additional latency and synchronization complexity of renegotiating the sampling rate end to end for all RTP terminations should not be required to establish and maintain the bridge in such applications.
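To put rough numbers on the timestamp-interval point, here is a minimal sketch of the per-frame RTP timestamp increment each bridge leg would use if its clock rate were tied to its sampling rate. The 20 ms frame duration, the leg names, and both helper functions are assumptions made for illustration; they are not taken from the I-D.

```python
def timestamp_increment(clock_rate_hz: int, frame_ms: int = 20) -> int:
    """RTP timestamp ticks spanned by one frame at the given clock rate.

    frame_ms defaults to 20, an assumed frame duration for this sketch.
    """
    ticks, remainder = divmod(clock_rate_hz * frame_ms, 1000)
    if remainder:
        raise ValueError("frame duration is not a whole number of clock ticks")
    return ticks


def rescale_timestamp(ts_offset: int, from_rate_hz: int, to_rate_hz: int) -> int:
    """Map a timestamp offset between clock rates, wrapping at 32 bits."""
    return (ts_offset * to_rate_hz // from_rate_hz) % 2**32


# Hypothetical bridge legs, each with a clock rate equal to its sampling rate.
legs = {"landline": 8000, "vmr_wb_terminal": 16000}
increments = {name: timestamp_increment(rate) for name, rate in legs.items()}

for name, ticks in increments.items():
    print(f"{name}: {ticks} ticks per frame")
```

With these assumptions the two legs advance by different amounts per frame, so a bridge that forwards frames between them has to either rescale timestamps itself or have both legs agree on one clock rate; a single fixed clock rate would make the increments identical.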
Regards,
Gino
-----Original Message-----
From: avt-bounces at ietf.org [mailto:avt-bounces at ietf.org] On Behalf Of Colin Perkins
Sent: Monday, October 04, 2004 2:01 PM
To: sassan.ahmadi at nokia.com
Cc: magnus.westerlund at ericsson.com; avt at ietf.org
Subject: [AVT] Re: VMR-WB RTP Payload and Storage Formats
Sassan,
I am likewise getting frustrated with the situation. However, I don't
believe the arguments in favour of making the VMR-WB payload format
inconsistent with other speech payload formats are compelling. The
codec is not particularly special in its behaviour, and the argument
that one wishes to inject 8 kHz sampled tones rather than injecting
tones at native rate (i.e. you wish to save a few bytes of storage in
the tone generation box) does not justify the inconsistency.
I regard a consistent and stable timing model across payload formats to
be fundamental to the utility of RTP. You are trying to disrupt that
for a trivial cost saving in a particular application. That is not an
appropriate trade-off.
Colin
On 4 Oct 2004, at 19:08, <sassan.ahmadi at nokia.com> wrote:
> Dear Colin,
>
> I am getting frustrated with the situation regarding the VMR-WB I-D.
>
> When this I-D was about to be accepted for the WGLC, a comment on the
> relationship between the sampling rate and the RTP clock rate put this
> I-D in an ambiguous state.
>
> On the one hand, based on the distinctive capability of VMR-WB, there
> are people who want to use a fixed RTP clock rate of 16000 Hz to
> enable processing/injection of the 8000 Hz sampled media. Note that
> 8000 and 16000 Hz sampled media have identical VMR-WB output frames. I
> believe there is technically nothing wrong and revision -04 of the I-D
> appropriately addresses this concern.
>
> On the other hand, you persist in your opinion that the RTP clock rate
> must be identical to the input media sampling rate regardless of the
> codec capabilities.
>
> The following excerpt from Section 4.1 of RFC 3551 (line 434)
>
> "...
> The RTP clock rate used for generating the RTP timestamp is
> independent of the number of channels and the encoding; it usually
> equals the number of sampling periods per second. For N-channel
> encodings, each sampling period (say, 1/8,000 of a second) generates
> N samples. (This terminology is standard, but somewhat confusing, as
> the total number of samples generated per second is then the sampling
> rate times the channel count.)
> ..."
>
> indicates that (there is no normative language here) the RTP clock
> rate "usually" equals the input media sampling rate and that it is
> independent of the encoding.
>
> Also the following excerpt from Section 6.4.4 of RFC 3550 (line 2391)
>
> "...
> Since that timestamp is
> independent of the clock rate for the data encoding, it is possible
> to implement encoding- and profile-independent quality monitors.
> ..."
>
> Therefore, you have no technical ground to assert that RTP clock rate
> MUST be equal to input media sampling frequency.
>
> Please think of VMR-WB as a dual-rate system where both 8000 and 16000
> Hz sampled media are supported and that decoding can proceed without
> knowing the input media sampling frequency.
>
> Please remember that initially I resisted this idea; however, after
> considering all aspects of the requested change, I came to realize
> that it is technically sound.
>
> I do want a resolution for this matter. This is definitely
> jeopardizing and delaying the deployment of VMR-WB codec and its
> prospective applications. Please do not expect the other parties
> to back off from their position. We must resolve this issue.
>
> Your understanding and consideration are appreciated in advance.
>
> Regards
>
> -Sassan Ahmadi
_______________________________________________
Audio/Video Transport Working Group
avt at ietf.org
https://www1.ietf.org/mailman/listinfo/avt