Re: [AVT] accepting draft-lakaniemi-avt-rtp-evbr-02 as WG item

Thomas Schierl <schierl@hhi.fhg.de> Thu, 04 September 2008 15:34 UTC

Message-ID: <48C00017.2030900@hhi.fhg.de>
Date: Thu, 04 Sep 2008 17:34:47 +0200
From: Thomas Schierl <schierl@hhi.fhg.de>
Organization: Fraunhofer HHI
User-Agent: Thunderbird 2.0.0.16 (Windows/20080708)
MIME-Version: 1.0
To: Jonathan Lennox <jonathan@vidyo.com>
References: <48b081ea.01b7660a.391e.ffffd15d@mx.google.com> <23F7C403-DE24-44A1-BDC7-DA8CCDD381B6@csperkins.org> <44C96BEE548AC8429828A37623150347F36820@vaebe101.NOE.Nokia.com> <9EA8709C-26BF-451C-A7F6-5AA89F382524@csperkins.org> <44C96BEE548AC8429828A37623150347F36CBE@vaebe101.NOE.Nokia.com> <E9568156-E394-4644-BA0C-D38840740076@csperkins.org> <44C96BEE548AC8429828A37623150347F36EB4@vaebe101.NOE.Nokia.com> <7C74ABD7-567D-489F-A264-D1B9C1D082D8@csperkins.org> <44C96BEE548AC8429828A37623150347F36F61@vaebe101.NOE.Nokia.com> <3AD980DD-D2C0-4FED-8CBC-8B1116895980@csperkins.org> <48BD291C.8020007@iis.fraunhofer.de> <7AE6CF30-7DBA-4A49-AF56-CF8EC4AC03F1@csperkins.org> <6B55710E7F51AD4B93F336052113B85F3D3393@be150.mail.lan> <389E0F24-4D11-4324-BFE0-893A11A53722@csperkins.org> <6B55710E7F51AD4B93F336052113B85F3D33F0@be150.mail.lan>
In-Reply-To: <6B55710E7F51AD4B93F336052113B85F3D33F0@be150.mail.lan>
X-alterMIME: Yes
Cc: Ye-Kui.Wang@nokia.com, Colin Perkins <csp@csperkins.org>, avt@ietf.org
Subject: Re: [AVT] accepting draft-lakaniemi-avt-rtp-evbr-02 as WG item

Jonathan,

I also agree with you for the live case. For the pre-encoded case, the
clock contained in the file typically shows no deviation and the media
frames have equidistant timings. Such an accurate clock in the file
helps to "tweak" the SR's NTP wallclock value so that it can be
"forced" to be correct. As an example, you may take the timing from the
file as a scheduler time and put this time + a delta as wallclock into
your SR, while the SR is actually sent at the corresponding
gettimeofday() value, which may deviate from the scheduled (wallclock)
time.
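
A minimal sketch of this idea, for illustration only. The names
make_sr, start_unix_s, rtp_base and the 90 kHz clock rate are
assumptions made for the example and do not come from any draft or
implementation. The point it shows is that both fields of the SR are
derived from the file timeline, so the moment at which the packet
actually leaves the host does not influence them.

```python
from dataclasses import dataclass

NTP_UNIX_OFFSET = 2208988800  # seconds from 1900-01-01 to 1970-01-01


@dataclass
class SenderReport:
    ntp_seconds: float  # wallclock written into the SR (NTP epoch)
    rtp_timestamp: int  # RTP timestamp for that same instant


def make_sr(file_time_s, start_unix_s, rtp_base, clock_rate):
    """Build an SR whose NTP/RTP pair comes from the file timeline only.

    file_time_s  : media time in the file at which the SR is scheduled
    start_unix_s : the "delta", i.e. the wallclock mapped to file time 0
    rtp_base     : RTP timestamp assigned to file time 0
    """
    ntp = file_time_s + start_unix_s + NTP_UNIX_OFFSET
    rtp = (rtp_base + round(file_time_s * clock_rate)) % 2**32
    return SenderReport(ntp, rtp)


# Two SRs scheduled 5 s apart on the file timeline. Whatever
# gettimeofday() returns when each one is really sent is never read.
sr1 = make_sr(5.0, 1220542487, 1000, 90000)
sr2 = make_sr(10.0, 1220542487, 1000, 90000)
```

The NTP and RTP differences between sr1 and sr2 are exactly consistent
with each other, because neither depends on a second, independent clock.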

The live case, i.e. when you have to rely on the system's wallclock only,
is different. Your example was the deviations of the win32 get-time
functions. In such a case, with clock deviations of around 10 ms, there is
indeed a problem! Even using other clocks for the wallclock, such as that
of the sound card or of the video capture device, may not really help here.
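
To give a feeling for the size of a 10 ms deviation, here is a small
sketch. The 90 kHz clock rate, the frame rates and the "nearest frame"
matching rule are assumptions chosen for illustration; a receiver may
well match layers differently.

```python
def error_ticks(error_s: float, clock_rate: int) -> int:
    """Wallclock error expressed in RTP timestamp units."""
    return round(error_s * clock_rate)


def nearest_frame_is_unambiguous(error_s: float, frame_rate: float) -> bool:
    """True if the error stays below half a frame interval.

    Assumed matching rule: a receiver pairs frames from two sessions
    by picking the closest reconstructed wallclock time.
    """
    return error_s < 0.5 / frame_rate


ticks = error_ticks(0.010, 90000)                    # 900 ticks at 90 kHz
ok_30 = nearest_frame_is_unambiguous(0.010, 30.0)    # half interval ~16.7 ms
ok_60 = nearest_frame_is_unambiguous(0.010, 60.0)    # half interval ~8.3 ms
# Two sessions with independent errors of up to 10 ms each can be
# 20 ms apart relative to each other.
ok_30_rel = nearest_frame_is_unambiguous(0.020, 30.0)
```

Under these assumptions a single 10 ms error is tolerable at 30 fps but
not at 60 fps, and two independent 10 ms errors are not tolerable at
30 fps either.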

Possible backward-compatible solutions (assuming the same clock is used)
may be RTP timestamp alignment, or sending the RTCP SRs in the multiple
sessions at the same time with the same NTP wallclock value. Everything
else may lead to inaccuracy in NTP media timestamp generation at the
receiver. This seems to be an implementation issue rather than a
standardization issue, although it may still affect the standardization.
I think we should discuss this problem in detail in the draft and show
what can be done if no exact clock is available. Even if not all
systems are affected, there must be a way to work around such a problem.
Since, for example, the two solutions proposed above would extend the
current RTP specification for the case of multi-session transmission, we
would not have an informational draft on "multi session transmission
issues". It would be a normative one, right?
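
A rough sketch of why SRs carrying the same NTP wallclock help, under
the assumption that both sessions are driven by one media clock. The
function names, offsets and numbers are made up for the example. The
receiver-side mapping is the usual linear one from an SR's NTP/RTP
pair.

```python
def stamp(offset: int, media_ticks: int) -> int:
    """RTP timestamp for a media clock value, with a per-session offset."""
    return (offset + media_ticks) & 0xFFFFFFFF


def rtp_to_ntp(rtp_ts, sr_ntp_s, sr_rtp_ts, clock_rate):
    """Map an RTP timestamp to wallclock using one SR as reference."""
    # Signed 32-bit difference, so timestamp wrap-around is handled.
    diff = (rtp_ts - sr_rtp_ts) & 0xFFFFFFFF
    if diff >= 0x80000000:
        diff -= 0x100000000
    return sr_ntp_s + diff / clock_rate


RATE = 90000
OFF_A, OFF_B = 4294960000, 123456   # random per-session RTP offsets
T_SR, T_FRAME = 10000, 13000        # media clock values in ticks

# Case 1: both SRs are built at the same instant with the same NTP value.
# Any error in that NTP value is common to both sessions and cancels.
ntp_a = rtp_to_ntp(stamp(OFF_A, T_FRAME), 1000.0, stamp(OFF_A, T_SR), RATE)
ntp_b = rtp_to_ntp(stamp(OFF_B, T_FRAME), 1000.0, stamp(OFF_B, T_SR), RATE)

# Case 2: session B's SR is built one second later, and the wallclock
# read at that moment is off by 10 ms.
ntp_b_skewed = rtp_to_ntp(stamp(OFF_B, T_FRAME), 1001.010,
                          stamp(OFF_B, T_SR + RATE), RATE)
```

In case 1 the same frame maps to the identical wallclock in both
sessions. In case 2 the two results differ by the 10 ms reading error.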

-- 
Thomas Schierl
--------------
Fraunhofer HHI



Jonathan Lennox wrote:
> Colin Perkins writes:
>   
>>> If you're sending live (as opposed to pre-recorded) media, clock
>>> drift between the media clock (generating RTP timestamps) and the
>>> system time-of-day clock (generating NTP timestamps) can make
>>> sample-accurate stream alignment based on RTCP SRs extremely
>>> problematic, especially as RTCP report intervals lengthen.
>>>
>>> Specifically, assuming each RTCP SR's NTP and RTP times accurately
>>> represent the time-of-day and sample-clock measurements
>>> (respectively) at SR transmission time, and SR transmission time is
>>> randomized independently for each RTCP session,
>>> max_sample_misalignment = max_rtcp_interval * abs(actual_clock_rate
>>> - nominal_clock_rate)/nominal_clock_rate.
>>>       
>> I'm not sure this expression is valid, since I'd expect the receiver
>> to estimate the clock skew based on observing several SR's, and
>> compensate for it.
>>     
>
> I think this may be the disconnect we're seeing here.  It's fine for synchronization for lip sync to be estimated approximately sometime shortly after session startup, and slowly converge to greater precision over time as you receive more SRs.  Lip sync also degrades gracefully if the encoder's clocks are inaccurate or imprecise.
>
> Layered codecs, by contrast, are *not decodable* until you've established the timing relationship between the layers, unless you use codec-specific in-band methods such as SVC's NI-C or I-C which avoid the use of timing information entirely.  Otherwise, if your receiver mis-calculates a timing relationship between streams you end up with garbage output from your decoder.
>
> I think that for layered codecs, the requirements (100% accurate clock synchronization prior to any decoding) and the domain constraints (a single common media clock, as I argued in my previous e-mail) are sufficiently different from those of lip sync that it's not unreasonable to have a specific solution to the problem, rather than trying to re-use a solution to a different problem which has different requirements and constraints.
> 	
>   
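
To put numbers on the drift bound quoted above, here is a direct
transcription of that expression. The 5 s report interval and the
50 ppm drift are example values only. Reading the result as seconds
(and converting to ticks with the nominal rate) is an interpretation of
the quoted formula, not something the thread states.

```python
def max_misalignment_s(max_rtcp_interval_s, actual_rate, nominal_rate):
    """max_rtcp_interval * abs(actual - nominal) / nominal, in seconds."""
    return max_rtcp_interval_s * abs(actual_rate - nominal_rate) / nominal_rate


# 90 kHz nominal clock running 50 ppm fast, SRs up to 5 s apart.
nominal = 90000
actual = 90004.5
misalign_s = max_misalignment_s(5.0, actual, nominal)
misalign_ticks = misalign_s * nominal
```

With these example values the bound comes out at a quarter of a
millisecond, which is small for lip sync but not zero if an exact
timestamp match between layers is required.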
