Re: [AVT] alignment of layers issue
Magnus Westerlund wrote:
Ingemar Johansson S skrev:
Hi
Given the discussion around alignment of layers and the document
draft-perkins-avt-rapid-rtp-sync-00
I can't get rid of the feeling that there is a probability that layers may not be aligned if RTCP-SR based synchronization is used.
I am not that worried about clock skew. The problem I see is rather the update rate of the clocks that are used for NTP. If the update interval is 15 msec you already have a relatively high probability that layers may slip (the slip may only be occasional and initial, and can probably be fixed in the long run using some magic algo that irons out the 15 msec update).
But that aside: how does the multi-anything decoder know that one or more layers (or views) are misaligned?
I can't tell for SVC, MVC and the others, but I fail to see how this verification is possible in G.718. The user will experience crap audio, but how does the decoder know that one or more layers are misaligned?
I don't think this really is an issue if you implement the
synchronization correctly. You can completely avoid misalignment between
layers in the time domain, with the exception of clock skew updates.
There, misalignment could exist in the time between the first RTCP SR
packet with an update and the last layer having received the
corresponding update.
Thus there are some considerations: clock skew updates should not be
done too often, and an implementation tip would be to hold off
adjusting a multi-layered stream until SR packets with updates have been
received for all layers. This works because clock skew fixing only
adjusts the stream's relation to other media streams, not the internal
process.
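The hold-off tip above could be sketched on the receiver side roughly as
follows. This is only an illustration, not something taken from any
draft: the class name `LayerSyncGate`, the layer identifiers and the
idea of reducing each SR to a single integer "offset" (RTP timestamp
minus NTP time expressed in ticks) are all assumptions made for the
example.

```python
class LayerSyncGate:
    """Hold back a new RTP/NTP binding until every layer has reported it.

    Hypothetical helper: 'offset' stands for whatever value the receiver
    derives from one SR to describe the RTP/NTP binding of a layer.
    """

    def __init__(self, layer_ids, initial_offset):
        self.layer_ids = set(layer_ids)
        self.active_offset = initial_offset  # binding currently in use
        self.pending = {}                    # layer_id -> newly seen offset

    def on_sender_report(self, layer_id, offset):
        """Record one SR and return the binding that should be used now."""
        if offset == self.active_offset:
            # This layer still reports the binding already in use.
            self.pending.pop(layer_id, None)
            return self.active_offset
        self.pending[layer_id] = offset
        all_reported = set(self.pending) == self.layer_ids
        all_agree = len(set(self.pending.values())) == 1
        if all_reported and all_agree:
            # Every layer has announced the same update: switch in one step.
            self.active_offset = offset
            self.pending.clear()
        return self.active_offset
```

With two layers, the first SR carrying a changed binding leaves the
active binding untouched; only the matching SR on the second layer
switches it.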
But dealing with the updates seems to become really complicated. What
would this mean? You would have to inform the receiver that an update
to the binding of RTP and NTP is going on. This would mean the receiver
would first have to get all "updated" RTCP reports in the different
sessions before it can make use of them. So, another piece of extra
signaling?
Anyway, let us assume for the time being, we can make this work with a
relatively complex implementation in the sender as well as in the receiver.
I still ask myself, and I guess other people do too, why not make
this little change towards "RTP timestamp alignment" for data of the
same layered sample in the multiple sessions? It's so easy, and the only
real argument against it was the security issue, which I guess was
discussed earlier. With this, I further assume we can keep the SSRC
alignment of the sessions in the session multiplex, and we can also make
sure that using the same timescale in the different sessions is ok.
What I would wish to have is a simple and robust mechanism
where I do not have to tweak timestamps, where I do not have to have
extra signaling, where I do not have to care about lost or late
packets, and where I can just take two numerical values from the packets
and know whether they belong to the same sample or not. No rounding
problems at all. The mechanism is there and it comes for free!
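As a rough sketch of what this would look like in a receiver, assuming
the proposed timestamp alignment holds (same timescale and same
timestamp for data of the same sample in every session): the function
name and the `(session_id, rtp_timestamp, payload)` tuple layout are made
up for the illustration, and timestamp wrap-around is ignored.

```python
from collections import defaultdict


def group_by_timestamp(packets):
    """Collect the payloads of all sessions that belong to the same sample.

    packets: iterable of (session_id, rtp_timestamp, payload) tuples,
    a hypothetical layout chosen for this example.
    Returns {rtp_timestamp: {session_id: payload}}.
    """
    units = defaultdict(dict)
    for session_id, rtp_ts, payload in packets:
        # Equal timestamps mean "same sample"; no NTP mapping is needed.
        units[rtp_ts][session_id] = payload
    return dict(units)
```

The comparison is a plain integer equality, so it needs no SR packets
and involves no rounding.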
Remember that for multi-session transmission, the alignment of data is
not required only after decoding, as for lip-sync; it is already required
before you can start decoding. And there are not so many implementations
which get the synchronization right. For lip-sync it is just a
quality-of-implementation issue. For data alignment of layered codecs,
it becomes a correctness issue for the implementation.
Thomas
Regarding how to generate correct RTCP SR packets, this is how you do it:
At start-up of the codecs you take one clock value for the media clock,
TS_O, and the corresponding system clock value, NTP_O. This reference is
going to be used for all layers, i.e. in all RTP sessions for this media
source.
When sending an RTCP SR at time p, the sender gets the current system
clock value NTP_P. It doesn't sample the media clock to determine the
RTP TS value; instead this is calculated based on the reference value:
TS_P = (NTP_P - NTP_O) * TS_RATE + TS_O
(with the NTP difference in seconds and TS_RATE the media clock rate in
ticks per second)
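A minimal sketch of this calculation, assuming that the NTP values are
available as seconds in floating point and that TS_RATE is the media
clock rate in ticks per second, so that elapsed time is multiplied by
the rate. The function and parameter names are illustrative only; the
modulo accounts for the 32-bit RTP timestamp field.

```python
def sr_rtp_timestamp(ntp_now, ntp_ref, ts_ref, clock_rate):
    """Derive the RTP timestamp for an SR from the shared reference.

    ntp_now    -- system clock when the SR is sent (NTP_P), in seconds
    ntp_ref    -- system clock sampled at start-up (NTP_O), in seconds
    ts_ref     -- media clock sampled at start-up (TS_O), in ticks
    clock_rate -- media clock rate (TS_RATE), in ticks per second
    """
    elapsed_ticks = round((ntp_now - ntp_ref) * clock_rate)
    # RTP timestamps are 32 bits wide and wrap around.
    return (ts_ref + elapsed_ticks) % 2**32
```

Because every layer uses the same (NTP_O, TS_O) pair, SR packets sent
in different sessions at different times all describe the same mapping.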
To determine clock skew you need to regularly sample system clock and
media clock and use that to determine if you have significant clock skew
that needs adjustment. This should be low-pass filtered and probably
subjected to a threshold to avoid unnecessary updates. Only when the
clock skew
affects inter media synchronization to such a degree that issues arise
or the buffer handling becomes problematic should this be updated, i.e.
at least 20-50 ms error before reacting. Certain applications can take
much bigger errors before reacting.
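One possible shape for such a monitor is sketched below. The choice of
an exponential moving average as the low-pass filter, the class name
`SkewMonitor`, and the default values for `alpha` and `threshold_s` are
assumptions for the example (the 20 ms default is simply the lower end
of the range mentioned above). Media timestamps are assumed to be
unwrapped.

```python
class SkewMonitor:
    """Track filtered media-clock drift against the system clock."""

    def __init__(self, ntp_ref, ts_ref, clock_rate,
                 alpha=0.1, threshold_s=0.020):
        self.ntp_ref = ntp_ref          # NTP_O, seconds
        self.ts_ref = ts_ref            # TS_O, ticks
        self.clock_rate = clock_rate    # ticks per second
        self.alpha = alpha              # smoothing factor of the filter
        self.threshold_s = threshold_s  # react only above this error
        self.filtered_error_s = 0.0

    def sample(self, ntp_now, media_ts_now):
        """Feed one pair of clock samples; return True if an update is due."""
        expected_ticks = (ntp_now - self.ntp_ref) * self.clock_rate
        actual_ticks = media_ts_now - self.ts_ref
        error_s = (actual_ticks - expected_ticks) / self.clock_rate
        # Exponential moving average acts as a simple low-pass filter.
        self.filtered_error_s += self.alpha * (error_s - self.filtered_error_s)
        return abs(self.filtered_error_s) >= self.threshold_s
```

A single outlier sample therefore does not trigger an update; the error
has to persist before the threshold is crossed.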
If you have clock skew, that becomes an adjustment parameter in the above
calculation, or an update of the NTP_O/TS_O binding.
When using the above method, the synchronization information in the SR
will not depend on the inaccuracy of the system clock.
Cheers
Magnus Westerlund
IETF Transport Area Director & TSVWG Chair
----------------------------------------------------------------------
Multimedia Technologies, Ericsson Research EAB/TVM
----------------------------------------------------------------------
Ericsson AB | Phone +46 10 7148287
Färögatan 6 | Mobile +46 73 0949079
SE-164 80 Stockholm, Sweden| mailto: magnus.westerlund at ericsson.com
----------------------------------------------------------------------
_______________________________________________
Audio/Video Transport Working Group
avt at ietf.org
https://www.ietf.org/mailman/listinfo/avt
--
Thomas Schierl
--------------
Fraunhofer HHI
----