Re: [AVT] Fwd: I-D Action:draft-perkins-avt-rapid-rtp-sync-03.txt



Colin-

FYI, I started out in March with a number of comments, some related to multicast UDLR with 1-2 links and I sense that's too big a topic to try and add at this point.  My comments therefore have distilled to addressing two major factors in syncing RTP sessions that I believe overshadow the time resolutions being discussed in the draft, and therefore it is an omission not to address them:

1. Common timebase
2. Media-specific access points

I stopped short of suggesting we should design anything. Therefore I still believe both topics belong here, and in this draft, and they are not so big as to defer at this point.

For clarification, I wasn't proposing we define a timebase signaling mechanism (although I do think that's a good idea).  All I am proposing is to add some text that explains that a common timebase is needed to achieve anything near the implied sync times discussed in this draft. How the senders and receivers figure that out is out of band. But if they don't and just go along with "wallclock", it makes much of the rest of this draft irrelevant since the timebase error will be huge by comparison.

Regarding the issue about access points, I'm not sure how much background to add, so apologies if this is obvious to everyone. Every media stream type has "access points" that arrive periodically - the points where a decoder can begin to decode the stream. In general, these are a function of the encoding, not constant per type, and not constant per stream. For example, in H.264 these typically center around the presence of an IDR frame.  HE-AAC has a simpler coding structure, but one still can't start decoding anywhere in the stream.  The "typical" periods of the H.264 access points range from 0.5 to 2 seconds, depending on the coding.  The exact period is application dependent, but is not relevant here.  The point is that these periods greatly exceed the times shown in the tables, which are implied to be the sync times - going as low as 0.04 seconds.  The tables in this draft suggest that RTCP SR packets be sent 0.04 seconds apart for high rate streams. This is interesting (as Thomas notes) for jitter management and clock settling, but that high rate does not help you sync, since you can't even start decoding the video until a relatively long time after the 0.04 seconds. This is mostly related to issues of late joiners, of course.  Therefore, I was suggesting we make note of this fact to clarify that there are other factors in sync and acquisition beyond receipt of NTP and the RTCP SR packet, and that these other times far exceed 0.04 seconds.

Co-locating in time the RTCP SR packets with their related access points further improves the sync time.  For example, if the IDR is every 2 seconds, there is no benefit (for syncing) to sending the RTCP SR more frequently than that, especially if it is aligned with this access point in time.  If the RTCP SR packets are sent randomly in time, the sync time is the IDR period plus 1/2 the RTCP SR packet period, on average. This is more of an optimization, but then, that's what this whole draft is about - optimizing the sync time.
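
The alignment argument above can be illustrated with a minimal sketch. It assumes a simplified model that is not taken from the draft: access points occur at exact multiples of ap_period, SR packets occur at sr_offset + k * sr_period, the receiver needs an access point followed by the first SR at or after it, and network delay and loss are ignored. The function and parameter names are hypothetical.

```python
import math

def acquisition_delay(join_time, ap_period, sr_period, sr_offset=0.0):
    """Seconds from join until an access point and a following SR have arrived.

    Assumed model: access points at k * ap_period, SR packets at
    sr_offset + k * sr_period (k >= 0), no network delay or loss.
    """
    # First access point at or after the join time.
    next_ap = math.ceil(join_time / ap_period) * ap_period
    # First SR at or after that access point.
    k = max(0, math.ceil((next_ap - sr_offset) / sr_period))
    next_sr = sr_offset + k * sr_period
    return next_sr - join_time

# SR packets aligned with a 2-second access point period.
aligned = acquisition_delay(0.5, ap_period=2.0, sr_period=2.0, sr_offset=0.0)
# Same periods, but SR packets offset by 1 second from the access points.
unaligned = acquisition_delay(0.5, ap_period=2.0, sr_period=2.0, sr_offset=1.0)
print(aligned, unaligned)
```

Under these assumptions the aligned case completes at the access point itself, while the offset case waits an additional second for the next SR.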

I believe the text to address the above is minor, does not alter the core design, and will improve the sync time.

Regards,

        Mike

At 08:23 PM 6/16/2009 +0100, Colin Perkins wrote:
Michael,

On 16 Jun 2009, at 17:37, Michael A Dolan wrote:
I decided to reply to the earlier email/draft so we have the thread context since the comments apply to both that draft and the one posted the other day. Sorry for any confusion.

I think you might misunderstand my comment about the timebase.  I am not proposing you force applications to use any specific timebase.  I am proposing that senders and receivers have to agree (implicitly since it cannot be signaled) what timebase is being used in specific NTPv4 terms.  "wallclock" is ambiguous, but more importantly will not result in the improved sync time you desire without such an out of band agreement.

To the extent that you can perform synchronisation given the ambiguity about the timebase, the mechanisms specified in our draft will improve synchronisation times. Signalling additional details on the clocks and timebase will also help synchronisation, clearly, but is an orthogonal issue (and might be something for MMUSIC to consider, rather than AVT, since it's primarily a signalling problem).

Regarding the relationship to the media stream access points, again there may be a misunderstanding.  I did not propose faster RTCP SR records be forbidden, or even discouraged.  I am proposing clarifying text explaining the important sync-related relationship that would change one's interpretation of the tables and desired RTCP SR frequency.  I believe the tables are misleading where they imply actual sync times remotely near 0.04 seconds.

I'm not sure I see the problem here - can you clarify?

Since I am only proposing (albeit important in my opinion) clarifying material and am not actually opposing what is there now; and I am willing to craft specific text changes (your option); I am not sure I understand the offhand rejection of the comments.

Based on offline emails with you and the chair, I understood this work to be THE authority on rapid sync of RTP sessions.  Maybe I misunderstood and its scope is more narrow than that. If that's the case, then perhaps we can narrow the scope of the work and I'll author another draft on RTP sync. But I think the proposed changes are relatively minor in the grand scheme of things and hope you will consider addressing them in this draft.

If there is an issue with timing of media access points affecting the synchronisation latency, then that fits with the discussion in our draft. However, I think signalling to define the timebase for synchronisation is a broader problem, separate to the additional mechanisms we define, and would be better suited to a separate draft.

Cheers,
Colin



Regards,

        Mike

At 01:36 PM 6/12/2009 +0200, Thomas Schierl wrote:
Hi Michael,

I guess your comments are related to draft-ietf-avt-rapid-rtp-sync-02.txt. Please find my comments in line.

Thomas

Michael A Dolan wrote:
Thomas and all-

Continuing this in the context of yesterday's WG draft -02:

1. The draft still uses "wallclock".  I believe what's important is that the NTP reference clock must have the same reference_ID value as the clock used in the RTCP SR records. It's not important that it be "wallclock" (or more precisely, one of the values defined in NTPv4).  It can be PPS or something else. The draft does mention that a common clock should be used.  The problem I see is that the language is imprecise WRT NTPv4 terminology and implies a vague subset of timebases by using the term, "wallclock". Further, when the hdr ext for NTP is not used, it should be stated that the NTP timebase reference_ID must be agreed to out of band between the senders and receivers.

See my original comments attached to your message. I think we do not want to make any additional requirements on the clock to be used for generating the wallclock, i.e. we will keep the definitions in RFC 3550 for backward compatibility reasons.


2. Sync can only occur when the receiver has not only NTP and RTCP SR packets, but also a point in every RTP session where there is an "access point" for that media stream.  Since these access points are customarily considerably further apart than 0.04 seconds, the tables are a bit misleading on the practical sync time and recommended RTCP SR emission frequency.  It should be noted around all the tables that there is no benefit to sending the RTCP SR record more often than the media access point frequency, especially in combination with the next comment below.

Sure, there is typically a need for a random access point to start decoding. So at first glance it makes sense not to send SRs more frequently than random access points occur. But there may be a benefit to sending SRs more frequently, e.g., to estimate clock skew.


3. It should be recommended that RTCP SR packets be sent near, and immediately following, media access points (subject to the 5% boundary). Doing so will greatly improve the sync time rather than relying on more frequent, and/or randomly timed, RTCP SR packets.

Generally, it should not be a problem to slightly delay RTCP packets in order to align with media random access points, but this would need further analysis to document the acceptable bounds. I guess this would require more time and would further delay the intended solution for layered codecs. In order to not further delay the solution in this draft, this could become a topic of another draft.


Comment #1 becomes even more important the faster you want to sync.  If the RTP sessions are using GPS and the NTP source is using UTC (both "wallclock"), there will be errors measured in seconds.  This disparity is less likely when the hdr ext is used, of course. But in the general case, this language needs tightening up to remove the clock errors.
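
As a rough illustration of the scale of that error, the sketch below converts a fixed timebase offset into RTP timestamp ticks. It assumes a 90 kHz video RTP clock and uses the GPS-minus-UTC offset of 15 seconds that applied from 1 January 2009; the constant and function names are hypothetical.

```python
# GPS time does not insert leap seconds, while UTC does. From 1 January 2009
# GPS time was 15 seconds ahead of UTC.
GPS_MINUS_UTC_SECONDS_2009 = 15

def timebase_error_ticks(offset_seconds, rtp_clock_rate_hz):
    """RTP timestamp ticks that correspond to a fixed timebase offset."""
    return offset_seconds * rtp_clock_rate_hz

# Assumed 90 kHz RTP clock, as commonly used for video payloads.
video_ticks = timebase_error_ticks(GPS_MINUS_UTC_SECONDS_2009, 90_000)
# Number of 0.04-second intervals (25 per second) covered by the offset.
intervals_of_40ms = GPS_MINUS_UTC_SECONDS_2009 * 25
print(video_ticks, intervals_of_40ms)
```

An offset of this size spans hundreds of the 0.04-second intervals discussed in the tables, which is why an unstated timebase can dominate every other term.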

Comments 2 & 3, although true in all scenarios, become more relevant in SSM with late joiners.

I hope the above comments are clearer than my attempts below.  I'm happy to suggest specific edits to the draft if that will help.

Regards,

        Mike

At 02:25 PM 4/2/2009 +0200, Thomas.Schierl at hhi.fraunhofer.de wrote:
Hi Michael,

What I can see from the ATSC M/H Transport spec is that there are requirements on encoder and decoder clock precision, which certainly give a completely different environment from the one assumed by RFC 3550 with respect to clock precision. If we could assure such clocks for all devices which may potentially encode and stream layered media via multi session transmission, the problem would have been solved anyway. But that is not the case, as I tried to show by pointing to some of the relevant discussions on the mailing list.

About your second point: RTP does not require NTP to be used as wallclock. In RTP, there is much more freedom in selecting the source for this clock. RTP uses the NTP timestamp format just for indicating the wallclock in the RTCP sender reports. Furthermore, draft-perkins-avt-rapid-rtp-sync-03 only requires that the same wallclock used for the SRs also be used for the wallclock timestamps in the header extensions.
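
For readers less familiar with the SR mapping, here is a minimal sketch of how a receiver can use the (NTP-format timestamp, RTP timestamp) pair from a sender report to place a later RTP timestamp on the sender's wallclock. It assumes the 64-bit NTP timestamp format (32 bits of seconds, 32 bits of fraction) and 32-bit RTP timestamps that wrap; the function names are hypothetical and clock skew is ignored.

```python
NTP_FRACTION = 2 ** 32  # the low 32 bits are a binary fraction of a second

def ntp_to_seconds(msw, lsw):
    """Convert the two 32-bit words of an NTP-format timestamp to seconds."""
    return msw + lsw / NTP_FRACTION

def rtp_to_wallclock(rtp_ts, sr_rtp_ts, sr_ntp_seconds, clock_rate_hz):
    """Map an RTP timestamp to wallclock seconds using one SR mapping."""
    # Difference modulo 2**32, interpreted as a signed value so that
    # timestamps slightly before the SR and wrap-around both work.
    diff = (rtp_ts - sr_rtp_ts) & 0xFFFFFFFF
    if diff >= 2 ** 31:
        diff -= 2 ** 32
    return sr_ntp_seconds + diff / clock_rate_hz

sr_time = ntp_to_seconds(100, 0x80000000)  # 100.5 seconds
later = rtp_to_wallclock(91_000, 1_000, sr_time, 90_000)
print(sr_time, later)
```

Note that nothing in this mapping identifies which timebase the wallclock values come from; the receiver only learns the relation between the two timestamps.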

Another interesting point is that the ATSC Transport spec does not mention SVC, while it does mention AVC and AAC. In the ATSC Video Systems spec (S4-136r7-A153-Part-7-Video-System), there is a solution which requires RTP timestamp synchronization for "SVC Transport in Two RTP Sessions". This certainly solves the clock skew problem, but as discussed in AVT, it is not a general solution and has different implications if applied to all use cases of RTP multi session transmission, such as conversational services. Therefore we now have a solution which seems to be agreeable to the group, which solves the problem and is more in line with RFC 3550, i.e. using the wallclock for the data alignment between the sessions.

Best regards,
Thomas


Michael A Dolan wrote:
Hi Thomas-

Thanks.  Yes, I am familiar with the discussion.

I'm not necessarily pushing the ATSC solution, but offering some points to consider, which apply equally to the problem at hand here, I think.

Also, it is important to note some more things about the ATSC text:

1. The language in the specification very carefully recommends the clock precision without requiring it.  It is expected to work over routers with jitter and delays and with media components from outside the broadcast facility. The difference is that the client is expected to be able to perform tight sync when the servers are able to provide the recommended timing model (encoding one hop away, etc).  It gracefully degrades to "Internet" timing model quality.

2. The clock, although "NTP" syntax, is not time of day (e.g. reference_ID= "GPS"), but "PPS", thus allowing the components to be encoded and decoded without concern for end-to-end delivery delay.  This only works when NTP is explicitly signaled as part of the encoding and both components share a clock (same or closely coupled encoders). So it will not work in the general case.  But there are many systems where it will and perhaps consideration should be given to permitting non-"wall clock" timebases.

The NTP hdr ext here is an encoding optimization that indeed has some other good properties.  Although the draft ties the NTP timestamp back to the RTCP SR records, the timebase is not explicitly defined (which is also true of RTCP SR).  Some reference_ID value must be assumed and there is enough delta between, for example, GPS and UTC, timebases to cause things to break if the entire system is not implicitly assumed to be only one timebase. For example, without an explicit reference_ID, how could one use other timebases defined in NTPv4?  This is a broader issue than the hdr ext syntax, but since we're discussing NTP, I thought I'd bring it up.

I think the above and the other points below are relevant here and not inconsistent with the RTCP SR methods. The fact that the ATSC design *can* be more precise is not alone cause to discard the techniques developed there.

Regards,

         Mike

At 09:34 AM 3/31/2009 +0200, Thomas.Schierl at hhi.fraunhofer.de wrote:
Dear Michael,

Thanks for the comments and the reference to the ATSC M/H.

I guess what we are looking for in AVT is a solution which also works for encoders having non-perfect clocks. There was a long discussion on the list; here are two pointers:

http://www.ietf.org/mail-archive/web/avt/current/msg09144.html
http://www.ietf.org/mail-archive/web/avt/current/msg10491.html

ATSC can surely assume a perfect clock in the NTP streams, which will not be the case on PC systems that may be used as an encoder for multi session transmission of a layered codec. This is one major problem we intended to solve with the header extension, while keeping the whole approach as closely in line as possible with the existing RTCP SR method.


Best regards,
Thomas



______________________________________________
Thomas Schierl, Dipl.-Ing.
Fraunhofer HHI
Senior Research Engineer
Image Processing Dept. / Multimedia Transport
Einsteinufer 37, 10587 Berlin, Germany
PHONE: +49 30 31002 227 , FAX: +49 30 392 7200
WEB: http://ip.hhi.de , SKYPE: thomas.schierl
EMAIL: thomas.schierl at hhi.fraunhofer.de



Michael A Dolan wrote:
Those interested in this topic may also find this interesting:

http://www.atsc.org/standards/cs_documents/a153-2009-03-13/S4-132r13-A153-Part-3-Transport.pdf

Section 10.3.

Some differences are:

1. There is one NTP packet stream per service, rather than the encoding optimization in the hdr ext;
2. RTCP SR records are emitted "near" video random access points, rather than on a prescribed time schedule; and
3. The above works in multicast-only, UDLR networks (e.g. television broadcast), rather than requiring a 2-way link.

#2 is interesting since in order to begin a *synchronized* decode (video + audio properly correlated), a client must have acquired all of:
a. NTP
b. RTCP SR for each component
c. video reference frame and decoding metadata (i.e. a random access point)
d. audio frame

So, the relative timing of the RTCP SR records to the RTP stream content is perhaps more important than the emission period.  Having the RTCP SR records arrive faster than the video reference frames doesn't help of course.  And even if they are "fast enough", having them arrive out of sync with the reference frames delays the service acquisition by half the period on average.
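
The "must have acquired all of" condition can be sketched as follows. This is only an illustration: the item names mirror the list a-d above for one video and one audio component, the arrival times are made-up example values, and the function name is hypothetical.

```python
# Items a client needs before a synchronized decode can begin (list a-d above).
REQUIRED = ("ntp", "sr_video", "sr_audio", "video_rap", "audio_frame")

def sync_ready_time(arrivals):
    """Return the time at which every required item has arrived, or None."""
    if any(item not in arrivals for item in REQUIRED):
        return None
    # Acquisition completes only when the slowest item arrives.
    return max(arrivals[item] for item in REQUIRED)

# Example arrival times in seconds after joining (illustrative values only).
arrivals = {
    "ntp": 0.1,
    "sr_video": 0.04,
    "sr_audio": 0.04,
    "video_rap": 1.7,
    "audio_frame": 0.02,
}
ready = sync_ready_time(arrivals)
print(ready)
```

In this example the SR packets arrive within 0.04 seconds, yet acquisition still completes only when the video random access point arrives.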

Bandwidth utilization (how many bits to devote to RTCP and NTP) can arguably be more efficiently managed by the server in most cases, not individual client requests for faster RTCP SR packets. And, the client would have no knowledge of the intended (video) reference frame periods being used by the server, so it would not, in general, know what to ask for.

In practice, won't a server that wants to enable faster acquisition and sync do things like the above?  I'm concerned that detailed requests from clients beyond "please use more bandwidth and enable fast sync" won't be as valuable as they seem since in the general case the client doesn't know what to ask for exactly.

FYI, the above ATSC document is in a "candidate" state where public comments are solicited.  Independent of this thread, if anyone has comments on it, I could convey them back to the ATSC.

Regards,

        Mike

At 07:43 PM 3/9/2009 +0000, Colin Perkins wrote:
The changes in this version are primarily editorial; we're aware that
not all the open issues have been addressed yet. Any further comments
are welcomed.

Chairs: we'd like this to be considered as an AVT working group
draft, as indicated privately.

Cheers,
Colin




Begin forwarded message:
From: Internet-Drafts at ietf.org
Date: 9 March 2009 17:15:02 GMT
To: i-d-announce at ietf.org
Subject: I-D Action:draft-perkins-avt-rapid-rtp-sync-03.txt
Reply-To: internet-drafts at ietf.org

A New Internet-Draft is available from the on-line Internet-Drafts
directories.

        Title           : Rapid Synchronisation of RTP Flows
        Author(s)       : C. Perkins, T. Schierl
        Filename        : draft-perkins-avt-rapid-rtp-sync-03.txt
        Pages           : 17
        Date            : 2009-03-09

This memo outlines how RTP multimedia sessions are synchronised, and
discusses how rapidly such synchronisation can occur.  We show that
most RTP sessions can be synchronised immediately, but that the use
of video switching multipoint conference units (MCUs) or large source
specific multicast (SSM) groups can greatly increase the initial
synchronisation delay.  This increase in delay can be unacceptable to
some applications that use layered and/or multi-description codecs.

This memo updates the RTP Control Protocol (RTCP) timing rules to
reduce the initial synchronisation delay for SSM sessions.  A new
feedback packet is defined for use with the Extended RTP Profile for
RTCP-based Feedback (RTP/AVPF), allowing video switching MCUs to
rapidly request resynchronisation.  Two new RTP header extensions are
defined to allow rapid synchronisation of late joiners, and guarantee
correct timestamp based decoding order recovery for layered codecs in
the presence of clock skew.

A URL for this Internet-Draft is:
http://www.ietf.org/internet-drafts/draft-perkins-avt-rapid-rtp-sync-03.txt

Internet-Drafts are also available by anonymous FTP at:
ftp://ftp.ietf.org/internet-drafts/


_______________________________________________
I-D-Announce mailing list
I-D-Announce at ietf.org
https://www.ietf.org/mailman/listinfo/i-d-announce
Internet-Draft directories: http://www.ietf.org/shadow.html
or ftp://ftp.ietf.org/ietf/1shadow-sites.txt

_______________________________________________
Audio/Video Transport Working Group
avt at ietf.org
https://www.ietf.org/mailman/listinfo/avt


--
Colin Perkins
http://csperkins.org/