
[AVT] RE: Carrying SMPTE time-codes in RTP streams, discussion email



Brian, your email raised interesting enough questions that I was trying to think about them before reacting. However, Colin wants an I-D in tomorrow morning (it can be revised later), so here goes.


At 3:11 PM -0800 2/11/05, Link, Brian wrote:
Hi Dave,

This is good. You've given this a lot of thought. Here are a few
comments to add to the mix:

1.  One (or more) actual SMPTE time stamp values are needed in each
packet for the commercial application I described. I don't see how the
mapping between SMPTE time code and RTP time stamps over long runs in
your proposal could be sufficient. The application is a quality control
(QC) test of the final master audio and video for a DVD, before they are
multiplexed into a program stream. Part of what needs to be checked is
that the SMPTE time stamps that are "striped" on a hard disk with the
audio (one SMPTE time stamp value per AC-3 audio frame) are all
correctly lined up with the video so that the multiplex operation will
work correctly. The mapping between RTP time stamps and SMPTE time
stamps isn't sufficient because it requires the receiver to synthesize
SMPTE time stamps between mapping points, which may mask errors in the
source.

If you lose RTCP packets, yes, you may lose a discontinuity. I tried to discuss this a bit in the draft.



2. Since synthesized time code is acceptable in many applications, it presumably is undesirable to require all streams that use SMPTE to carry a time stamp for every RTP packet.

Every RTP packet already has a timestamp. We're at cross-purposes here; I am leveraging the *existing* timestamp; no change at all to RTP packets. Completely neutral with respect to profile, encryption, and payload format.


Would it be possible to send a
mapping RTCP APP packet for every RTP packet? Or even more than one per
RTP packet, since there is a SMPTE time stamp associated with each audio
frame? Then no packet would get stamped with a synthesized time stamp.
This might be a way to enable the QC application.

3.  For that matter, what's the advantage of putting these mappings in
separate RTCP packets? Encapsulating them in the same packet as
the media data instead would give a greater assurance that the receiver
is associating the right SMPTE time value with the media data.

If they are in every packet, there is no need for a mapping. If they are imputed from previous packets, they can be imputed from RTCP. The advantage of this approach is the low-bandwidth and isolation from the RTP formats.



4. Note that regardless of where the mapping exists, there needs to be a mechanism so that the receiver can determine if and when it has missed a mapping packet (and therefore a run boundary) so it knows that any further synthesized values are invalid.

I can't see how to do this perfectly; for stored media, one could have a range in the mapping (when a segment end-time is known in advance). This isn't possible live, of course.



5. Finally, I've specified a different representation of SMPTE time code values for the AC-3 payload format than you have. There's an extension defined in SMPTE 339M for applying time code values to audio frames, which may not be synchronous with their associated video frames. The key difference is that the 339M format adds a 'sample' field after the 'frames' field. This indicates which PCM sample in the coded audio frame the (video) frame number in the time stamp is associated with.

That's important. Thanks. It needs thought. If the RTP clock is sample-accurate, then I think I'm OK, as I am associating the time-code with an RTP timestamp, which may not be the timestamp of any given frame.



* * *

Because of these points, I still think the QC application is best served
by including actual SMPTE time stamps associated with each audio frame
inside each RTP packet. It seems like the QC application is a special
case compared to the applications that would generally use your mapping.
Maybe there is a good way to handle the special case within the mapping.
I've already proposed a straightforward way to include it in the AC-3
payload format specification. Since it is optional, it can be omitted
from most streams and doesn't interfere with the general case.

Regards,
Brian

---
Brian Link
Dolby Laboratories, 100 Potrero Ave., San Francisco, CA 94103
phone: 415-558-0324  fax: 415-863-1373
email: bdl at dolby.com   alt email: link at ieee.org


-----Original Message-----
From: Dave Singer [mailto:singer at apple.com]
Sent: Thursday, February 10, 2005 4:25 PM
To: avt at ietf.org
Cc: Colin Perkins; lazzaro at cs.berkeley.edu; Link, Brian
Subject: Carrying SMPTE time-codes in RTP streams, discussion email

Here are some thoughts on carrying SMPTE time-code information in RTP.

First a brief background on SMPTE time-codes.

SMPTE time-codes count frames.  There are two common forms of
display:  either a simple counter, or what looks like a normal clock
value (hh:mm:ss.frame).  When the frame rate is truly integer, then this
can be a normal clock value, in that seconds tick by at the same rate as
the seconds we know and love.

However, NTSC video infamously runs slightly slower than 30
frames/second.  Some people call it 29.97 (which isn't quite right) and
some say that a frame takes 1001 ticks of a 30000 tick/second clock
(which is closer).  Be that as it may, SMPTE time codes count 30 of
these frames and deem that to make a second.

This causes a SMPTE time-code display to 'run slow' compared to
real-time.  To ameliorate this, a format called drop-frame is used.
Some of the frame numbers are skipped, so that the counter periodically
'catches up' (so some time-code-seconds actually only have 28 or 29
frames in them).
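
As a rough illustration of the drop-frame numbering described above: in
30 frame/second drop-frame time-code, frame numbers 00 and 01 are skipped
at the start of every minute except each tenth minute. The sketch below
converts a running frame count into a drop-frame label. The function name
is made up for illustration, and wrapping at 24 hours is deliberately not
handled.

```python
def frames_to_dropframe(frame_count):
    """Convert a count of NTSC frames to a drop-frame (hh, mm, ss, ff) label.

    Assumes the usual 30 frame/second drop-frame rule: two frame numbers
    are skipped each minute, except minutes 0, 10, 20, 30, 40 and 50.
    """
    frames_per_drop_minute = 30 * 60 - 2        # 1798 labels in a minute that skips
    frames_per_ten_minutes = 10 * 30 * 60 - 18  # 17982: nine of ten minutes skip

    tens, rem = divmod(frame_count, frames_per_ten_minutes)
    if rem < 1800:
        # Still inside the first minute of the block, which skips nothing.
        skipped = 18 * tens
    else:
        skipped = 18 * tens + 2 * ((rem - 1800) // frames_per_drop_minute + 1)

    # Add the skipped numbers back, then split as if counting 30 per second.
    n = frame_count + skipped
    ff = n % 30
    ss = (n // 30) % 60
    mm = (n // 1800) % 60
    hh = n // 108000
    return hh, mm, ss, ff


# The frame after 00:00:59;29 is labelled 00:01:00;02, not 00:01:00;00.
print(frames_to_dropframe(1799))  # (0, 0, 59, 29)
print(frames_to_dropframe(1800))  # (0, 1, 0, 2)
```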


So, what we desire is a system that allows us to associate a SMPTE
time-code with some media. Since in RTP all media has a clock already,
we can leverage that fact. If we treat the media as having 'runs' of
time in which the time-code is simply counting up, then the time-code
anywhere within a run can be calculated if you know
   1. the RTP timestamp of the start of the run
   2. the RTP timestamp where you want to know the time-code
   3. the time-code of the start of the run
   4. the counting rate and other parameters of the time-code.

My proposal is that we put periodic mappings between (1) and (3) into
RTCP packets, and provide (4) the 'setup' information out-of-band, for
example in SDP.

The setup information includes:
   (the timescale of the RTP stream, already provided)
   the duration, in that timescale, of a single frame-count in the
'frames' portion of the time-code
   the number of those frames that make a time-code-second
   the following booleans:
     is-NTSC-drop-frame:  should the usual 'left out numbers' of
drop-frame be applied or not?
     wrap-at-24-hours:  should the hours portion wrap from 23 to 0, or
keep counting up?
     allow-negative-time-codes:  are negative time-codes used in this
stream?
     display-time-code-as-counter:  should the display be an integer
frame-count, or hh:mm:ss.fr format?
     time-code-displayed:  is it intended that this time-code be
displayed somehow?

For example, if associated with a video track with the common time-scale
of 90000, then frame-duration of 3003 and frames-per-tc-second of 30
would yield a 'normal' SMPTE time-code for NTSC video.  Similarly values
of 3750 and 24 yield a time-code for 24 fps film content, and so on.
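
To make the run calculation concrete, here is a minimal sketch of how a
receiver might derive the time-code at an arbitrary RTP timestamp from
one mapping plus the setup parameters. The function and argument names
are illustrative only. It assumes a non-drop-frame, non-negative
time-code, does not wrap the hours, and takes the timestamp difference
modulo 2**32 because RTP timestamps are 32 bits wide.

```python
def timecode_at(rtp_ts, run_start_rtp, run_start_tc,
                frame_duration, frames_per_tc_second):
    """Return (hh, mm, ss, ff) at rtp_ts, given the mapping that starts a run.

    run_start_tc is the (hh, mm, ss, ff) time-code at run_start_rtp;
    frame_duration is the length of one frame in RTP timescale ticks.
    """
    hh, mm, ss, ff = run_start_tc
    fps = frames_per_tc_second
    start_frames = ((hh * 60 + mm) * 60 + ss) * fps + ff

    # 32-bit RTP timestamps wrap, so compute the elapsed ticks modulo 2**32.
    elapsed_ticks = (rtp_ts - run_start_rtp) & 0xFFFFFFFF
    frames = start_frames + elapsed_ticks // frame_duration

    seconds, ff = divmod(frames, fps)
    minutes, ss = divmod(seconds, 60)
    hh, mm = divmod(minutes, 60)
    return hh, mm, ss, ff


# NTSC example from the text: timescale 90000, frame-duration 3003,
# 30 frames per time-code-second; 45 frames after 01:00:00:00.
print(timecode_at(1000 + 45 * 3003, 1000, (1, 0, 0, 0), 3003, 30))
# (1, 0, 1, 15)
```

Truncating to whole frames means any RTP timestamp that falls inside a
frame gets that frame's time-code, which is one possible reading of the
proposal rather than something it spells out.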

Now, we put into an RTCP APP packet (or a new RTCP packet), a mapping
between an RTP time-stamp value and the time-code.  The RTP timestamp
and the time-code are each 32 bits;  the time-code is either a signed
counter value (if we're in counter format), or it is in the format
   hours(8) -- 0 to 255
   sign(1) -- 1 for negative, 0 for positive
   minutes(7) -- 0 to 59
   seconds(8) -- 0 to 59
   frames(8) -- 0 to (frames-per-tc-second - 1)

This establishes the time-code for all RTP times greater than or equal
to the one given, until a subsequent APP packet reestablishes the
mapping.  It's unfortunate that the sign is in the middle, but that
allows the hours to use the full range, and the minutes don't need to.
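
A small sketch of packing and unpacking the hh:mm:ss.fr form of this
32-bit field may help. It assumes the fields are laid out most
significant first in the order listed (hours in the top byte, then the
sign bit, then minutes, seconds and frames) and that the word is sent in
network byte order; neither is stated explicitly above, and the function
names are hypothetical.

```python
import struct


def pack_timecode(hours, minutes, seconds, frames, negative=False):
    """Pack a time-code into 4 bytes: hours(8) sign(1) minutes(7) seconds(8) frames(8)."""
    if not (0 <= hours <= 255 and 0 <= minutes <= 59
            and 0 <= seconds <= 59 and 0 <= frames <= 255):
        raise ValueError("time-code field out of range")
    word = ((hours << 24) | (int(negative) << 23) | (minutes << 16)
            | (seconds << 8) | frames)
    return struct.pack("!I", word)  # "!I": unsigned 32-bit, network byte order


def unpack_timecode(data):
    """Inverse of pack_timecode; returns the fields as a dict."""
    (word,) = struct.unpack("!I", data)
    return {
        "hours": word >> 24,
        "negative": bool((word >> 23) & 1),
        "minutes": (word >> 16) & 0x7F,
        "seconds": (word >> 8) & 0xFF,
        "frames": word & 0xFF,
    }


# With the sign in bit 23, -01:02:03.04 differs from +01:02:03.04
# only in the top bit of the second byte.
print(pack_timecode(1, 2, 3, 4).hex())                 # 01020304
print(pack_timecode(1, 2, 3, 4, negative=True).hex())  # 01820304
```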

* * *

It might be argued that we could set the initial mapping also in the
SDP, since RTCP packets might get lost.  But this means that the SDP now
has to have knowledge of the RTP random offset (nasty);  and if one puts
this APP packet into all sender reports, it's probably good enough.
Then if you don't have time-codes, you don't have audio-video-sync
either.

This associates the time-code with a particular media stream.  An
alternative would be to make it an RTP stream in its own right;  but the
data rate is so low, this seems egregious.  And by packing it inline, we
can do this in a backwards-compatible way for gateways etc. that already
handle dual-stream.

Comments?

--
David Singer
Apple Computer/QuickTime

-----------------------------------------
This message (including any attachments) may contain confidential
information intended for a specific individual and purpose.  If you are not
the intended recipient, delete this message.  If you are not the intended
recipient, disclosing, copying, distributing, or taking any action based on
this message is strictly prohibited.



--
David Singer
Apple Computer/QuickTime

_______________________________________________
Audio/Video Transport Working Group
avt at ietf.org
https://www1.ietf.org/mailman/listinfo/avt