
Re: [AVT] Post WG last call edits on the H.264 payload format: Technical change



Hi Miska,

It seems that I had lost track of some of our more exotic use cases. See further comments below.

miska.hannuksela at nokia.com wrote:

Magnus, all,

Live broadcast is interrupted by pre-encoded content such as commercials from time to time. The first intra picture of a pre-encoded clip is transmitted in advance to ensure that it is readily available in the receiver. At the time of transmitting the first intra picture, the originator does not exactly know how many NAL units are going to be encoded before the first intra picture of the pre-encoded clip follows in decoding order. Thus, the values of DON for the NAL units of the first intra picture of the pre-encoded clip have to be estimated at the time of transmitting them and gaps in values of DON may occur.

In this example, the first intra picture of the pre-encoded commercial must be given a large DON value to make sure that it is not decoded too early. The value of DON for this first intra picture must also be such that it does not cause pictures to be flushed from the de-interleaving buffer. Thus, the use of sprop-interleaving-depth is required in this use case.

I do agree that with the interleaving-depth parameter it is possible to have a rather small buffer and then put a few NALUs in there that can be considerably delayed.
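As a rough illustration of the point above, here is a minimal Python sketch of a depth-limited de-interleaving buffer. It assumes a simplified release rule (pass on the lowest-DON unit whenever the buffer holds more than `depth` units) and ignores DON wrap-around; the function name, the DON values, and the depth of 3 are hypothetical and not taken from the draft.

```python
import heapq

def deinterleave_by_depth(dons, depth):
    """Simplified model: keep at most `depth` NAL units buffered.

    Returns (DONs passed to the decoder in order, DONs still buffered).
    Assumption: DON values do not wrap around.
    """
    heap = []
    released = []
    for don in dons:
        heapq.heappush(heap, don)
        # Release the unit with the lowest DON while the buffer is over depth.
        while len(heap) > depth:
            released.append(heapq.heappop(heap))
    return released, sorted(heap)

# DON 500 stands in for the intra picture of the commercial sent in advance.
released, waiting = deinterleave_by_depth(
    [100, 101, 500, 102, 103, 104, 105], depth=3)
```

In this model the early unit with DON 500 simply waits in the buffer while the live units pass through in decoding order, so the buffer stays small even though one unit is delayed for a long time.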



sprop-max-don-diff makes the de-interleaving process vulnerable to transmission losses in some cases. For example, if the DONs of the transmitted stream are ..., 100, 101, 102, 200, 103, 104, 105, ..., and sprop-max-don-diff is e.g. 50, the NAL unit with DON 200 is supposed to flush the de-interleaving buffer. However, if that NAL unit is lost during transmission, the de-interleaving buffer keeps filling up and may overflow.

Yes, if only a single NALU in the near term will push out a large block of data. I would recommend using NALUs in more than a single packet to push out data. However, I don't think that interleaving-depth is a sure safeguard in this condition. Suppose you use max-don-diff and have a stream where the number of NALUs that should be in the buffer fluctuates between 10 and 30 depending on the position in the stream. Then, if a loss means that data is not pushed out of the buffer, the interleaving-depth parameter will only make a difference if one is in a part of the stream where the buffer holds close to 30 NALUs. Otherwise, the loss can still lead to somewhat late decoding.
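The loss scenario with DONs 100, 101, 102, 200, 103, 104, 105 and sprop-max-don-diff of 50 can be sketched in Python as follows. This is a simplified model, not the normative de-interleaving process: it assumes that a buffered unit is released once the greatest DON received so far exceeds its own DON by more than max-don-diff, it ignores DON wrap-around and sprop-interleaving-depth, and the function name is hypothetical.

```python
def simulate_max_don_diff(dons, max_don_diff):
    """Return (peak buffer occupancy, released DONs, DONs still buffered)."""
    buffer = []
    released = []
    peak = 0
    greatest = None
    for don in dons:
        buffer.append(don)
        peak = max(peak, len(buffer))
        greatest = don if greatest is None else max(greatest, don)
        # Release every unit that lags the greatest DON by more than the limit.
        ready = sorted(d for d in buffer if greatest - d > max_don_diff)
        for d in ready:
            buffer.remove(d)
        released.extend(ready)
    return peak, released, sorted(buffer)

# All packets arrive: DON 200 flushes the units that precede it.
with_200 = simulate_max_don_diff([100, 101, 102, 200, 103, 104, 105], 50)

# The packet carrying DON 200 is lost: nothing is ever released.
without_200 = simulate_max_don_diff([100, 101, 102, 103, 104, 105], 50)
```

When DON 200 arrives, the units 100 to 105 are all released and the buffer never holds more than four units. When it is lost, nothing triggers a release in this model and the buffer grows with every unit received, which is the overflow risk described above.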


In conclusion, I think there is good reason to keep sprop-interleaving-depth.


sprop-init-buf-time is useful at least in the following cases:
- The de-interleaving buffer can also be used to smooth out variations in transmission scheduling. Video decoder input buffers are not guaranteed to handle a non-constant input bitrate (or, to be more specific, any rate of input other than that specified in the hypothetical reference decoder of the video coding standard). Thus, to be on the safe side, if a server applies any transmission scheduling other than constant bitrate, it should make sure that the receiver has a buffer, e.g. a de-interleaving buffer, before the decoder. sprop-max-don-diff cannot be used to control the initial buffering before the decoder (as data units may be sent in decoding order).

- As demonstrated above, sprop-max-don-diff cannot be used in some use cases. Then, sprop-init-buf-time and sprop-interleaving-depth become the only parameters to control the initial buffering.

I can actually combine max-don-diff with interleaving-depth to allow both to be used at the same time. When solving the above use case, one needs to put a maximum limit on the span of the DON increase before the data actually must be decoded. In this use case one can simply set max-don-diff equal to that value. Then in most cases the interleaving-depth will ensure that data is pushed out of the buffer. But when one initially wants to start with a certain amount of media, one simply chooses the DON numbering scheme that results in a sufficient amount of data being buffered before decoding starts, even if that is less than interleaving-depth requires.


- The file containing the stream also contains a hint track which always includes a transmission schedule. If a streaming server does not modify the transmission schedule, the value of sprop-init-buf-time is valid. Otherwise (the server applies fine-grain rate control / transmission scheduling), sprop-init-buf-time gives an approximation of the initial buffering for the receiver.

Yes, for networks where one does not need congestion control, this is true.


Moreover, sprop-init-buf-time is clearly specified and optional to use (in sender and receiver). It gives the receiver a possibility for shorter initial buffering in some use cases than what can be achieved with sprop-max-don-diff. Therefore, I don't see any harm in keeping the text regarding sprop-init-buf-time unchanged.

I am a little concerned over how the receiver is going to know when init-buf-time is only hinting, or when it is required to be followed.


In conclusion, I don't really like this parameter; it needs to be used with great caution. In most use cases max-don-diff will be superior and actually more robust against errors. However, I do see that there is at least one use case where it makes some sense.

[snip]


Based on the motivations above, I propose keeping the Internet Draft unchanged.


Accepted. I think it has been sufficiently demonstrated that there is a theoretical need for all the parameters. However, it will be interesting to see whether all these special use cases are actually used. I think we will have to wait until we take this to Draft Standard to prune any unnecessary functionality.


I will go ahead and provide an editorial update according to the previous email.

Cheers

Magnus Westerlund

Multimedia Technologies, Ericsson Research EAB/TVA/A
----------------------------------------------------------------------
Ericsson AB                | Phone +46 8 4048287
Torshamsgatan 23           | Fax   +46 8 7575550
S-164 80 Stockholm, Sweden | mailto: magnus.westerlund at ericsson.com
