
[AVT] Post WG last call edits on the H.264 payload format: Technical change



Hi AVTers and H.264 Authors,

In addition to my previous email on editorial changes to the RTP payload format for H.264, some of the authors and Colin have discussed the large set of parameters for deinterleaving. So far we have not agreed on whether anything should be done about it. To save time and keep the process open, the discussion will be held on the AVT reflector so that everyone can give their view.

Currently we have the following interleaving-related parameters:
- sprop-interleaving-depth
- sprop-deint-buf-req
- deint-buf-cap
- sprop-init-buf-time
- sprop-max-don-diff

Analyzing them, there seems to be significant overlap between some of the parameters. I would personally suggest that we remove some of them, to avoid uncertainty about which to use and to make implementation simpler.

The sprop-init-buf-time parameter (specifying the initial buffering time in milliseconds) provides a subset of the functionality of sprop-max-don-diff. The max-don-diff parameter can be used both initially and during the session to determine the number of NALUs to buffer before releasing them from the deinterleaving buffer. It can thus provide the initial buffering time: the receiver performs initial buffering until the first packet pops out of the buffer. The max-don-diff mechanism is also more robust and can handle slow-start-like behavior that makes the time before playout starts a bit longer than anticipated.

The sprop-interleaving-depth and sprop-deint-buf-req parameters also overlap, as both specify the smallest buffer space required to guarantee that deinterleaving can be done. However, they are expressed in different units: int-depth is in number of NALUs and buf-req is in bytes. The buf-req parameter is only intended to provide the size in bytes, while int-depth also gives the receiver another measure of when a NALU can be released from the buffer.

The sprop-interleaving-depth and sprop-max-don-diff parameters both provide values that allow a receiver to determine when a NALU can be released from the deinterleaving buffer. The interleaving-depth parameter is rather inflexible: the buffer must always hold the same number of NALUs before any are released. This does not allow a sender to reduce the amount of buffering if fewer NALUs than the depth are used in the interleaving pattern. The max-don-diff mechanism does allow that: because the DONs can be assigned sparsely, the number of NALUs in the interleaving buffer for a max-don-diff value of N is anything between 1 and N, depending on how the DONs are assigned.
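
To illustrate how DON assignment affects buffering under a fixed max-don-diff, here is a minimal Python sketch. It is not the normative procedure from the draft. It assumes a simplified release rule (a NALU leaves the buffer once its DON trails the highest DON received so far by more than max-don-diff), it ignores DON wraparound, and the function name and return values are hypothetical.

```python
import heapq

def deinterleave(dons_in_transmission_order, max_don_diff):
    """Return (release_order, peak_occupancy) for a stream of DON values.

    Simplified, assumed rule: a NALU is released once the highest DON
    received so far exceeds its own DON by more than max_don_diff.
    DON wraparound is deliberately ignored in this sketch.
    """
    buffer = []        # min-heap of DONs waiting in the deinterleaving buffer
    highest = None     # highest DON received so far
    released = []      # DONs in the order they leave the buffer
    peak = 0           # most NALUs left waiting after processing an arrival

    for don in dons_in_transmission_order:
        heapq.heappush(buffer, don)
        highest = don if highest is None else max(highest, don)
        # Release everything that now trails the newest DON by too much.
        while buffer and highest - buffer[0] > max_don_diff:
            released.append(heapq.heappop(buffer))
        peak = max(peak, len(buffer))

    # End of stream: drain whatever is left, in DON order.
    while buffer:
        released.append(heapq.heappop(buffer))
    return released, peak

# Dense DONs, interleaved in groups of four: four NALUs must wait.
dense_order, dense_peak = deinterleave([3, 2, 1, 0, 7, 6, 5, 4], max_don_diff=3)

# Sparse DONs (step of 4), sent in order: one NALU waits at a time.
sparse_order, sparse_peak = deinterleave([0, 4, 8, 12], max_don_diff=3)
```

With the same max-don-diff of 3, the dense pattern leaves four NALUs waiting in the buffer while the sparse assignment leaves only one, which is the flexibility a fixed interleaving depth cannot express.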

With all this overlap between the parameters, one might consider what simplifications can be made. In my opinion we can remove sprop-init-buf-time as being totally redundant and inferior to sprop-max-don-diff.

Whether we should remove any of the other parameters is a more difficult question. Here I see a couple of alternatives:

A. We remove sprop-interleaving-depth and sprop-deint-buf-req and rely only on max-don-diff to determine the maximum number of NALUs that can be present in the deinterleaving buffer. The actual memory requirement for this buffer will be a bit vague, and the client can only make an estimate based on the likely maximum size of the NALUs. The deint-buf-cap parameter is kept to allow an offer or answer to state the memory capability for de-interleaving in bytes, with the sender ensuring that it is met.

B. We remove sprop-interleaving-depth. The removal of NALUs from the buffer is controlled completely by max-don-diff. Whether the receiver has sufficient resources is checked against sprop-deint-buf-req, and is thus expressed in bytes. The receiver will also know the maximum number of NALUs through the max-don-diff parameter.
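
The difference between the two alternatives in how a receiver would check its memory can be sketched roughly as follows, in Python. The function names and the assumed maximum NALU size are hypothetical and only serve the illustration. Under A the receiver has to guess a per-NALU size, while under B it compares two byte counts directly.

```python
def fits_alternative_a(max_don_diff, assumed_max_nalu_bytes, deint_buf_cap):
    """Alternative A: no byte requirement is signalled.

    The receiver estimates the requirement as (maximum number of NALUs,
    taken here to be max_don_diff) times a guessed maximum NALU size.
    The guess is the receiver's own assumption, not a signalled value.
    """
    estimated_bytes = max_don_diff * assumed_max_nalu_bytes
    return estimated_bytes <= deint_buf_cap, estimated_bytes

def fits_alternative_b(sprop_deint_buf_req, deint_buf_cap):
    """Alternative B: the sender states the requirement in bytes."""
    return sprop_deint_buf_req <= deint_buf_cap

# Illustrative numbers only: 32 NALUs, guessed 1500 bytes each, 40000-byte buffer.
ok_a, estimate = fits_alternative_a(32, 1500, 40000)

# Same receiver, but the sender signals that 30000 bytes are enough.
ok_b = fits_alternative_b(30000, 40000)
```

With these made-up numbers the estimate under A comes to 48000 bytes and the check fails, while the exact figure under B passes, which shows how vague the memory requirement becomes when only max-don-diff is available.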

These changes will affect the MIME type, the Offer-Answer section, and the informative chapter describing proposed methods for handling the de-interleaving buffer.

I think we should reduce the number of parameters, despite the fact that it will delay the request to the IESG by at least one and a half weeks, assuming that a one-week WG last call is sufficient.

Please provide comments on this, preferably stating whether your preference is alternative A, B, or neither.


Cheers

Magnus Westerlund

Multimedia Technologies, Ericsson Research EAB/TVA/A
----------------------------------------------------------------------
Ericsson AB                | Phone +46 8 4048287
Torshamsgatan 23           | Fax   +46 8 7575550
S-164 80 Stockholm, Sweden | mailto: magnus.westerlund at ericsson.com



_______________________________________________
Audio/Video Transport Working Group
avt at ietf.org
https://www1.ietf.org/mailman/listinfo/avt