RE: [AVT] draft-ietf-avt-rtp-vc1-05.txt
Anders,
>> T1.
>> Comment: The very first frame of a stream is naturally a non-B frame.
>> The draft lacks a rule for how the decode time of the very first frame
>> SHALL be set.
>
>It can be set as if the stream has gone on forever. This is
>suggested by Figure 1. Figure 1 shows frame I0 appearing at
>time 2 in the coded order. So, the decode time could be 2 and
>that is equal to the presentation time of a hypothetical
>previous frame I(-1).
>I have verified with my colleagues that it is also acceptable
>to make the decode time of the very first frame equal to its
>presentation time.
>
>It would probably be good to have a sentence that explains how
>to handle the case of the first frame. I prefer the second
>option (DT=PT) as it is easiest to implement and describe.
Would the second option (DT=PT) still be HRD compliant?
Either option is fine with me as long as the solution remains VC-1 (and
HRD) compliant.
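To make the two options concrete, here is a minimal Python sketch; it is
not text from the draft. It assumes the rule discussed in this thread: a
B-frame is decoded at its presentation time, and a non-B frame is decoded
at the presentation time of the previous non-B frame in coded order. The
Frame type, the decode_times function, the first_dt parameter and the
frame timeline are all made up for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Frame:
    name: str   # label such as "I0" or "B1" (illustrative only)
    is_b: bool  # True for B-frames, False for I/P (non-B) frames
    pt: int     # presentation time in arbitrary frame-period units


def decode_times(coded_order: List[Frame],
                 first_dt: Optional[int] = None) -> List[int]:
    """Derive decode times (DT) from presentation times (PT).

    Assumed rule (as discussed in this thread, not quoted from the draft):
      * B-frame:     DT = its own PT
      * non-B frame: DT = PT of the previous non-B frame in coded order
    For the very first non-B frame there is no previous one, so:
      * first_dt is None -> option 2, DT = PT
      * first_dt given   -> option 1, DT = PT of a hypothetical I(-1)
    """
    dts: List[int] = []
    prev_ref_pt: Optional[int] = None
    for frame in coded_order:
        if frame.is_b:
            dts.append(frame.pt)
            continue
        if prev_ref_pt is None:
            dts.append(frame.pt if first_dt is None else first_dt)
        else:
            dts.append(prev_ref_pt)
        prev_ref_pt = frame.pt
    return dts


# Made-up timeline in coded order: I0 P3 B1 B2
example = [
    Frame("I0", False, 1),
    Frame("P3", False, 4),
    Frame("B1", True, 2),
    Frame("B2", True, 3),
]

if __name__ == "__main__":
    print("option 2 (DT=PT):       ", decode_times(example))
    print("option 1 (hypothetical):", decode_times(example, first_dt=0))
```

In this made-up timeline, DT=PT gives the first two non-B frames the same
decode time, while the hypothetical-previous-frame option keeps them one
frame period apart. Whether that matters for the HRD is exactly the
question above.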
>> T2.
>> Comment: As far as we can see, the only use for decode time is specified
>> in the Hypothetical Reference Decoder section of the VC-1 spec. HRD is
>> something that decoders need not implement. Moreover, as specified in
>> section 4.3, the decode time is something that can be derived from
>> presentation times.
>
>But some decoders may have implemented the HRD. They also
>can't compute the decode time in the case the previous RTP
>packet was lost. These reasons alone should be good enough.
>Furthermore, some decoders may prefer to use the DT to
>determine buffer occupancy levels and for pacing of packets
>prior to decoding. Also, because the DT for reference frames
>is a function of the PT of the previous reference frame, if
>there were lost RTP packets receiver implementations could
>check DT to determine if any of the lost RTP packets contained
>a reference frame.
These are good reasons, and it would be good to have them documented as
informative text in the payload specification, as some of them are
probably not obvious to implementers.
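As an example of what such informative text could illustrate, here is a
rough Python sketch of the last point (checking DT after a packet loss).
It assumes that the DT of a reference frame equals the PT of the previous
reference frame in coded order; the function name and its arguments are
hypothetical and not taken from the draft.

```python
from typing import Optional


def reference_frame_was_lost(last_ref_pt: Optional[int],
                             next_ref_dt: int) -> bool:
    """Guess whether a reference frame went missing during a loss burst.

    last_ref_pt: PT of the last reference frame received before the loss,
                 or None if no reference frame has been received yet.
    next_ref_dt: DT signalled for the first reference frame received
                 after the loss.

    Assumption: DT(reference frame) == PT(previous reference frame in
    coded order). If the two values differ, at least one reference frame
    in between was not received.
    """
    if last_ref_pt is None:
        return False  # nothing to compare against yet
    return next_ref_dt != last_ref_pt
```

A receiver could run this check on the first reference frame that arrives
after a gap in RTP sequence numbers, and skip heavier error handling when
only B-frames were lost.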
>> T5.
>> Comment: The sentence "(or the number of layers subscribed for a layered
>> multicast session)" suggests that this payload specification can be used
>> for layered multicast. We don't think this payload specification is
>> capable of layered multicast.
>
>The wording is consistent with other RFCs that also don't
>explicitly define how layered multicast will work. See for
>example RFC 4175 (Uncompressed Video). It uses the same
>wording, without any explanation for how layered multicast
>would actually work with that payload format.
>
>I think the "Congestion Control" section is similar to the
>"Security Considerations" section, in that it tries to
>describe how to handle unlikely scenarios. (In some RTP
>payload format specs, the treatment of congestion control is
>actually included in the Security Considerations section
>itself.) The intent is that the section should provide clear
>normative language for the appropriate behavior (in this case,
>unsubscribing to a multicast group), but detailed discussions
>about implementations are out of scope, in my opinion.
>
>As you mentioned in option #2, using this payload format with
>layered multicast is not completely impossible. It could be
>done in conjunction with some other RTP payload format or with
>some new RTP header extension. I think that is enough
>justification for keeping the sentence in there, just in case
>someone will attempt it in the future.
>But it is simply out of scope to go into specific details
>about how layered multicast can or cannot be implemented.
The use of layered multicast is obvious for some payload specifications
and therefore need not be explained there. Unfortunately, we think that
this is not the case for VC-1. We would like to avoid the situation in
which a sender decides to send pre-encoded VC-1 content using layered
multicast and splits the VC-1 stream into two layers in the most obvious
way, i.e. B-frames in the enhancement layer and reference frames in the
base layer, without modifying the original values of FRMCNT or TFCNTR.
Receivers that subscribe only to the base layer would then wrongly infer
frame losses from the discontinuous FRMCNT or TFCNTR values and freeze
the picture, or perform other error handling, for losses that never
occurred.
We see no harm in clarifying the layered multicast issue in the payload
specification.
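The failure mode can be shown with a small Python sketch. It uses a toy
per-frame counter that increments by one for every frame; the real FRMCNT
and TFCNTR semantics and field widths are defined in the VC-1 spec and
are not reproduced here. The function names, the modulus value and the
frame sequence are all hypothetical.

```python
from typing import List, Tuple

# (frame type, toy frame counter); frame type is "I", "P" or "B"
Frame = Tuple[str, int]


def split_layers(stream: List[Frame]) -> Tuple[List[Frame], List[Frame]]:
    """Naive split: reference frames to the base layer, B-frames to the
    enhancement layer, counters left untouched."""
    base = [f for f in stream if f[0] != "B"]
    enhancement = [f for f in stream if f[0] == "B"]
    return base, enhancement


def counter_gaps(frames: List[Frame],
                 modulus: int = 256) -> List[Tuple[int, int, int]]:
    """Naive loss detector: report (previous, current, missing) for every
    pair of consecutive frames whose counters are not adjacent.
    The modulus is an assumed wrap-around value for the toy counter."""
    gaps = []
    for (_, prev), (_, cur) in zip(frames, frames[1:]):
        missing = (cur - prev - 1) % modulus
        if missing:
            gaps.append((prev, cur, missing))
    return gaps


# Made-up stream, listed in counter order
stream = [("I", 0), ("B", 1), ("B", 2), ("P", 3),
          ("B", 4), ("B", 5), ("P", 6)]
base_layer, enhancement_layer = split_layers(stream)

if __name__ == "__main__":
    print("full stream gaps:", counter_gaps(stream))
    print("base layer gaps: ", counter_gaps(base_layer))
```

A receiver of the full stream sees no gaps, while a base-layer-only
receiver running the same check reports two missing frames between every
pair of reference frames, although nothing was lost in the network.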
Best Regards,
Miska
_______________________________________________
Audio/Video Transport Working Group
avt at ietf.org
https://www1.ietf.org/mailman/listinfo/avt