[AVT] RE: Comments on draft-ietf-avt-rtp-3gpp-timed-text-04.txt
Hi Magnus,
many thanks for your comments. I moved the important issues up front, so you
can ignore the editorial ones after the hyphen line.
> 4. Section 2.3.4: I think the tables of possible combinations to send
> has a problem. I would think that sending a packet with first a type 2
> and then a type 3 fragment would be desirable. It seems the most likely
> fragmentation scenario is after all that the modifiers are to big. In
> that case it seems desirable to start with the text, and then as much
> modifiers that fits the MTU.
>
What you propose is only possible if the fragments belong to different text
samples. For fragments of the same text sample it is not possible since the
TYPE 3 has a duration field and the timestamp calculation wouldn't work. The
reason for including the SDUR is that the receiver should know when these
fragments can be discarded. Otherwise, the receiver buffer may contain
fragments with an unknown expiration time.
One can argue that a fragment of the next packet could be included then,
e.g., a TYPE3 from text sample 2 with a TYPE2 of text sample 1, but for
simplicity I did not want to do this: the next or previous sample may
actually not need fragmentation or even not have any modifiers, so that at
the end it will probably make no difference and it does seem more complex.
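To illustrate why the receiver needs the SDUR on every fragment, here is a
minimal sketch of buffer clean-up. It assumes that a fragment expires at its
RTP timestamp plus SDUR (both in clock ticks) and that timestamps are 32-bit
and wrap around. The names Fragment, expiry and purge are hypothetical and
not taken from the draft.

```python
from dataclasses import dataclass

RTP_TS_MODULUS = 2**32  # RTP timestamps are 32-bit and wrap around


@dataclass
class Fragment:
    timestamp: int  # RTP timestamp of the text sample, in clock ticks
    sdur: int       # sample duration (SDUR), in the same clock ticks


def expiry(frag: Fragment) -> int:
    """Assumed expiration time: timestamp + SDUR, modulo 2**32."""
    return (frag.timestamp + frag.sdur) % RTP_TS_MODULUS


def ticks_until(target: int, now: int) -> int:
    """Signed distance from now to target, safe across wrap-around."""
    half = RTP_TS_MODULUS // 2
    return ((target - now + half) % RTP_TS_MODULUS) - half


def purge(buffer: list[Fragment], now: int) -> list[Fragment]:
    """Keep only the fragments whose assumed expiry is still in the future."""
    return [f for f in buffer if ticks_until(expiry(f), now) > 0]
```

Without the SDUR, expiry() could not be computed for a lone fragment, and
purge() would have no basis for dropping it.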
> 5. Section 2.3.4:
> "Note that payloads MAY also be empty as a special case for TYPE 1
> units."
>
> I think I know the motivation behind this, but it would be good if the
> text included a motivation why this is a good feature.
>
OK. There are, at least, two reasons to send these samples:
- to 'clear' the screen of text
- to put an end to samples of unknown duration
> 6. Section 3:
> "The initial value MUST be randomly determined."
>
> It is normatively defined as SHOULD in RTP, are there further reasons to
> make this a MUST for this payload format? If not I think you should
> remove the whole sentence, else you need to motivate it. If SHOULD is
> fine and you feel it important to point this out, then reformulate to
> point to the RTP spec.
>
No, actually I think the requirement level used in RTP is fine.
> And the thing that many may not think of is the measuring through RTCP.
>
??
You mean I should mention that synchronisation of media is done with RTCP?
Do I need to mention this?
> 8. Section 3:
> "Thus, the
> encoder could sample text every 1s (yielding RTP payloads of
> ~14-18 bytes), encapsulate the current and last two samples in
> every RTP packet (accounting to an IP packet size of 98 bytes)
> and send the packet six times, thus exhausting the available bit
> rate and increasing packet loss resilience."
>
> This sentence and actually the whole paragraph is a bit hard to
> understand. I got really confused when I come to "send the packet six
> times". I guess you are trying to say that if you have 4.6 Kbps you can
> send a sample every second, and also send 5 redundant copies, why
> staying within the bandwidth.
Yeah, that's right!
> the conclusion that you had 4.6Kbps?
4.6 Kbps is not a conclusion, it is what I assume, just an example bitrate.
> If you send one 576 byte packet
> every 60 seconds, then you would use 76.8 bits/s. So the argumentation
> seems to be comparing apples and pears.
>
What I am trying to say is that one should play with the parameters a bit to
come up with a packetization that allows for redundancy and meets the scenario
requirements. Given the requirements in the example, the only solution is to
reduce the packet size to 98 bytes. This gives a high level of packet
redundancy, but it is not a real-world example.
I'll try to restructure the text to make it clearer.
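The arithmetic behind the two cases discussed above can be checked with a
small sketch. It assumes that "send the packet six times" means six 98-byte
IP packets per second and that the bit rate is counted on IP packet sizes.
The function name bitrate_bps is hypothetical.

```python
def bitrate_bps(ip_packet_bytes: int, packets: int, interval_s: float) -> float:
    """Average bit rate for `packets` IP packets of the given size per interval."""
    return ip_packet_bytes * 8 * packets / interval_s


# One text sample per second, the 98-byte packet sent six times.
redundant = bitrate_bps(98, packets=6, interval_s=1)

# One 576-byte packet every 60 seconds, as in the quoted comment.
single_large = bitrate_bps(576, packets=1, interval_s=60)

print(f"{redundant:.0f} bit/s with six 98-byte packets per second")
print(f"{single_large:.1f} bit/s with one 576-byte packet per minute")
```

The first case comes to 4704 bit/s, which is about 4.6 Kbps if K is taken as
1024 (about 4.7 if K is 1000). The second comes to 76.8 bit/s.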
> 9. Section 3.1.2:
> "Note that also empty text samples are considered whole text samples,
> although they do not contain sample contents. In this particular
> case, TYPE 1 units MUST NOT include any sample contents and the LEN
> field SHALL have a value of 8 (0x0008). Otherwise, the LEN field
> SHALL be always greater than 8 (0x0008)."
>
> Here is one place where some motivation why to send empty could be
> included. Also I think the normative text is maybe a bit to much. It
> seems rather obvious that an empty text sample will not contain any
> sample contents and will have a length of 8. If some text should be
> written on this, I think an informative sentence is better:
> "A Type 1 units without sample contents will have LEN field value of 8
> (0x0008).
>
You mean better than the SHALL above?
>
> 10. Section 3.1.2:
> "Thus, this feature
> MUST be limited to those sample descriptions that provide a set of
> minimum default format settings."
>
> I think this is a SHOULD or RECOMMENDED rather than MUST. I think there
> are clearly applications where the best channel to use, is out-of-band.
> It might be so that the sample descriptors are pre defined, and the
> out-of-band mechanism is actually hard coded into the implementation,
> thus having no problem with band-width etc.
>
OK, thanks for the hint.
> 11. Section 3.1.2:
> "The default start value MUST be 129. "
>
> What is the meaning with the sentence. What is the start value?
> Is what
> you mean the following: "When assigning static sample descriptions SIDX
> values, they SHOULD start at 129."
Yes except for the SHOULD, I think..
> I don't think MUST is possible to
> support, and seems to provide no benefit.
>
Why? When reading out of a 3GP file, you can rewrite the SIDX anyway, and
when encoding live, putting 129 is the same as putting any other value there.
Am I missing anything?
The original thought was to make it a MUST, so that you have the whole
interval available. OK, you will never use all 126 available values, but
there is no reason to make the interval shorter. On the other hand, if you make
the start value random, you might get a value of 254, which allows just 1 static
sample description, and we didn't define a wrap-around mechanism for static ones
because this is really not necessary. Why not just fix it and get rid of the
problems?
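A minimal sketch of the counting argument, assuming the static SIDX range is
129 to 254 inclusive (which matches the 126 values and the "254 allows just 1"
case mentioned above) and that there is no wrap-around. The names
static_slots, SIDX_STATIC_MIN and SIDX_STATIC_MAX are hypothetical.

```python
SIDX_STATIC_MIN = 129  # proposed fixed start value
SIDX_STATIC_MAX = 254  # assumed upper end of the static range


def static_slots(start: int) -> int:
    """Number of static sample descriptions usable from `start`, no wrap-around."""
    if not SIDX_STATIC_MIN <= start <= SIDX_STATIC_MAX:
        raise ValueError(f"start must be in {SIDX_STATIC_MIN}..{SIDX_STATIC_MAX}")
    return SIDX_STATIC_MAX - start + 1


print(static_slots(129))  # fixed start: the whole interval is available
print(static_slots(254))  # unlucky random start: a single value is left
```

With a fixed start of 129 the function returns 126. With a start of 254 it
returns 1.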
> 18. Section 7.1: brand, cbrand, mver:
>
> As you say they are provided for information. Is there any useful reason
> for a receiver to get this information?
>
I guess not... even if a receiver wants to store the received data, it need
not necessarily be stored in any 3GPP profile.
I will remove them.
> 19. Section 8.2.1:
> "This means that an offerer using these
> parameters only specifies which values are going to be used for the
> sent stream."
>
> I hope you are aware of that you are changing the direction of how
> parameters normally work in offer-answer.
I am not sure what you mean. This may need further discussion depending on
the answer to the one below.
> I agree that it is difficult
> to get right.
You say it :)
> I toiled rather much with the H.264 offer answer section,
> and did basically define two versions of some parameters. The reason is
> that there is cases where the receiver would like to specify that it
> want to receive a timed text that fits a certain screen size. While
> often it is the sender that will declare what format the stream going to
> be delivered will have. So the question if this functionality is needed.
>
Let me answer you with a counter question ;) Typically, text and video will
go together in many applications: in video applications, is it usual/normal
to negotiate the picture size for video instead of choosing from a list of
possibilities in an SDP (similar to example 10.2 in RFC 3264)? If
negotiation is the common thing because there are so many different possible
sizes, it makes sense to let these parameters express preference, similar to
what is done with the parameter sets included in the h264 profile parameter
(right?). Otherwise if the options are typically a few I can keep it as
is. What do you think?
>
> I think it is getting there.
>
Nice to hear that. Thanks again,
Jose
----------------------------------------------------------------------
> -----Original Message-----
> From: Magnus Westerlund [mailto:magnus.westerlund at ericsson.com]
> Sent: Friday, July 23, 2004 5:51 PM
> To: Jose Rey; matsui.yoshinori at jp.panasonic.com; IETF AVT WG
> Subject: Comments on draft-ietf-avt-rtp-3gpp-timed-text-04.txt
>
>
> Hi Jose,
>
> Having reviewed the document I have the following comments:
>
> 1. Section 2.2:
> "However, sending them in-band is easier and more efficient."
> I think that this section is very positive in regards to in-band. I
> think the downside should be pointed out. For example by adding to the
> above sentence: "..., but may not be as reliable." Also the easier and
> more efficient, is dependent on the actual protocols used for
> out-of-band. SDP with base64 is of course not as efficient, but if one
> has something that uses binary directly, then the efficiency should be
> equal or better.
>
OK
> 2. Section 2.3.3: "Since fragments are sent in own RTP packets
> the overhead may be considerable if fragmentation is done too
> often."
> I think it is not a matter of how often it is done, is rather how it is
> done. If you only fragment at MTU boundaries and put fragments as
> efficient as possible, then it shouldn't be more overhead, than for
> using IP fragmentation.
>
OK
> 3. Section 2.3.3:
> " o modifier and text string fragments SHOULD be protected against
> packet losses, i.e. using FEC [7], retransmission [11], repetition
> (Section 4) or an equivalent technique. This minimizes the
>
> Rey & Matsui [Page 9]
> Internet Draft RTP Payload Format for 3GPP Timed Text July 19, 2004
>
>
>
> o an additional requirement when fragmenting text samples is that "
>
> The above is a quotation from the download file. It seems to be missing
> some lines. Look at the last line of page 9 which ends: "This minimizes
> the" And on the next page nothing continues the sentence.
>
...effects of packet loss.
> 7. Section 3:
> "For live streaming an appropriate timestamp clockrate SHALL be used.
> A default value of 1000 Hz is RECOMMENDED. This value should provide
> enough timing resolution for synchronizing text with other media and
> expressing the duration of text samples. Other clockrates MAY be
> used. Timestamp clockrates MUST be signaled by out-of-band means at
> session setup, e.g. using the "rate" attribute in SDP. See Section 8
> for details. "
>
> I think you should ad to this section a informative sentence to point
> out the below fact defined in RTP, see section 5.1:
>
> "The resolution of the clock MUST be sufficient for the desired
> synchronization accuracy and for measuring packet arrival jitter
> (one tick per video frame is typically not sufficient)."
>
This text addresses the point well. I'll include it.
> 12. Section 3.1.3:
>
> "This header type is used to transport text string fragments:"
>
> I think the introducationary text may be a bit more clear. That it may
> contain a whole text string, or a fragment of it. And that it shall not
> contain any modifier data.
OK
>
> 13. Section 5:
> "Applications implementing this payload format SHALL implement
> congestion control."
>
> Although correct for all users of shared, or unmanaged, non QoS handling
> networks, there are cases where the used network, does provide
> functionalities that enables a sender to not have congestion control. I
> would recommend that you remove the quoted sentence, and rely on the
> remaining text.
OK
>
> 14. Section 6.1:
>
> "Any SDP description containing a 3GPP timed text stream MUST include
> the parameters listed above."
>
> What parameters are you referring to. It is ambiguous, and I definitely
> don't like this when it is normatively written. Please clarify what
> parameters that this statement applies for.
>
OK.
> 15. Section 7.1:
> The list of parameters use different styles. "rate: the ..." and then
> "sver=<Z1 ..." please be consistent.
>
OK
> 16. Section 7.1:
> "This MAY be followed by a comma-separated list of
> increasingly older versions that SHOULD be used as alternatives."
> How is older defined in respect to these multi part version numbers, and
> secondly, you do define that it is in a falling preference order. That
> may not be the same as "older" order.
>
OK
> 17. Section 7.1: spldesc parameter:
> "This corresponds to the default case. This is the
> default case."
> I think it is sufficient with one sentence saying that this is default.
>
;) ..sure
> 20. Section 10:
>
> "RTP packets using the payload format defined in this specification
> are subject to the security considerations discussed in the RTP
> specification [3].
>
> I would like this sentence to be extended to also cover the used
> profile, thus:
>
> RTP packets using the payload format defined in this specification
> are subject to the security considerations discussed in the RTP
> specification [3], and any applicable RTP profile, e.g. AVP [RFC3551].
>
> 21. Section 11.1 and 11.2:
>
> The RTP and AVP references needs to include their STD numbers, thus:
> H. Schulzrinne, S. Casner, R. Frederick and V. Jacobson, "RTP: A
> Transport Protocol for Real-Time Applications", STD 64, RFC 3550,
> July 2003.
>
OK