[AVT] Comments on draft-ietf-avt-rtp-3gpp-timed-text-12.txt
Hi,
I have reviewed the draft and have found mostly editorial issues. I have
a few minor technical issues to discuss, but otherwise I think the draft
is in good shape.
Technical issues.
T1. Section 4, timestamp definition. It references Section 4.5 for
details on how to calculate the timestamp. However, Section 4.5 does not
seem to contain normative text on how to calculate it; it only contains
some examples. Could you please provide complete normative text for this?
T2. Section 1, third paragraph. "MIME type usage". I would suggest that
we stop calling it MIME, and instead only use the name "Media Type".
Multipurpose Internet Mail Extensions (MIME) is only one of the users of
the media type registry; SDP is another user of the registered types.
So please remove all "MIME" references and replace them with "media type".
T3. Section 4.1.2: page 18, second paragraph:
" Furthermore, only sample descriptions (TYPE 5 units) MAY follow units
of unknown duration in the same aggregate payload. Otherwise, it would
not be possible to calculate the timestamp of these other units."
With the currently proposed rules, the timestamp for a TYPE 5 unit
following a TYPE 1 unit of unknown duration seems to lack a TS
derivation definition. The text in Section 4:
" TYPE 5 units receive their timestamp from the first non-TYPE 5
unit following them in the payload or from the RTP packet header
itself, if there are only TYPE 5 unit(s) or if one or several
TYPE 5 units follow a sample of unknown duration (see Section
4.1.2, SDUR definition). If there are no non-TYPE 5 units that
follow, the timestamp of the sample description is calculated in
the usual way, i.e. by adding sample duration and timestamp
value of the last unit encountered (see case a) in Figure 10).
Finally, note that for TYPE 5 units, the timestamp actually does
not represent the instant when they are played out, but instead
the instant at which they become available for use."
The above sentences do not define how the second exception case
mentioned in the first sentence is to be handled.
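The gap can be made concrete with a minimal Python sketch of the quoted
rule. It is an illustration under simplifying assumptions, not text from
the draft: units are modelled as hypothetical (type, duration) pairs, an
unknown duration is represented as None, fragment units are ignored, and
the name assign_timestamps is made up for this example.

```python
from typing import List, Optional, Tuple

# Hypothetical model: each unit is (unit_type, duration); duration None
# stands for "unknown duration". This is not the draft's wire format.
Unit = Tuple[int, Optional[int]]


def assign_timestamps(rtp_ts: int, units: List[Unit]) -> List[Optional[int]]:
    """Derive one timestamp per unit of an aggregate payload.

    A None in the result marks a TYPE 5 unit that follows a unit of
    unknown duration, i.e. the case for which T3 finds no derivation.
    """
    cursor: Optional[int] = rtp_ts  # timestamp of the next sample unit
    result: List[Optional[int]] = []
    for unit_type, duration in units:
        if unit_type == 5:
            # Same value as the next non-TYPE 5 unit, or the last
            # timestamp plus duration if none follows, or the RTP
            # header timestamp if only TYPE 5 units are present.
            result.append(cursor)
            continue
        if cursor is None:
            raise ValueError(
                "only TYPE 5 units may follow a unit of unknown duration")
        result.append(cursor)
        cursor = None if duration is None else cursor + duration
    return result
```

With rtp_ts=1000, the payload [(1, 500), (5, None), (1, 300)] yields
[1000, 1500, 1500], while [(1, None), (5, None)] yields [1000, None]:
the sketch has nothing to add to the last timestamp, which is the
undefined case raised above.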
T4. Section 4.1.3, page 19, second paragraph in the "SLEN" bullet:
If several TYPE 2 units having the same timestamp but different
SLEN are received, they MUST be discarded: a fragment of a text
sample has always a size value that does not change during
transmission.
What does "they" refer to in the above sentence? Is it all units
received with this timestamp, only the offending unit, or something else?
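The two readings lead to different receiver behaviour, as the following
minimal Python sketch shows. Everything in it is hypothetical and only
serves to illustrate the ambiguity: TYPE 2 units are reduced to
(timestamp, SLEN) pairs, the function and policy names are invented,
and "offending" is assumed to mean any unit whose SLEN differs from the
first one received for that timestamp.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def filter_type2(units: List[Tuple[int, int]], policy: str) -> List[Tuple[int, int]]:
    """Return the TYPE 2 units kept under one reading of "they".

    units  : (timestamp, slen) pairs in arrival order (hypothetical model)
    policy : "all"       -> drop every unit sharing a conflicting timestamp
             "offending" -> drop only units whose SLEN differs from the
                            first SLEN seen for that timestamp (assumption)
    """
    if policy not in ("all", "offending"):
        raise ValueError("unknown policy")
    seen: Dict[int, List[int]] = defaultdict(list)
    for ts, slen in units:
        seen[ts].append(slen)
    kept = []
    for ts, slen in units:
        if len(set(seen[ts])) == 1:
            kept.append((ts, slen))      # no conflict for this timestamp
        elif policy == "offending" and slen == seen[ts][0]:
            kept.append((ts, slen))      # matches the first SLEN received
    return kept
```

For the input [(100, 40), (100, 40), (100, 44), (200, 10)], the "all"
reading keeps only (200, 10), whereas the "offending" reading also keeps
both (100, 40) units.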
T5. Section 4.1.6.1, page 25, informative note:
" Informative note: note that it is allowed to send any value of
SIDX=X in the interval [0,127]. E.g. if [64..127] is the
current active set and 65 is sent a new sample description is
defined and an old one deleted (64). Similarly one could send
X=127, thus inverting the active and inactive sets."
I think there is an error in this text regarding which unit needs to be
sent to remove SIDX 64 from the active set. The second sentence is also
hard to understand. To my understanding it is SIDX 0 that must be
received to make the active set 0 and 65-127. If 65 is received, it
only falls under the duplicate rule, as it is already within the active
set.
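This reading can be checked with a minimal Python sketch. It assumes,
and this is an assumption rather than the draft's normative wording,
that the active set is a window of 64 consecutive indices modulo 128 and
that receiving an index outside the window makes that index the newest
entry. The function names are invented for the example.

```python
from typing import List, Tuple

WINDOW = 64    # assumed size of the active set
MODULUS = 128  # dynamic SIDX values lie in [0, 127]


def in_active_set(newest: int, x: int) -> bool:
    """True if x is among the WINDOW indices ending at newest (mod 128)."""
    return (newest - x) % MODULUS < WINDOW


def receive(newest: int, x: int) -> Tuple[int, bool]:
    """Process a received SIDX=x; return (new newest index, is_duplicate)."""
    if in_active_set(newest, x):
        return newest, True   # inside the window: duplicate rule applies
    return x, False           # outside the window: the window slides to x


def active_set(newest: int) -> List[int]:
    """List the indices of the active set, sorted for readability."""
    return sorted((newest - k) % MODULUS for k in range(WINDOW))
```

Starting from the active set [64..127] (newest = 127), receive(127, 65)
reports a duplicate and leaves the set unchanged, while receive(127, 0)
slides the window so that active_set(0) is 0 plus 65..127. Under the
same assumption, it is X=63 that would swap the active and inactive
sets.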
T6. Section 4.3, page 29:
" o If there is some bitrate and free space in the payload available,
sample descriptions (if at hand) SHOULD be aggregated. Sample
descriptions (TYPE 5 units) MAY be placed anywhere in an aggregate
payload, since the sample index (SIDX) is used to associate them
to their text samples (explained in Section 4.2)."
The second sentence is in error. A sample description may not be placed
at any point, because depending on the location, the TS value will differ.
T7. Section 4.3, Page 30:
" o An additional requirement when fragmenting text samples is that
the start of the modifiers MUST be indicated using the payload
header defined for that purpose, i.e. a TYPE 3 unit MUST be used
(see Section 4.1.4). Otherwise, if packets are lost, a client may
be unable to identify where the modifiers start and the text ends
or whether either text strings or modifiers were received
completely or not."
The sentence starting with "Otherwise, ..." is in error. Even with this
procedure, detection of the border is still not fully guaranteed. It
does, however, enable detection as long as only a single loss occurs.
Thus I would propose to correct this sentence.
T8. A question on section 4.7:
Is there a relevant object identifier that should be used in RFC 3640
media type signalling to identify Timed Text content? If so, wouldn't it
be good to have a reference to this?
T9. Section 5, first paragraph:
" Apart from the basic fragmentation guidelines described in the
section above, the simplest option for packet loss resilient
transport is packet repetition. A variant of packet repetition would
be data carousel transmission, where data packets are sent in
periodic cycles."
I don't think the usage of "carousel" is appropriate here. It is an
unclear term that people interpret quite differently. In my definition
a carousel is something that transmits everything, and then restarts
from the beginning. This is clearly not what you do here. This is either
a window-based repeat function or simply a repeating mechanism,
depending on whether one aggregates the repeated packets with new ones
or not.
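The distinction can be sketched in a few lines of Python. Both functions
are hypothetical illustrations of the two terms as defined in this
comment, not mechanisms taken from the draft, and the items stand in for
units or payloads of any kind.

```python
from collections import deque
from itertools import cycle, islice
from typing import List, Sequence


def carousel(items: Sequence[str], count: int) -> List[str]:
    """Carousel: transmit everything, then restart from the beginning."""
    return list(islice(cycle(items), count))


def windowed_repeat(items: Sequence[str], window: int) -> List[List[str]]:
    """Window-based repeat: each payload aggregates the newest item with
    up to window - 1 previously sent ones."""
    recent: deque = deque(maxlen=window)
    payloads = []
    for item in items:
        recent.append(item)          # oldest item drops out automatically
        payloads.append(list(recent))
    return payloads
```

For example, carousel(["a", "b", "c"], 7) keeps cycling through the
whole set, while windowed_repeat(["a", "b", "c", "d"], 3) repeats each
item only while it is within the last three sent.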
T10. Section 7.2:
Does the position defined in SMIL relate to the corner of the text
track, or is it the point from which TX and TY are applied? I think that
needs to be clarified.
T11. Section 7.3 tx, ty bullet:
"Therefore, only the first 16 bits are used in the payload header."
and
" o width, height: they also have the same name in the box and
the payload header. All (unsigned) 32 bits are meaningful."
Is this really correct? Aren't tx, ty, height and width only expressed
in the media type parameters? If so, they can't really be used in the
payload header.
T12. Section 9.2.1:
" o Text track (area) dimensions, "height" and "width": in the case
of sendonly (sendrecv) offers, an answerer accepting the offer
MUST be prepared to render (and send) the stream with the same
exact values. If any of these conditions are not met, the
stream MUST be removed or the session rejected."
Isn't the text within the parentheses wrong? In a "sendrecv" offer, the
answerer does not need to send with the parameters that the offer
provides at all. The answerer can choose its own, as this is a
declarative stream property parameter. I would suggest simply removing
the two parenthesized parts.
Editorial things
----------------
E1. The page length on the first 3 pages is not consistent with the rest
of the draft.
E2. Section 1, paragraph 3. "Section 8 registers the MIME type
usage.": In my eyes the formulation doesn't look right. I think it
should read "Section 8 defines the media type." The request to register
it is part of Section 10. The usage rules in SDP for the defined media
type and its parameters are in Section 9.
E3. Section 2.3, bullet 3a:
"If sample descriptions are needed in the course of a session, these may
be sent also out-of-band or in-band." I would suggest to add "further"
as the second word in the above sentence.
E4. Section 2.5, bullet 2:
"Instead, it is
recommended that some more overhead be invested to provide full
error correction by protecting the less text sample fragments
using the measures outlined in Section 5. "
Something is wrong with "the less"; should it read "at least"?
E5. Section 2.5, bullet 5:
"For this reason the fields SIDX
and SDUR are swapped in TYPE 1 unit. "
Compared to what are they swapped? I would suggest to change this to:
"For this reason the fields SIDX and SDUR are swapped in TYPE 1 unit
compared to the other units."
E6. Section 3, text strings definition:
"When using this payload format, the text string does contain any byte
order mark (BOM)." I think there is a missing "not" before "contain".
E7. Page 10. "track / stream" is hanging alone on this page. Please go
through the draft and adjust these things for better readability in the
next version.
E8. Section 4, page 10:
" Timestamp clockrates MUST be signaled by out-of-band means at
session setup, e.g. using the "rate" attribute in SDP. See
Section 9 for details."
In my opinion the sentence should be changed to
"Timestamp clockrates MUST be signaled by out-of-band means at session
setup, e.g. using the media type "rate" parameter in SDP. See Section 9
for details."
E9. Section 4.1.3, page 20, first example.
"If lower delay and higher redundancy is required, a choice
could be that the encoder 'collects' text every second; this
yields text samples (TYPE 1 units) of 68 bytes, TYPE 1 header
included. Taking a smaller delay of 3s, three contiguous
text samples could be aggregated in one RTP payload: the
current and last two text samples."
I don't think "smaller" in the second sentence is correct. I interpret
it as a comparison with the 1 second delay.
E10. Section 4.1.5: Last paragraph:
"Regarding the SDUR field and the absence of the SLEN and SIDX fields,
the same reasoning as for TYPE 3 applies."
Can this language be tightened up? I think separating the normative part
and the informative part would be good:
"The SDUR field is defined as in TYPE 1. The reasoning behind the
absence of SLEN and SIDX is the same as in TYPE 3 units."
E11. Section 4.2, page 26, SDUR bullet, last line:
Extra space in "SDUR= SDUR1+SDUR2".
E12. Page 27:
"a) The total number of indices used is greater than the
number of indices available, i. e., for static ones more
than 127 and for dynamic ones more than 64 or, "
This sentence is hard to parse. I would suggest to change it to:
"a) The total number of indices used is greater than the
number of indices available, i. e., if the static
sample descriptions are more than 127, or the dynamic ones are more than
64 or, "
E13. Section 5, Page 40:
" A server MAY decide to use repetition as a measure for packet loss
resilience. Thereby, a server MAY send the same RTP packet payloads
or just parts of it, i.e. single units."
The second sentence is a bit strange. At a minimum the "s" in "payloads"
should be removed, or "it" is wrong.
I propose:
" A server MAY decide to use repetition as a measure for packet loss
resilience. Thereby, a server MAY send the same RTP payloads
or just some of the units from the payloads."
E14. Section 8, section title:
I would suggest to remove 8.1 completely and instead place everything in
8. There is nothing else in this section than the media type definition.
Thus my proposal for section title is:
"8. 3GPP Timed Text media type"
Cheers
Magnus Westerlund
Multimedia Technologies, Ericsson Research EAB/TVA/A
----------------------------------------------------------------------
Ericsson AB | Phone +46 8 4048287
Torshamsgatan 23 | Fax +46 8 7575550
S-164 80 Stockholm, Sweden | mailto: magnus.westerlund at ericsson.com