[AVT] Re: Comments on draft-ietf-avt-rtp-3gpp-timed-text-04.txt
Hi Jose,
Comments below:
Jose Rey wrote:
4. Section 2.3.4: I think the table of possible combinations to send
has a problem. I would think that sending a packet with first a type 2
and then a type 3 fragment would be desirable. It seems the most likely
fragmentation scenario is, after all, that the modifiers are too big. In
that case it seems desirable to start with the text, and then add as
many modifiers as fit the MTU.
What you propose is only possible if the fragments belong to different
text samples. For fragments of the same text sample it is not possible,
since the TYPE 3 has a duration field and the timestamp calculation
wouldn't work. The reason for including the SDUR is that the receiver
should know when these fragments should be discarded. Otherwise, the
receiver buffer may contain fragments of unknown expiration time.
One can argue that a fragment of the next packet could be included then,
e.g., a TYPE3 from text sample 2 with a TYPE2 of text sample 1, but for
simplicity I did not want to do this: the next or previous sample may
actually not need fragmentation or even not have any modifiers, so that
in the end it will probably make no difference and it does seem more
complex.
Okay, I understand. How severe is this limitation? It seems that in a
scenario without redundancy, the fragmentation might force you to send
a first, very inefficient type 2 packet for the text part, followed by
another packet that is slightly over the MTU, forcing you to send two
modifier packets, for a total of 3 packets, where 2 could have been
sufficient. However, I also don't like strange rules. My suggestion for
solving this would be to actually include both SDUR and a timestamp
offset. That would allow multiple packet types from the same timestamp
to be present, and would also allow redundancy to skip certain samples.
I don't know what impact such a solution would have.
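To put rough numbers on the 3-versus-2 packet case, here is a minimal
sketch. It assumes a simplified model that ignores RTP and unit header
overhead; the sizes, the budget value and both function names are made
up for illustration and do not come from the draft.

```python
import math

def packets_without_mixing(text_len, modifiers_len, budget):
    # Text and modifier fragments of one sample never share a packet.
    return math.ceil(text_len / budget) + math.ceil(modifiers_len / budget)

def packets_with_mixing(text_len, modifiers_len, budget):
    # Text first, then as many modifier bytes as fit in the same packet.
    return math.ceil((text_len + modifiers_len) / budget)

# Hypothetical sample: short text, modifiers slightly over one packet.
print(packets_without_mixing(100, 1500, 1400))  # 3
print(packets_with_mixing(100, 1500, 1400))     # 2
```

With real header overhead the exact break-even sizes would shift, but
the shape of the comparison stays the same.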
5. Section 2.3.4:
"Note that payloads MAY also be empty as a special case for TYPE 1
units."
I think I know the motivation behind this, but it would be good if the
text explained why this is a good feature.
OK. There are, at least, two reasons to send these samples:
- to 'clear' the screen of text
- to put an end to samples of unknown duration
Good, please put this into the draft.
6. Section 3:
"The initial value MUST be randomly determined."
It is normatively defined as SHOULD in RTP. Are there further reasons to
make this a MUST for this payload format? If not, I think you should
remove the whole sentence; otherwise you need to motivate it. If SHOULD
is fine and you feel it is important to point this out, then reformulate
to point to the RTP spec.
No, actually I think the level of requirement as in RTP is fine.
Good.
And the thing that many may not think of is the measuring through RTCP.
??
You mean I should mention that synchronisation of media is done with
RTCP? Do I need to mention this?
Sorry, that was me being fuzzy. What I mean is that if the RTP timestamp
rate is too low, this results in problems for some of the RTCP
measurements, as their resolution is locked to the RTP timestamp rate.
Thus I think that, along with the recommendation to use 1000 Hz, you
should also include a statement declaring that with lower timestamp
rates some RTCP measurements may be affected due to low resolution.
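As a small illustration of the resolution point: RTCP interarrival
jitter is reported in RTP timestamp units, so one tick is the finest
step that can be expressed. The sketch below only does that arithmetic;
the function name and the lower comparison rates are illustrative.

```python
def tick_ms(clock_rate_hz):
    # Duration of one RTP timestamp unit in milliseconds.
    return 1000.0 / clock_rate_hz

for rate in (1000, 100, 10):
    print(rate, "Hz ->", tick_ms(rate), "ms per timestamp unit")
```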
[snip]
9. Section 3.1.2:
"Note that also empty text samples are considered whole text samples,
although they do not contain sample contents. In this particular
case, TYPE 1 units MUST NOT include any sample contents and the LEN
field SHALL have a value of 8 (0x0008). Otherwise, the LEN field
SHALL be always greater than 8 (0x0008)."
Here is one place where some motivation for sending empty samples could
be included. Also, I think the normative text is maybe a bit too much.
It seems rather obvious that an empty text sample will not contain any
sample contents and will have a length of 8. If some text should be
written on this, I think an informative sentence is better:
"A TYPE 1 unit without sample contents will have a LEN field value of 8
(0x0008)."
You mean better than the SHALL above?
I would say so; this is overuse of the normative words. If one follows
the definition of the LEN field, it becomes obvious that this is the
value that will be used. So if you would like to put some emphasis on
the fact that empty packets are allowed, it is sufficient to do so
using text without normative words.
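The informative reading can be written down in a few lines. This is
only a sketch, assuming that LEN counts a fixed 8 bytes plus the sample
contents, which is what the quoted draft text suggests; the constant
and function names are hypothetical.

```python
FIXED_LEN_BYTES = 8  # assumed from the "value of 8 (0x0008)" in the quoted text

def type1_len_field(sample_contents: bytes) -> int:
    # LEN for a TYPE 1 unit under the assumption stated above.
    return FIXED_LEN_BYTES + len(sample_contents)

print(hex(type1_len_field(b"")))    # 0x8
print(type1_len_field(b"Hi") > 8)   # True
```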
[snip]
11. Section 3.1.2:
"The default start value MUST be 129."
What is the meaning of this sentence? What is the start value? Is what
you mean the following: "When assigning static sample description SIDX
values, they SHOULD start at 129."
Yes except for the SHOULD, I think..
I don't think MUST is possible to support, and it seems to provide no
benefit.
Why? When reading out of a 3GP file, you can rewrite the SIDX anyway,
and when encoding live, putting 129 is the same as putting any other
value there. Am I missing anything?
The original thought was to make it a MUST, so that you have the whole
interval available. OK, you will never use the 126 values available,
but there is no reason to make it shorter... On the other hand, if you
make them random you might get a random value of 254, which allows just
1 static sample description, and we didn't define a wrap-around
mechanism for static ones because this is really not necessary. Why not
just fix it and get rid of the problems?
Okay, we are on different pages. My first question, regarding the
definition of a start value, seems to be the crux here. If a start
value is the first SIDX used, and there exists a restriction that any
subsequent SIDX value used must be higher, then your comment holds.
However, if there exist no restrictions on how you use the SIDX values,
then it is not a problem. Example:
Definition section defines:
255 148 127, in that order.
The texting stream uses the SIDX values in the following order:
148 255 127.
If that is not a problem, there is no need for any care about start
values and what is defined.
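The example can be sketched as a plain lookup by value, assuming the
receiver keeps the static sample descriptions in a table keyed by SIDX.
The description strings and the function name are placeholders, not
real tx3g entries or anything defined in the draft.

```python
# Static sample descriptions, declared in the order 255, 148, 127.
descriptions = {255: "desc A", 148: "desc B", 127: "desc C"}

def resolve(sidx_sequence, table):
    # Each unit's SIDX is looked up by value; the order of use is irrelevant.
    return [table[sidx] for sidx in sidx_sequence]

# The stream uses the values in a different order than they were declared.
print(resolve([148, 255, 127], descriptions))
```

Nothing in such a lookup depends on a start value or on increasing
order, which is the point of the example.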
19. Section 8.2.1:
"This means that an offerer using these parameters only specifies
which values are going to be used for the sent stream."
I hope you are aware that you are changing the direction of how
parameters normally work in offer-answer.
I am not sure what you mean. This may need further discussion depending
on the answer to the one below.
In a sendrecv case, when the offerer offers a stream, the actual
parameter values he provides are what he accepts to receive. Parameters
that work in the other direction, describing what the declaring entity
sends, do not fit this model very well. For example, the offerer has to
declare them prior to knowing what the answerer is willing to receive.
I toiled rather much with the H.264 offer answer section, and basically
did define two versions of some parameters. The reason is that there
are cases where the receiver would like to specify that it wants to
receive a timed text that fits a certain screen size, while often it is
the sender that will declare what format the stream to be delivered
will have. So the question is whether this functionality is needed.
Let me answer you with a counter question ;) Typically, text and video
will go together in many applications: in video applications, is it
usual/normal to negotiate the picture size for video instead of choosing
from a list of possibilities in an SDP (similar to example 10.2 in
RFC 3264)? If negotiation is the common thing because there are so many
different possible sizes, it makes sense to let these parameters express
preference, similar to what is done with the parameter sets included in
the h264 profile parameter (right?). Otherwise, if the options are
typically few, I can keep it as is. What do you think?
But video sizes in the offer/answer model are not negotiated. I think
they are simply declared by the receiver. Or at least the parameters
that affect them. See for example the MPEG-4 elementary stream config
string, which, although it lacks an offer/answer definition, can only
be interpreted as declared parameters that apply to the stream that is
accepted to be received (in a sendrecv case).
It is clear that at least one of the parameters needs to be declared in
the sender's direction: tx3g. This is something the sender determines;
in fact they seem to be rather similar to H.264's parameter sets. Also,
this idealistic case does assume live encoding. If one uses stored
streams, the size and rate parameters also need to be declared. I think
it may in fact be a serious problem not to have a defined RTP timestamp
rate, because I have no idea how to actually fit into the model the
possibility of declaring the timestamp rate of what one intends to send.
The problem is really Offer/Answer and SDP, possibly also SIP. The fact
is that we need a two-phase negotiation in some cases. First, one
declares one's capabilities and preferences. Then, in the second phase,
one declares which actual configuration, matching those capabilities and
preferences, one is going to use.
I don't know if you managed to get any wiser from these comments. We
should discuss this face to face.
Cheers
Magnus Westerlund
Multimedia Technologies, Ericsson Research EAB/TVA/A
----------------------------------------------------------------------
Ericsson AB | Phone +46 8 4048287
Torshamsgatan 23 | Fax +46 8 7575550
S-164 80 Stockholm, Sweden | mailto: magnus.westerlund at ericsson.com
_______________________________________________
Audio/Video Transport Working Group
avt at ietf.org
https://www1.ietf.org/mailman/listinfo/avt