
[AVT] Re: Comments on draft-ietf-avt-rtp-3gpp-timed-text-04.txt



Hi,

Jose Rey wrote:



There's no restriction in the TS saying that the SIDX shall be
monotonically increasing.  Static SIDXs are something defined for RTP
streaming.
The SIDX is exactly what you define above as a monotonically increasing
number with an initial value of 129 (start value).  Each new static SIDX
increases the SIDX by one unit.


Thus it is an artificial restriction that you introduce. To me it doesn't seem to have any purpose. I don't see how it matters in which order the SIDX values are used within the static space. What matters is whether the sample description has been received and is available when it is used. Doing a renumbering seems only to introduce a possibility for errors.
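
To illustrate what I mean, here is a minimal sketch in Python. The class and method names are hypothetical and not taken from the draft; the only value from this discussion is the start of the static range (129). The receiver only needs to find a sample description by its SIDX, so the order in which static values were assigned does not matter:

```python
STATIC_SIDX_START = 129  # start value mentioned in this thread


class SampleDescriptionStore:
    """Hypothetical receiver-side table of static sample descriptions."""

    def __init__(self):
        self._by_sidx = {}

    def add(self, sidx, description):
        # Only the static space is modelled here (assumption of the sketch).
        if sidx < STATIC_SIDX_START:
            raise ValueError("not a static SIDX: %d" % sidx)
        self._by_sidx[sidx] = description

    def lookup(self, sidx):
        # Returns None when the description has not been received yet.
        return self._by_sidx.get(sidx)
```

With such a table, adding 131 before 129 works just as well as adding them in increasing order; only a lookup for a description that was never received fails.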



19. Section 8.2.1:
[snip]


This is getting fuzzy...

I think that 'declarative' is not the best word to use, as there seem
to be different definitions or, at least, a wider definition.  I have
the feeling that one cannot use terms like "media format configuration
parameters", "media stream parameters" and "capability parameters"
either for those that are symmetric or downgradable (relating to the
stream and the actual 'physical capabilities' of the offerer/receiver)
respectively.

So I'll speak about symmetric and downgradable, which is more general.

I can agree to symmetric, where it is clear that both parties must use the same value. However, I don't like downgradable, because in a sendrecv stream some declarative parameters can not only be downgraded; they could in fact be increased. For example, one could declare the capability to receive a version that is higher than what the other end-point has offered to receive for itself.



Let's see:

tx,ty,layer,tx3g,height,width

- if sendrecv/recvonly: the parameters express the values the offerer
wishes to have for the incoming stream.  At the same time these values
are the values the offerer will use (for the O->A stream if sendrecv)
if the stream is accepted by the answerer.   In this case, for
simplicity, the parameters preferred by the offerer cannot be changed
by the answerer if he accepts the stream (thus symmetric use) since it
is not guaranteed that the modified values are supported by the
offerer, right?

In a sendrecv stream the answerer could define another value for non-symmetric parameters. In the answer to a recvonly stream, one can only confirm the values and potentially declare the answerer's capabilities, if such parameters are defined.


Also, in an ideal case where the offerer has declared its display capabilities, the answerer would declare how it is going to use them. However, now we are past what the offer/answer model normally handles.



I think I can agree that they are linked together and need to be set
consistently. However, the problem I am trying to explain is that the
entity needing to set them is the sending entity rather than the
receiving one.



Yes, I think (?)

My "think" is in regard to the following: because sample descriptors depend on width, height, tx, and ty, they are linked. If I change one of them, I may have to change another. Thus they are problematic.


OR was it something else you did not understand?




In the case you address, there are two possibilities: either the encoded text has lower or higher resolution. If lower, this is clearly not a problem, as the resolution is not lost when converting to higher clockrates. The problem is if the original clockrate is higher than 1000 Hz. What we can do is REQUIRE 1000 Hz and advise that resolution MAY be lost when converting from high to low, although I really think that 1000 Hz is good enough for capturing and synchronising speech... unless humans make considerable developments in the spoken language ;)


I am mostly concerned that, if lower rates are allowed, people will use rates that are not sufficient for RTCP. I do not expect that people writing media packetizers for RTP hint tracks in QuickTime, MPEG, or 3GPP files will understand the restriction unless it is made clear.
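
The resolution point quoted above can be sketched in a few lines of Python. This is only an illustration: the function name and the choice to round down are my assumptions, and the 90000 Hz source rate is just an example value, not something the draft specifies.

```python
def rescale_timestamp(ticks, src_rate, dst_rate):
    """Convert a timestamp in ticks at src_rate Hz to ticks at dst_rate Hz.

    Rounds down (an assumption of this sketch), so precision can be lost
    when dst_rate is lower than src_rate.
    """
    return ticks * dst_rate // src_rate


# High -> low: several 90000 Hz ticks collapse into one 1000 Hz tick.
coarse = rescale_timestamp(12345, 90000, 1000)

# Low -> high: every 1000 Hz tick maps to a distinct 90000 Hz tick,
# so converting back returns the original value.
fine = rescale_timestamp(coarse, 1000, 90000)
```

Going from 1000 Hz up to a higher rate and back is lossless, while going from the higher rate down to 1000 Hz and back generally is not, which is the asymmetry discussed above.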



If one looks at the 3GPP timed text parameters, one would need to have
something like H.264, and define the following parameters:

   rate: the RTP timestamp clockrate is equal to the clockrate of the
        media.  If RTP packets are generated out of a 3GP file, the
        clockrate of the text media MUST be copied from the 3GP file,
        i.e. the clockrate is the value of "timescale" parameter in the
        Media Header Box describing that text track.  Other tracks
        (audio/video/text) in the 3GP file may have their own clockrates
        as indicated in their corresponding Media Header Box. For live
        encoding, a clockrate of 1000 Hz is RECOMMENDED but other values
        MAY be used.

   sver=<Z1(x1*256+y1)>, <Z2(x2*256+y2)>, ..., <Zi(xi*256+yi)>,...
        The parameter "sver" specifies the list of supported
        backwards-compatible versions of the timed text format
        specification (3GPP TS 26.245), which the
**"receiver"** (instead of sender)


I don't understand this change.


The reason is that, with the current definition of offer/answer, any parameter defined applies, unless the stream is sendonly, to the traffic that is going to be received. Thus the presence of sver in an offer or answer applies to the stream going to be received at the offerer or answerer, respectively.

I think we still have a miscommunication on how offer/answer works.


   A                B
   Offerer          Answerer

   Offer:  A->B
   Answer: B->A

In a classic offer (A->B) of a sendrecv stream, any parameter declared in that SDP gives the restrictions that B must conform to when sending media B->A.

Therefore the parameter sver, as currently declared, is the preference of a receiver, not a sender.
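
As an aside on the value format quoted above (Z = x*256 + y), here is a small Python sketch of the encoding. The function names are hypothetical, and I am assuming that x and y are the two parts of a version number "x.y" and that each fits in 8 bits; the quoted text only gives the formula.

```python
def encode_sver(x, y):
    """Encode version parts x and y as Z = x*256 + y."""
    if not (0 <= x <= 255 and 0 <= y <= 255):
        raise ValueError("x and y are assumed to fit in 8 bits each")
    return x * 256 + y


def decode_sver(z):
    """Split Z back into (x, y); inverse of encode_sver for 8-bit parts."""
    return divmod(z, 256)
```

For example, encode_sver(6, 0) gives 1536, and decode_sver(1536) gives back (6, 0). A receiver listing several such values in its offer would then be stating what it is prepared to receive, per the argument above.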




I think that with the O/A usage I defined above this is not needed
(?)... I guess sprop- means sender properties.  Is it not an overkill
to have such fine granularity for the settings? Instead care shall be
taken when composing the session... I think..


sprop parameters are stream properties, i.e. they declare what the stream being sent will actually be.


This depends on the parameters. From my perspective, it seems that some of the 3GPP timed text parameters need to be declared by the sender. However, if I am wrong, then so much the better, because it will be much simpler. That does, however, require that 3GPP timed text can be made to work with the restriction that the receiver of the streams sets all parameters, including sample descriptors sent out-of-band.




Another issue: I have been discussing with Jan and Dave offline. There is another possibility to combine TYPE2+TYPE3 fragments without adding any fields. As I said, this should happen very rarely and it is not worth adding any new fields, but one can make an exception when the payload contains fragments by saying that in that particular case, the timestamp calculated for the first applies to all and no further timestamp calculation shall be performed. The SDUR is kept in TYPE3 and TYPE4 since they may still be alone in RTP payloads... It is a little exception that doesn't require much change and takes care of it pretty elegantly.. what do you say?

So you are saying that for packets that contain first a TYPE2 followed by a TYPE3, the timestamp MUST be the same for all parts?


Also, how does this affect the aggregation case? If I have a TYPE2, then a TYPE3, then another TYPE2, does this work in that case?

If it works I don't see a problem.

Cheers

Magnus Westerlund

Multimedia Technologies, Ericsson Research EAB/TVA/A
----------------------------------------------------------------------
Ericsson AB                | Phone +46 8 4048287
Torshamsgatan 23           | Fax   +46 8 7575550
S-164 80 Stockholm, Sweden | mailto: magnus.westerlund at ericsson.com

_______________________________________________
Audio/Video Transport Working Group
avt at ietf.org
https://www1.ietf.org/mailman/listinfo/avt