
[AVT] Comments on draft-hatanaka-avt-rtp-atrac-family-02.txt



Hi,

I have reviewed the RTP payload format specification and in general I think it looks good. My only major comment concerns the fragmentation mechanism, whose current description I find rather confusing and contradictory. My comments are:

1. Section 3.1: Sequence number: As I understand it, the sequence number is not really used for any purpose in the payload format. It is after all the TS that is the main instrument for determining the decoding order of audio frames, and the grouping of fragmented frames can also be done based on the TS value in each packet. I would suggest that you remove the sequence number paragraph.

2. Section 3.2: I think the definition of the C flag is in error. As I currently interpret it, only the first fragment in a series of fragments has 0, rather than the intended last.

3. Section 3.2: The number of frame paragraph. I think removing the sentence in relation to fragmentation would probably simplify things. I think that you should gather all rules in relation to fragmentation in its own section.

4. Section 3.2: Frame offset:
"(The
   only other necessary variable is sampling frequency, which MUST have
   been established during SDP negotiations). "

I don't think the MUST should be normative here; you are only including it for information. The normative usage is defined elsewhere. Also, I think you should change "SDP negotiations" to "out-of-band negotiations".

5. Section 3.2: Frame offset:
"This field SHOULD NOT be used in packets containing fragmented data."

Is SHOULD really appropriate here? A fragmented series of packets can only contain a single audio frame. If it is sent redundantly, should it use frame offset or not? If not, I think simply defining that a frame can be received multiple times for the same TS value is sufficient.

6. Section 3.3: "The byte length of encoded audio data until the end of the current packet."
I think this definition can be improved. First, change "packet" to "frame" or "block". Also, "until the end" does not work when multiple frames are included.


7. Section 3.3: Is the block-length always the same if one includes multiple frames in the same packet? If yes, then it should be clarified that it is one block length header, followed by multiple audio blocks, each representing one time frame. If no, then it should be clarified that each audio frame is in the form: block-length + data.
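
To illustrate the second alternative, here is a rough sketch of how a receiver would walk a payload in which every audio frame is carried as block-length + data. The 2-byte big-endian length prefix and the name split_blocks are assumptions made only for this example; the draft's actual field is 12 bits wide and its position in the header is not reproduced here.

```python
import struct


def split_blocks(payload: bytes) -> list[bytes]:
    """Split a payload made of repeated (length, data) pairs.

    Assumption: each block is prefixed by a 2-byte big-endian length.
    This layout is hypothetical and is not taken from the draft.
    """
    blocks = []
    pos = 0
    while pos < len(payload):
        if pos + 2 > len(payload):
            raise ValueError("truncated block-length field")
        (length,) = struct.unpack_from("!H", payload, pos)
        pos += 2
        if pos + length > len(payload):
            raise ValueError("block runs past the end of the payload")
        blocks.append(payload[pos:pos + length])
        pos += length
    return blocks
```

With a single shared block-length header (the first alternative) the loop would instead read the length once and slice fixed-size blocks, which is why the text needs to say which of the two forms is meant.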

8. Section 3.3: Is 4096 currently sufficient for all the defined use cases? If not, why not extend the field from 12 bits to 16? Please consider that there exist networks with an MTU larger than 4096. Thus there are reasons for allowing larger values if they may occur.
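
For reference, the arithmetic behind this comment can be checked with a few lines. This assumes a plain unsigned field; how the draft maps the 12-bit value onto a length is not shown here, and the helper name is only for illustration.

```python
def max_field_value(bits: int) -> int:
    """Largest unsigned value that fits in a field of the given width."""
    return (1 << bits) - 1


for bits in (12, 16):
    # 1 << bits is the number of distinct values the field can encode.
    print(bits, "bits: max value", max_field_value(bits),
          "distinct values", 1 << bits)
```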

9. Section 4, first sentence: I would reformulate this to be more strict: Each RTP packet SHALL contain either an integer number of frame blocks, or one frame fragment.

10. Section 4:
" As many complete ATRAC frames as can fit in a single path-MTU SHOULD
be placed in an RTP packet, with the aforementioned maximum of 16.
However, if an ATRAC frame will not fit into an RTP packet, it MUST
be fragmented."
I think you should change the MUST to SHOULD. There are reasons why one may want to use IP fragmentation; also, due to the difficulty of performing PMTUD (Path MTU Discovery), it is hard to ensure that a packet fits within the path MTU. The sentence is also ambiguous: it can be interpreted to mean that, with several frames in the packet, the last one shall also be fragmented when reaching the MTU.
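
The reading I believe is intended can be sketched as follows: whole frames are packed until the payload budget or the limit of 16 is reached, and only a frame that is too large on its own gets fragmented, in packets of its own. This is a minimal sketch under stated assumptions: packetize and max_payload are hypothetical names, max_payload is the budget left after all headers, and the payload header bytes themselves are not modeled.

```python
MAX_FRAMES_PER_PACKET = 16  # the "aforementioned maximum" in the quoted text


def packetize(frames: list[bytes], max_payload: int) -> list[list[bytes]]:
    """Group frames into packets; each packet is a list of payload pieces.

    A packet holds either whole frames only, or exactly one fragment.
    Header overhead is ignored (assumption made for brevity).
    """
    if max_payload <= 0:
        raise ValueError("max_payload must be positive")
    packets: list[list[bytes]] = []
    current: list[bytes] = []
    used = 0
    for frame in frames:
        if len(frame) > max_payload:
            # Flush pending whole frames, then fragment this frame alone.
            if current:
                packets.append(current)
                current, used = [], 0
            for off in range(0, len(frame), max_payload):
                packets.append([frame[off:off + max_payload]])
            continue
        full = len(current) == MAX_FRAMES_PER_PACKET
        if current and (full or used + len(frame) > max_payload):
            # Start a new packet rather than fragmenting the frame.
            packets.append(current)
            current, used = [], 0
        current.append(frame)
        used += len(frame)
    if current:
        packets.append(current)
    return packets
```

Note that a frame which merely does not fit in the remaining space is moved to the next packet whole; under the other reading of the sentence it would be split at the MTU boundary.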


11. Section 4: The fragmentation description. I think you should try to rewrite it and include all rules around the usage of fragmentation, including usage of the other header fields. It should be clarified that you are using the TS to determine which fragments belong together.
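
As a rough illustration of reassembly keyed on the TS, assuming packets are handed over in order as (timestamp, payload) pairs and that all fragments of one frame carry the same TS. The function name is hypothetical, and loss, reordering, timestamp wrap-around, the C flag and redundantly sent frames are deliberately not handled.

```python
def group_by_timestamp(packets: list[tuple[int, bytes]]) -> dict[int, bytes]:
    """Concatenate payloads that share an RTP timestamp.

    Assumption: input is in order and complete; this is not the draft's
    reassembly procedure, only a sketch of grouping on the TS value.
    """
    groups: dict[int, list[bytes]] = {}
    for ts, payload in packets:
        groups.setdefault(ts, []).append(payload)
    # dict preserves insertion order, so frames come out in arrival order.
    return {ts: b"".join(parts) for ts, parts in groups.items()}
```

Because a redundant copy of a frame would also share the TS, the rewritten section needs to state how a receiver tells a further fragment from a repeated frame.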

12. Section 4: I think another payload field example, with multiple frames, would help the reader.

13. Section 5.1: What is the sample rate for ATRAC3? It must be defined if it is fixed; otherwise a parameter is missing from this registration. If it is defined, do it in section 3.1: For ATRAC3 the RTP timestamp rate SHALL be 44100 Hz.

14. Section 5.1: Max redundant frames: Please specify the allowed values.

15. Section 5.4.1: Is ATRAC3 always stereo? In that case the description should define that the number of channels (2) shall be included in the rtpmap attribute.

16. Section 5.4.2: Please clarify that the number of channels that shall be included in the rtpmap attribute is the actual number of audio channels, not ChannelID. This includes the actual number used if ChannelID=0.
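
To make the point concrete: the general SDP form is a=rtpmap:<payload type> <encoding name>/<clock rate>/<channels>, with the channel count as the last element. A small sketch that builds such a line follows; the helper name is hypothetical, and the payload type 96, clock rate 44100 and channel count 2 used in the note are placeholder values, not values confirmed by the draft.

```python
def rtpmap_line(payload_type: int, encoding: str,
                clock_rate: int, channels: int) -> str:
    """Build an SDP rtpmap attribute with an explicit channel count.

    channels is the actual number of audio channels, not a ChannelID.
    """
    if channels < 1:
        raise ValueError("channels must be at least 1")
    return f"a=rtpmap:{payload_type} {encoding}/{clock_rate}/{channels}"
```

For example, rtpmap_line(96, "ATRAC3", 44100, 2) gives "a=rtpmap:96 ATRAC3/44100/2".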

17. Section 5.4.2: The order of the bullets needs to be changed. The last bullet, with ptime, must come before the second-to-last one, as that one specifies "any remaining parameter".

18. Section 5.5.2: I think that "highest possible values offered" is very unclear. Should it be: "The answer SHALL NOT contain any values requiring further capabilities than the offer contains, but is RECOMMENDED to provide values as close as possible to the ones in the offer."?

19. Section 5.5.2: "The "maxRedundantFrames" is a suggested minimum. The Answerer MAY use a higher value, but MUST NOT use a lower value."
I think the text "is a suggested minimum" is easily misunderstood. I would recommend that it is changed to something like:
"The "maxRedundantFrames" value SHALL NOT be reduced, but MAY be increased in an answer."
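
Stated as a check, the proposed rule is a single comparison. This is only a sketch of the wording suggested above; the function name is hypothetical and nothing else about the offer/answer procedure is modeled.

```python
def answer_is_valid(offered: int, answered: int) -> bool:
    """True if the answer does not reduce the offered maxRedundantFrames."""
    return answered >= offered
```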


20. Section 6: Please change to:
"Two new MIME subtypes, for ATRAC3 and ATRAC-X, are requested to be registered (see Section 5)."


21. Section 7. Please change the following sentence:
"The payload format as described in this document is
   subject to the security considerations defined in RFC3550 [1]."

To:

"The payload format as described in this document is subject to the security considerations defined in RFC3550 [1] and any applicable profile, for example RFC 3551 [3]."


22. Section 8: Please include in references 1 and 3 their STD numbers.


Cheers

Magnus


--

Magnus Westerlund

Multimedia Technologies, Ericsson Research EAB/TVA/A
----------------------------------------------------------------------
Ericsson AB                | Phone +46 8 4048287
Torshamsgatan 23           | Fax   +46 8 7575550
S-164 80 Stockholm, Sweden | mailto: magnus.westerlund at ericsson.com


_______________________________________________
Audio/Video Transport Working Group
avt at ietf.org
https://www1.ietf.org/mailman/listinfo/avt