
[AVT] Comments on draft-ietf-avt-rtp-vmr-wb-01.txt



Hi,

I have reviewed the RTP payload format for the VMR-WB audio codec. It is mostly in good shape. My main comments are around the interoperability with AMR-WB.

1. The draft is too wide in a couple of places, i.e. more than 72 columns. In particular, the right margin of the text varies widely between different parts of the document.

2. The draft lacks form feed characters between the pages. It may be that this draft has suffered the same fate as the AMR-WB+ payload format, having them stripped by the IETF secretariat.

3. Section 3: "VMR-WB is robust to high percentage of packet loss and
   packets with corrupted rate information."

I think that it should be frames rather than packets that the codec handles.

4. Section 4.1: Please avoid splitting figures across a page break.

5. Section 5.1: I think there is clearly a way of achieving interoperable communication between an AMR and a VMR client in this case. They simply use the AMR-WB MIME type with a restricted mode set and in octet-aligned mode.
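
To make comment 5 concrete, the following Python sketch builds the SDP media lines such an AMR-WB session might use. "octet-align" and "mode-set" are parameters of the AMR-WB payload format; the port, the dynamic payload type 97, and the helper name are placeholders, and the mode set 0, 1, 2 is borrowed from the Section 5.4 text quoted in comment 10.

```python
def amr_wb_interop_sdp(port=49120, payload_type=97, modes=(0, 1, 2)):
    """Build SDP media lines for an octet-aligned AMR-WB session.

    Hypothetical helper: port and payload_type are placeholder values,
    and the default modes are an assumption taken from comment 10.
    """
    mode_set = ",".join(str(m) for m in sorted(set(modes)))
    return [
        f"m=audio {port} RTP/AVP {payload_type}",
        f"a=rtpmap:{payload_type} AMR-WB/16000",
        f"a=fmtp:{payload_type} octet-align=1; mode-set={mode_set}",
    ]


if __name__ == "__main__":
    print("\n".join(amr_wb_interop_sdp()))
```

A VMR-WB client that understands this signalling could accept or generate such a description without a separate VMR-WB MIME type being involved.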

6. Section 5.2: In my opinion the scenario description is not an appropriate place to put normative text.

7. Section 5.2:
  "The VMR-WB mode 3 and octet-aligned payload format SHALL be
   used for this scenario. Moreover, to avoid signaling
   conflicts in the IP network, two sessions SHALL be
   established using SIP/SDP, one between the VMR-WB enabled
   terminal and the gateway and another session between the
   gateway and the AMR-WB enabled terminal."

I think the second SHALL, which requires the use of two signalling sessions, is totally inappropriate. I haven't followed the work in SIPPING, but there has at least been discussion on how to invoke transcoding and this type of functionality from within a session. Therefore I think there are definitely useful cases where the media flows through a gateway while there is a single signalling session between the two end-points.

Could you please clarify what that signalling conflict would be?

Also, as I wrote in issue 5, I would think that the most likely case for interoperation between a VMR and an AMR client is 5.1.

8. Most pages: There shall be at least one empty line between headers and footers, and the main text.

9. Section 5.4: The same concerns about the signalling conflict and the use of normative language apply here.

10. Section 5.4:
  "If the AMR-WB codec is engaged in an interoperable
   interconnection with VMR-WB, the active AMR-WB codec mode set
   SHALL be limited to 0, 1, and 2."
I would change this normative language to informative text:

   If the AMR-WB codec is engaged in an interoperable
   interconnection with VMR-WB, the active AMR-WB codec mode set
   needs to be limited to 0, 1, and 2.

11. Section 5.4: The figure is too wide.

12. Section 6.2: I think the current specification of the header-free format would not work. There are two frame types that have equal length. Both FT 9 and FT 10 are 5 bytes long, and so cannot be distinguished from each other.
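
The problem in comment 12 can be checked mechanically: if a header-free receiver infers the frame type from the payload length alone, every length must map to exactly one frame type. The Python sketch below is only an illustration. The sizes for FT 9 and FT 10 come from the comment above; the FT 0 entry is a placeholder and not a value taken from the draft.

```python
from collections import defaultdict


def ambiguous_sizes(size_by_ft):
    """Return {size: [frame types]} for sizes shared by more than one FT."""
    by_size = defaultdict(list)
    for ft, size in sorted(size_by_ft.items()):
        by_size[size].append(ft)
    return {size: fts for size, fts in by_size.items() if len(fts) > 1}


# FT 9 and FT 10 are both 5 bytes per the review comment.
# The FT 0 entry is a placeholder used only to show an unambiguous size.
sizes = {0: 17, 9: 5, 10: 5}
print(ambiguous_sizes(sizes))  # {5: [9, 10]}
```

Any non-empty result means the length alone cannot identify the frame type, which is the case for FT 9 and FT 10.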

13. Section 6.2:
  "Since the header-free payload format is not compatible with
   AMR-WB, it is RECOMMENDED that only VMR-WB modes 0, 1, and 2
   be used with this payload format."

As your scenario in 5.2 shows, there are cases where the AMR-WB modes could be transported in the VMR-WB header-free format to a gateway and then be repackaged into the AMR-WB payload format. So I think there are reasons against discouraging their use.

14. Section 6.3.2:

  "CMR (4 bits): Indicates a codec mode request sent to the
   speech encoder at the site of the receiver of this payload,
   provided that the network allows the use of the requested
   mode."

In the IETF we normally do not talk about the network actually having control over what happens at the application layer, especially over its data. It is very unclear what would impose any restriction in this case. Is it the congestion state in the IP transport network, thus requiring the sender to use even lower modes?

15. Section 6.3.2:
  "The encoder SHOULD follow a received mode request, but MAY
   change to a different mode if the network necessitates it,
   for example to control congestion."

Again, a question of appropriate language. Also, I don't think it is only the network that can put requirements on this. Application limitations clearly enter into this consideration as well.

16. Section 6.3.3: Table 3.

So the VMR-WB framed version of the AMR-WB SID is only 35 bits, while the normal AMR-WB version is 40 bits. Is this correct?

17. Section 6.4:
I think the draft should recommend here that a VMR-WB RTP payload implementation also understand AMR-WB signalling, so that it can determine when it can accept offers with AMR-WB and how to offer itself as an AMR-WB capable host.


18. Section 7.1: Regarding the magic numbers: has any of this file format been approved somewhere else? If not, I would strongly recommend that you remove the single-channel case. The reason AMR has two is that multi-channel support was added after the 3GPP release, and there were already terminals available that would have failed if we had used the same magic number for a multi-channel definition. If that is not the case here, I do not understand the reason for this.

19. Section 7.1: The AMR-WB interoperable VMR-WB file format: what is the purpose of this format? I have a hard time understanding how it is actually useful. If the sole purpose is to mark the file as being suitable for repacking into an AMR-WB file without transcoding, then I think you can spend one of the reserved bits in the 32-bit channel description field for this purpose. I think using one file format with one magic number is simpler than having four, when the sole purpose is to indicate different facts about the content of the file.

20. Section 9.1: The last paragraph does not apply to this payload format as there is no ULP support.

21. Section 9.2: Last paragraph: I think this list of things to include should be extended to include the SSRC and the RTP PT. In practice any tampering with the RTP header would result in failure to decode. And for authentication purposes, the SSRC is needed to ensure that the right source is sending the data. Otherwise one flow could be claimed to come from some other participant.
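
As a rough illustration of comment 21, the Python sketch below computes an authentication tag over the RTP fixed header together with the payload, so that the PT and SSRC are covered. It is not SRTP and not what the draft specifies; the use of HMAC-SHA256, the key, and the function names are assumptions made only for this example.

```python
import hashlib
import hmac
import struct


def rtp_header(pt, seq, timestamp, ssrc, marker=False):
    """Pack a 12-byte RTP fixed header (V=2, no padding, extension, or CSRCs)."""
    byte0 = 2 << 6
    byte1 = (0x80 if marker else 0) | (pt & 0x7F)
    return struct.pack("!BBHII", byte0, byte1, seq & 0xFFFF,
                       timestamp & 0xFFFFFFFF, ssrc & 0xFFFFFFFF)


def auth_tag(key, header, payload):
    """HMAC-SHA256 over header and payload, so PT and SSRC are protected."""
    return hmac.new(key, header + payload, hashlib.sha256).digest()


def verify(key, header, payload, tag):
    """Constant-time comparison of the expected and received tags."""
    return hmac.compare_digest(auth_tag(key, header, payload), tag)


if __name__ == "__main__":
    key = b"example-key"  # placeholder key for the illustration
    payload = b"\x00" * 5
    good = rtp_header(pt=97, seq=1, timestamp=320, ssrc=0x1234)
    forged = rtp_header(pt=97, seq=1, timestamp=320, ssrc=0x9999)
    tag = auth_tag(key, good, payload)
    print(verify(key, good, payload, tag), verify(key, forged, payload, tag))
```

With the header inside the authenticated data, a packet whose SSRC has been rewritten to impersonate another participant no longer verifies.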

22. Section 9.3:
   "Late packets (i.e., unavailability of a packet when needed
   for decoding at the receiver) SHALL be treated as lost
   packets."

I don't see how they could be treated otherwise, and I don't see the need for normative language here. It shouldn't have any impact on interoperability, so this is an overuse of normative language.


23. Section 9.3:
   "Furthermore, if the late packet is part of an interleave
   group, depending upon the availability of the other packets
   in that interleave group, decoding MUST be resumed from the
   next (sequential order) available packet. In other words,
   the unavailability of a packet in an interleave group at
   certain time SHOULD not invalidate the other packets within
   that interleave group that MAY arrive later."

I think there is an error in the first sentence:
shouldn't "available packet" be "available frame"?

Also, I don't think using such strong normative language is appropriate. These are implementation choices that do not affect interoperability. They can be recommended ways, but not mandated ones.
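
The receiver behavior discussed in comments 22 and 23 amounts to a simple playout rule, sketched below in Python under stated assumptions: frames in an interleave group are indexed from 0, and `arrived` holds only the frames that were available by their decode time. The function name and the "decode"/"conceal" labels are made up for this illustration and do not come from the draft.

```python
def playout(group_size, arrived):
    """Return a list of (frame_index, action) for one interleave group.

    `arrived` maps frame index -> frame bytes for frames that were
    available at their decode time. A missing or late frame is handled
    as lost (concealed), and decoding continues with the next frame
    that is available.
    """
    schedule = []
    for idx in range(group_size):
        if idx in arrived:
            schedule.append((idx, "decode"))
        else:
            schedule.append((idx, "conceal"))
    return schedule


if __name__ == "__main__":
    # Frame 1 was late; the other frames of the group remain usable.
    print(playout(4, {0: b"a", 2: b"c", 3: b"d"}))
```

Nothing in this rule is visible on the wire, which is why it reads as an implementation choice rather than an interoperability requirement.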

24. Section 10.1: I know too little about the VMR-WB codec and its usage in the system. What is the motivation behind the mode-set parameter? Also, the listed possible values refer to a mode numbering that is defined in the VMR-WB specification. I think a reference is in order here.

25. Section 10.1: I think you should consider writing a section on interoperability under additional information. The potential usage of the AMR-WB MIME type for this codec as well belongs in the MIME registration.

26. Section 10.2:
     "The MIME subtype (payload format name) goes in SDP
      "a=rtpmap" as the encoding name.  The RTP clock rate in
      "a=rtpmap" MUST be 16000 for VMR-WB (Note that 8000 is
      also supported by VMR-WB for narrowband I/O processing), ..."

I think you should change the text in the following way, so that it doesn't contradict itself:

      The MIME subtype (payload format name) goes in SDP
      "a=rtpmap" as the encoding name.  The RTP clock rate in
      "a=rtpmap" MUST be 16000 or 8000 (narrowband I/O processing) for
      VMR-WB, ...
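
A receiver-side check for the rule proposed in comment 26 could look like the Python sketch below. It assumes the encoding name is "VMR-WB" (the MIME subtype) and uses a hypothetical dynamic payload type in the usage example; the regular expression covers only the basic "a=rtpmap" form with an optional trailing parameter.

```python
import re

ALLOWED_CLOCK_RATES = {16000, 8000}
RTPMAP = re.compile(r"^a=rtpmap:(\d+) ([^/\s]+)/(\d+)(?:/(\d+))?$")


def check_vmr_wb_rtpmap(line, encoding_name="VMR-WB"):
    """Return True if `line` is an rtpmap for VMR-WB with an allowed clock rate."""
    m = RTPMAP.match(line.strip())
    if m is None:
        return False
    name, rate = m.group(2), int(m.group(3))
    # Encoding names are compared case-insensitively.
    return name.upper() == encoding_name.upper() and rate in ALLOWED_CLOCK_RATES


if __name__ == "__main__":
    for candidate in ("a=rtpmap:97 VMR-WB/16000",
                      "a=rtpmap:97 VMR-WB/8000",
                      "a=rtpmap:97 VMR-WB/44100"):
        print(candidate, check_vmr_wb_rtpmap(candidate))
```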

27. Section 10.3: You will need rules for how the mode-set parameter works under offer/answer.

28. Section 11:
"The new attributes "dtx" and "payload_format" need to be registered."

All parameters defined in the MIME registration template are automatically included. Also, there is no repository of parameter names that needs to be considered. It is simply that people have borrowed a number of parameters from each other's specifications. So you should remove the above sentence, as it is not a request that IANA knows how to fulfill.

29. References: Both the RTP and the AVP references should indicate that they are standards, by adding "STD 64" and "STD 65" to the references.

30. References: Please check that the references are up to date. At least ref 8 is out of date. It is RFC 3711 now.

Cheers

Magnus






--

Magnus Westerlund

Multimedia Technologies, Ericsson Research EAB/TVA/A
----------------------------------------------------------------------
Ericsson AB                | Phone +46 8 4048287
Torshamsgatan 23           | Fax   +46 8 7575550
S-164 80 Stockholm, Sweden | mailto: magnus.westerlund at ericsson.com


_______________________________________________
Audio/Video Transport Working Group
avt at ietf.org
https://www1.ietf.org/mailman/listinfo/avt