Audio/Video Transport WG                                   Ari Lakaniemi
Internet Draft                                                     Nokia
Intended status: Standards track                             Ye-Kui Wang
Expires: April 2010                                  Huawei Technologies
                                                        October 22, 2009


               RTP payload format for G.718 speech/audio
                    draft-ietf-avt-rtp-g718-02.txt

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79. This document may contain material
   from IETF Documents or IETF Contributions published or made publicly
   available before November 10, 2008.
   The person(s) controlling the copyright in some of this material may
   not have granted the IETF Trust the right to allow modifications of
   such material outside the IETF Standards Process. Without obtaining
   an adequate license from the person(s) controlling the copyright in
   such materials, this document may not be modified outside the IETF
   Standards Process, and derivative works of it may not be created
   outside the IETF Standards Process, except to format it for
   publication as an RFC or to translate it into languages other than
   English.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on April 22, 2010.

Copyright Notice

   Copyright (c) 2009 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents in effect on the date of
   publication of this document (http://trustee.ietf.org/license-info).
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document. Code Components
   extracted from this document must include Simplified BSD License text
   as described in Section 4.e of the Trust Legal Provisions and are
   provided without warranty as described in the BSD License.

Abstract

   This document specifies the Real-Time Transport Protocol (RTP)
   payload format for the Embedded Variable Bit-Rate (EV-VBR)
   speech/audio codec, specified in ITU-T G.718. A media type
   registration for this RTP payload format is also included.

Conventions used in this document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

Table of Contents

   1. Introduction
   2. Background
      2.1. The G.718 codec
      2.2. Benefits of layered design
      2.3. Transmitting layered data
      2.4. Scaling scenarios & rate control
   3. G.718 RTP payload format
      3.1. Payload Structure
         3.1.1. Payload Header
         3.1.2. G.718 transport blocks
      3.2. Handling the Encoded data
      3.3. G.718 scaling
      3.4. CRC verification
      3.5. G.718 session
      3.6. Cross-stream/cross-layer timing synchronization
      3.7. RTP Header usage
   4. Payload Format Parameters
      4.1. Media Type Registration
      4.2. Mapping to SDP Parameters
      4.3. Offer/answer considerations
      4.4. Declarative usage of SDP
      4.5. SDP examples
   5. Security Considerations
   6. Congestion control
   7. IANA Considerations
   APPENDIX A: Payload examples
      A.1. Simple payload examples
         A.1.1. All the layers in the same payload
         A.1.2. Layers in separate RTP streams
      A.2. Advanced examples
         A.2.1. Different update rate for subset of layers
         A.2.2. Redundant frames with limited set of layers
   8. References
      8.1. Normative References
      8.2. Informative References
   Author's Addresses
   Acknowledgment
   9. Open Issues
   10. Changes Log

1. Introduction

   The International Telecommunication Union (ITU-T) Recommendation
   G.718 [G.718] specifies the Embedded Variable Bit Rate (EV-VBR)
   speech/audio codec. This document specifies the Real-time Transport
   Protocol (RTP) [RFC3550] payload format for this codec.

2. Background

2.1. The G.718 codec

   G.718 is an embedded variable rate speech codec having a layered
   design. The bitstream of the G.718 core codec consists of a core
   layer, denoted as L1, and four enhancement layers, denoted as L2-L5.
   The bit-rates of the G.718 core codec range from 8 kbit/s (core layer
   only) to 32 kbit/s (with all layers up to L5).
   Furthermore, the G.718 codec also supports discontinuous transmission
   (DTX) and comfort noise generation (CNG) by sending Silence
   Descriptor (SID) frames during periods of non-active input signal,
   resulting in a reduced bit-rate. The sampling frequency of the core
   codec is 16 kHz and the codec operates on 20 ms frames. The G.718
   codec is also capable of narrowband operation with audio input and/or
   output at 8 kHz sampling frequency.

   While transmitting/receiving the core layer L1 is enough for
   successful decoding of the audio content, each of the enhancement
   layers Ln (n being 2 to 5, inclusive) provides an improvement to
   reconstructed audio quality. Thus, the core layer ensures the basic
   communication while the enhancement layers can be used to improve the
   perceptual quality. Furthermore, enhancement layers are dependent on
   all the lower layers in the sense that successful decoding of layer
   Ln also requires all the layers Lm with m<n.

   [...]

   [...] >n MUST also be discarded.

3.5. G.718 session

   A G.718 session consists of one or several RTP sessions carrying
   encoded G.718 data according to the payload format specified in
   section 3.1.

3.6. Cross-stream/cross-layer timing synchronization

   In case a G.718 session consists of multiple RTP sessions, the RTP
   packets transmitted on separate RTP sessions need to be synchronized
   in order to enable reconstruction of the frames at the receiving end.
   Since each of the RTP sessions uses its own random initial value for
   the RTP timestamp, there is also a random offset between the RTP
   timestamp values of the packets carrying the EDUs belonging to the
   same encoded frame in different RTP sessions.

   The receiver MUST use the traditional RTCP based mechanism to
   synchronize streams by using the RTP and NTP timestamps of the RTCP
   Sender Reports (SR) it receives.

3.7. RTP Header usage

   This section specifies the usage of some fields of the RTP header
   (specified in section 5 of [RFC3550]) with the G.718 RTP payload
   format. Setting of other RTP header fields is as specified in
   [RFC3550].

   The RTP timestamp corresponds to the sampling instant of the first
   encoded sample of the earliest frame in the payload. The timestamp
   clock frequency is 32 kHz.

   The marker bit (M) of each of the RTP streams of the session SHALL be
   set to value 1 if the payload carries an EDU belonging to the first
   frame after an inactive period, i.e. an EDU from the first frame of a
   talkspurt. For all other packets the marker bit is set to value 0.

4. Payload Format Parameters

   This section defines the parameters that may be used to configure
   optional features in the G.718 RTP transmission.

   The parameters are defined here as part of the media subtype
   registration for the G.718 codec. Mapping of the parameters into the
   Session Description Protocol (SDP) [RFC4566] is also provided for
   those applications that use SDP. In control protocols that do not
   use MIME or SDP, the media type parameters must be mapped to the
   appropriate format used with that control protocol.

4.1. Media Type Registration

   This registration is done using the template defined in RFC 4288
   [RFC4288] and following RFC 4855 [RFC4855].

   Type name: audio

   Subtype name: G718

   Required parameters: none

   Optional parameters:

      mode:      This parameter MAY be used to indicate whether the
                 mode with layer L1 being present or the AMR-WB
                 compatible mode (with layer L1' being present) is in
                 use. If this parameter is not present or the value of
                 this parameter is equal to 0, the mode with layer L1
                 being present is in use. Otherwise, the AMR-WB
                 compatible mode is in use. When this parameter is
                 present, the value MUST be either 0 or 1.

      Author's note: When the upcoming stereo and SWB options are
      present, the semantics of this parameter may change.

      layers:    The numbers of the layers (in range from 1 to 5,
                 denoting layers from L1 to L5, respectively)
                 transmitted in this session, expressed as a
                 comma-separated list of layer numbers. If the
                 parameter is present, at least layer L1 or L1' MUST be
                 included in the list of layers in one of the RTP
                 sessions included in the G.718 session. If the
                 parameter is not present, all layers up to layer L5
                 MAY be used in the session.

      Author's note: Why not use semantics similar to those of L-ID?

      ptime:     The recommended length of time (in milliseconds)
                 represented by the media in a packet. See Section 6
                 of [RFC4566].

      maxptime:  The maximum length of time (in milliseconds) that can
                 be encapsulated in a packet. See Section 6 of
                 [RFC4566].

      Author's note: Some further study is needed to see if separate
      parameters for sending and receiving capabilities/preferences are
      needed -- especially for upcoming stereo and SWB options.

      Author's note: The support for upcoming SWB and stereo options
      needs to be taken into account. Basically we can either 1) extend
      the parameter "layers" to cover also this aspect, or 2) define
      separate parameter(s) for these new options when more details on
      the stereo/SWB support are available.

   Encoding considerations:

      This media type is framed and contains binary data; see Section
      4.8 of [RFC4288].

   Security considerations: See Section 5 of RFC xxxx

   Interoperability considerations: none

   Published specification: RFC xxxx

   Applications which use this media type:

      For example Voice over IP, audio and video conferencing, audio
      streaming and voice messaging.

   Additional information: none

   Person & email address to contact for further information:

      Ari Lakaniemi, ari.lakaniemi@nokia.com

   Intended usage: COMMON

   Restrictions on usage:

      This media type depends on RTP framing, and hence is only defined
      for transfer via RTP [RFC3550].

   Author:

      Ari Lakaniemi, ari.lakaniemi@nokia.com

   Change controller:

      IETF Audio/Video Transport working group delegated from the IESG

4.2. Mapping to SDP Parameters

   The information carried in the media type specification has a
   specific mapping to fields of the SDP [RFC4566], which is commonly
   used to describe RTP sessions. When SDP is used to specify sessions
   employing the G.718 codec, the mapping is as follows:

   o The media type ("audio") goes in SDP "m=" as the media name.

   o The media subtype ("G718") goes in SDP "a=rtpmap" as the encoding
      name. The RTP clock rate in "a=rtpmap" MUST be 32000 for G.718.

      Author's note: The current choice for the RTP clock rate is a
      'placeholder'. The clock rate needs to be set according to the
      SWB sampling rate, which is still T.B.D. Since the core codec
      employs a 16000 Hz sampling rate, an integer multiple of 16000 Hz
      seems to be a preferable choice.

   o The parameters "ptime" and "maxptime" go in the SDP "a=ptime" and
      "a=maxptime" attributes, respectively.

   o Any remaining parameters go in the SDP "a=fmtp" attribute by
      copying them directly from the media type string as a
      semicolon-separated list of parameter=value pairs.

4.3. Offer/answer considerations

   The following considerations apply when using the SDP offer/answer
   [RFC3264] mechanism to negotiate the G.718 transport. The parameter
   "layers" MAY be used to indicate, for each RTP session belonging to
   the current G.718 session, the layer configuration that the
   end-point making the offer is ready to transmit and wishes to
   receive.
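
   As a non-normative illustration, the value of the "layers" parameter
   defined in Section 4.1 (a comma-separated list of layer numbers in
   the range from 1 to 5) could be parsed along the following lines.
   This is only a sketch: the function name parse_layers is
   hypothetical, and tolerating white space around the numbers is an
   assumption not stated in this document.

```python
def parse_layers(value):
    """Parse a "layers" value such as "1,2,3" into a sorted list of ints.

    Hypothetical helper for illustration only. Each item must be an
    integer from 1 to 5 (layers L1..L5, as in Section 4.1).
    """
    layers = []
    for item in value.split(","):
        # Assumption: surrounding white space is tolerated.
        number = int(item.strip())  # raises ValueError if not numeric
        if not 1 <= number <= 5:
            raise ValueError("layer number out of range 1..5: %d" % number)
        layers.append(number)
    return sorted(layers)


print(parse_layers("1,2"))    # [1, 2]
print(parse_layers("4, 5"))   # [4, 5]
```

   A value carrying a layer number outside the range from 1 to 5 is
   rejected with ValueError in this sketch; how a real implementation
   treats such a value is outside the scope of the illustration.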

   o In case the G.718 session consists of a single RTP session, it is
      RECOMMENDED not to impose any layer restrictions for the session
      but to use the rate control functionality to set possible
      restrictions on usage of the higher or highest layers. If the
      offer includes a layer configuration parameter, the answer MAY
      use a different configuration, but the highest layer in the
      answer MUST NOT be higher than the highest layer of the offered
      configuration.

      Author's note: Support for answer modifying the layer
      configuration is FFS.

   o In case the G.718 session consists of multiple RTP sessions, the
      answer MUST use the layer configurations provided in the offer
      for the sessions it accepts.

4.4. Declarative usage of SDP

   In declarative usage, such as SDP in RTSP [RFC2326] or SAP [RFC2974],
   the parameter "layers" SHALL be interpreted to provide a set of
   layers that the sender may use in the session.

4.5. SDP examples

   Some example SDP session descriptions utilizing G.718 encodings are
   provided below.

   The first example illustrates the simple case where a G.718 session
   employing a single RTP session and the AVPF profile is offered, and
   the answer accepts the offer without any changes.

   Offer:

      m=audio 49120 RTP/AVPF 97
      a=rtpmap:97 G718/32000/1

   Answer:

      m=audio 49120 RTP/AVPF 97
      a=rtpmap:97 G718/32000/1

   The second example shows a slightly more complex case where a G.718
   session using a single RTP session and the AVPF profile is offered
   with a restriction to send/receive only layers L1 and L2. The
   answer indicates that the other end-point is happy to receive (and
   send) layers up to L5.

   Offer:

      m=audio 49120 RTP/AVPF 97
      a=rtpmap:97 G718/32000/1
      a=fmtp:97 layers=1,2

   Answer:

      m=audio 49120 RTP/AVPF 97
      a=rtpmap:97 G718/32000/1
      a=fmtp:97 layers=1,2,3,4,5

   The third example shows a G.718 session using multiple RTP sessions
   with the AVPF profile. The answerer wishes to use only layers up to
   L3.

   Offer:

      m=audio 49120 RTP/AVPF 97
      a=rtpmap:97 G718/32000/1
      a=fmtp:97 layers=1,2
      a=mid:1

      m=audio 49122 RTP/AVPF 98
      a=rtpmap:98 G718/32000/1
      a=fmtp:98 layers=3
      a=mid:2
      a=depend:lay 1

      m=audio 49124 RTP/AVPF 99
      a=rtpmap:99 G718/32000/1
      a=fmtp:99 layers=4,5
      a=mid:3
      a=depend:lay 1 2

   Answer:

      m=audio 49120 RTP/AVPF 97
      a=rtpmap:97 G718/32000/1
      a=fmtp:97 layers=1,2
      a=mid:1

      m=audio 49122 RTP/AVPF 98
      a=rtpmap:98 G718/32000/1
      a=fmtp:98 layers=3
      a=mid:2
      a=depend:lay 1

   Note that the dependency signaling according to [smd-sdp] is used in
   the third example above to indicate the relationship between the
   layers distributed into separate RTP sessions.

5. Security Considerations

   RTP packets using the payload format defined in this specification
   are subject to the security considerations discussed in the RTP
   specification [RFC3550], and in any appropriate RTP profile (for
   example [RFC3551] or [RFC4585]). This implies that confidentiality
   of the media streams is achieved by encryption; for example, through
   the application of SRTP [RFC3711]. Because the data compression used
   with this payload format is applied end-to-end, any encryption needs
   to be performed after compression.

   A potential denial-of-service threat exists for data encodings using
   compression techniques that have non-uniform receiver-end
   computational load. The attacker can inject pathological datagrams
   into the stream that will increase the processing load of the decoder
   and may cause the receiver to be overloaded.
   For example, inserting additional EDUs representing the higher
   enhancement layers on top of the ones actually transmitted may
   increase the decoder load. However, the G.718 codec is not
   particularly vulnerable to such an attack, since the majority of the
   computational load in a G.718 session is associated with the
   encoder. Another possible form of attack is the forging of codec
   bit-rate control messages, which may result in the encoder employing
   a higher number of enhancement layers than originally intended and
   thereby requiring a larger amount of computation resources.
   Therefore, the usage of data origin authentication and data
   integrity protection of at least the RTP packet is RECOMMENDED; for
   example, with SRTP [RFC3711].

   Note that the appropriate mechanism to ensure confidentiality and
   integrity of RTP packets and their payloads is very dependent on the
   application and on the transport and signaling protocols employed.
   Thus, although SRTP is given as an example above, other possible
   choices exist.

   Note that end-to-end security with either authentication, integrity
   or confidentiality protection will prevent a network element not
   within the security context from performing media-aware operations
   other than discarding complete packets. To allow any (media-aware)
   intermediate network element to perform its operations, it is
   required to be a trusted entity which is included in the security
   context establishment.

6. Congestion control

   As a scalable codec, G.718 implicitly provides a means for
   congestion control by making it possible to 'thin' the bitstream.
   The RTP payload format according to this specification provides
   several different means for reducing the G.718 session bandwidth.
   The most appropriate mechanism (in terms of impact on the user
   experience) depends on the employed payload structure and also on
   the employed session configuration (single RTP session or multiple
   RTP sessions). The following means (in no particular order) can be
   used to assist congestion control procedures -- either by the sender
   or by an intermediate node.

   o The transport blocks carrying the EDUs representing the highest
      layers within the payload may be dropped.

   o The payloads carrying the EDUs representing the highest layers in
      a G.718 session may be dropped.

   o Transport blocks or payloads carrying EDUs belonging to redundant
      frames included in the payload may be dropped.

7. IANA Considerations

   IANA is kindly requested to register a media type for the G.718 codec
   for RTP transport, as specified in Section 4.1 of this document.

APPENDIX A: Payload examples

   The G.718 payload structure enables flexible transport either by
   carrying all layers in the same payload or by separating the layers
   into separate payloads. The following subsections illustrate
   different possibilities for transport by simple examples. Note that
   the examples do not show the full payload structure, to keep the
   illustration simple.

A.1. Simple payload examples

A.1.1. All the layers in the same payload

   The illustration below shows layers L1-L3 from two encoded frames
   encapsulated into separate payloads, each using a single transport
   block.

   +-------+--------+-----+------+------+------+
   | RTP1  | L-ID=3 |NF=0 |F1-L1 |F1-L2 |F1-L3 |
   +-------+--------+-----+------+------+------+

   +-------+--------+-----+------+------+------+
   | RTP2  | L-ID=3 |NF=0 |F2-L1 |F2-L2 |F2-L3 |
   +-------+--------+-----+------+------+------+

   In case the same layers from two input frames are encapsulated into
   one payload using a single transport block, the structure is as
   shown below.

   +-------+--------+-----+------+------+------+------+------+------+
   | RTP1  | L-ID=3 |NF=1 |F1-L1 |F2-L1 |F1-L2 |F2-L2 |F1-L3 |F2-L3 |
   +-------+--------+-----+------+------+------+------+------+------+

   The third example illustrates the case where the layers L1-L3 from
   two input frames are encapsulated into one payload using two separate
   transport blocks, the first one carrying L1 and the other one
   containing L2 and L3.

   +-------+--------+-----+------+------+
   | RTP1  | L-ID=1 |NF=1 |F1-L1 |F2-L1 |
   +-------+--------+-----+------+------+------+------+
           | L-ID=7 |NF=1 |F1-L2 |F2-L2 |F1-L3 |F2-L3 |
           +--------+-----+------+------+------+------+

A.1.2. Layers in separate RTP streams

   In this case the data for each layer is transmitted in its own
   payload.

   In the first example each transport block including a single EDU is
   carried in its own RTP payload.

   +-------+--------+-----+-----+ +-------+--------+-----+-----+
   | RTP1a | L-ID=1 |NF=0 |F1-L1| | RTP1b | L-ID=6 |NF=0 |F1-L2|
   +-------+--------+-----+-----+ +-------+--------+-----+-----+

   +-------+--------+-----+-----+ +-------+--------+-----+-----+
   | RTP1c |L-ID=10 |NF=0 |F1-L3| | RTP2a | L-ID=1 |NF=0 |F2-L1|
   +-------+--------+-----+-----+ +-------+--------+-----+-----+

   +-------+--------+-----+-----+ +-------+--------+-----+-----+
   | RTP2b | L-ID=6 |NF=0 |F2-L2| | RTP2c |L-ID=10 |NF=0 |F2-L3|
   +-------+--------+-----+-----+ +-------+--------+-----+-----+

   If the payloads carry data from two consecutive input frames, the
   same encoded data as in the previous example is arranged as follows.

   +-------+--------+-----+-----+-----+
   | RTP1a | L-ID=1 |NF=1 |F1-L1|F2-L1|
   +-------+--------+-----+-----+-----+

   +-------+--------+-----+-----+-----+
   | RTP1b | L-ID=6 |NF=1 |F1-L2|F2-L2|
   +-------+--------+-----+-----+-----+

   +-------+--------+-----+-----+-----+
   | RTP1c |L-ID=10 |NF=1 |F1-L3|F2-L3|
   +-------+--------+-----+-----+-----+

A.2. Advanced examples

A.2.1. Different update rate for subset of layers

   An example employing different update rates (i.e. different number of
   frames per packet) for selected subsets of layers. In these examples
   all core codec layers L1-L5 are shown.

   +-------+--------+-----+-----+-----+-----+-----+
   | RTP1  | L-ID=1 |NF=3 |F1-L1|F2-L1|F3-L1|F4-L1|
   +-------+--------+-----+-----+-----+-----+-----+

   +-------+--------+-----+-----+-----+-----+-----+
   | RTP2a | L-ID=7 |NF=1 |F1-L2|F2-L2|F1-L3|F2-L3|
   +-------+--------+-----+-----+-----+-----+-----+

   +-------+--------+-----+-----+-----+
   | RTP3a |L-ID=14 |NF=0 |F1-L4|F1-L5|
   +-------+--------+-----+-----+-----+

   +-------+--------+-----+-----+-----+
   | RTP3b |L-ID=14 |NF=0 |F2-L4|F2-L5|
   +-------+--------+-----+-----+-----+

   +-------+--------+-----+-----+-----+-----+-----+
   | RTP2b | L-ID=7 |NF=1 |F3-L2|F4-L2|F3-L3|F4-L3|
   +-------+--------+-----+-----+-----+-----+-----+

   +-------+--------+-----+-----+-----+
   | RTP3c |L-ID=14 |NF=0 |F3-L4|F3-L5|
   +-------+--------+-----+-----+-----+

   +-------+--------+-----+-----+-----+
   | RTP3d |L-ID=14 |NF=0 |F4-L4|F4-L5|
   +-------+--------+-----+-----+-----+

A.2.2. Redundant frames with limited set of layers

   An example transmitting layers L1-L3 as primary data and L1 (of the
   previous frame) as redundant data is shown below. Each payload
   carries one primary (i.e.
new) frame in one transport block and one 1081 redundant frame, which in this example is the frame preceding the 1082 primary frame, in another transport block. 1084 +-------+--------+-----+-----+--------+-----+-----+-----+-----+ 1085 | RTP1 | L-ID=1 |NF=0 |F0-L1| L-ID=3 |NF=0 |F1-L1|F1-L2|F1-L3| 1086 +-------+--------+-----+-----+--------+-----+-----+-----+-----+ 1088 +-------+--------+-----+-----+--------+-----+-----+-----+-----+ 1089 | RTP2 | L-ID=1 |NF=0 |F1-L1| L-ID=3 |NF=0 |F2-L1|F2-L2|F2-L3| 1090 +-------+--------+-----+-----+--------+-----+-----+-----+-----+ 1092 +-------+--------+-----+-----+--------+-----+-----+-----+-----+ 1093 | RTP3 | L-ID=1 |NF=0 |F2-L1| L-ID=3 |NF=0 |F3-L1|F3-L2|F3-L3| 1094 +-------+--------+-----+-----+--------+-----+-----+-----+-----+ 1096 Alternatively, the payload carrying also redundant data for a subset 1097 of layers can be arranged differently, as shown in the example below. 1099 +-------+--------+-----+-----+-----+-----+--------+-----+-----+ 1100 | RTP1 | L-ID=3 |NF=0 |F0-L1|F0-L2|F0-L3| L-ID=1 |NF=0 |F1-L1| 1101 +-------+--------+-----+-----+-----+-----+--------+-----+-----+ 1103 +-------+--------+-----+-----+-----+-----+--------+-----+-----+ 1104 | RTP2 | L-ID=3 |NF=0 |F1-L1|F1-L2|F1-L3| L-ID=1 |NF=0 |F2-L1| 1105 +-------+--------+-----+-----+-----+-----+--------+-----+-----+ 1107 +-------+--------+-----+-----+-----+-----+--------+-----+-----+ 1108 | RTP3 | L-ID=3 |NF=0 |F2-L1|F2-L2|F2-L3| L-ID=1 |NF=0 |F3-L1| 1109 +-------+--------+-----+-----+-----+-----+--------+-----+-----+ 1111 Now the first transport block carries the primary data and the second 1112 transport block carries the redundant data, which in this case covers 1113 the frame following the primary frame. The benefit of this approach 1114 is that the redundant data is included in the last (secondary) 1115 transport block of the payload, which might be beneficial for 1116 possible payload scaling operation within the network. 1118 8. References 1120 8.1. 
Normative References 1122 [AMR-WB] 3GPP TS 26.171, "Adaptive Multi-Rate Wideband (AMR-WB) 1123 speech codec; General description (Release 7)", v7.0.0, 1124 September 2006. 1126 [G.718] ITU-T Recommendation G.718, "Frame Error Robust Narrowband 1127 and Wideband Embedded Variable Bit-Rate Coding of Speech 1128 and Audio from 8-32 Kbit/s", (consented) May 2008. 1130 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 1131 Requirement Levels", BCP 14, RFC 2119, March 1997. 1133 [RFC3264] Rosenberg, J., Schulzrinne, H., "An Offer/Answer Model with 1134 Session Description Protocol (SDP)", RFC 3264, June 2002. 1136 [RFC3550]Schulzrinne, H., Casner, S., Frederick, R. and Jacobson, V., 1137 "RTP: A Transport Protocol for Real-Time Applications", STD 1138 64, RFC 3550, July 2003. 1140 [RFC3551] Schulzrinne, H., Casner, S., "RTP Profile for Audio and 1141 Video Conferences with Minimal Control", STD 65, RFC 3551, 1142 July 2003. 1144 [RFC3711] Baugher, M., McGrew, D., Naslund, M., Carrara, E., Norrman, 1145 K., "The Secure Real-Time Transport Protocol (SRTP)", RFC 1146 3711, March 2004. 1148 [RFC4288] Freed, N., Klensin, J., "Media Type Specifications and 1149 Registration Procedures", BCP 13, RFC 4288, December 2005. 1151 [RFC4566] Handley, M., Jacobson, V. and Perkins, C., "SDP: Session 1152 Description Protocol", RFC 4566, July 2006. 1154 [RFC4585] Ott, J., Wenger, S., Sato, N., Burmeister, C., Rey, J., 1155 "Extended RTP Profile for Real-Time Transport Control 1156 Protocol (RTCP)-Based Feedback (RTP/AVPF)", RFC 4585, July 1157 2006. 1159 [RFC4855] Casner, S., "Media Type Registration of RTP Payload 1160 Formats", RFC 4855, February 2007. 1162 [RFC4867] Sjoberg, J., Westerlund, M., Lakaniemi, A., Xie, Q., "RTP 1163 Payload Format and File Storage Format fort he Adaptive 1164 Multi-Rate (AMR) and Adaptive Multi-Rate Wideband (AMR-WB) 1165 Audio Codecs", RFC 4867, April 2007. 
   [RFC5104]  Wenger, S., Chandra, U., Westerlund, M., Burman, B.,
              "Codec Control Messages in the RTP Audio-Visual Profile
              with Feedback (AVPF)", RFC 5104, February 2008.

   [smd-sdp]  Schierl, T., Wenger, S., "Signaling media decoding
              dependency in Session Description Protocol (SDP)",
              draft-schierl-mmusic-layered-codec-04 (work in
              progress), June 2007.

8.2. Informative References

   [McCanne]  McCanne, S., Jacobson, V., Vetterli, M., "Receiver-driven
              layered multicast", in Proc. of ACM SIGCOMM'96, pages
              117-130, Stanford, CA, August 1996.

   [RFC2326]  Schulzrinne, H., Rao, A., Lanphier, R., "Real Time
              Streaming Protocol (RTSP)", RFC 2326, April 1998.

   [RFC2974]  Handley, M., Perkins, C., Whelan, E., "Session
              Announcement Protocol", RFC 2974, October 2000.

   [RFC3828]  Larzon, L-A., Degermark, M., Pink, S., Jonsson, L-E.,
              Fairhurst, G., "The Lightweight User Datagram Protocol
              (UDP-Lite)", RFC 3828, July 2004.

   [RFC4340]  Kohler, E., Handley, M., Floyd, S., "Datagram Congestion
              Control Protocol (DCCP)", RFC 4340, March 2006.

   [RFC5117]  Westerlund, M., Wenger, S., "RTP Topologies", RFC 5117,
              January 2008.

Authors' Addresses

   Ari Lakaniemi
   Nokia
   P.O.Box 407
   FIN-00045 Nokia Group, FINLAND

   Phone: +358-71-8008000
   Email: ari.lakaniemi@nokia.com

   Ye-Kui Wang
   Huawei Technologies
   400 Somerset Corp Blvd, Suite 602
   Bridgewater, NJ 08807, USA

   Phone: +1-908-541-3518
   Email: yekuiwang@huawei.com

Acknowledgment

   Funding for the RFC Editor function is currently provided by the
   Internet Society.

9. Open Issues

   1) Support of the super-wideband (SWB) audio and stereophonic
      encoding extensions to ITU-T G.718, currently being worked on by
      ITU-T, is to be specified after ITU-T completes that work.

      a. Some further study is needed to see whether separate
         parameters for sending and receiving
         capabilities/preferences are needed, especially for the
         upcoming stereo and SWB options.

      b. The support for the upcoming SWB and stereo options needs to
         be taken into account. We can either 1) extend the parameter
         "layers" to cover this aspect as well, or 2) define separate
         parameter(s) for these new options when more details on the
         stereo/SWB support are available.

   2) For streaming or other applications that tolerate a relatively
      long end-to-end delay, it would sometimes be beneficial to
      aggregate more than 4 frames in one Transport Block (TB). Should
      the NF field be longer?

   3) On layer structure and configuration signalling: currently, a
      unique layer ID is assigned to every possible layer combination.
      See the editing notes below Table 3 for other possible
      approaches. One of the alternatives may be chosen in the final
      draft.

   4) Currently, it is mandated that lower layer EDUs of later frames
      go before higher layer EDUs of earlier frames in a transport
      block. This ordering is friendlier to adaptation (dropping of
      higher layers). However, if all layers are received, the
      depacketizer needs to reorder the EDUs into decoding order
      before feeding them to the decoder. The opposite ordering (i.e.
      lower layer EDUs of later frames go after higher layer EDUs of
      earlier frames; in other words, EDUs in a transport block are
      placed in decoding order) is therefore friendlier to the
      depacketizer. Another benefit of the latter is that it does not
      introduce any end-to-end delay. Which ordering is to be
      specified (or whether both are allowed, if needed) is FFS.

   5) MANEs dropping RTP packets are RTP translators. But are MANEs
      that drop only a subset of the transport blocks in one packet
      also RTP translators?
   6) RTCP-based cross-session synchronization is not possible until
      the first RTCP SRs have been received in all sessions. This
      implies that decoding only a subset of layers may be possible
      until RTCP SRs in all sessions have been received. This may
      impose a higher end-to-end delay or a higher bandwidth for RTCP
      data, and the approach may not work perfectly for some multicast
      topologies. A study by some AVT members is ongoing. Once an
      acceptable solution is found, the draft documenting that
      solution may be referenced in this draft.

   7) It might be better to change the semantics of the media type
      parameter 'layers' to be similar to those of L-ID.

   8) Offer/answer in which the answer is capable of modifying the
      layer configuration is FFS.

   9) Some references need to be updated in the final draft.

10. Change Log

   From draft-ietf-avt-rtp-g718-00 to draft-ietf-avt-rtp-g718-01

   - Updated the boilerplate.

   - Changed Ye-Kui Wang's affiliation and address.

   From draft-ietf-avt-rtp-g718-01 to draft-ietf-avt-rtp-g718-02

   - Updated the boilerplate (added the last sentence in the
     Copyright Notice).