idnits 2.17.1 draft-ietf-avt-rtp-g718-00.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- ** It looks like you're using RFC 3978 boilerplate. You should update this to the boilerplate described in the IETF Trust License Policy document (see https://trustee.ietf.org/license-info), which is required now. -- Found old boilerplate from RFC 3978, Section 5.1 on line 14. -- Found old boilerplate from RFC 3978, Section 5.5, updated by RFC 4748 on line 1238. -- Found old boilerplate from RFC 3979, Section 5, paragraph 1 on line 1215. -- Found old boilerplate from RFC 3979, Section 5, paragraph 2 on line 1222. -- Found old boilerplate from RFC 3979, Section 5, paragraph 3 on line 1228. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- No issues found here. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust Copyright Line does not match the current year -- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. If not, you may need to add the pre-RFC5378 disclaimer. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- The document date (October 23, 2008) is 5657 days in the past. Is this intentional? 
Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) -- Looks like a reference, but probably isn't: '4340' on line 591 == Unused Reference: 'RFC5104' is defined on line 1156, but no explicit reference was found in the text == Unused Reference: 'RFC4340' is defined on line 1181, but no explicit reference was found in the text -- Possible downref: Non-RFC (?) normative reference: ref. 'AMR-WB' -- Possible downref: Non-RFC (?) normative reference: ref. 'G.718' ** Obsolete normative reference: RFC 4288 (Obsoleted by RFC 6838) ** Obsolete normative reference: RFC 4566 (Obsoleted by RFC 8866) -- Obsolete informational reference (is this intentional?): RFC 2326 (Obsoleted by RFC 7826) -- Obsolete informational reference (is this intentional?): RFC 5117 (Obsoleted by RFC 7667) Summary: 3 errors (**), 0 flaws (~~), 3 warnings (==), 12 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 1 Audio/Video Transport WG Ari Lakaniemi 2 Internet Draft Ye-Kui Wang 3 Intended status: Standards track Nokia 4 Expires: April 2009 October 23, 2008 6 RTP payload format for G.718 speech/audio 7 draft-ietf-avt-rtp-g718-00.txt 9 Status of this Memo 11 By submitting this Internet-Draft, each author represents that any 12 applicable patent or other IPR claims of which he or she is aware 13 have been or will be disclosed, and any of which he or she becomes 14 aware will be disclosed, in accordance with Section 6 of BCP 79. 16 Internet-Drafts are working documents of the Internet Engineering 17 Task Force (IETF), its areas, and its working groups. Note that 18 other groups may also distribute working documents as Internet- 19 Drafts. 
21 Internet-Drafts are draft documents valid for a maximum of six months 22 and may be updated, replaced, or obsoleted by other documents at any 23 time. It is inappropriate to use Internet-Drafts as reference 24 material or to cite them other than as "work in progress."

26 The list of current Internet-Drafts can be accessed at 27 http://www.ietf.org/ietf/1id-abstracts.txt

29 The list of Internet-Draft Shadow Directories can be accessed at 30 http://www.ietf.org/shadow.html

32 This Internet-Draft will expire on April 23, 2009.

34 Copyright Notice

36 Copyright (C) The IETF Trust (2008).

38 Abstract

40 This document specifies the Real-Time Transport Protocol (RTP) 41 payload format for the Embedded Variable Bit-Rate (EV-VBR) 42 speech/audio codec, specified in ITU-T G.718. A media type 43 registration for this RTP payload format is also included.

45 Conventions used in this document

47 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 48 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 49 document are to be interpreted as described in RFC 2119 [RFC2119].

51 Table of Contents

53 1. Introduction
54 2. Background
55    2.1. The G.718 codec
56    2.2. Benefits of layered design
57    2.3. Transmitting layered data
58    2.4. Scaling scenarios & rate control
59 3. G.718 RTP payload format
60    3.1. Payload Structure
61       3.1.1. Payload Header
62       3.1.2. G.718 transport blocks
63    3.2. Handling the Encoded data
64    3.3. G.718 scaling
65    3.4. CRC verification
66    3.5. G.718 session
67    3.6. Cross-stream/cross-layer timing synchronization
68    3.7. RTP Header usage
69 4. Payload Format Parameters
70    4.1. Media Type Registration
71    4.2. Mapping to SDP Parameters
72    4.3. Offer/answer considerations
73    4.4. Declarative usage of SDP
74    4.5. SDP examples
75 5. Security Considerations
76 6. Congestion control
77 7. IANA Considerations
78 APPENDIX A: Payload examples
79    A.1. Simple payload examples
80       A.1.1. All the layers in the same payload
81       A.1.2. Layers in separate RTP streams
82    A.2. Advanced examples
83       A.2.1. Different update rate for subset of layers
84       A.2.2. Redundant frames with limited set of layers
85 8. References
86    8.1. Normative References
87    8.2. Informative References
88 Author's Addresses
89 Intellectual Property Statement
90 Disclaimer of Validity
91 Copyright Statement
92 Acknowledgment
93 9. Open Issues
94 10. Changes Log

96 1. Introduction

98 The International Telecommunication Union (ITU-T) Recommendation 99 G.718 [G.718] specifies the Embedded Variable Bit Rate (EV-VBR) 100 speech/audio codec. This document specifies the Real-time Transport 101 Protocol (RTP) [RFC3550] payload format for this codec.

103 2. Background

105 2.1. The G.718 codec

107 G.718 is an embedded variable rate speech codec having a layered 108 design. The bitstream of the G.718 core codec consists of a core 109 layer, denoted as L1, and four enhancement layers, denoted as L2-L5. 110 The bit-rates of the G.718 core codec range from 8 kbit/s (core layer 111 only) to 32 kbit/s (with all layers up to L5). Furthermore, the G.718 112 codec also supports discontinuous transmission (DTX) and comfort 113 noise generation (CNG) by sending Silence Descriptor (SID) frames 114 during periods of non-active input signal, resulting in a reduced 115 bit-rate. The sampling frequency of the core codec is 16 kHz and the 116 codec operates on 20 ms frames. The G.718 codec is also capable of 117 narrowband operation with audio input and/or output at 8 kHz sampling 118 frequency.

120 While transmitting/receiving the core layer L1 is enough for 121 successful decoding of the audio content, each of the enhancement 122 layers Ln (n being 2 to 5, inclusive) provides an improvement to 123 reconstructed audio quality. Thus, the core layer ensures the basic 124 communication while the enhancement layers can be used to improve the 125 perceptual quality. Furthermore, enhancement layers are dependent on 126 all the lower layers in the sense that successful decoding of layer Ln 127 requires also all the layers Lm with mn MUST 611 also be discarded.

613 3.5. G.718 session

615 A G.718 session consists of one or several RTP sessions carrying 616 encoded G.718 data according to the payload format specified in section 617 3.1.

619 3.6.
Cross-stream/cross-layer timing synchronization

621 In case a G.718 session consists of multiple RTP sessions, the RTP 622 packets transmitted on separate RTP sessions need to be synchronized 623 in order to enable reconstruction of the frames at the receiving end. 624 Since each of the RTP sessions uses its own random initial value for 625 the RTP timestamp, there is also a random offset between the RTP 626 timestamp values of the packets carrying the EDUs belonging to the same encoded 627 frame in different RTP sessions.

629 The receiver MUST use the traditional RTCP-based mechanism to 630 synchronize streams by using the RTP and NTP timestamps of the RTCP 631 Sender Reports (SR) it receives.

633 3.7. RTP Header usage

635 This section specifies the usage of some fields of the RTP header 636 (specified in section 5 of [RFC3550]) with the G.718 RTP payload 637 format. Setting of other RTP header fields is as specified in 638 [RFC3550].

640 The RTP timestamp corresponds to the sampling instant of the first 641 encoded sample of the earliest frame in the payload. The timestamp 642 clock frequency is 32 kHz. With the 20 ms frame length of the codec, this corresponds to 640 timestamp units per frame.

644 The marker bit (M) of each of the RTP streams of the session SHALL be 645 set to value 1 if the payload carries an EDU belonging to the first 646 frame after an inactive period, i.e. an EDU from the first frame of a 647 talkspurt. For all other packets the marker bit is set to value 0.

649 4. Payload Format Parameters

651 This section defines the parameters that may be used to configure 652 optional features in the G.718 RTP transmission.

654 The parameters are defined here as part of the media subtype 655 registration for the G.718 codec. Mapping of the parameters into the 656 Session Description Protocol (SDP) [RFC4566] is also provided for 657 those applications that use SDP. In control protocols that do not 658 use MIME or SDP, the media type parameters must be mapped to the 659 appropriate format used with that control protocol.

661 4.1.
Media Type Registration

663 This registration is done using the template defined in RFC 4288 664 [RFC4288] and following RFC 4855 [RFC4855].

666 Type name: audio

668 Subtype name: G718

670 Required parameters: none

672 Optional parameters:

674 mode: This parameter MAY be used to indicate whether the 675 mode with layer L1 being present or the AMR-WB 676 compatible mode (with layer L1' being present) is in 677 use. If this parameter is not present or the value of 678 this parameter is equal to 0, the mode with layer L1 679 being present is in use. Otherwise, the AMR-WB 680 compatible mode is in use. When this parameter is 681 present, the value MUST be either 0 or 1.

683 Author's note: When the upcoming stereo and SWB options are 684 present, the semantics of this parameter may change.

686 layers: The numbers of the layers (in the range from 1 to 5, 687 denoting layers from L1 to L5, respectively) 688 transmitted in this session, expressed as a 689 comma-separated list of layer numbers. If the parameter is 690 present, at least layer L1 or L1' MUST be included in 691 the list of layers in one of the RTP sessions included 692 in the G.718 session. If the parameter is not present, 693 all layers up to layer L5 MAY be used in the session.

695 Author's note: Why not use semantics similar to those of L-ID?

697 ptime: The recommended length of time (in milliseconds) 698 represented by the media in a packet. See Section 6 699 of [RFC4566].

701 maxptime: The maximum length of time (in milliseconds) that can 702 be encapsulated in a packet. See Section 6 of 703 [RFC4566].

705 Author's note: Some further study is needed to see if separate 706 parameters for sending and receiving capabilities/preferences are 707 needed, especially for upcoming stereo and SWB options.

709 Author's note: The support for upcoming SWB and stereo options 710 needs to be taken into account. Basically we can either 1) extend 711 the parameter "layers" to also cover this aspect, or 2) define 712 separate parameter(s) for these new options when more details on 713 the stereo/SWB support are available.

715 Encoding considerations:

717 This media type is framed and contains binary data; see Section 4.8 718 of [RFC4288].

720 Security considerations: See Section 5 of RFC xxxx

722 Interoperability considerations: none

724 Published specification: RFC xxxx

726 Applications which use this media type:

728 For example Voice over IP, audio and video conferencing, audio 729 streaming and voice messaging.

731 Additional information: none

733 Person & email address to contact for further information:

735 Ari Lakaniemi, ari.lakaniemi@nokia.com

737 Intended usage: COMMON

738 Restrictions on usage:

740 This media type depends on RTP framing, and hence is only defined 741 for transfer via RTP [RFC3550].

743 Author:

745 Ari Lakaniemi, ari.lakaniemi@nokia.com

747 Change controller:

749 IETF Audio/Video Transport working group delegated from the IESG

751 4.2. Mapping to SDP Parameters

753 The information carried in the media type specification has a 754 specific mapping to fields of the SDP [RFC4566], which is commonly 755 used to describe RTP sessions. When SDP is used to specify sessions 756 employing the G.718 codec, the mapping is as follows:

758 o The media type ("audio") goes in SDP "m=" as the media name.

760 o The media subtype ("G718") goes in SDP "a=rtpmap" as the encoding 761 name. The RTP clock rate in "a=rtpmap" MUST be 32000 for G.718.

763 Author's note: The current choice for the RTP clock rate is a 764 'placeholder'. The clock rate needs to be set according to the SWB 765 sampling rate, which is still T.B.D. Since the core codec employs a 766 16000 Hz sampling rate, an integer multiple of 16000 Hz seems to 767 be a preferable choice.

769 o The parameters "ptime" and "maxptime" go in the SDP "a=ptime" and 770 "a=maxptime" attributes, respectively.
772 o Any remaining parameters go in the SDP "a=fmtp" attribute by 773 copying them directly from the media type string as a 774 semicolon-separated list of parameter=value pairs.

776 4.3. Offer/answer considerations

778 The following considerations apply when using the SDP offer/answer 779 [RFC3264] mechanism to negotiate the G.718 transport. The parameter 780 "layers" MAY be used to indicate, for each 781 RTP session belonging to the current G.718 session, the layer configuration that the end-point making 782 the offer is ready to transmit and wishes to receive.

784 o In case the G.718 session consists of a single RTP session, it is 785 RECOMMENDED not to impose any layer restrictions for the session 786 but to use the rate control functionality to set possible 787 restrictions on usage of the higher or highest layers. If the 788 offer includes a layer configuration parameter, the answer MAY use 789 a different configuration, but the highest layer in the answer MUST 790 NOT be higher than the highest layer of the offered configuration.

792 Author's note: Support for an answer modifying the layer 793 configuration is for further study (FFS).

795 o In case the G.718 session consists of multiple RTP sessions, the 796 answer MUST use the layer configurations provided in the offer for 797 the sessions it accepts.

799 4.4. Declarative usage of SDP

801 In declarative usage, such as SDP in RTSP [RFC2326] or SAP [RFC2974], 802 the parameter "layers" SHALL be interpreted to provide a set of 803 layers that the sender may use in the session.

805 4.5. SDP examples

807 Some example SDP session descriptions utilizing G.718 encodings are 808 provided below.

810 The first example illustrates the simple case where a G.718 session 811 employing a single RTP session and the AVPF profile is offered, and 812 the answer accepts the offer without any changes.
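Before turning to the SDP itself, the mapping rules of Section 4.2 can be sketched in a few lines of Python. This is an illustration only and not part of the payload format: the function name g718_media_lines and its arguments are hypothetical, the "; " separator between fmtp parameters is an assumption (the text only says "semicolon separated"), and the clock rate 32000 is the placeholder value discussed in Section 4.2. The output has the same shape as the offers and answers shown in this section.

```python
def g718_media_lines(port, pt, mode=None, layers=None, ptime=None,
                     maxptime=None, profile="RTP/AVPF"):
    """Build media-level SDP lines for one G.718 RTP session (sketch only)."""
    lines = [
        f"m=audio {port} {profile} {pt}",  # media type "audio" goes in "m="
        f"a=rtpmap:{pt} G718/32000/1",     # subtype "G718", placeholder clock rate
    ]
    fmtp = []
    if mode is not None:
        if mode not in (0, 1):
            raise ValueError("mode MUST be either 0 or 1")
        fmtp.append(f"mode={mode}")
    if layers:
        if any(n not in range(1, 6) for n in layers):
            raise ValueError("layer numbers range from 1 to 5")
        fmtp.append("layers=" + ",".join(str(n) for n in layers))
    if fmtp:
        # Assumed separator; the draft does not spell out the exact spacing.
        lines.append(f"a=fmtp:{pt} " + "; ".join(fmtp))
    if ptime is not None:
        lines.append(f"a=ptime:{ptime}")        # "ptime" has its own attribute
    if maxptime is not None:
        lines.append(f"a=maxptime:{maxptime}")  # as does "maxptime"
    return lines


print("\n".join(g718_media_lines(49120, 97, layers=[1, 2])))
```

Called with layers=[1, 2] as above, the sketch prints the same three lines as the offer in the second example of this section.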
814 Offer:

816 m=audio 49120 RTP/AVPF 97
817 a=rtpmap:97 G718/32000/1

819 Answer:

821 m=audio 49120 RTP/AVPF 97
822 a=rtpmap:97 G718/32000/1

824 The second example shows a slightly more complex case where a G.718 825 session using a single RTP session and the AVPF profile is offered 826 with a restriction to send/receive only layers L1 and L2. The 827 answer indicates that the other end-point is happy to receive (and 828 send) layers up to L5.

830 Offer:

832 m=audio 49120 RTP/AVPF 97
833 a=rtpmap:97 G718/32000/1
834 a=fmtp:97 layers=1,2

836 Answer:

838 m=audio 49120 RTP/AVPF 97
839 a=rtpmap:97 G718/32000/1
840 a=fmtp:97 layers=1,2,3,4,5

842 The third example shows a G.718 session using multiple RTP sessions 843 with the AVPF profile. The answerer wishes to use only layers up to 844 L3.

846 Offer:

848 m=audio 49120 RTP/AVPF 97
849 a=rtpmap:97 G718/32000/1
850 a=fmtp:97 layers=1,2
851 a=mid:1

853 m=audio 49122 RTP/AVPF 98
854 a=rtpmap:98 G718/32000/1
855 a=fmtp:98 layers=3
856 a=mid:2
857 a=depend:lay 1

859 m=audio 49124 RTP/AVPF 99
860 a=rtpmap:99 G718/32000/1
861 a=fmtp:99 layers=4,5
862 a=mid:3
863 a=depend:lay 1 2

865 Answer:

867 m=audio 49120 RTP/AVPF 97
868 a=rtpmap:97 G718/32000/1
869 a=fmtp:97 layers=1,2
870 a=mid:1

872 m=audio 49120 RTP/AVPF 98
873 a=rtpmap:98 G718/32000/1
874 a=fmtp:98 layers=3
875 a=mid:2
876 a=depend:lay 1

878 Note that the dependency signaling according to [smd-sdp] is used in 879 the third example above to indicate the relationship between the 880 layers distributed into separate RTP sessions.

882 5. Security Considerations

884 RTP packets using the payload format defined in this specification 885 are subject to the security considerations discussed in the RTP 886 specification [RFC3550], and in any appropriate RTP profile (for 887 example [RFC3551] or [RFC4585]). This implies that confidentiality 888 of the media streams is achieved by encryption; for example, through 889 the application of SRTP [RFC3711].
Because the data compression used 890 with this payload format is applied end-to-end, any encryption needs 891 to be performed after compression.

893 A potential denial-of-service threat exists for data encodings using 894 compression techniques that have non-uniform receiver-end 895 computational load. The attacker can inject pathological datagrams 896 into the stream that will increase the processing load of the decoder 897 and may cause the receiver to be overloaded. For example, inserting 898 additional EDUs representing the higher enhancement layers on top of 899 the ones actually transmitted may increase the decoder load. However, 900 the G.718 codec is not particularly vulnerable to such an attack, 901 since the majority of the computational load in a G.718 session is 902 associated with the encoder. Another possible form of attack might be 903 forging of codec bit-rate control messages, which may result in the 904 encoder employing a higher number of enhancement layers than 905 originally intended and thereby requiring a larger amount of 906 computational resources. Therefore, the usage of data origin 907 authentication and data integrity protection of at least the RTP 908 packet is RECOMMENDED; for example, with SRTP [RFC3711].

910 Note that the appropriate mechanism to ensure confidentiality and 911 integrity of RTP packets and their payloads is very dependent on the 912 application and on the transport and signaling protocols employed.

914 Thus, although SRTP is given as an example above, other possible 915 choices exist.

917 Note that end-to-end security with either authentication, integrity 918 or confidentiality protection will prevent a network element not 919 within the security context from performing media-aware operations 920 other than discarding complete packets.
To allow any (media-aware) 921 intermediate network element to perform its operations, it is 922 required to be a trusted entity which is included in the security 923 context establishment.

925 6. Congestion control

927 As a scalable codec, G.718 implicitly provides means for congestion 928 control by providing a possibility for 'thinning' the bitstream. The 929 RTP payload format according to this specification provides several 930 different means for reducing the G.718 session bandwidth. The most 931 appropriate mechanism (in terms of impact on the user experience) 932 depends on the employed payload structure and also on the employed 933 session configuration (single RTP session or multiple RTP sessions). 934 The following means (in no particular order) can be used to assist 935 congestion control procedures, either by the sender or by an 936 intermediate node.

938 o The transport blocks carrying the EDUs representing the highest 939 layers within the payload may be dropped.

941 o The payloads carrying the EDUs representing the highest layers in 942 a G.718 session may be dropped.

944 o Transport blocks or payloads carrying EDUs belonging to redundant 945 frames included in the payload may be dropped.

947 7. IANA Considerations

949 IANA is kindly requested to register a media type for the G.718 codec 950 for RTP transport, as specified in Section 4.1 of this document.

952 APPENDIX A: Payload examples

954 The G.718 payload structure enables flexible transport either by 955 carrying all layers in the same payload or separating the layers into 956 separate payloads. The following subsections illustrate different 957 possibilities for transport by simple examples. Note that the examples do 958 not show the full payload structure, to keep the illustration simple.

960 A.1. Simple payload examples

962 A.1.1.
All the layers in the same payload

964 The illustration below shows layers L1-L3 from two encoded frames 965 encapsulated into separate payloads, each using a single transport block.

967 +-------+--------+-----+------+------+------+
968 | RTP1  | L-ID=3 |NF=0 |F1-L1 |F1-L2 |F1-L3 |
969 +-------+--------+-----+------+------+------+

971 +-------+--------+-----+------+------+------+
972 | RTP2  | L-ID=3 |NF=0 |F2-L1 |F2-L2 |F2-L3 |
973 +-------+--------+-----+------+------+------+

975 In case the same layers from two input frames are encapsulated into 976 one payload using a single transport block, the structure is as shown 977 below.

979 +-------+--------+-----+------+------+------+------+------+------+
980 | RTP1  | L-ID=3 |NF=1 |F1-L1 |F2-L1 |F1-L2 |F2-L2 |F1-L3 |F2-L3 |
981 +-------+--------+-----+------+------+------+------+------+------+

983 The third example illustrates the case where the layers L1-L3 from 984 two input frames are encapsulated into one payload using two separate 985 transport blocks, the first one carrying L1 and the other one 986 containing L2 and L3.

988 +-------+--------+-----+------+------+
989 | RTP1  | L-ID=1 |NF=1 |F1-L1 |F2-L1 |
990 +-------+--------+-----+------+------+------+------+
991         | L-ID=7 |NF=1 |F1-L2 |F2-L2 |F1-L3 |F2-L3 |
992         +--------+-----+------+------+------+------+

994 A.1.2. Layers in separate RTP streams

996 In this case the data for each layer is transmitted in its own 997 payload.

999 In the first example each transport block including a single EDU is 1000 carried in its own RTP payload.
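The EDU ordering visible in the A.1.1 diagrams, and the per-layer split introduced here, can be sketched as follows before looking at the diagrams for this subsection. This is a non-normative Python illustration: the function names are hypothetical, the F<frame>-L<layer> labels simply mirror the diagrams, and the relation NF = (number of frames) - 1 is inferred from the examples (NF=0 with one frame, NF=1 with two) rather than quoted from the field definition.

```python
def transport_block_edus(frames, layers):
    """EDU labels for one transport block, lowest layer first.

    All frames of the lowest layer come first, then all frames of the
    next layer, and so on, matching the diagrams in this appendix.
    """
    return [f"F{f}-L{l}" for l in sorted(layers) for f in frames]


def nf_value(frames):
    """NF value as shown in the diagrams (assumed: frame count minus one)."""
    return len(frames) - 1


def per_layer_payloads(frames, layers):
    """One payload per layer, each holding a single transport block."""
    return {l: transport_block_edus(frames, [l]) for l in sorted(layers)}


# Two frames, layers L1-L3, in a single transport block (as in A.1.1).
print(nf_value([1, 2]), transport_block_edus([1, 2], [1, 2, 3]))
# One frame, each layer in its own payload (the case introduced above).
print(per_layer_payloads([1], [1, 2, 3]))
```

The layer-major order is what lets a sender or an intermediate node drop the highest layers by truncating the end of a transport block.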
1002 +-------+--------+-----+-----+  +-------+--------+-----+-----+
1003 | RTP1a | L-ID=1 |NF=0 |F1-L1|  | RTP1b | L-ID=6 |NF=0 |F1-L2|
1004 +-------+--------+-----+-----+  +-------+--------+-----+-----+

1006 +-------+--------+-----+-----+  +-------+--------+-----+-----+
1007 | RTP1c |L-ID=10 |NF=0 |F1-L3|  | RTP2a | L-ID=1 |NF=0 |F2-L1|
1008 +-------+--------+-----+-----+  +-------+--------+-----+-----+

1010 +-------+--------+-----+-----+  +-------+--------+-----+-----+
1011 | RTP2b | L-ID=6 |NF=0 |F2-L2|  | RTP2c |L-ID=10 |NF=0 |F2-L3|
1012 +-------+--------+-----+-----+  +-------+--------+-----+-----+

1014 If the payloads carry data from two consecutive input frames, the 1015 same encoded data as in the previous example is arranged as follows.

1017 +-------+--------+-----+-----+-----+
1018 | RTP1a | L-ID=1 |NF=1 |F1-L1|F2-L1|
1019 +-------+--------+-----+-----+-----+

1021 +-------+--------+-----+-----+-----+
1022 | RTP1b | L-ID=6 |NF=1 |F1-L2|F2-L2|
1023 +-------+--------+-----+-----+-----+

1025 +-------+--------+-----+-----+-----+
1026 | RTP1c |L-ID=10 |NF=1 |F1-L3|F2-L3|
1027 +-------+--------+-----+-----+-----+

1029 A.2. Advanced examples

1031 A.2.1. Different update rate for subset of layers

1033 The example below employs different update rates (i.e. a different number of 1034 frames per packet) for selected subsets of layers. In this example 1035 all core codec layers L1-L5 are shown.
1037 +-------+--------+-----+-----+-----+-----+-----+
1038 | RTP1  | L-ID=1 |NF=3 |F1-L1|F2-L1|F3-L1|F4-L1|
1039 +-------+--------+-----+-----+-----+-----+-----+

1041 +-------+--------+-----+-----+-----+-----+-----+
1042 | RTP2a | L-ID=7 |NF=1 |F1-L2|F2-L2|F1-L3|F2-L3|
1043 +-------+--------+-----+-----+-----+-----+-----+

1045 +-------+--------+-----+-----+-----+
1046 | RTP3a |L-ID=14 |NF=0 |F1-L4|F1-L5|
1047 +-------+--------+-----+-----+-----+

1049 +-------+--------+-----+-----+-----+
1050 | RTP3b |L-ID=14 |NF=0 |F2-L4|F2-L5|
1051 +-------+--------+-----+-----+-----+

1053 +-------+--------+-----+-----+-----+-----+-----+
1054 | RTP2b | L-ID=7 |NF=1 |F3-L2|F4-L2|F3-L3|F4-L3|
1055 +-------+--------+-----+-----+-----+-----+-----+

1057 +-------+--------+-----+-----+-----+
1058 | RTP3c |L-ID=14 |NF=0 |F3-L4|F3-L5|
1059 +-------+--------+-----+-----+-----+

1061 +-------+--------+-----+-----+-----+
1062 | RTP3d |L-ID=14 |NF=0 |F4-L4|F4-L5|
1063 +-------+--------+-----+-----+-----+

1065 A.2.2. Redundant frames with limited set of layers

1067 An example transmitting layers L1-L3 as primary data and L1 (of the 1068 previous frame) as redundant data is shown below. Each payload 1069 carries one primary (i.e. new) frame in one transport block and one 1070 redundant frame, which in this example is the frame preceding the 1071 primary frame, in another transport block.
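The arrangement just described, with the redundant copy in the first transport block and the primary frame in the second, can be sketched as below; the diagram that follows shows the same layout. The Python is illustrative only: payload_with_redundancy is a hypothetical helper, and the labels mirror the diagrams rather than any real bitstream.

```python
def payload_with_redundancy(n, primary_layers=(1, 2, 3), redundant_layers=(1,)):
    """Transport blocks of the payload whose primary frame is frame n.

    The first block repeats the selected layers of the previous frame
    (n - 1) as redundant data; the second block carries the primary frame.
    """
    redundant_block = [f"F{n - 1}-L{l}" for l in redundant_layers]
    primary_block = [f"F{n}-L{l}" for l in primary_layers]
    return [redundant_block, primary_block]


for n in (1, 2, 3):
    print(f"RTP{n}", payload_with_redundancy(n))
```

For n = 1 the sketch yields [['F0-L1'], ['F1-L1', 'F1-L2', 'F1-L3']], which matches the first payload of the diagram.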
1073 +-------+--------+-----+-----+--------+-----+-----+-----+-----+
1074 | RTP1  | L-ID=1 |NF=0 |F0-L1| L-ID=3 |NF=0 |F1-L1|F1-L2|F1-L3|
1075 +-------+--------+-----+-----+--------+-----+-----+-----+-----+

1077 +-------+--------+-----+-----+--------+-----+-----+-----+-----+
1078 | RTP2  | L-ID=1 |NF=0 |F1-L1| L-ID=3 |NF=0 |F2-L1|F2-L2|F2-L3|
1079 +-------+--------+-----+-----+--------+-----+-----+-----+-----+

1081 +-------+--------+-----+-----+--------+-----+-----+-----+-----+
1082 | RTP3  | L-ID=1 |NF=0 |F2-L1| L-ID=3 |NF=0 |F3-L1|F3-L2|F3-L3|
1083 +-------+--------+-----+-----+--------+-----+-----+-----+-----+

1085 Alternatively, a payload that also carries redundant data for a subset 1086 of layers can be arranged differently, as shown in the example below.

1088 +-------+--------+-----+-----+-----+-----+--------+-----+-----+
1089 | RTP1  | L-ID=3 |NF=0 |F0-L1|F0-L2|F0-L3| L-ID=1 |NF=0 |F1-L1|
1090 +-------+--------+-----+-----+-----+-----+--------+-----+-----+

1092 +-------+--------+-----+-----+-----+-----+--------+-----+-----+
1093 | RTP2  | L-ID=3 |NF=0 |F1-L1|F1-L2|F1-L3| L-ID=1 |NF=0 |F2-L1|
1094 +-------+--------+-----+-----+-----+-----+--------+-----+-----+

1096 +-------+--------+-----+-----+-----+-----+--------+-----+-----+
1097 | RTP3  | L-ID=3 |NF=0 |F2-L1|F2-L2|F2-L3| L-ID=1 |NF=0 |F3-L1|
1098 +-------+--------+-----+-----+-----+-----+--------+-----+-----+

1100 Now the first transport block carries the primary data and the second 1101 transport block carries the redundant data, which in this case covers 1102 the frame following the primary frame. The benefit of this approach 1103 is that the redundant data is included in the last (secondary) 1104 transport block of the payload, which might be beneficial for 1105 possible payload scaling operation within the network.

1107 8. References

1109 8.1.
Normative References

1111 [AMR-WB] 3GPP TS 26.171, "Adaptive Multi-Rate Wideband (AMR-WB) 1112 speech codec; General description (Release 7)", v7.0.0, 1113 September 2006.

1115 [G.718] ITU-T Recommendation G.718, "Frame Error Robust Narrowband 1116 and Wideband Embedded Variable Bit-Rate Coding of Speech 1117 and Audio from 8-32 Kbit/s", (consented) May 2008.

1119 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 1120 Requirement Levels", BCP 14, RFC 2119, March 1997.

1122 [RFC3264] Rosenberg, J., Schulzrinne, H., "An Offer/Answer Model with 1123 Session Description Protocol (SDP)", RFC 3264, June 2002.

1125 [RFC3550] Schulzrinne, H., Casner, S., Frederick, R. and Jacobson, 1126 V., "RTP: A Transport Protocol for Real-Time Applications", 1127 STD 64, RFC 3550, July 2003.

1129 [RFC3551] Schulzrinne, H., Casner, S., "RTP Profile for Audio and 1130 Video Conferences with Minimal Control", STD 65, RFC 3551, 1131 July 2003.

1133 [RFC3711] Baugher, M., McGrew, D., Naslund, M., Carrara, E., Norrman, 1134 K., "The Secure Real-Time Transport Protocol (SRTP)", RFC 1135 3711, March 2004.

1137 [RFC4288] Freed, N., Klensin, J., "Media Type Specifications and 1138 Registration Procedures", BCP 13, RFC 4288, December 2005.

1140 [RFC4566] Handley, M., Jacobson, V. and Perkins, C., "SDP: Session 1141 Description Protocol", RFC 4566, July 2006.

1143 [RFC4585] Ott, J., Wenger, S., Sato, N., Burmeister, C., Rey, J., 1144 "Extended RTP Profile for Real-Time Transport Control 1145 Protocol (RTCP)-Based Feedback (RTP/AVPF)", RFC 4585, July 1146 2006.

1148 [RFC4855] Casner, S., "Media Type Registration of RTP Payload 1149 Formats", RFC 4855, February 2007.

1151 [RFC4867] Sjoberg, J., Westerlund, M., Lakaniemi, A., Xie, Q., "RTP 1152 Payload Format and File Storage Format for the Adaptive 1153 Multi-Rate (AMR) and Adaptive Multi-Rate Wideband (AMR-WB) 1154 Audio Codecs", RFC 4867, April 2007.

1156 [RFC5104] Wenger, S., Chandra, U., Westerlund, M., Burman, B., "Codec 1157 Control Messages in the RTP Audio-Visual Profile with 1158 Feedback (AVPF)", RFC 5104, February 2008.

1160 [smd-sdp] Schierl, T., Wenger, S., "Signaling media decoding 1161 dependency in Session Description Protocol (SDP)", 1162 draft-schierl-mmusic-layered-codec-04 (work in progress), June 1163 2007.

1165 8.2. Informative References

1167 [McCanne] McCanne, S., Jacobson, V., and Vetterli, M., 1168 "Receiver-driven layered multicast", in Proc. of ACM SIGCOMM'96, 1169 pages 117--130, Stanford, CA, August 1996.

1171 [RFC2326] Schulzrinne, H., Rao, A., Lanphier, R., "Real Time 1172 Streaming Protocol (RTSP)", RFC 2326, April 1998.

1174 [RFC2974] Handley, M., Perkins, C., Whelan, E., "Session Announcement 1175 Protocol", RFC 2974, October 2000.

1177 [RFC3828] Larzon, L-A., Degermark, M., Pink, S., Jonsson, L-E., 1178 Fairhurst, G., "The Lightweight User Datagram Protocol 1179 (UDP-Lite)", RFC 3828, July 2004.

1181 [RFC4340] Kohler, E., Handley, M., Floyd, S., "Datagram Congestion 1182 Control Protocol (DCCP)", RFC 4340, March 2006.

1184 [RFC5117] Westerlund, M., Wenger, S., "RTP Topologies", RFC 5117, 1185 January 2008.

1187 Author's Addresses

1189 Ari Lakaniemi
1190 Nokia
1191 P.O. Box 407
1192 FIN-00045 Nokia Group, FINLAND

1194 Phone: +358-71-8008000
1195 Email: ari.lakaniemi@nokia.com

1197 Ye-Kui Wang
1198 Nokia Research Center
1199 P.O.
Box 1000 1200 33721 Tampere 1201 Finland 1203 Phone: +358-50-466-7004 1204 EMail: ye-kui.wang@nokia.com 1206 Intellectual Property Statement 1208 The IETF takes no position regarding the validity or scope of any 1209 Intellectual Property Rights or other rights that might be claimed to 1210 pertain to the implementation or use of the technology described in 1211 this document or the extent to which any license under such rights 1212 might or might not be available; nor does it represent that it has 1213 made any independent effort to identify any such rights. Information 1214 on the procedures with respect to rights in RFC documents can be 1215 found in BCP 78 and BCP 79. 1217 Copies of IPR disclosures made to the IETF Secretariat and any 1218 assurances of licenses to be made available, or the result of an 1219 attempt made to obtain a general license or permission for the use of 1220 such proprietary rights by implementers or users of this 1221 specification can be obtained from the IETF on-line IPR repository at 1222 http://www.ietf.org/ipr. 1224 The IETF invites any interested party to bring to its attention any 1225 copyrights, patents or patent applications, or other proprietary 1226 rights that may cover technology that may be required to implement 1227 this standard. Please address the information to the IETF at 1228 ietf-ipr@ietf.org. 1230 Disclaimer of Validity 1232 This document and the information contained herein are provided on an 1233 "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS 1234 OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND 1235 THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS 1236 OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF 1237 THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED 1238 WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. 1240 Copyright Statement 1242 Copyright (C) The IETF Trust (2008). 
1244 This document is subject to the rights, licenses and restrictions 1245 contained in BCP 78, and except as set forth therein, the authors 1246 retain all their rights. 1248 Acknowledgment 1250 Funding for the RFC Editor function is currently provided by the 1251 Internet Society. 1253 9. Open Issues 1255 1) Support of super-wideband (SWB) audio and stereophonic encoding 1256 extensions to ITU-T G.718 currently being worked on by ITU-T is to 1257 be specified after ITU-T completes the work in that regard. 1259 a. Some further study is needed to see if separate parameters 1260 for sending and receiving capabilities/preferences are needed 1261 -- especially for upcoming stereo and SWB options. 1263 b. The support for upcoming SWB and stereo options needs to be 1264 taken into account. Basically we can either 1) extend the 1265 parameter "layers" to also cover this aspect, or 2) define 1266 separate parameter(s) for these new options when more details 1267 on the stereo/SWB support are available. 1269 2) For streaming or other applications that allow for relatively long 1270 end-to-end delay, sometimes it would be beneficial to aggregate 1271 more than 4 frames in one Transport Block (TB). Should the length 1272 of the NF field be larger? 1274 3) On layer structure and configuration signalling. Currently, a 1275 unique layer ID is assigned for each possible layer combination. 1277 See the editing notes below Table 3 for other possible approaches. 1278 One of the alternative ways may be chosen in the final draft. 1280 4) Currently, it is mandated that lower layer EDUs of later frames go 1281 before higher layer EDUs of earlier frames in a transport block. 1282 This way is friendlier to adaptation (dropping of higher layers). 1283 However, if all layers are received, then the depacketizer needs 1284 to reorder the EDUs to their decoding order before feeding them to 1285 the decoder. Therefore, the other way around (i.e. 
lower layer 1286 EDUs of later frames go after higher layer EDUs of earlier frames, 1287 or EDUs in transport blocks are placed in decoding order) is more 1288 friendly to the depacketizer. Another benefit of the latter is 1289 that it does not introduce any end-to-end delay. Which way is to be 1290 specified (or whether both are allowed, if needed) is FFS. 1292 5) MANEs dropping RTP packets are RTP translators. But are those 1293 MANEs dropping a subset of the transport blocks in one packet also 1294 RTP translators? 1296 6) The RTCP based cross-session synchronization is not possible until 1297 the first RTCP SRs are received in all sessions. This implies that 1298 decoding only a subset of layers may be possible until RTCP SRs in 1299 all sessions have been received. This may impose higher end-to- 1300 end delay or higher bandwidth for RTCP data, and the approach may 1301 not work perfectly for some multicast topologies. There is a study 1302 ongoing by some AVT members. Once there is an acceptable solution, 1303 the draft documenting that solution may be referenced in this 1304 draft. 1306 7) It might be better to change the semantics of the media type 1307 parameter 'layers' to be similar to that of L-ID. 1309 8) Offer/answer with the answer being capable of modifying the layer 1310 configuration is FFS. 1312 9) Some references need to be updated in the final draft. 1314 10. Changes Log 1316 From draft-lakaniemi-avt-rtp-evbr-04 to draft-ietf-avt-rtp-g718-00 1318 - Changed the media sub-type name "EV-VBR" to "G718", and replaced 1319 "EV-VBR" with "G.718" in all text.