AVT                                                              B. Song
Internet-Draft                                                    H. Qin
Expires: December 20, 2006                                  Xidian Univ.
                                                           June 18, 2006


    Generic RTP Payload Format for Forward Error Correction in Video
                              Applications
                    draft-bsong-avt-rtp-gfec-00.txt

Status of this Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on December 20, 2006.

Copyright Notice

   Copyright (C) The Internet Society (2006).

Abstract

   This document describes a generic scheme to packetize video streams
   for transport using the Real-time Transport Protocol (RTP).
   Erasure-based forward error correction and interleaving are used to
   reduce the impact of packet losses.  Furthermore, this document
   presents an application of this generic scheme to H.264 video
   communication.

Song & Qin              Expires December 20, 2006               [Page 1]

Internet-Draft    Generic RTP Payload Format for FEC          June 2006

Table of Contents

   1.  Introduction
     1.1.  Terminology
   2.  Erasure-based FEC
   3.  A Generic payload format for FEC
     3.1.  FEC algorithm with unequal error protection
     3.2.  Interleaving FEC encoded blocks
     3.3.  Encapsulating interleaved blocks into RTP packets
     3.4.  Delay consideration and parameter determination
     3.5.  Detecting packet losses
   4.  IANA Considerations
   5.  Security Considerations
   6.  An example application of generic FEC in H.264
     6.1.  Network Abstraction Layer Unit (NALU)
     6.2.  Design consideration
       6.2.1.  I-slice
       6.2.2.  P-slice
   7.  Conclusion
   8.  Normative References
   Authors' Addresses
   Intellectual Property and Copyright Statements

1.  Introduction

   Real-time video applications usually have more stringent delay
   requirements than normal data transmissions.  As a result, the
   retransmission of lost packets is generally not an acceptable option
   for such applications.  In these cases, a better method is to attempt
   to recover the information contained in lost packets through Forward
   Error Correction (FEC) [RFC3453].  In addition, interleaving the data
   streams at the sender can mitigate the quality degradation caused by
   packet loss, especially burst loss.

   This document specifies a new RTP payload format for FEC in the
   generic sense and, as an example, describes its application to video
   streams encoded using H.264 [H264].

1.1.  Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in BCP 14, RFC 2119
   [RFC2119] and indicate requirement levels for compliant RTP
   implementations.

2.  Erasure-based FEC

   In the general literature, FEC refers to the ability to overcome both
   erasures and bit-level corruption [RFC3453].  In an IP-based
   protocol, however, either the network layer detects corrupted packets
   and discards them, or the transport layer uses a packet
   authentication mechanism to discard them.  The primary application of
   FEC codes to IP protocols is therefore as an erasure code.  On the
   sender's side, the payloads are generated and processed by an FEC
   erasure encoder.  On the receiver's side, an FEC erasure decoder
   using the corresponding FEC mechanism reassembles objects from the
   received packets that contain the FEC encoded payloads.

   An (N, K) erasure-based FEC code works in the following way.  The
   input to a block FEC encoder consists of origin data blocks (ODBs) of
   equal length, grouped into batches and processed one batch at a time.
   Each batch contains K ODBs.  For each batch, the encoder generates a
   total of N>K encoded data blocks (EDBs), composed of the K ODBs and
   N-K redundant check blocks.  The whole group of these N EDBs is
   referred to as an FEC unit.  A block FEC decoder has the property
   that any K of the N EDBs are sufficient to reconstruct the original K
   ODBs.  Reed-Solomon codes are a typical (N, K) erasure code with this
   property; Tornado codes are another well-known family, although they
   generally need slightly more than K EDBs for decoding [RFC3453].

3.  A Generic payload format for FEC

   In order to improve the ability of the receiver to tolerate packet
   losses and to maintain the QoS of real-time communications,
   especially video communications, this section presents a generic RTP
   payload format for FEC that attempts to reduce, if not eliminate, the
   impact of packet losses.

   The new RTP payload format is defined as follows.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                          RTP Header                           |
   +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
   |  R  |C|FECType|FEC subtype| interleave index  |  block size   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                one or more encoded data blocks                |
   |                             ....                              |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   The RTP header fields take the values described in the RTP
   specification [RFC3550].

   The fields for generic FEC are specified as follows:

   R: 3 bits
      Reserved.  MUST be set to zero by the sender and SHOULD be ignored
      by the receiver.

   C: 1 bit
      If the C bit is set, the output data from the sender is
      interleaved in the RTP payload (see Section 3.2).  Otherwise, the
      output data is time-ordered.

   FEC type: 4 bits
      This field indicates the type of FEC algorithm (erasure code) used
      for the RTP payload.  A value of zero means that no FEC is used
      for this RTP payload.  The mapping of non-zero FEC type values to
      FEC algorithms is not specified in this document.

   FEC subtype: 6 bits
      This field identifies a subtype of the FEC type defined above.
      The basic idea of using the two fields, FEC type and FEC subtype,
      to identify a certain FEC code is that the whole family of FEC
      codes can be divided into subfamilies, each of which contains FEC
      codes with similar characteristics and generation mechanisms.
      The FEC type field indicates which subfamily a given FEC code
      belongs to, and the FEC subtype field further indicates which
      generation parameters are used for that code within the
      subfamily.  FEC codes in a subfamily usually share a generation
      mechanism but differ in generation parameters and, as a result,
      in protection strength.  This field SHOULD be ignored if FEC type
      is zero.  For non-zero FEC subtype values, the parameters of the
      FEC algorithm, such as the values of N and K for an (N, K) erasure
      code, SHOULD be uniquely determined by this field, so that they
      can be derived exactly from any given non-zero value.

   Interleave index: 10 bits
      This field carries the zero-based index of the RTP packet within
      its interleaving group (see Section 3.2).  The receiver uses it to
      determine the indices of the lost packets within each interleaving
      group, which the FEC decoder needs.  In contrast, the sequence
      number (SN) field in the RTP header identifies lost packets within
      the RTP stream as a whole.  The value of this field SHOULD be less
      than N, the number of EDBs comprising an FEC unit.

   Block size (B): 8 bits
      Each data block contained in an FEC unit SHOULD have the same
      size.  This field indicates the number of octets in a block.

3.1.  FEC algorithm with unequal error protection

   Real-time data in an RTP session may differ in relative importance to
   the receiver.  Generally speaking, the encoded stream of a video
   sequence is transmitted frame by frame, and these frames differ in
   importance to the video decoder.  For example, reference frames are
   critical for the decoder to reconstruct the pictures that follow,
   whereas prediction frames are used to reconstruct just one picture
   based on the previous reference frame.
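
   The 32-bit payload header defined at the start of Section 3 can be
   packed and unpacked with a few shifts and masks.  The following is a
   minimal sketch, not part of the format definition.  It assumes
   network byte order with bit 0 of the diagram as the most significant
   bit, as is usual for RTP; the names FecPayloadHeader, pack_header and
   unpack_header are hypothetical.

```python
import struct
from typing import NamedTuple


class FecPayloadHeader(NamedTuple):
    """Fields of the generic FEC payload header (hypothetical container)."""
    reserved: int          # R, 3 bits, zero on transmission
    interleaved: bool      # C, 1 bit
    fec_type: int          # 4 bits, zero means no FEC
    fec_subtype: int       # 6 bits
    interleave_index: int  # 10 bits, zero-based index in the group
    block_size: int        # B, 8 bits, octets per block


def pack_header(h: FecPayloadHeader) -> bytes:
    """Serialize the header into 4 octets (assumed network byte order)."""
    word = ((h.reserved & 0x7) << 29
            | (int(h.interleaved) & 0x1) << 28
            | (h.fec_type & 0xF) << 24
            | (h.fec_subtype & 0x3F) << 18
            | (h.interleave_index & 0x3FF) << 8
            | (h.block_size & 0xFF))
    return struct.pack("!I", word)


def unpack_header(data: bytes) -> FecPayloadHeader:
    """Parse the first 4 octets of an RTP payload into header fields."""
    if len(data) < 4:
        raise ValueError("payload shorter than the 4-octet FEC header")
    (word,) = struct.unpack("!I", data[:4])
    return FecPayloadHeader(
        reserved=(word >> 29) & 0x7,
        interleaved=bool((word >> 28) & 0x1),
        fec_type=(word >> 24) & 0xF,
        fec_subtype=(word >> 18) & 0x3F,
        interleave_index=(word >> 8) & 0x3FF,
        block_size=word & 0xFF,
    )
```

   The field widths add up to 3+1+4+6+10+8 = 32 bits, so the header
   occupies exactly one 32-bit word ahead of the encoded data blocks.
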
   The loss of a reference frame is far more serious, in the sense that
   the video decoder can be forced into an interruption lasting a rather
   long period.

   Data of different importance MAY be unequally protected with
   different FEC algorithms.  If this is the case, only data of the same
   or similar importance can be encapsulated into one RTP packet.  In
   other words, only one FEC algorithm can be used throughout an RTP
   packet.

   The encoded stream from the sender is divided into a sequence of
   origin data blocks of B octets each.  Every K data blocks are encoded
   using an FEC algorithm to generate N (N>K) encoded blocks.

   When unequal protection is enabled and the total number of octets of
   the encoded stream with the same importance is not an exact multiple
   of K*B, padding octets should be appended so that the FEC algorithm
   can work correctly.

   The negotiation of FEC algorithms acceptable to both the sender and
   the receiver is not specified in this document.

3.2.  Interleaving FEC encoded blocks

   Bundling is used to spread the transmission overhead of the RTP and
   payload headers over multiple encoded blocks so as to improve
   bandwidth efficiency, at the cost of added delay (see Section 3.4).
   Interleaving [RFC3453] additionally reduces the receiver's perception
   of the negative effects of data losses by spreading such losses over
   non-consecutive blocks.  Interleaving is meaningful only if more than
   one encoded block is put into a single RTP packet.

   All receivers MUST support interleaving.  Senders MAY support
   interleaving.

   Consider a sequence of n EDBs from the sender, numbered 0, ..., n-1,
   and an (N, K) FEC algorithm.  Note that n SHOULD be an exact multiple
   of N; this can be achieved using the padding described in Section
   3.1.
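
   The padding rule from Section 3.1 can be sketched as follows.  This
   is an illustration only: the draft does not say which octet value is
   used for padding or how the receiver removes it, so zero-filled
   padding and the function name split_into_odbs are assumptions.

```python
from typing import List


def split_into_odbs(data: bytes, k: int, block_size: int) -> List[bytes]:
    """Split data into origin data blocks of block_size octets each.

    The data is first padded so that its length is an exact multiple of
    k * block_size, i.e. a whole number of FEC batches of K blocks.
    """
    if k <= 0 or block_size <= 0:
        raise ValueError("k and block_size must be positive")
    batch_octets = k * block_size
    # Number of octets missing to reach the next multiple of K*B.
    pad_len = (-len(data)) % batch_octets
    padded = data + bytes(pad_len)  # zero padding is an assumption
    return [padded[i:i + block_size]
            for i in range(0, len(padded), block_size)]
```

   For example, 10 octets with K=2 and B=4 are padded by 6 octets to 16,
   giving 4 ODBs, that is, two complete batches of K blocks.  After FEC
   encoding, each batch yields N EDBs, so the total number of EDBs is an
   exact multiple of N.
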
   These n EDBs are organized into F=n/N FEC units, which are further
   interleaved into G=(F+U-1)/U groups (integer division) with
   interleaving length N, where U is the maximum bundling value, i.e.,
   the maximum number of encoded blocks, one per FEC unit, that an RTP
   packet can contain.

   The blocks are placed into RTP packets as follows:

   This continues to the last RTP packet in interleaved group 0:

   This continues to interleaved group G-2.  Each RTP packet belonging
   to these G-1 interleaved groups contains U encoded blocks.

   At this point, F mod U FEC units remain to be interleaved in
   interleaved group G-1 (or U units if F is an exact multiple of U).
   The same interleaving method is used for interleaved group G-1, with
   each RTP packet containing only V=F mod U encoded blocks.

3.3.  Encapsulating interleaved blocks into RTP packets

   An RTP packet MUST contain one or more whole encoded blocks, i.e., an
   encoded block SHOULD NOT span two or more RTP packets.  The sender
   MUST NOT place more data blocks in a single RTP packet than fit in
   the MTU (maximum transmission unit) of the underlying transport
   protocol.

   The following formula can be used to calculate the number of blocks
   in an RTP packet.  The RTP packet size can be obtained from the
   underlying transport layer.

      block number = (RTP packet size - RTP header length
                      - payload header length) / block size

   When interleaving is enabled, the bundling value V always equals the
   number of blocks in an RTP packet.

3.4.  Delay consideration and parameter determination

   FEC and interleaving reduce the effect of packet losses at the cost
   of increased delay.  Large values of N, K or U increase not only the
   receiver's tolerance of packet losses but also the delay.  The choice
   of FEC and interleaving parameters should therefore trade off low
   end-to-end delay against tolerance of packet losses.

3.5.  Detecting packet losses

   The receiver determines the expected bundling value V for all RTP
   packets in an interleaved group from the number of encoded blocks
   bundled in the first RTP packet of that group that it receives.  Note
   that this may not be the first RTP packet of the interleaved group if
   packets are lost or delivered out of order by the underlying
   transport.

   Given an RTP packet with sequence number S, interleaving length N
   (which can be derived from the FEC subtype field), interleave index
   value I, and bundling value U, the interleaved group consists of this
   RTP packet and the other RTP packets with sequence numbers ranging
   from (S-I) mod 65536 to (S-I+N-1) mod 65536 inclusive.  In other
   words, the interleaved group always consists of N RTP packets with
   consecutive sequence numbers.

   The bundling values of all RTP packets in an interleaved group MUST
   be the same.

   The indices of the lost packets in a given interleaved group can be
   detected quickly from the interleave index field.  This information
   is needed by the FEC decoder to recover the lost packets.

4.  IANA Considerations

   This draft introduces no IANA considerations.

   This document specifies no default mapping of the FEC type and FEC
   subtype fields to an FEC algorithm and its parameters, except that a
   zero value of FEC type means that FEC is disabled.  The senders and
   receivers of an RTP session may agree on a common mapping table
   through negotiation, which is not specified in this document.

5.  Security Considerations

   RTP packets using the payload format defined in this specification
   are subject to the security considerations discussed in the RTP
   specification [RFC3550] and in any appropriate RTP profile.  This
   implies that confidentiality of the media streams is achieved by
   encryption.
   Because the data compression used with this payload format is applied
   end-to-end, encryption may be performed after compression, so there
   is no conflict between the two operations.

   A potential denial-of-service threat exists for data encodings that
   use compression techniques with a non-uniform receiver-end
   computational load.  An attacker can inject pathological datagrams
   into the stream that are complex to decode and cause the receiver to
   be overloaded.  Authentication of at least the RTP packet is
   RECOMMENDED.

   As with any IP-based protocol, in some circumstances a receiver may
   be overloaded simply by the receipt of too many packets, either
   desired or undesired.  Network-layer authentication may be used to
   discard packets from undesired sources, but the processing cost of
   the authentication itself may be too high.  In a multicast
   environment, pruning of specific sources may be implemented in future
   versions of IGMP and in multicast routing protocols to allow a
   receiver to select which sources are allowed to reach it.

   A security review of this payload format found no additional
   considerations beyond those in the RTP specification.

6.  An example application of generic FEC in H.264

6.1.  Network Abstraction Layer Unit (NALU)

   In H.264 [H264], the Network Abstraction Layer (NAL) encoder
   encapsulates the slice output of the Video Coding Layer (VCL) encoder
   into NAL units (NALUs), which are suitable for transmission over
   packet networks or for use in packet-oriented multiplex environments.

   A NALU has the following format:

   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ ... +-+-+-+-+-+-+-+
   |F|NRI|  Type   |                payload                |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ ... +-+-+-+-+-+-+-+

   F: 1 bit
      forbidden_zero_bit.  The H.264 specification declares a value of 1
      as a syntax violation.

   NRI: 2 bits
      nal_ref_idc.
      This field signals the importance of a NAL unit for the
      reconstruction process.  A value of 0 indicates that the NAL unit
      is not used for prediction, and hence could be discarded by the
      decoder or by network elements without the risk of causing
      drifting effects.  Values higher than 0 indicate that the NALU is
      required for a drift-free reconstruction.  The higher the value,
      the heavier the impact that a loss of that NALU would have on the
      decoding process.

   Type: 5 bits
      NALU type.  This component specifies the NALU payload type [H264].

6.2.  Design consideration

   An H.264 coded stream is transmitted or stored as a sequence of NAL
   units.  Different values of the NRI field in the NAL unit header
   signal different levels of importance.  I-slices of H.264 video, with
   NRI values greater than 0, are critical to the receiver.  The data
   located between two adjacent I-slices, called a picture sequence, are
   less important than the I-slices.  Furthermore, within a picture
   sequence, the data close to the beginning of the sequence are in
   general more important, and are likely to carry more information,
   than the data further back in the sequence.

   For these reasons, the H.264 encoded streams are unequally protected,
   with FEC protection strengths chosen according to the value of the
   NRI field.  NAL units with greater NRI values are protected by
   stronger protection schemes.

6.2.1.  I-slice

   I-slices should be strongly and separately protected.  The
   time-ordered NALUs of an I-slice are divided into data blocks of B
   octets each.  For every K data blocks, N encoded blocks are first
   generated via a strong FEC algorithm with more redundancy, then
   interleaved into an interleaved group, and finally encapsulated into
   N consecutive RTP packets.

6.2.2.  P-slice

   Assume that the P-slices belonging to the same picture sequence total
   W octets, and that an interleaved group carries U*K*B octets,
   excluding the redundant octets appended by FEC.  These P-slices can
   be divided into W/(U*K*B) complete groups (integer division), each of
   which is protected with a normal FEC algorithm, interleaved, and
   encapsulated.  W mod (U*K*B) octets may remain.  These octets are
   encapsulated directly, i.e., they are neither protected with FEC nor
   interleaved, because they lie at the end of the picture sequence and
   hence have lower importance.  Even if these octets are lost, their
   impact on the reconstructed pictures is eliminated by the following
   I-slice.

7.  Conclusion

   The generic RTP payload format described in this document can improve
   the error resilience of real-time communications over error-prone IP
   networks with packet losses.  Furthermore, the method is simple and
   easy to implement.

8.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC3550]  Schulzrinne, H., Casner, S., Frederick, R., and V.
              Jacobson, "RTP: A Transport Protocol for Real-Time
              Applications", RFC 3550, July 2003.

   [RFC3453]  Luby, M., Vicisano, L., Gemmell, J., Rizzo, L., Handley,
              M., and J. Crowcroft, "The Use of Forward Error Correction
              (FEC) in Reliable Multicast", RFC 3453, December 2002.

   [RFC3558]  Li, A., "RTP Payload Format for Enhanced Variable Rate
              Codecs (EVRC) and Selectable Mode Vocoders (SMV)",
              RFC 3558, July 2003.

   [H264]     International Telecommunications Union, "Advanced video
              coding for generic audiovisual services", ITU
              Recommendation H.264, July 2003.

Authors' Addresses

   Bin Song
   Xidian Univ.
   2 South TaiBai Road
   Xi'an, Shaanxi  710071
   CN

   Phone: +86 29 8820 4409
   Email: bsong@mail.xidian.edu.cn


   Hao Qin
   Xidian Univ.
   2 South TaiBai Road
   Xi'an, Shaanxi  710071
   CN

   Phone: +86 29 8820 3614
   Email: hqin@mail.xidian.edu.cn

Intellectual Property Statement

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights.  Information
   on the procedures with respect to rights in RFC documents can be
   found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use of
   such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository at
   http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard.  Please address the information to the IETF at
   ietf-ipr@ietf.org.

Disclaimer of Validity

   This document and the information contained herein are provided on an
   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
   ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
   INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
   INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Copyright Statement

   Copyright (C) The Internet Society (2006).  This document is subject
   to the rights, licenses and restrictions contained in BCP 78, and
   except as set forth therein, the authors retain all their rights.

Acknowledgment

   Funding for the RFC Editor function is currently provided by the
   Internet Society.
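
   As an illustration of the loss detection procedure in Section 3.5,
   the sketch below computes the sequence numbers of an interleaved
   group from one received packet and lists the interleave indices that
   are missing.  It is a non-normative sketch: it reads the group as
   exactly N consecutive packets, as that section's statement that a
   group "always consists of N RTP packets" implies, and the function
   names are hypothetical.

```python
from typing import Iterable, List

SEQ_MOD = 1 << 16  # RTP sequence numbers are 16 bits wide


def group_sequence_numbers(seq: int, interleave_index: int,
                           n: int) -> List[int]:
    """Sequence numbers of the N packets in the group containing seq.

    seq is the RTP sequence number S of a received packet and
    interleave_index is the value I from its payload header.
    """
    first = (seq - interleave_index) % SEQ_MOD
    return [(first + i) % SEQ_MOD for i in range(n)]


def lost_indices(received_indices: Iterable[int], n: int) -> List[int]:
    """Interleave indices in 0..N-1 that were not seen for one group."""
    return sorted(set(range(n)) - set(received_indices))
```

   For a packet with S=2 and I=5 in a group of N=8, the group starts at
   sequence number 65533 and wraps around to 4.  If only the packets
   with interleave indices 0, 1, 3, 4 and 7 arrive, the indices 2, 5 and
   6 are reported as lost and can be passed to the FEC decoder.
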