idnits 2.17.1

draft-ietf-avt-rtp-payload-04.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** Cannot find the required boilerplate sections (Copyright, IPR, etc.) in
     this document.
     Expected boilerplate is as follows today (2024-04-25) according to
     https://trustee.ietf.org/license-info :

       IETF Trust Legal Provisions of 28-dec-2009, Section 6.a:

          This Internet-Draft is submitted in full conformance with the
          provisions of BCP 78 and BCP 79.

       IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 2:

          Copyright (c) 2024 IETF Trust and the persons identified as the
          document authors. All rights reserved.

       IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 3:

          This document is subject to BCP 78 and the IETF Trust's Legal
          Provisions Relating to IETF Documents
          (https://trustee.ietf.org/license-info) in effect on the date of
          publication of this document. Please review these documents
          carefully, as they describe your rights and restrictions with respect
          to this document. Code Components extracted from this document must
          include Simplified BSD License text as described in Section 4.e of
          the Trust Legal Provisions and are provided without warranty as
          described in the Simplified BSD License.


  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  ** Missing expiration date. The document expiration date should appear on
     the first and last page.

  ** The document seems to lack a 1id_guidelines paragraph about
     Internet-Drafts being working documents.

  ** The document seems to lack a 1id_guidelines paragraph about 6 months
     document validity -- however, there's a paragraph with a matching
     beginning. Boilerplate error?

  ** The document seems to lack a 1id_guidelines paragraph about the list of
     current Internet-Drafts.
  ** The document seems to lack a 1id_guidelines paragraph about the list of
     Shadow Directories.

  ** Expected the document's filename to be given on the first page, but
     didn't find any

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard

  == The page length should not exceed 58 lines per page, but there was 1
     longer page, the longest (page 11) being 63 lines

  == It seems as if not all pages are separated by form feeds - found 0 form
     feeds but 11 pages


  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack a Security Considerations section.

  ** The document seems to lack an IANA Considerations section. (See Section
     2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
     when there are no actions for IANA.)

  ** The document seems to lack separate sections for Informative/Normative
     References. All references will be assumed normative when checking for
     downward references.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == Line 70 has weird spacing: '...ists of k*16 ...'

  == Line 252 has weird spacing: '...he same on su...'

  == Line 507 has weird spacing: '...ices at p x 6...'

  == Couldn't figure out when the document was first submitted -- there may
     comments or warnings related to the use of a disclaimer for pre-RFC5378
     work that could not be issued because of this. Please check the Legal
     Provisions document at https://trustee.ietf.org/license-info to determine
     if you need the pre-RFC5378 disclaimer.

  -- The document date (June 15, 1997) is 9811 days in the past. Is this
     intentional?
  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  ** Obsolete normative reference: RFC 1889 (ref. '1') (Obsoleted by RFC 3550)

  -- Possible downref: Non-RFC (?) normative reference: ref. '2'

  ** Obsolete normative reference: RFC 1890 (ref. '3') (Obsoleted by RFC 3551)

  -- Possible downref: Non-RFC (?) normative reference: ref. '4'

  ** Obsolete normative reference: RFC 2032 (ref. '5') (Obsoleted by RFC 4587)

  -- Possible downref: Non-RFC (?) normative reference: ref. '6'


     Summary: 13 errors (**), 0 flaws (~~), 7 warnings (==), 4 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

1    Internet Engineering Task Force                Audio-Video Transport WG
2    INTERNET-DRAFT                                                   C. Zhu
3                                                                Intel Corp.
4                                                              June 15, 1997
5                                                 Expires: December 15, 1997

7                 RTP Payload Format for H.263 Video Streams

9    Status of This Memo

11   This document is an Internet-Draft. Internet-Drafts are working
12   documents of the Internet Engineering Task Force (IETF), its
13   areas, and its working groups. Note that other groups may also
14   distribute working documents as Internet-Drafts.

16   Internet-Drafts are draft documents valid for a maximum of six
17   months and may be updated, replaced, or obsoleted by other
18   documents at any time. It is inappropriate to use Internet-
19   Drafts as reference material or to cite them other than as
20   ``work in progress.''

22   To learn the current status of any Internet-Draft, please check
23   the ``1id-abstracts.txt'' listing contained in the Internet-
24   Drafts Shadow Directories on ftp.is.co.za (Africa),
25   nic.nordu.net (Europe), munnari.oz.au (Pacific Rim),
26   ds.internic.net (US East Coast), or ftp.isi.edu (US West Coast).
28   Distribution of this document is unlimited.

30   Abstract

32   This document specifies the payload format for encapsulating an H.263
33   bitstream in the Real-Time Transport Protocol (RTP). Three modes are
34   defined for the H.263 payload header. An RTP packet can use one of the
35   three modes for H.263 video streams depending on the desired
36   network packet size and the H.263 encoding options employed.
37   The shortest H.263 payload header (mode A) supports fragmentation
38   at Group of Blocks (GOB) boundaries. The long H.263 payload headers
39   (modes B and C) support fragmentation at Macroblock (MB) boundaries.

41   1. Introduction

43   This document describes a scheme to packetize an H.263 video stream for
44   transport using RTP [1]. The H.263 video stream is defined by ITU-T
45   Recommendation H.263 (referred to as H.263 in this document) [4] for
46   video coding at very low data rates. RTP is defined by the Internet
47   Engineering Task Force (IETF) to provide end-to-end network transport
48   functions suitable for applications transmitting real-time data over
49   multicast or unicast network services.

51   2. Definitions

53   The following definitions apply in this document:

55   CIF: Common Intermediate Format. For H.263, a CIF picture has 352 x 288
56   pixels for luminance, and 176 x 144 pixels for chrominance.

58   QCIF: Quarter CIF source format with 176 x 144 pixels for luminance and
59   88 x 72 pixels for chrominance.

61   Sub-QCIF: Picture source format with 128 x 96 pixels for luminance and
62   64 x 48 pixels for chrominance.

64   4CIF: Picture source format with 704 x 576 pixels for luminance and
65   352 x 288 pixels for chrominance.

67   16CIF: Picture source format with 1408 x 1152 pixels for luminance and
68   704 x 576 pixels for chrominance.

70   GOB: For H.263, a Group of Blocks (GOB) consists of k*16 lines, where
71   k depends on the picture format (k=1 for QCIF, CIF and sub-QCIF; k=2
72   for 4CIF and k=4 for 16CIF).
74   MB: A macroblock (MB) contains four blocks of luminance and the
75   spatially corresponding two blocks of chrominance. Each block consists
76   of 8x8 pixels. For example, there are eleven MBs in a GOB in QCIF
77   format and twenty-two MBs in a GOB in CIF format.

79   3. Design Issues for Packetizing H.263 Bitstreams

81   H.263 is based on the ITU-T Recommendation H.261 [2] (referred to as
82   H.261 in this document). Compared to H.261, H.263 employs similar
83   techniques to reduce both temporal and spatial redundancy, but there
84   are several major differences between the two algorithms that
85   affect the design of packetization schemes significantly. This
86   section summarizes those differences.

88   3.1 Optional Features of H.263

90   In addition to the basic source coding algorithms, H.263 supports four
91   negotiable coding options to improve performance: Advanced Prediction,
92   PB-frames, Syntax-based Arithmetic Coding, and Unrestricted Motion
93   Vectors. They can be used in any combination.

95   Advanced Prediction (AP): One or four motion vectors can be used
96   for some macroblocks in a frame. This feature makes recovery from
97   packet loss difficult, because more redundant information has to be
98   preserved at the beginning of a packet when fragmenting at a macroblock
99   boundary.

101  PB-frames: Two frames (a P frame and a B frame) are coded into one
102  bitstream with macroblocks from the two frames interleaved. From a
103  packetization point of view, a MB from the P frame and a MB from the B
104  frame must be treated together because each MB for the B frame is coded
105  based on the corresponding MB for the P frame. A means must be provided
106  to ensure proper rendering of the two frames in the right order. Also, if
107  part of this combined bitstream is lost, it will affect both frames,
108  and possibly more.
110  Syntax-based Arithmetic Coding (SAC): When the SAC option is used, the
111  resultant run-value pair after quantization of Discrete Cosine
112  Transform (DCT) coefficients will be coded differently from Huffman
113  codes, but the macroblock hierarchy will be preserved. Since context
114  variables are only synchronized after fixed length codes in the
115  bitstream, any fragmentation starting at variable length codes
116  will result in difficulty in decoding in the presence of packet
117  loss without carrying the values of all the context variables in each
118  H.263 payload header.

120  The Unrestricted Motion Vectors feature allows a large range of motion
121  vectors to improve performance of motion compensation for inter-coded
122  pictures. This option also affects packetization because it uses a
123  larger range of motion vectors than normal.

125  To enable proper decoding of packets received, without dependency on
126  previous packets, the use of these optional features is signaled in the
127  H.263 payload header, as described in Section 5.

129  3.2 GOB Numbering

131  In H.263, each picture is divided into groups of blocks (GOB). GOBs are
132  numbered according to a vertical scan of a picture, starting with the
133  top GOB and ending with the bottom GOB. In contrast, a GOB in H.261
134  is composed of three rows of 16x16 MB for QCIF, and three half-rows
135  of MBs for CIF. A GOB is divided into macroblocks in H.263
136  and the definition of the macroblocks is the same as in H.261.

138  Each GOB in H.263 can have a fixed GOB header, but the use
139  of the header is optional. If the GOB header is present, it may or may
140  not start on a byte boundary. Byte alignment can be achieved by proper
141  bit stuffing by the encoder, but it is not required by the H.263
142  bitstream specification [4].

144  In summary, a GOB in H.263 is defined and coded with finer granularity
145  but with the same source format, resulting in more flexibility for
146  packetization than with H.261.
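The GOB and MB geometry implied by the definitions in Section 2 can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the payload format; the table name FORMATS and the helper gob_geometry are hypothetical names chosen for illustration, and the only inputs are the luminance sizes and k values stated in this document.

```python
# Luminance width, luminance height and k (from the GOB definition in
# Section 2) for each picture format. Names are illustrative only.
FORMATS = {
    "sub-QCIF": (128, 96, 1),
    "QCIF": (176, 144, 1),
    "CIF": (352, 288, 1),
    "4CIF": (704, 576, 2),
    "16CIF": (1408, 1152, 4),
}

# A macroblock holds four 8x8 luminance blocks, i.e. 16x16 luminance pixels.
MB_SIZE = 16


def gob_geometry(name):
    """Return (GOBs per picture, MBs per GOB) for a picture format."""
    width, height, k = FORMATS[name]
    gob_lines = k * MB_SIZE                 # a GOB consists of k*16 lines
    gobs_per_picture = height // gob_lines
    mbs_per_gob = (width // MB_SIZE) * k    # k rows of MBs in each GOB
    return gobs_per_picture, mbs_per_gob


for name in FORMATS:
    gobs, mbs = gob_geometry(name)
    print(f"{name}: {gobs} GOBs per picture, {mbs} MBs per GOB")
```

For QCIF and CIF this gives eleven and twenty-two MBs per GOB, in agreement with the MB definition in Section 2. The largest values produced (18 GOBs per picture, 352 MBs per GOB for 16CIF) fit within the 5-bit GOBN and 9-bit MBA fields defined in Section 5.2.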
148  3.3 Motion Vector Encoding

150  Differential coding is used to code motion vectors as variable length
151  codes. Unlike in H.261, where each motion vector is predicted from the
152  previous MB in the GOB, H.263 employs a more flexible prediction scheme,
153  where one or three candidate predictors could be used depending on
154  the presence of GOB headers.

156  If the GOB header is present in a GOB, motion vectors are coded with
157  reference to MBs in the current GOB only. If a GOB header is not
158  present in the current GOB, three motion vectors must be available to
159  decode one macroblock, where two of them might come from the previous
160  GOB. To correctly decode a whole inter-coded GOB, all the motion
161  vectors for MBs in the previous GOB must be available to compute the
162  predictors or the predictors themselves must be present. The optional
163  use of three motion vector predictors can be a major problem for a
164  packetization scheme like the one defined for H.261 when packetizing
165  at MB boundaries [5].

167  Consider the case where a packet starts with a MB but the GOB header
168  is not present. If the previous packet is lost, then none of the motion
169  vectors needed to predict the motion vectors for the MBs in the current
170  GOB are available. In order to decode the received MBs correctly,
171  all the motion vectors for the previous GOB or the motion vector
172  predictors would have to be duplicated at the beginning of the packet.
173  This kind of duplication would be very expensive and unacceptable in
174  terms of bandwidth overhead.

176  The encoding strategy of each H.263 CODEC (CODer and DECoder)
177  implementation is beyond the scope of this document, even though it
178  has a significant effect on visual quality in the presence of
179  packet loss. However, we strongly recommend use of the GOB header
180  for every GOB at the beginning of a packet to address this problem.
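The dependency described above can be illustrated with a simplified sketch of the predictor computation. This document does not define the prediction rule; the component-wise median of three candidates used below is taken from H.263 [4], and the picture-edge special cases of that Recommendation are deliberately omitted. The function names and the (horizontal, vertical) tuple representation are assumptions made for illustration.

```python
def median3(a, b, c):
    """Return the middle value of three numbers."""
    return sorted((a, b, c))[1]


def predict_mv(left, above, above_right, gob_header_present):
    """Simplified motion vector predictor for one macroblock.

    Each candidate is a (horizontal, vertical) tuple in half-pixel units.
    `left` comes from the current GOB; `above` and `above_right` come from
    the previous GOB. Picture-edge rules of H.263 are not modeled here.
    """
    if gob_header_present:
        # With a GOB header, only MBs of the current GOB are referenced,
        # so the two candidates from the previous GOB are not needed.
        above = above_right = left
    return (
        median3(left[0], above[0], above_right[0]),
        median3(left[1], above[1], above_right[1]),
    )


# Without a GOB header all three candidates are required, so losing the
# packet that carried the previous GOB makes the predictor unavailable.
print(predict_mv((2, -4), (6, 0), (-2, 8), gob_header_present=False))

# With a GOB header the predictor depends on the current GOB only.
print(predict_mv((2, -4), (6, 0), (-2, 8), gob_header_present=True))
```

The sketch shows why a GOB header at the start of a packet removes the cross-packet dependency: the result no longer depends on the two candidates that may have been carried in a lost packet.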
182  Similar problems exist because of cross-GOB data dependency related to
183  motion vectors, but they can not be addressed by using the GOB header.
184  For 16CIF and 4CIF pictures, a GOB contains more than one row of MBs.
185  If a GOB can not fit in one RTP packet, and the first packet containing
186  the GOB header is lost, then MBs in the second packet can not compute
187  motion vectors correctly, because they are coded relative to data in
188  the lost packet. Similarly, when OBMC (Overlapped Block Motion
189  Compensation) [4] in Advanced Prediction mode is used, motion
190  compensation for some MBs in one GOB could use motion vectors of MBs
191  in the previous GOB regardless of the presence of a GOB header. When
192  MBs that are used to decode received MBs are lost, those
193  received MBs can not be decoded correctly. Each implementation of
194  the method described in this document should take these limitations
195  into account.

197  3.4 Macroblock Address

199  As specified by H.261, a macroblock address (MBA) is encoded with a
200  variable length code to indicate the position of a macroblock within
201  a group of MBs in H.261 bitstreams. H.263 does not code the MBA
202  explicitly, but the macroblock address within a GOB is necessary to
203  recover from packet loss when fragmenting at MB boundaries.
204  Therefore, this information must be included in the H.263 payload
205  header for modes (mode B and mode C as described in Section 5) that
206  allow packetization at MB boundaries.

208  4. Usage of RTP

210  When transmitting H.263 video streams over the Internet, the output
211  of the encoder can be packetized directly. For every video frame,
212  the H.263 bitstream itself is carried in the RTP payload without
213  alteration, including the picture start code, the entire picture
214  header, in addition to any fixed length codes and variable length codes.
215  In addition, the output of the encoder is packetized without adding
216  the framing information specified by H.223 [6]. Therefore multiplexing
217  audio and video signals in the same packet is not accommodated, as UDP
218  and RTP provide a much more efficient way to achieve multiplexing.

220  RTP does not guarantee a reliable and orderly data delivery service,
221  so a packet might get lost in the network. To achieve a best-effort
222  recovery from packet loss, the decoder needs assistance to proceed with
223  decoding of other packets that are received. Thus it is desirable to
224  be able to process each packet independently of other packets.
225  Some frame level information is included in each packet, such as source
226  format and flags for optional features to assist the decoder in
227  operating correctly and efficiently in the presence of packet loss. The
228  flags for H.263 optional features also provide information about
229  coding options used in H.263 video bitstreams that can be used by
230  session management tools.

232  H.263 video bitstreams will be carried as payload data within RTP
233  packets. A new H.263 payload header is defined in Section 5.
234  This section defines the usage of the RTP fixed header
235  and the H.263 video packet structure.

237  4.1 RTP Header Usage

239  Each RTP packet starts with a fixed RTP header [1]. The following
240  fields of the RTP fixed header are used for H.263 video streams:

242  Marker bit (M bit): The Marker bit of the RTP fixed header is set to 1
243  when the current packet carries the end of the current frame; set to 0
244  otherwise.

246  Payload Type (PT): The Payload Type shall specify H.263 video payload
247  format using the value specified by the RTP profile in use, for
248  example RFC 1890 [3].

250  Timestamp: The RTP timestamp encodes the sampling instant of the
251  video frame contained in the RTP data packet.
The RTP timestamp may be
252  the same on successive packets if a video frame occupies more than one
253  packet. For H.263 video streams, the RTP timestamp is based on a 90 kHz
254  clock, the same as the RTP timestamp for H.261 video streams [5].

256  4.2 Video Packet Structure

258  For each RTP packet, the RTP fixed header is followed by the H.263
259  payload header, which is followed by the standard H.263 compressed
260  bitstream [4].

262  The size of the H.263 payload header is variable depending on the mode
263  used, as detailed in the next section. The layout of an RTP H.263
264  video packet is shown as:

266   0                   1                   2                   3
267   0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
268  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
269  |                          RTP header                           |
270  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
271  |                     H.263 payload header                      |
272  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
273  |                        H.263 bitstream                        |
274  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

276  5. H.263 Payload Header

278  For H.263 video streams, each RTP packet carries only one H.263 video
279  packet. The H.263 payload header is always present for each H.263 video
280  packet.

282  Three formats (mode A, mode B and mode C) are defined for the H.263
283  payload header. In mode A, an H.263 payload header of four bytes is
284  present before the actual compressed H.263 video bitstream in a packet.
285  It allows fragmentation at GOB boundaries. In mode B, an eight byte
286  H.263 payload header is used and each packet starts at MB boundaries
287  without the PB-frames option. Finally, a twelve byte H.263 payload
288  header is defined in mode C to support fragmentation at MB boundaries
289  for frames that are coded with the PB-frames option.

291  The mode of each H.263 payload header is indicated by the F and
292  P fields in the header. Packets of different modes can be
293  intermixed.
All client applications are required to be able to
294  receive packets in any mode, but decoding of mode C packets
295  is optional because the PB-frames feature is optional.

297  In this section, the H.263 payload format is shown as rows of 32-bit
298  words. Each word is transmitted in network byte order. Whenever a
299  field represents a numeric value, the most significant bit is at
300  the left of the field.

302  5.1 Mode A

304  In this mode, an H.263 bitstream will be packetized
305  on a GOB boundary or a picture boundary. Mode A packets always
306  start with the H.263 picture start code [4] or a GOB, but do not
307  necessarily contain complete GOBs. Four bytes are used for the
308  mode A H.263 payload header. The H.263 payload header definition
309  for mode A is shown as follows with F=0. Mode A packets are allowed
310  to start at a GOB boundary even if no GOB header is present in the
311  bitstream for the GOB. However, such use is discouraged due to
312  the dependencies it creates across GOB boundaries, as described
313  in Section 3.3.

315   0                   1                   2                   3
316   0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
317  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
318  |F|P|SBIT |EBIT | SRC |I|U|S|A|R      |DBQ| TRB |    TR         |
319  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

321  F: 1 bit
322  The flag bit indicates the mode of the payload header. F=0, mode A;
323  F=1, mode B or mode C depending on the P bit defined below.

325  P: 1 bit
326  Optional PB-frames mode as defined by H.263 [4]. "0" implies normal
327  I or P frame, "1" PB-frames. When F=1, P also indicates modes: mode B
328  if P=0, mode C if P=1.

330  SBIT: 3 bits
331  Start bit position specifies the number of most significant bits that
332  shall be ignored in the first data byte.

334  EBIT: 3 bits
335  End bit position specifies the number of least significant bits that
336  shall be ignored in the last data byte.
338  SRC: 3 bits
339  Source format, bits 6, 7 and 8 in PTYPE defined by H.263 [4], specifies
340  the resolution of the current picture.

342  I: 1 bit
343  Picture coding type, bit 9 in PTYPE defined by H.263 [4],
344  "0" is intra-coded, "1" is inter-coded.

346  U: 1 bit
347  Set to 1 if the Unrestricted Motion Vector option, bit 10 in PTYPE
348  defined by H.263 [4], was set to 1 in the current picture header,
349  otherwise 0.

351  S: 1 bit
352  Set to 1 if the Syntax-based Arithmetic Coding option, bit 11 in PTYPE
353  defined by H.263 [4], was set to 1 in the current picture header,
354  otherwise 0.

356  A: 1 bit
357  Set to 1 if the Advanced Prediction option, bit 12 in PTYPE defined
358  by H.263 [4], was set to 1 in the current picture header, otherwise 0.

360  R: 4 bits
361  Reserved, must be set to zero.

363  DBQ: 2 bits
364  Differential quantization parameter used to calculate the quantizer for
365  the B frame based on the quantizer for the P frame, when the PB-frames
366  option is used. The value should be the same as DBQUANT defined by
367  H.263 [4]. Set to zero if the PB-frames option is not used.

369  TRB: 3 bits
370  Temporal Reference for the B frame as defined by H.263 [4]. Set to
371  zero if the PB-frames option is not used.

373  TR: 8 bits
374  Temporal Reference for the P frame as defined by H.263 [4]. Set to
375  zero if the PB-frames option is not used.

377  5.2 Mode B

379  In this mode, an H.263 bitstream can be fragmented at MB boundaries.
380  Whenever a packet starts at a MB boundary, this mode shall be used
381  without the PB-frames option. Mode B packets are intended for a GOB
382  whose size is larger than the maximum packet size allowed in the
383  underlying protocol, thus making it impossible to fit one or more
384  complete GOBs in a packet. This mode can only be used without the
385  PB-frames option. Mode C as defined in the next section can be used
386  to fragment H.263 bitstreams at MB boundaries with the PB-frames option.
387  The H.263 payload header definition for mode B is shown as follows
388  with F=1 and P=0:

390   0                   1                   2                   3
391   0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
392  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
393  |F|P|SBIT |EBIT | SRC | QUANT   |  GOBN   |   MBA           |R  |
394  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
395  |I|U|S|A|    HMV1     |    VMV1     |    HMV2     |    VMV2     |
396  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

398  The following fields are defined the same as in mode A: F, P, SBIT,
399  EBIT, SRC, I, U, S and A. Other fields are defined as follows:

401  QUANT: 5 bits
402  Quantization value for the first MB coded at the start of the packet.
403  Set to 0 if the packet begins with a GOB header. This is the equivalent
404  of GQUANT defined by H.263 [4].

406  GOBN: 5 bits
407  GOB number in effect at the start of the packet. GOB number is specified
408  differently for different resolutions. See H.263 [4] for details.

410  MBA: 9 bits
411  The address within the GOB of the first MB in the packet, counting from
412  zero in scan order. For example, the third MB in any GOB is given
413  MBA = 2.

415  HMV1, VMV1: 7 bits each
416  Horizontal and vertical motion vector predictors for the first MB
417  in this packet [4]. When four motion vectors are used for the current
418  MB with the advanced prediction option, these would be the motion
419  vector predictors for block number 1 in the MB. Each 7-bit field
420  encodes a motion vector predictor in half pixel resolution as a 2's
421  complement number.

423  HMV2, VMV2: 7 bits each
424  Horizontal and vertical motion vector predictors for block number 3
425  in the first MB in this packet when four motion vectors are used with
426  the advanced prediction option. This is needed because block number 3
427  in the MB needs different motion vector predictors from
428  other blocks in the MB. These two fields are not used when the
429  MB only has one motion vector.
See H.263 [4] for
430  block organization in a macroblock. Each 7-bit field encodes a motion
431  vector predictor in half pixel resolution as a 2's complement number.

433  R: 2 bits
434  Reserved, must be set to zero.

436  5.3 Mode C

438  In this mode, an H.263 bitstream is fragmented at MB boundaries of P
439  frames with the PB-frames option. It is intended for those GOBs
440  whose sizes are larger than the maximum packet size allowed in the
441  underlying protocol when the PB-frames option is used. The H.263 payload
442  header definition for mode C is shown as follows with F=1 and P=1:

444   0                   1                   2                   3
445   0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
446  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
447  |F|P|SBIT |EBIT | SRC | QUANT   |  GOBN   |   MBA           |R  |
448  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
449  |I|U|S|A|    HMV1     |    VMV1     |    HMV2     |    VMV2     |
450  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
451  |                 RR                  |DBQ| TRB |    TR         |
452  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

454  The following fields are defined the same as in mode B: F, P, SBIT,
455  EBIT, SRC, QUANT, GOBN, MBA, R, I, U, S, A, HMV1, VMV1, HMV2, VMV2. The
456  fields TR, DBQ and TRB are defined the same as in mode A. The
457  remaining field, RR, takes 19 bits and is currently reserved.
458  It must be set to zero.

460  5.4 Selection of Modes for the H.263 Payload Header

462  Packets carrying H.263 video streams with different modes can be
463  intermixed. The modes shall be selected carefully based on network
464  packet size, H.263 coding options and underlying network protocols.
465  More specifically, mode A shall be used for packets starting with a GOB
466  or the H.263 picture start code [4], and mode B or C shall be used
467  whenever a packet has to start at a MB boundary. Mode B or C is
468  necessary for those GOBs with sizes larger than the network packet size.
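The selection rule above and the header layouts of Sections 5.1 to 5.3 can be sketched as follows. This is a minimal illustration under stated assumptions, not a reference implementation: the function names (select_mode, parse_h263_payload_header), the dictionary keys and the "mode"/"length" entries are hypothetical, and the bit positions are read directly off the header diagrams, with bit 0 as the most significant bit of each 32-bit word in network byte order.

```python
import struct


def select_mode(starts_at_gob_or_picture, pb_frames):
    """Pick a payload header mode following Section 5.4."""
    if starts_at_gob_or_picture:
        return "A"
    return "C" if pb_frames else "B"


def _signed7(value):
    """Interpret the low 7 bits of `value` as a 2's complement number."""
    value &= 0x7F
    return value - 128 if value & 0x40 else value


def parse_h263_payload_header(payload):
    """Decode the mode A, B or C header at the start of `payload` (bytes)."""
    (w0,) = struct.unpack_from("!I", payload, 0)
    hdr = {
        "F": w0 >> 31,
        "P": (w0 >> 30) & 0x1,
        "SBIT": (w0 >> 27) & 0x7,
        "EBIT": (w0 >> 24) & 0x7,
        "SRC": (w0 >> 21) & 0x7,
    }
    if hdr["F"] == 0:
        # Mode A: a single 32-bit word.
        hdr.update(
            mode="A", length=4,
            I=(w0 >> 20) & 0x1, U=(w0 >> 19) & 0x1,
            S=(w0 >> 18) & 0x1, A=(w0 >> 17) & 0x1,
            R=(w0 >> 13) & 0xF, DBQ=(w0 >> 11) & 0x3,
            TRB=(w0 >> 8) & 0x7, TR=w0 & 0xFF,
        )
        return hdr
    # Modes B and C share the first two words.
    (w1,) = struct.unpack_from("!I", payload, 4)
    hdr.update(
        mode="B", length=8,
        QUANT=(w0 >> 16) & 0x1F, GOBN=(w0 >> 11) & 0x1F,
        MBA=(w0 >> 2) & 0x1FF, R=w0 & 0x3,
        I=w1 >> 31, U=(w1 >> 30) & 0x1,
        S=(w1 >> 29) & 0x1, A=(w1 >> 28) & 0x1,
        HMV1=_signed7(w1 >> 21), VMV1=_signed7(w1 >> 14),
        HMV2=_signed7(w1 >> 7), VMV2=_signed7(w1),
    )
    if hdr["P"] == 1:
        # Mode C adds a third word: RR (19 bits), DBQ, TRB, TR.
        (w2,) = struct.unpack_from("!I", payload, 8)
        hdr.update(
            mode="C", length=12,
            RR=w2 >> 13, DBQ=(w2 >> 11) & 0x3,
            TRB=(w2 >> 8) & 0x7, TR=w2 & 0xFF,
        )
    return hdr


# Example: a mode B header with QUANT=10, GOBN=4, MBA=2, HMV1=-3, VMV1=5.
word0 = (1 << 31) | (3 << 21) | (10 << 16) | (4 << 11) | (2 << 2)
word1 = (1 << 31) | (0x7D << 21) | (5 << 14)
example = parse_h263_payload_header(struct.pack("!II", word0, word1))
print(example["mode"], example["MBA"], example["HMV1"], example["VMV1"])
```

A receiver would skip `length` bytes after the RTP fixed header to reach the H.263 bitstream, then discard SBIT leading bits of the first data byte and EBIT trailing bits of the last data byte. The sketch does not validate that reserved fields are zero or that the payload is long enough; a real implementation would need both checks.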
470  We strongly recommend use of mode A whenever possible.
471  The major advantage of mode A over modes B and C is its simplicity.
472  The mode A H.263 payload header is smaller than those of modes B and C.
473  Transmission overhead is reduced and the savings may be very significant
474  when working with very low data rates or relatively small packet sizes.

476  Another advantage of mode A is that it simplifies error recovery in the
477  presence of packet loss. The internal state of a decoder can be
478  recovered at GOB boundaries instead of having to synchronize with MBs
479  as in modes B and C. The GOB headers and the picture start code are
480  easy to identify, and their presence will normally cause an H.263
481  decoder to re-synchronize its internal states.

483  Finally, we would like to stress that recovery from packet loss
484  depends on a decoder's ability to use the information provided
485  in the H.263 payload header within RTP packets.

487  6. Limitations

489  The packetization method described in this document applies to the 1996
490  version of H.263. It may not be applicable to bitstreams with
491  features added after that.

493  7. Acknowledgments

495  The author would like to thank the following people for their
496  valuable comments: Linda S. Cline, Christian Maciocco, Mojy
497  Mirashrafi, Phillip Lantz, Steve Casner, Gary Sullivan, and
498  Sassan Pejhan.

500  8. References

502  [1] H. Schulzrinne, S. Casner, R. Frederick, V. Jacobson.
503  RTP : A Transport Protocol for Real-Time Applications, RFC 1889.
504  IETF, 1996.

506  [2] International Telecommunication Union.
507  Video Codec for Audiovisual Services at p x 64 kbits/s,
508  ITU-T Recommendation H.261, 1993.

510  [3] H. Schulzrinne.
511  RTP Profile for Audio and Video Conference with Minimal
512  Control, RFC 1890.
513  IETF, 1996.

515  [4] International Telecommunication Union.
516  Video Coding for Low Bitrate Communication, ITU-T Recommendation
517  H.263, 1996.

519  [5] T. Turletti, C. Huitema.
520  RTP Payload Format for H.261 Video Streams, RFC 2032.
521  IETF, 1996.

523  [6] International Telecommunication Union.
524  Multiplexing Protocol for Low Bitrate Multimedia Communication,
525  ITU-T Recommendation H.223, 1995.

527  9. Author's Address

529  C. "Chad" Zhu
530  Mail Stop: JF3-202
531  Intel Corporation
532  2111 N.E. 25th Avenue
533  Hillsboro, OR 97124
534  USA

536  Email: czhu@ibeam.intel.com
537  Tel: (503) 264-6008
538  Fax: (503) 264-1805