idnits 2.17.1 

draft-ietf-avt-rtp-h263-video-01.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** Cannot find the required boilerplate sections (Copyright, IPR, etc.) in
     this document.

     Expected boilerplate is as follows today (2024-04-26) according to
     https://trustee.ietf.org/license-info :

     IETF Trust Legal Provisions of 28-dec-2009, Section 6.a:
        This Internet-Draft is submitted in full conformance with the provisions
        of BCP 78 and BCP 79.

     IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 2:
        Copyright (c) 2024 IETF Trust and the persons identified as the document
        authors.  All rights reserved.

     IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 3:
        This document is subject to BCP 78 and the IETF Trust's Legal Provisions
        Relating to IETF Documents
        (https://trustee.ietf.org/license-info) in effect on the date of
        publication of this document.  Please review these documents
        carefully, as they describe your rights and restrictions with
        respect to this document.  Code Components extracted from this
        document must include Simplified BSD License text as described in
        Section 4.e of the Trust Legal Provisions and are provided
        without warranty as described in the Simplified BSD License.


  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  ** Missing expiration date.  The document expiration date should appear on
     the first and last page.

  ** The document seems to lack a 1id_guidelines paragraph about
     Internet-Drafts being working documents. 

  ** The document seems to lack a 1id_guidelines paragraph about 6 months
     document validity -- however, there's a paragraph with a matching
     beginning. Boilerplate error?

  ** The document seems to lack a 1id_guidelines paragraph about the list of
     current Internet-Drafts. 

  ** The document seems to lack a 1id_guidelines paragraph about the list of
     Shadow Directories. 

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard

  == The page length should not exceed 58 lines per page, but there was 1
     longer page, the longest (page 1) being 664 lines


  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack an Abstract section.

  ** The document seems to lack an IANA Considerations section.  (See Section
     2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
     when there are no actions for IANA.)

  ** The document seems to lack an Authors' Addresses Section.

  ** The document seems to lack separate sections for Informative/Normative
     References.  All references will be assumed normative when checking for
     downward references.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer. 
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- Couldn't find a document date in the document -- date freshness check
     skipped.


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Unused Reference: '2' is defined on line 626, but no explicit reference
     was found in the text

  == Unused Reference: '6' is defined on line 638, but no explicit reference
     was found in the text

  ** Obsolete normative reference: RFC 1889 (ref. '1') (Obsoleted by RFC 3550)

  -- Possible downref: Non-RFC (?) normative reference: ref. '2'

  ** Obsolete normative reference: RFC 1890 (ref. '3') (Obsoleted by RFC 3551)

  -- Possible downref: Non-RFC (?) normative reference: ref. '4'

  ** Obsolete normative reference: RFC 2032 (ref. '5') (Obsoleted by RFC 4587)

  ** Downref: Normative reference to an Historic RFC: RFC 2190 (ref. '6')

  -- Possible downref: Non-RFC (?) normative reference: ref. '7'

  -- Possible downref: Non-RFC (?) normative reference: ref. '8'


     Summary: 14 errors (**), 0 flaws (~~), 4 warnings (==), 6 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

1	Internet Engineering Task Force                 Audio-Video Transport WG
2	INTERNET-DRAFT                                 C. Bormann / Univ. Bremen
3	                                                        L. Cline / Intel
4	                                                      G. Deisher / Intel
5	                                                       T. Gardos / Intel
6	                                                     C. Maciocco / Intel
7	                                                       D. Newell / Intel
8	                                                   J. Ott / Univ. Bremen
9	                                                G. Sullivan / PictureTel
10	                                                   S. Wenger / TU Berlin
11	                                                          C. Zhu / Intel

13	                                            Date Generated: 14 Jan. 1998

15	               RTP Payload Format for the 1998 Version of
16	                    ITU-T Rec. H.263 Video (H.263+)
17	                 <draft-ietf-avt-rtp-h263-video-01.txt>

19	Status of This Memo

21	This document is an Internet-Draft.  Internet-Drafts are working
22	documents of the Internet Engineering Task Force (IETF), its areas, and
23	its working groups.  Note that other groups may also distribute working
24	documents as Internet-Drafts.

26	Internet-Drafts are draft documents valid for a maximum of six months
27	and may be updated, replaced, or made obsolete by other documents at any
28	time.  It is inappropriate to use Internet-Drafts as reference material
29	or to cite them other than as "work in progress."

31	To learn the current status of any Internet-Draft, please check the
32	"1id-abstracts.txt" listing contained in the Internet-Drafts Shadow
33	Directories on ftp.is.co.za (Africa), nic.nordu.net (Europe),
34	munnari.oz.au (Pacific Rim), ds.internic.net (US East Coast), or
35	ftp.isi.edu (US West Coast).

37	Distribution of this document is unlimited.

39	1. Introduction

41	This document specifies an RTP payload header format applicable to the
42	transmission of video streams generated based on the 1998 version of
43	ITU-T Recommendation H.263 [4].  Because the 1998 version of H.263 is a
44	superset of the 1996 syntax, this format can also be used with the 1996
45	version of H.263.

47	The 1998 version of ITU-T Recommendation H.263 added numerous coding
48	options to improve codec performance over the 1996 version.  The 1998
49	version is referred to as H.263+ in this document.  Among the new
50	options, the ones with the biggest impact on the RTP payload
51	specification and the error resilience of the video content are the
52	slice structured mode, the independent segment decoding mode (ISD), the
53	reference picture selection mode, and the scalability mode.  This
54	section summarizes the impact of these new coding options on
55	packetization.  Refer to [4] for more information on coding options.

57	The slice structured mode was added to H.263+ for three purposes: to
58	provide enhanced error resilience capability, to make the bitstream more
59	amenable to use with an underlying packet transport such as RTP, and to
60	minimize video delay.  The slice structured mode supports fragmentation
61	at macroblock boundaries.

63	With the independent segment decoding option, a video picture frame is
64	broken into segments and encoded in such a way that each segment is
65	independently decodable.  Utilizing ISD in a lossy network environment
66	helps to prevent the propagation of errors from one segment of the
67	picture to others.

69	The reference picture selection mode allows the use of an older
70	reference picture rather than the one immediately preceding the current
71	picture.  Usually, the last transmitted frame is implicitly used as the
72	reference picture for inter-frame prediction.  If the reference picture
73	selection mode is used, the data stream carries information on what
74	reference frame should be used, indicated by the temporal reference as
75	an ID for that reference frame.  The reference picture selection mode
76	can be used with or without a back channel, which provides information
77	to the encoder about the internal status of the decoder.  However, no
78	special provision is made herein for carrying back channel information.

80	H.263+ also includes bitstream scalability as an optional coding mode.
81	Three kinds of scalability are defined: temporal, signal-to-noise ratio
82	(SNR), and spatial scalability.  Temporal scalability is achieved via
83	the disposable nature of bi-directionally predicted frames, or B-frames.
84	SNR scalability permits refinement of encoded video frames, thereby
85	improving the quality (or SNR).  Spatial scalability is similar to SNR
86	scalability except the refinement layer is twice the size of the base
87	layer in the horizontal dimension, vertical dimension, or both.

89	2. Usage of RTP

91	When transmitting H.263+ video streams over the Internet, the output of
92	the encoder can be packetized directly.  All the bits resulting from the
93	bitstream including the fixed length codes and variable length codes
94	will be included in the packet, with the only exception being that when
95	the payload of a packet begins with a Picture, GOB, Slice, EOS, or EOSBS
96	start code, the first two (all-zero) bytes of the start code are removed
97	and replaced by setting an indicator bit in the payload header.

99	For H.263+ bitstreams coded with temporal, spatial, or SNR scalability,
100	each layer may be transported to a different network address.  More
101	specifically, each layer may use a unique IP address and port number
102	combination.  The temporal relations between layers shall be expressed
103	using the RTP timestamp so that they can be synchronized at the
104	receiving ends in multicast or unicast applications.

106	The H.263+ video stream will be carried as payload data within RTP
107	packets.  A new H.263+ payload header is defined in section 4.  This
108	section defines the usage of the RTP fixed header and H.263+ video
109	packet structure.

111	2.1 RTP Header Usage

113	Each RTP packet starts with a fixed RTP header.  The following fields of
114	the RTP fixed header are used for H.263+ video streams:

116	Marker bit (M bit): The RTP header Marker bit is set to 1 when the
117	current packet carries the end of the current frame, and is 0 otherwise.

119	Payload Type (PT): The Payload Type shall specify the H.263+ video
120	payload format.

122	Timestamp: The RTP Timestamp encodes the sampling instant of the first
123	video frame data contained in the RTP data packet.  The RTP timestamp
124	shall be the same on successive packets if a video frame occupies more
125	than one packet.  In a multilayer scenario, all pictures corresponding
126	to the same temporal reference should use the same timestamp.  If
127	temporal scalability is used (if B-frames are present), the timestamp
128	may not be monotonically increasing in the RTP stream.  If B-frames are
129	transmitted on a separate layer and address, they must be synchronized
130	properly with the reference frames.  Refer to the 1998 ITU-T
131	Recommendation H.263 [4] for information on required transmission order
132	to a decoder.  For an H.263+ video stream, the RTP timestamp is based on
133	a 90 kHz clock, the same as that of the RTP payload for H.261 streams
134	[5].  Since both the H.263+ data and the RTP header contain time
135	information, these two sources of timing are required to run
136	synchronously.  That is, both the RTP timestamp and the temporal
137	reference (TR in the picture header of H.263) should carry the same
138	relative timing information.  If necessary, mathematical rounding should
139	be applied to the information of the H.263+ data stream to generate the
140	RTP timestamp (this is especially true for the standard picture clock
141	frequency of 30000/1001 Hz, and may also be true if custom picture clock
142	frequencies are to be used; see [4] for details).
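
As an illustration of the rounding rule above, the following Python
sketch derives a 90 kHz RTP timestamp from a count of picture clock
ticks.  The function name rtp_timestamp, the parameters frame_index and
base, and the example custom clock value are assumptions made for this
sketch; none of them is defined by this specification.

```python
from fractions import Fraction

RTP_CLOCK_HZ = 90000  # 90 kHz RTP clock, as stated in section 2.1


def rtp_timestamp(frame_index, picture_clock_hz=Fraction(30000, 1001), base=0):
    """Map a picture-clock tick count to a 32-bit RTP timestamp.

    frame_index: ticks of the picture clock since the start of the
                 stream (hypothetical input; TR wraparound is assumed to
                 have been resolved by the caller).
    base:        RTP timestamp assigned to tick 0 (hypothetical).
    """
    # Exact rational arithmetic avoids accumulating floating-point error.
    ticks = Fraction(frame_index) * RTP_CLOCK_HZ / Fraction(picture_clock_hz)
    # Round to the nearest RTP tick, then wrap to 32 bits.
    return (base + round(ticks)) % 2**32


# Standard picture clock: one tick of 1001/30000 s.
print(rtp_timestamp(1))
# Hypothetical custom picture clock where rounding actually matters.
print(rtp_timestamp(2, Fraction(1800000, 77077)))
```

Because the computation starts from the absolute tick count rather than
adding a rounded per-frame increment, rounding errors do not accumulate
over the lifetime of the stream.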

144	2.2 Video Packet Structure

146	A section of an H.263+ compressed bitstream is carried as a payload
147	within each RTP packet.  For each RTP packet, the RTP header is followed
148	by an H.263+ payload header, which is followed by a number of bytes of a
149	standard H.263+ compressed bitstream.  The size of the H.263+ payload
150	header is variable depending on the payload involved as detailed in
151	section 4.  The layout of the RTP H.263+ video packet is shown as:

153	 0                   1                   2                   3
154	 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
155	+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
156	|    RTP Header                                               ...
157	+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
158	|    H.263+ Payload Header                                    ...
159	+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
160	|    H.263+ Compressed Data Stream                            ...
161	+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

163	Any H.263+ start codes can be byte aligned by an encoder by using the
164	stuffing mechanisms of H.263+.  As specified in H.263+, picture, slice,
165	and EOSBS start codes shall always be byte aligned, and GOB and EOS
166	start codes may be byte aligned.  For packetization purposes, GOB start
167	codes should be byte aligned, although this is not absolutely required
168	herein since it is not required in H.263+.

170	All H.263+ start codes (Picture, GOB, Slice, EOS, and EOSBS) begin with
171	16 zero-valued bits.  If a start code is byte aligned and it occurs at
172	the beginning of a packet, these two bytes shall be removed from the
173	H.263+ compressed data stream in the packetization process and shall
174	instead be represented by setting a bit (the P bit) in the payload
175	header.
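
The sender-side step described above might be sketched in Python as
follows.  The function name strip_start_code_prefix is hypothetical.
The check that the bit following the 16 zero bits is 1 reflects the
start code layout of H.263 itself and is an assumption of this sketch,
as is the premise that the caller passes a segment cut at a byte
boundary.

```python
def strip_start_code_prefix(segment: bytes):
    """Return (p_bit, payload) for a bitstream segment to be packetized.

    If the segment begins with a byte-aligned start code (two zero
    bytes followed by a 1 bit), the two zero bytes are dropped and the
    P bit is set to 1.  Otherwise the segment is returned unchanged
    with P=0.
    """
    starts_with_code = (
        len(segment) > 2
        and segment[:2] == b"\x00\x00"
        and (segment[2] & 0x80) != 0
    )
    if starts_with_code:
        return 1, segment[2:]
    return 0, segment


# A segment starting at a byte-aligned start code loses its zero bytes.
print(strip_start_code_prefix(b"\x00\x00\x80\x02"))
# A segment that starts mid-stream is left alone.
print(strip_start_code_prefix(b"\x12\x34"))
```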

177	3. Design Considerations

179	The goals of this payload format are to specify an efficient way of
180	encapsulating an H.263+ standard compliant bitstream and to enhance the
181	resiliency towards packet losses.  Due to the large number of different
182	possible coding schemes in H.263+, a copy of the picture header with
183	configuration information is inserted into the payload header when
184	appropriate.  The use of that copy of the picture header along with the
185	payload data can allow decoding of a received packet even in such cases
186	in which another packet containing the original picture header becomes
187	lost.

189	There are a few assumptions and constraints associated with this H.263+
190	payload header design.  The purpose of this section is to point out
191	various design issues and also to discuss several coding options
192	provided by H.263+ that may impact the performance of network-based
193	H.263+ video.

195	o The optional slice structured mode described in annex K of H.263+ [4]
196	  enables more flexibility for packetization.  Similar to a picture
197	  segment that begins with a GOB header, the motion vector predictors in
198	  a slice are restricted to reside within its boundaries.  However,
199	  slices provide much greater freedom in the selection of the size and
200	  shape of the area which is represented as a distinct decodable region.
201	  In particular, slices can have a size which is dynamically selected to
202	  allow the data for each slice to fit into a chosen packet size.
203	  Slices can also be chosen to have a rectangular shape which is
204	  conducive for minimizing the impact of errors and packet losses on
205	  motion compensated prediction.  For these reasons, the use of the
206	  slice structured mode is strongly recommended for any applications
207	  used in environments where significant packet loss occurs.

209	o In non-rectangular slice structured mode, only complete slices should
210	  be included in a packet.  In other words, slices should not be
211	  fragmented across packet boundaries.  The only reasonable need for a
212	  slice to be fragmented across packet boundaries is when the encoder
213	  which generated the H.263+ data stream could not be influenced by an
214	  awareness of the packetization process (such as when sending H.263+
215	  data through a network other than the one to which the encoder is
216	  attached, as in network gateway implementations).  Optimally, each
217	  packet will contain only one slice.

219	o The independent segment decoding (ISD) described in annex R of [4]
220	  prevents any data dependency across slice or GOB boundaries in the
221	  reference picture.  It can be utilized to further improve resiliency
222	  in high loss conditions.

224	o If ISD is used in conjunction with the slice structure, the
225	  rectangular slice submode shall be enabled and the dimensions and
226	  quantity of the slices present in a frame shall remain the same
227	  between each two intra-coded frames (I-frames), as required in H.263+.
228	  The individual ISD segments may also be entirely intra coded from time
229	  to time to realize quick error recovery without adding the latency
230	  time associated with sending complete INTRA-pictures.

232	o When the slice structure is not applied, the insertion of a
233	  (preferably byte-aligned) GOB header can be used to provide resync
234	  boundaries in the bitstream, as the presence of a GOB header
235	  eliminates the dependency of motion vector prediction across GOB
236	  boundaries.  These resync boundaries provide natural locations for
237	  packet payload boundaries.

239	o H.263+ allows picture headers to be sent in an abbreviated form in
240	  order to prevent repetition of overhead information that does not
241	  change from picture to picture.  For resiliency, sending a complete
242	  picture header for every frame is often advisable.  This means that,
243	  especially in cases with high packet loss probability in which picture
244	  header contents are not expected to be highly predictable, the sender
245	  may always set the subfield UFEP in PLUSPTYPE to '001' in the H.263+
246	  video bitstream.

248	o In a multi-layer scenario, each layer may be transmitted to a
249	  different network address.  The configuration of each layer such as
250	  the enhancement layer number (ELNUM), reference layer number (RLNUM),
251	  and scalability type should be determined at the start of the session
252	  and should not change during the course of the session.

254	o All start codes can be byte aligned, and picture, slice, and EOSBS
255	  start codes are always byte aligned.  The boundaries of these
256	  syntactical elements provide ideal locations for placing packet
257	  boundaries.

259	o We assume that a maximum Picture Header size of 504 bits is
260	  sufficient.  The syntax of H.263+ does not explicitly prohibit larger
261	  picture header sizes, but the use of such extremely large picture
262	  headers is not expected.

264	4. H.263+ Payload Header

266	For H.263+ video streams, each RTP packet carries only one H.263+ video
267	packet.  The H.263+ payload header is always present for each H.263+
268	video packet.  The payload header is of variable length.  A 16 bit field
269	of the basic payload header may be followed by an 8 bit field for Video
270	Redundancy Coding information, and/or by a variable length picture
271	header as indicated by PLEN. These optional fields appear in the order
272	given above when present.

274	If a picture header is included in the payload header, the length of the
275	picture header in number of bytes is specified by PLEN.  The minimum
276	length of the payload header is 16 bits, corresponding to PLEN equal to
277	0 and no VRC information present.

279	The remainder of this section defines the various components of the RTP
280	payload header.  Section five defines the various packet types that are
281	used to carry different types of H.263+ coded data, and section six
282	summarizes how to distinguish between the various packet types.

284	4.1 General H.263+ payload header

286	The H.263+ payload header is structured as follows:

288	 0                   1
289	 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
290	+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
291	|  RR     |P|V|  PLEN     |PEBIT|
292	+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

294	RR: 5 bits
295	  Reserved bits.  Shall be zero.

297	P: 1 bit
298	  Indicates the picture start or a picture segment (GOB/Slice) start or
299	  a video sequence end (EOS or EOSBS).  Two bytes of zero bits then have
300	  to be prefixed to the payload of such a packet to compose a complete
301	  picture/GOB/slice/EOS/EOSBS start code.  This bit allows the omission
302	  of the two first bytes of the start codes, thus improving the
303	  compression ratio.

305	V: 1 bit
306	  Indicates the presence of an 8 bit field containing information for
307	  Video Redundancy Coding (VRC), which follows immediately after the
308	  initial 16 bits of the payload header if present.  For syntax and
309	  semantics of that 8 bit VRC field see section 4.2.

311	PLEN: 6 bits
312	  Picture header length in number of bytes.  If no additional picture
313	  header is attached, PLEN is 0.  If PLEN>0, the additional picture
314	  header is attached immediately following the rest of the payload
315	  header.

317	PEBIT: 3 bits
318	  Indicates the number of bits that shall be ignored in the last byte of
319	  the picture header.  If PLEN is zero, then PEBIT shall also be zero.
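
The field layout above can be exercised with a short Python sketch that
packs and unpacks the 16-bit basic payload header.  The function names
and the dictionary returned by unpack_payload_header are illustrative
assumptions; only the bit positions and the PLEN/PEBIT constraint come
from the field definitions in this section.

```python
import struct


def pack_payload_header(p, v, plen, pebit):
    """Build the 16-bit basic payload header (RR is always zero)."""
    if not 0 <= plen <= 63:
        raise ValueError("PLEN is a 6-bit field")
    if not 0 <= pebit <= 7:
        raise ValueError("PEBIT is a 3-bit field")
    if plen == 0 and pebit != 0:
        raise ValueError("PEBIT shall be zero when PLEN is zero")
    # Bit 0 is the most significant bit: RR(5) P(1) V(1) PLEN(6) PEBIT(3)
    word = ((p & 1) << 10) | ((v & 1) << 9) | (plen << 3) | pebit
    return struct.pack("!H", word)


def unpack_payload_header(data):
    """Split the first two bytes of a payload into header fields."""
    (word,) = struct.unpack("!H", data[:2])
    return {
        "rr": word >> 11,
        "p": (word >> 10) & 1,
        "v": (word >> 9) & 1,
        "plen": (word >> 3) & 0x3F,
        "pebit": word & 0x07,
    }


print(pack_payload_header(p=1, v=0, plen=9, pebit=0).hex())
print(unpack_payload_header(b"\x04\x48"))
```

A receiver built along these lines would presumably also check that the
unpacked RR field is zero before trusting the remaining fields.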

321	4.2 Video Redundancy Coding Header Extension

323	Video Redundancy Coding (VRC) is an optional mechanism intended to
324	improve error resilience over packet networks.  Implementing VRC in
325	H.263+ will require the Reference Picture Selection option described in
326	Annex N.  By having multiple "threads" of independently inter-frame
327	predicted pictures, damage to an individual frame will cause distortions
328	only within its own thread but leave the other threads unaffected.  From
329	time to time, all threads converge to a so-called sync frame (an INTRA
330	picture or a non-INTRA picture which is redundantly represented within
331	multiple threads); from this sync frame, the independent threads are
332	started again.  For a more complete description of VRC see [7].

334	While a VRC data stream is - like all H.263+ data - totally self-
335	contained, it may be useful for the transport hierarchy implementation
336	to have knowledge about the current damage status of each thread.  On
337	the Internet, this status can easily be determined by observing the
338	marker bit, the sequence number of the RTP header, and the thread-id and
339	a cyclic "packet per thread" number.  The latter two numbers are coded
340	in the VRC header extension.

342	The format of the VRC header extension is as follows:

344	 0 1 2 3 4 5 6 7
345	+-+-+-+-+-+-+-+-+
346	| TID | Trun  |S|
347	+-+-+-+-+-+-+-+-+

349	TID: 3 bits
350	  Thread ID.  Up to 7 threads are allowed. Each frame of H.263+ VRC data
351	  will use as reference information only sync frames or frames within
352	  the same thread.  By convention, thread 0 is expected to be the
353	  "canonical" thread, which is the thread from which the sync frame
354	  should ideally be used.  In the case of corruption or loss of the
355	  thread 0 representation, a representation of the sync frame with a
356	  higher thread number can be used by the decoder.  Lower thread numbers
357	  are expected to contain equal or better representations of the sync
358	  frames than higher thread numbers in the absence of data corruption or
359	  loss.  See [7] for details.

361	Trun: 4 bits
362	  Monotonically increasing (modulo 16) 4 bit number counting the packet
363	  number within each thread.

365	S: 1 bit
366	  A bit that indicates that the packet content is for a sync frame.  An
367	  encoder using VRC may send several representations of the same "sync"
368	  picture, in order to ensure that regardless of which thread of
369	  pictures is corrupted by errors or packet losses, the reception of at
370	  least one representation of a particular picture is ensured (within at
371	  least one thread).  The sync picture can then be used for the
372	  prediction of any thread.  If packet losses have not occurred, then
373	  the sync frame contents of thread 0 can be used and those of other
374	  threads can be discarded (and similarly for other threads).  Thread 0
375	  is considered the "canonical" thread, the use of which is preferable
376	  to all others.  The contents of packets having lower thread numbers
377	  shall be considered as generally preferred over those with higher
378	  thread numbers.
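
The 8-bit VRC field can be handled with a few shifts and masks.  The
Python sketch below packs and unpacks TID, Trun, and S, and shows one
possible way a receiver could use the modulo-16 Trun counter to estimate
how many packets of a thread were lost.  All function names are
hypothetical, and the loss estimate is an assumption of this sketch: it
cannot distinguish a gap of n packets from a gap of n+16.

```python
def pack_vrc(tid, trun, s):
    """Build the VRC header extension byte: TID(3) Trun(4) S(1)."""
    if not 0 <= tid <= 7:
        raise ValueError("TID is a 3-bit field")
    return bytes([(tid << 5) | ((trun & 0x0F) << 1) | (s & 1)])


def unpack_vrc(byte):
    """Return (tid, trun, s) from the VRC header extension byte."""
    return byte >> 5, (byte >> 1) & 0x0F, byte & 1


def packets_missing(last_trun, received_trun):
    """Estimate packets lost in one thread between two received packets.

    Trun increases by one (modulo 16) per packet within a thread, so the
    expected value after last_trun is (last_trun + 1) % 16.
    """
    expected = (last_trun + 1) % 16
    return (received_trun - expected) % 16


print(pack_vrc(tid=2, trun=15, s=1).hex())
print(unpack_vrc(0x5F))
print(packets_missing(last_trun=15, received_trun=1))
```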

380	5. Packetization schemes

382	5.1 Picture Segment Packets and Sequence Ending Packets (P=1)

384	A picture segment packet is defined as a packet that starts at the
385	location of a Picture, GOB, or slice start code in the H.263+ data
386	stream.  This corresponds to the definition of the start of a video
387	picture segment as defined in H.263+.  For such packets, P=1 always.

389	An extra picture header can sometimes be attached in the payload header
390	of such packets.  Whenever an extra picture header is attached as
391	signified by PLEN>0, only the last six bits of its picture start code,
392	'100000', are included in the payload header.  A complete H.263+ picture
393	header with byte aligned picture start code can be conveniently
394	assembled on the receiving end by prepending the sixteen leading '0'
395	bits.

397	When PLEN>0, the end bit position corresponding to the last byte of the
398	picture header data is indicated by PEBIT.  The actual bitstream data
399	shall begin on an 8-bit byte boundary following the payload header.

401	A sequence ending packet is defined as a packet that starts at the
402	location of an EOS or EOSBS code in the H.263+ data stream.  This
403	delineates the end of a sequence of H.263+ video data (more H.263+ video
404	data may still follow later, however, as specified in ITU-T
405	Recommendation H.263).  For such packets, P=1 and PLEN=0 always.

407	The optional header extension for VRC may or may not be present as
408	indicated by the V bit flag.

410	5.1.1 Packets that begin with a Picture Start Code

412	Any packet that contains the whole or the start of a coded picture shall
413	start at the location of the picture start code (PSC), and should
414	normally be encapsulated with no extra copy of the picture header. In
415	other words, normally PLEN=0 in such a case.   However, if the coded
416	picture contains an incomplete picture header (UFEP = "000"), then a
417	representation of the complete (UFEP = "001") picture header may be
418	attached during packetization in order to provide greater error
419	resilience.  Thus, for packets that start at the location of a picture
420	start code, PLEN shall be zero unless both of the following conditions
421	apply:
422	1) The picture header in the H.263+ bitstream payload is incomplete
423	   (PLUSPTYPE present and UFEP="000"), and
424	2) The additional picture header which is attached is not incomplete
425	   (UFEP="001").

427	A packet which begins at the location of a Picture, GOB, slice, EOS, or
428	EOSBS start code shall omit the first two (all zero) bytes from the
429	H.263+ bitstream, and signify their presence by setting P=1 in the
430	payload header.

432	Here is an example of encapsulating the first packet in a frame (without
433	an attached redundant complete picture header):

435	 0                   1                   2                   3
436	 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
437	+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
438	|  RR     |1|V|0|0|0|0|0|0|0|0|0|
439	+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-------------------------------+
440	| bitstream data without the first two 0 bytes of the PSC       |
441	+---------------------------------------------------------------+

443	5.1.2 Packets that begin with GBSC or SSC

445	For a packet that begins at the location of a GOB or slice start code,
446	PLEN may be zero or may be nonzero, depending on whether a redundant
447	picture header is attached to the packet.  In environments with very low
448	packet loss rates, or when picture header contents are very seldom
449	likely to change (except as can be detected from the GFID syntax of
450	H.263+), a redundant copy of the picture header is not required.
451	However, in less ideal circumstances a redundant picture header should
452	be attached for enhanced error resilience, and its presence is indicated
453	by PLEN>0.

455	Assuming a PLEN of 9, below is an example of a packet that begins with a
456	GBSC or an SSC:

458	 0                   1                   2                   3
459	 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
460	+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
461	|  RR     |1|V|0 0 1 0 0 1|PEBIT|
462	+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

464	+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
465	|1 0 0 0 0 0| picture header starting with TR, PTYPE, ...       |
466	+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
467	| ...                                                           |
468	+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
469	| ...           | bitstream data begins with GBSC/SSC ...       .
470	+-+-+-+-+-+-+-+-+-----------------------------------------------+

472	Notice that only the last six bits of the picture start code, '100000',
473	are included in the payload header.  A complete H.263+ picture header
474	with byte aligned picture start code can be conveniently assembled if
475	needed on the receiving end by prepending the sixteen leading '0' bits.
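
The receiver-side reassembly described above might look like the
following Python sketch.  It splits an RTP payload into the optional VRC
byte, the optional redundant picture header, and the bitstream data, and
restores the omitted zero bytes in both places.  The function name
depacketize and the keys of the returned dictionary are assumptions made
for this illustration.

```python
def depacketize(payload: bytes):
    """Split an H.263+ RTP payload and restore omitted start code bytes."""
    if len(payload) < 2:
        raise ValueError("payload shorter than the basic payload header")
    word = int.from_bytes(payload[:2], "big")
    p = (word >> 10) & 1
    v = (word >> 9) & 1
    plen = (word >> 3) & 0x3F
    pebit = word & 0x07
    offset = 2

    vrc = None
    if v:  # 8-bit VRC field follows the first 16 bits
        if len(payload) < offset + 1:
            raise ValueError("truncated VRC field")
        vrc = payload[offset]
        offset += 1

    picture_header = None
    header_bits = 0
    if plen:  # redundant picture header, PLEN bytes long
        if len(payload) < offset + plen:
            raise ValueError("truncated picture header")
        # Prepend the sixteen leading zero bits of the picture start code.
        picture_header = b"\x00\x00" + payload[offset:offset + plen]
        # The last PEBIT bits of the final byte are to be ignored.
        header_bits = 16 + plen * 8 - pebit
        offset += plen

    data = payload[offset:]
    if p:  # restore the two zero bytes removed by the sender
        data = b"\x00\x00" + data
    return {"vrc": vrc, "picture_header": picture_header,
            "picture_header_bits": header_bits, "data": data}


packet = b"\x04\x48" + b"\x80" + b"\x01" * 8 + b"\x84\x00"
print(depacketize(packet))
```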

477	5.1.3 Packets that Begin with an EOS or EOSBS Code

479	For a packet that begins with an EOS or EOSBS code, PLEN shall be zero,
480	and no Picture, GOB, or Slice start codes shall be included within the
481	same packet.  As with other packets beginning with start codes, the two
482	all-zero bytes that begin the EOS or EOSBS code at the beginning of the
483	packet shall be omitted, and their presence shall be indicated by
484	setting the P bit to 1 in the payload header.

486	System designers should be aware that some decoders may interpret the
487	loss of a packet containing only EOS or EOSBS information as the loss of
488	essential video data and may thus respond by not displaying some
489	subsequent video information.  Since EOS and EOSBS codes do not actually
490	affect the decoding of video pictures, sending them is not strictly
491	necessary.  Because of the danger of misinterpretation of the loss of
492	such a packet, encoders are generally discouraged from sending EOS and
493	EOSBS.

495	Below is an example of a packet containing an EOS code:

497	 0                   1
498	 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
499	+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
500	|  RR     |1|V|0|0|0|0|0|0|0|0|0|
501	+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
502	|1|1|1|1|1|1|0|0|
503	+-+-+-+-+-+-+-+-+
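
The packet shown above can be built and parsed along the following
lines.  This is a sketch under assumptions: the field widths (RR 5
bits, P 1 bit, V 1 bit, PLEN 6 bits, PEBIT 3 bits) are read off the
diagrams in this section, RR and V are assumed to be zero, and the
function names are hypothetical.

```python
import struct


def pack_payload_header(rr=0, p=0, v=0, plen=0, pebit=0):
    """Pack RR(5) | P(1) | V(1) | PLEN(6) | PEBIT(3) into two bytes."""
    word = (((rr & 0x1F) << 11) | ((p & 1) << 10) | ((v & 1) << 9)
            | ((plen & 0x3F) << 3) | (pebit & 0x07))
    return struct.pack("!H", word)


def unpack_payload_header(data):
    """Inverse of pack_payload_header; returns a dict of field values."""
    (word,) = struct.unpack("!H", data[:2])
    return {
        "rr": word >> 11,
        "p": (word >> 10) & 1,
        "v": (word >> 9) & 1,
        "plen": (word >> 3) & 0x3F,
        "pebit": word & 0x07,
    }


# EOS code after the two zero bytes are dropped: '111111' plus two
# padding bits, as in the diagram above.  P=1 signals the dropped bytes.
eos_packet = pack_payload_header(p=1) + bytes([0b11111100])
fields = unpack_payload_header(eos_packet)
```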

505	5.2 Encapsulating Follow-On Packet (P=0)

507	A Follow-on packet contains a number of bytes of coded H.263+ data that
508	do not start at a synchronization point.  That is, a Follow-on packet
509	does not start with a Picture, GOB, Slice, EOS, or EOSBS header, and it
510	may or may not start at a macroblock boundary.  Since Follow-on packets
511	do not start at synchronization points, the data at the beginning of a
512	Follow-on packet is not independently decodable.  For such packets, P=0
513	always.  If the packet preceding a Follow-on packet is lost, the
514	receiver may discard that Follow-on packet as well as all immediately
515	following Follow-on packets.  Better behavior, of course, is for the
516	receiver to scan the interior of the packet payload to determine
517	whether it contains any start codes that can be used as resync
518	points.  An attached copy of a picture header is useful in a Follow-on
519	packet only if the interior of that packet or of some subsequent
520	Follow-on packet contains a resync code such as a GOB or Slice start
521	code.  PLEN>0 is allowed, since it may allow resync in the interior of
522	the packet.  The decoder may also be resynchronized at the next segment
523	or picture packet.

525	Here is an example of a Follow-on packet (with PLEN=0):

527	 0                   1                   2                   3
528	 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
529	+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
530	|  RR     |0|V|0|0|0|0|0|0|0|0|0|
531	+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-------------------------------+
532	| bitstream data                                                |
533	+---------------------------------------------------------------+
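
A receiver that scans the interior of a Follow-on packet for resync
points, as suggested above, might proceed roughly as follows.  This is
a sketch only: it assumes that every start code of interest begins with
sixteen '0' bits followed by a '1' bit and that it need not be byte
aligned.  The function name find_start_codes is hypothetical, and a
real decoder would go on to validate the bits that follow each
candidate.

```python
from typing import List


def find_start_codes(payload: bytes) -> List[int]:
    """Return bit offsets at which sixteen '0' bits precede a '1' bit.

    Each offset points at the first of the sixteen zero bits directly
    in front of the '1'; any additional leading zeros are treated as
    stuffing and are not counted as part of the start code.
    """
    offsets = []
    zeros = 0  # length of the current run of consecutive '0' bits
    for i, byte in enumerate(payload):
        for bit in range(8):
            if (byte >> (7 - bit)) & 1:
                if zeros >= 16:
                    offsets.append(i * 8 + bit - 16)
                zeros = 0
            else:
                zeros += 1
    return offsets


# Byte-aligned candidate: 0xAB, then 16 zero bits, then '100001'.
aligned = find_start_codes(b"\xab\x00\x00\x84\x00")
# Candidate shifted by one bit, with extra zero stuffing in front.
shifted = find_start_codes(b"\xa8\x00\x00\x40")
```

If no candidate is found, the receiver falls back to waiting for the
next packet with P=1.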

535	6. Use of this payload specification

537	There is no syntactical difference between a picture segment packet and
538	a Follow-on packet, other than the indication P=1 for picture segment or
539	sequence ending packets and P=0 for Follow-on packets.  The remainder
540	of this section summarizes the packet types and the ways to distinguish
541	between them.

543	For a more detailed discussion on how to use the payload specification,
544	the reader should refer to [8].

546	It is possible to distinguish between the different packet types by
547	checking the P bit and the first 6 bits of the payload along with the
548	header information.  The following table shows the packet type for
549	permutations of this information (see also the picture/GOB/Slice header
550	descriptions in H.263+ for details):

552	--------------+--------------+----------------------+-------------------
553	 First 6 bits | P-Bit | PLEN |  Packet              |  Remarks
554	 of Payload   |(payload hdr.)|                      |
555	--------------+--------------+----------------------+-------------------
556	 100000       |   1   |  0   |  Picture             |  Typical Picture
557	 100000       |   1   | > 0  |  Picture             |  Note UFEP
558	 1xxxxx       |   1   |  0   |  GOB/Slice/EOS/EOSBS |  See possible GNs
559	 1xxxxx       |   1   | > 0  |  GOB/Slice           |  See possible GNs
560	 xxxxxx       |   0   |  0   |  Follow-on           |
561	 xxxxxx       |   0   | > 0  |  Follow-on           |  Interior Resync
562	--------------+--------------+----------------------+-------------------

564	See [4] for details regarding the possible values of the six bits (a "1"
565	bit followed by a five-bit GN field, either explicit or emulated) of the
566	GOB, Slice, EOS, and EOSBS codes.
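
The table above can be expressed as a small classification routine.
The sketch below is illustrative only: the function name classify and
the returned strings are hypothetical, and the caller is assumed to
have already extracted the P bit, PLEN, and the first six bits of the
bitstream data.  Combinations that do not appear in the table are
reported as "Unknown".

```python
def classify(first_six: int, p_bit: int, plen: int) -> str:
    """Map (first 6 bits, P bit, PLEN) to a packet type per the table."""
    if p_bit == 0:
        # Follow-on packets may carry any bits; PLEN>0 only signals
        # that an interior resync point may be usable.
        return "Follow-on"
    if first_six == 0b100000:
        return "Picture"
    if first_six & 0b100000:
        # Leading '1' followed by a five-bit GN field.
        return "GOB/Slice" if plen > 0 else "GOB/Slice/EOS/EOSBS"
    return "Unknown"


kinds = [
    classify(0b100000, 1, 0),
    classify(0b100000, 1, 7),
    classify(0b111111, 1, 0),
    classify(0b100001, 1, 7),
    classify(0b010101, 0, 0),
    classify(0b010101, 0, 7),
]
```

Telling GOB, Slice, EOS, and EOSBS codes apart requires inspecting the
GN value itself, which is left to the H.263+ syntax in [4].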

568	As defined in this specification, every start of a coded frame (as
569	indicated by the presence of a PSC) has to be encapsulated as a picture
570	segment packet.  If the whole coded picture fits into one packet of
571	reasonable size (which is dependent on the connection characteristics),
572	this is the only type of packet that needs to be observed.  Due to the
573	high compression ratio achieved by H.263+ it is often possible to use
574	this mechanism, especially for small spatial picture formats such as
575	QCIF and typical Internet packet sizes around 1500 bytes.

577	If the complete coded frame does not fit into a single packet, one of
578	two packetization methods may be chosen.  When the packet loss
579	probability is very low or zero, one or more Follow-on packets may be
580	used to carry the rest of the picture.  Doing so leads to minimal
581	coding and packetization overhead as well as to an optimal use of the
582	maximal packet size, but does not provide any added error resilience.

584	The alternative is to break the picture into reasonably small
585	partitions, called Segments (by using the Slice or GOB mechanism), that
586	do offer synchronization points.  By doing so and using the Picture
587	Segment payload with PLEN>0, the transmitted packets can be decoded
588	even when the Picture packet containing the picture header was lost
589	(provided any necessary reference picture is available).  Picture
590	Segment packets can also be used in conjunction with Follow-on packets
591	for large segment sizes.
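
The first of the two approaches (a picture packet followed by Follow-on
packets) might be sketched as follows.  The sketch rests on several
assumptions: the coded frame starts with a byte-aligned picture start
code, RR, V, PLEN, and PEBIT are all zero, and max_payload counts
bitstream bytes only, excluding the two-byte payload header.  The
function name packetize_frame is hypothetical, and RTP header
construction is omitted.

```python
import struct
from typing import List


def packetize_frame(frame: bytes, max_payload: int) -> List[bytes]:
    """Split one coded frame into a P=1 packet plus P=0 Follow-ons."""
    if frame[:2] != b"\x00\x00":
        raise ValueError("frame must begin with a byte-aligned start code")
    body = frame[2:]  # the two zero bytes are signalled by P=1 instead
    packets = []
    p_bit = 1
    for off in range(0, len(body), max_payload):
        # Payload header with only the P bit possibly set (bit 10 of
        # the 16-bit word); every other field is zero in this sketch.
        header = struct.pack("!H", p_bit << 10)
        packets.append(header + body[off:off + max_payload])
        p_bit = 0  # every later packet of the frame is a Follow-on
    return packets


# Illustrative frame: PSC tail '100000..' followed by filler bytes.
frame = b"\x00\x00\x80" + bytes(range(10))
packets = packetize_frame(frame, max_payload=4)

# A receiver can undo the split by restoring the two zero bytes.
rebuilt = b"\x00\x00" + b"".join(pkt[2:] for pkt in packets)
```

Splitting at Slice or GOB boundaries instead, with each segment sent as
its own P=1 packet, trades some of this efficiency for the error
resilience described above.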

593	7. Security Considerations

595	RTP packets using the payload format defined in this specification are
596	subject to the security considerations discussed in the RTP
597	specification [1], and any appropriate RTP profile (for example [3]).
598	This implies that confidentiality of the media streams is achieved by
599	encryption.  Because the data compression used with this payload format
600	is applied end-to-end, encryption may be performed after compression so
601	there is no conflict between the two operations.

603	A potential denial-of-service threat exists for data encodings using
604	compression techniques that have non-uniform receiver-end computational
605	load.  The attacker can inject pathological datagrams into the stream
606	which are complex to decode and cause the receiver to be overloaded.
607	However, this encoding does not exhibit any significant non-uniformity.

609	As with any IP-based protocol, in some circumstances a receiver may be
610	overloaded simply by the receipt of too many packets, either desired or
611	undesired.  Network-layer authentication may be used to discard packets
612	from undesired sources, but the processing cost of the authentication
613	itself may be too high.  In a multicast environment, pruning of specific
614	sources may be implemented in future versions of IGMP and in
615	multicast routing protocols to allow a receiver to select which sources
616	are allowed to reach it.

618	A security review of this payload format found no additional
619	considerations beyond those in the RTP specification.

621	8. References

623	[1] H. Schulzrinne, S. Casner, R. Frederick, V. Jacobson, "RTP: A
624	    Transport Protocol for Real-Time Applications", RFC 1889.

626	[2] "Video Codec for Audiovisual Services at px64 kbits/s", ITU-T
627	    Recommendation H.261, 1993.

629	[3] H. Schulzrinne, "RTP Profile for Audio and Video Conferences with
630	    Minimal Control", RFC 1890.

632	[4] "Video Coding for Low Bitrate Communication", Draft ITU-T
633	    Recommendation H.263, Draft 20, September 1997.

635	[5] T. Turletti, C. Huitema, "RTP Payload Format for H.261 Video
636	    Streams", RFC 2032.

638	[6] C. Zhu, "RTP Payload Format for H.263 Video Streams", RFC 2190.

640	[7] S. Wenger, "Video Redundancy Coding in H.263+", Proc. AVSPN97,
641	    Aberdeen, U.K.

643	[8] S. Wenger, G. Knorr, J. Ott, "Error resilience support in H.263
644	    V.2", submitted for publication to IEEE T-CSVT, 1997.