idnits 2.17.1 

draft-ietf-avt-uxp-05.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** Looks like you're using RFC 2026 boilerplate.  This must be updated to
     follow RFC 3978/3979, as updated by RFC 4748.


  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  ** The document seems to lack a 1id_guidelines paragraph about
     Internet-Drafts being working documents. 

  ** The document seems to lack a 1id_guidelines paragraph about 6 months
     document validity. 

  ** The document seems to lack a 1id_guidelines paragraph about the list of
     current Internet-Drafts. 

  ** The document seems to lack a 1id_guidelines paragraph about the list of
     Shadow Directories. 

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard


  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack an IANA Considerations section.  (See Section
     2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
     when there are no actions for IANA.)

  ** The document seems to lack separate sections for Informative/Normative
     References.  All references will be assumed normative when checking for
     downward references.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == Line 154 has weird spacing: '...ressive  media...'

  == Line 177 has weird spacing: '... of the   prog...'

  == Line 184 has weird spacing: '...tes the  bitst...'

  == Line 731 has weird spacing: '...ed into  one s...'

  == Line 897 has weird spacing: '...=10. We  reser...'

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer. 
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- Couldn't find a document date in the document -- date freshness check
     skipped.


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  ** Obsolete normative reference: RFC 2733 (ref. '1') (Obsoleted by RFC 5109)

  -- Possible downref: Non-RFC (?) normative reference: ref. '2'

  -- Possible downref: Non-RFC (?) normative reference: ref. '3'

  -- Possible downref: Non-RFC (?) normative reference: ref. '4'

  -- Possible downref: Non-RFC (?) normative reference: ref. '5'

  -- Possible downref: Non-RFC (?) normative reference: ref. '6'

  -- Possible downref: Non-RFC (?) normative reference: ref. '7'

  -- Possible downref: Non-RFC (?) normative reference: ref. '8'

  -- Possible downref: Non-RFC (?) normative reference: ref. '9'


     Summary: 8 errors (**), 0 flaws (~~), 6 warnings (==), 10 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

1	 Internet Engineering Task Force                             G. Liebl
2	 Internet Draft                                  LNT, Munich Univ. of
3	                                                            Technology
4	 Document: draft-ietf-avt-uxp-05.txt
5	 March 2003                                     M. Wagner, J. Pandel,
6	                                                               W. Weng
7	 Expires: September  2003                          Siemens AG, Munich

9	      An RTP Payload Format for Erasure-Resilient Transmission of
10	                     Progressive Multimedia Streams
11	 Status of this Memo
12	    This document is an Internet-Draft and is in full conformance
13	       with all provisions of Section 10 of RFC2026.
14	    Internet-Drafts are working documents of the Internet Engineering
15	    Task Force (IETF), its areas, and its working groups. Note that
16	    other groups may also distribute working documents as Internet-
17	    Drafts. Internet-Drafts are draft documents valid for a maximum
18	    of six months and may be updated, replaced, or obsoleted by other
19	    documents at any time. It is inappropriate to use Internet-
20	    Drafts as reference material or to cite them other than as "work
21	    in progress."
22	    The list of current Internet-Drafts can be accessed at
23	    http://www.ietf.org/ietf/1id-abstracts.txt
24	    The list of Internet-Draft Shadow Directories can be accessed at
25	    http://www.ietf.org/shadow.html.

27	 Abstract
28	    This document specifies an efficient way to ensure erasure-
29	    resilient transmission of progressively encoded multimedia
30	    sources via RTP using Reed-Solomon (RS) codes together with
31	    interleaving. The level of erasure protection can be explicitly
32	    adapted to the importance of the respective parts in the source
33	    stream, thus allowing a graceful degradation of application
34	    quality with increasing packet loss rate on the network. Hence,
35	    this type of unequal erasure protection (UXP) scheme is intended
36	    to cope with the rapidly varying channel conditions on wireless
37	    access links to the Internet backbone. Furthermore, protection of
38	    non-progressive multimedia streams is ensured, since equal
39	    erasure protection (EXP) represents a subset of generic UXP. By
40	    applying interleaving and RS codes, a payload format is defined,
41	    which can be easily integrated into the existing framework for
42	    RTP.

44	 Table of Contents

46	    1. Introduction.................................................2
47	    2. Conventions used in this Document............................3
48	    3. Reed-Solomon Codes...........................................6
49	    4. Progressive Source Coding....................................7
50	    5. General Structure of UXP Schemes.............................8

52	 Liebl,Wagner,Pandel,Weng                                    [Page1]
53	    6. RTP payload structure.......................................13
54	    7. Indication of UXP in SDP....................................20
55	    8. Security Considerations.....................................21
56	    9. Application Statement.......................................21
57	    10. Intellectual Property Considerations.......................23
58	    11. References.................................................23
59	    12. Acknowledgments............................................24
60	    13. Author's Addresses.........................................24

62	 1. Introduction

64	    Due to the increasing popularity of high-quality multimedia
65	    applications over the Internet and the high level of public
66	    acceptance of existing mobile communication systems, there is a
67	    strong demand for a future combination of these two techniques:
68	    One possible scenario consists of an integrated communication
69	    environment, where users can set up multimedia connections
70	    anytime and anywhere via radio access links to the Internet.
71	    For this reason, several packet-oriented transmission modes like
72	    EGPRS (Enhanced General Packet Radio Service) or UMTS (Universal
73	    Mobile Telecommunications System) can be used, which are mostly
74	    based on the same principle: Long message blocks, i.e. IP
75	    packets, that enter the wireless part of the network are split up
76	    into segments of desired length, which can be multiplexed onto
77	    link layer packets of fixed size. The latter are then transmitted
78	    sequentially over the wireless link, reassembled, and passed on
79	    to the next network element.
80	    However, compared to the rather benign channel characteristics on
81	    today's fixed networks, wireless links suffer from severe fading,
82	    noise, and interference conditions in general, thus resulting in
83	    a comparably high residual bit error rate after detection and
84	    decoding. By use of efficient CRC-mechanisms, these bit errors
85	    are usually detected with very high probability, and every
86	    corrupted segment, i.e. one that contains at least one erroneous
87	    bit, is discarded to prevent error propagation through the
88	    network. But if even a single segment is missing at the
89	    reassembly stage, the upper layer IP packet cannot be
90	    reconstructed anymore. The result is a significant increase in
91	    packet loss rate at IP level.
92	    Since most multimedia applications can only recover from a very
93	    limited number of lost IP packets, it is vitally necessary to
94	    keep packet loss at IP level within a certain acceptable range
95	    depending on the individual quality-of-service requirements.
96	    However, due to the delay constraints typically imposed by most
97	    audio or video codecs, the use of ARQ-schemes is often prohibited
98	    both at link level and at transport level. In addition,
99	    retransmission strategies cannot be applied to any broadcast or
100	    multicast scenarios. Thus, forward erasure correction strategies
101	    have to be considered, which provide a simple means to
102	    reconstruct the content of lost packets at the receiver from the
103	    redundancy that has been spread out over a certain number of
104	    consecutive packets.
105	    There already exist some previous studies and proposals regarding
106	    erasure-resilient packet transmission [1,8]. Since most of them
107	    are based on the assumption that all parts in a message block are
108	    equally important to the receiver, i.e. the respective
109	    application cannot operate on partly complete blocks, they were
110	    optimized with respect to assigning equal erasure protection over
111	    the whole message block. However, recent developments both in
112	    audio and video coding have introduced the notion of
113	    progressively encoded media streams, for which unequal erasure
114	    protection strategies seem to be more promising, as will be
115	    explained in more detail below. Although the scheme defined in
116	    [1] is in principle capable of supporting some kind of unequal
117	    erasure protection, possible implementations seem to be quite
118	    complex with respect to the gain in performance. Finally, in [1]
119	    it is assumed that consecutive RTP packets can have variable
120	    length, which would cause significant segmentation overhead at
121	    the link layer of almost all wireless systems.
122	    This document defines a payload format for RTP, such that
123	    different elements in a progressively encoded multimedia stream
124	    can be protected against packet erasures according to their
125	    respective quality-of-service requirement. The general principle,
126	    including the use of Reed-Solomon codes together with an
127	    appropriate interleaving scheme for adding redundancy, follows
128	    the ideas already presented in [2], but allows for finer
129	    granularity in the structure of the progressive media stream. The
130	    proposed scheme is generic in the way that it (1) is independent
131	    of the type of media stream, be it audio or video, and (2) can be
132	    adapted to varying transmission quality very quickly by use of
133	    inband-signaling.

135	 2. Conventions used in this Document

137	    The following terms are used throughout this document:
138	    1.)  Segment: denotes a link layer transport unit.
139	    2.)  Segmentation/Reassembly Process: If the size of the
140	         transport units at the link layer is smaller than that at
141	         the upper layers, message blocks have to be split up into
142	         several parts, i.e. segments, which are then transmitted
143	         successively over the link. If nothing is lost, the original
144	         message block can be restored at the receiving entity
145	         (reassembly).
146	    3.)  Codec: denotes a functional pair consisting of a source
147	         encoding unit at the sender and a corresponding source
148	         decoding unit at the receiver; usually standardized for
149	         different media applications like audio or video.

151	    4.)  Media stream: A bitstream which results at the output of an
152	         encoder for a specific media type, e.g. H.263, MPEG-4
153	         Visual.
154	    5.)  Progressive media stream: A media stream which can be
155	         divided into successive elements. The distinct elements are
156	         of different importance to the decoding process and are
157	         commonly ordered from highest to least importance, where the
158	         latter elements depend on the previous.
159	    6.)  Progressive source coding: results in a progressive media
160	         stream.
161	    7.)  Reed-Solomon (RS) code: belongs to the class of linear
162	         nonbinary block codes, and is uniquely specified by the
163	         block length n, the number of parity symbols t, and the
164	         symbol alphabet.
165	    8.)  n: is a variable, which denotes both the block length of an
166	         RS codeword, and the number of columns in a TB (see 16).
167	    9.)  k: is a variable, which denotes the number of information
168	         symbols in an RS codeword.
169	    10.) t: is a variable, which denotes the number of parity symbols
170	         in an RS codeword.
171	    11.) Erasure: When a packet is lost during transmission, an
172	         erasure is said to have happened. Since the position of the
173	         erased packet in a sequence is usually known, a
174	         corresponding erasure marker can be set at the receiving
175	         entity.
176	    12.) Base layer: comprises the first and most important elements
177	         of the progressive media stream, without which all
178	         subsequent information is useless.
179	    13.) Enhancement layer: comprises one or more sets of the less
180	         important subsequent elements of the progressive media
181	         stream. A specific enhancement layer can be decoded, if and
182	         only if the base layer and all previous enhancement layer
183	         data (of higher importance) are available.
184	    14.) Info stream: denotes the bitstream which has to be
185	         protected by the UXP scheme. It usually consists of the
186	         media stream (progressively source encoded or not), which is
187	         arranged according to a desired syntax (e.g. to achieve an
188	         appropriate framing, see Sect. 6.3). In any case, it is
189	         assumed that every info stream is already octet-aligned
190	         according to the standard procedures defined in the context
191	         of the used syntax specifications.
192	    15.) Info octet: Denotes one element of the info stream.
193	    16.) Transmission block (TB): denotes a memory array of L rows
194	         and n columns. Each row of a TB represents an RS codeword,
195	         whereas each column, together with the respective UXP header
196	         (see 33) in front, forms the payload of a single RTP packet.
197	         Each TB consists of at least two distinct transmission sub
198	         blocks (TSB, see 17): The first L_s rows belong to the
199	         signaling TSB, whereas the last L_d=(L-L_s) rows belong to
200	         one or more data TSB.
201	    17.) Transmission sub block (TSB): denotes a memory array of
202	         0<l<L rows and n columns, which is a horizontal slice of a
203	         TB. Depending on whether the info octet positions are filled
204	         with descriptors (see 28) or media data, the TSB is of type
205	         signaling or data, respectively.
206	    18.) L: is a variable, which denotes both the number of rows in a
207	         TB and the payload length (without UXP header, see 33) of an
208	         RTP packet in octets.
209	    19.) Unequal erasure protection (UXP): denotes a specific
210	         strategy which varies the level of erasure protection across
211	         a TB according to a given redundancy profile.
212	    20.) Equal erasure protection (EXP): is a subset of UXP, for
213	         which the level of erasure protection is kept constant
214	         across a TB.
215	    21.) Redundancy profile: describes the size of the different
216	         erasure protection classes in a TB, i.e. the number of rows
217	         (codewords) per class.
218	    22.) Erasure protection class: contains a set of rows (codewords)
219	         of the TB with same erasure correction capability.
220	    23.) i: is a variable, which denotes the number of parity
221	         symbols for each row in erasure protection class i.

223	    24.) EPC_i: is a variable, which denotes the set of rows
224	         contained in erasure protection class i.
225	    25.) R_i: is a variable, which denotes the total number of rows
226	         contained in erasure protection class i, i.e. the
227	         cardinality of EPC_i.
228	    26.) T: is a variable, which denotes the number of parity
229	         symbols for each row in the highest erasure protection class
230	         (with respect to application data) in a TB.
231	    27.) EPV: denotes the erasure protection vector of length (T+1)
232	         used to describe a certain redundancy profile.
233	    28.) DP: descriptor used for in-band signaling of the erasure
234	         protection vector.
235	    29.) SI: stuffing indicator, which contains the number of media
236	         stuffing symbols at the end of a data TSB (see 31).
237	    30.) Descriptor Stuffing: insertion of otherwise unused
238	         descriptor values (i.e. 0x00) at the end of the signaling
239	         TSB. Descriptor stuffing is performed, if the final sequence
240	         of descriptors and stuffing indicators for a valid
241	         redundancy profile is shorter than the space initially
242	         reserved for it in the signaling TSB.
243	    31.) Media Stuffing: insertion of additional symbols at the end
244	         of a data TSB. Media stuffing is performed, if the info
245	         stream (see 14) is shorter than the space reserved for it in
246	         the data TSB for a desired redundancy profile. Since the
247	         number of stuffing symbols is signaled in the respective SI,
248	         any octet value may be used (e.g. 0x00).
249	    32.) Interleaver: performs the spreading of a codeword, i.e. a
250	         row in the TB, over n successive packets, such that the
251	         probability of an erasure burst in a codeword is kept small.
252	    33.) UXP header: is the additional header information contained
253	         in each RTP packet after UXP has been applied. It is always
254	         present at the start of the payload section of an RTP
255	         packet.
256	    34.) X: denotes a currently not used extension field of 1 bit in
257	         the UXP header.
258	    35.) P: is a variable which denotes the number of parity symbols
259	         per row used to protect the inband signaling of the
260	         redundancy profile.
261	    36.) ceil(.): denotes the ceiling function, i.e. rounding up to
262	         the next integer.

264	    The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
265	    NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and
266	    "OPTIONAL" in this document are to be interpreted as described in
267	    RFC-2119.

269	 3. Reed-Solomon Codes

271	    Reed-Solomon (RS) codes are a special class of linear nonbinary
272	    block codes, which are known to offer maximum erasure correction
273	    capability with minimum amount of redundancy.
274	    An arbitrary t-erasure-correcting (n,k) RS code defined over
275	    Galois field GF(q) has the following parameters [3]:
276	    - Block length:                                      n=q-1
277	    - No. of information symbols in a codeword:          k
278	    - No. of parity-check symbols in a codeword:         n-k=t
279	    - Minimum distance:                                  d=t+1

281	    In what follows, only systematic RS codes over GF(2^8) shall be
282	    considered, i.e. the symbols of interest can be directly related
283	    to a tuple of eight bits, which is commonly called an octet in
284	    packet transmission. The principal structure of a codeword is
285	    shown in Fig. 1.
286	    By shortening the initial (n=255,n-t) RS code, any desired
287	    (n',n'-t) RS code for a given erasure correction capability t may
288	    be obtained.

290	      block of n octets
291	    <----------------->
292	    +-+-+-+-+-+-+-+-+-+
293	    |&|&|&|&|&|&|&|*|*|
294	    +-+-+-+-+-+-+-+-+-+
295	    <------------><--->
296	        k=n-t       t
297	      (&:info)     (*:parity)

299	    Fig. 1: Structure of a systematic RS codeword
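
The parameter relations above can be checked with a few lines of Python. This is only an illustrative sketch: the function name and the returned field names are hypothetical and not part of this specification, and no actual encoding is performed. It assumes the (255,255-t) mother code over GF(2^8) described above and uses the values drawn in Fig. 1 (n=9, t=2).

```python
def shortened_rs_params(n_short: int, t: int) -> dict:
    """Parameters of an (n', n'-t) RS code obtained by shortening the
    (255, 255-t) code over GF(2^8). Hypothetical helper for illustration."""
    if not 1 <= n_short <= 255:
        raise ValueError("shortened block length must satisfy 1 <= n' <= 255")
    if not 0 <= t < n_short:
        raise ValueError("need 0 <= t < n' so that at least one info octet remains")
    return {
        "n": n_short,                  # block length n'
        "k": n_short - t,              # information octets per codeword
        "t": t,                        # parity octets = correctable erasures
        "min_distance": t + 1,         # d = t + 1
        "shortened_by": 255 - n_short  # info positions removed from the mother code
    }


# The codeword of Fig. 1: nine octets, two of them parity.
params = shortened_rs_params(9, 2)
print(params)
```

Shortening leaves t unchanged, so the erasure correction capability of a row depends only on the number of parity octets it carries, not on the chosen block length.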

301	 4. Progressive Source Coding

303	    The output of an encoder for a specific media type, e.g. H.263 or
304	    MPEG-4 Visual, is said to be a media stream. If the media stream
305	    consists of several distinct elements, which are of different
306	    importance with respect to the quality of the decoding process at
307	    the receiver, then the media stream is progressive. The
308	    progressive media stream is often organized in separate layers.
309	    Hence, there exists at least one layer, often called base layer,
310	    without which decoding fails entirely, whereas all the other
311	    layers, often called enhancement layers, just help to continually
312	    improve the quality. Consequently, the different layers are
313	    usually contained in the (source-)encoded media stream in
314	    decreasing order of importance, i.e. the base layer data is
315	    followed by the various enhancement layers.
316	    An example can be found in the fine granular scalability modes
317	    which have been proposed to various standardization bodies like
318	    MPEG, where the resolution of the scaling process in the
319	    progressive source encoder is as low as one symbol in the
320	    enhancement layer [4]. Another example is given by data
321	    partitioning which can be applied to the ITU/MPEG H.26L standard
322	    [5], MPEG-4, and H.263++. Also, the existence of I, P, and B
323	    frames in streams which comply with standards like MPEG-2 can be
324	    interpreted as progressive.
325	    From the above definition, it is quite obvious that the most
326	    important base layer data must be protected as strongly as
327	    possible against packet loss during transmission. However, the
328	    protection of the enhancement layers can be continually lowered,
329	    since a loss at these stages has only minor consequences for the
330	    decoding process. Thus, by using a suitable unequal erasure
331	    protection strategy across a progressive media stream, the
332	    overhead due to redundancy is reduced. Furthermore, if channel
333	    conditions get worse during transmission, only more and more
334	    enhancement layers are lost, i.e. a graceful degradation in
335	    application quality at the receiver is achieved [6].
336	    Nevertheless, it should be mentioned that the specific structure
337	    of the media stream strongly depends on the actual media codec in
338	    use and does not always provide suitable mechanisms for transport
339	    over data networks, like framing (see also Sect. 6.3 ). In order
340	    to keep the description of the unequal erasure protection
341	    strategy in Sect. 5 as general as possible, the final bitstream
342	    which has to be protected by the proposed UXP scheme will be
343	    called "info stream" in the following. Furthermore, it is assumed
344	    that every info stream is already octet-aligned according to the
345	    standard procedures defined in the context of the used syntax
346	    specifications.

348	 5. General Structure of UXP Schemes

350	    In this section, the principle features of the proposed UXP
351	    scheme are described with a special focus on the protection and
352	    reconstruction procedure which is applied to the info stream. In
353	    addition, the behavior of the sender and receiver is specified as
354	    far as it concerns the reconstruction of the info stream.
355	    However, the complete UXP payload structure, including the
356	    additional UXP header, is described in Sect. 6.
357	    The reason for using the term "info stream" as well as the
358	    details of the construction are described in Sect. 6.3 . For now,
359	    we assume that we have an info stream which has to be protected.

361	    Fig. 1 already illustrated the structure of a systematic RS
362	    codeword, which shall be represented by a single row with n
363	    successive symbols that contain the information and the parity
364	    octets. This structure shall now be extended by forming a
365	    transmission block (TB) consisting of L codewords of length n
366	    octets each, which amounts to a total of L rows and n columns
367	    [7]: Each column, together with the respective UXP header in
368	    front, shall represent the payload of an RTP packet, i.e. the
369	    whole data of a TB is transmitted via a sequence of n RTP packets
370	    all carrying a payload of length (L+2) octets (UXP header
371	    included).
372	    Each TB usually consists of two or more horizontal sub blocks,
373	    the so-called transmission sub blocks (TSB), as can be seen in
374	    Fig. 2: The first L_s rows always belong to the signaling TSB,
375	    which is used to convey the actual redundancy profile in the data
376	    part to the receiver (see 6.4.). The following L_d=(L-L_s) rows
377	    belong to one or more data TSBs, which contain the interleaved
378	    and RS encoded info stream, as will be described below.

380	    Transmission Block (TB)

382	                 /\ +-+-+-+-+-+-+-+-+-+ /\
383	                 |  |  signaling TSB  |  |  L_s octets
384	                 |  +-+-+-+-+-+-+-+-+-+ \/
385	                 |  |                 | /\               /\
386	                 |  +   data TSB #1   +  |  L_d(1) octets |
387	                 |  |                 |  |                |
388	                 |  +-+-+-+-+-+-+-+-+-+ \/                |
389	    L octets     |  |                 | /\                |
390	    payload      |  +   data TSB #2   +  |  L_d(2) octets |
391	    per packet   |  |                 |  |                |  L_d oct.
392	                 |  +-+-+-+-+-+-+-+-+-+ \/                |
393	                 |  |        .        |  .                |
394	                 |  +        .        +  .                |
395	                 |  |        .        |  .                |
396	                 |  +-+-+-+-+-+-+-+-+-+ /\                |
397	                 |  |   data TSB #z   |  |  L_d(z) octets |
398	                 \/ +-+-+-+-+-+-+-+-+-+ \/               \/
399	                    <----------------->
400	                          n packets
401	    Fig. 2: General structure of a TB

403	    Since the UXP procedure is mainly applied to the data TSBs, it
404	    will be described next, whereas the content and syntax of the
405	    signaling TSB will be defined in section 6.4.
406	    For simplicity, only one single data TSB will be
407	    assumed throughout the following explanation of the encoding and
408	    decoding procedure. However, an extension to more than one data
409	    TSB per TB is straightforward, and will be shown in section 6.5.
410	    As depicted in Fig. 3, the rows of a transmission sub block shall
411	    be assembled into T+1 different classes EPC_i, where i=0...T,
412	    such that each class contains exactly R_i=|EPC_i| consecutive
413	    rows of the matrix, where the R_i have to satisfy the following
414	    relationship:
415	    R_0+R_1+...+R_T=L_d
416	    Data Transmission Sub Block (data TSB)
417	                                  T
418	                              <------->
419	                 /\ +-+-+-+-+-+-+-+-+-+ /\
420	                 |  |&|&|&|&|&|*|*|*|*|  |
421	                 |  +-+-+-+-+-+-+-+-+-+  |  R_T=3
422	                 |  |&|&|&|&|&|*|*|*|*|  |
423	                 |  +-+-+-+-+-+-+-+-+-+  |
424	    L_d octets   |  |&|&|&|&|&|*|*|*|*| \/
425	    per packet   |  +-+-+-+-+-+-+-+-+-+ /\
426	                 |  |%|%|%|%|%|%|*|*|*|  |  R_(T-1)=1
427	                 |  +-+-+-+-+-+-+-+-+-+ \/
428	                 |  |$|$|$|$|$|$|$|*|*|  .
429	                 |  +-+-+-+-+-+-+-+-+-+  .
430	                 |  |!|!|!|!|!|!|!|!|*|  .
431	                 |  +-+-+-+-+-+-+-+-+-+ /\
432	                 |  |#|#|#|#|#|#|#|#|#|  |  R_0=1
433	                 \/ +-+-+-+-+-+-+-+-+-+ \/
434	                    <----------------->
435	                          n packets
436	    &,%,$,!,# : info octets belonging to a certain info stream in
437	                decreasing order of importance
438	    * :         parity octets gained from Reed-Solomon coding
439	    Fig. 3: General structure for coding with unequal erasure
440	    protection
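
As a worked check of the row-count relationship, the following sketch validates a redundancy profile and counts the info and parity octets it provides. The function name and the list representation R (with R[i] holding R_i) are assumptions made for illustration only. The values are those drawn in Fig. 3: n=9, T=4, R_4=3 and R_3=R_2=R_1=R_0=1.

```python
from typing import List, Tuple


def profile_capacity(n: int, L_d: int, R: List[int]) -> Tuple[int, int]:
    """Return (info_octets, parity_octets) of a data TSB with n columns,
    where R[i] is the number of rows carrying i parity octets (i = 0..T)."""
    T = len(R) - 1
    if T >= n:
        raise ValueError("T must be smaller than n so each row keeps info octets")
    if any(r < 0 for r in R):
        raise ValueError("row counts must not be negative")
    if sum(R) != L_d:
        raise ValueError("R_0 + R_1 + ... + R_T must equal L_d")
    info = sum(r * (n - i) for i, r in enumerate(R))
    parity = sum(r * i for i, r in enumerate(R))
    return info, parity


# Profile of Fig. 3, listed as [R_0, R_1, R_2, R_3, R_4].
info_octets, parity_octets = profile_capacity(n=9, L_d=7, R=[1, 1, 1, 1, 3])
print(info_octets, parity_octets)
```

The two counts always add up to n*L_d, the total number of octets in the data TSB.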

442	    Furthermore, all rows in a particular class EPC_i shall contain
443	    exactly the same number of parity octets, which is equal to the
444	    index i of the class. For each row in a certain class EPC_i, the
445	    same (n,n-i) RS code shall be applied.
446	    As can be observed from Fig. 3, class EPC_T contains the largest
447	    number of parity octets per row, i.e. offers the highest erasure
448	    protection capability in the block. Consequently, the most
449	    important element in the info stream must be assigned to class
450	    EPC_T, where the value of T should be chosen according to the
451	    desired outage threshold of the application given a certain
452	    packet erasure rate on the link.
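
One possible way to relate T to an outage threshold is sketched below. It rests on an assumption that this document does not make: packet erasures are treated as independent events with a fixed probability p, which may be optimistic for bursty wireless links. Under that assumption a row with i parity octets cannot be decoded when more than i of the n packets are lost. Both function names are hypothetical.

```python
from math import comb
from typing import Optional


def row_failure_probability(n: int, i: int, p: float) -> float:
    """Probability that more than i of n packets are erased, assuming
    independent erasures with probability p (illustrative model only)."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(i + 1, n + 1))


def smallest_T(n: int, p: float, threshold: float) -> Optional[int]:
    """Smallest number of parity octets per row that keeps the row failure
    probability at or below the threshold, or None if no t < n suffices."""
    for t in range(n):
        if row_failure_probability(n, t, p) <= threshold:
            return t
    return None


# n = 9 packets per TB, 10% packet erasure rate, 1% outage threshold.
T = smallest_T(n=9, p=0.1, threshold=0.01)
```

The same function can be evaluated for the weaker classes to see how quickly the decoding probability of the enhancement layers falls off as i decreases.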
453	    All other classes EPC_(T-1)...EPC_0 shall be sequentially filled
454	    with the remaining elements of the info stream in decreasing
455	    order of importance, where the optimal choice for the size of
456	    each class (0 or more rows), i.e. the structure of the redundancy
457	    profile, should depend on the quality-of-service requirements for
458	    the various (progressively-encoded) layers.
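
The filling order described above can be sketched as follows. This is a simplified illustration under stated assumptions, not a normative procedure: the function name is hypothetical, the profile is given as a list R with R[i] holding R_i, and the parity octets are only counted per row rather than computed, since a real implementation would need an RS encoder over GF(2^8).

```python
from typing import List, Tuple


def fill_data_tsb(info: bytes, n: int, R: List[int],
                  stuffing_octet: int = 0x00) -> Tuple[List[Tuple[bytes, int]], int]:
    """Distribute the info stream over the rows of a data TSB, starting with
    class EPC_T and ending with EPC_0. Returns the rows as
    (info part, number of parity octets) pairs and the media stuffing count."""
    capacity = sum(r * (n - i) for i, r in enumerate(R))
    if len(info) > capacity:
        raise ValueError("info stream does not fit into the data TSB")
    stuffing = capacity - len(info)
    padded = info + bytes([stuffing_octet]) * stuffing   # media stuffing at the end
    rows = []
    pos = 0
    for i in range(len(R) - 1, -1, -1):                  # i = T, T-1, ..., 0
        for _ in range(R[i]):
            k = n - i                                    # info octets in this row
            rows.append((padded[pos:pos + k], i))
            pos += k
    return rows, stuffing


# 40 info octets placed into the profile of Fig. 3 (capacity 45 octets).
rows, stuffing = fill_data_tsb(bytes(range(40)), n=9, R=[1, 1, 1, 1, 3])
```

Because the info stream is ordered by decreasing importance, its first octets land in the most strongly protected rows, and any stuffing ends up in the least protected ones.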
459	    The following set of rules contains a compact description of all
460	    the operations that must be performed for each transmission
461	    block:
462	    1.) The total number of columns n of the TB shall be chosen
463	    according to the actual delay constraints of the application.

465	    2.) Next, the expected number of rows reserved for the signaling
466	    TSB has to be selected, which limits the data TSB to L_d=(L-L_s)
467	    rows.
468	    3.) The maximum erasure correction capability T in the data TSB
469	    should be chosen according to the desired outage threshold of the
470	    application given the actual packet erasure rate on the link.
471	    4.) The redundancy profile for the rest of the data TSB should
472	    depend on the size and number of the various layers in the info
473	    stream, as well as the desired probability of successful decoding
474	    for each of them (quality-of-service requirement).
475	    5.) Any suitable optimization algorithm may be used for deriving
476	    an adequate redundancy profile. However, the result has to
477	    satisfy the following constraints:
478	    a) All available info octet positions in the data TSB have to be
479	    completely filled. If the info stream is too short for a desired
480	    profile, media stuffing may be applied to the empty info octet
481	    positions at the end of the data TSB by appending a sufficient
482	    number of octets (with arbitrary value, e.g. 0x00). The actual
483	    number of stuffing symbols per data TSB is then signaled via the
484	    respective stuffing indicator (see Sect. 6.4.). However, before
485	    resorting to any stuffing, it should be checked whether it is
486	    possible to strengthen the protection of certain rows instead,
487	    thus improving the overall robustness of the decoding process.
488	    b) The info stream SHOULD be fully contained within the data TSB
489	    (unless cutting it off at a specific point is explicitly allowed
490	    by the properties of the info stream).
491	    c) The number of required descriptors and stuffing indicators
492	    (see section 6.4.) to signal the profile SHALL NOT exceed the
493	    space initially reserved for them in the signaling TSB.
494	    Constraints a) and b) should be already incorporated in the
495	    optimization algorithm. However, if constraint c) is not met, the
496	    data TSB has to be reduced by one row in favor of the signaling
497	    TSB to accommodate more space for the descriptors and stuffing
498	    indicators, i.e. steps 2-5 have to be repeated until a valid
499	    redundancy profile has been obtained.
500	    6.) For each nonempty class EPC_i, i=T...0, in the data TSB, the
501	    following steps have to be performed:
502	    a) All rows of this specific class SHALL be filled from left to
503	    right and top to bottom with data octets of the info stream.
504	    b) For each row in the class, the required i parity-check octets
505	    are computed from the same set of codewords of an (n,n-i) RS
506	    code, and filled in the empty positions at the end of each row.
507	    Thus, every row in the class constitutes a valid codeword of the
508	    chosen RS code.

510	    7.) After having filled the whole data TSB with information and
511	    parity octets, the redundancy profile is mapped to the signaling
512	    TSB as described in section 6.4.
513	    8.) Each column of the resulting TB is now read out octet-wise
514	    from top to bottom and, together with the respective UXP header
515	    (see section 6.2.) in front, is mapped onto the payload section
516	    of one and only one RTP packet.

518	    9.) The n resulting RTP packets SHALL be transmitted
519	    consecutively to the remote host, starting with the leftmost one.
520	    10.) At the corresponding protocol entity at the remote host, the
521	    payload (without the UXP header) of all successfully received RTP
522	    packets belonging to the same sending TB SHALL be filled into a
523	    similar receiving TB column-wise from top to bottom and left to
524	    right.
525	    11.) For every erased packet of a received TB, the respective
526	    column in the TB shall be filled with a suitable erasure marker.
527	    12.) Before any other operations can be performed, the redundancy
528	    profile has to be restored from the signaling TSB according to
529	    the procedure defined in Sect. 6.4. If the attempt fails because
530	    of too many lost packets, the whole TB SHALL be discarded and the
531	    receiving entity should wait for the next incoming TB.
532	    13.) If the attempt to recover the redundancy profile has been
533	    successful, a decoding operation shall be performed for each row
534	    of the data TSB by applying any suitable algorithm for erasure
535	    decoding.
536	    14.) For all rows of the data TSB for which the decoding
537	    operation has been successful, the reconstructed data octets are
538	    read out from left to right and top to bottom, and appended to
539	    the reconstructed version of the info stream.
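The row layout of rule 6a can be sketched as follows. This is a minimal illustration, not part of the specification: the function name layout_data_tsb and the representation of the redundancy profile as a list epv (epv[i] = number of rows in class EPC_i) are assumptions made for this example. Parity positions are left as None placeholders, since the RS encoding of rule 6b is outside the scope of the sketch.

```python
def layout_data_tsb(info: bytes, epv: list, n: int, stuffing_octet: int = 0x00):
    """Fill the data TSB row by row, strongest class first (rule 6a).

    A row of class EPC_i holds n - i info octets followed by i parity
    positions. Parity positions are left as None; a real sender would
    fill them with the parity octets of an (n, n-i) RS code (rule 6b).
    Returns (rows, number_of_media_stuffing_octets).
    """
    capacity = sum(r * (n - i) for i, r in enumerate(epv))
    if len(info) > capacity:
        raise ValueError("info stream does not fit into the data TSB")
    stuffing = capacity - len(info)                 # constraint 5a
    padded = info + bytes([stuffing_octet]) * stuffing
    rows, pos = [], 0
    for i in range(len(epv) - 1, -1, -1):           # i = T ... 0
        for _ in range(epv[i]):
            k = n - i                               # info octets in this row
            rows.append(list(padded[pos:pos + k]) + [None] * i)
            pos += k
    return rows, stuffing
```

For example, with n=4 and one row in each of the classes EPC_2, EPC_1 and EPC_0, an 8-octet info stream leaves one position to be filled by media stuffing.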

541	    One can easily realize that the above rules describe an
542	    interleaver, i.e. at the sender a single codeword of a TB is
543	    spread out over n successive packets. Thus, each codeword of a
544	    transmitted TB experiences the same number of erasures at exactly
545	    the same positions.
546	    Two important conclusions can be drawn from this:
547	    a) Since the same RS code is applied to all rows contained in a
548	    specific class, either all of them can be correctly decoded or
549	    none. Hence, there exist no partly decodable classes at the
550	    receiver.
551	    b) If decoding is successful for a certain class EPC_i, all the
552	    classes EPC_(i+1)...EPC_T can also be decoded, since they are
553	    protected by at least one more parity octet per row. Together
554	    with rule 6, it is therefore always ensured that, in case a
555	    decodable enhancement layer exists, all other layers it depends
556	    on can also be reconstructed.
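The interleaving of rules 8 to 11 and the two conclusions above can be illustrated with a short sketch. The function names and the use of None as erasure marker are assumptions made for this example; no RS decoding is performed. The only coding fact used is that an (n,n-i) RS code can recover a row with up to i erasures.

```python
ERASED = None  # erasure marker for an octet whose packet was lost

def interleave(tb_rows):
    """Rule 8: packet k carries column k of the TB, read top to bottom."""
    return [list(col) for col in zip(*tb_rows)]

def deinterleave(columns, n, length):
    """Rules 10-11: rebuild the TB; columns[k] is None for a lost packet."""
    return [[ERASED if columns[k] is None else columns[k][r] for k in range(n)]
            for r in range(length)]

def decodable_classes(num_lost, t_max):
    """Classes EPC_i whose rows survive num_lost erased packets."""
    return [i for i in range(t_max + 1) if i >= num_lost]

# Losing packet 1 erases column 1 in every row of the received TB.
tb = [[1, 2, 3], [4, 5, 6]]
packets = interleave(tb)
packets[1] = None
received = deinterleave(packets, n=3, length=2)
```

Since every row sees the same erasure positions, a class is either decodable as a whole or not at all, and the set of decodable classes always has the form EPC_i...EPC_T.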

558	    Given the maximum erasure protection value T, the redundancy
559	    profile for a data TSB of size (L_d x n) shall be denoted by a
560	    so-called erasure protection vector EPV of length (T+1), where
561	    EPV:=(R_0,R_1,...,R_(T-1),R_T) and R_i is the row count of EPC_i.
562	    From the above definition, it is easy to realize that the trivial
563	    cases of no erasure protection and EXP are a subset of UXP:
564	    a) no erasure protection at all: all application data is
565	       mapped onto class EPC_0, i.e.
566	       EPV=(L_d,0,0,...,0).
567	    b) EXP: all application data is mapped onto class EPC_T, i.e.
568	       EPV=(0,0,...,0,R_T=L_d).

570	    Hence, the UXP payload format can also be used with info streams
571	    which are non-progressive.
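As a small illustration of the two trivial cases, the following sketch computes the number of rows, info octets and parity octets implied by an erasure protection vector. The helper name epv_summary is hypothetical, and the values L_d=4, n=10, T=3 are chosen only for this example.

```python
def epv_summary(epv, n):
    """Return (rows, info_octets, parity_octets) for a data TSB.

    epv[i] is the number of rows R_i in class EPC_i; each such row
    carries n - i info octets and i parity octets.
    """
    rows = sum(epv)
    info = sum(r * (n - i) for i, r in enumerate(epv))
    parity = sum(r * i for i, r in enumerate(epv))
    return rows, info, parity

no_protection = epv_summary([4, 0, 0, 0], n=10)   # all rows in EPC_0
exp_only = epv_summary([0, 0, 0, 4], n=10)        # all rows in EPC_T, T=3
```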

573	 6. RTP payload structure

575	    This section is organized as follows. First, the specific
576	    settings in the RTP header are shown. Next, the RTP payload
577	    header for UXP (the so-called UXP header) is specified. After
578	    that, the structure of the bitstream which is protected by UXP,
579	    the so-called info stream, is discussed. Finally, the in-band
580	    signaling of the erasure protection vector is introduced.
581	    For every packet, the UXP payload is formed by reading out a
582	    column of the TB and prefixing it with the UXP header. Thus, a
583	    UXP-compliant RTP packet looks as follows:

585	    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-
586	    |RTP Header| UXP Header| one column of the TB        |
587	    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-

589	 6.1 Specific Settings in the RTP Header

591	    The timestamp of each RTP packet is set to the sampling time of
592	    the first octet of the progressive media stream in the
593	    corresponding TB. If several data TSBs are included in one TB,
594	    the sampling time of data TSB #1 is relevant. This results in the
595	    TS value being the same for all RTP packets belonging to a
596	    specific TB.
597	    The payload type is of dynamic type, and obtained through
598	    out-of-band signaling similar to [1]. End systems that cannot
599	    recognize a payload type must discard the packet.
600	    The marker bit is set to 1 in the last packet of a TB; otherwise,
601	    its value is 0.
602	    All other fields in the RTP header are set to those values
603	    proposed for regular multimedia transmission using the RTP-format
604	    of the media stream which is protected by UXP, e.g. for MPEG-4
605	    Visual as specified in RFC 3016.

607	 6.2. Structure of the UXP Header

609	    The UXP header shall consist of 2 octets, and is shown in Fig. 4:

611	     0                   1 1 1 1 1 1
612	     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5

614	    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
615	    |X|  block PT   | block length n|
616	    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

618	    Fig. 4: Proposed UXP header
619	    The fields in the UXP header are defined as follows:
620	    - X (bit 0): extension bit, reserved for future enhancements,
621	    currently not in use -> default value: 0
622	    - block PT (bits 1-7): regular RTP payload type to indicate the
623	    media type contained in the info stream
624	    - block length n (bits 8-15): indicates the total number of RTP
625	    packets resulting from one TB (which equals the number of
626	    columns of the TB)
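The 2-octet header layout of Fig. 4 could be packed and parsed as in the following sketch. The function names are assumptions made for this example, and bit 0 is taken to be the most significant bit of the first octet, as is usual for header diagrams of this kind.

```python
import struct

def pack_uxp_header(block_pt: int, n: int, x: int = 0) -> bytes:
    """Build the 2-octet UXP header: X (1 bit), block PT (7 bits), n (8 bits)."""
    if not (0 <= block_pt <= 0x7F and 0 <= n <= 0xFF and x in (0, 1)):
        raise ValueError("header field out of range")
    return struct.pack("!BB", (x << 7) | block_pt, n)

def unpack_uxp_header(data: bytes):
    """Return (x, block_pt, n) from the first two octets of the payload."""
    first, n = struct.unpack("!BB", data[:2])
    return first >> 7, first & 0x7F, n

header = pack_uxp_header(block_pt=99, n=20)
```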
628	    The syntax of the info stream which is protected by UXP is
629	    specified by the RTP payload type field contained in the UXP
630	    header. The details of the info stream are described in Sect. 6.3.
631	    For example, payload type H.263 means that the info stream
632	    conforms to the specifications of the RTP profile for H.263 and
633	    does not represent the "raw" H.263 media stream produced by an
634	    H.263 encoder.
635	    However, UXP can also be applied to the "raw" media stream (in
636	    case it is already octet-aligned), if this can be signaled to the
637	    receiver via other means, e.g. by use of H.245 or SDP.
638	    Based on the RTP sequence number, the marker bit, and the
639	    repetition of the block length n in each UXP header, the
640	    receiving entity is able to recognize both TB boundaries and the
641	    actual position of packets (both received and lost ones) in the
642	    TB.
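One possible way to derive the column position of a packet is sketched below. It rests on two assumptions that go beyond the text: the packet carrying the marker bit of the TB has been received, and the n packets of a TB use consecutive RTP sequence numbers (cf. rule 9 in Sect. 5). The function name column_index is hypothetical; a receiver that loses the marker packet would need an additional strategy, which is not shown.

```python
def column_index(seq: int, marker_seq: int, n: int) -> int:
    """Column of the packet with sequence number seq.

    marker_seq is the sequence number of the last packet of the TB
    (marker bit set); n is the block length from the UXP header.
    RTP sequence numbers are 16 bits wide and wrap around.
    """
    back = (marker_seq - seq) & 0xFFFF   # distance to the marker packet
    if back >= n:
        raise ValueError("packet does not belong to this TB")
    return n - 1 - back
```

Columns for which no packet arrived are then the erased positions of the TB.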

644	 6.3 Framing and Timing Mechanism in UXP: The Info Stream

646	    As described in Sect. 5, UXP creates its own packetization scheme
647	    by interleaving. The regular framing and timing structure of RTP
648	    is therefore destroyed. This section describes which kind of
649	    problems arise with interleaving and how they can be solved. This
650	    finally leads to the specification of the info stream.
651	    The timestamp of an RTP packet usually describes the sampling
652	    time of the first octet included in the RTP data packet. This is
653	    in principle also true for UXP RTP packets. According to the time
654	    stamp definition in Sect. 6.1, every packet contains the
655	    timestamp of the sampling time of the first octet in the
656	    corresponding TB. Therefore, all packets which belong to one TB
657	    contain the same timestamp. This can lead to problems since due
658	    to the theoretical size limit of a TB (the limit for the number
659	    of columns is 256, and the limit for the number of rows is the
660	    maximum packet size), it can contain data from different sampling
661	    time instances, e.g. several video frames. Then the timing
662	    information of the later frames has to be determined from the
663	    media stream itself and not from the RTP timestamp.

665	    A second problem arising with interleaving is that the framing
666	    mechanism of RTP is not supported. Consider a media encoder,
667	    which does not create a fully decodable bitstream, e.g. H.26L
668	    with the video coding layer (VCL) and network adaptation layer
669	    (NAL) concept [9]. In this concept the VCL creates slices which
670	    are prepared for transmission over several networks at the NAL.
671	    Consequently, in case of RTP transmission, header information
672	    needed to decode the slices is included only in the RTP
673	    packets. Thus, filling a UXP TB with the "raw" media stream from
674	    the VCL can lead, even without packet losses, to a non-decodable
675	    stream.
676	    The framing problem can be solved in two ways:
677	    One solution could be to use the RTP payload specification of a
678	    given media stream to create a bitstream with an appropriate
679	    framing, resulting in the so-called info stream. For example, to
680	    create an H.263 info stream, the following steps are necessary:
681	    1.)  Generate an H.263-compliant media stream, i.e. take a slice
682	         or a video frame directly from the H.263 encoder.
683	    2.)  Apply the H.263 payload specification (e.g. RFC 2429) to
684	         create the RTP payload for only one packet.
685	    3.)  Insert the latter row by row into one data TSB.
686	    It is possible to apply the procedure mentioned above several
687	    times for different data TSBs (see Sect. 6.5.). Due to the in-
688	    band signaling, it is possible to determine the beginning and end
689	    of every TSB without parsing the whole TB. This allows a fast
690	    decomposition of the TB into the different TSBs.
691	    Another solution of the framing problem would be to rely on the
692	    framing mechanism of the media stream. This is, for example,
693	    possible for media streams which contain start codes.
694	    The timing problem can be solved in two ways.
695	    One solution is to comply with the RTP payload specification of
696	    the media stream. If the specification allows octets which
697	    belong to different sampling times to be put into one packet,
698	    this should also be allowed for a TB.
699	    The second solution for the timing problem is to rely on the
700	    timing information contained in the media stream itself, if
701	    available.
702	    Therefore, there are two different modes for framing:
703	    1.)  RTP payload framing (if an RTP payload specification exists
704	         for the media stream),
705	    2.)  pure media stream framing (if framing is contained in the
706	         media stream),

708	    and two different modes for timing:
709	    1.)  timing rules of the RTP payload specification for the media
710	         stream,
711	    2.)  timing information within the media stream.

713	    All combinations of timing and framing modes are possible, but
714	    framing mode 1 and timing mode 1 represent the default mode of
715	    operation for UXP. The use of other timing and framing modes has
716	    to be signaled by non-RTP means.
717	    The info stream is thus defined by the media stream together with
718	    framing and timing rules.
719	    In the following, some examples will be given:
720	    1.)  The info stream for MPEG-4 Visual according to RFC 3016 is
721	         the pure MPEG-4 compliant media stream, since RFC 3016
722	         specifies (in case of video) to take the MPEG-4 compliant
723	         video stream as payload.
724	    2.)  The info stream for H.263+ can be created according to RFC
725	         2429 as follows:
726	    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-
727	    |H.263+ payload| H.263+ compliant stream (possibly changed with|
728	    |header        | respect to RFC 2429) containing a slice/frame |
729	    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-

731	    This info stream is inserted into one single data TSB.
732	    If necessary, for example, if the slices are too short to achieve
733	    a reasonable TB size, several info streams can be inserted in one
734	    TB by concatenating several data TSBs to a single TB (see Sect.
735	    6.5.).

737	 6.4. In-band Signaling of the Structure of the Redundancy Profile

739	    To enable a dynamic adaptation to varying link conditions, the
740	    actual redundancy profile used in the data TSB as well as the
741	    beginning and end of a TSB must be signaled to the receiving
742	    entity. Since out-of-band signaling either results in excessive
743	    additional control traffic, or prevents quick changes of the
744	    profile between successive TBs, an in-band signaling procedure is
745	    desired.
746	    Since without knowledge of the correct redundancy profile, the
747	    decoding process cannot be applied to any of the erasure
748	    protection classes, the redundancy profile has to be protected at
749	    least as strongly as the most important element in the info
750	    stream. Therefore, an additional class EPC_P is used in the
751	    signaling TSB, where the number of parity symbols is by default
752	    set to the following value:
753	    P=ceil(n/2)
754	    Hence, up to 50% of the RTP packets can be lost, before the
755	    redundancy profile cannot be recovered anymore. This seems to be
756	    a reasonable value for the lowest point of operation over a lossy
757	    link. Alternatively, P may be explicitly signaled during session
758	    setup by means of SDP or H.245 protocol.
759	    Consequently, since all other classes must have equal or less
760	    erasure protection capability, the maximum allowable value for
761	    class EPC_T in the data TSB is now limited to T<=P.
762	    The signaling of the erasure protection vector is accomplished by
763	    means of descriptors. In the following we describe an efficient
764	    encoding scheme for the descriptors.

766	    For each class EPC_i with R_i>0, there is a descriptor DP_i
767	    providing information about the size of class EPC_i (i.e. the
768	    value of R_i) and establishing a relationship between the erasure
769	    protection of class EPC_i and that of the class EPC_(i+j), where
770	    j>0 and j is the smallest value for which R_(i+j)>0 is true. A
771	    descriptor DP_i is mapped onto one octet, which is sub-divided
772	    into two half-octets (i.e. the higher and the lower four bits).
773	    The first half-octet is of type unsigned and contains the 4-bit
774	    representation of the decimal value R_i. The second half-octet is
775	    of type signed and contains the difference in erasure protection
776	    between class EPC_i and class EPC_(i+j), i.e. the signed 4-bit
777	    representation of the decimal value (-j) (where the MSB denotes
778	    the sign, and the lower three bits the absolute value). Note that
779	    the erasure protection P of class EPC_P is fixed, whereas the
780	    size R_P may vary.
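The descriptor octet described above can be sketched as follows: the upper half-octet carries R_i, the lower half-octet carries a sign-and-magnitude value (MSB = sign, lower three bits = absolute value). The function names are assumptions made for this example.

```python
def encode_descriptor(r_i: int, delta: int) -> int:
    """Pack R_i (0..15) and a signed difference (-7..7) into one octet."""
    if not 0 <= r_i <= 15:
        raise ValueError("R_i does not fit into four bits")
    if not -7 <= delta <= 7:
        raise ValueError("difference does not fit into sign + three bits")
    sign = 0x8 if delta < 0 else 0x0
    return (r_i << 4) | sign | abs(delta)

def decode_descriptor(octet: int):
    """Return (R_i, signed difference) from a descriptor octet."""
    magnitude = octet & 0x7
    delta = -magnitude if octet & 0x8 else magnitude
    return octet >> 4, delta
```

For instance, a class of 10 rows whose next stronger non-empty class is four levels above (j=4) yields encode_descriptor(10, -4), i.e. 0xAC.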
781	    Thus, the data to be filled into class EPC_P shall consist of a
782	    sequence of descriptors separated by stuffing indicators (see
783	    below), where the number of descriptors is primarily given by the
784	    number of protection classes EPC_i, 0<=i<=T, in the data TSB with
785	    R_i>0.
786	    Without a-priori knowledge, the initial value for the size of the
787	    signaling TSB, R_P, should be set to one (row). When the number
788	    of necessary descriptors and stuffing indicators exceeds the (n-
789	    P) information positions, one or more additional rows have to be
790	    reserved. This is usually done by increasing the value for L_s to
791	    R_P>1, i.e. the data TSB is reduced to (L-R_P) rows. Hence, in
792	    order to indicate the actual size of the signaling TSB, an
793	    additional descriptor is inserted at the very beginning, which
794	    takes on the value 0xq0, where q denotes the (hexadecimal) four-bit
795	    representation of the decimal value R_P.
796	    Furthermore, the end of each data TSB is signaled by the
797	    otherwise unused descriptor value 0x00, followed by exactly one
798	    stuffing indicator (SI). The latter is mapped onto an octet,
799	    which is of type unsigned and contains the 8-bit representation
800	    of the decimal value of the number of media stuffing symbols used
801	    at the end of the respective data TSB.
802	    The (extended) sequence of descriptors and stuffing indicators is
803	    then mapped to the octet positions in the R_P rows of the
804	    signaling TSB from left to right and top to bottom. Each row is
805	    then encoded with the same (n,n-P) RS code.
806	    If the number of descriptors and stuffing indicators is less than
807	    the available octet positions, however, empty positions in class
808	    EPC_P may be filled up with the otherwise unused descriptor 0x00.
809	    At the receiving entity, the sequence of descriptors shall be
810	    recovered by performing erasure decoding on the first row of the
811	    TB (which definitely belongs to the signaling TSB) using the same
812	    algorithm as later for the data TSB. If successful, the very
813	    first descriptor now indicates the number of rows of the
814	    signaling TSB, and the next (R_P-1) rows are decoded to
815	    reconstruct the redundancy profile for the data TSB(s), together
816	    with the number of media stuffing symbols denoted by the
817	    respective SI(s).

819	    The complete structure of the TB is now depicted in Fig. 5.

821	    Transmission Block (TB)
822	                                 P
823	                            <--------->
824	                 /\ +-+-+-+-+-+-+-+-+-+ /\
825	                 |  |?|?|?|?|*|*|*|*|*|  |  R_P=1
826	                 |  +-+-+-+-+-+-+-+-+-+ \/
827	                 |  |&|&|&|&|&|*|*|*|*| /\
828	                 |  +-+-+-+-+-+-+-+-+-+  |  R_T=3
829	                 |  |&|&|&|&|&|*|*|*|*|  |
830	                 |  +-+-+-+-+-+-+-+-+-+  |
831	    L octets     |  |&|&|&|&|&|*|*|*|*| \/
832	    payload      |  +-+-+-+-+-+-+-+-+-+ /\
833	    per packet   |  |%|%|%|%|%|%|*|*|*|  |  R_(T-1)=1
834	                 |  +-+-+-+-+-+-+-+-+-+ \/
835	                 |  |$|$|$|$|$|$|$|*|*|  .
836	                 |  +-+-+-+-+-+-+-+-+-+  .
837	                 |  |!|!|!|!|!|!|!|!|*|  .
838	                 |  +-+-+-+-+-+-+-+-+-+ /\
839	                 |  |#|#|#|#|#|#|#|#|#|  |  R_0=1
840	                 \/ +-+-+-+-+-+-+-+-+-+ \/
841	                    <----------------->
842	                          n packets
843	    ? :          descriptors and stuffing indicators for in-band
844	                 signaling of the redundancy profile

846	    &,%,$,!,# :  info octets belonging to a certain element of the
847	                 info stream in decreasing order of importance

849	    * :          parity octets gained from Reed-Solomon coding

851	    Fig. 5: General structure for UXP with in-band signaling of the
852	    redundancy profile
853	    The following simple example is meant to illustrate the idea
854	    behind using descriptors: Let an erasure protection vector of
855	    length T+1=7 be given as follows:
856	    EPV=(R_0,R_1,...,R_5,R_6)=(7,0,2,2,0,3,10)
857	    Hence, the length L of the TB (including one row for the
858	    signaling TSB) is equal to 7+2+2+3+10+1=25 (rows/octets). If the
859	    width is assumed to be equal to 20 (columns/packets), then the
860	    erasure protection of the descriptors is P=10.
861	    The corresponding sequence of descriptors can be written as
862	    DP=(DP_6,DP_5,DP_3,DP_2,DP_0)=(0xAC,0x39,0x2A,0x29,0x7A),
863	    where the values of the descriptors are given in hexadecimal
864	    notation. Next, the descriptor indicating the length of the
865	    signaling TSB has to be inserted, the end of the data TSB has to
866	    be marked by 0x00, and the SI has to be appended. If the number
867	    of media stuffing symbols is assumed to be 3, the 10 info octets
868	    in the signaling TSB take on the following values (descriptor
869	    stuffing included):
870	    (0x10,0xAC,0x39,0x2A,0x29,0x7A,0x00,0x03,0x00,0x00)
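The example above can be reproduced with the following sketch, which builds the info octets of the signaling TSB for a single data TSB. The function name signaling_octets and its parameters are assumptions made for this illustration; the default P=ceil(n/2) and the descriptor stuffing with 0x00 follow the text of this section.

```python
import math

def signaling_octets(epv, n, stuffing, rows=1, p=None):
    """Info octets of the signaling TSB for one data TSB.

    epv[i] is R_i, stuffing is the number of media stuffing octets,
    rows is R_P, and p is the protection level P of class EPC_P.
    """
    p = math.ceil(n / 2) if p is None else p
    out = [rows << 4]                     # size descriptor 0xq0 with q = R_P
    prev = p                              # level of the previous descriptor
    for i in range(len(epv) - 1, -1, -1): # i = T ... 0
        if epv[i] == 0:
            continue
        j = prev - i                      # distance to next stronger class
        if not (0 <= j <= 7 and epv[i] <= 15):
            raise ValueError("profile cannot be expressed by a descriptor")
        low = (0x8 | j) if j else 0x0     # sign-and-magnitude coding of -j
        out.append((epv[i] << 4) | low)
        prev = i
    out += [0x00, stuffing]               # end-of-TSB marker, then the SI
    room = rows * (n - p)
    if len(out) > room:
        raise ValueError("signaling TSB too small, reserve another row")
    return out + [0x00] * (room - len(out))   # descriptor stuffing

example = signaling_octets([7, 0, 2, 2, 0, 3, 10], n=20, stuffing=3)
```

The ValueError for an overfull signaling TSB corresponds to the case in which an additional row has to be reserved.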

872	 6.5. Optional Concatenation of Transmission Sub Blocks

874	    The following procedure may be applied if a single info stream
875	    would be too short to achieve an efficient mapping to a
876	    transmission block with respect to the fixed payload length L and
877	    the desired number of packets n. For example, intra-coded video
878	    frames (I-frames) are usually much larger than the following
879	    predicted ones (P-frames). In this case, a certain number z of
880	    successive small info streams should be each mapped to a
881	    transmission sub block with length L_d(y) and width n, such that
882	    L_d(1)+L_d(2)+...+L_d(z)=L_d.
883	    The resulting transmission sub blocks can then be easily
884	    concatenated to form a TB of size L x n having one common
885	    signaling TSB (see Fig. 2): Since the second half-octet of the
886	    descriptors is of type signed (cf. Sect. 6.4.), we are able to
887	    signal both decreasing and increasing erasure protection
888	    profiles.
889	    Again, we will give a simple example to illustrate this idea: Let
890	    the erasure protection vectors for two concatenated data TSBs be
891	    given as follows:
892	    EPV1=(R1_0,R1_1,...,R1_5,R1_6)=(0,0,2,2,0,3,10),
893	    EPV2=(R2_0,R2_1,...,R2_5,R2_6)=(0,0,2,2,0,3,10).
894	    Hence, two single identical data TSBs will be concatenated to
895	    form a TB of length L=2*(2+2+3+10)+2=36 (rows/octets). If the
896	    width is again assumed to be equal to 20 (columns/packets), then
897	    the erasure protection of the descriptors is P=10. We reserve a
898	    total of two rows for the signaling TSB. The corresponding
899	    sequence of descriptors can now be written as
900	    DP=(0xAC,0x39,0x2A,0x29,0xA4,0x39,0x2A,0x29), where the values of
901	    the descriptors are given in hexadecimal notation. The values of
902	    the first four descriptors are taken from the descriptor of EPV1
903	    as described in Sect. 6.4. (without the SI). The last four
904	    descriptors are taken from the descriptor of EPV2 (without SI)
905	    with one exception. The fifth descriptor of DP (i.e. 0xA4) is
906	    created as follows: The first half-octet is created according to
907	    Sect. 6.4. However, the second half-octet describes no longer the
908	    difference between R_P and R2_6. It rather describes the
909	    difference between R1_2 and R2_6, i.e. R1_2-R2_6, which can be a
910	    positive or negative number. If the number of media stuffing
911	    symbols is assumed to be 3 for each data TSB, the 20 info octet
912	    positions in the signaling TSB are filled with the following
913	    values (descriptor stuffing included):
914	    (0x20,0xAC,0x39,0x2A,0x29,0x00,0x03,0xA4,0x39,0x2A,0x29,0x00,
915	    0x03,0x00,0x00,0x00,0x00,0x00,0x00,0x00)
917	    Therefore from the example above, the following general rule MUST
918	    be used to create the resulting descriptors for concatenated data
919	    TSB #u and data TSB #v, where v=u+1:
920	    Let EPVu=(Au_0,Au_1,...) and EPVv=(Av_0, Av_1,...) be the
921	    corresponding erasure protection vectors and DPu and DPv the
922	    corresponding descriptors created according to Sect. 6.4. (with
923	    stuffing). Let w be the smallest index for which Au_w >0. Let x
924	    be the largest index for which Av_x >0. The resulting descriptor
925	    can be created by concatenation of DPu and DPv where the first
926	    descriptor of DPv should be changed as follows:
927	    The second half-octet is defined by Au_w-Av_x.
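A receiver-side sketch for recovering the profiles of concatenated data TSBs from the decoded signaling octets is given below. It is an interpretation, not a normative procedure: the sign convention is taken from the worked example, in which the lower half-octet of 0xA4 is read as a step of +4 from protection level 2 of the first data TSB to level 6 of the second one. The function name parse_signaling and the returned structure are assumptions made for this example, and truncated input is not handled.

```python
def parse_signaling(octets, p):
    """Return (R_P, [(profile, stuffing), ...]), one entry per data TSB.

    profile maps a protection level i to the row count R_i of class EPC_i.
    p is the protection level P of the signaling class EPC_P.
    """
    r_p = octets[0] >> 4          # size descriptor 0xq0
    level = p                     # running protection level
    tsbs, profile = [], {}
    pos = 1
    while pos < len(octets):
        d = octets[pos]
        if d == 0x00:
            if not profile:       # only descriptor stuffing remains
                break
            tsbs.append((profile, octets[pos + 1]))   # SI follows 0x00
            profile = {}
            pos += 2
            continue
        magnitude = d & 0x7
        level += -magnitude if d & 0x8 else magnitude
        profile[level] = d >> 4
        pos += 1
    return r_p, tsbs

octets = [0x20, 0xAC, 0x39, 0x2A, 0x29, 0x00, 0x03,
          0xA4, 0x39, 0x2A, 0x29, 0x00, 0x03] + [0x00] * 7
r_p, tsbs = parse_signaling(octets, p=10)
```

Applied to the 20 octets of the example, this yields two identical profiles with 10, 3, 2 and 2 rows at levels 6, 5, 3 and 2, each with 3 media stuffing octets.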

929	 7. Indication of UXP in SDP

931	    From the discussion in Sect. 6.3 , we know that UXP encapsulates
932	    and protects the info stream. The info stream consists usually of
933	    a regular RTP-Payload format, e.g. RFC 3016.
934	    There is no static payload type assignment for UXP, so dynamic
935	    payload type numbers MUST be used. The binding to the number is
936	    indicated by an rtpmap attribute. The name used in this binding
937	    is "UXP". The payload type number of UXP, as well as the payload
938	    type of the info stream, is indicated in the "m" line of the
939	    media.

942	    A sample indication of UXP in SDP is as follows:

944	       m=video 8000 RTP/AVP 98 99
945	       a=rtpmap:98 UXP/90000
946	       a=rtpmap:99 MP4V-ES/90000

948	    Here, PT 98 indicates that the payload consists of UXP with the
949	    corresponding info stream "MP4V-ES". Alternatively, PT 99 can be
950	    used which indicates "MP4V-ES" without UXP.
951	    Since UXP is generic, several payload types can be protected. The
952	    lines

954	       m=video 8000 RTP/AVP 98 99 100
955	       a=rtpmap:98 UXP/90000
956	       a=rtpmap:99 MP4V-ES/90000
957	       a=rtpmap:100 H263-1998/90000

959	    mean that UXP can be used with either "MP4V-ES" or "H263-1998" as
960	    info stream (indicated by PT 98 in the RTP-Header and either
961	    block PT=99 or block PT=100 in the UXP-Header). Alternatively,
962	    PT=99 or PT=100 in the RTP-Header means the use of "MP4V-ES" or
963	    "H263-1998" without UXP.

965	    As described in Sect. 6.4., the parameter P has the default value
966	    P=ceil(n/2), if not otherwise stated. The parameter P MAY be
967	    specified explicitly by means of SDP:

969	    a=fmtp:98 UXP-prof: f

971	    where f is a number in the interval (0,1) and specifies P by
972	    P=ceil(n*f). For example, if we set f=0.5,

974	    a=fmtp:98 UXP-prof: 0.5

976	    we get the default value for P, since P=ceil(n/2).
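The derivation of P from the fmtp parameter can be sketched as follows. The function name p_from_fmtp is hypothetical, and the sketch assumes that f lies strictly between 0 and 1 and that the parameter is written exactly as "UXP-prof: f". Exact rational arithmetic is used so that the ceiling is not affected by binary floating-point rounding.

```python
import math
from fractions import Fraction

def p_from_fmtp(fmtp_value: str, n: int) -> int:
    """Compute P=ceil(n*f) from the text after 'a=fmtp:<PT> '."""
    name, _, value = fmtp_value.partition(":")
    if name.strip() != "UXP-prof":
        raise ValueError("not a UXP-prof parameter")
    f = Fraction(value.strip())          # e.g. '0.5' -> 1/2, exactly
    if not 0 < f < 1:
        raise ValueError("f is expected to lie between 0 and 1")
    return math.ceil(n * f)

p_default = p_from_fmtp("UXP-prof: 0.5", n=20)
```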

978	 8. Security Considerations
979	    The payload of the RTP-packets consists of an interleaved media
980	    and parity stream. Therefore, it is reasonable to encrypt the
981	    resulting stream with one key rather than using different keys
982	    for media and parity data. It should also be noted that
983	    encryption of the media data without encryption of the parity
984	    data could enable known-plaintext attacks.
985	    The overall proportion between parity octets and info octets
986	    should be chosen carefully if the packet loss is due to network
987	    congestion. If the proportion of parity octets per TB is
988	    increased in this case, it could lead to increasing network
989	    congestion. Therefore, the proportion between parity octets and
990	    info octets per TB MUST NOT be increased as packet loss increases
991	    due to network congestion.
992	    The overall ratio between parity and info octets MUST NOT be
993	    higher than 1:1, i.e. the absolute bitrate spent for redundancy
994	    must not be larger than the bitrate required for transmission of
995	    multimedia data itself.
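A sender could check the 1:1 limit before transmitting a TB, for example as sketched below. The function name is hypothetical. The sketch counts the P parity octets of each signaling row as parity and treats the descriptor positions as info octets, which is an assumption; the text does not state how the signaling TSB enters the ratio.

```python
def parity_and_info_octets(epv, n, p, signaling_rows=1):
    """Return (parity_octets, info_octets) for one TB.

    epv[i] is the number of rows in class EPC_i (i parity octets per row);
    each signaling row carries p parity octets.
    """
    parity = signaling_rows * p + sum(r * i for i, r in enumerate(epv))
    total = (signaling_rows + sum(epv)) * n
    return parity, total - parity

parity, info = parity_and_info_octets([7, 0, 2, 2, 0, 3, 10], n=20, p=10)
within_limit = parity <= info    # ratio of parity to info not above 1:1
```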

997	 9. Application Statement
998	    There are currently two different schemes proposed for unequal
999	    error protection in the IETF-AVT: Unequal Level Protection (ULP)
1000	    and Unequal Erasure Protection (UXP).
1001	    Although both methods seem to address the same problem, the
1002	    proposed solutions differ in many respects. This section tries to
1003	    describe possible application scenarios and to show the strengths
1004	    and weaknesses of both approaches.
1005	    The main difference between both approaches is that while ULP
1006	    preserves the structure of the packets which have to be protected
1007	    and provides the redundancy in extra packets, UXP interleaves the
1008	    info stream which has to be protected, inserts the redundancy
1009	    information, and thus creates a totally new packet structure.
1010	    Another difference concerns multicast compatibility: It cannot be
1011	    assumed that all future terminals will be able to apply UXP/ULP.
1012	    Therefore, backward compatibility could be an issue in some
1013	    cases. Since ULP does not change the original packet structure,
1014	    but only adds some extra packets, it is possible for terminals
1015	    which do not support ULP to discard the extra packets. In case of
1016	    UXP, however, two separate streams with and without erasure
1017	    protection have to be sent, which increases the overall data
1018	    rate.
1019	    Next, both approaches offer different mechanisms to adjust packet
1020	    sizes, if necessary: UXP allows the packet sizes to be adjusted
1021	    arbitrarily. This is an advantage in case the loss probability is
1022	    dependent on the packet length, which happens, for example, if
1023	    the end-to-end connection contains wireless links. In this case
1024	    proper adjustment of the packet size is one essential network
1025	    adaptation technique. In addition, if a preencoded stream is sent
1026	    over the network, the packet size can be adjusted independently
1027	    of slice structures.
1028	    Since ULP does not change the existing packetization scheme, this
1029	    flexibility does not exist.
1030	    The ability of UXP to adjust the packet size arbitrarily can be
1031	    especially exploited in a streaming scenario, if a delay of
1032	    several hundred milliseconds is acceptable. It is then possible
1033	    to fill several video frames into a single TB of desired size,
1034	    e.g. a group of pictures consisting of I-frame, P-frames and B-
1035	    frames. The redundancy scheme can thus be selected in such a way
1036	    as to guarantee the following property: In case of packet loss,
1037	    the P-frames are only recoverable if the I-frame on which the
1038	    decoding of P-frames depends is recoverable. The same is true for
1039	    B-frames, which can only be decoded if the respective P-frames
1040	    are recoverable. This prevents situations in which, for example,
1041	    the B-frames have been received correctly, but the P-frames have
1042	    been lost, i.e. it ensures a gradual decrease in application
1043	    quality on the frame level as well. Of course, a similar encoding
1044	    is possible with ULP, but in this case one might have to send
1045	    several frames within one packet, which leads to large packet
1046	    sizes.
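The dependency property described above can be sketched with a simplified model. The sketch assumes that every frame class is protected by codewords that span all packets of one TB, with one symbol per packet, so that a class with k information symbols per codeword is recoverable exactly when at least k packets of the TB arrive. The class names, the parameter values, and the helper functions are hypothetical and serve only as illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ProtectionClass:
    name: str
    info_symbols: int  # k: information symbols per codeword; parity = n - k


def recoverable_classes(classes: List[ProtectionClass],
                        packets_per_block: int,
                        packets_lost: int) -> List[str]:
    """Names of the classes that can still be decoded after packet loss.

    An erasure code with n - k parity symbols tolerates up to n - k
    erasures, i.e. it needs at least k of the n packets.
    """
    received = packets_per_block - packets_lost
    return [c.name for c in classes if received >= c.info_symbols]


def respects_dependencies(classes: List[ProtectionClass]) -> bool:
    """True if protection never increases along the dependency order.

    'classes' must be listed in decoding order (I before P before B).
    A non-decreasing k means a later class is never recoverable unless
    every class it depends on is recoverable as well.
    """
    ks = [c.info_symbols for c in classes]
    return all(a <= b for a, b in zip(ks, ks[1:]))


# Hypothetical profile for a TB spread over 20 packets.
GOP_PROFILE = [
    ProtectionClass("I", 12),  # 8 parity symbols
    ProtectionClass("P", 16),  # 4 parity symbols
    ProtectionClass("B", 19),  # 1 parity symbol
]
```

With this profile, losing 3 of the 20 packets leaves the I-frame and the P-frames recoverable while the B-frames are lost, so quality degrades gradually and no class is ever received without the classes it depends on.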
1047	    Furthermore, decoding delay is also a crucial issue in
1048	    communications. Again, both approaches have different delay
1049	    properties: UXP introduces a decoding delay because a sufficient
1050	    number of correctly received packets is necessary to start
1051	    decoding of a TB. The delay in general depends on the dimensions
1052	    of the interleaver. This should be considered for any system
1053	    design which includes UXP.

1055	    With ULP, every correctly received media packet can be decoded
1056	    right away. However, a significant delay is introduced, if
1057	    packets are corrupted, because in this case one has to wait for
1058	    several redundancy packets. Thus, the delay is in general
1059	    dependent on the actual ULP-FEC-packet scheme and cannot be
1060	    considered in advance during the system design phase.
1061	    Finally, we want to point out that UXP uses RS codes, which are
1062	    known to be the most efficient type of block codes in terms of
1063	    erasure correction capability: the number of erasures that can
1064	    be corrected per codeword equals the number of parity symbols.

1066	 10. Intellectual Property Considerations
1067	    Siemens AG has filed patent applications that might possibly have
1068	    technical relations to this contribution.
1069	    On IPR related issues, Siemens AG refers to the Siemens Statement
1070	    on Patent Licensing, see http://www.ietf.org/ietf/IPR/SIEMENS-
1071	    General.

1073	 11. References
1074	    [1] J. Rosenberg and H. Schulzrinne, "An RTP Payload Format for
1075	    Generic Forward Error Correction", Request for Comments 2733,
1076	    Internet Engineering Task Force, Dec. 1999.
1077	    [2] A. Albanese, J. Bloemer, J. Edmonds, M. Luby, and M. Sudan,
1078	    "Priority encoding transmission", IEEE Trans. Inform. Theory,
1079	    vol. 42, no. 6, pp. 1737-1744, Nov. 1996.
1080	    [3] Shu Lin and Daniel J. Costello, Error Control Coding:
1081	    Fundamentals and Applications, Prentice-Hall, Inc., Englewood
1082	    Cliffs, N.J., 1983.
1083	    [4] W. Li, "Streaming video profile in MPEG-4", IEEE Trans. on
1084	    Circuits and Systems for Video Technology, vol. 11, no. 3,
1085	    pp. 301-317, March 2001.
1086	    [5] G. Blaettermann, G. Heising, and D. Marpe, "A Quality
1087	    Scalable Mode for H.26L", ITU-T SG16, Q.15, Q15-J24, Osaka, May
1088	    2000.
1089	    [6] F. Burkert, T. Stockhammer, and J. Pandel, "Progressive A/V
1090	    coding for lossy packet networks - a principle approach", Tech.
1091	    Rep., ITU-T SG16, Q.15, Q15-I36, Red Bank, N.J., Oct. 1999.
1092	    [7] Guenther Liebl, "Modeling, theoretical analysis, and coding
1093	    for wireless packet erasure channels", Diploma Thesis, Inst. for
1094	    Communications Engineering, Munich University of Technology,
1095	    1999.
1096	    [8] U. Horn, K. Stuhlmuller, M. Link, and B. Girod, "Robust
1097	    Internet video transmission based on scalable coding and unequal
1098	    error protection", Image Com., vol. 15, no. 1-2, pp. 77-94, Sep.
1099	    1999.
1101	    [9] S. Wenger, "H.26L over IP: The IP-Network Adaptation Layer",
1102	    Packet Video 2002, Pittsburgh, Pennsylvania, USA,
1103	    April 24-26, 2002.

1104	 12. Acknowledgments
1105	    Many thanks to Philippe Gentric, Stephen Casner, and Hermann
1106	    Hellwagner for helpful comments and improvements. The authors
1107	    would like to thank Thomas Stockhammer who came up with the
1108	    original idea of UXP. Also, the help of Gero Baese, Frank
1109	    Burkert, and Minh Ha Nguyen in the development of UXP is
1110	    gratefully acknowledged.

1112	 13. Authors' Addresses
1113	    Guenther Liebl
1114	    Institute for Communications Engineering (LNT)
1115	    Munich University of Technology
1116	    D-80290 Munich
1117	    Germany
1118	    Email: liebl@lnt.e-technik.tu-muenchen.de

1120	    Marcel Wagner, Juergen Pandel, Wenrong Weng
1121	    Siemens AG - Corporate Technology CT IC 2
1122	    D-81730 Munich
1123	    Germany
1124	    Email:
1125	    {marcel.wagner,juergen.pandel,wenrong.weng}@mchp.siemens.de

1127	 Full Copyright Statement
1128	    "Copyright (C) The Internet Society (date). All Rights Reserved.
1129	    This document and translations of it may be copied and furnished
1130	    to others, and derivative works that comment on or otherwise
1131	    explain it or assist in its implementation may be prepared,
1132	    copied, published and distributed, in whole or in part, without
1133	    restriction of any kind, provided that the above copyright notice
1134	    and this paragraph are included on all such copies and derivative
1135	    works. However, this document itself may not be modified in any
1136	    way, such as by removing the copyright notice or references to
1137	    the Internet Society or other Internet organizations, except as
1138	    needed for the purpose of developing Internet standards in which
1139	    case the procedures for copyrights defined in the Internet
1140	    Standards process must be followed, or as required to translate
1141	    it into languages other than English.
1142	    The limited permissions granted above are perpetual and will not
1143	    be revoked by the Internet Society or its successors or assigns.
1144	    This document and the information contained herein is provided on
1145	    an "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET
1146	    ENGINEERING TASK FORCE DISCLAIMS ALL WARRANTIES; EXPRESS OR
1147	    IMPLIED; INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE
1148	    OF INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
1149	    WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR
1150	    PURPOSE.