idnits 2.17.1 draft-ietf-avt-uxp-07.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- ** It looks like you're using RFC 3978 boilerplate. You should update this to the boilerplate described in the IETF Trust License Policy document (see https://trustee.ietf.org/license-info), which is required now. -- Found old boilerplate from RFC 3667, Section 5.1 on line 17. -- Found old boilerplate from RFC 3978, Section 5.5 on line 1329. -- Found old boilerplate from RFC 3979, Section 5, paragraph 1 on line 1340. -- Found old boilerplate from RFC 3979, Section 5, paragraph 2 on line 1347. -- Found old boilerplate from RFC 3979, Section 5, paragraph 3 on line 1353. ** Found boilerplate matching RFC 3978, Section 5.4, paragraph 1 (on line 1320), which is fine, but *also* found old RFC 2026, Section 10.4C, paragraph 1 text on line 37. ** The document seems to lack an RFC 3978 Section 5.1 IPR Disclosure Acknowledgement -- however, there's a paragraph with a matching beginning. Boilerplate error? ** This document has an original RFC 3978 Section 5.4 Copyright Line, instead of the newer IETF Trust Copyright according to RFC 4748. ** This document has an original RFC 3978 Section 5.5 Disclaimer, instead of the newer disclaimer which includes the IETF Trust according to RFC 4748. ** The document uses RFC 3667 boilerplate or RFC 3978-like boilerplate instead of verbatim RFC 3978 boilerplate. After 6 May 2005, submission of drafts without verbatim RFC 3978 boilerplate is not accepted. The following non-3978 patterns matched text found in the document. That text should be removed or replaced: By submitting this Internet-Draft, I certify that any applicable patent or other IPR claims of which I am aware have been disclosed, or will be disclosed, and any of which I become aware will be disclosed, in accordance with RFC 3668. 
Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- == No 'Intended status' indicated for this document; assuming Proposed Standard == The page length should not exceed 58 lines per page, but there was 28 longer pages, the longest (page 2) being 114 lines == It seems as if not all pages are separated by form feeds - found 0 form feeds but 29 pages Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- No issues found here. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the RFC 3978 Section 5.4 Copyright Line does not match the current year -- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. If not, you may need to add the pre-RFC5378 disclaimer. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- Couldn't find a document date in the document -- date freshness check skipped. Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) == Missing Reference: 'Laz04' is mentioned on line 758, but not defined == Unused Reference: 'Wen02' is defined on line 1275, but no explicit reference was found in the text ** Obsolete normative reference: RFC 2733 (Obsoleted by RFC 5109) -- Possible downref: Non-RFC (?) normative reference: ref. 
'Lin83' ** Obsolete normative reference: RFC 3555 (Obsoleted by RFC 4855, RFC 4856) ** Obsolete normative reference: RFC 2327 (Obsoleted by RFC 4566) Summary: 9 errors (**), 0 flaws (~~), 6 warnings (==), 8 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 1 Internet Engineering Task Force G. Liebl 2 Internet Draft LNT, Munich Univ. of 3 Technology 4 Document: draft-ietf-avt-uxp-07.txt 5 October 2004 M. Wagner, J. Pandel, 6 W. Weng 7 Expires: April 2005 Siemens AG, Munich 9 An RTP Payload Format for Erasure-Resilient Transmission of 10 Progressive Multimedia Streams 12 Status of this Memo 14 By submitting this Internet-Draft, I certify that any applicable 15 patent or other IPR claims of which I am aware have been 16 disclosed, and any of which I become aware will be disclosed, in 17 accordance with RFC 3668. 19 By submitting this Internet-Draft, I accept the provisions of 20 Section 3 of RFC 3667 22 Internet-Drafts are working documents of the Internet Engineering 23 Task Force (IETF), its areas, and its working groups. Note that 24 other groups may also distribute working documents as Internet- 25 Drafts. Internet-Drafts are draft documents valid for a maximum 26 of six months and may be updated, replaced, or obsoleted by other 27 documents at any time. It is inappropriate to use Internet- 28 Drafts as reference material or to cite them other than as "work 29 in progress." 30 The list of current Internet-Drafts can be accessed at 31 http://www.ietf.org/ietf/1id-abstracts.txt 32 The list of Internet-Draft Shadow Directories can be accessed at 33 http://www.ietf.org/shadow.html. 35 Copyright Notice 37 Copyright (C) The Internet Society (2004). All Rights Reserved. 
39 Abstract

41 This document specifies an efficient way to ensure erasure-
42 resilient transmission of progressively encoded multimedia
43 sources via RTP using Reed-Solomon (RS) codes together with
44 interleaving. The level of erasure protection can be explicitly
45 adapted to the importance of the respective parts in the source
46 stream, thus allowing a graceful degradation of application
47 quality with increasing packet loss rate on the network. Hence,
48 this type of unequal erasure protection (UXP) scheme is intended
49 to cope with the rapidly varying channel conditions on wireless

51 Liebl,Wagner,Pandel,Weng [Page 1]

52 access links to the Internet backbone. Furthermore, protection of
53 non-progressive multimedia streams is ensured, since equal
54 erasure protection (EXP) represents a subset of generic UXP. By
55 applying interleaving and RS codes, a payload format is defined
56 which can be easily integrated into the existing framework for
57 RTP.

59 Table of Contents

61 1. Introduction.................................................2
62 2. Conventions used in this Document............................4
63 3. Preliminaries................................................4
64 4. General UXP Concept..........................................8
65 5. RTP payload structure.......................................14
66 6. Indication of UXP in SDP....................................21
67 7. Security Considerations.....................................22
68 8. IANA Considerations.........................................22
69 9. Application Statement.......................................25
70 10. Intellectual Property Considerations.......................26
71 11. References.................................................27
72 12. Acknowledgments............................................27
73 13. Author's Addresses.........................................28

75 1. 
Introduction 77 Due to the increasing popularity of high-quality multimedia 78 applications over the Internet and the high level of public 79 acceptance of existing mobile communication systems, there is a 80 strong demand for a future combination of these two techniques: 81 One possible scenario consists of an integrated communication 82 environment, where users can set up multimedia connections 83 anytime and anywhere via radio access links to the Internet. 84 For this reason, several packet-oriented transmission modes like 85 EGPRS (Enhanced General Packet Radio Service) or UMTS (Universal 86 Mobile Telecommunications System) can be used, which are mostly 87 based on the same principle: Long message blocks, i.e. IP 88 packets, that enter the wireless part of the network are split up 89 into segments of desired length, which can be multiplexed onto 90 link layer packets of fixed size. The latter are then transmitted 91 sequentially over the wireless link, reassembled, and passed on 92 to the next network element. 93 However, compared to the rather benign channel characteristics on 94 today's fixed networks, wireless links suffer from severe fading, 95 noise, and interference conditions in general, thus resulting in 96 a comparably high residual bit error rate after detection and 97 decoding. By use of efficient CRC-mechanisms, these bit errors 98 are usually detected with very high probability, and every 99 corrupted segment, i.e. which contains at least one erroneous 100 bit, is discarded to prevent error propagation through the 101 network. But if only one single segment is missing at the 102 reassembly stage, the upper layer IP packet cannot be 103 reconstructed anymore. The result is a significant increase in 104 packet loss rate at IP level. 
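The loss amplification described above can be made concrete with a small calculation. The sketch below assumes, purely for illustration, that link layer segments are discarded independently of each other with a fixed probability; real wireless links tend to show bursty losses, so the result is indicative only. The function name and its parameters are hypothetical and not part of this specification.

```python
def ip_packet_loss_rate(segment_loss_rate: float, segments_per_packet: int) -> float:
    """Probability that an IP packet cannot be reassembled.

    The packet is split into `segments_per_packet` link layer segments and
    is lost as soon as a single segment is missing.

    Assumption (not taken from the draft): segments are lost independently,
    each with probability `segment_loss_rate`.
    """
    if not 0.0 <= segment_loss_rate <= 1.0:
        raise ValueError("segment_loss_rate must lie in [0, 1]")
    if segments_per_packet < 1:
        raise ValueError("segments_per_packet must be at least 1")
    # The packet survives only if every one of its segments survives.
    return 1.0 - (1.0 - segment_loss_rate) ** segments_per_packet
```

Under this assumption, a segment loss rate of 1% and 10 segments per IP packet already lead to roughly 9.6% lost IP packets.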
105 Since most multimedia applications can only recover from a very 106 limited number of lost IP packets, it is vitally necessary to 107 keep packet loss at IP level within a certain acceptable range 108 depending on the individual quality-of-service requirements. 109 However, due to the delay constraints typically imposed by most 110 audio or video codecs, the use of ARQ-schemes is often prohibited 111 both at link level and at transport level. In addition, 112 retransmission strategies cannot be applied to any broadcast or 113 multicast scenarios. Thus, forward erasure correction strategies 114 have to be considered, which provide a simple means to 115 reconstruct the content of lost packets at the receiver from the 116 redundancy that has been spread out over a certain number of 117 consecutive packets. 118 There already exist some previous studies and proposals regarding 119 erasure-resilient packet transmission [RFC2733,Hor99]. Since most 120 of them are based on the assumption that all parts in a message 121 block are equally important to the receiver, i.e. the respective 122 application cannot operate on partly complete blocks, they were 123 optimized with respect to assigning equal erasure protection over 124 the whole message block. However, recent developments both in 125 audio and video coding have introduced the notion of 126 progressively encoded media streams, for which unequal erasure 127 protection strategies seem to be more promising, as it will be 128 explained in more detail below. Although the scheme defined in 129 [RFC2733] is in principle capable of supporting some kind of 130 unequal erasure protection, possible implementations seem to be 131 quite complex with respect to the gain in performance. Finally, 132 in [RFC2733] it is assumed that consecutive RTP packets can have 133 variable length, which would cause significant segmentation 134 overhead at the link layer of almost all wireless systems. 
136 This document defines a payload format for RTP, such that 137 different elements in a progressively encoded multimedia stream 138 can be protected against packet erasures according to their 139 respective quality-of-service requirement. The general principle, 140 including the use of Reed-Solomon codes together with an 141 appropriate interleaving scheme for adding redundancy, follows 142 the ideas already presented in [Alb96], but allows for finer 143 granularity in the structure of the progressive media stream. The 144 proposed scheme is generic in the way that it (1) is independent 145 of the type of media stream, be it audio or video, and (2) can be 146 adapted to varying transmission quality very quickly by use of 147 inband-signaling. 149 2. Conventions used in this Document 151 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL 152 NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and 153 "OPTIONAL" in this document are to be interpreted as described in 154 RFC-2119. 156 3. Preliminaries 158 The purpose of this section is to provide some preliminaries 159 which are important for understanding the UXP scheme. First, some 160 definitions used throughout this document are given. Next, Reed- 161 Solomon Codes are introduced. Finally, progressive source coding 162 and the resulting properties of progressive bitstreams are 163 discussed. 165 3.1 Definitions 167 The following terms are used throughout this document: 168 1.) Segment: denotes a link layer transport unit. 169 2.) Segmentation/Reassembly Process: If the size of the 170 transport units at the link layer is smaller than that at 171 the upper layers, message blocks have to be split up into 172 several parts, i.e. segments, which are then transmitted 173 subsequently over the link. If nothing is lost, the original 174 message block can be restored at the receiving entity 175 (reassembly). 176 3.) 
Codec: denotes a functional pair consisting of a source
177      encoding unit at the sender and a corresponding source
178      decoding unit at the receiver; usually standardized for
179      different media applications like audio or video.
180 4.) Media stream: A bitstream which results at the output of an
181      encoder for a specific media type, e.g. H.263, MPEG-4
182      Visual.
183 5.) Progressive media stream: A media stream which can be
184      divided into successive elements. The distinct elements are
185      of different importance to the decoding process and are
186      commonly ordered from highest to lowest importance, where
187      later elements depend on the earlier ones.
188 6.) Progressive source coding: results in a progressive media
189      stream.
190 7.) Reed-Solomon (RS) code: belongs to the class of linear
191      nonbinary block codes, and is uniquely specified by the
192      block length n, the number of parity symbols t, and the
193      symbol alphabet.
194 8.) n: is a variable, which denotes both the block length of an
195      RS codeword, and the number of columns in a TB (see 16).
196 9.) k: is a variable, which denotes the number of information
197      symbols in an RS codeword.

199 10.) t: is a variable, which denotes the number of parity symbols
200      in an RS codeword.
201 11.) Erasure: When a packet is lost during transmission, an
202      erasure is said to have happened. Since the position of the
203      erased packet in a sequence is usually known, a
204      corresponding erasure marker can be set at the receiving
205      entity.
206 12.) Base layer: comprises the first and most important elements
207      of the progressive media stream, without which all
208      subsequent information is useless.
209 13.) Enhancement layer: comprises one or more sets of the less
210      important subsequent elements of the progressive media
211      stream. A specific enhancement layer can be decoded, if and
212      only if the base layer and all previous enhancement layer
213      data (of higher importance) are available.
214 14.) 
Info stream: denotes the bitstream which has to be protected
215      by the UXP scheme. It usually consists of the media stream
216      (progressively source encoded or not), which is arranged
217      according to a desired syntax (e.g. to achieve an
218      appropriate framing, see Sect. 5.4). In any case, it is
219      assumed that every info stream is already octet-aligned
220      according to the standard procedures defined in the context
221      of the used syntax specifications.
222 15.) Info octet: Denotes one element of the info stream.
223 16.) Transmission block (TB): denotes a memory array of L rows
224      and n columns. Each row of a TB represents an RS codeword,
225      whereas each column, together with the respective UXP header
226      (see Sect. 5.2) in front, forms the payload of a single RTP
227      packet. Each TB consists of at least two distinct transmission
228      sub blocks (TSB, see 17): The first L_s rows belong to the
229      signaling TSB, whereas the last L_d=(L-L_s) rows belong to
230      one or more data TSBs.
231 17.) Transmission sub block (TSB): denotes a memory array of
232      0

319 +-+-+-+-+-+-+-+-+-+
320 |&|&|&|&|&|&|&|*|*|
321 +-+-+-+-+-+-+-+-+-+
322 <------------><--->
323     k=n-t       t
324    (&:info)  (*:parity)

326 Fig. 1: Structure of a systematic RS codeword

328 3.3 Progressive Source Coding

330 The output of an encoder for a specific media type, e.g. H.263 or
331 MPEG-4 Visual, is said to be a media stream. If the media stream
332 consists of several distinct elements, which are of different
333 importance with respect to the quality of the decoding process at
334 the receiver, then the media stream is progressive. The
335 progressive media stream is often organized in separate layers.
336 Hence, there exists at least one layer, often called the base
337 layer, without which decoding fails altogether, whereas all the
338 other layers, often called enhancement layers, just help to
339 continually improve the quality. 
Consequently, the different layers are 340 usually contained in the (source-)encoded media stream in 341 decreasing order of importance, i.e. the base layer data is 342 followed by the various enhancement layers. 343 An example can be found in the fine granular scalability modes 344 which have been proposed to various standardization bodies like 345 MPEG, where the resolution of the scaling process in the 346 progressive source encoder is as low as one symbol in the 347 enhancement layer [Li01]. Another example is given by data 348 partitioning which can be applied to the ITU/MPEG H.264/AVC 349 standard [Bla00], MPEG-4, and H.263++. Also, the existence of 350 I,P, and B frames in streams which comply with standards like 351 MPEG-2 can be interpreted as progressive. 352 From the above definition, it is quite obvious that the most 353 important base layer data must be protected as strongly as 354 possible against packet loss during transmission. However, the 355 protection of the enhancement layers can be continually lowered, 356 since a loss at these stages has only minor consequences for the 357 decoding process. Thus, by using a suitable unequal erasure 358 protection strategy across a progressive media stream, the 359 overhead due to redundancy is reduced. Furthermore, if channel 360 conditions get worse during transmission (resulting in a higher 361 number of corrupt segments and thus higher IP packet loss rate), 362 only more and more enhancement layers are lost, i.e. a graceful 363 degradation in application quality at the receiver is achieved 364 [Bur99]. 365 Nevertheless, it should be mentioned that the specific structure 366 of the media stream strongly depends on the actual media codec in 367 use and does not always provide suitable mechanisms for transport 368 over data networks, like framing (see also Sect. 5.4 ). In order 369 to keep the description of the unequal erasure protection 370 strategy in Sect. 
4 as general as possible, the final bitstream
371 which has to be protected by the proposed UXP scheme will be
372 called "info stream" in the following. Furthermore, it is assumed
373 that every info stream is already octet-aligned according to the
374 standard procedures defined in the context of the used syntax
375 specifications.

377 4. General UXP Concept

379 In this section, the principal features of the proposed UXP
380 scheme are described with a special focus on the protection and
381 reconstruction procedure which is applied to the info stream. In
382 addition, the behavior of the sender and receiver is specified as
383 far as it concerns the reconstruction of the info stream.
384 However, the complete UXP payload structure, including the
385 additional UXP header, is described in Sect. 5.
386 The reason for using the term "info stream", as well as the
387 details of the construction, are described in Sect. 5.4. For
388 now, we assume that we have an info stream which has to be
389 protected.

391 4.1 Transmission Block Structure

393 Fig. 1 already illustrated the structure of a systematic RS
394 codeword, which shall be represented by a single row with n
395 successive symbols that contain the information and the parity
396 octets. This structure shall now be extended by forming a
397 transmission block (TB) consisting of L codewords of length n
398 octets each, which amounts to a total of L rows and n columns
399 [Lie99]: Each column, together with the respective UXP header in
400 front, shall represent the payload of an RTP packet, i.e. the
401 whole data of a TB is transmitted via a sequence of n RTP packets
402 all carrying a payload of length (L+2) octets (UXP header
403 included).
404 Each TB usually consists of two or more horizontal sub blocks,
405 the so-called transmission sub blocks (TSB), as can be seen in
406 Fig. 
2: The first L_s rows always belong to the signaling TSB,
407 which is used to convey the actual redundancy profile in the data
408 part to the receiver (see Sect. 5.5). The following L_d=(L-L_s)
409 rows belong to one or more data TSBs, which contain the interleaved
410 and RS encoded info stream, as will be described below.

412             Transmission Block (TB)

414            /\ +-+-+-+-+-+-+-+-+-+ /\
415            |  |  signaling TSB  | |  L_s octets
416            |  +-+-+-+-+-+-+-+-+-+ \/
417            |  |                 | /\                /\
418            |  +   data TSB #1   + |  L_d(1) octets  |
419            |  |                 | |                 |
420            |  +-+-+-+-+-+-+-+-+-+ \/                |
421 L octets   |  |                 | /\                |
422 payload    |  +   data TSB #2   + |  L_d(2) octets  |
423 per packet |  |                 | |                 | L_d oct.
424            |  +-+-+-+-+-+-+-+-+-+ \/                |
425            |  |        .        | .                 |
426            |  +        .        + .                 |
427            |  |        .        | .                 |
428            |  +-+-+-+-+-+-+-+-+-+ /\                |
429            |  |   data TSB #z   | |  L_d(z) octets  |
430            \/ +-+-+-+-+-+-+-+-+-+ \/                \/
431               <----------------->
432                    n packets
433 Fig. 2: General structure of a TB

435 Since the UXP procedure is mainly applied to the data TSBs, it
436 will be described next, whereas the content and syntax of the
437 signaling TSB will be defined in section 5.5.

439 4.2 TB Fill Procedure

441 For simplicity, only one single data TSB will be
442 assumed throughout the following explanation of the encoding and
443 decoding procedure. However, an extension to more than one data
444 TSB per TB is straightforward, and will be shown in section 5.6.
445 In the following description, we need an info stream which is
446 filled into a TSB. In order to make clear how the filling works
447 in detail, we denote the octets of a stream as described in Fig.
448 3.

450 Octet pos:  0  1  2  3  ...               10             15
451            +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
452 Octet      |x0|x1|x2|x3|x4|x5|x6|x7|x8|x9|xA|xB|xC|xD|xE|xF|

454 Octet pos:  16 ...                                       31
455            +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
456 Octet      |xG|xH|...................................|xU|xV|

458 Octet pos:  32                                  44
459            +--+--+--+--+--+--+--+--+--+--+--+--+--+
460 Octet      |xW|xX|xY|xZ|y0|y1|..................y8|

462 Figure 3: Exemplary info stream

464 This means, for example, that the octet at position 10 in the
465 info stream is denoted by xA. The info stream is progressive,
466 which means that the octets at the beginning of the stream are
467 more important than the octets later in the stream.

469 As depicted in Fig. 4, the rows of a transmission sub block shall
470 be assembled into T+1 different classes EPC_i, where i=0...T,
471 such that each class contains exactly R_i=|EPC_i| consecutive
472 rows of the matrix, where the R_i have to satisfy the following
473 relationship:
474 R_0+R_1+...+R_T=L_d

476           Data Transmission Sub Block (data TSB)
477                                     T
478                               <----------->
479            /\  +--+--+--+--+--+--+--+--+--+ /\
480            |   |x0|x1|x2|x3|x4|* |* |* |* | |
481            |   +--+--+--+--+--+--+--+--+--+ |  R_T=3
482            |   |x5|x6|x7|x8|x9|* |* |* |* | |
483            |   +--+--+--+--+--+--+--+--+--+ |
484 L_d octets |   |xA|xB|xC|xD|xE|* |* |* |* | \/
485 per packet |   +--+--+--+--+--+--+--+--+--+ /\
486            |   |xF|xG|xH|xI|xJ|xK|* |* |* | |  R_(T-1)=1
487            |   +--+--+--+--+--+--+--+--+--+ \/
488            |   |xL|xM|xN|xO|xP|xQ|xR|* |* | .
489            |   +--+--+--+--+--+--+--+--+--+ .
490            |   |xS|xT|xU|xV|xW|xX|xY|xZ|* | .
491            |   +--+--+--+--+--+--+--+--+--+ /\
492            |   |y0|y1|y2|y3|y4|y5|y6|y7|y8| |  R_0=1
493            \/  +--+--+--+--+--+--+--+--+--+ \/
494                <-------------------------->
495                         n packets
496 x#,y# : info octets belonging to the info stream defined in Fig.
497         3
498 *     : parity octets gained from Reed-Solomon coding

500 Fig. 4: General structure for coding with unequal erasure
501 protection

503 Furthermore, all rows in a particular class EPC_i shall contain
504 exactly the same number of parity octets, which is equal to the
505 index i of the class. 
For each row in a certain class EPC_i, the
506 same (n,n-i) RS code shall be applied.
507 As can be observed from Fig. 4, class EPC_T contains the largest
508 number of parity octets per row, i.e. offers the highest erasure
509 protection capability in the block. Consequently, the most
510 important elements in the info stream must be assigned to class
511 EPC_T, where the value of T should be chosen according to the
512 desired outage threshold of the application given a certain
513 packet erasure rate on the link.
514 All other classes EPC_(T-1)...EPC_0 shall be sequentially filled
515 with the remaining elements of the info stream in decreasing
516 order of importance as follows: The info stream is filled into
517 the TSB row by row, from the top row to the bottom row, and
518 within each row from left to right. The result of this
519 procedure is shown in Fig. 4.

521 In the following, we describe a set of rules containing a compact
522 description of all the operations that must be performed for each
523 transmission block at the sender and receiver.

525 4.3 UXP Sender Rules

527 1) The total number of columns n of the TB shall be chosen
528    according to the actual delay constraints of the application.
529 2) The maximum erasure correction capability T and the R_i in the
530    data TSB should be chosen according to the desired outage
531    threshold of the application given the actual packet erasure
532    rate on the link and the properties of the info streams.
533    However, the resulting number of TSB rows,
534    L_d=R_0+R_1+...+R_T, should be kept in mind since it has major
535    influence on the packet size of the resulting RTP packets (cf.
536    Sect. 5).
537 3) Any suitable optimization algorithm may be used for deriving
538    adequate values for T and all R_i. However, the result has to
539    satisfy the following constraints:
540    a. All available info octet positions in the data TSB have
541       to be completely filled. 
If the info stream is too short 542 for a desired profile, media stuffing may be applied to 543 the empty info octet positions at the end of the data TSB 544 by appending a sufficient number of stuffing octets. The 545 stuffing octets MUST have the value 0x00. The actual 546 number of stuffing symbols per data TSB is then signaled 547 via the respective stuffing indicator (see Sect. 5.5.). 548 b. The info stream SHOULD be fully contained within the data 549 TSB (unless cutting it off at a specific point is 550 explicitly allowed by the properties of the info stream). 551 4) For each nonempty class EPC_i, i=T...0, in the data TSB, the 552 following steps have to be performed: 553 a. All rows of this specific class SHALL be filled from left 554 to right and top to bottom with data octets of the info 555 stream as shown in Fig. 4. 556 b. For each row in the class, the required i parity-check 557 octets are computed from the same set of codewords of an 558 (n,n-i) RS code, and filled in the empty positions at the 559 end of each row. Thus, every row in the class constitutes 560 a valid codeword of the chosen RS code. 562 5) After having filled the whole data TSB with information and 563 parity octets, the redundancy profile is mapped to the 564 signaling TSB as described in section 5.5. 565 6) Each column of the resulting TB is now read out octet-wise 566 from top to bottom and, together with the respective UXP 567 header (see section 5.2) in front, is mapped onto the payload 568 section of one and only one RTP packet. 569 7) The n resulting RTP packets SHALL be transmitted consecutively 570 to the remote host, starting with the leftmost one. 572 4.4 UXP Receiver Rules 574 1) At the corresponding protocol entity at the remote host, the 575 payload (without the UXP header) of all successfully received 576 RTP packets belonging to the same sending TB SHALL be filled 577 into a similar receiving TB column-wise from top to bottom and 578 left to right. 
579 2) For every erased packet of a received TB, the respective 580 column in the TB SHALL be filled with a suitable erasure 581 marker. 582 3) Before any other operations can be performed, the redundancy 583 profile MUST be restored from the signaling TSB according to 584 the procedure defined in Sect. 5.5.. If the attempt fails 585 because of too many lost packets, the whole TB SHALL be 586 discarded and the receiving entity should wait for the next 587 incoming TB. 588 4) If the attempt to recover the redundancy profile has been 589 successful, a decoding operation SHALL be performed for each 590 row of the data TSB by applying any suitable algorithm for 591 erasure decoding. 592 5) For all rows of the data TSB for which the decoding operation 593 has been successful, the reconstructed data octets are read 594 out from left to right and top to bottom, and appended to the 595 reconstructed version of the info stream. 597 4.5 Protection Properties of UXP 599 One can easily realize that the above rules describe an 600 interleaved coding scheme, i.e. at the sender a single codeword 601 of a TB is spread out over n successive packets. Thus, each 602 codeword of a transmitted TB experiences the same number of 603 erasures at exactly the same positions. 604 Two important conclusions can be drawn from this: 605 a) Since the same RS code is applied to all rows contained in a 606 specific class, either all of them can be correctly decoded or 607 none. Hence, there exist no partly decodable classes at the 608 receiver. 609 b) If decoding is successful for a certain class EPC_i, all the 610 classes EPC_(i+1)...EPC_T can also be decoded, since they are 611 protected by at least one more parity octet per row. Together 612 with rule 6, it is therefore always ensured, that in case a 613 decodable enhancement layer exists, all other layers it depends 614 on can also be reconstructed! 
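The fill procedure of Sect. 4.2 and the protection properties above can be sketched as follows. This is a minimal illustration under stated assumptions, not a reference implementation: it lays out the info octets row by row, leaves the parity positions as placeholders (no Reed-Solomon arithmetic is performed), and counts how many leading info octets remain recoverable for a given number of erased packets. The function names and the `row_counts` tuple, which lists (R_0, ..., R_T) by class index, are hypothetical.

```python
from typing import List, Optional, Sequence

Row = List[Optional[int]]  # info octets, then None for each parity position


def info_capacity(row_counts: Sequence[int], n: int) -> int:
    """Number of info octets that fit into a data TSB with n columns.

    row_counts[i] is R_i, the number of rows in class EPC_i; every row of
    EPC_i carries n - i info octets and i parity octets.
    """
    if len(row_counts) - 1 >= n:
        raise ValueError("T must be smaller than n")
    return sum(r * (n - i) for i, r in enumerate(row_counts))


def fill_data_tsb(info: bytes, row_counts: Sequence[int], n: int) -> List[Row]:
    """Lay out the info stream top to bottom: EPC_T first, EPC_0 last."""
    capacity = info_capacity(row_counts, n)
    if len(info) > capacity:
        raise ValueError("info stream does not fit into the data TSB")
    padded = info + bytes(capacity - len(info))  # stuffing octets 0x00
    rows: List[Row] = []
    pos = 0
    for i in range(len(row_counts) - 1, -1, -1):  # i = T, T-1, ..., 0
        for _ in range(row_counts[i]):
            k = n - i  # info octets per row in class EPC_i
            rows.append(list(padded[pos:pos + k]) + [None] * i)
            pos += k
    return rows


def recoverable_info_octets(row_counts: Sequence[int], n: int,
                            erased_packets: int) -> int:
    """Leading info octets still decodable after `erased_packets` losses.

    An (n, n-i) RS code repairs up to i erasures, so exactly the classes
    with i >= erased_packets survive; they occupy the top rows of the TSB.
    """
    if not 0 <= erased_packets <= n:
        raise ValueError("erased_packets must lie in [0, n]")
    return sum(r * (n - i) for i, r in enumerate(row_counts)
               if i >= erased_packets)
```

With the profile of Fig. 4, i.e. `row_counts = (1, 1, 1, 1, 3)` and `n = 9`, the data TSB holds 45 info octets; after two erased packets the first 28 octets (classes EPC_4, EPC_3 and EPC_2) remain recoverable.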
616 4.6 Description of the Redundancy Profile by Erasure Protection 617 Vectors 619 Given the maximum erasure protection value T, the redundancy 620 profile for a data TSB of size (L_d x n) SHALL be denoted by a 621 so-called erasure protection vector EPV of length (T+1), where 622 EPV:=(R_0,R_1,...,R_(T-1),R_T) 623 From the above definition, it is easy to realize that the trivial 624 cases of no erasure protection and EXP are a subset of UXP: 625 a) no erasure protection at all: all application data is mapped 626 onto 627 class EPC_0, i.e. EPV=(L_d,0,0,...,0). 628 b) EXP: all application data is mapped onto class EPC_T, i.e. 629 EPV=(0,0,...,0,R_T=L_d). 630 Hence, the UXP payload format can also be used with info streams 631 which are non progressive. 633 5. RTP payload structure 635 This section is organized as follows: First, the specific 636 settings in the RTP header are shown. Next, the RTP payload 637 header for UXP (the so-called UXP header) is specified. After 638 that, the structure of the bitstream which is protected by UXP, 639 the so-called info stream, is discussed. Finally, the in-band 640 signaling of the erasure protection vector is introduced. 641 For every packet, the UXP payload is formed by reading out a 642 column of the TB and prefixing it with the UXP header. Thus, a 643 UXP-compliant RTP packet looks as follows: 645 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+- 646 |RTP Header| UXP Header| one column of the TB | 647 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+- 649 5.1 Specific Settings in the RTP Header 651 The timestamp of each RTP packet SHALL be set to the sampling 652 timestamp of the first octet of the progressive media stream in 653 the corresponding TB. The clock rate MUST be the same as defined 654 in the RTP payload format for the progressive media stream. 655 If several data TSBs are included in one TB, the sampling 656 timestamp of data TSB #1 SHALL be relevant. 
This results in the 657 TS value being the same for all RTP packets belonging to a 658 specific TB. 659 The payload type SHALL be of dynamic type, and obtained through 660 out-of-band signaling similar to [RFC2733]. End systems, which 661 cannot recognize a payload type, MUST discard it. 662 The marker bit SHOULD be set to 1 in the last packet of a TB; 663 otherwise, its value SHOULD be 0. 665 5.2 Structure of the UXP Header 667 The UXP header SHALL consist of 2 octets, and is shown in Fig. 5:
669  0                   1
670  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
671 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
672 |X|  block PT   |  TB indicator |
673 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
675 Fig. 5: Proposed UXP header
677 The fields in the UXP header are defined as follows: 678 - X (bit 0): extension bit, reserved for future enhancements, 679 currently not in use -> default value: 0 680 - block PT (bits 1-7): regular RTP payload type to indicate the 681 media type contained in the info stream 682 - TB indicator (bits 8-15): This field indicates the size and 683 position of one TB within a stream of RTP packets. The 684 interpretation depends on the actual RTP sequence number of this 685 packet. We denote the TB this packet belongs to as the current 686 TB. Then there are two cases: 687 1) If the sequence number is even, it indicates the total 688 number of RTP packets within the current TB (which equals 689 the number of columns of the current TB). 690 2) Otherwise, it indicates the sequence number of the first 691 RTP packet of the current TB. Since it is only one octet, 692 it contains the least significant octet of the sequence 693 number. 695 The syntax of the info stream which is protected by UXP is 696 specified by the RTP payload type field contained in the UXP 697 header. The details of the info stream are described in Sec. 5.4.
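As an illustration of the header layout in Fig. 5, the following sketch packs and parses the two header octets and selects the TB indicator value according to the parity of the RTP sequence number. All function names are hypothetical and not defined by this specification. The sketch assumes that bit 0 is the most significant bit of the first octet (the usual convention for such diagrams) and that the number of packets per TB fits into one octet.

```python
from typing import Tuple


def pack_uxp_header(block_pt: int, tb_indicator: int, x: int = 0) -> bytes:
    """Build the 2-octet UXP header: X (1 bit), block PT (7 bits), TB indicator (8 bits)."""
    if x not in (0, 1):
        raise ValueError("X must be 0 or 1")
    if not 0 <= block_pt <= 0x7F:
        raise ValueError("block PT must fit into 7 bits")
    if not 0 <= tb_indicator <= 0xFF:
        raise ValueError("TB indicator must fit into 8 bits")
    return bytes([(x << 7) | block_pt, tb_indicator])


def parse_uxp_header(data: bytes) -> Tuple[int, int, int]:
    """Return (X, block PT, TB indicator) from the first two payload octets."""
    if len(data) < 2:
        raise ValueError("UXP header needs 2 octets")
    return data[0] >> 7, data[0] & 0x7F, data[1]


def tb_indicator_for(seq: int, tb_first_seq: int, n: int) -> int:
    """Choose the TB indicator for the packet with RTP sequence number seq.

    Even sequence number: number of packets (columns) n of the current TB.
    Odd sequence number: least significant octet of the sequence number
    of the first packet of the current TB.
    """
    if not 0 <= n <= 0xFF:
        raise ValueError("assumption of this sketch: n fits into one octet")
    return n if seq % 2 == 0 else tb_first_seq & 0xFF


if __name__ == "__main__":
    header = pack_uxp_header(block_pt=99, tb_indicator=tb_indicator_for(1000, 997, 20))
    print(header.hex(), parse_uxp_header(header))
```

A receiver that sees one even-numbered and one odd-numbered packet of the same TB therefore learns both the width of the TB and the position of its first column.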
699 Based on the RTP sequence number, the marker bit, and the TB 700 indicator in each UXP header, the receiving entity is able to 701 recognize both TB boundaries and the actual position of packets 702 (both received and lost ones) in the TB. An example of how this can 703 be done is given in the next subsection. 705 5.3 Usage of UXP- and RTP-Header at the Receiver 707 This subsection describes how the UXP- and RTP-headers can be 708 used to reconstruct a TB. 709 Assume first that the receiver knows the sequence number of 710 the first RTP packet within a TB, i.e. the left column, and the 711 width of the TB. Then the column into which 712 the payload of an RTP packet has to be inserted follows directly from 713 the RTP sequence number. 714 However, the receiver does not know this in advance, since the TB 715 width can be changed each time a new TB is sent. In addition, the 716 RTP session starts with a random sequence number. Therefore, even 717 if the TB width is known at the beginning, the receiver does not 718 know whether the first packets were lost or not. It is then 719 wrong to interpret the first received packet as the first packet 720 in the TB. 721 Therefore, the combination of UXP header, RTP timestamp, and 722 marker bit will help the receiver to recover TB synchronization. 724 5.4 Framing and Timing Mechanism in UXP: The Info Stream 726 As described in Sect. 4, UXP creates its own packetization scheme 727 by interleaving. The regular framing and timing structure of RTP 728 is therefore destroyed. This section describes which kinds of 729 problems arise with interleaving and how they can be solved. This 730 finally leads to the specification of the info stream. 731 The timestamp of an RTP packet usually describes the sampling 732 time of the first octet included in the RTP data packet. This is 733 in principle also true for UXP RTP packets. According to the 734 timestamp definition in Sect.
5.1, every UXP RTP packet carries as its 735 timestamp the sampling time of the first octet in the 736 corresponding TB. Therefore, all packets which belong to one TB 737 contain the same RTP timestamp. This can lead to problems: 738 because a TB may be large (the limit for the 739 number of columns is 256, and the limit for the number of rows is 740 the maximum packet size), it can contain data from different 741 sampling time instances, e.g. several video frames. Then the 742 timing information of the later frames has to be determined from 743 the media stream itself and not from the RTP timestamp. 744 A second problem arising with interleaving is that the framing 745 mechanism of RTP is not supported. Since the payload of a single 746 RTP packet is not individually decodable, but 747 rather the whole stream is reconstructed from a full TB, the UXP 748 RTP packets cannot be used to provide information about the 749 start of different access units within the octet stream. 751 The framing and timing problems can be solved in several ways: 753 One solution would be to rely on the framing and 754 timing mechanism of the elementary media stream. This is, for 755 example, possible for media streams which contain start codes and 756 information about the frame rate. 757 A second solution could be to define a specific framing mechanism 758 for the info stream similar to [Laz04] and extend it by timing 759 information. A third possibility is to insert the RTP packets of 760 a media stream directly into a TB. 762 In this specification, we consider only the first solution, i.e. 763 to rely on the timing in the elementary media stream. Other 764 solutions have to be defined as extensions of this specification. 765 Therefore, an info stream in this specification SHALL be defined 766 as an elementary media stream which provides timing and framing. 768 5.5.
In-band Signaling of the Structure of the Redundancy Profile 770 To enable a dynamic adaptation to varying link conditions, the 771 actual redundancy profile used in the data TSB as well as the 772 beginning and end of a TSB must be signaled to the receiving 773 entity. Since out-of-band signaling either results in excessive 774 additional control traffic, or prevents quick changes of the 775 profile between successive TBs, an in-band signaling procedure is 776 desired. 777 Since without knowledge of the correct redundancy profile, the 778 decoding process cannot be applied to any of the erasure 779 protection classes, the redundancy profile has to be protected at 780 least as strongly as the most important element in the info 781 stream. Therefore, an additional class EPC_P is used in the 782 signaling TSB, where the number of parity symbols is by default 783 set to the following value: 784 P=ceil(n/2.0) 785 Hence, up to 50% of the RTP packets can be lost, before the 786 redundancy profile cannot be recovered anymore. This seems to be 787 a reasonable value for the lowest point of operation over a lossy 788 link. Alternatively, P may be explicitly signaled during session 789 setup by means of SDP or H.245 protocol. 790 Consequently, since all other classes must have equal or less 791 erasure protection capability, the maximum allowable value for 792 class EPC_T in the data TSB is now limited to T<=P. 793 The signaling of the erasure protection vector is accomplished by 794 means of descriptors. In the following we describe an efficient 795 encoding scheme for the descriptors. 796 For each class EPC_i with R_i>0, there is a descriptor DP_i 797 providing information about the size of class EPC_i (i.e. the 798 value of R_i) and establishing a relationship between the erasure 799 protection of class EPC_i and that of the class EPC_(i+j), where 800 j>0 and j is the smallest value for which R_(i+j)>0 is true. 
A 801 descriptor DP_i is mapped onto one octet, which is sub-divided 802 into two half-octets (i.e. the higher and the lower four bits). 803 The first half-octet is of type unsigned and contains the 4-bit 804 representation of the decimal value R_i. The second half-octet is 805 of type signed and contains the difference in erasure protection 806 between class EPC_i and class EPC_(i+j), i.e. the signed 4-bit 807 representation of the decimal value (-j) (where the MSB denotes 808 the sign, and the lower three bits the absolute value). Note that 809 the erasure protection P of class EPC_P is fixed, whereas the 810 size R_P may vary. 811 Thus, the data to be filled into class EPC_P shall consist of a 812 sequence of descriptors separated by stuffing indicators (see 813 below), where the number of descriptors is primarily given by the 814 number of protection classes EPC_i, 0<=i<=T, in the data TSB with 815 R_i>0. 816 Without a-priori knowledge, the initial value for the size of the 817 signaling TSB, R_P, should be set to one (row). When the number 818 of necessary descriptors and stuffing indicators exceeds the 819 (n-P) information positions, one or more additional rows have to be 820 reserved. This is usually done by increasing the value for L_s to 821 R_P>1, i.e. the data TSB is reduced to (L-R_P) rows. Hence, in 822 order to indicate the actual size of the signaling TSB, an 823 additional descriptor is inserted at the very beginning, which 824 takes on the value 0xq0, where q denotes the (hexadecimal) four-bit 825 representation of the decimal value R_P. 826 Furthermore, the end of each data TSB is signaled by the 827 otherwise unused descriptor value 0x00, followed by exactly one 828 stuffing indicator (SI). The latter is mapped onto an octet, 829 which is of type unsigned and contains the 8-bit representation 830 of the decimal value of the number of media stuffing symbols used 831 at the end of the respective data TSB.
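The descriptor encoding described so far can be sketched as follows for a single data TSB. This is a non-normative illustration with hypothetical function names. It assumes, in line with the worked example given later in this section (EPV=(7,0,2,2,0,3,10), n=20, three media stuffing symbols), that descriptors are written from the highest class down to the lowest, and that the second half-octet holds the change in protection level relative to the previously described class, starting from EPC_P. Class sizes above 15 rows do not fit into a half-octet and are simply rejected here.

```python
import math
from typing import List, Optional, Sequence


def encode_step(step: int) -> int:
    """4-bit sign-magnitude value: MSB is the sign, low three bits the magnitude."""
    if abs(step) > 7:
        raise ValueError("step does not fit into three magnitude bits")
    return (0x8 | (-step)) if step < 0 else step


def signaling_octets(epv: Sequence[int], n: int, stuffing: int,
                     p: Optional[int] = None) -> List[int]:
    """Info octets of the signaling TSB for one data TSB (sketch only)."""
    p = math.ceil(n / 2) if p is None else p   # default protection of EPC_P
    t = len(epv) - 1
    capacity = n - p                           # info positions per signaling row
    if t > p or capacity <= 0:
        raise ValueError("need T <= P < n")
    if not 0 <= stuffing <= 0xFF:
        raise ValueError("stuffing indicator must fit into one octet")
    body: List[int] = []
    previous = p                               # first step is taken from EPC_P
    for i in range(t, -1, -1):                 # highest class first
        r = epv[i]
        if r == 0:
            continue                           # empty classes get no descriptor
        if r > 15:
            raise ValueError("R_i does not fit into a half-octet")
        body.append((r << 4) | encode_step(i - previous))
        previous = i
    body += [0x00, stuffing]                   # end marker followed by the SI
    rows = math.ceil((len(body) + 1) / capacity)
    if rows > 15:
        raise ValueError("R_P does not fit into a half-octet")
    octets = [rows << 4] + body                # leading size descriptor 0xq0
    return octets + [0x00] * (rows * capacity - len(octets))


if __name__ == "__main__":
    print([hex(o) for o in signaling_octets((7, 0, 2, 2, 0, 3, 10), 20, 3)])
```

With the values of that example, the sketch yields the ten info octets (0x10,0xAC,0x39,0x2A,0x29,0x7A,0x00,0x03,0x00,0x00); each row of the signaling TSB would then be encoded with the (n,n-P) RS code.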
832 The (extended) sequence of descriptors and stuffing indicators is 833 then mapped to the octet positions in the R_P rows of the 834 signaling TSB from left to right and top to bottom. Each row is 835 then encoded with the same (n,n-P) RS code. 836 If the number of descriptors and stuffing indicators is less than 837 the available octet positions, however, empty positions in class 838 EPC_P may be filled up with the otherwise unused descriptor 0x00. 839 At the receiving entity, the sequence of descriptors shall be 840 recovered by performing erasure decoding on the first row of the 841 TB (which definitely belongs to the signaling TSB) using the same 842 algorithm as later for the data TSB. If successful, the very 843 first descriptor now indicates the number of rows of the 844 signaling TSB, and the next (R_P-1) rows are decoded to 845 reconstruct the redundancy profile for the data TSB(s), together 846 with the number of media stuffing symbols denoted by the 847 respective SI(s). 848 The complete structure of the TB is now depicted in Fig. 6.
850                 Transmission Block (TB)
851                                  P
852                           <-------------->
853            /\ +--+--+--+--+--+--+--+--+--+ /\
854            |  |d0|d1|d2|d3|* |* |* |* |* | |  R_P=1
855            |  +--+--+--+--+--+--+--+--+--+ \/
856            |  |x0|x1|x2|x3|x4|* |* |* |* | /\
857            |  +--+--+--+--+--+--+--+--+--+ |  R_T=3
858            |  |x5|x6|x7|x8|x9|* |* |* |* | |
859            |  +--+--+--+--+--+--+--+--+--+ |
860 L octets   |  |xA|xB|xC|xD|xE|* |* |* |* | \/
861 payload    |  +--+--+--+--+--+--+--+--+--+ /\
862 per packet |  |xF|xG|xH|xI|xJ|xK|* |* |* | |  R_(T-1)=1
863            |  +--+--+--+--+--+--+--+--+--+ \/
864            |  |xL|xM|xN|xO|xP|xQ|xR|* |* | .
865            |  +--+--+--+--+--+--+--+--+--+ .
866            |  |xS|xT|xU|xV|xW|xX|xY|xZ|* | .
867            |  +--+--+--+--+--+--+--+--+--+ /\
868            |  |y0|y1|y2|y3|y4|y5|y6|y7|y8| |  R_0=1
869            \/ +--+--+--+--+--+--+--+--+--+ \/
870               <-------------------------->
871                        n packets
873 d#    : descriptors and stuffing indicators for in-band signaling of the redundancy profile
875 x#,y# : info octets belonging to the info stream defined in Fig. 3
877 *     : parity octets gained from Reed-Solomon coding
879 Fig. 6: General structure for UXP with in-band signaling of the redundancy profile
882 The following simple example is meant to illustrate the idea 883 behind using descriptors: Let an erasure protection vector of 884 length T+1=7 be given as follows: 885 EPV=(R_0,R_1,...,R_5,R_6)=(7,0,2,2,0,3,10) 886 Hence, the length L of the TB (including one row for the 887 signaling TSB) is equal to 7+2+2+3+10+1=25 (rows/octets). If the 888 width is assumed to be equal to 20 (columns/packets), then the 889 erasure protection of the descriptors is P=10. 890 The corresponding sequence of descriptors can be written as 891 DP=(DP_6,DP_5,DP_3,DP_2,DP_0)=(0xAC,0x39,0x2A,0x29,0x7A), 892 where the values of the descriptors are given in hexadecimal 893 notation. Next, the descriptor indicating the length of the 894 signaling TSB has to be inserted, the end of the data TSB has to 895 be marked by 0x00, and the SI has to be appended. If the number 896 of media stuffing symbols is assumed to be 3, the 10 info octets 897 in the signaling TSB take on the following values (descriptor 898 stuffing included): 899 (0x10,0xAC,0x39,0x2A,0x29,0x7A,0x00,0x03,0x00,0x00) 901 5.6. Optional Concatenation of Transmission Sub Blocks 903 The following procedure may be applied if a single info stream 904 would be too short to achieve an efficient mapping to a 905 transmission block with respect to the fixed payload length L and 906 the desired number of packets n. For example, intra-coded video 907 frames (I-frames) are usually much larger than the following 908 predicted ones (P-frames).
In this case, a certain number z of 909 successive small info streams should be each mapped to a 910 transmission sub block with length L_d(y) and width n, such that 911 L_d(1)+L_d(2)+...+L_d(z)=L_d. 912 The resulting transmission sub blocks can then be easily 913 concatenated to form a TB of size L x n having one common 914 signaling TSB (see Fig. 2): Since the second half-octet of the 915 descriptors is of type signed (cf. Sect. 5.5.), we are able to 916 signal both decreasing and increasing erasure protection 917 profiles. 918 Again, we will give a simple example to illustrate this idea: Let 919 the erasure protection vectors for two concatenated data TSBs be 920 given as follows: 921 EPV1=(R1_0,R1_1,...,R1_5,R1_6)=(0,0,2,2,0,3,10), 922 EPV2=(R2_0,R2_1,...,R2_5,R2_6)=(0,0,2,2,0,3,10). 923 Hence, two single identical data TSBs will be concatenated to 924 form a TB of length L=2*(2+2+3+10)+2=36 (rows/octets). If the 925 width is again assumed to be equal to 20 (columns/packets), then 926 the erasure protection of the descriptors is P=10. We reserve a 927 total of two rows for the signaling TSB. The corresponding 928 sequence of descriptors can now be written as 929 DP=(0xAC,0x39,0x2A,0x29,0xA4,0x39,0x2A,0x29), where the values of 930 the descriptors are given in hexadecimal notation. The values of 931 the first four descriptors are taken from the descriptor of EPV1 932 as described in Sect. 5.5. (without the SI). The last four 933 descriptors are taken from the descriptor of EPV2 (without SI) 934 with one exception. The fifth descriptor of DP (i.e. 0xA4) is 935 created as follows: The first half-octet is created according to 936 Sect. 5.5. However, the second half-octet describes no longer the 937 difference between R_P and R2_6. It rather describes the 938 difference between R1_2 and R2_6, i.e. R1_2-R2_6, which can be a 939 positive or negative number. 
If the number of media stuffing 940 symbols is assumed to be 3 for each data TSB, the 20 info octet 941 positions in the signaling TSB are filled with the following 942 values (descriptor stuffing included): 943 (0x20,0xAC,0x39,0x2A,0x29,0x00,0x03,0xA4,0x39,0x2A,0x29,0x00,0x03 944 , 945 0x00,0x00,0x00,0x00,0x00,0x00,0x00) 946 Therefore from the example above, the following general rule MUST 947 be used to create the resulting descriptors for concatenated data 948 TSB #u and data TSB #v, where v=u+1: 949 Let EPVu=(Au_0,Au_1,...) and EPVv=(Av_0, Av_1,...) be the 950 corresponding erasure protection vectors and DPu and DPv the 951 corresponding descriptors created according to Sect. 5.5. (with 952 stuffing). Let w be the smallest index for which Au_w >0. Let x 953 be the largest index for which Av_x >0. The resulting descriptor 954 can be created by concatenation of DPu and DPv where the first 955 descriptor of DPv should be changed as follows: 956 The second half byte is defined by Au_w-Av_x. 958 6. Indication of UXP in SDP 960 From the discussion in Sect. 5.4 , we know that UXP encapsulates 961 and protects the info stream. The info stream consists usually of 962 a regular RTP-Payload format, e.g. RFC 3016. 963 There is no static payload type assignment for UXP, so dynamic 964 payload type numbers MUST be used. The binding to the number is 965 indicated by an rtpmap attribute. The name used in this binding 966 is 967 "UXP". The payload type number of UXP is indicated in the "m" 968 line of the 969 media, as well as the payload type of the info-stream. 971 A sample indication of UXP in SDP is as follows: 973 m = video 8000 RTP/AVP 98 99 974 a = rtpmap:98 UXP/90000 975 a = rtpmap:99 MP4V-ES/90000 977 Here, PT 98 indicates that the payload consists of UXP with the 978 corresponding info stream "MP4V-ES". Alternatively, PT 99 can be 979 used which indicates "MP4V-ES" without UXP. 980 Since UXP is generic, several payload types can be protected. 
The 981 lines 983 m = video 8000 RTP/AVP 98 99 100 984 a = rtpmap:98 UXP/90000 985 a = rtpmap:99 MP4V-ES/90000 986 a = rtpmap:100 H263-1998/90000 988 mean that UXP can be used with either "MP4V-ES" or "H263-1998" as 989 info stream (indicated by PT 98 in the RTP-Header and either 990 block PT=99 or block PT=100 in the UXP-Header). Alternatively, 991 PT=99 or PT=100 in the RTP-Header means the use of "MP4V-ES" or 992 "H263-1998" without UXP. 994 As described in Sect. 5.5., the parameter P has the default value 995 P=ceil(n/2.0), if not otherwise stated. The parameter P MAY be 996 specified explicitly by means of SDP: 998 a = fmtp:98 UXP-prof: fvalue 1000 where fvalue is a floating point number in the interval (0 < 1001 fvalue <1) and specifies P by P=ceil(n*fvalue). For example, if 1002 we set fvalue=0.5, 1004 a = fmtp:98 UXP-prof: 0.5 1006 we get the default value for P, since P=ceil(n/2.0). 1007 The ABNF for fvalue according to RFC 2234 is 1009 fvalue = "0" "." 1*2DIGIT 1011 7. Security Considerations 1013 The payload of the RTP-packets consists of an interleaved media 1014 and parity stream. Therefore, it is reasonable to encrypt the 1015 resulting stream with one key rather than using different keys 1016 for media and parity data. It should also be noted that 1017 encryption of the media data without encryption of the parity 1018 data could enable known-plaintext attacks. 1019 The overall proportion between parity octets and info octets 1020 should be chosen carefully if the packet loss is due to network 1021 congestion. If the proportion of parity octets per TB is 1022 increased in this case, it could lead to increasing network 1023 congestion. Therefore, the proportion between parity octets and 1024 info octets per TB MUST NOT be increased as packet loss increases 1025 due to network congestion. 1026 The overall transmission rate for parity and info octets MUST be 1027 controlled by a congestion control algorithm. 
The congestion 1028 control algorithm used for the media which is protected by UXP 1029 MUST be used for the overall transmission rate for parity and 1030 info octets in UXP, i.e. for the resulting data rate. The 1031 trade-off between parity and info octets is determined by the 1032 optimization algorithm which determines the EPV and is, thus, out 1033 of scope of this specification. 1035 8. IANA Considerations 1036 8.1 Video 1038 To: ietf-types@iana.org 1040 Subject: Registration of MIME media type video/UXP 1042 MIME media type name: video 1044 MIME subtype name: UXP 1046 Required parameters: none 1048 [RFC3555] mandates that RTP payload formats without a defined 1049 rate must define a rate parameter as part of their MIME 1050 registration. This payload specification does not specify a rate 1051 parameter. However, the rate for UXP payload is equal to the rate 1052 of the media data it protects. 1054 Optional parameters: 1055 UXP-prof: Describes the redundancy of the signaling sub block 1056 (cf. Sec. 5.5). 1058 Encoding considerations: This format is only defined for 1059 transport within the Real-time Transport Protocol (RTP) 1060 [RFC3550]. Its transport within RTP is fully specified within 1061 this specification. 1063 Security considerations: The same security considerations apply 1064 to these MIME registrations as to the payloads for them, as 1065 detailed in this specification. 1067 Interoperability considerations: none 1069 Published specification: This MIME type is described fully within 1070 this specification. 1072 Applications which use this media type: Audio and video streaming 1073 tools which seek to improve resiliency to loss by sending 1074 additional data with the media stream.
1076 Additional information: none 1078 Person & email address to contact for further information: 1080 Marcel Wagner 1081 Siemens AG 1082 Otto-Hahn-Ring 6 1083 81730 Munich, Germany 1084 email: Marcel.Wagner@siemens.com 1086 Intended usage: COMMON 1088 Author/Change controller: Marcel Wagner. 1090 RTP and SDP Issues: Usage of this format within RTP and the 1091 Session Description Protocol (SDP) [RFC2327] are fully specified 1092 within this specification. 1094 8.2 Audio 1096 To: ietf-types@iana.org 1098 Subject: Registration of MIME media type audio/UXP 1100 MIME media type name: audio 1102 MIME subtype name: UXP 1104 Required parameters: none 1106 [RFC3555] mandates that RTP payload formats without a defined 1107 rate must define a rate parameter as part of their MIME 1108 registration. This payload specification does not specify a rate 1109 parameter. However, the rate for UXP payload is equal to the rate 1110 of the media data it protects. 1112 Optional parameters: 1113 UXP-prof: Describes the redundancy of the signaling sub block 1114 (cf. Sec.5.5.). 1116 Encoding considerations: This format is only defined for 1117 transport within the Real Time Transport protocol (RTP) 1118 [RFC3550]. Its transport within RTP is fully specified within 1119 this specification. 1121 Security considerations: The same security considerations apply 1122 to these mime registrations as to the payloads for them, as 1123 detailed in this specification. 1125 Interoperability considerations: none 1127 Published specification: This MIME type is described fully within 1128 this specification. 1130 Applications which use this media type: Audio and video streaming 1131 tools which seek to improve resiliency to loss by sending 1132 additional data with the media stream. 
1134 Additional information: none 1135 Person & email address to contact for further information: 1137 Marcel Wagner 1138 Siemens AG 1139 Otto-Hahn-Ring 6 1140 81730 Munich, Germany 1141 email: Marcel.Wagner@siemens.com 1143 Intended usage: COMMON 1145 Author/Change controller: Marcel Wagner. 1147 RTP and SDP Issues: Usage of this format within RTP and the 1148 Session Description Protocol (SDP) [RFC2327] are fully specified 1149 within this specification. 1151 9. Application Statement 1152 There are currently two different schemes proposed for unequal 1153 error protection in the IETF-AVT: Unequal Level Protection (ULP) 1154 and Unequal Erasure Protection (UXP). 1155 Although both methods seem to address the same problem, the 1156 proposed solutions differ in many respects. This section tries to 1157 describe possible application scenarios and to show the strengths 1158 and weaknesses of both approaches. 1159 The main difference between both approaches is that while ULP 1160 preserves the structure of the packets which have to be protected 1161 and provides the redundancy in extra packets, UXP interleaves the 1162 info stream which has to be protected, inserts the redundancy 1163 information, and thus creates a totally new packet structure. 1164 Another difference concerns multicast compatibility: It cannot be 1165 assumed that all future terminals will be able to apply UXP/ULP. 1166 Therefore, backward compatibility could be an issue in some 1167 cases. Since ULP does not change the original packet structure, 1168 but only adds some extra packets, it is possible for terminals 1169 which do not 1170 support ULP to discard the extra packets. In case of UXP, 1171 however, two separate streams with and without erasure protection 1172 have to be sent, which increases the overall data rate. 1173 Next, both approaches offer different mechanisms to adjust packet 1174 sizes, if necessary: UXP allows to adjust the packet sizes 1175 arbitrarily. 
This is an advantage in case the loss probability is 1176 dependent on the packet length, which happens, for example, if 1177 the end-to-end connection contains wireless links. In this case 1178 proper adjustment of the packet size is one essential network 1179 adaptation technique. In addition, if a preencoded stream is sent 1180 over the network, the packet size can be adjusted independently 1181 of slice structures. 1182 Since ULP does not change the existing packetization scheme, this 1183 flexibility does not exist. 1185 The ability of UXP to adjust the packet size arbitrarily can be 1186 especially exploited in a streaming scenario, if a delay of 1187 several hundred milliseconds is acceptable. It is then possible 1188 to fill several video frames into a single TB of desired size, 1189 e.g. a group of pictures consisting of I-frame, P-frames and B- 1190 frames. The redundancy scheme can thus be selected in such a way 1191 as to guarantee the following property: In case of packet loss, 1192 the P-frames are only recoverable if the I-frame on which the 1193 decoding of P-frames depends is recoverable. The same is true for 1194 B-frames, which can only be decoded if the respective P-frames 1195 are recoverable. This prevents situations in which, for example, 1196 the B-frames have been received correctly, but the P-frames have 1197 been lost, i.e. assures a gradual decrease in application quality 1198 also on the frame level. Of course, a similar encoding is 1199 possible with ULP. But in this case one might have to send 1200 several frames within one packet which leads to large packet 1201 sizes. 1202 Furthermore, decoding delay is also a crucial issue in 1203 communications. Again, both approaches have different delay 1204 properties: UXP introduces a decoding delay because a reasonable 1205 amount of correctly received packets are necessary to start 1206 decoding of a TB. The delay in general depends on the dimensions 1207 of the interleaver. 
This should be considered for any system 1208 design which includes UXP. 1209 With ULP, every correctly received media packet can be decoded 1210 right away. However, a significant delay is introduced, if 1211 packets are corrupted, because in this case one has to wait for 1212 several redundancy packets. Thus, the delay is in general 1213 dependent on the actual ULP-FEC-packet scheme and cannot be 1214 considered in advance during the system design phase. 1215 Finally, we want to point out that UXP uses RS codes which are 1216 known 1217 to be the most efficient type of block codes in terms of erasure 1218 correction capability. 1220 10. Intellectual Property Considerations 1222 Siemens AG has filed patent applications that might possibly have 1223 technical relations to this contribution. 1224 On IPR related issues, Siemens AG refers to the Siemens Statement 1225 on Patent Licensing, see http://www.ietf.org/ietf/IPR/SIEMENS- 1226 General. 1228 The following patent might apply to this specification: 1229 United States Patent 5,617,541, April 1, 1997, System for 1230 packetizing data encoded corresponding to priority levels where 1231 reconstructed data corresponds to fractionalized priority level 1232 and received fractionalized packets, Inventors: Albanese; 1233 Andres (Berkeley, CA); Luby; Michael G. (Berkeley,CA); Bloemer; 1234 Johannes F. (Berkeley, CA); Edmonds; Jeffrey A. (Berkeley, CA) 1235 Filed: December 21, 1994 1237 11. References 1239 Normative References 1240 [RFC2733] J. Rosenberg and H. Schulzrinne, "An RTP Payload Format 1241 for Generic Forward Error Correction", Request for Comments 2733, 1242 Internet Engineering Task Force, Dec. 1999. 1243 [Lin83] Shu Lin and Daniel J. Costello, Error Control Coding: 1244 Fundamentals and Applications, Prentice-Hall, Inc., Englewood 1245 Cliffs, N.J., 1983. 1246 [RFC3550] Schulzrinne, H., Casner, S., Frederick, R. and V. 
1247 Jacobson, "RTP: A Transport Protocol for Real-time Applications", 1248 RFC 3550, July 2003. 1249 [RFC3555] Casner, S. and P. Hoschka, "MIME Type Registration of RTP 1250 Payload Formats", RFC 3555, July 2003. 1251 [RFC2327] Handley, M. and V. Jacobson, "SDP: Session Description 1252 Protocol", RFC 2327, April 1998. 1254 Informative References 1255 [Alb96] A. Albanese, J. Bloemer, J. Edmonds, M. Luby, and M. 1256 Sudan, "Priority encoding transmission", IEEE Trans. Inform. 1257 Theory, vol. 42, no. 6, pp. 1737-1744, Nov. 1996. 1258 [Li01] W. Li, "Streaming video profile in MPEG-4", IEEE Trans. on 1259 Circuits and Systems for Video Technology, vol. 11, no. 3, pp. 301-317, 1260 March 2001. 1261 [Bla00] G. Blaettermann, G. Heising, and D. Marpe, "A Quality 1262 Scalable Mode for H.26L", ITU-T SG16, Q.15, Q15-J24, Osaka, May 1263 2000. 1264 [Bur99] F. Burkert, T. Stockhammer, and J. Pandel, "Progressive 1265 A/V coding for lossy packet networks - a principle approach", 1266 Tech. Rep., ITU-T SG16, Q.15, Q15-I36, Red Bank, N.J., Oct. 1999. 1267 [Lie99] Guenther Liebl, "Modeling, theoretical analysis, and 1268 coding for wireless packet erasure channels", Diploma Thesis, 1269 Inst. for Communications Engineering, Munich University of 1270 Technology, 1999. 1271 [Hor99] U. Horn, K. Stuhlmuller, M. Link, and B. Girod, "Robust 1272 Internet video transmission based on scalable coding and unequal 1273 error protection", Image Com., vol. 15, no. 1-2, pp. 77-94, Sep. 1274 1999. 1275 [Wen02] S. Wenger, "H.26L over IP: The IP-Network Adaptation 1276 Layer", Packet Video 2002, Pittsburgh, Pennsylvania, USA, April 1277 24-26, 2002. 1278 [Laz04] Lazzaro, John, "Framing RTP and RTCP Packets over 1279 Connection-Oriented Transport", 1280 draft-ietf-avt-rtp-framing-contrans-02.txt, work in progress, 2004. 1282 12. Acknowledgments 1283 Many thanks to Magnus Westerlund, Philippe Gentric, Stephen 1284 Casner, and Hermann Hellwagner for helpful comments and 1285 improvements.
The authors would like to thank Thomas Stockhammer 1286 who came up with the original idea of UXP. Also, the help of Gero 1287 Baese, Frank Burkert, and Minh Ha Nguyen for the development of 1288 UXP is well acknowledged. 1290 13. Author's Addresses 1291 Guenther Liebl 1292 Institute for Communications Engineering (LNT) 1293 Munich University of Technology (TUM) 1294 D-80290 Munich 1295 Germany 1296 Email: {liebl}@lnt.e-technik.tu-muenchen.de 1298 Marcel Wagner 1299 Siemens AG - Corporate Technology CT IC 2 1300 D-81730 Munich 1301 Germany 1302 Email: marcel.wagner@siemens.com 1304 Juergen Pandel 1305 Siemens AG - Corporate Technology CT IC 2 1306 D-81730 Munich 1307 Germany 1308 Email: juergen.pandel@siemens.com 1310 Wenrong Weng 1311 Siemens AG - Corporate Technology CT IC 2 1312 D-81730 Munich 1313 Germany 1314 Email: wenrong.weng@siemens.com 1316 14. Full Copyright Statement 1317 Copyright (C) The Internet Society (2004). This document is 1318 subject to the rights, licenses and restrictions contained in BCP 1319 78, and except as set forth therein, the authors retain all their 1320 rights. 1322 This document and the information contained herein are provided 1323 on an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE 1324 REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND 1325 THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, 1326 EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY 1327 THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY 1328 RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS 1329 FOR A PARTICULAR PURPOSE. 1331 15. 
Intellectual Property Notice 1333 The IETF takes no position regarding the validity or scope of any 1334 Intellectual Property Rights or other rights that might be 1335 claimed to pertain to the implementation or use of the technology 1336 described in this document or the extent to which any license 1337 under such rights might or might not be available; nor does it 1338 represent that it has made any independent effort to identify any 1339 such rights. Information on the procedures with respect to 1340 rights in RFC documents can be found in BCP 78 and BCP 79. 1342 Copies of IPR disclosures made to the IETF Secretariat and any 1343 assurances of licenses to be made available, or the result of an 1344 attempt made to obtain a general license or permission for the 1345 use of such proprietary rights by implementers or users of this 1346 specification can be obtained from the IETF on-line IPR 1347 repository at http://www.ietf.org/ipr. 1349 The IETF invites any interested party to bring to its attention 1350 any copyrights, patents or patent applications, or other 1351 proprietary rights that may cover technology that may be required 1352 to implement this standard. Please address the information to 1353 the IETF at ietf-ipr@ietf.org.