INTERNET-DRAFT                                              Eric Edwards
draft-edwards-avt-rtp-jpeg2000-00.txt                    Satoshi Futemma
                                                        Eisaburo Itakura
                                                       Takahiro Fukuhara
                                                        Sony Corporation
                                                       November 14, 2001
                                                   Expires: May 13, 2002

             RTP Payload Format for JPEG 2000 Video Streams

Status of this memo

   This document is an Internet-Draft and is subject to all
   provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.  It is inappropriate to use Internet-Drafts as reference
   materials or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/1id-abstracts.html

   The list of Internet-Drafts Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

Abstract

   This document describes a payload format for transporting JPEG 2000
   video streams using RTP (Real-time Transport Protocol).  A JPEG 2000
   video stream is formed as a continuous series of JPEG 2000 still
   images; JPEG 2000 is a next-generation still image coding standard.
   The JPEG 2000 payload format described in this document has three
   features: (1) improved robustness to packet loss through
   intelligent fragmentation of JPEG 2000 packet units, (2) persistence
   of the main header to minimize the effect of its loss, and (3) a
   priority information field for scalable delivery from a single
   codestream.  These features allow the scalability and robustness of
   JPEG 2000 to be maximized in streaming applications.

1. Introduction

   This document specifies payload formats for JPEG 2000 video streams
   over the Real-time Transport Protocol (RTP).  JPEG 2000 is an
   international standard for next-generation still image coding; its
   basic coding technology is described in [1].

   Motion JPEG 2000 is defined in JPEG 2000 Part 3 [2].  However, that
   specification defines only a file format, not a transmission format
   for streaming over the Internet.  For this reason, it is necessary
   to define an RTP format for JPEG 2000 video streams.

   JPEG 2000 supports many features beyond the current JPEG standard
   [3][4][5]:

      o Higher compression efficiency than JPEG with less visual loss,
        especially at bit rates below 0.25 bpp for grayscale images.

      o A single codestream that offers both lossy and superior
        lossless compression.

      o Transmission over noisy environments.  The JPEG 2000
        codestream can be built with markers to boost its error
        resilience and recovery.
        The JPEG 2000 codestream is very robust to bit errors, as it
        has been designed to avoid catastrophic decoding failure due
        to bit errors.

      o Progressive transmission by pixel accuracy and resolution.
        Progressive transmission that allows images to be
        reconstructed with increasing pixel accuracy or spatial
        resolution is essential for many applications.  This feature
        allows the reconstruction of images with different resolutions
        and pixel accuracy, as needed or desired, for different target
        devices.  The image architecture provides for the efficient
        delivery of image data in many applications, such as
        client/server applications.

      o Random codestream access and processing.  Some parts of an
        image may be more important than others.  Specific regions of
        the codestream can be defined to be less distorted than other
        areas.  Access to any specific area of an image is handled
        efficiently without the need to completely decompress the
        codestream.  Simple image transforms (rotation, translation,
        filtering) can be performed on the compressed codestream.

   The JPEG 2000 algorithm is briefly explained below.  Fig. 1 shows a
   block diagram of the JPEG 2000 encoder.

                                                    +-----+
                                                    | ROI |
                                                    +-----+
                                                       |
                                                       V
                   +----------+   +----------+   +------------+
                   |DC, comp. |   | Wavelet  |   |            |
       raw image==>|transform-|==>|transform-|==>|Quantization|==+
                   | ation    |   | ation    |   |            |  |
                   +----------+   +----------+   +------------+  |
                                                                 |
                +-------------+   +----------+   +------------+  |
                |             |   |          |   |            |  |
   JPEG 2000 <==|Data ordering|<==|Arithmetic|<==|Coefficient |<=+
   codestream   |             |   | coding   |   |bit modeling|
                +-------------+   +----------+   +------------+

           Fig. 1: Block diagram of the JPEG 2000 encoder

   First, if the image is a color image, it goes through component
   separation, in which it is split into the components of RGB, YUV,
   or another colorspace.  The image can also be divided further into
   tiles for processing.

   Each color component or tile is transformed into wavelet
   coefficients.  The component or tile is decomposed into several
   levels, usually subsampled vertically and horizontally, from the
   high frequencies (which contain the sharp details) to the low
   frequencies (which contain the flat areas).  These wavelet
   coefficients are categorized by frequency into subbands.  Subband HH
   holds the high frequency information, HL and LH contain the middle
   frequencies, and the lowest frequencies and most important
   coefficients are in the LL subband.

   Quantization is performed on the coefficients within each subband.
   Each wavelet coefficient is divided by the quantization step size
   and the result is truncated.  This can be repeated iteratively to
   reach a target bitrate accurately.

   After quantization, code-blocks are formed within the precincts
   within the tiles.  Precincts are a finer partition than tiles, and
   code-blocks are the smallest partition of the image data.  Entropy
   coding is performed within each code-block, which is arithmetically
   encoded bitplane by bitplane.  There are three passes for the
   code-block: the significance propagation pass, the magnitude
   refinement pass, and the cleanup pass.

   After the coefficients of all code-blocks have been coded into a
   short bitstream, a header is added, turning it into a packet.  The
   header has all the information needed to decompress the packet into
   code-blocks.  A group of packets is called a layer.

   For additional features in transmission, the formed packets must be
   re-ordered.  The standard provides four ways to transmit and decode
   a compressed image: by resolution, quality, location, or component.
   As there are many markers built into the JPEG 2000 codestream, a
   parser can go through the bitstream and obtain the proper order of
   packets to transmit and decode.
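
   The quantization step described above (divide each wavelet
   coefficient by the step size and truncate the result) can be
   sketched as follows.  This is only an illustration: the function
   names are hypothetical, and truncation toward zero is an assumption,
   since the text above says only that the result is "truncated".

```python
import math


def quantize(coefficient: float, step_size: float) -> int:
    """Divide a wavelet coefficient by the step size and truncate.

    Assumption: "truncated" is read as truncation toward zero.
    """
    if step_size <= 0:
        raise ValueError("step_size must be positive")
    return math.trunc(coefficient / step_size)


def quantize_subband(coefficients, step_size: float):
    """Quantize every coefficient of one subband with a single step size."""
    return [quantize(c, step_size) for c in coefficients]


if __name__ == "__main__":
    # Hypothetical coefficients of one subband and a hypothetical step size.
    print(quantize_subband([12.7, -3.2, 0.4, -9.9], 2.0))
```

   A larger step size maps more coefficients to zero and so lowers the
   bitrate, which is why repeating the step with different step sizes
   can be used to approach a target bitrate.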

   This is only meant as an introduction to JPEG 2000 to aid in
   understanding the rest of this document.  Further details of the
   encoder can be found in various texts on JPEG 2000.

   To decompress a JPEG 2000 codestream, one follows the encoding steps
   in reverse order, minus the quantization step.  Describing this
   procedure in detail is outside the scope of this document.  Please
   refer to JPEG 2000 texts for details.

2. JPEG 2000 video features

   As described above, JPEG 2000 has the following features.

      o Higher compression efficiency than existing JPEG, with less
        SNR deterioration (improved compression efficiency over JPEG,
        with dramatic improvements at low bitrates).

      o Random codestream access and processing.

      o Both lossless compression and lossy compression can be
        performed by the same algorithm.

      o Optional spatial resolution and SNR progressions can easily be
        extracted from a single codestream.
        (NOTE) SNR means Signal to Noise Ratio.  It is the factor that
        defines the quality.

      o Parts of an image can be given more bits for more detail (the
        ROI (Region of Interest) function).

      o Various levels of error resilience functionality.

   JPEG 2000 video streams are formed as a continuous series of JPEG
   2000 still images, so the above features of JPEG 2000 can be used
   effectively.  A JPEG 2000 video stream has the following merits.

      o SNR is improved at low bit rates.  The format can be used as a
        video stream format on low-bandwidth links.

      o It is a full intra-frame format in which each frame is
        compressed independently, giving low encoding and decoding
        delay.  This is suitable for interactive video communication.
        Even if packet loss occurs in any part of a frame, the error
        does not propagate to subsequent frames.  Moreover, each frame
        can be handled independently, which facilitates video editing.

      o JPEG 2000 has flexible and accurate rate control.
        This is suitable for traffic control and congestion control
        in Internet transmission.

      o JPEG 2000 can provide error resilience markers within its own
        codestream to aid in codestream recovery.  An encoder can
        insert a resynchronization marker at the beginning of a JPEG
        2000 packet and a segmentation symbol at the end of a bit
        plane to aid in recovery within a frame.

3. Requirements for RTP payload format of JPEG 2000 video streams

   To provide a payload format that makes the most of the merits of
   JPEG 2000 video streams described in the previous section, the
   following must be taken into consideration.

   - Provisions for packet loss

      On the Internet, 5% packet loss is common, and this percentage
      may sometimes reach 20% or more.  When JPEG 2000 video streams
      are split into RTP packets, efficient packetization of the
      codestream is required to minimize the decoding failures caused
      by missing code-blocks in error-prone environments.  If the main
      header is lost in transmission, the codestream cannot be decoded.
      Accordingly, a mechanism that compensates for the loss of the
      main header as much as possible is required.

   - A packetizing scheme that makes the most of JPEG 2000
     functionality

      A packetizing scheme is needed that lets an image be transmitted
      progressively and reconstructed progressively by the receiver
      using JPEG 2000 functionality, maximizing performance over
      various network conditions and across receiving platforms with
      different computing power.

4. Proposal for an RTP payload format for JPEG 2000 video streams

4.1 RTP fixed header usage

   For each RTP packet, the RTP fixed header is followed by the JPEG
   2000 payload header, which is followed by the JPEG 2000 codestream.
   The RTP header fields that have a meaning specific to JPEG 2000
   video are described as follows:

   Payload type (PT): The payload type is dynamically assigned by means
      outside the scope of this document.  A payload type in the dynamic
      range SHALL be chosen by means of an out-of-band signaling
      protocol (e.g., RTSP, SIP, etc.).

   Marker bit (M): The marker bit of the RTP fixed header is set to 1
      on the last RTP packet of a video frame; otherwise it must be 0.
      When transmission is performed over multiple RTP sessions, the
      bit is set on the last packet of the frame in each session.

   Timestamp: The RTP timestamp is in units of 90 kHz.  The same
      timestamp must appear in each fragment of a given frame.  The
      initial value of the timestamp is random (unpredictable) to make
      known-plaintext attacks on encryption more difficult, even if the
      source itself does not encrypt, because the packets may flow
      through a translator that does.

4.2 RTP Payload header format

   The RTP payload header format for JPEG 2000 video streams is as
   follows:

    0               1               2               3
    0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     type      | type-specific |   priority    |X|rsvd | mh_id |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                        fragment offset                        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

           Fig. 2: RTP payload header format for JPEG 2000

   type : 8 bits

      The type field shows which part of the JPEG 2000 codestream is
      included.  The details of the type are described later.

   type-specific : 8 bits

      Interpretation depends on the value of the type field.  This
      field is defined for future use and must be set to 0 when not
      used.  e.g.
      Tile-specific priority number (general idea)

   priority : 8 bits

      The priority field shows the importance of the JPEG 2000
      packets included in the given RTP packet.  Typically, higher
      priority is given to an RTP packet that contains JPEG 2000
      packets of the lower layers and the lower subbands.

   X : 1 bit

      Extension bit.  This bit must be set to 1 when a JPEG 2000
      optional payload header follows the JPEG 2000 payload header,
      and otherwise set to 0.  The details of the optional payload
      header are described later.

   rsvd : 3 bits

      These bits are reserved for future use and must be set to 0.

   mh_id : 4 bits

      Identification of the JPEG 2000 main header.  The same mh_id
      is used as long as the coding parameters described in the main
      header remain unchanged.

   fragment offset : 32 bits

      Because JPEG 2000 frames are typically larger than the
      underlying network's maximum transmission unit (MTU), frames are
      often fragmented into several packets.  The fragment offset is
      the offset in bytes of the current packet's data from the first
      byte of the JPEG 2000 codestream.  This field helps the receiver
      reassemble the JPEG 2000 codestream.

      For scalable video delivery using multiple RTP sessions, the
      fragment offset is still set to the offset from the first byte
      of the same frame.  Accordingly, in some RTP sessions the
      fragment offset may not start at 0, even in the first packet of
      a frame.

5. Fragmentation of JPEG 2000 codestream and Type Field

   Fig. 3 shows the construction of the JPEG 2000 codestream.  The JPEG
   2000 codestream consists of a main header beginning with the SOC
   marker, one or more tiles (only one tile when there is no tile
   division), and the EOC marker to indicate the end of the codestream.
   Each tile consists of a tile-part header, which starts with the SOT
   marker and ends with the SOD marker, followed by a bit stream (a
   series of JPEG 2000 packets).

        +-- +------------+
  Main  |   |    SOC     | Required as the first marker.
  header|   +------------+
        |   |    main    | Main header marker segments
        +-- +------------+
        |   |    SOT     | Required at the beginning of each tile-part
  Tile- |   +------------+ header.
  part  |   |   T0,TP0   | Tile 0, tile-part 0 header marker segments
  header|   +------------+
        |   |    SOD     | Required at the end of each tile-part header
        +-- +------------+
            | bit stream | Tile-part bit stream.
        +-- +------------+ Might include SOP and EPH
        |   |    SOT     |
  Tile- |   +------------+
  part  |   |   T1,TP0   |
  header|   +------------+
        |   |    SOD     |
        +-- +------------+
            | bit stream |
            +------------+
            |    EOC     | Required as the last marker in the codestream
            +------------+

           Fig. 3: Construction of the JPEG 2000 codestream

   JPEG 2000 video frames are typically larger than the underlying
   network's maximum transmission unit (MTU), so a video sequence may
   often be fragmented into several IP packets at the network layer.
   JPEG 2000 video streams are therefore fragmented into RTP packets
   according to the following basic rule.

   The JPEG 2000 codestream consists of a main header, tile-part
   headers, and JPEG 2000 packets.  When the codestream is packetized,
   these structural units should be preserved.  Each RTP packet should
   consist of a main header, a tile-part header, or a JPEG 2000
   packet.

   If the sender understands the JPEG 2000 codestream and can read the
   JPEG 2000 packets from it (i.e., the sender is intelligent), JPEG
   2000 packets should be packed into RTP packets in the following
   way:

   1.
      If the JPEG 2000 packets are smaller than the MTU size, the
      sender should put as many whole JPEG 2000 packets as possible
      into a single RTP packet.  That is, the JPEG 2000 payload data
      should begin with one of the SOC marker, SOT marker, or SOP
      marker (if it exists).

   2. If the JPEG 2000 packets are larger than the MTU size, the
      sender should segment each JPEG 2000 packet into pieces as
      large as the MTU allows, without any piece overlapping two
      JPEG 2000 packets.

   If the sender does not understand the JPEG 2000 codestream (i.e.,
   the sender is not intelligent), it should pack the JPEG 2000
   codestream into RTP packets using the largest data size the MTU
   allows.  The JPEG 2000 codestream will then be segmented at
   arbitrary lengths by the sender.

   Regardless of the sender's capabilities, the receiver must be able
   to handle RTP packets of any size.

   If the sender does not fragment, any packet larger than the MTU
   size will be fragmented by the IP layer into multiple IP packets
   smaller than the MTU.  If one of those IP fragments is lost during
   transmission, the whole RTP packet is lost, because the receiving
   host cannot reassemble it.  The JPEG 2000 codestream should
   therefore be segmented so that each RTP packet fits within the RTP
   payload size.

   In the following, all the possible packetization cases are described
   with diagrams.  For each case, the value of the type field shown in
   Fig. 2 is also indicated.

5.1 Separation at arbitrary lengths

   In this case, a JPEG 2000 codestream is split into several
   fragments at arbitrary byte positions.  The type value of the RTP
   packet is set to 0.
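
   As a rough illustration of this case, the sketch below splits a
   codestream at fixed byte positions and prefixes each fragment with
   the 8-byte payload header of Fig. 2 (type = 0, fragment offset =
   byte offset from the start of the codestream).  The function names
   and the max_payload parameter are hypothetical.  It is assumed that
   bit 0 in Fig. 2 is the most significant bit, that fields are in
   network byte order, and that no priority mapping table is in use,
   so the priority field is 0xff.  The RTP fixed header, including the
   marker bit, is left out.

```python
import struct

# type, type-specific, priority, X|rsvd|mh_id, fragment offset (Fig. 2).
# Assumption: network byte order, bit 0 of each byte is the MSB.
PAYLOAD_HEADER = struct.Struct("!BBBBI")


def pack_payload_header(ptype, priority, mh_id, offset,
                        type_specific=0, x=0):
    """Build the 8-byte JPEG 2000 payload header; rsvd bits stay 0."""
    if not 0 <= mh_id <= 0x0F:
        raise ValueError("mh_id is a 4-bit field")
    flags = ((x & 1) << 7) | mh_id
    return PAYLOAD_HEADER.pack(ptype, type_specific, priority, flags, offset)


def fragment_arbitrary(codestream: bytes, max_payload: int,
                       mh_id: int = 0, priority: int = 0xFF):
    """Split a codestream at arbitrary byte positions (type 0).

    max_payload is the hypothetical byte budget for payload header
    plus codestream data in one RTP packet.
    """
    chunk = max_payload - PAYLOAD_HEADER.size
    if chunk <= 0:
        raise ValueError("max_payload too small for the payload header")
    return [
        pack_payload_header(0, priority, mh_id, off)
        + codestream[off:off + chunk]
        for off in range(0, len(codestream), chunk)
    ]
```

   A receiver can undo this by reading the fragment offset from each
   payload header and copying the data to that position in a frame
   buffer.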

      +---+---+---+----------------------+
      |RTP|PL |SOC| jpeg 2000 codestream |   type = 0
      |hdr|hdr|   |     fragment (1)     |
      +---+---+---+----------------------+
      +---+---+--------------------------+
      |RTP|PL |   jpeg 2000 codestream   |   type = 0
      |hdr|hdr|       fragment (2)       |
      +---+---+--------------------------+
                      ...
      +---+---+----------------------+---+
      |RTP|PL | jpeg 2000 codestream |EOC|   type = 0
      |hdr|hdr|     fragment (N)     |   |
      +---+---+----------------------+---+
      *PL hdr = payload header

   This RTP packetization scheme is not recommended from the
   standpoint of error resilience.  It is desirable to use it only in
   the limited environments listed below.

   - The sender finds it difficult to distinguish the main header,
     tile-part headers, and JPEG 2000 packets from one another.  There
     is no SOP marker in the JPEG 2000 codestream.  The sender is not
     intelligent.

   - The network environment is error free.

   - The JPEG 2000 error resilience markers (TLM, PLM, PLT, PPM, and
     PPT markers) are present in the codestream.  In that case error
     resilience is handled outside of RTP, and its description is not
     within the scope of this document.  Using these markers may
     improve the error resilience.

5.2 General JPEG 2000 RTP packet types

   (1) The JPEG 2000 main header (SOC marker) must come first in the
       RTP payload (just after the RTP payload header).  The type value
       of an RTP packet that contains the whole main header (not
       fragmented) is 4.

   (1-a) The RTP packet contains only the complete main header.
      +---+---+------+
      |RTP|PL |Main  |   type = 4
      |hdr|hdr|header|
      +---+---+------+

   (1-b) The main header and the first tile-part header are packed
         into one RTP packet.
      +---+---+------+---------+
      |RTP|PL |Main  |Tile-part|   type = 4
      |hdr|hdr|header|header   |
      +---+---+------+---------+

   (1-c) The main header, the first tile-part header, and JPEG 2000
         packet(s) are packed into one RTP packet.
      +---+---+------+---------+---------+-----+---------+
      |RTP|PL |Main  |Tile-part|jpeg 2000| ... |jpeg 2000| type = 4
      |hdr|hdr|header|header   |packet   |     |packet   |
      +---+---+------+---------+---------+-----+---------+

   (1-d) The main header is split into several RTP packets.

         If the main header is larger than one RTP packet, it may be
         split into several RTP packets.  In this case, each of these
         RTP packets must contain only a piece of the main header.  The
         RTP packet that contains the first piece of the main header is
         type 5, the last piece is type 7, and the middle pieces are
         all type 6.

      +---+---+--------------+
      |RTP|PL |Main Header(1)|   type = 5
      |hdr|hdr|              |
      +---+---+--------------+
      +---+---+--------------+
      |RTP|PL |Main Header(2)|   type = 6
      |hdr|hdr|              |
      +---+---+--------------+
      +---+---+--------------+
      |RTP|PL |Main Header(3)|   type = 6
      |hdr|hdr|              |
      +---+---+--------------+
            ...        ...
      +---+---+--------------+
      |RTP|PL |Main Header(N)|   type = 7
      |hdr|hdr|              |
      +---+---+--------------+

   (Note) When the main header is split into multiple RTP packets,
          the first tile-part header must not be included in the RTP
          packet containing the last fragment.
      +---+---+--------------+---------+
      |RTP|PL |Main Header(N)|Tile-part|   This packetization is
      |hdr|hdr|              |header   |   not allowed.
      +---+---+--------------+---------+

   (2) Tile-part headers (SOT marker) must come first in the RTP
       payload (just after the RTP payload header), except for the
       first tile-part header just after the main header.
       The first tile-part header may either be packed with the main
       header or be placed in a separate RTP packet.  The type value
       of an RTP packet that begins with a tile-part header is 8.

   (2-a) The RTP packet contains only the complete tile-part header.
      +---+---+----------+
      |RTP|PL |Tile-part |   type = 8
      |hdr|hdr|Header    |
      +---+---+----------+

   (2-b) The tile-part header and JPEG 2000 packet(s) are packed
         into one RTP packet.
      +---+---+----------+---------+-----+---------+
      |RTP|PL |Tile-part |jpeg 2000| ... |jpeg 2000|   type = 8
      |hdr|hdr|Header    |packet   |     |packet   |
      +---+---+----------+---------+-----+---------+

   (2-c) The tile-part header is split into several RTP packets.

         If the tile-part header is larger than one RTP packet, it may
         be split into several RTP packets.  In this case, each of
         these RTP packets contains only a piece of the tile-part
         header.  The RTP packet that contains the first piece of the
         tile-part header is type 9, the last piece is type 11, and
         the middle pieces are all type 10.

      +---+---+-------------------+
      |RTP|PL |Tile-part header   |   type = 9
      |hdr|hdr|fragment(1)        |
      +---+---+-------------------+
      +---+---+-------------------+
      |RTP|PL |Tile-part header   |   type = 10
      |hdr|hdr|fragment(2)        |
      +---+---+-------------------+
      +---+---+-------------------+
      |RTP|PL |Tile-part header   |   type = 10
      |hdr|hdr|fragment(3)        |
      +---+---+-------------------+
                   ...
      +---+---+-------------------+
      |RTP|PL |Tile-part header   |   type = 11
      |hdr|hdr|fragment(N)        |
      +---+---+-------------------+

   (Note) When the tile-part header is split into multiple RTP
          packets, a JPEG 2000 packet must not be included in the RTP
          packet containing the last fragment.
      +---+---+-------------------+---------+
      |RTP|PL |Tile-part header   |jpeg 2000|   This packetization is
      |hdr|hdr|fragment(N)        |packet   |   not allowed.
      +---+---+-------------------+---------+

   (3) A JPEG 2000 packet must be packed by itself, except for JPEG
       2000 packets just after the tile-part header.  Several JPEG
       2000 packets may also be packed into one RTP packet.  If the
       SOP (Start of Packet) marker is used for error resilience, the
       SOP marker shall be placed at the beginning of the RTP payload.
       The type value of an RTP packet that contains only JPEG 2000
       packet(s) is 12.

   (3-a) More than one JPEG 2000 packet is packed into one RTP packet.
      +---+---+---------+-----+---------+
      |RTP|PL |jpeg 2000| ... |jpeg 2000|   type = 12
      |hdr|hdr|packet   |     |packet   |
      +---+---+---------+-----+---------+

   (3-b) The JPEG 2000 packet is split into several RTP packets.

         If the JPEG 2000 packet is larger than one RTP packet, it may
         be split into two or more RTP packets.  In this case, each of
         these RTP packets contains only a piece of the JPEG 2000
         packet.

         The RTP packet with the first piece of the JPEG 2000 packet
         is type 13, the last piece is type 15, and the middle pieces
         are all type 14.

      +---+---+-------------------+
      |RTP|PL |jpeg 2000 packet   |   type = 13
      |hdr|hdr|fragment(1)        |
      +---+---+-------------------+
      +---+---+-------------------+
      |RTP|PL |jpeg 2000 packet   |   type = 14
      |hdr|hdr|fragment(2)        |
      +---+---+-------------------+
      +---+---+-------------------+
      |RTP|PL |jpeg 2000 packet   |   type = 14
      |hdr|hdr|fragment(3)        |
      +---+---+-------------------+
            ...          ...
      +---+---+-------------------+
      |RTP|PL |jpeg 2000 packet   |   type = 15
      |hdr|hdr|fragment(N)        |
      +---+---+-------------------+

   (Note) When the JPEG 2000 packet is split into multiple RTP
          packets, another JPEG 2000 packet must not be included in
          the RTP packet containing the last fragment.
      +---+---+-------------------+---------+
      |RTP|PL |jpeg 2000 packet   |jpeg 2000|   This packetization is
      |hdr|hdr|fragment(N)        |packet   |   not allowed.
      +---+---+-------------------+---------+

6. Scalable Delivery and Priority field

   The JPEG 2000 codestream has rich functionality built into it, so
   decoders can easily handle scalable delivery or progressive
   transmission.  Progressive transmission that allows images to be
   reconstructed with increasing pixel accuracy or spatial resolution
   is essential for many applications.  This feature allows the
   reconstruction of images with different resolutions and pixel
   accuracy, as needed or desired, for different target devices.  The
   largest image source devices can provide a codestream that is easily
   processed for the smallest image display device.

   Each JPEG 2000 packet contains all compressed image data from a
   specific layer, a specific component, a specific resolution level,
   and a specific precinct.  The order in which these packets are found
   in the codestream is called the "progression order".  The ordering
   of the packets can progress along four axes: layer, component,
   resolution level, and precinct.

   Providing a priority field that shows the importance of the data
   contained in a given RTP packet makes the most of the JPEG 2000
   progressive/scalable functions.

   In resolution progression order, the higher decomposition level is
   more important.  The priority field of an RTP packet that contains
   the higher decomposition level is set to the higher priority.  When
   transmitting in spatial resolution order, the LL0 component data is
   given the highest priority.

6.1 Priority mapping table

   For each progression order, the priority value to be given to each
   JPEG 2000 packet is defined by the priority mapping table.  The
   higher the importance, the smaller the priority value.
The priority
659  mapping table can define the priority values for spatial resolution,
660  layer, color component, or precinct level. This priority table is
661  sent from the sender to a receiver through another protocol (RTSP,
662  SIP, etc.) outside of RTP. To change the priority mapping table, a
663  new priority mapping table must be sent from the sender to the
664  receiver as needed.

666  If there is no priority mapping table, the priority value of the RTP
667  packet must be set to '0xff'.

669  For example, the sender can send the priority table to the receiver,
670  but the receiver determines for itself which priority levels of RTP
671  packets to receive, using the priority table as a guideline.

673  The priority value of 1 has the highest priority in the priority
674  mapping table. As the priority value increases, the priority
675  becomes lower. If transmission is performed without attaching any
676  priority mapping table, 0xff (255) must be set in the priority
677  field.

679  For RTP packets that consist only of a whole or fragmented main or
680  tile header and contain no JPEG 2000 packets, priority 0 must be
681  set by the sender if a priority mapping table is used. (If a
682  priority mapping table is not used, the priority value must be 0xff
683  for the same RTP packets.)

685  The sender may transmit each priority using separate RTP
686  sessions defined by the priority value. For example, different
687  priorities may be allocated to different multicast groups. The sender may
688  also transmit RTP packets of all priority values using a single RTP
689  session.

691  When multiple JPEG 2000 packets are included in a single RTP packet,
692  the higher priority value of JPEG 2000 packets is set for the whole
693  RTP packet by the sender.

695  Examples of priority mapping tables are shown in the following.
696  The component based priority should be used when there is a higher
697  priority component like Y in YUV components.
699  6.1.1 Layer based priority

701  This is an example of a priority mapping table for the progression
702  order in which SNR is improved progressively. The JPEG 2000 packet
703  of layer 0 and resolution 0 has the highest priority. The JPEG 2000
704  packets with layer 0 and resolution 1 or more are next in priority.
705  As the layer number increases, the priority becomes lower.

707       L  R  C  P | priority
708      ------------+-------------
709       0  0  -  - |      1
710       0 >0  -  - |      2
711       1  -  -  - |      3
712          ....    |    ....

714  6.1.2 Resolution level based priority

716  This is an example of a priority mapping table for the progression
717  order in which the spatial resolution is increased. The JPEG 2000
718  packet with layer 0 and resolution 0 has the highest priority and
719  the JPEG 2000 packets with layer 1 or more and resolution 0 are next
720  in priority. As the resolution level increases, the priority
721  becomes lower.

723       L  R  C  P | priority
724      ------------+-------------
725       0  0  -  - |      1
726      >0  0  -  - |      2
727       -  1  -  - |      3
728          ....    |    ....

730  6.1.3 Component based priority

732  The priority mapping table for component progression is used only
733  when there is a priority order among components. This example is for
734  YUV components. The JPEG 2000 packet with layer 0, resolution 0,
735  and component 0 has the highest priority. The JPEG 2000 packets with
736  layer 1 or more, resolution 0, and component 0 are next in priority.
737  The JPEG 2000 packets with resolution 0 and component 0 are the
738  third in priority. As the resolution increases, the priority
739  becomes lower.

741       L  R  C  P | priority
742      ------------+-------------
743       0  0  0  - |      1
744      >0  0  0  - |      2
745         >0  0  - |      3
746       -  -  1  - |      4
747          ....    |    ....

749  6.2 Sender's Actions

751  Priority is given in accordance with the priority mapping table.
752  The priority field is only a hint to the receiver and never forces
753  the receiver to use any specific processing method. If the priority
754  mapping table is not used, '0xff' must be set.
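The layer based table of Section 6.1.1 and the sender rules of Sections 6.1 and 6.2 can be sketched as follows. This is a minimal illustration in Python and is not part of the payload format. The names layer_based_priority and rtp_priority are hypothetical. Two points are assumptions rather than statements of the draft: the elided "...." rows are taken to continue as layer n -> priority n + 2, and when several JPEG 2000 packets share one RTP packet, the most important one (smallest priority value) is taken to decide the priority field.

```python
NO_TABLE = 0xFF     # priority field value when no mapping table is used
HEADER_ONLY = 0     # main/tile header only, with a mapping table in use


def layer_based_priority(layer, resolution):
    """Priority of one JPEG 2000 packet, following the 6.1.1 example table."""
    if layer == 0:
        return 1 if resolution == 0 else 2
    # Assumption: the elided rows continue as layer n -> priority n + 2,
    # clamped so that the result never collides with 0xff.
    return min(layer + 2, NO_TABLE - 1)


def rtp_priority(packets, table_in_use=True):
    """Priority field for an RTP packet carrying `packets`.

    `packets` is a list of (layer, resolution) tuples; an empty list
    stands for an RTP packet that carries only header data.
    """
    if not table_in_use:
        return NO_TABLE
    if not packets:
        return HEADER_ONLY
    # Assumption: the most important packet (smallest value) wins.
    return min(layer_based_priority(l, r) for l, r in packets)
```

A receiver could apply the same table in reverse, for example by decoding only RTP packets whose priority value does not exceed a locally chosen threshold.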
756  6.3 Receiver's Action

758  Progressive transmission that allows images to be reconstructed with
759  increasing pixel accuracy or spatial resolution is essential for
760  many applications. This feature allows the reconstruction of images
761  with different resolutions and pixel accuracy, as needed or desired,
762  for different target devices. The image architecture provides for
763  the efficient delivery of image data in many applications such as
764  client/server applications. The receiver should decode packets
765  above a certain priority to obtain maximum performance depending on
766  the receiver's platform.

768  The receiver can determine on its own (using or not using the
769  mapping table and several other variables) the priority level
770  of the RTP packets it should decode.

772  For example, when the CPU power is insufficient or the terminal has
773  only a low-resolution display, decoding only RTP packets whose priority value is below a
774  certain threshold permits obtaining optimal performance.

776  If any high-priority RTP packet is not received when packet loss
777  occurs, the frame(s) can be skipped because the visual degradation may be
778  noticeable even if decoding can be performed successfully.

780  When any uninterpretable or unexpected priority is received, the
781  receiver must interpret the packets as having no priority (i.e., priority =
782  0xff).

784  7. JPEG 2000 main header compensation

786  The JPEG 2000 image main header describes various encode parameters,
787  and the decoder decodes by using the parameters described in the main
788  header. If the RTP packet that contains the main header is lost,
789  the corresponding JPEG 2000 codestream cannot be decoded. Even in the
790  extremely rare case where the main header has been dropped and all the
791  remaining JPEG 2000 packets have been received successfully, the
792  receiver cannot decode the frame. However, even when the main header is
793  lost, it can be recovered to a certain level using the following
794  method.
796  Recovery of a main header that has been lost is very simple. In
797  the case of JPEG 2000 video, it is common that encode parameters
798  will not greatly change in each frame. Even if the RTP packet
799  including the main header of a frame has been dropped, decoding
800  processing can be performed by using the main header of the previous
801  frame if the previous frame was encoded with the same encode
802  parameters.

804  The mh_id field of the payload header is used to recognize whether
805  the encoding parameters of the main header are the same as the
806  encoding parameters of the previous frame. The same mh_id value is set in
807  all RTP packets of the same frame. mh_id and encode
808  parameters are not associated with each other one-to-one; mh_id is
809  used only to recognize whether the encode parameters of the previous
810  frame are the same or not.

812  The mh_id field is saved from previous frames to be used to recover
813  the current frame's main header, if lost. If the mh_id of the
814  current frame has the same value as the mh_id value of the previous
815  frame, the previous frame's main header can be used to decode the
816  current frame, in case the main header is lost.

818  7.1 Sender processing

820  The sender transmits RTP packets with the same mh_id value unless
821  the encode parameters are different from the previous frame. The
822  encode parameters are the fixed information marker segment (SIZ
823  marker) and functional marker segments (COD, COC, RGN, QCD, QCC, and
824  POC) specified in JPEG 2000 Part 1 Annex A. If the encode
825  parameters have been changed, the sender transmits RTP packets after
826  incrementing the mh_id value by one. The initial mh_id value is 1.
827  When the mh_id value exceeds 15, the value returns to 1 again.

829  If the mh_id field is set to 0, the receiver must not save the main
830  header and must not compensate for lost headers using the above
831  method.
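The mh_id handling described above can be sketched as follows. This is a minimal illustration in Python under stated assumptions, not a definitive implementation: the class names MhIdSender and MainHeaderCache are hypothetical, and encode_params stands for any comparable representation of the SIZ, COD, COC, RGN, QCD, QCC, and POC marker segments (for example, their raw bytes).

```python
class MhIdSender:
    """Tracks the mh_id value a sender places in the payload header."""

    def __init__(self):
        self.mh_id = 1              # the initial mh_id value is 1
        self._last_params = None

    def next_frame(self, encode_params):
        """Return the mh_id for a frame encoded with `encode_params`."""
        changed = (self._last_params is not None
                   and encode_params != self._last_params)
        if changed:
            # Increment by one; after 15 the value returns to 1.
            self.mh_id = 1 if self.mh_id >= 15 else self.mh_id + 1
        self._last_params = encode_params
        return self.mh_id


class MainHeaderCache:
    """Receiver-side store for the last correctly received main header."""

    def __init__(self):
        self.saved_mh_id = None
        self.saved_header = None

    def on_main_header(self, mh_id, header):
        if mh_id == 0:
            return                  # mh_id 0: the header must not be saved
        # Only the most recent header is kept; the old one is replaced.
        self.saved_mh_id, self.saved_header = mh_id, header

    def recover(self, mh_id):
        """Header to decode with when the current main header was lost."""
        if mh_id != 0 and mh_id == self.saved_mh_id:
            return self.saved_header
        return None                 # no compensation is possible
```

Because mh_id wraps within 1 to 15 and is not tied one-to-one to a parameter set, a matching value indicates only that the parameters are unchanged relative to the previous frame.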
833  7.2 Receiver processing

835  When the receiver has received the main header correctly, the RTP
836  sequence number, the mh_id and the main header are saved except when the
837  mh_id value is 0. Only the last main header that was received
838  correctly is saved. That is, if there is already a saved main header,
839  the previous one is deleted and the new main header is saved.

841  When the main header could not be received, the receiver compares
842  the current mh_id value (this mh_id can be known by receiving at
843  least one RTP packet) with the saved mh_id value. When the values
844  are the same, decoding is performed by using the saved main header.

846  An mh_id value of 0 is an indication from the sender
847  not to compensate for lost headers and not to save any headers.

849  8. Optional Payload Header

851  When the extension bit of the JPEG 2000 payload header is 1, the
852  payload header is followed by an optional payload header. The JPEG
853  2000 video stream payload comes after the optional payload header.
854  The figure below shows the general format of the optional payload header.

856       0               1               2               3
857       0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7
858      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
859      |   optype    |X|    length     |                               |
860      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+                               |
861      |                  option specific format .....                 |
862      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

864  Fig. JPEG 2000 video stream optional payload header generic format

866  optype : 7 bits

868      optype shows the optional payload header type.

870  X : 1 bit

872      More extension bit. This must be set to 1 if another optional
873      payload header follows this optional payload header; otherwise
874      it must be set to 0.

876  length : 8 bits

878      Length of the optional header in bytes.

880  The receiver performs processing for the optional header when the
881  extension bit of the JPEG 2000 payload header is 1.
882  When the receiver has received an optype that it cannot interpret, it
883  skips the number of bytes specified in the length field and does not
884  process the optional payload header.

886  When the more extension bit of the optional header is 1, another
887  optional payload header will come immediately after this optional
888  payload header.

890  8.1 Quantization Optional Header

892  The quantization optional header is defined as one of the optional
893  payload headers. If only the QCD and/or QCC information has been
894  changed, this optional payload header conveys the information. One
895  optional payload header is used for the QCD information and another
896  for the QCC information. Both changes must not be conveyed in a
897  single optional payload header.

899  If the receiver has received the quantization optional header but
900  the main header of the current frame is lost, the receiver can
901  replace the QCD and QCC information in the saved main header using
902  the current QCD or QCC optional header only if the mh_id value of
903  the current frame and previous frame differ by 1. The receiver
904  should interpret this optional payload header only when the mh_id
905  value changes.

907  This header is intended to be used when the quantization step size
908  is adjusted in order to keep the amount of compressed JPEG
909  2000 image data at a constant level.

911  The quantization optional header format is shown in the figure
912  below.

914       0               1               2               3
915       0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7
916      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
917      |  optype=1   |X|    length     |Q|   cindex    | decomp level  |
918      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
919      |     style     |                                               |
920      +-+-+-+-+-+-+-+-+                                               +
921      |               quantization step size value .....              |
922      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

924  Fig. Quantization Optional Header format

926  Each field is explained below.
928  optype : 7 bits

930      The optype value of the quantization optional header is 1.

932  Q : 1 bit

934      This indicates whether the information is of QCD or of QCC. If
935      the information is of QCD, 0 is set. If the information is of
936      QCC, 1 is set.

938  cindex : 7 bits

940      When the information is of QCC, this represents a component
941      number.

943  decomp level : 8 bits

945      This indicates the decomposition level of the corresponding frame.

947  style : 8 bits

949      This indicates the quantization style specified in the QCD and
950      QCC marker segments. (Refer to JPEG 2000 Part I: Annex A Table
951      A-28.)

953  quantization step size value : variable length

955      This is followed by the quantization step size value specified
956      by style. (Refer to JPEG 2000 Part I: Annex A Table A-29 and
957      A-30.)

959  9. Security Considerations

961  RTP packets using the payload format defined in this specification
962  are subject to the security considerations discussed in the RTP
963  specification [6]. This implies that confidentiality of the media
964  streams is achieved by encryption. Because the data compression
965  used with this payload format is applied end-to-end, encryption
966  may be performed on the compressed data so there is no conflict
967  between the two operations.

969  10.
Author's Address

971  Eric Edwards
972  Sony Corporation
973  Media Processing Division
974  Network & Software Technology Center of America
975  3300 Zanker Road, MD: SJ2C4
976  San Jose, CA 95134
977  Phone: +1 408 955 6462
978  Fax: +1 408 955 5724
979  Email: Eric.Edwards@am.sony.com

981  Satoshi Futemma
982  Sony Corporation
983  6-7-35 Kitashinagawa Shinagawa-ku
984  Tokyo 141-0001 JAPAN
985  Phone: +81 3 5448 4373
986  Fax: +81 3 5448 4622
987  Email: satosi-f@sm.sony.co.jp

989  Eisaburo Itakura
990  Sony Corporation
991  6-7-35 Kitashinagawa Shinagawa-ku
992  Tokyo 141-0001 JAPAN
993  Phone: +81 3 5448 3096
994  Fax: +81 3 5448 4622
995  Email: itakura@sm.sony.co.jp

997  Takahiro Fukuhara
998  Sony Corporation
999  1-11-1 Osaki Shinagawa-ku
1000  Tokyo 141-0032 JAPAN
1001  Phone: +81 3 5435 3665
1002  Fax: +81 3 5435 3891
1003  Email: fukuhara@av.crl.sony.co.jp

1005  11. References

1007  [1] ISO/IEC JTC1/SC29/WG1: "JPEG 2000 Part I Final Draft
1008      International Standard", September 2000.

1010  [2] ISO/IEC JTC1/SC29/WG1: "Motion JPEG 2000 Committee Draft
1011      1.0", http://www.jpeg.org/public/cd15444-3.pdf,
1012      December 2000.

1014  [3] A. N. Skodras, C. A. Christopoulos and T. Ebrahimi: "JPEG2000:
1015      The Upcoming Still Image Compression Standard", In Proc. of the
1016      11th Portuguese Conference on Pattern Recognition, pp. 359-366,
1017      Porto, Portugal, May 2000.

1019  [4] ISO/IEC JTC1/SC29/WG1: "JPEG2000 requirements and profiles
1020      version 6.3", draft in progress,
1021      http://www.jpeg.org/public/wg1n1803.pdf, July 2000.

1023  [5] Diego Santa-Cruz, Touradj Ebrahimi, Joel Askelof, Mathias
1024      Larsson and Charilaos Christopoulos: "JPEG 2000 still image
1025      coding versus other standards", In Proc. of SPIE's 45th annual
1026      meeting, Applications of Digital Image Processing XXIII,
1027      vol. 4115, pp. 446-454, July 2000.

1029  [6] H. Schulzrinne, S. Casner, R. Frederick, and V. Jacobson "RTP:
1030      A Transport Protocol for Real Time Applications", RFC 1889,
1031      January 1996.