avtcore                                                         S. Lugan
Internet-Draft                                                   intoPIX
Intended status: Standards Track                             A. Descampe
Expires: December 19, 2021                                           UCL
                                                               C. Damman
                                                                 intoPIX
                                                              T. Richter
                                                                     IIS
                                                            T. Bruylants
                                                                 intoPIX
                                                           June 17, 2021


             RTP Payload Format for ISO/IEC 21122 (JPEG XS)
                    draft-ietf-payload-rtp-jpegxs-17

Abstract

   This document specifies a Real-Time Transport Protocol (RTP) payload
   format to be used for transporting JPEG XS (ISO/IEC 21122) encoded
   video. JPEG XS is a low-latency, lightweight image coding system.
   Compared to an uncompressed video use case, it allows higher
   resolutions and video frame rates, while offering visually lossless
   quality, reduced power consumption, and end-to-end latency confined
   to a fraction of a video frame.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on December 19, 2021.

Copyright Notice

   Copyright (c) 2021 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Conventions, Definitions, and Abbreviations
   3.  Media Format Description
     3.1.  Image Data Structures
     3.2.  Codestream
     3.3.  Video support box and color specification box
     3.4.  JPEG XS Frame
   4.  RTP Payload Format
     4.1.  RTP packetization
     4.2.  RTP Header Usage
     4.3.  Payload Header Usage
     4.4.  Payload Data
   5.  Traffic Shaping and Delivery Timing
   6.  Congestion Control Considerations
   7.  Payload Format Parameters
     7.1.  Media Type Registration
   8.  SDP Parameters
     8.1.  Mapping of Payload Type Parameters to SDP
     8.2.  Usage with SDP Offer/Answer Model
   9.  IANA Considerations
   10. Security Considerations
   11. Acknowledgments
   12. RFC Editor Considerations
   13. References
     13.1.  Normative References
     13.2.  Informative References
   Authors' Addresses

1.  Introduction

   This document specifies a payload format for packetization of JPEG XS
   [ISO21122-1] encoded video signals into the Real-time Transport
   Protocol (RTP) [RFC3550].

   The JPEG XS coding system offers compression and recompression of
   image sequences with very moderate computational resources while
   remaining robust under multiple compression and decompression cycles
   and mixing of content sources, e.g. embedding of subtitles, overlays
   or logos. Typical target compression ratios ensuring visually
   lossless quality are in the range of 2:1 to 10:1, depending on the
   nature of the source material. The end-to-end latency can be
   confined to a fraction of a video frame, typically between a small
   number of lines down to below a single line.

2.  Conventions, Definitions, and Abbreviations

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in BCP
   14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

   Application Data Unit (ADU)
      The unit of source data provided as payload to the transport
      layer, and corresponding, in this RTP payload definition, to a
      single JPEG XS video frame.

   Color specification box (CS box)
      An ISO color specification box defined in JPEG XS Part 3
      [ISO21122-3] that includes color related metadata required to
      correctly display JPEG XS video frames, such as color primaries,
      transfer characteristics and matrix coefficients.

   EOC marker
      A marker that consists of the two bytes 0xff11 indicating the end
      of a JPEG XS codestream.

   JPEG XS codestream
      A sequence of bytes representing a compressed image formatted
      according to JPEG XS Part 1 [ISO21122-1].

   JPEG XS codestream header
      A sequence of bytes, starting with a SOC marker, at the beginning
      of each JPEG XS codestream encoded in multiple markers and marker
      segments that does not carry entropy coded data, but metadata such
      as the video frame dimension and component precision.

   JPEG XS frame
      In the case of progressive video, a single JPEG XS picture
      segment. In the case of interlaced video, the concatenation of
      two JPEG XS picture segments.

   JPEG XS header segment
      The concatenation of a video support box [ISO21122-3], a color
      specification box [ISO21122-3], and a JPEG XS codestream header.

   JPEG XS picture segment
      The concatenation of a video support box [ISO21122-3], a color
      specification box [ISO21122-3], and a JPEG XS codestream.

   JPEG XS stream
      A sequence of JPEG XS frames.

   Marker
      A two-byte functional sequence that is part of a JPEG XS
      codestream starting with a 0xff byte and a subsequent byte
      defining its function.

   Marker segment
      A marker along with a 16-bit marker size and payload data
      following the size.

   Packetization unit
      A portion of an Application Data Unit whose boundaries coincide
      with boundaries of RTP packet payloads (excluding payload header),
      i.e. the first (resp. last) byte of a packetization unit is the
      first (resp. last) byte of an RTP packet payload (excluding its
      payload header).

   SLH marker
      A marker that represents a slice header, as defined in
      [ISO21122-1].

   Slice
      The smallest independently decodable unit of a JPEG XS codestream,
      bearing in mind that it decodes to wavelet coefficients which
      still require inverse wavelet filtering to give an image.

   SOC marker
      A marker that consists of the two bytes 0xff10 indicating the
      start of a JPEG XS codestream. The SOC marker is considered an
      integral part of the JPEG XS codestream header.

   Video support box (VS box)
      An ISO video support box, as defined in [ISO21122-3], that
      includes metadata required to play back a JPEG XS stream, such as
      its maximum bitrate, its subsampling structure, its buffer model
      and its frame rate.

3.  Media Format Description

3.1.  Image Data Structures

   JPEG XS is a low-latency lightweight image coding system for coding
   continuous-tone grayscale or continuous-tone color digital images.

   This coding system provides an efficient representation of image
   signals through the mathematical tool of wavelet analysis. The
   wavelet filter process separates each component into multiple bands,
   where each band consists of multiple coefficients describing the
   image signal of a given component within a frequency domain specific
   to the wavelet filter type, i.e. the particular filter corresponding
   to the band.

   Wavelet coefficients are grouped into precincts, where each precinct
   includes all coefficients over all bands that contribute to a spatial
   region of the image.

   One or multiple precincts are furthermore combined into slices
   consisting of an integer number of precincts. Precincts do not cross
   slice boundaries, and wavelet coefficients in precincts that are part
   of different slices can be decoded independently of each other.
   Note, however, that the wavelet transformation runs across slice
   boundaries. A slice always extends over the full width of the image,
   but may only cover parts of its height.

3.2.  Codestream

   A JPEG XS codestream is formed by (in the given order):

   o  a JPEG XS codestream header, which starts with an SOC marker,

   o  one or more slices,

   o  an EOC marker to signal the end of the codestream.

   The JPEG XS codestream format, including the definition of all
   markers, is further defined in [ISO21122-1]. It represents sample
   values of a single image, without any interpretation relative to a
   color space.
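
   As an informal illustration of the framing described in Section 3.2,
   the following sketch checks only that a byte string begins with the
   SOC marker (0xff10) and ends with the EOC marker (0xff11), using the
   marker values given in Section 2. The function name and the sample
   bytes are hypothetical and chosen for this example; the check does
   not parse the codestream header or the slices and is not a
   conformance test.

```python
SOC = b"\xff\x10"  # start-of-codestream marker, per Section 2
EOC = b"\xff\x11"  # end-of-codestream marker, per Section 2


def has_codestream_framing(data: bytes) -> bool:
    """Return True if data starts with SOC and ends with EOC.

    Hypothetical helper: a framing check only. It does not look at the
    marker segments of the header or at the slices in between.
    """
    return (
        len(data) >= len(SOC) + len(EOC)
        and data.startswith(SOC)
        and data.endswith(EOC)
    )


# Placeholder bytes between the markers; this is NOT a valid codestream.
sample = SOC + b"\x00" * 16 + EOC
print(has_codestream_framing(sample))       # True
print(has_codestream_framing(sample[:-2]))  # False (EOC removed)
```

   A receiver would typically apply a check like this to a reassembled
   codestream, after the boxes that precede it have been removed.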

3.3.  Video support box and color specification box

   While the information defined in the codestream is sufficient to
   reconstruct the sample values of one image, the interpretation of the
   samples remains undefined by the codestream itself. This
   interpretation is given by the video support box and the color
   specification box which contain significant information to correctly
   play the JPEG XS stream. The layout and syntax of these boxes,
   together with their content, are defined in [ISO21122-3].

   The video support box provides information on the maximum bitrate,
   the frame rate, the interlaced mode (progressive or interlaced), the
   subsampling image format, the informative timecode of the current
   JPEG XS frame, the profile, level/sublevel used, and optionally on
   the buffer model and the mastering display metadata.

   Note that the profile and level/sublevel, specified respectively by
   the Ppih and Plev fields, specify limits on the capabilities needed
   to decode the codestream and handle the output. Profiles represent a
   limit on the required algorithmic features and parameter ranges used
   in the codestream. The combination of level and sublevel defines a
   lower bound on the required throughput for a decoder in,
   respectively, the image (or decoded) domain and the codestream (or
   coded) domain. The actual defined profiles and levels/sublevels,
   along with the associated values for the Ppih and Plev fields, are
   defined in [ISO21122-2].

   The color specification box indicates the color primaries, transfer
   characteristics, matrix coefficients and video full range flag needed
   to specify the color space of the video stream.

3.4.  JPEG XS Frame

   The concatenation of a video support box, a color specification box,
   and a JPEG XS codestream forms a JPEG XS picture segment.

   In the case of a progressive video stream, each JPEG XS frame
   consists of one single JPEG XS picture segment.

   In the case of an interlaced video stream, each JPEG XS frame is made
   of two concatenated JPEG XS picture segments. The codestream of each
   picture segment corresponds exclusively to one of the two fields of
   the interlaced frame. Both picture segments SHALL contain identical
   boxes (i.e. the concatenation of the video support box and the color
   specification box is byte-for-byte identical for both picture
   segments of the frame).

   Note that the interlaced mode, as signaled by the frat field
   [ISO21122-3] in the video support box, indicates either progressive,
   interlaced top-field first, or interlaced bottom-field first mode.
   Thus, in the case of interlaced content, its value SHALL also be
   identical in both picture segments.

4.  RTP Payload Format

   This section specifies the payload format for JPEG XS streams over
   the Real-time Transport Protocol (RTP) [RFC3550].

   In order to be transported over RTP, each JPEG XS stream is
   transported in a distinct RTP stream, identified by a distinct
   Synchronization source (SSRC) [RFC3550].

   A JPEG XS stream is divided into Application Data Units (ADUs), each
   ADU corresponding to a single JPEG XS frame.

4.1.  RTP packetization

   An ADU is made of several packetization units. If a packetization
   unit is bigger than the maximum size of an RTP packet payload, the
   unit is split into multiple RTP packet payloads, as illustrated in
   Figure 1. As seen there, each packet SHALL contain (part of) one and
   only one packetization unit. A packetization unit may extend over
   multiple packets. The payload of every packet SHALL have the same
   size (based e.g. on the Maximum Transmission Unit of the network),
   except (possibly) the last packet of a packetization unit. The
   boundaries of a packetization unit SHALL coincide with the boundaries
   of the payload of a packet (excluding the payload header), i.e. the
   first (resp. last) byte of the packetization unit SHALL be the first
   (resp. last) byte of the payload (excluding its header).

   RTP        +-----+------------------------+
   Packet #1  | Hdr | Packetization unit #1  |
              +-----+------------------------+
   RTP        +-----+--------------------------------------+
   Packet #2  | Hdr | Packetization unit #2                |
              +-----+--------------------------------------+
   RTP        +-----+--------------------------------------------------+
   Packet #3  | Hdr | Packetization unit #3 (part 1/3)                 |
              +-----+--------------------------------------------------+
   RTP        +-----+--------------------------------------------------+
   Packet #4  | Hdr | Packetization unit #3 (part 2/3)                 |
              +-----+--------------------------------------------------+
   RTP        +-----+----------------------------------------------+
   Packet #5  | Hdr | Packetization unit #3 (part 3/3)             |
              +-----+----------------------------------------------+
     ...
   RTP        +-----+-----------------------------------------+
   Packet #P  | Hdr | Packetization unit #N (part q/q)        |
              +-----+-----------------------------------------+

                 Figure 1: Example of ADU packetization

   There are two different packetization modes defined for this RTP
   payload format.

   1.  Codestream packetization mode: in this mode, the packetization
       unit SHALL be the entire JPEG XS picture segment (i.e. codestream
       preceded by boxes). This means that a progressive frame will
       have a single packetization unit, while an interlaced frame will
       have two. The progressive case is illustrated in Figure 2.

   2.  Slice packetization mode: in this mode, the packetization unit
       SHALL be the slice, i.e. there SHALL be data from no more than
       one slice per RTP packet. The first packetization unit SHALL be
       made of the JPEG XS header segment (i.e. the concatenation of the
       VS box, the CS box and the JPEG XS codestream header). This
       first unit is then followed by successive units, each containing
       one and only one slice. The packetization unit containing the
       last slice of a JPEG XS codestream SHALL also contain the EOC
       marker immediately following this last slice. This is
       illustrated in Figure 3. In the case of an interlaced frame, the
       JPEG XS header segment of the second field SHALL be in its own
       packetization unit.

   RTP        +-----+--------------------------------------------------+
   Packet #1  | Hdr | VS box + CS box + JPEG XS codestream (part 1/q)  |
              +-----+--------------------------------------------------+
   RTP        +-----+--------------------------------------------------+
   Packet #2  | Hdr | JPEG XS codestream (part 2/q)                    |
              +-----+--------------------------------------------------+
     ...
   RTP        +-----+--------------------------------------+
   Packet #P  | Hdr | JPEG XS codestream (part q/q)        |
              +-----+--------------------------------------+

           Figure 2: Example of codestream packetization mode

   RTP        +-----+----------------------------+
   Packet #1  | Hdr | JPEG XS header segment     |
              +-----+----------------------------+
   RTP        +-----+--------------------------------------------------+
   Packet #2  | Hdr | Slice #1 (part 1/2)                              |
              +-----+--------------------------------------------------+
   RTP        +-----+-------------------------------------------+
   Packet #3  | Hdr | Slice #1 (part 2/2)                       |
              +-----+-------------------------------------------+
   RTP        +-----+--------------------------------------------------+
   Packet #4  | Hdr | Slice #2 (part 1/3)                              |
              +-----+--------------------------------------------------+
     ...
   RTP        +-----+---------------------------------------+
   Packet #P  | Hdr | Slice #N (part q/q) + EOC marker      |
              +-----+---------------------------------------+

             Figure 3: Example of slice packetization mode

   Due to the constant bit-rate of JPEG XS, the codestream packetization
   mode guarantees that a JPEG XS RTP stream will produce a constant
   number of bytes per video frame, and a constant number of RTP packets
   per video frame. If an implementation wishes to provide the same
   guarantee with the slice packetization mode, it will need to use an
   additional mechanism. This can involve a constraint at the rate
   allocation stage in the JPEG XS encoder to impose a constant bit-rate
   at the slice level, the usage of padding data, or the insertion of
   empty RTP packets (i.e. an RTP packet whose payload data is empty).

4.2.  RTP Header Usage

   The format of the RTP header is specified in [RFC3550] and reprinted
   in Figure 4 for convenience. This RTP payload format uses the fields
   of the header in a manner consistent with that specification.

   The RTP payload (and the settings for some RTP header bits) for
   packetization units are specified in Section 4.3.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | V |P|X|  CC   |M|     PT      |       sequence number         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                           timestamp                           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |           synchronization source (SSRC) identifier            |
   +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
   |            contributing source (CSRC) identifiers             |
   |                             ....                              |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

               Figure 4: RTP header according to RFC 3550

   The version (V), padding (P), extension (X), CSRC count (CC),
   sequence number, synchronization source (SSRC) and contributing
   source (CSRC) fields follow their respective definitions in
   [RFC3550].

   The remaining RTP header information to be set according to this RTP
   payload format is set as follows:

   Marker (M) [1 bit]:

      If progressive scan video is being transmitted, the marker bit
      denotes the end of a video frame. If interlaced video is being
      transmitted, it denotes the end of the field. The marker bit
      SHALL be set to 1 for the last packet of the video frame/field.
      It SHALL be set to 0 for all other packets.

   Payload Type (PT) [7 bits]:

      A dynamically allocated payload type field that designates the
      payload as JPEG XS video.

   Timestamp [32 bits]:

      The RTP timestamp is set to the sampling timestamp of the content.
      A 90 kHz clock rate SHALL be used.

      As specified in [RFC3550] and [RFC4175], the RTP timestamp
      designates the sampling instant of the first octet of the video
      frame to which the RTP packet belongs. Packets SHALL NOT include
      data from multiple video frames, and all packets belonging to the
      same video frame SHALL have the same timestamp. Several
      successive RTP packets will consequently have equal timestamps if
      they belong to the same video frame (that is until the marker bit
      is set to 1, marking the last packet of the video frame), and the
      timestamp is only increased when a new video frame begins.

      If the sampling instant does not correspond to an integer value of
      the clock, the value SHALL be truncated to the next lowest
      integer, with no ambiguity.

4.3.  Payload Header Usage

   The first four bytes of the payload of an RTP packet in this RTP
   payload format are referred to as the payload header.
   Figure 5 illustrates the structure of this payload header.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |T|K|L| I |F counter|     SEP counter     |      P counter      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                        Figure 5: Payload header

   The payload header consists of the following fields:

   Transmission mode (T) [1 bit]:

      The T bit is set to indicate that packets are sent sequentially by
      the transmitter. This information allows a receiver to dimension
      its input buffer(s) accordingly. If T=0, nothing can be assumed
      about the transmission order and packets may be sent out-of-order
      by the transmitter. If T=1, packets SHALL be sent sequentially by
      the transmitter. The T bit value SHALL be identical for all
      packets of the RTP stream.

   pacKetization mode (K) [1 bit]:

      The K bit is set to indicate which packetization mode is used.
      K=0 indicates codestream packetization mode, while K=1 indicates
      slice packetization mode. In the case that the Transmission mode
      (T) is set to 0 (out-of-order), the slice packetization mode SHALL
      be used and K SHALL be set to 1. This is required, because only
      the slice packetization mode supports out-of-order packet
      transmission. The K bit value SHALL be identical for all packets
      of the RTP stream.

   Last (L) [1 bit]:

      The L bit is set to indicate the last packet of a packetization
      unit. As the end of the video frame also ends the packet
      containing the last unit of the video frame, the L bit is set
      whenever the M bit is set. If codestream packetization mode is
      used, the L bit and the M bit are equivalent.

   Interlaced information (I) [2 bits]:

      These two I bits are used to indicate how the JPEG XS frame is
      scanned (progressive or interlaced). In the case of an interlaced
      frame, they also indicate which JPEG XS picture segment the
      payload is part of (first or second).

      00:  The payload is progressively scanned.

      01:  Reserved for future use.

      10:  The payload is part of the first JPEG XS picture segment of
         an interlaced video frame. The height specified in the
         included JPEG XS codestream header is half of the height of the
         entire displayed image.

      11:  The payload is part of the second JPEG XS picture segment of
         an interlaced video frame. The height specified in the
         included JPEG XS codestream header is half of the height of the
         entire displayed image.

   F counter [5 bits]:

      The frame (F) counter identifies the video frame number modulo 32
      to which a packet belongs. Frame numbers are incremented by 1 for
      each video frame transmitted. The frame number, in addition to
      the timestamp, may help the decoder manage its input buffer and
      bring packets back into their natural order.

   SEP counter [11 bits]:

      The Slice and Extended Packet (SEP) counter is used differently
      depending on the packetization mode.

      *  In the case of codestream packetization mode (K=0), this
         counter resets whenever the Packet counter resets (see
         hereunder), and increments by 1 whenever the Packet counter
         overruns.

      *  In the case of slice packetization mode (K=1), this counter
         identifies the slice modulo 2047 to which the packet
         contributes. If the data belongs to the JPEG XS header
         segment, this field SHALL have its maximal value, namely 2047 =
         0x07ff. Otherwise, it is the slice index modulo 2047. Slice
         indices are counted from 0 (corresponding to the top of the
         video frame).

   P counter [11 bits]:

      The packet (P) counter identifies the packet number modulo 2048
      within the current packetization unit. It is set to 0 at the
      start of the packetization unit and incremented by 1 for every
      subsequent packet (if any) belonging to the same unit.
      Practically, if codestream packetization mode is enabled, this
      field counts the packets within a JPEG XS picture segment and is
      extended by the SEP counter when it overruns. If slice
      packetization mode is enabled, this field counts the packets
      within a slice or within the JPEG XS header segment.

4.4.  Payload Data

   The payload data of a JPEG XS RTP stream consists of a concatenation
   of multiple JPEG XS frames. Within the RTP stream, all of the video
   support boxes and all of the color specification boxes SHALL retain
   their respective layouts for each JPEG XS frame. Thus, each video
   support box in the RTP stream SHALL define the same sub boxes. The
   effective values in the boxes are allowed to change under the
   condition that their relative byte offsets SHALL NOT change.

   Each JPEG XS frame is the concatenation of one or more packetization
   unit(s), as explained in Section 4.1. Figure 6 depicts this layout
   for a progressive video frame in the codestream packetization mode,
   Figure 7 depicts this layout for an interlaced video frame in the
   codestream packetization mode, Figure 8 depicts this layout for a
   progressive video frame in the slice packetization mode and Figure 9
   depicts this layout for an interlaced video frame in the slice
   packetization mode. The Frame counter value is not indicated because
   the value is constant for all packetization units of a given video
   frame.
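
   As an informal illustration of the payload header of Section 4.3 and
   of the SEP and P counter values used in codestream packetization
   mode, the following sketch packs and unpacks the four header bytes
   and derives both counters from a packet index. All names
   (PayloadHeader, pack_header, unpack_header, codestream_counters) are
   hypothetical and chosen for this example. The sketch assumes that
   the fields are laid out most significant bit first in network byte
   order, as drawn in Figure 5, and that a picture segment spans fewer
   than 2048 * 2048 packets.

```python
import struct
from typing import NamedTuple, Tuple


class PayloadHeader(NamedTuple):
    """Hypothetical container for the payload header fields of Figure 5."""
    t: int     # Transmission mode, 1 bit
    k: int     # pacKetization mode, 1 bit (0 = codestream, 1 = slice)
    last: int  # L bit: last packet of a packetization unit, 1 bit
    i: int     # Interlaced information, 2 bits
    f: int     # F counter, 5 bits (frame number modulo 32)
    sep: int   # SEP counter, 11 bits
    p: int     # P counter, 11 bits


def pack_header(h: PayloadHeader) -> bytes:
    """Serialize the fields into 4 bytes, most significant bit first."""
    word = (
        ((h.t & 0x1) << 31)
        | ((h.k & 0x1) << 30)
        | ((h.last & 0x1) << 29)
        | ((h.i & 0x3) << 27)
        | ((h.f & 0x1F) << 22)
        | ((h.sep & 0x7FF) << 11)
        | (h.p & 0x7FF)
    )
    return struct.pack("!I", word)


def unpack_header(payload: bytes) -> PayloadHeader:
    """Parse the first 4 bytes of an RTP payload into its fields."""
    (word,) = struct.unpack("!I", payload[:4])
    return PayloadHeader(
        t=(word >> 31) & 0x1,
        k=(word >> 30) & 0x1,
        last=(word >> 29) & 0x1,
        i=(word >> 27) & 0x3,
        f=(word >> 22) & 0x1F,
        sep=(word >> 11) & 0x7FF,
        p=word & 0x7FF,
    )


def codestream_counters(packet_index: int) -> Tuple[int, int]:
    """Return (SEP, P) for the packet_index-th packet (0-based) of a
    picture segment in codestream packetization mode (K=0).

    Assumption of this sketch: packet_index < 2048 * 2048, so that the
    SEP counter itself does not overrun.
    """
    return divmod(packet_index, 2048)


header = PayloadHeader(t=1, k=0, last=1, i=0, f=5, sep=1, p=0)
print(pack_header(header).hex())   # a1400800
print(codestream_counters(2048))   # (1, 0)
```

   With a 0-based packet index, the last of q packets has index q-1,
   which gives SEP counter = (q-1) div 2048 and P counter = (q-1) mod
   2048.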

   +=====[ Packetization unit (PU) #1 ]====+
   |           Video support box           |  SEP counter=0
   |  +---------------------------------+  |  P counter=0
   |  :     Sub boxes of the VS box     :  |
   |  +---------------------------------+  |
   +- - - - - - - - - - - - - - - - - - - -+
   |        Color specification box        |
   |  +---------------------------------+  |
   |  :      Fields of the CS box       :  |
   |  +---------------------------------+  |
   +- - - - - - - - - - - - - - - - - - - -+
   |          JPEG XS codestream           |
   :              (part 1/q)               :  M=0, K=0, L=0, I=00
   +---------------------------------------+
   |          JPEG XS codestream           |  SEP counter=0
   |              (part 2/q)               |  P counter=1
   :                                       :  M=0, K=0, L=0, I=00
   +---------------------------------------+
   |          JPEG XS codestream           |  SEP counter=0
   |              (part 3/q)               |  P counter=2
   :                                       :  M=0, K=0, L=0, I=00
   +---------------------------------------+
   :                                       :
   +---------------------------------------+
   |          JPEG XS codestream           |  SEP counter=1
   |             (part 2049/q)             |  P counter=0
   :                                       :  M=0, K=0, L=0, I=00
   +---------------------------------------+
   :                                       :
   +---------------------------------------+
   |          JPEG XS codestream           |  SEP counter=(q-1) div 2048
   |              (part q/q)               |  P counter=(q-1) mod 2048
   :                                       :  M=1, K=0, L=1, I=00
   +=======================================+

    Figure 6: Example of JPEG XS Payload Data (codestream packetization
                     mode, progressive video frame)

   +=====[ Packetization unit (PU) #1 ]====+
   |           Video support box           |  SEP counter=0
   +- - - - - - - - - - - - - - - - - - - -+  P counter=0
   |        Color specification box        |
   +- - - - - - - - - - - - - - - - - - - -+
   |    JPEG XS codestream (1st field)     |
   :              (part 1/q)               :  M=0, K=0, L=0, I=10
   +---------------------------------------+
   |    JPEG XS codestream (1st field)     |  SEP counter=0
   |              (part 2/q)               |  P counter=1
   :                                       :  M=0, K=0, L=0, I=10
   +---------------------------------------+
   :                                       :
   +---------------------------------------+
   |    JPEG XS codestream (1st field)     |  SEP counter=1
   |             (part 2049/q)             |  P counter=0
   :                                       :  M=0, K=0, L=0, I=10
   +---------------------------------------+
   :                                       :
   +---------------------------------------+
   |    JPEG XS codestream (1st field)     |  SEP counter=(q-1) div 2048
   |              (part q/q)               |  P counter=(q-1) mod 2048
   :                                       :  M=1, K=0, L=1, I=10
   +===============[ PU #2 ]===============+
   |           Video support box           |  SEP counter=0
   +- - - - - - - - - - - - - - - - - - - -+  P counter=0
   |        Color specification box        |
   +- - - - - - - - - - - - - - - - - - - -+
   |    JPEG XS codestream (2nd field)     |
   |              (part 1/q)               |
   :                                       :  M=0, K=0, L=0, I=11
   +---------------------------------------+
   |    JPEG XS codestream (2nd field)     |  SEP counter=0
   |              (part 2/q)               |  P counter=1
   :                                       :  M=0, K=0, L=0, I=11
   +---------------------------------------+
   :                                       :
   +---------------------------------------+
   |    JPEG XS codestream (2nd field)     |  SEP counter=(q-1) div 2048
   |              (part q/q)               |  P counter=(q-1) mod 2048
   :                                       :  M=1, K=0, L=1, I=11
   +=======================================+

    Figure 7: Example of JPEG XS Payload Data (codestream packetization
                      mode, interlaced video frame)

   +===[ PU #1: JPEG XS Header segment ]===+
   |           Video support box           |  SEP counter=0x07FF
   +- - - - - - - - - - - - - - - - - - - -+  P counter=0
   |        Color specification box        |
   +- - - - - - - - - - - - - - - - - - - -+
   |       JPEG XS codestream header       |
   |  +---------------------------------+  |
   |  :   Markers and marker segments   :  |
   |  +---------------------------------+  |  M=0, T=0, K=1, L=1, I=00
   +==========[ PU #2: Slice #1 ]==========+
   |  +---------------------------------+  |  SEP counter=0
   |  |           SLH Marker            |  |  P counter=0
   |  +---------------------------------+  |
   |  :       Entropy Coded Data        :  |
   |  +---------------------------------+  |  M=0, T=0, K=1, L=1, I=00
   +==========[ PU #3: Slice #2 ]==========+
   |               Slice #2                |  SEP counter=1
   |              (part 1/q)               |  P counter=0
   :                                       :  M=0, T=0, K=1, L=0, I=00
   +---------------------------------------+
   |               Slice #2                |  SEP counter=1
   |              (part 2/q)               |  P counter=1
   :                                       :  M=0, T=0, K=1, L=0, I=00
   +---------------------------------------+
   :                                       :
   +---------------------------------------+
   |               Slice #2                |  SEP counter=1
   |              (part q/q)               |  P counter=q-1
   :                                       :  M=0, T=0, K=1, L=1, I=00
   +=======================================+
   :                                       :
   +========[ PU #N: Slice #(N-1) ]========+
   |             Slice #(N-1)              |  SEP counter=N-2
   |              (part 1/r)               |  P counter=0
   :                                       :  M=0, T=0, K=1, L=0, I=00
   +---------------------------------------+
   :                                       :
   +---------------------------------------+
   |             Slice #(N-1)              |  SEP counter=N-2
   |              (part r/r)               |  P counter=r-1
   :             + EOC marker              :  M=1, T=0, K=1, L=1, I=00
   +=======================================+

     Figure 8: Example of JPEG XS Payload Data (slice packetization mode,
                         progressive video frame)

   +====[ PU #1: JPEG XS Hdr segment 1 ]===+
   |           Video support box           |  SEP counter=0x07FF
   +- - - - - - - - - - - - - - - - - - - -+  P counter=0
   |        Color specification box        |
   +- - - - - - - - - - - - - - - - - - - -+
   |      JPEG XS codestream header 1      |
   |  +---------------------------------+  |
   |  :   Markers and marker segments   :  |
   |  +---------------------------------+  |  M=0, T=0, K=1, L=1, I=10
   +====[ PU #2: Slice #1 (1st field) ]====+
   |  +---------------------------------+  |  SEP counter=0
   |  |           SLH Marker            |  |  P counter=0
   |  +---------------------------------+  |
   |  :       Entropy Coded Data        :  |
   |  +---------------------------------+  |  M=0, T=0, K=1, L=1, I=10
   +====[ PU #3: Slice #2 (1st field) ]====+
   |               Slice #2                |  SEP counter=1
   |              (part 1/q)               |  P counter=0
   :                                       :  M=0, T=0, K=1, L=0, I=10
   +---------------------------------------+
   |               Slice #2                |  SEP counter=1
   |              (part 2/q)               |  P counter=1
   :                                       :  M=0, T=0, K=1, L=0, I=10
   +---------------------------------------+
   :                                       :
   +---------------------------------------+
   |               Slice #2                |  SEP counter=1
   |              (part q/q)               |  P counter=q-1
   :                                       :  M=0, T=0, K=1, L=1, I=10
   +=======================================+
   :                                       :
   +==[ PU #N: Slice #(N-1) (1st field) ]==+
   |             Slice #(N-1)              |  SEP counter=N-2
   |              (part 1/r)               |  P counter=0
   :                                       :  M=0, T=0, K=1, L=0, I=10
   +---------------------------------------+
   :                                       :
   +---------------------------------------+
   |             Slice #(N-1)              |  SEP counter=N-2
   |              (part r/r)               |  P counter=r-1
   :             + EOC marker              :  M=1, T=0, K=1, L=1, I=10
   +=======================================+
   +===[ PU #N+1: JPEG XS Hdr segment 2 ]==+
   |           Video support box           |  SEP counter=0x07FF
   +- - - - - - - - - - - - - - - - - - - -+  P counter=0
   |        Color specification box        |
   +- - - - - - - - - - - - - - - - - - - -+
   |      JPEG XS codestream header 2      |
   |  +---------------------------------+  |
   |  :   Markers and marker segments   :  |
   |  +---------------------------------+  |  M=0, T=0, K=1, L=1, I=11
   +===[ PU #N+2: Slice #1 (2nd field) ]===+
   |  +---------------------------------+  |  SEP counter=0
   |  |           SLH Marker            |  |  P counter=0
   |  +---------------------------------+  |
   |  :       Entropy Coded Data        :  |
   |  +---------------------------------+  |  M=0, T=0, K=1, L=1, I=11
   +===[ PU #N+3: Slice #2 (2nd field) ]===+
   |               Slice #2                |  SEP counter=1
   |              (part 1/s)               |  P counter=0
   :                                       :  M=0, T=0, K=1, L=0, I=11
   +---------------------------------------+
   |               Slice #2                |  SEP counter=1
   |              (part 2/s)               |  P counter=1
   :                                       :  M=0, T=0, K=1, L=0, I=11
   +---------------------------------------+
   :                                       :
   +---------------------------------------+
   |               Slice #2                |  SEP counter=1
   |              (part s/s)               |  P counter=s-1
   :                                       :  M=0, T=0, K=1, L=1, I=11
   +=======================================+
   :                                       :
   +==[ PU #2N: Slice #(N-1) (2nd field) ]=+
   |             Slice #(N-1)              |  SEP counter=N-2
   |              (part 1/t)               |  P counter=0
   :                                       :  M=0, T=0, K=1, L=0, I=11
+---------------------------------------+ 795 : : 796 +---------------------------------------+ 797 | Slice #(N-1) | SEP counter=N-2 798 | (part t/t) | P counter=t-1 799 : + EOC marker : M=1, T=0, K=1, L=1, I=11 800 +=======================================+ 802 Figure 9: Example of JPEG XS Payload Data (slice packetization mode, 803 interlaced video frame) 805 5. Traffic Shaping and Delivery Timing 807 In order to facilitate proper synchronization between senders and 808 receivers it is RECOMMENDED to implement traffic shaping and delivery 809 timing in accordance with the Network Compatibility Model compliance 810 definitions specified in [SMPTE-ST2110-21]. In such case, the 811 session description SHALL signal the compliance with the media type 812 parameter TP. The actual applied traffic shaping and timing delivery 813 mechanism is outside the scope of this memo and does not influence 814 the payload packetization. 816 6. Congestion Control Considerations 818 Congestion control for RTP SHALL be used in accordance with 819 [RFC3550], and with any applicable RTP profile: e.g., [RFC3551]. An 820 additional requirement if best-effort service is being used is users 821 of this payload format SHALL monitor packet loss to ensure that the 822 packet loss rate is within acceptable parameters. Circuit Breakers 823 [RFC8083] is an update to RTP [RFC3550] that defines criteria for 824 when one is required to stop sending RTP Packet Streams and 825 applications implementing this standard SHALL comply with it. 826 [RFC8085] provides additional information on the best practices for 827 applying congestion control to UDP streams. 829 7. Payload Format Parameters 831 This section specifies the required and optional parameters of the 832 payload format and/or the RTP stream. 
   All parameters are declarative, meaning that the information
   signaled by the parameters is also present in the payload data,
   namely in the payload header (see Section 4.3) or in the JPEG XS
   header segment [ISO21122-1] [ISO21122-3].  When provided, their
   respective values SHALL be consistent with the payload.

7.1.  Media Type Registration

   This registration is done using the template defined in [RFC6838]
   and following [RFC4855].

   The receiver SHALL ignore any unrecognized parameter.

   Type name:  video

   Subtype name:  jxsv

   Clock rate:  90000

   Required parameters:

      rate:  The RTP timestamp clock rate.  Applications using this
         payload format SHALL use a value of 90000.

      packetmode:  This parameter specifies the configured packetization
         mode as defined by the pacKetization mode (K) bit in the
         payload header of Section 4.3.  This value SHALL be equal to
         the K bit value configured in the RTP stream (i.e. 0 for
         codestream or 1 for slice).

   Optional parameters:

      transmode:  This parameter specifies the configured transmission
         mode as defined by the Transmission mode (T) bit in the payload
         header of Section 4.3.  If specified, this value SHALL be equal
         to the T bit value configured in the RTP stream (i.e. 0 for
         out-of-order-allowed or 1 for sequential-only).  If not
         specified, a value of 1 (sequential-only) SHALL be assumed and
         the T bit SHALL be set to 1.

      profile:  The JPEG XS profile [ISO21122-2] in use.  Any white
         space in the profile name SHALL be omitted.  Examples of valid
         profile names are 'Main444.12' or 'High444.12'.

      level:  The JPEG XS level [ISO21122-2] in use.  Any white space in
         the level name SHALL be omitted.  Examples of valid levels are
         '2k-1' or '4k-2'.

      sublevel:  The JPEG XS sublevel [ISO21122-2] in use.  Any white
         space in the sublevel name SHALL be omitted.  Examples of valid
         sublevels are 'Sublev3bpp' or 'Sublev6bpp'.

      depth:  Determines the number of bits per sample.  This is an
         integer with typical values including 8, 10, 12, and 16.

      width:  Determines the number of pixels per line.  This is an
         integer between 1 and 32767 inclusive.

      height:  Determines the number of lines per video frame.  This is
         an integer between 1 and 32767 inclusive.

      exactframerate:  Signals the video frame rate in frames per
         second.  Integer frame rates SHALL be signaled as a single
         decimal number (e.g. "25") whilst non-integer frame rates SHALL
         be signaled as a ratio of two integer decimal numbers separated
         by a "forward-slash" character (e.g. "30000/1001"), utilizing
         the numerically smallest numerator value possible.

      interlace:  If this parameter name is present, it indicates that
         the video is interlaced, or that the video is Progressive
         segmented Frame (PsF).  If this parameter name is not present,
         the progressive video format SHALL be assumed.

      segmented:  If this parameter name is present, and the interlace
         parameter name is also present, then the video is a Progressive
         segmented Frame (PsF).  Signaling of this parameter without the
         interlace parameter is forbidden.

      sampling:  Signals the color difference signal sub-sampling
         structure.

         Signals utilizing the non-constant luminance Y'C'B C'R signal
         format of Recommendation ITU-R BT.601-7, Recommendation ITU-R
         BT.709-6, Recommendation ITU-R BT.2020-2, or Recommendation
         ITU-R BT.2100 SHALL use the appropriate one of the following
         values for the Media Type Parameter "sampling":

            YCbCr-4:4:4    (4:4:4 sampling)
            YCbCr-4:2:2    (4:2:2 sampling)
            YCbCr-4:2:0    (4:2:0 sampling)

         Signals utilizing the Constant Luminance Y'C C'BC C'RC signal
         format of Recommendation ITU-R BT.2020-2 SHALL use the
         appropriate one of the following values for the Media Type
         Parameter "sampling":

            CLYCbCr-4:4:4  (4:4:4 sampling)
            CLYCbCr-4:2:2  (4:2:2 sampling)
            CLYCbCr-4:2:0  (4:2:0 sampling)

         Signals utilizing the constant intensity I CT CP signal format
         of Recommendation ITU-R BT.2100 SHALL use the appropriate one
         of the following values for the Media Type Parameter
         "sampling":

            ICtCp-4:4:4    (4:4:4 sampling)
            ICtCp-4:2:2    (4:2:2 sampling)
            ICtCp-4:2:0    (4:2:0 sampling)

         Signals utilizing the 4:4:4 R' G' B' or RGB signal format (such
         as that of Recommendation ITU-R BT.601, Recommendation ITU-R
         BT.709, Recommendation ITU-R BT.2020, Recommendation ITU-R
         BT.2100, SMPTE ST 2065-1 or ST 2065-3) SHALL use the following
         value for the Media Type Parameter "sampling":

            RGB            (RGB or R' G' B' samples)

         Signals utilizing the 4:4:4 X' Y' Z' signal format (such as
         defined in SMPTE ST 428-1) SHALL use the following value for
         the Media Type Parameter "sampling":

            XYZ            (X' Y' Z' samples)

         Key signals as defined in SMPTE RP 157 SHALL use the value KEY
         for the Media Type Parameter "sampling".  The key signal is
         represented as a single component.

            KEY            (Samples of the key signal)

         Signals utilizing a color sub-sampling other than what is
         defined here SHALL use the following value for the Media Type
         Parameter "sampling":

            UNSPECIFIED    (Sampling signaled by the payload.)

      colorimetry:  Specifies the system colorimetry used by the image
         samples.  Valid values and their specification are:

         BT601-5       ITU-R Recommendation BT.601-5.
         BT709-2       ITU-R Recommendation BT.709-2.
         SMPTE240M     SMPTE ST 240M.
         BT601         ITU-R Recommendation BT.601-7.
         BT709         ITU-R Recommendation BT.709-6.
         BT2020        ITU-R Recommendation BT.2020-2.
         BT2100        ITU-R Recommendation BT.2100
                       Table 2 titled "System colorimetry".
         ST2065-1      SMPTE ST 2065-1 Academy Color Encoding
                       Specification (ACES).
         ST2065-3      SMPTE ST 2065-3 Academy Density Exchange
                       Encoding (ADX).
         XYZ           ISO/IEC 11664-1, section titled
                       "1931 Observer".
         UNSPECIFIED   Colorimetry is signaled in the payload by
                       the color specification box of [ISO21122-3],
                       or it must be manually coordinated between
                       sender and receiver.

         Signals utilizing the Recommendation ITU-R BT.2100 colorimetry
         SHOULD also signal the representational range using the
         optional parameter RANGE defined below.  Signals utilizing the
         UNSPECIFIED colorimetry might require manual coordination
         between the sender and the receiver.

      TCS:  Transfer Characteristic System.  This parameter specifies
         the transfer characteristic system of the image samples.  Valid
         values and their specification are:

         SDR           Standard Dynamic Range video streams that
                       utilize the OETF of ITU-R Recommendation
                       BT.709 or ITU-R Recommendation BT.2020.  Such
                       streams SHALL be assumed to target the EOTF
                       specified in ITU-R Recommendation BT.1886.
         PQ            High dynamic range video streams that utilize
                       the Perceptual Quantization system of ITU-R
                       Recommendation BT.2100.
         HLG           High dynamic range video streams that utilize
                       the Hybrid Log-Gamma system of ITU-R
                       Recommendation BT.2100.
         UNSPECIFIED   Video streams whose transfer characteristics
                       are signaled by the payload as specified in
                       [ISO21122-3], or must be manually
                       coordinated between sender and receiver.

      RANGE:  This parameter SHOULD be used to signal the encoding range
         of the sample values within the stream.  When paired with ITU
         Rec BT.2100 colorimetry, this parameter has two allowed values
         NARROW and FULL, corresponding to the ranges specified in table
         9 of ITU Rec BT.2100.  In any other context, this parameter has
         three allowed values: NARROW, FULLPROTECT, and FULL, which
         correspond to the ranges specified in SMPTE RP 2077.  In the
         absence of this parameter, and for all but the UNSPECIFIED
         colorimetry, NARROW SHALL be the assumed value.  When paired
         with the UNSPECIFIED colorimetry, FULL SHALL be the default
         assumed value.

   Encoding considerations:
      This media type is framed in RTP and contains binary data; see
      Section 4.8 in [RFC6838].

   Security considerations:
      Please see the Security Considerations (Section 10) of RFC XXXX.

   Interoperability considerations:
      None.

   Published specification:
      See RFC XXXX and its References section.

   Applications that use this media type:
      Any application that transmits video over RTP (like SMPTE ST
      2110).

   Fragment identifier considerations:
      N/A.

   Additional information:
      None.

   Person & email address to contact for further information:
      S. Lugan and Th. Richter.

   Intended usage:
      COMMON

   Restrictions on usage:
      This media type depends on RTP framing, and hence is only defined
      for transfer via RTP [RFC3550].

   Author:
      See the Authors' Addresses section of RFC XXXX.

   Change controller:
      IETF Audio/Video Transport working group delegated from the IESG.

8.  SDP Parameters

   A mapping of the parameters into the Session Description Protocol
   (SDP) [RFC8866] is provided for applications that use SDP.

8.1.  Mapping of Payload Type Parameters to SDP

   The media type video/jxsv string is mapped to fields in the Session
   Description Protocol (SDP) [RFC8866] as follows:

      The media type ("video") goes in SDP "m=" as the media name.

      The media subtype ("jxsv") goes in SDP "a=rtpmap" as the encoding
      name, followed by a slash ("/") and the required parameter "rate"
      corresponding to the RTP timestamp clock rate (which for the
      payload format defined in this document SHALL be 90000).

      The required parameter "packetmode", and any of the additional
      optional parameters, as described in Section 7.1, go in the SDP
      media format description, being the "a=fmtp" attribute (Format
      Parameters), by copying them directly from the MIME media type
      string as a semicolon-separated list of parameter=value pairs.

   All parameters of the media format SHALL correspond to the parameters
   of the payload.  In case of discrepancies between payload parameter
   values and SDP fields, the values from the payload data SHALL
   prevail.

   The receiver SHALL ignore any parameter that is not defined in
   Section 7.1.

   An example SDP mapping for JPEG XS video is as follows:

      m=video 30000 RTP/AVP 112
      a=rtpmap:112 jxsv/90000
      a=fmtp:112 packetmode=0;sampling=YCbCr-4:2:2;
                 width=1920;height=1080;depth=10;
                 colorimetry=BT709;TCS=SDR;RANGE=FULL;TP=2110TPNL

   In this example, a JPEG XS RTP stream is to be sent to UDP
   destination port 30000, with an RTP dynamic payload type of 112 and a
   media clock rate of 90000 Hz.  Note that the "a=fmtp:" line has been
   wrapped to fit this page, and will be a single long line in the SDP
   file.  This example includes the TP parameter (as specified in
   Section 5).

8.2.  Usage with SDP Offer/Answer Model

   When JPEG XS is offered over RTP using SDP in an offer/answer model
   [RFC3264] for negotiation for unicast usage, the following
   limitations and rules apply:

      The "a=fmtp" attribute SHALL be present specifying the required
      parameter "packetmode", and MAY specify any of the optional
      parameters, as described in Section 7.1.

      All parameters in the "a=fmtp" attribute indicate sending
      capabilities (i.e. properties of the payload).

      An answerer of the SDP is required to support all parameters and
      values of the parameters provided by the offerer; otherwise, the
      answerer SHALL reject the session.  It falls on the offerer to use
      values that are expected to be supported by the answerer.  If the
      answerer accepts the session, it SHALL reply with the exact same
      parameter values in the "a=fmtp" attribute as it was offered.

      The same RTP payload type number used in the offer SHOULD be used
      in the answer, as specified in [RFC3264].

9.  IANA Considerations

   The IANA is requested to register the media type registration "video/
   jxsv" as specified in Section 7.1.  The media type is also requested
   to be added to the IANA registry for "RTP Payload Format MIME types".

10.  Security Considerations

   RTP packets using the payload format defined in this memo are subject
   to the security considerations discussed in [RFC3550] and in any
   applicable RTP profile such as RTP/AVP [RFC3551], RTP/AVPF [RFC4585],
   RTP/SAVP [RFC3711], or RTP/SAVPF [RFC5124].  This implies that
   confidentiality of the media streams is achieved by encryption.

   However, as "Securing the RTP Framework: Why RTP Does Not Mandate a
   Single Media Security Solution" [RFC7202] discusses, it is not an RTP
   payload format's responsibility to discuss or mandate what solutions
   are used to meet the basic security goals like confidentiality,
   integrity, and source authenticity for RTP in general.  This
   responsibility lies on anyone using RTP in an application.  They can
   find guidance on available security mechanisms and important
   considerations in "Options for Securing RTP Sessions" [RFC7201].
   Applications SHOULD use one or more appropriate strong security
   mechanisms.

   Implementations of this RTP payload format need to take appropriate
   security considerations into account.  It is important for the
   decoder to be robust against malicious or malformed payloads and
   ensure that they do not cause the decoder to overrun its allocated
   memory or otherwise misbehave.  An overrun in allocated memory could
   lead to arbitrary code execution by an attacker.  The same applies to
   the encoder, even though problems in encoders are typically rarer.

   This payload format and the JPEG XS encoding do not exhibit any
   substantial non-uniformity, either in output or in complexity to
   perform the decoding operation and thus are unlikely to pose a
   denial-of-service threat due to the receipt of pathological
   datagrams.

   This payload format and the JPEG XS encoding do not contain code that
   is executable.

   It is important to note that HD or UHDTV JPEG XS-encoded video can
   have significant bandwidth requirements (typically more than 1 Gbps
   for ultra high-definition video, especially if using high framerate).
   This is sufficient to cause potential for denial-of-service if
   transmitted onto most currently available Internet paths.
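
   As a rough illustration of the bandwidth figures above, the following
   Python sketch estimates the average codestream bitrate from the
   width, height, and exactframerate parameters of Section 7.1.  The
   function name and the bits-per-pixel argument are hypothetical and
   not part of this payload format; treating a sublevel name such as
   'Sublev3bpp' as a budget of about 3 bits per pixel is an assumption
   made only for this example, and RTP/UDP/IP overhead is ignored.

```python
from fractions import Fraction


def compressed_bitrate_bps(width, height, exactframerate, bits_per_pixel):
    """Estimate the average codestream bitrate in bits per second.

    exactframerate uses the SDP notation of Section 7.1, e.g. "25" or
    "30000/1001".  bits_per_pixel is an assumed compression budget
    (hypothetical argument, not signaled by this payload format).
    """
    frames_per_second = Fraction(exactframerate)  # accepts "n" and "n/d"
    bits_per_frame = width * height * Fraction(bits_per_pixel)
    return int(bits_per_frame * frames_per_second)


if __name__ == "__main__":
    uhd = compressed_bitrate_bps(3840, 2160, "60", 3)
    hd = compressed_bitrate_bps(1920, 1080, "30000/1001", 3)
    print(f"UHD 60 fps at 3 bpp: {uhd / 1e9:.2f} Gbps")
    print(f"HD 29.97 fps at 3 bpp: {hd / 1e6:.1f} Mbps")
```

   Under these assumptions, a 3840x2160 stream at 60 frames per second
   and 3 bits per pixel comes to roughly 1.49 Gbps, which is consistent
   with the figure of more than 1 Gbps quoted above.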

   Accordingly, if best-effort service is being used, users of this
   payload format SHALL monitor packet loss to ensure that the packet
   loss rate is within acceptable parameters.  Packet loss is considered
   acceptable if a TCP flow across the same network path, and
   experiencing the same network conditions, would achieve an average
   throughput, measured on a reasonable timescale, that is not less than
   the RTP flow is achieving.  This condition can be satisfied by
   implementing congestion control mechanisms to adapt the transmission
   rate (or the number of layers subscribed for a layered multicast
   session), or by arranging for a receiver to leave the session if the
   loss rate is unacceptably high.

   This payload format may also be used in networks that provide
   quality-of-service guarantees.  If enhanced service is being used,
   receivers SHOULD monitor packet loss to ensure that the service that
   was requested is actually being delivered.  If it is not, then they
   SHOULD assume that they are receiving best-effort service and behave
   accordingly.

11.  Acknowledgments

   The authors would like to thank the following people for their
   valuable contributions to this memo: Arnaud Germain, Alexandre
   Willeme, Gael Rouvroy, Siegfried Foessel, and Jean-Baptiste Lorent.

12.  RFC Editor Considerations

   Note to RFC Editor: This section may be removed after carrying out
   all the instructions of this section.

   RFC XXXX is to be replaced by the RFC number this specification
   receives when published.

13.  References

13.1.  Normative References

   [ISO21122-1]
              International Organization for Standardization (ISO) -
              International Electrotechnical Commission (IEC),
              "Information technology - JPEG XS low-latency lightweight
              image coding system - Part 1: Core coding system", ISO/
              IEC IS 21122-1.

   [ISO21122-2]
              International Organization for Standardization (ISO) -
              International Electrotechnical Commission (IEC),
              "Information technology - JPEG XS low-latency lightweight
              image coding system - Part 2: Profiles and buffer models",
              ISO/IEC IS 21122-2.

   [ISO21122-3]
              International Organization for Standardization (ISO) -
              International Electrotechnical Commission (IEC),
              "Information technology - JPEG XS low-latency lightweight
              image coding system - Part 3: Transport and container
              formats", ISO/IEC IS 21122-3.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC3264]  Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model
              with Session Description Protocol (SDP)", RFC 3264,
              DOI 10.17487/RFC3264, June 2002.

   [RFC3550]  Schulzrinne, H., Casner, S., Frederick, R., and V.
              Jacobson, "RTP: A Transport Protocol for Real-Time
              Applications", STD 64, RFC 3550, DOI 10.17487/RFC3550,
              July 2003.

   [RFC3551]  Schulzrinne, H. and S. Casner, "RTP Profile for Audio and
              Video Conferences with Minimal Control", STD 65, RFC 3551,
              DOI 10.17487/RFC3551, July 2003.

   [RFC4855]  Casner, S., "Media Type Registration of RTP Payload
              Formats", RFC 4855, DOI 10.17487/RFC4855, February 2007.

   [RFC6838]  Freed, N., Klensin, J., and T. Hansen, "Media Type
              Specifications and Registration Procedures", BCP 13,
              RFC 6838, DOI 10.17487/RFC6838, January 2013.

   [RFC8083]  Perkins, C. and V. Singh, "Multimedia Congestion Control:
              Circuit Breakers for Unicast RTP Sessions", RFC 8083,
              DOI 10.17487/RFC8083, March 2017.

   [RFC8085]  Eggert, L., Fairhurst, G., and G. Shepherd, "UDP Usage
              Guidelines", BCP 145, RFC 8085, DOI 10.17487/RFC8085,
              March 2017.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017.

   [RFC8866]  Begen, A., Kyzivat, P., Perkins, C., and M. Handley, "SDP:
              Session Description Protocol", RFC 8866,
              DOI 10.17487/RFC8866, January 2021.

13.2.  Informative References

   [RFC3711]  Baugher, M., McGrew, D., Naslund, M., Carrara, E., and K.
              Norrman, "The Secure Real-time Transport Protocol (SRTP)",
              RFC 3711, DOI 10.17487/RFC3711, March 2004.

   [RFC4175]  Gharai, L. and C. Perkins, "RTP Payload Format for
              Uncompressed Video", RFC 4175, DOI 10.17487/RFC4175,
              September 2005.

   [RFC4585]  Ott, J., Wenger, S., Sato, N., Burmeister, C., and J. Rey,
              "Extended RTP Profile for Real-time Transport Control
              Protocol (RTCP)-Based Feedback (RTP/AVPF)", RFC 4585,
              DOI 10.17487/RFC4585, July 2006.

   [RFC5124]  Ott, J. and E. Carrara, "Extended Secure RTP Profile for
              Real-time Transport Control Protocol (RTCP)-Based Feedback
              (RTP/SAVPF)", RFC 5124, DOI 10.17487/RFC5124, February
              2008.

   [RFC7201]  Westerlund, M. and C. Perkins, "Options for Securing RTP
              Sessions", RFC 7201, DOI 10.17487/RFC7201, April 2014.

   [RFC7202]  Perkins, C. and M. Westerlund, "Securing the RTP
              Framework: Why RTP Does Not Mandate a Single Media
              Security Solution", RFC 7202, DOI 10.17487/RFC7202, April
              2014.

   [SMPTE-ST2110-21]
              Society of Motion Picture and Television Engineers, "SMPTE
              Standard - Professional Media Over Managed IP Networks:
              Traffic Shaping and Delivery Timing for Video", SMPTE ST
              2110-21:2017, 2017.

Authors' Addresses

   Sebastien Lugan
   intoPIX S.A.
   Rue Emile Francqui, 9
   1435 Mont-Saint-Guibert
   Belgium

   Phone: +32 10 23 84 70
   Email: rtp@intopix.com
   URI:   https://www.intopix.com/

   Antonin Descampe
   Universite catholique de Louvain
   Place du Levant, 3 - bte L5.03.02
   1348 Louvain-la-Neuve
   Belgium

   Phone: +32 10 47 25 97
   Email: antonin.descampe@uclouvain.be
   URI:   https://uclouvain.be/en/research-institutes/icteam

   Corentin Damman
   intoPIX S.A.
   Rue Emile Francqui, 9
   1435 Mont-Saint-Guibert
   Belgium

   Phone: +32 10 23 84 70
   Email: c.damman@intopix.com
   URI:   https://www.intopix.com/

   Thomas Richter
   Fraunhofer IIS
   Am Wolfsmantel 33
   91048 Erlangen
   Germany

   Phone: +49 9131 776 5126
   Email: thomas.richter@iis.fraunhofer.de
   URI:   https://www.iis.fraunhofer.de/

   Tim Bruylants
   intoPIX S.A.
   Rue Emile Francqui, 9
   1435 Mont-Saint-Guibert
   Belgium

   Phone: +32 10 23 84 70
   Email: t.bruylants@intopix.com
   URI:   https://www.intopix.com/