idnits 2.17.1 draft-ietf-avt-uncomp-video-06.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- ** Looks like you're using RFC 2026 boilerplate. This must be updated to follow RFC 3978/3979, as updated by RFC 4748. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- ** The document is more than 15 pages and seems to lack a Table of Contents. == No 'Intended status' indicated for this document; assuming Proposed Standard Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- No issues found here. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the RFC 3978 Section 5.4 Copyright Line does not match the current year == Line 360 has weird spacing: '...ponding pgrou...' == Line 651 has weird spacing: '...ad type can s...' == Line 807 has weird spacing: '...for the purpo...' -- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. If not, you may need to add the pre-RFC5378 disclaimer. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- The document date (16 February 2004) is 7369 days in the past. Is this intentional? 
Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) == Missing Reference: '0' is mentioned on line 453, but not defined == Missing Reference: '1' is mentioned on line 453, but not defined == Missing Reference: '2' is mentioned on line 453, but not defined == Missing Reference: '3' is mentioned on line 453, but not defined == Missing Reference: '4' is mentioned on line 453, but not defined == Missing Reference: '5' is mentioned on line 453, but not defined == Missing Reference: '6' is mentioned on line 453, but not defined == Missing Reference: '7' is mentioned on line 443, but not defined == Missing Reference: '8' is mentioned on line 443, but not defined == Unused Reference: '268' is defined on line 736, but no explicit reference was found in the text == Unused Reference: '22028' is defined on line 768, but no explicit reference was found in the text ** Obsolete normative reference: RFC 2434 (Obsoleted by RFC 5226) -- Possible downref: Non-RFC (?) normative reference: ref. '601' -- Possible downref: Non-RFC (?) normative reference: ref. '709' -- Possible downref: Non-RFC (?) normative reference: ref. '240' -- Obsolete informational reference (is this intentional?): RFC 2327 (ref. 'SDP') (Obsoleted by RFC 4566) Summary: 3 errors (**), 0 flaws (~~), 16 warnings (==), 6 comments (--). Run idnits with the --verbose option for more detailed information about the items above. 
-------------------------------------------------------------------------------- 2 Internet Engineering Task Force AVT WG 3 INTERNET-DRAFT Ladan Gharai 4 draft-ietf-avt-uncomp-video-06.txt USC/ISI 5 Colin Perkins 6 University of Glasgow 7 16 February 2004 8 Expires: August 2004 10 RTP Payload Format for Uncompressed Video 12 Status of this Memo 14 This document is an Internet-Draft and is in full conformance with 15 all provisions of Section 10 of RFC2026. 17 Internet-Drafts are working documents of the Internet Engineering 18 Task Force (IETF), its areas, and its working groups. Note that 19 other groups may also distribute working documents as Internet- 20 Drafts. 22 Internet-Drafts are draft documents valid for a maximum of six months 23 and may be updated, replaced, or obsoleted by other documents at any 24 time. It is inappropriate to use Internet-Drafts as reference 25 material or to cite them other than as "work in progress." 27 The list of current Internet-Drafts can be accessed at 28 http://www.ietf.org/ietf/1id-abstracts.txt 30 The list of Internet-Draft Shadow Directories can be accessed at 31 http://www.ietf.org/shadow.html. 33 Copyright Notice 35 Copyright (C) The Internet Society (2003). All Rights Reserved. 37 Abstract 39 This memo specifies a packetization scheme for encapsulating 40 uncompressed video into a payload format for the Real-time Transport 41 Protocol, RTP. It supports a range of standard- and high-definition 42 video formats, including common television formats such as ITU 43 BT.601, and standards from the Society of Motion Picture and 44 Television Engineers (SMPTE), such as SMPTE 274M and SMPTE 296M. The 45 format is designed to be applicable and extensible to new video 46 formats as they are developed. 48 1. 
Introduction 50 [Note to RFC Editor: All references to RFC XXXX are to be replaced 51 with the RFC number of this memo, when published] 53 This memo defines a scheme to packetize uncompressed, studio-quality, 54 video streams for transport using RTP [RTP]. It supports a range of 55 standard and high definition video formats, including ITU-R BT.601 56 [601], SMPTE 274M [274] and SMPTE 296M [296]. 58 Formats for uncompressed standard definition television are defined 59 by ITU Recommendation BT.601 [601] along with bit-serial and parallel 60 interfaces in Recommendation BT.656 [656]. These formats allow both 61 625 line and 525 line operation, with 720 samples per digital active 62 line, 4:2:2 color sub-sampling, and 8- or 10-bit digital 63 representation. 65 The representation of uncompressed high definition television is 66 specified in SMPTE standards 274M [274] and 296M [296]. SMPTE 274M 67 defines a family of scanning systems with an image format of 68 1920x1080 pixels with progressive and interlaced scanning, while 69 SMPTE 296M defines systems with an image size of 1280x720 pixels and 70 progressive scanning. In progressive scanning, scan lines are 71 displayed in sequence from top to bottom of a full frame. In 72 interlaced scanning, a frame is divided into its odd and even scan 73 lines (called fields) and the two fields are displayed in succession. 74 SMPTE 274M and 296M define images with aspect ratios of 16:9, and 75 define the digital representation for RGB and YCbCr components. In 76 the case of YCbCr components, the Cb and Cr components are 77 horizontally sub-sampled by a factor of two (4:2:2 color encoding). 79 Although these formats differ in their details, they are structurally 80 very similar. This memo specifies a payload format to encapsulate 81 these, and other similar, video formats for transport within RTP. 83 2. 
Conventions Used in this Document 85 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 86 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 87 document are to be interpreted as described in RFC 2119 [2119]. 89 3. Payload Design 91 Each scan line of digital video is packetized into one or more RTP 92 packets. If the data for a complete scan line exceeds the network 93 MTU, the scan line SHOULD be fragmented into multiple RTP packets, 94 each smaller than the MTU. A single RTP packet MAY contain data for 95 more than one scan line. Only the active samples are included in the 96 RTP payload: inactive samples and the contents of horizontal and 97 vertical blanking SHOULD NOT be transported. Scan line numbers are 98 included in the RTP payload header, along with a field identifier for 99 interlaced video. 101 For SMPTE 296M format video, valid scan line numbers are from 26 102 through 745, inclusive. For progressive scan SMPTE 274M format 103 video, valid scan lines are from scan line 42 through 1121 104 inclusive. For interlaced scan SMPTE 274M format video, valid scan 105 line numbers for field one (F=0) are from 21 to 560 and valid scan 106 line numbers for the second field (F=1) are from 584 to 1123. For 107 ITU-R BT.601 format video, the blanking intervals defined in BT.656 108 are used: for 625 line video, lines 24 to 310 of field one (F=0) 109 and 337 to 623 of the second field (F=1) are valid; for 525 line 110 video, lines 21 to 263 of the first field, and 284 to 525 of the 111 second field are valid. Other formats (e.g. [372]) may define 112 different ranges of active lines. 114 The payload header contains a 16 bit extension to the standard 16 bit 115 RTP sequence number, thereby extending the sequence number to 32 bits 116 and enabling the payload format to accommodate high data rates 117 without ambiguity. This is necessary as the 16 bit RTP sequence 118 number will roll-over very quickly for high data rates. 
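To get a feel for the scale involved, the wrap-around time can be estimated from the stream bit rate and the packet size. The following is a minimal sketch; the function name wrap_seconds and its parameters are illustrative and not part of the payload format, and the packet size is treated as the full number of octets carried per packet.

```python
def wrap_seconds(bitrate_bps: float, packet_octets: int, seq_bits: int) -> float:
    """Seconds until a sequence number of seq_bits bits wraps around."""
    packets_per_second = bitrate_bps / (packet_octets * 8)
    return (2 ** seq_bits) / packets_per_second


# 1 Gbps stream, 1000 octet packets (assumed values for illustration).
short = wrap_seconds(1e9, 1000, 16)
extended = wrap_seconds(1e9, 1000, 32)
print(f"16 bit: {short:.2f} s, 32 bit: {extended / 3600:.1f} h")
# prints: 16 bit: 0.52 s, 32 bit: 9.5 h
```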
For example, 119 for a 1 Gbps video stream with packet sizes of at least one thousand 120 octets, the standard RTP sequence number will roll over in 0.5 seconds, which 121 can be a problem for detecting loss and out of order packets 122 particularly in instances where the round trip time is greater than 123 half a second. The extended 32 bit number allows for a longer wrap- 124 around time of approximately nine hours. 126 Each scan line comprises an integer number of pixels. Each pixel 127 is represented by a number of samples. Samples may be coded as 8, 10, 128 12 or 16 bit values. A sample may represent a color component or a 129 luminance component of the video. Color samples may be shared 130 between adjacent pixels. The sharing of color samples between 131 adjacent pixels is known as color sub-sampling. This is typically 132 done in the YCbCr color space for the purpose of reducing the size of 133 the image data. 135 Pixels that share sample values MUST be transported together as a 136 "pixel group". If 10 bit or 12 bit samples are used, each pixel may 137 also comprise a non-integer number of octets. In this case, several 138 pixels MUST be combined into an octet aligned pixel group for 139 transmission. These restrictions simplify the operation of receivers 140 by ensuring that the complete payload is octet aligned, and that 141 samples relating to a single pixel are not fragmented across multiple 142 packets [ALF]. 144 For example, in YCbCr video with 4:1:1 color sub-sampling, each group 145 of 4 adjacent pixels comprises 6 samples, Y1 Y2 Y3 Y4 Cr Cb, with the 146 Cr and Cb values being shared between all 4 pixels. If samples are 8 147 bit values, the result is a group of 4 pixels comprising 6 octets. 148 If, however, samples are 10 bit values, the resulting 60 bit group is 149 not octet aligned. To be both octet aligned and appropriately 150 framed, two groups of 4 adjacent pixels must be collected, thereby 151 becoming octet aligned on a 15 octet boundary.
This length is 152 referred to as the pixel group size ("pgroup"). 154 Formally, the "pgroup" parameter is the size in octets of the 155 smallest grouping of pixels such that 1) the grouping comprises an 156 integer number of octets; and 2) if color sub-sampling is used, 157 samples are only shared within the grouping. When packetizing digital 158 active line content, video data MUST NOT be fragmented within a 159 pgroup. 161 Video content is almost always associated with additional information 162 such as audio tracks, time code, etc. In professional digital video 163 applications this data is commonly embedded in non-active portions of 164 the video stream (horizontal and vertical blanking periods) so that 165 precise and robust synchronization is maintained. This payload format 166 requires that applications using such synchronized ancillary data 167 SHOULD deliver it in separate RTP sessions which operate concurrently 168 with the video session. The normal RTP mechanisms SHOULD be used to 169 synchronize the media. 171 4. RTP Packetization 173 The standard RTP header is followed by a 2 octet payload header that 174 extends the RTP Sequence Number, and by a 6 octet payload header for 175 each line (or partial line) of video included. One or more lines, or 176 partial lines, of video data follow. This format makes the payload 177 header 32 bit aligned in the common case, where one scan line (or 178 fragment) of video is included in each RTP packet. 180 For example, if two lines of video are encapsulated, the payload 181 format will be as shown in Figure 1. 
183 0 1 2 3 184 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 185 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 186 | V |P|X| CC |M| PT | Sequence Number | 187 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 188 | Time Stamp | 189 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 190 | SSRC | 191 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 192 | Extended Sequence Number | Length | 193 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 194 |F| Line No |C| Offset | 195 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 196 | Length |F| Line No | 197 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 198 |C| Offset | . 199 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ . 200 . . 201 . Two (partial) lines of video data . 202 . . 203 +---------------------------------------------------------------+ 205 Figure 1: RTP Payload Format showing two (partial) lines of video 207 4.1. The RTP Header 209 The fields of the fixed RTP header have their usual meaning, with the 210 following additional notes: 212 Payload Type (PT): 7 bits 214 A dynamically allocated payload type field which designates the 215 payload as uncompressed video. 217 Timestamp: 32 bits 219 For progressive scan video, the timestamp denotes the sampling 220 instant of the frame to which the RTP packet belongs. Packets MUST 221 NOT include data from multiple frames, and all packets belonging to 222 the same frame MUST have the same timestamp. 224 For interlaced video, the timestamp denotes the sampling instant of 225 the field to which the RTP packet belongs. Packets MUST NOT 226 include data from multiple fields, and all packets belonging to the 227 same field MUST have the same timestamp. Use of field timestamps, 228 rather than a frame timestamp and field indicator bit, is needed to 229 support reverse 3-2 pulldown. 231 A 90 kHz timestamp SHOULD be used in both cases. 
If the sampling 232 instant does not correspond to an integer value of the clock (as 233 may be the case when interleaving) the value SHALL be truncated to 234 the next lowest integer, with no ambiguity. 236 Marker bit (M): 1 bit 238 If progressive scan video is being transmitted, the marker bit 239 denotes the end of a video frame. If interlaced video is being 240 transmitted, it denotes the end of the field. The marker bit MUST 241 be set to 1 for the last packet of the video frame/field. It MUST 242 be set to 0 for other packets. 244 Sequence Number: 16 bits 246 The low order bits of the RTP sequence number. The standard 16 bit 247 sequence number is augmented with another 16 bits in the payload 248 header in order to avoid problems due to wrap-around when operating at 249 high data rates. 251 4.2. Payload Header 253 Extended Sequence Number : 16 bits 255 The high order bits of the extended 32 bit sequence number, in 256 network byte order. 258 Length: 16 bits 260 Number of octets of data included from this scan line, in network 261 byte order. This MUST be a multiple of the pgroup value. 263 Line No : 15 bits 265 Scan line number of encapsulated data, in network byte order. 266 Successive RTP packets MAY contain parts of the same scan line 267 (with an incremented RTP sequence number, but the same timestamp), 268 if it is necessary to fragment a line. 270 Offset : 15 bits 272 Offset of the first pixel of the payload data within the scan line. 273 If YCbCr format data is being transported, this is the pixel offset 274 of the luminance sample; if RGB format data is being transported it 275 is the pixel offset of the red sample; if BGR format data is being 276 transported it is the pixel offset of the blue sample. The value is 277 in network byte order. The offset has a value of zero if the first 278 sample in the payload corresponds to the start of the line, and 279 increments by one for each pixel.
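As an illustration of the layout in Figure 1, the sketch below parses the payload header of one packet. It assumes the fixed RTP header has already been removed and its 16 bit sequence number is passed in separately. The names parse_payload and LineHeader are hypothetical and chosen only for this example. The F and C bits are taken from the top bit of the Line No and Offset words respectively, as drawn in Figure 1.

```python
import struct
from typing import List, NamedTuple, Tuple


class LineHeader(NamedTuple):
    length: int   # octets of data carried for this scan line
    field: int    # F bit
    line_no: int  # 15 bit scan line number
    offset: int   # 15 bit pixel offset within the scan line


def parse_payload(payload: bytes, rtp_seq: int) -> Tuple[int, List[LineHeader], bytes]:
    """Return (32 bit sequence number, scan line headers, video data)."""
    # The first 2 octets hold the high order bits of the sequence number.
    (ext_seq,) = struct.unpack_from("!H", payload, 0)
    seq32 = (ext_seq << 16) | (rtp_seq & 0xFFFF)
    pos = 2
    headers = []
    while True:
        # Each scan line header is 6 octets, in network byte order.
        length, f_line, c_off = struct.unpack_from("!HHH", payload, pos)
        pos += 6
        headers.append(LineHeader(length, f_line >> 15, f_line & 0x7FFF, c_off & 0x7FFF))
        if not (c_off >> 15):  # C=0: no further scan line header follows
            break
    return seq32, headers, payload[pos:]


# Hypothetical packet: two 4 octet fragments, from scan lines 26 and 27.
example = struct.pack("!7H", 1, 4, 26, 0x8000, 4, 27, 0) + bytes(8)
seq32, headers, data = parse_payload(example, rtp_seq=0xFFFF)
```

A receiver would then split the returned data into consecutive runs of headers[i].length octets, one run per scan line header.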
281 Field Identification (F): 1 bit 283 Identifies which field the scan line belongs to, for interlaced 284 data. F=0 identifies the first field and F=1 the second field. 285 For progressive scan data (e.g. SMPTE 296M format video), F MUST 286 always be set to zero. 288 Continuation (C): 1 bit 290 Determines if an additional scan line header follows the current 291 scan line header in the RTP packet. Set to 1 if an additional 292 header follows, implying that the RTP packet is carrying data for 293 more than one scan line. Set to 0 otherwise. Several scan lines 294 MAY be included in a single packet, up to the path MTU limit. The 295 only way to determine the number of scan lines included per packet 296 is to parse the payload headers. 298 4.3. Payload Data 300 Depending on the video format, each RTP packet can include either a 301 single complete scan line, a single fragment of a scan line, or one 302 (or more) complete scan lines and scan line fragments. The length of 303 each scan line or scan line fragment MUST be an integer multiple of 304 the pgroup size in octets. Scan lines SHOULD be fragmented so that 305 the resulting RTP packet is smaller than the path MTU. 307 It is possible that the scan line length is not evenly divisible by 308 the number of pixels in a pgroup, so the final pixel data of a scan 309 line does not align to either an octet or pgroup boundary. 310 Nonetheless the payload MUST contain a whole number of pgroups; the 311 sender MUST fill the remaining bits of the final pgroup with zero and 312 the receiver MUST ignore the fill data. (In effect, the trailing edge 313 of the image is black-filled to a pgroup boundary.) 315 For RGB format video, samples are packed in order Red-Green-Blue. For 316 BGR format video, samples are packed in order Blue-Green-Red. For 317 both formats, if 8 bit samples are used, the pgroup is 3 octets. If 318 10 bit samples are used, samples from 4 adjacent pixels form 15 octet 319 pgroups.
If 12 bit samples are used, samples from 2 adjacent pixels 320 form 9 octet pgroups. If 16 bit samples are used, each pixel forms a 321 separate 6 octet pgroup. 323 For RGBA format video, samples are packed in order Red-Green-Blue- 324 Alpha. For BGRA format video, samples are packed in order Blue- 325 Green-Red-Alpha. For 8, 10, 12, or 16 bit samples, each pixel forms 326 its own pgroup, with octet sizes of 4, 5, 6 and 8 respectively. 328 If the video is in YCbCr format, the packing of samples into the 329 payload depends on the color sub-sampling used. 331 For YCbCr 4:4:4 format video, samples are packed in order Cb-Y-Cr for 332 both interlaced and progressive frames. If 8 bit samples are used, 333 the pgroup is 3 octets. If 10 bit samples are used, samples from 4 334 adjacent pixels form 15 octet pgroups. If 12 bit samples are used, 335 samples from 2 adjacent pixels form 9 octet pgroups. If 16 bit 336 samples are used, each pixel forms a separate 6 octet pgroup. 338 For YCbCr 4:2:2 format video, the Cb and Cr components are 339 horizontally sub-sampled by a factor of two (each Cb and Cr sample 340 corresponds to two Y components). Samples are packed in order 341 Cb0-Y0-Cr0-Y1 for both interlaced and progressive scan lines. For 8, 342 10, 12 or 16 bit samples, the pgroup is formed from two adjacent 343 pixels (4, 5, 6 or 8 octets respectively). 345 For YCbCr 4:1:1 format video, the Cb and Cr components are 346 horizontally sub-sampled by a factor of four (each Cb and Cr sample 347 corresponds to four Y components). Samples are packed in order 348 Cb0-Y0-Y1-Cr0-Y2-Y3 for both interlaced and progressive scan lines. 349 For 8, 10, 12 or 16 bit samples, the pgroup is formed from four 350 adjacent pixels (6, 15, 9 or 12 octets respectively). 352 For YCbCr 4:2:0 video, the Cb and Cr components are sub-sampled by a 353 factor of two both horizontally and vertically. Therefore chrominance 354 samples are shared between certain adjacent lines.
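The pgroup sizes quoted for RGB, RGBA, and YCbCr 4:4:4, 4:2:2 and 4:1:1 video can be reproduced with a little arithmetic: take the smallest set of pixels that shares samples, then repeat it until the total is a whole number of octets. The sketch below is illustrative only; the table SHARING_UNIT and the function pgroup are hypothetical names, and 4:2:0 is left out because its grouping depends on whether the video is progressive or interlaced. For 10 bit 4:1:1 the result spans eight pixels, that is, two groups of four.

```python
from math import gcd
from typing import Tuple

# (samples, pixels) in the smallest unit that shares samples,
# read off the packing orders described in the text.
SHARING_UNIT = {
    "RGB": (3, 1), "BGR": (3, 1),
    "RGBA": (4, 1), "BGRA": (4, 1),
    "YCbCr-4:4:4": (3, 1),
    "YCbCr-4:2:2": (4, 2),
    "YCbCr-4:1:1": (6, 4),
}


def pgroup(sampling: str, depth: int) -> Tuple[int, int]:
    """Return (pgroup size in octets, pixels per pgroup)."""
    samples, pixels = SHARING_UNIT[sampling]
    bits = samples * depth
    # Smallest repeat count that makes the bit total a multiple of 8.
    repeat = 8 // gcd(bits, 8)
    return bits * repeat // 8, pixels * repeat


for depth in (8, 10, 12, 16):
    print(depth, pgroup("RGB", depth), pgroup("YCbCr-4:2:2", depth))
```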
Figure 2 shows 355 the composition of luminance and chrominance samples for a 6x6 pixel 356 grid of 4:2:0 YCbCr video. The pixel group is a group of four pixels 357 arranged in a 2x2 matrix. The octet size of the pgroup for 358 progressive scan 4:2:0 video with sample sizes of 8, 10, 12 and 16 359 bits is 6, 15, 9 and 12 octets respectively. For interlaced 4:2:0 360 video the corresponding pgroups are 4, 5, 6 and 8 octets. 362 line 0: Y00 Y01 Y02 Y03 Y04 Y05 363 Cb00 Cr00 Cb01 Cr01 Cb02 Cr02 364 line 1: Y10 Y11 Y12 Y13 Y14 Y15 366 line 2: Y20 Y21 Y22 Y23 Y24 Y25 367 Cb10 Cr10 Cb11 Cr11 Cb12 Cr12 368 line 3: Y30 Y31 Y32 Y33 Y34 Y35 370 line 4: Y40 Y41 Y42 Y43 Y44 Y45 371 Cb20 Cr20 Cb21 Cr21 Cb22 Cr22 372 line 5: Y50 Y51 Y52 Y53 Y54 Y55 374 Figure 2: Chrominance/luminance composition in 4:2:0 YCbCr video 376 When packetizing progressive scan 4:2:0 YCbCr video, samples from two 377 consecutive scan lines are included in each packet. The scan line 378 number in the payload header is set to that of the first scan line of 379 the pair: 381 line 0/1: 382 Y00-Y01-Y10-Y11-Cb00-Cr00 Y02-Y03-Y12-Y13-Cb01-Cr01 383 Y04-Y05-Y14-Y15-Cb02-Cr02 385 line 2/3: 386 Y20-Y21-Y30-Y31-Cb10-Cr10 Y22-Y23-Y32-Y33-Cb11-Cr11 387 Y24-Y25-Y34-Y35-Cb12-Cr12 389 line 4/5: 390 Y40-Y41-Y50-Y51-Cb20-Cr20 Y42-Y43-Y52-Y53-Cb21-Cr21 391 Y44-Y45-Y54-Y55-Cb22-Cr22 393 Figure 3: Packetization of progressive 4:2:0 YCbCr video 395 For interlaced transport, chrominance samples are transported with 396 every other line. The first set of chrominance samples may be 397 transported with either the first line of field 0, or the first 398 line of field 1. The example below illustrates the transport of 399 chrominance samples starting with the first line of field 0 (signaled 400 by the "top-field-first" MIME parameter).
402 field 0: 403 line 0: Y00-Y01-Cb00-Cr00 Y02-Y03-Cb01-Cr01 Y04-Y05-Cb02-Cr02 404 line 2: Y20-Y21 Y22-Y23 Y24-Y25 405 line 4: Y40-Y41-Cb20-Cr20 Y42-Y43-Cb21-Cr21 Y44-Y45-Cb22-Cr22 407 field 1: 408 line 1: Y10-Y11 Y12-Y13 Y14-Y15 409 line 3: Y30-Y31-Cb10-Cr10 Y32-Y33-Cb11-Cr11 Y34-Y35-Cb12-Cr12 410 line 5: Y50-Y51 Y52-Y53 Y54-Y55 412 Figure 4: Packetization of interlaced 4:2:0 YCbCr video with 413 top-field-first. 415 Chrominance values may be sampled with different offsets relative to 416 luminance values. For instance, in Figure 2, chrominance values are 417 sampled at the same distance from neighboring luminance samples. It 418 is also possible for a chrominance sample to be co-sited with a 419 luminance sample, as in Figure 5: 421 line 0: Y00-C Y01 Y02-C Y03 Y04-C Y05 423 line 1: Y10 Y11 Y12 Y13 Y14 Y15 425 line 2: Y20-C Y21 Y22-C Y23 Y24-C Y25 427 line 3: Y30 Y31 Y32 Y33 Y34 Y35 429 line 4: Y40-C Y41 Y42-C Y43 Y44-C Y45 431 line 5: Y50 Y51 Y52 Y53 Y54 Y55 433 Figure 5: Co-sited video sampling in 4:2:0 YCbCr video where C 434 designates a CbCr pair 436 In general chrominance values may be placed between luminance samples 437 or co-sited. Positions can be designated by an integer numbering 438 system starting from left to right and top to bottom.
The following 439 position matrices apply for 4:2:0, 4:2:2 and 4:1:1 video: 441 line N: Y[0] [1] Y[2] Y[0] [1] Y[2] 442 [3] [4] Y[5] [3] [4] [5] 443 line N+1: Y[6] [7] Y[8] Y[6] [7] Y[8] 445 Figure 6: Chrominance position matrix for 4:2:0 YCbCr video 447 line N: Y[0] [1] Y[2] [3] Y[0] [1] Y[2] [3] 448 line N+1: Y[0] [1] Y[2] [3] Y[0] [1] Y[2] [3] 450 Figure 7: Chrominance position matrix for 4:2:2 YCbCr video 452 line N: Y[0] [1] Y[2] [3] Y[4] [5] Y[6] 453 line N+1: Y[0] [1] Y[2] [3] Y[4] [5] Y[6] 455 Figure 8: Chrominance position matrix for 4:1:1 YCbCr video 457 While these positions do not affect the packetization order of 458 chrominance and luminance samples, the information is needed for 459 interpolation prior to display and therefore should be signaled to 460 the receiver. 462 5. RTCP Considerations 464 RTCP SHOULD be used as specified in RFC3550 [RTP]. It is to be noted 465 that the sender's octet count in SR packets and the cumulative number 466 of packets lost will wrap around quickly for high data rate streams. 467 This means these two fields may not accurately represent octet count 468 and number of packets lost since the beginning of transmission, as 469 defined in RFC 3550. Therefore for network monitoring purposes other 470 means of keeping track of these variables SHOULD be used. 472 6. IANA Considerations 474 The IANA is requested to register one new MIME subtype along with an 475 associated RTP Payload Format, and to create two sub-parameter 476 registries, as described in the following. 478 6.1. MIME type registration 480 MIME media type name: video 482 MIME subtype name: raw 484 Required parameters: 486 rate: The RTP timestamp clock rate. Applications using this payload 487 format SHOULD use a value of 90000. 489 sampling: Determines the color (sub-)sampling mode of the video 490 stream. Currently defined values are RGB, RGBA, BGR, BGRA, 491 YCbCr-4:4:4, YCbCr-4:2:2, YCbCr-4:2:0, and YCbCr-4:1:1.
New values 492 may be registered as described in section 6.2 of RFC XXXX. 494 width: Determines the number of pixels per line. This is an integer 495 between 1 and 32767. 497 height: Determines the number of lines per frame. This is an 498 integer between 1 and 32767. 500 depth: Determines the number of bits per sample. This is an integer 501 with typical values including 8, 10, 12, and 16. 503 colorimetry: This parameter defines the set of colorimetric 504 specifications and other transfer characteristics for the video 505 source, by reference to an external specification. Valid values and 506 their specification are: 508 BT601-5 ITU Recommendation BT.601-5 [601] 509 BT709-2 ITU Recommendation BT.709-2 [709] 510 SMPTE240M SMPTE standard 240M [240] 512 New values may be registered as described in section 6.2 of RFC 513 XXXX. 515 Optional parameters: 517 Interlace: If this OPTIONAL parameter is present, it indicates that 518 the video stream is interlaced. If absent, progressive scan is 519 implied. 521 Top-field-first: If this OPTIONAL parameter is present, it 522 indicates that chrominance samples are packetized starting with the 523 first line of field 0. Its absence implies that chrominance samples 524 are packetized starting with the first line of field 1. 526 chroma-position: This OPTIONAL parameter defines the position of 527 chrominance samples relative to luminance samples. It is either a 528 single integer or a comma separated pair of integers. Integer 529 values range from 0 to 8, as specified in Figures 6-8 of RFC XXXX. 530 A single integer implies that Cb and Cr are co-sited. A comma 531 separated pair of integers designates the locations of Cb and Cr 532 samples, respectively. In its absence, a single value of zero is 533 assumed for color-subsampled video (chroma-position=0). 535 gamma: An OPTIONAL floating point gamma correction value. 537 Encoding considerations: 539 Uncompressed video can be transmitted with RTP as specified in RFC 540 XXXX. 
No file format is defined at this time. 542 Security considerations: See section 9 of RFC XXXX. 544 Interoperability considerations: NONE. 546 Published specification: RFC XXXX. 548 Applications which use this media type: Video communication. 550 Additional information: None 552 Magic number(s): None 554 File extension(s): None 556 Macintosh File Type Code(s): None 558 Person & email address to contact for further information: 560 Ladan Gharai 561 IETF Audio/Video Transport working group. 563 Intended usage: COMMON 565 Author/Change controller: Ladan Gharai 567 6.2. Parameter Registration 569 New values of the "sampling" parameter MAY be registered with the 570 IANA provided they reference an RFC or other permanent and readily 571 available specification (the Specification Required policy of RFC 572 2434 [2434]). A new registration MUST define the packing order of 573 samples and the valid combinations of color and sub-sampling modes. 575 New values of the "colorimetry" parameter MAY be registered with the 576 IANA provided they reference an RFC or other permanent and readily 577 available specification of colorimetric parameters and other 578 applicable transfer characteristics (the Specification Required 579 policy of RFC 2434 [2434]). 581 7. Mapping MIME Parameters into SDP 583 The information carried in the MIME media type specification has a 584 specific mapping to fields in the Session Description Protocol (SDP) 585 [SDP], which is commonly used to describe RTP sessions. When SDP is 586 used to specify sessions transporting uncompressed video, the mapping 587 is as follows: 589 - The MIME type ("video") goes in SDP "m=" as the media name. 591 - The MIME subtype (payload format name) goes in SDP "a=rtpmap" 592 as the encoding name. 594 - Remaining parameters go in the SDP "a=fmtp" attribute by 595 copying them directly from the MIME media type string as a 596 semicolon separated list of parameter=value pairs.
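The three mapping rules can be sketched as a small helper that turns a set of MIME parameters into SDP lines. This is an illustration under stated assumptions rather than a normative encoder: the function name sdp_lines_for_raw_video is hypothetical, the port and payload type are arbitrary values picked by the caller, and a parameter whose value is None is written as a bare flag (an assumption made here for valueless parameters such as "interlace").

```python
from typing import Dict, List, Optional


def sdp_lines_for_raw_video(port: int, payload_type: int, rate: int,
                            params: Dict[str, Optional[object]]) -> List[str]:
    """Build the m=, a=rtpmap and a=fmtp lines for a video/raw stream."""
    # Remaining parameters become a semicolon separated list of name=value pairs.
    pairs = [name if value is None else f"{name}={value}"
             for name, value in params.items()]
    return [
        f"m=video {port} RTP/AVP {payload_type}",     # MIME type as media name
        f"a=rtpmap:{payload_type} raw/{rate}",        # MIME subtype as encoding name
        f"a=fmtp:{payload_type} " + "; ".join(pairs),
    ]


lines = sdp_lines_for_raw_video(30000, 112, 90000, {
    "sampling": "YCbCr-4:2:2",
    "width": 1280,
    "height": 720,
    "depth": 10,
    "colorimetry": "BT709-2",
    "interlace": None,
})
print("\n".join(lines))
```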
598 A sample SDP mapping for uncompressed video is as follows: 600 m=video 30000 RTP/AVP 112 601 a=rtpmap:112 raw/90000 602 a=fmtp:112 sampling=YCbCr-4:2:2; width=1280; height=720; depth=10; 603 colorimetry=BT709-2; chroma-position=1 605 In this example, a dynamic payload type 112 is used for uncompressed 606 video. The RTP sampling clock is 90kHz. Note that the "a=fmtp:" line 607 has been wrapped to fit this page, and will be a single long line in 608 the SDP file. 610 8. Security Considerations 612 RTP packets using the payload format defined in this specification 613 are subject to the security considerations discussed in the RTP 614 specification [RTP] and any appropriate RTP profile. This implies 615 that confidentiality of the media streams is achieved by encryption. 617 This payload type does not exhibit any significant non-uniformity in 618 the receiver side computational complexity for packet processing to 619 cause a potential denial-of-service threat. 621 It is important to note that uncompressed video can have immense 622 bandwidth requirements (up to 270 Mbps for standard definition video, 623 and approximately 1 Gbps for high definition video). This is 624 sufficient to cause potential for denial-of-service if transmitted 625 onto most currently available Internet paths. 627 Accordingly, if best-effort service is being used, users of this 628 payload format MUST monitor packet loss to ensure that the packet 629 loss rate is within acceptable parameters. Packet loss is considered 630 acceptable if a TCP flow across the same network path, and 631 experiencing the same network conditions, would achieve an average 632 throughput, measured on a reasonable timescale, that is not less than 633 the RTP flow is achieving.
This condition can be satisfied by 634 implementing congestion control mechanisms to adapt the transmission 635 rate (or the number of layers subscribed for a layered multicast 636 session), or by arranging for a receiver to leave the session if the 637 loss rate is unacceptably high. 639 This payload format may also be used in networks which provide 640 quality of service guarantees. If enhanced service is being used, 641 receivers SHOULD monitor packet loss to ensure that the service that 642 was requested is actually being delivered. If it is not, then they 643 SHOULD assume that they are receiving best-effort service and behave 644 accordingly. 646 9. Relation to RFC 2431 648 In comparison with RFC 2431 this memo specifies support for a wider 649 variety of uncompressed video, in terms of frame size, color sub- 650 sampling and sample sizes. While [BT656] can transport up to 4096 651 scan lines and 2048 pixels per line, our payload type can support up 652 to 32768 scan lines and pixels per line. Also, RFC 2431 only addresses 653 4:2:2 YCbCr data, while this memo covers YCbCr, RGB, RGBA, BGR and 654 BGRA and most common color sub-sampling schemes. Given the variety of 655 video types that we cover, this memo also assumes out-of-band 656 signaling for sample size and data types (RFC 2431 uses in band 657 signaling). 659 10. Relation to RFC 3497 661 RFC 3497 [292RTP] specifies an RTP payload format for encapsulating 662 SMPTE 292M video. The SMPTE 292M standard defines a bit-serial 663 digital interface for local area High Definition Television (HDTV) 664 transport. As a transport medium, SMPTE 292M utilizes 10 bit words 665 and a fixed 1.485Gbps (and 1.485/1.001Gbps) data rate. SMPTE 292M is 666 typically used in the broadcast industry for the transport of other 667 video formats such as SMPTE 260M, SMPTE 295M, SMPTE 274M and SMPTE 668 296M. 670 RFC 3497 defines a circuit emulation for the transport of SMPTE 292M 671 over RTP.
    It is very specific to SMPTE 292M and has been designed to
672    be interoperable with existing broadcast equipment with a constant
673    rate of 1.485 Gbps.

675    This memo defines a flexible native packetization scheme which can
676    packetize any uncompressed video, at varying data rates. In addition,
677    unlike RFC 3497, this memo only transports active video pixels (i.e.,
678    horizontal and vertical blanking are not transported).

680 11. Acknowledgments

682    The authors are grateful to Philippe Gentric, Chuck Harrison, Stephan
683    Wenger and Dave Singer for their feedback.

685    This memo is based upon work supported by the U.S. National Science
686    Foundation (NSF) under Grant No. 0230738. Any opinions, findings and
687    conclusions or recommendations expressed in this material are those
688    of the authors and do not necessarily reflect the views of NSF.

690 12. Authors' Addresses

692    Ladan Gharai
693    USC Information Sciences Institute
694    3811 N. Fairfax Drive, #200
695    Arlington, VA 22203
696    USA

698    Colin Perkins
699    University of Glasgow
700    Department of Computing Science
701    17 Lilybank Gardens
702    Glasgow G12 8QQ
703    United Kingdom

705 Normative References

707    [RTP]  H. Schulzrinne, S. Casner, R. Frederick and V. Jacobson,
708           "RTP: A Transport Protocol for Real-Time Applications",
709           Internet Engineering Task Force, RFC 3550, July 2003.

711    [2119] S. Bradner, "Key words for use in RFCs to Indicate
712           Requirement Levels", RFC 2119, March 1997.

714    [2434] T. Narten and H. Alvestrand, "Guidelines for Writing an IANA
715           Considerations Section in RFCs", RFC 2434, October 1998.

717    [601]  International Telecommunication Union, "Studio encoding
718           parameters of digital television for standard 4:3 and wide
719           screen 16:9 aspect ratios", Recommendation BT.601, October
720           1995.
722    [709]  International Telecommunication Union, "Parameter Values for
723           HDTV Standards for Production and International Programme
724           Exchange", Recommendation BT.709-2.

726    [240]  Society of Motion Picture and Television Engineers,
727           "Television - Signal Parameters - 1125-Line High-Definition
728           Production", SMPTE 240M-1999.

730 Informative References

732    [274]  Society of Motion Picture and Television Engineers,
733           "1920x1080 Scanning and Analog and Parallel Digital
734           Interfaces for Multiple Picture Rates", SMPTE 274M-1998.

736    [268]  Society of Motion Picture and Television Engineers,
737           "File Format for Digital Moving Picture Exchange (DPX)",
738           SMPTE 268M-1994. (Currently under revision.)

740    [296]  Society of Motion Picture and Television Engineers,
741           "1280x720 Scanning, Analog and Digital Representation and
742           Analog Interfaces", SMPTE 296M-1998.

744    [372]  Society of Motion Picture and Television Engineers,
745           "Dual Link 292M Interface for 1920 x 1080 Picture Raster",
746           SMPTE 372M-2002.

748    [ALF]  Clark, D. D., and Tennenhouse, D. L., "Architectural
749           Considerations for a New Generation of Protocols", In
750           Proceedings of SIGCOMM '90 (Philadelphia, PA, Sept. 1990),
751           ACM.

753    [SDP]  M. Handley and V. Jacobson, "SDP: Session Description
754           Protocol", RFC 2327, April 1998.

756    [BT656] D. Tynan, "RTP Payload Format for BT.656 Video Encoding",
757           Internet Engineering Task Force, RFC 2431, October 1998.

759    [292RTP] L. Gharai et al., "RTP Payload Format for SMPTE 292M Video",
760           RFC 3497, March 2003.

762    [656]  International Telecommunication Union, "Interfaces for
763           Digital Component Video Signals in 525-line and 625-line
764           Television Systems Operating at the 4:2:2 Level of
765           Recommendation ITU-R BT.601 (Part A)", Recommendation
766           BT.656, April 1998.
768    [22028] ISO TC42 (Photography), Photography and graphic technology -
769           Extended colour encodings for digital image storage,
770           manipulation and interchange - Part 1: Architecture and
771           requirements, ISO/CD 22028-1, Work in Progress.

773 13. IPR Notice

775    The IETF takes no position regarding the validity or scope of any
776    intellectual property or other rights that might be claimed to
777    pertain to the implementation or use of the technology described in
778    this document or the extent to which any license under such rights
779    might or might not be available; neither does it represent that it
780    has made any effort to identify any such rights. Information on the
781    IETF's procedures with respect to rights in standards-track and
782    standards-related documentation can be found in BCP-11. Copies of
783    claims of rights made available for publication and any assurances of
784    licenses to be made available, or the result of an attempt made to
785    obtain a general license or permission for the use of such
786    proprietary rights by implementors or users of this specification can
787    be obtained from the IETF Secretariat.

789    The IETF invites any interested party to bring to its attention any
790    copyrights, patents or patent applications, or other proprietary
791    rights which may cover technology that may be required to practice
792    this standard. Please address the information to the IETF Executive
793    Director.

795 14. Full Copyright Statement

797    Copyright (C) The Internet Society 2003. All Rights Reserved.

799    This document and translations of it may be copied and furnished to
800    others, and derivative works that comment on or otherwise explain it
801    or assist in its implementation may be prepared, copied, published and
802    distributed, in whole or in part, without restriction of any kind,
803    provided that the above copyright notice and this paragraph are
804    included on all such copies and derivative works.
    However, this
805    document itself may not be modified in any way, such as by removing
806    the copyright notice or references to the Internet Society or other
807    Internet organizations, except as needed for the purpose of
808    developing Internet standards in which case the procedures for
809    copyrights defined in the Internet Standards process must be
810    followed, or as required to translate it into languages other than
811    English.

813    The limited permissions granted above are perpetual and will not be
814    revoked by the Internet Society or its successors or assigns.

816    This document and the information contained herein is provided on an
817    "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
818    TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
819    BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
820    HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
821    MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
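
Non-normative illustration. The sample SDP mapping that precedes the
Security Considerations, and the bandwidth figures quoted in Section 8,
can be related by a short calculation. The following is a minimal
sketch in Python, not part of the payload format. The function names
parse_fmtp and active_video_bps, the samples-per-pixel table, and the
frame rate of 60 frames per second are assumptions made for this
example only; the frame rate in particular is not carried in the
sample "a=fmtp:" line. The sketch counts active video pixels only and
ignores RTP, UDP and IP header overhead.

```python
# Hypothetical helpers for illustration; not defined by this memo.

# The sample "a=fmtp:" line, joined into the single long line that
# would appear in the SDP file.
EXAMPLE_FMTP = (
    "a=fmtp:112 sampling=YCbCr-4:2:2; width=1280; height=720; depth=10; "
    "colorimetry=BT.709-2; chroma-position=1"
)

# Average number of samples per pixel for a few sampling modes.
# Assumption: this table is an illustrative subset, and the RGB-family
# strings are assumed spellings of the "sampling" parameter values.
SAMPLES_PER_PIXEL = {
    "YCbCr-4:2:2": 2.0,  # one Y per pixel, Cb and Cr shared by two pixels
    "RGB": 3.0,
    "BGR": 3.0,
    "RGBA": 4.0,
    "BGRA": 4.0,
}


def parse_fmtp(line):
    """Split 'a=fmtp:<pt> k=v; k=v; ...' into (payload_type, params)."""
    prefix = "a=fmtp:"
    if not line.startswith(prefix):
        raise ValueError("not an a=fmtp line")
    payload_type, _, rest = line[len(prefix):].partition(" ")
    params = {}
    for item in rest.split(";"):
        item = item.strip()
        if not item:
            continue  # tolerate a trailing ';'
        key, _, value = item.partition("=")
        params[key.strip()] = value.strip()
    return int(payload_type), params


def active_video_bps(params, frames_per_second):
    """Estimate the bit rate of the active video area, in bits/second."""
    bits_per_pixel = SAMPLES_PER_PIXEL[params["sampling"]] * int(params["depth"])
    pixels_per_frame = int(params["width"]) * int(params["height"])
    return pixels_per_frame * bits_per_pixel * frames_per_second


if __name__ == "__main__":
    pt, params = parse_fmtp(EXAMPLE_FMTP)
    rate = active_video_bps(params, 60)  # 60 fps is an assumed frame rate
    print("payload type %d: about %.2f Gbps" % (pt, rate / 1e9))
```

Under these assumptions the 1280x720, 10-bit, 4:2:2 example works out
to roughly 1.1 Gbps of active video, which is consistent with the
"approximately 1 Gbps" figure given for high definition video in
Section 8.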