idnits 2.17.1

draft-ietf-avt-dv-video-03.txt:
  ** The Abstract section seems to be numbered

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** Looks like you're using RFC 2026 boilerplate. This must be updated to
     follow RFC 3978/3979, as updated by RFC 4748.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  ** The document seems to lack a 1id_guidelines paragraph about 6 months
     document validity -- however, there's a paragraph with a matching
     beginning. Boilerplate error?

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard

  == The page length should not exceed 58 lines per page, but there was 11
     longer pages, the longest (page 2) being 60 lines

  == It seems as if not all pages are separated by form feeds - found 0 form
     feeds but 12 pages

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack an IANA Considerations section. (See Section
     2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
     when there are no actions for IANA.)

  == There are 2 instances of lines with non-RFC6890-compliant IPv4 addresses
     in the document. If these are example addresses, they should be changed.

  == There are 4 instances of lines with multicast IPv4 addresses in the
     document. If these are generic example addresses, they should be changed
     to use the 233.252.0.x range defined in RFC 5771

  ** The document seems to lack a both a reference to RFC 2119 and the
     recommended RFC 2119 boilerplate, even if it appears to use RFC 2119
     keywords -- however, there's a paragraph with a matching beginning.
     Boilerplate error?

     RFC 2119 keyword, line 129: '... types MUST be assigned, one for eac...'
     RFC 2119 keyword, line 130: '... MUST change to the corresponding pa...'
     RFC 2119 keyword, line 135: '... video frame MUST have the same time...'
     RFC 2119 keyword, line 148: '...ideo frame times MAY be monitored usin...'
     RFC 2119 keyword, line 156: '...f a frame change MUST NOT rely on the ...'
     (14 more instances...)

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the RFC 3978 Section 5.4 Copyright Line does not
     match the current year

  == Using lowercase 'not' together with uppercase 'MUST', 'SHALL', 'SHOULD',
     or 'RECOMMENDED' is not an accepted usage according to RFC 2119. Please
     use uppercase 'NOT' together with RFC 2119 keywords (if that is what you
     mean).

     Found 'MUST not' in this paragraph:

     When the fmtp audio: parameter is not announced, the audio data MUST not
     be bundled into the DV video stream.

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008. If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment. If not, you may need to add the pre-RFC5378 disclaimer.
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (December 2000) is 8526 days in the past. Is this
     intentional?

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  -- Missing reference section? '6' on line 525 looks like a reference

  -- Missing reference section? '1' on line 511 looks like a reference

  -- Missing reference section? '2' on line 515 looks like a reference

  -- Missing reference section? '3' on line 517 looks like a reference

  -- Missing reference section?
     '4' on line 519 looks like a reference

  -- Missing reference section? '5' on line 522 looks like a reference

  -- Missing reference section? '7' on line 529 looks like a reference

  -- Missing reference section? '8' on line 532 looks like a reference

  -- Missing reference section? '9' on line 535 looks like a reference

  -- Missing reference section? '10' on line 538 looks like a reference

  -- Missing reference section? '11' on line 542 looks like a reference

  -- Missing reference section? '12' on line 545 looks like a reference

  -- Missing reference section? '13' on line 549 looks like a reference

  -- Missing reference section? '14' on line 552 looks like a reference

     Summary: 5 errors (**), 0 flaws (~~), 7 warnings (==), 16 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

1   INTERNET-DRAFT                                        Katsushi Kobayashi
2   draft-ietf-avt-dv-video-03.txt         Communication Research Laboratory
3                                                             Akimichi Ogawa
4                                                            Keio University
5                                                             Stephen Casner
6                                                              Cisco Systems
7                                                            Carsten Bormann
8                                                    Universitaet Bremen TZI
9                                                              June 26, 2000
10                                                     Expires December 2000

12                   RTP Payload Format for DV Format Video

14  Status of this Memo

16     This document is an Internet-Draft and is in full conformance with
17     all provisions of Section 10 of RFC 2026.

19     Internet-Drafts are working documents of the Internet Engineering
20     Task Force (IETF), its areas, and its working groups. Note that
21     other groups may also distribute working documents as Internet-
22     Drafts.

24     Internet-Drafts are draft documents valid for a maximum of six months
25     and may be updated, replaced, or obsoleted by other documents at any
26     time. It is inappropriate to use Internet-Drafts as reference
27     material or to cite them other than as "work in progress."

29     The list of current Internet-Drafts can be accessed at
30     http://www.ietf.org/ietf/1id-abstracts.txt

32     The list of Internet-Draft Shadow Directories can be accessed at
33     http://www.ietf.org/shadow.html.

35  1. Abstract

37     This document specifies the packetization scheme for encapsulating
38     the compressed digital video data streams commonly known as "DV" into
39     a payload format for the Real-Time Transport Protocol (RTP). There
40     are two kinds of DV, one for consumer use and the other for
41     professional. The original "DV" specification designed for consumer-
42     use digital VCRs is approved as the IEC 61834 standard set. The
43     specifications for professional DV are published as SMPTE 306M(D-7)
44     and 314M(D-9). Both are based on consumer DV. The RTP payload
45     format specified in this document supports IEC 61834 consumer DV and
46     professional SMPTE 306M and 314M(DV-Based) formats.

48  2. Introduction

50     This document specifies payload formats for encapsulating both
51     consumer- and professional-use DV format data streams into the Real-
52     time Transport Protocol (RTP), version 2 [6]. DV compression audio
53     and video formats were designed for helical-scan magnetic tape media.
54     The DV standards for consumer-market devices, the IEC 61883 and 61834
55     series, cover many aspects of consumer-use digital video, including
56     mechanical specifications of a cassette, magnetic recording format,
57     error correction on the magnetic tape, DCT video encoding format, and
58     audio encoding format[1]. The digital interface part of IEC 61883
59     defines an interface on an IEEE 1394 network[2,3]. This specification
60     set supports several video formats: SD-VCR (Standard Definition), HD-
61     VCR (High Definition), SDL-VCR (Standard Definition - Long), PALPlus,
62     DVB (Digital Video Broadcast) and ATV (Advanced Television). North
63     American formats are indicated with a number of lines and "/60",
64     while European formats use "/50".
       DV standards extended for
65     professional use were published by SMPTE as 306M and 314M, for a
66     different sampling system, higher color resolution, and higher bit
67     rates[4,5].

69     IEC 61834 also includes magnetic tape recording for digital TV
70     broadcasting systems (such as DVB and ATV) that use MPEG2 encoding.
71     The payload format for encapsulating MPEG2 into RTP has already been
72     defined in RFC 2250[7] and others.

74     Consequently, the payload specified in this document will support six
75     video formats of the IEC standard: SD-VCR (525/60, 625/50), HD-VCR
76     (1125/60, 1250/50) and SDL-VCR (525/60, 625/50), and six of the SMPTE
77     standards: 306M (525/60, 625/50), 314M 25Mbps (525/60, 625/50) and
78     314M 50Mbps (525/60, 625/50). In the future it can be extended into
79     other high-definition formats.

81     Throughout this specification, we make extensive use of the
82     terminology of IEC and SMPTE standards. The reader should consult the
83     original references for definitions of these terms.

85  2.1 Terminology

87     The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
88     "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
89     document are to be interpreted as described in RFC 2119 [8].

91  3. DV format encoding

93     The DV format only uses the DCT compression technique within each
94     frame, contrasted with the interframe compression of the MPEG video
95     standards [9,10]. All video data including audio and other system
96     data are managed within the picture frame unit of video.

98     The DV encoding is composed of a three-level hierarchical structure.
99     A picture frame is divided into rectangle- or clipped-rectangle-
100    shaped DCT super blocks. DCT super blocks are divided into 27
101    rectangle- or square-shaped DCT macro blocks. Audio data is encoded
102    with PCM format. The sampling frequency is 32 kHz, 44.1 kHz or 48
103    kHz and the quantization is 12-bit non-linear, 16-bit linear or
104    20-bit linear. The number of channels may be up to 8.
       Only certain
105    combinations of these parameters are allowed depending upon the video
106    format; the restrictions are specified in each document. A frame of
107    data in the DV format stream is divided into several "DIF sequences".
108    A DIF sequence is composed of an integral number of 80-byte DIF
109    blocks. A DIF block is the primitive unit for all treatment of DV
110    streams. Each DIF block contains a 3-byte ID header that specifies
111    the type of the DIF block and its position in the DIF sequence. Five
112    types of DIF blocks are defined: DIF sequence header, Subcode, Video
113    Auxiliary information (VAUX), Audio and Video. Audio DIF blocks are
114    composed of 5 bytes of Audio Auxiliary data (AAUX) and 72 bytes of
115    audio data.

117    Each RTP packet starts with the RTP header as defined in RFC 1889
118    [6]. No additional payload-format-specific header is required for
119    this payload format.

121 4.1 RTP header usage

123    The RTP header fields that have a meaning specific to the DV format
124    are described as follows:

126    Payload type (PT): The payload type is dynamically assigned by means
127    outside the scope of this document. If multiple DV encoding formats
128    are to be used within one RTP session, then multiple dynamic payload
129    types MUST be assigned, one for each DV encoding format. The sender
130    MUST change to the corresponding payload type whenever the encoding
131    format is changed.

133    Timestamp: 32-bit 90 kHz timestamp representing the time at which the
134    first data in the frame was sampled. All RTP packets within the same
135    video frame MUST have the same timestamp.
       The timestamp SHOULD
136    increment by a multiple of the nominal interval for one frame time,
137    as given in the following table:

139        Mode         Frame rate (Hz)    Increase of one frame
140                                         in 90kHz timestamp

142        525-60           29.97                 3003
143        625-50           25                    3600
144        1125-60          30                    3000
145        1250-50          25                    3600

147    When the DV stream is obtained from an IEEE 1394 interface, the
148    progress of video frame times MAY be monitored using the SYT
149    timestamp carried in the CIP header, as described in Appendix A.

151    Marker bit (M): The marker bit of the RTP fixed header is set to one
152    on the last packet of a video frame, and must otherwise be zero.
153    The M bit allows the receiver to know that it has received the last
154    packet of a frame so it can display the image without waiting for the
155    first packet of the next frame to arrive to detect the frame change.
156    However, detection of a frame change MUST NOT rely on the marker bit
157    since the last packet of the frame might be lost. Detection of a
158    frame change MUST be done by differences in RTP timestamp.

160 4.2 DV data encapsulation into RTP payload

162    Integral DIF blocks are placed into the RTP payload beginning
163    immediately after the RTP header. Any number of DIF blocks may be
164    packed into one RTP packet, except that all DIF blocks in one RTP
165    packet must be from the same video frame. DIF blocks from the next
166    video frame MUST NOT be packed into the same RTP packet even if more
167    payload space remains. This requirement stems from the fact that the
168    transition from one video frame to the next is indicated by a change
169    in the RTP timestamp. It also reduces the processing complexity on
170    the receiver. Since the RTP payload contains an integral number of
171    DIF blocks, the length of the RTP payload will be a multiple of 80
172    bytes.

174    Audio and video data may be transmitted as one bundled RTP stream or
175    in separate RTP streams (unbundled).
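The header rules in 4.1 and the encapsulation rules in 4.2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the (timestamp, marker, payload) tuple standing in for a real RTP packet, and the blocks_per_packet default are all hypothetical and are not defined by this document. The timestamp steps are copied from the table in 4.1.

```python
DIF_BLOCK_SIZE = 80  # bytes per DIF block, as stated in the format description

# Nominal 90 kHz timestamp increment per frame, copied from the table in 4.1.
TIMESTAMP_STEP = {"525-60": 3003, "625-50": 3600, "1125-60": 3000, "1250-50": 3600}


def packetize_frame(frame, timestamp, blocks_per_packet=17):
    """Split one frame's DIF data into (timestamp, marker, payload) tuples.

    blocks_per_packet is a hypothetical tuning knob, not a value taken from
    this document; 17 blocks give a 1360-byte payload.
    """
    if len(frame) == 0 or len(frame) % DIF_BLOCK_SIZE != 0:
        raise ValueError("frame must be a non-empty whole number of DIF blocks")
    chunk = blocks_per_packet * DIF_BLOCK_SIZE
    packets = []
    for offset in range(0, len(frame), chunk):
        # A payload never crosses a frame boundary, even if space remains.
        payload = frame[offset:offset + chunk]
        # The marker is set only on the last packet of the frame.
        marker = offset + chunk >= len(frame)
        # Every packet of the frame carries the same timestamp.
        packets.append((timestamp & 0xFFFFFFFF, marker, payload))
    return packets


def next_timestamp(timestamp, mode, frames_elapsed=1):
    """Advance the 32-bit timestamp by whole frame intervals, modulo 2**32."""
    return (timestamp + frames_elapsed * TIMESTAMP_STEP[mode]) & 0xFFFFFFFF


def frame_changed(previous_timestamp, packet_timestamp):
    """Receiver side: detect a new frame from the timestamp, not the marker."""
    return packet_timestamp != previous_timestamp
```

Passing frames_elapsed greater than one models a sender that skips frames while still advancing the timestamp for the frames it discarded. The receiver helper deliberately ignores the marker bit, since the last packet of a frame may be lost.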
       The choice MUST be indicated as
176    part of the assignment of the dynamic payload type and MUST remain
177    unchanged for the duration of the RTP session to avoid complicated
178    procedures of sequence number synchronization. The RTP sender MAY
179    send the DIF-sequence header and subcode DIF blocks. When the
180    DIF-sequence header and subcode DIF blocks are sent, both MUST be
181    included in the video stream.

183    DV streams include "source" and "source control" packs that carry
184    information indispensable for proper decoding, such as aspect ratio,
185    position of picture, quantization of audio sampling, the number of
186    audio channels, audio channel assignment, and language of audio.
187    However, describing all of these attributes with SDP would require
188    large SDP descriptions to enumerate all combinations. Therefore,
189    the SDP section later in this document does not define an entry for
190    each of these parameters. Instead, the RTP sender MUST transmit at
191    least the VAUX DIF blocks and/or AAUX information that include the
192    "source" and "source control" packs filled with the information
193    indispensable for decoding. In the case of one bundled stream, DIF
194    blocks for both audio and video are packed into RTP packets in the
195    same order as they were encoded.

197    In the case of an unbundled stream, only the header, subcode, video
198    and VAUX DIF blocks are sent within the video stream. Audio is sent
199    in a different stream if desired, using a different RTP payload type.
200    It is also possible to send audio duplicated in a separate stream, in
201    addition to bundling it in with the video stream.

203    When using unbundled mode, it is RECOMMENDED that the audio stream
204    data be extracted from the DIF blocks and repackaged into the
205    corresponding RTP payload format for the audio encoding (DAT12, L16,
206    L20) [11,12] in order to maximize interoperability with non-DV-
207    capable receivers while maintaining the original source quality.
       In
208    the case of unbundled transmission where both audio and video are
209    sent in the DV format, the same timestamp SHOULD be used for both
210    audio and video data within the same frame to simplify the lip
211    synchronization effort on the receiver. Lip synchronization may also
212    be achieved using reference timestamps passed in RTCP as described in
213    RFC 1889 [6].

215    The sender MAY reduce the video frame rate by discarding the video
216    data and VAUX DIF blocks for some of the video frames. The RTP
217    timestamp must still be incremented to account for the discarded
218    frames. The sender MAY alternatively reduce bandwidth by discarding
219    video data DIF blocks for portions of the image which are unchanged
220    from the previous image. To enable this bandwidth reduction,
221    receivers SHOULD implement an error concealment strategy to
222    accommodate lost or missing DIF blocks, e.g. repeating the
223    corresponding DIF block from the previous image.

225 5. SDP Signaling for RTP/DV

227    When using SDP (Session Description Protocol) for negotiation of the
228    RTP payload information, the format described in this document SHOULD
229    be used. The SDP description differs slightly between a bundled
230    stream and an unbundled stream.

232    When a DV stream is sent to port 31394 with RTP payload type
233    identifier 111, the m= line will be like:

235       m=video 31394 RTP/AVP 111

237    The a=rtpmap attribute will be like:

239       a=rtpmap:111 DV/90000

241    "DV" is the encoding name for the DV video payload format defined in
242    this document. 90000 is the clock rate: the payload format defined
243    in this document uses a 90 kHz clock.

245    In SDP, format specific parameters are defined as a=fmtp, as below:

247       a=fmtp:

249    In the DV video payload format, the a=fmtp line will be used to show
250    the encoding type within the DV video and will be used as below:

252       a=fmtp: encode:

254    The encode parameter specifies which type of DV
255    format is used.
       The DV format name will be one of the following:

257       o SD-VCR/525-60
258       o SD-VCR/625-50
259       o HD-VCR/1125-60
260       o HD-VCR/1250-50
261       o SDL-VCR/525-60
262       o SDL-VCR/625-50
263       o 306M/525-60
264       o 306M/625-50
265       o 314M-25/525-60
266       o 314M-25/625-50
267       o 314M-50/525-60
268       o 314M-50/625-50

270    In order to show whether the audio data is bundled into the DV stream
271    or not, a format specific parameter is defined as below:

273       a=fmtp: audio:
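As a rough illustration of the m=, a=rtpmap and encode lines described in this section, the following Python sketch assembles them for one DV video stream. It is a sketch under assumptions rather than a definitive encoder: the function name dv_sdp_lines is hypothetical, and placing the payload type number directly after "a=fmtp:" follows general SDP usage and is assumed here because the templates above do not show it. The audio parameter is left out, since its values are not covered at this point.

```python
# Encoding type names copied from the list in this section.
DV_ENCODE_TYPES = (
    "SD-VCR/525-60", "SD-VCR/625-50",
    "HD-VCR/1125-60", "HD-VCR/1250-50",
    "SDL-VCR/525-60", "SDL-VCR/625-50",
    "306M/525-60", "306M/625-50",
    "314M-25/525-60", "314M-25/625-50",
    "314M-50/525-60", "314M-50/625-50",
)


def dv_sdp_lines(port, payload_type, encode_type):
    """Return the SDP lines for one DV video stream (hypothetical helper)."""
    if encode_type not in DV_ENCODE_TYPES:
        raise ValueError("unknown DV encoding type: " + encode_type)
    return [
        # Media line: video sent over RTP/AVP to the given port.
        "m=video {} RTP/AVP {}".format(port, payload_type),
        # "DV" is the encoding name and 90000 the clock rate.
        "a=rtpmap:{} DV/90000".format(payload_type),
        # Assumption: the payload type number follows "a=fmtp:".
        "a=fmtp:{} encode:{}".format(payload_type, encode_type),
    ]


if __name__ == "__main__":
    # Port 31394 and payload type 111 are the example values used above.
    print("\n".join(dv_sdp_lines(31394, 111, "SD-VCR/525-60")))
```

Rejecting names outside the list keeps a typo in the encoding type from producing a description that no receiver can match.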