idnits 2.17.1

draft-ietf-avt-dv-video-02.txt:
  ** The Abstract section seems to be numbered

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** Looks like you're using RFC 2026 boilerplate. This must be updated to
     follow RFC 3978/3979, as updated by RFC 4748.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  ** The document seems to lack a 1id_guidelines paragraph about 6 months
     document validity -- however, there's a paragraph with a matching
     beginning. Boilerplate error?

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard

  == The page length should not exceed 58 lines per page, but there was 11
     longer pages, the longest (page 2) being 60 lines

  == It seems as if not all pages are separated by form feeds - found 0 form
     feeds but 12 pages

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack an IANA Considerations section. (See Section
     2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
     when there are no actions for IANA.)

  == There are 2 instances of lines with non-RFC6890-compliant IPv4 addresses
     in the document. If these are example addresses, they should be changed.

  == There are 4 instances of lines with multicast IPv4 addresses in the
     document. If these are generic example addresses, they should be changed
     to use the 233.252.0.x range defined in RFC 5771

  ** The document seems to lack a both a reference to RFC 2119 and the
     recommended RFC 2119 boilerplate, even if it appears to use RFC 2119
     keywords -- however, there's a paragraph with a matching beginning.
     Boilerplate error?

     RFC 2119 keyword, line 128: '... types MUST be assigned, one for eac...'
     RFC 2119 keyword, line 129: '... MUST change to the corresponding pa...'
     RFC 2119 keyword, line 134: '... video frame MUST have the same time...'
     RFC 2119 keyword, line 147: '...ideo frame times MAY be monitored usin...'
     RFC 2119 keyword, line 155: '...f a frame change MUST NOT rely on the ...'
     (14 more instances...)

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the RFC 3978 Section 5.4 Copyright Line does not
     match the current year

  == Using lowercase 'not' together with uppercase 'MUST', 'SHALL', 'SHOULD',
     or 'RECOMMENDED' is not an accepted usage according to RFC 2119. Please
     use uppercase 'NOT' together with RFC 2119 keywords (if that is what you
     mean).

     Found 'MUST not' in this paragraph:

     When the fmtp audio: parameter is not announced, the audio data MUST not
     be bundled into the DV video stream.

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008. If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment. If not, you may need to add the pre-RFC5378 disclaimer.
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (September 2000) is 8624 days in the past. Is this
     intentional?

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  -- Missing reference section? '6' on line 524 looks like a reference
  -- Missing reference section? '1' on line 510 looks like a reference
  -- Missing reference section? '2' on line 514 looks like a reference
  -- Missing reference section? '3' on line 516 looks like a reference
  -- Missing reference section?
     '4' on line 518 looks like a reference
  -- Missing reference section? '5' on line 521 looks like a reference
  -- Missing reference section? '7' on line 528 looks like a reference
  -- Missing reference section? '8' on line 531 looks like a reference
  -- Missing reference section? '9' on line 534 looks like a reference
  -- Missing reference section? '10' on line 537 looks like a reference
  -- Missing reference section? '11' on line 541 looks like a reference
  -- Missing reference section? '12' on line 544 looks like a reference
  -- Missing reference section? '13' on line 548 looks like a reference
  -- Missing reference section? '14' on line 551 looks like a reference

     Summary: 5 errors (**), 0 flaws (~~), 7 warnings (==), 16 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

1   INTERNET-DRAFT                                     Katsushi Kobayashi
2   draft-ietf-avt-dv-video-02.txt      Communication Research Laboratory
3                                                          Akimichi Ogawa
4                                                         Keio University
5                                                          Stephen Casner
6                                                           Cisco Systems
7                                                         Carsten Bormann
8                                                 Universitaet Bremen TZI
9                                                          March 10, 2000
10                                                 Expires September 2000

12  RTP Payload Format for DV Format Video

14  Status of this Memo

16  This document is an Internet-Draft and is in full conformance with
17  all provisions of Section 10 of RFC 2026.

19  Internet-Drafts are working documents of the Internet Engineering
20  Task Force (IETF), its areas, and its working groups. Note that
21  other groups may also distribute working documents as Internet-
22  Drafts.

24  Internet-Drafts are draft documents valid for a maximum of six months
25  and may be updated, replaced, or obsoleted by other documents at any
26  time. It is inappropriate to use Internet-Drafts as reference
27  material or to cite them other than as "work in progress."
29  The list of current Internet-Drafts can be accessed at
30  http://www.ietf.org/ietf/1id-abstracts.txt

32  The list of Internet-Draft Shadow Directories can be accessed at
33  http://www.ietf.org/shadow.html.

35  1. Abstract

37  This document specifies the packetization scheme for encapsulating
38  the compressed digital video data streams commonly known as "DV" into
39  a payload format for the Real-Time Transport Protocol (RTP). There
40  are two kinds of DV, one for consumer use and the other for
41  professional. The original "DV" specification designed for consumer-
42  use digital VCRs is approved as the IEC 61834 standard set. The
43  specifications for professional DV are published as SMPTE 306M(D-7)
44  and 314M(D-9). Both are based on consumer DV. The RTP payload
45  format specified in this document supports IEC 61834 consumer DV and
46  professional SMPTE 306M and 314M(DV-Based) formats.

48  2. Introduction
49  This document specifies payload formats for encapsulating both
50  consumer- and professional-use DV format data streams into the Real-
51  time Transport Protocol (RTP), version 2 [6]. DV compression audio
52  and video formats were designed for helical-scan magnetic tape media.
53  The DV standards for consumer-market devices, the IEC 61883 and 61834
54  series, cover many aspects of consumer-use digital video, including
55  mechanical specifications of a cassette, magnetic recording format,
56  error correction on the magnetic tape, DCT video encoding format, and
57  audio encoding format[1]. The digital interface part of IEC 61883
58  defines an interface on an IEEE 1394 network[2,3]. This specification
59  set supports several video formats: SD-VCR (Standard Definition), HD-
60  VCR (High Definition), SDL-VCR (Standard Definition - Long), PALPlus,
61  DVB (Digital Video Broadcast) and ATV (Advanced Television). North
62  American formats are indicated with a number of lines and "/60",
63  while European formats use "/50".
DV standards extended for
64  professional use were published by SMPTE as 306M and 314M, for
65  different sampling systems, higher color resolution, and faster bit
66  rates[4,5].

68  IEC 61834 also includes magnetic tape recording for digital TV
69  broadcasting systems (such as DVB and ATV) that use MPEG2 encoding.
70  The payload format for encapsulating MPEG2 into RTP has already been
71  defined in RFC 2250[7] and others.

73  Consequently, the payload specified in this document will support six
74  video formats of the IEC standard: SD-VCR (525/60, 625/50), HD-VCR
75  (1125/60, 1250/50) and SDL-VCR (525/60, 625/50), and six of the SMPTE
76  standards: 306M (525/60, 625/50), 314M 25Mbps (525/60, 625/50) and
77  314M 50Mbps (525/60, 625/50). In the future it can be extended into
78  other high-definition formats.

80  Throughout this specification, we make extensive use of the
81  terminology of IEC and SMPTE standards. The reader should consult the
82  original references for definitions of these terms.

84  2.1 Terminology

86  The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
87  "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
88  document are to be interpreted as described in RFC 2119 [8].

90  3. DV format encoding

92  The DV format only uses the DCT compression technique within each
93  frame, contrasted with the interframe compression of the MPEG video
94  standards [9,10]. All video data including audio and other system
95  data are managed within the picture frame unit of video.

97  The DV encoding is composed of a three-level hierarchical structure.
98  A picture frame is divided into rectangle- or clipped-rectangle-
99  shaped DCT super blocks. DCT super blocks are divided into 27
100  rectangle- or square-shaped DCT macro blocks. Audio data is encoded
101  with PCM format. The sampling frequency is 32 kHz, 44.1 kHz or 48
102  kHz and the quantization is 12-bit non-linear, 16-bit linear or
103  20-bit linear. The number of channels may be up to 8.
Only certain
104  combinations of these parameters are allowed depending upon the video
105  format; the restrictions are specified in each document. A frame of
106  data in the DV format stream is divided into several "DIF sequences".
107  A DIF sequence is composed of an integral number of 80-byte DIF
108  blocks. A DIF block is the primitive unit for all treatment of DV
109  streams. Each DIF block contains a 3-byte ID header that specifies
110  the type of the DIF block and its position in the DIF sequence. Five
111  types of DIF blocks are defined: DIF sequence header, Subcode, Video
112  Auxiliary information (VAUX), Audio and Video. Audio DIF blocks are
113  composed of 5 bytes of Audio Auxiliary data (AAUX) and 72 bytes of
114  audio data.

116  Each RTP packet starts with the RTP header as defined in RFC 1889
117  [6]. No additional payload-format-specific header is required for
118  this payload format.

120  4.1 RTP header usage

122  The RTP header fields that have a meaning specific to the DV format
123  are described as follows:

125  Payload type (PT): The payload type is dynamically assigned by means
126  outside the scope of this document. If multiple DV encoding formats
127  are to be used within one RTP session, then multiple dynamic payload
128  types MUST be assigned, one for each DV encoding format. The sender
129  MUST change to the corresponding payload type whenever the encoding
130  format is changed.

132  Timestamp: 32-bit 90 kHz timestamp representing the time at which the
133  first data in the frame was sampled. All RTP packets within the same
134  video frame MUST have the same timestamp.
The timestamp SHOULD
135  increment by a multiple of the nominal interval for one frame time,
136  as given in the following table:

138  Mode       Frame rate (Hz)    Increase of one frame
139                                in 90kHz timestamp

141  525-60         29.97                3003
142  625-50         25                   3600
143  1125-60        30                   3000
144  1250-50        25                   3600

146  When the DV stream is obtained from an IEEE 1394 interface, the
147  progress of video frame times MAY be monitored using the SYT
148  timestamp carried in the CIP header, as described in Appendix A.

150  Marker bit (M): The marker bit of the RTP fixed header is set to one
151  on the last packet of a video frame, and otherwise, must be zero.
152  The M bit allows the receiver to know that it has received the last
153  packet of a frame so it can display the image without waiting for the
154  first packet of the next frame to arrive to detect the frame change.
155  However, detection of a frame change MUST NOT rely on the marker bit
156  since the last packet of the frame might be lost. Detection of a
157  frame change MUST be done by differences in RTP timestamp.

159  4.2 DV data encapsulation into RTP payload

161  Integral DIF blocks are placed into the RTP payload beginning
162  immediately after the RTP header. Any number of DIF blocks may be
163  packed into one RTP packet, except that all DIF blocks in one RTP
164  packet must be from the same video frame. DIF blocks from the next
165  video frame MUST NOT be packed into the same RTP packet even if more
166  payload space remains. This requirement stems from the fact that the
167  transition from one video frame to the next is indicated by a change
168  in the RTP timestamp. It also reduces the processing complexity on
169  the receiver. Since the RTP payload contains an integral number of
170  DIF blocks, the length of the RTP payload will be a multiple of 80
171  bytes.

173  Audio and video data may be transmitted as one bundled RTP stream or
174  in separate RTP streams (unbundled).
The choice MUST be indicated as
175  part of the assignment of the dynamic payload type and MUST remain
176  unchanged for the duration of the RTP session to avoid complicated
177  procedures of sequence number synchronization. The RTP sender MAY
178  send DIF-sequence header and subcode DIF block into streams. When
179  sending DIF-sequence header and subcode DIF block, both the blocks
180  MUST be included in the video stream.

182  DV streams include "source" and "source control" packs that carry
183  information indispensable for proper decoding, such as aspect ratio,
184  position of picture, quantization of audio sampling, the number of
185  audio channels, audio channel assignment, and language of audio.
186  However, describing all of these attributes with SDP would require
187  large SDP descriptions to enumerate all combinations. Therefore, in
188  the later section of this document, the SDP entry for each of these
189  parameters is not defined. Instead, the RTP sender MUST transmit at
190  least VAUX DIF block and/or AAUX information including "source" and
191  "source control" pack filled with the indispensable information for
192  decoding. In the case of one bundled stream, DIF blocks for both
193  audio and video are packed into RTP packets in the same order as they
194  were encoded.

196  In the case of an unbundled stream, only the header, subcode, video
197  and VAUX DIF blocks are sent within the video stream. Audio is sent
198  in a different stream if desired, using a different RTP payload type.
199  It is also possible to send audio duplicated in a separate stream, in
200  addition to bundling it in with the video stream.

202  When using unbundled mode, it is RECOMMENDED that the audio stream
203  data be extracted from the DIF blocks and repackaged into the
204  corresponding RTP payload format for the audio encoding (DAT12, L16,
205  L20) [11,12] in order to maximize interoperability with non-DV-
206  capable receivers while maintaining the original source quality.
In
207  the case of unbundled transmission where both audio and video are
208  sent in the DV format, the same timestamp SHOULD be used for both
209  audio and video data within the same frame to simplify the lip
210  synchronization effort on the receiver. Lip synchronization may also
211  be achieved using reference timestamps passed in RTCP as described in
212  RFC 1889 [6].

214  The sender MAY reduce the video frame rate by discarding the video
215  data and VAUX DIF blocks for some of the video frames. The RTP
216  timestamp must still be incremented to account for the discarded
217  frames. The sender MAY alternatively reduce bandwidth by discarding
218  video data DIF blocks for portions of the image which are unchanged
219  from the previous image. To enable this bandwidth reduction,
220  receivers SHOULD implement an error concealment strategy to
221  accommodate lost or missing DIF blocks, e.g. repeating the
222  corresponding DIF block from the previous image.

224  5. SDP Signaling for RTP/DV

226  When using SDP (Session Description Protocol) for negotiation of the
227  RTP payload information, the format described in this document SHOULD
228  be used. SDP description will be slightly different for a bundled
229  stream and an unbundled stream.

231  When a DV stream is sent to port 31394 and RTP payload type identifier
232  111, the m=?? line will be like:

234     m=video 31394 RTP/AVP 111

236  The a=rtpmap attribute will be like:

238     a=rtpmap:111 DV/90000

240  "DV" is the encoding name for the DV video payload format defined in
241  this document. 90000 is the clock rate: the payload format
242  defined in this document uses a 90 kHz clock.

244  In SDP, format specific parameters are defined as a=fmtp, as below:

246     a=fmtp:

248  In the DV video payload format, the a=fmtp line will be used to show
249  the encoding type within the DV video and will be used as below:

251     a=fmtp: encode:

253  The parameter specifies which type of DV
254  format is used.
The DV format name will be one of the following:

256  o SD-VCR/525-60
257  o SD-VCR/625-50
258  o HD-VCR/1125-60
259  o HD-VCR/1250-50
260  o SDL-VCR/525-60
261  o SDL-VCR/625-50
262  o 306M/525-60
263  o 306M/625-50
264  o 314M-25/525-60
265  o 314M-25/625-50
266  o 314M-50/525-60
267  o 314M-50/625-50

269  In order to show whether the audio data is bundled into the DV stream or
270  not, a format specific parameter is defined as below:

272     a=fmtp: audio:
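
As a rough illustration of the signaling described in this section, the Python sketch below assembles the m=, a=rtpmap and a=fmtp lines for a DV video stream and looks up the nominal 90 kHz timestamp increase per frame from the table in Section 4.1. It is not part of the draft. The function names are hypothetical, the check against the 96-127 dynamic payload type range is an added assumption, and the "a=fmtp:<payload type> encode:<format name>" layout is also an assumption: the fmtp syntax is incomplete in this copy of the text, so the sketch follows the generic SDP "a=fmtp:<format> <parameters>" form. The audio: parameter is left out because its values are not shown here.

```python
# Encoding names listed in Section 5 of the draft.
DV_FORMATS = (
    "SD-VCR/525-60", "SD-VCR/625-50",
    "HD-VCR/1125-60", "HD-VCR/1250-50",
    "SDL-VCR/525-60", "SDL-VCR/625-50",
    "306M/525-60", "306M/625-50",
    "314M-25/525-60", "314M-25/625-50",
    "314M-50/525-60", "314M-50/625-50",
)

# Increase of the 90 kHz RTP timestamp per frame, from the table in 4.1.
TICKS_PER_FRAME = {
    "525-60": 3003,
    "625-50": 3600,
    "1125-60": 3000,
    "1250-50": 3600,
}

CLOCK_RATE = 90000  # clock rate carried in the a=rtpmap line


def video_sdp_lines(port, payload_type, dv_format):
    """Return the SDP lines for one DV video stream (hypothetical helper)."""
    if dv_format not in DV_FORMATS:
        raise ValueError(f"unknown DV format name: {dv_format!r}")
    # Assumption: dynamic payload types come from the 96-127 range.
    if not 96 <= payload_type <= 127:
        raise ValueError("payload type is not in the dynamic range 96-127")
    return [
        f"m=video {port} RTP/AVP {payload_type}",
        f"a=rtpmap:{payload_type} DV/{CLOCK_RATE}",
        # Assumption: payload type, a space, then the encode parameter.
        f"a=fmtp:{payload_type} encode:{dv_format}",
    ]


def ticks_per_frame(dv_format):
    """Nominal timestamp increase per frame for a DV format name."""
    if dv_format not in DV_FORMATS:
        raise ValueError(f"unknown DV format name: {dv_format!r}")
    mode = dv_format.split("/", 1)[1]  # e.g. "525-60"
    return TICKS_PER_FRAME[mode]


if __name__ == "__main__":
    for line in video_sdp_lines(31394, 111, "SD-VCR/525-60"):
        print(line)
    print(ticks_per_frame("SD-VCR/525-60"))
```

Keeping the format names and the per-mode increments in two small tables mirrors how the draft presents them, and it lets a sender reject a name that is not on the list before any SDP is emitted.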