idnits 2.17.1

draft-ietf-avt-mwpp-midi-rtp-08.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** Looks like you're using RFC 2026 boilerplate.  This must be updated to
     follow RFC 3978/3979, as updated by RFC 4748.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack an Authors' Addresses Section.

  == There are 5 instances of lines with non-RFC3849-compliant IPv6 addresses
     in the document.  If these are example addresses, they should be changed.

  -- The document has examples using IPv4 documentation addresses according
     to RFC6890, but does not use any IPv6 documentation addresses.  Maybe
     there should be IPv6 examples, too?

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the RFC 3978 Section 5.4 Copyright Line does not
     match the current year

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer.
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- Couldn't find a document date in the document -- date freshness check
     skipped.
  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  -- Missing reference section? '2' on line 4428 looks like a reference
  -- Missing reference section? '6' on line 4443 looks like a reference
  -- Missing reference section? '14' on line 4473 looks like a reference
  -- Missing reference section? '16' on line 4480 looks like a reference
  -- Missing reference section? '15' on line 4477 looks like a reference
  -- Missing reference section? '3' on line 4432 looks like a reference
  -- Missing reference section? '1' on line 4425 looks like a reference
  -- Missing reference section? '12' on line 4464 looks like a reference
  -- Missing reference section? '5' on line 4440 looks like a reference
  -- Missing reference section? '19' on line 4490 looks like a reference
  -- Missing reference section? '18' on line 4488 looks like a reference
  -- Missing reference section? '13' on line 4469 looks like a reference
  -- Missing reference section? '4' on line 4436 looks like a reference
  -- Missing reference section? '11' on line 4459 looks like a reference
  -- Missing reference section? '17' on line 4483 looks like a reference
  -- Missing reference section? '9' on line 4453 looks like a reference
  -- Missing reference section? '7' on line 4447 looks like a reference
  -- Missing reference section? '20' on line 4493 looks like a reference
  -- Missing reference section? '10' on line 4456 looks like a reference
  -- Missing reference section? '8' on line 4450 looks like a reference

     Summary: 2 errors (**), 0 flaws (~~), 3 warnings (==), 23 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

2    INTERNET-DRAFT                                              J. Lazzaro
3    June 27, 2003                                             J. Wawrzynek
4    Expires: December 27, 2003                                 UC Berkeley

6                         RTP Payload Format for MIDI

8

10   Status of this Memo

12   This document is an Internet-Draft and is subject to all provisions of
13   Section 10 of RFC2026.

15   Internet-Drafts are working documents of the Internet Engineering Task
16   Force (IETF), its areas, and its working groups. Note that other groups
17   may also distribute working documents as Internet-Drafts.

19   Internet-Drafts are draft documents valid for a maximum of six months
20   and may be updated, replaced, or obsoleted by other documents at any
21   time. It is inappropriate to use Internet-Drafts as reference material
22   or to cite them other than as "work in progress."

24   The list of current Internet-Drafts can be accessed at
25   http://www.ietf.org/1id-abstracts.html

27   The list of Internet-Draft Shadow Directories can be accessed at
28   http://www.ietf.org/shadow.html

30   Copyright Notice

32   Copyright (C) The Internet Society (2003). All Rights Reserved.

34   Abstract

36      This memo describes an RTP payload format for the MIDI command
37      language. The format encodes all commands that may legally appear
38      on a MIDI 1.0 DIN cable. The format is suitable for interactive
39      applications (such as the remote operation of musical instruments)
40      and content-delivery applications (such as file streaming). The
41      format may be used over unicast and multicast UDP as well as TCP,
42      and defines tools for graceful recovery from packet loss. Stream
43      behavior, including the MIDI rendering method, may be customized
44      during session setup. The format also serves as a mode for the
45      mpeg4-generic format, to support the MPEG 4 Audio Object Types for
46      General MIDI, Downloadable Sounds Level 2, and Structured Audio.

48   Table of Contents

50   1. Introduction
51      1.1 Terminology
52   2. Packet Format
53      2.1 RTP Header
54      2.2 MIDI Payload
55   3. MIDI Command Section
56      3.1 Timestamps
57      3.2 Command Coding
58   4. The Recovery Journal System
59   5. Recovery Journal Format
60   6. Session Description Protocol
61      6.1 Session Descriptions for Native Streams
62      6.2 Session Description for mpeg4-generic Streams
63      6.3 Session Configuration Tools
64   7. Extensibility
65   8. Congestion Control
66   A. The Recovery Journal Channel Chapters
67      A.1 Recovery Journal Definitions
68      A.2 Chapter P: MIDI Program Change
69      A.3 Chapter W: MIDI Pitch Wheel
70      A.4 Chapter N: MIDI NoteOff and NoteOn
71         A.4.1 Header Structure
72         A.4.2 Note Structures
73      A.5 Chapter A: MIDI Poly Aftertouch
74      A.6 Chapter T: MIDI Channel Aftertouch
75      A.7 Chapter C: MIDI Control Change
76         A.7.1 Log Inclusion Rules
77         A.7.2 Log Coding Rules
78         A.7.3 Portamento Control
79         A.7.4 The Parameter System
80      A.8 Chapter E: MIDI Reset All Controllers
81      A.9 Chapter M: MIDI Parameter System
82         A.9.1 Log Inclusion Rules
83         A.9.2 Log Coding Rules
84            A.9.2.1 COARSE and FINE Fields
85            A.9.2.2 The BUTTON Field
86            A.9.2.3 A-COARSE, A-FINE, and A-BUTTON
87            A.9.2.4 COUNT and TCOUNT
88   B. The Recovery Journal System Chapters
89      B.1 System Chapter D: Simple System Commands
90         B.1.1 Undefined System Commands
91      B.2 System Chapter V: Active Sense Command
92      B.3 System Chapter Q: Sequencer State Commands
93         B.3.1 Non-compliant Sequencers
94      B.4 System Chapter F: MIDI Time Code
95         B.4.1 Partial Frames
96      B.5 System Chapter X: System Exclusive
97         B.5.1 Chapter Format
98         B.5.2 Coding Tools
99   C. SDP Session Configuration Tools
100     C.1 The Journalling System
101        C.1.1 The j_sec Parameter
102        C.1.2 The j_update Parameter
103           C.1.2.1 The anchor Sending Policy
104           C.1.2.2 The closed-loop Sending Policy
105           C.1.2.3 The open-loop Sending Policy
106        C.1.3 Chapter Inclusion Parameters
107     C.2 Command Execution Semantics
108        C.2.1 The async Algorithm
109        C.2.2 The buffer Algorithm
110     C.3 Timing Tools
111        C.3.1 ptime and maxptime
112        C.3.2 The guardtime Parameter
113        C.3.3 MIDI Time Code Issues
114     C.4 Multiple Streams
115        C.4.1 The musicport Parameter
116        C.4.2 The zerosync Parameter
117     C.5 MIDI Rendering
118        C.5.1 The rinit Parameter
119        C.5.2 Encoding rinit Data Objects
120        C.5.3 MIDI Channel Mapping
121           C.5.3.1 smf_info
122           C.5.3.2 smf_inline, smf_url, smf_cid
123           C.5.3.3 chanmask
124        C.5.4 The audio/asc MIME Type
125  D. Parameter Syntax Definitions
126  E. A MIDI Overview for Networking Specialists
127     E.1 Commands Types
128     E.2 Running Status
129     E.3 Command Timing
130  F. Acknowledgements
131  G. Security Considerations
132  H. IANA Considerations
133     H.1 rtp-midi MIME Registration
134     H.2 mpeg4-generic MIME Registration
135     H.3 asc MIME Registration
136  I. References
137     I.1 Normative References
138     I.2 Informative References
139  J. Author Addresses
140  K. Intellectual Property Rights Statement
141  L. Full Copyright Statement
142  M. Change Log for
143  1. Introduction

145  The Internet Engineering Task Force (IETF) has developed a set of
146  focused tools for multimedia networking ([2] [6] [14] [16]). These
147  tools can be combined in different ways to support a variety of
148  real-time applications over Internet Protocol (IP) networks.

150  For example, a telephony application might use the Session Initiation
151  Protocol (SIP, [14]) to set up a phone call. Call setup would include
152  negotiations to agree on a common audio codec [15]. Negotiations would
153  use the Session Description Protocol (SDP, [6]) to describe candidate
154  codecs.

156  After a call is set up, audio data would flow between the parties using
157  the Real-time Transport Protocol (RTP, [2]) under the Audio/Visual
158  Profile (AVP, [3]). The tools used in this telephony example (SIP, SDP,
159  RTP/AVP) might be combined in a different way to support a content
160  streaming application, perhaps in conjunction with other tools (such as
161  the Real Time Streaming Protocol (RTSP, [16])).

163  The MIDI command language [1] is widely used in musical applications
164  that are analogous to the examples described above. On stage and in the
165  recording studio, MIDI is used for the interactive remote control of
166  musical instruments, an application similar in spirit to telephony. On
167  web pages, Standard MIDI Files (SMFs, [1]) rendered using the General
168  MIDI standard [1] provide a low-bandwidth substitute for audio
169  streaming.

171  This memo is motivated by a simple premise: if MIDI performances could
172  be sent as RTP streams that are managed by IETF session tools, a
173  hybridization of the MIDI and IETF application domains may occur.

175  For example, interoperable MIDI networking may foster network music
176  performance applications, in which a group of musicians, located at
177  different physical locations, interact over a network to perform as they
178  would if located in the same room [12]. As another example, the
179  streaming community may begin to use MIDI for low-bitrate audio coding,
180  perhaps in conjunction with normative sound synthesis methods [5]. As
181  another example, manufacturers of professional audio equipment and
182  electronic musical instruments may consider adopting the IETF multimedia
183  stack (IP, SIP, RTP) as the networking layer for a MIDI control plane.

185  To provide a foundation for RTP MIDI applications, this memo extends two
186  of the IETF tools (RTP and SDP) to support MIDI. Sections 2-5 and
187  Appendices A-B extend RTP/AVP by adding a MIDI payload format. Section
188  6 and Appendices C-D extend SDP by adding session configuration tools to
189  customize stream behavior (including the MIDI rendering method) during
190  session setup.

192  Some applications may require MIDI media delivery at a certain service
193  quality level (latency, jitter, packet loss, etc.). RTP itself does not
194  provide service guarantees. However, applications may use lower-layer
195  network protocols to configure the quality of the transport services
196  that RTP uses. These protocols may act to reserve network resources for
197  RTP flows [19], or may simply direct RTP traffic onto a dedicated "media
198  network" in a local installation. Note that RTP and the MIDI payload
199  format do provide tools that applications may use to achieve the best
200  possible real-time performance at a given service level.

202  This memo normatively defines the syntax and semantics of the MIDI
203  payload format. However, this memo does not define algorithms for
204  sending and receiving packets. An ancillary document [18] provides
205  informative guidance on algorithms. Supplemental information may be
206  found in related conference publications [12] [13].

208  Throughout this memo, the phrase "native stream" refers to a stream that
209  uses the rtp-midi MIME type. The phrase "mpeg4-generic stream" refers
210  to a stream that uses the mpeg4-generic MIME type (in mode rtp-midi) to
211  operate in an MPEG 4 environment [4]. Section 6 describes this
212  distinction in detail.

214  1.1 Terminology

216  The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
217  "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
218  document are to be interpreted as described in BCP 14, RFC 2119 [11].

220  2. Packet Format

222  In this section, we introduce the format of RTP MIDI packets. The
223  description includes some background information on RTP/AVP, for the
224  benefit of MIDI implementors new to IETF tools. Implementors should
225  consult [2,3] for an authoritative description of RTP/AVP.

227  This memo assumes the reader is familiar with MIDI syntax and semantics.
228  Appendix E provides a MIDI overview, at a level of detail sufficient to
229  understand most of this memo. Implementors should consult [1] for an
230  authoritative description of MIDI.

232  The MIDI payload format maps a MIDI command stream (16 voice channels +
233  systems) onto an RTP stream. An RTP media stream is a sequence of
234  logical packets that share a common format. Each packet consists of two
235  parts: the RTP header and the MIDI payload. Figure 1 shows this format
236  (vertical space delineates the header and payload).

238  We describe RTP packets as "logical" packets to highlight the fact that
239  RTP itself is not a network-layer protocol. Instead, RTP packets are
240  mapped onto network protocols (such as unicast UDP, multicast UDP, or
241  TCP) by an application [17].

243  2.1 RTP Header

245  [2] provides a complete description of the RTP header fields. In this
246  section, we clarify the role of a few RTP header fields for MIDI
247  applications. All fields are coded in network byte order (big-endian).

249   0                   1                   2                   3
250   0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
251  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
252  | V |P|X|  CC   |M|     PT      |        Sequence number        |
253  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
254  |                           Timestamp                           |
255  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
256  |                             SSRC                              |
257  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

259  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
260  |                     MIDI command section ...                  |
261  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
262  |                       Journal section ...                     |
263  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

265                     Figure 1 -- Packet format

267  The behavior of the 1-bit M field depends on the MIME type of the
268  stream. For native streams, the M bit MUST be set to 1 if the MIDI
269  command section codes one or more MIDI commands, and MUST be set to 0
270  otherwise. For mpeg4-generic streams, the M bit MUST be set to 1 for
271  all packets (to conform to [4]).

273  The 16-bit sequence number field is initialized to a randomly chosen
274  value, and is incremented by one (modulo 2^16) for each packet sent in
275  the stream. A related quantity, the 32-bit extended packet sequence
276  number, may be computed by tracking rollovers of the 16-bit sequence
277  number. Note that different receivers of the same stream may compute
278  different extended packet sequence numbers, depending on when the
279  receiver joined the session.

281  The 32-bit timestamp field sets the base timestamp value for the packet.
282  The payload codes MIDI command timing relative to this value. The
283  timestamp units are set during session configuration by the srate rtpmap
284  parameter (Sections 6.1-2). For example, if srate has a value of 44100
285  Hz, two packets whose base timestamp values differ by 2 seconds have RTP
286  timestamp fields that differ by 88200. By default the timestamp field
287  is initialized to a randomly chosen value (see Appendix C.4.2 for an
288  exception).

290  RTP timestamps do not necessarily increment at a fixed rate, because
291  packets are not necessarily sent at a fixed rate. The degree of packet
292  transmission regularity reflects the underlying application dynamics.
293  Interactive applications may vary the packet sending rate to track the
294  gestural rate of a human performer, whereas content-streaming
295  applications may send packets at a fixed rate.

297  Therefore, the timestamps for two sequential RTP packets may be
298  identical, or the second packet may have a timestamp arbitrarily larger
299  than the first packet (modulo 2^32). Section 3 places additional
300  restrictions on the RTP timestamps for two sequential RTP packets, as
301  does the guardtime fmtp parameter (Appendix C.3.2).

303  The media time coded by a packet is computed by subtracting the last
304  command timestamp in the MIDI command section from the RTP timestamp
305  (modulo 2^32). If the MIDI list of the MIDI command section of a packet
306  is empty, the media time coded by the packet is 0 ms. Appendix C.3.1
307  discusses media time issues in detail.

309  2.2 MIDI Payload

311  The payload (Figure 1) MUST begin with the MIDI command section. The
312  MIDI command section codes a (possibly empty) list of timestamped MIDI
313  commands, and provides the essential service of the payload format.

315  The payload MAY also contain a journal section. The journal section
316  provides resiliency by coding the recent history of the stream. A flag
317  in the MIDI command section codes the presence of a journal section in
318  the payload.

320  Section 3 defines the MIDI command section. Sections 4-5 and Appendices
321  A-B define the recovery journal, the default format for the journal
322  section. Here, we describe how these payload sections operate in a
323  stream.

325  The journalling method for a stream is set at the start of a session and
326  MUST NOT be changed thereafter. A stream may be set to use the recovery
327  journal, to use an alternative journal format (none are defined in this
328  memo), or to not use a journal.

330  The default journalling method of a stream is inferred from its
331  transport type. Streams that use unreliable transport (such as UDP)
332  default to using the recovery journal. Streams that use reliable
333  transport (such as TCP) default to not using a journal. Appendix C.1.1
334  defines session configuration tools for overriding these defaults.

336  If a stream uses the recovery journal, every payload in the stream MUST
337  include a journal section. If a stream does not use journalling, a
338  journal section MUST NOT appear in a stream payload. If a stream uses
339  an alternative journal format, the specification for the journal format
340  defines an inclusion policy.

342  If a stream sent over reliable transport does not use journalling, the
343  sender MUST transmit an RTP packet stream with consecutive sequence
344  numbers (modulo 2^16). If a stream sent over reliable transport uses
345  the recovery journal, the sender MAY transmit an RTP stream with missing
346  or out-of-order packets.

348  The payload of a stream encodes data for a single MIDI command name
349  space (16 voice channels + systems). Applications may use several
350  streams in a session. Session configuration tools for multi-stream
351  sessions are defined in Appendix C.4.

353  In some applications, a receiver renders MIDI commands into audio (or
354  into control actions, such as the rewind of a tape deck or the dimming
355  of stage lights). In other applications, a receiver presents a MIDI
356  stream to software programs via an Application Programmer Interface
357  (API). Appendix C.5 defines session configuration tools to specify what
358  receivers should do with a MIDI command stream.

360  If a stream is sent over UDP transport, the Maximum Transmission Unit
361  (MTU) of the underlying network limits the practical size of the payload
362  section (for example, an Ethernet MTU is 1500 octets). Note that MTU
363  size restrictions do not apply to RTP packets sent over TCP streams.
364  The session configuration tools defined in Appendix C.4 may be used to
365  split a dense MIDI name space into several UDP streams, so that the
366  payload fits comfortably into an MTU.

368  3. MIDI Command Section

370  Figure 2 shows the format of the MIDI command section.

372   0                   1                   2                   3
373   0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
374  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
375  |B|J|Z|P|LEN... |  MIDI list ...                                |
376  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

378                  Figure 2 -- MIDI command section

380  The MIDI command section begins with a variable-length header.

382  The header field LEN codes the number of octets in the MIDI list that
383  follows the header. If the header flag B is 0, the header is one octet
384  long, and LEN is a 4-bit field, supporting a maximum MIDI list length of
385  15 octets. If B is 1, the header is two octets long, and LEN is a
386  12-bit field, supporting a maximum MIDI list length of 4095 octets. A
387  LEN value of 0 is legal, and codes an empty MIDI list.

389  If the J header bit is set to 1, a journal section MUST appear after
390  the MIDI command section in the payload. If the J header bit is set to
391  0, the payload MUST NOT contain a journal section.

393  If the LEN header field is nonzero, the MIDI list has the structure
394  shown in Figure 3.
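
The header rules above (flags B, J, Z, P and the 4-bit or 12-bit LEN
field of Figure 2) can be sketched in Python as follows. This is a
minimal illustration, not part of the payload format definition. It
assumes that B is the most significant bit of the first octet and that,
when B is 1, the low four bits of the first octet form the high-order
bits of LEN. The names CommandSectionHeader and split_command_section
are hypothetical.

```python
from typing import NamedTuple, Tuple


class CommandSectionHeader(NamedTuple):
    b: int              # 1 if the header is two octets long
    j: int              # 1 if a journal section follows the MIDI list
    z: int              # Z flag (presence of Delta Time 0)
    p: int              # P (phantom) flag
    length: int         # LEN: number of octets in the MIDI list
    header_octets: int  # 1 or 2


def split_command_section(
    payload: bytes,
) -> Tuple[CommandSectionHeader, bytes, bytes]:
    """Return (header, MIDI list, octets after the MIDI list)."""
    if not payload:
        raise ValueError("payload is empty")
    first = payload[0]
    b = (first >> 7) & 1
    j = (first >> 6) & 1
    z = (first >> 5) & 1
    p = (first >> 4) & 1
    length = first & 0x0F          # 4-bit LEN when B is 0
    header_octets = 1
    if b:
        if len(payload) < 2:
            raise ValueError("two-octet header is truncated")
        # Assumed layout: 12-bit LEN spans the low nibble and next octet.
        length = (length << 8) | payload[1]
        header_octets = 2
    end = header_octets + length
    if len(payload) < end:
        raise ValueError("MIDI list is shorter than LEN")
    header = CommandSectionHeader(b, j, z, p, length, header_octets)
    return header, payload[header_octets:end], payload[end:]
```

When J is 1, the third element of the returned tuple holds the journal
section. When J is 0, a conforming payload leaves it empty.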

396   0                   1                   2                   3
397   0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
398  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
399  | Delta Time 0 (if Z = 1)       | MIDI Command 0 ...            |
400  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
401  | Delta Time 1 ...              | MIDI Command 1 ...            |
402  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
403  | Delta Time 2 ...              | MIDI Command 2 ...            |
404  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
405  |                             .....                             |
406  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
407  | Delta Time N ...              | MIDI Command N (may be empty) |
408  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

410                  Figure 3 -- MIDI list structure

412  If the header flag Z is 1, the MIDI list begins with a complete MIDI
413  command (MIDI Command 0) preceded by a delta time (Delta Time 0). If Z
414  is 0, the Delta Time 0 field is not present in the MIDI list, and MIDI
415  Command 0 has an implicit delta time of 0. The MIDI list structure may
416  also optionally encode a list of N additional complete MIDI commands.

418  Each additional command MUST be preceded by a delta time.

420  The final MIDI Command field in the MIDI list MAY be empty. Senders may
421  use this feature to precisely set the media time of a packet.

423  3.1 Timestamps

425  The RTP MIDI delta time syntax is a modified form of the MIDI File delta
426  time syntax [1]. RTP MIDI delta times use 1-4 octet fields to encode
427  32-bit unsigned integers. Figure 4 shows the encoded and decoded forms
428  of delta times. Note that delta time values may be legally encoded in
429  multiple formats; for example, there are four legal ways to encode the
430  zero delta time (0x00, 0x8000, 0x800000, 0x80000000).

432  RTP MIDI uses delta times to encode a timestamp for each MIDI command.
433  The timestamp for MIDI Command K is the summation (modulo 2^32) of the
434  RTP timestamp and decoded delta times 0 through K. This cumulative
435  coding technique, borrowed from MIDI File delta time coding, is
436  efficient because it reduces the number of multi-octet delta times.

438  One-Octet Delta Time:

440     Encoded form: 0ddddddd
441     Decoded form: 00000000 00000000 00000000 0ddddddd

443  Two-Octet Delta Time:

445     Encoded form: 1ccccccc 0ddddddd
446     Decoded form: 00000000 00000000 00cccccc cddddddd

448  Three-Octet Delta Time:

450     Encoded form: 1bbbbbbb 1ccccccc 0ddddddd
451     Decoded form: 00000000 000bbbbb bbcccccc cddddddd

453  Four-Octet Delta Time:

455     Encoded form: 1aaaaaaa 1bbbbbbb 1ccccccc 0ddddddd
456     Decoded form: 0000aaaa aaabbbbb bbcccccc cddddddd

458              Figure 4 -- Decoding delta time formats

460  All command timestamps in a packet MUST be less than or equal to the RTP
461  timestamp of the next packet in the stream (modulo 2^32).

463  By default, a command timestamp indicates the execution time for the
464  command. The difference between two timestamps indicates the time delay
465  between the execution of the commands. This difference may be zero,
466  coding simultaneous execution.

468  This default interpretation of timestamp semantics is a good choice to
469  use for transcoding a Standard MIDI File (SMF) into an RTP MIDI stream.
470  To code an SMF that uses metric time markers, use the tempo map (encoded
471  as SMF meta-events) to convert metric units into seconds-based RTP
472  timestamp units.

474  MIDI command sources that use implicit command timing, such as MIDI 1.0
475  DIN cables, must be annotated with timestamps as part of the RTP
476  transcoding process. Appendix C.2 describes session configuration tools
477  for transcoding MIDI sources of this type.

479  3.2 Command Coding

481  Each non-empty MIDI Command field in the MIDI list codes one of the MIDI
482  command types that may legally appear on a MIDI 1.0 DIN cable. Note
483  that SMF meta-events do not fit this definition and MUST NOT appear in
484  the MIDI list. As a rule, each MIDI Command field codes a complete
485  command, in the binary command format defined in [1]. In the remainder
486  of this section, we describe exceptions to this rule.

488  The first MIDI channel command in the MIDI list MUST include a status
489  octet. Running status coding, as defined in [1], MAY be used for all
490  subsequent MIDI channel commands in the list. If the status octet of
491  the first MIDI channel command in the list does not appear in the source
492  data stream, the P (phantom) header bit MUST be set to 1. In all other
493  cases, the P bit MUST be set to 0.

495  As in [1], System Common and System Exclusive messages (0xF0 ... 0xF7)
496  cancel running status state, but System Real-time messages (0xF8 ...
497  0xFF) do not affect running status state. As receivers MUST be able to
498  decode running status, sender implementors should feel free to use
499  running status to improve bandwidth efficiency. However, senders SHOULD
500  NOT introduce timing jitter into an existing MIDI command stream through
501  an inappropriate use or removal of running status coding.

503  On a MIDI 1.0 DIN cable [1], a System Real-time command may be embedded
504  inside of another "host" MIDI command. This syntactic construction is
505  not supported in the payload format: a MIDI Command field in the MIDI
506  list codes exactly one complete MIDI command.

508  To encode an embedded System Real-time command, senders MUST extract the
509  command from its host, and code it in the MIDI list as a separate
510  command. The host command and System Real-time command SHOULD appear in
511  the same MIDI list. The delta time of the System Real-time command
512  SHOULD result in a command timestamp that encodes the System Real-time
513  command placement in its original embedded position.

515  Two methods are provided for encoding MIDI System Exclusive (SysEx)
516  commands in the MIDI list. A SysEx command may be encoded in a MIDI
517  Command field verbatim: a 0xF0 octet, followed by an arbitrary number of
518  data octets, followed by a 0xF7 octet.

520  Alternatively, a SysEx command may be encoded as multiple segments. The
521  command is divided into two or more SysEx command segments; each segment
522  is encoded in its own MIDI Command field in the MIDI list.

524  The payload format supports segmentation in order to encode SysEx
525  commands that encode information in the temporal pattern of data octets.
526  By encoding these commands as a series of segments, each data octet may
527  be associated with a distinct delta time. Segmentation also supports
528  the coding of large SysEx commands across several packets.

530  To segment a SysEx command, first partition its data octet list into two
531  or more sublists. Each sublist MUST contain at least one data octet.
532  To complete the segmentation, add status octets to the head and tail of
533  each sublist, as detailed in Figure 5. Figure 6 shows example
534  segmentations of a SysEx command.

536  The relative ordering of SysEx command segments in a MIDI list must
537  match the relative ordering of the sublists in the original SysEx
538  command. Only System Real-time MIDI commands may appear between SysEx
539  command segments. If the command segments of a SysEx command are placed
540  in the MIDI lists of two or more RTP packets, the segment ordering rules
541  apply to the concatenation of all affected MIDI lists.
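
The segmentation procedure, together with the delta time coding of
Section 3.1, can be sketched in Python as follows. The sketch is
illustrative only. It uses the head and tail status octets listed in
Figure 5, always emits the shortest delta time encoding (other encodings
are equally legal), and assumes a MIDI list coded with Z = 1 so that
every segment carries an explicit delta time. The function names are
hypothetical.

```python
from typing import List, Sequence


def encode_delta_time(value: int) -> bytes:
    """Encode a delta time in the shortest 1-4 octet form of Figure 4."""
    if not 0 <= value < (1 << 28):
        raise ValueError("delta time must fit in 28 bits")
    octets = [value & 0x7F]                   # final octet: high bit clear
    value >>= 7
    while value:
        octets.append(0x80 | (value & 0x7F))  # earlier octets: high bit set
        value >>= 7
    return bytes(reversed(octets))


def segment_sysex(command: bytes, sizes: Sequence[int]) -> List[bytes]:
    """Split a verbatim SysEx command into segments of the given sizes."""
    if len(command) < 2 or command[0] != 0xF0 or command[-1] != 0xF7:
        raise ValueError("expected a verbatim SysEx command (0xF0 ... 0xF7)")
    data = command[1:-1]
    if len(sizes) < 2 or any(n < 1 for n in sizes) or sum(sizes) != len(data):
        raise ValueError("need two or more non-empty sublists covering the data")
    segments = []
    pos = 0
    for index, size in enumerate(sizes):
        head = 0xF0 if index == 0 else 0xF7               # first vs. later
        tail = 0xF7 if index == len(sizes) - 1 else 0xF0  # last vs. earlier
        segments.append(bytes([head]) + data[pos:pos + size] + bytes([tail]))
        pos += size
    return segments


def midi_list_from_segments(
    segments: Sequence[bytes], delta_times: Sequence[int]
) -> bytes:
    """Interleave delta times and segments (Z = 1 layout of Figure 3)."""
    if len(segments) != len(delta_times):
        raise ValueError("one delta time is required per segment")
    return b"".join(
        encode_delta_time(delta) + segment
        for segment, delta in zip(segments, delta_times)
    )
```

For example, segmenting the command F0 01 02 03 04 F7 with sublist sizes
(1, 2, 1) yields the segments F0 01 F0, F7 02 03 F0, and F7 04 F7.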

 -----------------------------------------------------------
| Sublist Position  | Head Status Octet | Tail Status Octet |
|-----------------------------------------------------------|
|      first        |       0xF0        |       0xF0        |
|-----------------------------------------------------------|
|      middle       |       0xF7        |       0xF0        |
|-----------------------------------------------------------|
|      last         |       0xF7        |       0xF7        |
 -----------------------------------------------------------

        Figure 5 -- Command segmentation status octets

SysEx commands carried on a MIDI 1.0 DIN cable may use the "dropped
0xF7" construction [1]. In this coding method, the 0xF7 octet is
dropped from the end of the SysEx command, and the status octet of the
next MIDI command acts both to terminate the SysEx command and start
the next command. To encode this construction in the payload format,
follow these steps:

  o Determine the appropriate delta times for the SysEx command and
    the command that follows the SysEx command.

  o Insert the "dropped" 0xF7 octet at the end of the SysEx command,
    to form the standard SysEx syntax.

  o Code both commands into the MIDI list using the rules above.

  o Replace the 0xF7 octet that terminates the verbatim SysEx
    encoding or the last segment of the segmented SysEx encoding
    with a 0xF5 octet. This substitution informs the receiver
    of the original dropped 0xF7 coding.

[1] reserves the System Common opcodes 0xF4 and 0xF5 and the System
Real-time opcodes 0xF9 and 0xFD for future use. We refer to these
opcodes as undefined opcodes. By default, undefined opcodes MUST NOT
appear in a MIDI Command field in the MIDI list.

During session configuration, a stream may be customized to allow
transport of the undefined opcodes (Appendix C.1.3). In this case,
commands that use the undefined System Common opcodes MUST be
terminated with a 0xF7 octet and coded using the System Exclusive
verbatim rule.
Commands that use the undefined System Real-time opcodes MUST be coded
using the System Real-time rules.

  Original SysEx command:

    0xF0 0x01 0x02 0x03 0x04 0x05 0x06 0x07 0x08 0xF7

  A two-segment segmentation:

    0xF0 0x01 0x02 0x03 0x04 0xF0

    0xF7 0x05 0x06 0x07 0x08 0xF7

  A different two-segment segmentation:

    0xF0 0x01 0xF0

    0xF7 0x02 0x03 0x04 0x05 0x06 0x07 0x08 0xF7

  A three-segment segmentation:

    0xF0 0x01 0x02 0xF0

    0xF7 0x03 0x04 0xF0

    0xF7 0x05 0x06 0x07 0x08 0xF7

  The segmentation with the largest number of segments:

    0xF0 0x01 0xF0

    0xF7 0x02 0xF0

    0xF7 0x03 0xF0

    0xF7 0x04 0xF0

    0xF7 0x05 0xF0

    0xF7 0x06 0xF0

    0xF7 0x07 0xF0

    0xF7 0x08 0xF7

        Figure 6 -- Example segmentations

4. The Recovery Journal System

The recovery journal is the default resiliency tool for unreliable
transport. In this section, we normatively define the roles that
senders and receivers play in the recovery journal system.

MIDI is a fragile code. A single lost command in a MIDI command stream
may produce an artifact in the rendered performance. We normatively
classify rendering artifacts into two categories:

  o Transient artifacts. Transient artifacts produce immediate
    but short-term glitches in the performance. For example, a lost
    NoteOn (0x9) command produces a transient artifact: one note
    fails to play, but the artifact does not extend beyond the end
    of that note.

  o Indefinite artifacts. Indefinite artifacts produce long-lasting
    errors in the rendered performance. For example, a lost NoteOff
    (0x8) command may produce an indefinite artifact: the note that
    should have been ended by the lost NoteOff command may sustain
    indefinitely. As a second example, the loss of a Control Change
    (0xB) command for controller number 7 (Channel Volume) may
    produce an indefinite artifact: after the loss, all notes on
    the channel may play too softly or too loudly.

The purpose of the recovery journal system is to satisfy the recovery
journal mandate: the MIDI performance rendered from an RTP MIDI stream
sent over unreliable transport MUST NOT contain indefinite artifacts.

The recovery journal system does not use packet retransmission to
satisfy this mandate. Instead, each packet includes a special section,
called the recovery journal.

The recovery journal codes the history of the stream, back to an
earlier packet called the checkpoint packet. The range of coverage for
the journal is called the checkpoint history. The recovery journal
codes the information necessary to recover from the loss of an
arbitrary number of packets in the checkpoint history. Appendix A.1
normatively defines the checkpoint packet and the checkpoint history.

When a receiver detects a packet loss, it compares its own knowledge
about the history of the stream with the history information coded in
the recovery journal of the packet that ends the loss event. By noting
the differences in these two versions of the past, a receiver is able
to transform all indefinite artifacts in the rendered performance into
transient artifacts, by executing MIDI commands to repair the stream.

We now state the normative role for senders in the recovery journal
system.

Senders prepare a recovery journal for every packet in the stream. In
doing so, senders choose the checkpoint packet identity for the
journal. Senders make this choice by applying a sending policy.
Appendix C.1.2 normatively defines three sending policies:
"closed-loop", "open-loop", and "anchor".

By default, senders MUST use the closed-loop sending policy.
If the session description overrides this default policy, by using the
fmtp parameter j_update defined in Appendix C.1.2, senders MUST use
the specified policy.

After choosing the checkpoint packet identity for a packet, the sender
creates the recovery journal. By default, this journal MUST conform to
the normative semantics in Section 5 and Appendices A-B in this memo.
In Appendix C.1.3, we define fmtp parameters that modify the normative
semantics for recovery journals. If the session description uses these
parameters, the journal created by the sender MUST conform to the
modified semantics.

Next, we state the normative role for receivers in the recovery
journal system.

A receiver MUST detect each RTP sequence number break in a stream. If
the sequence number break is due to a packet loss event (as defined in
[2]) the receiver MUST repair all indefinite artifacts in the rendered
MIDI performance caused by the loss. If the sequence number break is
due to an out-of-order packet (as defined in [2]) the receiver MUST
NOT take actions that introduce indefinite artifacts (ignoring the
out-of-order packet is a safe option).

Receivers take special precautions when entering or exiting a session.
A receiver MUST process the first received packet in a stream as if it
were a packet that ends a loss event. Upon exiting a session, a
receiver MUST ensure that the rendered MIDI performance does not end
with indefinite artifacts.

Receivers are under no obligation to perform indefinite artifact
repairs at the moment a packet arrives. A receiver that uses a playout
buffer may choose to wait until the moment of rendering before
processing the recovery journal, as the "lost" packet may be a late
packet that arrives in time to use.

Next, we state the normative role for the creator of the session
description in the recovery journal system.
Depending on the application, the sender, the receivers, and other
parties may take part in creating or approving the session
description.

A session description that specifies the default closed-loop sending
policy and the default recovery journal semantics satisfies the
recovery journal mandate. However, these default behaviors may not be
appropriate for all sessions. If the creators of a session description
use the parameters defined in Appendix C.1 to override these defaults,
the creators MUST ensure that the parameters define a system that
satisfies the recovery journal mandate.

Finally, we note that this memo does not specify sender or receiver
recovery journal algorithms. Implementations are free to use any
algorithm that conforms to the requirements in this section. The
non-normative [18] discusses sender and receiver algorithm design.

5. Recovery Journal Format

This section introduces the structure of the recovery journal, and
defines the bitfields of recovery journal headers. Appendices A-B
complete the bitfield definition of the recovery journal. The recovery
journal has a three-level structure:

  o Top-level header.

  o Channel and system journal headers. Encodes recovery
    information for a single voice channel (channel journal) or
    for all systems commands (system journal).

  o Chapters. Describes recovery information for a single MIDI
    command type.

Figure 7 shows the top-level structure of the recovery journal. A
recovery journal consists of a 3-octet header, optionally followed by
a system journal and a list of channel journals. These elements appear
in the order shown in Figure 7.

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|S|Y|A|R|TOTCHAN|   Checkpoint Packet Seqnum    | S-journal ... |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                      Channel journals ...                     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

        Figure 7 -- Top-level recovery journal format

If the Y bit is set to 1, a system journal follows the recovery
journal header. If the A bit is set to 1, the recovery journal ends
with a list of (TOTCHAN + 1) channel journals. If A and Y are both
zero, the recovery journal only contains the 3-octet header, and is
considered to be an "empty" journal.

A MIDI channel may be represented by (at most) one channel journal in
a recovery journal. Channel journals MUST appear in the recovery
journal in ascending channel-number order.

The S (single-packet loss) bit appears in most recovery journal
structures. The S bit helps receivers efficiently parse the recovery
journal in the common case of the loss of a single packet. Appendix
A.1 defines S bit semantics.

The R bit is reserved. The semantics for all R fields are uniform
throughout the recovery journal, and are defined in Appendix A.1.

The 16-bit Checkpoint Packet Seqnum field codes the sequence number of
the checkpoint packet for this journal. The choice of the checkpoint
packet sets the depth of the checkpoint history for the journal
(defined in Appendix A.1).

Receivers may use the Checkpoint Packet Seqnum field of the packet
that ends a loss event to verify that the journal checkpoint history
covers the entire loss event. The checkpoint history covers the loss
event if the Checkpoint Packet Seqnum field is less than or equal to
the highest RTP sequence number previously received on the stream
(modulo 2^16).

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|S| CHAN  |R|      LENGTH       |P|W|N|A|T|C|E|M|  Chapters ... |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

        Figure 8 -- Channel journal format

Figure 8 shows the structure of a channel journal: a 3-octet header,
followed by a list of leaf elements called channel chapters. A channel
journal encodes information about MIDI commands on the MIDI channel
coded by the 4-bit CHAN header field.

The 10-bit LENGTH field codes the length of the channel journal. The
semantics for LENGTH fields are uniform throughout the recovery
journal, and are defined in Appendix A.1.

The third octet of the channel journal header is the Table of Contents
(TOC) of the channel journal. The TOC is a set of bits that encode the
presence of a chapter in the journal. Each chapter contains
information about a certain class of MIDI channel command:

  o Chapter P: MIDI Program Change (0xC)
  o Chapter W: MIDI Pitch Wheel (0xE)
  o Chapter N: MIDI NoteOff (0x8), NoteOn (0x9)
  o Chapter A: MIDI Poly Aftertouch (0xA)
  o Chapter T: MIDI Channel Aftertouch (0xD)
  o Chapter C: MIDI Control Change (0xB)
  o Chapter E: MIDI Reset All Controllers (part of 0xB)
  o Chapter M: MIDI Parameter System (part of 0xB)

Chapters appear in a list following the header, in order of their
appearance in the TOC. Appendices A.2-9 describe the bitfield format
for each chapter, and define the conditions under which a chapter type
MUST appear in the recovery journal. If any chapter types are required
for a channel, an associated channel journal MUST appear in the
recovery journal.

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|S|D|V|Q|F|X|      LENGTH       |      System chapters ...      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

        Figure 9 -- System journal format

Figure 9 shows the structure of the system journal: a 2-octet header,
followed by a list of system chapters. System chapters code
information about a specific class of MIDI Systems command:

  o Chapter D: Song Select (0xF3), Tune Request (0xF6), Reset (0xFF),
               undefined System commands (0xF4, 0xF5, 0xF9, 0xFD)
  o Chapter V: Active Sense (0xFE)
  o Chapter Q: Sequencer State (0xF2, 0xF8, 0xF9, 0xFA, 0xFB, 0xFC)
  o Chapter F: MTC Tape Position (0xF1, 0xF0 0x7F 0xcc 0x01 0x01)
  o Chapter X: System Exclusive (all other 0xF0)

If header bits D, V, Q, or F are set to 1, one chapter for each set
bit appears in the system chapter list. The chapter list ordering
follows the ordering of the header bits. If header bit X is set to 1,
one or more Chapter X bitfields appear at the end of the chapter list.

Appendix B describes the bitfield format for the system chapters, and
defines the conditions under which a chapter type MUST appear in the
recovery journal. If any system chapter type is required to appear in
the recovery journal, the system journal MUST appear in the recovery
journal.

6. Session Description Protocol

RTP does not perform session management. Instead, RTP is designed to
work together with tools that perform session management, such as the
Session Initiation Protocol (SIP, [14]) and the Real Time Streaming
Protocol (RTSP, [16]). RTP interacts with session management tools via
another standard, the Session Description Protocol (SDP, [6]).

SDP is a textual format for specifying session descriptions. Session
descriptions specify the network transport and media encoding for RTP
streams. SIP and RTSP coordinate the exchange of session descriptions
between participants. In SIP, session descriptions also support
negotiation [15].

Below, we show session description examples for native (Section 6.1)
and mpeg4-generic (Section 6.2) streams. In Section 6.3, we introduce
session configuration tools that may be used to customize streams.

6.1 Session Descriptions for Native Streams

The session description below shows a minimal session description for
a native stream sent over unicast UDP transport.

  v=0
  o=lazzaro 2520644554 2838152170 IN IP4 first.example.net
  s=Example
  t=0 0
  m=audio 5004 RTP/AVP 96
  c=IN IP4 192.0.2.94
  a=rtpmap:96 rtp-midi/44100

The rtpmap attribute line uses the rtp-midi MIME type to specify a
native stream. If the session parties send and receive RTP packets,
the streams form bi-directional MIDI connections, suitable for use by
the MIDI System commands that use handshaking protocols [1].

We describe this session description as minimal, because it does not
customize the stream. Without such customization, a native stream has
these characteristics:

  1. If the stream uses unreliable transport (unicast UDP, multicast
     UDP, ...) the recovery journal system is in use, and the RTP
     payload contains both the MIDI command section and the journal
     section. If the stream uses reliable transport (TCP, TLS, ...),
     the stream does not use journalling, and the payload contains
     only the MIDI command section (Section 2.2).

  2. If the stream uses the recovery journal system, the recovery
     journal system uses the default sending policy and the default
     journal semantics (Section 4).

  3. In the MIDI command section of the payload, command timestamps
     use the default semantics (Section 3).

  4. The media time encoded by an RTP packet may range from 0 to
     200 ms, and the RTP timestamp difference between sequential
     packets in the stream may be arbitrarily large (Section 2.1).

  5. If more than one minimal rtp-midi stream appears in a session,
     the MIDI name spaces for these streams are independent: channel
     1 in the first stream does not reference the same MIDI channel
     as channel 1 in the second stream.

  6. The rendering method for the stream is not specified.

6.2 Session Description for mpeg4-generic Streams

An mpeg4-generic stream uses an MPEG 4 Audio Object Type to render
MIDI into audio [3]. Three Object Types are compatible with MIDI:

  o General MIDI (Audio Object Type ID 15), based on the General
    MIDI rendering standard [1].

  o Wavetable Synthesis (Audio Object Type ID 14), based on the
    Downloadable Sounds Level 2 (DLS 2) rendering standard [9].

  o Main Synthetic (Audio Object Type ID 13), based on Structured
    Audio and the programming language SAOL [5].

The session description below shows a minimal session description for
an mpeg4-generic stream sent over unicast UDP transport. This example
uses the General MIDI Audio Object Type under Synthesis Profile @
Level 2.

  v=0
  o=lazzaro 2520644554 2838152170 IN IP6 first.example.net
  s=Example
  t=0 0
  m=audio 5004 RTP/AVP 96
  c=IN IP6 FF1E:03AD::7F2E:172A:1E24
  a=rtpmap:96 mpeg4-generic/44100
  a=fmtp:96 streamtype=5; mode=rtp-midi; profile-level-id=12;
  a=fmtp:96 config=7A124D546864000000060000000100604D547
            26B0000000400FF2F000

(The linebreak in the second fmtp line accommodates memo formatting
restrictions; SDP does not have continuation lines.)

The fmtp attribute lines code the parameters that MUST appear in an
mpeg4-generic session description [4]. The "streamtype" parameter MUST
be set to 5, and the "mode" parameter MUST be set to "rtp-midi". The
"profile-level-id" parameter MUST be set to the MPEG-4 Profile Level.

The "config" parameter MUST appear in the session description.
The config value is a hexadecimal encoding [4] of the
AudioSpecificConfig data block [7] for the stream. AudioSpecificConfig
encodes the Audio Object Type for the stream, and also encodes
initialization data (SAOL programs, DLS 2 wave tables, etc). Standard
MIDI Files encoded in the AudioSpecificConfig MUST be ignored by the
receiver.

We describe this session description as minimal, because it does not
customize the stream. In Section 6.1, we describe the behavior of a
minimal native stream, as a numbered list of characteristics. Items
1-4 on that list also describe the minimal mpeg4-generic stream, but
items 5 and 6 require restatements, as listed below:

  5. If more than one minimal mpeg4-generic stream appears in
     a session, each stream uses an independent instance of the
     Audio Object Type coded in the config parameter value.

  6. A minimal mpeg4-generic stream encodes the AudioSpecificConfig
     as an inline hexadecimal constant. If the session description
     is sent over UDP, it may be impossible to transport large
     AudioSpecificConfig blocks, as the Maximum Transmission Unit
     (MTU) of the underlying network limits the UDP packet size
     (for Ethernet, the MTU is 1500 octets).

6.3 Session Configuration Tools

This section introduces the session configuration tools for RTP MIDI
sessions. The tools add features to the minimal streams described in
Sections 6.1-2, and support several types of services:

  o Journal customization. The j_sec and j_update parameters
    configure the use of the payload journal section. The
    ch_default, ch_unused, ch_never, and ch_anchor parameters
    configure the semantics of the recovery journal chapters.
    These fmtp parameters are described in Appendix C.1, and
    override the default stream behaviors 1 and 2 listed in
    Section 6.1 and referenced in Section 6.2.

  o MIDI command timestamp semantics. The tsmode, octpos,
    mperiod, and linerate parameters customize the semantics
    of timestamps in the MIDI command section. These parameters
    let RTP MIDI accurately encode the implicit time coding of
    MIDI 1.0 DIN cables. These fmtp parameters are described in
    Appendix C.2, and override default stream behavior 3 listed in
    Section 6.1 and referenced in Section 6.2.

  o Media time. The standard SDP attributes ptime and
    maxptime define the media time encoded by a packet. The
    guardtime fmtp parameter sets the minimum sending rate of
    stream packets. These tools are described in Appendix C.3,
    and override default stream behavior 4 listed in Section
    6.1 and referenced in Section 6.2.

  o Multiple streams. The musicport parameter labels the
    MIDI name space of multi-stream sessions. The zerosync
    parameter supports synchronization in multi-stream sessions.
    These fmtp parameters are described in Appendix C.4, and
    override default stream behavior 5 in Sections 6.1 and 6.2.

  o MIDI rendering. Several fmtp parameters specify the MIDI
    rendering method of a stream. These parameters are described
    in Appendix C.5, and override default stream behavior 6 in
    Sections 6.1 and 6.2.

7. Extensibility

The payload format defined in this memo exclusively encodes all
commands that may legally appear on a MIDI 1.0 DIN cable.

Many worthy uses of MIDI over RTP do not fall within the narrow scope
of the format. For example, the format does not support the direct
transport of Standard MIDI File (SMF) meta-event and metric timing
data. As a second example, the format does not define transport tools
for user-defined commands (apart from tools to support System
Exclusive commands [1]).

The format does not provide an extension mechanism to support new
features of this nature, by design.
Instead, we encourage the development of new payload formats for
specialized musical applications. The IETF session management tools
[15] [16] support codec negotiation, to facilitate the use of new
formats in a backward-compatible way.

However, the payload format does provide several extensibility tools,
which we list below:

  o Rendering. The payload format may be extended to support
    new MIDI renderers (Appendix C.5.1). The extension
    mechanism uses the standard MIME registration process [20].

  o Journalling. As described in Appendix C.1, new token
    values for the j_sec and j_update fmtp parameters may
    be defined in IETF standards-track documents. This
    mechanism supports the design of new journal formats
    and the definition of new journal sending policies.

  o Undefined opcodes. [1] reserves 4 MIDI System opcodes
    for future use (0xF4, 0xF5, 0xF9, 0xFD). If updates
    to [1] define the reserved opcodes, IETF standards-track
    documents may be defined to provide resiliency support for
    the commands. Opaque LEGAL fields appear in System Chapter
    D for this purpose (Appendix B.1.1).

A final form of extensibility involves the inclusion of the payload
format in framework documents. Framework documents describe how to
combine protocols to form a platform for interoperable applications.
For example, a network musical performance [12] framework might define
how to use SIP [14], SDP [6] and RTP/AVP [2] [3] to support real-time
performances between geographically-distributed players.

8. Congestion Control

RTP MIDI has congestion control issues that are unique for an audio
payload format. In applications such as network musical performance
[12], the packet rate is linked to the gestural rate of a human
performer.

Senders MUST monitor the MIDI command source for patterns that result
in excessive packet rates, and take actions during RTP transcoding to
reduce the RTP packet rate. [18] offers implementation guidance on
this issue.

A. The Recovery Journal Channel Chapters

A.1 Recovery Journal Definitions

This Appendix defines the terminology and the coding idioms that are
used in the recovery journal bitfield descriptions in Section 5
(journal header structure), Appendices A.2-9 (channel journal
chapters) and Appendices B.1-5 (system journal chapters).

We assume that the recovery journal resides in the journal section of
an RTP packet with sequence number I ("packet I") and that the
Checkpoint Packet Seqnum field in the top-level recovery journal
header refers to a packet with sequence number C. Unless stated
otherwise, algorithms are assumed to use modulo 2^16 arithmetic for
calculations on 16-bit sequence numbers and modulo 2^32 arithmetic for
calculations on 32-bit extended sequence numbers.

Several bitfield coding idioms appear throughout the recovery journal
system, with consistent semantics. Most recovery journal elements
begin with an "S" (Single-packet loss) bit. S bits are designed to
help receivers efficiently parse through the recovery journal
hierarchy in the common case of the loss of a single packet.

By default, all S bits MUST be set to 1. If a recovery journal element
in packet I encodes data about a command stored in the MIDI command
section of packet I - 1, its S bit MUST be set to 0. If a recovery
journal element has its S bit set to 0, all higher-level recovery
journal elements that contain it MUST also have S bits that are set to
0, including the top-level recovery journal header.

Other consistent bitfield coding idioms are described below:

  o R flag bit. R flag bits are reserved for future use. Senders
    MUST set R bits to 0. Receivers MUST ignore R bit values.

  o LENGTH field. All fields named LENGTH (as distinct from LEN)
    code the number of octets in the structure that contains it,
    including the header it resides in and all hierarchical levels
    below it. If a structure contains a LENGTH field, a receiver
    MUST use the LENGTH field value to advance past the structure
    during parsing, rather than use knowledge about the internal
    format of the structure.

We now define normative terms used to describe recovery journal
semantics, grouped by order of appearance in Appendices A.2-9.

  o Checkpoint history. The checkpoint history of a recovery journal
    is the concatenation of the MIDI command sections of packets C
    through I - 1. The last command in the MIDI command section for
    packet I - 1 is considered the most recent command; the first
    command in the MIDI command section for packet C is the oldest
    command. If command X is less recent than command Y, X is
    considered to be "before Y". A checkpoint history with no
    commands is considered to be empty. The checkpoint history
    never contains the MIDI command section of the packet I (the
    packet containing the recovery journal), so if C == I, the
    checkpoint history is empty by definition.

  o Session history. The session history of a recovery journal is
    the concatenation of MIDI command sections from the first
    packet of the session up to packet I - 1. The definitions of
    command recency and history emptiness follow those in the
    checkpoint history. The session history never contains the
    MIDI command section of packet I, and so the session history of
    the first packet in the session is empty by definition.

  o Finished/unfinished commands. If all octets of a MIDI command
    appear in the session history, the command is defined to be
    finished. If some but not all octets of a command appear
    in the session history, the command is defined to be unfinished.
    Unfinished commands occur if segments of a SysEx command appear
    in several RTP packets. For example, if a SysEx command is coded
    as 3 segments, with segment 1 in packet K, segment 2 in packet
    K + 1, and segment 3 in packet K + 2, the session histories for
    packets K + 1 and K + 2 contain unfinished versions of the
    command.

  o Active commands. Active commands are MIDI commands that do not
    appear before one of the following commands in the session
    history: System Reset (0xFF), General MIDI System Enable
    (0xF0 0x7E 0xcc 0x09 0x01 0xF7), General MIDI System Disable
    (0xF0 0x7E 0xcc 0x09 0x00 0xF7).

  o N-active commands. N-active commands are MIDI commands that do
    not appear before one of the following commands in the session
    history: System Reset (0xFF), General MIDI System Enable (0xF0
    0x7E 0xcc 0x09 0x01 0xF7), General MIDI System Disable (0xF0
    0x7E 0xcc 0x09 0x00 0xF7), MIDI Control Change numbers 123-127
    (numbers with All Notes Off semantics) or 120 (All Sound Off).

  o C-active commands. C-active commands are MIDI commands that do
    not appear before one of the following commands in the session
    history: System Reset (0xFF), General MIDI System Enable (0xF0
    0x7E 0xcc 0x09 0x01 0xF7), General MIDI System Disable (0xF0
    0x7E 0xcc 0x09 0x00 0xF7), MIDI Control Change number 121 (Reset
    All Controllers).

  o Parameter system. A MIDI feature that provides two sets of
    16,384 parameters to expand the 0-127 controller number space.
    The Registered Parameter Names (RPN) system and the
    Non-Registered Parameter Names (NRPN) system each provide
    16,384 parameters.

  o Parameter system transaction. The values of RPNs and NRPNs are
    changed by a series of Control Change commands that form a
    parameter system transaction. A transaction begins with two
    Control Change commands to set the parameter number (controller
    numbers 98 and 99 for NRPNs, controller numbers 100 and 101 for
    RPNs). The transaction continues with an arbitrary number of
    Data Entry (controller numbers 6 and 38) and Data Button
    (controller numbers 96 and 97) Control Change commands to
    set the parameter value. The transaction ends with a second
    pair of (98, 99) or (100, 101) Control Change commands. These
    terminal commands are considered a part of the transaction.
    In addition, the terminal commands start a second parameter
    system transaction. Thus, the commands belong to two
    transactions.

  o Initiated parameter system transaction. An initiated parameter
    system transaction is a transaction whose (98, 99) or (100, 101)
    initial Control Change command pair appears in the session
    history. Unpaired Control Change commands for controller numbers
    98-101 do not form an initiated transaction. The termination of
    a transaction does not change the "initiated" status of the
    transaction.

The chapter definitions in Appendices A.2-9 and B.1-5 reflect the
default recovery journal behavior. The ch_default, ch_unused,
ch_never, and ch_anchor parameters modify these definitions, as
described in Appendix C.1.3.

The chapter definitions specify if data MUST be present in the
journal. Senders MAY also include non-required data in the journal.
This optional data MUST comply with the normative chapter definition.
For example, if a chapter definition states that a field codes data
from the most recent active command in the session history, the sender
MUST NOT code inactive commands or older commands in the field.

Finally, we note that channel journals only encode information about
MIDI commands appearing on the MIDI channel the journal protects.
All references to MIDI commands in Appendices A.2-9 should be read as
"MIDI commands appearing on this channel."

A.2 Chapter P: MIDI Program Change

A channel journal MUST contain Chapter P if an active Program Change
(0xC) command appears in the checkpoint history. Figure A.2.1 shows the
format for Chapter P.

     0                   1                   2
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |S|   PROGRAM   |B| BANK-COARSE |C|  BANK-FINE  |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Figure A.2.1 -- Chapter P format

The chapter has a fixed size of 24 bits. The PROGRAM field indicates
the data value of the most recent active Program Change command in the
session history. By default, the B, BANK-COARSE, C, and BANK-FINE
fields MUST be set to 0.

However, if an active Control Change (0xB) command for controller number
0 (Bank Select Coarse) appears before the Program Change command in the
session history, the B bit MUST be set to 1, and the BANK-COARSE field
MUST code the data value of the Control Change command. If this Control
Change command is also C-active, the C bit MUST be set to 1.

If the B bit is set to 1, the BANK-FINE field MUST code the data value
of the most recent Control Change command for controller number 32 (Bank
Select Fine) that preceded the Program Change command coded in the
PROGRAM field and followed the Control Change command coded in the
BANK-COARSE field. If no such Control Change command exists, the
BANK-FINE field MUST be set to 0.

A.3 Chapter W: MIDI Pitch Wheel

A channel journal MUST contain Chapter W if an active MIDI Pitch Wheel
(0xE) command appears in the checkpoint history. Figure A.3.1 shows the
format for Chapter W.

     0                   1
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |S|    FIRST    |R|   SECOND    |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Figure A.3.1 -- Chapter W format

The chapter has a fixed size of 16 bits. The FIRST and SECOND fields
are the 7-bit values of the first and second data octets of the most
recent active Pitch Wheel command in the session history.

A.4 Chapter N: MIDI NoteOff and NoteOn

In this Appendix, we consider NoteOn commands with zero velocity to be
NoteOff commands. Readers may wish to review the Appendix A.1
definition of "N-active commands" before reading this Appendix.

A channel journal MUST contain Chapter N if an N-active MIDI NoteOn
(0x9) or NoteOff (0x8) command appears in the checkpoint history.
Figure A.4.1 shows the format for Chapter N.

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |B|     LEN     |  LOW  | HIGH  |S|   NOTENUM   |Y|  VELOCITY   |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |S|   NOTENUM   |Y|  VELOCITY   |             ....              |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |    OFFBITS    |    OFFBITS    |     ....      |    OFFBITS    |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Figure A.4.1 -- Chapter N format

Chapter N consists of a 2-octet header, followed by at least one of the
following data structures:

  o A list of note logs to code NoteOn commands.
  o A NoteOff bitfield structure to code NoteOff commands.

The note log list MUST contain an entry for all note numbers whose most
recent checkpoint history appearance is in an N-active NoteOn command.
The NoteOff bitfield structure MUST contain a set bit for all note
numbers whose most recent checkpoint history appearance is in an
N-active NoteOff command.

A note number MUST NOT be coded in both structures.
All note logs and NoteOff bitfield set bits MUST code the most recent
N-active NoteOn or NoteOff reference to a note number in the session
history.

A.4.1 Header Structure

The header for Chapter N, shown in Figure A.4.2, codes the size of the
note list and bitfield structures.

     0                   1
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |B|     LEN     |  LOW  | HIGH  |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Figure A.4.2 -- Chapter N header

The LEN field, a 7-bit integer value, codes the number of 2-octet note
logs in the note list. Zero is a valid value for LEN, and codes an
empty note list.

The 4-bit LOW and HIGH fields code the number of OFFBITS octets that
follow the note log list. LOW and HIGH are unsigned integer values. If
LOW <= HIGH, there are (HIGH - LOW + 1) OFFBITS octets in the chapter.
The value pairs (LOW = 15, HIGH = 0) and (LOW = 15, HIGH = 1) code an
empty NoteOff bitfield structure (i.e., no OFFBITS octets). Other
(LOW > HIGH) value pairs MUST NOT appear in the header.

The B bit provides S-bit functionality (Appendix A.1) for the NoteOff
bitfield structure. By default, the B bit MUST be set to 1. However,
if the MIDI command section of the previous packet (packet I - 1, with I
as defined in Appendix A.1) includes a NoteOff command for the channel,
the B bit MUST be set to 0. If the B bit is set to 0, the higher-level
recovery journal elements that contain Chapter N MUST have S bits that
are set to 0, including the top-level journal header.

The LEN value of 127 codes a note list length of 127 or 128 note logs,
depending on the values of LOW and HIGH. If LEN = 127, LOW = 15, and
HIGH = 0, the note list holds 128 note logs, and the NoteOff bitfield
structure is empty. For other values of LOW and HIGH, LEN = 127 codes
that the note list contains 127 note logs.
In this case, the chapter has (HIGH - LOW + 1) NoteOff OFFBITS octets
if LOW <= HIGH, and has no OFFBITS octets if LOW = 15 and HIGH = 1.

A.4.2 Note Structures

Figure A.4.3 shows the 2-octet note log structure.

     0                   1
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |S|   NOTENUM   |Y|  VELOCITY   |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Figure A.4.3 -- Chapter N note log

The 7-bit NOTENUM field codes the note number for the log. A note
number MUST NOT be represented by multiple note logs in the note list.
The 7-bit VELOCITY field codes the velocity value for the most recent
N-active NoteOn command for the note number in the session history.
VELOCITY is never zero; NoteOn commands with zero velocity are coded as
NoteOff commands in the NoteOff bitfield structure.

The note log does not code the execution time of the NoteOn command.
However, the Y bit codes a hint from the sender about the NoteOn
execution time. The Y bit codes a recommendation to play (Y = 1) or
skip (Y = 0) the NoteOn command recovered from the note log. In a
normative sense, the Y bit is set to 1 if the sender considers the
command coded by the log to be simultaneous with the RTP timestamp of
the packet that contains the log. In all other cases, the Y bit is set
to 0.

Figure A.4.1 shows the NoteOff bitfield structure, as the list of
OFFBITS octets at the end of the chapter. A NoteOff OFFBITS octet codes
NoteOff information for eight consecutive MIDI note numbers, with the
most-significant bit representing the lowest note number. The
most-significant bit of the first OFFBITS octet codes the note number
8*LOW; the most-significant bit of the last OFFBITS octet codes the
note number 8*HIGH.

A set bit codes a NoteOff command for the note number.
In the most efficient coding for the NoteOff bitfield structure, the
first and last octets of the structure contain at least one set bit.
Note that Chapter N does not code NoteOff velocity data.

A.5 Chapter A: MIDI Poly Aftertouch

A channel journal MUST contain Chapter A if an N-active Poly Aftertouch
(0xA) command appears in the checkpoint history. Figure A.5.1 shows the
format for Chapter A.

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |S|     LEN     |S|   NOTENUM   |R|  PRESSURE   |S|   NOTENUM   |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |R|  PRESSURE   |                     ....                      |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Figure A.5.1 -- Chapter A format

The chapter consists of a 1-octet header, followed by a variable-length
list of 2-octet note logs. A note log MUST appear for a note number if
an N-active Poly Aftertouch command for the note number appears in the
checkpoint history. A note number MUST NOT be represented by multiple
note logs in the note list.

The 7-bit LEN field codes the number of note logs in the list, minus
one. Figure A.5.2 reproduces the note log structure of Chapter A.

     0                   1
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |S|   NOTENUM   |R|  PRESSURE   |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Figure A.5.2 -- Chapter A note log

The 7-bit PRESSURE field codes the pressure value of the most recent
N-active Poly Aftertouch command in the session history. The MIDI note
number for this command is coded in the 7-bit NOTENUM field.

A.6 Chapter T: MIDI Channel Aftertouch

A channel journal MUST contain Chapter T if an N-active MIDI Channel
Aftertouch (0xD) command appears in the checkpoint history. Figure
A.6.1 shows the format for Chapter T.

     0
     0 1 2 3 4 5 6 7
    +-+-+-+-+-+-+-+-+
    |S|  PRESSURE   |
    +-+-+-+-+-+-+-+-+

Figure A.6.1 -- Chapter T format

The chapter has a fixed size of 8 bits. The 7-bit PRESSURE field holds
the pressure value of the most recent N-active Channel Aftertouch
command in the session history.

A.7 Chapter C: MIDI Control Change

Readers may wish to review the Appendix A.1 definition of "C-active
commands" before reading this Appendix.

Figure A.7.1 shows the format for Chapter C.

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |S|     LEN     |S|   NUMBER    |A|  VALUE/ALT  |S|   NUMBER    |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |A|  VALUE/ALT  |                     ....                      |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Figure A.7.1 -- Chapter C format

The chapter consists of a 1-octet header, followed by a variable-length
list of 2-octet controller logs. The list MUST contain at least one
controller log. The 7-bit LEN field codes the number of controller logs
in the list, minus one.

A channel journal MUST contain Chapter C if the rules defined in this
Appendix require that one or more controller logs appear in the list.

A.7.1 Log Inclusion Rules

The list MUST contain an entry for controller numbers 0-119 (excepting
controller numbers 0, 6, 32-63, 96-101, and 124-127) if a C-active
Control Change command for a number appears in the checkpoint history.
In addition, the list MUST contain an entry for controller numbers
120-123 if an active Control Change command for a number appears in the
checkpoint history.

A special rule applies to streams that transmit 14-bit controller values
using paired MSB (controller numbers 0-31) and LSB (controller numbers
32-63) Control Change commands.
In this case, if the most recent C-active Control Change command in the
session history for a 14-bit controller uses the MSB controller number,
Chapter C MUST NOT code the associated LSB controller number.

Apart from this exception, and apart from exceptions for controller
numbers 32 (described below) and 38 (described in Appendix A.7.4), the
controller list MUST contain an entry for controller numbers 32-63 if a
C-active Control Change command for a number appears in the checkpoint
history.

If C-active Control Change commands for controller numbers 0 (Bank
Select Coarse) or 32 (Bank Select Fine) appear in the checkpoint
history, the most recent commands for these controller numbers MUST
appear as entries in the controller list, with a single exception: if
the command instances are also coded in the BANK-COARSE and BANK-FINE
fields of Chapter P (Appendix A.2), Chapter C MAY omit the controller
logs for the commands. Note that for this exception to apply, the C bit
for Chapter P MUST be set to 1.

Several controller number pairs are defined to be mutually exclusive.
Controller numbers 124 (Omni Off) and 125 (Omni On) form a mutually
exclusive pair, as do controller numbers 126 (Mono) and 127 (Poly).

If active Control Change commands for one or both members of a mutually
exclusive pair appear in the checkpoint history, exactly one controller
log MUST appear in the controller list to code the pair.

If active Control Change commands for one or both members of a mutually
exclusive pair appear in the session history, at most one controller log
MAY appear in the controller list to code the pair.

In both cases, the controller log that appears in the controller list
MUST code the controller number of the most recent Control Change
command of the pair in the session history.
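
As an informal illustration of the mutually exclusive pair rule, the
Python sketch below picks the controller number that a log would code
for each pair. The function name, the MUTUALLY_EXCLUSIVE table, and the
representation of the history as a list of controller numbers (oldest
first) are assumptions made for this example; they are not part of the
payload format.

```python
# Hypothetical table: each pair member maps to its partner.
# 124/125 are Omni Off/On, 126/127 are Mono/Poly.
MUTUALLY_EXCLUSIVE = {124: 125, 125: 124, 126: 127, 127: 126}


def pair_log_numbers(history):
    """Return the controller numbers to log for mutually exclusive pairs.

    `history` is an assumed representation: the controller numbers of
    the active Control Change commands in the history, oldest first.
    """
    latest = {}
    for number in history:
        partner = MUTUALLY_EXCLUSIVE.get(number)
        if partner is None:
            continue  # not a member of a mutually exclusive pair
        # Key both members to the same slot, so that a later command
        # from either member replaces the earlier one.
        latest[min(number, partner)] = number
    return sorted(latest.values())
```

For a history of 124, 7, 125, 127, 126, 124, the sketch selects 124 for
the Omni pair and 126 for the Mono/Poly pair, so at most one log per
pair is produced.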

A.7.2 Log Coding Rules

Figure A.7.2 shows the controller log structure of Chapter C.

     0                   1
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |S|   NUMBER    |A|  VALUE/ALT  |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Figure A.7.2 -- Chapter C controller log

The 7-bit NUMBER field identifies the controller number. Controller
logs for controller numbers 120-127 MUST appear at the start of the
Chapter C controller list, in ascending NUMBER field order. Logs for
controller numbers 0-119 MUST follow the 120-127 logs, also in ascending
NUMBER field order.

The 7-bit VALUE/ALT field codes recovery information for the most recent
C-active (controller numbers 0-119) or active (controller numbers
120-127) Control Change command in the session history.

Chapter C provides three tools for coding recovery information for a
command in the VALUE/ALT field: the value tool, the toggle tool, and the
count tool. Implementations may choose among the tools to code a
Control Change command.

In the value tool, the 7-bit VALUE field codes the control value of the
most recent C-active (controller numbers 0-119) or active (controller
numbers 120-127) Control Change command in the session history. This
tool works best for controllers that code a continuous quantity, such as
number 1 (Modulation Wheel). If the value tool is chosen, the A bit is
set to 0.

The A bit is set to 1 to code the toggle or count tool. These tools
work best for controllers that code discrete actions. Figure A.7.3
shows the controller log for these tools.

     0                   1
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |S|   NUMBER    |1|T|    ALT    |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Figure A.7.3 -- Controller log for ALT tools

The T flag is set to 1 to code the toggle tool; T is set to 0 to code
the count tool.
Both methods use the 6-bit ALT field as an unsigned integer.

The toggle tool works best for controllers that act as on/off switches,
such as 64 (Hold Pedal). These controllers code the "off" state with
control values 0-63 and the "on" state with 64-127. The ALT field codes
the total number of toggles (off->on and on->off) due to Control Change
commands in the session history, including toggle events caused by MIDI
Control Change number 121 (Reset All Controllers).

Toggle counting is performed modulo 64. The toggle count is reset at
the start of a session, and whenever a System Reset (0xFF), General MIDI
System Enable (0xF0 0x7E 0xcc 0x09 0x01 0xF7), or General MIDI System
Disable (0xF0 0x7E 0xcc 0x09 0x00 0xF7) appears in the session history.
When these reset events occur, the toggle count for a controller is set
to 0 (for controllers whose default value is 0-63) or 1 (for controllers
whose default value is 64-127).

The Hold Pedal controller illustrates the benefit of the toggle tool
over the value tool for switch controllers. As often used in piano
applications, the "on" state of the Hold Pedal lets notes resonate,
while the "off" state immediately damps notes to silence. The loss of
the "off" command in an "on->off->on" sequence results in ringing notes
that should have been damped silent. The toggle tool lets receivers
detect this lost "off" command, but the value tool does not.

The count tool is similar to the toggle tool, but is optimized for
controllers whose value octet is ignored, such as 123 (All Notes Off).
For the count tool, the ALT field codes the total number of Control
Change commands in the session history. Command counting is performed
modulo 64.
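
The toggle and count bookkeeping can be sketched in Python as shown
below. This is an illustration under stated assumptions, not a
reference implementation: the event encoding (integers for Control
Change data values, the strings "reset_all" and "system_reset" for
Reset All Controllers and for System Reset or General MIDI System
Enable/Disable) is hypothetical, and the sketch assumes that Reset All
Controllers returns a switch controller to its default value. The
count sketch assumes the command count starts at 0 and returns to 0 at
each system reset.

```python
def toggle_alt(events, default_value=0):
    """Return the 6-bit ALT toggle count after processing `events`."""
    default_on = default_value >= 64

    # Session start: count is 0 for default-off, 1 for default-on.
    on, count = default_on, (1 if default_on else 0)
    for ev in events:
        if ev == "system_reset":
            on, count = default_on, (1 if default_on else 0)
            continue
        # Assumption: Reset All Controllers restores the default state.
        new_on = default_on if ev == "reset_all" else ev >= 64
        if new_on != on:
            count = (count + 1) % 64  # toggle counting is modulo 64
            on = new_on
    return count


def count_alt(events):
    """Return the 6-bit ALT command count after processing `events`."""
    count = 0
    for ev in events:
        if ev == "system_reset":
            count = 0
        else:
            count = (count + 1) % 64  # command counting is modulo 64
    return count
```

In the Hold Pedal example, a sender that saw on->off->on computes
toggle_alt([127, 0, 127]) == 3, while a receiver that missed the "off"
command computes toggle_alt([127, 127]) == 1. The mismatch reveals the
loss, even though both sides agree that the final value is "on".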

The command count is set to 0 at the start of the session, and is reset
to 0 whenever a System Reset (0xFF), General MIDI System Enable (0xF0
0x7E 0xcc 0x09 0x01 0xF7), or General MIDI System Disable (0xF0 0x7E
0xcc 0x09 0x00 0xF7) appears in the session history.

In most situations, a controller number SHOULD be coded by a single tool
(and thus, a single controller log). However, a few controller numbers
require several tool types (and thus, several controller logs) to code
correctly.

For example, controller number 121 (Reset All Controllers) may require
two tools to code correctly. Active commands for controller number 121
in the checkpoint history MUST be coded with the count tool. If the
most recent command has a non-zero data octet, a second log MUST also
appear in the controller list, and this log MUST use the value tool.
This rule supports renderers (such as [9]) that use the data octet to
code the reset semantics.

Multiple logs for the same controller number that use the same tool type
MUST NOT appear in the controller list.

A.7.3 Portamento Control

Controller number 84 (Portamento Control) codes that a pitch glide
effect should be used for the next NoteOn command on the note number
coded in the Control Change data octet. These semantics are a poor
match for Chapter C, as a long-lived log entry may introduce a spurious
portamento effect after each packet loss event, and thus create an
indefinite artifact.

Note that by its nature, a lost Portamento Control command cannot cause
an indefinite artifact, as it only affects a single note instance.
Rather, it is the attempt at recovery that causes the artifact.
However, banning controller number 84 from Chapter C is not an option,
as the Portamento Control semantics are a recent MIDI addition, and thus
number 84 is often used as a generic controller.

This situation motivates the following rule. If a C-active Control
Change command for controller number 84 appears in the checkpoint
history, the controller list MUST contain at least 2 entries for the
number. One entry MUST use the value tool, and one entry MUST use the
count tool. The presence of both value and count tools lets a receiver
detect long-lived log entries, and avoid the "indefinite portamento
effect" problem.

A.7.4 The Parameter System

Appendix A.9 defines Chapter M, the MIDI Parameter chapter, to provide
resiliency for the MIDI registered/non-registered parameter system.
Here, we define the Chapter C rules for coding Control Change commands
related to the parameter system. These rules serve to minimize
redundancy with Chapter M.

Control Change commands for controller numbers 6 and 38 (Data Slider)
and 96 and 97 (Data Button) may be used as part of the parameter system,
or may be used as general-purpose controllers. Control Change commands
for controller numbers 6, 38, 96, or 97 in the session history that are
used in the parameter system MUST NOT appear as entries in the
controller list.

However, if C-active Control Change commands for controller numbers 6,
38, 96, or 97 appear in the checkpoint history, and these commands are
used as general-purpose controllers, the most recent general-purpose
command instances for these controller numbers MUST appear as entries in
the controller list.

Likewise, if C-active Control Change commands for controller numbers 6,
38, 96, or 97 appear in the session history, and these commands are used
as general-purpose controllers, the most recent general-purpose command
instances for these controller numbers MAY appear as entries in the
controller list.
In Chapter C, the controller number pair (6, 38) adheres to the 14-bit
log inclusion rules defined in Appendix A.7.1, and controller numbers
96 and 97 are coded as 7-bit controllers, independent of each other and
of the controller pair (6, 38).

A parameter system transaction begins with paired Control Change
commands for controller numbers 98 and 99 (Non-Registered Parameter LSB
and MSB) or 100 and 101 (Registered Parameter LSB and MSB). Chapter M
codes these paired Control Change commands. The Chapter C rule below
acts to code "unpaired" commands for these controller numbers, which
appear in the checkpoint history if a (98, 99) or (100, 101) pair is
split across the MIDI command sections of two RTP packets.

If the most recent C-active Control Change command for controller 98,
99, 100, or 101 in the session history is part of a (98, 99) or (100,
101) command pair that begins a parameter system transaction, the
command MUST NOT appear in the controller list.

However, if the most recent C-active Control Change command for
controller 98, 99, 100, or 101 in the checkpoint history does not form
part of a (98, 99) or (100, 101) command pair, an entry MUST appear in
the controller list. Likewise, if the most recent C-active Control
Change command for controller 98, 99, 100, or 101 in the session history
does not form part of a (98, 99) or (100, 101) command pair, an entry
MAY appear in the controller list.

A.8 Chapter E: MIDI Reset All Controllers

As defined in [1], the Control Change (0xB) command for controller
number 121 (Reset All Controllers) resets controller numbers 0-119 to
the "ideal initial state" for the device. As a consequence, the
definition of Chapter C (Appendix A.7) limits the inclusion of Control
Change commands for controller numbers 0-119 to C-active commands
(Appendix A.1).
This rule ensures that receivers do not incorrectly set controllers to
"stale" values during recovery from a loss event.

However, rendering standards may define certain controller numbers in
the 0-119 range to be unaffected by Reset All Controllers commands. For
example, DLS 2 [9] declares controller numbers 7, 10, and 11 to be
unaffected by a Reset All Controllers command whose data octet is null.
Thus, Chapter C would not protect controller numbers 7, 10, and 11 if a
packet loss event occurred at an inopportune moment in a stream.

Chapter E is designed to close this coverage gap in the recovery
journal. Chapter E uses the same bitfield format as Chapter C: a
1-octet header, followed by a variable-length list of 2-octet controller
logs, as shown in Figure A.7.1.

A channel journal MUST contain Chapter E if (1) an active Control Change
command for controller number 121 (Reset All Controllers) appears in the
checkpoint history, and if (2) the rules we define below yield a
non-empty list of controller logs. Otherwise, a channel journal MUST
NOT contain Chapter E.

The controller log inclusion rules for Chapter E are identical to the
inclusion rules for Chapter C, except that:

  o All uses of the term "C-active" in Appendix A.7 are replaced
    with the term "active".

  o Control Change commands that occur after the most recent
    Reset All Controllers command in the session history MUST
    NOT be coded in Chapter E.

  o Controller numbers 120-127 MUST NOT be coded in Chapter E.

  o If a controller number appears in Chapter C of a channel
    journal, the number MUST NOT appear in Chapter E of the
    channel journal. In addition, if an MSB controller number
    (0-31) appears in Chapter C, the associated LSB controller
    number (32-63) MUST NOT appear in Chapter E.

  o If the channel journal contains Chapter P, and the BANK-COARSE
    field of Chapter P codes a Control Change command that occurred
    after the most recent Reset All Controllers command in the
    session history, Chapter E MUST NOT code controllers 0 and 32.

  o Chapter E MUST NOT code Control Change commands for controller
    numbers 0 or 32 if the BANK-COARSE or BANK-FINE fields of
    Chapter P code the same command instances.

As defined by these rules, Chapter E codes a snapshot of the active
Control Change commands for controllers 0-119 that appear in the
checkpoint history before the most recent Reset All Controllers command.
As the Control Change commands that follow the Reset All Controllers
command make part of the snapshot irrelevant, formerly REQUIRED
controller logs in Chapter E are removed from the controller list.

The normative text in this Appendix reflects the default recovery
journal behavior. In most situations, session participants know which
controller numbers (if any) require Chapter E support. Parties SHOULD
use this knowledge to minimize the size of the Chapter E bitfields, by
using the session configuration tools defined in Appendix C.1.3.

A.9 Chapter M: MIDI Parameter System

Readers may wish to review the Appendix A.1 definitions for "parameter
system", "parameter system transaction", and "initiated parameter system
transaction" before reading this Appendix.

Chapter M protects RPN and NRPN parameters. Figure A.9.1 shows the
variable-length format of Chapter M.

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |S|A|P|N|R|R|      LENGTH       |  Parameter log list ...       |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Figure A.9.1 -- Top-level Chapter M format

Chapter M consists of a 2-octet header, followed by a list of
variable-length parameter logs. The list MUST contain at least one
parameter log. The 10-bit LENGTH field codes the size of Chapter M, and
conforms to semantics described in Appendix A.1.

A channel journal MUST contain Chapter M if the rules defined in this
Appendix require that one or more parameter logs appear in the list.

A.9.1 Log Inclusion Rules

Parameter logs code recovery information for a specific RPN or NRPN
parameter. Multiple logs for the same RPN or NRPN parameter MUST NOT
appear in the list.

By default, a parameter log MUST appear in the list if a C-active
command that forms part of an initiated transaction for the parameter
appears in the checkpoint history. If Chapter M uses these default
semantics, the A header bit MUST be set to 0.

During session configuration, Chapter M may be customized to require
that a parameter log MUST appear in the list if an active (as opposed to
C-active) command that forms part of an initiated transaction for the
parameter appears in the checkpoint history. If Chapter M uses these
modified semantics, the A header bit MUST be set to 1.

In both configurations, a log MAY appear in the list if an active
command associated with an initiated transaction for the parameter
appears in the session history.

Parameter logs MUST be ordered with respect to the relative recency of
transactions for the parameter. The first log in the list codes the
parameter most recently involved in a transaction, the second log codes
a parameter whose most recent transaction occurred before the most
recent transaction of the first list parameter, etc.
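
The recency ordering can be illustrated with a short Python sketch.
The representation used here (a list of parameter identifiers, one per
transaction, oldest first, with each identifier a hypothetical
("RPN" or "NRPN", number) tuple) is an assumption made for the example.

```python
def order_parameter_logs(transactions):
    """Return parameters ordered from most to least recently used.

    `transactions` lists the parameter touched by each transaction,
    oldest first. Each parameter appears once in the result, which
    mirrors the rule that multiple logs for one parameter MUST NOT
    appear in the list.
    """
    last_seen = {}
    for position, parameter in enumerate(transactions):
        last_seen[parameter] = position  # later transactions overwrite
    return sorted(last_seen, key=last_seen.get, reverse=True)
```

For transactions on RPN 0, NRPN 5, RPN 0, and RPN 1 (in that order),
the sketch yields RPN 1 first, then RPN 0, then NRPN 5.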

The N and P header bits signal the presence of ongoing RPN and NRPN
transactions in the session history. If the session history does not
include commands to terminate the most recent initiated transaction for
the first RPN parameter log in the list, P MUST be set to 1. Otherwise,
P MUST be set to 0. Likewise, if the session history does not include
commands to terminate the most recent initiated transaction for the
first NRPN parameter log in the list, N MUST be set to 1. Otherwise, N
MUST be set to 0.

Transactions for the RPN and NRPN null parameter (0x3FFF) MUST NOT
appear in the list. Null parameter transactions are coded implicitly,
by the N and P header bits and by the ordering of the parameter log
list.

A.9.2 Log Coding Rules

Figure A.9.2 shows the parameter log structure of Chapter M.

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |S|Q|J|K|L|X|Y|Z|C|T|  PNUM-MSB   |  PNUM-LSB   |S|   COARSE    |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |S|    FINE     |S|G|          BUTTON           |S|  A-COARSE   |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |S|   A-FINE    |S|G|         A-BUTTON          |S|    COUNT    |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |S|   TCOUNT    |
    +-+-+-+-+-+-+-+-+

Figure A.9.2 -- Chapter M parameter log

The log begins with a 3-octet header (10 flag bits, followed by the
PNUM-MSB and PNUM-LSB fields). If the Q header bit is set to 0, the log
encodes an RPN parameter. If Q = 1, the log encodes an NRPN parameter.
The 7-bit PNUM-LSB and PNUM-MSB fields code the parameter number, and
reflect the Control Change command data values for controller numbers
98-99 (for NRPNs) or 100-101 (for RPNs).
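
As a rough sketch of the 3-octet log header layout in Figure A.9.2, the
Python function below splits the header into its flag bits and
parameter number. The function name and return shape are hypothetical.
The sketch assumes the usual network bit order, with bit 0 of the
figure as the most-significant bit of the first octet.

```python
FLAG_NAMES = "SQJKLXYZCT"  # the 10 flag bits, in figure order


def parse_log_header(octets):
    """Split a 3-octet parameter log header into its parts."""
    word = int.from_bytes(octets[:3], "big")
    # Figure bit i sits at shift (23 - i) of the 24-bit word.
    flags = {name: (word >> (23 - i)) & 1
             for i, name in enumerate(FLAG_NAMES)}
    pnum_msb = (word >> 7) & 0x7F  # figure bits 10-16
    pnum_lsb = word & 0x7F         # figure bits 17-23
    kind = "NRPN" if flags["Q"] else "RPN"
    return flags, kind, (pnum_msb << 7) | pnum_lsb
```

With the octets 0x60 0x00 0x82, the Q and J bits are set, PNUM-MSB is 1,
and PNUM-LSB is 2, so the sketch reports NRPN number 130.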

The J, K, L, X, Y, Z, C and T header bits form a Table of Contents (TOC)
for the log, and signal the presence of fixed-sized fields that
optionally follow the header. Figure A.9.2 shows a fully-populated log
(coded by setting all TOC bits to 1). The ordering of fields in the log
follows the ordering of the header bits in the TOC. A set header bit
codes the presence of a field in the log.

Each field acts as a coding tool to protect the parameter. If the rules
in Appendix A.9.1 state that a log for a given parameter MUST appear in
Chapter M, the log MUST include the subset of fields necessary to
protect the parameter against loss events, given the semantics of the
parameter. A safe (but inefficient) option is to use all possible
fields for each coded parameter log.

A.9.2.1 COARSE and FINE Fields

The J bit codes the presence of the 7-bit COARSE field and its
associated S bit. The COARSE field codes the data value of the most
recent C-active Control Change command for controller number 6 (Data
Entry MSB) in the session history that appears in a transaction for the
log parameter.

The K bit codes the presence of the 7-bit FINE field and its associated
S bit. The FINE field codes the data value of the most recent C-active
Control Change command for controller number 38 (Data Entry LSB) in the
session history that appears in a transaction for the log parameter.
However, the FINE field MUST NOT appear in the log if the Control Change
command associated with the FINE field precedes the Control Change
command associated with the COARSE field in the session history.

Note that the FINE field MAY appear in the log even if the COARSE field
does not appear in the log. In this situation, FINE does not have
C-active semantics, but may have active semantics if the A-COARSE field
is present in the log (Appendix A.9.2.3).
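
The J and K bits are two entries in the log's Table of Contents. As a
sketch of how a receiver might walk the TOC, the Python function below
computes the octet offset of each present field and the total log size.
The field sizes are read from Figure A.9.2. The mapping of each TOC bit
to a field is an assumption that follows the stated rule that fields
appear in TOC bit order (in particular, L is taken to code BUTTON). The
table and function names are hypothetical.

```python
# (TOC bit, field name, size in octets), in TOC order. The sizes come
# from Figure A.9.2; the bit-to-field mapping is assumed from the
# rule that fields follow the ordering of the TOC bits.
TOC_FIELDS = [
    ("J", "COARSE", 1), ("K", "FINE", 1), ("L", "BUTTON", 2),
    ("X", "A-COARSE", 1), ("Y", "A-FINE", 1), ("Z", "A-BUTTON", 2),
    ("C", "COUNT", 1), ("T", "TCOUNT", 1),
]
HEADER_OCTETS = 3


def field_offsets(set_bits):
    """Return ({field: offset}, total_octets) for the set TOC bits."""
    offsets = {}
    offset = HEADER_OCTETS
    for bit, field, size in TOC_FIELDS:
        if bit in set_bits:
            offsets[field] = offset
            offset += size
    return offsets, offset
```

A fully-populated log (all TOC bits set) is 13 octets long under these
assumptions, which matches the layout drawn in Figure A.9.2. A log with
no TOC bits set consists of the 3-octet header alone.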
1896 A.9.2.2 The BUTTON Field 1898 The L bit codes the presence of the 14-bit BUTTON field and its 1899 associated S and G bits. The fields code the use of Control Change 1900 commands for controller numbers 96 and 97 (Data Button Increment and 1901 Data Button Decrement) in transactions for the log parameter. BUTTON is 1902 interpreted as an unsigned integer, and the G bit codes the sign of the 1903 integer (G = 1 for positive, G = 0 for negative). 1905 If the parameter log does not use the COARSE field, the BUTTON and G 1906 fields code a signed count of the number of C-active Data Button 1907 Increment and Decrement Control Change commands in the session history 1908 that appear in a transaction for the log parameter. 1910 If the log uses the COARSE field but not the FINE field, the BUTTON and 1911 G fields code a signed count of the number of C-active Data Button 1912 Increment and Decrement Control Change commands in the session history 1913 that are more recent than the Control Change command associated with the 1914 COARSE field, and that appear in a transaction for the log parameter. 1916 If the log uses the FINE field, the BUTTON and G fields code a signed 1917 count of the number of C-active Data Button Increment and Decrement 1918 Control Change commands in the session history that are more recent than 1919 the Control Change command associated with the FINE field, and that 1920 appear in a transaction for the log parameter. If necessary, the value 1921 of the COARSE or the A-COARSE field MUST be adjusted to reflect the 1922 presence of C-active Data Button Increment and Decrement Control Change 1923 commands between the Control Change command associated with the COARSE 1924 or A-COARSE field and the Control Change command associated with the 1925 FINE field.
1927 To compute and code the count value, initialize the count value to 0, 1928 add 1 for each qualifying Data Button Increment command, subtract 1 for 1929 each qualifying Data Button Decrement command, and limit the magnitude 1930 of the final count to 16383. The G bit codes the sign of the count, and 1931 the BUTTON field codes the magnitude of the count. 1933 A.9.2.3 The A-COARSE, A-FINE, and A-BUTTON Fields 1935 The X header bit codes the presence of the 7-bit A-COARSE field and 1936 its associated S bit. The Y bit codes the presence of the 7-bit A-FINE 1937 field and its associated S bit. The Z bit codes the presence of the 1938 14-bit A-BUTTON field and its associated G and S bits. 1940 The rules we define below let the A-COARSE, A-FINE, and A-BUTTON fields 1941 code a snapshot of the parameter value at the moment before the 1942 appearance of the most recent active Control Change command for 1943 controller number 121 (Reset All Controllers) in the session history. 1944 This snapshot, in combination with the COARSE, FINE, and BUTTON fields, 1945 lets receivers who ignore Reset All Controllers Control Change commands 1946 for an RPN or NRPN parameter recover from packet loss events. 1948 The A-COARSE, A-FINE, and A-BUTTON fields MUST NOT appear in the log if 1949 an active Control Change command for controller number 121 (Reset All 1950 Controllers) does not appear in the session history. The A-COARSE, A- 1951 FINE, and A-BUTTON fields MUST NOT appear in the log if Appendix A.9.2.1 1952 permits the COARSE field to appear in the log. The A-FINE and A-BUTTON 1953 fields MUST NOT appear in the log if Appendix A.9.2.1 permits the FINE 1954 field to appear in the log.
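The count computation given at the end of Appendix A.9.2.2 applies in the same way to the A-BUTTON field introduced above. The following Python sketch shows the arithmetic and one possible packing into the 2-octet field (S bit, G bit, 14-bit magnitude). It rests on several assumptions: the input is a list of controller numbers for the commands that already qualify under the rules of the relevant Appendix, a zero count is coded with G = 1 (the text does not say which sign zero takes), and the S bit is supplied by the caller. The names button_count and encode_button are hypothetical.

```python
MAX_MAGNITUDE = 16383  # largest magnitude a 14-bit field can code

DATA_BUTTON_INCREMENT = 96
DATA_BUTTON_DECREMENT = 97


def button_count(controllers) -> int:
    """Signed count over qualifying Data Button commands.

    `controllers` holds the controller number (96 or 97) of each qualifying
    command; selecting the qualifying commands is outside this sketch.
    """
    count = 0
    for controller in controllers:
        if controller == DATA_BUTTON_INCREMENT:
            count += 1
        elif controller == DATA_BUTTON_DECREMENT:
            count -= 1
    # The magnitude limit is applied once, to the final count.
    return max(-MAX_MAGNITUDE, min(MAX_MAGNITUDE, count))


def encode_button(count: int, s_bit: int = 0) -> bytes:
    """Pack S, G, and the 14-bit magnitude into two octets.

    Assumption: a zero count is coded as positive (G = 1).
    """
    magnitude = min(abs(count), MAX_MAGNITUDE)
    g_bit = 1 if count >= 0 else 0
    word = ((s_bit & 1) << 15) | (g_bit << 14) | magnitude
    return word.to_bytes(2, "big")
```

Three increments and one decrement give a count of 2, coded as the octets 0x40 0x02 when S = 0.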
1956 The A-COARSE field codes the data value of the most recent active 1957 Control Change command for controller number 6 (Data Entry MSB) in the 1958 session history that appears in a transaction for the log parameter, and 1959 that precedes the most recent Reset All Controllers Control Change 1960 command in the session history. If a Control Change command for 1961 controller number 6 that meets these criteria does not exist, the A- 1962 COARSE field MUST NOT appear in the log. 1964 The A-FINE field codes the data value of the most recent active Control 1965 Change command for controller number 38 (Data Entry LSB) in the session 1966 history that appears in a transaction for the log parameter, and that 1967 precedes the most recent Reset All Controllers Control Change command in 1968 the session history. If a Control Change command for controller number 1969 38 that meets these criteria does not exist, the A-FINE field MUST NOT 1970 appear in the log. The A-FINE field MUST NOT appear in the log if the 1971 A-COARSE field does not appear in the log, or if the command associated 1972 with the A-FINE field precedes the command associated with the A-COARSE 1973 field in the session history. 1975 If the log does not use the A-COARSE field, the A-BUTTON and its 1976 associated G bit code a signed count of the number of active Data Button 1977 Increment and Decrement Control Change commands in the session history 1978 that appear in a transaction for the log parameter, and that precede the 1979 most recent Reset All Controllers Control Change command in the session 1980 history.
1982 If the log uses the A-COARSE field but not the A-FINE field, the A- 1983 BUTTON and its associated G bit code a signed count of the number of 1984 active Data Button Increment and Decrement Control Change commands in 1985 the session history that are more recent than the Control Change command 1986 associated with the A-COARSE field, that appear in a transaction for the 1987 log parameter, and that precede the most recent Reset All Controllers 1988 Control Change command in the session history. 1990 If the log uses the A-COARSE and A-FINE fields, the A-BUTTON and its 1991 associated G bit code a signed count of the number of active Data Button 1992 Increment and Decrement Control Change commands in the session history 1993 that are more recent than the Control Change command associated with the 1994 A-FINE field, that appear in a transaction for the log parameter, and that 1995 precede the most recent Reset All Controllers Control Change command in 1996 the session history. If necessary, the value of the A-COARSE field MUST 1997 be adjusted to reflect the presence of Data Button Increment and 1998 Decrement Control Change commands between the Control Change command 1999 associated with the A-COARSE field and the Control Change command 2000 associated with the A-FINE field. 2002 To compute and code the count value, initialize the count value to 0, 2003 add 1 for each qualifying Data Button Increment command, subtract 1 for 2004 each qualifying Data Button Decrement command, and limit the magnitude 2005 of the final count to 16383. The G bit codes the sign of the count, and 2006 the A-BUTTON field codes the magnitude of the count. 2008 A.9.2.4 The COUNT and TCOUNT Fields 2010 The C bit codes the presence of the 7-bit COUNT field and its associated 2011 S bit. The COUNT field codes the number of active Control Change 2012 commands for controller numbers 6, 38, 96, and 97 in the session history 2013 that appear in a transaction for the log parameter.
2015 The T bit codes the presence of the 7-bit TCOUNT field, and its 2016 associated S bit. The TCOUNT field codes the number of initiated 2017 transactions for the parameter in the session history that contain at 2018 least one active Control Change command, including commands for 2019 controller numbers 98-101 that initiate a transaction, but excluding 2020 commands for controller numbers 98-101 that terminate the transaction. 2022 COUNT and TCOUNT counting is performed modulo 128. COUNT and TCOUNT are 2023 set to 0 at the start of a session, and are reset to 0 whenever a System 2024 Reset (0xFF), General MIDI System Enable (0xF0 0x7E 0xcc 0x09 0x01 2025 0xF7), or General MIDI System Disable (0xF0 0x7E 0xcc 0x09 0x00 0xF7) 2026 appears in the session history. 2028 B. The Recovery Journal System Chapters 2030 B.1 System Chapter D: Simple System Commands 2032 The system journal MUST contain Chapter D if an active MIDI Reset 2033 (0xFF), MIDI Tune Request (0xF6), MIDI Song Select (0xF3), undefined 2034 MIDI System Common (0xF4 and 0xF5), or undefined MIDI System Real-time 2035 (0xF9 and 0xFD) command appears in the checkpoint history. 2037 Figure B.1.1 shows the variable-length format for Chapter D. 2039 0 1 2 3 2040 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2041 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2042 |S|B|G|H|J|K|Y|Z| Command logs ... | 2043 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2045 Figure B.1.1 -- System Chapter D format 2047 The chapter consists of a 1-octet header, followed by one or more 2048 command logs. Header flag bits indicate the presence of command logs 2049 for the Reset (B = 1), Tune Request (G = 1), Song Select (H = 1), 2050 undefined System Common 0xF4 (J = 1), undefined System Common 0xF5 (K = 2051 1), undefined System Real-time 0xF9 (Y = 1), or undefined System Real- 2052 time 0xFD (Z = 1) commands. 
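The Chapter D header flags described above can be decoded with a few bit masks. The Python sketch below lists which command logs a given header octet announces, in header bit order. It assumes the bit positions drawn in Figure B.1.1, with S as the most-significant bit; the table CHAPTER_D_FLAGS and the function chapter_d_logs are hypothetical names.

```python
# (bit mask, flag letter, command the flag announces), per Figure B.1.1.
CHAPTER_D_FLAGS = (
    (0x40, "B", "Reset (0xFF)"),
    (0x20, "G", "Tune Request (0xF6)"),
    (0x10, "H", "Song Select (0xF3)"),
    (0x08, "J", "undefined System Common 0xF4"),
    (0x04, "K", "undefined System Common 0xF5"),
    (0x02, "Y", "undefined System Real-time 0xF9"),
    (0x01, "Z", "undefined System Real-time 0xFD"),
)


def chapter_d_logs(header: int) -> list:
    """Return the flag letters set in a Chapter D header octet, in bit order.

    The S bit (0x80) is not a command log flag and is ignored here.
    """
    return [flag for mask, flag, _ in CHAPTER_D_FLAGS if header & mask]
```

A header octet of 0x60, for instance, announces a Reset log followed by a Tune Request log.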
2054 Command logs appear in a list following the header, in the order that 2055 the flag bits appear in the header. 2057 Figure B.1.2 shows the 1-octet command log format for the Reset and Tune 2058 Request commands. 2060 0 2061 0 1 2 3 4 5 6 7 2062 +-+-+-+-+-+-+-+-+ 2063 |S| COUNT | 2064 +-+-+-+-+-+-+-+-+ 2066 Figure B.1.2 -- Command log for Reset and Tune Request 2068 Chapter D MUST contain the Reset command log if an active Reset command 2069 appears in the checkpoint history. The 7-bit COUNT field codes the 2070 total number of Reset commands (modulo 128) present in the session 2071 history. 2073 Chapter D MUST contain the Tune Request command log if an active Tune 2074 Request command appears in the checkpoint history. The 7-bit COUNT 2075 field codes the total number of Tune Request commands (modulo 128) 2076 present in the session history. 2078 Figure B.1.3 shows the 1-octet command log format for the Song Select 2079 command. 2081 0 2082 0 1 2 3 4 5 6 7 2083 +-+-+-+-+-+-+-+-+ 2084 |S| VALUE | 2085 +-+-+-+-+-+-+-+-+ 2087 Figure B.1.3 -- Song Select command log format 2089 Chapter D MUST contain the Song Select command log if an active Song 2090 Select command appears in the checkpoint history. The 7-bit VALUE field 2091 codes the song number of the most recent active Song Select command in 2092 the session history. 2094 B.1.1 Undefined System Commands 2095 In this section, we define the Chapter D command logs for the undefined 2096 System opcodes. [1] reserves the undefined System opcodes 0xF4, 0xF5, 2097 0xF9, and 0xFD for future use. At the time of this writing, any MIDI 2098 command stream that uses these opcodes is non-compliant with [1]. 2099 However, future versions of [1] may define these opcodes, and a few 2100 products do use these opcodes in a non-compliant manner. 2102 Figure B.1.4 shows the variable length command log format for the 2103 undefined System Common commands (0xF4 and 0xF5). 
2105 0 1 2 3 2106 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2107 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2108 |S|DSZ|V|C|L|R|R| LENGTH | VALUE ... | 2109 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2110 | COUNT | LEGAL ... | 2111 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2113 Figure B.1.4 -- Undefined System Common command log format 2115 The command log codes a single opcode type (0xF4 or 0xF5, not both). 2116 Chapter D MUST contain a command log if an active 0xF4 command appears 2117 in the checkpoint history, and MUST contain an independent command log 2118 if an active 0xF5 command appears in the checkpoint history. 2120 The command log consists of a two-octet header followed by a variable number 2121 of data fields. Header flag bits indicate the presence of the VALUE 2122 field (V = 1), the COUNT field (C = 1), and the LEGAL field (L = 1). 2123 The 8-bit LENGTH field codes the size of the command log, and conforms 2124 to semantics described in Appendix A.1. 2126 The 2-bit DSZ field codes the number of data octets in the command 2127 instance that appears most recently in the session history. If DSZ = 2128 0-2, the command has 0-2 data octets. If DSZ = 3, the command has 3 or 2129 more command data octets. 2131 We now define the default rules for the use of the VALUE, COUNT, and 2132 LEGAL fields. The session configuration tools defined in Appendix C.1.3 2133 may be used to override this behavior. 2135 If the DSZ field is set to 0, the command log MUST include the COUNT 2136 field. The 8-bit COUNT field codes the total number of opcode commands 2137 present in the session history, modulo 256. 2139 If the DSZ field is set to 1-3, the command log MUST include the VALUE 2140 field. The variable-length VALUE field codes a verbatim copy of the data 2141 octets for the most recent use of the opcode in the session history.
2142 The most-significant bit of the final data octet MUST be set to 1, and 2143 the most-significant bit of all other data octets MUST be set to 0. 2145 The LEGAL field is reserved for future use. If an update to [1] defines 2146 the 0xF4 or 0xF5 opcode, an IETF standards-track document MAY define the 2147 LEGAL field to protect the opcode. Until such a document appears, 2148 senders MUST NOT use the LEGAL field, and receivers MUST use the LENGTH 2149 field to skip over the LEGAL field. 2151 Figure B.1.5 shows the variable length command log format for the 2152 undefined System Real-time commands (0xF9 and 0xFD). 2154 0 1 2 3 2155 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2156 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2157 |S|C|L| LENGTH | COUNT | LEGAL ... | 2158 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2160 Figure B.1.5 -- Undefined System Real-time command log format 2162 The command log codes a single opcode type (0xF9 or 0xFD, not both). 2163 Chapter D MUST contain a command log if an active 0xF9 command appears 2164 in the checkpoint history, and MUST contain an independent command log 2165 if an active 0xFD command appears in the checkpoint history. 2167 The command log consists of a one-octet header followed by a variable number 2168 of data fields. Header flag bits indicate the presence of the COUNT 2169 field (C = 1) and the LEGAL field (L = 1). The 5-bit LENGTH field codes 2170 the size of the command log, and conforms to semantics described in 2171 Appendix A.1. 2173 We now define the default rules for the use of the COUNT and LEGAL 2174 fields. The session configuration tools defined in Appendix C.1.3 may 2175 be used to override this behavior. 2177 The 8-bit COUNT field codes the total number of opcode commands present 2178 in the session history, modulo 256. By default, the COUNT field MUST be 2179 present in the command log. 2181 The LEGAL field is reserved for future use.
If an update to [1] defines 2182 the 0xF9 or 0xFD opcode, an IETF standards-track document MAY define the 2183 LEGAL field to protect the opcode. Until such a document appears, 2184 senders MUST NOT use the LEGAL field, and receivers MUST use the LENGTH 2185 field to skip over the LEGAL field. 2187 Finally, we note that some non-standard uses of the undefined System 2188 Real-time opcodes act to implement non-compliant variants of the MIDI 2189 sequencer system. In Appendix B.3.1, we describe resiliency tools for 2190 the MIDI sequencer system that provide some protection in this case. 2192 B.2 System Chapter V: Active Sense Command 2194 The system journal MUST contain Chapter V if an active MIDI Active Sense 2195 (0xFE) command appears in the checkpoint history. Figure B.2.1 shows 2196 the format for Chapter V. 2198 0 2199 0 1 2 3 4 5 6 7 2200 +-+-+-+-+-+-+-+-+ 2201 |S| COUNT | 2202 +-+-+-+-+-+-+-+-+ 2204 Figure B.2.1 -- System Chapter V format 2206 The 7-bit COUNT field codes the total number of Active Sense commands 2207 (modulo 128) present in the session history. 2209 B.3 System Chapter Q: Sequencer State Commands 2211 This Appendix describes Chapter Q, the system chapter for the MIDI 2212 sequencer commands. 2214 The system journal MUST contain Chapter Q if an active MIDI Song 2215 Position Pointer (0xF2), MIDI Clock (0xF8), MIDI Start (0xFA), MIDI 2216 Continue (0xFB) or MIDI Stop (0xFC) command appears in the checkpoint 2217 history. Figure B.3.1 shows the variable-length format for Chapter Q. 2219 0 1 2 3 2220 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2221 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2222 |S|N|D|C|T| TOP | CLOCK | TIMETOOLS ... | 2223 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2224 | ... | 2225 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2227 Figure B.3.1 -- System Chapter Q format 2229 Chapter Q encodes the most recent state of the sequencer system. 
2230 Receivers use the chapter to re-synchronize the sequencer after a packet 2231 loss episode. Chapter fields encode the position of the sequencer 2232 pointer, the presence of the downbeat, and the on/off state of the 2233 sequencer. 2235 Chapter Q consists of a 1-octet header followed by several optional 2236 fields, in the order shown in Figure B.3.1. Header flag bits signal the 2237 presence of the 16-bit CLOCK field (C = 1) and the 24-bit TIMETOOLS 2238 field (T = 1). 2240 The N header bit encodes the relative occurrence of the Start, Stop, and 2241 Continue commands in the session history. If an active Start or 2242 Continue command appears most recently, the N bit MUST be set to 1. If 2243 an active Stop appears most recently, or if no active Start, Stop, or 2244 Continue commands appear in the session history, the N bit MUST be set 2245 to 0. 2247 The D header bit encodes the presence of the downbeat. If N is set to 2248 1, and if at least one Clock command follows the most recent Start or 2249 Continue command in the session history, the D bit MUST be set to 1. In 2250 all other cases, the D bit MUST be set to 0. 2252 If N is set to 0 (coding a stopped sequence), or if N is set to 1 and D 2253 is set to 0 (coding a sequence on the verge of beginning), Chapter Q 2254 MUST encode the starting song position of the sequence. The C flag, the 2255 TOP field, and the CLOCK field act to code the starting song position: 2257 o If C = 0, the song position is at the beginning of the song. 2259 o If C = 1, the 3-bit TOP header field and the 16-bit 2260 CLOCK field are combined to form the 19-bit unsigned quantity 2261 65536*TOP + CLOCK. This value encodes the song position 2262 in units of clocks (24 clocks per quarter note). 2264 If the N and D header bits are both set to 1, the sequence is playing, 2265 and Chapter Q MUST encode the current song position in the sequence. 
2267 The current song position is coded using the same fields and methods as 2268 the starting song position (65536*TOP + CLOCK, with C set to 1). 2270 B.3.1 Non-compliant Sequencers 2272 The Chapter Q description in this Appendix assumes that the sequencer 2273 system counts off time with Clock commands, as mandated in [1]. 2274 However, a few non-compliant products do not use Clock commands to count 2275 off time, but instead use non-standard methods. 2277 Chapter Q uses the TIMETOOLS field to provide resiliency support for 2278 these non-standard products. By default, the TIMETOOLS field MUST NOT 2279 appear in Chapter Q, and the T header bit MUST be set to 0. The session 2280 configuration tools described in Appendix C.1.3 may be used to select 2281 TIMETOOLS coding. 2283 Figure B.3.2 shows the format of the 24-bit TIMETOOLS field. 2285 0 1 2 2286 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 2287 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2288 |B| TIME | 2289 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2291 Figure B.3.2 -- TIMETOOLS format 2293 The TIME field is a 23-bit unsigned integer quantity, with units of 2294 milliseconds. TIME codes an additive correction term for the song 2295 position coded by the TOP, CLOCK, and C fields. TIME is coded in network 2296 byte order (big-endian). 2298 A receiver computes the correct song position by converting TIME into 2299 units of MIDI clocks and adding it to 65536*TOP + CLOCK (assuming C = 2300 1). Alternatively, a receiver may convert 65536*TOP + CLOCK into 2301 milliseconds (assuming C = 1) and add it to TIME. 2303 The B bit encodes the presence of the downbeat in the non-standard 2304 command stream. If the N header bit is set to 1, and if at least one 2305 non-standard command that counts off time follows the most recent Start 2306 or Continue command in the session history, the B bit MUST be set to 1. 2307 In all other cases, B MUST be set to 0.
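The Chapter Q fields of Appendix B.3 and the TIMETOOLS field of Appendix B.3.1 can be unpacked as shown in the Python sketch below. It is a sketch under stated assumptions rather than a reference decoder: the header bit positions are read off Figure B.3.1, CLOCK is assumed to use network byte order (the text states this only for TIME), and the function name parse_chapter_q and the returned keys are hypothetical. Converting TIME between milliseconds and MIDI clocks depends on tempo information that is outside this Appendix, so the sketch returns TIME unconverted.

```python
def parse_chapter_q(data: bytes) -> dict:
    """Unpack a Chapter Q header, plus CLOCK and TIMETOOLS when present."""
    header = data[0]
    fields = {
        "S": header >> 7,
        "N": (header >> 6) & 1,   # 1: Start or Continue is most recent
        "D": (header >> 5) & 1,   # 1: downbeat has occurred
        "C": (header >> 4) & 1,   # 1: CLOCK field follows
        "T": (header >> 3) & 1,   # 1: TIMETOOLS field follows
        "TOP": header & 0x07,
    }
    pos = 1
    if fields["C"]:
        # Assumption: CLOCK is big-endian, like TIME.
        clock = int.from_bytes(data[pos:pos + 2], "big")
        pos += 2
        fields["song_position"] = 65536 * fields["TOP"] + clock
    else:
        # C = 0 codes a position at the beginning of the song.
        fields["song_position"] = 0
    if fields["T"]:
        word = int.from_bytes(data[pos:pos + 3], "big")
        pos += 3
        fields["B"] = word >> 23         # downbeat flag for non-standard streams
        fields["TIME"] = word & 0x7FFFFF  # 23-bit correction, in milliseconds
    fields["length"] = pos  # octets consumed by the chapter
    return fields
```

For the octets 0x71 0x00 0x10 the sketch reports N = 1, D = 1, and a song position of 65536*1 + 16 = 65552 clocks.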
2309 B.4 System Chapter F: MIDI Time Code Tape Position 2311 This Appendix describes Chapter F, the system chapter for the MIDI Time 2312 Code (MTC) commands. Readers may wish to review the Appendix A.1 2313 definition of "finished/unfinished commands" before reading this 2314 Appendix. 2316 The system journal MUST contain Chapter F if an active System Common 2317 Quarter Frame command (0xF1) or an active finished System Exclusive 2318 (Universal Real Time) MTC Full Frame command (F0 7F cc 01 01 hr mn sc fr 2319 F7) appears in the checkpoint history. 2321 Figure B.4.1 shows the variable-length format for Chapter F. 2323 0 1 2 3 2324 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2325 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2326 |S|C|Q|P|D|POINT| COMPLETE ... | 2327 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2328 | ... | PARTIAL ... | 2329 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2330 | ... | 2331 +-+-+-+-+-+-+-+-+ 2333 Figure B.4.1 -- System Chapter F format 2335 Chapter F holds information about the most recent MTC tape position 2336 coded in the session history. Receivers use Chapter F to re-synchronize 2337 the MTC system after a packet loss episode. 2339 Chapter F consists of a 1-octet header followed by several optional 2340 fields, in the order shown in Figure B.4.1. Header flag bits signal the 2341 presence of the 32-bit COMPLETE field (C = 1) and the 32-bit PARTIAL 2342 field (P = 1). 2344 Chapter F MUST include the COMPLETE field if an active finished Full 2345 Frame command appears in the checkpoint history, or if an active Quarter 2346 Frame command that completes the encoding of a frame value appears in 2347 the checkpoint history. 2349 The COMPLETE field encodes the most recent active complete MTC frame 2350 value that appears in the session history. 
This frame value may take 2351 the form of a series of 8 active Quarter Frame commands (0xF1 0x0n 2352 through 0xF1 0x7n for forward tape movement, 0xF1 0x7n through 0xF1 0x0n 2353 for reverse tape movement), or may take the form of an active finished 2354 Full Frame command. 2356 If the COMPLETE field encodes a Quarter Frame command series, the Q 2357 header bit MUST be set to 1, and the COMPLETE field MUST have the format 2358 shown in Figure B.4.2. The 4-bit fields MT0 through MT7 code the binary 2359 data nibble for the Quarter Frame commands for Message Type 0 through 2360 Message Type 7 [1]. These nibbles encode a complete frame value, in 2361 addition to fields reserved for future use by [1]. 2363 0 1 2 3 2364 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2365 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2366 | MT0 | MT1 | MT2 | MT3 | MT4 | MT5 | MT6 | MT7 | 2367 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2369 Figure B.4.2 -- COMPLETE field format, Q = 1 2371 In this usage, the frame value encoded in the COMPLETE field MUST be 2372 offset by 2 frames, relative to the frame value encoded in the Quarter 2373 Frame commands, if the tape is moving in the forward direction. This 2374 offset compensates for the two frame latency of the Quarter Frame 2375 encoding. No offset is applied if the tape is moving in reverse. 2377 Alternatively, the most recent active complete MTC frame value may be 2378 encoded by an active finished Full Frame command. In this case, the Q 2379 header bit MUST be set to 0, and the COMPLETE field MUST have the format 2380 shown in Figure B.4.3. The HR, MN, SC, and FR fields correspond to the 2381 hr, mn, sc, and fr data octets of the Full Frame command.
2383 0 1 2 3 2384 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2385 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2386 | HR | MN | SC | FR | 2387 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2389 Figure B.4.3 -- COMPLETE field format, Q = 0 2391 B.4.1 Partial Frames 2393 The most recent active session history command that encodes MTC frame 2394 value data may be a Quarter Frame command other than a forward-moving 2395 0xF1 0x7n command (which completes a frame value for forward tape 2396 movement) or a reverse-moving 0xF1 0x1n command (which completes a frame 2397 value for reverse tape movement). 2399 We define this type of Quarter Frame command as being associated with a 2400 partial frame value encoding. This definition only holds if the partial 2401 frame value is well-formed: the Quarter Frame sequence MUST start at 2402 Message Type 0 and increment contiguously to an intermediate value, or 2403 start at Message Type 7 and decrement contiguously to an intermediate 2404 value. 2406 Chapter F MUST include a PARTIAL field if the most recent active command 2407 in the checkpoint history that encodes MTC frame value data is a Quarter 2408 Frame command that is associated with a partial frame value. 2410 The PARTIAL field MUST have the format shown in Figure B.4.2. The D and 2411 POINT header fields (Figure B.4.1) qualify the contents of the PARTIAL 2412 field, as we now describe. 2414 The D header bit reflects the direction of tape movement coded by the 2415 Quarter Frame command (D = 0 for forward movement, D = 1 for reverse 2416 movement). The 3-bit POINT header field encodes the unsigned integer 2417 value formed by the lower 3 bits of the upper nibble of the data value 2418 of the most recent active Quarter Frame command in the session history. 2420 If D = 0, POINT may take on the values 0-6. If D = 1, POINT may take on 2421 the values 1-7.
If D = 0, MT fields (Figure B.4.2) in the inclusive 2422 range 0 to the POINT value encode the partial frame value, and all other 2423 MT fields MUST be ignored. If D = 1, MT fields in the inclusive range 7 2424 down to the POINT value encode the partial frame value, and all other MT 2425 fields MUST be ignored. 2427 Senders MUST NOT add a 2-frame offset to the partial frame value encoded 2428 in the PARTIAL field. Unlike the COMPLETE field, an offset is not 2429 necessary because the D bit encodes the tape direction. 2431 The header field value pairs (D = 0, POINT = 7) and (D = 1, POINT = 0) 2432 are reserved for future use. Senders MUST NOT use these value pairs and 2433 receivers MUST ignore the PARTIAL field if these value pairs appear in 2434 the chapter header. 2436 B.5 System Chapter X: System Exclusive 2438 This Appendix describes Chapter X, the system chapter for MIDI System 2439 Exclusive (SysEx) commands (opcode 0xF0). Readers may wish to review 2440 the Appendix A.1 definition of "finished/unfinished commands" before 2441 reading this Appendix. 2443 The system journal may code multiple Chapter X chapters. Chapter X 2444 journal chapters are ordered with respect to the recency of the SysEx 2445 command coded by the chapter. The chapter coding the most recent SysEx 2446 command in the session history appears first in the system journal, 2447 followed by a chapter coding an older command, followed by a chapter 2448 coding an even older command, etc. 2450 The system journal MUST contain at least one Chapter X chapter if an 2451 active SysEx command (excluding a finished MTC Full Frame command) 2452 appears in the checkpoint history. A SysEx command "appears" in the 2453 checkpoint history if the history contains a verbatim encoding of the 2454 SysEx command, or if the history contains at least one segment of a 2455 segmental encoding of the SysEx command. 
2457 Chapter X is optimized for the small SysEx commands that signal real- 2458 time events, not the large SysEx commands used for bulk data. Bulk data 2459 commands SHOULD be sent over reliable transport. Appendix C.4 defines 2460 session configuration tools for splitting a MIDI name space into streams 2461 that are carried on different transports. 2463 B.5.1 Chapter Format 2465 Figure B.5.1 shows the variable length format for System Chapter X. 2467 0 1 2 3 2468 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2469 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2470 |S|IDC|L|T| LEN | DATA ... | 2471 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2473 Figure B.5.1 -- System Chapter X format 2475 Chapter X consists of a 1-octet header, followed by an arbitrary length 2476 DATA field. The DATA field encodes a modified version of the data 2477 octets of a SysEx command. The leading 0xF0 and trailing 0xF7 SysEx 2478 octets never appear in the DATA field. 2480 The DATA field encodes all command data octets that appear in the 2481 session history (as distinct from the checkpoint history). This 2482 distinction is relevant for the coding of commands whose segments appear 2483 across multiple packets. In this case, the DATA field MUST include the 2484 starting segments for the command, even if these segments no longer 2485 appear in the checkpoint history. 2487 If the Manufacturer ID value of the SysEx command (coded in the first 2488 octet of the MIDI command) has the values 0x00, 0x7E, or 0x7F, the DATA 2489 field begins with the second data octet of the SysEx command; for all 2490 other Manufacturer ID values, the DATA field begins with the first data 2491 octet of the SysEx command. The 2-bit IDC header field codes 0x00, 2492 0x7E, and 0x7F ID values, using the method shown in Figure B.5.2.
2494 --------------------------------------------------------------------- 2495 | IDC | Manufacturer ID | First DATA octet is: | 2496 |--------------------------------------|------------------------------| 2497 | 0x0 | 0x7E (Universal Non-Real-Time) | 2nd SysEx data octet | 2498 |--------------------------------------|------------------------------| 2499 | 0x1 | 0x7F (Universal Real-Time) | 2nd SysEx data octet | 2500 |--------------------------------------|------------------------------| 2501 | 0x2 | 0x00 (Extension Escape Code) | 2nd SysEx data octet | 2502 |--------------------------------------|------------------------------| 2503 | 0x3 | in the range 0x01--0x7D | 1st SysEx data octet | 2504 --------------------------------------------------------------------- 2506 Figure B.5.2 -- IDC header field encoding 2508 The 3-bit LEN header field codes the exact length of short, complete 2509 SysEx commands, and signals alternative coding techniques for longer 2510 commands and truncated commands. 2512 The LEN values 0x0 through 0x5 indicate that the length of the DATA 2513 field is 1-6 octets. For these LEN values, the DATA field encodes a 2514 complete SysEx command, as a verbatim copy of the SysEx data octets 2515 (possibly skipping the first octet, per Figure B.5.2). 2517 The LEN value 0x6 indicates that the DATA field contains 7 or more 2518 octets. The DATA field encodes a complete SysEx command, as a verbatim 2519 copy of the data octets of the SysEx command (possibly skipping the 2520 first octet, per Figure B.5.2). To code the field length, the most- 2521 significant bit of the final octet MUST be set to 1, and the most- 2522 significant bit of all other octets MUST be set to 0. 2524 The LEN value 0x7 indicates that the SysEx command is truncated. This 2525 coding option is used for SysEx commands encoded using the segmented 2526 method, in the case where not all segments appear in the session 2527 history.
The DATA field encodes a verbatim copy of the data octets of the
command segments that appear in the session history, ordered from the
first segment to the last segment, using the coding methods defined
for the 0x6 LEN value.

B.5.2 Coding Tools

The L and T header flags (Figure B.5.1) indicate the coding tool for
Chapter X. The coding tool sets the inclusion semantics for a subset
of SysEx commands, which we call a type.

If the L bit is set to 1 (the list tool), all active commands that
appear in the checkpoint history of the type coded in the DATA field
MUST be coded by a chapter. If L is set to 0 (the recency tool), the
most recent active command that appears in the checkpoint history of
the type coded in the DATA field MUST be coded by a chapter.

For each command type, an implementation may choose either the list
tool or the recency tool. Simple implementations may use the list
tool for all command types; sophisticated implementations may reduce
bandwidth by using the recency tool for some command types.

The T flag defines the nature of the type. The T flag has different
semantics for MIDI Universal SysEx commands (Manufacturer ID 0x7E and
0x7F) and for generic SysEx commands (all other Manufacturer ID
values).

We first define the T flag for Universal SysEx commands. The first
four data octets of Universal commands are defined in [1], using the
syntax: ID cc SubID SubID1. If T is set to 0, all Universal commands
with the same ID, cc, SubID, and SubID1 values are considered the
same type. If T is set to 1, all Universal commands with the same ID,
cc, and SubID values are considered the same type.

For generic SysEx commands (all Manufacturer ID values except 0x7E
and 0x7F), we define the T flag as follows.
The first data octet of a generic SysEx command is the Manufacturer
ID; the remaining data octets may have an arbitrary organization, but
often have a set of octets coding device and sub-command, followed by
data octets for the command.

If T is set to 0, all generic SysEx commands with the same ID value
are considered to be of the same type. If T is set to 1, the command
is assumed to have a device/sub-command/data organization, and all
commands with the same ID, device, and sub-command values are
considered to be of the same type. If the command has a multi-level
sub-command structure, these semantics require identical sub-command
values at all levels.

C. SDP Session Configuration Tools

In the main text, we show minimal session descriptions for native
(Section 6.1) and mpeg4-generic (Section 6.2) streams. In this
Appendix, we describe how to customize (and perhaps negotiate [15])
stream behavior through the use of the standard SDP attributes and
the payload format fmtp parameters.

The Appendix is divided into 5 sections, each devoted to parameters
that affect a particular aspect of stream behavior:

   o  Appendix C.1 describes the journalling system (ch_anchor,
      ch_default, ch_never, ch_unused, j_sec, j_update).

   o  Appendix C.2 describes MIDI command timestamp semantics
      (linerate, mperiod, octpos, tsmode).

   o  Appendix C.3 describes media time (guardtime, maxptime, ptime).

   o  Appendix C.4 describes multi-stream sessions (musicport,
      zerosync).

   o  Appendix C.5 describes MIDI rendering (chanmask, cid, inline,
      render, rinit, smf_cid, smf_info, smf_inline, smf_url, url).

Appendix C.5.4 defines the MIME type "audio/asc", a stored object for
initializing mpeg4-generic renderers. RTP stream semantics are not
defined for "audio/asc". Therefore, "asc" MUST NOT appear on the
rtpmap line of a session description.

Appendix D defines the Augmented Backus-Naur Form (ABNF, [10]) syntax
for the parameters listed above. Appendix H provides information to
the Internet Assigned Numbers Authority (IANA) on the MIME types and
parameters defined in this document.

C.1 SDP Definitions: The Journalling System

In this Appendix, we define the session description parameters that
configure stream journalling and the recovery journal system.

The j_sec parameter (Appendix C.1.1) sets the journalling method for
the stream. The j_update parameter (Appendix C.1.2) sets the recovery
journal sending policy for the stream. Appendix C.1.2 also defines
the sending policies of the recovery journal system.

Appendix C.1.3 defines several parameters that modify the recovery
journal semantics. These parameters change the default recovery
journal semantics as defined in Section 5 and Appendices A-B.

C.1.1 The j_sec Parameter

Section 2.2 defines the default journalling method for a stream.
Streams that use unreliable transport (such as UDP) default to using
the recovery journal. Streams that use reliable transport (such as
TCP) default to not using a journal.

The fmtp parameter j_sec may be used to override this default. This
memo defines two symbolic values for j_sec: "none", to indicate that
all stream payloads MUST NOT contain a journal section, and "recj",
to indicate that all stream payloads MUST contain a journal section
that uses the recovery journal format.

For example, the j_sec parameter might be set to "none" for a UDP
stream that travels between two hosts on a local network that is
known to provide reliable datagram delivery.

The session description below configures a UDP stream that does not
use the recovery journal:

   v=0
   o=lazzaro 2520644554 2838152170 IN IP4 first.example.net
   s=Example
   t=0 0
   m=audio 5004 RTP/AVP 96
   c=IN IP4 192.0.2.94
   a=rtpmap: 96 rtp-midi/44100
   a=fmtp: 96 j_sec=none;

Other IETF standards-track documents may define alternative journal
formats. These documents MUST define new symbolic values for the
j_sec parameter to signal the use of the format. If a session
description uses a j_sec value unknown to the recipient, the
recipient MUST NOT accept the description.

Special j_sec issues arise when sessions are managed by the Real Time
Streaming Protocol (RTSP, [16]). In many streaming applications, the
session description in the response to the DESCRIBE method does not
code the transport details (such as UDP or TCP) for the session.
Instead, server and client negotiate transport details using the
SETUP method.

In this scenario, the use of the j_sec parameter may be ill-advised,
as the server does not yet know the transport type for the session.
In this case, the session description SHOULD configure the
journalling system using the parameters defined in the remainder of
Appendix C.1, but SHOULD NOT use j_sec to set the journalling status.
Recall that if j_sec does not appear in the session description, the
default method for choosing the journalling method is in effect (no
journal for reliable transport, recovery journal for unreliable
transport).

However, in situations where the server knows journalling is always
required (such as pre-recorded streams that contain packet loss
events) or never required (such as UDP streams sent over a reliable
network), the session description returned by the DESCRIBE method
SHOULD use the j_sec parameter.

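The j_sec rules of this section can be illustrated with a short,
non-normative Python sketch. The function names are hypothetical, and
the parameter splitter is a simplification that assumes no parameter
value contains a ";" character. The sketch applies the transport
default when j_sec is absent and refuses a description that carries
an unknown j_sec value.

```python
from typing import Dict

# The two symbolic j_sec values defined in this memo.
KNOWN_J_SEC = {"none", "recj"}


def parse_fmtp_params(fmtp_value: str) -> Dict[str, str]:
    """Split the parameter part of an a=fmtp line into a dict.

    Simplification (assumption): values never contain ';'.
    """
    params: Dict[str, str] = {}
    for item in fmtp_value.split(";"):
        item = item.strip()
        if not item:
            continue  # tolerate the trailing ';' used in the examples
        name, _, value = item.partition("=")
        params[name.strip()] = value.strip()
    return params


def journalling_method(params: Dict[str, str], reliable_transport: bool) -> str:
    """Return "none" or "recj" for a stream, or raise ValueError."""
    j_sec = params.get("j_sec")
    if j_sec is None:
        # Default: no journal on reliable transport, recovery journal otherwise.
        return "none" if reliable_transport else "recj"
    if j_sec not in KNOWN_J_SEC:
        # An unknown value means the description must not be accepted.
        raise ValueError(f"unknown j_sec value {j_sec!r}")
    return j_sec
```

For the example description above,
journalling_method(parse_fmtp_params("j_sec=none;"), False) selects
"none" even though the stream uses UDP.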
C.1.2 The j_update Parameter

In Section 4, we use the term "sending policy" to describe the method
a sender uses to choose the checkpoint packet identity for each
recovery journal in a stream. In the sub-sections that follow, we
normatively define three sending policies: anchor, closed-loop, and
open-loop.

As stated in Section 4, the default sending policy for a stream is
the closed-loop policy. The fmtp parameter j_update may be used to
override this default.

We define three symbolic values for j_update: "anchor", to indicate
that the stream uses the anchor sending policy, "open-loop", to
indicate that the stream uses the open-loop sending policy, and
"closed-loop", to indicate that the stream uses the closed-loop
sending policy. See Appendix C.1.3 for example session descriptions
that use the j_update parameter.

Other IETF standards-track documents may define additional sending
policies for the recovery journal system. These documents MUST define
new symbolic values for the j_update parameter to signal the use of
the new policy. If a session description uses a j_update value
unknown to the recipient, the recipient MUST NOT accept the
description.

C.1.2.1 The anchor Sending Policy

In the anchor policy, the sender uses the first packet in the stream
as the checkpoint packet for all packets in the stream. The anchor
policy satisfies the recovery journal mandate (Section 4), as the
checkpoint history always covers the entire stream.

The anchor policy does not require the use of the Real Time Control
Protocol (RTCP, [2]) or other feedback from receiver to sender.
Senders do not need to take special actions to ensure that received
streams start up free of artifacts, as the recovery journal always
covers the entire history of the stream.
Receivers are relieved of the responsibility of tracking the changing
identity of the checkpoint packet, because the checkpoint packet
never changes.

The main drawback of the anchor policy is bandwidth efficiency.
Because the checkpoint history covers the entire stream, the size of
the recovery journals produced by this policy usually exceeds the
journal size of alternative policies. For single-channel MIDI data
streams, the bandwidth overhead of the anchor policy is often
acceptable (see Appendix A.4 of [12]). For dense streams, the
closed-loop or open-loop policies may be more appropriate.

C.1.2.2 The closed-loop Sending Policy

The closed-loop policy is the default policy of the recovery journal
system. For each packet in the stream, the policy lets senders choose
the smallest possible checkpoint history that satisfies the recovery
journal mandate. As smaller checkpoint histories generally yield
smaller recovery journals, the closed-loop policy reduces the
bandwidth of a stream, relative to the anchor policy.

The closed-loop policy relies on feedback from receiver to sender.
The policy assumes that a receiver periodically informs the sender of
the highest sequence number it has seen so far in the stream, coded
in the 32-bit extension format defined in [2]. In sessions that use
RTCP, receivers transmit this information in the Extended Highest
Sequence Number Received (EHSNR) field of Receiver Report (RR)
packets. However, applications MAY use any method of feedback to
implement the closed-loop policy.

The sender may safely use receiver sequence number feedback to guide
checkpoint history management, because Section 4 requires receivers
to repair indefinite artifacts whenever a packet loss event occurs.

We now normatively define the closed-loop policy.
At the moment a sender prepares an RTP packet for transmission, the
sender is aware of R >= 0 receivers for the stream. Senders may
become aware of a receiver via RTCP traffic from the receiver, via
RTP packets from a paired stream sent by the receiver to the sender,
via messages from a session management tool, or by other means. As
receivers join and leave a session, the value of R changes.

Each known receiver k (1 <= k <= R) is associated with a 32-bit
extended packet sequence number M(k), where the extension reflects
the sequence number rollover count of the sender.

If the sender has received at least one feedback report from receiver
k, M(k) is the most recent report of the highest RTP packet sequence
number seen by the receiver, normalized to reflect the rollover count
of the sender.

If the sender has not received a feedback report from the receiver,
M(k) is the extended sequence number of the last packet the sender
transmitted before it became aware of the receiver. If the sender
became aware of this receiver before it sent the first packet in the
stream, M(k) is the extended sequence number of the first packet in
the stream.

Given this definition of M(), we now state the closed-loop policy.
When preparing a new packet for transmission, a sender MUST choose a
checkpoint packet with extended sequence number N, such that M(k) >=
(N - 1) for all k, 1 <= k <= R, where R >= 1. The policy does not
restrict sender behavior in the R == 0 (no known receivers) case.

Under the closed-loop policy as defined above, a sender may transmit
packets whose checkpoint history is shorter than the session history
(as defined in Appendix A.1). In this event, a new receiver that
joins the stream may experience indefinite artifacts.

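The closed-loop constraint stated above can be restated as code. The
sketch below is illustrative only; the function names and the use of
a dictionary keyed by a receiver label are assumptions of the sketch.
Since M(k) >= (N - 1) must hold for every known receiver, the latest
checkpoint a sender may choose is min(M(k)) + 1.

```python
from typing import Dict, Optional


def initial_m(last_sent_before_aware: Optional[int], first_seq: int) -> int:
    """M(k) for a receiver that has not yet sent a feedback report.

    last_sent_before_aware is None when the sender learned of the
    receiver before sending the first packet of the stream.
    """
    if last_sent_before_aware is None:
        return first_seq
    return last_sent_before_aware


def checkpoint_is_allowed(n: int, m: Dict[str, int]) -> bool:
    """True if M(k) >= N - 1 for every known receiver k.

    With no known receivers (R == 0) the policy places no restriction,
    so all() over an empty collection correctly returns True.
    """
    return all(mk >= n - 1 for mk in m.values())


def latest_allowed_checkpoint(m: Dict[str, int]) -> Optional[int]:
    """Largest extended sequence number N the policy permits, or None if R == 0."""
    if not m:
        return None
    return min(m.values()) + 1
```

For example, with M values of 1005 and 998 for two receivers, the
latest permitted checkpoint packet has extended sequence number 999;
choosing 1000 would violate the constraint for the second receiver.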
For example, if a Control Change (0xB) command for the channel volume
(controller number 7) was sent early in a stream, and later a new
receiver joins the session, the closed-loop policy may permit all
packets sent to the new receiver to use a checkpoint history that
does not include the channel volume Control Change command. As a
result, the new receiver experiences an indefinite artifact, and
plays all notes on a channel too loudly or too softly.

To address this issue, the closed-loop policy states that whenever a
sender becomes aware of a new receiver, the sender MUST determine if
the receiver would be subject to indefinite artifacts under the
closed-loop policy. If so, the sender MUST ensure that the receiver
starts the session free of indefinite artifacts. In satisfying this
requirement, senders MAY infer the initial MIDI state of the receiver
from the session description. For example, the stream example in
Section 6.2 has the initial state defined in [1] for General MIDI.

In some types of sessions, a receiver may have access to stream
packets before the sender is aware of the receiver. In this case, the
restrictions the closed-loop policy places on the sender may not
protect the receiver from indefinite artifacts.

To address this issue, the closed-loop policy states that if a
receiver participates in a session where it may have access to a
stream before the sender is aware of the receiver, the receiver MUST
take actions to ensure that its rendered MIDI performance does not
contain indefinite artifacts. The receiver MUST NOT discontinue these
protective actions until it is certain that the sender is aware of
its presence.

The final set of normative closed-loop policy requirements concerns
how senders drop receivers from a stream.
As defined earlier in this section, the closed-loop policy states
that a sender MUST choose a checkpoint packet with extended sequence
number N, such that M(k) >= (N - 1) for all k, 1 <= k <= R, where R
>= 1. If the sender has received at least one feedback report from
receiver k, M(k) is the most recent report of the highest RTP packet
sequence number seen by the receiver, normalized to reflect the
rollover count of the sender.

If this receiver k stops sending feedback to the sender, the M(k)
value used by the sender reflects the last feedback report from the
receiver. As time progresses without feedback from receiver k, this
fixed M(k) value forces the sender to increase the size of the
checkpoint history, and thus increases the bandwidth of the stream.

At some point, the sender may need to take action in order to limit
the bandwidth of the stream. The closed-loop policy states that if
this situation occurs, and if the nature of the session permits a
sender to stop transmitting packets to the offending receiver, the
sender MUST stop transmitting packets to this receiver. In other
words, if the sender is able to actively cut off receiver k from the
stream, the sender is not permitted to stop using M(k) in computing
the checkpoint packet identity while it continues to send the stream
to receiver k.

In certain types of sessions, it may not be possible for a sender to
actively stop sending packets to a particular receiver. The
closed-loop policy states that if receivers participate in a session
where senders are unable to stop sending packets to a particular
receiver of the stream, the receiver MUST monitor the RTP stream, and
any other sources of information, to determine if the sender is no
longer using the M(k) feedback from the receiver to choose each
checkpoint packet.
If the receiver detects this condition, it MUST leave the session,
and close down the rendered MIDI performance in a manner that is free
of indefinite artifacts.

Finally, we note that the closed-loop policy is suitable for use in
RTP/RTCP sessions that use multicast transport. However, aspects of
the closed-loop policy do not scale well to sessions with large
numbers of participants. The sender state scales linearly with the
number of receivers, as the sender needs to track the identity and
M(k) value for each receiver k. The average recovery journal size
also depends on the number of receivers, as the RTCP reporting
interval backoff slows down the rate of a full update of M(k) values.
The backoff algorithm may also increase the amount of ancillary state
used by implementations of the normative sender and receiver
behaviors defined in Section 4.

C.1.2.3 The open-loop Sending Policy

The open-loop policy is suitable for sessions that are not able to
implement the receiver-to-sender feedback required by the closed-loop
policy, and are also not able to use the anchor policy because of
bandwidth constraints.

The open-loop policy does not place constraints on how a sender
chooses the checkpoint packet for each packet in the stream. In the
absence of such constraints, a receiver may find that the recovery
journal in the packet that ends a loss event has a checkpoint history
that does not cover the entire loss event. We refer to loss events of
this type as uncovered loss events.

To ensure that uncovered loss events do not compromise the recovery
journal mandate, the open-loop policy assigns specific recovery tasks
to senders, receivers, and the creators of session descriptions. The
underlying premise of the open-loop policy is that the indefinite
artifacts produced during uncovered loss events fall into two
classes.

The first class consists of recoverable indefinite artifacts.
Receivers are able to repair recoverable artifacts that occur during
an uncovered loss event without intervention from the sender, at the
potential cost of unpleasant transient artifacts.

For example, after an uncovered loss event, receivers are able to
repair indefinite artifacts due to NoteOff (0x8) commands that may
have occurred during the loss event, by executing NoteOff commands
for all active NoteOn commands. This action causes a transient
artifact (a sudden silent period in the performance), but ensures
that no stuck notes sound indefinitely. We refer to MIDI commands
that are amenable to repair in this fashion as recoverable MIDI
commands.

The second class consists of unrecoverable indefinite artifacts. If
this class of artifact occurs during an uncovered loss event, the
receiver is not able to repair the stream.

For example, after an uncovered loss event, receivers are not able to
repair indefinite artifacts due to Control Change (0xB) channel
volume (controller number 7) commands that have occurred during the
loss event. A repair is impossible because the receiver has no way of
determining the data value of a lost channel volume command. We refer
to MIDI commands that are fragile in this way as unrecoverable MIDI
commands.

The open-loop policy does not specify how to partition the MIDI
command set into recoverable and unrecoverable commands. Instead, it
assumes that the creators of the session descriptions are able to
come to agreement on a suitable recoverable/unrecoverable MIDI
command partition for an application.

Given these definitions, we now state the normative requirements for
the open-loop policy.

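As a non-normative illustration of the definitions above, the Python
sketch below shows one way a receiver might detect an uncovered loss
event and build the NoteOff repair for recoverable artifacts. Several
details are assumptions of the sketch, not of this memo: the function
names, the use of extended (rollover-free) sequence numbers, the
release velocity of 0x40, and the representation of sounding notes as
(channel, note number) pairs. The coverage test is inferred from the
definition of an uncovered loss event; Section 5 remains the
authority for the actual check.

```python
from typing import List, Set, Tuple

RELEASE_VELOCITY = 0x40  # assumption: a neutral release velocity for the repair


def is_uncovered_loss(checkpoint_seq: int, last_received_seq: int) -> bool:
    """True if the checkpoint history starts after the first lost packet.

    Both arguments are extended sequence numbers (assumption of this
    sketch); the first lost packet is last_received_seq + 1.
    """
    return checkpoint_seq > last_received_seq + 1


def note_off_repairs(active_notes: Set[Tuple[int, int]]) -> List[bytes]:
    """Return one NoteOff (0x8) command per (channel, note) still sounding."""
    repairs = [
        bytes([0x80 | channel, note, RELEASE_VELOCITY])
        for channel, note in sorted(active_notes)
    ]
    active_notes.clear()  # nothing is left sounding once the repairs execute
    return repairs
```

A receiver that last saw packet 100 and then receives a packet whose
journal uses checkpoint packet 120 has experienced an uncovered loss
event, and would execute the commands returned by note_off_repairs()
for its set of sounding notes.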
In the open-loop policy, the creators of the session description MUST
use the ch_unused or ch_anchor fmtp parameters (defined in Appendix
C.1.3) to protect all unrecoverable MIDI command types from
indefinite artifacts.

In a general sense, the ch_anchor parameter changes the recovery
journal semantics to use the anchor checkpoint policy (Appendix
C.1.2.1) for a command, and the ch_unused parameter acts to exclude a
command type from the stream. These options act to shield command
types from artifacts during an uncovered loss event.

In the open-loop policy, receivers MUST examine the Checkpoint Packet
Seqnum field of the recovery journal header after every loss event,
to check if the loss event is an uncovered loss event. Section 5
shows how to perform this check. If an uncovered loss event has
occurred, a receiver MUST perform indefinite artifact recovery for
all MIDI command types that are not shielded by ch_anchor and
ch_unused parameter assignments in the session description.

The open-loop policy does not place specific constraints on the
sender. However, the open-loop policy works best if the sender
manages the size of the checkpoint history to ensure that uncovered
losses occur infrequently, by taking into account the delay and loss
characteristics of the network. Also, as each checkpoint packet
change incurs the risk of an uncovered loss, senders should only move
the checkpoint if it reduces the size of the journal.

C.1.3 Recovery Journal Chapter Inclusion Parameters

The recovery journal chapter definitions (Appendices A-B) specify
under what conditions a chapter MUST appear in the recovery journal.
In most cases, the definition states that if a certain command
appears in the checkpoint history, a certain chapter type MUST appear
in the recovery journal to protect the command.

In this section, we describe the chapter inclusion fmtp parameters.
These parameters modify the conditions under which a chapter appears
in the journal.

These parameters are essential to the use of the open-loop policy
(Appendix C.1.2.3), and may also be used to simplify multicast
implementations of the closed-loop policy (Appendix C.1.2.2).

The parameters also serve to signal the types of MIDI commands that
are not in use in a session. In this role, the parameters may be used
with streams that do not use journalling.

Each parameter represents a type of chapter inclusion semantics. An
assignment to a parameter declares which chapters (or chapter
subsets) obey the inclusion semantics. We describe the assignment
syntax for these parameters later in this section.

Below, we normatively define the semantics of the chapter inclusion
parameters. For clarity, we define the action of parameters on
complete chapters. If a parameter is assigned a subset of a chapter,
the definition applies only to the chapter subset.

   o  ch_unused. If a chapter is assigned to the ch_unused parameter,
      the command types encoded by the chapter MUST NOT appear in
      the MIDI command sections of stream packets. As a consequence,
      the chapter MUST NOT appear in the recovery journal.

In contrast with ch_unused, if a chapter is assigned to the
parameters we define below, the command types encoded by the chapter
MAY appear in the MIDI command section of stream packets.

   o  ch_never. A chapter assigned to the ch_never parameter MUST
      NOT appear in the recovery journal.

   o  ch_default. A chapter assigned to the ch_default parameter
      MUST follow the default semantics for the chapter, as defined
      in Appendices A-B.

   o  ch_anchor. A chapter assigned to the ch_anchor parameter MUST
      obey a modified version of the default chapter semantics.
      In the modified semantics, all references to the checkpoint
      history are replaced with references to the session history,
      and all references to the checkpoint packet are replaced with
      references to the first packet sent in the stream.

Parameter assignments obey the following syntax (see Appendix D for
ABNF):

   <parameter> = [channel list]<chapter list>[field list];

The chapter list is mandatory; the channel and field lists are
optional. Multiple assignments to parameters have a cumulative
effect, and are applied in the order of parameter appearance in a
media description.

The chapter list specifies the channel or system chapters for which
the parameter applies. The chapter list is a concatenated sequence of
one or more of the letters corresponding to the chapter types
(ACDEFMNPQTVWX). In addition, the list may contain one or more of the
letters for the sub-chapter types (BGHJKYZ) of System Chapter D.
Assignments to sub-chapters of Chapter D override assignments to
Chapter D. The letters in a chapter list MUST be upper case, and MUST
appear in alphabetical order.

The channel list specifies the channel journals for which this
parameter applies; if no channel list is provided, the parameter
applies to all channel journals. The channel list takes the form of a
list of channel numbers (0 through 15) and dash-separated channel
number ranges (e.g. 0-5, 8-12, etc). Dots (i.e. "." characters)
separate elements in the channel list.

A few system chapters use special semantics for the channel list,
which we now define.

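Before turning to those special cases, the general assignment syntax
can be illustrated with a short, non-normative Python sketch. It
assumes that an assignment value has the form of an optional channel
list, then the chapter letters, then an optional field list. The
function names are hypothetical, the sketch does not range-check
channel or field numbers, and Appendix D remains the authority for
the grammar.

```python
import re
from typing import List, Optional, Tuple

# Chapter letters (ACDEFMNPQTVWX) plus Chapter D sub-chapter letters (BGHJKYZ).
CHAPTER_LETTERS = "ABCDEFGHJKMNPQTVWXYZ"

# Assumed shape of a value: [channel list] chapter letters [field list]
_ASSIGNMENT = re.compile(r"^([0-9.\-]*)([A-Z]+)([0-9.\-]*)$")


def expand_number_list(text: str) -> Optional[List[int]]:
    """Expand a dot-separated list such as '4.11-13' into [4, 11, 12, 13]."""
    if not text:
        return None  # list omitted: the parameter applies to everything
    numbers: List[int] = []
    for element in text.split("."):
        low, dash, high = element.partition("-")
        if dash:
            numbers.extend(range(int(low), int(high) + 1))
        else:
            numbers.append(int(low))
    return numbers


def parse_assignment(value: str) -> Tuple[Optional[List[int]], str, Optional[List[int]]]:
    """Return (channels, chapter letters, fields) for one assignment value."""
    match = _ASSIGNMENT.match(value)
    if match is None:
        raise ValueError(f"malformed assignment {value!r}")
    channels, chapters, fields = match.groups()
    if any(letter not in CHAPTER_LETTERS for letter in chapters):
        raise ValueError(f"unknown chapter letter in {value!r}")
    if list(chapters) != sorted(chapters):
        raise ValueError(f"chapter letters not in alphabetical order in {value!r}")
    return expand_number_list(channels), chapters, expand_number_list(fields)
```

Applied to values used in the example session description later in
this section, parse_assignment("4.11-13N") yields channels 4, 11, 12,
and 13 for Chapter N with no field list, and parse_assignment("C7.64")
yields Chapter C on all channels with field list 7 and 64.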
For the J and K Chapter D sub-chapters (undefined System Common), the
digit 0 codes that the parameter applies to the LEGAL field of the
associated command log (Figure B.1.4 of Appendix B.1), the digit 1
codes that the parameter applies to the VALUE field of the command
log, and the digit 2 codes that the parameter applies to the COUNT
field of the command log.

For the Y and Z Chapter D sub-chapters (undefined System Real-time),
the digit 0 codes that the parameter applies to the LEGAL field of
the associated command log (Figure B.1.5 of Appendix B.1) and the
digit 1 codes that the parameter applies to the COUNT field of the
command log.

For Chapter Q (Sequencer State Commands), the digit 0 codes that the
parameter applies to the default Chapter Q definition, which forbids
the TIME field. The digit 1 codes that the parameter applies to the
optional Chapter Q definition, which supports the TIME field.

For Chapter X (System Exclusive), the channel list specifies the
types of System Exclusive commands to which the parameter applies.
The digit 0 corresponds to Universal Real-Time commands (Manufacturer
ID 0x7E), the digit 1 corresponds to Universal Non-Real-Time
(Manufacturer ID 0x7F), and the digit 2 corresponds to internal use
(Manufacturer ID 0x7D). The digit 3 corresponds to real-time commands
for all other Manufacturer ID numbers, and the digit 4 corresponds to
non-real-time commands for all other Manufacturer ID numbers.

The syntax for field lists follows the syntax for channel lists. If
no field list is provided, the parameter applies to all controller or
note numbers.

For Chapters C and E, the field list codes the controller numbers for
which the parameter applies.

For Chapter M, the field list consists of a single digit.
The digit 0 codes that a log MUST appear for a parameter in Chapter M
if a C-active command that forms part of an initiated transaction for
the parameter appears in the checkpoint history, and that the
A-COARSE, A-FINE, and A-BUTTON fields MUST NOT appear in parameter
logs. The digit 1 codes that a log MUST appear for a parameter in
Chapter M if an active (as opposed to C-active) command that forms
part of an initiated transaction for the parameter appears in the
checkpoint history.

For Chapters N and A, the field list codes the note numbers for which
the parameter applies. For sub-chapters J and K of Chapter D, the
field list consists of a single digit, which specifies the number of
data octets that follow the command octet.

The example session description below illustrates the use of the
chapter inclusion parameters:

   v=0
   o=lazzaro 2520644554 2838152170 IN IP6 first.example.net
   s=Example
   t=0 0
   m=audio 5004 RTP/AVP 96
   c=IN IP6 FF1E:03AD::7F2E:172A:1E24
   a=rtpmap: 96 rtp-midi/44100
   a=fmtp: 96 j_update=open-loop; ch_unused=ABDEFGHJKMQTVWXYZ;
   a=fmtp: 96 ch_anchor=P; ch_anchor=C7.64;
   a=fmtp: 96 ch_never=4.11-13N;

The j_update parameter codes that the stream uses the open-loop
policy. Most chapters are assigned to ch_unused, a typical MIDI usage
pattern of a low-bandwidth stream.

To guard against indefinite artifacts, the MIDI Program Change
command and several MIDI Control Change controller numbers are
assigned to ch_anchor. Note that the ordering of the ch_anchor
chapter C assignment after the ch_unused command acts to override the
ch_unused assignment for the listed controller numbers (7 and 64).

Chapter N for several MIDI channels is assigned to ch_never; in
practice, this assignment pattern would reflect knowledge about a
resilient rendering method in use for certain channels.
In this example, Chapter N for MIDI channels other than 4, 11, 12,
and 13 may appear in the recovery journal, per the default behavior.

C.2 SDP Definitions: Command Execution Semantics

The MIDI command section of the payload format consists of a list of
commands, each with an associated timestamp. Section 3.1 defines the
default semantics for command timestamps. These semantics work well
for transcoding Standard MIDI Files (SMFs), but are problematic for
transcoding MIDI sources (such as MIDI 1.0 DIN cables [1]) that use
implicit "time-of-arrival" coding.

In this Appendix, we define session configuration tools for
customizing the timestamp semantics of the MIDI command section.

The fmtp parameter "tsmode" specifies the timestamp semantics for a
stream. The parameter takes on one of three token values: "comex",
"async", or "buffer".

The "comex" value specifies the default semantics. The "async" value
selects an asynchronous sampling algorithm for time-of-arrival
sources (Appendix C.2.1). The "buffer" value selects an alternative
synchronous sampling algorithm (Appendix C.2.2).

Ancillary fmtp parameters may follow tsmode in a media description.
One such parameter is "linerate". This parameter codes the timespan
of one MIDI octet on the transmission medium of the MIDI source to be
sampled (such as a MIDI 1.0 DIN cable). The parameter has units of
nanoseconds, and takes on integral values. For MIDI 1.0 DIN cables,
the correct linerate value is 320000 (this value is also the default
value for the parameter). Other ancillary fmtp parameters are defined
in Appendices C.2.1-2 below.

C.2.1 The async Algorithm

The "async" tsmode value specifies the asynchronous sampling of a
MIDI time-of-arrival source. In asynchronous sampling, the moment an
octet is received from a source it is labelled with a wall-clock time
value.
The time value has RTP timestamp units.

The "octpos" ancillary fmtp parameter defines how RTP command timestamps
are derived from octet time values. If octpos has the token value
"first", a timestamp codes the time value of the first octet of the
command. If octpos has the token value "last", a timestamp codes the
time value of the last octet of the command. If the octpos parameter
does not appear in the media description, a timestamp MAY reflect the
time value of any octet of the command.

The octpos semantics refer to the first or last octet of a command as it
appears on a time-of-arrival source, not as it appears in the RTP
packet. This distinction is significant for segmented SysEx commands.
This distinction is also significant for sources that use running status
coding, as the RTP encoding does not always preserve running status.
The P header bit of the MIDI command section may be used to ascertain
accurate command timing in this case (Section 3).

We now show a session description example for the async algorithm.
Consider a sender that is transcoding a MIDI 1.0 DIN cable source into
RTP. The sender runs on a computing platform that assigns time values
to every incoming octet of the source, and the sender uses the time
values to label the first octet of each command in the RTP packet. This
session description describes the transcoding:

    v=0
    o=lazzaro 2520644554 2838152170 IN IP4 first.example.net
    s=Example
    t=0 0
    m=audio 5004 RTP/AVP 96
    c=IN IP4 192.0.2.94
    a=rtpmap: 96 rtp-midi/44100
    a=fmtp: 96 tsmode=async;linerate=320000;octpos=first;

C.2.2 The buffer Algorithm

The "buffer" tsmode value specifies the synchronous sampling of a MIDI
time-of-arrival source.

In synchronous sampling, octets received from a source are placed in a
holding buffer upon arrival.
At periodic intervals, the RTP sender examines the buffer. The sender
removes complete commands from the buffer, and codes those commands in
an RTP packet. The command timestamp reflects the actual moment of
buffer examination, expressed in RTP timestamp units. Note that several
commands may have the same timestamp value.

The "mperiod" ancillary fmtp parameter defines the nominal periodic
sampling interval. The parameter takes on positive integral values, and
has RTP timestamp units.

The "octpos" ancillary fmtp parameter, defined in Appendix C.2.1 for
asynchronous sampling, plays a different role in synchronous sampling.
In synchronous sampling, the parameter specifies the timestamp semantics
of a command whose octets span several sampling periods.

If octpos has the token value "first", the timestamp reflects the
arrival period of the first octet of the command. If octpos has the
token value "last", the timestamp reflects the arrival period of the
last octet of the command. If the octpos parameter does not appear in
the media description, the timestamp MAY reflect the arrival period of
any octet of the command. The octpos semantics refer to the first or
last octet of the command as it appears on a time-of-arrival source, not
as it appears in the RTP packet.

We now show a session description example for the buffer algorithm.
Consider a sender that is transcoding a MIDI 1.0 DIN cable source into
RTP. The sender runs on a computing platform that places source data
into a buffer upon receipt. The sender polls the buffer 1000 times a
second, extracts all complete commands from the buffer, and places the
commands in an RTP packet.
This session description describes the transcoding:

    v=0
    o=lazzaro 2520644554 2838152170 IN IP6 first.example.net
    s=Example
    t=0 0
    m=audio 5004 RTP/AVP 96
    c=IN IP6 FF1E:03AD::7F2E:172A:1E24
    a=rtpmap: 96 rtp-midi/44100
    a=fmtp: 96 tsmode=buffer;linerate=320000;octpos=last;mperiod=44;

The mperiod value of 44 is derived by dividing the srate (44100 Hz) by
the 1000 Hz buffer sampling rate, and rounding to the nearest integer.
Command timestamps might not increment by exact multiples of 44, as the
actual sampling period might not precisely match the nominal mperiod
value.

C.3 SDP Definitions: Timing Tools

In this Appendix, we describe session configuration tools for
customizing the temporal behavior of MIDI streams.

C.3.1 ptime and maxptime

Senders code the temporal nature of a stream by choosing the amount of
media time encoded in each packet. Short media times (20 ms or less)
often imply an interactive session. Longer media times (100 ms or more)
usually indicate a content streaming session. The AVP profile permits
audio packet media times to range from 0 to 200 ms.

An RTP receiver dynamically senses the media time of packets in a
stream, and chooses the length of its playout buffer to match the
stream. A receiver typically sizes its playout buffer to fit several
audio packets, and adjusts the buffer length to reflect the network
jitter and the sender timing fidelity.

Alternatively, the packet media time may be statically set during
session configuration. The standard "ptime" attribute sets the typical
packet media time for a session. The standard "maxptime" attribute sets
the maximum packet media time for a session [6].

0 ms is a reasonable media time value for MIDI packets. In a packet
with a 0 ms media time, all commands execute at the instant coded by the
packet timestamp.
Prohibitions in [15] against 0 ms ptime values are not relevant for MIDI
streams, and may be ignored.

The session description example below defines a stream suitable for use
in low-latency interactive applications.

    v=0
    o=lazzaro 2520644554 2838152170 IN IP4 first.example.net
    s=Example
    t=0 0
    m=audio 5004 RTP/AVP 96
    c=IN IP4 192.0.2.94
    a=rtpmap: 96 rtp-midi/44100
    a=ptime:0
    a=maxptime:0

C.3.2 The guardtime Parameter

RTP/AVP permits a sender to stop sending audio packets for an arbitrary
period of time during a session. When sending resumes, the RTP sequence
number series continues unbroken, and the RTP timestamp value reflects
the media time silence gap.

This RTP/AVP feature has its roots in telephony, but is also well
matched to interactive MIDI sessions, as players may fall silent for
several seconds during (or between) songs.

Certain MIDI applications benefit from a slight enhancement to this
RTP/AVP feature. In interactive applications, receivers may use on-line
network models to guide heuristics for handling lost and late RTP
packets. These models may work poorly if a sender ceases packet
transmission for long periods of time.

Session descriptions may use the fmtp parameter "guardtime" to set a
minimum sending rate for a media session. The value assigned to
guardtime codes the maximum separation time between two sequential
packets, as expressed in RTP timestamp units. Typical guardtime values
are 500-2000 ms.

Below, we show a session description that uses the guardtime parameter.

    v=0
    o=lazzaro 2520644554 2838152170 IN IP6 first.example.net
    s=Example
    t=0 0
    m=audio 5004 RTP/AVP 96
    c=IN IP6 FF1E:03AD::7F2E:172A:1E24
    a=rtpmap: 96 rtp-midi/44100
    a=ptime:0
    a=maxptime:0
    a=fmtp: 96 guardtime=44100;

C.3.3 MIDI Time Code Issues

RTP defines tools to synchronize the playout of multiple RTP media
streams. Appendix C.4 shows how to use these tools in MIDI streams.

In content-creation applications, it may be necessary to synchronize
stream playout with media that are not sent over RTP. For example,
analog video may be marked with SMPTE 12M timecode, and an application
may need to synchronize MIDI playout with the video using timecode.

The MIDI standard includes the MIDI Time Code (MTC) commands for SMPTE
12M timecode [1]. An application MAY use MTC to send timecode data
(including offsets and user data) in the MIDI command stream for
heterogeneous synchronization purposes.

C.4 SDP Definitions: Multiple Streams

Several MIDI streams may appear in a session description. By default,
the MIDI name space (16 voice channels + systems) for each stream is
unique, and the rendering for each stream proceeds independently. The
audio outputs of the streams are presented in a synchronized fashion.

In this Appendix, we define two fmtp parameters for use in sessions with
several streams. These parameters ("musicport" and "zerosync") add
three features to RTP MIDI:

   1. Several streams may target the same MIDI name space.

   2. Several streams may be bundled to form a larger MIDI
      name space, that a single rendering system may treat as
      an ordered entity.

   3. Streams may specify relative timebase offsets, to support
      synchronization with zero sync-lock delay.

In Appendices C.4.1-2, we define the musicport and zerosync parameters.
In Appendix C.4.3, we show session description examples.

Other payload formats MAY define musicport and zerosync fmtp parameters.
Formats would define these parameters so that their streams could be
bundled into RTP MIDI name spaces. The parameter definitions MUST be
compatible with the musicport and zerosync semantics defined in this
Appendix.

C.4.1 The musicport Parameter

The musicport parameter codes an arbitrary identification number for the
MIDI name space (16 voice channels + systems) of an RTP stream. The
musicport parameter may take on integer values between 0 and 4294967295.

If several MIDI streams in a session share the same musicport value, the
streams target the same MIDI name space. We refer to this relationship
as the identity relationship.

If several MIDI streams in a session have contiguous musicport values
(i.e. i, i+1, ... i+k), the name spaces of the streams form an ordered
entity. In this case, the streams in the entity are said to share an
ordered relationship.

Note that a stream may participate in both an identity and an ordered
relationship. For example, a stream in an identity relationship may
have a musicport value that forms part of an ordered relationship. If
the musicport values of two streams are not part of an ordered or
identity relationship, the two streams are independent, and have
independent MIDI name spaces.

RTP MIDI streams in an ordered or identity relationship MUST be all
native streams or all mpeg4-generic streams. Thus, we refer to
relationships as being native relationships or mpeg4-generic
relationships.

For native relationships, at most one stream may specify MIDI renderers
(using the tools described in C.5). Each MIDI rendering type may define
its own semantics with regard to identity and ordered relationships.
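
The identity and ordered relationships defined above can be illustrated
with a short Python sketch. It is not part of this memo: the function
name, the stream labels, and the choice to treat each maximal run of
consecutive musicport values as one ordered entity are assumptions made
for illustration.

```python
from collections import defaultdict

def group_musicports(streams):
    """Group streams by musicport value (hypothetical helper).

    streams: dict mapping a stream label to its musicport value.
    Returns a list of ordered entities. Each entity is a list of
    (musicport, labels) pairs in ascending musicport order. Streams
    listed under the same musicport share an identity relationship;
    pairs within one entity share an ordered relationship.
    """
    by_port = defaultdict(list)
    for label, port in streams.items():
        by_port[port].append(label)

    entities = []
    for port in sorted(by_port):
        labels = sorted(by_port[port])
        # Assumption: a value that directly follows the previous one
        # extends the current ordered entity; otherwise a new,
        # independent entity begins.
        if entities and entities[-1][-1][0] + 1 == port:
            entities[-1].append((port, labels))
        else:
            entities.append([(port, labels)])
    return entities

groups = group_musicports({"a": 12, "b": 12, "c": 5, "d": 6, "e": 9})
```

In this usage, streams "a" and "b" share musicport 12 and so share an
identity relationship, streams "c" and "d" (values 5 and 6) form one
ordered entity, and stream "e" is independent of the others.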

For mpeg4-generic relationships, at most one stream in an identity or
ordered relationship may have a config parameter value other than the
empty string. In this case, the config value configures the stream.
Alternatively, all config parameters may be set to the empty string. In
this case, exactly one stream in the relationship MUST define the
configuration using the tools described in Appendix C.5.

For both native and mpeg4-generic relationships, an exception to the
"one stream defines the rendering" rule applies to relationships that
exclusively contain sendonly and recvonly streams (as defined in [6]).
In this case, a stream in each direction may define a renderer.

In an identity relationship, the sender partitions a MIDI name space (16
voice channels + systems) into several RTP streams. Receivers may
process these streams independently, or may merge the streams to
reconstitute the original MIDI command stream. We now specify receiver
and sender responsibilities to ensure the robust transmission of
identity relationships.

Receivers that merge identity relationship streams into a single MIDI
command stream MUST maintain the structural integrity of the MIDI
commands coded in each stream during the merging process, in the same
way that software that merges traditional MIDI 1.0 DIN cable flows is
responsible for creating a merged command flow compatible with [1].

Senders MUST partition the name space so that the rendered MIDI
performance does not contain indefinite artifacts (as defined in Section
4). This responsibility holds even if all streams are sent over
reliable transport, as imperfect synchronization of reliable streams may
yield indefinite artifacts. For example, stuck notes may occur in a
performance split over two TCP streams, if NoteOn commands are sent on
one stream and NoteOff commands are sent on the other.
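
As a rough sender-side sanity check for the stuck-note hazard described
above, the Python sketch below reports every MIDI voice channel whose
commands have been assigned to more than one stream. The function name
and the (stream label, status octet) representation are hypothetical.
Keeping each channel on a single stream is a conservative condition
assumed here for illustration, not a requirement of this memo.

```python
def channels_split_across_streams(assignments):
    """Return the sorted voice channels that appear on several streams.

    assignments: iterable of (stream_label, status_octet) pairs, one
    per MIDI command queued for transmission (hypothetical layout).
    """
    owner = {}
    split = set()
    for stream_label, status in assignments:
        # Voice commands have status octets 0x80-0xEF; the low nibble
        # codes the channel. System commands (0xF0-0xFF) are skipped.
        if 0x80 <= status <= 0xEF:
            channel = status & 0x0F
            if owner.setdefault(channel, stream_label) != stream_label:
                split.add(channel)
    return sorted(split)

# NoteOn (0x90) and NoteOff (0x80) for channel 0 on different streams.
risky = channels_split_across_streams([("tcp1", 0x90), ("tcp2", 0x80)])
```

An empty result indicates that no voice channel is divided between
streams under this check; it says nothing about systems commands.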

Senders MUST NOT split a Registered Parameter Number (RPN) or
Non-Registered Parameter Number (NRPN) transaction appearing on a MIDI
channel across multiple identity relationship streams. Receivers MUST
assume that the RPN/NRPN transactions that appear on different identity
relationship streams are independent, and MUST preserve transactional
integrity during the MIDI merge.

A simple way to safely partition voice channel commands is to place all
MIDI commands for a particular voice channel into the same stream. Safe
partitions of systems commands may be more complex for streams that
extensively use System Exclusive commands.

C.4.2 The zerosync Parameter

The RTP timestamp of the first packet in a stream is not set to zero.
Instead, [2] mandates that the RTP timestamp is initialized to a
randomly chosen value, to guard against plaintext attacks on encrypted
streams. As a consequence, a receiver cannot directly use RTP
timestamps to play back two RTP streams in sync.

The RTP Control Protocol (RTCP), a low-bandwidth feedback channel that
is paired with each RTP stream, provides synchronization services.
Certain types of RTCP packets code the current time in two forms: the
format of the RTP timestamp, and the 64-bit Network Time Protocol (NTP)
format. A receiver may examine the NTP timestamps of several RTCP
streams, and use this information to deduce the temporal relationship
between the RTP streams associated with the RTCP streams. This method
assumes that the NTP timestamps coded by all streams derive from a
common clock source.

For many applications, this RTCP-based method is a good way to
synchronize streams. In some applications, however, this method is not
optimal, because of the synchronization time delay at the start of the
session.
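
The RTCP-based method described above can be sketched in Python as
follows. This is an illustrative sketch, not text from this memo: the
function name is hypothetical, the NTP time is assumed to be already
converted to seconds, and each stream is assumed to have supplied one
RTCP report that pairs an RTP timestamp with an NTP time.

```python
MOD32 = 1 << 32  # RTP timestamps are 32-bit values that wrap around

def rtp_to_wallclock(rtp_ts, report_rtp_ts, report_ntp_seconds, clock_rate):
    """Map an RTP timestamp onto the sender's NTP timeline.

    report_rtp_ts and report_ntp_seconds are the two forms of the same
    instant taken from one RTCP report (hypothetical inputs).
    clock_rate is the RTP clock rate in Hz, e.g. 44100.
    """
    # Signed distance from the report, tolerant of 32-bit wraparound.
    delta = (rtp_ts - report_rtp_ts) % MOD32
    if delta >= MOD32 // 2:
        delta -= MOD32
    return report_ntp_seconds + delta / clock_rate

# Two streams with unrelated random timestamp origins; both packets
# turn out to represent the same instant on the common NTP clock.
t_a = rtp_to_wallclock(1000 + 44100, 1000, 10.0, 44100)
t_b = rtp_to_wallclock(44000, MOD32 - 100, 10.0, 44100)
```

A receiver cannot perform this mapping for a stream until a report for
that stream has arrived, which is the source of the start-of-session
delay noted above.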

The zerosync parameter provides an alternative mechanism for stream
synchronization. The zerosync parameter codes the RTP timestamp offsets
for each stream, so that streams generated in a synchronized fashion may
be played back in sync without using RTCP feedback.

The use of the zerosync parameter weakens the security of RTP, as
discussed in Appendix G of this memo.

The zerosync parameter supports two synchronization mechanisms. One
mechanism potentially synchronizes all streams within a given
relationship. Media descriptions code this mechanism with a zerosync
parameter whose value is in the range 1-4294967295. We refer to this
mechanism as the non-zero behavior.

A second mechanism potentially synchronizes all RTP MIDI streams in a
session. Media descriptions code this mechanism with a zerosync
parameter whose value is set to 0. We refer to this mechanism as the
zero behavior.

A media description may contain, at most, one zerosync parameter
assignment. Thus, a stream may participate in a non-zero behavior or a
zero behavior, but not both. In both zero and non-zero behaviors, all
media descriptions synchronized by the behavior MUST have identical
srate values.

In a non-zero behavior, all streams within a relationship share an
underlying timebase, but the randomly chosen initial timestamp value for
each stream obscures this commonality. To unmask the similarity, each
media description in the relationship MAY include a zerosync parameter
whose non-zero value codes its initial timestamp value. In this scheme,
the underlying timestamp for a packet is computed by subtracting (modulo
2^32) the zerosync value from the packet timestamp.

In a zero behavior, all affected streams share an underlying timebase
AND the same initial timestamp value (in direct violation of [2]).
Thus, the packet timestamps code the "true" timestamp directly.

C.4.3 Multi-stream examples using musicport and zerosync.

This section shows several session description examples that use the
musicport and zerosync parameters.

Our first session description example shows two mpeg4-generic streams
that drive the same General MIDI decoder.

    v=0
    o=lazzaro 2520644554 2838152170 IN IP4 first.example.net
    s=Example
    t=0 0
    m=audio 5004 RTP/AVP 61
    c=IN IP4 192.0.2.94
    a=rtpmap: 61 mpeg4-generic/44100
    a=fmtp: 61 streamtype=5; mode=rtp-midi; profile-level-id=12;
    a=fmtp: 61 config=7A124D546864000000060000000100604D547
               26B0000000400FF2F000
    a=fmtp: 61 musicport=12;zerosync=1726
    m=audio 5006 RTP/AVP 62
    c=IN IP4 192.0.2.94
    a=rtpmap: 62 mpeg4-generic/44100
    a=fmtp: 62 streamtype=5; mode=rtp-midi; config="";
    a=fmtp: 62 profile-level-id=12; musicport=12; zerosync=726;

(The linebreak in the second fmtp line accommodates memo formatting
restrictions; SDP does not have continuation lines.)

The musicport values indicate the streams share an identity
relationship, and the zerosync values code the non-zero behavior.

A variant on this example, whose session description is not shown, would
use two streams in an identity relationship driving the same MIDI
renderer, each with a different transport type. One stream would use
UDP, and would be dedicated to real-time messages. A second stream
would use TCP, and would be used for SysEx bulk data messages.

In the next example, two mpeg4-generic streams form an ordered
relationship to drive a Structured Audio decoder with 32 MIDI voice
channels.

    v=0
    o=lazzaro 2520644554 2838152170 IN IP6 first.example.net
    s=Example
    t=0 0
    m=audio 5004 RTP/AVP 61
    c=IN IP6 FF1E:03AD::7F2E:172A:1E24
    a=rtpmap: 61 mpeg4-generic/44100
    a=fmtp: 61 streamtype=5; mode=rtp-midi; config="";
    a=fmtp: 61 profile-level-id=13; musicport=5; zerosync=0;
    m=audio 5006 RTP/AVP 62
    c=IN IP6 FF1E:03AD::7F2E:172A:1E24
    a=rtpmap: 62 mpeg4-generic/44100
    a=fmtp: 62 streamtype=5; mode=rtp-midi; config="";
    a=fmtp: 62 profile-level-id=13; musicport=6; zerosync=0;
    a=fmtp: 62 render=synthetic; rinit="audio/asc";
    a=fmtp: 62 url="http://example.com/cardinal.asc";
    a=fmtp: 62 cid="azsldkaslkdjqpwojdkmsldkfpe";

The sequential musicport values for the two streams establish the
ordered relationship. The musicport=5 stream maps to Structured Audio
extended channels range 0-15; the musicport=6 stream maps to Structured
Audio extended channels range 16-31. The zerosync values code the zero
behavior.

Both config strings are empty. The configuration data is specified in
the final two fmtp lines of the second media description. We define
this configuration method in Appendix C.5.

C.5 SDP Definitions: MIDI Rendering

This Appendix defines the session configuration tools for rendering.

The "render" fmtp parameter specifies a rendering method for a stream.
The inclusion of a render parameter in a media description acts to
override the default rendering semantics (defined in Sections 6.1-2) for
the stream.

The render parameter is assigned a token value that signals the
top-level rendering type. This memo defines two token values for
render: "synthetic" and "api". A "synthetic" renderer transforms the
MIDI stream into audio output (or sometimes, into stage lighting changes
or other actions).
An "api" renderer presents the command stream to applications via an
Application Programmer Interface (API).

Other fmtp parameters follow the render parameter in the media
description, and define the exact nature of the renderer. The "rinit"
fmtp parameter (defined in Appendix C.5.1) specifies the MIME subtype
for the renderer, and the "inline", "url", and "cid" fmtp parameters
(defined in Appendix C.5.2) specify renderer initialization data.

Other IETF standards-track documents MAY define additional token values
for the render parameter. If a receiver is not aware of the token value
assigned to a render parameter, the receiver MUST ignore the renderer
the parameter defines.

A media description MAY contain several render parameters. This syntax
requests synchronized rendering of the stream by each renderer, if
possible. Renderers appear in a media description in order of
decreasing priority. A receiver with limited resources SHOULD use the
priority to decide which renderer(s) to retain in a session.

C.5.1 The rinit Parameter

The "rinit" fmtp parameter defines the nature of the renderer declared
by the render parameter. Exactly one rinit parameter MUST follow the
render parameter in a media description.

The value assigned to the rinit parameter MUST be a MIME type/subtype
[8] that defines a renderer. Authors of rendering systems and MIDI APIs
SHOULD register [20] a MIME subtype for use with RTP MIDI.

A renderer that directly produces audio output SHOULD be registered
under the "audio" MIME type. API presentation renderers, and renderers
that control non-audio devices, SHOULD be registered under the
"application" MIME type.

The subtype registration for a renderer MAY define a data object.
For renderers that directly produce audio or control output, the data
object usually codes initialization data for the rendering algorithm.
The data object may also encapsulate an SMF, so that the data object may
be used as a format for stored performances.

For API presentation renderers, the role of the data object varies. In
some cases, the data object describes the hardware device that generates
the stream (manufacturer, model, etc). In other cases, the data object
follows the semantics of audio renderer data objects.

If a renderer MIME registration defines a data object, additional fmtp
parameters MAY follow the rinit parameter to encode the object. We
define these parameters in Appendix C.5.2.

By default, if a data object is encoded in an RTP MIDI media
description, SMFs encapsulated in the data object MUST be ignored by the
receiver. We define fmtp parameters to override this default in
Appendix C.5.3.

Special rules apply to using the rinit parameter in an mpeg4-generic
stream. We define these rules in Appendix C.5.4.

The rinit parameter MAY be assigned the "application/octet-stream" or
"audio/octet-stream" values. These values code an opaque rendering
type, whose rendering semantics and data object format have been defined
outside the scope of this memo.

C.5.2 Encoding rinit Data Objects

The "inline", "url", and "cid" fmtp parameters MAY follow the rinit
parameter in a media description. These parameters encode the
initialization data object for the renderer.

The "inline" parameter supports the inline encoding of the data object.
The parameter is assigned a double-quoted Base64 [8] encoding of the
binary data object, with no line breaks.

The "url" parameter is assigned a double-quoted string representation of
a Uniform Resource Locator (URL) for the data object.
If the URL points to a MIME object, the object MUST have the MIME
type/subtype value coded by the rinit parameter.

The "cid" parameter supports data object caching, and MAY follow the url
parameter in the media description. The parameter is assigned a
double-quoted string value that encodes a globally unique identifier for
the data object. If the url string points to a MIME object, the cid
string MUST match the Content-ID header [8] value of the object.

In most cases, one inline parameter or one url/cid parameter pair
follows the rinit parameter in the media description. The correct
receiver interpretation of multiple data objects SHOULD be defined in
the renderer MIME registration.

C.5.3 MIDI Channel Mapping

In this Appendix, we specify how to map MIDI name spaces (16 voice
channels + systems) onto a renderer.

In the general case:

   o A session may define an ordered relationship (Appendix C.4)
     that presents more than one MIDI name space to a renderer.

   o A renderer may accept an arbitrary number of MIDI name spaces,
     or may expect a fixed number of MIDI name spaces.

A session description SHOULD define mappings of streams to renderers
that are name-space compatible. If a receiver detects a name-space
mismatch in a session description, extra stream name spaces MUST be
discarded, and extra renderer name spaces MUST NOT be driven with MIDI
data.

If a media description defines several renderers, each renderer
processes the presented name space(s) in parallel. However, the
"chanmask" fmtp parameter may be used to mask out selected voice
channels to each renderer. We define "chanmask" and other channel
management fmtp parameters in the sub-sections below.

C.5.3.1 The smf_info fmtp Parameter

The smf_info parameter MAY appear after the rinit parameter (Appendix
C.5.1) in a media description.
The parameter defines the use of all SMFs encapsulated in renderer data
objects.

We define token values for smf_info: "sdp_start" and "ignore". The
"sdp_start" value codes that SMF rendering MUST begin upon the
acceptance of the session description. The "ignore" value codes that
SMF files MUST be discarded (the default behavior). Below, we define
the semantics for the "sdp_start" token value.

SMFs share the MIDI name spaces of the RTP streams. SMF commands and
RTP stream commands are merged and presented to the renderer. The
indefinite artifact responsibilities for merged MIDI streams defined in
Appendix C.4.1 also apply to merging RTP streams and SMFs.

If the data object encapsulates multiple SMFs, the SMF name spaces are
presented as an ordered entity to the renderer. The first encapsulated
SMF in the data object maps to the first renderer name space, the second
encapsulated SMF maps to the second renderer name space, etc. If the
associated RTP streams form an ordered relationship, the first SMF is
merged with the first name space of the relationship, the second SMF is
merged to the second name space of the relationship, etc.

Unless the streams and the SMFs both use MIDI Time Code, the time offset
between SMF and stream data is unspecified. This restriction may limit
the use of SMFs to applications where synchronization is not critical,
such as the transport of System Exclusive commands for renderer
initialization, or human-SMF interactivity.

C.5.3.2 The smf_inline, smf_url, and smf_cid fmtp Parameters

In some applications, the renderer data object may not encapsulate SMFs,
but an application may wish to use SMFs in the manner defined in
Appendix C.5.3.1.

The "smf_inline", "smf_url", and "smf_cid" fmtp parameters address this
situation.
These parameters use the syntax and semantics of the inline, url, and
cid parameters defined in C.5.2, except that the encoded data object is
an SMF.

If several "smf_inline" or "smf_url" parameters appear in a media
description, the order of the parameters defines the SMF name space
ordering.

If smf_url points to a MIME object, the "application/octet-stream"
type/subtype SHOULD be used for the object.

C.5.3.3 The chanmask fmtp Parameter

The chanmask fmtp parameter instructs the renderer to ignore all MIDI
voice commands for certain channel numbers. The parameter value is a
concatenated string of "1" and "0" digits. Each string position maps to
a MIDI voice channel number (system channels may not be masked). A "1"
instructs the renderer to process the voice channel; a "0" instructs the
renderer to ignore the voice channel.

The string length of the chanmask parameter value MUST be 16 (for a
single stream or an identity relationship) or a multiple of 16 (for an
ordered relationship).

The chanmask parameter appears after the render parameter, and describes
the final MIDI name spaces presented to the renderer. The SMF and
stream components of the MIDI name spaces may not be independently
masked.

C.5.4 The audio/asc MIME Type

In Appendix H.3, we register the audio/asc MIME type. The data object
for audio/asc is a binary encoding of the AudioSpecificConfig data used
to configure mpeg4-generic streams (Section 6.2 and [7]).

An mpeg4-generic media description MAY use audio/asc for renderer
configuration. Several restrictions apply to the use of the render
parameter with mpeg4-generic streams:

   o An mpeg4-generic media description that uses the render parameter
     MUST assign the empty string ("") to the mpeg4-generic "config"
     parameter.

   o The render parameter MUST be assigned the value "synthetic".
     Other token values for render MUST NOT appear in an mpeg4-generic
     media description.

   o The rinit parameter MUST be assigned the value "audio/asc".
     Other token values for rinit MUST NOT appear in an mpeg4-generic
     media description.

   o The streamtype, mode, and profile-level-id parameters MUST be
     used as defined in Section 6.2, and the AudioSpecificConfig data
     MUST encode one of the MPEG 4 Audio Object Types defined for use
     with mpeg4-generic in Section 6.2.

In addition, several restrictions apply to the use of the audio/asc MIME
type in RTP MIDI.

   o A native stream MUST NOT assign the "audio/asc" value to rinit.

   o The audio/asc MIME type defines a stored object type; it does
     not define semantics for RTP streams. Thus, audio/asc MUST NOT
     appear on an rtpmap line of a session description.

Below, we show session description examples for audio/asc. The session
description below uses the inline parameter to code the
AudioSpecificConfig block for an mpeg4-generic General MIDI stream.

    v=0
    o=lazzaro 2520644554 2838152170 IN IP4 first.example.net
    s=Example
    t=0 0
    m=audio 5004 RTP/AVP 61
    c=IN IP4 192.0.2.94
    a=rtpmap: 61 mpeg4-generic/44100
    a=fmtp: 61 streamtype=5; mode=rtp-midi;
    a=fmtp: 61 config=""; profile-level-id=12; render=synthetic;
    a=fmtp: 61 rinit="audio/asc";
    a=fmtp: 61 inline="ehJNVGhkAAAABgAAAAEAYE1UcmsAAAAEAP8vAAA="

The session description below uses the url fmtp parameter to code the
AudioSpecificConfig block for the same General MIDI stream:

    v=0
    o=lazzaro 2520644554 2838152170 IN IP4 first.example.net
    s=Example
    t=0 0
    m=audio 5004 RTP/AVP 61
    c=IN IP4 192.0.2.94
    a=rtpmap: 61 mpeg4-generic/44100
    a=fmtp: 61 streamtype=5; mode=rtp-midi;
    a=fmtp: 61 config=""; profile-level-id=12;
    a=fmtp: 61 render=synthetic; rinit="audio/asc";
    a=fmtp: 61 url="http://example.net/oski.asc";
    a=fmtp: 61 cid="xjflsoeiurvpa09itnvlduihgnvet98pa3w9utnuighbuk";

D. Parameter Syntax Definitions

In this Appendix, we define the syntax for the RTP MIDI fmtp parameters
in Augmented Backus-Naur Form (ABNF, [10]). Parameters appear in the
fmtp lines of session descriptions for native or mpeg4-generic streams.
A fmtp line may be defined as:

    ;
    ; SDP fmtp line definition
    ;

    fmtp = "a=fmtp:" token 1*(param-assign ";") CRLF

where <token> codes the RTP payload type. Below, we define
<param-assign> as a set of incremental rules for the custom parameters
defined in Appendix C.
3826 ;
3827 ;
3828 ; top-level definition for all parameters
3829 ;
3830 ;

3832 ;
3833 ; Parameters defined in Appendix C.1

3835 param-assign =  "j_sec" "=" ("none" / "recj" / *ietf-extension)

3837 param-assign =/ "j_update" "=" ("anchor" / "closed-loop" / "open-loop"
3838                  / *ietf-extension)

3840 param-assign =/ "ch_default" "=" ([channel-list] chapter-list [f-list])

3842 param-assign =/ "ch_unused" "=" ([channel-list] chapter-list [f-list])

3844 param-assign =/ "ch_never" "=" ([channel-list] chapter-list [f-list])

3846 param-assign =/ "ch_anchor" "=" ([channel-list] chapter-list [f-list])

3848 ;
3849 ; Parameters defined in Appendix C.2

3851 param-assign =/ "tsmode" "=" ("comex" / "async" / "buffer")

3853 param-assign =/ "linerate" "=" nonzero-four-octet

3855 param-assign =/ "octpos" "=" ("first" / "last")

3857 param-assign =/ "mperiod" "=" nonzero-four-octet

3859 ;
3860 ; Parameter defined in Appendix C.3

3862 param-assign =/ "guardtime" "=" nonzero-four-octet

3864 ;
3865 ; Parameters defined in Appendix C.4

3867 param-assign =/ "musicport" "=" four-octet

3869 param-assign =/ "zerosync" "=" four-octet

3871 ;
3872 ; Parameters defined in Appendix C.5

3874 param-assign =/ "chanmask" "=" 1*( 16( "0" / "1" ) )

3876 param-assign =/ "cid" "=" double-quote cid-block double-quote
3877 param-assign =/ "inline" "=" double-quote base-64-block double-quote

3879 param-assign =/ "render" "=" ("synthetic" / "api" / *ietf-extension)

3881 param-assign =/ "rinit" "=" mime-type "/" mime-subtype

3883 param-assign =/ "smf_cid" "=" double-quote cid-block double-quote

3885 param-assign =/ "smf_info" "=" ("ignore" / "sdp_start" )

3887 param-assign =/ "smf_inline" "=" double-quote base-64-block double-quote

3889 param-assign =/ "smf_url" "=" double-quote uri-element double-quote

3891 param-assign =/ "url" "=" double-quote uri-element double-quote

3893 ;
3894 ; list definitions for the ch_ chapter-list
3895 ;

3897 chapter-list = chapter-part1 chapter-part2 chapter-part3

3899 chapter-part1 = 0*1"A" 0*1"B" 0*1"C" 0*1"D" 0*1"E" 0*1"F" 0*1"G"

3901 chapter-part2 = 0*1"H" 0*1"J" 0*1"K" 0*1"M" 0*1"N" 0*1"P" 0*1"Q"

3903 chapter-part3 = 0*1"T" 0*1"V" 0*1"W" 0*1"X" 0*1"Y" 0*1"Z"

3905 ;
3906 ; list definitions for the ch_ channel-list
3907 ;

3909 channel-list = midi-chan-element *("." midi-chan-element)

3911 midi-chan-element = midi-chan / midi-chan-range

3913 midi-chan-range = midi-chan "-" midi-chan

3915 ; decimal value of left midi-chan
3916 ; MUST be strictly less than decimal
3917 ; value of right midi-chan

3919 midi-chan = %d0-15

3921 ;
3922 ; list definitions for the ch_ field list (f-list)
3923 ;
3924 f-list = midi-field-element *("." midi-field-element)

3926 midi-field-element = midi-field / midi-field-range

3928 midi-field-range = midi-field "-" midi-field
3929 ;
3930 ; decimal value of left midi-field
3931 ; MUST be strictly less than decimal
3932 ; value of right midi-field

3934 midi-field = %d0-127

3936 ;
3937 ; definitions for rinit fmtp parameter
3938 ;

3940 mime-type = type
3941 ;
3942 ; as defined on page 12 in [8]

3944 mime-subtype = subtype
3945 ;
3946 ; as defined on page 12 in [8]

3948 ;
3949 ; generic rules
3950 ;

3952 ietf-extension = token
3953 ;
3954 ; token as defined in reference [6].
3955 ; ietf-extension may only be defined in
3956 ; standards-track RFCs (Section 7).

3958 four-octet = %d0-4294967295
3959 ; unsigned encoding of 32-bits

3961 nonzero-four-octet = %d1-4294967295
3962 ; unsigned encoding of 32-bits, ex-zero

3964 uri-element = uri
3965 ; as defined in reference [6].

3967 base-64-block = base64
3968 ; as defined in reference [6].

3970 double-quote = %x22
3971 ; the double-quote (") character

3973 cid-block = msg-id
3974 ; as discussed in Section 7 of
3975 ; reference [8]

3977 ;
3978 ; End of ABNF

3980 The mpeg4-generic RTP payload [4] defines a "mode" parameter that
3981 signals the type of MPEG stream in use. We add a new mode value,
3982 "rtp-midi", using the ABNF rule below:

3984 ;
3985 ; mpeg4-generic mode parameter extension
3986 ;

3988 mode =/ "rtp-midi"
3989 ; as described in Section 6.2 of this memo

3991 E. A MIDI Overview for Networking Specialists

3993 This Appendix presents an overview of the MIDI standard, for the benefit
3994 of networking specialists new to musical applications. Implementors
3995 should consult [1] for a normative description of MIDI.

3997 Musicians make music by performing a controlled sequence of physical
3998 movements. For example, a pianist plays by coordinating a series of key
3999 presses, key releases, and pedal actions. MIDI represents a musical
4000 performance by encoding these physical gestures as a sequence of MIDI
4001 commands. This high-level musical representation is compact but
4002 fragile: one lost command may be catastrophic to the performance.

4004 MIDI commands have much in common with the machine instructions of a
4005 microprocessor. MIDI commands are defined as binary elements.
4006 Bitfields within a MIDI command have a regular structure and a
4007 specialized purpose. For example, the upper nibble of the first command
4008 octet (the opcode field) codes the command type. MIDI commands may
4009 consist of an arbitrary number of complete octets, but most MIDI
4010 commands are 1, 2, or 3 octets in length.
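
As an informal illustration of the opcode field described above, the
sketch below splits the first octet of a MIDI command into its upper
and lower nibbles. It is not part of the normative text; the function
name split_first_octet is hypothetical, and the sketch assumes the
caller passes the first octet of a command as an integer.

```python
from typing import Tuple


def split_first_octet(octet: int) -> Tuple[int, int]:
    """Return (opcode, low_nibble) for the first octet of a MIDI command.

    The upper nibble is the opcode field. How the lower nibble is
    interpreted depends on the command type, so it is returned as-is.
    """
    if not 0x80 <= octet <= 0xFF:
        # The first octet of a MIDI command always has its leading bit set.
        raise ValueError("not the first octet of a MIDI command")
    return octet >> 4, octet & 0x0F


opcode, low_nibble = split_first_octet(0x93)
print(hex(opcode), hex(low_nibble))
```

For the octet 0x93, the opcode field is 0x9 and the lower nibble is
0x3.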
4012  -------------------------------------------------------------
4013 | Name                           | Bitfield Pattern           |
4014 |-------------------------------------------------------------|
4015 | NoteOff (end a note)           | 1000cccc 0nnnnnnn 0vvvvvvv |
4016 |-------------------------------------------------------------|
4017 | NoteOn (start a note)          | 1001cccc 0nnnnnnn 0vvvvvvv |
4018 |-------------------------------------------------------------|
4019 | PTouch (Polyphonic Aftertouch) | 1010cccc 0nnnnnnn 0aaaaaaa |
4020 |-------------------------------------------------------------|
4021 | CControl (Controller Change)   | 1011cccc 0xxxxxxx 0yyyyyyy |
4022 |-------------------------------------------------------------|
4023 | PChange (Program Change)       | 1100cccc 0ppppppp          |
4024 |-------------------------------------------------------------|
4025 | CTouch (Channel Aftertouch)    | 1101cccc 0aaaaaaa          |
4026 |-------------------------------------------------------------|
4027 | PWheel (Pitch Wheel)           | 1110cccc 0xxxxxxx 0yyyyyyy |
4028 |-------------------------------------------------------------|
4029 | System (sub-opcode is xxxx)    | 1111xxxx ...               |
4030  -------------------------------------------------------------

4032 Figure E.1 -- MIDI command chart

4034 Figure E.1 shows the MIDI command family. There are two major classes
4035 of commands: voice commands (opcode field values in the range 0x8
4036 through 0xE) and system commands (opcode field value 0xF). Voice
4037 commands code the musical gestures for each timbre in a composition.
4038 System commands perform housekeeping functions, such as System Reset
4039 (the one-octet command 0xFF).

4041 E.1 Command Types

4043 A voice command executes on one of 16 MIDI channels, as coded by its
4044 4-bit channel field (field cccc in Figure E.1). In most applications,
4045 notes for different timbres are assigned to different channels. To
4046 support applications that require more than 16 channels, MIDI systems
4047 use several MIDI command streams in parallel, to yield 32, 48, or 64
4048 MIDI channels.

4050 As an example of a voice command, consider a NoteOn command (opcode
4051 0x9), with binary encoding 1001cccc 0nnnnnnn 0vvvvvvv. This command
4052 signals the start of a musical note on MIDI channel cccc. The note has
4053 a pitch coded by the note number nnnnnnn, and an onset amplitude coded
4054 by note velocity vvvvvvv.

4056 Other voice commands signal the end of notes (NoteOff, opcode 0x8), map
4057 a specific timbre to a MIDI channel (PChange, opcode 0xC), or set the
4058 value of parameters that modulate the timbral quality (all other voice
4059 commands). The exact meaning of most voice channel commands depends on
4060 the rendering algorithms the MIDI receiver uses to generate sound. In
4061 most applications, a MIDI sender has a model (in some sense) of the
4062 rendering method used by the receiver.

4064 E.2 Running Status

4066 All MIDI command bitfields share a special structure: the leading bit of
4067 the first octet is set to 1, and the leading bit of all subsequent
4068 octets is set to 0. This structure supports a data compression system,
4069 called running status [1], that improves the coding efficiency of MIDI.

4071 In running status coding, the first octet of a MIDI voice command may be
4072 dropped if it is identical to the first octet of the previous MIDI voice
4073 command. This rule, in combination with a convention to consider NoteOn
4074 commands with a null third octet as NoteOff commands, supports the
4075 coding of note sequences using two octets per command.

4077 E.3 Command Timing

4079 The bitfield formats in Figure E.1 do not encode the execution time for
4080 a command. Timing information is not a part of the MIDI command syntax
4081 itself; different applications of the MIDI command language use
4082 different methods to encode timing.
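
As an informal aside, the running status rule of Appendix E.2 can be
sketched in a few lines of code. The sketch below is illustrative
only: the names DATA_OCTETS, expand_running_status, and is_note_off
are hypothetical, and the sketch assumes a stream that holds only
voice commands (system commands, which interact with running status
in ways described in [1], are rejected).

```python
from typing import List, Tuple

# Number of data octets that follow the first octet, per voice opcode
# (taken from the bitfield patterns in Figure E.1).
DATA_OCTETS = {0x8: 2, 0x9: 2, 0xA: 2, 0xB: 2, 0xC: 1, 0xD: 1, 0xE: 2}


def expand_running_status(octets: bytes) -> List[Tuple[int, ...]]:
    """Decode voice commands, restoring first octets dropped by running status."""
    commands = []
    status = None  # first octet of the most recent voice command
    i = 0
    while i < len(octets):
        if octets[i] & 0x80:  # leading bit 1: an explicit first octet
            if octets[i] >> 4 == 0xF:
                raise ValueError("system commands are outside this sketch")
            status = octets[i]
            i += 1
        elif status is None:
            raise ValueError("data octet with no previous voice command")
        count = DATA_OCTETS[status >> 4]
        data = octets[i:i + count]
        if len(data) != count or any(b & 0x80 for b in data):
            raise ValueError("truncated or malformed command")
        commands.append((status, *data))
        i += count
    return commands


def is_note_off(command: Tuple[int, ...]) -> bool:
    """True for NoteOff, and for NoteOn with a null third octet."""
    opcode = command[0] >> 4
    return opcode == 0x8 or (opcode == 0x9 and command[2] == 0)


# Two notes started and one ended: after the first command, each
# command is coded in two octets.
stream = bytes([0x90, 60, 100, 62, 100, 60, 0])
for command in expand_running_status(stream):
    print(command, is_note_off(command))
```

The decoded commands carry no timestamps; as noted above, timing is
supplied by the application that carries the commands.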
4084 For example, the MIDI command set acts as the transport layer for MIDI
4085 1.0 DIN cables [1]. MIDI cables are short asynchronous serial lines
4086 that facilitate the remote operation of musical instruments and audio
4087 equipment. Timestamps are not sent over a MIDI 1.0 DIN cable. Instead,
4088 the standard uses an implicit "time of arrival" code. Receivers execute
4089 MIDI commands at the moment of arrival.

4091 In contrast, Standard MIDI Files (SMFs, [1]), a file format for
4092 representing complete musical performances, add an explicit timestamp to
4093 each MIDI command, using a delta encoding scheme that is optimized for
4094 the statistics of musical performance. SMF timestamps usually code timing
4095 using the metric notation of a musical score. SMF meta-events are used
4096 to add a tempo map to the file, so that score beats may be accurately
4097 converted into units of seconds during rendering.

4099 F. Acknowledgements

4101 We thank the networking, media compression, and computer music community
4102 members who have commented or contributed to the effort, including Steve
4103 Casner, Paul Davis, Robin Davies, Joanne Dow, Dominique Fober, Adrian
4104 Freed, Philippe Gentric, Chris Grigg, Todd Hager, Michel Jullian, Phil
4105 Kerr, Young-Kwon Lim, Jessica Little, Jan van der Meer, Colin Perkins,
4106 Charlie Richmond, Herbie Robinson, Larry Rowe, Dave Singer, Martijn
4107 Sipkema, Kent Terry, David Wessel, Magnus Westerlund, Tom White, Matt
4108 Wright, Jim Wright, and Giorgio Zoia.

4110 G. Security Considerations

4112 Authentication of incoming RTP and RTCP packets is RECOMMENDED. Without
4113 such protections, attackers could forge MIDI commands into an ongoing
4114 stream, damaging speakers and eardrums. An attacker could also craft
4115 RTP and RTCP packets to exploit known bugs in the client, and take
4116 effective control of a client machine.

4118 Session management tools SHOULD use authentication on all session
4119 descriptions. Session descriptions may code initialization data inline,
4120 using the inline (Appendix C.5.2) and smf_inline (Appendix C.5.3.2) fmtp
4121 parameters. If an attacker inserts bogus initialization data into a
4122 session description, the attacker can corrupt the session or forge a
4123 client attack.

4125 Session descriptions may code renderer initialization data by reference,
4126 via the url (Appendix C.5.2) and smf_url (Appendix C.5.3.2) parameters.
4127 If the coded URL is spoofed, both session and client are open to attack.

4129 The zerosync fmtp parameter (described in Appendix C.4.2) impairs a
4130 security feature of RTP. In standard RTP, the RTP timestamp is
4131 initialized to a randomly chosen value, to reduce the predictability of
4132 the header. If zerosync is used in a media description, this security
4133 feature is partially (for non-zero zerosync values) or totally (if
4134 zerosync is set to zero) disabled.

4136 H. IANA Considerations

4138 In this Appendix, we register the audio/rtp-midi and audio/asc MIME
4139 types, and we extend the audio/mpeg4-generic MIME type [4]. The
4140 audio/rtp-midi and audio/asc registrations are in the IETF tree.

4142 H.1 rtp-midi MIME Registration

4144 This section registers rtp-midi as a MIME subtype for the audio type.

4146 MIME media type name:

4148     audio

4150 MIME subtype name:

4152     rtp-midi

4154 Required parameters:

4156     rate: The RTP timestamp clock rate, as specified in the rtpmap
4157     line. See Sections 2.1 and 6.1 for usage details.

4159 Optional parameters:

4161     Standard SDP attributes:

4163         maxptime:   See Appendix C.3 for usage details.
4164         ptime:      See Appendix C.3 for usage details.

4166     Non-extensible parameters:

4168         ch_anchor:  See Appendix C.1 for usage details.
4169         ch_default: See Appendix C.1 for usage details.
4170         ch_never:   See Appendix C.1 for usage details.
4171         ch_unused:  See Appendix C.1 for usage details.
4172         chanmask:   See Appendix C.5 for usage details.
4173         cid:        See Appendix C.5 for usage details.
4174         guardtime:  See Appendix C.3 for usage details.
4175         inline:     See Appendix C.5 for usage details.
4176         linerate:   See Appendix C.2 for usage details.
4177         musicport:  See Appendix C.4 for usage details.
4178         mperiod:    See Appendix C.2 for usage details.
4179         octpos:     See Appendix C.2 for usage details.
4180         rinit:      See Appendix C.5 for usage details.
4181         tsmode:     See Appendix C.2 for usage details.
4182         smf_cid:    See Appendix C.5 for usage details.
4183         smf_info:   See Appendix C.5 for usage details.
4184         smf_inline: See Appendix C.5 for usage details.
4185         smf_url:    See Appendix C.5 for usage details.
4186         url:        See Appendix C.5 for usage details.
4187         zerosync:   See Appendix C.4 for usage details.

4189     Extensible parameters:

4191         j_sec, j_update:

4193             See Appendix C.1 for usage details. The parameters
4194             may only be extended via an IETF standards-track
4195             document.

4197         render:

4199             See Appendix C.5 for usage details. The parameter may
4200             only be extended via an IETF standards-track document.

4202 Encoding considerations:

4204     This type is only defined for real-time transfers of MIDI
4205     streams via RTP. Stored-file semantics for rtp-midi may
4206     be defined in the future.

4208 Security considerations:

4210     See Appendix G of this memo.

4212 Interoperability considerations:

4214     None.

4216 Published specification:

4218     This memo and [1] serve as the normative specification. In
4219     addition, references [12], [13], and [18] provide non-normative
4220     implementation guidance.

4222 Applications which use this media type:

4224     Audio content-creation hardware, such as MIDI controller piano
4225     keyboards and MIDI audio synthesizers. Audio content-creation
4226     software, such as music sequencers, digital audio workstations,
4227     and soft synthesizers. Computer operating systems, for network
4228     support of MIDI Application Programmer Interfaces. Content
4229     distribution servers and terminals may use this media type for
4230     low bit-rate music coding.
4232 Additional information:

4234     None.

4236 Person & email address to contact for further information:

4238     John Lazzaro

4240 Intended usage:

4242     COMMON.

4244 Author/Change controller:

4246     John Lazzaro

4248 H.2 mpeg4-generic MIME Registration

4250 The mpeg4-generic MIME type [4] permits extensions to support new modes.
4251 The registration below defines mode rtp-midi for mpeg4-generic, to
4252 support the MPEG Audio codecs [5] that use MIDI.

4254 MIME media type name:

4256     audio

4258 MIME subtype name:

4260     mpeg4-generic

4262 Required parameter extensions:

4264     We extend the mpeg4-generic required parameter mode, by
4265     adding the parameter=value syntax:

4267         mode=rtp-midi

4269     to the list of legal mode values defined in [4]. See
4270     Section 6.2 for usage details.

4272     rate: In mode rtp-midi, rate is a required parameter. Rate
4273     specifies the RTP timestamp clock rate on the rtpmap line.
4274     See Sections 2.1 and 6.2 for usage details.

4276 Optional parameters:

4278     Standard SDP attributes:

4280         maxptime:   See Appendix C.3 for usage details.
4281         ptime:      See Appendix C.3 for usage details.

4283     Non-extensible parameters:

4285         ch_anchor:  See Appendix C.1 for usage details.
4286         ch_default: See Appendix C.1 for usage details.
4287         ch_never:   See Appendix C.1 for usage details.
4288         ch_unused:  See Appendix C.1 for usage details.
4289         chanmask:   See Appendix C.5 for usage details.
4290         cid:        See Appendix C.5 for usage details.
4291         guardtime:  See Appendix C.3 for usage details.
4292         inline:     See Appendix C.5 for usage details.
4293         linerate:   See Appendix C.2 for usage details.
4294         musicport:  See Appendix C.4 for usage details.
4295         mperiod:    See Appendix C.2 for usage details.
4296         octpos:     See Appendix C.2 for usage details.
4297         rinit:      See Appendix C.5 for usage details.
4298         tsmode:     See Appendix C.2 for usage details.
4299         smf_cid:    See Appendix C.5 for usage details.
4300         smf_info:   See Appendix C.5 for usage details.
4301         smf_inline: See Appendix C.5 for usage details.
4302         smf_url:    See Appendix C.5 for usage details.
4303         url:        See Appendix C.5 for usage details.
4304         zerosync:   See Appendix C.4 for usage details.

4306     Extensible parameters:

4308         j_sec, j_update:

4310             See Appendix C.1 for usage details. The parameters
4311             may only be extended via an IETF standards-track
4312             document.

4314         render:

4316             See Appendix C.5 for usage details. The parameter may
4317             only be extended via an IETF standards-track document.

4319 Encoding considerations:

4321     Only defined for real-time transfers of audio/mpeg4-generic
4322     RTP streams with mode=rtp-midi.

4324 Security considerations:

4326     See Appendix G of this memo.

4328 Interoperability considerations:

4330     Except for the marker bit (Section 2.1), the packet formats
4331     for audio/rtp-midi and audio/mpeg4-generic (mode rtp-midi)
4332     are identical. The formats differ in use: audio/mpeg4-generic
4333     is for MPEG work, audio/rtp-midi is for all other work.

4335 Published specification:

4337     This memo, [1], and [5] are the normative references. In
4338     addition, references [12], [13], and [18] provide non-normative
4339     implementation guidance.

4341 Applications which use this media type:

4343     MPEG 4 servers and terminals that support [5].

4345 Additional information:

4347     None.

4349 Person & email address to contact for further information:

4351     John Lazzaro

4353 Intended usage:

4355     COMMON.

4357 Author/Change controller:

4359     John Lazzaro

4361 H.3 asc MIME Registration
4362 This section registers asc as a MIME subtype for the audio type.

4364 MIME media type name:

4366     audio

4368 MIME subtype name:

4370     asc

4372 Required parameters:

4374     none

4376 Optional parameters:

4378     none

4380 Encoding considerations:

4382     This type is only defined for data object (stored file)
4383     transfer. The native form of the data object is binary
4384     data padded to an octet boundary. The most common
4385     transports for the type are HTTP and SMTP.
4387 Security considerations:

4389     See Appendix G of this memo.

4391 Interoperability considerations:

4393     None.

4395 Published specification:

4397     The audio/asc data object is the AudioSpecificConfig
4398     binary data structure, which is normatively defined in [7].

4400 Applications which use this media type:

4402     MPEG 4 Audio servers and terminals which support
4403     audio/mpeg4-generic RTP streams for mode rtp-midi.

4405 Additional information:

4407     None.

4409 Person & email address to contact for further information:

4411     John Lazzaro

4413 Intended usage:

4415     COMMON.

4417 Author/Change controller:

4419     John Lazzaro

4421 I. References

4423 I.1 Normative References

4425 [1] MIDI Manufacturers Association. "The Complete MIDI 1.0 Detailed
4426 Specification", 1996.

4428 [2] Schulzrinne, H., Casner, S., Frederick, R., and V. Jacobson.
4429 "RTP: A transport protocol for real-time applications", work in
4430 progress, draft-ietf-avt-rtp-new-12.txt.

4432 [3] Schulzrinne, H., and S. Casner. "RTP Profile for Audio and Video
4433 Conferences with Minimal Control", work in progress,
4434 draft-ietf-avt-profile-new-13.txt.

4436 [4] van der Meer, J., Mackie, D., Swaminathan, V., Singer, D., and
4437 P. Gentric. "RTP Payload Format for Transport of MPEG-4 Elementary
4438 Streams", work in progress, draft-ietf-avt-mpeg4-simple-07.txt.

4440 [5] International Standards Organization. "ISO/IEC 14496 MPEG-4",
4441 Part 3 (Audio), Subpart 5 (Structured Audio), 2001.

4443 [6] Handley, M., Jacobson, V., and C. Perkins. "SDP: Session
4444 Description Protocol", work in progress,
4445 draft-ietf-mmusic-sdp-new-12.txt.

4447 [7] International Standards Organization. "ISO 14496 MPEG-4", Part 3
4448 (Audio), 2001.

4450 [8] Freed, N. and N. Borenstein. "MIME Part One: Format of Internet
4451 Message Bodies", RFC 2045, November 1996.

4453 [9] MIDI Manufacturers Association. "The MIDI Downloadable Sounds
4454 Specification", v98.2, 1998.

4456 [10] Crocker, D. and P. Overell. "Augmented BNF for Syntax
4457 Specifications: ABNF", RFC 2234, November 1997.

4459 [11] Bradner, S. "Key words for use in RFCs to Indicate Requirement
4460 Levels", BCP 14, RFC 2119, March 1997.

4462 I.2 Informative References

4464 [12] Lazzaro, J. and J. Wawrzynek. "A Case for Network Musical
4465 Performance", 11th International Workshop on Network and Operating
4466 Systems Support for Digital Audio and Video (NOSSDAV 2001), June 25-26,
4467 2001, Port Jefferson, New York.

4469 [13] Fober, D., Orlarey, Y. and S. Letz. "Real Time Musical Events
4470 Streaming over Internet", Proceedings of the International Conference
4471 on WEB Delivering of Music 2001, pages 147-154.

4473 [14] Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston, A.,
4474 Peterson, J., Sparks, R., Handley, M., and E. Schooler. "SIP: Session
4475 Initiation Protocol", RFC 3261, June 2002.

4477 [15] Rosenberg, J. and H. Schulzrinne. "An Offer/Answer Model with
4478 SDP", RFC 3264, June 2002.

4480 [16] Schulzrinne, H., Rao, A., and R. Lanphier. "Real Time Streaming
4481 Protocol (RTSP)", RFC 2326, April 1998.

4483 [17] Clark, D. D. and D. L. Tennenhouse. "Architectural considerations
4484 for a new generation of protocols", SIGCOMM Symposium on
4485 Communications Architectures and Protocols, (Philadelphia,
4486 Pennsylvania), pp. 200--208, IEEE, Sept. 1990.

4488 [18] Lazzaro, J., and J. Wawrzynek. "An Implementation Guide for RTP
4489 MIDI", work in progress,
4490 [19] Braden, R. et al. "Resource ReSerVation Protocol (RSVP) --
4491 Version 1 Functional Specification", RFC 2205, September 1997.

4493 [20] Freed, N., Klensin, J., and J. Postel. "MIME Part Four:
4494 Registration Procedures", RFC 2048, November 1996.

4496 J. Author Addresses

4498 John Lazzaro (corresponding author)
4499 UC Berkeley
4500 CS Division
4501 315 Soda Hall
4502 Berkeley CA 94720-1776
4503 Email: lazzaro@cs.berkeley.edu

4505 John Wawrzynek
4506 UC Berkeley
4507 CS Division
4508 631 Soda Hall
4509 Berkeley CA 94720-1776
4510 Email: johnw@cs.berkeley.edu

4512 K. Intellectual Property Rights Statement

4514 The IETF takes no position regarding the validity or scope of any
4515 intellectual property or other rights that might be claimed to pertain
4516 to the implementation or use of the technology described in this
4517 document or the extent to which any license under such rights might or
4518 might not be available; neither does it represent that it has made any
4519 effort to identify any such rights. Information on the IETF's
4520 procedures with respect to rights in standards-track and
4521 standards-related documentation can be found in BCP-11. Copies of claims
4522 of rights made available for publication and any assurances of licenses to
4523 be made available, or the result of an attempt made to obtain a general
4524 license or permission for the use of such proprietary rights by
4525 implementors or users of this specification can be obtained from the
4526 IETF Secretariat.

4528 The IETF invites any interested party to bring to its attention any
4529 copyrights, patents or patent applications, or other proprietary rights
4530 which may cover technology that may be required to practice this
4531 standard. Please address the information to the IETF Executive
4532 Director.

4534 L. Full Copyright Statement

4536 Copyright (C) The Internet Society (2002-2003). All Rights Reserved.
4538 This document and translations of it may be copied and furnished to
4539 others, and derivative works that comment on or otherwise explain it or
4540 assist in its implementation may be prepared, copied, published and
4541 distributed, in whole or in part, without restriction of any kind,
4542 provided that the above copyright notice and this paragraph are included
4543 on all such copies and derivative works. However, this document itself
4544 may not be modified in any way, such as by removing the copyright notice
4545 or references to the Internet Society or other Internet organizations,
4546 except as needed for the purpose of developing Internet standards in
4547 which case the procedures for copyrights defined in the Internet
4548 Standards process must be followed, or as required to translate it into
4549 languages other than English.

4551 The limited permissions granted above are perpetual and will not be
4552 revoked by the Internet Society or its successors or assigns.

4554 This document and the information contained herein is provided on an "AS
4555 IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK
4556 FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT
4557 LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT
4558 INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR
4559 FITNESS FOR A PARTICULAR PURPOSE.

4561 Acknowledgement

4563 Funding for the RFC Editor function is currently provided by the
4564 Internet Society.

4566 M. Change Log for

4568 [Note to RFC Editors: this Appendix, and its Table of Contents listing,
4569 should be removed from the final version of the memo]

4571 Chapter M (Appendix A.9) has been redesigned, to follow the semantic
4572 design of Chapters C and E. Several definitions in Appendix A.1 have
4573 been changed to reflect this change, as have the chapter inclusion
4574 semantics for Chapter M in Appendix C.1.3.

4576 Many small editorial changes throughout the document, to correct
4577 grammatical errors and improve phrasing.