idnits 2.17.1 draft-lazzaro-avt-mwpp-midi-nmp-00.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- ** Looks like you're using RFC 2026 boilerplate. This must be updated to follow RFC 3978/3979, as updated by RFC 4748. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- ** The document seems to lack a 1id_guidelines paragraph about 6 months document validity -- however, there's a paragraph with a matching beginning. Boilerplate error? ** The document is more than 15 pages and seems to lack a Table of Contents. == No 'Intended status' indicated for this document; assuming Proposed Standard Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** The document seems to lack an IANA Considerations section. (See Section 2.2 of https://www.ietf.org/id-info/checklist for how to handle the case when there are no actions for IANA.) ** The document seems to lack an Authors' Addresses Section. ** There are 2 instances of too long lines in the document, the longest one being 2 characters in excess of 72. == There are 3 instances of lines with non-RFC6890-compliant IPv4 addresses in the document. If these are example addresses, they should be changed. ** The document seems to lack a both a reference to RFC 2119 and the recommended RFC 2119 boilerplate, even if it appears to use RFC 2119 keywords. RFC 2119 keyword, line 219: '...R: reserved bits MUST be set to zero b...' RFC 2119 keyword, line 223: '...and in the MIDI data field MUST have a...' RFC 2119 keyword, line 224: '...sequent commands MAY use the running s...' RFC 2119 keyword, line 227: '... command payload MUST only contain the...' 
RFC 2119 keyword, line 460: '...e that receivers MUST use the LENGTH f...' (13 more instances...) Miscellaneous warnings: ---------------------------------------------------------------------------- == Using lowercase 'not' together with uppercase 'MUST', 'SHALL', 'SHOULD', or 'RECOMMENDED' is not an accepted usage according to RFC 2119. Please use uppercase 'NOT' together with RFC 2119 keywords (if that is what you mean). Found 'MUST not' in this paragraph: A note log will exist for a note number (coded by the 7-bit NOTENUM field) if a NoteOn command has occurred for this note number since the last checkpoint packet, and that this NoteOn command occurred more recently than a NoteOff command. The 7-bit VELOCITY field codes the velocity value for this NoteOn command; this field MUST not be zero. -- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. If not, you may need to add the pre-RFC5378 disclaimer. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- Couldn't find a document date in the document -- date freshness check skipped. Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) -- Missing reference section? '1' on line 1218 looks like a reference -- Missing reference section? '5' on line 1232 looks like a reference -- Missing reference section? '4' on line 1228 looks like a reference -- Missing reference section? '2' on line 1221 looks like a reference -- Missing reference section? '7' on line 1241 looks like a reference -- Missing reference section? 
'8' on line 1244 looks like a reference -- Missing reference section? '3' on line 1224 looks like a reference -- Missing reference section? '6' on line 1238 looks like a reference -- Missing reference section? '128' on line 1114 looks like a reference Summary: 7 errors (**), 0 flaws (~~), 3 warnings (==), 11 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 INTERNET-DRAFT John Lazzaro 3 October 1, 2001 John Wawrzynek 4 Expires: April 1, 2002 UC Berkeley 6 MWPP: A resilient MIDI RTP packetization for network musical performance 8 10 Status of this Memo 12 This document is an Internet-Draft and is subject to all provisions of 13 Section 10 of RFC2026. 15 Internet-Drafts are working documents of the Internet Engineering Task 16 Force (IETF), its areas, and its working groups. Note that other groups 17 may also distribute working documents as Internet-Drafts. 19 Internet-Drafts are draft documents valid for a maximum of six months 20 and may be updated, replaced, or obsoleted by other documents at any 21 time. It is inappropriate to use Internet- Drafts as reference material 22 or to cite them other than as "work in progress." 24 The list of current Internet-Drafts can be accessed at 25 http://www.ietf.org/1id-abstracts.html 27 The list of Internet-Draft Shadow Directories can be accessed at 28 http://www.ietf.org/shadow.html 30 Abstract 32 This memo describes the MIDI Wire Protocol Packetization (MWPP). 33 MWPP is a resilient RTP packetization for the MIDI wire protocol; 34 it is specialized for low-latency applications such as network 35 musical performance. MWPP defines a multicast-compatible recovery 36 system for gracefully handling lost and late packets during the 37 performance. 39 In a network musical performance system, incoming MWPP streams 40 control audio synthesis software running on each network host. 
This 41 software might use the MPEG 4 Structured Audio standard as a 42 normative framework for audio synthesis. To support MPEG 4 43 Structured Audio in an interoperable fashion, this memo describes 44 how to transport MWPP via the generic MPEG 4 RTP packetization. 46 1. Introduction 48 The MIDI standard [1] defines a wire protocol to interconnect electronic 49 musical instruments into a distributed real-time system, using short 50 coaxial "MIDI cables" for the physical layer. 52 When the MIDI standard first came into use in the early 1980s, most 53 electronic musical instruments used special-purpose analog and digital 54 circuitry to generate sound. In that era, a typical use of MIDI was to 55 control the sound synthesizers of several musical instruments from one 56 piano keyboard. In those MIDI systems, general-purpose computers did not 57 participate in audio processing; computers were relegated to the tasks 58 of recording, routing, and playing back MIDI data into the instruments. 60 Today, personal computers are capable of executing complex sound 61 synthesis software algorithms in real-time. If a computer has a MIDI 62 input jack, a musician can use a personal computer as a software-based 63 musical instrument, by connecting a MIDI controller (such as a piano 64 keyboard) to the computer. State of the art software synthesizers 65 provide low total latency (piano key press to speaker cone movement) and 66 low temporal jitter, and deliver a quality playing experience to the 67 performer. 69 If a personal computer acting as a real-time instrument is connected to 70 the Internet, and if a second similarly-configured computer is also 71 connected to the Internet at a different location, the two computers 72 could exchange MIDI data, and turn both local and remote MIDI data into 73 sound. If the nominal end-to-end latency is sufficiently low, musicians 74 using these systems can engage in a network musical performance [5]. 
76 Note that the programmable nature of software synthesis is essential for 77 network musical performance. With programmable synthesis, it is easy to 78 configure each network host to produce identical audio in response to 79 the same MIDI data stream, creating a sense of telepresence. 81 This memo describes the MIDI Wire Protocol Packetization (MWPP), a 82 resilient RTP [4] packetization for the low-latency transmission of the 83 subset of the MIDI wire protocol that is useful for real-time 84 performance. In this framework, each network host sends MWPP RTP packets 85 coding the MIDI events of its local player to remote players, and 86 receives the MWPP RTP packets of the remote players. Each sender- 87 receiver transport pair acts as a virtual MIDI extension cable. 89 Network musical performance systems work well when the nominal total 90 latency between the participating musicians is reasonably short [5]. 91 This memo does not address the nominal latency issue; we assume a system 92 using MWPP has a sufficiently low nominal total latency to support the 93 application. 95 MWPP is designed for use over UDP and other unreliable datagram 96 transport: the design goal is graceful recovery from lost and late UDP 97 packets, without using packet retransmission. MWPP also supports 98 reliable transport such as TCP: it includes features to minimize 99 bandwidth overhead when used with TCP. 101 Sending the MIDI wire protocol over unreliable transport is not trivial. 102 The MIDI standard defines a set of commands, that reflect the gestures 103 musicians make in playing their instruments ("NoteOn" command to start a 104 new note, "NoteOff" command to end the note, etc). Gestural commands 105 make MIDI data streams very compact, but also very fragile: a single 106 lost "NoteOff" command could result in a sound that sustains 107 indefinitely long. 
MWPP defines a recovery system that ensures this 108 sort of catastrophic command loss does not indefinitely impact a 109 performance. The MWPP recovery system uses domain knowledge about how 110 MIDI is used to control software synthesizers. 112 The recent MPEG 4 Audio standard includes a normative decoder, MPEG 4 113 Structured Audio [2], that defines a language and run-time environment 114 for software-based electronic musical instruments. The standard 115 normatively supports real-time MIDI control of instruments, using a 116 subset of the MIDI wire protocol command set encoded into real-time 117 streaming units (midi_event chunks in SA_access_units). 119 MWPP relates to MPEG 4 Structured Audio in two distinct ways: 121 o MWPP normatively includes the Structured Audio standard [2]. 122 [2] describes the interaction of a MIDI wire protocol 123 command stream and a software synthesizer, and defines the 124 subset of the MIDI protocol that is useful for software 125 synthesizers. We adopt these conventions through normative 126 inclusion, rather than recapitulate them in this memo, and 127 use them to define MWPP operation. 129 o Because we adopt Structured Audio MIDI semantics for MWPP, 130 the payload of MWPP is an appropriate resilient coding for 131 MPEG 4 Structured Audio. However, to maximize interoperability, 132 MPEG 4 Structured Audio streams should use the "RFC-generic" 133 MPEG 4 Systems packetization [7], not the RTP packetization for 134 MWPP defined in this memo. Therefore, this memo also includes 135 a way to use the payload of MWPP together with RFC-generic, 136 modeled after the CELP and AAC definitions for RFC-generic 137 defined in [8]. 139 The remainder of this memo describes MWPP in detail, and assumes a 140 working knowledge of the MIDI standard [1] and the MIDI-related sections 141 of the Structured Audio standard [2].
143 Readers unacquainted with [1] and [2] may read Section 6 of [5], which 144 provide sufficient detail to understand this memo. See [3] for a 145 software implementation of MWPP. 147 The bulk of the memo describes MWPP for RTP transport; in Sections 9 and 148 12, we describe RFC-generic transport of MWPP for MPEG 4 applications. 149 However, we refer to semantics of MPEG 4 Structured Audio throughout the 150 memo, as the normative framework for describing the interaction between 151 a software synthesizer and the MIDI wire protocol. Rather than refer to 152 the "software synthesizer," we refer to the "Structured Audio decoder." 153 We use the normative language of Structured Audio to describe the 154 interaction between the software synthesizer and the MIDI wire protocol. 156 2. Sending MWPP RTP packets 158 This section describes sending MWPP RTP packets. We describe MWPP for 159 use over unreliable datagram transport without sender proxies. Reliable 160 transport and sender proxies are described in Section 10 of this memo. 162 0 1 2 3 163 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 164 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 165 |V=2|P|X| CC |M| PT | Sequence Number | 166 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 167 | Timestamp | 168 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 169 | SSRC | 170 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 171 | CSRCs | 172 | ... | 173 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 175 Figure 1 -- RTP header 177 An MWPP packet begins with a standard RTP header (Figure 1). MWPP does 178 not use header extensions or the marker bit. As is standard, each MWPP 179 RTP packet sent has its sequence number incremented by one modulo 65536. 181 MWPP packets encode MIDI commands that are scheduled for execution at a 182 particular moment in time on the sender's Structured Audio decoder. 
The 183 RTP timestamp field encodes the moment of execution. 185 The timestamp clock is set by the SDP rtpmap attribute srate (see 186 Sections 11 and 12 for details). For simple uses of MWPP, this srate 187 value is identical to the Structured Audio global srate parameter, which 188 codes the audio sampling rate. For example, if srate has a value of 189 44100Hz, two MWPP packets coding MIDI commands that are executed 2 190 seconds apart on the sender's SA decoder have RTP header timestamps that 191 differ by 88200. 193 The timestamp is a monotonically increasing function of the sequence 194 number, as expressed in modulo arithmetic. As is standard in RTP, the 195 timestamp field is initialized to a randomly chosen value. 197 0 1 2 3 198 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 199 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 200 |R|R| LEN | MIDI Command Payload ... | 201 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 202 | Recovery Journal ... | 203 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 205 Figure 2 -- MWPP payload 207 The MWPP payload follows the RTP header. The MWPP payload has two 208 variable-length sections: a MIDI command section, and a recovery 209 journal. 211 The MIDI command section encodes the MIDI commands executed on the 212 sender's SA decoder at the moment encoded by the RTP timestamp. The 213 MIDI command section has a one-octet header, followed by the MIDI 214 command payload. 216 The header codes the length (in units of octets) of the data field, via 217 the 6-bit value LEN. A LEN value of 0 is legal, and codes an empty 218 payload. The header also contains two reserved R bits. All reserved 219 bits in this memo are named R: reserved bits MUST be set to zero by 220 senders and ignored by receivers. 222 The MIDI command payload must contain zero or more complete MIDI 223 commands. 
The first MIDI command in the MIDI data field MUST have a 224 status octet, but subsequent commands MAY use the running status data 225 compression scheme [1]. 227 The MIDI command payload MUST only contain the MIDI commands that are 228 legally coded in the midi_event chunk of SA_access_unit real-time 229 streaming units of Structured Audio (i.e. all MIDI commands except the 230 MIDI System command (0xF)). Note that all commands in the MIDI command 231 payload are scheduled for the same moment in time; the 320 us/octet 232 serial delay of a MIDI cable is not emulated. 234 The MIDI command section of the MWPP payload is followed by the recovery 235 journal. Information encoded in the recovery journal enables the 236 receiver to gracefully recover from the loss of all RTP packets sent 237 since an earlier RTP packet, called the checkpoint packet. 239 The growth in size of the recovery journal is limited in two ways. 241 o The recovery journal encodes the minimal session history 242 that is needed for recovery, not a trace log of all MIDI 243 commands sent since the checkpoint packet. 245 o A sender monitors the "last extended sequence number received" 246 field of RTCP RR reports [4], and advances the checkpoint packet 247 to reflect the known state of all receivers. This mechanism is 248 multicast compatible, and does not require a change in the RTCP 249 mechanism as described in [4]. 251 Detailed information about the format of the recovery journal appears in 252 Section 5. 254 3. Receiving MWPP RTP packets 256 In this section, we describe how receivers process RTP MWPP packets. We 257 assume that the RTP sender and receiver use a transmission channel with 258 sufficiently low nominal latency to support the application, but that 259 transient network disturbances may result in lost packets, and in 260 packets received with significantly longer latencies than the nominal 261 latency.
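As an illustration of the MIDI command section rules given in Section 2 (one header octet carrying two R bits and the 6-bit LEN, a first command that carries a status octet, optional running status afterwards, and no MIDI System commands), the following Python sketch splits an MWPP payload into its commands and the remaining recovery journal octets. It is a minimal sketch, not a reference implementation: the function name and the tuple representation of commands are hypothetical, and it assumes the two R bits are the most significant bits of the header octet, as drawn in Figure 2.

```python
# Number of data octets that follow each channel status nibble (MIDI 1.0).
DATA_LEN = {0x8: 2, 0x9: 2, 0xA: 2, 0xB: 2, 0xC: 1, 0xD: 1, 0xE: 2}


def parse_midi_command_section(payload: bytes):
    """Return (commands, journal_octets) for one MWPP payload.

    Each command is a tuple (status, data1[, data2]). Hypothetical helper;
    assumes the header octet layout R R LEN(6), most significant bit first.
    """
    if not payload:
        raise ValueError("empty MWPP payload")
    length = payload[0] & 0x3F           # low 6 bits are LEN; R bits are ignored
    data = payload[1:1 + length]
    if len(data) != length:
        raise ValueError("truncated MIDI command payload")

    commands, status, i = [], None, 0
    while i < len(data):
        if data[i] & 0x80:               # a status octet starts this command
            status = data[i]
            i += 1
            if status >> 4 == 0xF:
                raise ValueError("MIDI System commands are not allowed")
        elif status is None:             # running status needs an earlier status
            raise ValueError("first command must carry a status octet")
        n = DATA_LEN[status >> 4]
        args = data[i:i + n]
        if len(args) != n or any(b & 0x80 for b in args):
            raise ValueError("incomplete MIDI command")
        commands.append((status, *args))
        i += n
    return commands, payload[1 + length:]
```

For example, a payload whose command section holds a NoteOn followed by a second note under running status yields two commands that share the same status octet; everything after the LEN octets is handed back untouched as the recovery journal.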
263 When a new RTP packet arrives, the receiver first examines the timestamp 264 field, and classifies the packet as either "ontime" or "late." To 265 perform this classification, the receiver typically maintains a model of 266 the latency of the channel (see Appendix B of [5] for an example latency 267 model, which is implemented in [3]). 269 The receiver then examines the RTP sequence number, and classifies the 270 situation as: 272 o Normal. The extended packet sequence number of the new RTP packet is 273 one greater than the extended sequence number of the last RTP 274 packet received. 276 o Sequence Break. The extended packet sequence number of the new RTP packet 277 is greater than in the normal case. This classification also applies 278 if the new RTP packet is the first RTP packet received in the session. 280 o Out of Order. The extended packet sequence number of the new RTP packet 281 is less than in the normal case. 283 The most common occurrence is for packets to be normal/ontime. In this 284 case, the receiver schedules all commands in the MIDI command section of 285 the MWPP payload for execution on the local SA decoder. The details of 286 scheduling are implementation-specific: simple decoders (such as [3]) 287 may execute MIDI commands as soon as they arrive, but other approaches 288 are possible. 290 If a packet is normal/late, the MIDI command section of the MWPP payload 291 is executed as in the normal/ontime case, except for MIDI NoteOn 292 commands with non-zero velocity, which are discarded. These semantics 293 prevent "straggler notes" from disturbing a performance, quiet "soft 294 stuck notes" immediately, and update all other MIDI state in an 295 acceptable way. 297 If a packet is a sequence break packet, the receiver first processes the 298 recovery journal section of the payload, as described in Section 5. This 299 processing may result in the execution of one or more MIDI commands to 300 gracefully recover from the packet loss.
After processing the recovery 301 journal, the receiver processes the MIDI command section of a sequence 302 break packet as if it were a "normal" packet. 304 If a packet is an out of order packet, its MIDI command and recovery 305 journal sections are ignored. 307 4. Sender Addendum: Guard Packets 309 One shortcoming of MWPP as presented in Sections 2 and 3 is that senders 310 are not required to maintain a minimum sending rate for RTP packets. 311 This can cause problems if a packet that encodes a MIDI NoteOff event is 312 lost. Until another packet is sent, the recovery journal mechanism 313 cannot function to quiet the stuck note. 315 MWPP senders may address this problem by sending RTP packets with empty 316 MIDI command sections at regular intervals: the recovery journal section 317 of these "guard packets" serves to quiet stuck notes and update the MIDI 318 state of the receiver's SA decoder. Guard packets also serve to prevent 319 intermediaries (such as Network Address Translators) from timing out 320 their services. 322 The use of guard packets by senders is implementation-dependent (but see 323 Section 14). 325 5. The Recovery Journal 327 The recovery journal section of MWPP RTP packets has a three-level 328 structure: 330 o Top-level header. Encodes recovery journal structure. 332 o Channel journal header. Encodes recovery information for a 333 single MIDI channel (a MIDI command executes on one of 16 334 MIDI channels). 336 o Chapters. Describe recovery information for a single 337 MIDI command type. Chapters are specialized to the 338 semantics of the command, as defined in [1] and [2]. 340 In this section, we specify the format of the top-level and channel 341 journal headers. Subsequent sections describe how senders and receivers 342 use these headers as part of the recovery system. Appendices describe 343 the semantics of each journal chapter.
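The receiver rules of Section 3 can be condensed into a small dispatch function. The Python sketch below is illustrative only: the function names and the returned dictionary are hypothetical, commands are assumed to be tuples of the form (status, data1[, data2]), and the ontime/late decision is assumed to come from a separate latency model. It also assumes that the late-packet filter applies to sequence break packets, since their MIDI command section is processed "as if it were a normal packet".

```python
def classify_sequence(ext_seq, last_ext_seq):
    """Classify a packet by extended sequence number (Section 3)."""
    if last_ext_seq is None or ext_seq > last_ext_seq + 1:
        return "break"          # first packet of the session, or a gap
    if ext_seq == last_ext_seq + 1:
        return "normal"
    return "out_of_order"


def is_sounding_note_on(cmd):
    """True for a NoteOn (0x9) command with non-zero velocity."""
    status, *data = cmd
    return status >> 4 == 0x9 and data[1] > 0


def plan_actions(ext_seq, last_ext_seq, late, commands):
    """Decide whether to run journal recovery and which commands to execute.

    Hypothetical helper: returns a dict with 'use_journal' (bool) and
    'commands' (the commands that survive the late-packet filter).
    """
    kind = classify_sequence(ext_seq, last_ext_seq)
    if kind == "out_of_order":
        # Both the MIDI command section and the journal are ignored.
        return {"use_journal": False, "commands": []}
    if late:
        # Late packets drop NoteOn commands with non-zero velocity.
        commands = [c for c in commands if not is_sounding_note_on(c)]
    return {"use_journal": kind == "break", "commands": list(commands)}
```

A caller would run the recovery journal first when 'use_journal' is true, then schedule the returned commands on the local SA decoder.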
345 0 1 2 3 346 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 347 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 348 |S|A|K|R|TOTCHAN| Checkpoint Packet Seqnum | Channels ... | 349 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 351 Figure 3 -- Top-level recovery journal format 353 Figure 3 shows the top-level structure of the recovery journal. A 354 recovery journal consists of a 3-octet header, followed by a list of 355 channel journals. Each channel journal encodes recovery information for a 356 single MIDI channel. 358 If the A bit is set in the recovery journal header, the recovery journal 359 is "empty", and contains no channel journals. If the A bit is clear, the 360 channel journal list contains (TOTCHAN + 1) channel journals. 362 The recovery journal header includes an S bit. S bits appear on 363 structures throughout the recovery journal format, with uniform 364 semantics: if the S bit is set, the structure may be ignored if a 365 sequence break of exactly one RTP packet triggered the recovery journal 366 processing. 368 A set S bit on the recovery journal header indicates that the previously sent 369 packet is a guard packet (lost guard packets can be ignored because 370 their MIDI command payloads are empty). 372 The 16-bit Checkpoint Packet Seqnum field codes the sequence number of 373 the checkpoint packet used by the sender to create this journal. If this 374 sequence number has changed since the last MWPP packet sent, the K bit 375 is set; otherwise it is clear. 377 0 1 2 3 378 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 379 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 380 |S| CHAN |R| LENGTH |P|W|N|A|T|C|R|R| Chapters ... | 381 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 383 Figure 4 -- Channel journal format 385 Figure 4 shows the structure of a channel journal: a 3-octet header, 386 followed by a list of leaf elements called chapters.
A channel journal 387 encodes recovery information for commands sent on the MIDI channel coded 388 by the 4-bit CHAN header field. The 10-bit LENGTH field codes the number 389 of octets in the channel journal, including the header. 391 The third octet of the channel journal header is the Table of Contents 392 (TOC) of the channel journal. The TOC is a set of bits to encode the 393 presence of a chapter in the journal. Each chapter contains information 394 to recover from the loss of a certain class of MIDI commands: 396 o Chapter P: MIDI Program Change (0xC) 397 o Chapter W: MIDI Pitch Wheel (0xE) 398 o Chapter N: MIDI NoteOff (0x8), NoteOn (0x9) 399 o Chapter A: MIDI Poly Aftertouch (0xA) 400 o Chapter T: MIDI Channel Aftertouch (0xD) 401 o Chapter C: MIDI Control Change (0xB) 403 Chapters appear in a list following the header, in order of their 404 appearance in the TOC. The Appendices of this memo describe the format 405 of each chapter, and explain how senders and receivers use each chapter. 407 6. Sending Recovery Journals 409 In this section of the memo, we briefly describe how senders create 410 recovery journals. 412 Senders maintain state about MIDI commands sent since the last 413 checkpoint packet, using a data structure that records the RTP sequence 414 number associated with each MIDI command (see [3] for a sample 415 implementation). We refer to this data structure as the "recovery 416 journal data structure" below. 418 To send a new MWPP packet, the sender first creates the MIDI command 419 section of the packet. Then, the sender traverses the recovery journal 420 data structure to build the new recovery journal. Typically, the data 421 structure elements match the structure of recovery journal chapters and 422 headers, so that simple memory copies act to build a new journal. 424 Timing-sensitive chapter data (such as the Y bits of Chapter N, as 425 explained in Appendix A.3) are updated during the build process. 
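To make the header layouts of Figures 3 and 4 concrete, the following Python sketch packs and unpacks the 3-octet top-level header and the 3-octet channel journal header, and walks a journal from one channel journal to the next using LENGTH. It is a sketch under stated assumptions rather than the memo's own code: all function names are hypothetical, bit 0 in the figures is taken to be the most significant bit, multi-octet fields are assumed to be in network byte order, and reserved R bits are written as zero.

```python
import struct

TOC_BITS = "PWNATC"   # chapter flags, most significant bit first; two R bits follow


def pack_journal_header(s, a, k, totchan, checkpoint_seq):
    """Top-level header (Figure 3): S A K R TOTCHAN(4) + 16-bit checkpoint."""
    first = (s << 7) | (a << 6) | (k << 5) | (totchan & 0x0F)   # R bit stays zero
    return struct.pack("!BH", first, checkpoint_seq & 0xFFFF)


def unpack_journal_header(buf):
    first, checkpoint_seq = struct.unpack_from("!BH", buf)
    return {"S": first >> 7, "A": (first >> 6) & 1, "K": (first >> 5) & 1,
            "TOTCHAN": first & 0x0F, "checkpoint_seq": checkpoint_seq}


def pack_channel_header(s, chan, length, chapters):
    """Channel journal header (Figure 4): S CHAN(4) R LENGTH(10) + TOC octet."""
    word = (s << 15) | ((chan & 0x0F) << 11) | (length & 0x3FF)  # R bit stays zero
    toc = 0
    for i, name in enumerate(TOC_BITS):
        if name in chapters:
            toc |= 0x80 >> i
    return struct.pack("!HB", word, toc)


def unpack_channel_header(buf, offset=0):
    word, toc = struct.unpack_from("!HB", buf, offset)
    chapters = "".join(n for i, n in enumerate(TOC_BITS) if toc & (0x80 >> i))
    return {"S": word >> 15, "CHAN": (word >> 11) & 0x0F,
            "LENGTH": word & 0x3FF, "chapters": chapters}


def channel_offsets(journal):
    """Yield the offset of each channel journal, stepping by LENGTH."""
    top = unpack_journal_header(journal)
    if top["A"]:                       # A bit set: journal is empty
        return
    offset = 3                         # skip the top-level header
    for _ in range(top["TOTCHAN"] + 1):
        yield offset
        offset += unpack_channel_header(journal, offset)["LENGTH"]
```

Stepping by LENGTH, which counts the header octets as well, lets a reader skip a channel journal without knowing the size of any chapter inside it.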
427 After the recovery journal is created, the sender inserts the MIDI 428 commands encoded in the MIDI command section of the new packet into the 429 recovery journal data structure. Data structure elements corresponding 430 to the S bits of the recovery journal are updated during this insertion. 432 The reception of an RTCP RR packet may also result in an update of the 433 recovery journal data structure. The sender first examines the "last 434 extended sequence number received" field of the received RTCP RR packet, 435 and combines it with the RTCP RR data from other receivers in the 436 session. If the sender determines that the checkpoint packet may be safely 437 updated from its current value, it traverses the recovery journal data 438 structure, pruning MIDI data wherever possible, in order to reduce the 439 size of future recovery journals. 441 7. Receiving Recovery Journals 443 In this section, we briefly describe how receivers parse recovery 444 journals. 446 In the case of a loss of a single RTP packet, the receiver uses the S 447 bits of the recovery journal to skip over channels and chapters that do 448 not encode information about the lost packet. For each chapter whose S 449 bit is clear, the receiver executes the chapter-specific recovery 450 algorithm described in the Appendices. 452 In the case of a multi-packet loss, the S bits are ignored, and each 453 chapter of each channel journal undergoes the recovery procedure 454 described in the Appendices. 456 The recovery algorithms for most chapters require information about MIDI 457 commands received in previous RTP packets. See [3] for a sample 458 implementation of data structures to efficiently maintain this state. 460 Note that receivers MUST use the LENGTH field of the channel journal 461 header to traverse from channel journal to channel journal, and not rely on the sizes of 462 the individual chapters.
This restriction is needed for backward 463 compatibility, as the R bits in the TOC may be used for new chapters in 464 future versions of MWPP. 466 8. MWPP Startup Issues 468 The recovery journal mechanism depends on senders tracking the status of 469 receivers, by: 471 o Knowing the first MWPP RTP packet sent to a new receiver. 472 o Examining the "last extended sequence number received" field 473 of RTCP RR reports. 475 In simple unicast applications, both mechanisms work well. In more 476 complex situations, such as true or simulated multicast transport, 477 senders may not know of the presence of a receiver until the first RTCP 478 packet arrives. In this case, lost packets early in a session may not be 479 protected by the recovery journal mechanism, because the sender 480 "incorrectly" moved the checkpoint packet. 482 Receivers can detect this situation by using the Checkpoint Packet 483 Seqnum field coded in the recovery journal header, as shown in Figure 3. 484 Note that this technique requires that receivers examine the recovery 485 journal of every MWPP packet received, although the K bit that marks 486 checkpoint updates minimizes the work required per packet. 488 To detect this situation, the receiver examines the Checkpoint Packet 489 Seqnum, and checks whether it is consistent with its own reception 490 history. If an inconsistency is detected, the receiver should 491 assume that the sender is not aware of its existence, and take 492 precautions to ensure that catastrophic MIDI errors do not occur (for 493 example, NoteOn commands could be "timed out" with a matching NoteOff 494 command after a suitably long period of time). 496 9. MWPP Transport over MPEG 4 "RFC-generic" RTP Packetization 498 MWPP, as described in this memo, is a stand-alone RTP packetization for 499 the MIDI wire protocol. Section 2 describes the packet format of MWPP 500 RTP packets: a standard RTP header (Figure 1) followed by the MWPP 501 payload (Figure 2).
503 For maximum interoperability, MPEG 4 Structured Audio systems should not 504 use this stand-alone RTP packetization. Instead, the generic RTP 505 packetization for MPEG 4 described in [7] ("RFC-generic") should be 506 used. 508 In this section, we describe how to incorporate the MWPP payload into 509 RFC-generic. In Section 12, we describe how to configure RFC-generic to 510 support MWPP. This section borrows heavily from the MPEG 4 Audio AAC and 511 CELP work described in [8]. 513 +---------+-----------+-----------+---------------+ 514 | RTP | AU Header | Auxiliary | Access Unit | 515 | Header | Section | Section | Data Section | 516 +---------+-----------+-----------+---------------+ 518 <----------RTP Packet Payload-----------> 520 Figure 5: Data sections within an RTP packet (from [8], describing [7]). 522 Figure 5 shows the RFC-generic RTP packet, consisting of a standard RTP 523 header (Figure 1) followed by the RFC-generic payload. 525 The main purpose of an RFC-generic packet is to carry MPEG 4 Access 526 Units; this data is held in the final section of the payload (Access 527 Unit Data Section). The AU Header Section describes the way that Access 528 Units are packed into the Access Unit Data Section: for example, the 529 data section may contain multiple Access Units, or a fragment of a 530 single Access Unit. The RFC-generic packet may also contain ancillary 531 data, in the Auxiliary Section. 533 To incorporate MWPP into RFC-generic, we consider the payload of MWPP 534 (Figure 2) to be the Access Unit. We place exactly one MWPP Access Unit 535 into the Access Unit Data Section of each RFC-generic RTP packet; the 536 MWPP Access Unit is never fragmented. The AU Header Section and 537 Auxiliary Section are both always empty. 
539 The RTP Header section of the RFC-generic packet is essentially the same 540 as the RTP header for MWPP described in Section 2, with several 541 exceptions: 543 o The marker bit is always set to 1, indicating a complete MWPP 544 Access Unit in the Access Unit Data Section. 546 o A random offset for the Timestamp field should be avoided. 548 10. Reliable Transport and Proxies 550 The recovery journal adds significant overhead to MWPP. When sending 551 MWPP over reliable transport (TCP, or a point-to-point reliable IP link) 552 the recovery journal section of MWPP packets may be safely deleted 553 without affecting the proper operation of the system. 555 MWPP recovery journals may also be safely deleted if the SAOL program 556 running on the Structured Audio decoders uses application-layer recovery 557 techniques that make the MWPP recovery journal scheme redundant. 559 Senders and receivers MUST use the Session Description Protocol (SDP) 560 [6] mechanism described in Sections 11 and 12 to indicate that an MWPP 561 session does not use the recovery journal mechanism. 563 MWPP packets without recovery journals are also used in association with 564 sender and receiver proxies. Sender and receiver proxies are used when 565 Structured Audio clients are running on thin clients, such as electronic 566 piano keyboards. These keyboards may have special-purpose audio 567 processing hardware for Structured Audio decoding, but may have simple 568 general-purpose processors that cannot handle the overhead of recovery 569 journal send and receive operations. 571 If the thin client has a reliable channel to a suitable host, sender and 572 receiver proxies may be used to offload the recovery journal processing 573 task. In this scheme, the thin client would send and receive MWPP 574 packets without recovery journals to the proxies. 
The sending proxy
575   would add recovery journals to outgoing packets, and the receiving proxy
576   would handle the lost and late packet processing described in Section 3.
577   The sender and receiver proxies MUST also handle the RTCP duties for the
578   thin client, because the thin client is not able to compute transport
579   statistics correctly.

581   11. Session Description Protocol

583   This section describes Session Description Protocol (SDP) [6]
584   definitions for MWPP transport directly over RTP. Section 12 describes
585   the SDP definitions for MWPP transport over the MPEG 4 "RFC-generic" RTP
586   packetization.

588   The MIME name for this packetization is mwpp. The SDP rtpmap attribute
589   is declared as

591      a=rtpmap: <payload type> mwpp/<srate>/<krate>/<rj>

593   The <srate> parameter codes the audio sampling rate used for the RTP
594   timestamp field. Typically, this value corresponds to the srate global
595   parameter value of the SAOL program (see [2] subpart 5.8.5.2.1). We
596   specify <srate> in the rtpmap so that musicians can choose a different
597   local srate value without disturbing the MWPP system.

599   The <krate> parameter codes the control sampling rate, which typically
600   corresponds to the krate global parameter value of the SAOL program (see
601   [2] subpart 5.8.5.2.2). This memo does not refer to the krate value; we
602   include it in the rtpmap for possible future use, since like srate,
603   musicians may wish to choose a different local krate value.

605   The <rj> parameter codes the presence or absence of recovery journals in
606   MWPP packets (see Section 10 for details). The two valid values for <rj>
607   are "rj" and "no-rj". If the parameter does not exist, its value is
608   assumed to be "rj".
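As an illustration only, the following minimal Python sketch shows how
a receiver might parse an mwpp rtpmap line. It assumes the attribute
carries a payload number followed by mwpp, the srate, the krate, and
an optional "rj" or "no-rj" token separated by slashes. The names
MwppRtpmap and parse_mwpp_rtpmap are hypothetical and are not defined
by this memo.

```python
from typing import NamedTuple


class MwppRtpmap(NamedTuple):
    """Hypothetical container for the fields of an mwpp rtpmap line."""
    payload_type: int
    srate: int
    krate: int
    recovery_journal: bool


def parse_mwpp_rtpmap(line: str) -> MwppRtpmap:
    """Parse an 'a=rtpmap:' line that uses the mwpp encoding name."""
    prefix = "a=rtpmap:"
    if not line.startswith(prefix):
        raise ValueError("not an rtpmap attribute")
    fields = line[len(prefix):].split()
    if len(fields) != 2:
        raise ValueError("expected a payload number and an encoding")
    payload_type = int(fields[0])
    parts = fields[1].split("/")
    if parts[0] != "mwpp" or len(parts) not in (3, 4):
        raise ValueError("not an mwpp rtpmap")
    srate, krate = int(parts[1]), int(parts[2])
    # An absent final parameter is assumed to mean "rj".
    rj = parts[3] if len(parts) == 4 else "rj"
    if rj not in ("rj", "no-rj"):
        raise ValueError("the last parameter must be rj or no-rj")
    return MwppRtpmap(payload_type, srate, krate, rj == "rj")


print(parse_mwpp_rtpmap("a=rtpmap: 96 mwpp/44100/1260/rj"))
```

A receiver could use the recovery_journal flag to decide whether to
expect a recovery journal section in each incoming packet.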
610   For example, the following lines bind the packetization to dynamic
611   payload number 96, and specify an srate of 44100 Hz, a krate of 1260
612   Hz, and the presence of a recovery journal in each RTP packet:

614      m=audio 5004 RTP/AVP 96
615      c=IN IP4 171.64.92.160
616      a=rtpmap: 96 mwpp/44100/1260/rj

618   Note that the packetization does not directly support multiple
619   16-channel MIDI Input sources. Different UDP ports should be used in
620   this case, each devoted to a single source:

622      m=audio 5004 RTP/AVP 96
623      c=IN IP4 171.64.92.160
624      a=rtpmap: 96 mwpp/44100/1260/rj
625      m=audio 5006 RTP/AVP 97
626      c=IN IP4 171.64.92.160
627      a=rtpmap: 97 mwpp/44100/1260/rj

629   Note that the SDP does not include a binary encoding of the SAOL program
630   to run on the decoder (StructuredAudioSpecificConfig). This memo assumes
631   that StructuredAudioSpecificConfig is sent out-of-band.

633   Finally, note that MWPP is self-framing, and so TCP transport is
634   possible without explicit framing.

636   12. Session Description Protocol and MPEG 4 "RFC-generic" transport

638   This section describes Session Description Protocol (SDP) [6]
639   definitions for MWPP transport over the MPEG 4 "RFC-generic" RTP
640   packetization.

642   The MIME name for this packetization is mpeg-generic. The SDP rtpmap
643   attribute is declared as

645      a=rtpmap: <payload type> mpeg-generic/<srate>/<krate>/<rj>

647   The definitions of srate, krate, and rj are identical to the
648   descriptions in Section 11. Note that srate functions as the RTP clock.

650   The SDP fmtp attribute configures RFC-generic for MWPP transport, as
651   shown below:

653      a=fmtp: <payload type> streamtype=5; profile-level-id=15; mode=SA-mwpp;

655   To signal SingleSL mode, we omit the ConstantSize and SizeLength format
656   parameters from the fmtp attribute. The StructuredAudioSpecificConfig is
657   sent by other means, and so AudioSpecificConfig() is not used. The
658   values for streamtype and profile-level-id are tentative, pending a
659   check of the relevant standards documents.

661   13.
Security Considerations

663   Cryptographic authentication of incoming RTP and RTCP packets is highly
664   recommended when using MWPP. Without such protections, attackers could
665   inject forged MIDI commands into an ongoing session, potentially damaging
666   speakers and eardrums. An attacker could also craft RTP and RTCP packets
667   to exploit known bugs in the client, and take effective control of a
668   client machine.

670   14. Congestion Control

672   MWPP has congestion control issues that are unique for an RTP audio
673   packetization. When used for network musical performance, the packet
674   rate is linked to the gestural rate of a human performer.

676   MWPP implementations SHOULD sense the MIDI stream for command patterns
677   that result in excessive packet rates, and filter these streams as part
678   of MWPP to reduce the packet rate.

680   In addition, the guard packet mechanism described in Section 4 of this
681   memo is a possible source of congestion control problems. Implementers
682   MUST ensure that the guard packet strategies of senders are well behaved
683   with respect to congestion control.

685   Appendix A.1. Chapter P: MIDI Program Change

687   Chapter P protects against the loss of MIDI Program Change commands,
688   which the Structured Audio standard uses to bind SAOL instruments to
689   MIDI channels. If a Program Change command is lost, notes played on a
690   channel will sound with the incorrect timbre, or perhaps not sound at
691   all.

693   To prepare for recovery, the receiver should store state for each
694   channel, to indicate the program value of the last Program Change
695   command received on this channel and the Bank Select values (Coarse and
696   Fine) that were in effect at the time this Program Change command
697   executed. The Bank Select values are issued via the MIDI Control Change
698   command, and act to extend the range of program values.
The stored state
699   should also include flag bits to signify the null cases of no Program
700   Change received, no Bank Select Coarse value received, and no Bank
701   Select Fine value received.

703   The encoding for Chapter P is shown below:

705       0                   1                   2
706       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3
707      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
708      |S|   PROGRAM   |C| BANK-COARSE |F|  BANK-FINE  |
709      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

711   The chapter has a fixed size of 24 bits. If the S bit is set to 1, the
712   previous packet sent did not include a Program Change command on this
713   channel, and the receiver can skip to the next Chapter if it is
714   recovering from the loss of a single packet.

716   The PROGRAM field indicates the program value of the last Program Change
717   command sent on this channel. If a Control Change command for the Bank
718   Select Coarse controller was sent before this Program Change command,
719   the C bit is set to 1, and the BANK-COARSE field is the Bank Select
720   Coarse controller value that was sent. The F bit and BANK-FINE field
721   code the Bank Select Fine value in the same manner.

723   The receiver should compare the values in Chapter P with the stored
724   state for this channel, to determine if one or more Program Change
725   commands were lost. If a loss is detected, the receiver should execute
726   the Program Change command coded in Chapter P, and update its own
727   recovery state.

729   Appendix A.2. Chapter W: MIDI Pitch Wheel

731   Chapter W protects against the loss of MIDI Pitch Wheel commands. A
732   common use of the Pitch Wheel command is to transmit the current
733   position of a "pitch wheel" controller placed on the side of piano
734   controllers, which players can use to dynamically alter the pitch of all
735   depressed keys.
737   Structured Audio makes the current value of the Pitch Wheel available to
738   SAOL programmers in the MIDIWheel standard name, which programmers
739   typically use for continuous modification of instrument models, in a
740   manner similar in spirit to the original pitch bend semantics of the
741   controller. The recovery mechanisms in Chapter W are designed to
742   protect the Pitch Wheel data stream for these types of SAOL programs.

744   To prepare for recovery, the receiver should store state for each
745   channel, that codes the wheel value for the last Pitch Wheel command
746   received, along with a flag bit to signify the null case of no Pitch
747   Wheel command received.

749   The encoding for Chapter W is shown below:

751       0                   1
752       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
753      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
754      |S|    FIRST    |R|   SECOND    |
755      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

757   The chapter has a fixed size of 16 bits. If the S bit is set to 1, the
758   previous packet sent did not include a Pitch Wheel command on this
759   channel, and the receiver can skip to the next Chapter if it is
760   recovering from the loss of a single packet.

762   The FIRST and SECOND fields are the 7-bit values of the first and second
763   data bytes of the last Pitch Wheel command sent on this channel. The R
764   bit is reserved.

766   The receiver should compare the FIRST and SECOND fields in Chapter W
767   with the stored pitch wheel state for this channel. If no difference is
768   detected, Pitch Wheel commands may still have been lost, but any
769   artifacts induced are transient in nature, and the receiver SHOULD take
770   no action.

772   If a difference is detected, the receiver should update its recovery
773   state to reflect the values of the FIRST and SECOND fields. In addition,
774   the receiver MAY execute a single Pitch Wheel command, or MAY plan a
775   series of Pitch Wheel commands spaced over time.

777   Appendix A.3.
Chapter N: MIDI NoteOff and NoteOn

779   Chapter N protects against the loss of MIDI NoteOn commands, which
780   Structured Audio uses to launch new instrument instances, and MIDI
781   NoteOff commands, which Structured Audio uses to schedule instances for
782   termination. If a NoteOn command is lost, notes are skipped, a
783   transient error. If a NoteOff command is lost, notes may sound
784   indefinitely, an error that may be catastrophic for sustained timbres.

786   Structured Audio ignores the velocity field of the NoteOff command, and
787   Chapter N does not protect this field. In the discussion below, our
788   references to NoteOff commands include NoteOn commands with zero
789   velocity, which have semantics identical to NoteOff commands in
790   Structured Audio. Our references to NoteOn commands refer to NoteOn
791   commands with non-zero velocity only.

793   To prepare for recovery, the receiver should maintain state for each
794   note number for an active channel. The recovery algorithms in this
795   section reference receiver recovery state variables, using the
796   following nomenclature:

798      vel     This variable is initialized to zero at the start of a
799              session. If a NoteOn command is executed for this note,
800              vel is set to the velocity value of the command. If a
801              NoteOff command is executed for this note, vel is
802              set to zero.

804      seq     Whenever a NoteOn or NoteOff command executes, seq is
805              set to the extended sequence number of the RTP packet
806              whose parsing resulted in execution of the command.

808      time    Time of the last NoteOn or NoteOff command, in the
809              internal time units used by the application.

811   The encoding for Chapter N is shown below:

813       0                   1                   2                   3
814       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
815      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
816      |B|   LENGTH    |  LOW  | HIGH  |S|   NOTENUM   |Y|  VELOCITY   |
817      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
818      |S|   NOTENUM   |Y|  VELOCITY   |             ....
|
819      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
820      |    BITFIELD   |    BITFIELD   |      ....     |    BITFIELD   |
821      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

823   The chapter consists of a 2-byte header, followed by a list of 16-bit
824   note logs, followed by a list of bitfields. A note number is represented
825   in a note log or a bitfield if it has been used in a NoteOff or NoteOn
826   command since the last checkpoint packet.

828   If the note number last appeared in a NoteOn command, it appears in a
829   note log; if the note number last appeared in a NoteOff command, it
830   appears in a bitfield.

832   The 7-bit LENGTH field codes the number of note logs; zero is a valid
833   value, and codes an empty list of note logs. The maximum number of note
834   logs is 127; in the musically unlikely case of 128 concurrent NoteOn
835   commands, one NoteOn command is unprotected, risking a transient (not
836   catastrophic) error on one note number.

838   The 4-bit fields LOW and HIGH determine the number of bitfield bytes
839   that follow the note logs. A bitfield byte codes NoteOff information for
840   eight consecutive MIDI note numbers, with the MSB representing the
841   lowest note number. A 1 in a bit position indicates a NoteOff command
842   has occurred for this note number since the last checkpoint packet, and
843   that this NoteOff command occurred more recently than a NoteOn command.

845   Because each bitfield byte covers eight note numbers, the MSB of the
846   first bitfield byte codes the note number 8*LOW, while the MSB of the
847   last bitfield byte codes the note number 8*HIGH. If LOW is less than or
848   equal to HIGH, there are (HIGH - LOW + 1) bitfield bytes in the chapter.
849   To code a chapter with no bitfield bytes, senders MUST set LOW to 15 and
850   HIGH to 0.

851   If the B bit is set to 1, the previous packet sent did not include a
852   NoteOff command on this channel, and the receiver can skip parsing the
853   bitfield section of the chapter if it is recovering from the loss of a
854   single packet.
To skip over a Chapter N, the receiver calculates the
855   chapter length based on the values of LENGTH, LOW, and HIGH.

857   We now explain note log encoding, reproduced for reference below:

859       0                   1
860       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
861      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
862      |S|   NOTENUM   |Y|  VELOCITY   |
863      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

865   A note log will exist for a note number (coded by the 7-bit NOTENUM
866   field) if a NoteOn command has occurred for this note number since the
867   last checkpoint packet, and this NoteOn command occurred more
868   recently than a NoteOff command. The 7-bit VELOCITY field codes the
869   velocity value for this NoteOn command; this field MUST NOT be zero.

871   If the S bit in a note log is 1, the previous packet sent did not
872   include a NoteOn command for this note number, and the receiver can skip
873   parsing this note log if it is recovering from the loss of a single
874   packet.

876   The Y bit helps receivers to make the "play or skip" decision for
877   recovered NoteOn events. Senders set Y to 1 for recovery packets sent
878   shortly after the arrival of a NoteOn command from a MIDI controller;
879   subsequent recovery packets are sent with Y = 0.

881   The tables below summarize the recovery algorithm for Chapter N. For
882   each set bitfield position, the receiver executes the strategy in the
883   table row indicated by the recovery state variable vel for the note
884   number:

886   |-----------------------------------------------------------------|
887   | vel      | Diagnosis              | Suggested recovery          |
888   |-----------------------------------------------------------------|
889   | zero     | Either no note events  | Do nothing.                 |
890   |          | have been lost, or a   |                             |
891   |          | series of NoteOn ->    | Update seq to the           |
892   |          | NoteOff events have    | current packet number.      |
893   |          | been lost.
| |
894   |-----------------------------------------------------------------|
895   | non-zero | Either one NoteOff was | Execute a NoteOff           |
896   |          | lost or a series of    | command to end the          |
897   |          | NoteOff->On->Offs      | current note.               |
898   |          | were lost.             |                             |
899   |          |                        | Update vel to 0,            |
900   |          |                        | update seq to the current   |
901   |          |                        | packet number.              |
902   |-----------------------------------------------------------------|
903   For each note log, the receiver executes the strategy in the table
904   row indicated by the recovery state variable vel for the note number:

906   |-----------------------------------------------------------------|
907   | vel      | Diagnosis              | Suggested recovery          |
908   |-----------------------------------------------------------------|
909   | zero     | Either one NoteOn was  | If Y = 0, never play        |
910   |          | lost, or a series of   | the new note. If Y = 1,     |
911   |          | NoteOn->Off->On.       | play a new note if an       |
912   |          |                        | analysis of the current     |
913   |          |                        | packet timestamp and the    |
914   |          |                        | estimated delay from the    |
915   |          |                        | sender shows the current    |
916   |          |                        | packet is reasonably on     |
917   |          |                        | time.                       |
918   |          |                        |                             |
919   |          |                        | Update vel to value         |
920   |          |                        | VELOCITY, and seq to        |
921   |          |                        | the current packet number.  |
922   |-----------------------------------------------------------------|
923   | non-zero | Either no note events  | Do one of:                  |
924   |          | have been lost, or a   |                             |
925   |          | series of NoteOff->    | [1] Leave the current note  |
926   |          | NoteOn events have     |     (with velocity vel)     |
927   |          | been lost.             |     playing.                |
928   |          |                        |                             |
929   |          |                        | [2] End the current         |
930   |          |                        |     note but do not         |
931   |          |                        |     start a second note.    |
932   |          |                        |                             |
933   |          |                        | [3] End the current note    |
934   |          |                        |     and start a second      |
935   |          |                        |     note with velocity      |
936   |          |                        |     VELOCITY.               |
937   |          |                        |                             |
938   |          |                        | Based on the values of      |
939   |          |                        | vel, VELOCITY, Y,           |
940   |          |                        | seq, the current packet     |
941   |          |                        | number, and the current     |
942   |          |                        | checkpoint packet number.
|
943   |          |                        |                             |
944   |          |                        | Update vel to value         |
945   |          |                        | VELOCITY, and seq to        |
946   |          |                        | the current packet number.  |
947   |-----------------------------------------------------------------|
948   The second entry in this table is complex, due to the ambiguity of the
949   situation. A series of simple tests resolves the issue in most cases:

951   TEST 1: (seq < checkpoint packet sequence number)

953      If the value of seq is less than the current checkpoint packet
954      sequence number, recovery is simple, since we know that the last
955      NoteOn event executed is not a part of the note log, and so we know
956      that a series of one or more NoteOff->NoteOn events have been lost.
957      In this case, the current note should always be ended, and the
958      execution of a new note should occur using the same criteria we use
959      in the first entry of the table above. If seq indicates that the
960      last NoteOn executed did occur during the current note log, our task
961      is more difficult.

963   TEST 2: (vel != VELOCITY)

965      If vel doesn't equal VELOCITY, we know that the NoteOff corresponding
966      to the last NoteOn executed was lost. In this case, the current note
967      should always be ended, and the execution of a new note should occur
968      using the same criteria we use in the first entry of the table
969      above.

971   These two tests leave the (vel == VELOCITY) case to consider. This case
972   is not a rare event, since many MIDI devices do not implement velocity
973   sensing, and generate all NoteOn commands with the same velocity value.

975   TEST 3: (Y == 1)

977      If Y is 1, the NoteOn in the note log occurred recently. If the
978      receiver executed the last NoteOn recently (which we can tell by
979      the time value for the last executed note), we know the note log
980      represents the last executed note, and the correct action is to
981      let the note continue to play.
If the last note was not recently
982      executed, it should be terminated, and an execution of a new note
983      should occur using the same criteria we use in the first entry of
984      the table above.

986   These three tests leave the following case unresolved: Y == 0 (the
987   NoteOn in the note log didn't occur recently) and vel == VELOCITY (the
988   last executed note and note log note are ambiguous). In this case,
989   letting the last executed note continue to play, to be turned off by a
990   forthcoming NoteOff command, is an acceptable result.

992   Appendix A.4. Chapter A: MIDI Poly Aftertouch

994   Chapter A protects against the loss of MIDI Poly Aftertouch commands.
995   This command supports piano keyboard controllers that have individual
996   pressure sensors under each key, that generate a continuous signal
997   whenever the key is depressed. Keyboard controllers that include these
998   sensors send a stream of Poly Aftertouch commands for the duration of
999   each key event. Because multiple keys may be down at once, the Poly
1000   Aftertouch command specifies a note number (0-127) as well as a pressure
1001   value (0-127).

1003   SAOL programmers may access the last aftertouch value for each MIDI note
1004   in the MIDItouch[128] standard name array. Programmers typically use
1005   MIDItouch[] for continuous modification of instrument model parameters.
1006   The recovery mechanisms in Chapter A are designed to protect the
1007   aftertouch data stream for these types of SAOL programs.

1009   To prepare for recovery, the receiver should store state for each note
1010   on each channel, that codes the pressure value of the last Poly
1011   Aftertouch command received, along with a flag bit to signify the null
1012   case of no Poly Aftertouch command received.
1014   The encoding for Chapter A is shown below:

1016       0                   1                   2                   3
1017       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
1018      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1019      |S|   LENGTH    |F|   NOTENUM   |R|  PRESSURE   |F|   NOTENUM   |
1020      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1021      |R|  PRESSURE   |                      ....                     |
1022      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

1024   The chapter consists of a 1-byte header followed by a list of 16-bit
1025   note logs. Note logs exist for note numbers whose pressure value has
1026   been changed by a Poly Aftertouch command since the last checkpoint
1027   packet, and let receivers recover from the loss of those commands. Only
1028   one note log may exist in the note list for a particular note number.

1030   The 7-bit LENGTH field codes the number of note logs minus one; the
1031   expression (1 + 2*(LENGTH + 1)) yields the number of bytes in the
1032   chapter. The maximum chapter length of 257 bytes protects the worst-case
1033   situation of Poly Aftertouch commands occurring for all 128 MIDI notes
1034   since the last checkpoint packet.

1036   If the S bit is set to 1, the previous packet sent did not include a
1037   Poly Aftertouch command on this channel, and the receiver can skip to
1038   the next Chapter if it is recovering from the loss of a single packet.

1040   For each note log, the 7-bit NOTENUM field identifies the MIDI note
1041   number of the log, and the 7-bit PRESSURE field indicates the pressure
1042   value of the last Poly Aftertouch command sent.

1044   If the F bit is set to 1, the previous packet sent did not include a
1045   Poly Aftertouch for this note, and the receiver can skip to the next
1046   note log if it is recovering from the loss of a single packet.
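The Chapter A layout described above can be walked with a few lines of
code. The following is a minimal Python sketch, not a normative parser.
It assumes network bit order (bit 0 of the diagram is the most
significant bit of each byte) and ignores the reserved R bit. The
function name parse_chapter_a and its return shape are hypothetical.

```python
from typing import List, Tuple


def parse_chapter_a(data: bytes, offset: int = 0) -> Tuple[int, List[Tuple[int, int, int]], int]:
    """Return (S bit, note logs, offset of the byte after the chapter).

    Each note log is returned as (F bit, NOTENUM, PRESSURE).
    """
    header = data[offset]
    s_bit = header >> 7
    # LENGTH codes the number of note logs minus one.
    count = (header & 0x7F) + 1
    end = offset + 1 + 2 * count
    if end > len(data):
        raise ValueError("truncated Chapter A")
    logs = []
    for i in range(offset + 1, end, 2):
        f_bit = data[i] >> 7
        notenum = data[i] & 0x7F
        pressure = data[i + 1] & 0x7F  # mask off the reserved R bit
        logs.append((f_bit, notenum, pressure))
    return s_bit, logs, end


# Two note logs: note 60 at pressure 64 (F = 0), note 64 at pressure 127 (F = 1).
chapter = bytes([0x01, 0x3C, 0x40, 0xC0, 0x7F])
print(parse_chapter_a(chapter))
```

The returned end offset equals the starting offset plus
(1 + 2*(LENGTH + 1)), so a receiver that only wants to skip the
chapter can use the header byte alone.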
1048   If the F bit is 0, the receiver should compare the PRESSURE value with
1049   the stored pressure value for the note; if these values are different,
1050   the receiver should update its recovery state to reflect the value of
1051   the PRESSURE field. In addition, the receiver MAY execute a single Poly
1052   Aftertouch command, or MAY plan a series of Poly Aftertouch commands
1053   spaced over time.

1055   Appendix A.5. Chapter T: MIDI Channel Aftertouch

1057   Chapter T protects against the loss of MIDI Channel Aftertouch commands.
1058   This command supports piano keyboard controllers that use a single
1059   pressure sensor for the entire keyboard. Keyboard controllers that
1060   include this sensor send a stream of Channel Aftertouch commands
1061   whenever at least one key is depressed. Unlike the Poly Aftertouch
1062   command, the Channel Aftertouch command does not specify a note number,
1063   only a pressure value (0-127).

1065   Structured Audio makes the pressure value of the last Channel Aftertouch
1066   command available to SAOL programmers in all array positions of the
1067   MIDItouch[128] standard name array, which programmers typically use for
1068   continuous modification of instrument model parameters. The recovery
1069   mechanisms in Chapter T are designed to protect the aftertouch data
1070   stream for these types of SAOL programs.

1072   To prepare for recovery, the receiver should store state for each
1073   channel, that codes the pressure value for the last Channel Aftertouch
1074   command received, along with a flag bit to signify the null case of no
1075   Channel Aftertouch command received.

1077   SAOL programmers may access the last aftertouch value received in
1078   the MIDItouch[128] standard name array; all array positions contain the
1079   same value. Programmers typically use MIDItouch[] for continuous
1080   modification of instrument model parameters.
The recovery mechanisms in
1081   Chapter T are designed to protect the aftertouch data stream for these
1082   types of SAOL programs.

1084   The encoding for Chapter T is shown below:

1086       0
1087       0 1 2 3 4 5 6 7
1088      +-+-+-+-+-+-+-+-+
1089      |S|  PRESSURE   |
1090      +-+-+-+-+-+-+-+-+

1092   The chapter has a fixed size of 8 bits. If the S bit is set to 1, the
1093   previous packet sent did not include a Channel Aftertouch command on
1094   this channel, and the receiver can skip to the next Chapter if it is
1095   recovering from the loss of a single packet.

1097   The 7-bit PRESSURE field indicates the pressure value of the last
1098   Channel Aftertouch command sent. The receiver should compare the
1099   PRESSURE value with the stored pressure value for the channel; if these
1100   values are different, the receiver should update its recovery state to
1101   reflect the value of the PRESSURE field. In addition, the receiver MAY
1102   execute a single Channel Aftertouch command, or MAY plan a series of
1103   Channel Aftertouch commands spaced over time.

1105   Appendix A.6. Chapter C: MIDI Control Change

1107   Chapter C protects against the loss of MIDI Control Change commands. A
1108   Control Change command alters the 7-bit value of one of the 128 MIDI
1109   controllers. Most MIDI controllers are meant to be used as continuous
1110   parameters (for example, controller 7 is the Main Volume control), but
1111   some parameters have special semantics.

1113   Structured Audio makes the current value of MIDI controllers available
1114   to SAOL programmers in the MIDIctrl[128] standard name array, which
1115   programmers typically use for continuous modification of instrument
1116   model parameters. The recovery mechanisms in Chapter C are designed to
1117   protect the Control Change data stream for these types of SAOL programs.

1119   In addition, a Structured Audio decoder implements special semantics for
1120   the Bank Select Coarse and Fine, Sustain Pedal, All Notes Off, and All
1121   Sound Off controllers.
The recovery mechanism in Chapter C, in
1122   conjunction with Chapter P, protects the special semantics of these five
1123   controllers.

1125   To prepare for recovery, the receiver should store state for each
1126   channel about the Control Change data stream. For the All Notes Off and
1127   All Sound Off controllers, the receiver should keep a count, modulo 128,
1128   of the total number of Control Change commands received.

1130   For all other controllers, the receiver should store the value of the
1131   last Control Change command received, along with a flag bit to signify
1132   the null case of no Control Change command received. Note that this state
1133   should not reflect any changes to MIDIctrl[] made by assignment
1134   statements in SAOL code.

1136   The encoding for Chapter C is shown below:

1138       0                   1                   2                   3
1139       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
1140      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1141      |S|   LENGTH    |F| CONTROLLER  |R| VALUE/COUNT |F| CONTROLLER  |
1142      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1143      |R| VALUE/COUNT |                      ....                     |
1144      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

1146   The chapter consists of a 1-byte header followed by a list of 16-bit
1147   controller logs. A controller log exists for the All Notes Off and All
1148   Sound Off controllers if a new Control Change command has been received
1149   for the controller since the last checkpoint packet. For all other
1150   controllers, a controller log exists for controllers whose values have
1151   been changed by a Control Change command since the last checkpoint
1152   packet. Only one controller log may exist in the controller list for a
1153   particular controller number.

1155   The 7-bit LENGTH field codes the number of controller logs minus one;
1156   the expression (1 + 2*(LENGTH + 1)) yields the number of bytes in the
1157   chapter.
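To make the length expression concrete, the following minimal Python
sketch builds a Chapter C from a list of controller logs on the sender
side and checks the result against (1 + 2*(LENGTH + 1)). It assumes
network bit order and sets the reserved R bit to zero. The helper names
chapter_c_size and encode_chapter_c are hypothetical.

```python
from typing import Iterable, Tuple


def chapter_c_size(length_field: int) -> int:
    """Number of bytes in a Chapter C whose header carries this LENGTH."""
    return 1 + 2 * (length_field + 1)


def encode_chapter_c(logs: Iterable[Tuple[int, int, int]], s_bit: int = 0) -> bytes:
    """Encode (F bit, CONTROLLER, VALUE/COUNT) tuples as a Chapter C."""
    logs = list(logs)
    if not 1 <= len(logs) <= 128:
        raise ValueError("a chapter holds between 1 and 128 controller logs")
    controllers = [controller for _, controller, _ in logs]
    if len(set(controllers)) != len(controllers):
        raise ValueError("only one log may exist per controller number")
    # Header byte: S bit, then LENGTH = number of logs minus one.
    out = bytearray([(s_bit << 7) | (len(logs) - 1)])
    for f_bit, controller, value in logs:
        if not (0 <= controller <= 127 and 0 <= value <= 127):
            raise ValueError("CONTROLLER and VALUE/COUNT are 7-bit fields")
        out.append((f_bit << 7) | controller)
        out.append(value)  # reserved R bit left as zero
    return bytes(out)


chapter = encode_chapter_c([(0, 7, 100), (1, 64, 0)])
print(chapter.hex(), len(chapter) == chapter_c_size(chapter[0] & 0x7F))
```

With all 128 controllers logged, LENGTH is 127 and the chapter
occupies 257 bytes.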
1159   If the S bit is set to 1, the previous packet sent did not include a
1160   Control Change command on this channel, and the receiver may skip to the
1161   next Chapter if it is recovering from the loss of a single packet.

1163   For each controller log, the 7-bit CONTROLLER field identifies the
1164   controller number. For most controllers, the VALUE/COUNT field codes the
1165   value of the last Control Change command sent for this controller.

1167   However, if the controller log codes the All Notes Off or All Sound Off
1168   controllers, the VALUE/COUNT field codes the total number of Control
1169   Change commands received for the lifetime of the session. If this value
1170   exceeds 127, modulo arithmetic is used, but the value 0 is skipped.

1172   If the controller log codes the Sustain Pedal controller, zero is used
1173   to code pedal release. To code pedal depression, the VALUE/COUNT
1174   field codes the total number of pedal depressions that occur during a
1175   session. If this value exceeds 127, modulo arithmetic is used, but the
1176   value 0 is skipped.

1178   If the F bit is set to 1, the previous packet sent did not include a
1179   Control Change command for this controller, and the receiver can skip to
1180   the next controller log if it is recovering from the loss of a single
1181   packet.

1183   If the F bit is 0, the receiver should compare the VALUE/COUNT field
1184   with its stored state for the controller.

1186   For the All Notes Off and All Sound Off controllers, and the Sustain
1187   Pedal controller coding a depressed pedal, if the stored modulo count
1188   for the controller does not match the VALUE/COUNT field, the receiver
1189   should update its state for this controller, and execute the semantics
1190   of the lost command. Note these commands happen sufficiently
1191   infrequently that the ambiguity of modulo comparisons should not affect
1192   the recovery process.
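The counting rule above leaves some room for interpretation. The
following minimal Python sketch shows one possible reading, in which
the coded count runs from 1 to 127 and then wraps back to 1, so that 0
is never produced by wrap-around. This reading, and the names
next_count, wire_count, and recover_counted_controller, are assumptions
made for illustration and are not defined by this memo.

```python
from typing import Tuple


def next_count(count: int) -> int:
    """Advance a coded count: 0 (nothing seen yet) -> 1, ..., 127 -> 1."""
    return count % 127 + 1


def wire_count(total: int) -> int:
    """Map a lifetime total of commands (at least 1) onto 1..127."""
    if total < 1:
        raise ValueError("total must be at least 1")
    return (total - 1) % 127 + 1


def recover_counted_controller(stored: int, value_count: int) -> Tuple[int, bool]:
    """Compare stored state with VALUE/COUNT.

    Returns the new stored state and a flag telling the caller whether
    the semantics of a lost command should be executed.
    """
    if stored != value_count:
        return value_count, True
    return stored, False


count = 0
for _ in range(128):
    count = next_count(count)
print(count, wire_count(128), recover_counted_controller(3, 4))
```

Under this reading a receiver that missed exactly 127 commands would
see a matching count and take no action, which is the modulo ambiguity
the text accepts because such commands occur infrequently.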
1194   For all other controllers, if the recovery state does not match the
1195   VALUE/COUNT field, the receiver should update its recovery state to
1196   reflect the value of the VALUE/COUNT field. In addition, the receiver MAY
1197   execute a single Control Change command, or MAY plan a series of Control
1198   Change commands spaced over time.

1200   Appendix B. Author Addresses

1202   John Lazzaro (corresponding author)
1203   UC Berkeley
1204   CS Division
1205   315 Soda Hall
1206   Berkeley CA 94720-1776
1207   Email: lazzaro@cs.berkeley.edu

1209   John Wawrzynek
1210   UC Berkeley
1211   CS Division
1212   631 Soda Hall
1213   Berkeley CA 94720-1776
1214   Email: johnw@cs.berkeley.edu

1216   Appendix C. References

1218   [1] MIDI Manufacturers Association. The complete MIDI 1.0
1219   detailed specification, 1996. http://www.midi.org

1221   [2] International Standards Organization. ISO 14496 MPEG-4,
1222   Part 3 (Audio) Subpart 5 (Structured Audio) 1999.

1224   [3] Sfront source code release, includes a Linux networking
1225   client that implements the MIDI RTP packetization.
1226   http://www.cs.berkeley.edu/~lazzaro/sa/

1228   [4] H. Schulzrinne, S. Casner, R. Frederick, and V. Jacobson.
1229   RFC 1889: RTP: A transport protocol for real-time applications,
1230   1996.

1232   [5] John Lazzaro and John Wawrzynek. A Case for Network
1233   Musical Performance. The 11th International Workshop on Network
1234   and Operating Systems Support for Digital Audio and Video
1235   (NOSSDAV 2001) June 25-26, 2001, Port Jefferson, New York.
1236   http://www.cs.berkeley.edu/~lazzaro/sa/pubs/pdf/nossdav01.pdf

1238   [6] M. Handley and V. Jacobson. RFC 2327: SDP: Session Description
1239   Protocol. 1998.

1241   [7] Internet Engineering Task Force. RTP Payload Format for MPEG-4
1242   Streams. Work in progress, draft-ietf-avt-mpeg4-multisl-02.txt.

1244   [8] Internet Engineering Task Force. Use of "RFC-generic" for MPEG-4
1245   Elementary Streams with no SL layer. Work in progress,
1246   draft-ietf-avt-mpeg4-simple-00.txt.

1248   Appendix D.
Expiration Notice

1250   This document expires April 1, 2002.