idnits 2.17.1

draft-ietf-mmusic-stream-00.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** Cannot find the required boilerplate sections (Copyright, IPR, etc.) in
     this document.
     Expected boilerplate is as follows today (2024-04-26) according to
     https://trustee.ietf.org/license-info :

       IETF Trust Legal Provisions of 28-dec-2009, Section 6.a:

          This Internet-Draft is submitted in full conformance with the
          provisions of BCP 78 and BCP 79.

       IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 2:

          Copyright (c) 2024 IETF Trust and the persons identified as the
          document authors. All rights reserved.

       IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 3:

          This document is subject to BCP 78 and the IETF Trust's Legal
          Provisions Relating to IETF Documents
          (https://trustee.ietf.org/license-info) in effect on the date of
          publication of this document. Please review these documents
          carefully, as they describe your rights and restrictions with respect
          to this document. Code Components extracted from this document must
          include Simplified BSD License text as described in Section 4.e of
          the Trust Legal Provisions and are provided without warranty as
          described in the Simplified BSD License.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  ** Missing expiration date. The document expiration date should appear on
     the first and last page.

  ** The document seems to lack a 1id_guidelines paragraph about
     Internet-Drafts being working documents.

  ** The document seems to lack a 1id_guidelines paragraph about 6 months
     document validity -- however, there's a paragraph with a matching
     beginning. Boilerplate error?

  ** The document seems to lack a 1id_guidelines paragraph about the list of
     current Internet-Drafts.

  ** The document seems to lack a 1id_guidelines paragraph about the list of
     Shadow Directories.

  ** Expected the document's filename to be given on the first page, but
     didn't find any

  ** The document is more than 15 pages and seems to lack a Table of Contents.

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack an IANA Considerations section. (See Section
     2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
     when there are no actions for IANA.)

  ** The document seems to lack an Authors' Addresses Section.

  == There are 1 instance of lines with non-RFC2606-compliant FQDNs in the
     document.

  == There are 2 instances of lines with non-RFC6890-compliant IPv4 addresses
     in the document. If these are example addresses, they should be changed.

  == There are 2 instances of lines with multicast IPv4 addresses in the
     document. If these are generic example addresses, they should be
     changed to use the 233.252.0.x range defined in RFC 5771

  ** The document seems to lack a both a reference to RFC 2119 and the
     recommended RFC 2119 boilerplate, even if it appears to use RFC 2119
     keywords.

     RFC 2119 keyword, line 297: '...n implementation MAY cache the last RT...'

     RFC 2119 keyword, line 300: '... MAY be set to an arbitrarily large ...'

     RFC 2119 keyword, line 333: '...mplementing RTSP MUST support carrying...'

     RFC 2119 keyword, line 339: '...ream. RTSP data MAY be interleaved wi...'

     RFC 2119 keyword, line 433: '...conference identifier MUST be globally...'

     (9 more instances...)

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == Couldn't figure out when the document was first submitted -- there may
     comments or warnings related to the use of a disclaimer for pre-RFC5378
     work that could not be issued because of this.
     Please check the Legal Provisions document at
     https://trustee.ietf.org/license-info to determine if you need the
     pre-RFC5378 disclaimer.

  -- The document date (November 26, 1996) is 10013 days in the past. Is this
     intentional?

  -- Found something which looks like a code comment -- if you have code
     sections in the document, please surround them with '<CODE BEGINS>' and
     '<CODE ENDS>' lines.

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  -- Missing reference section? 'TBD' on line 271 looks like a reference

  -- Missing reference section? 'PORT' on line 334 looks like a reference

     Summary: 11 errors (**), 0 flaws (~~), 5 warnings (==), 4 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

2    Internet Engineering Task Force                              MMUSIC WG
3    Internet Draft                                          H. Schulzrinne
4    ietf-mmusic-stream-00.txt                                  Columbia U.
5    November 26, 1996
6    Expires: 26/8/97

8    A real-time stream control protocol (RTSP')

10   STATUS OF THIS MEMO

12   This document is an Internet-Draft. Internet-Drafts are working
13   documents of the Internet Engineering Task Force (IETF), its areas,
14   and its working groups. Note that other groups may also distribute
15   working documents as Internet-Drafts.

17   Internet-Drafts are draft documents valid for a maximum of six months
18   and may be updated, replaced, or obsoleted by other documents at any
19   time. It is inappropriate to use Internet-Drafts as reference
20   material or to cite them other than as ``work in progress''.

22   To learn the current status of any Internet-Draft, please check the
23   ``1id-abstracts.txt'' listing contained in the Internet-Drafts Shadow
24   Directories on ftp.is.co.za (Africa), nic.nordu.net (Europe),
25   munnari.oz.au (Pacific Rim), ds.internic.net (US East Coast), or
26   ftp.isi.edu (US West Coast).

28   Distribution of this document is unlimited.

30   ABSTRACT

32       This strawman proposal presents a revised version of the
33       RTSP proposal put forward to the MMUSIC group, borrowing
34       liberally from the original.

36       The Real Time Streaming Protocol, or RTSP, is an
37       application-level protocol for control over the delivery
38       of data with real-time properties. RTSP provides an
39       extensible framework to enable controlled, on-demand
40       delivery of real-time data, such as audio and video.
41       Sources of data can include both live data feeds and
42       stored clips. This protocol is intended to control
43       multiple data delivery sessions, provide a means for
44       choosing delivery channels such as UDP, multicast UDP and
45       TCP, and delivery mechanisms based upon RTP (RFC 1889).

47   1 Introduction

49   1.1 Terminology

51   conference: a multiparty, multimedia session, where "multi" implies
52       greater than or equal to one.

54   client: The client requests media data from the media server.

56   entity: An entity is a participant in a conference. This participant
57       may be non-human, e.g., a media record or playback server.

59   media server: The network entity providing playback or recording
60       services for one or more media streams. Different media streams
61       within a session may originate from different media servers. A
62       media server may reside on the same or a different host as the
63       web server the media session is invoked from.

65   (media) stream: A single media instance, e.g., an audio stream or a
66       video stream as well as a whiteboard or shared application
67       session. When using RTP, a stream consists of all RTP and RTCP
68       packets created by a source within an RTP session.

70   [TBD: terminology is confusing since there's an RTP session, which is
71   used by a single RTSP stream.]

73   media session: A collection of media streams to be treated together.
74       Typically, a client will synchronize all media streams within a
75       media session.

77   session description: A session description contains information about
78       one or more media within a session, such as the set of
79       encodings, network addresses and information about the content.
80       The session description may take several different formats,
81       including SDP and SDF.

83   Both client and server can send commands.

85   The protocol supports the following operations:

87   Retrieval of media from media server: The client can request a
88       session description via HTTP or some other method. If the session
89       is being multicast, the session description contains the
90       multicast addresses and ports to be used. If the session is to
91       be sent only to the client, the client provides the destination
92       for security reasons.

94   Invitation of media server to conference: A media server can be
95       "invited" to join an existing conference, either to play back
96       media into the session or to record all or a subset of the media
97       in a session. This mode is useful for distributed teaching
98       applications. Several parties in the conference may take turns
99       "pushing the remote control buttons".

101  Adding media to an existing session: Particularly for live events, it
102      is useful if the server can tell the client about additional
103      media becoming available.

105  1.2 Requirements

107  The protocol satisfies the following requirements:

109  extendable: new commands and parameters can be easily added

111  easy to parse: standard HTTP or MIME parsers can be (but do not have to
112      be) used

114  secure: re-uses web security mechanisms, either at the transport
115      level (SSL) or within the requests (basic and digest
116      authentication)

118  transport-independent: may use either an unreliable datagram protocol
119      (UDP), a reliable datagram protocol (RDP, not widely used) or a
120      reliable stream protocol (TCP) by implementing application-level
121      reliability

123  multi-server capable: Each media stream within a session can reside
124      on a different server. The client automatically establishes
125      several concurrent control sessions with the different media
126      servers. Media synchronization is performed at the transport
127      level.

129  multi-client capable: Stream identifiers can be used by several
130      control streams, so that "passing the remote" is possible. The
131      protocol does not address how several clients negotiate access;
132      this is left to either a "social protocol" or some other floor
133      control mechanism.

135  control of recording devices: The protocol can control both recording
136      and playback devices, as well as devices that can alternate
137      between the two modes ("VCR").

139  separation of stream control and conference initiation: Stream
140      control is divorced from inviting a media server to a
141      conference. The only requirement is that the conference
142      initiation protocol either provides or can be used to create a
143      unique conference identifier. In particular, S*IP or H.323 may
144      be used to invite a server to a conference.

146  suitable for professional applications: RTSP' supports frame-level
147      accuracy through SMPTE time stamps to allow remote digital
148      editing.

150  S*IP compatible: As much as possible, stream control should be
151      aligned with the IETF conference initiation effort.
152      However, for simple applications, a media server should not have to implement
153      a conference initiation protocol.

155  session description neutral: The protocol does not impose a
156      particular session description or metafile format and can convey
157      the type of format to be used. However, the session description
158      must contain an RTSP URI.

160  proxy and firewall friendly: The protocol should be readily handled
161      by both application and transport-layer (SOCKS) firewalls. For
162      proxies, re-use of existing proxies should be possible, but
163      remains to be verified. [TBD: what exactly is needed to make a
164      protocol firewall-friendly?] A firewall may need to understand
165      the SET_PORT directive to open a "hole" for the UDP media
166      stream.

168  HTTP friendly: Where sensible, RTSP re-uses HTTP concepts, so that
169      the existing infrastructure can be re-used.

171  1.3 Extending the Protocol

173  The protocol described below can be extended in three ways, listed in
174  order of the magnitude of changes supported:

176    o Existing commands can be extended with new parameters, as long
177      as these parameters can be safely ignored by the recipient.
178      (This is equivalent to adding new parameters to an HTML tag.)

180    o New methods can be added. If the recipient of the message does
181      not understand the request, it responds with error code 501
182      (Not implemented) and the sender can then attempt an earlier,
183      less functional version.

185    o A new version of the protocol can be defined, allowing almost
186      all aspects (except the position of the protocol version
187      number) to change.

189  1.4 Overall Operation

190  Each media stream and session is identified by an rtsp URL. The
191  overall session and the properties of the media the session is made
192  up of are defined by a session description file, the format of which
193  is outside the scope of this specification.
194  The session description file is retrieved using HTTP, either from the web server or the media
195  server, typically using a URL with scheme http.

197  The session description file contains a description of the media
198  streams making up the media session, including their encodings,
199  language, and other parameters that enable the client to choose the
200  most appropriate combination of media. In this session description,
201  each media stream is identified by an rtsp URL, which points to the
202  media server handling that particular media stream. Several media
203  streams can be located on different servers; for example, audio and
204  video tracks can be split across servers for load sharing. The
205  description also enumerates which transport methods the server is
206  capable of. If desired, the session description can also contain only
207  an RTSP URL, with the complete session description retrieved via
208  RTSP.

210  Besides the media parameters, the network destination address and
211  port need to be determined. Several modes of operation can be
212  distinguished:

214  Unicast: The media is transmitted to the source of the RTSP request,
215      with the port number picked by the client. Alternatively, the
216      media is transmitted on the same reliable stream as RTSP.

218  Multicast, server chooses address: The media server picks the
219      multicast address and port. This is the typical case for a live
220      or near-media-on-demand transmission.

222  Multicast, client chooses address: If the server is to participate in
223      an existing multicast conference, the multicast address, port
224      and encryption key are given by the conference.

226  1.5 Relationship with Other Protocols

228  RTSP' has some overlap in functionality with HTTP. It also needs to
229  interact with the web in that the initial contact with streaming
230  content is often to be made through a web page.
231  The current protocol specification aims to allow different hand-off points between a web
232  server and the media server implementing RTSP'. For example, the
233  session description can be retrieved using HTTP or RTSP'. Having the
234  session description be returned by the web server makes it possible
235  to have the web server take care of authentication and billing, by
236  handing out a session description whose media identifier includes an
237  encrypted version of the requestor's IP address and a timestamp, with
238  a shared secret between web and media server.

240  However, RTSP' differs fundamentally from HTTP in that data delivery
241  takes place out-of-band, in a different protocol. HTTP is an
242  asymmetric protocol, where the client issues requests and the server
243  responds. In RTSP', both the media client and media server can issue
244  requests. RTSP' requests are also not stateless, in that they may set
245  parameters and continue to control a media stream long after the
246  request has been acknowledged.

248      Re-using HTTP functionality has advantages in at least two
249      areas, namely security and proxies. The requirements are
250      very similar, so having the ability to adopt HTTP work on
251      caches, proxies and authentication is valuable. The current
252      RTSP already has first hints on caches and proxies, but is
253      nowhere near as complete as HTTP in that regard.

255  It is possible to very quickly build a simple RTSP' server by adding
256  a PLAY and, optionally, a SET_PARAMETER method to an existing
257  HTTP/1.1 web server. All of RTSP' can be implemented as part of an
258  HTTP server as long as only the client issues requests.

260  While most real-time media will use RTP as a transport protocol,
261  RTSP' is not tied to RTP.

263  RTSP' assumes the existence of a session description format that can
264  express both static and temporal properties of a media session
265  containing several media streams.

267  2 Protocol Parameters

269  2.1 Message Format and Transmission

271  RTSP is a text-based protocol [TBD] and uses the ISO 10646 character
272  set in UTF-8 encoding (RFC 2044) [TBD; this conflicts with ]. Lines
273  are terminated by CRLF, but receivers should be prepared to also
274  interpret CR and LF by themselves as line terminators.

276      Text-based protocols make it easier to add optional
277      parameters in a self-describing manner. Since the number of
278      parameters and the frequency of commands is low, processing
279      efficiency is not a concern. Text-based protocols, if done
280      carefully, also allow easy implementation in scripting
281      languages such as Tcl, VisualBasic and Perl.

283  The 10646 character set avoids tricky character set switching, but is
284  invisible to the application as long as US-ASCII is being used. This
285  is also the encoding used for RTCP. ISO 8859-1 translates directly
286  into Unicode, with a high-order octet of zero. ISO 8859-1 characters
287  with the most-significant bit set are represented as 1100001x
288  10xxxxxx.

290  RTSP messages can be carried over any lower-layer transport protocol
291  that is 8-bit clean.

293  Commands are acknowledged by the receiver unless they are sent to a
294  multicast group. If there is no acknowledgement, the sender may
295  resend the same message after a timeout of one round-trip time (RTT).
296  The round-trip time is estimated as in TCP (RFC TBD), with an initial
297  round-trip value of 500 ms. An implementation MAY cache the last RTT
298  measurement as the initial value for future connections. If a
299  reliable transport protocol is used to carry RTSP, the timeout value
300  MAY be set to an arbitrarily large value.

302      This can greatly increase responsiveness for proxies
303      operating in local-area networks with small RTTs. The
304      mechanism is defined such that the client implementation
305      does not have to be aware of whether a reliable or unreliable
306      transport protocol is being used.
307      It is probably a bad idea to have two reliability mechanisms on top of each other,
308      although the RTSP RTT estimate is likely to be larger than
309      the TCP estimate.

311  Each request carries a sequence number, which is incremented by one
312  for each request transmitted. If a request is repeated because of
313  lack of acknowledgement, the sequence number is incremented.

315      This avoids ambiguities when computing round-trip time
316      estimates. [TBD: An initial sequence number negotiation
317      needs to be added for UDP; otherwise, a new stream
318      connection may see a request be acknowledged by a delayed
319      response from an earlier "connection". This handshake can
320      be avoided with a sequence number containing a timestamp of
321      sufficiently high resolution.]

323  The reliability mechanism described here does not protect against
324  reordering. This may cause problems in some instances. For example, a
325  STOP followed by a PLAY has quite a different effect than the
326  reverse. Similarly, if a PLAY request arrives before all parameters
327  are set due to reordering, the media server would have to issue an
328  error indication. Since sequence numbers for retransmissions are
329  incremented (to allow easy RTT estimation), the receiver cannot just
330  ignore out-of-order packets. [TBD: This problem could be fixed by
331  including both a sequence number that stays the same for
332  retransmissions and a timestamp for RTT estimation.]

333  Systems implementing RTSP MUST support carrying RTSP over TCP and MAY
334  support UDP. The default port for the RTSP server is [PORT] for both
335  UDP and TCP.

337  A number of RTSP packets destined for the same control end point may
338  be packed into a single lower-layer PDU or encapsulated into a TCP
339  stream. RTSP data MAY be interleaved with RTP and RTCP packets. An
340  RTSP packet is terminated with an empty line. (TBD: doesn't work well
341  for including session descriptions.
342  Maybe use Content-length for payloads - these are usually imported anyway? or new page? Wrapping a
343  packet in some kind of braces or parentheses is another possibility,
344  but again puts restrictions on the SDF.)

346      Unless all but the RTP data is textual, there is not much
347      point in keeping the payload as textual data, since visual
348      debugging is more difficult and "telnet protocol emulation"
349      is no longer possible. Length fields don't make much sense
350      for textual data, particularly because of the line
351      termination ambiguities, i.e., CR, LF and CRLF. There does
352      not seem to be a need for an explicit, connection-oriented
353      framing layer as in the original RTSP proposal. However, if
354      we allow interleaving with RTP, a textual format gets very
355      awkward.

357  Requests contain methods, the object the method is operating upon and
358  parameters to further describe the method. Methods are idempotent,
359  unless otherwise noted. Methods are also designed to require little
360  or no state maintenance at the media server.

362  A message has the following format:

364      Method Object Version Sequence-Number
365      *(Parameter-Value)
366      CRLF

368  A message with a message body has the following format:

370      Method Object Version Sequence-Number
371      Content-length:
372      *(Parameter-Value)
373      CRLF
374      message-body

375  After receiving and interpreting an RTSP' request, the server
376  responds with an RTSP' response message.

378  [TBD: proper BNF]

380  A typical response to a request with sequence number 17 might be:

382      RTSP/1.0 200 17 OK

384      This format is HTTP-friendly; the sequence number is simply
385      ignored by HTTP servers. The likelihood that a textual
386      protocol will share the same port and not have that format
387      seems fairly remote. RTP packets have the most-significant
388      bit set and can thus be easily distinguished.
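
The request and response layouts in Section 2.1 can be illustrated with a small parser. This is only a sketch under stated assumptions, not part of the proposal: the names Request, parse_request and parse_status are hypothetical, lines are assumed to end in CRLF only (the draft also asks receivers to tolerate bare CR or LF), header names are assumed to be case-insensitive as in HTTP, and the body length is taken from Content-length.

```python
from typing import NamedTuple


class Request(NamedTuple):
    """Hypothetical container for one parsed request."""
    method: str
    obj: str
    version: str
    seq: int
    headers: dict
    body: bytes


def parse_request(data: bytes) -> Request:
    """Parse 'Method Object Version Sequence-Number' plus parameters."""
    # The empty line (CRLF CRLF) separates the header part from the body.
    head, _, rest = data.partition(b"\r\n\r\n")
    lines = head.decode("utf-8").split("\r\n")
    method, obj, version, seq = lines[0].split()
    headers = {}
    for line in lines[1:]:
        if not line:
            continue
        # Split at the first colon only; values such as
        # 'smpte=0:10:20-' contain further colons.
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    length = int(headers.get("content-length", 0))
    return Request(method, obj, version, int(seq), headers, rest[:length])


def parse_status(line: str):
    """Split a response line such as 'RTSP/1.0 200 17 OK'."""
    version, code, seq, reason = line.split(" ", 3)
    return version, int(code), int(seq), reason
```

Applied to the response example above, parse_status("RTSP/1.0 200 17 OK") yields the version, status code 200, sequence number 17 and the reason phrase, so a sender can match the acknowledgement to its request by sequence number.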

390  If a connectionless transport protocol is used, the media server
391  considers all packets originating from a single port number and
392  network address to be part of the same session. [TBD: is this
393  necessary?]

395  2.2 Session and Media URI

397  The RTSP URL scheme is used to locate and control stream resources
398  via the RTSP protocol.

400  A media stream is identified by a textual session and media
401  identifier, using the character set and escape conventions of URLs.
402  The media identifier is separated from the session by a slash.
403  Commands below can refer to either the whole session or an individual
404  stream. Stream identifiers can be passed between clients ("passing
405  the remote control"). A specific instance of a session, e.g., one of
406  several concurrent transmissions of the same content, is appended
407  where needed. The instance identifies the whole session, so that all
408  media streams within that session have the same instance identifier.

410  For example,

412      rtsp://media.content.com:5000/twister/audio.en/1234

414  identifies instance 1234 of the stream audio.en within the session
415  "twister", which is located at port 5000 of host media.content.com.

417  The ordering and meaning of the path components of the rtsp URL
418  are significant only to the media server.

420      This decoupling also allows session descriptions to be used
421      with non-RTSP media control protocols, simply by replacing
422      the scheme in the URL.

424  2.3 Encoding Identifiers

426  RTP profile and/or MIME types. [TBD: should probably register all the
427  RTP data types as MIME types.]

429  2.4 Conference Identifiers

431  Conference identifiers are opaque to RTSP' and are encoded using
432  standard URI encoding methods (i.e., escaping with %). They can
433  contain any octet value. The conference identifier MUST be globally
434  unique. For H.323, the conferenceID value is to be used.
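
As an illustration of the identifier conventions in Sections 2.2 and 2.4, the following sketch splits the example rtsp URL and percent-escapes an arbitrary-octet conference identifier using Python's standard urllib.parse module. The function names are hypothetical, and the session/media/instance ordering is only the one used in the example URL; the draft leaves the meaning of the path components to the media server. Leaving "/" unescaped is likewise an assumption, chosen so that identifiers of the form address/number stay readable.

```python
from urllib.parse import quote_from_bytes, unquote, unquote_to_bytes, urlsplit


def split_rtsp_url(url: str):
    """Return (host, port, session, media, instance) for an rtsp URL.

    Assumes the path layout of the example URL; missing trailing
    components are returned as None.
    """
    parts = urlsplit(url)
    if parts.scheme != "rtsp":
        raise ValueError("not an rtsp URL")
    segments = [unquote(s) for s in parts.path.split("/") if s]
    segments += [None] * (3 - len(segments))
    session, media, instance = segments[:3]
    return parts.hostname, parts.port, session, media, instance


def encode_conference_id(raw: bytes) -> str:
    """Escape arbitrary octets with %XX; '/' is left as-is (assumption)."""
    return quote_from_bytes(raw, safe="/")


def decode_conference_id(text: str) -> bytes:
    """Inverse of encode_conference_id."""
    return unquote_to_bytes(text)
```

For the URL in Section 2.2 this yields host media.content.com, port 5000, session "twister", media "audio.en" and instance "1234".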

436      If the conference participant inviting the media server
437      would only supply a conference identifier which is unique
438      for that inviting party, the media server could add an
439      internal identifier for that party, e.g., its Internet
440      address. However, this would prevent the conference
441      participant and the initiator of the RTSP commands from being
442      two different entities.

444  2.5 Relative Timestamps

446  A relative time-stamp expresses time relative to the start of the
447  clip. Relative timestamps are expressed as SMPTE time codes for
448  frame-level access accuracy. The time code has the format
449  hours:minutes:seconds.frames, with the origin at the start of the
450  clip. For NTSC, the frame rate is 29.97 frames per second. This is
451  handled by dropping the first two frame indices (00 and 01) of every
452  minute, except every tenth minute. If the frame value is zero, it may be omitted.

454  Examples:

456      10:12:33.40
457      10:7:33

459  2.6 Absolute Time

460  Absolute time is expressed as ISO 8601 timestamps. It is always
461  expressed as UTC (GMT).

463  Example for November 8, 1996 at 14h37 and 20 seconds GMT:

465      19961108T143720Z

467  3 Header Field Definitions

469  3.1 Accept

471  The Accept request-header field can be used to specify certain
472  session description types which are acceptable for the response. The
473  only parameter allowed is that of level, which indicates the highest
474  level or version accepted by the requestor.

476  Example of use:

478      Accept: application/sdf, application/sdp;level=2

480  3.2 Address

482  3.3 Allow

484  The Allow response header field lists the methods supported by the
485  resource identified by the Request-URI. The purpose of this field is
486  to strictly inform the recipient of valid methods associated with the
487  resource. An Allow header field must be present in a 405 (Method not
488  allowed) response.

490  Example of use:

492      Allow: PLAY, RECORD, SET_PARAMETER

494  3.4 Authorization

496  3.5 Blocksize

498  3.6 Conference

499  This field establishes a logical connection between a conference,
500  established using non-RTSP' means, and an RTSP stream.

502  [TBD: This parameter is for further study. May not be needed with the
503  Given parameter.]

505  3.7 Content-Length

507  3.8 Content-Type

509  3.9 Given

511  3.10 Location

513  3.11 Port

515  3.12 Range

517  3.13 Speed

519  3.14 Transport

521  3.15 TTL

523  4 Methods

525  The Method token indicates the method to be performed on the resource
526  identified by the Request-URI. The method is case-sensitive. New methods may be
527  defined in the future. Method names may not start with a $ character
528  (hexadecimal 24) and must be a token.

530  4.1 GET

532  The GET method retrieves a session description from a server. It may
533  use the Accept header to specify the session description formats that
534  the client understands.

536      GET twister RTSP/1.0 937
537      Accept: application/sdp, application/sdf, application/mheg

539  If the media server has previously been invited to a conference, the
540  GET method also contains a conference identifier or a Given
541  parameter.

543      GET twister RTSP/1.0 834
544      Conference: 128.16.64.19/32492374
545      Authorization: Basic QWxhZGRpbjpvcGVuIHNlc2FTZQ==

547  If the GET request contains a conference identifier, the media server
548  MUST locate the conference description and use the multicast
549  addresses and port numbers supplied in that description. The media
550  server SHOULD only offer media types corresponding to the media types
551  currently active within the conference. If the media server has no
552  local reference to this conference, it returns status code 452.

554  The conference invitation should also contain an indication whether
555  the media server is expected to receive or generate media, or both.
556  (A VCR-like device would support both directions.)
557  If the invitation does not contain an indication of the operations to be performed, the
558  media server should accept and then reject inappropriate operations.

560  A typical response might be:

562      200 18 OK
563      Content-Type: application/sdf
564      session description

566  4.2 SESSION

568  This method is used by a media server to send media information to
569  the client. If a new media type is added to a session (e.g., during a
570  live event), the whole session description should be sent again,
571  rather than just the additional components.

573      This allows the deletion of session components.

575  Example:

577      SESSION twister/*/1234 Content-Type: application/sdp

579      Session Description

581  Response: 200, 302, 303, 500, can't do this operation, busy,

583  4.3 PLAY

585  The PLAY method tells the server to start sending data via the
586  previously set transport mechanism. The Range header specifies the
587  range. The range can be specified in a number of units. This
588  specification defines the smpte (see Section 2.5) and clock (see
589  Section 2.6) range units.

591      PLAY media-name
592      Range: smpte= range-value

594  The following example plays the whole session starting at SMPTE time
595  code 0:10:20 until the end of the clip.

597      PLAY twister/*/1234
598      Range: smpte=0:10:20-

600  For playing back a recording of a live event, it may be desirable to
601  use clock units:

603      PLAY meeting/*/1234
604      Range: clock=19961108T142300Z-19961108T143520Z

606  A media server only supporting playback MUST support the smpte format
607  and MAY support the clock format.

609  [TBD: It may be desirable to allow several ranges, so that remote
610  digital editing can be done easily.]

612  Response: 200, 500, 501, clock format not supported.

614  4.4 RECORD

616  This method initiates recording a range of media data according to
617  the session description. The timestamp reflects start and end time
618  (UTC).
619  If no time range is given, use the start or end time provided in the session description. If the session has already started,
620  commence recording immediately. The Conference header is mandatory.

622  A media server supporting recording of live events MUST support the
623  clock range format; the smpte format does not make sense.

625      RECORD meeting/audio.en/1234
626      Conference: 128.16.64.19/32492374

628  4.5 REDIRECT

630  A redirect request informs the client that it must connect to another
631  server location. It contains the mandatory header Location, which
632  indicates that the client should issue a GET for that URL. It may
633  contain the parameter Range, which indicates when the redirection
634  takes effect.

636  4.6 SET_PARAMETER

638  Both client and media server can issue this request.

640  The following parameters are defined:

642  Blocksize: This advisory parameter is sent from the client to the
643      media server setting the transport packet size. The server
644      truncates this packet size to the closest multiple of the
645      minimum media-specific block size or overrides it with the
646      media-specific size if necessary. The block size is a strictly
647      positive decimal number and measured in bytes. The server only
648      returns an error (416) if the value is syntactically invalid,
649      but not if the server adjusts it according to the mechanism
650      described above or decides to simply ignore the advice.

652  Port: UDP or TCP port to be used for this media.

654  SSRC: RTP SSRC value to be used by the media server. This parameter
655      is only valid for unicast transmission. It identifies the
656      synchronization source to be associated with the media stream.
657      This can be used for demultiplexing by the client of data
658      received on the same port.

660  Address: Destination network address, consisting of the address class
661      identifier and the address. Currently, the address classes IP4
662      and IP6 are defined.

   Transport: Transport protocol stack to be used: UDP, TCP, or
        interleaved in whatever protocol is being used by the control
        stream, followed by the next-layer transport protocol.
        Currently, the next-layer protocol RTP is defined. Parameters
        may be added to each protocol, separated by a semicolon. For
        RTP, the boolean parameter compressed is defined, indicating
        compressed RTP according to RFC XXXX.
        Example: UDP RTP;compressed

   TTL: Multicast time-to-live value. In some cases, it may make sense
        for a client to ask a media server sending on a given multicast
        address to expand its range.

   Speed: This advisory parameter sets the speed at which the server
        delivers data to the client, contingent on the server's ability
        and desire to serve the media stream at the given speed.
        Implementation by the server is optional. The default is the bit
        rate of the stream.

   The parameter value is expressed as a decimal ratio, e.g., 2.0
   indicates that data is to be delivered twice as fast as normal. A
   speed of zero is invalid. A negative value indicates that the stream
   is to be played back in reverse direction.

   A request SHOULD only contain a single parameter to allow the client
   to determine why a particular request failed. A server MUST allow a
   parameter to be set repeatedly to the same value, but it MAY disallow
   changing parameter values.

     The parameters are split in a fine-grained fashion so that,
     for example, the client can set just the unicast port,
     without having to modify the destination address. There is
     no substantial difference between the privileged parameters
     and the parameters identified by family and parameter id in
     the current RTSP spec.
     If desired, parameter names could
     easily take the form family/parameter, e.g.,
     Audio/Annotations.

   A SET_PARAMETER request without parameters can be used as a way to
   detect whether the other side is still responding.

   Example:

     SET_PARAMETER twister/1234/audio.en RTSP/1.0 68
     Speed: 2.3

     [TBD: Or should this be like GET_PARAMETER? Bit longer, but forces
     single parameter per request.]

4.7 GET_PARAMETER

   Both client and media server can issue a GET_PARAMETER request to
   retrieve a specific parameter. All parameters described for the
   SET_PARAMETER request are valid. In the request, the message body
   contains the parameter name. Only one parameter can be requested in
   each GET_PARAMETER request.

   Example:

     C->S: GET_PARAMETER twister/1234/audio.en RTSP/1.0 6
           Content-length: 17

           Audio/Annotations

     S->C: RTSP/1.0 200 6 OK
           Content-type: text/ascii
           Content-length: 2

           64

4.8 STOP

   Stops delivery of the stream immediately. Returns an indication of
   the current position to allow PLAY instead of resume.

     Thus, RESUME is not needed.

     C->M: STOP movie RTSP/1.0 76

     M->C: RTSP/1.0 200 76 OK

4.9 BYE

   Sent by either client or server to terminate a connection and release
   resources.

4.10 Embedded Data Stream

   The command DATA is used to indicate an embedded media data object,
   together with the content types. DATA requests are not acknowledged
   by RTSP. The embedded object can have any type. For space-efficient
   encapsulation of binary data, the method in Section 4.11 should be
   used instead.

     DATA twisters/audio.en/1234 RTSP/1.1
     Content-Length: 500
     Content-Type: message/rtp

     (RTP data)

     This is workable, but not very space-efficient. However,
     the interesting case is that of a single TCP stream
     carrying both commands and media data. There is no
     particular reason to have small chunks in that case.
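
   The DATA encapsulation of Section 4.10 can be sketched in Python as
   follows. This is an illustration only, not part of the protocol: the
   function name build_data_request, the CRLF line endings, and the
   empty line separating the headers from the body are assumptions
   borrowed from HTTP conventions. The draft's example shows only the
   request line, Content-Length, and Content-Type. The sketch also
   computes how many bytes the textual headers add on top of the
   payload, which is the overhead the note above describes as not very
   space-efficient.

```python
def build_data_request(url: str, payload: bytes,
                       content_type: str = "message/rtp",
                       version: str = "RTSP/1.1") -> bytes:
    """Wrap a media payload in a DATA request: text headers, then raw body.

    Hypothetical helper; line endings and the blank separator line are
    assumed to follow HTTP practice.
    """
    header_lines = [
        f"DATA {url} {version}",
        f"Content-Length: {len(payload)}",
        f"Content-Type: {content_type}",
        "",  # joining with CRLF makes these two empty strings
        "",  # terminate the last header and add the blank line
    ]
    return "\r\n".join(header_lines).encode("ascii") + payload


payload = bytes(500)  # stand-in for 500 bytes of RTP data
message = build_data_request("twisters/audio.en/1234", payload)

# Bytes spent on the request line and headers, excluding the payload.
overhead = len(message) - len(payload)
print(overhead)
```

   The per-object header cost is fixed for a given URL and content
   type, so it matters less as the embedded objects grow larger.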

4.11 Embedded Binary Data

   Binary packets such as RTP data are encapsulated by an ASCII dollar
   sign (24 hexadecimal), followed by a one-byte session identifier,
   followed by the length of the encapsulated binary data as a binary,
   two-byte integer in network byte order. The binary data follows
   immediately afterwards, without a CRLF.

     This makes the encapsulation overhead 4 bytes, less than
     the 8 bytes imposed by SCP.

5 Status Code Definitions

   Where applicable, HTTP status codes are re-used. [TBD: add those
   relevant here]

5.1 Successful 2xx

5.1.1 200 OK

   The request has succeeded. The information returned with the response
   depends on the method used in the request, for example:

   GET: the session description;

   GET_PARAMETER: the value of the parameter.

5.2 Redirection 3xx

5.2.1 301 Moved Permanently

5.2.2 303 Moved Temporarily

5.3 Client Error 4xx

5.3.1 400 Bad Request

   The request could not be understood by the recipient due to malformed
   syntax. The request SHOULD NOT be repeated without modification.

5.3.2 401 Unauthorized

   The request requires user authentication.

5.3.3 402 Payment Required

   This code is reserved for future use.

5.3.4 405 Method Not Allowed

5.3.5 406 Not Acceptable

5.3.6 408 Request Timeout

5.3.7 411 Length Required

5.3.8 414 Request URI Too Long

5.3.9 415 Unsupported Media Type

   The recipient of the request is refusing to service the request
   because the entity of the request is in a format not supported by the
   requested resource for the requested method.

5.3.10 450 Invalid Parameter

   The parameter in the request is not valid, i.e., out of range or
   malformed.

5.3.11 451 Parameter Not Understood

   The recipient of the request does not support one or more parameters
   contained in the request.

5.3.12 452 Conference Not Found

   The conference indicated by a Conference: identifier is unknown to
   the media server.

5.3.13 453 Not Enough Bandwidth

   The request was refused since there was insufficient bandwidth. This
   may, for example, be the result of a resource reservation failure.

5.4 Server Error 5xx

5.4.1 500 Internal Server Error

5.4.2 501 Not Implemented

5.4.3 502 Bad Gateway

5.4.4 503 Service Unavailable

   The server is currently unable to handle the request due to a
   temporary overloading or maintenance of the server. The implication
   is that this is a temporary condition which will be alleviated.

5.4.5 504 Gateway Timeout

5.4.6 505 RTSP Version Not Supported

6 Examples

6.1 Media on demand (unicast)

   Client C requests a movie from media servers A (audio.content.com)
   and V (video.content.com). The media description is stored on a web
   server W. This, however, is transparent to the client. The client is
   only interested in the last part of the movie. The server requires
   authentication for this movie. The audio track can be dynamically
   switched between two sets of encodings. The URL with scheme rtspu
   indicates the media servers want to use UDP for exchanging RTSP
   messages.

     C->W: GET twister HTTP/1.0
           Accept: application/sdf, application/sdp

     W->C: 200 OK
           Content-type: application/sdf

           (session
             (all
               (media (t audio) (oneof
                   ((e PCMU/8000/1 89 DVI4/8000/1 90) (id lofi))
                   ((e DVI4/16000/2 90 DVI4/16000/2 91) (id hifi))
                 )
                 (language en)
                 (id rtspu://audio.content.com/twister/audio.en/1234)
               )
               (media (t video) (e JPEG)
                 (id rtspu://video.content.com/twister/video/1234)
               )
             )
           )

     C->A: SET_PARAMETER twister/audio.en/1234/lofi RTSP/1.0 1
           Port: 3056
           Transport: RTP;compressed

     A->C: RTSP/1.0 200 1 OK

     C->V: SET_PARAMETER twister/video/1234/hifi RTSP/1.0 2
           Port: 3058
           Transport: RTP;compressed

     V->C: RTSP/1.0 200 2 OK

     C->V: PLAY twister/video/1234 RTSP/1.0 3
           Range: smpte=0:10:00-

     V->C: RTSP/1.0 200 3 OK

     C->A: PLAY twister/audio.en/1234/lofi RTSP/1.0 4
           Range: smpte=0:10:00-

     A->C: RTSP/1.0 200 4 OK

   Even though the audio and video tracks are on two different servers,
   may start at slightly different times and may drift with respect to
   each other, the client can synchronize the two using standard RTP
   methods.

6.2 Live Media Event Using Multicast

   The media server chooses the multicast address and port. Here, we
   assume that the web server only contains a pointer to the full
   description, while the media server M maintains the full description.
   During the session, a new subtitling stream is added.

     C->W: GET concert HTTP/1.0

     W->C: HTTP/1.0 200 OK
           Content-Type: application/sdf

           (session
             (id rtsp://live.content.com/concert)
           )

     C->M: GET concert RTSP/1.0 1

     M->C: RTSP/1.0 200 OK
           Content-Type: application/sdf

           (session (all
             (media (t audio) (id music) (a IP4 224.2.0.1) (p 3456))
           ))

     C->M: PLAY concert/music RTSP/1.0
           Range: smpte=1:12:0

     M->C: RTSP/1.0 405 No positioning possible

     M->C: SESSION concert RTSP/1.0
           Content-Type: application/sdf

           (session (all
             (media (t audio) (id music))
             (media (t text) (id lyrics))
           ))

     C->M: PLAY concert/lyrics RTSP/1.0

   Since the session description already contains the necessary address
   information, the client does not set the transport address. The
   attempt to position the stream fails since this is a live event.

6.3 Playing media into an existing session

   A conference participant C wants to have the media server M play back
   a demo tape into an existing conference. When retrieving the session
   description, C indicates to the media server that the network
   addresses and encryption keys are already given by the conference, so
   they should not be chosen by the server. The example omits the simple
   ACK responses.

     C->M: GET demo RTSP/1.0 1
           Accept: application/sdf, application/sdp
           Given: address, privacy

     M->C: RTSP/1.0 200 1 OK
           Content-type: application/sdf

           (session
             (id 548)
             (media (t audio) (id sound))
           )

     C->M: SET_PARAMETER demo/548/sound RTSP/1.0 2
           Address: IP4 224.2.0.1
           Port: 3456
           TTL: 127

6.4 Recording

   Conference participant C asks the media server M to record a session.
   If the session description contains any alternatives, the server
   records them all.

     C->M: SESSION meeting RTSP/1.0 89
           Content-type: application/sdp

           v=0
           s=Mbone Audio
           i=Discussion of Mbone Engineering Issues

     M->C: 415 89 Unsupported Media Type
           Accept: application/sdf

     C->M: SESSION meeting RTSP/1.0 90
           Content-type: application/sdf

     M->C: 200 90 OK

     C->M: RECORD meeting
           Range: clock=19961110T1925-19961110T2015

7 Access Authentication

   Besides limiting access, access authentication is also needed to
   avoid denial-of-service attacks.

8 Security Considerations

   The protocol offers the opportunity for a remote-control
   denial-of-service attack. The attacker, using a forged source IP
   address, can ask for a stream to be played back to that forged IP
   address. This can be prevented by a challenge-response
   authentication. If the goal is simply to prevent this
   denial-of-service attack, a default, widely known key can be used.

   If the client retrieves a session description, the server hands out
   an encrypted version of the client's IP address to the client during
   the initial retrieval of the session description.

A Session Description

   A session description must be able to identify sessions and
   individual media streams. The per-media identifier is created by the
   entity creating the session description and is opaque to anyone else.
   It may contain any 8-bit value except CR and LF.

B Notes on RTSP

   o  The STREAM_HEADER functionality has been subsumed by the
      session description.

   o  SEND_REPORT is not really needed. Should define an RTCP
      request with a random response interval.

   o  Error reports are sent automatically. If the server wants to
      terminate the connection, it sends a BYE.

   o  Resending (UDP_RESEND) should be handled by RTCP since it is
      always media-specific and RTCP can be readily flow-controlled
      to avoid congestion collapse.

   o  Is STOP really needed?
      What's the difference between STOP and
      PAUSE? Resources (which?) cannot be released since there may be
      a PLAY command immediately. Bearing on resource reservation?

C Author Addresses

   Henning Schulzrinne
   Dept. of Computer Science
   Columbia University
   1214 Amsterdam Avenue
   New York, NY 10027
   USA
   electronic mail: schulzrinne@cs.columbia.edu

D Acknowledgements

   This draft is based on the functionality of the RTSP draft. It also
   borrows format and descriptions from HTTP/1.1.