MMUSIC Working Group                                      H. Schulzrinne
Internet-Draft                                       Columbia University
Intended status: Standards Track                                  A. Rao
Expires: August 28, 2008                                           Cisco
                                                             R. Lanphier
                                                           Real Networks
                                                           M. Westerlund
                                                             Ericsson AB
                                                    M. Stiemerling (Ed.)
                                                                     NEC
                                                       February 25, 2008

                Real Time Streaming Protocol 2.0 (RTSP)
                  draft-ietf-mmusic-rfc2326bis-17.txt

Status of this Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on August 28, 2008.

Copyright Notice

   Copyright (C) The IETF Trust (2008).

Schulzrinne, et al.      Expires August 28, 2008                [Page 1]

Internet-Draft   Real Time Streaming Protocol 2.0 (RTSP)   February 2008

Abstract

   This memorandum defines RTSP version 2.0 which is a revision of the
   Proposed Standard RTSP version 1.0 which is defined in RFC 2326.

   The Real Time Streaming Protocol, or RTSP, is an application-level
   protocol for control over the delivery of data with real-time
   properties.  RTSP provides an extensible framework to enable
   controlled, on-demand delivery of real-time data, such as audio and
   video.  Sources of data can include both live data feeds and stored
   clips.
   This protocol is intended to control multiple data delivery sessions,
   provide a means for choosing delivery channels such as UDP, multicast
   UDP and TCP, and provide a means for choosing delivery mechanisms
   based upon RTP (RFC 3550).

Table of Contents

   1.  Introduction
     1.1.  Scope and Background
     1.2.  RTSP Specification Update
     1.3.  Notational Conventions
     1.4.  Terminology
   2.  RTSP Introduction
     2.1.  Protocol Properties
     2.2.  RTSP's Relationship to HTTP
     2.3.  Extending RTSP
     2.4.  Overall Operation
     2.5.  RTSP States
     2.6.  Relationship with Other Protocols
   3.  RTSP Use Cases
     3.1.  On-demand Playback of Stored Content
     3.2.  Unicast distribution of Live Content
     3.3.  On-demand Playback using Multicast
     3.4.  Inviting an RTSP server into a conference
     3.5.  Live Content using Multicast
   4.  Protocol Parameters
     4.1.  RTSP Version
     4.2.  RTSP IRI and URI
     4.3.  Session Identifiers
     4.4.  SMPTE Relative Timestamps
     4.5.  Normal Play Time
     4.6.  Absolute Time
     4.7.  Feature-tags
     4.8.  Entity Tags
   5.  RTSP Message
     5.1.  Message Types
     5.2.  Message Headers
     5.3.  Message Body
     5.4.  Message Length
   6.  General Header Fields
   7.  Request
     7.1.  Request Line
     7.2.  Request Header Fields
   8.  Response
     8.1.  Status-Line
       8.1.1.  Status Code and Reason Phrase
     8.2.  Response Header Fields
   9.  Entity
     9.1.  Entity Header Fields
     9.2.  Entity Body
   10. Connections
     10.1.  Reliability and Acknowledgements
     10.2.  Using Connections
     10.3.  Closing Connections
     10.4.  Timing Out Connections and RTSP Messages
     10.5.  Showing Liveness
     10.6.  Use of IPv6
   11. Capability Handling
   12. Pipelining Support
   13. Method Definitions
     13.1.  OPTIONS
     13.2.  DESCRIBE
     13.3.  SETUP
       13.3.1.  Changing Transport Parameters
     13.4.  PLAY
     13.5.  PAUSE
     13.6.  TEARDOWN
     13.7.  GET_PARAMETER
     13.8.  SET_PARAMETER
     13.9.  REDIRECT
   14. Embedded (Interleaved) Binary Data
   15. Status Code Definitions
     15.1.  Success 1xx
       15.1.1.  100 Continue
     15.2.  Success 2xx
       15.2.1.  200 OK
     15.3.  Redirection 3xx
       15.3.1.  300 Multiple Choices
       15.3.2.  301 Moved Permanently
       15.3.3.  302 Found
       15.3.4.  303 See Other
       15.3.5.  304 Not Modified
       15.3.6.  305 Use Proxy
     15.4.  Client Error 4xx
       15.4.1.  400 Bad Request
       15.4.2.  405 Method Not Allowed
       15.4.3.  451 Parameter Not Understood
       15.4.4.  452 reserved
       15.4.5.  453 Not Enough Bandwidth
       15.4.6.  454 Session Not Found
       15.4.7.  455 Method Not Valid in This State
       15.4.8.  456 Header Field Not Valid for Resource
       15.4.9.  457 Invalid Range
       15.4.10. 458 Parameter Is Read-Only
       15.4.11. 459 Aggregate Operation Not Allowed
       15.4.12. 460 Only Aggregate Operation Allowed
       15.4.13. 461 Unsupported Transport
       15.4.14. 462 Destination Unreachable
       15.4.15. 463 Destination Prohibited
       15.4.16. 464 Data Transport Not Ready Yet
       15.4.17. 470 Connection Authorization Required
       15.4.18. 471 Connection Credentials not accepted
       15.4.19. 472 Failure to establish secure connection
     15.5.  Server Error 5xx
       15.5.1.  551 Option not supported
   16. Header Field Definitions
     16.1.  Accept
     16.2.  Accept-Credentials
     16.3.  Accept-Encoding
     16.4.  Accept-Language
     16.5.  Accept-Ranges
     16.6.  Allow
     16.7.  Authorization
     16.8.  Bandwidth
     16.9.  Blocksize
     16.10. Cache-Control
     16.11. Connection
     16.12. Connection-Credentials
     16.13. Content-Base
     16.14. Content-Encoding
     16.15. Content-Language
     16.16. Content-Length
     16.17. Content-Location
     16.18. Content-Type
     16.19. CSeq
     16.20. Date
     16.21. ETag
     16.22. Expires
     16.23. From
     16.24. If-Match
     16.25. If-Modified-Since
     16.26. If-None-Match
     16.27. Last-Modified
     16.28. Location
     16.29. Pipelined-Requests
     16.30. Proxy-Authenticate
     16.31. Proxy-Authorization
     16.32. Proxy-Require
     16.33. Proxy-Supported
     16.34. Public
     16.35. Range
     16.36. Referer
     16.37. Retry-After
     16.38. Require
     16.39. RTP-Info
     16.40. Scale
     16.41. Speed
     16.42. Server
     16.43. Session
     16.44. Supported
     16.45. Timestamp
     16.46. Transport
     16.47. Unsupported
     16.48. User-Agent
     16.49. Vary
     16.50. Via
     16.51. WWW-Authenticate
   17. Proxies
   18. Caching
   19. Security Framework
     19.1.  RTSP and HTTP Authentication
     19.2.  RTSP over TLS
     19.3.  Security and Proxies
       19.3.1.  Accept-Credentials
       19.3.2.  User approved TLS procedure
   20. Syntax
     20.1.  Base Syntax
     20.2.  RTSP Protocol Definition
       20.2.1.  Generic Protocol elements
       20.2.2.  Message Syntax
       20.2.3.  Header Syntax
     20.3.  SDP extension Syntax
   21. Security Considerations
     21.1.  Remote denial of Service Attack
   22. IANA Considerations
     22.1.  Feature-tags
       22.1.1.  Description
       22.1.2.  Registering New Feature-tags with IANA
       22.1.3.  Registered entries
     22.2.  RTSP Methods
       22.2.1.  Description
       22.2.2.  Registering New Methods with IANA
       22.2.3.  Registered Entries
     22.3.  RTSP Status Codes
       22.3.1.  Description
       22.3.2.  Registering New Status Codes with IANA
       22.3.3.  Registered Entries
     22.4.  RTSP Headers
       22.4.1.  Description
       22.4.2.  Registering New Headers with IANA
       22.4.3.  Registered entries
     22.5.  Transport Header Registries
       22.5.1.  Transport Protocol Specification
       22.5.2.  Transport modes
       22.5.3.  Transport Parameters
     22.6.  Cache Directive Extensions
     22.7.  Accept-Credentials
       22.7.1.  Accept-Credentials policies
       22.7.2.  Accept-Credentials hash algorithms
     22.8.  Range header formats
     22.9.  URI Schemes
       22.9.1.  The rtsp URI Scheme
       22.9.2.  The rtsps URI Scheme
       22.9.3.  The rtspu URI Scheme
     22.10. SDP attributes
   23. References
     23.1.  Normative References
     23.2.  Informative References
   Appendix A.  Examples
     A.1.  Media on Demand (Unicast)
     A.2.  Media on Demand using Pipelining
     A.3.  Media on Demand (Unicast)
     A.4.  Single Stream Container Files
     A.5.  Live Media Presentation Using Multicast
     A.6.  Capability Negotiation
   Appendix B.  RTSP Protocol State Machine
     B.1.  States
     B.2.  State variables
     B.3.  Abbreviations
     B.4.  State Tables
   Appendix C.  Media Transport Alternatives
     C.1.  RTP
       C.1.1.  AVP
       C.1.2.  AVP/UDP
       C.1.3.  AVPF/UDP
       C.1.4.  SAVP/UDP
       C.1.5.  SAVPF/UDP
       C.1.6.  RTCP usage with RTSP
     C.2.  RTP over TCP
       C.2.1.  Interleaved RTP over TCP
       C.2.2.  RTP over independent TCP
       C.2.3.  Handling NPT Jumps in the RTP Media Layer
       C.2.4.  Handling RTP Timestamps after PAUSE
       C.2.5.  RTSP / RTP Integration
       C.2.6.  Scaling with RTP
       C.2.7.  Maintaining NPT synchronization with RTP timestamps
       C.2.8.  Continuous Audio
       C.2.9.  Multiple Sources in an RTP Session
       C.2.10. Usage of SSRCs and the RTCP BYE Message During an RTSP
               Session
     C.3.  Future Additions
   Appendix D.  Use of SDP for RTSP Session Descriptions
     D.1.  Definitions
       D.1.1.  Control URI
       D.1.2.  Media Streams
       D.1.3.  Payload Type(s)
       D.1.4.  Format-Specific Parameters
       D.1.5.  Directionality of media stream
       D.1.6.  Range of Presentation
       D.1.7.  Time of Availability
       D.1.8.  Connection Information
       D.1.9.  Entity Tag
     D.2.  Aggregate Control Not Available
     D.3.  Aggregate Control Available
     D.4.  RTSP external SDP delivery
   Appendix E.  Minimal RTSP Implementation
     E.1.  Minimal Core Implementation
     E.2.  Recommended Core Implementation
     E.3.  The Basic Playback Feature Support
       E.3.1.  Client
       E.3.2.  Server
       E.3.3.  Proxy
     E.4.  Secure Transport
   Appendix F.  Requirements for Unreliable Transport of RTSP
   Appendix G.  Backwards Compatibility Considerations
     G.1.  Play Request in Play mode
     G.2.  Using Persistent Connections
   Appendix H.  Open Issues
   Appendix I.  Changes
     I.1.  Changes needing to be updated
   Appendix J.  Acknowledgements
     J.1.  Contributors
   Appendix K.  RFC Editor Consideration
   Authors' Addresses
   Intellectual Property and Copyright Statements

1.  Introduction

1.1.  Scope and Background

   This memo defines version 2.0 of the Real Time Streaming Protocol
   (RTSP 2.0), which is an application-level protocol for control over
   the delivery of data with real-time properties, typically streaming
   media.  Examples of streaming media are video on demand and live
   audio streaming.  Put simply, RTSP acts as a "network remote control"
   for multimedia servers, similar to the remote control of a TV set.

   The protocol operates between RTSP 2.0 clients and servers, but also
   supports the usage of RTSP 2.0 proxies between clients and servers.
   Clients can request information about streaming media from servers
   by asking for a description of the media, or they can use a media
   description provided externally.  Based on the media description,
   clients can request to play out the media, pause it, or stop it
   completely, as known from a regular TV remote control.  The
   requested media can consist of multiple audio and video streams that
   are delivered as time-synchronized streams from servers to clients.

   This memorandum describes the use of RTSP over a reliable,
   connection-oriented transport-level protocol such as TCP.  For
   security, TLS over a connection-oriented transport is supported.

   There is no notion of an RTSP connection in the protocol.
   Instead, an RTSP server maintains a session labeled by an identifier
   to associate groups of media streams and their states.  An RTSP
   session is not tied to a transport-level connection such as a TCP
   connection.  During a session, a client may open and close multiple
   reliable transport connections to the server to issue RTSP requests
   for that session.

   The set of streams to be controlled in an RTSP session is defined by
   a presentation description.  This memorandum does not define a
   format for the presentation description.  However, Appendix D
   describes how SDP [RFC4566] is used for this purpose.  The streams
   controlled by RTSP may use RTP [RFC3550] for their data transport,
   but the operation of RTSP does not depend on the transport mechanism
   used to carry continuous media.  RTSP is intentionally similar in
   syntax and operation to HTTP/1.1 [RFC2616] so that extension
   mechanisms to HTTP can in most cases also be applied to RTSP.

   The RTSP 2.0 protocol supports the following operations:

   Retrieval of media from media server:  The client can request a
      presentation description via RTSP DESCRIBE, HTTP, or some other
      method.  If the presentation is being multicast, the presentation
      description contains the multicast addresses and ports to be used
      for the continuous media.  If the presentation is to be sent only
      to the client via unicast, the client provides the destination.

   Invitation of a media server to a conference:  A media server can be
      "invited" to join an existing conference to play back media into
      the presentation.  This mode is useful, for example, in
      distributed teaching applications.  Several parties in the
      conference may take turns "pushing the remote control buttons".
      Note: This functionality requires application-level functionality
      external to RTSP.

   RTSP requests may be handled by proxies, tunnels and caches as in
   HTTP/1.1 [RFC2616].

1.2.  RTSP Specification Update

   This memorandum specifies RTSP 2.0, which is an update of RTSP 1.0,
   a proposed standard defined in [RFC2326].  The goal of this version
   is to correct the many flaws that have been identified in RTSP 1.0
   since its publication.  The corrections are such that backwards
   compatibility could not be maintained.  Thus a new version was
   deemed the most appropriate solution to get a more functional
   protocol.  There are no plans to revise RTSP 1.0.  Appendix I
   catalogs the changes of this version in relation to RTSP 1.0.

   RTSP 2.0 has reduced functionality compared to RTSP 1.0 and aims at
   specifying the RTSP core, the functionality and rules for
   extensions, and the basic interaction with the media delivery
   protocol RTP [RFC3550].  Any other functionality would need to be
   published as extension documents.  This specification provides rules
   for such extensions and defines registries to avoid naming
   collisions.

1.3.  Notational Conventions

   Since many of the definitions and syntax are identical to HTTP/1.1,
   this specification only points to the section where they are defined
   rather than copying them.  For brevity, [HX.Y] is to be taken to
   refer to Section X.Y of the current HTTP/1.1 specification
   ([RFC2616]).

   All the mechanisms specified in this document are described in both
   prose and the Augmented Backus-Naur Form (ABNF) described in detail
   in [RFC4234].

   Indented and smaller-type paragraphs are used to provide informative
   background and motivation.  This is intended to give readers who
   were not involved with the formulation of the specification an
   understanding of why things are the way they are in RTSP.

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

   The word "unspecified" is used to indicate functionality or features
   that are not defined in this specification.  Such functionality
   cannot be used in a standardized manner without further definition
   in an extension specification to RTSP.

1.4.  Terminology

   Some of the terminology has been adopted from HTTP/1.1 [RFC2616].
   Terms not listed here are defined as in HTTP/1.1.

   Aggregate control:  The concept of controlling multiple streams
      using a single timeline, generally maintained by the server.  A
      client, for example, uses aggregate control when it issues a
      single play or pause message to simultaneously control both the
      audio and video in a movie.  A session which is under aggregate
      control is referred to as an aggregated session.

   Aggregate control URI:  The URI used in an RTSP request to refer to
      and control an aggregated session.  It normally, but not always,
      corresponds to the presentation URI specified in the session
      description.  See Section 13.3 for more information.

   Conference:  A multiparty, multimedia presentation, where "multi"
      implies greater than or equal to one.

   Client:  The client requests media service from the media server.

   Connection:  A transport layer virtual circuit established between
      two programs for the purpose of communication.

   Container file:  A file which may contain multiple media streams
      that often constitute a presentation when played together.  The
      concept of a container file is not embedded in the protocol.
      However, RTSP servers may offer aggregate control on the media
      streams within these files.

   Continuous media:  Data where there is a timing relationship between
      source and sink; that is, the sink needs to reproduce the timing
      relationship that existed at the source.  The most common
      examples of continuous media are audio and motion video.
      Continuous media can be real-time (interactive or
      conversational), where there is a "tight" timing relationship
      between source and sink, or streaming (playback), where the
      relationship is less strict.

   Entity:  The information transferred as the payload of a request or
      response.  An entity consists of meta-information in the form of
      entity-header fields and content in the form of an entity-body,
      as described in Section 9.

   Feature-tag:  A tag representing a certain set of functionality,
      i.e. a feature.

   IRI:  An Internationalized Resource Identifier is the same as a URI,
      except that it allows characters from the whole Universal
      Character Set (Unicode/ISO 10646) rather than US-ASCII only.  See
      [RFC3987] for more information.

   Live:  Normally used to describe a presentation or session with
      media coming from an ongoing event.  This generally results in
      the session having an unbounded or only loosely defined duration,
      and sometimes no seek operations are possible.

   Media initialization:  Datatype/codec specific initialization.  This
      includes such things as clock rates, color tables, etc.  Any
      transport-independent information which is required by a client
      for playback of a media stream occurs in the media initialization
      phase of stream setup.

   Media parameter:  Parameter specific to a media type that may be
      changed before or during stream playback.

   Media server:  The server providing playback services for one or
      more media streams.  Different media streams within a
      presentation may originate from different media servers.  A media
      server may reside on the same host or on a different host from
      which the presentation is invoked.

   Media server indirection:  Redirection of a media client to a
      different media server.

   (Media) stream:  A single media instance, e.g., an audio stream or a
      video stream as well as a single whiteboard or shared application
      group.  When using RTP, a stream consists of all RTP and RTCP
      packets created by a source within an RTP session.

   Message:  The basic unit of RTSP communication, consisting of a
      structured sequence of octets matching the syntax defined in
      Section 20 and transmitted over a connection or a connectionless
      transport.

   Non-Aggregated Control:  Control of a single media stream.  This is
      only possible in RTSP sessions with a single media stream.

   Participant:  Member of a conference.  A participant may be a
      machine, e.g., a playback server.

   Presentation:  A set of one or more streams presented to the client
      as a complete media feed and described by a presentation
      description as defined below.  Presentations with more than one
      media stream are often handled in RTSP under aggregate control.

   Presentation description:  A presentation description contains
      information about one or more media streams within a
      presentation, such as the set of encodings, network addresses and
      information about the content.  Other IETF protocols such as SDP
      ([RFC4566]) use the term "session" for a presentation.  The
      presentation description may take several different formats,
      including but not limited to the session description protocol
      format, SDP.

   Response:  An RTSP response.  If an HTTP response is meant, that is
      indicated explicitly.

   Request:  An RTSP request.  If an HTTP request is meant, that is
      indicated explicitly.

   Request-URI:  The URI used in a request to indicate the resource on
      which the request is to be performed.

   RTSP agent:  Refers to either an RTSP client, an RTSP server, or an
      RTSP proxy.  In this specification, there are many capabilities
      that are common to these three entities, such as the capability
      to send requests or receive responses.  This term will be used
      when describing functionality that is applicable to all three of
      these entities.

   RTSP session:  A stateful abstraction upon which the main control
      methods of RTSP operate.
      An RTSP session is a server entity; it is created, maintained and
      destroyed by the server.  It is established by an RTSP server
      upon the completion of a successful SETUP request (when a 200 OK
      response is sent) and is labelled with a session identifier at
      that time.  The session exists until timed out by the server or
      explicitly removed by a TEARDOWN request.  An RTSP session is a
      stateful entity; an RTSP server maintains an explicit session
      state machine (see Appendix B) where most state transitions are
      triggered by client requests.  The existence of a session implies
      the existence of state about the session's media streams and
      their respective transport mechanisms.  A given session can have
      one or more media streams associated with it.  An RTSP server
      uses the session to aggregate control over multiple media
      streams.

   Transport initialization:  The negotiation of transport information
      (e.g., port numbers, transport protocols) between the client and
      the server.

   URI:  Uniform Resource Identifier, see [RFC3986].  The URIs used in
      RTSP are generally URLs as they give a location for the resource.
      As URLs are a subset of URIs, they will be referred to as URIs to
      cover also the cases when an RTSP URI would not be a URL.

   URL:  Uniform Resource Locator, a URI which identifies the resource
      through its primary access mechanism, rather than identifying the
      resource by name or by some other attribute(s) of that resource.

2.  RTSP Introduction

2.1.  Protocol Properties

   RTSP has the following properties:

   Extendable:  New methods and parameters can be easily added to RTSP.

   Easy to parse:  RTSP can be parsed by standard HTTP or MIME parsers.
Secure: RTSP re-uses web security mechanisms, either at the transport level (TLS, [RFC4346]) or within the protocol itself. All HTTP authentication mechanisms such as basic ([RFC2616]) and digest authentication ([RFC2617]) are directly applicable.

Transport-independent: RTSP does not preclude the use of an unreliable datagram protocol such as UDP ([RFC0768]), as it would be possible to implement application-level reliability. The use of a connectionless datagram protocol such as UDP requires additional definition that may be provided as extensions to the core RTSP specification. The reliable stream protocol TCP ([RFC0793]) and the secured reliable stream protocol TLS over TCP [RFC4346] are the currently defined transport protocols for RTSP messages.

Media-delivery protocol independent: The operation of RTSP does not depend on the transport mechanism used to carry continuous media. While most real-time media will use RTP as a transport protocol, RTSP does not preclude the use of other protocols such as MPEG-2 [ISO.13818-1.2000]. The use of other protocols requires additional definition that may be provided as extensions to the core RTSP specification.

Multi-server capable: Each media stream within a presentation can reside on a different server. The client automatically establishes several concurrent control sessions with the different media servers. Media synchronization in those cases is performed at the transport level.

Separation of stream control and conference initiation: Stream control is divorced from inviting a media server to a conference. In particular, SIP [RFC3261] or H.323 [ITU.H323.1996] may be used to invite a server to a conference; however, the exact procedures are unspecified.

Suitable for professional applications: RTSP supports frame-level accuracy through SMPTE time stamps to allow remote digital editing.

Presentation description neutral: The protocol does not impose a particular presentation description or metafile format and can convey the type of format to be used. However, the presentation description is required to contain at least one RTSP URI.

Proxy and firewall friendly: The protocol should be readily handled by both application and transport-layer (SOCKS [RFC1961]) firewalls. A firewall may need to understand the SETUP method to open a "hole" for the media stream.

HTTP-friendly: Where sensible, RTSP reuses HTTP concepts, so that the existing infrastructure can be reused. This infrastructure includes PICS (Platform for Internet Content Selection [W3C.REC-PICS-services] [W3C.REC-PICS-labels]) for associating labels with content. However, RTSP does not just add methods to HTTP since controlling continuous media requires server state in most cases.

Appropriate server control: If a client can start a stream, it needs to be able to stop a stream. Servers should not start streaming to clients in such a way that clients cannot stop the stream.

Transport negotiation: The client can negotiate the transport method prior to actually needing to process a continuous media stream.

2.2. RTSP's Relationship to HTTP

RTSP is intentionally similar in syntax and operation to HTTP/1.1 [RFC2616] so that extension mechanisms to HTTP can in most cases also be applied to RTSP. However, RTSP differs in a number of important aspects from HTTP:

* RTSP introduces a number of new methods and has a different protocol identifier.

* RTSP has the notion of a session built into the protocol.

* An RTSP server needs to maintain state in almost all cases, as opposed to the stateless nature of HTTP.

* Both an RTSP server and client can issue requests.

* Data is usually carried out-of-band by a different protocol.
Session descriptions returned in a DESCRIBE response (see Section 13.2) and interleaving of RTP with RTSP over TCP are exceptions to this rule (see Section 14).

* RTSP is defined to use ISO 10646 (UTF-8) rather than ISO 8859-1, consistent with HTML internationalization efforts [RFC2070].

* The Request-URI always contains the absolute URI. Because of backward compatibility with a historical blunder, HTTP/1.1 [RFC2616] carries only the absolute path in the request and puts the host name in a separate header field. This makes "virtual hosting" easier, where a single host with one IP address hosts several document trees.

2.3. Extending RTSP

Since not all media servers have the same functionality, media servers by necessity will support different sets of requests. For example:

o A server may not be capable of seeking (absolute positioning) if it is to support live events only.

o Some servers may not support setting stream parameters and thus not support GET_PARAMETER and SET_PARAMETER.

o Some servers may support an RTSP extension.

It is up to the creators of presentation descriptions not to ask the impossible of a server. This situation is similar in HTTP/1.1 [RFC2616], where the methods described in [H19.5] are not likely to be supported across all servers.

RTSP can be extended in three ways, listed here in order of the magnitude of changes supported:

o Existing methods can be extended with new parameters, e.g., headers, as long as these parameters can be safely ignored by the recipient. If the client needs negative acknowledgement when a method extension is not supported, a tag corresponding to the extension may be added in the field of the Require or Proxy-Require headers (see Section 16.32).

o New methods can be added.
If the recipient of the message does not understand the request, it MUST respond with error code 501 (Not Implemented) so that the sender can avoid using this method again. A client may also use the OPTIONS method to inquire about methods supported by the server. The server MUST list the methods it supports using the Public response header.

o A new version of the protocol can be defined, allowing almost all aspects (except the position of the protocol version number) to change. A new version of the protocol MUST be registered through an IETF Standards Track document.

The basic capability discovery mechanism can be used both to discover support for a certain feature and to ensure that a feature is available when performing a request. For a detailed explanation, see Section 11.

2.4. Overall Operation

Each presentation and media stream is identified by an RTSP URI. The overall presentation and the properties of the media the presentation is composed of are defined by a presentation description file, the format of which is outside the scope of this specification. The presentation description file may be obtained by the client using HTTP or other means such as email and may not necessarily be stored on the media server.

For the purposes of this specification, a presentation description is assumed to describe one or more presentations, each of which maintains a common time axis. For simplicity of exposition and without loss of generality, it is assumed that the presentation description contains exactly one such presentation. A presentation may contain several media streams.

The presentation description file contains a description of the media streams making up the presentation, including their encodings, language, and other parameters that enable the client to choose the most appropriate combination of media.
In this presentation description, each media stream that is individually controllable by RTSP is identified by an RTSP URI, which points to the media server handling that particular media stream and names the stream stored on that server. Several media streams can be located on different servers; for example, audio and video streams can be split across servers for load sharing. The description also enumerates which transport methods the server is capable of.

Besides the media parameters, the network destination address and port need to be determined. Several modes of operation can be distinguished:

Unicast: The media is transmitted to the source of the RTSP request or the requested destination, with the port number chosen by the client. Alternatively, the media is transmitted on the same reliable stream as RTSP.

Multicast, server chooses address: The media server picks the multicast address and port. This is the typical case for a live or near-media-on-demand transmission.

Multicast, client chooses address: If the server is to participate in an existing multicast conference, the multicast address, port and encryption key are given by the conference description, established by means outside the scope of this specification, for example by a SIP-created conference.

2.5. RTSP States

RTSP controls a stream which may be sent via a separate protocol, independent of the control channel. For example, RTSP control may be transported on a TCP connection while the media data is conveyed via UDP. Thus, data delivery continues even if no RTSP requests are received by the media server. Also, during its lifetime a single media stream may be controlled by RTSP requests issued sequentially on different TCP connections. Therefore, the server needs to maintain "session state" to be able to correlate RTSP requests with a stream.
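The correlation described above can be sketched as a server-side table keyed by session identifier. This is only an illustration under stated assumptions: the class and method names (SessionTable, setup, lookup, teardown), the "ready" state label, and the use of Python's secrets module to produce the identifier are hypothetical choices for the example and are not defined by this specification.

```python
import secrets


class SessionTable:
    """Hypothetical server-side table mapping session identifiers to stream state."""

    def __init__(self):
        self._sessions = {}

    def setup(self, stream_uri):
        # Assumption: 16 random bytes, encoded as 22 URL-safe characters.
        session_id = secrets.token_urlsafe(16)
        self._sessions[session_id] = {"uri": stream_uri, "state": "ready"}
        return session_id

    def lookup(self, session_id):
        # The lookup depends only on the identifier, not on the TCP
        # connection that carried the request.
        return self._sessions.get(session_id)

    def teardown(self, session_id):
        # Returns True if a session existed and was removed.
        return self._sessions.pop(session_id, None) is not None
```

Because the table is indexed by identifier rather than by connection, a request that arrives on a new TCP connection can still be matched to the stream it controls.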
The state transitions are described in Appendix A.

Many methods in RTSP do not contribute to state. However, the following play a central role in defining the allocation and usage of stream resources on the server: SETUP, PLAY, PAUSE, REDIRECT, and TEARDOWN.

SETUP: Causes the server to allocate resources for a stream and create an RTSP session.

PLAY: Starts data transmission on a stream allocated via SETUP.

PAUSE: Temporarily halts a stream without freeing server resources.

REDIRECT: Indicates that the session should be moved to a new server or location.

TEARDOWN: Frees resources associated with the stream. The RTSP session ceases to exist on the server.

RTSP methods that contribute to state use the Session header field (Section 16.44) to identify the RTSP session whose state is being manipulated. The server generates session identifiers in response to SETUP requests (Section 13.3).

2.6. Relationship with Other Protocols

RTSP has some overlap in functionality with HTTP. It also may interact with HTTP in that the initial contact with streaming content will often be made through a web page. The current protocol specification aims to allow different hand-off points between a web server and the media server implementing RTSP. For example, the presentation description can be retrieved using HTTP or RTSP, which reduces round trips in web-browser-based scenarios, yet also allows for stand-alone RTSP servers and clients which do not rely on HTTP at all.

However, RTSP differs fundamentally from HTTP in that most data delivery takes place out-of-band in a different protocol. HTTP is an asymmetric protocol where the client issues requests and the server responds. In RTSP, both the media client and media server can issue requests.
RTSP requests are also stateful; they may set parameters and continue to control a media stream long after the request has been acknowledged.

Re-using HTTP functionality has advantages in at least two areas, namely security and proxies. The requirements are very similar, so having the ability to adopt HTTP work on caches, proxies and authentication is valuable.

RTSP assumes the existence of a presentation description format that can express both static and temporal properties of a presentation containing several media streams. Session Description Protocol (SDP) [RFC4566] is generally the format of choice; however, RTSP is not bound to it. For data delivery, most real-time media will use RTP as a transport protocol. While RTSP works well with RTP, it is not tied to RTP.

3. RTSP Use Cases

This section describes the most important use cases considered for RTSP. They are listed in descending order of importance with regard to ensuring that all necessary functionality is present. This specification fully supports only the first two. Even in these first two cases, there are special cases or exceptions that are not supported without extensions, e.g., the redirection of media to an address other than that of the controlling entity.

3.1. On-demand Playback of Stored Content

An RTSP capable server stores content suitable for being streamed to a client. A client desiring playback of any of the stored content uses RTSP to set up the media transport required to deliver the desired content. RTSP is then used to initiate, halt and manipulate the actual transmission (playout) of the content. RTSP is also required to provide necessary description and synchronization information for the content.

The above high level description can be broken down into a number of functions that RTSP needs to be capable of.
Presentation Description: Provide initialization information about the presentation (content); for example, which media codecs are needed for the content. Other information that is important includes the number of media streams the presentation contains, the transport protocols used for the media streams, and identifiers for these media streams. This information is required before setup of the content is possible and to determine if the client is even capable of using the content. This information need not be sent using RTSP; other external protocols can be used to transmit the presentation descriptions. Two good examples are the use of HTTP [RFC2616] or email to fetch or receive presentation descriptions like SDP [RFC4566].

Setup: Set up some or all of the media streams in a presentation. The setup itself consists of selecting the protocol for media transport and the necessary parameters for the protocol, like addresses and ports.

Control of Transmission: After the necessary media streams have been established, the client can request the server to start transmitting the content. The client must be allowed to start or stop the transmission of the content at arbitrary times. The client must also be able to start the transmission at any point in the timeline of the presentation.

Synchronization: For media transport protocols like RTP [RFC3550] it might be beneficial to carry synchronization information within RTSP. This may be due to either the lack of inter-media synchronization within the protocol itself, or the potential delay before the synchronization is established (which is the case for RTP when using RTCP).

Termination: Terminate the established contexts.

For this use case there are a number of assumptions about how it works.
These are:

On-Demand content: The content is stored at the server and can be accessed at any time during a time period when it is intended to be available.

Independent sessions: A server is capable of serving a number of clients simultaneously, including from the same piece of content at different points in that presentation's timeline.

Unicast Transport: Content for each individual client is transmitted to them using unicast traffic.

It is also possible to redirect the media traffic to a different destination than that of the entity controlling the traffic. However, allowing this without appropriate mechanisms for checking that the destination approves of this allows for distributed denial of service (DDoS) attacks.

3.2. Unicast distribution of Live Content

This use case is similar to the above on-demand content case (see Section 3.1); the difference is the nature of the content itself. Live content is continuously distributed as it becomes available from a source; i.e., the main difference from on-demand is that one starts distributing content before the end of it has become available to the server.

In many cases the consumer of live content is only interested in consuming what is actually happening "now"; i.e., very similar to broadcast TV. However, in this case it is assumed that there exists no broadcast or multicast channel to the users, and instead the server functions as a distribution node, sending the same content to multiple receivers, using unicast traffic between server and client. This unicast traffic and the transport parameters are individually negotiated for each receiving client.

Another aspect of live content is that it often has a very limited time of availability, as it is only available for the duration of the event the content covers.
An example of such live content could be a music concert which lasts 2 hours and starts at a predetermined time. Thus there is a need to announce when and for how long the live content is available.

In some cases, the server providing live content may be saving some or all of the content to allow clients to pause the stream and resume it from the paused point, or to "rewind" and play continuously from a point earlier than the live point. Hence, this use case does not necessarily exclude playing from other than the live point of the stream, playing with scales other than 1.0, etc.

3.3. On-demand Playback using Multicast

It is possible to use RTSP to request that media be delivered to a multicast group. The entity setting up the session (the controller) will then control when and what media is delivered to the group. This use case has some potential for denial of service attacks by flooding a multicast group. Therefore, a mechanism is needed to indicate that the group actually accepts the traffic from the RTSP server.

An open issue in this use case is how one ensures that all receivers listening to the multicast or broadcast receive the session presentation configuring the receivers.

3.4. Inviting an RTSP server into a conference

If one has an established conference or group session, it is possible to have an RTSP server distribute media to the whole group. Transmission to the group is simplest when controlled by a single participant or leader of the conference. Shared control might be possible, but would require further investigation and possibly extensions.

This use case assumes that there exists either multicast or a conference focus that redistributes media to all participants.

This use case is intended to be able to handle the following scenario: A conference leader or participant (hereafter called the controller) has some pre-stored content on an RTSP server that he wants to share with the group.
The controller sets up an RTSP session at the streaming server for this content and retrieves the session description for the content. The destination for the media content is set to the shared multicast group or conference focus. When desired by the controller, he/she can start and stop the transmission of the media to the conference group.

There are several issues with this use case that are not solved by this core specification for RTSP:

Denial of service: To prevent an RTSP server from being an unknowing participant in a denial of service attack, the server needs to be able to verify the destination's acceptance of the media. Such a mechanism to verify the approval of received media does not yet exist; instead, only policies can be used, which can be made to work in controlled environments.

Distributing the presentation description to all participants in the group: To enable a media receiver to correctly decode the content, the media configuration information needs to be distributed reliably to all participants. This will most likely require support from an external protocol.

Passing control of the session: If it is desired to pass control of the RTSP session between the participants, some support will be required by an external protocol to exchange state information and possibly floor control of who is controlling the RTSP session.

If there is interest in this use case, further work is required on the necessary extensions.

3.5. Live Content using Multicast

This use case in its simplest form does not require any use of RTSP at all; this is what multicast conferences being announced with SAP and SDP are intended to handle. However, in use cases where more advanced features like access control to the multicast session are desired, RTSP could be used for session establishment.
A client desiring to join a live multicast media session with cryptographic (encryption) access control could use RTSP in the following way. The source of the session announces the session and gives all interested parties an RTSP URI. The client connects to the server and requests the presentation description, allowing configuration for reception of the media. In this step it is possible for the client to use secured transport and any desired level of authentication; for example, for billing or access control. An RTSP link also allows for load balancing between multiple servers.

If these were the only goals, they could be achieved by simply using HTTP. However, for cases where the sender would like to keep track of each individual receiver of a session, and possibly use the session as a side channel for distributing key-updates or other information on a per-receiver basis, and the full set of receivers is not known prior to the session start, the state establishment that RTSP provides can be beneficial. In this case a client would establish an RTSP session for this multicast group with the RTSP server. The RTSP server will not transmit any media, but instead will point to the multicast group. The client and server will be able to keep the session alive for as long as the receiver participates in the session, thus enabling, for example, the server to push updates to the client.

This use case will most likely not be able to be implemented without some extensions to the server-to-client push mechanism. Here a method like ANNOUNCE (see [RFC2326]) might be suitable; however, it will require an RTSP extension to revive the method.

4. Protocol Parameters

4.1.
RTSP Version

HTTP specification section [H3.1] applies, with "HTTP" replaced by "RTSP". This specification defines version 2.0 of RTSP.

4.2. RTSP IRI and URI

RTSP 2.0 defines and registers three URI schemes: "rtsp", "rtsps" and "rtspu". The usage of the last, "rtspu", is unspecified in RTSP 2.0, and is defined here to register and reserve the URI scheme that is defined in RTSP 1.0. The "rtspu" scheme indicates undefined transport of the RTSP messages over unreliable transport (UDP). The syntax of "rtsp" and "rtsps" URIs has been changed from RTSP 1.0.

This specification also defines the format of the RTSP IRI [RFC3987] that can be used as RTSP resource identifiers and locators, in web pages, user interfaces, on paper, etc. However, the RTSP request message format only allows usage of the absolute URI format. The RTSP IRI format SHALL use the rules and transformation for IRIs defined in [RFC3987]. This way RTSP 2.0 URIs for requests can be produced from an RTSP IRI.

The RTSP IRI and URI are both syntax restricted compared to the generic syntax defined in [RFC3986] and [RFC3987]:

o An absolute URI requires the authority part; i.e., a host identity must be provided.

o Parameters in the path element are prefixed with the reserved separator ";".

The RTSP URI and IRI are case sensitive, with the exception of those parts that [RFC3986] and [RFC3987] define as case-insensitive; for example, the scheme and host part.

The fragment identifier is used as defined in sections 3.5 and 4.3 of [RFC3986], i.e., the fragment is to be stripped from the URI by the requestor and not included in the request. The user agent also needs to interpret the value of the fragment based on the media type the request relates to; i.e., the media type indicated in the Content-Type header in the response to DESCRIBE.

The syntax of any URI query string is unspecified and responder (usually the server) specific.
The query is, from the requestor's perspective, an opaque string and needs to be handled as such.

The URI scheme "rtsp" requires that commands are issued via a reliable protocol (within the Internet, TCP), while the scheme "rtsps" identifies a reliable transport using secure transport (TLS [RFC4346], see Section 19).

For the scheme "rtsp", if no port number is provided in the authority part of the URI, port number 554 SHALL be used. For the scheme "rtsps", the TCP port 322 is registered and SHALL be assumed.

A presentation or a stream is identified by a textual media identifier, using the character set and escape conventions of URIs [RFC3986]. URIs may refer to a stream or an aggregate of streams; i.e., a presentation. Accordingly, requests described in Section 13 can apply to either the whole presentation or an individual stream within the presentation. Note that some request methods can only be applied to streams, not presentations, and vice versa.

For example, the RTSP URI:

   rtsp://media.example.com:554/twister/audiotrack

may identify the audio stream within the presentation "twister", which can be controlled via RTSP requests issued over a TCP connection to port 554 of host media.example.com.

Also, the RTSP URI:

   rtsp://media.example.com:554/twister

identifies the presentation "twister", which may be composed of audio and video streams, but could also be something else like a random media redirector.

This does not imply a standard way to reference streams in URIs. The presentation description defines the hierarchical relationships in the presentation and the URIs for the individual streams. A presentation description may name a stream "a.mov" and the whole presentation "b.mov".

The path components of the RTSP URI are opaque to the client and do not imply any particular file system structure for the server.
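As an illustration of the scheme and port rules above, the following minimal sketch derives the connection endpoint from an absolute RTSP URI. The function name rtsp_endpoint and the choice to reject "rtspu" (whose usage is unspecified) are assumptions made for this example; the path is deliberately left uninterpreted, since it is opaque to the client.

```python
from urllib.parse import urlsplit

# Default ports stated in the text: 554 for "rtsp", 322 for "rtsps".
DEFAULT_PORTS = {"rtsp": 554, "rtsps": 322}


def rtsp_endpoint(uri):
    """Return (host, port, use_tls) for an absolute "rtsp" or "rtsps" URI."""
    parts = urlsplit(uri)
    if parts.scheme not in DEFAULT_PORTS:
        # Assumption: "rtspu" and unknown schemes are rejected in this sketch.
        raise ValueError(f"unsupported scheme: {parts.scheme!r}")
    if not parts.hostname:
        # An absolute RTSP URI requires the authority part.
        raise ValueError("an absolute RTSP URI requires a host")
    port = parts.port if parts.port is not None else DEFAULT_PORTS[parts.scheme]
    return parts.hostname, port, parts.scheme == "rtsps"
```

For instance, rtsp_endpoint("rtsp://media.example.com/twister") yields the host media.example.com with the default port 554 and no TLS.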
This decoupling also allows presentation descriptions to be used with non-RTSP media control protocols simply by replacing the scheme in the URI.

4.3. Session Identifiers

Session identifiers are strings of arbitrary length, with a minimum length of 8 characters. A session identifier MUST be cryptographically random (see [RFC4086]) and MUST be at least 8 characters long (which can contain a maximum of 48 bits of entropy) to make guessing it more difficult. It is RECOMMENDED that it contain 128 bits of entropy, i.e., approximately 22 characters from a high-quality generator (see Section 21). However, it needs to be noted that the session identifier does not provide any security against session hijacking unless it is kept confidential between client, server and trusted proxies.

4.4. SMPTE Relative Timestamps

A SMPTE relative timestamp expresses time relative to the start of the clip. Relative timestamps are expressed as SMPTE time codes for frame-level access accuracy. The time code has the format hours:minutes:seconds:frames.subframes, with the origin at the start of the clip. The default smpte format is the "SMPTE 30 drop" format, with a frame rate of 29.97 frames per second. Other SMPTE codes MAY be supported (such as "SMPTE 25") through alternative values of "smpte-type". For SMPTE 30, the "frames" field in the time value can assume the values 0 through 29. The difference between 30 and 29.97 frames per second is handled by dropping the first two frame indices (values 00 and 01) of every minute, except every tenth minute. If the frame and the subframe values are zero, they may be omitted. Subframes are measured in one-hundredths of a frame.

Examples:

   smpte=10:12:33:20-
   smpte=10:07:33-
   smpte=10:07:00-10:07:33:05.01
   smpte-25=10:07:00-10:07:33:05.01

4.5.
Normal Play Time

Normal play time (NPT) indicates the stream absolute position relative to the beginning of the presentation, not to be confused with the Network Time Protocol (NTP) [RFC1305]. The timestamp consists of a decimal fraction. The part left of the decimal point may be expressed in either seconds or hours, minutes, and seconds. The part right of the decimal point measures fractions of a second.

The beginning of a presentation corresponds to 0.0 seconds. Negative values are not defined. The special constant "now" is defined as the current instant of a live event. It MAY only be used for live events, and SHALL NOT be used for on-demand (i.e., non-live) content.

NPT is defined as in DSM-CC [ISO.13818-6.1995]: "Intuitively, NPT is the clock the viewer associates with a program. It is often digitally displayed on a VCR. NPT advances normally when in normal play mode (scale = 1), advances at a faster rate when in fast scan forward (high positive scale ratio), decrements when in scan reverse (high negative scale ratio) and is fixed in pause mode. NPT is (logically) equivalent to SMPTE time codes."

Examples:

   npt=123.45-125
   npt=12:05:35.3-
   npt=now-

The syntax conforms to ISO 8601 [ISO.8601.2000]. The npt-sec notation is optimized for automatic generation, the npt-hhmmss notation for consumption by human readers. The "now" constant allows clients to request to receive the live feed rather than the stored or time-delayed version. This is needed since neither absolute time nor zero time is appropriate for this case.

4.6. Absolute Time

Absolute time is expressed as ISO 8601 [ISO.8601.2000] timestamps, using UTC (GMT). Fractions of a second may be indicated.

Example for November 8, 1996 at 14h37 and 20 and a quarter seconds UTC:

   19961108T143720.25Z

4.7. Feature-tags

Feature-tags are unique identifiers used to designate features in RTSP.
These tags are used in Require (Section 16.38), Proxy-Require (Section 16.32), Proxy-Supported (Section 16.33), and Unsupported (Section 16.47) header fields.

A feature-tag definition MUST indicate which combination of clients, servers or proxies it applies to.

The creator of a new RTSP feature-tag should either prefix the feature-tag with a reverse domain name (e.g., "com.example.mynewfeature" is an apt name for a feature whose inventor can be reached at "example.com"), or register the new feature-tag with the Internet Assigned Numbers Authority (IANA) (see Section 22).

The usage of feature-tags is further described in Section 11, which deals with capability handling.

4.8. Entity Tags

Entity tags are opaque strings that are used to compare two entities from the same resource, for example in caches or to optimize setup after a redirect. Further explanation is present in [H3.11]. For an explanation of how to compare entity tags see [H13.3]. Entity tags can be carried in the ETag header (see Section 16.21) or in SDP (see Appendix D.1.9).

Entity tags are used in RTSP to make some methods conditional. The methods are made conditional through the inclusion of headers, see Section 16.24 and Section 16.26. Note that RTSP entity tags apply to the complete presentation; i.e., both the session description and the individual media streams. Thus entity tags can be used to verify at setup time after a redirect that the same session description applies to the media at the new location using the If-Match header.

5. RTSP Message

RTSP is a text-based protocol and uses the ISO 10646 character set in UTF-8 encoding (RFC 3629 [RFC3629]). Lines SHALL be terminated by CRLF.
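The text-based framing above can be illustrated with a minimal sketch that serializes a body-less request as UTF-8 octets with CRLF line endings. The function name serialize_request is hypothetical, and the HTTP-style request line ("method, Request-URI, version") is an assumption based on the stated similarity to HTTP; the authoritative syntax is the one defined in Section 20.

```python
def serialize_request(method, request_uri, headers):
    """Serialize a body-less request as UTF-8 octets with CRLF line endings."""
    # Assumption: HTTP-style request line with the version string "RTSP/2.0".
    lines = [f"{method} {request_uri} RTSP/2.0"]
    lines.extend(f"{name}: {value}" for name, value in headers.items())
    # Every line ends with CRLF; an empty line closes the header section.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("utf-8")
```

A call such as serialize_request("OPTIONS", "rtsp://media.example.com:554/twister", {"CSeq": "1"}) produces three CRLF-terminated lines: the request line, the CSeq header, and the empty line that ends the headers.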
Text-based protocols make it easier to add optional parameters in a self-describing manner. Since the number of parameters and the frequency of commands are low, processing efficiency is not a concern. Text-based protocols, if done carefully, also allow easy implementation of research prototypes in scripting languages such as Tcl, Visual Basic and Perl.

The ISO 10646 character set avoids tricky character set switching, but is invisible to the application as long as US-ASCII is being used. This is also the encoding used for RTCP [RFC3550]. ISO 8859-1 translates directly into Unicode with a high-order octet of zero. ISO 8859-1 characters with the most-significant bit set are represented as 1100001x 10xxxxxx. (See RFC 3629 [RFC3629])

Requests contain methods, the object the method is operating upon and parameters to further describe the method. Methods are idempotent unless otherwise noted. Methods are also designed to require little or no state maintenance at the media server.

5.1. Message Types

See [H4.1].

5.2. Message Headers

See [H4.2].

5.3. Message Body

See [H4.3]. Unlike HTTP, the presence of a message-body in either a request or a response MUST be signaled by the inclusion of a Content-Length header field (see Section 16.16).

5.4. Message Length

When a message body is included with a message, the length of that body is determined by one of the following (in order of precedence):

1.  Any response message which MUST NOT include a message body (such as the 1xx, 204, and 304 responses) is always terminated by the first empty line after the header fields, regardless of the entity-header fields present in the message. (Note: An empty line is a line with nothing preceding the CRLF.)

2.  If a Content-Length header field (Section 16.16) is present, its value in bytes represents the length of the message-body.
    If this header field is not present, a value of zero is assumed.

Unlike an HTTP message, an RTSP message MUST contain a Content-Length header field whenever it contains a message body. Note that RTSP does not support the HTTP/1.1 "chunked" transfer coding (see [H3.6.1]).

Given the moderate length of presentation descriptions returned, the server should always be able to determine its length, even if it is generated dynamically, making the chunked transfer encoding unnecessary.

6. General Header Fields

See [H4.5], except that the Pragma, Trailer, Transfer-Encoding, Upgrade, and Warning headers are not defined. RTSP further defines the CSeq, Pipelined-Requests, Proxy-Supported and Timestamp headers. The general headers are listed in Table 1:

+--------------------+--------------------+
| Header Name        | Defined in Section |
+--------------------+--------------------+
| Cache-Control      | Section 16.10      |
| Connection         | Section 16.11      |
| CSeq               | Section 16.19      |
| Date               | Section 16.20      |
| Pipelined-Requests | Section 16.29      |
| Proxy-Supported    | Section 16.33      |
| Supported          | Section 16.44      |
| Timestamp          | Section 16.45      |
| Via                | Section 16.50      |
+--------------------+--------------------+

Table 1: The general headers used in RTSP

7. Request

A request message uses the format outlined below regardless of the direction of a request, client to server or server to client:

o  Request line, containing the method to be applied to the resource, the identifier of the resource, and the protocol version in use;

o  Zero or more Header lines, that can be of the following types: general (Section 6), request (Section 7.2), or entity (Section 9.1);

o  One empty line (CRLF) to indicate the end of the header section;

o  Optionally a message body (entity), consisting of one or more lines. The length of the message body in bytes is indicated by the Content-Length entity header.

7.1. Request Line

The request line provides the key information about the request: what method, on what resources and using which RTSP version. The methods that are defined by this specification are listed in Table 2.

+---------------+--------------------+
| Method        | Defined in Section |
+---------------+--------------------+
| DESCRIBE      | Section 13.2       |
| GET_PARAMETER | Section 13.7       |
| OPTIONS       | Section 13.1       |
| PAUSE         | Section 13.5       |
| PLAY          | Section 13.4       |
| REDIRECT      | Section 13.9       |
| SETUP         | Section 13.3       |
| SET_PARAMETER | Section 13.8       |
| TEARDOWN      | Section 13.6       |
+---------------+--------------------+

Table 2: The RTSP Methods

The syntax of the RTSP request line is the following:

   <Method> SP <Request-URI> SP <RTSP-Version> CRLF

Note: This syntax cannot be freely changed in future versions of RTSP. This line needs to remain parsable by older RTSP implementations since it indicates the RTSP version of the message.

In contrast to HTTP/1.1 [RFC2616], RTSP requests identify the resource through an absolute RTSP URI (scheme, host, and port) (see Section 4.2) rather than just the absolute path.

HTTP/1.1 requires servers to understand the absolute URI, but clients are supposed to use the Host request header.
This is purely needed for backward-compatibility with HTTP/1.0 servers, a consideration that does not apply to RTSP.

An asterisk "*" can be used instead of an absolute URI in the Request-URI part to indicate that the request does not apply to a particular resource, but to the server or proxy itself, and is only allowed when the request method does not necessarily apply to a resource.

For example:

   OPTIONS * RTSP/2.0

An OPTIONS in this form will determine the capabilities of the server or the proxy that first receives the request. If the capability of the specific server needs to be determined, without regard to the capability of an intervening proxy, the server should be addressed explicitly with an absolute URI that contains the server's address.

For example:

   OPTIONS rtsp://example.com RTSP/2.0

7.2. Request Header Fields

The RTSP headers in Table 3 can be included in a request, as request headers, to modify the specifics of the request. Some of these headers may also be used in the response to a request, as response headers, to modify the specifics of a response (Section 8.2).

+--------------------+--------------------+
| Header             | Defined in Section |
+--------------------+--------------------+
| Accept             | Section 16.1       |
| Accept-Credentials | Section 16.2       |
| Accept-Encoding    | Section 16.3       |
| Accept-Language    | Section 16.4       |
| Authorization      | Section 16.7       |
| Bandwidth          | Section 16.8       |
| Blocksize          | Section 16.9       |
| From               | Section 16.23      |
| If-Match           | Section 16.24      |
| If-Modified-Since  | Section 16.25      |
| If-None-Match      | Section 16.26      |
| Proxy-Require      | Section 16.32      |
| Range              | Section 16.35      |
| Referer            | Section 16.36      |
| Require            | Section 16.38      |
| Scale              | Section 16.40      |
| Session            | Section 16.43      |
| Speed              | Section 16.41      |
| Supported          | Section 16.44      |
| Transport          | Section 16.46      |
| User-Agent         | Section 16.48      |
+--------------------+--------------------+

Table 3: The RTSP request headers

Detailed header definitions are provided in Section 16.

New request headers may be defined. If the receiver of the request is required to understand the request header, the request MUST include a corresponding feature tag in a Require or Proxy-Require header to ensure the correct processing of the header.

8. Response

[H6] applies except that HTTP-Version is replaced by RTSP-Version. Also, RTSP defines additional status codes and does not define some of the HTTP codes. The valid response codes and the methods they can be used with are listed in Table 4.

After receiving and interpreting a request message, the recipient responds with an RTSP response message.

8.1. Status-Line

The first line of a Response message is the Status-Line, consisting of the protocol version followed by a numeric status code and the textual phrase associated with the status code, with each element separated by SP characters. No CR or LF is allowed except in the final CRLF sequence.

   <RTSP-Version> SP <Status-Code> SP <Reason-Phrase> CRLF

8.1.1. Status Code and Reason Phrase

The Status-Code element is a 3-digit integer result code of the attempt to understand and satisfy the request. These codes are fully defined in Section 15. The Reason-Phrase is intended to give a short textual description of the Status-Code. The Status-Code is intended for use by automata and the Reason-Phrase is intended for the human user.
The client is not required to examine or display the Reason-Phrase.

The first digit of the Status-Code defines the class of response. The last two digits do not have any categorization role. There are 5 values for the first digit:

   1xx: Informational - Request received, continuing process

   2xx: Success - The action was successfully received, understood, and accepted

   3xx: Redirection - Further action needs to be taken in order to complete the request

   4xx: Client Error - The request contains bad syntax or cannot be fulfilled

   5xx: Server Error - The server failed to fulfill an apparently valid request

The individual values of the numeric status codes defined for RTSP/2.0, and an example set of corresponding Reason-Phrases, are presented in Table 4. The reason phrases listed here are only recommended; they may be replaced by local equivalents without affecting the protocol. Note that RTSP adopts most HTTP/1.1 [RFC2616] status codes and adds RTSP-specific status codes starting at x50 to avoid conflicts with newly defined HTTP status codes.

RTSP status codes are extensible. RTSP applications are not required to understand the meaning of all registered status codes, though such understanding is obviously desirable. However, applications MUST understand the class of any status code, as indicated by the first digit, and treat any unrecognized response as being equivalent to the x00 status code of that class, with the exception that an unrecognized response MUST NOT be cached. For example, if an unrecognized status code of 431 is received by the client, it can safely assume that there was something wrong with its request and treat the response as if it had received a 400 status code.
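The fallback rule for unrecognized status codes can be illustrated with a minimal Python sketch. The function name `interpret_status` and the set `KNOWN_CODES` are hypothetical and exist only for this illustration; a real implementation would recognize every code it supports from Table 4.

```python
# Hypothetical subset of status codes this client recognizes (illustration only).
KNOWN_CODES = {100, 200, 301, 400, 404, 454, 500, 551}


def interpret_status(code, known_codes=KNOWN_CODES):
    """Return (effective_code, recognized) for a received Status-Code."""
    if not 100 <= code <= 599:
        raise ValueError(f"status code out of range: {code}")
    if code in known_codes:
        return code, True
    # Unrecognized code: treat it as the x00 code of the same class.
    # The caller must not cache a response flagged as unrecognized.
    return (code // 100) * 100, False


print(interpret_status(431))  # (400, False)
print(interpret_status(454))  # (454, True)
```

Returning the `recognized` flag alongside the effective code lets the caller apply the no-caching restriction without re-checking the code.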
In such cases, user agents SHOULD present to the user the entity returned with the response, since that entity is likely to include human-readable information which will explain the unusual status.

+------+----------------------------------------+-----------------+
| Code | Reason                                 | Method          |
+------+----------------------------------------+-----------------+
| 100  | Continue                               | all             |
|      |                                        |                 |
| 200  | OK                                     | all             |
|      |                                        |                 |
| 300  | Multiple Choices                       | all             |
| 301  | Moved Permanently                      | all             |
| 302  | Found                                  | all             |
| 303  | See Other                              | all             |
| 305  | Use Proxy                              | all             |
|      |                                        |                 |
| 400  | Bad Request                            | all             |
| 401  | Unauthorized                           | all             |
| 402  | Payment Required                       | all             |
| 403  | Forbidden                              | all             |
| 404  | Not Found                              | all             |
| 405  | Method Not Allowed                     | all             |
| 406  | Not Acceptable                         | all             |
| 407  | Proxy Authentication Required          | all             |
| 408  | Request Timeout                        | all             |
| 410  | Gone                                   | all             |
| 411  | Length Required                        | all             |
| 412  | Precondition Failed                    | DESCRIBE, SETUP |
| 413  | Request Entity Too Large               | all             |
| 414  | Request-URI Too Long                   | all             |
| 415  | Unsupported Media Type                 | all             |
| 451  | Parameter Not Understood               | SET_PARAMETER   |
| 452  | reserved                               | n/a             |
| 453  | Not Enough Bandwidth                   | SETUP           |
| 454  | Session Not Found                      | all             |
| 455  | Method Not Valid In This State         | all             |
| 456  | Header Field Not Valid                 | all             |
| 457  | Invalid Range                          | PLAY, PAUSE     |
| 458  | Parameter Is Read-Only                 | SET_PARAMETER   |
| 459  | Aggregate Operation Not Allowed        | all             |
| 460  | Only Aggregate Operation Allowed       | all             |
| 461  | Unsupported Transport                  | all             |
| 462  | Destination Unreachable                | all             |
| 463  | Destination Prohibited                 | SETUP           |
| 464  | Data Transport Not Ready Yet           | PLAY            |
| 470  | Connection Authorization Required      | all             |
| 471  | Connection Credentials not accepted    | all             |
| 472  | Failure to establish secure connection | all             |
|      |                                        |                 |
| 500  | Internal Server Error                  | all             |
| 501  | Not Implemented                        | all             |
| 502  | Bad Gateway                            | all             |
| 503  | Service Unavailable                    | all             |
| 504  | Gateway Timeout                        | all             |
| 505  | RTSP Version Not Supported             | all             |
| 551  | Option Not Supported                   | all             |
+------+----------------------------------------+-----------------+

Table 4: Status codes and their usage with RTSP methods

8.2. Response Header Fields

The response-header fields allow the request recipient to pass additional information about the response which cannot be placed in the Status-Line. These header fields give information about the server and about further access to the resource identified by the Request-URI. All headers currently classified as response headers are listed in Table 5.

+------------------------+--------------------+
| Header                 | Defined in Section |
+------------------------+--------------------+
| Accept-Credentials     | Section 16.2       |
| Accept-Ranges          | Section 16.5       |
| Connection-Credentials | Section 16.12      |
| ETag                   | Section 16.21      |
| Location               | Section 16.28      |
| Proxy-Authenticate     | Section 16.30      |
| Public                 | Section 16.34      |
| Range                  | Section 16.35      |
| Retry-After            | Section 16.37      |
| RTP-Info               | Section 16.39      |
| Scale                  | Section 16.40      |
| Session                | Section 16.43      |
| Server                 | Section 16.42      |
| Speed                  | Section 16.41      |
| Transport              | Section 16.46      |
| Unsupported            | Section 16.47      |
| Vary                   | Section 16.49      |
| WWW-Authenticate       | Section 16.51      |
+------------------------+--------------------+

Table 5: The RTSP response headers

Response-header field names can be extended reliably only in combination with a change in the protocol version. However, the usage of feature-tags in the request allows the responding party to learn the capability of the receiver of the response. New or experimental header fields MAY be given the semantics of response-header fields if all parties in the communication recognize them to be response-header fields. Unrecognized header fields in responses are treated as entity-header fields.

9. Entity

Request and Response messages MAY transfer an entity if not otherwise restricted by the request method or response status code. An entity consists of entity-header fields and an entity-body, although some responses will only include the entity-headers.
The SET_PARAMETER and GET_PARAMETER requests and responses, and the DESCRIBE response, MAY have an entity. All 4xx and 5xx responses MAY also have an entity.

In this section, both sender and recipient refer to either the client or the server, depending on who sends and who receives the entity.

9.1. Entity Header Fields

Entity-header fields define meta-information about the entity-body or, if no body is present, about the resource identified by the request. The entity header fields are listed in Table 6.

+------------------+--------------------+
| Header           | Defined in Section |
+------------------+--------------------+
| Allow            | Section 16.6       |
| Content-Base     | Section 16.13      |
| Content-Encoding | Section 16.14      |
| Content-Language | Section 16.15      |
| Content-Length   | Section 16.16      |
| Content-Location | Section 16.17      |
| Content-Type     | Section 16.18      |
| Expires          | Section 16.22      |
| Last-Modified    | Section 16.27      |
+------------------+--------------------+

Table 6: The RTSP entity headers

The extension-header mechanism allows additional entity-header fields to be defined without changing the protocol, but these fields cannot be assumed to be recognizable by the recipient. Unrecognized header fields SHOULD be ignored by the recipient and forwarded by proxies.

9.2. Entity Body

See [H7.2] with the addition that an RTSP message with an entity body MUST include the Content-Type and Content-Length headers.

10. Connections

RTSP requests can be transmitted using the two different connection scenarios listed below:

o  persistent - a transport connection is used for several request/response transactions;

o  transient - a transport connection is used for a single request/response transaction.
RFC 2326 attempted to specify an optional mechanism for transmitting RTSP messages in connectionless mode over a transport protocol such as UDP. However, it was not specified in sufficient detail to allow for interoperable implementations. In an attempt to reduce complexity and scope, and due to lack of interest, RTSP 2.0 does not attempt to define a mechanism for supporting RTSP over UDP or other connectionless transport protocols. A side-effect of this is that RTSP requests SHALL NOT be sent to multicast groups since no connection can be established with a specific receiver in multicast environments.

Certain RTSP headers, such as the CSeq header (Section 16.19), which may appear to be relevant only to connectionless transport scenarios are still retained and must be implemented according to the specification. In the case of CSeq, it is quite useful for matching responses to requests if the requests are pipelined (see Section 12). It is also useful in proxies for keeping track of the different requests when aggregating several client requests on a single TCP connection.

10.1. Reliability and Acknowledgements

When RTSP messages are transmitted using reliable transport protocols, they MUST NOT be retransmitted at the RTSP protocol level. Instead, the implementation must rely on the underlying transport to provide reliability. The RTSP implementation may use any indication of reception acknowledgement of the message from the underlying transport protocols to optimize the RTSP behavior.

If both the underlying reliable transport such as TCP and the RTSP application retransmit requests, each packet loss or message loss may result in two retransmissions. The receiver typically cannot take advantage of the application-layer retransmission since the transport stack will not deliver the application-layer retransmission before the first attempt has reached the receiver.
If the packet loss is caused by congestion, multiple retransmissions at different layers will exacerbate the congestion.

Lack of acknowledgement of an RTSP request should be handled within the constraints of the connection timeout considerations described below (Section 10.4).

10.2. Using Connections

A TCP transport can be used for both persistent connections (for several message exchanges) and transient connections (for a single message exchange). Implementations of this specification MUST support RTSP over TCP. The scheme of the RTSP URI (Section 4.2) indicates the default port that the server will listen on.

A server MUST handle both persistent and transient connections.

Transient connections facilitate mechanisms for fault tolerance. They also allow for application layer mobility. A server and client pair that support transient connections can survive the loss of a TCP connection; e.g., due to a NAT timeout. When the client has discovered that the TCP connection has been lost, it can set up a new one when there is need to communicate again.

A persistent connection MAY be used for all transactions between the server and client, including messages for multiple RTSP sessions. However a persistent connection MAY also be closed after a few message exchanges. For example, a client may use a persistent connection for the initial SETUP and PLAY message exchanges in a session and then close the connection. Later, when the client wishes to send a new request, such as a PAUSE for the session, a new connection would be opened. This connection may either be transient or persistent.

An RTSP agent SHOULD NOT have more than one connection to the server at any given point. If a client or proxy handles multiple RTSP sessions on the same server, it SHOULD use only one connection for managing those sessions.

This saves connection resources on the server.
It also reduces complexity by enabling the server to maintain less state about its sessions and connections.

Unlike HTTP, RTSP allows a server to send requests to a client. However, this can be supported only if a client establishes a persistent connection with the server. In cases where a persistent connection does not exist between a server and its client, due to the lack of a signalling channel the server may be forced to drop an RTSP session without notifying the client. An example of such a case is when the server desires to send a REDIRECT request for an RTSP session to the client but is not able to do so because it cannot reach the client.

Without a persistent connection between the client and the server, the media server has no reliable way of reaching the client. Also, this is the only way that requests from a server to its client are likely to traverse firewalls.

In light of the above, it is RECOMMENDED that clients use persistent connections whenever possible. A client that supports persistent connections MAY "pipeline" its requests (see Section 12).

10.3. Closing Connections

The client MAY close a connection at any point when no outstanding request/response transactions exist for any RTSP session being managed through the connection. The server, however, SHOULD NOT close a connection until all RTSP sessions being managed through the connection have been timed out (Section 16.43). A server SHOULD NOT close a connection immediately after responding to a session-level TEARDOWN request for the last RTSP session being controlled through the connection. Instead, it should wait for a reasonable amount of time for the client to receive the TEARDOWN response, take appropriate action, and initiate the connection closing. The server SHOULD wait at least 10 seconds after sending the TEARDOWN response before closing the connection.
This is to ensure that the client has time to issue a SETUP for a new session on the existing connection after having torn the last one down. 10 seconds should give the client ample opportunity to get its message to the server.

A server SHOULD NOT close the connection directly as a result of responding to a request with an error code.

Certain error responses such as "460 Only Aggregate Operation Allowed" (Section 15.4.12) are used for negotiating capabilities of a server with respect to content or other factors. In such cases, it is inefficient for the server to close a connection on an error response. Also, such behavior would prevent implementation of advanced/special types of requests or result in extra overhead for the client when testing for new features. On the flip side, keeping connections open after sending an error response poses a Denial of Service security risk (Section 21).

If a server closes a connection while the client is attempting to send a new request, the client will have to close its current connection, establish a new connection and send its request over the new connection.

An RTSP message should not be terminated by closing the connection. Such a message MAY be considered to be incomplete by the receiver and discarded. An RTSP message is properly terminated as defined in Section 5.

10.4. Timing Out Connections and RTSP Messages

Receivers of a request (responder) SHOULD respond to requests in a timely manner even when a reliable transport such as TCP is used. Similarly, the sender of a request (requestor) SHOULD wait for a sufficient time for a response before concluding that the responder will not be acting upon its request.

A responder SHOULD respond to all requests within 5 seconds.
If the responder recognizes that processing of a request will take longer than 5 seconds, it SHOULD send a 100 (Continue) response as soon as possible. It SHOULD continue sending a 100 response every 5 seconds thereafter until it is ready to send the final response to the requestor. After sending a 100 response, the receiver MUST send a final response indicating the success or failure of the request.

A requestor SHOULD wait at least 10 seconds for a response before concluding that the responder will not be responding to its request. After receiving a 100 response, the requestor SHOULD continue waiting for further responses. If more than 10 seconds elapse without receiving any response, the requestor MAY assume that the responder is unresponsive and abort the connection.

A requestor SHOULD wait longer than 10 seconds for a response if it is experiencing significant transport delays on its connection to the responder. The requestor is capable of determining the RTT of the request/response cycle using the Timestamp header (Section 16.45) in any RTSP request.

10.5. Showing Liveness

A client can show liveness through any RTSP request that carries a Session header, through an RTCP message if RTP and RTCP are used, or through any other media protocol in use that is capable of indicating liveness of the RTSP client. It is RECOMMENDED that a client not wait until the last second of the timeout before trying to send a liveness message. The RTSP message may be lost or, when using reliable protocols such as TCP, may take some time to arrive safely at the receiver. To show liveness between RTSP requests issued to accomplish other things, the following mechanisms can be used, in descending order of preference:

RTCP: If RTP is used for media transport RTCP SHOULD be used. If RTCP is used to report transport statistics, it SHALL also work as keep alive.
   The server can identify the client by the network address and port used, together with the fact that the client is reporting on the server's SSRC(s). A downside of using RTCP is that it only gives statistical guarantees of reaching the server. However, the probability of failure is so low that it can be ignored in most cases. For example, consider a session with a 60 second timeout and enough bitrate assigned to RTCP messages to send a message from client to server on average every 5 seconds. On a network with 5% packet loss, the probability that this client fails to show any sign of liveness within the timeout interval is 2.4*E-16. Sessions with shorter timeouts, much higher packet loss, or small RTCP bandwidths SHOULD also use one of the mechanisms below.

SET_PARAMETER: When using SET_PARAMETER for keep alive, no body SHOULD be included. This method is the RECOMMENDED RTSP method to use in requests only intended to perform keep-alive.

OPTIONS: This method also works. However, it causes the server to perform more unnecessary processing and results in bigger responses than necessary for the task. The reason for this is that the server needs to determine what capabilities are associated with the media resource to correctly populate the Public and Allow headers.

The timeout parameter MAY be included in a SETUP response, and SHALL NOT be included in requests. The server uses it to indicate to the client how long the server is prepared to wait between RTSP commands or other signs of life before closing the session due to lack of activity (see below and Appendix B). The timeout is measured in seconds, with a default of 60 seconds. The length of the session timeout SHALL NOT be changed in an established session.

10.6. Use of IPv6

Explicit IPv6 support was not present in RTSP 1.0 (RFC 2326). RTSP 2.0 has been updated for explicit IPv6 support. Implementations of RTSP 2.0 MUST understand literal IPv6 addresses in URIs and headers.
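The RTCP liveness figure quoted in Section 10.5 can be reproduced with a short Python sketch. It rests on two simplifying assumptions that are not stated in the text: exactly timeout/interval RTCP reports are sent within the timeout window, and each report is lost independently with the given probability. The function name `liveness_failure_probability` is hypothetical.

```python
def liveness_failure_probability(timeout_s, mean_interval_s, loss_rate):
    """Probability that every RTCP report in one timeout window is lost.

    Assumes a fixed number of reports per window and independent losses
    (both are simplifications made for this illustration).
    """
    reports = int(timeout_s // mean_interval_s)
    return loss_rate ** reports


# 60 s timeout, one report every 5 s on average, 5% packet loss.
p = liveness_failure_probability(60, 5, 0.05)
print(f"{p:.1e}")  # 2.4e-16
```

With 12 reports per window the result is 0.05 to the 12th power, which matches the value given in the text.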
11. Capability Handling

This section describes the available capability handling mechanism which allows RTSP to be extended. Extensions to this version of the protocol are basically done in two ways. First, new headers can be added. Secondly, new methods can be added. The capability handling mechanism is designed to handle both cases.

When a method is added, the involved parties can use the OPTIONS method to discover whether it is supported. This is done by issuing an OPTIONS request to the other party. Depending on the URI it will either apply in regards to a certain media resource, the whole server in general, or simply the next hop. The OPTIONS response MUST contain a Public header which declares all methods supported for the indicated resource.

It is not necessary to use OPTIONS to discover support of a method; the client could simply try the method. If the receiver of the request does not support the method it will respond with an error code indicating that the method is either not implemented (501) or does not apply for the resource (405). The choice between the two discovery methods depends on the requirements of the service.

Feature-tags are defined to handle functionality additions that are not new methods. Each feature-tag represents a certain block of functionality. The amount of functionality that a feature-tag represents can vary significantly. A feature-tag can for example represent the functionality a single RTSP header provides. Another feature-tag can represent much more functionality, such as the "play.basic" feature-tag which represents the minimal playback implementation.

Feature-tags are used to determine whether the client, server or proxy supports the functionality that is necessary to achieve the desired service.
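The OPTIONS-based method discovery described earlier in this section can be sketched in Python as follows. The sketch assumes the Public header value is a comma-separated list of method names; the helper names `methods_from_public` and `supports_method`, and the sample header value, are hypothetical and used only for illustration.

```python
def methods_from_public(public_value):
    """Split a Public header value into a set of method names."""
    return {m.strip() for m in public_value.split(",") if m.strip()}


def supports_method(public_value, method):
    """Check whether a method is declared in the Public header value.

    The comparison is case-sensitive, matching RTSP method names.
    """
    return method in methods_from_public(public_value)


# Sample value as it might appear in an OPTIONS response (illustrative).
public = "OPTIONS, DESCRIBE, SETUP, PLAY, TEARDOWN"
print(supports_method(public, "PLAY"))   # True
print(supports_method(public, "PAUSE"))  # False
```

A client that skips this check and simply tries a method would instead need to handle the 501 and 405 responses.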
To determine support of a feature-tag, several different headers can be used, each explained below:

Supported: The Supported header is used to determine the complete set of functionality that both client and server have. The intended usage is to determine, before a functionality needs to be used, that it is supported. It can be used in any method; however, OPTIONS is the most suitable one as it at the same time determines all methods that are implemented. When sending a request the requestor declares all its capabilities by including all supported feature-tags. As a result, the receiver learns the requestor's feature support. The receiver then includes its set of features in the response.

Proxy-Supported: The Proxy-Supported header is used similarly to the Supported header, but instead of giving the supported functionality of the client or server it provides both the requestor and the responder a view of what functionality the proxy chain between the two supports. Proxies are required to add this header whenever the Supported header is present, but proxies may also add it independently of the requestor.

Require: The Require header can be included in any request where the end-point, i.e. the client or server, is required to understand the feature to correctly perform the request. This can, for example, be a SETUP request where the server is required to understand a certain parameter to be able to set up the media delivery correctly. Ignoring this parameter would not have the desired effect and is not acceptable. Therefore the end-point receiving a request containing a Require MUST negatively acknowledge any feature that it does not understand and not perform the request. The response in cases where features are not supported is 551 (Option Not Supported). Also, the features that are not supported are given in the Unsupported header in the response.
Proxy-Require:  This header has the same purpose and workings as Require, except that it only applies to proxies and not the end-point. Features that need to be supported by both proxies and end-point need to be included in both the Require and Proxy-Require headers.

Unsupported:  This header is used in a 551 error response to indicate which feature(s) were not supported. Such a response is only the result of the usage of the Require and/or Proxy-Require header where one or more features were not supported. This information allows the requestor to make the best of the situation, as it knows which features are not supported.

12. Pipelining Support

Pipelining is a general method to improve performance of request-response protocols by allowing the requesting entity to have more than one request outstanding and send them over the same persistent connection. For RTSP, where the relative order of requests matters, it is important to maintain the order of the requests. Because of this, the responding entity SHALL process the incoming requests in their sending order. The sending order can be determined by the CSeq header and its sequence number. For TCP the delivery order will be the same as the sending order. The processing of a request SHALL also have been finished before processing the next request from the same entity. The responses MUST be sent in the order the requests were processed.

RTSP 2.0 has extended support for pipelining compared to RTSP 1.0. The major improvement is to allow all requests to set up and initiate media playback to be pipelined after each other. This is accomplished by the utilization of the Pipelined-Requests header (see Section 16.29). This header allows a client to request that two or more requests are to be processed in the same RTSP session context, which the first request creates.
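Informative illustration: the ordering rules of this section can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not part of the protocol definition. The names PipelinedConnection, receive, process_all and example_handler are hypothetical, the URIs are placeholders, and the sketch assumes a TCP connection, where the arrival order equals the sending order.

```python
from collections import deque


class PipelinedConnection:
    """Hypothetical per-connection processor for pipelined requests."""

    def __init__(self, handler):
        self._handler = handler   # callable: request dict -> (status, reason)
        self._pending = deque()   # requests in arrival (= sending) order

    def receive(self, request):
        # Over TCP, the arrival order is the sending order, so a FIFO suffices.
        self._pending.append(request)

    def process_all(self):
        """Process queued requests one at a time; return responses in that order."""
        responses = []
        while self._pending:
            request = self._pending.popleft()
            # The handler returns before the next request is taken from the queue.
            status, reason = self._handler(request)
            responses.append(
                {"CSeq": request["CSeq"], "status": status, "reason": reason}
            )
        return responses


def example_handler(request):
    # Illustrative assumption: the SETUP for the video stream fails.
    if request["method"] == "SETUP" and request["uri"].endswith("/video"):
        return 461, "Unsupported Transport"
    return 200, "OK"


conn = PipelinedConnection(example_handler)
for cseq, method, uri in [
    (1, "SETUP", "rtsp://example.com/movie/audio"),
    (2, "SETUP", "rtsp://example.com/movie/video"),
    (3, "PLAY", "rtsp://example.com/movie"),
]:
    conn.receive({"CSeq": cseq, "method": method, "uri": uri})

responses = conn.process_all()
```

The responses come back in CSeq order 1, 2, 3. A FIFO queue is used rather than sorting by CSeq because the sketch assumes TCP delivery; an implementation on a transport that can reorder messages would need to consult the CSeq values instead.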
With this header, a client can request that two or more media streams be set up and then played without waiting for a single response. This speeds up the initial startup time for an RTSP session by at least one RTT.

If a pipelined request builds on the successful completion of one or more prior requests, the requestor must verify that all requests were executed as expected. A common example will be two SETUP requests and a PLAY request. If one of the SETUP requests fails unexpectedly, the PLAY request can still be executed successfully, but not as the requesting client expected, as only one media stream instead of two will be played. In this case the client can send a PAUSE request, correct the failing SETUP request, and then request the media to be played.

13. Method Definitions

The method indicates what is to be performed on the resource identified by the Request-URI. The method name is case-sensitive. New methods may be defined in the future. Method names SHALL NOT start with a $ character (decimal 36) and MUST be a token as defined by the ABNF [RFC4234] in the syntax chapter (Section 20). The methods are summarized in Table 7.

   +---------------+-----------+--------+--------------+---------------+
   | method        | direction | object | Server req.  | Client req.   |
   +---------------+-----------+--------+--------------+---------------+
   | DESCRIBE      | C -> S    | P,S    | recommended  | recommended   |
   |               |           |        |              |               |
   | GET_PARAMETER | C -> S    | P,S    | optional     | optional      |
   |               |           |        |              |               |
   |               | S -> C    |        |              |               |
   |               |           |        |              |               |
   | OPTIONS       | C -> S    | P,S    | R=Req,       | Sd=Req, R=Opt |
   |               |           |        | Sd=Opt       |               |
   |               |           |        |              |               |
   |               | S -> C    |        |              |               |
   |               |           |        |              |               |
   | PAUSE         | C -> S    | P,S    | required     | required      |
   |               |           |        |              |               |
   | PLAY          | C -> S    | P,S    | required     | required      |
   |               |           |        |              |               |
   | REDIRECT      | S -> C    | P,S    | optional     | required      |
   |               |           |        |              |               |
   | SETUP         | C -> S    | S      | required     | required      |
   |               |           |        |              |               |
   | SET_PARAMETER | C -> S    | P,S    | required     | optional      |
   |               |           |        |              |               |
   |               | S -> C    |        |              |               |
   |               |           |        |              |               |
   | TEARDOWN      | C -> S    | P,S    | required     | required      |
   +---------------+-----------+--------+--------------+---------------+

   Table 7: Overview of RTSP methods, their direction, and what objects (P: presentation, S: stream) they operate on. Legend: R=Respond, Sd=Send, Opt: Optional, Req: Required

Note on Table 7: GET_PARAMETER is recommended, but not required. For example, a fully functional server can be built to deliver media without any parameters. SET_PARAMETER, however, is required due to its usage for keep-alive. PAUSE is now required because it is the only way of leaving the state machine's Play state without terminating the whole session.

If an RTSP agent does not support a particular method, it MUST return 501 (Not Implemented) and the requesting RTSP agent, in turn, SHOULD NOT try this method again for the given agent / resource combination.

13.1. OPTIONS

The semantics of the RTSP OPTIONS method are equivalent to those of the HTTP OPTIONS method described in [H9.2]. In RTSP, however, OPTIONS is bi-directional, in that a client can send it to a server and vice versa.
A client MUST implement the capability to send an OPTIONS request and a server or a proxy MUST implement the capability to respond to an OPTIONS request. The client, server or proxy MAY also implement the converse of their required capability.

An OPTIONS request may be issued at any time. Such a request does not modify the session state. However, it may prolong the session lifespan (see below). The URI in an OPTIONS request determines the scope of the request and the corresponding response. If the Request-URI refers to a specific media resource on a given host, the scope is limited to the set of methods supported for that media resource by the indicated RTSP agent. A Request-URI with only the host address limits the scope to the specified RTSP agent's general capabilities without regard to any specific media. If the Request-URI is an asterisk ("*"), the scope is limited to the general capabilities of the next hop (i.e., the RTSP agent in direct communication with the request sender).

Regardless of the scope of the request, the Public header MUST always be included in the OPTIONS response, listing the methods that are supported by the responding RTSP agent. In addition, if the scope of the request is limited to a media resource, the Allow header MUST be included in the response to enumerate the set of methods that are allowed for that resource, unless the set of methods completely matches the set in the Public header. If the given resource is not available, the RTSP agent SHOULD return an appropriate response code such as 3rr or 4xx. The Supported header MAY be included in the request to query the set of features that are supported by the responding RTSP agent.

The OPTIONS method can be used to keep an RTSP session alive. However, it is not the preferred means of session keep-alive signalling; see Section 16.43. An OPTIONS request intended for keeping alive an RTSP session MUST include the Session header with the associated session ID.
Such a request SHOULD also use the media or the aggregated control URI as the Request-URI.

Example:

   C->S: OPTIONS * RTSP/2.0
         CSeq: 1
         User-Agent: PhonyClient/1.2
         Require:
         Proxy-Require: gzipped-messages
         Supported: play.basic

   S->C: RTSP/2.0 200 OK
         CSeq: 1
         Public: DESCRIBE, SETUP, TEARDOWN, PLAY, PAUSE
         Supported: play.basic, implicit-play, gzipped-messages
         Server: PhonyServer/1.1

Note that some of the feature-tags in Require and Proxy-Require are fictional features.

13.2. DESCRIBE

The DESCRIBE method is used to retrieve the description of a presentation or media object from a server. The Request-URI of the DESCRIBE request identifies the media resource of interest. The client MAY include the Accept header in the request to list the description for