AVT Working Group D. Serenyi C. LeCroy Internet Draft Apple Computer Document: November 2000 Category: Standards Track RTP Payload Format for Payload Meta-Information Status of this Memo This document is an Internet-Draft and is in full conformance with all provisions of Section 10 of RFC2026 [ ]. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet- Drafts as reference material or to cite them other than as "work in progress." The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html. 1. Abstract This memo describes the RTP payload format for payload meta- information. A RTP-Meta-Info packet is a collection of individual fields. Each field has its own format: the format of some fields are documented here, other fields may be defined and documented elsewhere. This memo also specifies methods within SDP (Session Description Protocol) and RTSP (Real-time Streaming Protocol) for negotiating the contents of an RTP-Meta-Info stream. 2. Conventions used in this document The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC-2119 [ ]. Serenyi Standards Track 1 INTERNET-DRAFT RTP Payload Meta-Information Nov 2000 3. Introduction Certain RTP clients, such as RTP reflectors, relays, and caching proxies, require per-packet meta information that goes beyond the sequence number and timestamp already provided in the RTP header. For instance, a caching proxy may want to provide stream thinning to its clients in case those clients are bandwidth constrained. If that stream thinning is based on the type of video frame being sent by the origin server, there is no payload-independent way for the caching proxy to determine the frame type. 4. Format of the RTP-Meta-Info Payload Type This section documents the format of the RTP-Meta-Info payload as a series of fields. The format of individual fields is described in Section 6, or in other documents. The format of each field in a RTP-Meta-Info payload consists of a field header and a field body. The field header can either have the "standard" format or the "compressed" format. The first bit of each field header indicates the type of field, so it is possible to mix standard and compressed fields in the same packet. The bit should be 0 for standard format fields, 1 for compressed format fields. The standard format consists of a 15 bit field name and a 2 octet field length. The field name should be ASCII representations of alphanumeric characters, such as "ft". Because the first character has only 7 bits of space allocated, the names must use only 7-bit ASCII characters. The field length should be the length of the entire field body in octets. The following is an example of the standard RTP-Meta-Info payload format: 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |C| field name | field len | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | field data | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | .... | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |C| field name | field len | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | field data | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | .... | Serenyi Standards Track 2 INTERNET-DRAFT RTP Payload Meta-Information Nov 2000 The compressed format requires out-of-band negotiation between client and server. During the negotiation process, a 7-bit field ID is negotiated for each field name. Instead of the field names being sent in the RTP payload, only the field ID is sent. Negotiation can either happen through RTSP, described in section 5, or through the SDP description of the RTP-Meta-Info stream, described in section 6. In either case, the field description will be a text header consisting of a semi-colon list of field names and field IDs. For example the following could be a server to client RTSP header: x-RTP-Meta-Info: to=0;ft=1;ba=2;rb=3 Each semi-colon delimited section of the header maps a field name to a unique ID. The field IDs must be between 0 and 127, because they are represented by 7 bits in the compressed header. The format of the compressed header is the 1-bit header type identifier, followed by the 7-bit field Id, followed by a 1-octet field length. The following is an example of a compressed RTP-Meta-Info packet with both compressed and standard fields: 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |C| field ID | field len | field data | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | .... | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |C| field name | field len | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | field data | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | .... | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |C| field ID | field len | field data | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | .... | 5. Field Negotiation Through RTSP Any payload can be re-encoded as an RTP-Meta-Info stream through the x-RTP-Meta-Info RTSP header. Serenyi Standards Track 3 INTERNET-DRAFT RTP Payload Meta-Information Nov 2000 The "x- RTP-Meta-Info" header should be sent from a client to server on a SETUP, and echoed by the server in its SETUP response. If the header is not echoed, the client must assume that the server does not support this header. The body of the x-RTP-Meta-Info header follows the format described in section 3: the field format followed by a semi-colon delimited list of field names. In the client request, this is the list of fields the client would like to receive in the RTP stream specified by the SETUP. In the server response, this is the list of fields the server will provide in that RTP stream. The response may also contain field ID mappings for some or all of the field names if the server supports the compressed header format. The server may return a subset of the field names in the event the server doesn't support all the requested fields, or some fields don't apply to the RTP stream specified by the SETUP. Example x-RTP-Meta-Info header in SETUP request: x-RTP-Meta-Info; to;bi;bo Example x-RTP-Meta-Info header in SETUP response: x-RTP-Meta-Info: to=0;bi;bo=1 or: x-RTP-Meta-Info: to;bi 6. Describing the RTP-Meta-Info Payload in SDP It is recommended that any source of RTP-Meta-Info payload packets describe the contents of the payload as part of the SDP description of the media. RTP-Meta-Info descriptions consist of two additional a= headers. The a=x-embedded-rtpmap header tells the client the payload type of the underlying RTP payload. The a=x-RTP-Meta-Info header is a field description equivalent in format to the x-RTP-Meta-Info header described in section 4. Example SDP description of the RTP-Meta-Info payload: m=other 5084 RTP/AVP 96 a=rtpmap:96 x-RTP-Meta-Info a=x-embedded-rtpmap:96 x-QT a=x-RTP-Meta-Info: standard;to;bi;bo 7. Field Definitions The fields defined below are primarily meant to be used by RTSP / RTP caching proxies and by broadcasters sending to a reflecting server. More field definitions can be added as is necessary. Serenyi Standards Track 4 INTERNET-DRAFT RTP Payload Meta-Information Nov 2000 7.1 transmit time field, 'tt' This field consists of a single 4-octet unsigned integer representing the recommended transmission time of the RTP packet in milliseconds. The transmit times are always offset from the start of the media presentation. For example, if the SDP response for a URL includes a range of 0-729.45, and the client makes a PLAY request with a Range of 100-729.45, then the first RTP packet from the server should have a tt value of approximately 100,000 (it may not be exactly 100,000 because the server is free to find a frame nearby the requested time). If the SDP for a URL does not contain a range, then the client may at least use these values as relative offsets. Example uncompressed transmit time field: 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | field name, 'tt' | field len, 4 | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | transmit time | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 7.2 frame type field, 'ft' This field consists of a single 16-bit unsigned integer value with several well-known values representing different frame types. The well-known values are as follows: 0 - unknown frame type 1 - key frame 2 - b-frame 3 - p-frame Future versions may add additional frame types. Any value greater than 3 should be considered to be an unknown frame type. This field is only valid for video RTP streams. If a client asks for it when setting up another media type, the server may strip this sub- extension from its "x-RTP-Extension" response, or it may provide the field with a value of 0 in every packet. Example uncompressed frame type field: 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | field name, 'ft' | field len, 2 | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | frame type | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ Serenyi Standards Track 5 INTERNET-DRAFT RTP Payload Meta-Information Nov 2000 7.3 packet number field, 'pn' This field consists of a sinle 64-bit unsigned integer value. The value is the packet number offset from the absolute start of the stream. For example, if the SDP response for a URL includes a range of 0-729.45, and the client makes a PLAY request with a range of 0- 729.45, the pn value of the first packet will be 0, and increment by 1 for each subsequent packet. If there are 1000 packets between in the first 60 seconds of a stream, and a client makes a PLAY request of 60- 729.45, the pn value of the first packet will be 1001, and increment by 1 for each subsequent packet. Example uncompressed packet number field: 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | field name, 'pn' | field len, 8 | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | hi-order packet number | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | lo-order packet number | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 7.4 packet position field, 'pp' This field consists of a single 64-bit unsigned integer value. The value is the byte offset of this packet from the absolute start of the stream. For example, if the SDP response for a URL includes a range of 0-729.45, and the client makes a PLAY request with a range of 100- 729.45, then the pp value of the first video RTP packet will be the total number of bytes of the video RTP packets between 0 - 100. Only the RTP packet payload bytes are used to compute each pp value. Some RTP streams, because they are live or dynamic media, will not be able to supply this identifier. In general, if the media SDP has a range attribute, the server will be able to provide the pp extension. Example uncompressed packet position field: 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | field name, 'pp' | field len, 8 | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | hi-order packet position | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | lo-order packet position | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ Serenyi Standards Track 6 INTERNET-DRAFT RTP Payload Meta-Information Nov 2000 7.5 media data field, 'md' This field contains media data for the underlying RTP payload. Example uncompressed media data field: 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | field name, 'md' | field len | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | ... | 7.6 sequence number, 'sq' This field consists of a 2-octet RTP sequence number. This field is useful for mapping RTP-Meta-Info fields to the underlying payload data that they refer to, if that data is being sent out-of-band. Example uncompressed sequence number field: 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | field name, 'sq' | field len, 2 | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | sequence number | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 8. Security Considerations None. 9. References [1] Schulzrinne, H., "Real Time Streaming Protocol (RTSP)", RFC 2326, April 1998 [2] Schulzrinne, H., "RTP: A Transport Protocol for Real-Time Applications", RFC 1889, January 1996 [3] Handley, M., "SDP: Session Description Protocol", RFC 2327, April 1998 Serenyi Standards Track 7 INTERNET-DRAFT RTP Payload Meta-Information Nov 2000 10. Acknowledgments 11. Author's Addresses Denis Serenyi Apple Computer 1 Infinite Loop Cupertino, CA 95014 Email: denis@apple.com Chris LeCroy Apple Computer 1 Infinite Loop Cupertino, CA 95014 Email: lecroy@apple.com 12. Full Copyright Statement "Copyright (C) The Internet Society (date). All Rights Reserved. This document and translations of it may be copied and furnished to others, and derivative works that comment on or otherwise explain it or assist in its implmentation may be prepared, copied, published and distributed, in whole or in part, without restriction of any kind, provided that the above copyright notice and this paragraph are included on all such copies and derivative works. However, this document itself may not be modified in any way, such as by removing the copyright notice or references to the Internet Society or other Internet organizations, except as needed for the purpose of developing Internet standards in which case the procedures for copyrights defined in the Internet Standards process must be followed, or as required to translate it into languages other than English. The limited permissions granted above are perpetual and will not be revoked by the Internet Society or its successors or assigns. This document and the information contained herein is provided on an "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE."