< draft-ietf-avt-rtp-3gpp-timed-text-14.txt   draft-ietf-avt-rtp-3gpp-timed-text-15.txt >
Internet Draft J. Rey Internet Draft J. Rey
draft-ietf-avt-rtp-3gpp-timed-text-14.txt Y. Matsui draft-ietf-avt-rtp-3gpp-timed-text-15.txt Y. Matsui
Panasonic Panasonic
Expires: November 17, 2005 May 17, 2005 Expires: December 13, 2005 June 13, 2005
RTP Payload Format for 3GPP Timed Text RTP Payload Format for 3GPP Timed Text
Status of this Memo Status of this Memo
By submitting this Internet-Draft, each author represents that any By submitting this Internet-Draft, each author represents that any
applicable patent or other IPR claims of which he or she is aware applicable patent or other IPR claims of which he or she is aware
have been or will be disclosed, and any of which he or she becomes have been or will be disclosed, and any of which he or she becomes
aware will be disclosed, in accordance with Section 6 of BCP 79. aware will be disclosed, in accordance with Section 6 of BCP 79.
skipping to change at page 2, line 7 skipping to change at page 2, line 7
3GPP (3rd Generation Partnership Project) timed text. 3GPP timed 3GPP (3rd Generation Partnership Project) timed text. 3GPP timed
text is a time-lined decorated text media format with defined storage text is a time-lined decorated text media format with defined storage
in a 3GP file. Timed Text can be synchronized with audio/video in a 3GP file. Timed Text can be synchronized with audio/video
contents and used in applications such as captioning, titling and contents and used in applications such as captioning, titling and
multimedia presentations. In the following sections the problems of multimedia presentations. In the following sections the problems of
streaming timed text are addressed and a payload format for streaming streaming timed text are addressed and a payload format for streaming
3GPP timed text over RTP is specified. 3GPP timed text over RTP is specified.
Table of Contents Table of Contents
1. Introduction...................................................4 1. Introduction....................................................4
2. Motivation, Requirements and Design Rationale..................4 2. Motivation, Requirements and Design Rationale...................4
2.1. Motivation...................................................4 2.1. Motivation...................................................4
2.2. Basic Components of the 3GPP Timed Text Media Format.........5 2.2. Basic Components of the 3GPP Timed Text Media Format.........4
2.3. Requirements.................................................5 2.3. Requirements.................................................5
2.4. Limitations..................................................7 2.4. Limitations..................................................7
2.5. Design Rationale.............................................8 2.5. Design Rationale.............................................8
3. Terminology...................................................10 3. Terminology....................................................10
4. RTP Payload Format for 3GPP Timed Text........................12 4. RTP Payload Format for 3GPP Timed Text.........................12
4.1. Payload Header Definitions..................................13 4.1. Payload Header Definitions..................................13
4.1.1. Common Payload Header Fields.............................14 4.1.1. Common Payload Header Fields.............................14
4.1.2. TYPE 1 Header............................................16 4.1.2. TYPE 1 Header............................................16
4.1.3. TYPE 2 Header............................................19 4.1.3. TYPE 2 Header............................................19
4.1.4. TYPE 3 Header............................................22 4.1.4. TYPE 3 Header............................................22
4.1.5. TYPE 4 Header............................................23 4.1.5. TYPE 4 Header............................................23
4.1.6. TYPE 5 Header............................................23 4.1.6. TYPE 5 Header............................................23
4.2. Buffering of Sample Descriptions............................24 4.2. Buffering of Sample Descriptions............................24
4.2.1. Dynamic SIDX wrap-around mechanism.......................24 4.2.1. Dynamic SIDX wrap-around mechanism.......................24
4.3. Finding payload header values in 3GP files..................26 4.3. Finding payload header values in 3GP files..................26
4.4. Fragmentation of Timed Text Samples.........................29 4.4. Fragmentation of Timed Text Samples.........................29
4.5. Reassembling Text Samples at the Receiver...................30 4.5. Reassembling Text Samples at the Receiver...................30
4.6. On Aggregate Payloads.......................................32 4.6. On Aggregate Payloads.......................................32
4.7. Payload Examples............................................36 4.7. Payload Examples............................................36
4.8. Relation to RFC 3640........................................40 4.8. Relation to RFC 3640........................................40
4.9. Relation to RFC 2793........................................41 4.9. Relation to RFC 2793........................................41
5. Resilient Transport...........................................41 5. Resilient Transport............................................41
6. Congestion control............................................42 6. Congestion control.............................................42
7. Scene Description.............................................43 7. Scene Description..............................................43
7.1. Text Rendering Position and Composition.....................43 7.1. Text Rendering Position and Composition.....................43
7.2. SMIL usage..................................................44 7.2. SMIL usage..................................................44
7.3. Finding layout values in a 3GP file.........................44 7.3. Finding layout values in a 3GP file.........................44
8. 3GPP Timed Text Media Type....................................44 8. 3GPP Timed Text Media Type.....................................44
9. SDP usage.....................................................48 9. SDP usage......................................................48
9.1. Mapping to SDP..............................................48 9.1. Mapping to SDP..............................................48
9.2. Parameter Usage in the SDP Offer/Answer Model...............48 9.2. Parameter Usage in the SDP Offer/Answer Model...............48
9.2.1. Unicast Usage............................................49 9.2.1. Unicast Usage............................................49
9.2.2. Multicast Usage..........................................51 9.2.2. Multicast Usage..........................................51
9.3. Offer/Answer Examples.......................................52 9.3. Offer/Answer Examples.......................................52
9.4. Parameter Usage outside of Offer/Answer.....................54 9.4. Parameter Usage outside of Offer/Answer.....................54
10. IANA Considerations..........................................54 10. IANA Considerations...........................................54
11. Security considerations......................................54 11. Security considerations.......................................54
12. References...................................................55 12. References....................................................55
12.1. Normative References.......................................55 12.1. Normative References.......................................55
12.2. Informative References.....................................55 12.2. Informative References.....................................55
13. Annexes......................................................57 13. Annexes.......................................................57
13.1. Basics of the 3GP File Structure...........................57 13.1. Basics of the 3GP File Structure...........................57
14. Acknowledgements.............................................58 14. Acknowledgements..............................................58
15. Authors' Addresses...........................................58 15. Authors' Addresses............................................58
16. IPR Notices..................................................59 16. IPR Notices...................................................59
17. Full Copyright Statement.....................................59 17. Full Copyright Statement......................................59
[Note to the RFC Editor: [Note to the RFC Editor:
- Please replace "RFCXXXX" with the RFC designation of this document - Please replace "RFCXXXX" with the RFC designation of this document
when published, when published,
- Please substitute "draft-ietf-..." references with the - Please substitute "draft-ietf-..." references with the
corresponding RFC number if available at the time of publication] corresponding RFC number if available at the time of publication]
1. Introduction 1. Introduction
3GPP timed text is a media format for time-lined decorated text 3GPP timed text is a media format for time-lined decorated text
skipping to change at page 4, line 25 skipping to change at page 4, line 25
The purpose of this draft is to provide a means to stream 3GPP timed The purpose of this draft is to provide a means to stream 3GPP timed
text contents using RTP [3]. This includes the streaming of timed text contents using RTP [3]. This includes the streaming of timed
text being read out of a (3GP) file as well as the streaming of timed text being read out of a (3GP) file as well as the streaming of timed
text generated in real-time, a.k.a. live streaming. text generated in real-time, a.k.a. live streaming.
Section 2 contains the motivation of this document, an overview of Section 2 contains the motivation of this document, an overview of
the media format, the requirements and the design rationale. Section the media format, the requirements and the design rationale. Section
3 defines the terminology used. Section 4 specifies the payload 3 defines the terminology used. Section 4 specifies the payload
headers, the fragmentation and re-assembly rules for text samples, headers, the fragmentation and re-assembly rules for text samples,
the rules for payload aggregation and the relations of this document the rules for payload aggregation and the relations of this document
to RFC 3640 [12] and RFC 2793 [27]. Section 5 specifies some simple to RFC 3640 [12] and RFC 2793 [24]. Section 5 specifies some simple
schemes for resilient transport and gives pointers to other possible schemes for resilient transport and gives pointers to other possible
mechanisms. Section 6 addresses congestion control. Section 7 mechanisms. Section 6 addresses congestion control. Section 7
specifies scene description. Section 8 defines the media type. specifies scene description. Section 8 defines the media type.
Section 9 specifies SDP for unicast and multicast sessions, including Section 9 specifies SDP for unicast and multicast sessions, including
usage in the Offer / Answer model [13]. Sections 10 and 11 address usage in the Offer / Answer model [13]. Sections 10 and 11 address
IANA and security considerations. Section 12 lists references. IANA and security considerations. Section 12 lists references.
Annexes are included as Section 13. Annexes are included as Section 13.
2. Motivation, Requirements and Design Rationale 2. Motivation, Requirements and Design Rationale
2.1. Motivation 2.1. Motivation
The 3GPP timed text format was developed for use in the services The 3GPP timed text format was developed for use in the services
specified in the 3GPP Transparent End-to-end Packet-switched specified in the 3GPP Transparent End-to-end Packet-switched
Streaming Services (3GPP PSS) specification [16]. Streaming Services (3GPP PSS) specification [16].
The scope of the 3GPP PSS specification (in the following referred to
as PSS) includes both downloading and streaming of multimedia content
over 3G packet-switched networks. PSS adopts multimedia codecs such
as MPEG-4 Visual [22], AMR wide-band [23] or MPEG-4 AAC [24] for
encoding content. It also adopts RTSP [15] for session set-up and
control, and SMIL [9] for handling presentation layouts. For
transport, HTTP over TCP is used for downloading and RTP for
streaming.
As of today, PSS allows downloading 3GPP timed text contents stored As of today, PSS allows downloading 3GPP timed text contents stored
in 3GP files. However, due to the lack of an RTP payload format, it in 3GP files. However, due to the lack of an RTP payload format, it
is not possible to stream 3GPP timed text contents over RTP. is not possible to stream 3GPP timed text contents over RTP.
This document specifies such a payload format. This document specifies such a payload format.
2.2. Basic Components of the 3GPP Timed Text Media Format 2.2. Basic Components of the 3GPP Timed Text Media Format
Before going into the details of the design, it is necessary to have Before going into the details of the design, it is necessary to have
knowledge about how the media format is constructed. We can identify knowledge about how the media format is constructed. We can identify
skipping to change at page 10, line 24 skipping to change at page 10, line 18
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [5]. document are to be interpreted as described in RFC 2119 [5].
Furthermore, the following terms are used and have specific meaning Furthermore, the following terms are used and have specific meaning
within the context of this document: within the context of this document:
text sample or whole text sample text sample or whole text sample
In the 3GPP Timed Text media format [1] this term refers to a In the 3GPP Timed Text media format [1] this term refers to a
unit of timed text data as contained in the source (3GP) file. unit of timed text data as contained in the source (3GP) file.
This includes the text string byte count, the text string and This includes the text string byte count, possibly a Byte Order
any modifiers that may follow. Its equivalent in audio/video Mark, the text string and any modifiers that may follow. Its
would be a frame. equivalent in audio/video would be a frame.
In this document, however, a text sample comprises only text In this document, however, a text sample comprises only text
strings and zero or more modifiers. This definition of text strings followed by zero or more modifiers. This definition of
sample only excludes the 16-bit text string byte count and the text sample excludes the 16-bit text string byte count and the
16-bit Byte Order Mark (BOM) present in 3GP file text samples 16-bit Byte Order Mark (BOM) present in 3GP file text samples
(see Section 4.3 and Figure 9). The 16-bit BOM is not (see Section 4.3 and Figure 9). The 16-bit BOM is not
transported in RTP as explained in Section 4.1.1. transported in RTP as explained in Section 4.1.1.
text strings: text strings:
text strings is the term used to denote the actual text text strings is the term used to denote the actual text
characters encoded either as UTF-8 or UTF-16. When using this characters encoded either as UTF-8 or UTF-16. When using this
payload format, the text string does not contain any byte order payload format, the text string does not contain any byte order
mark (BOM). mark (BOM). See Figure 9 for details.
fragment or text sample fragment: fragment or text sample fragment:
a fraction of a text sample. A fragment may contain either text a fraction of a text sample. A fragment may contain either text
strings or modifier (decoration) contents, but not both at the strings or modifier (decoration) contents, but not both at the
same time. same time.
sample contents: sample contents:
general term to identify timed text data transported when using general term to identify timed text data transported when using
skipping to change at page 24, line 29 skipping to change at page 24, line 29
o If dynamic sample descriptions are used, their buffering and o If dynamic sample descriptions are used, their buffering and
update of the SIDX values MUST follow the mechanism described in update of the SIDX values MUST follow the mechanism described in
the next section. the next section.
4.2.1. Dynamic SIDX wrap-around mechanism 4.2.1. Dynamic SIDX wrap-around mechanism
The use of dynamic sample descriptions by senders is OPTIONAL. The use of dynamic sample descriptions by senders is OPTIONAL.
However, if used, senders MUST implement this mechanism. Receivers However, if used, senders MUST implement this mechanism. Receivers
MUST always implement it. MUST always implement it.
As mentioned in Section 4.1.2, dynamic SIDX values remain active
either during the entire duration of the session (if used just once) Dynamic SIDX values remain active either during the entire duration
or in different intervals of it (if used more than once). of the session (if used just once) or in different intervals of it
(if used more than once).
Note: in the following SIDX means dynamic SIDX. Note: in the following SIDX means dynamic SIDX.
For choosing the wrap-around mechanism, the following rationale was For choosing the wrap-around mechanism, the following rationale was
used: there are 128 dynamic SIDX values possible, [0..127]. If one used: there are 128 dynamic SIDX values possible, [0..127]. If one
chooses to allow a maximum of 127 to be used as dynamic SIDXs, then chooses to allow a maximum of 127 to be used as dynamic SIDXs, then
any reordered packet with a new sample description would make the any reordered packet with a new sample description would make the
mechanism fail. E.g., if the last packet received is SIDX=5, then mechanism fail. E.g., if the last packet received is SIDX=5, then
all 127 values except SIDX=6 would be "active". Now, if a reordered all 127 values except SIDX=6 would be "active". Now, if a reordered
packet arrives with a new description, SIDX=9, it will be packet arrives with a new description, SIDX=9, it will be mistakenly
mistakenly discarded, because the SIDX=9 is, at that moment, marked discarded, because the SIDX=9 is, at that moment, marked as "active"
as "active" and active sample descriptions shall not be re-written. and active sample descriptions shall not be re-written. Therefore,
Therefore, a "guard interval" is introduced. This guard interval a "guard interval" is introduced. This guard interval reduces the
reduces the number of active SIDXs at any point in time to 64. number of active SIDXs at any point in time to 64. Although most
Although most timed text applications will probably need less than timed text applications will probably need less than 64 sample
64 sample descriptions during a session (in total), a wrap-around descriptions during a session (in total), a wrap-around mechanism to
mechanism to handle the need for more is described here. handle the need for more is described here.
Thereby, a sliding window of 64 active SIDX values is used. Values Thereby, a sliding window of 64 active SIDX values is used. Values
within the window are "active"; all others are marked "inactive". An within the window are "active"; all others are marked "inactive". An
SIDX value becomes active if at least one sample description SIDX value becomes active if at least one sample description
identified by that SIDX has been received. Since sample descriptions identified by that SIDX has been received. Since sample descriptions
MAY be sent redundantly, it is possible that a client receives a MAY be sent redundantly, it is possible that a client receives a
given SIDX several times. However, active sample descriptions SHALL given SIDX several times. However, active sample descriptions SHALL
NOT be overwritten: the receiver SHALL ignore redundant sample NOT be overwritten: the receiver SHALL ignore redundant sample
descriptions and it MUST use the already cached copy. The "guard descriptions and it MUST use the already cached copy. The "guard
interval" of (64) inactive values ensures that always the correct interval" of (64) inactive values ensures that always the correct
skipping to change at page 25, line 33 skipping to change at page 25, line 34
Chunk box ("stsc") for that sample. For live streaming, the Chunk box ("stsc") for that sample. For live streaming, the
first value MAY be zero or any other value in the interval first value MAY be zero or any other value in the interval
above. Go to step 2. above. Go to step 2.
2. First in-band sample description with SIDX=Z is received and 2. First in-band sample description with SIDX=Z is received and
stored. Set X=Z. Go to step 3. stored. Set X=Z. Go to step 3.
3. Any SIDX within the interval [X+1 modulo(128), X+64 modulo(128)] 3. Any SIDX within the interval [X+1 modulo(128), X+64 modulo(128)]
is marked as inactive and any corresponding sample description is marked as inactive and any corresponding sample description
is deleted. Any SIDX within the interval [X+65 modulo(128), X] is deleted. Any SIDX within the interval [X+65 modulo(128), X]
is set active. Go to step 4 (wait state). is set active. Go to step 4 (wait state).
4. Wait for next sample description. Once the client is 4. Wait for next sample description. Once the client is
initialized, the interval of active SIDX values MUST change initialized, the interval of active SIDX values MUST change
whenever a sample description with an SIDX value in the inactive whenever a sample description with an SIDX value in the inactive
set is received. I.e., upon reception of a sample description set is received. I.e., upon reception of a sample description
with SIDX=Z do: with SIDX=Z do:
a. If Z is in the (closed) interval [X+1 modulo(128), X+64 a. If Z is in the (closed) interval [X+1 modulo(128), X+64
modulo(128)] then set X=Z, store the sample description and modulo(128)] then set X=Z, store the sample description and
go to step 3. go to step 3.
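The visible steps above (2, 3 and 4a), together with the rule that active sample descriptions are never overwritten, can be sketched as follows. This is only an illustrative receiver-side sketch, not the normative procedure: the class name `SidxWindow`, its methods, and the use of a dictionary as the description cache are hypothetical. The handling of an active SIDX that has no cached copy is also an assumption, since that part of step 4 is not shown here.

```python
SIDX_SPACE = 128   # dynamic SIDX values are 0..127
GUARD = 64         # size of the inactive (guard) interval


class SidxWindow:
    """Tracks active/inactive dynamic SIDX values on the receiver side."""

    def __init__(self):
        self.x = None      # X in the steps above; None until step 2 has run
        self.cache = {}    # SIDX -> stored sample description

    def is_inactive(self, z):
        # Inactive set is the closed interval [X+1 mod 128, X+64 mod 128].
        return 1 <= (z - self.x) % SIDX_SPACE <= GUARD

    def _drop_inactive(self):
        # Step 3: delete descriptions whose SIDX fell into the inactive set.
        for sidx in [s for s in self.cache if self.is_inactive(s)]:
            del self.cache[sidx]

    def receive(self, z, description):
        """Process a sample description with SIDX=z; return True if stored."""
        if not 0 <= z < SIDX_SPACE:
            raise ValueError("dynamic SIDX must be in [0..127]")
        if self.x is None or self.is_inactive(z):
            # Step 2 (first description) or step 4a (Z in the inactive set):
            # set X=Z, store the description, then recompute the window.
            self.x = z
            self.cache[z] = description
            self._drop_inactive()
            return True
        if z in self.cache:
            # Active and already cached: never overwrite, ignore the copy.
            return False
        # Assumption: an active SIDX with no cached copy is stored, X unchanged.
        self.cache[z] = description
        return True
```

With X=5, the values 6 through 69 are inactive and 70 through 127 plus 0 through 5 are active, so a reordered packet carrying SIDX=9 moves the window instead of being discarded.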
skipping to change at page 40, line 10 skipping to change at page 40, line 10
| +-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+
| | | |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 16. An RTP packet carrying last modifiers fragment (TYPE 4). Figure 16. An RTP packet carrying last modifiers fragment (TYPE 4).
4.8. Relation to RFC 3640 4.8. Relation to RFC 3640
RFC 3640 defines a payload format for the transport of any RFC 3640 defines a payload format for the transport of any
non-multiplexed MPEG-4 elementary stream. One of the various MPEG-4 non-multiplexed MPEG-4 elementary stream. One of the various MPEG-4
elementary stream types is the MPEG-4 timed text stream, specified in elementary stream types is the MPEG-4 timed text stream, specified in
MPEG-4 part 17 [31], also known as ISO/IEC 14496-17. MPEG-4 timed MPEG-4 part 17 [28], also known as ISO/IEC 14496-17. MPEG-4 timed
text streams are capable of carrying 3GPP timed text data, as text streams are capable of carrying 3GPP timed text data, as
specified in 3GPP TS 26.245 [1]. specified in 3GPP TS 26.245 [1].
MPEG-4 timed text streams are intentionally constructed so as to MPEG-4 timed text streams are intentionally constructed so as to
guarantee interoperability between RFC 3640 and this payload format. guarantee interoperability between RFC 3640 and this payload format.
This means that the construction of the RTP packets carrying timed This means that the construction of the RTP packets carrying timed
text is the same. I.e., the MPEG-4 timed text elementary stream as text is the same. I.e., the MPEG-4 timed text elementary stream as
per ISO/IEC 14496-17 is identical to the (aggregate) payloads per ISO/IEC 14496-17 is identical to the (aggregate) payloads
constructed using this payload format. constructed using this payload format.
skipping to change at page 41, line 10 skipping to change at page 41, line 10
| RTP packets according to this Payload Format = | | RTP packets according to this Payload Format = |
|= RTP packets carrying MPEG-4 Timed Text ES over RFC3640 | |= RTP packets carrying MPEG-4 Timed Text ES over RFC3640 |
+-------------------------------------------------------------------+ +-------------------------------------------------------------------+
Figure 11. Relation to RFC 3640. Figure 11. Relation to RFC 3640.
Note: the use of RFC 3640 for transport of ISO/IEC 14496-17 data does Note: the use of RFC 3640 for transport of ISO/IEC 14496-17 data does
not require any new SDP parameters or any new mode definition. not require any new SDP parameters or any new mode definition.
4.9. Relation to RFC 2793 4.9. Relation to RFC 2793
RFC 2793 [27] and its revision [28] specify a protocol for RFC 2793 [24] and its revision [25] specify a protocol for
enabling text conversation. Typical applications of this payload enabling text conversation. Typical applications of this payload
format are text communication terminals and text conferencing tools. format are text communication terminals and text conferencing tools.
Text session contents are specified in ITU-T Recommendation T.140 Text session contents are specified in ITU-T Recommendation T.140
[29]. T.140 text is UTF-8 coded as specified in T.140 [29] with no [26]. T.140 text is UTF-8 coded as specified in T.140 [26] with no
extra framing. The T140block contains one or more T.140 code extra framing. The T140block contains one or more T.140 code
elements as specified in T.140. Code elements are control sequences elements as specified in T.140. Code elements are control sequences
such as "New Line", "Interrupt", "String Terminator" or "Start of such as "New Line", "Interrupt", "String Terminator" or "Start of
String". Most T.140 code elements are single ISO 10646 [30] String". Most T.140 code elements are single ISO 10646 [27]
characters, but some are multiple character sequences. Each character characters, but some are multiple character sequences. Each character
is UTF-8 encoded [18] into one or more octets. is UTF-8 encoded [18] into one or more octets.
This payload format may also be used for conversational applications This payload format may also be used for conversational applications
(even for instant messaging). However, this is not its main target. (even for instant messaging). However, this is not its main target.
The differentiating feature of the 3GPP Timed Text media format The differentiating feature of the 3GPP Timed Text media format
is that it allows text decoration. This is especially useful in is that it allows text decoration. This is especially useful in
multimedia presentations, karaoke, commercial banners, news tickers, multimedia presentations, karaoke, commercial banners, news tickers,
clickable text strings and captions. T.140 text contents clickable text strings and captions. T.140 text contents
used in RFC 2793 do not allow the use of text decoration. used in RFC 2793 do not allow the use of text decoration.
skipping to change at page 44, line 31 skipping to change at page 44, line 31
one track (or stream) is expressed relative to another track. This one track (or stream) is expressed relative to another track. This
is different to the 3GP file, where the upper left corner is the is different to the 3GP file, where the upper left corner is the
reference for all translation offsets. Hence, only if the position reference for all translation offsets. Hence, only if the position
in SMIL is relative to the video track origin, then this translation in SMIL is relative to the video track origin, then this translation
offset has the same value as (tx, ty) in the 3GP file. offset has the same value as (tx, ty) in the 3GP file.
Note also that the original track header information is used for each Note also that the original track header information is used for each
track only within its region, as assigned by SMIL. Therefore, even track only within its region, as assigned by SMIL. Therefore, even
if SMIL scene description is used, the track header information if SMIL scene description is used, the track header information
pieces SHOULD be sent anyway as they represent the intrinsic media pieces SHOULD be sent anyway as they represent the intrinsic media
properties. See 3GPP SMIL Language Profile in [32] for details. properties. See 3GPP SMIL Language Profile in [29] for details.
7.3. Finding layout values in a 3GP file 7.3. Finding layout values in a 3GP file
In a 3GP file, within the Track Header Box (tkhd): In a 3GP file, within the Track Header Box (tkhd):
o tx, ty: these values specify the translation offset of the o tx, ty: these values specify the translation offset of the
(text) track relative to the upper left corner of the video (text) track relative to the upper left corner of the video
track, if present. They are the second but last and third track, if present. They are the second but last and third
but last values in the unity matrix; values are fixed-point but last values in the unity matrix; values are fixed-point
16.16 values, restricted to be (signed) integers (i.e., the 16.16 values, restricted to be (signed) integers (i.e., the
skipping to change at page 45, line 4 skipping to change at page 44, line 52
lower 16 bits of each value shall be all zeros). Therefore, lower 16 bits of each value shall be all zeros). Therefore,
only the first 16 bits are used for obtaining the value of only the first 16 bits are used for obtaining the value of
the media type parameters. the media type parameters.
o width, height: they have the same name in the tkhd box. All o width, height: they have the same name in the tkhd box. All
(unsigned) 32 bits are meaningful. (unsigned) 32 bits are meaningful.
o layer: all (signed) 16 bits are used. o layer: all (signed) 16 bits are used.
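As an illustration of the tx/ty rule above, the following sketch converts a raw 32-bit fixed-point 16.16 matrix entry into the signed integer used for the media type parameter. The function name `fixed_16_16_to_int` is hypothetical, and rejecting a value whose lower 16 bits are non-zero is an assumption about error handling; the text only states that those bits shall be all zeros.

```python
def fixed_16_16_to_int(raw):
    """Return the signed integer part of an unsigned 32-bit 16.16 value."""
    if not 0 <= raw <= 0xFFFFFFFF:
        raise ValueError("expected an unsigned 32-bit value")
    if raw & 0xFFFF:
        # Assumption: treat a non-zero fractional part as an error.
        raise ValueError("lower 16 bits must be all zeros")
    upper = raw >> 16                      # only the first 16 bits are used
    # Interpret the upper 16 bits as a two's-complement signed integer.
    return upper - 0x10000 if upper & 0x8000 else upper
```

For example, a raw value of 0x000A0000 yields a translation offset of 10, and 0xFFFB0000 yields -5.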
8. 3GPP Timed Text Media Type 8. 3GPP Timed Text Media Type
The media subtype for the 3GPP Timed Text codec is allocated from the The media subtype for the 3GPP Timed Text codec is allocated from the
IETF tree. The top-level media type under which this payload format standards tree. The top-level media type under which this payload
is registered is 'text'. format is registered is 'video'. This registration is done using the
template defined in [31] and following RFC 3555 [30].
The receiver MUST ignore any unrecognized parameter. The receiver MUST ignore any unrecognized parameter.
Media type: text Media type: video
Media subtype: 3gpp-tt Media subtype: 3gpp-tt
Required parameters Required parameters
rate: rate:
Refer to Section 3 in RFCXXXX. Refer to Section 3 in RFCXXXX.
sver: sver:
The parameter "sver" contains a list of supported The parameter "sver" contains a list of supported
skipping to change at page 46, line 55 skipping to change at page 46, line 53
parameter supports. This value is the decimal parameter supports. This value is the decimal
representation of a 32-bit unsigned integer. representation of a 32-bit unsigned integer.
max-h: max-h:
This parameter indicates display capabilities. This is This parameter indicates display capabilities. This is
the maximum "height" value that the sender of this the maximum "height" value that the sender of this
parameter supports. This value is the decimal parameter supports. This value is the decimal
representation of a 32-bit unsigned integer. representation of a 32-bit unsigned integer.
Encoding considerations: Encoding considerations:
RTP payloads complying with this payload format contain binary This media type is framed (see section 4.8 in [31]) and
data. partially contains binary data.
Note that this type is incompatible with the use of text media
types in other protocols, e.g. text/html. This is because in
order to extract and decode any of the timed text media it is
necessary to understand the (binary) payload headers defined in
RFCXXXX.
Restrictions on usage: Restrictions on usage:
This type is only defined for transfer via RTP. This media type depends on RTP framing, and hence is only
defined for transfer via RTP [3]. Transport within other framing
protocols is not defined at this time.
Security considerations: Security considerations:
Please refer to Section 11 of RFCXXXX. Please refer to Section 11 of RFCXXXX.
Interoperability considerations: Interoperability considerations:
The 3GPP Timed Text media format and its storage are specified in The 3GPP Timed Text media format and its file storage are
Release 6 of 3GPP TS 26.245 "Transparent end-to-end packet specified in Release 6 of 3GPP TS 26.245 "Transparent end-to-end
switched streaming service (PSS); Timed Text Format (Release packet switched streaming service (PSS); Timed Text Format
6)". The 3GPP file format (3GP) and the SMIL language profile (Release 6)". Note also that 3GPP may in future Releases
used can be found in Release 5 of 3GPP TS 26.234 and in the specify extensions or updates to the timed text media format in
corresponding specifications for later Releases. Note also that a backwards-compatible way, e.g. new modifier boxes or
3GPP may in future Releases specify extensions or updates to the extensions to the sample descriptions. The payload format
timed text media format in a backwards-compatible way, e.g. new defined in RFCXXXX allows for such extensions. For future 3GPP
modifier boxes or extensions to the sample descriptions. The Releases of the Timed Text Format, the parameter "sver" is used
payload format defined in RFCXXXX allows for such extensions. to identify the exact specification used.
For future 3GPP Releases of the Timed Text Format, the parameter
"sver" is used to identify the exact specification used. The defined storage format for 3GPP Timed Text format is the
3GPP File Format (3GP) [32]. 3GP files may be transferred using
the media type video/3gpp as registered by RFC 3839 [33]. The
3GPP File Format is a container file that may contain, e.g.,
audio and video which may be synchronized with the
3GPP Timed Text.
Published specification: RFC XXXX Published specification: RFC XXXX
Applications which use this media type: Applications which use this media type:
Multimedia streaming applications. Multimedia streaming applications.
Additional information: Additional information:
The 3GPP Timed Text media format is specified in 3GPP TS 26.245 The 3GPP Timed Text media format is specified in 3GPP TS 26.245
skipping to change at page 48, line 24 skipping to change at page 48, line 24
IESG. IESG.
9. SDP usage 9. SDP usage
9.1. Mapping to SDP 9.1. Mapping to SDP
The information carried in the media type specification has a The information carried in the media type specification has a
specific mapping to fields in SDP [4]. If SDP is used to specify specific mapping to fields in SDP [4]. If SDP is used to specify
sessions using this payload format, the mapping is done as follows: sessions using this payload format, the mapping is done as follows:
o The media type ("text") goes in the SDP "m=" as the media name. o The media type ("video") goes in the SDP "m=" as the media name.
m=text <port number> RTP/<RTP profile> <dynamic payload type> m=video <port number> RTP/<RTP profile> <dynamic payload type>
o The media subtype ("3gpp-tt") and the timestamp clockrate "rate" o The media subtype ("3gpp-tt") and the timestamp clockrate "rate"
(the RECOMMENDED 1000 Hz or other value) go in the SDP "a=rtpmap" line (the RECOMMENDED 1000 Hz or other value) go in the SDP "a=rtpmap" line
as the encoding name and rate, respectively: as the encoding name and rate, respectively:
a=rtpmap:<payload type> 3gpp-tt/1000 a=rtpmap:<payload type> 3gpp-tt/1000
o The REQUIRED parameter "sver" goes in the SDP "a=fmtp" attribute o The REQUIRED parameter "sver" goes in the SDP "a=fmtp" attribute
by copying it directly from the media type string as a semicolon by copying it directly from the media type string as a semicolon
separated parameter=value pair. separated parameter=value pair.
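The mapping described so far can be sketched in Python as follows. This is only an illustration, not part of the payload format: the function name build_timed_text_sdp, its argument names and the port number used in the usage note are hypothetical. The media name is left as an argument because the two revisions compared here differ on it ("text" in -14, "video" in -15).

```python
def build_timed_text_sdp(port, payload_type, fmtp_params,
                         rate=1000, profile="AVP", media="video"):
    """Return the m=, a=rtpmap and a=fmtp lines for a 3gpp-tt stream.

    Hypothetical helper: the names and defaults are assumptions made
    for illustration and are not defined by the draft.
    """
    lines = [
        # Media type goes in "m=" as the media name.
        f"m={media} {port} RTP/{profile} {payload_type}",
        # Media subtype and clock rate go in "a=rtpmap".
        f"a=rtpmap:{payload_type} 3gpp-tt/{rate}",
    ]
    if fmtp_params:
        # Parameters are copied as semicolon-separated name=value pairs.
        pairs = "; ".join(f"{name}={value}"
                          for name, value in fmtp_params.items())
        lines.append(f"a=fmtp:{payload_type} {pairs}")
    return lines
```

For instance, build_timed_text_sdp(49170, 98, {"sver": "6256,60"}) yields an "m=video 49170 RTP/AVP 98" line followed by the matching "a=rtpmap" and "a=fmtp" lines.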
skipping to change at page 52, line 37 skipping to change at page 52, line 37
9.3. Offer/Answer Examples 9.3. Offer/Answer Examples
In these unicast O/A examples the long lines are wrapped around. In these unicast O/A examples the long lines are wrapped around.
Static sample descriptions are shortened for clarity. Static sample descriptions are shortened for clarity.
For sendrecv: For sendrecv:
O -> A O -> A
m=text <port> RTP/AVP 98 m=video <port> RTP/AVP 98
a=rtpmap:98 3gpp-tt/1000 a=rtpmap:98 3gpp-tt/1000
a=fmtp:98 tx=100; ty=100; layer=0; height=80; width=100; max-h=120; a=fmtp:98 tx=100; ty=100; layer=0; height=80; width=100; max-h=120;
max-w=160; sver=6256,60; tx3g=81... max-w=160; sver=6256,60; tx3g=81...
a=sendrecv a=sendrecv
A -> O A -> O
m=text <port> RTP/AVP 98.. m=video <port> RTP/AVP 98..
a=rtpmap:98 3gpp-tt/1000 a=rtpmap:98 3gpp-tt/1000
a=fmtp:98 tx=100; ty=95; layer=0; height=90; width=100; max-h=100; a=fmtp:98 tx=100; ty=95; layer=0; height=90; width=100; max-h=100;
max-w=160; sver=60; tx3g=82... max-w=160; sver=60; tx3g=82...
a=sendrecv a=sendrecv
In this example the offerer is telling the answerer where it will In this example the offerer is telling the answerer where it will
place the received stream and what is the maximum height and width place the received stream and what is the maximum height and width
allowable for the stream that it will receive. Also, it tells the allowable for the stream that it will receive. Also, it tells the
answerer the dimensions of the text track for the stream sent and answerer the dimensions of the text track for the stream sent and
which sample description it shall use. It offers two versions, 6256 which sample description it shall use. It offers two versions, 6256
skipping to change at page 53, line 20 skipping to change at page 53, line 20
matches the answerer's capabilities as expressed by "max-h" and matches the answerer's capabilities as expressed by "max-h" and
"max-w" in the negative answer. Note also that the answerer's text "max-w" in the negative answer. Note also that the answerer's text
box dimensions fit within the maximum values signalled in the offer. box dimensions fit within the maximum values signalled in the offer.
Finally, the answerer chooses to use version 60 of the timed text Finally, the answerer chooses to use version 60 of the timed text
format. format.
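The checks described for the sendrecv example can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the helper names parse_fmtp, box_fits and versions are hypothetical, the "tx3g" values are left out because the examples shorten them, and only the dimension and "sver" comparisons mentioned in the text are modelled.

```python
def parse_fmtp(value):
    """Split an a=fmtp parameter string into a name -> value dict."""
    params = {}
    for item in value.split(";"):
        item = item.strip()
        if not item:
            continue
        name, _, val = item.partition("=")
        params[name.strip()] = val.strip()
    return params


def box_fits(sender, receiver):
    """True if the sender's height/width are within the receiver's
    max-h/max-w (hypothetical check, per the prose above)."""
    return (int(sender["height"]) <= int(receiver["max-h"])
            and int(sender["width"]) <= int(receiver["max-w"]))


def versions(params):
    """Return the set of versions listed in the "sver" parameter."""
    return set(params["sver"].split(","))


# fmtp values taken from the sendrecv example, without "tx3g".
offer = parse_fmtp("tx=100; ty=100; layer=0; height=80; width=100; "
                   "max-h=120; max-w=160; sver=6256,60")
answer = parse_fmtp("tx=100; ty=95; layer=0; height=90; width=100; "
                    "max-h=100; max-w=160; sver=60")

offer_ok = box_fits(offer, answer)    # offerer's stream vs. answerer's limits
answer_ok = box_fits(answer, offer)   # answerer's stream vs. offerer's limits
sver_ok = versions(answer) <= versions(offer)
```

With the values from the example, all three checks succeed, which corresponds to the answerer accepting the stream and selecting version 60.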
For recvonly: For recvonly:
Offerer -> Answerer Offerer -> Answerer
m=text <port> RTP/AVP 98 m=video <port> RTP/AVP 98
a=rtpmap:98 3gpp-tt/1000 a=rtpmap:98 3gpp-tt/1000
a=fmtp:98 tx=100; ty=100; layer=0; max-h=120; max-w=160; sver=6256,60 a=fmtp:98 tx=100; ty=100; layer=0; max-h=120; max-w=160; sver=6256,60
a=recvonly a=recvonly
A -> O A -> O
m=text <port> RTP/AVP 98.. m=video <port> RTP/AVP 98..
a=rtpmap:98 3gpp-tt/1000 a=rtpmap:98 3gpp-tt/1000
a=fmtp:98 tx=100; ty=100; layer=0; height=90; width=100; sver=60; a=fmtp:98 tx=100; ty=100; layer=0; height=90; width=100; sver=60;
tx3g=82... tx3g=82...
a=sendonly a=sendonly
In this case, the offer is different from the previous case: it does In this case, the offer is different from the previous case: it does
not include the stream properties: "height", "width" and "tx3g". The not include the stream properties: "height", "width" and "tx3g". The
answerer copies the "tx", "ty" and "layer" values, thus acknowledging answerer copies the "tx", "ty" and "layer" values, thus acknowledging
these. "max-h" and "max-w" are not present in the answer because the these. "max-h" and "max-w" are not present in the answer because the
"tx" and "ty" (and "layer") in this special case do not apply to the "tx" and "ty" (and "layer") in this special case do not apply to the
received, but to the sent stream. Also, if offerer and answerer had received, but to the sent stream. Also, if offerer and answerer had
very different display sizes, it would not be possible to express very different display sizes, it would not be possible to express
the answerer's capabilities. In the example above and for an the answerer's capabilities. In the example above and for an
answerer with a 50x50 display, the translation values are already out answerer with a 50x50 display, the translation values are already out
of range. of range.
For sendonly: For sendonly:
O -> A O -> A
m=text <port> RTP/AVP 98 m=video <port> RTP/AVP 98
a=rtpmap:98 3gpp-tt/1000 a=rtpmap:98 3gpp-tt/1000
a=fmtp:98 tx=100; ty=100; layer=0; height=80; width=100; a=fmtp:98 tx=100; ty=100; layer=0; height=80; width=100;
sver=6256,60; tx3g=81... sver=6256,60; tx3g=81...
a=sendonly a=sendonly
A -> O A -> O
m=text <port> RTP/AVP 98.. m=video <port> RTP/AVP 98..
a=rtpmap:98 3gpp-tt/1000 a=rtpmap:98 3gpp-tt/1000
a=fmtp:98 tx=100; ty=100; layer=0; height=80; width=100; max-h=100; a=fmtp:98 tx=100; ty=100; layer=0; height=80; width=100; max-h=100;
max-w=160; sver=60 max-w=160; sver=60
a=recvonly a=recvonly
Note that "max-h" and "max-w" are not present in the offer. Also, Note that "max-h" and "max-w" are not present in the offer. Also,
with this answer, the answerer would accept the offer as is (thus with this answer, the answerer would accept the offer as is (thus
echoing "tx", "ty", "height", "width" and "layer") and additionally echoing "tx", "ty", "height", "width" and "layer") and additionally
inform the offerer about its capabilities: "max-h" and "max-w". inform the offerer about its capabilities: "max-h" and "max-w".
Another possible answer for this case would be: Another possible answer for this case would be:
A -> O A -> O
m=text <port> RTP/AVP 98.. m=video <port> RTP/AVP 98..
a=rtpmap:98 3gpp-tt/1000 a=rtpmap:98 3gpp-tt/1000
a=fmtp:98 tx=120; ty=105; layer=0; max-h=95; max-w=150; sver=60 a=fmtp:98 tx=120; ty=105; layer=0; max-h=95; max-w=150; sver=60
a=recvonly a=recvonly
In this case the answerer does not accept the values offered. The In this case the answerer does not accept the values offered. The
offerer MUST use these values or else remove the stream. offerer MUST use these values or else remove the stream.
9.4. Parameter Usage outside of Offer/Answer 9.4. Parameter Usage outside of Offer/Answer
SDP may also be employed outside of the Offer/Answer context, for SDP may also be employed outside of the Offer/Answer context, for
skipping to change at page 54, line 44 skipping to change at page 54, line 46
In this case, the receiver of a session description is required to In this case, the receiver of a session description is required to
support the parameters and given values for the streams or else it support the parameters and given values for the streams or else it
MUST reject the session. It is the responsibility of the sender (or MUST reject the session. It is the responsibility of the sender (or
creator) of the session descriptions to define the session parameters creator) of the session descriptions to define the session parameters
so that the probability of unsuccessful session setup is minimized. so that the probability of unsuccessful session setup is minimized.
This is out of the scope of this document. This is out of the scope of this document.
10. IANA Considerations 10. IANA Considerations
IANA is requested to register the media subtype name "3gpp-tt" for IANA is requested to register the media subtype name "3gpp-tt" for
the media type "text" as specified in Section 8 of this document. the media type "video" as specified in Section 8 of this document.
11. Security considerations 11. Security considerations
RTP packets using the payload format defined in this specification RTP packets using the payload format defined in this specification
are subject to the security considerations discussed in the RTP are subject to the security considerations discussed in the RTP
specification [3] and any applicable RTP profile, e.g. AVP [17]. specification [3] and any applicable RTP profile, e.g. AVP [17].
In particular, an attacker may invalidate the current set of active In particular, an attacker may invalidate the current set of active
sample descriptions at the client by means of repeating a packet with sample descriptions at the client by means of repeating a packet with
an old sample description, i.e. replay attack. This would mean that an old sample description, i.e. replay attack. This would mean that
skipping to change at page 55, line 19 skipping to change at page 55, line 22
lengths (SLEN). This may cause a decoder to crash. lengths (SLEN). This may cause a decoder to crash.
These types of attack may easily be avoided by using source These types of attack may easily be avoided by using source
authentication and integrity protection. authentication and integrity protection.
Additionally, peers in a timed text session may desire to retain Additionally, peers in a timed text session may desire to retain
privacy in their communication, i.e. confidentiality. privacy in their communication, i.e. confidentiality.
This payload format does not provide any mechanisms for achieving This payload format does not provide any mechanisms for achieving
these. Confidentiality, integrity protection and authentication have these. Confidentiality, integrity protection and authentication have
to be solved by a mechanism external to this payload format, e.g. to be solved by a mechanism external to this payload format, e.g.,
SRTP [10]. SRTP [10].
12. References 12. References
12.1. Normative References 12.1. Normative References
[1] Transparent end-to-end packet switched streaming service (PSS); [1] Transparent end-to-end packet switched streaming service (PSS);
Timed Text Format (Release 6), TS 26.245 v 6.0.0, June 2004. Timed Text Format (Release 6), TS 26.245 v 6.0.0, June 2004.
[2] ISO/IEC 14496-12:2004 Information technology - Coding of [2] ISO/IEC 14496-12:2004 Information technology - Coding of
skipping to change at page 56, line 45 skipping to change at page 56, line 45
[19] P. Hoffman, F. Yergeau, "UTF-16, an encoding of ISO 10646", RFC [19] P. Hoffman, F. Yergeau, "UTF-16, an encoding of ISO 10646", RFC
2781, February 2000. 2781, February 2000.
[20] Friedman, et al., "RTP Control Protocol Extended Reports (RTCP [20] Friedman, et al., "RTP Control Protocol Extended Reports (RTCP
XR)", RFC 3611, November 2003. XR)", RFC 3611, November 2003.
[21] Ott, et al., "Extended RTP Profile for RTCP-based Feedback [21] Ott, et al., "Extended RTP Profile for RTCP-based Feedback
(RTP/AVPF)", draft-ietf-avt-rtcp-feedback-11.txt, work in (RTP/AVPF)", draft-ietf-avt-rtcp-feedback-11.txt, work in
progress, August 2004. progress, August 2004.
[22] ISO/IEC 14496-2:2004: "Information technology - Coding of [22] IETF RFC 3267: "Real-Time Transport Protocol (RTP) Payload
audio-visual objects - Part 2: Visual"
[23] 3GPP TS 26.171: "AMR Wideband Speech Codec; General
Description".
[24] 3GPP TS 26.401: "General audio codec audio processing functions;
Enhanced aacPlus general audio codec; General description".
[25] IETF RFC 3267: "Real-Time Transport Protocol (RTP) Payload
Format and File Storage Format for the Adaptive Multi-Rate (AMR) Format and File Storage Format for the Adaptive Multi-Rate (AMR)
and Adaptive Multi-Rate Wideband (AMR-WB) Audio Codecs", Sjoberg J. et and Adaptive Multi-Rate Wideband (AMR-WB) Audio Codecs", Sjoberg J. et
al., June 2002. al., June 2002.
[26] IETF RFC 3016: "RTP Payload Format for MPEG-4 Audio/Visual [23] IETF RFC 3016: "RTP Payload Format for MPEG-4 Audio/Visual
Streams", Kikuchi Y. et al., November 2000. Streams", Kikuchi Y. et al., November 2000.
[27] G. Hellstrom, "RTP Payload for Text Conversation", RFC 2793, May [24] G. Hellstrom, "RTP Payload for Text Conversation", RFC 2793, May
2000. 2000.
[28] G. Hellstrom, P. Jones, "RTP Payload for Text Conversation", [25] G. Hellstrom, P. Jones, "RTP Payload for Text Conversation",
draft-ietf-avt-rfc2793bis-09.txt, Work In Progress, August 2004. draft-ietf-avt-rfc2793bis-09.txt, Work In Progress, August 2004.
[29] ITU-T Recommendation T.140 (1998) - Text conversation protocol [26] ITU-T Recommendation T.140 (1998) - Text conversation protocol
for multimedia application, with amendment 1, (2000). for multimedia application, with amendment 1, (2000).
[30] ISO/IEC 10646-1: (1993), Universal Multiple Octet Coded [27] ISO/IEC 10646-1: (1993), Universal Multiple Octet Coded
Character Set. Character Set.
[31] ISO/IEC FCD 14496-17 Information technology - Coding of [28] ISO/IEC FCD 14496-17 Information technology - Coding of
audio-visual objects - Part 17: Streaming text format, Work in audio-visual objects - Part 17: Streaming text format, Work in
progress, June 2004. progress, June 2004.
[32] Transparent end-to-end Packet-switched Streaming Service (PSS); [29] Transparent end-to-end Packet-switched Streaming Service (PSS);
3GPP SMIL language profile, (Release 6), TS 26.246 v 6.0.0, June 3GPP SMIL language profile, (Release 6), TS 26.246 v 6.0.0, June
2004. 2004.
[30] Casner, S. and P. Hoschka, "MIME Type Registration of RTP
Payload Formats", RFC 3555, July 2003.
[31] Freed, N. and J. Klensin, "Media Type Specifications and
Registration Procedures", draft-freed-media-type-reg-04, April
2005.
[32] Transparent end-to-end packet switched streaming service (PSS);
3GPP file format (3GP) (Release 6), TS 26.244 V6.3. March 2005.
[33] Castagno, R. and D. Singer, "MIME Type Registrations for 3rd
Generation Partnership Project (3GPP) Multimedia files", RFC 3839,
July 2004.
13. Annexes 13. Annexes
13.1. Basics of the 3GP File Structure 13.1. Basics of the 3GP File Structure
This section provides a coarse overview of the 3GP file structure, This section provides a coarse overview of the 3GP file structure,
which follows the ISO Base Media file Format [2]. which follows the ISO Base Media file Format [2].
Each 3GP file consists of "Boxes". In general, a 3GP file contains Each 3GP file consists of "Boxes". In general, a 3GP file contains
the File Type Box (ftyp), the Movie Box (moov), and the Media Data the File Type Box (ftyp), the Movie Box (moov), and the Media Data
Box (mdat). The File Type Box identifies the type and properties of Box (mdat). The File Type Box identifies the type and properties of
 End of changes. 44 change blocks. 
137 lines changed or deleted 137 lines changed or added
