Internet Engineering Task Force     Audio/Video Transport Working Group
Internet Draft                    Schulzrinne/Casner/Frederick/Jacobson
draft-ietf-avt-rtp-new-11.txt                Columbia U./Packet Design/
                                                Cacheflow/Packet Design
November 20, 2001
Expires: May 2002

          RTP: A Transport Protocol for Real-Time Applications

STATUS OF THIS MEMO

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress".

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   To view the list of Internet-Draft Shadow Directories, see
   http://www.ietf.org/shadow.html.

Abstract

   This memorandum is a revision of RFC 1889 in preparation for
   advancement from Proposed Standard to Draft Standard status.

   This memorandum describes RTP, the real-time transport protocol.
   RTP provides end-to-end network transport functions suitable for
   applications transmitting real-time data, such as audio, video or
   simulation data, over multicast or unicast network services. RTP does
   not address resource reservation and does not guarantee
   quality-of-service for real-time services. The data transport is
   augmented by a control protocol (RTCP) to allow monitoring of the
   data delivery in a manner scalable to large multicast networks, and
   to provide minimal control and identification functionality. RTP and
   RTCP are designed to be independent of the underlying transport and
   network layers. The protocol supports the use of RTP-level
   translators and mixers.

   This specification is a product of the Audio/Video Transport working
   group within the Internet Engineering Task Force. Comments are
   solicited and should be addressed to the working group's mailing list
   at avt@ietf.org and/or the authors.

Contents

   1       Introduction
   1.1     Terminology
   2       RTP Use Scenarios
   2.1     Simple Multicast Audio Conference
   2.2     Audio and Video Conference
   2.3     Mixers and Translators
   2.4     Layered Encodings
   3       Definitions
   4       Byte Order, Alignment, and Time Format
   5       RTP Data Transfer Protocol
   5.1     RTP Fixed Header Fields
   5.2     Multiplexing RTP Sessions
   5.3     Profile-Specific Modifications to the RTP Header
   5.3.1   RTP Header Extension
   6       RTP Control Protocol -- RTCP
   6.1     RTCP Packet Format
   6.2     RTCP Transmission Interval
   6.2.1   Maintaining the number of session members
   6.3     RTCP Packet Send and Receive Rules
   6.3.1   Computing the RTCP transmission interval
   6.3.2   Initialization
   6.3.3   Receiving an RTP or non-BYE RTCP packet
   6.3.4   Receiving an RTCP BYE packet
   6.3.5   Timing Out an SSRC
   6.3.6   Expiration of transmission timer
   6.3.7   Transmitting a BYE packet
   6.3.8   Updating we_sent
   6.3.9   Allocation of source description bandwidth
   6.4     Sender and Receiver Reports
   6.4.1   SR: Sender report RTCP packet
   6.4.2   RR: Receiver report RTCP packet
   6.4.3   Extending the sender and receiver reports
   6.4.4   Analyzing sender and receiver reports
   6.5     SDES: Source description RTCP packet
   6.5.1   CNAME: Canonical end-point identifier SDES item
   6.5.2   NAME: User name SDES item
   6.5.3   EMAIL: Electronic mail address SDES item
   6.5.4   PHONE: Phone number SDES item
   6.5.5   LOC: Geographic user location SDES item
   6.5.6   TOOL: Application or tool name SDES item
   6.5.7   NOTE: Notice/status SDES item
   6.5.8   PRIV: Private extensions SDES item
   6.6     BYE: Goodbye RTCP packet
   6.7     APP: Application-defined RTCP packet
   7       RTP Translators and Mixers
   7.1     General Description
   7.2     RTCP Processing in Translators
   7.3     RTCP Processing in Mixers
   7.4     Cascaded Mixers
   8       SSRC Identifier Allocation and Use
   8.1     Probability of Collision
   8.2     Collision Resolution and Loop Detection
   8.3     Use with Layered Encodings
   9       Security
   9.1     Confidentiality
   9.2     Authentication and Message Integrity
   10      Congestion Control
   11      RTP over Network and Transport Protocols
   12      Summary of Protocol Constants
   12.1    RTCP packet types
   12.2    SDES types
   13      RTP Profiles and Payload Format Specifications
   14      IANA Considerations
   A       Algorithms
   A.1     RTP Data Header Validity Checks
   A.2     RTCP Header Validity Checks
   A.3     Determining the Number of RTP Packets Expected and Lost
   A.4     Generating SDES RTCP Packets
   A.5     Parsing RTCP SDES Packets
   A.6     Generating a Random 32-bit Identifier
   A.7     Computing the RTCP Transmission Interval
   A.8     Estimating the Interarrival Jitter
   B       Changes from RFC 1889
   C       Security Considerations
   D       Full Copyright Statement
   E       Addresses of Authors

1 Introduction

   [Note to the RFC Editor: This paragraph and the first paragraph of
   the Abstract are to be deleted when this draft is published as an
   RFC. All RFC XXXX should be filled in with the RFC number of the RTP
   Profile for Audio and Video Conferences as it is submitted for Draft
   Standard status. Readers are directed to Appendix B, Changes from RFC
   1889, for a listing of the changes that have been made in this
   draft.]

   This memorandum specifies the real-time transport protocol (RTP),
   which provides end-to-end delivery services for data with real-time
   characteristics, such as interactive audio and video. Those services
   include payload type identification, sequence numbering, timestamping
   and delivery monitoring. Applications typically run RTP on top of UDP
   to make use of its multiplexing and checksum services; both protocols
   contribute parts of the transport protocol functionality. However,
   RTP may be used with other suitable underlying network or transport
   protocols (see Section 11). RTP supports data transfer to multiple
   destinations using multicast distribution if provided by the
   underlying network.

   Note that RTP itself does not provide any mechanism to ensure timely
   delivery or provide other quality-of-service guarantees, but relies
   on lower-layer services to do so. It does not guarantee delivery or
   prevent out-of-order delivery, nor does it assume that the underlying
   network is reliable and delivers packets in sequence. The sequence
   numbers included in RTP allow the receiver to reconstruct the
   sender's packet sequence, but sequence numbers might also be used to
   determine the proper location of a packet, for example in video
   decoding, without necessarily decoding packets in sequence.

   While RTP is primarily designed to satisfy the needs of
   multi-participant multimedia conferences, it is not limited to that
   particular application. Storage of continuous data, interactive
   distributed simulation, active badge, and control and measurement
   applications may also find RTP applicable.

   This document defines RTP, consisting of two closely-linked parts:

     o the real-time transport protocol (RTP), to carry data that has
       real-time properties.

     o the RTP control protocol (RTCP), to monitor the quality of
       service and to convey information about the participants in an
       on-going session. The latter aspect of RTCP may be sufficient
       for "loosely controlled" sessions, i.e., where there is no
       explicit membership control and set-up, but it is not
       necessarily intended to support all of an application's
       control communication requirements. This functionality may be
       fully or partially subsumed by a separate session control
       protocol, which is beyond the scope of this document.

   RTP represents a new style of protocol following the principles of
   application level framing and integrated layer processing proposed by
   Clark and Tennenhouse [10]. That is, RTP is intended to be malleable
   to provide the information required by a particular application and
   will often be integrated into the application processing rather than
   being implemented as a separate layer. RTP is a protocol framework
   that is deliberately not complete. This document specifies those
   functions expected to be common across all the applications for which
   RTP would be appropriate. Unlike conventional protocols in which
   additional functions might be accommodated by making the protocol
   more general or by adding an option mechanism that would require
   parsing, RTP is intended to be tailored through modifications and/or
   additions to the headers as needed. Examples are given in Sections
   5.3 and 6.4.3.

   Therefore, in addition to this document, a complete specification of
   RTP for a particular application will require one or more companion
   documents (see Section 13):

     o a profile specification document, which defines a set of
       payload type codes and their mapping to payload formats (e.g.,
       media encodings). A profile may also define extensions or
       modifications to RTP that are specific to a particular class
       of applications. Typically an application will operate under
       only one profile. A profile for audio and video data may be
       found in the companion RFC XXXX [1].

     o payload format specification documents, which define how a
       particular payload, such as an audio or video encoding, is to
       be carried in RTP.

   A discussion of real-time services and algorithms for their
   implementation as well as background discussion on some of the RTP
   design decisions can be found in [11].

1.1 Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [2] and
   indicate requirement levels for compliant RTP implementations.

2 RTP Use Scenarios

   The following sections describe some aspects of the use of RTP. The
   examples were chosen to illustrate the basic operation of
   applications using RTP, not to limit what RTP may be used for. In
   these examples, RTP is carried on top of IP and UDP, and follows the
   conventions established by the profile for audio and video specified
   in the companion RFC XXXX.

2.1 Simple Multicast Audio Conference

   A working group of the IETF meets to discuss the latest protocol
   draft, using the IP multicast services of the Internet for voice
   communications. Through some allocation mechanism the working group
   chair obtains a multicast group address and pair of ports.
   One port is used for audio data, and the other is used for control
   (RTCP) packets. This address and port information is distributed to
   the intended participants. If privacy is desired, the data and
   control packets may be encrypted as specified in Section 9.1, in
   which case an encryption key must also be generated and distributed.
   The exact details of these allocation and distribution mechanisms are
   beyond the scope of RTP.

   The audio conferencing application used by each conference
   participant sends audio data in small chunks of, say, 20 ms duration.
   Each chunk of audio data is preceded by an RTP header; RTP header and
   data are in turn contained in a UDP packet. The RTP header indicates
   what type of audio encoding (such as PCM, ADPCM or LPC) is contained
   in each packet so that senders can change the encoding during a
   conference, for example, to accommodate a new participant that is
   connected through a low-bandwidth link or react to indications of
   network congestion.

   The Internet, like other packet networks, occasionally loses and
   reorders packets and delays them by variable amounts of time. To cope
   with these impairments, the RTP header contains timing information
   and a sequence number that allow the receivers to reconstruct the
   timing produced by the source, so that in this example, chunks of
   audio are contiguously played out the speaker every 20 ms. This
   timing reconstruction is performed separately for each source of RTP
   packets in the conference. The sequence number can also be used by
   the receiver to estimate how many packets are being lost.

   Since members of the working group join and leave during the
   conference, it is useful to know who is participating at any moment
   and how well they are receiving the audio data.
   For that purpose, each instance of the audio application in the
   conference periodically multicasts a reception report plus the name
   of its user on the RTCP (control) port. The reception report
   indicates how well the current speaker is being received and may be
   used to control adaptive encodings. In addition to the user name,
   other identifying information may also be included subject to control
   bandwidth limits. A site sends the RTCP BYE packet (Section 6.6) when
   it leaves the conference.

2.2 Audio and Video Conference

   If both audio and video media are used in a conference, they are
   transmitted as separate RTP sessions. That is, separate RTP and RTCP
   packets are transmitted for each medium using two different UDP port
   pairs and/or multicast addresses. There is no direct coupling at the
   RTP level between the audio and video sessions, except that a user
   participating in both sessions should use the same distinguished
   (canonical) name in the RTCP packets for both so that the sessions
   can be associated.

   One motivation for this separation is to allow some participants in
   the conference to receive only one medium if they choose. Further
   explanation is given in Section 5.2. Despite the separation,
   synchronized playback of a source's audio and video can be achieved
   using timing information carried in the RTCP packets for both
   sessions.

2.3 Mixers and Translators

   So far, we have assumed that all sites want to receive media data in
   the same format. However, this may not always be appropriate.
   Consider the case where participants in one area are connected
   through a low-speed link to the majority of the conference
   participants who enjoy high-speed network access. Instead of forcing
   everyone to use a lower-bandwidth, reduced-quality audio encoding, an
   RTP-level relay called a mixer may be placed near the low-bandwidth
   area.
   This mixer resynchronizes incoming audio packets to reconstruct
   the constant 20 ms spacing generated by the sender, mixes these
   reconstructed audio streams into a single stream, translates the
   audio encoding to a lower-bandwidth one and forwards the
   lower-bandwidth packet stream across the low-speed link. These
   packets might be unicast to a single recipient or multicast on a
   different address to multiple recipients. The RTP header includes a
   means for mixers to identify the sources that contributed to a mixed
   packet so that correct talker indication can be provided at the
   receivers.

   Some of the intended participants in the audio conference may be
   connected with high bandwidth links but might not be directly
   reachable via IP multicast. For example, they might be behind an
   application-level firewall that will not let any IP packets pass. For
   these sites, mixing may not be necessary, in which case another type
   of RTP-level relay called a translator may be used. Two translators
   are installed, one on either side of the firewall, with the outside
   one funneling all multicast packets received through a secure
   connection to the translator inside the firewall. The translator
   inside the firewall sends them again as multicast packets to a
   multicast group restricted to the site's internal network.

   Mixers and translators may be designed for a variety of purposes. An
   example is a video mixer that scales the images of individual people
   in separate video streams and composites them into one video stream
   to simulate a group scene. Other examples of translation include the
   connection of a group of hosts speaking only IP/UDP to a group of
   hosts that understand only ST-II, or the packet-by-packet encoding
   translation of video streams from individual sources without
   resynchronization or mixing. Details of the operation of mixers and
   translators are given in Section 7.

2.4 Layered Encodings

   Multimedia applications should be able to adjust the transmission
   rate to match the capacity of the receiver or to adapt to network
   congestion. Many implementations place the responsibility of
   rate-adaptivity at the source. This does not work well with multicast
   transmission because of the conflicting bandwidth requirements of
   heterogeneous receivers. The result is often a least-common
   denominator scenario, where the smallest pipe in the network mesh
   dictates the quality and fidelity of the overall live multimedia
   "broadcast".

   Instead, responsibility for rate-adaptation can be placed at the
   receivers by combining a layered encoding with a layered transmission
   system. In the context of RTP over IP multicast, the source can
   stripe the progressive layers of a hierarchically represented signal
   across multiple RTP sessions each carried on its own multicast group.
   Receivers can then adapt to network heterogeneity and control their
   reception bandwidth by joining only the appropriate subset of the
   multicast groups.

   Details of the use of RTP with layered encodings are given in
   Sections 6.3.9, 8.3 and 11.

3 Definitions

   RTP payload: The data transported by RTP in a packet, for
        example audio samples or compressed video data. The payload
        format and interpretation are beyond the scope of this
        document.

   RTP packet: A data packet consisting of the fixed RTP header, a
        possibly empty list of contributing sources (see below),
        and the payload data. Some underlying protocols may require
        an encapsulation of the RTP packet to be defined. Typically
        one packet of the underlying protocol contains a single RTP
        packet, but several RTP packets MAY be contained if
        permitted by the encapsulation method (see Section 11).

   RTCP packet: A control packet consisting of a fixed header part
        similar to that of RTP data packets, followed by structured
        elements that vary depending upon the RTCP packet type. The
        formats are defined in Section 6. Typically, multiple RTCP
        packets are sent together as a compound RTCP packet in a
        single packet of the underlying protocol; this is enabled
        by the length field in the fixed header of each RTCP
        packet.

   Port: The "abstraction that transport protocols use to
        distinguish among multiple destinations within a given host
        computer. TCP/IP protocols identify ports using small
        positive integers." [12] The transport selectors (TSEL)
        used by the OSI transport layer are equivalent to ports.
        RTP depends upon the lower-layer protocol to provide some
        mechanism such as ports to multiplex the RTP and RTCP
        packets of a session.

   Transport address: The combination of a network address and port
        that identifies a transport-level endpoint, for example an
        IP address and a UDP port. Packets are transmitted from a
        source transport address to a destination transport
        address.

   RTP media type: An RTP media type is the collection of payload
        types which can be carried within a single RTP session. The
        RTP Profile assigns RTP media types to RTP payload types.

   RTP session: The association among a set of participants
        communicating with RTP. For each participant, the session
        is defined by a particular pair of destination transport
        addresses (one network address plus a port pair for RTP and
        RTCP). The destination transport address pair may be common
        for all participants, as in the case of IP multicast, or
        may be different for each, as in the case of individual
        unicast network addresses and port pairs. In a multimedia
        session, each medium is carried in a separate RTP session
        with its own RTCP packets.
        The multiple RTP sessions are distinguished by different
        port number pairs and/or different multicast addresses.

   Synchronization source (SSRC): The source of a stream of RTP
        packets, identified by a 32-bit numeric SSRC identifier
        carried in the RTP header so as not to be dependent upon
        the network address. All packets from a synchronization
        source form part of the same timing and sequence number
        space, so a receiver groups packets by synchronization
        source for playback. Examples of synchronization sources
        include the sender of a stream of packets derived from a
        signal source such as a microphone or a camera, or an RTP
        mixer (see below). A synchronization source may change its
        data format, e.g., audio encoding, over time. The SSRC
        identifier is a randomly chosen value meant to be globally
        unique within a particular RTP session (see Section 8). A
        participant need not use the same SSRC identifier for all
        the RTP sessions in a multimedia session; the binding of
        the SSRC identifiers is provided through RTCP (see Section
        6.5.1). If a participant generates multiple streams in one
        RTP session, for example from separate video cameras, each
        MUST be identified as a different SSRC.

   Contributing source (CSRC): A source of a stream of RTP packets
        that has contributed to the combined stream produced by an
        RTP mixer (see below). The mixer inserts a list of the SSRC
        identifiers of the sources that contributed to the
        generation of a particular packet into the RTP header of
        that packet. This list is called the CSRC list. An example
        application is audio conferencing where a mixer indicates
        all the talkers whose speech was combined to produce the
        outgoing packet, allowing the receiver to indicate the
        current talker, even though all the audio packets contain
        the same SSRC identifier (that of the mixer).

   End system: An application that generates the content to be sent
        in RTP packets and/or consumes the content of received RTP
        packets. An end system can act as one or more
        synchronization sources in a particular RTP session, but
        typically only one.

   Mixer: An intermediate system that receives RTP packets from one
        or more sources, possibly changes the data format, combines
        the packets in some manner and then forwards a new RTP
        packet. Since the timing among multiple input sources will
        not generally be synchronized, the mixer will make timing
        adjustments among the streams and generate its own timing
        for the combined stream. Thus, all data packets originating
        from a mixer will be identified as having the mixer as
        their synchronization source.

   Translator: An intermediate system that forwards RTP packets
        with their synchronization source identifier intact.
        Examples of translators include devices that convert
        encodings without mixing, replicators from multicast to
        unicast, and application-level filters in firewalls.

   Monitor: An application that receives RTCP packets sent by
        participants in an RTP session, in particular the reception
        reports, and estimates the current quality of service for
        distribution monitoring, fault diagnosis and long-term
        statistics. The monitor function is likely to be built into
        the application(s) participating in the session, but may
        also be a separate application that does not otherwise
        participate and does not send or receive the RTP data
        packets (since they are on a separate port). These are
        called third-party monitors. It is also acceptable for a
        third-party monitor to receive the RTP data packets but not
        send RTCP packets or otherwise be counted in the session.

   Non-RTP means: Protocols and mechanisms that may be needed in
        addition to RTP to provide a usable service.
        In particular, for multimedia conferences, a control
        protocol may distribute multicast addresses and keys for
        encryption, negotiate the encryption algorithm to be used,
        and define dynamic mappings between RTP payload type values
        and the payload formats they represent for formats that do
        not have a predefined payload type value. Examples of such
        protocols include the Session Initiation Protocol (SIP)
        (RFC 2543 [13]), H.323 [14] and applications using SDP (RFC
        2327 [15]), such as RTSP (RFC 2326 [16]). For simple
        applications, electronic mail or a conference database may
        also be used. The specification of such protocols and
        mechanisms is outside the scope of this document.

4 Byte Order, Alignment, and Time Format

   All integer fields are carried in network byte order, that is, most
   significant byte (octet) first. This byte order is commonly known as
   big-endian. The transmission order is described in detail in [3].
   Unless otherwise noted, numeric constants are in decimal (base 10).

   All header data is aligned to its natural length, i.e., 16-bit fields
   are aligned on even offsets, 32-bit fields are aligned at offsets
   divisible by four, etc. Octets designated as padding have the value
   zero.

   Wallclock time (absolute date and time) is represented using the
   timestamp format of the Network Time Protocol (NTP), which is in
   seconds relative to 0h UTC on 1 January 1900 [4]. The full resolution
   NTP timestamp is a 64-bit unsigned fixed-point number with the
   integer part in the first 32 bits and the fractional part in the last
   32 bits. In some fields where a more compact representation is
   appropriate, only the middle 32 bits are used; that is, the low 16
   bits of the integer part and the high 16 bits of the fractional part.
   The high 16 bits of the integer part must be determined
   independently.
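
   The timestamp layout described above can be sketched in C as
   follows. This is only an illustration under stated assumptions, not
   part of the specification: the type ntp64 and the functions
   ntp_from_unix() and ntp_compact() are hypothetical names, and the
   conversion from a Unix-style clock (seconds since 1970 plus
   microseconds) is one possible time source among many.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for a full-resolution 64-bit NTP timestamp. */
typedef struct {
    uint32_t seconds;   /* integer part: seconds since 0h UTC, 1 Jan 1900 */
    uint32_t fraction;  /* fractional part, in units of 1/2^32 second */
} ntp64;

/* Seconds between the NTP epoch (1900) and the Unix epoch (1970). */
#define NTP_UNIX_OFFSET UINT32_C(2208988800)

/* Assumed helper: build an NTP timestamp from Unix seconds and
 * microseconds. The addition wraps modulo 2^32, like the NTP field. */
ntp64 ntp_from_unix(uint32_t unix_sec, uint32_t usec)
{
    ntp64 t;
    assert(usec < 1000000u);
    t.seconds = unix_sec + NTP_UNIX_OFFSET;
    /* Scale microseconds to a 32-bit binary fraction of a second. */
    t.fraction = (uint32_t)(((uint64_t)usec << 32) / 1000000u);
    return t;
}

/* Compact form: the middle 32 bits, i.e. the low 16 bits of the
 * integer part followed by the high 16 bits of the fractional part. */
uint32_t ntp_compact(ntp64 t)
{
    return (t.seconds << 16) | (t.fraction >> 16);
}
```

   For example, half a second past the Unix epoch has integer part
   2208988800 and fraction 0x80000000, so ntp_compact() yields
   0x7E808000. The discarded high 16 bits of the integer part are not
   recoverable from the compact value alone.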
An implementation is not required to run the Network Time Protocol in order to use RTP. Other time sources, or none at all, may be used (see the description of the NTP timestamp field in Section 6.4.1). However, running NTP may be useful for synchronizing streams transmitted from separate hosts.

The NTP timestamp will wrap around to zero some time in the year 2036, but for RTP purposes, only differences between pairs of NTP timestamps are used. So long as the pairs of timestamps can be assumed to be within 68 years of each other, using modulo arithmetic for subtractions and comparisons makes the wraparound irrelevant.

5 RTP Data Transfer Protocol

5.1 RTP Fixed Header Fields

The RTP header has the following format:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |V=2|P|X|  CC   |M|     PT      |       sequence number         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                           timestamp                           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |           synchronization source (SSRC) identifier            |
   +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
   |            contributing source (CSRC) identifiers             |
   |                             ....                              |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

The first twelve octets are present in every RTP packet, while the list of CSRC identifiers is present only when inserted by a mixer. The fields have the following meaning:

version (V): 2 bits
   This field identifies the version of RTP. The version defined by this specification is two (2). (The value 1 is used by the first draft version of RTP and the value 0 is used by the protocol initially implemented in the "vat" audio tool.)
padding (P): 1 bit
   If the padding bit is set, the packet contains one or more additional padding octets at the end which are not part of the payload. The last octet of the padding contains a count of how many padding octets should be ignored, including itself. Padding may be needed by some encryption algorithms with fixed block sizes or for carrying several RTP packets in a lower-layer protocol data unit.

extension (X): 1 bit
   If the extension bit is set, the fixed header MUST be followed by exactly one header extension, with a format defined in Section 5.3.1.

CSRC count (CC): 4 bits
   The CSRC count contains the number of CSRC identifiers that follow the fixed header.

marker (M): 1 bit
   The interpretation of the marker is defined by a profile. It is intended to allow significant events such as frame boundaries to be marked in the packet stream. A profile MAY define additional marker bits or specify that there is no marker bit by changing the number of bits in the payload type field (see Section 5.3).

payload type (PT): 7 bits
   This field identifies the format of the RTP payload and determines its interpretation by the application. A profile MAY specify a default static mapping of payload type codes to payload formats. Additional payload type codes MAY be defined dynamically through non-RTP means (see Section 3). A set of default mappings for audio and video is specified in the companion RFC XXXX [1]. An RTP source MAY change the payload type during a session, but this field SHOULD NOT be used for multiplexing separate media streams (see Section 5.2).

   A receiver MUST ignore packets with payload types that it does not understand.

sequence number: 16 bits
   The sequence number increments by one for each RTP data packet sent, and may be used by the receiver to detect packet loss and to restore packet sequence.
   The initial value of the sequence number SHOULD be random (unpredictable) to make known-plaintext attacks on encryption more difficult, even if the source itself does not encrypt according to the method in Section 9.1, because the packets may flow through a translator that does. Techniques for choosing unpredictable numbers are discussed in [17].

timestamp: 32 bits
   The timestamp reflects the sampling instant of the first octet in the RTP data packet. The sampling instant MUST be derived from a clock that increments monotonically and linearly in time to allow synchronization and jitter calculations (see Section 6.4.1). The resolution of the clock MUST be sufficient for the desired synchronization accuracy and for measuring packet arrival jitter (one tick per video frame is typically not sufficient). The clock frequency is dependent on the format of data carried as payload and is specified statically in the profile or payload format specification that defines the format, or MAY be specified dynamically for payload formats defined through non-RTP means. If RTP packets are generated periodically, the nominal sampling instant as determined from the sampling clock is to be used, not a reading of the system clock. As an example, for fixed-rate audio the timestamp clock would likely increment by one for each sampling period. If an audio application reads blocks covering 160 sampling periods from the input device, the timestamp would be increased by 160 for each such block, regardless of whether the block is transmitted in a packet or dropped as silent.

   The initial value of the timestamp SHOULD be random, as for the sequence number. Several consecutive RTP packets will have equal timestamps if they are (logically) generated at once, e.g., belong to the same video frame.
   Consecutive RTP packets MAY contain timestamps that are not monotonic if the data is not transmitted in the order it was sampled, as in the case of MPEG interpolated video frames. (The sequence numbers of the packets as transmitted will still be monotonic.)

SSRC: 32 bits
   The SSRC field identifies the synchronization source. This identifier SHOULD be chosen randomly, with the intent that no two synchronization sources within the same RTP session will have the same SSRC identifier. An example algorithm for generating a random identifier is presented in Appendix A.6. Although the probability of multiple sources choosing the same identifier is low, all RTP implementations must be prepared to detect and resolve collisions. Section 8 describes the probability of collision along with a mechanism for resolving collisions and detecting RTP-level forwarding loops based on the uniqueness of the SSRC identifier. If a source changes its source transport address, it must also choose a new SSRC identifier to avoid being interpreted as a looped source (see Section 8.2).

CSRC list: 0 to 15 items, 32 bits each
   The CSRC list identifies the contributing sources for the payload contained in this packet. The number of identifiers is given by the CC field. If there are more than 15 contributing sources, only 15 can be identified. CSRC identifiers are inserted by mixers (see Section 7.1), using the SSRC identifiers of contributing sources. For example, for audio packets the SSRC identifiers of all sources that were mixed together to create a packet are listed, allowing correct talker indication at the receiver.

5.2 Multiplexing RTP Sessions

For efficient protocol processing, the number of multiplexing points should be minimized, as described in the integrated layer processing design principle [10].
In RTP, multiplexing is provided by the destination transport address (network address and port number), which defines an RTP session. For example, in a teleconference composed of audio and video media encoded separately, each medium SHOULD be carried in a separate RTP session with its own destination transport address.

Separate audio and video streams SHOULD NOT be carried in a single RTP session and demultiplexed based on the payload type or SSRC fields. Interleaving packets with different RTP media types but using the same SSRC would introduce several problems:

   1. If, say, two audio streams shared the same RTP session and the same SSRC value, and one were to change encodings and thus acquire a different RTP payload type, there would be no general way of identifying which stream had changed encodings.

   2. An SSRC is defined to identify a single timing and sequence number space. Interleaving multiple payload types would require different timing spaces if the media clock rates differ and would require different sequence number spaces to tell which payload type suffered packet loss.

   3. The RTCP sender and receiver reports (see Section 6.4) can only describe one timing and sequence number space per SSRC and do not carry a payload type field.

   4. An RTP mixer would not be able to combine interleaved streams of incompatible media into one stream.

   5. Carrying multiple media in one RTP session precludes: the use of different network paths or network resource allocations if appropriate; reception of a subset of the media if desired, for example just audio if video would exceed the available bandwidth; and receiver implementations that use separate processes for the different media, whereas using separate RTP sessions permits either single- or multiple-process implementations.
Using a different SSRC for each medium but sending them in the same RTP session would avoid the first three problems but not the last two.

5.3 Profile-Specific Modifications to the RTP Header

The existing RTP data packet header is believed to be complete for the set of functions required in common across all the application classes that RTP might support. However, in keeping with the ALF design principle, the header MAY be tailored through modifications or additions defined in a profile specification while still allowing profile-independent monitoring and recording tools to function.

   o The marker bit and payload type field carry profile-specific information, but they are allocated in the fixed header since many applications are expected to need them and might otherwise have to add another 32-bit word just to hold them. The octet containing these fields MAY be redefined by a profile to suit different requirements, for example with more or fewer marker bits. If there are any marker bits, one SHOULD be located in the most significant bit of the octet since profile-independent monitors may be able to observe a correlation between packet loss patterns and the marker bit.

   o Additional information that is required for a particular payload format, such as a video encoding, SHOULD be carried in the payload section of the packet. This might be in a header that is always present at the start of the payload section, or might be indicated by a reserved value in the data pattern.

   o If a particular class of applications needs additional functionality independent of payload format, the profile under which those applications operate SHOULD define additional fixed fields to follow immediately after the SSRC field of the existing fixed header.
     Those applications will be able to quickly and directly access the additional fields while profile-independent monitors or recorders can still process the RTP packets by interpreting only the first twelve octets.

If it turns out that additional functionality is needed in common across all profiles, then a new version of RTP should be defined to make a permanent change to the fixed header.

5.3.1 RTP Header Extension

An extension mechanism is provided to allow individual implementations to experiment with new payload-format-independent functions that require additional information to be carried in the RTP data packet header. This mechanism is designed so that the header extension may be ignored by other interoperating implementations that have not been extended.

Note that this header extension is intended only for limited use. Most potential uses of this mechanism would be better done another way, using the methods described in the previous section. For example, a profile-specific extension to the fixed header is less expensive to process because it is not conditional nor in a variable location. Additional information required for a particular payload format SHOULD NOT use this header extension, but SHOULD be carried in the payload section of the packet.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |      defined by profile       |           length              |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                        header extension                       |
   |                             ....                              |

If the X bit in the RTP header is one, a variable-length header extension MUST be appended to the RTP header, following the CSRC list if present.
The header extension contains a 16-bit length field that counts the number of 32-bit words in the extension, excluding the four-octet extension header (therefore zero is a valid length). Only a single extension can be appended to the RTP data header. To allow multiple interoperating implementations to each experiment independently with different header extensions, or to allow a particular implementation to experiment with more than one type of header extension, the first 16 bits of the header extension are left open for distinguishing identifiers or parameters. The format of these 16 bits is to be defined by the profile specification under which the implementations are operating. This RTP specification does not define any header extensions itself.

6 RTP Control Protocol -- RTCP

The RTP control protocol (RTCP) is based on the periodic transmission of control packets to all participants in the session, using the same distribution mechanism as the data packets. The underlying protocol MUST provide multiplexing of the data and control packets, for example using separate port numbers with UDP. RTCP performs four functions:

   1. The primary function is to provide feedback on the quality of the data distribution. This is an integral part of RTP's role as a transport protocol and is related to the flow and congestion control functions of other transport protocols (see Section 10 on the requirement for congestion control). The feedback may be directly useful for control of adaptive encodings [18,19], but experiments with IP multicasting have shown that it is also critical to get feedback from the receivers to diagnose faults in the distribution. Sending reception feedback reports to all participants allows one who is observing problems to evaluate whether those problems are local or global.
      With a distribution mechanism like IP multicast, it is also possible for an entity such as a network service provider who is not otherwise involved in the session to receive the feedback information and act as a third-party monitor to diagnose network problems. This feedback function is performed by the RTCP sender and receiver reports, described below in Section 6.4.

   2. RTCP carries a persistent transport-level identifier for an RTP source called the canonical name or CNAME, Section 6.5.1. Since the SSRC identifier may change if a conflict is discovered or a program is restarted, receivers require the CNAME to keep track of each participant. Receivers may also require the CNAME to associate multiple data streams from a given participant in a set of related RTP sessions, for example to synchronize audio and video. Inter-media synchronization also requires the NTP and RTP timestamps included in RTCP packets by data senders.

   3. The first two functions require that all participants send RTCP packets, therefore the rate must be controlled in order for RTP to scale up to a large number of participants. By having each participant send its control packets to all the others, each can independently observe the number of participants. This number is used to calculate the rate at which the packets are sent, as explained in Section 6.2.

   4. A fourth, OPTIONAL function is to convey minimal session control information, for example participant identification to be displayed in the user interface. This is most likely to be useful in "loosely controlled" sessions where participants enter and leave without membership control or parameter negotiation. RTCP serves as a convenient channel to reach all the participants, but it is not necessarily expected to support all the control communication requirements of an application.
      A higher-level session control protocol, which is beyond the scope of this document, may be needed.

Functions 1-3 SHOULD be used in all environments, but particularly in the IP multicast environment. RTP application designers SHOULD avoid mechanisms that can only work in unicast mode and will not scale to larger numbers. Transmission of RTCP MAY be controlled separately for senders and receivers, as described in Section 6.2, for cases such as unidirectional links where feedback from receivers is not possible.

6.1 RTCP Packet Format

This specification defines several RTCP packet types to carry a variety of control information:

SR: Sender report, for transmission and reception statistics from participants that are active senders

RR: Receiver report, for reception statistics from participants that are not active senders and in combination with SR for active senders reporting on more than 31 sources

SDES: Source description items, including CNAME

BYE: Indicates end of participation

APP: Application specific functions

Each RTCP packet begins with a fixed part similar to that of RTP data packets, followed by structured elements that MAY be of variable length according to the packet type but MUST end on a 32-bit boundary. The alignment requirement and a length field in the fixed part of each packet are included to make RTCP packets "stackable". Multiple RTCP packets can be concatenated without any intervening separators to form a compound RTCP packet that is sent in a single packet of the lower layer protocol, for example UDP. There is no explicit count of individual RTCP packets in the compound packet since the lower layer protocols are expected to provide an overall length to determine the end of the compound packet.
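The "stackable" property can be illustrated by a minimal C sketch that steps through a compound packet using only the length field in the fixed part of each individual packet. The sketch assumes the fixed-part layout given with the packet definitions in Section 6.4: the version in the top two bits of the first octet, the packet type in the second octet, and a 16-bit length that counts 32-bit words minus one. The function name rtcp_walk and its parameters are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Walk a compound RTCP packet of `len` octets (the overall length is
 * supplied by the lower layer, e.g. UDP). Stores the packet type of each
 * individual packet in types[] (at most max_types entries) and returns
 * the number of individual packets found, or -1 if the data is malformed.
 */
int rtcp_walk(const uint8_t *buf, size_t len, uint8_t *types, int max_types)
{
    size_t off = 0;
    int count = 0;

    assert(buf != NULL);
    while (off < len) {
        size_t words, octets;

        if (len - off < 4)
            return -1;              /* no room for a fixed part */
        if ((buf[off] >> 6) != 2)
            return -1;              /* version field must be 2 */

        /* 16-bit length, network byte order, 32-bit words minus one
         * (assumed layout, see lead-in) */
        words = ((size_t)buf[off + 2] << 8) | buf[off + 3];
        octets = (words + 1) * 4;
        if (octets > len - off)
            return -1;              /* packet runs past the end */

        if (types != NULL && count < max_types)
            types[count] = buf[off + 1];
        count++;
        off += octets;              /* next packet follows immediately */
    }
    return count;
}
```

Because every individual packet ends on a 32-bit boundary and carries its own length, no separator or packet count is needed; the loop ends when the offset reaches the length reported by the lower layer.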
Each individual RTCP packet in the compound packet may be processed independently with no requirements upon the order or combination of packets. However, in order to perform the functions of the protocol, the following constraints are imposed:

   o Reception statistics (in SR or RR) should be sent as often as bandwidth constraints will allow to maximize the resolution of the statistics, therefore each periodically transmitted compound RTCP packet MUST include a report packet.

   o New receivers need to receive the CNAME for a source as soon as possible to identify the source and to begin associating media for purposes such as lip-sync, so each compound RTCP packet MUST also include the SDES CNAME except when the compound RTCP packet is split for partial encryption as described in Section 9.1.

   o The number of packet types that may appear first in the compound packet needs to be limited to increase the number of constant bits in the first word and the probability of successfully validating RTCP packets against misaddressed RTP data packets or other unrelated packets.

Thus, all RTCP packets MUST be sent in a compound packet of at least two individual packets, with the following format:

Encryption prefix: If and only if the compound packet is to be encrypted according to the method in Section 9.1, it MUST be prefixed by a random 32-bit quantity redrawn for every compound packet transmitted. If padding is required for the encryption, it MUST be added to the last packet of the compound packet.

SR or RR: The first RTCP packet in the compound packet MUST always be a report packet to facilitate header validation as described in Appendix A.2. This is true even if no data has been sent or received, in which case an empty RR MUST be sent, and even if the only other RTCP packet in the compound packet is a BYE.
Additional RRs: If the number of sources for which reception statistics are being reported exceeds 31, the number that will fit into one SR or RR packet, then additional RR packets SHOULD follow the initial report packet.

SDES: An SDES packet containing a CNAME item MUST be included in each compound RTCP packet, except as noted in Section 9.1. Other source description items MAY optionally be included if required by a particular application, subject to bandwidth constraints (see Section 6.3.9).

BYE or APP: Other RTCP packet types, including those yet to be defined, MAY follow in any order, except that BYE SHOULD be the last packet sent with a given SSRC/CSRC. Packet types MAY appear more than once.

An individual RTP participant SHOULD send only one compound RTCP packet per report interval in order for the RTCP bandwidth per participant to be estimated correctly (see Section 6.2), except when the compound RTCP packet is split for partial encryption as described in Section 9.1. If there are too many sources to fit all the necessary RR packets into one compound RTCP packet without exceeding the maximum transmission unit (MTU) of the network path, then only the subset that will fit into one MTU SHOULD be included in each interval. The subsets SHOULD be selected round-robin across multiple intervals so that all sources are reported.

It is RECOMMENDED that translators and mixers combine individual RTCP packets from the multiple sources they are forwarding into one compound packet whenever feasible in order to amortize the packet overhead (see Section 7). An example RTCP compound packet as might be produced by a mixer is shown in Fig. 1. If the overall length of a compound packet would exceed the MTU of the network path, it SHOULD be segmented into multiple shorter compound packets to be transmitted in separate packets of the underlying protocol.
This does not impair the RTCP bandwidth estimation because each compound packet represents at least one distinct participant. Note that each of the compound packets MUST begin with an SR or RR packet.

An implementation SHOULD ignore incoming RTCP packets with types unknown to it. Additional RTCP packet types may be registered with the Internet Assigned Numbers Authority (IANA) as described in Section 14.

   if encrypted: random 32-bit integer
   |
   |[--------- packet --------][---------- packet ----------][-packet-]
   |
   |                receiver            chunk        chunk
   V                reports           item  item   item  item
   --------------------------------------------------------------------
   R[SR #sendinfo #site1#site2][SDES #CNAME PHONE #CNAME LOC][BYE##why]
   --------------------------------------------------------------------
   |                                                                  |
   |<-----------------------  compound packet ----------------------->|
   |<--------------------------  UDP packet ------------------------->|

   #: SSRC/CSRC identifier

            Figure 1: Example of an RTCP compound packet

6.2 RTCP Transmission Interval

RTP is designed to allow an application to scale automatically over session sizes ranging from a few participants to thousands. For example, in an audio conference the data traffic is inherently self-limiting because only one or two people will speak at a time, so with multicast distribution the data rate on any given link remains relatively constant independent of the number of participants. However, the control traffic is not self-limiting. If the reception reports from each participant were sent at a constant rate, the control traffic would grow linearly with the number of participants. Therefore, the rate must be scaled down by dynamically calculating the interval between RTCP packet transmissions.
For each session, it is assumed that the data traffic is subject to an aggregate limit called the "session bandwidth" to be divided among the participants. This bandwidth might be reserved and the limit enforced by the network. If there is no reservation, there may be other constraints, depending on the environment, that establish the "reasonable" maximum for the session to use, and that would be the session bandwidth. The session bandwidth may be chosen based on some cost or a priori knowledge of the available network bandwidth for the session. It is somewhat independent of the media encoding, but the encoding choice may be limited by the session bandwidth. Often, the session bandwidth is the sum of the nominal bandwidths of the senders expected to be concurrently active. For teleconference audio, this number would typically be one sender's bandwidth. For layered encodings, each layer is a separate RTP session with its own session bandwidth parameter.

The session bandwidth parameter is expected to be supplied by a session management application when it invokes a media application, but media applications MAY set a default based on the single-sender data bandwidth for the encoding selected for the session. The application MAY also enforce bandwidth limits based on multicast scope rules or other criteria. All participants MUST use the same value for the session bandwidth so that the same RTCP interval will be calculated.

Bandwidth calculations for control and data traffic include lower-layer transport and network protocols (e.g., UDP and IP) since that is what the resource reservation system would need to know. The application can also be expected to know which of these protocols are in use.
Link level headers are not included in the calculation since the packet will be encapsulated with different link level headers as it travels.

The control traffic should be limited to a small and known fraction of the session bandwidth: small so that the primary function of the transport protocol to carry data is not impaired; known so that the control traffic can be included in the bandwidth specification given to a resource reservation protocol, and so that each participant can independently calculate its share. It is RECOMMENDED that the fraction of the session bandwidth allocated to RTCP be fixed at 5%. It is also RECOMMENDED that 1/4 of the RTCP bandwidth be dedicated to participants that are sending data so that in sessions with a large number of receivers but a small number of senders, newly joining participants will more quickly receive the CNAME for the sending sites. When the proportion of senders is greater than 1/4 of the participants, the senders get their proportion of the full RTCP bandwidth. While the values of these and other constants in the interval calculation are not critical, all participants in the session MUST use the same values so the same interval will be calculated. Therefore, these constants SHOULD be fixed for a particular profile.

A profile MAY specify that the control traffic bandwidth may be a separate parameter of the session rather than a strict percentage of the session bandwidth. Using a separate parameter allows rate-adaptive applications to set an RTCP bandwidth consistent with a "typical" data bandwidth that is lower than the maximum bandwidth specified by the session bandwidth parameter.

The profile MAY further specify that the control traffic bandwidth may be divided into two separate session parameters for those participants which are active data senders and those which are not.
Following the recommendation that 1/4 of the RTCP bandwidth be dedicated to data senders, the RECOMMENDED default values for these two parameters would be 1.25% and 3.75%, respectively. When the proportion of senders is greater than 1/4 of the participants, the senders get their proportion of the sum of these parameters. Using two parameters allows RTCP reception reports to be turned off entirely for a particular session by setting the RTCP bandwidth for non-data-senders to zero while keeping the RTCP bandwidth for data senders non-zero so that sender reports can still be sent for inter-media synchronization. This may be appropriate for systems operating on unidirectional links or for sessions that don't require feedback on the quality of reception.

The calculated interval between transmissions of compound RTCP packets SHOULD also have a lower bound to avoid having bursts of packets exceed the allowed bandwidth when the number of participants is small and the traffic isn't smoothed according to the law of large numbers. It also keeps the report interval from becoming too small during transient outages like a network partition such that adaptation is delayed when the partition heals. At application startup, a delay SHOULD be imposed before the first compound RTCP packet is sent to allow time for RTCP packets to be received from other participants so the report interval will converge to the correct value more quickly. This delay MAY be set to half the minimum interval to allow quicker notification that the new participant is present. The RECOMMENDED value for a fixed minimum interval is 5 seconds.
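The recommended division of the RTCP bandwidth between senders and non-senders can be sketched in C as follows. This is a minimal illustration that assumes the 5% and 1/4 constants recommended in this section; the function name rtcp_member_bw and its parameters are hypothetical, and the result is in the same units as the session bandwidth supplied by the caller.

```c
#include <assert.h>

#define RTCP_FRACTION   0.05  /* RECOMMENDED RTCP share of session bandwidth */
#define SENDER_FRACTION 0.25  /* RECOMMENDED share of RTCP bandwidth for senders */

/*
 * Return the RTCP bandwidth available to one participant.
 *   session_bw : session bandwidth (any consistent unit)
 *   members    : current estimate of the number of participants
 *   senders    : how many of those are active data senders
 *   we_sent    : non-zero if this participant is an active sender
 */
double rtcp_member_bw(double session_bw, int members, int senders, int we_sent)
{
    double rtcp_bw = session_bw * RTCP_FRACTION;

    assert(members > 0 && senders >= 0 && senders <= members);

    /* Senders make up less than 1/4 of the participants: they share a
     * dedicated 1/4 of the RTCP bandwidth, the rest share 3/4. */
    if (senders > 0 && senders * 4 < members) {
        if (we_sent)
            return rtcp_bw * SENDER_FRACTION / senders;
        return rtcp_bw * (1.0 - SENDER_FRACTION) / (members - senders);
    }

    /* Otherwise every participant gets an equal share of the full RTCP
     * bandwidth, so senders receive their proportion of it. */
    return rtcp_bw / members;
}
```

For example, with a session bandwidth of 64000, 100 participants and 2 senders, each sender may use about 400 and each receiver about 24.5 for RTCP. With 1.25% and 3.75% expressed as separate session parameters, the same split would apply to those two values instead of to fractions of 5%.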
An implementation MAY scale the minimum RTCP interval to a smaller value inversely proportional to the session bandwidth parameter with the following limitations:

   o For multicast sessions, only active data senders MAY use the reduced minimum value to calculate the interval for transmission of compound RTCP packets.

   o For unicast sessions, the reduced value MAY be used by participants that are not active data senders as well, and the delay before sending the initial compound RTCP packet MAY be zero.

   o For all sessions, the fixed minimum SHOULD be used when calculating the participant timeout interval (see Section 6.3.5) so that implementations which do not use the reduced value for transmitting RTCP packets are not timed out by other participants prematurely.

   o The RECOMMENDED value for the reduced minimum in seconds is 360 divided by the session bandwidth in kilobits/second. This minimum is smaller than 5 seconds for bandwidths greater than 72 kb/s.

The algorithm described in Section 6.3 and Appendix A.7 was designed to meet the goals outlined in this section. It calculates the interval between sending compound RTCP packets to divide the allowed control traffic bandwidth among the participants. This allows an application to provide fast response for small sessions where, for example, identification of all participants is important, yet automatically adapt to large sessions. The algorithm incorporates the following characteristics:

   o The calculated interval between RTCP packets scales linearly with the number of members in the group. It is this linear factor which allows for a constant amount of control traffic when summed across all members.
1152 o The interval between RTCP packets is varied randomly over the 1153 range [0.5,1.5] times the calculated interval to avoid 1154 unintended synchronization of all participants [20]. The 1155 first RTCP packet sent after joining a session is also delayed 1156 by a random variation of half the minimum RTCP interval. 1158 o A dynamic estimate of the average compound RTCP packet size is 1159 calculated, including all those received and sent, to 1160 automatically adapt to changes in the amount of control 1161 information carried. 1163 o Since the calculated interval is dependent on the number of 1164 observed group members, there may be undesirable startup 1165 effects when a new user joins an existing session, or many 1166 users simultaneously join a new session. These new users will 1167 initially have incorrect estimates of the group membership, 1168 and thus their RTCP transmission interval will be too short. 1169 This problem can be significant if many users join the session 1170 simultaneously. To deal with this, an algorithm called "timer 1171 reconsideration" is employed. This algorithm implements a 1172 simple back-off mechanism which causes users to hold back RTCP 1173 packet transmission if the group sizes are increasing. 1175 o When users leave a session, either with a BYE or by timeout, 1176 the group membership decreases, and thus the calculated 1177 interval should decrease. A "reverse reconsideration" 1178 algorithm is used to allow members to more quickly reduce 1179 their intervals in response to group membership decreases. 1181 o BYE packets are given different treatment than other RTCP 1182 packets. When a user leaves a group, and wishes to send a BYE 1183 packet, it may do so before its next scheduled RTCP packet. 1184 However, transmission of BYE's follows a back-off algorithm 1185 which avoids floods of BYE packets should a large number of 1186 members simultaneously leave the session. 
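The reduced minimum interval recommended earlier in this section (360 divided by the session bandwidth in kilobits/second) might be computed as in the sketch below. The function name and the use_reduced flag are hypothetical, and capping the result at the fixed 5-second minimum is an assumption drawn from the statement that the minimum is scaled to a smaller value.

```c
#include <assert.h>

/* Hypothetical helper: minimum RTCP interval in seconds.
 * session_bw_kbps: session bandwidth in kilobits/second (must be > 0).
 * use_reduced:     nonzero if this participant may use the reduced
 *                  minimum (e.g., an active sender in a multicast
 *                  session, or any participant in a unicast session). */
double rtcp_min_interval(double session_bw_kbps, int use_reduced)
{
    const double fixed_min = 5.0;        /* RECOMMENDED fixed minimum */
    double reduced;

    assert(session_bw_kbps > 0.0);
    if (!use_reduced)
        return fixed_min;
    reduced = 360.0 / session_bw_kbps;   /* RECOMMENDED reduced minimum */
    /* Assumption: never exceed the fixed minimum; the reduced value
     * only becomes smaller than 5 s above 72 kb/s. */
    return reduced < fixed_min ? reduced : fixed_min;
}
```

The participant timeout calculation (Section 6.3.5) would still call this with use_reduced set to zero, since the fixed minimum SHOULD be used there.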
1188 This algorithm may be used for sessions in which all participants are 1189 allowed to send. In that case, the session bandwidth parameter is the 1190 product of the individual sender's bandwidth times the number of 1191 participants, and the RTCP bandwidth is 5% of that. 1193 Details of the algorithm's operation are given in the sections that 1194 follow. Appendix A.7 gives an example implementation. 1196 6.2.1 Maintaining the number of session members 1198 Calculation of the RTCP packet interval depends upon an estimate of 1199 the number of sites participating in the session. New sites are added 1200 to the count when they are heard, and an entry for each SHOULD be 1201 created in a table indexed by the SSRC or CSRC identifier (see 1202 Section 8.2) to keep track of them. New entries MAY be considered not 1203 valid until multiple packets carrying the new SSRC have been received 1204 (see Appendix A.1), or until an SDES RTCP packet containing a CNAME 1205 for that SSRC has been received. Entries MAY be deleted from the 1206 table when an RTCP BYE packet with the corresponding SSRC identifier 1207 is received, except that some straggler data packets might arrive 1208 after the BYE and cause the entry to be recreated. Instead, the entry 1209 SHOULD be marked as having received a BYE and then deleted after an 1210 appropriate delay. 1212 A participant MAY mark another site inactive, or delete it if not yet 1213 valid, if no RTP or RTCP packet has been received for a small number 1214 of RTCP report intervals (5 is RECOMMENDED). This provides some 1215 robustness against packet loss. All sites must have the same value 1216 for this multiplier and must calculate roughly the same value for the 1217 RTCP report interval in order for this timeout to work properly. 1218 Therefore, this multiplier SHOULD be fixed for a particular profile. 
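One possible shape for a table entry and its expiry test is sketched below. The structure, its field names, and the bye_delay parameter are assumptions for illustration; the text above only calls for "an appropriate delay" after a BYE and RECOMMENDS a multiplier of 5 for the inactivity timeout.

```c
#include <assert.h>

/* Hypothetical per-source table entry; field names are illustrative. */
struct member {
    unsigned long ssrc;   /* SSRC or CSRC identifier (table key) */
    int    valid;         /* set once validated (see Appendix A.1) */
    int    bye_received;  /* entry is marked, not deleted, on BYE */
    double last_heard;    /* arrival time of last RTP/RTCP packet, s */
    double bye_time;      /* arrival time of the BYE, if any, s */
};

/* Returns nonzero if the entry may be marked inactive or deleted at
 * time 'now'.
 * interval:  RTCP report interval in seconds.
 * bye_delay: how long to keep an entry after a BYE so that straggler
 *            packets do not recreate it (value chosen by the caller). */
int member_expired(const struct member *m, double now,
                   double interval, double bye_delay)
{
    const int timeout_multiplier = 5;   /* RECOMMENDED value */

    assert(m != 0 && interval > 0.0);
    if (m->bye_received)
        return now - m->bye_time >= bye_delay;
    return now - m->last_heard > timeout_multiplier * interval;
}
```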
1220 For sessions with a very large number of participants, it may be 1221 impractical to maintain a table to store the SSRC identifier and 1222 state information for all of them. An implementation MAY use SSRC 1223 sampling, as described in [21], to reduce the storage requirements. 1224 An implementation MAY use any other algorithm with similar 1225 performance. A key requirement is that any algorithm considered 1226 SHOULD NOT substantially underestimate the group size, although it 1227 MAY overestimate. 1229 6.3 RTCP Packet Send and Receive Rules 1231 The rules for how to send, and what to do when receiving an RTCP 1232 packet are outlined here. An implementation that allows operation in 1233 a multicast environment or a multipoint unicast environment MUST meet 1234 the requirements in Section 6.2. Such an implementation MAY use the 1235 algorithm defined in this section to meet those requirements, or MAY 1236 use some other algorithm so long as it provides equivalent or better 1237 performance. An implementation which is constrained to two-party 1238 unicast operation SHOULD still use randomization of the RTCP 1239 transmission interval to avoid unintended synchronization of multiple 1240 instances operating in the same environment, but MAY omit the "timer 1241 reconsideration" and "reverse reconsideration" algorithms in Sections 1242 6.3.3, 6.3.6 and 6.3.7. 
1244   To execute these rules, a session participant must maintain several
1245   pieces of state:

1247   tp: the last time an RTCP packet was transmitted;

1249   tc: the current time;

1251   tn: the next scheduled transmission time of an RTCP packet;

1253   pmembers: the estimated number of session members at the time tn
1254      was last recomputed;

1256   members: the most current estimate for the number of session
1257      members;

1259   senders: the most current estimate for the number of senders in
1260      the session;

1262   rtcp_bw: The target RTCP bandwidth, i.e., the total bandwidth
1263      that will be used for RTCP packets by all members of this
1264      session, in octets per second. This will be a specified
1265      fraction of the "session bandwidth" parameter supplied to
1266      the application at startup.

1268   we_sent: Flag that is true if the application has sent data
1269      since the 2nd previous RTCP report was transmitted.

1271   avg_rtcp_size: The average compound RTCP packet size, in octets,
1272      over all RTCP packets sent and received by this
1273      participant. The size includes lower-layer transport and
1274      network protocol headers (e.g., UDP and IP) as explained in
1275      Section 6.2.

1277   initial: Flag that is true if the application has not yet sent
1278      an RTCP packet.

1280   Many of these rules make use of the "calculated interval" between
1281   packet transmissions. This interval is described in the following
1282   section.

1284   6.3.1 Computing the RTCP transmission interval

1286   To maintain scalability, the average interval between packets from a
1287   session participant should scale with the group size. This interval
1288   is called the calculated interval. It is obtained by combining a
1289   number of the pieces of state described above. The calculated
1290   interval T is then determined as follows:

1292      1. If there are any senders (senders > 0) in the session, but
1293         the number of senders is less than 25% of the membership
1294         (members), the interval depends on whether the participant
1295         is a sender or not (based on the value of we_sent). If the
1296         participant is a sender (we_sent true), the constant C is
1297         set to the average RTCP packet size (avg_rtcp_size) divided
1298         by 25% of the RTCP bandwidth (rtcp_bw), and the constant n
1299         is set to the number of senders. If we_sent is not true,
1300         the constant C is set to the average RTCP packet size
1301         divided by 75% of the RTCP bandwidth. The constant n is set
1302         to the number of receivers (members - senders). If the
1303         number of senders is greater than 25%, senders and
1304         receivers are treated together. The constant C is set to
1305         the average RTCP packet size divided by the total RTCP
1306         bandwidth and n is set to the total number of members.

1308      2. If the participant has not yet sent an RTCP packet (the
1309         variable initial is true), the constant Tmin is set to 2.5
1310         seconds, else it is set to 5 seconds.

1312      3. The deterministic calculated interval Td is set to
1313         max(Tmin, n*C).

1315      4. The calculated interval T is set to a number uniformly
1316         distributed between 0.5 and 1.5 times the deterministic
1317         calculated interval.

1319      5. The resulting value of T is divided by e - 3/2 = 1.21828 to
1320         compensate for the fact that the timer reconsideration
1321         algorithm converges to a value of the RTCP bandwidth below
1322         the intended average.

1324   This procedure results in an interval which is random, but which, on
1325   average, gives at least 25% of the RTCP bandwidth to senders and the
1326   rest to receivers. If the senders constitute more than one quarter of
1327   the membership, this procedure splits the bandwidth equally among all
1328   participants, on average.
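The five steps above can be sketched in C as follows. This is an illustration under stated assumptions, not the example implementation of Appendix A.7: the function names are hypothetical, and the random number is passed in by the caller so that the result is reproducible. When senders are exactly 25% of the membership, both branches of step 1 yield the same n*C, so the choice of comparison at that boundary does not change the result.

```c
#include <assert.h>

/* Deterministic calculated interval Td in seconds (steps 1 to 3).
 * rtcp_bw is in octets/second, avg_rtcp_size in octets. */
double rtcp_td(int members, int senders, double rtcp_bw,
               int we_sent, double avg_rtcp_size, int initial)
{
    double bw = rtcp_bw;                 /* bandwidth share used for C */
    double tmin = initial ? 2.5 : 5.0;   /* step 2 */
    double td;
    int n = members;                     /* default: all members together */

    assert(members > 0 && senders >= 0 && rtcp_bw > 0.0);
    /* Step 1: 25%/75% split while senders are under 25% of members. */
    if (senders > 0 && senders < members * 0.25) {
        if (we_sent) {
            bw = rtcp_bw * 0.25;
            n = senders;
        } else {
            bw = rtcp_bw * 0.75;
            n = members - senders;
        }
    }
    td = n * (avg_rtcp_size / bw);       /* n*C */
    return td > tmin ? td : tmin;        /* step 3: max(Tmin, n*C) */
}

/* Steps 4 and 5. u is a caller-supplied uniform random number in
 * [0,1); the result is T in seconds. */
double rtcp_t(double td, double u)
{
    const double compensation = 2.71828 - 1.5;   /* e - 3/2 = 1.21828 */

    assert(u >= 0.0 && u < 1.0);
    return td * (u + 0.5) / compensation;
}
```

For example, with 100 members, 10 senders, rtcp_bw of 1000 octets/second and an average packet size of 100 octets, a receiver obtains n*C = 90 * (100/750) = 12 seconds, while a sender obtains 10 * (100/250) = 4 seconds, which is raised to the 5-second minimum.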
1330 6.3.2 Initialization 1332 Upon joining the session, the participant initializes tp to 0, tc to 1333 0, senders to 0, pmembers to 1, members to 1, we_sent to false, 1334 rtcp_bw to the specified fraction of the session bandwidth, initial 1335 to true, and avg_rtcp_size to the probable size of the first RTCP 1336 packet that the application will later construct. The calculated 1337 interval T is then computed, and the first packet is scheduled for 1338 time tn = T. This means that a transmission timer is set which 1339 expires at time T. Note that an application MAY use any desired 1340 approach for implementing this timer. 1342 The participant adds its own SSRC to the member table. 1344 6.3.3 Receiving an RTP or non-BYE RTCP packet 1346 When an RTP or RTCP packet is received from a participant whose SSRC 1347 is not in the member table, the SSRC is added to the table, and the 1348 value for members is updated once the participant has been validated 1349 as described in Section 6.2.1. The same processing occurs for each 1350 CSRC in a validated RTP packet. 1352 When an RTP packet is received from a participant whose SSRC is not 1353 in the sender table, the SSRC is added to the table, and the value 1354 for senders is updated. 1356 For each compound RTCP packet received, the value of avg_rtcp_size is 1357 updated: avg_rtcp_size = (1/16)*packet_size + (15/16)* avg_rtcp_size, 1358 where packet_size is the size of the RTCP packet just received. 1360 6.3.4 Receiving an RTCP BYE packet 1362 Except as described in Section 6.3.7 for the case when an RTCP BYE is 1363 to be transmitted, if the received packet is an RTCP BYE packet, the 1364 SSRC is checked against the member table. If present, the entry is 1365 removed from the table, and the value for members is updated. The 1366 SSRC is then checked against the sender table. If present, the entry 1367 is removed from the table, and the value for senders is updated. 
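The running average of the compound RTCP packet size given in Section 6.3.3 is a first-order filter with gain 1/16. A minimal sketch, with a hypothetical function name, is:

```c
#include <assert.h>

/* Hypothetical helper for the update in Section 6.3.3:
 *   avg_rtcp_size = (1/16)*packet_size + (15/16)*avg_rtcp_size
 * Both sizes are in octets and include lower-layer headers. */
double update_avg_rtcp_size(double avg_rtcp_size, double packet_size)
{
    assert(avg_rtcp_size >= 0.0 && packet_size >= 0.0);
    return (1.0 / 16.0) * packet_size + (15.0 / 16.0) * avg_rtcp_size;
}
```

The same update applies to packets sent by the participant itself (Section 6.3.6), so one helper can serve both paths.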
1369 Furthermore, to make the transmission rate of RTCP packets more 1370 adaptive to changes in group membership, the following "reverse 1371 reconsideration" algorithm SHOULD be executed when a BYE packet is 1372 received that reduces members to a value less than pmembers: 1374 o The value for tn is updated according to the following 1375 formula: tn = tc + (members/pmembers)(tn - tc). 1377 o The value for tp is updated according the following formula: 1378 tp = tc - (members/pmembers)(tc - tp). 1380 o The next RTCP packet is rescheduled for transmission at time 1381 tn, which is now earlier. 1383 o The value of pmembers is set equal to members. 1385 This algorithm does not prevent the group size estimate from 1386 incorrectly dropping to zero for a short time due to premature 1387 timeouts when most participants of a large session leave at once but 1388 some remain. The algorithm does make the estimate return to the 1389 correct value more rapidly. This situation is unusual enough and the 1390 consequences are sufficiently harmless that this problem is deemed 1391 only a secondary concern. 1393 6.3.5 Timing Out an SSRC 1395 At occasional intervals, the participant MUST check to see if any of 1396 the other participants time out. To do this, the participant computes 1397 the deterministic (without the randomization factor) calculated 1398 interval Td for a receiver, that is, with we_sent false. Any other 1399 session member who has not sent an RTP or RTCP packet since time tc - 1400 MTd (M is the timeout multiplier, and defaults to 5) is timed out. 1401 This means that its SSRC is removed from the member list, and members 1402 is updated. A similar check is performed on the sender list. Any 1403 member on the sender list who has not sent an RTP packet since time 1404 tc - 2T (within the last two RTCP report intervals) is removed from 1405 the sender list, and senders is updated. 
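The member timeout test described above and the reverse reconsideration update of Section 6.3.4 can be sketched as follows. The structure and function names are assumptions for illustration; times are in seconds.

```c
#include <assert.h>

/* Hypothetical scheduling state; names follow Section 6.3. */
struct rtcp_sched {
    double tp;        /* last transmission time */
    double tn;        /* next scheduled transmission time */
    int    members;   /* current membership estimate */
    int    pmembers;  /* estimate when tn was last recomputed */
};

/* Nonzero if a member last heard at 'last_heard' has timed out at tc.
 * td is the deterministic interval computed for a receiver; m is the
 * timeout multiplier M (default 5). */
int member_timed_out(double last_heard, double tc, double td, int m)
{
    assert(td > 0.0 && m > 0);
    return last_heard < tc - m * td;
}

/* Reverse reconsideration (Section 6.3.4), applied at time tc after
 * members has dropped below pmembers. */
void reverse_reconsider(struct rtcp_sched *s, double tc)
{
    double ratio;

    assert(s != 0 && s->pmembers > 0);
    if (s->members >= s->pmembers)
        return;                          /* membership did not shrink */
    ratio = (double)s->members / (double)s->pmembers;
    s->tn = tc + ratio * (s->tn - tc);   /* next send moves earlier */
    s->tp = tc - ratio * (tc - s->tp);
    s->pmembers = s->members;
}
```

After reverse_reconsider() returns, the caller would reschedule its transmission timer for the new value of tn.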
1407 If any members time out, the reverse reconsideration algorithm 1408 described in Section 6.3.4 SHOULD be performed. 1410 The participant MUST perform this check at least once per RTCP 1411 transmission interval. 1413 6.3.6 Expiration of transmission timer 1415 When the packet transmission timer expires, the participant performs 1416 the following operations: 1418 o The transmission interval T is computed as described in 1419 Section 6.3.1, including the randomization factor. 1421 o If tp + T is less than or equal to tc, an RTCP packet is 1422 transmitted. tp is set to tc, then another value for T is 1423 calculated as in the previous step and tn is set to tc + T. 1424 The transmission timer is set to expire again at time tn. If 1425 tp + T is greater than tc, tn is set to tp + T. No RTCP packet 1426 is transmitted. The transmission timer is set to expire at 1427 time tn. 1429 o pmembers is set to members. 1431 If an RTCP packet is transmitted, the value of initial is set to 1432 FALSE. Furthermore, the value of avg_rtcp_size is updated: 1434 avg_rtcp_size = (1/16)*packet_size + (15/16)* avg_rtcp_size, where 1435 packet_size is the size of the RTCP packet just transmitted. 1437 6.3.7 Transmitting a BYE packet 1439 When a participant wishes to leave a session, a BYE packet is 1440 transmitted to inform the other participants of the event. In order 1441 to avoid a flood of BYE packets when many participants leave the 1442 system, a participant MUST execute the following algorithm if the 1443 number of members is more than 50 when the participant chooses to 1444 leave. This algorithm usurps the normal role of the members variable 1445 to count BYE packets instead: 1447 o When the participant decides to leave the system, tp is reset 1448 to tc, the current time, members and pmembers are initialized 1449 to 1, initial is set to 1, we_sent is set to false, senders is 1450 set to 0, and avg_rtcp_size is set to the size of the compound 1451 BYE packet. 
The calculated interval T is computed. The BYE 1452 packet is then scheduled for time tn = tc + T. 1454 o Every time a BYE packet from another participant is received, 1455 members is incremented by 1 regardless of whether that 1456 participant exists in the member table or not, and when SSRC 1457 sampling is in use, regardless of whether or not the BYE SSRC 1458 would be included in the sample. members is NOT incremented 1459 when other RTCP packets or RTP packets are received, but only 1460 for BYE packets. Similarly, avg_rtcp_size is updated only for 1461 received BYE packets. senders is NOT updated when RTP packets 1462 arrive; it remains 0. 1464 o Transmission of the BYE packet then follows the rules for 1465 transmitting a regular RTCP packet, as above. 1467 This allows BYE packets to be sent right away, yet controls their 1468 total bandwidth usage. In the worst case, this could cause RTCP 1469 control packets to use twice the bandwidth as normal (10%) -- 5% for 1470 non BYE RTCP packets and 5% for BYE. 1472 A participant that does not want to wait for the above mechanism to 1473 allow transmission of a BYE packet MAY leave the group without 1474 sending a BYE at all. That participant will eventually be timed out 1475 by the other group members. 1477 If the group size estimate members is less than 50 when the 1478 participant decides to leave, the participant MAY send a BYE packet 1479 immediately. Alternatively, the participant MAY choose to execute 1480 the above BYE backoff algorithm. 1482 In either case, a participant which never sent an RTP or RTCP packet 1483 MUST NOT send a BYE packet when they leave the group. 1485 6.3.8 Updating we_sent 1487 The variable we_sent contains true if the participant has sent an RTP 1488 packet recently, false otherwise. This determination is made by using 1489 the same mechanisms as for managing the set of other participants 1490 listed in the senders table. 
If the participant sends an RTP packet 1491 when we_sent is false, it adds itself to the sender table and sets 1492 we_sent to true. The reverse reconsideration algorithm described in 1493 Section 6.3.4 SHOULD be performed to possibly reduce the delay before 1494 sending an SR packet. Every time another RTP packet is sent, the 1495 time of transmission of that packet is maintained in the table. The 1496 normal sender timeout algorithm is then applied to the participant -- 1497 if an RTP packet has not been transmitted since time tc - 2T, the 1498 participant removes itself from the sender table, decrements the 1499 sender count, and sets we_sent to false. 1501 6.3.9 Allocation of source description bandwidth 1503 This specification defines several source description (SDES) items in 1504 addition to the mandatory CNAME item, such as NAME (personal name) 1505 and EMAIL (email address). It also provides a means to define new 1506 application-specific RTCP packet types. Applications should exercise 1507 caution in allocating control bandwidth to this additional 1508 information because it will slow down the rate at which reception 1509 reports and CNAME are sent, thus impairing the performance of the 1510 protocol. It is RECOMMENDED that no more than 20% of the RTCP 1511 bandwidth allocated to a single participant be used to carry the 1512 additional information. Furthermore, it is not intended that all 1513 SDES items will be included in every application. Those that are 1514 included SHOULD be assigned a fraction of the bandwidth according to 1515 their utility. Rather than estimate these fractions dynamically, it 1516 is recommended that the percentages be translated statically into 1517 report interval counts based on the typical length of an item. 1519 For example, an application may be designed to send only CNAME, NAME 1520 and EMAIL and not any others. 
NAME might be given much higher 1521 priority than EMAIL because the NAME would be displayed continuously 1522 in the application's user interface, whereas EMAIL would be displayed 1523 only when requested. At every RTCP interval, an RR packet and an SDES 1524 packet with the CNAME item would be sent. For a small session 1525 operating at the minimum interval, that would be every 5 seconds on 1526 the average. Every third interval (15 seconds), one extra item would 1527 be included in the SDES packet. Seven out of eight times this would 1528 be the NAME item, and every eighth time (2 minutes) it would be the 1529 EMAIL item. 1531 When multiple applications operate in concert using cross-application 1532 binding through a common CNAME for each participant, for example in a 1533 multimedia conference composed of an RTP session for each medium, the 1534 additional SDES information MAY be sent in only one RTP session. The 1535 other sessions would carry only the CNAME item. In particular, this 1536 approach should be applied to the multiple sessions of a layered 1537 encoding scheme (see Section 2.4). 1539 6.4 Sender and Receiver Reports 1541 RTP receivers provide reception quality feedback using RTCP report 1542 packets which may take one of two forms depending upon whether or not 1543 the receiver is also a sender. The only difference between the sender 1544 report (SR) and receiver report (RR) forms, besides the packet type 1545 code, is that the sender report includes a 20-byte sender information 1546 section for use by active senders. The SR is issued if a site has 1547 sent any data packets during the interval since issuing the last 1548 report or the previous one, otherwise the RR is issued. 1550 Both the SR and RR forms include zero or more reception report 1551 blocks, one for each of the synchronization sources from which this 1552 receiver has received RTP data packets since the last report. 
Reports 1553 are not issued for contributing sources listed in the CSRC list. Each 1554 reception report block provides statistics about the data received 1555 from the particular source indicated in that block. Since a maximum 1556 of 31 reception report blocks will fit in an SR or RR packet, 1557 additional RR packets SHOULD be stacked after the initial SR or RR 1558 packet as needed to contain the reception reports for all sources 1559 heard during the interval since the last report. If there are too 1560 many sources to fit all the necessary RR packets into one compound 1561 RTCP packet without exceeding the MTU of the network path, then only 1562 the subset that will fit into one MTU SHOULD be included in each 1563 interval. The subsets SHOULD be selected round-robin across multiple 1564 intervals so that all sources are reported. 1566 The next sections define the formats of the two reports, how they may 1567 be extended in a profile-specific manner if an application requires 1568 additional feedback information, and how the reports may be used. 1569 Details of reception reporting by translators and mixers is given in 1570 Section 7. 
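The round-robin selection of reception report blocks across intervals might be implemented along the following lines. The function, its parameters, and the idea of indexing sources by position in a caller-maintained list are all assumptions for illustration.

```c
#include <assert.h>

/* Hypothetical helper: choose which sources to report this interval.
 * nsources: number of sources heard since the last report (> 0).
 * max_fit:  number of report blocks that fit within the path MTU.
 * cursor:   in/out index of the first source to report next time.
 * out:      receives the chosen source indexes (room for max_fit).
 * Returns the number of indexes written to out. */
int select_report_blocks(int nsources, int max_fit, int *cursor, int *out)
{
    int count = nsources < max_fit ? nsources : max_fit;
    int i;

    assert(nsources > 0 && max_fit > 0 && cursor != 0 && out != 0);
    assert(*cursor >= 0);
    for (i = 0; i < count; i++)
        out[i] = (*cursor + i) % nsources;
    *cursor = (*cursor + count) % nsources;   /* resume here next time */
    return count;
}
```

With five sources and room for two blocks per interval, successive calls select sources {0,1}, {2,3}, {4,0}, and so on, so every source is reported within three intervals.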
1572   6.4.1 SR: Sender report RTCP packet
1573    0                   1                   2                   3
1574    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
1575   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1576   |V=2|P|    RC   |   PT=SR=200   |             length            | header
1577   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1578   |                         SSRC of sender                        |
1579   +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
1580   |              NTP timestamp, most significant word             | sender
1581   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ info
1582   |             NTP timestamp, least significant word             |
1583   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1584   |                         RTP timestamp                         |
1585   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1586   |                     sender's packet count                     |
1587   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1588   |                      sender's octet count                     |
1589   +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
1590   |                 SSRC_1 (SSRC of first source)                 | report
1591   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ block
1592   | fraction lost |       cumulative number of packets lost       |   1
1593   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1594   |           extended highest sequence number received           |
1595   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1596   |                      interarrival jitter                      |
1597   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1598   |                         last SR (LSR)                         |
1599   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1600   |                   delay since last SR (DLSR)                  |
1601   +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
1602   |                SSRC_2 (SSRC of second source)                 | report
1603   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ block
1604   :                               ...                             :   2
1605   +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
1606   |                  profile-specific extensions                  |
1607   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

1609   The sender report packet consists of three sections, possibly
1610   followed by a fourth profile-specific extension section if defined.
1611   The first section, the header, is 8 octets long. The fields have the
1612   following meaning:

1614   version (V): 2 bits
1615      Identifies the version of RTP, which is the same in RTCP
1616      packets as in RTP data packets. The version defined by this
1617      specification is two (2).

1619   padding (P): 1 bit
1620      If the padding bit is set, this individual RTCP packet
1621      contains some additional padding octets at the end which
1622      are not part of the control information but are included in
1623      the length field. The last octet of the padding is a count
1624      of how many padding octets should be ignored, including
1625      itself (it will be a multiple of four). Padding may be
1626      needed by some encryption algorithms with fixed block
1627      sizes. In a compound RTCP packet, padding is only required
1628      on one individual packet because the compound packet is
1629      encrypted as a whole for the method in Section 9.1. Thus,
1630      padding MUST only be added to the last individual packet,
1631      and if padding is added to that packet, the padding bit
1632      MUST be set only on that packet. This convention aids the
1633      header validity checks described in Appendix A.2 and allows
1634      detection of packets from some early implementations that
1635      incorrectly set the padding bit on the first individual
1636      packet and add padding to the last individual packet.

1638   reception report count (RC): 5 bits
1639      The number of reception report blocks contained in this
1640      packet. A value of zero is valid.

1642   packet type (PT): 8 bits
1643      Contains the constant 200 to identify this as an RTCP SR
1644      packet.
1646 length: 16 bits 1647 The length of this RTCP packet in 32-bit words minus one, 1648 including the header and any padding. (The offset of one 1649 makes zero a valid length and avoids a possible infinite 1650 loop in scanning a compound RTCP packet, while counting 1651 32-bit words avoids a validity check for a multiple of 4.) 1653 SSRC: 32 bits 1654 The synchronization source identifier for the originator of 1655 this SR packet. 1657 The second section, the sender information, is 20 octets long and is 1658 present in every sender report packet. It summarizes the data 1659 transmissions from this sender. The fields have the following 1660 meaning: 1662 NTP timestamp: 64 bits 1663 Indicates the wallclock time (see Section 4) when this 1664 report was sent so that it may be used in combination with 1665 timestamps returned in reception reports from other 1666 receivers to measure round-trip propagation to those 1667 receivers. Receivers should expect that the measurement 1668 accuracy of the timestamp may be limited to far less than 1669 the resolution of the NTP timestamp. The measurement 1670 uncertainty of the timestamp is not indicated as it may not 1671 be known. On a system that has no notion of wallclock time 1672 but does have some system-specific clock such as "system 1673 uptime", a sender MAY use that clock as a reference to 1674 calculate relative NTP timestamps. It is important to 1675 choose a commonly used clock so that if separate 1676 implementations are used to produce the individual streams 1677 of a multimedia session, all implementations will use the 1678 same clock. Until the year 2036, relative and absolute 1679 timestamps will differ in the high bit so (invalid) 1680 comparisons will show a large difference; by then one hopes 1681 relative timestamps will no longer be needed. A sender 1682 that has no notion of wallclock or elapsed time MAY set the 1683 NTP timestamp to zero. 
1685 RTP timestamp: 32 bits 1686 Corresponds to the same time as the NTP timestamp (above), 1687 but in the same units and with the same random offset as 1688 the RTP timestamps in data packets. This correspondence may 1689 be used for intra- and inter-media synchronization for 1690 sources whose NTP timestamps are synchronized, and may be 1691 used by media-independent receivers to estimate the nominal 1692 RTP clock frequency. Note that in most cases this timestamp 1693 will not be equal to the RTP timestamp in any adjacent data 1694 packet. Rather, it MUST be calculated from the 1695 corresponding NTP timestamp using the relationship between 1696 the RTP timestamp counter and real time as maintained by 1697 periodically checking the wallclock time at a sampling 1698 instant. 1700 sender's packet count: 32 bits 1701 The total number of RTP data packets transmitted by the 1702 sender since starting transmission up until the time this 1703 SR packet was generated. The count SHOULD be reset if the 1704 sender changes its SSRC identifier. 1706 sender's octet count: 32 bits 1707 The total number of payload octets (i.e., not including 1708 header or padding) transmitted in RTP data packets by the 1709 sender since starting transmission up until the time this 1710 SR packet was generated. The count SHOULD be reset if the 1711 sender changes its SSRC identifier. This field can be used 1712 to estimate the average payload data rate. 1714 The third section contains zero or more reception report blocks 1715 depending on the number of other sources heard by this sender since 1716 the last report. Each reception report block conveys statistics on 1717 the reception of RTP packets from a single synchronization source. 1718 Receivers SHOULD NOT carry over statistics when a source changes its 1719 SSRC identifier due to a collision. 
These statistics are: 1721 SSRC_n (source identifier): 32 bits 1722 The SSRC identifier of the source to which the information 1723 in this reception report block pertains. 1725 fraction lost: 8 bits 1726 The fraction of RTP data packets from source SSRC_n lost 1727 since the previous SR or RR packet was sent, expressed as a 1728 fixed point number with the binary point at the left edge 1729 of the field. (That is equivalent to taking the integer 1730 part after multiplying the loss fraction by 256.) This 1731 fraction is defined to be the number of packets lost 1732 divided by the number of packets expected, as defined in 1733 the next paragraph. An implementation is shown in Appendix 1734 A.3. If the loss is negative due to duplicates, the 1735 fraction lost is set to zero. Note that a receiver cannot 1736 tell whether any packets were lost after the last one 1737 received, and that there will be no reception report block 1738 issued for a source if all packets from that source sent 1739 during the last reporting interval have been lost. 1741 cumulative number of packets lost: 24 bits 1742 The total number of RTP data packets from source SSRC_n 1743 that have been lost since the beginning of reception. This 1744 number is defined to be the number of packets expected less 1745 the number of packets actually received, where the number 1746 of packets received includes any which are late or 1747 duplicates. Thus packets that arrive late are not counted 1748 as lost, and the loss may be negative if there are 1749 duplicates. The number of packets expected is defined to 1750 be the extended last sequence number received, as defined 1751 next, less the initial sequence number received. This may 1752 be calculated as shown in Appendix A.3. 
1754   extended highest sequence number received: 32 bits
1755      The low 16 bits contain the highest sequence number
1756      received in an RTP data packet from source SSRC_n, and the
1757      most significant 16 bits extend that sequence number with
1758      the corresponding count of sequence number cycles, which
1759      may be maintained according to the algorithm in Appendix
1760      A.1. Note that different receivers within the same session
1761      will generate different extensions to the sequence number
1762      if their start times differ significantly.

1764   interarrival jitter: 32 bits
1765      An estimate of the statistical variance of the RTP data
1766      packet interarrival time, measured in timestamp units and
1767      expressed as an unsigned integer. The interarrival jitter J
1768      is defined to be the mean deviation (smoothed absolute
1769      value) of the difference D in packet spacing at the
1770      receiver compared to the sender for a pair of packets. As
1771      shown in the equation below, this is equivalent to the
1772      difference in the "relative transit time" for the two
1773      packets; the relative transit time is the difference
1774      between a packet's RTP timestamp and the receiver's clock
1775      at the time of arrival, measured in the same units.

1777      If Si is the RTP timestamp from packet i, and Ri is the
1778      time of arrival in RTP timestamp units for packet i, then
1779      for two packets i and j, D may be expressed as

1780         D(i,j) = (Rj - Ri) - (Sj - Si) = (Rj - Sj) - (Ri - Si)

1782      The interarrival jitter SHOULD be calculated continuously
1783      as each data packet i is received from source SSRC_n, using
1784      this difference D for that packet and the previous packet
1785      i-1 in order of arrival (not necessarily in sequence),
1786      according to the formula

1787         J(i) = J(i-1) + (|D(i-1,i)| - J(i-1))/16

1788      Whenever a reception report is issued, the current value of
1789      J is sampled.

        The jitter calculation MUST conform to the formula
        specified here in order to allow profile-independent
        monitors to make valid interpretations of reports coming
        from different implementations. This algorithm is the
        optimal first-order estimator and the gain parameter 1/16
        gives a good noise reduction ratio while maintaining a
        reasonable rate of convergence [22]. A sample
        implementation is shown in Appendix A.8. See Section 6.4.4
        for a discussion of the effects of varying packet duration
        and delay before transmission.

   last SR timestamp (LSR): 32 bits
        The middle 32 bits out of 64 in the NTP timestamp (as
        explained in Section 4) received as part of the most recent
        RTCP sender report (SR) packet from source SSRC_n. If no SR
        has been received yet, the field is set to zero.

   delay since last SR (DLSR): 32 bits
        The delay, expressed in units of 1/65536 seconds, between
        receiving the last SR packet from source SSRC_n and sending
        this reception report block. If no SR packet has been
        received yet from SSRC_n, the DLSR field is set to zero.

        Let SSRC_r denote the receiver issuing this receiver
        report. Source SSRC_n can compute the round-trip
        propagation delay to SSRC_r by recording the time A when
        this reception report block is received. It calculates the
        total round-trip time A - LSR using the last SR timestamp
        (LSR) field, and then subtracts the DLSR field to leave the
        round-trip propagation delay as (A - LSR - DLSR). This is
        illustrated in Fig. 2. Times are shown in both a
        hexadecimal representation of the 32-bit fields and the
        equivalent floating-point decimal representation. Colons
        indicate a 32-bit field divided into a 16-bit integer part
        and 16-bit fraction part.

        This may be used as an approximate measure of distance to
        cluster receivers, although some links have very asymmetric
        delays.
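        The round-trip computation reduces to two subtractions on
        32-bit values in units of 1/65536 seconds. The sketch below
        uses hypothetical function names and assumes that A has
        already been converted to the same middle-32-bit NTP form
        as LSR. The unsigned arithmetic wraps in the same way as
        the fields themselves.

```c
#include <stdint.h>

/* Middle 32 bits of a 64-bit NTP timestamp: the low 16 bits of the
 * seconds word followed by the high 16 bits of the fraction word. */
uint32_t ntp_middle32(uint32_t ntp_sec, uint32_t ntp_frac)
{
    return (ntp_sec << 16) | (ntp_frac >> 16);
}

/* Computes A - LSR - DLSR in units of 1/65536 s.
 * Returns 0 when LSR is zero (no SR received yet), 1 otherwise. */
int rtt_from_report(uint32_t a, uint32_t lsr, uint32_t dlsr, uint32_t *rtt)
{
    if (lsr == 0)
        return 0;
    *rtt = a - lsr - dlsr;
    return 1;
}

/* Converts a 16.16 fixed-point interval to seconds. */
double rtt_seconds(uint32_t rtt)
{
    return rtt / 65536.0;
}
```

        With the values of Fig. 2 (A = 0xb7108000, LSR = 0xb7052000,
        DLSR = 0x00054000) the result is 0x00062000, or 6.125 s.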

   [10 Nov 1995 11:33:25.125 UTC]       [10 Nov 1995 11:33:36.5 UTC]
   n                 SR(n)              A=b710:8000 (46864.500 s)
   ---------------------------------------------------------------->
                      v                 ^
   ntp_sec =0xb44db705 v               ^ dlsr=0x0005:4000 (    5.250s)
   ntp_frac=0x20000000  v             ^  lsr =0xb705:2000 (46853.125s)
     (3024992005.125 s)  v           ^
   r                      v         ^ RR(n)
   ---------------------------------------------------------------->
                          |<-DLSR->|
                           (5.250 s)

   A     0xb710:8000 (46864.500 s)
   DLSR -0x0005:4000 (    5.250 s)
   LSR  -0xb705:2000 (46853.125 s)
   -------------------------------
   delay 0x   6:2000 (    6.125 s)

           Figure 2: Example for round-trip time computation

6.4.2 RR: Receiver report RTCP packet

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |V=2|P|    RC   |   PT=RR=201   |             length            | header
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                     SSRC of packet sender                     |
   +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
   |                 SSRC_1 (SSRC of first source)                 | report
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ block
   | fraction lost |       cumulative number of packets lost       |   1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |           extended highest sequence number received           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                      interarrival jitter                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                         last SR (LSR)                         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                   delay since last SR (DLSR)                  |
   +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
   |                SSRC_2 (SSRC of second source)                 | report
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ block
   :                              ...                              :   2
   +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
   |                  profile-specific extensions                  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   The format of the receiver report (RR) packet is the same as that of
   the SR packet except that the packet type field contains the constant
   201 and the five words of sender information are omitted (these are
   the NTP and RTP timestamps and sender's packet and octet counts). The
   remaining fields have the same meaning as for the SR packet.

   An empty RR packet (RC = 0) MUST be put at the head of a compound
   RTCP packet when there is no data transmission or reception to
   report.

6.4.3 Extending the sender and receiver reports

   A profile SHOULD define profile-specific extensions to the sender
   report and receiver report if there is additional information that
   needs to be reported regularly about the sender or receivers. This
   method SHOULD be used in preference to defining another RTCP packet
   type because it requires less overhead:

   o fewer octets in the packet (no RTCP header or SSRC field);

   o simpler and faster parsing because applications running under
     that profile would be programmed to always expect the
     extension fields in the directly accessible location after the
     reception reports.

   The extension is a fourth section in the sender- or receiver-report
   packet which comes at the end after the reception report blocks, if
   any. If additional sender information is required, then for sender
   reports it would be included first in the extension section, but for
   receiver reports it would not be present. If information about
   receivers is to be included, that data SHOULD be structured as an
   array of blocks parallel to the existing array of reception report
   blocks; that is, the number of blocks would be indicated by the RC
   field.
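
   Because the reception report blocks form a fixed-size array whose
   length is given by RC, a receiver report can be decoded with a
   simple loop, and any profile-specific extension begins right after
   the last block. The following sketch uses hypothetical names. It
   assumes the length field counts 32-bit words minus one, as
   described for the SR packet, and that the 24-bit cumulative loss
   is a signed quantity.

```c
#include <stddef.h>
#include <stdint.h>

/* Decoded reception report block; names are illustrative. */
struct report_block {
    uint32_t ssrc;
    uint8_t  fraction_lost;
    int32_t  cumulative_lost;  /* assumption: sign-extended from 24 bits */
    uint32_t ext_highest_seq;
    uint32_t jitter;
    uint32_t lsr;
    uint32_t dlsr;
};

/* Reads a 32-bit big-endian (network order) value. */
static uint32_t get_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Decodes one RR packet. Returns the number of report blocks
 * written to out, or -1 if the packet is malformed or out is
 * too small. */
int parse_rr(const uint8_t *pkt, size_t len, uint32_t *sender_ssrc,
             struct report_block *out, int max_blocks)
{
    if (len < 8)
        return -1;
    if ((pkt[0] >> 6) != 2 || pkt[1] != 201)  /* V=2, PT=RR=201 */
        return -1;

    int rc = pkt[0] & 0x1f;
    size_t total = ((size_t)((pkt[2] << 8) | pkt[3]) + 1) * 4;
    if (total > len || total < 8 + (size_t)rc * 24 || rc > max_blocks)
        return -1;

    *sender_ssrc = get_be32(pkt + 4);
    for (int i = 0; i < rc; i++) {
        const uint8_t *p = pkt + 8 + (size_t)i * 24;
        uint32_t w = get_be32(p + 4);
        uint32_t c = w & 0xffffff;

        out[i].ssrc = get_be32(p);
        out[i].fraction_lost = (uint8_t)(w >> 24);
        out[i].cumulative_lost =
            (c & 0x800000) ? (int32_t)c - 0x1000000 : (int32_t)c;
        out[i].ext_highest_seq = get_be32(p + 8);
        out[i].jitter = get_be32(p + 12);
        out[i].lsr = get_be32(p + 16);
        out[i].dlsr = get_be32(p + 20);
    }
    return rc;
}
```

   Any octets between offset 8 + 24 * RC and the end of the packet
   would belong to a profile-specific extension, which this sketch
   leaves untouched.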

6.4.4 Analyzing sender and receiver reports

   It is expected that reception quality feedback will be useful not
   only for the sender but also for other receivers and third-party
   monitors. The sender may modify its transmissions based on the
   feedback; receivers can determine whether problems are local,
   regional or global; network managers may use profile-independent
   monitors that receive only the RTCP packets and not the corresponding
   RTP data packets to evaluate the performance of their networks for
   multicast distribution.

   Cumulative counts are used in both the sender information and
   receiver report blocks so that differences may be calculated between
   any two reports to make measurements over both short and long time
   periods, and to provide resilience against the loss of a report. The
   difference between the last two reports received can be used to
   estimate the recent quality of the distribution. The NTP timestamp is
   included so that rates may be calculated from these differences over
   the interval between two reports. Since that timestamp is independent
   of the clock rate for the data encoding, it is possible to implement
   encoding- and profile-independent quality monitors.

   An example calculation is the packet loss rate over the interval
   between two reception reports. The difference in the cumulative
   number of packets lost gives the number lost during that interval.
   The difference in the extended last sequence numbers received gives
   the number of packets expected during the interval. The ratio of
   these two is the packet loss fraction over the interval. This ratio
   should equal the fraction lost field if the two reports are
   consecutive, but otherwise it may not. The loss rate per second can
   be obtained by dividing the loss fraction by the difference in NTP
   timestamps, expressed in seconds.
   The number of packets received is
   the number of packets expected minus the number lost. The number of
   packets expected may also be used to judge the statistical validity
   of any loss estimates. For example, 1 out of 5 packets lost has a
   lower significance than 200 out of 1000.

   From the sender information, a third-party monitor can calculate the
   average payload data rate and the average packet rate over an
   interval without receiving the data. Taking the ratio of the two
   gives the average payload size. If it can be assumed that packet loss
   is independent of packet size, then the number of packets received by
   a particular receiver times the average payload size (or the
   corresponding packet size) gives the apparent throughput available to
   that receiver.

   In addition to the cumulative counts which allow long-term packet
   loss measurements using differences between reports, the fraction
   lost field provides a short-term measurement from a single report.
   This becomes more important as the size of a session scales up enough
   that reception state information might not be kept for all receivers
   or the interval between reports becomes long enough that only one
   report might have been received from a particular receiver.

   The interarrival jitter field provides a second short-term measure of
   network congestion. Packet loss tracks persistent congestion while
   the jitter measure tracks transient congestion. The jitter measure
   may indicate congestion before it leads to packet loss. The
   interarrival jitter field is only a snapshot of the jitter at the
   time of a report and is not intended to be taken quantitatively.
   Rather, it is intended for comparison across a number of reports from
   one receiver over time or from multiple receivers, e.g., within a
   single network, at the same time.
   To allow comparison across
   receivers, it is important that the jitter be calculated according to
   the same formula by all receivers.

   Because the jitter calculation is based on the RTP timestamp which
   represents the instant when the first data in the packet was sampled,
   any variation in the delay between that sampling instant and the time
   the packet is transmitted will affect the resulting jitter that is
   calculated. Such a variation in delay would occur for audio packets
   of varying duration. It will also occur for video encodings because
   the timestamp is the same for all the packets of one frame but those
   packets are not all transmitted at the same time. The variation in
   delay until transmission does reduce the accuracy of the jitter
   calculation as a measure of the behavior of the network by itself,
   but it is appropriate to include considering that the receiver buffer
   must accommodate it. When the jitter calculation is used as a
   comparative measure, the (constant) component due to variation in
   delay until transmission subtracts out so that a change in the
   network jitter component can then be observed unless it is relatively
   small. If the change is small then it is likely to be
   inconsequential.

6.5 SDES: Source description RTCP packet

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |V=2|P|    SC   |  PT=SDES=202  |             length            | header
   +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
   |                          SSRC/CSRC_1                          | chunk
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   1
   |                           SDES items                          |
   |                              ...                              |
   +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
   |                          SSRC/CSRC_2                          | chunk
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   2
   |                           SDES items                          |
   |                              ...                              |
   +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+

   The SDES packet is a three-level structure composed of a header and
   zero or more chunks, each of which is composed of items describing
   the source identified in that chunk. The items are described
   individually in subsequent sections.

   version (V), padding (P), length:
        As described for the SR packet (see Section 6.4.1).

   packet type (PT): 8 bits
        Contains the constant 202 to identify this as an RTCP SDES
        packet.

   source count (SC): 5 bits
        The number of SSRC/CSRC chunks contained in this SDES
        packet. A value of zero is valid but useless.

   Each chunk consists of an SSRC/CSRC identifier followed by a list of
   zero or more items, which carry information about the SSRC/CSRC. Each
   chunk starts on a 32-bit boundary. Each item consists of an 8-bit
   type field, an 8-bit octet count describing the length of the text
   (thus, not including this two-octet header), and the text itself.
   Note that the text can be no longer than 255 octets, but this is
   consistent with the need to limit RTCP bandwidth consumption.

   The text is encoded according to the UTF-8 encoding specified in RFC
   2279 [5]. US-ASCII is a subset of this encoding and requires no
   additional encoding. The presence of multi-octet encodings is
   indicated by setting the most significant bit of a character to a
   value of one.

   Items are contiguous, i.e., items are not individually padded to a
   32-bit boundary. Text is not null terminated because some multi-octet
   encodings include null octets. The list of items in each chunk MUST
   be terminated by one or more null octets, the first of which is
   interpreted as an item type of zero to denote the end of the list.
   No length octet follows the null item type octet, but additional null
   octets MUST be included if needed to pad until the next 32-bit
   boundary.
   Note that this padding is separate from that indicated by
   the P bit in the RTCP header. A chunk with zero items (four null
   octets) is valid but useless.

   End systems send one SDES packet containing their own source
   identifier (the same as the SSRC in the fixed RTP header). A mixer
   sends one SDES packet containing a chunk for each contributing source
   from which it is receiving SDES information, or multiple complete
   SDES packets in the format above if there are more than 31 such
   sources (see Section 7).

   The SDES items currently defined are described in the next sections.
   Only the CNAME item is mandatory. Some items shown here may be useful
   only for particular profiles, but the item types are all assigned
   from one common space to promote shared use and to simplify profile-
   independent applications. Additional items may be defined in a
   profile by registering the type numbers with IANA as described in
   Section 14.

6.5.1 CNAME: Canonical end-point identifier SDES item

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    CNAME=1    |     length    | user and domain name        ...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   The CNAME identifier has the following properties:

   o Because the randomly allocated SSRC identifier may change if a
     conflict is discovered or if a program is restarted, the CNAME
     item MUST be included to provide the binding from the SSRC
     identifier to an identifier for the source that remains
     constant.

   o Like the SSRC identifier, the CNAME identifier SHOULD also be
     unique among all participants within one RTP session.

   o To provide a binding across multiple media tools used by one
     participant in a set of related RTP sessions, the CNAME SHOULD
     be fixed for that participant.

   o To facilitate third-party monitoring, the CNAME SHOULD be
     suitable for either a program or a person to locate the
     source.

   Therefore, the CNAME SHOULD be derived algorithmically and not
   entered manually, when possible. To meet these requirements, the
   following format SHOULD be used unless a profile specifies an
   alternate syntax or semantics. The CNAME item SHOULD have the format
   "user@host", or "host" if a user name is not available as on single-
   user systems. For both formats, "host" is either the fully qualified
   domain name of the host from which the real-time data originates,
   formatted according to the rules specified in RFC 1034 [6], RFC 1035
   [7] and Section 2.1 of RFC 1123 [8]; or the standard ASCII
   representation of the host's numeric address on the interface used
   for the RTP communication. For example, the standard ASCII
   representation of an IP Version 4 address is "dotted decimal", also
   known as dotted quad. Other address types are expected to have ASCII
   representations that are mutually unique. The fully qualified domain
   name is more convenient for a human observer and may avoid the need
   to send a NAME item in addition, but it may be difficult or
   impossible to obtain reliably in some operating environments.
   Applications that may be run in such environments SHOULD use the
   ASCII representation of the address instead.

   Examples are "doe@sleepy.megacorp.com" or "doe@192.0.2.89" for a
   multi-user system. On a system with no user name, examples would be
   "sleepy.megacorp.com" or "192.0.2.89".

   The user name SHOULD be in a form that a program such as "finger" or
   "talk" could use, i.e., it typically is the login name rather than
   the personal name. The host name is not necessarily identical to the
   one in the participant's electronic mail address.
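
   The CNAME format and the SDES item encoding of Section 6.5 can be
   combined as in the following sketch, which formats "user@host" (or
   "host" alone) and writes a chunk holding a single CNAME item. The
   function names are hypothetical, and obtaining the user and host
   strings from the operating system is left to the caller.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Formats "user@host", or "host" when no user name is available.
 * Returns the text length, or -1 if it does not fit in dst or in
 * the 255 octets an SDES item can carry. */
int format_cname(char *dst, size_t dst_size, const char *user,
                 const char *host)
{
    int n = (user && user[0])
                ? snprintf(dst, dst_size, "%s@%s", user, host)
                : snprintf(dst, dst_size, "%s", host);
    if (n < 0 || (size_t)n >= dst_size || n > 255)
        return -1;
    return n;
}

/* Writes one SDES chunk (SSRC plus a single CNAME item) into buf.
 * Returns the number of octets written, or 0 on error. */
size_t write_cname_chunk(uint8_t *buf, size_t buf_size, uint32_t ssrc,
                         const char *cname)
{
    size_t text_len = strlen(cname);
    if (text_len > 255)
        return 0;

    /* SSRC, two-octet item header, text, one terminating null. */
    size_t unpadded = 4 + 2 + text_len + 1;
    size_t total = (unpadded + 3) & ~(size_t)3;  /* pad to 32 bits */
    if (total > buf_size)
        return 0;

    buf[0] = (uint8_t)(ssrc >> 24);
    buf[1] = (uint8_t)(ssrc >> 16);
    buf[2] = (uint8_t)(ssrc >> 8);
    buf[3] = (uint8_t)ssrc;
    buf[4] = 1;                    /* item type CNAME=1 */
    buf[5] = (uint8_t)text_len;    /* length of the text only */
    memcpy(buf + 6, cname, text_len);  /* text is not null terminated */
    memset(buf + 6 + text_len, 0, total - (6 + text_len));
    return total;
}
```

   For "doe@192.0.2.89" the text is 14 octets, so the chunk occupies
   24 octets: 4 for the SSRC, 2 for the item header, 14 of text and 4
   null octets, the first of which ends the item list.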

   This syntax will not provide unique identifiers for each source if an
   application permits a user to generate multiple sources from one
   host. Such an application would have to rely on the SSRC to further
   identify the source, or the profile for that application would have
   to specify additional syntax for the CNAME identifier.

   If each application creates its CNAME independently, the resulting
   CNAMEs may not be identical as would be required to provide a binding
   across multiple media tools belonging to one participant in a set of
   related RTP sessions. If cross-media binding is required, it may be
   necessary for the CNAME of each tool to be externally configured with
   the same value by a coordination tool.

   Application writers should be aware that private network address
   assignments such as the Net-10 assignment proposed in RFC 1597 [23]
   may create network addresses that are not globally unique. This would
   lead to non-unique CNAMEs if hosts with private addresses and no
   direct IP connectivity to the public Internet have their RTP packets
   forwarded to the public Internet through an RTP-level translator.
   (See also RFC 1627 [24].) To handle this case, applications MAY
   provide a means to configure a unique CNAME, but the burden is on the
   translator to translate CNAMEs from private addresses to public
   addresses if necessary to keep private addresses from being exposed.

6.5.2 NAME: User name SDES item

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     NAME=2    |     length    | common name of source       ...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   This is the real name used to describe the source, e.g., "John Doe,
   Bit Recycler, Megacorp". It may be in any form desired by the user.
   For applications such as conferencing, this form of name may be the
   most desirable for display in participant lists, and therefore might
   be sent most frequently of those items other than CNAME. Profiles MAY
   establish such priorities. The NAME value is expected to remain
   constant at least for the duration of a session. It SHOULD NOT be
   relied upon to be unique among all participants in the session.

6.5.3 EMAIL: Electronic mail address SDES item

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    EMAIL=3    |     length    | email address of source     ...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   The email address is formatted according to RFC 822 [9], for example,
   "John.Doe@megacorp.com". The EMAIL value is expected to remain
   constant for the duration of a session.

6.5.4 PHONE: Phone number SDES item

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    PHONE=4    |     length    | phone number of source      ...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   The phone number SHOULD be formatted with the plus sign replacing the
   international access code. For example, "+1 908 555 1212" for a
   number in the United States.

6.5.5 LOC: Geographic user location SDES item

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     LOC=5     |     length    | geographic location of site ...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Depending on the application, different degrees of detail are
   appropriate for this item.
   For conference applications, a string
   like "Murray Hill, New Jersey" may be sufficient, while, for an
   active badge system, strings like "Room 2A244, AT&T BL MH" might be
   appropriate. The degree of detail is left to the implementation
   and/or user, but format and content MAY be prescribed by a profile.
   The LOC value is expected to remain constant for the duration of a
   session, except for mobile hosts.

6.5.6 TOOL: Application or tool name SDES item

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     TOOL=6    |     length    | name/version of source appl. ...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   A string giving the name and possibly version of the application
   generating the stream, e.g., "videotool 1.2". This information may be
   useful for debugging purposes and is similar to the Mailer or Mail-
   System-Version SMTP headers. The TOOL value is expected to remain
   constant for the duration of the session.

6.5.7 NOTE: Notice/status SDES item

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     NOTE=7    |     length    | note about the source       ...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   The following semantics are suggested for this item, but these or
   other semantics MAY be explicitly defined by a profile. The NOTE item
   is intended for transient messages describing the current state of
   the source, e.g., "on the phone, can't talk". Or, during a seminar,
   this item might be used to convey the title of the talk.
   It should be
   used only to carry exceptional information and SHOULD NOT be included
   routinely by all participants because this would slow down the rate
   at which reception reports and CNAME are sent, thus impairing the
   performance of the protocol. In particular, it SHOULD NOT be included
   as an item in a user's configuration file nor automatically generated
   as in a quote-of-the-day.

   Since the NOTE item may be important to display while it is active,
   the rate at which other non-CNAME items such as NAME are transmitted
   might be reduced so that the NOTE item can take that part of the RTCP
   bandwidth. When the transient message becomes inactive, the NOTE item
   SHOULD continue to be transmitted a few times at the same repetition
   rate but with a string of length zero to signal the receivers.
   However, receivers SHOULD also consider the NOTE item inactive if it
   is not received for a small multiple of the repetition rate, or
   perhaps 20-30 RTCP intervals.

6.5.8 PRIV: Private extensions SDES item

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     PRIV=8    |     length    | prefix length |prefix string...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   ...             |                  value string               ...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   This item is used to define experimental or application-specific SDES
   extensions. The item contains a prefix consisting of a length-string
   pair, followed by the value string filling the remainder of the item
   and carrying the desired information. The prefix length field is 8
   bits long. The prefix string is a name chosen by the person defining
   the PRIV item to be unique with respect to other PRIV items this
   application might receive.
   The application creator might choose to
   use the application name plus an additional subtype identification if
   needed. Alternatively, it is RECOMMENDED that others choose a name
   based on the entity they represent, then coordinate the use of the
   name within that entity.

   Note that the prefix consumes some space within the item's total
   length of 255 octets, so the prefix should be kept as short as
   possible. This facility and the constrained RTCP bandwidth SHOULD NOT
   be overloaded; it is not intended to satisfy all the control
   communication requirements of all applications.

   SDES PRIV prefixes will not be registered by IANA. If some form of
   the PRIV item proves to be of general utility, it SHOULD instead be
   assigned a regular SDES item type registered with IANA so that no
   prefix is required. This simplifies use and increases transmission
   efficiency.

6.6 BYE: Goodbye RTCP packet

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |V=2|P|    SC   |   PT=BYE=203  |             length            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                           SSRC/CSRC                           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   :                              ...                              :
   +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
   |     length    |               reason for leaving            ... (opt)
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   The BYE packet indicates that one or more sources are no longer
   active.

   version (V), padding (P), length:
        As described for the SR packet (see Section 6.4.1).

   packet type (PT): 8 bits
        Contains the constant 203 to identify this as an RTCP BYE
        packet.

   source count (SC): 5 bits
        The number of SSRC/CSRC identifiers included in this BYE
        packet. A count value of zero is valid, but useless.
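
   The BYE layout shown above can be assembled as in the following
   sketch. The function name is hypothetical. It assumes the length
   field counts 32-bit words minus one, as for the SR packet, and it
   pads the optional reason text with null octets to a 32-bit
   boundary as this section requires.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Writes a BYE packet for count identifiers, with an optional reason
 * string (pass NULL for none). Returns the number of octets written,
 * or 0 if the arguments do not fit the format or the buffer. */
size_t write_bye(uint8_t *buf, size_t buf_size, const uint32_t *ssrcs,
                 unsigned count, const char *reason)
{
    size_t rlen = reason ? strlen(reason) : 0;
    if (count > 31 || rlen > 255)  /* SC is 5 bits, length is 8 bits */
        return 0;

    /* Length octet plus text, rounded up to a 32-bit boundary. */
    size_t reason_bytes = reason ? ((1 + rlen + 3) & ~(size_t)3) : 0;
    size_t total = 4 + 4 * (size_t)count + reason_bytes;
    if (total > buf_size)
        return 0;

    memset(buf, 0, total);            /* also supplies the null padding */
    buf[0] = (uint8_t)(0x80 | count); /* V=2, P=0, SC=count */
    buf[1] = 203;                     /* PT=BYE */
    size_t words_minus_one = total / 4 - 1;
    buf[2] = (uint8_t)(words_minus_one >> 8);
    buf[3] = (uint8_t)(words_minus_one & 0xff);

    for (unsigned i = 0; i < count; i++) {
        uint8_t *p = buf + 4 + 4 * (size_t)i;
        p[0] = (uint8_t)(ssrcs[i] >> 24);
        p[1] = (uint8_t)(ssrcs[i] >> 16);
        p[2] = (uint8_t)(ssrcs[i] >> 8);
        p[3] = (uint8_t)ssrcs[i];
    }

    if (reason) {
        uint8_t *p = buf + 4 + 4 * (size_t)count;
        p[0] = (uint8_t)rlen;
        memcpy(p + 1, reason, rlen);
    }
    return total;
}
```

   For one identifier and the reason "camera malfunction" (18 octets)
   the packet is 28 octets long and the length field holds 6.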

   The rules for when a BYE packet should be sent are specified in
   Sections 6.3.7 and 8.2.

   If a BYE packet is received by a mixer, the mixer SHOULD forward the
   BYE packet with the SSRC/CSRC identifier(s) unchanged. If a mixer
   shuts down, it SHOULD send a BYE packet listing all contributing
   sources it handles, as well as its own SSRC identifier. Optionally,
   the BYE packet MAY include an 8-bit octet count followed by that many
   octets of text indicating the reason for leaving, e.g., "camera
   malfunction" or "RTP loop detected". The string has the same encoding
   as that described for SDES. If the string fills the packet to the
   next 32-bit boundary, the string is not null terminated. If not, the
   BYE packet MUST be padded with null octets to the next 32-bit
   boundary. This padding is separate from that indicated by the P bit
   in the RTCP header.

6.7 APP: Application-defined RTCP packet

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |V=2|P| subtype |   PT=APP=204  |             length            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                           SSRC/CSRC                           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                          name (ASCII)                         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                   application-dependent data                ...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   The APP packet is intended for experimental use as new applications
   and new features are developed, without requiring packet type value
   registration. APP packets with unrecognized names SHOULD be ignored.
   After testing and if wider use is justified, it is RECOMMENDED that
   each APP packet be redefined without the subtype and name fields and
   registered with IANA using an RTCP packet type.

   version (V), padding (P), length:
        As described for the SR packet (see Section 6.4.1).

   subtype: 5 bits
        May be used as a subtype to allow a set of APP packets to
        be defined under one unique name, or for any application-
        dependent data.

   packet type (PT): 8 bits
        Contains the constant 204 to identify this as an RTCP APP
        packet.

   name: 4 octets
        A name chosen by the person defining the set of APP packets
        to be unique with respect to other APP packets this
        application might receive. The application creator might
        choose to use the application name, and then coordinate the
        allocation of subtype values to others who want to define
        new packet types for the application. Alternatively, it is
        RECOMMENDED that others choose a name based on the entity
        they represent, then coordinate the use of the name within
        that entity. The name is interpreted as a sequence of four
        ASCII characters, with uppercase and lowercase characters
        treated as distinct.

   application-dependent data: variable length
        Application-dependent data may or may not appear in an APP
        packet. It is interpreted by the application and not RTP
        itself. It MUST be a multiple of 32 bits long.

7 RTP Translators and Mixers

   In addition to end systems, RTP supports the notion of "translators"
   and "mixers", which could be considered as "intermediate systems" at
   the RTP level. Although this support adds some complexity to the
   protocol, the need for these functions has been clearly established
   by experiments with multicast audio and video applications in the
   Internet. Example uses of translators and mixers given in Section 2.3
   stem from the presence of firewalls and low bandwidth connections,
   both of which are likely to remain.

7.1 General Description

   An RTP translator/mixer connects two or more transport-level
   "clouds".
   Typically, each cloud is defined by a common network and
   transport protocol (e.g., IP/UDP) plus a multicast address and
   transport level destination port or a pair of unicast addresses and
   ports. (Network-level protocol translators, such as IP version 4 to
   IP version 6, may be present within a cloud invisibly to RTP.) One
   system may serve as a translator or mixer for a number of RTP
   sessions, but each is considered a logically separate entity.

   In order to avoid creating a loop when a translator or mixer is
   installed, the following rules MUST be observed:

   o Each of the clouds connected by translators and mixers
     participating in one RTP session either MUST be distinct from
     all the others in at least one of these parameters (protocol,
     address, port), or MUST be isolated at the network level from
     the others.

   o A derivative of the first rule is that there MUST NOT be
     multiple translators or mixers connected in parallel unless by
     some arrangement they partition the set of sources to be
     forwarded.

   Similarly, all RTP end systems that can communicate through one or
   more RTP translators or mixers share the same SSRC space, that is,
   the SSRC identifiers MUST be unique among all these end systems.
   Section 8.2 describes the collision resolution algorithm by which
   SSRC identifiers are kept unique and loops are detected.

   There may be many varieties of translators and mixers designed for
   different purposes and applications. Some examples are to add or
   remove encryption, change the encoding of the data or the underlying
   protocols, or replicate between a multicast address and one or more
   unicast addresses.
The distinction between translators and mixers is 2427 that a translator passes through the data streams from different 2428 sources separately, whereas a mixer combines them to form one new 2429 stream: 2431 Translator: Forwards RTP packets with their SSRC identifier 2432 intact; this makes it possible for receivers to identify 2433 individual sources even though packets from all the sources 2434 pass through the same translator and carry the translator's 2435 network source address. Some kinds of translators will pass 2436 through the data untouched, but others MAY change the 2437 encoding of the data and thus the RTP data payload type and 2438 timestamp. If multiple data packets are re-encoded into 2439 one, or vice versa, a translator MUST assign new sequence 2440 numbers to the outgoing packets. Losses in the incoming 2441 packet stream may induce corresponding gaps in the outgoing 2442 sequence numbers. Receivers cannot detect the presence of a 2443 translator unless they know by some other means what 2444 payload type or transport address was used by the original 2445 source. 2447 Mixer: Receives streams of RTP data packets from one or more 2448 sources, possibly changes the data format, combines the 2449 streams in some manner and then forwards the combined 2450 stream. Since the timing among multiple input sources will 2451 not generally be synchronized, the mixer will make timing 2452 adjustments among the streams and generate its own timing 2453 for the combined stream, so it is the synchronization 2454 source. Thus, all data packets forwarded by a mixer MUST be 2455 marked with the mixer's own SSRC identifier. In order to 2456 preserve the identity of the original sources contributing 2457 to the mixed packet, the mixer SHOULD insert their SSRC 2458 identifiers into the CSRC identifier list following the 2459 fixed RTP header of the packet. 
A mixer that is also itself 2460 a contributing source for some packet SHOULD explicitly 2461 include its own SSRC identifier in the CSRC list for that 2462 packet. 2464 For some applications, it MAY be acceptable for a mixer not 2465 to identify sources in the CSRC list. However, this 2466 introduces the danger that loops involving those sources 2467 could not be detected. 2469 The advantage of a mixer over a translator for applications like 2470 audio is that the output bandwidth is limited to that of one source 2471 even when multiple sources are active on the input side. This may be 2472 important for low-bandwidth links. The disadvantage is that receivers 2473 on the output side don't have any control over which sources are 2474 passed through or muted, unless some mechanism is implemented for 2475 remote control of the mixer. The regeneration of synchronization 2476 information by mixers also means that receivers can't do inter-media 2477 synchronization of the original streams. A multi-media mixer could do 2478 it.

2480        [E1]                                    [E6]
2481         |                                       |
2482   E1:17 |                                 E6:15 |
2483         |                                       |   E6:15
2484         V  M1:48 (1,17)         M1:48 (1,17)    V   M1:48 (1,17)
2485        (M1)-------------><T1>-----------------><T2>-------------->[E7]
2486         ^                 ^     E4:47           ^   E4:47
2487    E2:1 |           E4:47 |                     |   M3:89 (64,45)
2488         |                 |                     |
2489        [E2]              [E4]     M3:89 (64,45) |
2490                                                 |        legend:
2491  [E3] --------->(M2)----------->(M3)------------|        [End system]
2492         E3:64        M2:12 (64)  ^                       (Mixer)
2493                                  | E5:45                 <Translator>
2494                                  |
2495                                 [E5]          source: SSRC (CSRCs)
2496                                               ------------------->

2498 Figure 3: Sample RTP network with end systems, mixers and translators

2500 A collection of mixers and translators is shown in Figure 3 to 2501 illustrate their effect on SSRC and CSRC identifiers. In the figure, 2502 end systems are shown as rectangles (named E), translators as 2503 triangles (named T) and mixers as ovals (named M).
The notation 2504 "M1:48(1,17)" designates a packet originating at a mixer M1, identified with 2505 M1's (random) SSRC value of 48 and two CSRC identifiers, 1 and 17, 2506 copied from the SSRC identifiers of packets from E1 and E2. 2508 7.2 RTCP Processing in Translators 2510 In addition to forwarding data packets, perhaps modified, translators 2511 and mixers MUST also process RTCP packets. In many cases, they will 2512 take apart the compound RTCP packets received from end systems to 2513 aggregate SDES information and to modify the SR or RR packets. 2514 Retransmission of this information may be triggered by the packet 2515 arrival or by the RTCP interval timer of the translator or mixer 2516 itself. 2518 A translator that does not modify the data packets, for example one 2519 that just replicates between a multicast address and a unicast 2520 address, MAY simply forward RTCP packets unmodified as well. A 2521 translator that transforms the payload in some way MUST make 2522 corresponding transformations in the SR and RR information so that it 2523 still reflects the characteristics of the data and the reception 2524 quality. These translators MUST NOT simply forward RTCP packets. In 2525 general, a translator SHOULD NOT aggregate SR and RR packets from 2526 different sources into one packet since that would reduce the 2527 accuracy of the propagation delay measurements based on the LSR and 2528 DLSR fields. 2530 SR sender information: A translator does not generate its own 2531 sender information, but forwards the SR packets received 2532 from one cloud to the others. The SSRC is left intact but 2533 the sender information MUST be modified if required by the 2534 translation. If a translator changes the data encoding, it 2535 MUST change the "sender's byte count" field. If it also 2536 combines several data packets into one output packet, it 2537 MUST change the "sender's packet count" field.
If it 2538 changes the timestamp frequency, it MUST change the "RTP 2539 timestamp" field in the SR packet. 2541 SR/RR reception report blocks: A translator forwards reception 2542 reports received from one cloud to the others. Note that 2543 these flow in the direction opposite to the data. The SSRC 2544 is left intact. If a translator combines several data 2545 packets into one output packet, and therefore changes the 2546 sequence numbers, it MUST make the inverse manipulation for 2547 the packet loss fields and the "extended last sequence 2548 number" field. This may be complex. In the extreme case, 2549 there may be no meaningful way to translate the reception 2550 reports, so the translator MAY pass on no reception report 2551 at all or a synthetic report based on its own reception. 2552 The general rule is to do what makes sense for a particular 2553 translation. 2555 A translator does not require an SSRC identifier of its 2556 own, but MAY choose to allocate one for the purpose of 2557 sending reports about what it has received. These would be 2558 sent to all the connected clouds, each corresponding to the 2559 translation of the data stream as sent to that cloud, since 2560 reception reports are normally multicast to all 2561 participants. 2563 SDES: Translators typically forward without change the SDES 2564 information they receive from one cloud to the others, but 2565 MAY, for example, decide to filter non-CNAME SDES 2566 information if bandwidth is limited. The CNAMEs MUST be 2567 forwarded to allow SSRC identifier collision detection to 2568 work. A translator that generates its own RR packets MUST 2569 send SDES CNAME information about itself to the same clouds 2570 that it sends those RR packets. 2572 BYE: Translators forward BYE packets unchanged. 
A translator 2573 that is about to cease forwarding packets SHOULD send a BYE 2574 packet to each connected cloud containing all the SSRC 2575 identifiers that were previously being forwarded to that 2576 cloud, including the translator's own SSRC identifier if it 2577 sent reports of its own. 2579 APP: Translators forward APP packets unchanged. 2581 7.3 RTCP Processing in Mixers 2583 Since a mixer generates a new data stream of its own, it does not 2584 pass through SR or RR packets at all and instead generates new 2585 information for both sides. 2587 SR sender information: A mixer does not pass through sender 2588 information from the sources it mixes because the 2589 characteristics of the source streams are lost in the mix. 2590 As a synchronization source, the mixer SHOULD generate its 2591 own SR packets with sender information about the mixed data 2592 stream and send them in the same direction as the mixed 2593 stream. 2595 SR/RR reception report blocks: A mixer generates its own 2596 reception reports for sources in each cloud and sends them 2597 out only to the same cloud. It MUST NOT send these 2598 reception reports to the other clouds and MUST NOT forward 2599 reception reports from one cloud to the others because the 2600 sources would not be SSRCs there (only CSRCs). 2602 SDES: Mixers typically forward without change the SDES 2603 information they receive from one cloud to the others, but 2604 MAY, for example, decide to filter non-CNAME SDES 2605 information if bandwidth is limited. The CNAMEs MUST be 2606 forwarded to allow SSRC identifier collision detection to 2607 work. (An identifier in a CSRC list generated by a mixer 2608 might collide with an SSRC identifier generated by an end 2609 system.) A mixer MUST send SDES CNAME information about 2610 itself to the same clouds that it sends SR or RR packets. 2612 Since mixers do not forward SR or RR packets, they will 2613 typically be extracting SDES packets from a compound RTCP 2614 packet. 
To minimize overhead, chunks from the SDES packets 2615 MAY be aggregated into a single SDES packet which is then 2616 stacked on an SR or RR packet originating from the mixer. 2617 A mixer which aggregates SDES packets will use more RTCP 2618 bandwidth than an individual source because the compound 2619 packets will be longer, but that is appropriate since the 2620 mixer represents multiple sources. Similarly, a mixer 2621 which passes through SDES packets as they are received will 2622 be transmitting RTCP packets at higher than the single 2623 source rate, but again that is correct since the packets 2624 come from multiple sources. The RTCP packet rate may be 2625 different on each side of the mixer. 2627 A mixer that does not insert CSRC identifiers MAY also 2628 refrain from forwarding SDES CNAMEs. In this case, the SSRC 2629 identifier spaces in the two clouds are independent. As 2630 mentioned earlier, this mode of operation creates a danger 2631 that loops can't be detected. 2633 BYE: Mixers MUST forward BYE packets. A mixer that is about to 2634 cease forwarding packets SHOULD send a BYE packet to each 2635 connected cloud containing all the SSRC identifiers that 2636 were previously being forwarded to that cloud, including 2637 the mixer's own SSRC identifier if it sent reports of its 2638 own. 2640 APP: The treatment of APP packets by mixers is application- 2641 specific. 2643 7.4 Cascaded Mixers 2644 An RTP session may involve a collection of mixers and translators as 2645 shown in Figure 3. If two mixers are cascaded, such as M2 and M3 in 2646 the figure, packets received by a mixer may already have been mixed 2647 and may include a CSRC list with multiple identifiers. The second 2648 mixer SHOULD build the CSRC list for the outgoing packet using the 2649 CSRC identifiers from already-mixed input packets and the SSRC 2650 identifiers from unmixed input packets. This is shown in the output 2651 arc from mixer M3 labeled M3:89(64,45) in the figure. 
As in the case 2652 of mixers that are not cascaded, if the resulting CSRC list has more 2653 than 15 identifiers, the remainder cannot be included. 2655 8 SSRC Identifier Allocation and Use 2657 The SSRC identifier carried in the RTP header and in various fields 2658 of RTCP packets is a random 32-bit number that is required to be 2659 globally unique within an RTP session. It is crucial that the number 2660 be chosen with care in order that participants on the same network or 2661 starting at the same time are not likely to choose the same number. 2663 It is not sufficient to use the local network address (such as an 2664 IPv4 address) for the identifier because the address may not be 2665 unique. Since RTP translators and mixers enable interoperation among 2666 multiple networks with different address spaces, the allocation 2667 patterns for addresses within two spaces might result in a much 2668 higher rate of collision than would occur with random allocation. 2670 Multiple sources running on one host would also conflict. 2672 It is also not sufficient to obtain an SSRC identifier simply by 2673 calling random() without carefully initializing the state. An example 2674 of how to generate a random identifier is presented in Appendix A.6. 2676 8.1 Probability of Collision 2678 Since the identifiers are chosen randomly, it is possible that two or 2679 more sources will choose the same number. Collision occurs with the 2680 highest probability when all sources are started simultaneously, for 2681 example when triggered automatically by some session management 2682 event. If N is the number of sources and L the length of the 2683 identifier (here, 32 bits), the probability that two sources 2684 independently pick the same value can be approximated for large N 2685 [25] as 1 - exp(-N**2 / 2**(L+1)). For N=1000, the probability is 2686 roughly 10**-4. 2688 The typical collision probability is much lower than the worst-case 2689 above. 
When one new source joins an RTP session in which all the 2690 other sources already have unique identifiers, the probability of 2691 collision is just the fraction of numbers used out of the space. 2693 Again, if N is the number of sources and L the length of the 2694 identifier, the probability of collision is N / 2**L. For N=1000, the 2695 probability is roughly 2*10**-7. 2697 The probability of collision is further reduced by the opportunity 2698 for a new source to receive packets from other participants before 2699 sending its first packet (either data or control). If the new source 2700 keeps track of the other participants (by SSRC identifier), then 2701 before transmitting its first packet the new source can verify that 2702 its identifier does not conflict with any that have been received, or 2703 else choose again. 2705 8.2 Collision Resolution and Loop Detection 2707 Although the probability of SSRC identifier collision is low, all RTP 2708 implementations MUST be prepared to detect collisions and take the 2709 appropriate actions to resolve them. If a source discovers at any 2710 time that another source is using the same SSRC identifier as its 2711 own, it MUST send an RTCP BYE packet for the old identifier and 2712 choose another random one. (As explained below, this step is taken 2713 only once in case of a loop.) If a receiver discovers that two other 2714 sources are colliding, it MAY keep the packets from one and discard 2715 the packets from the other when this can be detected by different 2716 source transport addresses or CNAMEs. The two sources are expected 2717 to resolve the collision so that the situation doesn't last. 2719 Because the random SSRC identifiers are kept globally unique for each 2720 RTP session, they can also be used to detect loops that may be 2721 introduced by mixers or translators. 
A loop causes duplication of 2722 data and control information, either unmodified or possibly mixed, as 2723 in the following examples: 2725 o A translator may incorrectly forward a packet to the same 2726 multicast group from which it has received the packet, either 2727 directly or through a chain of translators. In that case, the 2728 same packet appears several times, originating from different 2729 network sources. 2731 o Two translators incorrectly set up in parallel, i.e., with the 2732 same multicast groups on both sides, would both forward 2733 packets from one multicast group to the other. Unidirectional 2734 translators would produce two copies; bidirectional 2735 translators would form a loop. 2737 o A mixer can close a loop by sending to the same transport 2738 destination upon which it receives packets, either directly or 2739 through another mixer or translator. In this case a source 2740 might show up both as an SSRC on a data packet and a CSRC in a 2741 mixed data packet. 2743 A source may discover that its own packets are being looped, or that 2744 packets from another source are being looped (a third-party loop). 2746 Both loops and collisions in the random selection of a source 2747 identifier result in packets arriving with the same SSRC identifier 2748 but a different source transport address, which may be that of the 2749 end system originating the packet or an intermediate system. 2750 Therefore, if a source changes its source transport address, it MAY 2751 also choose a new SSRC identifier to avoid being interpreted as a 2752 looped source. (This is not MUST because in some applications of RTP 2753 sources may be expected to change addresses during a session.) 
Note 2754 that if a translator restarts and consequently changes the source 2755 transport address (e.g., changes the UDP source port number) on which 2756 it forwards packets, then all those packets will appear to receivers 2757 to be looped because the SSRC identifiers are applied by the original 2758 source and will not change. This problem can be avoided by keeping 2759 the source transport address fixed across restarts, but in any case 2760 will be resolved after a timeout at the receivers. 2762 Loops or collisions occurring on the far side of a translator or 2763 mixer cannot be detected using the source transport address if all 2764 copies of the packets go through the translator or mixer; however, 2765 collisions may still be detected when chunks from two RTCP SDES 2766 packets contain the same SSRC identifier but different CNAMEs. 2768 To detect and resolve these conflicts, an RTP implementation MUST 2769 include an algorithm similar to the one described below, though the 2770 implementation MAY choose a different policy for which packets from 2771 colliding third-party sources are kept. The algorithm described below 2772 ignores packets from a new source or loop that collide with an 2773 established source. It resolves collisions with the participant's own 2774 SSRC identifier by sending an RTCP BYE for the old identifier and 2775 choosing a new one. However, when the collision was induced by a loop 2776 of the participant's own packets, the algorithm will choose a new 2777 identifier only once and thereafter ignore packets from the looping 2778 source transport address. This is required to avoid a flood of BYE 2779 packets. 2781 This algorithm requires keeping a table indexed by the source 2782 identifier and containing the source transport addresses from the 2783 first RTP packet and first RTCP packet received with that identifier, 2784 along with other state for that source.
Two source transport 2785 addresses are required since, for example, the UDP source port 2786 numbers may be different on RTP and RTCP packets. However, it may be 2787 assumed that the network address is the same in both source transport 2788 addresses. 2790 Each SSRC or CSRC identifier received in an RTP or RTCP packet is 2791 looked up in the source identifier table in order to process that 2792 data or control information. The source transport address from the 2793 packet is compared to the corresponding source transport address in 2794 the table to detect a loop or collision if they don't match. For 2795 control packets, each element with its own SSRC id, for example an 2796 SDES chunk, requires a separate lookup. (The SSRC id in a reception 2797 report block is an exception because it identifies a source heard by 2798 the reporter, and that SSRC id is unrelated to the source transport 2799 address of the RTCP packet sent by the reporter.) If the SSRC or CSRC 2800 is not found, a new entry is created. These table entries are removed 2801 when an RTCP BYE packet is received with the corresponding SSRC id 2802 and validated by a matching source transport address, or after no 2803 packets have arrived for a relatively long time (see Section 6.2.1). 2805 Note that if two sources on the same host are transmitting with the 2806 same source identifier at the time a receiver begins operation, it 2807 would be possible that the first RTP packet received came from one of 2808 the sources while the first RTCP packet received came from the other. 2809 This would cause the wrong RTCP information to be associated with the 2810 RTP data, but this situation should be sufficiently rare and harmless 2811 that it may be disregarded. 2813 In order to track loops of the participant's own data packets, the 2814 implementation MUST also keep a separate list of source transport 2815 addresses (not identifiers) that have been found to be conflicting. 
2816 As in the source identifier table, two source transport addresses 2817 MUST be kept to separately track conflicting RTP and RTCP packets. 2818 Note that the conflicting address list should be short, usually 2819 empty. Each element in this list stores the source addresses plus 2820 the time when the most recent conflicting packet was received. An 2821 element MAY be removed from the list when no conflicting packet has 2822 arrived from that source for a time on the order of 10 RTCP report 2823 intervals (see Section 6.2). 2825 For the algorithm as shown, it is assumed that the participant's own 2826 source identifier and state are included in the source identifier 2827 table. The algorithm could be restructured to first make a separate 2828 comparison against the participant's own source identifier.

      if (SSRC or CSRC identifier is not found in the source
          identifier table) {
          create a new entry storing the data or control source
              transport address, the SSRC or CSRC id and other state;
      }

      /* Identifier is found in the table */
      else if (table entry was created on receipt of a control packet
               and this is the first data packet or vice versa) {
          store the source transport address from this packet;
      }
      else if (source transport address from the packet does not match
               the one saved in the table entry for this identifier) {

          /* An identifier collision or a loop is indicated */

          if (source identifier is not the participant's own) {
              /* OPTIONAL error counter step */
              if (source identifier is from an RTCP SDES chunk
                  containing a CNAME item that differs from the CNAME
                  in the table entry) {
                  count a third-party collision;
              } else {
                  count a third-party loop;
              }
              abort processing of data packet or control element;
              /* MAY choose a different policy to keep new source */
          }

          /* A collision or loop of the participant's own packets */

          else if (source transport address is found in the list of
                   conflicting data or control source transport
                   addresses) {
              /* OPTIONAL error counter step */
              if (source identifier is not from an RTCP SDES chunk
                  containing a CNAME item or CNAME is the
                  participant's own) {
                  count occurrence of own traffic looped;
              }
              mark current time in conflicting address list entry;
              abort processing of data packet or control element;
          }

          /* New collision, change SSRC identifier */

          else {
              log occurrence of a collision;
              create a new entry in the conflicting data or control
                  source transport address list and mark current time;
              send an RTCP BYE packet with the old SSRC identifier;
              choose a new SSRC identifier;
              create a new entry in the source identifier table with
                  the old SSRC plus the source transport address from
                  the data or control packet being processed;
          }
      }

2889 In this algorithm, packets from a newly conflicting source address 2890 will be ignored and packets from the original source address will be 2891 kept. If no packets arrive from the original source for an extended 2892 period, the table entry will be timed out and the new source will be 2893 able to take over. This might occur if the original source detects 2894 the collision and moves to a new source identifier, but in the usual 2895 case an RTCP BYE packet will be received from the original source to 2896 delete the state without having to wait for a timeout. 2898 If the original source address was through a mixer (i.e., learned as 2899 a CSRC) and later the same source is received directly, the receiver 2900 may be well advised to switch to the new source address unless other 2901 sources in the mix would be lost.
Furthermore, for applications such 2902 as telephony in which some sources such as mobile entities may change 2903 addresses during the course of an RTP session, the RTP implementation 2904 SHOULD modify the collision detection algorithm to accept packets 2905 from the new source transport address. To guard against flip-flopping 2906 between addresses if a genuine collision does occur, the algorithm 2907 SHOULD include some means to detect this case and avoid switching. 2909 When a new SSRC identifier is chosen due to a collision, the 2910 candidate identifier SHOULD first be looked up in the source 2911 identifier table to see if it was already in use by some other 2912 source. If so, another candidate MUST be generated and the process 2913 repeated. 2915 A loop of data packets to a multicast destination can cause severe 2916 network flooding. All mixers and translators MUST implement a loop 2917 detection algorithm like the one here so that they can break loops. 2918 This should limit the excess traffic to no more than one duplicate 2919 copy of the original traffic, which may allow the session to continue 2920 so that the cause of the loop can be found and fixed. However, in 2921 extreme cases where a mixer or translator does not properly break the 2922 loop and high traffic levels result, it may be necessary for end 2923 systems to cease transmitting data or control packets entirely. This 2924 decision may depend upon the application. An error condition SHOULD 2925 be indicated as appropriate. Transmission MAY be attempted again 2926 periodically after a long, random time (on the order of minutes). 2928 8.3 Use with Layered Encodings 2930 For layered encodings transmitted on separate RTP sessions (see 2931 Section 2.4), a single SSRC identifier space SHOULD be used across 2932 the sessions of all layers and the core (base) layer SHOULD be used 2933 for SSRC identifier allocation and collision resolution. 
When a 2934 source discovers that it has collided, it transmits an RTCP BYE 2935 packet on only the base layer but changes the SSRC identifier to the 2936 new value in all layers. 2938 9 Security 2940 Lower layer protocols may eventually provide all the security 2941 services that may be desired for applications of RTP, including 2942 authentication, integrity, and confidentiality. These services have 2943 been specified for IP in [26]. Since the initial audio and video 2944 applications using RTP needed a confidentiality service before such 2945 services were available for the IP layer, the confidentiality service 2946 described in the next section was defined for use with RTP and RTCP. 2947 That description is included here to codify existing practice. New 2948 applications of RTP MAY implement this RTP-specific confidentiality 2949 service for backward compatibility, and/or they MAY implement IP 2950 layer security services. The overhead on the RTP protocol for this 2951 confidentiality service is low, so the penalty will be minimal if 2952 this service is obsoleted by lower layer services in the future. 2954 Alternatively, other services, other implementations of services and 2955 other algorithms may be defined for RTP in the future if warranted. 2956 The selection presented here is meant to simplify implementation of 2957 interoperable, secure applications and provide guidance to 2958 implementors. No claim is made that the methods presented here are 2959 appropriate for a particular security need. A profile may specify 2960 which services and algorithms should be offered by applications, and 2961 may provide guidance as to their appropriate use. 2963 Key distribution and certificates are outside the scope of this 2964 document. 2966 9.1 Confidentiality 2968 Confidentiality means that only the intended receiver(s) can decode 2969 the received packets; for others, the packet contains no useful 2970 information. 
Confidentiality of the content is achieved by 2971 encryption. 2973 When encryption of RTP or RTCP is desired, all the octets that will 2974 be encapsulated for transmission in a single lower-layer packet are 2975 encrypted as a unit. For RTCP, a 32-bit random number MUST be 2976 prepended to the unit before encryption to deter known plaintext 2977 attacks. For RTP, no prefix is required because the sequence number 2978 and timestamp fields are initialized with random offsets. 2980 For RTCP, an implementation MAY segregate the individual RTCP packets 2981 in a compound RTCP packet into two separate compound RTCP packets, 2982 one to be encrypted and one to be sent in the clear. For example, 2983 SDES information might be encrypted while reception reports were sent 2984 in the clear to accommodate third-party monitors that are not privy 2985 to the encryption key. In this example, depicted in Fig. 4, the SDES 2986 information MUST be appended to an RR packet with no reports (and the 2987 random number) to satisfy the requirement that all compound RTCP 2988 packets begin with an SR or RR packet. The SDES CNAME item is 2989 required in either the encrypted or unencrypted packet, but not both. 2990 The same SDES information SHOULD NOT be carried in both packets as 2991 this may compromise the encryption.

2993              UDP packet                     UDP packet
2994    -----------------------------  ------------------------------
2995    [random][RR][SDES #CNAME ...]  [SR #senderinfo #site1 #site2]
2996    -----------------------------  ------------------------------
2997              encrypted                     not encrypted

2999    #: SSRC identifier

3001 Figure 4: Encrypted and non-encrypted RTCP packets

3003 The presence of encryption and the use of the correct key are 3004 confirmed by the receiver through header or payload validity checks. 3005 Examples of such validity checks for RTP and RTCP headers are given 3006 in Appendices A.1 and A.2.
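As an informal illustration of the kind of header validity check a receiver might apply after decryption, the following sketch tests a few properties of the fixed RTP header: the version field, the minimum length, the CSRC count and, when the P bit is set, the padding count. It is not the code of Appendix A.1. The function name rtp_header_plausible is hypothetical, the checks shown are a deliberately small subset, and the header extension (X bit) and payload type are not examined.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (name and scope are assumptions for this sketch).
 * Returns 1 if the buffer looks like a plausible RTP version 2 packet,
 * 0 otherwise. A wrong decryption key is expected to fail such checks
 * with high probability, though not with certainty. */
int rtp_header_plausible(const uint8_t *buf, size_t len)
{
    size_t cc, hdr;

    if (buf == NULL || len < 12)      /* fixed header is 12 octets */
        return 0;
    if ((buf[0] >> 6) != 2)           /* version (V) must be 2 */
        return 0;

    cc = buf[0] & 0x0F;               /* CSRC count (CC) */
    hdr = 12 + 4 * cc;                /* fixed header plus CSRC list */
    if (len < hdr)
        return 0;

    if (buf[0] & 0x20) {              /* padding (P) bit is set */
        uint8_t pad = buf[len - 1];   /* last octet holds the pad count */
        if (pad == 0 || pad > len - hdr)
            return 0;
    }
    return 1;
}
```

A receiver would typically combine a check of this kind with knowledge of the payload types in use for the session before concluding that the key is correct.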
3008 To be consistent with existing practice, the default encryption 3009 algorithm is the Data Encryption Standard (DES) algorithm in cipher 3010 block chaining (CBC) mode, as described in Section 1.1 of RFC 1423 3011 [27], except that padding to a multiple of 8 octets is indicated as 3012 described for the P bit in Section 5.1. The initialization vector is 3013 zero because random values are supplied in the RTP header or by the 3014 random prefix for compound RTCP packets. For details on the use of 3015 CBC initialization vectors, see [28]. Implementations that support 3016 encryption SHOULD always support the DES algorithm in CBC mode as the 3017 default to maximize interoperability. This method was chosen because 3018 it has been demonstrated to be easy and practical to use in 3019 experimental audio and video tools in operation on the Internet. 3020 Other encryption algorithms MAY be specified dynamically for a 3021 session by non-RTP means. It is RECOMMENDED that stronger encryption 3022 algorithms such as Triple-DES be used in place of the default 3023 algorithm. 3025 As an alternative to encryption at the IP level or at the RTP level 3026 as described above, profiles MAY define additional payload types for 3027 encrypted encodings. Those encodings MUST specify how padding and 3028 other aspects of the encryption are to be handled. This method allows 3029 encrypting only the data while leaving the headers in the clear for 3030 applications where that is desired. It may be particularly useful for 3031 hardware devices that will handle both decryption and decoding. It 3032 is also valuable for applications where link-level compression of RTP 3033 and lower-layer headers is desired and confidentiality of the payload 3034 (but not addresses) is sufficient since encryption of the headers 3035 precludes compression. 
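The padding rule referred to above (pad the unit to a multiple of 8 octets and signal it with the P bit) can be sketched as follows. This is a minimal illustration under stated assumptions, not part of the specification: the function name rtp_pad_to_block and its parameters are hypothetical, the input is assumed to be an RTP packet that does not already carry padding, and the encryption step itself is left out. The last padding octet holds the count of padding octets, including itself, as described for the P bit.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: pads an RTP packet in place so that its length
 * is a multiple of `block` octets (8 for DES in CBC mode) and sets the
 * P bit. `cap` is the capacity of the buffer holding the packet.
 * Returns the new length, or 0 on error. */
size_t rtp_pad_to_block(uint8_t *pkt, size_t len, size_t cap, size_t block)
{
    size_t rem, pad;

    if (pkt == NULL || len < 12 || len > cap || block == 0 || block > 255)
        return 0;

    rem = len % block;
    if (rem == 0)
        return len;                   /* already aligned, nothing to add */
    if (pkt[0] & 0x20)
        return 0;                     /* assumption: no existing padding */

    pad = block - rem;
    if (pad > cap - len)
        return 0;                     /* not enough room in the buffer */

    memset(pkt + len, 0, pad - 1);    /* filler octets */
    pkt[len + pad - 1] = (uint8_t)pad; /* count includes this octet */
    pkt[0] |= 0x20;                   /* set the P bit */
    return len + pad;
}
```

For example, a 13-octet packet in a 16-octet buffer would be extended by three octets, the last of which carries the value 3.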
3037 9.2 Authentication and Message Integrity 3039 Authentication and message integrity services are not defined at the 3040 RTP level since these services would not be directly feasible without 3041 a key management infrastructure. It is expected that authentication 3042 and integrity services will be provided by lower layer protocols. 3044 10 Congestion Control 3046 All transport protocols used on the Internet need to address 3047 congestion control in some way [29]. RTP is not an exception, but 3048 because the data transported over RTP is often inelastic (generated 3049 at a fixed or controlled rate), the means to control congestion in 3050 RTP may be quite different from those for other transport protocols 3051 such as TCP. In one sense, inelasticity reduces the risk of 3052 congestion because the RTP stream will not expand to consume all 3053 available bandwidth as a TCP stream can. However, inelasticity also 3054 means that the RTP stream cannot arbitrarily reduce its load on the 3055 network to eliminate congestion when it occurs. 3057 Since RTP may be used for a wide variety of applications in many 3058 different contexts, there is no single congestion control mechanism 3059 that will work for all. Therefore, congestion control SHOULD be 3060 defined in each RTP profile as appropriate. For some profiles, it may 3061 be sufficient to include an applicability statement restricting the 3062 use of that profile to environments where congestion is avoided by 3063 engineering. For other profiles, specific methods such as data rate 3064 adaptation based on RTCP feedback may be required. 3066 11 RTP over Network and Transport Protocols 3068 This section describes issues specific to carrying RTP packets within 3069 particular network and transport protocols. The following rules apply 3070 unless superseded by protocol-specific definitions outside this 3071 specification. 
3073 RTP relies on the underlying protocol(s) to provide demultiplexing of 3074 RTP data and RTCP control streams. For UDP and similar protocols, RTP 3075 SHOULD use an even destination port number and the corresponding RTCP 3076 stream SHOULD use the next higher (odd) destination port number. For 3077 applications that take a single port number as a parameter and derive 3078 the RTP and RTCP port pair from that number, if an odd number is 3079 supplied then the application SHOULD replace that number with the 3080 next lower (even) number to use as the base of the port pair. For 3081 applications in which the RTP and RTCP destination port numbers are 3082 specified via explicit, separate parameters (using a signaling 3083 protocol or other means), the application MAY disregard the 3084 restrictions that the port numbers be even/odd and consecutive 3085 although the use of an even/odd port pair is still encouraged. The 3086 RTP and RTCP port numbers MUST NOT be the same since RTP relies on 3087 the port numbers to demultiplex the RTP data and RTCP control 3088 streams. 3090 In a unicast session, both participants need to identify a port pair 3091 for receiving RTP and RTCP packets. Both participants MAY use the 3092 same port pair. A participant MUST NOT assume that the source port of 3093 the incoming RTP or RTCP packet can be used as the destination port 3094 for outgoing RTP or RTCP packets. When RTP data packets are being 3095 sent in both directions, each participant's RTCP SR packets MUST be 3096 sent to the port that the other participant has specified for 3097 reception of RTCP. The RTCP SR packets combine sender information for 3098 the outgoing data plus reception report information for the incoming 3099 data. If a side is not actively sending data (see Section 6.4), an 3100 RTCP RR packet is sent instead. 3102 It is RECOMMENDED that layered encoding applications (see Section 3103 2.4) use a set of contiguous port numbers. 
The port numbers MUST be 3104 distinct because of a widespread deficiency in existing operating 3105 systems that prevents use of the same port with multiple multicast 3106 addresses, and for unicast, there is only one permissible address. 3107 Thus for layer n, the data port is P + 2n, and the control port is P 3108 + 2n + 1. When IP multicast is used, the addresses MUST also be 3109 distinct because multicast routing and group membership are managed 3110 on an address granularity. However, allocation of contiguous IP 3111 multicast addresses cannot be assumed because some groups may require 3112 different scopes and may therefore be allocated from different 3113 address ranges. 3115 The previous paragraph conflicts with the SDP specification, RFC 2327 3116 [15], which says that it is illegal for both multiple addresses and 3117 multiple ports to be specified in the same session description 3118 because the association of addresses with ports could be ambiguous. 3119 It is intended that this restriction will be relaxed in a revision of 3120 RFC 2327 to allow an equal number of addresses and ports to be 3121 specified with a one-to-one mapping implied. 3123 RTP data packets contain no length field or other delineation, 3124 therefore RTP relies on the underlying protocol(s) to provide a 3125 length indication. The maximum length of RTP packets is limited only 3126 by the underlying protocols. 3128 If RTP packets are to be carried in an underlying protocol that 3129 provides the abstraction of a continuous octet stream rather than 3130 messages (packets), an encapsulation of the RTP packets MUST be 3131 defined to provide a framing mechanism. Framing is also needed if the 3132 underlying protocol may contain padding so that the extent of the RTP 3133 payload cannot be determined. The framing mechanism is not defined 3134 here. 
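The port rules given earlier in this section (an even RTP port, the next higher odd port for RTCP, and ports P + 2n and P + 2n + 1 for layer n) can be sketched as below. The type rtp_port_pair and both function names are hypothetical, chosen only for this illustration; the sketch assumes the single-parameter case and does not check for overflow past port 65535.

```c
#include <stdint.h>

/* Hypothetical container for one RTP/RTCP destination port pair. */
typedef struct {
    uint16_t rtp;   /* even port for RTP data       */
    uint16_t rtcp;  /* next higher odd port for RTCP */
} rtp_port_pair;

/*
 * Derive the port pair from a single port parameter.  An odd number
 * is replaced by the next lower even number, which becomes the base.
 */
rtp_port_pair rtp_ports_from_base(uint16_t port)
{
    rtp_port_pair p;

    p.rtp  = (uint16_t)(port & ~1u);
    p.rtcp = (uint16_t)(p.rtp + 1);
    return p;
}

/*
 * Port pair for layer n of a layered encoding: data port P + 2n and
 * control port P + 2n + 1, where P is the (even) base port.
 */
rtp_port_pair rtp_ports_for_layer(uint16_t base, unsigned n)
{
    rtp_port_pair p = rtp_ports_from_base(base);

    p.rtp  = (uint16_t)(p.rtp + 2 * n);
    p.rtcp = (uint16_t)(p.rtp + 1);
    return p;
}
```

With a supplied port of 5005, the pair becomes 5004/5005; layer 2 on base 5004 uses 5008/5009. Applications that receive the two ports as separate, explicit parameters would not need this derivation.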
3136 A profile MAY specify a framing method to be used even when RTP is 3137 carried in protocols that do provide framing in order to allow 3138 carrying several RTP packets in one lower-layer protocol data unit, 3139 such as a UDP packet. Carrying several RTP packets in one network or 3140 transport packet reduces header overhead and may simplify 3141 synchronization between different streams. 3143 12 Summary of Protocol Constants 3145 This section contains a summary listing of the constants defined in 3146 this specification. 3148 The RTP payload type (PT) constants are defined in profiles rather 3149 than this document. However, the octet of the RTP header which 3150 contains the marker bit(s) and payload type MUST avoid the reserved 3151 values 200 and 201 (decimal) to distinguish RTP packets from the RTCP 3152 SR and RR packet types for the header validation procedure described 3153 in Appendix A.1. For the standard definition of one marker bit and a 3154 7-bit payload type field as shown in this specification, this 3155 restriction means that payload types 72 and 73 are reserved. 3157 12.1 RTCP packet types 3159 abbrev. name value 3160 SR sender report 200 3161 RR receiver report 201 3162 SDES source description 202 3163 BYE goodbye 203 3164 APP application-defined 204 3165 These type values were chosen in the range 200-204 for improved 3166 header validity checking of RTCP packets compared to RTP packets or 3167 other unrelated packets. When the RTCP packet type field is compared 3168 to the corresponding octet of the RTP header, this range corresponds 3169 to the marker bit being 1 (which it usually is not in data packets) 3170 and to the high bit of the standard payload type field being 1 (since 3171 the static payload types are typically defined in the low half). This 3172 range was also chosen to be some distance numerically from 0 and 255 3173 since all-zeros and all-ones are common data patterns. 
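The relationship between the reserved payload types 72 and 73 and the RTCP packet types 200 and 201 follows from simple bit arithmetic, sketched below for the standard layout of one marker bit and a 7-bit payload type. Both function names are hypothetical and exist only for this illustration.

```c
#include <stdint.h>

/*
 * Second octet of the RTP header in the standard layout: the marker
 * bit in the most significant bit, followed by the 7-bit payload type.
 */
uint8_t rtp_second_octet(unsigned marker, unsigned pt)
{
    return (uint8_t)(((marker & 1u) << 7) | (pt & 0x7fu));
}

/*
 * Nonzero when an octet equals the RTCP SR (200) or RR (201) packet
 * type.  Because the two codes form an even/odd pair, one mask and one
 * comparison cover both values.
 */
int octet_is_rtcp_sr_or_rr(uint8_t octet)
{
    return (octet & 0xfe) == 200;
}
```

With the marker bit set, payload type 72 yields 128 + 72 = 200 and payload type 73 yields 201, which is why those two payload types are reserved. With the marker bit clear, the same payload types give 72 and 73 and do not collide.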
3175 Since all compound RTCP packets MUST begin with SR or RR, these codes 3176 were chosen as an even/odd pair to allow the RTCP validity check to 3177 test the maximum number of bits with mask and value. 3179 Additional RTCP packet types may be registered through IANA (see 3180 Section 14). 3182 12.2 SDES types 3184 abbrev. name value 3185 END end of SDES list 0 3186 CNAME canonical name 1 3187 NAME user name 2 3188 EMAIL user's electronic mail address 3 3189 PHONE user's phone number 4 3190 LOC geographic user location 5 3191 TOOL name of application or tool 6 3192 NOTE notice about the source 7 3193 PRIV private extensions 8 3195 Additional SDES types may be registered through IANA (see Section 3196 14). 3198 13 RTP Profiles and Payload Format Specifications 3200 A complete specification of RTP for a particular application will 3201 require one or more companion documents of two types described here: 3202 profiles, and payload format specifications. 3204 RTP may be used for a variety of applications with somewhat differing 3205 requirements. The flexibility to adapt to those requirements is 3206 provided by allowing multiple choices in the main protocol 3207 specification, then selecting the appropriate choices or defining 3208 extensions for a particular environment and class of applications in 3209 a separate profile document. Typically an application will operate 3210 under only one profile in a particular RTP session, so there is no 3211 explicit indication within the RTP protocol itself as to which 3212 profile is in use. A profile for audio and video applications may be 3213 found in the companion RFC XXXX. Profiles are typically titled "RTP 3214 Profile for ...". 3216 The second type of companion document is a payload format 3217 specification, which defines how a particular kind of payload data, 3218 such as H.261 encoded video, should be carried in RTP. These 3219 documents are typically titled "RTP Payload Format for XYZ 3220 Audio/Video Encoding". 
Payload formats may be useful under multiple 3221 profiles and may therefore be defined independently of any particular 3222 profile. The profile documents are then responsible for assigning a 3223 default mapping of that format to a payload type value if needed. 3225 Within this specification, the following items have been identified 3226 for possible definition within a profile, but this list is not meant 3227 to be exhaustive: 3229 RTP data header: The octet in the RTP data header that contains 3230 the marker bit and payload type field MAY be redefined by a 3231 profile to suit different requirements, for example with 3232 more or fewer marker bits (Section 5.3, p. 14). 3234 Payload types: Assuming that a payload type field is included, 3235 the profile will usually define a set of payload formats 3236 (e.g., media encodings) and a default static mapping of 3237 those formats to payload type values. Some of the payload 3238 formats may be defined by reference to separate payload 3239 format specifications. For each payload type defined, the 3240 profile MUST specify the RTP timestamp clock rate to be 3241 used (Section 5.1, p. 12). 3243 RTP data header additions: Additional fields MAY be appended to 3244 the fixed RTP data header if some additional functionality 3245 is required across the profile's class of applications 3246 independent of payload type (Section 5.3, p. 14). 3248 RTP data header extensions: The contents of the first 16 bits of 3249 the RTP data header extension structure MUST be defined if 3250 use of that mechanism is to be allowed under the profile 3251 for implementation-specific extensions (Section 5.3.1, p. 3252 14). 3254 RTCP packet types: New application-class-specific RTCP packet 3255 types MAY be defined and registered with IANA. 3257 RTCP report interval: A profile SHOULD specify that the values 3258 suggested in Section 6.2 for the constants employed in the 3259 calculation of the RTCP report interval will be used. 
Those 3260 are the RTCP fraction of session bandwidth, the minimum 3261 report interval, and the bandwidth split between senders 3262 and receivers. A profile MAY specify alternate values if 3263 they have been demonstrated to work in a scalable manner. 3265 SR/RR extension: An extension section MAY be defined for the 3266 RTCP SR and RR packets if there is additional information 3267 that should be reported regularly about the sender or 3268 receivers (Section 6.4.3, p. 32). 3270 SDES use: The profile MAY specify the relative priorities for 3271 RTCP SDES items to be transmitted or excluded entirely 3272 (Section 6.3.9); an alternate syntax or semantics for the 3273 CNAME item (Section 6.5.1); the format of the LOC item 3274 (Section 6.5.5); the semantics and use of the NOTE item 3275 (Section 6.5.7); or new SDES item types to be registered 3276 with IANA. 3278 Security: A profile MAY specify which security services and 3279 algorithms should be offered by applications, and MAY 3280 provide guidance as to their appropriate use (Section 9, p. 3281 50). 3283 String-to-key mapping: A profile MAY specify how a user-provided 3284 password or pass phrase is mapped into an encryption key. 3286 Congestion: A profile SHOULD specify the congestion control 3287 behavior appropriate for that profile. 3289 Underlying protocol: Use of a particular underlying network or 3290 transport layer protocol to carry RTP packets MAY be 3291 required. 3293 Transport mapping: A mapping of RTP and RTCP to transport-level 3294 addresses, e.g., UDP ports, other than the standard mapping 3295 defined in Section 11, p. 52 may be specified. 3297 Encapsulation: An encapsulation of RTP packets may be defined to 3298 allow multiple RTP data packets to be carried in one 3299 lower-layer packet or to provide framing over underlying 3300 protocols that do not already do so (Section 11, p. 52). 3302 It is not expected that a new profile will be required for every 3303 application. 
Within one application class, it would be better to 3304 extend an existing profile rather than make a new one in order to 3305 facilitate interoperation among the applications since each will 3306 typically run under only one profile. Simple extensions such as the 3307 definition of additional payload type values or RTCP packet types may 3308 be accomplished by registering them through IANA and publishing their 3309 descriptions in an addendum to the profile or in a payload format 3310 specification. 3312 14 IANA Considerations 3314 Additional RTCP packet types and SDES item types may be registered 3315 through the Internet Assigned Numbers Authority (IANA). Since these 3316 number spaces are small, allowing unconstrained registration of new 3317 values would not be prudent. To facilitate review of requests and to 3318 promote shared use of new types among multiple applications, requests 3319 for registration of new values must be documented in an RFC or other 3320 permanent and readily available reference such as the product of 3321 another cooperative standards body (e.g., ITU-T). Other requests may 3322 also be accepted, under the advice of a "designated expert." (Contact 3323 the IANA for the contact information of the current expert.) 3325 RTP profile specifications SHOULD register with IANA a name for the 3326 profile in the form "RTP/xxx", where xxx is a short abbreviation of 3327 the profile title. These names are for use by higher-level control 3328 protocols, such as the Session Description Protocol (SDP), RFC 2327 3329 [15], to refer to transport methods. 3331 A Algorithms 3333 We provide examples of C code for aspects of RTP sender and receiver 3334 algorithms. There may be other implementation methods that are faster 3335 in particular operating environments or have other advantages. These 3336 implementation notes are for informational purposes only and are 3337 meant to clarify the RTP specification. 
3339 The following definitions are used for all examples; for clarity and 3340 brevity, the structure definitions are only valid for 32-bit big- 3341 endian (most significant octet first) architectures. Bit fields are 3342 assumed to be packed tightly in big-endian bit order, with no 3343 additional padding. Modifications would be required to construct a 3344 portable implementation. 3346 /* 3347 * rtp.h -- RTP header file 3348 */ 3349 #include 3351 /* 3352 * The type definitions below are valid for 32-bit architectures and 3353 * may have to be adjusted for 16- or 64-bit architectures. 3354 */ 3355 typedef unsigned char u_int8; 3356 typedef unsigned short u_int16; 3357 typedef unsigned int u_int32; 3358 typedef short int16; 3360 /* 3361 * Current protocol version. 3362 */ 3363 #define RTP_VERSION 2 3365 #define RTP_SEQ_MOD (1<<16) 3366 #define RTP_MAX_SDES 255 /* maximum text length for SDES */ 3368 typedef enum { 3369 RTCP_SR = 200, 3370 RTCP_RR = 201, 3371 RTCP_SDES = 202, 3372 RTCP_BYE = 203, 3373 RTCP_APP = 204 3374 } rtcp_type_t; 3376 typedef enum { 3377 RTCP_SDES_END = 0, 3378 RTCP_SDES_CNAME = 1, 3379 RTCP_SDES_NAME = 2, 3380 RTCP_SDES_EMAIL = 3, 3381 RTCP_SDES_PHONE = 4, 3382 RTCP_SDES_LOC = 5, 3383 RTCP_SDES_TOOL = 6, 3384 RTCP_SDES_NOTE = 7, 3385 RTCP_SDES_PRIV = 8 3386 } rtcp_sdes_type_t; 3388 /* 3389 * RTP data header 3390 */ 3391 typedef struct { 3392 unsigned int version:2; /* protocol version */ 3393 unsigned int p:1; /* padding flag */ 3394 unsigned int x:1; /* header extension flag */ 3395 unsigned int cc:4; /* CSRC count */ 3396 unsigned int m:1; /* marker bit */ 3397 unsigned int pt:7; /* payload type */ 3398 unsigned int seq:16; /* sequence number */ 3399 u_int32 ts; /* timestamp */ 3400 u_int32 ssrc; /* synchronization source */ 3401 u_int32 csrc[1]; /* optional CSRC list */ 3402 } rtp_hdr_t; 3404 /* 3405 * RTCP common header word 3406 */ 3407 typedef struct { 3408 unsigned int version:2; /* protocol version */ 3409 unsigned int p:1; /* 
padding flag */ 3410 unsigned int count:5; /* varies by packet type */ 3411 unsigned int pt:8; /* RTCP packet type */ 3412 u_int16 length; /* pkt len in words, w/o this word */ 3413 } rtcp_common_t; 3415 /* 3416 * Big-endian mask for version, padding bit and packet type pair 3417 */ 3418 #define RTCP_VALID_MASK (0xc000 | 0x2000 | 0xfe) 3419 #define RTCP_VALID_VALUE ((RTP_VERSION << 14) | RTCP_SR) 3421 /* 3422 * Reception report block 3423 */ 3424 typedef struct { 3425 u_int32 ssrc; /* data source being reported */ 3426 unsigned int fraction:8; /* fraction lost since last SR/RR */ 3427 int lost:24; /* cumul. no. pkts lost (signed!) */ 3428 u_int32 last_seq; /* extended last seq. no. received */ 3429 u_int32 jitter; /* interarrival jitter */ 3430 u_int32 lsr; /* last SR packet from this source */ 3431 u_int32 dlsr; /* delay since last SR packet */ 3432 } rtcp_rr_t; 3434 /* 3435 * SDES item 3436 */ 3437 typedef struct { 3438 u_int8 type; /* type of item (rtcp_sdes_type_t) */ 3439 u_int8 length; /* length of item (in octets) */ 3440 char data[1]; /* text, not null-terminated */ 3442 } rtcp_sdes_item_t; 3444 /* 3445 * One RTCP packet 3446 */ 3447 typedef struct { 3448 rtcp_common_t common; /* common header */ 3449 union { 3450 /* sender report (SR) */ 3451 struct { 3452 u_int32 ssrc; /* sender generating this report */ 3453 u_int32 ntp_sec; /* NTP timestamp */ 3454 u_int32 ntp_frac; 3455 u_int32 rtp_ts; /* RTP timestamp */ 3456 u_int32 psent; /* packets sent */ 3457 u_int32 osent; /* octets sent */ 3458 rtcp_rr_t rr[1]; /* variable-length list */ 3459 } sr; 3461 /* reception report (RR) */ 3462 struct { 3463 u_int32 ssrc; /* receiver generating this report */ 3464 rtcp_rr_t rr[1]; /* variable-length list */ 3465 } rr; 3467 /* source description (SDES) */ 3468 struct rtcp_sdes { 3469 u_int32 src; /* first SSRC/CSRC */ 3470 rtcp_sdes_item_t item[1]; /* list of SDES items */ 3471 } sdes; 3473 /* BYE */ 3474 struct { 3475 u_int32 src[1]; /* list of sources */ 3476 /* can't 
express trailing text for reason */ 3477 } bye; 3478 } r; 3479 } rtcp_t; 3481 typedef struct rtcp_sdes rtcp_sdes_t; 3482 /* 3483 * Per-source state information 3484 */ 3485 typedef struct { 3486 u_int16 max_seq; /* highest seq. number seen */ 3487 u_int32 cycles; /* shifted count of seq. number cycles */ 3488 u_int32 base_seq; /* base seq number */ 3489 u_int32 bad_seq; /* last 'bad' seq number + 1 */ 3490 u_int32 probation; /* sequ. packets till source is valid */ 3491 u_int32 received; /* packets received */ 3492 u_int32 expected_prior; /* packet expected at last interval */ 3493 u_int32 received_prior; /* packet received at last interval */ 3494 u_int32 transit; /* relative trans time for prev pkt */ 3495 u_int32 jitter; /* estimated jitter */ 3496 /* ... */ 3497 } source; 3499 A.1 RTP Data Header Validity Checks 3501 An RTP receiver SHOULD check the validity of the RTP header on 3502 incoming packets since they might be encrypted or might be from a 3503 different application that happens to be misaddressed. Similarly, if 3504 encryption according to the method described in Section 9 is enabled, 3505 the header validity check is needed to verify that incoming packets 3506 have been correctly decrypted, although a failure of the header 3507 validity check (e.g., unknown payload type) may not necessarily 3508 indicate decryption failure. 3510 Only weak validity checks are possible on an RTP data packet from a 3511 source that has not been heard before: 3513 o RTP version field must equal 2. 3515 o The payload type must be known, and in particular it must not 3516 be equal to SR or RR. 3518 o If the P bit is set, then the last octet of the packet must 3519 contain a valid octet count, in particular, less than the 3520 total packet length minus the header size. 3522 o The X bit must be zero if the profile does not specify that 3523 the header extension mechanism may be used. 
Otherwise, the 3524 extension length field must be less than the total packet size 3525 minus the fixed header length and padding. 3527 o The length of the packet must be consistent with CC and 3528 payload type (if payloads have a known length). 3530 The last three checks are somewhat complex and not always possible, 3531 leaving only the first two which total just a few bits. If the SSRC 3532 identifier in the packet is one that has been received before, then 3533 the packet is probably valid and checking if the sequence number is 3534 in the expected range provides further validation. If the SSRC 3535 identifier has not been seen before, then data packets carrying that 3536 identifier may be considered invalid until a small number of them 3537 arrive with consecutive sequence numbers. Those invalid packets MAY 3538 be discarded or they MAY be stored and delivered once validation has 3539 been achieved if the resulting delay is acceptable. 3541 The routine update_seq shown below ensures that a source is declared 3542 valid only after MIN_SEQUENTIAL packets have been received in 3543 sequence. It also validates the sequence number seq of a newly 3544 received packet and updates the sequence state for the packet's 3545 source in the structure to which s points. 3547 When a new source is heard for the first time, that is, its SSRC 3548 identifier is not in the table (see Section 8.2), and the per-source 3549 state is allocated for it, s->probation should be set to the number 3550 of sequential packets required before declaring a source valid 3551 (parameter MIN_SEQUENTIAL) and s->max_seq initialized to seq-1. 3552 s->probation marks the source as not yet valid so the state may be 3553 discarded after a short timeout rather than a long one, as discussed 3554 in Section 6.2.1. 3556 After a source is considered valid, the sequence number is considered 3557 valid if it is no more than MAX_DROPOUT ahead of s->max_seq nor more 3558 than MAX_MISORDER behind.
If the new sequence number is ahead of 3559 max_seq modulo the RTP sequence number range (16 bits), but is 3560 smaller than max_seq , it has wrapped around and the (shifted) count 3561 of sequence number cycles is incremented. A value of one is returned 3562 to indicate a valid sequence number. 3564 Otherwise, the value zero is returned to indicate that the validation 3565 failed, and the bad sequence number plus 1 is stored. If the next 3566 packet received carries the next higher sequence number, it is 3567 considered the valid start of a new packet sequence presumably caused 3568 by an extended dropout or a source restart. Since multiple complete 3569 sequence number cycles may have been missed, the packet loss 3570 statistics are reset. 3572 Typical values for the parameters are shown, based on a maximum 3573 misordering time of 2 seconds at 50 packets/second and a maximum 3574 dropout of 1 minute. The dropout parameter MAX_DROPOUT SHOULD be a 3575 small fraction of the 16-bit sequence number space to give a 3576 reasonable probability that new sequence numbers after a restart will 3577 not fall in the acceptable range for sequence numbers from before the 3578 restart. 3580 void init_seq(source *s, u_int16 seq) 3581 { 3582 s->base_seq = seq; 3583 s->max_seq = seq; 3584 s->bad_seq = RTP_SEQ_MOD + 1; /* so seq == bad_seq is false */ 3585 s->cycles = 0; 3586 s->received = 0; 3587 s->received_prior = 0; 3588 s->expected_prior = 0; 3589 /* other initialization */ 3590 } 3592 int update_seq(source *s, u_int16 seq) 3593 { 3594 u_int16 udelta = seq - s->max_seq; 3595 const int MAX_DROPOUT = 3000; 3596 const int MAX_MISORDER = 100; 3597 const int MIN_SEQUENTIAL = 2; 3599 /* 3600 * Source is not valid until MIN_SEQUENTIAL packets with 3601 * sequential sequence numbers have been received. 
3602 */ 3603 if (s->probation) { 3604 /* packet is in sequence */ 3605 if (seq == s->max_seq + 1) { 3606 s->probation--; 3607 s->max_seq = seq; 3608 if (s->probation == 0) { 3609 init_seq(s, seq); 3610 s->received++; 3611 return 1; 3612 } 3613 } else { 3614 s->probation = MIN_SEQUENTIAL - 1; 3615 s->max_seq = seq; 3616 } 3617 return 0; 3618 } else if (udelta < MAX_DROPOUT) { 3619 /* in order, with permissible gap */ 3620 if (seq < s->max_seq) { 3621 /* 3622 * Sequence number wrapped - count another 64K cycle. 3623 */ 3624 s->cycles += RTP_SEQ_MOD; 3625 } 3626 s->max_seq = seq; 3628 } else if (udelta <= RTP_SEQ_MOD - MAX_MISORDER) { 3629 /* the sequence number made a very large jump */ 3630 if (seq == s->bad_seq) { 3631 /* 3632 * Two sequential packets -- assume that the other side 3633 * restarted without telling us so just re-sync 3634 * (i.e., pretend this was the first packet). 3635 */ 3636 init_seq(s, seq); 3637 } 3638 else { 3639 s->bad_seq = (seq + 1) & (RTP_SEQ_MOD-1); 3640 return 0; 3641 } 3642 } else { 3643 /* duplicate or reordered packet */ 3644 } 3645 s->received++; 3646 return 1; 3647 } 3649 The validity check can be made stronger requiring more than two 3650 packets in sequence. The disadvantages are that a larger number of 3651 initial packets will be discarded (or delayed in a queue) and that 3652 high packet loss rates could prevent validation. However, because the 3653 RTCP header validation is relatively strong, if an RTCP packet is 3654 received from a source before the data packets, the count could be 3655 adjusted so that only two packets are required in sequence. If 3656 initial data loss for a few seconds can be tolerated, an application 3657 MAY choose to discard all data packets from a source until a valid 3658 RTCP packet has been received from that source. 3660 Depending on the application and encoding, algorithms may exploit 3661 additional knowledge about the payload format for further validation. 
3662 For payload types where the timestamp increment is the same for all 3663 packets, the timestamp values can be predicted from the previous 3664 packet received from the same source using the sequence number 3665 difference (assuming no change in payload type). 3667 A strong "fast-path" check is possible since with high probability 3668 the first four octets in the header of a newly received RTP data 3669 packet will be just the same as that of the previous packet from the 3670 same SSRC except that the sequence number will have increased by one. 3671 Similarly, a single-entry cache may be used for faster SSRC lookups 3672 in applications where data is typically received from one source at a 3673 time. 3675 A.2 RTCP Header Validity Checks 3677 The following checks SHOULD be applied to RTCP packets. 3679 o RTP version field must equal 2. 3681 o The payload type field of the first RTCP packet in a compound 3682 packet must be equal to SR or RR. 3684 o The padding bit (P) should be zero for the first packet of a 3685 compound RTCP packet because padding should only be applied, 3686 if it is needed, to the last packet. 3688 o The length fields of the individual RTCP packets must total to 3689 the overall length of the compound RTCP packet as received. 3690 This is a fairly strong check. 3692 The code fragment below performs all of these checks. The packet type 3693 is not checked for subsequent packets since unknown packet types may 3694 be present and should be ignored. 
3696 u_int32 len; /* length of compound RTCP packet in words */ 3697 rtcp_t *r; /* RTCP header */ 3698 rtcp_t *end; /* end of compound RTCP packet */ 3700 if ((*(u_int16 *)r & RTCP_VALID_MASK) != RTCP_VALID_VALUE) { 3701 /* something wrong with packet format */ 3702 } 3703 end = (rtcp_t *)((u_int32 *)r + len); 3705 do r = (rtcp_t *)((u_int32 *)r + r->common.length + 1); 3706 while (r < end && r->common.version == 2); 3708 if (r != end) { 3709 /* something wrong with packet format */ 3710 } 3712 A.3 Determining the Number of RTP Packets Expected and Lost 3714 In order to compute packet loss rates, the number of packets expected 3715 and actually received from each source needs to be known, using 3716 per-source state information defined in struct source referenced via 3717 pointer s in the code below. The number of packets received is simply 3718 the count of packets as they arrive, including any late or duplicate 3719 packets. The number of packets expected can be computed by the 3720 receiver as the difference between the highest sequence number 3721 received (s->max_seq) and the first sequence number received 3722 (s->base_seq). Since the sequence number is only 16 bits and will wrap 3723 around, it is necessary to extend the highest sequence number with 3724 the (shifted) count of sequence number wraparounds (s->cycles). 3725 Both the received packet count and the count of cycles are maintained 3726 by the RTP header validity check routine in Appendix A.1. 3728 extended_max = s->cycles + s->max_seq; 3729 expected = extended_max - s->base_seq + 1; 3731 The number of packets lost is defined to be the number of packets 3732 expected less the number of packets actually received: 3734 lost = expected - s->received; 3736 Since this signed number is carried in 24 bits, it SHOULD be clamped 3737 at 0x7fffff for positive loss or 0x800000 for negative loss rather 3738 than wrapping around.
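One possible way to apply the clamping described above is sketched here. The function names rtcp_clamp_lost and rtcp_lost_field are hypothetical, and the use of a 64-bit input is an assumption made so that the difference between expected and received counts cannot overflow before it is clamped.

```c
#include <stdint.h>

/*
 * Hypothetical helper: clamp the cumulative number of packets lost to
 * the range of a signed 24-bit integer instead of letting it wrap.
 */
int32_t rtcp_clamp_lost(int64_t lost)
{
    if (lost > 0x7fffff)
        return 0x7fffff;      /* largest positive loss      */
    if (lost < -0x800000)
        return -0x800000;     /* most negative loss         */
    return (int32_t)lost;
}

/*
 * Hypothetical helper: the 24-bit two's-complement bit pattern to be
 * placed in the reception report block.
 */
uint32_t rtcp_lost_field(int64_t lost)
{
    return (uint32_t)rtcp_clamp_lost(lost) & 0xffffffu;
}
```

A negative loss can occur when duplicates arrive, since duplicate packets count as received. With this sketch, a computed loss of -1 is carried as 0xffffff and any value below -8388608 is carried as 0x800000.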
3740 The fraction of packets lost during the last reporting interval 3741 (since the previous SR or RR packet was sent) is calculated from 3742 differences in the expected and received packet counts across the 3743 interval, where expected_prior and received_prior are the values 3744 saved when the previous reception report was generated: 3746 expected_interval = expected - s->expected_prior; 3747 s->expected_prior = expected; 3748 received_interval = s->received - s->received_prior; 3749 s->received_prior = s->received; 3750 lost_interval = expected_interval - received_interval; 3751 if (expected_interval == 0 || lost_interval <= 0) fraction = 0; 3752 else fraction = (lost_interval << 8) / expected_interval; 3754 The resulting fraction is an 8-bit fixed point number with the binary 3755 point at the left edge. 3757 A.4 Generating SDES RTCP Packets 3759 This function builds one SDES chunk into buffer b composed of argc 3760 items supplied in arrays type, value and length. It returns a pointer to the next available location within b. 3762 char *rtp_write_sdes(char *b, u_int32 src, int argc, 3763 rtcp_sdes_type_t type[], char *value[], 3764 int length[]) 3765 { 3766 rtcp_sdes_t *s = (rtcp_sdes_t *)b; 3767 rtcp_sdes_item_t *rsp; 3768 int i; 3769 int len; 3770 int pad; 3772 /* SSRC header */ 3773 s->src = src; 3774 rsp = &s->item[0]; 3776 /* SDES items */ 3777 for (i = 0; i < argc; i++) { 3778 rsp->type = type[i]; 3779 len = length[i]; 3780 if (len > RTP_MAX_SDES) { 3781 /* invalid length, may want to take other action */ 3782 len = RTP_MAX_SDES; 3783 } 3784 rsp->length = len; 3785 memcpy(rsp->data, value[i], len); 3786 rsp = (rtcp_sdes_item_t *)&rsp->data[len]; 3787 } 3789 /* terminate with end marker and pad to next 4-octet boundary */ 3790 len = ((char *) rsp) - b; 3791 pad = 4 - (len & 0x3); 3792 b = (char *) rsp; 3793 while (pad--) *b++ = RTCP_SDES_END; 3795 return b; 3796 } 3798 A.5 Parsing RTCP SDES Packets 3800 This function parses an SDES packet, calling functions find_member() 3801 to find a pointer to the information
for a session member given the
SSRC identifier and member_sdes() to store the new SDES information
for that member. This function expects a pointer to the header of the
RTCP packet.

   void rtp_read_sdes(rtcp_t *r)
   {
       int count = r->common.count;
       rtcp_sdes_t *sd = &r->r.sdes;
       rtcp_sdes_item_t *rsp, *rspn;
       rtcp_sdes_item_t *end = (rtcp_sdes_item_t *)
                               ((u_int32 *)r + r->common.length + 1);
       source *s;

       while (--count >= 0) {
           rsp = &sd->item[0];
           if (rsp >= end) break;
           s = find_member(sd->src);

           for (; rsp->type; rsp = rspn ) {
               rspn = (rtcp_sdes_item_t *)((char*)rsp+rsp->length+2);
               if (rspn >= end) {
                   rsp = rspn;
                   break;
               }
               member_sdes(s, rsp->type, rsp->data, rsp->length);
           }
           sd = (rtcp_sdes_t *)
                ((u_int32 *)sd + (((char *)rsp - (char *)sd) >> 2)+1);
       }
       if (count >= 0) {
           /* invalid packet format */
       }
   }

A.6 Generating a Random 32-bit Identifier

The following subroutine generates a random 32-bit identifier using
the MD5 routines published in RFC 1321 [30]. The system routines may
not be present on all operating systems, but they should serve as
hints as to what kinds of information may be used. Other system calls
that may be appropriate include

     o getdomainname(),

     o getwd(), or

     o getrusage().

"Live" video or audio samples are also a good source of random
numbers, but care must be taken to avoid using a turned-off
microphone or blinded camera as a source [17].

Use of this or a similar routine is RECOMMENDED to generate the
initial seed for the random number generator producing the RTCP
period (as shown in Appendix A.7), to generate the initial values for
the sequence number and timestamp, and to generate SSRC values.
Since this routine is likely to be CPU-intensive, its direct use to
generate RTCP periods is inappropriate because predictability is not
an issue. Note that this routine produces the same result on repeated
calls until the value of the system clock changes unless different
values are supplied for the type argument.

   /*
    * Generate a random 32-bit quantity.
    */
   #include <sys/types.h>   /* u_long */
   #include <sys/time.h>    /* gettimeofday() */
   #include <unistd.h>      /* get..() */
   #include <stdio.h>       /* printf() */
   #include <time.h>        /* clock() */
   #include <sys/utsname.h> /* uname() */
   #include "global.h"      /* from RFC 1321 */
   #include "md5.h"         /* from RFC 1321 */

   #define MD_CTX MD5_CTX
   #define MDInit MD5Init
   #define MDUpdate MD5Update
   #define MDFinal MD5Final

   static u_long md_32(char *string, int length)
   {
       MD_CTX context;
       union {
           char   c[16];
           u_long x[4];
       } digest;
       u_long r;
       int i;

       MDInit (&context);
       MDUpdate (&context, string, length);
       MDFinal ((unsigned char *)&digest, &context);
       r = 0;
       for (i = 0; i < 3; i++) {
           r ^= digest.x[i];
       }
       return r;
   }                               /* md_32 */

   /*
    * Return random unsigned 32-bit quantity. Use 'type' argument if you
    * need to generate several different values in close succession.
    */
   u_int32 random32(int type)
   {
       struct {
           int     type;
           struct  timeval tv;
           clock_t cpu;
           pid_t   pid;
           u_long  hid;
           uid_t   uid;
           gid_t   gid;
           struct  utsname name;
       } s;

       gettimeofday(&s.tv, 0);
       uname(&s.name);
       s.type = type;
       s.cpu  = clock();
       s.pid  = getpid();
       s.hid  = gethostid();
       s.uid  = getuid();
       s.gid  = getgid();
       /* also: system uptime */

       return md_32((char *)&s, sizeof(s));
   }                               /* random32 */

A.7 Computing the RTCP Transmission Interval

The following functions implement the RTCP transmission and reception
rules described in Section 6.2.
These rules are coded in several
functions:

     o rtcp_interval() computes the deterministic calculated
       interval, measured in seconds. The parameters are defined in
       Section 6.3.

     o OnExpire() is called when the RTCP transmission timer expires.

     o OnReceive() is called whenever an RTCP packet is received.

Both OnExpire() and OnReceive() have event e as an argument. This is
the next scheduled event for that participant, either an RTCP report
or a BYE packet. It is assumed that the following functions are
available:

     o Schedule(time t, event e) schedules an event e to occur at
       time t. When time t arrives, the function OnExpire() is called
       with e as an argument.

     o Reschedule(time t, event e) reschedules a previously scheduled
       event e for time t.

     o SendRTCPReport(event e) sends an RTCP report.

     o SendBYEPacket(event e) sends a BYE packet.

     o TypeOfEvent(event e) returns EVENT_BYE if the event being
       processed is for a BYE packet to be sent, else it returns
       EVENT_REPORT.

     o PacketType(p) returns PACKET_RTCP_REPORT if packet p is an
       RTCP report (not BYE), PACKET_BYE if it is a BYE RTCP packet,
       and PACKET_RTP if it is a regular RTP data packet.

     o ReceivedPacketSize() and SentPacketSize() return the size of
       the referenced packet in octets.

     o NewMember(p) returns a 1 if the participant who sent packet p
       is not currently in the member list, 0 otherwise. Note this
       function is not sufficient for a complete implementation
       because each CSRC identifier in an RTP packet and each SSRC in
       a BYE packet should be processed.

     o NewSender(p) returns a 1 if the participant who sent packet p
       is not currently in the sender sublist of the member list, 0
       otherwise.

     o AddMember() and RemoveMember() add and remove participants
       from the member list.

     o AddSender() and RemoveSender() add and remove participants
       from the sender sublist of the member list.

   double rtcp_interval(int members,
                        int senders,
                        double rtcp_bw,
                        int we_sent,
                        double avg_rtcp_size,
                        int initial)
   {
       /*
        * Minimum average time between RTCP packets from this site (in
        * seconds). This time prevents the reports from `clumping' when
        * sessions are small and the law of large numbers isn't helping
        * to smooth out the traffic. It also keeps the report interval
        * from becoming ridiculously small during transient outages like
        * a network partition.
        */
       double const RTCP_MIN_TIME = 5.;
       /*
        * Fraction of the RTCP bandwidth to be shared among active
        * senders. (This fraction was chosen so that in a typical
        * session with one or two active senders, the computed report
        * time would be roughly equal to the minimum report time so that
        * we don't unnecessarily slow down receiver reports.) The
        * receiver fraction must be 1 - the sender fraction.
        */
       double const RTCP_SENDER_BW_FRACTION = 0.25;
       double const RTCP_RCVR_BW_FRACTION = (1-RTCP_SENDER_BW_FRACTION);
       /*
        * To compensate for "unconditional reconsideration" converging
        * to a value below the intended average.
        */
       double const COMPENSATION = 2.71828 - 1.5;

       double t;                   /* interval */
       double rtcp_min_time = RTCP_MIN_TIME;
       int n;                      /* no. of members for computation */

       /*
        * Very first call at application start-up uses half the min
        * delay for quicker notification while still allowing some time
        * before reporting for randomization and to learn about other
        * sources so the report interval will converge to the correct
        * interval more quickly.
        */
       if (initial) {
           rtcp_min_time /= 2;
       }
       /*
        * If there were active senders, give them at least a minimum
        * share of the RTCP bandwidth.
        * Otherwise all participants share
        * the RTCP bandwidth equally.
        */
       n = members;
       if (senders > 0 && senders < members * RTCP_SENDER_BW_FRACTION) {
           if (we_sent) {
               rtcp_bw *= RTCP_SENDER_BW_FRACTION;
               n = senders;
           } else {
               rtcp_bw *= RTCP_RCVR_BW_FRACTION;
               n -= senders;
           }
       }

       /*
        * The effective number of sites times the average packet size is
        * the total number of octets sent when each site sends a report.
        * Dividing this by the effective bandwidth gives the time
        * interval over which those packets must be sent in order to
        * meet the bandwidth target, with a minimum enforced. In that
        * time interval we send one report so this time is also our
        * average time between reports.
        */
       t = avg_rtcp_size * n / rtcp_bw;
       if (t < rtcp_min_time) t = rtcp_min_time;

       /*
        * To avoid traffic bursts from unintended synchronization with
        * other sites, we then pick our actual next report interval as a
        * random number uniformly distributed between 0.5*t and 1.5*t.
        */
       t = t * (drand48() + 0.5);
       t = t / COMPENSATION;
       return t;
   }

   void OnExpire(event e,
                 int members,
                 int senders,
                 double rtcp_bw,
                 int we_sent,
                 double *avg_rtcp_size,
                 int *initial,
                 time_tp tc,
                 time_tp *tp,
                 int *pmembers)
   {
       /* This function is responsible for deciding whether to send
        * an RTCP report or BYE packet now, or to reschedule transmission.
        * It is also responsible for updating the pmembers, initial, tp,
        * and avg_rtcp_size state variables. This function should be called
        * upon expiration of the event timer used by Schedule().
        */

       double t;     /* Interval */
       double tn;    /* Next transmit time */

       /* In the case of a BYE, we use "unconditional reconsideration" to
        * reschedule the transmission of the BYE if necessary */

       if (TypeOfEvent(e) == EVENT_BYE) {
           t = rtcp_interval(members,
                             senders,
                             rtcp_bw,
                             we_sent,
                             *avg_rtcp_size,
                             *initial);
           tn = *tp + t;
           if (tn <= tc) {
               SendBYEPacket(e);
               exit(1);
           } else {
               Schedule(tn, e);
           }

       } else if (TypeOfEvent(e) == EVENT_REPORT) {
           t = rtcp_interval(members,
                             senders,
                             rtcp_bw,
                             we_sent,
                             *avg_rtcp_size,
                             *initial);
           tn = *tp + t;
           if (tn <= tc) {
               SendRTCPReport(e);
               *avg_rtcp_size = (1./16.)*SentPacketSize(e) +
                   (15./16.)*(*avg_rtcp_size);
               *tp = tc;

               /* We must redraw the interval. Don't reuse the
                  one computed above, since it's not actually
                  distributed the same, as we are conditioned
                  on it being small enough to cause a packet to
                  be sent */

               t = rtcp_interval(members,
                                 senders,
                                 rtcp_bw,
                                 we_sent,
                                 *avg_rtcp_size,
                                 *initial);

               Schedule(t+tc,e);
               *initial = 0;
           } else {
               Schedule(tn, e);
           }
           *pmembers = members;
       }
   }

   void OnReceive(packet p,
                  event e,
                  int *members,
                  int *pmembers,
                  int *senders,
                  double *avg_rtcp_size,
                  double *tp,
                  double tc,
                  double tn)
   {
       /* What we do depends on whether we have left the group, and
        * are waiting to send a BYE (TypeOfEvent(e) == EVENT_BYE) or
        * an RTCP report. p represents the packet that was just received.
        */

       if (PacketType(p) == PACKET_RTCP_REPORT) {
           if (NewMember(p) && (TypeOfEvent(e) == EVENT_REPORT)) {
               AddMember(p);
               *members += 1;
           }
           *avg_rtcp_size = (1./16.)*ReceivedPacketSize(p) +
               (15./16.)*(*avg_rtcp_size);
       } else if (PacketType(p) == PACKET_RTP) {
           if (NewMember(p) && (TypeOfEvent(e) == EVENT_REPORT)) {
               AddMember(p);
               *members += 1;
           }
           if (NewSender(p) && (TypeOfEvent(e) == EVENT_REPORT)) {
               AddSender(p);
               *senders += 1;
           }
       } else if (PacketType(p) == PACKET_BYE) {
           *avg_rtcp_size = (1./16.)*ReceivedPacketSize(p) +
               (15./16.)*(*avg_rtcp_size);

           if (TypeOfEvent(e) == EVENT_REPORT) {
               if (NewSender(p) == FALSE) {
                   RemoveSender(p);
                   *senders -= 1;
               }

               if (NewMember(p) == FALSE) {
                   RemoveMember(p);
                   *members -= 1;
               }

               if (*members < *pmembers) {
                   tn = tc + (((double) *members)/(*pmembers))*(tn - tc);
                   *tp = tc - (((double) *members)/(*pmembers))*(tc - *tp);

                   /* Reschedule the next report for time tn */

                   Reschedule(tn, e);
                   *pmembers = *members;
               }

           } else if (TypeOfEvent(e) == EVENT_BYE) {
               *members += 1;
           }
       }
   }

A.8 Estimating the Interarrival Jitter

The code fragments below implement the algorithm given in Section
6.4.1 for calculating an estimate of the statistical variance of the
RTP data interarrival time to be inserted in the interarrival jitter
field of reception reports. The inputs are r->ts, the timestamp from
the incoming packet, and arrival, the current time in the same
units. Here s points to state for the source; s->transit holds the
relative transit time for the previous packet, and s->jitter holds
the estimated jitter. The jitter field of the reception report is
measured in timestamp units and expressed as an unsigned integer, but
the jitter estimate is kept as a floating-point value.
As each data packet arrives, the jitter estimate is updated:

   int transit = arrival - r->ts;
   int d = transit - s->transit;
   s->transit = transit;
   if (d < 0) d = -d;
   s->jitter += (1./16.) * ((double)d - s->jitter);

When a reception report block (to which rr points) is generated for
this member, the current jitter estimate is returned:

   rr->jitter = (u_int32) s->jitter;

Alternatively, the jitter estimate can be kept as an integer, but
scaled to reduce round-off error. The calculation is the same except
for the last line:

   s->jitter += d - ((s->jitter + 8) >> 4);

In this case, the estimate is sampled for the reception report as:

   rr->jitter = s->jitter >> 4;

B Changes from RFC 1889

Most of this RFC is identical to RFC 1889. There are no changes in
the packet formats on the wire, only changes to the rules and
algorithms governing how the protocol is used. The biggest change was
an enhancement to the scalable timer algorithm for calculating when
to send RTCP packets:

     o The algorithm for calculating the RTCP transmission interval
       specified in Sections 6.2 and 6.3 and illustrated in Appendix
       A.7 is augmented to include "reconsideration" to minimize
       transmission in excess of the intended rate when many
       participants join a session simultaneously, and "reverse
       reconsideration" to reduce the incidence and duration of false
       participant timeouts when the number of participants drops
       rapidly. Reverse reconsideration is also used to possibly
       shorten the delay before sending RTCP SR when transitioning
       from passive receiver to active sender mode.

     o Section 6.3.7 specifies new rules controlling when an RTCP BYE
       packet should be sent in order to avoid a flood of packets
       when many participants leave a session simultaneously.

     o The requirement to retain state for inactive participants for
       a period long enough to span typical network partitions was
       removed from Section 6.2.1. In a session where many
       participants join for a brief time and fail to send BYE, this
       requirement would cause a significant overestimate of the
       number of participants. The reconsideration algorithm added in
       this revision compensates for the large number of new
       participants joining simultaneously when a partition heals.

It should be noted that these enhancements only have a significant
effect when the number of session participants is large (thousands)
and most of the participants join or leave at the same time. This
makes testing in a live network difficult. However, the algorithm was
subjected to a thorough analysis and simulation to verify its
performance. Furthermore, the enhanced algorithm was designed to
interoperate with the algorithm in RFC 1889 such that the degree of
reduction in excess RTCP bandwidth during a step join is proportional
to the fraction of participants that implement the enhanced
algorithm. Interoperation of the two algorithms has been verified
experimentally on live networks.

Other functional changes were:

     o Section 6.2.1 specifies that implementations may store only a
       sampling of the participants' SSRC identifiers to allow
       scaling to very large sessions. Algorithms are specified in
       RFC 2762 [21].

     o In Section 6.2 it is specified that RTCP sender and receiver
       bandwidths may be set as separate parameters of the session
       rather than a strict percentage of the session bandwidth, and
       may be set to zero. The requirement that RTCP was mandatory
       for RTP sessions using IP multicast was relaxed.

     o Also in Section 6.2 it is specified that the minimum RTCP
       interval may be scaled to smaller values for high bandwidth
       sessions, and that the initial RTCP delay may be set to zero
       for unicast sessions.

     o Timing out a participant is to be based on inactivity for a
       number of RTCP report intervals calculated using the receiver
       RTCP bandwidth fraction even for active senders.

     o Sections 7.2 and 7.3 specify that translators and mixers
       should send BYE packets for the sources they are no longer
       forwarding.

     o Rule changes for layered encodings are defined in Sections
       2.4, 6.3.9, 8.3 and 11. In the last of these, it is noted that
       the address and port assignment rule conflicts with the SDP
       specification, RFC 2327 [15], but it is intended that this
       restriction will be relaxed in a revision of RFC 2327.

     o The convention for using even/odd port pairs for RTP and RTCP
       in Section 11 was clarified to refer to destination ports.
       The requirement to use an even/odd port pair was removed if
       the two ports are specified explicitly. For unicast RTP
       sessions, distinct port pairs may be used for the two ends
       (Sections 3, 7.1 and 11).

     o A new Section 10 was added to explain the requirement for
       congestion control in applications using RTP.

     o In Section 8.2, the requirement that a new SSRC identifier
       MUST be chosen whenever the source transport address is
       changed has been relaxed to say that a new SSRC identifier MAY
       be chosen. Correspondingly, it was clarified that an
       implementation MAY choose to keep packets from the new source
       address rather than the existing source address when a
       collision occurs, and SHOULD do so for applications such as
       telephony in which some sources such as mobile entities may
       change addresses during the course of an RTP session.

     o An indentation bug in the RFC 1889 printing of the pseudo-code
       for the collision detection and resolution algorithm in
       Section 8.2 has been corrected by translating the syntax to
       pseudo C language, and the algorithm has been modified to
       remove the restriction that both RTP and RTCP must be sent
       from the same source port number.

     o The description of the padding mechanism for RTCP packets was
       clarified and it is specified that padding MUST only be
       applied to the last packet of a compound RTCP packet.

     o In Section A.1, initialization of base_seq was corrected to be
       seq rather than seq - 1, and the text was corrected to say the
       bad sequence number plus 1 is stored.

     o Clamping of number of packets lost in Section A.3 was
       corrected to use both positive and negative limits.

     o The specification of "relative" NTP timestamp in the RTCP SR
       section now defines these timestamps to be based on the most
       common system-specific clock, such as system uptime, rather
       than on session elapsed time which would not be the same for
       multiple applications started on the same machine at different
       times.

Non-functional changes:

     o It is specified that a receiver MUST ignore packets with
       payload types it does not understand.

     o In Fig. 2, the floating point NTP timestamp value was
       corrected and the UTC timezone was specified.

     o The inconsequence of NTP timestamps wrapping around in the
       year 2036 is explained.

     o The policy for registration of RTCP packet types and SDES
       types was clarified in a new Section 14, IANA Considerations.
       The suggestion that experimenters register the numbers they
       need and then unregister those which prove to be unneeded has
       been removed in favor of using APP and PRIV. Registration of
       profile names was also specified.

     o The reference for the UTF-8 character set was changed from an
       X/Open Preliminary Specification to be RFC 2279.

     o The last paragraph of the introduction in RFC 1889, which
       cautioned implementers to limit deployment in the Internet,
       was removed because it was deemed no longer relevant.

     o Small clarifications of the text have been made in several
       places, some in response to questions from readers. In
       particular:

       - A definition for "RTP media type" is given in Section 3 to
         allow the explanation of multiplexing RTP sessions in
         Section 5.2 to be more clear regarding the multiplexing of
         multiple media.

       - The definition for "non-RTP means" was expanded to include
         examples of other protocols constituting non-RTP means.

       - The description of the session bandwidth parameter is
         expanded in Section 6.2.

       - The effect of varying packet duration on the jitter
         calculation was explained in Section 6.4.4.

       - The method for terminating and padding a sequence of SDES
         items was clarified in Section 6.5.

       - The Security section adds a formal reference to IPSEC now
         that it is available, and says that the confidentiality
         method defined in this specification is primarily to codify
         existing practice. It is RECOMMENDED that stronger
         encryption algorithms such as Triple-DES be used in place of
         the default algorithm. It is also noted that payload-only
         encryption is necessary to allow for header compression.

       - The method for partial encryption of RTCP was clarified; in
         particular, SDES CNAME is carried in only one part when the
         compound RTCP packet is split.

       - It is clarified that only one compound RTCP packet should be
         sent per reporting interval and that if there are too many
         active sources for the reports to fit in the MTU, then a
         subset of the sources should be selected round-robin over
         multiple intervals.

       - A note was added in Appendix A.1 that packets may be saved
         during RTP header validation and delivered upon success.

       - Section 7.3 now explains that a mixer aggregating SDES
         packets uses more RTCP bandwidth due to longer packets, and
         a mixer passing through RTCP naturally sends packets at
         higher than the single source rate, but both behaviors are
         valid.

       - Section 13 clarifies that an RTP application may use
         multiple profiles but typically only one in a given session.

       - The terms MUST, SHOULD, MAY, etc. are used as defined in RFC
         2119.

       - The bibliography was divided into normative and
         non-normative references.

C Security Considerations

RTP suffers from the same security liabilities as the underlying
protocols. For example, an impostor can fake source or destination
network addresses, or change the header or payload. Within RTCP, the
CNAME and NAME information may be used to impersonate another
participant. In addition, RTP may be sent via IP multicast, which
provides no direct means for a sender to know all the receivers of
the data sent and therefore no measure of privacy. Rightly or not,
users may be more sensitive to privacy concerns with audio and video
communication than they have been with more traditional forms of
network communication [31]. Therefore, the use of security mechanisms
with RTP is important. These mechanisms are discussed in Section 9.

RTP-level translators or mixers may be used to allow RTP traffic to
reach hosts behind firewalls. Appropriate firewall security
principles and practices, which are beyond the scope of this
document, should be followed in the design and installation of these
devices and in the admission of RTP applications for use behind the
firewall.

D Full Copyright Statement

Copyright (C) The Internet Society (2001). All Rights Reserved.

This document and translations of it may be copied and furnished to
others, and derivative works that comment on or otherwise explain it
or assist in its implementation may be prepared, copied, published
and distributed, in whole or in part, without restriction of any
kind, provided that the above copyright notice and this paragraph are
included on all such copies and derivative works. However, this
document itself may not be modified in any way, such as by removing
the copyright notice or references to the Internet Society or other
Internet organizations, except as needed for the purpose of
developing Internet standards in which case the procedures for
copyrights defined in the Internet Standards process must be
followed, or as required to translate it into languages other than
English.

The limited permissions granted above are perpetual and will not be
revoked by the Internet Society or its successors or assigns.

This document and the information contained herein is provided on an
"AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

E Addresses of Authors

Henning Schulzrinne
Department of Computer Science
Columbia University
1214 Amsterdam Avenue
New York, NY 10027
USA
electronic mail: schulzrinne@cs.columbia.edu

Stephen L. Casner
Packet Design
2465 Latham Street
Mountain View, CA 94040
United States
electronic mail: casner@acm.org

Ron Frederick
Cacheflow Inc.
650 Almanor Avenue
Sunnyvale, CA 94085
United States
electronic mail: ronf@cacheflow.com

Van Jacobson
Packet Design
2465 Latham Street
Mountain View, CA 94040
United States
electronic mail: van@packetdesign.com

Acknowledgments

This memorandum is based on discussions within the IETF Audio/Video
Transport working group chaired by Stephen Casner and Colin Perkins.
The current protocol has its origins in the Network Voice Protocol
and the Packet Video Protocol (Danny Cohen and Randy Cole) and the
protocol implemented by the vat application (Van Jacobson and Steve
McCanne). Christian Huitema provided ideas for the random identifier
generator. Extensive analysis and simulation of the timer
reconsideration algorithm was done by Jonathan Rosenberg.

References

Normative References

[1] H. Schulzrinne and S. Casner, "RTP profile for audio and video
conferences with minimal control," Request for Comments XXXX,
Internet Engineering Task Force. Work in progress.

[2] S. Bradner, "Key words for use in RFCs to indicate requirement
levels," Request for Comments (Best Current Practice) 2119, Internet
Engineering Task Force, Mar. 1997.

[3] J. Postel, "Internet protocol," Request for Comments (Standard)
791, Internet Engineering Task Force, Sept. 1981.

[4] D. L. Mills, "Network time protocol (version 3) specification,
implementation," Request for Comments (Draft Standard) 1305, Internet
Engineering Task Force, Mar. 1992.

[5] F. Yergeau, "UTF-8, a transformation format of ISO 10646,"
Request for Comments (Proposed Standard) 2279, Internet Engineering
Task Force, Jan. 1998.

[6] P. V. Mockapetris, "Domain names - concepts and facilities,"
Request for Comments (Standard) 1034, Internet Engineering Task
Force, Nov. 1987.

[7] P. V.
Mockapetris, "Domain names - implementation and
specification," Request for Comments (Standard) 1035, Internet
Engineering Task Force, Nov. 1987.

[8] R. T. Braden, "Requirements for internet hosts - application and
support," Request for Comments (Standard) 1123, Internet Engineering
Task Force, Oct. 1989.

[9] D. Crocker, "Standard for the format of ARPA internet text
messages," Request for Comments (Standard) 822, Internet Engineering
Task Force, Aug. 1982.

Non-Normative References

[10] D. D. Clark and D. L. Tennenhouse, "Architectural considerations
for a new generation of protocols," in SIGCOMM Symposium on
Communications Architectures and Protocols, (Philadelphia,
Pennsylvania), pp. 200--208, IEEE, Sept. 1990. Computer
Communications Review, Vol. 20(4), Sept. 1990.

[11] H. Schulzrinne, "Issues in designing a transport protocol for
audio and video conferences and other multiparticipant real-time
applications." expired Internet draft, Oct. 1993.

[12] D. E. Comer, Internetworking with TCP/IP, vol. 1. Englewood
Cliffs, New Jersey: Prentice Hall, 1991.

[13] M. Handley, H. Schulzrinne, E. Schooler, and J. Rosenberg, "SIP:
session initiation protocol," Request for Comments (Proposed
Standard) 2543, Internet Engineering Task Force, Mar. 1999.

[14] International Telecommunication Union, "Visual telephone systems
and equipment for local area networks which provide a non-guaranteed
quality of service," Recommendation H.323, Telecommunication
Standardization Sector of ITU, Geneva, Switzerland, May 1996.

[15] M. Handley and V. Jacobson, "SDP: session description protocol,"
Request for Comments (Proposed Standard) 2327, Internet Engineering
Task Force, Apr. 1998.

[16] H. Schulzrinne, A. Rao, and R.
Lanphier, "Real time streaming
protocol (RTSP)," Request for Comments (Proposed Standard) 2326,
Internet Engineering Task Force, Apr. 1998.

[17] D. Eastlake, 3rd, S. Crocker, and J. Schiller, "Randomness
recommendations for security," Request for Comments (Informational)
1750, Internet Engineering Task Force, Dec. 1994.

[18] J.-C. Bolot, T. Turletti, and I. Wakeman, "Scalable feedback
control for multicast video distribution in the internet," in SIGCOMM
Symposium on Communications Architectures and Protocols, (London,
England), pp. 58--67, ACM, Aug. 1994.

[19] I. Busse, B. Deffner, and H. Schulzrinne, "Dynamic QoS control
of multimedia applications based on RTP," Computer Communications,
vol. 19, pp. 49--58, Jan. 1996.

[20] S. Floyd and V. Jacobson, "The synchronization of periodic
routing messages," in SIGCOMM Symposium on Communications
Architectures and Protocols (D. P. Sidhu, ed.), (San Francisco,
California), pp. 33--44, ACM, Sept. 1993. Also in [32].

[21] J. Rosenberg and H. Schulzrinne, "Sampling of the group
membership in RTP," Request for Comments (Experimental) 2762,
Internet Engineering Task Force, May 1999.

[22] J. A. Cadzow, Foundations of digital signal processing and data
analysis. New York, New York: Macmillan, 1987.

[23] Y. Rekhter, B. Moskowitz, D. Karrenberg, and G. de Groot,
"Address allocation for private internets," Request for Comments
(Informational) 1597, Internet Engineering Task Force, Mar. 1994.

[24] E. Lear, E. Fair, D. Crocker, and T. Kessler, "Network 10
considered harmful (some practices shouldn't be codified)," Request
for Comments (Informational) 1627, Internet Engineering Task Force,
June 1994.

[25] W. Feller, An Introduction to Probability Theory and its
Applications, Volume 1, vol. 1. New York, New York: John Wiley and
Sons, third ed., 1968.

[26] S. Kent and R.
Atkinson, "Security architecture for the internet
protocol," Request for Comments (Proposed Standard) 2401, Internet
Engineering Task Force, Nov. 1998.

[27] D. Balenson, "Privacy enhancement for internet electronic mail:
Part III: algorithms, modes, and identifiers," Request for Comments
(Proposed Standard) 1423, Internet Engineering Task Force, Feb. 1993.

[28] V. L. Voydock and S. T. Kent, "Security mechanisms in high-level
network protocols," ACM Computing Surveys, vol. 15, pp. 135--171,
June 1983.

[29] S. Floyd, "Congestion Control Principles," Request for Comments
(Best Current Practice) 2914, Internet Engineering Task Force, Sep.
2000.

[30] R. Rivest, "The MD5 message-digest algorithm," Request for
Comments (Informational) 1321, Internet Engineering Task Force, Apr.
1992.

[31] S. Stubblebine, "Security services for multimedia conferencing,"
in 16th National Computer Security Conference, (Baltimore,
Maryland), pp. 391--395, Sept. 1993.

[32] S. Floyd and V. Jacobson, "The synchronization of periodic
routing messages," IEEE/ACM Transactions on Networking, vol. 2, pp.
122--136, Apr. 1994.