idnits 2.17.1 draft-ietf-avt-rtp-new-00.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- ** Cannot find the required boilerplate sections (Copyright, IPR, etc.) in this document. Expected boilerplate is as follows today (2024-04-27) according to https://trustee.ietf.org/license-info : IETF Trust Legal Provisions of 28-dec-2009, Section 6.a: This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79. IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 2: Copyright (c) 2024 IETF Trust and the persons identified as the document authors. All rights reserved. IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 3: This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- ** Missing expiration date. The document expiration date should appear on the first and last page. ** The document seems to lack a 1id_guidelines paragraph about Internet-Drafts being working documents. ** The document seems to lack a 1id_guidelines paragraph about 6 months document validity -- however, there's a paragraph with a matching beginning. Boilerplate error? ** The document seems to lack a 1id_guidelines paragraph about the list of current Internet-Drafts. 
** The document seems to lack a 1id_guidelines paragraph about the list of Shadow Directories. ** Expected the document's filename to be given on the first page, but didn't find any ** The document is more than 15 pages and seems to lack a Table of Contents. == No 'Intended status' indicated for this document; assuming Proposed Standard Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** The document seems to lack an IANA Considerations section. (See Section 2.2 of https://www.ietf.org/id-info/checklist for how to handle the case when there are no actions for IANA.) ** The document seems to lack an Authors' Addresses Section. ** There are 22 instances of too long lines in the document, the longest one being 3 characters in excess of 72. ** The document seems to lack a both a reference to RFC 2119 and the recommended RFC 2119 boilerplate, even if it appears to use RFC 2119 keywords. RFC 2119 keyword, line 193: '... that a receiver MUST ignore packets w...' RFC 2119 keyword, line 277: '...tently use the terms MUST, SHOULD, MAY...' RFC 2119 keyword, line 678: '... A receiver MUST ignore packets with...' RFC 2119 keyword, line 1255: '... MAY use any desired approach for im...' RFC 2119 keyword, line 1275: '..., an application MAY instead store onl...' (17 more instances...) Miscellaneous warnings: ---------------------------------------------------------------------------- == Line 1046 has weird spacing: '... item item ...' == Line 3137 has weird spacing: '...ed char u_int...' == Line 3139 has weird spacing: '...ned int u_in...' == Line 3662 has weird spacing: '... char c[16...' == Line 3686 has weird spacing: '... struct timev...' == (6 more instances...) == Couldn't figure out when the document was first submitted -- there may comments or warnings related to the use of a disclaimer for pre-RFC5378 work that could not be issued because of this. 
Please check the Legal Provisions document at https://trustee.ietf.org/license-info to determine if you need the pre-RFC5378 disclaimer. -- The document date (December 5, 1997) is 9640 days in the past. Is this intentional? -- Found something which looks like a code comment -- if you have code sections in the document, please surround them with '<CODE BEGINS>' and '<CODE ENDS>' lines. Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) -- Missing reference section? '1' on line 4045 looks like a reference -- Missing reference section? '2' on line 4051 looks like a reference -- Missing reference section? '3' on line 4055 looks like a reference -- Missing reference section? '4' on line 4058 looks like a reference -- Missing reference section? '5' on line 4061 looks like a reference -- Missing reference section? '6' on line 4064 looks like a reference -- Missing reference section? '7' on line 4067 looks like a reference -- Missing reference section? '8' on line 4071 looks like a reference -- Missing reference section? '9' on line 4076 looks like a reference -- Missing reference section? '-packet-' on line 1043 looks like a reference -- Missing reference section? '10' on line 4080 looks like a reference -- Missing reference section? '11' on line 4085 looks like a reference -- Missing reference section? '14' on line 4096 looks like a reference -- Missing reference section? '15' on line 4099 looks like a reference -- Missing reference section? '16' on line 4103 looks like a reference -- Missing reference section? '17' on line 4107 looks like a reference -- Missing reference section? '18' on line 4111 looks like a reference -- Missing reference section? '19' on line 4115 looks like a reference -- Missing reference section? 'E1' on line 2390 looks like a reference -- Missing reference section? 
'E6' on line 2390 looks like a reference -- Missing reference section? 'E2' on line 2399 looks like a reference -- Missing reference section? 'E4' on line 2399 looks like a reference -- Missing reference section? 'E3' on line 2401 looks like a reference -- Missing reference section? 'E5' on line 2405 looks like a reference -- Missing reference section? '20' on line 4119 looks like a reference -- Missing reference section? '21' on line 4123 looks like a reference -- Missing reference section? '22' on line 4127 looks like a reference -- Missing reference section? '0' on line 3593 looks like a reference -- Missing reference section? '23' on line 4131 looks like a reference -- Missing reference section? '24' on line 4134 looks like a reference -- Missing reference section? '25' on line 4138 looks like a reference -- Missing reference section? '12' on line 4088 looks like a reference -- Missing reference section? '13' on line 4093 looks like a reference Summary: 12 errors (**), 0 flaws (~~), 8 warnings (==), 35 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 Internet Engineering Task Force Audio/Video Transport Working Group 3 Internet Draft Schulzrinne/Casner/Frederick/Jacobson 4 ietf-avt-rtp-new-00.txt Columbia U./Precept/Xerox/LBNL 5 December 5, 1997 6 Expires: June 5, 1998 8 RTP: A Transport Protocol for Real-Time Applications 10 STATUS OF THIS MEMO 12 This document is an Internet-Draft. Internet-Drafts are working 13 documents of the Internet Engineering Task Force (IETF), its areas, 14 and its working groups. Note that other groups may also distribute 15 working documents as Internet-Drafts. 17 Internet-Drafts are draft documents valid for a maximum of six months 18 and may be updated, replaced, or obsoleted by other documents at any 19 time. 
It is inappropriate to use Internet-Drafts as reference 20 material or to cite them other than as ``work in progress''. 22 To learn the current status of any Internet-Draft, please check the 23 ``1id-abstracts.txt'' listing contained in the Internet-Drafts Shadow 24 Directories on ftp.is.co.za (Africa), nic.nordu.net (Europe), 25 munnari.oz.au (Pacific Rim), ds.internic.net (US East Coast), or 26 ftp.isi.edu (US West Coast). 28 Distribution of this document is unlimited. 30 ABSTRACT 32 This memorandum is a revision of RFC 1889 in preparation 33 for advancement from Proposed Standard to Draft Standard 34 status. Readers are encouraged to use the PostScript form 35 of this draft to see where changes from RFC 1889 are 36 marked by change bars. The revision process is not yet 37 complete; some changes which have been discussed and 38 tentatively accepted in meetings of the Audio/Video 39 Transport working group have not yet been incorporated 40 into this draft. 42 This memorandum describes RTP, the real-time transport 43 protocol. RTP provides end-to-end network transport 44 functions suitable for applications transmitting real- 45 time data, such as audio, video or simulation data, over 46 multicast or unicast network services. RTP does not 47 address resource reservation and does not guarantee 48 quality-of-service for real-time services. The data 49 transport is augmented by a control protocol (RTCP) to 50 allow monitoring of the data delivery in a manner 51 scalable to large multicast networks, and to provide 52 minimal control and identification functionality. RTP and 53 RTCP are designed to be independent of the underlying 54 transport and network layers. The protocol supports the 55 use of RTP-level translators and mixers. 57 This specification is a product of the Audio/Video Transport working 58 group within the Internet Engineering Task Force. 
Comments are 59 solicited and should be addressed to the working group's mailing list 60 at rem-conf@es.net and/or the authors. 62 1 Introduction 64 This memorandum specifies the real-time transport protocol (RTP), 65 which provides end-to-end delivery services for data with real-time 66 characteristics, such as interactive audio and video. Those services 67 include payload type identification, sequence numbering, timestamping 68 and delivery monitoring. Applications typically run RTP on top of UDP 69 to make use of its multiplexing and checksum services; both protocols 70 contribute parts of the transport protocol functionality. However, 71 RTP may be used with other suitable underlying network or transport 72 protocols (see Section 10). RTP supports data transfer to multiple 73 destinations using multicast distribution if provided by the 74 underlying network. 76 Note that RTP itself does not provide any mechanism to ensure timely 77 delivery or provide other quality-of-service guarantees, but relies 78 on lower-layer services to do so. It does not guarantee delivery or 79 prevent out-of-order delivery, nor does it assume that the underlying 80 network is reliable and delivers packets in sequence. The sequence 81 numbers included in RTP allow the receiver to reconstruct the 82 sender's packet sequence, but sequence numbers might also be used to 83 determine the proper location of a packet, for example in video 84 decoding, without necessarily decoding packets in sequence. 86 While RTP is primarily designed to satisfy the needs of multi- 87 participant multimedia conferences, it is not limited to that 88 particular application. Storage of continuous data, interactive 89 distributed simulation, active badge, and control and measurement 90 applications may also find RTP applicable. 92 This document defines RTP, consisting of two closely-linked parts: 94 o the real-time transport protocol (RTP), to carry data that has 95 real-time properties. 
97 o the RTP control protocol (RTCP), to monitor the quality of 98 service and to convey information about the participants in an 99 on-going session. The latter aspect of RTCP may be sufficient 100 for "loosely controlled" sessions, i.e., where there is no 101 explicit membership control and set-up, but it is not 102 necessarily intended to support all of an application's control 103 communication requirements. This functionality may be fully or 104 partially subsumed by a separate session control protocol, 105 which is beyond the scope of this document. 107 RTP represents a new style of protocol following the principles of 108 application level framing and integrated layer processing proposed by 109 Clark and Tennenhouse [1]. That is, RTP is intended to be malleable 110 to provide the information required by a particular application and 111 will often be integrated into the application processing rather than 112 being implemented as a separate layer. RTP is a protocol framework 113 that is deliberately not complete. This document specifies those 114 functions expected to be common across all the applications for which 115 RTP would be appropriate. Unlike conventional protocols in which 116 additional functions might be accommodated by making the protocol 117 more general or by adding an option mechanism that would require 118 parsing, RTP is intended to be tailored through modifications and/or 119 additions to the headers as needed. Examples are given in Sections 120 5.3 and 6.4.3. 122 Therefore, in addition to this document, a complete specification of 123 RTP for a particular application will require one or more companion 124 documents (see Section 12): 126 o a profile specification document, which defines a set of 127 payload type codes and their mapping to payload formats (e.g., 128 media encodings). A profile may also define extensions or 129 modifications to RTP that are specific to a particular class of 130 applications. 
Typically an application will operate under only 131 one profile. A profile for audio and video data may be found in 132 the companion RFC 1890. 134 o payload format specification documents, which define how a 135 particular payload, such as an audio or video encoding, is to 136 be carried in RTP. 138 A discussion of real-time services and algorithms for their 139 implementation as well as background discussion on some of the RTP 140 design decisions can be found in [2]. 142 Several RTP applications, both experimental and commercial, have 143 already been implemented from draft specifications. These 144 applications include audio and video tools along with diagnostic 145 tools such as traffic monitors. Users of these tools number in the 146 thousands. However, the current Internet cannot yet support the full 147 potential demand for real-time services. High-bandwidth services 148 using RTP, such as video, can potentially seriously degrade the 149 quality of service of other network services. Thus, implementors 150 should take appropriate precautions to limit accidental bandwidth 151 usage. Application documentation should clearly outline the 152 limitations and possible operational impact of high-bandwidth real- 153 time services on the Internet and other network services. 155 1.1 Changes 157 Most of this draft is identical to RFC 1889. The changes are listed 158 below and are marked with change bars in the PostScript form of this 159 draft. This section may become an appendix when the draft is 160 published as an updated RFC, but it is included here at the front of 161 the document at this point to encourage feedback on these changes. 
163 o The algorithm for calculating the RTCP transmission interval 164 specified in Sections 6.2 and 6.3 and illustrated in Appendix 165 A.7 is augmented to include "reconsideration" to minimize 166 transmission over the intended rate when many participants join 167 a session simultaneously, and "reverse reconsideration" to 168 reduce the incidence and duration of false participant timeouts 169 when the number of participants drops rapidly. 171 o Section 6.3.7 specifies new rules controlling when an RTCP BYE 172 packet should be sent in order to avoid a flood of packets when 173 many participants leave a session simultaneously. Sections 7.2 174 and 7.3 specify that translators and mixers should send BYE 175 packets for the sources they are no longer forwarding. 177 o An algorithm is specified in Sections 6.3.3 and 6.3.4 to allow 178 storage of only a sampling of the participants' SSRC 179 identifiers to allow scaling to very large sessions. 181 o Rule changes for layered encodings are defined in Sections 182 2.4, 6.3.9, 8.3 and 10. 184 o An indentation bug in the RFC 1889 printing of the pseudo-code 185 for the collision detection and resolution algorithm in Section 186 8.2 is corrected, and the algorithm has been modified to remove 187 the restriction that both RTP and RTCP must be sent from the 188 same source port number. 190 o For unicast RTP sessions, distinct port pairs may be used for 191 the two ends (Sections 3 and 7.1). 193 o It is specified that a receiver MUST ignore packets with 194 payload types it does not understand. 196 o The reference for the UTF-8 character set was changed to be 197 RFC 2044. 199 o Small clarifications of the text have been made in several 200 places in response to questions from readers. In particular: 202 -A definition for "RTP media type" is given in Section 3 to 203 allow the explanation of multiplexing RTP sessions in Section 204 5.2 to be more clear regarding the multiplexing of multiple 205 media. 
207 -The description of the session bandwidth parameter is expanded 208 in Section 6.2. 210 -The method for padding RTCP packets is clarified in Section 211 6.4. 213 -The method for terminating and padding a sequence of SDES 214 items is clarified in Section 6.5. 216 1.2 Open Issues 218 The revisions in this draft are not yet complete; first, there are 219 some open issues regarding the changes that have been made: 221 o The RTCP timer reconsideration algorithm settles to a steady 222 state bandwidth that is below the desired level. Can the 223 algorithm compensate for this using a fudge factor? 225 o The algorithm for sampled storage of SSRC identifiers results 226 in a temporary underestimate in group size (and an increase in 227 the RTCP rate) by a factor of 1/2 or more when the group size 228 is decreasing such that the mask size also decreases. This may 229 require some mechanism to compensate. 231 o The "reverse reconsideration" algorithm does not prevent the 232 group size estimate from incorrectly dropping to zero for a 233 short time when most participants of a large session leave at 234 once but some remain. The algorithm does make the estimate 235 return to the correct value more rapidly. It may be possible to 236 use a filter to slow the decrease in the estimate and prevent 237 this problem, but that would also slow down the increase in the 238 estimate for simultaneous joins, which is a problem. The 239 incorrect drop to zero may be deemed only a secondary concern. 241 Second, there are also some changes which have been discussed and 242 tentatively accepted in meetings of the Audio/Video Transport working 243 group but have not yet been incorporated into this draft: 245 o Allowing RTCP sender and receiver bandwidths to be separate 246 parameters of the session rather than a strict percentage of 247 the session bandwidth. The defaults would retain the current 248 values of 1.25% and 3.75%. 
This change would allow rate- 249 adaptive applications to set an RTCP bandwidth consistent with 250 a "typical" data bandwidth that is lower than the maximum 251 bandwidth specified by the session bandwidth parameter. It 252 would also allow RTCP reception reports to be turned off 253 entirely for operation on unidirectional links. 254 Correspondingly, the text requiring transmission of RTCP for 255 multicast sessions needs to be generalized. 257 o Scaling the minimum RTCP interval inversely proportional to 258 the session bandwidth parameter: 260 -to a larger value to help reduce the spike size on a step join 261 when access links are slow (and the session bandwidth is 262 therefore low); 264 -to provide sufficient time for a packet to arrive for 265 conditional reconsideration; 267 -to a smaller value for high-rate multicast sessions to allow 268 for faster inter-media synchronization. Since the simultaneous 269 join flood is largely a function of the ratio of network 270 delays to the minimum interval, the value should not be scaled 271 much below the current 5 second minimum for receivers. 272 However, senders could be allowed to transmit a higher RTCP 273 bandwidth while still using the 5 second value when computing 274 the interval for timeouts to avoid timing out receivers. A 275 smaller value is also appropriate for unicast sessions. 277 o The text should consistently use the terms MUST, SHOULD, MAY 278 as defined in RFC 2119. 280 Third, since the publication of RFC 1889, the following changes have 281 been suggested but not yet discussed within the working group: 283 o For media with several packets with the same timestamp, the 284 jitter computation should be done only for one packet (the 285 first?). 287 o Define a photo URL item in SDES, which might be constrained to 288 use by senders only. Such an addition could cause severe web 289 server overload by triggering many simultaneous requests if 290 used in a large multicast session. 
292 o The specification of the NTP timestamp in the RTCP SR section 293 says that when "relative" NTP timestamps are used they should 294 be based on elapsed time from the start of the session. 295 However, if the start times for the audio and video sessions 296 are not the same, then the NTP timestamps won't be usable for 297 synchronization. Should the base be changed to "system uptime," 298 and if so, how should that be defined? 300 o The padding mechanism for RTCP packets is not exactly the same 301 as for RTP packets because of the compound packet structure. 302 This was not explained clearly enough, resulting in incorrect 303 implementations. It is suggested that the current padding 304 mechanism for RTCP packets (only) be deprecated. In its place, 305 a new RTCP packet type "PAD" could be defined that is always to 306 be ignored. That packet can take whatever length (in 32-bit 307 words) is required for padding, assuming there is no need to 308 pad to odd boundaries. The new mechanism would be backward 309 compatible because older implementations should ignore the 310 unknown PAD packet type. 312 o It is specified that sources should add random offsets to the 313 sequence number and timestamp fields to make known-plaintext 314 attacks on encryption more difficult, even if the source itself 315 does not encrypt, because the packets may flow through a 316 translator that does. However, the translator cannot depend 317 upon the source to do this. Should the translator be allowed 318 to add its own random offsets to these fields and the 319 corresponding fields in RTCP packets? 321 o The discussion of security issues may need to be expanded. In 322 particular, it has been recommended that the confidentiality 323 mechanisms defined in this document should follow the same 324 overall format as the IPSEC ESP work, unless there is some 325 compelling reason not to. 327 2 RTP Use Scenarios 329 The following sections describe some aspects of the use of RTP. 
The 330 examples were chosen to illustrate the basic operation of 331 applications using RTP, not to limit what RTP may be used for. In 332 these examples, RTP is carried on top of IP and UDP, and follows the 333 conventions established by the profile for audio and video specified 334 in the companion RFC 1890 (updated by Internet-Draft draft-ietf-avt- 335 profile-new ). 337 2.1 Simple Multicast Audio Conference 339 A working group of the IETF meets to discuss the latest protocol 340 draft, using the IP multicast services of the Internet for voice 341 communications. Through some allocation mechanism the working group 342 chair obtains a multicast group address and pair of ports. One port 343 is used for audio data, and the other is used for control (RTCP) 344 packets. This address and port information is distributed to the 345 intended participants. If privacy is desired, the data and control 346 packets may be encrypted as specified in Section 9.1, in which case 347 an encryption key must also be generated and distributed. The exact 348 details of these allocation and distribution mechanisms are beyond 349 the scope of RTP. 351 The audio conferencing application used by each conference 352 participant sends audio data in small chunks of, say, 20 ms duration. 353 Each chunk of audio data is preceded by an RTP header; RTP header and 354 data are in turn contained in a UDP packet. The RTP header indicates 355 what type of audio encoding (such as PCM, ADPCM or LPC) is contained 356 in each packet so that senders can change the encoding during a 357 conference, for example, to accommodate a new participant that is 358 connected through a low-bandwidth link or react to indications of 359 network congestion. 361 The Internet, like other packet networks, occasionally loses and 362 reorders packets and delays them by variable amounts of time. 
To cope 363 with these impairments, the RTP header contains timing information 364 and a sequence number that allow the receivers to reconstruct the 365 timing produced by the source, so that in this example, chunks of 366 audio are contiguously played out the speaker every 20 ms. This 367 timing reconstruction is performed separately for each source of RTP 368 packets in the conference. The sequence number can also be used by 369 the receiver to estimate how many packets are being lost. 371 Since members of the working group join and leave during the 372 conference, it is useful to know who is participating at any moment 373 and how well they are receiving the audio data. For that purpose, 374 each instance of the audio application in the conference periodically 375 multicasts a reception report plus the name of its user on the RTCP 376 (control) port. The reception report indicates how well the current 377 speaker is being received and may be used to control adaptive 378 encodings. In addition to the user name, other identifying 379 information may also be included subject to control bandwidth limits. 380 A site sends the RTCP BYE packet (Section 6.6) when it leaves the 381 conference. 383 2.2 Audio and Video Conference 385 If both audio and video media are used in a conference, they are 386 transmitted as separate RTP sessions; RTCP packets are transmitted for 387 each medium using two different UDP port pairs and/or multicast 388 addresses. There is no direct coupling at the RTP level between the 389 audio and video sessions, except that a user participating in both 390 sessions should use the same distinguished (canonical) name in the 391 RTCP packets for both so that the sessions can be associated. 393 One motivation for this separation is to allow some participants in 394 the conference to receive only one medium if they choose. Further 395 explanation is given in Section 5.2. 
Despite the separation, 396 synchronized playback of a source's audio and video can be achieved 397 using timing information carried in the RTCP packets for both 398 sessions. 400 2.3 Mixers and Translators 402 So far, we have assumed that all sites want to receive media data in 403 the same format. However, this may not always be appropriate. 404 Consider the case where participants in one area are connected 405 through a low-speed link to the majority of the conference 406 participants who enjoy high-speed network access. Instead of forcing 407 everyone to use a lower-bandwidth, reduced-quality audio encoding, an 408 RTP-level relay called a mixer may be placed near the low-bandwidth 409 area. This mixer resynchronizes incoming audio packets to reconstruct 410 the constant 20 ms spacing generated by the sender, mixes these 411 reconstructed audio streams into a single stream, translates the 412 audio encoding to a lower-bandwidth one and forwards the lower- 413 bandwidth packet stream across the low-speed link. These packets 414 might be unicast to a single recipient or multicast on a different 415 address to multiple recipients. The RTP header includes a means for 416 mixers to identify the sources that contributed to a mixed packet so 417 that correct talker indication can be provided at the receivers. 419 Some of the intended participants in the audio conference may be 420 connected with high bandwidth links but might not be directly 421 reachable via IP multicast. For example, they might be behind an 422 application-level firewall that will not let any IP packets pass. For 423 these sites, mixing may not be necessary, in which case another type 424 of RTP-level relay called a translator may be used. Two translators 425 are installed, one on either side of the firewall, with the outside 426 one funneling all multicast packets received through a secure 427 connection to the translator inside the firewall. 
The translator 428 inside the firewall sends them again as multicast packets to a 429 multicast group restricted to the site's internal network. 431 Mixers and translators may be designed for a variety of purposes. An 432 example is a video mixer that scales the images of individual people 433 in separate video streams and composites them into one video stream 434 to simulate a group scene. Other examples of translation include the 435 connection of a group of hosts speaking only IP/UDP to a group of 436 hosts that understand only ST-II, or the packet-by-packet encoding 437 translation of video streams from individual sources without 438 resynchronization or mixing. Details of the operation of mixers and 439 translators are given in Section 7. 441 2.4 Layered Encodings 443 Multimedia applications should be able to adjust the transmission 444 rate to match the capacity of the receiver or to adapt to network 445 congestion. Many implementations place the responsibility of rate- 446 adaptivity at the source. This does not work well with multicast 447 transmission because of the conflicting bandwidth requirements of 448 heterogeneous receivers. The result is often a least-common 449 denominator scenario, where the smallest pipe in the network mesh 450 dictates the quality and fidelity of the overall live multimedia 451 "broadcast". 453 Instead, responsibility for rate-adaptation can be placed at the 454 receivers by combining a layered encoding with a layered transmission 455 system. In the context of RTP over IP multicast, the source can 456 stripe the progressive layers of a hierarchically represented signal 457 across multiple RTP sessions each carried on its own multicast group. 458 Receivers can then adapt to network heterogeneity and control their 459 reception bandwidth by joining only the appropriate subset of the 460 multicast groups. 462 Details of the use of RTP with layered encodings are given in 463 Sections 6.3.9, 8.3 and 10. 
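As a rough illustration of the receiver-driven adaptation described in Section 2.4, the following C sketch counts how many layers a receiver could join, starting from the base layer, before the cumulative rate exceeds a bandwidth budget. The function name layers_to_join, the per-layer rates and the budget are assumptions made for this example; this draft does not specify an adaptation algorithm, and joining the corresponding multicast groups is left out.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of RTP): count how many layers, starting
 * from the base layer at index 0, fit within the receiver's bandwidth
 * budget. Each layer is assumed to be carried in its own RTP session on
 * its own multicast group, and layers must be joined in order.
 */
size_t layers_to_join(const unsigned *layer_kbps, size_t n_layers,
                      unsigned budget_kbps)
{
    unsigned long total = 0;  /* cumulative rate of the layers joined so far */
    size_t joined = 0;

    assert(layer_kbps != NULL || n_layers == 0);

    while (joined < n_layers) {
        total += layer_kbps[joined];
        if (total > budget_kbps)
            break;            /* adding this layer would exceed the budget */
        joined++;
    }
    return joined;
}
```

For example, with layers of 32, 64 and 128 kb/s and a budget of 100 kb/s, the function returns 2, so the receiver would join only the first two multicast groups.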
465 3 Definitions 467 RTP payload: The data transported by RTP in a packet, for example 468 audio samples or compressed video data. The payload format and 469 interpretation are beyond the scope of this document. 471 RTP packet: A data packet consisting of the fixed RTP header, a 472 possibly empty list of contributing sources (see below), and the 473 payload data. Some underlying protocols may require an 474 encapsulation of the RTP packet to be defined. Typically one 475 packet of the underlying protocol contains a single RTP packet, 476 but several RTP packets may be contained if permitted by the 477 encapsulation method (see Section 10). 479 RTCP packet: A control packet consisting of a fixed header part 480 similar to that of RTP data packets, followed by structured 481 elements that vary depending upon the RTCP packet type. The 482 formats are defined in Section 6. Typically, multiple RTCP 483 packets are sent together as a compound RTCP packet in a single 484 packet of the underlying protocol; this is enabled by the length 485 field in the fixed header of each RTCP packet. 487 Port: The "abstraction that transport protocols use to distinguish 488 among multiple destinations within a given host computer. TCP/IP 489 protocols identify ports using small positive integers." [3] The 490 transport selectors (TSEL) used by the OSI transport layer are 491 equivalent to ports. RTP depends upon the lower-layer protocol 492 to provide some mechanism such as ports to multiplex the RTP and 493 RTCP packets of a session. 495 Transport address: The combination of a network address and port that 496 identifies a transport-level endpoint, for example an IP address 497 and a UDP port. Packets are transmitted from a source transport 498 address to a destination transport address. 500 RTP media type: An RTP media type is the collection of payload types 501 which can be carried within a single RTP session. The RTP 502 Profile assigns RTP media types to RTP payload types. 
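The RTCP packet definition above says that the length field in each fixed header is what allows several RTCP packets to be stacked into one compound packet. A minimal C sketch of walking such a compound packet follows. It assumes the layout given in Section 6 of the specification: the length occupies the third and fourth octets of each header, in network byte order, and counts 32-bit words minus one. The function name count_rtcp_packets is hypothetical, and the sketch does not validate the version or packet type fields.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: count the RTCP packets stacked in one compound
 * packet. Returns the number of packets found, or -1 if the buffer ends
 * in the middle of a packet.
 */
int count_rtcp_packets(const uint8_t *buf, size_t len)
{
    size_t off = 0;
    int count = 0;

    assert(buf != NULL || len == 0);

    while (off < len) {
        size_t words, pkt_len;

        if (len - off < 4)
            return -1;                 /* not even a full fixed header */

        /* Octets 2..3: length in 32-bit words minus one, big-endian. */
        words = ((size_t)buf[off + 2] << 8) | (size_t)buf[off + 3];
        pkt_len = (words + 1) * 4;     /* header plus body, in octets */

        if (pkt_len > len - off)
            return -1;                 /* packet runs past the buffer */

        off += pkt_len;
        count++;
    }
    return count;
}
```

The walk depends only on the length fields, which is why a receiver can skip over packet types it does not recognize.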
504 RTP session: The association among a set of participants 505 communicating with RTP. For each participant, the session is 506 defined by a particular pair of destination transport addresses 507 (one network address plus a port pair for RTP and RTCP). The 508 destination transport address pair may be common for all 509 participants, as in the case of IP multicast, or may be 510 different for each, as in the case of individual unicast network 511 addresses and port pairs. In a multimedia session, each medium 512 is carried in a separate RTP session with its own RTCP packets. 513 The multiple RTP sessions are distinguished by different port 514 number pairs and/or different multicast addresses. 516 Synchronization source (SSRC): The source of a stream of RTP packets, 517 identified by a 32-bit numeric SSRC identifier carried in the 518 RTP header so as not to be dependent upon the network address. 519 All packets from a synchronization source form part of the same 520 timing and sequence number space, so a receiver groups packets 521 by synchronization source for playback. Examples of 522 synchronization sources include the sender of a stream of 523 packets derived from a signal source such as a microphone or a 524 camera, or an RTP mixer (see below). A synchronization source 525 may change its data format, e.g., audio encoding, over time. The 526 SSRC identifier is a randomly chosen value meant to be globally 527 unique within a particular RTP session (see Section 8). A 528 participant need not use the same SSRC identifier for all the 529 RTP sessions in a multimedia session; the binding of the SSRC 530 identifiers is provided through RTCP (see Section 6.5.1). If a 531 participant generates multiple streams in one RTP session, for 532 example from separate video cameras, each must be identified as 533 a different SSRC. 
535 Contributing source (CSRC): A source of a stream of RTP packets that 536 has contributed to the combined stream produced by an RTP mixer 537 (see below). The mixer inserts a list of the SSRC identifiers of 538 the sources that contributed to the generation of a particular 539 packet into the RTP header of that packet. This list is called 540 the CSRC list. An example application is audio conferencing 541 where a mixer indicates all the talkers whose speech was 542 combined to produce the outgoing packet, allowing the receiver 543 to indicate the current talker, even though all the audio 544 packets contain the same SSRC identifier (that of the mixer). 546 End system: An application that generates the content to be sent in 547 RTP packets and/or consumes the content of received RTP packets. 548 An end system can act as one or more synchronization sources in 549 a particular RTP session, but typically only one. 551 Mixer: An intermediate system that receives RTP packets from one or 552 more sources, possibly changes the data format, combines the 553 packets in some manner and then forwards a new RTP packet. Since 554 the timing among multiple input sources will not generally be 555 synchronized, the mixer will make timing adjustments among the 556 streams and generate its own timing for the combined stream. 557 Thus, all data packets originating from a mixer will be 558 identified as having the mixer as their synchronization source. 560 Translator: An intermediate system that forwards RTP packets with 561 their synchronization source identifier intact. Examples of 562 translators include devices that convert encodings without 563 mixing, replicators from multicast to unicast, and application- 564 level filters in firewalls. 
566 Monitor: An application that receives RTCP packets sent by 567 participants in an RTP session, in particular the reception 568 reports, and estimates the current quality of service for 569 distribution monitoring, fault diagnosis and long-term 570 statistics. The monitor function is likely to be built into the 571 application(s) participating in the session, but may also be a 572 separate application that does not otherwise participate and 573 does not send or receive the RTP data packets. These are called 574 third party monitors. 576 Non-RTP means: Protocols and mechanisms that may be needed in 577 addition to RTP to provide a usable service. In particular, for 578 multimedia conferences, a conference control application may 579 distribute multicast addresses and keys for encryption, 580 negotiate the encryption algorithm to be used, and define 581 dynamic mappings between RTP payload type values and the payload 582 formats they represent for formats that do not have a predefined 583 payload type value. For simple applications, electronic mail or 584 a conference database may also be used. The specification of 585 such protocols and mechanisms is outside the scope of this 586 document. 588 4 Byte Order, Alignment, and Time Format 590 All integer fields are carried in network byte order, that is, most 591 significant byte (octet) first. This byte order is commonly known as 592 big-endian. The transmission order is described in detail in [4]. 593 Unless otherwise noted, numeric constants are in decimal (base 10). 595 All header data is aligned to its natural length, i.e., 16-bit fields 596 are aligned on even offsets, 32-bit fields are aligned at offsets 597 divisible by four, etc. Octets designated as padding have the value 598 zero. 600 Wallclock time (absolute time) is represented using the timestamp 601 format of the Network Time Protocol (NTP), which is in seconds 602 relative to 0h UTC on 1 January 1900 [5]. 
The full resolution NTP 603 timestamp is a 64-bit unsigned fixed-point number with the integer 604 part in the first 32 bits and the fractional part in the last 32 605 bits. In some fields where a more compact representation is 606 appropriate, only the middle 32 bits are used; that is, the low 16 607 bits of the integer part and the high 16 bits of the fractional part. 608 The high 16 bits of the integer part must be determined 609 independently. 611 5 RTP Data Transfer Protocol 613 5.1 RTP Fixed Header Fields 615 The RTP header has the following format:
617  0                   1                   2                   3
618  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
619 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
620 |V=2|P|X|  CC   |M|     PT      |       sequence number         |
621 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
622 |                           timestamp                           |
623 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
624 |           synchronization source (SSRC) identifier            |
625 +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
626 |            contributing source (CSRC) identifiers             |
627 |                             ....                              |
628 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
630 The first twelve octets are present in every RTP packet, while the 631 list of CSRC identifiers is present only when inserted by a mixer. 632 The fields have the following meaning: 634 version (V): 2 bits 635 This field identifies the version of RTP. The version defined by 636 this specification is two (2). (The value 1 is used by the first 637 draft version of RTP and the value 0 is used by the protocol 638 initially implemented in the "vat" audio tool.) 640 padding (P): 1 bit 641 If the padding bit is set, the packet contains one or more 642 additional padding octets at the end which are not part of the 643 payload. The last octet of the padding contains a count of how 644 many padding octets should be ignored, including itself.
645 Padding may be needed by some encryption algorithms with fixed 646 block sizes or for carrying several RTP packets in a lower-layer 647 protocol data unit. 649 extension (X): 1 bit 650 If the extension bit is set, the fixed header is followed by 651 exactly one header extension, with a format defined in Section 652 5.3.1. 654 CSRC count (CC): 4 bits 655 The CSRC count contains the number of CSRC identifiers that 656 follow the fixed header. 658 marker (M): 1 bit 659 The interpretation of the marker is defined by a profile. It is 660 intended to allow significant events such as frame boundaries to 661 be marked in the packet stream. A profile may define additional 662 marker bits or specify that there is no marker bit by changing 663 the number of bits in the payload type field (see Section 5.3). 665 payload type (PT): 7 bits 666 This field identifies the format of the RTP payload and 667 determines its interpretation by the application. A profile 668 specifies a default static mapping of payload type codes to 669 payload formats. Additional payload type codes may be defined 670 dynamically through non-RTP means (see Section 3). An initial 671 set of default mappings for audio and video is specified in the 672 companion RFC 1890 (updated by Internet-Draft draft-ietf-avt- 673 profile-new ), and may be extended in future editions of the 674 Assigned Numbers RFC [6]. An RTP sender emits a single RTP 675 payload type at any given time; this field is not intended for 676 multiplexing separate media streams (see Section 5.2). 678 A receiver MUST ignore packets with payload types that it does not 679 understand. 681 sequence number: 16 bits 682 The sequence number increments by one for each RTP data packet 683 sent, and may be used by the receiver to detect packet loss and 684 to restore packet sequence. 
The initial value of the sequence 685 number is random (unpredictable) to make known-plaintext attacks 686 on encryption more difficult, even if the source itself does not 687 encrypt, because the packets may flow through a translator that 688 does. Techniques for choosing unpredictable numbers are 689 discussed in [7]. 691 timestamp: 32 bits 692 The timestamp reflects the sampling instant of the first octet 693 in the RTP data packet. The sampling instant must be derived 694 from a clock that increments monotonically and linearly in time 695 to allow synchronization and jitter calculations (see Section 696 6.4.1). The resolution of the clock must be sufficient for the 697 desired synchronization accuracy and for measuring packet 698 arrival jitter (one tick per video frame is typically not 699 sufficient). The clock frequency is dependent on the format of 700 data carried as payload and is specified statically in the 701 profile or payload format specification that defines the format, 702 or may be specified dynamically for payload formats defined 703 through non-RTP means. If RTP packets are generated 704 periodically, the nominal sampling instant as determined from 705 the sampling clock is to be used, not a reading of the system 706 clock. As an example, for fixed-rate audio the timestamp clock 707 would likely increment by one for each sampling period. If an 708 audio application reads blocks covering 160 sampling periods 709 from the input device, the timestamp would be increased by 160 710 for each such block, regardless of whether the block is 711 transmitted in a packet or dropped as silent. 713 The initial value of the timestamp is random, as for the sequence 714 number. Several consecutive RTP packets may have equal timestamps if 715 they are (logically) generated at once, e.g., belong to the same 716 video frame. 
Consecutive RTP packets may contain timestamps that are 717 not monotonic if the data is not transmitted in the order it was 718 sampled, as in the case of MPEG interpolated video frames. (The 719 sequence numbers of the packets as transmitted will still be 720 monotonic.) 722 SSRC: 32 bits 723 The SSRC field identifies the synchronization source. This 724 identifier is chosen randomly, with the intent that no two 725 synchronization sources within the same RTP session will have 726 the same SSRC identifier. An example algorithm for generating a 727 random identifier is presented in Appendix A.6. Although the 728 probability of multiple sources choosing the same identifier is 729 low, all RTP implementations must be prepared to detect and 730 resolve collisions. Section 8 describes the probability of 731 collision along with a mechanism for resolving collisions and 732 detecting RTP-level forwarding loops based on the uniqueness of 733 the SSRC identifier. If a source changes its source transport 734 address, it must also choose a new SSRC identifier to avoid 735 being interpreted as a looped source (see Section 8.2). 737 CSRC list: 0 to 15 items, 32 bits each 738 The CSRC list identifies the contributing sources for the 739 payload contained in this packet. The number of identifiers is 740 given by the CC field. If there are more than 15 contributing 741 sources, only 15 may be identified. CSRC identifiers are 742 inserted by mixers, using the SSRC identifiers of contributing 743 sources. For example, for audio packets the SSRC identifiers of 744 all sources that were mixed together to create a packet are 745 listed, allowing correct talker indication at the receiver. 747 5.2 Multiplexing RTP Sessions 749 For efficient protocol processing, the number of multiplexing points 750 should be minimized, as described in the integrated layer processing 751 design principle [1]. 
In RTP, multiplexing is provided by the 752 destination transport address (network address and port number) which 753 define an RTP session. For example, in a teleconference composed of 754 audio and video media encoded separately, each medium should be 755 carried in a separate RTP session with its own destination transport 756 address. It is not intended that the audio and video streams be 757 carried in a single RTP session and demultiplexed based on the 758 payload type or SSRC fields. Interleaving packets with different RTP 759 media types but using the same SSRC would introduce several problems: 761 1. If, say, two audio streams shared the same RTP session and 762 the same SSRC value, and one were to change encodings and 763 thus acquire a different RTP payload type, there would be 764 no general way of identifying which stream had changed 765 encodings. 767 2. An SSRC is defined to identify a single timing and sequence 768 number space. Interleaving multiple payload types would 769 require different timing spaces if the media clock rates 770 differ and would require different sequence number spaces 771 to tell which payload type suffered packet loss. 773 3. The RTCP sender and receiver reports (see Section 6.4) can 774 only describe one timing and sequence number space per SSRC 775 and do not carry a payload type field. 777 4. An RTP mixer would not be able to combine interleaved 778 streams of incompatible media into one stream. 780 5. Carrying multiple media in one RTP session precludes: the 781 use of different network paths or network resource 782 allocations if appropriate; reception of a subset of the 783 media if desired, for example just audio if video would 784 exceed the available bandwidth; and receiver 785 implementations that use separate processes for the 786 different media, whereas using separate RTP sessions 787 permits either single- or multiple-process implementations. 
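The payload type and SSRC fields mentioned above sit in the fixed header of Section 5.1, which a receiver decodes only after the session has been selected by destination transport address. A minimal C sketch of that decoding follows. The names rtp_info, rd32 and rtp_parse are illustrative and do not come from this specification. The sketch also assumes the default octet layout with one marker bit and a 7-bit payload type, which a profile may redefine, and it uses the extension length rule of Section 5.3.1 to find the payload.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative structure; field names are assumptions. */
struct rtp_info {
    unsigned version, padding, extension, cc, marker, pt;
    uint16_t seq;
    uint32_t ts, ssrc;
    size_t payload_off;   /* offset of the first payload octet */
    size_t payload_len;   /* payload octets, padding excluded */
};

/* Read a 32-bit integer in network byte order (Section 4). */
static uint32_t rd32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Returns 0 on success, -1 if the packet is malformed or not version 2. */
int rtp_parse(const uint8_t *b, size_t len, struct rtp_info *out)
{
    size_t off, end = len;

    if (len < 12) return -1;                 /* fixed header is 12 octets */
    out->version   = b[0] >> 6;
    out->padding   = (b[0] >> 5) & 1;
    out->extension = (b[0] >> 4) & 1;
    out->cc        = b[0] & 0x0f;
    out->marker    = b[1] >> 7;
    out->pt        = b[1] & 0x7f;
    out->seq       = (uint16_t)((b[2] << 8) | b[3]);
    out->ts        = rd32(b + 4);
    out->ssrc      = rd32(b + 8);
    if (out->version != 2) return -1;

    off = 12 + 4 * (size_t)out->cc;          /* skip the CSRC list */
    if (off > len) return -1;
    if (out->extension) {                    /* one extension, Section 5.3.1 */
        size_t words;
        if (len - off < 4) return -1;
        words = ((size_t)b[off + 2] << 8) | b[off + 3];
        if (len - off - 4 < 4 * words) return -1;
        off += 4 + 4 * words;
    }
    if (out->padding) {                      /* last octet counts the padding */
        size_t pad = b[end - 1];
        if (pad == 0 || pad > end - off) return -1;
        end -= pad;
    }
    out->payload_off = off;
    out->payload_len = end - off;
    return 0;
}
```

The sketch deliberately does not act on the payload type. Whether a given value is understood depends on the profile and on any dynamic mappings, so that decision is left to the caller.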
789 Using a different SSRC for each medium but sending them in the same 790 RTP session would avoid the first three problems but not the last 791 two. 793 5.3 Profile-Specific Modifications to the RTP Header 795 The existing RTP data packet header is believed to be complete for 796 the set of functions required in common across all the application 797 classes that RTP might support. However, in keeping with the ALF 798 design principle, the header may be tailored through modifications or 799 additions defined in a profile specification while still allowing 800 profile-independent monitoring and recording tools to function. 802 o The marker bit and payload type field carry profile-specific 803 information, but they are allocated in the fixed header since 804 many applications are expected to need them and might otherwise 805 have to add another 32-bit word just to hold them. The octet 806 containing these fields may be redefined by a profile to suit 807 different requirements, for example with more or fewer marker 808 bits. If there are any marker bits, one should be located in 809 the most significant bit of the octet since profile-independent 810 monitors may be able to observe a correlation between packet 811 loss patterns and the marker bit. 813 o Additional information that is required for a particular 814 payload format, such as a video encoding, should be carried in 815 the payload section of the packet. This might be in a header 816 that is always present at the start of the payload section, or 817 might be indicated by a reserved value in the data pattern. 819 o If a particular class of applications needs additional 820 functionality independent of payload format, the profile under 821 which those applications operate should define additional fixed 822 fields to follow immediately after the SSRC field of the 823 existing fixed header.
Those applications will be able to 824 quickly and directly access the additional fields while 825 profile-independent monitors or recorders can still process the 826 RTP packets by interpreting only the first twelve octets. 828 If it turns out that additional functionality is needed in common 829 across all profiles, then a new version of RTP should be defined to 830 make a permanent change to the fixed header. 832 5.3.1 RTP Header Extension 834 An extension mechanism is provided to allow individual 835 implementations to experiment with new payload-format-independent 836 functions that require additional information to be carried in the 837 RTP data packet header. This mechanism is designed so that the header 838 extension may be ignored by other interoperating implementations that 839 have not been extended. 841 Note that this header extension is intended only for limited use. 842 Most potential uses of this mechanism would be better done another 843 way, using the methods described in the previous section. For 844 example, a profile-specific extension to the fixed header is less 845 expensive to process because it is not conditional nor in a variable 846 location. Additional information required for a particular payload 847 format should not use this header extension, but should be carried in 848 the payload section of the packet.
850  0                   1                   2                   3
851  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
852 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
853 |      defined by profile       |           length              |
854 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
855 |                        header extension                       |
856 |                             ....                              |
858 If the X bit in the RTP header is one, a variable-length header 859 extension is appended to the RTP header, following the CSRC list if 860 present. The header extension contains a 16-bit length field that 861 counts the number of 32-bit words in the extension, excluding the 862 four-octet extension header (therefore zero is a valid length).
Only 863 a single extension may be appended to the RTP data header. To allow 864 multiple interoperating implementations to each experiment 865 independently with different header extensions, or to allow a 866 particular implementation to experiment with more than one type of 867 header extension, the first 16 bits of the header extension are left 868 open for distinguishing identifiers or parameters. The format of 869 these 16 bits is to be defined by the profile specification under 870 which the implementations are operating. This RTP specification does 871 not define any header extensions itself. 873 6 RTP Control Protocol -- RTCP 875 The RTP control protocol (RTCP) is based on the periodic transmission 876 of control packets to all participants in the session, using the same 877 distribution mechanism as the data packets. The underlying protocol 878 must provide multiplexing of the data and control packets, for 879 example using separate port numbers with UDP. RTCP performs four 880 functions: 882 1. The primary function is to provide feedback on the quality 883 of the data distribution. This is an integral part of the 884 RTP's role as a transport protocol and is related to the 885 flow and congestion control functions of other transport 886 protocols. The feedback may be directly useful for control 887 of adaptive encodings [8,9], but experiments with IP 888 multicasting have shown that it is also critical to get 889 feedback from the receivers to diagnose faults in the 890 distribution. Sending reception feedback reports to all 891 participants allows one who is observing problems to 892 evaluate whether those problems are local or global. With a 893 distribution mechanism like IP multicast, it is also 894 possible for an entity such as a network service provider 895 who is not otherwise involved in the session to receive the 896 feedback information and act as a third-party monitor to 897 diagnose network problems. 
This feedback function is 898 performed by the RTCP sender and receiver reports, 899 described below in Section 6.4. 901 2. RTCP carries a persistent transport-level identifier for an 902 RTP source called the canonical name or CNAME, Section 903 6.5.1. Since the SSRC identifier may change if a conflict 904 is discovered or a program is restarted, receivers require 905 the CNAME to keep track of each participant. Receivers may 906 also require the CNAME to associate multiple data streams 907 from a given participant in a set of related RTP sessions, 908 for example to synchronize audio and video. 910 3. The first two functions require that all participants send 911 RTCP packets, therefore the rate must be controlled in 912 order for RTP to scale up to a large number of 913 participants. By having each participant send its control 914 packets to all the others, each can independently observe 915 the number of participants. This number is used to 916 calculate the rate at which the packets are sent, as 917 explained in Section 6.2. 919 4. A fourth, optional function is to convey minimal session 920 control information, for example participant identification 921 to be displayed in the user interface. This is most likely 922 to be useful in "loosely controlled" sessions where 923 participants enter and leave without membership control or 924 parameter negotiation. RTCP serves as a convenient channel 925 to reach all the participants, but it is not necessarily 926 expected to support all the control communication 927 requirements of an application. A higher-level session 928 control protocol, which is beyond the scope of this 929 document, may be needed. 931 Functions 1-3 are mandatory when RTP is used in the IP multicast 932 environment, and are recommended for all environments. RTP 933 application designers are advised to avoid mechanisms that can only 934 work in unicast mode and will not scale to larger numbers. 
936 6.1 RTCP Packet Format 938 This specification defines several RTCP packet types to carry a 939 variety of control information: 941 SR: Sender report, for transmission and reception statistics from 942 participants that are active senders 944 RR: Receiver report, for reception statistics from participants that 945 are not active senders 947 SDES: Source description items, including CNAME 949 BYE: Indicates end of participation 951 APP: Application specific functions 953 Each RTCP packet begins with a fixed part similar to that of RTP data 954 packets, followed by structured elements that may be of variable 955 length according to the packet type but always end on a 32-bit 956 boundary. The alignment requirement and a length field in the fixed 957 part of each packet are included to make RTCP packets "stackable". 958 Multiple RTCP packets may be concatenated without any intervening 959 separators to form a compound RTCP packet that is sent in a single 960 packet of the lower layer protocol, for example UDP. There is no 961 explicit count of individual RTCP packets in the compound packet 962 since the lower layer protocols are expected to provide an overall 963 length to determine the end of the compound packet. 965 Each individual RTCP packet in the compound packet may be processed 966 independently with no requirements upon the order or combination of 967 packets. However, in order to perform the functions of the protocol, 968 the following constraints are imposed: 970 o Reception statistics (in SR or RR) should be sent as often as 971 bandwidth constraints will allow to maximize the resolution of 972 the statistics, therefore each periodically transmitted 973 compound RTCP packet should include a report packet. 975 o New receivers need to receive the CNAME for a source as soon 976 as possible to identify the source and to begin associating 977 media for purposes such as lip-sync, so each compound RTCP 978 packet should also include the SDES CNAME. 
980 o The number of packet types that may appear first in the 981 compound packet should be limited to increase the number of 982 constant bits in the first word and the probability of 983 successfully validating RTCP packets against misaddressed RTP 984 data packets or other unrelated packets. 986 Thus, all RTCP packets must be sent in a compound packet of at least 987 two individual packets, with the following format recommended: 989 Encryption prefix: If and only if the compound packet is to be 990 encrypted, it is prefixed by a random 32-bit quantity redrawn 991 for every compound packet transmitted. 993 SR or RR: The first RTCP packet in the compound packet must always 994 be a report packet to facilitate header validation as described 995 in Appendix A.2. This is true even if no data has been sent nor 996 received, in which case an empty RR is sent, and even if the 997 only other RTCP packet in the compound packet is a BYE. 999 Additional RRs: If the number of sources for which reception 1000 statistics are being reported exceeds 31, the number that will 1001 fit into one SR or RR packet, then additional RR packets should 1002 follow the initial report packet. 1004 SDES: An SDES packet containing a CNAME item must be included in 1005 each compound RTCP packet. Other source description items may 1006 optionally be included if required by a particular application, 1007 subject to bandwidth constraints (see Section 6.3.9). 1009 BYE or APP: Other RTCP packet types, including those yet to be 1010 defined, may follow in any order, except that BYE should be the 1011 last packet sent with a given SSRC/CSRC. Packet types may appear 1012 more than once. 1014 It is advisable for translators and mixers to combine individual RTCP 1015 packets from the multiple sources they are forwarding into one 1016 compound packet whenever feasible in order to amortize the packet 1017 overhead (see Section 7). 
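The stacking rules above (a length field in every RTCP packet, and a report packet first) can be checked with a short C sketch. It is a sketch under assumptions rather than the validation procedure of Appendix A.2. It relies on the common RTCP header layout defined with the individual packet formats later in Section 6, which lies outside this excerpt: a 2-bit version, a padding bit, a 5-bit count, an 8-bit packet type, and a 16-bit length in 32-bit words minus one, with SR = 200 and RR = 201. It also assumes the packet is not encrypted, or that the 32-bit encryption prefix has already been removed. The function name is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

enum { RTCP_SR = 200, RTCP_RR = 201 };

/* Returns the number of individual RTCP packets stacked in the
 * compound packet, or -1 if the checks fail. */
int rtcp_compound_count(const uint8_t *b, size_t len)
{
    size_t off = 0;
    int n = 0;

    if (len < 4) return -1;
    /* First packet: version 2, padding bit clear, type SR or RR. */
    if ((b[0] >> 6) != 2 || (b[0] & 0x20) != 0) return -1;
    if (b[1] != RTCP_SR && b[1] != RTCP_RR) return -1;

    /* Walk the stack using each packet's own length field. */
    while (off < len) {
        size_t plen;
        if (len - off < 4) return -1;
        if ((b[off] >> 6) != 2) return -1;
        plen = 4 * ((((size_t)b[off + 2] << 8) | b[off + 3]) + 1);
        if (plen > len - off) return -1;   /* runs past the end */
        off += plen;
        n++;
    }
    return n;
}
```

A caller that wants to enforce the two-packet minimum stated above can additionally require a return value of at least 2. The sketch leaves that policy, and the search for the SDES CNAME item, to the caller.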
An example RTCP compound packet as might be 1018 produced by a mixer is shown in Fig. 1. If the overall length of a 1019 compound packet would exceed the maximum transmission unit (MTU) of 1020 the network path, it may be segmented into multiple shorter compound 1021 packets to be transmitted in separate packets of the underlying 1022 protocol. Note that each of the compound packets must begin with an 1023 SR or RR packet. 1025 An implementation may ignore incoming RTCP packets with types unknown 1026 to it. Additional RTCP packet types may be registered with the 1027 Internet Assigned Numbers Authority (IANA). 1029 6.2 RTCP Transmission Interval 1031 RTP is designed to allow an application to scale automatically over 1032 session sizes ranging from a few participants to thousands. For 1033 example, in an audio conference the data traffic is inherently self- 1034 limiting because only one or two people will speak at a time, so with 1035 multicast distribution the data rate on any given link remains 1036 relatively constant independent of the number of participants. 1037 However, the control traffic is not self-limiting. If the reception 1038 reports from each participant were sent at a constant rate, the 1039 control traffic would grow linearly with the number of participants. 
1041 if encrypted: random 32-bit integer
1042  |
1043  |[------- packet -------][----------- packet -----------][-packet-]
1044  |
1045  |              receiver             chunk        chunk
1046  V              reports           item  item    item  item
1047 --------------------------------------------------------------------
1048 |R[SR|# sender #site#site][SDES|# CNAME PHONE |#CNAME LOC][BYE##why]
1049 |R[  |# report #  1 #  2 ][    |#             |#         ][   ##   ]
1050 |R[  |#        #    #    ][    |#             |#         ][   ##   ]
1051 |R[  |#        #    #    ][    |#             |#         ][   ##   ]
1052 --------------------------------------------------------------------
1053 |<------------------  UDP packet (compound packet) --------------->|
1055 #: SSRC/CSRC
1057 Figure 1: Example of an RTCP compound packet
1059 Therefore, the rate must be scaled down. 1061 For each session, it is assumed that the data traffic is subject to 1062 an aggregate limit called the "session bandwidth" to be divided among 1063 the participants. This bandwidth might be reserved and the limit 1064 enforced by the network. If there is no reservation, there may be 1065 other constraints, depending on the environment, that establish the 1066 "reasonable" maximum for the session to use, and that would be the 1067 session bandwidth. The session bandwidth may be chosen based on some 1068 cost or a priori knowledge of the available network bandwidth for the 1069 session. It is somewhat independent of the media encoding, but the 1070 encoding choice may be limited by the session bandwidth. Often, the 1071 session bandwidth is the sum of the nominal bandwidths of the senders 1072 expected to be concurrently active. For teleconference audio, this 1073 number would typically be one sender's bandwidth. For layered 1074 encodings, each layer is a separate RTP session with its own session 1075 bandwidth parameter.
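The scaling argument at the start of Section 6.2 can be made concrete with a small C sketch. It is a deliberate simplification and not the interval algorithm of this specification: it ignores the sender share, the minimum interval and the randomization described later in this section. The function names and the notion of a fixed report size are assumptions made for illustration.

```c
/* Aggregate control traffic (octets per second) if each of n
 * participants sends one report of pkt_octets every interval_s
 * seconds. This grows linearly with n. */
double rtcp_aggregate_rate(unsigned n, double pkt_octets, double interval_s)
{
    return n * pkt_octets / interval_s;
}

/* Per-participant interval (seconds) that holds the aggregate control
 * traffic at budget_octets_s regardless of n. The interval, rather
 * than the traffic, now grows linearly with n. */
double rtcp_interval_for_budget(unsigned n, double pkt_octets,
                                double budget_octets_s)
{
    return n * pkt_octets / budget_octets_s;
}
```

For example, with 100-octet reports sent every 5 seconds, 10 participants generate 200 octets per second in total and 1000 participants generate 20000. Holding the total at 200 octets per second with 1000 participants requires each of them to wait 500 seconds between reports.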
1077 The session bandwidth parameter is expected to be supplied by a 1078 session management application when it invokes a media application, 1079 but media applications may also set a default based on the single- 1080 sender data bandwidth for the encoding selected for the session. The 1081 application may also enforce bandwidth limits based on multicast 1082 scope rules or other criteria. 1084 Bandwidth calculations for control and data traffic include lower- 1085 layer transport and network protocols (e.g., UDP and IP) since that 1086 is what the resource reservation system would need to know. The 1087 application can also be expected to know which of these protocols are 1088 in use. Link level headers are not included in the calculation since 1089 the packet will be encapsulated with different link level headers as 1090 it travels. 1092 The control traffic should be limited to a small and known fraction 1093 of the session bandwidth: small so that the primary function of the 1094 transport protocol to carry data is not impaired; known so that the 1095 control traffic can be included in the bandwidth specification given 1096 to a resource reservation protocol, and so that each participant can 1097 independently calculate its share. It is suggested that the fraction 1098 of the session bandwidth allocated to RTCP be fixed at 5%. While the 1099 value of this and other constants in the interval calculation is not 1100 critical, all participants in the session must use the same values so 1101 the same interval will be calculated. Therefore, these constants 1102 should be fixed for a particular profile. 1104 The algorithm described in Appendix A.7 was designed to meet the 1105 goals outlined above. It calculates the interval between sending 1106 compound RTCP packets to divide the allowed control traffic bandwidth 1107 among the participants. 
This allows an application to provide fast 1108 response for small sessions where, for example, identification of all 1109 participants is important, yet automatically adapt to large sessions. 1110 The algorithm incorporates the following characteristics: 1112 o Senders are collectively allocated at least 1/4 of the control 1113 traffic bandwidth so that in sessions with a large number of 1114 receivers but a small number of senders, newly joining 1115 participants will more quickly receive the CNAME for the 1116 sending sites. 1118 o The calculated interval between RTCP packets is required to be 1119 greater than a minimum of 5 seconds to avoid having bursts of 1120 RTCP packets exceed the allowed bandwidth when the number of 1121 participants is small and the traffic isn't smoothed according 1122 to the law of large numbers. 1124 o The calculated interval between RTCP packets scales linearly 1125 with the number of members in the group. It is this linear 1126 factor which allows for a constant amount of control traffic 1127 when summed across all members. 1129 o The interval between RTCP packets is varied randomly over the 1130 range [0.5,1.5] times the calculated interval to avoid 1131 unintended synchronization of all participants [10]. The first 1132 RTCP packet sent after joining a session is also delayed by a 1133 random variation of half the minimum RTCP interval in case the 1134 application is started at multiple sites simultaneously, for 1135 example as initiated by a session announcement. 1137 o A dynamic estimate of the average compound RTCP packet size is 1138 calculated, including all those received and sent, to 1139 automatically adapt to changes in the amount of control 1140 information carried. 1142 o Since the calculated interval is dependent on the number of 1143 observed group members, there may be undesirable startup 1144 effects when a new user joins an existing session, or many 1145 users simultaneously join a new session.
These new users will 1146 initially have incorrect estimates of the group membership, and 1147 thus their RTCP transmission interval will be too low. This 1148 problem can be significant if many users join the session 1149 simultaneously. To deal with this, an algorithm called "timer 1150 reconsideration" is employed. This algorithm implements a 1151 simple back-off mechanism which causes users to hold back RTCP 1152 packet transmission if the group sizes are increasing. 1154 o When users leave a session, either with a BYE or by timeout, 1155 the group membership decreases, and thus the calculated 1156 interval should decrease. A "reverse reconsideration" algorithm 1157 is used to allow members to more quickly reduce their intervals 1158 in response to group membership decreases. 1160 o BYE packets are given different treatment than normal RTCP 1161 packets. When a user leaves a group, and wishes to send a BYE 1162 packet, it may do so before its next scheduled RTCP packet. 1163 However, transmission of BYE's follows a back-off algorithm 1164 which avoids floods of BYE packets should a large number of 1165 members simultaneously leave the session. 1167 This algorithm may be used for sessions in which all participants are 1168 allowed to send. In that case, the session bandwidth parameter is the 1169 product of the individual sender's bandwidth times the number of 1170 participants, and the RTCP bandwidth is 5% of that. 1172 Details of the algorithm's operation are given in the sections that 1173 follow. Appendix A.7 gives an example implementation. 1175 6.3 RTCP Packet Send and Receive Rules 1177 The rules for how to send, and what to do when receiving an RTCP 1178 packet are outlined here. 
To execute these rules, a session 1179 participant must maintain several pieces of state: 1181 tp: the last time an RTCP packet was transmitted; 1182 tc: the current time; 1184 tn: the next scheduled transmission time of an RTCP packet; 1186 pmembers: the estimated number of session members at time tp 1188 members: the most current estimate for the number of session members; 1190 senders: the most current estimate for the number of senders in the 1191 session; 1193 rtcp_bw: The target RTCP bandwidth, i.e., the total bandwidth that 1194 will be used for RTCP packets by all members of this session, in 1195 octets per second. This should be 5% of the "session bandwidth" 1196 parameter supplied to the application at startup. 1198 we_sent: Flag that is true if the application has sent data since the 1199 2nd previous RTCP report was transmitted. 1201 avg_rtcp_size: The average compound RTCP packet size, in octets, over 1202 all RTCP packets sent and received by this user. 1204 initial: Flag that is true if the application has not yet sent an 1205 RTCP packet. 1207 Many of these rules make use of the "calculated interval" between 1208 packet transmissions. This interval is described in the following 1209 section. 1211 6.3.1 Computing the RTCP transmission interval 1213 To maintain scalability, the average interval between packets from a 1214 session participant should scale with the group size. This interval 1215 is called the calculated interval. It is obtained by combining a 1216 number of the pieces of state described above. The calculated 1217 interval T is then determined as follows: 1219 1. If there are any senders (senders > 0) in the session, but 1220 the number of senders is less than 25% of the membership 1221 (members), the interval depends on whether the user is a 1222 sender or not (based on the value of we_sent). 
If the user 1223 is a sender (we_sent true), the constant C is set to the 1224 average RTCP packet size (avg_rtcp_size) divided by 25% of 1225 the RTCP bandwidth (rtcp_bw), and the constant n is set to 1226 the number of senders. If we_sent is not true, the constant 1227 C is set to the average RTCP packet size divided by 75% of 1228 the RTCP bandwidth. The constant n is set to the number of 1229 receivers (members - senders). 1231 2. If the user has not yet sent an RTCP packet (the variable 1232 initial is true), the constant Tmin is set to 2.5 seconds, 1233 else it is set to 5 seconds. 1235 3. The deterministic calculated interval Td is set to 1236 max(Tmin, n*C). 1238 4. The calculated interval T is set to a number uniformly 1239 distributed between half and three halves of the deterministic 1240 calculated interval. 1242 This procedure results in an interval which is random, but which, on 1243 average, gives 25% of the RTCP bandwidth to senders, and 75% to 1244 receivers. 1246 6.3.2 Initialization 1248 Upon joining the session, the user initializes tp to 0, tc to 0, 1249 senders to 0, pmembers to 1, members to 1, we_sent to 1250 false, rtcp_bw to 5% of the session bandwidth, initial to true, and 1251 avg_rtcp_size to the size of the very first packet constructed by the 1252 application. The calculated interval T is then computed, and the 1253 first packet is scheduled for time tn = T. This means that a 1254 transmission timer is set which expires at time T. Note that the user 1255 MAY use any desired approach for implementing this timer. 1257 The user adds their own SSRC to the member table. 1259 6.3.3 Receiving an RTP or non-BYE RTCP packet 1261 When an RTP or RTCP packet is received from a user whose SSRC is not 1262 in the member table, the SSRC is added to the table, and the value 1263 for members is incremented by 1.
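The interval computation of Section 6.3.1 can be sketched in Python as follows. This is an illustrative sketch, not the Appendix A.7 implementation. Two points are assumptions rather than statements from the text: when there are no senders, or senders make up 25% or more of the membership, the sketch treats all members as one pool sharing the full rtcp_bw; and the halved minimum of 2.5 seconds is applied while initial is true, in line with the statement that the first RTCP packet is delayed by half the minimum interval. The rng parameter is a hypothetical hook that makes the randomization testable.

```python
import random


def rtcp_interval(members, senders, rtcp_bw, we_sent, avg_rtcp_size,
                  initial, rng=random.random):
    """Return (Td, T) in seconds for the state variables of Section 6.3.

    rtcp_bw is in octets per second, avg_rtcp_size in octets.
    rng is a hypothetical hook returning a float in [0, 1).
    """
    if 0 < senders < 0.25 * members:
        if we_sent:
            # Senders share 25% of the RTCP bandwidth.
            c = avg_rtcp_size / (0.25 * rtcp_bw)
            n = senders
        else:
            # Receivers share the remaining 75%.
            c = avg_rtcp_size / (0.75 * rtcp_bw)
            n = members - senders
    else:
        # ASSUMPTION: this case is not spelled out in the text above;
        # all members are treated alike and share the full bandwidth.
        c = avg_rtcp_size / rtcp_bw
        n = members

    # ASSUMPTION: half the minimum before the first RTCP packet is sent.
    t_min = 2.5 if initial else 5.0

    td = max(t_min, n * c)        # deterministic calculated interval
    t = td * (0.5 + rng())        # uniform over [0.5, 1.5) times Td
    return td, t
```

For a session of 1000 members with 10 senders, rtcp_bw of 500 octets per second and 100-octet packets, a receiver obtains Td = 264 seconds and a sender Td = 8 seconds.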
1265 When an RTP packet is received from a user whose SSRC is not in the 1266 sender table, the SSRC is added to the table, and the value for 1267 senders is incremented by 1. 1269 For large scale applications, such as a broadcast session, the 1270 approach of storing all the received SSRC identifiers in a table does 1271 not scale well. For huge groups, the amount of memory required to 1272 store all the SSRC identifiers and related per-source state may 1273 become impractical. 1275 To reduce this storage burden, an application MAY instead store only 1276 a sampling of the received SSRC identifiers using the algorithm 1277 described here, or any other algorithm with similar behavior. The 1278 algorithm operates by attempting to maintain the number of entries 1279 stored below some threshold, B. This threshold SHOULD NOT be less 1280 than 100 in order to achieve sufficient statistical accuracy in the 1281 sampling. 1283 The idea is to filter which SSRC identifiers are stored based on a 1284 mask. A participant uses its own SSRC as the (random) key, and starts 1285 with a mask of 0 bits (so all other SSRC identifiers received will 1286 match). Matching SSRC identifiers are placed into the table. When the 1287 table reaches full capacity (B), the mask is extended by 1 bit. 1288 (Shifting 1 bits into the least significant bit is recommended.) 1289 Now, all of the SSRC values in the table which no longer equal the 1290 key under the masking operation are discarded. On average, this 1291 reduces the size of the table by 1/2. As new SSRC identifiers are 1292 received, they are only added to the table if they match the key 1293 under the masking operation. Again, when the table size increases to 1294 B, the mask is extended by another bit, and the nonmatching entries 1295 are discarded. The mask may not be extended beyond 32 bits, in which 1296 case only the participant's own SSRC would match.
1298 If m is the number of 1 bits in the mask, and n is the number of SSRC 1299 identifiers in the table, the estimate of the group size is given by members = n 1300 * 2**m. 1302 The algorithm described attempts to keep the value of m to the 1303 smallest possible value without overflowing the table. This yields 1304 the best group size estimate possible for a given table size B. 1306 Note that this sampling algorithm MUST NOT be applied to SSRC 1307 identifiers that correspond to senders because otherwise the 1308 calculation of the RTCP bandwidth when we_sent is true would be 1309 inaccurate. The SSRC identifiers for senders MUST always be added to 1310 the table when first received and not removed from the table when the 1311 mask is extended. 1313 For each compound RTCP packet received, the value of avg_rtcp_sz is 1314 updated: avg_rtcp_sz = (1/16)*packet_size + (15/16)* avg_rtcp_sz, 1315 where packet_size is the size of the RTCP packet just received. 1317 6.3.4 Receiving an RTCP BYE packet 1319 If the received packet is an RTCP BYE packet, the SSRC is checked 1320 against the member table. If present, the entry is removed from the 1321 table, and the value for members is decremented by 1. The SSRC is 1322 then checked against the sender table. If present, the entry is 1323 removed from the table, and the value for senders is decremented by 1324 1. 1326 If an SSRC sampling algorithm is in use as described in the previous 1327 section, then when the number of entries in the member table falls 1328 below B/2, the mask SHOULD be reduced by 1 bit unless m is already 1329 zero. Note that this will cause the group size estimate to drop by 1/2. 1330 The estimate will eventually converge to the correct value as SSRC 1331 identifiers which did not previously match the key under masking, and 1332 now do, are added to the table.
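The SSRC sampling scheme described above can be sketched as follows. The class name SsrcSampler and its method names are hypothetical, and the sketch covers only the sampled member table: sender SSRC identifiers, which must never be sampled, would be kept in a separate complete table that is not shown here.

```python
class SsrcSampler:
    """Sampled member table keyed on the participant's own SSRC.

    Hypothetical helper; senders are assumed to be tracked elsewhere.
    """

    def __init__(self, own_ssrc, threshold=100):
        self.key = own_ssrc & 0xFFFFFFFF
        self.threshold = threshold      # B in the text
        self.m = 0                      # number of 1 bits in the mask
        self.table = {self.key}         # own SSRC is always a member

    def _matches(self, ssrc):
        # Mask with the m least significant bits set.
        mask = (1 << self.m) - 1
        return (ssrc & mask) == (self.key & mask)

    def add(self, ssrc):
        """Record an SSRC seen in an RTP or non-BYE RTCP packet."""
        if self._matches(ssrc):
            self.table.add(ssrc)
        if len(self.table) >= self.threshold and self.m < 32:
            # Table is full: extend the mask and drop non-matching entries.
            self.m += 1
            self.table = {s for s in self.table if self._matches(s)}

    def remove(self, ssrc):
        """Drop an SSRC after a BYE or a timeout."""
        self.table.discard(ssrc)
        if len(self.table) < self.threshold / 2 and self.m > 0:
            self.m -= 1

    def estimate(self):
        """Group size estimate: members = n * 2**m."""
        return len(self.table) * 2 ** self.m
```

With a deliberately tiny threshold of 4 and a key of 0, adding SSRCs 1, 2 and 3 fills the table, extends the mask to one bit, and leaves the entries 0 and 2, for an estimate of 4 members.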
1334 Furthermore, to make the transmission rate of RTCP packets more 1335 adaptive to changes in group membership, the following "reverse 1336 reconsideration" algorithm SHOULD be executed when a BYE packet is 1337 received: 1339 o The value for tn is updated according to the following 1340 formula: tn = tc + (members/pmembers)(tn - tc). 1342 o The value for tp is updated according to the following formula: 1343 tp = tc - (members/pmembers)(tc - tp). 1345 o The next RTCP packet is rescheduled for transmission at time 1346 tn, which is now earlier. 1348 o The value of pmembers is set equal to members. 1350 6.3.5 Timing Out an SSRC 1352 At occasional intervals, the user MUST check to see if any of the 1353 other users have timed out. To do this, the user computes the deterministic 1354 calculated interval (without the randomization factor) Td. Any other 1355 session member who has not sent a packet since time tc - M*Td (M is 1356 the timeout multiplier, and defaults to 5) is timed out. This means 1357 that their SSRC is removed from the member list, and members is 1358 decremented by 1. A similar check is performed on the sender list. 1359 Any member on the sender list who has not sent an RTP packet since 1360 time tc - T (note the absence of the M factor) is removed from the 1361 sender list, and senders is decremented by 1. 1363 The user SHOULD perform this check every time an RTCP packet of any 1364 type is received. The user MAY perform the check less frequently, but 1365 it MUST be done at least once between RTCP packet transmissions from 1366 the user. 1368 As described in the previous section, if an SSRC sampling algorithm 1369 is in use then when the number of entries in the member table falls 1370 below B/2, the mask SHOULD be reduced by 1 bit unless m is already 1371 zero.
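The reverse reconsideration update of Section 6.3.4 and the member timeout check of Section 6.3.5 can be sketched as follows. The function names and the last_heard dictionary (SSRC mapped to the time a packet was last received from it) are hypothetical; times are plain floating-point seconds.

```python
def reverse_reconsider(tc, tp, tn, members, pmembers):
    """Apply reverse reconsideration after a BYE has reduced members.

    Returns the updated (tp, tn, pmembers).
    """
    scale = members / pmembers
    tn = tc + scale * (tn - tc)   # next transmission moves earlier
    tp = tc - scale * (tc - tp)   # last transmission is scaled likewise
    return tp, tn, members


def timed_out_members(last_heard, tc, td, multiplier=5):
    """Return the SSRCs not heard from since tc - M*Td (M defaults to 5)."""
    cutoff = tc - multiplier * td
    return [ssrc for ssrc, t in last_heard.items() if t < cutoff]
```

For example, if membership halves from 100 to 50 at tc = 100 with tp = 90 and tn = 120, the next packet is rescheduled from 120 to 110 and tp becomes 95.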
1373 6.3.6 Expiration of transmission timer 1374 When the packet transmission timer expires, the user performs one of 1375 the following operations: 1377 Option A: 1379 o If members <= pmembers, an RTCP packet is transmitted. The 1380 transmission interval T, including the randomization factor, is 1381 computed. pmembers is set to members, tp is set to tc, and tn 1382 is set to tc + T. The transmission timer is set to expire again 1383 at time tn. 1385 o If members > pmembers, the transmission interval T, including 1386 the randomization factor, is computed. If tp + T is less than 1387 or equal to tc, an RTCP packet is transmitted. pmembers is set 1388 to members, tp is set to tc, and tn is set to tc + T. The 1389 transmission timer is set to expire again at time tn. If tp + T 1390 is greater than tc, pmembers is set to members, and tn is set 1391 to tc + T. No RTCP packet is transmitted. The transmission 1392 timer is set to expire at time tn. 1394 Option B: 1396 o The transmission interval T, including the randomization 1397 factor, is computed. 1399 o If tp + T is less than or equal to tc, an RTCP packet is 1400 transmitted. pmembers is set to members, tp is set to tc, and 1401 tn is set to tc + T. The transmission timer is set to expire 1402 again at time tn. If tp + T is greater than tc, pmembers is set 1403 to members, and tn is set to tc + T. No RTCP packet is 1404 transmitted. The transmission timer is set to expire at time 1405 tn. 1407 Option C: 1409 o Option B is executed for the first RTCP packet. 1411 o Option A is executed for all subsequent packets. 1413 Users SHOULD use Option B. Users MAY use options C and A. Option B 1414 provides the best protection against RTCP packet floods in the event 1415 of simultaneous joins or when network partitions heal. 1417 If an RTCP packet is transmitted (using any of the above options), 1418 the value of initial is set to FALSE.
Furthermore, the value of 1419 avg_rtcp_sz is updated: avg_rtcp_sz = (1/16)*packet_size + (15/16)* 1420 avg_rtcp_sz, where packet_size is the size of the RTCP packet just 1421 transmitted. 1423 6.3.7 Transmitting a BYE packet 1425 When a user wishes to leave a session, a BYE packet is transmitted to 1426 inform the other users of the event. In order to avoid a flood of BYE 1427 packets when many users leave the system, a client MUST implement the 1428 following algorithm if the number of members is more than 50 when the 1429 user chooses to leave: 1431 o When the user decides to leave the system, tp is reset to tc, 1432 the current time, members and pmembers are initialized to 1, 1433 initial is set to 1, we_sent is set to 0, senders is set to 0, 1434 and avg_rtcp_sz is set to the size of the BYE packet. The 1435 calculated interval T is computed. The BYE packet is then 1436 scheduled for time tn = tc + T. 1438 o Every time a BYE packet from another user is received, members 1439 is incremented by 1. members is NOT incremented when other RTCP 1440 packets or RTP packets are received, but only for BYE packets. 1442 o Transmission of the BYE packet then follows the rules for 1443 transmitting a regular RTCP packet, as above. Option B SHOULD 1444 be used. 1446 This allows BYE packets to be sent right away, yet controls their 1447 total bandwidth usage. In the worst case, this could cause RTCP 1448 control packets to use twice the bandwidth as normal (10%) - 5% for 1449 non BYE RTCP packets and 5% for BYE. 1451 A client which does not want to wait for the above mechanism to allow 1452 them to transmit a BYE packet MAY leave the group without sending a 1453 BYE at all. They will eventually be timed out by the other group 1454 members. 1456 When the group size estimate members is less than 50 when the user 1457 decides to leave, the user MAY send a BYE packet immediately. 1458 Alternatively, the user MAY choose to implement the above BYE backoff 1459 algorithm. 
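Option B of Section 6.3.6, which Section 6.3.7 also recommends for BYE packets, can be sketched as follows. The RtcpState class and the on_timer_expiry function are hypothetical names; the freshly computed randomized interval T is passed in by the caller, and actually sending the packet is left to the caller as well.

```python
from dataclasses import dataclass


@dataclass
class RtcpState:
    """Subset of the Section 6.3 state used by the timer rules."""
    tp: float = 0.0           # time the last RTCP packet was sent
    tn: float = 0.0           # next scheduled transmission time
    members: int = 1
    pmembers: int = 1
    initial: bool = True
    avg_rtcp_sz: float = 0.0  # smoothed compound packet size, octets


def on_timer_expiry(st, tc, t, packet_size):
    """Run Option B at time tc with a freshly computed interval t.

    Returns True if an RTCP packet of packet_size octets is to be
    transmitted now, False if transmission is held back.
    """
    send = st.tp + t <= tc
    if send:
        st.tp = tc
        st.initial = False
        # Smoothed packet size update with gain 1/16.
        st.avg_rtcp_sz = (1 / 16) * packet_size + (15 / 16) * st.avg_rtcp_sz
    # In both branches pmembers catches up and the timer is re-armed.
    st.pmembers = st.members
    st.tn = tc + t
    return send
```

If the group has grown since the timer was set, the recomputed T is larger, tp + T lies beyond tc, and the packet is held back until a later expiry.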
1461 In either case, a client which never sent an RTP or RTCP packet MUST 1462 NOT send a BYE packet when they leave the group. 1464 6.3.8 Updating we_sent 1466 The variable we_sent is true if the user has sent an RTP packet 1467 recently, false otherwise. This determination is made by using the 1468 same mechanisms used to manage the sender table. When the user first 1469 sends an RTP packet, they add themselves to the sender table. Every 1470 time another RTP packet is sent, the time of transmission of that 1471 packet is maintained in the table. The normal sender timeout 1472 algorithm is then applied to the user - if an RTP packet has not been 1473 transmitted since time tc - T, the user removes themselves from the 1474 sender table, decrements the sender count, and sets we_sent to 1475 false. Whenever an RTP packet is sent, we_sent is set to true. 1477 6.3.9 Allocation of source description bandwidth 1479 This specification defines several source description (SDES) items in 1480 addition to the mandatory CNAME item, such as NAME (personal name) 1481 and EMAIL (email address). It also provides a means to define new 1482 application-specific RTCP packet types. Applications should exercise 1483 caution in allocating control bandwidth to this additional 1484 information because it will slow down the rate at which reception 1485 reports and CNAME are sent, thus impairing the performance of the 1486 protocol. It is recommended that no more than 20% of the RTCP 1487 bandwidth allocated to a single participant be used to carry the 1488 additional information. Furthermore, it is not intended that all 1489 SDES items should be included in every application. Those that are 1490 included should be assigned a fraction of the bandwidth according to 1491 their utility. Rather than estimate these fractions dynamically, it 1492 is recommended that the percentages be translated statically into 1493 report interval counts based on the typical length of an item.
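A static schedule of this kind can be sketched as follows. The function name sdes_items is hypothetical, and the specific counts are illustrative choices for an application that sends only CNAME, NAME and EMAIL: CNAME goes in every interval, one extra item is added every third interval, and that extra item is EMAIL on every eighth occasion and NAME otherwise.

```python
def sdes_items(interval_index):
    """Return the SDES items to send in the given RTCP interval.

    interval_index counts RTCP intervals starting from 1.
    """
    items = ["CNAME"]                 # mandatory in every interval
    if interval_index % 3 == 0:
        slot = interval_index // 3    # counts the extra-item slots
        items.append("EMAIL" if slot % 8 == 0 else "NAME")
    return items
```

At a 5-second interval this sends NAME roughly every 15 seconds and EMAIL once every 24 intervals, that is, every 2 minutes.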
1495 For example, an application may be designed to send only CNAME, NAME 1496 and EMAIL and not any others. NAME might be given much higher 1497 priority than EMAIL because the NAME would be displayed continuously 1498 in the application's user interface, whereas EMAIL would be displayed 1499 only when requested. At every RTCP interval, an RR packet and an SDES 1500 packet with the CNAME item would be sent. For a small session 1501 operating at the minimum interval, that would be every 5 seconds on 1502 the average. Every third interval (15 seconds), one extra item would 1503 be included in the SDES packet. Seven out of eight times this would 1504 be the NAME item, and every eighth time (2 minutes) it would be the 1505 EMAIL item. 1507 When multiple applications operate in concert using cross-application 1508 binding through a common CNAME for each participant, for example in a 1509 multimedia conference composed of an RTP session for each medium, the 1510 additional SDES information might be sent in only one RTP session. 1511 The other sessions would carry only the CNAME item. In particular, 1512 this approach should be applied to the multiple sessions of a layered 1513 encoding scheme (see Section 2.4). 1515 6.4 Sender and Receiver Reports 1517 RTP receivers provide reception quality feedback using RTCP report 1518 packets which may take one of two forms depending upon whether or not 1519 the receiver is also a sender. The only difference between the sender 1520 report (SR) and receiver report (RR) forms, besides the packet type 1521 code, is that the sender report includes a 20-byte sender information 1522 section for use by active senders. The SR is issued if a site has 1523 sent any data packets during the interval since issuing the last 1524 report or the previous one, otherwise the RR is issued. 
1526 Both the SR and RR forms include zero or more reception report 1527 blocks, one for each of the synchronization sources from which this 1528 receiver has received RTP data packets since the last report. Reports 1529 are not issued for contributing sources listed in the CSRC list. Each 1530 reception report block provides statistics about the data received 1531 from the particular source indicated in that block. Since a maximum 1532 of 31 reception report blocks will fit in an SR or RR packet, 1533 additional RR packets may be stacked after the initial SR or RR 1534 packet as needed to contain the reception reports for all sources 1535 heard during the interval since the last report. 1537 The next sections define the formats of the two reports, how they may 1538 be extended in a profile-specific manner if an application requires 1539 additional feedback information, and how the reports may be used. 1540 Details of reception reporting by translators and mixers are given in 1541 Section 7.
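The stacking of reception report blocks can be sketched as follows. The function name split_report_blocks is hypothetical. The first group returned would go into the initial SR or RR packet and each further group into an additional RR packet; the limit of 31 follows from the 5-bit reception report count field.

```python
def split_report_blocks(sources, max_blocks=31):
    """Split the sources heard into groups of at most 31 report blocks.

    Returns a list of groups. With no sources to report on, a single
    empty group is returned, corresponding to an empty RR (RC = 0).
    """
    groups = [sources[i:i + max_blocks]
              for i in range(0, len(sources), max_blocks)]
    return groups or [[]]
```

A receiver that heard 70 sources during the interval would therefore emit three report packets carrying 31, 31 and 8 blocks.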
1543 6.4.1 SR: Sender report RTCP packet
1544  0                   1                   2                   3
1545  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
1546 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1547 |V=2|P|    RC   |   PT=SR=200   |             length            | header
1548 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1549 |                         SSRC of sender                        |
1550 +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
1551 |              NTP timestamp, most significant word             | sender
1552 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ info
1553 |             NTP timestamp, least significant word             |
1554 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1555 |                         RTP timestamp                         |
1556 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1557 |                     sender's packet count                     |
1558 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1559 |                      sender's octet count                     |
1560 +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
1561 |                 SSRC_1 (SSRC of first source)                 | report
1562 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ block
1563 | fraction lost |       cumulative number of packets lost       | 1
1564 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1565 |           extended highest sequence number received           |
1566 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1567 |                      interarrival jitter                      |
1568 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1569 |                         last SR (LSR)                         |
1570 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1571 |                   delay since last SR (DLSR)                  |
1572 +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
1573 |                 SSRC_2 (SSRC of second source)                | report
1574 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ block
1575 :                              ...                              : 2
1576 +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
1577 |                  profile-specific extensions                  |
1578 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1580 The sender report packet consists of three sections, possibly 1581 followed by a fourth profile-specific extension section if defined. 1582 The first section, the header, is 8 octets long. The fields have the 1583 following meaning: 1585 version (V): 2 bits 1586 Identifies the version of RTP, which is the same in RTCP packets 1587 as in RTP data packets. The version defined by this 1588 specification is two (2). 1590 padding (P): 1 bit 1591 If the padding bit is set, this individual RTCP packet contains 1592 some additional padding octets at the end which are not part of 1593 the control information but are included in the length field. 1594 The last octet of the padding is a count of how many padding 1595 octets should be ignored, including itself (it will be a 1596 multiple of four). Padding may be needed by some encryption 1597 algorithms with fixed block sizes. In a compound RTCP packet, 1598 padding should only be required on the last individual packet 1599 because the compound packet is encrypted as a whole. Thus, the 1600 padding bit would be set only on the last individual packet. 1602 reception report count (RC): 5 bits 1603 The number of reception report blocks contained in this packet. 1604 A value of zero is valid. 1606 packet type (PT): 8 bits 1607 Contains the constant 200 to identify this as an RTCP SR packet. 1609 length: 16 bits 1610 The length of this RTCP packet in 32-bit words minus one, 1611 including the header and any padding. (The offset of one makes 1612 zero a valid length and avoids a possible infinite loop in 1613 scanning a compound RTCP packet, while counting 32-bit words 1614 avoids a validity check for a multiple of 4.) 1616 SSRC: 32 bits 1617 The synchronization source identifier for the originator of this 1618 SR packet.
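The 8-octet header described above can be decoded as in the following sketch. The function name parse_rtcp_header and the dictionary keys are hypothetical; the field layout and the "32-bit words minus one" length rule are taken from the definitions above.

```python
import struct


def parse_rtcp_header(data):
    """Decode the first 8 octets of an SR or RR packet."""
    # Network byte order: one octet V/P/RC, one octet PT,
    # 16-bit length, 32-bit SSRC.
    first, pt, length, ssrc = struct.unpack("!BBHI", data[:8])
    return {
        "version": first >> 6,            # 2 bits
        "padding": bool(first & 0x20),    # 1 bit
        "rc": first & 0x1F,               # 5 bits
        "pt": pt,                         # 200 = SR, 201 = RR
        "length_words": length,           # 32-bit words minus one
        "size_octets": (length + 1) * 4,  # whole packet, header included
        "ssrc": ssrc,
    }
```

An SR carrying one reception report block occupies 8 + 20 + 24 = 52 octets, so its length field holds 12.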
1620 The second section, the sender information, is 20 octets long and is 1621 present in every sender report packet. It summarizes the data 1622 transmissions from this sender. The fields have the following 1623 meaning: 1625 NTP timestamp: 64 bits 1626 Indicates the wallclock time when this report was sent so that 1627 it may be used in combination with timestamps returned in 1628 reception reports from other receivers to measure round-trip 1629 propagation to those receivers. Receivers should expect that the 1630 measurement accuracy of the timestamp may be limited to far less 1631 than the resolution of the NTP timestamp. The measurement 1632 uncertainty of the timestamp is not indicated as it may not be 1633 known. A sender that can keep track of elapsed time but has no 1634 notion of wallclock time may use the elapsed time since joining 1635 the session instead. This is assumed to be less than 68 years, 1636 so the high bit will be zero. It is permissible to use the 1637 sampling clock to estimate elapsed wallclock time. A sender that 1638 has no notion of wallclock or elapsed time may set the NTP 1639 timestamp to zero. 1641 RTP timestamp: 32 bits 1642 Corresponds to the same time as the NTP timestamp (above), but 1643 in the same units and with the same random offset as the RTP 1644 timestamps in data packets. This correspondence may be used for 1645 intra- and inter-media synchronization for sources whose NTP 1646 timestamps are synchronized, and may be used by media- 1647 independent receivers to estimate the nominal RTP clock 1648 frequency. Note that in most cases this timestamp will not be 1649 equal to the RTP timestamp in any adjacent data packet. Rather, 1650 it is calculated from the corresponding NTP timestamp using the 1651 relationship between the RTP timestamp counter and real time as 1652 maintained by periodically checking the wallclock time at a 1653 sampling instant. 
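The 64-bit NTP timestamp format can be illustrated with the following sketch, which converts a Unix time into the two 32-bit words of the sender information and extracts the middle 32 bits that reception reports echo back in the LSR field. The function names are hypothetical, and the sketch assumes a non-negative Unix time. The constant 2208988800 is the number of seconds between the NTP epoch (1 January 1900) and the Unix epoch (1 January 1970).

```python
NTP_UNIX_OFFSET = 2208988800  # seconds from 1900-01-01 to 1970-01-01


def unix_to_ntp(unix_time):
    """Return (most significant word, least significant word)."""
    sec = int(unix_time)
    frac = int((unix_time - sec) * (1 << 32))  # units of 2**-32 seconds
    return (sec + NTP_UNIX_OFFSET) & 0xFFFFFFFF, frac & 0xFFFFFFFF


def ntp_middle32(ntp_sec, ntp_frac):
    """Low 16 bits of the seconds and high 16 bits of the fraction."""
    return ((ntp_sec & 0xFFFF) << 16) | (ntp_frac >> 16)
```

For instance, the word pair 0xb44db705 and 0x20000000 has the middle 32 bits 0xb7052000, a value with a resolution of 1/65536 second.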
1655 sender's packet count: 32 bits 1656 The total number of RTP data packets transmitted by the sender 1657 since starting transmission up until the time this SR packet was 1658 generated. The count is reset if the sender changes its SSRC 1659 identifier. 1661 sender's octet count: 32 bits 1662 The total number of payload octets (i.e., not including header 1663 or padding) transmitted in RTP data packets by the sender since 1664 starting transmission up until the time this SR packet was 1665 generated. The count is reset if the sender changes its SSRC 1666 identifier. This field can be used to estimate the average 1667 payload data rate. 1669 The third section contains zero or more reception report blocks 1670 depending on the number of other sources heard by this sender since 1671 the last report. Each reception report block conveys statistics on 1672 the reception of RTP packets from a single synchronization source. 1673 Receivers do not carry over statistics when a source changes its SSRC 1674 identifier due to a collision. These statistics are: 1676 SSRC_n (source identifier): 32 bits 1677 The SSRC identifier of the source to which the information in 1678 this reception report block pertains. 1680 fraction lost: 8 bits 1681 The fraction of RTP data packets from source SSRC_n lost since 1682 the previous SR or RR packet was sent, expressed as a fixed 1683 point number with the binary point at the left edge of the 1684 field. (That is equivalent to taking the integer part after 1685 multiplying the loss fraction by 256.) This fraction is defined 1686 to be the number of packets lost divided by the number of 1687 packets expected, as defined in the next paragraph. An 1688 implementation is shown in Appendix A.3. If the loss is 1689 negative due to duplicates, the fraction lost is set to zero. 
1690 Note that a receiver cannot tell whether any packets were lost 1691 after the last one received, and that there will be no reception 1692 report block issued for a source if all packets from that source 1693 sent during the last reporting interval have been lost. 1695 cumulative number of packets lost: 24 bits 1696 The total number of RTP data packets from source SSRC_n that 1697 have been lost since the beginning of reception. This number is 1698 defined to be the number of packets expected less the number of 1699 packets actually received, where the number of packets received 1700 includes any which are late or duplicates. Thus packets that 1701 arrive late are not counted as lost, and the loss may be 1702 negative if there are duplicates. The number of packets 1703 expected is defined to be the extended last sequence number 1704 received, as defined next, less the initial sequence number 1705 received. This may be calculated as shown in Appendix A.3. 1707 extended highest sequence number received: 32 bits 1708 The low 16 bits contain the highest sequence number received in 1709 an RTP data packet from source SSRC_n, and the most significant 1710 16 bits extend that sequence number with the corresponding count 1711 of sequence number cycles, which may be maintained according to 1712 the algorithm in Appendix A.1. Note that different receivers 1713 within the same session will generate different extensions to 1714 the sequence number if their start times differ significantly. 1716 interarrival jitter: 32 bits 1717 An estimate of the statistical variance of the RTP data packet 1718 interarrival time, measured in timestamp units and expressed as 1719 an unsigned integer. The interarrival jitter J is defined to be 1720 the mean deviation (smoothed absolute value) of the difference D 1721 in packet spacing at the receiver compared to the sender for a 1722 pair of packets. 
As shown in the equation below, this is 1723 equivalent to the difference in the "relative transit time" for 1724 the two packets; the relative transit time is the difference 1725 between a packet's RTP timestamp and the receiver's clock at the 1726 time of arrival, measured in the same units. 1728 If S_i is the RTP timestamp from packet i, and R_i is the time of 1729 arrival in RTP timestamp units for packet i, then for two packets i 1730 and j, D may be expressed as D(i,j) = (R_j - R_i) - (S_j - S_i) = 1731 (R_j - S_j) - (R_i - S_i). 1733 The interarrival jitter is calculated continuously as each data 1734 packet i is received from source SSRC_n, using this difference D for 1735 that packet and the previous packet i-1 in order of arrival (not 1736 necessarily in sequence), according to the formula J_i = J_(i-1) + 1737 (|D(i-1,i)| - J_(i-1))/16. 1738 Whenever a reception report is issued, the current value of J is 1739 sampled. 1741 The jitter calculation is prescribed here to allow profile- 1742 independent monitors to make valid interpretations of reports coming 1743 from different implementations. This algorithm is the optimal first- 1744 order estimator and the gain parameter 1/16 gives a good noise 1745 reduction ratio while maintaining a reasonable rate of convergence 1746 [11]. A sample implementation is shown in Appendix A.8. 1748 last SR timestamp (LSR): 32 bits 1749 The middle 32 bits out of 64 in the NTP timestamp (as explained 1750 in Section 4) received as part of the most recent RTCP sender 1751 report (SR) packet from source SSRC_n. If no SR has been 1752 received yet, the field is set to zero. 1754 delay since last SR (DLSR): 32 bits 1755 The delay, expressed in units of 1/65536 seconds, between 1756 receiving the last SR packet from source SSRC_n and sending this 1757 reception report block. If no SR packet has been received yet 1758 from SSRC_n, the DLSR field is set to zero. 1760 Let SSRC_r denote the receiver issuing this receiver report.
Source 1761 SSRC_n can compute the round propagation delay to SSRC_r by recording 1762 the time A when this reception report block is received. It 1763 calculates the total round-trip time A-LSR using the last SR 1764 timestamp (LSR) field, and then subtracting this field to leave the 1765 round-trip propagation delay as (A- LSR - DLSR). This is illustrated 1766 in Fig. 2. 1768 This may be used as an approximate measure of distance to cluster 1769 receivers, although some links have very asymmetric delays. 1771 6.4.2 RR: Receiver report RTCP packet 1773 [10 Nov 1995 11:33:25.125] [10 Nov 1995 11:33:36.5] 1774 n SR(n) A=b710:8000 (46864.500 s) 1775 ----------------------------------------------------------------> 1776 v ^ 1777 ntp_sec =0xb44db705 v ^ dlsr=0x0005.4000 ( 5.250s) 1778 ntp_frac=0x20000000 v ^ lsr =0xb705:2000 (46853.125s) 1779 (3024992016.125 s) v ^ 1780 r v ^ RR(n) 1781 ----------------------------------------------------------------> 1782 |<-DLSR->| 1783 (5.250 s) 1785 A 0xb710:8000 (46864.500 s) 1786 DLSR -0x0005:4000 ( 5.250 s) 1787 LSR -0xb705:2000 (46853.125 s) 1788 ------------------------------- 1789 delay 0x 6:2000 ( 6.125 s) 1791 Figure 2: Example for round-trip time computation 1793 0 1 2 3 1794 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1795 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1796 |V=2|P| RC | PT=RR=201 | length | header 1797 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1798 | SSRC of packet sender | 1799 +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+ 1800 | SSRC_1 (SSRC of first source) | report 1801 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ block 1802 | fraction lost | cumulative number of packets lost | 1 1803 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1804 | extended highest sequence number received | 1805 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1806 | interarrival jitter | 
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                         last SR (LSR)                         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                  delay since last SR (DLSR)                   |
   +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
   |                 SSRC_2 (SSRC of second source)                | report
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ block
   :                              ...                              :   2
   +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
   |                  profile-specific extensions                  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   The format of the receiver report (RR) packet is the same as that of
   the SR packet except that the packet type field contains the constant
   201 and the five words of sender information are omitted (these are
   the NTP and RTP timestamps and sender's packet and octet counts). The
   remaining fields have the same meaning as for the SR packet.

   An empty RR packet (RC = 0) is put at the head of a compound RTCP
   packet when there is no data transmission or reception to report.

6.4.3 Extending the sender and receiver reports

   A profile should define profile- or application-specific extensions
   to the sender report and receiver report if there is additional
   information that should be reported regularly about the sender or
   receivers. This method should be used in preference to defining
   another RTCP packet type because it requires less overhead:

   o  fewer octets in the packet (no RTCP header or SSRC field);

   o  simpler and faster parsing because applications running under
      that profile would be programmed to always expect the extension
      fields in the directly accessible location after the reception
      reports.

   If additional sender information is required, it should be included
   first in the extension for sender reports, but would not be present
   in receiver reports.
   If information about receivers is to be
   included, that data may be structured as an array of blocks parallel
   to the existing array of reception report blocks; that is, the number
   of blocks would be indicated by the RC field.

6.4.4 Analyzing sender and receiver reports

   It is expected that reception quality feedback will be useful not
   only for the sender but also for other receivers and third-party
   monitors. The sender may modify its transmissions based on the
   feedback; receivers can determine whether problems are local,
   regional or global; network managers may use profile-independent
   monitors that receive only the RTCP packets and not the corresponding
   RTP data packets to evaluate the performance of their networks for
   multicast distribution.

   Cumulative counts are used in both the sender information and
   receiver report blocks so that differences may be calculated between
   any two reports to make measurements over both short and long time
   periods, and to provide resilience against the loss of a report. The
   difference between the last two reports received can be used to
   estimate the recent quality of the distribution. The NTP timestamp is
   included so that rates may be calculated from these differences over
   the interval between two reports. Since that timestamp is independent
   of the clock rate for the data encoding, it is possible to implement
   encoding- and profile-independent quality monitors.

   An example calculation is the packet loss rate over the interval
   between two reception reports. The difference in the cumulative
   number of packets lost gives the number lost during that interval.
   The difference in the extended last sequence numbers received gives
   the number of packets expected during the interval. The ratio of
   these two is the packet loss fraction over the interval.
   This ratio
   should equal the fraction lost field if the two reports are
   consecutive, but otherwise it may not. The loss rate per second can
   be obtained by dividing the loss fraction by the difference in NTP
   timestamps, expressed in seconds. The number of packets received is
   the number of packets expected minus the number lost. The number of
   packets expected may also be used to judge the statistical validity
   of any loss estimates. For example, 1 out of 5 packets lost has a
   lower significance than 200 out of 1000.

   From the sender information, a third-party monitor can calculate the
   average payload data rate and the average packet rate over an
   interval without receiving the data. Taking the ratio of the two
   gives the average payload size. If it can be assumed that packet loss
   is independent of packet size, then the number of packets received by
   a particular receiver times the average payload size (or the
   corresponding packet size) gives the apparent throughput available to
   that receiver.

   In addition to the cumulative counts which allow long-term packet
   loss measurements using differences between reports, the fraction
   lost field provides a short-term measurement from a single report.
   This becomes more important as the size of a session scales up enough
   that reception state information might not be kept for all receivers
   or the interval between reports becomes long enough that only one
   report might have been received from a particular receiver.

   The interarrival jitter field provides a second short-term measure of
   network congestion. Packet loss tracks persistent congestion while
   the jitter measure tracks transient congestion. The jitter measure
   may indicate congestion before it leads to packet loss.
   Since the
   interarrival jitter field is only a snapshot of the jitter at the
   time of a report, it may be necessary to analyze a number of reports
   from one receiver over time or from multiple receivers, e.g., within
   a single network.

6.5 SDES: Source description RTCP packet

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |V=2|P|    SC   |  PT=SDES=202  |            length             | header
   +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
   |                          SSRC/CSRC_1                          | chunk
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   1
   |                           SDES items                          |
   |                              ...                              |
   +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
   |                          SSRC/CSRC_2                          | chunk
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   2
   |                           SDES items                          |
   |                              ...                              |
   +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+

   The SDES packet is a three-level structure composed of a header and
   zero or more chunks, each of which is composed of items describing
   the source identified in that chunk. The items are described
   individually in subsequent sections.

   version (V), padding (P), length:
      As described for the SR packet (see Section 6.4.1).

   packet type (PT): 8 bits
      Contains the constant 202 to identify this as an RTCP SDES
      packet.

   source count (SC): 5 bits
      The number of SSRC/CSRC chunks contained in this SDES packet. A
      value of zero is valid but useless.

   Each chunk consists of an SSRC/CSRC identifier followed by a list of
   zero or more items, which carry information about the SSRC/CSRC. Each
   chunk starts on a 32-bit boundary. Each item consists of an 8-bit
   type field, an 8-bit octet count describing the length of the text
   (thus, not including this two-octet header), and the text itself.
   Note that the text can be no longer than 255 octets, but this is
   consistent with the need to limit RTCP bandwidth consumption.

   The text is encoded according to the UTF-8 encoding specified in RFC
   2044. US-ASCII is a subset of this encoding and requires no
   additional encoding. The presence of multi-octet encodings is
   indicated by setting the most significant bit of a character to a
   value of one.

   Items are contiguous, i.e., items are not individually padded to a
   32-bit boundary. Text is not null terminated because some multi-octet
   encodings include null octets. The list of items in each chunk is
   terminated by one or more null octets, the first of which is
   interpreted as an item type of zero to denote the end of the list.
   No length octet follows the null item type octet, but additional null
   octets are included if needed to pad until the next 32-bit boundary.
   Note that this padding is separate from that indicated by the P bit
   in the RTCP header. A chunk with zero items (four null octets) is
   valid but useless.

   End systems send one SDES packet containing their own source
   identifier (the same as the SSRC in the fixed RTP header). A mixer
   sends one SDES packet containing a chunk for each contributing source
   from which it is receiving SDES information, or multiple complete
   SDES packets in the format above if there are more than 31 such
   sources (see Section 7).

   The SDES items currently defined are described in the next sections.
   Only the CNAME item is mandatory. Some items shown here may be useful
   only for particular profiles, but the item types are all assigned
   from one common space to promote shared use and to simplify
   profile-independent applications. Additional items may be defined in
   a profile by registering the type numbers with IANA.
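
   The chunk layout described above (an SSRC/CSRC identifier, contiguous
   items, then null termination padded to a 32-bit boundary) might be
   produced along the following lines. This is only an illustrative
   sketch and not part of the specification: the function names
   sdes_put_item, sdes_end_chunk and sdes_build_cname_chunk are
   hypothetical, and the buffer is assumed to begin at the start of a
   chunk, i.e., already on a 32-bit boundary within the packet.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { SDES_END = 0, SDES_CNAME = 1 };   /* item types from this section */

/* Append one item (type octet, length octet, text) at buf[pos].
 * Returns the new write position, or 0 if the text or buffer is too big.
 * (Hypothetical helper, named for illustration only.) */
static size_t sdes_put_item(uint8_t *buf, size_t cap, size_t pos,
                            uint8_t type, const char *text)
{
    size_t len = strlen(text);

    if (len > 255 || pos + 2 + len > cap)
        return 0;
    buf[pos++] = type;
    buf[pos++] = (uint8_t)len;        /* counts the text only */
    memcpy(buf + pos, text, len);     /* no null terminator is stored */
    return pos + len;
}

/* End the item list: one null octet (item type 0), then null padding
 * up to the next 32-bit boundary. Returns the chunk size, or 0. */
static size_t sdes_end_chunk(uint8_t *buf, size_t cap, size_t pos)
{
    size_t end = (pos + 1 + 3) & ~(size_t)3;

    if (end > cap)
        return 0;
    memset(buf + pos, 0, end - pos);
    return end;
}

/* Build one chunk holding an SSRC (network byte order) and its CNAME. */
static size_t sdes_build_cname_chunk(uint8_t *buf, size_t cap,
                                     uint32_t ssrc, const char *cname)
{
    size_t pos;

    assert(buf != NULL && cname != NULL);
    if (cap < 4)
        return 0;
    buf[0] = (uint8_t)(ssrc >> 24);
    buf[1] = (uint8_t)(ssrc >> 16);
    buf[2] = (uint8_t)(ssrc >> 8);
    buf[3] = (uint8_t)ssrc;
    pos = sdes_put_item(buf, cap, 4, SDES_CNAME, cname);
    return pos ? sdes_end_chunk(buf, cap, pos) : 0;
}
```

   With the CNAME "doe@192.0.2.89" (14 octets) the items end at octet
   20, so the chunk is closed with four null octets for a total of 24
   octets. A text that happens to end exactly on a 32-bit boundary still
   needs a full word of null octets, since at least one terminating
   octet is required.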

6.5.1 CNAME: Canonical end-point identifier SDES item

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    CNAME=1    |     length    | user and domain name        ...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   The CNAME identifier has the following properties:

   o  Because the randomly allocated SSRC identifier may change if a
      conflict is discovered or if a program is restarted, the CNAME
      item is required to provide the binding from the SSRC
      identifier to an identifier for the source that remains
      constant.

   o  Like the SSRC identifier, the CNAME identifier should also be
      unique among all participants within one RTP session.

   o  To provide a binding across multiple media tools used by one
      participant in a set of related RTP sessions, the CNAME should
      be fixed for that participant.

   o  To facilitate third-party monitoring, the CNAME should be
      suitable for either a program or a person to locate the source.

   Therefore, the CNAME should be derived algorithmically and not
   entered manually, when possible. To meet these requirements, the
   following format should be used unless a profile specifies an
   alternate syntax or semantics. The CNAME item should have the format
   "user@host", or "host" if a user name is not available as on
   single-user systems. For both formats, "host" is either the fully
   qualified domain name of the host from which the real-time data
   originates, formatted according to the rules specified in RFC 1034
   [14], RFC 1035 [15] and Section 2.1 of RFC 1123 [16]; or the standard
   ASCII representation of the host's numeric address on the interface
   used for the RTP communication. For example, the standard ASCII
   representation of an IP Version 4 address is "dotted decimal", also
   known as dotted quad.
   Other address types are expected to have ASCII
   representations that are mutually unique. The fully qualified domain
   name is more convenient for a human observer and may avoid the need
   to send a NAME item in addition, but it may be difficult or
   impossible to obtain reliably in some operating environments.
   Applications that may be run in such environments should use the
   ASCII representation of the address instead.

   Examples are "doe@sleepy.megacorp.com" or "doe@192.0.2.89" for a
   multi-user system. On a system with no user name, examples would be
   "sleepy.megacorp.com" or "192.0.2.89".

   The user name should be in a form that a program such as "finger" or
   "talk" could use, i.e., it typically is the login name rather than
   the personal name. The host name is not necessarily identical to the
   one in the participant's electronic mail address.

   This syntax will not provide unique identifiers for each source if an
   application permits a user to generate multiple sources from one
   host. Such an application would have to rely on the SSRC to further
   identify the source, or the profile for that application would have
   to specify additional syntax for the CNAME identifier.

   If each application creates its CNAME independently, the resulting
   CNAMEs may not be identical as would be required to provide a binding
   across multiple media tools belonging to one participant in a set of
   related RTP sessions. If cross-media binding is required, it may be
   necessary for the CNAME of each tool to be externally configured with
   the same value by a coordination tool.

   Application writers should be aware that private network address
   assignments such as the Net-10 assignment proposed in RFC 1597 [17]
   may create network addresses that are not globally unique.
   This would
   lead to non-unique CNAMEs if hosts with private addresses and no
   direct IP connectivity to the public Internet have their RTP packets
   forwarded to the public Internet through an RTP-level translator.
   (See also RFC 1627 [18].) To handle this case, applications may
   provide a means to configure a unique CNAME, but the burden is on the
   translator to translate CNAMEs from private addresses to public
   addresses if necessary to keep private addresses from being exposed.

6.5.2 NAME: User name SDES item

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     NAME=2    |     length    | common name of source       ...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   This is the real name used to describe the source, e.g., "John Doe,
   Bit Recycler, Megacorp". It may be in any form desired by the user.
   For applications such as conferencing, this form of name may be the
   most desirable for display in participant lists, and therefore might
   be sent most frequently of those items other than CNAME. Profiles may
   establish such priorities. The NAME value is expected to remain
   constant at least for the duration of a session. It should not be
   relied upon to be unique among all participants in the session.

6.5.3 EMAIL: Electronic mail address SDES item

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    EMAIL=3    |     length    | email address of source     ...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   The email address is formatted according to RFC 822 [19], for
   example, "John.Doe@megacorp.com". The EMAIL value is expected to
   remain constant for the duration of a session.

6.5.4 PHONE: Phone number SDES item

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    PHONE=4    |     length    | phone number of source      ...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   The phone number should be formatted with the plus sign replacing the
   international access code. For example, "+1 908 555 1212" for a
   number in the United States.

6.5.5 LOC: Geographic user location SDES item

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     LOC=5     |     length    | geographic location of site ...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Depending on the application, different degrees of detail are
   appropriate for this item. For conference applications, a string like
   "Murray Hill, New Jersey" may be sufficient, while, for an active
   badge system, strings like "Room 2A244, AT&T BL MH" might be
   appropriate. The degree of detail is left to the implementation
   and/or user, but format and content may be prescribed by a profile.
   The LOC value is expected to remain constant for the duration of a
   session, except for mobile hosts.

6.5.6 TOOL: Application or tool name SDES item

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     TOOL=6    |     length    | name/version of source appl. ...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   A string giving the name and possibly version of the application
   generating the stream, e.g., "videotool 1.2". This information may be
   useful for debugging purposes and is similar to the Mailer or
   Mail-System-Version SMTP headers.
   The TOOL value is expected to remain
   constant for the duration of the session.

6.5.7 NOTE: Notice/status SDES item

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     NOTE=7    |     length    | note about the source       ...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   The following semantics are suggested for this item, but these or
   other semantics may be explicitly defined by a profile. The NOTE item
   is intended for transient messages describing the current state of
   the source, e.g., "on the phone, can't talk". Or, during a seminar,
   this item might be used to convey the title of the talk. It should be
   used only to carry exceptional information and should not be included
   routinely by all participants because this would slow down the rate
   at which reception reports and CNAME are sent, thus impairing the
   performance of the protocol. In particular, it should not be included
   as an item in a user's configuration file nor automatically generated
   as in a quote-of-the-day.

   Since the NOTE item may be important to display while it is active,
   the rate at which other non-CNAME items such as NAME are transmitted
   might be reduced so that the NOTE item can take that part of the RTCP
   bandwidth. When the transient message becomes inactive, the NOTE item
   should continue to be transmitted a few times at the same repetition
   rate but with a string of length zero to signal the receivers.
   However, receivers should also consider the NOTE item inactive if it
   is not received for a small multiple of the repetition rate, or
   perhaps 20-30 RTCP intervals.

6.5.8 PRIV: Private extensions SDES item

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     PRIV=8    |     length    | prefix length |prefix string...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   ...             |                  value string               ...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   This item is used to define experimental or application-specific SDES
   extensions. The item contains a prefix consisting of a length-string
   pair, followed by the value string filling the remainder of the item
   and carrying the desired information. The prefix length field is 8
   bits long. The prefix string is a name chosen by the person defining
   the PRIV item to be unique with respect to other PRIV items this
   application might receive. The application creator might choose to
   use the application name plus an additional subtype identification if
   needed. Alternatively, it is recommended that others choose a name
   based on the entity they represent, then coordinate the use of the
   name within that entity.

   Note that the prefix consumes some space within the item's total
   length of 255 octets, so the prefix should be kept as short as
   possible. This facility and the constrained RTCP bandwidth should not
   be overloaded; it is not intended to satisfy all the control
   communication requirements of all applications.

   SDES PRIV prefixes will not be registered by IANA. If some form of
   the PRIV item proves to be of general utility, it should instead be
   assigned a regular SDES item type registered with IANA so that no
   prefix is required. This simplifies use and increases transmission
   efficiency.
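
   To make the space accounting of the PRIV item concrete, the following
   sketch encodes and splits one item. It is an illustration under
   stated assumptions, not a prescribed implementation: the names
   sdes_put_priv and sdes_split_priv and the example prefix "acme.tool"
   are hypothetical. The item length octet is taken to cover the prefix
   length octet, the prefix string and the value string together, which
   is why the three may not exceed 255 octets in total.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { SDES_PRIV = 8 };   /* item type from this section */

/* Encode one PRIV item at buf[0]. Returns the octets written, or 0 if
 * prefix and value do not fit in the 255-octet item or in the buffer.
 * (Hypothetical helper, named for illustration only.) */
static size_t sdes_put_priv(uint8_t *buf, size_t cap,
                            const char *prefix, const char *value)
{
    size_t plen = strlen(prefix);
    size_t vlen = strlen(value);
    size_t body = 1 + plen + vlen;    /* prefix length octet + strings */

    assert(buf != NULL);
    if (body > 255 || 2 + body > cap)
        return 0;
    buf[0] = SDES_PRIV;
    buf[1] = (uint8_t)body;           /* item length */
    buf[2] = (uint8_t)plen;           /* prefix length */
    memcpy(buf + 3, prefix, plen);
    memcpy(buf + 3 + plen, value, vlen);
    return 2 + body;
}

/* Split the body of a received PRIV item (the len octets that follow
 * the type and length octets). Returns 0 on success, -1 if malformed. */
static int sdes_split_priv(const uint8_t *body, size_t len,
                           const uint8_t **prefix, size_t *plen,
                           const uint8_t **value, size_t *vlen)
{
    if (len < 1 || (size_t)body[0] > len - 1)
        return -1;                    /* prefix would overrun the item */
    *plen = body[0];
    *prefix = body + 1;
    *value = body + 1 + *plen;
    *vlen = len - 1 - *plen;          /* value fills the remainder */
    return 0;
}
```

   A receiver that does not recognize the prefix can skip the item using
   the item length alone, so the check in sdes_split_priv matters only
   to applications that actually interpret the value string.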

6.6 BYE: Goodbye RTCP packet

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |V=2|P|    SC   |   PT=BYE=203  |            length             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                           SSRC/CSRC                           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   :                              ...                              :
   +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
   |     length    |               reason for leaving            ... (opt)
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   The BYE packet indicates that one or more sources are no longer
   active.

   version (V), padding (P), length:
      As described for the SR packet (see Section 6.4.1).

   packet type (PT): 8 bits
      Contains the constant 203 to identify this as an RTCP BYE
      packet.

   source count (SC): 5 bits
      The number of SSRC/CSRC identifiers included in this BYE packet.
      A count value of zero is valid, but useless.

   The rules for when a BYE packet should be sent are specified in
   Section 6.3.7.

   If a BYE packet is received by a mixer, the mixer forwards the BYE
   packet with the SSRC/CSRC identifier(s) unchanged. If a mixer shuts
   down, it should send a BYE packet listing all contributing sources it
   handles, as well as its own SSRC identifier. Optionally, the BYE
   packet may include an 8-bit octet count followed by that many octets
   of text indicating the reason for leaving, e.g., "camera malfunction"
   or "RTP loop detected". The string has the same encoding as that
   described for SDES. If the string fills the packet to the next 32-bit
   boundary, the string is not null terminated. If not, the BYE packet
   is padded with null octets to the next 32-bit boundary. This padding
   is separate from that indicated by the P bit in the RTCP header.
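
   The BYE layout and its padding rule can be sketched as follows. This
   is a minimal illustration, not a normative encoder: the function name
   rtcp_build_bye is hypothetical, the P bit is always left clear, and
   the length field is assumed to count 32-bit words minus one as
   described for the SR packet in Section 6.4.1.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { RTCP_PT_BYE = 203, RTCP_MAX_SC = 31 };   /* SC is a 5-bit field */

/* Build a BYE packet for sc sources with an optional reason string
 * (pass NULL for none). Returns the packet size in octets, or 0 on
 * error. (Hypothetical helper, named for illustration only.) */
static size_t rtcp_build_bye(uint8_t *buf, size_t cap,
                             const uint32_t *ssrc, unsigned sc,
                             const char *reason)
{
    size_t rlen = reason ? strlen(reason) : 0;
    size_t size = 4 + 4 * (size_t)sc;           /* header + identifiers */
    size_t pos = 4;
    size_t words;
    unsigned i;

    assert(buf != NULL && (sc == 0 || ssrc != NULL));
    if (sc > RTCP_MAX_SC || rlen > 255)
        return 0;
    if (rlen > 0)                               /* count octet + text,  */
        size += (1 + rlen + 3) & ~(size_t)3;    /* rounded up to a word */
    if (size > cap)
        return 0;

    memset(buf, 0, size);                       /* supplies null padding */
    buf[0] = (uint8_t)(0x80 | sc);              /* V=2, P=0, SC */
    buf[1] = RTCP_PT_BYE;
    words = size / 4 - 1;                       /* length field value */
    buf[2] = (uint8_t)(words >> 8);
    buf[3] = (uint8_t)words;
    for (i = 0; i < sc; i++, pos += 4) {        /* network byte order */
        buf[pos]     = (uint8_t)(ssrc[i] >> 24);
        buf[pos + 1] = (uint8_t)(ssrc[i] >> 16);
        buf[pos + 2] = (uint8_t)(ssrc[i] >> 8);
        buf[pos + 3] = (uint8_t)ssrc[i];
    }
    if (rlen > 0) {
        buf[pos] = (uint8_t)rlen;
        memcpy(buf + pos + 1, reason, rlen);
    }
    return size;
}
```

   For one source and the reason "camera malfunction" (18 octets), the
   count octet and text occupy 19 octets and are padded with one null
   octet, giving a 28-octet packet with a length field of 6. A
   three-octet reason fills its word exactly and receives no padding.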

6.7 APP: Application-defined RTCP packet

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |V=2|P| subtype |   PT=APP=204  |            length             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                           SSRC/CSRC                           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                          name (ASCII)                         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                   application-dependent data                ...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   The APP packet is intended for experimental use as new applications
   and new features are developed, without requiring packet type value
   registration. APP packets with unrecognized names should be ignored.
   After testing and if wider use is justified, it is recommended that
   each APP packet be redefined without the subtype and name fields and
   registered with the Internet Assigned Numbers Authority using an RTCP
   packet type.

   version (V), padding (P), length:
      As described for the SR packet (see Section 6.4.1).

   subtype: 5 bits
      May be used as a subtype to allow a set of APP packets to be
      defined under one unique name, or for any application-dependent
      data.

   packet type (PT): 8 bits
      Contains the constant 204 to identify this as an RTCP APP
      packet.

   name: 4 octets
      A name chosen by the person defining the set of APP packets to
      be unique with respect to other APP packets this application
      might receive. The application creator might choose to use the
      application name, and then coordinate the allocation of subtype
      values to others who want to define new packet types for the
      application. Alternatively, it is recommended that others
      choose a name based on the entity they represent, then
      coordinate the use of the name within that entity.
      The name is
      interpreted as a sequence of four ASCII characters, with
      uppercase and lowercase characters treated as distinct.

   application-dependent data: variable length
      Application-dependent data may or may not appear in an APP
      packet. It is interpreted by the application and not RTP itself.
      It must be a multiple of 32 bits long.

7 RTP Translators and Mixers

   In addition to end systems, RTP supports the notion of "translators"
   and "mixers", which could be considered as "intermediate systems" at
   the RTP level. Although this support adds some complexity to the
   protocol, the need for these functions has been clearly established
   by experiments with multicast audio and video applications in the
   Internet. Example uses of translators and mixers given in Section 2.3
   stem from the presence of firewalls and low bandwidth connections,
   both of which are likely to remain.

7.1 General Description

   An RTP translator/mixer connects two or more transport-level
   "clouds". Typically, each cloud is defined by a common network and
   transport protocol (e.g., IP/UDP) plus a multicast address and
   transport level destination port or a pair of unicast addresses and
   ports. (Network-level protocol translators, such as IP version 4 to
   IP version 6, may be present within a cloud invisibly to RTP.) One
   system may serve as a translator or mixer for a number of RTP
   sessions, but each is considered a logically separate entity.

   In order to avoid creating a loop when a translator or mixer is
   installed, the following rules must be observed:

   o  Each of the clouds connected by translators and mixers
      participating in one RTP session either must be distinct from
      all the others in at least one of these parameters (protocol,
      address, port), or must be isolated at the network level from
      the others.

   o  A derivative of the first rule is that there must not be
      multiple translators or mixers connected in parallel unless by
      some arrangement they partition the set of sources to be
      forwarded.

   Similarly, all RTP end systems that can communicate through one or
   more RTP translators or mixers share the same SSRC space, that is,
   the SSRC identifiers must be unique among all these end systems.
   Section 8.2 describes the collision resolution algorithm by which
   SSRC identifiers are kept unique and loops are detected.

   There may be many varieties of translators and mixers designed for
   different purposes and applications. Some examples are to add or
   remove encryption, change the encoding of the data or the underlying
   protocols, or replicate between a multicast address and one or more
   unicast addresses. The distinction between translators and mixers is
   that a translator passes through the data streams from different
   sources separately, whereas a mixer combines them to form one new
   stream:

   Translator: Forwards RTP packets with their SSRC identifier intact;
      this makes it possible for receivers to identify individual
      sources even though packets from all the sources pass through
      the same translator and carry the translator's network source
      address. Some kinds of translators will pass through the data
      untouched, but others may change the encoding of the data and
      thus the RTP data payload type and timestamp. If multiple data
      packets are re-encoded into one, or vice versa, a translator
      must assign new sequence numbers to the outgoing packets. Losses
      in the incoming packet stream may induce corresponding gaps in
      the outgoing sequence numbers. Receivers cannot detect the
      presence of a translator unless they know by some other means
      what payload type or transport address was used by the original
      source.

   Mixer: Receives streams of RTP data packets from one or more sources,
      possibly changes the data format, combines the streams in some
      manner and then forwards the combined stream. Since the timing
      among multiple input sources will not generally be synchronized,
      the mixer will make timing adjustments among the streams and
      generate its own timing for the combined stream, so it is the
      synchronization source. Thus, all data packets forwarded by a
      mixer will be marked with the mixer's own SSRC identifier. In
      order to preserve the identity of the original sources
      contributing to the mixed packet, the mixer should insert their
      SSRC identifiers into the CSRC identifier list following the
      fixed RTP header of the packet. A mixer that is also itself a
      contributing source for some packet should explicitly include
      its own SSRC identifier in the CSRC list for that packet.

      For some applications, it may be acceptable for a mixer not to
      identify sources in the CSRC list. However, this introduces the
      danger that loops involving those sources could not be detected.

   The advantage of a mixer over a translator for applications like
   audio is that the output bandwidth is limited to that of one source
   even when multiple sources are active on the input side. This may be
   important for low-bandwidth links. The disadvantage is that receivers
   on the output side don't have any control over which sources are
   passed through or muted, unless some mechanism is implemented for
   remote control of the mixer. The regeneration of synchronization
   information by mixers also means that receivers can't do inter-media
   synchronization of the original streams. A multi-media mixer could do
   it.
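
   The way a mixer marks its output, as described above, can be sketched
   as follows: the mixer's own identifier goes in the SSRC field and the
   identifiers of the contributing sources follow in the CSRC list. This
   is an illustrative sketch only. The function name
   rtp_write_mixed_header is hypothetical, and the header layout used
   (12 fixed octets with a 4-bit CSRC count, hence at most 15 entries)
   is assumed from the fixed RTP header definition elsewhere in this
   document. The P, X and M bits are left clear for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { RTP_MAX_CSRC = 15 };   /* CC is a 4-bit field */

/* Store a 32-bit value in network byte order. */
static void put_u32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}

/* Write the RTP header of a mixed packet: mixer SSRC plus CSRC list.
 * Returns the header size in octets, or 0 on error.
 * (Hypothetical helper, named for illustration only.) */
static size_t rtp_write_mixed_header(uint8_t *buf, size_t cap,
                                     uint8_t payload_type, uint16_t seq,
                                     uint32_t timestamp,
                                     uint32_t mixer_ssrc,
                                     const uint32_t *csrc, unsigned cc)
{
    size_t size = 12 + 4 * (size_t)cc;
    unsigned i;

    assert(buf != NULL && (cc == 0 || csrc != NULL));
    if (cc > RTP_MAX_CSRC || payload_type > 127 || size > cap)
        return 0;
    buf[0] = (uint8_t)(0x80 | cc);      /* V=2, P=0, X=0, CC */
    buf[1] = payload_type;              /* M=0, 7-bit payload type */
    buf[2] = (uint8_t)(seq >> 8);
    buf[3] = (uint8_t)seq;
    put_u32(buf + 4, timestamp);        /* the mixer's own timing */
    put_u32(buf + 8, mixer_ssrc);       /* mixer is the sync source */
    for (i = 0; i < cc; i++)            /* original sources' SSRCs */
        put_u32(buf + 12 + 4 * (size_t)i, csrc[i]);
    return size;
}
```

   Called with a mixer SSRC of 48 and the CSRC list {1, 17}, the sketch
   yields a 20-octet header. A mixer combining more sources than the
   CSRC list can hold has to choose which ones to list, which is one
   reason loop detection may be weakened when sources go unidentified.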

       [E1]                                    [E6]
        |                                       |
  E1:17 |                                 E6:15 |
        |                                       |   E6:15
        V  M1:48 (1,17)         M1:48 (1,17)    V   M1:48 (1,17)
       (M1)-------------><T1>-----------------><T2>-------------->[E7]
        ^                 ^     E4:47           ^   E4:47
   E2:1 |           E4:47 |                     |   M3:89 (64,45)
        |                 |                     |
       [E2]              [E4]     M3:89 (64,45) |
                                                |        legend:
   [E3] --------->(M2)----------->(M3)------------|        [End system]
          E3:64        M2:12 (64)  ^                       (Mixer)
                                   | E5:45                 <Translator>
                                   |
                                  [E5]          source: SSRC (CSRCs)
                                                ------------------->

   Figure 3: Sample RTP network with end systems, mixers and translators

   A collection of mixers and translators is shown in Figure 3 to
   illustrate their effect on SSRC and CSRC identifiers. In the figure,
   end systems are shown as rectangles (named E), translators as
   triangles (named T) and mixers as ovals (named M). The notation
   "M1:48 (1,17)" designates a packet originating from mixer M1,
   identified with M1's (random) SSRC value of 48 and two CSRC
   identifiers, 1 and 17, copied from the SSRC identifiers of packets
   from E2 and E1.

7.2 RTCP Processing in Translators

   In addition to forwarding data packets, perhaps modified, translators
   and mixers must also process RTCP packets. In many cases, they will
   take apart the compound RTCP packets received from end systems to
   aggregate SDES information and to modify the SR or RR packets.
   Retransmission of this information may be triggered by the packet
   arrival or by the RTCP interval timer of the translator or mixer
   itself.

   A translator that does not modify the data packets, for example one
   that just replicates between a multicast address and a unicast
   address, may simply forward RTCP packets unmodified as well. A
   translator that transforms the payload in some way must make
   corresponding transformations in the SR and RR information so that it
   still reflects the characteristics of the data and the reception
   quality.
   These translators must not simply forward RTCP packets. In
   general, a translator should not aggregate SR and RR packets from
   different sources into one packet since that would reduce the
   accuracy of the propagation delay measurements based on the LSR and
   DLSR fields.

   SR sender information: A translator does not generate its own sender
        information, but forwards the SR packets received from one cloud
        to the others. The SSRC is left intact but the sender
        information must be modified if required by the translation. If
        a translator changes the data encoding, it must change the
        "sender's byte count" field. If it also combines several data
        packets into one output packet, it must change the "sender's
        packet count" field. If it changes the timestamp frequency, it
        must change the "RTP timestamp" field in the SR packet.

   SR/RR reception report blocks: A translator forwards reception
        reports received from one cloud to the others. Note that these
        flow in the direction opposite to the data. The SSRC is left
        intact. If a translator combines several data packets into one
        output packet, and therefore changes the sequence numbers, it
        must make the inverse manipulation for the packet loss fields
        and the "extended last sequence number" field. This may be
        complex. In the extreme case, there may be no meaningful way to
        translate the reception reports, so the translator may pass on
        no reception report at all or a synthetic report based on its
        own reception. The general rule is to do what makes sense for a
        particular translation.

   A translator does not require an SSRC identifier of its own, but may
   choose to allocate one for the purpose of sending reports about what
   it has received.
   These would be sent to all the connected clouds,
   each corresponding to the translation of the data stream as sent to
   that cloud, since reception reports are normally multicast to all
   participants.

   SDES: Translators typically forward without change the SDES
        information they receive from one cloud to the others, but may,
        for example, decide to filter non-CNAME SDES information if
        bandwidth is limited. The CNAMEs must be forwarded to allow SSRC
        identifier collision detection to work. A translator that
        generates its own RR packets must send SDES CNAME information
        about itself to the same clouds that it sends those RR packets.

   BYE: Translators forward BYE packets unchanged. A translator that is
        about to cease forwarding packets should send a BYE packet to
        each connected cloud containing all the SSRC identifiers that
        were previously being forwarded to that cloud, including the
        translator's own SSRC identifier if it sent reports of its own.

   APP: Translators forward APP packets unchanged.

7.3 RTCP Processing in Mixers

   Since a mixer generates a new data stream of its own, it does not
   pass through SR or RR packets at all and instead generates new
   information for both sides.

   SR sender information: A mixer does not pass through sender
        information from the sources it mixes because the
        characteristics of the source streams are lost in the mix. As a
        synchronization source, the mixer generates its own SR packets
        with sender information about the mixed data stream and sends
        them in the same direction as the mixed stream.

   SR/RR reception report blocks: A mixer generates its own reception
        reports for sources in each cloud and sends them out only to the
        same cloud.
        It does not send these reception reports to the
        other clouds and does not forward reception reports from one
        cloud to the others because the sources would not be SSRCs there
        (only CSRCs).

   SDES: Mixers typically forward without change the SDES information
        they receive from one cloud to the others, but may, for example,
        decide to filter non-CNAME SDES information if bandwidth is
        limited. The CNAMEs must be forwarded to allow SSRC identifier
        collision detection to work. (An identifier in a CSRC list
        generated by a mixer might collide with an SSRC identifier
        generated by an end system.) A mixer must send SDES CNAME
        information about itself to the same clouds that it sends SR or
        RR packets.

   Since mixers do not forward SR or RR packets, they will typically be
   extracting SDES packets from a compound RTCP packet. To minimize
   overhead, chunks from the SDES packets may be aggregated into a
   single SDES packet which is then stacked on an SR or RR packet
   originating from the mixer. The RTCP packet rate may be different on
   each side of the mixer.

   A mixer that does not insert CSRC identifiers may also refrain from
   forwarding SDES CNAMEs. In this case, the SSRC identifier spaces in
   the two clouds are independent. As mentioned earlier, this mode of
   operation creates a danger that loops can't be detected.

   BYE: Mixers need to forward BYE packets. A mixer that is about to
        cease forwarding packets should send a BYE packet to each
        connected cloud containing all the SSRC identifiers that were
        previously being forwarded to that cloud, including the mixer's
        own SSRC identifier if it sent reports of its own.

   APP: The treatment of APP packets by mixers is application-specific.

7.4 Cascaded Mixers

   An RTP session may involve a collection of mixers and translators as
   shown in Figure 3.
   If two mixers are cascaded, such as M2 and M3 in
   the figure, packets received by a mixer may already have been mixed
   and may include a CSRC list with multiple identifiers. The second
   mixer should build the CSRC list for the outgoing packet using the
   CSRC identifiers from already-mixed input packets and the SSRC
   identifiers from unmixed input packets. This is shown in the output
   arc from mixer M3 labeled M3:89(64,45) in the figure. As in the case
   of mixers that are not cascaded, if the resulting CSRC list has more
   than 15 identifiers, the remainder cannot be included.

8 SSRC Identifier Allocation and Use

   The SSRC identifier carried in the RTP header and in various fields
   of RTCP packets is a random 32-bit number that is required to be
   globally unique within an RTP session. It is crucial that the number
   be chosen with care in order that participants on the same network or
   starting at the same time are not likely to choose the same number.

   It is not sufficient to use the local network address (such as an
   IPv4 address) for the identifier because the address may not be
   unique. Since RTP translators and mixers enable interoperation among
   multiple networks with different address spaces, the allocation
   patterns for addresses within two spaces might result in a much
   higher rate of collision than would occur with random allocation.

   Multiple sources running on one host would also conflict.

   It is also not sufficient to obtain an SSRC identifier simply by
   calling random() without carefully initializing the state. An example
   of how to generate a random identifier is presented in Appendix A.6.

8.1 Probability of Collision

   Since the identifiers are chosen randomly, it is possible that two or
   more sources will choose the same number.
   Collision occurs with the
   highest probability when all sources are started simultaneously, for
   example when triggered automatically by some session management
   event. If N is the number of sources and L the length of the
   identifier (here, 32 bits), the probability that two sources
   independently pick the same value can be approximated for large N
   [20] as 1 - exp(-N**2 / 2**(L+1)). For N=1000, the probability is
   roughly 10**-4.

   The typical collision probability is much lower than the worst-case
   above. When one new source joins an RTP session in which all the
   other sources already have unique identifiers, the probability of
   collision is just the fraction of numbers used out of the space.
   Again, if N is the number of sources and L the length of the
   identifier, the probability of collision is N / 2**L. For N=1000, the
   probability is roughly 2*10**-7.

   The probability of collision is further reduced by the opportunity
   for a new source to receive packets from other participants before
   sending its first packet (either data or control). If the new source
   keeps track of the other participants (by SSRC identifier), then
   before transmitting its first packet the new source can verify that
   its identifier does not conflict with any that have been received, or
   else choose again.

8.2 Collision Resolution and Loop Detection

   Although the probability of SSRC identifier collision is low, all RTP
   implementations must be prepared to detect collisions and take the
   appropriate actions to resolve them. If a source discovers at any
   time that another source is using the same SSRC identifier as its
   own, it must send an RTCP BYE packet for the old identifier and
   choose another random one. (As explained below, this step is taken
   only once in case of a loop.)
   If a receiver discovers that two other
   sources are colliding, it may keep the packets from one and discard
   the packets from the other when this can be detected by different
   source transport addresses or CNAMEs. The two sources are expected to
   resolve the collision so that the situation doesn't last.

   Because the random SSRC identifiers are kept globally unique for each
   RTP session, they can also be used to detect loops that may be
   introduced by mixers or translators. A loop causes duplication of
   data and control information, either unmodified or possibly mixed, as
   in the following examples:

   o A translator may incorrectly forward a packet to the same
     multicast group from which it has received the packet, either
     directly or through a chain of translators. In that case, the
     same packet appears several times, originating from different
     network sources.

   o Two translators incorrectly set up in parallel, i.e., with the
     same multicast groups on both sides, would both forward packets
     from one multicast group to the other. Unidirectional
     translators would produce two copies; bidirectional translators
     would form a loop.

   o A mixer can close a loop by sending to the same transport
     destination upon which it receives packets, either directly or
     through another mixer or translator. In this case a source
     might show up both as an SSRC on a data packet and a CSRC in a
     mixed data packet.

   A source may discover that its own packets are being looped, or that
   packets from another source are being looped (a third-party loop).

   Both loops and collisions in the random selection of a source
   identifier result in packets arriving with the same SSRC identifier
   but a different source transport address, which may be that of the
   end system originating the packet or an intermediate system.
   Therefore, if a source changes its source transport address, it must
   also choose a new SSRC identifier to avoid being interpreted as a
   looped source. Note that if a translator restarts and consequently
   changes the source transport address (e.g., changes the UDP source
   port number) on which it forwards packets, then all those packets
   will appear to receivers to be looped because the SSRC identifiers
   are applied by the original source and will not change. This problem
   may be avoided by keeping the source transport address fixed across
   restarts, but in any case will be resolved after a timeout at the
   receivers.

   Loops or collisions occurring on the far side of a translator or
   mixer cannot be detected using the source transport address if all
   copies of the packets go through the translator or mixer; however,
   collisions may still be detected when chunks from two RTCP SDES
   packets contain the same SSRC identifier but different CNAMEs.

   To detect and resolve these conflicts, an RTP implementation must
   include an algorithm similar to the one described below. It ignores
   packets from a new source or loop that collide with an established
   source. It resolves collisions with the participant's own SSRC
   identifier by sending an RTCP BYE for the old identifier and choosing
   a new one. However, when the collision was induced by a loop of the
   participant's own packets, the algorithm will choose a new identifier
   only once and thereafter ignore packets from the looping source
   transport address. This is required to avoid a flood of BYE packets.

   This algorithm requires keeping a table indexed by the source
   identifier and containing the source transport addresses from the
   first RTP packet and first RTCP packet received with that identifier,
   along with other state for that source.
   Two source transport
   addresses are required since, for example, the UDP source port
   numbers may be different on RTP and RTCP packets. However, it may be
   assumed that the network address is the same in both source transport
   addresses.

   Each SSRC or CSRC identifier received in an RTP or RTCP packet is
   looked up in the source identifier table in order to process that
   data or control information. The source transport address from the
   packet is compared to the corresponding source transport address in
   the table to detect a loop or collision if they don't match. For
   control packets, each element with its own SSRC id, for example an
   SDES chunk, requires a separate lookup. (The SSRC id in a reception
   report block is an exception because it identifies a source heard by
   the reporter, and that SSRC id is unrelated to the source transport
   address of the RTCP packet sent by the reporter.) If the SSRC or
   CSRC is not found, a new entry is created. These table entries are
   removed when an RTCP BYE packet is received with the corresponding
   SSRC id and validated by a matching source transport address, or
   after no packets have arrived for a relatively long time (see Section
   6.3).

   Note that if two sources on the same host are transmitting with the
   same source identifier at the time a receiver begins operation, it
   would be possible that the first RTP packet received came from one of
   the sources while the first RTCP packet received came from the other.
   This would cause the wrong RTCP information to be associated with the
   RTP data, but this situation should be sufficiently rare and harmless
   that it may be disregarded.

   In order to track loops of the participant's own data packets, it is
   also necessary to keep a separate list of source transport addresses
   (not identifiers) that have been found to be conflicting.
   As in the
   source identifier table, two source transport addresses must be kept
   to separately track conflicting RTP and RTCP packets. Note that the
   conflicting address list should be short, usually empty. Each
   element in this list stores the source addresses plus the time when
   the most recent conflicting packet was received. An element may be
   removed from the list when no conflicting packet has arrived from
   that source for a time on the order of 10 RTCP report intervals (see
   Section 6.2).

   For the algorithm as shown, it is assumed that the participant's own
   source identifier and state are included in the source identifier
   table. The algorithm could be restructured to first make a separate
   comparison against the participant's own source identifier.

       IF the SSRC or CSRC identifier is not found in the source
          identifier table:
       THEN create a new entry storing the data or control source
            transport address, the SSRC or CSRC id and other state.
            CONTINUE with normal processing.

       (identifier is found in the table)

       IF the table entry was created on receipt of a control packet
          and this is the first data packet or vice versa:
       THEN store the source transport address from this packet.
            CONTINUE with normal processing.
       IF the source transport address from the packet matches
          the one saved in the table entry for this identifier:
       THEN CONTINUE with normal processing.

       (an identifier collision or a loop is indicated)

       IF the source identifier is not the participant's own:
       THEN IF the source identifier is from an RTCP SDES chunk
               containing a CNAME item that differs from the CNAME
               in the table entry:
            THEN (optionally) count a third-party collision.
            ELSE (optionally) count a third-party loop.
            ABORT processing of data packet or control element.

       (a collision or loop of the participant's own packets)

       IF the source transport address is found in the list of
          conflicting data or control source transport addresses:
       THEN IF the source identifier is not from an RTCP SDES chunk
               containing a CNAME item OR if that CNAME is the
               participant's own:
            THEN (optionally) count occurrence of own traffic looped.
                 mark current time in conflicting address list entry.
                 ABORT processing of data packet or control element.
       log occurrence of a collision.
       create a new entry in the conflicting data or control source
          transport address list and mark current time.
       send an RTCP BYE packet with the old SSRC identifier.
       choose a new identifier.
       create a new entry in the source identifier table with the
          old SSRC plus the source transport address from the data
          or control packet being processed.
       CONTINUE with normal processing.

   In this algorithm, packets from a newly conflicting source address
   will be ignored and packets from the original source will be kept.
   (If the original source was through a mixer and later the same source
   is received directly, the receiver may be well advised to switch
   unless other sources in the mix would be lost.) If no packets arrive
   from the original source for an extended period, the table entry will
   be timed out and the new source will be able to take over. This might
   occur if the original source detects the collision and moves to a new
   source identifier, but in the usual case an RTCP BYE packet will be
   received from the original source to delete the state without having
   to wait for a timeout.

   When a new SSRC identifier is chosen due to a collision, the
   candidate identifier should first be looked up in the source
   identifier table to see if it was already in use by some other
   source.
   If so, another candidate should be generated and the process
   repeated.

   A loop of data packets to a multicast destination can cause severe
   network flooding. All mixers and translators are required to
   implement a loop detection algorithm like the one here so that they
   can break loops. This should limit the excess traffic to no more than
   one duplicate copy of the original traffic, which may allow the
   session to continue so that the cause of the loop can be found and
   fixed. However, in extreme cases where a mixer or translator does not
   properly break the loop and high traffic levels result, it may be
   necessary for end systems to cease transmitting data or control
   packets entirely. This decision may depend upon the application. An
   error condition should be indicated as appropriate. Transmission
   might be attempted again periodically after a long, random time (on
   the order of minutes).

8.3 Use with Layered Encodings

   For layered encodings transmitted on separate RTP sessions (see
   Section 2.4), a single SSRC identifier space should be used across
   the sessions of all layers and the core (base) layer should be used
   for SSRC identifier allocation and collision resolution. When a
   source discovers that it has collided, it transmits an RTCP BYE
   message on only the base layer but changes the SSRC identifier to the
   new value in all layers.

9 Security

   Lower layer protocols may eventually provide all the security
   services that may be desired for applications of RTP, including
   authentication, integrity, and confidentiality. These services have
   recently been specified for IP.
   Since the need for a confidentiality
   service is well established in the initial audio and video
   applications that are expected to use RTP, a confidentiality service
   is defined in the next section for use with RTP and RTCP until lower
   layer services are available. The overhead on the protocol for this
   service is low, so the penalty will be minimal if this service is
   obsoleted by lower layer services in the future.

   Alternatively, other services, other implementations of services and
   other algorithms may be defined for RTP in the future if warranted.
   The selection presented here is meant to simplify implementation of
   interoperable, secure applications and provide guidance to
   implementors. No claim is made that the methods presented here are
   appropriate for a particular security need. A profile may specify
   which services and algorithms should be offered by applications, and
   may provide guidance as to their appropriate use.

   Key distribution and certificates are outside the scope of this
   document.

9.1 Confidentiality

   Confidentiality means that only the intended receiver(s) can decode
   the received packets; for others, the packet contains no useful
   information. Confidentiality of the content is achieved by
   encryption.

   When encryption of RTP or RTCP is desired, all the octets that will
   be encapsulated for transmission in a single lower-layer packet are
   encrypted as a unit. For RTCP, a 32-bit random number is prepended to
   the unit before encryption to deter known plaintext attacks. For RTP,
   no prefix is required because the sequence number and timestamp
   fields are initialized with random offsets.

   For RTCP, it is allowed to split a compound RTCP packet into two
   lower-layer packets, one to be encrypted and one to be sent in the
   clear.
   For example, SDES information might be encrypted while
   reception reports were sent in the clear to accommodate third-party
   monitors that are not privy to the encryption key. In this example,
   depicted in Fig. 4, the SDES information must be appended to an RR
   packet with no reports (and then encrypted) to satisfy the
   requirement that all compound RTCP packets begin with an SR or RR
   packet.

                UDP packet                        UDP packet
   -------------------------------------  -------------------------
   [32-bit ][       ][     #           ]  [    # sender # receiver]
   [random ][  RR   ][SDES # CNAME, ...]  [ SR # report # report  ]
   [integer][(empty)][     #           ]  [    #        #         ]
   -------------------------------------  -------------------------
                 encrypted                       not encrypted

   #: SSRC

           Figure 4: Encrypted and non-encrypted RTCP packets

   The presence of encryption and the use of the correct key are
   confirmed by the receiver through header or payload validity checks.
   Examples of such validity checks for RTP and RTCP headers are given
   in Appendices A.1 and A.2.

   The default encryption algorithm is the Data Encryption Standard
   (DES) algorithm in cipher block chaining (CBC) mode, as described in
   Section 1.1 of RFC 1423 [21], except that padding to a multiple of 8
   octets is indicated as described for the P bit in Section 5.1. The
   initialization vector is zero because random values are supplied in
   the RTP header or by the random prefix for compound RTCP packets. For
   details on the use of CBC initialization vectors, see [22].
   Implementations that support encryption should always support the DES
   algorithm in CBC mode as the default to maximize interoperability.
   This method is chosen because it has been demonstrated to be easy and
   practical to use in experimental audio and video tools in operation
   on the Internet. Other encryption algorithms may be specified
   dynamically for a session by non-RTP means.
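
   The padding step that precedes DES-CBC encryption of an RTP packet
   can be sketched in C as follows. This is an illustration under
   stated assumptions, not code from this specification: the function
   name rtp_pad_for_cbc is hypothetical, the packet is assumed not to
   carry padding already, and the encryption itself is left out. The
   sketch relies on the P bit being the 0x20 bit of the first header
   octet and on the last padding octet holding the padding count. RTCP
   is not covered here, since its padding also affects the length field
   of the last packet in the compound packet.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DES_BLOCK      8     /* DES block size in octets */
#define RTP_P_BIT      0x20  /* padding flag in the first header octet */
#define RTP_MIN_HEADER 12    /* size of the fixed RTP header */

/* Pad the RTP packet in buf (len octets used, cap octets available)
 * to a multiple of 8 octets. Returns the new length, or 0 if the
 * packet is too short or the buffer too small. */
size_t rtp_pad_for_cbc(uint8_t *buf, size_t len, size_t cap)
{
    if (len < RTP_MIN_HEADER)
        return 0;

    size_t pad = (DES_BLOCK - len % DES_BLOCK) % DES_BLOCK;
    if (pad == 0)
        return len;                     /* already block aligned */
    if (len + pad > cap)
        return 0;

    memset(buf + len, 0, pad - 1);      /* filler octets */
    buf[len + pad - 1] = (uint8_t)pad;  /* count includes this octet */
    buf[0] |= RTP_P_BIT;                /* signal padding to receiver */
    return len + pad;
}
```

   A 13-octet packet, for example, grows to 16 octets with a final
   octet of 3; the receiver strips that many octets after decryption.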

   As an alternative to encryption at the RTP level as described above,
   profiles may define additional payload types for encrypted encodings.
   Those encodings must specify how padding and other aspects of the
   encryption should be handled. This method allows encrypting only the
   data while leaving the headers in the clear for applications where
   that is desired. It may be particularly useful for hardware devices
   that will handle both decryption and decoding.

9.2 Authentication and Message Integrity

   Authentication and message integrity are not defined in the current
   specification of RTP since these services would not be directly
   feasible without a key management infrastructure. It is expected that
   authentication and integrity services will be provided by lower layer
   protocols in the future.

10 RTP over Network and Transport Protocols

   This section describes issues specific to carrying RTP packets within
   particular network and transport protocols. The following rules apply
   unless superseded by protocol-specific definitions outside this
   specification.

   RTP relies on the underlying protocol(s) to provide demultiplexing of
   RTP data and RTCP control streams. For UDP and similar protocols, RTP
   uses an even port number and the corresponding RTCP stream uses the
   next higher (odd) port number. If an application is supplied with an
   odd number for use as the RTP port, it should replace this number
   with the next lower (even) number.

   In a unicast session, applications should be prepared to receive RTP
   data and control on one port pair and send to another.

   It is recommended that layered encoding applications (see Section
   2.4) use a set of contiguous port numbers.
   Ports must be distinct
   because of a widespread deficiency in existing operating systems that
   prevents use of the same port with multiple multicast addresses, and
   for unicast, there is only one permissible address. Thus for layer n,
   the data port is P + 2n, and the control port is P + 2n + 1. When IP
   multicast is used, the addresses must also be distinct because
   multicast routing and group membership are managed on an address
   granularity. However, allocation of contiguous IP multicast addresses
   cannot be assumed because some groups may require different scopes
   and may therefore be allocated from different address ranges.

   RTP data packets contain no length field or other delineation;
   therefore RTP relies on the underlying protocol(s) to provide a
   length indication. The maximum length of RTP packets is limited only
   by the underlying protocols.

   If RTP packets are to be carried in an underlying protocol that
   provides the abstraction of a continuous octet stream rather than
   messages (packets), an encapsulation of the RTP packets must be
   defined to provide a framing mechanism. Framing is also needed if the
   underlying protocol may contain padding so that the extent of the RTP
   payload cannot be determined. The framing mechanism is not defined
   here.

   A profile may specify a framing method to be used even when RTP is
   carried in protocols that do provide framing in order to allow
   carrying several RTP packets in one lower-layer protocol data unit,
   such as a UDP packet. Carrying several RTP packets in one network or
   transport packet reduces header overhead and may simplify
   synchronization between different streams.

11 Summary of Protocol Constants

   This section contains a summary listing of the constants defined in
   this specification.

   The RTP payload type (PT) constants are defined in profiles rather
   than this document.
   However, the octet of the RTP header which
   contains the marker bit(s) and payload type must avoid the reserved
   values 200 and 201 (decimal) to distinguish RTP packets from the RTCP
   SR and RR packet types for the header validation procedure described
   in Appendix A.1. For the standard definition of one marker bit and a
   7-bit payload type field as shown in this specification, this
   restriction means that payload types 72 and 73 are reserved.

11.1 RTCP packet types

   abbrev.    name                   value
   SR         sender report            200
   RR         receiver report          201
   SDES       source description       202
   BYE        goodbye                  203
   APP        application-defined      204

   These type values were chosen in the range 200-204 for improved
   header validity checking of RTCP packets compared to RTP packets or
   other unrelated packets. When the RTCP packet type field is compared
   to the corresponding octet of the RTP header, this range corresponds
   to the marker bit being 1 (which it usually is not in data packets)
   and to the high bit of the standard payload type field being 1 (since
   the static payload types are typically defined in the low half). This
   range was also chosen to be some distance numerically from 0 and 255
   since all-zeros and all-ones are common data patterns.

   Since all compound RTCP packets must begin with SR or RR, these codes
   were chosen as an even/odd pair to allow the RTCP validity check to
   test the maximum number of bits with mask and value.

   Other constants are assigned by IANA. Experimenters are encouraged to
   register the numbers they need for experiments, and then unregister
   those which prove to be unneeded.

11.2 SDES types

   abbrev.    name                              value
   END        end of SDES list                      0
   CNAME      canonical name                        1
   NAME       user name                             2
   EMAIL      user's electronic mail address        3
   PHONE      user's phone number                   4
   LOC        geographic user location              5
   TOOL       name of application or tool           6
   NOTE       notice about the source               7
   PRIV       private extensions                    8

   Other constants are assigned by IANA. Experimenters are encouraged to
   register the numbers they need for experiments, and then unregister
   those which prove to be unneeded.

12 RTP Profiles and Payload Format Specifications

   A complete specification of RTP for a particular application will
   require one or more companion documents of two types described here:
   profiles, and payload format specifications.

   RTP may be used for a variety of applications with somewhat differing
   requirements. The flexibility to adapt to those requirements is
   provided by allowing multiple choices in the main protocol
   specification, then selecting the appropriate choices or defining
   extensions for a particular environment and class of applications in
   a separate profile document. Typically an application will operate
   under only one profile so there is no explicit indication of which
   profile is in use. A profile for audio and video applications may be
   found in the companion RFC 1890 (updated by Internet-Draft
   draft-ietf-avt-profile-new). Profiles are typically titled "RTP
   Profile for ...".

   The second type of companion document is a payload format
   specification, which defines how a particular kind of payload data,
   such as H.261 encoded video, should be carried in RTP. These
   documents are typically titled "RTP Payload Format for XYZ
   Audio/Video Encoding". Payload formats may be useful under multiple
   profiles and may therefore be defined independently of any particular
   profile.
   The profile documents are then responsible for assigning a default
   mapping of that format to a payload type value if needed.

   Within this specification, the following items have been identified
   for possible definition within a profile, but this list is not
   meant to be exhaustive:

   RTP data header: The octet in the RTP data header that contains
      the marker bit and payload type field may be redefined by a
      profile to suit different requirements, for example with more or
      fewer marker bits (Section 5.3, p. 14).

   Payload types: Assuming that a payload type field is included, the
      profile will usually define a set of payload formats (e.g.,
      media encodings) and a default static mapping of those formats
      to payload type values. Some of the payload formats may be
      defined by reference to separate payload format specifications.
      For each payload type defined, the profile must specify the RTP
      timestamp clock rate to be used (Section 5.1, p. 13).

   RTP data header additions: Additional fields may be appended to the
      fixed RTP data header if some additional functionality is
      required across the profile's class of applications independent
      of payload type (Section 5.3, p. 14).

   RTP data header extensions: The contents of the first 16 bits of
      the RTP data header extension structure must be defined if use
      of that mechanism is to be allowed under the profile for
      implementation-specific extensions (Section 5.3.1, p. 15).

   RTCP packet types: New application-class-specific RTCP packet types
      may be defined and registered with IANA.

   RTCP report interval: A profile should specify that the values
      suggested in Section 6.2 for the constants employed in the
      calculation of the RTCP report interval will be used. Those are
      the RTCP fraction of session bandwidth, the minimum report
      interval, and the bandwidth split between senders and receivers.
      A profile may specify alternate values if they have been
      demonstrated to work in a scalable manner.

   SR/RR extension: An extension section may be defined for the RTCP
      SR and RR packets if there is additional information that should
      be reported regularly about the sender or receivers (Section
      6.4.3, p. 31).

   SDES use: The profile may specify the relative priorities for RTCP
      SDES items to be transmitted or excluded entirely (Section
      6.3.9); an alternate syntax or semantics for the CNAME item
      (Section 6.5.1); the format of the LOC item (Section 6.5.5); the
      semantics and use of the NOTE item (Section 6.5.7); or new SDES
      item types to be registered with IANA.

   Security: A profile may specify which security services and
      algorithms should be offered by applications, and may provide
      guidance as to their appropriate use (Section 9, p. 46).

   String-to-key mapping: A profile may specify how a user-provided
      password or pass phrase is mapped into an encryption key.

   Underlying protocol: Use of a particular underlying network or
      transport layer protocol to carry RTP packets may be required.

   Transport mapping: A mapping of RTP and RTCP to transport-level
      addresses, e.g., UDP ports, other than the standard mapping
      defined in Section 10 (p. 48) may be specified.

   Encapsulation: An encapsulation of RTP packets may be defined to
      allow multiple RTP data packets to be carried in one lower-layer
      packet or to provide framing over underlying protocols that do
      not already do so (Section 10, p. 48).

   It is not expected that a new profile will be required for every
   application. Within one application class, it would be better to
   extend an existing profile rather than make a new one in order to
   facilitate interoperation among the applications since each will
   typically run under only one profile.
   Simple extensions such as the definition of additional payload type
   values or RTCP packet types may be accomplished by registering them
   through the Internet Assigned Numbers Authority and publishing
   their descriptions in an addendum to the profile or in a payload
   format specification.

A Algorithms

   We provide examples of C code for aspects of RTP sender and receiver
   algorithms. There may be other implementation methods that are
   faster in particular operating environments or have other
   advantages. These implementation notes are for informational
   purposes only and are meant to clarify the RTP specification.

   The following definitions are used for all examples; for clarity
   and brevity, the structure definitions are only valid for 32-bit
   big-endian (most significant octet first) architectures. Bit fields
   are assumed to be packed tightly in big-endian bit order, with no
   additional padding. Modifications would be required to construct a
   portable implementation.

   /*
    * rtp.h  --  RTP header file (RFC XXXX)
    */
   #include <sys/types.h>

   /*
    * The type definitions below are valid for 32-bit architectures and
    * may have to be adjusted for 16- or 64-bit architectures.
    */
   typedef unsigned char  u_int8;
   typedef unsigned short u_int16;
   typedef unsigned int   u_int32;
   typedef          short int16;

   /*
    * Current protocol version.
    */
   #define RTP_VERSION    2

   #define RTP_SEQ_MOD (1<<16)
   #define RTP_MAX_SDES 255      /* maximum text length for SDES */

   typedef enum {
       RTCP_SR   = 200,
       RTCP_RR   = 201,
       RTCP_SDES = 202,
       RTCP_BYE  = 203,
       RTCP_APP  = 204
   } rtcp_type_t;

   typedef enum {
       RTCP_SDES_END   = 0,
       RTCP_SDES_CNAME = 1,
       RTCP_SDES_NAME  = 2,
       RTCP_SDES_EMAIL = 3,
       RTCP_SDES_PHONE = 4,
       RTCP_SDES_LOC   = 5,
       RTCP_SDES_TOOL  = 6,
       RTCP_SDES_NOTE  = 7,
       RTCP_SDES_PRIV  = 8
   } rtcp_sdes_type_t;

   /*
    * RTP data header
    */
   typedef struct {
       unsigned int version:2;   /* protocol version */
       unsigned int p:1;         /* padding flag */
       unsigned int x:1;         /* header extension flag */
       unsigned int cc:4;        /* CSRC count */
       unsigned int m:1;         /* marker bit */
       unsigned int pt:7;        /* payload type */
       u_int16 seq;              /* sequence number */
       u_int32 ts;               /* timestamp */
       u_int32 ssrc;             /* synchronization source */
       u_int32 csrc[1];          /* optional CSRC list */
   } rtp_hdr_t;

   /*
    * RTCP common header word
    */
   typedef struct {
       unsigned int version:2;   /* protocol version */
       unsigned int p:1;         /* padding flag */
       unsigned int count:5;     /* varies by packet type */
       unsigned int pt:8;        /* RTCP packet type */
       u_int16 length;           /* pkt len in words, w/o this word */
   } rtcp_common_t;

   /*
    * Big-endian mask for version, padding bit and packet type pair
    */
   #define RTCP_VALID_MASK (0xc000 | 0x2000 | 0xfe)
   #define RTCP_VALID_VALUE ((RTP_VERSION << 14) | RTCP_SR)

   /*
    * Reception report block
    */
   typedef struct {
       u_int32 ssrc;             /* data source being reported */
       unsigned int fraction:8;  /* fraction lost since last SR/RR */
       int lost:24;              /* cumul. no. pkts lost (signed!) */
       u_int32 last_seq;         /* extended last seq. no. received */
       u_int32 jitter;           /* interarrival jitter */
       u_int32 lsr;              /* last SR packet from this source */
       u_int32 dlsr;             /* delay since last SR packet */
   } rtcp_rr_t;

   /*
    * SDES item
    */
   typedef struct {
       u_int8 type;              /* type of item (rtcp_sdes_type_t) */
       u_int8 length;            /* length of item (in octets) */
       char data[1];             /* text, not null-terminated */
   } rtcp_sdes_item_t;

   /*
    * One RTCP packet
    */
   typedef struct {
       rtcp_common_t common;     /* common header */
       union {
           /* sender report (SR) */
           struct {
               u_int32 ssrc;     /* sender generating this report */
               u_int32 ntp_sec;  /* NTP timestamp */
               u_int32 ntp_frac;
               u_int32 rtp_ts;   /* RTP timestamp */
               u_int32 psent;    /* packets sent */
               u_int32 osent;    /* octets sent */
               rtcp_rr_t rr[1];  /* variable-length list */
           } sr;

           /* reception report (RR) */
           struct {
               u_int32 ssrc;     /* receiver generating this report */
               rtcp_rr_t rr[1];  /* variable-length list */
           } rr;

           /* source description (SDES) */
           struct rtcp_sdes {
               u_int32 src;      /* first SSRC/CSRC */
               rtcp_sdes_item_t item[1]; /* list of SDES items */
           } sdes;

           /* BYE */
           struct {
               u_int32 src[1];   /* list of sources */
               /* can't express trailing text for reason */
           } bye;
       } r;
   } rtcp_t;

   typedef struct rtcp_sdes rtcp_sdes_t;

   /*
    * Per-source state information
    */
   typedef struct {
       u_int16 max_seq;        /* highest seq. number seen */
       u_int32 cycles;         /* shifted count of seq. number cycles */
       u_int32 base_seq;       /* base seq number */
       u_int32 bad_seq;        /* last 'bad' seq number + 1 */
       u_int32 probation;      /* sequ. packets till source is valid */
       u_int32 received;       /* packets received */
       u_int32 expected_prior; /* packet expected at last interval */
       u_int32 received_prior; /* packet received at last interval */
       u_int32 transit;        /* relative trans time for prev pkt */
       u_int32 jitter;         /* estimated jitter */
       /* ... */
   } source;

A.1 RTP Data Header Validity Checks

   An RTP receiver should check the validity of the RTP header on
   incoming packets since they might be encrypted or might be from a
   different application that happens to be misaddressed. Similarly,
   if encryption is enabled, the header validity check is needed to
   verify that incoming packets have been correctly decrypted,
   although a failure of the header validity check (e.g., unknown
   payload type) may not necessarily indicate decryption failure.

   Only weak validity checks are possible on an RTP data packet from a
   source that has not been heard before:

   o  RTP version field must equal 2.

   o  The payload type must be known, in particular it must not be
      equal to SR or RR.

   o  If the P bit is set, then the last octet of the packet must
      contain a valid octet count, in particular, less than the total
      packet length minus the header size.

   o  The X bit must be zero if the profile does not specify that the
      header extension mechanism may be used. Otherwise, the extension
      length field must be less than the total packet size minus the
      fixed header length and padding.

   o  The length of the packet must be consistent with CC and payload
      type (if payloads have a known length).

   The last three checks are somewhat complex and not always possible,
   leaving only the first two which total just a few bits.
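
   As a rough illustration, the checks that need no profile knowledge
   (version, reserved payload types, CC length and padding count) can
   be sketched as follows. The sketch reads the header octet by octet
   rather than through the rtp_hdr_t bit-fields, so it does not depend
   on byte order. The function name rtp_header_plausible and its 0/1
   return convention are assumptions of this sketch and not part of
   rtp.h; the X bit and the set of known payload types are left to the
   profile.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of rtp.h): returns 1 if the packet
 * passes the profile-independent weak checks, 0 otherwise. */
int rtp_header_plausible(const uint8_t *pkt, size_t len)
{
    size_t hdr_len;
    unsigned pt;

    if (len < 12) return 0;              /* fixed header is 12 octets */
    if ((pkt[0] >> 6) != 2) return 0;    /* version must equal 2 */

    /* Payload types 72 and 73 are reserved so that, with the marker
     * bit set, the octet cannot be mistaken for RTCP SR/RR (200/201). */
    pt = pkt[1] & 0x7f;
    if (pt == 72 || pt == 73) return 0;

    /* Packet must be long enough for the CSRC list announced by CC. */
    hdr_len = 12 + 4 * (size_t)(pkt[0] & 0x0f);
    if (len < hdr_len) return 0;

    if (pkt[0] & 0x20) {                 /* P bit set */
        size_t pad = pkt[len - 1];       /* last octet = padding count */
        /* The count must be less than the packet length minus the
         * header size. Treating a count of zero as invalid is an
         * assumption of this sketch. */
        if (pad == 0 || pad >= len - hdr_len) return 0;
    }
    return 1;
}
```

   A receiver might call such a routine before looking up the SSRC, and
   fall through to the sequence-number checks only for packets that
   pass.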
   If the SSRC identifier in the packet is one that has been received
   before, then the packet is probably valid and checking if the
   sequence number is in the expected range provides further
   validation. If the SSRC identifier has not been seen before, then
   data packets carrying that identifier may be considered invalid
   until a small number of them arrive with consecutive sequence
   numbers.

   The routine update_seq shown below ensures that a source is
   declared valid only after MIN_SEQUENTIAL packets have been received
   in sequence. It also validates the sequence number seq of a newly
   received packet and updates the sequence state for the packet's
   source in the structure to which s points.

   When a new source is heard for the first time, that is, its SSRC
   identifier is not in the table (see Section 8.2), and the
   per-source state is allocated for it, s->probation should be set to
   the number of sequential packets required before declaring a source
   valid (parameter MIN_SEQUENTIAL) and s->max_seq initialized to
   seq-1. A nonzero s->probation marks the source as not yet valid so
   the state may be discarded after a short timeout rather than a long
   one, as discussed in Section 6.3.

   After a source is considered valid, the sequence number is
   considered valid if it is no more than MAX_DROPOUT ahead of
   s->max_seq nor more than MAX_MISORDER behind. If the new sequence
   number is ahead of max_seq modulo the RTP sequence number range (16
   bits), but is smaller than max_seq, it has wrapped around and the
   (shifted) count of sequence number cycles is incremented. A value
   of one is returned to indicate a valid sequence number.

   Otherwise, the value zero is returned to indicate that the
   validation failed, and the bad sequence number is stored.
   If the next packet received carries the next higher sequence
   number, it is considered the valid start of a new packet sequence
   presumably caused by an extended dropout or a source restart. Since
   multiple complete sequence number cycles may have been missed, the
   packet loss statistics are reset.

   Typical values for the parameters are shown, based on a maximum
   misordering time of 2 seconds at 50 packets/second and a maximum
   dropout of 1 minute. The dropout parameter MAX_DROPOUT should be a
   small fraction of the 16-bit sequence number space to give a
   reasonable probability that new sequence numbers after a restart
   will not fall in the acceptable range for sequence numbers from
   before the restart.

   void init_seq(source *s, u_int16 seq)
   {
       s->base_seq = seq;   /* first seq. number counted as received */
       s->max_seq = seq;
       s->bad_seq = RTP_SEQ_MOD + 1;
       s->cycles = 0;
       s->received = 0;
       s->received_prior = 0;
       s->expected_prior = 0;
       /* other initialization */
   }

   int update_seq(source *s, u_int16 seq)
   {
       u_int16 udelta = seq - s->max_seq;
       const int MAX_DROPOUT = 3000;
       const int MAX_MISORDER = 100;
       const int MIN_SEQUENTIAL = 2;

       /*
        * Source is not valid until MIN_SEQUENTIAL packets with
        * sequential sequence numbers have been received.
        */
       if (s->probation) {
           /* packet is in sequence */
           if (seq == s->max_seq + 1) {
               s->probation--;
               s->max_seq = seq;
               if (s->probation == 0) {
                   init_seq(s, seq);
                   s->received++;
                   return 1;
               }
           } else {
               s->probation = MIN_SEQUENTIAL - 1;
               s->max_seq = seq;
           }
           return 0;
       } else if (udelta < MAX_DROPOUT) {
           /* in order, with permissible gap */
           if (seq < s->max_seq) {
               /*
                * Sequence number wrapped - count another 64K cycle.
                */
               s->cycles += RTP_SEQ_MOD;
           }
           s->max_seq = seq;
       } else if (udelta <= RTP_SEQ_MOD - MAX_MISORDER) {
           /* the sequence number made a very large jump */
           if (seq == s->bad_seq) {
               /*
                * Two sequential packets -- assume that the other side
                * restarted without telling us so just re-sync
                * (i.e., pretend this was the first packet).
                */
               init_seq(s, seq);
           }
           else {
               s->bad_seq = (seq + 1) & (RTP_SEQ_MOD-1);
               return 0;
           }
       } else {
           /* duplicate or reordered packet */
       }
       s->received++;
       return 1;
   }

   The validity check can be made stronger by requiring more than two
   packets in sequence. The disadvantages are that a larger number of
   initial packets will be discarded and that high packet loss rates
   could prevent validation. However, because the RTCP header
   validation is relatively strong, if an RTCP packet is received from
   a source before the data packets, the count could be adjusted so
   that only two packets are required in sequence. If initial data
   loss for a few seconds can be tolerated, an application could
   choose to discard all data packets from a source until a valid RTCP
   packet has been received from that source.

   Depending on the application and encoding, algorithms may exploit
   additional knowledge about the payload format for further
   validation. For payload types where the timestamp increment is the
   same for all packets, the timestamp values can be predicted from
   the previous packet received from the same source using the
   sequence number difference (assuming no change in payload type).

   A strong "fast-path" check is possible since with high probability
   the first four octets in the header of a newly received RTP data
   packet will be just the same as that of the previous packet from
   the same SSRC except that the sequence number will have increased
   by one.
   Similarly, a single-entry cache may be used for faster SSRC lookups
   in applications where data is typically received from one source at
   a time.

A.2 RTCP Header Validity Checks

   The following checks can be applied to RTCP packets.

   o  RTP version field must equal 2.

   o  The payload type field of the first RTCP packet in a compound
      packet must be equal to SR or RR.

   o  The padding bit (P) should be zero for the first packet of a
      compound RTCP packet because only the last should possibly need
      padding.

   o  The length fields of the individual RTCP packets must total to
      the overall length of the compound RTCP packet as received.
      This is a fairly strong check.

   The code fragment below performs all of these checks. The packet
   type is not checked for subsequent packets since unknown packet
   types may be present and should be ignored.

   u_int32 len;        /* length of compound RTCP packet in words */
   rtcp_t *r;          /* RTCP header */
   rtcp_t *end;        /* end of compound RTCP packet */

   if ((*(u_int16 *)r & RTCP_VALID_MASK) != RTCP_VALID_VALUE) {
       /* something wrong with packet format */
   }
   end = (rtcp_t *)((u_int32 *)r + len);

   do r = (rtcp_t *)((u_int32 *)r + r->common.length + 1);
   while (r < end && r->common.version == 2);

   if (r != end) {
       /* something wrong with packet format */
   }

A.3 Determining the Number of RTP Packets Expected and Lost

   In order to compute packet loss rates, the number of packets
   expected and actually received from each source needs to be known,
   using per-source state information defined in struct source
   referenced via pointer s in the code below. The number of packets
   received is simply the count of packets as they arrive, including
   any late or duplicate packets.
   The number of packets expected can be computed by the receiver as
   the difference between the highest sequence number received
   (s->max_seq) and the first sequence number received (s->base_seq).
   Since the sequence number is only 16 bits and will wrap around, it
   is necessary to extend the highest sequence number with the
   (shifted) count of sequence number wraparounds (s->cycles). Both
   the received packet count and the count of cycles are maintained by
   the RTP header validity check routine in Appendix A.1.

   extended_max = s->cycles + s->max_seq;
   expected = extended_max - s->base_seq + 1;

   The number of packets lost is defined to be the number of packets
   expected less the number of packets actually received:

   lost = expected - s->received;

   Since this number is carried in 24 bits, it should be clamped at
   0xffffff rather than wrap around to zero.

   The fraction of packets lost during the last reporting interval
   (since the previous SR or RR packet was sent) is calculated from
   differences in the expected and received packet counts across the
   interval, where expected_prior and received_prior are the values
   saved when the previous reception report was generated:

   expected_interval = expected - s->expected_prior;
   s->expected_prior = expected;
   received_interval = s->received - s->received_prior;
   s->received_prior = s->received;
   lost_interval = expected_interval - received_interval;
   if (expected_interval == 0 || lost_interval <= 0) fraction = 0;
   else fraction = (lost_interval << 8) / expected_interval;

   The resulting fraction is an 8-bit fixed point number with the
   binary point at the left edge.
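
   The fragments above can be gathered into one routine, shown here as
   a minimal sketch. The struct loss_state and the function
   loss_report are hypothetical names that mirror only the fields of
   struct source used in this section. Clamping the result to 255 when
   every expected packet of an interval is lost (where the formula
   yields 256) is a choice of this sketch, and the 24-bit clamping of
   the cumulative count is not shown.

```c
#include <stdint.h>

/* Hypothetical subset of the per-source state used in this section. */
typedef struct {
    uint16_t max_seq;         /* highest sequence number seen */
    uint32_t cycles;          /* shifted count of sequence cycles */
    uint32_t base_seq;        /* first sequence number counted */
    uint32_t received;        /* packets received so far */
    uint32_t expected_prior;  /* expected count at previous report */
    uint32_t received_prior;  /* received count at previous report */
} loss_state;

/* Returns the 8-bit fraction lost for the interval just ended, stores
 * the cumulative loss in *lost and updates the saved prior counts. */
uint8_t loss_report(loss_state *s, int32_t *lost)
{
    uint32_t extended_max = s->cycles + s->max_seq;
    uint32_t expected = extended_max - s->base_seq + 1;
    uint32_t expected_interval = expected - s->expected_prior;
    uint32_t received_interval = s->received - s->received_prior;
    int32_t lost_interval =
        (int32_t)(expected_interval - received_interval);
    uint32_t fraction;

    *lost = (int32_t)(expected - s->received);
    s->expected_prior = expected;
    s->received_prior = s->received;

    /* Duplicates can make the interval loss negative; report zero. */
    if (expected_interval == 0 || lost_interval <= 0) return 0;

    fraction = ((uint32_t)lost_interval << 8) / expected_interval;
    if (fraction > 255) fraction = 255;   /* sketch-specific clamp */
    return (uint8_t)fraction;
}
```

   For example, with base_seq 1000, max_seq 1099, no wraparound and 90
   packets received, expected is 100, the cumulative loss is 10, and
   the fraction is (10 << 8) / 100 = 25, that is 25/256 or roughly
   9.8 percent.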
A.4 Generating SDES RTCP Packets

   This function builds one SDES chunk into buffer b composed of argc
   items supplied in arrays type, value and length. It returns a
   pointer to the next available location within b.

   char *rtp_write_sdes(char *b, u_int32 src, int argc,
                        rtcp_sdes_type_t type[], char *value[],
                        int length[])
   {
       rtcp_sdes_t *s = (rtcp_sdes_t *)b;
       rtcp_sdes_item_t *rsp;
       int i;
       int len;
       int pad;

       /* SSRC header */
       s->src = src;
       rsp = &s->item[0];

       /* SDES items */
       for (i = 0; i < argc; i++) {
           rsp->type = type[i];
           len = length[i];
           if (len > RTP_MAX_SDES) {
               /* invalid length, may want to take other action */
               len = RTP_MAX_SDES;
           }
           rsp->length = len;
           memcpy(rsp->data, value[i], len);
           rsp = (rtcp_sdes_item_t *)&rsp->data[len];
       }

       /* terminate with end marker and pad to next 4-octet boundary */
       len = ((char *) rsp) - b;
       pad = 4 - (len & 0x3);
       b = (char *) rsp;
       while (pad--) *b++ = RTCP_SDES_END;

       return b;
   }

A.5 Parsing RTCP SDES Packets

   This function parses an SDES packet, calling functions
   find_member() to find a pointer to the information for a session
   member given the SSRC identifier and member_sdes() to store the new
   SDES information for that member. This function expects a pointer
   to the header of the RTCP packet.

   void rtp_read_sdes(rtcp_t *r)
   {
       int count = r->common.count;
       rtcp_sdes_t *sd = &r->r.sdes;
       rtcp_sdes_item_t *rsp, *rspn;
       rtcp_sdes_item_t *end = (rtcp_sdes_item_t *)
                               ((u_int32 *)r + r->common.length + 1);
       source *s;

       while (--count >= 0) {
           rsp = &sd->item[0];
           if (rsp >= end) break;
           s = find_member(sd->src);

           for (; rsp->type; rsp = rspn ) {
               rspn = (rtcp_sdes_item_t *)((char*)rsp+rsp->length+2);
               if (rspn >= end) {
                   rsp = rspn;
                   break;
               }
               member_sdes(s, rsp->type, rsp->data, rsp->length);
           }
           sd = (rtcp_sdes_t *)
                ((u_int32 *)sd + (((char *)rsp - (char *)sd) >> 2)+1);
       }
       if (count >= 0) {
           /* invalid packet format */
       }
   }

A.6 Generating a Random 32-bit Identifier

   The following subroutine generates a random 32-bit identifier using
   the MD5 routines published in RFC 1321 [23]. The system routines
   may not be present on all operating systems, but they should serve
   as hints as to what kinds of information may be used. Other system
   calls that may be appropriate include

   o  getdomainname(),

   o  getwd(), or

   o  getrusage()

   "Live" video or audio samples are also a good source of random
   numbers, but care must be taken to avoid using a turned-off
   microphone or blinded camera as a source [7].

   Use of this or a similar routine is suggested to generate the
   initial seed for the random number generator producing the RTCP
   period (as shown in Appendix A.7), to generate the initial values
   for the sequence number and timestamp, and to generate SSRC values.
   Since this routine is likely to be CPU-intensive, its direct use to
   generate RTCP periods is inappropriate because predictability is
   not an issue. Note that this routine produces the same result on
   repeated calls until the value of the system clock changes unless
   different values are supplied for the type argument.

   /*
    * Generate a random 32-bit quantity.
    */
   #include <sys/types.h>   /* u_long */
   #include <sys/time.h>    /* gettimeofday() */
   #include <unistd.h>      /* get..() */
   #include <stdio.h>       /* printf() */
   #include <time.h>        /* clock() */
   #include <sys/utsname.h> /* uname() */
   #include "global.h"      /* from RFC 1321 */
   #include "md5.h"         /* from RFC 1321 */

   #define MD_CTX MD5_CTX
   #define MDInit MD5Init
   #define MDUpdate MD5Update
   #define MDFinal MD5Final

   static u_long md_32(char *string, int length)
   {
       MD_CTX context;
       union {
           char   c[16];
           u_long x[4];
       } digest;
       u_long r;
       int i;

       MDInit (&context);
       MDUpdate (&context, string, length);
       MDFinal ((unsigned char *)&digest, &context);
       r = 0;
       for (i = 0; i < 3; i++) {
           r ^= digest.x[i];
       }
       return r;
   }                               /* md_32 */

   /*
    * Return random unsigned 32-bit quantity. Use 'type' argument if
    * you need to generate several different values in close
    * succession.
    */
   u_int32 random32(int type)
   {
       struct {
           int     type;
           struct  timeval tv;
           clock_t cpu;
           pid_t   pid;
           u_long  hid;
           uid_t   uid;
           gid_t   gid;
           struct  utsname name;
       } s;

       gettimeofday(&s.tv, 0);
       uname(&s.name);
       s.type = type;
       s.cpu  = clock();
       s.pid  = getpid();
       s.hid  = gethostid();
       s.uid  = getuid();
       s.gid  = getgid();
       /* also: system uptime */

       return md_32((char *)&s, sizeof(s));
   }                               /* random32 */

A.7 Computing the RTCP Transmission Interval

   The following functions implement the RTCP transmission and
   reception rules described in Section 6.2. These rules are coded in
   several functions:

   o  OnExpire() is called when the RTCP transmission timer expires.

   o  rtcp_interval() computes the deterministic calculated interval,
      measured in seconds.

   o  OnReceive() is called whenever an RTCP packet is received.

   It is assumed that the following functions are available:

   o  Schedule(time t, event e) schedules an event e to occur at time
      t. When time t arrives, the function OnExpire is called with e
      as an argument.

   o  ReSchedule(time t, event e) reschedules a previously scheduled
      event e for time t.

   o  SendRTCPReport() sends an RTCP report.

   o  SendBYEPacket() sends a BYE packet.

   o  TypeOfEvent(event e) returns EVENT_BYE if the next pending
      report is a BYE packet, else it returns EVENT_REPORT.

   o  NewMember(p) returns 1 if the participant who sent packet p is
      not currently in the member list, 0 otherwise.

   o  PacketType(p) returns PACKET_RTCP_REPORT if packet p is an RTCP
      report (not BYE), PACKET_BYE if it is a BYE RTCP packet, and
      PACKET_RTP if it is a regular RTP data packet.

   The parameters of rtcp_interval() are defined in Section 6.3.

   double rtcp_interval(int members,
                        int senders,
                        double rtcp_bw,
                        int we_sent,
                        double avg_rtcp_size,
                        int initial)
   {
       /*
        * Minimum average time between RTCP packets from this site (in
        * seconds). This time prevents the reports from `clumping'
        * when sessions are small and the law of large numbers isn't
        * helping to smooth out the traffic. It also keeps the report
        * interval from becoming ridiculously small during transient
        * outages like a network partition.
        */
       double const RTCP_MIN_TIME = 5.;
       /*
        * Fraction of the RTCP bandwidth to be shared among active
        * senders. (This fraction was chosen so that in a typical
        * session with one or two active senders, the computed report
        * time would be roughly equal to the minimum report time so
        * that we don't unnecessarily slow down receiver reports.) The
        * receiver fraction must be 1 - the sender fraction.
        */
       double const RTCP_SENDER_BW_FRACTION = 0.25;
       double const RTCP_RCVR_BW_FRACTION = (1-RTCP_SENDER_BW_FRACTION);
       double t;                   /* interval */
       double rtcp_min_time = RTCP_MIN_TIME;
       int n;                      /* no. of members for computation */

       /*
        * Very first call at application start-up uses half the min
        * delay for quicker notification while still allowing some
        * time before reporting for randomization and to learn about
        * other sources so the report interval will converge to the
        * correct interval more quickly.
        */
       if (initial) {
           rtcp_min_time /= 2;
       }

       /*
        * If there were active senders, give them at least a minimum
        * share of the RTCP bandwidth. Otherwise all participants
        * share the RTCP bandwidth equally.
        */
       n = members;
       if (senders > 0 && senders < members * RTCP_SENDER_BW_FRACTION) {
           if (we_sent) {
               rtcp_bw *= RTCP_SENDER_BW_FRACTION;
               n = senders;
           } else {
               rtcp_bw *= RTCP_RCVR_BW_FRACTION;
               n -= senders;
           }
       }

       /*
        * The effective number of sites times the average packet size
        * is the total number of octets sent when each site sends a
        * report. Dividing this by the effective bandwidth gives the
        * time interval over which those packets must be sent in order
        * to meet the bandwidth target, with a minimum enforced. In
        * that time interval we send one report so this time is also
        * our average time between reports.
        */
       t = avg_rtcp_size * n / rtcp_bw;
       if (t < rtcp_min_time) t = rtcp_min_time;

       /*
        * To avoid traffic bursts from unintended synchronization with
        * other sites, we then pick our actual next report interval as
        * a random number uniformly distributed between 0.5*t and
        * 1.5*t.
        */
       return t * (drand48() + 0.5);
   }

   void OnExpire(event e,
                 int    members,
                 int    senders,
                 double rtcp_bw,
                 int    we_sent,
                 double *avg_rtcp_sz,
                 int    *initial,
                 time   tc,
                 time   *tp,
                 int    *pmembers) {

       /* This function is responsible for deciding whether to send
        * an RTCP report or BYE packet now, or to reschedule
        * transmission. It is also responsible for updating the
        * pmembers, initial, tp, and avg_rtcp_sz state variables. This
        * function should be called upon expiration of the event timer
        * used by Schedule(). */

       double t;     /* Interval */
       double tn;    /* Next transmit time */
       int SendIt;   /* flag for sending packet */

       /* In the case of a BYE, we use OPTION B to reschedule the
        * transmission of the BYE if necessary */

       if (TypeOfEvent(e) == EVENT_BYE) {
           t = rtcp_interval(members,
                             senders,
                             rtcp_bw,
                             we_sent,
                             *avg_rtcp_sz,
                             *initial);
           tn = *tp + t;
           if (tn <= tc) {
               SendBYEPacket();
               exit(1);
           } else {
               Schedule(tn, e);
           }

       } else if (TypeOfEvent(e) == EVENT_REPORT) {
           t = rtcp_interval(members,
                             senders,
                             rtcp_bw,
                             we_sent,
                             *avg_rtcp_sz,
                             *initial);

           SendIt = FALSE;
           if ((algorithm == ALGORITHM_A) ||
               ((algorithm == ALGORITHM_C) && (*initial == FALSE))) {

               if (members <= *pmembers) {
                   SendIt = TRUE;
               } else {
                   tn = *tp + t;

                   if (tn <= tc) {
                       SendIt = TRUE;
                   }
               }
           } else if ((algorithm == ALGORITHM_B) ||
                      ((algorithm == ALGORITHM_C) && (*initial == TRUE))) {

               tn = *tp + t;

               if (tn <= tc) {
                   SendIt = TRUE;
               }
           }

           if (SendIt == TRUE) {
               SendRTCPReport();
               *pmembers = members;
               *avg_rtcp_sz = (1./16.)*PacketSize(e) +
                              (15./16.)*(*avg_rtcp_sz);
               *tp = tc;
           } else {
               Schedule(tn, e);
               *pmembers = members;
           }
       }
   }

   void OnReceive(packet p,
                  event e,
                  int *members,
                  int *pmembers,
                  int *senders,
                  double *avg_rtcp_sz,
                  double *tp,
                  double tc,
                  double tn) {   /* time at which e is now scheduled */

       /* What we do depends on whether we have left the group, and
        * are waiting to send a BYE (TypeOfEvent(e) == EVENT_BYE) or
        * an RTCP report. p represents the packet that was just
        * received. */

       if (PacketType(p) == PACKET_RTCP_REPORT) {
           if (NewMember(p) && (TypeOfEvent(e) == EVENT_REPORT))
               *members += 1;
           *avg_rtcp_sz = (1./16.)*PacketSize(p) +
                          (15./16.)*(*avg_rtcp_sz);
       } else if (PacketType(p) == PACKET_RTP) {
           if (NewSender(p) && (TypeOfEvent(e) == EVENT_REPORT))
               *senders += 1;
       } else if (PacketType(p) == PACKET_BYE) {
           *avg_rtcp_sz = (1./16.)*PacketSize(p) +
                          (15./16.)*(*avg_rtcp_sz);

           if (TypeOfEvent(e) == EVENT_REPORT) {
               if (NewSender(p) == FALSE) *senders -= 1;
               if (NewMember(p) == FALSE) *members -= 1;

               /* scale with floating-point division, not integer */
               tn = tc + (((double) *members)/(*pmembers))*(tn - tc);
               *tp = *tp - (((double) *members)/(*pmembers))*(tc - *tp);

               /* Reschedule the next report for time tn */

               ReSchedule(tn, e);
               *pmembers = *members;

           } else if (TypeOfEvent(e) == EVENT_BYE) {

               *members += 1;

           }
       }
   }

A.8 Estimating the Interarrival Jitter

   The code fragments below implement the algorithm given in Section
   6.4.1 for calculating an estimate of the statistical variance of
   the RTP data interarrival time to be inserted in the interarrival
   jitter field of reception reports. The inputs are r->ts, the
   timestamp from the incoming packet, and arrival, the current time
   in the same units. Here s points to state for the source;
   s->transit holds the relative transit time for the previous packet,
   and s->jitter holds the estimated jitter. The jitter field of the
   reception report is measured in timestamp units and expressed as an
   unsigned integer, but the jitter estimate is kept as a
   floating-point value.
   As each data packet arrives, the jitter estimate is updated:

       int transit = arrival - r->ts;
       int d = transit - s->transit;
       s->transit = transit;
       if (d < 0) d = -d;
       s->jitter += (1./16.) * ((double)d - s->jitter);

   When a reception report block (to which rr points) is generated for
   this member, the current jitter estimate is returned:

       rr->jitter = (u_int32) s->jitter;

   Alternatively, the jitter estimate can be kept as an integer, but
   scaled to reduce round-off error.  The calculation is the same
   except for the last line:

       s->jitter += d - ((s->jitter + 8) >> 4);

   In this case, the estimate is sampled for the reception report as:

       rr->jitter = s->jitter >> 4;

B Security Considerations

   RTP suffers from the same security liabilities as the underlying
   protocols.  For example, an impostor can fake source or destination
   network addresses, or change the header or payload.  Within RTCP,
   the CNAME and NAME information may be used to impersonate another
   participant.  In addition, RTP may be sent via IP multicast, which
   provides no direct means for a sender to know all the receivers of
   the data sent and therefore no measure of privacy.  Rightly or not,
   users may be more sensitive to privacy concerns with audio and video
   communication than they have been with more traditional forms of
   network communication [24].  Therefore, the use of security
   mechanisms with RTP is important.  These mechanisms are discussed in
   Section 9.

   RTP-level translators or mixers may be used to allow RTP traffic to
   reach hosts behind firewalls.  Appropriate firewall security
   principles and practices, which are beyond the scope of this
   document, should be followed in the design and installation of these
   devices and in the admission of RTP applications for use behind the
   firewall.

C Addresses of Authors

   Henning Schulzrinne
   Dept. of Computer Science
   Columbia University
   1214 Amsterdam Avenue
   New York, NY 10027
   USA
   electronic mail: schulzrinne@cs.columbia.edu

   Stephen L. Casner
   Precept Software, Inc.
   21580 Stevens Creek Boulevard, Suite 207
   Cupertino, CA 95014
   United States
   electronic mail: casner@precept.com

   Ron Frederick
   Xerox Palo Alto Research Center
   3333 Coyote Hill Road
   Palo Alto, CA 94304
   United States
   electronic mail: frederic@parc.xerox.com

   Van Jacobson
   MS 46a-1121
   Lawrence Berkeley National Laboratory
   Berkeley, CA 94720
   United States
   electronic mail: van@ee.lbl.gov

Acknowledgments

   This memorandum is based on discussions within the IETF Audio/Video
   Transport working group chaired by Stephen Casner.  The current
   protocol has its origins in the Network Voice Protocol and the
   Packet Video Protocol (Danny Cohen and Randy Cole) and the protocol
   implemented by the vat application (Van Jacobson and Steve McCanne).
   Christian Huitema provided ideas for the random identifier
   generator.  Extensive analysis and simulation of the timer
   reconsideration algorithm were done by Jonathan Rosenberg.

D Bibliography

   [1] D. D. Clark and D. L. Tennenhouse, "Architectural considerations
   for a new generation of protocols," in SIGCOMM Symposium on
   Communications Architectures and Protocols, (Philadelphia,
   Pennsylvania), pp. 200--208, IEEE, Sept. 1990.  Computer
   Communications Review, Vol. 20(4), Sept. 1990.

   [2] H. Schulzrinne, "Issues in designing a transport protocol for
   audio and video conferences and other multiparticipant real-time
   applications," expired Internet draft, Oct. 1993.

   [3] D. E. Comer, Internetworking with TCP/IP, vol. 1.  Englewood
   Cliffs, New Jersey: Prentice Hall, 1991.

   [4] J. Postel, "Internet protocol," RFC 791, Internet Engineering
   Task Force, Sept. 1981.

   [5] D. Mills, "Network time protocol (v3)," RFC 1305, Internet
   Engineering Task Force, Apr. 1992.

   [6] J. Reynolds and J. Postel, "Assigned numbers," STD 2, RFC 1700,
   Internet Engineering Task Force, Oct. 1994.

   [7] D. Eastlake, S. Crocker, and J. Schiller, "Randomness
   recommendations for security," RFC 1750, Internet Engineering Task
   Force, Dec. 1994.

   [8] J.-C. Bolot, T. Turletti, and I. Wakeman, "Scalable feedback
   control for multicast video distribution in the internet," in
   SIGCOMM Symposium on Communications Architectures and Protocols,
   (London, England), pp. 58--67, ACM, Aug. 1994.

   [9] I. Busse, B. Deffner, and H. Schulzrinne, "Dynamic QoS control
   of multimedia applications based on RTP," Computer Communications,
   Jan. 1996.

   [10] S. Floyd and V. Jacobson, "The synchronization of periodic
   routing messages," in SIGCOMM Symposium on Communications
   Architectures and Protocols (D. P. Sidhu, ed.), (San Francisco,
   California), pp. 33--44, ACM, Sept. 1993.  Also in [25].

   [11] J. A. Cadzow, Foundations of digital signal processing and
   data analysis.  New York, New York: Macmillan, 1987.

   [12] International Standards Organization, "ISO/IEC DIS 10646-1:1993
   information technology -- universal multiple-octet coded character
   set (UCS) -- part I: Architecture and basic multilingual plane,"
   1993.

   [13] The Unicode Consortium, The Unicode Standard.  New York, New
   York: Addison-Wesley, 1991.

   [14] P. Mockapetris, "Domain names - concepts and facilities," STD
   13, RFC 1034, Internet Engineering Task Force, Nov. 1987.

   [15] P. Mockapetris, "Domain names - implementation and
   specification," STD 13, RFC 1035, Internet Engineering Task Force,
   Nov. 1987.

   [16] R. Braden, "Requirements for internet hosts - application and
   support," STD 3, RFC 1123, Internet Engineering Task Force, Oct.
   1989.

   [17] Y. Rekhter, R. Moskowitz, D. Karrenberg, and G. de Groot,
   "Address allocation for private internets," RFC 1597, Internet
   Engineering Task Force, Mar. 1994.

   [18] E. Lear, E. Fair, D. Crocker, and T. Kessler, "Network 10
   considered harmful (some practices shouldn't be codified)," RFC
   1627, Internet Engineering Task Force, July 1994.

   [19] D. Crocker, "Standard for the format of ARPA internet text
   messages," STD 11, RFC 822, Internet Engineering Task Force, Aug.
   1982.

   [20] W. Feller, An Introduction to Probability Theory and its
   Applications, Volume 1, vol. 1.  New York, New York: John Wiley and
   Sons, third ed., 1968.

   [21] D. Balenson, "Privacy enhancement for internet electronic mail:
   Part III: algorithms, modes, and identifiers," RFC 1423, Internet
   Engineering Task Force, Feb. 1993.

   [22] V. L. Voydock and S. T. Kent, "Security mechanisms in
   high-level network protocols," ACM Computing Surveys, vol. 15, pp.
   135--171, June 1983.

   [23] R. Rivest, "The MD5 message-digest algorithm," RFC 1321,
   Internet Engineering Task Force, Apr. 1992.

   [24] S. Stubblebine, "Security services for multimedia
   conferencing," in 16th National Computer Security Conference,
   (Baltimore, Maryland), pp. 391--395, Sept. 1993.

   [25] S. Floyd and V. Jacobson, "The synchronization of periodic
   routing messages," IEEE/ACM Transactions on Networking, vol. 2, pp.
   122--136, Apr. 1994.

Table of Contents

   1        Introduction
   1.1      Changes
   1.2      Open Issues
   2        RTP Use Scenarios
   2.1      Simple Multicast Audio Conference
   2.2      Audio and Video Conference
   2.3      Mixers and Translators
   2.4      Layered Encodings
   3        Definitions
   4        Byte Order, Alignment, and Time Format
   5        RTP Data Transfer Protocol
   5.1      RTP Fixed Header Fields
   5.2      Multiplexing RTP Sessions
   5.3      Profile-Specific Modifications to the RTP Header
   5.3.1    RTP Header Extension
   6        RTP Control Protocol -- RTCP
   6.1      RTCP Packet Format
   6.2      RTCP Transmission Interval
   6.3      RTCP Packet Send and Receive Rules
   6.3.1    Computing the RTCP transmission interval
   6.3.2    Initialization
   6.3.3    Receiving an RTP or non-BYE RTCP packet
   6.3.4    Receiving an RTCP BYE packet
   6.3.5    Timing Out an SSRC
   6.3.6    Expiration of transmission timer
   6.3.7    Transmitting a BYE packet
   6.3.8    Updating we_sent
   6.3.9    Allocation of source description bandwidth
   6.4      Sender and Receiver Reports
   6.4.1    SR: Sender report RTCP packet
   6.4.2    RR: Receiver report RTCP packet
   6.4.3    Extending the sender and receiver reports
   6.4.4    Analyzing sender and receiver reports
   6.5      SDES: Source description RTCP packet
   6.5.1    CNAME: Canonical end-point identifier SDES item
   6.5.2    NAME: User name SDES item
   6.5.3    EMAIL: Electronic mail address SDES item
   6.5.4    PHONE: Phone number SDES item
   6.5.5    LOC: Geographic user location SDES item
   6.5.6    TOOL: Application or tool name SDES item
   6.5.7    NOTE: Notice/status SDES item
   6.5.8    PRIV: Private extensions SDES item
   6.6      BYE: Goodbye RTCP packet
   6.7      APP: Application-defined RTCP packet
   7        RTP Translators and Mixers
   7.1      General Description
   7.2      RTCP Processing in Translators
   7.3      RTCP Processing in Mixers
   7.4      Cascaded Mixers
   8        SSRC Identifier Allocation and Use
   8.1      Probability of Collision
   8.2      Collision Resolution and Loop Detection
   8.3      Use with Layered Encodings
   9        Security
   9.1      Confidentiality
   9.2      Authentication and Message Integrity
   10       RTP over Network and Transport Protocols
   11       Summary of Protocol Constants
   11.1     RTCP packet types
   11.2     SDES types
   12       RTP Profiles and Payload Format Specifications
   A        Algorithms
   A.1      RTP Data Header Validity Checks
   A.2      RTCP Header Validity Checks
   A.3      Determining the Number of RTP Packets Expected and Lost
   A.4      Generating SDES RTCP Packets
   A.5      Parsing RTCP SDES Packets
   A.6      Generating a Random 32-bit Identifier
   A.7      Computing the RTCP Transmission Interval
   A.8      Estimating the Interarrival Jitter
   B        Security Considerations
   C        Addresses of Authors
   D        Bibliography