Internet Engineering Task Force     Audio/Video Transport Working Group
Internet Draft                    Schulzrinne/Casner/Frederick/Jacobson
ietf-avt-rtp-new-02.txt                   Columbia U./Cisco/Xerox/Cisco
November 18, 1998
Expires: May 18, 1999

        RTP: A Transport Protocol for Real-Time Applications

STATUS OF THIS MEMO

   This document is an Internet-Draft. Internet-Drafts are working
   documents of the Internet Engineering Task Force (IETF), its areas,
   and its working groups. Note that other groups may also distribute
   working documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress".

   To view the entire list of current Internet-Drafts, please check the
   "1id-abstracts.txt" listing contained in the Internet-Drafts Shadow
   Directories on ftp.is.co.za (Africa), ftp.nordu.net (Northern
   Europe), ftp.nis.garr.it (Southern Europe), munnari.oz.au (Pacific
   Rim), ftp.ietf.org (US East Coast), or ftp.isi.edu (US West Coast).

   Distribution of this document is unlimited.
ABSTRACT

   This memorandum is a revision of RFC 1889 in preparation for
   advancement from Proposed Standard to Draft Standard status. Readers
   are encouraged to use the PostScript form of this draft to see where
   changes from RFC 1889 are marked by change bars.

   This memorandum describes RTP, the real-time transport protocol. RTP
   provides end-to-end network transport functions suitable for
   applications transmitting real-time data, such as audio, video or
   simulation data, over multicast or unicast network services. RTP does
   not address resource reservation and does not guarantee
   quality-of-service for real-time services. The data transport is
   augmented by a control protocol (RTCP) to allow monitoring of the
   data delivery in a manner scalable to large multicast networks, and
   to provide minimal control and identification functionality. RTP and
   RTCP are designed to be independent of the underlying transport and
   network layers. The protocol supports the use of RTP-level
   translators and mixers.

   This specification is a product of the Audio/Video Transport working
   group within the Internet Engineering Task Force. Comments are
   solicited and should be addressed to the working group's mailing list
   at rem-conf@es.net and/or the authors.

Resolution of Open Issues

   [Note to the RFC Editor: This section is to be deleted when this
   draft is published as an RFC but is shown here for reference during
   the Last Call.]

   Readers are directed to Appendix B, Changes from RFC 1889, for a
   listing of the changes that have been made in this draft. The changes
   are marked with change bars in the PostScript form of this draft.
   The revisions in this draft are intended to be complete for Working
   Group last call; the open issues from previous drafts have been
   addressed:

     o A fudge factor has been added to the RTCP unconditional
       reconsideration algorithm to compensate for the fact that it
       settles to a steady state bandwidth that is below the desired
       level.

     o As agreed at the Chicago IETF, the conditional and hybrid
       reconsideration schemes have been removed in favor of
       unconditional reconsideration.

     o The SSRC sampling algorithm has been extracted to a separate
       draft as agreed at the Chicago IETF. That draft describes the
       "bin" mechanism that avoids a temporary underestimate in group
       size when the group size is decreasing.

     o The "reverse reconsideration" algorithm does not prevent the
       group size estimate from incorrectly dropping to zero for a
       short time when most participants of a large session leave at
       once but some remain. This has just been noted as only a
       secondary concern.

     o Scaling of the minimum RTCP interval inversely proportional to
       the session bandwidth parameter has been added, but only in the
       direction of smaller intervals for higher bandwidth. Scaling to
       longer intervals for low bandwidths would cause a problem
       because this is an optional step. Some participants might be
       timed out prematurely if they scaled to a longer interval while
       others kept the nominal 5 seconds. The benefit of scaling
       longer was not considered great in any case.

     o No change was specified for the jitter computation for media
       with several packets with the same timestamp. There is not a
       clear answer as to what should be done, or that any change
       would make a significant improvement.

     o As proposed without objection at the Los Angeles IETF,
       definition of additional SDES items such as PHOTO URL and
       NICKNAME will be deferred to subsequent registration through
       IANA since that method has been established.
       This is in the spirit of minimizing changes to the protocol in
       the transition from Proposed to Draft.

     o Nothing was added about allowing a translator to add its own
       random offsets to the sequence number and timestamp fields
       because it would likely cause more trouble than good.

1 Introduction

   This memorandum specifies the real-time transport protocol (RTP),
   which provides end-to-end delivery services for data with real-time
   characteristics, such as interactive audio and video. Those services
   include payload type identification, sequence numbering, timestamping
   and delivery monitoring. Applications typically run RTP on top of UDP
   to make use of its multiplexing and checksum services; both protocols
   contribute parts of the transport protocol functionality. However,
   RTP may be used with other suitable underlying network or transport
   protocols (see Section 10). RTP supports data transfer to multiple
   destinations using multicast distribution if provided by the
   underlying network.

   Note that RTP itself does not provide any mechanism to ensure timely
   delivery or provide other quality-of-service guarantees, but relies
   on lower-layer services to do so. It does not guarantee delivery or
   prevent out-of-order delivery, nor does it assume that the underlying
   network is reliable and delivers packets in sequence. The sequence
   numbers included in RTP allow the receiver to reconstruct the
   sender's packet sequence, but sequence numbers might also be used to
   determine the proper location of a packet, for example in video
   decoding, without necessarily decoding packets in sequence.

   While RTP is primarily designed to satisfy the needs of
   multi-participant multimedia conferences, it is not limited to that
   particular application.
   Storage of continuous data, interactive distributed simulation,
   active badge, and control and measurement applications may also find
   RTP applicable.

   This document defines RTP, consisting of two closely-linked parts:

     o the real-time transport protocol (RTP), to carry data that has
       real-time properties.

     o the RTP control protocol (RTCP), to monitor the quality of
       service and to convey information about the participants in an
       on-going session. The latter aspect of RTCP may be sufficient
       for "loosely controlled" sessions, i.e., where there is no
       explicit membership control and set-up, but it is not
       necessarily intended to support all of an application's control
       communication requirements. This functionality may be fully or
       partially subsumed by a separate session control protocol,
       which is beyond the scope of this document.

   RTP represents a new style of protocol following the principles of
   application level framing and integrated layer processing proposed by
   Clark and Tennenhouse [1]. That is, RTP is intended to be malleable
   to provide the information required by a particular application and
   will often be integrated into the application processing rather than
   being implemented as a separate layer. RTP is a protocol framework
   that is deliberately not complete. This document specifies those
   functions expected to be common across all the applications for which
   RTP would be appropriate. Unlike conventional protocols in which
   additional functions might be accommodated by making the protocol
   more general or by adding an option mechanism that would require
   parsing, RTP is intended to be tailored through modifications and/or
   additions to the headers as needed. Examples are given in Sections
   5.3 and 6.4.3.
   Therefore, in addition to this document, a complete specification of
   RTP for a particular application will require one or more companion
   documents (see Section 12):

     o a profile specification document, which defines a set of
       payload type codes and their mapping to payload formats (e.g.,
       media encodings). A profile may also define extensions or
       modifications to RTP that are specific to a particular class of
       applications. Typically an application will operate under only
       one profile. A profile for audio and video data may be found in
       the companion RFC 1890.

     o payload format specification documents, which define how a
       particular payload, such as an audio or video encoding, is to
       be carried in RTP.

   A discussion of real-time services and algorithms for their
   implementation as well as background discussion on some of the RTP
   design decisions can be found in [2].

1.1 Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [3] and
   indicate requirement levels for compliant RTP implementations.

2 RTP Use Scenarios

   The following sections describe some aspects of the use of RTP. The
   examples were chosen to illustrate the basic operation of
   applications using RTP, not to limit what RTP may be used for. In
   these examples, RTP is carried on top of IP and UDP, and follows the
   conventions established by the profile for audio and video specified
   in the companion RFC 1890 (updated by Internet-Draft
   draft-ietf-avt-profile-new).

2.1 Simple Multicast Audio Conference

   A working group of the IETF meets to discuss the latest protocol
   draft, using the IP multicast services of the Internet for voice
   communications.
   Through some allocation mechanism the working group chair obtains a
   multicast group address and pair of ports. One port is used for audio
   data, and the other is used for control (RTCP) packets. This address
   and port information is distributed to the intended participants. If
   privacy is desired, the data and control packets may be encrypted as
   specified in Section 9.1, in which case an encryption key must also
   be generated and distributed. The exact details of these allocation
   and distribution mechanisms are beyond the scope of RTP.

   The audio conferencing application used by each conference
   participant sends audio data in small chunks of, say, 20 ms duration.
   Each chunk of audio data is preceded by an RTP header; RTP header and
   data are in turn contained in a UDP packet. The RTP header indicates
   what type of audio encoding (such as PCM, ADPCM or LPC) is contained
   in each packet so that senders can change the encoding during a
   conference, for example, to accommodate a new participant that is
   connected through a low-bandwidth link or react to indications of
   network congestion.

   The Internet, like other packet networks, occasionally loses and
   reorders packets and delays them by variable amounts of time. To cope
   with these impairments, the RTP header contains timing information
   and a sequence number that allow the receivers to reconstruct the
   timing produced by the source, so that in this example, chunks of
   audio are contiguously played out the speaker every 20 ms. This
   timing reconstruction is performed separately for each source of RTP
   packets in the conference. The sequence number can also be used by
   the receiver to estimate how many packets are being lost.

   Since members of the working group join and leave during the
   conference, it is useful to know who is participating at any moment
   and how well they are receiving the audio data.
   For that purpose, each instance of the audio application in the
   conference periodically multicasts a reception report plus the name
   of its user on the RTCP (control) port. The reception report
   indicates how well the current speaker is being received and may be
   used to control adaptive encodings. In addition to the user name,
   other identifying information may also be included subject to control
   bandwidth limits. A site sends the RTCP BYE packet (Section 6.6) when
   it leaves the conference.

2.2 Audio and Video Conference

   If both audio and video media are used in a conference, they are
   transmitted as separate RTP sessions. That is, separate RTP and RTCP
   packets are transmitted for each medium using two different UDP port
   pairs and/or multicast addresses. There is no direct coupling at the
   RTP level between the audio and video sessions, except that a user
   participating in both sessions should use the same distinguished
   (canonical) name in the RTCP packets for both so that the sessions
   can be associated.

   One motivation for this separation is to allow some participants in
   the conference to receive only one medium if they choose. Further
   explanation is given in Section 5.2. Despite the separation,
   synchronized playback of a source's audio and video can be achieved
   using timing information carried in the RTCP packets for both
   sessions.

2.3 Mixers and Translators

   So far, we have assumed that all sites want to receive media data in
   the same format. However, this may not always be appropriate.
   Consider the case where participants in one area are connected
   through a low-speed link to the majority of the conference
   participants who enjoy high-speed network access. Instead of forcing
   everyone to use a lower-bandwidth, reduced-quality audio encoding, an
   RTP-level relay called a mixer may be placed near the low-bandwidth
   area.
   This mixer resynchronizes incoming audio packets to reconstruct the
   constant 20 ms spacing generated by the sender, mixes these
   reconstructed audio streams into a single stream, translates the
   audio encoding to a lower-bandwidth one and forwards the
   lower-bandwidth packet stream across the low-speed link. These
   packets might be unicast to a single recipient or multicast on a
   different address to multiple recipients. The RTP header includes a
   means for mixers to identify the sources that contributed to a mixed
   packet so that correct talker indication can be provided at the
   receivers.

   Some of the intended participants in the audio conference may be
   connected with high bandwidth links but might not be directly
   reachable via IP multicast. For example, they might be behind an
   application-level firewall that will not let any IP packets pass. For
   these sites, mixing may not be necessary, in which case another type
   of RTP-level relay called a translator may be used. Two translators
   are installed, one on either side of the firewall, with the outside
   one funneling all multicast packets received through a secure
   connection to the translator inside the firewall. The translator
   inside the firewall sends them again as multicast packets to a
   multicast group restricted to the site's internal network.

   Mixers and translators may be designed for a variety of purposes. An
   example is a video mixer that scales the images of individual people
   in separate video streams and composites them into one video stream
   to simulate a group scene. Other examples of translation include the
   connection of a group of hosts speaking only IP/UDP to a group of
   hosts that understand only ST-II, or the packet-by-packet encoding
   translation of video streams from individual sources without
   resynchronization or mixing. Details of the operation of mixers and
   translators are given in Section 7.
2.4 Layered Encodings

   Multimedia applications should be able to adjust the transmission
   rate to match the capacity of the receiver or to adapt to network
   congestion. Many implementations place the responsibility of
   rate-adaptivity at the source. This does not work well with multicast
   transmission because of the conflicting bandwidth requirements of
   heterogeneous receivers. The result is often a least-common
   denominator scenario, where the smallest pipe in the network mesh
   dictates the quality and fidelity of the overall live multimedia
   "broadcast".

   Instead, responsibility for rate-adaptation can be placed at the
   receivers by combining a layered encoding with a layered transmission
   system. In the context of RTP over IP multicast, the source can
   stripe the progressive layers of a hierarchically represented signal
   across multiple RTP sessions each carried on its own multicast group.
   Receivers can then adapt to network heterogeneity and control their
   reception bandwidth by joining only the appropriate subset of the
   multicast groups.

   Details of the use of RTP with layered encodings are given in
   Sections 6.3.9, 8.3 and 10.

3 Definitions

   RTP payload: The data transported by RTP in a packet, for example
        audio samples or compressed video data. The payload format and
        interpretation are beyond the scope of this document.

   RTP packet: A data packet consisting of the fixed RTP header, a
        possibly empty list of contributing sources (see below), and the
        payload data. Some underlying protocols may require an
        encapsulation of the RTP packet to be defined. Typically one
        packet of the underlying protocol contains a single RTP packet,
        but several RTP packets MAY be contained if permitted by the
        encapsulation method (see Section 10).
   RTCP packet: A control packet consisting of a fixed header part
        similar to that of RTP data packets, followed by structured
        elements that vary depending upon the RTCP packet type. The
        formats are defined in Section 6. Typically, multiple RTCP
        packets are sent together as a compound RTCP packet in a single
        packet of the underlying protocol; this is enabled by the length
        field in the fixed header of each RTCP packet.

   Port: The "abstraction that transport protocols use to distinguish
        among multiple destinations within a given host computer. TCP/IP
        protocols identify ports using small positive integers." [4] The
        transport selectors (TSEL) used by the OSI transport layer are
        equivalent to ports. RTP depends upon the lower-layer protocol
        to provide some mechanism such as ports to multiplex the RTP and
        RTCP packets of a session.

   Transport address: The combination of a network address and port that
        identifies a transport-level endpoint, for example an IP address
        and a UDP port. Packets are transmitted from a source transport
        address to a destination transport address.

   RTP media type: An RTP media type is the collection of payload types
        which can be carried within a single RTP session. The RTP
        Profile assigns RTP media types to RTP payload types.

   RTP session: The association among a set of participants
        communicating with RTP. For each participant, the session is
        defined by a particular pair of destination transport addresses
        (one network address plus a port pair for RTP and RTCP). The
        destination transport address pair may be common for all
        participants, as in the case of IP multicast, or may be
        different for each, as in the case of individual unicast network
        addresses and port pairs. In a multimedia session, each medium
        is carried in a separate RTP session with its own RTCP packets.
        The multiple RTP sessions are distinguished by different port
        number pairs and/or different multicast addresses.

   Synchronization source (SSRC): The source of a stream of RTP packets,
        identified by a 32-bit numeric SSRC identifier carried in the
        RTP header so as not to be dependent upon the network address.
        All packets from a synchronization source form part of the same
        timing and sequence number space, so a receiver groups packets
        by synchronization source for playback. Examples of
        synchronization sources include the sender of a stream of
        packets derived from a signal source such as a microphone or a
        camera, or an RTP mixer (see below). A synchronization source
        may change its data format, e.g., audio encoding, over time. The
        SSRC identifier is a randomly chosen value meant to be globally
        unique within a particular RTP session (see Section 8). A
        participant need not use the same SSRC identifier for all the
        RTP sessions in a multimedia session; the binding of the SSRC
        identifiers is provided through RTCP (see Section 6.5.1). If a
        participant generates multiple streams in one RTP session, for
        example from separate video cameras, each MUST be identified as
        a different SSRC.

   Contributing source (CSRC): A source of a stream of RTP packets that
        has contributed to the combined stream produced by an RTP mixer
        (see below). The mixer inserts a list of the SSRC identifiers of
        the sources that contributed to the generation of a particular
        packet into the RTP header of that packet. This list is called
        the CSRC list. An example application is audio conferencing
        where a mixer indicates all the talkers whose speech was
        combined to produce the outgoing packet, allowing the receiver
        to indicate the current talker, even though all the audio
        packets contain the same SSRC identifier (that of the mixer).
   End system: An application that generates the content to be sent in
        RTP packets and/or consumes the content of received RTP packets.
        An end system can act as one or more synchronization sources in
        a particular RTP session, but typically only one.

   Mixer: An intermediate system that receives RTP packets from one or
        more sources, possibly changes the data format, combines the
        packets in some manner and then forwards a new RTP packet. Since
        the timing among multiple input sources will not generally be
        synchronized, the mixer will make timing adjustments among the
        streams and generate its own timing for the combined stream.
        Thus, all data packets originating from a mixer will be
        identified as having the mixer as their synchronization source.

   Translator: An intermediate system that forwards RTP packets with
        their synchronization source identifier intact. Examples of
        translators include devices that convert encodings without
        mixing, replicators from multicast to unicast, and
        application-level filters in firewalls.

   Monitor: An application that receives RTCP packets sent by
        participants in an RTP session, in particular the reception
        reports, and estimates the current quality of service for
        distribution monitoring, fault diagnosis and long-term
        statistics. The monitor function is likely to be built into the
        application(s) participating in the session, but may also be a
        separate application that does not otherwise participate and
        does not send or receive the RTP data packets. These are called
        third party monitors.

   Non-RTP means: Protocols and mechanisms that may be needed in
        addition to RTP to provide a usable service.
        In particular, for multimedia conferences, a conference control
        application may distribute multicast addresses and keys for
        encryption, negotiate the encryption algorithm to be used, and
        define dynamic mappings between RTP payload type values and the
        payload formats they represent for formats that do not have a
        predefined payload type value. For simple applications,
        electronic mail or a conference database may also be used. The
        specification of such protocols and mechanisms is outside the
        scope of this document.

4 Byte Order, Alignment, and Time Format

   All integer fields are carried in network byte order, that is, most
   significant byte (octet) first. This byte order is commonly known as
   big-endian. The transmission order is described in detail in [5].
   Unless otherwise noted, numeric constants are in decimal (base 10).

   All header data is aligned to its natural length, i.e., 16-bit fields
   are aligned on even offsets, 32-bit fields are aligned at offsets
   divisible by four, etc. Octets designated as padding have the value
   zero.

   Wallclock time (absolute date and time) is represented using the
   timestamp format of the Network Time Protocol (NTP), which is in
   seconds relative to 0h UTC on 1 January 1900 [6]. The full resolution
   NTP timestamp is a 64-bit unsigned fixed-point number with the
   integer part in the first 32 bits and the fractional part in the last
   32 bits. In some fields where a more compact representation is
   appropriate, only the middle 32 bits are used; that is, the low 16
   bits of the integer part and the high 16 bits of the fractional part.
   The high 16 bits of the integer part must be determined
   independently.

   The NTP timestamp will wrap around to zero some time in the year
   2036, but for RTP purposes, only differences between pairs of NTP
   timestamps are used.
   So long as the pairs of timestamps can be assumed to be within 68
   years of each other, using modulo arithmetic for subtractions and
   comparisons makes the wraparound irrelevant.

5 RTP Data Transfer Protocol

5.1 RTP Fixed Header Fields

   The RTP header has the following format:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |V=2|P|X|  CC   |M|     PT      |       sequence number         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                           timestamp                           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |           synchronization source (SSRC) identifier            |
   +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
   |            contributing source (CSRC) identifiers             |
   |                             ....                              |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   The first twelve octets are present in every RTP packet, while the
   list of CSRC identifiers is present only when inserted by a mixer.
   The fields have the following meaning:

   version (V): 2 bits
      This field identifies the version of RTP. The version defined by
      this specification is two (2). (The value 1 is used by the first
      draft version of RTP and the value 0 is used by the protocol
      initially implemented in the "vat" audio tool.)

   padding (P): 1 bit
      If the padding bit is set, the packet contains one or more
      additional padding octets at the end which are not part of the
      payload. The last octet of the padding contains a count of how
      many padding octets should be ignored, including itself.
      Padding may be needed by some encryption algorithms with fixed
      block sizes or for carrying several RTP packets in a lower-layer
      protocol data unit.

   extension (X): 1 bit
      If the extension bit is set, the fixed header MUST be followed
      by exactly one header extension, with a format defined in
      Section 5.3.1.
   CSRC count (CC): 4 bits
      The CSRC count contains the number of CSRC identifiers that
      follow the fixed header.

   marker (M): 1 bit
      The interpretation of the marker is defined by a profile. It is
      intended to allow significant events such as frame boundaries to
      be marked in the packet stream. A profile MAY define additional
      marker bits or specify that there is no marker bit by changing
      the number of bits in the payload type field (see Section 5.3).

   payload type (PT): 7 bits
      This field identifies the format of the RTP payload and
      determines its interpretation by the application. A profile MAY
      specify a default static mapping of payload type codes to
      payload formats. Additional payload type codes MAY be defined
      dynamically through non-RTP means (see Section 3). An initial
      set of default mappings for audio and video is specified in the
      companion RFC 1890 (updated by Internet-Draft
      draft-ietf-avt-profile-new), and may be extended in future
      editions of the Assigned Numbers RFC [7]. An RTP sender emits a
      single RTP payload type at any given time; this field SHOULD NOT
      be used for multiplexing separate media streams (see Section
      5.2).

      A receiver MUST ignore packets with payload types that it does
      not understand.

   sequence number: 16 bits
      The sequence number increments by one for each RTP data packet
      sent, and may be used by the receiver to detect packet loss and
      to restore packet sequence. The initial value of the sequence
      number SHOULD be random (unpredictable) to make known-plaintext
      attacks on encryption more difficult, even if the source itself
      does not encrypt according to the method in Section 9.1, because
      the packets may flow through a translator that does. Techniques
      for choosing unpredictable numbers are discussed in [8].

   timestamp: 32 bits
      The timestamp reflects the sampling instant of the first octet
      in the RTP data packet.
      The sampling instant MUST be derived from a clock that
      increments monotonically and linearly in time to allow
      synchronization and jitter calculations (see Section 6.4.1). The
      resolution of the clock MUST be sufficient for the desired
      synchronization accuracy and for measuring packet arrival jitter
      (one tick per video frame is typically not sufficient). The
      clock frequency is dependent on the format of data carried as
      payload and is specified statically in the profile or payload
      format specification that defines the format, or MAY be
      specified dynamically for payload formats defined through
      non-RTP means. If RTP packets are generated periodically, the
      nominal sampling instant as determined from the sampling clock
      is to be used, not a reading of the system clock. As an example,
      for fixed-rate audio the timestamp clock would likely increment
      by one for each sampling period. If an audio application reads
      blocks covering 160 sampling periods from the input device, the
      timestamp would be increased by 160 for each such block,
      regardless of whether the block is transmitted in a packet or
      dropped as silent.

      The initial value of the timestamp SHOULD be random, as for the
      sequence number. Several consecutive RTP packets will have equal
      timestamps if they are (logically) generated at once, e.g.,
      belong to the same video frame. Consecutive RTP packets MAY
      contain timestamps that are not monotonic if the data is not
      transmitted in the order it was sampled, as in the case of MPEG
      interpolated video frames. (The sequence numbers of the packets
      as transmitted will still be monotonic.)

   SSRC: 32 bits
      The SSRC field identifies the synchronization source. This
      identifier SHOULD be chosen randomly, with the intent that no
      two synchronization sources within the same RTP session will
      have the same SSRC identifier.
      An example algorithm for generating a random identifier is
      presented in Appendix A.6. Although the probability of multiple
      sources choosing the same identifier is low, all RTP
      implementations must be prepared to detect and resolve
      collisions. Section 8 describes the probability of collision
      along with a mechanism for resolving collisions and detecting
      RTP-level forwarding loops based on the uniqueness of the SSRC
      identifier. If a source changes its source transport address, it
      must also choose a new SSRC identifier to avoid being
      interpreted as a looped source (see Section 8.2).

   CSRC list: 0 to 15 items, 32 bits each
      The CSRC list identifies the contributing sources for the
      payload contained in this packet. The number of identifiers is
      given by the CC field. If there are more than 15 contributing
      sources, only 15 can be identified. CSRC identifiers are
      inserted by mixers (see Section 7.1), using the SSRC identifiers
      of contributing sources. For example, for audio packets the SSRC
      identifiers of all sources that were mixed together to create a
      packet are listed, allowing correct talker indication at the
      receiver.

5.2 Multiplexing RTP Sessions

   For efficient protocol processing, the number of multiplexing points
   should be minimized, as described in the integrated layer processing
   design principle [1]. In RTP, multiplexing is provided by the
   destination transport address (network address and port number),
   which defines an RTP session. For example, in a teleconference
   composed of audio and video media encoded separately, each medium
   SHOULD be carried in a separate RTP session with its own destination
   transport address. Separate audio and video streams SHOULD NOT be
   carried in a single RTP session and demultiplexed based on the
   payload type or SSRC fields.
   Interleaving packets with different RTP media types but using the
   same SSRC would introduce several problems:

   1. If, say, two audio streams shared the same RTP session and the
      same SSRC value, and one were to change encodings and thus
      acquire a different RTP payload type, there would be no general
      way of identifying which stream had changed encodings.

   2. An SSRC is defined to identify a single timing and sequence
      number space. Interleaving multiple payload types would require
      different timing spaces if the media clock rates differ and
      would require different sequence number spaces to tell which
      payload type suffered packet loss.

   3. The RTCP sender and receiver reports (see Section 6.4) can only
      describe one timing and sequence number space per SSRC and do
      not carry a payload type field.

   4. An RTP mixer would not be able to combine interleaved streams of
      incompatible media into one stream.

   5. Carrying multiple media in one RTP session precludes: the use of
      different network paths or network resource allocations if
      appropriate; reception of a subset of the media if desired, for
      example just audio if video would exceed the available
      bandwidth; and receiver implementations that use separate
      processes for the different media, whereas using separate RTP
      sessions permits either single- or multiple-process
      implementations.

   Using a different SSRC for each medium but sending them in the same
   RTP session would avoid the first three problems but not the last
   two.

5.3 Profile-Specific Modifications to the RTP Header

   The existing RTP data packet header is believed to be complete for
   the set of functions required in common across all the application
   classes that RTP might support.
   However, in keeping with the ALF design principle, the header MAY be
   tailored through modifications or additions defined in a profile
   specification while still allowing profile-independent monitoring
   and recording tools to function.

   o  The marker bit and payload type field carry profile-specific
      information, but they are allocated in the fixed header since
      many applications are expected to need them and might otherwise
      have to add another 32-bit word just to hold them. The octet
      containing these fields MAY be redefined by a profile to suit
      different requirements, for example with more or fewer marker
      bits. If there are any marker bits, one SHOULD be located in the
      most significant bit of the octet since profile-independent
      monitors may be able to observe a correlation between packet
      loss patterns and the marker bit.

   o  Additional information that is required for a particular payload
      format, such as a video encoding, SHOULD be carried in the
      payload section of the packet. This might be in a header that is
      always present at the start of the payload section, or might be
      indicated by a reserved value in the data pattern.

   o  If a particular class of applications needs additional
      functionality independent of payload format, the profile under
      which those applications operate SHOULD define additional fixed
      fields to follow immediately after the SSRC field of the
      existing fixed header. Those applications will be able to
      quickly and directly access the additional fields while
      profile-independent monitors or recorders can still process the
      RTP packets by interpreting only the first twelve octets.

   If it turns out that additional functionality is needed in common
   across all profiles, then a new version of RTP should be defined to
   make a permanent change to the fixed header.
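
   As an illustration of how a profile-independent tool might read the
   first twelve octets described in Section 5.1, the following Python
   sketch decodes the fixed header and any CSRC identifiers. It is not
   part of this specification. The names RtpHeader and
   parse_rtp_header are hypothetical, and the sketch assumes the
   default layout of one marker bit and a 7-bit payload type, which a
   profile may redefine. It does not interpret padding or a header
   extension.

```python
import struct
from dataclasses import dataclass
from typing import List


@dataclass
class RtpHeader:
    """Fields of the RTP fixed header (hypothetical container type)."""
    version: int
    padding: bool
    extension: bool
    csrc_count: int
    marker: bool
    payload_type: int
    sequence_number: int
    timestamp: int
    ssrc: int
    csrc: List[int]


def parse_rtp_header(packet: bytes) -> RtpHeader:
    """Decode the 12-octet fixed header and the CSRC list after it."""
    if len(packet) < 12:
        raise ValueError("shorter than the 12-octet fixed header")
    # "!" selects network byte order: two octets, one 16-bit field,
    # then the 32-bit timestamp and SSRC.
    b0, b1, seq, ts, ssrc = struct.unpack("!BBHII", packet[:12])
    version = b0 >> 6
    if version != 2:
        raise ValueError("unsupported RTP version %d" % version)
    cc = b0 & 0x0F                    # CSRC count, 0 to 15
    end = 12 + 4 * cc
    if len(packet) < end:
        raise ValueError("packet truncated inside the CSRC list")
    csrc = list(struct.unpack("!%dI" % cc, packet[12:end]))
    return RtpHeader(
        version=version,
        padding=bool(b0 & 0x20),
        extension=bool(b0 & 0x10),
        csrc_count=cc,
        marker=bool(b1 & 0x80),       # assumes a single marker bit
        payload_type=b1 & 0x7F,       # assumes a 7-bit payload type
        sequence_number=seq,
        timestamp=ts,
        ssrc=ssrc,
        csrc=csrc,
    )
```

   Because only the first octets are examined, a monitor built this
   way keeps working when a profile appends fixed fields after the
   SSRC, although it would then need profile knowledge to locate the
   CSRC list and payload.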

5.3.1 RTP Header Extension

   An extension mechanism is provided to allow individual
   implementations to experiment with new payload-format-independent
   functions that require additional information to be carried in the
   RTP data packet header. This mechanism is designed so that the
   header extension may be ignored by other interoperating
   implementations that have not been extended.

   Note that this header extension is intended only for limited use.
   Most potential uses of this mechanism would be better done another
   way, using the methods described in the previous section. For
   example, a profile-specific extension to the fixed header is less
   expensive to process because it is neither conditional nor in a
   variable location. Additional information required for a particular
   payload format SHOULD NOT use this header extension, but SHOULD be
   carried in the payload section of the packet.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |      defined by profile       |           length              |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                        header extension                       |
   |                             ....                              |

   If the X bit in the RTP header is one, a variable-length header
   extension MUST be appended to the RTP header, following the CSRC
   list if present. The header extension contains a 16-bit length field
   that counts the number of 32-bit words in the extension, excluding
   the four-octet extension header (therefore zero is a valid length).
   Only a single extension can be appended to the RTP data header. To
   allow multiple interoperating implementations to each experiment
   independently with different header extensions, or to allow a
   particular implementation to experiment with more than one type of
   header extension, the first 16 bits of the header extension are left
   open for distinguishing identifiers or parameters.
   The format of these 16 bits is to be defined by the profile
   specification under which the implementations are operating. This
   RTP specification does not define any header extensions itself.

6 RTP Control Protocol -- RTCP

   The RTP control protocol (RTCP) is based on the periodic
   transmission of control packets to all participants in the session,
   using the same distribution mechanism as the data packets. The
   underlying protocol MUST provide multiplexing of the data and
   control packets, for example using separate port numbers with UDP.
   RTCP performs four functions:

   1. The primary function is to provide feedback on the quality of
      the data distribution. This is an integral part of RTP's role as
      a transport protocol and is related to the flow and congestion
      control functions of other transport protocols. The feedback may
      be directly useful for control of adaptive encodings [9,10], but
      experiments with IP multicasting have shown that it is also
      critical to get feedback from the receivers to diagnose faults
      in the distribution. Sending reception feedback reports to all
      participants allows one who is observing problems to evaluate
      whether those problems are local or global. With a distribution
      mechanism like IP multicast, it is also possible for an entity
      such as a network service provider who is not otherwise involved
      in the session to receive the feedback information and act as a
      third-party monitor to diagnose network problems. This feedback
      function is performed by the RTCP sender and receiver reports,
      described below in Section 6.4.

   2. RTCP carries a persistent transport-level identifier for an RTP
      source called the canonical name or CNAME (see Section 6.5.1).
      Since the SSRC identifier may change if a conflict is discovered
      or a program is restarted, receivers require the CNAME to keep
      track of each participant.
      Receivers may also require the CNAME to associate multiple data
      streams from a given participant in a set of related RTP
      sessions, for example to synchronize audio and video.
      Inter-media synchronization also requires the NTP and RTP
      timestamps included in RTCP packets by data senders.

   3. The first two functions require that all participants send RTCP
      packets; therefore the rate must be controlled in order for RTP
      to scale up to a large number of participants. By having each
      participant send its control packets to all the others, each can
      independently observe the number of participants. This number is
      used to calculate the rate at which the packets are sent, as
      explained in Section 6.2.

   4. A fourth, OPTIONAL function is to convey minimal session control
      information, for example participant identification to be
      displayed in the user interface. This is most likely to be
      useful in "loosely controlled" sessions where participants enter
      and leave without membership control or parameter negotiation.
      RTCP serves as a convenient channel to reach all the
      participants, but it is not necessarily expected to support all
      the control communication requirements of an application. A
      higher-level session control protocol, which is beyond the scope
      of this document, may be needed.

   Functions 1-3 SHOULD be used in all environments, but particularly
   in the IP multicast environment. RTP application designers SHOULD
   avoid mechanisms that can only work in unicast mode and will not
   scale to larger numbers. Transmission of RTCP MAY be controlled
   separately for senders and receivers, as described in Section 6.2,
   for cases such as unidirectional links where feedback from receivers
   is not possible.

6.1 RTCP Packet Format

   This specification defines several RTCP packet types to carry a
   variety of control information:

   SR: Sender report, for transmission and reception statistics from
      participants that are active senders

   RR: Receiver report, for reception statistics from participants
      that are not active senders

   SDES: Source description items, including CNAME

   BYE: Indicates end of participation

   APP: Application-specific functions

   Each RTCP packet begins with a fixed part similar to that of RTP
   data packets, followed by structured elements that MAY be of
   variable length according to the packet type but MUST end on a
   32-bit boundary. The alignment requirement and a length field in the
   fixed part of each packet are included to make RTCP packets
   "stackable". Multiple RTCP packets can be concatenated without any
   intervening separators to form a compound RTCP packet that is sent
   in a single packet of the lower layer protocol, for example UDP.
   There is no explicit count of individual RTCP packets in the
   compound packet since the lower layer protocols are expected to
   provide an overall length to determine the end of the compound
   packet.

   Each individual RTCP packet in the compound packet may be processed
   independently with no requirements upon the order or combination of
   packets. However, in order to perform the functions of the protocol,
   the following constraints are imposed:

   o  Reception statistics (in SR or RR) should be sent as often as
      bandwidth constraints will allow to maximize the resolution of
      the statistics; therefore each periodically transmitted compound
      RTCP packet MUST include a report packet.

   o  New receivers need to receive the CNAME for a source as soon as
      possible to identify the source and to begin associating media
      for purposes such as lip-sync, so each compound RTCP packet MUST
      also include the SDES CNAME.

   o  The number of packet types that may appear first in the compound
      packet needs to be limited to increase the number of constant
      bits in the first word and the probability of successfully
      validating RTCP packets against misaddressed RTP data packets or
      other unrelated packets.

   Thus, all RTCP packets MUST be sent in a compound packet of at least
   two individual packets, with the following format RECOMMENDED:

   Encryption prefix: If and only if the compound packet is to be
      encrypted according to the method in Section 9.1, it MUST be
      prefixed by a random 32-bit quantity redrawn for every compound
      packet transmitted. If padding is required for the encryption,
      it MUST be added to the last packet of the compound packet.

   SR or RR: The first RTCP packet in the compound packet MUST always
      be a report packet to facilitate header validation as described
      in Appendix A.2. This is true even if no data has been sent nor
      received, in which case an empty RR MUST be sent, and even if
      the only other RTCP packet in the compound packet is a BYE.

   Additional RRs: If the number of sources for which reception
      statistics are being reported exceeds 31, the number that will
      fit into one SR or RR packet, then additional RR packets SHOULD
      follow the initial report packet.

   SDES: An SDES packet containing a CNAME item MUST be included in
      each compound RTCP packet. Other source description items MAY
      optionally be included if required by a particular application,
      subject to bandwidth constraints (see Section 6.3.9).

   BYE or APP: Other RTCP packet types, including those yet to be
      defined, MAY follow in any order, except that BYE SHOULD be the
      last packet sent with a given SSRC/CSRC. Packet types MAY appear
      more than once.
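
   The ordering rules above can be checked mechanically. The following
   Python sketch is illustrative only and is not part of this
   specification. It relies on details defined in later sections
   rather than in the text above: the common RTCP header (version in
   the top two bits, an 8-bit packet type, and a 16-bit length counted
   in 32-bit words minus one) and the packet type values 200 to 204.
   The function names are hypothetical. The sketch assumes an
   unencrypted compound packet with no 32-bit prefix, and it only
   checks that an SDES packet is present without parsing it for a
   CNAME item.

```python
import struct
from typing import List, Tuple

# Packet type values as assigned later in the specification.
RTCP_SR, RTCP_RR, RTCP_SDES, RTCP_BYE, RTCP_APP = 200, 201, 202, 203, 204


def split_compound(data: bytes) -> List[Tuple[int, bytes]]:
    """Split a compound RTCP packet into (packet type, octets) pairs."""
    packets = []
    offset = 0
    while offset < len(data):
        if len(data) - offset < 4:
            raise ValueError("truncated RTCP header")
        first, ptype, words = struct.unpack_from("!BBH", data, offset)
        if first >> 6 != 2:
            raise ValueError("bad RTCP version")
        size = 4 * (words + 1)        # length field is in words minus one
        if offset + size > len(data):
            raise ValueError("RTCP packet runs past the end of the data")
        packets.append((ptype, data[offset:offset + size]))
        offset += size
    return packets


def check_compound(data: bytes) -> List[int]:
    """Return the packet types in order, or raise if a rule is broken."""
    types = [ptype for ptype, _ in split_compound(data)]
    if len(types) < 2:
        raise ValueError("compound packet needs at least two packets")
    if types[0] not in (RTCP_SR, RTCP_RR):
        raise ValueError("first packet must be SR or RR")
    if RTCP_SDES not in types:
        raise ValueError("compound packet must carry an SDES packet")
    return types
```

   Since each individual packet carries its own length and ends on a
   32-bit boundary, the splitter needs no separators and stops when
   the lower-layer payload is exhausted.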

   It is RECOMMENDED that translators and mixers combine individual
   RTCP packets from the multiple sources they are forwarding into one
   compound packet whenever feasible in order to amortize the packet
   overhead (see Section 7). An example RTCP compound packet as might
   be produced by a mixer is shown in Fig. 1. If the overall length of
   a compound packet would exceed the maximum transmission unit (MTU)
   of the network path, it SHOULD be segmented into multiple shorter
   compound packets to be transmitted in separate packets of the
   underlying protocol. Note that each of the compound packets MUST
   begin with an SR or RR packet.

   An implementation SHOULD ignore incoming RTCP packets with types
   unknown to it. Additional RTCP packet types may be registered with
   the Internet Assigned Numbers Authority (IANA) as described in
   Section 11.3.

   if encrypted: random 32-bit integer
    |
    |[------- packet -------][----------- packet -----------][-packet-]
    |
    |              receiver            chunk        chunk
    V              reports           item  item    item  item
   --------------------------------------------------------------------
   |R[SR|# sender #site#site][SDES|# CNAME PHONE |#CNAME LOC][BYE##why]
   |R[  |# report #  1 #  2 ][    |#             |#         ][   ##   ]
   |R[  |#        #    #    ][    |#             |#         ][   ##   ]
   |R[  |#        #    #    ][    |#             |#         ][   ##   ]
   --------------------------------------------------------------------
   |<------------------  UDP packet (compound packet) --------------->|

   #: SSRC/CSRC

             Figure 1: Example of an RTCP compound packet

6.2 RTCP Transmission Interval

   RTP is designed to allow an application to scale automatically over
   session sizes ranging from a few participants to thousands. For
   example, in an audio conference the data traffic is inherently
   self-limiting because only one or two people will speak at a time,
   so with multicast distribution the data rate on any given link
   remains relatively constant independent of the number of
   participants.
   However, the control traffic is not self-limiting. If the reception
   reports from each participant were sent at a constant rate, the
   control traffic would grow linearly with the number of participants.
   Therefore, the rate must be scaled down by dynamically calculating
   the interval between RTCP packet transmissions.

   For each session, it is assumed that the data traffic is subject to
   an aggregate limit called the "session bandwidth" to be divided
   among the participants. This bandwidth might be reserved and the
   limit enforced by the network. If there is no reservation, there may
   be other constraints, depending on the environment, that establish
   the "reasonable" maximum for the session to use, and that would be
   the session bandwidth. The session bandwidth may be chosen based on
   some cost or a priori knowledge of the available network bandwidth
   for the session. It is somewhat independent of the media encoding,
   but the encoding choice may be limited by the session bandwidth.
   Often, the session bandwidth is the sum of the nominal bandwidths of
   the senders expected to be concurrently active. For teleconference
   audio, this number would typically be one sender's bandwidth. For
   layered encodings, each layer is a separate RTP session with its own
   session bandwidth parameter.

   The session bandwidth parameter is expected to be supplied by a
   session management application when it invokes a media application,
   but media applications MAY also set a default based on the
   single-sender data bandwidth for the encoding selected for the
   session. The application may also enforce bandwidth limits based on
   multicast scope rules or other criteria.

   Bandwidth calculations for control and data traffic include
   lower-layer transport and network protocols (e.g., UDP and IP) since
   that is what the resource reservation system would need to know.
   The application can also be expected to know which of these
   protocols are in use. Link level headers are not included in the
   calculation since the packet will be encapsulated with different
   link level headers as it travels.

   The control traffic should be limited to a small and known fraction
   of the session bandwidth: small so that the primary function of the
   transport protocol to carry data is not impaired; known so that the
   control traffic can be included in the bandwidth specification given
   to a resource reservation protocol, and so that each participant can
   independently calculate its share. It is RECOMMENDED that the
   fraction of the session bandwidth allocated to RTCP be fixed at 5%.
   It is also RECOMMENDED that 1/4 of the RTCP bandwidth be dedicated
   to participants that are sending data so that in sessions with a
   large number of receivers but a small number of senders, newly
   joining participants will more quickly receive the CNAME for the
   sending sites. When the proportion of senders is greater than 1/4 of
   the participants, the senders get their proportion of the full RTCP
   bandwidth. While the values of these and other constants in the
   interval calculation are not critical, all participants in the
   session MUST use the same values so the same interval will be
   calculated. Therefore, these constants SHOULD be fixed for a
   particular profile.

   A profile MAY specify that the control traffic bandwidth may be a
   separate parameter of the session rather than a strict percentage of
   the session bandwidth. Using a separate parameter allows
   rate-adaptive applications to set an RTCP bandwidth consistent with
   a "typical" data bandwidth that is lower than the maximum bandwidth
   specified by the session bandwidth parameter.
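
   As a worked illustration of the recommended fractions, the
   following Python sketch splits the RTCP bandwidth between senders
   and non-senders. It is not part of this specification. The function
   name and its parameters are hypothetical, and the sketch assumes at
   least one member. It does not cover how a session with no senders
   at all is handled, which is left to the algorithm in Section 6.3.

```python
from typing import Tuple


def rtcp_bandwidth_shares(session_bw: float, members: int, senders: int,
                          rtcp_fraction: float = 0.05,
                          sender_fraction: float = 0.25
                          ) -> Tuple[float, float, float]:
    """Return (rtcp_bw, sender_bw, receiver_bw) in the units of session_bw.

    sender_bw is the aggregate share for all data senders and
    receiver_bw the aggregate share for all other participants.
    """
    if members < 1 or not 0 <= senders <= members:
        raise ValueError("need members >= 1 and 0 <= senders <= members")
    rtcp_bw = session_bw * rtcp_fraction
    if senders <= members * sender_fraction:
        # Few senders: dedicate the fixed fraction (1/4) to them.
        sender_bw = rtcp_bw * sender_fraction
    else:
        # Many senders: they get their proportion of the full RTCP bandwidth.
        sender_bw = rtcp_bw * senders / members
    return rtcp_bw, sender_bw, rtcp_bw - sender_bw
```

   For example, with a 64000 bit/s session, 100 members and 2 senders,
   the RTCP budget is about 3200 bit/s, of which about 800 bit/s is
   shared by the two senders and about 2400 bit/s by the 98 receivers.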

   The profile MAY further specify that the control traffic bandwidth
   may be divided into two separate session parameters for those
   participants which are active data senders and those which are not.
   Following the recommendation that 1/4 of the RTCP bandwidth be
   dedicated to data senders, the RECOMMENDED default values for these
   two parameters would be 1.25% and 3.75%, respectively. When the
   proportion of senders is greater than 1/4 of the participants, the
   senders get their proportion of the sum of these parameters. Using
   two parameters allows RTCP reception reports to be turned off
   entirely for a particular session by setting the RTCP bandwidth for
   non-data-senders to zero while keeping the RTCP bandwidth for data
   senders non-zero so that sender reports can still be sent for
   inter-media synchronization. This may be appropriate for systems
   operating on unidirectional links or for sessions that don't require
   feedback on the quality of reception.

   The calculated interval between transmissions of compound RTCP
   packets SHOULD also have a lower bound to avoid having bursts of
   packets exceed the allowed bandwidth when the number of participants
   is small and the traffic isn't smoothed according to the law of
   large numbers. It also keeps the report interval from becoming too
   small during transient outages like a network partition such that
   adaptation is delayed when the partition heals. At application
   startup, a delay SHOULD be imposed before the first compound RTCP
   packet is sent to allow time for RTCP packets to be received from
   other participants so the report interval will converge to the
   correct value more quickly. This delay MAY be set to half the
   minimum interval to allow quicker notification that the new
   participant is present. The RECOMMENDED value for a fixed minimum
   interval is 5 seconds.

   An implementation MAY scale the minimum RTCP interval to a smaller
   value inversely proportional to the session bandwidth parameter with
   the following limitations:

   o  For multicast sessions, only active data senders MAY use the
      reduced minimum value to calculate the interval for transmission
      of compound RTCP packets.

   o  For unicast sessions, the reduced value MAY be used by
      participants that are not active data senders as well, and the
      delay before sending the initial compound RTCP packet MAY be
      zero.

   o  For all sessions, the fixed minimum SHOULD be used when
      calculating the participant timeout interval (see Section 6.3.5)
      so that implementations which do not use the reduced value for
      transmitting RTCP packets are not timed out by other
      participants prematurely.

   o  The RECOMMENDED value for the reduced minimum in seconds is 360
      divided by the session bandwidth in kilobits/second. This
      minimum is smaller than 5 seconds for bandwidths greater than 72
      kb/s.

   The algorithm described in Section 6.3 and Appendix A.7 was designed
   to meet the goals outlined above. It calculates the interval between
   sending compound RTCP packets to divide the allowed control traffic
   bandwidth among the participants. This allows an application to
   provide fast response for small sessions where, for example,
   identification of all participants is important, yet automatically
   adapt to large sessions. The algorithm incorporates the following
   characteristics:

   o  The calculated interval between RTCP packets scales linearly with
      the number of members in the group. It is this linear factor
      which allows for a constant amount of control traffic when
      summed across all members.

   o  The interval between RTCP packets is varied randomly over the
      range [0.5,1.5] times the calculated interval to avoid
      unintended synchronization of all participants [11]. The first
      RTCP packet sent after joining a session is also delayed by a
      random variation of half the minimum RTCP interval.

   o  A dynamic estimate of the average compound RTCP packet size is
      calculated, including all those received and sent, to
      automatically adapt to changes in the amount of control
      information carried.

   o  Since the calculated interval is dependent on the number of
      observed group members, there may be undesirable startup effects
      when a new user joins an existing session, or many users
      simultaneously join a new session. These new users will
      initially have incorrect estimates of the group membership, and
      thus their RTCP transmission interval will be too short. This
      problem can be significant if many users join the session
      simultaneously. To deal with this, an algorithm called "timer
      reconsideration" is employed. This algorithm implements a simple
      back-off mechanism which causes users to hold back RTCP packet
      transmission if the group sizes are increasing.

   o  When users leave a session, either with a BYE or by timeout, the
      group membership decreases, and thus the calculated interval
      should decrease. A "reverse reconsideration" algorithm is used
      to allow members to more quickly reduce their intervals in
      response to group membership decreases.

   o  BYE packets are given different treatment than other RTCP
      packets. When a user leaves a group, and wishes to send a BYE
      packet, it may do so before its next scheduled RTCP packet.
      However, transmission of BYEs follows a back-off algorithm which
      avoids floods of BYE packets should a large number of members
      simultaneously leave the session.
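
   The first three characteristics, together with the minimum interval
   discussed earlier in this section, can be sketched in Python as
   follows. This is a simplified illustration and not the algorithm of
   Section 6.3 or Appendix A.7: it omits timer reconsideration,
   reverse reconsideration, the BYE back-off, and the split between
   senders and receivers. The function names and parameters are
   hypothetical. Capping the reduced minimum at the fixed 5 seconds is
   an assumption drawn from the statement that the minimum may be
   scaled to a smaller value.

```python
import random
from typing import Callable, Optional


def rtcp_min_interval(session_bw_kbps: Optional[float] = None,
                      fixed_minimum: float = 5.0) -> float:
    """Minimum interval in seconds: fixed, or reduced to 360 / bandwidth."""
    if session_bw_kbps is None:
        return fixed_minimum
    # Assumption: the reduced value never exceeds the fixed minimum.
    return min(fixed_minimum, 360.0 / session_bw_kbps)


def rtcp_interval(members: int, avg_rtcp_size: float, rtcp_bw: float,
                  t_min: float = 5.0, initial: bool = False,
                  rng: Callable[[], float] = random.random) -> float:
    """Randomized interval in seconds before the next compound packet.

    avg_rtcp_size is the running average compound packet size in
    octets and rtcp_bw the RTCP bandwidth in octets per second.
    """
    if initial:
        t_min = t_min / 2.0           # first packet: half the minimum
    # Linear scaling keeps total control traffic constant across members.
    t = members * avg_rtcp_size / rtcp_bw
    t = max(t, t_min)
    # rng() is uniform on [0, 1), so the factor covers [0.5, 1.5).
    return t * (0.5 + rng())
```

   With 1000 members, an average compound packet of 100 octets and an
   RTCP bandwidth of 400 octets per second, the deterministic interval
   is 250 seconds, and the interval actually used falls between 125
   and 375 seconds.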
1114 This algorithm may be used for sessions in which all participants are 1115 allowed to send. In that case, the session bandwidth parameter is the 1116 product of the individual sender's bandwidth times the number of 1117 participants, and the RTCP bandwidth is 5% of that. 1119 Details of the algorithm's operation are given in the sections that 1120 follow. Appendix A.7 gives an example implementation. 1122 6.2.1 Maintaining the number of session members 1124 Calculation of the RTCP packet interval depends upon an estimate of 1125 the number of sites participating in the session. New sites are added 1126 to the count when they are heard, and an entry for each SHOULD be 1127 created in a table indexed by the SSRC or CSRC identifier (see 1128 Section 8.2) to keep track of them. New entries MAY be considered not 1129 valid until multiple packets carrying the new SSRC have been received 1130 (see Appendix A.1). Entries MAY be deleted from the table when an 1131 RTCP BYE packet with the corresponding SSRC identifier is received, 1132 except that some straggler data packets might arrive after the BYE 1133 and cause the entry to be recreated. Instead, the entry SHOULD be 1134 marked as having received a BYE and then deleted after an appropriate 1135 delay. 1137 A participant may mark another site inactive, or delete it if not yet 1138 valid, if no RTP or RTCP packet has been received for a small number 1139 of RTCP report intervals (5 is RECOMMENDED). This provides some 1140 robustness against packet loss. All sites must calculate roughly the 1141 same value for the RTCP report interval in order for this timeout to 1142 work properly. 1144 Once a site has been validated, then if it is later marked inactive 1145 the state for that site SHOULD still be retained and the site SHOULD 1146 continue to be counted in the total number of sites sharing RTCP 1147 bandwidth for a period long enough to span typical network 1148 partitions. 
This is to avoid excessive traffic, when the partition 1149 heals, due to an RTCP report interval that is too small. A timeout of 1150 30 minutes is RECOMMENDED. Note that this is still larger than 5 1151 times the largest value to which the RTCP report interval is expected 1152 to usefully scale, about 2-5 minutes. 1154 For sessions with a very large number of participants, it may be 1155 impractical to maintain a table to store the SSRC identifier and 1156 state information for all of them. An implementation MAY use SSRC 1157 sampling, as described in draft-ietf-avt-rtpsample, to reduce 1158 the storage requirements. An implementation MAY use any other 1159 algorithm with similar performance. A key requirement is that any 1160 algorithm considered SHOULD NOT substantially underestimate the group 1161 size, although it MAY overestimate. 1163 6.3 RTCP Packet Send and Receive Rules 1165 The rules for how to send an RTCP packet, and what to do when 1166 receiving one, are outlined here. An implementation that allows operation in 1167 a multicast environment MUST meet the scalability goals described in 1168 Section 6.2. Such an implementation MAY use an algorithm other than 1169 the one defined here so long as it provides equivalent or better 1170 performance. 1172 To execute these rules, a session participant must maintain several 1173 pieces of state: 1175 tp: the last time an RTCP packet was transmitted; 1177 tc: the current time; 1179 tn: the next scheduled transmission time of an RTCP packet; 1181 pmembers: the estimated number of session members at time tp; 1183 members: the most current estimate for the number of session members; 1185 senders: the most current estimate for the number of senders in the 1186 session; 1188 rtcp_bw: The target RTCP bandwidth, i.e., the total bandwidth that 1189 will be used for RTCP packets by all members of this session, in 1190 octets per second.
This should be 5% of the "session bandwidth" 1191 parameter supplied to the application at startup. 1193 we_sent: Flag that is true if the application has sent data since the 1194 2nd previous RTCP report was transmitted. 1196 avg_rtcp_size: The average compound RTCP packet size, in octets, over 1197 all RTCP packets sent and received by this participant. 1199 initial: Flag that is true if the application has not yet sent an 1200 RTCP packet. 1202 Many of these rules make use of the "calculated interval" between 1203 packet transmissions. This interval is described in the following 1204 section. 1206 6.3.1 Computing the RTCP transmission interval 1208 To maintain scalability, the average interval between packets from a 1209 session participant should scale with the group size. This interval 1210 is called the calculated interval. It is obtained by combining a 1211 number of the pieces of state described above. The calculated 1212 interval T is then determined as follows: 1214 1. If there are any senders (senders > 0) in the session, but 1215 the number of senders is less than 25% of the membership 1216 (members), the interval depends on whether the participant 1217 is a sender or not (based on the value of we_sent). If the 1218 participant is a sender (we_sent true), the constant C is 1219 set to the average RTCP packet size (avg_rtcp_size) divided 1220 by 25% of the RTCP bandwidth (rtcp_bw), and the constant n 1221 is set to the number of senders. If we_sent is not true, 1222 the constant C is set to the average RTCP packet size 1223 divided by 75% of the RTCP bandwidth. The constant n is set 1224 to the number of receivers (members - senders). If the 1225 number of senders is 25% of the membership or greater, senders and 1226 receivers are treated together. The constant C is set to 1227 the average RTCP packet size divided by the total RTCP bandwidth and n is set to the total number 1228 of members. 1230 2.
If the participant has not yet sent an RTCP packet (the 1231 variable initial is true), the constant Tmin is set to 2.5 1232 seconds, else it is set to 5 seconds. 1234 3. The deterministic calculated interval Td is set to 1235 max(Tmin, n*C). 1237 4. The calculated interval T is set to a number uniformly 1238 distributed between 0.5 and 1.5 times the deterministic 1239 calculated interval. 1241 This procedure results in an interval which is random, but which, on 1242 average, gives 25% of the RTCP bandwidth to senders, and 75% to 1243 receivers. 1245 6.3.2 Initialization 1247 Upon joining the session, the participant initializes tp to 0, tc to 1248 0, senders to 0, pmembers to 1, members to 1, we_sent to false, 1249 rtcp_bw to 5% of the session bandwidth, initial to true, and 1250 avg_rtcp_size to the size of the very first packet constructed by the 1251 application. The calculated interval T is then computed, and the 1252 first packet is scheduled for time tn = T. This means that a 1253 transmission timer is set which expires at time T. Note that an 1254 application MAY use any desired approach for implementing this timer. 1256 The participant adds their own SSRC to the member table. 1258 6.3.3 Receiving an RTP or non-BYE RTCP packet 1260 When an RTP or RTCP packet is received from a participant whose SSRC 1261 is not in the member table, the SSRC is added to the table, and the 1262 value for members is updated. 1264 When an RTP packet is received from a participant whose SSRC is not 1265 in the sender table, the SSRC is added to the table, and the value 1266 for senders is updated. 1268 For each compound RTCP packet received, the value of avg_rtcp_size is 1269 updated: avg_rtcp_size = (1/16)*packet_size + (15/16)* avg_rtcp_size, 1270 where packet_size is the size of the RTCP packet just received. 
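The interval computation of Section 6.3.1 and the packet size averaging of Section 6.3.3 can be sketched as follows. This is an illustrative sketch only, not the example implementation of Appendix A.7. The function names and the injectable rng parameter are assumptions of this example, and the combined sender/receiver case is assumed to use the average packet size divided by the total RTCP bandwidth, by analogy with the separate sender and receiver cases.

```python
import random


def calculated_interval(members, senders, rtcp_bw, we_sent, avg_rtcp_size,
                        initial, rng=random.random):
    """Return (Td, T) in seconds, following steps 1-4 of Section 6.3.1.

    rtcp_bw is in octets per second and avg_rtcp_size in octets.
    rng must return a float in [0, 1); it is a parameter only so that
    the randomization can be made deterministic in tests.
    """
    n = members
    bw = rtcp_bw
    # Step 1: split the bandwidth 25%/75% when senders are a minority.
    if 0 < senders < 0.25 * members:
        if we_sent:
            bw = 0.25 * rtcp_bw
            n = senders
        else:
            bw = 0.75 * rtcp_bw
            n = members - senders
    c = avg_rtcp_size / bw
    # Step 2: halve the minimum before the first RTCP packet is sent.
    tmin = 2.5 if initial else 5.0
    # Step 3: deterministic calculated interval.
    td = max(tmin, n * c)
    # Step 4: uniform randomization over [0.5, 1.5) times Td.
    t = td * (0.5 + rng())
    return td, t


def update_avg_rtcp_size(avg_rtcp_size, packet_size):
    """Apply avg = (1/16)*packet_size + (15/16)*avg."""
    return packet_size / 16.0 + 15.0 * avg_rtcp_size / 16.0
```

For example, a receiver in a session of 100 members with 10 senders, an RTCP bandwidth of 1000 octets/second and an average packet size of 100 octets obtains Td = 90 * 100 / 750 = 12 seconds.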
1272 6.3.4 Receiving an RTCP BYE packet 1274 Except as described in Section 6.3.7 for the case when an RTCP BYE is 1275 to be transmitted, if the received packet is an RTCP BYE packet, the 1276 SSRC is checked against the member table. If present, the entry is 1277 removed from the table, and the value for members is updated. The 1278 SSRC is then checked against the sender table. If present, the entry 1279 is removed from the table, and the value for senders is updated. 1281 Furthermore, to make the transmission rate of RTCP packets more 1282 adaptive to changes in group membership, the following "reverse 1283 reconsideration" algorithm SHOULD be executed when a BYE packet is 1284 received: 1286 o The value for tn is updated according to the following 1287 formula: tn = tc + (members/pmembers)(tn - tc). 1289 o The value for tp is updated according to the following formula: 1290 tp = tc - (members/pmembers)(tc - tp). 1292 o The next RTCP packet is rescheduled for transmission at time 1293 tn, which is now earlier. 1295 o The value of pmembers is set equal to members. 1297 This algorithm does not prevent the group size estimate from 1298 incorrectly dropping to zero for a short time when most participants 1299 of a large session leave at once but some remain. The algorithm does 1300 make the estimate return to the correct value more rapidly. This 1301 situation is unusual enough and the consequences are sufficiently 1302 harmless that this problem is deemed only a secondary concern. 1304 6.3.5 Timing Out an SSRC 1306 At occasional intervals, the participant MUST check to see if any of 1307 the other participants time out. To do this, the participant computes 1308 the deterministic calculated interval (without the randomization 1309 factor) Td. Any other session member who has not sent a packet since 1310 time tc - M*Td (M is the timeout multiplier, and defaults to 5) is 1311 timed out.
This means that their SSRC is removed from the member 1312 list, and members is updated. A similar check is performed on the 1313 sender list. Any member on the sender list who has not sent an RTP 1314 packet since time tc - 2T (within the last two RTCP report intervals) 1315 is removed from the sender list, and senders is updated. 1317 If any members time out, the reverse reconsideration algorithm 1318 described in Section 6.3.4 SHOULD be performed. 1320 The participant MUST perform this check at least once per RTCP 1321 transmission interval. 1323 6.3.6 Expiration of transmission timer 1325 When the packet transmission timer expires, the participant performs 1326 the following operations: 1328 o The transmission interval T is computed, including the 1329 randomization factor and a factor e - 3/2 = 1.21828 times the 1330 rtcp_bw to compensate for the fact that the unconditional 1331 reconsideration algorithm converges to a value below the 1332 intended average. 1334 o If tp + T is less than or equal to tc, an RTCP packet is 1335 transmitted. tp is set to tc, and tn is set to tc + T. The 1336 transmission timer is set to expire again at time tn. If tp + T 1337 is greater than tc, pmembers is set to members, and tn is set 1338 to tp + T. No RTCP packet is transmitted. The transmission 1339 timer is set to expire at time tn. 1341 If an RTCP packet is transmitted, the value of initial is set to 1342 FALSE. Furthermore, the value of avg_rtcp_size is updated: 1343 avg_rtcp_size = (1/16)*packet_size + (15/16)* avg_rtcp_size, where 1344 packet_size is the size of the RTCP packet just transmitted. 1346 6.3.7 Transmitting a BYE packet 1348 When a participant wishes to leave a session, a BYE packet is 1349 transmitted to inform the other participants of the event.
In order 1350 to avoid a flood of BYE packets when many participants leave the 1351 system, a participant MUST execute the following algorithm if the 1352 number of members is more than 50 when the participant chooses to 1353 leave. This algorithm usurps the normal role of the members variable 1354 to count BYE packets instead: 1356 o When the participant decides to leave the system, tp is reset 1357 to tc, the current time, members and pmembers are initialized 1358 to 1, initial is set to 1, we_sent is set to 0, senders is set 1359 to 0, and avg_rtcp_size is set to the size of the BYE packet. 1360 The calculated interval T is computed. The BYE packet is then 1361 scheduled for time tn = tc + T. 1363 o Every time a BYE packet from another participant is received, 1364 members is incremented by 1 regardless of whether that 1365 participant exists in the member table or not, and when SSRC 1366 sampling is in use, regardless of whether the BYE SSRC matches 1367 the key or not. members is NOT incremented when other RTCP 1368 packets or RTP packets are received, but only for BYE packets. 1370 o Transmission of the BYE packet then follows the rules for 1371 transmitting a regular RTCP packet, as above. 1373 This allows BYE packets to be sent right away, yet controls their 1374 total bandwidth usage. In the worst case, this could cause RTCP 1375 control packets to use twice the bandwidth as normal (10%) -- 5% for 1376 non BYE RTCP packets and 5% for BYE. 1378 A participant that does not want to wait for the above mechanism to 1379 allow transmission of a BYE packet MAY leave the group without 1380 sending a BYE at all. That participant will eventually be timed out 1381 by the other group members. 1383 If the group size estimate members is less than 50 when the 1384 participant decides to leave, the participant MAY send a BYE packet 1385 immediately. Alternatively, the participant MAY choose to execute the 1386 above BYE backoff algorithm. 
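The BYE back-off described above can be sketched as a small state machine. This is a minimal sketch under stated assumptions: the ByeState class, the function names, and the interval callable (which stands in for the calculated interval T as a function of the current members count) are all hypothetical and introduced only for this illustration.

```python
from dataclasses import dataclass


@dataclass
class ByeState:
    tp: float          # time the back-off was (re)started
    tn: float          # scheduled transmission time of the BYE
    members: int = 1   # reused here to count BYE packets, as in the text
    pmembers: int = 1


def start_bye_backoff(tc, interval):
    """Reset the state when the participant decides to leave at time tc."""
    return ByeState(tp=tc, tn=tc + interval(1))


def on_bye_received(state):
    """Count every BYE heard from another participant."""
    state.members += 1


def on_timer_expired(state, tc, interval):
    """Return True if the BYE may be sent now, else reschedule the timer.

    Follows the transmission timer rule of Section 6.3.6: send when
    tp + T <= tc, otherwise set pmembers = members and tn = tp + T.
    """
    t = interval(state.members)
    if state.tp + t <= tc:
        return True
    state.pmembers = state.members
    state.tn = state.tp + t
    return False
```

A participant that hears many BYE packets while waiting sees members grow, so T grows and its own BYE is deferred; a participant that hears none sends at its first timer expiry.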
1388 In either case, a participant which never sent an RTP or RTCP packet 1389 MUST NOT send a BYE packet when they leave the group. 1391 6.3.8 Updating we_sent 1393 The variable we_sent contains true if the participant has sent an RTP 1394 packet recently, false otherwise. This determination is made by using 1395 the same mechanisms for managing the senders table and sending SR 1396 packets. If the participant sends an RTP packet when we_sent is 1397 false, it adds itself to the sender table and sets we_sent to true. 1398 Every time another RTP packet is sent, the time of transmission of 1399 that packet is maintained in the table. The normal sender timeout 1400 algorithm is then applied to the participant -- if an RTP packet has 1401 not been transmitted since time tc - 2T, the participant removes 1402 itself from the sender table, decrements the sender count, and sets 1403 we_sent to false. 1405 6.3.9 Allocation of source description bandwidth 1407 This specification defines several source description (SDES) items in 1408 addition to the mandatory CNAME item, such as NAME (personal name) 1409 and EMAIL (email address). It also provides a means to define new 1410 application-specific RTCP packet types. Applications should exercise 1411 caution in allocating control bandwidth to this additional 1412 information because it will slow down the rate at which reception 1413 reports and CNAME are sent, thus impairing the performance of the 1414 protocol. It is RECOMMENDED that no more than 20% of the RTCP 1415 bandwidth allocated to a single participant be used to carry the 1416 additional information. Furthermore, it is not intended that all 1417 SDES items will be included in every application. Those that are 1418 included SHOULD be assigned a fraction of the bandwidth according to 1419 their utility. 
Rather than estimate these fractions dynamically, it 1420 is recommended that the percentages be translated statically into 1421 report interval counts based on the typical length of an item. 1423 For example, an application may be designed to send only CNAME, NAME 1424 and EMAIL and not any others. NAME might be given much higher 1425 priority than EMAIL because the NAME would be displayed continuously 1426 in the application's user interface, whereas EMAIL would be displayed 1427 only when requested. At every RTCP interval, an RR packet and an SDES 1428 packet with the CNAME item would be sent. For a small session 1429 operating at the minimum interval, that would be every 5 seconds on 1430 the average. Every third interval (15 seconds), one extra item would 1431 be included in the SDES packet. Seven out of eight times this would 1432 be the NAME item, and every eighth time (2 minutes) it would be the 1433 EMAIL item. 1435 When multiple applications operate in concert using cross-application 1436 binding through a common CNAME for each participant, for example in a 1437 multimedia conference composed of an RTP session for each medium, the 1438 additional SDES information MAY be sent in only one RTP session. The 1439 other sessions would carry only the CNAME item. In particular, this 1440 approach should be applied to the multiple sessions of a layered 1441 encoding scheme (see Section 2.4). 1443 6.4 Sender and Receiver Reports 1445 RTP receivers provide reception quality feedback using RTCP report 1446 packets which may take one of two forms depending upon whether or not 1447 the receiver is also a sender. The only difference between the sender 1448 report (SR) and receiver report (RR) forms, besides the packet type 1449 code, is that the sender report includes a 20-byte sender information 1450 section for use by active senders. 
The SR is issued if a site has 1451 sent any data packets during the interval since issuing the last 1452 report or the previous one, otherwise the RR is issued. 1454 Both the SR and RR forms include zero or more reception report 1455 blocks, one for each of the synchronization sources from which this 1456 receiver has received RTP data packets since the last report. Reports 1457 are not issued for contributing sources listed in the CSRC list. Each 1458 reception report block provides statistics about the data received 1459 from the particular source indicated in that block. Since a maximum 1460 of 31 reception report blocks will fit in an SR or RR packet, 1461 additional RR packets MAY be stacked after the initial SR or RR 1462 packet as needed to contain the reception reports for all sources 1463 heard during the interval since the last report. 1465 The next sections define the formats of the two reports, how they may 1466 be extended in a profile-specific manner if an application requires 1467 additional feedback information, and how the reports may be used. 1468 Details of reception reporting by translators and mixers are given in 1469 Section 7.
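Because at most 31 reception report blocks fit in one SR or RR packet, a sender of reports has to distribute its blocks over an initial packet and stacked RR packets. The following sketch shows one way to do the grouping; the function and constant names are assumptions for this example, and the blocks are treated as opaque objects.

```python
MAX_REPORT_BLOCKS = 31  # limit stated in the text above


def split_report_blocks(blocks):
    """Split reception report blocks into per-packet groups.

    The first group belongs to the initial SR or RR packet, any further
    groups to RR packets stacked after it. With no blocks at all a
    single empty group is returned, which corresponds to RC = 0.
    """
    groups = [blocks[i:i + MAX_REPORT_BLOCKS]
              for i in range(0, len(blocks), MAX_REPORT_BLOCKS)]
    return groups or [[]]
```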
1471 6.4.1 SR: Sender report RTCP packet
1472  0                   1                   2                   3
1473  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
1474 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1475 |V=2|P|    RC   |   PT=SR=200   |             length            | header
1476 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1477 |                         SSRC of sender                        |
1478 +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
1479 |              NTP timestamp, most significant word             | sender
1480 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ info
1481 |             NTP timestamp, least significant word             |
1482 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1483 |                         RTP timestamp                         |
1484 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1485 |                     sender's packet count                     |
1486 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1487 |                      sender's octet count                     |
1488 +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
1489 |                 SSRC_1 (SSRC of first source)                 | report
1490 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ block
1491 | fraction lost |       cumulative number of packets lost       | 1
1492 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1493 |           extended highest sequence number received           |
1494 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1495 |                      interarrival jitter                      |
1496 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1497 |                         last SR (LSR)                         |
1498 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1499 |                   delay since last SR (DLSR)                  |
1500 +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
1501 |                 SSRC_2 (SSRC of second source)                | report
1502 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ block
1503 :                              ...                              : 2
1504 +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
1505 |                  profile-specific extensions                  |
1506 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

1508 The sender report packet consists of three sections, possibly 1509 followed by a fourth profile-specific extension section if defined. 1510 The first section, the header, is 8 octets long. The fields have the 1511 following meaning: 1513 version (V): 2 bits 1514 Identifies the version of RTP, which is the same in RTCP packets 1515 as in RTP data packets. The version defined by this 1516 specification is two (2). 1518 padding (P): 1 bit 1519 If the padding bit is set, this individual RTCP packet contains 1520 some additional padding octets at the end which are not part of 1521 the control information but are included in the length field. 1522 The last octet of the padding is a count of how many padding 1523 octets should be ignored, including itself (it will be a 1524 multiple of four). Padding may be needed by some encryption 1525 algorithms with fixed block sizes. In a compound RTCP packet, 1526 padding is only required on one individual packet because the 1527 compound packet is encrypted as a whole for the method in 1528 Section 9.1. Thus, padding MUST only be added to the last 1529 individual packet, and if padding is added to that packet, the 1530 padding bit MUST be set only on that packet. This convention 1531 aids the header validity checks described in Appendix A.2 and 1532 allows detection of packets from some early implementations that 1533 incorrectly set the padding bit on the first individual packet 1534 and add padding to the last individual packet. 1536 reception report count (RC): 5 bits 1537 The number of reception report blocks contained in this packet. 1538 A value of zero is valid. 1540 packet type (PT): 8 bits 1541 Contains the constant 200 to identify this as an RTCP SR packet.
1543 length: 16 bits 1544 The length of this RTCP packet in 32-bit words minus one, 1545 including the header and any padding. (The offset of one makes 1546 zero a valid length and avoids a possible infinite loop in 1547 scanning a compound RTCP packet, while counting 32-bit words 1548 avoids a validity check for a multiple of 4.) 1550 SSRC: 32 bits 1551 The synchronization source identifier for the originator of this 1552 SR packet. 1554 The second section, the sender information, is 20 octets long and is 1555 present in every sender report packet. It summarizes the data 1556 transmissions from this sender. The fields have the following 1557 meaning: 1559 NTP timestamp: 64 bits 1560 Indicates the wallclock time (see Section 4) when this report 1561 was sent so that it may be used in combination with timestamps 1562 returned in reception reports from other receivers to measure 1563 round-trip propagation to those receivers. Receivers should 1564 expect that the measurement accuracy of the timestamp may be 1565 limited to far less than the resolution of the NTP timestamp. 1566 The measurement uncertainty of the timestamp is not indicated as 1567 it may not be known. On a system that has no notion of 1568 wallclock time but does have some system-specific clock such as 1569 "system uptime", a sender MAY use that clock as a reference to 1570 calculate relative NTP timestamps. It is important to choose a 1571 commonly used clock so that if separate implementations are used 1572 to produce the individual streams of a multimedia session, all 1573 implementations will use the same clock. Until the year 2036, 1574 relative and absolute timestamps will differ in the high bit so 1575 (invalid) comparisons will show a large difference; by then one 1576 hopes relative timestamps will no longer be needed. A sender 1577 that has no notion of wallclock or elapsed time MAY set the NTP 1578 timestamp to zero. 
1580 RTP timestamp: 32 bits 1581 Corresponds to the same time as the NTP timestamp (above), but 1582 in the same units and with the same random offset as the RTP 1583 timestamps in data packets. This correspondence may be used for 1584 intra- and inter-media synchronization for sources whose NTP 1585 timestamps are synchronized, and may be used by media- 1586 independent receivers to estimate the nominal RTP clock 1587 frequency. Note that in most cases this timestamp will not be 1588 equal to the RTP timestamp in any adjacent data packet. Rather, 1589 it MUST be calculated from the corresponding NTP timestamp using 1590 the relationship between the RTP timestamp counter and real time 1591 as maintained by periodically checking the wallclock time at a 1592 sampling instant. 1594 sender's packet count: 32 bits 1595 The total number of RTP data packets transmitted by the sender 1596 since starting transmission up until the time this SR packet was 1597 generated. The count SHOULD be reset if the sender changes its 1598 SSRC identifier. 1600 sender's octet count: 32 bits 1601 The total number of payload octets (i.e., not including header 1602 or padding) transmitted in RTP data packets by the sender since 1603 starting transmission up until the time this SR packet was 1604 generated. The count SHOULD be reset if the sender changes its 1605 SSRC identifier. This field can be used to estimate the average 1606 payload data rate. 1608 The third section contains zero or more reception report blocks 1609 depending on the number of other sources heard by this sender since 1610 the last report. Each reception report block conveys statistics on 1611 the reception of RTP packets from a single synchronization source. 1612 Receivers SHOULD NOT carry over statistics when a source changes its 1613 SSRC identifier due to a collision. 
These statistics are: 1615 SSRC_n (source identifier): 32 bits 1616 The SSRC identifier of the source to which the information in 1617 this reception report block pertains. 1619 fraction lost: 8 bits 1620 The fraction of RTP data packets from source SSRC_n lost since 1621 the previous SR or RR packet was sent, expressed as a fixed 1622 point number with the binary point at the left edge of the 1623 field. (That is equivalent to taking the integer part after 1624 multiplying the loss fraction by 256.) This fraction is defined 1625 to be the number of packets lost divided by the number of 1626 packets expected, as defined in the next paragraph. An 1627 implementation is shown in Appendix A.3. If the loss is 1628 negative due to duplicates, the fraction lost is set to zero. 1629 Note that a receiver cannot tell whether any packets were lost 1630 after the last one received, and that there will be no reception 1631 report block issued for a source if all packets from that source 1632 sent during the last reporting interval have been lost. 1634 cumulative number of packets lost: 24 bits 1635 The total number of RTP data packets from source SSRC_n that 1636 have been lost since the beginning of reception. This number is 1637 defined to be the number of packets expected less the number of 1638 packets actually received, where the number of packets received 1639 includes any which are late or duplicates. Thus packets that 1640 arrive late are not counted as lost, and the loss may be 1641 negative if there are duplicates. The number of packets 1642 expected is defined to be the extended last sequence number 1643 received, as defined next, less the initial sequence number 1644 received. This may be calculated as shown in Appendix A.3. 
1646 extended highest sequence number received: 32 bits 1647 The low 16 bits contain the highest sequence number received in 1648 an RTP data packet from source SSRC_n, and the most significant 1649 16 bits extend that sequence number with the corresponding count 1650 of sequence number cycles, which may be maintained according to 1651 the algorithm in Appendix A.1. Note that different receivers 1652 within the same session will generate different extensions to 1653 the sequence number if their start times differ significantly. 1655 interarrival jitter: 32 bits 1656 An estimate of the statistical variance of the RTP data packet 1657 interarrival time, measured in timestamp units and expressed as 1658 an unsigned integer. The interarrival jitter J is defined to be 1659 the mean deviation (smoothed absolute value) of the difference D 1660 in packet spacing at the receiver compared to the sender for a 1661 pair of packets. As shown in the equation below, this is 1662 equivalent to the difference in the "relative transit time" for 1663 the two packets; the relative transit time is the difference 1664 between a packet's RTP timestamp and the receiver's clock at the 1665 time of arrival, measured in the same units. 1667 If S_i is the RTP timestamp from packet i, and R_i is the time of 1668 arrival in RTP timestamp units for packet i, then for two packets i 1669 and j, D may be expressed as D(i,j) = (R_j - R_i) - (S_j - S_i) = 1670 (R_j - S_j) - (R_i - S_i) 1672 The interarrival jitter SHOULD be calculated continuously as each 1673 data packet i is received from source SSRC_n, using this difference D 1674 for that packet and the previous packet i-1 in order of arrival (not 1675 necessarily in sequence), according to the formula J(i) = J(i-1) + 1676 (|D(i-1,i)| - J(i-1))/16 1677 Whenever a reception report is issued, the current value of J is 1678 sampled.
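The running jitter estimate defined by the formula above can be sketched as follows. This is a minimal illustration, not the sample implementation referenced in the appendix: the class and method names are assumptions, the arrival time is assumed to be already expressed in RTP timestamp units, and 32-bit timestamp wrap-around is deliberately ignored.

```python
class JitterEstimator:
    """Running interarrival jitter J for one source, in timestamp units."""

    def __init__(self):
        self.jitter = 0.0
        self.prev_transit = None  # relative transit time of previous packet

    def update(self, rtp_timestamp, arrival):
        """Process one packet and return the updated jitter estimate.

        The relative transit time is arrival - rtp_timestamp, and D is
        the difference of the transit times of consecutive packets in
        order of arrival.
        """
        transit = arrival - rtp_timestamp
        if self.prev_transit is not None:
            d = abs(transit - self.prev_transit)
            # Gain 1/16 from the jitter formula.
            self.jitter += (d - self.jitter) / 16.0
        self.prev_transit = transit
        return self.jitter
```

When a reception report is built, the current value of the jitter attribute would be sampled and truncated to an unsigned integer for the 32-bit field.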
1680 The jitter calculation MUST conform to the formula specified here in 1681 order to allow profile-independent monitors to make valid 1682 interpretations of reports coming from different implementations. 1683 This algorithm is the optimal first-order estimator and the gain 1684 parameter 1/16 gives a good noise reduction ratio while maintaining a 1685 reasonable rate of convergence [13]. A sample implementation is 1686 shown in Appendix A.8. 1688 last SR timestamp (LSR): 32 bits 1689 The middle 32 bits out of 64 in the NTP timestamp (as explained 1690 in Section 4) received as part of the most recent RTCP sender 1691 report (SR) packet from source SSRC_n. If no SR has been 1692 received yet, the field is set to zero. 1694 delay since last SR (DLSR): 32 bits 1695 The delay, expressed in units of 1/65536 seconds, between 1696 receiving the last SR packet from source SSRC_n and sending this 1697 reception report block. If no SR packet has been received yet 1698 from SSRC_n, the DLSR field is set to zero. 1700 Let SSRC_r denote the receiver issuing this receiver report. Source 1701 SSRC_n can compute the round-trip propagation delay to SSRC_r by recording 1702 the time A when this reception report block is received. It 1703 calculates the total round-trip time A - LSR using the last SR 1704 timestamp (LSR) field, and then subtracts the DLSR field to leave the 1705 round-trip propagation delay as (A - LSR - DLSR). This is illustrated 1706 in Fig. 2. 1708 This may be used as an approximate measure of distance to cluster 1709 receivers, although some links have very asymmetric delays.
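The round-trip computation can be written out with the values of Figure 2. The sketch below is illustrative only; the function names are assumptions, and all three quantities are taken to be 32-bit values with 16 bits of seconds and 16 bits of fraction, as the LSR and DLSR definitions imply.

```python
def ntp_middle_32(ntp_sec, ntp_frac):
    """Return the middle 32 bits of a 64-bit NTP timestamp (LSR format)."""
    return ((ntp_sec & 0xFFFF) << 16) | ((ntp_frac >> 16) & 0xFFFF)


def round_trip_delay(a, lsr, dlsr):
    """Return (A - LSR - DLSR) in seconds.

    a, lsr and dlsr are 32-bit values in units of 1/65536 seconds.
    The subtraction is done modulo 2**32 so that a wrap of the 16-bit
    seconds part between LSR and A does not produce a negative result.
    """
    delay = (a - lsr - dlsr) & 0xFFFFFFFF
    return delay / 65536.0


# Values from Figure 2.
lsr = ntp_middle_32(0xB44DB705, 0x20000000)          # 0xb705:2000
rtt = round_trip_delay(0xB7108000, lsr, 0x00054000)  # 6.125 seconds
```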
1711 [10 Nov 1995 11:33:25.125] [10 Nov 1995 11:33:36.5] 1712 n SR(n) A=b710:8000 (46864.500 s) 1713 ----------------------------------------------------------------> 1714 v ^ 1715 ntp_sec =0xb44db705 v ^ dlsr=0x0005.4000 ( 5.250s) 1716 ntp_frac=0x20000000 v ^ lsr =0xb705:2000 (46853.125s) 1717 (3024992016.125 s) v ^ 1718 r v ^ RR(n) 1719 ----------------------------------------------------------------> 1720 |<-DLSR->| 1721 (5.250 s) 1723 A 0xb710:8000 (46864.500 s) 1724 DLSR -0x0005:4000 ( 5.250 s) 1725 LSR -0xb705:2000 (46853.125 s) 1726 ------------------------------- 1727 delay 0x 6:2000 ( 6.125 s) 1729 Figure 2: Example for round-trip time computation 1731 6.4.2 RR: Receiver report RTCP packet 1732 0 1 2 3 1733 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1734 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1735 |V=2|P| RC | PT=RR=201 | length | header 1736 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1737 | SSRC of packet sender | 1738 +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+ 1739 | SSRC_1 (SSRC of first source) | report 1740 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ block 1741 | fraction lost | cumulative number of packets lost | 1 1742 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1743 | extended highest sequence number received | 1744 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1745 | interarrival jitter | 1746 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1747 | last SR (LSR) | 1748 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1749 | delay since last SR (DLSR) | 1750 +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+ 1751 | SSRC_2 (SSRC of second source) | report 1752 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ block 1753 : ... 
                                                                         :   2
1754     +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
1755     |                  profile-specific extensions                  |
1756     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

1758     The format of the receiver report (RR) packet is the same as that of
1759     the SR packet except that the packet type field contains the constant
1760     201 and the five words of sender information are omitted (these are
1761     the NTP and RTP timestamps and sender's packet and octet counts). The
1762     remaining fields have the same meaning as for the SR packet.

1764     An empty RR packet (RC = 0) MUST be put at the head of a compound
1765     RTCP packet when there is no data transmission or reception to
1766     report.

1768  6.4.3 Extending the sender and receiver reports

1770     A profile SHOULD define profile-specific extensions to the sender
1771     report and receiver report if there is additional information that
1772     needs to be reported regularly about the sender or receivers. This
1773     method SHOULD be used in preference to defining another RTCP packet
1774     type because it requires less overhead:

1776        o  fewer octets in the packet (no RTCP header or SSRC field);

1778        o  simpler and faster parsing because applications running under
1779           that profile would be programmed to always expect the extension
1780           fields in the directly accessible location after the reception
1781           reports.

1783     The extension is a fourth section in the sender- or receiver-report
1784     packet which comes at the end after the reception report blocks, if
1785     any. If additional sender information is required, then for sender
1786     reports it would be included first in the extension section, but for
1787     receiver reports it would not be present. If information about
1788     receivers is to be included, that data SHOULD be structured as an
1789     array of blocks parallel to the existing array of reception report
1790     blocks; that is, the number of blocks would be indicated by the RC
1791     field.
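The RR layout of Section 6.4.2 and the placement of the profile-specific extension can be sketched in Python as follows. This is a minimal, non-normative sketch. The names ReportBlock and parse_rr are hypothetical. It assumes that the length field counts 32-bit words minus one and that the cumulative loss is a signed 24-bit quantity, as for the SR packet in Section 6.4.1. It does not strip padding signalled by the P bit, so any padding octets would be returned as part of the extension.

```python
import struct
from typing import List, NamedTuple, Tuple

RTCP_RR = 201          # packet type constant from Section 6.4.2
REPORT_BLOCK_LEN = 24  # six 32-bit words per reception report block


class ReportBlock(NamedTuple):
    ssrc: int
    fraction_lost: int
    cumulative_lost: int
    ext_highest_seq: int
    jitter: int
    lsr: int
    dlsr: int


def parse_rr(packet: bytes) -> Tuple[int, List[ReportBlock], bytes]:
    """Split one RR packet into sender SSRC, report blocks and extension."""
    if len(packet) < 8:
        raise ValueError("RR packet shorter than header plus SSRC")
    first, pt, length = struct.unpack_from("!BBH", packet, 0)
    if (first >> 6) != 2 or pt != RTCP_RR:
        raise ValueError("not an RTCP RR packet")
    rc = first & 0x1F                 # reception report count
    end = (length + 1) * 4            # length is in 32-bit words minus one
    if end > len(packet) or 8 + rc * REPORT_BLOCK_LEN > end:
        raise ValueError("length field inconsistent with packet contents")
    (sender_ssrc,) = struct.unpack_from("!I", packet, 4)
    blocks = []
    offset = 8
    for _ in range(rc):
        ssrc, lost, seq, jitter, lsr, dlsr = struct.unpack_from(
            "!6I", packet, offset)
        cumulative = lost & 0xFFFFFF
        if cumulative & 0x800000:     # sign-extend the 24-bit value
            cumulative -= 1 << 24
        blocks.append(ReportBlock(ssrc, lost >> 24, cumulative,
                                  seq, jitter, lsr, dlsr))
        offset += REPORT_BLOCK_LEN
    # Whatever follows the RC report blocks is the profile-specific extension.
    return sender_ssrc, blocks, packet[offset:end]
```

Because the extension always starts immediately after the RC report blocks, a profile-aware application can locate it with a single offset computation, which is the parsing advantage noted in Section 6.4.3.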
1793 6.4.4 Analyzing sender and receiver reports 1795 It is expected that reception quality feedback will be useful not 1796 only for the sender but also for other receivers and third-party 1797 monitors. The sender may modify its transmissions based on the 1798 feedback; receivers can determine whether problems are local, 1799 regional or global; network managers may use profile-independent 1800 monitors that receive only the RTCP packets and not the corresponding 1801 RTP data packets to evaluate the performance of their networks for 1802 multicast distribution. 1804 Cumulative counts are used in both the sender information and 1805 receiver report blocks so that differences may be calculated between 1806 any two reports to make measurements over both short and long time 1807 periods, and to provide resilience against the loss of a report. The 1808 difference between the last two reports received can be used to 1809 estimate the recent quality of the distribution. The NTP timestamp is 1810 included so that rates may be calculated from these differences over 1811 the interval between two reports. Since that timestamp is independent 1812 of the clock rate for the data encoding, it is possible to implement 1813 encoding- and profile-independent quality monitors. 1815 An example calculation is the packet loss rate over the interval 1816 between two reception reports. The difference in the cumulative 1817 number of packets lost gives the number lost during that interval. 1818 The difference in the extended last sequence numbers received gives 1819 the number of packets expected during the interval. The ratio of 1820 these two is the packet loss fraction over the interval. This ratio 1821 should equal the fraction lost field if the two reports are 1822 consecutive, but otherwise it may not. The loss rate per second can 1823 be obtained by dividing the loss fraction by the difference in NTP 1824 timestamps, expressed in seconds. 
The number of packets received is 1825 the number of packets expected minus the number lost. The number of 1826 packets expected may also be used to judge the statistical validity 1827 of any loss estimates. For example, 1 out of 5 packets lost has a 1828 lower significance than 200 out of 1000. 1830 From the sender information, a third-party monitor can calculate the 1831 average payload data rate and the average packet rate over an 1832 interval without receiving the data. Taking the ratio of the two 1833 gives the average payload size. If it can be assumed that packet loss 1834 is independent of packet size, then the number of packets received by 1835 a particular receiver times the average payload size (or the 1836 corresponding packet size) gives the apparent throughput available to 1837 that receiver. 1839 In addition to the cumulative counts which allow long-term packet 1840 loss measurements using differences between reports, the fraction 1841 lost field provides a short-term measurement from a single report. 1842 This becomes more important as the size of a session scales up enough 1843 that reception state information might not be kept for all receivers 1844 or the interval between reports becomes long enough that only one 1845 report might have been received from a particular receiver. 1847 The interarrival jitter field provides a second short-term measure of 1848 network congestion. Packet loss tracks persistent congestion while 1849 the jitter measure tracks transient congestion. The jitter measure 1850 may indicate congestion before it leads to packet loss. Since the 1851 interarrival jitter field is only a snapshot of the jitter at the 1852 time of a report, it may be necessary to analyze a number of reports 1853 from one receiver over time or from multiple receivers, e.g., within 1854 a single network. 
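The interval calculations described in this section can be sketched as follows. This is an illustrative Python sketch, not part of the specification. ReportSnapshot, IntervalStats and interval_stats are hypothetical names, and the sketch assumes that the monitor has already converted the relevant NTP timestamps to seconds and that sequence numbers have been extended so that simple subtraction is valid.

```python
from typing import NamedTuple


class ReportSnapshot(NamedTuple):
    """Values a monitor keeps from one reception report (illustrative)."""
    time_s: float          # NTP-derived time of the report, in seconds
    cumulative_lost: int   # cumulative number of packets lost
    ext_highest_seq: int   # extended highest sequence number received


class IntervalStats(NamedTuple):
    expected: int
    lost: int
    received: int
    loss_fraction: float
    loss_rate_per_s: float


def interval_stats(prev: ReportSnapshot, curr: ReportSnapshot) -> IntervalStats:
    """Derive loss figures for the interval between two reports."""
    expected = curr.ext_highest_seq - prev.ext_highest_seq
    lost = curr.cumulative_lost - prev.cumulative_lost
    received = expected - lost
    # Guard against an empty interval instead of dividing by zero.
    loss_fraction = lost / expected if expected > 0 else 0.0
    elapsed = curr.time_s - prev.time_s
    loss_rate = loss_fraction / elapsed if elapsed > 0 else 0.0
    return IntervalStats(expected, lost, received, loss_fraction, loss_rate)


stats = interval_stats(ReportSnapshot(100.0, 10, 5000),
                       ReportSnapshot(104.0, 60, 5200))
print(stats.expected, stats.lost, stats.received)  # 200 50 150
```

The expected count returned alongside the loss fraction lets a monitor judge the statistical weight of the estimate, as discussed above for 1 out of 5 versus 200 out of 1000 packets.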
1856  6.5 SDES: Source description RTCP packet

1858      0                   1                   2                   3
1859      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
1860     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1861     |V=2|P|    SC   |  PT=SDES=202  |             length            | header
1862     +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
1863     |                          SSRC/CSRC_1                          | chunk
1864     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   1
1865     |                           SDES items                          |
1866     |                              ...                              |
1867     +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
1868     |                          SSRC/CSRC_2                          | chunk
1869     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   2
1870     |                           SDES items                          |
1871     |                              ...                              |
1872     +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
1873     The SDES packet is a three-level structure composed of a header and
1874     zero or more chunks, each of which is composed of items describing
1875     the source identified in that chunk. The items are described
1876     individually in subsequent sections.

1878     version (V), padding (P), length:
1879        As described for the SR packet (see Section 6.4.1).

1881     packet type (PT): 8 bits
1882        Contains the constant 202 to identify this as an RTCP SDES
1883        packet.

1885     source count (SC): 5 bits
1886        The number of SSRC/CSRC chunks contained in this SDES packet. A
1887        value of zero is valid but useless.

1889     Each chunk consists of an SSRC/CSRC identifier followed by a list of
1890     zero or more items, which carry information about the SSRC/CSRC. Each
1891     chunk starts on a 32-bit boundary. Each item consists of an 8-bit
1892     type field, an 8-bit octet count describing the length of the text
1893     (thus, not including this two-octet header), and the text itself.
1894     Note that the text can be no longer than 255 octets, but this is
1895     consistent with the need to limit RTCP bandwidth consumption.

1897     The text is encoded according to the UTF-8 encoding specified in RFC
1898     2279 [14]. US-ASCII is a subset of this encoding and requires no
1899     additional encoding.
The presence of multi-octet encodings is 1900 indicated by setting the most significant bit of a character to a 1901 value of one. 1903 Items are contiguous, i.e., items are not individually padded to a 1904 32-bit boundary. Text is not null terminated because some multi-octet 1905 encodings include null octets. The list of items in each chunk MUST 1906 be terminated by one or more null octets, the first of which is 1907 interpreted as an item type of zero to denote the end of the list. 1908 No length octet follows the null item type octet, but additional null 1909 octets MUST be included if needed to pad until the next 32-bit 1910 boundary. Note that this padding is separate from that indicated by 1911 the P bit in the RTCP header. A chunk with zero items (four null 1912 octets) is valid but useless. 1914 End systems send one SDES packet containing their own source 1915 identifier (the same as the SSRC in the fixed RTP header). A mixer 1916 sends one SDES packet containing a chunk for each contributing source 1917 from which it is receiving SDES information, or multiple complete 1918 SDES packets in the format above if there are more than 31 such 1919 sources (see Section 7). 1921 The SDES items currently defined are described in the next sections. 1922 Only the CNAME item is mandatory. Some items shown here may be useful 1923 only for particular profiles, but the item types are all assigned 1924 from one common space to promote shared use and to simplify profile- 1925 independent applications. Additional items may be defined in a 1926 profile by registering the type numbers with IANA as described in 1927 Section 11.3. 1929 6.5.1 CNAME: Canonical end-point identifier SDES item 1931 0 1 2 3 1932 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1933 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1934 | CNAME=1 | length | user and domain name ... 
1935 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1937 The CNAME identifier has the following properties: 1939 o Because the randomly allocated SSRC identifier may change if a 1940 conflict is discovered or if a program is restarted, the CNAME 1941 item MUST be included to provide the binding from the SSRC 1942 identifier to an identifier for the source that remains 1943 constant. 1945 o Like the SSRC identifier, the CNAME identifier SHOULD also be 1946 unique among all participants within one RTP session. 1948 o To provide a binding across multiple media tools used by one 1949 participant in a set of related RTP sessions, the CNAME SHOULD 1950 be fixed for that participant. 1952 o To facilitate third-party monitoring, the CNAME SHOULD be 1953 suitable for either a program or a person to locate the source. 1955 Therefore, the CNAME SHOULD be derived algorithmically and not 1956 entered manually, when possible. To meet these requirements, the 1957 following format SHOULD be used unless a profile specifies an 1958 alternate syntax or semantics. The CNAME item SHOULD have the format 1959 "user@host", or "host" if a user name is not available as on single- 1960 user systems. For both formats, "host" is either the fully qualified 1961 domain name of the host from which the real-time data originates, 1962 formatted according to the rules specified in RFC 1034 [15], RFC 1035 1963 [16] and Section 2.1 of RFC 1123 [17]; or the standard ASCII 1964 representation of the host's numeric address on the interface used 1965 for the RTP communication. For example, the standard ASCII 1966 representation of an IP Version 4 address is "dotted decimal", also 1967 known as dotted quad. Other address types are expected to have ASCII 1968 representations that are mutually unique. 
The fully qualified domain 1969 name is more convenient for a human observer and may avoid the need 1970 to send a NAME item in addition, but it may be difficult or 1971 impossible to obtain reliably in some operating environments. 1972 Applications that may be run in such environments SHOULD use the 1973 ASCII representation of the address instead. 1975 Examples are "doe@sleepy.megacorp.com" or "doe@192.0.2.89" for a 1976 multi-user system. On a system with no user name, examples would be 1977 "sleepy.megacorp.com" or "192.0.2.89". 1979 The user name SHOULD be in a form that a program such as "finger" or 1980 "talk" could use, i.e., it typically is the login name rather than 1981 the personal name. The host name is not necessarily identical to the 1982 one in the participant's electronic mail address. 1984 This syntax will not provide unique identifiers for each source if an 1985 application permits a user to generate multiple sources from one 1986 host. Such an application would have to rely on the SSRC to further 1987 identify the source, or the profile for that application would have 1988 to specify additional syntax for the CNAME identifier. 1990 If each application creates its CNAME independently, the resulting 1991 CNAMEs may not be identical as would be required to provide a binding 1992 across multiple media tools belonging to one participant in a set of 1993 related RTP sessions. If cross-media binding is required, it may be 1994 necessary for the CNAME of each tool to be externally configured with 1995 the same value by a coordination tool. 1997 Application writers should be aware that private network address 1998 assignments such as the Net-10 assignment proposed in RFC 1597 [18] 1999 may create network addresses that are not globally unique. 
This would 2000 lead to non-unique CNAMEs if hosts with private addresses and no 2001 direct IP connectivity to the public Internet have their RTP packets 2002 forwarded to the public Internet through an RTP-level translator. 2003 (See also RFC 1627 [19].) To handle this case, applications MAY 2004 provide a means to configure a unique CNAME, but the burden is on the 2005 translator to translate CNAMEs from private addresses to public 2006 addresses if necessary to keep private addresses from being exposed. 2008 6.5.2 NAME: User name SDES item 2010 0 1 2 3 2011 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2012 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2013 | NAME=2 | length | common name of source ... 2014 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2015 This is the real name used to describe the source, e.g., "John Doe, 2016 Bit Recycler, Megacorp". It may be in any form desired by the user. 2017 For applications such as conferencing, this form of name may be the 2018 most desirable for display in participant lists, and therefore might 2019 be sent most frequently of those items other than CNAME. Profiles MAY 2020 establish such priorities. The NAME value is expected to remain 2021 constant at least for the duration of a session. It SHOULD NOT be 2022 relied upon to be unique among all participants in the session. 2024 6.5.3 EMAIL: Electronic mail address SDES item 2026 0 1 2 3 2027 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2028 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2029 | EMAIL=3 | length | email address of source ... 2030 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2032 The email address is formatted according to RFC 822 [20], for 2033 example, "John.Doe@megacorp.com". The EMAIL value is expected to 2034 remain constant for the duration of a session. 
2036 6.5.4 PHONE: Phone number SDES item 2038 0 1 2 3 2039 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2040 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2041 | PHONE=4 | length | phone number of source ... 2042 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2044 The phone number SHOULD be formatted with the plus sign replacing the 2045 international access code. For example, "+1 908 555 1212" for a 2046 number in the United States. 2048 6.5.5 LOC: Geographic user location SDES item 2050 0 1 2 3 2051 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2052 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2053 | LOC=5 | length | geographic location of site ... 2054 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2056 Depending on the application, different degrees of detail are 2057 appropriate for this item. For conference applications, a string like 2058 "Murray Hill, New Jersey" may be sufficient, while, for an active 2059 badge system, strings like "Room 2A244, AT&T BL MH" might be 2060 appropriate. The degree of detail is left to the implementation 2061 and/or user, but format and content MAY be prescribed by a profile. 2062 The LOC value is expected to remain constant for the duration of a 2063 session, except for mobile hosts. 2065 6.5.6 TOOL: Application or tool name SDES item 2067 0 1 2 3 2068 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2069 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2070 | TOOL=6 | length | name/version of source appl. ... 2071 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2073 A string giving the name and possibly version of the application 2074 generating the stream, e.g., "videotool 1.2". This information may be 2075 useful for debugging purposes and is similar to the Mailer or Mail- 2076 System-Version SMTP headers. 
The TOOL value is expected to remain 2077 constant for the duration of the session. 2079 6.5.7 NOTE: Notice/status SDES item 2081 0 1 2 3 2082 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2083 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2084 | NOTE=7 | length | note about the source ... 2085 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2087 The following semantics are suggested for this item, but these or 2088 other semantics MAY be explicitly defined by a profile. The NOTE item 2089 is intended for transient messages describing the current state of 2090 the source, e.g., "on the phone, can't talk". Or, during a seminar, 2091 this item might be used to convey the title of the talk. It should be 2092 used only to carry exceptional information and SHOULD NOT be included 2093 routinely by all participants because this would slow down the rate 2094 at which reception reports and CNAME are sent, thus impairing the 2095 performance of the protocol. In particular, it SHOULD NOT be included 2096 as an item in a user's configuration file nor automatically generated 2097 as in a quote-of-the-day. 2099 Since the NOTE item may be important to display while it is active, 2100 the rate at which other non-CNAME items such as NAME are transmitted 2101 might be reduced so that the NOTE item can take that part of the RTCP 2102 bandwidth. When the transient message becomes inactive, the NOTE item 2103 SHOULD continue to be transmitted a few times at the same repetition 2104 rate but with a string of length zero to signal the receivers. 2105 However, receivers SHOULD also consider the NOTE item inactive if it 2106 is not received for a small multiple of the repetition rate, or 2107 perhaps 20-30 RTCP intervals. 
2109 6.5.8 PRIV: Private extensions SDES item 2111 0 1 2 3 2112 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2113 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2114 | PRIV=8 | length | prefix length | prefix string... 2115 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2116 ... | value string ... 2117 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2119 This item is used to define experimental or application-specific SDES 2120 extensions. The item contains a prefix consisting of a length-string 2121 pair, followed by the value string filling the remainder of the item 2122 and carrying the desired information. The prefix length field is 8 2123 bits long. The prefix string is a name chosen by the person defining 2124 the PRIV item to be unique with respect to other PRIV items this 2125 application might receive. The application creator might choose to 2126 use the application name plus an additional subtype identification if 2127 needed. Alternatively, it is RECOMMENDED that others choose a name 2128 based on the entity they represent, then coordinate the use of the 2129 name within that entity. 2131 Note that the prefix consumes some space within the item's total 2132 length of 255 octets, so the prefix should be kept as short as 2133 possible. This facility and the constrained RTCP bandwidth SHOULD NOT 2134 be overloaded; it is not intended to satisfy all the control 2135 communication requirements of all applications. 2137 SDES PRIV prefixes will not be registered by IANA. If some form of 2138 the PRIV item proves to be of general utility, it SHOULD instead be 2139 assigned a regular SDES item type registered with IANA so that no 2140 prefix is required. This simplifies use and increases transmission 2141 efficiency. 
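The item and chunk layout of Section 6.5, together with the PRIV item above, can be sketched as follows. This Python sketch is illustrative only. The function names (encode_item, encode_priv, encode_chunk, encode_sdes) are hypothetical, and the sketch assumes the length field of the RTCP header counts 32-bit words minus one, as for the SR packet in Section 6.4.1.

```python
import struct
from typing import Iterable, List

# SDES item type codes from Sections 6.5.1 to 6.5.8.
CNAME, NAME, EMAIL, PHONE, LOC, TOOL, NOTE, PRIV = range(1, 9)


def encode_item(item_type: int, text: str) -> bytes:
    """One item: 8-bit type, 8-bit octet count, UTF-8 text (no terminator)."""
    data = text.encode("utf-8")
    if len(data) > 255:
        raise ValueError("SDES item text longer than 255 octets")
    return bytes([item_type, len(data)]) + data


def encode_priv(prefix: str, value: str) -> bytes:
    """PRIV item: the prefix length octet and prefix count against the 255."""
    p, v = prefix.encode("utf-8"), value.encode("utf-8")
    total = 1 + len(p) + len(v)
    if total > 255:
        raise ValueError("PRIV prefix and value do not fit in one item")
    return bytes([PRIV, total, len(p)]) + p + v


def encode_chunk(ssrc: int, items: Iterable[bytes]) -> bytes:
    """SSRC/CSRC, contiguous items, then 1 to 4 null octets.

    The first null octet is the end-of-list item type; the rest pad the
    chunk to the next 32-bit boundary.
    """
    body = struct.pack("!I", ssrc) + b"".join(items)
    return body + b"\x00" * (4 - len(body) % 4)


def encode_sdes(chunks: List[bytes]) -> bytes:
    """Prepend the RTCP header (V=2, P=0, SC, PT=202, length)."""
    if len(chunks) > 31:
        raise ValueError("at most 31 chunks fit in one SDES packet")
    body = b"".join(chunks)
    header = struct.pack("!BBH", (2 << 6) | len(chunks), 202, len(body) // 4)
    return header + body


chunk = encode_chunk(0x12345678, [encode_item(CNAME, "doe@192.0.2.89")])
packet = encode_sdes([chunk])
print(len(packet))  # 28
```

In this example the CNAME item ends exactly on a 32-bit boundary, so four null octets are appended: one to terminate the item list and three more to reach the next boundary.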
2143 6.6 BYE: Goodbye RTCP packet 2144 0 1 2 3 2145 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2146 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2147 |V=2|P| SC | PT=BYE=203 | length | 2148 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2149 | SSRC/CSRC | 2150 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2151 : ... : 2152 +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+ 2153 | length | reason for leaving ... (opt) 2154 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2156 The BYE packet indicates that one or more sources are no longer 2157 active. 2159 version (V), padding (P), length: 2160 As described for the SR packet (see Section 6.4.1). 2162 packet type (PT): 8 bits 2163 Contains the constant 203 to identify this as an RTCP BYE 2164 packet. 2166 source count (SC): 5 bits 2167 The number of SSRC/CSRC identifiers included in this BYE packet. 2168 A count value of zero is valid, but useless. 2170 The rules for when a BYE packet should be sent are specified in 2171 Sections 6.3.7 and 8.2. 2173 If a BYE packet is received by a mixer, the mixer SHOULD forward the 2174 BYE packet with the SSRC/CSRC identifier(s) unchanged. If a mixer 2175 shuts down, it SHOULD send a BYE packet listing all contributing 2176 sources it handles, as well as its own SSRC identifier. Optionally, 2177 the BYE packet MAY include an 8-bit octet count followed by that many 2178 octets of text indicating the reason for leaving, e.g., "camera 2179 malfunction" or "RTP loop detected". The string has the same encoding 2180 as that described for SDES. If the string fills the packet to the 2181 next 32-bit boundary, the string is not null terminated. If not, the 2182 BYE packet MUST be padded with null octets to the next 32-bit 2183 boundary. This padding is separate from that indicated by the P bit 2184 in the RTCP header. 
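The BYE layout, including the optional reason text and its padding, can be sketched as follows. This is a non-normative Python sketch in which build_bye is a hypothetical name. It assumes the header length field counts 32-bit words minus one, as for the SR packet in Section 6.4.1, and it always leaves the P bit clear, because the null padding after the reason text is separate from the padding indicated by P.

```python
import struct
from typing import Sequence


def build_bye(ssrcs: Sequence[int], reason: str = "") -> bytes:
    """Build a BYE packet listing the given SSRC/CSRC identifiers."""
    if len(ssrcs) > 31:
        raise ValueError("source count (SC) is a 5-bit field")
    body = b"".join(struct.pack("!I", s) for s in ssrcs)
    if reason:
        text = reason.encode("utf-8")   # same encoding as SDES text
        if len(text) > 255:
            raise ValueError("reason text longer than 255 octets")
        field = bytes([len(text)]) + text
        # Pad with null octets only if the text does not already end on a
        # 32-bit boundary.
        field += b"\x00" * (-len(field) % 4)
        body += field
    length_words = (4 + len(body)) // 4 - 1
    header = struct.pack("!BBH", (2 << 6) | len(ssrcs), 203, length_words)
    return header + body


packet = build_bye([0x11223344], "camera malfunction")
print(len(packet))  # 28
```

Here the 18-octet reason plus its count octet occupies 19 octets, so one null octet of padding is added; a 3-octet reason such as "bye" would need none.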
2186 6.7 APP: Application-defined RTCP packet 2187 0 1 2 3 2188 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2189 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2190 |V=2|P| subtype | PT=APP=204 | length | 2191 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2192 | SSRC/CSRC | 2193 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2194 | name (ASCII) | 2195 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2196 | application-dependent data ... 2197 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2199 The APP packet is intended for experimental use as new applications 2200 and new features are developed, without requiring packet type value 2201 registration. APP packets with unrecognized names SHOULD be ignored. 2202 After testing and if wider use is justified, it is RECOMMENDED that 2203 each APP packet be redefined without the subtype and name fields and 2204 registered with IANA using an RTCP packet type. 2206 version (V), padding (P), length: 2207 As described for the SR packet (see Section 6.4.1). 2209 subtype: 5 bits 2210 May be used as a subtype to allow a set of APP packets to be 2211 defined under one unique name, or for any application-dependent 2212 data. 2214 packet type (PT): 8 bits 2215 Contains the constant 204 to identify this as an RTCP APP 2216 packet. 2218 name: 4 octets 2219 A name chosen by the person defining the set of APP packets to 2220 be unique with respect to other APP packets this application 2221 might receive. The application creator might choose to use the 2222 application name, and then coordinate the allocation of subtype 2223 values to others who want to define new packet types for the 2224 application. Alternatively, it is RECOMMENDED that others 2225 choose a name based on the entity they represent, then 2226 coordinate the use of the name within that entity. 
The name is 2227 interpreted as a sequence of four ASCII characters, with 2228 uppercase and lowercase characters treated as distinct. 2230 application-dependent data: variable length 2231 Application-dependent data may or may not appear in an APP 2232 packet. It is interpreted by the application and not RTP itself. 2233 It MUST be a multiple of 32 bits long. 2235 7 RTP Translators and Mixers 2237 In addition to end systems, RTP supports the notion of "translators" 2238 and "mixers", which could be considered as "intermediate systems" at 2239 the RTP level. Although this support adds some complexity to the 2240 protocol, the need for these functions has been clearly established 2241 by experiments with multicast audio and video applications in the 2242 Internet. Example uses of translators and mixers given in Section 2.3 2243 stem from the presence of firewalls and low bandwidth connections, 2244 both of which are likely to remain. 2246 7.1 General Description 2248 An RTP translator/mixer connects two or more transport-level 2249 "clouds". Typically, each cloud is defined by a common network and 2250 transport protocol (e.g., IP/UDP) plus a multicast address and 2251 transport level destination port or a pair of unicast addresses and 2252 ports. (Network-level protocol translators, such as IP version 4 to 2253 IP version 6, may be present within a cloud invisibly to RTP.) One 2254 system may serve as a translator or mixer for a number of RTP 2255 sessions, but each is considered a logically separate entity. 2257 In order to avoid creating a loop when a translator or mixer is 2258 installed, the following rules MUST be observed: 2260 o Each of the clouds connected by translators and mixers 2261 participating in one RTP session either MUST be distinct from 2262 all the others in at least one of these parameters (protocol, 2263 address, port), or MUST be isolated at the network level from 2264 the others. 
2266 o A derivative of the first rule is that there MUST NOT be 2267 multiple translators or mixers connected in parallel unless by 2268 some arrangement they partition the set of sources to be 2269 forwarded. 2271 Similarly, all RTP end systems that can communicate through one or 2272 more RTP translators or mixers share the same SSRC space, that is, 2273 the SSRC identifiers MUST be unique among all these end systems. 2274 Section 8.2 describes the collision resolution algorithm by which 2275 SSRC identifiers are kept unique and loops are detected. 2277 There may be many varieties of translators and mixers designed for 2278 different purposes and applications. Some examples are to add or 2279 remove encryption, change the encoding of the data or the underlying 2280 protocols, or replicate between a multicast address and one or more 2281 unicast addresses. The distinction between translators and mixers is 2282 that a translator passes through the data streams from different 2283 sources separately, whereas a mixer combines them to form one new 2284 stream: 2286 Translator: Forwards RTP packets with their SSRC identifier intact; 2287 this makes it possible for receivers to identify individual 2288 sources even though packets from all the sources pass through 2289 the same translator and carry the translator's network source 2290 address. Some kinds of translators will pass through the data 2291 untouched, but others MAY change the encoding of the data and 2292 thus the RTP data payload type and timestamp. If multiple data 2293 packets are re-encoded into one, or vice versa, a translator 2294 MUST assign new sequence numbers to the outgoing packets. Losses 2295 in the incoming packet stream may induce corresponding gaps in 2296 the outgoing sequence numbers. Receivers cannot detect the 2297 presence of a translator unless they know by some other means 2298 what payload type or transport address was used by the original 2299 source. 
2301 Mixer: Receives streams of RTP data packets from one or more sources, 2302 possibly changes the data format, combines the streams in some 2303 manner and then forwards the combined stream. Since the timing 2304 among multiple input sources will not generally be synchronized, 2305 the mixer will make timing adjustments among the streams and 2306 generate its own timing for the combined stream, so it is the 2307 synchronization source. Thus, all data packets forwarded by a 2308 mixer MUST be marked with the mixer's own SSRC identifier. In 2309 order to preserve the identity of the original sources 2310 contributing to the mixed packet, the mixer SHOULD insert their 2311 SSRC identifiers into the CSRC identifier list following the 2312 fixed RTP header of the packet. A mixer that is also itself a 2313 contributing source for some packet SHOULD explicitly include 2314 its own SSRC identifier in the CSRC list for that packet. 2316 For some applications, it MAY be acceptable for a mixer not to 2317 identify sources in the CSRC list. However, this introduces the 2318 danger that loops involving those sources could not be detected. 2320 The advantage of a mixer over a translator for applications like 2321 audio is that the output bandwidth is limited to that of one source 2322 even when multiple sources are active on the input side. This may be 2323 important for low-bandwidth links. The disadvantage is that receivers 2324 on the output side don't have any control over which sources are 2325 passed through or muted, unless some mechanism is implemented for 2326 remote control of the mixer. The regeneration of synchronization 2327 information by mixers also means that receivers can't do inter-media 2328 synchronization of the original streams. A multi-media mixer could do 2329 it. 
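The Mixer rule above (the mixer's own SSRC in the fixed header, the original sources in the CSRC list) can be sketched as follows. This Python sketch is illustrative only. mixer_rtp_header is a hypothetical name, and the field layout it uses (a 12-octet fixed header with a 4-bit CSRC count, hence at most 15 CSRC entries) is assumed from the RTP fixed header definition given earlier in this document. The identifiers 48, 1 and 17 are those of the packet "M1:48 (1,17)" in Figure 3; the payload type, sequence number and timestamp are arbitrary.

```python
import struct
from typing import Sequence


def mixer_rtp_header(mixer_ssrc: int, contributing: Sequence[int],
                     payload_type: int, seq: int, timestamp: int) -> bytes:
    """RTP header for a mixed packet: mixer SSRC plus CSRC list."""
    csrcs = list(contributing)
    if len(csrcs) > 15:
        raise ValueError("the CSRC count (CC) is a 4-bit field")
    first = (2 << 6) | len(csrcs)       # V=2, P=0, X=0, CC
    second = payload_type & 0x7F        # M=0, 7-bit payload type
    fixed = struct.pack("!BBHII", first, second, seq & 0xFFFF,
                        timestamp & 0xFFFFFFFF, mixer_ssrc)
    # The SSRC identifiers of the mixed sources become CSRC entries.
    return fixed + b"".join(struct.pack("!I", c) for c in csrcs)


header = mixer_rtp_header(48, [1, 17], payload_type=0, seq=100, timestamp=160)
print(len(header))  # 20
```

A receiver that sees this header attributes the stream's timing to SSRC 48 while still being able to display 1 and 17 as the active contributors.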
2331         [E1]                                    [E6]
2332          |                                       |
2333    E1:17 |                                 E6:15 |
2334          |                                       |   E6:15
2335          V  M1:48 (1,17)         M1:48 (1,17)    V   M1:48 (1,17)
2336         (M1)-------------><T1>-----------------><T2>-------------->[E7]
2337          ^                 ^     E4:47           ^   E4:47
2338     E2:1 |           E4:47 |                     |   M3:89 (64,45)
2339          |                 |                     |
2340         [E2]              [E4]    M3:89 (64,45)  |
2341                                                  |        legend:
2342   [E3] --------->(M2)----------->(M3)------------|        [End system]
2343          E3:64        M2:12 (64)  ^                       (Mixer)
2344                                   | E5:45                 <Translator>
2345                                   |
2346                                  [E5]             source: SSRC (CSRCs)
2347                                                   ------------------->

2349   Figure 3: Sample RTP network with end systems, mixers and translators

2351     A collection of mixers and translators is shown in Figure 3 to
2352     illustrate their effect on SSRC and CSRC identifiers. In the figure,
2353     end systems are shown as rectangles (named E), translators as
2354     triangles (named T) and mixers as ovals (named M). The notation
2355     "M1:48 (1,17)" designates a packet originating from mixer M1,
2356     identified with M1's (random) SSRC value of 48 and two CSRC
2357     identifiers, 1 and 17, copied from the SSRC identifiers of packets
2357     from E1 and E2.

2359  7.2 RTCP Processing in Translators

2361     In addition to forwarding data packets, perhaps modified, translators
2362     and mixers MUST also process RTCP packets. In many cases, they will
2363     take apart the compound RTCP packets received from end systems to
2364     aggregate SDES information and to modify the SR or RR packets.
2365     Retransmission of this information may be triggered by the packet
2366     arrival or by the RTCP interval timer of the translator or mixer
2367     itself.

2369     A translator that does not modify the data packets, for example one
2370     that just replicates between a multicast address and a unicast
2371     address, MAY simply forward RTCP packets unmodified as well. A
2372     translator that transforms the payload in some way MUST make
2373     corresponding transformations in the SR and RR information so that it
2374     still reflects the characteristics of the data and the reception
2375     quality.
These translators MUST NOT simply forward RTCP packets. In 2376 general, a translator SHOULD NOT aggregate SR and RR packets from 2377 different sources into one packet since that would reduce the 2378 accuracy of the propagation delay measurements based on the LSR and 2379 DLSR fields. 2381 SR sender information: A translator does not generate its own sender 2382 information, but forwards the SR packets received from one cloud 2383 to the others. The SSRC is left intact but the sender 2384 information MUST be modified if required by the translation. If 2385 a translator changes the data encoding, it MUST change the 2386 "sender's byte count" field. If it also combines several data 2387 packets into one output packet, it MUST change the "sender's 2388 packet count" field. If it changes the timestamp frequency, it 2389 MUST change the "RTP timestamp" field in the SR packet. 2391 SR/RR reception report blocks: A translator forwards reception 2392 reports received from one cloud to the others. Note that these 2393 flow in the direction opposite to the data. The SSRC is left 2394 intact. If a translator combines several data packets into one 2395 output packet, and therefore changes the sequence numbers, it 2396 MUST make the inverse manipulation for the packet loss fields 2397 and the "extended last sequence number" field. This may be 2398 complex. In the extreme case, there may be no meaningful way to 2399 translate the reception reports, so the translator MAY pass on 2400 no reception report at all or a synthetic report based on its 2401 own reception. The general rule is to do what makes sense for a 2402 particular translation. 2404 A translator does not require an SSRC identifier of its own, but MAY 2405 choose to allocate one for the purpose of sending reports about what 2406 it has received. 
These would be sent to all the connected clouds, 2407 each corresponding to the translation of the data stream as sent to 2408 that cloud, since reception reports are normally multicast to all 2409 participants. 2411 SDES: Translators typically forward without change the SDES 2412 information they receive from one cloud to the others, but MAY, 2413 for example, decide to filter non-CNAME SDES information if 2414 bandwidth is limited. The CNAMEs MUST be forwarded to allow SSRC 2415 identifier collision detection to work. A translator that 2416 generates its own RR packets MUST send SDES CNAME information 2417 about itself to the same clouds that it sends those RR packets. 2419 BYE: Translators forward BYE packets unchanged. A translator that is 2420 about to cease forwarding packets SHOULD send a BYE packet to 2421 each connected cloud containing all the SSRC identifiers that 2422 were previously being forwarded to that cloud, including the 2423 translator's own SSRC identifier if it sent reports of its own. 2425 APP: Translators forward APP packets unchanged. 2427 7.3 RTCP Processing in Mixers 2429 Since a mixer generates a new data stream of its own, it does not 2430 pass through SR or RR packets at all and instead generates new 2431 information for both sides. 2433 SR sender information: A mixer does not pass through sender 2434 information from the sources it mixes because the 2435 characteristics of the source streams are lost in the mix. As a 2436 synchronization source, the mixer SHOULD generate its own SR 2437 packets with sender information about the mixed data stream and 2438 send them in the same direction as the mixed stream. 2440 SR/RR reception report blocks: A mixer generates its own reception 2441 reports for sources in each cloud and sends them out only to the 2442 same cloud. 
It MUST NOT send these reception reports to the 2443 other clouds and MUST NOT forward reception reports from one 2444 cloud to the others because the sources would not be SSRCs there 2445 (only CSRCs). 2447 SDES: Mixers typically forward without change the SDES information 2448 they receive from one cloud to the others, but MAY, for example, 2449 decide to filter non-CNAME SDES information if bandwidth is 2450 limited. The CNAMEs MUST be forwarded to allow SSRC identifier 2451 collision detection to work. (An identifier in a CSRC list 2452 generated by a mixer might collide with an SSRC identifier 2453 generated by an end system.) A mixer MUST send SDES CNAME 2454 information about itself to the same clouds that it sends SR or 2455 RR packets. 2457 Since mixers do not forward SR or RR packets, they will typically be 2458 extracting SDES packets from a compound RTCP packet. To minimize 2459 overhead, chunks from the SDES packets MAY be aggregated into a 2460 single SDES packet which is then stacked on an SR or RR packet 2461 originating from the mixer. The RTCP packet rate MAY be different on 2462 each side of the mixer. 2464 A mixer that does not insert CSRC identifiers MAY also refrain from 2465 forwarding SDES CNAMEs. In this case, the SSRC identifier spaces in 2466 the two clouds are independent. As mentioned earlier, this mode of 2467 operation creates a danger that loops can't be detected. 2469 BYE: Mixers MUST forward BYE packets. A mixer that is about to cease 2470 forwarding packets SHOULD send a BYE packet to each connected 2471 cloud containing all the SSRC identifiers that were previously 2472 being forwarded to that cloud, including the mixer's own SSRC 2473 identifier if it sent reports of its own. 2475 APP: The treatment of APP packets by mixers is application-specific. 2477 7.4 Cascaded Mixers 2479 An RTP session may involve a collection of mixers and translators as 2480 shown in Figure 3. 
If two mixers are cascaded, such as M2 and M3 in 2481 the figure, packets received by a mixer may already have been mixed 2482 and may include a CSRC list with multiple identifiers. The second 2483 mixer SHOULD build the CSRC list for the outgoing packet using the 2484 CSRC identifiers from already-mixed input packets and the SSRC 2485 identifiers from unmixed input packets. This is shown in the output 2486 arc from mixer M3 labeled M3:89(64,45) in the figure. As in the case 2487 of mixers that are not cascaded, if the resulting CSRC list has more 2488 than 15 identifiers, the remainder cannot be included. 2490 8 SSRC Identifier Allocation and Use 2492 The SSRC identifier carried in the RTP header and in various fields 2493 of RTCP packets is a random 32-bit number that is required to be 2494 globally unique within an RTP session. It is crucial that the number 2495 be chosen with care in order that participants on the same network or 2496 starting at the same time are not likely to choose the same number. 2498 It is not sufficient to use the local network address (such as an 2499 IPv4 address) for the identifier because the address may not be 2500 unique. Since RTP translators and mixers enable interoperation among 2501 multiple networks with different address spaces, the allocation 2502 patterns for addresses within two spaces might result in a much 2503 higher rate of collision than would occur with random allocation. 2505 Multiple sources running on one host would also conflict. 2507 It is also not sufficient to obtain an SSRC identifier simply by 2508 calling random() without carefully initializing the state. An example 2509 of how to generate a random identifier is presented in Appendix A.6. 2511 8.1 Probability of Collision 2513 Since the identifiers are chosen randomly, it is possible that two or 2514 more sources will choose the same number. 
Collision occurs with the 2515 highest probability when all sources are started simultaneously, for 2516 example when triggered automatically by some session management 2517 event. If N is the number of sources and L the length of the 2518 identifier (here, 32 bits), the probability that two sources 2519 independently pick the same value can be approximated for large N 2520 [21] as 1 - exp(-N**2 / 2**(L+1)). For N=1000, the probability is 2521 roughly 10**-4. 2523 The typical collision probability is much lower than the worst-case 2524 above. When one new source joins an RTP session in which all the 2525 other sources already have unique identifiers, the probability of 2526 collision is just the fraction of numbers used out of the space. 2527 Again, if N is the number of sources and L the length of the 2528 identifier, the probability of collision is N / 2**L. For N=1000, the 2529 probability is roughly 2*10**-7. 2531 The probability of collision is further reduced by the opportunity 2532 for a new source to receive packets from other participants before 2533 sending its first packet (either data or control). If the new source 2534 keeps track of the other participants (by SSRC identifier), then 2535 before transmitting its first packet the new source can verify that 2536 its identifier does not conflict with any that have been received, or 2537 else choose again. 2539 8.2 Collision Resolution and Loop Detection 2541 Although the probability of SSRC identifier collision is low, all RTP 2542 implementations MUST be prepared to detect collisions and take the 2543 appropriate actions to resolve them. If a source discovers at any 2544 time that another source is using the same SSRC identifier as its 2545 own, it MUST send an RTCP BYE packet for the old identifier and 2546 choose another random one. (As explained below, this step is taken 2547 only once in case of a loop.) 
If a receiver discovers that two other 2548 sources are colliding, it MAY keep the packets from one and discard 2549 the packets from the other when this can be detected by different 2550 source transport addresses or CNAMEs. The two sources are expected to 2551 resolve the collision so that the situation doesn't last. 2553 Because the random SSRC identifiers are kept globally unique for each 2554 RTP session, they can also be used to detect loops that may be 2555 introduced by mixers or translators. A loop causes duplication of 2556 data and control information, either unmodified or possibly mixed, as 2557 in the following examples: 2559 o A translator may incorrectly forward a packet to the same 2560 multicast group from which it has received the packet, either 2561 directly or through a chain of translators. In that case, the 2562 same packet appears several times, originating from different 2563 network sources. 2565 o Two translators incorrectly set up in parallel, i.e., with the 2566 same multicast groups on both sides, would both forward packets 2567 from one multicast group to the other. Unidirectional 2568 translators would produce two copies; bidirectional translators 2569 would form a loop. 2571 o A mixer can close a loop by sending to the same transport 2572 destination upon which it receives packets, either directly or 2573 through another mixer or translator. In this case a source 2574 might show up both as an SSRC on a data packet and a CSRC in a 2575 mixed data packet. 2577 A source may discover that its own packets are being looped, or that 2578 packets from another source are being looped (a third-party loop). 2580 Both loops and collisions in the random selection of a source 2581 identifier result in packets arriving with the same SSRC identifier 2582 but a different source transport address, which may be that of the 2583 end system originating the packet or an intermediate system. 
Therefore, if a source changes its source transport address, it MUST
also choose a new SSRC identifier to avoid being interpreted as a
looped source. Note that if a translator restarts and consequently
changes the source transport address (e.g., changes the UDP source
port number) on which it forwards packets, then all those packets
will appear to receivers to be looped because the SSRC identifiers
are applied by the original source and will not change. This problem
MAY be avoided by keeping the source transport address fixed across
restarts, but in any case will be resolved after a timeout at the
receivers.

Loops or collisions occurring on the far side of a translator or
mixer cannot be detected using the source transport address if all
copies of the packets go through the translator or mixer. However,
collisions may still be detected when chunks from two RTCP SDES
packets contain the same SSRC identifier but different CNAMEs.

To detect and resolve these conflicts, an RTP implementation MUST
include an algorithm similar to the one described below. It ignores
packets from a new source or loop that collide with an established
source. It resolves collisions with the participant's own SSRC
identifier by sending an RTCP BYE for the old identifier and choosing
a new one. However, when the collision was induced by a loop of the
participant's own packets, the algorithm will choose a new identifier
only once and thereafter ignore packets from the looping source
transport address. This is required to avoid a flood of BYE packets.

This algorithm requires keeping a table indexed by the source
identifier and containing the source transport addresses from the
first RTP packet and first RTCP packet received with that identifier,
along with other state for that source.
Two source transport addresses are required since, for example, the
UDP source port numbers may be different on RTP and RTCP packets.
However, it may be assumed that the network address is the same in
both source transport addresses.

Each SSRC or CSRC identifier received in an RTP or RTCP packet is
looked up in the source identifier table in order to process that
data or control information. The source transport address from the
packet is compared to the corresponding source transport address in
the table to detect a loop or collision if they don't match. For
control packets, each element with its own SSRC id, for example an
SDES chunk, requires a separate lookup. (The SSRC id in a reception
report block is an exception because it identifies a source heard by
the reporter, and that SSRC id is unrelated to the source transport
address of the RTCP packet sent by the reporter.) If the SSRC or
CSRC is not found, a new entry is created. These table entries are
removed when an RTCP BYE packet is received with the corresponding
SSRC id and validated by a matching source transport address, or
after no packets have arrived for a relatively long time (see Section
6.2.1).

Note that if two sources on the same host are transmitting with the
same source identifier at the time a receiver begins operation, it
would be possible that the first RTP packet received came from one of
the sources while the first RTCP packet received came from the other.
This would cause the wrong RTCP information to be associated with the
RTP data, but this situation should be sufficiently rare and harmless
that it may be disregarded.

In order to track loops of the participant's own data packets, the
implementation MUST also keep a separate list of source transport
addresses (not identifiers) that have been found to be conflicting.
As in the source identifier table, two source transport addresses
MUST be kept to separately track conflicting RTP and RTCP packets.
Note that the conflicting address list should be short, usually
empty. Each element in this list stores the source addresses plus the
time when the most recent conflicting packet was received. An element
MAY be removed from the list when no conflicting packet has arrived
from that source for a time on the order of 10 RTCP report intervals
(see Section 6.2).

For the algorithm as shown, it is assumed that the participant's own
source identifier and state are included in the source identifier
table. The algorithm could be restructured to first make a separate
comparison against the participant's own source identifier.

   IF the SSRC or CSRC identifier is not found in the source
      identifier table:
   THEN create a new entry storing the data or control source
        transport address, the SSRC or CSRC id and other state.
        CONTINUE with normal processing.

   (identifier is found in the table)

   IF the table entry was created on receipt of a control packet
      and this is the first data packet or vice versa:
   THEN store the source transport address from this packet.
        CONTINUE with normal processing.
   IF the source transport address from the packet matches
      the one saved in the table entry for this identifier:
   THEN CONTINUE with normal processing.

   (an identifier collision or a loop is indicated)

   IF the source identifier is not the participant's own:
   THEN IF the source identifier is from an RTCP SDES chunk
           containing a CNAME item that differs from the CNAME
           in the table entry:
        THEN (optionally) count a third-party collision.
        ELSE (optionally) count a third-party loop.
        ABORT processing of data packet or control element.

   (a collision or loop of the participant's own packets)

   IF the source transport address is found in the list of
      conflicting data or control source transport addresses:
   THEN IF the source identifier is not from an RTCP SDES chunk
           containing a CNAME item OR if that CNAME is the
           participant's own:
        THEN (optionally) count occurrence of own traffic looped.
        mark current time in conflicting address list entry.
        ABORT processing of data packet or control element.
   log occurrence of a collision.
   create a new entry in the conflicting data or control source
      transport address list and mark current time.
   send an RTCP BYE packet with the old SSRC identifier.
   choose a new identifier.
   create a new entry in the source identifier table with the
      old SSRC plus the source transport address from the data
      or control packet being processed.
   CONTINUE with normal processing.

In this algorithm, packets from a newly conflicting source address
will be ignored and packets from the original source will be kept.
(If the original source was through a mixer and later the same source
is received directly, the receiver may be well advised to switch
unless other sources in the mix would be lost.) If no packets arrive
from the original source for an extended period, the table entry will
be timed out and the new source will be able to take over. This might
occur if the original source detects the collision and moves to a new
source identifier, but in the usual case an RTCP BYE packet will be
received from the original source to delete the state without having
to wait for a timeout.

When a new SSRC identifier is chosen due to a collision, the
candidate identifier SHOULD first be looked up in the source
identifier table to see if it was already in use by some other
source. If so, another candidate MUST be generated and the process
repeated.
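The candidate check described in the preceding paragraph can be sketched in C as follows. This is a minimal illustration under stated assumptions: the source identifier table is reduced to a plain array of known identifiers, and the names ssrc_in_table, choose_new_ssrc, rand32_fn and demo_rand are hypothetical. In particular, demo_rand is a deterministic stand-in used only to make the example repeatable; a real implementation would draw candidates from a carefully initialized random source such as the one presented in Appendix A.6.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback type: returns one random 32-bit candidate. */
typedef uint32_t (*rand32_fn)(void);

/* Return 1 if ssrc is already present in the table, 0 otherwise. */
static int ssrc_in_table(const uint32_t *table, size_t n, uint32_t ssrc)
{
    for (size_t i = 0; i < n; i++) {
        if (table[i] == ssrc)
            return 1;
    }
    return 0;
}

/*
 * Generate candidates until one is found that is not in use by any
 * source currently in the table, then return it.
 */
static uint32_t choose_new_ssrc(const uint32_t *table, size_t n,
                                rand32_fn rnd)
{
    uint32_t candidate;

    do {
        candidate = rnd();
    } while (ssrc_in_table(table, n, candidate));
    return candidate;
}

/* Stand-in generator for demonstration only: returns 1, 2, 3, ... */
static uint32_t demo_rand(void)
{
    static uint32_t next = 0;
    return ++next;
}
```

With a table holding the identifiers 1, 2 and 3, a first call to choose_new_ssrc(table, 3, demo_rand) rejects the three colliding candidates and returns 4. A linear scan is used here for brevity; an implementation with many sources would more likely reuse the lookup of its existing source identifier table.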
2725 A loop of data packets to a multicast destination can cause severe 2726 network flooding. All mixers and translators MUST implement a loop 2727 detection algorithm like the one here so that they can break loops. 2728 This should limit the excess traffic to no more than one duplicate 2729 copy of the original traffic, which may allow the session to continue 2730 so that the cause of the loop can be found and fixed. However, in 2731 extreme cases where a mixer or translator does not properly break the 2732 loop and high traffic levels result, it may be necessary for end 2733 systems to cease transmitting data or control packets entirely. This 2734 decision may depend upon the application. An error condition SHOULD 2735 be indicated as appropriate. Transmission MAY be attempted again 2736 periodically after a long, random time (on the order of minutes). 2738 8.3 Use with Layered Encodings 2740 For layered encodings transmitted on separate RTP sessions (see 2741 Section 2.4), a single SSRC identifier space SHOULD be used across 2742 the sessions of all layers and the core (base) layer SHOULD be used 2743 for SSRC identifier allocation and collision resolution. When a 2744 source discovers that it has collided, it transmits an RTCP BYE 2745 message on only the base layer but changes the SSRC identifier to the 2746 new value in all layers. 2748 9 Security 2750 Lower layer protocols may eventually provide all the security 2751 services that may be desired for applications of RTP, including 2752 authentication, integrity, and confidentiality. These services have 2753 been specified for IP in [22]. Since the initial audio and video 2754 applications using RTP needed a confidentiality service before such 2755 services were available for the IP layer, the confidentiality service 2756 described in the next section was defined for use with RTP and RTCP. 2757 That description is included here to codify existing practice. 
New 2758 applications of RTP MAY implement this RTP-specific confidentiality 2759 service for backward compatibility, and/or they MAY implement IP 2760 layer security services. The overhead on the RTP protocol for this 2761 confidentiality service is low, so the penalty will be minimal if 2762 this service is obsoleted by lower layer services in the future. 2764 Alternatively, other services, other implementations of services and 2765 other algorithms may be defined for RTP in the future if warranted. 2766 The selection presented here is meant to simplify implementation of 2767 interoperable, secure applications and provide guidance to 2768 implementors. No claim is made that the methods presented here are 2769 appropriate for a particular security need. A profile may specify 2770 which services and algorithms should be offered by applications, and 2771 may provide guidance as to their appropriate use. 2773 Key distribution and certificates are outside the scope of this 2774 document. 2776 9.1 Confidentiality 2778 Confidentiality means that only the intended receiver(s) can decode 2779 the received packets; for others, the packet contains no useful 2780 information. Confidentiality of the content is achieved by 2781 encryption. 2783 When encryption of RTP or RTCP is desired, all the octets that will 2784 be encapsulated for transmission in a single lower-layer packet are 2785 encrypted as a unit. For RTCP, a 32-bit random number MUST be 2786 prepended to the unit before encryption to deter known plaintext 2787 attacks. For RTP, no prefix is required because the sequence number 2788 and timestamp fields are initialized with random offsets. 2790 For RTCP, an implementation MAY split a compound RTCP packet into two 2791 lower-layer packets, one to be encrypted and one to be sent in the 2792 clear. 
For example, SDES information might be encrypted while
reception reports were sent in the clear to accommodate third-party
monitors that are not privy to the encryption key. In this example,
depicted in Fig. 4, the SDES information MUST be appended to an RR
packet with no reports (and then encrypted) to satisfy the requirement
that all compound RTCP packets begin with an SR or RR packet.

          UDP packet                        UDP packet
-------------------------------------  -------------------------
[32-bit ][       ][     #           ]  [    # sender # receiver]
[random ][  RR   ][SDES # CNAME, ...]  [ SR # report # report  ]
[integer][(empty)][     #           ]  [    #        #         ]
-------------------------------------  -------------------------
             encrypted                       not encrypted

#: SSRC

Figure 4: Encrypted and non-encrypted RTCP packets

The presence of encryption and the use of the correct key are
confirmed by the receiver through header or payload validity checks.
Examples of such validity checks for RTP and RTCP headers are given
in Appendices A.1 and A.2.

The default encryption algorithm is the Data Encryption Standard
(DES) algorithm in cipher block chaining (CBC) mode, as described in
Section 1.1 of RFC 1423 [23], except that padding to a multiple of 8
octets is indicated as described for the P bit in Section 5.1. The
initialization vector is zero because random values are supplied in
the RTP header or by the random prefix for compound RTCP packets. For
details on the use of CBC initialization vectors, see [24].
Implementations that support encryption SHOULD always support the DES
algorithm in CBC mode as the default to maximize interoperability.
This method is chosen because it has been demonstrated to be easy and
practical to use in experimental audio and video tools in operation
on the Internet. Other encryption algorithms MAY be specified
dynamically for a session by non-RTP means.
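The padding step implied by the use of an 8-octet block cipher can be sketched in C. This is an illustration only, not a definitive implementation. It assumes the padding convention of the P bit in Section 5.1, under which the last padding octet carries the number of padding octets, including itself. The function name pad_for_cbc and its parameters are hypothetical, the caller is assumed to set the P bit in the header whenever padding was added, and the encryption itself is not shown.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DES_BLOCK 8   /* DES block size in octets */

/*
 * Pad the packet in buf (currently *len octets, capacity cap) to a
 * multiple of DES_BLOCK octets.  The final padding octet holds the
 * padding count; the other padding octets are zeroed.
 *
 * Returns the number of padding octets added (0..7), in which case
 * *len is updated, or -1 if buf is too small to hold the padding.
 */
static int pad_for_cbc(uint8_t *buf, size_t *len, size_t cap)
{
    size_t pad = (DES_BLOCK - *len % DES_BLOCK) % DES_BLOCK;

    if (pad == 0)
        return 0;                 /* already block-aligned */
    if (*len + pad > cap)
        return -1;                /* no room for the padding */

    memset(buf + *len, 0, pad - 1);
    buf[*len + pad - 1] = (uint8_t)pad;
    *len += pad;
    return (int)pad;
}
```

For a 13-octet packet in a 16-octet buffer the function adds three octets, the last of which holds the value 3. For RTCP, the 32-bit random prefix described above would be prepended before this step so that it is counted in the length being padded.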
2830 As an alternative to encryption at the IP level or at the RTP level 2831 as described above, profiles MAY define additional payload types for 2832 encrypted encodings. Those encodings MUST specify how padding and 2833 other aspects of the encryption should be handled. This method allows 2834 encrypting only the data while leaving the headers in the clear for 2835 applications where that is desired. It may be particularly useful for 2836 hardware devices that will handle both decryption and decoding. 2838 9.2 Authentication and Message Integrity 2840 Authentication and message integrity services are not defined at the 2841 RTP level since these services would not be directly feasible without 2842 a key management infrastructure. It is expected that authentication 2843 and integrity services will be provided by lower layer protocols. 2845 10 RTP over Network and Transport Protocols 2846 This section describes issues specific to carrying RTP packets within 2847 particular network and transport protocols. The following rules apply 2848 unless superseded by protocol-specific definitions outside this 2849 specification. 2851 RTP relies on the underlying protocol(s) to provide demultiplexing of 2852 RTP data and RTCP control streams. For UDP and similar protocols, RTP 2853 SHOULD use an even port number and the corresponding RTCP stream 2854 SHOULD use the next higher (odd) port number. If an application is 2855 supplied with an odd number for use as the RTP port, it SHOULD 2856 replace this number with the next lower (even) number. 2858 In a unicast session, applications SHOULD be prepared to receive RTP 2859 data and control on one port pair and send to another. 2861 It is RECOMMENDED that layered encoding applications (see Section 2862 2.4) use a set of contiguous port numbers. 
The port numbers MUST be 2863 distinct because of a widespread deficiency in existing operating 2864 systems that prevents use of the same port with multiple multicast 2865 addresses, and for unicast, there is only one permissible address. 2866 Thus for layer n, the data port is P + 2n, and the control port is P 2867 + 2n + 1. When IP multicast is used, the addresses MUST also be 2868 distinct because multicast routing and group membership are managed 2869 on an address granularity. However, allocation of contiguous IP 2870 multicast addresses cannot be assumed because some groups may require 2871 different scopes and may therefore be allocated from different 2872 address ranges. 2874 RTP data packets contain no length field or other delineation, 2875 therefore RTP relies on the underlying protocol(s) to provide a 2876 length indication. The maximum length of RTP packets is limited only 2877 by the underlying protocols. 2879 If RTP packets are to be carried in an underlying protocol that 2880 provides the abstraction of a continuous octet stream rather than 2881 messages (packets), an encapsulation of the RTP packets MUST be 2882 defined to provide a framing mechanism. Framing is also needed if the 2883 underlying protocol may contain padding so that the extent of the RTP 2884 payload cannot be determined. The framing mechanism is not defined 2885 here. 2887 A profile MAY specify a framing method to be used even when RTP is 2888 carried in protocols that do provide framing in order to allow 2889 carrying several RTP packets in one lower-layer protocol data unit, 2890 such as a UDP packet. Carrying several RTP packets in one network or 2891 transport packet reduces header overhead and may simplify 2892 synchronization between different streams. 2894 11 Summary of Protocol Constants 2896 This section contains a summary listing of the constants defined in 2897 this specification. 
The RTP payload type (PT) constants are defined in profiles rather
than this document. However, the octet of the RTP header which
contains the marker bit(s) and payload type MUST avoid the reserved
values 200 and 201 (decimal) to distinguish RTP packets from the RTCP
SR and RR packet types for the header validation procedure described
in Appendix A.1. For the standard definition of one marker bit and a
7-bit payload type field as shown in this specification, this
restriction means that payload types 72 and 73 are reserved.

11.1 RTCP packet types

   abbrev.  name                  value
   SR       sender report           200
   RR       receiver report         201
   SDES     source description      202
   BYE      goodbye                 203
   APP      application-defined     204

These type values were chosen in the range 200-204 for improved
header validity checking of RTCP packets compared to RTP packets or
other unrelated packets. When the RTCP packet type field is compared
to the corresponding octet of the RTP header, this range corresponds
to the marker bit being 1 (which it usually is not in data packets)
and to the high bit of the standard payload type field being 1 (since
the static payload types are typically defined in the low half). This
range was also chosen to be some distance numerically from 0 and 255
since all-zeros and all-ones are common data patterns.

Since all compound RTCP packets MUST begin with SR or RR, these codes
were chosen as an even/odd pair to allow the RTCP validity check to
test the maximum number of bits with mask and value.

Additional RTCP packet types may be registered through IANA (see
Section 11.3).

11.2 SDES types

   abbrev.  name                             value
   END      end of SDES list                     0
   CNAME    canonical name                       1
   NAME     user name                            2
   EMAIL    user's electronic mail address       3
   PHONE    user's phone number                  4
   LOC      geographic user location             5
   TOOL     name of application or tool          6
   NOTE     notice about the source              7
   PRIV     private extensions                   8

Additional SDES types may be registered through IANA (see Section
11.3).

11.3 IANA Considerations

Additional RTCP packet types and SDES types may be registered through
the Internet Assigned Numbers Authority (IANA). Since these number
spaces are small, allowing unconstrained registration of new values
would not be prudent. To facilitate review of requests and to promote
shared use of new types among multiple applications, requests for
registration of new values must be documented in an RFC or other
permanent and readily available reference such as the product of
another cooperative standards body (e.g., ITU-T). Other requests may
also be accepted, under the advice of a "designated expert." (Contact
the IANA for the contact information of the current expert.)

A Algorithms

We provide examples of C code for aspects of RTP sender and receiver
algorithms. There may be other implementation methods that are faster
in particular operating environments or have other advantages. These
implementation notes are for informational purposes only and are
meant to clarify the RTP specification.

The following definitions are used for all examples; for clarity and
brevity, the structure definitions are only valid for 32-bit
big-endian (most significant octet first) architectures. Bit fields
are assumed to be packed tightly in big-endian bit order, with no
additional padding. Modifications would be required to construct a
portable implementation.
2978 /* 2979 * rtp.h -- RTP header file (RFC XXXX) 2980 */ 2981 #include <sys/types.h> 2983 /* 2984 * The type definitions below are valid for 32-bit architectures and 2985 * may have to be adjusted for 16- or 64-bit architectures. 2986 */ 2987 typedef unsigned char u_int8; 2988 typedef unsigned short u_int16; 2989 typedef unsigned int u_int32; 2990 typedef short int16; 2992 /* 2993 * Current protocol version. 2994 */ 2995 #define RTP_VERSION 2 2997 #define RTP_SEQ_MOD (1<<16) 2998 #define RTP_MAX_SDES 255 /* maximum text length for SDES */ 3000 typedef enum { 3001 RTCP_SR = 200, 3002 RTCP_RR = 201, 3003 RTCP_SDES = 202, 3004 RTCP_BYE = 203, 3005 RTCP_APP = 204 3006 } rtcp_type_t; 3008 typedef enum { 3009 RTCP_SDES_END = 0, 3010 RTCP_SDES_CNAME = 1, 3011 RTCP_SDES_NAME = 2, 3012 RTCP_SDES_EMAIL = 3, 3013 RTCP_SDES_PHONE = 4, 3014 RTCP_SDES_LOC = 5, 3015 RTCP_SDES_TOOL = 6, 3016 RTCP_SDES_NOTE = 7, 3017 RTCP_SDES_PRIV = 8 3018 } rtcp_sdes_type_t; 3020 /* 3021 * RTP data header 3022 */ 3023 typedef struct { 3024 unsigned int version:2; /* protocol version */ 3025 unsigned int p:1; /* padding flag */ 3026 unsigned int x:1; /* header extension flag */ 3027 unsigned int cc:4; /* CSRC count */ 3028 unsigned int m:1; /* marker bit */ 3029 unsigned int pt:7; /* payload type */ 3030 unsigned int seq:16; /* sequence number */ 3031 u_int32 ts; /* timestamp */ 3032 u_int32 ssrc; /* synchronization source */ 3033 u_int32 csrc[1]; /* optional CSRC list */ 3034 } rtp_hdr_t; 3036 /* 3037 * RTCP common header word 3038 */ 3039 typedef struct { 3040 unsigned int version:2; /* protocol version */ 3041 unsigned int p:1; /* padding flag */ 3042 unsigned int count:5; /* varies by packet type */ 3043 unsigned int pt:8; /* RTCP packet type */ 3044 u_int16 length; /* pkt len in words, w/o this word */ 3045 } rtcp_common_t; 3047 /* 3048 * Big-endian mask for version, padding bit and packet type pair 3049 */ 3050 #define RTCP_VALID_MASK (0xc000 | 0x2000 | 0xfe) 3051 #define RTCP_VALID_VALUE ((RTP_VERSION
<< 14) | RTCP_SR) 3053 /* 3054 * Reception report block 3055 */ 3056 typedef struct { 3057 u_int32 ssrc; /* data source being reported */ 3058 unsigned int fraction:8; /* fraction lost since last SR/RR */ 3059 int lost:24; /* cumul. no. pkts lost (signed!) */ 3060 u_int32 last_seq; /* extended last seq. no. received */ 3061 u_int32 jitter; /* interarrival jitter */ 3062 u_int32 lsr; /* last SR packet from this source */ 3063 u_int32 dlsr; /* delay since last SR packet */ 3064 } rtcp_rr_t; 3066 /* 3067 * SDES item 3068 */ 3069 typedef struct { 3070 u_int8 type; /* type of item (rtcp_sdes_type_t) */ 3071 u_int8 length; /* length of item (in octets) */ 3072 char data[1]; /* text, not null-terminated */ 3074 } rtcp_sdes_item_t; 3076 /* 3077 * One RTCP packet 3078 */ 3079 typedef struct { 3080 rtcp_common_t common; /* common header */ 3081 union { 3082 /* sender report (SR) */ 3083 struct { 3084 u_int32 ssrc; /* sender generating this report */ 3085 u_int32 ntp_sec; /* NTP timestamp */ 3086 u_int32 ntp_frac; 3087 u_int32 rtp_ts; /* RTP timestamp */ 3088 u_int32 psent; /* packets sent */ 3089 u_int32 osent; /* octets sent */ 3090 rtcp_rr_t rr[1]; /* variable-length list */ 3091 } sr; 3093 /* reception report (RR) */ 3094 struct { 3095 u_int32 ssrc; /* receiver generating this report */ 3096 rtcp_rr_t rr[1]; /* variable-length list */ 3097 } rr; 3099 /* source description (SDES) */ 3100 struct rtcp_sdes { 3101 u_int32 src; /* first SSRC/CSRC */ 3102 rtcp_sdes_item_t item[1]; /* list of SDES items */ 3103 } sdes; 3105 /* BYE */ 3106 struct { 3107 u_int32 src[1]; /* list of sources */ 3108 /* can't express trailing text for reason */ 3109 } bye; 3110 } r; 3111 } rtcp_t; 3113 typedef struct rtcp_sdes rtcp_sdes_t; 3114 /* 3115 * Per-source state information 3116 */ 3117 typedef struct { 3118 u_int16 max_seq; /* highest seq. number seen */ 3119 u_int32 cycles; /* shifted count of seq. 
number cycles */ 3120 u_int32 base_seq; /* base seq number */ 3121 u_int32 bad_seq; /* last 'bad' seq number + 1 */ 3122 u_int32 probation; /* sequ. packets till source is valid */ 3123 u_int32 received; /* packets received */ 3124 u_int32 expected_prior; /* packet expected at last interval */ 3125 u_int32 received_prior; /* packet received at last interval */ 3126 u_int32 transit; /* relative trans time for prev pkt */ 3127 u_int32 jitter; /* estimated jitter */ 3128 /* ... */ 3129 } source; 3131 A.1 RTP Data Header Validity Checks 3133 An RTP receiver SHOULD check the validity of the RTP header on 3134 incoming packets since they might be encrypted or might be from a 3135 different application that happens to be misaddressed. Similarly, if 3136 encryption according to the method described in Section 9 is enabled, 3137 the header validity check is needed to verify that incoming packets 3138 have been correctly decrypted, although a failure of the header 3139 validity check (e.g., unknown payload type) may not necessarily 3140 indicate decryption failure. 3142 Only weak validity checks are possible on an RTP data packet from a 3143 source that has not been heard before: 3145 o RTP version field must equal 2. 3147 o The payload type must be known, in particular it must not be 3148 equal to SR or RR. 3150 o If the P bit is set, then the last octet of the packet must 3151 contain a valid octet count, in particular, less than the total 3152 packet length minus the header size. 3154 o The X bit must be zero if the profile does not specify that 3155 the header extension mechanism may be used. Otherwise, the 3156 extension length field must be less than the total packet size 3157 minus the fixed header length and padding. 3159 o The length of the packet must be consistent with CC and 3160 payload type (if payloads have a known length). 3162 The last three checks are somewhat complex and not always possible, 3163 leaving only the first two which total just a few bits. 
If the SSRC 3164 identifier in the packet is one that has been received before, then 3165 the packet is probably valid and checking if the sequence number is 3166 in the expected range provides further validation. If the SSRC 3167 identifier has not been seen before, then data packets carrying that 3168 identifier may be considered invalid until a small number of them 3169 arrive with consecutive sequence numbers. 3171 The routine update_seq shown below ensures that a source is declared 3172 valid only after MIN_SEQUENTIAL packets have been received in 3173 sequence. It also validates the sequence number seq of a newly 3174 received packet and updates the sequence state for the packet's 3175 source in the structure to which s points. 3177 When a new source is heard for the first time, that is, its SSRC 3178 identifier is not in the table (see Section 8.2), and the per-source 3179 state is allocated for it, s->probation should be set to the number 3180 of sequential packets required before declaring a source valid 3181 (parameter MIN_SEQUENTIAL) and s->max_seq initialized to seq-1. 3182 s->probation marks the source as not yet valid so the state may be 3183 discarded after a short timeout rather than a long one, as discussed 3184 in Section 6.2.1. 3186 After a source is considered valid, the sequence number is considered 3187 valid if it is no more than MAX_DROPOUT ahead of s->max_seq nor more 3188 than MAX_MISORDER behind. If the new sequence number is ahead of 3189 max_seq modulo the RTP sequence number range (16 bits), but is 3190 smaller than max_seq, it has wrapped around and the (shifted) count 3191 of sequence number cycles is incremented. A value of one is returned 3192 to indicate a valid sequence number. 3194 Otherwise, the value zero is returned to indicate that the validation 3195 failed, and the bad sequence number plus one is stored.
If the next packet 3196 received carries the next higher sequence number, it is considered 3197 the valid start of a new packet sequence presumably caused by an 3198 extended dropout or a source restart. Since multiple complete 3199 sequence number cycles may have been missed, the packet loss 3200 statistics are reset. 3202 Typical values for the parameters are shown, based on a maximum 3203 misordering time of 2 seconds at 50 packets/second and a maximum 3204 dropout of 1 minute. The dropout parameter MAX_DROPOUT SHOULD be a 3205 small fraction of the 16-bit sequence number space to give a 3206 reasonable probability that new sequence numbers after a restart will 3207 not fall in the acceptable range for sequence numbers from before the 3208 restart. 3210 void init_seq(source *s, u_int16 seq) 3211 { 3212 s->base_seq = seq; /* first counted packet, so expected starts at 1 */ 3213 s->max_seq = seq; 3214 s->bad_seq = RTP_SEQ_MOD + 1; 3215 s->cycles = 0; 3216 s->received = 0; 3217 s->received_prior = 0; 3218 s->expected_prior = 0; 3219 /* other initialization */ 3220 } 3222 int update_seq(source *s, u_int16 seq) 3223 { 3224 u_int16 udelta = seq - s->max_seq; 3225 const int MAX_DROPOUT = 3000; 3226 const int MAX_MISORDER = 100; 3227 const int MIN_SEQUENTIAL = 2; 3229 /* 3230 * Source is not valid until MIN_SEQUENTIAL packets with 3231 * sequential sequence numbers have been received. 3232 */ 3233 if (s->probation) { 3234 /* packet is in sequence */ 3235 if (seq == s->max_seq + 1) { 3236 s->probation--; 3237 s->max_seq = seq; 3238 if (s->probation == 0) { 3239 init_seq(s, seq); 3240 s->received++; 3241 return 1; 3242 } 3243 } else { 3244 s->probation = MIN_SEQUENTIAL - 1; 3245 s->max_seq = seq; 3246 } 3247 return 0; 3248 } else if (udelta < MAX_DROPOUT) { 3249 /* in order, with permissible gap */ 3250 if (seq < s->max_seq) { 3251 /* 3252 * Sequence number wrapped - count another 64K cycle.
3253 */ 3254 s->cycles += RTP_SEQ_MOD; 3255 } 3256 s->max_seq = seq; 3258 } else if (udelta <= RTP_SEQ_MOD - MAX_MISORDER) { 3259 /* the sequence number made a very large jump */ 3260 if (seq == s->bad_seq) { 3261 /* 3262 * Two sequential packets -- assume that the other side 3263 * restarted without telling us so just re-sync 3264 * (i.e., pretend this was the first packet). 3265 */ 3266 init_seq(s, seq); 3267 } 3268 else { 3269 s->bad_seq = (seq + 1) & (RTP_SEQ_MOD-1); 3270 return 0; 3271 } 3272 } else { 3273 /* duplicate or reordered packet */ 3274 } 3275 s->received++; 3276 return 1; 3277 } 3279 The validity check can be made stronger by requiring more than two 3280 packets in sequence. The disadvantages are that a larger number of 3281 initial packets will be discarded and that high packet loss rates 3282 could prevent validation. However, because the RTCP header validation 3283 is relatively strong, if an RTCP packet is received from a source 3284 before the data packets, the count could be adjusted so that only two 3285 packets are required in sequence. If initial data loss for a few 3286 seconds can be tolerated, an application MAY choose to discard all 3287 data packets from a source until a valid RTCP packet has been 3288 received from that source. 3290 Depending on the application and encoding, algorithms may exploit 3291 additional knowledge about the payload format for further validation. 3292 For payload types where the timestamp increment is the same for all 3293 packets, the timestamp values can be predicted from the previous 3294 packet received from the same source using the sequence number 3295 difference (assuming no change in payload type). 3297 A strong "fast-path" check is possible since with high probability 3298 the first four octets in the header of a newly received RTP data 3299 packet will be just the same as those of the previous packet from the 3300 same SSRC except that the sequence number will have increased by one.
3301 Similarly, a single-entry cache may be used for faster SSRC lookups 3302 in applications where data is typically received from one source at a 3303 time. 3305 A.2 RTCP Header Validity Checks 3307 The following checks SHOULD be applied to RTCP packets. 3309 o RTP version field must equal 2. 3311 o The payload type field of the first RTCP packet in a compound 3312 packet must be equal to SR or RR. 3314 o The padding bit (P) should be zero for the first packet of a 3315 compound RTCP packet because padding should only be applied, if 3316 it is needed, to the last packet. 3318 o The length fields of the individual RTCP packets must total to 3319 the overall length of the compound RTCP packet as received. 3320 This is a fairly strong check. 3322 The code fragment below performs all of these checks. The packet type 3323 is not checked for subsequent packets since unknown packet types may 3324 be present and should be ignored. 3326 u_int32 len; /* length of compound RTCP packet in words */ 3327 rtcp_t *r; /* RTCP header */ 3328 rtcp_t *end; /* end of compound RTCP packet */ 3330 if ((*(u_int16 *)r & RTCP_VALID_MASK) != RTCP_VALID_VALUE) { 3331 /* something wrong with packet format */ 3332 } 3333 end = (rtcp_t *)((u_int32 *)r + len); 3335 do r = (rtcp_t *)((u_int32 *)r + r->common.length + 1); 3336 while (r < end && r->common.version == 2); 3338 if (r != end) { 3339 /* something wrong with packet format */ 3340 } 3342 A.3 Determining the Number of RTP Packets Expected and Lost 3344 In order to compute packet loss rates, the number of packets expected 3345 and actually received from each source needs to be known, using per- 3346 source state information defined in struct source referenced via 3347 pointer s in the code below. The number of packets received is simply 3348 the count of packets as they arrive, including any late or duplicate 3349 packets. 
The number of packets expected can be computed by the 3350 receiver as the difference between the highest sequence number 3351 received (s->max_seq) and the first sequence number received 3352 (s->base_seq). Since the sequence number is only 16 bits and will wrap 3353 around, it is necessary to extend the highest sequence number with 3354 the (shifted) count of sequence number wraparounds (s->cycles). 3355 Both the received packet count and the count of cycles are maintained 3356 by the RTP header validity check routine in Appendix A.1. 3358 extended_max = s->cycles + s->max_seq; 3359 expected = extended_max - s->base_seq + 1; 3361 The number of packets lost is defined to be the number of packets 3362 expected less the number of packets actually received: 3364 lost = expected - s->received; 3366 Since this signed number is carried in 24 bits, it SHOULD be clamped 3367 at 0x7fffff for positive loss or 0x800000 for negative loss rather 3368 than wrapping around. 3370 The fraction of packets lost during the last reporting interval 3371 (since the previous SR or RR packet was sent) is calculated from 3372 differences in the expected and received packet counts across the 3373 interval, where expected_prior and received_prior are the values 3374 saved when the previous reception report was generated: 3376 expected_interval = expected - s->expected_prior; 3377 s->expected_prior = expected; 3378 received_interval = s->received - s->received_prior; 3379 s->received_prior = s->received; 3380 lost_interval = expected_interval - received_interval; 3381 if (expected_interval == 0 || lost_interval <= 0) fraction = 0; 3382 else fraction = (lost_interval << 8) / expected_interval; 3384 The resulting fraction is an 8-bit fixed point number with the binary 3385 point at the left edge.
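The clamping and fraction computations described in this section can be sketched as small helper functions. This is an illustrative sketch only. The function names are hypothetical, and two choices are assumptions of the sketch rather than statements of the specification: the lower clamp is taken to be the most negative 24-bit two's-complement value (0x800000), and the fraction is saturated at 255 so that a total loss in the interval (which would compute to 256) does not truncate to zero when stored in eight bits.

```c
#include <assert.h>
#include <stdint.h>

/* Clamp a signed loss count to the 24-bit two's-complement range. */
int32_t clamp_lost24(int64_t lost)
{
    if (lost > 0x7fffff)  return 0x7fffff;
    if (lost < -0x800000) return -0x800000;
    return (int32_t)lost;
}

/* Encode the clamped loss count as it would appear in the 24-bit field. */
uint32_t lost24_field(int64_t lost)
{
    int32_t c = clamp_lost24(lost);
    assert(c >= -0x800000 && c <= 0x7fffff);
    return (uint32_t)c & 0xffffffu;
}

/* 8-bit fixed point loss fraction for one reporting interval. */
uint8_t fraction_lost(uint32_t expected_interval, uint32_t received_interval)
{
    int64_t lost_interval =
        (int64_t)expected_interval - (int64_t)received_interval;
    int64_t f;

    if (expected_interval == 0 || lost_interval <= 0) return 0;
    f = (lost_interval << 8) / expected_interval;
    return (uint8_t)(f > 255 ? 255 : f);   /* saturate (sketch assumption) */
}
```

For example, 25 packets lost out of 100 expected gives (25 << 8) / 100 = 64, that is, 64/256 or one quarter.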
3387 A.4 Generating SDES RTCP Packets 3389 This function builds one SDES chunk into buffer b composed of argc 3390 items supplied in arrays type, value and length. It returns a pointer to the next available location within b. 3392 char *rtp_write_sdes(char *b, u_int32 src, int argc, 3393 rtcp_sdes_type_t type[], char *value[], 3394 int length[]) 3395 { 3396 rtcp_sdes_t *s = (rtcp_sdes_t *)b; 3397 rtcp_sdes_item_t *rsp; 3398 int i; 3399 int len; 3400 int pad; 3402 /* SSRC header */ 3403 s->src = src; 3404 rsp = &s->item[0]; 3406 /* SDES items */ 3407 for (i = 0; i < argc; i++) { 3408 rsp->type = type[i]; 3409 len = length[i]; 3410 if (len > RTP_MAX_SDES) { 3411 /* invalid length, may want to take other action */ 3412 len = RTP_MAX_SDES; 3413 } 3414 rsp->length = len; 3415 memcpy(rsp->data, value[i], len); 3416 rsp = (rtcp_sdes_item_t *)&rsp->data[len]; 3417 } 3419 /* terminate with end marker and pad to next 4-octet boundary */ 3420 len = ((char *) rsp) - b; 3421 pad = 4 - (len & 0x3); 3422 b = (char *) rsp; 3423 while (pad--) *b++ = RTCP_SDES_END; 3425 return b; 3426 } 3428 A.5 Parsing RTCP SDES Packets 3430 This function parses an SDES packet, calling functions find_member() 3431 to find a pointer to the information for a session member given the 3432 SSRC identifier and member_sdes() to store the new SDES information 3433 for that member. This function expects a pointer to the header of the 3434 RTCP packet.
3436 void rtp_read_sdes(rtcp_t *r) 3437 { 3438 int count = r->common.count; 3439 rtcp_sdes_t *sd = &r->r.sdes; 3440 rtcp_sdes_item_t *rsp, *rspn; 3441 rtcp_sdes_item_t *end = (rtcp_sdes_item_t *) 3442 ((u_int32 *)r + r->common.length + 1); 3443 source *s; 3445 while (--count >= 0) { 3446 rsp = &sd->item[0]; 3447 if (rsp >= end) break; 3448 s = find_member(sd->src); 3450 for (; rsp->type; rsp = rspn ) { 3451 rspn = (rtcp_sdes_item_t *)((char*)rsp+rsp->length+2); 3452 if (rspn >= end) { 3453 rsp = rspn; 3454 break; 3455 } 3456 member_sdes(s, rsp->type, rsp->data, rsp->length); 3457 } 3458 sd = (rtcp_sdes_t *) 3459 ((u_int32 *)sd + (((char *)rsp - (char *)sd) >> 2)+1); 3460 } 3461 if (count >= 0) { 3462 /* invalid packet format */ 3463 } 3464 } 3466 A.6 Generating a Random 32-bit Identifier 3468 The following subroutine generates a random 32-bit identifier using 3469 the MD5 routines published in RFC 1321 [25]. The system routines may 3470 not be present on all operating systems, but they should serve as 3471 hints as to what kinds of information may be used. Other system calls 3472 that may be appropriate include 3474 o getdomainname() , 3476 o getwd() , or 3478 o getrusage() 3480 "Live" video or audio samples are also a good source of random 3481 numbers, but care must be taken to avoid using a turned-off 3482 microphone or blinded camera as a source [8]. 3484 Use of this or similar routine is RECOMMENDED to generate the initial 3485 seed for the random number generator producing the RTCP period (as 3486 shown in Appendix A.7), to generate the initial values for the 3487 sequence number and timestamp, and to generate SSRC values. Since 3488 this routine is likely to be CPU-intensive, its direct use to 3489 generate RTCP periods is inappropriate because predictability is not 3490 an issue. Note that this routine produces the same result on repeated 3491 calls until the value of the system clock changes unless different 3492 values are supplied for the type argument. 
3494 /* 3495 * Generate a random 32-bit quantity. 3496 */ 3497 #include <sys/types.h> /* u_long */ 3498 #include <sys/time.h> /* gettimeofday() */ 3499 #include <unistd.h> /* get..() */ 3500 #include <stdio.h> /* printf() */ 3501 #include <time.h> /* clock() */ 3502 #include <sys/utsname.h> /* uname() */ 3503 #include "global.h" /* from RFC 1321 */ 3504 #include "md5.h" /* from RFC 1321 */ 3506 #define MD_CTX MD5_CTX 3507 #define MDInit MD5Init 3508 #define MDUpdate MD5Update 3509 #define MDFinal MD5Final 3511 static u_long md_32(char *string, int length) 3512 { 3513 MD_CTX context; 3514 union { 3515 char c[16]; 3516 u_long x[4]; 3517 } digest; 3518 u_long r; 3519 int i; 3521 MDInit (&context); 3522 MDUpdate (&context, string, length); 3523 MDFinal ((unsigned char *)&digest, &context); 3524 r = 0; 3525 for (i = 0; i < 3; i++) { 3526 r ^= digest.x[i]; 3527 } 3528 return r; 3529 } /* md_32 */ 3531 /* 3532 * Return random unsigned 32-bit quantity. Use 'type' argument if you 3533 * need to generate several different values in close succession. 3534 */ 3535 u_int32 random32(int type) 3536 { 3537 struct { 3538 int type; 3539 struct timeval tv; 3540 clock_t cpu; 3541 pid_t pid; 3542 u_long hid; 3543 uid_t uid; 3544 gid_t gid; 3545 struct utsname name; 3546 } s; 3548 gettimeofday(&s.tv, 0); 3549 uname(&s.name); 3550 s.type = type; 3551 s.cpu = clock(); 3552 s.pid = getpid(); 3553 s.hid = gethostid(); 3554 s.uid = getuid(); 3555 s.gid = getgid(); 3556 /* also: system uptime */ 3558 return md_32((char *)&s, sizeof(s)); 3559 } /* random32 */ 3561 A.7 Computing the RTCP Transmission Interval 3563 The following functions implement the RTCP transmission and reception 3564 rules described in Section 6.2. These rules are coded in several 3565 functions: 3567 o OnExpire() is called when the RTCP transmission timer expires. 3569 o rtcp_interval() computes the deterministic calculated 3570 interval, measured in seconds. 3572 o OnReceive() is called whenever an RTCP packet is received.
3574 It is assumed that the following functions are available: 3576 o Schedule(time t, event e) schedules an event e to occur at 3577 time t. When time t arrives, the function OnExpire is called 3578 with e as an argument. 3580 o ReSchedule(time t, event e) reschedules a previously scheduled 3581 event e for time t. 3583 o SendRTCPReport() sends an RTCP report. 3585 o SendBYEPacket() sends a BYE packet. 3587 o TypeOfEvent(event e) returns EVENT_BYE if the event being 3588 processed is for a BYE packet to be sent, else it returns 3589 EVENT_REPORT. 3591 o NewMember(p) returns 1 if the participant who sent packet p is 3592 not currently in the member list, 0 otherwise. Note this 3593 function is not sufficient for a complete implementation 3594 because each CSRC identifier in an RTP packet and each SSRC in 3595 a BYE packet should be processed. 3597 o PacketType(p) returns PACKET_RTCP_REPORT if packet p is an 3598 RTCP report (not BYE), PACKET_BYE if it is a BYE RTCP packet, and 3599 PACKET_RTP if it is a regular RTP data packet. 3601 The parameters of rtcp_interval() are defined in Section 6.3. 3603 double rtcp_interval(int members, 3604 int senders, 3605 double rtcp_bw, 3606 int we_sent, 3607 double avg_rtcp_size, 3608 int initial) 3609 { 3610 /* 3611 * Minimum average time between RTCP packets from this site (in 3612 * seconds). This time prevents the reports from `clumping' when 3613 * sessions are small and the law of large numbers isn't helping 3614 * to smooth out the traffic. It also keeps the report interval 3615 * from becoming ridiculously small during transient outages like 3616 * a network partition. 3617 */ 3618 double const RTCP_MIN_TIME = 5.; 3619 /* 3620 * Fraction of the RTCP bandwidth to be shared among active 3621 * senders.
(This fraction was chosen so that in a typical 3622 * session with one or two active senders, the computed report 3623 * time would be roughly equal to the minimum report time so that 3624 * we don't unnecessarily slow down receiver reports.) The 3625 * receiver fraction must be 1 - the sender fraction. 3626 */ 3627 double const RTCP_SENDER_BW_FRACTION = 0.25; 3628 double const RTCP_RCVR_BW_FRACTION = (1-RTCP_SENDER_BW_FRACTION); 3629 double t; /* interval */ 3630 double rtcp_min_time = RTCP_MIN_TIME; 3631 int n; /* no. of members for computation */ 3633 /* 3634 * Very first call at application start-up uses half the min 3635 * delay for quicker notification while still allowing some time 3636 * before reporting for randomization and to learn about other 3637 * sources so the report interval will converge to the correct 3638 * interval more quickly. 3639 */ 3640 if (initial) { 3641 rtcp_min_time /= 2; 3642 } 3644 /* 3645 * If there were active senders, give them at least a minimum 3646 * share of the RTCP bandwidth. Otherwise all participants share 3647 * the RTCP bandwidth equally. 3648 */ 3649 n = members; 3650 if (senders > 0 && senders < members * RTCP_SENDER_BW_FRACTION) { 3651 if (we_sent) { 3652 rtcp_bw *= RTCP_SENDER_BW_FRACTION; 3653 n = senders; 3654 } else { 3655 rtcp_bw *= RTCP_RCVR_BW_FRACTION; 3656 n -= senders; 3657 } 3658 } 3660 /* 3661 * The effective number of sites times the average packet size is 3662 * the total number of octets sent when each site sends a report. 3663 * Dividing this by the effective bandwidth gives the time 3664 * interval over which those packets must be sent in order to 3665 * meet the bandwidth target, with a minimum enforced. In that 3666 * time interval we send one report so this time is also our 3667 * average time between reports. 
3668 */ 3669 t = avg_rtcp_size * n / rtcp_bw; 3670 if (t < rtcp_min_time) t = rtcp_min_time; 3672 /* 3673 * To avoid traffic bursts from unintended synchronization with 3674 * other sites, we then pick our actual next report interval as a 3675 * random number uniformly distributed between 0.5*t and 1.5*t. 3676 */ 3677 return t * (drand48() + 0.5); 3678 } 3679 void OnExpire(event e, 3680 int members, 3681 int senders, 3682 double rtcp_bw, 3683 int we_sent, 3684 double *avg_rtcp_size, 3685 int *initial, 3686 time tc, 3687 time *tp, 3688 int *pmembers) { 3690 /* This function is responsible for deciding whether to send 3691 * an RTCP report or BYE packet now, or to reschedule transmission. 3692 * It is also responsible for updating the pmembers, initial, tp, 3693 * and avg_rtcp_size state variables. This function should be called 3694 * upon expiration of the event timer used by Schedule(). */ 3696 double t; /* Interval */ 3697 double tn; /* Next transmit time */ 3698 int SendIt; /* flag for sending packet */ 3700 /* To compensate for "unconditional reconsideration" converging to a 3701 * value below the intended average. 
*/ 3702 double const COMPENSATION = 2.71828 - 1.5; 3704 /* In the case of a BYE, we use "unconditional reconsideration" to 3705 * reschedule the transmission of the BYE if necessary */ 3707 if (TypeOfEvent(e) == EVENT_BYE) { 3708 t = rtcp_interval(members, 3709 senders, 3710 rtcp_bw * COMPENSATION, 3711 we_sent, 3712 *avg_rtcp_size, 3713 *initial); 3714 tn = *tp + t; 3715 if (tn <= tc) { 3716 SendBYEPacket(); 3717 exit(1); 3718 } else { 3719 Schedule(tn, e); 3720 } 3722 } else if (TypeOfEvent(e) == EVENT_REPORT) { 3723 SendIt = FALSE; 3724 if (*initial == TRUE) { 3725 t = rtcp_interval(members, 3726 senders, 3727 rtcp_bw * COMPENSATION, 3728 we_sent, 3729 *avg_rtcp_size, 3730 *initial); 3732 tn = *tp + t; 3734 if (tn <= tc) { 3735 SendIt = TRUE; 3736 } 3737 } 3739 if (SendIt == TRUE) { 3740 SendRTCPReport(); 3741 *pmembers = members; 3742 *avg_rtcp_size = (1./16.)*PacketSize(e) + 3743 (15./16.)*(*avg_rtcp_size); 3744 *tp = tc; 3745 } else { 3746 Schedule(tn, e); 3747 *pmembers = members; 3748 } 3749 } 3750 } 3751 void OnReceive(packet p, 3752 event e, 3753 int *members, 3754 int *pmembers, 3755 int *senders, 3756 double *avg_rtcp_size, 3757 double *tp, 3758 double tc) { 3760 double tn; /* Next packet transmission time */ 3762 /* What we do depends on whether we have left the group, and 3763 * are waiting to send a BYE (TypeOfEvent(e) == EVENT_BYE) or 3764 * an RTCP report. p represents the packet that was just received.
*/ 3766 if (PacketType(p) == PACKET_RTCP_REPORT) { 3767 if (NewMember(p) && (TypeOfEvent(e) == EVENT_REPORT)) 3768 *members += 1; 3769 *avg_rtcp_size = (1./16.)*PacketSize(p) + 3770 (15./16.)*(*avg_rtcp_size); 3771 } else if (PacketType(p) == PACKET_RTP) { 3772 if (NewSender(p) && (TypeOfEvent(e) == EVENT_REPORT)) 3773 *senders += 1; 3774 if (NewMember(p) && (TypeOfEvent(e) == EVENT_REPORT)) 3775 *members += 1; 3776 } else if (PacketType(p) == PACKET_BYE) { 3777 *avg_rtcp_size = (1./16.)*PacketSize(p) + 3778 (15./16.)*(*avg_rtcp_size); 3780 if (TypeOfEvent(e) == EVENT_REPORT) { 3781 if (NewSender(p) == FALSE) *senders -= 1; 3782 if (NewMember(p) == FALSE) *members -= 1; 3784 tn = tc + (((double) *members)/(*pmembers))*(tn - tc); 3785 *tp = tc - (((double) *members)/(*pmembers))*(tc - *tp); 3787 /* Reschedule the next report for time tn */ 3789 ReSchedule(tn, e); 3790 *pmembers = *members; 3792 } else if (TypeOfEvent(e) == EVENT_BYE) { 3793 *members += 1; 3794 } 3795 } 3796 } 3798 A.8 Estimating the Interarrival Jitter 3800 The code fragments below implement the algorithm given in Section 3801 6.4.1 for calculating an estimate of the statistical variance of the 3802 RTP data interarrival time to be inserted in the interarrival jitter 3803 field of reception reports. The inputs are r->ts, the timestamp from 3804 the incoming packet, and arrival, the current time in the same 3805 units. Here s points to state for the source; s->transit holds the 3806 relative transit time for the previous packet, and s->jitter holds 3807 the estimated jitter. The jitter field of the reception report is 3808 measured in timestamp units and expressed as an unsigned integer, but 3809 the jitter estimate is kept as a floating point value. As each data packet 3810 arrives, the jitter estimate is updated: 3812 int transit = arrival - r->ts; 3813 int d = transit - s->transit; 3814 s->transit = transit; 3815 if (d < 0) d = -d; 3816 s->jitter += (1./16.)
* ((double)d - s->jitter); 3818 When a reception report block (to which rr points) is generated for 3819 this member, the current jitter estimate is returned: 3821 rr->jitter = (u_int32) s->jitter; 3823 Alternatively, the jitter estimate can be kept as an integer, but 3824 scaled to reduce round-off error. The calculation is the same except 3825 for the last line: 3827 s->jitter += d - ((s->jitter + 8) >> 4); 3829 In this case, the estimate is sampled for the reception report as: 3831 rr->jitter = s->jitter >> 4; 3833 B Changes from RFC 1889 3834 Most of this RFC is identical to RFC 1889. The changes are listed 3835 below. 3837 o The algorithm for calculating the RTCP transmission interval 3838 specified in Sections 6.2 and 6.3 and illustrated in Appendix 3839 A.7 is augmented to include "reconsideration" to minimize 3840 transmission over the intended rate when many participants join 3841 a session simultaneously, and "reverse reconsideration" to 3842 reduce the incidence and duration of false participant timeouts 3843 when the number of participants drops rapidly. 3845 o Section 6.3.7 specifies new rules controlling when an RTCP BYE 3846 packet should be sent in order to avoid a flood of packets when 3847 many participants leave a session simultaneously. Sections 7.2 3848 and 7.3 specify that translators and mixers should send BYE 3849 packets for the sources they are no longer forwarding. 3851 o Section 6.2.1 specifies that implementations may store only a 3852 sampling of the participants' SSRC identifiers to allow scaling 3853 to very large sessions. Algorithms are specified in a separate 3854 RFC. 3856 o Section 6.2 specifies that the RTCP sender and receiver 3857 bandwidths may be set as separate parameters of the session 3858 rather than a strict percentage of the session bandwidth, and 3859 may be set to zero. The requirement that RTCP was mandatory for 3860 RTP sessions using IP multicast was relaxed.
3862 o Also in Section 6.2 it is specified that the minimum RTCP 3863 interval may be scaled to smaller values for high bandwidth 3864 sessions, and may be set to zero for unicast sessions. 3866 o Rule changes for layered encodings are defined in Sections 3867 2.4, 6.3.9, 8.3 and 10. 3869 o An indentation bug in the RFC 1889 printing of the pseudo-code 3870 for the collision detection and resolution algorithm in Section 3871 8.2 is corrected, and the algorithm has been modified to remove 3872 the restriction that both RTP and RTCP must be sent from the 3873 same source port number. 3875 o For unicast RTP sessions, distinct port pairs may be used for 3876 the two ends (Sections 3 and 7.1). 3878 o The description of the padding mechanism for RTCP packets was 3879 clarified and it is specified that padding MUST only be applied to 3880 the last packet of a compound RTCP packet. 3882 o It is specified that a receiver MUST ignore packets with 3883 payload types it does not understand. 3885 o The specification of "relative" NTP timestamp in the RTCP SR 3886 section now defines these timestamps to be based on the most 3887 common system-specific clock, such as system uptime, rather 3888 than on session elapsed time which would not be the same for 3889 multiple applications started on the same machine at different 3890 times. 3892 o The inconsequence of NTP timestamps wrapping around in the 3893 year 2036 is explained. 3895 o The policy for registration of RTCP packet types and SDES 3896 types was clarified in a new Section 11.3, IANA Considerations. 3897 The suggestion that experimenters register the numbers they 3898 need and then unregister those which prove to be unneeded has 3899 been removed in favor of using APP and PRIV. 3901 o The reference for the UTF-8 character set was changed to be 3902 RFC 2279.
o  The last paragraph of the introduction in RFC 1889, which
   cautioned implementers to limit deployment in the Internet, was
   removed because it was deemed no longer relevant.

o  Small clarifications of the text have been made in several
   places, some in response to questions from readers. In
   particular:

   -  A definition for "RTP media type" is given in Section 3 to
      allow the explanation of multiplexing RTP sessions in Section
      5.2 to be more clear regarding the multiplexing of multiple
      media.

   -  The description of the session bandwidth parameter is expanded
      in Section 6.2.

   -  The method for terminating and padding a sequence of SDES
      items is clarified in Section 6.5.

   -  The Security section adds a formal reference to IPSEC now that
      it is available, and says that the confidentiality method
      defined in this specification is primarily to codify existing
      practice.

   -  The terms MUST, SHOULD, MAY, etc. are used as defined in RFC
      2119.

C Security Considerations

RTP suffers from the same security liabilities as the underlying
protocols. For example, an impostor can fake source or destination
network addresses, or change the header or payload. Within RTCP, the
CNAME and NAME information may be used to impersonate another
participant. In addition, RTP may be sent via IP multicast, which
provides no direct means for a sender to know all the receivers of
the data sent and therefore no measure of privacy. Rightly or not,
users may be more sensitive to privacy concerns with audio and video
communication than they have been with more traditional forms of
network communication [26]. Therefore, the use of security mechanisms
with RTP is important. These mechanisms are discussed in Section 9.

RTP-level translators or mixers may be used to allow RTP traffic to
reach hosts behind firewalls.
Appropriate firewall security
principles and practices, which are beyond the scope of this
document, should be followed in the design and installation of these
devices and in the admission of RTP applications for use behind the
firewall.

D Addresses of Authors

Henning Schulzrinne
Dept. of Computer Science
Columbia University
1214 Amsterdam Avenue
New York, NY 10027
USA
electronic mail: schulzrinne@cs.columbia.edu

Stephen L. Casner
Cisco Systems, Inc.
170 West Tasman Drive
San Jose, CA 95134
United States
electronic mail: casner@cisco.com

Ron Frederick
Xerox Palo Alto Research Center
3333 Coyote Hill Road
Palo Alto, CA 94304
United States
electronic mail: frederic@parc.xerox.com

Van Jacobson
Cisco Systems, Inc.
170 West Tasman Drive
San Jose, CA 95134
United States
electronic mail: van@cisco.com

Acknowledgments

This memorandum is based on discussions within the IETF Audio/Video
Transport working group chaired by Stephen Casner. The current
protocol has its origins in the Network Voice Protocol and the Packet
Video Protocol (Danny Cohen and Randy Cole) and the protocol
implemented by the vat application (Van Jacobson and Steve McCanne).
Christian Huitema provided ideas for the random identifier generator.
Extensive analysis and simulation of the timer reconsideration
algorithm was done by Jonathan Rosenberg.

E Bibliography

[1] D. D. Clark and D. L. Tennenhouse, "Architectural considerations
for a new generation of protocols," in SIGCOMM Symposium on
Communications Architectures and Protocols, (Philadelphia,
Pennsylvania), pp. 200-208, IEEE, Sept. 1990. Computer
Communications Review, Vol. 20(4), Sept. 1990.

[2] H.
Schulzrinne, "Issues in designing a transport protocol for
audio and video conferences and other multiparticipant real-time
applications," expired Internet draft, Oct. 1993.

[3] S. Bradner, "Key words for use in RFCs to Indicate Requirement
Levels," RFC 2119, Internet Engineering Task Force, Mar. 1997.

[4] D. E. Comer, Internetworking with TCP/IP, vol. 1. Englewood
Cliffs, New Jersey: Prentice Hall, 1991.

[5] J. Postel, "Internet protocol," RFC 791, Internet Engineering
Task Force, Sept. 1981.

[6] D. Mills, "Network time protocol (v3)," RFC 1305, Internet
Engineering Task Force, Apr. 1992.

[7] J. Reynolds and J. Postel, "Assigned numbers," STD 2, RFC 1700,
Internet Engineering Task Force, Oct. 1994.

[8] D. Eastlake, S. Crocker, and J. Schiller, "Randomness
recommendations for security," RFC 1750, Internet Engineering Task
Force, Dec. 1994.

[9] J.-C. Bolot, T. Turletti, and I. Wakeman, "Scalable feedback
control for multicast video distribution in the internet," in SIGCOMM
Symposium on Communications Architectures and Protocols, (London,
England), pp. 58-67, ACM, Aug. 1994.

[10] I. Busse, B. Deffner, and H. Schulzrinne, "Dynamic QoS control
of multimedia applications based on RTP," Computer Communications,
Jan. 1996.

[11] S. Floyd and V. Jacobson, "The synchronization of periodic
routing messages," in SIGCOMM Symposium on Communications
Architectures and Protocols (D. P. Sidhu, ed.), (San Francisco,
California), pp. 33-44, ACM, Sept. 1993. Also in [27].

[12] J. Rosenberg and H. Schulzrinne, "Sampling of the Group
Membership in RTP," Internet Draft, Internet Engineering Task Force,
November 1998. Work in progress.

[13] J. A. Cadzow, Foundations of digital signal processing and data
analysis. New York, New York: Macmillan, 1987.

[14] F.
Yergeau, "UTF-8, a transformation format of ISO 10646," RFC
2279, Internet Engineering Task Force, Jan. 1998.

[15] P. Mockapetris, "Domain names - concepts and facilities," STD
13, RFC 1034, Internet Engineering Task Force, Nov. 1987.

[16] P. Mockapetris, "Domain names - implementation and
specification," STD 13, RFC 1035, Internet Engineering Task Force,
Nov. 1987.

[17] R. Braden, "Requirements for internet hosts - application and
support," STD 3, RFC 1123, Internet Engineering Task Force, Oct.
1989.

[18] Y. Rekhter, R. Moskowitz, D. Karrenberg, and G. de Groot,
"Address allocation for private internets," RFC 1597, Internet
Engineering Task Force, Mar. 1994.

[19] E. Lear, E. Fair, D. Crocker, and T. Kessler, "Network 10
considered harmful (some practices shouldn't be codified)," RFC
1627, Internet Engineering Task Force, July 1994.

[20] D. Crocker, "Standard for the format of ARPA internet text
messages," STD 11, RFC 822, Internet Engineering Task Force, Aug.
1982.

[21] W. Feller, An Introduction to Probability Theory and its
Applications, Volume 1, vol. 1. New York, New York: John Wiley and
Sons, third ed., 1968.

[22] S. Kent and R. Atkinson, "Security Architecture for the Internet
Protocol," Internet Draft, Internet Engineering Task Force, July
1998. Work in progress.

[23] D. Balenson, "Privacy enhancement for internet electronic mail:
Part III: algorithms, modes, and identifiers," RFC 1423, Internet
Engineering Task Force, Feb. 1993.

[24] V. L. Voydock and S. T. Kent, "Security mechanisms in high-level
network protocols," ACM Computing Surveys, vol. 15, pp. 135-171,
June 1983.

[25] R. Rivest, "The MD5 message-digest algorithm," RFC 1321,
Internet Engineering Task Force, Apr. 1992.

[26] S.
Stubblebine, "Security services for multimedia conferencing,"
in 16th National Computer Security Conference, (Baltimore,
Maryland), pp. 391-395, Sept. 1993.

[27] S. Floyd and V. Jacobson, "The synchronization of periodic
routing messages," IEEE/ACM Transactions on Networking, vol. 2, pp.
122-136, Apr. 1994.

Table of Contents

1        Introduction
1.1      Terminology
2        RTP Use Scenarios
2.1      Simple Multicast Audio Conference
2.2      Audio and Video Conference
2.3      Mixers and Translators
2.4      Layered Encodings
3        Definitions
4        Byte Order, Alignment, and Time Format
5        RTP Data Transfer Protocol
5.1      RTP Fixed Header Fields
5.2      Multiplexing RTP Sessions
5.3      Profile-Specific Modifications to the RTP Header
5.3.1    RTP Header Extension
6        RTP Control Protocol -- RTCP
6.1      RTCP Packet Format
6.2      RTCP Transmission Interval
6.2.1    Maintaining the number of session members
6.3      RTCP Packet Send and Receive Rules
6.3.1    Computing the RTCP transmission interval
6.3.2    Initialization
6.3.3    Receiving an RTP or non-BYE RTCP packet
6.3.4    Receiving an RTCP BYE packet
6.3.5    Timing Out an SSRC
6.3.6    Expiration of transmission timer
6.3.7    Transmitting a BYE packet
6.3.8    Updating we_sent
6.3.9    Allocation of source description bandwidth
6.4      Sender and Receiver Reports
6.4.1    SR: Sender report RTCP packet
6.4.2    RR: Receiver report RTCP packet
6.4.3    Extending the sender and receiver reports
6.4.4    Analyzing sender and receiver reports
6.5      SDES: Source description RTCP packet
6.5.1    CNAME: Canonical end-point identifier SDES item
6.5.2    NAME: User name SDES item
6.5.3    EMAIL: Electronic mail address SDES item
6.5.4    PHONE: Phone number SDES item
6.5.5    LOC: Geographic user location SDES item
6.5.6    TOOL: Application or tool name SDES item
6.5.7    NOTE: Notice/status SDES item
6.5.8    PRIV: Private extensions SDES item
6.6      BYE: Goodbye RTCP packet
6.7      APP: Application-defined RTCP packet
7        RTP Translators and Mixers
7.1      General Description
7.2      RTCP Processing in Translators
7.3      RTCP Processing in Mixers
7.4      Cascaded Mixers
8        SSRC Identifier Allocation and Use
8.1      Probability of Collision
8.2      Collision Resolution and Loop Detection
8.3      Use with Layered Encodings
9        Security
9.1      Confidentiality
9.2      Authentication and Message Integrity
10       RTP over Network and Transport Protocols
11       Summary of Protocol Constants
11.1     RTCP packet types
11.2     SDES types
11.3     IANA Considerations
A        Algorithms
A.1      RTP Data Header Validity Checks
A.2      RTCP Header Validity Checks
A.3      Determining the Number of RTP Packets Expected and Lost
A.4      Generating SDES RTCP Packets
A.5      Parsing RTCP SDES Packets
A.6      Generating a Random 32-bit Identifier
A.7      Computing the RTCP Transmission Interval
A.8      Estimating the Interarrival Jitter
B        Changes from RFC 1889
C        Security Considerations
D        Addresses of Authors
E        Bibliography