idnits 2.17.1 draft-ietf-mmusic-sdp-new-01.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- ** Looks like you're using RFC 2026 boilerplate. This must be updated to follow RFC 3978/3979, as updated by RFC 4748. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- ** The document seems to lack a 1id_guidelines paragraph about 6 months document validity -- however, there's a paragraph with a matching beginning. Boilerplate error? ** The document is more than 15 pages and seems to lack a Table of Contents. == No 'Intended status' indicated for this document; assuming Proposed Standard Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** The document seems to lack an IANA Considerations section. (See Section 2.2 of https://www.ietf.org/id-info/checklist for how to handle the case when there are no actions for IANA.) ** The document seems to lack an Authors' Addresses Section. ** There are 4 instances of too long lines in the document, the longest one being 13 characters in excess of 72. == There are 2 instances of lines with non-RFC6890-compliant IPv4 addresses in the document. If these are example addresses, they should be changed. == There are 12 instances of lines with multicast IPv4 addresses in the document. If these are generic example addresses, they should be changed to use the 233.252.0.x range defined in RFC 5771 ** The document seems to lack a both a reference to RFC 2119 and the recommended RFC 2119 boilerplate, even if it appears to use RFC 2119 keywords. RFC 2119 keyword, line 462: '...domain name is the form that SHOULD be...' RFC 2119 keyword, line 464: '...local IP address MUST NOT be used in a...' 
RFC 2119 keyword, line 863: '...yte strings, and MAY use any byte valu...' RFC 2119 keyword, line 1153: '... field. The charset specified MUST be...' RFC 2119 keyword, line 1155: '...a US-ASCII string and MUST be compared...' (7 more instances...) Miscellaneous warnings: ---------------------------------------------------------------------------- == Line 537 has weird spacing: '...7 or p=+...' == Line 761 has weird spacing: '...it must be po...' == Using lowercase 'not' together with uppercase 'MUST', 'SHALL', 'SHOULD', or 'RECOMMENDED' is not an accepted usage according to RFC 2119. Please use uppercase 'NOT' together with RFC 2119 keywords (if that is what you mean). Found 'SHOULD not' in this paragraph: On receiving a session description over an unauthenticated transport mechanism or from an untrusted party, software parsing the session should take a few precautions. Session description contain information required to start software on the receivers system. Software that parses a session description MUST not be able to start other software except that which is specifically configured as appropriate software to participate in multimedia sessions. It is normally considered INAPPROPRIATE for software parsing a session description to start, on a user's system, software that is appropriate to participate in multimedia sessions, without the user first being informed that such software will be started and giving their consent. Thus a session description arriving by session announcement, email, sessioR multimedia,session page SHOULD not deliver the user into an interactive without the user being aware that this will happen. As it is not always simple to tell whether a session is interactive or not, applications that are unsure should assume sessions are interactive. -- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. 
If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. If not, you may need to add the pre-RFC5378 disclaimer. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- The document date (2 March 2001) is 8449 days in the past. Is this intentional? Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) -- Missing reference section? '4' on line 275 looks like a reference -- Missing reference section? '11' on line 74 looks like a reference -- Missing reference section? '12' on line 75 looks like a reference -- Missing reference section? '1' on line 689 looks like a reference -- Missing reference section? '3' on line 1054 looks like a reference -- Missing reference section? '2' on line 901 looks like a reference -- Missing reference section? '10' on line 1133 looks like a reference Summary: 7 errors (**), 0 flaws (~~), 6 warnings (==), 9 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 1 Internet Engineering Task Force MMUSIC WG 2 INTERNET-DRAFT Mark Handley/ACIRI 3 draft-ietf-mmusic-sdp-new-01.txt Van Jacobson/Packet Design 4 Colin Perkins/ISI 5 2 March 2001 6 Expires: September 2001 8 SDP: Session Description Protocol 10 Status of this Memo 12 This document is an Internet-Draft and is in full conformance with all 13 provisions of Section 10 of RFC2026. 15 Internet-Drafts are working documents of the Internet Engineering Task 16 Force (IETF), its areas, and its working groups. Note that other groups 17 may also distribute working documents as Internet-Drafts. 
19 Internet-Drafts are draft documents valid for a maximum of six months 20 and may be updated, replaced, or obsoleted by other documents at any 21 time. It is inappropriate to use Internet- Drafts as reference material 22 or to cite them other than as "work in progress." 24 The list of current Internet-Drafts can be accessed at 25 http://www.ietf.org/ietf/1id-abstracts.txt 27 The list of Internet-Draft Shadow Directories can be accessed at 28 http://www.ietf.org/shadow.html. 30 Abstract 32 This document defines the Session Description Protocol, SDP. 33 SDP is intended for describing multimedia sessions for the 34 purposes of session announcement, session invitation, and 35 other forms of multimedia session initiation. 37 This document is a product of the Multiparty Multimedia Session Control 38 (MMUSIC) working group of the Internet Engineering Task Force. Comments 39 are solicited and should be addressed to the working group's mailing 40 list at confctrl@isi.edu and/or the authors. 42 1. Introduction 44 Note: This draft is essentially identical to RFC 2327. It is made 45 available to stimulate discussion of corrections and clarifications 46 which need to be made in order to advance SDP to a draft standard RFC. 48 On the Internet multicast backbone (Mbone), a session directory tool is 49 used to advertise multimedia conferences and communicate the conference 50 addresses and conference tool-specific information necessary for 51 participation. This document defines a session description protocol for 52 this purpose, and for general real-time multimedia session description 53 purposes. This draft does not describe multicast address allocation or 54 the distribution of SDP messages in detail. These are described in 55 accompanying drafts. SDP is not intended for negotiation of media 56 encodings. 58 2. Background 60 The Mbone is the part of the internet that supports IP multicast, and 61 thus permits efficient many-to-many communication. 
It is used 62 extensively for multimedia conferencing. Such conferences usually have 63 the property that tight coordination of conference membership is not 64 necessary; to receive a conference, a user at an Mbone site only has to 65 know the conference's multicast group address and the UDP ports for the 66 conference data streams. 68 Session directories assist the advertisement of conference sessions and 69 communicate the relevant conference setup information to prospective 70 participants. SDP is designed to convey such information to recipients. 71 SDP is purely a format for session description - it does not incorporate 72 a transport protocol, and is intended to use different transport 73 protocols as appropriate including the Session Announcement Protocol 74 [4], Session Initiation Protocol [11], Real-Time Streaming Protocol 75 [12], electronic mail using the MIME extensions, and the Hypertext 76 Transport Protocol. 78 SDP is intended to be general purpose so that it can be used for a wider 79 range of network environments and applications than just multicast 80 session directories. However, it is not intended to support negotiation 81 of session content or media encodings - this is viewed as outside the 82 scope of session description. 84 3. Glossary of Terms 86 The following terms are used in this document, and have specific meaning 87 within the context of this document. 89 Conference 90 A multimedia conference is a set of two or more communicating users 91 along with the software they are using to communicate. 93 Session 94 A multimedia session is a set of multimedia senders and receivers 95 and the data streams flowing from senders to receivers. A 96 multimedia conference is an example of a multimedia session. 98 Session Advertisement 99 See session announcement. 
101 Session Announcement 102 A session announcement is a mechanism by which a session description 103 is conveyed to users in a pro-active fashion, i.e., the session 104 description was not explicitly requested by the user. 106 Session Description 107 A well defined format for conveying sufficient information to 108 discover and participate in a multimedia session. 110 3.1. Terminology 112 This document uses the same words as RFC 1123 for defining the 113 significance of each particular requirement. These words are: 115 must: 116 This word or the adjective ``required'' means that the item is an 117 absolute requirement of the specification. 119 should: 120 This word or the adjective ``recommended'' means that there may 121 exist valid reasons in particular circumstances to ignore this 122 item, but the full implications should be understood and the case 123 carefully weighed before choosing a different course. 125 may: 126 This word or the adjective ``optional'' means that this item is 127 truly optional. One implementation may choose to include the item 128 because a particular application requires it or because it enhances 129 the product, for example, another implementation may omit the same 130 item. 132 An implementation is not compliant if it fails to satisfy one or more of 133 the must requirements for the protocols it implements. An 134 implementation that satisfies all the must and all the should 135 requirements for the protocols it implements is said to be 136 ``unconditionally compliant''; one that satisfies all the must 137 requirements but not all of the should requirements for the protocols it 138 implements is said to be ``conditionally compliant''. 140 4. SDP Usage 142 4.1. Multicast Announcements 144 SDP is a session description protocol for multimedia sessions. 
A common
145 mode of usage is for a client to announce a conference session by
146 periodically multicasting an announcement packet to a well known
147 multicast address and port using the Session Announcement Protocol
148 (SAP).

150 SAP packets are UDP packets with the following format:

152     0                   31
153     |--------------------|
154     | SAP header         |
155     |--------------------|
156     | text payload       |
157     |/\/\/\/\/\/\/\/\/\/\|

159 The header is the Session Announcement Protocol header. SAP is
160 described in more detail in a companion draft [4].

162 The text payload is an SDP session description, as described in this
163 draft. The text payload should be no greater than 1 Kbyte in length.
164 If announced by SAP, only one session announcement is permitted in a
165 single packet.

167 4.2. Email and WWW Announcements

169 Alternative means of conveying session descriptions include electronic
170 mail and the World Wide Web. For both email and WWW distribution, the
171 MIME content type ``application/sdp'' should be used. This
172 enables the automatic launching of applications for participation in the
173 session from the WWW client or mail reader in a standard manner.

175 Note that announcements of multicast sessions made only via email or the
176 World Wide Web (WWW) do not have the property that the receiver of a
177 session announcement can necessarily receive the session because the
178 multicast sessions may be restricted in scope, and access to the WWW
179 server or reception of email is possible outside this scope. SAP
180 announcements do not suffer from this mismatch.

182 5. Requirements and Recommendations

184 The purpose of SDP is to convey information about media streams in
185 multimedia sessions to allow the recipients of a session description to
186 participate in the session. SDP is primarily intended for use in an
187 internetwork, although it is sufficiently general that it can describe
188 conferences in other network environments.
190 A multimedia session, for these purposes, is defined as a set of media 191 streams that exist for some duration of time. Media streams can be 192 many-to-many. The times during which the session is active need not be 193 continuous. 195 Thus far, multicast based sessions on the Internet have differed from 196 many other forms of conferencing in that anyone receiving the traffic 197 can join the session (unless the session traffic is encrypted). In such 198 an environment, SDP serves two primary purposes. It is a means to 199 communicate the existence of a session, and is a means to convey 200 sufficient information to enable joining and participating in the 201 session. In a unicast environment, only the latter purpose is likely to 202 be relevant. 204 Thus SDP includes: 206 o Session name and purpose 208 o Time(s) the session is active 210 o The media comprising the session 212 o Information to receive those media (addresses, ports, formats and so 213 on) 215 As resources necessary to participate in a session may be limited, some 216 additional information may also be desirable: 218 o Information about the bandwidth to be used by the conference 219 o Contact information for the person responsible for the session 221 In general, SDP must convey sufficient information to be able to join a 222 session (with the possible exception of encryption keys) and to announce 223 the resources to be used to non-participants that may need to know. 225 5.1. Media Information 227 SDP includes: 229 o The type of media (video, audio, etc) 231 o The transport protocol (RTP/UDP/IP, H.320, etc) 233 o The format of the media (H.261 video, MPEG video, etc) 235 For an IP multicast session, the following are also conveyed: 237 o Multicast address for media 239 o Transport Port for media 241 This address and port are the destination address and destination port 242 of the multicast stream, whether being sent, received, or both. 
244 For an IP unicast session, the following are conveyed:

246 o Remote address for media

248 o Transport port for contact address

250 The semantics of this address and port depend on the media and transport
251 protocol defined. By default, this is the remote address and remote
252 port to which data is sent, and the remote address and local port on
253 which to receive data. However, some media may use these to
254 establish a control channel for the actual media flow.

256 5.2. Timing Information

258 Sessions may either be bounded or unbounded in time. Whether or not
259 they are bounded, they may be only active at specific times.

261 SDP can convey:

263 o An arbitrary list of start and stop times bounding the session
264 o For each bound, repeat times such as "every Wednesday at 10am for
265   one hour"

267 This timing information is globally consistent, irrespective of local
268 time zone or daylight saving time.

270 5.3. Private Sessions

272 It is possible to create both public sessions and private sessions.
273 Private sessions will typically be conveyed by encrypting the session
274 description to distribute it. The details of how encryption is
275 performed are dependent on the mechanism used to convey SDP - see [4]
276 for how this is done for session announcements.

278 If a session announcement is private it is possible to use that private
279 announcement to convey encryption keys necessary to decode each of the
280 media in a conference, including enough information to know which
281 encryption scheme is used for each media.

283 5.4. Obtaining Further Information about a Session

285 A session description should convey enough information to decide whether
286 or not to participate in a session. SDP may include additional pointers
287 in the form of Universal Resource Identifiers (URIs) for more
288 information about the session.

290 5.5. Categorisation

292 When many session descriptions are being distributed by SAP or any other
293 advertisement mechanism, it may be desirable to filter announcements
294 that are of interest from those that are not. SDP supports a
295 categorisation mechanism for sessions that is capable of being
296 automated.

298 5.6. Internationalization

300 The SDP specification recommends the use of the ISO 10646 character sets
301 in the UTF-8 encoding (RFC 2044) to allow many different languages to be
302 represented. However, to assist in compact representations, SDP also
303 allows other character sets such as ISO 8859-1 to be used when desired.
304 Internationalization only applies to free-text fields (session name and
305 background information), and not to SDP as a whole.

307 6. SDP Specification

309 SDP session descriptions are entirely textual using the ISO 10646
310 character set in UTF-8 encoding. SDP field names and attribute names
311 use only the US-ASCII subset of UTF-8, but textual fields and attribute
312 values may use the full ISO 10646 character set. The textual form, as
313 opposed to a binary encoding such as ASN.1 or XDR, was chosen to enhance
314 portability, to enable a variety of transports to be used (e.g., session
315 description in a MIME email message) and to allow flexible, text-based
316 toolkits (e.g., Tcl/Tk) to be used to generate and to process session
317 descriptions. However, since the total bandwidth allocated to all SAP
318 announcements is strictly limited, the encoding is deliberately compact.
319 Also, since announcements may be transported via very unreliable means
320 (e.g., email) or damaged by an intermediate caching server, the encoding
321 was designed with strict order and formatting rules so that most errors
322 would result in malformed announcements which could be detected easily
323 and discarded. This also allows rapid discarding of encrypted
324 announcements for which a receiver does not have the correct key.
326 An SDP session description consists of a number of lines of text of the
327 form
328     <type>=<value>
329 <type> is always exactly one character and is case-significant. <value>
330 is a structured text string whose format depends on <type>. It also
331 will be case-significant unless a specific field defines otherwise.
332 Whitespace is not permitted either side of the `=' sign. In general
333 <value> is either a number of fields delimited by a single space
334 character or a free format string.

336 A session description consists of a session-level description (details
337 that apply to the whole session and all media streams) and optionally
338 several media-level descriptions (details that apply only to a single
339 media stream).

341 An announcement consists of a session-level section followed by zero or
342 more media-level sections. The session-level part starts with a `v='
343 line and continues to the first media-level section. The media
344 description starts with an `m=' line and continues to the next media
345 description or end of the whole session description. In general,
346 session-level values are the default for all media unless overridden by
347 an equivalent media-level value.

349 When SDP is conveyed by SAP, only one session description is allowed per
350 packet. When SDP is conveyed by other means, many SDP session
351 descriptions may be concatenated together (the `v=' line indicating the
352 start of a session description terminates the previous description).
353 Some lines in each description are required and some are optional but
354 all must appear in exactly the order given here (the fixed order greatly
355 enhances error detection and allows for a simple parser). Optional
356 items are marked with a `*'.

358 Session description
359     v= (protocol version)
360     o= (owner/creator and session identifier)
361     s= (session name)
362     i=* (session information)
363     u=* (URI of description)
364     e=* (email address)
365     p=* (phone number)
366     c=* (connection information - not required if included in all media)
367     b=* (bandwidth information)
368     One or more time descriptions (see below)
369     z=* (time zone adjustments)
370     k=* (encryption key)
371     a=* (zero or more session attribute lines)
372     Zero or more media descriptions (see below)

374 Time description
375     t= (time the session is active)
376     r=* (zero or more repeat times)

378 Media description
379     m= (media name and transport address)
380     i=* (media title)
381     c=* (connection information - optional if included at session-level)
382     b=* (bandwidth information)
383     k=* (encryption key)
384     a=* (zero or more media attribute lines)

386 The set of `type' letters is deliberately small and not intended to be
387 extensible -- SDP parsers must completely ignore any announcement that
388 contains a `type' letter that they do not understand. The `attribute'
389 mechanism ("a=" described below) is the primary means for extending SDP
390 and tailoring it to particular applications or media. Some attributes
391 (the ones listed in this document) have a defined meaning but others may
392 be added on an application-, media- or session-specific basis. A
393 session directory must ignore any attribute it doesn't understand.

395 The connection (`c=') and attribute (`a=') information in the session-
396 level section applies to all the media of that session unless overridden
397 by connection information or an attribute of the same name in the media
398 description. For instance, in the example below, each media behaves as
399 if it were given a `recvonly' attribute.
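
As an informal illustration of the layout described above (it is not
part of the specification), the following Python sketch splits a single
description into its session-level section and its media-level sections,
and then resolves attributes so that session-level values act as
defaults. The names split_description, effective_attributes and SAMPLE
are assumptions made for this sketch. It does not check field order,
does not reject unknown `type' letters, and does not handle several
concatenated descriptions.

```python
def split_description(text):
    """Split one SDP description into (session_fields, media_sections).

    Each field is a (type, value) tuple. Every media section starts with
    its `m=' field. Both CRLF and bare LF line endings are accepted.
    """
    session, media = [], []
    current = session
    for line in text.replace("\r\n", "\n").split("\n"):
        if not line:
            continue
        kind, sep, value = line.partition("=")
        # <type> is exactly one character, with no whitespace around `='.
        if sep != "=" or len(kind) != 1 or not kind.isalpha() or value[:1].isspace():
            raise ValueError(f"malformed SDP line: {line!r}")
        if kind == "m":
            # An `m=' line closes the previous section and opens a new one.
            current = []
            media.append(current)
        current.append((kind, value))
    if not session or session[0][0] != "v":
        raise ValueError("a description must start with a v= line")
    return session, media


def attributes(fields):
    """Collect `a=' fields as a {name: value} mapping (flags map to '')."""
    result = {}
    for kind, value in fields:
        if kind == "a":
            name, _, argument = value.partition(":")
            result[name] = argument
    return result


def effective_attributes(session, section):
    """Session-level attributes, overridden by same-named media attributes."""
    merged = attributes(session)
    merged.update(attributes(section))
    return merged


# Shortened sample modelled on this document's own example values.
SAMPLE = "\r\n".join([
    "v=0",
    "o=mhandley 2890844526 2890842807 IN IP4 126.16.64.4",
    "s=SDP Seminar",
    "c=IN IP4 224.2.17.12/127",
    "t=2873397496 2873404696",
    "a=recvonly",
    "m=audio 49170 RTP/AVP 0",
    "m=application 32416 udp wb",
    "a=orient:portrait",
]) + "\r\n"

if __name__ == "__main__":
    session_fields, media_sections = split_description(SAMPLE)
    for section in media_sections:
        print(section[0][1], effective_attributes(session_fields, section))
```

Merging by attribute name mirrors the "attribute of the same name" rule
only; a fuller implementation would also treat mutually exclusive flags
as overriding one another, which this sketch does not attempt.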
401 An example SDP description is:

403     v=0
404     o=mhandley 2890844526 2890842807 IN IP4 126.16.64.4
405     s=SDP Seminar
406     i=A Seminar on the session description protocol
407     u=http://www.cs.ucl.ac.uk/staff/M.Handley/sdp.03.ps
408     e=mjh@isi.edu (Mark Handley)
409     c=IN IP4 224.2.17.12/127
410     t=2873397496 2873404696
411     a=recvonly
412     m=audio 49170 RTP/AVP 0
413     m=video 51372 RTP/AVP 31
414     m=application 32416 udp wb
415     a=orient:portrait

417 Text records such as the session name and information are byte strings
418 which may contain any byte with the exceptions of 0x00 (Nul), 0x0a
419 (ASCII newline) and 0x0d (ASCII carriage return). The sequence CRLF
420 (0x0d0a) is used to end a record, although parsers should be tolerant
421 and also accept records terminated with a single newline character. By
422 default these byte strings contain ISO-10646 characters in UTF-8
423 encoding, but this default may be changed using the `charset' attribute.

425 Protocol Version

427     v=0

429 The ``v='' field gives the version of the Session Description Protocol.
430 There is no minor version number.

432 Origin

434     o=<username> <session id> <version> <network type> <address type>
435     <address>

437 The ``o='' field gives the originator of the session (their username and
438 the address of the user's host) plus a session id and session version
439 number. <username> is the user's login on the originating host, or it
440 is ``-'' if the originating host does not support the concept of user
441 ids. <username> must not contain spaces. <session id> is a numeric
442 string such that the tuple of <username>, <session id>, <network type>,
443 <address type> and <address> form a globally unique identifier for the
444 session. The method of session id allocation is up to the creating
445 tool, but it has been suggested that a Network Time Protocol (NTP)
446 timestamp be used to ensure uniqueness [1]. <version> is a version
447 number for this announcement. It is needed for proxy announcements to
448 detect which of several announcements for the same session is the most
449 recent. Again its usage is up to the creating tool, so long as
450 <version> is increased when a modification is made to the session data.
451 Again, it is recommended (but not mandatory) that an NTP timestamp is
452 used. <network type> is a text string giving the type of network.
453 Initially ``IN'' is defined to have the meaning ``Internet''. <address
454 type> is a text string giving the type of the address that follows.
455 Initially ``IP4'' and ``IP6'' are defined. <address> is the globally
456 unique address of the machine from which the session was created. For
457 an address type of IP4, this is either the fully-qualified domain name
458 of the machine, or the dotted-decimal representation of the IP version 4
459 address of the machine. For an address type of IP6, this is either the
460 fully-qualified domain name of the machine, or the compressed textual
461 representation of the IP version 6 address of the machine. For both IP4
462 and IP6, the fully-qualified domain name is the form that SHOULD be
463 given unless this is unavailable, in which case the globally unique
464 address may be substituted. A local IP address MUST NOT be used in any
465 context where the SDP description might leave the scope in which the
466 address is meaningful.

468 In general, the ``o='' field serves as a globally unique identifier for
469 this version of this session description, and the subfields excepting
470 the version taken together identify the session irrespective of any
471 modifications.

473 Session Name

475     s=<session name>

477 The ``s='' field is the session name. There must be one and only one
478 ``s='' field per session description, and it must contain ISO 10646
479 characters (but see also the `charset' attribute below).

481 Session and Media Information

483     i=<session description>

485 The ``i='' field is information about the session. There may be at most
486 one session-level ``i='' field per session description, and at most one
487 ``i='' field per media. Although it may be omitted, this is discouraged
488 for session announcements, and user interfaces for composing sessions
489 should require text to be entered. If it is present it must contain ISO
490 10646 characters (but see also the `charset' attribute below).

492 A single ``i='' field can also be used for each media definition. In
493 media definitions, ``i='' fields are primarily intended for labeling
494 media streams.
As such, they are most likely to be useful when a single
495 session has more than one distinct media stream of the same media type.
496 An example would be two different whiteboards, one for slides and one
497 for feedback and questions.

499 URI

501     u=<URI>

503 o A URI is a Universal Resource Identifier as used by WWW clients

505 o The URI should be a pointer to additional information about the
506   conference

508 o This field is optional, but if it is present it should be specified
509   before the first media field

511 o No more than one URI field is allowed per session description

513 Email Address and Phone Number

515     e=<email address>
516     p=<phone number>

518 o These specify contact information for the person responsible for the
519   conference. This is not necessarily the same person that created
520   the conference announcement.

522 o Either an email field or a phone field must be specified.
523   Additional email and phone fields are allowed.

525 o If these are present, they should be specified before the first
526   media field.

528 o More than one email or phone field can be given for a session
529   description.

531 o Phone numbers should be given in the conventional international
532   format - preceded by a ``+'' and the international country code.
533   There must be a space or a hyphen (``-'') between the country code
534   and the rest of the phone number. Spaces and hyphens may be used to
535   split up a phone field to aid readability if desired. For example:

537     p=+44-171-380-7777 or p=+1 617 253 6011

539 o Both email addresses and phone numbers can have an optional free
540   text string associated with them, normally giving the name of the
541   person who may be contacted. This should be enclosed in parentheses
542   if it is present. For example:

544     e=mjh@isi.edu (Mark Handley)

546   The alternative RFC822 name quoting convention is also allowed for
547   both email addresses and phone numbers. For example,

549     e=Mark Handley <mjh@isi.edu>

551   The free text string should be in the ISO-10646 character set with
552   UTF-8 encoding, or alternatively in ISO-8859-1 or other encodings if
553   the appropriate charset session-level attribute is set.

555 Connection Data

557     c=<network type> <address type> <connection address>

559 The ``c='' field contains connection data.

561 A session announcement must contain one ``c='' field in each media
562 description (see below) or a ``c='' field at the session-level. It may
563 contain a session-level ``c='' field and one additional ``c='' field per
564 media description, in which case the per-media values override the
565 session-level settings for the relevant media.

567 The first sub-field is the network type, which is a text string giving
568 the type of network. Initially ``IN'' is defined to have the meaning
569 ``Internet''.

571 The second sub-field is the address type. This allows SDP to be used
572 for sessions that are not IP based. Currently only IP4 is defined.

574 The third sub-field is the connection address. Optional extra sub-
575 fields may be added after the connection address depending on the value
576 of the <address type> field.

578 For IP4 addresses, the connection address is defined as follows:

580 o Typically the connection address will be a class-D IP multicast
581   group address. If the conference is not multicast, then the
582   connection address contains the unicast IP address of the expected
583   data source or data relay or data sink as determined by additional
584   attribute fields. It is not expected that unicast addresses will be
585   given in a session description that is communicated by a multicast
586   announcement, though this is not prohibited.

588 o Conferences using an IP multicast connection address must also have
589   a time to live (TTL) value present in addition to the multicast
590   address. The TTL and the address together define the scope with
591   which multicast packets sent in this conference will be sent. TTL
592   values must be in the range 0-255.

594   The TTL for the session is appended to the address using a slash as
595   a separator. An example is:

597     c=IN IP4 224.2.1.1/127

599   Hierarchical or layered encoding schemes are data streams where the
600   encoding from a single media source is split into a number of
601   layers. The receiver can choose the desired quality (and hence
602   bandwidth) by only subscribing to a subset of these layers. Such
603   layered encodings are normally transmitted in multiple multicast
604   groups to allow multicast pruning. This technique keeps unwanted
605   traffic from sites only requiring certain levels of the hierarchy.
606   For applications requiring multiple multicast groups, we allow the
607   following notation to be used for the connection address:

609     <base multicast address>/<ttl>/<number of addresses>

611   If the number of addresses is not given it is assumed to be one.
612   Multicast addresses so assigned are contiguously allocated above the
613   base address, so that, for example:

615     c=IN IP4 224.2.1.1/127/3

617   would state that addresses 224.2.1.1, 224.2.1.2 and 224.2.1.3 are to
618   be used at a ttl of 127.
This is semantically identical to 619 including multiple ``c='' lines in a media description: 621 c=IN IP4 224.2.1.1/127 622 c=IN IP4 224.2.1.2/127 623 c=IN IP4 224.2.1.3/127 625 Multiple addresses or ``c='' lines can only be specified on a per- 626 media basis, and not for a session-level ``c='' field. 628 It is illegal for the slash notation described above to be used for 629 IP unicast addresses. 631 Bandwidth 632 b=<modifier>:<bandwidth-value> 634 o This specifies the proposed bandwidth to be used by the session or 635 media, and is optional. 637 o <bandwidth-value> is in kilobits per second by default. Modifiers 638 may specify that alternative units are to be used (the modifiers 639 defined in this memo use the default units). 641 o <modifier> is a single alphanumeric word giving the meaning of the 642 bandwidth figure. 644 o Two modifiers are initially defined: 646 CT Conference Total: An implicit maximum bandwidth is associated with 647 each TTL on the Mbone or within a particular multicast 648 administrative scope region (the Mbone bandwidth vs. TTL limits 649 are given in the MBone FAQ). If the bandwidth of a session or 650 media in a session is different from the bandwidth implicit from 651 the scope, a `b=CT:...' line should be supplied for the session 652 giving the proposed upper limit to the bandwidth used. The 653 primary purpose of this is to give an approximate idea as to 654 whether two or more conferences can co-exist simultaneously. 656 AS Application-Specific Maximum: The bandwidth is interpreted to be 657 application-specific, i.e., will be the application's concept of 658 maximum bandwidth. Normally this will coincide with what is set 659 on the application's ``maximum bandwidth'' control if applicable. 661 Note that CT gives a total bandwidth figure for all the media at all 662 sites. AS gives a bandwidth figure for a single media at a single 663 site, although there may be many sites sending simultaneously.
665 o Extension Mechanism: Tool writers can define experimental bandwidth 666 modifiers by prefixing their modifier with ``X-''. For example: 668 b=X-YZ:128 670 SDP parsers should ignore bandwidth fields with unknown modifiers. 671 Modifiers should be alpha-numeric and, although no length limit is 672 given, they are recommended to be short. 674 Times, Repeat Times and Time Zones 676 t=<start time> <stop time> 677 o ``t='' fields specify the start and stop times for a conference 678 session. Multiple ``t='' fields may be used if a session is active 679 at multiple irregularly spaced times; each additional ``t='' field 680 specifies an additional period of time for which the session will be 681 active. If the session is active at regular times, an ``r='' field 682 (see below) should be used in addition to and following a ``t='' 683 field - in which case the ``t='' field specifies the start and stop 684 times of the repeat sequence. 686 o The first and second sub-fields give the start and stop times for 687 the conference respectively. These values are the decimal 688 representation of Network Time Protocol (NTP) time values in seconds 689 [1]. To convert these values to UNIX time, subtract decimal 690 2208988800. 692 o If the stop-time is set to zero, then the session is not bounded, 693 though it will not become active until after the start-time. If the 694 start-time is also zero, the session is regarded as permanent. 696 User interfaces should strongly discourage the creation of unbounded 697 and permanent sessions as they give no information about when the 698 session is actually going to terminate, and so make scheduling 699 difficult. 701 The general assumption may be made, when displaying unbounded 702 sessions that have not timed out to the user, that an unbounded 703 session will only be active until half an hour from the current time 704 or the session start time, whichever is the later.
If behaviour 705 other than this is required, an end-time should be given and 706 modified as appropriate when new information becomes available about 707 when the session should really end. 709 Permanent sessions may be shown to the user as never being active 710 unless there are associated repeat times which state precisely when 711 the session will be active. In general, permanent sessions should 712 not be created for any session expected to have a duration of less 713 than 2 months, and should be discouraged for sessions expected to 714 have a duration of less than 6 months. 716 r=<repeat interval> <active duration> <list of offsets from start-time> 718 o ``r='' fields specify repeat times for a session. For example, if 719 a session is active at 10am on Monday and 11am on Tuesday for one 720 hour each week for three months, then the <start time> in the 721 corresponding ``t='' field would be the NTP representation of 10am 722 on the first Monday, the <repeat interval> would be 1 week, the 723 <active duration> would be 1 hour, and the offsets would be zero and 724 25 hours. The corresponding ``t='' field stop time would be the NTP 725 representation of the end of the last session three months later. By 726 default all fields are in seconds, so the ``r='' and ``t='' fields 727 might be: 729 t=3034423619 3042462419 730 r=604800 3600 0 90000 732 To make announcements more compact, times may also be given in units 733 of days, hours or minutes. The syntax for these is a number 734 immediately followed by a single case-sensitive character. 735 Fractional units are not allowed - a smaller unit should be used 736 instead.
The following unit specification characters are allowed: 738 d - days (86400 seconds) 739 h - hours (3600 seconds) 740 m - minutes (60 seconds) 741 s - seconds (allowed for completeness but not recommended) 743 Thus, the above announcement could also have been written: 745 r=7d 1h 0 25h 747 Monthly and yearly repeats cannot currently be directly specified 748 with a single SDP repeat time - instead separate "t" fields should 749 be used to explicitly list the session times. 751 z=<adjustment time> <offset> <adjustment time> <offset> .... 753 o To schedule a repeated session which spans a change from daylight- 754 saving time to standard time or vice-versa, it is necessary to 755 specify offsets from the base repeat times. This is required because 756 different time zones change time at different times of day, 757 different countries change to or from daylight time on different 758 dates, and some countries do not have daylight saving time at all. 760 Thus in order to schedule a session that is at the same time winter 761 and summer, it must be possible to specify unambiguously by whose 762 time zone a session is scheduled. To simplify this task for 763 receivers, we allow the sender to specify the NTP time that a time 764 zone adjustment happens and the offset from the time when the 765 session was first scheduled. The ``z'' field allows the sender to 766 specify a list of these adjustment times and offsets from the base 767 time. 769 An example might be: 771 z=2882844526 -1h 2898848070 0 773 This specifies that at time 2882844526 the time base by which the 774 session's repeat times are calculated is shifted back by 1 hour, and 775 that at time 2898848070 the session's original time base is 776 restored. Adjustments are always relative to the specified start 777 time - they are not cumulative. 779 o If a session is likely to last several years, it is expected that 780 the session announcement will be modified periodically rather than 781 transmit several years' worth of adjustments in one announcement.
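The time arithmetic described in this section can be sketched in a few lines of Python. This is an illustrative sketch only and not part of the specification: the names ntp_to_unix, typed_time_to_seconds, parse_repeat and occurrences are hypothetical helpers introduced here, and the sketch assumes a bounded session (a non-zero stop time) whose repeats run from the ``t='' start time up to the ``t='' stop time, as in the ``r='' example above.

```python
# Seconds between the NTP epoch (1900) and the UNIX epoch (1970),
# as given in the description of the t= field.
NTP_UNIX_OFFSET = 2208988800

# Unit characters allowed after a number; they are case-sensitive.
UNIT_SECONDS = {"d": 86400, "h": 3600, "m": 60, "s": 1}


def ntp_to_unix(ntp_seconds):
    """Convert a t= field value (NTP seconds) to UNIX time."""
    return ntp_seconds - NTP_UNIX_OFFSET


def typed_time_to_seconds(token):
    """Convert '7d', '25h', '90000' or '-1h' to a number of seconds.

    int() rejects fractional values and unknown unit characters,
    which matches the rule that fractional units are not allowed.
    """
    sign = 1
    if token.startswith("-"):
        sign, token = -1, token[1:]
    if token and token[-1] in UNIT_SECONDS:
        return sign * int(token[:-1]) * UNIT_SECONDS[token[-1]]
    return sign * int(token)


def parse_repeat(line):
    """Parse an r= line into (interval, duration, offsets), in seconds."""
    if not line.startswith("r="):
        raise ValueError("not an r= field")
    values = [typed_time_to_seconds(t) for t in line[2:].split()]
    if len(values) < 3:
        raise ValueError("r= needs an interval, a duration and an offset")
    return values[0], values[1], values[2:]


def occurrences(start, stop, interval, offsets):
    """Yield the NTP start time of every repeat beginning before stop.

    Assumption: stop is non-zero (unbounded sessions are not handled).
    """
    if interval <= 0:
        raise ValueError("repeat interval must be positive")
    base = start
    while base < stop:
        for offset in offsets:
            if base + offset < stop:
                yield base + offset
        base += interval
```

With the example fields above, parse_repeat("r=7d 1h 0 25h") and parse_repeat("r=604800 3600 0 90000") give the same result, and occurrences(3034423619, 3042462419, 604800, [0, 90000]) begins with the Monday start time followed by the Tuesday start time 25 hours later.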
783 Encryption Keys 785 k=<method> 786 k=<method>:<encryption key> 788 o The session description protocol may be used to convey encryption 789 keys. A key field is permitted before the first media entry (in 790 which case it applies to all media in the session), or for each 791 media entry as required. 793 o The format of keys and their usage is outside the scope of this 794 document, but see [3]. 796 o The method indicates the mechanism to be used to obtain a usable key 797 by external means, or from the encoded encryption key given. The 798 following methods are defined: 800 k=clear:<encryption key> 801 The encryption key (as described in [3] for RTP media streams 802 under the AV profile) is included untransformed in this key 803 field. 805 k=base64:<encoded encryption key> 806 The encryption key (as described in [3] for RTP media streams 807 under the AV profile) is included in this key field but has been 808 base64 encoded because it includes characters that are 809 prohibited in SDP. 811 k=uri:<URI to obtain key> 812 A Uniform Resource Identifier as used by WWW clients is 813 included in this key field. The URI refers to the data 814 containing the key, and may require additional authentication 815 before the key can be returned. When a request is made to the 816 given URI, the MIME content-type of the reply specifies the 817 encoding for the key in the reply. The key should not be 818 obtained until the user wishes to join the session to reduce 819 synchronisation of requests to the WWW server(s). 821 k=prompt 822 No key is included in this SDP description, but the session or 823 media stream referred to by this key field is encrypted. The 824 user should be prompted for the key when attempting to join the 825 session, and this user-supplied key should then be used to 826 decrypt the media streams. 828 Attributes 830 a=<attribute> 831 a=<attribute>:<value> 833 Attributes are the primary means for extending SDP. Attributes may be 834 defined to be used as "session-level" attributes, "media-level" 835 attributes, or both.
837 A media description may have any number of attributes (``a='' fields) 838 which are media specific. These are referred to as "media-level" 839 attributes and add information about the media stream. Attribute fields 840 can also be added before the first media field; these "session-level" 841 attributes convey additional information that applies to the conference 842 as a whole rather than to individual media; an example might be the 843 conference's floor control policy. 845 Attribute fields may be of two forms: 847 o property attributes. A property attribute is simply of the form 848 ``a=<flag>''. These are binary attributes, and the presence of the 849 attribute conveys that the attribute is a property of the session. 850 An example might be ``a=recvonly''. 852 o value attributes. A value attribute is of the form 853 ``a=<attribute>:<value>''. An example might be that a whiteboard 854 could have the value attribute ``a=orient:landscape''. 856 Attribute interpretation depends on the media tool being invoked. Thus 857 receivers of session descriptions should be configurable in their 858 interpretation of announcements in general and of attributes in 859 particular. 861 Attribute names must be in the US-ASCII subset of ISO-10646/UTF-8. 863 Attribute values are byte strings, and MAY use any byte value except 864 0x00 (Nul), 0x0A (LF), and 0x0D (CR). By default, attribute values are 865 to be interpreted as being in the ISO-10646 character set with UTF-8 encoding. 866 Unlike other text fields, attribute values are NOT normally affected by 867 the `charset' attribute as this would make comparisons against known 868 values problematic. However, when an attribute is defined, it can be 869 defined to be charset-dependent, in which case its value should be 870 interpreted in the session charset rather than in ISO-10646. 872 Attributes that will be commonly used can be registered with IANA (see 873 Appendix B).
Unregistered attributes should begin with "X-" to prevent 874 inadvertent collision with registered attributes. In either case, if an 875 attribute is received that is not understood, it should simply be 876 ignored by the receiver. 878 Media Announcements 880 m=<media> <port> <transport> <fmt list> 882 A session description may contain a number of media descriptions. Each 883 media description starts with an ``m='' field, and is terminated by 884 either the next ``m='' field or by the end of the session description. 885 A media field also has several sub-fields: 887 o The first sub-field is the media type. Currently defined media are 888 ``audio'', ``video'', ``application'', ``data'' and ``control'', 889 though this list may be extended as new communication modalities 890 emerge (e.g., telepresence). The difference between ``application'' 891 and ``data'' is that the former is a media flow such as whiteboard 892 information, and the latter is bulk-data transfer such as 893 multicasting of program executables which will not typically be 894 displayed to the user. ``control'' is used to specify an additional 895 conference control channel for the session. 897 o The second sub-field is the transport port to which the media stream 898 will be sent. The meaning of the transport port depends on the 899 network being used as specified in the relevant ``c'' field and on 900 the transport protocol defined in the third sub-field. Other ports 901 used by the media application (such as the RTCP port, see [2]) 902 should be derived algorithmically from the base media port. 904 Note: For transports based on UDP, the value should be in the range 905 1024 to 65535 inclusive. For RTP compliance it should be an even 906 number. 908 For applications where hierarchically encoded streams are being sent 909 to a unicast address, it may be necessary to specify multiple 910 transport ports.
This is done using a similar notation to that used 911 for IP multicast addresses in the ``c='' field: 913 m=<media> <port>/<number of ports> <transport> <fmt list> 915 In such a case, the ports used depend on the transport protocol. 916 For RTP, only the even ports are used for data and the corresponding 917 one-higher odd port is used for RTCP. For example: 919 m=video 49170/2 RTP/AVP 31 921 would specify that ports 49170 and 49171 form one RTP/RTCP pair and 922 49172 and 49173 form the second RTP/RTCP pair. RTP/AVP is the 923 transport protocol and 31 is the format (see below). 925 If multiple addresses are specified in the ``c='' field and multiple 926 ports are specified in the ``m='' field, a one-to-one mapping from 927 port to the corresponding address is implied. For example: 929 c=IN IP4 224.2.1.1/127/2 930 m=video 49170/2 RTP/AVP 31 932 would imply that address 224.2.1.1 is used with ports 49170 and 933 49171, and address 224.2.1.2 is used with ports 49172 and 49173. 935 o The third sub-field is the transport protocol. The transport 936 protocol values are dependent on the address-type field in the 937 ``c='' fields. Thus a ``c='' field of IP4 defines that the 938 transport protocol runs over IP4. For IP4, it is normally expected 939 that most media traffic will be carried as RTP over UDP. The 940 following transport protocols are preliminarily defined, but may be 941 extended through registration of new protocols with IANA: 943 - RTP/AVP - the IETF's Realtime Transport Protocol using the 944 Audio/Video profile carried over UDP. 946 - udp - User Datagram Protocol 948 If an application uses a single combined proprietary media format 949 and transport protocol over UDP, then simply specifying the 950 transport protocol as udp and using the format field to distinguish 951 the combined protocol is recommended.
If a transport protocol is 952 used over UDP to carry several distinct media types that need to be 953 distinguished by a session directory, then specifying the transport 954 protocol and media format separately is necessary. RTP is an 955 example of a transport-protocol that carries multiple payload 956 formats that must be distinguished by the session directory for it 957 to know how to start appropriate tools, relays, mixers or recorders. 959 The main reason to specify the transport-protocol in addition to the 960 media format is that the same standard media formats may be carried 961 over different transport protocols even when the network protocol is 962 the same - a historical example is vat PCM audio and RTP PCM audio. 963 In addition, relays and monitoring tools that are transport- 964 protocol-specific but format-independent are possible. 966 For RTP media streams operating under the RTP Audio/Video Profile 967 [3], the protocol field is ``RTP/AVP''. Should other RTP profiles 968 be defined in the future, their profiles will be specified in the 969 same way. For example, the protocol field ``RTP/XYZ'' would specify 970 RTP operating under a profile whose short name is ``XYZ''. 972 o The fourth and subsequent sub-fields are media formats. For audio 973 and video, these will normally be a media payload type as defined in 974 the RTP Audio/Video Profile. 976 When a list of payload formats is given, this implies that all of 977 these formats may be used in the session, but the first of these 978 formats is the default format for the session. 980 For media whose transport protocol is not RTP or UDP the format 981 field is protocol specific. Such formats should be defined in an 982 additional specification document. 984 For media whose transport protocol is RTP, SDP can be used to 985 provide a dynamic binding of media encoding to RTP payload type. 
986 The encoding names in the RTP AV Profile do not specify unique audio 987 encodings (in terms of clock rate and number of audio channels), and 988 so they are not used directly in SDP format fields. Instead, the 989 payload type number should be used to specify the format for static 990 payload types and the payload type number along with additional 991 encoding information should be used for dynamically allocated 992 payload types. 994 An example of a static payload type is u-law PCM coded single 995 channel audio sampled at 8 kHz. This is completely defined in the 996 RTP Audio/Video profile as payload type 0, so the media field for 997 such a stream sent to UDP port 49232 is: 999 m=audio 49232 RTP/AVP 0 1001 An example of a dynamic payload type is 16 bit linear encoded stereo 1002 audio sampled at 16 kHz. If we wish to use dynamic RTP/AVP payload 1003 type 98 for such a stream, additional information is required to 1004 decode it: 1006 m=audio 49232 RTP/AVP 98 1007 a=rtpmap:98 L16/16000/2 1009 The general form of an rtpmap attribute is: 1011 a=rtpmap:<payload type> <encoding name>/<clock rate>[/<encoding parameters>] 1013 For audio streams, <encoding parameters> may specify the number of 1014 audio channels. This parameter may be omitted if the number of 1015 channels is one provided no additional parameters are needed. 1016 For video streams, no encoding parameters are currently specified. 1018 Additional parameters may be defined in the future, but codec- 1019 specific parameters should not be added. Parameters added to an 1020 rtpmap attribute should only be those required for a session 1021 directory to make the choice of appropriate media tool to participate 1022 in a session. Codec-specific parameters should be added in other 1023 attributes. 1025 Up to one rtpmap attribute can be defined for each media format 1026 specified.
Thus we might have: 1028 m=audio 49230 RTP/AVP 96 97 98 1029 a=rtpmap:96 L8/8000 1030 a=rtpmap:97 L16/8000 1031 a=rtpmap:98 L16/11025/2 1033 RTP profiles that specify the use of dynamic payload types must 1034 define the set of valid encoding names and/or a means to register 1035 encoding names if that profile is to be used with SDP. 1037 Experimental encoding formats can also be specified using rtpmap. 1038 RTP formats that are not registered as standard format names must be 1039 preceded by ``X-''. Thus a new experimental redundant audio stream 1040 called GSMLPC using dynamic payload type 99 could be specified as: 1042 m=audio 49232 RTP/AVP 99 1043 a=rtpmap:99 X-GSMLPC/8000 1045 Such an experimental encoding requires that any site wishing to 1046 receive the media stream has relevant configured state in its 1047 session directory to know which tools are appropriate. 1049 Note that RTP audio formats typically do not include information 1050 about the number of samples per packet. If a non-default (as 1051 defined in the RTP Audio/Video Profile) packetisation is required, 1052 the ``ptime'' attribute is used as given below. 1054 For more details on RTP audio and video formats, see [3]. 1056 o Predefined formats for UDP protocol non-RTP media are as below. 1058 Application Formats: 1060 wb: LBL Whiteboard (transport: udp) 1062 nt: UCL Network Text Editor (transport: udp) 1064 Suggested Attributes 1066 The following attributes are suggested. Since application writers may 1067 add new attributes as they are required, this list is not exhaustive. 1069 a=cat:<category> 1070 This attribute gives the dot-separated hierarchical category of the 1071 session. This is to enable a receiver to filter unwanted sessions 1072 by category. It would probably have been a compulsory separate 1073 field, except for its experimental nature at this time. It is a 1074 session-level attribute, and is not dependent on charset.
1076 a=keywds:<keywords> 1077 Like the cat attribute, this is to assist identifying wanted 1078 sessions at the receiver. This allows a receiver to select 1079 interesting sessions based on keywords describing the purpose of the 1080 session. It is a session-level attribute. It is a charset dependent 1081 attribute, meaning that its value should be interpreted in the 1082 charset specified for the session description if one is specified, 1083 or by default in ISO 10646/UTF-8. 1085 a=tool:<name and version of tool> 1086 This gives the name and version number of the tool used to create 1087 the session description. It is a session-level attribute, and is 1088 not dependent on charset. 1090 a=ptime:<packet time> 1091 This gives the length of time in milliseconds represented by the 1092 media in a packet. This is probably only meaningful for audio data. 1093 It should not be necessary to know ptime to decode RTP or vat audio, 1094 and it is intended as a recommendation for the 1095 encoding/packetisation of audio. It is a media attribute, and is 1096 not dependent on charset. 1098 a=recvonly 1099 This specifies that the tools should be started in receive-only mode 1100 where applicable. It can be either a session or media attribute, and 1101 is not dependent on charset. 1103 a=sendrecv 1104 This specifies that the tools should be started in send and receive 1105 mode. This is necessary for interactive conferences with tools such 1106 as wb which defaults to receive only mode. It can be either a 1107 session or media attribute, and is not dependent on charset. 1109 a=sendonly 1110 This specifies that the tools should be started in send-only mode. 1111 An example may be where a different unicast address is to be used 1112 for a traffic destination than for a traffic source. In such a case, 1113 two media descriptions may be used, one sendonly and one recvonly. It 1114 can be either a session or media attribute, but would normally only 1115 be used as a media attribute, and is not dependent on charset.
1117 a=orient:<whiteboard orientation> 1118 Normally this is only used in a whiteboard media specification. It 1119 specifies the orientation of the whiteboard on the screen. It is 1120 a media attribute. Permitted values are `portrait', `landscape' and 1121 `seascape' (upside down landscape). It is not dependent on charset. 1123 a=type:<conference type> 1124 This specifies the type of the conference. Suggested values are 1125 `broadcast', `meeting', `moderated', `test' and `H332'. `recvonly' 1126 should be the default for `type:broadcast' sessions, `type:meeting' 1127 should imply `sendrecv' and `type:moderated' should indicate the use 1128 of a floor control tool and that the media tools are started so as 1129 to ``mute'' new sites joining the conference. 1131 Specifying the attribute type:H332 indicates that this loosely 1132 coupled session is part of an H.332 session as defined in the ITU 1133 H.332 specification [10]. Media tools should be started `recvonly'. 1135 Specifying the attribute type:test is suggested as a hint that, 1136 unless explicitly requested otherwise, receivers can safely avoid 1137 displaying this session description to users. 1139 The type attribute is a session-level attribute, and is not 1140 dependent on charset. 1142 a=charset:<character set> 1143 This specifies the character set to be used to display the session 1144 name and information data. By default, the ISO-10646 character set 1145 in UTF-8 encoding is used. If a more compact representation is 1146 required, other character sets may be used such as ISO-8859-1 for 1147 Northern European languages. In particular, ISO 8859-1 is 1148 specified with the following SDP attribute: 1150 a=charset:ISO-8859-1 1152 This is a session-level attribute; if this attribute is present, it 1153 must be before the first media field. The charset specified MUST be 1154 one of those registered with IANA, such as ISO-8859-1.
The 1155 character set identifier is a US-ASCII string and MUST be compared 1156 against the IANA identifiers using a case-insensitive comparison. 1157 If the identifier is not recognised or not supported, all strings 1158 that are affected by it SHOULD be regarded as byte strings. 1160 Note that a character set specified MUST still prohibit the use of 1161 bytes 0x00 (Nul), 0x0A (LF) and 0x0D (CR). Character sets requiring 1162 the use of these characters MUST define a quoting mechanism that 1163 prevents these bytes appearing within text fields. 1165 a=sdplang:<language tag> 1166 This can be a session level attribute or a media level attribute. 1167 As a session level attribute, it specifies the language for the 1168 session description. As a media level attribute, it specifies the 1169 language for any media-level SDP information field associated with 1170 that media. Multiple sdplang attributes can be provided either at 1171 session or media level if the session 1172 description or media use multiple languages, in which case the order 1173 of the attributes indicates the order of importance of the various 1174 languages in the session or media from most important to least 1175 important. 1177 In general, sending session descriptions consisting of multiple 1178 languages should be discouraged. Instead, multiple descriptions 1179 should be sent describing the session, one in each language. 1180 However this is not possible with all transport mechanisms, and so 1181 multiple sdplang attributes are allowed although not recommended. 1183 The sdplang attribute value must be a single RFC 1766 language tag 1184 in US-ASCII. It is not dependent on the charset attribute. An 1185 sdplang attribute SHOULD be specified when a session is of 1186 sufficient scope to cross geographic boundaries where the language 1187 of recipients cannot be assumed, or where the session is in a 1188 different language from the locally assumed norm.
1190 a=lang:<language tag> 1191 This can be a session level attribute or a media level attribute. 1192 As a session level attribute, it specifies the default language for 1193 the session being described. As a media level attribute, it 1194 specifies the language for that media, overriding any session-level 1195 language specified. Multiple lang attributes can be provided either 1196 at session or media level if the session 1197 description or media use multiple languages, in which case the order 1198 of the attributes indicates the order of importance of the various 1199 languages in the session or media from most important to least 1200 important. 1202 The lang attribute value must be a single RFC 1766 language tag in 1203 US-ASCII. It is not dependent on the charset attribute. A lang 1204 attribute SHOULD be specified when a session is of sufficient scope 1205 to cross geographic boundaries where the language of recipients 1206 cannot be assumed, or where the session is in a different language 1207 from the locally assumed norm. 1209 a=framerate:<frame rate> 1210 This gives the maximum video frame rate in frames/sec. It is 1211 intended as a recommendation for the encoding of video data. 1212 Decimal representations of fractional values using the notation 1213 "<integer>.<fraction>" are allowed. It is a media attribute, is 1214 only defined for video media, and is not dependent on charset. 1216 a=quality:<quality> 1217 This gives a suggestion for the quality of the encoding as an 1218 integer value. 1220 The intention of the quality attribute for video is to specify a 1221 non-default trade-off between frame-rate and still-image quality. 1222 For video, the value is in the range 0 to 10, with the following 1223 suggested meaning: 1225 10 - the best still-image quality the compression scheme can give. 1227 5 - the default behaviour given no quality suggestion. 1229 0 - the worst still-image quality the codec designer thinks is 1230 still usable.
1231 It is a media attribute, and is not dependent on charset. 1233 a=fmtp:<format> <format specific parameters> 1234 This attribute allows parameters that are specific to a particular 1235 format to be conveyed in a way that SDP doesn't have to understand 1236 them. The format must be one of the formats specified for the 1237 media. Format-specific parameters may be any set of parameters 1238 required to be conveyed by SDP and given unchanged to the media tool 1239 that will use this format. 1241 It is a media attribute, and is not dependent on charset. 1243 6.1. Communicating Conference Control Policy 1245 There is some debate over the way conference control policy should be 1246 communicated. In general, the authors believe that an implicit 1247 declarative style of specifying conference control is desirable where 1248 possible. 1250 A simple declarative style uses a single conference attribute field 1251 before the first media field, possibly supplemented by properties such 1252 as `recvonly' for some of the media tools. This conference attribute 1253 conveys the conference control policy. An example might be: 1255 a=type:moderated 1257 In some cases, however, it is possible that this may be insufficient to 1258 communicate the details of an unusual conference control policy. If 1259 this is the case, then a conference attribute specifying external 1260 control might be set, and then one or more ``media'' fields might be 1261 used to specify the conference control tools and configuration data for 1262 those tools. An example is an ITU H.332 session: 1264 ... 1265 c=IN IP4 224.5.6.7 1266 a=type:H332 1267 m=audio 49230 RTP/AVP 0 1268 m=video 49232 RTP/AVP 31 1269 m=application 12349 udp wb 1270 m=control 49234 H323 mc 1271 c=IN IP4 134.134.157.81 1273 In this example, a general conference attribute (type:H332) is specified 1274 stating that conference control will be provided by an external H.332 1275 tool, and a contact address for the H.323 session multipoint 1276 controller is given.
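The example above also illustrates the rule that a per-media ``c='' field overrides the session-level one: the control media carries its own connection address, while the audio, video and application media inherit the session-level address. The following Python sketch shows one way a receiver might resolve the connection data that applies to each media description. The function name effective_connections is hypothetical, and the code checks only that some ``c='' field applies to every media description; it is a sketch under those assumptions, not a complete SDP parser.

```python
def effective_connections(sdp_text):
    """Return [(m= value, [c= values that apply to it]), ...].

    A c= field seen before the first m= field is session-level.
    A c= field seen after an m= field belongs to that media
    description and overrides the session-level value for it.
    """
    session_conn = None
    media = []  # each entry is [m= value, list of per-media c= values]
    for raw in sdp_text.splitlines():
        line = raw.strip()
        # Skip anything that is not of the form <type>=<value>.
        if len(line) < 2 or line[1] != "=":
            continue
        kind, value = line[0], line[2:]
        if kind == "m":
            media.append([value, []])
        elif kind == "c":
            if media:
                media[-1][1].append(value)
            else:
                session_conn = value

    resolved = []
    for m_value, conns in media:
        if not conns:
            # No per-media c= field, so fall back to the session level.
            if session_conn is None:
                raise ValueError("no c= field applies to: " + m_value)
            conns = [session_conn]
        resolved.append((m_value, conns))
    return resolved
```

Applied to the H.332 example, the first three media descriptions resolve to "IN IP4 224.5.6.7" and the control media resolves to "IN IP4 134.134.157.81". A list is returned for each media description because multiple ``c='' lines are permitted on a per-media basis.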
1278 In this document, only the declarative style of conference control 1279 declaration is specified. Other forms of conference control should 1280 specify an appropriate type attribute, and should define the 1281 implications this has for control media. 1283 7. Security Considerations 1285 SDP is a session description format that describes multimedia sessions. 1286 A session description should not be trusted unless it has been obtained 1287 by an authenticated transport protocol from a trusted source. Many 1288 different transport protocols may be used to distribute session 1289 descriptions, and the nature of the authentication will differ from 1290 transport to transport. 1292 One transport that will frequently be used to distribute session 1293 descriptions is the Session Announcement Protocol (SAP). SAP provides 1294 both encryption and authentication mechanisms but due to the nature of 1295 session announcements it is likely that there are many occasions where 1296 the originator of a session announcement cannot be authenticated because 1297 they are previously unknown to the receiver of the announcement and 1298 because no common public key infrastructure is available. 1300 On receiving a session description over an unauthenticated transport 1301 mechanism or from an untrusted party, software parsing the session 1302 should take a few precautions. Session descriptions contain information 1303 required to start software on the receiver's system. Software that 1304 parses a session description MUST NOT be able to start other software 1305 except that which is specifically configured as appropriate software to 1306 participate in multimedia sessions. It is normally considered 1307 INAPPROPRIATE for software parsing a session description to start, on a 1308 user's system, software that is appropriate to participate in multimedia 1309 sessions, without the user first being informed that such software will 1310 be started and giving their consent.
Thus a session description 1311 arriving by session announcement, email, session invitation, or WWW page 1312 SHOULD NOT deliver the user into an interactive multimedia session 1313 without the user being aware that this will happen. As it is not always 1314 simple to tell whether a session is interactive or not, applications 1315 that are unsure should assume sessions are interactive. 1317 In this specification, there are no attributes which would allow the 1318 recipient of a session description to be informed to start multimedia 1319 tools in a mode where they default to transmitting. Under some 1320 circumstances it might be appropriate to define such attributes. If 1321 this is done, an application parsing a session description containing 1322 such attributes SHOULD either ignore them, or inform the user that 1323 joining this session will result in the automatic transmission of 1324 multimedia data. The default behaviour for an unknown attribute is to 1325 ignore it. 1327 Session descriptions may be parsed at intermediate systems such as 1328 firewalls for the purposes of opening a hole in the firewall to allow 1329 the participation in multimedia sessions. It is considered 1330 INAPPROPRIATE for a firewall to open such holes for unicast data streams 1331 unless the session description comes in a request from inside the 1332 firewall. For multicast sessions, it is likely that local 1333 administrators will apply their own policies, but the exclusive use of 1334 "local" or "site-local" administrative scope within the firewall and the 1335 refusal of the firewall to open a hole for such scopes will provide 1336 separation of global multicast sessions from local ones. 1338 Appendix A: SDP Grammar 1340 This appendix provides an Augmented BNF grammar for SDP. ABNF is 1341 defined in RFC 2234.
1343 announcement = proto-version 1344 origin-field 1345 session-name-field 1346 information-field 1347 uri-field 1348 email-fields 1349 phone-fields 1350 connection-field 1351 bandwidth-fields 1352 time-fields 1353 key-field 1354 attribute-fields 1355 media-descriptions 1357 proto-version = "v=" 1*DIGIT CRLF 1358 ;this draft describes version 0 1360 origin-field = "o=" username space 1361 sess-id space sess-version space 1362 nettype space addrtype space 1363 addr CRLF 1365 session-name-field = "s=" text CRLF 1367 information-field = ["i=" text CRLF] 1369 uri-field = ["u=" uri CRLF] 1371 email-fields = *("e=" email-address CRLF) 1373 phone-fields = *("p=" phone-number CRLF) 1375 connection-field = ["c=" nettype space addrtype space 1376 connection-address CRLF] 1377 ;a connection field must be present 1378 ;in every media description or at the 1379 ;session-level 1381 bandwidth-fields = *("b=" bwtype ":" bandwidth CRLF) 1382 time-fields = 1*( "t=" start-time space stop-time 1383 *(CRLF repeat-fields) CRLF) 1384 [zone-adjustments CRLF] 1386 repeat-fields = "r=" repeat-interval space typed-time 1387 1*(space typed-time) 1389 zone-adjustments = time space ["-"] typed-time 1390 *(space time space ["-"] typed-time) 1392 key-field = ["k=" key-type CRLF] 1394 key-type = "prompt" | 1395 "clear:" key-data | 1396 "base64:" key-data | 1397 "uri:" uri 1399 key-data = email-safe | "~" | "\" 1401 attribute-fields = *("a=" attribute CRLF) 1403 media-descriptions = *( media-field 1404 information-field 1405 *(connection-field) 1406 bandwidth-fields 1407 key-field 1408 attribute-fields ) 1410 media-field = "m=" media space port ["/" integer] 1411 space proto 1*(space fmt) CRLF 1413 media = 1*(alpha-numeric) 1414 ;typically "audio", "video", "application" 1415 ;or "data" 1417 fmt = 1*(alpha-numeric) 1418 ;typically an RTP payload type for audio 1419 ;and video media 1421 proto = 1*(alpha-numeric) 1422 ;typically "RTP/AVP" or "udp" for IP4 1424 port = 1*(DIGIT) 1425 ;should be in
the range "1024" to "65535" inclusive 1426 ;for UDP based media 1428 attribute = (att-field ":" att-value) | att-field 1430 att-field = 1*(alpha-numeric) 1432 att-value = byte-string 1434 sess-id = 1*(DIGIT) 1435 ;should be unique for this originating username/host 1437 sess-version = 1*(DIGIT) 1438 ;0 is a new session 1440 connection-address = multicast-address 1441 | unicast-address 1443 multicast-address = 3*(decimal_uchar ".") decimal_uchar "/" ttl 1444 [ "/" integer ] 1445 ;multicast addresses may be in the range 1446 ;224.0.0.0 to 239.255.255.255 1448 ttl = decimal_uchar 1450 start-time = time | "0" 1452 stop-time = time | "0" 1454 time = POS-DIGIT 9*(DIGIT) 1455 ;sufficient for 2 more centuries 1457 repeat-interval = typed-time 1459 typed-time = 1*(DIGIT) [fixed-len-time-unit] 1461 fixed-len-time-unit = "d" | "h" | "m" | "s" 1463 bwtype = 1*(alpha-numeric) 1465 bandwidth = 1*(DIGIT) 1467 username = safe 1468 ;pretty wide definition, but doesn't include space 1470 email-address = email | email "(" email-safe ")" | 1471 email-safe "<" email ">" 1473 email = ;defined in RFC822 1475 uri = ;defined in RFC1630 1477 phone-number = phone | phone "(" email-safe ")" | 1478 email-safe "<" phone ">" 1480 phone = "+" POS-DIGIT 1*(space | "-" | DIGIT) 1481 ;there must be a space or hyphen between the 1482 ;international code and the rest of the number. 1484 nettype = "IN" 1485 ;list to be extended 1487 addrtype = "IP4" | "IP6" 1488 ;list to be extended 1490 addr = FQDN | unicast-address