idnits 2.17.1

draft-jennings-rtcweb-signaling-01.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  -- The document has examples using IPv4 documentation addresses according
     to RFC6890, but does not use any IPv6 documentation addresses.  Maybe
     there should be IPv6 examples, too?

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  == The document seems to lack the recommended RFC 2119 boilerplate, even if
     it appears to use RFC 2119 keywords -- however, there's a paragraph with
     a matching beginning.  Boilerplate error?

     (The document does seem to have the reference to RFC 2119 which the
     ID-Checklist requires).

  == The document seems to contain a disclaimer for pre-RFC5378 work, but was
     first submitted on or after 10 November 2008.  The disclaimer is usually
     necessary only for documents that revise or obsolete older RFCs, and
     that take significant amounts of text from those RFCs.  If you can
     contact all authors of the source material and they are willing to grant
     the BCP78 rights to the IETF Trust, you can and should remove the
     disclaimer.  Otherwise, the disclaimer is needed and you can ignore this
     comment.  (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (October 30, 2011) is 4563 days in the past.  Is this
     intentional?

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  ** Obsolete normative reference: RFC 4627 (Obsoleted by RFC 7158, RFC 7159)

  ** Obsolete normative reference: RFC 4566 (Obsoleted by RFC 8866)

  -- Obsolete informational reference (is this intentional?): RFC 5245
     (Obsoleted by RFC 8445, RFC 8839)

  == Outdated reference: A later version (-16) exists of
     draft-ietf-rtcweb-use-cases-and-requirements-06

     Summary: 2 errors (**), 0 flaws (~~), 4 warnings (==), 3 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

Network Working Group                                        C. Jennings
Internet-Draft                                                     Cisco
Intended status: Standards Track                            J. Rosenberg
Expires: May 2, 2012                                         jdrosen.net
                                                               J. Uberti
                                                                  Google
                                                                R. Jesup
                                                                 Mozilla
                                                        October 30, 2011

                  RTCWeb Offer/Answer Protocol (ROAP)
                   draft-jennings-rtcweb-signaling-01

Abstract

   This document describes a protocol used to negotiate media between
   browsers or other compatible devices.  This protocol provides the
   state machinery needed to implement the offer/answer model (RFC
   3264), and defines the semantics and necessary attributes of messages
   that must be exchanged.  The protocol uses an abstract transport in
   that it does not actually define how these messages are exchanged.
   Rather, such exchanges are handled through web-based transports like
   HTTP or WebSockets.  The protocol focuses solely on media negotiation
   and does not handle call control, call processing, or other
   functions.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).
   Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on May 2, 2012.

Copyright Notice

   Copyright (c) 2011 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

   This document may contain material from IETF Documents or IETF
   Contributions published or made publicly available before November
   10, 2008.  The person(s) controlling the copyright in some of this
   material may not have granted the IETF Trust the right to allow
   modifications of such material outside the IETF Standards Process.
   Without obtaining an adequate license from the person(s) controlling
   the copyright in such materials, this document may not be modified
   outside the IETF Standards Process, and derivative works of it may
   not be created outside the IETF Standards Process, except to format
   it for publication as an RFC or to translate it into languages other
   than English.

Table of Contents

   1.  Introduction
   2.  Requirements and Design Goals
   3.  Terminology
   4.  Protocol Overview
   5.  Semantics & Syntax
     5.1.  Reliability Model
     5.2.  Common Fields
       5.2.1.  Session IDs
       5.2.2.  Seq
       5.2.3.  Session Tokens
       5.2.4.  Response Tokens
     5.3.  Media Setup
       5.3.1.  OFFER Message
         5.3.1.1.  Offerer Behavior
         5.3.1.2.  Answerer Behavior
       5.3.2.  ANSWER
         5.3.2.1.  moreComing Flag
       5.3.3.  OK
       5.3.4.  ERROR
     5.4.  Changing Media Parameters
       5.4.1.  Conflicting OFFERS (glare)
       5.4.2.  Premature OFFER
     5.5.  Notification of Media Termination
     5.6.  Errors
       5.6.1.  NOMATCH
       5.6.2.  TIMEOUT
       5.6.3.  REFUSED
       5.6.4.  CONFLICT
       5.6.5.  DOUBLECONFLICT
       5.6.6.  FAILED
   6.  Security Considerations
   7.  Companion APIs
     7.1.  Capabilities
     7.2.  Hints
     7.3.  Stats
   8.  Relationship with SIP & Jingle
   9.  IANA Considerations
   10. Acknowledgments
   11. Open Issues
   12. References
     12.1.  Normative References
     12.2.  Informative References
   Authors' Addresses

1.  Introduction

   This specification defines a protocol that allows an RTCWeb browser
   to exchange information to control the set up of media to another
   browser or device.  The scope of this protocol is limited to
   functionality required for the setup and negotiation of media and the
   associated transports, referred to as media control.  The protocol
   defines the minimum set of messages and state machinery necessary to
   implement the offer/answer model as defined in [RFC3264].  The
   offer/answer model specifies rules for the bilateral exchange of
   Session Description Protocol (SDP) messages [RFC4566] for creation of
   media streams.

   The protocol specified here defines the state machines, semantic
   behaviors, and messages that are exchanged between instances of the
   state machines.  However, it does not specify the actual on-the-wire
   transport of these messages.
   Rather, it assumes that the
   implementation of this protocol would occur within the browser
   itself, and then browser APIs would allow the application's
   JavaScript to request creation of messages and insert messages into
   the state machine.  The actual transfer of these messages would be
   the responsibility of the web application, and would utilize
   protocols such as HTTP and WebSockets.  To facilitate implementation
   within a browser, messages are encoded in JSON [RFC4627].  This
   protocol, with appropriately selected transports, could also be
   implemented by a signalling gateway that converts ROAP to SIP or
   Jingle.

   This protocol is designed to be closely aligned with the
   PeerConnection API defined in the RTCWeb API [webrtc-api]
   specification.  It is important to note that while ROAP does not
   require what has been referred to as a low level API for media
   manipulation, ROAP does not prevent having such an API as well, and
   both styles of API could coexist and be used where appropriate.

   The protocol defined here does not provide any call control.
   Concepts like ringing of phones, user search, call forwarding,
   redirection, transfer, hold, and so on, are all the domain of call
   processing and are out of scope for this specification.  It is
   assumed that the application running within the browser provides any
   call control based on the needs of the application, the scope of
   which is not a matter for standardization.

   Despite the fact that it has an abstract transport, ROAP is still a
   protocol.  This means it has state machines, and it has rules
   governing the behavior of those state machines which guarantee that
   the system operates properly based on any set of inputs.  It is
   assumed that this state machinery is implemented in the browser and
   thus immutable by the application, which can then guarantee proper
   behavior regardless of the operation of the resident JavaScript.

   The protocol is designed to operate between two entities (browsers
   for example), which exchange messages "directly" - meaning that a
   message output by one entity is meant to be directly processed by the
   other entity without further modification.  In practice, this means
   that a web server can treat ROAP messages as opaque and just shuffle
   them between browser instances.  This allows for simple
   implementations.  However, more powerful applications can be built in
   which the web server or JavaScript can modify the messages in order
   to provide more complex features.  As long as those modifications
   produce messages compliant to this specification, SDP Offer/Answer
   [RFC3264], SDP [RFC4566], ICE [RFC5245] and any other dependencies,
   interoperability is still possible.

   This protocol is designed for two major use cases:

   o  Browser to browser
   o  Browser to SIP device via a SIP gateway

   In the browser to SIP use case, the gateway obviously needs to be
   somewhat more sophisticated.  However, because this design is a small
   subset of the design space covered by SIP [RFC3261], it is intended
   to be simple to translate to and from SIP via a signalling gateway.
   Moreover, many of the elements in messages have clear mappings to
   elements in SIP messages, thus allowing simple, stateless
   translation.

2.  Requirements and Design Goals

   There has been extensive debate about the best architecture for
   RTCWeb signaling.  To a great extent this decision is dictated by the
   requirements that the signaling mechanism is intended to fit.  The
   protocol in this document was designed to minimize the amount of
   implementation effort required outside the browser and RTC-Web
   signaling gateways.  This implies the following requirements:

   It should be possible to develop a simple browser to browser voice
   and video service in a small amount of code.
   In particular, it MUST
   be possible to implement a functional service such that:

   o  It's possible to build a web service that maintains only
      transaction state, not call state;
   o  In the browser to browser case, the web server can simply pass
      protocol messages between the browser agents without examining or
      modifying them;

   o  The service operates without needing to examine the details of the
      browser capabilities (e.g., new codecs should be automatically
      accommodated without modifying either the service or the
      associated JS).

   It should be possible to implement a simple RTC-Web gateway that:

   o  Connects to legacy SIP devices ranging from multiscreen video
      phones to PSTN gateways;
   o  Has a deterministic mapping between RTC-Web messages and SIP
      messages;
   o  Permits the mechanical translation of messages without knowledge
      of the details of all the browser capabilities;
   o  Is only required to maintain transaction state, not call state
      (note that it is fine if an implementation wants to maintain call
      state); and
   o  Does not need to send or receive the media (unless also acting as
      a relay or a translator for codecs which are not jointly
      supported).

   Finally it seems clear that SDP is too complicated to reinvent, so
   despite its manifest deficiencies we opt to take it as-is rather than
   trying to reinvent it.

3.  Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHOULD", "SHOULD NOT",
   "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be
   interpreted as described in [RFC2119].

   This draft uses the API and terminology described in [webrtc-api].

4.  Protocol Overview

   We start with a simple example.  Consider the case where browser A
   wishes to set up a media session with browser B. At the high level,
   A needs to communicate the following information:

   o  This is a new media session and not an update to a different
      session.
   o  Here is A's SDP offer, including media parameters and ICE
      candidates.

   The OFFER message is used to carry this information.  For example, A
   might send B:

      {
        "messageType":"OFFER",
        "offererSessionId":"13456789ABCDEF",
        "seq": 1,
        "sdp":"
      v=0\n
      o=- 2890844526 2890842807 IN IP4 192.0.2.1\n
      s= \n
      c=IN IP4 192.0.2.1\n
      t=2873397496 2873404696\n
      m=audio 49170 RTP/AVP 0"
      }

   The messageType field indicates that this is an OFFER and the
   offererSessionId indicates the media session that this OFFER is
   associated with.  B can tell that this is for a new media session
   because it contains an offererSessionId that B has not seen before.
   The sdp field contains the offer itself, which is just an ordinary
   SDP offer rendered as a string.

   If B elects to start a media session, B responds with an ANSWER
   message containing SDP, as shown below.

      {
        "messageType":"ANSWER",
        "offererSessionId":"13456789ABCDEF",
        "answererSessionId":"abc1234356",
        "seq": 1,
        "sdp":"
      v=0\n
      o=- 2890844526 2890842807 IN IP4 192.0.2.3\n
      s= \n
      c=IN IP4 192.0.2.3\n
      t=2873397496 2873404696\n
      m=audio 49175 RTP/AVP 0"
      }

   The contents of this message are more or less the same as those in
   the OFFER, except that B also includes an answererSessionId to
   uniquely identify the session from B's perspective.  The combination
   of offererSessionId and answererSessionId uniquely identifies this
   session.

   Finally, in order to confirm that A has seen B's ANSWER, A responds
   with an OK message.

      {
        "messageType":"OK",
        "offererSessionId":"13456789ABCDEF",
        "answererSessionId":"abc1234356",
        "seq": 1
      }

   Note that all of these messages contain a seq field which contains a
   transaction sequence number.
   The seq field makes it possible to
   correlate messages which belong to the same transaction, as well as
   to detect duplicates, which is described in Section 5.1.

   An object with a messageType value of "OFFER" will always contain an
   SDP offer, and an object with a messageType value of "ANSWER" will
   always contain an SDP answer.  The complete list of message types is
   defined in Section 5.  Only a small number of messages are permitted
   and much of the message set is devoted to error handling.

   In building web systems it is often useful for a request to contain
   some state that is passed back in future messages.  This system
   includes two types of state: session state and request state.  If a
   browser receives a message that contains state in a setSessionState
   attribute, any future messages it sends that have the same
   offererSessionId MUST include this state in a sessionState attribute.
   Similarly, if a request contains a setResponseState attribute, that
   state MUST be included in any response to that request in a
   responseState attribute.

   Once a session has been set up, additional rounds of offer/answer can
   be sent using the OFFER/ANSWER/OK sequence.  Note that the seq
   attribute makes it easy to differentiate these additional rounds from
   the initial exchange and from each other.

   At the point that one side wishes to end the session, it simply sends
   a SHUTDOWN message which is responded to with an OK response.  A
   SHUTDOWN can be sent regardless of whether any response has been
   received to the initial OFFER.  The key purpose of the SHUTDOWN
   message is to allow the other side to know it can clean up any state
   associated with the session.

5.  Semantics & Syntax

5.1.  Reliability Model

   ROAP messages are typically carried over a reliable transport (likely
   HTTP via XMLHttpRequest or WebSockets), so the chance of message loss
   is low (though non-zero), provided that the signaling service is up.
   However, the common web reliability and scalability model is based
   on the principle that transactions are idempotent and that requests
   can just be discarded and will be retried.  A retry of a transaction
   might happen if a given host was down and the DNS round robin
   approach wanted to move to the next server, or if a server was
   overloaded, or if there was a hiccup in the network.  Web
   applications that want to work well need to deal with these issues
   to get the advantages of the general web design pattern for
   scalability and reliability.  Because only the application knows
   what its internal reliability characteristics are, the JS application
   (and whatever associated servers it uses) are ultimately responsible
   for ensuring end-to-end delivery; the browser simply assumes that
   messages which are provided to the JS will be delivered eventually.

   However, in order to maintain OFFER/ANSWER transaction state, the SDP
   state machine does need to know whether the far end has received an
   ANSWER and whether it caused an error.  To support this model, OFFER
   and ANSWER messages are acknowledged end to end with an ANSWER or OK;
   however, any retransmissions need to be handled by the JS or whatever
   is providing the transport of the ROAP messages.  The combination of
   the session IDs and seq allows the browser to detect and discard
   duplicate requests and to detect glare.

      NOTE: The split of the reliability model between the JS and browser
      is something that implementations are experimenting with in order
      to gain experience with what works best.  This is an area that is
      highly likely to change as understanding of the implications
      evolves.

5.2.  Common Fields

5.2.1.  Session IDs

   Each call is identified by a pair of session identifiers:

   offererSessionId  The offerer's half of the session ID (supplied in
      the OFFER)

   answererSessionId  The answerer's half of the session ID (supplied in
      the response to an OFFER)

   The session ID values MUST be generated so that they are globally
   unique.  Thus, the combination of both sessionIds is itself globally
   unique.  Session IDs never change for the duration of a media
   session.

   All messages MUST contain the "offererSessionId", and all messages
   other than OFFER or an error in response to an OFFER MUST contain
   both "offererSessionId" and "answererSessionId".

5.2.2.  Seq

   This is a sequence counter for the key requests that helps correlate
   responses to the correct request.

   This is a 32-bit unsigned integer.  On each new OFFER (from either
   browser) it is incremented by one.  The Seq of an OK or ANSWER is set
   to the same Seq that was used in the OFFER which caused it.  When a
   PeerConnection object originates a new session by sending an OFFER
   type message, it starts the Seq at 1.

   Note: If browser A starts an OFFER/ANSWER/OK transaction with a seq
   of 1 to browser B, then later B initiates a second
   OFFER/ANSWER/OK transaction, it will have a seq of 2.

5.2.3.  Session Tokens

   While session IDs serve to uniquely identify a session, it may be
   useful to allow one side or the other to offload state onto the other
   side (for instance to enable a stateless gateway).  The
   "setSessionToken" and "sessionToken" fields are used for this
   purpose.  When an implementation receives a message with a
   "setSessionToken" field, it MUST associate the field value with the
   session.  All future messages it sends in the session MUST carry the
   associated value in the "sessionToken" field (unless the session
   token is reset by another "setSessionToken" value).
   If no session
   token has yet been received, the "sessionToken" field MUST be
   omitted.

5.2.4.  Response Tokens

   In addition to tokens which persist for the life of a session, it is
   also possible to have tokens which are only valid for the lifetime of
   a given request/response pair.  The "setResponseToken" and
   "responseToken" fields are used for this purpose.

   When an implementation responds to a message from the other side
   (e.g., supplies an answer to an offer, or replies to an answer with
   an OK), it MUST copy into the "responseToken" field any value found
   in a "setResponseToken" field in the message being responded to.  If
   no "setResponseToken" field is present, then the "responseToken"
   field MUST be omitted.

5.3.  Media Setup

   In order to initiate sending media between the browsers, the offerer
   sends an OFFER message.  In order to accept the media, the answerer
   responds with an ANSWER message.  A sample message flow for this is
   shown below:

   participant OffererUA
   participant OffererJS
   participant AnswererJS
   participant AnswererUA
   OffererJS->OffererUA: peer=new PeerConnection();

   OffererJS->OffererUA: peer->addStream();
   OffererUA->OffererJS: sendSignalingChannel();
   OffererJS->AnswererJS: {"type":"OFFER", "sdp":"..."}
   AnswererJS->AnswererUA: peer=new PeerConnection();
   AnswererJS->AnswererUA: peer->processSignalingMessage();
   AnswererUA->AnswererJS: onconnecting();

   AnswererUA->OffererUA: ICE starts checking

   note right of AnswererUA: User decides it is OK to send video
   AnswererJS->AnswererUA: peer->addStream();
   AnswererUA->OffererUA: Media

   AnswererUA->AnswererJS: sendSignalingChannel();
   AnswererJS->OffererJS: {"type":"ANSWER","sdp":"..."}
   OffererJS->OffererUA: peer->processSignalingMessage();
   OffererUA->OffererJS: onaddstream();
   OffererUA->AnswererUA: Media

   AnswererUA->OffererUA: ICE Completes
   AnswererUA->AnswererJS: onopen();
   OffererUA->OffererJS: onopen();

   OffererUA->OffererJS: sendSignalingChannel();
   OffererJS->AnswererJS: {"type":"OK" }
   AnswererJS->AnswererUA: peer->processSignalingMessage();
   AnswererUA->AnswererJS: onaddstream();

   The above figure shows a simple message flow for negotiating media:

   o  The offerer sends an OFFER to initiate the call;
   o  At this point, ICE negotiation starts;
   o  Once the browser authorizes sending media to the far side, the
      answerer sends an ANSWER containing the media parameters; and
      finally,

   o  Once ICE is completed and an OK to the ANSWER is received, both
      sides know that media can flow.

   The contents of each of these messages are detailed below.

5.3.1.  OFFER Message

   The first OFFER message with a given offererSessionId is used to
   indicate the desire to start a media session.

5.3.1.1.  Offerer Behavior

   In order to start a new media session, an offerer constructs a new
   OFFER message with a fresh offererSessionId.  The answererSessionId
   field MUST be empty.  Like all SDP offers, the message MUST contain
   an "sdp" field with the offerer's offer.  It MUST also contain the
   tieBreaker field, containing a 32-bit random integer used for glare
   resolution as described in Section 5.4.1.

5.3.1.2.  Answerer Behavior

   An answerer can receive an OFFER in three cases:

   o  A new session (this is detected by seeing a new offererSessionId
      value);
   o  A retransmit of a new OFFER (known offererSessionId, empty
      answererSessionId); or
   o  A request to change media parameters (known offererSessionId,
      known answererSessionId, new seq value).

   The first two situations are described in this section.  The third
   case is described in Section 5.4.  Any other condition represents an
   alien packet and SHOULD be rejected with Error:NOMATCH.

   If no media session exists with the given "offererSessionId" value,
   then this is a new media session.
   The answerer has three primary
   options:

   o  Reject the request, either silently with no response or with an
      Error:REFUSED message;
   o  Reply to the OFFER message with a final ANSWER message
      (Section 5.3.2); or
   o  Send back a non-final ANSWER message and then later respond with
      a final ANSWER.

   In either of the latter two cases, the answerer performs the
   following steps:

   1.  Generate an "answererSessionId" value;
   2.  Create some local call state (i.e., a PeerConnection object) and
       bind it to the "offererSessionId"/"answererSessionId" pair.  All
       future messages on this session MUST then be delivered to that
       PeerConnection object;
   3.  Start ICE handshaking with the offerer; and finally,
   4.  Respond with a message containing an SDP answer in the "sdp"
       field.  This message (potentially with moreComing=true) will
       contain the answerer's media information and the ICE parameters.

   If an OFFER is received that has already been received and responded
   to and the media session still exists, then the answerer MUST respond
   with the same message as before.  If the session has been terminated
   in the meantime, then an Error:NOMATCH message SHOULD be sent.

5.3.2.  ANSWER

   The ANSWER message is used by the receiver of an OFFER message to
   indicate that the offer has been accepted.  The ANSWER message MUST
   contain the answererSessionId for this media session and an sdp
   parameter containing ICE candidates and the final media parameters
   for the session (although of course these can be adjusted by a new
   OFFER/ANSWER exchange; see Section 5.4).  In addition, ANSWERs MAY
   contain the moreComing flag, as described below.

5.3.2.1.  moreComing Flag

   This is a boolean flag that can only appear in an ANSWER and, if set
   to true, indicates that this answer is not the final answer that will
   be sent for the associated OFFER.  If this flag is not present, it is
   assumed to be false.
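
   The default rule for moreComing can be illustrated with a minimal
   Python sketch.  The helper name is_final_answer and the placeholder
   "sdp" strings are assumptions made for illustration only; they are
   not defined by this document.

```python
def is_final_answer(message: dict) -> bool:
    """Return True if an ANSWER is final (moreComing absent or false)."""
    if message.get("messageType") != "ANSWER":
        # The flag can only appear in an ANSWER.
        raise ValueError("moreComing is only defined for ANSWER messages")
    # An absent flag is assumed to be false.
    return not message.get("moreComing", False)


# Illustrative messages; the "sdp" values are placeholders, not real SDP.
provisional = {
    "messageType": "ANSWER",
    "offererSessionId": "13456789ABCDEF",
    "answererSessionId": "abc1234356",
    "seq": 1,
    "sdp": "(ICE candidates only)",
    "moreComing": True,
}
# The final ANSWER for the same OFFER simply omits the flag.
final = {key: value for key, value in provisional.items() if key != "moreComing"}
final["sdp"] = "(final media parameters)"
```

   With these messages, is_final_answer(provisional) is False and
   is_final_answer(final) is True.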

   One motivating use case for moreComing is where an Agent wishes to
   respond immediately to an OFFER in order to start ICE checking before
   the user has provided authorization to send media.  The Agent cannot
   send an ANSWER containing media information but can send ICE
   candidates.  In this case, the Agent could send an ANSWER that had
   moreComing=true but that allowed ICE to start.  Then later, when the
   user had authorized the media, the Agent could send an ANSWER with
   moreComing=false that indicated this was the final media selection.

   To see why simply having multiple independent offers (as opposed to
   multiple answers for a single offer) is not sufficient, consider the
   case where browser A requests video with B.  When the A side that
   sent the initial OFFER gets an ANSWER that rejects the video, it may
   very well present a UI indication that there is no media.  Five
   seconds later when browser B sends an OFFER requesting video, browser
   A may present a UI element that asks whether it is OK to do the video
   that was just rejected.  This results in a bad user experience and in
   the extreme can result in both sides always rejecting the other
   side's OFFER of video, then waiting for the user to authorize video
   that results in a new OFFER that is always rejected.

   It is easier to be able to indicate that an OFFER resulted in one
   valid ANSWER, but that the OFFER needs to be held open because other
   valid ANSWERs may follow which would replace the current one.  This
   stops the other side from generating a new OFFER while this is taking
   place.  This is also needed to support a SIP gateway doing early
   media.

5.3.3.  OK

   The OK message is used by the receiver of an ANSWER message to
   indicate that it has received the ANSWER message.  It has no contents
   itself and is merely used to stop the retransmissions of the ANSWER.

5.3.4.  ERROR

   The ERROR message is used to indicate that there has been an error.
   The contents and semantics of this message are defined in
   Section 5.6.

5.4.  Changing Media Parameters

   Once a call has been set up, it is common to want to adjust the media
   parameters, e.g., to add video to an audio-only call.  This is also
   done with the OFFER/ANSWER/OK sequence of messages, though the
   details are slightly different.

   Either side may initiate a new OFFER/ANSWER exchange by sending an
   OFFER message.  However, implementations MUST NOT attempt this for
   sessions which are still in active negotiation.  Specifically, the
   offerer MUST NOT send a new OFFER until it has received the ANSWER,
   and the answerer MUST NOT send a new OFFER until it has received the
   OK indicating receipt of the ANSWER.

   A new OFFER MUST contain a complete set of media parameters
   describing the proposed new media configuration as well as a full set
   of ICE parameters.  The recipient of a new OFFER on a valid
   connection MUST respond with an appropriate ANSWER message.  However,
   that message MAY refuse to accept the proposed new configuration.  If
   the session has been terminated in the meantime, then an
   Error:NOMATCH message SHOULD be sent.

5.4.1.  Conflicting OFFERS (glare)

   Because a change of media parameters may be initiated by either side,
   there is a potential for the change requests to occur simultaneously
   (i.e., "glare").  This document defines a glare handling procedure
   that results in immediate resolution of the glare condition allowing
   one OFFER message to continue to be processed while the other is
   terminated.  It is defined in such a way that it can interwork with
   SIP's glare handling mechanism.  However, SIP's timer-based mechanism
   is not suitable for ROAP: strict requirements on ROAP message
   transport between end-points are not possible, so timers could easily
   result in a repeated glare situation.
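
   Glare can only arise while an OFFER is outstanding, so it may help to
   see the Section 5.4 rule on when a new OFFER may be sent as code.
   The following Python sketch is illustrative only: the class name
   OfferGate and its methods are assumptions, and the sketch ignores
   moreComing, ERROR, and SHUTDOWN handling.

```python
class OfferGate:
    """Tracks whether this end-point may send a new OFFER (Section 5.4)."""

    def __init__(self) -> None:
        # None means idle; otherwise the message type we are waiting for.
        self.awaiting = None

    def can_send_offer(self) -> bool:
        return self.awaiting is None

    def sent(self, message_type: str) -> None:
        if message_type == "OFFER":
            if not self.can_send_offer():
                raise RuntimeError("negotiation still active; MUST NOT send OFFER")
            # The offerer waits for the ANSWER.
            self.awaiting = "ANSWER"
        elif message_type == "ANSWER":
            # The answerer waits for the OK that confirms the ANSWER.
            self.awaiting = "OK"

    def received(self, message_type: str) -> None:
        # Receiving the awaited message ends the active negotiation.
        if message_type == self.awaiting:
            self.awaiting = None


offerer = OfferGate()
offerer.sent("OFFER")       # new OFFERs are now blocked
offerer.received("ANSWER")  # new OFFERs are allowed again
```

   An answerer would call sent("ANSWER") and remain blocked until
   received("OK") is called.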
To achieve immediate resolution, each OFFER message includes a 32-bit unsigned integer value, the tie breaker, that is randomly generated for each new OFFER message an end-point issues. Whenever an end-point receives an OFFER message that has the same sequence number as an outstanding OFFER the end-point itself sent, a glare condition has arisen. In a glare condition, the end-point compares the received OFFER's tiebreaker value with the tiebreaker value of the outstanding OFFER. The OFFER with the greatest numerical value wins and that OFFER is allowed to continue being processed. If the received OFFER lost the tie break, an Error:CONFLICT message is sent. If it is the outstanding OFFER that lost, the end-point can expect an Error:CONFLICT message to be eventually received. However, that OFFER can immediately be considered as terminated.

Some special considerations have been made in this glare handling for interworking well with SIP glare handling as currently specified. Thus it has the notion of a gateway that converts ROAP messages into SIP messages. This process is discussed in more detail below after the basic rules are defined normatively.

A regular end-point SHALL generate a random 32-bit unsigned numerical value for each OFFER message. In case the random value is 0 or 4,294,967,295, a new random value SHALL be generated until it is neither of these values. The values 0 and 4,294,967,295 MAY be assigned to ROAP messages generated by gateways to ensure efficient glare handling towards other systems.

A ROAP message end-point that has an outstanding OFFER, i.e., an OFFER for which it has not yet received an ANSWER, SHALL upon receiving an OFFER perform the following processing:

1   Check if the incoming OFFER has an answererSessionId; if not, it is an initial offer. If the outstanding OFFER also is an initial OFFER, there is an Error. If the outstanding OFFER is not an initial OFFER and the outstanding OFFER does have an answererSessionId equal to the offererSessionId in the received message, then the sequence numbers are checked. In case the incoming OFFER's sequence number is equal to the sequence number of the outstanding OFFER, there is glare. If the sequence number is not the same and the sequence number of the incoming OFFER is larger than the outstanding OFFER's sequence number, then this message is out of order with an ANSWER to the outstanding message. If the sequence number of the incoming OFFER is lower than that of the outstanding OFFER, then this is an old request.

2   In case of glare, compare the tie-breaker values for each OFFER. The tie-breaker value that is greater than the other wins. The OFFER with the winning value is processed as if there was no glare. The OFFER with the losing value is terminated, see 3A or 3B. In case the tie-breaker values are equal, the double-glare case in 3C is invoked.

3A  The OFFER being terminated is the received one: The end-point SHALL send an Error:CONFLICT response message.

3B  The OFFER being terminated is this end-point's outstanding OFFER: The end-point knows the OFFER will be terminated and can expect an Error:CONFLICT response. The end-point can assume this termination and MAY issue a new OFFER as soon as possible after having concluded the transactions for the winning OFFER.

3C  The two tie-breaker values were equal: In this case both OFFERs are terminated and an Error:DOUBLECONFLICT message is sent. Both offerers SHOULD re-attempt their offers by generating new OFFER messages; these messages SHALL have new tie-breaker values and incremented sequence numbers. Gateways SHOULD also generate random values, as one reason for this double conflict is that two gateways have become interconnected and both select either 0 or 4,294,967,295.
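
The tie-breaker rules and steps 1 through 3C above can be sketched as follows. This is an illustration under stated assumptions rather than a reference implementation: the function names, the `seq` and `tieBreaker` property names, and the returned labels are hypothetical, the session id checks of step 1 are assumed to have passed already, and Math.random stands in for whatever random source an implementation would actually use.

```javascript
// Hypothetical sketch of the glare rules in Section 5.4.1.

const MAX_TIEBREAKER = 0xffffffff; // 4,294,967,295; left for gateways, like 0

// Draw a 32-bit unsigned tie-breaker for a regular end-point, redrawing
// until the value is neither 0 nor 4,294,967,295.
// `random` is injectable (assumption) so the redraw rule can be tested.
function generateTieBreaker(random = Math.random) {
  let value;
  do {
    value = Math.floor(random() * 2 ** 32);
  } while (value === 0 || value === MAX_TIEBREAKER);
  return value;
}

// Classify a received OFFER against our own outstanding, non-initial OFFER.
// Returns one of the following (labels are illustrative):
//   "out-of-order"         received seq is larger than ours (step 1)
//   "old-request"          received seq is lower than ours (step 1)
//   "send-conflict"        glare, ours wins; reply Error:CONFLICT (3A)
//   "process-received"     glare, received wins; ours is terminated (3B)
//   "send-doubleconflict"  glare with equal tie-breakers (3C)
function classifyReceivedOffer(outstanding, received) {
  if (received.seq > outstanding.seq) return "out-of-order";
  if (received.seq < outstanding.seq) return "old-request";
  // Equal sequence numbers: glare, so the greater tie-breaker wins.
  if (received.tieBreaker > outstanding.tieBreaker) return "process-received";
  if (received.tieBreaker < outstanding.tieBreaker) return "send-conflict";
  return "send-doubleconflict";
}
```

Because both end-points compare the same two values, they reach complementary conclusions without any timer: the side that computes "send-conflict" keeps its OFFER alive, while the side that computes "process-received" treats its own OFFER as terminated and answers the other one.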
The following figure assumes the previous message flow has happened and media is flowing.

   participant OffererUA
   participant OffererJS
   participant AnswererJS
   participant AnswererUA

   note left of OffererJS: "Hi, Let's do video"
   note right of AnswererJS: "Sounds great"
   OffererJS->OffererUA: peer->addStream( new MediaStream() );
   OffererUA->OffererJS: sendSignalingChannel();
   AnswererJS->AnswererUA: peer->addStream( new MediaStream() );
   AnswererUA->AnswererJS: sendSignalingChannel();
   OffererJS->AnswererJS: {"type":"OFFER", "tiebreaker":456, "sdp":"..."}
   AnswererJS->OffererJS: {"type":"OFFER", "tiebreaker":123, "sdp":"..."}
   AnswererJS->AnswererUA: peer->processSignalingMessage();
   OffererJS->OffererUA: peer->processSignalingMessage();

   OffererUA->OffererJS: sendSignalingChannel();
   AnswererUA->AnswererJS: sendSignalingChannel();
   OffererJS->AnswererJS: {"type":"ERROR", "errorType":"CONFLICT", "sdp":"..."}
   AnswererJS->OffererJS: {"type":"ANSWER", "sdp":"..."}
   AnswererJS->AnswererUA: peer->processSignalingMessage();
   OffererJS->OffererUA: peer->processSignalingMessage();

   OffererUA->OffererJS: sendSignalingChannel();
   OffererJS->AnswererJS: {"type":"OK"}
   AnswererJS->AnswererUA: peer->processSignalingMessage();
   AnswererUA->AnswererJS: onaddstream();

   AnswererUA->AnswererJS: sendSignalingChannel();
   AnswererJS->OffererJS: {"type":"OFFER", "tiebreaker":789, "sdp":"..."}
   OffererJS->OffererUA: peer->processSignalingMessage();
   OffererUA->OffererJS: sendSignalingChannel();
   OffererJS->AnswererJS: {"type":"ANSWER", "sdp":"..."}
   AnswererJS->AnswererUA: peer->processSignalingMessage();
   AnswererUA->OffererUA: Both way Video
   AnswererUA->AnswererJS: sendSignalingChannel();
   AnswererJS->OffererJS: {"type":"OK"}
   OffererJS->OffererUA: peer->processSignalingMessage();
   OffererUA->OffererJS: onaddstream();

5.4.2. Premature OFFER

It is an error, though technically possible, for an agent to generate a second OFFER while it already has an unanswered OFFER pending. An agent which receives such an offer MUST respond with an Error:FAILED message containing a "retryAfter" attribute generated as a random value from 0 to 10 seconds.

5.5. Notification of Media Termination

The SHUTDOWN message is used to indicate the termination of an existing session. Either side may initiate a SHUTDOWN at any time during the session, including while the initial OFFER is outstanding (i.e., before an ANSWER has been sent/received).

TODO - FIX NAMES

   participant OffererUA
   participant OffererJS
   participant AnswererJS
   participant AnswererUA

   OffererJS->OffererUA: peer->close();
   OffererUA->OffererJS: sendSignalingChannel();
   OffererJS->AnswererJS: { "type":"SHUTDOWN" }
   AnswererJS->AnswererUA: peer->processSignalingMessage();
   AnswererUA->AnswererJS: onclose();

   AnswererUA->AnswererJS: sendSignalingChannel();
   AnswererJS->OffererJS: {"type":"OK"}
   OffererJS->OffererUA: peer->processSignalingMessage();
   OffererUA->OffererJS: onclose();

Upon receipt of a SHUTDOWN which corresponds to an existing session, an agent MUST immediately terminate the session and send an OK message. Subsequent messages directed to this session MUST result in an Error:NOMATCH message. Similarly, on receipt of the OK, the agent which sent the SHUTDOWN MUST terminate the session and SHOULD respond to future messages with Error:NOMATCH.

5.6. Errors

Errors are indicated by the messageType "ERROR". All errors MUST contain an "errorType" field indicating the type of error which occurred and MUST echo the "seq" value (if any) and the session id values of the message which generated the error. The following sections describe each error type.

5.6.1. NOMATCH

An implementation which receives a message with either an unknown offererSessionId (for an OFFER) or an unknown offererSessionId/answererSessionId pair SHOULD respond with a NOMATCH error.

5.6.2. TIMEOUT

The TIMEOUT error is used to indicate that the corresponding message required some processing which timed out. For instance, an agent which is a SIP gateway translates ROAP signaling messages into SIP messages. If those SIP messages time out, the gateway would generate a TIMEOUT error.

5.6.3. REFUSED

An agent which has received an initial OFFER MAY indicate its refusal of the media session by sending a REFUSED error. Note that this error is not required; an agent MAY simply drop the OFFER with no acknowledgement at all. However, agents which do not wish to accept subsequent OFFERS SHOULD [OPEN ISSUE: MUST?] send a REFUSED in order to avoid timeouts and confusion on the offerer side.

5.6.4. CONFLICT

The CONFLICT error is used to indicate that an agent has received an OFFER while it has its own OFFER outstanding. The offerer's behavior in response to this error is defined in Section 5.4.1.

5.6.5. DOUBLECONFLICT

The DOUBLECONFLICT error is used to indicate that the tiebreaker values of the two conflicting OFFERs were the same. See Section 5.4.1.

5.6.6. FAILED

FAILED is a catch-all error indicating that something went wrong while processing a message. A FAILED error MAY contain a "retryAfter" field, which indicates the time (in seconds) after which the message MAY be retried (though retries are OPTIONAL).

6. Security Considerations

TBD

7. Companion APIs

Note: This section may need to move to the requirements draft [I-D.ietf-rtcweb-use-cases-and-requirements] but for now it is convenient to put it here just to help see how all the pieces fit together.
The offer/answer concepts in this draft are not enough to meet all the use cases of RTCWeb. They need to be combined with some additional functionality that the browser exposes to the JavaScript applications. This additional functionality loosely falls into three categories: capabilities, hints, and stats. The capabilities allow the JS application to find out what video codecs and capabilities a given browser supports before initiating a media session. The hints provide a way for the JS application to provide useful information to the browser about how the media will be used so that the browser can negotiate appropriate codecs and modes. Stats provide statistics about the current media sessions. The capabilities, hints, and stats do not need to be communicated between the two browsers, so they are not specified in this draft. However, this draft assumes the existence of APIs so that these three can be used to build complete systems. Some of the assumptions about these APIs are described in the following sections.

7.1. Capabilities

The APIs need to provide a way to find out the capabilities as defined in Section 9 of RFC 3264. This allows the JS to find out the codecs that the browser supports.

7.2. Hints

When creating a new PeerConnection in a browser, the application needs to be able to provide optional hints to the browser about preferences for the media to be negotiated. These include:

1. Whether the session has audio, video, or both;
2. Whether the audio is spoken voice or music;
3. Preferred video resolution and frame rate (perhaps these just come from the MediaTrack objects);
4. Whether the video should prefer temporal or spatial fidelity;
5.

The JS applications should also be able to update and change these hints mid-session.
Some types of hint changes may simply affect the parameters of various codecs and require no signalling to the other end of the media stream. Other types of hint changes may cause a new offer/answer exchange.

7.3. Stats

Several parts of the media session create statistics that are important to some applications. APIs should provide the JS applications with information on the following statistics:

1. Total IP data rate for the session;
2. ICE statistics including current candidates, active pairs, RTT;
3. RTP statistics including codecs selected, parameters, and bit rates;
4. RTCP statistics including packet loss rate; and
5. SRTP statistics.

8. Relationship with SIP & Jingle

SIP [RFC3261] specifies an application protocol that provides a complete solution for setting up and managing communications on the Internet. It combines both "call processing" functions - identity and name spaces, call routing, user search, call features, authentication, and so on - as well as media processing through its transport of SDP and support for the offer/answer model.

In a web context, application processing can be done through proprietary logic implemented in JavaScript/HTML, along with proprietary logic implemented in the web server, and proprietary messaging transported through HTTP and WebSockets. One of the advantages of the web is to allow a rich set of applications to be built without changing the browser. Although application processing can be done in JavaScript and the web servers, we do require raw media control in the browser. ROAP basically extracts the offer/answer media control processing used in SIP, and puts it into a protocol that can operate independently of SIP itself.
The information contained in ROAP messages corresponds closely to the offer/answer information carried by complete solutions such as SIP and Jingle, so it is straightforward to build gateways to and from ROAP. These gateways need only translate the signaling, while allowing end-to-end media without the need for media relays (except, of course, for NAT traversal). In the case of SIP, which uses SDP directly, such gateways would translate between SIP and ROAP, while transporting SDP end-to-end. In the case of Jingle [XEP-0166], it would also be necessary to translate between SDP and the Jingle offer/answer format; [XEP-0167] describes such a mapping.

9. IANA Considerations

This document requires no actions from IANA.

10. Acknowledgments

The text for the glare resolution section was provided by Magnus Westerlund. Many thanks for comments, ideas, and text from Eric Rescorla, Harald Alvestrand, Magnus Westerlund, Ted Hardie, and Stefan Hakansson.

11. Open Issues

How to negotiate support for enhancements to this JSON message. (consider supported / required )

Common way to indicate destination in offer going to a signalling gateway.

Need to generate proper ASCII art version of message flows.

12. References

12.1. Normative References

[RFC4627] Crockford, D., "The application/json Media Type for JavaScript Object Notation (JSON)", RFC 4627, July 2006.

[RFC3264] Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model with Session Description Protocol (SDP)", RFC 3264, June 2002.

[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

[RFC4566] Handley, M., Jacobson, V., and C. Perkins, "SDP: Session Description Protocol", RFC 4566, July 2006.

12.2. Informative References

[RFC3261] Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston, A., Peterson, J., Sparks, R., Handley, M., and E. Schooler, "SIP: Session Initiation Protocol", RFC 3261, June 2002.

[XEP-0166] Ludwig, S., Beda, J., Saint-Andre, P., McQueen, R., Egan, S., and J. Hildebrand, "Jingle", XSF XEP 0166, December 2009.

[XEP-0167] Ludwig, S., Saint-Andre, P., Egan, S., McQueen, R., and D. Cionoiu, "Jingle RTP Sessions", XSF XEP 0167, December 2008.

[RFC5245] Rosenberg, J., "Interactive Connectivity Establishment (ICE): A Protocol for Network Address Translator (NAT) Traversal for Offer/Answer Protocols", RFC 5245, April 2010.

[webrtc-api] Bergkvist, Burnett, Jennings, Narayanan, "WebRTC 1.0: Real-time Communication Between Browsers", October 2011. Available at http://dev.w3.org/2011/webrtc/editor/webrtc.html

[I-D.ietf-rtcweb-use-cases-and-requirements] Holmberg, C., Hakansson, S., and G. Eriksson, "Web Real-Time Communication Use-cases and Requirements", draft-ietf-rtcweb-use-cases-and-requirements-06 (work in progress), October 2011.

Authors' Addresses

Cullen Jennings
Cisco
170 West Tasman Drive
San Jose, CA 95134
USA

Phone: +1 408 421-9990
Email: fluffy@cisco.com

Jonathan Rosenberg
jdrosen.net

Email: jdrosen@jdrosen.net
URI: http://www.jdrosen.net

Justin Uberti
Google, Inc.

Randell Jesup
Mozilla

Email: randell-ietf@jesup.org