Network Working Group                                      H. Alvestrand
Internet-Draft                                                    Google
Intended status: Standards Track                          March 13, 2011
Expires: September 14, 2011


      Overview: Real Time Protocols for Browser-based Applications
             draft-alvestrand-dispatch-rtcweb-protocols-01

Abstract

   This document gives an overview and context of a protocol suite
   intended for use with real-time applications that can be deployed in
   browsers - "real time communication on the Web".

   It intends to serve as a starting and coordination point to make sure
   all the parts that are needed to achieve this goal are findable, and
   that the parts that belong in the Internet protocol suite are fully
   specified and on the right publication track.

   This work is an attempt to synthesize the input of many people, but
   makes no claims to fully represent the views of any of them.  All
   parts of the document should be regarded as open for discussion, with
   the intended discussion forum being the "RTCWEB" WG (in formation).

   Currently, discussion is on the rtc-web@alvestrand.no mailing list.

Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.
   It is inappropriate to use Internet-Drafts as reference material or
   to cite them other than as "work in progress."

   This Internet-Draft will expire on September 14, 2011.

Copyright Notice

   Copyright (c) 2011 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  On interoperability and innovation
   3.  Functionality groups
   4.  Data transport
   5.  Data framing and securing
   6.  Data formats
   7.  Connection management
   8.  Presentation and control
   9.  Local system support functions
   10. IANA Considerations
   11. Security Considerations
   12. Acknowledgements
   13. References
     13.1.  Normative References
     13.2.  Informative References
   Author's Address

1.  Introduction

   The Internet was, from very early in its lifetime, considered a
   possible vehicle for the deployment of real-time, interactive
   applications - with the most easily imaginable being audio
   conversations (aka "Internet telephony") and videoconferencing.

   The first attempts to build this were dependent on special networks,
   special hardware and custom-built software, often at very high prices
   or at low quality, placing great demands on the infrastructure.

   As the available bandwidth has increased, and as processors and other
   hardware have become ever faster, the barriers to participation have
   decreased, and it is possible to deliver a satisfactory experience on
   commonly available computing hardware.

   Still, there are a number of barriers to the ability to communicate
   universally - one of these is that there is, as yet, no single set
   of communication protocols that all agree should be made available
   for communication; another is the sheer lack of universal
   identification systems (such as those provided by telephone numbers
   or email addresses in other communications systems).

   Development of The Universal Solution has proved hard, however, for
   all the usual reasons.  This memo aims to take a more building-block-
   oriented approach, and tries to find consensus on a set of substrate
   components that we think will be useful in any real-time
   communications system.

   The last few years have also seen a new platform rise for deployment
   of services: the browser-embedded application, or "Web application".
   It turns out that as long as the browser platform has the necessary
   interfaces, it is possible to deliver almost any kind of service on
   it.

   Traditionally, these interfaces have been delivered by plugins, which
   had to be downloaded and installed separately from the browser; in
   the development of HTML5, much promise is seen in the possibility of
   making those interfaces available in a standardized way within the
   browser.

   Other efforts, for instance the W3C Web Applications and Device API
   working groups, focus on making standardized APIs and interfaces
   available, within or alongside the HTML5 effort, for those functions;
   this memo concentrates on specifying the protocols and subprotocols
   that are needed to specify the interactions that happen across the
   network.

2.  On interoperability and innovation

   The "Mission statement of the IETF" [RFC3935] states that "The
   benefit of a standard to the Internet is in interoperability - that
   multiple products implementing a standard are able to work together
   in order to deliver valuable functions to the Internet's users."

   Communication on the Internet frequently occurs in two phases:

   o  Two parties communicate, through some mechanism, what
      functionality they both are able to support.

   o  They use that shared communicative functionality to communicate,
      or, failing to find anything in common, give up on communication.

   There are often many choices that can be made for communicative
   functionality; the history of the Internet is rife with the proposal,
   standardization, implementation, and success or failure of many types
   of options, in all sorts of protocols.

   The goal of having a mandatory-to-implement function set is to
   prevent negotiation failure, not to preempt or prevent negotiation.
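
   As a minimal sketch of the two phases described above, assuming that
   capabilities can be modelled as plain sets of names, the following
   Python fragment shows how a shared baseline keeps negotiation from
   failing without forcing the baseline to be chosen.  The names
   MANDATORY, conforming and negotiate, and the labels "base-audio",
   "codec-a" and "codec-b", are hypothetical and are not taken from any
   specification.

```python
from typing import Iterable, Optional

# Hypothetical mandatory-to-implement set; the label is illustrative only.
MANDATORY = frozenset({"base-audio"})


def conforming(extras: Iterable[str]) -> frozenset:
    """Capabilities of a conforming party: the baseline plus any extras."""
    return MANDATORY | frozenset(extras)


def negotiate(local: Iterable[str], remote: Iterable[str],
              preference: Iterable[str]) -> Optional[str]:
    """Phase 1: find what both parties support.

    Returns the most preferred shared capability for use in phase 2,
    or None when nothing is shared and communication is given up.
    """
    common = frozenset(local) & frozenset(remote)
    for name in preference:
        if name in common:
            return name
    return None
```

   With these assumptions, two conforming parties that share no optional
   capability still agree on "base-audio", two parties that share
   "codec-a" and prefer it select "codec-a", and two non-conforming
   parties with disjoint sets get None.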

   The presence of a mandatory-to-implement function set strongly
   changes the marketplace of deployment, in that it gives a guarantee
   that, as long as you conform to a specification, and the other party
   is willing to accept communication at the base level of that
   specification, you can communicate successfully.

   The alternative - that of having no mandatory-to-implement set - does
   not mean that you cannot communicate; it merely means that in order
   to be part of the communications partnership, you have to implement
   the standard "and then some" - that "and then some" usually being
   called a profile of some sort; in the version most antithetical to
   the Internet ethos, that "and then some" consists of having to use a
   specific vendor's product only.

3.  Functionality groups

   The functionality groups that are needed can be specified, more or
   less from the bottom up, as:

   o  Data transport: TCP, UDP and the means to securely set up
      connections between entities.

   o  Data framing: RTP and other data formats that serve as containers,
      and their functions for data confidentiality and integrity.

   o  Data formats: Codec specifications, format specifications and
      functionality specifications for the data passed between systems.
      Audio and video codecs, as well as formats for data and document
      sharing, belong in this category.

   o  Connection management: Setting up connections, agreeing on data
      formats, changing data formats during the duration of a call; SIP
      and Jingle/XMPP belong in this category.

   o  Presentation and control: What needs to happen in order to ensure
      that interactions behave in a non-surprising manner.  This can
      include floor control, screen layout, voice activated image
      switching and other such functions - where parts of the system
      require cooperation between parties.  Cisco/Tandberg's TIP was
      one attempt at specifying this functionality.

   o  Local system support functions: These are things that need not be
      specified uniformly, because each participant may choose to do
      these in a way of the participant's choosing, without affecting
      the bits on the wire in a way that others have to be cognizant of.
      Examples in this category include echo cancellation (some forms of
      it), local authentication and authorization mechanisms, OS access
      control and the ability to do local recording of conversations.

   Within each functionality group, it is important to preserve both
   freedom to innovate and the ability for global communication.
   Freedom to innovate is helped by doing the specification in terms of
   interfaces, not implementation; any implementation able to
   communicate according to the interfaces is a valid implementation.
   Ability to communicate globally is helped both by having core
   specifications be unencumbered by IPR issues and by having the
   formats and protocols be fully enough specified to allow for
   independent implementation.

   One can think of the first three groups as forming a "media transport
   infrastructure", and of the last three groups as forming a "media
   service".  In many contexts, it makes sense to use a common
   specification for the media transport infrastructure, which can be
   embedded in browsers and accessed using standard interfaces, and "let
   a thousand flowers bloom" in the "media service" layer; to achieve
   interoperable services, however, at least the first five of the six
   groups need to be specified.

4.  Data transport

   Datagram transport is the subject of a separate draft, "A Datagram
   Transport for the RTC-Web profile"
   [I-D.alvestrand-dispatch-rtcweb-datagram].  The basic approach is to
   use ICE as a setup mechanism, and to specify mechanisms to use ICE
   over connections that utilize UDP and TCP if needed to support a
   basic datagram-passing function with adequate security.
   In order to deal with complex NAT/firewall situations, relaying using
   TURN MUST be supported.

   For octet-stream transport, TCP is used.  (QUESTION: Do we need a TCP
   relay specification?  The use of TURN over TCP and TLS is specified
   in the TURN RFC - is it suitable?)

   (The role of Web Sockets [I-D.ietf-hybi-thewebsocketprotocol] needs
   to be clarified.)

   The data transport MUST behave reasonably in the presence of
   congested networks; this is usually interpreted as reducing the send
   rate when congestion is encountered.  TCP, when correctly
   implemented, does this automatically; this is not the case with UDP,
   and the RTP framing specification does not contain a congestion
   control component.

   Determining a useful congestion handling mechanism is a high
   priority for work with this specification suite.

5.  Data framing and securing

   Data framing and securing use RTP [RFC3550] and SRTP [RFC3711].  The
   RTP/SAVP profile, defined as part of SRTP, is supported, and
   "extended RTCP", RTP/AVPF [RFC4585], with its secured version
   RTP/SAVPF [RFC5124], is used in order to support codec functionality
   that depends on this RTP profile, such as

   The implementation of SRTP used MUST support encryption using AES-CM
   with MIC, on both RTP and RTCP channels.  (Note that, as with all
   mandatory-to-implement functions, there is no requirement that these
   protocols be used, just that it is possible to negotiate them.)

   [OPEN ISSUE: We need to specify a securable format of passing data
   that is not RTP.  This should probably be a profile on using TLS
   and/or DTLS, although specifying a "data codec" and using SRTP has
   been proposed too.]

6.  Data formats

   The intent of this specification is to allow each communications
   event to use the data formats that are best suited for that
   particular instance, where a format is supported by both sides of the
   connection.
   However, a minimum standard is greatly helpful in order to ensure
   that communication can be achieved.  This document specifies a
   minimum baseline that will be supported by all implementations of
   this specification, and leaves further codecs to be included at the
   will of the implementor.

   NOTE IN DRAFT: The particular codecs named are NOT A DECISION.  They
   are included to illustrate possible choices, and to check with the
   group that the references given are necessary and sufficient for the
   purpose of specifying an interoperable codec suite.

   In audio, the OPUS codec [I-D.ietf-codec-opus] MUST be supported.
   For ease of interoperability with gateways to older equipment, G.711
   U-law, audio/PCMU, defined in RFC 1890 [RFC1890] section 4.4.12, is
   also mandatory to implement.  There is no third mandatory-to-
   implement audio codec.

   In video, the VP8 codec [I-D.westin-payload-vp8] MUST be supported.

   The Theora codec is also freely available.  H.264/AVC and H.264/SVC
   [I-D.ietf-avt-rtp-svc] are used widely enough that supporting them
   gives a wider range of communications partners.

7.  Connection management

   This specification is silent on the definition of connection
   management protocols.  It envisions that implementors will make a
   choice on whether to implement connection management protocols as a
   downloadable component, as a browser plug-in, or as a frontend/
   backend split, where a part of the protocol machinery is downloaded
   into the browser and uses some mechanism (for instance WebSockets) to
   communicate back to a backend implementing the rest of the connection
   management protocol.

   XMPP, and its Jingle component, has proved a versatile tool in
   building interoperable communities, and so has SIP.
   This suite requires that the browser support establishing and
   describing connections using a data format capable of representing
   the information needed by these two protocols, such as one that can
   be one-to-one transformed into SDP.  The exact specification of this
   API is done elsewhere; this API is powerful enough that all
   interesting parameters of the transport mechanisms specified above
   are settable, and clear enough that how to connect the API to the
   protocols is obvious.

8.  Presentation and control

   The most important part of control is the user's control over the
   browser's interaction with input/output devices and communications
   channels.  It is important that the user have some way of figuring
   out where his audio, video or texting is being sent, for what
   purported reason, and what guarantees are made by the parties that
   form part of this control channel.  This is largely a local function
   between the browser, the underlying operating system and the user
   interface; this is being worked on in .

9.  Local system support functions

   These are characterized by the fact that the quality of these
   functions strongly influences the user experience, but the exact
   algorithm does not need coordination.  In some cases (for instance
   echo cancellation, as described below), the overall system definition
   may need to specify that the overall system needs to have some
   characteristics for which these facilities are useful, without
   requiring them to be implemented a certain way.

   Local functions include echo cancellation, volume control, camera
   management including focus, zoom, pan/tilt controls (if available),
   and more.

   Certain parts of the system SHOULD conform to certain properties, for
   instance:

   o  Echo cancellation should be good enough that feedback (defined as
      a rising volume of sound with no local sound input) does not
      occur.

   o  Privacy concerns must be satisfied; for instance, if remote
      control of camera is offered, the APIs should be available to let
      the local participant figure out who's controlling the camera,
      and possibly decide to revoke the permission for camera usage.

   o  Automatic gain control, if present, should normalize a speaking
      voice into

10.  IANA Considerations

   This document makes no request of IANA.

   Note to RFC Editor: this section may be removed on publication as an
   RFC.

11.  Security Considerations

   Security of the web-enabled real time communications comes in several
   pieces:

   o  Security of the components: The browsers, and other servers
      involved.  The most target-rich environment here is probably the
      browser; the aim here should be that the introduction of these
      components introduces no additional vulnerability.

   o  Security of the communication channels: It should be easy for a
      participant to reassure himself of the security of his
      communication - by verifying the crypto parameters of the links he
      himself participates in, and by getting reassurances from the
      other parties to the communication that they promise that
      appropriate measures are taken.

   o  Security of the partners' identity: verifying that the
      participants are who they say they are (when positive
      identification is appropriate), or that their identity cannot be
      uncovered (when anonymity is a goal of the application).

   This specification addresses some, but not all, of these concerns,
   and makes some assumptions about the security considerations of other
   parts of the environment; it is up to the implementor to see that
   these security assumptions are warranted.  In particular:

   o  We assume that the ICE security mechanism is a necessary and
      sufficient criterion for accepting that a connection attempt is
      from a communications partner.
      This means that we trust the randomness of ICE "usernames" and
      the security of ICE "passwords".

   o  We assume that the SRTP key exchange mechanisms and security
      profiles specified provide an adequate level of protection for
      audio and video media.

   (there needs to be more text here)

12.  Acknowledgements

13.  References

13.1.  Normative References

   [I-D.alvestrand-dispatch-rtcweb-datagram]
              Alvestrand, H., "A Datagram Transport for the RTC-Web
              profile", draft-alvestrand-dispatch-rtcweb-datagram-01
              (work in progress), February 2011.

   [I-D.ietf-codec-opus]
              Valin, J. and K. Vos, "Definition of the Opus Audio
              Codec", draft-ietf-codec-opus-04 (work in progress),
              March 2011.

   [I-D.ietf-hybi-thewebsocketprotocol]
              Fette, I., "The WebSocket protocol",
              draft-ietf-hybi-thewebsocketprotocol-06 (work in
              progress), February 2011.

   [I-D.westin-payload-vp8]
              Westin, P. and H. Lundin, "Proposal for the IETF on "RTP
              Payload Format for VP8 Video"",
              draft-westin-payload-vp8-02 (work in progress),
              March 2011.

   [RFC1890]  Schulzrinne, H., "RTP Profile for Audio and Video
              Conferences with Minimal Control", RFC 1890, January 1996.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC3550]  Schulzrinne, H., Casner, S., Frederick, R., and V.
              Jacobson, "RTP: A Transport Protocol for Real-Time
              Applications", STD 64, RFC 3550, July 2003.

   [RFC3711]  Baugher, M., McGrew, D., Naslund, M., Carrara, E., and K.
              Norrman, "The Secure Real-time Transport Protocol (SRTP)",
              RFC 3711, March 2004.

   [RFC4585]  Ott, J., Wenger, S., Sato, N., Burmeister, C., and J. Rey,
              "Extended RTP Profile for Real-time Transport Control
              Protocol (RTCP)-Based Feedback (RTP/AVPF)", RFC 4585,
              July 2006.

   [RFC5124]  Ott, J. and E. Carrara, "Extended Secure RTP Profile for
              Real-time Transport Control Protocol (RTCP)-Based Feedback
              (RTP/SAVPF)", RFC 5124, February 2008.

13.2.  Informative References

   [I-D.ietf-avt-rtp-svc]
              Wenger, S., Wang, Y., Schierl, T., and A. Eleftheriadis,
              "RTP Payload Format for Scalable Video Coding",
              draft-ietf-avt-rtp-svc-27 (work in progress),
              February 2011.

   [RFC3935]  Alvestrand, H., "A Mission Statement for the IETF",
              BCP 95, RFC 3935, October 2004.

Author's Address

   Harald T. Alvestrand
   Google
   Kungsbron 2
   Stockholm,   11122
   Sweden

   Email: harald@alvestrand.no