XMPP Working Group                                         L. Stout, Ed.
Internet-Draft                                                      &yet
Intended status: Standards Track                              J. Moffitt
Expires: December 8, 2014                                        Mozilla
                                                              E. Cestari
                                                        cstar industries
                                                            June 6, 2014


                   An XMPP Sub-protocol for WebSocket
                      draft-ietf-xmpp-websocket-07

Abstract

   This document defines a binding for the XMPP protocol over a
   WebSocket transport layer. A WebSocket binding for XMPP provides
   higher performance than the current HTTP binding for XMPP.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on December 8, 2014.

Copyright Notice

   Copyright (c) 2014 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1. Introduction
   2. Terminology
   3. XMPP Sub-Protocol
     3.1. Handshake
     3.2. WebSocket Messages
     3.3. XMPP Framing
       3.3.1. Framed XML Stream
       3.3.2. Framed Stream Namespace
       3.3.3. Stream Frames
     3.4. Stream Initiation
     3.5. Stream Errors
     3.6. Closing the Connection
       3.6.1. see-other-uri
     3.7. Stream Restarts
     3.8. Pings and Keepalives
     3.9. Use of TLS
     3.10. Stream Management
   4. Discovering the WebSocket Connection Method
   5. IANA Considerations
     5.1. WebSocket Subprotocol Name
     5.2. URN Sub-Namespace
   6. Security Considerations
     6.1. Intermediary Services
   7. References
     7.1. Normative References
     7.2. Informative References
   Appendix A. XML Schema
   Authors' Addresses

1. Introduction

   Applications using the Extensible Messaging and Presence Protocol
   (XMPP) (see [RFC6120] and [RFC6121]) on the Web currently make use of
   BOSH (see [XEP-0124] and [XEP-0206]), an XMPP binding to HTTP. BOSH
   is based on the HTTP long polling technique, and it suffers from high
   transport overhead compared to XMPP's native binding to TCP. In
   addition, there are a number of other known issues with long polling
   [RFC6202], which have an impact on BOSH-based systems.

   It would be much better in most circumstances to avoid tunneling XMPP
   over HTTP long polled connections and instead use the XMPP protocol
   directly. However, the APIs and sandbox that browsers have provided
   do not allow this. The WebSocket protocol [RFC6455] exists to solve
   these kinds of problems and is a bidirectional protocol that provides
   a simple message-based framing layer over raw sockets, allowing for
   more robust and efficient communication in web applications.

   The WebSocket protocol enables two-way communication between a client
   and a server, effectively emulating TCP at the application layer and
   therefore overcoming many of the problems with existing long-polling
   techniques for bidirectional HTTP. This document defines a WebSocket
   sub-protocol for XMPP.

2. Terminology

   The basic unit of framing in the WebSocket protocol is called a
   message. In XMPP, the basic unit is the stanza, which is a subset of
   the first-level children of each document in an XMPP stream (see
   Section 9 of [RFC6120]). XMPP also has a concept of messages, which
   are stanzas with a top-level element of <message/>. In this
   document, the word "message" will mean a WebSocket message, not an
   XMPP message stanza, unless otherwise noted.

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in
   [RFC2119].

3. XMPP Sub-Protocol

3.1. Handshake

   The XMPP sub-protocol is used to transport XMPP over a WebSocket
   connection. The client and server agree to this protocol during the
   WebSocket handshake (see Section 1.3 of [RFC6455]).

   During the WebSocket handshake, the client MUST include the
   value |xmpp| in the list of protocols for the
   |Sec-WebSocket-Protocol| header. The reply from the server MUST also
   contain |xmpp| in its own |Sec-WebSocket-Protocol| header in order
   for an XMPP sub-protocol connection to be established.

   Once the handshake is complete, WebSocket messages sent or received
   will conform to the protocol defined in the rest of this document.

   C: GET /xmpp-websocket HTTP/1.1
      Host: example.com
      Upgrade: websocket
      Connection: Upgrade
      Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==
      Origin: http://example.com
      ...
      Sec-WebSocket-Protocol: xmpp
      Sec-WebSocket-Version: 13

   S: HTTP/1.1 101 Switching Protocols
      Upgrade: websocket
      Connection: Upgrade
      ...
      Sec-WebSocket-Accept: s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
      Sec-WebSocket-Protocol: xmpp

   [WebSocket connection established]

   C:

   S:

3.2. WebSocket Messages

   Data frame messages in the XMPP sub-protocol MUST be of the text type
   and contain UTF-8 encoded data.

3.3. XMPP Framing

   The WebSocket XMPP sub-protocol deviates from the standard method of
   constructing and using XML streams as defined in [RFC6120] by
   adopting the message framing provided by WebSocket to delineate the
   stream open and close headers, stanzas, and other top-level stream
   elements.

3.3.1. Framed XML Stream

   The start of a framed XML stream is marked by the use of an opening
   "stream header", which is an <open/> element with the appropriate
   attributes and namespace declarations (see Section 3.3.2). The
   attributes of the <open/> element are the same as those of the
   <stream/> element defined for the
   'http://etherx.jabber.org/streams' namespace in [RFC6120], with the
   same semantics and restrictions.

   The end of a framed XML stream is denoted by the closing "stream
   header", which is a <close/> element with its associated attributes
   and namespace declarations (see Section 3.3.2).
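
   The two stream headers described above can be sketched in code. The
   following is a minimal, non-normative illustration in Python; the
   helper names (build_open, build_close, header_kind) are hypothetical
   and are not defined by this document. The "to", "version", and
   "xml:lang" attributes are assumed to be the stream header attributes
   of [RFC6120], and the standard-library XML parser is used only for
   brevity.

```python
import xml.etree.ElementTree as ET
from typing import Optional
from xml.sax.saxutils import quoteattr

# Namespace that qualifies both framed stream headers (Section 3.3.2).
FRAMING_NS = "urn:ietf:params:xml:ns:xmpp-framing"


def build_open(to: str, version: str = "1.0", lang: Optional[str] = None) -> str:
    """Return an opening stream header as one standalone XML document.

    Hypothetical helper: the attribute set mirrors the RFC 6120 stream
    header attributes a client would normally send.
    """
    attrs = {"xmlns": FRAMING_NS, "to": to, "version": version}
    if lang is not None:
        attrs["xml:lang"] = lang
    # quoteattr() escapes the value and wraps it in quotes.
    rendered = " ".join(f"{name}={quoteattr(value)}" for name, value in attrs.items())
    return f"<open {rendered}/>"


def build_close() -> str:
    """Return a closing stream header as one standalone XML document."""
    return f"<close xmlns={quoteattr(FRAMING_NS)}/>"


def header_kind(frame: str) -> Optional[str]:
    """Return "open" or "close" if the frame is a framed stream header.

    Returns None for anything else, including frames that are not
    well-formed XML or whose root element is in another namespace.
    """
    try:
        root = ET.fromstring(frame)
    except ET.ParseError:
        return None
    for kind in ("open", "close"):
        if root.tag == f"{{{FRAMING_NS}}}{kind}":
            return kind
    return None
```

   Each helper returns a complete XML document, so its result can be
   sent as a single WebSocket text message without further wrapping.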

   The introduction of the <open/> and <close/> elements is motivated by
   the parsable XML document framing restriction in Section 3.3.3. As a
   consequence, note that a framed XML stream does not provide a
   wrapping <stream:stream/> element encompassing the entirety of the
   XML stream, as in [RFC6120].

3.3.2. Framed Stream Namespace

   The XML stream "headers" (the <open/> and <close/> elements) MUST be
   qualified by the namespace 'urn:ietf:params:xml:ns:xmpp-framing' (the
   "framed stream namespace"). If this rule is violated, the entity
   that receives the offending stream header MUST close the stream with
   an error, which MUST be <invalid-namespace/> (see Section 4.9.3.10 of
   [RFC6120]).

3.3.3. Stream Frames

   The individual frames of a framed XML stream have a one-to-one
   correspondence with WebSocket messages, and MUST be parsable as
   standalone XML documents, complete with all relevant namespace and
   language declarations. The inclusion of XML declarations, however,
   is NOT RECOMMENDED, as WebSocket messages are already mandated to be
   UTF-8 encoded and a declaration would only add a constant size
   overhead to each message.

   The first character of each frame MUST be a '<' character.

   Every XMPP stanza or other XML element (including the stream open and
   close headers) sent directly over the XML stream MUST be sent in its
   own frame.

   Example of a WebSocket message that contains an independently
   parsable XML document:

       Every WebSocket message is parsable by itself.

   Note that for stream features and errors, there is no parent context
   element providing the "stream" namespace prefix as in [RFC6120], and
   thus the stream prefix MUST be declared or an unprefixed form MUST be
   used:

   -- OR --

3.4. Stream Initiation

   The first message sent after the WebSocket opening handshake MUST be
   from the initiating entity, and MUST be an <open/> element qualified
   by the "urn:ietf:params:xml:ns:xmpp-framing" namespace and with the
   same attributes mandated for the <stream> opening tag as described in
   Section 4.7 of [RFC6120].

   The receiving entity MUST respond with either an <open/> element
   (whose attributes match those described in Section 4.7 of [RFC6120])
   or a <close/> element (see Section 3.6.1).

   An example of a successful stream initiation exchange:

   C:

   S:

   Clients MUST NOT multiplex XMPP streams over the same WebSocket.

3.5. Stream Errors

   Stream level errors in XMPP are terminal. Should such an error
   occur, the server MUST send the stream error as a complete element in
   a message to the client.

   If the error occurs during the opening of a stream, the server MUST
   send the initial open element response, followed by the stream level
   error in a second WebSocket message frame. The server MUST then
   close the connection as specified in Section 3.6.

3.6. Closing the Connection

   The closing process for the XMPP sub-protocol mirrors that of the
   XMPP TCP binding as defined in Section 4.4 of [RFC6120], except that
   a <close/> element is used instead of the ending </stream:stream>
   tag.

   Either the server or the client may close the connection at any time.
   Before closing the connection, the closing party is expected to first
   close the XMPP stream (if one has been opened) by sending a message
   with the <close/> element, qualified by the
   "urn:ietf:params:xml:ns:xmpp-framing" namespace. The stream is
   considered closed when a corresponding <close/> element is received
   from the other party, and the XMPP session is ended.

   To then close the WebSocket connection, the closing party MUST
   initiate the WebSocket closing handshake (see Section 7.1.2 of
   [RFC6455]).
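
   The ordering described above (close the XMPP stream first, then the
   WebSocket connection) can be sketched as a small state tracker for
   the closing party. This is a non-normative illustration in Python
   under stated assumptions: the class name ClosingParty, the state
   labels, and the two callbacks (one that sends a WebSocket text
   message, one that starts the WebSocket closing handshake) are
   hypothetical and would be supplied by whatever WebSocket library is
   in use. Only the side that initiates the close is modeled.

```python
import xml.etree.ElementTree as ET
from typing import Callable

FRAMING_NS = "urn:ietf:params:xml:ns:xmpp-framing"
CLOSE_FRAME = f'<close xmlns="{FRAMING_NS}"/>'


def is_close(frame: str) -> bool:
    """True if the frame is a <close/> in the framed stream namespace."""
    try:
        return ET.fromstring(frame).tag == f"{{{FRAMING_NS}}}close"
    except ET.ParseError:
        return False


class ClosingParty:
    """Tracks the close sequence for the side that initiates it.

    send_text and start_ws_close are assumed callbacks, not part of
    any particular WebSocket API.
    """

    def __init__(
        self,
        send_text: Callable[[str], None],
        start_ws_close: Callable[[], None],
    ) -> None:
        self._send_text = send_text
        self._start_ws_close = start_ws_close
        self.state = "open"

    def close_stream(self) -> None:
        # Step 1: close the XMPP stream by sending <close/> in its own frame.
        if self.state == "open":
            self._send_text(CLOSE_FRAME)
            self.state = "close-sent"

    def on_message(self, frame: str) -> None:
        # Step 2: the stream counts as closed once the peer's <close/> arrives.
        if self.state == "close-sent" and is_close(frame):
            self.state = "stream-closed"
            # Step 3: only now start the WebSocket closing handshake.
            self._start_ws_close()
            self.state = "ws-closing"
```

   Frames other than the peer's closing header leave the tracker in the
   "close-sent" state, so stanzas still in flight do not trigger the
   WebSocket closing handshake early.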

   An example of ending an XMPP over WebSocket session by first closing
   the XMPP stream layer and then the WebSocket connection layer:

   Client                         (XMPP WSS)                      Server
   |  |                                                             |  |
   |  | <close xmlns="urn:ietf:params:xml:ns:xmpp-framing" />       |  |
   |  |------------------------------------------------------------>|  |
   |  |       <close xmlns="urn:ietf:params:xml:ns:xmpp-framing" /> |  |
   |  |<------------------------------------------------------------|  |
   |  |                                                             |  |
   |  |                      (XMPP Stream Closed)                   |  |
   |  +-------------------------------------------------------------+  |
   |                                                                   |
   | WS CLOSE FRAME                                                    |
   |------------------------------------------------------------------>|
   |                                                    WS CLOSE FRAME |
   |<------------------------------------------------------------------|
   |                                                                   |
   |                         (Connection Closed)                       |
   +-------------------------------------------------------------------+

   If the WebSocket connection is closed or broken without the XMPP
   stream having been closed first, then the XMPP stream is considered
   implicitly closed and the XMPP session ended; however, if the use of
   stream management resumption was negotiated (see [XEP-0198]), the
   server SHOULD consider the XMPP session still alive for a period of
   time based on server policy as specified in [XEP-0198].

3.6.1. see-other-uri

   If the server wishes at any point to instruct the client to move to a
   different WebSocket endpoint (e.g. for load balancing purposes), the
   server MAY send a <close/> element and set the "see-other-uri"
   attribute to the URI of the new connection endpoint (which MAY be for
   a different transport method, such as BOSH (see [XEP-0124] and
   [XEP-0206])).

   Clients MUST NOT accept suggested endpoints with a lower security
   context (e.g. moving from a "wss://" endpoint to a "ws://" or
   "http://" endpoint).

   An example of the server closing a stream and instructing the client
   to connect at a different WebSocket endpoint:

   S:

3.7. Stream Restarts

   Whenever a stream restart is mandated, both the server and client
   streams are implicitly closed and new streams MUST be opened, using
   the same process as in Section 3.4. The client MUST send a new
   stream <open/> element and MUST NOT send a closing <close/> element.

   An example of restarting the stream after successful SASL
   negotiation:

   S:

   [Streams implicitly closed]

   C:

3.8. Pings and Keepalives

   Traditionally, XMPP servers and clients often send "whitespace
   keepalives" (see Section 4.6.1 of [RFC6120]) between stanzas to
   maintain an XML stream. However, for the XMPP sub-protocol, each
   message is required to start with a '<' character, and as such
   whitespace keepalives MUST NOT be used.

   As alternatives, the XMPP Ping extension [XEP-0199] and the XMPP
   Stream Management extension [XEP-0198] provide pinging mechanisms.
   Either of these extensions (or both) MAY be used to determine the
   state of the connection.

   Clients and servers MAY also use WebSocket ping control frames for
   this purpose, but note that some environments, such as browsers, do
   not provide access for generating or monitoring ping control frames.

3.9. Use of TLS

   TLS cannot be used at the XMPP sub-protocol layer because the
   sub-protocol does not allow for raw binary data to be sent. Instead,
   when TLS is used, it MUST be enabled at the WebSocket layer using
   secure WebSocket connections via the |wss| URI scheme. (See
   Section 10.6 of [RFC6455].)

   Because TLS is to be provided outside of the XMPP sub-protocol layer,
   a server MUST NOT advertise TLS as a stream feature (see Section 4.6
   of [RFC6120]), and a client MUST ignore any advertised TLS stream
   feature, when using the XMPP sub-protocol.

3.10. Stream Management

   In order to alleviate the problems of temporary disconnections, the
   XMPP Stream Management extension [XEP-0198] MAY be used to confirm
   when stanzas have been received by the server.

   In particular, session resumption as defined in [XEP-0198] MAY be
   used to allow for recreating the same stream session state after a
   temporary network unavailability or after navigating to a new URL in
   a browser.

4. Discovering the WebSocket Connection Method

   Section 3 of [RFC6120] defines a procedure for connecting to an XMPP
   server, including ways to discover the TCP/IP address and port of the
   server. When using the WebSocket binding as specified in this
   document (instead of the TCP binding as specified in [RFC6120]), a
   client needs an alternative way to discover information about the
   server's connection methods, since web browsers and other
   WebSocket-capable software applications typically cannot obtain such
   information from the Domain Name System.

   The alternative lookup process uses Web Host Metadata [RFC6415] and
   Web Linking [RFC5988], where the link relation type is
   "urn:xmpp:alt-connections:websocket" as described in Discovering
   Alternative XMPP Connection Methods [XEP-0156]. An example follows.

   Servers MAY expose discovery information using host-meta documents,
   and clients MAY use such information to determine the WebSocket
   endpoint for a server.

   Web host metadata MAY be used to establish trust between the XMPP
   server domain and the WebSocket endpoint, particularly in
   multi-tenant situations where the same WebSocket endpoint is serving
   multiple XMPP domains.

5. IANA Considerations

5.1. WebSocket Subprotocol Name

   This specification requests IANA to register the WebSocket XMPP
   sub-protocol under the "WebSocket Subprotocol Name" Registry with the
   following data:

   Subprotocol Identifier: xmpp

   Subprotocol Common Name: WebSocket Transport for the Extensible
      Messaging and Presence Protocol (XMPP)

   Subprotocol Definition: this document

5.2. URN Sub-Namespace

   A URN sub-namespace for framing of Extensible Messaging and Presence
   Protocol (XMPP) streams is defined as follows.

   URI: urn:ietf:params:xml:ns:xmpp-framing

   Specification: this document

   Description: This is the XML namespace name for framing of
      Extensible Messaging and Presence Protocol (XMPP) streams as
      defined by RFC XXXX.

   Registrant Contact: IESG

6. Security Considerations

   Since application level TLS cannot be used (see Section 3.9),
   applications need to protect the privacy of XMPP traffic at the
   WebSocket or other appropriate layer.

   Browser-based applications are not able to inspect and verify at the
   application layer the certificate used for the WebSocket connection
   to ensure that it corresponds to the domain specified as the "to"
   address of the XMPP stream. For hosts whose domain matches the
   origin for the WebSocket connection, that check is already performed
   by the browser. However, in situations where the domain of the XMPP
   server might not match the origin for the WebSocket endpoint
   (especially multi-tenant hosting situations), the web host metadata
   method (see [RFC6415] and [XEP-0156]) MAY be used to delegate trust
   from the XMPP server domain to the WebSocket origin.

   When presented with a new WebSocket endpoint via the "see-other-uri"
   attribute of a <close/> element, clients MUST NOT accept the
   suggestion if the security context of the new endpoint is lower than
   the current one, in order to prevent downgrade attacks from a
   "wss://" endpoint to "ws://".

   The Security Considerations for both WebSocket (see Section 10 of
   [RFC6455]) and XMPP (see Section 13 of [RFC6120]) apply to the
   WebSocket XMPP sub-protocol.

6.1. Intermediary Services

   If the XMPP over WebSocket endpoint is provided as an intermediary
   service between a backend XMPP service and the client, then it SHOULD
   encrypt its connection to the backend XMPP service using any
   available and appropriate technologies, such as TLS and StartTLS.

   If data privacy is desired, a client SHOULD encrypt its messages
   using an application specific end-to-end encryption technology, as
   there is no way for the client to ensure that the XMPP over WebSocket
   service is using an encrypted connection to the backend XMPP service.
   Methods for doing so are beyond the scope of this specification.

7. References

7.1. Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC6120]  Saint-Andre, P., "Extensible Messaging and Presence
              Protocol (XMPP): Core", RFC 6120, March 2011.

   [RFC6455]  Fette, I. and A. Melnikov, "The WebSocket Protocol", RFC
              6455, December 2011.

7.2. Informative References

   [RFC5988]  Nottingham, M., "Web Linking", RFC 5988, October 2010.

   [RFC6121]  Saint-Andre, P., "Extensible Messaging and Presence
              Protocol (XMPP): Instant Messaging and Presence", RFC
              6121, March 2011.

   [RFC6202]  Loreto, S., Saint-Andre, P., Salsano, S., and G. Wilkins,
              "Known Issues and Best Practices for the Use of Long
              Polling and Streaming in Bidirectional HTTP", RFC 6202,
              April 2011.

   [RFC6415]  Hammer-Lahav, E. and B. Cook, "Web Host Metadata", RFC
              6415, October 2011.

   [XEP-0124]
              Paterson, I., Smith, D., Saint-Andre, P., Moffitt, J., and
              L. Stout, "Bidirectional-streams Over Synchronous HTTP
              (BOSH)", XSF XEP 0124, November 2013.

   [XEP-0156]
              Hildebrand, J., Saint-Andre, P., and L. Stout,
              "Discovering Alternative XMPP Connection Methods", XSF XEP
              0156, January 2014.

   [XEP-0198]
              Karneges, J., Saint-Andre, P., Hildebrand, J., Forno, F.,
              Cridland, D., and M. Wild, "Stream Management", XSF XEP
              0198, June 2011.

   [XEP-0199]
              Saint-Andre, P., "XMPP Ping", XSF XEP 0199, June 2009.

   [XEP-0206]
              Paterson, I., Saint-Andre, P., and L. Stout, "XMPP Over
              BOSH", XSF XEP 0206, November 2013.

   [XML-SCHEMA]
              Thompson, H., Maloney, M., Mendelsohn, N., and D. Beech,
              "XML Schema Part 1: Structures Second Edition", World Wide
              Web Consortium Recommendation REC-xmlschema-1-20041028,
              October 2004.

Appendix A. XML Schema

   The following schema formally defines the
   'urn:ietf:params:xml:ns:xmpp-framing' namespace used in this
   document, in conformance with W3C XML Schema [XML-SCHEMA]. Because
   validation of XML streams and stanzas is optional, this schema is not
   normative and is provided for descriptive purposes only.

Authors' Addresses

   Lance Stout (editor)
   &yet

   Email: lance@andyet.net


   Jack Moffitt
   Mozilla

   Email: jack@metajack.im


   Eric Cestari
   cstar industries

   Email: eric@cstar.io