MMUSIC                                                 S. Perreault, Ed.
Internet-Draft                                                  Viagenie
Intended status: Standards Track                            J. Rosenberg
Expires: April 16, 2010                                            Cisco
                                                        October 13, 2009


    TCP Candidates with Interactive Connectivity Establishment (ICE)
                      draft-ietf-mmusic-ice-tcp-08

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on April 16, 2010.

Copyright Notice

   Copyright (c) 2009 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents in effect on the date of
   publication of this document (http://trustee.ietf.org/license-info).
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.

Abstract

   Interactive Connectivity Establishment (ICE) defines a mechanism for
   NAT traversal for multimedia communication protocols based on the
   offer/answer model of session negotiation. ICE works by providing a
   set of candidate transport addresses for each media stream, which are
   then validated with peer-to-peer connectivity checks based on Session
   Traversal Utilities for NAT (STUN). ICE provides a general framework
   for describing candidates, but only defines UDP-based transport
   protocols. This specification extends ICE to TCP-based media,
   including the ability to offer a mix of TCP and UDP-based candidates
   for a single stream.

Table of Contents

   1.  Introduction
   2.  Overview of Operation
   3.  Sending the Initial Offer
     3.1.  Gathering Candidates
     3.2.  Prioritization
     3.3.  Choosing Default Candidates
     3.4.  Encoding the SDP
   4.  Receiving the Initial Offer
     4.1.  Verifying ICE Support
     4.2.  Forming the Check Lists
   5.  Connectivity Checks
     5.1.  STUN Client Procedures
       5.1.1.  Sending the Request
     5.2.  STUN Server Procedures
   6.  Concluding ICE Processing
   7.  Subsequent Offer/Answer Exchanges
     7.1.  ICE Restarts
   8.  Media Handling
     8.1.  Sending Media
     8.2.  Receiving Media
   9.  Connection Management
     9.1.  Connections Formed During Connectivity Checks
     9.2.  Connections formed for Gathering Candidates
   10. Security Considerations
   11. IANA Considerations
   12. Acknowledgements
   13. References
     13.1.  Normative References
     13.2.  Informative References
   Appendix A.  Implementation Considerations for BSD Sockets
   Authors' Addresses

1.  Introduction

   Interactive Connectivity Establishment (ICE) [I-D.ietf-mmusic-ice]
   defines a mechanism for NAT traversal for multimedia communication
   protocols based on the offer/answer model [RFC3264] of session
   negotiation.
   ICE works by providing a set of candidate transport
   addresses for each media stream, which are then validated with
   peer-to-peer connectivity checks based on Session Traversal Utilities
   for NAT (STUN) [RFC5389]. However, ICE only defines procedures for
   UDP-based transport protocols.

   There are many reasons why ICE support for TCP is important.
   First, there are media protocols that only run over TCP. Examples
   of such protocols are web and application sharing and instant
   messaging [RFC4975]. For these protocols to work in the presence of
   NAT, unless they define their own NAT traversal mechanisms, ICE
   support for TCP is needed. In addition, RTP itself can run over TCP
   [RFC4571]. Typically, it is preferable to run RTP over UDP, and not
   TCP. However, in a variety of network environments, overly
   restrictive NAT and firewall devices prevent UDP-based communications
   altogether, but general TCP-based communications are permitted. In
   such environments, sending RTP over TCP, and thus establishing the
   media session, may be preferable to having it fail altogether. With
   this specification, agents can gather UDP and TCP candidates for an
   RTP-based stream, list the UDP ones with higher priority, and then
   only use the TCP-based ones if the UDP ones fail. This provides a
   fallback mechanism that allows multimedia communications to be highly
   reliable.

   The usage of RTP over TCP is particularly useful when combined with
   Traversal Using Relays around NAT (TURN) [I-D.ietf-behave-turn]. In
   this case, one of the agents would connect to its TURN server using
   TCP, and obtain a TCP-based relayed candidate. It would offer this
   to its peer agent as a candidate. The answerer would initiate a TCP
   connection towards the TURN server. When that connection is
   established, media can flow over the connections, through the TURN
   server.
   The benefit of this
   usage is that it only requires the agents to make outbound TCP
   connections to a server on the public network. This kind of
   operation is broadly interoperable through NAT and firewall devices.
   Since it is a goal of ICE and this extension to provide highly
   reliable communications that "just work" in as broad a set of
   network deployments as possible, this use case is particularly
   important.

   This specification extends ICE by defining its usage with TCP
   candidates. It also defines how ICE can be used with RTP and SRTP to
   provide both TCP and UDP candidates. This specification does so by
   following the outline of ICE itself, and calling out the additions
   and changes necessary in each section of ICE to support TCP
   candidates.

2.  Overview of Operation

   The usage of ICE with TCP is relatively straightforward. The main
   area of specification is around how and when connections are opened,
   and how those connections relate to candidate pairs.

   When the agents perform address allocations to gather TCP-based
   candidates, three types of candidates can be obtained. These are
   active candidates, passive candidates, and simultaneous-open
   candidates. An active candidate is one for which the agent will
   attempt to open an outbound connection, but will not receive incoming
   connection requests. A passive candidate is one for which the agent
   will receive incoming connection attempts, but not attempt a
   connection. A simultaneous-open candidate is one for which the agent
   will attempt to open a connection simultaneously with its peer.

      Note: It has been reported that the simultaneous-open technique
      has a low success rate (~40%) with the population of NAT devices
      in use as of this writing. Therefore, it is RECOMMENDED that
      implementations of this specification acquire and use IPv6 host
      candidates.
      Means of doing so across NATs include Tunnel Setup
      Protocol [I-D.blanchet-v6ops-tunnelbroker-tsp], Teredo [RFC4380],
      IPsec NAT-T [RFC3947], and others.

   Unlike for UDP, there is no lite implementation defined for TCP.
   Instead, an implementation that meets the criteria for a lite
   implementation as discussed in Appendix A of [I-D.ietf-mmusic-ice]
   can just use the mechanisms defined in [RFC4145], with constraints
   defined here on selection of attribute values.

   When gathering candidates from a host interface, the agent typically
   obtains active, passive, and simultaneous-open candidates.
   Similarly, communications with a STUN server will provide server
   reflexive and relayed versions of all three types. Connections to
   the STUN server are kept open during ICE processing.

   When encoding these candidates into offers and answers, the type of
   the candidate is signaled. In the case of active candidates, an IP
   address and port are present, but they are meaningless, as they are
   ignored by the peer. As a consequence, active candidates do not need
   to be physically allocated at the time of address gathering. Rather,
   the physical allocations, which occur as a consequence of a
   connection attempt, occur at the time of the connectivity checks.

   When the candidates are paired together, active candidates are always
   paired with passive, and simultaneous-open candidates with each
   other. When a connectivity check is to be made on a candidate pair,
   each agent determines whether it is to make a connection attempt for
   this pair.

      Why have both active and simultaneous-open candidates? Why not
      just simultaneous-open? The reason is that NAT treatment of
      simultaneous opens is currently not well defined, though
      specifications are being developed to address this [RFC5382].
      Some NATs block the second TCP SYN packet or improperly process
      the subsequent SYN-ACK, which will cause the connection attempt to
      fail. Therefore, if only simultaneous opens are used, connections
      may often fail. By contrast, using unidirectional opens (where
      one side is active and the other is passive) is more reliable, but
      will always require a relay if both sides are behind NAT.
      Therefore, in the spirit of the ICE philosophy, both are tried.
      Simultaneous opens are preferred since, when they work, they do
      not require a relay even when both sides are behind different
      NATs.

   The actual processes of generating connectivity checks, managing the
   state of the check list, and updating the Valid list work
   identically for TCP and UDP.

   ICE requires an agent to demultiplex STUN and application-layer
   traffic, since they appear on the same port. This demultiplexing is
   described by ICE, and is done using the magic cookie and other fields
   of the message. Stream-oriented transports introduce another
   wrinkle, since they require a way to frame the connection so that the
   application and STUN packets can be extracted in order to determine
   which is which. For this reason, TCP media streams utilizing ICE use
   the basic framing provided in RFC 4571 [RFC4571], even if the
   application-layer protocol is not RTP.

   When TLS (for non-RTP traffic) or DTLS (for RTP traffic) is in use,
   it runs over the RFC 4571 framing shim, so that STUN runs outside of
   the D/TLS connection (D/TLS is shorthand for TLS or DTLS).
   Pictorially:

                         +----------+
                         |          |
                         |    App   |
              +----------+----------+
              |          |          |
              |   STUN   |  D/TLS   |
              +----------+----------+
              |                     |
              |      RFC 4571       |
              +---------------------+
              |                     |
              |         TCP         |
              +---------------------+
              |                     |
              |         IP          |
              +---------------------+

                     Figure 1: ICE TCP Stack

   The implication of this is that, for any media stream protected by
   D/TLS, the agent will first run ICE procedures, exchanging STUN
   messages. Then, once ICE completes, D/TLS procedures begin. ICE and
   D/TLS are thus "peers" in the protocol stack. The STUN messages are
   not sent over the D/TLS connection, even ones sent for the purposes
   of keepalive in the middle of the media session.

   When an updated offer is generated by the controlling endpoint, the
   SDP extensions for connection-oriented media [RFC4145] are used to
   signal that an existing connection should be used, rather than
   opening a new one.

3.  Sending the Initial Offer

   If an offerer meets the criteria for lite as defined in Appendix A of
   [I-D.ietf-mmusic-ice], it omits any ICE attributes for its TCP-based
   media streams. Instead, the offerer follows the procedures defined
   in [RFC4145] for constructing the offer. However, the offerer MUST
   use a setup attribute of "actpass" for those streams.

   For offerers making use of ICE for TCP streams, the procedures below
   are used.

3.1.  Gathering Candidates

   For each TCP-capable media stream the agent wishes to use (including
   ones, like RTP, which can either be UDP or TCP), the agent SHOULD
   obtain two host candidates (each on a different port) for each
   component of the media stream on each interface that the host has:
   one for the simultaneous-open candidate, and one for the passive
   candidate. If an agent is not capable of acting in one of these
   modes, it would omit those candidates.

   Providers of real-time communications services may decide that it is
   preferable to have no media at all than it is to have media over TCP.
   To allow for choice, it is RECOMMENDED that agents be configurable
   with whether they obtain TCP candidates for real-time media.

      Making this configurable, and then configuring it to be off, is
      far better than not having the capability at all. An important
      goal of this specification is to provide a single mechanism that
      can be used across all types of endpoints. As such, it is
      preferable to account for provider and network variation through
      configuration, instead of hard-coded limitations in an
      implementation. Furthermore, network characteristics and
      connectivity assumptions can, and will, change over time. Just
      because an agent is communicating with a server on the public
      network today doesn't mean that it won't need to communicate with
      one behind a NAT tomorrow. Just because an agent is behind a NAT
      with endpoint-independent mapping today doesn't mean that tomorrow
      its user won't pick up the agent and take it to a public network
      access point where there is a NAT with address and port dependent
      mapping properties, or one that only allows outbound TCP. The way
      to handle these cases and build a reliable system is for agents to
      implement a diverse set of techniques for allocating addresses, so
      that at least one of them is almost certainly going to work in any
      situation. Implementors should consider very carefully any
      assumptions that they make about deployments before electing not
      to implement one of the mechanisms for address allocation. In
      particular, implementors should consider whether the elements in
      the system may be mobile, and connect through different networks
      with different connectivity.
      They should also consider whether
      endpoints which are under their control, in terms of location and
      network connectivity, would always be under their control. In
      environments where mobility and user control are possible, a
      multiplicity of techniques is essential for reliability.

   Each agent SHOULD "obtain" an active host candidate for each
   component of each TCP-capable media stream on each interface that the
   host has. The agent does not have to actually allocate a port for
   these candidates. These candidates serve as a placeholder for the
   creation of the check lists.

   Next, the agent SHOULD take all host TCP candidates for a component
   that have the same foundation (there will typically be two: a
   passive and a simultaneous-open), and amongst them, pick two
   arbitrarily. These two host candidates will be used to obtain
   relayed and server reflexive candidates. To do that, the agent
   initiates a TCP connection from each candidate to the TURN server
   (resulting in two TCP connections). On each connection, it issues an
   Allocate request. One of the resulting relayed candidates is used as
   a passive relayed candidate, and the other as a simultaneous-open
   relayed candidate. In addition, the Allocate responses will provide
   the agent with a server reflexive candidate for their corresponding
   host candidate.

   For all of the remaining host candidates, if any, the agent only
   needs to obtain server reflexive candidates. To do that, it
   initiates a TCP connection from each host candidate to a STUN server,
   and uses a Binding request over that connection to learn the server
   reflexive candidate corresponding to that host candidate.

   Once the Allocate or Binding request has completed, the agent MUST
   keep the TCP connection open until ICE processing has completed. See
   Appendix A for important implementation guidelines.

   If a media stream is UDP-based (such as RTP), an agent MAY use an
   additional host TCP candidate to request a UDP-based candidate from a
   TURN server. Usage of the UDP candidate from the TURN server follows
   the procedures defined in ICE for UDP candidates.

   Each agent SHOULD "obtain" an active relayed candidate for each
   component of each TCP-capable media stream on each interface that the
   host has. The agent does not have to actually allocate a port for
   these candidates from the relay at this time. These candidates serve
   as a placeholder for the creation of the check lists.

   Like their UDP counterparts, TCP-based STUN transactions are paced
   out at one every Ta seconds. This pacing refers strictly to STUN
   transactions (both Binding and Allocate requests). If performance of
   the transaction requires establishment of a TCP connection, then the
   connection gets opened when the transaction is performed.

3.2.  Prioritization

   The transport protocol itself is a criterion for choosing one
   candidate over another. If a particular media stream can run over
   UDP or TCP, the UDP candidates might be preferred over the TCP
   candidates. This allows ICE to use the lower latency UDP
   connectivity if it exists, but fall back to TCP if UDP doesn't work.

   To accomplish this, the local preference SHOULD be defined as:

      local-preference = (2^12)*(transport-pref) +
                         (2^9)*(direction-pref) +
                         (2^0)*(other-pref)

   Transport-pref is the relative preference for candidates with this
   particular transport protocol (UDP or TCP), and direction-pref is the
   preference for candidates with this particular establishment
   directionality (active, passive, or simultaneous-open). Other-pref
   is used as a differentiator when two candidates would otherwise have
   identical local preferences.

   Transport-pref MUST be between 0 and 15, with 15 being the most
   preferred.
   Direction-pref MUST be between 0 and 7, with 7 being the
   most preferred. Other-pref MUST be between 0 and 511, with 511 being
   the most preferred. For RTP-based media streams, it is RECOMMENDED
   that UDP have a transport-pref of 15 and TCP of 6. It is RECOMMENDED
   that, for all connection-oriented media, simultaneous-open candidates
   have a direction-pref of 7, active of 5 and passive of 2. If any two
   candidates have the same type-preference, transport-pref, and
   direction-pref, each MUST have a unique other-pref. With this
   specification, the only way that can happen is with multi-homed
   hosts, in which case other-pref is a preference amongst interfaces.

3.3.  Choosing Default Candidates

   The default candidate is chosen primarily based on the likelihood of
   it working with a non-ICE peer. When media streams supporting mixed
   modes (both TCP and UDP) are used with ICE, it is RECOMMENDED that,
   for real-time streams (such as RTP), the default candidates be
   UDP-based. However, the default SHOULD NOT be the simultaneous-open
   candidate.

   If a media stream is inherently TCP-based, the agent MUST select the
   active candidate as default. This ensures proper directionality of
   connection establishment for NAT traversal with non-ICE
   implementations.

3.4.  Encoding the SDP

   TCP-based candidates are encoded into a=candidate lines identically
   to the UDP encoding described in [I-D.ietf-mmusic-ice]. However, the
   transport protocol is set to "tcp-so" for TCP simultaneous-open
   candidates, "tcp-act" for TCP active candidates, and "tcp-pass" for
   TCP passive candidates. For active candidates, the addr encoded into
   the candidate attribute MUST be set to the IP address that will be
   used for the attempt, but the port MUST be set to 9 (i.e.,
   Discard).
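   As an illustration only, the local-preference formula of Section 3.2
   and the recommended values for the "tcp-so", "tcp-act", and
   "tcp-pass" transports can be sketched in Python as follows. The
   function and table names are hypothetical and are not defined by
   this specification; the sketch assumes the RECOMMENDED transport-pref
   of 6 for TCP and checks the ranges stated above.

```python
# Illustrative sketch of the Section 3.2 local-preference formula.
# All names below are hypothetical; only the formula, the ranges, and
# the recommended values come from the text.

# RECOMMENDED transport-pref values for RTP-based media streams.
TRANSPORT_PREF = {"udp": 15, "tcp": 6}

# RECOMMENDED direction-pref values for connection-oriented media,
# keyed by the transport token used in the a=candidate line.
DIRECTION_PREF = {"tcp-so": 7, "tcp-act": 5, "tcp-pass": 2}


def local_preference(transport_pref: int, direction_pref: int,
                     other_pref: int) -> int:
    """Combine the three sub-preferences into a local preference."""
    if not 0 <= transport_pref <= 15:
        raise ValueError("transport-pref must be between 0 and 15")
    if not 0 <= direction_pref <= 7:
        raise ValueError("direction-pref must be between 0 and 7")
    if not 0 <= other_pref <= 511:
        raise ValueError("other-pref must be between 0 and 511")
    return (2 ** 12) * transport_pref + (2 ** 9) * direction_pref + other_pref


def tcp_local_preference(transport_token: str, other_pref: int = 0) -> int:
    """Local preference for a TCP candidate using the recommended values."""
    return local_preference(TRANSPORT_PREF["tcp"],
                            DIRECTION_PREF[transport_token],
                            other_pref)


if __name__ == "__main__":
    for token in ("tcp-so", "tcp-act", "tcp-pass"):
        print(token, tcp_local_preference(token))
```

   With these values, a simultaneous-open candidate sorts above an
   active one, which in turn sorts above a passive one, and other-pref
   only breaks ties because it occupies the low-order nine bits.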
   For active relayed candidates, the value for addr must be
   identical to the IP address of a passive or simultaneous-open
   candidate from the same TURN server.

   If the default candidate is TCP, the agent MUST include the a=setup
   and a=connection attributes from RFC 4145 [RFC4145], following the
   procedures defined there as if ICE were not in use. In particular, if
   an agent is the answerer, the a=setup attribute MUST meet the
   constraints in RFC 4145 based on the value in the offer. Since an
   ICE-tcp offerer always uses the active candidate as default, an
   ICE-tcp answerer will always use the passive attribute as default and
   include the a=setup:passive attribute in the answer.

   If an agent is utilizing SRTP [RFC3711], it MAY include a mix of UDP
   and TCP candidates. If ICE selects a TCP candidate pair, the agent
   MUST still utilize SRTP, but run over the connection established by
   ICE. The alternative, RTP over TLS, MUST NOT be used. This allows
   for the higher layer protocols (the security handshakes and media
   transport) to be independent of the underlying transport protocol.
   In the case of DTLS-SRTP [I-D.ietf-avt-dtls-srtp], the directionality
   attributes (a=setup) are utilized strictly to determine the direction
   of the DTLS handshake. Directionality of the TCP connection
   establishment is determined by the ICE attributes and procedures
   defined here.

   If an agent is securing non-RTP media over TCP/TLS, the SDP MUST be
   constructed as described in RFC 4572 [RFC4572]. The directionality
   attributes (a=setup) are utilized strictly to determine the direction
   of the TLS handshake. Directionality of the TCP connection
   establishment is determined by the ICE attributes and procedures
   defined here.

4.  Receiving the Initial Offer

4.1.  Verifying ICE Support

   Since this specification does not define a lite mode for ICE-tcp, a
   lite implementation will include candidate attributes for its UDP
   streams, but no such attributes for its TCP streams. An agent
   receiving such an offer MUST proceed with ICE in this case. ICE will
   be used for the UDP streams, and [RFC4145] procedures will be used
   for the TCP streams. However, if the offer indicates a setup
   direction of actpass, the answerer MUST utilize a=setup:active in the
   answer. This is required to ensure proper directionality of
   connection establishment to work through NAT.

   Similarly, if an agent is lite, and receives an offer that includes
   streams with TCP candidates, it will omit candidates from the answer
   for those streams. This will cause [RFC4145] procedures to be used
   for those streams. In this case, the offer will indicate a direction
   of active, and the agent will use passive in its answer.

4.2.  Forming the Check Lists

   When forming candidate pairs, the following types of candidates can
   be paired with each other:

      Local           Remote
      Candidate       Candidate
      ----------------------------
      tcp-so          tcp-so
      tcp-act         tcp-pass
      tcp-pass        tcp-act

   When the agent prunes the check list, it MUST also remove any pair
   for which the local candidate is tcp-pass.

   The remainder of check list processing works like the UDP case.

5.  Connectivity Checks

5.1.  STUN Client Procedures

5.1.1.  Sending the Request

   When an agent wants to send a TCP-based connectivity check, it first
   opens a TCP connection if none yet exists for the 5-tuple defined by
   the candidate pair for which the check is to be sent. This
   connection is opened from the local candidate of the pair to the
   remote candidate of the pair. If the local candidate is tcp-act, the
   agent MUST open a connection from the interface associated with that
   local candidate.
   This connection MUST be opened from an unallocated
   port. For host candidates, this is readily done by connecting from
   the candidate's interface. For relayed candidates, the agent uses the
   procedures in [I-D.ietf-behave-turn] to initiate a new connection
   from the specified interface on the TURN server.

   Once the connection is established, the agent MUST utilize the shim
   defined in RFC 4571 [RFC4571] for as long as this connection
   remains open. The STUN Binding requests and responses are sent on top
   of this shim, so that the length field defined in RFC 4571 precedes
   each STUN message. If TLS or DTLS-SRTP is to be utilized for the
   media session, the TLS or DTLS-SRTP handshakes will take place on top
   of this shim as well. However, they only start once ICE processing
   has completed. In essence, the TLS or DTLS-SRTP handshakes are
   considered a part of the media protocol. STUN is never run within
   the TLS or DTLS-SRTP session.

   If the TCP connection cannot be established, the check is considered
   to have failed, and a full-mode agent MUST update the pair state to
   Failed in the check list.

   Once the connection is established, client procedures are identical
   to those for UDP candidates. Note that STUN responses received on an
   active TCP candidate will typically produce a remote peer reflexive
   candidate.

5.2.  STUN Server Procedures

   An agent MUST be prepared to receive incoming TCP connection requests
   on any host or relayed TCP candidate that is simultaneous-open or
   passive. When the connection request is received, the agent MUST
   accept it. The agent MUST utilize the framing defined in RFC 4571
   [RFC4571] for the lifetime of this connection. Due to this framing,
   the agent will receive data in discrete frames. Each frame could be
   media (such as RTP or SRTP), TLS, DTLS, or STUN packets. The STUN
   packets are extracted as described in Section 8.2.

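   The framing described above can be sketched as follows. This is a
   minimal, non-normative Python illustration that assumes the RFC 4571
   shim is a 16-bit length in network byte order followed by the
   payload. The names frame, Deframer, and looks_like_stun are
   hypothetical. The STUN test is a simplified version of the RFC 5389
   header checks (leading two bits, magic cookie, length) and omits the
   fingerprint check.

```python
import struct
from typing import List

# Fixed magic cookie value carried in every RFC 5389 STUN header.
STUN_MAGIC_COOKIE = 0x2112A442
STUN_HEADER_LEN = 20


def frame(payload: bytes) -> bytes:
    """Prefix a payload with the 16-bit big-endian RFC 4571 length."""
    if len(payload) > 0xFFFF:
        raise ValueError("payload too large for a 16-bit length field")
    return struct.pack("!H", len(payload)) + payload


class Deframer:
    """Reassemble discrete frames from an arbitrary chunked TCP stream."""

    def __init__(self) -> None:
        self._buf = bytearray()

    def feed(self, data: bytes) -> List[bytes]:
        """Add received bytes and return every frame that is now complete."""
        self._buf += data
        frames = []
        while len(self._buf) >= 2:
            (length,) = struct.unpack_from("!H", self._buf, 0)
            if len(self._buf) < 2 + length:
                break  # wait for the rest of this frame
            frames.append(bytes(self._buf[2:2 + length]))
            del self._buf[:2 + length]
        return frames


def looks_like_stun(payload: bytes) -> bool:
    """Simplified STUN check: top two bits, magic cookie, and length."""
    if len(payload) < STUN_HEADER_LEN:
        return False
    if payload[0] & 0xC0:
        return False  # the two most significant bits must be zero
    msg_len, cookie = struct.unpack_from("!HI", payload, 2)
    return (cookie == STUN_MAGIC_COOKIE
            and msg_len % 4 == 0
            and msg_len == len(payload) - STUN_HEADER_LEN)


if __name__ == "__main__":
    # A header-only Binding request (type 0x0001, no attributes).
    stun = struct.pack("!HHI", 0x0001, 0, STUN_MAGIC_COOKIE) + bytes(12)
    deframer = Deframer()
    for payload in deframer.feed(frame(stun) + frame(b"media bytes")):
        print("STUN" if looks_like_stun(payload) else "non-STUN", len(payload))
```

   Keeping the reassembly buffer separate from the classification step
   mirrors the layering in Figure 1: the shim only recovers frame
   boundaries, and the decision between STUN and other traffic is made
   on each recovered payload.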
   Once the connection is established, STUN server procedures are
   identical to those for UDP candidates. Note that STUN requests
   received on a passive TCP candidate will typically produce a remote
   peer reflexive candidate.

6.  Concluding ICE Processing

   If there are TCP candidates for a media stream, a controlling agent
   MUST use a regular selection algorithm.

   When ICE processing for a media stream completes, each agent SHOULD
   close all TCP connections except the one between the candidate pairs
   selected by ICE.

      These two rules are related; the closure of connections on
      completion of ICE implies that a regular selection algorithm has
      to be used. This is because aggressive selection might cause
      transient pairs to be selected. Once such a pair was selected,
      the agents would close the other connections, one of which may be
      about to be selected as a better choice. This race condition may
      result in TCP connections being accidentally closed for the pair
      that ICE selects.

7.  Subsequent Offer/Answer Exchanges

7.1.  ICE Restarts

   If an ICE restart occurs for a media stream with TCP candidate pairs
   that have been selected by ICE, the agents MUST NOT close the
   connections after the restart. In the offer or answer that causes
   the restart, an agent MAY include a simultaneous-open candidate whose
   transport address matches the previously selected candidate. If both
   agents do this, the result will be a simultaneous-open candidate pair
   matching an existing TCP connection. In this case, the agents MUST
   NOT attempt to open a new connection (or start new TLS or DTLS-SRTP
   procedures). Instead, that existing connection is reused and STUN
   checks are performed.

   Once the restart completes, if the selected pair does not match the
   previously selected pair, the TCP connection for the previously
   selected pair SHOULD be closed by the agent.

8.  Media Handling

8.1.  Sending Media

   When sending media, if the selected candidate pair matches an
   existing TCP connection, that connection MUST be used for sending
   media.

   The framing defined in RFC 4571 MUST be used when sending media. For
   media streams that are not RTP-based and do not normally use RFC
   4571, the agent treats the media stream as a byte stream, and assumes
   that it has its own framing of some sort. It then takes an arbitrary
   number of bytes from the bytestream, and places them as the payload
   of an RFC 4571 frame, including the length. Next, the sender checks
   to see if the resulting set of bytes would be viewed as a STUN packet
   based on the rules in Sections 6 and 8 of [RFC5389]. This includes a
   check on the most significant two bits, the magic cookie, the length,
   and the fingerprint. If, based on those rules, the bytes would be
   viewed as a STUN message, the sender SHOULD utilize a different
   number of bytes so that the length checks will fail. Though it is
   normally highly unlikely that an arbitrary number of bytes from a
   bytestream would resemble a STUN packet based on all of the checks,
   it can happen if the content of the application stream happens to
   contain a STUN message (for example, a file transfer of logs from a
   client which includes STUN messages).

   If TLS or DTLS-SRTP procedures are being utilized to protect the
   media stream, those procedures start at the point that media is
   permitted to flow, as defined in the ICE specification
   [I-D.ietf-mmusic-ice]. The TLS or DTLS-SRTP handshakes occur on top
   of the RFC 4571 shim, and are considered part of the media stream for
   purposes of this specification.

8.2.  Receiving Media

   The framing defined in RFC 4571 MUST be used when receiving media.
   For media streams that are not RTP-based and do not normally use RFC
   4571, the agent extracts the payload of each RFC 4571 frame, and
   determines whether it is a STUN message or application-layer data
   based on the procedures in ICE [I-D.ietf-mmusic-ice]. If media is
   being protected with DTLS-SRTP, the DTLS, RTP and STUN packets are
   demultiplexed as described in Section 3.6.2 of
   [I-D.ietf-avt-dtls-srtp].

   For non-STUN data, the agent appends this to the ongoing bytestream
   collected from the frames. It then parses the bytestream as if it
   had been directly received over the TCP connection. This allows for
   ICE-tcp to work without regard to the framing mechanism used by the
   application layer protocol.

9. Connection Management

9.1. Connections Formed During Connectivity Checks

   Once a TCP or TCP/TLS connection is opened by ICE for the purpose of
   connectivity checks, its lifecycle depends on how it is used. If
   that candidate pair is selected by ICE for usage for media, an agent
   SHOULD keep the connection open until:

   o  The session terminates

   o  The media stream is removed

   o  An ICE restart takes place, resulting in the selection of a
      different candidate pair.

   In these cases, the agent SHOULD close the connection when that event
   occurs. This applies to both agents in a session, in which case
   usually one of the agents will end up closing the connection first.

   If a connection has been selected by ICE, an agent MAY close it
   anyway. As described in the next paragraph, this will cause it to be
   reopened almost immediately, and in the interim media cannot be sent.
   Consequently, such closures have a negative effect and are NOT
   RECOMMENDED. However, there may be cases where an agent needs to
   close a connection for some reason.
   If an agent needs to send media on the selected candidate pair, and
   its TCP connection has closed, either on purpose or due to some
   error, then:

   o  If the agent's local candidate is tcp-act or tcp-so, it MUST
      reopen a connection to the remote candidate of the selected pair.

   o  If the agent's local candidate is tcp-pass, the agent MUST await
      an incoming connection request, and consequently, will not be able
      to send media until it has been opened.

   If the TCP connection is established, the framing of RFC 4571 is
   utilized. If the agent opened the connection, it MUST send a STUN
   connectivity check. An agent MUST be prepared to receive a
   connectivity check over a connection it opened or accepted (note that
   this is true in general; ICE requires that an agent be prepared to
   receive a connectivity check at any time, even after ICE processing
   completes). If an agent receives a connectivity check after re-
   establishment of the connection, it MUST generate a triggered check
   over that connection in response if it has not already sent a check.
   Once an agent has sent a check and received a successful response,
   the connection is considered Valid and media can be sent (which
   includes a TLS or DTLS-SRTP session resumption or restart).

   If the TCP connection cannot be established, the controlling agent
   SHOULD restart ICE for this media stream. This will happen in cases
   where one of the agents is behind a NAT with connection-dependent
   mapping properties [RFC5382].

9.2. Connections Formed for Gathering Candidates

   If the agent opened a connection to a STUN server for the purposes of
   gathering a server reflexive candidate, that connection SHOULD be
   closed by the client once ICE processing has completed. This happens
   regardless of whether the candidate learned from the STUN server
   was selected by ICE.
   If the agent opened a connection to a TURN server for the purposes of
   gathering a relayed candidate, that connection MUST be kept open by
   the client for the duration of the media session if:

   o  A relayed candidate learned from the TURN server was selected by
      ICE, or

   o  An active candidate established as a consequence of a Connect
      request sent through that TCP connection was selected by ICE.

   Otherwise, the connection to the TURN server SHOULD be closed once
   ICE processing completes.

   If, despite efforts of the client, a TCP connection to a TURN server
   fails during the lifetime of the media session utilizing a transport
   address allocated by that server, the client SHOULD reconnect to the
   TURN server, obtain a new allocation, and restart ICE for that media
   stream.

10. Security Considerations

   The main threat in ICE is hijacking of connections for the purposes
   of directing media streams to DoS targets or to malicious users.
   ICE-tcp prevents that by only using TCP connections that have been
   validated. Validation requires a STUN transaction to take place over
   the connection. This transaction cannot complete without both
   participants knowing a shared secret exchanged in the rendezvous
   protocol used with ICE, such as SIP. This shared secret, in turn, is
   protected by that protocol exchange. In the case of SIP, the usage
   of the sips mechanism is RECOMMENDED. When this is done, an
   attacker, even if it knows or can guess the port on which an agent is
   listening for incoming TCP connections, will not be able to open a
   connection and send media to the agent.

   A more detailed analysis of this attack and the various ways ICE
   prevents it are described in [I-D.ietf-mmusic-ice]. Those
   considerations apply to this specification.

11. IANA Considerations

   There are no IANA considerations associated with this specification.

12. Acknowledgements

   The authors would like to thank Tim Moore, Saikat Guha, Francois
   Audet and Roni Even for the reviews and input on this document.

13. References

13.1. Normative References

   [RFC5389]  Rosenberg, J., Mahy, R., Matthews, P., and D. Wing,
              "Session Traversal Utilities for NAT (STUN)", RFC 5389,
              October 2008.

   [RFC3264]  Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model
              with Session Description Protocol (SDP)", RFC 3264,
              June 2002.

   [RFC4145]  Yon, D. and G. Camarillo, "TCP-Based Media Transport in
              the Session Description Protocol (SDP)", RFC 4145,
              September 2005.

   [RFC4571]  Lazzaro, J., "Framing Real-time Transport Protocol (RTP)
              and RTP Control Protocol (RTCP) Packets over Connection-
              Oriented Transport", RFC 4571, July 2006.

   [RFC4572]  Lennox, J., "Connection-Oriented Media Transport over the
              Transport Layer Security (TLS) Protocol in the Session
              Description Protocol (SDP)", RFC 4572, July 2006.

   [I-D.ietf-mmusic-ice]
              Rosenberg, J., "Interactive Connectivity Establishment
              (ICE): A Protocol for Network Address Translator (NAT)
              Traversal for Offer/Answer Protocols",
              draft-ietf-mmusic-ice-19 (work in progress), October 2007.

   [I-D.ietf-avt-dtls-srtp]
              McGrew, D. and E. Rescorla, "Datagram Transport Layer
              Security (DTLS) Extension to Establish Keys for Secure
              Real-time Transport Protocol (SRTP)",
              draft-ietf-avt-dtls-srtp-07 (work in progress),
              February 2009.

   [I-D.ietf-behave-turn]
              Rosenberg, J., Mahy, R., and P. Matthews, "Traversal Using
              Relays around NAT (TURN): Relay Extensions to Session
              Traversal Utilities for NAT (STUN)",
              draft-ietf-behave-turn-16 (work in progress), July 2009.

   [RFC3711]  Baugher, M., McGrew, D., Naslund, M., Carrara, E., and K.
              Norrman, "The Secure Real-time Transport Protocol (SRTP)",
              RFC 3711, March 2004.

13.2. Informative References

   [RFC5382]  Guha, S., Biswas, K., Ford, B., Sivakumar, S., and P.
              Srisuresh, "NAT Behavioral Requirements for TCP", BCP 142,
              RFC 5382, October 2008.

   [RFC4975]  Campbell, B., Mahy, R., and C. Jennings, "The Message
              Session Relay Protocol (MSRP)", RFC 4975, September 2007.

   [I-D.blanchet-v6ops-tunnelbroker-tsp]
              Blanchet, M. and F. Parent, "IPv6 Tunnel Broker with the
              Tunnel Setup Protocol (TSP)",
              draft-blanchet-v6ops-tunnelbroker-tsp-04 (work in
              progress), May 2008.

   [RFC4380]  Huitema, C., "Teredo: Tunneling IPv6 over UDP through
              Network Address Translations (NATs)", RFC 4380,
              February 2006.

   [RFC3947]  Kivinen, T., Swander, B., Huttunen, A., and V. Volpe,
              "Negotiation of NAT-Traversal in the IKE", RFC 3947,
              January 2005.

Appendix A. Implementation Considerations for BSD Sockets

   This specification requires unusual handling of TCP connections, the
   implementation of which in traditional BSD socket APIs is non-
   trivial.

   In particular, ICE requires an agent to obtain a local TCP candidate,
   bound to a local IP and port, and then from that local port, initiate
   a TCP connection (to the STUN server, in order to obtain server
   reflexive candidates, to the TURN server, to obtain a relayed
   candidate, or to the peer as part of a connectivity check), and be
   prepared to receive incoming TCP connections (for passive and
   simultaneous-open candidates). A "typical" BSD socket is used either
   for initiating or receiving connections, and not for both. The code
   required to allow incoming and outgoing connections on the same local
   IP and port is non-obvious.
   The following pseudocode, contributed by
   Saikat Guha, has been found to work on many platforms:

   for i in 0 to MAX
      sock_i = socket()
      set(sock_i, SO_REUSEADDR)
      bind(sock_i, local)

   listen(sock_0)
   connect(sock_1, stun)
   connect(sock_2, remote_a)
   connect(sock_3, remote_b)

   The key here is that, prior to the listen() call, the full set of
   sockets that need to be utilized for outgoing connections must be
   allocated and bound to the local IP address and port. This number,
   MAX, represents the maximum number of TCP connections to different
   destinations that might need to be established from the same local
   candidate. This number can be potentially large for simultaneous-
   open candidates. If a request forks, ICE procedures may take place
   with multiple peers. Furthermore, for each peer, connections would
   need to be established to each passive or simultaneous-open candidate
   for the same component. If we assume a worst case of 5 forked
   branches, and for each peer, five simultaneous-open candidates, that
   results in MAX=25. For a passive candidate, MAX is equal to the
   number of STUN servers, since the agent only initiates TCP
   connections on a passive candidate to its STUN server.

Authors' Addresses

   Simon Perreault (editor)
   Viagenie
   2600 boul. Laurier, suite 625
   Quebec, QC  G1V 4W1
   Canada

   Phone: +1 418 656 9254
   Email: simon.perreault@viagenie.ca
   URI:   http://www.viagenie.ca

   Jonathan Rosenberg
   Cisco
   Edison, NJ
   US

   Email: jdrosen@cisco.com
   URI:   http://www.jdrosen.net
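
Illustrative Sketch for Appendix A

   The pseudocode in Appendix A can be approximated in Python roughly
   as follows. This is a sketch under stated assumptions, not a
   reference implementation: it uses IPv4 over the loopback interface,
   the function names bind_shared and demo_shared_port are hypothetical,
   and it additionally sets SO_REUSEPORT where the platform offers it.
   That option does not appear in the pseudocode and is assumed here
   because some stacks need it to bind several sockets to an identical
   address and port. Behavior varies between operating systems, so the
   result should be verified on each target platform.

```python
import socket


def bind_shared(local_ip, count):
    """Return (sockets, port): `count` TCP sockets bound to one local port."""
    socks, port = [], 0  # port 0 lets the OS choose for the first socket
    for _ in range(count):
        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        s.settimeout(5)
        s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        # Assumption: not in the draft's pseudocode; set only if available.
        if hasattr(socket, "SO_REUSEPORT"):
            s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
        s.bind((local_ip, port))
        port = s.getsockname()[1]  # later sockets reuse this exact port
        socks.append(s)
    return socks, port


def demo_shared_port():
    """Connect out and accept in on the same local port over loopback."""
    remote = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    remote.settimeout(5)
    remote.bind(("127.0.0.1", 0))
    remote.listen(1)
    opened = [remote]
    # All sockets are bound before listen() is called, as Appendix A notes.
    socks, port = bind_shared("127.0.0.1", 2)
    opened += socks
    try:
        socks[0].listen(1)                      # passive side
        socks[1].connect(remote.getsockname())  # active side, same port
        accepted, peer = remote.accept()
        opened.append(accepted)
        client = socket.create_connection(("127.0.0.1", port), timeout=5)
        opened.append(client)
        inbound, _ = socks[0].accept()
        opened.append(inbound)
        # Local port, port seen by the remote side, port of inbound socket.
        return port, peer[1], inbound.getsockname()[1]
    finally:
        for s in opened:
            s.close()
```

   Calling demo_shared_port() returns three port numbers. If the
   platform permits this pattern, all three are the same, showing that
   the outgoing and the incoming connection share one local port.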