idnits 2.17.1

draft-ietf-mmusic-ice-tcp-07.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** It looks like you're using RFC 3978 boilerplate. You should update this
     to the boilerplate described in the IETF Trust License Policy document
     (see https://trustee.ietf.org/license-info), which is required now.

  -- Found old boilerplate from RFC 3978, Section 5.1 on line 15.

  -- Found old boilerplate from RFC 3978, Section 5.5, updated by RFC 4748 on
     line 848.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 1 on line 859.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 2 on line 866.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 3 on line 872.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack a both a reference to RFC 2119 and the
     recommended RFC 2119 boilerplate, even if it appears to use RFC 2119
     keywords.

     RFC 2119 keyword, line 250: '... in [RFC4145] for constructing the offer. However, the offerer MUST...'
     RFC 2119 keyword, line 259: '...ither be UDP or TCP), the agent SHOULD...'
     RFC 2119 keyword, line 268: '...or choice, it is RECOMMENDED that agen...'
     RFC 2119 keyword, line 300: '... Each agent SHOULD "obtain" an activ...'
     RFC 2119 keyword, line 306: '... Next, the agent SHOULD take all host ...'
     (56 more instances...)
  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust Copyright Line does not match the
     current year

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008. If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment. If not, you may need to add the pre-RFC5378 disclaimer.
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (July 14, 2008) is 5765 days in the past. Is this
     intentional?

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Outdated reference: A later version (-18) exists of
     draft-ietf-behave-rfc3489bis-16

  ** Obsolete normative reference: RFC 4572 (Obsoleted by RFC 8122)

  == Outdated reference: A later version (-07) exists of
     draft-ietf-avt-dtls-srtp-02

  == Outdated reference: A later version (-16) exists of
     draft-ietf-behave-turn-08

  == Outdated reference: A later version (-08) exists of
     draft-ietf-behave-tcp-07

     Summary: 3 errors (**), 0 flaws (~~), 5 warnings (==), 7 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

2  MMUSIC                                                      J. Rosenberg
3  Internet-Draft                                                     Cisco
4  Intended status: Standards Track                           July 14, 2008
5  Expires: January 15, 2009

7  TCP Candidates with Interactive Connectivity Establishment (ICE)
8  draft-ietf-mmusic-ice-tcp-07

10  Status of this Memo

12  By submitting this Internet-Draft, each author represents that any
13  applicable patent or other IPR claims of which he or she is aware
14  have been or will be disclosed, and any of which he or she becomes
15  aware will be disclosed, in accordance with Section 6 of BCP 79.

17  Internet-Drafts are working documents of the Internet Engineering
18  Task Force (IETF), its areas, and its working groups. Note that
19  other groups may also distribute working documents as Internet-
20  Drafts.

22  Internet-Drafts are draft documents valid for a maximum of six months
23  and may be updated, replaced, or obsoleted by other documents at any
24  time. It is inappropriate to use Internet-Drafts as reference
25  material or to cite them other than as "work in progress."

27  The list of current Internet-Drafts can be accessed at
28  http://www.ietf.org/ietf/1id-abstracts.txt.

30  The list of Internet-Draft Shadow Directories can be accessed at
31  http://www.ietf.org/shadow.html.

33  This Internet-Draft will expire on January 15, 2009.

35  Copyright Notice

37  Copyright (C) The IETF Trust (2008).

39  Abstract

41  Interactive Connectivity Establishment (ICE) defines a mechanism for
42  NAT traversal for multimedia communication protocols based on the
43  offer/answer model of session negotiation. ICE works by providing a
44  set of candidate transport addresses for each media stream, which are
45  then validated with peer-to-peer connectivity checks based on Session
46  Traversal Utilities for NAT (STUN). ICE provides a general framework
47  for describing candidates, but only defines UDP-based transport
48  protocols. This specification extends ICE to TCP-based media,
49  including the ability to offer a mix of TCP and UDP-based candidates
50  for a single stream.
52  Table of Contents

54  1. Introduction
55  2. Overview of Operation
56  3. Sending the Initial Offer
57     3.1. Gathering Candidates
58     3.2. Prioritization
59     3.3. Choosing Default Candidates
60     3.4. Encoding the SDP
61  4. Receiving the Initial Offer
62     4.1. Verifying ICE Support
63     4.2. Forming the Check Lists
64  5. Connectivity Checks
65     5.1. STUN Client Procedures
66        5.1.1. Sending the Request
67     5.2. STUN Server Procedures
68  6. Concluding ICE Processing
69  7. Subsequent Offer/Answer Exchanges
70     7.1. ICE Restarts
71  8. Media Handling
72     8.1. Sending Media
73     8.2. Receiving Media
74  9. Connection Management
75     9.1. Connections Formed During Connectivity Checks
76     9.2. Connections formed for Gathering Candidates
77  10. Security Considerations
78  11. IANA Considerations
79  12. Acknowledgements
80  13. References
81     13.1. Normative References
82     13.2. Informative References
83  Appendix 1. Implementation Considerations for BSD Sockets
84  Author's Address
85  Intellectual Property and Copyright Statements

87  1. Introduction

89  Interactive Connectivity Establishment (ICE) [I-D.ietf-mmusic-ice]
90  defines a mechanism for NAT traversal for multimedia communication
91  protocols based on the offer/answer model [RFC3264] of session
92  negotiation. ICE works by providing a set of candidate transport
93  addresses for each media stream, which are then validated with peer-
94  to-peer connectivity checks based on Session Traversal Utilities for
95  NAT (STUN) [I-D.ietf-behave-rfc3489bis]. However, ICE only defines
96  procedures for UDP-based transport protocols.

98  There are many reasons why ICE support for TCP is important.
99  Firstly, there are media protocols that only run over TCP. Examples
100  of such protocols are web and application sharing and instant
101  messaging [RFC4975]. For these protocols to work in the presence of
102  NAT, unless they define their own NAT traversal mechanisms, ICE
103  support for TCP is needed. In addition, RTP itself can run over TCP
104  [RFC4571]. Typically, it is preferable to run RTP over UDP, and not
105  TCP. However, in a variety of network environments, overly
106  restrictive NAT and firewall devices prevent UDP-based communications
107  altogether, but general TCP-based communications are permitted. In
108  such environments, sending RTP over TCP, and thus establishing the
109  media session, may be preferable to having it fail altogether. With
110  this specification, agents can gather UDP and TCP candidates for an
111  RTP-based stream, list the UDP ones with higher priority, and then
112  only use the TCP-based ones if the UDP ones fail. This provides a
113  fallback mechanism that allows multimedia communications to be highly
114  reliable.
116  The usage of RTP over TCP is particularly useful when combined with
117  Traversal Using Relays around NAT (TURN) [I-D.ietf-behave-turn]. In
118  this case, one of the agents would connect to its TURN server using
119  TCP, and obtain a TCP-based relayed candidate. It would offer this to
120  its peer agent as a candidate. The answerer would initiate a TCP
121  connection towards the TURN server. When that connection is
122  established, media can flow over the connections, through the TURN
123  server. The benefit of this usage is that it only requires the agents
124  to make outbound TCP connections to a server on the public network.
125  This kind of operation is broadly interoperable through NAT and
126  firewall devices. Since it is a goal of ICE and this extension to
127  provide highly reliable communications that "just work" in as broad a
128  set of network deployments as possible, this use case is particularly
129  important.

131  This specification extends ICE by defining its usage with TCP
132  candidates. It also defines how ICE can be used with RTP and SRTP to
133  provide both TCP and UDP candidates. This specification does so by
134  following the outline of ICE itself, and calling out the additions
135  and changes necessary in each section of ICE to support TCP
136  candidates.

138  2. Overview of Operation

140  The usage of ICE with TCP is relatively straightforward. The main
141  area of specification is around how and when connections are opened,
142  and how those connections relate to candidate pairs.

144  When the agents perform address allocations to gather TCP-based
145  candidates, three types of candidates can be obtained. These are
146  active candidates, passive candidates, and simultaneous-open
147  candidates. An active candidate is one for which the agent will
148  attempt to open an outbound connection, but will not receive incoming
149  connection requests. A passive candidate is one for which the agent
150  will receive incoming connection attempts, but not attempt a
151  connection. A simultaneous-open candidate is one for which the agent
152  will attempt to open a connection simultaneously with its peer.

154  Unlike UDP, there is no lite implementation defined for TCP.
155  Instead, an implementation that meets the criteria for a lite
156  implementation as discussed in Appendix A of [I-D.ietf-mmusic-ice]
157  can just use the mechanisms defined in [RFC4145], with constraints
158  defined here on selection of attribute values.

160  When gathering candidates from a host interface, the agent typically
161  obtains active, passive, and simultaneous-open candidates.
162  Similarly, communications with a STUN server will provide server
163  reflexive and relayed versions of all three types. Connections to
164  the STUN server are kept open during ICE processing.

166  When encoding these candidates into offers and answers, the type of
167  the candidate is signaled. In the case of active candidates, an IP
168  address and port are present, but they are meaningless, as they are
169  ignored by the peer. As a consequence, active candidates do not need
170  to be physically allocated at the time of address gathering. Rather,
171  the physical allocations, which occur as a consequence of a
172  connection attempt, occur at the time of the connectivity checks.

174  When the candidates are paired together, active candidates are always
175  paired with passive, and simultaneous-open candidates with each
176  other. When a connectivity check is to be made on a candidate pair,
177  each agent determines whether it is to make a connection attempt for
178  this pair.

180     Why have both active and simultaneous-open candidates? Why not
181     just simultaneous-open? The reason is that NAT treatment of
182     simultaneous opens is currently not well defined, though
183     specifications are being developed to address this
184     [I-D.ietf-behave-tcp]. Some NATs block the second TCP SYN packet
185     or improperly process the subsequent SYNACK, which will cause the
186     connection attempt to fail. Therefore, if only simultaneous opens
187     are used, connections may often fail. Alternatively, using
188     unidirectional opens (where one side is active and the other is
189     passive) is more reliable, but will always require a relay if both
190     sides are behind NAT. Therefore, in the spirit of the ICE
191     philosophy, both are tried. Simultaneous opens are preferred
192     since, if they do work, they will not require a relay even when
193     both sides are behind different NATs.

195  The actual processes of generating connectivity checks, managing the
196  state of the check list, and updating the Valid list work
197  identically for TCP as they do for UDP.

199  ICE requires an agent to demultiplex STUN and application layer
200  traffic, since they appear on the same port. This demultiplexing is
201  described by ICE, and is done using the magic cookie and other fields
202  of the message. Stream-oriented transports introduce another
203  wrinkle, since they require a way to frame the connection so that the
204  application and STUN packets can be extracted in order to determine
205  which is which. For this reason, TCP media streams utilizing ICE use
206  the basic framing provided in RFC 4571 [RFC4571], even if the
207  application layer protocol is not RTP.

209  When TLS (for non-RTP traffic) or DTLS (for RTP traffic) is in use,
210  it runs over the RFC 4571 framing shim, so that STUN runs outside of
211  the D/TLS connection (D/TLS is shorthand for TLS or DTLS).
212  Pictorially:

214                        +----------+
215                        |          |
216                        |    App   |
217             +----------+----------+
218             |          |          |
219             |   STUN   |  D/TLS   |
220             +----------+----------+
221             |                     |
222             |      RFC 4571       |
223             +---------------------+
224             |                     |
225             |         TCP         |
226             +---------------------+
227             |                     |
228             |         IP          |
229             +---------------------+

231                   Figure 1: ICE TCP Stack

233  The implication of this is that, for any media stream protected by
234  D/TLS, the agent will first run ICE procedures, exchanging STUN
235  messages. Then, once ICE completes, D/TLS procedures begin. ICE and
236  D/TLS are thus "peers" in the protocol stack. The STUN messages are
237  not sent over the D/TLS connection, even ones sent for the purposes
238  of keepalive in the middle of the media session.

240  When an updated offer is generated by the controlling endpoint, the
241  SDP extensions for connection oriented media [RFC4145] are used to
242  signal that an existing connection should be used, rather than
243  opening a new one.

245  3. Sending the Initial Offer

247  If an offerer meets the criteria for lite as defined in Appendix A of
248  [I-D.ietf-mmusic-ice], it omits any ICE attributes for its TCP-based
249  media streams. Instead, the offerer follows the procedures defined
250  in [RFC4145] for constructing the offer. However, the offerer MUST
251  use a setup attribute of "actpass" for those streams.

253  For offerers making use of ICE for TCP streams, the procedures below
254  are used.

256  3.1. Gathering Candidates

258  For each TCP capable media stream the agent wishes to use (including
259  ones, like RTP, which can either be UDP or TCP), the agent SHOULD
260  obtain two host candidates (each on a different port) for each
261  component of the media stream on each interface that the host has -
262  one for the simultaneous open, and one for the passive candidate. If
263  an agent is not capable of acting in one of these modes it would omit
264  those candidates.
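As an illustration only (it is not part of the draft), the host-candidate rule in the paragraph above can be sketched in Python. The names HostCandidate, gather_host_candidates, alloc_port, and supported_modes are hypothetical. The sketch assumes the caller supplies a function that reserves a distinct local TCP port on a given interface address, for example by binding a socket to port 0 and keeping it open.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class HostCandidate:
    """Hypothetical record for one host TCP candidate."""
    address: str    # IP address of the local interface
    port: int       # locally reserved TCP port
    component: int  # ICE component ID (for RTP streams, 1 = RTP, 2 = RTCP)
    transport: str  # "tcp-so" or "tcp-pass"


def gather_host_candidates(
    interfaces: Iterable[str],
    components: Iterable[int],
    alloc_port: Callable[[str], int],
    supported_modes: Iterable[str] = ("tcp-so", "tcp-pass"),
) -> List[HostCandidate]:
    """One candidate per interface, component, and supported mode.

    alloc_port is an assumed helper that must return a different port
    on every call, since each candidate sits on its own port. A mode
    the agent cannot act in is simply left out of supported_modes.
    """
    modes = tuple(supported_modes)
    component_ids = tuple(components)
    candidates = []
    for address in interfaces:
        for component in component_ids:
            for mode in modes:
                port = alloc_port(address)
                candidates.append(
                    HostCandidate(address, port, component, mode))
    return candidates
```

With two interfaces and two components, this yields eight host candidates. Active candidates are deliberately absent, since they do not need an allocated port at gathering time.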
266  Providers of real-time communications services may decide that it is
267  preferable to have no media at all than it is to have media over TCP.
268  To allow for choice, it is RECOMMENDED that agents be configurable
269  with whether they obtain TCP candidates for real time media.

271     Having it be configurable, and then configuring it to be off, is
272     far better than not having the capability at all. An important
273     goal of this specification is to provide a single mechanism that
274     can be used across all types of endpoints. As such, it is
275     preferable to account for provider and network variation through
276     configuration, instead of hard-coded limitations in an
277     implementation. Furthermore, network characteristics and
278     connectivity assumptions can, and will, change over time. Just
279     because an agent is communicating with a server on the public
280     network today, doesn't mean that it won't need to communicate with
281     one behind a NAT tomorrow. Just because an agent is behind a NAT
282     with endpoint independent mapping today, doesn't mean that tomorrow
283     they won't pick up their agent and take it to a public network
284     access point where there is a NAT with address and port dependent
285     mapping properties, or one that only allows outbound TCP. The way
286     to handle these cases and build a reliable system is for agents to
287     implement a diverse set of techniques for allocating addresses, so
288     that at least one of them is almost certainly going to work in any
289     situation. Implementors should consider very carefully any
290     assumptions that they make about deployments before electing not
291     to implement one of the mechanisms for address allocation. In
292     particular, implementors should consider whether the elements in
293     the system may be mobile, and connect through different networks
294     with different connectivity. They should also consider whether
295     endpoints which are under their control, in terms of location and
296     network connectivity, would always be under their control. In
297     environments where mobility and user control are possible, a
298     multiplicity of techniques is essential for reliability.

300  Each agent SHOULD "obtain" an active host candidate for each
301  component of each TCP capable media stream on each interface that the
302  host has. The agent does not have to actually allocate a port for
303  these candidates. These candidates serve as a placeholder for the
304  creation of the check lists.

306  Next, the agent SHOULD take all host TCP candidates for a component
307  that have the same foundation (there will typically be two - a
308  passive and a simultaneous-open), and amongst them, pick two
309  arbitrarily. These two host candidates will be used to obtain
310  relayed and server reflexive candidates. To do that, the agent
311  initiates a TCP connection from each candidate to the TURN server
312  (resulting in two TCP connections). On each connection, it issues an
313  Allocate request. One of the resulting relayed candidates is used as
314  a passive relayed candidate, and the other, as a simultaneous-open
315  relayed candidate. In addition, the Allocate responses will provide
316  the agent with a server reflexive candidate for their corresponding
317  host candidate.

319  For all of the remaining host candidates, if any, the agent only
320  needs to obtain server reflexive candidates. To do that, it
321  initiates a TCP connection from each host candidate to a STUN server,
322  and uses a Binding request over that connection to learn the server
323  reflexive candidate corresponding to that host candidate.

325  Once the Allocate or Binding request has completed, the agent MUST
326  keep the TCP connection open until ICE processing has completed. See
327  Appendix 1 for important implementation guidelines.
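The selection of host candidates for Allocate and Binding requests described above can be sketched as follows. This is a hedged illustration, not text from the draft: Host and plan_requests are hypothetical names, and the sketch assumes that active placeholder candidates are skipped because no port has been allocated for them.

```python
from collections import defaultdict
from typing import List, NamedTuple, Optional, Tuple


class Host(NamedTuple):
    """Hypothetical host TCP candidate record."""
    foundation: str
    component: int
    transport: str  # "tcp-pass", "tcp-so", or "tcp-act"
    address: str
    port: int


Plan = Tuple[Host, str, Optional[str]]


def plan_requests(hosts: List[Host]) -> List[Plan]:
    """Decide which request each host candidate issues.

    Per component and foundation, two candidates (here simply the
    first two, since the choice is arbitrary) send a TURN Allocate;
    one resulting relayed candidate is used as passive, the other as
    simultaneous-open. Any remaining candidates only send a STUN
    Binding request to learn a server reflexive candidate.
    """
    groups = defaultdict(list)
    for host in hosts:
        if host.transport in ("tcp-pass", "tcp-so"):
            groups[(host.component, host.foundation)].append(host)

    relayed_roles = ("tcp-pass", "tcp-so")
    plan: List[Plan] = []
    for group in groups.values():
        for index, host in enumerate(group):
            if index < 2:
                plan.append((host, "Allocate", relayed_roles[index]))
            else:
                plan.append((host, "Binding", None))
    return plan
```

Each entry in the returned plan corresponds to one TCP connection that stays open until ICE processing completes.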
329  If a media stream is UDP-based (such as RTP), an agent MAY use an
330  additional host TCP candidate to request a UDP-based candidate from a
331  TURN server. Usage of the UDP candidate from the TURN server follows
332  the procedures defined in ICE for UDP candidates.

334  Each agent SHOULD "obtain" an active relayed candidate for each
335  component of each TCP capable media stream on each interface that the
336  host has. The agent does not have to actually allocate a port for
337  these candidates from the relay at this time. These candidates serve
338  as a placeholder for the creation of the check lists.

340  Like their UDP counterparts, TCP-based STUN transactions are paced
341  out at one every Ta seconds. This pacing refers strictly to STUN
342  transactions (both Binding and Allocate requests). If performance of
343  the transaction requires establishment of a TCP connection, then the
344  connection gets opened when the transaction is performed.

346  3.2. Prioritization

348  The transport protocol itself is a criterion for choosing one
349  candidate over another. If a particular media stream can run over
350  UDP or TCP, the UDP candidates might be preferred over the TCP
351  candidates. This allows ICE to use the lower latency UDP
352  connectivity if it exists, but fall back to TCP if UDP doesn't work.

354  To accomplish this, the local preference SHOULD be defined as:

356     local-preference = (2^12)*(transport-pref) +
357                        (2^9)*(direction-pref) +
358                        (2^0)*(other-pref)

360  Transport-pref is the relative preference for candidates with this
361  particular transport protocol (UDP or TCP), and direction-pref is the
362  preference for candidates with this particular establishment
363  directionality (active, passive, or simultaneous-open). Other-pref
364  is used as a differentiator when two candidates would otherwise have
365  identical local preferences.

367  Transport-pref MUST be between 0 and 15, with 15 being the most
368  preferred. Direction-pref MUST be between 0 and 7, with 7 being the
369  most preferred. Other-pref MUST be between 0 and 511, with 511 being
370  the most preferred. For RTP-based media streams, it is RECOMMENDED
371  that UDP have a transport-pref of 15 and TCP of 6. It is RECOMMENDED
372  that, for all connection-oriented media, simultaneous-open candidates
373  have a direction-pref of 7, active of 5 and passive of 2. If any two
374  candidates have the same type-preference, transport-pref, and
375  direction-pref, they MUST have a unique other-pref. With this
376  specification, the only way that can happen is with multi-homed
377  hosts, in which case other-pref is a preference amongst interfaces.

379  3.3. Choosing Default Candidates

381  The default candidate is chosen primarily based on the likelihood of
382  it working with a non-ICE peer. When media streams supporting mixed
383  modes (both TCP and UDP) are used with ICE, it is RECOMMENDED that,
384  for real-time streams (such as RTP), the default candidates be UDP-
385  based. However, the default SHOULD NOT be the simultaneous-open
386  candidate.

388  If a media stream is inherently TCP-based, the agent MUST select the
389  active candidate as default. This ensures proper directionality of
390  connection establishment for NAT traversal with non-ICE
391  implementations.

393  3.4. Encoding the SDP

395  TCP-based candidates are encoded into a=candidate lines identically
396  to the UDP encoding described in [I-D.ietf-mmusic-ice]. However, the
397  transport protocol is set to "tcp-so" for TCP simultaneous-open
398  candidates, "tcp-act" for TCP active candidates, and "tcp-pass" for
399  TCP passive candidates. The addr and port encoded into the candidate
400  attribute for active candidates MUST be set to the IP address that
401  will be used for the attempt, but the port MUST be set to 9 (i.e.,
402  Discard). For active relayed candidates, the value for addr must be
403  identical to the IP address of a passive or simultaneous-open
404  candidate from the same TURN server.

406  If the default candidate is TCP, the agent MUST include the a=setup
407  and a=connection attributes from RFC 4145 [RFC4145], following the
408  procedures defined there as if ICE were not in use. In particular, if
409  an agent is the answerer, the a=setup attribute MUST meet the
410  constraints in RFC 4145 based on the value in the offer. Since an
411  ICE-tcp offerer always uses the active candidate as default, an ICE-
412  tcp answerer will always use the passive attribute as default and
413  include the a=setup:passive attribute in the answer.

415  If an agent is utilizing SRTP [RFC3711], it MAY include a mix of UDP
416  and TCP candidates. If ICE selects a TCP candidate pair, the agent
417  MUST still utilize SRTP, but run it over the connection established
418  by ICE. The alternative, RTP over TLS, MUST NOT be used. This allows
419  for the higher layer protocols (the security handshakes and media
420  transport) to be independent of the underlying transport protocol.
421  In the case of DTLS-SRTP [I-D.ietf-avt-dtls-srtp], the directionality
422  attributes (a=setup) are utilized strictly to determine the direction
423  of the DTLS handshake. Directionality of the TCP connection
424  establishment is determined by the ICE attributes and procedures
425  defined here.

427  If an agent is securing non-RTP media over TCP/TLS, the SDP MUST be
428  constructed as described in RFC 4572 [RFC4572]. The directionality
429  attributes (a=setup) are utilized strictly to determine the direction
430  of the TLS handshake. Directionality of the TCP connection
431  establishment is determined by the ICE attributes and procedures
432  defined here.

434  4. Receiving the Initial Offer

436  4.1. Verifying ICE Support

438  Since this specification does not define a lite mode for ICE-tcp, a
439  lite implementation will include candidate attributes for its UDP
440  streams, but no such attributes for its TCP streams. An agent
441  receiving such an offer MUST proceed with ICE in this case. ICE will
442  be used for the UDP streams, and [RFC4145] procedures will be used
443  for the TCP streams. However, if the offer indicates a setup
444  direction of actpass, the answerer MUST utilize a=setup:active in the
445  answer. This is required to ensure proper directionality of
446  connection establishment to work through NAT.

448  Similarly, if an agent is lite, and receives an offer that includes
449  streams with TCP candidates, it will omit candidates from the answer
450  for those streams. This will cause [RFC4145] procedures to be used
451  for those streams. In this case, the offer will indicate a direction
452  of active, and the agent will use passive in its answer.

454  4.2. Forming the Check Lists

456  When forming candidate pairs, the following types of candidates can
457  be paired with each other:

459     Local           Remote
460     Candidate       Candidate
461     ----------------------------
462     tcp-so          tcp-so
463     tcp-act         tcp-pass
464     tcp-pass        tcp-act

466  When the agent prunes the check list, it MUST also remove any pair
467  for which the local candidate is tcp-pass.

469  The remainder of check list processing works like the UDP case.

471  5. Connectivity Checks

473  5.1. STUN Client Procedures

475  5.1.1. Sending the Request

477  When an agent wants to send a TCP-based connectivity check, it first
478  opens a TCP connection if none yet exists for the 5-tuple defined by
479  the candidate pair for which the check is to be sent. This
480  connection is opened from the local candidate of the pair to the
481  remote candidate of the pair. If the local candidate is tcp-act, the
482  agent MUST open a connection from the interface associated with that
483  local candidate. This connection MUST be opened from an unallocated
484  port. For host candidates, this is readily done by connecting from
485  the candidate's interface. For relayed candidates, the agent uses the
486  procedures in [I-D.ietf-behave-turn] to initiate a new connection
487  from the specified interface on the TURN server.

489  Once the connection is established, the agent MUST utilize the shim
490  defined in RFC 4571 [RFC4571] for the duration this connection
491  remains open. The STUN Binding requests and responses are sent on top
492  of this shim, so that the length field defined in RFC 4571 precedes
493  each STUN message. If TLS or DTLS-SRTP is to be utilized for the
494  media session, the TLS or DTLS-SRTP handshakes will take place on top
495  of this shim as well. However, they only start once ICE processing
496  has completed. In essence, the TLS or DTLS-SRTP handshakes are
497  considered a part of the media protocol. STUN is never run within
498  the TLS or DTLS-SRTP session.

500  If the TCP connection cannot be established, the check is considered
501  to have failed, and a full-mode agent MUST update the pair state to
502  Failed in the check list.

504  Once the connection is established, client procedures are identical
505  to those for UDP candidates. Note that STUN responses received on an
506  active TCP candidate will typically produce a remote peer reflexive
507  candidate.

509  5.2. STUN Server Procedures

511  An agent MUST be prepared to receive incoming TCP connection requests
512  on any host or relayed TCP candidate that is simultaneous-open or
513  passive. When the connection request is received, the agent MUST
514  accept it. The agent MUST utilize the framing defined in RFC 4571
515  [RFC4571] for the lifetime of this connection. Due to this framing,
516  the agent will receive data in discrete frames. Each frame could be
517  media (such as RTP or SRTP), TLS, DTLS, or STUN packets. The STUN
518  packets are extracted as described in Section 8.2.
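To make the framing concrete, here is a minimal Python sketch of the RFC 4571 shim (a 16-bit big-endian length followed by the payload) together with a rough STUN check on each extracted frame. The function names frame, deframe, and looks_like_stun are illustrative assumptions. The check covers only the two leading zero bits, the magic cookie, and the length field; it omits the fingerprint check and the DTLS/RTP demultiplexing that a full implementation would also need.

```python
import struct
from typing import List

MAGIC_COOKIE = 0x2112A442  # fixed value in bytes 4..7 of a STUN header
STUN_HEADER_LEN = 20


def frame(payload: bytes) -> bytes:
    """Prefix a payload with the RFC 4571 16-bit length field."""
    if len(payload) > 0xFFFF:
        raise ValueError("payload does not fit in one RFC 4571 frame")
    return struct.pack("!H", len(payload)) + payload


def deframe(buffer: bytearray) -> List[bytes]:
    """Remove and return every complete frame at the head of buffer.

    Bytes of a trailing partial frame stay in buffer until more data
    arrives from the TCP connection.
    """
    frames = []
    while len(buffer) >= 2:
        (length,) = struct.unpack_from("!H", buffer, 0)
        if len(buffer) < 2 + length:
            break
        frames.append(bytes(buffer[2:2 + length]))
        del buffer[:2 + length]
    return frames


def looks_like_stun(payload: bytes) -> bool:
    """Partial STUN test: leading bits, magic cookie, and length."""
    if len(payload) < STUN_HEADER_LEN:
        return False
    if payload[0] & 0xC0:  # the two most significant bits must be zero
        return False
    msg_len, cookie = struct.unpack_from("!HI", payload, 2)
    return (cookie == MAGIC_COOKIE
            and msg_len % 4 == 0
            and msg_len == len(payload) - STUN_HEADER_LEN)
```

A receiver would append each chunk read from the socket to a bytearray, call deframe on it, and route frames for which looks_like_stun is true to the STUN agent and all others to the media or D/TLS layer.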
520  Once the connection is established, STUN server procedures are
521  identical to those for UDP candidates. Note that STUN requests
522  received on a passive TCP candidate will typically produce a remote
523  peer reflexive candidate.

525  6. Concluding ICE Processing

527  If there are TCP candidates for a media stream, a controlling agent
528  MUST use a regular selection algorithm.

530  When ICE processing for a media stream completes, each agent SHOULD
531  close all TCP connections except the one between the candidate pairs
532  selected by ICE.

534     These two rules are related; the closure of connections on
535     completion of ICE implies that a regular selection algorithm has
536     to be used. This is because aggressive selection might cause
537     transient pairs to be selected. Once such a pair was selected,
538     the agents would close the other connections, one of which may be
539     about to be selected as a better choice. This race condition may
540     result in TCP connections being accidentally closed for the pair
541     that ICE selects.

543  7. Subsequent Offer/Answer Exchanges

545  7.1. ICE Restarts

547  If an ICE restart occurs for a media stream with TCP candidate pairs
548  that have been selected by ICE, the agents MUST NOT close the
549  connections after the restart. In the offer or answer that causes
550  the restart, an agent MAY include a simultaneous-open candidate whose
551  transport address matches the previously selected candidate. If both
552  agents do this, the result will be a simultaneous-open candidate pair
553  matching an existing TCP connection. In this case, the agents MUST
554  NOT attempt to open a new connection (or start new TLS or DTLS-SRTP
555  procedures). Instead, that existing connection is reused and STUN
556  checks are performed.

558  Once the restart completes, if the selected pair does not match the
559  previously selected pair, the TCP connection for the previously
560  selected pair SHOULD be closed by the agent.

562  8. Media Handling

564  8.1. Sending Media

566  When sending media, if the selected candidate pair matches an
567  existing TCP connection, that connection MUST be used for sending
568  media.

570  The framing defined in RFC 4571 MUST be used when sending media. For
571  media streams that are not RTP-based and do not normally use RFC
572  4571, the agent treats the media stream as a byte stream, and assumes
573  that it has its own framing of some sort. It then takes an arbitrary
574  number of bytes from the bytestream, and places that as a payload in
575  the RFC 4571 frames, including the length. Next, the sender checks
576  to see if the resulting set of bytes would be viewed as a STUN packet
577  based on the rules in sections 6 and 8 of
578  [I-D.ietf-behave-rfc3489bis]. This includes a check on the most
579  significant two bits, the magic cookie, the length, and the
580  fingerprint. If, based on those rules, the bytes would be viewed as
581  a STUN message, the sender SHOULD utilize a different number of bytes
582  so that the length checks will fail. Though it is normally highly
583  unlikely that an arbitrary number of bytes from a bytestream would
584  resemble a STUN packet based on all of the checks, it can happen if
585  the content of the application stream happens to contain a STUN
586  message (for example, a file transfer of logs from a client which
587  includes STUN messages).

589  If TLS or DTLS-SRTP procedures are being utilized to protect the
590  media stream, those procedures start at the point that media is
591  permitted to flow, as defined in the ICE specification
592  [I-D.ietf-mmusic-ice]. The TLS or DTLS-SRTP handshakes occur on top
593  of the RFC 4571 shim, and are considered part of the media stream for
594  purposes of this specification.

596  8.2. Receiving Media

598  The framing defined in RFC 4571 MUST be used when receiving media.
599 For media streams that are not RTP-based and do not normally use RFC
600 4571, the agent extracts the payload of each RFC 4571 frame, and
601 determines whether it is STUN or application-layer data based on the
602 procedures in ICE [I-D.ietf-mmusic-ice]. If media is being protected
603 with DTLS-SRTP, the DTLS, RTP and STUN packets are demultiplexed as
604 described in Section 3.6.2 of [I-D.ietf-avt-dtls-srtp].

606 For non-STUN data, the agent appends it to the ongoing bytestream
607 collected from the frames. It then parses the bytestream as if it
608 had been directly received over the TCP connection. This allows for
609 ICE-tcp to work without regard to the framing mechanism used by the
610 application-layer protocol.

612 9. Connection Management

614 9.1. Connections Formed During Connectivity Checks

616 Once a TCP or TCP/TLS connection is opened by ICE for the purpose of
617 connectivity checks, its lifecycle depends on how it is used. If
618 that candidate pair is selected by ICE for usage for media, an agent
619 SHOULD keep the connection open until:

621 o The session terminates

623 o The media stream is removed

625 o An ICE restart takes place, resulting in the selection of a
626   different candidate pair.

628 In these cases, the agent SHOULD close the connection when that event
629 occurs. This applies to both agents in a session, in which case
630 usually one of the agents will end up closing the connection first.

632 If a connection has been selected by ICE, an agent MAY close it
633 anyway. As described in the next paragraph, this will cause it to be
634 reopened almost immediately, and in the interim media cannot be sent.
635 Consequently, such closures have a negative effect and are NOT
636 RECOMMENDED. However, there may be cases where an agent needs to
637 close a connection for some reason.
639 If an agent needs to send media on the selected candidate pair, and
640 its TCP connection has closed, either on purpose or due to some
641 error, then:

643 o If the agent's local candidate is tcp-act or tcp-so, it MUST
644   reopen a connection to the remote candidate of the selected pair.

646 o If the agent's local candidate is tcp-pass, the agent MUST await
647   an incoming connection request, and consequently, will not be able
648   to send media until the connection has been opened.

650 If the TCP connection is established, the framing of RFC 4571 is
651 utilized. If the agent opened the connection, it MUST send a STUN
652 connectivity check. An agent MUST be prepared to receive a
653 connectivity check over a connection it opened or accepted (note that
654 this is true in general; ICE requires that an agent be prepared to
655 receive a connectivity check at any time, even after ICE processing
656 completes). If an agent receives a connectivity check after
657 re-establishment of the connection, it MUST generate a triggered check
658 over that connection in response if it has not already sent a check.
659 Once an agent has sent a check and received a successful response,
660 the connection is considered Valid and media can be sent (which
661 includes a TLS or DTLS-SRTP session resumption or restart).

663 If the TCP connection cannot be established, the controlling agent
664 SHOULD restart ICE for this media stream. This will happen in cases
665 where one of the agents is behind a NAT with connection-dependent
666 mapping properties [I-D.ietf-behave-tcp].

668 9.2. Connections Formed for Gathering Candidates

670 If the agent opened a connection to a STUN server for the purposes of
671 gathering a server reflexive candidate, that connection SHOULD be
672 closed by the client once ICE processing has completed. This happens
673 regardless of whether the candidate learned from the STUN server
674 was selected by ICE.
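
The closure rules of this section (the STUN rule above and the TURN
rules that follow) reduce to a small decision, sketched below in
Python. The GatheringConnection record and its field names are
hypothetical bookkeeping assumed for illustration; an implementation
would derive the same facts from its own candidate tables.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GatheringConnection:
    """Hypothetical record of a TCP connection opened while gathering candidates."""
    server_kind: str                          # "stun" or "turn"
    relayed_candidate_selected: bool = False  # relayed candidate from this server was selected
    connect_candidate_selected: bool = False  # active candidate set up via a Connect request was selected


def keep_open_after_ice(conn: GatheringConnection) -> bool:
    """Return True if the connection is to stay open once ICE processing completes."""
    if conn.server_kind == "stun":
        # Closed regardless of whether the learned candidate was selected.
        return False
    if conn.server_kind == "turn":
        # Kept only while the selected pair depends on this TURN connection.
        return conn.relayed_candidate_selected or conn.connect_candidate_selected
    raise ValueError("unknown server kind: " + conn.server_kind)
```

A True result corresponds to the MUST-keep-open case for TURN; a
False result corresponds to the SHOULD-close cases.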
676 If the agent opened a connection to a TURN server for the purposes of
677 gathering a relayed candidate, that connection MUST be kept open by
678 the client for the duration of the media session if:

680 o A relayed candidate learned from the TURN server was selected by
681   ICE,

683 o or an active candidate established as a consequence of a Connect
684   request sent through that TCP connection was selected by ICE.

686 Otherwise, the connection to the TURN server SHOULD be closed once
687 ICE processing completes.

689 If, despite efforts of the client, a TCP connection to a TURN server
690 fails during the lifetime of the media session utilizing a transport
691 address allocated by that server, the client SHOULD reconnect to the
692 TURN server, obtain a new allocation, and restart ICE for that media
693 stream.

695 10. Security Considerations

697 The main threat in ICE is hijacking of connections for the purposes
698 of directing media streams to DoS targets or to malicious users.
699 ICE-tcp prevents that by only using TCP connections that have been
700 validated. Validation requires a STUN transaction to take place over
701 the connection. This transaction cannot complete without both
702 participants knowing a shared secret exchanged in the rendezvous
703 protocol used with ICE, such as SIP. This shared secret, in turn, is
704 protected by that protocol exchange. In the case of SIP, the usage
705 of the sips mechanism is RECOMMENDED. When this is done, an
706 attacker, even if it knows or can guess the port on which an agent is
707 listening for incoming TCP connections, will not be able to open a
708 connection and send media to the agent.

710 A more detailed analysis of this attack and the various ways ICE
711 prevents it is given in [I-D.ietf-mmusic-ice]. Those
712 considerations apply to this specification.

714 11. IANA Considerations

716 There are no IANA considerations associated with this specification.

718 12. Acknowledgements

720 The authors would like to thank Tim Moore, Saikat Guha, Francois
721 Audet and Roni Even for the reviews and input on this document.

723 13. References

725 13.1. Normative References

727 [I-D.ietf-behave-rfc3489bis]
728     Rosenberg, J., Mahy, R., Matthews, P., and D. Wing,
729     "Session Traversal Utilities for NAT (STUN)",
730     draft-ietf-behave-rfc3489bis-16 (work in progress),
731     July 2008.

733 [RFC3264] Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model
734     with Session Description Protocol (SDP)", RFC 3264,
735     June 2002.

737 [RFC4145] Yon, D. and G. Camarillo, "TCP-Based Media Transport in
738     the Session Description Protocol (SDP)", RFC 4145,
739     September 2005.

741 [RFC4571] Lazzaro, J., "Framing Real-time Transport Protocol (RTP)
742     and RTP Control Protocol (RTCP) Packets over
743     Connection-Oriented Transport", RFC 4571, July 2006.

745 [RFC4572] Lennox, J., "Connection-Oriented Media Transport over the
746     Transport Layer Security (TLS) Protocol in the Session
747     Description Protocol (SDP)", RFC 4572, July 2006.

749 [I-D.ietf-mmusic-ice]
750     Rosenberg, J., "Interactive Connectivity Establishment
751     (ICE): A Protocol for Network Address Translator (NAT)
752     Traversal for Offer/Answer Protocols",
753     draft-ietf-mmusic-ice-19 (work in progress), October 2007.

755 [I-D.ietf-avt-dtls-srtp]
756     McGrew, D. and E. Rescorla, "Datagram Transport Layer
757     Security (DTLS) Extension to Establish Keys for Secure
758     Real-time Transport Protocol (SRTP)",
759     draft-ietf-avt-dtls-srtp-02 (work in progress),
760     February 2008.

762 [I-D.ietf-behave-turn]
763     Rosenberg, J., Mahy, R., and P. Matthews, "Traversal Using
764     Relays around NAT (TURN): Relay Extensions to Session
765     Traversal Utilities for NAT (STUN)",
766     draft-ietf-behave-turn-08 (work in progress), June 2008.

768 [RFC3711] Baugher, M., McGrew, D., Naslund, M., Carrara, E., and K.
769     Norrman, "The Secure Real-time Transport Protocol (SRTP)",
770     RFC 3711, March 2004.

772 13.2. Informative References

774 [I-D.ietf-behave-tcp]
775     Guha, S., "NAT Behavioral Requirements for TCP",
776     draft-ietf-behave-tcp-07 (work in progress), April 2007.

778 [RFC4975] Campbell, B., Mahy, R., and C. Jennings, "The Message
779     Session Relay Protocol (MSRP)", RFC 4975, September 2007.

781 1. Implementation Considerations for BSD Sockets

783 This specification requires unusual handling of TCP connections, the
784 implementation of which in traditional BSD socket APIs is
785 non-trivial.

787 In particular, ICE requires an agent to obtain a local TCP candidate,
788 bound to a local IP and port, and then from that local port, initiate
789 a TCP connection (to the STUN server, in order to obtain server
790 reflexive candidates, to the TURN server, to obtain a relayed
791 candidate, or to the peer as part of a connectivity check), and be
792 prepared to receive incoming TCP connections (for passive and
793 simultaneous-open candidates). A "typical" BSD socket is used either
794 for initiating or receiving connections, and not for both. The code
795 required to allow incoming and outgoing connections on the same local
796 IP and port is non-obvious. The following pseudocode, contributed by
797 Saikat Guha, has been found to work on many platforms:

799 for i in 0 to MAX
800     sock_i = socket()
801     set(sock_i, SO_REUSEADDR)
802     bind(sock_i, local)

804 listen(sock_0)
805 connect(sock_1, stun)
806 connect(sock_2, remote_a)
807 connect(sock_3, remote_b)

809 The key here is that, prior to the listen() call, the full set of
810 sockets that need to be utilized for outgoing connections must be
811 allocated and bound to the local IP address and port. This number,
812 MAX, represents the maximum number of TCP connections to different
813 destinations that might need to be established from the same local
814 candidate. This number can potentially be large for
815 simultaneous-open candidates. If a request forks, ICE procedures may
816 take place with multiple peers. Furthermore, for each peer, connections
817 would need to be established to each passive or simultaneous-open candidate
818 for the same component. If we assume a worst case of 5 forked
819 branches, and for each peer, five simultaneous-open candidates, that
820 results in MAX=25. For a passive candidate, MAX is equal to the
821 number of STUN servers, since the agent only initiates TCP
822 connections on a passive candidate to its STUN server.

824 Author's Address

826 Jonathan Rosenberg
827 Cisco
828 Edison, NJ
829 US

831 Email: jdrosen@cisco.com
832 URI: http://www.jdrosen.net

834 Full Copyright Statement

836 Copyright (C) The IETF Trust (2008).

838 This document is subject to the rights, licenses and restrictions
839 contained in BCP 78, and except as set forth therein, the authors
840 retain all their rights.

842 This document and the information contained herein are provided on an
843 "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
844 OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
845 THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
846 OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
847 THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
848 WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

850 Intellectual Property

852 The IETF takes no position regarding the validity or scope of any
853 Intellectual Property Rights or other rights that might be claimed to
854 pertain to the implementation or use of the technology described in
855 this document or the extent to which any license under such rights
856 might or might not be available; nor does it represent that it has
857 made any independent effort to identify any such rights. Information
858 on the procedures with respect to rights in RFC documents can be
859 found in BCP 78 and BCP 79.
861 Copies of IPR disclosures made to the IETF Secretariat and any
862 assurances of licenses to be made available, or the result of an
863 attempt made to obtain a general license or permission for the use of
864 such proprietary rights by implementers or users of this
865 specification can be obtained from the IETF on-line IPR repository at
866 http://www.ietf.org/ipr.

868 The IETF invites any interested party to bring to its attention any
869 copyrights, patents or patent applications, or other proprietary
870 rights that may cover technology that may be required to implement
871 this standard. Please address the information to the IETF at
872 ietf-ipr@ietf.org.

874 Acknowledgment

876 Funding for the RFC Editor function is provided by the IETF
877 Administrative Support Activity (IASA).