MMUSIC J.
Rosenberg
Internet-Draft                                               dynamicsoft
Expires: January 17, 2005                                  July 19, 2004

    Interactive Connectivity Establishment (ICE): A Methodology for
   Network Address Translator (NAT) Traversal for Multimedia Session
                        Establishment Protocols
                        draft-ietf-mmusic-ice-02

Status of this Memo

   By submitting this Internet-Draft, I certify that any applicable
   patent or other IPR claims of which I am aware have been disclosed,
   and any of which I become aware will be disclosed, in accordance with
   RFC 3668.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on January 17, 2005.

Copyright Notice

   Copyright (C) The Internet Society (2004). All Rights Reserved.

Abstract

   This document describes a methodology for Network Address Translator
   (NAT) traversal for multimedia session signaling protocols, such as
   the Session Initiation Protocol (SIP). This methodology is called
   Interactive Connectivity Establishment (ICE). ICE makes use of
   existing protocols, such as Simple Traversal of UDP Through NAT
   (STUN) and Traversal Using Relay NAT (TURN). ICE makes use of STUN
   in peer-to-peer cooperative fashion, allowing participants to
   discover, create and verify mutual connectivity.

Table of Contents

   1.  Introduction
   2.  Multimedia Signaling Protocol Abstraction
   3.  Terminology
   4.  Overview of ICE
   5.  Detailed ICE Algorithm
     5.1 Initiator Processing
       5.1.1 Sending the Initiate Message
       5.1.2 Processing the Accept
     5.2 Responder Processing
       5.2.1 Processing the Initiate Message
     5.3 Common Procedures
       5.3.1 Gathering Transport Addresses
       5.3.2 Enabling STUN on Each Local Transport Address
       5.3.3 Prioritizing the Transport Addresses and Choosing
             a Default
       5.3.4 Sending STUN Connectivity Checks
       5.3.5 Receiving STUN Requests
   6.  Running STUN on Derived Transport Addresses
     6.1 STUN on a TURN Derived Transport Address
     6.2 STUN on a STUN Derived Transport Address
   7.  XML Schema for ICE Messages
   8.  Examples
     8.1 Port Restricted
   9.  Mapping ICE into SIP
   10. Security Considerations
   11. IANA Considerations
   12. IAB Considerations
     12.1 Problem Definition
     12.2 Exit Strategy
     12.3 Brittleness Introduced by ICE
     12.4 Requirements for a Long Term Solution
     12.5 Issues with Existing NAPT Boxes
   13. Acknowledgements
   14. References
     14.1 Normative References
     14.2 Informative References
       Author's Address
       Intellectual Property and Copyright Statements

1. Introduction

   A multimedia session signaling protocol is a protocol that exchanges
   control messages between a pair of agents for the purposes of
   establishing the flow of media traffic between them. This media flow
   is distinct from the flow of control messages, and may take a
   different path through the network. Examples of such protocols are
   the Session Initiation Protocol (SIP) [3], the Real Time Streaming
   Protocol (RTSP) [5] and the International Telecommunications Union
   (ITU) H.323.

   These protocols, by nature of their design, are difficult to operate
   through Network Address Translators (NAT). Because their purpose in
   life is to establish a flow of packets, they tend to carry IP
   addresses within their messages, which is known to be problematic
   through NAT [6]. The protocols also seek to create a media flow
   directly between participants, so that there is no application layer
   intermediary between them. This is done to reduce media latency,
   decrease packet loss, and reduce the operational costs of deploying
   the application. However, this is difficult to accomplish through
   NAT. A full treatment of the reasons for this is beyond the scope of
   this specification.

   Numerous solutions have been proposed for allowing these protocols to
   operate through NAT.
   These include Application Layer Gateways
   (ALGs), the Middlebox Control Protocol [7], Simple Traversal of UDP
   through NAT (STUN) [1], Traversal Using Relay NAT [16], Realm
   Specific IP [8][9], symmetric RTP [10], along with session
   description extensions needed to make them work, such as [2].
   Unfortunately, these techniques all have pros and cons which make
   each one optimal in some network topologies, but a poor choice in
   others. The result is that administrators and implementors are
   making assumptions about the topologies of the networks in which
   their solutions will be deployed. This introduces a lot of
   complexity and brittleness into the system. What is needed is a
   single solution which is flexible enough to work well in all
   situations.

   This specification provides that solution. It is called Interactive
   Connectivity Establishment, or ICE. ICE makes use of many of the
   protocols above, but uses them in a specific methodology which avoids
   many of the pitfalls of using any one alone. ICE uses STUN and TURN
   without extension, and allows for other similar protocols to be used
   as well. However, it does require additional signaling capabilities
   to be introduced into the multimedia session signaling protocols.
   For those protocols which make use of the Session Description
   Protocol (SDP), this specification defines the necessary extensions
   to it. Other protocols will need to define their own mechanisms.

2. Multimedia Signaling Protocol Abstraction

   This specification defines a general methodology that allows the
   media streams of multimedia signaling protocols to successfully
   traverse NAT. This methodology is independent of any particular
   signaling protocol. In order to discuss the methodology, we need
   to define an abstraction of a multimedia signaling system, and define
   terms that can be used throughout this specification. Figure 1 shows
   the abstraction.

                              +-----------+
                              |           |
                              |           |
                            > | Signaling |\
                           /  |   Relay   | \
                          /   |           |  \
                Initiate /    |           |   \  Initiate
                Message /   / +-----------+    \  Message
                       /   /                 <  \
                      /   /                   \  \
                     /   /                     \  \
                    /   /  Accept       Accept  \  \
                   /   /   Message      Message  \  >
                  /   /                           \
      +-----------+  /                             \  +-----------+
      |           | <                                 |           |
      |           |           Media Stream            |           |
      |  Session  | ................................ |  Session  |
      | Initiator |                                   | Responder |
      |           |           Media Stream            |           |
      |           | ................................ |           |
      +-----------+                                   +-----------+

                                Figure 1

   Communications occur between two clients - the session initiator and
   the session responder, also referred to as the initiator and
   responder. The initiator is the one that decides to engage in
   communications. To do so, it sends an initiate message. The
   initiate message contains parameters that describe the capabilities
   and configuration of media streams for the initiator. This message
   may travel through signaling intermediaries, called a signaling
   relay, before finally arriving at the session responder. Assuming
   the session responder wishes to communicate, it generates an accept
   message, which is relayed back to the initiator. This message
   contains capabilities and configuration of media streams for the
   responder. As a result, media streams are established between the
   initiator and responder. The signaling protocol may also support an
   operation that allows for termination of the communications session.
   We refer to this signaling message as a terminate message.

   This abstraction is readily mapped to SIP, RTSP, and H.323, amongst
   others. For SIP, the initiator is the User Agent Client (UAC), the
   responder is the User Agent Server (UAS), the initiate message is a
   SIP message containing an SDP offer (for example, an INVITE), the
   accept message is a SIP message containing an SDP answer (for
   example, a 200 OK), and the terminate message is a BYE.
   For RTSP, the initiator is the RTSP client, the responder is the RTSP
   server, the initiate message is a SETUP message, and the accept
   message is a SETUP response.

   This specification defines parameters that need to be included in
   these various signaling messages in order to implement the
   functionality described by ICE. Those parameters are represented in
   XML for convenience. Any multimedia signaling protocol that uses ICE
   will need to define how to map those parameters into its own protocol
   messages. Section 9 provides such a mapping for SIP.

3. Terminology

   Several new terms are introduced in this specification:

   Session Initiator: A software or hardware entity that, at the request
      of a user, tries to establish communications with another entity,
      called the session responder. A session initiator is also called
      an initiator.
   Initiator: Another term for a session initiator.
   Session Responder: A software or hardware entity that receives a
      request for establishment of communications from the session
      initiator, and either accepts or declines the request. A session
      responder is also called a responder.
   Responder: Another term for a session responder.
   Client: Either the initiator or responder.
   Peer: From the perspective of one of the clients in a session, its
      peer is the other client. Specifically, from the perspective of
      the initiator, the peer is the responder. From the perspective of
      the responder, the peer is the initiator.
   Signaling Relay: An intermediary of signaling messages. Examples are
      SIP proxies and H.323 Gatekeepers.
   Initiate Message: The signaling message used by an initiator to
      establish communications. It contains capabilities and other
      information needed by the responder to send media to the
      initiator.
   Accept Message: The signaling message used by a responder to agree to
      communications.
      It contains capabilities and other information needed by the
      initiator to send media to the responder.
   Terminate Message: The signaling message used by a client to
      terminate the session and associated media streams.
   Transport Address: The combination of an IP address and port.
   Local Transport Address: A local transport address is a transport
      address that has been allocated from the operating system on the
      host. This includes transport addresses obtained through Virtual
      Private Networks (VPNs) and transport addresses obtained through
      Realm Specific IP (RSIP) [8] (which lives at the operating system
      level). Transport addresses are typically obtained by binding to
      an interface.
   Derived Transport Address: A derived transport address is a transport
      address which is associated with, but different from, a local
      transport address. The derived transport address is associated
      with the local transport address in that packets sent to the
      derived transport address are received on the socket bound to that
      local transport address. Derived addresses are obtained using
      protocols like STUN and TURN, and more generally, any UNSAF
      protocol [11].
   Peer Derived Transport Address: A peer derived transport address is a
      derived transport address learned from a STUN server running
      within a peer in a media session.
   TURN Derived Transport Address: A derived transport address obtained
      from a TURN server.
   STUN Derived Transport Address: A derived transport address obtained
      from a STUN server whose address has been provisioned into the UA.
      This, by definition, excludes Peer Derived Transport Addresses.
   Unilateral Allocations: Queries made to a network server which
      provides an UNSAF service.
   Bilateral Allocations: Addresses obtained by using an UNSAF service
      that actually runs on the peer of the communications session.
      Peer derived transport addresses are synonymous with bilateral
      allocations.

4. Overview of ICE

   ICE makes the fundamental assumption that clients exist in a network
   of segmented connectivity. This segmentation is the result of a
   number of addressing realms in which a client can simultaneously be
   connected. We use "realms" here in the broadest sense. A realm is
   defined purely by connectivity. Two clients are in the same realm
   if, when they exchange the addresses each has in that realm, they are
   able to send packets to each other. This includes IPv6 and IPv4
   realms, which actually use different address spaces, in addition to
   private networks connected to the public Internet through NAT.

   The key assumption in ICE is that a client cannot know, a priori,
   which address realms it shares with any peer it may wish to
   communicate with. Therefore, in order to communicate, it has to try
   connecting to addresses in all of the realms.

   Before the initiator establishes a session, it obtains as many IP
   address and port combinations in as many address realms as it can.
   These addresses all represent potential points at which the initiator
   will receive a specific media stream. Any protocol that provides a
   client with an IP address and port on which it can receive traffic
   can be used. These include STUN, TURN, RSIP, and even a VPN. The
   client also uses any local interface addresses. A dual-stack v4/v6
   client will obtain both a v6 and a v4 address/port. The only
   requirement is that, across all of these addresses, the initiator can
   be certain that at least one of them will work for any responder it
   might communicate with. Unfortunately, if the initiator communicates
   with a peer that doesn't support ICE, only one address can be
   provided to that peer. As such, the client will need to choose one
   default address, which will be used by non-ICE clients.
   This would typically be a TURN derived transport address, as it is
   most likely to work with unknown non-ICE peers.

   The initiator then runs a STUN server on each of the local transport
   addresses it has obtained. The initiator will need to be able to
   demultiplex STUN messages and media messages received on that IP
   address and port, and process them appropriately. All of these
   addresses are placed into the initiate message, and they are ordered
   in terms of preference. Preference is a matter of local policy, but
   typically, lowest preference would be given to transport addresses
   learned from a TURN server (i.e., TURN derived transport addresses).
   The initiate message also conveys the STUN username and password
   which are required to gain access to the STUN server on each address/
   port combination.

   The initiate message is sent to the responder. This specification
   does not address the issue of how the signaling messages themselves
   traverse NAT. It is assumed that signaling protocol specific
   mechanisms are used for that purpose. The responder follows a
   similar process as the initiator followed; it obtains addresses from
   local interfaces, STUN servers, TURN servers, etc., and it places all
   of them into the accept message.

   Once the responder receives the initiate message, it has a set of
   potential addresses it can use to communicate with the initiator.
   The initiator will be running a STUN server at each address. The
   responder sends a STUN request to each address, in parallel. When
   the initiator receives these, it sends a STUN response. If the
   responder receives the STUN response, it knows that it can reach its
   peer at that address. It can then begin to send media to that
   address. As additional STUN responses arrive, the responder will
   learn about additional transport addresses which work.
   If one of those has a higher priority than the one currently in use,
   it starts sending media to that one instead. No additional control
   messages (i.e., SIP signaling) occur for this change.

   The STUN messages described above happen while the accept message is
   being sent to the initiator. Once the initiator receives the
   accept message, it too will have a set of potential addresses with
   which it can communicate to the responder. It follows exactly the
   same process described above.

   Furthermore, when either the initiator or responder receives a STUN
   request, it takes note of the source IP address and port of that
   request. It compares that transport address to the existing set of
   potential addresses. If it's not amongst them, it gets added as
   another potential address. The incoming STUN message provides the
   client with enough context to associate that transport address with a
   STUN username, STUN password, and priority, just as if it had been
   sent in an initiate or accept message. As such, the client begins
   sending STUN messages to it as well, and if those succeed, the
   address can be used if it has a higher priority.

   After a successful STUN transaction, the client will re-perform the
   STUN query periodically to revalidate connectivity. This allows for
   recovery from NAT failures, or from route flaps which may cause
   packets to suddenly traverse a different NAT. As such, the address
   used as the destination for media is the highest priority address to
   which connectivity currently exists.

5. Detailed ICE Algorithm

   This section describes the detailed processing needed for ICE.

5.1 Initiator Processing

5.1.1 Sending the Initiate Message

   When the initiator wishes to begin communications, it starts by
   gathering transport addresses, as described in Section 5.3.1, and
   starting a STUN server on each local transport address, as described
   in Section 5.3.2.
   This process can actually happen at any time before sending an
   initiate message. A client can pre-gather transport addresses, using
   a user interface cue (such as picking up the phone, or entry into an
   address book) as a hint that communications is imminent.

   When it comes time to initiate communications, it determines a
   priority for each one and identifies one as a default, as described
   in Section 5.3.3.

   The next step is to construct the initiate message. Section 7
   provides the XML schema for the initiate message. The message
   consists of a series of media streams. For each media stream, there
   is a default address and a list of alternates. The default address
   is the one that will be used by responders that don't understand ICE
   (for SIP, this is accomplished by mapping the default address into
   the m and c line in the SDP). The alternates represent addresses
   that the responder should also try. In SIP, these are conveyed with
   the new SDP alt parameter.

   The client then encodes all of its available transport addresses
   (including the default) as a series of alternate elements. Each
   alternate element conveys a transport address for RTP, one for RTCP,
   a STUN username fragment and STUN password. The client MUST assign
   each alternate a unique identifier. These identifiers MUST be unique
   across all alternates used within the session. This identifier is
   encoded in the "id" attribute of the alternate element. The priority
   for the transport address, as computed above, is included as an
   attribute as well.

   Once the initiate message is constructed, it is sent.

5.1.2 Processing the Accept

   There are two possible cases for processing of the Accept message.
   If the recipient of the Initiate message did not support ICE, the
   Accept message will only contain the default address information.
   As a result, the initiator knows that it cannot perform its
   connectivity checks. In this case, it SHOULD just send to the
   transport address listed. However, if local configuration
   information tells the initiator to try connectivity checks by sending
   them through the TURN server, this means that packets sent directly
   to the responder may be dropped by a local firewall. To deal with
   this, the initiator SHOULD issue a SEND command using this new
   transport address. The SEND command contains the media packet to
   send to the responder. Once this command has been accepted, the
   initiator SHOULD send all media packets to the TURN server, which
   will then forward them towards the responder.

   If the Accept message contains alternates, it implies that the
   responder supported ICE. In that case, the initiator takes each
   transport address, STUN username, STUN password and priority, and
   places them into a list, called the candidate list. It then begins
   processing the candidate list as described in Section 5.3.4. That
   processing associates a state with each transport address. As
   described there, once a successful STUN query is made to the STUN
   server at an address, the initiator can begin sending media to that
   address.

5.2 Responder Processing

5.2.1 Processing the Initiate Message

   Upon receipt of the initiate message, the client starts gathering
   transport addresses, as described in Section 5.3.1, and starts a STUN
   server on each local transport address, as described in Section
   5.3.2. This processing is done immediately on receipt of the
   request, to prepare for the case where the user should accept the
   call, or early media needs to be generated.

   At some point, the responder will decide to accept or reject the
   communications. A rejection terminates ICE processing, of course.
   In the case of acceptance, the accept message is constructed as
   follows.

   The client first determines a priority for each transport address it
   has gathered, and identifies one as a default, as described in
   Section 5.3.3.

   Constructing the accept proceeds identically to the way in which the
   initiate message is constructed (Section 5.1.1).

   The accept is then sent.

5.3 Common Procedures

   This section discusses procedures that are common between initiator
   and responder.

5.3.1 Gathering Transport Addresses

   A client gathers addresses when it believes that communications is
   imminent. For initiators, this occurs before sending an initiate
   message (Section 5.1.1). For responders, it occurs before sending an
   accept message (Section 5.2.1).

   There are two types of addresses a client can gather - local
   transport addresses, and derived transport addresses. Local
   transport addresses are obtained by binding to an ephemeral port on
   an interface (physical or virtual) on the host. A multi-homed host
   SHOULD attempt to bind on all interfaces for all media streams it
   wishes to receive. For media streams carried using the Real Time
   Transport Protocol (RTP) [12], the client will need to bind to an
   ephemeral port for both RTP and RTCP.

   The result will be a set of local transport addresses. The client
   may also have access to servers that provide unilateral self-address
   fixing (UNSAF) [11]. Examples of such protocols include STUN, TURN,
   and TEREDO [15]. All ICE implementations MUST implement STUN and
   TURN, but MAY, through configuration, disable the use of STUN or TURN
   for unilateral address allocation (STUN is mandatory for the
   connectivity checks described below). When disabled, it MUST be
   possible through user or administrator operation to re-enable. This
   allows all implementations to have the breadth of protocol support
   needed to work in all situations, with the flexibility to turn it off
   if it's not needed.
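
   As an informal illustration of the local-address step described
   above, the following Python sketch binds one UDP socket for RTP and
   one for RTCP to an ephemeral port on each supplied interface address.
   The function name gather_local_transport_addresses and the layout of
   the returned dictionaries are assumptions made for this example only;
   they are not defined by this specification. Discovering which
   interface addresses exist on the host is left to the caller.

```python
import socket


def gather_local_transport_addresses(local_ips):
    """Bind an RTP and an RTCP UDP socket to an ephemeral port per IP.

    Hypothetical helper for illustration. Returns one dict per IP with
    the open sockets and the (ip, port) pairs chosen by the OS.
    """
    gathered = []
    for ip in local_ips:
        # Pick the address family from the textual form of the address.
        family = socket.AF_INET6 if ":" in ip else socket.AF_INET
        pair = {}
        for component in ("rtp", "rtcp"):
            sock = socket.socket(family, socket.SOCK_DGRAM)
            # Port 0 asks the operating system for an ephemeral port.
            sock.bind((ip, 0))
            pair[component] = {
                "socket": sock,
                "address": sock.getsockname()[:2],
            }
        gathered.append(pair)
    return gathered


if __name__ == "__main__":
    for entry in gather_local_transport_addresses(["127.0.0.1"]):
        print("RTP ", entry["rtp"]["address"])
        print("RTCP", entry["rtcp"]["address"])
```

   The sockets are kept open on purpose: the same sockets would later
   receive both STUN requests and media, so the transport addresses
   remain valid only while they stay bound.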

   These protocols work by having the client send, from a specific local
   transport address, some kind of message to a server. The server
   provides to the client, in some kind of response, an additional
   transport address, called a derived transport address. This derived
   transport address is derived from the local transport address. Here,
   derivation means that a request sent to the derived transport address
   might (under good network conditions) reach the client on its local
   transport address.

   For each of these protocols, the client may have access to a
   multiplicity of servers. For example, a user connected to a natted
   cable access network might have access to a STUN server in the
   private cable network and in the public Internet. For each local
   transport address, the client SHOULD obtain an address from every
   server for each protocol it supports. The result of this will be a
   set of derived transport addresses, with each derived address
   associated with the local transport address it is derived from.

5.3.2 Enabling STUN on Each Local Transport Address

   Once the client has obtained a set of transport addresses, it starts
   a STUN server on each local transport address (including ones used
   for RTCP). This, by definition, means that the STUN service will be
   reached for requests sent to the derived addresses.

   However, the client does not need to provide STUN service on any
   other IP address or port, unlike the STUN usage described in [1].
   The need to run the service on multiple ports is to support the
   change flags. However, those flags are not needed with ICE, and the
   server SHOULD reject, with a 400 response, any STUN requests with
   these flags set.

   Furthermore, there is no need to support TLS or to be prepared to
   receive SharedSecret request messages. Those messages are used to
   obtain shared secrets to be used with BindingRequests.
However, with 524 ICE, usernames and passwords are exchanged in the signaling protocol. 526 The client will receive both STUN requests and media packets on each 527 local transport address. The client MUST be able to disambiguate 528 them. In the case of RTP/RTCP, this disambiguation is easy. RTP and 529 RTCP packets start with the bits 0b10 (v=2). The first two bits in 530 STUN are always 0b00. This disambiguation also works for packets 531 sent using Secure RTP [13], since the RTP header is in the clear. 532 Disambiguating STUN with other media stream protocols may be more 533 complicated. However, it can always be possible with arbitrarily 534 high probabilities by selecting an appropriately random username (see 535 below). 537 The need to run STUN on the same transport address as the media 538 stream represents the "ugliest" piece of ICE. However, it is an 539 essential part of the story. By sending STUN requests to the very 540 same place media is sent, any bindings learned through STUN will be 541 useful even when communicating through symmetric NATs. This results 542 in a substantial increase in the scope of applicability of STUN. 544 For each local transport address where a STUN server is running, the 545 client MUST choose a username fragment and a password. The username 546 fragment created by the client will be concatenated with the fragment 547 created by its peer. The result will serve as the username provided 548 by its peer in STUN requests. By creating the username as a 549 combination of information from each side of a call, it allows a 550 client to correlate the source of the request with a candidate 551 transport address. This is discussed further below. 553 The username fragment MUST be globally unique, so that no other host 554 will select a username with the same value. This username fragment 555 and password will be passed to its peer in an initiate or accept 556 message. 
As such, the process described in this section will 557 associate, with each local transport address, a username fragment and 558 password. The client also associates this same username fragment and 559 password with any transport addresses derived from the local 560 transport address. 562 The global uniqueness requirement stems from the lack of uniqueness 563 afforded by IP addresses. Consider clients A, B, and C. A and B are 564 within private enterprise 1, which is using 10.0.0.0/8. C is within 565 private enterprise 2, which is also using 10.0.0.0/8. As it turns 566 out, B and C both have IP address 10.0.1.1. A initiates 567 communications with C. C, in its accept message, provides A with its 568 transport addresses. In this case, those are 10.0.1.1:8866 and 8877. As 569 it turns out, B is in a session at that same time, and is also using 570 10.0.1.1:8866 and 8877. This means that B has a STUN server running 571 on those ports, just as C does. A will send STUN requests to 572 10.0.1.1:8866 and 8877. However, these do not go to C as expected. 573 Instead, they go to B. If B just replied to them, A would believe it 574 has connectivity to C, when in fact it has connectivity to a 575 completely different user, B. To fix this, the STUN username takes 576 on the role of a unique identifier. C provides A with a unique 577 username. A uses this username in its STUN query to 10.0.1.1:8866. 578 This STUN query arrives at B. However, the username is unknown to B, 579 and so the request is rejected. A treats the rejected STUN request 580 as if there were no connectivity to C (which is actually true). 581 Therefore, the error is avoided. 583 Once the STUN server is started, it MUST run continuously until the 584 session is completed.
While the server is running, it MUST act as a 585 normal STUN server, but MUST only accept STUN requests from clients 586 that authenticate, as discussed below in Section 5.3.5. 588 5.3.3 Prioritizing the Transport Addresses and Choosing a Default 590 The prioritization process takes a list of transport addresses, and 591 associates each with a priority. This priority reflects the desire 592 that the UA has to receive media on that address, and is assigned as 593 a value from 0 to 1 (1 being most preferred). Priorities are 594 ordinal, so that their significance is only relative to other 595 transport address priorities in the same list. 597 This specification makes no normative recommendations on how the 598 prioritization is done. However, some useful guidelines are 599 suggested on how such a prioritization can be determined. 601 One criterion for choosing one transport address over another is 602 whether or not that transport address involves the use of a relay. 603 That is, if media is sent to that transport address, will the media 604 first transit a relay before being received? TURN derived transport 605 addresses make use of relays (the TURN server), as do any local 606 transport addresses associated with a VPN server. When media 607 transits a relay, the latency between 608 transmission and reception can increase. Packet loss can increase, 609 because of the additional router hops that may be taken. The cost 610 of providing service may increase, since media will be routed in 611 and right back out of a relay run by the provider. If these concerns 612 are important, transport addresses with this property can be listed 613 with lower priority. 615 Another criterion for choosing one address over another is IP address 616 family. ICE works with both IPv4 and IPv6.
It therefore provides a 617 transition mechanism that allows dual-stack hosts to prefer 618 connectivity over IPv6, but to fall back to IPv4 in case the v6 619 networks are disconnected (due, for example, to a failure in a 6to4 620 relay) [14]. It can also help with hosts that have both a native 621 IPv6 address and a 6to4 address. In such a case, higher priority 622 could be afforded to the native v6 address, followed by the 6to4 623 address, followed by a native v4 address. This allows a site to 624 obtain and begin using native v6 addresses immediately, yet still 625 fall back to 6to4 addresses when communicating with clients in other 626 sites that do not yet have native v6 connectivity. 628 Another criterion for choosing one address over another is security. 629 If a user is a telecommuter, and therefore connected to their 630 corporate network and a local home network, they may prefer their 631 voice traffic to be routed over the VPN in order to keep it on the 632 corporate network when communicating within the enterprise, but use the 633 local network when communicating with users outside of the 634 enterprise. 636 Another criterion for choosing one address over another is topological 637 awareness. This is most useful for transport addresses which make 638 use of relays (including TURN and VPN). In those cases, if a client 639 has preconfigured or dynamically discovered knowledge of the 640 topological proximity of the relays to itself, it can use that to 641 assign closer relays a higher priority. 643 Once the transport addresses have been prioritized, one is selected 644 as the default. This is the address that will be used by a peer that 645 doesn't understand ICE. The default has no relevance when 646 communicating with an ICE capable peer. As such, it is RECOMMENDED 647 that the default be chosen based on the likelihood of that address 648 being useful when communicating with a peer that doesn't support ICE.
650 This will frequently be a TURN derived transport address from a TURN 651 server providing public IP addresses. 653 5.3.4 Sending STUN Connectivity Checks 655 Once a responder has received an initiate message, or an initiator 656 has received an accept message, the list of transport addresses is 657 extracted from the message. These transport addresses, called the 658 remote transport addresses, along with the username fragment, 659 password, and priority from the message are placed into a table, 660 called the candidate table. There is a candidate table for RTP for 661 each media stream, and for RTCP for each media stream. So, if a 662 session is established with audio and video, there would be four 663 tables - audio RTP, audio RTCP, video RTP and video RTCP. 665 The client then takes its own gathered addresses, and creates a 666 subset called the sourceable addresses. This subset is the set of 667 local transport addresses (including VPN and RSIP) and TURN derived 668 transport addresses. Thus, it excludes STUN derived transport 669 addresses. This subset is defined formally below. 671 Each row in the candidate table is then replicated once for each sourceable 672 transport address. The table has a column for the sourceable 673 transport address value, and this is populated upon replication. 674 The table also has a column called "my username fragment", which is 675 the username fragment that the client created for the sourceable 676 transport address in that row. Each row in this table is called a 677 candidate. 679 Each candidate is associated with a state. The state represents the 680 current understanding of connectivity to that remote transport 681 address when packets are sent from that sourceable address. There 682 are four possible states. These states are: 683 INIT: No STUN transaction has been completed towards this remote 684 transport address from this sourceable address.
685 HANDSHAKING: One or more STUN transactions have failed, but 686 insufficient time has passed since leaving the INIT state to be 687 certain that the remote transport address is unreachable from this 688 sourceable address. This state is important for connectivity 689 checks made to STUN derived transport addresses through a port 690 restricted NAT or to a TURN derived transport address. 691 BAD: All STUN transactions to this remote transport address from this 692 sourceable address have either timed out, or failed with a 600 693 response, and a sufficient amount of time has elapsed since the 694 INIT state to have high confidence that the remote transport 695 address cannot be reached from this sourceable address. 697 GOOD: The last STUN transaction to this remote transport address from 698 this sourceable address was successful. However, it is not the 699 highest priority candidate, and therefore, is not in use for 700 media. 702 When the client first populates the tables from the initiate or 703 accept message, all of the transport addresses are set to the INIT 704 state. 706 Consider the following example. An initiator sends an initiate 707 message with one media stream (audio), with two transport addresses, 708 A and B. A is a local transport address, and B is a STUN derived 709 transport address (although that fact is not signaled in the 710 message). Both of these will have the same username fragment and 711 password, but different priorities. The initiate message is sent to 712 the responder. The responder has a local transport address, a STUN 713 derived transport address, and a TURN derived transport address. 714 Call these X, Y and Z respectively. Thus, it has two sourceable 715 addresses, X and Z. The table created by the responder would have 716 four rows. Each of the two transport addresses in the initiate 717 message is present twice, once with the responder's local transport 718 address, and once with its TURN derived address.
Such a table might 719 look like this:
721 Remote  Srcable  User Frag   Passwd   My-Usr-Frag  Priority  State
722 --------------------------------------------------------------------
723 A       X        asd9f8f8==  siprulz  x-frag       0.4       INIT
724 A       Z        asd9f8f8==  siprulz  z-frag       0.4       INIT
725 B       X        asd9f8f8==  siprulz  x-frag       0.2       INIT
726 B       Z        asd9f8f8==  siprulz  z-frag       0.2       INIT
728 The client begins a STUN BindingRequest transaction for each 729 candidate. This STUN transaction is sent to the IP address and port 730 from the Remote column. It sends the request from the IP address and 731 port in the sourceable column. For local transport addresses, that 732 means sending from the locally bound socket. For VPN addresses, that 733 means sending from the socket bound to the VPN interface. For TURN 734 derived transport addresses, this means using the TURN Send message 735 to send a request through the TURN server. This provides the 736 definition of sourceable addresses: they are the distinct transport 737 addresses that a client can send from. A STUN derived transport 738 address is not distinct from a local transport address, since a 739 client cannot send a packet to a particular IP address and port with 740 different source IP addresses and ports as seen by that recipient. 741 [[REPHRASE]] 742 The STUN USERNAME attribute MUST be present. It is set to the 743 concatenation of the user fragment from the table, with the "My User 744 Fragment" from the candidate. Thus, for the candidate with remote 745 transport address A and sourceable address X, the USERNAME would be 746 set to "asd9f8f8==x-frag". The BindingRequest SHOULD contain a 747 MESSAGE-INTEGRITY attribute, computed using the username in the 748 USERNAME attribute, and the password from the password field in the 749 row. The BindingRequest MUST NOT contain the CHANGE-REQUEST or 750 RESPONSE-ADDRESS attribute. 752 Each of these STUN transactions will generate either a timeout, or a 753 response.
If the response is an error, but recoverable as described 754 in RFC 3489, the client SHOULD try again using the procedures 755 discussed there. Either initially, or after retry, the STUN 756 transaction will produce a timeout result, a success result, or a 757 non-recoverable failure result (error codes 400, 431, or 600). These 758 correspond to "timeout", "success", and "error" events, respectively. 760 These events are fed into the state machine described in Figure 3. 761 This figure shows the transitions between states that occur on the 762 completion of the STUN BindingRequest transaction. After the 763 completion of each transaction, the client sets a timer that 764 determines when it will do another transaction for that candidate. 765 The result of that next transaction drives the next transition in the 766 state machine, and so on. Since timers are set at the entry to each 767 state, STUN BindingRequest transactions will be tried continuously 768 throughout a call. This is necessary to detect a variety of failure 769 cases, as discussed below.
   From         Event             To           Action
   ---------------------------------------------------------------------
   INIT         timeout           HANDSHAKING  Set Rapid Retry Timer, Giveup Timer
   INIT         error             BAD          Set Retry Timer
   INIT         success           GOOD         Set Refresh Timer
   HANDSHAKING  timeout           HANDSHAKING  Set Rapid Retry Timer
   HANDSHAKING  error             BAD          Set Retry Timer
   HANDSHAKING  success           GOOD         Set Refresh Timer
   BAD          timeout or error  BAD          Set Retry Timer
   BAD          success           GOOD         Set Refresh Timer
   GOOD         timeout or error  BAD          Set Retry Timer
   GOOD         success           GOOD         Set Refresh Timer
807 Figure 3 809 Starting in the INIT state, if the transaction is successful, the 810 client has verified connectivity to that remote transport address 811 when sending from that sourceable transport address. This means that 812 media packets sent in exactly the same way will get through. As 813 such, the FSM transitions to the GOOD state, and the client sets the 814 Refresh Timer. This timer is used to continually check that a good 815 candidate remains good. It is possible for a candidate to cease 816 being good if a NAT should fail and recover, resulting in loss of any 817 bindings it holds, or if an IP route should flap, causing those 818 packets to be delivered through a new NAT that allocates new 819 bindings, or a firewall with different policies. The Refresh Timer 820 value SHOULD be configurable. In order to rapidly recover from 821 failures, it is RECOMMENDED that it default to five seconds. [[TODO: 822 Need to work this number as a function of codec rates as well, 823 perhaps apply the RTCP algorithm for its computation.]] 825 If, from the INIT state, the STUN transaction times out, the FSM 826 enters the HANDSHAKING state. At this point, there are two reasons 827 that the STUN request might have timed out. One reason is that the 828 candidate is simply unreachable. The other reason is that the peer 829 is behind a port restricted NAT, and so STUN requests from the client 830 cannot get through until its peer creates a permission by generating 831 its own STUN request. It may take some time to generate that STUN 832 request, as it may depend on a response message getting delivered. 833 As such, the HANDSHAKING state allows for rapid retry of the STUN 834 transaction until enough time has passed to be certain that the 835 remote transport address is actually unreachable. Thus, upon 836 entering the HANDSHAKING state, two timers are set.
The first, called 837 the Rapid Retry Timer, determines how long until the next attempt. 838 This timer SHOULD be configurable. It is RECOMMENDED that it default 839 to 1 second. The second timer, called the Giveup Timer, determines 840 how long the client will keep trying until it decides that the remote 841 transport address is unreachable. This timer SHOULD be configurable. 842 It is RECOMMENDED that it default to 50 seconds. This is a 843 reasonable approximation of the maximum SIP transaction duration. 845 If, from the INIT state, the STUN transaction generates an error, the 846 FSM moves into the BAD state. The Retry Timer is set. This 847 timer is used to periodically retry, and see if the candidate may now 848 be reachable. The value of this timer SHOULD be configurable. It is 849 RECOMMENDED that it default to 1 minute. 851 If, while in the HANDSHAKING state, the Giveup Timer fires, or the STUN 852 transaction results in an error, the client moves into the BAD state, 853 and sets the Retry Timer. The default duration for this timer is 854 identical for all entries into the BAD state, and thus it defaults to 855 1 minute here as well. If, while in the HANDSHAKING state, the Rapid 856 Retry Timer fires, the timer is reset and the client remains in the 857 HANDSHAKING state. 859 If, while in the BAD state, the retried transaction is executed and 860 fails or results in a timeout, the client resets the timer and 861 remains in the BAD state. If the STUN transaction succeeds, it moves 862 into the GOOD state and sets the Refresh Timer. The default 863 duration for this timer is the same for all entries into the GOOD 864 state, and thus it defaults to five seconds. 866 If, while in the GOOD state, the transaction resulting from the 867 Refresh Timer times out or fails, the client moves into the BAD state 868 and sets the Retry Timer. If, however, that transaction succeeds, 869 the client stays in the GOOD state and resets the Refresh Timer.
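The per-candidate state machine described above can be sketched as a single transition function. This is a minimal Python illustration under stated assumptions: the function name step, the event name "giveup" for expiry of the Giveup Timer, and the timer dictionary keys are all hypothetical. State names follow the state list in Section 5.3.4, and the Refresh Timer value assumes the five-second default given in the text.

```python
from enum import Enum


class State(Enum):
    INIT = "INIT"
    HANDSHAKING = "HANDSHAKING"
    BAD = "BAD"
    GOOD = "GOOD"


# RECOMMENDED defaults from the text, in seconds; each SHOULD be configurable.
RAPID_RETRY_TIMER = 1.0
GIVEUP_TIMER = 50.0
RETRY_TIMER = 60.0
REFRESH_TIMER = 5.0  # assumption: the five-second default for re-checking GOOD

EVENTS = ("success", "timeout", "error", "giveup")


def step(state, event):
    """Return (next_state, timers_to_set) for one candidate."""
    if event not in EVENTS:
        raise ValueError(f"unknown event: {event}")
    if event == "giveup" and state is not State.HANDSHAKING:
        raise ValueError("the Giveup Timer only runs in HANDSHAKING")
    if event == "success":
        # Any successful transaction verifies connectivity.
        return State.GOOD, {"refresh": REFRESH_TIMER}
    if event == "timeout" and state is State.INIT:
        # The peer may not have opened a NAT permission yet: retry quickly.
        return State.HANDSHAKING, {
            "rapid_retry": RAPID_RETRY_TIMER,
            "giveup": GIVEUP_TIMER,
        }
    if event == "timeout" and state is State.HANDSHAKING:
        return State.HANDSHAKING, {"rapid_retry": RAPID_RETRY_TIMER}
    # Errors, Giveup expiry, and timeouts in BAD or GOOD all lead to BAD.
    return State.BAD, {"retry": RETRY_TIMER}
```

For example, step(State.INIT, "timeout") yields the HANDSHAKING state with both the rapid retry and giveup timers set, while step(State.GOOD, "timeout") yields BAD with only the retry timer set.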
871 As the FSM operates throughout the call, candidates will 872 change state. At any point in time, the client sends media packets 873 (including RTCP) using one of the candidates in the GOOD state. It 874 is RECOMMENDED that the one with highest priority be used. If 875 another candidate should change state such that it moves into the 876 GOOD state, and it has a higher priority, the client SHOULD switch to 877 that candidate, but SHOULD do so after waiting a small period of time 878 (10 seconds is RECOMMENDED) to guard against flapping of candidates 879 during periods of route flaps in the network. 881 To send media to a candidate, the client sends media packets (whether 882 they are RTP or RTCP or something else) to the remote transport 883 address, from the sourceable transport address. 885 If, for some reason, there was at least one candidate in the GOOD 886 state, and due to an FSM transition, none of the candidates are in 887 the GOOD state, the client SHOULD forcefully transition all of the 888 candidates into the HANDSHAKING state in an attempt to rapidly 889 reconnect. If none of them succeed, and all of the candidates enter 890 the BAD state, the client SHOULD terminate the call and alert the 891 user to the failure [[TODO: Need to work in some good congestion 892 control here; in cases where timeouts happen due to network 893 congestion this is probably too aggressive]]. 895 5.3.5 Receiving STUN Requests 897 When a client receives a STUN request (presumably after 898 disambiguating it from a media packet), it follows the logic 899 described in this section. 901 The client MUST follow the procedures defined in RFC 3489 and verify 902 that the USERNAME attribute is known to the server. Here, this is 903 done by taking the USERNAME attribute, and doing a prefix match 904 against the "my user fragment" column in the candidate table. If it 905 doesn't match any rows, the client generates a 432 response.
If it 906 matches multiple rows, the client checks the suffix of the username 907 against the "user fragment" column. If it doesn't match any rows, 908 the client generates a 432 response. If it does match rows, it will 909 match those rows corresponding to the transport addresses that the 910 peer could have sent this STUN request from. 912 Assuming the USERNAME is valid, the client MUST generate a STUN 913 response per RFC 3489. 915 Once the response is sent, the client examines the source IP and port 916 where the request came from. It matches those against the remote 917 transport addresses in the candidate table. If there is no match, 918 this source address is itself another possible candidate. As with 919 other candidates, it must be associated with a STUN username 920 fragment, password and priority, all normally provided by the peer, 921 along with sourceable transport addresses and their username 922 fragments. 924 How does the client obtain this other information? The suffix of the 925 USERNAME is the key (literally). That suffix was already provided to 926 the client in an initiate or accept message, and was used to populate 927 the current candidate table. If it matches an existing value in the 928 table, it means that the STUN request came from the same transport 929 address as a previously advertised candidate; however, when it showed 930 up at the client, its source IP address was different than the peer 931 thought it would be. This will happen when a symmetric NAT exists 932 between the clients. In this case, the source IP address and port of 933 the STUN packet now become a viable candidate, since the client 934 should be able to send messages back to it and reach its peer. 936 However, this connectivity, like all other connectivity, needs to be 937 verified. So, the client needs to find out the user fragment and 938 password to use in STUN requests. 
To do that, it takes the suffix of 939 the USERNAME in the STUN request, and looks it up in the "user frag" 940 column of the table. If it is a match, that is the user fragment 941 needed as part of the candidate. The password is the value from that 942 row. The sourceable transport address is also the value from that 943 row. The priority is also copied from that row. 945 This new candidate can then be verified by sending STUN requests to 946 it, as described in Section 5.3.4. 948 6. Running STUN on Derived Transport Addresses 950 One of the seemingly bizarre operations done during ICE 951 processing is the transmission of a STUN request to a transport 952 address which is obtained through TURN or STUN itself. This actually 953 does work, and in fact, has extremely useful properties. The 954 subsections below go through the detailed operations that would occur 955 at each point to demonstrate correctness and the properties derived 956 from it. 958 6.1 STUN on a TURN Derived Transport Address 960 Consider a client A that is behind a NAT. It connects to a TURN 961 server on the public side of the NAT. To do that, A binds to a local 962 transport address, say 10.0.1.1:8866, and then sends a TURN request 963 to the TURN server. The NAT translates the net-10 address to 964 192.0.2.88:5063. Assume that the TURN server is running on 192.0.2.1 965 and listening for TURN traffic on port 7764. The TURN server 966 allocates a derived transport address 192.0.2.1:26524 to the client, 967 and returns it in the TURN response. Remember that all traffic from 968 the TURN server to the client is sent from 192.0.2.1:7764 to 969 10.0.1.1:8866. 971 Now, the client runs a STUN server on 10.0.1.1:8866, and advertises 972 that its server actually runs on 192.0.2.1:26524. Another client, B, 973 sends a STUN request to this server. It sends it from a local 974 transport address, 192.0.2.77:1296.
When it arrives at 975 192.0.2.1:26524, the TURN server "locks down" outgoing traffic, so 976 that data packets received from A are sent to 192.0.2.77:1296. The 977 STUN request is then forwarded to the client, sent with a source 978 address of 192.0.2.1:7764 and a destination address of 979 192.0.2.88:5063. This passes through the NAT, which rewrites the 980 destination address to 10.0.1.1:8866. This arrives at A's STUN server. 981 The server observes the source address of 192.0.2.1:7764, and 982 generates a STUN response containing this value in the MAPPED-ADDRESS 983 attribute. The STUN response is sent with a source address of 984 10.0.1.1:8866, and a destination of 192.0.2.1:7764. This arrives at 985 the TURN server, which, because of the lock-down, sends the STUN 986 response with a source address of 192.0.2.1:26524 and destination of 987 192.0.2.77:1296, which is B's STUN client. 989 Now, as far as B is concerned, it has obtained a new STUN derived 990 transport address of 192.0.2.1:7764. And indeed, it has! STUN 991 derived transport addresses are scoped to the session, so they can 992 only be used by the peer in the session. Furthermore, that peer has 993 to send requests from the socket on which the STUN server was 994 running. In this case, A is the peer, and its STUN server was on 995 10.0.1.1:8866. If it sends to 192.0.2.1:7764, the packet goes to the 996 TURN server, and due to lock-down, is forwarded to B, and 997 specifically, is forwarded to the transport address B sent the STUN 998 request from. Therefore, the address is indeed a valid STUN derived 999 transport address. 1001 The benefit of this is that it allows two clients to share the same 1002 TURN server for media traffic in both directions. With "normal" TURN 1003 usage, both clients would obtain a derived address from their own 1004 TURN servers. The result is that, for a single call, there are two 1005 bindings allocated by each side from their respective servers, and 1006 all four are used.
With ICE, that drops to two bindings allocated 1007 from a single server. Of course, all four bindings are allocated 1008 initially. However, once one of the clients begins receiving media 1009 on its STUN derived address, it can deallocate its TURN resources. 1011 [[TODO: Include a diagram that shows this pictorially.]] 1013 6.2 STUN on a STUN Derived Transport Address 1015 Consider a client A that is behind a NAT. It connects to a STUN 1016 server on the public side of the NAT. To do that, A binds to a local 1017 transport address, say 10.0.1.1:8866, and then sends a STUN request 1018 to the STUN server. The NAT translates the net-10 address to 1019 192.0.2.88:5063. Assume that the STUN server is running on 192.0.2.1 1020 and listening for STUN traffic on port 3478, the default STUN port. 1021 The STUN server sees a source transport address of 192.0.2.88:5063, and 1022 returns that to the client in the STUN response. The NAT forwards 1023 the response to the client. 1025 Now, the client runs a STUN server on 10.0.1.1:8866, and advertises 1026 that its server actually runs on 192.0.2.88:5063. Another client, B, 1027 sends a STUN request to this address. It sends it from a local 1028 transport address, 192.0.2.77:1296. When it arrives at 1029 192.0.2.88:5063 (on the NAT), the NAT rewrites the destination address to 1030 10.0.1.1:8866, assuming that it is of the full-cone variety [1], or 1031 is restricted, and the permission for 192.0.2.77:1296 is open. This 1032 arrives at A's STUN server. The server observes the source address 1033 of 192.0.2.77:1296, and generates a STUN response containing this 1034 value in the MAPPED-ADDRESS attribute. The STUN response is sent 1035 with a source address of 10.0.1.1:8866, and a destination of 1036 192.0.2.77:1296. This arrives at B's STUN client. 1038 Now, as far as B is concerned, it has obtained a new STUN derived 1039 transport address of 192.0.2.77:1296.
Of course, this is the same 1040 address as the local transport address, and therefore this derived 1041 address is not used. However, had there been additional NATs between 1042 B and A's NAT, B would end up seeing the binding allocated by that 1043 outermost NAT. The net result is that STUN requests sent to a STUN 1044 derived address behave as normal STUN would. However, these STUN 1045 requests have the side-effect of creating permissions in the NATs 1046 which see those requests in the public to private direction. This 1047 turns out to be very useful for traversing restricted NATs. 1049 7. XML Schema for ICE Messages 1051 This section contains the XML schema used to define the initiate, 1052 accept, and modify messages. Any protocol that uses ICE needs to map 1053 the parameters defined here into its own messages. 1055 Note that STUN allows both the username and password to contain the 1056 space character. However, usernames and passwords used with ICE 1057 cannot contain the space. 1059 1060 1064 1066 1067 1068 1069 This is the root element, which holds a 1070 media-streams elements. 1071 1072 1073 1074 1075 1076 1077 1078 1079 1080 1081 There are zero or more media stream 1082 elements. Each defines attributes for a specific media 1083 stream. 1084 1085 1086 1087 1088 1089 The default address is used for 1090 sending media before connectivity has been 1091 verified. 1092 1093 1094 1095 1097 1098 1099 1100 1101 1102 1103 Each alternate is a 1104 possible point of contact. 1105 1106 1107 1108 1109 1110 1111 1112 1113 1114 1115 1116 1117 1118 1119 1120 1121 1122 1123 1124 1125 1126 1127 1128 1129 1130 1131 1132 1133 1134 1135 1136 1137 1138 1139 1140 1141 1142 1143 1144 1146 1147 1148 1149 1150 1151 1152 1153 1154 1155 1156 1157 1159 8. Examples 1161 In the examples that follow, messages are labeled with "message name 1162 A,B" to mean a message from transport address A to B.
   For STUN requests, this is followed by curly brackets enclosing the
   username and password.  For STUN responses, this is followed by
   square brackets and the value of MAPPED-ADDRESS.

8.1  Port Restricted

   This section shows a flow of two clients behind port restricted NATs
   talking to each other.

   A        P.R. NAT     STUN+TURN    P.R. NAT         B
   |(1) STUN Req P1,S+T      |            |            |
   |----------->|            |            |            |
   |            |(2) STUN Req U, S+T      |            |
   |            |----------->|            |            |
   |            |(3) STUN Res S+T,U [U]   |            |
   |            |<-----------|            |            |
   |(4) STUN Res S+T,P1 [U]  |            |            |
   |<-----------|            |            |            |
   |(5) Initiate {P1,unameA,passA,q=0.4}  |            |
   |{U,unameA,passA,q=0.3}   |            |            |
   |-------------------------------------------------->|
   |            |            |            |(6) STUN Req P2,S+T
   |            |            |            |<-----------|
   |            |            |(7) STUN Req V, S+T      |
   |            |            |<-----------|            |
   |            |            |(8) STUN Res S+T,V [V]   |
   |            |            |----------->|            |
   |            |            |            |(9) STUN Res S+T,P2 [V]
   |            |            |            |----------->|
   |(10) Accept {P2,unameB,passB,q=0.4}   |            |
   |{V,unameB,passB,q=0.3}   |            |            |
   |<--------------------------------------------------|
   |(11) STUN Req P1,P2      |            |            |
   |(unameBunameA,passB)     |            |            |
   |----------->|            |            |            |
   |            |Timeout     |            |            |
   |(12) STUN Req P1,V       |            |            |
   |(unameBunameA,passB)     |            |            |
   |----------->|            |            |            |
   |            |(13) STUN Req U,V        |            |
   |            |(unameBunameA,passB)     |            |
   |            |------------------------>|            |
   |            |Permission open V->U     |            |
   |            |            |            |No success, Retries continue
   |            |            |            |(14) STUN Req P2,P1
   |            |            |            |(unameAunameB,passA)
   |            |            |            |<-----------|
   |            |            |            |Timeout     |
   |            |            |            |(15) STUN Req P2,U
   |            |            |            |(unameAunameB,passA)
   |            |            |            |<-----------|
   |            |(16) STUN Req V,U        |            |
   |            |(unameAunameB,passA)     |            |
   |            |<------------------------|            |
   |            |            |            |Permission open U->V
   |            |Passes NAT! |            |            |
   |(17) STUN Req V,P1       |            |            |
   |(unameAunameB,passA)     |            |            |
   |<-----------|            |            |            |
   |(18) STUN Res P1,V [V]   |            |            |
   |----------->|            |            |            |
   |            |(19) STUN Res U,V [V]    |            |
   |            |------------------------>|            |
   |            |            |            |(20) STUN Res U,P2 [V]
   |            |            |            |----------->|
   |            |Retries continue         |            |
   |(21) STUN Req P1,V       |            |            |
   |(unameBunameA,passB)     |            |            |
   |----------->|            |            |            |
   |            |(22) STUN Req U,V        |            |
   |            |(unameBunameA,passB)     |            |
   |            |------------------------>|            |
   |            |            |            |Passes NAT! |
   |            |            |            |(23) STUN Req U,P2
   |            |            |            |(unameBunameA,passB)
   |            |            |            |----------->|
   |            |            |            |(24) STUN Res P2,U [U]
   |            |            |            |<-----------|
   |            |(25) STUN Res V,U [U]    |            |
   |            |<------------------------|            |
   |(26) STUN Res V,P1 [U]   |            |            |
   |<-----------|            |            |            |
   |(27) RTP P1,V            |            |            |
   |----------->|            |            |            |
   |            |(28) RTP U,V|            |            |
   |            |------------------------>|            |
   |            |            |            |Passes NAT! |
   |            |            |            |(29) RTP U,P2
   |            |            |            |----------->|
   |            |            |            |(30) RTP P2,U
   |            |            |            |<-----------|
   |            |(31) RTP V,U|            |            |
   |            |<------------------------|            |
   |            |Passes NAT! |            |            |
   |(32) RTP V,P1            |            |            |
   |<-----------|            |            |            |

9.  Mapping ICE into SIP

   In this section, we show how to map ICE into SIP.  This requires
   extensions to SDP.

   A new SDP attribute is defined to support ICE.  It is called "alt".
   The alt attribute MUST be present within a media block of the SDP.
   It contains an alternative IP address and port (or pair of IP
   addresses and ports in the case of RTP) that the recipient of the SDP
   can use instead of the ones indicated in the m and c lines.  There
   MAY be multiple alt attributes in a media block.  In that case, each
   of them MUST contain a different IP address and port (or a differing
   pair of IP addresses and ports in the case of RTP).
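
   As an informal illustration of the alt attribute, the following
   Python sketch parses alt lines using the field order given by the
   syntax in this section and checks the rule that each alt attribute
   in a media block carries a different set of transport addresses.
   The names Alt, parse_alt and check_distinct, the "a=" line prefix,
   and the example addresses are assumptions made for this sketch and
   are not defined by this specification.  Defaulting RTCP to the RTP
   address with the port one higher reflects the omission rule stated
   in this section.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Alt:
    """One parsed alt attribute (hypothetical container for this sketch)."""
    id: str
    qvalue: float
    username: str
    password: str
    rtp: Tuple[str, int]
    rtcp: Tuple[str, int]


def parse_alt(line: str) -> Alt:
    """Parse a single 'a=alt:...' SDP line into an Alt record."""
    prefix = "a=alt:"
    if not line.startswith(prefix):
        raise ValueError("not an alt attribute")
    # Usernames and passwords cannot contain spaces, so splitting on
    # whitespace is enough to separate the fields.
    fields = line[len(prefix):].split()
    if len(fields) not in (6, 8):
        raise ValueError("expected 6 or 8 fields, got %d" % len(fields))
    alt_id, q, user, pwd, addr, port = fields[:6]
    rtp = (addr, int(port))
    if len(fields) == 8:
        rtcp = (fields[6], int(fields[7]))
    else:
        # Second address/port omitted: same address, port one higher.
        rtcp = (addr, int(port) + 1)
    return Alt(alt_id, float(q), user, pwd, rtp, rtcp)


def check_distinct(alts: List[Alt]) -> None:
    """Raise ValueError if two alts share the same transport addresses."""
    seen = set()
    for alt in alts:
        key = (alt.rtp, alt.rtcp)
        if key in seen:
            raise ValueError("duplicate addresses in alt %s" % alt.id)
        seen.add(key)


if __name__ == "__main__":
    block = [
        parse_alt("a=alt:1 0.4 unameA passA 192.0.2.1 5000"),
        parse_alt("a=alt:2 0.3 unameA passA 198.51.100.7 6000"),
    ]
    check_distinct(block)
    for alt in block:
        print(alt.id, alt.qvalue, alt.rtp, alt.rtcp)
```

   Splitting on whitespace keeps the sketch tolerant of how the
   optional second address and port are separated from the first.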

   The syntax of this attribute is:

      alt-attribute      = "alt" ":" id SP qvalue SP
                           username SP password SP
                           unicast-address SP port
                           [unicast-address SP port]
                           ;qvalue from RFC 3261
                           ;unicast-address, port from RFC 2327
      username           = non-ws-string
      password           = non-ws-string
      id                 = token
      derived-from       = ":" / id

   With the addition of the alt attribute, the mapping of the ICE
   messages to SIP/SDP is straightforward.  The ICE initiate message
   corresponds to a SIP message with an SDP offer.  The ICE accept
   message corresponds to a SIP message with an SDP answer.  The ICE
   modify message corresponds to a SIP INVITE or UPDATE with an offer,
   and the ICE modify accept message corresponds to an INVITE or UPDATE
   response with an answer.

   Each media stream element in an ICE message maps to a media block in
   the SDP.  The default address maps to the m and c lines in the SDP.
   If the ICE message indicates an RTCP address and port that are not
   one higher than that of the RTP, the SDP RTCP attribute [2] MUST be
   used to convey them.

   Each alternate element in an ICE message maps either to an alt
   attribute in the SDP, or a new media block, depending on the IP
   version of the alternate.  The highest priority IPv6 alternate is
   mapped into a separate media block, using the ANAT grouping [4].
   Any additional IPv6 addresses are placed as alternates within this
   media block.  For alternates that are IPv4 addresses, the alt
   attribute is used.  The rtp-address element maps to the first
   unicast-address and port components of the alt attribute.  The
   rtcp-address element maps to the second unicast-address and port
   components of the alt attribute.  Note that, if the RTCP address is
   identical to the RTP address, and the port is one higher, the second
   unicast-address and port MAY be omitted.
   The preference value from the alternate element is mapped to the
   q-value component of the alt attribute.  The STUN user fragment and
   password elements map to the user fragment and password components
   of the alt attribute.

10.  Security Considerations

   ICE conveys the STUN username and password within its messages.  If
   an eavesdropper should see the username and password, the worst they
   can do is send STUN requests to the host.  Since STUN is a stateless
   protocol, the attacker cannot alter the processing of the call or
   otherwise disrupt it.  They could flood the server with
   BindingRequest packets.  However, this would be no worse than if the
   attacker simply floods the host with any kind of packet.

   However, integrity protection of the username and password is more
   important.  If an attacker is capable of intercepting the message and
   modifying the username or password, they could prevent connectivity
   from being established between peers, and therefore disrupt the call.
   Of course, if the attacker can intercept the message, there are many
   other ways in which they could do that, such as simply discarding the
   message.  Injecting fake messages with incorrect usernames and
   passwords can also disrupt a call, and does not require the
   compromise of an intermediate server.  A similar attack is possible
   by modifying most of the ICE message attributes.  To prevent these
   kinds of attacks, it is RECOMMENDED that the actual protocols that
   ICE maps to make use of security mechanisms that provide message
   integrity protection.

11.  IANA Considerations

   This specification defines one new media attribute: alt.  Its syntax
   is defined in Section 9.

12.  IAB Considerations

   The IAB has studied the problem of "Unilateral Self Address Fixing",
   which is the general process by which a client attempts to determine
   its address in another realm on the other side of a NAT through a
   collaborative protocol reflection mechanism [11].  ICE is an example
   of a protocol that performs this type of function.  Interestingly,
   the process for ICE is not unilateral, but bilateral, and the
   difference has a significant impact on the issues raised by the IAB.
   The IAB has mandated that any protocols developed for this purpose
   document a specific set of considerations.  This section meets those
   requirements.

12.1  Problem Definition

   From RFC 3424, any UNSAF proposal must provide:

      Precise definition of a specific, limited-scope problem that is to
      be solved with the UNSAF proposal.  A short term fix should not be
      generalized to solve other problems; this is why "short term
      fixes usually aren't".

   The specific problems being solved by ICE are:

      Provide a means for two peers to determine the set of transport
      addresses which can be used for communication.

      Provide a means for resolving many of the limitations of other
      UNSAF mechanisms by wrapping them in an additional layer of
      processing (the ICE methodology).

      Provide a means for a client to determine an address that is
      reachable by another peer with which it wishes to communicate.

12.2  Exit Strategy

   From RFC 3424, any UNSAF proposal must provide:

      Description of an exit strategy/transition plan.  The better short
      term fixes are the ones that will naturally see less and less use
      as the appropriate technology is deployed.

   ICE itself doesn't easily get phased out.  However, it is useful even
   in a globally connected Internet, to serve as a means for detecting
   whether a router failure has temporarily disrupted connectivity, for
   example.
   However, what ICE does is help phase out other UNSAF mechanisms.
   ICE effectively selects amongst those mechanisms, prioritizing ones
   that are better, and deprioritizing ones that are worse.  Local IPv6
   addresses are always the most preferred.  As NATs begin to dissipate
   as IPv6 is introduced, derived transport addresses from other UNSAF
   mechanisms simply never get used, because higher priority
   connectivity exists.  Therefore, the servers get used less and less,
   and can eventually be removed when their usage goes to zero.

   Indeed, ICE can assist in the transition from IPv4 to IPv6.  It can
   be used to determine whether to use IPv6 or IPv4 when two dual-stack
   hosts communicate with SIP (IPv6 gets used).  It can also allow a
   client in a v6 island to communicate with a v4 host on the other side
   of a 6to4 NAT, by allowing the v6 host to address-fix against the v4
   host, and in the process, obtain a v4 address which can be handed to
   the v4 client.

12.3  Brittleness Introduced by ICE

   From RFC 3424, any UNSAF proposal must provide:

      Discussion of specific issues that may render systems more
      "brittle".  For example, approaches that involve using data at
      multiple network layers create more dependencies, increase
      debugging challenges, and make it harder to transition.

   ICE actually removes brittleness from existing UNSAF mechanisms.  In
   particular, traditional STUN (the usage described in RFC 3489) has
   several points of brittleness.  One of them is the discovery process
   which requires a client to try to classify the type of NAT it is
   behind.  This process is error-prone.  With ICE, that discovery
   process is simply not used.  Rather than unilaterally assessing the
   validity of the address, its validity is dynamically determined by
   measuring connectivity to a peer.  The process of determining
   connectivity is very robust.
   The only potential problem is that bilaterally fixed addresses
   through STUN can expire if traffic does not keep them alive.
   However, that is substantially less brittle than the STUN discovery
   mechanisms.

   Another point of brittleness in STUN, TURN, and any other unilateral
   mechanism is its absolute reliance on an additional server.  ICE
   makes use of a server for allocating unilateral addresses, but allows
   clients to directly connect if possible.  Therefore, in some cases,
   the failure of a STUN or TURN server would still allow for a call to
   progress when ICE is used.

   Another point of brittleness in traditional STUN is that it assumes
   that the STUN server is on the public Internet.  Interestingly, with
   ICE, that is not necessary.  There can be a multitude of STUN servers
   in a variety of address realms.  ICE will discover the one that has
   provided a usable address.

   The most troubling point of brittleness in traditional STUN is that
   it doesn't work in all network topologies.  In cases where there is a
   shared NAT between each client and the STUN server, traditional STUN
   may not work.  With ICE, that restriction can be lifted.

   Traditional STUN also introduces some security considerations.
   Unfortunately, since ICE still uses network resident STUN servers,
   those security considerations still exist.

12.4  Requirements for a Long Term Solution

   From RFC 3424, any UNSAF proposal must provide:

      Identify requirements for longer term, sound technical solutions
      -- contribute to the process of finding the right longer term
      solution.

   Our conclusions from STUN remain unchanged.  However, we feel ICE
   actually helps because we believe it can be part of the long term
   solution.

12.5  Issues with Existing NAPT Boxes

   From RFC 3424, any UNSAF proposal must provide:

      Discussion of the impact of the noted practical issues with
      existing, deployed NA[P]Ts and experience reports.

   A number of NAT boxes are now being deployed into the market which
   try to provide "generic" ALG functionality.  These generic ALGs hunt
   for IP addresses, either in text or binary form within a packet, and
   rewrite them if they match a binding.  This will interfere with
   proper operation of any UNSAF mechanism, including ICE.

13.  Acknowledgements

   The authors would like to thank Douglas Otis and Francois Audet for
   their comments and input.

14.  References

14.1  Normative References

   [1]   Rosenberg, J., Weinberger, J., Huitema, C. and R. Mahy, "STUN -
         Simple Traversal of User Datagram Protocol (UDP) Through Network
         Address Translators (NATs)", RFC 3489, March 2003.

   [2]   Huitema, C., "Real Time Control Protocol (RTCP) attribute in
         Session Description Protocol (SDP)", RFC 3605, October 2003.

   [3]   Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston, A.,
         Peterson, J., Sparks, R., Handley, M. and E. Schooler, "SIP:
         Session Initiation Protocol", RFC 3261, June 2002.

   [4]   Camarillo, G., "The Alternative Network Address Types Semantics
         for the Session Description Protocol Grouping Framework",
         draft-ietf-mmusic-anat-01 (work in progress), June 2004.

14.2  Informative References

   [5]   Schulzrinne, H., Rao, A. and R. Lanphier, "Real Time Streaming
         Protocol (RTSP)", RFC 2326, April 1998.

   [6]   Senie, D., "Network Address Translator (NAT)-Friendly
         Application Design Guidelines", RFC 3235, January 2002.

   [7]   Srisuresh, P., Kuthan, J., Rosenberg, J., Molitor, A. and A.
         Rayhan, "Middlebox communication architecture and framework",
         RFC 3303, August 2002.

   [8]   Borella, M., Lo, J., Grabelsky, D. and G.
         Montenegro, "Realm Specific IP: Framework", RFC 3102, October
         2001.

   [9]   Borella, M., Grabelsky, D., Lo, J. and K. Taniguchi, "Realm
         Specific IP: Protocol Specification", RFC 3103, October 2001.

   [10]  Yon, D., "Connection-Oriented Media Transport in the Session
         Description Protocol (SDP)", draft-ietf-mmusic-sdp-comedia-07
         (work in progress), June 2004.

   [11]  Daigle, L. and IAB, "IAB Considerations for UNilateral
         Self-Address Fixing (UNSAF) Across Network Address
         Translation", RFC 3424, November 2002.

   [12]  Schulzrinne, H., Casner, S., Frederick, R. and V. Jacobson,
         "RTP: A Transport Protocol for Real-Time Applications", RFC
         3550, July 2003.

   [13]  Baugher, M., McGrew, D., Naslund, M., Carrara, E. and K.
         Norrman, "The Secure Real-time Transport Protocol (SRTP)", RFC
         3711, March 2004.

   [14]  Carpenter, B. and K. Moore, "Connection of IPv6 Domains via
         IPv4 Clouds", RFC 3056, February 2001.

   [15]  Huitema, C., "Teredo: Tunneling IPv6 over UDP through NATs",
         draft-huitema-v6ops-teredo-02 (work in progress), June 2004.

   [16]  Rosenberg, J., "Traversal Using Relay NAT (TURN)",
         draft-rosenberg-midcom-turn-04 (work in progress), February
         2004.

Author's Address

   Jonathan Rosenberg
   dynamicsoft
   600 Lanidex Plaza
   Parsippany, NJ  07054
   US

   Phone: +1 973 952-5000
   EMail: jdrosen@dynamicsoft.com
   URI:   http://www.jdrosen.net

Intellectual Property Statement

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights.
   Information on the procedures with respect to rights in RFC
   documents can be found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use of
   such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository at
   http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard.  Please address the information to the IETF at
   ietf-ipr@ietf.org.

Disclaimer of Validity

   This document and the information contained herein are provided on an
   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
   ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
   INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
   INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Copyright Statement

   Copyright (C) The Internet Society (2004).  This document is subject
   to the rights, licenses and restrictions contained in BCP 78, and
   except as set forth therein, the authors retain all their rights.

Acknowledgment

   Funding for the RFC Editor function is currently provided by the
   Internet Society.