idnits 2.17.1 draft-ietf-mmusic-ice-04.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- ** It looks like you're using RFC 3978 boilerplate. You should update this to the boilerplate described in the IETF Trust License Policy document (see https://trustee.ietf.org/license-info), which is required now. -- Found old boilerplate from RFC 3978, Section 5.1.a on line 18. -- Found old boilerplate from RFC 3978, Section 5.5 on line 2058. -- Found old boilerplate from RFC 3979, Section 5, paragraph 1 on line 2035. -- Found old boilerplate from RFC 3979, Section 5, paragraph 2 on line 2042. -- Found old boilerplate from RFC 3979, Section 5, paragraph 3 on line 2048. ** The document seems to lack an RFC 3978 Section 5.1 IPR Disclosure Acknowledgement. ** This document has an original RFC 3978 Section 5.4 Copyright Line, instead of the newer IETF Trust Copyright according to RFC 4748. ** This document has an original RFC 3978 Section 5.5 Disclaimer, instead of the newer disclaimer which includes the IETF Trust according to RFC 4748. ** The document uses RFC 3667 boilerplate or RFC 3978-like boilerplate instead of verbatim RFC 3978 boilerplate. After 6 May 2005, submission of drafts without verbatim RFC 3978 boilerplate is not accepted. The following non-3978 patterns matched text found in the document. That text should be removed or replaced: This document is an Internet-Draft and is subject to all provisions of Section 3 of RFC 3667. By submitting this Internet-Draft, each author represents that any applicable patent or other IPR claims of which he or she is aware have been or will be disclosed, and any of which he or she becomes aware will be disclosed, in accordance with Section 6 of BCP 79. 
Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- == No 'Intended status' indicated for this document; assuming Proposed Standard Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** There are 20 instances of too long lines in the document, the longest one being 10 characters in excess of 72. == There are 21 instances of lines with private range IPv4 addresses in the document. If these are generic example addresses, they should be changed to use any of the ranges defined in RFC 6890 (or successor): 192.0.2.x, 198.51.100.x or 203.0.113.x. -- The document has examples using IPv4 documentation addresses according to RFC6890, but does not use any IPv6 documentation addresses. Maybe there should be IPv6 examples, too? ** The document seems to lack a both a reference to RFC 2119 and the recommended RFC 2119 boilerplate, even if it appears to use RFC 2119 keywords. RFC 2119 keyword, line 470: '... client MUST assign each candidate a...' RFC 2119 keyword, line 471: '... identifiers MUST be unique across a...' RFC 2119 keyword, line 488: '...In this case, it SHOULD just send to t...' RFC 2119 keyword, line 492: '...o deal with this, the initiator SHOULD...' RFC 2119 keyword, line 496: '... SHOULD send all media packets throu...' (48 more instances...) Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the RFC 3978 Section 5.4 Copyright Line does not match the current year == Line 855 has weird spacing: '...ca87sbb f99f...' == Line 918 has weird spacing: '...ca87sbb zhff...' == Line 1568 has weird spacing: '..., which requi...' -- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. 
If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. If not, you may need to add the pre-RFC5378 disclaimer. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- The document date (February 21, 2005) is 7004 days in the past. Is this intentional? Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) -- Looks like a reference, but probably isn't: 'U' on line 1499 -- Looks like a reference, but probably isn't: 'P3' on line 1479 -- Looks like a reference, but probably isn't: 'W' on line 1489 ** Obsolete normative reference: RFC 3489 (ref. '1') (Obsoleted by RFC 5389) == Outdated reference: A later version (-08) exists of draft-rosenberg-midcom-turn-06 -- Possible downref: Normative reference to a draft: ref. '8' -- Obsolete informational reference (is this intentional?): RFC 2326 (ref. '9') (Obsoleted by RFC 7826) == Outdated reference: A later version (-05) exists of draft-huitema-v6ops-teredo-04 Summary: 8 errors (**), 0 flaws (~~), 8 warnings (==), 13 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 MMUSIC J. Rosenberg 3 Internet-Draft Cisco Systems 4 Expires: August 22, 2005 February 21, 2005 6 Interactive Connectivity Establishment (ICE): A Methodology for 7 Network Address Translator (NAT) Traversal for Multimedia Session 8 Establishment Protocols 9 draft-ietf-mmusic-ice-04 11 Status of this Memo 13 This document is an Internet-Draft and is subject to all provisions 14 of section 3 of RFC 3667. 
By submitting this Internet-Draft, each 15 author represents that any applicable patent or other IPR claims of 16 which he or she is aware have been or will be disclosed, and any of 17 which he or she become aware will be disclosed, in accordance with 18 RFC 3668. 20 Internet-Drafts are working documents of the Internet Engineering 21 Task Force (IETF), its areas, and its working groups. Note that 22 other groups may also distribute working documents as 23 Internet-Drafts. 25 Internet-Drafts are draft documents valid for a maximum of six months 26 and may be updated, replaced, or obsoleted by other documents at any 27 time. It is inappropriate to use Internet-Drafts as reference 28 material or to cite them other than as "work in progress." 30 The list of current Internet-Drafts can be accessed at 31 http://www.ietf.org/ietf/1id-abstracts.txt. 33 The list of Internet-Draft Shadow Directories can be accessed at 34 http://www.ietf.org/shadow.html. 36 This Internet-Draft will expire on August 22, 2005. 38 Copyright Notice 40 Copyright (C) The Internet Society (2005). 42 Abstract 44 This document describes a methodology for Network Address Translator 45 (NAT) traversal for multimedia session signaling protocols, such as 46 the Session Initiation Protocol (SIP). This methodology is called 47 Interactive Connectivity Establishment (ICE). ICE makes use of 48 existing protocols, such as Simple Traversal of UDP Through NAT 49 (STUN) and Traversal Using Relay NAT (TURN). ICE makes use of STUN 50 in peer-to-peer cooperative fashion, allowing participants to 51 discover, create and verify mutual connectivity. 53 Table of Contents 55 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4 56 2. Multimedia Signaling Protocol Abstraction . . . . . . . . . . 5 57 3. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 6 58 4. Overview of ICE . . . . . . . . . . . . . . . . . . . . . . . 8 59 5. Detailed ICE Algorithm . . . . . . . . . . . . . . . . . . . . 
10 60 5.1 Initiator Processing . . . . . . . . . . . . . . . . . . . 11 61 5.1.1 Sending the Initiate Message . . . . . . . . . . . . . 11 62 5.1.2 Processing the Accept . . . . . . . . . . . . . . . . 12 63 5.2 Responder Processing . . . . . . . . . . . . . . . . . . . 12 64 5.2.1 Processing the Initiate Message . . . . . . . . . . . 12 65 5.3 Common Procedures . . . . . . . . . . . . . . . . . . . . 13 66 5.3.1 Gathering Transport Addresses . . . . . . . . . . . . 13 67 5.3.2 Enabling STUN on Each Local Transport Address . . . . 15 68 5.3.3 Prioritizing the Transport Addresses and Choosing 69 a Default . . . . . . . . . . . . . . . . . . . . . . 17 70 5.3.4 Sending STUN Connectivity Checks . . . . . . . . . . . 19 71 5.3.5 Receiving STUN Requests . . . . . . . . . . . . . . . 24 72 5.3.6 Management of Resources . . . . . . . . . . . . . . . 25 73 5.3.7 Binding Keepalives . . . . . . . . . . . . . . . . . . 25 74 6. Running STUN on Derived Transport Addresses . . . . . . . . . 26 75 6.1 STUN on a TURN Derived Transport Address . . . . . . . . . 27 76 6.2 STUN on a STUN Derived Transport Address . . . . . . . . . 29 77 7. XML Schema for ICE Messages . . . . . . . . . . . . . . . . . 30 78 8. Example . . . . . . . . . . . . . . . . . . . . . . . . . . . 32 79 9. Mapping ICE into SIP . . . . . . . . . . . . . . . . . . . . . 35 80 9.1 Message Mapping . . . . . . . . . . . . . . . . . . . . . 35 81 9.2 SIP and SDP Specific Security Considerations . . . . . . . 37 82 9.3 Updates in the Offer/Answer Model . . . . . . . . . . . . 37 83 10. Security Considerations . . . . . . . . . . . . . . . . . . 37 84 11. IANA Considerations . . . . . . . . . . . . . . . . . . . . 38 85 11.1 SDP Attribute Name . . . . . . . . . . . . . . . . . . . . 38 86 11.2 URN Sub-Namespace Registration . . . . . . . . . . . . . . 39 87 11.3 XML Schema Registration . . . . . . . . . . . . . . . . . 40 88 12. IAB Considerations . . . . . . . . . . . . . . . . . . . . . 
40 89 12.1 Problem Definition . . . . . . . . . . . . . . . . . . . . 41 90 12.2 Exit Strategy . . . . . . . . . . . . . . . . . . . . . . 41 91 12.3 Brittleness Introduced by ICE . . . . . . . . . . . . . . 42 92 12.4 Requirements for a Long Term Solution . . . . . . . . . . 42 93 12.5 Issues with Existing NAPT Boxes . . . . . . . . . . . . . 43 94 13. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 43 95 14. References . . . . . . . . . . . . . . . . . . . . . . . . . 43 96 14.1 Normative References . . . . . . . . . . . . . . . . . . . . 43 97 14.2 Informative References . . . . . . . . . . . . . . . . . . . 44 98 Author's Address . . . . . . . . . . . . . . . . . . . . . . . 45 99 Intellectual Property and Copyright Statements . . . . . . . . 46 101 1. Introduction 103 A multimedia session signaling protocol is a protocol that exchanges 104 control messages between a pair of agents for the purposes of 105 establishing the flow of media traffic between them. This media flow 106 is distinct from the flow of control messages, and may take a 107 different path through the network. Examples of such protocols are 108 the Session Initiation Protocol (SIP) [3], the Real Time Streaming 109 Protocol (RTSP) [9] and the International Telecommunications Union 110 (ITU) H.323. 112 These protocols, by nature of their design, are difficult to operate 113 through Network Address Translators (NAT). Because their purpose in 114 life is to establish a flow of packets, they tend to carry IP 115 addresses within their messages, which is known to be problematic 116 through NAT [10]. The protocols also seek to create a media flow 117 directly between participants, so that there is no application layer 118 intermediary between them. This is done to reduce media latency, 119 decrease packet loss, and reduce the operational costs of deploying 120 the application. However, this is difficult to accomplish through 121 NAT. 
A full treatment of the reasons for this is beyond the scope of 122 this specification. 124 Numerous solutions have been proposed for allowing these protocols to 125 operate through NAT. These include Application Layer Gateways 126 (ALGs), the Middlebox Control Protocol [11], Simple Traversal of UDP 127 through NAT (STUN) [1], Traversal Using Relay NAT [8], and Realm 128 Specific IP [12][13] along with session description extensions needed 129 to make them work, such as the SDP attribute for RTCP [2]. 130 Unfortunately, these techniques all have pros and cons which make 131 each one optimal in some network topologies, but a poor choice in 132 others. The result is that administrators and implementors are 133 making assumptions about the topologies of the networks in which 134 their solutions will be deployed. This introduces a lot of 135 complexity and brittleness into the system. What is needed is a 136 single solution which is flexible enough to work well in all 137 situations. 139 This specification provides that solution. It is called Interactive 140 Connectivity Establishment, or ICE. ICE makes use of many of the 141 protocols above, but uses them in a specific methodology which avoids 142 many of the pitfalls of using any one alone. ICE uses STUN and TURN 143 without extension, and allows for other similar protocols to be used 144 as well. However, it does require additional signaling capabilities 145 to be introduced into the multimedia session signaling protocols. 146 For those protocols which make use of the Session Description 147 Protocol (SDP), this specification defines the necessary extensions 148 to it. Other protocols will need to define their own mechanisms. 150 2. Multimedia Signaling Protocol Abstraction 152 This specification defines a general methodology that allows the 153 media streams of multimedia signaling protocols to successfully 154 traverse NAT. This methodology is independent of any particular 155 signaling protocol. 
In order to discuss the methodology, we need to 156 define an abstraction of a multimedia signaling system, and define 157 terms that can be used throughout this specification. Figure 1 shows 158 the abstraction.

160                             +-----------+
161                             |           |
162                             |           |
163                           > | Signaling |\
164                          /  |   Relay   | \
165                         /   |           |  \
166              Initiate  /    |           |   \ Initiate
167              Message  /  /  +-----------+    \ Message
168                      /  /                <    \
169                     /  /                  \    \
170                    /  /                    \    \
171                   /  /  Accept      Accept  \    \
172                  /  /   Message     Message  \    >
173                 /  /                          \
174   +-----------+   /                            \ +-----------+
175   |           |  <                               |           |
176   |           |           Media Stream           |           |
177   |  Session  | ................................ |  Session  |
178   | Initiator |                                  | Responder |
179   |           |           Media Stream           |           |
180   |           | ................................ |           |
181   +-----------+                                  +-----------+

183 Figure 1

185 Communications occur between two clients - the session initiator and 186 the session responder, also referred to as the initiator and 187 responder. The initiator is the one that decides to engage in 188 communications. To do so, it sends an initiate message. The 189 initiate message contains parameters that describe the capabilities 190 and configuration of media streams for the initiator. This message 191 may travel through signaling intermediaries, called signaling 192 relays, before finally arriving at the session responder. Assuming 193 the session responder wishes to communicate, it generates an accept 194 message, which is relayed back to the initiator. This message 195 contains capabilities and configuration of media streams for the 196 responder. As a result, media streams are established between the 197 initiator and responder. The signaling protocol may also support an 198 operation that allows for termination of the communications session. 199 We refer to this signaling message as a terminate message. 201 This abstraction is readily mapped to SIP, RTSP, and H.323, amongst 202 others.
For SIP, the initiator is the user agent that generates 203 an SDP offer [4], the responder is a SIP user agent that generates an 204 SDP answer to the offer, the initiate message is a SIP message 205 containing an SDP offer (for example, an INVITE), the accept message 206 is a SIP message containing an SDP answer (for example, a 200 OK), 207 and the terminate message is a BYE. For RTSP, the initiator is the 208 RTSP client, the responder is the RTSP server, the initiate message 209 is a SETUP message, and the accept message is a SETUP response. 211 The initiate and accept messages need to contain parameters, defined 212 by this specification, for the protocol to operate. The initiate and 213 accept messages are therefore defined by this specification as XML 214 documents containing the relevant information. Of course, multimedia 215 signaling protocols will not use these XML documents directly. 216 Rather, those protocols will need to define extensions as needed to 217 show how the initiate, accept and terminate messages map to messages 218 in the actual protocol, and how every element and attribute in the 219 XML document for those messages maps into parameters of the actual 220 protocol. Section 9 provides such a mapping for SIP. 222 3. Terminology 224 Several new terms are introduced in this specification: 226 Session Initiator: A software or hardware entity that, at the request 227 of a user, tries to establish communications with another entity, 228 called the session responder. A session initiator is also called 229 an initiator. 231 Initiator: Another term for a session initiator. 233 Session Responder: A software or hardware entity that receives a 234 request for establishment of communications from the session 235 initiator, and either accepts or declines the request. A session 236 responder is also called a responder. 238 Responder: Another term for a session responder. 240 Client: Either the initiator or responder.
242 Peer: From the perspective of one of the clients in a session, its 243 peer is the other client. Specifically, from the perspective of 244 the initiator, the peer is the responder. From the perspective of 245 the responder, the peer is the initiator. 247 Signaling Relay: An intermediary of signaling messages. Examples are 248 SIP proxies and H.323 Gatekeepers. 250 Initiate Message: The signaling message used by an initiator to 251 establish communications. It contains capabilities and other 252 information needed by the responder to send media to the 253 initiator. For SIP, this is any SIP message that contains an 254 offer. Usually, this is the initial INVITE. 256 Accept Message: The signaling message used by a responder to agree to 257 communications. It contains capabilities and other information 258 needed by the initiator to send media to the responder. For SIP, 259 this is any SIP message that contains an answer. Usually, this is 260 a 200 OK. 262 Terminate Message: The signaling message used by a client to terminate 263 the session and associated media streams. 265 Transport Address: The combination of an IP address and port. 267 Local Transport Address: A local transport address is a transport 268 address that has been allocated from the operating system on the 269 host. This includes transport addresses obtained through Virtual 270 Private Networks (VPNs) and transport addresses obtained through 271 Realm Specific IP (RSIP) [12] (which lives at the operating system 272 level). Transport addresses are typically obtained by binding to 273 an interface. 275 Usable Local Transport Address: A local transport address created for 276 the purposes of advertisement to ICE peers. 278 Associated Local Transport Address: An associated transport address 279 is a local transport address used solely to obtain a derived 280 transport address. Associated local transport addresses are never 281 advertised in ICE messages.
However, packets are received on them 282 when sent to the derived transport address. 284 Derived Transport Address: A derived transport address is a transport 285 address which is derived from an associated local transport 286 address. The derived transport address is related to the 287 associated local transport address in that packets sent to the 288 derived transport address are received on the socket bound to its 289 associated local transport address. Derived addresses are 290 obtained using protocols like STUN and TURN, and more generally, 291 any UNSAF protocol [14]. 293 Advertised Transport Addresses: The union of the usable local 294 transport addresses and the derived transport addresses. These 295 are the ones used in ICE messages. 297 Peer Derived Transport Address: A peer derived transport address is a 298 derived transport address learned from a STUN server running 299 within a peer in a media session. 301 TURN Derived Transport Address: A derived transport address obtained 302 from a TURN server. 304 STUN Derived Transport Address: A derived transport address obtained 305 from a STUN server whose address has been provisioned into the UA. 306 This, by definition, excludes Peer Derived Transport Addresses. 308 Unilateral Allocations: Queries made to a network server which 309 provides an UNSAF service. 311 Bilateral Allocations: Addresses obtained by using an UNSAF service 312 that actually runs on the peer of the communications session. 313 Peer derived transport addresses are synonymous with bilateral 314 allocations. 316 4. Overview of ICE 318 ICE makes the fundamental assumption that clients exist in a network 319 of segmented connectivity. This segmentation is the result of a 320 number of addressing realms in which a client can simultaneously be 321 connected. We use "realms" here in the broadest sense. A realm is 322 defined purely by connectivity. 
Two clients are in the same realm 323 if, when they exchange the addresses each has in that realm, they are 324 able to send packets to each other. This includes IPv6 and IPv4 325 realms, which actually use different address spaces, in addition to 326 private networks connected to the public Internet through NAT. 328 The key assumption in ICE is that a client cannot know, a priori, 329 which address realms it shares with any peer it may wish to 330 communicate with. Therefore, in order to communicate, it has to try 331 connecting to addresses in all of the realms.

333   Initiator         TURN,STUN Servers         Responder
334       |(1) Gather Addresses |                     |
335       |-------------------->|                     |
336       |(2) Initiate Msg.    |                     |
337       |------------------------------------------>|
338       |                     |(3) Gather Addresses |
339       |                     |<--------------------|
340       |(4) Accept Msg.      |                     |
341       |<------------------------------------------|
342       |(5) STUN Checks      |                     |
343       |<------------------------------------------|
344       |(6) STUN Checks      |                     |
345       |------------------------------------------>|
346       |(7) Media            |                     |
347       |<------------------------------------------|
348       |(8) Media            |                     |
349       |------------------------------------------>|

351 Figure 2

353 The basic flow of operation for ICE is shown in Figure 2. Before the 354 initiator establishes a session, it obtains as many IP address and 355 port combinations in as many address realms as it can. These 356 addresses all represent potential points at which the initiator will 357 receive a specific media stream. Any protocol that provides a client 358 with an IP address and port on which it can receive traffic can be 359 used. These include STUN, TURN, RSIP, and even a VPN. The client 360 also uses any local interface addresses. A dual-stack v4/v6 client 361 will obtain both a v6 and a v4 address/port. The only requirement is 362 that, across all of these addresses, the initiator can be certain 363 that at least one of them will work for any responder it might 364 communicate with.
Unfortunately, if the initiator communicates with 365 a peer that doesn't support ICE, only one address can be provided to 366 that peer. As such, the client will need to choose one default 367 address, which will be used by non-ICE clients. This would typically 368 be a TURN derived transport address, as it is most likely to work 369 with unknown non-ICE peers. 371 The initiator then runs a STUN server on each of the local transport 372 addresses it has obtained. These include ones that will be 373 advertised directly through ICE, and so-called associated local 374 transport addresses, which are not directly advertised; rather, the 375 transport address derived from them is advertised. The initiator 376 will need to be able to demultiplex STUN messages and media messages 377 received on that IP address and port, and process them appropriately. 378 All of these addresses are placed into the initiate message, and they 379 are ordered in terms of preference. Preference is a matter of local 380 policy, but typically, lowest preference would be given to transport 381 addresses learned from a TURN server (i.e., TURN derived transport 382 addresses). The initiate message also conveys one half of the 383 STUN username and the password which are required to gain access to 384 the STUN server on each address/port combination. 386 The initiate message is sent to the responder. This specification 387 does not address the issue of how the signaling messages themselves 388 traverse NAT. It is assumed that signaling protocol specific 389 mechanisms are used for that purpose. The responder follows a 390 process similar to the one the initiator followed; it obtains addresses from 391 local interfaces, STUN servers, TURN servers, etc., and it places all 392 of them, along with the other half of the STUN username and its 393 password, into the accept message.
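
The ordering and default selection described above can be illustrated with a minimal Python sketch. It is not part of the specification: the TransportAddress type, the "kind" labels and the numeric values in PREFERENCE are hypothetical, since the draft leaves preference to local policy and only suggests that TURN derived addresses rank lowest while typically serving as the default for non-ICE peers.

```python
from dataclasses import dataclass

# Hypothetical preference values; the draft leaves preference to local policy.
PREFERENCE = {"local": 1.0, "stun": 0.7, "turn": 0.3}


@dataclass(frozen=True)
class TransportAddress:
    ip: str
    port: int
    kind: str  # "local", "stun" (STUN derived) or "turn" (TURN derived)


def order_by_preference(addresses):
    """Return addresses from most to least preferred (TURN derived last)."""
    return sorted(addresses, key=lambda a: PREFERENCE[a.kind], reverse=True)


def choose_default(addresses):
    """Pick the single address offered to peers that do not support ICE.

    A TURN derived address is chosen when one exists, since it is the most
    likely to work with an unknown peer; otherwise fall back to the most
    preferred address.
    """
    for address in addresses:
        if address.kind == "turn":
            return address
    return order_by_preference(addresses)[0]


# Example addresses are illustrative only.
gathered = [
    TransportAddress("192.0.2.10", 40000, "turn"),
    TransportAddress("10.0.1.1", 5000, "local"),
    TransportAddress("198.51.100.7", 6000, "stun"),
]
ordered = order_by_preference(gathered)
default = choose_default(gathered)
```

Note that the least preferred candidate and the default can be the same address: preference ranks what an ICE peer should try first, while the default is what a non-ICE peer is given.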
395 Once the responder receives the initiate message, it has a set of 396 potential addresses it can use to communicate with the initiator. 397 The initiator will be running a STUN server at each address. The 398 responder sends a STUN request to each address, in parallel. When 399 the initiator receives these, it sends a STUN response. If the 400 responder receives the STUN response, it knows that it can reach its 401 peer at that address. It can then begin to send media to that 402 address. As additional STUN responses arrive, the responder will 403 learn about additional transport addresses which work. If one of 404 those has a higher priority than the one currently in use, it starts 405 sending media to that one instead. No additional control messages 406 (e.g., SIP signaling) occur for this change. 408 The STUN messages described above happen while the accept message is 409 being sent to the initiator. Once the initiator receives the 410 accept message, it too will have a set of potential addresses with 411 which it can communicate to the responder. It follows exactly the 412 same process described above. 414 Furthermore, when either the initiator or responder receives a STUN 415 request, it takes note of the source IP address and port of that 416 request. It compares that transport address to the existing set of 417 potential addresses. If it's not amongst them, it gets added as 418 another potential address. The incoming STUN message provides the 419 client with enough context to associate that transport address with a 420 STUN username, STUN password, and priority, just as if it had been 421 sent in an initiate or accept message. As such, the client begins 422 sending STUN messages to it as well, and if those succeed, the 423 address can be used if it has a higher priority. 425 5. Detailed ICE Algorithm 427 This section describes the detailed processing needed for ICE.
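
As a rough illustration of the behaviour summarized in Section 4, the Python sketch below tracks which peer addresses have answered a STUN check, moves media to the best one, and records previously unknown source addresses seen on incoming STUN requests. The CandidateList class, its method names and the convention that a larger number means a higher priority are all assumptions made for this example; the normative procedures are those of Section 5.3.4 and Section 5.3.5.

```python
from dataclasses import dataclass, field
from typing import Optional, Tuple

Addr = Tuple[str, int]  # (IP address, port)


@dataclass
class CandidateList:
    # Maps a peer transport address to its priority.
    # Assumption: a larger number means a more preferred address.
    priorities: dict = field(default_factory=dict)
    validated: set = field(default_factory=set)
    media_target: Optional[Addr] = None

    def on_stun_response(self, addr):
        """A STUN response shows that addr is reachable.

        Media moves to the highest priority validated address; no extra
        signaling message is involved in the switch.
        """
        self.validated.add(addr)
        self.media_target = max(self.validated, key=lambda a: self.priorities[a])
        return self.media_target

    def on_stun_request(self, source, priority):
        """Record the source of an incoming STUN request if it is unknown.

        Returns True when the address is new, meaning the client should
        begin sending its own STUN checks to it.
        """
        if source in self.priorities:
            return False
        self.priorities[source] = priority
        return True


checks = CandidateList(priorities={
    ("192.0.2.10", 40000): 0.3,    # for example, a TURN derived address
    ("198.51.100.7", 6000): 0.7,   # for example, a STUN derived address
})
first = checks.on_stun_response(("192.0.2.10", 40000))
second = checks.on_stun_response(("198.51.100.7", 6000))
is_new = checks.on_stun_request(("203.0.113.5", 7000), priority=0.9)
```

In this sketch a newly learned address is not used for media until a check sent to it has itself succeeded, which mirrors the "if those succeed" condition in the text.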
429 5.1 Initiator Processing 431 5.1.1 Sending the Initiate Message 433 When the initiator wishes to begin communications, it starts by 434 gathering transport addresses, as described in Section 5.3.1, and 435 starting a STUN server on each local transport address, both usable 436 and associated, as described in Section 5.3.2. This process can 437 actually happen at any time before sending an initiate message. A 438 client can pre-gather transport addresses, using a user interface cue 439 (such as picking up the phone, or entry into an address book) as a 440 hint that communication is imminent. Doing so eliminates any 441 additional perceivable call setup delays due to address gathering. 443 When it comes time to initiate communications, the initiator determines a 444 priority for each address and identifies one as a default, as described 445 in Section 5.3.3. 447 The next step is to construct the initiate message. Section 7 448 provides the XML schema for the initiate message. The message 449 consists of a series of media streams. For each media stream, there 450 is an IPv4 and/or an IPv6 default address, and a list of candidates. 451 Each candidate has information for RTP and optionally RTCP. RTCP 452 information is optional since, unfortunately, many systems don't 453 support it. If ICE did not indicate that RTCP was not supported, 454 connectivity checks would be made to the RTCP ports and fail, 455 confusing operation and adding unnecessary overhead. 457 The default address is the one that will be used by responders that 458 don't understand ICE (for SIP, this is accomplished by mapping the 459 default address into the m and c lines in the SDP). The candidates 460 represent addresses that the responder should try using the 461 mechanisms of this specification. The list of candidates includes 462 the defaults. In SIP, the candidates are conveyed with the new SDP 463 candidate parameter.
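
The shape of the initiate message described above (media streams, each with IPv4 and/or IPv6 defaults and a list of candidates carrying RTP and optional RTCP information) can be modelled with a small Python sketch. The class and field names here are hypothetical and are not taken from the XML schema in Section 7; the sketch only shows how optional RTCP keeps checks away from ports that do not exist, and how one might verify that the defaults appear among the candidates.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Addr = Tuple[str, int]  # (IP address, port)


@dataclass
class Candidate:
    rtp: Addr
    rtcp: Optional[Addr] = None  # left out when RTCP is not supported


@dataclass
class MediaStream:
    default_v4: Optional[Addr] = None
    default_v6: Optional[Addr] = None
    candidates: List[Candidate] = field(default_factory=list)

    def check_targets(self):
        """Addresses a peer would send connectivity checks to."""
        targets = []
        for candidate in self.candidates:
            targets.append(candidate.rtp)
            if candidate.rtcp is not None:
                targets.append(candidate.rtcp)
        return targets

    def defaults_listed(self):
        """True if every default address also appears as a candidate."""
        rtp_addresses = {c.rtp for c in self.candidates}
        defaults = [d for d in (self.default_v4, self.default_v6) if d]
        return all(d in rtp_addresses for d in defaults)


stream = MediaStream(
    default_v4=("192.0.2.10", 40000),
    candidates=[
        Candidate(rtp=("192.0.2.10", 40000), rtcp=("192.0.2.10", 40001)),
        Candidate(rtp=("198.51.100.7", 6000)),  # this system has no RTCP
    ],
)
targets = stream.check_targets()
```

Because the second candidate carries no RTCP address, only three transport addresses are checked rather than four.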
465 The client then encodes its usable local transport addresses and 466 derived transport addresses (including the one set as the default) as 467 a series of candidate elements. Each candidate element conveys a 468 transport address for RTP, a transport address for RTCP, a STUN 469 username fragment and STUN password for RTP, and one for RTCP. The 470 client MUST assign each candidate a unique identifier. These 471 identifiers MUST be unique across all candidates used within the 472 session. Though they are not used in this specification, they serve 473 as a convenient and short handle for each candidate within the 474 document. Experience has shown that explicit identifiers for 475 elements in SDP are a good idea. This identifier is encoded in the 476 "id" attribute of the candidate element. The priority for the 477 transport address, as computed above, is included as an attribute as 478 well. 480 Once the initiate message is constructed, it is sent. 482 5.1.2 Processing the Accept 484 There are two possible cases for processing of the Accept message. 485 If the recipient of the Initiate message did not support ICE, the 486 Accept message will only contain the default address information. As 487 a result, the initiator knows that it cannot perform its connectivity 488 checks. In this case, it SHOULD just send to the transport address 489 listed. However, if local configuration information tells the 490 initiator to try connectivity checks by sending them through the TURN 491 server, this means that packets sent directly to the responder may be 492 dropped by a local firewall. To deal with this, the initiator SHOULD 493 issue a SEND command using this new transport address as the 494 destination. The SEND command contains the media packet to send to 495 the responder. Once this command has been accepted, the initiator 496 SHOULD send all media packets through the TURN server, which will 497 then forward them towards the responder.
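
The decision an initiator faces in Section 5.1.2 can be summarized in a short Python sketch, covering both the non-ICE case just described and the case where the Accept message carries candidates. The function name, the returned labels and the checks_via_turn flag are hypothetical; the flag merely stands in for the local configuration that routes connectivity checks through the TURN server.

```python
def plan_after_accept(default_addr, candidates, checks_via_turn=False):
    """Decide how the initiator proceeds once the Accept message arrives.

    default_addr: (ip, port) default address from the Accept message.
    candidates: list of (ip, port) candidate addresses; empty if the
        responder did not support ICE.
    checks_via_turn: hypothetical flag for local configuration that sends
        connectivity checks through the TURN server.
    """
    if candidates:
        # The responder supports ICE: run connectivity checks on the list.
        return ("run_checks", list(candidates))
    if checks_via_turn:
        # Direct packets may be dropped by a local firewall, so media is
        # relayed: a SEND command first, then all media through TURN.
        return ("send_via_turn", [default_addr])
    # Non-ICE responder and no relay requirement: send to the default.
    return ("send_direct", [default_addr])


non_ice = plan_after_accept(("192.0.2.1", 5000), [])
relayed = plan_after_accept(("192.0.2.1", 5000), [], checks_via_turn=True)
ice = plan_after_accept(
    ("192.0.2.1", 5000),
    [("192.0.2.1", 5000), ("198.51.100.7", 6000)],
)
```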
499 If the Accept message contains candidates, it implies that the 500 responder supported ICE. In that case, the initiator takes each 501 candidate transport address, STUN username fragment, STUN password 502 and priority, and places them into a list, called the candidate list. 503 It then begins processing the candidate list as described in Section 504 5.3.4. That processing associates a state with each transport 505 address. As described there, once a successful STUN query is made to 506 the STUN server at an address, the initiator can begin sending media 507 to that address. 509 5.2 Responder Processing 511 5.2.1 Processing the Initiate Message 513 Upon receipt of the initiate message, the client starts gathering 514 transport addresses, as described in Section 5.3.1, and starts a STUN 515 server on each local transport address, as described in Section 516 5.3.2. This processing is done immediately on receipt of the 517 request, to prepare for the case where the user should accept the 518 call, or early media needs to be generated. By gathering addresses 519 while the user is being alerted to the request for communications, 520 session establishment delays due to that gathering can be eliminated. 522 At some point, the responder will decide to accept or reject the 523 communications. A rejection terminates ICE processing, of course. 524 In the case of acceptance, the accept message is constructed as 525 follows. 527 The client first determines a priority for each usable local 528 transport address and derived transport address it has gathered, and 529 identifies one as a default, as described in Section 5.3.3. 531 Constructing the accept message proceeds identically to the way in 532 which the initiate message is constructed (Section 5.1.1). 534 The accept message is then sent. 536 5.3 Common Procedures 538 This section discusses procedures that are common between initiator 539 and responder. 
541 5.3.1 Gathering Transport Addresses 543 A client gathers addresses when it believes that communication is 544 imminent. For initiators, this occurs before sending an initiate 545 message (Section 5.1.1). For responders, it occurs before sending an 546 accept message (Section 5.2.1). 548 There are two types of addresses a client can gather: usable local 549 transport addresses and derived transport addresses. Usable local 550 transport addresses are obtained by binding to an ephemeral port on 551 an interface (physical or virtual) on the host. A multi-homed host 552 SHOULD attempt to bind on all interfaces for all media streams it 553 wishes to receive. For media streams carried using the Real Time 554 Transport Protocol (RTP) [15], the client will need to bind to an 555 ephemeral port for both RTP and RTCP. 557 The result will be a set of usable local transport addresses. The 558 client may also have access to servers that provide unilateral 559 self-address fixing (UNSAF) [14]. Examples of such protocols include 560 STUN, TURN, and TEREDO [18]. UNSAF protocols work by having the 561 client send, from a specific associated local transport address, some 562 kind of message to a server. The server provides to the client, in 563 some kind of response, an additional transport address, called a 564 derived transport address. This derived transport address is derived 565 from the associated local transport address. Here, derivation means 566 that a request sent to the derived transport address might (under 567 good network conditions) reach the client on its associated local 568 transport address. 570 All ICE implementations SHOULD implement and use STUN and TURN for 571 unilateral allocation. STUN is an integral part of this 572 specification for connectivity checks and will always be present for 573 that purpose.
The usage of TURN and STUN for unilateral allocations 574 is at SHOULD strength, and not MUST, since there are many network 575 environments, and there will be deployments for which one of these 576 will never be used and will impose needless cost. However, one of 577 the key ideas behind ICE is that network conditions and connectivity 578 assumptions can, and will, change. Just because a client is 579 communicating with a server on the public network today doesn't mean 580 that it won't need to communicate with one behind a NAT tomorrow. 581 Just because a client is behind a full cone NAT today doesn't mean 582 that tomorrow its user won't pick up the client and take it to a public 583 network access point where there is a symmetric NAT. The way to 584 handle these cases and build a reliable system is for clients to 585 implement a diverse set of techniques for allocating addresses, so 586 that at least one of them is almost certainly going to work in any 587 situation. The combination of TURN, STUN and local address 588 allocations provides sufficient coverage to handle nearly any NAT 589 configuration. Implementors should consider very carefully any 590 assumptions that they make about deployments before electing not to 591 implement one of these mechanisms for address allocation. In 592 particular, implementors should consider whether the elements in the 593 system may be mobile, and connect through different networks with 594 different connectivity. They should also consider whether endpoints 595 which are under their control, in terms of location and network 596 connectivity, would always be under their control. Only in cases 597 where implementors truly believe that these cases will not require 598 either TURN or STUN allocations, should those techniques not be 599 implemented. 601 For each UNSAF protocol, the client may have access to a multiplicity 602 of servers.
For example, a user connected to a NATed cable access 603 network might have access to a STUN server in the private cable 604 network and in the public Internet. For each server for each UNSAF 605 protocol, the client MUST bind to a new local transport address, and 606 use it to obtain a single derived transport address from that server. This 607 local IP address and port is called an associated local transport address. 608 These addresses are not advertised to peers in ICE messages; their 609 derived transport addresses are. As a result of using a different 610 local transport address for each derived transport address, every 611 transport address advertised in an ICE message is either a unique 612 local transport address, or else is derived from a unique local 613 transport address. 615 If a derived transport address is equal to the associated local 616 transport address from which it was derived, the local transport 617 address SHOULD be promoted to a usable local transport address. This 618 is preferable to using a new local transport address, since 619 the UNSAF protocol may have caused pinholes to open in intervening 620 firewalls. 622 Implementations MAY use other protocols that provide derived 623 transport addresses, as long as those techniques meet the following 624 conditions: 626 1. The technique does not require its peer to know about, or 627 understand the technique in order to interoperate. 629 2. The technique can provide the client with an IP address and port 630 that may be reachable by some peers. 632 3. The technique allows the client to receive STUN connectivity 633 checks in addition to media packets on the same IP address and 634 port. 636 4. The technique allows the client to send packets to a peer, so 637 that the peer will see the derived transport address as the 638 source IP address and port of the packet.
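The per-server binding rule and the promotion rule of Section 5.3.1 can be illustrated with a short, non-normative Python sketch. All names here (bind_udp, gather, and the query callback that stands in for an UNSAF exchange such as a STUN or TURN request) are assumptions made for illustration; no real STUN or TURN library is used, and only one usable local address on a single interface is gathered.

```python
import socket


def bind_udp(ip="127.0.0.1"):
    """Bind a UDP socket to an ephemeral port; return (socket, (ip, port))."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((ip, 0))
    return sock, sock.getsockname()


def gather(servers, query, bind=bind_udp):
    """Gather usable local and derived transport addresses.

    query(sock, local_addr, server) is a hypothetical callback that performs
    the UNSAF exchange and returns the derived (ip, port).
    """
    sock, addr = bind()
    usable = {addr: sock}      # usable local transport addresses
    associated = {}            # associated local transport addresses
    derived = {}               # derived address -> associated local address
    for server in servers:
        sock, local = bind()   # a new local transport address per server
        mapped = query(sock, local, server)
        if mapped == local:
            # Derived address equals the local one: promote to usable.
            usable[local] = sock
        else:
            associated[local] = sock
            derived[mapped] = local
    return usable, derived, associated
```

Returning the sockets alongside the addresses keeps each binding open, which matters because a STUN server is later started on every local transport address.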
640 5.3.2 Enabling STUN on Each Local Transport Address 642 Once the client has obtained a set of transport addresses, it starts 643 a STUN server on each local transport address, including both 644 associated local transport addresses and usable transport addresses. 645 These include ones used for both RTP and RTCP. This, by definition, 646 means that the STUN service will be reached for requests sent to the 647 derived addresses. 649 However, the client does not need to provide STUN service on any 650 other IP address or port, unlike the STUN usage described in [1]. 651 The need to run the service on multiple ports is to support the 652 change flags. However, those flags are not needed with ICE, and the 653 server SHOULD reject, with a 400 response, any STUN requests with 654 these flags set. The CHANGED-ADDRESS attribute in a BindingResponse 655 is set to the transport address on which the server is running. 657 Furthermore, there is no need to support TLS or to be prepared to 658 receive SharedSecret request messages. Those messages are used to 659 obtain shared secrets to be used with BindingRequests. However, with 660 ICE, usernames and passwords are exchanged in the signaling protocol. 662 The client will receive both STUN requests and media packets on each 663 local transport address. The client MUST be able to disambiguate 664 them. In the case of RTP/RTCP, this disambiguation is easy. RTP and 665 RTCP packets start with the bits 0b10 (v=2). The first two bits in 666 STUN are always 0b00. This disambiguation also works for packets 667 sent using Secure RTP [16], since the RTP header is in the clear. 668 Disambiguating STUN with other media stream protocols may be more 669 complicated. However, it is always possible with arbitrarily 670 high probability by selecting an appropriately random username (see 671 below). 673 The need to run STUN on the same transport address as the media 674 stream represents the "ugliest" piece of ICE.
However, it is an 675 essential part of the story. By sending STUN requests to the very 676 same place media is sent, any bindings learned through STUN will be 677 useful even when communicating through symmetric NATs. This results 678 in a substantial increase in the scope of applicability of STUN. 680 For each transport address advertised in the initiate message, the 681 client MUST choose a username fragment and a password. The username 682 fragment created by the client (called the local username fragment) 683 is concatenated with the fragment created by its peer (called the 684 remote username fragment) to create the actual username used for 685 access to the STUN server that will receive packets sent to that 686 transport address. This username will be present in STUN requests 687 sent by its peer. Creating the username as a combination of 688 information from each side of a call allows a client to correlate 689 the source of the request with a candidate transport address. This 690 is discussed further below. 692 The username fragment MUST be globally unique with high probability, 693 and different for each advertised transport address. It SHOULD be 694 persistently used over time for that particular transport address. A 695 value computed as the 128 bit hash of the transport address 696 concatenated with a 128 bit random number selected to identify the 697 host will meet these requirements. This results in two properties. 698 First, each transport address can be uniquely identified. Second, 699 no other host will select a username with the same value. The 700 password MUST be random with at least 128 bits of randomness and is 701 selected separately for each transport address advertised as part of 702 a distinct session. This means that RTP and RTCP, which run on 703 different transport addresses, will get different usernames and 704 passwords.
The password will remain constant during a session with a 705 peer, but will otherwise vary across sessions. The username fragment 706 and password will be passed to its peer in an initiate or accept 707 message. Because the password is conveyed through these signaling 708 protocols, those protocols MUST provide facilities for encryption, 709 authentication and message integrity, and those facilities SHOULD be 710 used when ICE is employed. As such, the process described in this 711 section will associate, with each local transport address, a username 712 fragment and password. The client also associates this same username 713 fragment and password with any transport addresses derived from the 714 local transport address. 716 The global uniqueness requirement stems from the lack of uniqueness 717 afforded by IP addresses. Consider clients A, B, and C. A and B are 718 within private enterprise 1, which is using 10.0.0.0/8. C is within 719 private enterprise 2, which is also using 10.0.0.0/8. As it turns 720 out, B and C both have IP address 10.0.1.1. A initiates 721 communications to C. C, in its accept message, provides A with its 722 transport addresses. In this case, that's 10.0.1.1:8866 and 8877. As 723 it turns out, B is in a session at that same time, and is also using 724 10.0.1.1:8866 and 8877. This means that B has a STUN server running 725 on those ports, just as C does. A will send a STUN request to 726 10.0.1.1:8866 and 8877. However, these do not go to C as expected. 727 Instead, they go to B. If B just replied to them, A would believe it 728 has connectivity to C, when in fact it has connectivity to a 729 completely different user, B. To fix this, the STUN username 730 fragment takes on the role of a unique identifier. C provides A with 731 a unique username fragment, and A provides one to C. A uses these 732 two fragments to construct the username in its STUN query to 733 10.0.1.1:8866. This STUN query arrives at B.
However, the username 734 is unknown to B, and so the request is rejected. A treats the 735 rejected STUN request as if there were no connectivity to C (which is 736 actually true). Therefore, the error is avoided. 738 An unfortunate consequence of the non-uniqueness of IP addresses is 739 that, in the above example, B might not even be an ICE client. It 740 could be any host, and the port to which the STUN packet is directed 741 could be any ephemeral port on that host. If there is an application 742 listening on this socket for packets, and it is not prepared to 743 handle malformed packets for whatever protocol is in use, the 744 operation of that application could be affected. Fortunately, since 745 the ports exchanged in SDP are ephemeral and usually drawn from the 746 dynamic or registered range, the odds are good that the port is not 747 used to run a server on host B, but rather is the client side of some 748 protocol. This decreases the probability of hitting a port in use, 749 due to the transient nature of port usage in this range. However, 750 the possibility of a problem does exist, and network deployers should 751 be prepared for it. 753 Termination of the local STUN servers is discussed in Section 5.3.6. 755 5.3.3 Prioritizing the Transport Addresses and Choosing a Default 757 The prioritization process takes the list of the advertised transport 758 addresses, and associates each with a priority. This priority 759 reflects the desire that the UA has to receive media on that address, 760 and is assigned as a value from 0 to 1 (1 being most preferred). 761 Priorities are ordinal, so that their significance is only relative 762 to other transport address priorities in the same list. 764 This specification makes no normative recommendations on how the 765 prioritization is done. However, some useful guidelines are 766 suggested on how such a prioritization can be determined.
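As a non-normative illustration of the ordinal 0-to-1 scale, the following Python sketch shows one possible assignment. It anticipates two of the guidelines discussed in the remainder of this section (relayed addresses are ranked lower, and native IPv6 is preferred over 6to4, which is preferred over IPv4). The prioritize function, the dictionary keys and the family labels are illustrative assumptions; the specification does not mandate any particular method.

```python
def prioritize(addresses):
    """Map each address to a priority in (0, 1], 1 being most preferred.

    addresses: list of dicts with the hypothetical keys
      'addr'    - the transport address, e.g. ("192.0.2.1", 5000)
      'relayed' - True if media sent there transits a relay (TURN, VPN)
      'family'  - one of 'v6', '6to4', 'v4'
    """
    family_rank = {"v6": 0, "6to4": 1, "v4": 2}
    # Non-relayed addresses first, then by address family preference.
    ordered = sorted(
        addresses,
        key=lambda a: (a["relayed"], family_rank[a["family"]]),
    )
    n = len(ordered)
    # Only the relative order matters, so evenly spaced values suffice.
    return {a["addr"]: (n - i) / n for i, a in enumerate(ordered)}
```

Because priorities are ordinal, any strictly decreasing assignment over the sorted list would serve equally well.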
768 One criterion for choosing one transport address over another is 769 whether or not that transport address involves the use of a relay. 770 That is, if media is sent to that transport address, will the media 771 first transit a relay before being received? TURN derived transport 772 addresses make use of relays (the TURN server), as do any local 773 transport addresses associated with a VPN server. When media is 774 transited through a relay, it can increase the latency between 775 transmission and reception. It can increase the packet losses, 776 because of the additional router hops that may be taken. It may 777 increase the cost of providing service, since media will be routed in 778 and right back out of a relay run by the provider. If these concerns 779 are important, transport addresses with this property can be listed 780 with lower priority. 782 Another criterion for choosing one address over another is IP address 783 family. ICE works with both IPv4 and IPv6. It therefore provides a 784 transition mechanism that allows dual-stack hosts to prefer 785 connectivity over IPv6, but to fall back to IPv4 in case the v6 786 networks are disconnected (due, for example, to a failure in a 6to4 787 relay) [17]. It can also help with hosts that have both a native 788 IPv6 address and a 6to4 address. In such a case, higher priority 789 could be afforded to the native v6 address, followed by the 6to4 790 address, followed by a native v4 address. This allows a site to 791 obtain and begin using native v6 addresses immediately, yet still 792 fall back to 6to4 addresses when communicating with clients in other 793 sites that do not yet have native v6 connectivity. 795 Another criterion for choosing one address over another is security.
796 If a user is a telecommuter, and therefore connected to their 797 corporate network and a local home network, they may prefer their 798 voice traffic to be routed over the VPN in order to keep it on the 799 corporate network when communicating within the enterprise, but use 800 the local network when communicating with users outside of the 801 enterprise. 803 Another criterion for choosing one address over another is topological 804 awareness. This is most useful for transport addresses which make 805 use of relays (including TURN and VPN). In those cases, if a client 806 has preconfigured or dynamically discovered knowledge of the 807 topological proximity of the relays to itself, it can use that to 808 select closer relays with higher priority. 810 Once the transport addresses have been prioritized, one is selected 811 as the default. This is the address that will be used by a peer that 812 doesn't understand ICE. The default has no relevance when 813 communicating with an ICE capable peer. As such, it is RECOMMENDED 814 that the default be chosen based on the likelihood of that address 815 being useful when communicating with a peer that doesn't support ICE. 816 Unfortunately, it is difficult to ascertain which address that might 817 be. As an example, consider a user within an enterprise. To reach 818 non-ICE capable clients within the enterprise, a local transport 819 address has to be used, since the enterprise policies may prevent 820 communication between elements using a relay on the public network. 821 However, when communicating to peers outside of the enterprise, a 822 TURN-based public address is needed. 824 Indeed, the difficulty in picking just one address that will work is 825 the whole problem that motivated the development of this 826 specification in the first place. As such, it is RECOMMENDED that 827 the default address be a TURN derived transport address from a TURN 828 server providing public IP addresses.
Furthermore, ICE is only truly 829 effective when it is supported on both sides of the session. It is 830 therefore most prudent to deploy it to close-knit communities as a 831 whole, rather than piecemeal. In the example above, this would mean 832 that ICE would ideally be deployed completely within the enterprise, 833 rather than just to parts of it. 835 5.3.4 Sending STUN Connectivity Checks 837 Once a responder has received an initiate message, or an initiator 838 has received an accept message, the list of transport addresses is 839 extracted from the message. These transport addresses, called the 840 remote transport addresses, along with the username fragment from the 841 peer (called the remote username fragment), the password from the 842 peer (called the remote password), and priority from the peer (called 843 the remote priority) are placed into a table called the candidate 844 table. There is a candidate table for RTP for each media stream, and 845 for RTCP for each media stream. So, if a session is established with 846 audio and video, there would be four tables - audio RTP, audio RTCP, 847 video RTP and video RTCP. An example of a candidate table for RTP 848 audio is shown below.
850   Remote             Remote       Remote                Remote
851   Transport          Username     Password              Priority
852   Address            Fragment
853   --------------------------------------------------------------------
854   10.0.1.1:38746     asd9f8f8==   9asfhfvva9==affahnz   0.4
855   192.0.2.77:44634   xcyca87sbb   f99fhaz0ftrafdgl99d   0.2
857   Figure 3
859 The client then creates a new table, called the connection table. 860 There is a row in this table for each gathered address and remote 861 transport address pair. This table has a column for the local 862 transport address, which is equal to the gathered address if it was a 863 usable local transport address, else equal to the associated local 864 transport address if the gathered address was a derived address.
865 There is also a column for the remote transport address, the local 866 username fragment, the remote username fragment, the remote password 867 and the state. Each row in this table is called a connection, and it 868 provides information on the connectivity when sending packets from 869 the local transport address to the remote transport address. 871 There are four possible states for each connection. These states 872 are: 874 INIT: No STUN transaction has been completed towards this remote 875 transport address from this local transport address. 877 HANDSHAKING: One or more STUN transactions have failed, but 878 insufficient time has passed since leaving the INIT state to be 879 certain that the remote transport address is unreachable from this 880 local transport address. This state is important for connectivity 881 checks made to STUN derived transport addresses through port 882 restricted NAT. 884 BAD: All STUN transactions to this remote transport address from this 885 local transport address have either timed out, or failed with a 886 600 response, and a sufficient amount of time has elapsed since 887 the INIT state to have high confidence that the remote transport 888 address cannot be reached from this local transport address. 890 GOOD: A STUN transaction to this remote transport address from this 891 local transport address was successful. 893 When the client first populates the tables from the initiate or 894 accept message, all of the connections are set to the INIT state. 896 Consider the following example. An initiator sends an initiate 897 message with one media stream (audio), with two RTP transport 898 addresses, 10.0.1.1:38746 (which we denote "A" for shorthand) and 899 192.0.2.77:44634 (which we denote "B" for shorthand). A is a usable 900 local transport address, and B is a STUN derived transport address 901 (although that fact is not signaled in the message).
The usernames 902 and passwords for these transport addresses are shown in Figure 3. 903 The initiate message is sent to the responder. The responder has a 904 local transport address (10.0.1.76:43443), and a STUN derived 905 transport address (192.0.2.64:54766) derived from (10.0.1.76:43444). 906 Call these two local transport addresses X and Y respectively. The 907 connection table created by the responder would have four rows (two 908 local transport addresses times two remote transport addresses). 909 Such a table might look like this:
911   Remote   Local    Remote      Local      Remote               Remote
912   Trans.   Trans.   Username    Username   Password             Priority
913   Address  Address  Fragment    Fragment                                  State
914   ------------------------------------------------------------------------
915   A        X        asd9f8f8==  8asd77fa9  9asfhfvva9==affahnz  0.4       INIT
916   A        Y        asd9f8f8==  zhff8dga^  9asfhfvva9==affahnz  0.4       INIT
917   B        X        xcyca87sbb  8asd77fa9  f99fhaz0ftrafdgl99d  0.2       INIT
918   B        Y        xcyca87sbb  zhff8dga^  f99fhaz0ftrafdgl99d  0.2       INIT
920 The client begins a STUN BindingRequest transaction for each 921 connection. This STUN transaction is sent to the IP address and port 922 from the Remote Transport Address column. It sends the request from 923 the IP address and port in the Local Transport Address column. The 924 STUN USERNAME attribute MUST be present. It is set to the 925 concatenation of the remote username fragment with the local username 926 fragment from the table. Thus, for the candidate with remote 927 transport address A and local transport address X, the USERNAME would 928 be set to "asd9f8f8==8asd77fa9". The BindingRequest SHOULD contain a 929 MESSAGE-INTEGRITY attribute, computed using the username in the 930 USERNAME attribute, and the password from the password field in the 931 row. The BindingRequest MUST NOT contain the CHANGE-REQUEST or 932 RESPONSE-ADDRESS attribute. 934 Each of these STUN transactions will generate either a timeout, or a 935 response.
If the response is a 420, 500, or 401, the client should 936 try again as described in RFC 3489. Either initially, or after such 937 a retry, the STUN transaction will produce a timeout result, a 938 success result, a fundamentally non-recoverable failure result (error 939 codes 400, 431, or 600) or a failure result inapplicable to this 940 usage of STUN and thus unrecoverable (432, 433), or a 430 error. 941 These correspond to the "timeout", "success", "error" and 942 "race-failure" events, respectively. The 430 response code, as 943 described below, is generated when the server doesn't recognize the 944 STUN username, presumably because the BindingRequest was sent to the 945 initiator prior to receipt of the ICE Accept message by the 946 initiator. Its occurrence is thus a result of a failed race between 947 the BindingRequest and Accept message. As the state machine below 948 discusses, the client will retry in this case. 950 These events are fed into the finite state machine (FSM) described in 951 Figure 5. This figure shows the transitions between states that 952 occur on the completion of the STUN BindingRequest transaction or 953 upon the expiration of timers set by the FSM.
955                         race-failure,..........
956                         timeout/     .        .   ..........
957                         Set          .        .   .        . Retry Fires/
958                         Retry Timer, .        V   .        . Retry
959           +---------+                .      +---------+    .
960           |         |                .      |         |    .
961           |         |                .......|         |<....
962           |  INIT   |......................>|  HAND   |
963           |         |  race-failure,        | SHAKING |
964           |         |  timeout/             |         |
965           +---------+  Set                  +---------+
966             .  .       Retry Timer,   error,  .     .
967             .  .       Giveup Timer   Giveup  .     .
968     error   .  .                      Fires   .     .
969             .  .  .............................     . success
970             .  .  .                                 .
971             .  ...C..............................   .
972             .     .     success                 .   .
973             .     .                             .   .
974             V     V                             V   V
975           +---------+                       +---------+
976           |         |                       |         |
977           |         |                       |         |
978           |   BAD   |.                      |  GOOD   |
979           |         |                       |         |
980           |         |                       |         |
981           +---------+                       +---------+
983   Figure 5
985 Starting in the INIT state, if the transaction is successful, the 986 client has verified connectivity to that remote transport address 987 when sending from that local transport address. This means that 988 media packets sent in exactly the same way will get through. As 989 such, the FSM transitions to the GOOD state. If, from the INIT 990 state, the STUN transaction times out, the FSM enters the HANDSHAKING 991 state. At this point, there are two likely reasons that the STUN 992 transaction might have timed out. One reason is that the candidate 993 is simply unreachable. The other reason is that the peer is behind a 994 port restricted NAT, and so STUN requests from the client cannot get 995 through until its peer creates a permission by generating its own 996 STUN request. It may take some time to generate that STUN request, 997 as it may depend on a response message getting delivered. It is also 998 possible that the STUN transaction timed out due to a transient 999 network failure, in which case, a retry is in order. As such, the 1000 HANDSHAKING state allows for rapid retry of the STUN transaction 1001 until enough time has passed to be certain that the remote transport 1002 address is actually unreachable. Thus, upon entering the HANDSHAKING 1003 state, two timers are set. The first, called the Rapid Retry timer, 1004 determines how long until the next attempt. This timer SHOULD be 1005 configurable. It is RECOMMENDED that it default to 50ms. Note that 1006 this timer does not mean that a STUN request is repeated every 50ms. 1007 It means that a new STUN transaction begins 50ms after the completion 1008 of the previous one. STUN transactions themselves employ 1009 exponential backoff retransmit timers. The second timer, called 1010 the Giveup Timer, determines how long the client will keep trying 1011 until it decides that the remote transport address is unreachable.
1012 This timer SHOULD be configurable. It is RECOMMENDED that it default 1013 to 50 seconds. This is a reasonable approximation of the maximum SIP 1014 transaction duration. 1016 If, from the INIT state, the STUN transaction generates a race-failure 1017 event, it means that the peer has not yet completed the 1018 initiate/accept exchange, and thus the username has not been 1019 allocated. Another BindingRequest transaction needs to take place to 1020 try again. Thus, the same retry and giveup timers as in the timeout 1021 event are started. 1023 If, from the INIT state, the STUN transaction generates an error, the 1024 FSM moves into the BAD state. This state means that the connection 1025 is definitively unreachable, and it will not be used subsequently in 1026 the session. 1028 If, while in the HANDSHAKING state, the Giveup timer fires, or the 1029 STUN transaction results in an error, the client moves into the BAD 1030 state. If, while in the HANDSHAKING state, the Rapid Retry timer 1031 fires, a new STUN transaction is started. The output of that 1032 transaction will be subsequently fed into the FSM, but upon 1033 initiation of the retry attempt there is no change in state. If the 1034 pending BindingRequest transaction succeeds, the FSM moves into the 1035 GOOD state. This transport connection is viable for communications. 1037 Once one of the connections in the connection table enters the GOOD 1038 state, the client SHOULD begin using it for communications. It 1039 SHOULD cease any ongoing transactions and terminate FSMs for 1040 connections of lower priority. If another connection of higher 1041 priority should subsequently enter the GOOD state, the client SHOULD 1042 switch to that one, and once more cease all ongoing transactions and 1043 terminate FSMs for connections of lower priority.
It SHOULD perform 1044 this switch after waiting a small period of time (2 seconds is 1045 RECOMMENDED) to protect against quick changes in transport address as 1046 each of the ongoing connectivity checks completes. If there are 1047 multiple GOOD connections whose priorities are equal and higher than 1048 any other GOOD connection, the client SHOULD pick one randomly and 1049 use that. It SHOULD NOT change to another one of equal priority 1050 later on. Each change in address is likely to cause a change in 1051 transport characteristics, and manifest itself as a "glitch" to the 1052 user. 1054 To send media on a connection, the client sends media packets 1055 (whether they are RTP or RTCP or something else) to the remote 1056 transport address, from the local transport address. 1058 5.3.5 Receiving STUN Requests 1060 When a client receives a STUN request (presumably after 1061 disambiguating it from a media packet), it follows the logic 1062 described in this section. 1064 The client MUST follow the procedures defined in RFC 3489 and verify 1065 that the USERNAME attribute is known to the server. Here, this is 1066 done by taking the USERNAME attribute, and doing a prefix match 1067 against the "local username fragment" column in the connection table. 1068 If it doesn't match any rows, the client generates a 400 response. 1069 If it matches one or more rows, the client checks the suffix of the 1070 username against the "remote username fragment" column in those 1071 matching rows. If the final result doesn't match any rows, the 1072 client generates a 430 response. If the final result matches a 1073 single row, that row identifies the connection on which the STUN 1074 request was received. The client then proceeds with the processing 1075 of the request and generation of a response as per RFC 3489. 1077 Once the response is sent, the client examines the source IP and port 1078 where the request came from.
It matches those against the remote 1079 transport addresses of the matching connection from the previous 1080 paragraph. If they don't match, and that remote transport address is 1081 not elsewhere in the table, this source transport address is itself 1082 another possible candidate. As with other candidates, it must be 1083 associated with a STUN remote username fragment, remote password and 1084 remote priority. These are obtained from the values of these columns 1085 for the matching connection in the table. This candidate is then 1086 paired with each local transport address, and the resulting set of 1087 connections are added to the connection table and verified using STUN 1088 connectivity checks as per Section 5.3.4. 1090 When will the source transport address of the BindingRequest not 1091 match an existing candidate remote transport address? This happens 1092 when there is a NAT between the peers which is not on the path 1093 between each peer and the UNSAF servers. 1095 5.3.6 Management of Resources 1097 The beginning of a multimedia session results in the creation of 1098 several resources to support ICE. These include gathered addresses, 1099 both local and derived, along with the local STUN servers that run on 1100 the local addresses. These resources must be maintained and 1101 eventually freed. 1103 It is RECOMMENDED that all gathered addresses be retained for the 1104 duration of the session. Even if they are not used initially, this 1105 allows them to be used later in the session should conditions change, 1106 requiring a signaling operation to update the set of candidate 1107 addresses. Maintaining these resources depends on the type of 1108 resource. For a local transport address, nothing is required. The 1109 socket is maintained until freed by the ICE application. For STUN 1110 derived transport addresses, the bindings in the NAT for that address 1111 need to be maintained. 
If the derived transport address is used by the peer for media, the media itself serves to keep the bindings alive (see Section 5.3.7). A client can determine that a STUN derived transport address was used for media when an RTP packet arrives at the associated local transport address. For the other STUN derived transport addresses, the client SHOULD periodically generate STUN transactions to the STUN server. Every 20 seconds is RECOMMENDED.

For TURN derived transport addresses, the bindings in the NAT along with the mappings in the TURN server need to be maintained. Media traffic itself can accomplish that. The client will know that its TURN derived transport address is in use when an RTP packet arrives at the associated local transport address. For other TURN derived transport addresses, the TURN keepalive mechanisms SHOULD be used.

Once the STUN servers are started on the local transport addresses, they MUST run until a valid media packet is detected on that transport address. Once a media packet is received, it signals that the peer has completed its connectivity checks and has decided to use that transport address (or the derived transport address, as the case may be) for media communications. While the server is running, it MUST act as a normal STUN server, but MUST only accept STUN requests from clients that authenticate, as discussed in Section 5.3.5.

5.3.7 Binding Keepalives

Once the STUN connectivity checks complete, STUN packets are no longer used. However, bindings in intermediate NATs need to be kept alive so that the media can continue to flow. Doing so is the responsibility of the media protocol.

In the case of RTP, the RTP packets themselves normally come sufficiently quickly to keep the bindings alive. However, several cases merit further discussion.
Firstly, in some RTP usages, such as SIP, the media streams can be "put on hold". This is accomplished by using the SDP "sendonly" or "inactive" attributes, as defined in RFC 3264 [4]. RFC 3264 directs implementations to cease transmission of media in these cases. However, doing so may cause NAT bindings to time out, and media won't be able to come off hold.

As such, clients SHOULD instead send a media packet periodically, independent of whether the stream is "sendonly", "recvonly" or "inactive". At least once every 20 seconds is RECOMMENDED. These packets can be sent using any of the payload formats listed by the peer in its SDP. For audio streams, it is RECOMMENDED that implementations support the RTP payload format for comfort noise [5], which makes a good choice. For video codecs, a minimally coded frame is a good choice.

Secondly, some RTP payload formats, such as the payload format for text conversation [19], may send packets so infrequently that the interval exceeds the NAT binding timeouts. In such cases, the implementation should send some kind of content, if possible. If the payload type doesn't allow anything meaningful to be sent, even a malformed RTP packet is superior to nothing at all; the malformed packet would be rejected by the peer, and have the side effect of keeping the NAT bindings open.

6. Running STUN on Derived Transport Addresses

One of the seemingly bizarre operations done during the ICE processing is the transmission of a STUN request to a transport address which is obtained through TURN or STUN itself. This actually does work, and in fact, has extremely useful properties. The subsections below go through the detailed operations that would occur at each point to demonstrate correctness and the properties derived from it. They are tutorial in nature.
6.1 STUN on a TURN Derived Transport Address

              +----------+
              |          |192.0.2.1:26524
              |  TURN    X
              |  Server  |
              |          |
              |          |
              +----------+
   192.0.2.1:7764.    ^192.0.2.1:7764
                 .    .
                 .    .192.0.2.88:5063
              +----------+
              |   NAT    |
              +----------+
   TURN          .    .
   Response      .    . TURN Request
                 .    .
   10.0.1.1:8866 V    .10.0.1.1:8866
              +----------+                     +----------+
              |          |                     |          |
              |  Client  |                     |  Client  |
              |          |                     |          |
              |    A     |                     |    B     |
              |          |                     |          |
              +----------+                     +----------+

Figure 6

Consider a client A that is behind a NAT, shown in Figure 6. It connects to a TURN server on the public side of the NAT. To do that, A binds to a local transport address, say 10.0.1.1:8866, and then sends a TURN request to the TURN server. The NAT translates the net-10 address to 192.0.2.88:5063. Assume that the TURN server is running on 192.0.2.1 and listening for TURN traffic on port 7764. The TURN server allocates a derived transport address 192.0.2.1:26524 to the client (shown as the X on the TURN server in the diagram), and returns it in the TURN response. Remember that all traffic from the TURN server to the client is sent from 192.0.2.1:7764 to 10.0.1.1:8866, including the TURN response.

Now, the client runs a STUN server on 10.0.1.1:8866, and advertises that its server actually runs on 192.0.2.1:26524. Another client, B, sends a STUN request to this server. It sends it from a local transport address, 192.0.2.77:1296. When it arrives at 192.0.2.1:26524, it is discarded since client A has not sent a packet to 192.0.2.77:1296. Once client A gets client B's accept message, it will learn about B's candidate address, and generate a STUN request towards it. This results in a permission being installed in the TURN server, so that packets from 192.0.2.77:1296 will now be accepted. The next STUN request from client B will therefore succeed.
This is the normal mode of operations for port restricted NAT; as described in TURN, the server turns a symmetric NAT into a port restricted one [8].

              +----------+
              |          |192.0.2.1:26524   STUN Request
              |  TURN    X<.............................
              |  Server  |   STUN Response             .
              |          |.........................    .
              |          |192.0.2.1:26524         .    .
              +----------+                        .    .
  192.0.2.1:7764 .    ^ 192.0.2.1:7764            .    .
                 .    .                           .    .
 192.0.2.88:5063 V    . 192.0.2.88:5063           .    .
              +----------+                        .    .
              |   NAT    |                        .    .
              +----------+                        .    .
  192.0.2.1:7764 .    ^ 192.0.2.1:7764            .    .
                 .    .                192.0.2.77:1296 .
                 .    .                           .    .
   10.0.1.1:8866 V    . 10.0.1.1:8866             V    .192.0.2.77:1296
              +----------+                     +----------+
              |          |                     |          |
              |  Client  |                     |  Client  |
              |          |                     |          |
              |    A     |                     |    B     |
              |          |                     |          |
              +----------+                     +----------+

Figure 7

As shown in Figure 7, client B will retry, sending its STUN request from 192.0.2.77:1296 to 192.0.2.1:26524. This successful STUN request is forwarded to the client, sent with a source address of 192.0.2.1:7764 and a destination address of 192.0.2.88:5063. This passes through the NAT, which rewrites the destination address to 10.0.1.1:8866. This arrives at A's STUN server. The server observes the source address of 192.0.2.1:7764, and generates a STUN response containing this value in the MAPPED-ADDRESS attribute. The STUN response is sent with a source address of 10.0.1.1:8866, and a destination of 192.0.2.1:7764. This arrives at the TURN server, which, because the current destination is 192.0.2.1:7764, sends the STUN response with a source address of 192.0.2.1:26524 and destination of 192.0.2.77:1296, which is B's STUN client.

Now, as far as A is concerned, it has obtained a new candidate transport address of 192.0.2.1:7764. And indeed, it has! STUN derived transport addresses are scoped to the session, so they can only be used by the peer in the session.
Furthermore, that peer has to send requests from the socket on which the STUN server was running. In this case, A is the peer, and its STUN server was on 10.0.1.1:8866. If it sends to 192.0.2.1:7764, the packet goes to the TURN server, and since the destination address is set to 192.0.2.77:1296, is forwarded to B, and specifically, is forwarded to the transport address B sent the STUN request from. Therefore, the address is indeed a valid candidate transport address. Its priority is derived from the priority of client B's public IP address.

The benefit of this is that it allows two clients to share the same TURN server for media traffic in both directions. With "normal" TURN usage, both clients would obtain a derived address from their own TURN servers. The result is that, for a single call, there are two bindings allocated by each side from their respective servers, and all four are used. With ICE, that drops to two bindings allocated from a single server. Of course, all four bindings are allocated initially. However, once one of the clients begins receiving media on its STUN derived address, it can deallocate its TURN resources.

6.2 STUN on a STUN Derived Transport Address

Consider a client A that is behind a NAT. It connects to a STUN server on the public side of the NAT. To do that, A binds to a local transport address, say 10.0.1.1:8866, and then sends a STUN request to the STUN server. The NAT translates the net-10 address to 192.0.2.88:5063. Assume that the STUN server is running on 192.0.2.1 and listening for STUN traffic on port 3478, the default STUN port. The STUN server sees a source IP address of 192.0.2.88:5063, and returns that to the client in the STUN response. The NAT forwards the response to the client.
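
The address exchange in the preceding paragraph can be traced with a small Python model. This is only an illustrative sketch: the ToyNat class and the stun_mapped_address function are hypothetical names invented here, the NAT is assumed to hold a single fixed binding, and no real STUN message encoding is involved.

```python
from typing import Dict, Tuple

Addr = Tuple[str, int]  # (IP address, port)


class ToyNat:
    """Hypothetical NAT model: one fixed public binding per private address."""

    def __init__(self, bindings: Dict[Addr, Addr]) -> None:
        self.outbound = dict(bindings)
        self.inbound = {pub: priv for priv, pub in bindings.items()}

    def translate_out(self, src: Addr) -> Addr:
        """Rewrite the source address of a packet leaving the private side."""
        return self.outbound[src]

    def translate_in(self, dst: Addr) -> Addr:
        """Rewrite the destination address of a packet entering from outside."""
        return self.inbound[dst]


def stun_mapped_address(observed_src: Addr) -> Addr:
    """A STUN server reflects the source address it observed on the request."""
    return observed_src


# Addresses taken from the text; the binding itself is assumed to exist.
nat = ToyNat({("10.0.1.1", 8866): ("192.0.2.88", 5063)})
local = ("10.0.1.1", 8866)

# Request: A -> NAT -> STUN server, which reflects what it saw.
derived = stun_mapped_address(nat.translate_out(local))

# Response: STUN server -> NAT -> A, delivered back to the local address.
response_delivered_to = nat.translate_in(derived)
```

Under these assumptions, derived holds the STUN derived transport address 192.0.2.88:5063 that A would advertise, and the response is delivered back to 10.0.1.1:8866.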
Now, the client runs a STUN server on 10.0.1.1:8866, and advertises that its transport address is 192.0.2.88:5063. Another client, B, sends a STUN request to this address. It sends it from a local transport address, 192.0.2.77:1296. When it arrives at 192.0.2.88:5063 (on the NAT), the NAT rewrites the destination address to 10.0.1.1:8866, assuming that it is of the full-cone or restricted variety [1], and the permission for 192.0.2.77:1296 is open. This arrives at A's local STUN server. The server observes the source address of 192.0.2.77:1296, and generates a STUN response containing this value in the MAPPED-ADDRESS attribute. The STUN response is sent with a source address of 10.0.1.1:8866, and a destination of 192.0.2.77:1296. This arrives at B's STUN client.

Now, as far as A is concerned, the STUN request had a source transport address which was already known to A, presumably from an ICE exchange. As far as B is concerned, the check succeeded, and the address is viable.

7. XML Schema for ICE Messages

This section contains the XML schema used to define the initiate and accept messages. Any protocol that uses ICE needs to map the parameters defined here into its own messages.

Note that STUN allows both the username and password to contain the space character. However, usernames and passwords used with ICE cannot contain the space.

This is the root element, which holds a media-streams elements.

There are zero or more media stream elements. Each defines attributes for a specific media stream.

Each candidate is a possible point of media reception.
8. Example

In the example that follows, messages are labeled with "message name A,B" to mean a message from transport address A to B. For STUN Requests, this is followed by curly brackets enclosing the username and password. For STUN responses, this is followed by square brackets and the value of MAPPED-ADDRESS. The example shows a flow of two clients where one is behind a full cone NAT, and the other is on the public Internet.

A                NAT               STUN               B
|(1) STUN Req P1,STUN-PUBLIC        |                 |
|---------------->|                 |                 |
|                 |(2) STUN Req U, STUN-PUBLIC        |
|                 |---------------->|                 |
|                 |(3) STUN Res STUN-PUBLIC, U [U]    |
|                 |<----------------|                 |
|(4) STUN Res STUN-PUBLIC, P1 [U]   |                 |
|<----------------|                 |                 |
|(5) Initiate {P2,ufrag1A,pass1A,q=0.4}               |
|{U,ufrag2A,pass2A,q=0.4}           |                 |
|---------------------------------------------------->|
|                 |                 |(6) STUN Req P3,STUN-PUBLIC
|                 |                 |<----------------|
|                 |                 |(7) STUN Res STUN-PUBLIC,P3 [P3]
|                 |                 |---------------->|
|(8) Accept {P3,ufrag1B,pass1B,q=0.4}                 |
|<----------------------------------------------------|
|                 |(9) STUN Req P3,P2                 |
|                 |(ufrag1Aufrag1B,pass1A)            |
|                 |<----------------------------------|
|                 |Timeout          |                 |
|                 |(10) STUN Req P3,U                 |
|                 |(ufrag2Aufrag1B,pass2A)            |
|                 |<----------------------------------|
|(11) STUN Req P3,P1                |                 |
|(ufrag2Aufrag1B,pass2A)            |                 |
|<----------------|                 |                 |
|(12) STUN Res P1,P3 [P3]           |                 |
|---------------->|                 |                 |
|                 |(13) STUN Res U,P3 [P3]            |
|                 |---------------------------------->|
|(14) STUN Req P2,P3                |                 |
|(ufrag1Bufrag1A,pass1B)            |                 |
|---------------->|                 |                 |
|                 |(15) STUN Req W,P3                 |
|                 |(ufrag1Bufrag1A,pass1B)            |
|                 |---------------------------------->|
|                 |(16) STUN Res P3,W [W]             |
|                 |<----------------------------------|
|(17) STUN Res P3,P2 [W]            |                 |
|<----------------|                 |                 |
|(18) STUN Req P1,P3                |                 |
|(ufrag1Bufrag2A,pass1B)            |                 |
|---------------->|                 |                 |
|                 |(19) STUN Req U,P3                 |
|                 |(ufrag1Bufrag2A,pass1B)            |
|                 |---------------------------------->|
|                 |(20) STUN Res P3,U [U]             |
|                 |<----------------------------------|
|(21) STUN Res P3,P1 [U]            |                 |
|<----------------|                 |                 |

The initiator, client A, binds to a local transport address P1, which will be used as an associated local transport address. As such, it sends a STUN request to its STUN server (message 1). This passes through a NAT, and the NAT maps private address P1 to public address U (message 2). The STUN server mirrors this public address in the MAPPED-ADDRESS of the STUN response (message 3), and it is forwarded to the initiator (message 4). Now, client A has a STUN derived transport address of U. It also binds to a second local transport address, P2, which will be a usable local transport address. It starts STUN servers on both local transport addresses P1 and P2. It then generates an Initiate request to client B (message 5) which contains both of the gathered transport addresses P2 and U, along with username fragments and passwords.

Client B is not behind a NAT. It binds to a local transport address P3, and sends a STUN request to its STUN server (message 6). This is responded to by the STUN server (message 7). The client observes that this address is identical to its local transport address, and therefore that local transport address, which was targeted for an associated local transport address, is promoted to a usable local transport address.
It then sends an Accept message to client A, including this transport address and its username fragment and password (message 8).

Once the Accept message is sent, the client can perform its STUN connectivity checks. B has a single local transport address (P3), which it matches up with A's two remote transport addresses (P2 and U). B tries P2 (message 9). This request fails since P2 is a private address. In parallel, B tries U (message 10). Since A's NAT is full cone, this packet is accepted and is passed to client A (message 11). Client A generates a response (message 12) which is forwarded to client B (message 13). The source transport address in the STUN packet, P3, is already known to client A, and thus no new candidates are learned. Client B learns that client A is reachable at transport address U, but not P2. Thus, it can begin sending media to U from local transport address P3.

Once the Accept message arrives at client A, it can begin its connectivity checks. It has two local transport addresses P1 and P2, which it combines with client B's single transport address P3. It tries to send a STUN packet from P2 to P3 (message 14). Since the NAT has not seen source address P2 yet, it maps it to a new public transport address W, and the STUN request is forwarded to client B (message 15). Client B generates a STUN response (message 16), which is forwarded back to client A (message 17). Based on this, client A learns that it can reach P3 from P2. Client B learns a new remote transport address, W. However, the priority of this address is the same as P2, which is 0.4, and equal to the priority of address U, to which client B has already connected. Thus, it does not bother to perform the check (such a check would have succeeded if it had been done).
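
The USERNAME values carried in the STUN requests of this example are checked by the receiver as described in Section 5.3.5. The following Python sketch shows one way that check could look. It is not a definitive implementation: the Connection class and the match_username function are hypothetical names, the "suffix" check is assumed to mean that the remainder of the username after the local fragment equals the remote fragment, and the case of more than one matching row, which Section 5.3.5 does not address, simply raises an error.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Connection:
    """One row of a hypothetical connection table."""
    local_ufrag: str
    remote_ufrag: str
    local_addr: str
    remote_addr: str


def match_username(
    username: str, table: List[Connection]
) -> Tuple[Optional[int], Optional[Connection]]:
    """Return (error_code, row): (400, None), (430, None) or (None, row)."""
    # Step 1: prefix match against the "local username fragment" column.
    rows = [c for c in table if username.startswith(c.local_ufrag)]
    if not rows:
        return 400, None
    # Step 2 (assumption): the remainder must equal the remote fragment.
    rows = [c for c in rows if username[len(c.local_ufrag):] == c.remote_ufrag]
    if not rows:
        return 430, None
    if len(rows) > 1:
        # Not covered by Section 5.3.5; this sketch refuses to guess.
        raise ValueError("username matches more than one connection")
    return None, rows[0]


# Client A's table in the example: two local addresses, one remote (P3).
table_a = [
    Connection("ufrag1A", "ufrag1B", "P2", "P3"),
    Connection("ufrag2A", "ufrag1B", "P1", "P3"),
]

ok = match_username("ufrag2Aufrag1B", table_a)          # as in message 11
unknown = match_username("ufrag9Aufrag1B", table_a)     # no local fragment matches
bad_suffix = match_username("ufrag2Aufrag9B", table_a)  # remote fragment differs
```

With the table above, the username from message 11 selects the row for local address P1, while the two made-up usernames yield the 400 and 430 outcomes respectively.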
While the P2->P3 check is taking place, client A also sends a STUN request from P1 to P3 (message 18). This passes through the NAT, which maps the source transport address to the same public address it allocated previously, U. This STUN request arrives at client B (message 19). It generates a response (message 20), which is forwarded to client A (message 21). Based on this check, client A learns that P3 is also reachable from P1. Client B did not learn a new candidate transport address, since U was already known. Now, client A can send media to P3 from either P1 or P2.

9. Mapping ICE into SIP

In this section, we show how to map ICE into SIP. This mapping involves three parts. The first is the actual mapping of the ICE message into SIP and SDP messages, which requires extensions to SDP documented here. The second is the set of security considerations specific to SIP. The third is handling of updates in the offer/answer model.

9.1 Message Mapping

A new SDP attribute is defined to support ICE. It is called "candidate". The candidate attribute MUST be present within a media block of the SDP. It contains a candidate IP address and port (or pair of IP addresses and ports in the case of RTP) that the recipient of the SDP can use. There MAY be multiple candidate attributes in a media block. In that case, each of them MUST contain a different IP address and port (or a differing pair of IP address and ports in the case of RTP).
The syntax of this attribute is:

   candidate-attribute  = "candidate" ":" id SP qvalue SP
                          rtp-user-frag SP rtp-password SP
                          rtp-unicast-address SP rtp-port [SP rtcp-user-frag
                          SP rtcp-password [SP rtcp-unicast-address SP
                          rtcp-port]]
                          ;qvalue from RFC 3261
   rtp-port             = port
   rtcp-port            = port
   rtp-unicast-address  = unicast-address
   rtcp-unicast-address = unicast-address
                          ;unicast-address, port from RFC 2327
   rtp-user-frag        = non-ws-string
   rtp-password         = non-ws-string
   rtcp-user-frag       = non-ws-string
   rtcp-password        = non-ws-string
   id                   = token

With the addition of the candidate attribute, the mapping of the ICE messages to SIP/SDP is straightforward. The ICE initiate message corresponds to a SIP message with an SDP offer. The ICE accept message corresponds to a SIP message with an SDP answer.

Each media stream element in an ICE message maps to either one or two media blocks in the SDP. If the ICE message has only an IPv4 default address or an IPv6 default address, but not both, one media block is used. If both defaults are present, two media blocks are used. Each default address maps to the m and c lines in the SDP media block. In particular, the from the element maps into the SDP c line. The from the maps into the port in the SDP m line. If the ICE message indicates a default RTCP address whose IP address is not identical to the default RTP address, and whose port is not one higher than that of the RTP, the SDP RTCP attribute [2] MUST be used to convey the RTCP transport address.

Each element in an ICE message maps to a candidate attribute in the SDP. If the IP version of the is IPv4, it MUST be mapped into the media block containing the default IPv4 address. If the IP version of the is IPv6, it MUST be mapped into the media block containing the default IPv6 address.
Mapping of each individual candidate is simple. The element of the element maps to the rtp-user-frag component of the candidate attribute. The element of the element maps to the rtp-password component of the candidate attribute. The element maps to the first unicast-address and port components of the candidate attribute.

If the element is present, it means that RTCP is in use. The rtcp-user-frag and rtcp-password components of the candidate attribute MUST be present, and MUST be set to the and elements of the element, respectively. If the element is also present, its IP address and port information is copied into the rtcp-unicast-address and rtcp-port components of the candidate attribute.

The preference attribute from the element is mapped to the q-value component of the candidate attribute. The id attribute from the element is mapped into the id component of the candidate attribute.

If the mapping process produced both an IPv6 media block (that is, a media block with an IPv6 address in the c line, and with all IPv6 addresses in the candidate attributes within that block) and an IPv4 media block, these two blocks MUST be grouped using the ANAT grouping [7].

9.2 SIP and SDP Specific Security Considerations

The SDP messages described here contain usernames and passwords. If those passwords are transmitted in the clear, it introduces significant security vulnerabilities, discussed in detail below. In summary, those vulnerabilities would allow an eavesdropper that can inject packets to "steal" the media streams for a call unless secure media transport (such as SRTP) is used. Even if SRTP is used, an attacker could disrupt a call and prevent media from flowing. These attacks, fortunately, can be obviated by providing secure transport of the SDP.
SIP-based implementations of ICE SHOULD use the sips URI scheme when transporting SDP with ICE information, and MAY use S/MIME [3].

9.3 Updates in the Offer/Answer Model

ICE itself only considers an initial exchange of messages. However, the offer/answer model [4] allows for the session to be modified with subsequent exchanges. How is an updated offer with SDP candidate attributes to be treated?

If a user agent receives an updated offer with candidate attributes, it checks to see if it already knows about those candidates. This is done by comparing the transport address and username fragment with existing values. If the combination is already known, no additional action is taken. In particular, if STUN connectivity checks had already been made, no new ones are performed. However, if a candidate contains a new transport address or new username fragment, it is treated as a totally new candidate, and STUN connectivity checks are performed per Section 5.3.4. If a candidate formerly sent by the peer no longer appears, that candidate is considered BAD, and if it was in use previously, it ceases being used, and the next highest priority connection in the GOOD state is used.

The inclusion of the username fragment in the determination of whether a candidate is known provides a hook that allows a peer to request a new set of connectivity checks on an existing transport address. It can update the username fragment and generate an updated offer, without changing the transport address.

10. Security Considerations

STUN itself introduces many security considerations. In particular, there are attacks whereby an eavesdropper replays STUN packets with a modified source address. These modified packets can cause service disruptions and denial-of-service attacks, which are only partially mitigated by the heuristics described in STUN [1].
Interestingly, when STUN is used within ICE, these security weaknesses are mitigated completely, without the need for the heuristics defined in RFC 3489.

Consider an attacker that intercepts a STUN packet used for connectivity checks, and replays it using a faked source address. If successful, this would fool an endpoint into thinking that this faked source address was a valid destination for media (recall that the source transport address of received STUN packets is used as a potential candidate address). However, the recipient of the replayed packet will not just send media to that candidate. It will verify it with a STUN connectivity check. This check will be sent to that faked source address, and if there is no response, the address will not be used. The attacker cannot answer the STUN request without access to the username and password, which are exchanged as part of the signaling. Thus, if the signaling is protected as recommended above, the attacker cannot obtain the username or password.

If an attacker instead intercepts and replays STUN packets used for the purposes of unilateral allocation, a similar result occurs. The target of the attack will be fooled into thinking it has a STUN derived transport address that it does not. Its peer will perform a connectivity check to this address, which will fail. The attacker cannot force this check to succeed without access to the username and password, which are protected. Thus, this address will not be used.

In the worst case, an attacker can generate enough traffic so that none of the valid STUN checks or unilateral allocations succeed. This would result in a service disruption. However, this attack is no worse than any pure packet flood disruption attack launched against any other protocol. These attacks cannot be prevented by any protocol means.
If an attacker could intercept and modify the contents of the Initiate or Accept messages, they could disrupt the session, divert the media, and otherwise take control over the session. This attack is prevented by encryption, authentication and message integrity of the signaling channel used for ICE.

11. IANA Considerations

11.1 SDP Attribute Name

This specification defines one new SDP attribute per the procedures of Appendix B of RFC 2327. The required information for the registration is:

   Contact Name: Jonathan Rosenberg, jdrosen@jdrosen.net.

   Attribute Name: candidate

   Long Form: candidate

   Type of Attribute: media level

   Charset Considerations: The attribute is not subject to the charset attribute.

   Purpose: This attribute is used with Interactive Connectivity Establishment (ICE), and provides one of many possible candidate addresses for communication. These addresses are validated with an end-to-end connectivity check using Simple Traversal of UDP with NAT (STUN).

   Appropriate Values: See Section 9 of RFC XXXX [Note to RFC-ed: please replace XXXX with the RFC number of this specification].

11.2 URN Sub-Namespace Registration

This section registers a new XML namespace, per the guidelines in [6].

   URI: The URI for this namespace is urn:ietf:params:xml:ns:ice.

   Registrant Contact: IETF, MMUSIC working group, (mmusic@ietf.org), Jonathan Rosenberg (jdrosen@jdrosen.net).

   XML:

BEGIN
ICE Namespace

Namespace for ICE Documents


urn:ietf:params:xml:ns:ice


See RFCXXXX. [Note to RFC-ed: please replace XXXX with the RFC number of this specification.]

END

11.3 XML Schema Registration

This section registers an XML schema per the procedures in [6].

   URI: urn:ietf:params:xml:schema:ice

   Registrant Contact: IETF, MMUSIC working group, (mmusic@ietf.org), Jonathan Rosenberg (jdrosen@jdrosen.net).

   The XML for this schema can be found as the sole content of Section 7.

12. IAB Considerations

The IAB has studied the problem of "Unilateral Self Address Fixing", which is the general process by which a client attempts to determine its address in another realm on the other side of a NAT through a collaborative protocol reflection mechanism [14]. ICE is an example of a protocol that performs this type of function. Interestingly, the process for ICE is not unilateral, but bilateral, and the difference has a significant impact on the issues raised by IAB. The IAB has mandated that any protocols developed for this purpose document a specific set of considerations. This section meets those requirements.

12.1 Problem Definition

From RFC 3424, any UNSAF proposal must provide:

   Precise definition of a specific, limited-scope problem that is to be solved with the UNSAF proposal. A short term fix should not be generalized to solve other problems; this is why "short term fixes usually aren't".

The specific problems being solved by ICE are:

   Provide a means for two peers to determine the set of transport addresses which can be used for communication.

   Provide a means for resolving many of the limitations of other UNSAF mechanisms by wrapping them in an additional layer of processing (the ICE methodology).

   Provide a means for a client to determine an address that is reachable by another peer with which it wishes to communicate.

12.2 Exit Strategy

From RFC 3424, any UNSAF proposal must provide:

   Description of an exit strategy/transition plan.
   The better short term fixes are the ones that will naturally see less and less use as the appropriate technology is deployed.

ICE itself doesn't easily get phased out. However, it is useful even in a globally connected Internet, to serve as a means for detecting whether a router failure has temporarily disrupted connectivity, for example. However, what ICE does is help phase out other UNSAF mechanisms. ICE effectively selects amongst those mechanisms, prioritizing ones that are better, and deprioritizing ones that are worse. Local IPv6 addresses can be preferred. As NATs begin to dissipate as IPv6 is introduced, derived transport addresses from other UNSAF mechanisms simply never get used, because higher priority connectivity exists. Therefore, the servers get used less and less, and can eventually be removed when their usage goes to zero.

Indeed, ICE can assist in the transition from IPv4 to IPv6. It can be used to determine whether to use IPv6 or IPv4 when two dual-stack hosts communicate with SIP (IPv6 gets used). It can also allow a network with both 6to4 and native v6 connectivity to determine which address to use when communicating with a peer.

12.3 Brittleness Introduced by ICE

From RFC 3424, any UNSAF proposal must provide:

   Discussion of specific issues that may render systems more "brittle". For example, approaches that involve using data at multiple network layers create more dependencies, increase debugging challenges, and make it harder to transition.

ICE actually removes brittleness from existing UNSAF mechanisms. In particular, traditional STUN (the usage described in RFC 3489) has several points of brittleness. One of them is the discovery process which requires a client to try to classify the type of NAT it is behind. This process is error-prone. With ICE, that discovery process is simply not used.
Rather than unilaterally assessing the 1885 validity of the address, its validity is dynamically determined by 1886 measuring connectivity to a peer. The process of determining 1887 connectivity is very robust. The only potential problem is that 1888 bilaterally fixed addresses through STUN can expire if traffic does 1889 not keep them alive. However, that is substantially less brittleness 1890 than the STUN discovery mechanisms. 1892 Another point of brittleness in STUN, TURN, and any other unilateral 1893 mechanism is its absolute reliance on an additional server. ICE 1894 makes use of a server for allocating unilateral addresses, but allows 1895 clients to directly connect if possible. Therefore, in some cases, 1896 the failure of a STUN or TURN server would still allow for a call to 1897 progress when ICE is used. 1899 Another point of brittleness in traditional STUN is that it assumes 1900 that the STUN server is on the public Internet. Interestingly, with 1901 ICE, that is not necessary. There can be a multitude of STUN servers 1902 in a variety of address realms. ICE will discover the one that has 1903 provided a usable address. 1905 The most troubling point of brittleness in traditional STUN is that 1906 it doesn't work in all network topologies. In cases where there is a 1907 shared NAT between each client and the STUN server, traditional STUN 1908 may not work. With ICE, that restriction can be lifted. 1910 Traditional STUN also introduces some security considerations. 1911 Fortunately, those security considerations are also mitigated by ICE. 1913 12.4 Requirements for a Long Term Solution 1915 From RFC 3424, any UNSAF proposal must provide: 1917 Identify requirements for longer term, sound technical solutions 1918 -- contribute to the process of finding the right longer term 1919 solution. 1921 Our conclusions from STUN remain unchanged. However, we feel ICE 1922 actually helps because we believe it can be part of the long term 1923 solution. 
1925 12.5 Issues with Existing NAPT Boxes

1927    From RFC 3424, any UNSAF proposal must provide:

1929       Discussion of the impact of the noted practical issues with
1930       existing, deployed NA[P]Ts and experience reports.

1932    A number of NAT boxes are now being deployed into the market which
1933    try to provide "generic" ALG functionality. These generic ALGs hunt
1934    for IP addresses, either in text or binary form within a packet, and
1935    rewrite them if they match a binding. This will interfere with
1936    proper operation of any UNSAF mechanism, including ICE.

1938 13. Acknowledgements

1940    The authors would like to thank Douglas Otis, Francois Audet and
1941    Magnus Westerlund for their comments and input.

1943 14. References

1945 14.1 Normative References

1947    [1]  Rosenberg, J., Weinberger, J., Huitema, C. and R. Mahy, "STUN -
1948         Simple Traversal of User Datagram Protocol (UDP) Through Network
1949         Address Translators (NATs)", RFC 3489, March 2003.

1951    [2]  Huitema, C., "Real Time Control Protocol (RTCP) attribute in
1952         Session Description Protocol (SDP)", RFC 3605, October 2003.

1954    [3]  Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston, A.,
1955         Peterson, J., Sparks, R., Handley, M. and E. Schooler, "SIP:
1956         Session Initiation Protocol", RFC 3261, June 2002.

1958    [4]  Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model with
1959         Session Description Protocol (SDP)", RFC 3264, June 2002.

1961    [5]  Zopf, R., "Real-time Transport Protocol (RTP) Payload for
1962         Comfort Noise (CN)", RFC 3389, September 2002.

1964    [6]  Mealling, M., "The IETF XML Registry", BCP 81, RFC 3688, January
1965         2004.

1967    [7]  Camarillo, G., "The Alternative Network Address Types Semantics
1968         (ANAT) for the Session Description Protocol (SDP) Grouping
1969         Framework", draft-ietf-mmusic-anat-02 (work in progress),
1970         October 2004.

1972    [8]  Rosenberg, J., "Traversal Using Relay NAT (TURN)",
1973         draft-rosenberg-midcom-turn-06 (work in progress), October 2004.
1975 14.2 Informative References

1977    [9]   Schulzrinne, H., Rao, A. and R. Lanphier, "Real Time Streaming
1978          Protocol (RTSP)", RFC 2326, April 1998.

1980    [10]  Senie, D., "Network Address Translator (NAT)-Friendly
1981          Application Design Guidelines", RFC 3235, January 2002.

1983    [11]  Srisuresh, P., Kuthan, J., Rosenberg, J., Molitor, A. and A.
1984          Rayhan, "Middlebox communication architecture and framework",
1985          RFC 3303, August 2002.

1987    [12]  Borella, M., Lo, J., Grabelsky, D. and G. Montenegro, "Realm
1988          Specific IP: Framework", RFC 3102, October 2001.

1990    [13]  Borella, M., Grabelsky, D., Lo, J. and K. Taniguchi, "Realm
1991          Specific IP: Protocol Specification", RFC 3103, October 2001.

1993    [14]  Daigle, L. and IAB, "IAB Considerations for UNilateral
1994          Self-Address Fixing (UNSAF) Across Network Address
1995          Translation", RFC 3424, November 2002.

1997    [15]  Schulzrinne, H., Casner, S., Frederick, R. and V. Jacobson,
1998          "RTP: A Transport Protocol for Real-Time Applications", RFC
1999          3550, July 2003.

2001    [16]  Baugher, M., McGrew, D., Naslund, M., Carrara, E. and K.
2002          Norrman, "The Secure Real-time Transport Protocol (SRTP)", RFC
2003          3711, March 2004.

2005    [17]  Carpenter, B. and K. Moore, "Connection of IPv6 Domains via
2006          IPv4 Clouds", RFC 3056, February 2001.

2008    [18]  Huitema, C., "Teredo: Tunneling IPv6 over UDP through NATs",
2009          draft-huitema-v6ops-teredo-04 (work in progress), January 2005.

2011    [19]  Hellstrom, G., "RTP Payload for Text Conversation",
2012          draft-ietf-avt-rfc2793bis-09 (work in progress), August 2004.
2014 Author's Address

2016    Jonathan Rosenberg
2017    Cisco Systems
2018    600 Lanidex Plaza
2019    Parsippany, NJ 07054
2020    US

2022    Phone: +1 973 952-5000
2023    EMail: jdrosen@cisco.com
2024    URI:   http://www.jdrosen.net

2026 Intellectual Property Statement

2028    The IETF takes no position regarding the validity or scope of any
2029    Intellectual Property Rights or other rights that might be claimed to
2030    pertain to the implementation or use of the technology described in
2031    this document or the extent to which any license under such rights
2032    might or might not be available; nor does it represent that it has
2033    made any independent effort to identify any such rights. Information
2034    on the procedures with respect to rights in RFC documents can be
2035    found in BCP 78 and BCP 79.

2037    Copies of IPR disclosures made to the IETF Secretariat and any
2038    assurances of licenses to be made available, or the result of an
2039    attempt made to obtain a general license or permission for the use of
2040    such proprietary rights by implementers or users of this
2041    specification can be obtained from the IETF on-line IPR repository at
2042    http://www.ietf.org/ipr.

2044    The IETF invites any interested party to bring to its attention any
2045    copyrights, patents or patent applications, or other proprietary
2046    rights that may cover technology that may be required to implement
2047    this standard. Please address the information to the IETF at
2048    ietf-ipr@ietf.org.
2050 Disclaimer of Validity

2052    This document and the information contained herein are provided on an
2053    "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
2054    OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
2055    ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
2056    INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
2057    INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
2058    WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

2060 Copyright Statement

2062    Copyright (C) The Internet Society (2005). This document is subject
2063    to the rights, licenses and restrictions contained in BCP 78, and
2064    except as set forth therein, the authors retain all their rights.

2066 Acknowledgment

2068    Funding for the RFC Editor function is currently provided by the
2069    Internet Society.