idnits 2.17.1

draft-ietf-mmusic-ice-07.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** It looks like you're using RFC 3978 boilerplate.  You should update this
     to the boilerplate described in the IETF Trust License Policy document
     (see https://trustee.ietf.org/license-info), which is required now.

  -- Found old boilerplate from RFC 3978, Section 5.1 on line 15.
  -- Found old boilerplate from RFC 3978, Section 5.5 on line 4240.
  -- Found old boilerplate from RFC 3979, Section 5, paragraph 1 on line 4217.
  -- Found old boilerplate from RFC 3979, Section 5, paragraph 2 on line 4224.
  -- Found old boilerplate from RFC 3979, Section 5, paragraph 3 on line 4230.

  ** This document has an original RFC 3978 Section 5.4 Copyright Line,
     instead of the newer IETF Trust Copyright according to RFC 4748.

  ** This document has an original RFC 3978 Section 5.5 Disclaimer, instead of
     the newer disclaimer which includes the IETF Trust according to RFC 4748.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** There are 16 instances of too long lines in the document, the longest one
     being 11 characters in excess of 72.

  == There are 12 instances of lines with private range IPv4 addresses in the
     document.  If these are generic example addresses, they should be changed
     to use any of the ranges defined in RFC 6890 (or successor): 192.0.2.x,
     198.51.100.x or 203.0.113.x.

  ** The document seems to lack a both a reference to RFC 2119 and the
     recommended RFC 2119 boilerplate, even if it appears to use RFC 2119
     keywords.
     RFC 2119 keyword, line 483: '...ia, described in Section 7.13, MUST be...'
     RFC 2119 keyword, line 514: '... SHOULD have the same number of comp...'
     RFC 2119 keyword, line 516: '...s in a candidate MUST be of the same t...'
     RFC 2119 keyword, line 519: '..., each component MUST be obtained from...'
     RFC 2119 keyword, line 521: '...a streams, it is RECOMMENDED that ther...'
     (145 more instances...)

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the RFC 3978 Section 5.4 Copyright Line does not
     match the current year

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer.
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (March 6, 2006) is 6616 days in the past.  Is this
     intentional?

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Unused Reference: '12' is defined on line 4102, but no explicit
     reference was found in the text

  == Unused Reference: '15' is defined on line 4116, but no explicit
     reference was found in the text

  == Unused Reference: '23' is defined on line 4143, but no explicit
     reference was found in the text

  == Unused Reference: '28' is defined on line 4160, but no explicit
     reference was found in the text

  == Unused Reference: '35' is defined on line 4185, but no explicit
     reference was found in the text

  ** Obsolete normative reference: RFC 3548 (ref. '3') (Obsoleted by RFC 4648)

  ** Obsolete normative reference: RFC 2327 (ref. '5') (Obsoleted by RFC 4566)

  ** Obsolete normative reference: RFC 4234 (ref. '9') (Obsoleted by RFC 5234)

  ** Obsolete normative reference: RFC 3266 (ref. '10') (Obsoleted by RFC
     4566)

  == Outdated reference: A later version (-18) exists of
     draft-ietf-behave-rfc3489bis-02

  == Outdated reference: A later version (-16) exists of
     draft-ietf-behave-turn-00

  -- Obsolete informational reference (is this intentional?): RFC 2326 (ref.
     '15') (Obsoleted by RFC 7826)

  -- Obsolete informational reference (is this intentional?): RFC 3489 (ref.
     '16') (Obsoleted by RFC 5389)

  -- Obsolete informational reference (is this intentional?): RFC 2733 (ref.
     '18') (Obsoleted by RFC 5109)

  == Outdated reference: A later version (-07) exists of
     draft-ietf-mmusic-connectivity-precon-01

  == Outdated reference: A later version (-04) exists of
     draft-ietf-avt-rtp-no-op-00

  == Outdated reference: A later version (-05) exists of draft-iab-dos-03

  == Outdated reference: A later version (-08) exists of
     draft-ietf-behave-nat-udp-00

     Summary: 9 errors (**), 0 flaws (~~), 14 warnings (==), 10 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

2   MMUSIC                                                      J. Rosenberg
3   Internet-Draft                                             Cisco Systems
4   Expires: September 7, 2006                                 March 6, 2006

6    Interactive Connectivity Establishment (ICE): A Methodology for Network
7       Address Translator (NAT) Traversal for Offer/Answer Protocols
8                          draft-ietf-mmusic-ice-07

10  Status of this Memo

12  By submitting this Internet-Draft, each author represents that any
13  applicable patent or other IPR claims of which he or she is aware
14  have been or will be disclosed, and any of which he or she becomes
15  aware will be disclosed, in accordance with Section 6 of BCP 79.

17  Internet-Drafts are working documents of the Internet Engineering
18  Task Force (IETF), its areas, and its working groups.  Note that
19  other groups may also distribute working documents as Internet-
20  Drafts.

22  Internet-Drafts are draft documents valid for a maximum of six months
23  and may be updated, replaced, or obsoleted by other documents at any
24  time.  It is inappropriate to use Internet-Drafts as reference
25  material or to cite them other than as "work in progress."

27  The list of current Internet-Drafts can be accessed at
28  http://www.ietf.org/ietf/1id-abstracts.txt.

30  The list of Internet-Draft Shadow Directories can be accessed at
31  http://www.ietf.org/shadow.html.

33  This Internet-Draft will expire on September 7, 2006.

35  Copyright Notice

37  Copyright (C) The Internet Society (2006).

39  Abstract

41  This document describes a protocol for Network Address Translator
42  (NAT) traversal for multimedia session signaling protocols based on
43  the offer/answer model, such as the Session Initiation Protocol
44  (SIP).  This protocol is called Interactive Connectivity
45  Establishment (ICE).  ICE makes use of the Simple Traversal of UDP
46  through NAT (STUN), applying its binding discovery, connectivity
47  check and relay usages.

49  Table of Contents

51  1.  Introduction
52  2.  Terminology
53  3.  Overview of ICE
54  4.  Sending the Initial Offer
55  5.  Receipt of the Offer and Generation of the Answer
56  6.  Processing the Answer
57  7.  Common Procedures
58    7.1  Gathering Candidates
59    7.2  Prioritizing the Candidates and Choosing an Active One
60    7.3  Encoding Candidates into SDP
61    7.4  Forming Candidate Pairs
62    7.5  Ordering the Candidate Pairs
63    7.6  Performing the Connectivity Checks
64    7.7  Sending a Binding Request for Connectivity Checks
65    7.8  Receiving a Binding Request for Connectivity Checks
66    7.9  Promoting a Candidate to Active
67    7.10  Learning New Candidates from Connectivity Checks
68      7.10.1  On Receipt of a Binding Request
69      7.10.2  On Receipt of a Binding Response
70    7.11  Subsequent Offer/Answer Exchanges
71      7.11.1  Sending of a Subsequent Offer
72      7.11.2  Receiving the Offer and Sending an Answer
73      7.11.3  Receiving the Answer
74    7.12  Binding Keepalives
75    7.13  Sending Media
76  8.  Guidelines for Usage with SIP
77  9.  Interactions with Forking
78  10.  Interactions with Preconditions
79  11.  Examples
80    11.1  Basic Example
81    11.2  Advanced Example
82  12.  Grammar
83  13.  Security Considerations
84    13.1  Attacks on Connectivity Checks
85    13.2  Attacks on Address Gathering
86    13.3  Attacks on the Offer/Answer Exchanges
87    13.4  Insider Attacks
88      13.4.1  The Voice Hammer Attack
89      13.4.2  STUN Amplification Attack
90  14.  IANA Considerations
91    14.1  candidate Attribute
92    14.2  remote-candidate Attribute
93    14.3  ice-pwd Attribute
94  15.  IAB Considerations
95    15.1  Problem Definition
96    15.2  Exit Strategy
97    15.3  Brittleness Introduced by ICE
98    15.4  Requirements for a Long Term Solution
99    15.5  Issues with Existing NAPT Boxes
100  16.  Acknowledgements
101  17.  References
102    17.1  Normative References
103    17.2  Informative References
104  Author's Address
105  Intellectual Property and Copyright Statements

107  1.  Introduction

109  RFC 3264 [4] defines a two-phase exchange of Session Description
110  Protocol (SDP) messages [5] for the purposes of establishment of
111  multimedia sessions.  This offer/answer mechanism is used by
112  protocols such as the Session Initiation Protocol (SIP) [2].

114  Protocols using offer/answer are difficult to operate through Network
115  Address Translators (NAT).  Because their purpose is to establish a
116  flow of media packets, they tend to carry IP addresses within their
117  messages, which is known to be problematic through NAT [17].  The
118  protocols also seek to create a media flow directly between
119  participants, so that there is no application layer intermediary
120  between them.  This is done to reduce media latency, decrease packet
121  loss, and reduce the operational costs of deploying the application.
122  However, this is difficult to accomplish through NAT.  A full
123  treatment of the reasons for this is beyond the scope of this
124  specification.

126  Numerous solutions have been proposed for allowing these protocols to
127  operate through NAT.  These include Application Layer Gateways
128  (ALGs), the Middlebox Control Protocol [19], Simple Traversal of UDP
129  through NAT (STUN) [16] and its revision [13], the STUN Relay Usage
130  [14], and Realm Specific IP [20] [21] along with session description
131  extensions needed to make them work, such as the Session Description
132  Protocol (SDP) [5] attribute for the Real Time Control Protocol
133  (RTCP) [1].  Unfortunately, these techniques all have pros and cons
134  which make each one optimal in some network topologies, but a poor
135  choice in others.  The result is that administrators and implementors
136  are making assumptions about the topologies of the networks in which
137  their solutions will be deployed.  This introduces complexity and
138  brittleness into the system.  What is needed is a single solution
139  which is flexible enough to work well in all situations.

141  This specification provides that solution for media streams
142  established by signaling protocols based on the offer-answer model.
143  It is called Interactive Connectivity Establishment, or ICE.  ICE
144  makes use of STUN and its relay extension, commonly called TURN, but
145  uses them in a specific methodology which avoids many of the pitfalls
146  of using any one alone.

148  2.  Terminology

150  Several new terms are introduced in this specification:

152  Agent: As defined in RFC 3264, an agent is the protocol
153     implementation involved in the offer/answer exchange.  There are
154     two agents involved in an offer/answer exchange.

156  Peer: From the perspective of one of the agents in a session, its
157     peer is the other agent.  Specifically, from the perspective of
158     the offerer, the peer is the answerer.  From the perspective of
159     the answerer, the peer is the offerer.

161  Transport Address: The combination of an IP address and port.

163  Local Transport Address: A local transport address is a transport
164     address that has been allocated from the operating system on the
165     host.  This includes transport addresses obtained through Virtual
166     Private Networks (VPNs) and transport addresses obtained through
167     Realm Specific IP (RSIP) [20] (which lives at the operating system
168     level).  Transport addresses are typically obtained by binding to
169     an interface.

171  m/c line: The media and connection lines in the SDP, which together
172     hold the transport address used for the receipt of media.

174  Derived Transport Address: A derived transport address is a transport
175     address which is derived from a local transport address.  The
176     derived transport address is related to the associated local
177     transport address in that packets sent to the derived transport
178     address are received on the socket bound to its associated local
179     transport address.  Derived addresses are obtained using protocols
180     like STUN, and more generally, any UNSAF protocol [22].

182  Reflexive Transport Address: As defined in [13], a transport address
183     learned by a client which identifies that client as seen by
184     another host on an IP network, typically a STUN server.  When
185     there is an intervening NAT between the client and the other host,
186     the reflexive transport address represents the binding allocated
187     to the client on the public side of the NAT.  Reflexive transport
188     addresses are learned from the MAPPED-ADDRESS attribute in STUN
189     Binding Responses and Allocate Responses [14], and are a type of
190     derived transport address.

192  Server Reflexive Transport Address: A server reflexive transport
193     address is a reflexive address that is reflected off of a server,
194     distinct from the peer, whose address is configured or learned by
195     the client prior to an offer/answer exchange.

197  Peer Reflexive Transport Address: A peer reflexive transport address
198     is a reflexive address that is reflected off of the peer.  Peer
199     reflexive transport addresses are learned by connectivity checks.

201  Relayed Transport Address: A transport address that terminates on a
202     server, and is forwarded towards the client.  The STUN Allocate
203     Request can be used to obtain a relayed transport address, for
204     example.

206  Associated Local Transport Address: When a peer sends a packet to a
207     transport address, the associated local transport address is the
208     local transport address at which those packets will actually
209     arrive.  For a local transport address, its associated local
210     transport address is the same as the local transport address
211     itself.  For reflexive and relayed transport addresses, however,
212     they are not the same.  The associated local transport address is
213     the one from which the reflexive or relayed transport was derived.

215  Candidate: A sequence of transport addresses that form an atomic set
216     for usage with a particular media session.  Here, atomic means
217     that all of the transport addresses in the candidate need to work
218     before the candidate will be used for actual media transport.  In
219     the case of RTP, there can be one or more transport addresses per
220     candidate.  In the most common case, there are two - one for RTP,
221     and another for RTCP.  If the agent doesn't use RTCP, there would
222     be just one.  If Generic Forward Error Correction (FEC) [18] is in
223     use, there may be more than two.  The transport addresses that
224     compose a candidate are all of the same type - local, server
225     reflexive, peer reflexive or relayed.

227  Local Candidate: A candidate whose transport addresses are local
228     transport addresses.

230  Server Reflexive Candidate: A candidate whose transport addresses are
231     server reflexive transport addresses.

233  Peer Reflexive Candidate: A candidate whose transport addresses are
234     peer reflexive transport addresses.

236  Relayed Candidate: A candidate whose transport addresses are relayed
237     transport addresses.

239  Generating Candidate: The candidate from which a peer reflexive
240     candidate is derived.

242  Active Candidate: The candidate that is in use for exchange of media.
243     This is the one that an agent places in the m/c line of an offer
244     or answer.

246  Candidate ID: An identifier for a candidate.

248  Component: When a media stream, and as a consequence, its candidate,
249     require several IP addresses and ports to work atomically, each of
250     the constituent IP addresses and ports represents a component of
251     that media stream.  For example, RTP-based media streams typically
252     have two components - one for RTP, and one for RTCP.

254  Component ID: An integer, starting with one within each candidate and
255     incrementing by one for each component, which identifies the
256     component.

258  Transport Address ID (tid): An identifier for a transport address,
259     formed by concatenating the candidate ID with the component ID,
260     separated by a "colon".

262  Candidate Pair: The combination of a candidate from one agent along
263     with a candidate from its peer.

265  Native Candidate: From the perspective of each agent, the candidate
266     in a candidate pair which represents a set of addresses obtained
267     by that agent.

269  Remote Candidate: From the perspective of each agent, the candidate
270     in a candidate pair which represents the set of addresses obtained
271     by that agent's peer.

273  Transport Address Pair: The combination of the transport address for
274     one component of a candidate with the transport address of the
275     same component for the matching candidate in a candidate pair.

277  Transport Address Pair ID: An identifier for a transport address
278     pair.  Formed by concatenating the native transport address ID
279     with the remote transport address ID, separated by a "colon".

281  Matching Transport Address Pair: When a STUN Binding Request is
282     received on a local transport address, the matching transport
283     address pair is the transport address pair whose connectivity is
284     being checked by that Binding Request.

286  Candidate Pair Priority Ordering: An ordering of candidate pairs
287     based on a combination of the qvalues of each candidate and the
288     candidate IDs of each candidate.

290  Candidate Pair Check Ordering: An ordering of candidate pairs that is
291     similar to the candidate pair priority ordering, except that the
292     active candidate appears at the top of the list, regardless of its
293     priority.

295  Transport Address Pair Check Ordering: An ordering of transport
296     address pairs that determines the sequence of connectivity checks
297     performed for the pairs.

299  Transport Address Pair Count: The number of transport address pairs
300     in a candidate pair.  This is equal to the minimum of the number
301     of transport addresses in the native candidate and the number of
302     transport addresses in the remote candidate.

304  3.  Overview of ICE

306  ICE makes the fundamental assumption that clients exist in a network
307  of segmented connectivity.  This segmentation is the result of a
308  number of addressing realms in which a client can simultaneously be
309  connected.  We use "realms" here in the broadest sense.  A realm is
310  defined purely by connectivity.  Two clients are in the same realm
311  if, when they exchange the addresses each has in that realm, they are
312  able to send packets to each other.  This includes IPv6 and IPv4
313  realms, which actually use different address spaces, in addition to
314  private networks connected to the public Internet through NAT.
316 The key assumption in ICE is that a client cannot know, apriori, 317 which address realms it shares with any peer it may wish to 318 communicate with. Therefore, in order to communicate, it has to try 319 connecting to addresses in all of the realms. 321 Agent A STUN Servers Agent B 322 |(1) Gather Addresses | | 323 |-------------------->| | 324 |(2) Offer | | 325 |------------------------------------------>| 326 | |(3) Gather Addresses | 327 | |<--------------------| 328 |(4) Answer | | 329 |<------------------------------------------| 330 |(5) STUN Check | | 331 |<------------------------------------------| 332 |(6) STUN Check | | 333 |------------------------------------------>| 334 |(7) Media | | 335 |<------------------------------------------| 336 |(8) Media | | 337 |------------------------------------------>| 338 |(9) Offer | | 339 |------------------------------------------>| 340 |(10) Answer | | 341 |<------------------------------------------| 343 Figure 1 345 The basic flow of operation for ICE is shown in Figure 1. Before the 346 offerer establishes a session, it obtains local transport addresses 347 from its operating system on as many interfaces as it has access to. 348 These interfaces can include IPv4 and IPv6 interfaces, in addition to 349 Virtual Private Network (VPN) interfaces or ones associated with 350 RSIP. It then obtains transport addresses for the media from each 351 interface. Though ICE can support any type of transport protocol, 352 this specification only defines mechanisms for UDP. In addition, the 353 agent obtains server reflexive and relayed transport addresses. 354 These are usually obtained through a single STUN Allocate request, 355 which provides both. These requests are paced at a fixed rate in 356 order to limit network load and avoid NAT overload. 
The local, 357 server reflexive and relayed transport addresses are formed into 358 candidates, each of which represents a possible set of transport 359 addresses that might be viable for a media stream. 361 Each candidate is listed in a set of a=candidate attributes in the 362 offer. Each candidate is given a priority. Priority is a matter of 363 local policy, but typically, lowest priority would be given to 364 relayed transport addresses. Each candidate is also assigned a 365 distinct ID, called a candidate ID. 367 The agent will choose one of its candidates as its active candidate 368 for inclusion in the connection and media lines in the offer. Media 369 can be sent to this candidate immediately following its validation. 370 Media can also be sent to a candidate that is not active but has been 371 validated. Media is not sent without validation in order to avoid 372 denial-of-service attacks. In particular, without ICE, an offerer 373 can send an offer to another agent, and list the IP address and port 374 of a target in the offer. If the agent is an automata that answers a 375 call automatically, it will do so and then proceed to send media to 376 the target. This provides substantial packet amplifications. ICE 377 fixes this by requiring that an agent never send media packets unless 378 it has sent a STUN message towards the target of the RTP packets, and 379 received a reply from that target Section 7.13. 381 The offer is then sent to the answerer. This specification does not 382 address the issue of how the signaling messages themselves traverse 383 NAT. It is assumed that signaling protocol specific mechanisms are 384 used for that purpose. The answerer follows a similar process as the 385 offerer followed; it obtains addresses from local interfaces, obtains 386 derived transport addresses from those, and then groups them into 387 candidates for inclusion in a=candidate attributes in the answer. 
It 388 picks one candidate as its active candidate and places it into the 389 m/c line in the answer. 391 Once the offer/answer exchange has completed, both agents pair up the 392 candidates, and then determine an ordered set of transport address 393 pairs. This ordering is based primarily on the priority of the 394 candidates, with the exception of the active candidate, whose 395 addresses are at the top of the list. Both agents start at the top 396 of this list, beginning a connectivity check for that transport 397 address pair. At a fixed interval, checks for the next transport 398 address on the list begin. This results in a pacing of the 399 connectivity checks. These connectivity checks are performed through 400 peer-to-peer STUN requests, sent from one agent to the other. In 401 addition to pacing the checks out at regular intervals, the offerer 402 will generate a connectivity check for a transport address pair when 403 it receives one from its peer. As soon as the active candidate has 404 been verified by the STUN checks, media can begin to flow. Once a 405 higher priority candidate has been verified by the offerer, it ceases 406 additional connectivity checks, begins using that candidate for 407 media, and sends an updated offer which promotes this higher priority 408 candidate to the m/c-line. That candidate is also listed in 409 a=candidate attributes, resulting in periodic STUN keepalives through 410 the duration of the media session. 412 If an agent receives a STUN connectivity check with a new source IP 413 address and port, or a response to such a check with a new IP address 414 and port indicated in the MAPPED-ADDRESS attribute, this new address 415 might be a viable candidate for the receipt of media. This happens 416 when there is a NAT with an address dependent or address and port 417 dependent mapping property [37] between the agents. In such a case, 418 the agents algorithmically construct a new candidate. 
Like other 419 candidates, connectivity checks begin for it, and if they succeed, 420 its transport addresses can be used for receipt of media by promoting 421 it to the m/c-line. 423 The gathering of addresses and connectivity checks take time. As a 424 consequence, in order to have minimal impact on the call setup time 425 or post-pickup delay for SIP, these offer/answer exchanges and checks 426 happen while the call is ringing. 428 4. Sending the Initial Offer 430 When an agent wishes to begin a session by sending an initial offer, 431 it starts by gathering transport addresses, as described in 432 Section 7.1. This will produce a set of candidates, including local 433 ones, server reflexive ones, and relayed ones. 435 This process of gathering candidates can actually happen at any time 436 before sending the initial offer. A agent can pre-gather transport 437 addresses, using a user interface cue (such as picking up the phone, 438 or entry into an address book) as a hint that communications is 439 imminent. Doing so eliminates any additional perceivable call setup 440 delays due to address gathering. 442 When it comes time to offer communications, the agent determines a 443 priority for each candidate and identifies the active candidate that 444 will be used for receipt of media, as described in Section 7.2. 446 The next step is to construct the offer message. For each media 447 stream, it places its candidates into a=candidate attributes in the 448 offer and puts its active candidate into the m/c line. The process 449 for doing this is described in Section 7.3. The offer is then sent. 451 5. Receipt of the Offer and Generation of the Answer 453 Upon receipt of the offer message, the agent checks if the offer 454 contains any a=candidate attributes. If the offer does, the offerer 455 supports ICE. In that case, it starts gathering candidates, as 456 described in Section 7.1, and prioritizes them as described in 457 Section 7.2. 
This processing is done immediately on receipt of the 458 offer, to prepare for the case where the user should accept the call, 459 or early media needs to be generated. By gathering candidates (and 460 performing connectivity checks) while the user is being alerted to 461 the request for communications, session establishment delays are 462 reduced. 464 The agent then constructs its answer, encoding its candidates into 465 a=candidate attributes and including the active one in the m/c-line, 466 as described in Section 7.3. The agent then forms candidate pairs as 467 described in Section 7.4. These are ordered as described in 468 Section 7.5. The agent then begins connectivity checks, as described 469 in Section 7.6. It follows the logic in Section 7.10 on receipt of 470 Binding Requests and responses to learn new candidates from the 471 checks themselves. 473 Transmission of media is performed according to the procedures in 474 Section 7.13. 476 6. Processing the Answer 478 There are two possible cases for processing of the answer. If the 479 answerer did not support ICE, the answer will not contain any 480 a=candidate attributes. As a result, the offerer knows that it 481 cannot perform its connectivity checks. In this case, it proceeds 482 with normal media processing as if ICE was not in use. The 483 procedures for sending media, described in Section 7.13, MUST be 484 followed however. 486 If the answer contains candidates, it implies that the answerer 487 supports ICE. The offerer then forms candidate pairs as described in 488 Section 7.4. These are ordered as described in Section 7.5. The 489 agent then begins connectivity checks, as described in Section 7.6. 490 It follows the logic in Section 7.10 on receipt of Binding Requests 491 and responses to learn new candidates from the checks themselves. 493 Transmission of media is performed according to the procedures in 494 Section 7.13. 496 7. 
Common Procedures 498 This section discusses procedures that are common between offerer and 499 answerer. 501 7.1 Gathering Candidates 503 An agent gathers candidates when it believes that communications is 504 imminent. For offerers, this occurs before sending an offer 505 (Section 4). For answerers, it occurs before sending an answer 506 (Section 5). 508 Each candidate has one or more components, each of which is 509 associated with a sequence number, starting at 1 for the first 510 component of each candidate, and incrementing by 1 for each 511 additional component within that candidate. These components 512 represent a set of transport addresses for which connectivity must be 513 validated. For a particular media stream, all of the candidates 514 SHOULD have the same number of components. The number of components 515 that are needed are a function of the type of media stream. All of 516 the components in a candidate MUST be of the same type - server 517 reflexive, relayed, or local, and obtained from the same server in 518 the case of server reflexive or relayed candidates. For local 519 candidates, each component MUST be obtained from the same interface. 521 For traditional RTP-based media streams, it is RECOMMENDED that there 522 be two components per candidate - one for RTP and one for RTCP. The 523 component with the component ID of 1 MUST be RTP, and the one with 524 component ID of 2 MUST be RTCP. If an agent doesn't implement RTCP, 525 it SHOULD have a single component for the RTP stream (which will have 526 a component ID of 1 by definition). Each component of a candidate 527 has a single transport address. 529 The first step is to gather local candidates. Local candidates are 530 obtained by binding to ephemeral ports on an interface (physical or 531 virtual, including VPN interfaces) on the host. The process for 532 gathering local candidates depends on the transport protocol. 533 Procedures are specified here for UDP. 
Extensions to ICE that define 534 procedures for other transport protocols MUST specify how local 535 transport addresses are gathered. 537 For each UDP media stream the agent wishes to use, the agent SHOULD 538 obtain a set of candidates (one for each interface) by binding to N 539 ephemeral UDP ports on each interface, where N is the number of 540 components needed for the candidate. For RTP, N is typically two. 541 If a host has K local interfaces, this will result in K candidates 542 for each UDP stream, requiring K*N local transport addresses. 544 Once the agent has obtained local candidates, it obtains candidates 545 with derived transport addresses. The process for gathering derived 546 candidates depends on the transport protocol. Procedures are 547 specified here for UDP. Extensions to ICE that define procedures for 548 other transport protocols MUST specify how derived transport 549 addresses are gathered. 551 Agents which serve end users directly, such as softphones, 552 hardphones, terminal adapters and so on, MUST implement the STUN 553 Binding Discovery usage and SHOULD use it to obtain server reflexive 554 candidates. These devices SHOULD implement the STUN Relay usage, and 555 SHOULD use its Allocate request to obtain both server reflexive and 556 relayed candidates. They MAY implement and MAY use other protocols 557 that provide server reflexive or relayed transport addresses, such as 558 TEREDO [33]. 560 The requirement to use the Relay usage is at SHOULD strength to allow 561 for provider variation. If it is not to be used, it is RECOMMENDED 562 that it be implemented and just disabled through configuration, so 563 that it can be re-enabled through configuration if conditions change in 564 the future.
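The local gathering step for UDP described above (N ephemeral ports on each of K interfaces, giving K candidates and K*N local transport addresses) can be sketched as follows. This is only an illustration under stated assumptions: the names LocalCandidate and gather_local_candidates are hypothetical and not part of this specification, interface discovery is left to the caller, and only IPv4 is handled.

```python
import socket
from dataclasses import dataclass, field


@dataclass
class LocalCandidate:
    """Hypothetical container: one local candidate per interface."""
    interface_ip: str
    sockets: list = field(default_factory=list)  # one bound UDP socket per component

    def transport_addresses(self):
        # component-ids start at 1 and increment by 1 within the candidate
        return {i + 1: s.getsockname() for i, s in enumerate(self.sockets)}

    def close(self):
        for s in self.sockets:
            s.close()


def gather_local_candidates(interface_ips, n_components=2):
    """Bind n_components ephemeral UDP ports on each interface address.

    K interfaces yield K candidates and K * n_components transport addresses.
    """
    candidates = []
    for ip in interface_ips:
        cand = LocalCandidate(ip)
        for _ in range(n_components):
            sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
            sock.bind((ip, 0))  # port 0 asks the OS for an ephemeral port
            cand.sockets.append(sock)
        candidates.append(cand)
    return candidates
```

For an RTP stream, calling gather_local_candidates(ips, 2) would give component 1 (RTP) and component 2 (RTCP) for each interface address in ips.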
566 Agents which represent network servers under the control of a service 567 provider, such as gateways to the telephone network, media servers, 568 or conferencing servers that are targeted at deployment only in 569 networks with public IP addresses MAY use the STUN Binding Discovery 570 usage and relay usage, or other similar protocols to obtain 571 candidates. 573 Why would these types of endpoints even bother to implement ICE? 574 The answer is that such an implementation greatly facilitates NAT 575 traversal for clients that connect to it. The ability to process 576 STUN connectivity checks allows for clients to obtain peer 577 reflexive transport addresses that can be used by the network 578 server to reach them without a relay, even through NATs with 579 restrictive mapping and filtering policies. Furthermore, 580 implementation of the STUN connectivity checks allows for NAT 581 bindings along the way to be kept open. ICE also provides 582 numerous security properties that are independent of NAT 583 traversal, and would benefit any multimedia endpoint. See 584 Section 13 for a discussion on these benefits. 586 Obtaining derived candidates requires transmission of packets which 587 have the effect of creating bindings on NAT devices between the 588 client and the STUN servers. Experience has shown that many NAT 589 devices have upper limits on the rate at which they will create new 590 bindings. Furthermore, transmission of these packets on the network 591 makes use of bandwidth and needs to be rate limited by the agent. As 592 a consequence, a client SHOULD pace its STUN transactions, such that 593 the start of each new transaction occurs at least Ta seconds after 594 the start of the previous transaction. The value of Ta SHOULD be 595 configurable, and SHOULD have a default of 50ms. 
Note that this 596 pacing applies only to the start of a new transaction; pacing of 597 retransmissions within a STUN transaction is governed by the 598 retransmission rules defined by STUN. 600 Derived candidates can be obtained from the STUN Binding Discovery 601 usage or the STUN Relay usage. The latter is preferred since it will 602 provide the client with both a server reflexive and a relayed 603 transport address with a single transaction. It is possible that 604 some STUN servers will only support the Relay usage or only the 605 Binding Discovery usage, in which case a client might be configured 606 with different servers depending on the usage. 608 To obtain both server reflexive and relayed candidates using the STUN 609 Relay Usage, the client takes a local UDP candidate, and for each 610 configured STUN server, produces both candidates. It is anticipated 611 that clients may have a multiplicity of STUN servers configured or 612 discovered in network environments where there are multiple layers of 613 NAT, and that layering is known to the provider of the client. To 614 obtain these candidates, for each configured STUN server, the client 615 initiates an Allocate Request transaction using the procedures of 616 Section 8.1.2 of [14] from each transport address of a particular 617 local candidate. The Allocate Response will provide the client with 618 its server reflexive transport address in the MAPPED-ADDRESS 619 attribute and its relayed transport address in the RELAY-ADDRESS 620 attribute. Once the Allocate requests have given a client a relayed 621 transport address for all transport addresses in a relayed candidate, 622 there is no reason for a client to obtain further relayed candidates 623 through the same STUN server. Thus, if there are other local 624 candidates from which the client has not yet obtained relayed 625 transport address, the client SHOULD NOT bother to obtain them. 
626 Instead, it SHOULD use the STUN Binding Discovery usage and obtain 627 just server reflexive addresses from that STUN server. The order in 628 which local candidates are tried against the STUN server to obtain 629 relayed candidates is a matter of local policy. 631 To obtain server reflexive candidates using the STUN Binding 632 Discovery usage, the client takes a local UDP candidate, and for each 633 configured STUN server, produces a server reflexive candidate. To 634 produce the server reflexive candidate from the local candidate, it 635 follows the procedures of Section XX of [13] for each local transport 636 address in the local candidate. The Binding Response will provide 637 the client with its server reflexive transport address in the 638 MAPPED-ADDRESS attribute. If the client has K local candidates, this will 639 produce S*K server reflexive candidates, where S is the number of 640 STUN servers. 642 Since a client will pace its STUN transactions (both Binding and 643 Allocate requests) at a total rate of one new transaction every Ta 644 seconds, it will take a certain amount of time to complete the 645 address gathering phase. It is RECOMMENDED that implementations have 646 a configurable upper bound on the total amount of time allotted to 647 address gathering. Any transactions not completed at that point 648 SHOULD be abandoned, but MAY continue and be used in an updated offer 649 once they complete. A default value of 5s is RECOMMENDED. Since the 650 time needed for all of the allocations that could be done (based on the number 651 of STUN servers and local interfaces) might exceed this bound, 652 clients SHOULD prioritize their local candidates and STUN servers, 653 performing transactions from the highest priority local candidates to 654 the highest priority STUN servers first. A STUN server would 655 typically be higher priority if it supports the STUN Relay usage, 656 since such a server provides two transport addresses with one 657 transaction.
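The interaction between pacing and the gathering time limit can be illustrated with a small planning sketch. It makes several assumptions that go beyond the text: the function name plan_gathering is hypothetical, both input lists are already sorted from highest to lowest priority, every local transport address is tried against each server in turn, and a transaction is kept only if it can start before the limit (the text itself speaks of transactions not completed by that point).

```python
from itertools import product


def plan_gathering(local_addrs, stun_servers, ta_ms=50, budget_ms=5000):
    """Assign paced start times (in ms) to STUN transactions.

    The defaults follow the values suggested in the text: Ta = 50 ms and
    a 5 s upper bound on address gathering.
    """
    plan = []
    for i, (addr, server) in enumerate(product(local_addrs, stun_servers)):
        start_ms = i * ta_ms  # a new transaction starts every Ta
        if start_ms >= budget_ms:
            break  # lower priority combinations are never started
        plan.append((start_ms, addr, server))
    return plan
```

With the default values, at most 100 transactions can be started inside the limit, which is why the ordering of candidates and servers matters on hosts with many interfaces or many configured servers.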
659 Once the allocations are complete, any redundant candidates are 660 discarded. Candidate A is redundant with candidate B if the 661 transport addresses of their corresponding components match, and 662 the components of their associated local candidates also match. For 663 example, consider a set of candidates with a single component. One 664 candidate is a local candidate, and its one component has a transport 665 address of 10.0.1.1:4458. A reflexive transport address is derived 666 from this local transport address, producing 10.0.1.1:4458. These 667 two candidates are identical, and also have identical associated 668 local transport addresses, so they are redundant. However, in a more 669 complicated case, consider a multi-homed host, with one interface at 670 192.168.1.1 and another at 10.0.1.1. The 192.168 network is behind a NAT, 671 with its "public" side in another net-10 private network. The client 672 obtains two local candidates, A and B, with transport addresses of 673 192.168.1.1:2376 and 10.0.1.1:7266 respectively. A server reflexive 674 transport address is derived from A through a STUN query, and it 675 happens to produce 10.0.1.1:7266. Call this candidate C. Candidate C 676 is not redundant with candidate B, since they have different 677 associated local transport addresses. 679 7.2 Prioritizing the Candidates and Choosing an Active One 681 The prioritization process takes the set of candidates and associates 682 each with a priority. This priority reflects the desire that the 683 agent has to receive media at that candidate, and is assigned as a 684 value from 0 to 1 (1 being most preferred). Priorities are ordinal, 685 so that their significance is only meaningful relative to other 686 candidates from that agent for a particular media stream. Candidates 687 MAY have the same priority. However, it is RECOMMENDED that each 688 candidate have a distinct priority. Doing so improves the efficiency 689 of ICE.
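The redundancy rule given at the end of Section 7.1 can be sketched as a simple de-duplication keyed on both the candidate's own transport addresses and those of its associated local candidate. The Candidate type and the discard_redundant function are hypothetical names used only for illustration, and keeping the first of two redundant candidates is an assumption; the text does not say which one survives.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    kind: str           # "local", "server reflexive" or "relayed"
    addrs: tuple        # transport address per component, in component-id order
    local_addrs: tuple  # transport addresses of the associated local candidate


def discard_redundant(candidates):
    """Keep the first candidate seen for each (addrs, local_addrs) combination."""
    seen = set()
    kept = []
    for c in candidates:
        key = (c.addrs, c.local_addrs)
        if key not in seen:
            seen.add(key)
            kept.append(c)
    return kept
```

Applied to the two examples in Section 7.1, the reflexive candidate 10.0.1.1:4458 derived from the local candidate 10.0.1.1:4458 is dropped, while candidate C is kept alongside candidate B because their associated local transport addresses differ.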
691 This specification makes no normative statements on how the 692 prioritization is done. However, some useful guidelines are 693 suggested on how such a prioritization can be determined. 695 One criterion for choosing one candidate over another is whether or 696 not that candidate involves the use of an intermediary. That is, if 697 media is sent to that candidate, will the media first transit an 698 intermediate server before being received? Relayed candidates are 699 clearly one type of candidate that involves an intermediary. Another 700 type is local candidates associated with a VPN server. When media is 701 transited through an intermediary, it can increase the latency 702 between transmission and reception. It can increase packet 703 loss, because of the additional router hops that may be taken. It 704 may increase the cost of providing service, since media will be 705 routed in and right back out of an intermediary run by the provider. 706 If these concerns are important, candidates with this property can be 707 listed with lower priority. 709 Another criterion for choosing one candidate over another is IP 710 address family. ICE works with both IPv4 and IPv6. It therefore 711 provides a transition mechanism that allows dual-stack hosts to 712 prefer connectivity over IPv6, but to fall back to IPv4 in case the 713 v6 networks are disconnected (due, for example, to a failure in a 714 6to4 relay) [25]. It can also help with hosts that have both a 715 native IPv6 address and a 6to4 address. In such a case, higher 716 priority could be afforded to the native v6 address, followed by the 717 6to4 address, followed by a native v4 address. This allows a site to 718 obtain and begin using native v6 addresses immediately, yet still 719 fall back to 6to4 addresses when communicating with agents in other 720 sites that do not yet have native v6 connectivity. 722 Another criterion for choosing one candidate over another is security.
723 If a user is a telecommuter, and therefore connected to their 724 corporate network and a local home network, they may prefer their 725 voice traffic to be routed over the VPN in order to keep it on the 726 corporate network when communicating within the enterprise, but use 727 the local network when communicating with users outside of the 728 enterprise. 730 Another criterion for choosing one address over another is topological 731 awareness. This is most useful for candidates that make use of 732 relays. In those cases, if an agent has preconfigured or dynamically 733 discovered knowledge of the topological proximity of the relays to 734 itself, it can use that to select closer relays with higher priority. 736 There may be transport-specific reasons for preferring one candidate 737 over another. In such a case, specifications defining usage of ICE 738 with other transport protocols SHOULD document such considerations. 740 Once the candidates have been prioritized, one may be selected as the 741 active one. This is the candidate that will be used for actual 742 exchange of media if and when it is validated, until a higher priority 743 candidate is validated. The active candidate will also be used to 744 receive media from ICE-unaware peers. As such, it is RECOMMENDED 745 that one be chosen based on the likelihood of that candidate to work 746 with the peer that is being contacted. Unfortunately, it is 747 difficult to ascertain which candidate that might be. As an example, 748 consider a user within an enterprise. To reach non-ICE capable 749 agents within the enterprise, a local candidate has to be used, since 750 the enterprise policies may prevent communication between elements 751 using a relay on the public network. However, when communicating to 752 peers outside of the enterprise, a relayed candidate from a 753 publicly accessible STUN server is needed.
755 Indeed, the difficulty in picking just one address that will work is 756 the whole problem that motivated the development of this 757 specification in the first place. As such, it is RECOMMENDED that 758 the active candidate be a relayed candidate from a STUN server 759 providing public IP addresses in response to an Allocate request. 760 Furthermore, ICE is only truly effective when it is supported on both 761 sides of the session. It is therefore most prudent to deploy it to 762 close-knit communities as a whole, rather than piecemeal. In the 763 example above, this would mean that ICE would ideally be deployed 764 completely within the enterprise, rather than just to parts of it. 766 An additional consideration for selection of the active candidate is 767 the switching of media stream destinations between the initial offer 768 and the subsequent offer. If the active candidate pair in the 769 initial offer is being validated, media will flow to that pair once 770 it is validated. When the ICE checks complete and yield a higher 771 priority candidate pair, media will begin to flow to it (there will 772 also be an updated offer/answer exchange that changes the active 773 candidate). This will result in a change in the destination of the 774 media packets. This may also cause a different path for the media 775 packets. That path might have different delay and jitter 776 characteristics. As a consequence, the jitter buffers may see a 777 glitch, causing possible media artifacts. If these issues are a 778 concern, the initial offer MAY omit an active candidate. In such a 779 case, an updated offer will need to be sent immediately when 780 communicating with an ICE-unaware agent, setting an active candidate. 782 There may be transport-specific reasons for selection of an active 783 candidate. In such a case, specifications defining usage of ICE with 784 other transport protocols SHOULD document such considerations. 
786 7.3 Encoding Candidates into SDP 788 For each candidate for a media stream, the agent includes a series of 789 a=candidate attributes as media-level attributes, one for each 790 component in the candidate. Each candidate has a unique identifier, 791 called the candidate-id. The candidate-id MUST be chosen randomly 792 and contain at least 24 bits of randomness (this does not mean that 793 the candidate-id is 24 bits long; just that it has at least 24 bits 794 of randomness). It is chosen only when the candidate is placed into 795 the SDP for the first time; subsequent offers or answers within the 796 same session containing that same candidate MUST use the same 797 candidate-id used previously. 24 bits is sufficient because the 798 candidate-id is not providing security (the much more random password 799 is). It is needed only to prevent a possible simultaneous selection 800 by two agents within a private network for the useful lifetime of the 801 software or hardware. 803 Each component of the candidate has an identifier, called the 804 component-id. The component-id is a sequence number. For each 805 candidate, it starts at one, and increments by one for each 806 component. As discussed below, ICE will perform connectivity checks 807 such that, between a pair of candidates, checks only occur between 808 transport addresses with the same component-id. As a consequence, if 809 one candidate has three components, and it is paired with a candidate 810 that has two, there will only be two transport address pairs and two 811 connectivity checks. 813 ICE will work without a standardized mapping between the components 814 of a media stream and the numerical value of the component-id. This 815 allows ICE to be used with media streams with multiple components 816 without development of standards around such a mapping. 
However, a 817 specific mapping has been defined in this specification for RTP - 818 component-id 1 corresponds to RTP, and component-id 2 corresponds 819 to RTCP. Like the candidate-id, the component-id is assigned at the 820 time the candidate is first placed into the SDP; subsequent offers or 821 answers within the same session containing that same candidate MUST 822 use the same component-id used previously. 824 The transport, addr and port of the a=candidate attribute (all 825 defined in Section 12) are set to the transport protocol, unicast 826 address and port of the transport address. A Fully Qualified Domain 827 Name (FQDN) for a host MAY be used in place of a unicast address. In 828 that case, when receiving an offer or answer containing an FQDN in an 829 a=candidate attribute, the FQDN is looked up in the DNS using an A or 830 AAAA record, and the resulting IP address is used for the remainder 831 of ICE processing. The qvalue is set to the priority of the 832 candidate, and MUST be the same for all components of the candidate. 834 All of the candidates share a password that is used for securing the 835 STUN connectivity checks. This password MUST be chosen randomly with 836 128 bits of randomness (though it can be longer than 128 bits). This 837 password is contained in the a=ice-pwd attribute, present as a 838 session level attribute. A new password MUST be selected for each 839 new session, and MUST be present with the same value in all 840 subsequent offers and answers from the agent. The converse is also true; 841 if a new offer is generated as part of a new multimedia session, a 842 new password MUST be used even if the transport address from a 843 previous session was being recycled. 845 The combination of candidate-id and component-id uniquely identifies 846 each transport address. As a consequence, each transport address has 847 a unique identifier, called the tid.
The tid is formed by 848 concatenating the candidate-id with the component-id, separated by 849 a colon (":"). The tid is not explicitly encoded in the SDP; it is 850 derived from the candidate-id and component-id, which are present in 851 the SDP. The usage of the colon as a separator allows the 852 candidate-id and component-id to be extracted from the tid, since the 853 colon is not a valid character for the candidate-id. 855 The tid gets combined, through further concatenation, with the tid of 856 a transport address from the remote candidate (separated again by 857 another colon) to form the username that is placed in the STUN checks 858 between the peers. This allows the STUN message to uniquely identify 859 the pairing whose connectivity it is checking. The tid is needed as 860 a unique identifier because the IP address within the candidate fails 861 to provide that uniqueness as a consequence of NAT. 863 Consider agents A, B, and C. A and B are within private enterprise 1, 864 which is using 10.0.0.0/8. C is within private enterprise 2, which 865 is also using 10.0.0.0/8. As it turns out, B and C both have IP 866 address 10.0.1.1. A sends an offer to C. C, in its answer, provides 867 A with its transport addresses. In this case, those are 10.0.1.1:8866 868 and 8877. As it turns out, B is in a session at that same time, and 869 is also using 10.0.1.1:8866 and 8877. This means that B is prepared 870 to accept STUN messages on those ports, just as C is. A will send a 871 STUN request to 10.0.1.1:8866 and 8877. However, these do not go to 872 C as expected. Instead, they go to B. If B just replied to them, A 873 would believe it has connectivity to C, when in fact it has 874 connectivity to a completely different user, B. To fix this, the tid 875 takes on the role of a unique identifier. C provides A with an 876 identifier for its transport address, and A provides one to C.
A 877 concatenates these two identifiers (with a colon between) and uses 878 the result as the username in its STUN query to 10.0.1.1:8866. This 879 STUN query arrives at B. However, the username is unknown to B, and 880 so the request is rejected. A treats the rejected STUN request as if 881 there were no connectivity to C (which is actually true). Therefore, 882 the error is avoided. 884 An unfortunate consequence of the non-uniqueness of IP addresses is 885 that, in the above example, B might not even be an ICE agent. It 886 could be any host, and the port to which the STUN packet is directed 887 could be any ephemeral port on that host. If there is an application 888 listening on this socket for packets, and it is not prepared to 889 handle malformed packets for whatever protocol is in use, the 890 operation of that application could be affected. Fortunately, since 891 the ports exchanged in SDP are ephemeral and usually drawn from the 892 dynamic or registered range, the odds are good that the port is not 893 used to run a server on host B, but rather is the agent side of some 894 protocol. This decreases the probability of hitting a port that is in use, 895 due to the transient nature of port usage in this range. However, 896 the possibility of a problem does exist, and network deployers should 897 be prepared for it. Note that this is not a problem specific to ICE; 898 stray packets can arrive at a port at any time for any type of 899 protocol, especially ones on the public Internet. As such, this 900 requirement is just restating a general design guideline for Internet 901 applications - be prepared for unknown packets on any port. 903 The active candidate, if there is one, is placed into the m/c lines 904 of the SDP. For RTP streams, this is done by placing the RTP address 905 and port into the c and m lines in the SDP respectively. If the 906 agent is utilizing RTCP, it MUST encode its address and port using 907 the a=rtcp attribute as defined in RFC 3605 [1].
If RTCP is not in 908 use, the agent MUST signal that using b=RS:0 and b=RR:0 as defined in 909 RFC 3556 [6]. 911 If there is no active candidate, the agent MUST include an a=inactive 912 attribute. The RTP address and port in the m/c-line are 913 inconsequential, since they won't be used. 915 Encoding of candidates may involve transport protocol specific 916 considerations. There are none for UDP. However, extensions that 917 define usage of ICE with other transport protocols SHOULD specify any 918 special encoding considerations. 920 Once an offer or answer is sent, an agent MUST be prepared to 921 receive both STUN and media packets on each candidate. As discussed 922 in Section 7.13, media packets can be sent to a candidate prior to 923 its promotion to active. 925 7.4 Forming Candidate Pairs 927 Once the offer/answer exchange has completed, both agents will have a 928 set of candidates for each media stream. Each agent forms a set of 929 candidate pairs for each media stream by combining each of its 930 candidates with each of the candidates of its peer. Candidates can 931 be paired up only if their transport protocols are identical. If an 932 offer/answer exchange took place for a session comprised of an audio 933 and a video stream, and each agent had two candidates per media 934 stream, there would be 8 candidate pairs, 4 for audio and 4 for 935 video. One agent can offer two candidates for a media stream, and 936 the answer can contain three candidates for the same media stream. 937 In that case, there would be six candidate pairs. 939 Each candidate has a number of components, each of which has a 940 transport address. Within a candidate pair, the components 941 themselves are paired up such that transport addresses with the same 942 component ID are combined to form a transport address pair. 943 Returning to the previous example, for each of the 8 candidate pairs, 944 there would be two transport address pairs - one for RTP, and one for 945 RTCP.
If one candidate has more components than the other, those 946 extra components will not be part of a transport address pair, won't 947 be validated, and will effectively be treated as if they weren't 948 included in the candidate pair in the first place. 950 The relationships among a candidate, candidate pair, transport 951 address, transport address pair and component are shown in Figure 2. 952 This figure shows the relationships as seen by the agent that owns 953 the candidate with candidate ID "L". This candidate has two 954 components with transport addresses A and B respectively. This 955 candidate is called the native candidate, since it is the one owned 956 by the agent in question. The candidate owned by its peer is called 957 the remote candidate. As the figure shows, there is a single 958 candidate pair, and two components in each candidate. The native 959 candidate has a candidate-id of "L", and the remote candidate has a 960 candidate-id of "R". Since the two component-ids are 1 and 2, 961 candidate "L" has two transport addresses with transport address IDs 962 of "L:1" and "L:2" respectively. Similarly, candidate "R" has two 963 transport addresses with transport address IDs of "R:1" and "R:2" 964 respectively. 966 Furthermore, each transport address pair is associated with an ID, 967 the transport address pair ID. This ID is equal to the concatenation 968 of the tid of the native transport address with the tid of the remote 969 transport address, separated by a colon. This means that the 970 identifiers are seen differently by each agent. For the agent that 971 owns candidate "L", there are two transport address pairs. One 972 contains transport addresses "L:1" and "R:1", with a transport address 973 pair ID of "L:1:R:1". The other contains transport addresses "L:2" and 974 "R:2", with a transport address pair ID of "L:2:R:2".
For the agent 975 that owns candidate "R", the identifiers for these two transport 976 address pairs are reversed; they would be "R:1:L:1" for the first one 977 and "R:2:L:2" for the second.
979            ...............................................
980            .                                             .
981            .                                             .
982            .  .............            .............     .
983            .  . tid=L:1   .            . tid=R:1   .     .
984            .  .   --      .            .      --   .     . component
985   component.  .  | A|------------------------| C|  .     . id=1
986   id=1     .  .   --      . Transport  .      --   .     .
987            .  .           .  Address   .           .     .
988            .  .           .    Pair    .           .     .
989            .  .           . id=L:1:R:1 .           .     .
990            .  .           .            .           .     .
991            .  .           .            .           .     .
992            .  . tid=L:2   .            . tid=R:2   .     .
993   component.  .   --      .            .      --   .     .
994   id=2     .  .  | B|------------------------| D|  .     . component
995            .  .   --      . Transport  .      --   .     . id=2
996            .  .           .  Address   .           .     .
997            .  .           .    Pair    .           .     .
998            .  .           . id=L:2:R:2 .           .     .
999            .  .           .            .           .     .
1000           .  .............            .............     .
1001           .     Native                   Remote         .
1002           .    Candidate                Candidate       .
1003           .      id=L                     id=R          .
1004           .                                             .
1005           .                                             .
1006           ...............................................
1008                           Candidate Pair
1010                              Figure 2
1012 If a candidate pair was created as a consequence of an offer 1013 generated by an agent, then that agent is said to be the offerer of 1014 that candidate pair and all of its transport address pairs. 1015 Similarly, the other agent is said to be the answerer of that 1016 candidate pair and all of its transport address pairs. As a 1017 consequence, each agent has a particular role, either offerer or 1018 answerer, for each transport address pair. This role is important; 1019 when a candidate pair is to be promoted to active, the offerer is the 1020 one which performs the updated offer. 1022 7.5 Ordering the Candidate Pairs 1024 For the same reason that the STUN transactions during address 1025 gathering are paced so that a new transaction starts every Ta seconds, so too 1026 are the connectivity checks paced, also with one new transaction 1027 every Ta seconds.
However, in order to rapidly converge on a valid 1028 candidate pair that is mutually desirable, the candidate pairs are 1029 ordered, and the checks start with the candidate pair at the top of 1030 the list. Rapid convergence of ICE depends on both the offerer and 1031 answerer coming to the same conclusion on the ordering of candidate 1032 pairs. 1034 Recall that when each candidate is encoded into SDP, it contains a 1035 qvalue between 1 and 0, with 1 being the highest priority. Peer 1036 reflexive candidates, learned through the procedures described in 1037 Section 7.10 also have a priority between 0 and 1. For each media 1038 stream, the native candidates are ordered based on their qvalues, 1039 with higher q-values coming first. Amongst candidates with the same 1040 qvalue, they are ordered based on candidate ID, using reverse 1041 lexicographic order, where C1 is placed before C2, if C2 precedes C1 1042 lexicographically. Lexicographic order can be viewed as a numerical 1043 ordering where each "digit" is actually a number in numerical base 1044 256, with the mapping of characters to numerical value being defined 1045 by their ASCII encoding. For example, the candidate with candidate 1046 ID agD is greater than the candidate with ID ad7, and both of those 1047 are greater than the candidate with ID zz. Consequently, if these 1048 three candidates had equal q-values, they would be ordered as agD, 1049 ad7, zz - reverse of their lexicographic order. 1051 The usage of a reverse lexicographic order is important; as discussed 1052 in Section 13, it allows peer-derived candidates to be preferred over 1053 native ones. 1055 The result of these ordering rules will be an ordered list of 1056 candidates. The first candidate in this list is given a sequence 1057 number of 1, the next is given a sequence number of 2, and so on. 1058 This same procedure is done for the remote candidates. 
The result is that each candidate pair has two sequence numbers, one
for the native candidate, and one for the remote candidate.

First, all of the candidate pairs for which the smaller of the two
sequence numbers equals 1 are taken. Then, all of those for which
the smaller of the two sequence numbers equals 2 are taken, and so
on. Amongst those pairs that share the same value for their smaller
sequence number, they are ordered by the larger of their two
sequence numbers (smallest first). Amongst those pairs that share
the same value for their smaller sequence number and the same value
for their larger sequence number, the larger of the two candidate IDs
in each pair is selected, and the pairs are lexicographically
ordered in reverse by that candidate ID, largest first.

As an example, consider two agents, A and B. One offers two
candidates for a media stream with candidate IDs of "g9" and "88",
with q-values of 1.0 and 0.8 respectively. The other answers with
three candidates with candidate IDs of "h8", "65" and "k1", with
q-values of 0.3, 0.2 and 0.1 respectively. The following table shows
the rank ordering of the six candidate pairs. The column labeled
"Max SN" is the larger of the two sequence numbers in the candidate
pair, and "Min SN" is the smaller. The column labeled "Max Cand.
ID" is the value of the larger of the two candidate IDs in the
candidate pair.

Order  A      A        A     B      B        B               Max
       Cand.  Cand.    Cand. Cand.  Cand.    Cand. Max  Min  Cand.
       ID     q-value  SN    ID     q-value  SN    SN   SN   ID
------------------------------------------------------------------
1      g9     1.0      1     h8     0.3      1     1    1    h8
2      88     0.8      2     h8     0.3      1     2    1    h8
3      g9     1.0      1     65     0.2      2     2    1    g9
4      g9     1.0      1     k1     0.1      3     3    1    k1
5      88     0.8      2     65     0.2      2     2    2    88
6      88     0.8      2     k1     0.1      3     3    2    k1

This ordering is then modified slightly by taking the candidate pair
corresponding to the active candidate, if there is one, and promoting
it to the top of the list. To find this candidate pair, the agent
looks for candidate pairs whose native and remote transport addresses
match the native and remote transport addresses in the m/c-line. It
is possible that multiple candidates match; this happens in the case
where an agent obtained the same derived transport address from
different local transport addresses. In such a case, the agent
should pick one of the matching candidates.

Putting the active candidate at the top of the list allows it to be
tested first. As discussed below, media is not sent until the
corresponding candidate is verified, necessitating rapid verification
of the active candidate. This modified ordering is called the
candidate pair check ordering, since it reflects the order in which
connectivity checks will be done. If there was no active candidate,
the candidate pair check ordering and the candidate pair priority
ordering will be identical.

Within each candidate pair there will be a set of transport address
pairs, one for each component ID. Those pairs are ordered by
component ID. The result is an absolute ordering of all transport
address pairs for a media stream, sorted first by the order of their
candidate pairs (with the exception of the active candidate),
followed by the order of their component IDs. This ordering is
called the transport address pair check ordering.
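
The ordering rules of this section can be sketched in a few lines of
Python. This is an illustrative sketch only, not part of the
specification: the function names (id_key, sequence_numbers,
order_pairs, promote_active) are invented here, candidates are modeled
as plain (candidate ID, q-value) tuples, and the third answer
candidate is written "k1" as in the table. Candidate IDs are compared
as base-256 numbers, which is the reading of "lexicographic order"
given above.

```python
from itertools import product


def id_key(cand_id: str) -> int:
    """Read a candidate ID as a base-256 number (ASCII codes as digits)."""
    return int.from_bytes(cand_id.encode("ascii"), "big")


def sequence_numbers(candidates):
    """Map candidate ID -> sequence number (1 is the most preferred).

    candidates is a list of (candidate_id, qvalue) tuples. Higher
    q-values come first; ties are broken by the larger candidate ID.
    """
    ranked = sorted(candidates, key=lambda c: (-c[1], -id_key(c[0])))
    return {cid: sn for sn, (cid, _q) in enumerate(ranked, start=1)}


def order_pairs(native, remote):
    """Return (native_id, remote_id) pairs in candidate pair priority order."""
    nsn, rsn = sequence_numbers(native), sequence_numbers(remote)

    def key(pair):
        n, r = pair
        sns = (nsn[n], rsn[r])
        # Smaller SN first, then larger SN, then larger candidate ID
        # (largest first, hence the negation).
        return (min(sns), max(sns), -max(id_key(n), id_key(r)))

    return sorted(product(nsn, rsn), key=key)


def promote_active(pairs, active):
    """Candidate pair check ordering: the active pair, if any, goes first."""
    if active in pairs:
        return [active] + [p for p in pairs if p != active]
    return list(pairs)


agent_a = [("g9", 1.0), ("88", 0.8)]
agent_b = [("h8", 0.3), ("65", 0.2), ("k1", 0.1)]
for rank, pair in enumerate(order_pairs(agent_a, agent_b), start=1):
    print(rank, pair)
```

Run on the example candidates, the sketch yields the same six-row
ordering as the table. promote_active() then gives the candidate pair
check ordering for whichever pair is named in the m/c-line.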
Ordering of candidates may involve transport protocol specific
considerations. There are none for UDP. However, extensions that
define usage of ICE with other transport protocols SHOULD specify any
special ordering considerations.

7.6 Performing the Connectivity Checks

Connectivity checks are a STUN usage defined in [13]. They are
performed by sending peer-to-peer STUN Binding Requests. These
checks result in a candidate progressing through a state machine that
captures the progress of connectivity checks. The specific state
machine and the procedures for the connectivity checks are specific
to the transport protocol. This specification defines rules for UDP.
Extensions to ICE that describe other transport protocols SHOULD
describe the state machine and the procedures for connectivity
checks.

The set of states visited by the offerer and answerer is depicted
graphically in Figure 4.

                                     |
                                     |Start
                                     |
                                     |
                                     V
                               +------------+
                               |            |
                               |            |
                               |  Waiting   |-----------------+
                               |            |                 |
                               |            |                 |
                               +------------+                 |
                                     |                        |
                                     | Timer Ta               | Get Req
                                     | --------               | -------
                                     | Send Req   Get Req     | Send Res,
                                     V            -------     | Send Req
                 Get Res       +------------+     Send Res,   |
                 -------       |            |     Re-Xmit     |
                    -          |            |     Req         |
               +---------------|  Testing   |-----------+     |
               |               |            |           |     |
               |               |            |           |     |
               |               +------------+           |     |
               |                     |                  |     |
               |                     | Error            |     |
               |                     | -----            |     |
Timer Tr       |                     |   -              |     |
--------       V                     V                  V     V
Send Req +------------+        +------------+        +------------+
   +-----|            |        |            |        |            |
   |     |   Recv-    |        |            |        |   Send-    |
   |     |   Valid    |------->|  Invalid   |<-------|   Valid    |
   |     |            |        |            |        |            |
   +---->|            | Error  |            | Error  |            |
         +------------+ -----  +------------+ -----  +------------+
               |          -          ^          -          |
               |                     | Error               |
               |                     | -----               |
               |                     |   -                 |
               |               +------------+              |
               |               |            |              |
               |               |            |              |
               +-------------->|   Valid    |<-------------+
                    Get Req    |            |  Get Res
                    -------    |            |  -------
                    Send Res   +------------+     -
                                  |       ^
                                  |       |
                                  |       |
                                  +-------+
                                  Timer Tr
                                  --------
                                  Send Req

                               Figure 4

The state machine has six states - Waiting, Testing, Recv-Valid,
Send-Valid, Valid and Invalid. Initially, all transport address
pairs start in the Waiting state. In this state, the agent waits for
one of two events - a chance to send a Binding Request, or receipt of
a Binding Request.

Since there is an instance of the state machine for each transport
address pair, Binding Requests and responses need to be matched to
the specific state machine to which they apply. For a Binding
Request, the matching transport address pair is found by examining
the USERNAME of the incoming request, which directly contains the
transport address pair ID. Requests that are sent by an agent as
part of the processing described here encode the transport address
pair in the USERNAME.
Binding Responses are matched to their requests using the
STUN transaction ID, and then mapped to the transport address pair
from that.

Every Ta seconds, the agent starts a new connectivity check for a
transport address pair. The check is started for the first transport
address pair in the transport address pair check ordered list (which
will be part of the active candidate) that is in the Waiting state.
The state machine for this transport address pair is moved to the
Testing state, and the agent sends a connectivity check using a STUN
Binding Request, as outlined in Section 7.7. Once a STUN
connectivity check begins, the processing of the check follows the
rules for STUN. Specifically, retransmits of STUN requests are done
as specified in [13], and furthermore, if a transaction fails and
needs to be retried, that retry can happen rapidly, as described
below. It doesn't "count" against the rate limit of 1/Ta checks per
second. In addition, the keepalives that are generated for a valid
pair do not count against the rate limit either. The rate limit
applies strictly to the start of connectivity checks for a transport
address pair that has been newly signaled through an offer/answer
exchange.

In addition, if, while in the Waiting state, an agent receives a
Binding Request matching that transport address pair, and this
Binding Request generates a successful response, the transport
address pair moves into the Send-Valid state, and the agent sends a
connectivity check of its own using a STUN Binding Request, as
outlined in Section 7.7. If the Binding Request didn't generate a
success response, there is no change in state or generation of a
Binding Request.

If, while in the Testing state, the agent receives a successful
response to its STUN request, the transport address pair moves into
the Recv-Valid state.
In this state, the agent knows that packets
can flow in both directions. However, its peer agent doesn't yet
know that; all it knows is that it has been able to receive a packet.
Thus, in this state, the agent awaits receipt of the Binding Request
sent by its peer, as the response to that request is what informs its
peer that packets can flow in both directions.

If, while in the Testing state, the agent receives a Binding Request
matching that transport address pair, and this Binding Request
generates a successful response, the transport address pair moves
into the Send-Valid state. In addition, the agent retransmits a
Binding Request for the transaction in progress. This helps speed up
bidirectional connectivity verification when one agent is behind a
symmetric NAT. If the Binding Request didn't generate a success
response, there is no change in state or generation of a Binding
Request.

If, while in the Send-Valid state, the agent receives a successful
response to its STUN request, the transport address pair moves to the
Valid state. In this state, the agent knows that packets can flow in
each direction. It also knows that its peer has sent it the STUN
Request whose response will demonstrate to the peer that packets can
flow in each direction.

If, while in the Recv-Valid state, the agent receives a STUN Binding
Request from its peer that results in a successful response, the
transport address pair moves into the Valid state. Receipt of a
request whose response was not a successful one does not result in a
change in state.

In any state, if the STUN transaction results in an error, the state
machine moves into the Invalid state. A STUN transaction produces an
"error" based on the processing in Section 7.7, which indicates which
STUN response codes constitute an error as far as ICE processing is
concerned.
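
The transitions described so far can be captured in a small lookup
table. The following Python sketch is illustrative only: the State
and Event names, the action strings, and the step() and run() helpers
are invented for this example, and the Tr keepalive timer of Figure 4
is left out. An event that has no entry for the current state leaves
the state unchanged.

```python
from enum import Enum


class State(Enum):
    WAITING = "Waiting"
    TESTING = "Testing"
    RECV_VALID = "Recv-Valid"
    SEND_VALID = "Send-Valid"
    VALID = "Valid"
    INVALID = "Invalid"


class Event(Enum):
    TIMER_TA = "Ta fired and this pair was selected"
    REQUEST_OK = "Binding Request received, success response generated"
    RESPONSE_OK = "success response received for our own request"
    ERROR = "STUN transaction produced an error"


# (current state, event) -> (next state, actions); labels are hypothetical.
TRANSITIONS = {
    (State.WAITING, Event.TIMER_TA): (State.TESTING, ["send request"]),
    (State.WAITING, Event.REQUEST_OK):
        (State.SEND_VALID, ["send response", "send request"]),
    (State.TESTING, Event.RESPONSE_OK): (State.RECV_VALID, []),
    (State.TESTING, Event.REQUEST_OK):
        (State.SEND_VALID, ["send response", "retransmit request"]),
    (State.SEND_VALID, Event.RESPONSE_OK): (State.VALID, []),
    (State.RECV_VALID, Event.REQUEST_OK): (State.VALID, ["send response"]),
    (State.VALID, Event.REQUEST_OK): (State.VALID, ["send response"]),
}


def step(state, event):
    """Return (next_state, actions) for one event on one pair."""
    if event is Event.ERROR:
        # An error moves the pair to Invalid from any state.
        return State.INVALID, []
    return TRANSITIONS.get((state, event), (state, []))


def run(events, state=State.WAITING):
    """Feed a sequence of events to a pair and return its final state."""
    for event in events:
        state, actions = step(state, event)
        print(f"{event.name:12} -> {state.value:10} {actions}")
    return state


# The agent that sends first: Waiting, Testing, Recv-Valid, Valid.
run([Event.TIMER_TA, Event.RESPONSE_OK, Event.REQUEST_OK])
# The agent that hears from its peer first: Waiting, Send-Valid, Valid.
run([Event.REQUEST_OK, Event.RESPONSE_OK])
```

Keeping the transitions in a table makes it easy to compare the sketch
against Figure 4 edge by edge.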
If a transport address pair is in the Recv-Valid or Valid state, an
agent MUST generate a new STUN Binding Request transaction every Tr
seconds. This transaction ensures that NAT bindings for the
transport address pair remain open while the candidate is under
consideration. The transaction is performed as outlined in
Section 7.7. These transactions can also be used to keep the NAT
bindings alive when the candidate is promoted to active, as described
in Section 7.12. Tr SHOULD be configurable, and SHOULD default to 15
seconds. If the transaction results in an error, the state machine
moves to the Invalid state. This happens in cases where the NAT
bindings expire (e.g., due to binding timeouts or NAT failures).

The candidate pair itself has a state, which is derived from the
states of its transport address pairs. If at least one of the
transport address pairs in a candidate pair is in the Invalid state,
the state of the candidate pair is considered to be Invalid. If the
candidate pair enters this state, an agent SHOULD move the state
machines for all of the other transport address pairs in this
candidate pair into the Invalid state as well. This will ensure that
connectivity checks never start for those transport address pairs.
Furthermore, if checks are already in progress for one of those
transport address pairs, the agent SHOULD cease them.

If all of the transport address pairs making up the candidate pair
are Valid, the candidate pair is considered Valid. If all of the
transport address pairs making up the candidate pair are either Valid
or Recv-Valid, and at least one is Recv-Valid, the candidate pair is
considered to be Recv-Valid. If all of the transport address pairs
making up the candidate pair are either Valid or Send-Valid, and at
least one is Send-Valid, the candidate pair is considered to be
Send-Valid.
If all of the transport address pairs in a candidate pair are
in the Waiting state, the candidate pair is in the Waiting state. If
all of the transport address pairs in the candidate pair are either
in the Waiting or Testing states, and at least one is in the Testing
state, the state of the candidate pair is Testing. Otherwise, the
state of the candidate pair is considered Indeterminate.

A candidate itself also has a state. If a candidate is present in at
least one valid candidate pair, that candidate is said to be valid.
If all of the candidate pairs containing that candidate are invalid,
the candidate itself is invalid. Otherwise, the candidate's state is
Indeterminate.

7.7 Sending a Binding Request for Connectivity Checks

An agent performs a connectivity check on a transport address pair by
sending a STUN Binding Request from its native transport address to
the remote transport address. The meaning of "sending
from its native transport address" depends on the type of transport
protocol and the type of transport address (local, reflexive, or
relayed). This specification defines the meaning for UDP.
Specifications defining other transport protocols must define what
this means for them.

For UDP-based local transport addresses, sending from the local
transport address has the meaning one would expect - the request is
sent such that the source IP address and port equal that of the local
transport address. For reflexive transport addresses, it is sent
from the associated local transport address used to derive
that reflexive address. For relayed transport addresses, it is sent
by using STUN mechanisms to send the request through the STUN relay
(using the Send request).
Sending the request through the STUN relay
server necessarily requires that the request be sent from the client,
using the local transport address used to derive the relayed
transport address.

The Binding Request sent by the agent MUST contain the USERNAME
attribute. This attribute MUST be set to the transport address pair
ID of the corresponding transport address pair as seen by its peer.
Thus, for the first transport address pair in Figure 2, if the agent
on the left sends the STUN Binding Request, the USERNAME will have
the value R:1:L:1. If the agent on the right sends the STUN Binding
Request, the USERNAME will have the value L:1:R:1. To be clear, the
USERNAME that is used is NOT the one seen locally, but rather the one
as seen by its peer. The request SHOULD contain the
MESSAGE-INTEGRITY attribute, computed according to [13]. The key
used as input to the HMAC is the password provided by the peer for
this remote transport address. This password will be identical for
all remote transport addresses for the same media stream.

The STUN transaction will generate either a timeout, or a response.
If the response is a 420, 500, or 401, the agent should try again as
described in [13] (as mentioned above, it need not wait Ta seconds to
try again). Either initially, or after such a retry, the STUN
transaction might produce a non-recoverable failure response (error
codes 400, 430, 431, or 600) or a failure result inapplicable to this
usage of STUN and thus unrecoverable (432, 433). If this happens, an
error event is generated into the state machine, and the transport
address pair enters the Invalid state.

If the STUN transaction times out, the client SHOULD NOT retry.
The only reason a retry might succeed is if there was severe packet
loss during the check, or the answer was significantly
delayed, also due to packet loss. However, STUN Binding Request
transactions run for 9.5 seconds, which is well beyond the typical
tolerance for a session establishment. The retries come with a
penalty of additional traffic, which can be used to launch DoS
attacks (Section 13.4.2). The only reason to not follow the SHOULD
NOT is if the agent has adjusted the STUN transaction timers to be
more aggressive.

If the Binding Response is a 200, the agent SHOULD check for the
MESSAGE-INTEGRITY attribute and verify it, as discussed in [13].
Indeed, this check SHOULD be done for all responses. This will
result in the response being discarded (eventually leading to a
timeout), if the integrity check fails.

7.8 Receiving a Binding Request for Connectivity Checks

As a result of providing a list of candidates in its offer or answer,
an agent will receive STUN Binding Request messages. An agent MUST
be prepared to receive STUN Binding Requests on each local transport
address from the moment it sends an offer or answer that contains a
candidate with that local transport address. Similarly, it MUST be
prepared to receive STUN Binding Requests on a local transport
address the moment it sends an offer or answer that contains a
reflexive or relayed candidate derived from a local candidate with
that local transport address. It can cease listening for STUN
messages on that local transport address after sending an updated
offer or answer which does not include any candidates with transport
addresses that are equal to or derived from that local transport
address.
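
As a recap of the sending side in Section 7.7, the following Python
sketch shows how the USERNAME of an outgoing request is formed and how
the outcome of the transaction is classified. It is a sketch under
stated assumptions: request_username() and classify() are hypothetical
helper names, a timeout is represented by None, and the "unhandled"
result for codes not listed in Section 7.7 is a choice made for this
example only.

```python
# Response codes grouped as in Section 7.7.
RETRY_CODES = {401, 420, 500}
UNRECOVERABLE_CODES = {400, 430, 431, 432, 433, 600}


def request_username(native_tid: str, remote_tid: str) -> str:
    """USERNAME for an outgoing request: the pair ID as the peer sees it.

    Locally the pair ID is native_tid:remote_tid; the peer sees its
    own transport address first, so the two halves are swapped.
    """
    return f"{remote_tid}:{native_tid}"


def classify(code):
    """Classify a transaction outcome; code is None for a timeout."""
    if code is None:
        return "timeout"      # the agent SHOULD NOT retry
    if code == 200:
        return "success"      # still subject to the MESSAGE-INTEGRITY check
    if code in RETRY_CODES:
        return "retry"        # retried at once, not paced by Ta
    if code in UNRECOVERABLE_CODES:
        return "error"        # error event; the pair enters Invalid
    return "unhandled"        # assumption: not covered by Section 7.7


# The agent on the left (native L:1, remote R:1) sends R:1:L:1.
print(request_username("L:1", "R:1"))
for code in (200, 401, 430, None):
    print(code, classify(code))
```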
As discussed in [13], since the username and password for STUN
requests are exchanged through another mechanism - here, ICE - the
Shared Secret Request mechanism is not needed and need not be
implemented by agents that provide the connectivity check usage.

One of the candidates may be in use as the active candidate, or may
become promoted to the active candidate in the next offer/answer
exchange as a consequence of a successful validation. In either
case, both media and STUN packets will be sent to the transport
addresses comprising that candidate, causing both to be received on
their associated local transport addresses. The agent MUST be able to
disambiguate them. This is done trivially by looking for the STUN
magic cookie as the value of the second 32-bit word in the packet.
If present, it identifies a STUN packet.

Processing of the Binding Request proceeds in two steps. The first
is generation of the response, and the second is ICE-specific
processing. Generation of the response follows the general
procedures of [13]. The USERNAME is considered valid if one of the
candidate IDs sent in an offer or answer is a prefix of the USERNAME
(this will always be the case, even for peer reflexive candidates).
The password associated with that candidate ID is used to verify the
MESSAGE-INTEGRITY attribute, if one was present in the request. If
the USERNAME was not valid, the agent generates a 430. Otherwise,
the success response will include the MAPPED-ADDRESS attribute, which
is used for learning new candidates, as described in Section 7.10.
The MAPPED-ADDRESS attribute is populated with the source IP address
and port of the Binding Request.
For Binding Requests received over
relayed transport addresses, this MUST be the source IP address and
port of the Binding Request when it arrived at the relay, prior to
forwarding towards the agent. That source transport address will be
present in the REMOTE-ADDRESS attribute of a STUN Data Indication
message, if the Binding Request was delivered through a Data
Indication. If the Binding Request was not encapsulated in a Data
Indication, that source address is equal to the current active
destination for the STUN relay session.

The ICE processing involves changes to the state machine for a
transport address pair. This processing cannot be done until the
initial offer/answer exchange has completed. As a consequence, if
the offerer received a Binding Request that generated a success
response, but had not yet received the answer to its offer, it waits
for the answer, and when it arrives, then performs the ICE
processing.

The agent takes the entire contents of the USERNAME, and compares
them against the transport address pair identifiers as seen by that
agent for each transport address pair. If there is no match, nothing
is done - this should never happen for compliant implementations. If
there is a match, the resulting transport address pair is called the
matching transport address pair. The state machine for the matching
transport address pair is then updated based on the receipt of a STUN
Binding Request, and the resulting actions described in Section 7.6
are undertaken.

An agent will continue to receive periodic STUN connectivity checks
on a local transport address as long as it had listed that transport
address, or one derived from it, in an a=candidate attribute in its
most recent offer or answer, the state machine for that transport
address is in the Recv-Valid or Valid states, and the transport
address is for UDP.
Whether STUN keepalives are used for other
transport protocols is defined by the specifications for that
transport protocol. The agent processes any such transactions
according to this section. It is possible that a transport address
pair that was previously valid may become invalidated as a result of
a subsequent failed STUN transaction.

7.9 Promoting a Candidate to Active

As a consequence of the connectivity checks, each agent will change
the states for each transport address pair, and consequently, for the
candidate pairs. When a candidate pair becomes valid, and the agent
is in the role of offerer for that candidate pair, the agent follows
the logic in this section. The rules only apply to the offerer of a
candidate pair in order to eliminate the possibility of both agents
simultaneously offering an update to promote a candidate to active.

If this candidate pair is the first one in the candidate pair
priority ordered list, the agent SHOULD send an updated offer as
described in Section 7.11.1. If this candidate pair is not the first
on that list, but it is the first on the candidate pair check ordered
list, it means that this candidate pair is the active one, and its
connectivity has been verified. This is good news; the currently
active candidate is working. Media can now flow as described in
Section 7.13 (media will never flow prior to validation). However,
no updated offer is sent at this time.

If this candidate pair is not the first on the candidate pair
priority ordered list or the candidate pair check ordered list, and
the wait-state timer has not yet been set, the agent sets this timer
to Tws seconds. Tws SHOULD be configurable, and SHOULD have a
default of 100ms. This timer allows for a higher priority
connectivity check to complete, in the event its STUN Binding Request
was lost or delayed in the network.
If, prior to the wait-state
timer firing, another connectivity check completes and a candidate
pair is validated, there is no need to reset or cancel the timer.

Once the timer fires, the agent SHOULD issue an updated offer as
described in Section 7.11.1.

In addition, in order to speed up ICE processing, once the agent has
determined the candidate that is to be promoted, it will send and
receive media using that candidate in expectation of an updated
offer. This is discussed in Section 7.13.

7.10 Learning New Candidates from Connectivity Checks

ICE makes use of reflexive addresses, which are addresses that inform
an agent of its transport address as seen by another host. An
initial offer or answer generated by an agent includes server
reflexive addresses, which are learned from a configured or
discovered STUN server in the network. However, the connectivity
checks themselves can inform an agent of reflexive addresses, and in
particular, ones that are reflexive towards its peer. These are
called peer reflexive candidates. A new peer reflexive candidate is
typically observed when two agents are separated by a NAT with the
address-dependent or address and port dependent mapping properties
[37]. When the agent behind such a NAT sends a Binding Request to
the other agent (assuming it is reachable), the NAT will create a new
mapping for this Binding Request. Because STUN and the media packets
are sent on the same port, regardless of the filtering properties of
the NAT (whether endpoint independent, address dependent, or address
and port dependent), this reflexive address can be used by the peer
for sending STUN and media packets back towards the agent.
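
Why the peer can see an address that the STUN server never reported
is easiest to see with a toy model. The Python sketch below simulates
a NAT with address and port dependent mapping. It is not a model of
any real NAT implementation: the class name, the port allocation
scheme and all addresses (taken from the documentation ranges) are
assumptions made for illustration.

```python
class AddressAndPortDependentNat:
    """Toy NAT: one public port per (internal source, destination)."""

    def __init__(self, public_ip, first_port=40000):
        self.public_ip = public_ip
        self.next_port = first_port
        self.mappings = {}

    def map_outbound(self, src, dst):
        """Return the public (ip, port) a packet from src to dst appears from."""
        key = (src, dst)
        if key not in self.mappings:
            # A destination not seen before gets a fresh mapping.
            self.mappings[key] = (self.public_ip, self.next_port)
            self.next_port += 1
        return self.mappings[key]


nat = AddressAndPortDependentNat("198.51.100.1")
agent = ("192.0.2.10", 5000)          # local transport address
stun_server = ("203.0.113.5", 3478)
peer = ("203.0.113.77", 6000)

server_reflexive = nat.map_outbound(agent, stun_server)
peer_reflexive = nat.map_outbound(agent, peer)

print("seen by STUN server:", server_reflexive)
print("seen by peer:       ", peer_reflexive)
```

The address signaled in the offer is the one the STUN server saw. The
connectivity check towards the peer leaves the NAT from a different
mapping, and that second address is what the peer learns as a peer
reflexive transport address.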
To obtain and use these peer reflexive transport addresses, ICE
agents perform additional processing on the receipt of STUN Binding
Requests and responses, beyond the logic described in Section 7.7 and
Section 7.8. This logic is described below.

7.10.1 On Receipt of a Binding Request

When a STUN Binding Request is received which generates a success
response, that Binding Request would have been associated with a
matching transport address pair and corresponding candidate pair.
The source IP and port of this Binding Request are compared to the IP
address and port of the remote transport address in the matching
transport address pair. Note that, in this case, we are comparing
actual IP addresses and ports - not tids. In addition, if the
Binding Request arrived through a relayed transport address, the
source IP and port of this Binding Request used for the comparison
are those in the Binding Request when it arrived at the relay, prior
to forwarding towards the agent. That source transport address will
be present in the REMOTE-ADDRESS attribute of a STUN Data Indication
message, if the Binding Request was delivered through a Data
Indication. If the Binding Request was not encapsulated in a Data
Indication, that source address is equal to the current active
destination for the STUN relay session.

The comparison of the source IP and port of the Binding Request and
the IP address and port of the remote transport address in the
matching transport address pair may indicate inequality. In that
case, the source IP and port of the Binding Request (and again, for
relayed transport address, this refers to the source IP address and
port of the packet when it arrived at the relay) are compared to the
IP address and ports across the transport address pairs in *all*
remote candidates.
If there is still no match, it means that the
source IP and port might represent another valid remote transport
address - a peer derived one.

To use it, that address needs to be associated with a candidate
(called a peer-derived candidate). In this case, however, the
candidate isn't signaled through an offer/answer exchange; it is
constructed dynamically from information in the STUN request. Like
all other candidates, the peer-derived candidate has a candidate ID.
The candidate ID is derived from the candidate IDs of the matching
candidate pair. In particular, the candidate ID is constructed by
concatenating the remote candidate ID with the native candidate ID
(without the colon). The password for the new candidate equals that
of the remote candidate ID in the matching candidate pair.

On receipt of a STUN Binding Request whose source IP and port don't
match the transport address in any remote candidate, the agent
constructs the candidate ID that represents the peer reflexive
candidate, and checks to see if that candidate exists. It may
already exist if it had been constructed as a consequence of a
previous application of this logic on receipt of a Binding Request
for a different transport address pair of the same candidate pair.
If there is not yet a peer reflexive candidate with that candidate
ID, the agent creates it, and assigns it the newly computed candidate
ID. The priority of the peer-derived candidate MUST be set to the
priority of its generating candidate - the remote candidate in the
matching transport address pair. Note that, at this time, the peer
derived candidate has no transport addresses in it.

Newly created or not, the agent extracts the component ID from the
matching transport address pair, and sees if a transport address with
that same component ID exists in the peer reflexive candidate.
If not (and it shouldn't), the agent adds a transport address to the
peer reflexive candidate. This transport address is equal to the
source IP address and port from the incoming STUN Binding Request
(and in the case of a relayed transport address, the one seen by the
relay). It is assigned the component ID equal to the component ID in
the matching transport address pair. This transport address will
have a tid, equal to the concatenation of the candidate ID for this
new candidate, and the component ID, separated by a colon.

The peer reflexive candidate becomes usable once the number of
transport addresses in it equals the transport address pair count of
the candidate pair from which it is derived. Initially, the peer
reflexive candidate will start with a single transport address. More
are added as the connectivity checks for the original candidate pair
take place. Once the peer reflexive candidate becomes usable, it has
to be paired up with native candidates. However, unlike the
procedures of Section 7.5, which pair up each remote candidate with
each native candidate, this peer reflexive candidate is only paired
up with the native candidate from the candidate pair from which it
was derived. This creates a new candidate pair, and a set of new
transport address pairs.

Recall that, for each candidate pair, one agent plays the role of
offerer, and the other of answerer. For a peer-reflexive candidate,
the role is identical to that of its generating candidate.

Figure 5 provides a pictorial representation of the peer reflexive
candidate (the one with id=RL) and its pairing with the native
candidate with id L. The candidate with ID R is referred to as the
generating candidate. The peer reflexive candidate is effectively an
alternate for that generating candidate, but is only paired with a
specific native candidate.
Note that, for a particular generating
candidate, there can be many peer derived candidates, up to one for
each native candidate.

          .............                           .............
          . tid=L:1   .                           . tid=R:1   .
 component.   --      .        id=L:1:R:1         .   --      .component
   id=1   .  | A|------------------------------------| C|     . id=1
          .   --  ----------+                     .   --      .
          .           .     |                     .           .  Generating
          .           .     |                     .           .  Candidate
          . tid=L:2   .     |                     . tid=R:2   .
 component.   --      .     |  id=L:2:R:2         .   --      .component
   id=2   .  | B|-----------C------------------------| D|     . id=2
          .   --  --------+ |                     .   --      .
          .............   | |                     .............
             Native       | |                        Remote
             Candidate    | |                        Candidate
             id=L         | |                        id=R
                          | |
                          | |                     .............
                          | |                     . tid=RL:1  .
                          | |  id=L:1:RL:1        .   --      .component
                          | +------------------------| C|     . id=1
                          |                       .   --      .
                          |                       .           .  Peer Derived
                          |                       .           .  Candidate
                          |                       . tid=RL:2  .
                          |    id=L:2:RL:2        .   --      .component
                          +--------------------------| D|     . id=2
                                                  .   --      .
                                                  .............
                                                     Remote
                                                     Candidate
                                                     id=RL

                               Figure 5

The new transport address pairs have a state machine associated with
them. The state that is entered, and actions to take as a
consequence, are specific to the transport protocol. For UDP, the
procedures are defined here. Extensions that define processing for
other transport protocols SHOULD describe the behavior.

For UDP, the state machine enters the Send-Valid state. Effectively,
the Binding Request just received "counts" as a validation in this
direction, even though it was formally done for a different candidate
pair. In addition, the agent SHOULD generate a Binding Request for
each transport address in this new candidate pair, as described in
Section 7.7. The transport address pairs are inserted into the
ordered list of pairs based on the ordering described in Section 7.5
and processing follows the logic described in Section 7.6.
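
The bookkeeping of this section can be sketched as follows. This is a
minimal illustration, not a definitive implementation: the Candidate
class, the on_binding_request() helper and the "learned" dictionary
are invented for the example, candidates are reduced to an ID, a
password, a priority and a map from component ID to (IP, port), and
the native and remote candidates are assumed to have the same number
of components. Addresses come from the documentation ranges.

```python
from dataclasses import dataclass, field


@dataclass
class Candidate:
    cand_id: str
    password: str
    priority: float
    addresses: dict = field(default_factory=dict)  # component ID -> (ip, port)

    def tid(self, component_id):
        """Transport address ID: candidate ID and component ID joined by a colon."""
        return f"{self.cand_id}:{component_id}"


def on_binding_request(source, component_id, native, remote,
                       remote_candidates, learned):
    """Process the source address of a successful Binding Request.

    Returns the peer-derived candidate once it is usable, else None.
    """
    # A source address already present in any remote candidate is not new.
    if any(source in c.addresses.values() for c in remote_candidates):
        return None
    # Remote candidate ID followed by native candidate ID, no colon.
    new_id = remote.cand_id + native.cand_id
    cand = learned.setdefault(
        new_id, Candidate(new_id, remote.password, remote.priority))
    cand.addresses.setdefault(component_id, source)
    # Usable once every component of the generating pair is covered.
    if len(cand.addresses) == len(remote.addresses):
        return cand
    return None


native = Candidate("L", "pwL", 1.0,
                   {1: ("192.0.2.1", 5000), 2: ("192.0.2.1", 5001)})
remote = Candidate("R", "pwR", 0.8,
                   {1: ("198.51.100.7", 6000), 2: ("198.51.100.7", 6001)})
learned = {}

first = on_binding_request(("203.0.113.9", 40000), 1,
                           native, remote, [remote], learned)
second = on_binding_request(("203.0.113.9", 40001), 2,
                            native, remote, [remote], learned)
print(first)                                   # not usable yet
print(second.cand_id, second.addresses)
print(f"{native.tid(1)}:{second.tid(1)}")      # new pair ID, as in Figure 5
```

The new candidate is paired only with the native candidate L, giving
the pair IDs L:1:RL:1 and L:2:RL:2 shown in Figure 5. Under the UDP
rules above, each of those pairs would start in the Send-Valid state.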
7.10.2 On Receipt of a Binding Response

The procedures on receipt of a Binding Response are nearly identical
to those for receipt of a Binding Request as described above.

When a successful STUN Binding Response is received, it will be
associated with a matching transport address pair and corresponding
candidate pair. This matching is done based on comparison of
candidate IDs. The value of the MAPPED-ADDRESS attribute of the
Binding Response is compared to the IP address and port of the
native transport address in the matching transport address pair.
Note that, in this case, we are comparing actual IP addresses and
ports - not tids. These may not match if there was a NAT between the
two agents. If they do not match, the value of the MAPPED-ADDRESS
attribute of the Binding Response is compared to the IP address and
ports across the transport address pairs in *all* native candidates.
If there is still no match, it means that the MAPPED-ADDRESS might
represent another valid native transport address.

To use it, that address needs to be associated with a candidate. In
this case, however, the candidate isn't signaled through an
offer/answer exchange; it is constructed dynamically from information
in the STUN response. Such a candidate is called a peer reflexive
candidate. Like all other candidates, the peer reflexive candidate
has a candidate ID. The candidate ID is derived from the candidate
IDs of the matching candidate pair. In particular, the candidate ID
is constructed by concatenating the native candidate ID with the
remote candidate ID (without the colon). The password for the new
candidate equals that of the native candidate ID in the matching
candidate pair.
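
The response-side check differs from the request side mainly in what
is compared and in the order of concatenation. A minimal Python
sketch, using a hypothetical helper name and modeling transport
addresses as (IP, port) tuples from the documentation ranges:

```python
def candidate_id_from_response(mapped_address, native_id, remote_id,
                               native_addresses):
    """Return the ID of the candidate learned from a Binding Response.

    native_addresses holds every (ip, port) in all native candidates.
    Returns None when MAPPED-ADDRESS is already a known native address.
    """
    if mapped_address in native_addresses:
        return None
    # Native candidate ID followed by remote candidate ID, no colon.
    return native_id + remote_id


known = {("192.0.2.1", 5000), ("192.0.2.1", 5001)}
print(candidate_id_from_response(("198.51.100.1", 40001), "L", "R", known))
print(candidate_id_from_response(("192.0.2.1", 5000), "L", "R", known))
```

For the pair of Figure 5, a Binding Response carrying an unknown
MAPPED-ADDRESS yields the candidate ID LR on the agent that owns L,
while the same pair yields RL on that agent when learning from a
Binding Request.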
1708 On receipt of a STUN Binding Response whose MAPPED-ADDRESS didn't 1709 match the transport address in any native candidate, the agent 1710 constructs the candidate ID that represents the peer reflexive 1711 candidate, and checks to see if that candidate exists. It may 1712 already exist if it had been constructed as a consequence of a 1713 previous application of this logic on receipt of a Binding Response 1714 for a different transport address pair of the same candidate pair. 1715 If there is not yet a peer derived candidate with that candidate ID, 1716 the agent creates it, and assigns it the newly computed candidate ID. 1717 The priority of the new candidate MUST be set to the priority of the 1718 generating candidate - the native candidate in the matching transport 1719 address pair. Note that, at this time, the peer derived candidate 1720 has no transport addresses in it. 1722 Newly created or not, the agent extracts the component ID from the 1723 matching transport address pair, and sees if a transport address with 1724 that same component ID exists in the peer reflexive candidate. If 1725 not (and it shouldn't), the agent adds a transport address to the 1726 peer reflexive candidate. This transport address is equal to the 1727 MAPPED-ADDRESS from the STUN Binding Response. It is assigned the 1728 component ID equal to the component ID in the matching transport 1729 address pair. This transport address will have a tid, equal to the 1730 concatenation of the candidate ID for this new candidate, and the 1731 component ID, separated by a colon. 1733 The peer-derived candidate becomes usable once the number of 1734 transport addresses in it equals the transport address pair count of 1735 candidate pair from which it is derived. Initially, the peer-derived 1736 candidate will start with a single transport address. More are added 1737 as the connectivity checks for the original candidate pair take 1738 place. 
Once the peer-derived candidate becomes usable, it has to be 1739 paired up with remote candidates. However, unlike the procedures of 1740 Section 7.5, which pair up each remote candidate with each native 1741 candidate, the peer-derived candidate is only paired up with the 1742 remote candidate from the matching candidate pair. This creates a 1743 new candidate pair, and a set of new transport address pairs. 1745 Recall that, for each candidate pair, one agent plays the role of 1746 offerer, and the other of answerer. For a peer-reflexive candidate, 1747 the role is identical to that of its generating candidate. 1749 The new transport address pairs have a state machine associated with 1750 them. The state that is entered, and actions to take as a 1751 consequence, are specific to the transport protocol. For UDP, the 1752 procedures are defined here. Extensions that define processing for 1753 other transport protocols SHOULD describe the behavior. 1755 For UDP, the state machine enters the Recv-Valid state. Effectively, 1756 the Binding Response just received "counts" as a validation in this 1757 direction, even though it was formally done for a different candidate 1758 pair. The transport address pairs are inserted into the ordered list 1759 of pairs based on the ordering described in Section 7.5, and 1760 processing follows the logic described in Section 7.6. 1762 7.11 Subsequent Offer/Answer Exchanges 1764 An agent MAY issue an updated offer at any time. This updated offer 1765 may be sent for reasons having nothing to do with ICE processing (for 1766 example, the addition of a video stream in a multimedia session), or 1767 it may be due to a change in ICE-related parameters. For example, if 1768 an agent acquires a new candidate after the initial offer/answer 1769 exchange, it may seek to add it. 
1771 However, agents SHOULD follow the logic described in Section 7.9 to 1772 determine when to send an updated offer as a consequence of promoting 1773 a candidate to active. 1775 If there are any aspects of this processing that are specific to the 1776 transport protocol, those SHOULD be called out in ICE extensions that 1777 define operation with other transport protocols. There are no 1778 additional considerations for UDP. 1780 7.11.1 Sending of a Subsequent Offer 1782 The offer MAY contain a new active candidate in the m/c line. This 1783 candidate SHOULD be the native candidate from the highest candidate 1784 pair in the candidate pair priority ordered list whose state is 1785 Valid. If there are no candidate pairs in this state, the highest 1786 one whose state is Send-Valid or Recv-Valid SHOULD be used. If there 1787 are no candidate pairs in these states, the candidate that is 1788 most likely to work with this peer, as described in Section 7.2, 1789 SHOULD be used. The candidate is encoded into the m/c line in an 1790 updated offer as described in Section 7.3. 1792 If the candidate pair whose native candidate was encoded into the 1793 m/c-line was Valid, Send-Valid or Recv-Valid, the agent MUST include 1794 an a=remote-candidate attribute in the offer. This attribute MUST 1795 contain the candidate ID of the remote candidate in the candidate 1796 pair. It is used by the recipient of the offer in selecting its 1797 candidate for the answer. 1799 The a=candidate attributes within a subsequent offer have 1800 the same meaning as they do in an initial offer. They are a request 1801 for the peer to attempt (or continue to attempt if the candidate was 1802 provided previously) a connectivity check using STUN from each of its 1803 own candidates.
When an updated offer is sent, there are several 1804 dispositions regarding the candidates: 1806 retained: A candidate is retained if the candidate ID for the 1807 candidate is included in the new offer, and matches the candidate 1808 ID for a candidate in the previous offer or answer from the agent. 1809 In this case, all of the information about the candidate (its 1810 qvalue and components, and the IP addresses, ports, and transport 1811 protocols of its components) MUST be the same as in the previous 1812 offer or answer from the agent. If the agent wants to change 1813 them, this is accomplished by changing the candidate ID as well. 1814 That will have the effect of removing the old candidate and adding 1815 a new one with the updated information. 1817 removed: A candidate is removed if its candidate ID appeared in a 1818 previous offer or answer, and that candidate ID is not present in 1819 the new offer. 1821 added: A candidate is added if its candidate ID appears in the new 1822 offer, but was not present in a previous offer or answer from that 1823 agent. 1825 The following rules are used to determine the disposition of each 1826 of the current native candidates in the new offer: 1828 o If a candidate is invalid, and all peer reflexive candidates 1829 generated from it are invalid as well, it SHOULD be removed. 1831 o If the candidate in the m/c-line is valid, all other candidates 1832 SHOULD be removed. This has the effect of stopping connectivity 1833 checks of other candidates. This recommendation would not be followed if 1834 an agent wanted to keep a candidate ready for usage should, for 1835 some reason, the active candidate later become invalid. 1837 o If the candidate in the m/c-line is valid, and it is not peer 1838 reflexive, that candidate MUST be retained. If the candidate in 1839 the m/c-line is peer reflexive, its generating candidate MUST be 1840 retained, even if it is itself invalid.
1842 o If the candidate in the m/c-line has not been validated, all other 1843 candidates that are not invalid, or candidates for whom their 1844 derived candidates are not invalid, SHOULD be retained. 1846 o Peer reflexive candidates MUST NOT be added; they continue to be 1847 used as long as their generating candidate was retained. Peer 1848 derived candidates are learned exclusively through the STUN 1849 connectivity checks. 1851 A new candidate MAY be added. This can happen when the candidate is 1852 a new one, learned since the previous offer/answer exchange, and it 1853 has a higher priority than the currently active candidate. It can 1854 also occur when an agent wishes to restart checks for a transport 1855 address it had tried previously. Effectively, changing the candidate 1856 ID value in an updated offer will "restart" connectivity checks for 1857 that candidate. 1859 If a candidate is removed, the agent takes the following steps once 1860 the offer is sent: 1862 1. The agent eliminates any candidate pairs whose native candidate 1863 equalled the candidate that was removed. Equality is based on 1864 comparison of candidate IDs. 1866 2. The agent eliminates any candidate pairs that had a native 1867 candidate that is a peer reflexive candidate generated from the 1868 candidate that was removed. 1870 3. The candidate pairs that are eliminated are removed from the 1871 candidate pair priority ordered list and candidate pair check 1872 ordered list. As a consequence of this, if connectivity checks 1873 had not yet begun for the candidate pair, they won't. 1875 4. If connectivity checks were already in progress for transport 1876 addresses in a candidate pair that was removed, the agent SHOULD 1877 immediately terminate them. No further retransmissions take 1878 place, and no further transactions from that candidate will be 1879 made. 1881 5. 
If the removed candidate was a relayed candidate, the agent 1882 SHOULD de-allocate its transport addresses from the STUN relay if 1883 it is not using those resources elsewhere. If a local candidate 1884 was removed, and all of its derived candidates were also removed 1885 (including any peer reflexive candidates), local operating system 1886 resources for each of the transport addresses in the local 1887 candidate SHOULD be de-allocated, as long as the agent is not using 1888 those resources elsewhere. The resources may be in use elsewhere 1889 if they were included in an initial offer which generated 1890 multiple answers (as can happen with SIP forking). In such a 1891 case, a subsequent offer which removes the candidate will not 1892 imply its removal from the other branches; each becomes a 1893 separate offer/answer relationship. 1895 Subsequent offers MUST contain the a=ice-pwd attribute. This SHOULD 1896 have the same value as in previous offers. However, an agent MAY 1897 change it if, for some reason, the agent believes that the password 1898 may have been compromised. Since the same password is applied across 1899 all transport addresses in all candidates for all media streams, a 1900 change in the password impacts all of them. An agent MUST be 1901 prepared to receive connectivity checks that use either the new or 1902 old password until Tpw seconds after it receives the answer. Tpw 1903 SHOULD be configurable, and SHOULD default to 2 seconds. 1905 7.11.2 Receiving the Offer and Sending an Answer 1907 To generate the answer, the answerer has to decide which transport 1908 addresses to include in the m/c line, and which to include in 1909 candidate attributes. 1911 The first step in the process is to look for the a=remote-candidate 1912 attribute in the offer. The a=remote-candidate attribute exists to eliminate a 1913 race condition between the updated offer and the response to the STUN 1914 Binding Request that moved a candidate into the Valid state.
This 1915 race condition is shown in Figure 6. On receipt of message 5, agent 1916 A can move its transport address pair state machine into the Valid 1917 state. It sends a STUN response to the request (message 6), but this 1918 is lost. Agent A proceeds with an updated offer (message 7), which 1919 is received at agent B. As far as agent B is concerned, the transport 1920 address pair is still in the Send-Valid state. It will move into the 1921 Valid state only on receipt of the STUN response in message 10. 1922 Thus, upon receipt of the offer, agent B cannot determine which 1923 candidate to include in its answer. To eliminate this condition, the 1924 identity of the validated candidate is included in the offer itself. 1925 Note, however, that the answerer will not send media until it has 1926 received this STUN response. 1928 Agent A Network Agent B 1929 |(1) Offer | | 1930 |------------------------------------------>| 1931 |(2) Answer | | 1932 |<------------------------------------------| 1933 |(3) STUN Req. | | 1934 |------------------------------------------>| 1935 |(4) STUN Res. | | 1936 |<------------------------------------------| 1937 |(5) STUN Req. | | 1938 |<------------------------------------------| 1939 |(6) STUN Res. | | 1940 |-------------------->| | 1941 | |Lost | 1942 |(7) Offer | | 1943 |------------------------------------------>| 1944 |(8) Answer | | 1945 |<------------------------------------------| 1946 |(9) STUN Req. | | 1947 |<------------------------------------------| 1948 |(10) STUN Res. | | 1949 |------------------------------------------>| 1951 Figure 6 1953 If the a=remote-candidate attribute is present, the agent examines 1954 the transport addresses in the m/c-line of the offer. It compares 1955 these with the transport addresses in the remote candidates of all 1956 candidate pairs. 
If there is at least one match, the agent compares 1957 the native candidate ID of each matching pair with the value of the 1958 a=remote-candidate attribute. If there is a match, that candidate 1959 pair is selected. For each transport address pair in that candidate 1960 pair, if the state of the transport address pair is Send-Valid, the 1961 agent considers the state to be Valid just for the purpose of 1962 selecting the m/c-line as discussed in the paragraph below. The 1963 actual state MUST remain Send-Valid. This is necessary to protect 1964 against DoS attacks. 1966 Rules for choosing transport addresses for the m/c-line are as 1967 follows. The agent examines the transport addresses in the m/c-line 1968 of the offer. It compares these with the transport addresses in the 1969 remote candidates of candidate pairs whose states are Valid. If 1970 there is a matching candidate pair in that state, the pair with the 1971 highest priority MUST be chosen, and the native candidate from that 1972 pair used as the active candidate. If there are no matching 1973 candidate pairs in the Valid state, the candidate that is most likely 1974 to work with this peer, as described in Section 7.2, SHOULD be used. 1976 Like the offerer, the answerer can decide, for each of its 1977 candidates, whether it is retained or removed. The same rules 1978 defined in Section 7.11.1 for determining their disposition apply to 1979 the answerer. Similarly, if a candidate is removed, the same rules 1980 in Section 7.11.1 regarding removal of candidate pairs and freeing 1981 of resources apply. 1983 Once the answer is sent, the answerer will have the set of native and 1984 remote candidates before this offer/answer exchange, and the set of 1985 native and remote candidates afterwards. A peer derived candidate 1986 continues to be used as long as its generating parent continues to be 1987 used. The agent then pairs up the native and remote candidates which 1988 were added or retained.
This leads to a set of current candidate 1989 pairs. 1991 If a candidate pair existed previously, but as a consequence of the 1992 offer/answer exchange, it no longer exists, the agent takes the 1993 following steps: 1995 1. The candidate pair is removed from the candidate pair priority 1996 ordered list and candidate pair check ordered list. As a 1997 consequence of this, if connectivity checks had not yet begun for 1998 the candidate pair, they will not begin. 2000 2. If connectivity checks were already in progress for that 2001 candidate pair, the agent SHOULD immediately terminate any STUN 2002 transactions in progress from that candidate. No further 2003 retransmissions take place, and no further transactions from that 2004 candidate will be made. 2006 3. If the agent receives a STUN Binding Request for that candidate 2007 pair, the agent SHOULD generate a 430 response. 2009 If a candidate pair existed previously, and continues to exist, no 2010 changes are made; any STUN transactions in progress for that 2011 candidate pair continue, and it remains on the candidate pair 2012 priority ordered list and candidate pair check ordered list. 2014 If a candidate pair is new (because either its native candidate is 2015 new, or its remote candidate is new, or both), the agent takes the 2016 role of answerer for this candidate pair. The new candidate pair is 2017 inserted into the candidate pair priority ordered list and candidate 2018 pair check ordered list. STUN connectivity checks will start for 2019 it based on the logic described in Section 7.6. 2021 7.11.3 Receiving the Answer 2023 Once the answer is received, the offerer will have the set of native 2024 and remote candidates before this offer/answer exchange, and the set 2025 of native and remote candidates afterwards. It then follows the same 2026 logic described in Section 7.11.2, pairing up the candidates, 2027 removing pairs that are no longer in use, and beginning processing 2028 for ones that are new.
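The bookkeeping that both agents perform after a subsequent offer/answer exchange (Sections 7.11.2 and 7.11.3) can be sketched as a set difference over candidate pairs. This is a simplified illustration under stated assumptions: candidate pairs are identified purely by a (native candidate ID, remote candidate ID) tuple, the function names are hypothetical, and the special pairing of peer derived candidates with a single remote candidate is not modeled.

```python
from typing import Dict, Set, Tuple

Pair = Tuple[str, str]  # (native candidate ID, remote candidate ID)


def pair_up(native_ids: Set[str], remote_ids: Set[str]) -> Set[Pair]:
    """Pair every added-or-retained native candidate with every remote one."""
    return {(n, r) for n in native_ids for r in remote_ids}


def diff_pairs(previous: Set[Pair], current: Set[Pair]) -> Dict[str, Set[Pair]]:
    """Split pairs into the three cases the text distinguishes.

    "eliminated": existed before but not now; these are removed from both
                  ordered lists and their STUN transactions are stopped.
    "continuing": existed before and still exist; nothing changes.
    "new":        did not exist before; these are inserted into both
                  ordered lists and checks start per Section 7.6.
    """
    return {
        "eliminated": previous - current,
        "continuing": previous & current,
        "new": current - previous,
    }
```

As a usage example, if the native candidates change from {L1, L2} to {L1, L3} while the remote candidate R1 is retained, the pair (L2, R1) is eliminated, (L1, R1) continues unchanged, and (L3, R1) is new.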
2030 7.12 Binding Keepalives 2032 Once a candidate is promoted to active, and media begins flowing, it 2033 is still necessary to keep the bindings alive at intermediate NATs 2034 for the duration of the session. Normally, the media stream packets 2035 themselves (e.g., RTP) meet this objective. However, several cases 2036 merit further discussion. Firstly, in some RTP usages, such as SIP, 2037 the media streams can be "put on hold". This is accomplished by 2038 using the SDP "sendonly" or "inactive" attributes, as defined in RFC 2039 3264 [4]. RFC 3264 directs implementations to cease transmission of 2040 media in these cases. However, doing so may cause NAT bindings to 2041 time out, and media won't be able to come off hold. 2043 Secondly, some RTP payload formats, such as the payload format for 2044 text conversation [36], may send packets so infrequently that the 2045 interval exceeds the NAT binding timeouts. 2047 Thirdly, if silence suppression is in use, long periods of silence 2048 may cause media transmission to cease long enough for NAT 2049 bindings to time out. 2051 To prevent these problems, ICE implementations MUST continue to list 2052 their active candidate in a=candidate lines for UDP-based media 2053 streams. As a consequence of this, STUN packets will be transmitted 2054 periodically, independently of the transmission (or lack thereof) of 2055 media packets. This provides a media independent, RTP independent, 2056 and codec independent solution for keeping the NAT bindings alive. 2058 If an ICE implementation is communicating with one that does not 2059 support ICE, keepalives MUST still be sent. Indeed, these keepalives 2060 are essential even if neither endpoint implements ICE. As such, this 2061 specification defines keepalive behavior generally, for endpoints 2062 that support ICE, and those that do not. 2064 All endpoints MUST send keepalives for each media session.
These 2065 keepalives MUST be sent regardless of whether the media stream is 2066 currently inactive, sendonly, recvonly or sendrecv. The keepalive 2067 SHOULD be sent using a format which is supported by its peer. ICE 2068 endpoints allow for STUN-based keepalives for UDP streams, and as 2069 such, STUN keepalives MUST be used when an agent is communicating 2070 with a peer that supports ICE. An agent can determine that its peer 2071 supports ICE by the presence of the a=candidate attributes for each 2072 media session. If the peer does not support ICE, the choice of a 2073 packet format for keepalives is a matter of local implementation. A 2074 format which allows packets to easily be sent in the absence of 2075 actual media content is RECOMMENDED. Examples of formats which 2076 readily meet this goal are RTP No-Op [31] and RTP comfort noise [26]. 2078 STUN-based keepalives will be sent every Tr seconds as a 2079 consequence of the rules in Section 7.7. If STUN keepalives are 2080 not in use (because the peer does not support ICE), an agent SHOULD 2081 ensure that a media packet is sent every Tr seconds. If one is not 2082 sent as a consequence of normal media communications, a keepalive 2083 packet using one of the formats discussed above SHOULD be sent. 2085 7.13 Sending Media 2087 When an agent receives an offer and sends an answer, or when it 2088 receives an answer to an offer it sent, it begins connectivity 2089 checks. These checks will include validation of the active candidate 2090 pair, if there is one. An agent SHOULD NOT send media on the active 2091 candidate pair until that candidate pair has reached the Valid or 2092 Recv-Valid state. This is to help prevent a denial-of-service 2093 attack, described in Section 13. Once the active candidate pair 2094 reaches the Valid or Recv-Valid state, an agent MAY start sending 2095 media to that candidate pair.
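The gating rule in the paragraph above can be expressed as a small predicate. This is a minimal sketch, assuming a hypothetical PairState enumeration that lists only the four states named in this section (the full state machine is defined elsewhere in the document), and treating the SHOULD NOT as a strict rule for simplicity.

```python
from enum import Enum
from typing import Optional


class PairState(Enum):
    """Subset of candidate pair states named in this section (assumed enum)."""
    INVALID = "Invalid"
    SEND_VALID = "Send-Valid"
    RECV_VALID = "Recv-Valid"
    VALID = "Valid"


def may_send_media(active_pair_state: Optional[PairState]) -> bool:
    """Return True once the active candidate pair is Valid or Recv-Valid.

    None represents the case where there is no active candidate pair yet.
    """
    return active_pair_state in (PairState.VALID, PairState.RECV_VALID)
```

Note that Send-Valid alone is not sufficient under this rule: the agent has not yet confirmed that the peer is actually reachable at, and expecting media on, the remote transport address.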
2097 However, offer/answer exchanges are used with protocols, like SIP, 2098 which require media to be sent "early", from the answerer to the 2099 offerer, prior to completion of the initial offer/answer exchange. It 2100 is highly desirable (and sometimes necessary) for this early media to 2101 use the candidate pair ultimately selected by ICE connectivity 2102 checks. For this reason, ICE provides an early media mechanism that 2103 allows for a candidate pair to be used in one direction prior to its 2104 promotion to active in a subsequent offer/answer exchange. Note 2105 that, with ICE, early media pertains to media sent to a candidate 2106 pair until its promotion to active in a subsequent offer/answer 2107 exchange. This is a broader definition than is used in [29], which 2108 defines early media as media sent prior to acceptance of a call. 2110 As a consequence of the connectivity checks, an agent will change the 2111 states for each transport address pair, and consequently, for the 2112 candidate pairs. When a candidate pair becomes Valid or Recv-Valid, 2113 and the candidate pair is not equal to the active candidate pair, and 2114 the agent is in the role of answerer for that candidate pair, the 2115 agent checks the position of that pair in the candidate pair priority 2116 ordered list. If it is the first, the agent selects this candidate 2117 pair for early media. If this candidate pair is not the first on the 2118 candidate pair priority ordered list, but is higher priority than the 2119 active candidate pair, and the early media wait-state timer has not 2120 yet been set, the agent sets this timer to Tws seconds. Tws SHOULD 2121 be configurable, and SHOULD have a default of 100 ms. This timer 2122 allows for a higher priority connectivity check to complete, in the 2123 event its STUN Binding Request or Response was lost or delayed in the 2124 network.
If, prior to the wait-state timer firing, another 2125 connectivity check completes and a candidate pair enters the Valid or 2126 Recv-Valid states, there is no need to reset or cancel the timer. 2127 Once the timer fires, the agent SHOULD select the highest priority 2128 candidate pair in the Valid or Recv-Valid state for which the agent 2129 has the role of answerer, and use that candidate pair for early 2130 media. 2132 ICE processing will ensure that, under almost all circumstances, the 2133 candidate pair selected by the answerer for early media will also be 2134 the one selected by the offerer for eventual promotion to active. 2135 The early media state implies that the answerer knows that this 2136 candidate pair is to be used, but the offerer doesn't know yet that 2137 it will eventually be validated. It is for this reason that the 2138 candidate pair can be used for early media. 2140 If a candidate pair is selected for early media, an agent MAY send 2141 media on that candidate pair, even if it is not the same as the 2142 active candidate pair. However, to deal with cases in which the 2143 offerer and answerer do not agree on the eventual selection of this 2144 candidate for promotion to active (a rare but possible case), the 2145 agent MUST discontinue using the candidate pair for sending media Tlo 2146 seconds after the answer has been reliably delivered. An answer is 2147 considered reliably delivered when the agent receives a confirmation 2148 that it has been delivered. In the case of an answer delivered in a 2149 200 OK to an offer in an INVITE (in the SIP case), the answer is 2150 considered reliably delivered upon receipt of the ACK. Tlo SHOULD be 2151 configurable and SHOULD have a default of 5 seconds.
This time 2152 represents the amount of time it should take the offerer to perform 2153 its connectivity checks, arrive at the same conclusion about the 2154 viability of the early candidate, and then generate an updated offer 2155 promoting it to active. If, after Tlo seconds, no updated offer 2156 has arrived, the answerer MUST cease using the early candidate. Media 2157 MAY be sent to the active candidate pair if it is in the Valid or 2158 Recv-Valid state. 2160 If an updated offer does arrive prior to the expiration of the timer, 2161 the agent MUST execute the procedures in Section 7.11.2, which will 2162 result in the selection of a candidate for the m/c-line in the 2163 answer. At that point, the procedures of this section SHOULD be 2164 restarted by the answerer. This implies that the active candidate 2165 pair, if Valid or Recv-Valid, will be used. If a higher priority 2166 candidate pair subsequently enters the Valid or Recv-Valid state, it 2167 may end up being used as an early candidate. 2169 To use a candidate pair, whether it is early or active, an agent sends media 2170 to the IP addresses and ports of the components in the remote 2171 candidate, and sends that media from the IP addresses and ports of 2172 the components in the native candidate. Transport addresses are 2173 paired up based on component ID. For example, if a remote candidate 2174 has two components R1 and R2, and the native candidate has two 2175 components L1 and L2, media packets are sent from L1 to R1 and from 2176 L2 to R2. This provides a property known as symmetry. This 2177 symmetric behavior MUST be followed by an agent even if its peer in 2178 the session doesn't support ICE. 2180 The definition of sending media "from" a particular transport address 2181 depends on the type of transport address. In the case of a server 2182 reflexive transport address, this means that the RTP packets are sent 2183 from the local transport address used to obtain the STUN address.
In 2184 the case of a relayed transport address, this means that media 2185 packets are sent through the relay server (for STUN relays, this 2186 would be using the Send request). For local transport addresses, 2187 media is sent from that local transport address. For peer reflexive 2188 transport addresses, media is sent from the local transport address 2189 used to obtain the reflexive address. 2191 ICE has interactions with jitter buffer adaptation mechanisms. An 2192 RTP stream can begin using one candidate, and switch to another one. 2193 The newer candidate may result in RTP packets taking a different path 2194 through the network - one with different delay characteristics. To 2195 signal to the jitter buffers that this change has happened, it is 2196 RECOMMENDED that, when an agent switches transmission of media from 2197 one candidate pair to another, it sets the RTP marker bit. 2198 Furthermore, it is RECOMMENDED that, upon receipt of an RTP packet 2199 with the marker bit set, or upon receipt of a packet with a different 2200 source IP address, that the agent re-adjust its jitter buffers. 2202 8. Guidelines for Usage with SIP 2204 SIP [2] makes use of the offer/answer model, and is one of the 2205 primary targets for usage of ICE. SIP allows for offer/answer 2206 exchanges to occur in many different combinations of messages, 2207 including INVITE/200 OK and 200 OK/ACK. When support for reliable 2208 provisional responses (RFC 3262 [11]) and UPDATE (RFC 3311 [27]) are 2209 added, additional combinations of messages that can be used for 2210 offer/answer exchanges are added. As such, this section provides 2211 some guidance on good ways to make use of SIP with ICE. 2213 ICE requires a series of STUN-based connectivity checks to take place 2214 between endpoints. These checks start from the answerer on 2215 generation of its answer, and start from the offerer when it receives 2216 the answer. 
These checks can take time to complete, and as such, the 2217 selection of messages to use with offers and answers can affect 2218 perceived user latency. Two latency figures are of particular 2219 interest. These are the post-pickup delay and the post-dial delay. 2220 The post-pickup delay refers to the time between when a user "answers 2221 the phone" and when any speech they utter can be delivered to the 2222 caller. The post-dial delay refers to the time between when a user 2223 enters the destination address and when ringback begins as a 2224 consequence of having successfully started ringing the phone of the 2225 called party. 2227 To reduce post-dial delays, it is RECOMMENDED that the caller begin 2228 gathering candidates prior to actually sending its initial INVITE. 2229 This can be started upon user interface cues that a call is pending, 2230 such as activity on a keypad or the phone going offhook. 2232 To reduce post-pickup delays, ICE allows for media to be sent from 2233 the answerer to the offerer on a candidate pair, prior to its 2234 promotion to active. However, this requires the answerer to have 2235 generated its answer and sent it. In most cases, it will require 2236 this answer to be received by the offerer. The reason is that 2237 connectivity checks or RTP packets from the answerer to the offerer 2238 will not be forwarded by NATs towards the offerer until the offerer 2239 has established a permission in the NAT by generating a packet 2240 towards the answerer. 2242 For this reason, if an offer is received in an INVITE request, the 2243 UAS SHOULD immediately gather its candidates and then generate an 2244 answer in a provisional response. When reliable provisional 2245 responses are not used, the SDP in the provisional response is not 2246 formally the answer; the value in the 200 OK is the actual answer.
2247 However, RFC 3261 allows for SDP to appear in an unreliable 2248 provisional response, in which case its value has to be identical to 2249 the value placed in the 200 OK. Thus, we refer to the SDP in the 2250 provisional response, even when unreliable, as the answer. To deal 2251 with possible losses of the provisional response, it SHOULD be 2252 retransmitted until there is some indication of receipt. This indication can 2253 either be through PRACK [11], or through the receipt of a STUN 2254 Binding Request with a correct username and password. Furthermore, 2255 once the answer has been sent, the agent SHOULD begin its 2256 connectivity checks. Once a candidate reaches the Valid or Recv- 2257 Valid state, the UAS has a known-valid path for media packets towards 2258 the UAC. This point is called the connected point in ICE. 2260 Once the UAS reaches the connected point, media can be sent from the 2261 UAS towards the UAC without any additional delays. However, between 2262 the receipt of the INVITE and the connected point, any media that 2263 needs to be sent towards the caller (such as SIP early media [29]) 2264 cannot be transmitted. For this reason, implementations MAY choose 2265 to delay alerting the called party until the connected point is 2266 reached. In the case of a PSTN gateway, this would mean that the 2267 setup message into the PSTN is delayed until the connected point. 2268 Doing this increases the post-dial delay, but has the effect of 2269 eliminating 'ghost rings'. Ghost rings are cases where the called 2270 party hears the phone ring, picks up, but hears nothing and cannot be 2271 heard. This technique works without requiring support for, or usage 2272 of, preconditions [7], since it is a localized decision. It also has 2273 the benefit of guaranteeing that not a single packet of early media 2274 will get clipped. If an agent chooses to delay local alerting in 2275 this way, it SHOULD generate a 180 response once alerting begins.

   A slight variation of the delayed-alerting approach is to wait for
   a connectivity check to succeed to a higher priority candidate pair
   than the active one. This allows the agent to only ever send
   media, early or otherwise, to a single candidate, which will work
   better with jitter buffers, at the expense of even greater
   post-dial delays.

   Note that, prior to the promotion of a candidate pair to active,
   the offerer will not be able to send using the candidate pair.
   When used with SIP, if the initial offer is sent in the INVITE, and
   the answer is sent in both the provisional response and the final
   200 OK response, the offerer will not be able to send media until
   it sends a re-INVITE and receives the 200 OK response to that
   re-INVITE. This can take several hundred milliseconds. If this
   latency is an issue (it is generally not considered an issue for
   voice systems), reliable provisional responses [11] MAY be used, in
   which case an UPDATE [27] can be used to send an updated offer
   prior to the call being answered.

   As discussed in Section 13, offer/answer exchanges SHOULD be
   secured against eavesdropping and man-in-the-middle attacks. To do
   that, the usage of SIPS [2] is RECOMMENDED when used in concert
   with ICE.

9. Interactions with Forking

   SIP allows INVITE requests carrying offers to fork, which means
   that they are delivered to multiple user agents. Each of those
   user agents then provides an answer to the offer in the INVITE.
   The result is that a single offer generated by the UAC produces
   multiple answers.

   ICE interacts very well with forking. Indeed, ICE fixes some of
   the problems associated with forking. Once the offer/answer
   exchange has completed, the UAC will have an answer from each UAS
   that received the INVITE.
   The ICE connectivity checks that ensue will carry transport
   address pair IDs that correlate each of those checks (and thus
   their corresponding IP addresses and ports) with a specific remote
   user agent. As these checks happen before any media is
   transmitted, ICE allows a UAC to disambiguate subsequent media
   traffic by looking at the source IP address and port, and then
   correlate that traffic with a particular remote UA. When SIP is
   used without ICE, the incoming media traffic cannot be
   disambiguated without an additional offer/answer exchange.

10. Interactions with Preconditions

   Because ICE involves multiple addresses and pre-session activities,
   its interactions with preconditions merit further discussion.

   Quality of Service (QoS) preconditions, which are defined in RFC
   3312 [7] and RFC 4032 [8], apply only to the IP addresses and ports
   listed in the m/c lines in an offer/answer. If ICE changes the
   address and port where media is received, this change is reflected
   in the m/c lines of a new offer/answer. As such, it appears like
   any other re-INVITE would, and is fully treated in RFC 3312 and
   RFC 4032, which apply without regard to the fact that the m/c
   lines are changing due to ICE negotiations occurring "in the
   background".

   However, usage of early candidates with QoS preconditions is NOT
   RECOMMENDED, since QoS will only be reserved for the candidate pair
   in the m/c-line. An agent SHOULD only send to the active candidate
   (once it enters the Valid or Recv-Valid states) if QoS
   preconditions are used for a media session.

   ICE also has (purposeful) interactions with connectivity
   preconditions [30]. Those interactions are described there.

11. Examples

   This section provides two examples. One is a very basic example,
   and the other is more elaborate. A common configuration and setup
   is used in both cases.

   Two agents, L and R, are using ICE. Both agents have a single IPv4
   interface, and are configured with a single STUN server each
   (indeed, the same one for each). This STUN server supports both
   the Binding Discovery usage and the Relay usage. Agent L is behind
   a NAT, and agent R is on the public Internet.

   To facilitate understanding, transport addresses are listed in a
   mnemonic form. This form is entity-type-seqno, where entity refers
   to the entity whose interface the transport address is on, and is
   one of "L", "R", "STUN", or "NAT". The type is either "PUB" for
   transport addresses that are public, or "PRIV" for transport
   addresses that are private. Finally, seqno is a sequence number
   that is different for each transport address of the same type on a
   particular entity.

   The STUN server has advertised transport address STUN-PUB-1 for
   both the binding discovery usage and the relay usage.

   In addition, candidate IDs are also listed in mnemonic form. Agent
   L uses candidate ID L1 for its local candidate, L2 for its server
   reflexive candidate, and L3 for its relayed candidate. Agent R
   uses R1 for its local candidate and R2 for its relayed candidate.
   The password is LPASS for each candidate from agent L, and RPASS
   for each candidate from agent R.

   In example SDP messages, $TADDR.IP is used to refer to the value of
   the IP address of the transport address with mnemonic name "TADDR".
   Similarly, $TADDR.PORT is used to refer to the value of the port of
   the transport address with mnemonic name "TADDR".

   In the call flow itself, STUN messages are annotated with several
   attributes. The "S=" attribute indicates the source transport
   address of the message. The "D=" attribute indicates the
   destination transport address of the message.
   The "MA=" attribute is used in STUN Binding Response messages,
   STUN Binding Response messages carried in a STUN Send Request or
   Data Indication, and in an Allocate Response, and refers to the
   value of the MAPPED-ADDRESS attribute. The "RA=" attribute is used
   in STUN Data Indications, and refers to the value of the
   REMOTE-ADDRESS attribute. The "U=" attribute is used in STUN
   Requests, and corresponds to the STUN USERNAME. The "DA="
   attribute is used in STUN Send requests, and refers to the value of
   the DESTINATION-ADDRESS attribute. The "R=" attribute is used in
   Allocate responses, and it indicates the value of the
   RELAY-ADDRESS attribute.

   The call flow examples omit STUN authentication operations.

11.1 Basic Example

   In this example, the NAT has the address and port independent
   mapping property and the address dependent permission property.
   Neither agent is using the STUN relay usage, only the binding
   discovery usage. As a consequence, agent L will end up with two
   candidates - a local candidate and a server reflexive candidate.
   Agent R will have one - a local candidate (the reflexive candidate
   will be identical to the local one, and thus discarded). The
   agents are seeking to communicate using a single RTP-based voice
   stream. RTCP is not used. As a consequence, each candidate has
   one component.

   L             NAT            STUN            R
   |              |              |              |
   |              |              |              |
   |              |              |              |
   |RTP STUN alloc.              |              |
   |              |              |              |
   |              |              |              |
   |              |              |              |
   |(1) STUN Req  |              |              |
   |S=L-PRIV-1    |              |              |
   |D=STUN-PUB-1  |              |              |
   |------------->|              |              |
   |              |              |              |
   |              |              |              |
   |              |(2) STUN Req  |              |
   |              |S=NAT-PUB-1   |              |
   |              |D=STUN-PUB-1  |              |
   |              |------------->|              |
   |              |              |              |
   |              |(3) STUN Res  |              |
   |              |S=STUN-PUB-1  |              |
   |              |D=NAT-PUB-1   |              |
   |              |MA=NAT-PUB-1  |              |
   |              |<-------------|              |
   |              |              |              |
   |(4) STUN Res  |              |              |
   |S=STUN-PUB-1  |              |              |
   |D=L-PRIV-1    |              |              |
   |MA=NAT-PUB-1  |              |              |
   |<-------------|              |              |
   |              |              |              |
   |              |              |              |
   |              |              |              |
   |              |              |              |
   |(5) Offer     |              |              |
   |------------------------------------------->|
   |              |              |              |
   |              |              |              |
   |              |              |              |
   |              |              |              |
   |              |              |              |RTP STUN alloc.
   |              |              |              |
   |              |              |              |
   |              |              |              |
   |              |              |(6) STUN Req  |
   |              |              |S=R-PUB-1     |
   |              |              |D=STUN-PUB-1  |
   |              |              |<-------------|
   |              |              |              |
   |              |              |(7) STUN Res  |
   |              |              |S=STUN-PUB-1  |
   |              |              |D=R-PUB-1     |
   |              |              |MA=R-PUB-1    |
   |              |              |------------->|
   |              |              |              |
   |              |              |              |
   |              |              |              |
   |              |              |              |
   |(8) answer    |              |              |
   |<-------------------------------------------|
   |              |              |              |
   |              |              |              |
   |(9) Bind Req  |              |              |
   |S=L-PRIV-1    |              |              |
   |D=R-PUB-1     |              |              |
   |------------->|              |              |
   |              |              |              |
   |              |              |              |
   |              |(10) Bind Req |              |
   |              |S=NAT-PUB-1   |              |
   |              |D=R-PUB-1     |              |
   |              |---------------------------->|
   |              |              |              |
   |              |(11) Bind Res |              |
   |              |S=R-PUB-1     |              |
   |              |D=NAT-PUB-1   |              |
   |              |MA=NAT-PUB-1  |              |
   |              |<----------------------------|
   |              |              |              |
   |(12) Bind Res |              |              |
   |S=R-PUB-1     |              |              |
   |D=L-PRIV-1    |              |              |
   |MA=NAT-PUB-1  |              |              |
   |<-------------|              |              |
   |              |              |              |
   |              |              |              |
   |              |              |              |
   |              |              |              |
   |RTP flows     |              |              |
   |              |              |              |
   |              |              |              |
   |              |              |              |
   |              |(13) Bind Req |              |
   |              |S=R-PUB-1     |              |
   |              |D=NAT-PUB-1   |              |
   |              |<----------------------------|
   |              |              |              |
   |              |              |              |
   |(14) Bind Req |              |              |
   |S=R-PUB-1     |              |              |
   |D=L-PRIV-1    |              |              |
   |<-------------|              |              |
   |              |              |              |
   |(15) Bind Res |              |              |
   |S=L-PRIV-1    |              |              |
   |D=R-PUB-1     |              |              |
   |MA=R-PUB-1    |              |              |
   |------------->|              |              |
   |              |              |              |
   |              |(16) Bind Res |              |
   |              |S=NAT-PUB-1   |              |
   |              |D=R-PUB-1     |              |
   |              |MA=R-PUB-1    |              |
   |              |---------------------------->|
   |              |              |              |
   |              |              |              |
   |              |              |              |
   |              |              |              |
   |              |              |              |RTP flows
   |              |              |              |
   |              |              |              |
   |              |              |              |
   |              |              |              |
   |              |              |              |
   |              |              |              |
   |              |              |              |

                               Figure 7

   First, agent L obtains a server reflexive transport address for its
   RTP packets (messages 1-4). Recall that the NAT has the address
   and port independent mapping property. Here, it creates a binding
   of NAT-PUB-1 for this UDP request, and this becomes the server
   reflexive transport address for RTP, the sole component of its
   server reflexive candidate.

   With its two candidates, agent L prioritizes them, choosing the
   local candidate as highest priority, followed by the server
   reflexive candidate. It chooses its server reflexive candidate as
   the active candidate, and encodes it into the m/c-line. The
   resulting offer (message 5) looks like:

       v=0
       o=jdoe 2890844526 2890842807 IN IP4 $L-PRIV-1.IP
       s=
       c=IN IP4 $NAT-PUB-1.IP
       t=0 0
       a=ice-pwd:$LPASS
       m=audio $NAT-PUB-1.PORT RTP/AVP 0
       a=rtpmap:0 PCMU/8000
       a=candidate $L1 1 UDP 1.0 $L-PRIV-1.IP $L-PRIV-1.PORT
       a=candidate $L2 1 UDP 0.7 $NAT-PUB-1.IP $NAT-PUB-1.PORT

   This offer is received at agent R. Agent R will gather its server
   reflexive transport address (messages 6-7). Since R is not behind
   a NAT, this address is identical to its local transport address,
   and thus does not represent a separate candidate. It therefore
   ends up with a single local candidate with a single component for
   RTP.
   Its resulting answer looks like:

       v=0
       o=bob 2808844564 2808844564 IN IP4 $R-PUB-1.IP
       s=
       c=IN IP4 $R-PUB-1.IP
       t=0 0
       a=ice-pwd:$RPASS
       m=audio $R-PUB-1.PORT RTP/AVP 0
       a=rtpmap:0 PCMU/8000
       a=candidate $R1 1 UDP 1.0 $R-PUB-1.IP $R-PUB-1.PORT

   Next, agents L and R form candidate pairs and the transport address
   check ordered list. This list will start with the single component
   in the currently active candidate pair, L2:1:R1:1. Agent L begins
   its connectivity checks (messages 9-12), which succeed, placing the
   transport address pair and resulting candidate pair into the
   Recv-Valid state. Media can now flow from agent L to agent R.
   When agent R receives this request (message 10), the state of the
   candidate pair moves to Send-Valid. Agent R begins its
   connectivity checks (messages 13-16). When the check arrives at
   the NAT (message 13), it is permitted to pass since a permission
   was created towards $R-PUB-1 as a consequence of message 10. This
   check arrives at agent L, which generates a success response
   (message 15), and updates the state of the candidate pair to Valid.
   This response arrives at agent R, which also updates the state of
   the candidate pair to Valid. Now, media can flow from agent R to
   agent L as well.

11.2 Advanced Example

   In this more advanced example, the NAT has address and port
   dependent mapping and filtering properties. Both agents use the
   STUN relay usage in addition to the binding discovery usage. As a
   consequence, agent L will end up with three candidates - a local
   candidate, a relayed candidate, and a server reflexive candidate.
   Agent R will have two - a local candidate and a relayed candidate
   (the server reflexive candidate will equal the local candidate and
   thus not be used). The agents are seeking to communicate using a
   single RTP-based voice stream, but are using RTCP.
As a consequence, each 2605 candidate has two components - one for RTP and one for RTCP. 2607 L NAT STUN R 2608 | | | | 2609 | | | | 2610 | | | | 2611 |RTP Alloc. | | | 2612 | | | | 2613 | | | | 2614 | | | | 2615 |(1) Alloc Req | | | 2616 |S=L-PRIV-1 | | | 2617 |D=STUN-PUB-1 | | | 2618 |------------->| | | 2619 | | | | 2620 | | | | 2621 | |(2) Alloc Req | | 2622 | |S=NAT-PUB-1 | | 2623 | |D=STUN-PUB-1 | | 2624 | |------------->| | 2625 | |(3) Alloc Res | | 2626 | |S=STUN-PUB-1 | | 2627 | |D=NAT-PUB-1 | | 2628 | |R=STUN-PUB-2 | | 2629 | |MA=NAT-PUB-1 | | 2630 | |<-------------| | 2631 |(4) Alloc Res | | | 2632 |S=STUN-PUB-1 | | | 2633 |D=L-PRIV-1 | | | 2634 |R=STUN-PUB-2 | | | 2635 |MA=NAT-PUB-1 | | | 2636 |<-------------| | | 2637 | | | | 2638 | | | | 2639 | | | | 2640 |RTCP Alloc. | | | 2641 |Ta secs. later| | | 2642 | | | | 2643 | | | | 2644 | | | | 2645 |(5) Alloc Req | | | 2646 |S=L-PRIV-2 | | | 2647 |D=STUN-PUB-1 | | | 2648 |------------->| | | 2649 | | | | 2650 | | | | 2651 | |(6) Alloc Req | | 2652 | |S=NAT-PUB-2 | | 2653 | |D=STUN-PUB-1 | | 2654 | |------------->| | 2655 | |(7) Alloc Res | | 2656 | |S=STUN-PUB-1 | | 2657 | |D=NAT-PUB-2 | | 2658 | |R=STUN-PUB-3 | | 2659 | |MA=NAT-PUB-2 | | 2660 | |<-------------| | 2661 |(8) Alloc Res | | | 2662 |S=STUN-PUB-1 | | | 2663 |D=L-PRIV-2 | | | 2664 |R=STUN-PUB-3 | | | 2665 |MA=NAT-PUB-2 | | | 2666 |<-------------| | | 2667 | | | | 2668 | | | | 2669 | | | | 2670 | | | | 2671 |(9) Offer | | | 2672 |------------------------------------------->| 2673 | | | | 2674 | | | | 2675 | | | | 2676 | | | | 2677 | | | |RTP Alloc. 2678 | | | | 2679 | | | | 2680 | | | | 2681 | | |(10) Alloc Req| 2682 | | |S=R-PUB-1 | 2683 | | |D=STUN-PUB-1 | 2684 | | |<-------------| 2685 | | |(11) Alloc Res| 2686 | | |S=STUN-PUB-1 | 2687 | | |D=R-PUB-1 | 2688 | | |R=STUN-PUB-4 | 2689 | | |MA=R-PUB-1 | 2690 | | |------------->| 2691 | | | | 2692 | | | | 2693 | | | | 2694 | | | |RTCP Alloc. 2695 | | | |Ta secs. 
later 2696 | | | | 2697 | | | | 2698 | | | | 2699 | | |(12) Alloc Req| 2700 | | |S=R-PUB-2 | 2701 | | |D=STUN-PUB-1 | 2702 | | |<-------------| 2703 | | |(13) Alloc Res| 2704 | | |S=STUN-PUB-1 | 2705 | | |D=R-PUB-2 | 2706 | | |R=STUN-PUB-5 | 2707 | | |MA=R-PUB-2 | 2708 | | |------------->| 2709 | | | | 2710 | | | | 2711 | | | | 2712 | | | | 2713 |(14) answer | | | 2714 |<-------------------------------------------| 2715 | | | | 2716 | | | | 2717 | | | | 2718 | | | |Validate 2719 | | | |STUN-PUB-4 to STUN-PUB-2 2720 | | | | 2721 | | | | 2722 | | |(15) Send Ind | 2723 | | |S=R-PUB-1 | 2724 | | |D=STUN-PUB-1 | 2725 | | |DA=STUN-PUB-2 | 2726 | | |<-------------| 2727 | | | | 2728 | | |Bind Req. | 2729 | | |S=STUN-PUB-4 | 2730 | | |D=STUN-PUB-2 | 2731 | | |U=L3:1:R2:1 | 2732 | | | | 2733 | | | | 2734 | | | | 2735 | | | | 2736 | | | | 2737 | | |Discard | 2738 | | | | 2739 | | | | 2740 | | | | 2741 | | | | 2742 |Validate | | | 2743 |STUN-PUB-2 to STUN-PUB-4 | | 2744 | | | | 2745 | | | | 2746 |(16) Send Ind | | | 2747 |S=L-PRIV-1 | | | 2748 |D=STUN-PUB-1 | | | 2749 |DA=STUN-PUB-4 | | | 2750 |------------->| | | 2751 | | | | 2752 | |(17) Send Ind | | 2753 | |S=NAT-PUB-1 | | 2754 | |D=STUN-PUB-1 | | 2755 | |DA=STUN-PUB-4 | | 2756 | |------------->| | 2757 | | | | 2758 | | |Bind Req. | 2759 | | |S=STUN-PUB-2 | 2760 | | |D=STUN-PUB-4 | 2761 | | |U=R2:1:L3:1 | 2762 | | | | 2763 | | | | 2764 | | |(18) Data Ind | 2765 | | |S=STUN-PUB-1 | 2766 | | |D=R-PUB-1 | 2767 | | |RA=STUN-PUB-2 | 2768 | | |------------->| 2769 | | |(19) Send Ind | 2770 | | |S=R-PUB-1 | 2771 | | |D=STUN-PUB-1 | 2772 | | |DA=STUN-PUB-2 | 2773 | | |MA=STUN-PUB-2 | 2774 | | |<-------------| 2775 | | | | 2776 | | |Bind Res. 
| 2777 | | |S=STUN-PUB-4 | 2778 | | |D=STUN-PUB-2 | 2779 | | |MA=STUN-PUB-2 | 2780 | | | | 2781 | |(20) Data Ind | | 2782 | |S=STUN-PUB-1 | | 2783 | |D=NAT-PUB-1 | | 2784 | |RA=STUN-PUB-4 | | 2785 | |MA=STUN-PUB-2 | | 2786 | |<-------------| | 2787 |(21) Data Ind | | | 2788 |S=STUN-PUB-1 | | | 2789 |D=L-PRIV-1 | | | 2790 |RA=STUN-PUB-4 | | | 2791 |MA=STUN-PUB-2 | | | 2792 |<-------------| | | 2793 | | | | 2794 | | | | 2795 | | | | 2796 | | | |Validate 2797 | | | |STUN-PUB-4 to STUN-PUB-2 2798 | | | | 2799 | | | | 2800 | | |(22) Send Ind | 2801 | | |S=R-PUB-1 | 2802 | | |D=STUN-PUB-1 | 2803 | | |DA=STUN-PUB-2 | 2804 | | |<-------------| 2805 | | | | 2806 | | |Bind Req. | 2807 | | |S=STUN-PUB-4 | 2808 | | |D=STUN-PUB-2 | 2809 | | |U=L3:1:R2:1 | 2810 | | | | 2811 | | | | 2812 | |(23) Data Ind | | 2813 | |S=STUN-PUB-1 | | 2814 | |D=NAT-PUB-1 | | 2815 | |RA=STUN-PUB-4 | | 2816 | |<-------------| | 2817 | | | | 2818 |(24) Data Ind | | | 2819 |S=STUN-PUB-1 | | | 2820 |D=L-PRIV-1 | | | 2821 |RA=STUN-PUB-4 | | | 2822 |<-------------| | | 2823 |(25) Send Ind | | | 2824 |S=L-PRIV-1 | | | 2825 |D=STUN-PUB-1 | | | 2826 |DA=STUN-PUB-4 | | | 2827 |MA=STUN-PUB-4 | | | 2828 |------------->| | | 2829 | |(26) Send Ind | | 2830 | |S=NAT-PUB-1 | | 2831 | |D=STUN-PUB-1 | | 2832 | |DA=STUN-PUB-4 | | 2833 | |MA=STUN-PUB-4 | | 2834 | |------------->| | 2835 | | | | 2836 | | |Bind Res. | 2837 | | |S=STUN-PUB-2 | 2838 | | |D=STUN-PUB-4 | 2839 | | |MA=STUN-PUB-4 | 2840 | | | | 2841 | | |(27) Data Ind | 2842 | | |S=STUN-PUB-1 | 2843 | | |D=R-PUB-1 | 2844 | | |RA=STUN-PUB-2 | 2845 | | |MA=STUN-PUB-4 | 2846 | | |------------->| 2847 | | | | 2848 | | | | 2849 | | | | 2850 | | | |Validate 2851 | | | |STUN-PUB-5 to STUN-PUB-3 2852 | | | | 2853 | | | | 2854 | | |(28) Send Ind | 2855 | | |S=R-PUB-2 | 2856 | | |D=STUN-PUB-1 | 2857 | | |DA=STUN-PUB-3 | 2858 | | |<-------------| 2859 | | | | 2860 | | |Bind Req. 
| 2861 | | |S=STUN-PUB-5 | 2862 | | |D=STUN-PUB-3 | 2863 | | |U=L3:2:R2:2 | 2864 | | | | 2865 | | | | 2866 | | | | 2867 | | | | 2868 | | | | 2869 | | |Discard | 2870 | | | | 2871 | | | | 2872 | | | | 2873 | | | | 2874 |Validate | | | 2875 |STUN-PUB-3 to STUN-PUB-5 | | 2876 | | | | 2877 | | | | 2878 |(29) Send Ind | | | 2879 |S=L-PRIV-2 | | | 2880 |D=STUN-PUB-1 | | | 2881 |DA=STUN-PUB-5 | | | 2882 |------------->| | | 2883 | | | | 2884 | |(30) Send Ind | | 2885 | |S=NAT-PUB-2 | | 2886 | |D=STUN-PUB-1 | | 2887 | |DA=STUN-PUB-5 | | 2888 | |------------->| | 2889 | | | | 2890 | | |Bind Req. | 2891 | | |S=STUN-PUB-3 | 2892 | | |D=STUN-PUB-5 | 2893 | | |U=R2:2:L3:2 | 2894 | | | | 2895 | | | | 2896 | | |(31) Data Ind | 2897 | | |S=STUN-PUB-1 | 2898 | | |D=R-PUB-2 | 2899 | | |RA=STUN-PUB-3 | 2900 | | |------------->| 2901 | | |(32) Send Ind | 2902 | | |S=R-PUB-2 | 2903 | | |D=STUN-PUB-1 | 2904 | | |DA=STUN-PUB-3 | 2905 | | |MA=STUN-PUB-3 | 2906 | | |<-------------| 2907 | | | | 2908 | | |Bind Res. | 2909 | | |S=STUN-PUB-5 | 2910 | | |D=STUN-PUB-3 | 2911 | | |MA=STUN-PUB-3 | 2912 | | | | 2913 | |(33) Data Ind | | 2914 | |S=STUN-PUB-1 | | 2915 | |D=NAT-PUB-2 | | 2916 | |RA=STUN-PUB-5 | | 2917 | |MA=STUN-PUB-3 | | 2918 | |<-------------| | 2919 |(34) Data Ind | | | 2920 |S=STUN-PUB-1 | | | 2921 |D=L-PRIV-2 | | | 2922 |RA=STUN-PUB-5 | | | 2923 |MA=STUN-PUB-3 | | | 2924 |<-------------| | | 2925 | | | | 2926 | | | | 2927 | | | | 2928 | | | |Validate 2929 | | | |STUN-PUB-5 to STUN-PUB-3 2930 | | | | 2931 | | | | 2932 | | |(35) Send Ind | 2933 | | |S=R-PUB-2 | 2934 | | |D=STUN-PUB-1 | 2935 | | |DA=STUN-PUB-3 | 2936 | | |<-------------| 2937 | | | | 2938 | | |Bind Req. 
| 2939 | | |S=STUN-PUB-5 | 2940 | | |D=STUN-PUB-3 | 2941 | | |U=L3:2:R2:2 | 2942 | | | | 2943 | | | | 2944 | |(36) Data Ind | | 2945 | |S=STUN-PUB-1 | | 2946 | |D=NAT-PUB-2 | | 2947 | |RA=STUN-PUB-5 | | 2948 | |<-------------| | 2949 | | | | 2950 |(37) Data Ind | | | 2951 |S=STUN-PUB-1 | | | 2952 |D=L-PRIV-2 | | | 2953 |RA=STUN-PUB-5 | | | 2954 |<-------------| | | 2955 |(38) Send Ind | | | 2956 |S=L-PRIV-2 | | | 2957 |D=STUN-PUB-1 | | | 2958 |DA=STUN-PUB-5 | | | 2959 |MA=STUN-PUB-5 | | | 2960 |------------->| | | 2961 | |(39) Send Ind | | 2962 | |S=NAT-PUB-2 | | 2963 | |D=STUN-PUB-1 | | 2964 | |DA=STUN-PUB-5 | | 2965 | |MA=STUN-PUB-5 | | 2966 | |------------->| | 2967 | | | | 2968 | | |Bind Res. | 2969 | | |S=STUN-PUB-3 | 2970 | | |D=STUN-PUB-5 | 2971 | | |MA=STUN-PUB-5 | 2972 | | | | 2973 | | |(40) Data Ind | 2974 | | |S=STUN-PUB-1 | 2975 | | |D=R-PUB-2 | 2976 | | |RA=STUN-PUB-3 | 2977 | | |MA=STUN-PUB-5 | 2978 | | |------------->| 2979 | | | | 2980 | | | | 2981 | | | | 2982 | | | | 2983 |RTP flows | | | 2984 | | | | 2985 | | | | 2986 |(41) Send Ind | | | 2987 |S=L-PRIV-1 | | | 2988 |D=STUN-PUB-1 | | | 2989 |DA=STUN-PUB-4 | | | 2990 |------------->| | | 2991 | | | | 2992 | |(42) Send Ind | | 2993 | |S=NAT-PUB-1 | | 2994 | |D=STUN-PUB-1 | | 2995 | |DA=STUN-PUB-4 | | 2996 | |------------->| | 2997 | | | | 2998 | | | | 2999 | | |RTP | 3000 | | |S=STUN-PUB-2 | 3001 | | |D=STUN-PUB-4 | 3002 | | | | 3003 | | | | 3004 | | |(43) Data Ind | 3005 | | |S=STUN-PUB-1 | 3006 | | |D=R-PUB-1 | 3007 | | |RA=STUN-PUB-2 | 3008 | | |------------->| 3009 | | | | 3010 | | | | 3011 | | | | 3012 | | | | 3013 | | | |RTP flows 3014 | | | | 3015 | | | | 3016 | | |(44) Send Ind | 3017 | | |S=R-PUB-1 | 3018 | | |D=STUN-PUB-1 | 3019 | | |DA=STUN-PUB-2 | 3020 | | |<-------------| 3021 | | | | 3022 | | | | 3023 | | |RTP | 3024 | | |S=STUN-PUB-4 | 3025 | | |D=STUN-PUB-2 | 3026 | | | | 3027 | | | | 3028 | |(45) Data Ind | | 3029 | |S=STUN-PUB-1 | | 3030 | |D=NAT-PUB-1 | | 3031 | |RA=STUN-PUB-4 | 
| 3032 | |<-------------| | 3033 | | | | 3034 |(46) Data Ind | | | 3035 |S=STUN-PUB-1 | | | 3036 |D=L-PRIV-1 | | | 3037 |RA=STUN-PUB-4 | | | 3038 |<-------------| | | 3039 | | | | 3040 | | | | 3041 | | | | 3042 |Validate | | | 3043 |L-PRIV-1 to R-PUB-1 | | 3044 | | | | 3045 | | | | 3046 |(47) Bind Req.| | | 3047 |S=L-PRIV-1 | | | 3048 |D=R-PUB-1 | | | 3049 |U=R1:1:L1:1 | | | 3050 |------------->| | | 3051 | | | | 3052 | |(48) Bind Req.| | 3053 | |S=NAT-PUB-3 | | 3054 | |D=R-PUB-1 | | 3055 | |U=R1:1:L1:1 | | 3056 | |---------------------------->| 3057 | | | | 3058 | |(49) Bind Res.| | 3059 | |S=R-PUB-1 | | 3060 | |D=NAT-PUB-3 | | 3061 | |MA=NAT-PUB-3 | | 3062 | |<----------------------------| 3063 | | | | 3064 |(50) Bind Res.| | | 3065 |S=R-PUB-1 | | | 3066 |D=L-PRIV-1 | | | 3067 |MA-NAT-PUB-3 | | | 3068 |<-------------| | | 3069 | | | | 3070 | | | | 3071 | | | | 3072 | | | |Validate 3073 | | | |R-PUB-1 to L-PRIV-1 3074 | | | | 3075 | | | | 3076 | |(51) Bind Req.| | 3077 | |S=R-PUB-1 | | 3078 | |D=L-PRIV-1 | | 3079 | |U=L1:1:R1:1 | | 3080 | |<----------------------------| 3081 | | | | 3082 | | | | 3083 | | | | 3084 | | | | 3085 | |Discard | | 3086 | | | | 3087 | | | | 3088 | | | | 3089 | | | | 3090 | | | |Validate 3091 | | | |R-PUB-2 to L-PRIV-2 3092 | | | | 3093 | | | | 3094 | |(52) Bind Req.| | 3095 | |S=R-PUB-2 | | 3096 | |D=L-PRIV-2 | | 3097 | |U=L1:2:R1:2 | | 3098 | |<----------------------------| 3099 | | | | 3100 | | | | 3101 | | | | 3102 | | | | 3103 | |Discard | | 3104 | | | | 3105 | | | | 3106 | | | | 3107 | | | | 3108 |Validate | | | 3109 |L-PRIV-2 to R-PUB-2 | | 3110 | | | | 3111 | | | | 3112 |(53) Bind Req.| | | 3113 |S=L-PRIV-2 | | | 3114 |D=R-PUB-2 | | | 3115 |U=R1:2:L1:2 | | | 3116 |------------->| | | 3117 | | | | 3118 | |(54) Bind Req.| | 3119 | |S=NAT-PUB-4 | | 3120 | |D=R-PUB-2 | | 3121 | |U=R1:2:L1:2 | | 3122 | |---------------------------->| 3123 | | | | 3124 | |(55) Bind Res.| | 3125 | |S=R-PUB-2 | | 3126 | |D=NAT-PUB-4 | | 3127 | 
|MA=NAT-PUB-4 | | 3128 | |<----------------------------| 3129 | | | | 3130 |(56) Bind Res.| | | 3131 |S=R-PUB-2 | | | 3132 |D=L-PRIV-2 | | | 3133 |MA=NAT-PUB-4 | | | 3134 |<-------------| | | 3135 | | | | 3136 | | | | 3137 | | | | 3138 | | | |Validate 3139 | | | |R-PUB-1 to NAT-PUB-3 3140 | | | | 3141 | | | | 3142 | |(57) Bind Req.| | 3143 | |S=R-PUB-1 | | 3144 | |D=NAT-PUB-3 | | 3145 | |U=L1R1:1:R1:1 | | 3146 | |<----------------------------| 3147 | | | | 3148 |(58) Bind Req.| | | 3149 |S=R-PUB-1 | | | 3150 |D=L-PRIV-1 | | | 3151 |U=L1R1:1:R1:1 | | | 3152 |<-------------| | | 3153 | | | | 3154 |(59) Bind Res.| | | 3155 |S=L-PRIV-1 | | | 3156 |D=R-PUB-1 | | | 3157 |MA=R-PUB-1 | | | 3158 |------------->| | | 3159 | | | | 3160 | |(60) Bind Res.| | 3161 | |S=NAT-PUB-3 | | 3162 | |D=R-PUB-1 | | 3163 | |MA=R-PUB-1 | | 3164 | |---------------------------->| 3165 | | | | 3166 | | | | 3167 | | | | 3168 | | | |Validate 3169 | | | |R-PUB-2 to NAT-PUB-4 3170 | | | | 3171 | | | | 3172 | |(61) Bind Req.| | 3173 | |S=R-PUB-2 | | 3174 | |D=NAT-PUB-4 | | 3175 | |U=L1R1:2:R1:2 | | 3176 | |<----------------------------| 3177 | | | | 3178 |(62) Bind Req.| | | 3179 |S=R-PUB-2 | | | 3180 |D=L-PRIV-2 | | | 3181 |U=L1R1:2:R1:2 | | | 3182 |<-------------| | | 3183 | | | | 3184 |(63) Bind Res.| | | 3185 |S=L-PRIV-2 | | | 3186 |D=R-PUB-2 | | | 3187 |MA=R-PUB-2 | | | 3188 |------------->| | | 3189 | | | | 3190 | |(64) Bind Res.| | 3191 | |S=NAT-PUB-4 | | 3192 | |D=R-PUB-2 | | 3193 | |MA=R-PUB-2 | | 3194 | |---------------------------->| 3195 | | | | 3196 | | | | 3197 | | | | 3198 | | | | 3199 |(65) Offer | | | 3200 |------------------------------------------->| 3201 | | | | 3202 | | | | 3203 | | | | 3204 | | | | 3205 |(66) Answer | | | 3206 |<-------------------------------------------| 3207 | | | | 3208 | | | | 3209 | | | | 3210 | | | | 3211 | | | | 3212 | | | | 3214 Figure 10 3216 First, agent L obtains both server reflexive and relayed transport 3217 addresses for its RTP packets, using a 
   STUN Allocate request, which will provide it with both types of
   addresses (messages 1-4). Recall that the NAT has the address and
   port dependent mapping property. Here, it creates a binding of
   NAT-PUB-1 for this UDP request, and this becomes the server
   reflexive transport address for RTP. The relayed transport address
   is STUN-PUB-2, allocated by the STUN server. Agent L repeats this
   process for RTCP (messages 5-8) Ta seconds later, and obtains
   NAT-PUB-2 as its server reflexive transport address for RTCP and
   STUN-PUB-3 for its relayed transport address.

   With its three candidates, agent L prioritizes them, choosing the
   local candidate as highest priority, followed by the server
   reflexive candidate, followed by the relayed candidate. It chooses
   its relayed candidate as the active candidate, and encodes it into
   the m/c-line. The resulting offer (message 9) looks like:

       v=0
       o=jdoe 2890844526 2890842807 IN IP4 $L-PRIV-1.IP
       s=
       c=IN IP4 $STUN-PUB-2.IP
       t=0 0
       a=ice-pwd:$LPASS
       m=audio $STUN-PUB-2.PORT RTP/AVP 0
       a=rtpmap:0 PCMU/8000
       a=rtcp:$STUN-PUB-3.PORT
       a=candidate $L1 1 UDP 1.0 $L-PRIV-1.IP $L-PRIV-1.PORT
       a=candidate $L1 2 UDP 1.0 $L-PRIV-2.IP $L-PRIV-2.PORT
       a=candidate $L2 1 UDP 0.7 $NAT-PUB-1.IP $NAT-PUB-1.PORT
       a=candidate $L2 2 UDP 0.7 $NAT-PUB-2.IP $NAT-PUB-2.PORT
       a=candidate $L3 1 UDP 0.3 $STUN-PUB-2.IP $STUN-PUB-2.PORT
       a=candidate $L3 2 UDP 0.3 $STUN-PUB-3.IP $STUN-PUB-3.PORT

   This offer is received at agent R. Agent R will gather its server
   reflexive and relayed transport addresses for RTP from an Allocate
   request (messages 10-11). Since the server reflexive transport
   address matches its local transport address, no separate candidate
   is used for it. The agent then gathers its server reflexive and
   relayed transport addresses for RTCP (messages 12-13).
   It gives the local candidate higher priority than the relayed
   candidate, and selects the relayed candidate as the active
   candidate. Its resulting answer looks like:

       v=0
       o=bob 2808844564 2808844564 IN IP4 $R-PUB-1.IP
       s=
       c=IN IP4 $STUN-PUB-4.IP
       t=0 0
       a=ice-pwd:$RPASS
       m=audio $STUN-PUB-4.PORT RTP/AVP 0
       a=rtpmap:0 PCMU/8000
       a=rtcp:$STUN-PUB-5.PORT
       a=candidate $R1 1 UDP 1.0 $R-PUB-1.IP $R-PUB-1.PORT
       a=candidate $R1 2 UDP 1.0 $R-PUB-2.IP $R-PUB-2.PORT
       a=candidate $R2 1 UDP 0.3 $STUN-PUB-4.IP $STUN-PUB-4.PORT
       a=candidate $R2 2 UDP 0.3 $STUN-PUB-5.IP $STUN-PUB-5.PORT

   Next, agents L and R form candidate pairs and the transport address
   check ordered list. This list will start with the two components
   in the currently active candidate pair - relayed candidates. Agent
   R begins its checks (message 15). It will check connectivity for
   the active candidate pair, starting with the first component, which
   is STUN-PUB-4 for agent R and STUN-PUB-2 for agent L. The state
   machine for that transport address pair moves to the Testing state.
   Since this is a relayed transport address for agent R, it utilizes
   the STUN Send Indication to deliver the Binding Request. The
   DESTINATION-ADDRESS is STUN-PUB-2.

   The STUN server will extract the content of the Send Indication,
   which is a STUN Binding Request, and deliver it to the destination,
   STUN-PUB-2. This request will be sent from the relayed address
   allocated to R, which is STUN-PUB-4. As both interfaces are on the
   STUN server, this message is sent to itself (and thus the lack of a
   message number in the sequence diagram above). Note that the
   USERNAME in the Binding Request is L3:1:R2:1, which represents the
   transport address pair ID.
   This message gets discarded by the STUN server since, as of yet,
   there are no permissions established for the STUN-PUB-2 allocation.
   However, it did have the side effect of establishing a permission
   on the STUN-PUB-4 binding, allowing incoming packets from
   STUN-PUB-2.

   Once L gets the answer, it will attempt to validate the first
   transport address pair in the transport address pair check ordered
   list, which will be the active candidate. The state machine for
   this transport address pair moves into the Testing state. Like
   agent R did, it will use the STUN Send Indication to send a STUN
   Binding Request from its relayed transport address, STUN-PUB-2, to
   STUN-PUB-4 (message 16). This packet traverses the NAT (message
   17) and arrives at the STUN server. The STUN server will unwrap
   the contents of the packet and send them from STUN-PUB-2 to
   STUN-PUB-4. It will also, as a consequence, add a permission for
   STUN-PUB-4. The contents of the packet are a STUN Binding Request
   with USERNAME R2:1:L3:1 (note how this is the flip of the USERNAME
   in the Binding Request sent by agent R). This is also a packet
   from the STUN server to itself. However, now, the packet is not
   discarded, as a permission had been installed as a consequence of
   the "suicide packet" from agent R (a suicide packet is a packet
   that has no hope of traversing a far end NAT, but serves the
   purpose of enabling a permission in a near end NAT so that a packet
   from the peer can be returned). Thus, the STUN server will relay
   the received STUN request towards agent R (message 18). This is
   delivered as a STUN Data Indication. Notice how the
   REMOTE-ADDRESS is STUN-PUB-2; this is important as it will be used
   to construct the STUN Binding Response.

   Agent R will receive the Data Indication, and unwrap its contents
   to find the Binding Request.
The state machine for this transport 3323 address pair is currently in the Testing state. It therefore moves 3324 into the Send-Valid state, and it generates a Binding Response. 3325 However, the MAPPED-ADDRESS in the Binding Response is constructed 3326 using the source IP address and port that were seen by the STUN 3327 server when the Binding Request arrived at STUN-PUB-4, which is the 3328 looped message between messages 17 and 18. This source address is 3329 STUN-PUB-2, which is the value of the REMOTE-ADDRESS attribute in 3330 message 18. Thus, the STUN Binding Response will contain STUN-PUB-2 3331 in the MAPPED-ADDRESS, and is to be sent to STUN-PUB-2. To send the 3332 response, agent R takes the STUN Binding Response and encapsulates it 3333 in a STUN Send indication, setting the DESTINATION-ADDRESS to STUN- 3334 PUB-2. This is shown in message 19. 3336 The STUN server will receive this Send Indication, and unwrap its 3337 contents to find the STUN Binding Response. It sends it to the value 3338 of the DESTINATION-ADDRESS attribute, and sends it from the relayed 3339 address allocated to R, which is STUN-PUB-4. This, once again, 3340 results in a looped message to itself, and it arrives at STUN-PUB-2. 3341 Now, however, there is a permission installed for STUN-PUB-4. The 3342 STUN server will therefore forward the packet to agent L. To do so, 3343 it constructs a STUN Data Indication containing the contents of the 3344 packet. It sets the REMOTE-ADDRESS to the source transport address 3345 of the request it received (STUN-PUB-4), and forwards it to agent L 3346 (message 20). This traverses the NAT (message 21) and arrives at 3347 agent L. As a consequence of the receipt of a Binding Response, the 3348 state machine for this transport address pair moves to the Recv-Valid 3349 state. The agent also examines the MAPPED-ADDRESS of the STUN 3350 response. It is STUN-PUB-2. 
This is the same as the native 3351 transport address of this transport address pair, and thus doesn't 3352 represent a new transport address that might have been learned. 3354 Because of the receipt of message 18, the transport address pair 3355 moved from Testing to Send-Valid, causing R to attempt a 3356 retransmission of its STUN Binding Request that was lost (the 3357 contents of message 15 that were discarded by the STUN server due to 3358 lack of permission). This time, however, a permission has been 3359 installed and the retransmission will work. So, it sends the Binding 3360 Request again (message 22, identical to message 15). This is looped 3361 by the STUN server to itself again, but this time there is a 3362 permission in place when it arrives at STUN-PUB-2. As such, the 3363 request is forwarded towards agent L this time, in a STUN Data 3364 Indication (message 23). This traverses the NAT (message 24) and 3365 arrives at agent L. Agent L extracts the contents of the request, 3366 which are a STUN Binding Request. This causes the state machine to 3367 move from Recv-Valid to Valid. It generates a STUN Binding Response, 3368 and sets the MAPPED-ADDRESS to the value of the REMOTE-ADDRESS in 3369 message 24 (STUN-PUB-4). This Binding Response is sent to 3370 STUN-PUB-4, which is accomplished through a STUN Send Indication 3371 (message 25). This Send Indication traverses the NAT (message 26) 3372 and is received by the STUN server. Its contents are decapsulated, 3373 and sent to STUN-PUB-4, which is again a loop on the same host. This 3374 packet is then sent towards agent R in a Data Indication (message 3375 27). The contents of the DATA Indication are extracted, and the 3376 agent sees a successful Binding Response. It therefore moves the 3377 state machine from the Send-Valid state to the Valid state. At this 3378 point, the transport address pair is in the Valid state for both 3379 agents. 
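The state transitions walked through above for the RTP transport address pair can be summarized in a small sketch. This is only an illustration of the four transitions seen in this call flow, not the complete state machine of this specification. The event names "binding_request" and "binding_response", the helper functions, and the rule that an unlisted event leaves the state unchanged are all assumptions made for the example.

```python
from enum import Enum


class PairState(Enum):
    TESTING = "Testing"
    SEND_VALID = "Send-Valid"
    RECV_VALID = "Recv-Valid"
    VALID = "Valid"


# Event names are hypothetical labels for "a Binding Request arrived" and
# "a Binding Response arrived" on this transport address pair.
TRANSITIONS = {
    (PairState.TESTING, "binding_request"): PairState.SEND_VALID,
    (PairState.TESTING, "binding_response"): PairState.RECV_VALID,
    (PairState.SEND_VALID, "binding_response"): PairState.VALID,
    (PairState.RECV_VALID, "binding_request"): PairState.VALID,
}


def step(state, event):
    """Return the next state for one received STUN message."""
    # Assumption: an event with no listed transition leaves the state as-is.
    return TRANSITIONS.get((state, event), state)


def run(events, start=PairState.TESTING):
    """Apply a sequence of events, starting from the Testing state."""
    state = start
    for event in events:
        state = step(state, event)
    return state


# Agent R: Binding Request in message 18, Binding Response in message 27.
agent_r = run(["binding_request", "binding_response"])
# Agent L: Binding Response in message 21, Binding Request in message 24.
agent_l = run(["binding_response", "binding_request"])
```

Both agents end in the Valid state, but they pass through different intermediate states (Send-Valid for agent R, Recv-Valid for agent L) because the request and the response reach them in opposite orders.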
3381 Approximately Ta seconds after agent R sent message 15, agent R will 3382 start checks for the next transport address pair in its transport 3383 address pair check ordered list. This is the second component of the 3384 same candidate pair, used for RTCP. This sequence, messages 28 3385 through 40, is identical to the one for RTP, and differs only in the 3386 specific transport addresses. 3388 Once that validation happens, the second transport address pair has 3389 been validated. The candidate pair moves into the valid state, and 3390 both candidates are considered valid. The active candidate has now 3391 been validated, and media can begin to flow. It will do so through 3392 the STUN server; indeed, it is relayed "twice" through the STUN 3393 server. Even though there is a single STUN server, it is logically 3394 acting as two separate STUN servers. Indeed, had L and R used two 3395 separate STUN servers, media would be relayed through both STUN 3396 servers in a trapezoid configuration. 3398 The actual media flows are shown as well. It is important to note 3399 that, since the ICE checks have not yet concluded on the candidate 3400 that will ultimately be used, no STUN Set Active Destinations have 3401 been sent. As a consequence, media that is sent through the STUN 3402 servers has to be sent using STUN Send Indications. This introduces 3403 some overhead, but is a transient condition. In message 41, agent L 3404 sends an RTP packet to agent R using a Send Indication. It is sent 3405 to STUN-PUB-4. This traverses the NAT (message 42), and arrives at 3406 the STUN server. It is decapsulated, looped to itself, and arrives 3407 at STUN-PUB-4. From there, it is encapsulated in a Data Indication 3408 and sent to agent R (message 43). In the reverse direction, agent R 3409 will send an RTP packet using a STUN Send Indication (message 44), 3410 and send it to STUN-PUB-2. 
This is received by the STUN server, 3411 decapsulated, and sent to STUN-PUB-2 from STUN-PUB-4. This is again 3412 a loop within the same host, arriving at STUN-PUB-2. The contents of 3413 the packet are sent to agent L through a STUN Data Indication 3414 (message 45), which traverses the NAT (message 46) to arrive at agent 3415 L. Since this call flow is already long enough, RTCP packet 3416 transmission is not shown. 3418 Approximately Ta seconds after it sends message 29, agent L goes to 3419 the next transport address pair in its transport address pair check 3420 ordered list that is in the Waiting state. This will be the RTP 3421 candidate for the top priority candidate pair, which is L-PRIV-1 on 3422 agent L and R-PUB-1 on agent R. This is a local candidate for each 3423 agent. To perform the check, agent L sends a STUN Binding Request 3424 from L-PRIV-1 to R-PUB-1 (message 47). Note the USERNAME of 3425 R1:1:L1:1, which identifies this transport address pair. This 3426 traverses the NAT (message 48). Since the NAT has the address and 3427 port dependent mapping property, and this is a new destination IP 3428 address, the NAT allocates a new transport address on its public 3429 side, NAT-PUB-3, and places this in the source IP address and port. 3430 This packet arrives at agent R. Agent R finds a matching transport 3431 address pair in the Waiting state. The state machine transitions to 3432 the Send-Valid state. It sends the Binding Response, with a MAPPED- 3433 ADDRESS equal to NAT-PUB-3 (message 49), which traverses the NAT and 3434 arrives at agent L (message 50). Agent R, in addition to sending the 3435 response, will also send a Binding Request. It is important to 3436 remember that this Binding Request is sent to the remote address in 3437 the transport address pair (L-PRIV-1), and NOT to the source IP 3438 address and port of the Binding Request (NAT-PUB-3); that will happen 3439 later. This attempt is shown in message 51. 
However, since 3440 L-PRIV-1 is private, the packet is discarded in the network. 3442 Now, as a consequence of receiving message 48, agent R will have 3443 constructed a peer-derived candidate. The candidate ID for this 3444 candidate is L1R1, and it initially contains a single transport 3445 address pair, NAT-PUB-3 and R-PUB-1. However, the candidate isn't 3446 yet usable until the other component gets added. Similarly, agent L 3447 will have constructed the same peer-derived candidate, with the same 3448 candidate ID and the same transport address pair. 3450 Some Ta seconds after sending message 28, agent R will move to the 3451 next transport address pair in the transport address pair check 3452 ordered list whose state is Waiting. This is the RTCP component of 3453 the highest priority candidate pair. It will attempt a connectivity 3454 check, from R-PUB-2 to L-PRIV-2 (message 52). Since L-PRIV-2 is 3455 private, this message is discarded. 3457 Some Ta seconds after sending message 47, agent L will move to the 3458 next transport address pair in the transport address pair check 3459 ordered list whose state is Waiting. This is the RTCP component of 3460 the highest priority candidate pair. It will attempt a connectivity 3461 check, from L-PRIV-2 to R-PUB-2 (message 53), which operates nearly 3462 identically to messages 47-50, with the exception of the specific 3463 addresses. Here, the NAT will create a new binding for the RTCP, 3464 NAT-PUB-4, and this transport address is new for both participants. 3465 On receipt of this Binding Request at agent R (message 54), agent R 3466 constructs the candidate ID for the peer-derived candidate, L1R1, and 3467 finds it already exists. As such, this new transport address is 3468 added, and the peer-derived candidate becomes complete and usable. 3469 Agent L does the same thing on receipt of message 56. 
This candidate 3470 will have the same priority as its generating candidate L1 (1.0), and 3471 is paired up with R1 (also at priority 1.0). Since L1R1 has the same 3472 priority as L1 itself, the ordering algorithm in Section 7.5 will use 3473 the reverse lexicographic order of the candidate ID itself to 3474 determine order. L1R1 is larger than L1, so that the peer-derived 3475 candidate will come before its generating candidate. As a 3476 consequence, the peer-derived candidate pair will have a higher 3477 priority than its generating candidate, and appear just before it in 3478 the candidate pair priority ordered list. 3480 As a consequence, after agent R sends message 55 and completes the 3481 peer-derived candidate, it will move the two transport addresses in 3482 the peer-derived candidate into the Send-Valid state, and send a 3483 Binding Request for each in rapid succession (agent L will have moved 3484 both into the Recv-Valid state upon receipt of message 56). The 3485 first of these connectivity checks is for the RTP component, from 3486 R-PUB-1 to NAT-PUB-3 (message 57). Note the USERNAME in the STUN 3487 Binding Request, L1R1:1:R1:1, which identifies the peer-derived 3488 transport address pair. This will successfully traverse the NAT and 3489 be delivered to agent L (message 58). The receipt of this request 3490 moves the state machine for this transport address pair from Recv- 3491 Valid to Valid, and a Binding Response is sent (message 59). This 3492 passes through the NAT and arrives at agent R (message 60). This 3493 causes its state machine to enter the Valid state as well. The 3494 MAPPED-ADDRESS, R-PUB-1, is not new to agent R and thus does not 3495 result in the creation of a new peer-derived candidate. 3497 Messages 61 through 64 show the same basic flow for RTCP. Upon 3498 receipt of message 64, both transport address pairs are Valid at both 3499 agents, causing the peer-derived candidate to become valid. 
Timer 3500 Tws is set at agent L, and fires without any higher priority 3501 candidate pairs becoming validated. At agent R, media can now be 3502 sent on this candidate pair from answerer (agent R) to offerer (agent 3503 L). Agent L sends an updated offer to promote the peer-derived 3504 candidate to active. This offer (message 65) looks like: 3506 v=0 3507 o=jdoe 2890844526 2890842808 IN IP4 $L-PRIV-1.IP 3508 s= 3509 c=IN IP4 $NAT-PUB-3.IP 3510 t=0 0 3511 a=ice-pwd:$LPASS 3512 m=audio $NAT-PUB-3.PORT RTP/AVP 0 3513 a=rtpmap:0 PCMU/8000 3514 a=rtcp:$NAT-PUB-4.PORT 3515 a=remote-candidate:R1 3516 a=candidate $L1 1 UDP 1.0 $L-PRIV-1.IP $L-PRIV-1.PORT 3517 a=candidate $L1 2 UDP 1.0 $L-PRIV-2.IP $L-PRIV-2.PORT 3519 There are several important things to note in this offer. Firstly, 3520 note how the m/c-line now contains NAT-PUB-3 and NAT-PUB-4, the 3521 peer-derived transport addresses it learned through the ICE processing. 3522 Secondly, note how there remains a candidate encoded into the 3523 a=candidate attributes. This is candidate L1, NOT candidate L1R1. 3524 Recall that the peer-derived candidates are never encoded into the 3525 SDP. Rather, their generating candidate is encoded. This will cause 3526 keepalives to take place for the generating candidate if valid 3527 (though it's not) and any of its derived candidates, which is what we 3528 want. Finally, notice the inclusion of the a=remote-candidate 3529 attribute. Since agent L doesn't know whether agent R received 3530 messages 60 or 64, it doesn't know whether the state of the candidate 3531 is Send-Valid or Valid at agent R. So, it has to tell agent R to use 3532 the candidate anyway, even if it is only in the Send-Valid state. 
3534 The answer generated by agent R looks like: 3536 v=0 3537 o=bob 2808844564 2808844565 IN IP4 $R-PUB-1.IP 3538 s= 3539 c=IN IP4 $R-PUB-1.IP 3540 t=0 0 3541 a=ice-pwd:$RPASS 3542 m=audio $R-PUB-1.PORT RTP/AVP 0 3543 a=rtpmap:0 PCMU/8000 3544 a=rtcp:$R-PUB-2.PORT 3545 a=candidate $R1 1 UDP 1.0 $R-PUB-1.IP $R-PUB-1.PORT 3546 a=candidate $R1 2 UDP 1.0 $R-PUB-2.IP $R-PUB-2.PORT 3548 With this, media can now flow directly between endpoints. The 3549 removal of the relayed candidates from the offer/answer exchange will 3550 cause the STUN relay allocations to be removed. 3552 12. Grammar 3554 This specification defines three new SDP attributes - the 3555 "candidate", "remote-candidate" and "ice-pwd" attributes. 3557 The candidate attribute is a media-level attribute only. It contains 3558 a transport address for a candidate that can be used for connectivity 3559 checks. There may be multiple candidate attributes in a media block. 3561 The syntax of this attribute is defined using Augmented BNF as 3562 defined in RFC 4234 [9]: 3564 candidate-attribute = "candidate" ":" candidate-id SP component-id SP 3565 transport SP 3566 qvalue SP ;qvalue from RFC 3261 3567 addr SP ;addr from RFC 3266 3568 port ;port from RFC 2327 3569 *(SP extension-att-name SP 3570 extension-att-value) 3572 transport = "UDP" / transport-extension 3573 transport-extension = token 3574 candidate-id = 1*base64-char 3576 base64-char = ALPHANUM / DIGIT / "+" / "/" 3577 ;ALPHANUM from RFC 3261 3578 component-id = 1*DIGIT 3579 extension-att-name = byte-string ;from RFC 2327 3580 extension-att-value = byte-string 3581 The candidate-id is used to group together the transport addresses 3582 for a particular candidate. It MUST be constructed with at least 24 3583 bits of randomness. It MUST have the same value for all transport 3584 addresses within the same candidate. It MUST have a different value 3585 for transport addresses within different candidates for the same 3586 media stream. 
The candidate-id uses a syntax that is defined to be 3587 equal to the base64 alphabet [3], which allows the candidate-id to be 3588 generated by performing a base64 encoding of a randomly generated 3589 value (note, however, that this does not mean that the candidate-id 3590 or password is base64 decoded when used in STUN messages). In 3591 addition, if content is base64 encoded to generate the candidate-id, 3592 it MUST NOT be padded with '='. The component-id is a positive 3593 integer, which identifies the specific component of the candidate. 3594 It MUST start at 1 and MUST increment by 1 for each component of a 3595 particular candidate. 3597 The addr production is taken from [10], allowing for IPv4 addresses, 3598 IPv6 addresses and FQDNs. The port production is taken from RFC 2327 3599 [5]. The token production is taken from RFC 3261 [2]. The transport 3600 production indicates the transport protocol for the candidate. This 3601 specification only defines UDP. However, extensibility is provided 3602 to allow for future transport protocols to be used with ICE, such as 3603 TCP or the Datagram Congestion Control Protocol (DCCP) [34]. 3605 The a=candidate attribute can itself be extended. The grammar allows 3606 for new name/value pairs to be added at the end of the attribute. An 3607 implementation MUST ignore any name/value pairs it doesn't 3608 understand. 3610 The syntax of the "remote-candidate" attribute is defined using 3611 Augmented BNF as defined in RFC 4234 [9]: 3613 remote-candidate-att = "remote-candidate" ":" candidate-id 3615 This attribute MUST be present in an offer when the candidate in the 3616 m/c-line is part of a candidate pair that is in the valid or 3617 partially valid state. 
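As an illustration of the candidate attribute grammar and the candidate-id rules above, the following sketch generates an unpadded base64 candidate-id and parses a candidate attribute value. It is a minimal example under stated assumptions, not a reference implementation: the names new_candidate_id, parse_candidate and Candidate are hypothetical, the addr field is kept as an opaque string rather than validated, and the input is assumed to be the attribute text after "a=" written with the ":" separator that the ABNF requires.

```python
import base64
import re
import secrets
from dataclasses import dataclass, field

# candidate-id = 1*base64-char (letters, digits, "+" and "/"; no "=" padding)
_CANDIDATE_ID = re.compile(r"[A-Za-z0-9+/]+")


def new_candidate_id(n_bytes: int = 3) -> str:
    """Base64-encode n_bytes of randomness and strip any '=' padding."""
    if n_bytes < 3:
        raise ValueError("candidate-id needs at least 24 bits of randomness")
    encoded = base64.b64encode(secrets.token_bytes(n_bytes)).decode("ascii")
    return encoded.rstrip("=")


@dataclass
class Candidate:
    candidate_id: str
    component_id: int
    transport: str
    qvalue: float
    addr: str
    port: int
    # Extension name/value pairs; a caller skips the ones it does not know.
    extensions: dict = field(default_factory=dict)


def parse_candidate(attr: str) -> Candidate:
    """Parse 'candidate:<id> <component> <transport> <q> <addr> <port> ...'."""
    name, sep, rest = attr.partition(":")  # split on the first ':' only
    if name != "candidate" or not sep:
        raise ValueError("not a candidate attribute")
    fields = rest.split(" ")
    if len(fields) < 6 or (len(fields) - 6) % 2 != 0:
        raise ValueError("wrong number of fields")
    cid, comp, transport, q, addr, port = fields[:6]
    if not _CANDIDATE_ID.fullmatch(cid):
        raise ValueError("candidate-id is not in the base64 alphabet")
    if not (comp.isascii() and comp.isdigit()) or int(comp) < 1:
        raise ValueError("component-id must be a positive integer")
    qvalue = float(q)
    if not 0.0 <= qvalue <= 1.0:
        raise ValueError("qvalue out of range")
    extensions = dict(zip(fields[6::2], fields[7::2]))
    return Candidate(cid, int(comp), transport, qvalue, addr, int(port), extensions)


example = parse_candidate("candidate:R2 1 UDP 0.3 192.0.2.10 3478 foo bar")
```

Here "foo bar" stands for an unknown extension pair; the parser keeps it in the extensions mapping so that a caller can ignore it, as the text above requires.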
3619 The syntax of the "ice-pwd" attribute is defined as: 3621 ice-pwd-att = "ice-pwd" ":" password 3622 password = 1*base64-char 3624 The "ice-pwd" attribute MUST appear at the session-level, and is 3625 consequently shared by all candidates for all media streams within 3626 the session. It MUST have at least 128 bits of randomness. Like the 3627 candidate-id, its syntax is taken from the base64 alphabet, allowing 3628 the password to be generated from a base64 encoding of a 128 bit 3629 value. In addition, if content is base64 encoded to generate the 3630 password, it MUST NOT be padded with '='. 3632 13. Security Considerations 3634 There are several types of attacks possible in an ICE system. This 3635 section considers these attacks and their countermeasures. 3637 13.1 Attacks on Connectivity Checks 3639 An attacker might attempt to disrupt the STUN-based connectivity 3640 checks. Ultimately, all of these attacks fool an agent into thinking 3641 something incorrect about the results of the connectivity checks. 3642 The possible false conclusions an attacker can try to cause are: 3644 False Invalid: An attacker can fool a pair of agents into thinking a 3645 candidate pair is invalid, when it isn't. This can be used to 3646 cause an agent to prefer a different candidate (such as one 3647 injected by the attacker), or to disrupt a call by forcing all 3648 candidates to fail. 3650 False Valid: An attacker can fool a pair of agents into thinking a 3651 candidate pair is valid, when it isn't. This can cause an agent 3652 to proceed with a session, but then not be able to receive any 3653 media. 3655 False Peer-Derived Candidate: An attacker can cause an agent to 3656 discover a new peer-derived candidate, when it shouldn't have. 3657 This can be used to redirect media streams to a DoS target or to 3658 the attacker, for eavesdropping or other purposes. 
3660 False Valid on False Candidate: An attacker has already convinced an 3661 agent that there is a candidate with an address that doesn't 3662 actually route to that agent (for example, by injecting a false 3663 peer-derived candidate or false STUN-derived candidate). It must 3664 then launch an attack that forces the agents to believe that this 3665 candidate is valid. 3667 Of the various techniques for creating faked STUN messages described 3668 in [13], many are not applicable for the connectivity checks. 3669 Compromises of STUN servers are not much of a concern, since the STUN 3670 servers are embedded in endpoints and distributed throughout the 3671 network. Thus, compromising the STUN server is equivalent to 3672 compromising the endpoint, and if that happens, far more problematic 3673 attacks are possible than those against ICE. Similarly, DNS attacks 3674 are irrelevant since STUN servers are not discovered via DNS; they 3675 are signaled via SIP. Injection of fake responses and relaying of 3676 modified requests can both be handled in ICE with the countermeasures 3677 discussed below. 3679 To force the false invalid result, the attacker has to wait for the 3680 connectivity check for one of the agents to be sent. When it is, the 3681 attacker needs to inject a fake response with an unrecoverable error 3682 response, such as a 600. This attack only needs to be launched 3683 against one of the agents in order to invalidate the candidate pair. 3684 However, since the candidate is, in fact, valid, the original request 3685 may reach the peer agent, and result in a success response. The 3686 attacker needs to force this packet or its response to be dropped, 3687 through a DoS attack, layer 2 network disruption, or other technique. 3688 If it doesn't do this, the success response will also reach the 3689 originator, alerting it to a possible attack. This will cause the 3690 agent to abandon the candidate, which is the desired result in any 3691 case. 
Fortunately, this attack is mitigated completely through the 3692 STUN message integrity mechanism. The attacker needs to inject a 3693 fake response, and in order for this response to be processed, the 3694 attacker needs the password. If the offer/answer signaling is 3695 secured, the attacker will not have the password. 3697 Forcing the fake valid result works in a similar way. The attacker 3698 needs to wait for the Binding Request from each agent, and inject a 3699 fake success response. The attacker won't need to worry about 3700 disrupting the actual response since, if the candidate is not valid, 3701 it presumably wouldn't be received anyway. However, like the fake 3702 invalid attack, this attack is mitigated completely through the STUN 3703 message integrity and offer/answer security techniques. 3705 Forcing the false peer-derived candidate result can be done either 3706 with fake requests or responses, or with replays. We consider the 3707 fake requests and responses case first. It requires the attacker to 3708 send a Binding Request to one agent with a source IP address and port 3709 for the false transport address. In addition, the attacker must wait 3710 for a Binding Request from the other agent, and generate a fake 3711 response with a MAPPED-ADDRESS attribute. This attack is best 3712 launched against a candidate pair that is likely to be invalid, so 3713 the attacker doesn't need to contend with the actual responses to the 3714 real connectivity checks. Like the other attacks described here, 3715 this attack is mitigated by the STUN message integrity mechanisms and 3716 secure offer/answer exchanges. 3718 Forcing the false peer-derived candidate result with packet replays 3719 is different. The attacker waits until one of the agents sends a 3720 Binding Request for one of the transport address pairs. It then 3721 intercepts this request, and replays it towards the other agent with 3722 a faked source IP address. 
It must also prevent the original request 3723 from reaching the remote agent, either by launching a DoS attack to 3724 cause the packet to be dropped, or forcing it to be dropped using 3725 layer 2 mechanisms. The replayed packet is received at the other 3726 agent, and accepted, since the integrity check passes (the integrity 3727 check cannot and does not cover the source IP address and port). It 3728 is then responded to. This response will contain a MAPPED-ADDRESS 3729 with the false transport address. It is passed to this false 3730 address. The attacker must then intercept it and relay it towards 3731 the originator. 3733 The other agent will then initiate a connectivity check towards that 3734 transport address. This validation needs to succeed. This requires 3735 the attacker to force a false valid on a false candidate. Injection 3736 of fake requests or responses to achieve this goal is prevented using 3737 the integrity mechanisms of STUN and the offer/answer exchange. 3738 Thus, this attack can only be launched through replays. To do that, 3739 the attacker must intercept the Binding Request towards this false 3740 transport address, and replay it towards the other agent. Then, it 3741 must intercept the response and replay that back as well. 3743 This attack is very hard to launch unless the attacker themself is 3744 identified by the fake transport address. This is because it 3745 requires the attacker to intercept and replay packets sent by two 3746 different hosts. If both agents are on different networks (for 3747 example, across the public Internet), this attack can be hard to 3748 coordinate, since it needs to occur against two different endpoints 3749 on different parts of the network at the same time. 3751 If the attacker themself is identified by the fake transport address, 3752 the attack is easier to coordinate. 
However, if SRTP is used [24], 3753 the attacker will not be able to play the media packets; they will 3754 only be able to discard them, effectively disabling the media stream 3755 for the call. However, this attack requires the attacker to disrupt 3756 packets in order to block the connectivity check from reaching the 3757 target. In that case, if the goal is to disrupt the media stream, 3758 it's much easier to just disrupt it with the same mechanism, rather 3759 than attack ICE. 3761 13.2 Attacks on Address Gathering 3763 ICE endpoints make use of STUN for gathering addresses from a STUN 3764 server in the network. This corresponds to the binding 3765 acquisition use case discussed in Section 10.1 of [13]. As a 3766 consequence, the attacks against STUN itself that are described in 3767 Section 12 of [13] can still be used against the STUN address gathering 3768 operations that occur in ICE. 3770 However, the additional mechanisms provided by ICE actually 3771 counteract such attacks, making binding acquisition with STUN more 3772 secure when combined with ICE than without ICE. 3774 Consider an attacker which is able to provide an agent with a faked 3775 MAPPED-ADDRESS in the response to a STUN Binding Request that is used 3776 for address gathering. This is the primary attack primitive described in Section 3777 12 of [13]. This address will be used as a STUN derived candidate in 3778 the ICE exchange. For this candidate to actually be used for media, 3779 the attacker must also attack the connectivity checks, and in 3780 particular, force a false valid on a false candidate. This attack is 3781 very hard to launch if the false address identifies a third party, 3782 and is prevented by SRTP if it identifies the attacker themself. 3784 If the attacker elects not to attack the connectivity checks, the 3785 worst it can do is prevent the STUN-derived address from being used. 
3786 However, if the peer agent has at least one address that is reachable 3787 by the agent under attack, the STUN connectivity checks themselves 3788 will provide a STUN-derived address that can be used for the exchange 3789 of media. Peer-derived candidates are preferred over the candidate 3790 they are generated from for this reason. As such, an attack solely 3791 on the STUN address gathering will normally have no impact on a call 3792 at all. 3794 13.3 Attacks on the Offer/Answer Exchanges 3796 An attacker that can modify or disrupt the offer/answer exchanges 3797 themselves can readily launch a variety of attacks with ICE. They 3798 could direct media to a target of a DoS attack, they could insert 3799 themselves into the media stream, and so on. These are similar to 3800 the general security considerations for offer/answer exchanges, and 3801 the security considerations in RFC 3264 [4] apply. These require 3802 techniques for message integrity and encryption for offers and 3803 answers, which are satisfied by the SIPS mechanism [2] when SIP is 3804 used. As such, the usage of SIPS with ICE is RECOMMENDED. 3806 13.4 Insider Attacks 3808 In addition to attacks where the attacker is a third party trying to 3809 insert fake offers, answers or STUN messages, there are several 3810 attacks possible with ICE when the attacker is an authenticated and 3811 valid participant in the ICE exchange. 3813 13.4.1 The Voice Hammer Attack 3815 The voice hammer attack is an amplification attack, of the variety 3816 discussed in Section 3 of [32]. In this attack, the attacker 3817 initiates sessions to other agents, and includes the IP address and 3818 port of a DoS target in the m/c-line of their SDP. This causes 3819 substantial amplification; a single offer/answer exchange can create 3820 a continuing flood of media packets, possibly at high rates (consider 3821 video sources). This attack is not specific to ICE, but ICE can help 3822 provide remediation. 
3824 Specifically, if ICE is used, the agent receiving the malicious SDP 3825 will first perform connectivity checks to the target of media before 3826 sending it there. If this target is a third party host, the checks 3827 will not succeed, and media is never sent. 3829 Unfortunately, ICE doesn't help if it's not used, in which case an 3830 attacker could simply send the offer without the ICE parameters. 3831 However, in environments where the set of clients is known, and 3832 limited to ones that support ICE, the server can reject any offers or 3833 answers that don't indicate ICE support. 3835 13.4.2 STUN Amplification Attack 3837 The STUN amplification attack is similar to the voice hammer. 3838 However, instead of voice packets being directed to the target, STUN 3839 connectivity checks are directed to the target. This attack is 3840 accomplished by having the offerer send an offer with a large number 3841 of candidates, say 50. The answerer receives the offer, and starts 3842 its checks, which are directed at the target, and consequently, never 3843 generate a response. The answerer will start a new connectivity 3844 check every 50ms, and each check is a STUN transaction consisting of 3845 9 retransmits of a message 64 bytes in length. This produces a 3846 fairly substantial 92 kbps (20 checks per second x 9 transmissions x 64 bytes x 8 bits per byte = 92,160 bits per second), just in STUN requests. 3848 It is impossible to eliminate the amplification, but the volume can 3849 be reduced through a variety of heuristics. For example, agents can 3850 limit the number of candidates they'll accept in an offer or answer, 3851 they can increase the value of Ta, or exponentially increase Ta as 3852 time goes on. All of these ultimately trade off the time for the ICE 3853 exchanges to complete, with the amount of traffic that gets sent. 3855 14. IANA Considerations 3857 This specification defines three new SDP attributes per the procedures 3858 of Appendix B of RFC 2327. The required information for the 3859 registrations is included here. 
3861 14.1 candidate Attribute 3863 Contact Name: Jonathan Rosenberg, jdrosen@jdrosen.net. 3865 Attribute Name: candidate 3867 Long Form: candidate 3869 Type of Attribute: media level 3871 Charset Considerations: The attribute is not subject to the charset 3872 attribute. 3874 Purpose: This attribute is used with Interactive Connectivity 3875 Establishment (ICE), and provides one of many possible candidate 3876 addresses for communication. These addresses are validated with 3877 an end-to-end connectivity check using Simple Traversal of UDP 3878 with NAT (STUN). 3880 Appropriate Values: See Section 12 of RFC XXXX [Note to RFC-ed: 3881 please replace XXXX with the RFC number of this specification]. 3883 14.2 remote-candidate Attribute 3885 Contact Name: Jonathan Rosenberg, jdrosen@jdrosen.net. 3887 Attribute Name: remote-candidate 3889 Long Form: remote-candidate 3891 Type of Attribute: media level 3893 Charset Considerations: The attribute is not subject to the charset 3894 attribute. 3896 Purpose: This attribute is used with Interactive Connectivity 3897 Establishment (ICE), and provides the identity of the remote 3898 candidate that the offerer wishes the answerer to use in its 3899 answer. 3901 Appropriate Values: See Section 12 of RFC XXXX [Note to RFC-ed: 3902 please replace XXXX with the RFC number of this specification]. 3904 14.3 ice-pwd Attribute 3906 Contact Name: Jonathan Rosenberg, jdrosen@jdrosen.net. 3908 Attribute Name: ice-pwd 3910 Long Form: ice-pwd 3912 Type of Attribute: session level 3914 Charset Considerations: The attribute is not subject to the charset 3915 attribute. 3917 Purpose: This attribute is used with Interactive Connectivity 3918 Establishment (ICE), and provides the password used to protect 3919 STUN connectivity checks. 3921 Appropriate Values: See Section 12 of RFC XXXX [Note to RFC-ed: 3922 please replace XXXX with the RFC number of this specification]. 3924 15. 
IAB Considerations 3926 The IAB has studied the problem of "Unilateral Self Address Fixing", 3927 which is the general process by which an agent attempts to determine 3928 its address in another realm on the other side of a NAT through a 3929 collaborative protocol reflection mechanism [22]. ICE is an example 3930 of a protocol that performs this type of function. Interestingly, 3931 the process for ICE is not unilateral, but bilateral, and the 3932 difference has a significant impact on the issues raised by the IAB. The 3933 IAB has mandated that any protocols developed for this purpose 3934 document a specific set of considerations. This section meets those 3935 requirements. 3937 15.1 Problem Definition 3939 From RFC 3424, any UNSAF proposal must provide: 3941 Precise definition of a specific, limited-scope problem that is to 3942 be solved with the UNSAF proposal. A short term fix should not be 3943 generalized to solve other problems; this is why "short term 3944 fixes usually aren't". 3946 The specific problems being solved by ICE are: 3948 Provide a means for two peers to determine the set of transport 3949 addresses which can be used for communication. 3951 Provide a means for resolving many of the limitations of other 3952 UNSAF mechanisms by wrapping them in an additional layer of 3953 processing (the ICE methodology). 3955 Provide a means for an agent to determine an address that is 3956 reachable by another peer with which it wishes to communicate. 3958 15.2 Exit Strategy 3960 From RFC 3424, any UNSAF proposal must provide: 3962 Description of an exit strategy/transition plan. The better short 3963 term fixes are the ones that will naturally see less and less use 3964 as the appropriate technology is deployed. 3966 ICE itself doesn't easily get phased out. However, it is useful even 3967 in a globally connected Internet, to serve as a means for detecting 3968 whether a router failure has temporarily disrupted connectivity, for 3969 example.
However, what ICE does is help phase out other UNSAF 3970 mechanisms. ICE effectively selects amongst those mechanisms, 3971 prioritizing ones that are better, and deprioritizing ones that are 3972 worse. Local IPv6 addresses can be preferred. As NATs begin to 3973 dissipate as IPv6 is introduced, derived transport addresses from 3974 other UNSAF mechanisms simply never get used, because higher priority 3975 connectivity exists. Therefore, the servers get used less and less, 3976 and can eventually be removed when their usage goes to zero. 3978 Indeed, ICE can assist in the transition from IPv4 to IPv6. It can 3979 be used to determine whether to use IPv6 or IPv4 when two dual-stack 3980 hosts communicate with SIP (IPv6 gets used). It can also allow a 3981 network with both 6to4 and native v6 connectivity to determine which 3982 address to use when communicating with a peer. 3984 15.3 Brittleness Introduced by ICE 3986 From RFC 3424, any UNSAF proposal must provide: 3988 Discussion of specific issues that may render systems more 3989 "brittle". For example, approaches that involve using data at 3990 multiple network layers create more dependencies, increase 3991 debugging challenges, and make it harder to transition. 3993 ICE actually removes brittleness from existing UNSAF mechanisms. In 3994 particular, traditional STUN (the usage described in [13]) has 3995 several points of brittleness. One of them is the discovery process, 3996 which requires an agent to try to classify the type of NAT it is 3997 behind. This process is error-prone. With ICE, that discovery 3998 process is simply not used. Rather than unilaterally assessing the 3999 validity of the address, its validity is dynamically determined by 4000 measuring connectivity to a peer. The process of determining 4001 connectivity is very robust. The only potential problem is that 4002 bilaterally fixed addresses through STUN can expire if traffic does 4003 not keep them alive.
However, that is substantially less brittleness 4004 than the STUN discovery mechanisms. 4006 Another point of brittleness in STUN and any other unilateral 4007 mechanism is its absolute reliance on an additional server. ICE 4008 makes use of a server for allocating unilateral addresses, but allows 4009 agents to directly connect if possible. Therefore, in some cases, 4010 the failure of a STUN server would still allow for a call to progress 4011 when ICE is used. 4013 Another point of brittleness in traditional STUN is that it assumes 4014 that the STUN server is on the public Internet. Interestingly, with 4015 ICE, that is not necessary. There can be a multitude of STUN servers 4016 in a variety of address realms. ICE will discover the one that has 4017 provided a usable address. 4019 The most troubling point of brittleness in traditional STUN is that 4020 it doesn't work in all network topologies. In cases where there is a 4021 shared NAT between each agent and the STUN server, traditional STUN 4022 may not work. With ICE, that restriction can be lifted. 4024 Traditional STUN also introduces some security considerations. 4025 Fortunately, those security considerations are also mitigated by ICE. 4027 15.4 Requirements for a Long Term Solution 4029 From RFC 3424, any UNSAF proposal must provide: 4031 Identify requirements for longer term, sound technical solutions 4032 -- contribute to the process of finding the right longer term 4033 solution. 4035 Our conclusions from STUN remain unchanged. However, we feel ICE 4036 actually helps because we believe it can be part of the long term 4037 solution. 4039 15.5 Issues with Existing NAPT Boxes 4041 From RFC 3424, any UNSAF proposal must provide: 4043 Discussion of the impact of the noted practical issues with 4044 existing, deployed NA[P]Ts and experience reports. 4046 A number of NAT boxes are now being deployed into the market which 4047 try and provide "generic" ALG functionality. 
These generic ALGs hunt 4048 for IP addresses, either in text or binary form within a packet, and 4049 rewrite them if they match a binding. This will interfere with 4050 proper operation of any UNSAF mechanism, including ICE. 4052 16. Acknowledgements 4054 The authors would like to thank Flemming Andreasen, Rohan Mahy, Dean 4055 Willis, Dan Wing, Douglas Otis, and Francois Audet for their comments 4056 and input. A special thanks goes to Magnus Westerlund for doing 4057 several detailed reviews on the various revisions of this 4058 specification. His input led to many substantive improvements in 4059 this document. 4061 17. References 4063 17.1 Normative References 4065 [1] Huitema, C., "Real Time Control Protocol (RTCP) attribute in 4066 Session Description Protocol (SDP)", RFC 3605, October 2003. 4068 [2] Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston, A., 4069 Peterson, J., Sparks, R., Handley, M., and E. Schooler, "SIP: 4070 Session Initiation Protocol", RFC 3261, June 2002. 4072 [3] Josefsson, S., "The Base16, Base32, and Base64 Data Encodings", 4073 RFC 3548, July 2003. 4075 [4] Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model with 4076 Session Description Protocol (SDP)", RFC 3264, June 2002. 4078 [5] Handley, M. and V. Jacobson, "SDP: Session Description 4079 Protocol", RFC 2327, April 1998. 4081 [6] Casner, S., "Session Description Protocol (SDP) Bandwidth 4082 Modifiers for RTP Control Protocol (RTCP) Bandwidth", RFC 3556, 4083 July 2003. 4085 [7] Camarillo, G., Marshall, W., and J. Rosenberg, "Integration of 4086 Resource Management and Session Initiation Protocol (SIP)", 4087 RFC 3312, October 2002. 4089 [8] Camarillo, G. and P. Kyzivat, "Update to the Session Initiation 4090 Protocol (SIP) Preconditions Framework", RFC 4032, March 2005. 4092 [9] Crocker, D. and P. Overell, "Augmented BNF for Syntax 4093 Specifications: ABNF", RFC 4234, October 2005. 4095 [10] Olson, S., Camarillo, G., and A. 
Roach, "Support for IPv6 in 4096 Session Description Protocol (SDP)", RFC 3266, June 2002. 4098 [11] Rosenberg, J. and H. Schulzrinne, "Reliability of Provisional 4099 Responses in Session Initiation Protocol (SIP)", RFC 3262, 4100 June 2002. 4102 [12] Yon, D., "Connection-Oriented Media Transport in the Session 4103 Description Protocol (SDP)", draft-ietf-mmusic-sdp-comedia-10 4104 (work in progress), November 2004. 4106 [13] Rosenberg, J., "Simple Traversal of UDP Through Network Address 4107 Translators (NAT) (STUN)", draft-ietf-behave-rfc3489bis-02 4108 (work in progress), July 2005. 4110 [14] Rosenberg, J., Mahy, R., and C. Huitema, "Obtaining Relay 4111 Addresses from Simple Traversal of UDP Through NAT (STUN)", 4112 Internet Draft draft-ietf-behave-turn-00.txt, February 2006. 4114 17.2 Informative References 4116 [15] Schulzrinne, H., Rao, A., and R. Lanphier, "Real Time Streaming 4117 Protocol (RTSP)", RFC 2326, April 1998. 4119 [16] Rosenberg, J., Weinberger, J., Huitema, C., and R. Mahy, "STUN 4120 - Simple Traversal of User Datagram Protocol (UDP) Through 4121 Network Address Translators (NATs)", RFC 3489, March 2003. 4123 [17] Senie, D., "Network Address Translator (NAT)-Friendly 4124 Application Design Guidelines", RFC 3235, January 2002. 4126 [18] Rosenberg, J. and H. Schulzrinne, "An RTP Payload Format for 4127 Generic Forward Error Correction", RFC 2733, December 1999. 4129 [19] Srisuresh, P., Kuthan, J., Rosenberg, J., Molitor, A., and A. 4130 Rayhan, "Middlebox communication architecture and framework", 4131 RFC 3303, August 2002. 4133 [20] Borella, M., Lo, J., Grabelsky, D., and G. Montenegro, "Realm 4134 Specific IP: Framework", RFC 3102, October 2001. 4136 [21] Borella, M., Grabelsky, D., Lo, J., and K. Taniguchi, "Realm 4137 Specific IP: Protocol Specification", RFC 3103, October 2001. 4139 [22] Daigle, L. 
and IAB, "IAB Considerations for UNilateral Self- 4140 Address Fixing (UNSAF) Across Network Address Translation", 4141 RFC 3424, November 2002. 4143 [23] Schulzrinne, H., Casner, S., Frederick, R., and V. Jacobson, 4144 "RTP: A Transport Protocol for Real-Time Applications", 4145 RFC 3550, July 2003. 4147 [24] Baugher, M., McGrew, D., Naslund, M., Carrara, E., and K. 4148 Norrman, "The Secure Real-time Transport Protocol (SRTP)", 4149 RFC 3711, March 2004. 4151 [25] Carpenter, B. and K. Moore, "Connection of IPv6 Domains via 4152 IPv4 Clouds", RFC 3056, February 2001. 4154 [26] Zopf, R., "Real-time Transport Protocol (RTP) Payload for 4155 Comfort Noise (CN)", RFC 3389, September 2002. 4157 [27] Rosenberg, J., "The Session Initiation Protocol (SIP) UPDATE 4158 Method", RFC 3311, October 2002. 4160 [28] Bonica, R., Kompella, K., and D. Meyer, "Tracing Requirements 4161 for Generic Tunnels", RFC 3609, September 2003. 4163 [29] Camarillo, G. and H. Schulzrinne, "Early Media and Ringing Tone 4164 Generation in the Session Initiation Protocol (SIP)", RFC 3960, 4165 December 2004. 4167 [30] Andreasen, F., "Connectivity Preconditions for Session 4168 Description Protocol Media Streams", 4169 draft-ietf-mmusic-connectivity-precon-01 (work in progress), 4170 October 2005. 4172 [31] Andreasen, F., "A No-Op Payload Format for RTP", 4173 draft-ietf-avt-rtp-no-op-00 (work in progress), May 2005. 4175 [32] Rescorla, E. and M. Handley, "Internet Denial of Service 4176 Considerations", draft-iab-dos-03 (work in progress), 4177 September 2005. 4179 [33] Huitema, C., "Teredo: Tunneling IPv6 over UDP through NATs", 4180 draft-huitema-v6ops-teredo-05 (work in progress), April 2005. 4182 [34] Kohler, E., "Datagram Congestion Control Protocol (DCCP)", 4183 draft-ietf-dccp-spec-13 (work in progress), December 2005. 
4185 [35] Lazzaro, J., "Framing RTP and RTCP Packets over Connection- 4186 Oriented Transport", draft-ietf-avt-rtp-framing-contrans-06 4187 (work in progress), September 2005. 4189 [36] Hellstrom, G., "RTP Payload for Text Conversation", 4190 draft-ietf-avt-rfc2793bis-09 (work in progress), August 2004. 4192 [37] Audet, F. and C. Jennings, "NAT Behavioral Requirements for 4193 Unicast UDP", Internet Draft draft-ietf-behave-nat-udp-00.txt, 4194 February 2006. 4196 Author's Address 4198 Jonathan Rosenberg 4199 Cisco Systems 4200 600 Lanidex Plaza 4201 Parsippany, NJ 07054 4202 US 4204 Phone: +1 973 952-5000 4205 Email: jdrosen@cisco.com 4206 URI: http://www.jdrosen.net 4208 Intellectual Property Statement 4210 The IETF takes no position regarding the validity or scope of any 4211 Intellectual Property Rights or other rights that might be claimed to 4212 pertain to the implementation or use of the technology described in 4213 this document or the extent to which any license under such rights 4214 might or might not be available; nor does it represent that it has 4215 made any independent effort to identify any such rights. Information 4216 on the procedures with respect to rights in RFC documents can be 4217 found in BCP 78 and BCP 79. 4219 Copies of IPR disclosures made to the IETF Secretariat and any 4220 assurances of licenses to be made available, or the result of an 4221 attempt made to obtain a general license or permission for the use of 4222 such proprietary rights by implementers or users of this 4223 specification can be obtained from the IETF on-line IPR repository at 4224 http://www.ietf.org/ipr. 4226 The IETF invites any interested party to bring to its attention any 4227 copyrights, patents or patent applications, or other proprietary 4228 rights that may cover technology that may be required to implement 4229 this standard. Please address the information to the IETF at 4230 ietf-ipr@ietf.org. 
4232 Disclaimer of Validity 4234 This document and the information contained herein are provided on an 4235 "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS 4236 OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET 4237 ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, 4238 INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE 4239 INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED 4240 WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. 4242 Copyright Statement 4244 Copyright (C) The Internet Society (2006). This document is subject 4245 to the rights, licenses and restrictions contained in BCP 78, and 4246 except as set forth therein, the authors retain all their rights. 4248 Acknowledgment 4250 Funding for the RFC Editor function is currently provided by the 4251 Internet Society.