MMUSIC                                                        A. Keranen
Internet-Draft                                                  Ericsson
Obsoletes: 5245 (if approved)                               J. Rosenberg
Intended status: Standards Track                             jdrosen.net
Expires: June 23, 2016                                 December 21, 2015

  Interactive Connectivity Establishment (ICE): A Protocol for Network
                   Address Translator (NAT) Traversal
                      draft-ietf-ice-rfc5245bis-01

Abstract

   This document describes a protocol for Network Address Translator
   (NAT) traversal for UDP-based multimedia.  This protocol is called
   Interactive Connectivity Establishment (ICE).  ICE makes use of the
   Session Traversal Utilities for NAT (STUN) protocol and its
   extension, Traversal Using Relay NAT (TURN).

   This document obsoletes RFC 5245.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on June 23, 2016.

Copyright Notice

   Copyright (c) 2015 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

   This document may contain material from IETF Documents or IETF
   Contributions published or made publicly available before November
   10, 2008.  The person(s) controlling the copyright in some of this
   material may not have granted the IETF Trust the right to allow
   modifications of such material outside the IETF Standards Process.
   Without obtaining an adequate license from the person(s) controlling
   the copyright in such materials, this document may not be modified
   outside the IETF Standards Process, and derivative works of it may
   not be created outside the IETF Standards Process, except to format
   it for publication as an RFC or to translate it into languages other
   than English.

Table of Contents

   1.  Introduction
   2.  Overview of ICE
     2.1.  Gathering Candidate Addresses
     2.2.  Connectivity Checks
     2.3.  Sorting Candidates
     2.4.  Frozen Candidates
     2.5.  Security for Checks
     2.6.  Concluding ICE
     2.7.  Lite Implementations
     2.8.  Usages of ICE
   3.  Terminology
   4.  ICE Candidate Gathering and Exchange
     4.1.  Procedures for Full Implementation
       4.1.1.  Gathering Candidates
         4.1.1.1.  Host Candidates
         4.1.1.2.  Server Reflexive and Relayed Candidates
         4.1.1.3.  Computing Foundations
         4.1.1.4.  Keeping Candidates Alive
       4.1.2.  Prioritizing Candidates
         4.1.2.1.  Recommended Formula
         4.1.2.2.  Guidelines for Choosing Type and Local Preferences
       4.1.3.  Eliminating Redundant Candidates
     4.2.  Lite Implementation Procedures
     4.3.  Encoding the Candidate Information
   5.  ICE Candidate Processing
     5.1.  Procedures for Full Implementation
       5.1.1.  Verifying ICE Support
       5.1.2.  Determining Role
       5.1.3.  Forming the Check Lists
         5.1.3.1.  Forming Candidate Pairs
         5.1.3.2.  Computing Pair Priority and Ordering Pairs
         5.1.3.3.  Pruning the Pairs
         5.1.3.4.  Computing States
       5.1.4.  Scheduling Checks
     5.2.  Lite Implementation Procedures
   6.  Performing Connectivity Checks
     6.1.  STUN Client Procedures
       6.1.1.  Creating Permissions for Relayed Candidates
       6.1.2.  Sending the Request
         6.1.2.1.  PRIORITY and USE-CANDIDATE
         6.1.2.2.  ICE-CONTROLLED and ICE-CONTROLLING
         6.1.2.3.  Forming Credentials
         6.1.2.4.  DiffServ Treatment
       6.1.3.  Processing the Response
         6.1.3.1.  Failure Cases
         6.1.3.2.  Success Cases
           6.1.3.2.1.  Discovering Peer Reflexive Candidates
           6.1.3.2.2.  Constructing a Valid Pair
           6.1.3.2.3.  Updating Pair States
           6.1.3.2.4.  Updating the Nominated Flag
         6.1.3.3.  Check List and Timer State Updates
     6.2.  STUN Server Procedures
       6.2.1.  Additional Procedures for Full Implementations
         6.2.1.1.  Detecting and Repairing Role Conflicts
         6.2.1.2.  Computing Mapped Address
         6.2.1.3.  Learning Peer Reflexive Candidates
         6.2.1.4.  Triggered Checks
         6.2.1.5.  Updating the Nominated Flag
       6.2.2.  Additional Procedures for Lite Implementations
   7.  Concluding ICE Processing
     7.1.  Procedures for Full Implementations
       7.1.1.  Nominating Pairs
         7.1.1.1.  Regular Nomination
         7.1.1.2.  Aggressive Nomination
       7.1.2.  Updating States
     7.2.  Procedures for Lite Implementations
       7.2.1.  Peer Is Full
       7.2.2.  Peer Is Lite
     7.3.  Freeing Candidates
       7.3.1.  Full Implementation Procedures
       7.3.2.  Lite Implementation Procedures
   8.  ICE Restarts
   9.  Keepalives
   10. Media Handling
     10.1.  Sending Media
       10.1.1.  Procedures for Full Implementations
       10.1.2.  Procedures for Lite Implementations
       10.1.3.  Procedures for All Implementations
     10.2.  Receiving Media
   11. Extensibility Considerations
   12. Setting Ta and RTO
     12.1.  Real-time Media Streams
     12.2.  Non-real-time Sessions
   13. Example
   14. Security Considerations
     14.1.  Attacks on Connectivity Checks
     14.2.  Attacks on Server Reflexive Address Gathering
     14.3.  Attacks on Relayed Candidate Gathering
     14.4.  Insider Attacks
       14.4.1.  STUN Amplification Attack
   15. STUN Extensions
     15.1.  New Attributes
     15.2.  New Error Response Codes
   16. Operational Considerations
     16.1.  NAT and Firewall Types
     16.2.  Bandwidth Requirements
       16.2.1.  STUN and TURN Server Capacity Planning
       16.2.2.  Gathering and Connectivity Checks
       16.2.3.  Keepalives
     16.3.  ICE and ICE-lite
     16.4.  Troubleshooting and Performance Management
     16.5.  Endpoint Configuration
   17. IANA Considerations
     17.1.  STUN Attributes
     17.2.  STUN Error Responses
   18. IAB Considerations
     18.1.  Problem Definition
     18.2.  Exit Strategy
     18.3.  Brittleness Introduced by ICE
     18.4.  Requirements for a Long-Term Solution
     18.5.  Issues with Existing NAPT Boxes
   19. Changes from RFC 5245
   20. Acknowledgements
   21. References
     21.1.  Normative References
     21.2.  Informative References
   Appendix A.  Lite and Full Implementations
   Appendix B.  Design Motivations
     B.1.  Pacing of STUN Transactions
     B.2.  Candidates with Multiple Bases
     B.3.  Purpose of the Related Address and Related Port Attributes
     B.4.  Importance of the STUN Username
     B.5.  The Candidate Pair Priority Formula
     B.6.  Why Are Keepalives Needed?
     B.7.  Why Prefer Peer Reflexive Candidates?
     B.8.  Why Are Binding Indications Used for Keepalives?
   Authors' Addresses

1.  Introduction

   Protocols establishing multimedia sessions between peers typically
   involve exchanging IP addresses and ports for the media sources and
   sinks.  However, this poses challenges when operated through Network
   Address Translators (NATs) [RFC3235].  These protocols also seek to
   create a media flow directly between participants, so that there is
   no application layer intermediary between them.  This is done to
   reduce media latency, decrease packet loss, and reduce the
   operational costs of deploying the application.  However, this is
   difficult to accomplish through NAT.  A full treatment of the reasons
   for this is beyond the scope of this specification.

   Numerous solutions have been defined for allowing these protocols to
   operate through NAT.
   These include Application Layer Gateways
   (ALGs), the Middlebox Control Protocol [RFC3303], the original Simple
   Traversal of UDP Through NAT (STUN) [RFC3489] specification, and
   Realm Specific IP [RFC3102] [RFC3103] along with session description
   extensions needed to make them work, such as the Session Description
   Protocol (SDP) [RFC4566] attribute for the Real Time Control Protocol
   (RTCP) [RFC3605].  Unfortunately, these techniques all have pros and
   cons that make each one optimal in some network topologies but a
   poor choice in others.  The result is that administrators and
   implementors are making assumptions about the topologies of the
   networks in which their solutions will be deployed.  This introduces
   complexity and brittleness into the system.  What is needed is a
   single solution that is flexible enough to work well in all
   situations.

   This specification defines Interactive Connectivity Establishment
   (ICE) as a technique for NAT traversal for UDP-based media streams
   (though ICE has been extended to handle other transport protocols,
   such as TCP [RFC6544]).  ICE works by exchanging a multiplicity of IP
   addresses and ports, which are then tested for connectivity by
   peer-to-peer connectivity checks.  The IP addresses and ports are
   exchanged via signaling mechanisms (for example, in an offer/answer
   exchange), and the connectivity checks are performed using the
   Session Traversal Utilities for NAT (STUN) specification [RFC5389].
   ICE also makes use of Traversal Using Relays around NAT (TURN)
   [RFC5766], an extension to STUN.  Because ICE exchanges a
   multiplicity of IP addresses and ports for each media stream, it also
   allows for address selection for multihomed and dual-stack hosts, and
   for this reason it deprecates [RFC4091] and [RFC4092].

2.  Overview of ICE

   In a typical ICE deployment, we have two endpoints (known as ICE
   AGENTS) that want to communicate.
   They are able to communicate
   indirectly via some signaling protocol (such as SIP), by which they
   can exchange ICE candidates.  Note that ICE is not intended for NAT
   traversal for the signaling protocol, which is assumed to be provided
   via another mechanism.  At the beginning of the ICE process, the
   agents are ignorant of their own topologies.  In particular, they
   might or might not be behind a NAT (or multiple tiers of NATs).  ICE
   allows the agents to discover enough information about their
   topologies to potentially find one or more paths by which they can
   communicate.

   Figure 1 shows a typical environment for ICE deployment.  The two
   endpoints are labelled L and R (for left and right, which helps
   visualize call flows).  Both L and R are behind their own respective
   NATs, though they may not be aware of it.  The type of NAT and its
   properties are also unknown.  Agents L and R are capable of engaging
   in a candidate exchange process, whose purpose is to set up a media
   session between L and R.  Typically, this exchange will occur through
   a signaling (e.g., SIP) server.

   In addition to the agents, a signaling server, and NATs, ICE is
   typically used in concert with STUN or TURN servers in the network.
   Each agent can have its own STUN or TURN server, or they can be the
   same.

                           +---------+
         +--------+        |Signaling|         +--------+
         | STUN   |        |Server   |         | STUN   |
         | Server |        +---------+         | Server |
         +--------+       /           \        +--------+
                         /             \
                        /               \
                       / <- Signaling -> \
                      /                   \
               +--------+               +--------+
               |  NAT   |               |  NAT   |
               +--------+               +--------+
                  /                             \
                 /                               \
             +-------+                       +-------+
             | Agent |                       | Agent |
             |   L   |                       |   R   |
             +-------+                       +-------+

                     Figure 1: ICE Deployment Scenario

   The basic idea behind ICE is as follows: each agent has a variety of
   candidate TRANSPORT ADDRESSES (combination of IP address and port for
   a particular transport protocol, which is always UDP in this
   specification) it could use to communicate with the other agent.
   These might include:

   o  A transport address on a directly attached network interface

   o  A translated transport address on the public side of a NAT (a
      "server reflexive" address)

   o  A transport address allocated from a TURN server (a "relayed
      address")

   Potentially, any of L's candidate transport addresses can be used to
   communicate with any of R's candidate transport addresses.  In
   practice, however, many combinations will not work.  For instance, if
   L and R are both behind NATs, their directly attached interface
   addresses are unlikely to be able to communicate directly (this is
   why ICE is needed, after all!).  The purpose of ICE is to discover
   which pairs of addresses will work.  The way that ICE does this is to
   systematically try all possible pairs (in a carefully sorted order)
   until it finds one or more that work.

2.1.  Gathering Candidate Addresses

   In order to execute ICE, an agent has to identify all of its address
   candidates.  A CANDIDATE is a transport address -- a combination of
   IP address and port for a particular transport protocol (with only
   UDP specified here).
   This document defines three types of
   candidates, some derived from physical or logical network interfaces,
   others discoverable via STUN and TURN.  Naturally, one viable
   candidate is a transport address obtained directly from a local
   interface.  Such a candidate is called a HOST CANDIDATE.  The local
   interface could be Ethernet or WiFi, or it could be one that is
   obtained through a tunnel mechanism, such as a Virtual Private
   Network (VPN) or Mobile IP (MIP).  In all cases, such a network
   interface appears to the agent as a local interface from which ports
   (and thus candidates) can be allocated.

   If an agent is multihomed, it obtains a candidate from each IP
   address.  Depending on the location of the PEER (the other agent in
   the session) on the IP network relative to the agent, the agent may
   be reachable by the peer through one or more of those IP addresses.
   Consider, for example, an agent that has a local IP address on a
   private net 10 network (I1), and a second connected to the public
   Internet (I2).  A candidate from I1 will be directly reachable when
   communicating with a peer on the same private net 10 network, while a
   candidate from I2 will be directly reachable when communicating with
   a peer on the public Internet.  Rather than trying to guess which IP
   address will work, the initiating agent sends both candidates to its
   peer.

   Next, the agent uses STUN or TURN to obtain additional candidates.
   These come in two flavors: translated addresses on the public side of
   a NAT (SERVER REFLEXIVE CANDIDATES) and addresses on TURN servers
   (RELAYED CANDIDATES).  When TURN servers are utilized, both types of
   candidates are obtained from the TURN server.  If only STUN servers
   are utilized, only server reflexive candidates are obtained from
   them.  The relationship of these candidates to the host candidate is
   shown in Figure 2.
   In this figure, both types of candidates are
   discovered using TURN.  In the figure, the notation X:x means IP
   address X and UDP port x.

                 To Internet

                     |
                     |
                     |  /------------  Relayed
                 Y:y | /               Address
                 +--------+
                 |        |
                 |  TURN  |
                 | Server |
                 |        |
                 +--------+
                     |
                     |
                     | /------------  Server
              X1':x1'|/               Reflexive
               +------------+         Address
               |    NAT     |
               +------------+
                     |
                     | /------------  Local
                 X:x |/               Address
                 +--------+
                 |        |
                 | Agent  |
                 |        |
                 +--------+

                     Figure 2: Candidate Relationships

   When the agent sends the TURN Allocate request from IP address and
   port X:x, the NAT (assuming there is one) will create a binding
   X1':x1', mapping this server reflexive candidate to the host
   candidate X:x.  Outgoing packets sent from the host candidate will be
   translated by the NAT to the server reflexive candidate.  Incoming
   packets sent to the server reflexive candidate will be translated by
   the NAT to the host candidate and forwarded to the agent.  We call
   the host candidate associated with a given server reflexive candidate
   the BASE.

      Note: "Base" refers to the address an agent sends from for a
      particular candidate.  Thus, as a degenerate case, host candidates
      also have a base, but it's the same as the host candidate.

   When there are multiple NATs between the agent and the TURN server,
   the TURN request will create a binding on each NAT, but only the
   outermost server reflexive candidate (the one nearest the TURN
   server) will be discovered by the agent.  If the agent is not behind
   a NAT, then the base candidate will be the same as the server
   reflexive candidate, and the server reflexive candidate is redundant
   and will be eliminated.

   The Allocate request then arrives at the TURN server.  The TURN
   server allocates a port y from its local IP address Y, and generates
   an Allocate response, informing the agent of this relayed candidate.
   The TURN server also informs the agent of the server reflexive
   candidate, X1':x1', by copying the source transport address of the
   Allocate request into the Allocate response.  The TURN server acts as
   a packet relay, forwarding traffic between L and R.  In order to send
   traffic to L, R sends traffic to the TURN server at Y:y, and the TURN
   server forwards that to X1':x1', which passes through the NAT where
   it is mapped to X:x and delivered to L.

   When only STUN servers are utilized, the agent sends a STUN Binding
   request [RFC5389] to its STUN server.  The STUN server will inform
   the agent of the server reflexive candidate X1':x1' by copying the
   source transport address of the Binding request into the Binding
   response.

2.2.  Connectivity Checks

   Once L has gathered all of its candidates, it orders them from
   highest to lowest priority and sends them to R over the signaling
   channel.  When R receives the candidates from L, it performs the same
   gathering process and responds with its own list of candidates.  At
   the end of this process, each agent has a complete list of both its
   candidates and its peer's candidates.  It pairs them up, resulting in
   CANDIDATE PAIRS.  To see which pairs work, each agent schedules a
   series of CHECKS.  Each check is a STUN request/response transaction
   that the client will perform on a particular candidate pair by
   sending a STUN request from the local candidate to the remote
   candidate.

   The basic principle of the connectivity checks is simple:

   1.  Sort the candidate pairs in priority order.

   2.  Send checks on each candidate pair in priority order.

   3.  Acknowledge checks received from the other agent.
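
   The pairing and sorting steps above can be sketched in a few lines
   of Python.  This is an illustrative sketch only, not a normative
   part of the protocol.  The names Candidate, pair_priority, and
   build_check_list are hypothetical.  The type preference values and
   the two priority formulas are assumptions borrowed from the
   recommended formulas in RFC 5245; this overview does not state them,
   and the sections on prioritization remain the authority.  The
   addresses come from the documentation ranges.

```python
from dataclasses import dataclass
from itertools import product

# Assumption: type preferences recommended by RFC 5245, not stated in this overview.
TYPE_PREFERENCE = {"host": 126, "prflx": 110, "srflx": 100, "relay": 0}


@dataclass(frozen=True)
class Candidate:
    """Hypothetical record for one candidate transport address (UDP only)."""
    kind: str                      # "host", "srflx", "prflx", or "relay"
    address: str
    port: int
    component: int = 1
    local_preference: int = 65535

    @property
    def priority(self) -> int:
        # Assumed candidate formula (RFC 5245):
        # 2^24 * type pref + 2^8 * local pref + (256 - component ID)
        return ((TYPE_PREFERENCE[self.kind] << 24)
                + (self.local_preference << 8)
                + (256 - self.component))


def pair_priority(controlling: int, controlled: int) -> int:
    """Assumed pair formula (RFC 5245): 2^32*MIN(G,D) + 2*MAX(G,D) + (G>D ? 1 : 0)."""
    g, d = controlling, controlled
    return (min(g, d) << 32) + 2 * max(g, d) + (1 if g > d else 0)


def build_check_list(local, remote, local_is_controlling):
    """Pair local and remote candidates of the same component, highest priority first."""
    pairs = []
    for loc, rem in product(local, remote):
        if loc.component != rem.component:
            continue
        if local_is_controlling:
            g, d = loc.priority, rem.priority
        else:
            g, d = rem.priority, loc.priority
        pairs.append((pair_priority(g, d), loc, rem))
    pairs.sort(key=lambda item: item[0], reverse=True)
    return pairs


if __name__ == "__main__":
    l_cands = [Candidate("host", "192.0.2.10", 5000),
               Candidate("srflx", "198.51.100.1", 6000),
               Candidate("relay", "203.0.113.5", 7000)]
    r_cands = [Candidate("host", "192.0.2.20", 5000),
               Candidate("srflx", "198.51.100.2", 6000)]
    # L is assumed to be the controlling agent in this demo.
    for prio, loc, rem in build_check_list(l_cands, r_cands, True):
        print(prio, f"{loc.address}:{loc.port}", "->", f"{rem.address}:{rem.port}")
```

   Because the pair formula is symmetric in the two agents' roles, L
   and R arrive at the same ordering when each runs build_check_list
   from its own point of view.  Sending the checks and acknowledging
   the peer's checks (steps 2 and 3) are left out of the sketch.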

   With both agents performing a check on a candidate pair, the result
   is a 4-way handshake:

             L                        R
             -                        -
             STUN request ->             \  L's
                       <- STUN response  /  check

                        <- STUN request  \  R's
             STUN response ->            /  check

                    Figure 3: Basic Connectivity Check

   It is important to note that the STUN requests are sent to and from
   the exact same IP addresses and ports that will be used for media
   (e.g., RTP and RTCP).  Consequently, agents demultiplex STUN and
   RTP/RTCP using contents of the packets, rather than the port on which
   they are received.  Fortunately, this demultiplexing is easy to do,
   especially for RTP and RTCP.

   Because a STUN Binding request is used for the connectivity check,
   the STUN Binding response will contain the agent's translated
   transport address on the public side of any NATs between the agent
   and its peer.  If this transport address is different from other
   candidates the agent already learned, it represents a new candidate,
   called a PEER REFLEXIVE CANDIDATE, which then gets tested by ICE just
   the same as any other candidate.

   As an optimization, as soon as R gets L's check message, R schedules
   a connectivity check message to be sent to L on the same candidate
   pair.  This accelerates the process of finding a valid candidate, and
   is called a TRIGGERED CHECK.

   At the end of this handshake, both L and R know that they can send
   (and receive) messages end-to-end in both directions.

2.3.  Sorting Candidates

   Because the algorithm above searches all candidate pairs, if a
   working pair exists it will eventually find it no matter what order
   the candidates are tried in.  In order to produce faster (and better)
   results, the candidates are sorted in a specified order.  The
   resulting list of sorted candidate pairs is called the CHECK LIST.
   The algorithm is described in Section 4.1.2 but follows two general
   principles:

   o  Each agent gives its candidates a numeric priority, which is sent
      along with the candidate to the peer.

   o  The local and remote priorities are combined so that each agent
      has the same ordering for the candidate pairs.

   The second property is important for getting ICE to work when there
   are NATs in front of L and R.  Frequently, NATs will not allow
   packets in from a host until the agent behind the NAT has sent a
   packet towards that host.  Consequently, ICE checks in each direction
   will not succeed until both sides have sent a check through their
   respective NATs.

   The agent works through this check list by periodically sending a
   STUN request for the next candidate pair on the list.  These are
   called ORDINARY CHECKS.

   In general, the priority algorithm is designed so that candidates of
   similar type get similar priorities and so that more direct routes
   (that is, through fewer media relays and through fewer NATs) are
   preferred over indirect ones (ones with more media relays and more
   NATs).  Within those guidelines, however, agents have a fair amount
   of discretion about how to tune their algorithms.

2.4.  Frozen Candidates

   The previous description only addresses the case where the agents
   wish to establish a media session with one COMPONENT (a piece of a
   media stream requiring a single transport address; a media stream may
   require multiple components, each of which has to work for the media
   stream as a whole to work).  Sometimes (e.g., with RTP and RTCP in
   separate components), the agents actually need to establish
   connectivity for more than one flow.

   The network properties are likely to be very similar for each
   component (especially because RTP and RTCP are sent and received from
   the same IP address).
   It is usually possible to leverage information
   from one media component in order to determine the best candidates
   for another.  ICE does this with a mechanism called "frozen
   candidates".

   Each candidate is associated with a property called its FOUNDATION.
   Two candidates have the same foundation when they are "similar" -- of
   the same type and obtained from the same host candidate and STUN/TURN
   server using the same protocol.  Otherwise, their foundation is
   different.  A candidate pair has a foundation too, which is just the
   concatenation of the foundations of its two candidates.  Initially,
   only the candidate pairs with unique foundations are tested.  The
   other candidate pairs are marked "frozen".  When the connectivity
   checks for a candidate pair succeed, the other candidate pairs with
   the same foundation are unfrozen.  This avoids repeated checking of
   components that are superficially more attractive but in fact are
   likely to fail.

   While we've described "frozen" here as a separate mechanism for
   expository purposes, in fact it is an integral part of ICE and the
   ICE prioritization algorithm automatically ensures that the right
   candidates are unfrozen and checked in the right order.  However, if
   the ICE usage does not utilize multiple components or media streams,
   it does not need to implement this algorithm.

2.5.  Security for Checks

   Because ICE is used to discover which addresses can be used to send
   media between two agents, it is important to ensure that the process
   cannot be hijacked to send media to the wrong location.  Each STUN
   connectivity check is covered by a message authentication code (MAC)
   computed using a key exchanged in the signaling channel.  This MAC
   provides message integrity and data origin authentication, thus
   stopping an attacker from forging or modifying connectivity check
   messages.
Furthermore, if for example a SIP [RFC3261] caller is 562 using ICE, and their call forks, the ICE exchanges happen 563 independently with each forked recipient. In such a case, the keys 564 exchanged in the signaling help associate each ICE exchange with each 565 forked recipient. 567 2.6. Concluding ICE 569 ICE checks are performed in a specific sequence, so that high- 570 priority candidate pairs are checked first, followed by lower- 571 priority ones. One way to conclude ICE is to declare victory as soon 572 as a check for each component of each media stream completes 573 successfully. Indeed, this is a reasonable algorithm, and details 574 for it are provided below. However, it is possible that a packet 575 loss will cause a higher-priority check to take longer to complete. 576 In that case, allowing ICE to run a little longer might produce 577 better results. More fundamentally, however, the prioritization 578 defined by this specification may not yield "optimal" results. As an 579 example, if the aim is to select low-latency media paths, usage of a 580 relay is a hint that latencies may be higher, but it is nothing more 581 than a hint. An actual round-trip time (RTT) measurement could be 582 made, and it might demonstrate that a pair with lower priority is 583 actually better than one with higher priority. 585 Consequently, ICE assigns one of the agents in the role of the 586 CONTROLLING AGENT, and the other of the CONTROLLED AGENT. The 587 controlling agent gets to nominate which candidate pairs will get 588 used for media amongst the ones that are valid. It can do this in 589 one of two ways -- using REGULAR NOMINATION or AGGRESSIVE NOMINATION. 591 With regular nomination, the controlling agent lets the checks 592 continue until at least one valid candidate pair for each media 593 stream is found. 
Then, it picks amongst those that are valid, and 594 sends a second STUN request on its NOMINATED candidate pair, but this 595 time with a flag set to tell the peer that this pair has been 596 nominated for use. This is shown in Figure 4. 598 L R 599 - - 600 STUN request -> \ L's 601 <- STUN response / check 603 <- STUN request \ R's 604 STUN response -> / check 606 STUN request + flag -> \ L's 607 <- STUN response / check 609 Figure 4: Regular Nomination 611 Once the STUN transaction with the flag completes, both sides cancel 612 any future checks for that media stream. ICE will now send media 613 using this pair. The pair an ICE agent is using for media is called 614 the SELECTED PAIR. 616 In aggressive nomination, the controlling agent puts the flag in 617 every connectivity check STUN request it sends. This way, once the 618 first check succeeds, ICE processing is complete for that media 619 stream and the controlling agent doesn't have to send a second STUN 620 request. The selected pair will be the highest-priority valid pair 621 whose check succeeded. Aggressive nomination is faster than regular 622 nomination, but gives less flexibility. Aggressive nomination is 623 shown in Figure 5. 625 L R 626 - - 627 STUN request + flag -> \ L's 628 <- STUN response / check 630 <- STUN request \ R's 631 STUN response -> / check 633 Figure 5: Aggressive Nomination 635 Once ICE is concluded, it can be restarted at any time for one or all 636 of the media streams by either agent. This is done by sending 637 updated candidate information indicating a restart. 639 2.7. Lite Implementations 641 In order for ICE to be used in a call, both agents need to support 642 it. However, certain agents will always be connected to the public 643 Internet and have a public IP address at which they can receive packets 644 from any correspondent.
To make it easier for these devices to 645 support ICE, ICE defines a special type of implementation called LITE 646 (in contrast to the normal FULL implementation). A lite 647 implementation doesn't gather candidates; it includes only host 648 candidates for any media stream. Lite agents do not generate 649 connectivity checks or run the state machines, though they need to be 650 able to respond to connectivity checks. When a lite implementation 651 connects with a full implementation, the full agent takes the role of 652 the controlling agent, and the lite agent takes on the controlled 653 role. When two lite implementations connect, no checks are sent. 655 For guidance on when a lite implementation is appropriate, see the 656 discussion in Appendix A. 658 It is important to note that the lite implementation was added to 659 this specification to provide a stepping stone to full 660 implementation. Even for devices that are always connected to the 661 public Internet, a full implementation is preferable if achievable. 663 2.8. Usages of ICE 665 This document specifies generic use of ICE with protocols that 666 provide means to exchange candidate information between the ICE 667 peers. The specific details (i.e., how to encode candidate 668 information and the actual candidate exchange process) for different 669 protocols using ICE are described in separate usage documents. One 670 possible way the agents can exchange the candidate information is to 671 use [RFC3264] based Offer/Answer semantics as part of the SIP 672 [RFC3261] protocol [I-D.ietf-mmusic-ice-sip-sdp]. 674 3. Terminology 676 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 677 "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and 678 "OPTIONAL" in this document are to be interpreted as described in RFC 679 2119 [RFC2119]. 681 Readers should be familiar with the terminology defined in STUN 682 [RFC5389] and NAT Behavioral Requirements for UDP [RFC4787].
684 This specification makes use of the following additional terminology: 686 ICE Agent: An agent is the protocol implementation involved in the 687 ICE candidate exchange. There are two agents involved in a 688 typical candidate exchange. 690 Initiating Peer, Initiating Agent, Initiator: An initiating agent is 691 the protocol implementation involved in the ICE candidate exchange 692 that initiates the ICE candidate exchange process. 694 Responding Peer, Responding Agent, Responder: A responding agent is 695 the protocol implementation involved in the ICE candidate exchange 696 that receives and responds to the candidate exchange process 697 initiated by the Initiator. 699 ICE Candidate Exchange, Candidate Exchange: The process where the 700 ICE agents exchange information (e.g., candidates and passwords) 701 that is needed to perform ICE. [RFC3264] Offer/Answer with SDP 702 encoding is one example of a protocol that can be used for 703 exchanging the candidate information. 705 Peer: From the perspective of one of the agents in a session, its 706 peer is the other agent. Specifically, from the perspective of 707 the initiating agent, the peer is the responding agent. From the 708 perspective of the responding agent, the peer is the initiating 709 agent. 711 Transport Address: The combination of an IP address and transport 712 protocol (such as UDP or TCP) port. 714 Media, Media Stream, Media Session: When ICE is used to set up 715 multimedia sessions, the media is usually transported over RTP, 716 and a media stream consists of a stream of RTP packets. When ICE 717 is used for purposes other than multimedia sessions, the terms "media", 718 "media stream", and "media session" are still used in this 719 specification to refer to the IP data packets that are exchanged 720 between the peers on the path created and tested with ICE. 722 Candidate, Candidate Information: A transport address that is a 723 potential point of contact for receipt of media.
Candidates also 724 have properties -- their type (server reflexive, relayed, or 725 host), priority, foundation, and base. 727 Component: A component is a piece of a media stream requiring a 728 single transport address; a media stream may require multiple 729 components, each of which has to work for the media stream as a 730 whole to work. For media streams based on RTP, unless RTP and 731 RTCP are multiplexed in the same port, there are two components 732 per media stream -- one for RTP, and one for RTCP. 734 Host Candidate: A candidate obtained by binding to a specific port 735 from an IP address on the host. This includes IP addresses on 736 physical interfaces and logical ones, such as ones obtained 737 through Virtual Private Networks (VPNs) and Realm Specific IP 738 (RSIP) [RFC3102] (which lives at the operating system level). 740 Server Reflexive Candidate: A candidate whose IP address and port 741 are a binding allocated by a NAT for an agent when it sent a 742 packet through the NAT to a server. Server reflexive candidates 743 can be learned from STUN servers using the Binding request, or from TURN 744 servers, which provide both a relayed and a server reflexive 745 candidate. 747 Peer Reflexive Candidate: A candidate whose IP address and port are 748 a binding allocated by a NAT for an agent when it sent a STUN 749 Binding request through the NAT to its peer. 751 Relayed Candidate: A candidate obtained by sending a TURN Allocate 752 request from a host candidate to a TURN server. The relayed 753 candidate is resident on the TURN server, and the TURN server 754 relays packets back towards the agent. 756 Base: The base of a server reflexive candidate is the host candidate 757 from which it was derived. A host candidate is also said to have 758 a base, equal to that candidate itself. Similarly, the base of a 759 relayed candidate is that candidate itself.
761 Foundation: An arbitrary string that is the same for two candidates 762 that have the same type, base IP address, protocol (UDP, TCP, 763 etc.), and STUN or TURN server. If any of these are different, 764 then the foundation will be different. Two candidate pairs with 765 the same foundation are likely to have similar network 766 characteristics. Foundations are used in the frozen algorithm. 768 Local Candidate: A candidate that an agent has obtained and shared 769 with the peer. 771 Remote Candidate: A candidate that an agent received from its peer. 773 Default Destination/Candidate: The default destination for a 774 component of a media stream is the transport address that would be 775 used by an agent that is not ICE aware. A default candidate for a 776 component is one whose transport address matches the default 777 destination for that component. 779 Candidate Pair: A pairing containing a local candidate and a remote 780 candidate. 782 Check, Connectivity Check, STUN Check: A STUN Binding request 783 transaction for the purposes of verifying connectivity. A check 784 is sent from the local candidate to the remote candidate of a 785 candidate pair. 787 Check List: An ordered set of candidate pairs that an agent will use 788 to generate checks. 790 Ordinary Check: A connectivity check generated by an agent as a 791 consequence of a timer that fires periodically, instructing it to 792 send a check. 794 Triggered Check: A connectivity check generated as a consequence of 795 the receipt of a connectivity check from the peer. 797 Valid List: An ordered set of candidate pairs for a media stream 798 that have been validated by a successful STUN transaction. 800 Full: An ICE implementation that performs the complete set of 801 functionality defined by this specification. 803 Lite: An ICE implementation that omits certain functions, 804 implementing only as much as is necessary for a peer 805 implementation that is full to gain the benefits of ICE.
Lite 806 implementations do not maintain any of the state machines and do 807 not generate connectivity checks. 809 Controlling Agent: The ICE agent that is responsible for selecting 810 the final choice of candidate pairs and signaling them through 811 STUN. In any session, one agent is always controlling. The other 812 is the controlled agent. 814 Controlled Agent: An ICE agent that waits for the controlling agent 815 to select the final choice of candidate pairs. 817 Regular Nomination: The process of picking a valid candidate pair 818 for media traffic by validating the pair with one STUN request, 819 and then picking it by sending a second STUN request with a flag 820 indicating its nomination. 822 Aggressive Nomination: The process of picking a valid candidate pair 823 for media traffic by including a flag in every connectivity check 824 STUN request, such that the first one to produce a valid candidate 825 pair is used for media. 827 Nominated: If a valid candidate pair has its nominated flag set, it 828 means that it may be selected by ICE for sending and receiving 829 media. 831 Selected Pair, Selected Candidate: The candidate pair selected by 832 ICE for sending and receiving media is called the selected pair, 833 and each of its candidates is called the selected candidate. 835 Using Protocol, ICE Usage: The protocol that uses ICE for NAT 836 traversal. A usage specification defines the protocol specific 837 details on how the procedures defined here are applied to that 838 protocol. 840 4. ICE Candidate Gathering and Exchange 842 As part of ICE processing, both the initiating and responding agents 843 exchange encoded candidate information as defined by the Using 844 Protocol (ICE Usage). The specifics of the encoding mechanism and the 845 semantics of the candidate information exchange are out of scope of this 846 specification.
848 At a high level, however, the diagram below captures the ICE processing 849 sequence in the agents (initiator and responder) for the exchange of 850 their respective candidate information. 852 Initiating Responding 853 Agent Agent 854 (I) (R) 855 Gather, | | 856 prioritize, | | 857 eliminate | | 858 redundant | | 859 candidates, | | 860 Encode | | 861 candidates | | 862 | I's Candidate Information | 863 |------------------------------>| 864 | | Gather, 865 | | prioritize, 866 | | eliminate 867 | | redundant 868 | | candidates, 869 | | Encode 870 | | candidates 871 | R's Candidate Information | 872 |<------------------------------| 873 | | 875 Figure 6: Candidate Gathering and Exchange Sequence 877 As shown, the agents involved in the candidate exchange (1) 878 gather candidates, (2) prioritize them, (3) eliminate 879 redundant candidates, (4) (possibly) choose default candidates, and 880 then (5) formulate and send the candidates to the peer ICE agent. 881 All but the last of these five steps differ for full and lite 882 implementations. 884 4.1. Procedures for Full Implementation 886 4.1.1. Gathering Candidates 888 An agent gathers candidates when it believes that communication is 889 imminent. An initiating agent can do this based on a user interface 890 cue, or based on an explicit request to initiate a session. Every 891 candidate is a transport address. It also has a type and a base. 892 Four types are defined and gathered by this specification -- host 893 candidates, server reflexive candidates, peer reflexive candidates, 894 and relayed candidates. The server reflexive candidates are gathered 895 using STUN or TURN, and relayed candidates are obtained through TURN. 896 Peer reflexive candidates are obtained in later phases of ICE, as a 897 consequence of connectivity checks. The base of a candidate is the 898 candidate that an agent must send from when using that candidate.
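The relationship between a candidate, its type, and its base can be sketched as a small data structure. This is only an illustrative sketch, not part of the specification: the names CandidateType, TransportAddress, Candidate, and base_address are hypothetical, and the sketch assumes that a missing base_address means the candidate is its own base (as is the case for host and relayed candidates).

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class CandidateType(Enum):
    # The four candidate types named in this section (labels are illustrative).
    HOST = "host"
    SERVER_REFLEXIVE = "server reflexive"
    PEER_REFLEXIVE = "peer reflexive"
    RELAYED = "relayed"


@dataclass(frozen=True)
class TransportAddress:
    # IP address plus transport protocol port.
    ip: str
    port: int
    protocol: str = "udp"


@dataclass(frozen=True)
class Candidate:
    address: TransportAddress
    type: CandidateType
    component_id: int
    # Hypothetical convention: None means "this candidate is its own base".
    base_address: Optional[TransportAddress] = None

    @property
    def base(self) -> TransportAddress:
        """Transport address the agent sends from when using this candidate."""
        if self.base_address is not None:
            return self.base_address
        return self.address


# A host candidate is its own base; a server reflexive candidate's base is
# the host candidate from which it was derived.
host = Candidate(TransportAddress("192.0.2.10", 50000), CandidateType.HOST, 1)
srflx = Candidate(
    TransportAddress("203.0.113.7", 61000),
    CandidateType.SERVER_REFLEXIVE,
    1,
    base_address=host.address,
)
```

In this sketch, host.base is the host candidate's own address, while srflx.base points back at the host address that the STUN or TURN request was sent from.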
900 The process for gathering candidates at the responding agent is 901 identical to the process for the initiating agent. It is RECOMMENDED 902 that the responding agent begin this process immediately on receipt 903 of the candidate information, prior to alerting the user. Such 904 gathering MAY begin when an agent starts. 906 4.1.1.1. Host Candidates 908 The first step is to gather host candidates. Host candidates are 909 obtained by binding to ports (typically ephemeral) on an IP address 910 attached to an interface (physical or virtual, including VPN 911 interfaces) on the host. 913 For each UDP media stream the agent wishes to use, the agent SHOULD 914 obtain a candidate for each component of the media stream on each IP 915 address that the host has, with the exceptions listed below. The 916 agent obtains each candidate by binding to a UDP port on the specific 917 IP address. A host candidate (and indeed every candidate) is always 918 associated with a specific component for which it is a candidate. 920 Each component has an ID assigned to it, called the component ID. 921 For RTP-based media streams, unless both RTP and RTCP are multiplexed 922 in the same UDP port (RTP/RTCP multiplexing), the RTP itself has a 923 component ID of 1, and RTCP a component ID of 2. In case of RTP/RTCP 924 multiplexing, a component ID of 1 is used for both RTP and RTCP. 926 When candidates are obtained, unless the agent knows for sure that 927 RTP/RTCP multiplexing will be used (i.e., the agent knows that the 928 other agent also supports, and is willing to use, RTP/RTCP 929 multiplexing), or unless the agent only supports RTP/RTCP 930 multiplexing, the agent MUST obtain a separate candidate for RTCP. 931 If an agent has obtained a candidate for RTCP, and ends up using RTP/ 932 RTCP multiplexing, the agent does not need to perform connectivity 933 checks on the RTCP candidate.
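The component ID rules for RTP-based streams can be sketched as follows. This is a minimal sketch under stated assumptions: the function names components_to_gather and host_candidate_plan and the flags mux_known and mux_only are hypothetical, and the sketch only plans which (IP address, component ID) combinations need a host candidate; it does not bind any sockets or filter out excluded addresses.

```python
from typing import List, Tuple

RTP_COMPONENT_ID = 1   # also used for both RTP and RTCP when multiplexed
RTCP_COMPONENT_ID = 2


def components_to_gather(mux_known: bool, mux_only: bool) -> List[int]:
    """Component IDs for which an RTP-based stream needs candidates.

    mux_known: the agent knows for sure that RTP/RTCP multiplexing will be used.
    mux_only:  the agent only supports RTP/RTCP multiplexing.
    Otherwise a separate RTCP candidate is obtained as well.
    """
    if mux_known or mux_only:
        return [RTP_COMPONENT_ID]
    return [RTP_COMPONENT_ID, RTCP_COMPONENT_ID]


def host_candidate_plan(
    ip_addresses: List[str], component_ids: List[int]
) -> List[Tuple[str, int]]:
    """One (IP address, component ID) entry per host candidate to obtain."""
    return [(ip, cid) for ip in ip_addresses for cid in component_ids]


# Two addresses, no multiplexing: one RTP and one RTCP candidate per address.
plan = host_candidate_plan(
    ["192.0.2.1", "198.51.100.1"],
    components_to_gather(mux_known=False, mux_only=False),
)
```

Each entry in plan would then correspond to one UDP port bound on the given address for the given component.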
935 An agent that has K IP addresses and uses separate candidates for RTP 936 and RTCP will end up with 2*K host candidates. 938 Note that the responding agent, when obtaining its candidates, will 939 typically know if the other agent supports RTP/RTCP multiplexing, in 940 which case it will not need to obtain a separate candidate for RTCP. 941 However, absence of a component ID 2 as such does not imply use of 942 RTCP/RTP multiplexing, as it could also mean that RTCP is not used. 944 For other than RTP-based streams, use of multiple components is 945 discouraged since using them increases the complexity of ICE 946 processing. If multiple components are needed, the component IDs 947 SHOULD start with 1 and increase by 1 for each component. 949 The base for each host candidate is set to the candidate itself. 951 The host candidates are gathered from all IP addresses with the 952 following exceptions: 954 o Addresses from a loopback interface MUST NOT be included in the 955 candidate addresses. 957 o Deprecated IPv4-compatible IPv6 addresses [RFC4291] and IPv6 site- 958 local unicast addresses [RFC3879] MUST NOT be included in the 959 candidate addresses. 961 o IPv4-mapped IPv6 addresses SHOULD NOT be included in the offered 962 candidates unless the application using ICE does not support IPv4 963 (i.e., is an IPv6-only application [RFC4038]). 965 o If one or more host candidates corresponding to an IPv6 address 966 generated using a mechanism that prevents location tracking 967 [I-D.ietf-6man-ipv6-address-generation-privacy] are gathered, host 968 candidates corresponding to IPv6 addresses that do allow location 969 tracking, that are configured on the same interface, and are part 970 of the same network prefix MUST NOT be gathered; and host 971 candidates corresponding to IPv6 link-local addresses MUST NOT be 972 gathered. 974 4.1.1.2.
Server Reflexive and Relayed Candidates 976 Agents SHOULD obtain relayed candidates and SHOULD obtain server 977 reflexive candidates. These requirements are at SHOULD strength to 978 allow for provider variation. Use of STUN and TURN servers may be 979 unnecessary in closed networks where agents are never connected to 980 the public Internet or to endpoints outside of the closed network. 981 In such cases, a full implementation would be used for agents that 982 are dual-stack or multihomed, to select a host candidate. Use of 983 TURN servers is expensive, and when ICE is being used, they will only 984 be utilized when both endpoints are behind NATs that perform address 985 and port dependent mapping. Consequently, some deployments might 986 consider this use case to be marginal, and elect not to use TURN 987 servers. If an agent does not gather server reflexive or relayed 988 candidates, it is RECOMMENDED that the functionality be implemented 989 and just disabled through configuration, so that it can be re-enabled 990 through configuration if conditions change in the future. 992 If an agent is gathering both relayed and server reflexive 993 candidates, it uses a TURN server. If it is gathering just server 994 reflexive candidates, it uses a STUN server. 996 The agent next pairs each host candidate with the STUN or TURN server 997 with which it is configured or has discovered by some means. If a 998 STUN or TURN server is configured, it is RECOMMENDED that a domain 999 name be configured, and the DNS procedures in [RFC5389] (using SRV 1000 records with the "stun" service) be used to discover the STUN server, 1001 and the DNS procedures in [RFC5766] (using SRV records with the 1002 "turn" service) be used to discover the TURN server. 1004 This specification only considers usage of a single STUN or TURN 1005 server. 
When there are multiple choices for that single STUN or TURN 1006 server (when, for example, they are learned through DNS records and 1007 multiple results are returned), an agent SHOULD use a single STUN or 1008 TURN server (based on its IP address) for all candidates for a 1009 particular session. This improves the performance of ICE. The 1010 result is a set of pairs of host candidates with STUN or TURN 1011 servers. The agent then chooses one pair, and sends a Binding or 1012 Allocate request to the server from that host candidate. Binding 1013 requests to a STUN server are not authenticated, and any ALTERNATE- 1014 SERVER attribute in a response is ignored. Agents MUST support the 1015 backwards compatibility mode for the Binding request defined in 1016 [RFC5389]. Allocate requests SHOULD be authenticated using a long- 1017 term credential obtained by the client through some other means. 1019 Every Ta milliseconds thereafter, the agent can generate another new 1020 STUN or TURN transaction. This transaction can either be a retry of 1021 a previous transaction that failed with a recoverable error (such as 1022 authentication failure), or a transaction for a new host candidate 1023 and STUN or TURN server pair. The agent SHOULD NOT generate 1024 transactions more frequently than one every Ta milliseconds. See 1025 Section 12 for guidance on how to set Ta and the STUN retransmit 1026 timer, RTO. 1028 The agent will receive a Binding or Allocate response. A successful 1029 Allocate response will provide the agent with a server reflexive 1030 candidate (obtained from the mapped address) and a relayed candidate 1031 in the XOR-RELAYED-ADDRESS attribute. If the Allocate request is 1032 rejected because the server lacks resources to fulfill it, the agent 1033 SHOULD instead send a Binding request to obtain a server reflexive 1034 candidate. A Binding response will provide the agent with only a 1035 server reflexive candidate (also obtained from the mapped address). 
1036 The base of the server reflexive candidate is the host candidate from 1037 which the Allocate or Binding request was sent. The base of a 1038 relayed candidate is that candidate itself. If a relayed candidate 1039 is identical to a host candidate (which can happen in rare cases), 1040 the relayed candidate MUST be discarded. 1042 If an IPv6-only agent is in a network that utilizes NAT64 [RFC6146] 1043 and DNS64 [RFC6147] technologies, it may also gather IPv4 server 1044 reflexive and/or relayed candidates from IPv4-only STUN or TURN 1045 servers. IPv6-only agents SHOULD also utilize IPv6 prefix discovery 1046 [RFC7050] to discover the IPv6 prefix used by NAT64 (if any) and 1047 generate server reflexive candidates for each IPv6-only interface 1048 accordingly. The NAT64 server reflexive candidates are prioritized 1049 like IPv4 server reflexive candidates. 1051 4.1.1.3. Computing Foundations 1053 Finally, the agent assigns each candidate a foundation. The 1054 foundation is an identifier, scoped within a session. Two candidates 1055 MUST have the same foundation when all of the following are true: 1057 o they are of the same type (host, relayed, server reflexive, or 1058 peer reflexive) 1060 o their bases have the same IP address (the ports can be different) 1062 o for reflexive and relayed candidates, the STUN or TURN servers 1063 used to obtain them have the same IP address 1065 o they were obtained using the same transport protocol (TCP, UDP, 1066 etc.) 1068 Similarly, two candidates MUST have different foundations if their 1069 types are different, their bases have different IP addresses, the 1070 STUN or TURN servers used to obtain them have different IP addresses, 1071 or their transport protocols are different. 1073 4.1.1.4. Keeping Candidates Alive 1075 Once server reflexive and relayed candidates are allocated, they MUST 1076 be kept alive until ICE processing has completed, as described in 1077 Section 7.3.
For server reflexive candidates learned through a 1078 Binding request, the bindings MUST be kept alive by additional 1079 Binding requests to the server. Refreshes for allocations are done 1080 using the Refresh transaction, as described in [RFC5766]. The 1081 Refresh requests will also refresh the server reflexive candidate. 1083 4.1.2. Prioritizing Candidates 1085 The prioritization process results in the assignment of a priority to 1086 each candidate. Each candidate for a media stream MUST have a unique 1087 priority that MUST be a positive integer between 1 and (2**31 - 1). 1088 This priority will be used by ICE to determine the order of the 1089 connectivity checks and the relative preference for candidates. 1091 An agent SHOULD compute this priority using the formula in 1092 Section 4.1.2.1 and choose its parameters using the guidelines in 1093 Section 4.1.2.2. If an agent elects to use a different formula, ICE 1094 will take longer to converge since the two agents will not be 1095 coordinated in their checks. 1097 The process for prioritizing candidates is common across the 1098 initiating and the responding agent. 1100 4.1.2.1. Recommended Formula 1102 When using the formula, an agent computes the priority by determining 1103 a preference for each type of candidate (server reflexive, peer 1104 reflexive, relayed, and host), and, when the agent is multihomed, 1105 choosing a preference for its IP addresses. These two preferences 1106 are then combined to compute the priority for a candidate. That 1107 priority is computed using the following formula: 1109 priority = (2^24)*(type preference) + 1110 (2^8)*(local preference) + 1111 (2^0)*(256 - component ID) 1113 The type preference MUST be an integer from 0 to 126 inclusive, and 1114 represents the preference for the type of the candidate (where the 1115 types are host, server reflexive, peer reflexive, and relayed). A 1116 value of 126 is the highest preference, and 0 is the lowest.
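The recommended formula above can be sketched directly in code. This is an illustrative sketch rather than a normative implementation: the function name candidate_priority and the dictionary RECOMMENDED_TYPE_PREFERENCE are hypothetical names, the range checks mirror the limits this section states for each input, and the dictionary holds the RECOMMENDED type preference values from Section 4.1.2.2.

```python
# RECOMMENDED type preference values from Section 4.1.2.2
# (the dictionary name and keys are illustrative).
RECOMMENDED_TYPE_PREFERENCE = {
    "host": 126,
    "peer reflexive": 110,
    "server reflexive": 100,
    "relayed": 0,
}


def candidate_priority(
    type_preference: int, local_preference: int, component_id: int
) -> int:
    """priority = 2^24 * type pref + 2^8 * local pref + (256 - component ID)."""
    if not 0 <= type_preference <= 126:
        raise ValueError("type preference must be in 0..126")
    if not 0 <= local_preference <= 65535:
        raise ValueError("local preference must be in 0..65535")
    if not 1 <= component_id <= 256:
        raise ValueError("component ID must be in 1..256")
    return (
        (2**24) * type_preference
        + (2**8) * local_preference
        + (256 - component_id)
    )


# Host candidate on a single-homed agent (local preference 65535), RTP component.
host_rtp = candidate_priority(RECOMMENDED_TYPE_PREFERENCE["host"], 65535, 1)
# Same address, RTCP component: priority is lower by exactly one.
host_rtcp = candidate_priority(RECOMMENDED_TYPE_PREFERENCE["host"], 65535, 2)
```

With these inputs, host_rtp evaluates to 2130706431, which stays below the 2**31 - 1 upper bound; because the type preference occupies the most significant bits, any host candidate outranks any server reflexive or relayed candidate regardless of local preference.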
Setting the 1117 value to 0 means that candidates of this type will only be used as 1118 a last resort. The type preference MUST be identical for all 1119 candidates of the same type and MUST be different for candidates of 1120 different types. The type preference for peer reflexive candidates 1121 MUST be higher than that of server reflexive candidates. Note that 1122 candidates gathered based on the procedures of Section 4.1.1 will 1123 never be peer reflexive candidates; candidates of this type are 1124 learned from the connectivity checks performed by ICE. 1126 The local preference MUST be an integer from 0 to 65535 inclusive. 1127 It represents a preference for the particular IP address from which 1128 the candidate was obtained. 65535 represents the highest preference, 1129 and zero, the lowest. When there is only a single IP address, this 1130 value SHOULD be set to 65535. More generally, if there are multiple 1131 candidates for a particular component for a particular media stream 1132 that have the same type, the local preference MUST be unique for each 1133 one. In this specification, this only happens for multihomed hosts 1134 or if an agent is using multiple TURN servers. If a host is 1135 multihomed because it is dual-stack, the local preference SHOULD be 1136 set equal to the precedence value for IP addresses described in RFC 1137 6724 [RFC6724]. If the host operating system provides an API for 1138 discovering preference among different addresses, those preferences 1139 SHOULD be used for the local preference to prioritize addresses 1140 indicated as preferred by the operating system. 1142 The component ID is the component ID for the candidate, and MUST be 1143 between 1 and 256 inclusive. 1145 4.1.2.2. Guidelines for Choosing Type and Local Preferences 1147 One criterion for selection of the type and local preference values 1148 is the use of a media intermediary, such as a TURN server, VPN 1149 server, or NAT.
With a media intermediary, if media is sent to that 1150 candidate, it will first transit the media intermediary before being 1151 received. Relayed candidates are one type of candidate that involves 1152 a media intermediary. Another is a host candidate obtained from a 1153 VPN interface. When media is transited through a media intermediary, 1154 it can increase the latency between transmission and reception. It 1155 can increase packet loss, because of the additional router hops 1156 that may be taken. It may increase the cost of providing service, 1157 since media will be routed in and right back out of a media 1158 intermediary run by a provider. If these concerns are important, the 1159 type preference for relayed candidates SHOULD be lower than that for host 1160 candidates. The RECOMMENDED values are 126 for host candidates, 100 1161 for server reflexive candidates, 110 for peer reflexive candidates, 1162 and 0 for relayed candidates. 1164 Furthermore, if an agent is multihomed and has multiple IP addresses, 1165 the local preference for host candidates from a VPN interface SHOULD 1166 be 0. If multiple TURN servers are used, local 1167 preferences for the candidates obtained from the TURN servers are 1168 chosen in a similar fashion as for multihomed local candidates: the 1169 local preference value is used to indicate preference among different 1170 servers but the preference MUST be unique for each one. 1172 Another criterion for selection of preferences is IP address family. 1173 ICE works with both IPv4 and IPv6. It therefore provides a 1174 transition mechanism that allows dual-stack hosts to prefer 1175 connectivity over IPv6, but to fall back to IPv4 in case the v6 1176 networks are disconnected (due, for example, to a failure in a 6to4 1177 relay) [RFC3056]. It can also help with hosts that have both a 1178 native IPv6 address and a 6to4 address.
In such a case, higher local 1179 preferences could be assigned to the v6 addresses, followed by the 1180 6to4 addresses, followed by the v4 addresses. This allows a site to 1181 obtain and begin using native v6 addresses immediately, yet still 1182 fall back to 6to4 addresses when communicating with agents in other 1183 sites that do not yet have native v6 connectivity. 1185 Another criterion for selecting preferences is security. If a user 1186 is a telecommuter, and therefore connected to a corporate network and 1187 a local home network, the user may prefer their voice traffic to be 1188 routed over the VPN in order to keep it on the corporate network when 1189 communicating within the enterprise, but use the local network when 1190 communicating with users outside of the enterprise. In such a case, 1191 a VPN address would have a higher local preference than any other 1192 address. 1194 Another criterion for selecting preferences is topological awareness. 1195 This is most useful for candidates that make use of intermediaries. 1196 In those cases, if an agent has preconfigured or dynamically 1197 discovered knowledge of the topological proximity of the 1198 intermediaries to itself, it can use that to assign higher local 1199 preferences to candidates obtained from closer intermediaries. 1201 4.1.3. Eliminating Redundant Candidates 1203 Next, the agent eliminates redundant candidates. A candidate is 1204 redundant if its transport address equals that of another candidate, and its 1205 base equals the base of that other candidate. Note that two 1206 candidates can have the same transport address yet have different 1207 bases, and these would not be considered redundant. Frequently, a 1208 server reflexive candidate and a host candidate will be redundant 1209 when the agent is not behind a NAT. The agent SHOULD eliminate the 1210 redundant candidate with the lower priority. 1212 This process is common across the initiating and responding agents. 1214 4.2.
Lite Implementation Procedures

   Lite implementations only utilize host candidates. A lite
   implementation MUST, for each component of each media stream,
   allocate zero or one IPv4 candidates. It MAY allocate zero or more
   IPv6 candidates, but no more than one per IPv6 address utilized by
   the host. Since there can be no more than one IPv4 candidate per
   component of each media stream, if an agent has multiple IPv4
   addresses, it MUST choose one for allocating the candidate. If a
   host is dual-stack, it is RECOMMENDED that it allocate one IPv4
   candidate and one global IPv6 address. With the lite implementation,
   ICE cannot be used to dynamically choose amongst candidates.
   Therefore, including more than one candidate from a particular scope
   is NOT RECOMMENDED, since only a connectivity check can truly
   determine whether to use one address or the other.

   Each component has an ID assigned to it, called the component ID.
   For RTP-based media streams, unless RTCP is multiplexed in the same
   port with RTP, the RTP itself has a component ID of 1, and RTCP a
   component ID of 2. If an agent is using RTCP without multiplexing,
   it MUST obtain candidates for it. However, absence of a component ID
   2 as such does not imply use of RTCP/RTP multiplexing, as it could
   also mean that RTCP is not used.

   Each candidate is assigned a foundation. The foundation MUST be
   different for two candidates allocated from different IP addresses,
   and MUST be the same otherwise. A simple integer that increments for
   each IP address will suffice. In addition, each candidate MUST be
   assigned a unique priority amongst all candidates for the same media
   stream. This priority SHOULD be equal to:

      priority = (2^24)*(126) +
                 (2^8)*(IP precedence) +
                 (2^0)*(256 - component ID)

   If a host is v4-only, it SHOULD set the IP precedence to 65535.
   If a host is v6 or dual-stack, the IP precedence SHOULD be the
   precedence value for IP addresses described in RFC 6724 [RFC6724].

   Next, an agent chooses a default candidate for each component of each
   media stream. If a host is IPv4-only, there would only be one
   candidate for each component of each media stream, and therefore that
   candidate is the default. If a host is IPv6 or dual-stack, the
   selection of default is a matter of local policy. This default
   SHOULD be chosen such that it is the candidate most likely to be used
   with a peer. For IPv6-only hosts, this would typically be a globally
   scoped IPv6 address. For dual-stack hosts, the IPv4 address is
   RECOMMENDED.

   The procedures in this section are common across the initiating and
   responding agents.

4.3.  Encoding the Candidate Information

   Regardless of whether the agent is an Initiating or a Responding
   Agent, the following parameters and their data types need to be
   conveyed as part of the candidate exchange process. The specifics of
   the syntax for encoding the candidate information are out of scope of
   this specification.

   Candidate attribute:  There will be one or more of these for each
      "media stream". Each candidate is composed of:

      Connection Address:  The IP address and transport protocol port of
         the candidate.

      Transport:  An indicator of the transport protocol for this
         candidate. This need not be present if the using protocol will
         only ever run over a single transport protocol. If it runs
         over more than one, or if others are anticipated to be used in
         the future, this should be present.

      Foundation:  A sequence of up to 32 characters.

      Component-ID:  This would be present only if the using protocol
         were utilizing the concept of components. If it is, it would
         be a positive integer that indicates the component ID for which
         this is a candidate.

      Priority:  An encoding of the 32-bit priority value.

      Candidate Type:  The candidate type, as defined in ICE.

      Related Address and Port:  The related IP address and port for
         this candidate, as defined by ICE. These MAY be omitted or set
         to invalid values if the agent does not want to reveal them,
         e.g., for privacy reasons.

      Extensibility Parameters:  The using protocol should define some
         means for adding new per-candidate ICE parameters in the
         future.

   Lite Flag:  If ICE lite is used by the using protocol, it needs to
      convey a boolean parameter which indicates whether the
      implementation is lite or not.

   Connectivity check pacing value:  If an agent wants to use other than
      the default pacing values for the connectivity checks, it MUST
      indicate this in the ICE exchange.

   Username Fragment and Password:  The using protocol has to convey a
      username fragment and password. The username fragment MUST
      contain at least 24 bits of randomness, and the password MUST
      contain at least 128 bits of randomness.

   ICE extensions:  In addition to the per-candidate extensions above,
      the using protocol should allow for new media-stream or session-
      level attributes (ice-options).

   If the using protocol is using the ICE mismatch feature, a way is
   needed to convey this parameter in answers. It is a boolean flag.

   The exchange of parameters is symmetric; both agents need to send the
   same set of attributes as defined above.

   The using protocol may (or may not) need to deal with backwards
   compatibility with older implementations that do not support ICE. If
   the fallback mechanism is being used, then presumably the using
   protocol provides a way of conveying the default candidate (its IP
   address and port) in addition to the ICE parameters.
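
   As a non-normative illustration of the parameters listed in this
   section, the following Python sketch models the per-candidate fields,
   generates a username fragment and password that meet the 24-bit and
   128-bit randomness minimums, and evaluates the lite priority formula
   from Section 4.2. All names (CandidateInfo, new_credentials,
   lite_priority) and the candidate type labels are hypothetical, and
   the URL-safe base64 encoding is only one assumed representation; the
   actual syntax is left to the using protocol.

```python
import secrets
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class CandidateInfo:
    """Hypothetical container for the per-candidate parameters."""
    address: str                 # connection address (IP)
    port: int                    # transport protocol port
    transport: str               # e.g. "udp"; may be implicit in a real protocol
    foundation: str              # a sequence of up to 32 characters
    component_id: int            # positive integer
    priority: int                # 32-bit priority value
    cand_type: str               # assumed labels: "host", "srflx", "prflx", "relay"
    related: Optional[Tuple[str, int]] = None  # MAY be omitted for privacy

    def __post_init__(self) -> None:
        if len(self.foundation) > 32:
            raise ValueError("foundation is limited to 32 characters")
        if self.component_id < 1:
            raise ValueError("component ID must be a positive integer")
        if self.priority not in range(2**32):
            raise ValueError("priority must fit in 32 bits")


def new_credentials() -> Tuple[str, str]:
    """Return (username fragment, password) with 24 and 128 random bits."""
    ufrag = secrets.token_urlsafe(3)       # 3 random bytes  = 24 bits
    password = secrets.token_urlsafe(16)   # 16 random bytes = 128 bits
    return ufrag, password


def lite_priority(ip_precedence: int, component_id: int) -> int:
    """Lite priority formula of Section 4.2 (type preference fixed at 126)."""
    return (2**24) * 126 + (2**8) * ip_precedence + (256 - component_id)
```

   For a v4-only lite host, lite_priority(65535, 1) gives the RTP
   component's priority and lite_priority(65535, 2) the RTCP
   component's, so the two components of a stream never collide.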

   STUN connectivity checks between agents are authenticated using the
   short-term credential mechanism defined for STUN [RFC5389]. This
   mechanism relies on a username and password that are exchanged
   through protocol machinery between the client and server. The
   username part of this credential is formed by concatenating a
   username fragment from each agent, separated by a colon. Each agent
   also provides a password, used to compute the message integrity for
   requests it receives. The username fragment and password are
   exchanged between the peers. In addition to providing security, the
   username provides disambiguation and correlation of checks to media
   streams. See Appendix B.4 for motivation.

   If the initiating agent is a lite implementation, it MUST indicate
   this when sending its candidates.

   ICE provides for extensibility by allowing an agent to include a
   series of tokens that identify ICE extensions as part of the
   candidate exchange process.

   Once an agent has sent its candidate information, that agent MUST be
   prepared to receive both STUN and media packets on each candidate.
   As discussed in Section 10.1, media packets can be sent to a
   candidate prior to its appearance as the default destination for
   media.

5.  ICE Candidate Processing

   Once an agent has candidates from its peer, it will check if the
   peer supports ICE, determine its own role, exchange candidates
   (Section 4), and, for full implementations, form the check lists and
   begin connectivity checks as explained in this section.

5.1.  Procedures for Full Implementation

5.1.1.  Verifying ICE Support

   Certain middleboxes, such as ALGs, may alter the ICE candidate
   information in ways that break ICE. If the using protocol is
   vulnerable to this kind of change, called ICE mismatch, the
   responding agent needs to detect this and signal this back to the
   initiating agent.
   The details on whether this is needed and how it is done are defined
   by the usage specifications. One exception to the above is that an
   initiating agent would never indicate ICE mismatch.

5.1.2.  Determining Role

   For each session, each agent (Initiating and Responding) takes on a
   role. There are two roles -- controlling and controlled. The
   controlling agent is responsible for the choice of the final
   candidate pairs used for communications. For a full agent, this
   means nominating the candidate pairs that can be used by ICE for each
   media stream, and updating the peer with ICE's selection, when
   needed. The controlled agent is told which candidate pairs to use
   for each media stream, and does not require updating the peer to
   signal this information. The sections below describe in detail the
   actual procedures followed by controlling and controlled nodes.

   The rules for determining the role and the impact on behavior are as
   follows:

   Both agents are full:  The Initiating Agent which started the ICE
      processing MUST take the controlling role, and the other MUST take
      the controlled role. Both agents will form check lists, run the
      ICE state machines, and generate connectivity checks. The
      controlling agent will execute the logic in Section 7.1 to
      nominate pairs that will be selected by ICE, and then both agents
      end ICE as described in Section 7.1.2.

   One agent full, one lite:  The full agent MUST take the controlling
      role, and the lite agent MUST take the controlled role. The full
      agent will form check lists, run the ICE state machines, and
      generate connectivity checks. That agent will execute the logic
      in Section 7.1 to nominate pairs that will be selected by ICE, and
      use the logic in Section 7.1.2 to end ICE.
      The lite implementation will just listen for connectivity checks,
      receive them and respond to them, and then conclude ICE as
      described in Section 7.2. For the lite implementation, the state
      of ICE processing for each media stream is considered to be
      Running, and the state of ICE overall is Running.

   Both lite:  The Initiating Agent which started the ICE processing
      MUST take the controlling role, and the other MUST take the
      controlled role. In this case, no connectivity checks are ever
      sent. Rather, once the candidates are exchanged, each agent
      performs the processing described in Section 7 without
      connectivity checks. It is possible that both agents will believe
      they are controlled or controlling. In the latter case, the
      conflict is resolved through glare detection capabilities in the
      signaling protocol enabling the candidate exchange. The state of
      ICE processing for each media stream is considered to be Running,
      and the state of ICE overall is Running.

   Once roles are determined for a session, they persist unless ICE is
   restarted. An ICE restart causes a new selection of roles and
   tie-breakers.

5.1.3.  Forming the Check Lists

   There is one check list per in-use media stream resulting from the
   candidate exchange. To form the check list for a media stream, the
   agent forms candidate pairs, computes a candidate pair priority,
   orders the pairs by priority, prunes them, and sets their states.
   These steps are described in this section.

5.1.3.1.  Forming Candidate Pairs

   First, the agent takes each of its candidates for a media stream
   (called LOCAL CANDIDATES) and pairs them with the candidates it
   received from its peer (called REMOTE CANDIDATES) for that media
   stream. In order to prevent the attacks described in Section 14.4.1,
   agents MAY limit the number of candidates they'll accept in a
   candidate exchange process.
   A local candidate is paired with a remote candidate if and only if
   the two candidates have the same component ID and have the same IP
   address version. It is possible that some of the local candidates
   won't get paired with remote candidates, and some of the remote
   candidates won't get paired with local candidates. This can happen
   if one agent doesn't include candidates for all of the components
   for a media stream. If this happens, the number of components for
   that media stream is effectively reduced, and considered to be equal
   to the minimum across both agents of the maximum component ID
   provided by each agent across all components for the media stream.

   In the case of RTP, this would happen when one agent provides
   candidates for RTCP, and the other does not. As another example, the
   initiating agent can multiplex RTP and RTCP on the same port
   [RFC5761]. However, since the initiating agent doesn't know if the
   peer agent can perform such multiplexing, it includes candidates for
   RTP and RTCP on separate ports. If the peer agent can perform such
   multiplexing, it would include just a single component for each
   candidate -- for the combined RTP/RTCP mux. ICE would end up acting
   as if there was just a single component for this candidate.

   With IPv6 it is common for a host to have multiple host candidates
   for each interface. To keep the number of resulting candidate pairs
   reasonable and to avoid candidate pairs that are highly unlikely to
   work, IPv6 link-local addresses [RFC4291] MUST NOT be paired with
   other than link-local addresses.

   The candidate pair whose local and remote candidates are both the
   default candidates for a particular component is called,
   unsurprisingly, the default candidate pair for that component. This
   is the pair that would be used to transmit media if both agents had
   not been ICE aware.
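
   The pairing rule above can be sketched as follows. This is a
   minimal, non-normative Python illustration: the Cand type and the
   function names are hypothetical, and candidates are reduced to the
   three fields that the rule inspects. It uses the standard ipaddress
   module to compare address versions and to detect IPv6 link-local
   addresses.

```python
import ipaddress
from typing import List, NamedTuple, Tuple


class Cand(NamedTuple):
    """Hypothetical minimal candidate: only the fields pairing needs."""
    ip: str
    port: int
    component_id: int


def can_pair(local: Cand, remote: Cand) -> bool:
    """True if the two candidates may form a candidate pair."""
    loc = ipaddress.ip_address(local.ip)
    rem = ipaddress.ip_address(remote.ip)
    if local.component_id != remote.component_id:
        return False
    if loc.version != rem.version:
        return False
    # IPv6 link-local addresses pair only with other link-local addresses.
    if loc.version == 6 and loc.is_link_local != rem.is_link_local:
        return False
    return True


def form_pairs(local: List[Cand], remote: List[Cand]) -> List[Tuple[Cand, Cand]]:
    """Pair every local candidate with every compatible remote candidate."""
    return [(loc, rem) for loc in local for rem in remote if can_pair(loc, rem)]
```

   A local candidate whose component ID has no counterpart on the remote
   side simply yields no pairs, which is how the number of components
   for the stream ends up effectively reduced.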

   In order to aid understanding, Figure 7 shows the relationships
   between several key concepts -- transport addresses, candidates,
   candidate pairs, and check lists, in addition to indicating the main
   properties of candidates and candidate pairs.

      +--------------------------------------------+
      |                                            |
      | +---------------------+                    |
      | |+----+ +----+ +----+ |   +Type            |
      | || IP | |Port| |Tran| |   +Priority        |
      | ||Addr| |    | |    | |   +Foundation      |
      | |+----+ +----+ +----+ |   +Component ID    |
      | |      Transport      |   +Related Address |
      | |        Addr         |                    |
      | +---------------------+   +Base            |
      |             Candidate                      |
      +--------------------------------------------+
      *                                         *
      *    *************************************
      *    *
    +-------------------------------+
    |                               |
    | Local     Remote              |
    | +----+    +----+   +default?  |
    | |Cand|    |Cand|   +valid?    |
    | +----+    +----+   +nominated?|
    |                    +State     |
    |                               |
    |                               |
    |          Candidate Pair       |
    +-------------------------------+
    *                              *
    *                  ************
    *                  *
    +------------------+
    |  Candidate Pair  |
    +------------------+
    +------------------+
    |  Candidate Pair  |
    +------------------+
    +------------------+
    |  Candidate Pair  |
    +------------------+

           Check
           List

          Figure 7: Conceptual Diagram of a Check List

5.1.3.2.  Computing Pair Priority and Ordering Pairs

   Once the pairs are formed, a candidate pair priority is computed.
   Let G be the priority for the candidate provided by the controlling
   agent. Let D be the priority for the candidate provided by the
   controlled agent. The priority for a pair is computed as:

      pair priority = 2^32*MIN(G,D) + 2*MAX(G,D) + (G>D?1:0)

   Where G>D?1:0 is an expression whose value is 1 if G is greater than
   D, and 0 otherwise. Once the priority is assigned, the agent sorts
   the candidate pairs in decreasing order of priority.
   If two pairs have identical priority, the ordering amongst them is
   arbitrary.

5.1.3.3.  Pruning the Pairs

   This sorted list of candidate pairs is used to determine a sequence
   of connectivity checks that will be performed. Each check involves
   sending a request from a local candidate to a remote candidate.
   Since an agent cannot send requests directly from a reflexive
   candidate, but only from its base, the agent next goes through the
   sorted list of candidate pairs. For each pair where the local
   candidate is server reflexive, the server reflexive candidate MUST be
   replaced by its base. Once this has been done, the agent MUST prune
   the list. This is done by removing a pair if its local and remote
   candidates are identical to the local and remote candidates of a pair
   higher up on the priority list. The result is a sequence of ordered
   candidate pairs, called the check list for that media stream.

   In addition, in order to limit the attacks described in
   Section 14.4.1, an agent MUST limit the total number of connectivity
   checks the agent performs across all check lists to a specific value,
   and this value MUST be configurable. A default of 100 is
   RECOMMENDED. This limit is enforced by discarding the lower-priority
   candidate pairs until there are fewer than 100. It is RECOMMENDED
   that a lower value be utilized when possible, set to the maximum
   number of plausible checks that might be seen in an actual deployment
   configuration. The requirement for configuration is meant to provide
   a tool for fixing this value in the field if, once deployed, it is
   found to be problematic.

5.1.3.4.  Computing States

   Each candidate pair in the check list has a foundation and a state.
   The foundation is the combination of the foundations of the local and
   remote candidates in the pair.
   The state is assigned once the check list for each media stream has
   been computed. There are five potential values that the state can
   have:

   Waiting:  A check has not been performed for this pair, and can be
      performed as soon as it is the highest-priority Waiting pair on
      the check list.

   In-Progress:  A check has been sent for this pair, but the
      transaction is still in progress.

   Succeeded:  A check for this pair was already done and produced a
      successful result.

   Failed:  A check for this pair was already done and failed, either
      never producing any response or producing an unrecoverable failure
      response.

   Frozen:  A check for this pair hasn't been performed, and it can't
      yet be performed until some other check succeeds, allowing this
      pair to unfreeze and move into the Waiting state.

   As ICE runs, the pairs will move between states as shown in Figure 8.

      +-----------+
      |           |
      |           |
      |  Frozen   |
      |           |
      |           |
      +-----------+
            |
            |unfreeze
            |
            V
      +-----------+         +-----------+
      |           |         |           |
      |           | perform |           |
      |  Waiting  |-------->|In-Progress|
      |           |         |           |
      |           |         |           |
      +-----------+         +-----------+
                                 /  |
                               //   |
                             //     |
                           //       |
                          /         |
                        //          |
              failure //            |success
                    //              |
                   /                |
                 //                 |
               //                   |
             //                     |
            V                       V
      +-----------+         +-----------+
      |           |         |           |
      |           |         |           |
      |  Failed   |         | Succeeded |
      |           |         |           |
      |           |         |           |
      +-----------+         +-----------+

                   Figure 8: Pair State FSM

   The initial states for each pair in a check list are computed by
   performing the following sequence of steps:

   1.  The agent sets all of the pairs in each check list to the Frozen
       state.

   2.  The agent examines the check list for the first media stream.
       For that media stream:

       *  For all pairs with the same foundation, it sets the state of
          the pair with the lowest component ID to Waiting.
          If there is more than one such pair, the one with the highest
          priority is used.

   One of the check lists will have some number of pairs in the Waiting
   state, and the other check lists will have all of their pairs in the
   Frozen state. A check list with at least one pair that is Waiting is
   called an active check list, and a check list with all pairs Frozen
   is called a frozen check list.

   The check list itself is associated with a state, which captures the
   state of ICE checks for that media stream. There are three states:

   Running:  In this state, ICE checks are still in progress for this
      media stream.

   Completed:  In this state, ICE checks have produced nominated pairs
      for each component of the media stream. Consequently, ICE has
      succeeded and media can be sent.

   Failed:  In this state, the ICE checks have not completed
      successfully for this media stream.

   When a check list is first constructed as the consequence of a
   candidate exchange, it is placed in the Running state.

   ICE processing across all media streams also has a state associated
   with it. This state is equal to Running while ICE processing is
   under way. The state is Completed when ICE processing is complete
   and Failed if ICE processing ended without success. Rules for
   transitioning between states are described below.

5.1.4.  Scheduling Checks

   An agent performs ordinary checks and triggered checks. The
   generation of both checks is governed by a timer that fires
   periodically for each media stream. The agent maintains a FIFO
   queue, called the triggered check queue, which contains candidate
   pairs for which checks are to be sent at the next available
   opportunity. When the timer fires, the agent removes the top pair
   from the triggered check queue, performs a connectivity check on that
   pair, and sets the state of the candidate pair to In-Progress.
   If there are no pairs in the triggered check queue, an ordinary check
   is sent.

   Once the agent has computed the check lists as described in
   Section 5.1.3, it sets a timer for each active check list. The timer
   fires every Ta*N seconds, where N is the number of active check lists
   (initially, there is only one active check list). Implementations
   MAY set the timer to fire less frequently than this. Implementations
   SHOULD take care to spread out these timers so that they do not fire
   at the same time for each media stream. Ta and the retransmit timer
   RTO are computed as described in Section 12. Multiplying by N allows
   this aggregate check throughput to be split between all active check
   lists. The first timer fires immediately, so that the agent performs
   a connectivity check the moment the candidate exchange has been done,
   followed by the next check Ta seconds later (since there is only one
   active check list).

   When the timer fires and there is no triggered check to be sent, the
   agent MUST choose an ordinary check as follows:

   o  Find the highest-priority pair in that check list that is in the
      Waiting state.

   o  If there is such a pair:

      *  Send a STUN check from the local candidate of that pair to the
         remote candidate of that pair. The procedures for forming the
         STUN request for this purpose are described in Section 6.1.2.

      *  Set the state of the candidate pair to In-Progress.

   o  If there is no such pair:

      *  Find the highest-priority pair in that check list that is in
         the Frozen state.

      *  If there is such a pair:

         +  Unfreeze the pair.

         +  Perform a check for that pair, causing its state to
            transition to In-Progress.

      *  If there is no such pair:

         +  Terminate the timer for that check list.
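
   The selection steps above, together with the pair priority formula of
   Section 5.1.3.2, can be sketched as follows. This is a minimal,
   non-normative Python illustration under stated assumptions: the Pair
   type, the state constants, and the function names are hypothetical,
   and actually sending the STUN request is left out. A return value
   of None stands for "terminate the timer for that check list".

```python
from dataclasses import dataclass
from typing import List, Optional

WAITING, IN_PROGRESS, FROZEN = "Waiting", "In-Progress", "Frozen"


def pair_priority(g: int, d: int) -> int:
    """Pair priority: g is the priority of the controlling agent's
    candidate, d the priority of the controlled agent's candidate."""
    return (2**32) * min(g, d) + 2 * max(g, d) + (1 if g > d else 0)


@dataclass
class Pair:
    name: str        # hypothetical label, for illustration only
    priority: int    # value computed with pair_priority()
    state: str


def next_ordinary_check(check_list: List[Pair]) -> Optional[Pair]:
    """Choose the pair for the next ordinary check.

    Prefers the highest-priority Waiting pair; failing that, unfreezes
    the highest-priority Frozen pair. The chosen pair is moved to
    In-Progress. Returns None if neither kind of pair exists.
    """
    for wanted in (WAITING, FROZEN):
        matching = [p for p in check_list if p.state == wanted]
        if matching:
            chosen = max(matching, key=lambda p: p.priority)
            chosen.state = IN_PROGRESS  # the STUN request would be sent here
            return chosen
    return None
```

   Because MIN(G,D) occupies the high-order bits, a pair is ranked
   first by its weaker candidate; the final term only breaks ties
   between the two agents' views of the same pair.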

   To compute the message integrity for the check, the agent uses the
   remote username fragment and password learned from the candidate
   information obtained from its peer. The local username fragment is
   known directly by the agent for its own candidate.

   The initiating agent performs the ordinary checks once it has
   received the candidate information from the peer (the responding
   agent) and has formed the check lists. The responding agent, on the
   other hand, performs either triggered or ordinary checks as described
   above.

5.2.  Lite Implementation Procedures

   Lite implementations skip most of the steps in Section 5 except for
   verifying the peer's ICE support and determining their role in the
   ICE processing.

   For a lite implementation, being the controlling agent means
   selecting a candidate pair based on the ones in the candidate
   exchange (for IPv4, there is only ever one pair), and then updating
   the peer with the new candidate information reflecting that
   selection, when needed (it is never needed for an IPv4-only host).
   The controlled agent is told which candidate pairs to use for each
   media stream, and no further candidate updates are needed to signal
   this information.

6.  Performing Connectivity Checks

   This section describes how connectivity checks are performed. All
   ICE implementations are required to be compliant to [RFC5389], as
   opposed to the older [RFC3489]. However, whereas a full
   implementation will both generate checks (acting as a STUN client)
   and receive them (acting as a STUN server), a lite implementation
   will only receive checks, and thus will only act as a STUN server.

6.1.  STUN Client Procedures

   These procedures define how an agent sends a connectivity check,
   whether it is an ordinary or a triggered check. These procedures are
   only applicable to full implementations.

6.1.1.
Creating Permissions for Relayed Candidates

   If the connectivity check is being sent using a relayed local
   candidate, the client MUST create a permission first if it has not
   already created one. It would have created one previously if it had
   told the TURN server to create a permission for the given relayed
   candidate towards the IP address of the remote candidate. To create
   the permission, the agent follows the procedures defined in
   [RFC5766]. The permission MUST be created towards the IP address of
   the remote candidate. It is RECOMMENDED that the agent defer
   creation of a TURN channel until ICE completes, in which case
   permissions for connectivity checks are normally created using a
   CreatePermission request. Once established, the agent MUST keep the
   permission active until ICE concludes.

6.1.2.  Sending the Request

   A connectivity check is generated by sending a Binding request from a
   local candidate to a remote candidate. [RFC5389] describes how
   Binding requests are constructed and generated. A connectivity check
   MUST utilize the STUN short-term credential mechanism. Support for
   backwards compatibility with RFC 3489 MUST NOT be used or assumed
   with connectivity checks. The FINGERPRINT mechanism MUST be used for
   connectivity checks.

   ICE extends STUN by defining several new attributes, including
   PRIORITY, USE-CANDIDATE, ICE-CONTROLLED, and ICE-CONTROLLING. These
   new attributes are formally defined in Section 15.1, and their usage
   is described in the subsections below. These STUN extensions are
   applicable only to connectivity checks used for ICE.

6.1.2.1.  PRIORITY and USE-CANDIDATE

   An agent MUST include the PRIORITY attribute in its Binding request.
   The attribute MUST be set equal to the priority that would be
   assigned, based on the algorithm in Section 4.1.2, to a peer
   reflexive candidate, should one be learned as a consequence of this
   check (see Section 6.1.3.2.1 for how peer reflexive candidates are
   learned). This priority value will be computed identically to how
   the priority for the local candidate of the pair was computed, except
   that the type preference is set to the value for peer reflexive
   candidate types.

   The controlling agent MAY include the USE-CANDIDATE attribute in the
   Binding request. The controlled agent MUST NOT include it in its
   Binding request. This attribute signals that the controlling agent
   wishes to cease checks for this component, and use the candidate pair
   resulting from the check for this component. Section 7.1.1 provides
   guidance on determining when to include it.

6.1.2.2.  ICE-CONTROLLED and ICE-CONTROLLING

   The agent MUST include the ICE-CONTROLLED attribute in the request if
   it is in the controlled role, and MUST include the ICE-CONTROLLING
   attribute in the request if it is in the controlling role. The
   content of either attribute MUST be the tie-breaker that was
   determined in Section 5.1.2. These attributes are defined fully in
   Section 15.1.

6.1.2.3.  Forming Credentials

   A Binding request serving as a connectivity check MUST utilize the
   STUN short-term credential mechanism. The username for the
   credential is formed by concatenating the username fragment provided
   by the peer with the username fragment of the agent sending the
   request, separated by a colon (":"). The password is equal to the
   password provided by the peer. For example, consider the case where
   agent L is the initiating agent, and agent R is the responding
   agent. Agent L included a username fragment of LFRAG for its
   candidates and a password of LPASS.
   Agent R provided a username fragment of RFRAG and a password of
   RPASS. A connectivity check from L to R utilizes the username
   RFRAG:LFRAG and a password of RPASS. A connectivity check from R to
   L utilizes the username LFRAG:RFRAG and a password of LPASS. The
   responses utilize the same usernames and passwords as the requests
   (note that the USERNAME attribute is not present in the response).

6.1.2.4.  DiffServ Treatment

   If the agent is using Diffserv Codepoint markings [RFC2475] in its
   media packets, it SHOULD apply those same markings to its
   connectivity checks.

6.1.3.  Processing the Response

   When a Binding response is received, it is correlated to its Binding
   request using the transaction ID, as defined in [RFC5389], which then
   ties it to the candidate pair for which the Binding request was sent.
   This section defines additional procedures for processing Binding
   responses specific to this usage of STUN.

6.1.3.1.  Failure Cases

   If the STUN transaction generates a 487 (Role Conflict) error
   response, the agent checks whether it included the ICE-CONTROLLED or
   ICE-CONTROLLING attribute in the Binding request. If the request
   contained the ICE-CONTROLLED attribute, the agent MUST switch to the
   controlling role if it has not already done so. If the request
   contained the ICE-CONTROLLING attribute, the agent MUST switch to the
   controlled role if it has not already done so. Once it has switched,
   the agent MUST enqueue the candidate pair whose check generated the
   487 into the triggered check queue. The state of that pair is set to
   Waiting. When the triggered check is sent, it will contain an
   ICE-CONTROLLING or ICE-CONTROLLED attribute reflecting its new role.
   Note, however, that the tie-breaker value MUST NOT be reselected.
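
   The credential formation of Section 6.1.2.3 and the 487 handling
   described above can be sketched together as follows. This is a
   minimal, non-normative Python illustration: the Agent type and both
   function names are hypothetical, candidate pairs are represented by
   plain labels, and setting the pair state to Waiting is left to the
   caller.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, Tuple


def check_credentials(local_ufrag: str, remote_ufrag: str,
                      remote_pwd: str) -> Tuple[str, str]:
    """Return (username, password) for a check sent to the peer.

    The username is the peer's fragment, a colon, then the sender's
    fragment; the password is the one provided by the peer.
    """
    return f"{remote_ufrag}:{local_ufrag}", remote_pwd


@dataclass
class Agent:
    controlling: bool
    tie_breaker: int
    triggered_queue: Deque[str] = field(default_factory=deque)


def on_role_conflict(agent: Agent, pair: str, sent_controlling: bool) -> None:
    """Handle a 487 (Role Conflict) response.

    sent_controlling is True if the failed request carried
    ICE-CONTROLLING, False if it carried ICE-CONTROLLED.
    """
    # Take the opposite of the role the request claimed; this is a
    # no-op if the agent has already switched.
    agent.controlling = not sent_controlling
    # Retry the same pair as a triggered check under the new role.
    agent.triggered_queue.append(pair)
    # agent.tie_breaker is deliberately left untouched.
```

   With the fragments from the example in Section 6.1.2.3, agent L
   would call check_credentials("LFRAG", "RFRAG", "RPASS") and agent R
   would call check_credentials("RFRAG", "LFRAG", "LPASS").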

   A change in roles will require an agent to recompute pair priorities
   (Section 5.1.3.2), since those priorities are a function of
   controlling and controlled roles. The change in role will also
   impact whether the agent is responsible for selecting nominated pairs
   and generating updated candidate information for sharing upon
   conclusion of ICE.

   Agents MAY support receipt of ICMP errors for connectivity checks.
   If the STUN transaction generates an ICMP error, the agent sets the
   state of the pair to Failed. If the STUN transaction generates a
   STUN error response that is unrecoverable (as defined in [RFC5389])
   or times out, the agent sets the state of the pair to Failed.

   The agent MUST check that the source IP address and port of the
   response equal the destination IP address and port to which the
   Binding request was sent, and that the destination IP address and
   port of the response match the source IP address and port from which
   the Binding request was sent. In other words, the source and
   destination transport addresses in the request and response must be
   symmetric. If they are not symmetric, the agent sets the state of
   the pair to Failed.

6.1.3.2.  Success Cases

   A check is considered to be a success if all of the following are
   true:

   o  The STUN transaction generated a success response.

   o  The source IP address and port of the response equal the
      destination IP address and port to which the Binding request was
      sent.

   o  The destination IP address and port of the response match the
      source IP address and port from which the Binding request was
      sent.

6.1.3.2.1.  Discovering Peer Reflexive Candidates

   The agent checks the mapped address from the STUN response.
If the transport address does not match any of the local candidates that the agent knows about, the mapped address represents a new candidate -- a peer reflexive candidate. Like other candidates, it has a type, base, priority, and foundation. They are computed as follows:

o  Its type is equal to peer reflexive.

o  Its base is set equal to the local candidate of the candidate pair from which the STUN check was sent.

o  Its priority is set equal to the value of the PRIORITY attribute in the Binding request.

o  Its foundation is selected as described in Section 4.1.1.3.

This peer reflexive candidate is then added to the list of local candidates for the media stream. Its username fragment and password are the same as all other local candidates for that media stream. However, the peer reflexive candidate is not paired with other remote candidates. Such pairing is not necessary; a valid pair will be generated from it momentarily based on the procedures in Section 6.1.3.2.2. If an agent wishes to pair the peer reflexive candidate with other remote candidates besides the one in the valid pair that will be generated, the agent MAY provide the peer with updated candidate information that includes the peer reflexive candidate. This will cause it to be paired with all other remote candidates.

6.1.3.2.2.  Constructing a Valid Pair

The agent constructs a candidate pair whose local candidate equals the mapped address of the response, and whose remote candidate equals the destination address to which the request was sent. This is called a valid pair, since it has been validated by a STUN connectivity check. The valid pair may equal the pair that generated the check, may equal a different pair in the check list, or may be a pair not currently on any check list.
If the pair equals the pair that generated the check or is on a check list currently, it is also added to the VALID LIST, which is maintained by the agent for each media stream. This list is empty at the start of ICE processing, and fills as checks are performed, resulting in valid candidate pairs.

It will be very common that the pair will not be on any check list. Recall that the check list has pairs whose local candidates are never server reflexive; those pairs had their local candidates converted to the base of the server reflexive candidates, and then pruned if they were redundant. When the response to the STUN check arrives, the mapped address will be reflexive if there is a NAT between the two. In that case, the valid pair will have a local candidate that doesn't match any of the pairs in the check list.

If the pair is not on any check list, the agent computes the priority for the pair based on the priority of each candidate, using the algorithm in Section 5.1.3. The priority of the local candidate depends on its type. If it is not peer reflexive, it is equal to the priority signaled for that candidate in the candidate exchange. If it is peer reflexive, it is equal to the PRIORITY attribute the agent placed in the Binding request that just completed. The priority of the remote candidate is taken from the candidate information of the peer. If the candidate does not appear there, then the check must have been a triggered check to a new remote candidate. In that case, the priority is taken as the value of the PRIORITY attribute in the Binding request that triggered the check that just completed. The pair is then added to the VALID LIST.

6.1.3.2.3.  Updating Pair States

The agent sets the state of the pair that *generated* the check to Succeeded.
Note that the pair that *generated* the check may be different from the valid pair constructed in Section 6.1.3.2.2 as a consequence of the response. The success of this check might also cause the state of other checks to change. The agent MUST perform the following two steps:

1.  The agent changes the states for all other Frozen pairs for the same media stream and same foundation to Waiting. Typically, but not always, these other pairs will have different component IDs.

2.  If there is a pair in the valid list for every component of this media stream (where this is the actual number of components being used, in cases where the number of components signaled in the candidate exchange differs from initiating to responding agent), the success of this check may unfreeze checks for other media streams. Note that this step is followed not just the first time the valid list under consideration has a pair for every component, but every subsequent time a check succeeds and adds yet another pair to that valid list. The agent examines the check list for each other media stream in turn:

    *  If the check list is active, the agent changes the state of all Frozen pairs in that check list whose foundation matches a pair in the valid list under consideration to Waiting.

    *  If the check list is frozen, and there is at least one pair in the check list whose foundation matches a pair in the valid list under consideration, the state of all pairs in the check list whose foundation matches a pair in the valid list under consideration is set to Waiting. This will cause the check list to become active, and ordinary checks will begin for it, as described in Section 5.1.4.
    *  If the check list is frozen, and there are no pairs in the check list whose foundation matches a pair in the valid list under consideration, the agent

       +  groups together all of the pairs with the same foundation, and

       +  for each group, sets the state of the pair with the lowest component ID to Waiting. If there is more than one such pair, the one with the highest priority is used.

6.1.3.2.4.  Updating the Nominated Flag

If the agent was a controlling agent, and it had included a USE-CANDIDATE attribute in the Binding request, the valid pair generated from that check has its nominated flag set to true. This flag indicates that this valid pair should be used for media if it is the highest-priority one amongst those whose nominated flag is set. This may conclude ICE processing for this media stream or all media streams; see Section 7.

If the agent is the controlled agent, the response may be the result of a triggered check that was sent in response to a request that itself had the USE-CANDIDATE attribute. This case is described in Section 6.2.1.5, and may now result in setting the nominated flag for the pair learned from the original request.

6.1.3.3.  Check List and Timer State Updates

Regardless of whether the check was successful or failed, the completion of the transaction may require updating of check list and timer states.

If all of the pairs in the check list are now either in the Failed or Succeeded state:

o  If there is not a pair in the valid list for each component of the media stream, the state of the check list is set to Failed.

o  For each frozen check list, the agent

   *  groups together all of the pairs with the same foundation, and

   *  for each group, sets the state of the pair with the lowest component ID to Waiting.
If there is more than one such pair, 2068 the one with the highest-priority is used. 2070 If none of the pairs in the check list are in the Waiting or Frozen 2071 state, the check list is no longer considered active, and will not 2072 count towards the value of N in the computation of timers for 2073 ordinary checks as described in Section 5.1.4. 2075 6.2. STUN Server Procedures 2077 An agent MUST be prepared to receive a Binding request on the base of 2078 each candidate it included in its most recent candidate exchange. 2079 This requirement holds even if the peer is a lite implementation. 2081 The agent MUST use the short-term credential mechanism (i.e., the 2082 MESSAGE-INTEGRITY attribute) to authenticate the request and perform 2083 a message integrity check. Likewise, the short-term credential 2084 mechanism MUST be used for the response. The agent MUST consider the 2085 username to be valid if it consists of two values separated by a 2086 colon, where the first value is equal to the username fragment 2087 generated by the agent in an candidate exchange for a session in- 2088 progress. It is possible (and in fact very likely) that the 2089 initiating agent will receive a Binding request prior to receiving 2090 the candidates from its peer. If this happens, the agent MUST 2091 immediately generate a response (including computation of the mapped 2092 address as described in Section 6.2.1.2). The agent has sufficient 2093 information at this point to generate the response; the password from 2094 the peer is not required. Once the answer is received, it MUST 2095 proceed with the remaining steps required, namely, Section 6.2.1.3, 2096 Section 6.2.1.4, and Section 6.2.1.5 for full implementations. In 2097 cases where multiple STUN requests are received before the answer, 2098 this may cause several pairs to be queued up in the triggered check 2099 queue. 
An agent MUST NOT utilize the ALTERNATE-SERVER mechanism, and MUST NOT support the backwards-compatibility mechanisms to RFC 3489. It MUST utilize the FINGERPRINT mechanism.

If the agent is using Diffserv Codepoint markings [RFC2475] in its media packets, it SHOULD apply those same markings to its responses to Binding requests. The same would apply to any layer 2 markings the endpoint might be applying to media packets.

6.2.1.  Additional Procedures for Full Implementations

This subsection defines the additional server procedures applicable to full implementations.

6.2.1.1.  Detecting and Repairing Role Conflicts

Normally, the rules for selection of a role in Section 5.1.2 will result in each agent selecting a different role -- one controlling and one controlled. However, in unusual call flows, typically utilizing third party call control, it is possible for both agents to select the same role. This section describes procedures for checking for this case and repairing it. These procedures apply only to usages of ICE that require conflict resolution. The usage document MUST specify whether this mechanism is needed.

An agent MUST examine the Binding request for either the ICE-CONTROLLING or ICE-CONTROLLED attribute. It MUST follow these procedures:

o  If neither ICE-CONTROLLING nor ICE-CONTROLLED is present in the request, the peer agent may have implemented a previous version of this specification. There may be a conflict, but it cannot be detected.

o  If the agent is in the controlling role, and the ICE-CONTROLLING attribute is present in the request:

   *  If the agent's tie-breaker is larger than or equal to the contents of the ICE-CONTROLLING attribute, the agent generates a Binding error response and includes an ERROR-CODE attribute with a value of 487 (Role Conflict) but retains its role.
   *  If the agent's tie-breaker is less than the contents of the ICE-CONTROLLING attribute, the agent switches to the controlled role.

o  If the agent is in the controlled role, and the ICE-CONTROLLED attribute is present in the request:

   *  If the agent's tie-breaker is larger than or equal to the contents of the ICE-CONTROLLED attribute, the agent switches to the controlling role.

   *  If the agent's tie-breaker is less than the contents of the ICE-CONTROLLED attribute, the agent generates a Binding error response and includes an ERROR-CODE attribute with a value of 487 (Role Conflict) but retains its role.

o  If the agent is in the controlled role and the ICE-CONTROLLING attribute is present in the request, or the agent is in the controlling role and the ICE-CONTROLLED attribute is present in the request, there is no conflict.

A change in roles will require an agent to recompute pair priorities (Section 5.1.3.2), since those priorities are a function of controlling and controlled roles. The change in role will also impact whether the agent is responsible for selecting nominated pairs and initiating exchange with updated candidate information upon conclusion of ICE.

The remaining subsections of Section 6.2.1 are followed if the server generated a successful response to the Binding request, even if the agent changed roles.

6.2.1.2.  Computing Mapped Address

For requests being received on a relayed candidate, the source transport address used for STUN processing (namely, generation of the XOR-MAPPED-ADDRESS attribute) is the transport address as seen by the TURN server. That source transport address will be present in the XOR-PEER-ADDRESS attribute of a Data Indication message, if the Binding request was delivered through a Data Indication.
If the Binding request was delivered through a ChannelData message, the source transport address is the one that was bound to the channel.

6.2.1.3.  Learning Peer Reflexive Candidates

If the source transport address of the request does not match any existing remote candidates, it represents a new peer reflexive remote candidate. This candidate is constructed as follows:

o  The priority of the candidate is set to the PRIORITY attribute from the request.

o  The type of the candidate is set to peer reflexive.

o  The foundation of the candidate is set to an arbitrary value, different from the foundation for all other remote candidates. If any subsequent candidate exchanges contain this peer reflexive candidate, they will signal the actual foundation for the candidate.

o  The component ID of this candidate is set to the component ID for the local candidate to which the request was sent.

This candidate is added to the list of remote candidates. However, the agent does not pair this candidate with any local candidates.

6.2.1.4.  Triggered Checks

Next, the agent constructs a pair whose local candidate is equal to the transport address on which the STUN request was received, and a remote candidate equal to the source transport address where the request came from (which may be the peer reflexive remote candidate that was just learned). The local candidate will either be a host candidate (for cases where the request was not received through a relay) or a relayed candidate (for cases where it is received through a relay). The local candidate can never be a server reflexive candidate. Since both candidates are known to the agent, it can obtain their priorities and compute the candidate pair priority. This pair is then looked up in the check list.
There can be one of several outcomes:

o  If the pair is already on the check list:

   *  If the state of that pair is Waiting or Frozen, a check for that pair is enqueued into the triggered check queue if not already present.

   *  If the state of that pair is In-Progress, the agent cancels the in-progress transaction. Cancellation means that the agent will not retransmit the request, will not treat the lack of response as a failure, but will wait the duration of the transaction timeout for a response. In addition, the agent MUST create a new connectivity check for that pair (representing a new STUN Binding request transaction) by enqueueing the pair in the triggered check queue. The state of the pair is then changed to Waiting.

   *  If the state of the pair is Failed, it is changed to Waiting and the agent MUST create a new connectivity check for that pair (representing a new STUN Binding request transaction), by enqueueing the pair in the triggered check queue.

   *  If the state of that pair is Succeeded, nothing further is done.

   These steps are done to facilitate rapid completion of ICE when both agents are behind NAT.

o  If the pair is not already on the check list:

   *  The pair is inserted into the check list based on its priority.

   *  Its state is set to Waiting.

   *  The pair is enqueued into the triggered check queue.

When a triggered check is to be sent, it is constructed and processed as described in Section 6.1.2. These procedures require the agent to know the transport address, username fragment, and password for the peer. The username fragment for the remote candidate is equal to the part after the colon of the USERNAME in the Binding request that was just received.
Using that username fragment, the agent can check the candidates received from its peer (there may be more than one in cases of forking), and find this username fragment. The corresponding password is then selected.

6.2.1.5.  Updating the Nominated Flag

If the Binding request received by the agent had the USE-CANDIDATE attribute set, and the agent is in the controlled role, the agent looks at the state of the pair computed in Section 6.2.1.4:

o  If the state of this pair is Succeeded, it means that the check generated by this pair produced a successful response. This would have caused the agent to construct a valid pair when that success response was received (see Section 6.1.3.2.2). The agent now sets the nominated flag in the valid pair to true. This may end ICE processing for this media stream; see Section 7.

o  If the state of this pair is In-Progress and its check later produces a successful result, the resulting valid pair has its nominated flag set when the response arrives. This may end ICE processing for this media stream when it arrives; see Section 7.

6.2.2.  Additional Procedures for Lite Implementations

If the check that was just received contained a USE-CANDIDATE attribute, the agent constructs a candidate pair whose local candidate is equal to the transport address on which the request was received, and whose remote candidate is equal to the source transport address of the request that was received. This candidate pair is assigned an arbitrary priority, and placed into a list of valid candidates called the valid list. The agent sets the nominated flag for that pair to true. ICE processing is considered complete for a media stream if the valid list contains a candidate pair for each component.

7.  Concluding ICE Processing

This section describes how an agent completes ICE.

7.1.
Procedures for Full Implementations

Concluding ICE involves nominating pairs by the controlling agent and updating of state machinery.

7.1.1.  Nominating Pairs

The controlling agent nominates pairs to be selected by ICE by using one of two techniques: regular nomination or aggressive nomination. If its peer has a lite implementation, an agent MUST use a regular nomination algorithm. If its peer is using ICE options (present in an ice-options attribute from the peer) that the agent does not understand, the agent MUST use a regular nomination algorithm. If its peer is a full implementation and isn't using any ICE options or is using ICE options understood by the agent, the agent MAY use either the aggressive or the regular nomination algorithm. However, the regular algorithm is RECOMMENDED since it provides greater stability.

7.1.1.1.  Regular Nomination

With regular nomination, the agent lets some number of checks complete, each of which omits the USE-CANDIDATE attribute. Once one or more checks complete successfully for a component of a media stream, valid pairs are generated and added to the valid list. The agent lets the checks continue until some stopping criterion is met, and then picks amongst the valid pairs based on an evaluation criterion. The criteria for stopping the checks and for evaluating the valid pairs are entirely a matter of local optimization.

When the controlling agent selects the valid pair, it repeats the check that produced this valid pair (by enqueueing the pair that generated the check into the triggered check queue), this time with the USE-CANDIDATE attribute. This check should succeed (since the previous one did), causing the nominated flag of that and only that pair to be set.
Consequently, there will be only a single nominated pair in the valid list for each component, and when the state of the check list moves to Completed, that exact pair is selected by ICE for sending and receiving media for that component.

Regular nomination provides the most flexibility, since the agent has control over the stopping and selection criteria for checks. The only requirement is that the agent MUST eventually pick one and only one candidate pair and generate a check for that pair with the USE-CANDIDATE attribute present. Regular nomination also improves ICE's resilience to variations in implementation (see Section 11). Regular nomination is also more stable, allowing both agents to converge on a single pair for media without any transient selections, which can happen with the aggressive algorithm. The drawback of regular nomination is that it is guaranteed to increase latencies because it requires an additional check to be done.

7.1.1.2.  Aggressive Nomination

With aggressive nomination, the controlling agent includes the USE-CANDIDATE attribute in every check it sends. Once the first check for a component succeeds, the resulting valid pair is added to the valid list with its nominated flag set. When all components have a nominated pair in the valid list, media can begin to flow using the highest-priority nominated pair. However, because the agent included the USE-CANDIDATE attribute in all of its checks, another check may yet complete, causing another valid pair to have its nominated flag set. ICE always selects the highest-priority nominated candidate pair from the valid list as the one used for media. Consequently, the selected pair may actually change briefly as ICE checks complete, resulting in a set of transient selections until it stabilizes.
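The transient selections that aggressive nomination can produce are illustrated by the following non-normative sketch. The ValidPair class, the selected_pair function, and the priority values are assumptions made for this example; real pair priorities are computed as described in Section 5.1.3.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ValidPair:
    # Hypothetical valid-list entry; addresses are kept as strings.
    component_id: int
    local: str
    remote: str
    priority: int
    nominated: bool = False


def selected_pair(valid_list, component_id):
    """Return the highest-priority nominated pair for a component, or None."""
    nominated = [
        pair
        for pair in valid_list
        if pair.component_id == component_id and pair.nominated
    ]
    return max(nominated, key=lambda pair: pair.priority, default=None)


# A check on a lower-priority pair happens to complete first.
valid_list = [ValidPair(1, "relayed", "peer", priority=100, nominated=True)]
first_choice = selected_pair(valid_list, 1)

# A later check succeeds on a higher-priority pair; the selection moves.
valid_list.append(ValidPair(1, "host", "peer", priority=200, nominated=True))
second_choice = selected_pair(valid_list, 1)
```

Here first_choice is the relayed pair and second_choice is the host pair, which is the kind of brief change in selection described above. With regular nomination only one pair per component is ever nominated, so the same function would return a stable result.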
If certain connectivity check messages are lost, ICE agents using aggressive nomination may end up with different views on the selected candidate pair. In this case, if a security protocol that is able to authenticate the communicating parties (e.g., DTLS) is used, the controlled agent may receive valid secured traffic or handshake initialization originating from the controlling agent on a candidate pair that is different from the one the controlled agent considers as the selected pair. If this happens, the controlled agent MUST consider the pair with the secured traffic as the correct selected pair. If such a security protocol is not used, both agents SHOULD continue sending connectivity check messages on the selected pair even after a pair has already been selected for use. In order to prevent the problem described here, at least one check from both agents needs to fully succeed on the selected pair.

7.1.2.  Updating States

For both controlling and controlled agents, the state of ICE processing depends on the presence of nominated candidate pairs in the valid list and on the state of the check list. Note that, at any time, more than one of the following cases can apply:

o  If there are no nominated pairs in the valid list for a media stream and the state of the check list is Running, ICE processing continues.

o  If there is at least one nominated pair in the valid list for a media stream and the state of the check list is Running:

   *  The agent MUST remove all Waiting and Frozen pairs in the check list and triggered check queue for the same component as the nominated pairs for that media stream.
   *  If an In-Progress pair in the check list is for the same component as a nominated pair, the agent SHOULD cease retransmissions for its check if its pair priority is lower than the lowest-priority nominated pair for that component.

o  Once there is at least one nominated pair in the valid list for every component of at least one media stream and the state of the check list is Running:

   *  The agent MUST change the state of processing for its check list for that media stream to Completed.

   *  The agent MUST continue to respond to any checks it may still receive for that media stream, and MUST perform triggered checks if required by the processing of Section 6.2.

   *  The agent MUST continue retransmitting any In-Progress checks for that check list.

   *  The agent MAY begin transmitting media for this media stream as described in Section 10.1.

o  Once the state of each check list is Completed:

   *  The agent sets the state of ICE processing overall to Completed.

   *  If the controlling agent is using an aggressive nomination algorithm, this may result in several updated candidate exchanges as the pairs selected for media change. An agent MAY delay sending its candidates for a brief interval (one second is RECOMMENDED) in order to allow the selected pairs to stabilize.

o  If the state of the check list is Failed, ICE has not been able to complete for this media stream. The correct behavior depends on the state of the check lists for other media streams:

   *  If all check lists are Failed, ICE processing overall is considered to be in the Failed state, and the agent SHOULD consider the session a failure, SHOULD NOT restart ICE, and the controlling agent SHOULD terminate the entire session.
   *  If at least one of the check lists for other media streams is Completed, the controlling agent SHOULD remove the failed media stream from the session when sending an updated candidate list to its peer.

   *  If none of the check lists for other media streams are Completed, but at least one is Running, the agent SHOULD let ICE continue.

7.2.  Procedures for Lite Implementations

Concluding ICE for a lite implementation is relatively straightforward. There are two cases to consider:

   The implementation is lite, and its peer is full.

   The implementation is lite, and its peer is lite.

The effect of ICE concluding is that the agent can free any allocated host candidates that were not utilized by ICE, as described in Section 7.3.

7.2.1.  Peer Is Full

In this case, the agent will receive connectivity checks from its peer. When an agent has received a connectivity check that includes the USE-CANDIDATE attribute for each component of a media stream, the state of ICE processing for that media stream moves from Running to Completed. When the state of ICE processing for all media streams is Completed, the state of ICE processing overall is Completed.

The lite implementation will never itself determine that ICE processing has failed for a media stream; rather, the full peer will make that determination and then remove or restart the failed media stream as part of a subsequent candidate exchange.

7.2.2.  Peer Is Lite

Once the candidate exchange has completed, each agent examines its own candidates and those of its peer. For each media stream, each agent pairs up its own candidates with the candidates of its peer for that media stream.
Two candidates are paired up when they are for the same component, utilize the same transport protocol (UDP in this specification), and are from the same IP address family (IPv4 or IPv6).

o  If there is a single pair per component, that pair is added to the valid list. If all of the components for a media stream had one pair, the state of ICE processing for that media stream is set to Completed. If all media streams are Completed, the state of ICE processing is set to Completed overall. This will always be the case for implementations that are IPv4-only.

o  If there is more than one pair per component:

   *  The agent MUST select a pair based on local policy. Since this case only arises for IPv6, it is RECOMMENDED that an agent follow the procedures of RFC 6724 [RFC6724] to select a single pair.

   *  The agent adds the selected pair for each component to the valid list. As described in Section 10.1, this will permit media to begin flowing. However, it is possible (and in fact likely) that both agents have chosen different pairs.

   *  To reconcile this, the controlling agent MUST send an updated candidate list, which will include the remote-candidates attribute.

   *  The agent MUST NOT update the state of ICE processing until after the candidate exchange completes. Then the controlling agent MUST change the state of ICE processing to Completed for all media streams, and the state of ICE processing overall to Completed.

7.3.  Freeing Candidates

7.3.1.  Full Implementation Procedures

The procedures in Section 7 require that an agent continue to listen for STUN requests and continue to generate triggered checks for a media stream, even once processing for that stream completes.
The rules in this section describe when it is safe for an agent to cease sending or receiving checks on a candidate that was not selected by ICE, and then free the candidate.

7.3.2.  Lite Implementation Procedures

A lite implementation MAY free candidates not selected by ICE as soon as ICE processing has reached the Completed state for all peers for all media streams using those candidates.

8.  ICE Restarts

An agent MAY restart ICE processing for an existing media stream. An ICE restart, as the name implies, will cause all previous states of ICE processing to be flushed and checks to start anew. The only difference between an ICE restart and a brand new media session is that, during the restart, media can continue to be sent to the previously validated pair.

An agent MUST restart ICE for a media stream if:

o  The candidates are being generated for the purpose of changing the target of the media stream. In other words, the agent wants to generate updated candidate information that, had ICE not been in use, would result in a new value for the destination of a media component.

o  An agent is changing its implementation level. This typically only happens in third party call control use cases, where the entity performing the signaling is not the entity receiving the media, and it has changed the target of media mid-session to another entity that has a different ICE implementation.

To restart ICE, an agent MUST change both the password and the username fragment for the media stream when exchanging the candidates. The new candidate set MAY include some, none, or all of the previous candidates for that stream and MAY include a totally new set of candidates.

9.  Keepalives

All endpoints MUST send keepalives for each media session.
These 2570 keepalives serve the purpose of keeping NAT bindings alive for the 2571 media session. These keepalives MUST be sent even if ICE is not 2572 being utilized for the session at all. The keepalive SHOULD be sent 2573 using a format that is supported by its peer. ICE endpoints allow 2574 for STUN-based keepalives for UDP streams, and as such, STUN 2575 keepalives MUST be used when an agent is a full ICE implementation 2576 and is communicating with a peer that supports ICE (lite or full). 2577 If the peer does not support ICE, the choice of a packet format for 2578 keepalives is a matter of local implementation. A format that allows 2579 packets to easily be sent in the absence of actual media content is 2580 RECOMMENDED. Examples of formats that readily meet this goal are RTP 2581 No-Op [I-D.ietf-avt-rtp-no-op], and in cases where both sides support 2582 it, RTP comfort noise [RFC3389]. If the peer doesn't support any 2583 formats that are particularly well suited for keepalives, an agent 2584 SHOULD send RTP packets with an incorrect version number, or some 2585 other form of error that would cause them to be discarded by the 2586 peer. 2588 If there has been no packet sent on the candidate pair ICE is using 2589 for a media component for Tr seconds (where packets include those 2590 defined for the component (RTP or RTCP) and previous keepalives), an 2591 agent MUST generate a keepalive on that pair. Tr SHOULD be 2592 configurable and SHOULD have a default of 15 seconds. Tr MUST NOT be 2593 configured to less than 15 seconds. Alternatively, if an agent has a 2594 dynamic way to discover the binding lifetimes of the intervening 2595 NATs, it can use that value to determine Tr. Administrators 2596 deploying ICE in more controlled networking environments SHOULD set 2597 Tr to the longest duration possible in their environment. 2599 If STUN is being used for keepalives, a STUN Binding Indication is 2600 used [RFC5389]. 
The Indication MUST NOT utilize any authentication 2601 mechanism. It SHOULD contain the FINGERPRINT attribute to aid in 2602 demultiplexing, but SHOULD NOT contain any other attributes. It is 2603 used solely to keep the NAT bindings alive. The Binding Indication 2604 is sent using the same local and remote candidates that are being 2605 used for media. Though Binding Indications are used for keepalives, 2606 an agent MUST be prepared to receive a connectivity check as well. 2607 If a connectivity check is received, a response is generated as 2608 discussed in [RFC5389], but there is no impact on ICE processing 2609 otherwise. 2611 An agent MUST begin the keepalive processing once ICE has selected 2612 candidates for usage with media, or media begins to flow, whichever 2613 happens first. Keepalives end once the session terminates or the 2614 media stream is removed. 2616 10. Media Handling 2618 10.1. Sending Media 2620 Procedures for sending media differ for full and lite 2621 implementations. 2623 10.1.1. Procedures for Full Implementations 2625 Agents always send media using a candidate pair, called the selected 2626 candidate pair. An agent will send media to the remote candidate in 2627 the selected pair (setting the destination address and port of the 2628 packet equal to that remote candidate), and will send it from the 2629 local candidate of the selected pair. When the local candidate is 2630 server or peer reflexive, media is originated from the base. Media 2631 sent from a relayed candidate is sent from the base through that TURN 2632 server, using procedures defined in [RFC5766]. 2634 If the local candidate is a relayed candidate, it is RECOMMENDED that 2635 an agent create a channel on the TURN server towards the remote 2636 candidate. This is done using the procedures for channel creation as 2637 defined in Section 11 of [RFC5766]. 
2639 The selected pair for a component of a media stream is: 2641 o empty if the state of the check list for that media stream is 2642 Running, and there is no previous selected pair for that component 2643 due to an ICE restart 2645 o equal to the previous selected pair for a component of a media 2646 stream if the state of the check list for that media stream is 2647 Running, and there was a previous selected pair for that component 2648 due to an ICE restart 2650 o equal to the highest-priority nominated pair for that component in 2651 the valid list if the state of the check list is Completed 2653 If the selected pair for at least one component of a media stream is 2654 empty, an agent MUST NOT send media for any component of that media 2655 stream. If the selected pair for each component of a media stream 2656 has a value, an agent MAY send media for all components of that media 2657 stream. 2659 10.1.2. Procedures for Lite Implementations 2661 A lite implementation MUST NOT send media until it has a Valid list 2662 that contains a candidate pair for each component of that media 2663 stream. Once that happens, the agent MAY begin sending media 2664 packets. To do that, it sends media to the remote candidate in the 2665 pair (setting the destination address and port of the packet equal to 2666 that remote candidate), and will send it from the local candidate. 2668 10.1.3. Procedures for All Implementations 2670 ICE has interactions with jitter buffer adaptation mechanisms. An 2671 RTP stream can begin using one candidate, and switch to another one, 2672 though this happens rarely with ICE. The newer candidate may result 2673 in RTP packets taking a different path through the network -- one 2674 with different delay characteristics. As discussed below, agents are 2675 encouraged to re-adjust jitter buffers when there are changes in 2676 source or destination address of media packets. 
Furthermore, many 2677 audio codecs use the marker bit to signal the beginning of a 2678 talkspurt, for the purposes of jitter buffer adaptation. For such 2679 codecs, it is RECOMMENDED that the sender set the marker bit 2680 [RFC3550] when an agent switches transmission of media from one 2681 candidate pair to another. 2683 10.2. Receiving Media 2685 ICE implementations MUST be prepared to receive media on each 2686 component on any candidates provided for that component in the most 2687 recent candidate exchange (in the case of RTP, this would include 2688 both RTP and RTCP if candidates were provided for both). 2690 It is RECOMMENDED that, when an agent receives an RTP packet with a 2691 new source or destination IP address for a particular media stream, 2692 the agent re-adjust its jitter buffers. 2694 RFC 3550 [RFC3550] describes an algorithm in Section 8.2 for 2695 detecting synchronization source (SSRC) collisions and loops. This 2696 algorithm is based, in part, on seeing different source transport 2697 addresses with the same SSRC. However, when ICE is used, such 2698 changes will sometimes occur as the media streams switch between 2699 candidates. An agent will be able to determine that a media stream 2700 is from the same peer as a consequence of the STUN exchange that 2701 precedes media transmission. Thus, if there is a change in source 2702 transport address, but the media packets come from the same peer 2703 agent, this SHOULD NOT be treated as an SSRC collision. 2705 11. Extensibility Considerations 2707 This specification makes very specific choices about how both agents 2708 in a session coordinate to arrive at the set of candidate pairs that 2709 are selected for media. It is anticipated that future specifications 2710 will want to alter these algorithms, whether they are simple changes 2711 like timer tweaks or larger changes like a revamp of the priority 2712 algorithm.
When such a change is made, providing interoperability 2713 between the two agents in a session is critical. 2715 First, ICE provides the ice-options attribute. Each extension or 2716 change to ICE is associated with a token. When an agent supporting 2717 such an extension or change triggers candidate exchange, it MUST 2718 include the token for that extension in this attribute. This allows 2719 each side to know what the other side is doing. This attribute MUST 2720 NOT be present if the agent doesn't support any ICE extensions or 2721 changes. 2723 One of the complications in achieving interoperability is that ICE 2724 relies on a distributed algorithm running on both agents to converge 2725 on an agreed set of candidate pairs. If the two agents run different 2726 algorithms, it can be difficult to guarantee convergence on the same 2727 candidate pairs. The regular nomination procedure described in 2728 Section 7 eliminates some of the tight coordination by delegating the 2729 selection algorithm completely to the controlling agent. 2730 Consequently, when a controlling agent is communicating with a peer 2731 that supports options it doesn't know about, the agent MUST run a 2732 regular nomination algorithm. When regular nomination is used, ICE 2733 will converge perfectly even when both agents use different pair 2734 prioritization algorithms. One of the keys to such convergence is 2735 triggered checks, which ensure that the nominated pair is validated 2736 by both agents. Consequently, any future ICE enhancements MUST 2737 preserve triggered checks. 2739 ICE is also extensible to other media streams beyond RTP, and for 2740 transport protocols beyond UDP. Extensions to ICE for non-RTP media 2741 streams need to specify how many components they utilize, and assign 2742 component IDs to them, starting at 1 for the most important component 2743 ID. 
Specifications for new transport protocols must define how, if 2744 at all, various steps in the ICE processing differ from UDP. 2746 12. Setting Ta and RTO 2748 During the gathering phase of ICE (Section 4.1.1) and while ICE is 2749 performing connectivity checks (Section 6), an agent sends STUN and 2750 TURN transactions. These transactions are paced at a rate of one 2751 every Ta milliseconds, and utilize a specific RTO. This section 2752 describes how the values of Ta and RTO are computed. This 2753 computation depends on whether ICE is being used with a real-time 2754 media stream (such as RTP) or something else. When ICE is used for a 2755 stream with a known maximum bandwidth, the computation in 2756 Section 12.1 MAY be followed to rate-control the ICE exchanges. For 2757 all other streams, the computation in Section 12.2 MUST be followed. 2759 12.1. Real-time Media Streams 2761 The values of RTO and Ta change during the lifetime of ICE 2762 processing. One set of values applies during the gathering phase, 2763 and the other, for connectivity checks. 2765 The value of Ta SHOULD be configurable, and SHOULD have a default of: 2767 For each media stream i: 2768 Ta_i = (stun_packet_size / rtp_packet_size) * rtp_ptime 2770 Ta = MAX (20ms, 1 / (SUM for i=1..k of (1 / Ta_i))) 2780 where k is the number of media streams. During the gathering phase, 2781 Ta is computed based on the number of media streams the agent has 2782 indicated in the candidate information, and the RTP packet size and 2783 RTP ptime are those of the most preferred codec for each media 2784 stream. Once the candidate exchange is completed, the agent 2785 recomputes Ta to pace the connectivity checks.
In that case, the 2786 value of Ta is based on the number of media streams that will 2787 actually be used in the session, and the RTP packet size and RTP 2788 ptime are those of the most preferred codec with which the agent will 2789 send. 2791 In addition, the retransmission timer for the STUN transactions, RTO, 2792 defined in [RFC5389], SHOULD be configurable and during the gathering 2793 phase, SHOULD have a default of: 2795 RTO = MAX (100ms, Ta * (number of pairs)) 2797 where the number of pairs refers to the number of pairs of candidates 2798 with STUN or TURN servers. 2800 For connectivity checks, RTO SHOULD be configurable and SHOULD have a 2801 default of: 2803 RTO = MAX (100ms, Ta*N * (Num-Waiting + Num-In-Progress)) 2805 where Num-Waiting is the number of checks in the check list in the 2806 Waiting state, and Num-In-Progress is the number of checks in the In- 2807 Progress state. Note that the RTO will be different for each 2808 transaction as the number of checks in the Waiting and In-Progress 2809 states change. 2811 These formulas are aimed at causing STUN transactions to be paced at 2812 the same rate as media. This ensures that ICE will work properly 2813 under the same network conditions needed to support the media as 2814 well. See Appendix B.1 for additional discussion and motivations. 2815 Because of this pacing, it will take a certain amount of time to 2816 obtain all of the server reflexive and relayed candidates. 2817 Implementations should be aware of the time required to do this, and 2818 if the application requires a time budget, limit the number of 2819 candidates that are gathered. 2821 The formulas result in a behavior whereby an agent will send its 2822 first packet for every single connectivity check before performing a 2823 retransmit. This can be seen in the formulas for the RTO (which 2824 represents the retransmit interval). Those formulas scale with N, 2825 the number of checks to be performed. 
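As an illustration of the formulas in this section, the following Python sketch computes Ta and the two RTO defaults. It is not part of the protocol. The function and parameter names are invented for this example, all times are in milliseconds, and the n argument simply carries the N factor that appears in the connectivity-check formula; the value 1 used in the sample call, like the packet sizes, is an assumption chosen for illustration.

```python
def ta_per_stream_ms(stun_packet_size, rtp_packet_size, rtp_ptime_ms):
    """Ta_i = (stun_packet_size / rtp_packet_size) * rtp_ptime."""
    return (stun_packet_size / rtp_packet_size) * rtp_ptime_ms


def ta_ms(ta_i_values, floor_ms=20.0):
    """Ta = MAX(20ms, 1 / sum(1 / Ta_i)) over the k media streams."""
    return max(floor_ms, 1.0 / sum(1.0 / ta_i for ta_i in ta_i_values))


def rto_gathering_ms(ta, num_pairs, floor_ms=100.0):
    """Gathering phase: RTO = MAX(100ms, Ta * (number of pairs))."""
    return max(floor_ms, ta * num_pairs)


def rto_checks_ms(ta, n, num_waiting, num_in_progress, floor_ms=100.0):
    """Checks: RTO = MAX(100ms, Ta*N * (Num-Waiting + Num-In-Progress))."""
    return max(floor_ms, ta * n * (num_waiting + num_in_progress))


if __name__ == "__main__":
    # Hypothetical sizes: 100-byte STUN packet, 200-byte RTP packet, 20 ms ptime.
    ta_i = ta_per_stream_ms(100, 200, 20)   # 10.0
    ta = ta_ms([ta_i, ta_i])                # two streams; the 20 ms floor applies
    print(ta)                               # 20.0
    print(rto_checks_ms(ta, 1, 8, 2))       # 200.0
```

The floor_ms arguments are left adjustable so that the same helpers can be reused with other minimum values. In this sketch the connectivity-check RTO grows linearly with the number of checks that are in the Waiting or In-Progress state.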
As a result of this, ICE 2826 maintains a nicely constant rate, but becomes more sensitive to 2827 packet loss. The loss of the first single packet for any 2828 connectivity check is likely to cause that pair to take a long time 2829 to be validated, and instead, a lower-priority check (but one for 2830 which there was no packet loss) is much more likely to complete 2831 first. This results in ICE performing sub-optimally, choosing lower- 2832 priority pairs over higher-priority pairs. Implementors should be 2833 aware of this consequence, but still should utilize the timer values 2834 described here. 2836 12.2. Non-real-time Sessions 2838 In cases where ICE is used to establish some kind of session that is 2839 not real time, and has no fixed rate associated with it that is known 2840 to work on the network in which ICE is deployed, Ta and RTO revert to 2841 more conservative values. Ta SHOULD be configurable, SHOULD have a 2842 default of 500 ms, and MUST NOT be configurable to be less than 500 2843 ms. 2845 If a Ta value other than the default is used, the agent MUST indicate 2846 the value it prefers to use in the ICE exchange. Both agents MUST 2847 use the higher of the two proposed values. 2849 In addition, the retransmission timer for the STUN transactions, RTO, 2850 SHOULD be configurable and during the gathering phase, SHOULD have a 2851 default of: 2853 RTO = MAX (500ms, Ta * (number of pairs)) 2855 where the number of pairs refers to the number of pairs of candidates 2856 with STUN or TURN servers. 2858 For connectivity checks, RTO SHOULD be configurable and SHOULD have a 2859 default of: 2861 RTO = MAX (500ms, Ta*N * (Num-Waiting + Num-In-Progress)) 2863 13. Example 2865 The example is based on the simplified topology of Figure 9.
2867 +-------+ 2868 |STUN | 2869 |Server | 2870 +-------+ 2871 | 2872 +---------------------+ 2873 | | 2874 | Internet | 2875 | | 2876 +---------------------+ 2877 | | 2878 | | 2879 +---------+ | 2880 | NAT | | 2881 +---------+ | 2882 | | 2883 | | 2884 +-----+ +-----+ 2885 | L | | R | 2886 +-----+ +-----+ 2888 Figure 9: Example Topology 2890 Two agents, L and R, are using ICE. Both are full-mode ICE 2891 implementations and use aggressive nomination when they are 2892 controlling. Both agents have a single IPv4 address. For agent L, 2893 it is 10.0.1.1 in private address space [RFC1918], and for agent R, 2894 192.0.2.1 on the public Internet. Both are configured with the same 2895 STUN server (shown in this example for simplicity, although in 2896 practice the agents do not need to use the same STUN server), which 2897 is listening for STUN Binding requests at an IP address of 192.0.2.2 2898 and port 3478. TURN servers are not used in this example. Agent L 2899 is behind a NAT, and agent R is on the public Internet. The NAT has 2900 an endpoint independent mapping property and an address dependent 2901 filtering property. The public side of the NAT has an IP address of 2902 192.0.2.3. 2904 To facilitate understanding, transport addresses are listed using 2905 variables that have mnemonic names. The format of the name is 2906 entity-type-seqno, where entity refers to the entity whose IP address 2907 the transport address is on, and is one of "L", "R", "STUN", or 2908 "NAT". The type is either "PUB" for transport addresses that are 2909 public, or "PRIV" for transport addresses that are private. 2910 Finally, seqno is a sequence number that is different for each 2911 transport address of the same type on a particular entity. Each 2912 variable has an IP address and port, denoted by varname.IP and 2913 varname.PORT, respectively, where varname is the name of the 2914 variable.
2916 The STUN server has advertised transport address STUN-PUB-1 (which is 2917 192.0.2.2:3478). 2919 In the call flow itself, STUN messages are annotated with several 2920 attributes. The "S=" attribute indicates the source transport 2921 address of the message. The "D=" attribute indicates the destination 2922 transport address of the message. The "MA=" attribute is used in 2923 STUN Binding response messages and refers to the mapped address. 2924 "USE-CAND" implies the presence of the USE-CANDIDATE attribute. 2926 The call flow examples omit STUN authentication operations and RTCP, 2927 and focus on RTP for a single media stream between two full 2928 implementations. 2930 L NAT STUN R 2931 |RTP STUN alloc. | | 2932 |(1) STUN Req | | | 2933 |S=$L-PRIV-1 | | | 2934 |D=$STUN-PUB-1 | | | 2935 |------------->| | | 2936 | |(2) STUN Req | | 2937 | |S=$NAT-PUB-1 | | 2938 | |D=$STUN-PUB-1 | | 2939 | |------------->| | 2940 | |(3) STUN Res | | 2941 | |S=$STUN-PUB-1 | | 2942 | |D=$NAT-PUB-1 | | 2943 | |MA=$NAT-PUB-1 | | 2944 | |<-------------| | 2945 |(4) STUN Res | | | 2946 |S=$STUN-PUB-1 | | | 2947 |D=$L-PRIV-1 | | | 2948 |MA=$NAT-PUB-1 | | | 2949 |<-------------| | | 2950 |(5) L's Candidate Information| | 2951 |------------------------------------------->| 2952 | | | | RTP STUN 2953 | | | | alloc. 
2954 | | |(6) STUN Req | 2955 | | |S=$R-PUB-1 | 2956 | | |D=$STUN-PUB-1 | 2957 | | |<-------------| 2958 | | |(7) STUN Res | 2959 | | |S=$STUN-PUB-1 | 2960 | | |D=$R-PUB-1 | 2961 | | |MA=$R-PUB-1 | 2962 | | |------------->| 2963 |(8) R's Candidate Information| | 2964 |<-------------------------------------------| 2965 | |(9) Bind Req | |Begin 2966 | |S=$R-PUB-1 | |Connectivity 2967 | |D=$L-PRIV-1 | |Checks 2968 | |<----------------------------| 2969 | |Dropped | | 2970 |(10) Bind Req | | | 2971 |S=$L-PRIV-1 | | | 2972 |D=$R-PUB-1 | | | 2973 |USE-CAND | | | 2974 |------------->| | | 2975 | |(11) Bind Req | | 2976 | |S=$NAT-PUB-1 | | 2977 | |D=$R-PUB-1 | | 2978 | |USE-CAND | | 2979 | |---------------------------->| 2980 | |(12) Bind Res | | 2981 | |S=$R-PUB-1 | | 2982 | |D=$NAT-PUB-1 | | 2983 | |MA=$NAT-PUB-1 | | 2984 | |<----------------------------| 2985 |(13) Bind Res | | | 2986 |S=$R-PUB-1 | | | 2987 |D=$L-PRIV-1 | | | 2988 |MA=$NAT-PUB-1 | | | 2989 |<-------------| | | 2990 |RTP flows | | | 2991 | |(14) Bind Req | | 2992 | |S=$R-PUB-1 | | 2993 | |D=$NAT-PUB-1 | | 2994 | |<----------------------------| 2995 |(15) Bind Req | | | 2996 |S=$R-PUB-1 | | | 2997 |D=$L-PRIV-1 | | | 2998 |<-------------| | | 2999 |(16) Bind Res | | | 3000 |S=$L-PRIV-1 | | | 3001 |D=$R-PUB-1 | | | 3002 |MA=$R-PUB-1 | | | 3003 |------------->| | | 3004 | |(17) Bind Res | | 3005 | |S=$NAT-PUB-1 | | 3006 | |D=$R-PUB-1 | | 3007 | |MA=$R-PUB-1 | | 3008 | |---------------------------->| 3009 | | | |RTP flows 3010 Figure 10: Example Flow 3012 First, agent L obtains a host candidate from its local IP address 3013 (not shown), and from that, sends a STUN Binding request to the STUN 3014 server to get a server reflexive candidate (messages 1-4). Recall 3015 that the NAT has the endpoint independent mapping property. 3016 Here, it creates a binding of NAT-PUB-1 for this UDP request, and 3017 this becomes the server reflexive candidate for RTP.
3019 Agent L sets a type preference of 126 for the host candidate and 100 3020 for the server reflexive. The local preference is 65535. Based on 3021 this, the priority of the host candidate is 2130706431 and for the 3022 server reflexive candidate is 1694498815. The host candidate is 3023 assigned a foundation of 1, and the server reflexive, a foundation of 3024 2. These are sent to the peer. 3026 This candidate information is received at agent R. Agent R will 3027 obtain a host candidate, and from it, obtain a server reflexive 3028 candidate (messages 6-7). Since R is not behind a NAT, this 3029 candidate is identical to its host candidate, and they share the same 3030 base. It therefore discards this redundant candidate and ends up 3031 with a single host candidate. With identical type and local 3032 preferences as L, the priority for this candidate is 2130706431. It 3033 chooses a foundation of 1 for its single candidate. R's 3034 candidates are then sent to L. 3036 Since neither side indicated that it is lite, the initiating agent 3037 that began ICE processing (agent L) becomes the controlling agent. 3039 Agents L and R both pair up the candidates. They both initially have 3040 two pairs. However, agent L will prune the pair containing its 3041 server reflexive candidate, resulting in just one. At agent L, this 3042 pair has a local candidate of $L-PRIV-1 and remote candidate of 3043 $R-PUB-1, and has a candidate pair priority of 4.57566E+18 (note that 3044 an implementation would represent this as a 64-bit integer so as not 3045 to lose precision). At agent R, there are two pairs. The highest 3046 priority has a local candidate of $R-PUB-1 and remote candidate of 3047 $L-PRIV-1 and has a priority of 4.57566E+18, and the second has a 3048 local candidate of $R-PUB-1 and remote candidate of $NAT-PUB-1 and 3049 priority 3.63891E+18. 3051 Agent R begins its connectivity check (message 9) for the first pair 3052 (between the two host candidates).
Since R is the controlled agent 3053 for this session, the check omits the USE-CANDIDATE attribute. The 3054 host candidate from agent L is private and behind a NAT, and thus 3055 this check won't be successful, because the packet cannot be routed 3056 from R to L. 3058 When agent L receives R's candidates, it performs its one and only 3059 connectivity check (messages 10-13). It implements the aggressive 3060 nomination algorithm, and thus includes a USE-CANDIDATE attribute in 3061 this check. Since the check succeeds, agent L creates a new pair, 3062 whose local candidate is from the mapped address in the Binding 3063 response (NAT-PUB-1 from message 13) and whose remote candidate is 3064 the destination of the request (R-PUB-1 from message 10). This is 3065 added to the Valid list. In addition, it is marked as selected since 3066 the Binding request contained the USE-CANDIDATE attribute. Since 3067 there is a selected candidate in the Valid list for the one component 3068 of this media stream, ICE processing for this stream moves into the 3069 Completed state. Agent L can now send media if it so chooses. 3071 Soon after receipt of the STUN Binding request from agent L (message 3072 11), agent R will generate its triggered check. This check happens 3073 to match the next one on its check list -- from its host candidate to 3074 agent L's server reflexive candidate. This check (messages 14-17) 3075 will succeed. Consequently, agent R constructs a new candidate pair 3076 using the mapped address from the response as the local candidate (R- 3077 PUB-1) and the destination of the request (NAT-PUB-1) as the remote 3078 candidate. This pair is added to the Valid list for that media 3079 stream. Since the check was generated in the reverse direction of a 3080 check that contained the USE-CANDIDATE attribute, the candidate pair 3081 is marked as selected. Consequently, processing for this stream 3082 moves into the Completed state, and agent R can also send media.
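The candidate priorities quoted in the example above can be reproduced with a short calculation. The Python sketch below is illustrative only. It assumes the candidate priority formula given earlier in this specification, priority = 2^24 * (type preference) + 2^8 * (local preference) + (256 - component ID), and a component ID of 1 for RTP. The function name is invented for this example.

```python
def candidate_priority(type_preference, local_preference, component_id):
    """Combine the three inputs into a single 32-bit candidate priority.

    Assumed formula: 2^24 * type preference + 2^8 * local preference
    + (256 - component ID).
    """
    return ((2 ** 24) * type_preference
            + (2 ** 8) * local_preference
            + (256 - component_id))


if __name__ == "__main__":
    # Values from the example: type preference 126 (host) or 100 (server
    # reflexive), local preference 65535, RTP component (ID 1, assumed).
    print(candidate_priority(126, 65535, 1))  # 2130706431
    print(candidate_priority(100, 65535, 1))  # 1694498815
```

Both results match the host and server reflexive priorities stated for agent L, and the first also matches the priority of agent R's single host candidate.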
3084 14. Security Considerations 3086 There are several types of attacks possible in an ICE system. This 3087 section considers these attacks and their countermeasures. These 3088 countermeasures include: 3090 o Using ICE in conjunction with secure signaling techniques, such as 3091 SIPS. 3093 o Limiting the total number of connectivity checks to 100, and 3094 optionally limiting the number of candidates an agent will accept in a 3095 candidate exchange. 3097 14.1. Attacks on Connectivity Checks 3099 An attacker might attempt to disrupt the STUN connectivity checks. 3100 Ultimately, all of these attacks fool an agent into thinking 3101 something incorrect about the results of the connectivity checks. 3102 The possible false conclusions an attacker can try to cause are: 3104 False Invalid: An attacker can fool a pair of agents into thinking a 3105 candidate pair is invalid, when it isn't. This can be used to 3106 cause an agent to prefer a different candidate (such as one 3107 injected by the attacker) or to disrupt a call by forcing all 3108 candidates to fail. 3110 False Valid: An attacker can fool a pair of agents into thinking a 3111 candidate pair is valid, when it isn't. This can cause an agent 3112 to proceed with a session, but then not be able to receive any 3113 media. 3115 False Peer Reflexive Candidate: An attacker can cause an agent to 3116 discover a new peer reflexive candidate, when it shouldn't have. 3117 This can be used to redirect media streams to a Denial-of-Service 3118 (DoS) target or to the attacker, for eavesdropping or other 3119 purposes. 3121 False Valid on False Candidate: An attacker has already convinced an 3122 agent that there is a candidate with an address that doesn't 3123 actually route to that agent (for example, by injecting a false 3124 peer reflexive candidate or false server reflexive candidate). It 3125 must then launch an attack that forces the agents to believe that 3126 this candidate is valid.
3128 If an attacker can cause a false peer reflexive candidate or false 3129 valid on a false candidate, it can launch any of the attacks 3130 described in [RFC5389]. 3132 To force the false invalid result, the attacker has to wait for the 3133 connectivity check from one of the agents to be sent. When it is, 3134 the attacker needs to inject a fake response with an unrecoverable 3135 error, such as a 400. However, since the candidate is, in 3136 fact, valid, the original request may reach the peer agent, and 3137 result in a success response. The attacker needs to force this 3138 packet or its response to be dropped, through a DoS attack, layer 2 3139 network disruption, or other technique. If it doesn't do this, the 3140 success response will also reach the originator, alerting it to a 3141 possible attack. Fortunately, this attack is mitigated completely 3142 through the STUN short-term credential mechanism. The attacker needs 3143 to inject a fake response, and in order for this response to be 3144 processed, the attacker needs the password. If the candidate 3145 exchange signaling is secured, the attacker will not have the 3146 password and its response will be discarded. 3148 Forcing the false valid result works in a similar way. The attacker 3149 needs to wait for the Binding request from each agent, and inject a 3150 fake success response. The attacker won't need to worry about 3151 disrupting the actual response since, if the candidate is not valid, 3152 it presumably wouldn't be received anyway. However, like the false 3153 invalid attack, this attack is mitigated by the STUN short-term 3154 credential mechanism in conjunction with a secure candidate exchange. 3156 Forcing the false peer reflexive candidate result can be done either 3157 with fake requests or responses, or with replays. We consider the 3158 fake requests and responses case first.
It requires the attacker to 3159 send a Binding request to one agent with a source IP address and port 3160 for the false candidate. In addition, the attacker must wait for a 3161 Binding request from the other agent, and generate a fake response 3162 with a XOR-MAPPED-ADDRESS attribute containing the false candidate. 3163 Like the other attacks described here, this attack is mitigated by 3164 the STUN message integrity mechanisms and secure candidate exchanges. 3166 Forcing the false peer reflexive candidate result with packet replays 3167 is different. The attacker waits until one of the agents sends a 3168 check. It intercepts this request, and replays it towards the other 3169 agent with a faked source IP address. It must also prevent the 3170 original request from reaching the remote agent, either by launching 3171 a DoS attack to cause the packet to be dropped, or forcing it to be 3172 dropped using layer 2 mechanisms. The replayed packet is received at 3173 the other agent, and accepted, since the integrity check passes (the 3174 integrity check cannot and does not cover the source IP address and 3175 port). It is then responded to. This response will contain a XOR- 3176 MAPPED-ADDRESS with the false candidate, and will be sent to that 3177 false candidate. The attacker must then receive it and relay it 3178 towards the originator. 3180 The other agent will then initiate a connectivity check towards that 3181 false candidate. This validation needs to succeed. This requires 3182 the attacker to force a false valid on a false candidate. Injecting 3183 of fake requests or responses to achieve this goal is prevented using 3184 the integrity mechanisms of STUN and the candidate exchange. Thus, 3185 this attack can only be launched through replays. To do that, the 3186 attacker must intercept the check towards this false candidate, and 3187 replay it towards the other agent. Then, it must intercept the 3188 response and replay that back as well. 
3190 This attack is very hard to launch unless the attacker is identified 3191 by the fake candidate. This is because it requires the attacker to 3192 intercept and replay packets sent by two different hosts. If both 3193 agents are on different networks (for example, across the public 3194 Internet), this attack can be hard to coordinate, since it needs to 3195 occur against two different endpoints on different parts of the 3196 network at the same time. 3198 If the attacker itself is identified by the fake candidate, the 3199 attack is easier to coordinate. However, if the media path is 3200 secured (e.g., using SRTP [RFC3711]), the attacker will not be able 3201 to play the media packets, but will only be able to discard them, 3202 effectively disabling the media stream for the call. However, this 3203 attack requires the attacker to disrupt packets in order to block the 3204 connectivity check from reaching the target. In that case, if the 3205 goal is to disrupt the media stream, it's much easier to just disrupt 3206 it with the same mechanism, rather than attack ICE. 3208 14.2. Attacks on Server Reflexive Address Gathering 3210 ICE endpoints make use of STUN Binding requests for gathering server 3211 reflexive candidates from a STUN server. These requests are not 3212 authenticated in any way. As a consequence, there are numerous 3213 techniques an attacker can employ to provide the client with a false 3214 server reflexive candidate: 3216 o An attacker can compromise the DNS, causing DNS queries to return 3217 a rogue STUN server address. That server can provide the client 3218 with fake server reflexive candidates. This attack is mitigated 3219 by DNS security, though DNSSEC is not required to address it. 3221 o An attacker that can observe STUN messages (such as an attacker on 3222 a shared network segment, like WiFi) can inject a fake response 3223 that is valid and will be accepted by the client.

   o  An attacker can compromise a STUN server by means of a virus, and
      cause it to send responses with incorrect mapped addresses.

   A false mapped address learned by these attacks will be used as a
   server reflexive candidate in the ICE exchange. For this candidate
   to actually be used for media, the attacker must also attack the
   connectivity checks, and in particular, force a false valid on a
   false candidate. This attack is very hard to launch if the false
   address identifies a fourth party (neither the initiator, responder,
   nor attacker), since it requires attacking the checks generated by
   each agent in the session, and is prevented by SRTP if it identifies
   the attacker itself.

   If the attacker elects not to attack the connectivity checks, the
   worst it can do is prevent the server reflexive candidate from being
   used. However, if the peer agent has at least one candidate that is
   reachable by the agent under attack, the STUN connectivity checks
   themselves will provide a peer reflexive candidate that can be used
   for the exchange of media. Peer reflexive candidates are generally
   preferred over server reflexive candidates. As such, an attack
   solely on the STUN address gathering will normally have no impact on
   a session at all.

14.3. Attacks on Relayed Candidate Gathering

   An attacker might attempt to disrupt the gathering of relayed
   candidates, forcing the client to believe it has a false relayed
   candidate. Exchanges with the TURN server are authenticated using a
   long-term credential. Consequently, injection of fake responses or
   requests will not work. In addition, unlike Binding requests,
   Allocate requests are not susceptible to replay attacks with modified
   source IP addresses and ports, since the source IP address and port
   are not utilized to provide the client with its relayed candidate.

   However, TURN servers are susceptible to DNS attacks, or to viruses
   aimed at the TURN server, for purposes of turning it into a zombie or
   rogue server. These attacks can be mitigated by DNSSEC and through
   good box and software security on TURN servers.

   Even if an attacker has caused the client to believe in a false
   relayed candidate, the connectivity checks cause such a candidate to
   be used only if they succeed. Thus, an attacker must launch a false
   valid on a false candidate, per above, which is a very difficult
   attack to coordinate.

14.4. Insider Attacks

   In addition to attacks where the attacker is a third party trying to
   insert fake candidate information or STUN messages, there are attacks
   possible with ICE when the attacker is an authenticated and valid
   participant in the ICE exchange.

14.4.1. STUN Amplification Attack

   The STUN amplification attack is similar to the voice hammer.
   However, instead of voice packets being directed to the target, STUN
   connectivity checks are directed to the target. The attacker sends
   a large number of candidates, say, 50. The responding agent
   receives the candidate information, and starts its checks, which are
   directed at the target, and consequently, never generate a response.
   The answerer will start a new connectivity check every Ta ms (say,
   Ta = 20 ms). However, the retransmission timers are set to a large
   number due to the large number of candidates. As a consequence,
   packets will be sent at an interval of one every Ta milliseconds, and
   then with increasing intervals after that. Thus, STUN will not send
   packets at a rate faster than media would be sent, and the STUN
   packets persist only briefly, until ICE fails for the session.
   Nonetheless, this is an amplification mechanism.

   It is impossible to eliminate the amplification, but the volume can
   be reduced through a variety of heuristics. Agents SHOULD limit the
   total number of connectivity checks they perform to 100.
   Additionally, agents MAY limit the number of candidates they'll
   accept.

   Frequently, protocols that wish to avoid these kinds of attacks force
   the initiator to wait for a response prior to sending the next
   message. However, in the case of ICE, this is not possible. It is
   not possible to differentiate the following two cases:

   o  There was no response because the initiator is being used to
      launch a DoS attack against an unsuspecting target that will not
      respond.

   o  There was no response because the IP address and port are not
      reachable by the initiator.

   In the second case, another check should be sent at the next
   opportunity, while in the former case, no further checks should be
   sent.

15. STUN Extensions

15.1. New Attributes

   This specification defines four new attributes, PRIORITY,
   USE-CANDIDATE, ICE-CONTROLLED, and ICE-CONTROLLING.

   The PRIORITY attribute indicates the priority that is to be
   associated with a peer reflexive candidate, should one be discovered
   by this check. It is a 32-bit unsigned integer, and has an attribute
   value of 0x0024.

   The USE-CANDIDATE attribute indicates that the candidate pair
   resulting from this check should be used for transmission of media.
   The attribute has no content (the Length field of the attribute is
   zero); it serves as a flag. It has an attribute value of 0x0025.

   The ICE-CONTROLLED attribute is present in a Binding request and
   indicates that the client believes it is currently in the controlled
   role. The content of the attribute is a 64-bit unsigned integer in
   network byte order, which contains a random number used for
   tie-breaking of role conflicts.
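
   As a non-normative illustration, the attributes in this section
   could be serialized roughly as sketched below. The sketch assumes
   the generic STUN attribute layout of [RFC5389] (16-bit type, 16-bit
   length, value padded to a 32-bit boundary) and takes the type codes
   from Section 17.1. The function names are hypothetical and are not
   defined by this specification. ICE-CONTROLLING uses the same 64-bit
   layout as ICE-CONTROLLED with its own type code.

```python
import struct

# Attribute type codes as listed in Section 17.1 of this document.
PRIORITY = 0x0024
USE_CANDIDATE = 0x0025
ICE_CONTROLLED = 0x8029
ICE_CONTROLLING = 0x802A


def encode_attribute(attr_type: int, value: bytes = b"") -> bytes:
    """Encode one attribute as type (16 bits), length (16 bits), value.

    Assumption: generic STUN TLV layout, where the Length field counts
    the unpadded value and the value is zero-padded to a multiple of 4.
    """
    padding = (-len(value)) % 4
    header = struct.pack("!HH", attr_type, len(value))
    return header + value + b"\x00" * padding


def encode_priority(priority: int) -> bytes:
    """PRIORITY carries a 32-bit unsigned integer in network byte order."""
    return encode_attribute(PRIORITY, struct.pack("!I", priority))


def encode_use_candidate() -> bytes:
    """USE-CANDIDATE is a flag: it has no content, so Length is zero."""
    return encode_attribute(USE_CANDIDATE)


def encode_role(controlling: bool, tie_breaker: int) -> bytes:
    """ICE-CONTROLLING / ICE-CONTROLLED carry a 64-bit tie-breaker."""
    attr_type = ICE_CONTROLLING if controlling else ICE_CONTROLLED
    return encode_attribute(attr_type, struct.pack("!Q", tie_breaker))


if __name__ == "__main__":
    print(encode_priority(0x6E0001FF).hex())
    print(encode_use_candidate().hex())
    print(encode_role(False, 1).hex())
```

   In a real agent these attributes would be appended to a Binding
   request before the message integrity attribute is computed; that
   step is outside the scope of this sketch.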

   The ICE-CONTROLLING attribute is present in a Binding request and
   indicates that the client believes it is currently in the controlling
   role. The content of the attribute is a 64-bit unsigned integer in
   network byte order, which contains a random number used for
   tie-breaking of role conflicts.

15.2. New Error Response Codes

   This specification defines a single error response code:

   487 (Role Conflict): The Binding request contained either the
      ICE-CONTROLLING or ICE-CONTROLLED attribute, indicating a role
      that conflicted with the server. The server ran a tie-breaker
      based on the tie-breaker value in the request and determined that
      the client needs to switch roles.

16. Operational Considerations

   This section discusses issues relevant to network operators looking
   to deploy ICE.

16.1. NAT and Firewall Types

   ICE was designed to work with existing NAT and firewall equipment.
   Consequently, it is not necessary to replace or reconfigure existing
   firewall and NAT equipment in order to facilitate deployment of ICE.
   Indeed, ICE was developed to be deployed in environments where the
   Voice over IP (VoIP) operator has no control over the IP network
   infrastructure, including firewalls and NAT.

   That said, ICE works best in environments where the NAT devices are
   "behave" compliant, meeting the recommendations defined in [RFC4787]
   and [RFC5382]. In networks with behave-compliant NAT, ICE will work
   without the need for a TURN server, thus improving voice quality,
   decreasing call setup times, and reducing the bandwidth demands on
   the network operator.

16.2. Bandwidth Requirements

   Deployment of ICE can have several interactions with available
   network capacity that operators should take into consideration.

16.2.1. STUN and TURN Server Capacity Planning

   First and foremost, ICE makes use of TURN and STUN servers, which
   would typically be located in the network operator's data centers.
   The STUN servers require relatively little bandwidth. For each
   component of each media stream, there will be one or more STUN
   transactions from each client to the STUN server. In a basic
   voice-only IPv4 VoIP deployment, there will be four transactions per
   call (one for RTP and one for RTCP, for both caller and callee). Each
   transaction is a single request and a single response, the former
   being 20 bytes long, and the latter, 28. Consequently, if a system
   has N users, and each makes four calls in a busy hour, this would
   require N * 1.7 bps. For one million users, this is 1.7 Mbps, a very
   small number (relatively speaking).

   TURN traffic is more substantial. The TURN server will see traffic
   volume equal to the STUN volume (indeed, if TURN servers are
   deployed, there is no need for a separate STUN server), in addition
   to the actual media traffic. The number of calls requiring TURN for
   media relay is highly dependent on network topologies, and can and
   will vary over time. In a network with 100% behave-compliant NAT, it
   is exactly zero. At time of writing, large-scale consumer
   deployments were seeing between 5 and 10 percent of calls requiring
   TURN servers. Considering a voice-only deployment using G.711 (so
   80 kbps in each direction), with 0.2 erlangs during the busy hour,
   this is N * 3.2 kbps. For a population of one million users, this is
   3.2 Gbps, assuming a 10% usage of TURN servers.

16.2.2. Gathering and Connectivity Checks

   Gathering candidates and performing connectivity checks can be
   bandwidth intensive. ICE has been designed to pace both of these
   processes.
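
   The per-user figures quoted in Section 16.2.1 can be reproduced with
   the short, non-normative calculation below. The parameter names are
   illustrative only. The factor of two for media relayed in both
   directions is an assumption made here because it reproduces the
   3.2 kbps figure; the text does not spell it out.

```python
def stun_bps_per_user(calls_per_busy_hour=4, transactions_per_call=4,
                      request_bytes=20, response_bytes=28):
    """Average STUN server load per user, in bits per second.

    Each transaction is one request plus one response; the busy-hour
    total is averaged over 3600 seconds.
    """
    transactions = calls_per_busy_hour * transactions_per_call
    bits_per_hour = transactions * (request_bytes + response_bytes) * 8
    return bits_per_hour / 3600


def turn_bps_per_user(codec_bps_per_direction=80_000, erlangs=0.2,
                      turn_fraction=0.10):
    """Average relayed media load per user, in bits per second.

    Assumption: the relay carries both directions of each relayed call.
    """
    return codec_bps_per_direction * 2 * erlangs * turn_fraction


if __name__ == "__main__":
    users = 1_000_000
    print(f"STUN: {stun_bps_per_user():.1f} bps per user, "
          f"{stun_bps_per_user() * users / 1e6:.1f} Mbps total")
    print(f"TURN: {turn_bps_per_user() / 1e3:.1f} kbps per user, "
          f"{turn_bps_per_user() * users / 1e9:.1f} Gbps total")
```

   Changing turn_fraction to 0.05 shows the lower end of the 5 to 10
   percent range reported for consumer deployments.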
   The gathering phase and the connectivity check phase are meant to
   generate traffic at roughly the same bandwidth as the media traffic
   itself. This was done to ensure that, if a network is designed to
   support multimedia traffic of a certain type (voice, video, or just
   text), it will have sufficient capacity to support the ICE checks
   for that media. Of course, the ICE checks will cause a marginal
   increase in the total utilization; however, this will typically be
   an extremely small increase.

   Congestion due to the gathering and check phases has proven to be a
   problem in deployments that did not utilize pacing. Typically,
   access links became congested as the endpoints flooded the network
   with checks as fast as they could send them. Consequently, network
   operators should make sure that their ICE implementations support the
   pacing feature. Though this pacing does increase call setup times,
   it makes ICE network friendly and easier to deploy.

16.2.3. Keepalives

   STUN keepalives (in the form of STUN Binding Indications) are sent in
   the middle of a media session. However, they are sent only in the
   absence of actual media traffic. In deployments that are not
   utilizing Voice Activity Detection (VAD), the keepalives are never
   used and there is no increase in bandwidth usage. When VAD is being
   used, keepalives will be sent during silence periods. This involves
   a single packet every 15-20 seconds, far less than the packet every
   20-30 ms that is sent when there is voice. Therefore, keepalives
   don't have any real impact on capacity planning.

16.3. ICE and ICE-lite

   Deployments utilizing a mix of ICE and ICE-lite interoperate
   perfectly. They have been explicitly designed to do so, without loss
   of function.

   However, ICE-lite can only be deployed in limited use cases.
   Those cases, and the caveats involved in doing so, are documented in
   Appendix A.

16.4. Troubleshooting and Performance Management

   ICE utilizes end-to-end connectivity checks, and places much of the
   processing in the endpoints. This introduces a challenge to the
   network operator -- how can they troubleshoot ICE deployments? How
   can they know how ICE is performing?

   ICE has built-in features to help deal with these problems. SIP
   servers on the signaling path, typically deployed in the data centers
   of the network operator, will see the contents of the candidate
   exchanges that convey the ICE parameters. These parameters include
   the type of each candidate (host, server reflexive, or relayed),
   along with their related addresses. Once ICE processing has
   completed, an updated candidate exchange takes place, signaling the
   selected address (and its type). This updated exchange is performed
   exactly for the purposes of educating network equipment (such as a
   diagnostic tool attached to a SIP server) about the results of ICE
   processing.

   As a consequence, through the logs generated by the SIP server, a
   network operator can observe what types of candidates are being used
   for each call, and what address was selected by ICE. This is the
   primary information that helps evaluate how ICE is performing.

16.5. Endpoint Configuration

   ICE relies on several pieces of data being configured into the
   endpoints. This configuration data includes timers, credentials for
   TURN servers, and hostnames for STUN and TURN servers. ICE itself
   does not provide a mechanism for this configuration. Instead, it is
   assumed that this information is attached to whatever mechanism is
   used to configure all of the other parameters in the endpoint. For
   SIP phones, standard solutions such as the configuration framework
   [RFC6080] have been defined.

17. IANA Considerations

   The original ICE specification registered four new STUN attributes,
   and one new STUN error response. The STUN attributes and error
   response are reproduced here.

17.1. STUN Attributes

   IANA has registered four STUN attributes:

   0x0024 PRIORITY
   0x0025 USE-CANDIDATE
   0x8029 ICE-CONTROLLED
   0x802A ICE-CONTROLLING

17.2. STUN Error Responses

   IANA has registered the following STUN error response code:

   487 Role Conflict: The client asserted an ICE role (controlling or
      controlled) that is in conflict with the role of the server.

18. IAB Considerations

   The IAB has studied the problem of "Unilateral Self-Address Fixing",
   which is the general process by which an agent attempts to determine
   its address in another realm on the other side of a NAT through a
   collaborative protocol reflection mechanism [RFC3424]. ICE is an
   example of a protocol that performs this type of function.
   Interestingly, the process for ICE is not unilateral, but bilateral,
   and the difference has a significant impact on the issues raised by
   IAB. Indeed, ICE can be considered a B-SAF (Bilateral Self-Address
   Fixing) protocol, rather than an UNSAF protocol. Regardless, the IAB
   has mandated that any protocols developed for this purpose document a
   specific set of considerations. This section meets those
   requirements.

18.1. Problem Definition

   From RFC 3424, any UNSAF proposal must provide:

      Precise definition of a specific, limited-scope problem that is to
      be solved with the UNSAF proposal. A short-term fix should not be
      generalized to solve other problems; this is why "short-term fixes
      usually aren't".

   The specific problems being solved by ICE are:

      Provide a means for two peers to determine the set of transport
      addresses that can be used for communication.

      Provide a means for an agent to determine an address that is
      reachable by another peer with which it wishes to communicate.

18.2. Exit Strategy

   From RFC 3424, any UNSAF proposal must provide:

      Description of an exit strategy/transition plan. The better
      short-term fixes are the ones that will naturally see less and
      less use as the appropriate technology is deployed.

   ICE itself doesn't easily get phased out. However, it is useful even
   in a globally connected Internet, to serve as a means for detecting
   whether a router failure has temporarily disrupted connectivity, for
   example. ICE also helps prevent certain security attacks that have
   nothing to do with NAT. However, what ICE does is help phase out
   other UNSAF mechanisms. ICE effectively selects amongst those
   mechanisms, prioritizing ones that are better, and deprioritizing
   ones that are worse. Local IPv6 addresses can be preferred. As NATs
   begin to dissipate as IPv6 is introduced, server reflexive and
   relayed candidates (both forms of UNSAF addresses) simply never get
   used, because higher-priority connectivity exists to the native host
   candidates. Therefore, the servers get used less and less, and can
   eventually be removed when their usage goes to zero.

   Indeed, ICE can assist in the transition from IPv4 to IPv6. It can
   be used to determine whether to use IPv6 or IPv4 when two dual-stack
   hosts communicate with SIP (IPv6 gets used). It can also allow a
   network with both 6to4 and native v6 connectivity to determine which
   address to use when communicating with a peer.

18.3. Brittleness Introduced by ICE

   From RFC 3424, any UNSAF proposal must provide:

      Discussion of specific issues that may render systems more
      "brittle".
      For example, approaches that involve using data at
      multiple network layers create more dependencies, increase
      debugging challenges, and make it harder to transition.

   ICE actually removes brittleness from existing UNSAF mechanisms. In
   particular, classic STUN (as described in RFC 3489 [RFC3489]) has
   several points of brittleness. One of them is the discovery process
   that requires an agent to try to classify the type of NAT it is
   behind. This process is error-prone. With ICE, that discovery
   process is simply not used. Rather than unilaterally assessing the
   validity of the address, its validity is dynamically determined by
   measuring connectivity to a peer. The process of determining
   connectivity is very robust.

   Another point of brittleness in classic STUN and any other unilateral
   mechanism is its absolute reliance on an additional server. ICE
   makes use of a server for allocating unilateral addresses, but allows
   agents to directly connect if possible. Therefore, in some cases,
   the failure of a STUN server would still allow for a call to progress
   when ICE is used.

   Another point of brittleness in classic STUN is that it assumes that
   the STUN server is on the public Internet. Interestingly, with ICE,
   that is not necessary. There can be a multitude of STUN servers in a
   variety of address realms. ICE will discover the one that has
   provided a usable address.

   The most troubling point of brittleness in classic STUN is that it
   doesn't work in all network topologies. In cases where there is a
   shared NAT between each agent and the STUN server, traditional STUN
   may not work. With ICE, that restriction is removed.

   Classic STUN also introduces some security considerations.
   Fortunately, those security considerations are also mitigated by ICE.

   Consequently, ICE serves to repair the brittleness introduced in
   classic STUN, and does not introduce any additional brittleness into
   the system.

   The penalty of these improvements is that ICE increases session
   establishment times.

18.4. Requirements for a Long-Term Solution

   From RFC 3424, any UNSAF proposal must provide:

      ... requirements for longer term, sound technical solutions --
      contribute to the process of finding the right longer term
      solution.

   Our conclusions from RFC 3489 remain unchanged. However, we feel ICE
   actually helps because we believe it can be part of the long-term
   solution.

18.5. Issues with Existing NAPT Boxes

   From RFC 3424, any UNSAF proposal must provide:

      Discussion of the impact of the noted practical issues with
      existing, deployed NA[P]Ts and experience reports.

   A number of NAT boxes are now being deployed into the market that try
   to provide "generic" ALG functionality. These generic ALGs hunt for
   IP addresses, either in text or binary form within a packet, and
   rewrite them if they match a binding. This interferes with classic
   STUN. However, the update to STUN [RFC5389] uses an encoding that
   hides these binary addresses from generic ALGs.

   Existing NAPT boxes have non-deterministic and typically short
   expiration times for UDP-based bindings. This requires
   implementations to send periodic keepalives to maintain those
   bindings. ICE uses a default of 15 s, which is a very conservative
   estimate. Eventually, over time, as NAT boxes become compliant with
   behave [RFC4787], this minimum keepalive will become deterministic
   and well-known, and the ICE timers can be adjusted. Having a way to
   discover and control the minimum keepalive interval would be far
   better still.

19. Changes from RFC 5245

   Following is the list of changes from RFC 5245:

   o  The specification was generalized to be more usable with any
      protocol and the parts that are specific to SIP and SDP were moved
      to a SIP/SDP usage document [I-D.ietf-mmusic-ice-sip-sdp].

   o  Default candidates, multiple components, ICE mismatch detection,
      subsequent offer/answer, and role conflict resolution were made
      optional since they are not needed with every protocol using ICE.

   o  With IPv6, the precedence rules of RFC 6724 are used instead of
      the obsoleted RFC 3484, and using address preferences provided by
      the host operating system is recommended.

   o  Candidate gathering rules regarding loopback addresses and IPv6
      addresses were clarified.

20. Acknowledgements

   Most of the text in this document comes from the original ICE
   specification, RFC 5245. The authors would like to thank everyone
   who has contributed to that document. For additional contributions
   to this revision of the specification we would like to thank Christer
   Holmberg, Emil Ivov, Paul Kyzivat, Pal-Erik Martinsen, Simon
   Perrault, Eric Rescorla, Thomas Stach, Peter Thatcher, Martin
   Thomson, Justin Uberti, and Suhas Nandakumar.

21. References

21.1. Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC5389]  Rosenberg, J., Mahy, R., Matthews, P., and D. Wing,
              "Session Traversal Utilities for NAT (STUN)", RFC 5389,
              DOI 10.17487/RFC5389, October 2008.

   [RFC5766]  Mahy, R., Matthews, P., and J. Rosenberg, "Traversal Using
              Relays around NAT (TURN): Relay Extensions to Session
              Traversal Utilities for NAT (STUN)", RFC 5766,
              DOI 10.17487/RFC5766, April 2010.

   [RFC6724]  Thaler, D., Ed., Draves, R., Matsumoto, A., and T.
              Chown, "Default Address Selection for Internet Protocol
              Version 6 (IPv6)", RFC 6724, DOI 10.17487/RFC6724,
              September 2012.

21.2. Informative References

   [RFC3605]  Huitema, C., "Real Time Control Protocol (RTCP) attribute
              in Session Description Protocol (SDP)", RFC 3605,
              DOI 10.17487/RFC3605, October 2003.

   [RFC3261]  Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston,
              A., Peterson, J., Sparks, R., Handley, M., and E.
              Schooler, "SIP: Session Initiation Protocol", RFC 3261,
              DOI 10.17487/RFC3261, June 2002.

   [RFC3264]  Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model
              with Session Description Protocol (SDP)", RFC 3264,
              DOI 10.17487/RFC3264, June 2002.

   [RFC3489]  Rosenberg, J., Weinberger, J., Huitema, C., and R. Mahy,
              "STUN - Simple Traversal of User Datagram Protocol (UDP)
              Through Network Address Translators (NATs)", RFC 3489,
              DOI 10.17487/RFC3489, March 2003.

   [RFC3235]  Senie, D., "Network Address Translator (NAT)-Friendly
              Application Design Guidelines", RFC 3235,
              DOI 10.17487/RFC3235, January 2002.

   [RFC3303]  Srisuresh, P., Kuthan, J., Rosenberg, J., Molitor, A., and
              A. Rayhan, "Middlebox communication architecture and
              framework", RFC 3303, DOI 10.17487/RFC3303, August 2002.

   [RFC3102]  Borella, M., Lo, J., Grabelsky, D., and G. Montenegro,
              "Realm Specific IP: Framework", RFC 3102,
              DOI 10.17487/RFC3102, October 2001.

   [RFC3103]  Borella, M., Grabelsky, D., Lo, J., and K. Taniguchi,
              "Realm Specific IP: Protocol Specification", RFC 3103,
              DOI 10.17487/RFC3103, October 2001.

   [RFC3424]  Daigle, L., Ed. and IAB, "IAB Considerations for
              UNilateral Self-Address Fixing (UNSAF) Across Network
              Address Translation", RFC 3424, DOI 10.17487/RFC3424,
              November 2002.

   [RFC3550]  Schulzrinne, H., Casner, S., Frederick, R., and V.
              Jacobson, "RTP: A Transport Protocol for Real-Time
              Applications", STD 64, RFC 3550, DOI 10.17487/RFC3550,
              July 2003.

   [RFC3711]  Baugher, M., McGrew, D., Naslund, M., Carrara, E., and K.
              Norrman, "The Secure Real-time Transport Protocol (SRTP)",
              RFC 3711, DOI 10.17487/RFC3711, March 2004.

   [RFC3056]  Carpenter, B. and K. Moore, "Connection of IPv6 Domains
              via IPv4 Clouds", RFC 3056, DOI 10.17487/RFC3056, February
              2001.

   [RFC3389]  Zopf, R., "Real-time Transport Protocol (RTP) Payload for
              Comfort Noise (CN)", RFC 3389, DOI 10.17487/RFC3389,
              September 2002.

   [RFC3879]  Huitema, C. and B. Carpenter, "Deprecating Site Local
              Addresses", RFC 3879, DOI 10.17487/RFC3879, September
              2004.

   [RFC4038]  Shin, M-K., Ed., Hong, Y-G., Hagino, J., Savola, P., and
              E. Castro, "Application Aspects of IPv6 Transition",
              RFC 4038, DOI 10.17487/RFC4038, March 2005.

   [RFC4091]  Camarillo, G. and J. Rosenberg, "The Alternative Network
              Address Types (ANAT) Semantics for the Session Description
              Protocol (SDP) Grouping Framework", RFC 4091,
              DOI 10.17487/RFC4091, June 2005.

   [RFC4092]  Camarillo, G. and J. Rosenberg, "Usage of the Session
              Description Protocol (SDP) Alternative Network Address
              Types (ANAT) Semantics in the Session Initiation Protocol
              (SIP)", RFC 4092, DOI 10.17487/RFC4092, June 2005.

   [RFC4291]  Hinden, R. and S. Deering, "IP Version 6 Addressing
              Architecture", RFC 4291, DOI 10.17487/RFC4291, February
              2006.

   [RFC4566]  Handley, M., Jacobson, V., and C. Perkins, "SDP: Session
              Description Protocol", RFC 4566, DOI 10.17487/RFC4566,
              July 2006.

   [RFC2475]  Blake, S., Black, D., Carlson, M., Davies, E., Wang, Z.,
              and W. Weiss, "An Architecture for Differentiated
              Services", RFC 2475, DOI 10.17487/RFC2475, December 1998.

   [RFC1918]  Rekhter, Y., Moskowitz, B., Karrenberg, D., de Groot, G.,
              and E. Lear, "Address Allocation for Private Internets",
              BCP 5, RFC 1918, DOI 10.17487/RFC1918, February 1996.

   [RFC4787]  Audet, F., Ed. and C. Jennings, "Network Address
              Translation (NAT) Behavioral Requirements for Unicast
              UDP", BCP 127, RFC 4787, DOI 10.17487/RFC4787, January
              2007.

   [I-D.ietf-avt-rtp-no-op]
              Andreasen, F., "A No-Op Payload Format for RTP",
              draft-ietf-avt-rtp-no-op-04 (work in progress), May 2007.

   [RFC5761]  Perkins, C. and M. Westerlund, "Multiplexing RTP Data and
              Control Packets on a Single Port", RFC 5761,
              DOI 10.17487/RFC5761, April 2010.

   [RFC4103]  Hellstrom, G. and P. Jones, "RTP Payload for Text
              Conversation", RFC 4103, DOI 10.17487/RFC4103, June 2005.

   [RFC5245]  Rosenberg, J., "Interactive Connectivity Establishment
              (ICE): A Protocol for Network Address Translator (NAT)
              Traversal for Offer/Answer Protocols", RFC 5245,
              DOI 10.17487/RFC5245, April 2010.

   [RFC5382]  Guha, S., Ed., Biswas, K., Ford, B., Sivakumar, S., and P.
              Srisuresh, "NAT Behavioral Requirements for TCP", BCP 142,
              RFC 5382, DOI 10.17487/RFC5382, October 2008.

   [RFC6080]  Petrie, D. and S. Channabasappa, Ed., "A Framework for
              Session Initiation Protocol User Agent Profile Delivery",
              RFC 6080, DOI 10.17487/RFC6080, March 2011.

   [RFC6146]  Bagnulo, M., Matthews, P., and I. van Beijnum, "Stateful
              NAT64: Network Address and Protocol Translation from IPv6
              Clients to IPv4 Servers", RFC 6146, DOI 10.17487/RFC6146,
              April 2011.

   [RFC6147]  Bagnulo, M., Sullivan, A., Matthews, P., and I. van
              Beijnum, "DNS64: DNS Extensions for Network Address
              Translation from IPv6 Clients to IPv4 Servers", RFC 6147,
              DOI 10.17487/RFC6147, April 2011.

   [RFC6544]  Rosenberg, J., Keranen, A., Lowekamp, B., and A.
              Roach, "TCP Candidates with Interactive Connectivity
              Establishment (ICE)", RFC 6544, DOI 10.17487/RFC6544,
              March 2012.

   [RFC7050]  Savolainen, T., Korhonen, J., and D. Wing, "Discovery of
              the IPv6 Prefix Used for IPv6 Address Synthesis",
              RFC 7050, DOI 10.17487/RFC7050, November 2013.

   [I-D.ietf-mmusic-ice-sip-sdp]
              Petit-Huguenin, M., Keranen, A., and S. Nandakumar, "Using
              Interactive Connectivity Establishment (ICE) with Session
              Description Protocol (SDP) offer/answer and Session
              Initiation Protocol (SIP)",
              draft-ietf-mmusic-ice-sip-sdp-07 (work in progress),
              October 2015.

   [I-D.ietf-6man-ipv6-address-generation-privacy]
              Cooper, A., Gont, F., and D. Thaler, "Privacy
              Considerations for IPv6 Address Generation Mechanisms",
              draft-ietf-6man-ipv6-address-generation-privacy-08 (work
              in progress), September 2015.

Appendix A. Lite and Full Implementations

   ICE allows for two types of implementations. A full implementation
   supports the controlling and controlled roles in a session, and can
   also perform address gathering. In contrast, a lite implementation
   is a minimalist implementation that does little but respond to STUN
   checks.

   Because ICE requires both endpoints to support it in order to bring
   benefits to either endpoint, incremental deployment of ICE in a
   network is more complicated. Many sessions involve an endpoint that
   is, by itself, not behind a NAT and not one that would worry about
   NAT traversal. A very common case is to have one endpoint that
   requires NAT traversal (such as a VoIP hard phone or soft phone) make
   a call to one of these devices. Even if the phone supports a full
   ICE implementation, ICE won't be used at all if the other device
   doesn't support it. The lite implementation allows for a low-cost
   entry point for these devices.
   Once they support the lite
   implementation, full implementations can connect to them and get the
   full benefits of ICE.

   Consequently, a lite implementation is only appropriate for devices
   that will *always* be connected to the public Internet and have a
   public IP address at which they can receive packets from any
   correspondent. ICE will not function when a lite implementation is
   placed behind a NAT.

   ICE allows a lite implementation to have a single IPv4 host candidate
   and several IPv6 addresses. In that case, candidate pairs are
   selected by the controlling agent using a static algorithm, such as
   the one in RFC 6724, which is recommended by this specification.
   However, static mechanisms for address selection are always prone to
   error, since they cannot ever reflect the actual topology and can
   never provide actual guarantees on connectivity. They are always
   heuristics. Consequently, if an agent is implementing ICE just to
   select between its IPv4 and IPv6 addresses, and none of its IP
   addresses are behind NAT, usage of full ICE is still RECOMMENDED in
   order to provide the most robust form of address selection possible.

   It is important to note that the lite implementation was added to
   this specification to provide a stepping stone to full
   implementation. Even for devices that are always connected to the
   public Internet with just a single IPv4 address, a full
   implementation is preferable if achievable. A full implementation
   will reduce call setup times, since ICE's aggressive mode can be
   used. Full implementations also obtain the security benefits of ICE
   unrelated to NAT traversal; in particular, the voice hammer attack
   described in Section 14 is prevented only for full implementations,
   not lite.
   Finally, it is often the case that a device that finds itself with a
   public address today will be placed in a network tomorrow where it
   will be behind a NAT.  It is difficult to definitively know, over the
   lifetime of a device or product, that it will always be used on the
   public Internet.  Full implementation provides assurance that
   communications will always work.

Appendix B.  Design Motivations

   ICE contains a number of normative behaviors that may themselves be
   simple, but derive from complicated or non-obvious thinking or use
   cases that merit further discussion.  Since these design motivations
   are not necessary to understand for purposes of implementation, they
   are discussed here in an appendix to the specification.  This section
   is non-normative.

B.1.  Pacing of STUN Transactions

   STUN transactions used to gather candidates and to verify
   connectivity are paced out at an approximate rate of one new
   transaction every Ta milliseconds.  Each transaction, in turn, has a
   retransmission timer RTO that is a function of Ta as well.  Why are
   these transactions paced, and why are these formulas used?

   Sending of these STUN requests will often have the effect of creating
   bindings on NAT devices between the client and the STUN servers.
   Experience has shown that many NAT devices have upper limits on the
   rate at which they will create new bindings.  Experiments have shown
   that once every 20 ms is well supported, but not much lower than
   that.  This is why Ta has a lower bound of 20 ms.  Furthermore,
   transmission of these packets on the network makes use of bandwidth
   and needs to be rate limited by the agent.  Deployments based on
   earlier draft versions of [RFC5245] tended to overload rate-
   constrained access links and perform poorly overall, in addition to
   negatively impacting the network.
   As a consequence, the pacing ensures that the NAT device does not get
   overloaded and that traffic is kept at a reasonable rate.

   The definition of a "reasonable" rate is that STUN should not use
   more bandwidth than the RTP itself will use, once media starts
   flowing.  The formula for Ta is designed so that, if a STUN packet
   were sent every Ta milliseconds, it would consume the same amount of
   bandwidth as RTP packets, summed across all media streams.  Of
   course, STUN has retransmits, and the desire is to pace those as
   well.  For this reason, RTO is set such that the first retransmit on
   the first transaction happens just as the first STUN request on the
   last transaction occurs.  Pictorially:

         First Packets            Retransmits

               |                       |
               |                       |
        -------+------          -------+------
       /              \        /              \
      /                \      /                \

      +--+    +--+    +--+    +--+    +--+    +--+
      |A1|    |B1|    |C1|    |A2|    |B2|    |C2|
      +--+    +--+    +--+    +--+    +--+    +--+

   ---+-------+-------+-------+-------+-------+------------ Time
      0       Ta      2Ta     3Ta     4Ta     5Ta

   In this picture, there are three transactions that will be sent (for
   example, in the case of candidate gathering, there are three host
   candidate/STUN server pairs).  These are transactions A, B, and C.
   The retransmit timer is set so that the first retransmission on the
   first transaction (packet A2) is sent at time 3Ta.

   Subsequent retransmits after the first will occur even less
   frequently than Ta milliseconds apart, since STUN uses an exponential
   back-off on its retransmissions.

B.2.  Candidates with Multiple Bases

   Section 4.1.3 talks about eliminating candidates that have the same
   transport address and base.  However, candidates with the same
   transport addresses but different bases are not redundant.  When can
   an agent have two candidates that have the same IP address and port,
   but different bases?
   Consider the topology of Figure 11:

   +----------+
   | STUN Srvr|
   +----------+
        |
        |
     -------
   //       \\
  |           |
  |  B:net10  |
  |           |
   \\       //
     -------
        |
        |
   +----------+
   |   NAT    |
   +----------+
        |
        |
     -------
   //       \\
  |     A     |
  |192.168/16 |
  |           |
   \\       //
     -------
        |
        |
        |192.168.1.100      -----
   +----------+           //     \\             +----------+
   |          |          |         |            |          |
   | Initiator|----------| C:net10 |------------| Responder|
   |          |10.0.1.100|         | 10.0.1.101 |          |
   +----------+           \\     //             +----------+
                            -----

          Figure 11: Identical Candidates with Different Bases

   In this case, the initiating agent is multihomed.  It has one IP
   address, 10.0.1.100, on network C, which is a net 10 private network.
   The responding agent is on this same network.  The initiating agent
   is also connected to network A, which is 192.168/16 and has an IP
   address of 192.168.1.100 on this network.  There is a NAT on this
   network, natting into network B, which is another net 10 private
   network, but not connected to network C.  There is a STUN server on
   network B.

   The initiating agent obtains a host candidate on its IP address on
   network C (10.0.1.100:2498) and a host candidate on its IP address on
   network A (192.168.1.100:3344).  It performs a STUN query to its
   configured STUN server from 192.168.1.100:3344.  This query passes
   through the NAT, which happens to assign the binding 10.0.1.100:2498.
   The STUN server reflects this in the STUN Binding response.  Now, the
   initiating agent has obtained a server reflexive candidate with a
   transport address that is identical to a host candidate
   (10.0.1.100:2498).  However, the server reflexive candidate has a
   base of 192.168.1.100:3344, and the host candidate has a base of
   10.0.1.100:2498.

B.3.  Purpose of the Related Address and Related Port Attributes

   The candidate attribute contains two values that are not used at all
   by ICE itself -- related address and related port.  Why are they
   present?

   There are two motivations for their inclusion.  The first is
   diagnostic.  It is very useful to know the relationship between the
   different types of candidates.  By including it, an agent can know
   which relayed candidate is associated with which reflexive candidate,
   which in turn is associated with a specific host candidate.  When
   checks for one candidate succeed and not for others, this provides
   useful diagnostics on what is going on in the network.

   The second reason has to do with off-path Quality of Service (QoS)
   mechanisms.  When ICE is used in environments such as PacketCable
   2.0, proxies will, in addition to performing normal SIP operations,
   inspect the SDP in SIP messages, and extract the IP address and port
   for media traffic.  They can then interact, through policy servers,
   with access routers in the network, to establish guaranteed QoS for
   the media flows.  This QoS is provided by classifying the RTP traffic
   based on 5-tuple, and then providing it a guaranteed rate, or marking
   its Diffserv codepoints appropriately.  When a residential NAT is
   present, and a relayed candidate gets selected for media, this
   relayed candidate will be a transport address on an actual TURN
   server.  That address says nothing about the actual transport address
   in the access router that would be used to classify packets for QoS
   treatment.  Rather, the server reflexive candidate towards the TURN
   server is needed.  By carrying the translation in the SDP, the proxy
   can use that transport address to request QoS from the access router.

B.4.  Importance of the STUN Username

   ICE requires the usage of message integrity with STUN using its
   short-term credential functionality.  The actual short-term
   credential is formed by exchanging username fragments in the
   candidate exchange.  The need for this mechanism goes beyond just
   security; it is actually required for correct operation of ICE in the
   first place.

   Consider agents L, R, and Z.  L and R are within private enterprise
   1, which is using 10.0.0.0/8.  Z is within private enterprise 2,
   which is also using 10.0.0.0/8.  As it turns out, R and Z both have
   IP address 10.0.1.1.  L sends candidates to Z.  Z, in response, sends
   L its host candidates.  In this case, those candidates are
   10.0.1.1:8866 and 10.0.1.1:8877.  As it turns out, R is in a session
   at that same time, and is also using 10.0.1.1:8866 and 10.0.1.1:8877
   as host candidates.  This means that R is prepared to accept STUN
   messages on those ports, just as Z is.  L will send a STUN request to
   10.0.1.1:8866 and another to 10.0.1.1:8877.  However, these do not go
   to Z as expected.  Instead, they go to R!  If R just replied to them,
   L would believe it has connectivity to Z, when in fact it has
   connectivity to a completely different user, R.  To fix this, the
   STUN short-term credential mechanisms are used.  The username
   fragments are sufficiently random that it is highly unlikely that R
   would be using the same values as Z.  Consequently, R would reject
   the STUN request since the credentials were invalid.  In essence, the
   STUN username fragments provide a form of transient host identifiers,
   bound to a particular session established as part of the candidate
   exchange.

   An unfortunate consequence of the non-uniqueness of IP addresses is
   that, in the above example, R might not even be an ICE agent.
   It could be any host, and the port to which the STUN packet is
   directed could be any ephemeral port on that host.  If there is an
   application listening on this socket for packets, and it is not
   prepared to handle malformed packets for whatever protocol is in use,
   the operation of that application could be affected.  Fortunately,
   since the ports exchanged are ephemeral and usually drawn from the
   dynamic or registered range, the odds are good that the port is not
   used to run a server on host R, but rather is the agent side of some
   protocol.  This decreases the probability of hitting an allocated
   port, due to the transient nature of port usage in this range.
   However, the possibility of a problem does exist, and network
   deployers should be prepared for it.  Note that this is not a problem
   specific to ICE; stray packets can arrive at a port at any time for
   any type of protocol, especially ones on the public Internet.  As
   such, this requirement is just restating a general design guideline
   for Internet applications -- be prepared for unknown packets on any
   port.

B.5.  The Candidate Pair Priority Formula

   The priority for a candidate pair has an odd form.  It is:

      pair priority = 2^32*MIN(G,D) + 2*MAX(G,D) + (G>D?1:0)

   Why is this?  When the candidate pairs are sorted based on this
   value, the resulting sorting has the MAX/MIN property.  This means
   that the pairs are first sorted based on decreasing value of the
   minimum of the two priorities.  For pairs that have the same value of
   the minimum priority, the maximum priority is used to sort amongst
   them.  If the max and the min priorities are the same, the
   controlling agent's priority is used as the tie-breaker in the last
   part of the expression.
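
   The formula and the ordering it produces can be illustrated with a
   minimal, non-normative sketch.  It assumes that G and D are the
   candidate priorities supplied by the controlling and controlled
   agents, respectively, as defined in the main body of the
   specification rather than in this appendix.  The function names
   pair_priority and sort_pairs and the sample priorities are
   hypothetical and chosen only for illustration.

```python
# Non-normative sketch of the B.5 pair priority formula.
# Assumption: g is the controlling agent's candidate priority and
# d is the controlled agent's. All names and values are illustrative.

def pair_priority(g: int, d: int) -> int:
    """Return 2^32*MIN(G,D) + 2*MAX(G,D) + (G>D?1:0)."""
    return (1 << 32) * min(g, d) + 2 * max(g, d) + (1 if g > d else 0)


def sort_pairs(pairs):
    """Sort (g, d) tuples from highest to lowest pair priority."""
    return sorted(pairs, key=lambda p: pair_priority(*p), reverse=True)


# Made-up candidate priorities, small enough to check by hand.
pairs = [(100, 50), (60, 60), (50, 100), (200, 10)]
ordered = sort_pairs(pairs)

# (60, 60) comes first because its minimum (60) is the largest.
# (100, 50) and (50, 100) share min and max, so the final term
# (G > D) places (100, 50) ahead. (200, 10) is last because its
# minimum is the smallest, despite having the largest maximum.
for g, d in ordered:
    print(g, d, pair_priority(g, d))
```

   In this sketch the pair with the largest single priority, (200, 10),
   sorts last, which shows that the minimum of the two priorities
   dominates the ordering.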
   The factor of 2^32 is used since the priority of a single candidate
   is always less than 2^32, resulting in the pair priority being a
   "concatenation" of the two component priorities.  This creates the
   MAX/MIN sorting.  MAX/MIN ensures that, for a particular agent, a
   lower-priority candidate is never used until all higher-priority
   candidates have been tried.

B.6.  Why Are Keepalives Needed?

   Once media begins flowing on a candidate pair, it is still necessary
   to keep the bindings alive at intermediate NATs for the duration of
   the session.  Normally, the media stream packets themselves (e.g.,
   RTP) meet this objective.  However, several cases merit further
   discussion.  Firstly, in some RTP usages, such as SIP, the media
   streams can be "put on hold".  This is accomplished by using the SDP
   "sendonly" or "inactive" attributes, as defined in RFC 3264
   [RFC3264].  RFC 3264 directs implementations to cease transmission of
   media in these cases.  However, doing so may cause NAT bindings to
   time out, and media won't be able to come off hold.

   Secondly, some RTP payload formats, such as the payload format for
   text conversation [RFC4103], may send packets so infrequently that
   the interval exceeds the NAT binding timeouts.

   Thirdly, if silence suppression is in use, long periods of silence
   may cause media transmission to cease sufficiently long for NAT
   bindings to time out.

   For these reasons, the media packets themselves cannot be relied
   upon.  ICE defines a simple periodic keepalive utilizing STUN Binding
   indications.  This makes its bandwidth requirements highly
   predictable, and thus amenable to QoS reservations.

B.7.  Why Prefer Peer Reflexive Candidates?

   Section 4.1.2 describes procedures for computing the priority of a
   candidate based on its type and local preferences.
   That section requires that the type preference for peer reflexive
   candidates always be higher than server reflexive.  Why is that?  The
   reason has to do with the security considerations in Section 14.  It
   is much easier for an attacker to cause an agent to use a false
   server reflexive candidate than it is for an attacker to cause an
   agent to use a false peer reflexive candidate.  Consequently, attacks
   against address gathering with Binding requests are thwarted by ICE
   by preferring the peer reflexive candidates.

B.8.  Why Are Binding Indications Used for Keepalives?

   Media keepalives are described in Section 9.  These keepalives make
   use of STUN when both endpoints are ICE capable.  However, rather
   than using a Binding request transaction (which generates a
   response), the keepalives use an Indication.  Why is that?

   The primary reason has to do with network QoS mechanisms.  Once media
   begins flowing, network elements will assume that the media stream
   has a fairly regular structure, making use of periodic packets at
   fixed intervals, with the possibility of jitter.  If an agent is
   sending media packets, and then receives a Binding request, it would
   need to generate a response packet along with its media packets.
   This will increase the actual bandwidth requirements for the 5-tuple
   carrying the media packets, and introduce jitter in the delivery of
   those packets.  Analysis has shown that this is a concern in certain
   layer 2 access networks that use fairly tight packet schedulers for
   media.

   Additionally, using a Binding Indication allows integrity to be
   disabled, allowing for better performance.  This is useful for large-
   scale endpoints, such as PSTN gateways and SBCs.
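
   As a rough, non-normative illustration of how small such a keepalive
   can be, the sketch below builds a STUN Binding Indication with no
   attributes.  The wire values (message type 0x0011, magic cookie
   0x2112A442, 96-bit transaction ID) come from the STUN specification
   [RFC5389], not from this appendix.  The function name
   build_binding_indication is hypothetical, and the sketch
   deliberately omits all attributes, including any FINGERPRINT an
   implementation might add for demultiplexing.

```python
# Non-normative sketch: a bare STUN Binding Indication (header only).
# Header layout and constants are taken from RFC 5389; the function
# name is hypothetical and no attributes are included.
import os
import struct

MAGIC_COOKIE = 0x2112A442
BINDING_INDICATION = 0x0011  # Binding method, indication class


def build_binding_indication() -> bytes:
    """Return a 20-byte STUN header with a random transaction ID."""
    transaction_id = os.urandom(12)  # 96-bit transaction ID
    # Fields: message type, message length (0, no attributes), cookie.
    header = struct.pack("!HHI", BINDING_INDICATION, 0, MAGIC_COOKIE)
    return header + transaction_id


msg = build_binding_indication()
print(len(msg), msg.hex())
```

   Because an indication elicits no response and, in this sketch,
   carries no integrity attribute, the receiver has no HMAC to verify
   and no reply to schedule alongside its media packets.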

Authors' Addresses

   Ari Keranen
   Ericsson
   Hirsalantie 11
   02420 Jorvas
   Finland

   Email: ari.keranen@ericsson.com


   Jonathan Rosenberg
   jdrosen.net
   Monmouth, NJ
   US

   Email: jdrosen@jdrosen.net
   URI:   http://www.jdrosen.net