MMUSIC                                                        A. Keranen
Internet-Draft                                                  Ericsson
Obsoletes: 5245 (if approved)                               J. Rosenberg
Intended status: Standards Track                             jdrosen.net
Expires: August 29, 2013                               February 25, 2013


  Interactive Connectivity Establishment (ICE): A Protocol for Network
     Address Translator (NAT) Traversal for Offer/Answer Protocols
                   draft-keranen-mmusic-rfc5245bis-01

Abstract

   This document describes a protocol for Network Address Translator
   (NAT) traversal for UDP-based multimedia sessions established with
   the offer/answer model. This protocol is called Interactive
   Connectivity Establishment (ICE). ICE makes use of the Session
   Traversal Utilities for NAT (STUN) protocol and its extension,
   Traversal Using Relays around NAT (TURN). ICE can be used by any
   protocol utilizing the offer/answer model, such as the Session
   Initiation Protocol (SIP).

   This document obsoletes RFC 5245.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on August 29, 2013.

Copyright Notice

   Copyright (c) 2013 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

   This document may contain material from IETF Documents or IETF
   Contributions published or made publicly available before November
   10, 2008. The person(s) controlling the copyright in some of this
   material may not have granted the IETF Trust the right to allow
   modifications of such material outside the IETF Standards Process.
   Without obtaining an adequate license from the person(s) controlling
   the copyright in such materials, this document may not be modified
   outside the IETF Standards Process, and derivative works of it may
   not be created outside the IETF Standards Process, except to format
   it for publication as an RFC or to translate it into languages other
   than English.

Table of Contents

   1. Introduction
   2. Overview of ICE
     2.1. Gathering Candidate Addresses
     2.2. Connectivity Checks
     2.3. Sorting Candidates
     2.4. Frozen Candidates
     2.5. Security for Checks
     2.6. Concluding ICE
     2.7. Lite Implementations
     2.8. Usages of ICE
   3. Terminology
   4. Sending the Initial Offer
     4.1. Full Implementation Requirements
       4.1.1. Gathering Candidates
         4.1.1.1. Host Candidates
         4.1.1.2. Server Reflexive and Relayed Candidates
         4.1.1.3. Computing Foundations
         4.1.1.4. Keeping Candidates Alive
       4.1.2. Prioritizing Candidates
         4.1.2.1. Recommended Formula
         4.1.2.2. Guidelines for Choosing Type and Local Preferences
       4.1.3. Eliminating Redundant Candidates
     4.2. Lite Implementation Requirements
     4.3. Encoding the Offer
   5. Receiving the Initial Offer
     5.1. Verifying ICE Support
     5.2. Determining Role
     5.3. Gathering Candidates
     5.4. Prioritizing Candidates
     5.5. Encoding the Answer
     5.6. Forming the Check Lists
       5.6.1. Forming Candidate Pairs
       5.6.2. Computing Pair Priority and Ordering Pairs
       5.6.3. Pruning the Pairs
       5.6.4. Computing States
     5.7. Scheduling Checks
   6. Receipt of the Initial Answer
     6.1. Verifying ICE Support
     6.2. Determining Role
     6.3. Forming the Check List
     6.4. Performing Ordinary Checks
   7. Performing Connectivity Checks
     7.1. STUN Client Procedures
       7.1.1. Creating Permissions for Relayed Candidates
       7.1.2. Sending the Request
         7.1.2.1. PRIORITY and USE-CANDIDATE
         7.1.2.2. ICE-CONTROLLED and ICE-CONTROLLING
         7.1.2.3. Forming Credentials
         7.1.2.4. DiffServ Treatment
       7.1.3. Processing the Response
         7.1.3.1. Failure Cases
         7.1.3.2. Success Cases
           7.1.3.2.1. Discovering Peer Reflexive Candidates
           7.1.3.2.2. Constructing a Valid Pair
           7.1.3.2.3. Updating Pair States
           7.1.3.2.4. Updating the Nominated Flag
         7.1.3.3. Check List and Timer State Updates
     7.2. STUN Server Procedures
       7.2.1. Additional Procedures for Full Implementations
         7.2.1.1. Detecting and Repairing Role Conflicts
         7.2.1.2. Computing Mapped Address
         7.2.1.3. Learning Peer Reflexive Candidates
         7.2.1.4. Triggered Checks
         7.2.1.5. Updating the Nominated Flag
       7.2.2. Additional Procedures for Lite Implementations
   8. Concluding ICE Processing
     8.1. Procedures for Full Implementations
       8.1.1. Nominating Pairs
         8.1.1.1. Regular Nomination
         8.1.1.2. Aggressive Nomination
       8.1.2. Updating States
     8.2. Procedures for Lite Implementations
       8.2.1. Peer Is Full
       8.2.2. Peer Is Lite
     8.3. Freeing Candidates
       8.3.1. Full Implementation Procedures
       8.3.2. Lite Implementation Procedures
   9. Keepalives
   10. Media Handling
     10.1. Sending Media
       10.1.1. Procedures for Full Implementations
       10.1.2. Procedures for Lite Implementations
       10.1.3. Procedures for All Implementations
     10.2. Receiving Media
   11. Extensibility Considerations
   12. Setting Ta and RTO
     12.1. RTP Media Streams
     12.2. Non-RTP Sessions
   13. Example
   14. Security Considerations
     14.1. Attacks on Connectivity Checks
     14.2. Attacks on Server Reflexive Address Gathering
     14.3. Attacks on Relayed Candidate Gathering
     14.4. Insider Attacks
       14.4.1. STUN Amplification Attack
   15. STUN Extensions
     15.1. New Attributes
     15.2. New Error Response Codes
   16. Operational Considerations
     16.1. NAT and Firewall Types
     16.2. Bandwidth Requirements
       16.2.1. STUN and TURN Server Capacity Planning
       16.2.2. Gathering and Connectivity Checks
       16.2.3. Keepalives
     16.3. ICE and ICE-lite
     16.4. Troubleshooting and Performance Management
     16.5. Endpoint Configuration
   17. IANA Considerations
     17.1. STUN Attributes
     17.2. STUN Error Responses
   18. IAB Considerations
     18.1. Problem Definition
     18.2. Exit Strategy
     18.3. Brittleness Introduced by ICE
     18.4. Requirements for a Long-Term Solution
     18.5. Issues with Existing NAPT Boxes
   19. Changes from RFC 5245
   20. Acknowledgements
   21. References
     21.1. Normative References
     21.2. Informative References
   Appendix A. Lite and Full Implementations
   Appendix B. Design Motivations
     B.1. Pacing of STUN Transactions
     B.2. Candidates with Multiple Bases
     B.3. Purpose of the Related Address and Related Port Attributes
     B.4. Importance of the STUN Username
     B.5. The Candidate Pair Priority Formula
     B.6. Why Are Keepalives Needed?
     B.7. Why Prefer Peer Reflexive Candidates?
     B.8. Why Are Binding Indications Used for Keepalives?
   Authors' Addresses

1. Introduction

   RFC 3264 [RFC3264] defines a two-phase exchange of Session
   Description Protocol (SDP) messages [RFC4566] for the purposes of
   establishment of multimedia sessions.
   This offer/answer mechanism is used by protocols such as the Session
   Initiation Protocol (SIP) [RFC3261].

   Protocols using offer/answer are difficult to operate through Network
   Address Translators (NATs). Because their purpose is to establish a
   flow of media packets, they tend to carry the IP addresses and ports
   of media sources and sinks within their messages, which is known to
   be problematic through NAT [RFC3235]. The protocols also seek to
   create a media flow directly between participants, so that there is
   no application layer intermediary between them. This is done to
   reduce media latency, decrease packet loss, and reduce the
   operational costs of deploying the application. However, this is
   difficult to accomplish through NAT. A full treatment of the reasons
   for this is beyond the scope of this specification.

   Numerous solutions have been defined for allowing these protocols to
   operate through NAT. These include Application Layer Gateways
   (ALGs), the Middlebox Control Protocol [RFC3303], the original Simple
   Traversal of UDP Through NAT (STUN) [RFC3489] specification, and
   Realm Specific IP [RFC3102] [RFC3103] along with session description
   extensions needed to make them work, such as the Session Description
   Protocol (SDP) [RFC4566] attribute for the Real Time Control Protocol
   (RTCP) [RFC3605]. Unfortunately, these techniques all have pros and
   cons that make each one optimal in some network topologies but a
   poor choice in others. The result is that administrators and
   implementors are making assumptions about the topologies of the
   networks in which their solutions will be deployed. This introduces
   complexity and brittleness into the system. What is needed is a
   single solution that is flexible enough to work well in all
   situations.

   This specification defines Interactive Connectivity Establishment
   (ICE) as a technique for NAT traversal for UDP-based media streams
   (though ICE has been extended to handle other transport protocols,
   such as TCP [RFC6544]) established by the offer/answer model. ICE is
   an extension to the offer/answer model, and works by including a
   multiplicity of IP addresses and ports in the offers and answers,
   which are then tested for connectivity by peer-to-peer connectivity
   checks. The IP addresses and ports included in the offer and answer
   are discovered, and the connectivity checks are performed, using the
   Session Traversal Utilities for NAT (STUN) specification [RFC5389].
   ICE also makes use of Traversal Using Relays around NAT (TURN)
   [RFC5766], an extension to STUN. Because ICE exchanges a
   multiplicity of IP addresses and ports for each media stream, it
   also allows for address selection for multihomed and dual-stack
   hosts, and for this reason it deprecates [RFC4091] and [RFC4092].

2. Overview of ICE

   In a typical ICE deployment, we have two endpoints (known as AGENTS
   in RFC 3264 terminology) that want to communicate. They are able to
   communicate indirectly via some signaling protocol (such as SIP), by
   which they can perform an offer/answer exchange. Note that ICE is
   not intended for NAT traversal for the signaling protocol, which is
   assumed to be provided via another mechanism. At the beginning of
   the ICE process, the agents are ignorant of their own topologies. In
   particular, they might or might not be behind a NAT (or multiple
   tiers of NATs). ICE allows the agents to discover enough information
   about their topologies to potentially find one or more paths by which
   they can communicate.

   Figure 1 shows a typical environment for ICE deployment. The two
   endpoints are labelled L and R (for left and right, which helps
   visualize call flows).
   Both L and R are behind their own respective NATs though they may
   not be aware of it. The type of NAT and its properties are also
   unknown. Agents L and R are capable of engaging in an offer/answer
   exchange, whose purpose is to set up a media session between L and
   R. Typically, this exchange will occur through a signaling (e.g.,
   SIP) server.

   In addition to the agents, a signaling server and NATs, ICE is
   typically used in concert with STUN or TURN servers in the network.
   Each agent can have its own STUN or TURN server, or they can be the
   same.

                              +---------+
            +--------+        |Signaling|         +--------+
            | STUN   |        |Server   |         | STUN   |
            | Server |        +---------+         | Server |
            +--------+       /           \        +--------+
                            /             \
                           /               \
                          / <- Signaling -> \
                         /                   \
                  +--------+               +--------+
                  |  NAT   |               |  NAT   |
                  +--------+               +--------+
                    /                             \
                   /                               \
               +-------+                       +-------+
               | Agent |                       | Agent |
               |   L   |                       |   R   |
               +-------+                       +-------+

                     Figure 1: ICE Deployment Scenario

   The basic idea behind ICE is as follows: each agent has a variety of
   candidate TRANSPORT ADDRESSES (combination of IP address and port for
   a particular transport protocol, which is always UDP in this
   specification) it could use to communicate with the other agent.
   These might include:

   o A transport address on a directly attached network interface

   o A translated transport address on the public side of a NAT (a
     "server reflexive" address)

   o A transport address allocated from a TURN server (a "relayed
     address")

   Potentially, any of L's candidate transport addresses can be used to
   communicate with any of R's candidate transport addresses. In
   practice, however, many combinations will not work. For instance, if
   L and R are both behind NATs, their directly attached interface
   addresses are unlikely to be able to communicate directly (this is
   why ICE is needed, after all!).
   The purpose of ICE is to discover which pairs of addresses will
   work. The way that ICE does this is to systematically try all
   possible pairs (in a carefully sorted order) until it finds one or
   more that work.

2.1. Gathering Candidate Addresses

   In order to execute ICE, an agent has to identify all of its address
   candidates. A CANDIDATE is a transport address -- a combination of
   IP address and port for a particular transport protocol (with only
   UDP specified here). This document defines three types of
   candidates, some derived from physical or logical network interfaces,
   others discoverable via STUN and TURN. Naturally, one viable
   candidate is a transport address obtained directly from a local
   interface. Such a candidate is called a HOST CANDIDATE. The local
   interface could be Ethernet or WiFi, or it could be one that is
   obtained through a tunnel mechanism, such as a Virtual Private
   Network (VPN) or Mobile IP (MIP). In all cases, such a network
   interface appears to the agent as a local interface from which ports
   (and thus candidates) can be allocated.

   If an agent is multihomed, it obtains a candidate from each IP
   address. Depending on the location of the PEER (the other agent in
   the session) on the IP network relative to the agent, the agent may
   be reachable by the peer through one or more of those IP addresses.
   Consider, for example, an agent that has a local IP address on a
   private net 10 network (I1), and a second connected to the public
   Internet (I2). A candidate from I1 will be directly reachable when
   communicating with a peer on the same private net 10 network, while a
   candidate from I2 will be directly reachable when communicating with
   a peer on the public Internet. Rather than trying to guess which IP
   address will work prior to sending an offer, the offering agent
   includes both candidates in its offer.

   Next, the agent uses STUN or TURN to obtain additional candidates.
   These come in two flavors: translated addresses on the public side of
   a NAT (SERVER REFLEXIVE CANDIDATES) and addresses on TURN servers
   (RELAYED CANDIDATES). When TURN servers are utilized, both types of
   candidates are obtained from the TURN server. If only STUN servers
   are utilized, only server reflexive candidates are obtained from
   them. The relationship of these candidates to the host candidate is
   shown in Figure 2. In this figure, both types of candidates are
   discovered using TURN. In the figure, the notation X:x means IP
   address X and UDP port x.

                 To Internet

                     |
                     |
                     |  /------------  Relayed
                 Y:y | /               Address
                 +--------+
                 |        |
                 |  TURN  |
                 | Server |
                 |        |
                 +--------+
                     |
                     |
                     | /------------  Server
              X1':x1'|/               Reflexive
               +------------+         Address
               |    NAT     |
               +------------+
                     |
                     | /------------  Local
                 X:x |/               Address
                 +--------+
                 |        |
                 | Agent  |
                 |        |
                 +--------+

                     Figure 2: Candidate Relationships

   When the agent sends the TURN Allocate request from IP address and
   port X:x, the NAT (assuming there is one) will create a binding
   X1':x1', mapping this server reflexive candidate to the host
   candidate X:x. Outgoing packets sent from the host candidate will be
   translated by the NAT to the server reflexive candidate. Incoming
   packets sent to the server reflexive candidate will be translated by
   the NAT to the host candidate and forwarded to the agent. We call
   the host candidate associated with a given server reflexive candidate
   the BASE.

      Note: "Base" refers to the address an agent sends from for a
      particular candidate. Thus, as a degenerate case host candidates
      also have a base, but it's the same as the host candidate.
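
   The relationship between the three candidate types and their bases
   can be sketched as a small data model. This is a minimal
   illustration only: the Candidate record, its field names, and the
   helper functions are hypothetical and not defined by this document.
   The choice of a relayed candidate as its own base follows the
   Terminology section of RFC 5245 and is an assumption here, and the
   addresses are arbitrary example values.

```python
from dataclasses import dataclass
from typing import List, Tuple

# A transport address as used in this document: (IP address, UDP port).
TransportAddress = Tuple[str, int]


@dataclass(frozen=True)
class Candidate:
    """Hypothetical candidate record; names are illustrative only."""
    kind: str                  # "host", "srflx" (server reflexive) or "relay"
    address: TransportAddress  # address the peer would send packets to
    base: TransportAddress     # address the agent sends from


def host_candidate(local: TransportAddress) -> Candidate:
    # Degenerate case from the note above: a host candidate is its own base.
    return Candidate("host", local, base=local)


def candidates_from_allocate(local: TransportAddress,
                             mapped: TransportAddress,
                             relayed: TransportAddress) -> List[Candidate]:
    """Build the candidates learned from one TURN Allocate exchange.

    `local` is X:x, `mapped` is X1':x1' (the NAT binding seen by the
    server) and `relayed` is Y:y (allocated on the TURN server).
    """
    return [
        host_candidate(local),
        # The base of a server reflexive candidate is the host candidate
        # it was derived from.
        Candidate("srflx", mapped, base=local),
        # Assumption (RFC 5245 Terminology): a relayed candidate is its
        # own base.
        Candidate("relay", relayed, base=relayed),
    ]


if __name__ == "__main__":
    for c in candidates_from_allocate(("10.0.1.1", 8998),
                                      ("192.0.2.3", 45664),
                                      ("198.51.100.1", 50000)):
        print(c.kind, c.address, "base:", c.base)
```

   Keeping the base alongside each candidate makes it straightforward
   to tell which local socket a check for that candidate would be sent
   from.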

   When there are multiple NATs between the agent and the TURN server,
   the TURN request will create a binding on each NAT, but only the
   outermost server reflexive candidate (the one nearest the TURN
   server) will be discovered by the agent. If the agent is not behind
   a NAT, then the base candidate will be the same as the server
   reflexive candidate and the server reflexive candidate is redundant
   and will be eliminated.

   The Allocate request then arrives at the TURN server. The TURN
   server allocates a port y from its local IP address Y, and generates
   an Allocate response, informing the agent of this relayed candidate.
   The TURN server also informs the agent of the server reflexive
   candidate, X1':x1', by copying the source transport address of the
   Allocate request into the Allocate response. The TURN server acts as
   a packet relay, forwarding traffic between L and R. In order to send
   traffic to L, R sends traffic to the TURN server at Y:y, and the TURN
   server forwards that to X1':x1', which passes through the NAT where
   it is mapped to X:x and delivered to L.

   When only STUN servers are utilized, the agent sends a STUN Binding
   request [RFC5389] to its STUN server. The STUN server will inform
   the agent of the server reflexive candidate X1':x1' by copying the
   source transport address of the Binding request into the Binding
   response.

2.2. Connectivity Checks

   Once L has gathered all of its candidates, it orders them from
   highest to lowest priority and sends them to R over the signaling
   channel. The candidates are carried in attributes in the offer. When
   R receives the offer, it performs the same gathering process and
   responds with its own list of candidates. At the end of this
   process, each agent has a complete list of both its candidates and
   its peer's candidates. It pairs them up, resulting in CANDIDATE
   PAIRS.
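
   The ordering and pairing steps just described can be sketched as
   follows. This is a minimal illustration under stated assumptions:
   the Cand record and both function names are hypothetical, the
   priority values are arbitrary examples, and pairing only candidates
   that share a component ID is a simplification of the full pairing
   rules (see Section 5.6.1).

```python
from itertools import product
from typing import List, NamedTuple, Tuple


class Cand(NamedTuple):
    """Hypothetical candidate record used only for this sketch."""
    component: int            # component ID within the media stream
    address: Tuple[str, int]  # (IP address, UDP port)
    priority: int             # numeric priority, higher is preferred


def sort_for_offer(cands: List[Cand]) -> List[Cand]:
    # Order candidates from highest to lowest priority before signaling.
    return sorted(cands, key=lambda c: c.priority, reverse=True)


def form_pairs(local: List[Cand],
               remote: List[Cand]) -> List[Tuple[Cand, Cand]]:
    # Assumption: a local and a remote candidate are paired only when
    # they belong to the same component.
    return [(l, r) for l, r in product(local, remote)
            if l.component == r.component]


if __name__ == "__main__":
    local = [
        Cand(1, ("192.0.2.3", 45664), 1694498815),
        Cand(1, ("10.0.1.1", 8998), 2130706431),
        Cand(2, ("10.0.1.1", 8999), 2130706430),
    ]
    remote = [
        Cand(1, ("198.51.100.7", 3478), 2130706431),
        Cand(2, ("198.51.100.7", 3479), 2130706430),
    ]
    for l, r in form_pairs(sort_for_offer(local), remote):
        print(l.address, "->", r.address)
```

   Each resulting tuple corresponds to one candidate pair on which a
   check would later be scheduled.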
   To see which pairs work, each agent schedules a series of CHECKS.
   Each check is a STUN request/response transaction that the client
   will perform on a particular candidate pair by sending a STUN
   request from the local candidate to the remote candidate.

   The basic principle of the connectivity checks is simple:

   1. Sort the candidate pairs in priority order.

   2. Send checks on each candidate pair in priority order.

   3. Acknowledge checks received from the other agent.

   With both agents performing a check on a candidate pair, the result
   is a 4-way handshake:

             L                        R
             -                        -
             STUN request ->             \  L's
                       <- STUN response  /  check

                        <- STUN request  \  R's
             STUN response ->            /  check

                    Figure 3: Basic Connectivity Check

   It is important to note that the STUN requests are sent to and from
   the exact same IP addresses and ports that will be used for media
   (e.g., RTP and RTCP). Consequently, agents demultiplex STUN and
   RTP/RTCP using contents of the packets, rather than the port on
   which they are received. Fortunately, this demultiplexing is easy to
   do, especially for RTP and RTCP.

   Because a STUN Binding request is used for the connectivity check,
   the STUN Binding response will contain the agent's translated
   transport address on the public side of any NATs between the agent
   and its peer. If this transport address is different from other
   candidates the agent already learned, it represents a new candidate,
   called a PEER REFLEXIVE CANDIDATE, which then gets tested by ICE just
   the same as any other candidate.

   As an optimization, as soon as R gets L's check message, R schedules
   a connectivity check message to be sent to L on the same candidate
   pair. This accelerates the process of finding a valid candidate, and
   is called a TRIGGERED CHECK.

   At the end of this handshake, both L and R know that they can send
   (and receive) messages end-to-end in both directions.

2.3. Sorting Candidates

   Because the algorithm above searches all candidate pairs, if a
   working pair exists it will eventually find it no matter what order
   the candidates are tried in. In order to produce faster (and better)
   results, the candidates are sorted in a specified order. The
   resulting list of sorted candidate pairs is called the CHECK LIST.
   The algorithm is described in Section 4.1.2 but follows two general
   principles:

   o Each agent gives its candidates a numeric priority, which is sent
     along with the candidate to the peer.

   o The local and remote priorities are combined so that each agent
     has the same ordering for the candidate pairs.

   The second property is important for getting ICE to work when there
   are NATs in front of L and R. Frequently, NATs will not allow packets
   in from a host until the agent behind the NAT has sent a packet
   towards that host. Consequently, ICE checks in each direction will
   not succeed until both sides have sent a check through their
   respective NATs.

   The agent works through this check list by sending a STUN request for
   the next candidate pair on the list periodically. These are called
   ORDINARY CHECKS.

   In general, the priority algorithm is designed so that candidates of
   similar type get similar priorities and so that more direct routes
   (that is, through fewer media relays and through fewer NATs) are
   preferred over indirect ones (ones with more media relays and more
   NATs). Within those guidelines, however, agents have a fair amount
   of discretion about how to tune their algorithms.

2.4. Frozen Candidates

   The previous description only addresses the case where the agents
   wish to establish a media session with one COMPONENT (a piece of a
   media stream requiring a single transport address; a media stream may
   require multiple components, each of which has to work for the media
   stream as a whole to work).
   Often (e.g., with RTP and RTCP), the agents actually need to
   establish connectivity for more than one flow.

   The network properties are likely to be very similar for each
   component (especially because RTP and RTCP are sent and received from
   the same IP address). It is usually possible to leverage information
   from one media component in order to determine the best candidates
   for another. ICE does this with a mechanism called "frozen
   candidates".

   Each candidate is associated with a property called its FOUNDATION.
   Two candidates have the same foundation when they are "similar" -- of
   the same type and obtained from the same host candidate and STUN/TURN
   server using the same protocol. Otherwise, their foundation is
   different. A candidate pair has a foundation too, which is just the
   concatenation of the foundations of its two candidates. Initially,
   only the candidate pairs with unique foundations are tested. The
   other candidate pairs are marked "frozen". When the connectivity
   checks for a candidate pair succeed, the other candidate pairs with
   the same foundation are unfrozen. This avoids repeated checking of
   components that are superficially more attractive but in fact are
   likely to fail.

   While we've described "frozen" here as a separate mechanism for
   expository purposes, in fact it is an integral part of ICE and the
   ICE prioritization algorithm automatically ensures that the right
   candidates are unfrozen and checked in the right order. However, if
   the ICE usage does not utilize multiple components or media streams,
   it does not need to implement this algorithm.

2.5. Security for Checks

   Because ICE is used to discover which addresses can be used to send
   media between two agents, it is important to ensure that the process
   cannot be hijacked to send media to the wrong location.
Each STUN connectivity check is covered by a message authentication code (MAC) computed using a key exchanged in the signaling channel. This MAC provides message integrity and data origin authentication, thus stopping an attacker from forging or modifying connectivity check messages. Furthermore, if for example a SIP [RFC3261] caller is using ICE, and their call forks, the ICE exchanges happen independently with each forked recipient. In such a case, the keys exchanged in the signaling help associate each ICE exchange with each forked recipient.

2.6.  Concluding ICE

ICE checks are performed in a specific sequence, so that high-priority candidate pairs are checked first, followed by lower-priority ones. One way to conclude ICE is to declare victory as soon as a check for each component of each media stream completes successfully. Indeed, this is a reasonable algorithm, and details for it are provided below. However, it is possible that a packet loss will cause a higher-priority check to take longer to complete. In that case, allowing ICE to run a little longer might produce better results. More fundamentally, however, the prioritization defined by this specification may not yield "optimal" results. As an example, if the aim is to select low-latency media paths, usage of a relay is a hint that latencies may be higher, but it is nothing more than a hint. An actual round-trip time (RTT) measurement could be made, and it might demonstrate that a pair with lower priority is actually better than one with higher priority.

Consequently, ICE assigns one of the agents in the role of the CONTROLLING AGENT, and the other of the CONTROLLED AGENT. The controlling agent gets to nominate which candidate pairs will get used for media amongst the ones that are valid. It can do this in one of two ways -- using REGULAR NOMINATION or AGGRESSIVE NOMINATION.
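
As an informal illustration of the discretion the controlling agent has when choosing among valid pairs, the following minimal sketch picks the highest-priority valid pair by default and, optionally, prefers the pair with the lowest measured RTT. The names ValidPair, rtt_ms, and choose_pair_to_nominate, and the prefer_measured_rtt switch, are hypothetical and are not defined by this specification; the selection policy shown is only one possible local policy, not a required algorithm.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ValidPair:
    """A candidate pair whose connectivity check succeeded (hypothetical type)."""
    local: str                      # local transport address, e.g. "192.0.2.1:5000"
    remote: str                     # remote transport address
    priority: int                   # combined pair priority (higher is preferred)
    rtt_ms: Optional[float] = None  # RTT observed on the check, if it was recorded


def choose_pair_to_nominate(valid_pairs: List[ValidPair],
                            prefer_measured_rtt: bool = False) -> Optional[ValidPair]:
    """Return the pair a controlling agent might nominate, or None if none is valid."""
    if not valid_pairs:
        return None
    if prefer_measured_rtt:
        measured = [p for p in valid_pairs if p.rtt_ms is not None]
        if measured:
            # Assumed policy: lowest RTT wins; higher priority breaks ties.
            return min(measured, key=lambda p: (p.rtt_ms, -p.priority))
    # Default policy: follow the priority ordering.
    return max(valid_pairs, key=lambda p: p.priority)
```

With prefer_measured_rtt left at False, the sketch simply follows the priority ordering; setting it to True corresponds to the case described above, in which an RTT measurement shows a lower-priority pair to be the better choice.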

With regular nomination, the controlling agent lets the checks continue until at least one valid candidate pair for each media stream is found. Then, it picks amongst those that are valid, and sends a second STUN request on its NOMINATED candidate pair, but this time with a flag set to tell the peer that this pair has been nominated for use. This is shown in Figure 4.

          L                        R
          -                        -
          STUN request ->             \  L's
                    <- STUN response  /  check

                     <- STUN request  \  R's
          STUN response ->            /  check

          STUN request + flag ->      \  L's
                    <- STUN response  /  check

                  Figure 4: Regular Nomination

Once the STUN transaction with the flag completes, both sides cancel any future checks for that media stream. ICE will now send media using this pair. The pair an ICE agent is using for media is called the SELECTED PAIR.

In aggressive nomination, the controlling agent puts the flag in every connectivity check STUN request it sends. This way, once the first check succeeds, ICE processing is complete for that media stream and the controlling agent doesn't have to send a second STUN request. The selected pair will be the highest-priority valid pair whose check succeeded. Aggressive nomination is faster than regular nomination, but gives less flexibility. Aggressive nomination is shown in Figure 5.

          L                        R
          -                        -
          STUN request + flag ->      \  L's
                    <- STUN response  /  check

                     <- STUN request  \  R's
          STUN response ->            /  check

                 Figure 5: Aggressive Nomination

Once ICE is concluded, it can be restarted at any time for one or all of the media streams by either agent. This is done by sending an updated offer indicating a restart.

2.7.  Lite Implementations

In order for ICE to be used in a call, both agents need to support it. However, certain agents will always be connected to the public Internet and have a public IP address at which they can receive packets from any correspondent.
To make it easier for these devices to support ICE, ICE defines a special type of implementation called LITE (in contrast to the normal FULL implementation). A lite implementation doesn't gather candidates; it includes only host candidates for any media stream. Lite agents do not generate connectivity checks or run the state machines, though they need to be able to respond to connectivity checks. When a lite implementation connects with a full implementation, the full agent takes the role of the controlling agent, and the lite agent takes on the controlled role. When two lite implementations connect, no checks are sent.

For guidance on when a lite implementation is appropriate, see the discussion in Appendix A.

It is important to note that the lite implementation was added to this specification to provide a stepping stone to full implementation. Even for devices that are always connected to the public Internet, a full implementation is preferable if achievable.

2.8.  Usages of ICE

This document specifies generic use of ICE with protocols that provide offer/answer semantics. The specific details (e.g., how to encode candidates) for different protocols using ICE are described in separate usage documents. For example, usage with SIP and SDP is described in [I-D.petithuguenin-mmusic-ice-sip-sdp].

3.  Terminology

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119].

Readers should be familiar with the terminology defined in the offer/answer model [RFC3264], STUN [RFC5389], and NAT Behavioral requirements for UDP [RFC4787].

This specification makes use of the following additional terminology:

Agent:  As defined in RFC 3264, an agent is the protocol implementation involved in the offer/answer exchange. There are two agents involved in an offer/answer exchange.

Peer:  From the perspective of one of the agents in a session, its peer is the other agent. Specifically, from the perspective of the offerer, the peer is the answerer. From the perspective of the answerer, the peer is the offerer.

Transport Address:  The combination of an IP address and transport protocol (such as UDP or TCP) port.

Media, Media Stream:  When ICE is used to set up multimedia sessions, the media is usually transported over RTP, and a media stream consists of a stream of RTP packets. When ICE is used for something other than multimedia sessions, the terms "media" and "media stream" are still used in this specification to refer to the IP data packets that are exchanged between the peers on the path created and tested with ICE.

Candidate:  A transport address that is a potential point of contact for receipt of media. Candidates also have properties -- their type (server reflexive, relayed, or host), priority, foundation, and base.

Component:  A component is a piece of a media stream requiring a single transport address; a media stream may require multiple components, each of which has to work for the media stream as a whole to work. For media streams based on RTP, there are two components per media stream -- one for RTP, and one for RTCP.

Host Candidate:  A candidate obtained by binding to a specific port from an IP address on the host. This includes IP addresses on physical interfaces and logical ones, such as ones obtained through Virtual Private Networks (VPNs) and Realm Specific IP (RSIP) [RFC3102] (which lives at the operating system level).

Server Reflexive Candidate:  A candidate whose IP address and port are a binding allocated by a NAT for an agent when it sent a packet through the NAT to a server. Server reflexive candidates can be learned from STUN servers using the Binding request, or from TURN servers, which provide both a relayed and a server reflexive candidate.

Peer Reflexive Candidate:  A candidate whose IP address and port are a binding allocated by a NAT for an agent when it sent a STUN Binding request through the NAT to its peer.

Relayed Candidate:  A candidate obtained by sending a TURN Allocate request from a host candidate to a TURN server. The relayed candidate is resident on the TURN server, and the TURN server relays packets back towards the agent.

Base:  The base of a server reflexive candidate is the host candidate from which it was derived. A host candidate is also said to have a base, equal to that candidate itself. Similarly, the base of a relayed candidate is that candidate itself.

Foundation:  An arbitrary string that is the same for two candidates that have the same type, base IP address, protocol (UDP, TCP, etc.), and STUN or TURN server. If any of these are different, then the foundation will be different. Two candidate pairs with the same foundation are likely to have similar network characteristics. Foundations are used in the frozen algorithm.

Local Candidate:  A candidate that an agent has obtained and included in an offer or answer it sent.

Remote Candidate:  A candidate that an agent received in an offer or answer from its peer.

Default Destination/Candidate:  The default destination for a component of a media stream is the transport address that would be used by an agent that is not ICE aware. A default candidate for a component is one whose transport address matches the default destination for that component.

Candidate Pair:  A pairing containing a local candidate and a remote candidate.

Check, Connectivity Check, STUN Check:  A STUN Binding request transaction for the purposes of verifying connectivity. A check is sent from the local candidate to the remote candidate of a candidate pair.

Check List:  An ordered set of candidate pairs that an agent will use to generate checks.

Ordinary Check:  A connectivity check generated by an agent as a consequence of a timer that fires periodically, instructing it to send a check.

Triggered Check:  A connectivity check generated as a consequence of the receipt of a connectivity check from the peer.

Valid List:  An ordered set of candidate pairs for a media stream that have been validated by a successful STUN transaction.

Full:  An ICE implementation that performs the complete set of functionality defined by this specification.

Lite:  An ICE implementation that omits certain functions, implementing only as much as is necessary for a peer implementation that is full to gain the benefits of ICE. Lite implementations do not maintain any of the state machines and do not generate connectivity checks.

Controlling Agent:  The ICE agent that is responsible for selecting the final choice of candidate pairs and signaling them through STUN. In any session, one agent is always controlling. The other is the controlled agent.

Controlled Agent:  An ICE agent that waits for the controlling agent to select the final choice of candidate pairs.

Regular Nomination:  The process of picking a valid candidate pair for media traffic by validating the pair with one STUN request, and then picking it by sending a second STUN request with a flag indicating its nomination.

Aggressive Nomination:  The process of picking a valid candidate pair for media traffic by including a flag in every connectivity check STUN request, such that the first one to produce a valid candidate pair is used for media.

Nominated:  If a valid candidate pair has its nominated flag set, it means that it may be selected by ICE for sending and receiving media.

Selected Pair, Selected Candidate:  The candidate pair selected by ICE for sending and receiving media is called the selected pair, and each of its candidates is called the selected candidate.

Using Protocol, ICE Usage:  The protocol that uses ICE for NAT traversal. A usage specification defines the protocol-specific details on how the procedures defined here are applied to that protocol.

4.  Sending the Initial Offer

In order to send the initial offer in an offer/answer exchange, an agent must (1) gather candidates, (2) prioritize them, (3) eliminate redundant candidates, (4) (possibly) choose default candidates, and then (5) formulate and send the offer. All but the last of these five steps differ for full and lite implementations.

4.1.  Full Implementation Requirements

4.1.1.  Gathering Candidates

An agent gathers candidates when it believes that communication is imminent. An offerer can do this based on a user interface cue, or based on an explicit request to initiate a session. Every candidate is a transport address. It also has a type and a base. Four types are defined and gathered by this specification -- host candidates, server reflexive candidates, peer reflexive candidates, and relayed candidates. The server reflexive candidates are gathered using STUN or TURN, and relayed candidates are obtained through TURN. Peer reflexive candidates are obtained in later phases of ICE, as a consequence of connectivity checks.
The base of a candidate is the candidate that an agent must send from when using that candidate.

4.1.1.1.  Host Candidates

The first step is to gather host candidates. Host candidates are obtained by binding to ports (typically ephemeral) on an IP address attached to an interface (physical or virtual, including VPN interfaces) on the host.

For each UDP media stream the agent wishes to use, the agent SHOULD obtain a candidate for each component of the media stream on each IP address that the host has. It obtains each candidate by binding to a UDP port on the specific IP address. A host candidate (and indeed every candidate) is always associated with a specific component for which it is a candidate. Each component has an ID assigned to it, called the component ID. For RTP-based media streams, the RTP itself has a component ID of 1, and RTCP a component ID of 2. If an agent is using RTCP, it MUST obtain a candidate for it. If an agent is using both RTP and RTCP, it would end up with 2*K host candidates if it has K IP addresses.

The base for each host candidate is set to the candidate itself.

4.1.1.2.  Server Reflexive and Relayed Candidates

Agents SHOULD obtain relayed candidates and SHOULD obtain server reflexive candidates. These requirements are at SHOULD strength to allow for provider variation. Use of STUN and TURN servers may be unnecessary in closed networks where agents are never connected to the public Internet or to endpoints outside of the closed network. In such cases, a full implementation would be used for agents that are dual-stack or multihomed, to select a host candidate. Use of TURN servers is expensive, and when ICE is being used, they will only be utilized when both endpoints are behind NATs that perform address and port dependent mapping.
Consequently, some deployments might consider this use case to be marginal, and elect not to use TURN servers. If an agent does not gather server reflexive or relayed candidates, it is RECOMMENDED that the functionality be implemented and just disabled through configuration, so that it can be re-enabled through configuration if conditions change in the future.

If an agent is gathering both relayed and server reflexive candidates, it uses a TURN server. If it is gathering just server reflexive candidates, it uses a STUN server.

The agent next pairs each host candidate with the STUN or TURN server with which it is configured or has discovered by some means. If a STUN or TURN server is configured, it is RECOMMENDED that a domain name be configured, and the DNS procedures in [RFC5389] (using SRV records with the "stun" service) be used to discover the STUN server, and the DNS procedures in [RFC5766] (using SRV records with the "turn" service) be used to discover the TURN server.

This specification only considers usage of a single STUN or TURN server. When there are multiple choices for that single STUN or TURN server (when, for example, they are learned through DNS records and multiple results are returned), an agent SHOULD use a single STUN or TURN server (based on its IP address) for all candidates for a particular session. This improves the performance of ICE. The result is a set of pairs of host candidates with STUN or TURN servers. The agent then chooses one pair, and sends a Binding or Allocate request to the server from that host candidate. Binding requests to a STUN server are not authenticated, and any ALTERNATE-SERVER attribute in a response is ignored. Agents MUST support the backwards compatibility mode for the Binding request defined in [RFC5389].
Allocate requests SHOULD be authenticated using a long-term credential obtained by the client through some other means.

Every Ta milliseconds thereafter, the agent can generate another new STUN or TURN transaction. This transaction can either be a retry of a previous transaction that failed with a recoverable error (such as authentication failure), or a transaction for a new host candidate and STUN or TURN server pair. The agent SHOULD NOT generate transactions more frequently than one every Ta milliseconds. See Section 12 for guidance on how to set Ta and the STUN retransmit timer, RTO.

The agent will receive a Binding or Allocate response. A successful Allocate response will provide the agent with a server reflexive candidate (obtained from the mapped address) and a relayed candidate in the XOR-RELAYED-ADDRESS attribute. If the Allocate request is rejected because the server lacks resources to fulfill it, the agent SHOULD instead send a Binding request to obtain a server reflexive candidate. A Binding response will provide the agent with only a server reflexive candidate (also obtained from the mapped address). The base of the server reflexive candidate is the host candidate from which the Allocate or Binding request was sent. The base of a relayed candidate is that candidate itself. If a relayed candidate is identical to a host candidate (which can happen in rare cases), the relayed candidate MUST be discarded.

4.1.1.3.  Computing Foundations

Finally, the agent assigns each candidate a foundation. The foundation is an identifier, scoped within a session.
Two candidates MUST have the same foundation ID when all of the following are true:

o  they are of the same type (host, relayed, server reflexive, or peer reflexive)

o  their bases have the same IP address (the ports can be different)

o  for reflexive and relayed candidates, the STUN or TURN servers used to obtain them have the same IP address

o  they were obtained using the same transport protocol (TCP, UDP, etc.)

Similarly, two candidates MUST have different foundations if their types are different, their bases have different IP addresses, the STUN or TURN servers used to obtain them have different IP addresses, or their transport protocols are different.

4.1.1.4.  Keeping Candidates Alive

Once server reflexive and relayed candidates are allocated, they MUST be kept alive until ICE processing has completed, as described in Section 8.3. For server reflexive candidates learned through a Binding request, the bindings MUST be kept alive by additional Binding requests to the server. Refreshes for allocations are done using the Refresh transaction, as described in [RFC5766]. The Refresh requests will also refresh the server reflexive candidate.

4.1.2.  Prioritizing Candidates

The prioritization process results in the assignment of a priority to each candidate. Each candidate for a media stream MUST have a unique priority that MUST be a positive integer between 1 and (2**31 - 1). This priority will be used by ICE to determine the order of the connectivity checks and the relative preference for candidates.

An agent SHOULD compute this priority using the formula in Section 4.1.2.1 and choose its parameters using the guidelines in Section 4.1.2.2. If an agent elects to use a different formula, ICE will take longer to converge since both agents will not be coordinated in their checks.

4.1.2.1.  Recommended Formula

When using the formula, an agent computes the priority by determining a preference for each type of candidate (server reflexive, peer reflexive, relayed, and host), and, when the agent is multihomed, choosing a preference for its IP addresses. These two preferences are then combined to compute the priority for a candidate. That priority is computed using the following formula:

   priority = (2^24)*(type preference) +
              (2^8)*(local preference) +
              (2^0)*(256 - component ID)

The type preference MUST be an integer from 0 to 126 inclusive, and represents the preference for the type of the candidate (where the types are host, server reflexive, peer reflexive, and relayed). 126 is the highest preference, and 0 is the lowest. Setting the value to 0 means that candidates of this type will only be used as a last resort. The type preference MUST be identical for all candidates of the same type and MUST be different for candidates of different types. The type preference for peer reflexive candidates MUST be higher than that of server reflexive candidates. Note that candidates gathered based on the procedures of Section 4.1.1 will never be peer reflexive candidates; candidates of this type are learned from the connectivity checks performed by ICE.

The local preference MUST be an integer from 0 to 65535 inclusive. It represents a preference for the particular IP address from which the candidate was obtained, in cases where an agent is multihomed. 65535 represents the highest preference, and zero the lowest. When there is only a single IP address, this value SHOULD be set to 65535. More generally, if there are multiple candidates for a particular component for a particular media stream that have the same type, the local preference MUST be unique for each one.
In this specification, this only happens for multihomed hosts. If a host is multihomed because it is dual-stack, the local preference SHOULD be set equal to the precedence value for IP addresses described in RFC 6724 [RFC6724].

The component ID is the component ID for the candidate, and MUST be between 1 and 256 inclusive.

4.1.2.2.  Guidelines for Choosing Type and Local Preferences

One criterion for selection of the type and local preference values is the use of a media intermediary, such as a TURN server, VPN server, or NAT. With a media intermediary, if media is sent to that candidate, it will first transit the media intermediary before being received. Relayed candidates are one type of candidate that involves a media intermediary. Another type is host candidates obtained from a VPN interface. When media is transited through a media intermediary, it can increase the latency between transmission and reception. It can increase packet loss, because of the additional router hops that may be taken. It may increase the cost of providing service, since media will be routed in and right back out of a media intermediary run by a provider. If these concerns are important, the type preference for relayed candidates SHOULD be lower than host candidates. The RECOMMENDED values are 126 for host candidates, 100 for server reflexive candidates, 110 for peer reflexive candidates, and 0 for relayed candidates. Furthermore, if an agent is multihomed and has multiple IP addresses, the local preference for host candidates from a VPN interface SHOULD have a priority of 0.

Another criterion for selection of preferences is IP address family. ICE works with both IPv4 and IPv6.
It therefore provides a transition mechanism that allows dual-stack hosts to prefer connectivity over IPv6, but to fall back to IPv4 in case the v6 networks are disconnected (due, for example, to a failure in a 6to4 relay) [RFC3056]. It can also help with hosts that have both a native IPv6 address and a 6to4 address. In such a case, higher local preferences could be assigned to the v6 addresses, followed by the 6to4 addresses, followed by the v4 addresses. This allows a site to obtain and begin using native v6 addresses immediately, yet still fall back to 6to4 addresses when communicating with agents in other sites that do not yet have native v6 connectivity.

Another criterion for selecting preferences is security. If a user is a telecommuter, and therefore connected to a corporate network and a local home network, the user may prefer their voice traffic to be routed over the VPN in order to keep it on the corporate network when communicating within the enterprise, but use the local network when communicating with users outside of the enterprise. In such a case, a VPN address would have a higher local preference than any other address.

Another criterion for selecting preferences is topological awareness. This is most useful for candidates that make use of intermediaries. In those cases, if an agent has preconfigured or dynamically discovered knowledge of the topological proximity of the intermediaries to itself, it can use that to assign higher local preferences to candidates obtained from closer intermediaries.

4.1.3.  Eliminating Redundant Candidates

Next, the agent eliminates redundant candidates. A candidate is redundant if its transport address equals that of another candidate, and its base equals the base of that other candidate.
Note that two candidates can have the same transport address yet have different bases, and these would not be considered redundant. Frequently, a server reflexive candidate and a host candidate will be redundant when the agent is not behind a NAT. The agent SHOULD eliminate the redundant candidate with the lower priority.

4.2.  Lite Implementation Requirements

Lite implementations only utilize host candidates. A lite implementation MUST, for each component of each media stream, allocate zero or one IPv4 candidates. It MAY allocate zero or more IPv6 candidates, but no more than one per each IPv6 address utilized by the host. Since there can be no more than one IPv4 candidate per component of each media stream, if an agent has multiple IPv4 addresses, it MUST choose one for allocating the candidate. If a host is dual-stack, it is RECOMMENDED that it allocate one IPv4 candidate and one global IPv6 address. With the lite implementation, ICE cannot be used to dynamically choose amongst candidates. Therefore, including more than one candidate from a particular scope is NOT RECOMMENDED, since only a connectivity check can truly determine whether to use one address or the other.

Each component has an ID assigned to it, called the component ID. For RTP-based media streams, the RTP itself has a component ID of 1, and RTCP a component ID of 2. If an agent is using RTCP, it MUST obtain candidates for it.

Each candidate is assigned a foundation. The foundation MUST be different for two candidates allocated from different IP addresses, and MUST be the same otherwise. A simple integer that increments for each IP address will suffice. In addition, each candidate MUST be assigned a unique priority amongst all candidates for the same media stream.
This priority SHOULD be equal to:

   priority = (2^24)*(126) +
              (2^8)*(IP precedence) +
              (2^0)*(256 - component ID)

If a host is v4-only, it SHOULD set the IP precedence to 65535. If a host is v6 or dual-stack, the IP precedence SHOULD be the precedence value for IP addresses described in RFC 6724 [RFC6724].

Next, an agent chooses a default candidate for each component of each media stream. If a host is IPv4-only, there would only be one candidate for each component of each media stream, and therefore that candidate is the default. If a host is IPv6 or dual-stack, the selection of default is a matter of local policy. This default SHOULD be chosen such that it is the candidate most likely to be used with a peer. For IPv6-only hosts, this would typically be a globally scoped IPv6 address. For dual-stack hosts, the IPv4 address is RECOMMENDED.

4.3.  Encoding the Offer

The syntax for the offer and answer messages is entirely a matter of convenience for the using protocol. However, the following parameters and their data types need to be conveyed in the initial exchange:

Candidate attribute:  There will be one or more of these for each "media stream". Each candidate is composed of:

   Connection Address:  The IP address and transport protocol port of the candidate.

   Transport:  An indicator of the transport protocol for this candidate. This need not be present if the using protocol will only ever run over a single transport protocol. If it runs over more than one, or if others are anticipated to be used in the future, this should be present.

   Foundation:  A sequence of up to 32 characters.

   Component-ID:  This would be present only if the using protocol were utilizing the concept of components. If it is, it would be a positive integer that indicates the component ID for which this is a candidate.

   Priority:  An encoding of the 32-bit priority value.

   Candidate Type:  The candidate type, as defined in ICE.

   Related Address and Port:  The related IP address and port for this candidate, as defined by ICE.

   Extensibility Parameters:  The using protocol should define some means for adding new per-candidate ICE parameters in the future.

Lite Flag:  If ICE lite is used by the using protocol, it needs to convey a boolean parameter which indicates whether the implementation is lite or not.

Username Fragment and Password:  The using protocol has to convey a username fragment and password. The username fragment MUST contain at least 24 bits of randomness, and the password MUST contain at least 128 bits of randomness.

ICE extensions:  In addition to the per-candidate extensions above, the using protocol should allow for new media-stream or session-level attributes (ice-options).

If the using protocol is using the ICE mismatch feature, a way is needed to convey this parameter in answers. It is a boolean flag.

The exchange of parameters is symmetric; both agents need to send the same set of attributes as defined above.

The using protocol may (or may not) need to deal with backwards compatibility with older implementations that do not support ICE. If the fallback mechanism is being used, then presumably the using protocol provides a way of conveying the default candidate (its IP address and port) in addition to the ICE parameters.

STUN connectivity checks between agents are authenticated using the short-term credential mechanism defined for STUN [RFC5389]. This mechanism relies on a username and password that are exchanged through protocol machinery between the client and server. With ICE, the offer/answer exchange is used to exchange them.
   The username part of this credential is formed by concatenating a
   username fragment from each agent, separated by a colon. Each agent
   also provides a password, used to compute the message integrity for
   requests it receives. The username fragment and password are
   exchanged in the offer and answer. In addition to providing
   security, the username provides disambiguation and correlation of
   checks to media streams. See Appendix B.4 for motivation.

   If an agent is a lite implementation, it MUST indicate this in the
   offer.

   ICE provides for extensibility by allowing an offer or answer to
   contain a series of tokens that identify the ICE extensions used by
   that agent. If an agent supports an ICE extension, it MUST include
   the token defined for that extension in the offer.

   Once an agent has sent its offer or its answer, that agent MUST be
   prepared to receive both STUN and media packets on each candidate.
   As discussed in Section 10.1, media packets can be sent to a
   candidate prior to its appearance as the default destination for
   media in an offer or answer.

5. Receiving the Initial Offer

   When an agent receives an initial offer, it will check if the offerer
   supports ICE, determine its own role, gather candidates, prioritize
   them, choose default candidates, encode and send an answer, and for
   full implementations, form the check lists and begin connectivity
   checks.

5.1. Verifying ICE Support

   Certain middleboxes, such as ALGs, may alter the ICE offer and/or
   answer in a way that breaks ICE. If the using protocol is vulnerable
   to this kind of change, called ICE mismatch, the answerer needs to
   detect this and signal this back to the offerer. The details on
   whether this is needed and how it is done are defined by the usage
   specifications.

5.2. Determining Role

   For each session, each agent takes on a role.
   There are two roles -- controlling and controlled. The controlling
   agent is responsible for the choice of the final candidate pairs used
   for communications. For a full agent, this means nominating the
   candidate pairs that can be used by ICE for each media stream, and
   generating the updated offer based on ICE's selection, when needed.
   For a lite implementation, being the controlling agent means
   selecting a candidate pair based on the ones in the offer and answer
   (for IPv4, there is only ever one pair), and then generating an
   updated offer reflecting that selection, when needed (it is never
   needed for an IPv4-only host). The controlled agent is told which
   candidate pairs to use for each media stream, and does not generate
   an updated offer to signal this information. The sections below
   describe in detail the actual procedures followed by controlling and
   controlled nodes.

   The rules for determining the role and the impact on behavior are as
   follows:

   Both agents are full: The agent that generated the offer which
      started the ICE processing MUST take the controlling role, and the
      other MUST take the controlled role. Both agents will form check
      lists, run the ICE state machines, and generate connectivity
      checks. The controlling agent will execute the logic in
      Section 8.1 to nominate pairs that will be selected by ICE, and
      then both agents end ICE as described in Section 8.1.2.

   One agent full, one lite: The full agent MUST take the controlling
      role, and the lite agent MUST take the controlled role. The full
      agent will form check lists, run the ICE state machines, and
      generate connectivity checks. That agent will execute the logic
      in Section 8.1 to nominate pairs that will be selected by ICE, and
      use the logic in Section 8.1.2 to end ICE.
      The lite implementation will just listen for connectivity checks,
      receive them and respond to them, and then conclude ICE as
      described in Section 8.2. For the lite implementation, the state
      of ICE processing for each media stream is considered to be
      Running, and the state of ICE overall is Running.

   Both lite: The agent that generated the offer which started the ICE
      processing MUST take the controlling role, and the other MUST take
      the controlled role. In this case, no connectivity checks are
      ever sent. Rather, once the offer/answer exchange completes, each
      agent performs the processing described in Section 8 without
      connectivity checks. It is possible that both agents will believe
      they are controlled or controlling. In the latter case, the
      conflict is resolved through glare detection capabilities in the
      signaling protocol carrying the offer/answer exchange. The state
      of ICE processing for each media stream is considered to be
      Running, and the state of ICE overall is Running.

   Once roles are determined for a session, they persist unless ICE is
   restarted. An ICE restart causes a new selection of roles and
   tie-breakers.

5.3. Gathering Candidates

   The process for gathering candidates at the answerer is identical to
   the process for the offerer as described in Section 4.1.1 for full
   implementations and Section 4.2 for lite implementations. It is
   RECOMMENDED that this process begin immediately on receipt of the
   offer, prior to alerting the user. Such gathering MAY begin when an
   agent starts.

5.4. Prioritizing Candidates

   The process for prioritizing candidates at the answerer is identical
   to the process followed by the offerer, as described in Section 4.1.2
   for full implementations and Section 4.2 for lite implementations.

5.5. Encoding the Answer

   The process for encoding the answer is identical to the process
   followed by the offerer for both full and lite implementations, as
   described in Section 4.3.

5.6. Forming the Check Lists

   Forming check lists is done only by full implementations. Lite
   implementations MUST skip the steps defined in this section.

   There is one check list per in-use media stream resulting from the
   offer/answer exchange. To form the check list for a media stream,
   the agent forms candidate pairs, computes a candidate pair priority,
   orders the pairs by priority, prunes them, and sets their states.
   These steps are described in this section.

5.6.1. Forming Candidate Pairs

   First, the agent takes each of its candidates for a media stream
   (called LOCAL CANDIDATES) and pairs them with the candidates it
   received from its peer (called REMOTE CANDIDATES) for that media
   stream. In order to prevent the attacks described in Section 14.4.1,
   agents MAY limit the number of candidates they'll accept in an offer
   or answer. A local candidate is paired with a remote candidate if
   and only if the two candidates have the same component ID and have
   the same IP address version. It is possible that some of the local
   candidates won't get paired with remote candidates, and some of the
   remote candidates won't get paired with local candidates. This can
   happen if one agent doesn't include candidates for all of the
   components for a media stream. If this happens, the number of
   components for that media stream is effectively reduced, and
   considered to be equal to the minimum across both agents of the
   maximum component ID provided by each agent across all components for
   the media stream.

   In the case of RTP, this would happen when one agent provides
   candidates for RTCP, and the other does not.
   As another example, the offerer can multiplex RTP and RTCP on the
   same port and signal that it can do that in the SDP through an SDP
   attribute [RFC5761]. However, since the offerer doesn't know if the
   answerer can perform such multiplexing, the offerer includes
   candidates for RTP and RTCP on separate ports, so that the offer has
   two components per media stream. If the answerer can perform such
   multiplexing, it would include just a single component for each
   candidate -- for the combined RTP/RTCP mux. ICE would end up acting
   as if there were just a single component for this candidate.

   The candidate pair whose local and remote candidates are both the
   default candidates for a particular component is called,
   unsurprisingly, the default candidate pair for that component. This
   is the pair that would be used to transmit media if both agents had
   not been ICE aware.

   In order to aid understanding, Figure 6 shows the relationships
   between several key concepts -- transport addresses, candidates,
   candidate pairs, and check lists, in addition to indicating the main
   properties of candidates and candidate pairs.

         +------------------------------------------+
         |                                          |
         | +---------------------+                  |
         | |+----+ +----+ +----+ |   +Type          |
         | || IP | |Port| |Tran| |   +Priority      |
         | ||Addr| |    | |    | |   +Foundation    |
         | |+----+ +----+ +----+ |   +ComponentID   |
         | |      Transport      |   +RelatedAddr   |
         | |        Addr         |                  |
         | +---------------------+   +Base          |
         |             Candidate                    |
         +------------------------------------------+
         *                                          *
         *    ***************************************
         *    *
         +-------------------------------+
         |                               |
         | Local     Remote              |
         | +----+    +----+   +default?  |
         | |Cand|    |Cand|   +valid?    |
         | +----+    +----+   +nominated?|
         |                    +State     |
         |                               |
         |                               |
         |          Candidate Pair       |
         +-------------------------------+
         *                               *
         *                  **************
         *                  *
         +------------------+
         |  Candidate Pair  |
         +------------------+
         +------------------+
         |  Candidate Pair  |
         +------------------+
         +------------------+
         |  Candidate Pair  |
         +------------------+

                Check
                 List

              Figure 6: Conceptual Diagram of a Check List

5.6.2. Computing Pair Priority and Ordering Pairs

   Once the pairs are formed, a candidate pair priority is computed.
   Let G be the priority for the candidate provided by the controlling
   agent. Let D be the priority for the candidate provided by the
   controlled agent. The priority for a pair is computed as:

      pair priority = 2^32*MIN(G,D) + 2*MAX(G,D) + (G>D?1:0)

   Where G>D?1:0 is an expression whose value is 1 if G is greater than
   D, and 0 otherwise. Once the priority is assigned, the agent sorts
   the candidate pairs in decreasing order of priority. If two pairs
   have identical priority, the ordering amongst them is arbitrary.

5.6.3. Pruning the Pairs

   This sorted list of candidate pairs is used to determine a sequence
   of connectivity checks that will be performed. Each check involves
   sending a request from a local candidate to a remote candidate.
   Since an agent cannot send requests directly from a reflexive
   candidate, but only from its base, the agent next goes through the
   sorted list of candidate pairs. For each pair where the local
   candidate is server reflexive, the server reflexive candidate MUST be
   replaced by its base. Once this has been done, the agent MUST prune
   the list. This is done by removing a pair if its local and remote
   candidates are identical to the local and remote candidates of a pair
   higher up on the priority list.
   The result is a sequence of ordered candidate pairs, called the check
   list for that media stream.

   In addition, in order to limit the attacks described in
   Section 14.4.1, an agent MUST limit the total number of connectivity
   checks the agent performs across all check lists to a specific value,
   and this value MUST be configurable. A default of 100 is
   RECOMMENDED. This limit is enforced by discarding the lower-priority
   candidate pairs until fewer than 100 remain. It is RECOMMENDED
   that a lower value be utilized when possible, set to the maximum
   number of plausible checks that might be seen in an actual deployment
   configuration. The requirement for configuration is meant to provide
   a tool for fixing this value in the field if, once deployed, it is
   found to be problematic.

5.6.4. Computing States

   Each candidate pair in the check list has a foundation and a state.
   The foundation is the combination of the foundations of the local and
   remote candidates in the pair. The state is assigned once the check
   list for each media stream has been computed. There are five
   potential values that the state can have:

   Waiting: A check has not been performed for this pair, and can be
      performed as soon as it is the highest-priority Waiting pair on
      the check list.

   In-Progress: A check has been sent for this pair, but the
      transaction is in progress.

   Succeeded: A check for this pair was already done and produced a
      successful result.

   Failed: A check for this pair was already done and failed, either
      never producing any response or producing an unrecoverable failure
      response.

   Frozen: A check for this pair hasn't been performed, and it can't
      yet be performed until some other check succeeds, allowing this
      pair to unfreeze and move into the Waiting state.

   As ICE runs, the pairs will move between states as shown in Figure 7.

      +-----------+
      |           |
      |           |
      |  Frozen   |
      |           |
      |           |
      +-----------+
            |
            |unfreeze
            |
            V
      +-----------+         +-----------+
      |           |         |           |
      |           | perform |           |
      |  Waiting  |-------->|In-Progress|
      |           |         |           |
      |           |         |           |
      +-----------+         +-----------+
                                  / |
                                //  |
                              //    |
                            //      |
                           /        |
                         //         |
               failure //           |success
                     //             |
                    /               |
                  //                |
                //                  |
              //                    |
             V                      V
      +-----------+         +-----------+
      |           |         |           |
      |           |         |           |
      |   Failed  |         | Succeeded |
      |           |         |           |
      |           |         |           |
      +-----------+         +-----------+

                       Figure 7: Pair State FSM

   The initial states for each pair in a check list are computed by
   performing the following sequence of steps:

   1. The agent sets all of the pairs in each check list to the Frozen
      state.

   2. The agent examines the check list for the first media stream.
      For that media stream:

      * For all pairs with the same foundation, it sets the state of
        the pair with the lowest component ID to Waiting. If there is
        more than one such pair, the one with the highest priority is
        used.

   One of the check lists will have some number of pairs in the Waiting
   state, and the other check lists will have all of their pairs in the
   Frozen state. A check list with at least one pair that is Waiting is
   called an active check list, and a check list with all pairs Frozen
   is called a frozen check list.

   The check list itself is associated with a state, which captures the
   state of ICE checks for that media stream. There are three states:

   Running: In this state, ICE checks are still in progress for this
      media stream.

   Completed: In this state, ICE checks have produced nominated pairs
      for each component of the media stream. Consequently, ICE has
      succeeded and media can be sent.

   Failed: In this state, the ICE checks have not completed
      successfully for this media stream.
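
   The two initial-state steps given earlier in this section (all pairs
   Frozen, then one Waiting pair per foundation in the first media
   stream) can be illustrated with a minimal Python sketch. It is not
   part of the protocol definition. The Pair class, its field names,
   and the set_initial_states function are hypothetical names chosen for
   illustration, and the check lists are assumed to be ordered so that
   the first media stream comes first.

```python
from dataclasses import dataclass

FROZEN, WAITING = "Frozen", "Waiting"


@dataclass
class Pair:
    """Hypothetical candidate pair record (names are illustrative)."""
    foundation: str      # combined local + remote foundation
    component_id: int
    priority: int
    state: str = FROZEN


def set_initial_states(check_lists):
    """check_lists: list of lists of Pair, first media stream first."""
    # Step 1: every pair in every check list starts out Frozen.
    for check_list in check_lists:
        for pair in check_list:
            pair.state = FROZEN
    if not check_lists:
        return
    # Step 2: in the first media stream only, choose one pair per
    # foundation: lowest component ID, ties broken by highest priority.
    chosen = {}
    for pair in check_lists[0]:
        current = chosen.get(pair.foundation)
        key = (pair.component_id, -pair.priority)
        if current is None or key < (current.component_id, -current.priority):
            chosen[pair.foundation] = pair
    for pair in chosen.values():
        pair.state = WAITING
```

   After this runs, only the first check list can contain Waiting
   pairs, which matches the description of one active check list and
   the remaining check lists being frozen.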

   When a check list is first constructed as the consequence of an
   offer/answer exchange, it is placed in the Running state.

   ICE processing across all media streams also has a state associated
   with it. This state is equal to Running while ICE processing is
   under way. The state is Completed when ICE processing is complete
   and Failed if it failed. Rules for transitioning between states are
   described below.

5.7. Scheduling Checks

   Checks are generated only by full implementations. Lite
   implementations MUST skip the steps described in this section.

   An agent performs ordinary checks and triggered checks. The
   generation of both kinds of checks is governed by a timer that fires
   periodically for each media stream. The agent maintains a FIFO
   queue, called the triggered check queue, which contains candidate
   pairs for which checks are to be sent at the next available
   opportunity. When the timer fires, the agent removes the top pair
   from the triggered check queue, performs a connectivity check on that
   pair, and sets the state of the candidate pair to In-Progress. If
   there are no pairs in the triggered check queue, an ordinary check is
   sent.

   Once the agent has computed the check lists as described in
   Section 5.6, it sets a timer for each active check list. The timer
   fires every Ta*N seconds, where N is the number of active check lists
   (initially, there is only one active check list). Implementations
   MAY set the timer to fire less frequently than this. Implementations
   SHOULD take care to spread out these timers so that they do not fire
   at the same time for each media stream. Ta and the retransmit timer
   RTO are computed as described in Section 12. Multiplying by N allows
   this aggregate check throughput to be split between all active check
   lists.
   The first timer fires immediately, so that the agent performs a
   connectivity check the moment the offer/answer exchange has been
   done, followed by the next check Ta seconds later (since there is
   only one active check list).

   When the timer fires and there is no triggered check to be sent, the
   agent MUST choose an ordinary check as follows:

   o Find the highest-priority pair in that check list that is in the
     Waiting state.

   o If there is such a pair:

     * Send a STUN check from the local candidate of that pair to the
       remote candidate of that pair. The procedures for forming the
       STUN request for this purpose are described in Section 7.1.2.

     * Set the state of the candidate pair to In-Progress.

   o If there is no such pair:

     * Find the highest-priority pair in that check list that is in
       the Frozen state.

     * If there is such a pair:

       + Unfreeze the pair.

       + Perform a check for that pair, causing its state to
         transition to In-Progress.

     * If there is no such pair:

       + Terminate the timer for that check list.

   To compute the message integrity for the check, the agent uses the
   remote username fragment and password learned from the offer or
   answer from its peer. The local username fragment is known directly
   by the agent for its own candidate.

6. Receipt of the Initial Answer

   This section describes the procedures that an agent follows when it
   receives the answer from the peer. It verifies that its peer
   supports ICE, determines its role, and for full implementations,
   forms the check list and begins performing ordinary checks.

6.1. Verifying ICE Support

   The logic at the offerer is identical to that of the answerer as
   described in Section 5.1, with the exception that an offerer would
   never indicate ICE mismatch.

6.2. Determining Role

   The offerer follows the same procedures described for the answerer in
   Section 5.2.

6.3. Forming the Check List

   Formation of check lists is performed only by full implementations.
   The offerer follows the same procedures described for the answerer in
   Section 5.6.

6.4. Performing Ordinary Checks

   Ordinary checks are performed only by full implementations. The
   offerer follows the same procedures described for the answerer in
   Section 5.7.

7. Performing Connectivity Checks

   This section describes how connectivity checks are performed. All
   ICE implementations are required to be compliant with [RFC5389], as
   opposed to the older [RFC3489]. However, whereas a full
   implementation will both generate checks (acting as a STUN client)
   and receive them (acting as a STUN server), a lite implementation
   will only receive checks, and thus will only act as a STUN server.

7.1. STUN Client Procedures

   These procedures define how an agent sends a connectivity check,
   whether it is an ordinary or a triggered check. These procedures are
   only applicable to full implementations.

7.1.1. Creating Permissions for Relayed Candidates

   If the connectivity check is being sent using a relayed local
   candidate, the client MUST create a permission first if it has not
   already created one previously. It would have created one previously
   if it had told the TURN server to create a permission for the given
   relayed candidate towards the IP address of the remote candidate. To
   create the permission, the agent follows the procedures defined in
   [RFC5766]. The permission MUST be created towards the IP address of
   the remote candidate. It is RECOMMENDED that the agent defer
   creation of a TURN channel until ICE completes, in which case
   permissions for connectivity checks are normally created using a
   CreatePermission request.
   Once established, the agent MUST keep the permission active until ICE
   concludes.

7.1.2. Sending the Request

   A connectivity check is generated by sending a Binding request from a
   local candidate to a remote candidate. [RFC5389] describes how
   Binding requests are constructed and generated. A connectivity check
   MUST utilize the STUN short-term credential mechanism. Support for
   backwards compatibility with RFC 3489 MUST NOT be used or assumed
   with connectivity checks. The FINGERPRINT mechanism MUST be used for
   connectivity checks.

   ICE extends STUN by defining several new attributes, including
   PRIORITY, USE-CANDIDATE, ICE-CONTROLLED, and ICE-CONTROLLING. These
   new attributes are formally defined in Section 15.1, and their usage
   is described in the subsections below. These STUN extensions are
   applicable only to connectivity checks used for ICE.

7.1.2.1. PRIORITY and USE-CANDIDATE

   An agent MUST include the PRIORITY attribute in its Binding request.
   The attribute MUST be set equal to the priority that would be
   assigned, based on the algorithm in Section 4.1.2, to a peer
   reflexive candidate, should one be learned as a consequence of this
   check (see Section 7.1.3.2.1 for how peer reflexive candidates are
   learned). This priority value will be computed identically to how
   the priority for the local candidate of the pair was computed, except
   that the type preference is set to the value for peer reflexive
   candidate types.

   The controlling agent MAY include the USE-CANDIDATE attribute in the
   Binding request. The controlled agent MUST NOT include it in its
   Binding request. This attribute signals that the controlling agent
   wishes to cease checks for this component, and use the candidate pair
   resulting from the check for this component. Section 8.1.1 provides
   guidance on determining when to include it.

7.1.2.2. ICE-CONTROLLED and ICE-CONTROLLING

   The agent MUST include the ICE-CONTROLLED attribute in the request if
   it is in the controlled role, and MUST include the ICE-CONTROLLING
   attribute in the request if it is in the controlling role. The
   content of either attribute MUST be the tie-breaker that was
   determined in Section 5.2. These attributes are defined fully in
   Section 15.1.

7.1.2.3. Forming Credentials

   A Binding request serving as a connectivity check MUST utilize the
   STUN short-term credential mechanism. The username for the
   credential is formed by concatenating the username fragment provided
   by the peer with the username fragment of the agent sending the
   request, separated by a colon (":"). The password is equal to the
   password provided by the peer. For example, consider the case where
   agent L is the offerer, and agent R is the answerer. Agent L
   included a username fragment of LFRAG for its candidates and a
   password of LPASS. Agent R provided a username fragment of RFRAG and
   a password of RPASS. A connectivity check from L to R utilizes the
   username RFRAG:LFRAG and a password of RPASS. A connectivity check
   from R to L utilizes the username LFRAG:RFRAG and a password of
   LPASS. The responses utilize the same usernames and passwords as the
   requests (note that the USERNAME attribute is not present in the
   response).

7.1.2.4. DiffServ Treatment

   If the agent is using DiffServ Codepoint markings [RFC2475] in its
   media packets, it SHOULD apply those same markings to its
   connectivity checks.

7.1.3. Processing the Response

   When a Binding response is received, it is correlated to its Binding
   request using the transaction ID, as defined in [RFC5389], which then
   ties it to the candidate pair for which the Binding request was sent.
   This section defines additional procedures for processing Binding
   responses specific to this usage of STUN.

7.1.3.1. Failure Cases

   If the STUN transaction generates a 487 (Role Conflict) error
   response, the agent checks whether it included the ICE-CONTROLLED or
   ICE-CONTROLLING attribute in the Binding request. If the request
   contained the ICE-CONTROLLED attribute, the agent MUST switch to the
   controlling role if it has not already done so. If the request
   contained the ICE-CONTROLLING attribute, the agent MUST switch to the
   controlled role if it has not already done so. Once it has switched,
   the agent MUST enqueue the candidate pair whose check generated the
   487 into the triggered check queue. The state of that pair is set to
   Waiting. When the triggered check is sent, it will contain an
   ICE-CONTROLLING or ICE-CONTROLLED attribute reflecting its new role.
   Note, however, that the tie-breaker value MUST NOT be reselected.

   A change in roles will require an agent to recompute pair priorities
   (Section 5.6.2), since those priorities are a function of controlling
   and controlled roles. The change in role will also impact whether
   the agent is responsible for selecting nominated pairs and generating
   updated offers upon conclusion of ICE.

   Agents MAY support receipt of ICMP errors for connectivity checks.
   If the STUN transaction generates an ICMP error, the agent sets the
   state of the pair to Failed. If the STUN transaction generates a
   STUN error response that is unrecoverable (as defined in [RFC5389])
   or times out, the agent sets the state of the pair to Failed.
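
   The 487 (Role Conflict) handling described earlier in this section,
   combined with the pair priority formula of Section 5.6.2, can be
   illustrated with a minimal Python sketch. It is an illustration
   under stated assumptions, not a definitive implementation: the Agent
   and Pair classes, their fields, and the sent_controlling flag (True
   when the failed request carried ICE-CONTROLLING) are hypothetical
   names that do not appear in this specification.

```python
from collections import deque
from dataclasses import dataclass, field


def pair_priority(g, d):
    """G: controlling agent's candidate priority, D: controlled agent's."""
    return (1 << 32) * min(g, d) + 2 * max(g, d) + (1 if g > d else 0)


@dataclass
class Pair:
    """Hypothetical pair record holding both candidate priorities."""
    local_priority: int
    remote_priority: int
    state: str = "In-Progress"
    priority: int = 0


@dataclass
class Agent:
    """Hypothetical agent state; the tie-breaker is never reselected."""
    controlling: bool
    tie_breaker: int
    check_list: list = field(default_factory=list)
    triggered_queue: deque = field(default_factory=deque)

    def recompute_priorities(self):
        # G and D depend on which side currently holds the controlling role.
        for p in self.check_list:
            if self.controlling:
                g, d = p.local_priority, p.remote_priority
            else:
                g, d = p.remote_priority, p.local_priority
            p.priority = pair_priority(g, d)
        self.check_list.sort(key=lambda p: p.priority, reverse=True)

    def on_role_conflict(self, pair, sent_controlling):
        # A request sent with ICE-CONTROLLING means: become controlled,
        # and vice versa. Switch only if not already in the new role.
        new_role_is_controlling = not sent_controlling
        if self.controlling != new_role_is_controlling:
            self.controlling = new_role_is_controlling
            self.recompute_priorities()
        # Retry the same pair via the triggered check queue.
        pair.state = "Waiting"
        self.triggered_queue.append(pair)
```

   Keeping the role switch and the priority recomputation in one place
   is a design choice of this sketch; it ensures the check list is
   re-sorted before the re-queued check is sent with the new role
   attribute.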

   The agent MUST check that the source IP address and port of the
   response equal the destination IP address and port to which the
   Binding request was sent, and that the destination IP address and
   port of the response match the source IP address and port from which
   the Binding request was sent. In other words, the source and
   destination transport addresses in the request and response are
   symmetric. If they are not symmetric, the agent sets the state of
   the pair to Failed.

7.1.3.2. Success Cases

   A check is considered to be a success if all of the following are
   true:

   o The STUN transaction generated a success response.

   o The source IP address and port of the response equal the
     destination IP address and port to which the Binding request was
     sent.

   o The destination IP address and port of the response match the
     source IP address and port from which the Binding request was
     sent.

7.1.3.2.1. Discovering Peer Reflexive Candidates

   The agent checks the mapped address from the STUN response. If the
   transport address does not match any of the local candidates that the
   agent knows about, the mapped address represents a new candidate -- a
   peer reflexive candidate. Like other candidates, it has a type,
   base, priority, and foundation. They are computed as follows:

   o Its type is equal to peer reflexive.

   o Its base is set equal to the local candidate of the candidate pair
     from which the STUN check was sent.

   o Its priority is set equal to the value of the PRIORITY attribute
     in the Binding request.

   o Its foundation is selected as described in Section 4.1.1.3.

   This peer reflexive candidate is then added to the list of local
   candidates for the media stream. Its username fragment and password
   are the same as all other local candidates for that media stream.
   However, the peer reflexive candidate is not paired with other remote
   candidates. This is not necessary; a valid pair will be generated
   from it momentarily based on the procedures in Section 7.1.3.2.2. If
   an agent wishes to pair the peer reflexive candidate with other
   remote candidates besides the one in the valid pair that will be
   generated, the agent MAY generate an updated offer which includes the
   peer reflexive candidate. This will cause it to be paired with all
   other remote candidates.

7.1.3.2.2. Constructing a Valid Pair

   The agent constructs a candidate pair whose local candidate equals
   the mapped address of the response, and whose remote candidate equals
   the destination address to which the request was sent. This is
   called a valid pair, since it has been validated by a STUN
   connectivity check. The valid pair may equal the pair that generated
   the check, may equal a different pair in the check list, or may be a
   pair not currently on any check list. If the pair equals the pair
   that generated the check or is on a check list currently, it is also
   added to the VALID LIST, which is maintained by the agent for each
   media stream. This list is empty at the start of ICE processing, and
   fills as checks are performed, resulting in valid candidate pairs.

   It will be very common that the pair will not be on any check list.
   Recall that the check list has pairs whose local candidates are never
   server reflexive; those pairs had their local candidates converted to
   the base of the server reflexive candidates, and then pruned if they
   were redundant. When the response to the STUN check arrives, the
   mapped address will be reflexive if there is a NAT between the two.
   In that case, the valid pair will have a local candidate that doesn't
   match any of the pairs in the check list.
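
   The construction just described, and the common case where the valid
   pair is not on any check list, can be illustrated with a minimal
   Python sketch. It is illustrative only. The function name
   construct_valid_pair, the use of (IP, port) tuples for transport
   addresses, and the representation of a check list as a set of
   (local, remote) tuples are assumptions of this sketch, and the
   addresses in the usage lines are documentation addresses chosen as
   hypothetical values.

```python
def construct_valid_pair(mapped_addr, request_dest, local_addrs, check_list):
    """Build the valid pair for a successful check.

    mapped_addr, request_dest: (ip, port) tuples.
    local_addrs: set of (ip, port) for known local candidates.
    check_list: set of ((local ip, port), (remote ip, port)) pairs.
    Returns (valid_pair, is_new_peer_reflexive, on_check_list).
    """
    # A mapped address that matches no known local candidate is a new
    # peer reflexive candidate.
    is_new_peer_reflexive = mapped_addr not in local_addrs
    # Local side is the mapped address; remote side is where the
    # Binding request was sent.
    valid_pair = (mapped_addr, request_dest)
    return valid_pair, is_new_peer_reflexive, valid_pair in check_list


# Hypothetical scenario: a host candidate behind a NAT.
host = ("192.0.2.10", 5000)
remote = ("198.51.100.2", 7000)
mapped = ("203.0.113.5", 6000)   # address seen by the peer
pair, is_prflx, listed = construct_valid_pair(
    mapped, remote, {host}, {(host, remote)}
)
```

   In this scenario the check was sent from the host candidate, yet the
   resulting valid pair has the NAT's mapped address as its local
   candidate, so it matches nothing on the check list and its priority
   has to be computed separately.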

   If the pair is not on any check list, the agent computes the priority
   for the pair based on the priority of each candidate, using the
   algorithm in Section 5.6. The priority of the local candidate
   depends on its type. If it is not peer reflexive, it is equal to the
   priority signaled for that candidate in the offer or answer. If it
   is peer reflexive, it is equal to the PRIORITY attribute the agent
   placed in the Binding request that just completed. The priority of
   the remote candidate is taken from the offer/answer of the peer. If
   the candidate does not appear there, then the check must have been a
   triggered check to a new remote candidate. In that case, the
   priority is taken as the value of the PRIORITY attribute in the
   Binding request that triggered the check that just completed. The
   pair is then added to the VALID LIST.

7.1.3.2.3. Updating Pair States

   The agent sets the state of the pair that *generated* the check to
   Succeeded. Note that the pair that *generated* the check may be
   different from the valid pair constructed in Section 7.1.3.2.2 as a
   consequence of the response. The success of this check might also
   cause the state of other checks to change. The agent MUST perform
   the following two steps:

   1. The agent changes the states for all other Frozen pairs for the
      same media stream and same foundation to Waiting. Typically, but
      not always, these other pairs will have different component IDs.

   2. If there is a pair in the valid list for every component of this
      media stream (where this is the actual number of components being
      used, in cases where the number of components signaled in the
      offer/answer differs from offerer to answerer), the success of
      this check may unfreeze checks for other media streams.
Note 1917 that this step is followed not just the first time the valid list 1918 under consideration has a pair for every component, but every 1919 subsequent time a check succeeds and adds yet another pair to 1920 that valid list. The agent examines the check list for each 1921 other media stream in turn: 1923 * If the check list is active, the agent changes the state of 1924 all Frozen pairs in that check list whose foundation matches a 1925 pair in the valid list under consideration to Waiting. 1927 * If the check list is frozen, and there is at least one pair in 1928 the check list whose foundation matches a pair in the valid 1929 list under consideration, the state of all pairs in the check 1930 list whose foundation matches a pair in the valid list under 1931 consideration is set to Waiting. This will cause the check 1932 list to become active, and ordinary checks will begin for it, 1933 as described in Section 5.7. 1935 * If the check list is frozen, and there are no pairs in the 1936 check list whose foundation matches a pair in the valid list 1937 under consideration, the agent 1939 + groups together all of the pairs with the same foundation, 1940 and 1942 + for each group, sets the state of the pair with the lowest 1943 component ID to Waiting. If there is more than one such 1944 pair, the one with the highest-priority is used. 1946 7.1.3.2.4. Updating the Nominated Flag 1948 If the agent was a controlling agent, and it had included a USE- 1949 CANDIDATE attribute in the Binding request, the valid pair generated 1950 from that check has its nominated flag set to true. This flag 1951 indicates that this valid pair should be used for media if it is the 1952 highest-priority one amongst those whose nominated flag is set. This 1953 may conclude ICE processing for this media stream or all media 1954 streams; see Section 8. 
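   The selection rule implied by the nominated flag can be sketched
   informally as below. The ValidPair type and the pair_for_media
   function are hypothetical names used only for illustration; the
   per-component scoping follows the description in Section 10.1.1.

```python
from dataclasses import dataclass


# Hypothetical record for an entry in the valid list.
@dataclass
class ValidPair:
    component_id: int
    priority: int
    nominated: bool = False


def pair_for_media(valid_list, component_id):
    """Return the highest-priority nominated pair for a component, or None."""
    nominated = [p for p in valid_list
                 if p.component_id == component_id and p.nominated]
    return max(nominated, key=lambda p: p.priority, default=None)


valid_list = [
    ValidPair(component_id=1, priority=100, nominated=True),
    ValidPair(component_id=1, priority=200, nominated=False),
    ValidPair(component_id=1, priority=150, nominated=True),
]
print(pair_for_media(valid_list, 1).priority)  # 150
```

   The pair with priority 200 is ignored because its nominated flag is
   not set, even though it has the highest priority in the valid list.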
   If the agent is the controlled agent, the response may be the result
   of a triggered check that was sent in response to a request that
   itself had the USE-CANDIDATE attribute. This case is described in
   Section 7.2.1.5, and may now result in setting the nominated flag for
   the pair learned from the original request.

7.1.3.3. Check List and Timer State Updates

   Regardless of whether the check was successful or failed, the
   completion of the transaction may require updating of check list and
   timer states.

   If all of the pairs in the check list are now either in the Failed or
   Succeeded state:

   o  If there is not a pair in the valid list for each component of the
      media stream, the state of the check list is set to Failed.

   o  For each frozen check list, the agent

      *  groups together all of the pairs with the same foundation, and

      *  for each group, sets the state of the pair with the lowest
         component ID to Waiting. If there is more than one such pair,
         the one with the highest priority is used.

   If none of the pairs in the check list are in the Waiting or Frozen
   state, the check list is no longer considered active, and will not
   count towards the value of N in the computation of timers for
   ordinary checks as described in Section 5.7.

7.2. STUN Server Procedures

   An agent MUST be prepared to receive a Binding request on the base of
   each candidate it included in its most recent offer or answer. This
   requirement holds even if the peer is a lite implementation.

   The agent MUST use a short-term credential to authenticate the
   request and perform a message integrity check. The agent MUST
   consider the username to be valid if it consists of two values
   separated by a colon, where the first value is equal to the username
   fragment generated by the agent in an offer or answer for a session
   in progress. It is possible (and in fact very likely) that an
   offerer will receive a Binding request prior to receiving the answer
   from its peer. If this happens, the agent MUST immediately generate
   a response (including computation of the mapped address as described
   in Section 7.2.1.2). The agent has sufficient information at this
   point to generate the response; the password from the peer is not
   required. Once the answer is received, it MUST proceed with the
   remaining steps required, namely, Section 7.2.1.3, Section 7.2.1.4,
   and Section 7.2.1.5 for full implementations. In cases where
   multiple STUN requests are received before the answer, this may cause
   several pairs to be queued up in the triggered check queue.

   An agent MUST NOT utilize the ALTERNATE-SERVER mechanism, and MUST
   NOT support the backwards-compatibility mechanisms to RFC 3489. It
   MUST utilize the FINGERPRINT mechanism.

   If the agent is using Diffserv Codepoint markings [RFC2475] in its
   media packets, it SHOULD apply those same markings to its responses
   to Binding requests. The same would apply to any layer 2 markings
   the endpoint might be applying to media packets.

7.2.1. Additional Procedures for Full Implementations

   This subsection defines the additional server procedures applicable
   to full implementations.

7.2.1.1. Detecting and Repairing Role Conflicts

   Normally, the rules for selection of a role in Section 5.2 will
   result in each agent selecting a different role -- one controlling
   and one controlled. However, in unusual call flows, typically
   utilizing third party call control, it is possible for both agents to
   select the same role. This section describes procedures for checking
   for this case and repairing it. These procedures apply only to
   usages of ICE that require conflict resolution. The usage document
   MUST specify whether this mechanism is needed.

   An agent MUST examine the Binding request for either the
   ICE-CONTROLLING or ICE-CONTROLLED attribute. It MUST follow these
   procedures:

   o  If neither ICE-CONTROLLING nor ICE-CONTROLLED is present in the
      request, the peer agent may have implemented a previous version of
      this specification. There may be a conflict, but it cannot be
      detected.

   o  If the agent is in the controlling role, and the ICE-CONTROLLING
      attribute is present in the request:

      *  If the agent's tie-breaker is larger than or equal to the
         contents of the ICE-CONTROLLING attribute, the agent generates
         a Binding error response and includes an ERROR-CODE attribute
         with a value of 487 (Role Conflict) but retains its role.

      *  If the agent's tie-breaker is less than the contents of the
         ICE-CONTROLLING attribute, the agent switches to the controlled
         role.

   o  If the agent is in the controlled role, and the ICE-CONTROLLED
      attribute is present in the request:

      *  If the agent's tie-breaker is larger than or equal to the
         contents of the ICE-CONTROLLED attribute, the agent switches to
         the controlling role.

      *  If the agent's tie-breaker is less than the contents of the
         ICE-CONTROLLED attribute, the agent generates a Binding error
         response and includes an ERROR-CODE attribute with a value of
         487 (Role Conflict) but retains its role.

   o  If the agent is in the controlled role and the ICE-CONTROLLING
      attribute was present in the request, or the agent was in the
      controlling role and the ICE-CONTROLLED attribute was present in
      the request, there is no conflict.

   A change in roles will require an agent to recompute pair priorities
   (Section 5.6.2), since those priorities are a function of controlling
   and controlled roles.
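
   The role-conflict rules listed above can be summarized in a short,
   non-normative sketch. The Role enum, the resolve_role_conflict
   function, and its return convention are assumptions made for this
   illustration; they are not defined by this specification.

```python
from enum import Enum


class Role(Enum):
    CONTROLLING = "controlling"
    CONTROLLED = "controlled"


def resolve_role_conflict(role, tie_breaker,
                          ice_controlling=None, ice_controlled=None):
    """Return (new_role, send_487) for a received Binding request.

    ice_controlling / ice_controlled carry the tie-breaker value found in
    the corresponding attribute, or None if that attribute is absent.
    """
    if role is Role.CONTROLLING and ice_controlling is not None:
        if tie_breaker >= ice_controlling:
            return role, True               # keep role, answer 487
        return Role.CONTROLLED, False       # switch to controlled
    if role is Role.CONTROLLED and ice_controlled is not None:
        if tie_breaker >= ice_controlled:
            return Role.CONTROLLING, False  # switch to controlling
        return role, True                   # keep role, answer 487
    # Either no attribute is present (a conflict cannot be detected)
    # or the roles are complementary (no conflict).
    return role, False


print(resolve_role_conflict(Role.CONTROLLING, 10, ice_controlling=5))
print(resolve_role_conflict(Role.CONTROLLED, 10, ice_controlled=5))
```

   The first call keeps the controlling role and signals a 487 error
   response; the second switches the agent to the controlling role.
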
   The change in role will also impact whether the agent is responsible
   for selecting nominated pairs and generating updated offers upon
   conclusion of ICE.

   The remaining sections in Section 7.2.1 are followed if the server
   generated a successful response to the Binding request, even if the
   agent changed roles.

7.2.1.2. Computing Mapped Address

   For requests being received on a relayed candidate, the source
   transport address used for STUN processing (namely, generation of the
   XOR-MAPPED-ADDRESS attribute) is the transport address as seen by the
   TURN server. That source transport address will be present in the
   XOR-PEER-ADDRESS attribute of a Data Indication message, if the
   Binding request was delivered through a Data Indication. If the
   Binding request was delivered through a ChannelData message, the
   source transport address is the one that was bound to the channel.

7.2.1.3. Learning Peer Reflexive Candidates

   If the source transport address of the request does not match any
   existing remote candidates, it represents a new peer reflexive remote
   candidate. This candidate is constructed as follows:

   o  The priority of the candidate is set to the PRIORITY attribute
      from the request.

   o  The type of the candidate is set to peer reflexive.

   o  The foundation of the candidate is set to an arbitrary value,
      different from the foundation for all other remote candidates. If
      any subsequent offer/answer exchanges contain this peer reflexive
      candidate, it will signal the actual foundation for the candidate.

   o  The component ID of this candidate is set to the component ID for
      the local candidate to which the request was sent.

   This candidate is added to the list of remote candidates. However,
   the agent does not pair this candidate with any local candidates.

7.2.1.4. Triggered Checks

   Next, the agent constructs a pair whose local candidate is equal to
   the transport address on which the STUN request was received, and a
   remote candidate equal to the source transport address where the
   request came from (which may be the peer reflexive remote candidate
   that was just learned). The local candidate will either be a host
   candidate (for cases where the request was not received through a
   relay) or a relayed candidate (for cases where it is received through
   a relay). The local candidate can never be a server reflexive
   candidate. Since both candidates are known to the agent, it can
   obtain their priorities and compute the candidate pair priority.
   This pair is then looked up in the check list. There can be one of
   several outcomes:

   o  If the pair is already on the check list:

      *  If the state of that pair is Waiting or Frozen, a check for
         that pair is enqueued into the triggered check queue if not
         already present.

      *  If the state of that pair is In-Progress, the agent cancels the
         in-progress transaction. Cancellation means that the agent
         will not retransmit the request, will not treat the lack of
         response as a failure, but will wait the duration of the
         transaction timeout for a response. In addition, the agent
         MUST create a new connectivity check for that pair
         (representing a new STUN Binding request transaction) by
         enqueueing the pair in the triggered check queue. The state of
         the pair is then changed to Waiting.

      *  If the state of the pair is Failed, it is changed to Waiting
         and the agent MUST create a new connectivity check for that
         pair (representing a new STUN Binding request transaction), by
         enqueueing the pair in the triggered check queue.

      *  If the state of that pair is Succeeded, nothing further is
         done.

      These steps are done to facilitate rapid completion of ICE when
      both agents are behind NAT.

   o  If the pair is not already on the check list:

      *  The pair is inserted into the check list based on its priority.

      *  Its state is set to Waiting.

      *  The pair is enqueued into the triggered check queue.

   When a triggered check is to be sent, it is constructed and processed
   as described in Section 7.1.2. These procedures require the agent to
   know the transport address, username fragment, and password for the
   peer. The username fragment for the remote candidate is equal to the
   part after the colon of the USERNAME in the Binding request that was
   just received. Using that username fragment, the agent can check the
   offers/answers received from its peer (there may be more than one in
   cases of forking), and find this username fragment. The
   corresponding password is then selected.

7.2.1.5. Updating the Nominated Flag

   If the Binding request received by the agent had the USE-CANDIDATE
   attribute set, and the agent is in the controlled role, the agent
   looks at the state of the pair computed in Section 7.2.1.4:

   o  If the state of this pair is Succeeded, it means that the check
      generated by this pair produced a successful response. This would
      have caused the agent to construct a valid pair when that success
      response was received (see Section 7.1.3.2.2). The agent now sets
      the nominated flag in the valid pair to true. This may end ICE
      processing for this media stream; see Section 8.

   o  If the state of this pair is In-Progress, if its check produces a
      successful result, the resulting valid pair has its nominated flag
      set when the response arrives. This may end ICE processing for
      this media stream when it arrives; see Section 8.

7.2.2. Additional Procedures for Lite Implementations

   If the check that was just received contained a USE-CANDIDATE
   attribute, the agent constructs a candidate pair whose local
   candidate is equal to the transport address on which the request was
   received, and whose remote candidate is equal to the source transport
   address of the request that was received. This candidate pair is
   assigned an arbitrary priority, and placed into a list of valid
   candidates called the valid list. The agent sets the nominated flag
   for that pair to true. ICE processing is considered complete for a
   media stream if the valid list contains a candidate pair for each
   component.

8. Concluding ICE Processing

   This section describes how an agent completes ICE.

8.1. Procedures for Full Implementations

   Concluding ICE involves nominating pairs by the controlling agent and
   updating of state machinery.

8.1.1. Nominating Pairs

   The controlling agent nominates pairs to be selected by ICE by using
   one of two techniques: regular nomination or aggressive nomination.
   If its peer has a lite implementation, an agent MUST use a regular
   nomination algorithm. If its peer is using ICE options (present in
   an ice-options attribute from the peer) that the agent does not
   understand, the agent MUST use a regular nomination algorithm. If
   its peer is a full implementation and isn't using any ICE options or
   is using ICE options understood by the agent, the agent MAY use
   either the aggressive or the regular nomination algorithm. However,
   the regular algorithm is RECOMMENDED since it provides greater
   stability.

8.1.1.1. Regular Nomination

   With regular nomination, the agent lets some number of checks
   complete, each of which omits the USE-CANDIDATE attribute.
   Once one or more checks complete successfully for a component of a
   media stream, valid pairs are generated and added to the valid list.
   The agent lets the checks continue until some stopping criterion is
   met, and then picks amongst the valid pairs based on an evaluation
   criterion. The criteria for stopping the checks and for evaluating
   the valid pairs are entirely a matter of local optimization.

   When the controlling agent selects the valid pair, it repeats the
   check that produced this valid pair (by enqueueing the pair that
   generated the check into the triggered check queue), this time with
   the USE-CANDIDATE attribute. This check should succeed (since the
   previous one did), causing the nominated flag of that and only that
   pair to be set. Consequently, there will be only a single nominated
   pair in the valid list for each component, and when the state of the
   check list moves to completed, that exact pair is selected by ICE for
   sending and receiving media for that component.

   Regular nomination provides the most flexibility, since the agent has
   control over the stopping and selection criteria for checks. The
   only requirement is that the agent MUST eventually pick one and only
   one candidate pair and generate a check for that pair with the
   USE-CANDIDATE attribute present. Regular nomination also improves
   ICE's resilience to variations in implementation (see Section 11).
   Regular nomination is also more stable, allowing both agents to
   converge on a single pair for media without any transient selections,
   which can happen with the aggressive algorithm. The drawback of
   regular nomination is that it is guaranteed to increase latencies
   because it requires an additional check to be done.

8.1.1.2. Aggressive Nomination

   With aggressive nomination, the controlling agent includes the
   USE-CANDIDATE attribute in every check it sends.
   Once the first check for a component succeeds, it will be added to
   the valid list and have its nominated flag set. When all components
   have a nominated pair in the valid list, media can begin to flow
   using the highest-priority nominated pair. However, because the
   agent included the USE-CANDIDATE attribute in all of its checks,
   another check may yet complete, causing another valid pair to have
   its nominated flag set. ICE always selects the highest-priority
   nominated candidate pair from the valid list as the one used for
   media. Consequently, the selected pair may actually change briefly
   as ICE checks complete, resulting in a set of transient selections
   until it stabilizes.

8.1.2. Updating States

   For both controlling and controlled agents, the state of ICE
   processing depends on the presence of nominated candidate pairs in
   the valid list and on the state of the check list. Note that, at any
   time, more than one of the following cases can apply:

   o  If there are no nominated pairs in the valid list for a media
      stream and the state of the check list is Running, ICE processing
      continues.

   o  If there is at least one nominated pair in the valid list for a
      media stream and the state of the check list is Running:

      *  The agent MUST remove all Waiting and Frozen pairs in the check
         list and triggered check queue for the same component as the
         nominated pairs for that media stream.

      *  If an In-Progress pair in the check list is for the same
         component as a nominated pair, the agent SHOULD cease
         retransmissions for its check if its pair priority is lower
         than the lowest-priority nominated pair for that component.

   o  Once there is at least one nominated pair in the valid list for
      every component of at least one media stream and the state of the
      check list is Running:

      *  The agent MUST change the state of processing for its check
         list for that media stream to Completed.

      *  The agent MUST continue to respond to any checks it may still
         receive for that media stream, and MUST perform triggered
         checks if required by the processing of Section 7.2.

      *  The agent MUST continue retransmitting any In-Progress checks
         for that check list.

      *  The agent MAY begin transmitting media for this media stream as
         described in Section 10.1.

   o  Once the state of each check list is Completed:

      *  The agent sets the state of ICE processing overall to
         Completed.

      *  If the controlling agent is using an aggressive nomination
         algorithm, this may result in several updated offers as the
         pairs selected for media change. An agent MAY delay sending
         the offer for a brief interval (one second is RECOMMENDED) in
         order to allow the selected pairs to stabilize.

   o  If the state of the check list is Failed, ICE has not been able to
      complete for this media stream. The correct behavior depends on
      the state of the check lists for other media streams:

      *  If all check lists are Failed, ICE processing overall is
         considered to be in the Failed state, and the agent SHOULD
         consider the session a failure, SHOULD NOT restart ICE, and the
         controlling agent SHOULD terminate the entire session.

      *  If at least one of the check lists for other media streams is
         Completed, the controlling agent SHOULD remove the failed media
         stream from the session in its updated offer.

      *  If none of the check lists for other media streams are
         Completed, but at least one is Running, the agent SHOULD let
         ICE continue.

8.2. Procedures for Lite Implementations

   Concluding ICE for a lite implementation is relatively
   straightforward. There are two cases to consider:

      The implementation is lite, and its peer is full.

      The implementation is lite, and its peer is lite.

   The effect of ICE concluding is that the agent can free any allocated
   host candidates that were not utilized by ICE, as described in
   Section 8.3.

8.2.1. Peer Is Full

   In this case, the agent will receive connectivity checks from its
   peer. When an agent has received a connectivity check that includes
   the USE-CANDIDATE attribute for each component of a media stream, the
   state of ICE processing for that media stream moves from Running to
   Completed. When the state of ICE processing for all media streams is
   Completed, the state of ICE processing overall is Completed.

   The lite implementation will never itself determine that ICE
   processing has failed for a media stream; rather, the full peer will
   make that determination and then remove or restart the failed media
   stream in a subsequent offer.

8.2.2. Peer Is Lite

   Once the offer/answer exchange has completed, each agent examines its
   own candidates and those of its peer. For each media stream, each
   agent pairs up its own candidates with the candidates of its peer for
   that media stream. Two candidates are paired up when they are for
   the same component, utilize the same transport protocol (UDP in this
   specification), and are from the same IP address family (IPv4 or
   IPv6).

   o  If there is a single pair per component, that pair is added to the
      valid list. If all of the components for a media stream had one
      pair, the state of ICE processing for that media stream is set to
      Completed. If all media streams are Completed, the state of ICE
      processing is set to Completed overall. This will always be the
      case for implementations that are IPv4-only.

   o  If there is more than one pair per component:

      *  The agent MUST select a pair based on local policy. Since this
         case only arises for IPv6, it is RECOMMENDED that an agent
         follow the procedures of RFC 6724 [RFC6724] to select a single
         pair.

      *  The agent adds the selected pair for each component to the
         valid list. As described in Section 10.1, this will permit
         media to begin flowing. However, it is possible (and in fact
         likely) that the two agents have chosen different pairs.

      *  To reconcile this, the controlling agent MUST send an updated
         offer which will include the remote-candidates attribute.

      *  The agent MUST NOT update the state of ICE processing when the
         offer is sent. If this subsequent offer completes, the
         controlling agent MUST change the state of ICE processing to
         Completed for all media streams, and the state of ICE
         processing overall to Completed.

8.3. Freeing Candidates

8.3.1. Full Implementation Procedures

   The procedures in Section 8 require that an agent continue to listen
   for STUN requests and continue to generate triggered checks for a
   media stream, even once processing for that stream completes. The
   rules in this section describe when it is safe for an agent to cease
   sending or receiving checks on a candidate that was not selected by
   ICE, and then free the candidate.

8.3.2. Lite Implementation Procedures

   A lite implementation MAY free candidates not selected by ICE as soon
   as ICE processing has reached the Completed state for all peers for
   all media streams using those candidates.

9. Keepalives

   All endpoints MUST send keepalives for each media session. These
   keepalives serve the purpose of keeping NAT bindings alive for the
   media session. These keepalives MUST be sent even if ICE is not
   being utilized for the session at all.
   The keepalive SHOULD be sent using a format that is supported by its
   peer. ICE endpoints allow for STUN-based keepalives for UDP streams,
   and as such, STUN keepalives MUST be used when an agent is a full ICE
   implementation and is communicating with a peer that supports ICE
   (lite or full). If the peer does not support ICE, the choice of a
   packet format for keepalives is a matter of local implementation. A
   format that allows packets to easily be sent in the absence of actual
   media content is RECOMMENDED. Examples of formats that readily meet
   this goal are RTP No-Op [I-D.ietf-avt-rtp-no-op], and in cases where
   both sides support it, RTP comfort noise [RFC3389]. If the peer
   doesn't support any formats that are particularly well suited for
   keepalives, an agent SHOULD send RTP packets with an incorrect
   version number, or some other form of error that would cause them to
   be discarded by the peer.

   If there has been no packet sent on the candidate pair ICE is using
   for a media component for Tr seconds (where packets include those
   defined for the component (RTP or RTCP) and previous keepalives), an
   agent MUST generate a keepalive on that pair. Tr SHOULD be
   configurable and SHOULD have a default of 15 seconds. Tr MUST NOT be
   configured to less than 15 seconds. Alternatively, if an agent has a
   dynamic way to discover the binding lifetimes of the intervening
   NATs, it can use that value to determine Tr. Administrators
   deploying ICE in more controlled networking environments SHOULD set
   Tr to the longest duration possible in their environment.

   If STUN is being used for keepalives, a STUN Binding Indication is
   used [RFC5389]. The Indication MUST NOT utilize any authentication
   mechanism. It SHOULD contain the FINGERPRINT attribute to aid in
   demultiplexing, but SHOULD NOT contain any other attributes.
   It is used solely to keep the NAT bindings alive. The Binding
   Indication is sent using the same local and remote candidates that
   are being used for media. Though Binding Indications are used for
   keepalives, an agent MUST be prepared to receive a connectivity check
   as well. If a connectivity check is received, a response is
   generated as discussed in [RFC5389], but there is no impact on ICE
   processing otherwise.

   An agent MUST begin the keepalive processing once ICE has selected
   candidates for usage with media, or media begins to flow, whichever
   happens first. Keepalives end once the session terminates or the
   media stream is removed.

10. Media Handling

10.1. Sending Media

   Procedures for sending media differ for full and lite
   implementations.

10.1.1. Procedures for Full Implementations

   Agents always send media using a candidate pair, called the selected
   candidate pair. An agent will send media to the remote candidate in
   the selected pair (setting the destination address and port of the
   packet equal to that remote candidate), and will send it from the
   local candidate of the selected pair. When the local candidate is
   server or peer reflexive, media is originated from the base. Media
   sent from a relayed candidate is sent from the base through that TURN
   server, using procedures defined in [RFC5766].

   If the local candidate is a relayed candidate, it is RECOMMENDED that
   an agent create a channel on the TURN server towards the remote
   candidate. This is done using the procedures for channel creation as
   defined in Section 11 of [RFC5766].

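   The addressing rules at the start of this section can be illustrated
   with a small, non-normative sketch. The Candidate type, its field
   names, and the media_route function are hypothetical; the kind
   labels reuse the candidate type tokens "host", "srflx", "prflx", and
   "relay" only as convenient names.

```python
from dataclasses import dataclass


# Hypothetical candidate record used only for this sketch.
@dataclass
class Candidate:
    address: tuple  # transport address (ip, port) of the candidate
    kind: str       # "host", "srflx", "prflx", or "relay"
    base: tuple     # base transport address; equals address for host


def media_route(local, remote):
    """Return (send_from, send_to, via_turn) for outgoing media."""
    via_turn = local.kind == "relay"
    # A host candidate is its own base; reflexive and relayed candidates
    # originate packets from their base.
    return local.base, remote.address, via_turn


local = Candidate(("203.0.113.5", 40000), "srflx", ("10.0.1.1", 5000))
remote = Candidate(("192.0.2.1", 6000), "host", ("192.0.2.1", 6000))
print(media_route(local, remote))
# (('10.0.1.1', 5000), ('192.0.2.1', 6000), False)
```

   For a relayed local candidate the third element would be True,
   indicating that the packet travels from the base through the TURN
   server.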
   The selected pair for a component of a media stream is:

   o  empty if the state of the check list for that media stream is
      Running, and there is no previous selected pair for that component
      due to an ICE restart

   o  equal to the previous selected pair for a component of a media
      stream if the state of the check list for that media stream is
      Running, and there was a previous selected pair for that component
      due to an ICE restart

   o  equal to the highest-priority nominated pair for that component in
      the valid list if the state of the check list is Completed

   If the selected pair for at least one component of a media stream is
   empty, an agent MUST NOT send media for any component of that media
   stream. If the selected pair for each component of a media stream
   has a value, an agent MAY send media for all components of that media
   stream.

10.1.2. Procedures for Lite Implementations

   A lite implementation MUST NOT send media until it has a valid list
   that contains a candidate pair for each component of that media
   stream. Once that happens, the agent MAY begin sending media
   packets. To do that, it sends media to the remote candidate in the
   pair (setting the destination address and port of the packet equal to
   that remote candidate), and sends it from the local candidate.

10.1.3. Procedures for All Implementations

   ICE has interactions with jitter buffer adaptation mechanisms. An
   RTP stream can begin using one candidate, and switch to another one,
   though this happens rarely with ICE. The newer candidate may result
   in RTP packets taking a different path through the network -- one
   with different delay characteristics. As discussed below, agents are
   encouraged to re-adjust jitter buffers when there are changes in
   source or destination address of media packets.
   Furthermore, many audio codecs use the marker bit to signal the
   beginning of a talkspurt, for the purposes of jitter buffer
   adaptation. For such codecs, it is RECOMMENDED that the sender set
   the marker bit [RFC3550] when an agent switches transmission of media
   from one candidate pair to another.

10.2. Receiving Media

   ICE implementations MUST be prepared to receive media on each
   component on any candidates provided for that component in the most
   recent offer/answer exchange (in the case of RTP, this would include
   both RTP and RTCP if candidates were provided for both).

   It is RECOMMENDED that, when an agent receives an RTP packet with a
   new source or destination IP address for a particular media stream,
   the agent re-adjust its jitter buffers.

   RFC 3550 [RFC3550] describes an algorithm in Section 8.2 for
   detecting synchronization source (SSRC) collisions and loops. These
   algorithms are based, in part, on seeing different source transport
   addresses with the same SSRC. However, when ICE is used, such
   changes will sometimes occur as the media streams switch between
   candidates. An agent will be able to determine that a media stream
   is from the same peer as a consequence of the STUN exchange that
   precedes media transmission. Thus, if there is a change in source
   transport address, but the media packets come from the same peer
   agent, this SHOULD NOT be treated as an SSRC collision.

11. Extensibility Considerations

   This specification makes very specific choices about how both agents
   in a session coordinate to arrive at the set of candidate pairs that
   are selected for media. It is anticipated that future specifications
   will want to alter these algorithms, whether they are simple changes
   like timer tweaks or larger changes like a revamp of the priority
   algorithm.
When such a change is made, providing interoperability 2574 between the two agents in a session is critical. 2576 First, ICE provides the ice-options attribute. Each extension or 2577 change to ICE is associated with a token. When an agent supporting 2578 such an extension or change generates an offer or an answer, it MUST 2579 include the token for that extension in this attribute. This allows 2580 each side to know what the other side is doing. This attribute MUST 2581 NOT be present if the agent doesn't support any ICE extensions or 2582 changes. 2584 One of the complications in achieving interoperability is that ICE 2585 relies on a distributed algorithm running on both agents to converge 2586 on an agreed set of candidate pairs. If the two agents run different 2587 algorithms, it can be difficult to guarantee convergence on the same 2588 candidate pairs. The regular nomination procedure described in 2589 Section 8 eliminates some of the tight coordination by delegating the 2590 selection algorithm completely to the controlling agent. 2591 Consequently, when a controlling agent is communicating with a peer 2592 that supports options it doesn't know about, the agent MUST run a 2593 regular nomination algorithm. When regular nomination is used, ICE 2594 will converge perfectly even when both agents use different pair 2595 prioritization algorithms. One of the keys to such convergence is 2596 triggered checks, which ensure that the nominated pair is validated 2597 by both agents. Consequently, any future ICE enhancements MUST 2598 preserve triggered checks. 2600 ICE is also extensible to other media streams beyond RTP, and for 2601 transport protocols beyond UDP. Extensions to ICE for non-RTP media 2602 streams need to specify how many components they utilize, and assign 2603 component IDs to them, starting at 1 for the most important component 2604 ID. 
Specifications for new transport protocols must define how, if at all, various steps in the ICE processing differ from UDP.

12. Setting Ta and RTO

During the gathering phase of ICE (Section 4.1.1) and while ICE is performing connectivity checks (Section 7), an agent sends STUN and TURN transactions. These transactions are paced at a rate of one every Ta milliseconds, and utilize a specific RTO. This section describes how the values of Ta and RTO are computed. This computation depends on whether ICE is being used with a real-time media stream (such as RTP) or something else. When ICE is used for a stream with a known maximum bandwidth, the computation in Section 12.1 MAY be followed to rate-control the ICE exchanges. For all other streams, the computation in Section 12.2 MUST be followed.

12.1. RTP Media Streams

The values of RTO and Ta change during the lifetime of ICE processing. One set of values applies during the gathering phase, and the other, for connectivity checks.

The value of Ta SHOULD be configurable, and SHOULD have a default of:

   For each media stream i:
    Ta_i = (stun_packet_size / rtp_packet_size) * rtp_ptime

                              1
     Ta = MAX (20ms, ------------------- )
                           k
                         ----
                         \        1
                          >    ------
                         /      Ta_i
                         ----
                          i=1

where k is the number of media streams. During the gathering phase, Ta is computed based on the number of media streams the agent has indicated in its offer or answer, and the RTP packet size and RTP ptime are those of the most preferred codec for each media stream. Once an offer and answer have been exchanged, the agent recomputes Ta to pace the connectivity checks. In that case, the value of Ta is based on the number of media streams that will actually be used in the session, and the RTP packet size and RTP ptime are those of the most preferred codec with which the agent will send.
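The default Ta computation above can be illustrated with a short sketch. This is only an illustration under stated assumptions: the function names and the packet sizes used in the usage note are hypothetical and do not come from this specification; only the two formulas (Ta_i and the MAX with a 20 ms floor) are taken from the text.

```python
def ta_for_stream(stun_packet_size, rtp_packet_size, rtp_ptime_ms):
    """Ta_i = (stun_packet_size / rtp_packet_size) * rtp_ptime, in ms."""
    return (stun_packet_size / rtp_packet_size) * rtp_ptime_ms


def default_ta(streams, floor_ms=20.0):
    """Ta = MAX(20ms, 1 / sum over i of (1 / Ta_i)).

    `streams` is a list with one (stun_packet_size, rtp_packet_size,
    rtp_ptime_ms) tuple per media stream; this tuple layout is an
    assumption made for the sketch.
    """
    if not streams:
        raise ValueError("at least one media stream is required")
    # Sum the per-stream pacing rates (transactions per millisecond).
    rate = sum(1.0 / ta_for_stream(*stream) for stream in streams)
    # The combined interval is the inverse of the summed rate,
    # bounded below by the 20 ms floor.
    return max(floor_ms, 1.0 / rate)
```

With hypothetical sizes, a single stream whose STUN packets are half the size of its RTP packets and whose ptime is 20 ms gives Ta_i = 10 ms, so the 20 ms floor applies. Two streams with Ta_i of 40 ms and 120 ms combine to roughly 30 ms.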
2651 In addition, the retransmission timer for the STUN transactions, RTO, 2652 defined in [RFC5389], SHOULD be configurable and during the gathering 2653 phase, SHOULD have a default of: 2655 RTO = MAX (100ms, Ta * (number of pairs)) 2657 where the number of pairs refers to the number of pairs of candidates 2658 with STUN or TURN servers. 2660 For connectivity checks, RTO SHOULD be configurable and SHOULD have a 2661 default of: 2663 RTO = MAX (100ms, Ta*N * (Num-Waiting + Num-In-Progress)) 2665 where Num-Waiting is the number of checks in the check list in the 2666 Waiting state, and Num-In-Progress is the number of checks in the In- 2667 Progress state. Note that the RTO will be different for each 2668 transaction as the number of checks in the Waiting and In-Progress 2669 states change. 2671 These formulas are aimed at causing STUN transactions to be paced at 2672 the same rate as media. This ensures that ICE will work properly 2673 under the same network conditions needed to support the media as 2674 well. See Appendix B.1 for additional discussion and motivations. 2675 Because of this pacing, it will take a certain amount of time to 2676 obtain all of the server reflexive and relayed candidates. 2677 Implementations should be aware of the time required to do this, and 2678 if the application requires a time budget, limit the number of 2679 candidates that are gathered. 2681 The formulas result in a behavior whereby an agent will send its 2682 first packet for every single connectivity check before performing a 2683 retransmit. This can be seen in the formulas for the RTO (which 2684 represents the retransmit interval). Those formulas scale with N, 2685 the number of checks to be performed. As a result of this, ICE 2686 maintains a nicely constant rate, but becomes more sensitive to 2687 packet loss. 
The loss of the first single packet for any connectivity check is likely to cause that pair to take a long time to be validated, and instead, a lower-priority check (but one for which there was no packet loss) is much more likely to complete first. This results in ICE performing sub-optimally, choosing lower-priority pairs over higher-priority pairs. Implementors should be aware of this consequence, but still should utilize the timer values described here.

12.2. Non-RTP Sessions

In cases where ICE is used to establish some kind of session that is not real time, and has no fixed rate associated with it that is known to work on the network in which ICE is deployed, Ta and RTO revert to more conservative values. Ta SHOULD be configurable, SHOULD have a default of 500 ms, and MUST NOT be configurable to be less than 500 ms.

In addition, the retransmission timer for the STUN transactions, RTO, SHOULD be configurable and during the gathering phase, SHOULD have a default of:

   RTO = MAX (500ms, Ta * (number of pairs))

where the number of pairs refers to the number of pairs of candidates with STUN or TURN servers.

For connectivity checks, RTO SHOULD be configurable and SHOULD have a default of:

   RTO = MAX (500ms, Ta*N * (Num-Waiting + Num-In-Progress))

13. Example

The example is based on the simplified topology of Figure 8.

                          +-------+
                          |STUN   |
                          |Server |
                          +-------+
                              |
                   +---------------------+
                   |                     |
                   |      Internet       |
                   |                     |
                   +---------------------+
                     |                |
                     |                |
              +---------+             |
              |  NAT    |             |
              +---------+             |
                   |                  |
                   |                  |
                +-----+            +-----+
                |  L  |            |  R  |
                +-----+            +-----+

                   Figure 8: Example Topology

Two agents, L and R, are using ICE. Both are full-mode ICE implementations and use aggressive nomination when they are controlling. Both agents have a single IPv4 address.
For agent L, it is 10.0.1.1 in private address space [RFC1918], and for agent R, 192.0.2.1 on the public Internet. Both are configured with the same STUN server (shown in this example for simplicity, although in practice the agents do not need to use the same STUN server), which is listening for STUN Binding requests at an IP address of 192.0.2.2 and port 3478. TURN servers are not used in this example. Agent L is behind a NAT, and agent R is on the public Internet. The NAT has an endpoint-independent mapping property and an address-dependent filtering property. The public side of the NAT has an IP address of 192.0.2.3.

To facilitate understanding, transport addresses are listed using variables that have mnemonic names. The format of the name is entity-type-seqno, where entity refers to the entity whose IP address the transport address is on, and is one of "L", "R", "STUN", or "NAT". The type is either "PUB" for transport addresses that are public, or "PRIV" for transport addresses that are private. Finally, seqno is a sequence number that is different for each transport address of the same type on a particular entity. Each variable has an IP address and port, denoted by varname.IP and varname.PORT, respectively, where varname is the name of the variable.

The STUN server has advertised transport address STUN-PUB-1 (which is 192.0.2.2:3478).

In the call flow itself, STUN messages are annotated with several attributes. The "S=" attribute indicates the source transport address of the message. The "D=" attribute indicates the destination transport address of the message. The "MA=" attribute is used in STUN Binding response messages and refers to the mapped address. "USE-CAND" implies the presence of the USE-CANDIDATE attribute.
The call flow examples omit STUN authentication operations and RTCP, and focus on RTP for a single media stream between two full implementations.

   L             NAT            STUN            R
   |RTP STUN alloc.              |              |
   |(1) STUN Req  |              |              |
   |S=$L-PRIV-1   |              |              |
   |D=$STUN-PUB-1 |              |              |
   |------------->|              |              |
   |              |(2) STUN Req  |              |
   |              |S=$NAT-PUB-1  |              |
   |              |D=$STUN-PUB-1 |              |
   |              |------------->|              |
   |              |(3) STUN Res  |              |
   |              |S=$STUN-PUB-1 |              |
   |              |D=$NAT-PUB-1  |              |
   |              |MA=$NAT-PUB-1 |              |
   |              |<-------------|              |
   |(4) STUN Res  |              |              |
   |S=$STUN-PUB-1 |              |              |
   |D=$L-PRIV-1   |              |              |
   |MA=$NAT-PUB-1 |              |              |
   |<-------------|              |              |
   |(5) Offer     |              |              |
   |------------------------------------------->|
   |              |              |              | RTP STUN
   |              |              |              | alloc.
   |              |              |(6) STUN Req  |
   |              |              |S=$R-PUB-1    |
   |              |              |D=$STUN-PUB-1 |
   |              |              |<-------------|
   |              |              |(7) STUN Res  |
   |              |              |S=$STUN-PUB-1 |
   |              |              |D=$R-PUB-1    |
   |              |              |MA=$R-PUB-1   |
   |              |              |------------->|
   |(8) Answer    |              |              |
   |<-------------------------------------------|
   |              |(9) Bind Req  |              |Begin
   |              |S=$R-PUB-1    |              |Connectivity
   |              |D=$L-PRIV-1   |              |Checks
   |              |<----------------------------|
   |              |Dropped       |              |
   |(10) Bind Req |              |              |
   |S=$L-PRIV-1   |              |              |
   |D=$R-PUB-1    |              |              |
   |USE-CAND      |              |              |
   |------------->|              |              |
   |              |(11) Bind Req |              |
   |              |S=$NAT-PUB-1  |              |
   |              |D=$R-PUB-1    |              |
   |              |USE-CAND      |              |
   |              |---------------------------->|
   |              |(12) Bind Res |              |
   |              |S=$R-PUB-1    |              |
   |              |D=$NAT-PUB-1  |              |
   |              |MA=$NAT-PUB-1 |              |
   |              |<----------------------------|
   |(13) Bind Res |              |              |
   |S=$R-PUB-1    |              |              |
   |D=$L-PRIV-1   |              |              |
   |MA=$NAT-PUB-1 |              |              |
   |<-------------|              |              |
   |RTP flows     |              |              |
   |              |(14) Bind Req |              |
   |              |S=$R-PUB-1    |              |
   |              |D=$NAT-PUB-1  |              |
   |              |<----------------------------|
   |(15) Bind Req |              |              |
   |S=$R-PUB-1    |              |              |
   |D=$L-PRIV-1   |              |              |
   |<-------------|              |              |
   |(16) Bind Res |              |              |
   |S=$L-PRIV-1   |              |              |
   |D=$R-PUB-1    |              |              |
   |MA=$R-PUB-1   |              |              |
   |------------->|              |              |
   |              |(17) Bind Res |              |
   |              |S=$NAT-PUB-1  |              |
   |              |D=$R-PUB-1    |              |
   |              |MA=$R-PUB-1   |              |
   |              |---------------------------->|
   |              |              |              |RTP flows

                      Figure 9: Example Flow

First, agent L obtains a host candidate from its local IP address (not shown), and from that, sends a STUN Binding request to the STUN server to get a server reflexive candidate (messages 1-4). Recall that the NAT has the endpoint-independent mapping property. Here, it creates a binding of NAT-PUB-1 for this UDP request, and this becomes the server reflexive candidate for RTP.

Agent L sets a type preference of 126 for the host candidate and 100 for the server reflexive. The local preference is 65535. Based on this, the priority of the host candidate is 2130706431 and that of the server reflexive candidate is 1694498815. The host candidate is assigned a foundation of 1, and the server reflexive, a foundation of 2. These are sent to the peer in an offer.

This offer is received at agent R. Agent R will obtain a host candidate, and from it, obtain a server reflexive candidate (messages 6-7). Since R is not behind a NAT, this candidate is identical to its host candidate, and they share the same base. It therefore discards this redundant candidate and ends up with a single host candidate. With identical type and local preferences as L, the priority for this candidate is 2130706431. It chooses a foundation of 1 for its single candidate. The answerer's candidates are then sent to the offerer.

Since neither side indicated that it is lite, the agent that sent the offer that began ICE processing (agent L) becomes the controlling agent.

Agents L and R both pair up the candidates. They both initially have two pairs.
However, agent L will prune the pair containing its server reflexive candidate, resulting in just one. At agent L, this pair has a local candidate of $L-PRIV-1 and remote candidate of $R-PUB-1, and has a candidate pair priority of 4.57566E+18 (note that an implementation would represent this as a 64-bit integer so as not to lose precision). At agent R, there are two pairs. The higher-priority pair has a local candidate of $R-PUB-1 and remote candidate of $L-PRIV-1 and has a priority of 4.57566E+18, and the second has a local candidate of $R-PUB-1 and remote candidate of $NAT-PUB-1 and priority 3.63891E+18.

Agent R begins its connectivity check (message 9) for the first pair (between the two host candidates). Since R is the controlled agent for this session, the check omits the USE-CANDIDATE attribute. The host candidate from agent L is private and behind a NAT, and thus this check won't be successful, because the packet cannot be routed from R to L.

When agent L gets the answer, it performs its one and only connectivity check (messages 10-13). It implements the aggressive nomination algorithm, and thus includes a USE-CANDIDATE attribute in this check. Since the check succeeds, agent L creates a new pair, whose local candidate is from the mapped address in the Binding response (NAT-PUB-1 from message 13) and whose remote candidate is the destination of the request (R-PUB-1 from message 10). This is added to the Valid list. In addition, it is marked as selected since the Binding request contained the USE-CANDIDATE attribute. Since there is a selected pair in the Valid list for the one component of this media stream, ICE processing for this stream moves into the Completed state. Agent L can now send media if it so chooses.
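The candidate priorities quoted earlier in this example (2130706431 for the host candidates and 1694498815 for agent L's server reflexive candidate) can be reproduced with a short sketch. It assumes the candidate priority formula from the prioritization procedures of this specification, priority = 2^24 * (type preference) + 2^8 * (local preference) + (256 - component ID), and a component ID of 1 for RTP. The function name is hypothetical.

```python
def candidate_priority(type_pref, local_pref, component_id):
    """Candidate priority as a 32-bit value.

    Assumed formula (from the candidate prioritization section, not
    from this example): 2^24 * type_pref + 2^8 * local_pref
    + (256 - component_id).
    """
    if not 0 <= type_pref <= 126:
        raise ValueError("type preference must be in 0..126")
    if not 0 <= local_pref <= 65535:
        raise ValueError("local preference must be in 0..65535")
    if not 1 <= component_id <= 256:
        raise ValueError("component ID must be in 1..256")
    return (type_pref << 24) + (local_pref << 8) + (256 - component_id)


# Values taken from the example: type preference 126 (host) or
# 100 (server reflexive), local preference 65535, RTP component ID 1.
host_priority = candidate_priority(126, 65535, 1)
srflx_priority = candidate_priority(100, 65535, 1)
```

Under these assumptions, host_priority evaluates to 2130706431 and srflx_priority to 1694498815, matching the values given for agents L and R.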
2928 Soon after receipt of the STUN Binding request from agent L (message 2929 11), agent R will generate its triggered check. This check happens 2930 to match the next one on its check list -- from its host candidate to 2931 agent L's server reflexive candidate. This check (messages 14-17) 2932 will succeed. Consequently, agent R constructs a new candidate pair 2933 using the mapped address from the response as the local candidate 2934 (R-PUB-1) and the destination of the request (NAT-PUB-1) as the 2935 remote candidate. This pair is added to the Valid list for that 2936 media stream. Since the check was generated in the reverse direction 2937 of a check that contained the USE-CANDIDATE attribute, the candidate 2938 pair is marked as selected. Consequently, processing for this stream 2939 moves into the Completed state, and agent R can also send media. 2941 14. Security Considerations 2943 There are several types of attacks possible in an ICE system. This 2944 section considers these attacks and their countermeasures. These 2945 countermeasures include: 2947 o Using ICE in conjunction with secure signaling techniques, such as 2948 SIPS. 2950 o Limiting the total number of connectivity checks to 100, and 2951 optionally limiting the number of candidates they'll accept in an 2952 offer or answer. 2954 14.1. Attacks on Connectivity Checks 2956 An attacker might attempt to disrupt the STUN connectivity checks. 2957 Ultimately, all of these attacks fool an agent into thinking 2958 something incorrect about the results of the connectivity checks. 2959 The possible false conclusions an attacker can try and cause are: 2961 False Invalid: An attacker can fool a pair of agents into thinking a 2962 candidate pair is invalid, when it isn't. This can be used to 2963 cause an agent to prefer a different candidate (such as one 2964 injected by the attacker) or to disrupt a call by forcing all 2965 candidates to fail. 
2967 False Valid: An attacker can fool a pair of agents into thinking a 2968 candidate pair is valid, when it isn't. This can cause an agent 2969 to proceed with a session, but then not be able to receive any 2970 media. 2972 False Peer Reflexive Candidate: An attacker can cause an agent to 2973 discover a new peer reflexive candidate, when it shouldn't have. 2974 This can be used to redirect media streams to a Denial-of-Service 2975 (DoS) target or to the attacker, for eavesdropping or other 2976 purposes. 2978 False Valid on False Candidate: An attacker has already convinced an 2979 agent that there is a candidate with an address that doesn't 2980 actually route to that agent (for example, by injecting a false 2981 peer reflexive candidate or false server reflexive candidate). It 2982 must then launch an attack that forces the agents to believe that 2983 this candidate is valid. 2985 If an attacker can cause a false peer reflexive candidate or false 2986 valid on a false candidate, it can launch any of the attacks 2987 described in [RFC5389]. 2989 To force the false invalid result, the attacker has to wait for the 2990 connectivity check from one of the agents to be sent. When it is, 2991 the attacker needs to inject a fake response with an unrecoverable 2992 error response, such as a 400. However, since the candidate is, in 2993 fact, valid, the original request may reach the peer agent, and 2994 result in a success response. The attacker needs to force this 2995 packet or its response to be dropped, through a DoS attack, layer 2 2996 network disruption, or other technique. If it doesn't do this, the 2997 success response will also reach the originator, alerting it to a 2998 possible attack. Fortunately, this attack is mitigated completely 2999 through the STUN short-term credential mechanism. The attacker needs 3000 to inject a fake response, and in order for this response to be 3001 processed, the attacker needs the password. 
If the offer/answer signaling is secured, the attacker will not have the password and its response will be discarded.

Forcing the false valid result works in a similar way. The attacker needs to wait for the Binding request from each agent, and inject a fake success response. The attacker won't need to worry about disrupting the actual response since, if the candidate is not valid, it presumably wouldn't be received anyway. However, like the false invalid attack, this attack is mitigated by the STUN short-term credential mechanism in conjunction with a secure offer/answer exchange.

Forcing the false peer reflexive candidate result can be done either with fake requests or responses, or with replays. We consider the fake requests and responses case first. It requires the attacker to send a Binding request to one agent with a source IP address and port for the false candidate. In addition, the attacker must wait for a Binding request from the other agent, and generate a fake response with an XOR-MAPPED-ADDRESS attribute containing the false candidate. Like the other attacks described here, this attack is mitigated by the STUN message integrity mechanisms and secure offer/answer exchanges.

Forcing the false peer reflexive candidate result with packet replays is different. The attacker waits until one of the agents sends a check. It intercepts this request, and replays it towards the other agent with a faked source IP address. It must also prevent the original request from reaching the remote agent, either by launching a DoS attack to cause the packet to be dropped, or forcing it to be dropped using layer 2 mechanisms. The replayed packet is received at the other agent, and accepted, since the integrity check passes (the integrity check cannot and does not cover the source IP address and port). It is then responded to.
This response will contain an XOR-MAPPED-ADDRESS with the false candidate, and will be sent to that false candidate. The attacker must then receive it and relay it towards the originator.

The other agent will then initiate a connectivity check towards that false candidate. This validation needs to succeed. This requires the attacker to force a false valid on a false candidate. Injection of fake requests or responses to achieve this goal is prevented using the integrity mechanisms of STUN and the offer/answer exchange. Thus, this attack can only be launched through replays. To do that, the attacker must intercept the check towards this false candidate, and replay it towards the other agent. Then, it must intercept the response and replay that back as well.

This attack is very hard to launch unless the attacker is identified by the fake candidate. This is because it requires the attacker to intercept and replay packets sent by two different hosts. If both agents are on different networks (for example, across the public Internet), this attack can be hard to coordinate, since it needs to occur against two different endpoints on different parts of the network at the same time.

If the attacker itself is identified by the fake candidate, the attack is easier to coordinate. However, if the media path is secured (e.g., using SRTP [RFC3711]), the attacker will not be able to play the media packets, but will only be able to discard them, effectively disabling the media stream for the call. However, this attack requires the attacker to disrupt packets in order to block the connectivity check from reaching the target. In that case, if the goal is to disrupt the media stream, it's much easier to just disrupt it with the same mechanism, rather than attack ICE.

14.2.
Attacks on Server Reflexive Address Gathering 3069 ICE endpoints make use of STUN Binding requests for gathering server 3070 reflexive candidates from a STUN server. These requests are not 3071 authenticated in any way. As a consequence, there are numerous 3072 techniques an attacker can employ to provide the client with a false 3073 server reflexive candidate: 3075 o An attacker can compromise the DNS, causing DNS queries to return 3076 a rogue STUN server address. That server can provide the client 3077 with fake server reflexive candidates. This attack is mitigated 3078 by DNS security, though DNS-SEC is not required to address it. 3080 o An attacker that can observe STUN messages (such as an attacker on 3081 a shared network segment, like WiFi) can inject a fake response 3082 that is valid and will be accepted by the client. 3084 o An attacker can compromise a STUN server by means of a virus, and 3085 cause it to send responses with incorrect mapped addresses. 3087 A false mapped address learned by these attacks will be used as a 3088 server reflexive candidate in the ICE exchange. For this candidate 3089 to actually be used for media, the attacker must also attack the 3090 connectivity checks, and in particular, force a false valid on a 3091 false candidate. This attack is very hard to launch if the false 3092 address identifies a fourth party (neither the offerer, answerer, nor 3093 attacker), since it requires attacking the checks generated by each 3094 agent in the session, and is prevented by SRTP if it identifies the 3095 attacker themself. 3097 If the attacker elects not to attack the connectivity checks, the 3098 worst it can do is prevent the server reflexive candidate from being 3099 used. However, if the peer agent has at least one candidate that is 3100 reachable by the agent under attack, the STUN connectivity checks 3101 themselves will provide a peer reflexive candidate that can be used 3102 for the exchange of media. 
Peer reflexive candidates are generally preferred over server reflexive candidates. As such, an attack solely on the STUN address gathering will normally have no impact on a session at all.

14.3. Attacks on Relayed Candidate Gathering

An attacker might attempt to disrupt the gathering of relayed candidates, forcing the client to believe it has a false relayed candidate. Exchanges with the TURN server are authenticated using a long-term credential. Consequently, injection of fake responses or requests will not work. In addition, unlike Binding requests, Allocate requests are not susceptible to replay attacks with modified source IP addresses and ports, since the source IP address and port are not utilized to provide the client with its relayed candidate.

However, TURN servers are susceptible to DNS attacks, or to viruses aimed at the TURN server, for purposes of turning it into a zombie or rogue server. These attacks can be mitigated by DNS-SEC and through good host and software security on TURN servers.

Even if an attacker has caused the client to believe in a false relayed candidate, the connectivity checks cause such a candidate to be used only if they succeed. Thus, an attacker must launch a false valid on a false candidate, per above, which is a very difficult attack to coordinate.

14.4. Insider Attacks

In addition to attacks where the attacker is a third party trying to insert fake offers, answers, or STUN messages, there are attacks possible with ICE when the attacker is an authenticated and valid participant in the ICE exchange.

14.4.1. STUN Amplification Attack

The STUN amplification attack is similar to the voice hammer attack. However, instead of voice packets being directed to the target, STUN connectivity checks are directed to the target.
The attacker sends 3141 an offer with a large number of candidates, say, 50. The answerer 3142 receives the offer, and starts its checks, which are directed at the 3143 target, and consequently, never generate a response. The answerer 3144 will start a new connectivity check every Ta ms (say, Ta=20ms). 3145 However, the retransmission timers are set to a large number due to 3146 the large number of candidates. As a consequence, packets will be 3147 sent at an interval of one every Ta milliseconds, and then with 3148 increasing intervals after that. Thus, STUN will not send packets at 3149 a rate faster than media would be sent, and the STUN packets persist 3150 only briefly, until ICE fails for the session. Nonetheless, this is 3151 an amplification mechanism. 3153 It is impossible to eliminate the amplification, but the volume can 3154 be reduced through a variety of heuristics. Agents SHOULD limit the 3155 total number of connectivity checks they perform to 100. 3156 Additionally, agents MAY limit the number of candidates they'll 3157 accept in an offer or answer. 3159 Frequently, protocols that wish to avoid these kinds of attacks force 3160 the initiator to wait for a response prior to sending the next 3161 message. However, in the case of ICE, this is not possible. It is 3162 not possible to differentiate the following two cases: 3164 o There was no response because the initiator is being used to 3165 launch a DoS attack against an unsuspecting target that will not 3166 respond. 3168 o There was no response because the IP address and port are not 3169 reachable by the initiator. 3171 In the second case, another check should be sent at the next 3172 opportunity, while in the former case, no further checks should be 3173 sent. 3175 15. STUN Extensions 3177 15.1. New Attributes 3179 This specification defines four new attributes, PRIORITY, USE- 3180 CANDIDATE, ICE-CONTROLLED, and ICE-CONTROLLING. 
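As a rough illustration of how the four attributes named above appear on the wire, the following sketch encodes them as STUN type-length-value attributes (16-bit type, 16-bit length, value padded to a 32-bit boundary). The PRIORITY and USE-CANDIDATE type values are the ones given in this section. The ICE-CONTROLLED (0x8029) and ICE-CONTROLLING (0x802A) type values are not stated in this section and are taken here from the IANA STUN attribute registry; treat them, and all function names, as assumptions of the sketch.

```python
import struct

PRIORITY = 0x0024
USE_CANDIDATE = 0x0025
ICE_CONTROLLED = 0x8029   # assumption: value from the IANA registry
ICE_CONTROLLING = 0x802A  # assumption: value from the IANA registry


def encode_attr(attr_type, value=b""):
    """Encode one STUN attribute: type, length, value, zero padding."""
    padding = (-len(value)) % 4  # pad the value to a 4-byte boundary
    header = struct.pack("!HH", attr_type, len(value))
    return header + value + b"\x00" * padding


def priority_attr(priority):
    """PRIORITY carries a 32-bit unsigned integer."""
    return encode_attr(PRIORITY, struct.pack("!I", priority))


def use_candidate_attr():
    """USE-CANDIDATE is a flag: its Length field is zero."""
    return encode_attr(USE_CANDIDATE)


def ice_controlled_attr(tie_breaker):
    """ICE-CONTROLLED carries a 64-bit tie-breaker, network byte order."""
    return encode_attr(ICE_CONTROLLED, struct.pack("!Q", tie_breaker))


def ice_controlling_attr(tie_breaker):
    """ICE-CONTROLLING carries a 64-bit tie-breaker, network byte order."""
    return encode_attr(ICE_CONTROLLING, struct.pack("!Q", tie_breaker))
```

A real Binding request would also need the STUN header and the authentication attributes, which this sketch leaves out.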
3182 The PRIORITY attribute indicates the priority that is to be 3183 associated with a peer reflexive candidate, should one be discovered 3184 by this check. It is a 32-bit unsigned integer, and has an attribute 3185 value of 0x0024. 3187 The USE-CANDIDATE attribute indicates that the candidate pair 3188 resulting from this check should be used for transmission of media. 3189 The attribute has no content (the Length field of the attribute is 3190 zero); it serves as a flag. It has an attribute value of 0x0025. 3192 The ICE-CONTROLLED attribute is present in a Binding request and 3193 indicates that the client believes it is currently in the controlled 3194 role. The content of the attribute is a 64-bit unsigned integer in 3195 network byte order, which contains a random number used for tie- 3196 breaking of role conflicts. 3198 The ICE-CONTROLLING attribute is present in a Binding request and 3199 indicates that the client believes it is currently in the controlling 3200 role. The content of the attribute is a 64-bit unsigned integer in 3201 network byte order, which contains a random number used for tie- 3202 breaking of role conflicts. 3204 15.2. New Error Response Codes 3206 This specification defines a single error response code: 3208 487 (Role Conflict): The Binding request contained either the ICE- 3209 CONTROLLING or ICE-CONTROLLED attribute, indicating a role that 3210 conflicted with the server. The server ran a tie-breaker based on 3211 the tie-breaker value in the request and determined that the 3212 client needs to switch roles. 3214 16. Operational Considerations 3216 This section discusses issues relevant to network operators looking 3217 to deploy ICE. 3219 16.1. NAT and Firewall Types 3221 ICE was designed to work with existing NAT and firewall equipment. 3222 Consequently, it is not necessary to replace or reconfigure existing 3223 firewall and NAT equipment in order to facilitate deployment of ICE. 
3224 Indeed, ICE was developed to be deployed in environments where the 3225 Voice over IP (VoIP) operator has no control over the IP network 3226 infrastructure, including firewalls and NAT. 3228 That said, ICE works best in environments where the NAT devices are 3229 "behave" compliant, meeting the recommendations defined in [RFC4787] 3230 and [RFC5382]. In networks with behave-compliant NAT, ICE will work 3231 without the need for a TURN server, thus improving voice quality, 3232 decreasing call setup times, and reducing the bandwidth demands on 3233 the network operator. 3235 16.2. Bandwidth Requirements 3237 Deployment of ICE can have several interactions with available 3238 network capacity that operators should take into consideration. 3240 16.2.1. STUN and TURN Server Capacity Planning 3242 First and foremost, ICE makes use of TURN and STUN servers, which 3243 would typically be located in the network operator's data centers. 3244 The STUN servers require relatively little bandwidth. For each 3245 component of each media stream, there will be one or more STUN 3246 transactions from each client to the STUN server. In a basic voice- 3247 only IPv4 VoIP deployment, there will be four transactions per call 3248 (one for RTP and one for RTCP, for both caller and callee). Each 3249 transaction is a single request and a single response, the former 3250 being 20 bytes long, and the latter, 28. Consequently, if a system 3251 has N users, and each makes four calls in a busy hour, this would 3252 require N*1.7bps. For one million users, this is 1.7 Mbps, a very 3253 small number (relatively speaking). 3255 TURN traffic is more substantial. The TURN server will see traffic 3256 volume equal to the STUN volume (indeed, if TURN servers are 3257 deployed, there is no need for a separate STUN server), in addition 3258 to the traffic for the actual media traffic. 
The number of calls 3259 requiring TURN for media relay is highly dependent on network 3260 topologies, and can and will vary over time. In a network with 100% 3261 behave-compliant NAT, it is exactly zero. At time of writing, large- 3262 scale consumer deployments were seeing between 5 and 10 percent of 3263 calls requiring TURN servers. Considering a voice-only deployment 3264 using G.711 (so 80 kbps in each direction), with 0.2 erlangs during 3265 the busy hour, this is N*3.2 kbps. For a population of one million 3266 users, this is 3.2 Gbps, assuming a 10% usage of TURN servers. 3268 16.2.2. Gathering and Connectivity Checks 3270 The processes of gathering candidates and performing connectivity 3271 checks can be bandwidth intensive. ICE has been designed to pace 3272 both of these processes. The gathering phase and the connectivity 3273 check phase are meant to generate traffic at roughly the same 3274 bandwidth as the media traffic itself. This was done to ensure that, 3275 if a network is designed to support multimedia traffic of a certain 3276 type (voice, video, or just text), it will have sufficient capacity 3277 to support the ICE checks for that media. Of course, the ICE checks 3278 will cause a marginal increase in the total utilization; however, 3279 this will typically be an extremely small increase. 3281 Congestion due to the gathering and check phases has proven to be a 3282 problem in deployments that did not utilize pacing. Typically, 3283 access links became congested as the endpoints flooded the network 3284 with checks as fast as they could send them. Consequently, network 3285 operators should make sure that their ICE implementations support the 3286 pacing feature. Though this pacing does increase call setup times, 3287 it makes ICE network friendly and easier to deploy. 3289 16.2.3. Keepalives 3291 STUN keepalives (in the form of STUN Binding Indications) are sent in 3292 the middle of a media session.
However, they are sent only in the 3293 absence of actual media traffic. In deployments that are not 3294 utilizing Voice Activity Detection (VAD), the keepalives are never 3295 used and there is no increase in bandwidth usage. When VAD is being 3296 used, keepalives will be sent during silence periods. This involves 3297 a single packet every 15-20 seconds, far less than the packet every 3298 20-30 ms that is sent when there is voice. Therefore, keepalives 3299 don't have any real impact on capacity planning. 3301 16.3. ICE and ICE-lite 3303 Deployments utilizing a mix of ICE and ICE-lite interoperate 3304 perfectly. They have been explicitly designed to do so, without loss 3305 of function. 3307 However, ICE-lite can only be deployed in limited use cases. Those 3308 cases, and the caveats involved in doing so, are documented in 3309 Appendix A. 3311 16.4. Troubleshooting and Performance Management 3313 ICE utilizes end-to-end connectivity checks, and places much of the 3314 processing in the endpoints. This introduces a challenge to the 3315 network operator -- how can they troubleshoot ICE deployments? How 3316 can they know how ICE is performing? 3318 ICE has built-in features to help deal with these problems. SIP 3319 servers on the signaling path, typically deployed in the data centers 3320 of the network operator, will see the contents of the offer/answer 3321 exchanges that convey the ICE parameters. These parameters include 3322 the type of each candidate (host, server reflexive, or relayed), 3323 along with their related addresses. Once ICE processing has 3324 completed, an updated offer/answer exchange takes place, signaling 3325 the selected address (and its type). This updated re-INVITE is 3326 performed exactly for the purposes of educating network equipment 3327 (such as a diagnostic tool attached to a SIP server) about the 3328 results of ICE processing. 
3330 As a consequence, through the logs generated by the SIP server, a 3331 network operator can observe what types of candidates are being used 3332 for each call, and what address was selected by ICE. This is the 3333 primary information that helps evaluate how ICE is performing. 3335 16.5. Endpoint Configuration 3337 ICE relies on several pieces of data being configured into the 3338 endpoints. This configuration data includes timers, credentials for 3339 TURN servers, and hostnames for STUN and TURN servers. ICE itself 3340 does not provide a mechanism for this configuration. Instead, it is 3341 assumed that this information is attached to whatever mechanism is 3342 used to configure all of the other parameters in the endpoint. For 3343 SIP phones, standard solutions such as the configuration framework 3344 [RFC6080] have been defined. 3346 17. IANA Considerations 3348 The original ICE specification registered four new STUN attributes, 3349 and one new STUN error response. The STUN attributes and error 3350 response are reproduced here. 3352 17.1. STUN Attributes 3354 IANA has registered four STUN attributes: 3356 0x0024 PRIORITY 3357 0x0025 USE-CANDIDATE 3358 0x8029 ICE-CONTROLLED 3359 0x802A ICE-CONTROLLING 3361 17.2. STUN Error Responses 3363 IANA has registered the following STUN error response code: 3365 487 Role Conflict: The client asserted an ICE role (controlling or 3366 controlled) that is in conflict with the role of the server. 3368 18. IAB Considerations 3370 The IAB has studied the problem of "Unilateral Self-Address Fixing", 3371 which is the general process by which an agent attempts to determine 3372 its address in another realm on the other side of a NAT through a 3373 collaborative protocol reflection mechanism [RFC3424]. ICE is an 3374 example of a protocol that performs this type of function.
3375 Interestingly, the process for ICE is not unilateral, but bilateral, 3376 and the difference has a significant impact on the issues raised by 3377 the IAB. Indeed, ICE can be considered a B-SAF (Bilateral Self-Address 3378 Fixing) protocol, rather than an UNSAF protocol. Regardless, the IAB 3379 has mandated that any protocols developed for this purpose document a 3380 specific set of considerations. This section meets those 3381 requirements. 3383 18.1. Problem Definition 3385 From RFC 3424, any UNSAF proposal must provide: 3387 Precise definition of a specific, limited-scope problem that is to 3388 be solved with the UNSAF proposal. A short-term fix should not be 3389 generalized to solve other problems; this is why "short-term fixes 3390 usually aren't". 3392 The specific problems being solved by ICE are: 3394 Provide a means for two peers to determine the set of transport 3395 addresses that can be used for communication. 3397 Provide a means for an agent to determine an address that is 3398 reachable by another peer with which it wishes to communicate. 3400 18.2. Exit Strategy 3402 From RFC 3424, any UNSAF proposal must provide: 3404 Description of an exit strategy/transition plan. The better 3405 short-term fixes are the ones that will naturally see less and 3406 less use as the appropriate technology is deployed. 3408 ICE itself doesn't easily get phased out. However, it is useful even 3409 in a globally connected Internet, to serve as a means for detecting 3410 whether a router failure has temporarily disrupted connectivity, for 3411 example. ICE also helps prevent certain security attacks that have 3412 nothing to do with NAT. However, what ICE does is help phase out 3413 other UNSAF mechanisms. ICE effectively selects amongst those 3414 mechanisms, prioritizing ones that are better, and deprioritizing 3415 ones that are worse. Local IPv6 addresses can be preferred.
As NATs 3416 begin to dissipate as IPv6 is introduced, server reflexive and 3417 relayed candidates (both forms of UNSAF addresses) simply never get 3418 used, because higher-priority connectivity exists to the native host 3419 candidates. Therefore, the servers get used less and less, and can 3420 eventually be removed when their usage goes to zero. 3422 Indeed, ICE can assist in the transition from IPv4 to IPv6. It can 3423 be used to determine whether to use IPv6 or IPv4 when two dual-stack 3424 hosts communicate with SIP (IPv6 gets used). It can also allow a 3425 network with both 6to4 and native v6 connectivity to determine which 3426 address to use when communicating with a peer. 3428 18.3. Brittleness Introduced by ICE 3430 From RFC 3424, any UNSAF proposal must provide: 3432 Discussion of specific issues that may render systems more 3433 "brittle". For example, approaches that involve using data at 3434 multiple network layers create more dependencies, increase 3435 debugging challenges, and make it harder to transition. 3437 ICE actually removes brittleness from existing UNSAF mechanisms. In 3438 particular, classic STUN (as described in RFC 3489 [RFC3489]) has 3439 several points of brittleness. One of them is the discovery process 3440 that requires an agent to try to classify the type of NAT it is 3441 behind. This process is error-prone. With ICE, that discovery 3442 process is simply not used. Rather than unilaterally assessing the 3443 validity of the address, its validity is dynamically determined by 3444 measuring connectivity to a peer. The process of determining 3445 connectivity is very robust. 3447 Another point of brittleness in classic STUN and any other unilateral 3448 mechanism is its absolute reliance on an additional server. ICE 3449 makes use of a server for allocating unilateral addresses, but allows 3450 agents to directly connect if possible.
Therefore, in some cases, 3451 the failure of a STUN server would still allow for a call to progress 3452 when ICE is used. 3454 Another point of brittleness in classic STUN is that it assumes that 3455 the STUN server is on the public Internet. Interestingly, with ICE, 3456 that is not necessary. There can be a multitude of STUN servers in a 3457 variety of address realms. ICE will discover the one that has 3458 provided a usable address. 3460 The most troubling point of brittleness in classic STUN is that it 3461 doesn't work in all network topologies. In cases where there is a 3462 shared NAT between each agent and the STUN server, traditional STUN 3463 may not work. With ICE, that restriction is removed. 3465 Classic STUN also introduces some security considerations. 3466 Fortunately, those security considerations are also mitigated by ICE. 3468 Consequently, ICE serves to repair the brittleness introduced in 3469 classic STUN, and does not introduce any additional brittleness into 3470 the system. 3472 The penalty of these improvements is that ICE increases session 3473 establishment times. 3475 18.4. Requirements for a Long-Term Solution 3477 From RFC 3424, any UNSAF proposal must provide: 3479 ... requirements for longer term, sound technical solutions -- 3480 contribute to the process of finding the right longer term 3481 solution. 3483 Our conclusions from RFC 3489 remain unchanged. However, we feel ICE 3484 actually helps because we believe it can be part of the long-term 3485 solution. 3487 18.5. Issues with Existing NAPT Boxes 3489 From RFC 3424, any UNSAF proposal must provide: 3491 Discussion of the impact of the noted practical issues with 3492 existing, deployed NA[P]Ts and experience reports. 3494 A number of NAT boxes are now being deployed into the market that try 3495 to provide "generic" ALG functionality. These generic ALGs hunt for 3496 IP addresses, either in text or binary form within a packet, and 3497 rewrite them if they match a binding. 
This interferes with classic 3498 STUN. However, the update to STUN [RFC5389] uses an encoding that 3499 hides these binary addresses from generic ALGs. 3501 Existing NAPT boxes have non-deterministic and typically short 3502 expiration times for UDP-based bindings. This requires 3503 implementations to send periodic keepalives to maintain those 3504 bindings. ICE uses a default of 15 s, which is a very conservative 3505 estimate. Eventually, over time, as NAT boxes become compliant with 3506 behave [RFC4787], this minimum keepalive will become deterministic 3507 and well-known, and the ICE timers can be adjusted. Having a way to 3508 discover and control the minimum keepalive interval would be far 3509 better still. 3511 19. Changes from RFC 5245 3513 The following is the list of changes from RFC 5245: 3515 o The specification was generalized to be more usable with any 3516 protocol and the parts that are specific to SIP and SDP were moved 3517 to a SIP/SDP usage document 3518 [I-D.petithuguenin-mmusic-ice-sip-sdp]. 3520 o Default candidates, multiple components, ICE mismatch detection, 3521 subsequent offer/answer, and role conflict resolution were made 3522 optional since they are not needed with every protocol using ICE. 3524 o With IPv6, the precedence rules of RFC 6724 are used instead of 3525 the obsoleted RFC 3484. 3527 20. Acknowledgements 3529 Most of the text in this document comes from the original ICE 3530 specification, RFC 5245. The authors would like to thank everyone 3531 who has contributed to that document. 3533 21. References 3535 21.1. Normative References 3537 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 3538 Requirement Levels", BCP 14, RFC 2119, March 1997. 3540 [RFC5389] Rosenberg, J., Mahy, R., Matthews, P., and D. Wing, 3541 "Session Traversal Utilities for NAT (STUN)", RFC 5389, 3542 October 2008. 3544 [RFC5766] Mahy, R., Matthews, P., and J.
Rosenberg, "Traversal Using 3545 Relays around NAT (TURN): Relay Extensions to Session 3546 Traversal Utilities for NAT (STUN)", RFC 5766, April 2010. 3548 [RFC6724] Thaler, D., Draves, R., Matsumoto, A., and T. Chown, 3549 "Default Address Selection for Internet Protocol Version 6 3550 (IPv6)", RFC 6724, September 2012. 3552 21.2. Informative References 3554 [RFC3605] Huitema, C., "Real Time Control Protocol (RTCP) attribute 3555 in Session Description Protocol (SDP)", RFC 3605, 3556 October 2003. 3558 [RFC3261] Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston, 3559 A., Peterson, J., Sparks, R., Handley, M., and E. 3560 Schooler, "SIP: Session Initiation Protocol", RFC 3261, 3561 June 2002. 3563 [RFC3264] Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model 3564 with Session Description Protocol (SDP)", RFC 3264, 3565 June 2002. 3567 [RFC3489] Rosenberg, J., Weinberger, J., Huitema, C., and R. Mahy, 3568 "STUN - Simple Traversal of User Datagram Protocol (UDP) 3569 Through Network Address Translators (NATs)", RFC 3489, 3570 March 2003. 3572 [RFC3235] Senie, D., "Network Address Translator (NAT)-Friendly 3573 Application Design Guidelines", RFC 3235, January 2002. 3575 [RFC3303] Srisuresh, P., Kuthan, J., Rosenberg, J., Molitor, A., and 3576 A. Rayhan, "Middlebox communication architecture and 3577 framework", RFC 3303, August 2002. 3579 [RFC3102] Borella, M., Lo, J., Grabelsky, D., and G. Montenegro, 3580 "Realm Specific IP: Framework", RFC 3102, October 2001. 3582 [RFC3103] Borella, M., Grabelsky, D., Lo, J., and K. Taniguchi, 3583 "Realm Specific IP: Protocol Specification", RFC 3103, 3584 October 2001. 3586 [RFC3424] Daigle, L. and IAB, "IAB Considerations for UNilateral 3587 Self-Address Fixing (UNSAF) Across Network Address 3588 Translation", RFC 3424, November 2002. 3590 [RFC3550] Schulzrinne, H., Casner, S., Frederick, R., and V. 3591 Jacobson, "RTP: A Transport Protocol for Real-Time 3592 Applications", STD 64, RFC 3550, July 2003. 
3594 [RFC3711] Baugher, M., McGrew, D., Naslund, M., Carrara, E., and K. 3595 Norrman, "The Secure Real-time Transport Protocol (SRTP)", 3596 RFC 3711, March 2004. 3598 [RFC3056] Carpenter, B. and K. Moore, "Connection of IPv6 Domains 3599 via IPv4 Clouds", RFC 3056, February 2001. 3601 [RFC3389] Zopf, R., "Real-time Transport Protocol (RTP) Payload for 3602 Comfort Noise (CN)", RFC 3389, September 2002. 3604 [RFC4091] Camarillo, G. and J. Rosenberg, "The Alternative Network 3605 Address Types (ANAT) Semantics for the Session Description 3606 Protocol (SDP) Grouping Framework", RFC 4091, June 2005. 3608 [RFC4092] Camarillo, G. and J. Rosenberg, "Usage of the Session 3609 Description Protocol (SDP) Alternative Network Address 3610 Types (ANAT) Semantics in the Session Initiation Protocol 3611 (SIP)", RFC 4092, June 2005. 3613 [RFC4566] Handley, M., Jacobson, V., and C. Perkins, "SDP: Session 3614 Description Protocol", RFC 4566, July 2006. 3616 [RFC2475] Blake, S., Black, D., Carlson, M., Davies, E., Wang, Z., 3617 and W. Weiss, "An Architecture for Differentiated 3618 Services", RFC 2475, December 1998. 3620 [RFC1918] Rekhter, Y., Moskowitz, R., Karrenberg, D., Groot, G., and 3621 E. Lear, "Address Allocation for Private Internets", 3622 BCP 5, RFC 1918, February 1996. 3624 [RFC4787] Audet, F. and C. Jennings, "Network Address Translation 3625 (NAT) Behavioral Requirements for Unicast UDP", BCP 127, 3626 RFC 4787, January 2007. 3628 [I-D.ietf-avt-rtp-no-op] 3629 Andreasen, F., "A No-Op Payload Format for RTP", 3630 draft-ietf-avt-rtp-no-op-04 (work in progress), May 2007. 3632 [RFC5761] Perkins, C. and M. Westerlund, "Multiplexing RTP Data and 3633 Control Packets on a Single Port", RFC 5761, April 2010. 3635 [RFC4103] Hellstrom, G. and P. Jones, "RTP Payload for Text 3636 Conversation", RFC 4103, June 2005. 3638 [RFC5382] Guha, S., Biswas, K., Ford, B., Sivakumar, S., and P. 
3639 Srisuresh, "NAT Behavioral Requirements for TCP", BCP 142, 3640 RFC 5382, October 2008. 3642 [RFC6080] Petrie, D. and S. Channabasappa, "A Framework for Session 3643 Initiation Protocol User Agent Profile Delivery", 3644 RFC 6080, March 2011. 3646 [RFC6544] Rosenberg, J., Keranen, A., Lowekamp, B., and A. Roach, 3647 "TCP Candidates with Interactive Connectivity 3648 Establishment (ICE)", RFC 6544, March 2012. 3650 [I-D.petithuguenin-mmusic-ice-sip-sdp] 3651 Petit-Huguenin, M. and A. Keraenen, "Using Interactive 3652 Connectivity Establishment (ICE) with Session Description 3653 Protocol (SDP) offer/answer and Session Initiation 3654 Protocol (SIP)", draft-petithuguenin-mmusic-ice-sip-sdp-00 3655 (work in progress), February 2013. 3657 Appendix A. Lite and Full Implementations 3659 ICE allows for two types of implementations. A full implementation 3660 supports the controlling and controlled roles in a session, and can 3661 also perform address gathering. In contrast, a lite implementation 3662 is a minimalist implementation that does little but respond to STUN 3663 checks. 3665 Because ICE requires both endpoints to support it in order to bring 3666 benefits to either endpoint, incremental deployment of ICE in a 3667 network is more complicated. Many sessions involve an endpoint that 3668 is, by itself, not behind a NAT and not one that would worry about 3669 NAT traversal. A very common case is to have one endpoint that 3670 requires NAT traversal (such as a VoIP hard phone or soft phone) make 3671 a call to one of these devices. Even if the phone supports a full 3672 ICE implementation, ICE won't be used at all if the other device 3673 doesn't support it. The lite implementation allows for a low-cost 3674 entry point for these devices. Once they support the lite 3675 implementation, full implementations can connect to them and get the 3676 full benefits of ICE. 
3678 Consequently, a lite implementation is only appropriate for devices 3679 that will *always* be connected to the public Internet and have a 3680 public IP address at which they can receive packets from any 3681 correspondent. ICE will not function when a lite implementation is 3682 placed behind a NAT. 3684 ICE allows a lite implementation to have a single IPv4 host candidate 3685 and several IPv6 addresses. In that case, candidate pairs are 3686 selected by the controlling agent using a static algorithm, such as 3687 the one in RFC 6724, which is recommended by this specification. 3688 However, static mechanisms for address selection are always prone to 3689 error, since they cannot ever reflect the actual topology and can 3690 never provide actual guarantees on connectivity. They are always 3691 heuristics. Consequently, if an agent is implementing ICE just to 3692 select between its IPv4 and IPv6 addresses, and none of its IP 3693 addresses are behind NAT, usage of full ICE is still RECOMMENDED in 3694 order to provide the most robust form of address selection possible. 3696 It is important to note that the lite implementation was added to 3697 this specification to provide a stepping stone to full 3698 implementation. Even for devices that are always connected to the 3699 public Internet with just a single IPv4 address, a full 3700 implementation is preferable if achievable. A full implementation 3701 will reduce call setup times, since ICE's aggressive mode can be 3702 used. Full implementations also obtain the security benefits of ICE 3703 unrelated to NAT traversal; in particular, the voice hammer attack 3704 described in Section 14 is prevented only for full implementations, 3705 not lite. Finally, it is often the case that a device that finds 3706 itself with a public address today will be placed in a network 3707 tomorrow where it will be behind a NAT.
It is difficult to 3708 definitively know, over the lifetime of a device or product, that it 3709 will always be used on the public Internet. Full implementation 3710 provides assurance that communications will always work. 3712 Appendix B. Design Motivations 3714 ICE contains a number of normative behaviors that may themselves be 3715 simple, but derive from complicated or non-obvious thinking or use 3716 cases that merit further discussion. Since these design motivations 3717 are not necessary to understand for purposes of implementation, they 3718 are discussed here in an appendix to the specification. This section 3719 is non-normative. 3721 B.1. Pacing of STUN Transactions 3723 STUN transactions used to gather candidates and to verify 3724 connectivity are paced out at an approximate rate of one new 3725 transaction every Ta milliseconds. Each transaction, in turn, has a 3726 retransmission timer RTO that is a function of Ta as well. Why are 3727 these transactions paced, and why are these formulas used? 3729 Sending of these STUN requests will often have the effect of creating 3730 bindings on NAT devices between the client and the STUN servers. 3731 Experience has shown that many NAT devices have upper limits on the 3732 rate at which they will create new bindings. Experiments have shown 3733 that once every 20 ms is well supported, but not much lower than 3734 that. This is why Ta has a lower bound of 20 ms. Furthermore, 3735 transmission of these packets on the network makes use of bandwidth 3736 and needs to be rate limited by the agent. Deployments based on 3737 earlier draft versions of this document tended to overload rate- 3738 constrained access links and perform poorly overall, in addition to 3739 negatively impacting the network. As a consequence, the pacing 3740 ensures that the NAT device does not get overloaded and that traffic 3741 is kept at a reasonable rate. 
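The pacing rule described above can be sketched in a few lines. This is a minimal illustration, not part of the specification: the function name paced_start_times is hypothetical, Ta is passed in rather than computed (its formula is defined elsewhere in the specification), and the only rule taken from the text is the 20 ms lower bound on Ta.

```python
from __future__ import annotations

MIN_TA_MS = 20  # lower bound on Ta stated in the text


def paced_start_times(num_transactions: int, ta_ms: float) -> list[float]:
    """Return the send time, in ms, of the first packet of each transaction.

    One new transaction is started every Ta milliseconds, with Ta raised
    to the 20 ms floor if the supplied value is lower.
    """
    ta = max(ta_ms, MIN_TA_MS)
    return [i * ta for i in range(num_transactions)]


print(paced_start_times(3, 5))   # Ta below the floor is raised to 20 ms
print(paced_start_times(3, 50))  # Ta above the floor is used as given
```

Retransmissions are deliberately left out of this sketch; it only shows when each new transaction begins.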
3743 The definition of a "reasonable" rate is that STUN should not use 3744 more bandwidth than the RTP itself will use, once media starts 3745 flowing. The formula for Ta is designed so that, if a STUN packet 3746 were sent every Ta milliseconds, it would consume the same amount of 3747 bandwidth as RTP packets, summed across all media streams. Of 3748 course, STUN has retransmits, and the desire is to pace those as 3749 well. For this reason, RTO is set such that the first retransmit on 3750 the first transaction happens just as the first STUN request on the 3751 last transaction occurs. Pictorially: 3753 First Packets Retransmits 3755 | | 3756 | | 3757 -------+------ -------+------ 3758 / \ / \ 3759 / \ / \ 3761 +--+ +--+ +--+ +--+ +--+ +--+ 3762 |A1| |B1| |C1| |A2| |B2| |C2| 3763 +--+ +--+ +--+ +--+ +--+ +--+ 3765 ---+-------+-------+-------+-------+-------+------------ Time 3766 0 Ta 2Ta 3Ta 4Ta 5Ta 3768 In this picture, there are three transactions that will be sent (for 3769 example, in the case of candidate gathering, there are three host 3770 candidate/STUN server pairs). These are transactions A, B, and C. 3771 The retransmit timer is set so that the first retransmission on the 3772 first transaction (packet A2) is sent at time 3Ta. 3774 Subsequent retransmits after the first will occur even less 3775 frequently than Ta milliseconds apart, since STUN uses an exponential 3776 back-off on its retransmissions. 3778 B.2. Candidates with Multiple Bases 3780 Section 4.1.3 talks about eliminating candidates that have the same 3781 transport address and base. However, candidates with the same 3782 transport addresses but different bases are not redundant. When can 3783 an agent have two candidates that have the same IP address and port, 3784 but different bases?
Consider the topology of Figure 10: 3786 +----------+ 3787 | STUN Srvr| 3788 +----------+ 3789 | 3790 | 3791 ----- 3792 // \\ 3793 | | 3794 | B:net10 | 3795 | | 3796 \\ // 3797 ----- 3798 | 3799 | 3800 +----------+ 3801 | NAT | 3802 +----------+ 3803 | 3804 | 3805 ----- 3806 // \\ 3807 | A | 3808 |192.168/16 | 3809 | | 3810 \\ // 3811 ----- 3812 | 3813 | 3814 |192.168.1.100 ----- 3815 +----------+ // \\ +----------+ 3816 | | | | | | 3817 | Offerer |---------| C:net10 |-----------| Answerer | 3818 | |10.0.1.100| | 10.0.1.101 | | 3819 +----------+ \\ // +----------+ 3820 ----- 3822 Figure 10: Identical Candidates with Different Bases 3824 In this case, the offerer is multihomed. It has one IP address, 3825 10.0.1.100, on network C, which is a net 10 private network. The 3826 answerer is on this same network. The offerer is also connected to 3827 network A, which is 192.168/16. The offerer has an IP address of 3828 192.168.1.100 on this network. There is a NAT on this network, 3829 natting into network B, which is another net 10 private network, but 3830 not connected to network C. There is a STUN server on network B. 3832 The offerer obtains a host candidate on its IP address on network C 3833 (10.0.1.100:2498) and a host candidate on its IP address on network A 3834 (192.168.1.100:3344). It performs a STUN query to its configured 3835 STUN server from 192.168.1.100:3344. This query passes through the 3836 NAT, which happens to assign the binding 10.0.1.100:2498. The STUN 3837 server reflects this in the STUN Binding response. Now, the offerer 3838 has obtained a server reflexive candidate with a transport address 3839 that is identical to a host candidate (10.0.1.100:2498). However, 3840 the server reflexive candidate has a base of 192.168.1.100:3344, and 3841 the host candidate has a base of 10.0.1.100:2498. 3843 B.3. 
Purpose of the Related Address and Related Port Attributes 3845 The candidate attribute contains two values that are not used at all 3846 by ICE itself -- related address and related port. Why are they 3847 present? 3849 There are two motivations for its inclusion. The first is 3850 diagnostic. It is very useful to know the relationship between the 3851 different types of candidates. By including it, an agent can know 3852 which relayed candidate is associated with which reflexive candidate, 3853 which in turn is associated with a specific host candidate. When 3854 checks for one candidate succeed and not for others, this provides 3855 useful diagnostics on what is going on in the network. 3857 The second reason has to do with off-path Quality of Service (QoS) 3858 mechanisms. When ICE is used in environments such as PacketCable 3859 2.0, proxies will, in addition to performing normal SIP operations, 3860 inspect the SDP in SIP messages, and extract the IP address and port 3861 for media traffic. They can then interact, through policy servers, 3862 with access routers in the network, to establish guaranteed QoS for 3863 the media flows. This QoS is provided by classifying the RTP traffic 3864 based on 5-tuple, and then providing it a guaranteed rate, or marking 3865 its Diffserv codepoints appropriately. When a residential NAT is 3866 present, and a relayed candidate gets selected for media, this 3867 relayed candidate will be a transport address on an actual TURN 3868 server. That address says nothing about the actual transport address 3869 in the access router that would be used to classify packets for QoS 3870 treatment. Rather, the server reflexive candidate towards the TURN 3871 server is needed. By carrying the translation in the SDP, the proxy 3872 can use that transport address to request QoS from the access router. 3874 B.4. 
Importance of the STUN Username 3876 ICE requires the usage of message integrity with STUN using its 3877 short-term credential functionality. The actual short-term 3878 credential is formed by exchanging username fragments in the offer/ 3879 answer exchange. The need for this mechanism goes beyond just 3880 security; it is actually required for correct operation of ICE in the 3881 first place. 3883 Consider agents L, R, and Z. L and R are within private enterprise 1, 3884 which is using 10.0.0.0/8. Z is within private enterprise 2, which 3885 is also using 10.0.0.0/8. As it turns out, R and Z both have IP 3886 address 10.0.1.1. L sends an offer to Z. Z, in its answer, provides 3887 L with its host candidates. In this case, those candidates are 3888 10.0.1.1:8866 and 10.0.1.1:8877. As it turns out, R is in a session 3889 at that same time, and is also using 10.0.1.1:8866 and 10.0.1.1:8877 3890 as host candidates. This means that R is prepared to accept STUN 3891 messages on those ports, just as Z is. L will send a STUN request to 3892 10.0.1.1:8866 and another to 10.0.1.1:8877. However, these do not go 3893 to Z as expected. Instead, they go to R! If R just replied to them, 3894 L would believe it has connectivity to Z, when in fact it has 3895 connectivity to a completely different user, R. To fix this, the STUN 3896 short-term credential mechanisms are used. The username fragments 3897 are sufficiently random that it is highly unlikely that R would be 3898 using the same values as Z. Consequently, R would reject the STUN 3899 request since the credentials were invalid. In essence, the STUN 3900 username fragments provide a form of transient host identifiers, 3901 bound to a particular offer/answer session. 3903 An unfortunate consequence of the non-uniqueness of IP addresses is 3904 that, in the above example, R might not even be an ICE agent. 
It 3905 could be any host, and the port to which the STUN packet is directed 3906 could be any ephemeral port on that host. If there is an application 3907 listening on this socket for packets, and it is not prepared to 3908 handle malformed packets for whatever protocol is in use, the 3909 operation of that application could be affected. Fortunately, since 3910 the ports exchanged in offer/answer are ephemeral and usually drawn 3911 from the dynamic or registered range, the odds are good that the port 3912 is not used to run a server on host R, but rather is the agent side 3913 of some protocol. This decreases the probability of hitting an 3914 allocated port, due to the transient nature of port usage in this 3915 range. However, the possibility of a problem does exist, and network 3916 deployers should be prepared for it. Note that this is not a problem 3917 specific to ICE; stray packets can arrive at a port at any time for 3918 any type of protocol, especially ones on the public Internet. As 3919 such, this requirement is just restating a general design guideline 3920 for Internet applications -- be prepared for unknown packets on any 3921 port. 3923 B.5. The Candidate Pair Priority Formula 3925 The priority for a candidate pair has an odd form. It is: 3927 pair priority = 2^32*MIN(G,D) + 2*MAX(G,D) + (G>D?1:0) 3929 Why is this? When the candidate pairs are sorted based on this 3930 value, the resulting sorting has the MAX/MIN property. This means 3931 that the pairs are first sorted based on decreasing value of the 3932 minimum of the two priorities. For pairs that have the same value of 3933 the minimum priority, the maximum priority is used to sort amongst 3934 them. If the max and the min priorities are the same, the 3935 controlling agent's priority is used as the tie-breaker in the last 3936 part of the expression. 
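The ordering that this formula produces is easiest to see with a few sample values. The sketch below is illustrative only: the function name pair_priority and the sample priorities are made up, and it assumes, as the tie-breaker sentence above implies, that G is the priority of the controlling agent's candidate and D that of the controlled agent's candidate.

```python
def pair_priority(g: int, d: int) -> int:
    """Pair priority as given by the formula in the text.

    g: priority of the controlling agent's candidate (assumed meaning of G)
    d: priority of the controlled agent's candidate (assumed meaning of D)
    """
    return (1 << 32) * min(g, d) + 2 * max(g, d) + (1 if g > d else 0)


# Hypothetical candidate priorities, chosen small for readability.
pairs = [(100, 50), (50, 100), (60, 60), (200, 40)]
ranked = sorted(pairs, key=lambda p: pair_priority(*p), reverse=True)
for g, d in ranked:
    print(g, d, pair_priority(g, d))
```

With these inputs the pair (60, 60) sorts first because its minimum is the largest, (100, 50) sorts ahead of (50, 100) only because of the final tie-breaker term, and (200, 40) sorts last despite containing the single highest priority.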
   The factor of 2^32 is used since the priority of a single candidate
   is always less than 2^32, resulting in the pair priority being a
   "concatenation" of the two component priorities. This creates the
   MAX/MIN sorting. MAX/MIN ensures that, for a particular agent, a
   lower-priority candidate is never used until all higher-priority
   candidates have been tried.

B.6. Why Are Keepalives Needed?

   Once media begins flowing on a candidate pair, it is still necessary
   to keep the bindings alive at intermediate NATs for the duration of
   the session. Normally, the media stream packets themselves (e.g.,
   RTP) meet this objective. However, several cases merit further
   discussion. Firstly, in some RTP usages, such as SIP, the media
   streams can be "put on hold". This is accomplished by using the SDP
   "sendonly" or "inactive" attributes, as defined in RFC 3264
   [RFC3264]. RFC 3264 directs implementations to cease transmission
   of media in these cases. However, doing so may cause NAT bindings
   to time out, and media won't be able to come off hold.

   Secondly, some RTP payload formats, such as the payload format for
   text conversation [RFC4103], may send packets so infrequently that
   the interval exceeds the NAT binding timeouts.

   Thirdly, if silence suppression is in use, long periods of silence
   may cause media transmission to cease sufficiently long for NAT
   bindings to time out.

   For these reasons, the media packets themselves cannot be relied
   upon. ICE defines a simple periodic keepalive utilizing STUN
   Binding indications. This makes its bandwidth requirements highly
   predictable, and thus amenable to QoS reservations.

B.7. Why Prefer Peer Reflexive Candidates?

   Section 4.1.2 describes procedures for computing the priority of a
   candidate based on its type and local preferences.
   That section requires that the type preference for peer reflexive
   candidates always be higher than that for server reflexive
   candidates. Why is that? The reason has to do with the security
   considerations in Section 14. It is much easier for an attacker to
   cause an agent to use a false server reflexive candidate than it is
   for an attacker to cause an agent to use a false peer reflexive
   candidate. Consequently, by preferring peer reflexive candidates,
   ICE thwarts attacks against address gathering with Binding requests.

B.8. Why Are Binding Indications Used for Keepalives?

   Media keepalives are described in Section 9. These keepalives make
   use of STUN when both endpoints are ICE capable. However, rather
   than using a Binding request transaction (which generates a
   response), the keepalives use an Indication. Why is that?

   The primary reason has to do with network QoS mechanisms. Once
   media begins flowing, network elements will assume that the media
   stream has a fairly regular structure, making use of periodic
   packets at fixed intervals, with the possibility of jitter. If an
   agent is sending media packets, and then receives a Binding request,
   it would need to generate a response packet along with its media
   packets. This will increase the actual bandwidth requirements for
   the 5-tuple carrying the media packets, and introduce jitter in the
   delivery of those packets. Analysis has shown that this is a
   concern in certain layer 2 access networks that use fairly tight
   packet schedulers for media.

   Additionally, using a Binding Indication allows integrity to be
   disabled, allowing for better performance. This is useful for
   large-scale endpoints, such as PSTN gateways and SBCs.
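
   As a rough illustration, a bare Binding indication is just the
   20-byte STUN header defined in [RFC5389]: a message type of 0x0011,
   a zero attribute length, the magic cookie 0x2112A442, and a 96-bit
   transaction ID. The Python sketch below builds such a packet. The
   function and constant names are hypothetical, no integrity attribute
   is included, and a deployed keepalive would typically also carry a
   FINGERPRINT attribute, which is omitted here for brevity.

```python
import os
import struct

# Values from the STUN header definition in RFC 5389.
STUN_BINDING_INDICATION = 0x0011  # Binding method, indication class
STUN_MAGIC_COOKIE = 0x2112A442


def build_binding_indication() -> bytes:
    """Return a minimal STUN Binding indication with no attributes."""
    transaction_id = os.urandom(12)  # 96-bit random transaction ID
    # Header: message type, message length (0: no attributes), cookie.
    header = struct.pack("!HHI", STUN_BINDING_INDICATION, 0,
                         STUN_MAGIC_COOKIE)
    return header + transaction_id


packet = build_binding_indication()
print(len(packet))  # 20
```

   Since an indication generates no response, the sender would simply
   transmit this packet on the 5-tuple in use and not wait for a reply.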
Authors' Addresses

   Ari Keranen
   Ericsson
   Hirsalantie 11
   02420 Jorvas
   Finland

   Email: ari.keranen@ericsson.com


   Jonathan Rosenberg
   jdrosen.net
   Monmouth, NJ
   US

   Email: jdrosen@jdrosen.net
   URI:   http://www.jdrosen.net