idnits 2.17.1 draft-ietf-mmusic-ice-19.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- ** It looks like you're using RFC 3978 boilerplate. You should update this to the boilerplate described in the IETF Trust License Policy document (see https://trustee.ietf.org/license-info), which is required now. -- Found old boilerplate from RFC 3978, Section 5.1 on line 17. -- Found old boilerplate from RFC 3978, Section 5.5, updated by RFC 4748 on line 5350. -- Found old boilerplate from RFC 3979, Section 5, paragraph 1 on line 5361. -- Found old boilerplate from RFC 3979, Section 5, paragraph 2 on line 5368. -- Found old boilerplate from RFC 3979, Section 5, paragraph 3 on line 5374. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** There is 1 instance of too long lines in the document, the longest one being 2 characters in excess of 72. == There are 20 instances of lines with private range IPv4 addresses in the document. If these are generic example addresses, they should be changed to use any of the ranges defined in RFC 6890 (or successor): 192.0.2.x, 198.51.100.x or 203.0.113.x. -- The document has examples using IPv4 documentation addresses according to RFC6890, but does not use any IPv6 documentation addresses. Maybe there should be IPv6 examples, too? -- The draft header indicates that this document obsoletes RFC4091, but the abstract doesn't seem to mention this, which it should. 
Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust Copyright Line does not match the current year == The document seems to use 'NOT RECOMMENDED' as an RFC 2119 keyword, but does not include the phrase in its RFC 2119 key words list. -- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. If not, you may need to add the pre-RFC5378 disclaimer. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- The document date (October 29, 2007) is 6023 days in the past. Is this intentional? Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) ** Obsolete normative reference: RFC 4234 (Obsoleted by RFC 5234) ** Obsolete normative reference: RFC 4566 (Obsoleted by RFC 8866) ** Obsolete normative reference: RFC 4091 (Obsoleted by RFC 5245) ** Obsolete normative reference: RFC 3484 (Obsoleted by RFC 6724) == Outdated reference: A later version (-18) exists of draft-ietf-behave-rfc3489bis-08 == Outdated reference: A later version (-16) exists of draft-ietf-behave-turn-04 -- Obsolete informational reference (is this intentional?): RFC 3489 (Obsoleted by RFC 5389) == Outdated reference: A later version (-07) exists of draft-ietf-mmusic-connectivity-precon-02 == Outdated reference: A later version (-20) exists of draft-ietf-sip-outbound-10 == Outdated reference: A later version (-08) exists of draft-ietf-behave-tcp-07 == Outdated reference: A later version (-18) exists of draft-ietf-sipping-config-framework-12 == 
Outdated reference: A later version (-16) exists of draft-ietf-mmusic-ice-tcp-04 Summary: 6 errors (**), 0 flaws (~~), 10 warnings (==), 10 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 MMUSIC J. Rosenberg 3 Internet-Draft Cisco 4 Obsoletes: 4091 (if approved) October 29, 2007 5 Intended status: Standards Track 6 Expires: May 1, 2008 8 Interactive Connectivity Establishment (ICE): A Protocol for Network 9 Address Translator (NAT) Traversal for Offer/Answer Protocols 10 draft-ietf-mmusic-ice-19 12 Status of this Memo 14 By submitting this Internet-Draft, each author represents that any 15 applicable patent or other IPR claims of which he or she is aware 16 have been or will be disclosed, and any of which he or she becomes 17 aware will be disclosed, in accordance with Section 6 of BCP 79. 19 Internet-Drafts are working documents of the Internet Engineering 20 Task Force (IETF), its areas, and its working groups. Note that 21 other groups may also distribute working documents as Internet- 22 Drafts. 24 Internet-Drafts are draft documents valid for a maximum of six months 25 and may be updated, replaced, or obsoleted by other documents at any 26 time. It is inappropriate to use Internet-Drafts as reference 27 material or to cite them other than as "work in progress." 29 The list of current Internet-Drafts can be accessed at 30 http://www.ietf.org/ietf/1id-abstracts.txt. 32 The list of Internet-Draft Shadow Directories can be accessed at 33 http://www.ietf.org/shadow.html. 35 This Internet-Draft will expire on May 1, 2008. 37 Copyright Notice 39 Copyright (C) The IETF Trust (2007). 41 Abstract 43 This document describes a protocol for Network Address Translator 44 (NAT) traversal for UDP-based multimedia sessions established with 45 the offer/answer model. This protocol is called Interactive 46 Connectivity Establishment (ICE). 
ICE makes use of the Session Traversal Utilities for NAT (STUN)
protocol and its extension, Traversal Using Relay NAT (TURN).  ICE can
be used by any protocol utilizing the offer/answer model, such as the
Session Initiation Protocol (SIP).

Table of Contents

1. Introduction
2. Overview of ICE
  2.1. Gathering Candidate Addresses
  2.2. Connectivity Checks
  2.3. Sorting Candidates
  2.4. Frozen Candidates
  2.5. Security for Checks
  2.6. Concluding ICE
  2.7. Lite Implementations
3. Terminology
4. Sending the Initial Offer
  4.1. Full Implementation Requirements
    4.1.1. Gathering Candidates
      4.1.1.1. Host Candidates
      4.1.1.2. Server Reflexive and Relayed Candidates
      4.1.1.3. Computing Foundations
      4.1.1.4. Keeping Candidates Alive
    4.1.2. Prioritizing Candidates
      4.1.2.1. Recommended Formula
      4.1.2.2. Guidelines for Choosing Type and Local Preferences
    4.1.3. Eliminating Redundant Candidates
    4.1.4. Choosing Default Candidates
  4.2. Lite Implementation
  4.3. Encoding the SDP
5. Receiving the Initial Offer
  5.1. Verifying ICE Support
  5.2. Determining Role
  5.3. Gathering Candidates
  5.4. Prioritizing Candidates
  5.5. Choosing Default Candidates
  5.6. Encoding the SDP
  5.7. Forming the Check Lists
    5.7.1. Forming Candidate Pairs
    5.7.2. Computing Pair Priority and Ordering Pairs
    5.7.3. Pruning the Pairs
    5.7.4. Computing States
  5.8. Scheduling Checks
6. Receipt of the Initial Answer
  6.1. Verifying ICE Support
  6.2. Determining Role
  6.3. Forming the Check List
  6.4. Performing Ordinary Checks
7. Performing Connectivity Checks
  7.1. STUN Client Procedures
    7.1.1. Sending the Request
      7.1.1.1. PRIORITY and USE-CANDIDATE
      7.1.1.2. ICE-CONTROLLED and ICE-CONTROLLING
      7.1.1.3. Forming Credentials
      7.1.1.4. DiffServ Treatment
    7.1.2. Processing the Response
      7.1.2.1. Failure Cases
      7.1.2.2. Success Cases
        7.1.2.2.1. Discovering Peer Reflexive Candidates
        7.1.2.2.2. Constructing a Valid Pair
        7.1.2.2.3. Updating Pair States
        7.1.2.2.4. Updating the Nominated Flag
      7.1.2.3. Check List and Timer State Updates
  7.2. STUN Server Procedures
    7.2.1. Additional Procedures for Full Implementations
      7.2.1.1. Detecting and Repairing Role Conflicts
      7.2.1.2. Computing Mapped Address
      7.2.1.3. Learning Peer Reflexive Candidates
      7.2.1.4. Triggered Checks
      7.2.1.5. Updating the Nominated Flag
    7.2.2. Additional Procedures for Lite Implementations
8. Concluding ICE Processing
  8.1. Procedures for Full Implementations
    8.1.1. Nominating Pairs
      8.1.1.1. Regular Nomination
      8.1.1.2. Aggressive Nomination
    8.1.2. Updating States
  8.2. Procedures for Lite Implementations
    8.2.1. Peer is Full
    8.2.2. Peer is Lite
  8.3. Freeing Candidates
    8.3.1. Full Implementation Procedures
    8.3.2. Lite Implementations
9. Subsequent Offer/Answer Exchanges
  9.1. Generating the Offer
    9.1.1. Procedures for All Implementations
      9.1.1.1. ICE Restarts
      9.1.1.2. Removing a Media Stream
      9.1.1.3. Adding a Media Stream
    9.1.2. Procedures for Full Implementations
      9.1.2.1. Existing Media Streams with ICE Running
      9.1.2.2. Existing Media Streams with ICE Completed
    9.1.3. Procedures for Lite Implementations
      9.1.3.1. Existing Media Streams with ICE Running
      9.1.3.2. Existing Media Streams with ICE Completed
  9.2. Receiving the Offer and Generating an Answer
    9.2.1. Procedures for All Implementations
      9.2.1.1. Detecting ICE Restart
      9.2.1.2. New Media Stream
      9.2.1.3. Removed Media Stream
    9.2.2. Procedures for Full Implementations
      9.2.2.1. Existing Media Streams with ICE Running and no
               remote-candidates
      9.2.2.2. Existing Media Streams with ICE Completed and no
               remote-candidates
      9.2.2.3. Existing Media Streams and remote-candidates
    9.2.3. Procedures for Lite Implementations
  9.3. Updating the Check and Valid Lists
    9.3.1. Procedures for Full Implementations
      9.3.1.1. ICE Restarts
      9.3.1.2. New Media Stream
      9.3.1.3. Removed Media Stream
      9.3.1.4. ICE Continuing for Existing Media Stream
    9.3.2. Procedures for Lite Implementations
10. Keepalives
11. Media Handling
  11.1. Sending Media
    11.1.1. Procedures for Full Implementations
    11.1.2. Procedures for Lite Implementations
    11.1.3. Procedures for All Implementations
  11.2. Receiving Media
12. Usage with SIP
  12.1. Latency Guidelines
    12.1.1. Offer in INVITE
    12.1.2. Offer in Response
  12.2. SIP Option Tags and Media Feature Tags
  12.3. Interactions with Forking
  12.4. Interactions with Preconditions
  12.5. Interactions with Third Party Call Control
13. Relationship with ANAT
14. Extensibility Considerations
15. Grammar
  15.1. "candidate" Attribute
  15.2. "remote-candidates" Attribute
  15.3. "ice-lite" and "ice-mismatch" Attributes
  15.4. "ice-ufrag" and "ice-pwd" Attributes
  15.5. "ice-options" Attribute
16. Setting Ta and RTO
  16.1. RTP Media Streams
  16.2. Non-RTP Sessions
17. Example
18. Security Considerations
  18.1. Attacks on Connectivity Checks
  18.2. Attacks on Server Reflexive Address Gathering
  18.3. Attacks on Relayed Candidate Gathering
  18.4. Attacks on the Offer/Answer Exchanges
  18.5. Insider Attacks
    18.5.1. The Voice Hammer Attack
    18.5.2. STUN Amplification Attack
  18.6. Interactions with Application Layer Gateways and SIP
19. STUN Extensions
  19.1. New Attributes
  19.2. New Error Response Codes
20. Operational Considerations
  20.1. NAT and Firewall Types
  20.2. Bandwidth Requirements
    20.2.1. STUN and TURN Server Capacity Planning
    20.2.2. Gathering and Connectivity Checks
    20.2.3. Keepalives
  20.3. ICE and ICE-lite
  20.4. Troubleshooting and Performance Management
  20.5. Endpoint Configuration
21. IANA Considerations
  21.1. SDP Attributes
    21.1.1. candidate Attribute
    21.1.2. remote-candidates Attribute
    21.1.3. ice-lite Attribute
    21.1.4. ice-mismatch Attribute
    21.1.5. ice-pwd Attribute
    21.1.6. ice-ufrag Attribute
    21.1.7. ice-options Attribute
  21.2. STUN Attributes
  21.3. STUN Error Responses
22. IAB Considerations
  22.1. Problem Definition
  22.2. Exit Strategy
  22.3. Brittleness Introduced by ICE
  22.4. Requirements for a Long Term Solution
  22.5. Issues with Existing NAPT Boxes
23. Acknowledgements
24. References
  24.1. Normative References
  24.2. Informative References
Appendix A. Lite and Full Implementations
Appendix B. Design Motivations
  B.1. Pacing of STUN Transactions
  B.2. Candidates with Multiple Bases
  B.3. Purpose of the and Attributes
  B.4. Importance of the STUN Username
  B.5. The Candidate Pair Priority Formula
  B.6. The remote-candidates attribute
  B.7. Why are Keepalives Needed?
  B.8. Why Prefer Peer Reflexive Candidates?
  B.9. Why Send an Updated Offer?
  B.10. Why are Binding Indications Used for Keepalives?
  B.11. Why is the Conflict Resolution Mechanism Needed?
Author's Address
Intellectual Property and Copyright Statements

1. Introduction

RFC 3264 [RFC3264] defines a two-phase exchange of Session Description
Protocol (SDP) messages [RFC4566] for the purposes of establishment of
multimedia sessions.  This offer/answer mechanism is used by protocols
such as the Session Initiation Protocol (SIP) [RFC3261].

Protocols using offer/answer are difficult to operate through Network
Address Translators (NAT).
Because their purpose is to establish a flow of media packets, they
tend to carry the IP addresses and ports of media sources and sinks
within their messages, which is known to be problematic through NAT
[RFC3235].  The protocols also seek to create a media flow directly
between participants, so that there is no application layer
intermediary between them.  This is done to reduce media latency,
decrease packet loss, and reduce the operational costs of deploying the
application.  However, this is difficult to accomplish through NAT.  A
full treatment of the reasons for this is beyond the scope of this
specification.

Numerous solutions have been defined for allowing these protocols to
operate through NAT.  These include Application Layer Gateways (ALGs),
the Middlebox Control Protocol [RFC3303], the original Simple Traversal
of UDP Through NAT (STUN) [RFC3489] specification, and Realm Specific
IP [RFC3102] [RFC3103] along with session description extensions needed
to make them work, such as the Session Description Protocol (SDP)
[RFC4566] attribute for the Real Time Control Protocol (RTCP)
[RFC3605].  Unfortunately, these techniques all have pros and cons
which make each one optimal in some network topologies, but a poor
choice in others.  The result is that administrators and implementors
are making assumptions about the topologies of the networks in which
their solutions will be deployed.  This introduces complexity and
brittleness into the system.  What is needed is a single solution which
is flexible enough to work well in all situations.

This specification defines Interactive Connectivity Establishment (ICE)
as a technique for NAT traversal for UDP-based media streams (though
ICE can be extended to handle other transport protocols, such as TCP
[I-D.ietf-mmusic-ice-tcp]) established by the offer/answer model.
ICE is an extension to the offer/answer model, and works by including a
multiplicity of IP addresses and ports in SDP offers and answers, which
are then tested for connectivity by peer-to-peer connectivity checks.
The IP addresses and ports included in the SDP are gathered, and the
connectivity checks are performed, using the revised STUN specification
[I-D.ietf-behave-rfc3489bis], now renamed to Session Traversal
Utilities for NAT.  The new name and new specification reflect its new
role as a tool that is used with other NAT traversal techniques (namely
ICE) rather than a standalone NAT traversal solution, as the original
STUN specification was.  ICE also makes use of Traversal Using Relay
NAT (TURN) [I-D.ietf-behave-turn], an extension to STUN.  Because ICE
exchanges a multiplicity of IP addresses and ports for each media
stream, it also allows for address selection for multi-homed and
dual-stack hosts, and for this reason it deprecates RFC 4091 [RFC4091].

2. Overview of ICE

In a typical ICE deployment, we have two endpoints (known as AGENTS in
RFC 3264 terminology) which want to communicate.  They are able to
communicate indirectly via some signaling protocol (such as SIP), by
which they can perform an offer/answer exchange of SDP [RFC3264]
messages.  Note that ICE is not intended for NAT traversal for SIP,
which is assumed to be provided via another mechanism
[I-D.ietf-sip-outbound].  At the beginning of the ICE process, the
agents are ignorant of their own topologies.  In particular, they might
or might not be behind a NAT (or multiple tiers of NATs).  ICE allows
the agents to discover enough information about their topologies to
potentially find one or more paths by which they can communicate.

Figure 1 shows a typical environment for ICE deployment.  The two
endpoints are labelled L and R (for left and right, which helps
visualize call flows).
Both L and R are behind their own respective NATs though they may not
be aware of it.  The type of NAT and its properties are also unknown.
Agents L and R are capable of engaging in an offer/answer exchange by
which they can exchange SDP messages, whose purpose is to set up a
media session between L and R.  Typically, this exchange will occur
through a SIP server.

In addition to the agents, a SIP server and NATs, ICE is typically used
in concert with STUN or TURN servers in the network.  Each agent can
have its own STUN or TURN server, or they can be the same.

                          +-------+
                          | SIP   |
       +-------+          | Srvr  |          +-------+
       | STUN  |          |       |          | STUN  |
       | Srvr  |          +-------+          | Srvr  |
       |       |         /         \         |       |
       +-------+        /           \        +-------+
                       /             \
                      /               \
                     /                 \
                    /                   \
                   /  <- Signalling ->   \
                  /                       \
                 /                         \
           +--------+                   +--------+
           |  NAT   |                   |  NAT   |
           +--------+                   +--------+
             /                                \
            /                                  \
           /                                    \
       +-------+                             +-------+
       | Agent |                             | Agent |
       |   L   |                             |   R   |
       |       |                             |       |
       +-------+                             +-------+

                 Figure 1: ICE Deployment Scenario

The basic idea behind ICE is as follows: each agent has a variety of
candidate TRANSPORT ADDRESSES (combination of IP address and port for a
particular transport protocol, which is always UDP in this
specification) it could use to communicate with the other agent.  These
might include:

o  A transport address on a directly attached network interface

o  A translated transport address on the public side of a NAT (a
   "server reflexive" address)

o  A transport address allocated from a TURN server (a "relayed"
   address)

Potentially, any of L's candidate transport addresses can be used to
communicate with any of R's candidate transport addresses.  In
practice, however, many combinations will not work.
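
The idea that any of L's transport addresses can potentially be paired with any of R's can be illustrated with a short Python sketch.  This is only an illustration under stated assumptions: the `Candidate` class, the `all_pairs` helper, the short type labels, and every address below are hypothetical and are not defined by this specification.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Candidate:
    kind: str   # hypothetical labels: "host", "srflx" (server reflexive), "relay"
    ip: str
    port: int   # UDP port; UDP is the only transport in this specification


def all_pairs(local, remote):
    """Return every (local, remote) combination that could, in principle, be tried."""
    return list(product(local, remote))


# Hypothetical example addresses, not taken from the draft.
l_candidates = [
    Candidate("host", "10.0.1.1", 8998),
    Candidate("srflx", "192.0.2.3", 45664),
    Candidate("relay", "192.0.2.100", 50000),
]
r_candidates = [
    Candidate("host", "10.0.2.1", 3478),
    Candidate("srflx", "198.51.100.7", 3478),
]

pairs = all_pairs(l_candidates, r_candidates)
print(len(pairs))  # 3 local x 2 remote = 6 pairs
```

The sketch only enumerates combinations; which of them actually carry traffic is what the connectivity checks determine.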
For instance, if L and R are both behind NATs, their directly attached
interface addresses are unlikely to be able to communicate directly
(this is why ICE is needed, after all!).  The purpose of ICE is to
discover which pairs of addresses will work.  The way that ICE does
this is to systematically try all possible pairs (in a carefully sorted
order) until it finds one or more that work.

2.1. Gathering Candidate Addresses

In order to execute ICE, an agent has to identify all of its address
candidates.  A CANDIDATE is a transport address - a combination of IP
address and port for a particular transport protocol (with only UDP
specified here).  This document defines three types of candidates, some
derived from physical or logical network interfaces, others
discoverable via STUN and TURN.  Naturally, one viable candidate is a
transport address obtained directly from a local interface.  Such a
candidate is called a HOST CANDIDATE.  The local interface could be
Ethernet or WiFi, or it could be one that is obtained through a tunnel
mechanism, such as a Virtual Private Network (VPN) or Mobile IP (MIP).
In all cases, such a network interface appears to the agent as a local
interface from which ports (and thus candidates) can be allocated.

If an agent is multihomed, it obtains a candidate from each IP address.
Depending on the location of the PEER (the other agent in the session)
on the IP network relative to the agent, the agent may be reachable by
the peer through one or more of those IP addresses.  Consider, for
example, an agent which has a local IP address on a private net 10
network (I1), and a second connected to the public Internet (I2).  A
candidate from I1 will be directly reachable when communicating with a
peer on the same private net 10 network, while a candidate from I2 will
be directly reachable when communicating with a peer on the public
Internet.
Rather than trying to guess which IP address will work prior to sending
an offer, the offering agent includes both candidates in its offer.

Next, the agent uses STUN or TURN to obtain additional candidates.
These come in two flavors: translated addresses on the public side of a
NAT (SERVER REFLEXIVE CANDIDATES) and addresses on TURN servers
(RELAYED CANDIDATES).  When TURN servers are utilized, both types of
candidates are obtained from the TURN server.  If only STUN servers are
utilized, only server reflexive candidates are obtained from them.  The
relationship of these candidates to the host candidate is shown in
Figure 2.  In this figure, both types of candidates are discovered
using TURN.  In the figure, the notation X:x means IP address X and UDP
port x.

                 To Internet

                     |
                     |
                     |  /------------  Relayed
                 Y:y | /               Address
                 +--------+
                 |        |
                 |  TURN  |
                 | Server |
                 |        |
                 +--------+
                     |
                     |
                     | /------------  Server
              X1':x1'|/               Reflexive
               +------------+         Address
               |    NAT     |
               +------------+
                     |
                     | /------------  Local
                 X:x |/               Address
                 +--------+
                 |        |
                 | Agent  |
                 |        |
                 +--------+

                 Figure 2: Candidate Relationships

When the agent sends the TURN Allocate Request from IP address and port
X:x, the NAT (assuming there is one) will create a binding X1':x1',
mapping this server reflexive candidate to the host candidate X:x.
Outgoing packets sent from the host candidate will be translated by the
NAT to the server reflexive candidate.  Incoming packets sent to the
server reflexive candidate will be translated by the NAT to the host
candidate and forwarded to the agent.  We call the host candidate
associated with a given server reflexive candidate the BASE.

   NOTE: "Base" refers to the address an agent sends from for a
   particular candidate.
   Thus, as a degenerate case host candidates also have a base, but
   it's the same as the host candidate.

When there are multiple NATs between the agent and the TURN server, the
TURN request will create a binding on each NAT, but only the outermost
server reflexive candidate (the one nearest the TURN server) will be
discovered by the agent.  If the agent is not behind a NAT, then the
base candidate will be the same as the server reflexive candidate and
the server reflexive candidate is redundant and will be eliminated.

The Allocate request then arrives at the TURN server.  The TURN server
allocates a port y from its local IP address Y, and generates an
Allocate response, informing the agent of this relayed candidate.  The
TURN server also informs the agent of the server reflexive candidate,
X1':x1', by copying the source transport address of the Allocate
request into the Allocate response.  The TURN server acts as a packet
relay, forwarding traffic between L and R.  In order to send traffic to
L, R sends traffic to the TURN server at Y:y, and the TURN server
forwards that to X1':x1', which passes through the NAT where it is
mapped to X:x and delivered to L.

When only STUN servers are utilized, the agent sends a STUN Binding
Request [I-D.ietf-behave-rfc3489bis] to its STUN server.  The STUN
server will inform the agent of the server reflexive candidate X1':x1'
by copying the source transport address of the Binding request into the
Binding response.

2.2. Connectivity Checks

Once L has gathered all of its candidates, it orders them from highest
to lowest priority and sends them to R over the signalling channel.
The candidates are carried in attributes in the SDP offer.  When R
receives the offer, it performs the same gathering process and responds
with its own list of candidates.
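
As a rough illustration of the list an agent ends up sending, the following Python sketch assembles host, server reflexive, and relayed candidates from the result of one STUN or TURN transaction, drops a server reflexive candidate that equals its base, and orders the result from highest to lowest priority.  The names `Candidate` and `gather`, the addresses, and the priority values 3, 2, 1 are all assumptions made for this example; the actual priority computation is the subject of Section 4.1.2 and is not reproduced here.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Address = Tuple[str, int]  # (IP address, UDP port)


@dataclass(frozen=True)
class Candidate:
    kind: str                # hypothetical labels: "host", "srflx", "relay"
    address: Address
    base: Optional[Address]  # address the agent sends from for this candidate
    priority: int            # placeholder value, NOT the Section 4.1.2 formula


def gather(host: Address, mapped: Address, relayed: Optional[Address] = None):
    """Build a candidate list from one host address and one STUN/TURN response."""
    # A host candidate is its own base (the degenerate case in the NOTE).
    candidates = [Candidate("host", host, host, 3)]
    # With no NAT in the path, the mapped address equals the base, so the
    # server reflexive candidate is redundant and is not added.
    if mapped != host:
        candidates.append(Candidate("srflx", mapped, host, 2))
    if relayed is not None:
        # The base of a relayed candidate is not discussed in the text above,
        # so it is left unset in this sketch.
        candidates.append(Candidate("relay", relayed, None, 1))
    return sorted(candidates, key=lambda c: c.priority, reverse=True)


# Hypothetical addresses: X:x, X1':x1' and Y:y from Figure 2.
behind_nat = gather(("10.0.1.1", 8998), ("192.0.2.3", 45664), ("192.0.2.100", 50000))
no_nat = gather(("192.0.2.1", 3478), ("192.0.2.1", 3478))

print([c.kind for c in behind_nat])  # ['host', 'srflx', 'relay']
print([c.kind for c in no_nat])      # ['host']
```

Keeping the base alongside each candidate makes the redundancy test a plain equality comparison.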
At the end of this process, each agent has a complete list of both its
candidates and its peer's candidates.  It pairs them up, resulting in
CANDIDATE PAIRS.  To see which pairs work, each agent schedules a
series of CHECKS.  Each check is a STUN request/response transaction
that the client will perform on a particular candidate pair by sending
a STUN request from the local candidate to the remote candidate.

The basic principle of the connectivity checks is simple:

1.  Sort the candidate pairs in priority order.

2.  Send checks on each candidate pair in priority order.

3.  Acknowledge checks received from the other agent.

With both agents performing a check on a candidate pair, the result is
a 4-way handshake:

         L                        R
         -                        -
         STUN request ->             \  L's
                   <- STUN response  /  check

                    <- STUN request  \  R's
         STUN response ->            /  check

                Figure 3: Basic Connectivity Check

It is important to note that the STUN requests are sent to and from the
exact same IP addresses and ports that will be used for media (e.g.,
RTP and RTCP).  Consequently, agents demultiplex STUN and RTP/RTCP
using the contents of the packets, rather than the port on which they
are received.  Fortunately, this demultiplexing is easy to do,
especially for RTP and RTCP.

Because a STUN Binding Request is used for the connectivity check, the
STUN Binding response will contain the agent's translated transport
address on the public side of any NATs between the agent and its peer.
If this transport address is different from other candidates the agent
already learned, it represents a new candidate, called a PEER REFLEXIVE
CANDIDATE, which then gets tested by ICE just the same as any other
candidate.

As an optimization, as soon as R gets L's check message, R schedules a
connectivity check message to be sent to L on the same candidate pair.
This accelerates the process of finding a valid candidate, and 547 is called a TRIGGERED CHECK. 549 At the end of this handshake, both L and R know that they can send 550 (and receive) messages end-to-end in both directions. 552 2.3. Sorting Candidates 554 Because the algorithm above searches all candidate pairs, if a 555 working pair exists it will eventually find it no matter what order 556 the candidates are tried in. In order to produce faster (and better) 557 results, the candidates are sorted in a specified order. The 558 resulting list of sorted candidate pairs is called the CHECK LIST. 559 The algorithm is described in Section 4.1.2 but follows two general 560 principles: 562 o Each agent gives its candidates a numeric priority which is sent 563 along with the candidate to the peer 565 o The local and remote priorities are combined so that each agent 566 has the same ordering for the candidate pairs. 568 The second property is important for getting ICE to work when there 569 are NATs in front of L and R. Frequently, NATs will not allow packets 570 in from a host until the agent behind the NAT has sent a packet 571 towards that host. Consequently, ICE checks in each direction will 572 not succeed until both sides have sent a check through their 573 respective NATs. 575 The agent works through this check list by sending a STUN request for 576 the next candidate pair on the list periodically. These are called 577 ORDINARY CHECKS. 579 In general the priority algorithm is designed so that candidates of 580 similar type get similar priorities and so that more direct routes 581 (that is, through fewer media relays and through fewer NATs) are 582 preferred over indirect ones (ones with more media relays and more 583 NATs). Within those guidelines, however, agents have a fair amount 584 of discretion about how to tune their algorithms. 586 2.4. 
Frozen Candidates 588 The previous description only addresses the case where the agents 589 wish to establish a media session with one COMPONENT (a piece of a 590 media stream requiring a single transport address; a media stream may 591 require multiple components, each of which has to work for the media 592 stream as a whole to work). Typically (e.g., with RTP and RTCP), 593 the agents actually need to establish connectivity for more than one 594 flow. 596 The network properties are likely to be very similar for each 597 component (especially because RTP and RTCP are sent and received from 598 the same IP address). It is usually possible to leverage information 599 from one media component in order to determine the best candidates 600 for another. ICE does this with a mechanism called "frozen 601 candidates." 603 Each candidate is associated with a property called its FOUNDATION. 604 Two candidates have the same foundation when they are "similar" - of 605 the same type and obtained from the same host candidate and STUN 606 server using the same protocol. Otherwise, their foundation is 607 different. A candidate pair has a foundation too, which is just the 608 concatenation of the foundations of its two candidates. Initially, 609 only the candidate pairs with unique foundations are tested. The 610 other candidate pairs are marked "frozen". When the connectivity 611 checks for a candidate pair succeed, the other candidate pairs with 612 the same foundation are unfrozen. This avoids repeated checking of 613 components which are superficially more attractive but in fact are 614 likely to fail. 616 While we've described "frozen" here as a separate mechanism for 617 expository purposes, in fact it is an integral part of ICE and 618 the ICE prioritization algorithm automatically ensures that the right 619 candidates are unfrozen and checked in the right order. 621 2.5.
Security for Checks 623 Because ICE is used to discover which addresses can be used to send 624 media between two agents, it is important to ensure that the process 625 cannot be hijacked to send media to the wrong location. Each STUN 626 connectivity check is covered by a message authentication code (MAC) 627 computed using a key exchanged in the signalling channel. This MAC 628 provides message integrity and data origin authentication, thus 629 stopping an attacker from forging or modifying connectivity check 630 messages. Furthermore, if the SIP [RFC3261] caller is using ICE, and 631 their call forks, the ICE exchanges happen independently with each 632 forked recipient. In such a case, the keys exchanged in the 633 signaling help associate each ICE exchange with each forked 634 recipient. 636 2.6. Concluding ICE 638 ICE checks are performed in a specific sequence, so that high 639 priority candidate pairs are checked first, followed by lower 640 priority ones. One way to conclude ICE is to declare victory as soon 641 as a check for each component of each media stream completes 642 successfully. Indeed, this is a reasonable algorithm, and details 643 for it are provided below. However, it is possible that a packet 644 loss will cause a higher priority check to take longer to complete. 645 In that case, allowing ICE to run a little longer might produce 646 better results. More fundamentally, however, the prioritization 647 defined by this specification may not yield "optimal" results. As an 648 example, if the aim is to select low latency media paths, usage of a 649 relay is a hint that latencies may be higher, but it is nothing more 650 than a hint. An actual RTT measurement could be made, and it might 651 demonstrate that a pair with lower priority is actually better than 652 one with higher priority. 654 Consequently, ICE assigns one of the agents in the role of the 655 CONTROLLING AGENT, and the other of the CONTROLLED AGENT. 
The 656 controlling agent gets to nominate which candidate pairs will get 657 used for media amongst the ones that are valid. It can do this in 658 one of two ways - using REGULAR NOMINATION or AGGRESSIVE NOMINATION. 660 With regular nomination, the controlling agent lets the checks 661 continue until at least one valid candidate pair for each media 662 stream is found. Then, it picks amongst those that are valid, and 663 sends a second STUN request on its NOMINATED candidate pair, but this 664 time with a flag set to tell the peer that this pair has been 665 nominated for use. This is shown in Figure 4. 667 L R 668 - - 669 STUN request -> \ L's 670 <- STUN response / check 672 <- STUN request \ R's 673 STUN response -> / check 675 STUN request + flag -> \ L's 676 <- STUN response / check 678 Figure 4: Regular Nomination 680 Once the STUN transaction with the flag completes, both sides cancel 681 any future checks for that media stream. ICE will now send media 682 using this pair. The pair an ICE agent is using for media is called 683 the SELECTED PAIR. 685 In aggressive nomination, the controlling agent puts the flag in 686 every STUN request it sends. This way, once the first check 687 succeeds, ICE processing is complete for that media stream and the 688 controlling agent doesn't have to send a second STUN request. The 689 selected pair will be the highest priority valid pair whose check 690 succeeded. Aggressive nomination is faster than regular nomination, 691 but gives less flexibility. Aggressive nomination is shown in 692 Figure 5. 694 L R 695 - - 696 STUN request + flag -> \ L's 697 <- STUN response / check 699 <- STUN request \ R's 700 STUN response -> / check 702 Figure 5: Aggressive Nomination 704 Once all of the media streams are completed, the controlling endpoint 705 sends an updated offer if the candidates in the m and c lines for the 706 media stream (called the DEFAULT CANDIDATES) don't match ICE's 707 SELECTED CANDIDATES. 
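The difference between the two nomination styles described above can be illustrated with a small model. This is only a sketch under stated assumptions, not the normative procedure: the function name nominate, the tuple layout of the check results, and the policy callback are all hypothetical, and the model ignores timing, retransmissions and the controlled agent's own checks.

```python
def nominate(results, aggressive, policy=None):
    """Model the controlling agent's choice for one media stream.

    results: list of (pair_name, priority, succeeded) tuples, one per
             connectivity check the controlling agent sent (assumed layout).
    Returns (selected_pair_or_None, number_of_requests_sent).
    """
    valid = [(pair, prio) for pair, prio, ok in results if ok]
    if not valid:
        return None, len(results)

    if aggressive:
        # Every request already carried the flag, so no extra request is
        # needed: the highest-priority pair whose check succeeded is selected.
        pair = max(valid, key=lambda v: v[1])[0]
        return pair, len(results)

    # Regular nomination: choose among the valid pairs using local policy
    # (priority by default), then send one more request carrying the flag.
    policy = policy or (lambda v: v[1])
    pair = max(valid, key=policy)[0]
    return pair, len(results) + 1


checks = [
    ("host-host", 300, False),
    ("srflx-srflx", 200, True),
    ("relay-relay", 100, True),
]

# Hypothetical policy: the agent measured RTT and found the relayed path faster.
measured_rtt_ms = {"srflx-srflx": 80, "relay-relay": 35}
prefer_low_rtt = lambda v: -measured_rtt_ms[v[0]]

print(nominate(checks, aggressive=True))
print(nominate(checks, aggressive=False, policy=prefer_low_rtt))
```

The extra request in the regular case is the cost of the added flexibility: the controlling agent can apply a criterion other than priority (here, an assumed RTT measurement) before committing to a pair.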
709 Once ICE is concluded, it can be restarted at any time for one or all 710 of the media streams by either agent. This is done by sending an 711 updated offer indicating a restart. 713 2.7. Lite Implementations 715 In order for ICE to be used in a call, both agents need to support 716 it. However, certain agents will always be connected to the public 717 Internet and have a public IP address at which it can receive packets 718 from any correspondent. To make it easier for these devices to 719 support ICE, ICE defines a special type of implementation called LITE 720 (in contrast to the normal FULL implementation). A lite 721 implementation doesn't gather candidates; it includes only host 722 candidates for any media stream. Lite agents do not generate 723 connectivity checks or run the state machines, though they need to be 724 able to respond to connectivity checks. When a lite implementation 725 connects with a full implementation, the full agent takes the role of 726 the controlling agent, and the lite agent takes on the controlled 727 role. When two lite implementations connect, no checks are sent. 729 For guidance on when a lite implementation is appropriate, see the 730 discussion in Appendix A. 732 It is important to note that the lite implementation was added to 733 this specification to provide a stepping stone to full 734 implementation. Even for devices that are always connected to the 735 public Internet, a full implementation is preferable if achievable. 737 3. Terminology 739 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 740 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 741 document are to be interpreted as described in RFC 2119 [RFC2119]. 
743 Readers should be familiar with the terminology defined in the offer/ 744 answer model [RFC3264], STUN [I-D.ietf-behave-rfc3489bis] and NAT 745 Behavioral requirements for UDP [RFC4787]. 747 This specification makes use of the following additional terminology: 749 Agent: As defined in RFC 3264, an agent is the protocol 750 implementation involved in the offer/answer exchange. There are 751 two agents involved in an offer/answer exchange. 753 Peer: From the perspective of one of the agents in a session, its 754 peer is the other agent. Specifically, from the perspective of 755 the offerer, the peer is the answerer. From the perspective of 756 the answerer, the peer is the offerer. 758 Transport Address: The combination of an IP address and transport 759 protocol (such as UDP or TCP) port. 761 Candidate: A transport address that is a potential point of contact 762 for receipt of media. Candidates also have properties - their 763 type (server reflexive, relayed or host), priority, foundation, 764 and base. 766 Component: A component is a piece of a media stream requiring a 767 single transport address; a media stream may require multiple 768 components, each of which has to work for the media stream as a 769 whole to work. For media streams based on RTP, there are two 770 components per media stream - one for RTP, and one for RTCP. 772 Host Candidate: A candidate obtained by binding to a specific port 773 from an IP address on the host. This includes IP addresses on 774 physical interfaces and logical ones, such as ones obtained 775 through Virtual Private Networks (VPNs) and Realm Specific IP 776 (RSIP) [RFC3102] (which lives at the operating system level). 778 Server Reflexive Candidate: A candidate whose IP address and port 779 are a binding allocated by a NAT for an agent when it sent a 780 packet through the NAT to a server.
Server reflexive candidates 781 can be learned from STUN servers using the Binding Request, or from TURN 782 servers, which provide both a relayed and a server reflexive 783 candidate. 785 Peer Reflexive Candidate: A candidate whose IP address and port are 786 a binding allocated by a NAT for an agent when it sent a STUN 787 Binding Request through the NAT to its peer. 789 Relayed Candidate: A candidate obtained by sending a TURN Allocate 790 request from a host candidate to a TURN server. The relayed 791 candidate is resident on the TURN server, and the TURN server 792 relays packets back towards the agent. 794 Base: The base of a server reflexive candidate is the host candidate 795 from which it was derived. A host candidate is also said to have 796 a base, equal to that candidate itself. Similarly, the base of a 797 relayed candidate is that candidate itself. 799 Foundation: An arbitrary string that is the same for two candidates 800 that have the same type, base IP address, protocol (UDP, TCP, 801 etc.) and STUN or TURN server. If any of these are different then 802 the foundation will be different. Two candidate pairs with the 803 same foundation are likely to have similar network 804 characteristics. Foundations are used in the frozen algorithm. 806 Local Candidate: A candidate that an agent has obtained and included 807 in an offer or answer it sent. 809 Remote Candidate: A candidate that an agent received in an offer or 810 answer from its peer. 812 Default Destination/Candidate: The default destination for a 813 component of a media stream is the transport address that would be 814 used by an agent that is not ICE aware. For the RTP component, 815 the default IP address is in the c line of the SDP, and the port 816 in the m line. For the RTCP component it is in the rtcp attribute 817 when present, and when not present, the IP address in the c line 818 and 1 plus the port in the m line.
A default candidate for a 819 component is one whose transport address matches the default 820 destination for that component. 822 Candidate Pair: A pairing containing a local candidate and a remote 823 candidate. 825 Check, Connectivity Check, STUN Check: A STUN Binding Request 826 transaction for the purposes of verifying connectivity. A check 827 is sent from the local candidate to the remote candidate of a 828 candidate pair. 830 Check List: An ordered set of candidate pairs that an agent will use 831 to generate checks. 833 Ordinary Check: A connectivity check generated by an agent as a 834 consequence of a timer that fires periodically, instructing it to 835 send a check. 837 Triggered Check: A connectivity check generated as a consequence of 838 the receipt of a connectivity check from the peer. 840 Valid List: An ordered set of candidate pairs for a media stream 841 that have been validated by a successful STUN transaction. 843 Full: An ICE implementation that performs the complete set of 844 functionality defined by this specification. 846 Lite: An ICE implementation that omits certain functions, 847 implementing only as much as is necessary for a peer 848 implementation that is full to gain the benefits of ICE. Lite 849 implementations do not maintain any of the state machines and do 850 not generate connectivity checks. 852 Controlling Agent: The ICE agent which is responsible for selecting 853 the final choice of candidate pairs and signaling them through 854 STUN and an updated offer, if needed. In any session, one agent 855 is always controlling. The other is the controlled agent. 857 Controlled Agent: An ICE agent which waits for the controlling agent 858 to select the final choice of candidate pairs. 860 Regular Nomination: The process of picking a valid candidate pair 861 for media traffic by validating the pair with one STUN request, 862 and then picking it by sending a second STUN request with a flag 863 indicating its nomination. 
865 Aggressive Nomination: The process of picking a valid candidate pair 866 for media traffic by including a flag in every STUN request, such 867 that the first one to produce a valid candidate pair is used for 868 media. 870 Nominated: If a valid candidate pair has its nominated flag set, it 871 means that it may be selected by ICE for sending and receiving 872 media. 874 Selected Pair, Selected Candidate: The candidate pair selected by 875 ICE for sending and receiving media is called the selected pair, 876 and each of its candidates is called the selected candidate. 878 4. Sending the Initial Offer 880 In order to send the initial offer in an offer/answer exchange, an 881 agent must (1) gather candidates, (2) prioritize them, (3) choose 882 default candidates, and then (4) formulate and send the SDP offer. 883 All but the last of these four steps differ for full and lite 884 implementations. 886 4.1. Full Implementation Requirements 888 4.1.1. Gathering Candidates 890 An agent gathers candidates when it believes that communication is 891 imminent. An offerer can do this based on a user interface cue, or 892 based on an explicit request to initiate a session. Every candidate 893 is a transport address. It also has a type and a base. Four types 894 are defined and gathered by this specification - host candidates, 895 server reflexive candidates, peer reflexive candidates, and relayed 896 candidates. The server reflexive candidates are gathered 897 using STUN or TURN, and relayed candidates are obtained through TURN. 898 Peer reflexive candidates are obtained in later phases of ICE, as a 899 consequence of connectivity checks. The base of a candidate is the 900 candidate that an agent must send from when using that candidate. 902 4.1.1.1. Host Candidates 904 The first step is to gather host candidates.
Host candidates are 905 obtained by binding to ports (typically ephemeral) on an IP address 906 attached to an interface (physical or virtual, including VPN 907 interfaces) on the host. 909 For each UDP media stream the agent wishes to use, the agent SHOULD 910 obtain a candidate for each component of the media stream on each IP 911 address that the host has. It obtains each candidate by binding to a 912 UDP port on the specific IP address. A host candidate (and indeed 913 every candidate) is always associated with a specific component for 914 which it is a candidate. Each component has an ID assigned to it, 915 called the component ID. For RTP-based media streams, the RTP itself 916 has a component ID of 1, and RTCP a component ID of 2. If an agent 917 is using RTCP it MUST obtain a candidate for it. If an agent is 918 using both RTP and RTCP and has K IP addresses, it ends up with 2*K 919 host candidates. 921 The base for each host candidate is set to the candidate itself. 923 4.1.1.2. Server Reflexive and Relayed Candidates 925 Agents SHOULD obtain relayed candidates and SHOULD obtain server 926 reflexive candidates. These requirements are at SHOULD strength to 927 allow for provider variation. Use of STUN and TURN servers may be 928 unnecessary in closed networks where agents are never connected to 929 the public Internet or to endpoints outside of the closed network. 930 In such cases, a full implementation would be used for agents that 931 are dual-stack or multi-homed, to select a host candidate. Use of 932 TURN servers is expensive, and when ICE is being used, they will only 933 be utilized when both endpoints are behind NATs that perform address 934 and port dependent mapping. Consequently, some deployments might 935 consider this use case to be marginal, and elect not to use TURN 936 servers.
If an agent does not gather server reflexive or relayed 937 candidates, it is RECOMMENDED that the functionality be implemented 938 and just disabled through configuration, so that it can be re-enabled 939 through configuration if conditions change in the future. 941 If an agent is gathering both relayed and server reflexive 942 candidates, it uses a TURN server. If it is gathering just server 943 reflexive candidates, it uses a STUN server. 945 The agent next pairs each host candidate with the STUN or TURN server 946 with which it is configured or has discovered by some means. If a 947 STUN or TURN server is configured, it is RECOMMENDED that a domain 948 name be configured, and the DNS procedures in 949 [I-D.ietf-behave-rfc3489bis] (using SRV records with the "stun" 950 service) be used to discover the STUN server, and the DNS procedures 951 in [I-D.ietf-behave-turn] (using SRV records with the "turn" service) 952 be used to discover the TURN server. 954 This specification only considers usage of a single STUN or TURN 955 server. When there are multiple choices for that single STUN or TURN 956 server (when, for example, they are learned through DNS records and 957 multiple results are returned), an agent SHOULD use a single STUN or 958 TURN server (based on its IP address) for all candidates for a 959 particular session. This improves the performance of ICE. The 960 result is a set of pairs of host candidates with STUN or TURN 961 servers. The agent then chooses one pair, and sends a Binding or 962 Allocate request to the server from that host candidate. Binding 963 Requests to a STUN server are not authenticated, and any ALTERNATE- 964 SERVER attribute in a response is ignored. Agents MUST support the 965 backwards compatibility mode for the Binding Request defined in 966 [I-D.ietf-behave-rfc3489bis]. Allocate requests SHOULD be 967 authenticated using a long-term credential obtained by the client 968 through some other means.
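As a rough illustration of the pairing step described above, the following sketch enumerates one host candidate per component per IP address and pairs each with a single server. It is a simplified model, not an implementation: the function name plan_gathering, the tuple representation of a host candidate, and the server address are hypothetical, and no sockets are bound and no STUN or TURN messages are sent.

```python
def plan_gathering(ip_addresses, server, want_relayed, components=(1, 2)):
    """Return (request_type, pairs) for one media stream.

    ip_addresses: local IP addresses of the host (assumed already known).
    server:       the single STUN or TURN server used for the session.
    want_relayed: True when relayed candidates are also being gathered.
    components:   component IDs; 1 is RTP and 2 is RTCP for RTP-based media.
    """
    # A TURN Allocate request yields both a relayed and a server reflexive
    # candidate; a STUN Binding request yields only a server reflexive one.
    request_type = "Allocate" if want_relayed else "Binding"

    # One host candidate per component on each IP address (2*K for RTP + RTCP).
    host_candidates = [(ip, comp) for ip in ip_addresses for comp in components]

    # Every host candidate is paired with the same server for the session.
    pairs = [(host, server) for host in host_candidates]
    return request_type, pairs


# Documentation addresses; the server address is a placeholder.
request_type, pairs = plan_gathering(
    ["192.0.2.10", "198.51.100.7"], server="203.0.113.5", want_relayed=True
)
print(request_type, len(pairs))
```

A real agent would then work through these pairs one at a time, sending the Binding or Allocate request from the corresponding host candidate.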
970 Every Ta milliseconds thereafter, the agent can generate another new 971 STUN or TURN transaction. This transaction can either be a retry of 972 a previous transaction which failed with a recoverable error (such as 973 authentication failure), or a transaction for a new host candidate 974 and STUN or TURN server pair. The agent SHOULD NOT generate 975 transactions more frequently than one every Ta milliseconds. See 976 Section 16 for guidance on how to set Ta and the STUN retransmit 977 timer, RTO. 979 The agent will receive a Binding or Allocate response. A successful 980 Allocate Response will provide the agent with a server reflexive 981 candidate (obtained from the mapped address) and a relayed candidate 982 in the RELAY-ADDRESS attribute. If the Allocate request is rejected 983 because the server lacks resources to fulfill it, the agent SHOULD 984 instead send a Binding Request to obtain a server reflexive 985 candidate. A Binding Response will provide the agent with only a 986 server reflexive candidate (also obtained from the mapped address). 987 The base of the server reflexive candidate is the host candidate from 988 which the Allocate or Binding request was sent. The base of a 989 relayed candidate is that candidate itself. If a relayed candidate 990 is identical to a host candidate (which can happen in rare cases), 991 the relayed candidate MUST be discarded. 993 4.1.1.3. Computing Foundations 995 Finally, the agent assigns each candidate a foundation. The 996 foundation is an identifier, scoped within a session. Two candidates 997 MUST have the same foundation ID when all of the following are true: 999 o they are of the same type (host, relayed, server reflexive, or 1000 peer reflexive) 1002 o their bases have the same IP address (the ports can be different) 1004 o for reflexive and relayed candidates, the STUN or TURN servers 1005 used to obtain them have the same IP address. 
1007 o they were obtained using the same transport protocol (TCP, UDP, 1008 etc.) 1010 Similarly, two candidates MUST have different foundations if their 1011 types are different, their bases have different IP addresses, the 1012 STUN or TURN servers used to obtain them have different IP addresses, 1013 or their transport protocols are different. 1015 4.1.1.4. Keeping Candidates Alive 1017 Once server reflexive and relayed candidates are allocated, they MUST 1018 be kept alive until ICE processing has completed, as described in 1019 Section 8.3. For server reflexive candidates learned through a 1020 Binding request, the bindings MUST be kept alive by additional 1021 Binding Requests to the server. For relayed candidates learned 1022 through an Allocate request, the keepalive MUST be new Allocate 1023 requests. The Allocate requests will also refresh the server 1024 reflexive candidate. 1026 4.1.2. Prioritizing Candidates 1028 The prioritization process results in the assignment of a priority to 1029 each candidate. Each candidate for a media stream MUST have a unique 1030 priority that MUST be a positive integer between 1 and (2**31 - 1). 1031 This priority will be used by ICE to determine the order of the 1032 connectivity checks and the relative preference for candidates. 1034 An agent SHOULD compute this priority using the formula in 1035 Section 4.1.2.1 and choose its parameters using the guidelines in 1036 Section 4.1.2.2. If an agent elects to use a different formula, ICE 1037 will take longer to converge since both agents will not be 1038 coordinated in their checks. 1040 4.1.2.1. Recommended Formula 1042 When using the formula, an agent computes the priority by determining 1043 a preference for each type of candidate (server reflexive, peer 1044 reflexive, relayed and host), and, when the agent is multihomed, 1045 choosing a preference for its IP addresses. These two preferences 1046 are then combined to compute the priority for a candidate. 
That 1047 priority is computed using the following formula: 1049 priority = (2^24)*(type preference) + 1050 (2^8)*(local preference) + 1051 (2^0)*(256 - component ID) 1053 The type preference MUST be an integer from 0 to 126 inclusive, and 1054 represents the preference for the type of the candidate (where the 1055 types are host, server reflexive, peer reflexive and relayed). A 1056 126 is the highest preference, and a 0 is the lowest. Setting the 1057 value to a 0 means that candidates of this type will only be used as 1058 a last resort. The type preference MUST be identical for all 1059 candidates of the same type and MUST be different for candidates of 1060 different types. The type preference for peer reflexive candidates 1061 MUST be higher than that of server reflexive candidates. Note that 1062 candidates gathered based on the procedures of Section 4.1.1 will 1063 never be peer reflexive candidates; candidates of this type are 1064 learned from the connectivity checks performed by ICE. 1066 The local preference MUST be an integer from 0 to 65535 inclusive. 1067 It represents a preference for the particular IP address from which 1068 the candidate was obtained, in cases where an agent is multihomed. 1069 65535 represents the highest preference, and a zero, the lowest. 1070 When there is only a single IP address, this value SHOULD be set to 1071 65535. More generally, if there are multiple candidates for a 1072 particular component for a particular media stream which have the 1073 same type, the local preference MUST be unique for each one. In this 1074 specification, this only happens for multi-homed hosts. If a host is 1075 multi-homed because it is dual stacked, the local preference SHOULD 1076 be set equal to the precedence value for IP addresses described in 1077 RFC 3484 [RFC3484]. 1079 The component ID is the component ID for the candidate, and MUST be 1080 between 1 and 256 inclusive. 1082 4.1.2.2.
Guidelines for Choosing Type and Local Preferences 1084 One criterion for selection of the type and local preference values is 1085 the use of a media intermediary, such as a TURN server, VPN server or 1086 NAT. With a media intermediary, if media is sent to that candidate, 1087 it will first transit the media intermediary before being received. 1088 Relayed candidates are one type of candidate that involves a media 1089 intermediary. Host candidates obtained from a VPN 1090 interface are another. When media is transited through a media intermediary, it 1091 can increase the latency between transmission and reception. It can 1092 increase the packet losses, because of the additional router hops 1093 that may be taken. It may increase the cost of providing service, 1094 since media will be routed in and right back out of a media 1095 intermediary run by a provider. If these concerns are important, the 1096 type preference for relayed candidates SHOULD be lower than host 1097 candidates. The RECOMMENDED values are 126 for host candidates, 100 1098 for server reflexive candidates, 110 for peer reflexive candidates, 1099 and 0 for relayed candidates. Furthermore, if an agent is multi- 1100 homed and has multiple IP addresses, the local preference for host 1101 candidates from a VPN interface SHOULD have a priority of 0. 1103 Another criterion for selection of preferences is IP address family. 1104 ICE works with both IPv4 and IPv6. It therefore provides a 1105 transition mechanism that allows dual-stack hosts to prefer 1106 connectivity over IPv6, but to fall back to IPv4 in case the v6 1107 networks are disconnected (due, for example, to a failure in a 6to4 1108 relay) [RFC3056]. It can also help with hosts that have both a 1109 native IPv6 address and a 6to4 address. In such a case, higher local 1110 preferences could be assigned to the v6 addresses, followed by the 1111 6to4 addresses, followed by the v4 addresses.
This allows a site to 1112 obtain and begin using native v6 addresses immediately, yet still 1113 fall back to 6to4 addresses when communicating with agents in other 1114 sites that do not yet have native v6 connectivity. 1116 Another criterion for selecting preferences is security. If a user is 1117 a telecommuter, and therefore connected to their corporate network 1118 and a local home network, they may prefer their voice traffic to be 1119 routed over the VPN in order to keep it on the corporate network when 1120 communicating within the enterprise, but use the local network when 1121 communicating with users outside of the enterprise. In such a case, 1122 a VPN address would have a higher local preference than any other 1123 address. 1125 Another criterion for selecting preferences is topological awareness. 1126 This is most useful for candidates that make use of intermediaries. 1127 In those cases, if an agent has preconfigured or dynamically 1128 discovered knowledge of the topological proximity of the 1129 intermediaries to itself, it can use that to assign higher local 1130 preferences to candidates obtained from closer intermediaries. 1132 4.1.3. Eliminating Redundant Candidates 1134 Next, the agent eliminates redundant candidates. A candidate is 1135 redundant if its transport address equals that of another candidate, and its 1136 base equals the base of that other candidate. Note that two 1137 candidates can have the same transport address yet have different 1138 bases, and these would not be considered redundant. Frequently, a 1139 server reflexive candidate and a host candidate will be redundant 1140 when the agent is not behind a NAT. The agent SHOULD eliminate the 1141 redundant candidate with the lower priority. 1143 4.1.4. Choosing Default Candidates 1145 A candidate is said to be default if it would be the target of media 1146 from a non-ICE peer; that target being called the DEFAULT 1147 DESTINATION.
If the default candidates are not selected by the ICE 1148 algorithm when communicating with an ICE-aware peer, an updated 1149 offer/answer will be required after ICE processing completes in order 1150 to "fix-up" the SDP so that the default destination for media matches 1151 the candidates selected by ICE. If ICE happens to select the default 1152 candidates, no updated offer/answer is required. 1154 An agent MUST choose a set of candidates, one for each component of 1155 each in-use media stream, to be default. A media stream is in-use if 1156 it does not have a port of zero (which is used in RFC 3264 to reject 1157 a media stream). Consequently, a media stream is in-use even if it 1158 is marked as a=inactive [RFC4566] or has a bandwidth value of zero. 1160 It is RECOMMENDED that default candidates be chosen based on the 1161 likelihood of those candidates to work with the peer that is being 1162 contacted. It is RECOMMENDED that the default candidates are the 1163 relayed candidates (if relayed candidates are available), server 1164 reflexive candidates (if server reflexive candidates are available), 1165 and finally host candidates. 1167 4.2. Lite Implementation 1169 Lite implementations only utilize host candidates. A lite 1170 implementation MUST, for each component of each media stream, 1171 allocate zero or one IPv4 candidates. It MAY allocate zero or more 1172 IPv6 candidates, but no more than one per each IPv6 address utilized 1173 by the host. Since there can be no more than one IPv4 candidate per 1174 component of each media stream, if an agent has multiple IPv4 1175 addresses, it MUST choose one for allocating the candidate. If a 1176 host is dual-stack, it is RECOMMENDED that it allocate one IPv4 1177 candidate and one global IPv6 address. With the lite implementation, 1178 ICE cannot be used to dynamically choose amongst candidates. 
   Therefore, including more than one candidate from a particular scope
   is NOT RECOMMENDED, since only a connectivity check can truly
   determine whether to use one address or the other.

   Each component has an ID assigned to it, called the component ID.
   For RTP-based media streams the RTP itself has a component ID of 1,
   and RTCP a component ID of 2.  If an agent is using RTCP it MUST
   obtain candidates for it.

   Each candidate is assigned a foundation.  The foundation MUST be
   different for two candidates allocated from different IP addresses,
   and MUST be the same otherwise.  A simple integer that increments for
   each IP address will suffice.  In addition, each candidate MUST be
   assigned a unique priority amongst all candidates for the same media
   stream.  This priority SHOULD be equal to:

      priority = (2^24)*(126) +
                 (2^8)*(IP precedence) +
                 (2^0)*(256 - component ID)

   If a host is v4-only, it SHOULD set the IP precedence to 65535.  If a
   host is v6 or dual-stack, the IP precedence SHOULD be the precedence
   value for IP addresses described in RFC 3484 [RFC3484].

   Next, an agent chooses a default candidate for each component of each
   media stream.  If a host is IPv4 only, there would only be one
   candidate for each component of each media stream, and therefore that
   candidate is the default.  If a host is IPv6 or dual stack, the
   selection of default is a matter of local policy.  This default
   SHOULD be chosen such that it is the candidate most likely to be
   used with a peer.  For IPv6-only hosts, this would typically be a
   globally scoped IPv6 address.  For dual-stack hosts, the IPv4 address
   is RECOMMENDED.

4.3.  Encoding the SDP

   The process of encoding the SDP is identical between full and lite
   implementations.

   The agent will include an m-line for each media stream it wishes to
   use.
   The ordering of media streams in the SDP is relevant for ICE.
   ICE will perform its connectivity checks for the first m-line first,
   and consequently media will be able to flow for that stream first.
   Agents SHOULD place their most important media stream, if there is
   one, first in the SDP.

   There will be a candidate attribute for each candidate for a
   particular media stream.  Section 15 provides detailed rules for
   constructing this attribute.  The attribute carries the IP address,
   port and transport protocol for the candidate, in addition to its
   properties that need to be signaled to the peer for ICE to work: the
   priority, foundation, and component ID.  The candidate attribute also
   carries information about the candidate that is useful for
   diagnostics and other functions: its type and related transport
   addresses.

   STUN connectivity checks between agents are authenticated using the
   short term credential mechanism defined for STUN
   [I-D.ietf-behave-rfc3489bis].  This mechanism relies on a username
   and password that are exchanged through protocol machinery between
   the client and server.  With ICE, the offer/answer exchange is used
   to exchange them.  The username part of this credential is formed by
   concatenating a username fragment from each agent, separated by a
   colon.  Each agent also provides a password, used to compute the
   message integrity for requests it receives.  The username fragment
   and password are exchanged in the ice-ufrag and ice-pwd attributes,
   respectively.  In addition to providing security, the username
   provides disambiguation and correlation of checks to media streams.
   See Appendix B.4 for motivation.

   If an agent is a lite implementation, it MUST include an "a=ice-lite"
   session level attribute in its SDP.  If an agent is a full
   implementation, it MUST NOT include this attribute.
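
   As an informal illustration of the attributes described above, the
   following Python sketch renders a=candidate lines and the session
   level ICE attributes.  It is not part of the specification: the
   names Candidate, candidate_line and session_ice_lines are
   hypothetical, the field order simply mirrors the example SDP in this
   section, and Section 15 remains the authoritative grammar.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    """Illustrative container for the fields carried in a=candidate."""
    foundation: str
    component_id: int          # 1 for RTP, 2 for RTCP
    transport: str             # e.g. "UDP"
    priority: int
    address: str
    port: int
    cand_type: str             # e.g. "host" or "srflx"
    related_address: Optional[str] = None
    related_port: Optional[int] = None


def candidate_line(c: Candidate) -> str:
    """Render one a=candidate attribute on a single, unfolded line."""
    line = (f"a=candidate:{c.foundation} {c.component_id} {c.transport} "
            f"{c.priority} {c.address} {c.port} typ {c.cand_type}")
    # The related transport address is diagnostic information only.
    if c.related_address is not None and c.related_port is not None:
        line += f" raddr {c.related_address} rport {c.related_port}"
    return line


def session_ice_lines(ufrag: str, pwd: str, lite: bool) -> List[str]:
    """Session level ICE attributes; a=ice-lite only for lite agents."""
    lines = ["a=ice-lite"] if lite else []
    lines.append(f"a=ice-pwd:{pwd}")
    lines.append(f"a=ice-ufrag:{ufrag}")
    return lines
```

   Under these assumptions, a server reflexive candidate with
   foundation 2, component ID 1, priority 1694498815, address 192.0.2.3
   port 45664 and related address 10.0.1.1 port 8998 renders as a
   single a=candidate line carrying those values in that order.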

   The default candidates are added to the SDP as the default
   destination for media.  For streams based on RTP, this is done by
   placing the IP address and port of the RTP candidate into the c and m
   lines, respectively.  If the agent is utilizing RTCP, it MUST encode
   the RTCP candidate using the a=rtcp attribute as defined in RFC 3605
   [RFC3605].  If RTCP is not in use, the agent MUST signal that using
   b=RS:0 and b=RR:0 as defined in RFC 3556 [RFC3556].

   The transport addresses that will be the default destination for
   media when communicating with non-ICE peers MUST also be present as
   candidates in one or more a=candidate lines.

   ICE provides for extensibility by allowing an offer or answer to
   contain a series of tokens which identify the ICE extensions used by
   that agent.  If an agent supports an ICE extension, it MUST include
   the token defined for that extension in the ice-options attribute.

   The following is an example SDP message that includes ICE attributes
   (lines folded for readability):

       v=0
       o=jdoe 2890844526 2890842807 IN IP4 10.0.1.1
       s=
       c=IN IP4 192.0.2.3
       t=0 0
       a=ice-pwd:asd88fgpdd777uzjYhagZg
       a=ice-ufrag:8hhY
       m=audio 45664 RTP/AVP 0
       b=RS:0
       b=RR:0
       a=rtpmap:0 PCMU/8000
       a=candidate:1 1 UDP 2130706431 10.0.1.1 8998 typ host
       a=candidate:2 1 UDP 1694498815 192.0.2.3 45664 typ srflx raddr
   10.0.1.1 rport 8998

   Once an agent has sent its offer or sent its answer, that agent MUST
   be prepared to receive both STUN and media packets on each candidate.
   As discussed in Section 11.1, media packets can be sent to a
   candidate prior to its appearance as the default destination for
   media in an offer or answer.

5.  Receiving the Initial Offer

   When an agent receives an initial offer, it will check if the offerer
   supports ICE, determine its own role, gather candidates, prioritize
   them, choose default candidates, encode and send an answer, and for
   full implementations, form the check lists and begin connectivity
   checks.

5.1.  Verifying ICE Support

   The agent will proceed with the ICE procedures defined in this
   specification if, for each media stream in the SDP it received, the
   default destination for each component of that media stream appears
   in a candidate attribute.  For example, in the case of RTP, the IP
   address and port in the c and m line, respectively, appear in a
   candidate attribute and the value in the rtcp attribute appears in a
   candidate attribute.

   If this condition is not met, the agent MUST process the SDP based on
   normal RFC 3264 procedures, without using any of the ICE mechanisms
   described in the remainder of this specification with the following
   exceptions:

   1.  The agent MUST follow the rules of Section 10, which describe
       keepalive procedures for all agents.

   2.  If the agent is not proceeding with ICE because there were
       a=candidate attributes, but none that matched the default
       destination of the media stream, the agent MUST include an
       a=ice-mismatch attribute in its answer.

   3.  If the default candidates were relayed candidates learned through
       a TURN server, the agent MUST create permissions in the TURN
       server for the IP addresses learned from its peer in the SDP it
       just received.  If this is not done, initial packets in the media
       stream from the peer may be lost.

5.2.  Determining Role

   For each session, each agent takes on a role.  There are two roles -
   controlling, and controlled.  The controlling agent is responsible
   for the choice of the final candidate pairs used for communications.
   For a full agent, this means nominating the candidate pairs that can
   be used by ICE for each media stream, and generating the updated
   offer based on ICE's selection, when needed.  For a lite
   implementation, being the controlling agent means selecting a
   candidate pair based on the ones in the offer and answer (for IPv4,
   there is only ever one pair), and then generating an updated offer
   reflecting that selection, when needed (it is never needed for an
   IPv4 only host).  The controlled agent is told which candidate pairs
   to use for each media stream, and does not generate an updated offer
   to signal this information.  The sections below describe in detail
   the actual procedures followed by controlling and controlled nodes.

   The rules for determining the role and the impact on behavior are as
   follows:

   Both agents are full:  The agent which generated the offer which
      started the ICE processing MUST take the controlling role, and the
      other MUST take the controlled role.  Both agents will form check
      lists, run the ICE state machines, and generate connectivity
      checks.  The controlling agent will execute the logic in
      Section 8.1 to nominate pairs that will be selected by ICE, and
      then both agents end ICE as described in Section 8.1.2.  In
      unusual cases, described in Appendix B.11, it is possible for both
      agents to mistakenly believe they are controlled or controlling.
      To resolve this, each agent MUST select a random number, called
      the tie-breaker, uniformly distributed between 0 and (2**64) - 1
      (that is, a 64 bit positive integer).  This number is used in
      connectivity checks to detect and repair this case, as described
      in Section 7.1.1.2.

   One agent Full, one Lite:  The full agent MUST take the controlling
      role, and the lite agent MUST take the controlled role.
      The full
      agent will form check lists, run the ICE state machines, and
      generate connectivity checks.  That agent will execute the logic
      in Section 8.1 to nominate pairs that will be selected by ICE, and
      use the logic in Section 8.1.2 to end ICE.  The lite
      implementation will just listen for connectivity checks, receive
      them and respond to them, and then conclude ICE as described in
      Section 8.2.  For the lite implementation, the state of ICE
      processing for each media stream is considered to be Running, and
      the state of ICE overall is Running.

   Both Lite:  The agent which generated the offer which started the ICE
      processing MUST take the controlling role, and the other MUST take
      the controlled role.  In this case, no connectivity checks are
      ever sent.  Rather, once the offer/answer exchange completes, each
      agent performs the processing described in Section 8 without
      connectivity checks.  It is possible that both agents will believe
      they are controlled or controlling.  In the latter case, the
      conflict is resolved through glare detection capabilities in the
      signaling protocol carrying the offer/answer exchange.  The state
      of ICE processing for each media stream is considered to be
      Running, and the state of ICE overall is Running.

   Once roles are determined for a session, they persist unless ICE is
   restarted.  An ICE restart (Section 9.1) causes a new selection of
   roles and tie-breakers.

5.3.  Gathering Candidates

   The process for gathering candidates at the answerer is identical to
   the process for the offerer as described in Section 4.1.1 for full
   implementations and Section 4.2 for lite implementations.  It is
   RECOMMENDED that this process begin immediately on receipt of the
   offer, prior to alerting the user.  Such gathering MAY begin when an
   agent starts.

5.4.  Prioritizing Candidates

   The process for prioritizing candidates at the answerer is identical
   to the process followed by the offerer, as described in Section 4.1.2
   for full implementations and Section 4.2 for lite implementations.

5.5.  Choosing Default Candidates

   The process for selecting default candidates at the answerer is
   identical to the process followed by the offerer, as described in
   Section 4.1.4 for full implementations and Section 4.2 for lite
   implementations.

5.6.  Encoding the SDP

   The process for encoding the SDP at the answerer is identical to the
   process followed by the offerer for both full and lite
   implementations, as described in Section 4.3.

5.7.  Forming the Check Lists

   Forming check lists is done only by full implementations.  Lite
   implementations MUST skip the steps defined in this section.

   There is one check list per in-use media stream resulting from the
   offer/answer exchange.  To form the check list for a media stream,
   the agent forms candidate pairs, computes a candidate pair priority,
   orders the pairs by priority, prunes them, and sets their states.
   These steps are described in this section.

5.7.1.  Forming Candidate Pairs

   First, the agent takes each of its candidates for a media stream
   (called LOCAL CANDIDATES) and pairs them with the candidates it
   received from its peer (called REMOTE CANDIDATES) for that media
   stream.  In order to prevent the attacks described in Section 18.5.2,
   agents MAY limit the number of candidates they'll accept in an offer
   or answer.  A local candidate is paired with a remote candidate if
   and only if the two candidates have the same component ID and have
   the same IP address version.  It is possible that some of the local
   candidates don't get paired with a remote candidate, and some of the
   remote candidates don't get paired with local candidates.
   This can
   happen if one agent didn't include candidates for all of the
   components for a media stream.  If this happens, the number of
   components for that media stream is effectively reduced, and
   considered to be equal to the minimum across both agents of the
   maximum component ID provided by each agent across all components for
   the media stream.

   In the case of RTP, this would happen when one agent provided
   candidates for RTCP, and the other did not.  As another example, the
   offerer can multiplex RTP and RTCP on the same port and signals that
   it can do so in the SDP through an SDP attribute
   [I-D.ietf-avt-rtp-and-rtcp-mux].  However, since the offerer doesn't
   know if the answerer can perform such multiplexing, the offerer
   includes candidates for RTP and RTCP on separate ports, so that the
   offer has two components per media stream.  If the answerer can
   perform such multiplexing, it would include just a single component
   for each candidate - for the combined RTP/RTCP mux.  ICE would end up
   acting as if there was just a single component for this candidate.

   The candidate pair whose local and remote candidates are both the
   default candidates for a particular component is called,
   unsurprisingly, the default candidate pair for that component.  This
   is the pair that would be used to transmit media if both agents had
   not been ICE aware.

   In order to aid understanding, Figure 9 shows the relationships
   between several key concepts - transport addresses, candidates,
   candidate pairs, and check lists, in addition to indicating the main
   properties of candidates and candidate pairs.

       +------------------------------------------+
       |                                          |
       | +---------------------+                  |
       | |+----+ +----+ +----+ |   +Type          |
       | || IP | |Port| |Tran| |   +Priority      |
       | ||Addr| |    | |    | |   +Foundation    |
       | |+----+ +----+ +----+ |   +ComponentID   |
       | |      Transport      |   +RelatedAddr   |
       | |        Addr         |                  |
       | +---------------------+   +Base          |
       |             Candidate                    |
       +------------------------------------------+
       *                                         *
       *    *************************************
       *    *
     +-------------------------------+
     |                               |
     | Local     Remote              |
     | +----+    +----+   +default?  |
     | |Cand|    |Cand|   +valid?    |
     | +----+    +----+   +nominated?|
     |                    +State     |
     |                               |
     |                               |
     |          Candidate Pair       |
     +-------------------------------+
     *                              *
     *                  ************
     *                  *
     +------------------+
     |  Candidate Pair  |
     +------------------+
     +------------------+
     |  Candidate Pair  |
     +------------------+
     +------------------+
     |  Candidate Pair  |
     +------------------+

            Check
            List

              Figure 9: Conceptual Diagram of a Check List

5.7.2.  Computing Pair Priority and Ordering Pairs

   Once the pairs are formed, a candidate pair priority is computed.
   Let G be the priority for the candidate provided by the controlling
   agent.  Let D be the priority for the candidate provided by the
   controlled agent.  The priority for a pair is computed as:

      pair priority = 2^32*MIN(G,D) + 2*MAX(G,D) + (G>D?1:0)

   Where G>D?1:0 is an expression whose value is 1 if G is greater than
   D, and 0 otherwise.  Once the priority is assigned, the agent sorts
   the candidate pairs in decreasing order of priority.  If two pairs
   have identical priority, the ordering amongst them is arbitrary.

5.7.3.  Pruning the Pairs

   This sorted list of candidate pairs is used to determine a sequence
   of connectivity checks that will be performed.
   Each check involves
   sending a request from a local candidate to a remote candidate.
   Since an agent cannot send requests directly from a reflexive
   candidate, but only from its base, the agent next goes through the
   sorted list of candidate pairs.  For each pair where the local
   candidate is server reflexive, the server reflexive candidate MUST be
   replaced by its base.  Once this has been done, the agent MUST prune
   the list.  This is done by removing a pair if its local and remote
   candidates are identical to the local and remote candidates of a pair
   higher up on the priority list.  The result is a sequence of ordered
   candidate pairs, called the check list for that media stream.

   In addition, in order to limit the attacks described in
   Section 18.5.2, an agent MUST limit the total number of connectivity
   checks it performs across all check lists to a specific value, and
   this value MUST be configurable.  A default of 100 is RECOMMENDED.
   This limit is enforced by discarding the lower priority candidate
   pairs until there are fewer than 100.  It is RECOMMENDED that a lower
   value be utilized when possible, set to the maximum number of
   plausible checks that might be seen in an actual deployment
   configuration.  The requirement for configuration is meant to
   provide a tool for fixing this value in the field if, once deployed,
   it is found to be problematic.

5.7.4.  Computing States

   Each candidate pair in the check list has a foundation and a state.
   The foundation is the combination of the foundations of the local and
   remote candidates in the pair.  The state is assigned once the check
   list for each media stream has been computed.  There are five
   potential values that the state can have:

   Waiting:  A check has not been performed for this pair, and can be
      performed as soon as it is the highest priority Waiting pair on
      the check list.

   In-Progress:  A check has been sent for this pair, but the
      transaction is in progress.

   Succeeded:  A check for this pair was already done and produced a
      successful result.

   Failed:  A check for this pair was already done and failed, either
      never producing any response or producing an unrecoverable failure
      response.

   Frozen:  A check for this pair hasn't been performed, and it can't
      yet be performed until some other check succeeds, allowing this
      pair to unfreeze and move into the Waiting state.

   As ICE runs, the pairs will move between states as shown in
   Figure 10.

      +-----------+
      |           |
      |           |
      |  Frozen   |
      |           |
      |           |
      +-----------+
            |
            |unfreeze
            |
            V
      +-----------+         +-----------+
      |           |         |           |
      |           | perform |           |
      |  Waiting  |-------->|In-Progress|
      |           |         |           |
      |           |         |           |
      +-----------+         +-----------+
                                  / |
                                //  |
                              //    |
                            //      |
                           /        |
                         //         |
               failure //           |success
                     //             |
                    /               |
                  //                |
                //                  |
              //                    |
             V                      V
      +-----------+         +-----------+
      |           |         |           |
      |           |         |           |
      |  Failed   |         | Succeeded |
      |           |         |           |
      |           |         |           |
      +-----------+         +-----------+

                     Figure 10: Pair State FSM

   The initial states for each pair in a check list are computed by
   performing the following sequence of steps:

   1.  The agent sets all of the pairs in each check list to the Frozen
       state.

   2.  The agent examines the check list for the first media stream (a
       media stream is the first media stream when it is described by
       the first m-line in the SDP offer and answer).  For that media
       stream:

       *  For all pairs with the same foundation, it sets the state of
          the pair with the lowest component ID to Waiting.  If there is
          more than one such pair, the one with the highest priority is
          used.
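
   The two steps above can be sketched informally in Python as follows.
   This is an illustration under stated assumptions, not normative
   text: the Pair class and the compute_initial_states function are
   hypothetical names, each pair is assumed to already carry its
   combined foundation, and check_lists[0] is assumed to be the check
   list for the first m-line.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Pair:
    """Illustrative candidate pair; only the fields used here."""
    foundation: str      # combined local and remote foundations
    component_id: int
    priority: int
    state: str = "Frozen"


def compute_initial_states(check_lists: List[List[Pair]]) -> None:
    # Step 1: every pair in every check list starts out Frozen.
    for check_list in check_lists:
        for pair in check_list:
            pair.state = "Frozen"

    if not check_lists:
        return

    # Step 2: in the first media stream's check list, pick one pair per
    # foundation: lowest component ID, ties broken by highest priority.
    chosen: Dict[str, Pair] = {}
    for pair in check_lists[0]:
        current = chosen.get(pair.foundation)
        if current is None or (
            (pair.component_id, -pair.priority)
            < (current.component_id, -current.priority)
        ):
            chosen[pair.foundation] = pair

    for pair in chosen.values():
        pair.state = "Waiting"
```

   With this sketch, every check list other than the first is left
   entirely Frozen, and the first check list ends up with exactly one
   Waiting pair per distinct foundation.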

   One of the check lists will have some number of pairs in the Waiting
   state, and the other check lists will have all of their pairs in the
   Frozen state.  A check list with at least one pair that is Waiting is
   called an active check list, and a check list with all pairs frozen
   is called a frozen check list.

   The check list itself is associated with a state, which captures the
   state of ICE checks for that media stream.  There are three states:

   Running:  In this state, ICE checks are still in progress for this
      media stream.

   Completed:  In this state, ICE checks have produced nominated pairs
      for each component of the media stream.  Consequently, ICE has
      succeeded and media can be sent.

   Failed:  In this state, the ICE checks have not completed
      successfully for this media stream.

   When a check list is first constructed as the consequence of an
   offer/answer exchange, it is placed in the Running state.

   ICE processing across all media streams also has a state associated
   with it.  This state is equal to Running while ICE processing is
   underway.  The state is Completed when ICE processing is complete and
   Failed if it did not succeed.  Rules for transitioning between
   states are described below.

5.8.  Scheduling Checks

   Checks are generated only by full implementations.  Lite
   implementations MUST skip the steps described in this section.

   An agent performs ordinary checks and triggered checks.  The
   generation of both checks is governed by a timer which fires
   periodically for each media stream.  The agent maintains a FIFO
   queue, called the triggered check queue, which contains candidate
   pairs for which checks are to be sent at the next available
   opportunity.
   When the timer fires, the agent removes the top pair
   from the triggered check queue, performs a connectivity check on that
   pair, and sets the state of the candidate pair to In-Progress.  If
   there are no pairs in the triggered check queue, an ordinary check is
   sent.

   Once the agent has computed the check lists as described in
   Section 5.7, it sets a timer for each active check list.  The timer
   fires every Ta*N seconds, where N is the number of active check lists
   (initially, there is only one active check list).  Implementations
   MAY set the timer to fire less frequently than this.  Implementations
   SHOULD take care to spread out these timers so that they do not fire
   at the same time for each media stream.  Ta and the retransmit timer
   RTO are computed as described in Section 16.  Multiplying by N allows
   this aggregate check throughput to be split between all active check
   lists.  The first timer fires immediately, so that the agent performs
   a connectivity check the moment the offer/answer exchange has been
   done, followed by the next check Ta seconds later (since there is
   only one active check list).

   When the timer fires, and there is no triggered check to be sent, the
   agent MUST choose an ordinary check as follows:

   o  Find the highest priority pair in that check list that is in the
      Waiting state.

   o  If there is such a pair:

      *  Send a STUN check from the local candidate of that pair to the
         remote candidate of that pair.  The procedures for forming the
         STUN request for this purpose are described in Section 7.1.1.

      *  Set the state of the candidate pair to In-Progress.

   o  If there is no such pair:

      *  Find the highest priority pair in that check list that is in
         the Frozen state.

      *  If there is such a pair:

         +  Unfreeze the pair.

         +  Perform a check for that pair, causing its state to
            transition to In-Progress.

      *  If there is no such pair:

         +  Terminate the timer for that check list.

   To compute the message integrity for the check, the agent uses the
   remote username fragment and password learned from the SDP from its
   peer.  The local username fragment is known directly by the agent for
   its own candidate.

6.  Receipt of the Initial Answer

   This section describes the procedures that an agent follows when it
   receives the answer from the peer.  It verifies that its peer
   supports ICE, determines its role, and for full implementations,
   forms the check list and begins performing ordinary checks.

   When ICE is used with SIP, forking may result in a single offer
   generating a multiplicity of answers.  In that case, ICE proceeds
   completely in parallel and independently for each answer, treating
   the combination of its offer and each answer as an independent
   offer/answer exchange, with its own set of pairs, check lists,
   states, and so on.  The only case in which processing of one pair
   impacts another is freeing of candidates, discussed below in
   Section 8.3.

6.1.  Verifying ICE Support

   The logic at the offerer is identical to that of the answerer as
   described in Section 5.1, with the exception that an offerer would
   not ever generate a=ice-mismatch attributes in an SDP.

   In some cases, the answer may omit a=candidate attributes for the
   media streams, and instead include an a=ice-mismatch attribute for
   one or more of the media streams in the SDP.  This signals to the
   offerer that the answerer supports ICE, but that ICE processing was
   not used for the session because a signaling intermediary modified
   the default destination for media components without modifying the
   corresponding candidate attributes.  See Section 18 for a discussion
   of cases where this can happen.
   This specification provides no
   guidance on how an agent should proceed in such a failure case.

6.2.  Determining Role

   The offerer follows the same procedures described for the answerer in
   Section 5.2.

6.3.  Forming the Check List

   Formation of check lists is performed only by full implementations.
   The offerer follows the same procedures described for the answerer in
   Section 5.7.

6.4.  Performing Ordinary Checks

   Ordinary checks are performed only by full implementations.  The
   offerer follows the same procedures described for the answerer in
   Section 5.8.

7.  Performing Connectivity Checks

   This section describes how connectivity checks are performed.  All
   ICE implementations are required to be compliant with
   [I-D.ietf-behave-rfc3489bis], as opposed to the older [RFC3489].
   However, whereas a full implementation will both generate checks
   (acting as a STUN client) and receive them (acting as a STUN server),
   a lite implementation will only ever receive checks, and thus will
   only act as a STUN server.

7.1.  STUN Client Procedures

   These procedures define how an agent sends a connectivity check,
   whether it is an ordinary or a triggered check.  These procedures are
   only applicable to full implementations.

7.1.1.  Sending the Request

   The check is generated by sending a Binding Request from a local
   candidate to a remote candidate.  [I-D.ietf-behave-rfc3489bis]
   describes how Binding Requests are constructed and generated.  A
   connectivity check MUST utilize the STUN short term credential
   mechanism.  Support for backwards compatibility with RFC 3489 MUST
   NOT be used or assumed with connectivity checks.  The FINGERPRINT
   mechanism MUST be used for connectivity checks.

   ICE extends STUN by defining several new attributes, including
   PRIORITY, USE-CANDIDATE, ICE-CONTROLLED, and ICE-CONTROLLING.
   These
   new attributes are formally defined in Section 19.1, and their usage
   is described in the subsections below.  These STUN extensions are
   applicable only to connectivity checks used for ICE.

7.1.1.1.  PRIORITY and USE-CANDIDATE

   An agent MUST include the PRIORITY attribute in its Binding Request.
   The attribute MUST be set equal to the priority that would be
   assigned, based on the algorithm in Section 4.1.2, to a peer
   reflexive candidate, should one be learned as a consequence of this
   check (see Section 7.1.2.2.1 for how peer reflexive candidates are
   learned).  This priority value will be computed identically to how
   the priority for the local candidate of the pair was computed, except
   that the type preference is set to the value for peer reflexive
   candidate types.

   The controlling agent MAY include the USE-CANDIDATE attribute in the
   Binding Request.  The controlled agent MUST NOT include it in its
   Binding Request.  This attribute signals that the controlling agent
   wishes to cease checks for this component, and use the candidate pair
   resulting from the check for this component.  Section 8.1.1 provides
   guidance on determining when to include it.

7.1.1.2.  ICE-CONTROLLED and ICE-CONTROLLING

   The agent MUST include the ICE-CONTROLLED attribute in the request if
   it is in the controlled role, and MUST include the ICE-CONTROLLING
   attribute in the request if it is in the controlling role.  The
   content of either attribute MUST be the tie breaker that was
   determined in Section 5.2.  These attributes are defined fully in
   Section 19.1.

7.1.1.3.  Forming Credentials

   A Binding Request serving as a connectivity check MUST utilize the
   STUN short term credential mechanism.
   The username for the
   credential is formed by concatenating the username fragment provided
   by the peer with the username fragment of the agent sending the
   request, separated by a colon (":").  The password is equal to the
   password provided by the peer.  For example, consider the case where
   agent L is the offerer, and agent R is the answerer.  Agent L
   included a username fragment of LFRAG for its candidates, and a
   password of LPASS.  Agent R provided a username fragment of RFRAG and
   a password of RPASS.  A connectivity check from L to R (and its
   response of course) utilizes the username RFRAG:LFRAG and a password
   of RPASS.  A connectivity check from R to L (and its response)
   utilizes the username LFRAG:RFRAG and a password of LPASS.

7.1.1.4.  DiffServ Treatment

   If the agent is using Diffserv Codepoint markings [RFC2475] in its
   media packets, it SHOULD apply those same markings to its
   connectivity checks.

7.1.2.  Processing the Response

   When a Binding Response is received, it is correlated to its Binding
   Request using the transaction ID, as defined in
   [I-D.ietf-behave-rfc3489bis], which then ties it to the candidate
   pair for which the Binding Request was sent.  This section defines
   additional procedures for processing Binding Responses, specific to
   this usage of STUN.

7.1.2.1.  Failure Cases

   If the STUN transaction generates a 487 (Role Conflict) error
   response, the agent checks whether it had included the ICE-CONTROLLED
   or ICE-CONTROLLING attribute in the Binding Request.  If the request
   had contained the ICE-CONTROLLED attribute, the agent MUST switch to
   the controlling role if it has not already done so.  If the request
   had contained the ICE-CONTROLLING attribute, the agent MUST switch to
   the controlled role if it has not already done so.
   Once it has switched, the agent MUST enqueue the candidate pair whose
   check generated the 487 into the triggered check queue. The state of
   that pair is set to Waiting. When the triggered check is sent, it
   will contain an ICE-CONTROLLING or ICE-CONTROLLED attribute
   reflecting its new role. Note, however, that the tie-breaker value
   MUST NOT be reselected.

   Agents MAY support receipt of ICMP errors for connectivity checks.
   If the STUN transaction generates an ICMP error, the agent sets the
   state of the pair to Failed. If the STUN transaction generates a
   STUN error response that is unrecoverable (as defined in
   [I-D.ietf-behave-rfc3489bis]), or times out, the agent sets the state
   of the pair to Failed.

   The agent MUST check that the source IP address and port of the
   response equal the destination IP address and port that the Binding
   Request was sent to, and that the destination IP address and port of
   the response match the source IP address and port that the Binding
   Request was sent from. In other words, the source and destination
   transport addresses in the request and response are symmetric.
   If they are not symmetric, the agent sets the state of the pair to
   Failed.

7.1.2.2.  Success Cases

   A check is considered to be a success if all of the following are
   true:

   o  the STUN transaction generated a success response

   o  the source IP address and port of the response equal the
      destination IP address and port that the Binding Request was sent
      to

   o  the destination IP address and port of the response match the
      source IP address and port that the Binding Request was sent from

7.1.2.2.1.  Discovering Peer Reflexive Candidates

   The agent checks the mapped address from the STUN response.
   If the transport address does not match any of the local candidates
   that the agent knows about, the mapped address represents a new
   candidate - a peer reflexive candidate. Like other candidates, it
   has a type, base, priority and foundation. They are computed as
   follows:

   o  Its type is equal to peer reflexive.

   o  Its base is set equal to the local candidate of the candidate pair
      from which the STUN check was sent.

   o  Its priority is set equal to the value of the PRIORITY attribute
      in the Binding Request.

   o  Its foundation is selected as described in Section 4.1.1.

   This peer reflexive candidate is then added to the list of local
   candidates for the media stream. Its username fragment and password
   are the same as all other local candidates for that media stream.
   However, the peer reflexive candidate is not paired with other remote
   candidates. This is not necessary; a valid pair will be generated
   from it momentarily based on the procedures in Section 7.1.2.2.2. If
   an agent wishes to pair the peer reflexive candidate with other
   remote candidates besides the one in the valid pair that will be
   generated, the agent MAY generate an updated offer which includes the
   peer reflexive candidate. This will cause it to be paired with all
   other remote candidates.

7.1.2.2.2.  Constructing a Valid Pair

   The agent constructs a candidate pair whose local candidate equals
   the mapped address of the response, and whose remote candidate equals
   the destination address to which the request was sent. This is
   called a valid pair, since it has been validated by a STUN
   connectivity check. The valid pair may equal the pair that generated
   the check, may equal a different pair in the check list, or may be a
   pair not currently on any check list.
   If the pair equals the pair that generated the check or is on a check
   list currently, it is also added to the VALID LIST, which is
   maintained by the agent for each media stream. This list is empty at
   the start of ICE processing, and fills as checks are performed,
   resulting in valid candidate pairs.

   It will be very common that the pair will not be on any check list.
   Recall that the check list has pairs whose local candidates are never
   server reflexive; those pairs had their local candidates converted to
   the base of the server reflexive candidates, and then pruned if they
   were redundant. When the response to the STUN check arrives, the
   mapped address will be reflexive if there is a NAT between the two.
   In that case, the valid pair will have a local candidate that doesn't
   match any of the pairs in the check list.

   If the pair is not on any check list, the agent computes the priority
   for the pair based on the priority of each candidate, using the
   algorithm in Section 5.7. The priority of the local candidate
   depends on its type. If it is not peer reflexive, it is equal to the
   priority signaled for that candidate in the SDP. If it is peer
   reflexive, it is equal to the PRIORITY attribute the agent placed in
   the Binding Request which just completed. The priority of the remote
   candidate is taken from the SDP of the peer. If the candidate does
   not appear there, then the check must have been a triggered check to
   a new remote candidate. In that case, the priority is taken as the
   value of the PRIORITY attribute in the Binding Request which
   triggered the check that just completed. The pair is then added to
   the VALID LIST.

7.1.2.2.3.  Updating Pair States

   The agent sets the state of the pair that generated the check to
   Succeeded. The success of this check might also cause the state of
   other checks to change as well.
   The agent MUST perform the following two steps:

   1.  The agent changes the states for all other Frozen pairs for the
       same media stream and same foundation to Waiting. Typically
       these other pairs will have different component IDs but not
       always.

   2.  If there is a pair in the valid list for every component of this
       media stream (where this is the actual number of components being
       used, in cases where the number of components signaled in the SDP
       differs from offerer to answerer), the success of this check may
       unfreeze checks for other media streams. Note that this step is
       followed not just the first time the valid list under
       consideration has a pair for every component, but every
       subsequent time a check succeeds and adds yet another pair to
       that valid list. The agent examines the check list for each
       other media stream in turn:

       *  If the check list is active, the agent changes the state of
          all Frozen pairs in that check list whose foundation matches a
          pair in the valid list under consideration, to Waiting.

       *  If the check list is frozen, and there is at least one pair in
          the check list whose foundation matches a pair in the valid
          list under consideration, the state of all pairs in the check
          list whose foundation matches a pair in the valid list under
          consideration is set to Waiting. This will cause the check
          list to become active, and ordinary checks will begin for it,
          as described in Section 5.8.

       *  If the check list is frozen, and there are no pairs in the
          check list whose foundation matches a pair in the valid list
          under consideration, the agent

          +  Groups together all of the pairs with the same foundation,

          +  For each group, sets the state of the pair with the lowest
             component ID to Waiting. If there is more than one such
             pair, the one with the highest priority is used.

7.1.2.2.4.  Updating the Nominated Flag

   If the agent was a controlling agent, and it had included a
   USE-CANDIDATE attribute in the Binding Request, the valid pair
   generated from that check has its nominated flag set to true. This
   flag indicates that this valid pair should be used for media if it is
   the highest priority one amongst those whose nominated flag is set.
   This may conclude ICE processing for this media stream or all media
   streams; see Section 8.

   If the agent is the controlled agent, the response may be the result
   of a triggered check which was sent in response to a request which
   itself had the USE-CANDIDATE attribute. This case is described in
   Section 7.2.1.5, and may now result in setting the nominated flag for
   the pair learned from the original request.

7.1.2.3.  Check List and Timer State Updates

   Regardless of whether the check was successful or failed, the
   completion of the transaction may require updating of check list and
   timer states.

   If all of the pairs in the check list are now either in the Failed or
   Succeeded state:

   o  If there is not a pair in the valid list for each component of the
      media stream, the state of the check list is set to Failed.

   o  For each frozen check list, the agent:

      *  Groups together all of the pairs with the same foundation,

      *  For each group, sets the state of the pair with the lowest
         component ID to Waiting. If there is more than one such pair,
         the one with the highest priority is used.

   If none of the pairs in the check list are in the Waiting or Frozen
   state, the check list is no longer considered active, and will not
   count towards the value of N in the computation of timers for
   ordinary checks as described in Section 5.8.

7.2.  STUN Server Procedures

   An agent MUST be prepared to receive a Binding Request on the base of
   each candidate it included in its most recent offer or answer. This
   requirement holds even if the peer is a lite implementation.

   The agent MUST use a short term credential to authenticate the
   request and perform a message integrity check. The agent MUST
   consider the username to be valid if it consists of two values
   separated by a colon, where the first value is equal to the username
   fragment generated by the agent in an offer or answer for a session
   in-progress. It is possible (and in fact very likely) that an
   offerer will receive a Binding Request prior to receiving the answer
   from its peer. If this happens, the agent MUST immediately generate
   a response (including computation of the mapped address as described
   in Section 7.2.1.2). The agent has sufficient information at this
   point to generate the response; the password from the peer is not
   required. Once the answer is received, it MUST proceed with the
   remaining steps required, namely Section 7.2.1.3, Section 7.2.1.4,
   and Section 7.2.1.5 for full implementations. In cases where
   multiple STUN requests are received before the answer, this may cause
   several pairs to be queued up in the triggered check queue.

   An agent MUST NOT utilize the ALTERNATE-SERVER mechanism, and MUST
   NOT support the backwards compatibility mechanisms to RFC 3489. It
   MUST utilize the FINGERPRINT mechanism.

   If the agent is using Diffserv Codepoint markings [RFC2475] in its
   media packets, it SHOULD apply those same markings to its responses
   to Binding Requests. The same would apply to any layer 2 markings
   the endpoint might be applying to media packets.

7.2.1.  Additional Procedures for Full Implementations

   This subsection defines the additional server procedures applicable
   to full implementations.

7.2.1.1.  Detecting and Repairing Role Conflicts

   Normally, the rules for selection of a role in Section 5.2 will
   result in each agent selecting a different role - one controlling,
   and one controlled. However, in unusual call flows, typically
   utilizing third party call control, it is possible for both agents to
   select the same role. This section describes procedures for checking
   for this case and repairing it.

   An agent MUST examine the Binding Request for either the
   ICE-CONTROLLING or ICE-CONTROLLED attribute. It MUST follow these
   procedures:

   o  If neither ICE-CONTROLLING nor ICE-CONTROLLED is present in the
      request, the peer agent may have implemented a previous version of
      this specification. There may be a conflict, but it cannot be
      detected.

   o  If the agent is in the controlling role, and the ICE-CONTROLLING
      attribute is present in the request:

      *  If the agent's tie-breaker is larger than or equal to the
         contents of the ICE-CONTROLLING attribute, the agent generates
         a Binding Error Response and includes an ERROR-CODE attribute
         with a value of 487 (Role Conflict) but retains its role.

      *  If the agent's tie-breaker is less than the contents of the
         ICE-CONTROLLING attribute, the agent switches to the controlled
         role.

   o  If the agent is in the controlled role, and the ICE-CONTROLLED
      attribute is present in the request:

      *  If the agent's tie-breaker is larger than or equal to the
         contents of the ICE-CONTROLLED attribute, the agent switches to
         the controlling role.

      *  If the agent's tie-breaker is less than the contents of the
         ICE-CONTROLLED attribute, the agent generates a Binding Error
         Response and includes an ERROR-CODE attribute with a value of
         487 (Role Conflict) but retains its role.
   o  If the agent is in the controlled role and the ICE-CONTROLLING
      attribute is present in the request, or the agent is in the
      controlling role and the ICE-CONTROLLED attribute is present in
      the request, there is no conflict.

   A change in roles will require an agent to recompute pair priorities
   (Section 5.7.2), since those priorities are a function of controlling
   and controlled role. The change in role will also impact whether the
   agent is responsible for selecting nominated pairs and generating
   updated offers upon conclusion of ICE.

   The remaining sections in Section 7.2.1 are followed if the server
   generated a successful response to the Binding Request, even if the
   agent changed roles.

7.2.1.2.  Computing Mapped Address

   For requests being received on a relayed candidate, the source
   transport address used for STUN processing (namely, generation of the
   XOR-MAPPED-ADDRESS attribute) is the transport address as seen by the
   TURN server. That source transport address will be present in the
   REMOTE-ADDRESS attribute of a Data Indication message, if the Binding
   Request was delivered through a Data Indication (a TURN server
   delivers packets encapsulated in a Data Indication when no active
   destination is set). If the Binding Request was not encapsulated in
   a Data Indication, that source address is equal to the current active
   destination for the TURN session.

7.2.1.3.  Learning Peer Reflexive Candidates

   If the source transport address of the request does not match any
   existing remote candidates, it represents a new peer reflexive remote
   candidate. This candidate is constructed as follows:

   o  The priority of the candidate is set to the PRIORITY attribute
      from the request.

   o  The type of the candidate is set to peer reflexive.
   o  The foundation of the candidate is set to an arbitrary value,
      different from the foundation for all other remote candidates. If
      any subsequent offer/answer exchanges contain this peer reflexive
      candidate in the SDP, it will signal the actual foundation for the
      candidate.

   o  The component ID of this candidate is set to the component ID for
      the local candidate to which the request was sent.

   This candidate is added to the list of remote candidates. However,
   the agent does not pair this candidate with any local candidates.

7.2.1.4.  Triggered Checks

   Next, the agent constructs a pair whose local candidate is equal to
   the transport address on which the STUN request was received, and
   whose remote candidate is equal to the source transport address where
   the request came from (which may be the peer reflexive remote
   candidate that was just learned). Since both candidates are known to
   the agent, it can obtain their priorities and compute the candidate
   pair priority. This pair is then looked up in the check list. There
   can be one of several outcomes:

   o  If the pair is already on the check list:

      *  If the state of that pair is Waiting or Frozen, a check for
         that pair is enqueued into the triggered check queue if not
         already present.

      *  If the state of that pair is In-Progress, the agent cancels the
         in-progress transaction. Cancellation means that the agent
         will not retransmit the request, will not treat the lack of
         response to be a failure, but will wait the duration of the
         transaction timeout for a response. In addition, the agent
         MUST create a new connectivity check for that pair
         (representing a new STUN Binding Request transaction) by
         enqueueing the pair in the triggered check queue. The state of
         the pair is then changed to Waiting.
      *  If the state of the pair is Failed, it is changed to Waiting
         and the agent MUST create a new connectivity check for that
         pair (representing a new STUN Binding Request transaction), by
         enqueueing the pair in the triggered check queue.

      *  If the state of that pair is Succeeded, nothing further is
         done.

   o  These steps are done to facilitate rapid completion of ICE when
      both agents are behind NAT.

   o  If the pair is not already on the check list:

      *  The pair is inserted into the check list based on its priority.

      *  Its state is set to Waiting.

      *  The pair is enqueued into the triggered check queue.

   When a triggered check is to be sent, it is constructed and processed
   as described in Section 7.1.1. These procedures require the agent to
   know the transport address, username fragment and password for the
   peer. The username fragment for the remote candidate is equal to the
   part after the colon of the USERNAME in the Binding Request that was
   just received. Using that username fragment, the agent can check the
   SDP messages received from its peer (there may be more than one in
   cases of forking), and find this username fragment. The
   corresponding password is then selected.

7.2.1.5.  Updating the Nominated Flag

   If the Binding Request received by the agent had the USE-CANDIDATE
   attribute set, and the agent is in the controlled role, the agent
   looks at the state of the pair computed in Section 7.2.1.4:

   o  If the state of this pair is Succeeded, it means that the check
      generated by this pair produced a successful response. This would
      have caused the agent to construct a valid pair when that success
      response was received (see Section 7.1.2.2.2). The agent now sets
      the nominated flag in the valid pair to true. This may end ICE
      processing for this media stream; see Section 8.
   o  If the state of this pair is In-Progress and its check later
      produces a successful result, the resulting valid pair has its
      nominated flag set when the response arrives. This may end ICE
      processing for this media stream when it arrives; see Section 8.

7.2.2.  Additional Procedures for Lite Implementations

   If the check that was just received contained a USE-CANDIDATE
   attribute, the agent constructs a candidate pair whose local
   candidate is equal to the transport address on which the request was
   received, and whose remote candidate is equal to the source transport
   address of the request that was received. This candidate pair is
   assigned an arbitrary priority, and placed into a list of valid
   candidates called the valid list. The agent sets the nominated flag
   for that pair to true. ICE processing is considered complete for a
   media stream if the valid list contains a candidate pair for each
   component.

8.  Concluding ICE Processing

   This section describes how an agent completes ICE.

8.1.  Procedures for Full Implementations

   Concluding ICE involves nominating pairs by the controlling agent and
   updating of state machinery.

8.1.1.  Nominating Pairs

   The controlling agent nominates pairs to be selected by ICE by using
   one of two techniques: regular nomination or aggressive nomination.
   If its peer has a lite implementation, an agent MUST use a regular
   nomination algorithm. If its peer is using ICE options (present in
   an ice-options attribute from the peer) that the agent does not
   understand, the agent MUST use a regular nomination algorithm. If
   its peer is a full implementation and isn't using any ICE options or
   is using ICE options understood by the agent, the agent MAY use
   either the aggressive or the regular nomination algorithm. However,
   the regular algorithm is RECOMMENDED since it provides greater
   stability.

8.1.1.1.  Regular Nomination

   With regular nomination, the agent lets some number of checks
   complete, each of which omits the USE-CANDIDATE attribute. Once one
   or more checks complete successfully for a component of a media
   stream, valid pairs are generated and added to the valid list. The
   agent lets the checks continue until some stopping criterion is met,
   and then picks amongst the valid pairs based on an evaluation
   criterion. The criteria for stopping the checks and for evaluating
   the valid pairs are entirely a matter of local optimization.

   When the controlling agent selects the valid pair, it repeats the
   check that produced this valid pair (by enqueuing the pair that
   generated the check into the triggered check queue), this time with
   the USE-CANDIDATE attribute. This check should succeed (since the
   previous did), causing the nominated flag of that and only that pair
   to be set. Consequently, there will be only a single nominated pair
   in the valid list for each component, and when the state of the check
   list moves to Completed, that exact pair is selected by ICE for
   sending and receiving media for that component.

   Regular nomination provides the most flexibility, since the agent has
   control over the stopping and selection criteria for checks. The
   only requirement is that the agent MUST eventually pick one and only
   one candidate pair and generate a check for that pair with the
   USE-CANDIDATE attribute present. Regular nomination also improves
   ICE's resilience to variations in implementation (see Section 14).
   Regular nomination is also more stable, allowing both agents to
   converge on a single pair for media without any transient selections,
   which can happen with the aggressive algorithm. The drawback of
   regular nomination is that it is guaranteed to increase latencies
   because it requires an additional check to be done.

8.1.1.2.  Aggressive Nomination

   With aggressive nomination, the controlling agent includes the
   USE-CANDIDATE attribute in every check it sends. Once the first
   check for a component succeeds, the resulting valid pair is added to
   the valid list and has its nominated flag set. When all components
   have a nominated pair in the valid list, ICE processing ceases for
   this check list. However, because the agent included the
   USE-CANDIDATE attribute in all of its checks, another check may yet
   complete, causing another valid pair to have its nominated flag set.
   ICE always selects the highest priority nominated candidate pair from
   the valid list as the one used for media. Consequently, the selected
   pair may actually change briefly as ICE checks complete, resulting in
   a set of transient selections until it stabilizes.

8.1.2.  Updating States

   For both controlling and controlled agents, the state of ICE
   processing depends on the presence of nominated candidate pairs in
   the valid list and on the state of the check list. Note that, at any
   time, more than one of the following cases can apply:

   o  If there are no nominated pairs in the valid list for a media
      stream and the state of the check list is Running, ICE processing
      continues.
   o  If there is at least one nominated pair in the valid list for a
      media stream and the state of the check list is Running:

      *  The agent MUST remove all Waiting and Frozen pairs in the check
         list and triggered check queue for the same component as the
         nominated pairs for that media stream.

      *  If an In-Progress pair in the check list is for the same
         component as a nominated pair, the agent SHOULD cease
         retransmissions for its check if its pair priority is lower
         than the lowest priority nominated pair for that component.

   o  Once there is at least one nominated pair in the valid list for
      every component of at least one media stream and the state of the
      check list is Running:

      *  The agent MUST change the state of processing for its check
         list for that media stream to Completed.

      *  The agent MUST continue to respond to any checks it may still
         receive for that media stream, and MUST perform triggered
         checks if required by the processing of Section 7.2.

      *  The agent MAY begin transmitting media for this media stream as
         described in Section 11.1.

   o  Once the state of each check list is Completed:

      *  The agent sets the state of ICE processing overall to
         Completed.

      *  If an agent is controlling, it examines the highest priority
         nominated candidate pair for each component of each media
         stream. If any of those candidate pairs differ from the
         default candidate pairs in the most recent offer/answer
         exchange, the controlling agent MUST generate an updated offer
         as described in Section 9. If the controlling agent is using
         an aggressive nomination algorithm, this may result in several
         updated offers as the pairs selected for media change. An
         agent MAY delay sending the offer for a brief interval (one
         second is RECOMMENDED) in order to allow the selected pairs to
         stabilize.
   o  If the state of the check list is Failed, ICE has not been able to
      complete for this media stream. The correct behavior depends on
      the state of the check lists for other media streams:

      *  If all check lists are Failed, ICE processing overall is
         considered to be in the Failed state, and the agent SHOULD
         consider the session a failure, SHOULD NOT restart ICE, and the
         controlling agent SHOULD terminate the entire session.

      *  If at least one of the check lists for other media streams is
         Completed, the controlling agent SHOULD remove the failed media
         stream from the session in its updated offer.

      *  If none of the check lists for other media streams are
         Completed, but at least one is Running, the agent SHOULD let
         ICE continue.

8.2.  Procedures for Lite Implementations

   Concluding ICE for a lite implementation is relatively
   straightforward. There are two cases to consider:

      The implementation is lite, and its peer is full.

      The implementation is lite, and its peer is lite.

   The effect of ICE concluding is that the agent can free any allocated
   host candidates that were not utilized by ICE, as described in
   Section 8.3.

8.2.1.  Peer is Full

   In this case, the agent will receive connectivity checks from its
   peer. When an agent has received a connectivity check that includes
   the USE-CANDIDATE attribute for each component of a media stream, the
   state of ICE processing for that media stream moves from Running to
   Completed. When the state of ICE processing for all media streams is
   Completed, the state of ICE processing overall is Completed.

   The lite implementation will never itself determine that ICE
   processing has failed for a media stream; rather, the full peer will
   make that determination and then remove or restart the failed media
   stream in a subsequent offer.

8.2.2.  Peer is Lite

   Once the offer/answer exchange has completed, each agent examines its
   own candidates and those of its peer. For each media stream, each
   agent pairs up its own candidates with the candidates of its peer for
   that media stream. Two candidates are paired up when they are for
   the same component, utilize the same transport protocol (UDP in this
   specification), and are from the same IP address family (IPv4 or
   IPv6).

   o  If there is a single pair per component, that pair is added to the
      valid list. If all of the components for a media stream had one
      pair, the state of ICE processing for that media stream is set to
      Completed. If all media streams are Completed, the state of ICE
      processing is set to Completed overall. This will always be the
      case for implementations that are IPv4 only.

   o  If there is more than one pair per component:

      *  The agent MUST select a pair based on local policy. Since this
         case only arises for IPv6, it is RECOMMENDED that an agent
         follow the procedures of RFC 3484 [RFC3484] to select a single
         pair.

      *  The agent adds the selected pair for each component to the
         valid list. As described in Section 11.1, this will permit
         media to begin flowing. However, it is possible (and in fact
         likely) that both agents have chosen different pairs.

      *  To reconcile this, the controlling agent MUST send an updated
         offer as described in Section 9.1.3, which will include the
         remote-candidates attribute.

      *  The agent MUST NOT update the state of ICE processing when the
         offer is sent. If this subsequent offer completes, the
         controlling agent MUST change the state of ICE processing to
         Completed for all media streams, and the state of ICE
         processing overall to Completed. The states for the controlled
         agent are set based on the logic in Section 9.2.3.

8.3.  Freeing Candidates

8.3.1.  Full Implementation Procedures

   The procedures in Section 8 require that an agent continue to listen
   for STUN requests and continue to generate triggered checks for a
   media stream, even once processing for that stream completes. The
   rules in this section describe when it is safe for an agent to cease
   sending or receiving checks on a candidate that was not selected by
   ICE, and then free the candidate.

   When ICE is used with SIP, and an offer is forked to multiple
   recipients, ICE proceeds in parallel and independently with each
   answerer, all using the same local candidates. Once ICE processing
   has reached the Completed state for all peers for media streams using
   those candidates, the agent SHOULD wait an additional three seconds,
   and then it MAY cease responding to checks or generating triggered
   checks on that candidate. It MAY free the candidate at that time.
   Freeing of server reflexive candidates is never explicit; it happens
   by lack of a keepalive. The three second delay handles cases when
   aggressive nomination is used, and the selected pairs can quickly
   change after ICE has completed.

8.3.2.  Lite Implementations

   A lite implementation MAY free candidates not selected by ICE as soon
   as ICE processing has reached the Completed state for all peers for
   all media streams using those candidates.

9.  Subsequent Offer/Answer Exchanges

   Either agent MAY generate a subsequent offer at any time allowed by
   RFC 3264 [RFC3264]. The rules in Section 8 will cause the
   controlling agent to send an updated offer at the conclusion of ICE
   processing when ICE has selected different candidate pairs from the
   default pairs. This section defines rules for construction of
   subsequent offers and answers.

   Should a subsequent offer be rejected, ICE processing continues as if
   the subsequent offer had never been made.

9.1.  Generating the Offer

9.1.1.  Procedures for All Implementations

9.1.1.1.  ICE Restarts

   An agent MAY restart ICE processing for an existing media stream. An
   ICE restart, as the name implies, will cause all previous state of
   ICE processing to be flushed and checks to start anew. The only
   difference between an ICE restart and a brand new media session is
   that, during the restart, media can continue to be sent to the
   previously validated pair.

   An agent MUST restart ICE for a media stream if:

   o  The offer is being generated for the purposes of changing the
      target of the media stream. In other words, the agent wants to
      generate an updated offer which, had ICE not been in use, would
      result in a new value for the destination of a media component.

   o  An agent is changing its implementation level. This typically
      only happens in third party call control use cases, where the
      entity performing the signaling is not the entity receiving the
      media, and it has changed the target of media mid-session to
      another entity that has a different ICE implementation.

   These rules imply that setting the IP address in the c line to
   0.0.0.0 will cause an ICE restart. Consequently, ICE implementations
   MUST NOT utilize this mechanism for call hold, and instead MUST use
   a=inactive and a=sendonly as described in [RFC3264].

   To restart ICE, an agent MUST change both the ice-pwd and the
   ice-ufrag for the media stream in an offer. Note that it is
   permissible to use a session-level attribute in one offer, but to
   provide the same ice-pwd or ice-ufrag as a media-level attribute in a
   subsequent offer. This is not a change in password, just a change in
   its representation, and does not cause an ICE restart.

   An agent sets the rest of the fields in the SDP for this media stream
   as it would in an initial offer of this media stream (see
   Section 4.3).
Consequently, the set of candidates MAY include some, none, or all of the previous candidates for that stream and MAY include a totally new set of candidates gathered as described in Section 4.1.1.

9.1.1.2. Removing a Media Stream

If an agent removes a media stream by setting its port to zero, it MUST NOT include any candidate attributes for that media stream and SHOULD NOT include any other ICE-related attributes defined in Section 15 for that media stream.

9.1.1.3. Adding a Media Stream

If an agent wishes to add a new media stream, it sets the fields in the SDP for this media stream as if this was an initial offer for that media stream (see Section 4.3). This will cause ICE processing to begin for this media stream.

9.1.2. Procedures for Full Implementations

This section describes additional procedures for full implementations, covering existing media streams.

The username fragments, password, and implementation level MUST remain the same as used previously. If an agent needs to change one of these, it MUST restart ICE for that media stream.

Additional behavior depends on the state of ICE processing for that media stream.

9.1.2.1. Existing Media Streams with ICE Running

If an agent generates an updated offer including a media stream that was previously established, and for which ICE checks are in the Running state, the agent follows the procedures defined here.

An agent MUST include candidate attributes for all local candidates it had signaled previously for that media stream. The properties of each such candidate as signaled in SDP (the priority, foundation, type and related transport address) SHOULD remain the same. The IP address, port and transport protocol, which fundamentally identify that candidate, MUST remain the same (if they change, it would be a new candidate). The component ID MUST remain the same.
The agent MAY include additional candidates it did not offer previously, but which it has gathered since the last offer/answer exchange, including peer reflexive candidates.

The agent MAY change the default destination for media. As with initial offers, there MUST be a set of candidate attributes in the offer matching this default destination.

9.1.2.2. Existing Media Streams with ICE Completed

If an agent generates an updated offer including a media stream that was previously established, and for which ICE checks are in the Completed state, the agent follows the procedures defined here.

The default destination for media (i.e., the values of the IP addresses and ports in the m and c line used for that media stream) MUST be the local candidate from the highest priority nominated pair in the valid list for each component. This "fixes" the default destination for media to equal the destination ICE has selected for media.

The agent MUST include candidate attributes for candidates matching the default destination for each component of the media stream, and MUST NOT include any other candidates.

In addition, if the agent is controlling, it MUST include the a=remote-candidates attribute for each media stream whose check list is in the Completed state. The attribute contains the remote candidates from the highest priority nominated pair in the valid list for each component of that media stream. It is needed to avoid a race condition whereby the controlling agent chooses its pairs, but the updated offer beats the connectivity checks to the controlled agent, which doesn't even know these pairs are valid, let alone selected. See Appendix B.6 for elaboration on this race condition.

9.1.3. Procedures for Lite Implementations

9.1.3.1. Existing Media Streams with ICE Running

This section describes procedures for lite implementations for existing streams for which ICE is running.

A lite implementation MUST include all of its candidates for each component of each media stream in an a=candidate attribute in any subsequent offer. These candidates are formed identically to the procedures for initial offers, as described in Section 4.2.

A lite implementation MUST NOT add additional host candidates in a subsequent offer. If an agent needs to offer additional candidates, it MUST restart ICE.

The username fragments, password, and implementation level MUST remain the same as used previously. If an agent needs to change one of these, it MUST restart ICE for that media stream.

9.1.3.2. Existing Media Streams with ICE Completed

If ICE has completed for a media stream, the default destination for that media stream MUST be set to the remote candidate of the candidate pair for that component in the valid list. For a lite implementation, there is always just a single candidate pair in the valid list for each component of a media stream. Additionally, the agent MUST include a candidate attribute for each default destination.

Additionally, if the agent is controlling (which only happens when both agents are lite), the agent MUST include the a=remote-candidates attribute for each media stream. The attribute contains the remote candidates from the candidate pairs in the valid list (one pair for each component of each media stream).

9.2. Receiving the Offer and Generating an Answer

9.2.1. Procedures for All Implementations

When receiving a subsequent offer within an existing session, an agent MUST re-apply the verification procedures in Section 5.1 without regard to the results of verification from any previous offer/answer exchanges.
Indeed, it is possible that a previous offer/answer exchange resulted in ICE not being used, but it is used as a consequence of a subsequent exchange.

9.2.1.1. Detecting ICE Restart

If the offer contained a change in the a=ice-ufrag or a=ice-pwd attributes compared to the previous SDP from the peer, it indicates that ICE is restarting for this media stream. If all media streams are restarting, then ICE is restarting overall.

If ICE is restarting for a media stream:

o The agent MUST change the a=ice-ufrag and a=ice-pwd attributes in the answer.

o The agent MAY change its implementation level in the answer.

An agent sets the rest of the fields in the SDP for this media stream as it would in an initial answer to this media stream (see Section 4.3). Consequently, the set of candidates MAY include some, none, or all of the previous candidates for that stream and MAY include a totally new set of candidates gathered as described in Section 4.1.1.

9.2.1.2. New Media Stream

If the offer contains a new media stream, the agent sets the fields in the answer as if it had received an initial offer containing that media stream (see Section 4.3). This will cause ICE processing to begin for this media stream.

9.2.1.3. Removed Media Stream

If an offer contains a media stream whose port is zero, the agent MUST NOT include any candidate attributes for that media stream in its answer and SHOULD NOT include any other ICE-related attributes defined in Section 15 for that media stream.

9.2.2. Procedures for Full Implementations

Unless the agent has detected an ICE restart from the offer, the username fragments, password, and implementation level MUST remain the same as used previously.
If an agent needs to change one of these, it MUST restart ICE for that media stream by generating an offer; ICE cannot be restarted in an answer.

Additional behaviors depend on the state of ICE processing for that media stream.

9.2.2.1. Existing Media Streams with ICE Running and no remote-candidates

If ICE is running for a media stream, and the offer for that media stream lacked the remote-candidates attribute, the rules for construction of the answer are identical to those for the offerer as described in Section 9.1.2.1.

9.2.2.2. Existing Media Streams with ICE Completed and no remote-candidates

If ICE is Completed for a media stream, and the offer for that media stream lacked the remote-candidates attribute, the rules for construction of the answer are identical to those for the offerer as described in Section 9.1.2.2, except that the answerer MUST NOT include the a=remote-candidates attribute in the answer.

9.2.2.3. Existing Media Streams and remote-candidates

A controlled agent will receive an offer with the a=remote-candidates attribute for a media stream when its peer has concluded ICE processing for that media stream. This attribute is present in the offer to deal with a race condition between the receipt of the offer, and the receipt of the Binding Response which tells the answerer the candidate which will be selected by ICE. See Appendix B.6 for an explanation of this race condition. Consequently, processing of an offer with this attribute depends on the winner of the race.
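
As an informal illustration only, the pair formation and race check that the remainder of this subsection specifies might be sketched as follows. The names Addr, Pair, pairs_from_offer, losing_pairs and action_for_losing_pair are hypothetical and are not defined by this specification. Check states are modeled as plain strings. The "unspecified" result covers the case, not addressed by the text, in which no matching pair is In-Progress or Failed.

```python
from typing import NamedTuple


class Addr(NamedTuple):
    """A transport address (hypothetical helper type)."""
    ip: str
    port: int


class Pair(NamedTuple):
    """A candidate pair as seen by the answerer (hypothetical helper type)."""
    local: Addr
    remote: Addr


def pairs_from_offer(default_destinations, remote_candidates):
    """Form one pair per component from a received offer.

    Both arguments map component ID -> Addr. The remote candidate is the
    offerer's default destination; the local candidate is the address
    listed for that component in a=remote-candidates.
    """
    return {
        cid: Pair(local=remote_candidates[cid], remote=default_destinations[cid])
        for cid in default_destinations
    }


def losing_pairs(offer_pairs, valid_list):
    """Return the pairs from the offer that are not yet in the valid list."""
    return [pair for pair in offer_pairs.values() if pair not in valid_list]


def action_for_losing_pair(losing, check_list_states):
    """Decide what to do about one losing pair.

    check_list_states maps Pair -> state string for the check list.
    Only pairs whose remote candidate equals that of the losing pair
    are considered.
    """
    states = [
        state
        for pair, state in check_list_states.items()
        if pair.remote == losing.remote
    ]
    if "In-Progress" in states:
        return "wait"       # wait for the checks, then redo the processing
    if "Failed" in states:
        return "restart"    # answer as if the attribute were absent, restart ICE
    return "unspecified"    # assumption: the text does not cover this case
```

Once losing_pairs returns an empty list for the media stream, the agent can generate the answer.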
The agent forms a candidate pair for each component of the media stream by:

o Setting the remote candidate equal to the offerer's default destination for that component (e.g., the contents of the m and c-lines for RTP, and the a=rtcp attribute for RTCP)

o Setting the local candidate equal to the transport address for that same component in the a=remote-candidates attribute in the offer.

The agent then sees if each of these candidate pairs is present in the valid list. If a particular pair is not in the valid list, the check has "lost" the race. Call such a pair a "losing pair".

The agent finds all the pairs in the check list whose remote candidates equal the remote candidate in the losing pair:

o If none of the pairs are In-Progress, and at least one is Failed, it is most likely that a network failure, such as a network partition or serious packet loss, has occurred. The agent SHOULD generate an answer for this media stream as if the remote-candidates attribute had not been present, and then restart ICE for this stream.

o If at least one of the pairs is In-Progress, the agent SHOULD wait for those checks to complete, and as each completes, redo the processing in this section until there are no losing pairs.

Once there are no losing pairs, the agent can generate the answer. It MUST set the default destination for media to the candidates in the remote-candidates attribute from the offer (each of which will now be the local candidate of a candidate pair in the valid list). It MUST include a candidate attribute in the answer for each candidate in the remote-candidates attribute in the offer.

9.2.3. Procedures for Lite Implementations

If the received offer contains the remote-candidates attribute for a media stream, the agent forms a candidate pair for each component of the media stream by:

o Setting the remote candidate equal to the offerer's default destination for that component (e.g., the contents of the m and c-lines for RTP, and the a=rtcp attribute for RTCP)

o Setting the local candidate equal to the transport address for that same component in the a=remote-candidates attribute in the offer.

It then places those candidates into the Valid list for the media stream. The state of ICE processing for that media stream is set to Completed.

Furthermore, if the agent believed it was controlling, but the offer contained the remote-candidates attribute, both agents believe they are controlling. In this case, both would have sent updated offers around the same time. However, the signaling protocol carrying the offer/answer exchanges will have resolved this glare condition, so that one agent is always the 'winner' by having its offer received before its peer has sent an offer. The winner takes the role of controlling, so that the loser (the answerer under consideration in this section) MUST change its role to controlled. Consequently, if the agent was going to send an updated offer since, based on the rules in Section 8.2.2, it was controlling, it no longer needs to.

Besides the potential role change, change in the Valid list, and state changes, the construction of the answer is performed identically to the construction of an offer as described in Section 9.1.3.

9.3. Updating the Check and Valid Lists

9.3.1. Procedures for Full Implementations

9.3.1.1. ICE Restarts

The agent MUST remember the highest priority nominated pairs in the Valid list for each component of the media stream, called the previous selected pairs, prior to the restart. The agent will continue to send media using these pairs, as described in Section 11.1. Once these destinations are noted, the agent MUST flush the valid and check lists, and then recompute the check list and its states as described in Section 5.7.

9.3.1.2. New Media Stream

If the offer/answer exchange added a new media stream, the agent MUST create a new check list for it (and, of course, an empty Valid list to start), as described in Section 5.7.

9.3.1.3. Removed Media Stream

If the offer/answer exchange removed a media stream, or an answer rejected an offered media stream, an agent MUST flush the Valid list for that media stream. It MUST terminate any STUN transactions in progress for that media stream. An agent MUST remove the check list for that media stream and cancel any pending ordinary checks for it.

9.3.1.4. ICE Continuing for Existing Media Stream

The valid list is not affected by an updated offer/answer exchange unless ICE is restarting.

If an agent is in the Running state for that media stream, the check list is updated (the check list is irrelevant if the state is Completed). To do that, the agent recomputes the check list using the procedures described in Section 5.7. If a pair on the new check list was also on the previous check list, and its state was Waiting, In-Progress, Succeeded or Failed, its state is copied over. Otherwise, its state is set to Frozen.
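
The state carry-over just described can be illustrated with a minimal sketch. It assumes that each pair is identified by a hashable key, for instance a tuple of local and remote transport addresses, and that the recomputed check list from Section 5.7 is supplied already ordered. PairState and carry_over_states are hypothetical names used only for this illustration.

```python
from enum import Enum


class PairState(Enum):
    FROZEN = "Frozen"
    WAITING = "Waiting"
    IN_PROGRESS = "In-Progress"
    SUCCEEDED = "Succeeded"
    FAILED = "Failed"


# States that survive a check list recomputation.
CARRIED_OVER = {
    PairState.WAITING,
    PairState.IN_PROGRESS,
    PairState.SUCCEEDED,
    PairState.FAILED,
}


def carry_over_states(previous, recomputed_pairs):
    """Assign states to a recomputed check list.

    previous: dict mapping pair key -> PairState for the old check list.
    recomputed_pairs: iterable of pair keys, in the new check list order.
    Returns a dict (in the same order) mapping pair key -> PairState.
    """
    updated = {}
    for pair in recomputed_pairs:
        old_state = previous.get(pair)
        # Pairs that are new, or that were Frozen before, start out Frozen.
        updated[pair] = old_state if old_state in CARRIED_OVER else PairState.FROZEN
    return updated
```

Pairs that appeared only on the previous check list are simply dropped, since the result contains only the pairs of the recomputed list.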
If none of the check lists are active (meaning that the pairs in each check list are Frozen), the full-mode agent sets the first pair in the check list for the first media stream to Waiting, and then sets the state of all other pairs in that check list for the same component ID and with the same foundation to Waiting as well.

Next, the agent goes through each check list, starting with the highest priority pair. If a pair has a state of Succeeded, and it has a component ID of 1, then all Frozen pairs in the same check list with the same foundation whose component IDs are not 1 have their state set to Waiting. If, for a particular check list, there are pairs for each component of that media stream in the Succeeded state, the agent moves the state of all Frozen pairs for the first component of all other media streams (and thus in different check lists) with the same foundation to Waiting.

9.3.2. Procedures for Lite Implementations

If ICE is restarting for a media stream, the agent MUST start a new Valid list for that media stream. It MUST remember the pairs in the previous Valid list for each component of the media stream, called the previous selected pairs, and continue to send media there as described in Section 11.1. The state of ICE processing for each media stream MUST change to Running, and the state of ICE processing MUST change to Running.

10. Keepalives

All endpoints MUST send keepalives for each media session. These keepalives serve the purpose of keeping NAT bindings alive for the media session. These keepalives MUST be sent regardless of whether the media stream is currently inactive, sendonly, recvonly or sendrecv, and regardless of the presence or value of the bandwidth attribute. These keepalives MUST be sent even if ICE is not being utilized for the session at all.
The keepalive SHOULD be sent using a format which is supported by its peer. ICE endpoints allow for STUN-based keepalives for UDP streams, and as such, STUN keepalives MUST be used when an agent is a full ICE implementation and is communicating with a peer that supports ICE (lite or full). An agent can determine that its peer supports ICE by the presence of a=candidate attributes for each media session. If the peer does not support ICE, the choice of a packet format for keepalives is a matter of local implementation. A format which allows packets to easily be sent in the absence of actual media content is RECOMMENDED. Examples of formats which readily meet this goal are RTP No-Op [I-D.ietf-avt-rtp-no-op], and in cases where both sides support it, RTP comfort noise [RFC3389]. If the peer doesn't support any formats that are particularly well suited for keepalives, an agent SHOULD send RTP packets with an incorrect version number, or some other form of error which would cause them to be discarded by the peer.

If there has been no packet sent on the candidate pair ICE is using for a media component for Tr seconds (where packets include those defined for the component (RTP or RTCP) and previous keepalives), an agent MUST generate a keepalive on that pair. Tr SHOULD be configurable and SHOULD have a default of 15 seconds. Tr MUST NOT be configured to less than 15 seconds. Alternatively, if an agent has a dynamic way to discover the binding lifetimes of the intervening NATs, it can use that value to determine Tr. Administrators deploying ICE in more controlled networking environments SHOULD set Tr to the longest duration possible in their environment.

If STUN is being used for keepalives, a STUN Binding Indication is used [I-D.ietf-behave-rfc3489bis].
The Indication MUST NOT utilize any authentication mechanism, and SHOULD NOT contain any attributes. It is used solely to keep the NAT bindings alive. The Binding Indication is sent using the same local and remote candidates that are being used for media. Though Binding Indications are used for keepalives, an agent MUST be prepared to receive a connectivity check as well. If a connectivity check is received, a response is generated as discussed in [I-D.ietf-behave-rfc3489bis], but there is no impact on ICE processing otherwise.

An agent MUST begin the keepalive processing once ICE has selected candidates for usage with media, or media begins to flow, whichever happens first. Keepalives end once the session terminates or the media stream is removed.

11. Media Handling

11.1. Sending Media

Procedures for sending media differ for full and lite implementations.

11.1.1. Procedures for Full Implementations

Agents always send media using a candidate pair, called the selected candidate pair. An agent will send media to the remote candidate in the selected pair (setting the destination address and port of the packet equal to that remote candidate), and will send it from the local candidate of the selected pair. When the local candidate is server or peer reflexive, media is originated from the base. Media sent from a relayed candidate is sent from the base through that TURN server, using procedures defined in [I-D.ietf-behave-turn].
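
The rules for determining the selected pair, and for deciding whether media may be sent, are listed in the remainder of this subsection. As a rough, non-normative sketch, they might be expressed as follows. The names NominatedPair, selected_pair and may_send_media are hypothetical. The sketch also assumes that any check list state other than Running or Completed yields an empty selected pair, which the text does not state.

```python
from typing import NamedTuple, Optional


class NominatedPair(NamedTuple):
    """A nominated pair in the valid list (hypothetical helper type)."""
    name: str
    priority: int


def selected_pair(check_list_state: str,
                  previous_selected: Optional[NominatedPair],
                  nominated_valid_pairs) -> Optional[NominatedPair]:
    """Return the selected pair for one component, or None if it is empty.

    previous_selected is the pair remembered across an ICE restart,
    or None when there has been no restart.
    """
    if check_list_state == "Running":
        # Empty unless a previous selected pair exists due to a restart.
        return previous_selected
    if check_list_state == "Completed":
        # Highest priority nominated pair for the component.
        return max(nominated_valid_pairs, key=lambda p: p.priority, default=None)
    return None  # assumption: other states are treated as empty


def may_send_media(selected_by_component) -> bool:
    """Media may be sent only if every component has a selected pair."""
    return all(pair is not None for pair in selected_by_component.values())
```

For a media stream with RTP and RTCP components, may_send_media would be called with a mapping of the two component IDs to their selected pairs.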
The selected pair for a component of a media stream is:

o empty if the state of the check list for that media stream is Running, and there is no previous selected pair for that component due to an ICE restart

o equal to the previous selected pair for a component of a media stream if the state of the check list for that media stream is Running, and there was a previous selected pair for that component due to an ICE restart

o equal to the highest priority nominated pair for that component in the valid list if the state of the check list is Completed

If the selected pair for at least one component of a media stream is empty, an agent MUST NOT send media for any component of that media stream. If the selected pair for each component of a media stream has a value, an agent MAY send media for all components of that media stream.

Note that the selected pair for a component of a media stream may not equal the default pair for that same component from the most recent offer/answer exchange. When this happens, the selected pair is used for media, not the default pair. When ICE first completes, if the selected pairs aren't a match for the default pairs, the controlling agent initiates an updated offer/answer exchange to remedy this disparity. However, until that updated offer arrives, there will not be a match. Furthermore, in very unusual cases, the default candidates in the updated offer/answer will not be a match.

11.1.2. Procedures for Lite Implementations

A lite implementation MUST NOT send media until it has a Valid list that contains a candidate pair for each component of that media stream. Once that happens, the agent MAY begin sending media packets.
To do that, it sends media to the remote candidate in the pair (setting the destination address and port of the packet equal to that remote candidate), and sends it from the local candidate.

11.1.3. Procedures for All Implementations

ICE has interactions with jitter buffer adaptation mechanisms. An RTP stream can begin using one candidate, and switch to another one, though this happens rarely with ICE. The newer candidate may result in RTP packets taking a different path through the network - one with different delay characteristics. As discussed below, agents are encouraged to re-adjust jitter buffers when there are changes in source or destination address of media packets. Furthermore, many audio codecs use the marker bit to signal the beginning of a talkspurt, for the purposes of jitter buffer adaptation. For such codecs, it is RECOMMENDED that the sender set the marker bit [RFC3550] when an agent switches transmission of media from one candidate pair to another.

11.2. Receiving Media

ICE implementations MUST be prepared to receive media on each component on any candidates provided for that component in the most recent offer/answer exchange (in the case of RTP, this would include both RTP and RTCP if candidates were provided for both).

It is RECOMMENDED that, when an agent receives an RTP packet with a new source or destination IP address for a particular media stream, the agent re-adjust its jitter buffers.

RFC 3550 [RFC3550] describes an algorithm in Section 8.2 for detecting SSRC collisions and loops. These algorithms are based, in part, on seeing different source transport addresses with the same SSRC. However, when ICE is used, such changes will sometimes occur as the media streams switch between candidates.
An agent will be able to determine that a media stream is from the same peer as a consequence of the STUN exchange that precedes media transmission. Thus, if there is a change in source transport address, but the media packets come from the same peer agent, this SHOULD NOT be treated as an SSRC collision.

12. Usage with SIP

12.1. Latency Guidelines

ICE requires a series of STUN-based connectivity checks to take place between endpoints. These checks start from the answerer on generation of its answer, and start from the offerer when it receives the answer. These checks can take time to complete, and as such, the selection of messages to use with offers and answers can affect perceived user latency. Two latency figures are of particular interest. These are the post-pickup delay and the post-dial delay. The post-pickup delay refers to the time between when a user "answers the phone" and when any speech they utter can be delivered to the caller. The post-dial delay refers to the time between when a user enters the destination address for the called party and when ringback begins as a consequence of having successfully started ringing the phone of the called party.

Two cases can be considered - one where the offer is present in the initial INVITE, and one where it is in a response.

12.1.1. Offer in INVITE

To reduce post-dial delays, it is RECOMMENDED that the caller begin gathering candidates prior to actually sending its initial INVITE. This can be started upon user interface cues that a call is pending, such as activity on a keypad or the phone going offhook.

If an offer is received in an INVITE request, the answerer SHOULD begin to gather its candidates on receipt of the offer and then generate an answer in a provisional response once it has completed that process.
ICE requires that a provisional response with an SDP be transmitted reliably. This can be done through the existing PRACK mechanism [RFC3262], or through an optimization that is specific to ICE. With this optimization, provisional responses containing an SDP answer that begins ICE processing for one or more media streams can be sent reliably without RFC 3262. To do this, the agent retransmits the provisional response with the exponential backoff timers described in RFC 3262. Retransmits MUST cease on receipt of a STUN Binding Request for one of the media streams signaled in that SDP (because receipt of a Binding Request indicates the offerer has received the answer) or on transmission of the answer in a 2xx response. If the peer agent is lite, there will never be a STUN Binding Request. In such a case, the agent MUST cease retransmitting the 18x after sending it four times (ICE will actually work even if the peer never receives the 18x; however, experience has shown that sending it is important for middleboxes and firewall traversal). If no Binding Request is received prior to the last retransmit, the agent does not consider the session terminated. Despite the fact that the provisional response will be delivered reliably, the rules for when an agent can send an updated offer or answer do not change from those specified in RFC 3262. Specifically, if the INVITE contained an offer, the same answer appears in all of the 1xx and in the 2xx response to the INVITE. Only after that 2xx has been sent can an updated offer/answer exchange occur. This optimization SHOULD NOT be used if both agents support PRACK. Note that the optimization is very specific to provisional responses carrying answers that start ICE processing; it is not a general technique for 1xx reliability.
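
The conditions under which retransmission of the provisional response ceases can be sketched as follows. This is a non-normative illustration under several assumptions: ProvisionalState, should_retransmit and next_interval are hypothetical names; the initial transmission is counted as the first of the four sends to a lite peer; and the backoff is modeled as an interval that starts at T1 (taken here as 0.5 seconds, the RFC 3261 default) and doubles on each retransmission. The overall retransmission time limit of RFC 3262 is not modeled.

```python
from dataclasses import dataclass

# Assumption: "sending it four times" includes the initial transmission.
LITE_PEER_MAX_SENDS = 4


@dataclass
class ProvisionalState:
    """Bookkeeping for one provisional response carrying an SDP answer."""
    times_sent: int = 1                 # the initial transmission counts as one
    binding_request_seen: bool = False  # STUN Binding Request received for the SDP
    final_2xx_sent: bool = False        # answer already sent in a 2xx response
    peer_is_lite: bool = False


def should_retransmit(state: ProvisionalState) -> bool:
    """Return True if the provisional response should be sent again."""
    if state.binding_request_seen or state.final_2xx_sent:
        return False
    if state.peer_is_lite and state.times_sent >= LITE_PEER_MAX_SENDS:
        return False
    return True


def next_interval(times_sent: int, t1: float = 0.5) -> float:
    """Seconds to wait before the next retransmission.

    times_sent is the number of transmissions made so far (at least 1).
    The interval starts at T1 and doubles after each transmission.
    """
    return t1 * (2 ** (times_sent - 1))
```

An agent would consult should_retransmit each time the timer returned by next_interval expires, incrementing times_sent after each transmission.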
Alternatively, an agent MAY delay sending an answer until the 200 OK; however, this results in a poor user experience and is NOT RECOMMENDED.

Once the answer has been sent, the agent SHOULD begin its connectivity checks. Once candidate pairs for each component of a media stream enter the valid list, the answerer can begin sending media on that media stream.

However, prior to this point, any media that needs to be sent towards the caller (such as SIP early media [RFC3960]) MUST NOT be transmitted. For this reason, implementations SHOULD delay alerting the called party until candidates for each component of each media stream have entered the valid list. In the case of a PSTN gateway, this would mean that the setup message into the PSTN is delayed until this point. Doing this increases the post-dial delay, but has the effect of eliminating 'ghost rings'. Ghost rings are cases where the called party hears the phone ring, picks up, but hears nothing and cannot be heard. This technique works without requiring support for, or usage of, preconditions [RFC3312], since it is a localized decision. It also has the benefit of guaranteeing that not a single packet of media will get clipped, so that post-pickup delay is zero. If an agent chooses to delay local alerting in this way, it SHOULD generate a 180 response once alerting begins.

12.1.2. Offer in Response

In addition to uses where the offer is in an INVITE, and the answer is in the provisional and/or 200 OK response, ICE works with cases where the offer appears in the response. In such cases, which are common in third party call control [RFC3725], ICE agents SHOULD generate their offers in a reliable provisional response (which MUST utilize RFC 3262), and not alert the user on receipt of the INVITE.
This allows for ICE processing to 3161 take place prior to alerting, so that there is no post-pickup delay, 3162 at the expense of increased call setup delays. Once ICE completes, 3163 the callee can alert the user and then generate a 200 OK when they 3164 answer. The 200 OK would contain no SDP, since the offer/answer 3165 exchange has completed. 3167 Alternatively, agents MAY place the offer in a 2xx instead (in which 3168 case the answer comes in the ACK). When this happens, the callee 3169 will alert the user on receipt of the INVITE, and the ICE exchanges 3170 will take place only after the user answers. This has the effect of 3171 reducing call setup delay, but can cause substantial post-pickup 3172 delays and media clipping. 3174 12.2. SIP Option Tags and Media Feature Tags 3176 [I-D.ietf-sip-ice-option-tag] specifies a SIP option tag and media 3177 feature tag for usage with ICE. ICE implementations using SIP SHOULD 3178 support this specification, which uses a feature tag in registrations 3179 to facilitate interoperability through signaling intermediaries 3181 12.3. Interactions with Forking 3183 ICE interacts very well with forking. Indeed, ICE fixes some of the 3184 problems associated with forking. Without ICE, when a call forks and 3185 the caller receives multiple incoming media streams, it cannot 3186 determine which media stream corresponds to which callee. 3188 With ICE, this problem is resolved. The connectivity checks which 3189 occur prior to transmission of media carry username fragments, which 3190 in turn are correlated to a specific callee. Subsequent media 3191 packets which arrive on the same candidate pair as the connectivity 3192 check will be associated with that same callee. Thus, the caller can 3193 perform this correlation as long as it has received an answer. 3195 12.4. 
Interactions with Preconditions 3197 Quality of Service (QoS) preconditions, which are defined in RFC 3312 3198 [RFC3312] and RFC 4032 [RFC4032], apply only to the transport 3199 addresses listed as the default targets for media in an offer/answer. 3200 If ICE changes the transport address where media is received, this 3201 change is reflected in an updated offer which changes the default 3202 destination for media to match ICE's selection. As such, it appears 3203 like any other re-INVITE would, and is fully treated in RFC 3312 and 3204 RFC 4032, which apply without regard to the fact that the destination for 3205 media is changing due to ICE negotiations occurring "in the 3206 background". 3208 Indeed, an agent SHOULD NOT indicate that QoS preconditions have been 3209 met until the checks have completed and selected the candidate pairs 3210 to be used for media. 3212 ICE also has (purposeful) interactions with connectivity 3213 preconditions [I-D.ietf-mmusic-connectivity-precon]. Those 3214 interactions are described there. Note that the procedures described 3215 in Section 12.1 describe their own type of "preconditions", albeit 3216 with less functionality than those provided by the explicit 3217 preconditions in [I-D.ietf-mmusic-connectivity-precon]. 3219 12.5. Interactions with Third Party Call Control 3221 ICE works with Flows I, III and IV as described in [RFC3725]. Flow I 3222 works without the controller supporting or being aware of ICE. Flow 3223 IV will work as long as the controller passes along the ICE 3224 attributes without alteration. Flow II is fundamentally incompatible 3225 with ICE; each agent will believe itself to be the answerer and thus 3226 never generate a re-INVITE. 3228 The flows for continued operation, as described in Section 7 of RFC 3229 3725, require additional behavior of ICE implementations to support.
3230 In particular, if an agent receives a mid-dialog re-INVITE that 3231 contains no offer, it MUST restart ICE for each media stream and go 3232 through the process of gathering new candidates. Furthermore, that 3233 list of candidates SHOULD include the ones currently being used for 3234 media. 3236 13. Relationship with ANAT 3238 RFC 4091 [RFC4091], the Alternative Network Address Types (ANAT) 3239 Semantics for the SDP grouping framework, defines a mechanism for 3240 indicating that an agent can support both IPv4 and IPv6 for a media 3241 stream, and it does so by including two m-lines, one for v4, and one 3242 for v6. This is similar to ICE, which allows for an agent to 3243 indicate multiple transport addresses using the candidate attribute. 3244 However, ANAT relies on static selection to pick between choices, 3245 rather than a dynamic connectivity check used by ICE. 3247 This specification deprecates RFC 4091. Instead, agents wishing to 3248 support dual-stack will utilize ICE. Because a dual-stack agent will 3249 require at least two candidates, one for IPv4 and one for IPv6, dual- 3250 stack agents MUST be full implementations. However, agents that are 3251 implementing dual-stack and are running on closed networks where it 3252 is known that there are no NAT, MAY include only host candidates in 3253 their offers, skipping server reflexive and relayed candidates. 3255 14. Extensibility Considerations 3257 This specification makes very specific choices about how both agents 3258 in a session coordinate to arrive at the set of candidate pairs that 3259 are selected for media. It is anticipated that future specifications 3260 will want to alter these algorithms, whether they are simple changes 3261 like timer tweaks, or larger changes like a revamp of the priority 3262 algorithm. When such a change is made, providing interoperability 3263 between the two agents in a session is critical. 3265 First, ICE provides the a=ice-options SDP attribute. 
Each extension 3266 or change to ICE is associated with a token. When an agent 3267 supporting such an extension or change generates an offer or an 3268 answer, it MUST include the token for that extension in this 3269 attribute. This allows each side to know what the other side is 3270 doing. This attribute MUST NOT be present if the agent doesn't 3271 support any ICE extensions or changes. 3273 At this time, no IANA registry or registration procedures are defined 3274 for these option tags. At time of writing, it is unclear whether ICE 3275 changes and extensions will be sufficiently common to warrant a 3276 registry. 3278 One of the complications in achieving interoperability is that ICE 3279 relies on a distributed algorithm running on both agents to converge 3280 on an agreed set of candidate pairs. If the two agents run different 3281 algorithms, it can be difficult to guarantee convergence on the same 3282 candidate pairs. The regular nomination procedure described in 3283 Section 8 eliminates some of the tight coordination by delegating the 3284 selection algorithm completely to the controlling agent. 3285 Consequently, when a controlling agent is communicating with a peer 3286 that supports options it doesn't know about, the agent MUST run a 3287 regular nomination algorithm. When regular nomination is used, ICE 3288 will converge perfectly even when both agents use different pair 3289 prioritization algorithms. One of the keys to such convergence are 3290 triggered checks, which ensure that the nominated pair is validated 3291 by both agents. Consequently, any future ICE enhancements MUST 3292 preserve triggered checks. 3294 ICE is also extensible to other media streams beyond RTP, and for 3295 transport protocols beyond UDP. Extensions to ICE for non-RTP media 3296 streams need to specify how many components they utilize, and assign 3297 component IDs to them, starting at 1 for the most important component 3298 ID. 
Specifications for new transport protocols must define how, if 3299 at all, various steps in the ICE processing differ from UDP. 3301 15. Grammar 3303 This specification defines seven new SDP attributes - the 3304 "candidate", "remote-candidates", "ice-lite", "ice-mismatch", "ice- 3305 ufrag", "ice-pwd" and "ice-options" attributes. 3307 15.1. "candidate" Attribute 3309 The candidate attribute is a media-level attribute only. It contains 3310 a transport address for a candidate that can be used for connectivity 3311 checks. 3313 The syntax of this attribute is defined using Augmented BNF as 3314 defined in RFC 4234 [RFC4234]: 3316 candidate-attribute = "candidate" ":" foundation SP component-id SP 3317 transport SP 3318 priority SP 3319 connection-address SP ;from RFC 4566 3320 port ;port from RFC 4566 3321 SP cand-type 3322 [SP rel-addr] 3323 [SP rel-port] 3324 *(SP extension-att-name SP 3325 extension-att-value) 3327 foundation = 1*32ice-char 3328 component-id = 1*5DIGIT 3329 transport = "UDP" / transport-extension 3330 transport-extension = token ; from RFC 3261 3331 priority = 1*10DIGIT 3332 cand-type = "typ" SP candidate-types 3333 candidate-types = "host" / "srflx" / "prflx" / "relay" / token 3334 rel-addr = "raddr" SP connection-address 3335 rel-port = "rport" SP port 3336 extension-att-name = byte-string ;from RFC 4566 3337 extension-att-value = byte-string 3338 ice-char = ALPHA / DIGIT / "+" / "/" 3340 This grammar encodes the primary information about a candidate: its 3341 IP address, port and transport protocol, and its properties: the 3342 foundation, component ID, priority, type, and related transport 3343 address: 3345 <connection-address>: is taken from RFC 4566 [RFC4566]. It is the 3346 IP address of the candidate, allowing for IPv4 addresses, IPv6 3347 addresses and FQDNs. An IP address SHOULD be used, but an FQDN 3348 MAY be used in place of an IP address.
In that case, when 3349 receiving an offer or answer containing an FQDN in an a=candidate 3350 attribute, the FQDN is looked up in the DNS first using an AAAA 3351 record (assuming the agent supports IPv6), and if no result is 3352 found or the agent only supports IPv4, using an A. If the DNS 3353 query returns more than one IP address, one is chosen, and then 3354 used for the remainder of ICE processing. 3356 <port>: is also taken from RFC 4566 [RFC4566]. It is the port of 3357 the candidate. 3359 <transport>: indicates the transport protocol for the candidate. 3360 This specification only defines UDP. However, extensibility is 3361 provided to allow for future transport protocols to be used with 3362 ICE, such as TCP or the Datagram Congestion Control Protocol 3363 (DCCP) [RFC4340]. 3365 <foundation>: is composed of one to thirty two <ice-char>. It is an 3366 identifier that is equivalent for two candidates that are of the 3367 same type, share the same base, and come from the same STUN 3368 server. The foundation is used to optimize ICE performance in the 3369 Frozen algorithm. 3371 <component-id>: is a positive integer between 1 and 256 which 3372 identifies the specific component of the media stream for which 3373 this is a candidate. It MUST start at 1 and MUST increment by 1 3374 for each component of a particular candidate. For media streams 3375 based on RTP, candidates for the actual RTP media MUST have a 3376 component ID of 1, and candidates for RTCP MUST have a component 3377 ID of 2. Other types of media streams which require multiple 3378 components MUST develop specifications which define the mapping of 3379 components to component IDs. See Section 14 for additional 3380 discussion on extending ICE to new media streams. 3382 <priority>: is a positive integer between 1 and (2**31 - 1). 3384 <cand-type>: encodes the type of candidate. This specification 3385 defines the values "host", "srflx", "prflx" and "relay" for host, 3386 server reflexive, peer reflexive and relayed candidates, 3387 respectively.
The set of candidate types is extensible for the 3388 future. 3390 <rel-addr> and <rel-port>: convey transport addresses related to the 3391 candidate, useful for diagnostics and other purposes. 3392 <rel-addr> and <rel-port> MUST be present for server reflexive, peer 3393 reflexive and relayed candidates. If a candidate is server or 3394 peer reflexive, <rel-addr> and <rel-port> are equal to the base for 3395 that server or peer reflexive candidate. If the candidate is 3396 relayed, <rel-addr> and <rel-port> are equal to the mapped address 3397 in the Allocate Response that provided the client with that 3398 relayed candidate (see Appendix B.3 for a discussion of its 3399 purpose). If the candidate is a host candidate, <rel-addr> and <rel-port> 3400 MUST be omitted. 3402 The candidate attribute can itself be extended. The grammar allows 3403 for new name/value pairs to be added at the end of the attribute. An 3404 implementation MUST ignore any name/value pairs it doesn't 3405 understand. 3407 15.2. "remote-candidates" Attribute 3409 The syntax of the "remote-candidates" attribute is defined using 3410 Augmented BNF as defined in RFC 4234 [RFC4234]. The remote- 3411 candidates attribute is a media level attribute only. 3413 remote-candidate-att = "remote-candidates" ":" remote-candidate 3414 0*(SP remote-candidate) 3415 remote-candidate = component-ID SP connection-address SP port 3417 The attribute contains a connection-address and port for each 3418 component. The ordering of components is irrelevant. However, a 3419 value MUST be present for each component of a media stream. This 3420 attribute MUST be included in an offer by a controlling agent for a 3421 media stream that is Completed, and MUST NOT be included in any other 3422 case. 3424 15.3. "ice-lite" and "ice-mismatch" Attributes 3426 The syntax of the "ice-lite" and "ice-mismatch" attributes, both of 3427 which are flags, is: 3429 ice-lite = "ice-lite" 3430 ice-mismatch = "ice-mismatch" 3432 "ice-lite" is a session level attribute only, and indicates that an 3433 agent is a lite implementation.
"ice-mismatch" is a media level 3434 attribute only, and when present in an answer, indicates that the 3435 offer arrived with a default destination for a media component that 3436 didn't have a corresponding candidate attribute. 3438 15.4. "ice-ufrag" and "ice-pwd" Attributes 3440 The "ice-ufrag" and "ice-pwd" attributes convey the username fragment 3441 and password used by ICE for message integrity. Their syntax is: 3443 ice-pwd-att = "ice-pwd" ":" password 3444 ice-ufrag-att = "ice-ufrag" ":" ufrag 3445 password = 22*256ice-char 3446 ufrag = 4*256ice-char 3448 The "ice-pwd" and "ice-ufrag" attributes can appear at either the 3449 session-level or media-level. When present in both, the value in the 3450 media-level takes precedence. Thus, the value at the session level 3451 is effectively a default that applies to all media streams, unless 3452 overriden by a media-level value. Whether present at the session or 3453 media level, there MUST be an ice-pwd and ice-ufrag attribute for 3454 each media stream. If two media streams have identical ice-ufrag's, 3455 they MUST have identical ice-pwd's. 3457 The ice-ufrag and ice-pwd attributes MUST be chosen randomly at the 3458 beginning of a session. The ice-ufrag attribute MUST contain at 3459 least 24 bits of randomness, and the ice-pwd attribute MUST contain 3460 at least 128 bits of randomness. This means that the ice-ufrag 3461 attribute will be at least 4 characters long, and the ice-pwd at 3462 least 22 characters long, since the grammar for these attributes 3463 allows for 6 bits of randomness per character. The attributes MAY be 3464 longer than 4 and 22 characters respectively, of course, up to 256 3465 characters. The upper limit allows for buffer sizing in 3466 implementations. Its large upper limit allows for increased amounts 3467 of randomness to be added over time. 3469 15.5. "ice-options" Attribute 3471 The "ice-options" attribute is a session level attribute. 
It 3472 contains a series of tokens which identify the options supported by 3473 the agent. Its grammar is: 3475 ice-options = "ice-options" ":" ice-option-tag 3476 0*(SP ice-option-tag) 3477 ice-option-tag = 1*ice-char 3479 16. Setting Ta and RTO 3481 During the gathering phase of ICE (Section 4.1.1) and while ICE is 3482 performing connectivity checks (Section 7), an agent sends STUN and 3483 TURN transactions. These transactions are paced at a rate of one 3484 every Ta milliseconds, and utilize a specific RTO. This section 3485 describes how the values of Ta and RTO are computed. This computation 3486 depends on whether ICE is being used with a real time media stream 3487 (such as RTP) or something else. When ICE is used for a stream with 3488 a known maximum bandwidth, the computation in Section 16.1 MAY be 3489 followed to rate-control the ICE exchanges. For all other streams, 3490 the computation in Section 16.2 MUST be followed. 3492 16.1. RTP Media Streams 3494 The values of RTO and Ta change during the lifetime of ICE 3495 processing. One set of values applies during the gathering phase, 3496 and the other, for connectivity checks. 3498 The value of Ta SHOULD be configurable, and SHOULD have a default of: 3500 For each media stream i: 3501 Ta_i = (stun_packet_size / rtp_packet_size) * rtp_ptime 3503 Ta = MAX (20ms, 1 / SUM[i=1..k] (1 / Ta_i)) 3513 Where k is the number of media streams. During the gathering phase, 3514 Ta is computed based on the number of media streams the agent has 3515 indicated in its offer or answer, and the RTP packet size and RTP 3516 ptime are those of the most preferred codec for each media stream. 3517 Once an offer and answer have been exchanged, the agent recomputes Ta 3518 to pace the connectivity checks.
In that case, the value of Ta is 3519 based on the number of media streams that will actually be used in 3520 the session, and the RTP packet size and RTP ptime are those of the 3521 most preferred codec that the agent will send with. 3523 In addition, the retransmission timer for the STUN transactions, RTO, 3524 defined in [I-D.ietf-behave-rfc3489bis], SHOULD be configurable and, 3525 during the gathering phase, SHOULD have a default of: 3527 RTO = MAX (100ms, Ta * (number of pairs)) 3529 Where the number of pairs refers to the number of pairs of candidates 3530 with STUN or TURN servers. 3532 For connectivity checks, RTO SHOULD be configurable and SHOULD have a 3533 default of: 3535 RTO = MAX (100ms, Ta*N * (Num-Waiting)) 3537 Where Num-Waiting is the number of checks in the check list in the 3538 Waiting state. Note that the RTO will be different for each 3539 transaction as the number of checks in the Waiting state changes. 3541 These formulas are aimed at causing STUN transactions to be paced at 3542 the same rate as media. This ensures that ICE will work properly 3543 under the same network conditions needed to support the media as 3544 well. See Appendix B.1 for additional discussion and motivations. 3545 Because of this pacing, it will take a certain amount of time to 3546 obtain all of the server reflexive and relayed candidates. 3547 Implementations should be aware of the time required to do this, and 3548 if the application requires a time budget, limit the number of 3549 candidates which are gathered. 3551 The formulas result in a behavior whereby an agent will send its 3552 first packet for every single connectivity check before performing a 3553 retransmit. This can be seen in the formulas for the RTO (which 3554 represents the retransmit interval). Those formulas scale with N, 3555 the number of checks to be performed. As a result of this, ICE 3556 maintains a nicely constant rate, but becomes more sensitive to 3557 packet loss.
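The default Ta and RTO computations for RTP media streams can be sketched as follows. This is only an illustration under stated assumptions, not part of the specification: the function names and the sample packet sizes are hypothetical, packet sizes are assumed to be in the same unit (for instance bytes), times are in milliseconds, and N is simply passed in by the caller.

```python
def ta_for_stream(stun_packet_size, rtp_packet_size, rtp_ptime_ms):
    """Ta_i = (stun_packet_size / rtp_packet_size) * rtp_ptime, in ms."""
    return (stun_packet_size / rtp_packet_size) * rtp_ptime_ms


def ta_default(streams, floor_ms=20.0):
    """Ta = MAX(floor, 1 / sum(1 / Ta_i)) over all k media streams.

    `streams` is a list of (stun_packet_size, rtp_packet_size, rtp_ptime_ms)
    tuples, one per media stream (hypothetical representation).
    """
    rate = sum(1.0 / ta_for_stream(*s) for s in streams)
    return max(floor_ms, 1.0 / rate)


def rto_gathering(ta_ms, number_of_pairs, floor_ms=100.0):
    """RTO = MAX(floor, Ta * (number of pairs)) during gathering."""
    return max(floor_ms, ta_ms * number_of_pairs)


def rto_checks(ta_ms, n, num_waiting, floor_ms=100.0):
    """RTO = MAX(floor, Ta * N * Num-Waiting) during connectivity checks."""
    return max(floor_ms, ta_ms * n * num_waiting)


# Hypothetical sizes: two streams whose Ta_i is 60 ms each.
streams = [(300, 100, 20), (300, 100, 20)]
ta = ta_default(streams)
print("Ta (ms):", round(ta, 3))
print("gathering RTO (ms):", round(rto_gathering(ta, 4), 3))
print("check RTO (ms):", round(rto_checks(ta, 1, 10), 3))
```

The floor arguments default to the 20ms and 100ms values used for RTP media streams; the more conservative 500ms values of Section 16.2 could be passed instead.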
The loss of the first single packet for any 3558 connectivity check is likely to cause that pair to take a long time 3559 to be validated, and instead, a lower priority check (but one for 3560 which there was no packet loss) is much more likely to complete 3561 first. This results in ICE performing sub-optimally, choosing lower 3562 priority pairs over higher priority pairs. Implementors should be 3563 aware of this consequence, but still should utilize the timer values 3564 described here. 3566 16.2. Non-RTP Sessions 3568 In cases where ICE is used to establish some kind of session which is 3569 not real time, and has no fixed rate associated with it that is known 3570 to work on the network in which ICE is deployed, Ta and RTO revert to 3571 more conservative values. Ta SHOULD be configurable, SHOULD have a 3572 default of 500ms, and MUST NOT be configurable to be less than 500ms. 3574 In addition, the retransmission timer for the STUN transactions, RTO, 3575 SHOULD be configurable and during the gathering phase, SHOULD have a 3576 default of: 3578 RTO = MAX (500ms, Ta * (number of pairs)) 3580 Where the number of pairs refers to the number of pairs of candidates 3581 with STUN or TURN servers. 3583 For connectivity checks, RTO SHOULD be configurable and SHOULD have a 3584 default of: 3586 RTO = MAX (500ms, Ta*N * (Num-Waiting)) 3588 17. Example 3590 The example is based on the simplified topology of Figure 21. 3592 +-----+ 3593 | | 3594 |STUN | 3595 | Srvr| 3596 +-----+ 3597 | 3598 +---------------------+ 3599 | | 3600 | Internet | 3601 | | 3602 | | 3603 +---------------------+ 3604 | | 3605 | | 3606 +---------+ | 3607 | NAT | | 3608 +---------+ | 3609 | | 3610 | | 3611 | | 3612 +-----+ +-----+ 3613 | | | | 3614 | L | | R | 3615 | | | | 3616 +-----+ +-----+ 3618 Figure 21: Example Topology 3620 Two agents, L and R, are using ICE. Both are full-mode ICE 3621 implementations and use aggressive nomination when they are 3622 controlling. 
Both agents have a single IPv4 address. For agent L, 3623 it is 10.0.1.1 in private address space [RFC1918], and for agent R, 3624 192.0.2.1 on the public Internet. Both are configured with the same 3625 STUN server (shown in this example for simplicity, although in 3626 practice the agents do not need to use the same STUN server), which 3627 is listening for STUN Binding Requests at an IP address of 192.0.2.2 3628 and port 3478. TURN servers are not used in this example. Agent L 3629 is behind a NAT, and agent R is on the public Internet. The NAT has 3630 an endpoint independent mapping property and an address dependent 3631 filtering property. The public side of the NAT has an IP address of 3632 192.0.2.3. 3634 To facilitate understanding, transport addresses are listed using 3635 variables that have mnemonic names. The format of the name is 3636 entity-type-seqno, where entity refers to the entity whose IP address 3637 the transport address is on, and is one of "L", "R", "STUN", or 3638 "NAT". The type is either "PUB" for transport addresses that are 3639 public, and "PRIV" for transport addresses that are private. 3641 Finally, seq-no is a sequence number that is different for each 3642 transport address of the same type on a particular entity. Each 3643 variable has an IP address and port, denoted by varname.IP and 3644 varname.PORT, respectively, where varname is the name of the 3645 variable. 3647 The STUN server has advertised transport address STUN-PUB-1 (which is 3648 192.0.2.2:3478). 3650 In the call flow itself, STUN messages are annotated with several 3651 attributes. The "S=" attribute indicates the source transport 3652 address of the message. The "D=" attribute indicates the destination 3653 transport address of the message. The "MA=" attribute is used in 3654 STUN Binding Response messages and refers to the mapped address. 3655 "USE-CAND" implies the presence of the USE-CANDIDATE attribute. 
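The variable notation above can be made concrete with a small substitution helper. The sketch below is illustrative only: the `ADDRESSES` table and the `expand` function are hypothetical names, the IP addresses are taken from the topology description, and the port numbers for L-PRIV-1, NAT-PUB-1 and R-PUB-1 are the ones that appear in the expanded offer and answer later in this example.

```python
import re

# Transport address variables used in the example (IP address, port).
ADDRESSES = {
    "L-PRIV-1": ("10.0.1.1", 8998),
    "NAT-PUB-1": ("192.0.2.3", 45664),
    "STUN-PUB-1": ("192.0.2.2", 3478),
    "R-PUB-1": ("192.0.2.1", 3478),
}

# Matches $entity-type-seqno, optionally followed by .IP or .PORT.
_VAR = re.compile(r"\$([A-Z]+-(?:PUB|PRIV)-\d+)(?:\.(IP|PORT))?")


def expand(text):
    """Replace $varname, $varname.IP and $varname.PORT with their values."""
    def repl(match):
        ip, port = ADDRESSES[match.group(1)]
        field = match.group(2)
        if field == "IP":
            return ip
        if field == "PORT":
            return str(port)
        return f"{ip}:{port}"  # bare variable: full transport address
    return _VAR.sub(repl, text)


print(expand("c=IN IP4 $NAT-PUB-1.IP"))
print(expand("m=audio $NAT-PUB-1.PORT RTP/AVP 0"))
print(expand("S=$L-PRIV-1 D=$STUN-PUB-1"))
```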
3657 The call flow examples omit STUN authentication operations and RTCP, 3658 and focus on RTP for a single media stream between two full 3659 implementations. 3661 L NAT STUN R 3662 |RTP STUN alloc. | | 3663 |(1) STUN Req | | | 3664 |S=$L-PRIV-1 | | | 3665 |D=$STUN-PUB-1 | | | 3666 |------------->| | | 3667 | |(2) STUN Req | | 3668 | |S=$NAT-PUB-1 | | 3669 | |D=$STUN-PUB-1 | | 3670 | |------------->| | 3671 | |(3) STUN Res | | 3672 | |S=$STUN-PUB-1 | | 3673 | |D=$NAT-PUB-1 | | 3674 | |MA=$NAT-PUB-1 | | 3675 | |<-------------| | 3676 |(4) STUN Res | | | 3677 |S=$STUN-PUB-1 | | | 3678 |D=$L-PRIV-1 | | | 3679 |MA=$NAT-PUB-1 | | | 3680 |<-------------| | | 3681 |(5) Offer | | | 3682 |------------------------------------------->| 3683 | | | |RTP STUN alloc. 3684 | | |(6) STUN Req | 3685 | | |S=$R-PUB-1 | 3686 | | |D=$STUN-PUB-1 | 3687 | | |<-------------| 3688 | | |(7) STUN Res | 3689 | | |S=$STUN-PUB-1 | 3690 | | |D=$R-PUB-1 | 3691 | | |MA=$R-PUB-1 | 3692 | | |------------->| 3693 |(8) answer | | | 3694 |<-------------------------------------------| 3695 | |(9) Bind Req | |Begin 3696 | |S=$R-PUB-1 | |Connectivity 3697 | |D=L-PRIV-1 | |Checks 3698 | |<----------------------------| 3699 | |Dropped | | 3700 |(10) Bind Req | | | 3701 |S=$L-PRIV-1 | | | 3702 |D=$R-PUB-1 | | | 3703 |USE-CAND | | | 3704 |------------->| | | 3705 | |(11) Bind Req | | 3706 | |S=$NAT-PUB-1 | | 3707 | |D=$R-PUB-1 | | 3708 | |USE-CAND | | 3709 | |---------------------------->| 3710 | |(12) Bind Res | | 3711 | |S=$R-PUB-1 | | 3712 | |D=$NAT-PUB-1 | | 3713 | |MA=$NAT-PUB-1 | | 3714 | |<----------------------------| 3715 |(13) Bind Res | | | 3716 |S=$R-PUB-1 | | | 3717 |D=$L-PRIV-1 | | | 3718 |MA=$NAT-PUB-1 | | | 3719 |<-------------| | | 3720 |RTP flows | | | 3721 | |(14) Bind Req | | 3722 | |S=$R-PUB-1 | | 3723 | |D=$NAT-PUB-1 | | 3724 | |<----------------------------| 3725 |(15) Bind Req | | | 3726 |S=$R-PUB-1 | | | 3727 |D=$L-PRIV-1 | | | 3728 |<-------------| | | 3729 |(16) Bind Res | | | 3730 
|S=$L-PRIV-1 | | | 3731 |D=$R-PUB-1 | | | 3732 |MA=$R-PUB-1 | | | 3733 |------------->| | | 3734 | |(17) Bind Res | | 3735 | |S=$NAT-PUB-1 | | 3736 | |D=$R-PUB-1 | | 3737 | |MA=$R-PUB-1 | | 3738 | |---------------------------->| 3739 | | | |RTP flows 3741 Figure 22: Example Flow 3743 First, agent L obtains a host candidate from its local IP address 3744 (not shown), and from that, sends a STUN Binding Request to the STUN 3745 server to get a server reflexive candidate (messages 1-4). Recall 3746 that the NAT has the address and port independent mapping property. 3747 Here, it creates a binding of NAT-PUB-1 for this UDP request, and 3748 this becomes the server reflexive candidate for RTP. 3750 Agent L sets a type preference of 126 for the host candidate and 100 3751 for the server reflexive. The local preference is 65535. Based on 3752 this, the priority of the host candidate is 2130706431 and for the 3753 server reflexive candidate is 1694498815. The host candidate is 3754 assigned a foundation of 1, and the server reflexive, a foundation of 3755 2. It chooses its server reflexive candidate as the default 3756 candidate, and encodes it into the m and c lines. 
The resulting 3757 offer (message 5) looks like (lines folded for clarity): 3759 v=0 3760 o=jdoe 2890844526 2890842807 IN IP4 $L-PRIV-1.IP 3761 s= 3762 c=IN IP4 $NAT-PUB-1.IP 3763 t=0 0 3764 a=ice-pwd:asd88fgpdd777uzjYhagZg 3765 a=ice-ufrag:8hhY 3766 m=audio $NAT-PUB-1.PORT RTP/AVP 0 3767 b=RS:0 3768 b=RR:0 3769 a=rtpmap:0 PCMU/8000 3770 a=candidate:1 1 UDP 2130706431 $L-PRIV-1.IP $L-PRIV-1.PORT typ host 3771 a=candidate:2 1 UDP 1694498815 $NAT-PUB-1.IP $NAT-PUB-1.PORT typ 3772 srflx raddr $L-PRIV-1.IP rport $L-PRIV-1.PORT 3774 The offer, with the variables replaced with their values, will look 3775 like (lines folded for clarity): 3777 v=0 3778 o=jdoe 2890844526 2890842807 IN IP4 10.0.1.1 3779 s= 3780 c=IN IP4 192.0.2.3 3781 t=0 0 3782 a=ice-pwd:asd88fgpdd777uzjYhagZg 3783 a=ice-ufrag:8hhY 3784 m=audio 45664 RTP/AVP 0 3785 b=RS:0 3786 b=RR:0 3787 a=rtpmap:0 PCMU/8000 3788 a=candidate:1 1 UDP 2130706431 10.0.1.1 8998 typ host 3789 a=candidate:2 1 UDP 1694498815 192.0.2.3 45664 typ srflx raddr 3790 10.0.1.1 rport 8998 3792 This offer is received at agent R. Agent R will obtain a host 3793 candidate, and from it, obtain a server reflexive candidate (messages 3794 6-7). Since R is not behind a NAT, this candidate is identical to 3795 its host candidate, and they share the same base. It therefore 3796 discards this redundant candidate and ends up with a single host 3797 candidate. With identical type and local preferences as L, the 3798 priority for this candidate is 2130706431. It chooses a foundation 3799 of 1 for its single candidate. 
Its resulting answer looks like: 3801 v=0 3802 o=bob 2808844564 2808844564 IN IP4 $R-PUB-1.IP 3803 s= 3804 c=IN IP4 $R-PUB-1.IP 3805 t=0 0 3806 a=ice-pwd:YH75Fviy6338Vbrhrlp8Yh 3807 a=ice-ufrag:9uB6 3808 m=audio $R-PUB-1.PORT RTP/AVP 0 3809 b=RS:0 3810 b=RR:0 3811 a=rtpmap:0 PCMU/8000 3812 a=candidate:1 1 UDP 2130706431 $R-PUB-1.IP $R-PUB-1.PORT typ host 3814 With the variables filled in: 3816 v=0 3817 o=bob 2808844564 2808844564 IN IP4 192.0.2.1 3818 s= 3819 c=IN IP4 192.0.2.1 3820 t=0 0 3821 a=ice-pwd:YH75Fviy6338Vbrhrlp8Yh 3822 a=ice-ufrag:9uB6 3823 m=audio 3478 RTP/AVP 0 3824 b=RS:0 3825 b=RR:0 3826 a=rtpmap:0 PCMU/8000 3827 a=candidate:1 1 UDP 2130706431 192.0.2.1 3478 typ host 3829 Since neither side indicated that they are lite, the agent which sent 3830 the offer that began ICE processing (agent L) becomes the controlling 3831 agent. 3833 Agents L and R both pair up the candidates. They both initially have 3834 two pairs. However, agent L will prune the pair containing its 3835 server reflexive candidate, resulting in just one. At agent L, this 3836 pair has a local candidate of $L_PRIV_1 and remote candidate of 3837 $R_PUB_1, and has a candidate pair priority of 4.57566E+18 (note that 3838 an implementation would represent this as a 64 bit integer so as not 3839 to lose precision). At agent R, there are two pairs. The highest 3840 priority has a local candidate of $R_PUB_1 and remote candidate of 3841 $L_PRIV_1 and has a priority of 4.57566E+18, and the second has a 3842 local candidate of $R_PUB_1 and remote candidate of $NAT_PUB_1 and 3843 priority 3.63891E+18. 3845 Agent R begins its connectivity check (message 9) for the first pair 3846 (between the two host candidates). Since R is the controlled agent 3847 for this session, the check omits the USE-CANDIDATE attribute. The 3848 host candidate from agent L is private and behind a NAT, and thus 3849 this check won't be successful, because the packet cannot be routed 3850 from R to L. 
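The candidate priorities quoted in the offer and answer above can be reproduced with a short calculation. The sketch below assumes the candidate priority formula defined earlier in this specification (2^24 * type preference + 2^8 * local preference + (256 - component ID)) and reads the fields of an unfolded a=candidate line in the order given by the grammar in Section 15.1. The names `candidate_priority`, `parse_candidate`, `TYPE_PREF` and `LOCAL_PREF` are hypothetical, and splitting on whitespace is a simplification of the ABNF rather than a full parser.

```python
def candidate_priority(type_pref, local_pref, component_id):
    """priority = 2^24 * type pref + 2^8 * local pref + (256 - component ID)."""
    return (type_pref << 24) + (local_pref << 8) + (256 - component_id)


def parse_candidate(line):
    """Split an unfolded a=candidate line into its fields (simplified)."""
    prefix = "a=candidate:"
    if not line.startswith(prefix):
        raise ValueError("not a candidate attribute")
    fields = line[len(prefix):].split()
    if len(fields) < 8 or fields[6] != "typ":
        raise ValueError("malformed candidate attribute")
    # Whatever follows the type is a run of name/value pairs
    # (raddr, rport, and any extension attributes).
    extras = dict(zip(fields[8::2], fields[9::2]))
    return {
        "foundation": fields[0],
        "component_id": int(fields[1]),
        "transport": fields[2],
        "priority": int(fields[3]),
        "address": fields[4],
        "port": int(fields[5]),
        "type": fields[7],
        "extras": extras,
    }


# Preferences used by the agents in this example.
TYPE_PREF = {"host": 126, "srflx": 100}
LOCAL_PREF = 65535

offer_lines = [
    "a=candidate:1 1 UDP 2130706431 10.0.1.1 8998 typ host",
    "a=candidate:2 1 UDP 1694498815 192.0.2.3 45664 typ srflx"
    " raddr 10.0.1.1 rport 8998",
]
for line in offer_lines:
    cand = parse_candidate(line)
    expected = candidate_priority(
        TYPE_PREF[cand["type"]], LOCAL_PREF, cand["component_id"])
    print(cand["type"], cand["priority"], cand["priority"] == expected)
```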
3852 When agent L gets the answer, it performs its one and only 3853 connectivity check (messages 10-13). It implements the aggressive 3854 nomination algorithm, and thus includes a USE-CANDIDATE attribute in 3855 this check. Since the check succeeds, agent L creates a new pair, 3856 whose local candidate is from the mapped address in the binding 3857 response (NAT-PUB-1 from message 13) and whose remote candidate is 3858 the destination of the request (R-PUB-1 from message 10). This is 3859 added to the valid list. In addition, it is marked as selected since 3860 the Binding Request contained the USE-CANDIDATE attribute. Since 3861 there is a selected candidate in the Valid list for the one component 3862 of this media stream, ICE processing for this stream moves into the 3863 Completed state. Agent L can now send media if it so chooses. 3865 Soon after receipt of the STUN Binding Request from agent L (message 3866 11), agent R will generate its triggered check. This check happens 3867 to match the next one on its check list - from its host candidate to 3868 agent L's server reflexive candidate. This check (messages 14-17) 3869 will succeed. Consequently, agent R constructs a new candidate pair 3870 using the mapped address from the response as the local candidate 3871 (R-PUB-1) and the destination of the request (NAT-PUB-1) as the 3872 remote candidate. This pair is added to the Valid list for that 3873 media stream. Since the check was generated in the reverse direction 3874 of a check that contained the USE-CANDIDATE attribute, the candidate 3875 pair is marked as selected. Consequently, processing for this stream 3876 moves into the Completed state, and agent R can also send media. 3878 18. Security Considerations 3880 There are several types of attacks possible in an ICE system. This 3881 section considers these attacks and their countermeasures. 
These 3882 countermeasures include: 3884 o Using ICE in conjunction with secure signaling techniques, such as 3885 SIPS. 3887 o Limiting the total number of connectivity checks to 100, and 3888 optionally limiting the number of candidates an agent will accept in an 3889 offer or answer. 3891 18.1. Attacks on Connectivity Checks 3893 An attacker might attempt to disrupt the STUN connectivity checks. 3894 Ultimately, all of these attacks fool an agent into thinking 3895 something incorrect about the results of the connectivity checks. 3896 The possible false conclusions an attacker can try to cause are: 3898 False Invalid: An attacker can fool a pair of agents into thinking a 3899 candidate pair is invalid, when it isn't. This can be used to 3900 cause an agent to prefer a different candidate (such as one 3901 injected by the attacker), or to disrupt a call by forcing all 3902 candidates to fail. 3904 False Valid: An attacker can fool a pair of agents into thinking a 3905 candidate pair is valid, when it isn't. This can cause an agent 3906 to proceed with a session, but then not be able to receive any 3907 media. 3909 False Peer-Reflexive Candidate: An attacker can cause an agent to 3910 discover a new peer reflexive candidate, when it shouldn't have. 3911 This can be used to redirect media streams to a DoS target or to 3912 the attacker, for eavesdropping or other purposes. 3914 False Valid on False Candidate: An attacker has already convinced an 3915 agent that there is a candidate with an address that doesn't 3916 actually route to that agent (for example, by injecting a false 3917 peer reflexive candidate or false server reflexive candidate). It 3918 must then launch an attack that forces the agents to believe that 3919 this candidate is valid. 3921 If an attacker can cause a false peer-reflexive candidate or false 3922 valid on a false candidate, it can launch any of the attacks 3923 described in draft-ietf-behave-rfc3489bis 3924 [I-D.ietf-behave-rfc3489bis].
3926 To force the false invalid result, the attacker has to wait for the 3927 connectivity check from one of the agents to be sent. When it is, 3928 the attacker needs to inject a fake response carrying an unrecoverable 3929 error, such as a 400. However, since the candidate is, in 3930 fact, valid, the original request may reach the peer agent, and 3931 result in a success response. The attacker needs to force this 3932 packet or its response to be dropped, through a DoS attack, layer 2 3933 network disruption, or other technique. If it doesn't do this, the 3934 success response will also reach the originator, alerting it to a 3935 possible attack. Fortunately, this attack is mitigated completely 3936 through the STUN short term credential mechanism. The attacker needs 3937 to inject a fake response, and in order for this response to be 3938 processed, the attacker needs the password. If the offer/answer 3939 signaling is secured, the attacker will not have the password and its 3940 response will be discarded. 3942 Forcing the false valid result works in a similar way. The attacker 3943 needs to wait for the Binding Request from each agent, and inject a 3944 fake success response. The attacker won't need to worry about 3945 disrupting the actual response since, if the candidate is not valid, 3946 it presumably wouldn't be received anyway. However, like the false 3947 invalid attack, this attack is mitigated by the STUN short term 3948 credential mechanism in conjunction with a secure offer/answer 3949 exchange. 3951 Forcing the false peer reflexive candidate result can be done either 3952 with fake requests or responses, or with replays. We consider the 3953 fake requests and responses case first. It requires the attacker to 3954 send a Binding Request to one agent with a source IP address and port 3955 for the false candidate.
   In addition, the attacker must wait for a Binding Request from the other agent, and generate a fake response with an XOR-MAPPED-ADDRESS attribute containing the false candidate. Like the other attacks described here, this attack is mitigated by the STUN message integrity mechanisms and secure offer/answer exchanges.

   Forcing the false peer reflexive candidate result with packet replays is different. The attacker waits until one of the agents sends a check. It intercepts this request, and replays it towards the other agent with a faked source IP address. It must also prevent the original request from reaching the remote agent, either by launching a DoS attack to cause the packet to be dropped, or forcing it to be dropped using layer 2 mechanisms. The replayed packet is received at the other agent, and accepted, since the integrity check passes (the integrity check cannot and does not cover the source IP address and port). It is then responded to. This response will contain an XOR-MAPPED-ADDRESS with the false candidate, and will be sent to that false candidate. The attacker must then receive it and relay it towards the originator.

   The other agent will then initiate a connectivity check towards that false candidate. This validation needs to succeed. This requires the attacker to force a false valid on a false candidate. Injection of fake requests or responses to achieve this goal is prevented using the integrity mechanisms of STUN and the offer/answer exchange. Thus, this attack can only be launched through replays. To do that, the attacker must intercept the check towards this false candidate, and replay it towards the other agent. Then, it must intercept the response and replay that back as well.

   This attack is very hard to launch unless the attacker is identified by the fake candidate.
   This is because it requires the attacker to intercept and replay packets sent by two different hosts. If both agents are on different networks (for example, across the public Internet), this attack can be hard to coordinate, since it needs to occur against two different endpoints on different parts of the network at the same time.

   If the attacker itself is identified by the fake candidate, the attack is easier to coordinate. However, if SRTP is used [RFC3711], the attacker will not be able to play the media packets; it will only be able to discard them, effectively disabling the media stream for the call. However, this attack requires the attacker to disrupt packets in order to block the connectivity check from reaching the target. In that case, if the goal is to disrupt the media stream, it's much easier to just disrupt it with the same mechanism, rather than attack ICE.

18.2.  Attacks on Server Reflexive Address Gathering

   ICE endpoints make use of STUN Binding requests for gathering server reflexive candidates from a STUN server. These requests are not authenticated in any way. As a consequence, there are numerous techniques an attacker can employ to provide the client with a false server reflexive candidate:

   o  An attacker can compromise the DNS, causing DNS queries to return a rogue STUN server address. That server can provide the client with fake server reflexive candidates. This attack is mitigated by DNS security, though DNSSEC is not required to address it.

   o  An attacker that can observe STUN messages (such as an attacker on a shared network segment, like WiFi) can inject a fake response that is valid and will be accepted by the client.

   o  An attacker can compromise a STUN server by means of a virus, and cause it to send responses with incorrect mapped addresses.
   A false mapped address learned by these attacks will be used as a server reflexive candidate in the ICE exchange. For this candidate to actually be used for media, the attacker must also attack the connectivity checks, and in particular, force a false valid on a false candidate. This attack is very hard to launch if the false address identifies a fourth party (neither the offerer, answerer, nor attacker), since it requires attacking the checks generated by each agent in the session, and is prevented by SRTP if it identifies the attacker itself.

   If the attacker elects not to attack the connectivity checks, the worst it can do is prevent the server reflexive candidate from being used. However, if the peer agent has at least one candidate that is reachable by the agent under attack, the STUN connectivity checks themselves will provide a peer reflexive candidate that can be used for the exchange of media. Peer reflexive candidates are generally preferred over server reflexive candidates. As such, an attack solely on the STUN address gathering will normally have no impact on a session at all.

18.3.  Attacks on Relayed Candidate Gathering

   An attacker might attempt to disrupt the gathering of relayed candidates, forcing the client to believe it has a false relayed candidate. Exchanges with the TURN server are authenticated using a long term credential. Consequently, injection of fake responses or requests will not work. In addition, unlike Binding requests, Allocate requests are not susceptible to replay attacks with modified source IP addresses and ports, since the source IP address and port is not utilized to provide the client with its relayed candidate.

   However, TURN servers are susceptible to DNS attacks, or to viruses aimed at the TURN server, for purposes of turning it into a zombie or rogue server.
   These attacks can be mitigated by DNSSEC and through good box and software security on TURN servers.

   Even if an attacker has caused the client to believe in a false relayed candidate, the connectivity checks cause such a candidate to be used only if they succeed. Thus, an attacker must launch a false valid on a false candidate, per above, which is a very difficult attack to coordinate.

18.4.  Attacks on the Offer/Answer Exchanges

   An attacker that can modify or disrupt the offer/answer exchanges themselves can readily launch a variety of attacks with ICE. They could direct media to a target of a DoS attack, they could insert themselves into the media stream, and so on. These are similar to the general security considerations for offer/answer exchanges, and the security considerations in RFC 3264 [RFC3264] apply. These require techniques for message integrity and encryption for offers and answers, which are satisfied by the SIPS mechanism [RFC3261] when SIP is used. As such, the usage of SIPS with ICE is RECOMMENDED.

18.5.  Insider Attacks

   In addition to attacks where the attacker is a third party trying to insert fake offers, answers or STUN messages, there are several attacks possible with ICE when the attacker is an authenticated and valid participant in the ICE exchange.

18.5.1.  The Voice Hammer Attack

   The voice hammer attack is an amplification attack. In this attack, the attacker initiates sessions to other agents, and maliciously includes the IP address and port of a DoS target as the destination for media traffic signaled in the SDP. This causes substantial amplification; a single offer/answer exchange can create a continuing flood of media packets, possibly at high rates (consider video sources). This attack is not specific to ICE, but ICE can help provide remediation.
   Specifically, if ICE is used, the agent receiving the malicious SDP will first perform connectivity checks to the target of media before sending media there. If this target is a third party host, the checks will not succeed, and media is never sent.

   Unfortunately, ICE doesn't help if it's not used, in which case an attacker could simply send the offer without the ICE parameters. However, in environments where the set of clients is known, and limited to ones that support ICE, the server can reject any offers or answers that don't indicate ICE support.

18.5.2.  STUN Amplification Attack

   The STUN amplification attack is similar to the voice hammer. However, instead of voice packets being directed to the target, STUN connectivity checks are directed to the target. The attacker sends an offer with a large number of candidates, say 50. The answerer receives the offer, and starts its checks, which are directed at the target, and consequently, never generate a response. The answerer will start a new connectivity check every Ta ms (say Ta=20ms). However, the retransmission timers are set to a large number due to the large number of candidates. As a consequence, packets will be sent at an interval of one every Ta milliseconds, and then with increasing intervals after that. Thus, STUN will not send packets at a rate faster than media would be sent, and the STUN packets persist only briefly, until ICE fails for the session. Nonetheless, this is an amplification mechanism.

   It is impossible to eliminate the amplification, but the volume can be reduced through a variety of heuristics. Agents SHOULD limit the total number of connectivity checks they perform to 100. Additionally, agents MAY limit the number of candidates they'll accept in an offer or answer.
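
   The two limits above can be illustrated with a minimal sketch. It is not part of the specification: the function names, the candidate strings and the value of MAX_REMOTE_CANDIDATES are hypothetical, and a real agent would order pairs by priority before truncating the list.

```python
from itertools import product

MAX_CONNECTIVITY_CHECKS = 100   # SHOULD-level limit stated in the text
MAX_REMOTE_CANDIDATES = 20      # hypothetical local policy for the MAY-level limit


def accept_candidates(offered, limit=MAX_REMOTE_CANDIDATES):
    """Keep only the first `limit` candidates from an offer or answer."""
    return list(offered)[:limit]


def plan_checks(local, remote, max_checks=MAX_CONNECTIVITY_CHECKS):
    """Pair local and remote candidates and cap the number of checks.

    Pairs are taken in the order given; ordering by pair priority is
    deliberately left out of this sketch.
    """
    pairs = list(product(local, remote))
    return pairs[:max_checks]


# An attacker offers 50 candidates that all point at a DoS target.
offered = [f"192.0.2.1:{5000 + i}" for i in range(50)]
local = ["10.0.0.5:4000", "10.0.0.5:4001", "198.51.100.7:6000"]

accepted = accept_candidates(offered)
print(len(plan_checks(local, offered)))   # 150 possible pairs, capped by the check limit
print(len(plan_checks(local, accepted)))  # fewer pairs once the candidate limit applies
```

   Applying the candidate limit first keeps the pair list small before the check limit is ever reached, which is why the two heuristics are useful together.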
   Frequently, protocols that wish to avoid these kinds of attacks force the initiator to wait for a response prior to sending the next message. However, in the case of ICE, this is not possible. It is not possible to differentiate the following two cases:

   o  There was no response because the initiator is being used to launch a DoS attack against an unsuspecting target that will not respond.

   o  There was no response because the IP address and port is not reachable by the initiator.

   In the second case, another check should be sent at the next opportunity, while in the former case, no further checks should be sent.

18.6.  Interactions with Application Layer Gateways and SIP

   Application Layer Gateways (ALGs) are functions present in a NAT device which inspect the contents of packets and modify them, in order to facilitate NAT traversal for application protocols. Session Border Controllers (SBCs) are close cousins of ALGs, but are less transparent since they actually exist as application layer SIP intermediaries. ICE has interactions with SBCs and ALGs.

   If an ALG is SIP aware but not ICE aware, ICE will work through it as long as the ALG correctly modifies the SDP. A correct ALG implementation behaves as follows:

   o  The ALG does not modify the m and c lines or the rtcp attribute if they contain external addresses.

   o  If the m and c lines contain internal addresses, the modification depends on the state of the ALG:

      If the ALG already has a binding established that maps an external port to an internal IP address and port matching the values in the m and c lines or rtcp attribute, the ALG uses that binding instead of creating a new one.

      If the ALG does not already have a binding, it creates a new one and modifies the SDP, rewriting the m and c lines and rtcp attribute.
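
   The correct ALG behavior listed above can be sketched as follows. This is an illustration only: the Alg class, its fields and the port allocation scheme are hypothetical, "internal" is assumed to mean membership in a configured network, and the sketch assumes that reusing an existing binding also means rewriting the SDP to that binding's external address.

```python
import ipaddress
from dataclasses import dataclass, field


@dataclass
class Alg:
    internal_net: ipaddress.IPv4Network
    external_ip: str
    next_port: int = 40000                        # hypothetical port allocator
    bindings: dict = field(default_factory=dict)  # (internal ip, port) -> external port

    def is_internal(self, ip):
        return ipaddress.ip_address(ip) in self.internal_net

    def rewrite(self, ip, port):
        """Return the (ip, port) to place in the m/c lines or rtcp attribute."""
        if not self.is_internal(ip):
            return ip, port                      # external address: leave the SDP alone
        key = (ip, port)
        if key not in self.bindings:             # no binding yet: create one
            self.bindings[key] = self.next_port
            self.next_port += 1
        return self.external_ip, self.bindings[key]  # existing or newly created binding


alg = Alg(ipaddress.ip_network("10.0.0.0/8"), "198.51.100.7")
print(alg.rewrite("203.0.113.9", 5004))  # external address, returned unchanged
print(alg.rewrite("10.0.0.5", 5004))     # internal address, new binding created
print(alg.rewrite("10.0.0.5", 5004))     # same internal address, binding reused
```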
   Unfortunately, many ALGs are known to work poorly in these corner cases. ICE does not try to work around broken ALGs, as this is outside the scope of its functionality. ICE can help diagnose these conditions, which often show up as a mismatch between the set of candidates and the m and c lines and rtcp attributes. The ice-mismatch attribute is used for this purpose.

   ICE works best through ALGs when the signaling is run over TLS. This prevents the ALG from manipulating the SDP messages and interfering with ICE operation. Implementations which are expected to be deployed behind ALGs SHOULD provide for TLS transport of the SDP.

   If an SBC is SIP aware but not ICE aware, the result depends on the behavior of the SBC. If it is acting as a proper Back-to-Back User Agent (B2BUA), the SBC will remove any SDP attributes it doesn't understand, including the ICE attributes. Consequently, the call will appear to both endpoints as if the other side doesn't support ICE. This will result in ICE being disabled, and media flowing through the SBC, if the SBC has requested it. If, however, the SBC passes the ICE attributes without modification, yet modifies the default destination for media (contained in the m and c lines and rtcp attribute), this will be detected as an ICE mismatch, and ICE processing is aborted for the call. It is outside of the scope of ICE for it to act as a tool for "working around" SBCs. If one is present, ICE will not be used and the SBC techniques take precedence.

19.  STUN Extensions

19.1.  New Attributes

   This specification defines four new attributes, PRIORITY, USE-CANDIDATE, ICE-CONTROLLED and ICE-CONTROLLING.

   The PRIORITY attribute indicates the priority that is to be associated with a peer reflexive candidate, should one be discovered by this check.
   It is a 32-bit unsigned integer, and has an attribute value of 0x0024.

   The USE-CANDIDATE attribute indicates that the candidate pair resulting from this check should be used for transmission of media. The attribute has no content (the Length field of the attribute is zero); it serves as a flag. It has an attribute value of 0x0025.

   The ICE-CONTROLLED attribute is present in a Binding Request, and indicates that the client believes it is currently in the controlled role. The content of the attribute is a 64-bit unsigned integer in network byte order, which contains a random number used for tie-breaking of role conflicts. It has an attribute value of 0x8029.

   The ICE-CONTROLLING attribute is present in a Binding Request, and indicates that the client believes it is currently in the controlling role. The content of the attribute is a 64-bit unsigned integer in network byte order, which contains a random number used for tie-breaking of role conflicts. It has an attribute value of 0x802a.

19.2.  New Error Response Codes

   This specification defines a single error response code:

   487 (Role Conflict):  The Binding Request contained either the ICE-CONTROLLING or ICE-CONTROLLED attribute, indicating a role that conflicted with the server. The server ran a tie-breaker based on the tie-breaker value in the request, and determined that the client needs to switch roles.

20.  Operational Considerations

   This section discusses issues relevant to network operators looking to deploy ICE.

20.1.  NAT and Firewall Types

   ICE was designed to work with existing NAT and firewall equipment. Consequently, it is not necessary to replace or reconfigure existing firewall and NAT equipment in order to facilitate deployment of ICE. Indeed, ICE was developed to be deployed in environments where the VoIP operator has no control over the IP network infrastructure, including firewalls and NAT.
   That said, ICE works best in environments where the NAT devices are "behave" compliant, meeting the recommendations defined in [RFC4787] and [I-D.ietf-behave-tcp]. In networks with behave-compliant NAT, ICE will work without the need for a TURN server, thus improving voice quality, decreasing call setup times, and reducing the bandwidth demands on the network operator.

20.2.  Bandwidth Requirements

   Deployment of ICE can have several interactions with available network capacity that operators should take into consideration.

20.2.1.  STUN and TURN Server Capacity Planning

   First and foremost, ICE makes use of TURN and STUN servers, which would typically be located in the network operator's data centers. The STUN servers require relatively little bandwidth. For each component of each media stream, there will be one or more STUN transactions from each client to the STUN server. In a basic voice-only IPv4 VoIP deployment, there will be four transactions per call (one for RTP and one for RTCP, for both caller and callee). Each transaction is a single request and a single response, the former being 20 bytes long, and the latter, 28. Consequently, if a system has N users, and each makes four calls in a busy hour, this would require N*1.7bps. For one million users, this is 1.7 Mbps, a very small number (relatively speaking).

   TURN traffic is more substantial. The TURN server will see traffic volume equal to the STUN volume (indeed, if TURN servers are deployed, there is no need for a separate STUN server), in addition to the traffic for the actual media traffic. The number of calls requiring TURN for media relay is highly dependent on network topologies, and can and will vary over time. In a network with 100% behave compliant NAT, it is exactly zero.
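
   The per-user estimates in this section can be reproduced with a short calculation. The sketch below is illustrative only; the function and parameter names are hypothetical, and the TURN inputs (80 kbps per direction, 0.2 erlangs, 10 percent of calls relayed) are the ones quoted in the text that follows.

```python
SECONDS_PER_HOUR = 3600


def stun_bps_per_user(calls_per_hour=4, transactions_per_call=4,
                      request_bytes=20, response_bytes=28):
    """Average STUN server load per user during the busy hour, in bits per second."""
    bits = calls_per_hour * transactions_per_call * (request_bytes + response_bytes) * 8
    return bits / SECONDS_PER_HOUR


def turn_bps_per_user(codec_bps=80_000, directions=2, erlangs=0.2, turn_fraction=0.10):
    """Average relayed media load per user, in bits per second."""
    return codec_bps * directions * erlangs * turn_fraction


users = 1_000_000
print(f"STUN: {stun_bps_per_user():.1f} bps per user, "
      f"{stun_bps_per_user() * users / 1e6:.1f} Mbps total")
print(f"TURN: {turn_bps_per_user() / 1e3:.1f} kbps per user, "
      f"{turn_bps_per_user() * users / 1e9:.1f} Gbps total")
```

   The STUN figure works out to 4 x 4 x 48 x 8 = 6144 bits per user per hour, or about 1.7 bps, and the TURN figure to 80 kbps x 2 x 0.2 x 0.1 = 3.2 kbps per user.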
   At time of writing, large-scale consumer deployments were seeing between 5 and 10 percent of calls requiring TURN servers. Considering a voice-only deployment using G.711 (so 80kbps in each direction), with 0.2 erlangs during the busy hour, this is N*3.2kbps. For a population of one million users, this is 3.2Gbps, assuming a 10% usage of TURN servers.

20.2.2.  Gathering and Connectivity Checks

   The process of gathering candidates and performing connectivity checks can be bandwidth intensive. ICE has been designed to pace both of these processes. The gathering phase and the connectivity check phase are meant to generate traffic at roughly the same bandwidth as the media traffic itself. This was done to ensure that, if a network is designed to support multimedia traffic of a certain type (voice, video or just text), it will have sufficient capacity to support the ICE checks for that media. Of course, the ICE checks will cause a marginal increase in the total utilization; however, this will typically be an extremely small increase.

   Congestion due to the gathering and check phases has proven to be a problem in deployments that did not utilize pacing. Typically, access links became congested as the endpoints flooded the network with checks as fast as they could send them. Consequently, network operators should make sure that their ICE implementations support the pacing feature. Though this pacing does increase call setup times, it makes ICE network friendly and easier to deploy.

20.2.3.  Keepalives

   STUN keepalives (in the form of STUN Binding Indications) are sent in the middle of a media session. However, they are sent only in the absence of actual media traffic. In deployments that are not utilizing Voice Activity Detection (VAD), the keepalives are never used and there is no increase in bandwidth usage.
   When VAD is being used, keepalives will be sent during silence periods. This involves a single packet every 15-20 seconds, far less than the packet every 20-30ms that is sent when there is voice. Therefore, keepalives don't have any real impact on capacity planning.

20.3.  ICE and ICE-lite

   Deployments utilizing a mix of ICE and ICE-lite interoperate perfectly. They have been explicitly designed to do so, without loss of function.

   However, ICE-lite can only be deployed in limited use cases. Those cases, and the caveats involved in doing so, are documented in Appendix A.

20.4.  Troubleshooting and Performance Management

   ICE utilizes end-to-end connectivity checks, and places much of the processing in the endpoints. This introduces a challenge to the network operator: how can they troubleshoot ICE deployments? How can they know how ICE is performing?

   ICE has built-in features to help deal with these problems. SIP servers on the signaling path, typically deployed in the data centers of the network operator, will see the contents of the offer/answer exchanges that convey the ICE parameters. These parameters include the type of each candidate (host, server reflexive, or relayed), along with their related addresses. Once ICE processing has completed, an updated offer/answer exchange takes place, signaling the selected address (and its type). This updated re-INVITE is performed exactly for the purposes of educating network equipment (such as a diagnostic tool attached to a SIP server) about the results of ICE processing.

   As a consequence, through the logs generated by the SIP server, a network operator can observe what types of candidates are being used for each call, and what address was selected by ICE. This is the primary information that helps evaluate how ICE is performing.

20.5.  Endpoint Configuration

   ICE relies on several pieces of data being configured into the endpoints. This configuration data includes timers, credentials for TURN servers, and hostnames for STUN and TURN servers. ICE itself does not provide a mechanism for this configuration. Instead, it is assumed that this information is attached to whatever mechanism is used to configure all of the other parameters in the endpoint. For SIP phones, standard solutions such as the configuration framework [I-D.ietf-sipping-config-framework] have been defined.

21.  IANA Considerations

   This specification registers new SDP attributes, four new STUN attributes and one new STUN error response.

21.1.  SDP Attributes

   This specification defines seven new SDP attributes per the procedures of Section 8.2.4 of [RFC4566]. The required information for the registrations is included here.

21.1.1.  candidate Attribute

   Contact Name:  Jonathan Rosenberg, jdrosen@jdrosen.net.

   Attribute Name:  candidate

   Long Form:  candidate

   Type of Attribute:  media level

   Charset Considerations:  The attribute is not subject to the charset attribute.

   Purpose:  This attribute is used with Interactive Connectivity Establishment (ICE), and provides one of many possible candidate addresses for communication. These addresses are validated with an end-to-end connectivity check using Session Traversal Utilities for NAT (STUN).

   Appropriate Values:  See Section 15 of RFC XXXX [Note to RFC-ed: please replace XXXX with the RFC number of this specification].

21.1.2.  remote-candidates Attribute

   Contact Name:  Jonathan Rosenberg, jdrosen@jdrosen.net.

   Attribute Name:  remote-candidates

   Long Form:  remote-candidates

   Type of Attribute:  media level

   Charset Considerations:  The attribute is not subject to the charset attribute.
   Purpose:  This attribute is used with Interactive Connectivity Establishment (ICE), and provides the identity of the remote candidates that the offerer wishes the answerer to use in its answer.

   Appropriate Values:  See Section 15 of RFC XXXX [Note to RFC-ed: please replace XXXX with the RFC number of this specification].

21.1.3.  ice-lite Attribute

   Contact Name:  Jonathan Rosenberg, jdrosen@jdrosen.net.

   Attribute Name:  ice-lite

   Long Form:  ice-lite

   Type of Attribute:  session level

   Charset Considerations:  The attribute is not subject to the charset attribute.

   Purpose:  This attribute is used with Interactive Connectivity Establishment (ICE), and indicates that an agent has the minimum functionality required to support ICE inter-operation with a peer that has a full implementation.

   Appropriate Values:  See Section 15 of RFC XXXX [Note to RFC-ed: please replace XXXX with the RFC number of this specification].

21.1.4.  ice-mismatch Attribute

   Contact Name:  Jonathan Rosenberg, jdrosen@jdrosen.net.

   Attribute Name:  ice-mismatch

   Long Form:  ice-mismatch

   Type of Attribute:  session level

   Charset Considerations:  The attribute is not subject to the charset attribute.

   Purpose:  This attribute is used with Interactive Connectivity Establishment (ICE), and indicates that an agent is ICE capable, but did not proceed with ICE due to a mismatch of candidates with the default destination for media signaled in the SDP.

   Appropriate Values:  See Section 15 of RFC XXXX [Note to RFC-ed: please replace XXXX with the RFC number of this specification].

21.1.5.  ice-pwd Attribute

   Contact Name:  Jonathan Rosenberg, jdrosen@jdrosen.net.

   Attribute Name:  ice-pwd

   Long Form:  ice-pwd

   Type of Attribute:  session or media level

   Charset Considerations:  The attribute is not subject to the charset attribute.
   Purpose:  This attribute is used with Interactive Connectivity Establishment (ICE), and provides the password used to protect STUN connectivity checks.

   Appropriate Values:  See Section 15 of RFC XXXX [Note to RFC-ed: please replace XXXX with the RFC number of this specification].

21.1.6.  ice-ufrag Attribute

   Contact Name:  Jonathan Rosenberg, jdrosen@jdrosen.net.

   Attribute Name:  ice-ufrag

   Long Form:  ice-ufrag

   Type of Attribute:  session or media level

   Charset Considerations:  The attribute is not subject to the charset attribute.

   Purpose:  This attribute is used with Interactive Connectivity Establishment (ICE), and provides the fragments used to construct the username in STUN connectivity checks.

   Appropriate Values:  See Section 15 of RFC XXXX [Note to RFC-ed: please replace XXXX with the RFC number of this specification].

21.1.7.  ice-options Attribute

   Contact Name:  Jonathan Rosenberg, jdrosen@jdrosen.net.

   Attribute Name:  ice-options

   Long Form:  ice-options

   Type of Attribute:  session level

   Charset Considerations:  The attribute is not subject to the charset attribute.

   Purpose:  This attribute is used with Interactive Connectivity Establishment (ICE), and indicates the ICE options or extensions used by the agent.

   Appropriate Values:  See Section 15 of RFC XXXX [Note to RFC-ed: please replace XXXX with the RFC number of this specification].

21.2.  STUN Attributes

   This section registers four new STUN attributes per the procedures in [I-D.ietf-behave-rfc3489bis].

      0x0024 PRIORITY
      0x0025 USE-CANDIDATE
      0x8029 ICE-CONTROLLED
      0x802a ICE-CONTROLLING

21.3.  STUN Error Responses

   This section registers one new STUN error response code per the procedures in [I-D.ietf-behave-rfc3489bis].
      487 Role Conflict:  The client asserted an ICE role (controlling or controlled) that is in conflict with the role of the server.

22.  IAB Considerations

   The IAB has studied the problem of "Unilateral Self Address Fixing", which is the general process by which an agent attempts to determine its address in another realm on the other side of a NAT through a collaborative protocol reflection mechanism [RFC3424]. ICE is an example of a protocol that performs this type of function. Interestingly, the process for ICE is not unilateral, but bilateral, and the difference has a significant impact on the issues raised by the IAB. Indeed, ICE can be considered a B-SAF (Bilateral Self-Address Fixing) protocol, rather than an UNSAF protocol. Regardless, the IAB has mandated that any protocols developed for this purpose document a specific set of considerations. This section meets those requirements.

22.1.  Problem Definition

   From RFC 3424, any UNSAF proposal must provide:

      Precise definition of a specific, limited-scope problem that is to be solved with the UNSAF proposal. A short term fix should not be generalized to solve other problems; this is why "short term fixes usually aren't".

   The specific problems being solved by ICE are:

      Provide a means for two peers to determine the set of transport addresses which can be used for communication.

      Provide a means for an agent to determine an address that is reachable by another peer with which it wishes to communicate.

22.2.  Exit Strategy

   From RFC 3424, any UNSAF proposal must provide:

      Description of an exit strategy/transition plan. The better short term fixes are the ones that will naturally see less and less use as the appropriate technology is deployed.

   ICE itself doesn't easily get phased out.
   However, it is useful even in a globally connected Internet, to serve as a means for detecting whether a router failure has temporarily disrupted connectivity, for example. ICE also helps prevent certain security attacks which have nothing to do with NAT. However, what ICE does is help phase out other UNSAF mechanisms. ICE effectively selects amongst those mechanisms, prioritizing ones that are better, and deprioritizing ones that are worse. Local IPv6 addresses can be preferred. As NATs begin to dissipate as IPv6 is introduced, server reflexive and relayed candidates (both forms of UNSAF addresses) simply never get used, because higher priority connectivity exists to the native host candidates. Therefore, the servers get used less and less, and can eventually be removed when their usage goes to zero.

   Indeed, ICE can assist in the transition from IPv4 to IPv6. It can be used to determine whether to use IPv6 or IPv4 when two dual-stack hosts communicate with SIP (IPv6 gets used). It can also allow a network with both 6to4 and native v6 connectivity to determine which address to use when communicating with a peer.

22.3.  Brittleness Introduced by ICE

   From RFC 3424, any UNSAF proposal must provide:

      Discussion of specific issues that may render systems more "brittle". For example, approaches that involve using data at multiple network layers create more dependencies, increase debugging challenges, and make it harder to transition.

   ICE actually removes brittleness from existing UNSAF mechanisms. In particular, classic STUN (as described in RFC 3489 [RFC3489]) has several points of brittleness. One of them is the discovery process which requires an agent to try to classify the type of NAT it is behind. This process is error-prone. With ICE, that discovery process is simply not used.
   Rather than unilaterally assessing the validity of the address, its validity is dynamically determined by measuring connectivity to a peer. The process of determining connectivity is very robust.

   Another point of brittleness in classic STUN and any other unilateral mechanism is its absolute reliance on an additional server. ICE makes use of a server for allocating unilateral addresses, but allows agents to directly connect if possible. Therefore, in some cases, the failure of a STUN server would still allow for a call to progress when ICE is used.

   Another point of brittleness in classic STUN is that it assumes that the STUN server is on the public Internet. Interestingly, with ICE, that is not necessary. There can be a multitude of STUN servers in a variety of address realms. ICE will discover the one that has provided a usable address.

   The most troubling point of brittleness in classic STUN is that it doesn't work in all network topologies. In cases where there is a shared NAT between each agent and the STUN server, traditional STUN may not work. With ICE, that restriction is removed.

   Classic STUN also introduces some security considerations. Fortunately, those security considerations are also mitigated by ICE.

   Consequently, ICE serves to repair the brittleness introduced in classic STUN, and does not introduce any additional brittleness into the system.

   The penalty of these improvements is that ICE increases session establishment times.

22.4.  Requirements for a Long Term Solution

   From RFC 3424, any UNSAF proposal must provide:

      Identify requirements for longer term, sound technical solutions -- contribute to the process of finding the right longer term solution.

   Our conclusions from RFC 3489 remain unchanged.
However, we feel ICE 4659 actually helps because we believe it can be part of the long term 4660 solution. 4662 22.5. Issues with Existing NAPT Boxes 4664 From RFC 3424, any UNSAF proposal must provide: 4666 Discussion of the impact of the noted practical issues with 4667 existing, deployed NA[P]Ts and experience reports. 4669 A number of NAT boxes are now being deployed into the market which 4670 try and provide "generic" ALG functionality. These generic ALGs hunt 4671 for IP addresses, either in text or binary form within a packet, and 4672 rewrite them if they match a binding. This interferes with classic 4673 STUN. However, the update to STUN [I-D.ietf-behave-rfc3489bis] uses 4674 an encoding which hides these binary addresses from generic ALGs. 4676 Existing NAPT boxes have non-deterministic and typically short 4677 expiration times for UDP-based bindings. This requires 4678 implementations to send periodic keepalives to maintain those 4679 bindings. ICE uses a default of 15s, which is a very conservative 4680 estimate. Eventually, over time, as NAT boxes become compliant to 4681 behave [RFC4787], this minimum keepalive will become deterministic 4682 and well-known, and the ICE timers can be adjusted. Having a way to 4683 discover and control the minimum keepalive interval would be far 4684 better still. 4686 23. Acknowledgements 4688 The authors would like to thank Dan Wing, Eric Rescorla, Flemming 4689 Andreasen, Rohan Mahy, Dean Willis, Eric Cooper, Jason Fischl, 4690 Douglas Otis, Tim Moore, Jean-Francois Mule, Kevin Johns, Jonathan 4691 Lennox and Francois Audet for their comments and input. 
A special 4692 thanks goes to Bill May, who suggested several of the concepts in 4693 this specification, Philip Matthews, who suggested many of the key 4694 performance optimizations in this specification, Eric Rescorla, who 4695 drafted the text in the introduction, and Magnus Westerlund, for 4696 doing several detailed reviews on the various revisions of this 4697 specification. 4699 24. References 4700 24.1. Normative References 4702 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 4703 Requirement Levels", BCP 14, RFC 2119, March 1997. 4705 [RFC3605] Huitema, C., "Real Time Control Protocol (RTCP) attribute 4706 in Session Description Protocol (SDP)", RFC 3605, 4707 October 2003. 4709 [RFC3261] Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston, 4710 A., Peterson, J., Sparks, R., Handley, M., and E. 4711 Schooler, "SIP: Session Initiation Protocol", RFC 3261, 4712 June 2002. 4714 [RFC3264] Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model 4715 with Session Description Protocol (SDP)", RFC 3264, 4716 June 2002. 4718 [RFC3556] Casner, S., "Session Description Protocol (SDP) Bandwidth 4719 Modifiers for RTP Control Protocol (RTCP) Bandwidth", 4720 RFC 3556, July 2003. 4722 [RFC3312] Camarillo, G., Marshall, W., and J. Rosenberg, 4723 "Integration of Resource Management and Session Initiation 4724 Protocol (SIP)", RFC 3312, October 2002. 4726 [RFC4032] Camarillo, G. and P. Kyzivat, "Update to the Session 4727 Initiation Protocol (SIP) Preconditions Framework", 4728 RFC 4032, March 2005. 4730 [RFC4234] Crocker, D. and P. Overell, "Augmented BNF for Syntax 4731 Specifications: ABNF", RFC 4234, October 2005. 4733 [RFC3262] Rosenberg, J. and H. Schulzrinne, "Reliability of 4734 Provisional Responses in Session Initiation Protocol 4735 (SIP)", RFC 3262, June 2002. 4737 [RFC4566] Handley, M., Jacobson, V., and C. Perkins, "SDP: Session 4738 Description Protocol", RFC 4566, July 2006. 4740 [RFC4091] Camarillo, G. and J. 
Rosenberg, "The Alternative Network 4741 Address Types (ANAT) Semantics for the Session Description 4742 Protocol (SDP) Grouping Framework", RFC 4091, June 2005. 4744 [RFC3484] Draves, R., "Default Address Selection for Internet 4745 Protocol version 6 (IPv6)", RFC 3484, February 2003. 4747 [I-D.ietf-behave-rfc3489bis] 4748 Rosenberg, J., Huitema, C., Mahy, R., Matthews, P., and D. 4749 Wing, "Session Traversal Utilities for (NAT) (STUN)", 4750 draft-ietf-behave-rfc3489bis-08 (work in progress), 4751 July 2007. 4753 [I-D.ietf-behave-turn] 4754 Rosenberg, J., "Traversal Using Relays around NAT (TURN): 4755 Relay Extensions to Session Traversal Utilities for NAT 4756 (STUN)", draft-ietf-behave-turn-04 (work in progress), 4757 July 2007. 4759 [I-D.ietf-sip-ice-option-tag] 4760 Rosenberg, J., "Indicating Support for Interactive 4761 Connectivity Establishment (ICE) in the Session 4762 Initiation Protocol (SIP)", 4763 draft-ietf-sip-ice-option-tag-02 (work in progress), 4764 June 2007. 4766 24.2. Informative References 4768 [RFC3489] Rosenberg, J., Weinberger, J., Huitema, C., and R. Mahy, 4769 "STUN - Simple Traversal of User Datagram Protocol (UDP) 4770 Through Network Address Translators (NATs)", RFC 3489, 4771 March 2003. 4773 [RFC3235] Senie, D., "Network Address Translator (NAT)-Friendly 4774 Application Design Guidelines", RFC 3235, January 2002. 4776 [RFC3303] Srisuresh, P., Kuthan, J., Rosenberg, J., Molitor, A., and 4777 A. Rayhan, "Middlebox communication architecture and 4778 framework", RFC 3303, August 2002. 4780 [RFC3725] Rosenberg, J., Peterson, J., Schulzrinne, H., and G. 4781 Camarillo, "Best Current Practices for Third Party Call 4782 Control (3pcc) in the Session Initiation Protocol (SIP)", 4783 BCP 85, RFC 3725, April 2004. 4785 [RFC3102] Borella, M., Lo, J., Grabelsky, D., and G. Montenegro, 4786 "Realm Specific IP: Framework", RFC 3102, October 2001. 4788 [RFC3103] Borella, M., Grabelsky, D., Lo, J., and K. 
Taniguchi, 4789 "Realm Specific IP: Protocol Specification", RFC 3103, 4790 October 2001. 4792 [RFC3424] Daigle, L. and IAB, "IAB Considerations for UNilateral 4793 Self-Address Fixing (UNSAF) Across Network Address 4794 Translation", RFC 3424, November 2002. 4796 [RFC3550] Schulzrinne, H., Casner, S., Frederick, R., and V. 4797 Jacobson, "RTP: A Transport Protocol for Real-Time 4798 Applications", RFC 3550, July 2003. 4800 [RFC3711] Baugher, M., McGrew, D., Naslund, M., Carrara, E., and K. 4801 Norrman, "The Secure Real-time Transport Protocol (SRTP)", 4802 RFC 3711, March 2004. 4804 [RFC3056] Carpenter, B. and K. Moore, "Connection of IPv6 Domains 4805 via IPv4 Clouds", RFC 3056, February 2001. 4807 [RFC3389] Zopf, R., "Real-time Transport Protocol (RTP) Payload for 4808 Comfort Noise (CN)", RFC 3389, September 2002. 4810 [RFC3960] Camarillo, G. and H. Schulzrinne, "Early Media and Ringing 4811 Tone Generation in the Session Initiation Protocol (SIP)", 4812 RFC 3960, December 2004. 4814 [RFC2475] Blake, S., Black, D., Carlson, M., Davies, E., Wang, Z., 4815 and W. Weiss, "An Architecture for Differentiated 4816 Services", RFC 2475, December 1998. 4818 [RFC1918] Rekhter, Y., Moskowitz, R., Karrenberg, D., Groot, G., and 4819 E. Lear, "Address Allocation for Private Internets", 4820 BCP 5, RFC 1918, February 1996. 4822 [RFC4787] Audet, F. and C. Jennings, "Network Address Translation 4823 (NAT) Behavioral Requirements for Unicast UDP", BCP 127, 4824 RFC 4787, January 2007. 4826 [I-D.ietf-mmusic-connectivity-precon] 4827 Andreasen, F., "Connectivity Preconditions for Session 4828 Description Protocol Media Streams", 4829 draft-ietf-mmusic-connectivity-precon-02 (work in 4830 progress), June 2006. 4832 [I-D.ietf-avt-rtp-no-op] 4833 Andreasen, F., "A No-Op Payload Format for RTP", 4834 draft-ietf-avt-rtp-no-op-04 (work in progress), May 2007. 4836 [I-D.ietf-avt-rtp-and-rtcp-mux] 4837 Perkins, C. and M. 
Westerlund, "Multiplexing RTP Data and 4838 Control Packets on a Single Port", 4839 draft-ietf-avt-rtp-and-rtcp-mux-07 (work in progress), 4840 August 2007. 4842 [RFC4340] Kohler, E., Handley, M., and S. Floyd, "Datagram 4843 Congestion Control Protocol (DCCP)", RFC 4340, March 2006. 4845 [RFC4103] Hellstrom, G. and P. Jones, "RTP Payload for Text 4846 Conversation", RFC 4103, June 2005. 4848 [I-D.ietf-sip-outbound] 4849 Jennings, C. and R. Mahy, "Managing Client Initiated 4850 Connections in the Session Initiation Protocol (SIP)", 4851 draft-ietf-sip-outbound-10 (work in progress), July 2007. 4853 [I-D.ietf-behave-tcp] 4854 Guha, S., "NAT Behavioral Requirements for TCP", 4855 draft-ietf-behave-tcp-07 (work in progress), April 2007. 4857 [I-D.ietf-sipping-config-framework] 4858 Petrie, D. and S. Channabasappa, "A Framework for Session 4859 Initiation Protocol User Agent Profile Delivery", 4860 draft-ietf-sipping-config-framework-12 (work in progress), 4861 June 2007. 4863 [I-D.ietf-mmusic-ice-tcp] 4864 Rosenberg, J., "TCP Candidates with Interactive 4865 Connectivity Establishment (ICE)", 4866 draft-ietf-mmusic-ice-tcp-04 (work in progress), 4867 July 2007. 4869 Appendix A. Lite and Full Implementations 4871 ICE allows for two types of implementations. A full implementation 4872 supports the controlling and controlled roles in a session, and can 4873 also perform address gathering. In contrast, a lite implementation 4874 is a minimalist implementation that does little but respond to STUN 4875 checks. 4877 Because ICE requires both endpoints to support it in order to bring 4878 benefits to either endpoint, incremental deployment of ICE in a 4879 network is more complicated. Many sessions involve an endpoint which 4880 is, by itself, not behind a NAT and not one that would worry about 4881 NAT traversal. A very common case is to have one endpoint that 4882 requires NAT traversal (such as a VoIP hard phone or soft phone) make 4883 a call to one of these devices.
Even if the phone supports a full 4884 ICE implementation, ICE won't be used at all if the other device 4885 doesn't support it. The lite implementation allows for a low-cost 4886 entry point for these devices. Once they support the lite 4887 implementation, full implementations can connect to them and get the 4888 full benefits of ICE. 4890 Consequently, a lite implementation is only appropriate for devices 4891 that will *always* be connected to the public Internet and have a 4892 public IP address at which they can receive packets from any 4893 correspondent. ICE will not function when a lite implementation is 4894 placed behind a NAT. 4896 ICE allows a lite implementation to have a single IPv4 host candidate 4897 and several IPv6 addresses. In that case, candidate pairs are 4898 selected by the controlling agent using a static algorithm, such as 4899 the one in RFC 3484, which is recommended by this specification. 4900 However, static mechanisms for address selection are always prone to 4901 error, since they cannot ever reflect the actual topology and can 4902 never provide actual guarantees on connectivity. They are always 4903 heuristics. Consequently, if an agent is implementing ICE just to 4904 select between its IPv4 and IPv6 addresses, and none of its IP 4905 addresses are behind NAT, usage of full ICE is still RECOMMENDED in 4906 order to provide the most robust form of address selection possible. 4908 It is important to note that the lite implementation was added to 4909 this specification to provide a stepping stone to full 4910 implementation. Even for devices that are always connected to the 4911 public Internet with just a single IPv4 address, a full 4912 implementation is preferable if achievable. A full implementation 4913 will reduce call setup times, since ICE's aggressive mode can be 4914 used.
Full implementations also obtain the security benefits of ICE 4915 unrelated to NAT traversal; in particular, the voice hammer attack 4916 described in Section 18 is prevented only for full implementations, 4917 not lite. Finally, it is often the case that a device which finds 4918 itself with a public address today will be placed in a network 4919 tomorrow where it will be behind a NAT. It is difficult to 4920 definitively know, over the lifetime of a device or product, that it 4921 will always be used on the public Internet. Full implementation 4922 provides assurance that communications will always work. 4924 Appendix B. Design Motivations 4926 ICE contains a number of normative behaviors which may themselves be 4927 simple, but derive from complicated or non-obvious thinking or use 4928 cases which merit further discussion. Since these design motivations 4929 are not necessary to understand for purposes of implementation, they 4930 are discussed here in an appendix to the specification. This section 4931 is non-normative. 4933 B.1. Pacing of STUN Transactions 4935 STUN transactions used to gather candidates and to verify 4936 connectivity are paced out at an approximate rate of one new 4937 transaction every Ta milliseconds. Each transaction, in turn, has a 4938 retransmission timer RTO that is a function of Ta as well. Why are 4939 these transactions paced, and why are these formulas used? 4941 Sending of these STUN requests will often have the effect of creating 4942 bindings on NAT devices between the client and the STUN servers. 4943 Experience has shown that many NAT devices have upper limits on the 4944 rate at which they will create new bindings. Experiments have shown 4945 that once every 20ms is well supported, but not much lower than that. 4946 This is why Ta has a lower bound of 20ms. Furthermore, transmission 4947 of these packets on the network makes use of bandwidth and needs to 4948 be rate limited by the agent.
Deployments based on earlier drafts of 4949 this document tended to overload rate-constrained access links and 4950 perform poorly overall, in addition to negatively impacting the 4951 network. As a consequence, the pacing ensures that the NAT device 4952 does not get overloaded and that traffic is kept at a reasonable 4953 rate. 4955 The definition of a "reasonable" rate is that STUN should not use 4956 more bandwidth than the RTP itself will use, once media starts 4957 flowing. The formula for Ta is designed so that, if a STUN packet 4958 were sent every Ta milliseconds, it would consume the same amount of 4959 bandwidth as RTP packets, summed across all media streams. Of 4960 course, STUN has retransmits, and the desire is to pace those as 4961 well. For this reason, RTO is set such that the first retransmit on 4962 the first transaction happens just as the first STUN request on the 4963 last transaction occurs. Pictorially: 4965 First Packets Retransmits 4967 | | 4968 | | 4969 -------+------ -------+------ 4970 / \ / \ 4971 / \ / \ 4973 +--+ +--+ +--+ +--+ +--+ +--+ 4974 |A1| |B1| |C1| |A2| |B2| |C2| 4975 +--+ +--+ +--+ +--+ +--+ +--+ 4977 ---+-------+-------+-------+-------+-------+------------ Time 4978 0 Ta 2Ta 3Ta 4Ta 5Ta 4980 In this picture, there are three transactions that will be sent (for 4981 example, in the case of candidate gathering, there are three host 4982 candidate/STUN server pairs). These are transactions A, B and C. The 4983 retransmit timer is set so that the first retransmission on the first 4984 transaction (packet A2) is sent at time 3Ta. 4986 Subsequent retransmits after the first will occur even less 4987 frequently than Ta milliseconds apart, since STUN uses an exponential 4988 back-off on its retransmissions. 4990 B.2. Candidates with Multiple Bases 4992 Section 4.1.3 talks about eliminating candidates that have the same 4993 transport address and base.
However, candidates with the same 4994 transport addresses but different bases are not redundant. When can 4995 an agent have two candidates that have the same IP address and port, 4996 but different bases? Consider the topology of Figure 30: 4998 +----------+ 4999 | STUN Srvr| 5000 +----------+ 5001 | 5002 | 5003 ----- 5004 // \\ 5005 | | 5006 | B:net10 | 5007 | | 5008 \\ // 5009 ----- 5010 | 5011 | 5012 +----------+ 5013 | NAT | 5014 +----------+ 5015 | 5016 | 5017 ----- 5018 // \\ 5019 | A | 5020 |192.168/16 | 5021 | | 5022 \\ // 5023 ----- 5024 | 5025 | 5026 |192.168.1.100 ----- 5027 +----------+ // \\ +----------+ 5028 | | | | | | 5029 | Offerer |---------| C:net10 |-----------| Answerer | 5030 | |10.0.1.100| | 10.0.1.101 | | 5031 +----------+ \\ // +----------+ 5032 ----- 5034 Figure 30: Identical Candidates with Different Bases 5036 In this case, the offerer is multi-homed. It has one IP address, 5037 10.0.1.100, on network C, which is a net 10 private network. The 5038 Answerer is on this same network. The offerer is also connected to 5039 network A, which is 192.168/16. The offerer has an IP address of 5040 192.168.1.100 on this network. There is a NAT on this network, 5041 natting into network B, which is another net 10 private network, but 5042 not connected to network C. There is a STUN server on network B. 5044 The offerer obtains a host candidate on its IP address on network C 5045 (10.0.1.100:2498) and a host candidate on its IP address on network A 5046 (192.168.1.100:3344). It performs a STUN query to its configured 5047 STUN server from 192.168.1.100:3344. This query passes through the 5048 NAT, which happens to assign the binding 10.0.1.100:2498. The STUN 5049 server reflects this in the STUN Binding Response. Now, the offerer 5050 has obtained a server reflexive candidate with a transport address 5051 that is identical to a host candidate (10.0.1.100:2498).
However, 5052 the server reflexive candidate has a base of 192.168.1.100:3344, and 5053 the host candidate has a base of 10.0.1.100:2498. 5055 B.3. Purpose of the <rel-addr> and <rel-port> Attributes 5057 The candidate attribute contains two values that are not used at all 5058 by ICE itself - <rel-addr> and <rel-port>. Why are they present? 5060 There are two motivations for their inclusion. The first is 5061 diagnostic. It is very useful to know the relationship between the 5062 different types of candidates. By including them, an agent can know 5063 which relayed candidate is associated with which reflexive candidate, 5064 which in turn is associated with a specific host candidate. When 5065 checks for one candidate succeed and not the others, this provides 5066 useful diagnostics on what is going on in the network. 5068 The second reason has to do with off-path Quality of Service (QoS) 5069 mechanisms. When ICE is used in environments such as PacketCable 5070 2.0, proxies will, in addition to performing normal SIP operations, 5071 inspect the SDP in SIP messages, and extract the IP address and port 5072 for media traffic. They can then interact, through policy servers, 5073 with access routers in the network, to establish guaranteed QoS for 5074 the media flows. This QoS is provided by classifying the RTP traffic 5075 based on 5-tuple, and then providing it a guaranteed rate, or marking 5076 its Diffserv codepoints appropriately. When a residential NAT is 5077 present, and a relayed candidate gets selected for media, this 5078 relayed candidate will be a transport address on an actual TURN 5079 server. That address says nothing about the actual transport address 5080 in the access router that would be used to classify packets for QoS 5081 treatment. Rather, the server reflexive candidate towards the TURN 5082 server is needed. By carrying the translation in the SDP, the proxy 5083 can use that transport address to request QoS from the access router. 5085 B.4.
Importance of the STUN Username 5087 ICE requires the usage of message integrity with STUN using its short 5088 term credential functionality. The actual short term credential is 5089 formed by exchanging username fragments in the SDP offer/answer 5090 exchange. The need for this mechanism goes beyond just security; it 5091 is actually required for correct operation of ICE in the first place. 5093 Consider agents L, R, and Z. L and R are within private enterprise 1, 5094 which is using 10.0.0.0/8. Z is within private enterprise 2, which 5095 is also using 10.0.0.0/8. As it turns out, R and Z both have IP 5096 address 10.0.1.1. L sends an offer to Z. Z, in its answer, provides 5097 L with its host candidates. In this case, those candidates are 5098 10.0.1.1:8866 and 10.0.1.1:8877. As it turns out, R is in a session 5099 at that same time, and is also using 10.0.1.1:8866 and 10.0.1.1:8877 5100 as host candidates. This means that R is prepared to accept STUN 5101 messages on those ports, just as Z is. L will send a STUN request to 5102 10.0.1.1:8866 and another to 10.0.1.1:8877. However, these do 5103 not go to Z as expected. Instead, they go to R! If R just replied 5104 to them, L would believe it has connectivity to Z, when in fact it 5105 has connectivity to a completely different user, R. To fix this, the 5106 STUN short term credential mechanisms are used. The username 5107 fragments are sufficiently random that it is highly unlikely that R 5108 would be using the same values as Z. Consequently, R would reject the 5109 STUN request since the credentials were invalid. In essence, the 5110 STUN username fragments provide a form of transient host identifiers, 5111 bound to a particular offer/answer session. 5113 An unfortunate consequence of the non-uniqueness of IP addresses is 5114 that, in the above example, R might not even be an ICE agent.
It 5115 could be any host, and the port to which the STUN packet is directed 5116 could be any ephemeral port on that host. If there is an application 5117 listening on this socket for packets, and it is not prepared to 5118 handle malformed packets for whatever protocol is in use, the 5119 operation of that application could be affected. Fortunately, since 5120 the ports exchanged in SDP are ephemeral and usually drawn from the 5121 dynamic or registered range, the odds are good that the port is not 5122 used to run a server on host R, but rather is the agent side of some 5123 protocol. This decreases the probability of hitting an allocated 5124 port, due to the transient nature of port usage in this range. 5125 However, the possibility of a problem does exist, and network 5126 deployers should be prepared for it. Note that this is not a problem 5127 specific to ICE; stray packets can arrive at a port at any time for 5128 any type of protocol, especially ones on the public Internet. As 5129 such, this requirement is just restating a general design guideline 5130 for Internet applications - be prepared for unknown packets on any 5131 port. 5133 B.5. The Candidate Pair Priority Formula 5135 The priority for a candidate pair has an odd form. It is: 5137 pair priority = 2^32*MIN(G,D) + 2*MAX(G,D) + (G>D?1:0) 5139 Why is this? When the candidate pairs are sorted based on this 5140 value, the resulting sorting has the MAX/MIN property. This means 5141 that the pairs are first sorted based on decreasing value of the 5142 minimum of the two priorities. For pairs that have the same value of 5143 the minimum priority, the maximum priority is used to sort amongst 5144 them. If the max and the min priorities are the same, the 5145 controlling agent's priority is used as the tie breaker in the last 5146 part of the expression. 
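The formula and the sort order described so far can be illustrated with a minimal Python sketch. It assumes that G is the priority of the candidate supplied by the controlling agent and D the priority of the candidate supplied by the controlled agent, which this appendix does not restate. The names pair_priority and sort_pairs are hypothetical helpers for illustration only and are not defined by this specification.

```python
def pair_priority(g: int, d: int) -> int:
    """Pair priority per the formula above.

    Assumption: g is the controlling agent's candidate priority and
    d is the controlled agent's candidate priority.
    """
    return (1 << 32) * min(g, d) + 2 * max(g, d) + (1 if g > d else 0)


def sort_pairs(pairs):
    """Order (g, d) tuples from highest to lowest pair priority."""
    return sorted(pairs, key=lambda p: pair_priority(p[0], p[1]), reverse=True)


if __name__ == "__main__":
    # The minimum of each pair dominates, the maximum breaks ties between
    # equal minimums, and the controlling side breaks the final tie.
    for g, d in sort_pairs([(100, 50), (60, 60), (200, 10), (50, 100)]):
        print(g, d, pair_priority(g, d))
```

In this sketch the pair (60, 60) sorts first because its minimum is largest, and (100, 50) sorts ahead of (50, 100) only because of the final tie-breaker term.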
The factor of 2^32 is used since the 5147 priority of a single candidate is always less than 2^32, resulting in 5148 the pair priority being a "concatenation" of the two component 5149 priorities. This creates the MAX/MIN sorting. MAX/MIN ensures that, 5150 for a particular agent, a lower priority candidate is never used 5151 until all higher priority candidates have been tried. 5153 B.6. The remote-candidates attribute 5155 The a=remote-candidates attribute exists to eliminate a race 5156 condition between the updated offer and the response to the STUN 5157 Binding Request that moved a candidate into the Valid list. This 5158 race condition is shown in Figure 31. On receipt of message 4, agent 5159 L adds a candidate pair to the Valid list. If there was only a 5160 single media stream with a single component, agent L could now send 5161 an updated offer. However, the check from agent R has not yet 5162 generated a response, and agent R receives the updated offer (message 5163 7) before getting the response (message 9). Thus, it does not yet 5164 know that this particular pair is valid. To eliminate this 5165 condition, the actual candidates at R that were selected by the 5166 offerer (the remote candidates) are included in the offer itself, and 5167 the answerer delays its answer until those pairs validate. 5169 Agent A Network Agent B 5170 |(1) Offer | | 5171 |------------------------------------------>| 5172 |(2) Answer | | 5173 |<------------------------------------------| 5174 |(3) STUN Req. | | 5175 |------------------------------------------>| 5176 |(4) STUN Res. | | 5177 |<------------------------------------------| 5178 |(5) STUN Req. | | 5179 |<------------------------------------------| 5180 |(6) STUN Res. | | 5181 |-------------------->| | 5182 | |Lost | 5183 |(7) Offer | | 5184 |------------------------------------------>| 5185 |(8) STUN Req. | | 5186 |<------------------------------------------| 5187 |(9) STUN Res.
| | 5188 |------------------------------------------>| 5189 |(10) Answer | | 5190 |<------------------------------------------| 5192 Figure 31: Race Condition Flow 5194 B.7. Why are Keepalives Needed? 5196 Once media begins flowing on a candidate pair, it is still necessary 5197 to keep the bindings alive at intermediate NATs for the duration of 5198 the session. Normally, the media stream packets themselves (e.g., 5199 RTP) meet this objective. However, several cases merit further 5200 discussion. Firstly, in some RTP usages, such as SIP, the media 5201 streams can be "put on hold". This is accomplished by using the SDP 5202 "sendonly" or "inactive" attributes, as defined in RFC 3264 5203 [RFC3264]. RFC 3264 directs implementations to cease transmission of 5204 media in these cases. However, doing so may cause NAT bindings to 5205 time out, and media won't be able to come off hold. 5207 Secondly, some RTP payload formats, such as the payload format for 5208 text conversation [RFC4103], may send packets so infrequently that 5209 the interval exceeds the NAT binding timeouts. 5211 Thirdly, if silence suppression is in use, long periods of silence 5212 may cause media transmission to cease sufficiently long for NAT 5213 bindings to time out. 5215 For these reasons, the media packets themselves cannot be relied 5216 upon. ICE defines a simple periodic keepalive that operates 5217 independently of media transmission. This makes its bandwidth 5218 requirements highly predictable, and thus amenable to QoS 5219 reservations. 5221 B.8. Why Prefer Peer Reflexive Candidates? 5223 Section 4.1.2 describes procedures for computing the priority of a 5224 candidate based on its type and local preferences. That section 5225 requires that the type preference for peer reflexive candidates 5226 always be higher than server reflexive. Why is that? The reason has 5227 to do with the security considerations in Section 18.
It is much 5228 easier for an attacker to cause an agent to use a false server 5229 reflexive candidate than it is for an attacker to cause an agent to 5230 use a false peer reflexive candidate. Consequently, attacks against 5231 address gathering with Binding requests are thwarted by ICE by 5232 preferring the peer reflexive candidates. 5234 B.9. Why Send an Updated Offer? 5236 Section 11.1 describes rules for sending media. Both agents can send 5237 media once ICE checks complete, without waiting for an updated offer. 5238 Indeed, the only purpose of the updated offer is to "correct" the SDP 5239 so that the default destination for media matches where media is 5240 being sent based on ICE procedures (which will be the highest 5241 priority nominated candidate pair). 5243 This begs the question - why is the updated offer/answer exchange 5244 needed at all? Indeed, in a pure offer/answer environment, it would 5245 not be. The offerer and answerer will agree on the candidates to use 5246 through ICE, and then can begin using them. As far as the agents 5247 themselves are concerned, the updated offer/answer provides no new 5248 information. However, in practice, numerous components along the 5249 signaling path look at the SDP information. These include entities 5250 performing off-path QoS reservations, NAT traversal components such 5251 as ALGs and Session Border Controllers (SBCs) and diagnostic tools 5252 that passively monitor the network. For these tools to continue to 5253 function without change, the core property of SDP - that the 5254 existing, pre-ICE definitions of the addresses used for media - the m 5255 and c lines and the rtcp attribute - must be retained. For this 5256 reason, an updated offer must be sent. 5258 B.10. Why are Binding Indications Used for Keepalives? 5260 Media keepalives are described in Section 10. These keepalives make 5261 use of STUN when both endpoints are ICE capable. 
However, rather 5262 than using a Binding Request transaction (which generates a 5263 response), the keepalives use an Indication. Why is that? 5264 The primary reason has to do with network QoS mechanisms. Once media 5265 begins flowing, network elements will assume that the media stream 5266 has a fairly regular structure, making use of periodic packets at 5267 fixed intervals, with the possibility of jitter. If an agent is 5268 sending media packets, and then receives a Binding Request, it would 5269 need to generate a response packet along with its media packets. 5270 This will increase the actual bandwidth requirements for the 5-tuple 5271 carrying the media packets, and introduce jitter in the delivery of 5272 those packets. Analysis has shown that this is a concern in certain 5273 layer 2 access networks that use fairly tight packet schedulers for 5274 media. 5276 Additionally, using a Binding Indication allows integrity to be 5277 disabled, allowing for better performance. This is useful for large 5278 scale endpoints, such as PSTN gateways and SBCs. 5280 B.11. Why is the Conflict Resolution Mechanism Needed? 5282 When ICE runs between two peers, one agent acts as controlled, and 5283 the other as controlling. Rules are defined as a function of 5284 implementation type and offerer/answerer to determine who is 5285 controlling and who is controlled. However, the specification 5286 mentions that, in some cases, both sides might believe they are 5287 controlling, or both sides might believe they are controlled. How 5288 can this happen? 5290 The condition when both agents believe they are controlling shows up 5291 in third party call control cases.
Consider the following flow: 5293 A Controller B 5294 |(1) INV() | | 5295 |<-------------| | 5296 |(2) 200(SDP1) | | 5297 |------------->| | 5298 | |(3) INV() | 5299 | |------------->| 5300 | |(4) 200(SDP2) | 5301 | |<-------------| 5302 |(5) ACK(SDP2) | | 5303 |<-------------| | 5304 | |(6) ACK(SDP1) | 5305 | |------------->| 5307 Figure 32: Role Conflict Flow 5309 This flow is a variation on flow III of RFC 3725 [RFC3725]. In fact, 5310 it works better than flow III since it produces fewer messages. In 5311 this flow, the controller sends an offerless INVITE to agent A, which 5312 responds with its offer, SDP1. The controller then sends an offerless 5313 INVITE to agent B, which responds with its offer, SDP2. The 5314 controller then uses the offer from each agent to generate the 5315 answers. When this flow is used, ICE will run between agents A and 5316 B, but both will believe they are in the controlling role. With the 5317 role conflict resolution procedures, this flow will function properly 5318 when ICE is used. 5320 At this time, there are no documented flows which can result in the 5321 case where both agents believe they are controlled. However, the 5322 conflict resolution procedures allow for this case, should a flow 5323 arise which would fit into this category. 5325 Author's Address 5327 Jonathan Rosenberg 5328 Cisco 5329 Edison, NJ 5330 US 5332 Phone: +1 973 952-5000 5333 Email: jdrosen@cisco.com 5334 URI: http://www.jdrosen.net 5336 Full Copyright Statement 5338 Copyright (C) The IETF Trust (2007). 5340 This document is subject to the rights, licenses and restrictions 5341 contained in BCP 78, and except as set forth therein, the authors 5342 retain all their rights.
5344 This document and the information contained herein are provided on an 5345 "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS 5346 OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND 5347 THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS 5348 OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF 5349 THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED 5350 WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. 5352 Intellectual Property 5354 The IETF takes no position regarding the validity or scope of any 5355 Intellectual Property Rights or other rights that might be claimed to 5356 pertain to the implementation or use of the technology described in 5357 this document or the extent to which any license under such rights 5358 might or might not be available; nor does it represent that it has 5359 made any independent effort to identify any such rights. Information 5360 on the procedures with respect to rights in RFC documents can be 5361 found in BCP 78 and BCP 79. 5363 Copies of IPR disclosures made to the IETF Secretariat and any 5364 assurances of licenses to be made available, or the result of an 5365 attempt made to obtain a general license or permission for the use of 5366 such proprietary rights by implementers or users of this 5367 specification can be obtained from the IETF on-line IPR repository at 5368 http://www.ietf.org/ipr. 5370 The IETF invites any interested party to bring to its attention any 5371 copyrights, patents or patent applications, or other proprietary 5372 rights that may cover technology that may be required to implement 5373 this standard. Please address the information to the IETF at 5374 ietf-ipr@ietf.org. 5376 Acknowledgment 5378 Funding for the RFC Editor function is provided by the IETF 5379 Administrative Support Activity (IASA).