idnits 2.17.1 

draft-ietf-mmusic-ice-tcp-08.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** The document seems to lack a License Notice according IETF Trust
     Provisions of 28 Dec 2009, Section 6.b.i or Provisions of 12 Sep 2009
     Section 6.b -- however, there's a paragraph with a matching beginning.
     Boilerplate error?

     (You're using the IETF Trust Provisions' Section 6.b License Notice from
     12 Feb 2009 rather than one of the newer Notices.  See
     https://trustee.ietf.org/license-info/.)


  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack a both a reference to RFC 2119 and the
     recommended RFC 2119 boilerplate, even if it appears to use RFC 2119
     keywords. 

     RFC 2119 keyword, line 161: '...ng.  Therefore, it is RECOMMENDED that...'
     RFC 2119 keyword, line 263: '...   in [RFC4145] for constructing the offer.  However, the offerer MUST...'
     RFC 2119 keyword, line 272: '...ither be UDP or TCP), the agent SHOULD...'
     RFC 2119 keyword, line 281: '...or choice, it is RECOMMENDED that agen...'
     RFC 2119 keyword, line 313: '...   Each agent SHOULD "obtain" an activ...'
     (57 more instances...)


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer. 
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (October 13, 2009) is 5306 days in the past.  Is this
     intentional?


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  ** Obsolete normative reference: RFC 5389 (Obsoleted by RFC 8489)

  ** Obsolete normative reference: RFC 4572 (Obsoleted by RFC 8122)


     Summary: 4 errors (**), 0 flaws (~~), 1 warning (==), 2 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	MMUSIC                                                 S. Perreault, Ed.
3	Internet-Draft                                                  Viagenie
4	Intended status: Standards Track                            J. Rosenberg
5	Expires: April 16, 2010                                            Cisco
6	                                                        October 13, 2009

8	    TCP Candidates with Interactive Connectivity Establishment (ICE)
9	                      draft-ietf-mmusic-ice-tcp-08

11	Status of this Memo

13	   This Internet-Draft is submitted to IETF in full conformance with the
14	   provisions of BCP 78 and BCP 79.

16	   Internet-Drafts are working documents of the Internet Engineering
17	   Task Force (IETF), its areas, and its working groups.  Note that
18	   other groups may also distribute working documents as Internet-
19	   Drafts.

21	   Internet-Drafts are draft documents valid for a maximum of six months
22	   and may be updated, replaced, or obsoleted by other documents at any
23	   time.  It is inappropriate to use Internet-Drafts as reference
24	   material or to cite them other than as "work in progress."

26	   The list of current Internet-Drafts can be accessed at
27	   http://www.ietf.org/ietf/1id-abstracts.txt.

29	   The list of Internet-Draft Shadow Directories can be accessed at
30	   http://www.ietf.org/shadow.html.

32	   This Internet-Draft will expire on April 16, 2010.

34	Copyright Notice

36	   Copyright (c) 2009 IETF Trust and the persons identified as the
37	   document authors.  All rights reserved.

39	   This document is subject to BCP 78 and the IETF Trust's Legal
40	   Provisions Relating to IETF Documents in effect on the date of
41	   publication of this document (http://trustee.ietf.org/license-info).
42	   Please review these documents carefully, as they describe your rights
43	   and restrictions with respect to this document.

45	Abstract

47	   Interactive Connectivity Establishment (ICE) defines a mechanism for
48	   NAT traversal for multimedia communication protocols based on the
49	   offer/answer model of session negotiation.  ICE works by providing a
50	   set of candidate transport addresses for each media stream, which are
51	   then validated with peer-to-peer connectivity checks based on Session
52	   Traversal Utilities for NAT (STUN).  ICE provides a general framework
53	   for describing candidates, but only defines UDP-based transport
54	   protocols.  This specification extends ICE to TCP-based media,
55	   including the ability to offer a mix of TCP and UDP-based candidates
56	   for a single stream.

58	Table of Contents

60	   1.  Introduction
61	   2.  Overview of Operation
62	   3.  Sending the Initial Offer
63	     3.1.  Gathering Candidates
64	     3.2.  Prioritization
65	     3.3.  Choosing Default Candidates
66	     3.4.  Encoding the SDP
67	   4.  Receiving the Initial Offer
68	     4.1.  Verifying ICE Support
69	     4.2.  Forming the Check Lists
70	   5.  Connectivity Checks
71	     5.1.  STUN Client Procedures
72	       5.1.1.  Sending the Request
73	     5.2.  STUN Server Procedures
74	   6.  Concluding ICE Processing
75	   7.  Subsequent Offer/Answer Exchanges
76	     7.1.  ICE Restarts
77	   8.  Media Handling
78	     8.1.  Sending Media
79	     8.2.  Receiving Media
80	   9.  Connection Management
81	     9.1.  Connections Formed During Connectivity Checks
82	     9.2.  Connections formed for Gathering Candidates
83	   10. Security Considerations
84	   11. IANA Considerations
85	   12. Acknowledgements
86	   13. References
87	     13.1. Normative References
88	     13.2. Informative References
89	   Appendix A.  Implementation Considerations for BSD Sockets
90	   Authors' Addresses

92	1.  Introduction

94	   Interactive Connectivity Establishment (ICE) [I-D.ietf-mmusic-ice]
95	   defines a mechanism for NAT traversal for multimedia communication
96	   protocols based on the offer/answer model [RFC3264] of session
97	   negotiation.  ICE works by providing a set of candidate transport
98	   addresses for each media stream, which are then validated with peer-
99	   to-peer connectivity checks based on Session Traversal Utilities for
100	   NAT (STUN) [RFC5389].  However, ICE only defines procedures for UDP-
101	   based transport protocols.

103	   There are many reasons why ICE support for TCP is important.
104	   Firstly, there are media protocols that only run over TCP.  Examples
105	   of such protocols are web and application sharing and instant
106	   messaging [RFC4975].  For these protocols to work in the presence of
107	   NAT, unless they define their own NAT traversal mechanisms, ICE
108	   support for TCP is needed.  In addition, RTP itself can run over TCP
109	   [RFC4571].  Typically, it is preferable to run RTP over UDP, and not
110	   TCP.  However, in a variety of network environments, overly
111	   restrictive NAT and firewall devices prevent UDP-based communications
112	   altogether, but general TCP-based communications are permitted.  In
113	   such environments, sending RTP over TCP, and thus establishing the
114	   media session, may be preferable to having it fail altogether.  With
115	   this specification, agents can gather UDP and TCP candidates for an
116	   RTP-based stream, list the UDP ones with higher priority, and then
117	   only use the TCP-based ones if the UDP ones fail.  This provides a
118	   fallback mechanism that allows multimedia communications to be highly
119	   reliable.

121	   The usage of RTP over TCP is particularly useful when combined with
122	   Traversal Using Relay NAT [I-D.ietf-behave-turn].  In this case, one
123	   of the agents would connect to its TURN server using TCP, and obtain
124	   a TCP-based relayed candidate.  It would offer this to its peer agent
125	   as a candidate.  The answerer would initiate a TCP connection towards
126	   the TURN server.  When that connection is established, media can flow
127	   over the connections, through the TURN server.  The benefit of this
128	   usage is that it only requires the agents to make outbound TCP
129	   connections to a server on the public network.  This kind of
130	   operation is broadly interoperable through NAT and firewall devices.
131	   Since it is a goal of ICE and this extension to provide highly
132	   reliable communications that "just work" in as broad a set of
133	   network deployments as possible, this use case is particularly
134	   important.

136	   This specification extends ICE by defining its usage with TCP
137	   candidates.  It also defines how ICE can be used with RTP and SRTP to
138	   provide both TCP and UDP candidates.  This specification does so by
139	   following the outline of ICE itself, and calling out the additions
140	   and changes necessary in each section of ICE to support TCP
141	   candidates.

143	2.  Overview of Operation

145	   The usage of ICE with TCP is relatively straightforward.  The main
146	   area of specification is around how and when connections are opened,
147	   and how those connections relate to candidate pairs.

149	   When the agents perform address allocations to gather TCP-based
150	   candidates, three types of candidates can be obtained.  These are
151	   active candidates, passive candidates, and simultaneous-open
152	   candidates.  An active candidate is one for which the agent will
153	   attempt to open an outbound connection, but will not receive incoming
154	   connection requests.  A passive candidate is one for which the agent
155	   will receive incoming connection attempts, but not attempt a
156	   connection.  A simultaneous-open candidate is one for which the agent
157	   will attempt to open a connection simultaneously with its peer.

159	      Note: It has been reported that the simultaneous-open technique
160	      has a low success rate (~40%) with the population of NAT devices
161	      in use as of this writing.  Therefore, it is RECOMMENDED that
162	      implementations of this specification acquire and use IPv6 host
163	      candidates.  Means of doing so across NATs include Tunnel Setup
164	      Protocol [I-D.blanchet-v6ops-tunnelbroker-tsp], Teredo [RFC4380],
165	      IPsec NAT-T [RFC3947], and others.

167	   Unlike UDP, there is no lite implementation defined for TCP.
168	   Instead, an implementation that meets the criteria for a lite
169	   implementation as discussed in Appendix A of [I-D.ietf-mmusic-ice]
170	   can just use the mechanisms defined in [RFC4145], with constraints
171	   defined here on selection of attribute values.

173	   When gathering candidates from a host interface, the agent typically
174	   obtains active, passive, and simultaneous-open candidates.
175	   Similarly, communications with a STUN server will provide server
176	   reflexive and relayed versions of all three types.  Connections to
177	   the STUN server are kept open during ICE processing.

179	   When encoding these candidates into offers and answers, the type of
180	   the candidate is signaled.  In the case of active candidates, an IP
181	   address and port is present, but it is meaningless, as it is ignored
182	   by the peer.  As a consequence, active candidates do not need to be
183	   physically allocated at the time of address gathering.  Rather, the
184	   physical allocations, which occur as a consequence of a connection
185	   attempt, occur at the time of the connectivity checks.

187	   When the candidates are paired together, active candidates are always
188	   paired with passive, and simultaneous-open candidates with each
189	   other.  When a connectivity check is to be made on a candidate pair,
190	   each agent determines whether it is to make a connection attempt for
191	   this pair.

193	      Why have both active and simultaneous-open candidates?  Why not
194	      just simultaneous-open?  The reason is that NAT treatment of
195	      simultaneous opens is currently not well defined, though
196	      specifications are being developed to address this [RFC5382].
197	      Some NATs block the second TCP SYN packet or improperly process
198	      the subsequent SYNACK, which will cause the connection attempt to
199	      fail.  Therefore, if only simultaneous opens are used, connections
200	      may often fail.  Alternatively, using unidirectional opens (where
201	      one side is active and the other is passive) is more reliable, but
202	      will always require a relay if both sides are behind NAT.
203	      Therefore, in the spirit of the ICE philosophy, both are tried.
204	      Simultaneous opens are preferred since, if they do work, they will
205	      not require a relay even when both sides are behind different
206	      NATs.

208	   The actual process of generating connectivity checks, managing the
209	   state of the check list, and updating the Valid list works
210	   identically for TCP as it does for UDP.

212	   ICE requires an agent to demultiplex STUN and application layer
213	   traffic, since they appear on the same port.  This demultiplexing is
214	   described by ICE, and is done using the magic cookie and other fields
215	   of the message.  Stream-oriented transports introduce another
216	   wrinkle, since they require a way to frame the connection so that the
217	   application and STUN packets can be extracted in order to determine
218	   which is which.  For this reason, TCP media streams utilizing ICE use
219	   the basic framing provided in RFC 4571 [RFC4571], even if the
220	   application layer protocol is not RTP.
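
   As a non-normative illustration of the demultiplexing step, the
   sketch below classifies one payload that has already been extracted
   from the RFC 4571 framing.  It assumes the STUN header layout of
   [RFC5389] (two leading zero bits, a 16-bit length that is a multiple
   of 4, and the magic cookie 0x2112A442 in bytes 4 to 7).  The
   function name looks_like_stun is hypothetical, and a real agent
   would apply the further checks that ICE describes.

```python
import struct

STUN_MAGIC_COOKIE = 0x2112A442  # fixed value defined by RFC 5389
STUN_HEADER_LEN = 20            # type(2) + length(2) + cookie(4) + transaction ID(12)


def looks_like_stun(payload: bytes) -> bool:
    """Heuristically decide whether a deframed payload is a STUN message."""
    if len(payload) < STUN_HEADER_LEN:
        return False
    msg_type, msg_len, cookie = struct.unpack("!HHI", payload[:8])
    if msg_type & 0xC000:
        # The two most significant bits of a STUN message are always zero.
        return False
    if cookie != STUN_MAGIC_COOKIE:
        return False
    # The length field counts the bytes after the 20-byte header and is
    # always a multiple of 4.
    return msg_len % 4 == 0 and msg_len == len(payload) - STUN_HEADER_LEN
```

   Anything the function rejects would be handed to the media or D/TLS
   layer; the sketch does not try to tell those apart.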

222	   When TLS is in use (for non-RTP traffic) or DTLS (for RTP traffic),
223	   it runs over the RFC 4571 framing shim, so that STUN runs outside of
224	   the D/TLS connection (D/TLS is shorthand for TLS or DTLS).
225	   Pictorially:

227	                                  +----------+
228	                                  |          |
229	                                  |    App   |
230	                       +----------+----------+
231	                       |          |          |
232	                       |   STUN   |  D/TLS   |
233	                       +----------+----------+
234	                       |                     |
235	                       |      RFC 4571       |
236	                       +---------------------+
237	                       |                     |
238	                       |         TCP         |
239	                       +---------------------+
240	                       |                     |
241	                       |         IP          |
242	                       +---------------------+

244	                          Figure 1: ICE TCP Stack

246	   The implication of this is that, for any media stream protected by
247	   D/TLS, the agent will first run ICE procedures, exchanging STUN
248	   messages.  Then, once ICE completes, D/TLS procedures begin.  ICE and
249	   D/TLS are thus "peers" in the protocol stack.  The STUN messages are
250	   not sent over the D/TLS connection, even ones sent for the purposes
251	   of keepalive in the middle of the media session.

253	   When an updated offer is generated by the controlling endpoint, the
254	   SDP extensions for connection oriented media [RFC4145] are used to
255	   signal that an existing connection should be used, rather than
256	   opening a new one.

258	3.  Sending the Initial Offer

260	   If an offerer meets the criteria for lite as defined in Appendix A of
261	   [I-D.ietf-mmusic-ice], it omits any ICE attributes for its TCP-based
262	   media streams.  Instead, the offerer follows the procedures defined
263	   in [RFC4145] for constructing the offer.  However, the offerer MUST
264	   use a setup attribute of "actpass" for those streams.

266	   For offerers making use of ICE for TCP streams, the procedures below
267	   are used.

269	3.1.  Gathering Candidates

271	   For each TCP capable media stream the agent wishes to use (including
272	   ones, like RTP, which can either be UDP or TCP), the agent SHOULD
273	   obtain two host candidates (each on a different port) for each
274	   component of the media stream on each interface that the host has -
275	   one for the simultaneous open, and one for the passive candidate.  If
276	   an agent is not capable of acting in one of these modes it would omit
277	   those candidates.

279	   Providers of real-time communications services may decide that it is
280	   preferable to have no media at all than it is to have media over TCP.
281	   To allow for choice, it is RECOMMENDED that agents be configurable
282	   with whether they obtain TCP candidates for real time media.

284	      Having it be configurable, and then configuring it to be off, is
285	      far better than not having the capability at all.  An important
286	      goal of this specification is to provide a single mechanism that
287	      can be used across all types of endpoints.  As such, it is
288	      preferable to account for provider and network variation through
289	      configuration, instead of hard-coded limitations in an
290	      implementation.  Furthermore, network characteristics and
291	      connectivity assumptions can, and will, change over time.  Just
292	      because an agent is communicating with a server on the public
293	      network today, doesn't mean that it won't need to communicate with
294	      one behind a NAT tomorrow.  Just because an agent is behind a NAT
295	      with endpoint independent mapping today, doesn't mean that tomorrow
296	      they won't pick up their agent and take it to a public network
297	      access point where there is a NAT with address and port dependent
298	      mapping properties, or one that only allows outbound TCP.  The way
299	      to handle these cases and build a reliable system is for agents to
300	      implement a diverse set of techniques for allocating addresses, so
301	      that at least one of them is almost certainly going to work in any
302	      situation.  Implementors should consider very carefully any
303	      assumptions that they make about deployments before electing not
304	      to implement one of the mechanisms for address allocation.  In
305	      particular, implementors should consider whether the elements in
306	      the system may be mobile, and connect through different networks
307	      with different connectivity.  They should also consider whether
308	      endpoints which are under their control, in terms of location and
309	      network connectivity, would always be under their control.  In
310	      environments where mobility and user control are possible, a
311	      multiplicity of techniques is essential for reliability.

313	   Each agent SHOULD "obtain" an active host candidate for each
314	   component of each TCP capable media stream on each interface that the
315	   host has.  The agent does not have to actually allocate a port for
316	   these candidates.  These candidates serve as a placeholder for the
317	   creation of the check lists.
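
   The host candidate gathering described so far can be sketched as
   follows.  This is an illustration only: the HostCandidate type, the
   gather_host_candidates function, and the allocate_port callback are
   hypothetical names, and the placeholder port 9 for active candidates
   is borrowed from the encoding rule in Section 3.4.  The callback
   stands in for whatever mechanism the agent uses to bind a distinct
   local port.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class HostCandidate:
    interface_addr: str
    component_id: int
    direction: str  # "so", "pass", or "act"
    port: int


def gather_host_candidates(interface_addrs, component_ids, allocate_port):
    """Return host TCP candidates for every interface and component.

    allocate_port(addr) is assumed to return a fresh port on that address.
    """
    candidates = []
    for addr in interface_addrs:
        for component_id in component_ids:
            # Simultaneous-open and passive candidates each get their own port.
            for direction in ("so", "pass"):
                port = allocate_port(addr)
                candidates.append(HostCandidate(addr, component_id, direction, port))
            # The active candidate is only a placeholder; no port is allocated.
            candidates.append(HostCandidate(addr, component_id, "act", 9))
    return candidates
```

   An agent that cannot act in one of the modes would simply leave the
   corresponding direction out of the inner loop.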

319	   Next, the agent SHOULD take all host TCP candidates for a component
320	   that have the same foundation (there will typically be two - a
321	   passive and a simultaneous-open), and amongst them, pick two
322	   arbitrarily.  These two host candidates will be used to obtain
323	   relayed and server reflexive candidates.  To do that, the agent
324	   initiates a TCP connection from each candidate to the TURN server
325	   (resulting in two TCP connections).  On each connection, it issues an
326	   Allocate request.  One of the resulting relayed candidates is used as
327	   a passive relayed candidate, and the other, as a simultaneous-open
328	   relayed candidate.  In addition, the Allocate responses will provide
329	   the agent with a server reflexive candidate for their corresponding
330	   host candidate.

332	   For all of the remaining host candidates, if any, the agent only
333	   needs to obtain server reflexive candidates.  To do that, it
334	   initiates a TCP connection from each host candidate to a STUN server,
335	   and uses a Binding request over that connection to learn the server
336	   reflexive candidate corresponding to that host candidate.

338	   Once the Allocate or Binding request has completed, the agent MUST
339	   keep the TCP connection open until ICE processing has completed.  See
340	   Appendix A for important implementation guidelines.

342	   If a media stream is UDP-based (such as RTP), an agent MAY use an
343	   additional host TCP candidate to request a UDP-based candidate from a
344	   TURN server.  Usage of the UDP candidate from the TURN server follows
345	   the procedures defined in ICE for UDP candidates.

347	   Each agent SHOULD "obtain" an active relayed candidate for each
348	   component of each TCP capable media stream on each interface that the
349	   host has.  The agent does not have to actually allocate a port for
350	   these candidates from the relay at this time.  These candidates serve
351	   as a placeholder for the creation of the check lists.

353	   Like its UDP counterparts, TCP-based STUN transactions are paced out
354	   at one every Ta seconds.  This pacing refers strictly to STUN
355	   transactions (both Binding and Allocate requests).  If performance of
356	   the transaction requires establishment of a TCP connection, then the
357	   connection gets opened when the transaction is performed.

359	3.2.  Prioritization

361	   The transport protocol itself is a criterion for choosing one
362	   candidate over another.  If a particular media stream can run over
363	   UDP or TCP, the UDP candidates might be preferred over the TCP
364	   candidates.  This allows ICE to use the lower latency UDP
365	   connectivity if it exists, but fall back to TCP if UDP doesn't work.

367	   To accomplish this, the local preference SHOULD be defined as:

369	   local-preference = (2^12)*(transport-pref) +
370	                      (2^9)*(direction-pref) +
371	                      (2^0)*(other-pref)

373	   Transport-pref is the relative preference for candidates with this
374	   particular transport protocol (UDP or TCP), and direction-pref is the
375	   preference for candidates with this particular establishment
376	   directionality (active, passive, or simultaneous-open).  Other-pref
377	   is used as a differentiator when two candidates would otherwise have
378	   identical local preferences.

380	   Transport-pref MUST be between 0 and 15, with 15 being the most
381	   preferred.  Direction-pref MUST be between 0 and 7, with 7 being the
382	   most preferred.  Other-pref MUST be between 0 and 511, with 511 being
383	   the most preferred.  For RTP-based media streams, it is RECOMMENDED
384	   that UDP have a transport-pref of 15 and TCP of 6.  It is RECOMMENDED
385	   that, for all connection-oriented media, simultaneous-open candidates
386	   have a direction-pref of 7, active of 5 and passive of 2.  If any two
387	   candidates have the same type-preference, transport-pref, and
388	   direction-pref, they MUST have a unique other-pref.  With this
389	   specification, the only way that can happen is with multi-homed
390	   hosts, in which case other-pref is a preference amongst interfaces.
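
   The local preference computation can be sketched as below.  The
   function and constant names are hypothetical; the ranges and the
   recommended values are the ones given in this section.  With the
   maximum value in every field the result is 65535, so the local
   preference fits in 16 bits.

```python
# Recommended values from this section (RTP-based streams for transport).
RECOMMENDED_TRANSPORT_PREF = {"udp": 15, "tcp": 6}
RECOMMENDED_DIRECTION_PREF = {"so": 7, "act": 5, "pass": 2}


def local_preference(transport_pref: int, direction_pref: int, other_pref: int = 0) -> int:
    """Combine the three preferences as (2^12)*t + (2^9)*d + (2^0)*o."""
    if not 0 <= transport_pref <= 15:
        raise ValueError("transport-pref must be between 0 and 15")
    if not 0 <= direction_pref <= 7:
        raise ValueError("direction-pref must be between 0 and 7")
    if not 0 <= other_pref <= 511:
        raise ValueError("other-pref must be between 0 and 511")
    return (transport_pref << 12) + (direction_pref << 9) + other_pref
```

   For example, a TCP simultaneous-open candidate with other-pref 0
   yields 6*4096 + 7*512 = 28160, which ranks above TCP active (27136)
   and TCP passive (25600).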

392	3.3.  Choosing Default Candidates

394	   The default candidate is chosen primarily based on the likelihood of
395	   it working with a non-ICE peer.  When media streams supporting mixed
396	   modes (both TCP and UDP) are used with ICE, it is RECOMMENDED that,
397	   for real-time streams (such as RTP), the default candidates be UDP-
398	   based.  However, the default SHOULD NOT be the simultaneous-open
399	   candidate.

401	   If a media stream is inherently TCP-based, the agent MUST select the
402	   active candidate as default.  This ensures proper directionality of
403	   connection establishment for NAT traversal with non-ICE
404	   implementations.

406	3.4.  Encoding the SDP

408	   TCP-based candidates are encoded into a=candidate lines identically
409	   to the UDP encoding described in [I-D.ietf-mmusic-ice].  However, the
410	   transport protocol is set to "tcp-so" for TCP simultaneous-open
411	   candidates, "tcp-act" for TCP active candidates, and "tcp-pass" for
412	   TCP passive candidates.  The addr encoded into the candidate
413	   attribute for active candidates MUST be set to the IP address that
414	   will be used for the attempt, but the port MUST be set to 9 (i.e.,
415	   Discard).  For active relayed candidates, the value for addr must be
416	   identical to the IP address of a passive or simultaneous-open
417	   candidate from the same TURN server.
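
   A minimal sketch of this encoding for host candidates follows.  It
   assumes the field order of the candidate attribute defined in
   [I-D.ietf-mmusic-ice] (foundation, component ID, transport,
   priority, address, port, type) and omits the related address fields
   that reflexive and relayed candidates carry.  The function name
   candidate_line is hypothetical.

```python
DISCARD_PORT = 9  # port value required for active candidates

TRANSPORT_TOKENS = {"so": "tcp-so", "act": "tcp-act", "pass": "tcp-pass"}


def candidate_line(foundation, component_id, direction, priority, addr, port, cand_type="host"):
    """Build an a=candidate line for a TCP candidate of the given direction."""
    transport = TRANSPORT_TOKENS[direction]
    if direction == "act":
        # The peer ignores the port of an active candidate, so it is fixed to 9.
        port = DISCARD_PORT
    return (
        f"a=candidate:{foundation} {component_id} {transport} "
        f"{priority} {addr} {port} typ {cand_type}"
    )
```

   Whatever port the caller passes for an active candidate is replaced,
   so the same call can be used for all three directions.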

419	   If the default candidate is TCP, the agent MUST include the a=setup
420	   and a=connection attributes from RFC 4145 [RFC4145], following the
421	   procedures defined there as if ICE was not in use.  In particular, if
422	   an agent is the answerer, the a=setup attribute MUST meet the
423	   constraints in RFC 4145 based on the value in the offer.  Since an
424	   ICE-tcp offerer always uses the active candidate as default, an ICE-
425	   tcp answerer will always use the passive attribute as default and
426	   include the a=setup:passive attribute in the answer.

428	   If an agent is utilizing SRTP [RFC3711], it MAY include a mix of UDP
429	   and TCP candidates.  If ICE selects a TCP candidate pair, the agent
430	   MUST still utilize SRTP, but run over the connection established by
431	   ICE.  The alternative, RTP over TLS, MUST NOT be used.  This allows
432	   for the higher layer protocols (the security handshakes and media
433	   transport) to be independent of the underlying transport protocol.
434	   In the case of DTLS-SRTP [I-D.ietf-avt-dtls-srtp], the directionality
435	   attributes (a=setup) are utilized strictly to determine the direction
436	   of the DTLS handshake.  Directionality of the TCP connection
437	   establishment is determined by the ICE attributes and procedures
438	   defined here.

440	   If an agent is securing non-RTP media over TCP/TLS, the SDP MUST be
441	   constructed as described in RFC 4572 [RFC4572].  The directionality
442	   attributes (a=setup) are utilized strictly to determine the direction
443	   of the TLS handshake.  Directionality of the TCP connection
444	   establishment is determined by the ICE attributes and procedures
445	   defined here.

447	4.  Receiving the Initial Offer

449	4.1.  Verifying ICE Support

451	   Since this specification does not define a lite mode for ICE-tcp, a
452	   lite implementation will include candidate attributes for its UDP
453	   streams, but no such attributes for its TCP streams.  An agent
454	   receiving such an offer MUST proceed with ICE in this case.  ICE will
455	   be used for the UDP streams, and [RFC4145] procedures will be used
456	   for the TCP streams.  However, if the offer indicates a setup
457	   direction of actpass, the answerer MUST utilize a=setup:active in the
458	   answer.  This is required to ensure proper directionality of
459	   connection establishment to work through NAT.

461	   Similarly, if an agent is lite, and receives an offer that includes
462	   streams with TCP candidates, it will omit candidates from the answer
463	   for those streams.  This will cause [RFC4145] procedures to be used
464	   for those streams.  In this case, the offer will indicate a direction
465	   of active, and the agent will use passive in its answer.

467	4.2.  Forming the Check Lists

469	   When forming candidate pairs, the following types of candidates can
470	   be paired with each other:

472	   Local             Remote
473	   Candidate         Candidate
474	   ----------------------------
475	   tcp-so           tcp-so
476	   tcp-act          tcp-pass
477	   tcp-pass         tcp-act

479	   When the agent prunes the check list, it MUST also remove any pair
480	   for which the local candidate is tcp-pass.
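
   The pairing table and the pruning rule can be sketched as below.
   Candidates are represented as hypothetical (transport, address,
   port) tuples, and the sketch deliberately ignores the other pairing
   constraints that ICE imposes, such as matching components and
   address families.

```python
from itertools import product

# (local transport, remote transport) combinations that may be paired.
PAIRABLE = {
    ("tcp-so", "tcp-so"),
    ("tcp-act", "tcp-pass"),
    ("tcp-pass", "tcp-act"),
}


def form_tcp_pairs(local_candidates, remote_candidates):
    """Pair every local candidate with every compatible remote candidate."""
    return [
        (local, remote)
        for local, remote in product(local_candidates, remote_candidates)
        if (local[0], remote[0]) in PAIRABLE
    ]


def prune_tcp_pairs(pairs):
    """Drop pairs whose local candidate is tcp-pass, as required when pruning."""
    return [(local, remote) for local, remote in pairs if local[0] != "tcp-pass"]
```

   After pruning, an agent is left only with pairs for which it may
   itself initiate the connection: tcp-so with tcp-so, and tcp-act with
   tcp-pass.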

482	   The remainder of check list processing works like the UDP case.

484	5.  Connectivity Checks

486	5.1.  STUN Client Procedures

488	5.1.1.  Sending the Request

490	   When an agent wants to send a TCP-based connectivity check, it first
491	   opens a TCP connection if none yet exists for the 5-tuple defined by
492	   the candidate pair for which the check is to be sent.  This
493	   connection is opened from the local candidate of the pair to the
494	   remote candidate of the pair.  If the local candidate is tcp-act, the
495	   agent MUST open a connection from the interface associated with that
496	   local candidate.  This connection MUST be opened from an unallocated
497	   port.  For host candidates, this is readily done by connecting from
498	   the candidate's interface.  For relayed candidates, the agent uses the
499	   procedures in [I-D.ietf-behave-turn] to initiate a new connection
500	   from the specified interface on the TURN server.
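
   For host candidates that are tcp-act, the "open a connection if
   none yet exists" rule might be sketched as follows.  The class and
   function names are hypothetical, the table is keyed on the local
   address and remote transport address as a stand-in for the 5-tuple,
   and relayed candidates (which go through the TURN server) are not
   covered.  The opener is injectable so that the bookkeeping can be
   exercised without a network.

```python
import socket


def open_from_interface(local_addr, remote_addr, remote_port):
    """Connect from local_addr, letting the OS choose the local port."""
    # Port 0 in source_address asks the OS for a currently unallocated port.
    return socket.create_connection(
        (remote_addr, remote_port), timeout=5.0, source_address=(local_addr, 0)
    )


class CheckConnections:
    """Keeps at most one TCP connection per active candidate pair."""

    def __init__(self, opener=open_from_interface):
        self._opener = opener
        self._connections = {}

    def connection_for(self, local_addr, remote_addr, remote_port):
        key = (local_addr, remote_addr, remote_port)
        conn = self._connections.get(key)
        if conn is None:
            # An OSError from the opener propagates to the caller, which
            # would then treat the connectivity check as failed.
            conn = self._opener(local_addr, remote_addr, remote_port)
            self._connections[key] = conn
        return conn
```

   Repeated checks on the same pair therefore reuse the first
   connection instead of opening a new one each time.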

502	   Once the connection is established, the agent MUST utilize the shim
503	   defined in RFC 4571 [RFC4571] for as long as this connection
504	   remains open.  The STUN Binding requests and responses are sent on top
505	   of this shim, so that the length field defined in RFC 4571 precedes
506	   each STUN message.  If TLS or DTLS-SRTP is to be utilized for the
507	   media session, the TLS or DTLS-SRTP handshakes will take place on top
508	   of this shim as well.  However, they only start once ICE processing
509	   has completed.  In essence, the TLS or DTLS-SRTP handshakes are
510	   considered a part of the media protocol.  STUN is never run within
511	   the TLS or DTLS-SRTP session.
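
   The RFC 4571 shim prefixes each packet with a 16-bit length in
   network byte order.  The sketch below shows one possible way to
   write and read such frames on a byte stream; the names frame and
   Deframer are hypothetical, and the payload is treated as opaque
   (STUN, media, or D/TLS records alike).

```python
import struct
from typing import List

MAX_FRAME_PAYLOAD = 0xFFFF  # largest value the 16-bit length field can carry


def frame(payload: bytes) -> bytes:
    """Prefix a packet with its 16-bit big-endian length."""
    if len(payload) > MAX_FRAME_PAYLOAD:
        raise ValueError("payload too large for a 16-bit length field")
    return struct.pack("!H", len(payload)) + payload


class Deframer:
    """Reassembles frames from arbitrarily chunked TCP reads."""

    def __init__(self) -> None:
        self._buffer = bytearray()

    def feed(self, data: bytes) -> List[bytes]:
        """Add received bytes and return every frame that is now complete."""
        self._buffer += data
        frames = []
        while len(self._buffer) >= 2:
            (length,) = struct.unpack_from("!H", self._buffer, 0)
            if len(self._buffer) < 2 + length:
                break  # wait for the rest of this frame
            frames.append(bytes(self._buffer[2:2 + length]))
            del self._buffer[:2 + length]
        return frames
```

   Because TCP does not preserve message boundaries, feed may return
   zero, one, or several frames per call.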

513	   If the TCP connection cannot be established, the check is considered
514	   to have failed, and a full-mode agent MUST update the pair state to
515	   Failed in the check list.

517	   Once the connection is established, client procedures are identical
518	   to those for UDP candidates.  Note that STUN responses received on an
519	   active TCP candidate will typically produce a remote peer reflexive
520	   candidate.

522	5.2.  STUN Server Procedures

524	   An agent MUST be prepared to receive incoming TCP connection requests
525	   on any host or relayed TCP candidate that is simultaneous-open or
526	   passive.  When the connection request is received, the agent MUST
527	   accept it.  The agent MUST utilize the framing defined in RFC 4571
528	   [RFC4571] for the lifetime of this connection.  Due to this framing,
529	   the agent will receive data in discrete frames.  Each frame could be
530	   media (such as RTP or SRTP), TLS, DTLS, or STUN packets.  The STUN
531	   packets are extracted as described in Section 8.2.

533	   Once the connection is established, STUN server procedures are
534	   identical to those for UDP candidates.  Note that STUN requests
535	   received on a passive TCP candidate will typically produce a remote
536	   peer reflexive candidate.

538	6.  Concluding ICE Processing

540	   If there are TCP candidates for a media stream, a controlling agent
541	   MUST use a regular selection algorithm.

543	   When ICE processing for a media stream completes, each agent SHOULD
544	   close all TCP connections except the one between the candidate pairs
545	   selected by ICE.

547	      These two rules are related; the closure of connections on
548	      completion of ICE implies that a regular selection algorithm has
549	      to be used.  This is because aggressive selection might cause
550	      transient pairs to be selected.  Once such a pair is selected,
551	      the agents would close the other connections, one of which may be
552	      about to be selected as a better choice.  This race condition may
553	      result in TCP connections being accidentally closed for the pair
554	      that ICE selects.

556	7.  Subsequent Offer/Answer Exchanges

558	7.1.  ICE Restarts

560	   If an ICE restart occurs for a media stream with TCP candidate pairs
561	   that have been selected by ICE, the agents MUST NOT close the
562	   connections after the restart.  In the offer or answer that causes
563	   the restart, an agent MAY include a simultaneous-open candidate whose
564	   transport address matches the previously selected candidate.  If both
565	   agents do this, the result will be a simultaneous-open candidate pair
566	   matching an existing TCP connection.  In this case, the agents MUST
567	   NOT attempt to open a new connection (or start new TLS or DTLS-SRTP
568	   procedures).  Instead, that existing connection is reused and STUN
569	   checks are performed.

571	   Once the restart completes, if the selected pair does not match the
572	   previously selected pair, the TCP connection for the previously
573	   selected pair SHOULD be closed by the agent.

575	8.  Media Handling

577	8.1.  Sending Media

579	   When sending media, if the selected candidate pair matches an
580	   existing TCP connection, that connection MUST be used for sending
581	   media.

583	   The framing defined in RFC 4571 MUST be used when sending media.  For
584	   media streams that are not RTP-based and do not normally use RFC
585	   4571, the agent treats the media stream as a byte stream, and assumes
586	   that it has its own framing of some sort.  It then takes an arbitrary
587	   number of bytes from the bytestream and places them as the payload
588	   of an RFC 4571 frame, including the length.  Next, the sender checks
589	   to see if the resulting set of bytes would be viewed as a STUN packet
590	   based on the rules in sections 6 and 8 of [RFC5389].  This includes a
591	   check on the most significant two bits, the magic cookie, the length,
592	   and the fingerprint.  If, based on those rules, the bytes would be
593	   viewed as a STUN message, the sender SHOULD utilize a different
594	   number of bytes so that the length checks will fail.  Though it is
595	   normally highly unlikely that an arbitrary number of bytes from a
596	   bytestream would resemble a STUN packet based on all of the checks,
597	   it can happen if the content of the application stream happens to
598	   contain a STUN message (for example, a file transfer of logs from a
599	   client which includes STUN messages).
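
   The sender-side check might look roughly like the sketch below.  It
   is a simplification: only the header tests (most significant two
   bits, magic cookie, and length) are implemented, and the
   FINGERPRINT test is left out, so the sketch treats somewhat more
   payloads as STUN than a full check would.  The function names are
   assumptions of the sketch.

```python
import struct

MAGIC_COOKIE = 0x2112A442  # fixed value from RFC 5389
HEADER_LEN = 20            # STUN header size in bytes


def looks_like_stun(payload: bytes) -> bool:
    """Header-only STUN test: top two bits, magic cookie, and length."""
    if len(payload) < HEADER_LEN:
        return False
    msg_type, msg_len, cookie = struct.unpack_from("!HHI", payload, 0)
    return (
        (msg_type & 0xC000) == 0
        and cookie == MAGIC_COOKIE
        and msg_len % 4 == 0
        and msg_len == len(payload) - HEADER_LEN
    )


def next_payload(stream: bytes, preferred: int) -> bytes:
    """Take up to `preferred` bytes, shrinking the chunk if it mimics STUN."""
    size = min(preferred, len(stream))
    while size > 1 and looks_like_stun(stream[:size]):
        size -= 1  # a different size makes the length check fail
    return stream[:size]


# Application data that happens to begin with a bare 20-byte STUN header.
stun_like = struct.pack("!HHI", 0x0001, 0, MAGIC_COOKIE) + bytes(12)
chunk = next_payload(stun_like + b"tail", 20)
```

   Here the 20-byte chunk would be mistaken for STUN, so the sketch
   sends 19 bytes instead and leaves the rest for the next frame.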

601	   If TLS or DTLS-SRTP procedures are being utilized to protect the
602	   media stream, those procedures start at the point that media is
603	   permitted to flow, as defined in the ICE specification
604	   [I-D.ietf-mmusic-ice].  The TLS or DTLS-SRTP handshakes occur on top
605	   of the RFC 4571 shim, and are considered part of the media stream for
606	   purposes of this specification.

608	8.2.  Receiving Media

610	   The framing defined in RFC 4571 MUST be used when receiving media.
611	   For media streams that are not RTP-based and do not normally use RFC
612	   4571, the agent extracts the payload of each RFC 4571 frame, and
613	   determines whether it is STUN or application-layer data based on the
614	   procedures in ICE [I-D.ietf-mmusic-ice].  If media is being protected
615	   with DTLS-SRTP, the DTLS, RTP and STUN packets are demultiplexed as
616	   described in Section 3.6.2 of [I-D.ietf-avt-dtls-srtp].

618	   For non-STUN data, the agent appends this to the ongoing bytestream
619	   collected from the frames.  It then parses the bytestream as if it
620	   had been directly received over the TCP connection.  This allows for
621	   ICE-tcp to work without regard to the framing mechanism used by the
622	   application layer protocol.
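
   The reassembly step can be pictured as in the following sketch.
   The predicate is_stun is a hypothetical stand-in for the STUN
   detection rules of the ICE specification, and the toy predicate
   used in the example is for illustration only.

```python
def reassemble(payloads, is_stun):
    """Split frame payloads into STUN messages and one application bytestream."""
    stun_messages = []
    bytestream = bytearray()
    for payload in payloads:
        if is_stun(payload):
            stun_messages.append(payload)
        else:
            bytestream += payload  # frame boundaries carry no meaning
    return stun_messages, bytes(bytestream)


def toy_is_stun(payload):
    """Toy predicate: real detection inspects the STUN header instead."""
    return payload.startswith(b"STUN")


stun, app = reassemble([b"GET /", b"STUN-check", b"index"], toy_is_stun)
```

   The application parser then sees b"GET /index", exactly as if the
   STUN message and the RFC 4571 frames had never been there.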

624	9.  Connection Management

626	9.1.  Connections Formed During Connectivity Checks

628	   Once a TCP or TCP/TLS connection is opened by ICE for the purpose of
629	   connectivity checks, its lifecycle depends on how it is used.  If
630	   that candidate pair is selected by ICE for usage for media, an agent
631	   SHOULD keep the connection open until:

633	   o  The session terminates

635	   o  The media stream is removed

637	   o  An ICE restart takes place, resulting in the selection of a
638	      different candidate pair.

640	   In these cases, the agent SHOULD close the connection when that event
641	   occurs.  This applies to both agents in a session, in which case
642	   usually one of the agents will end up closing the connection first.

644	   If a connection has been selected by ICE, an agent MAY close it
645	   anyway.  As described in the next paragraph, this will cause it to be
646	   reopened almost immediately, and in the interim media cannot be sent.
647	   Consequently, such closures have a negative effect and are NOT
648	   RECOMMENDED.  However, there may be cases where an agent needs to
649	   close a connection for some reason.

651	   If an agent needs to send media on the selected candidate pair, and
652	   its TCP connection has closed, either on purpose or due to some
653	   error, then:

655	   o  If the agent's local candidate is tcp-act or tcp-so, it MUST
656	      reopen a connection to the remote candidate of the selected pair.

658	   o  If the agent's local candidate is tcp-pass, the agent MUST await
659	      an incoming connection request and, consequently, will not be
660	      able to send media until the connection has been opened.
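
   The two cases above reduce to a small decision on the local
   candidate type.  In the sketch below, the function name and the
   returned action strings are assumptions chosen for illustration.

```python
def on_connection_lost(local_type: str) -> str:
    """Return the action required when the selected pair's connection is gone."""
    if local_type in ("tcp-act", "tcp-so"):
        return "reopen"  # connect to the remote candidate again
    if local_type == "tcp-pass":
        return "await"   # wait for the peer to connect; hold media meanwhile
    raise ValueError(f"unknown TCP candidate type: {local_type!r}")


actions = {t: on_connection_lost(t) for t in ("tcp-act", "tcp-so", "tcp-pass")}
```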

662	   If the TCP connection is established, the framing of RFC 4571 is
663	   utilized.  If the agent opened the connection, it MUST send a STUN
664	   connectivity check.  An agent MUST be prepared to receive a
665	   connectivity check over a connection it opened or accepted (note that
666	   this is true in general; ICE requires that an agent be prepared to
667	   receive a connectivity check at any time, even after ICE processing
668	   completes).  If an agent receives a connectivity check after re-
669	   establishment of the connection, it MUST generate a triggered check
670	   over that connection in response if it has not already sent a check.
671	   Once an agent has sent a check and received a successful response,
672	   the connection is considered Valid and media can be sent (which
673	   includes a TLS or DTLS-SRTP session resumption or restart).

675	   If the TCP connection cannot be established, the controlling agent
676	   SHOULD restart ICE for this media stream.  This will happen in cases
677	   where one of the agents is behind a NAT with connection-dependent
678	   mapping properties [RFC5382].

680	9.2.  Connections Formed for Gathering Candidates

682	   If the agent opened a connection to a STUN server for the purposes of
683	   gathering a server reflexive candidate, that connection SHOULD be
684	   closed by the client once ICE processing has completed.  This happens
685	   regardless of whether the candidate learned from the STUN server
686	   was selected by ICE.

688	   If the agent opened a connection to a TURN server for the purposes of
689	   gathering a relayed candidate, that connection MUST be kept open by
690	   the client for the duration of the media session if:

692	   o  A relayed candidate learned from the TURN server was selected by
693	      ICE,

695	   o  or an active candidate established as a consequence of a Connect
696	      request sent through that TCP connection was selected by ICE.

698	   Otherwise, the connection to the TURN server SHOULD be closed once
699	   ICE processing completes.

701	   If, despite efforts of the client, a TCP connection to a TURN server
702	   fails during the lifetime of the media session utilizing a transport
703	   address allocated by that server, the client SHOULD reconnect to the
704	   TURN server, obtain a new allocation, and restart ICE for that media
705	   stream.

707	10.  Security Considerations

709	   The main threat in ICE is hijacking of connections for the purposes
710	   of directing media streams to DoS targets or to malicious users.
711	   ICE-tcp prevents that by only using TCP connections that have been
712	   validated.  Validation requires a STUN transaction to take place over
713	   the connection.  This transaction cannot complete without both
714	   participants knowing a shared secret exchanged in the rendezvous
715	   protocol used with ICE, such as SIP.  This shared secret, in turn, is
716	   protected by that protocol exchange.  In the case of SIP, the usage
717	   of the sips mechanism is RECOMMENDED.  When this is done, an
718	   attacker, even if it knows or can guess the port on which an agent is
719	   listening for incoming TCP connections, will not be able to open a
720	   connection and send media to the agent.

722	   A more detailed analysis of this attack and the various ways ICE
723	   prevents it are described in [I-D.ietf-mmusic-ice].  Those
724	   considerations apply to this specification.

726	11.  IANA Considerations

728	   There are no IANA considerations associated with this specification.

730	12.  Acknowledgements

732	   The authors would like to thank Tim Moore, Saikat Guha, Francois
733	   Audet and Roni Even for the reviews and input on this document.

735	13.  References

737	13.1.  Normative References

739	   [RFC5389]  Rosenberg, J., Mahy, R., Matthews, P., and D. Wing,
740	              "Session Traversal Utilities for NAT (STUN)", RFC 5389,
741	              October 2008.

743	   [RFC3264]  Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model
744	              with Session Description Protocol (SDP)", RFC 3264,
745	              June 2002.

747	   [RFC4145]  Yon, D. and G. Camarillo, "TCP-Based Media Transport in
748	              the Session Description Protocol (SDP)", RFC 4145,
749	              September 2005.

751	   [RFC4571]  Lazzaro, J., "Framing Real-time Transport Protocol (RTP)
752	              and RTP Control Protocol (RTCP) Packets over Connection-
753	              Oriented Transport", RFC 4571, July 2006.

755	   [RFC4572]  Lennox, J., "Connection-Oriented Media Transport over the
756	              Transport Layer Security (TLS) Protocol in the Session
757	              Description Protocol (SDP)", RFC 4572, July 2006.

759	   [I-D.ietf-mmusic-ice]
760	              Rosenberg, J., "Interactive Connectivity Establishment
761	              (ICE): A Protocol for Network Address  Translator (NAT)
762	              Traversal for Offer/Answer Protocols",
763	              draft-ietf-mmusic-ice-19 (work in progress), October 2007.

765	   [I-D.ietf-avt-dtls-srtp]
766	              McGrew, D. and E. Rescorla, "Datagram Transport Layer
767	              Security (DTLS) Extension to Establish Keys for  Secure
768	              Real-time Transport Protocol (SRTP)",
769	              draft-ietf-avt-dtls-srtp-07 (work in progress),
770	              February 2009.

772	   [I-D.ietf-behave-turn]
773	              Rosenberg, J., Mahy, R., and P. Matthews, "Traversal Using
774	              Relays around NAT (TURN): Relay Extensions to Session
775	              Traversal Utilities for NAT (STUN)",
776	              draft-ietf-behave-turn-16 (work in progress), July 2009.

778	   [RFC3711]  Baugher, M., McGrew, D., Naslund, M., Carrara, E., and K.
779	              Norrman, "The Secure Real-time Transport Protocol (SRTP)",
780	              RFC 3711, March 2004.

782	13.2.  Informative References

784	   [RFC5382]  Guha, S., Biswas, K., Ford, B., Sivakumar, S., and P.
785	              Srisuresh, "NAT Behavioral Requirements for TCP", BCP 142,
786	              RFC 5382, October 2008.

788	   [RFC4975]  Campbell, B., Mahy, R., and C. Jennings, "The Message
789	              Session Relay Protocol (MSRP)", RFC 4975, September 2007.

791	   [I-D.blanchet-v6ops-tunnelbroker-tsp]
792	              Blanchet, M. and F. Parent, "IPv6 Tunnel Broker with the
793	              Tunnel Setup Protocol (TSP)",
794	              draft-blanchet-v6ops-tunnelbroker-tsp-04 (work in
795	              progress), May 2008.

797	   [RFC4380]  Huitema, C., "Teredo: Tunneling IPv6 over UDP through
798	              Network Address Translations (NATs)", RFC 4380,
799	              February 2006.

801	   [RFC3947]  Kivinen, T., Swander, B., Huttunen, A., and V. Volpe,
802	              "Negotiation of NAT-Traversal in the IKE", RFC 3947,
803	              January 2005.

805	Appendix A.  Implementation Considerations for BSD Sockets

807	   This specification requires unusual handling of TCP connections, the
808	   implementation of which in traditional BSD socket APIs is non-
809	   trivial.

811	   In particular, ICE requires an agent to obtain a local TCP candidate,
812	   bound to a local IP and port, and then from that local port, initiate
813	   a TCP connection (to the STUN server, in order to obtain server
814	   reflexive candidates, to the TURN server, to obtain a relayed
815	   candidate, or to the peer as part of a connectivity check), and be
816	   prepared to receive incoming TCP connections (for passive and
817	   simultaneous-open candidates).  A "typical" BSD socket is used either
818	   for initiating or receiving connections, and not for both.  The code
819	   required to allow incoming and outgoing connections on the same local
820	   IP and port is non-obvious.  The following pseudocode, contributed by
821	   Saikat Guha, has been found to work on many platforms:

823	   for i in 0 to MAX
824	      sock_i = socket()
825	      set(sock_i, SO_REUSEADDR)
826	      bind(sock_i, local)

828	   listen(sock_0)
829	   connect(sock_1, stun)
830	   connect(sock_2, remote_a)
831	   connect(sock_3, remote_b)

833	   The key here is that, prior to the listen() call, the full set of
834	   sockets that need to be utilized for outgoing connections must be
835	   allocated and bound to the local IP address and port.  This number,
836	   MAX, represents the maximum number of TCP connections to different
837	   destinations that might need to be established from the same local
838	   candidate.  This number can be potentially large for simultaneous-
839	   open candidates.  If a request forks, ICE procedures may take place
840	   with multiple peers.  Furthermore, for each peer, connections would
841	   need to be established to each passive or simultaneous-open candidate
842	   for the same component.  If we assume a worst case of five forked
843	   branches, and for each peer, five simultaneous-open candidates, that
844	   results in MAX=25.  For a passive candidate, MAX is equal to the
845	   number of STUN servers, since the agent only initiates TCP
846	   connections on a passive candidate to its STUN server.
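
   The sizing of MAX described above can be written as a small
   calculation.  The function and parameter names below are
   assumptions of this sketch, and only the two cases discussed in the
   text are covered.

```python
def max_outgoing_sockets(candidate_type, forked_branches=0,
                         remote_candidates_per_peer=0, stun_servers=0):
    """Estimate MAX, the number of outgoing sockets to bind before listen()."""
    if candidate_type == "tcp-so":
        # One connection per remote candidate, for every forked branch.
        return forked_branches * remote_candidates_per_peer
    if candidate_type == "tcp-pass":
        # A passive candidate only connects out to its STUN servers.
        return stun_servers
    raise ValueError(f"no sizing rule given for {candidate_type!r}")


worst_case = max_outgoing_sockets("tcp-so", forked_branches=5,
                                  remote_candidates_per_peer=5)
```

   With five branches and five simultaneous-open candidates per peer,
   worst_case is 25, matching the figure given above.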

848	Authors' Addresses

850	   Simon Perreault (editor)
851	   Viagenie
852	   2600 boul. Laurier, suite 625
853	   Quebec, QC  G1V 4W1
854	   Canada

856	   Phone: +1 418 656 9254
857	   Email: simon.perreault@viagenie.ca
858	   URI:   http://www.viagenie.ca

860	   Jonathan Rosenberg
861	   Cisco
862	   Edison, NJ
863	   US

865	   Email: jdrosen@cisco.com
866	   URI:   http://www.jdrosen.net