idnits 2.17.1 

draft-ietf-intarea-tunnels-00.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  == There are 1 instance of lines with non-RFC6890-compliant IPv4 addresses
     in the document.  If these are example addresses, they should be changed.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  == The document seems to contain a disclaimer for pre-RFC5378 work, but was
     first submitted on or after 10 November 2008.  The disclaimer is usually
     necessary only for documents that revise or obsolete older RFCs, and that
     take significant amounts of text from those RFCs.  If you can contact all
     authors of the source material and they are willing to grant the BCP78
     rights to the IETF Trust, you can and should remove the disclaimer. 
     Otherwise, the disclaimer is needed and you can ignore this comment. 
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (March 26, 2010) is 5135 days in the past.  Is this
     intentional?


  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  == Outdated reference: A later version (-24) exists of draft-ietf-lisp-06

  == Outdated reference: A later version (-04) exists of
     draft-ietf-v6ops-tunnel-security-concerns-01

  == Outdated reference: A later version (-16) exists of
     draft-ietf-trill-rbridge-protocol-15

  -- Obsolete informational reference (is this intentional?): RFC 3344
     (Obsoleted by RFC 5944)

  -- No information found for draft-touch-intarea-ipv4-id-update - is the
     name correct?


     Summary: 0 errors (**), 0 flaws (~~), 6 warnings (==), 3 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.
--------------------------------------------------------------------------------

1	Internet Area WG                                               J. Touch
2	Internet Draft                                                  USC/ISI
3	Intended status: Informational                              M. Townsley
4	Expires: September 2010                                           Cisco
5	                                                         March 26, 2010

7	                   Tunnels in the Internet Architecture
8	                     draft-ietf-intarea-tunnels-00.txt

10	Status of this Memo

12	   This Internet-Draft is submitted in full conformance with the
13	   provisions of BCP 78 and BCP 79.

15	   This document may contain material from IETF Documents or IETF
16	   Contributions published or made publicly available before November
17	   10, 2008. The person(s) controlling the copyright in some of this
18	   material may not have granted the IETF Trust the right to allow
19	   modifications of such material outside the IETF Standards Process.
20	   Without obtaining an adequate license from the person(s) controlling
21	   the copyright in such materials, this document may not be modified
22	   outside the IETF Standards Process, and derivative works of it may
23	   not be created outside the IETF Standards Process, except to format
24	   it for publication as an RFC or to translate it into languages other
25	   than English.

27	   Internet-Drafts are working documents of the Internet Engineering
28	   Task Force (IETF), its areas, and its working groups.  Note that
29	   other groups may also distribute working documents as Internet-
30	   Drafts.

32	   Internet-Drafts are draft documents valid for a maximum of six months
33	   and may be updated, replaced, or obsoleted by other documents at any
34	   time.  It is inappropriate to use Internet-Drafts as reference
35	   material or to cite them other than as "work in progress."

37	   The list of current Internet-Drafts can be accessed at
38	   http://www.ietf.org/ietf/1id-abstracts.txt

40	   The list of Internet-Draft Shadow Directories can be accessed at
41	   http://www.ietf.org/shadow.html

43	   This Internet-Draft will expire on September 26, 2010.

45	Copyright Notice

47	   Copyright (c) 2010 IETF Trust and the persons identified as the
48	   document authors. All rights reserved.

50	   This document is subject to BCP 78 and the IETF Trust's Legal
51	   Provisions Relating to IETF Documents
52	   (http://trustee.ietf.org/license-info) in effect on the date of
53	   publication of this document. Please review these documents
54	   carefully, as they describe your rights and restrictions with respect
55	   to this document. Code Components extracted from this document must
56	   include Simplified BSD License text as described in Section 4.e of
57	   the Trust Legal Provisions and are provided without warranty as
58	   described in the Simplified BSD License.

60	Abstract

62	   This document discusses the role of tunnels in the Internet
63	   architecture. It explains their relationship to existing protocol
64	   layers, and the challenges in supporting tunneling.

66	Table of Contents

68	   1. Introduction
69	   2. Conventions used in this document
70	   3. Known Issues
71	      3.1. MTU discovery
72	      3.2. Fragmentation
73	         3.2.1. Outer Fragmentation
74	         3.2.2. Inner Fragmentation
75	         3.2.3. Fragmentation efficiency
76	         3.2.4. Packing (ala GigE bursting)
77	         3.2.5. IP ID exhaustion
78	      3.3. Signaling
79	   4. Current Tunnel Standards
80	      4.1. IP in IP
81	         4.1.1. MTU discovery
82	         4.1.2. Fragmentation
83	         4.1.3. Signaling
84	      4.2. IPsec
85	         4.2.1. MTU discovery
86	         4.2.2. Fragmentation
87	         4.2.3. Signaling
88	   5. Issues
89	      5.1. Tunnel model
90	      5.2. Parties participating

92	   6. Potential Ways Forward
93	   7. Notes for future updates
94	   8. Security Considerations
95	   9. IANA Considerations
96	   10. References
97	      10.1. Normative References
98	      10.2. Informative References
99	   11. Acknowledgments

101	1. Introduction

103	   The Internet is loosely based on the ISO seven layer stack, in which
104	   data units traverse the stack by being wrapped inside data units one
105	   layer down (Figure 1). A tunnel is a mechanism for transmitting data
106	   units between endpoints by wrapping them inside data units of other
107	   layers, e.g., IP in IP, or IP in UDP (Figure 2).

109	                    +------+----+-----+--------------+
110	                    +  Eth | IP | TCP |     Data     |
111	                    +------+----+-----+--------------+

113	                  Figure 1 TCP inside IP inside Ethernet

115	              +------+----+-----+----+-----+--------------+
116	              +  Eth | IP'| UDP | IP | TCP |     Data     |
117	              +------+----+-----+----+-----+--------------+

119	                   Figure 2 IP in UDP in IP in Ethernet
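
   As a rough illustration of Figures 1 and 2, the following Python
   sketch treats each data unit as a set of headers plus data and
   compares the two stacks. The header sizes are nominal values assumed
   for illustration (no IP or TCP options, untagged Ethernet), and the
   function name wrap is hypothetical.

```python
# Nominal header sizes in bytes (assumed: no IP/TCP options, untagged Ethernet).
HEADER_BYTES = {"Eth": 14, "IP": 20, "UDP": 8, "TCP": 20}

def wrap(layers, data_len):
    """Return the total size of data_len bytes wrapped by the given headers."""
    return sum(HEADER_BYTES[name] for name in layers) + data_len

figure1 = wrap(["Eth", "IP", "TCP"], 1000)               # Figure 1 stack
figure2 = wrap(["Eth", "IP", "UDP", "IP", "TCP"], 1000)  # Figure 2 stack
tunnel_overhead = figure2 - figure1                      # extra IP' + UDP headers
```

   Under these assumptions the tunnel of Figure 2 adds 28 bytes to
   every packet, which is the space that later sections must account
   for.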

121	   Tunnels help decouple topology from that provided by the physical
122	   network components. For example, they were critical in the
123	   development of multicast, where not all routers were capable of
124	   processing multicast packets. Multicast routers were interconnected
125	   by tunnels where not directly connected. Similar techniques have been
126	   used to support other protocols, such as IPv6.

128	   Use of tunnels is common in the Internet. The word "tunnel" occurs in
129	   over 100 RFCs, and many protocols support tunneling, including:

131	   o  IPsec - hides the original traffic destination [RFC4301]

133	   o  L2TP - Tunnels PPP over IP, used largely in DSL/FTTH access
134	      networks to extend a subscriber's connection from an access line
135	      provider to an ISP [RFC3931]

137	   o  Mobile IP - forwards traffic to the home agent [RFC2003]

138	   o  L2VPNs - provides a link topology different from that provided by
139	      physical links [RFC4664]

141	   o  L3VPNs - provides a network topology different from that provided
142	      by ISPs [RFC4176]

144	   o  SEAL - a generic mechanism for IP in IP tunneling designed to
145	      overcome the limitations of RFC2003 [RFC5320]

147	   o  LISP - reduces routing table load within an enclave of routers
148	      [Fa10]

150	   o  TRILL - enables L3 routing in an enclave of bridges
151	      [Pe10][RFC5556]

153	   o  MPLS - ? {need description/ref}

155	   o  PWE3 - ? {need description/ref}

157	   The variety of tunnel mechanisms raises the question of the roles of
158	   tunnels in the Internet architecture, and the potential need for
159	   coordination of these mechanisms. In particular, the ways in which
160	   MTU mismatch and error signals (e.g., ICMP) are handled may benefit
161	   from a coordinated approach.

163	   It is useful to note that, regardless of the layer in which
164	   encapsulation occurs, tunnels emulate a link. As links, they are
165	   subject to link issues, e.g., MTU discovery, signaling, and the
166	   potential utility of native support for broadcast and multicast
167	   [RFC3819]. They have advantages over native links, being potentially
168	   easier to reconfigure and control.

170	2. Conventions used in this document

172	   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
173	   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
174	   document are to be interpreted as described in RFC-2119 [RFC2119].

176	3. Known Issues

178	   Most of the known issues with tunnels arise from the complications of
179	   encapsulation, or from the introduction of artificial endpoints along
180	   a data path. Encapsulation exacerbates MTU issues, often because a
181	   data unit will traverse at least one layer of a protocol stack more
182	   than once (e.g., as in Figure 2), which requires space for additional
183	   headers. This space complicates MTU discovery, and often results in
184	   fragmentation.

186	   Tunnel encapsulation and decapsulation nodes act as network
187	   endpoints. They may source and sink much higher bandwidth streams
188	   from single IP addresses, and thus can be affected by many of the
189	   issues of other high bandwidth edge devices, such as fragmentation
190	   efficiency and IP ID exhaustion (in IPv4). These endpoints also
191	   introduce complexity in end-to-end and path signaling, in the
192	   translation between signals inside a tunnel and signals outside on
193	   the end-to-end path.

195	3.1. MTU discovery

197	   MTU discovery is a known challenge in the current Internet, and
198	   tunnels can complicate its proper operation. Encapsulation increases
199	   the size of a packet in tunnel transit, which can then exceed the MTU
200	   of links on the tunnel path. This is especially true for recursive
201	   tunnels, i.e., tunnels that reuse layers of the protocol stack (e.g.,
202	   IPv4 over IPv4). These issues are discussed in detail in [RFC4459];
203	   the following provides a brief overview of the issues. Note that the
204	   impact of tunnels on MTU discovery may be mitigated somewhat by the
205	   ubiquity of workarounds already needed in the Internet, e.g., the
206	   deduction of a 'tunnel tax' for all MTUs (i.e., maxing out the MTU at
207	   1200-1400 bytes, rather than 1500).
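
   The effect of encapsulation on the usable MTU can be sketched with
   simple arithmetic. The example below assumes IPv4 headers without
   options (20 bytes) and a 1500-byte link MTU; the function name
   inner_mtu is hypothetical.

```python
IPV4_HEADER = 20  # assumed: IPv4 header without options

def inner_mtu(link_mtu, encap_overheads):
    """Largest inner packet that still fits in link_mtu once every
    encapsulation header (sizes in bytes) has been added."""
    return link_mtu - sum(encap_overheads)

one_level = inner_mtu(1500, [IPV4_HEADER])        # IPv4 in IPv4
two_levels = inner_mtu(1500, [IPV4_HEADER] * 2)   # recursive tunnel
```

   Each level of recursion reduces the size available to the original
   packet, which is why a fixed 'tunnel tax' only helps when it covers
   the total overhead of all tunnels on the path.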

209	   Conventional path MTU discovery (PMTUD) relies on explicit negative
210	   feedback from routers along the path (ICMP "message too big" signals)
211	   [RFC1191]. This technique is susceptible to the "black hole"
212	   phenomenon, in which the ICMP messages never return to the source
213	   [RFC2923]. In the typical Internet case, lost ICMPs are often the
214	   result of filtering, e.g., for policy reasons.

216	   A more recent alternative is packetization-layer path MTU discovery
217	   (PLPMTUD) [RFC4821]. This variant relies on feedback from the
218	   endpoint, indicating either the success or failure of probe packets.
219	   It is not susceptible to "black holing", but requires explicit
220	   participation by the receiver.

222	   Either of these techniques (PMTUD, PLPMTUD) can be applied to
223	   tunnels. The encapsulator must react to "message too big" signals in
224	   either case, by either adjusting its fragmentation, relaying a
225	   corresponding signal to the packet origin outside the tunnel, or
226	   both. Fragmentation adjustment is easy to incorporate, but can result
227	   in inefficient transmission of packets over the tunnel (e.g., where
228	   every source packet is fragmented). Relaying the signal to the source
229	   can be much more efficient, but it can be difficult to determine what
230	   signal to forward. E.g., in PMTUD, routers along the tunnel may not
231	   return a sufficiently long prefix to determine the decapsulated
232	   packet origin.
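
   The difficulty of relaying can be illustrated with a small sketch.
   It assumes IPv4-in-IPv4 encapsulation without options, and a router
   that quotes only the minimum required by RFC 792 (the IP header plus
   the first 64 bits of the offending datagram) in its ICMP error. The
   function name and constants are hypothetical.

```python
OUTER_HDR = 20      # assumed: outer IPv4 header without options
SRC_ADDR_END = 16   # the IPv4 source address occupies header bytes 12..15

def can_find_inner_source(quoted_bytes, outer_hdr=OUTER_HDR):
    """True if the quoted part of the offending packet extends to the end
    of the source address field of the inner IPv4 header."""
    return quoted_bytes >= outer_hdr + SRC_ADDR_END

minimal_quote = OUTER_HDR + 8   # RFC 792 minimum: IP header + 64 bits
```

   With the minimal quote, only the first 8 bytes of the inner header
   are returned, so can_find_inner_source(minimal_quote) is False and
   the encapsulator cannot identify the packet origin from the ICMP
   message alone.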

234	   Tunnels thus may need to participate in MTU discovery, either
235	   forwarding or recomputing ICMPs received inside the tunnel path. The
236	   tunnel may incorporate its own MTU discovery between ingress and
237	   egress, e.g., as proposed in SEAL [RFC5320].

239	3.2. Fragmentation

241	   There are two places where fragmentation can occur in a tunnel,
242	   called Outer Fragmentation and Inner Fragmentation.

244	3.2.1. Outer Fragmentation

246	   The simplest case is Outer Fragmentation, as shown in Figure 3. The
247	   bottom of the figure shows the network topology, where packets start
248	   at the source, enter the tunnel at the encapsulator, exit the tunnel
249	   at the decapsulator, and arrive finally at the destination. The
250	   packet traffic is shown above the topology, where the end-to-end
251	   packets are shown at the top. The packets are composed of an inner
252	   header (iH) and inner data (iD); the term "inner" is relative to the
253	   tunnel, as will become apparent. When the packet (iH,iD) arrives at
254	   the encapsulator, it is placed inside the tunnel packet structure,
255	   here shown as adding just an outer header, oH, in step (a).

257	   When the encapsulated packet exceeds the MTU of the tunnel, the
258	   packet needs to be fragmented. In this case we fragment the packet at
259	   the outer header, with the fragments shown as (b1) and (b2). Note
260	   that the outer header indicates fragmentation (as ' and "), the inner
261	   header occurs only in the first fragment, and the inner data is
262	   broken across the two packets. These fragments are reassembled at the
263	   decapsulator in step (c), and the resulting packet is decapsulated
264	   and sent on to the destination.

266	    +----+----+                                              +----+----+
267	    | iH | iD |------+ -  -  -  -  -  -  -  -  -  -  +------>| iH | iD |
268	    +----+----+      |                               |       +----+----+
269	                     v                               |
270	              +----+----+----+               +----+----+----+
271	          (a) | oH | iH | iD |               | oH | iH | iD | (c)
272	              +----+----+----+               +----+----+----+
273	                     |                               ^
274	                     |       +----+----+-----+       |
275	                (b1) +----- >| oH'| iH | iD1 |-------+
276	                     |       +----+----+-----+       |
277	                     |                               |
278	                     |       +----+-----+            |
279	                (b2) +----- >| oH"| iD2 |------------+
280	                             +----+-----+

282	   +-----+         +---+                           +---+         +-----+
283	   |     |        /     \ ======================= /     \        |     |
284	   | Src |=======|  Enc  |=======================|  Dec  |=======| Dst |
285	   |     |        \     / ======================= \     /        |     |
286	   +-----+         +---+                           +---+         +-----+

288	                Figure 3 Fragmentation of the outer packet

290	   Outer fragmentation isolates Source and Destination from tunnel
291	   encapsulation duties. This can be considered a benefit in clean,
292	   layered network design, but it may also complicate decapsulator
293	   design, especially where tunnels aggregate large amounts of traffic
294	   and cause IP ID overload (see Sec. 3.2.5). Outer fragmentation is
295	   valid for any tunnel encapsulation protocol that supports
296	   fragmentation (e.g., IPv4 or IPv6), where the tunnel endpoints act
297	   as the host endpoints of that protocol.

299	   Along the tunnel, the inner header is contained only in the first
300	   fragment, which can interfere with mechanisms that 'peek' into lower
301	   layer headers, e.g., as for ICMP, as discussed in Sec. 3.3.

303	3.2.2. Inner Fragmentation

305	   Inner Fragmentation distributes the impact of tunneling across both
306	   the decapsulator and destination, and is shown in Figure 4. Again,
307	   the network topology is shown at the bottom of the figure, and the
308	   original packets shown at the top. Packets arrive at the encapsulator,
309	   and are fragmented there based on the inner header into (a1) and
310	   (a2). The fragments arrive at the decapsulator, which removes the
311	   outer header and forwards the resulting fragments on to the
312	   destination. The destination is then responsible for reassembling the
313	   fragments into the original packet.

315	   +----+----+                                               +----+----+
316	   | iH | iD |-------+-  -  -  -  -  -  -  -  -  -  -  -  - >| iH | iD |
317	   +----+----+       |                                       +----+----+
318	                     v                                            ^
319	                +----+-----+                    +----+-----+      |
320	           (a1) | iH'| iD1 |                    | iH'| iD1 |------+
321	                +----+-----+                    +----+-----+      |
322	                                                                  |
323	                +----+-----+                    +----+-----+      |
324	           (a2) | iH"| iD2 |                    | iH"| iD2 |------+
325	                +----+-----+                    +----+-----+
326	                     |                               ^
327	                     |       +----+----+-----+       |
328	                (b1) +----- >| oH | iH'| iD1 |-------+
329	                     |       +----+----+-----+       |
330	                     |                               |
331	                     |       +----+----+-----+       |
332	                (b2) +----- >| oH | iH"| iD2 |-------+
333	                             +----+----+-----+

335	   +-----+         +---+                           +---+         +-----+
336	   |     |        /     \ ======================= /     \        |     |
337	   | Src |=======|  Enc  |=======================|  Dec  |=======| Dst |
338	   |     |        \     / ======================= \     /        |     |
339	   +-----+         +---+                           +---+         +-----+

341	                Figure 4 Fragmentation of the inner packet

343	   As noted, inner fragmentation distributes the effort of tunneling
344	   across the decapsulator and destinations; this can be especially
345	   important when the tunnel aggregates large amounts of traffic. Note
346	   that this mechanism is valid only when the original source
347	   packets can be fragmented on-path, e.g., as in IPv4.

349	   Along the tunnel, the inner headers are copied into each fragment,
350	   and so are available to mechanisms that 'peek' into headers (e.g.,
351	   ICMP, as discussed in Sec. 3.3). Because fragmentation happens on the
352	   inner header, the impact of IP ID exhaustion (Sec. 3.2.5) is reduced.
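
   The two approaches of Sec. 3.2.1 and Sec. 3.2.2 can be compared with
   a short sketch. It assumes IPv4 headers without options for both the
   inner and outer header, and models only packet sizes; the function
   names are hypothetical and this is not taken from any particular
   implementation.

```python
IP_HDR = 20  # assumed: IPv4 headers without options

def fragment(payload_len, mtu, hdr_len=IP_HDR):
    """Split payload_len bytes into IPv4-style fragments for the given MTU.
    Returns (offset, length, more_fragments) tuples; every fragment except
    the last carries a multiple of 8 payload bytes."""
    max_data = (mtu - hdr_len) // 8 * 8
    frags, offset = [], 0
    while offset < payload_len:
        length = min(max_data, payload_len - offset)
        frags.append((offset, length, offset + length < payload_len))
        offset += length
    return frags

def outer_fragmentation(inner_hdr, inner_data, tunnel_mtu):
    """Encapsulate first, then fragment on the outer header.
    Returns the sizes of the packets sent over the tunnel."""
    payload = inner_hdr + inner_data   # iH + iD is the outer payload
    return [IP_HDR + length for _, length, _ in fragment(payload, tunnel_mtu)]

def inner_fragmentation(inner_hdr, inner_data, tunnel_mtu):
    """Fragment on the inner header, then encapsulate each fragment.
    Returns the sizes of the packets sent over the tunnel."""
    inner_mtu = tunnel_mtu - IP_HDR    # leave room for oH
    frags = fragment(inner_data, inner_mtu, inner_hdr)
    return [IP_HDR + inner_hdr + length for _, length, _ in frags]

outer_sizes = outer_fragmentation(20, 1480, tunnel_mtu=1500)
inner_sizes = inner_fragmentation(20, 1480, tunnel_mtu=1500)
```

   In the outer case only the first tunnel packet carries iH; in the
   inner case every tunnel packet carries a copy of iH, at the cost of
   slightly more header bytes overall.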

354	3.2.3. Fragmentation efficiency

356	   There are different ways to fragment a packet. Consider a network
357	   with an MTU as shown in Figure 5, where packets are encapsulated over
358	   the same network layer as they arrive on (e.g., IP in IP). If a
359	   packet as large as the MTU arrives, it must be fragmented to
360	   accommodate the additional header.

362	                 X===========================X (MTU)
363	                 +----+----------------------+
364	                 | iH | DDDDDDDDDDDDDDDDDDDD |
365	                 +----+----------------------+
366	                   |
367	                   |  X===========================X (MTU)
368	                   |  +---+----+------------------+
369	               (a) +->| H'| iH | DDDDDDDDDDDDDDDD |
370	                   |  +---+----+------------------+
371	                   |      |
372	                   |      |  X===========================X (MTU)
373	                   |      |  +----+---+----+-------------+
374	                   | (a1) +->| nH'| H'| iH | DDDDDDDDDDD |
375	                   |      |  +----+---+----+-------------+
376	                   |      |
377	                   |      |  +----+-------+
378	                   | (a2) +->| nH"| DDDDD |
379	                   |         +----+-------+
380	                   |
381	                   |  +---+------+
382	               (b) +->| H"| DDDD |
383	                      +---+------+
384	                          |
385	                          |  +----+---+------+
386	                     (b1) +->| nH'| H"| DDDD |
387	                             +----+---+------+

389	                   Figure 5 Fragmenting via maximum fit

391	   Figure 5 shows this process, using Outer Fragmentation as an example
392	   (the situation is the same for Inner Fragmentation, but the headers
393	   that are affected differ). The arriving packet is first split into
394	   (a) and (b), where (a) is the size of the network MTU. However, this
395	   tunnel then traverses another tunnel, whose impact the first
396	   tunnel ingress has not accommodated. The packet (a) arrives at the
397	   second tunnel ingress, and needs to be encapsulated again, but
398	   because it is already at the MTU, it needs to be fragmented as well,
399	   into (a1) and (a2). In this case, packet (b) arrives at the second
400	   tunnel ingress and is encapsulated into (b1) without fragmentation,
401	   because it is already below the MTU size.

403	   In Figure 6, the fragmentation is done evenly, i.e., by splitting the
404	   original packet into two roughly equal-sized components, (c) and (d).
405	   Note that (d) contains more packet data because, in this example of
406	   Outer Fragmentation, (c) also carries the original packet header.
407	   The packets (c) and (d) arrive at the second tunnel
408	   encapsulator, and are encapsulated again; this time, neither packet
409	   exceeds the MTU, and neither requires further fragmentation.

411	                 X===========================X (MTU)
412	                 +----+----------------------+
413	                 | iH | DDDDDDDDDDDDDDDDDDDD |
414	                 +----+----------------------+
415	                   |
416	                   |  X===========================X (MTU)
417	                   |  +---+----+----------+
418	               (c) +->| H'| iH | DDDDDDDD |
419	                   |  +---+----+----------+
420	                   |      |
421	                   |      |  X===========================X (MTU)
422	                   |      |  +----+---+----+----------+
423	                   | (c1) +->| nH | H'| iH | DDDDDDDD |
424	                   |         +----+---+----+----------+
425	                   |
426	                   |  +---+--------------+
427	               (d) +->| H"| DDDDDDDDDDDD |
428	                      +---+--------------+
429	                          |
430	                          |  +----+---+--------------+
431	                     (d1) +->| nH | H"| DDDDDDDDDDDD |
432	                             +----+---+--------------+

434	                        Figure 6 Fragmenting evenly
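
   The difference between Figures 5 and 6 can be checked by counting
   packets. The sketch below assumes the same 1500-byte MTU on every
   link, 20-byte encapsulation headers, Outer Fragmentation at both
   tunnels, and an arriving packet exactly as large as the MTU. All
   names are hypothetical.

```python
MTU, HDR = 1500, 20   # assumed: same MTU on every link, 20-byte headers

def split_max_fit(payload):
    """Fill the first fragment to the MTU; the remainder is the second."""
    first = (MTU - HDR) // 8 * 8
    return [first, payload - first]

def split_even(payload):
    """Split the payload into two roughly equal, 8-byte aligned parts."""
    first = (payload // 2 + 7) // 8 * 8
    return [first, payload - first]

def packets_after_second_tunnel(split, packet_len=MTU):
    """Fragment packet_len bytes at tunnel 1 using `split`, then count the
    packets needed once each result is encapsulated again at tunnel 2.
    Assumes each tunnel 1 packet is no larger than the MTU, so it needs at
    most two fragments at tunnel 2."""
    count = 0
    for data in split(packet_len):
        size = HDR + data                       # packet leaving tunnel 1
        count += 1 if size + HDR <= MTU else 2  # tunnel 2 adds a header
    return count

max_fit_packets = packets_after_second_tunnel(split_max_fit)
even_packets = packets_after_second_tunnel(split_even)
```

   Under these assumptions maximum fit yields three packets on the
   second tunnel, corresponding to (a1), (a2), and (b1), while the even
   split yields two, corresponding to (c1) and (d1).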

436	3.2.4. Packing (ala GigE bursting)

438	   Encapsulating individual packets to traverse a tunnel can be
439	   inefficient, especially where headers are large relative to the
440	   packets being carried. In that case, it can be more efficient to
441	   encapsulate many small packets in a single, larger tunnel payload.
442	   This technique, similar to the effect of packet bursting in Gigabit
443	   Ethernet, reduces the overhead of the encapsulation headers (Figure
444	   7). It reduces the work of header addition and removal at the tunnel
445	   endpoints, but increases other work involving the packing and
446	   unpacking of the component packets carried.

448	                     +-----+-----+
449	                     | iHa | iDa |
450	                     +-----+-----+
451	                           |
452	                           |     +-----+-----+
453	                           |     | iHb | iDb |
454	                           |     +-----+-----+
455	                           |           |
456	                           |           |     +-----+-----+
457	                           |           |     | iHc | iDc |
458	                           |           |     +-----+-----+
459	                           |           |           |
460	                           v           v           v
461	                +----+-----+-----+-----+-----+-----+-----+
462	                | oH | iHa | iDa | iHb | iDb | iHc | iDc |
463	                +----+-----+-----+-----+-----+-----+-----+

465	                  Figure 7 Packing packets into a tunnel
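
   The header savings from packing can be estimated as follows. This
   sketch assumes a 20-byte outer header and three 40-byte inner
   packets, and it assumes that the inner headers carry enough length
   information to separate the packed packets again, so no extra
   delimiters are counted. The function name is hypothetical.

```python
def header_overhead(inner_sizes, outer_hdr=20, packed=False):
    """Fraction of tunnel bytes spent on outer headers."""
    n_outer = 1 if packed else len(inner_sizes)
    total = sum(inner_sizes) + n_outer * outer_hdr
    return n_outer * outer_hdr / total

small = [40, 40, 40]                               # assumed inner packet sizes
separate_ratio = header_overhead(small)            # 60 of 180 bytes
packed_ratio = header_overhead(small, packed=True) # 20 of 140 bytes
```

   Here packing reduces the outer header share from one third to one
   seventh of the bytes sent over the tunnel.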

467	3.2.5. IP ID exhaustion

469	   In IPv4, the IP Identification (ID) field is a 16-bit value that is
470	   unique for every packet for a given source address, destination
471	   address, and protocol, such that it does not repeat within the
472	   Maximum Segment Lifetime (MSL) [RFC791][RFC1122]. Although the ID
473	   field was originally intended for fragmentation and reassembly, it
474	   can also be used to detect and discard duplicate packets, e.g., at
475	   congested routers (see Sec. 3.2.1.5 of [RFC1122]). For this reason,
476	   and even more so because IPv4 packets can be fragmented anywhere
477	   along a path, all packets between a source and destination of a given
478	   protocol must have unique ID values over a period of an MSL, which is
479	   typically interpreted as two minutes (120 seconds).

481	   The uniqueness of the IP ID is a known problem for high speed
482	   devices, because it limits the speed of a single protocol between two
483	   endpoints [RFC4963]. With the maximum IP packet size of 64KB, a 16-
484	   bit ID field that does not repeat within 120 seconds means that the
485	   aggregate rate of all TCP connections between two endpoints is
486	   limited to roughly 286 Mbps; for more typical MTUs of 1500 bytes,
487	   this drops to 6.4 Mbps.
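
   These limits follow from dividing the number of distinct ID values
   by the MSL. The sketch below reproduces the arithmetic, taking 1
   Mbps as 10^6 bits per second; the function name is hypothetical.
   With 1500-byte packets the formula gives about 6.55 Mbps, and the
   6.4 Mbps figure is matched if only about 1460 bytes of each packet
   are counted, which is an assumption of this sketch rather than
   something stated in the text.

```python
ID_SPACE = 2 ** 16    # distinct 16-bit IP ID values
MSL_SECONDS = 120     # MSL as interpreted in the text above

def max_rate_mbps(packet_bytes):
    """Upper bound on throughput (10^6 bit/s) if no ID repeats within an MSL."""
    return ID_SPACE * packet_bytes * 8 / MSL_SECONDS / 1e6

rate_64k = max_rate_mbps(65535)   # maximum-size IP packets
rate_1500 = max_rate_mbps(1500)   # full 1500-byte packets
rate_1460 = max_rate_mbps(1460)   # assumed: 1460 bytes counted per packet
```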

489	   Although this strongly suggests that the uniqueness of the IP ID is
490	   moot, tunnels exacerbate this condition. A tunnel often aggregates
491	   traffic from a number of different source and destination addresses,
492	   of different protocols, and encapsulates them in a header with the
493	   same ingress and egress addresses, all using a single encapsulation
494	   protocol. The result is one of the following:

496	   1. The IP ID rules are enforced, and the tunnel throughput is
497	      severely limited.

499	   2. The IP ID rules are enforced, and the tunnel consumes large
500	      numbers of ingress/egress IP addresses solely to ensure ID
501	      uniqueness.

503	   3. The IP ID rules are ignored.

505	   The last case is the most obvious solution, because it corresponds to
506	   how endpoints currently behave. Fortunately, fragmentation is
507	   somewhat rare in the current Internet at large, but it can be common
508	   along a tunnel. Fragments that repeat the IP ID risk being
509	   reassembled incorrectly, especially when fragments are reordered or
510	   lost. Although such errors may be detected at the transport layer,
511	   this results in excessive overall packet loss, as well as wasting
512	   bandwidth between the egress and ultimate packet destination.

514	3.3. Signaling

516	   In the current Internet architecture, signals tend to go upstream,
517	   either from routers along a path or from the destination, back toward
518	   the source (Figure 8). Such signals are typically contained in ICMP
519	   messages, but can involve other protocols such as RSVP, transport
520	   protocol signals (e.g., TCP RSTs), or multicast.

522	     +--------------------------------------------------------------+
523	     |                                                              |
524	     | +---------------------------+                                |
525	     | |                           |                                |
526	     v v                           |                                |
527	   +-----+                         |                             +-----+
528	   |     |                         |                             |     |
529	   | Src |=========================R=============================| Dst |
530	   |     |                                                       |     |
531	   +-----+                                                       +-----+

533	                  Figure 8 Signaling paths in an Internet

535	   Tunnels interfere with these known signaling paths. As shown in
536	   Figure 9, signals from routers along the tunnel path (R2), as well as
537	   those from the tunnel egress, need to be relayed by the ingress. This
538	   relaying may be difficult, because R2 may not return enough
539	   information to the ingress to support relaying (e.g., when ICMP
540	   returns only the outermost headers in a "message too big", and the
541	   source transport port information is lost). Signals from routers
542	   downstream of the egress (R3 in Figure 9) need to traverse the tunnel
543	   in reverse.

545	   In all cases, the tunnel ingress needs to determine how to relay the
546	   signals from inside the tunnel into signals back to the source. For
547	   some protocols this is either simple or impossible (such as for
548	   ICMP); for others, it can even be undefined (e.g., multicast).

550	      +  -  -  -  -  +-------------------------------+
551	      |              |                               |
552	      v              v                               |
553	   +-----+         +---+                           +---+         +-----+
554	   |     |        /     \ ======================= /     \        |     |
555	   | Src |==R1===|  Enc  |==========R2===========|  Dec  |===R3==| Dst |
556	   |     |        \     / ======================= \     /        |     |
557	   +-----+         +---+             |             +---+         +-----+
558	      ^              ^               |
559	      |              |               |
560	      +  -  -  -  -  +---------------+

562	              Figure 9 Signaling paths introduced by a tunnel

564	4. Current Tunnel Standards

566	   This section reviews two common Internet tunnel standards. They are
567	   notable because they both ultimately rely on IP in IP encapsulation,
568	   although they each handle MTU discovery, fragmentation, and signaling
569	   differently.

571	   [There are other tunnel mechanisms, such as IPv4 in IPv6, which may
572	   be added to this discussion later.]

574	4.1. IP in IP

576	   The simplest tunnel encapsulation mechanism is IP in IP, explained
577	   here for IPv4 [RFC2003]. This protocol was standardized for use in
578	   mobile IP, so that packets sent from a source to a Home Agent could
579	   be forwarded unmodified to the different address of the Mobile Node
580	   [RFC3344]. It has come to be used much more generally, e.g., to
581	   support multicast, as well as in overlay network systems
582	   [Er94][To01].

584	4.1.1. MTU discovery

586	   When an IPv4 packet arrives at an IP-in-IP ingress, the DF flag from
587	   the inner packet is copied to the outer header. This enforces DF of
588	   the packet within the tunnel when requested by the packet source.

590	   Packets which are too large are dropped at the ingress, and a
591	   corresponding ICMP "message too big" is returned to the source.
592	   Internally, IP-in-IP tunneling requires that the tunnel MUST support
593	   ICMP-based path MTU discovery (i.e., PMTUD). Note that due to common
594	   filtering of ICMP messages, compliance with this requirement is
595	   impossible to verify and thus to enforce.

597	4.1.2. Fragmentation

599	   IP-in-IP tunneling supports Inner Fragmentation. The inner packet MAY
600	   be fragmented if DF=0; otherwise the packet would have been dropped
601	   if too big, as noted earlier. The tunnel MUST NOT fragment at the
602	   outer header if DF=1 is set, i.e., this tunnel protocol assumes the
603	   network honors the DF bit (note that some tunnels, as well as some
604	   network devices, do not honor the DF bit). Further, if the DF bit is
605	   set in the inner header, it MUST be set in the outer; if not, it MAY
606	   be set in the outer.
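
   The MTU discovery and fragmentation rules above can be sketched as a
   single ingress decision. This is only an illustration under stated
   assumptions: the names ipip_ingress, Decision, and
   set_outer_df_when_clear are hypothetical, and the outer IPv4 header
   is assumed to carry no options (20 bytes).

```python
from dataclasses import dataclass

OUTER_IPV4_HEADER = 20  # bytes; assumes an outer IPv4 header without options


@dataclass
class Decision:
    action: str     # "forward", "fragment_inner", or "drop_and_signal"
    outer_df: bool  # DF value for the outer header (unused when dropping)


def ipip_ingress(inner_len: int, inner_df: bool, tunnel_mtu: int,
                 set_outer_df_when_clear: bool = False) -> Decision:
    """Decide how a hypothetical IP-in-IP ingress treats one inner packet."""
    # DF=1 in the inner header MUST be copied to the outer header;
    # with DF=0 the outer DF MAY still be set (a local choice here).
    outer_df = inner_df or set_outer_df_when_clear
    if inner_len + OUTER_IPV4_HEADER <= tunnel_mtu:
        return Decision("forward", outer_df)
    if inner_df:
        # Too large and not fragmentable: drop and return "message too big".
        return Decision("drop_and_signal", outer_df)
    # Too large but DF=0: fragment the inner packet before encapsulation.
    return Decision("fragment_inner", outer_df)
```

   For example, ipip_ingress(1490, True, 1500) selects
   "drop_and_signal", because the encapsulated packet would be 1510
   bytes and the inner DF bit forbids fragmentation.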

608	4.1.3. Signaling

610	   IP-in-IP tunnels MAY relay ICMPs from inside the tunnel to the
611	   source, i.e., at the ingress. They SHOULD relay network and host
612	   unreachable messages, and MUST relay "message too big" messages;
613	   these reflect network conditions that the source should be informed
614	   about. They MUST NOT relay port unreachable messages, because these
615	   are meaningless for encapsulated packets, and thus reflect internal
616	   link conditions that the source should not care about at all. They
617	   MUST NOT relay and SHOULD handle locally messages that affect the
618	   ingress as if it were a host, e.g., source quench and router errors.
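
   The relay rules in the paragraph above amount to a small lookup on
   the ICMP type and code. The following sketch assumes the standard
   ICMPv4 numbering (Destination Unreachable is type 3, Source Quench is
   type 4); the function name relay_policy and the returned strings are
   hypothetical, and messages not named in the text fall through to the
   general "MAY relay" rule.

```python
DEST_UNREACHABLE = 3  # ICMPv4 type: Destination Unreachable
SOURCE_QUENCH = 4     # ICMPv4 type: Source Quench

# Codes within Destination Unreachable (type 3)
NET_UNREACHABLE = 0
HOST_UNREACHABLE = 1
PORT_UNREACHABLE = 3
FRAG_NEEDED = 4       # "fragmentation needed and DF set" (message too big)


def relay_policy(icmp_type: int, icmp_code: int) -> str:
    """Return the relay rule for an ICMP received from inside the tunnel."""
    if icmp_type == DEST_UNREACHABLE:
        if icmp_code == FRAG_NEEDED:
            return "MUST relay"
        if icmp_code in (NET_UNREACHABLE, HOST_UNREACHABLE):
            return "SHOULD relay"
        if icmp_code == PORT_UNREACHABLE:
            return "MUST NOT relay"
    if icmp_type == SOURCE_QUENCH:
        # Affects the ingress as if it were a host: not relayed.
        return "handle locally"
    return "MAY relay"
```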

620	   Most notably, IP-in-IP notes that the tunnel SHOULD keep sufficient
621	   soft state to assist with relaying. Such state may involve keeping
622	   copies of recently sent packets, to have sufficient context to relay
623	   when lacking in the received ICMP message.
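
   One possible form of such soft state is a small bounded cache of
   recently sent inner headers. The sketch below is an assumption, not
   a mechanism defined by IP-in-IP: it keys entries by the outer IP ID
   (which is expected to appear in the outer header returned inside an
   ICMP message), and the class name RecentPackets and its capacity are
   hypothetical.

```python
from collections import OrderedDict
from typing import Optional


class RecentPackets:
    """Bounded cache of inner headers, keyed by the outer IP ID (assumed)."""

    def __init__(self, capacity: int = 1024) -> None:
        self.capacity = capacity
        self._by_outer_id: "OrderedDict[int, bytes]" = OrderedDict()

    def remember(self, outer_ip_id: int, inner_header: bytes) -> None:
        # Store (or refresh) the entry and mark it as most recent.
        self._by_outer_id[outer_ip_id] = inner_header
        self._by_outer_id.move_to_end(outer_ip_id)
        # Soft state: silently discard the oldest entries when full.
        while len(self._by_outer_id) > self.capacity:
            self._by_outer_id.popitem(last=False)

    def lookup(self, outer_ip_id: int) -> Optional[bytes]:
        # Returns None when the context has already been discarded.
        return self._by_outer_id.get(outer_ip_id)
```

   Because the state is soft, a failed lookup simply means the ICMP
   cannot be relayed with full context; nothing else depends on it.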

625	4.2. IPsec

627	   The Internet network security standard, IPsec, incorporates IP-in-IP
628	   encapsulation as part of its tunnel mode of operation [RFC4301].
629	   Although IP-in-IP packets can be secured via IPsec transport mode,
630	   resulting in identical packets [RFC3884], the rules affecting IPsec
631	   tunnel mode MTU discovery, fragmentation, and signaling are
632	   specified by IPsec, rather than IP-in-IP.

634	4.2.1. MTU discovery

636	   Tunnel mode IPsec supports ICMP-based path MTU discovery (PMTUD),
637	   but only as a SHOULD. If an IPv4 packet arrives with DF=1, or an
638	   IPv6 packet arrives, and either is too large for the tunnel, the
639	   ingress SHOULD discard it and send an ICMP to the source. If the
640	   packet is IPv4 with DF=0, the ingress SHOULD perform Outer
641	   Fragmentation, and SHOULD NOT send an ICMP to the source.

643	4.2.2. Fragmentation

645	   IPsec performs only Outer Fragmentation; this distinguishes it from
646	   IP-in-IP, which performs only Inner Fragmentation.

648	   IPsec requires that implementations of tunnel mode allow the security
649	   policy to decide how the IPv4 DF bit should propagate from the inner
650	   to the outer header. It may be copied, cleared, or set; this again
651	   differs from IP-in-IP, which allows only copy or set.
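
   The IPsec tunnel mode rules for MTU discovery and DF propagation
   described above can be sketched as follows. The names DfPolicy,
   outer_df, and ipsec_ingress are hypothetical, and encapsulated_len is
   assumed to already include all tunnel and security overhead.

```python
from enum import Enum


class DfPolicy(Enum):
    COPY = "copy"    # outer DF follows the inner DF
    CLEAR = "clear"  # outer DF is always 0
    SET = "set"      # outer DF is always 1


def outer_df(inner_df: bool, policy: DfPolicy) -> bool:
    """Apply the security policy's DF rule to the outer IPv4 header."""
    if policy is DfPolicy.COPY:
        return inner_df
    return policy is DfPolicy.SET


def ipsec_ingress(ip_version: int, inner_df: bool,
                  encapsulated_len: int, tunnel_mtu: int) -> str:
    """Choose the (SHOULD-level) ingress action for one packet."""
    if encapsulated_len <= tunnel_mtu:
        return "forward"
    if ip_version == 6 or inner_df:
        # IPv6, or IPv4 with DF=1: discard and signal the source.
        return "discard_and_send_icmp"
    # IPv4 with DF=0: outer fragmentation, no ICMP to the source.
    return "fragment_outer"
```

   Unlike the IP-in-IP case, the CLEAR policy lets the outer header
   carry DF=0 even when the inner header has DF=1.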

653	4.2.3. Signaling

655	   IPsec, like IP-in-IP, relays ICMP "message too big" signals from the
656	   ingress back to the source. The size indicated is adjusted to
657	   account for the space needed for both encapsulation and security
658	   information. Further, it allows that any ICMP message may be blocked,
659	   on a per-security association basis; this filtering is for security
660	   reasons, but also can directly result in "black holing".
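
   The size adjustment described above is a simple subtraction. In the
   sketch below, the function name reported_mtu is hypothetical, and the
   overhead values used in the example are illustrative assumptions; the
   actual security overhead depends on the algorithms negotiated for the
   security association.

```python
def reported_mtu(tunnel_path_mtu: int, encap_overhead: int,
                 security_overhead: int) -> int:
    """MTU to report to the source in a relayed "message too big"."""
    return tunnel_path_mtu - encap_overhead - security_overhead


# Illustrative values only: a 1500-byte tunnel path MTU, a 20-byte outer
# IPv4 header, and an assumed 36 bytes of security headers and trailers.
example = reported_mtu(1500, 20, 36)
```

   With these assumed values the source would be told that packets of up
   to 1444 bytes fit through the tunnel.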

662	5. Issues

664	   As has been shown in only two examples, even similar mechanisms for
665	   encapsulation can result in very different approaches to tunneling.
666	   The differences in their MTU discovery, fragmentation, and signaling
667	   mechanisms result from different architectural perspectives on the
668	   role of tunnels in the Internet.
669	   This section discusses these more fundamental perspectives, and their
670	   impact on the mechanisms.

672	5.1. Tunnel model

674	   The Internet architecture is composed of hosts, gateways (i.e.,
675	   routers), and links [Cl88]. A host is a source or sink of network
676	   packet traffic, a router redirects packets from one set of links to
677	   another, and links interconnect hosts and routers. Although
678	   originally described for the Internet's network layer, this
679	   architecture, with a bit of renaming (e.g., routers become bridges),
680	   applies equally well for link layers.

682	   Tunnels could, in principle, be related to this basic model in one of
683	   three ways:

685	   o  Tunnel as a link

687	   o  Tunnel as a router/bridge

689	   o  Tunnel as invisible

691	   Tunnels require distinct ingress and egress addresses, to use during
692	   encapsulation, and to direct encapsulated traffic from the ingress to
693	   the egress. As a result, a tunnel is most usefully considered a link
694	   in the architecture in which it is deployed. Tunnel designers should
695	   therefore consider and apply link design issues [RFC3819].
696	   This also implies that operating system designers should represent
697	   tunnels as links, which can conveniently be done using virtual
698	   interfaces.

700	   [this includes tunnel as point-point vs. tunnel as multipoint]

702	5.2. Parties participating

704	   The description of a tunnel focuses on the functions of the ingress
705	   and egress, but not all functions need be located at one of these two
706	   points. Recall inner fragmentation, in which fragment reassembly
707	   occurs at the destination, not the egress - this imposes load on the
708	   destination as a result of behavior of the ingress.

710	   Containing all tunnel functions solely inside the tunnel endpoints,
711	   as with outer fragmentation, is architecturally clean. It also obeys
712	   the 'clean up your own mess' principle; the impact of encapsulation
713	   and fragmentation caused by the ingress is then handled by the
714	   egress, without imposing load on the destination.

716	   Distributing tunnel functions across both egress and destination, as
717	   with inner fragmentation, can be more efficient. The impact of the
718	   limited IPv4 IP ID space is more prominent in the outer header, due
719	   to aggregation of traffic at the ingress. Using the inner header for
720	   fragmentation allows use of a larger effective IP ID space because of
721	   the additional IP source/destination addresses present there.
722	   Reassembly can be distributed among a large number of destinations
723	   (where present), and the impact of reassembly can be isolated to only
724	   affected destinations. Further, fragmenting once at the ingress can
725	   avoid repeated fragmentation/reassembly steps when packets traverse
726	   multiple tunnels in succession.

728	   The primary case in favor of distributed tunnel functions, and thus
729	   inner fragmentation, is that high-speed ingress devices can be
730	   implemented, but that corresponding high-speed egresses are difficult
731	   or costly. Unfortunately, network operators cannot always know in
732	   advance whether high-speed ingresses are being deployed where the
733	   destination traffic is sufficiently diffuse; deploying such a device
734	   where the traffic focuses on a single destination puts an undue
735	   burden on that destination.

737	6. Potential Ways Forward

739	   There are a number of issues which may benefit from a coordinated
740	   review. These include unification of various tunneling standards, and
741	   revision of tunnel standards to address:

743	   o  Relation of inner/outer headers (i.e., which fields are copied,
744	      derived, etc.)

746	   o  MTU discovery

748	   o  Fragmentation

750	   o  Signaling

752	   This revision may suggest the utility of a single, configurable
753	   tunnel mechanism that includes various solutions as alternatives,
754	   rather than developing custom tunnel solutions on-demand. It may also
755	   suggest the development of new solutions, such as:

757	   o  The use of PLPMTUD for tunnels

759	   o  Addressing the IP ID issue and fragmentation

761	   o  New ICMP signals

763	   o  Optimization solutions, such as packing

765	   SEAL addresses a few of these issues, notably the first two
766	   [RFC5320]. It adds an active signal exchange between ingress and
767	   egress for intra-tunnel MTU discovery, and an extension to the IP ID
768	   space to detect collisions.

770	   Tunnels are further evidence that the current requirements for IPv4
771	   ID uniqueness may need revision. In particular, it is clear that even
772	   moderate speed transport connections already violate these
773	   requirements. We recommend revisiting the requirements as suggested
774	   in [To10].
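
   The scale of the problem follows from the 16-bit size of the IPv4 ID
   field. The arithmetic below is a sketch under stated assumptions:
   1500-byte packets, a 120-second period during which IDs must not
   repeat for a given source/destination/protocol triple, and
   hypothetical function names.

```python
ID_SPACE = 2 ** 16  # distinct values of the 16-bit IPv4 ID field


def max_unique_rate_bps(packet_bytes: int, lifetime_s: float) -> float:
    """Highest rate that never reuses an ID within lifetime_s seconds."""
    return ID_SPACE * packet_bytes * 8 / lifetime_s


def wrap_time_s(link_bps: float, packet_bytes: int) -> float:
    """Time until the ID field wraps when sending at link_bps."""
    return ID_SPACE * packet_bytes * 8 / link_bps


limit = max_unique_rate_bps(1500, 120)  # 6553600.0 bits per second
wrap = wrap_time_s(1e9, 1500)           # under one second at 1 Gb/s
```

   Under these assumptions, a flow faster than roughly 6.5 Mb/s cannot
   keep its IDs unique, and an ingress that aggregates many flows into
   one outer source/destination pair reaches that rate sooner.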

776	   Note that this document does not argue for a single, generic
777	   tunneling protocol or mechanism. Such a mechanism is no more likely
778	   to be useful than would a 'one size fits all' transport protocol. It
779	   does argue, however, for consistency in tunnel design, and
780	   abstraction and reuse of mechanism where possible.

782	7. Notes for future updates

784	   [This area includes notes for future updates which have been reported
785	   but not yet fully included - it represents a holding area for
786	   comments, and should not appear in the final document.]

788	   tunnel as virtualization - Stewart Bryant (SB)

790	   tunnel as endpoint only, not on-path (not MPLS, e.g.) - JT/coauthor

792	   gigE packing like PWE3 ATM packing - SB

794	   PPP chopping and coalescing - MT/coauthor

796	   end sec 2 "we need large seq num and to frag at the tunnel" / maybe,
797	   but do we want recommendations? - SB

799	   security should add addr management and ACLs (?) - SB

801	   MTU as part of BGP? - SB (Will this even work - JT)

803	   section 2 it says: "The IPv6 fragment header is present only when a
804	   packet has been fragmented", but I know of at least one effort in
805	   MANET that is proposing to include the fragment header even for
806	   unfragmented IPv6 packets. That would seem to bend the rules set
807	   forth in RFC2460, but I just thought it might be worth pointing out
808	   that some people are considering bending them. - Fred Templin

810	   NATs - i.e., One other thought; where the IP ID problem becomes truly
811	   pathological is for tunnels that traverse IPv4 NATs. First, the NATs
812	   could rewrite the ID to something the ingress tunnel endpoint never
813	   intended. Secondly, multiple ingress tunnel endpoints that traverse
814	   the same NAT could have IP ID "collisions" from the perspective of
815	   the outside world.  This may deserve a section unto itself? - FT

817	   NAT as half-tunnel - JT

819	   tunnel endpoint as following host rules - JT (as with ECN in CAPWAP,
820	   per Magnus' email of 10/10/08)

822	   the need for larger min MTU - FT (see SEAL)
823	   describe relationship to [Ho08] - JT (as per INTAREA meeting notes,
824	   don't cover Teredo-specific issues in Ho08, but include generic
825	   issues here)

827	8. Security Considerations

829	   Tunnels may introduce vulnerabilities, or add to the potential for
830	   receiver overload and thus DoS attacks. These issues are primarily
831	   related to the fact that a tunnel is a link that traverses a network
832	   path, and to fragmentation and reassembly. Regarding ICMP signals,
833	   tunnels have similar security issues to routers, in that they SHOULD
834	   throttle ICMPs sent to a given source, and SHOULD send ICMPs that
835	   correspond to events inside the tunnel. Such ICMPs MUST have the
836	   tunnel ingress IP address as the source IP, because IP addresses
837	   inside a tunnel path may have no meaning outside the tunnel.

839	   Tunnels traverse multiple hops of a network path from ingress to
840	   egress. Traffic along such tunnels may be susceptible to on-path and
841	   off-path attacks, including fragment injection, reassembly buffer
842	   overload, and ICMP attacks. Some of these attacks may not be as
843	   visible to the endpoints of the architecture into which tunnels are
844	   deployed, and may therefore be more difficult to detect.

847	   Inner fragmentation can present an undue burden on destinations where
848	   traffic is not sufficiently diffuse; tunnels SHOULD NOT employ inner
849	   fragmentation except where such diffusion is confirmed either by the
850	   tunnel mechanism or network designer. All tunnel fragmentation -
851	   inner and outer - MUST obey all existing fragmentation requirements,
852	   i.e., IPv6 tunnels MUST NOT employ inner fragmentation, and IPv4
853	   tunnels MUST NOT use inner fragmentation where the inner header DF=1.

855	   Tunnels MUST obey all existing IP requirements, such as the
856	   uniqueness of the IP ID field, unless exempted or until those
857	   requirements are revoked. Failure to either limit encapsulation
858	   traffic or use additional ingress/egress IP addresses can result in
859	   high-speed traffic fragments being incorrectly reassembled.

861	9. IANA Considerations

863	   This document has no IANA considerations.

865	   The RFC Editor should remove this section prior to publication.

867	10. References

869	10.1. Normative References

871	   [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
872	             Requirement Levels", BCP 14, RFC 2119, March 1997.

874	10.2. Informative References

876	   [Cl88]    Clark, D., "The design philosophy of the DARPA internet
877	             protocols," Proc. Sigcomm 1988, p.106-114, 1988.

879	   [Er94]    Eriksson, H., "MBone: The Multicast Backbone,"
880	             Communications of the ACM, Aug. 1994, pp.54-60.

882	   [Fa10]    Farinacci, D., V. Fuller, D. Meyer, D. Lewis, "Locator/ID
883	             Separation Protocol (LISP)," (work in progress), draft-
884	             ietf-lisp-06, Jan. 2010.

886	   [Ho08]    Hoagland, J., S. Krishnan, D. Thaler, "Security Concerns
887	             With IP Tunneling," (work in progress), draft-ietf-v6ops-
888	             tunnel-security-concerns-01, Oct. 2008.

890	   [Pe10]    Perlman, R., D. Eastlake, D. Dutt, S. Gai, A. Ghanwani,
891	             "RBridges: Base Protocol Specification," (work in
892	             progress), draft-ietf-trill-rbridge-protocol-15, Jan.
893	             2010.

895	   [RFC791]  Postel, J., "Internet Protocol," RFC 791 / STD 5, September
896	             1981.

898	   [RFC1122] Braden, R., Ed., "Requirements for Internet Hosts -
899	             Communication Layers," RFC 1122 / STD 3, October 1989.

901	   [RFC1191] Mogul, J., S. Deering, "Path MTU discovery," RFC 1191,
902	             November 1990.

904	   [RFC2003] Perkins, C., "IP Encapsulation within IP," RFC 2003,
905	             October 1996.

907	   [RFC2923] Lahey, K., "TCP Problems with Path MTU Discovery," RFC
908	             2923, September 2000.

910	   [RFC3344] Perkins, C., Ed., "IP Mobility Support for IPv4," RFC 3344,
911	             August 2002.

913	   [RFC3819] Karn, P., Ed., C. Bormann, G. Fairhurst, D. Grossman, R.
914	             Ludwig, J. Mahdavi, G. Montenegro, J. Touch, L. Wood,
915	             "Advice for Internet Subnetwork Designers," RFC 3819 / BCP
916	             89, July 2004.

918	   [RFC3884] Touch, J., L. Eggert, Y. Wang, "Use of IPsec Transport Mode
919	             for Dynamic Routing," RFC 3884, September 2004.

921	   [RFC3931] Lau, J., Ed., M. Townsley, Ed., I. Goyret, Ed., "Layer Two
922	             Tunneling Protocol - Version 3 (L2TPv3)," RFC 3931, March
923	             2005.

925	   [RFC4176] El Mghazli, Y., Ed., T. Nadeau, M. Boucadair, K. Chan, A.
926	             Gonguet, "Framework for Layer 3 Virtual Private Networks
927	             (L3VPN) Operations and Management," RFC 4176, October 2005.

929	   [RFC4301] Kent, S., and K. Seo, "Security Architecture for the
930	             Internet Protocol," RFC 4301, December 2005.

932	   [RFC4459] Savola, P., "MTU and Fragmentation Issues with In-the-
933	             Network Tunneling," RFC 4459, April 2006.

935	   [RFC4664] Andersson, L., Ed., E. Rosen, Ed., "Framework for Layer 2
936	             Virtual Private Networks (L2VPNs)," RFC 4664, September
937	             2006.

939	   [RFC4821] Mathis, M., J. Heffner, "Packetization Layer Path MTU
940	             Discovery," RFC 4821, March 2007.

942	   [RFC4963] Heffner, J., M. Mathis, B. Chandler, "IPv4 Reassembly
943	             Errors at High Data Rates," RFC 4963, July 2007.

945	   [RFC5320] Templin, F., Ed., "The Subnetwork Encapsulation and
946	             Adaptation Layer (SEAL)," RFC 5320, Feb. 2010.

948	   [RFC5556] Touch, J., R. Perlman, "Transparently Interconnecting Lots
949	             of Links (TRILL): Problem and Applicability Statement," RFC
950	             5556, May 2009.

952	   [To01]    Touch, J., "Dynamic Internet Overlay Deployment and
953	             Management Using the X-Bone," Computer Networks, July 2001,
954	             pp. 117-135.

956	   [To10]    Touch, J., "Updated Specification of the IPv4 ID Field,"
957	             (work in progress), draft-touch-intarea-ipv4-id-update,
958	             Feb. 2010.

960	11. Acknowledgments

962	   This document originated as the result of numerous discussions among
963	   the authors, Jari Arkko, Stewart Bryant, Lars Eggert, Dino Farinacci,
964	   Matt Mathis, and Fred Templin, as well as members participating in
965	   the Internet Area Working Group.

967	   This document was prepared using 2-Word-v2.0.template.dot.

969	Authors' Addresses

971	   Joe Touch
972	   USC/ISI
973	   4676 Admiralty Way
974	   Marina del Rey, CA 90292-6695
975	   U.S.A.

977	   Phone: +1 (310) 448-9151
978	   Email: touch@isi.edu

980	   W. Mark Townsley
981	   Cisco
982	   L'Atlantis, 11, Rue Camille Desmoulins
983	   Issy Les Moulineaux, ILE DE FRANCE 92782

985	   Email: townsley@cisco.com