Internet Area WG                                             J. Touch
Internet Draft                                                USC/ISI
Intended status: Informational                            M. Townsley
Expires: September 2010                                         Cisco
                                                       March 26, 2010

                 Tunnels in the Internet Architecture
                  draft-ietf-intarea-tunnels-00.txt

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   This document may contain material from IETF Documents or IETF
   Contributions published or made publicly available before November
   10, 2008. The person(s) controlling the copyright in some of this
   material may not have granted the IETF Trust the right to allow
   modifications of such material outside the IETF Standards Process.
   Without obtaining an adequate license from the person(s) controlling
   the copyright in such materials, this document may not be modified
   outside the IETF Standards Process, and derivative works of it may
   not be created outside the IETF Standards Process, except to format
   it for publication as an RFC or to translate it into languages other
   than English.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

   This Internet-Draft will expire on September 26, 2010.

Copyright Notice

   Copyright (c) 2010 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Abstract

   This document discusses the role of tunnels in the Internet
   architecture. It explains their relationship to existing protocol
   layers, and the challenges in supporting tunneling.

Table of Contents

   1. Introduction
   2. Conventions used in this document
   3. Known Issues
      3.1. MTU discovery
      3.2. Fragmentation
         3.2.1. Outer Fragmentation
         3.2.2. Inner Fragmentation
         3.2.3. Fragmentation efficiency
         3.2.4. Packing (ala GigE bursting)
         3.2.5. IP ID exhaustion
      3.3. Signaling
   4. Current Tunnel Standards
      4.1. IP in IP
         4.1.1. MTU discovery
         4.1.2. Fragmentation
         4.1.3. Signaling
      4.2. IPsec
         4.2.1. MTU discovery
         4.2.2. Fragmentation
         4.2.3. Signaling
   5. Issues
      5.1. Tunnel model
      5.2. Parties participating
   6. Potential Ways Forward
   7. Notes for future updates
   8. Security Considerations
   9. IANA Considerations
   10. References
      10.1. Normative References
      10.2. Informative References
   11. Acknowledgments

1. Introduction

   The Internet is loosely based on the ISO seven layer stack, in which
   data units traverse the stack by being wrapped inside data units one
   layer down (Figure 1). A tunnel is a mechanism for transmitting data
   units between endpoints by wrapping them inside data units of other
   layers, e.g., IP in IP, or IP in UDP (Figure 2).

   +------+----+-----+--------------+
   | Eth  | IP | TCP |     Data     |
   +------+----+-----+--------------+

               Figure 1 TCP inside IP inside Ethernet

   +------+----+-----+----+-----+--------------+
   | Eth  | IP'| UDP | IP | TCP |     Data     |
   +------+----+-----+----+-----+--------------+

               Figure 2 IP in UDP in IP in Ethernet

   Tunnels help decouple topology from that provided by the physical
   network components. For example, they were critical in the
   development of multicast, where not all routers were capable of
   processing multicast packets. Multicast routers were interconnected
   by tunnels where not directly connected. Similar techniques have been
   used to support other protocols, such as IPv6.

   Use of tunnels is common in the Internet.
   The word "tunnel" occurs in over 100 RFCs, and tunneling is supported
   within numerous protocols, including:

   o IPsec - hides the original traffic destination [RFC4301]

   o L2TP - tunnels PPP over IP, used largely in DSL/FTTH access
     networks to extend a subscriber's connection from an access line
     provider to an ISP [RFC3931]

   o Mobile IP - forwards traffic to the home agent [RFC2003]

   o L2VPNs - provide a link topology different from that provided by
     physical links [RFC4664]

   o L3VPNs - provide a network topology different from that provided
     by ISPs [RFC4176]

   o SEAL - a generic mechanism for IP in IP tunneling designed to
     overcome the limitations of RFC2003 [RFC5320]

   o LISP - reduces routing table load within an enclave of routers
     [Fa10]

   o TRILL - enables L3 routing in an enclave of bridges
     [Pe10][RFC5556]

   o MPLS - ? {need description/ref}

   o PWE3 - ? {need description/ref}

   The variety of tunnel mechanisms raises the question of the roles of
   tunnels in the Internet architecture, and the potential need for
   coordination of these mechanisms. In particular, the ways in which
   MTU mismatch and error signals (e.g., ICMP) are handled may benefit
   from a coordinated approach.

   It is useful to note that, regardless of the layer in which
   encapsulation occurs, tunnels emulate a link. As links, they are
   subject to link issues, e.g., MTU discovery, signaling, and the
   potential utility of native support for broadcast and multicast
   [RFC3819]. They have advantages over native links, being potentially
   easier to reconfigure and control.

2. Conventions used in this document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC-2119 [RFC2119].

3. Known Issues

   Most of the known issues with tunnels arise from the complications of
   encapsulation, or from the introduction of artificial endpoints along
   a data path. Encapsulation exacerbates MTU issues, often because a
   data unit will traverse at least one layer of a protocol stack more
   than once (e.g., as in Figure 2), which requires space for additional
   headers. This space complicates MTU discovery, and often results in
   fragmentation.

   Tunnel encapsulation and decapsulation nodes act as network
   endpoints. They may source and sink much higher bandwidth streams
   from single IP addresses, and thus can be affected by many of the
   issues of other high bandwidth edge devices, such as fragmentation
   efficiency and IP ID exhaustion (in IPv4). These endpoints also
   introduce complexity in end-to-end and path signaling, in the
   translation between signals inside a tunnel and signals outside on
   the end-to-end path.

3.1. MTU discovery

   MTU discovery is a known challenge in the current Internet, and
   tunnels can complicate its proper operation. Encapsulation increases
   the size of a packet during tunnel transit, so that it can exceed the
   MTU of the links on the tunnel path. This is especially true for
   recursive tunnels, i.e., tunnels that reuse layers of the protocol
   stack (e.g., IPv4 over IPv4). These issues are discussed in detail in
   [RFC4459]; the following provides a brief overview of the issues.
   Note that the impact of tunnels on MTU discovery may be mitigated
   somewhat by the ubiquity of workarounds already needed in the
   Internet, e.g., the deduction of a 'tunnel tax' for all MTUs (i.e.,
   capping the MTU at 1200-1400 bytes, rather than 1500).

   Conventional path MTU discovery (PMTUD) relies on explicit negative
   feedback from routers along the path (ICMP "message too big" signals)
   [RFC1191].
   This technique is susceptible to the "black hole" phenomenon, in
   which the ICMP messages never return to the source [RFC2923]. In the
   typical Internet case, lost ICMPs are often the result of filtering,
   e.g., for policy reasons.

   A more recent alternative is packetization-layer path MTU discovery
   (PLPMTUD) [RFC4821]. This variant relies on feedback from the
   endpoint, indicating either the success or failure of probe packets.
   It is not susceptible to "black holing", but requires explicit
   participation by the receiver.

   Either of these techniques (PMTUD, PLPMTUD) can be applied to
   tunnels. The encapsulator must react to "message too big" signals in
   either case, by either adjusting its fragmentation, relaying a
   corresponding signal to the packet origin outside the tunnel, or
   both. Fragmentation adjustment is easy to incorporate, but can result
   in inefficient transmission of packets over the tunnel (e.g., where
   every source packet is fragmented). Relaying the signal to the source
   can be much more efficient, but it can be difficult to determine what
   signal to forward. For example, in PMTUD, routers along the tunnel
   may not return a sufficiently long prefix of the offending packet to
   determine the origin of the decapsulated packet.

   Tunnels thus may need to participate in MTU discovery, either
   forwarding or recomputing ICMPs received inside the tunnel path. The
   tunnel may incorporate its own MTU discovery between ingress and
   egress, e.g., as proposed in SEAL [RFC5320].

3.2. Fragmentation

   There are two places where fragmentation can occur in a tunnel,
   called Outer Fragmentation and Inner Fragmentation.

3.2.1. Outer Fragmentation

   The simplest case is Outer Fragmentation, as shown in Figure 3. The
   bottom of the figure shows the network topology, where packets start
   at the source, enter the tunnel at the encapsulator, exit the tunnel
   at the decapsulator, and arrive finally at the destination.
   The packet traffic is shown above the topology, where the end-to-end
   packets are shown at the top. The packets are composed of an inner
   header (iH) and inner data (iD); the term "inner" is relative to the
   tunnel, as will become apparent. When the packet (iH,iD) arrives at
   the encapsulator, it is placed inside the tunnel packet structure,
   here shown as adding just an outer header, oH, in step (a).

   When the encapsulated packet exceeds the MTU of the tunnel, the
   packet needs to be fragmented. In this case we fragment the packet at
   the outer header, with the fragments shown as (b1) and (b2). Note
   that the outer header indicates fragmentation (as ' and "), the inner
   header occurs only in the first fragment, and the inner data is
   broken across the two packets. These fragments are reassembled at the
   decapsulator in step (c), and the resulting packet is decapsulated
   and sent on to the destination.

   +----+----+                                            +----+----+
   | iH | iD |------+ - - - - - - - - - - - - - - +------>| iH | iD |
   +----+----+      |                             |       +----+----+
                    v                             |
             +----+----+----+              +----+----+----+
         (a) | oH | iH | iD |              | oH | iH | iD | (c)
             +----+----+----+              +----+----+----+
                    |                             ^
                    |      +----+----+-----+      |
         (b1)       +----->| oH'| iH | iD1 |------+
                    |      +----+----+-----+      |
                    |                             |
                    |      +----+-----+           |
         (b2)       +----->| oH"| iD2 |-----------+
                           +----+-----+

   +-----+        +---+                         +---+        +-----+
   |     |       /     \ ===================== /     \       |     |
   | Src |=======| Enc |=======================| Dec |=======| Dst |
   |     |       \     / ===================== \     /       |     |
   +-----+        +---+                         +---+        +-----+

              Figure 3 Fragmentation of the outer packet

   Outer fragmentation isolates Source and Destination from tunnel
   encapsulation duties. This can be considered a benefit in clean,
   layered network design, but it may also complicate decapsulator
   design, especially where tunnels aggregate large amounts of traffic
   and encounter issues such as IP ID overload (see Sec. 3.2.5).
   Outer fragmentation is valid for any tunnel encapsulation protocol
   that supports fragmentation (e.g., IPv4 or IPv6), where the tunnel
   endpoints act as the host endpoints of that protocol.

   Along the tunnel, the inner header is contained only in the first
   fragment, which can interfere with mechanisms that 'peek' into lower
   layer headers, e.g., as for ICMP, as discussed in Sec. 3.3.

3.2.2. Inner Fragmentation

   Inner Fragmentation distributes the impact of tunneling across both
   the decapsulator and destination, and is shown in Figure 4. Again,
   the network topology is shown at the bottom of the figure, and the
   original packets are shown at the top. Packets arrive at the
   encapsulator, and are fragmented there based on the inner header into
   (a1) and (a2). The fragments arrive at the decapsulator, which
   removes the outer header and forwards the resulting fragments on to
   the destination. The destination is then responsible for reassembling
   the fragments into the original packet.

   +----+----+                                            +----+----+
   | iH | iD |------+- - - - - - - - - - - - - - - - - - >| iH | iD |
   +----+----+      |                                     +----+----+
                    v                                          ^
              +----+-----+                 +----+-----+        |
         (a1) | iH'| iD1 |                 | iH'| iD1 |--------+
              +----+-----+                 +----+-----+        |
                                                               |
              +----+-----+                 +----+-----+        |
         (a2) | iH"| iD2 |                 | iH"| iD2 |--------+
              +----+-----+                 +----+-----+
                    |                             ^
                    |      +----+----+-----+      |
         (b1)       +----->| oH | iH'| iD1 |------+
                    |      +----+----+-----+      |
                    |                             |
                    |      +----+----+-----+      |
         (b2)       +----->| oH | iH"| iD2 |------+
                           +----+----+-----+

   +-----+        +---+                         +---+        +-----+
   |     |       /     \ ===================== /     \       |     |
   | Src |=======| Enc |=======================| Dec |=======| Dst |
   |     |       \     / ===================== \     /       |     |
   +-----+        +---+                         +---+        +-----+

              Figure 4 Fragmentation of the inner packet

   As noted, inner fragmentation distributes the effort of tunneling
   across the decapsulator and destinations; this can be especially
   important when the tunnel aggregates large amounts of traffic. Note
   that this mechanism is valid only when the original source packets
   can be fragmented on-path, e.g., as in IPv4.

   Along the tunnel, the inner headers are copied into each fragment,
   and so are available to mechanisms that 'peek' into headers (e.g.,
   ICMP, as discussed in Sec. 3.3). Because fragmentation happens on the
   inner header, the impact of IP ID exhaustion (see Sec. 3.2.5) is
   reduced.

3.2.3. Fragmentation efficiency

   There are different ways to fragment a packet. Consider a network
   with an MTU as shown in Figure 5, where packets are encapsulated over
   the same network layer as they arrive on (e.g., IP in IP). If a
   packet as large as the MTU arrives, it must be fragmented to
   accommodate the additional header.

       X===========================X (MTU)
       +----+----------------------+
       | iH | DDDDDDDDDDDDDDDDDDDD |
       +----+----------------------+
         |
         |  X===========================X (MTU)
         |  +---+----+------------------+
     (a) +->| H'| iH | DDDDDDDDDDDDDDDD |
         |  +---+----+------------------+
         |      |
         |      |  X===========================X (MTU)
         |      |  +----+---+----+-------------+
         | (a1) +->| nH'| H | iH | DDDDDDDDDDD |
         |      |  +----+---+----+-------------+
         |      |
         |      |  +----+-------+
         | (a2) +->| nH"| DDDDD |
         |         +----+-------+
         |
         |  +---+------+
     (b) +->| H"| DDDD |
            +---+------+
                |
                |  +----+---+------+
           (b1) +->| nH'| H"| DDDD |
                   +----+---+------+

                 Figure 5 Fragmenting via maximum fit

   Figure 5 shows this process, using Outer Fragmentation as an example
   (the situation is the same for Inner Fragmentation, but the headers
   that are affected differ). The arriving packet is first split into
   (a) and (b), where (a) is the size of the MTU of the network.
   However, this tunnel then traverses another tunnel, whose impact the
   first tunnel ingress has not accommodated. The packet (a) arrives at
   the second tunnel ingress, and needs to be encapsulated again, but
   because it is already at the MTU, it needs to be fragmented as well,
   into (a1) and (a2). In this case, packet (b) arrives at the second
   tunnel ingress and is encapsulated into (b1) without fragmentation,
   because it is already below the MTU size.

   In Figure 6, the fragmentation is done evenly, i.e., by splitting the
   original packet into two roughly equal-sized components, (c) and (d).
   Note that (d) contains more packet data than (c), because (c) also
   carries the original packet header; this is an example of Outer
   Fragmentation. The packets (c) and (d) arrive at the second tunnel
   encapsulator, and are encapsulated again; this time, neither packet
   exceeds the MTU, and neither requires further fragmentation.
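
   The difference between the two strategies can be illustrated
   numerically. The following is a minimal sketch rather than part of
   any tunnel specification: the function names, the 20-byte header
   size, and the 1500-byte MTU are assumptions chosen for illustration,
   and the 8-byte alignment that IPv4 requires of fragment offsets is
   ignored for simplicity.

```python
def split_max_fit(length, mtu, hdr):
    """Fill each fragment up to the MTU; the last one carries the remainder."""
    room = mtu - hdr                      # payload bytes that fit beside one header
    sizes = []
    while length > 0:
        take = min(room, length)
        sizes.append(take)
        length -= take
    return sizes


def split_even(length, mtu, hdr):
    """Use as few fragments as maximum fit would, but of near-equal size."""
    room = mtu - hdr
    count = -(-length // room)            # ceiling division
    base, extra = divmod(length, count)
    return [base + (1 if i < extra else 0) for i in range(count)]


def through_two_tunnels(length, mtu, hdr, split):
    """Payload sizes carried by the second tunnel's packets (outer fragmentation)."""
    carried = []
    for frag in split(length, mtu, hdr):
        # A first-tunnel packet (header plus fragment) is the second tunnel's payload.
        carried.extend(split(frag + hdr, mtu, hdr))
    return carried


# Assumed values: an MTU-sized packet arrives; each tunnel adds a 20-byte header.
MTU, HDR = 1500, 20
max_fit = through_two_tunnels(MTU, MTU, HDR, split_max_fit)
even = through_two_tunnels(MTU, MTU, HDR, split_even)
print(len(max_fit), max_fit)   # three packets, as with (a1), (a2), (b1)
print(len(even), even)         # two packets, as with (c1), (d1)
```

   Under these assumptions, maximum fit sends three packets through the
   second tunnel while even splitting sends two, which matches the
   packet counts in Figures 5 and 6.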

       X===========================X (MTU)
       +----+----------------------+
       | iH | DDDDDDDDDDDDDDDDDDDD |
       +----+----------------------+
         |
         |  X===========================X (MTU)
         |  +---+----+----------+
     (c) +->| H'| iH | DDDDDDDD |
         |  +---+----+----------+
         |      |
         |      |  X===========================X (MTU)
         |      |  +----+---+----+----------+
         | (c1) +->| nH | H'| iH | DDDDDDDD |
         |         +----+---+----+----------+
         |
         |  +---+--------------+
     (d) +->| H"| DDDDDDDDDDDD |
            +---+--------------+
                |
                |  +----+---+--------------+
           (d1) +->| nH | H"| DDDDDDDDDDDD |
                   +----+---+--------------+

                     Figure 6 Fragmenting evenly

3.2.4. Packing (ala GigE bursting)

   Encapsulating individual packets to traverse a tunnel can be
   inefficient, especially where headers are large relative to the
   packets being carried. In that case, it can be more efficient to
   encapsulate many small packets in a single, larger tunnel payload.
   This technique, similar to the effect of packet bursting in Gigabit
   Ethernet, reduces the overhead of the encapsulation headers (Figure
   7). It reduces the work of header addition and removal at the tunnel
   endpoints, but increases other work involving the packing and
   unpacking of the component packets carried.

        +-----+-----+
        | iHa | iDa |
        +-----+-----+
              |
              |     +-----+-----+
              |     | iHb | iDb |
              |     +-----+-----+
              |           |
              |           |     +-----+-----+
              |           |     | iHc | iDc |
              |           |     +-----+-----+
              |           |           |
              v           v           v
   +----+-----+-----+-----+-----+-----+-----+
   | oH | iHa | iDa | iHb | iDb | iHc | iDc |
   +----+-----+-----+-----+-----+-----+-----+

                Figure 7 Packing packets into a tunnel

3.2.5. IP ID exhaustion

   In IPv4, the IP Identification (ID) field is a 16-bit value that is
   unique for every packet for a given source address, destination
   address, and protocol, such that it does not repeat within the
   Maximum Segment Lifetime (MSL) [RFC791][RFC1122].
   Although the ID field was originally intended for fragmentation and
   reassembly, it can also be used to detect and discard duplicate
   packets, e.g., at congested routers (see Sec. 3.2.1.5 of [RFC1122]).
   For this reason, and even more so because IPv4 packets can be
   fragmented anywhere along a path, all packets between a source and
   destination of a given protocol must have unique ID values over a
   period of an MSL, which is typically interpreted as two minutes (120
   seconds).

   The uniqueness of the IP ID is a known problem for high speed
   devices, because it limits the speed of a single protocol between two
   endpoints [RFC4963]. With the maximum IP packet size of 64KB, a
   16-bit ID field that does not repeat within 120 seconds means that
   the aggregate rate of all TCP connections between two endpoints is
   limited to roughly 286 Mbps; for more typical MTUs of 1500 bytes,
   this drops to 6.4 Mbps.

   Although this strongly suggests that the uniqueness of the IP ID is
   moot in practice, tunnels exacerbate this condition. A tunnel often
   aggregates traffic from a number of different source and destination
   addresses, of different protocols, and encapsulates them in a header
   with the same ingress and egress addresses, all using a single
   encapsulation protocol. The result is one of the following:

   1. The IP ID rules are enforced, and the tunnel throughput is
      severely limited.

   2. The IP ID rules are enforced, and the tunnel consumes large
      numbers of ingress/egress IP addresses solely to ensure ID
      uniqueness.

   3. The IP ID rules are ignored.

   The last case is the most obvious solution, because it corresponds to
   how endpoints currently behave. Fortunately, fragmentation is
   somewhat rare in the current Internet at large, but it can be common
   along a tunnel. Fragments that repeat the IP ID risk being
   reassembled incorrectly, especially when fragments are reordered or
   lost.
   Although such errors may be detected at the transport layer, this
   results in excessive overall packet loss, as well as wasting
   bandwidth between the egress and ultimate packet destination.

3.3. Signaling

   In the current Internet architecture, signals tend to go upstream,
   either from routers along a path or from the destination, back toward
   the source (Figure 8). Such signals are typically contained in ICMP
   messages, but can involve other protocols such as RSVP, transport
   protocol signals (e.g., TCP RSTs), or multicast.

     +--------------------------------------------------------------+
     |                                                              |
     | +---------------------------+                                |
     | |                           |                                |
     v v                           |                                |
   +-----+                         |                             +-----+
   |     |                         |                             |     |
   | Src |=========================R=============================| Dst |
   |     |                                                       |     |
   +-----+                                                       +-----+

                Figure 8 Signaling paths in an Internet

   Tunnels interfere with these known signaling paths. As shown in
   Figure 9, signals from routers along the tunnel path (R2), as well as
   those from the tunnel egress, need to be relayed by the ingress. This
   relaying may be difficult, because R2 may not return enough
   information to the ingress to support relaying (e.g., when ICMP
   returns only the outermost headers in a "message too big", and the
   source transport port information is lost). Signals from routers
   downstream of the egress (R3 in Figure 9) need to traverse the tunnel
   in reverse.

   In all cases, the tunnel ingress needs to determine how to relay the
   signals from inside the tunnel into signals back to the source. For
   some protocols, such as ICMP, this is either simple or impossible,
   depending on how much of the original packet is returned; for others,
   it can even be undefined (e.g., multicast).

      + - - - - - - +------------------------------------+
      |             |                                    |
      v             v                                    |
   +-----+        +---+                         +---+    |   +-----+
   |     |       /     \ ===================== /     \   |   |     |
   | Src |==R1===| Enc |==========R2===========| Dec |===R3==| Dst |
   |     |       \     / ===================== \     /       |     |
   +-----+        +---+           |             +---+        +-----+
      ^             ^             |
      |             |             |
      + - - - - - - +-------------+

             Figure 9 Signaling paths introduced by a tunnel

4. Current Tunnel Standards

   This section reviews two common Internet tunnel standards. They are
   notable because they both ultimately rely on IP in IP encapsulation,
   although they each handle MTU discovery, fragmentation, and signaling
   differently.

   [There are other tunnel mechanisms, such as IPv4 in IPv6, which may
   be added to this discussion later.]

4.1. IP in IP

   The simplest tunnel encapsulation mechanism is IP in IP, explained
   here for IPv4 [RFC2003]. This protocol was standardized for use in
   mobile IP, so that packets sent from a source to a Home Agent could
   be forwarded unmodified to the different address of the Mobile Node
   [RFC3344]. It has come to be used much more generally, e.g., to
   support multicast, as well as in overlay network systems
   [Er94][To01].

4.1.1. MTU discovery

   When an IPv4 packet arrives at an IP-in-IP ingress, the DF flag from
   the inner packet is copied to the outer header. This enforces DF of
   the packet within the tunnel when requested by the packet source.

   Packets which are too large are dropped at the ingress, and a
   corresponding ICMP "message too big" is returned to the source.
   Internally, IP-in-IP tunneling requires that the tunnel MUST support
   ICMP-based path MTU discovery (i.e., PMTUD). Note that due to common
   filtering of ICMP messages, compliance with this requirement is
   impossible to verify and thus to enforce.

4.1.2. Fragmentation

   IP-in-IP tunneling supports Inner Fragmentation.
   The inner packet MAY be fragmented if DF=0; otherwise, the packet
   would have been dropped if too big, as noted earlier. The tunnel MUST
   NOT fragment at the outer header if DF=1, i.e., this tunnel protocol
   assumes the network honors the DF bit (note that some tunnels, as
   well as some network devices, do not honor the DF bit). Further, if
   the DF bit is set in the inner header, it MUST be set in the outer;
   if not, it MAY be set in the outer.

4.1.3. Signaling

   IP-in-IP tunnels MAY relay ICMPs from inside the tunnel to the
   source, i.e., at the ingress. They SHOULD relay network and host
   unreachable messages, and MUST relay "message too big" messages;
   these reflect network conditions that the source should be informed
   about. They MUST NOT relay port unreachable messages, because these
   are meaningless for encapsulated packets, and thus reflect internal
   link conditions that the source should not care about at all. They
   MUST NOT relay and SHOULD handle locally messages that affect the
   ingress as if it were a host, e.g., source quench and router errors.

   Most notably, IP-in-IP notes that the tunnel SHOULD keep sufficient
   soft state to assist with relaying. Such state may involve keeping
   copies of recently sent packets, to have sufficient context to relay
   when lacking in the received ICMP message.

4.2. IPsec

   The Internet network security standard, IPsec, incorporates IP-in-IP
   encapsulation as part of its tunnel mode of operation [RFC4301].
   Although IP-in-IP packets can be secured via IPsec transport mode,
   resulting in identical packets [RFC3884], the rules affecting IPsec
   tunnel mode MTU discovery, fragmentation, and signaling are specified
   by IPsec, rather than IP-in-IP.

4.2.1. MTU discovery

   Tunnel mode IPsec supports ICMP-based path MTU discovery (PMTUD), but
   only as a SHOULD.
   If an IPv4 packet arrives with DF=1, or an IPv6 packet arrives, and
   either is too large for the tunnel, the ingress SHOULD discard it and
   send an ICMP to the source. If IPv4 and DF=0, the ingress SHOULD
   perform Outer Fragmentation, and SHOULD NOT send an ICMP to the
   source.

4.2.2. Fragmentation

   IPsec performs only Outer Fragmentation; this distinguishes it from
   IP-in-IP, which performs only Inner Fragmentation.

   IPsec requires that implementations of tunnel mode allow the security
   policy to decide how the IPv4 DF bit should propagate from the inner
   to the outer header. The bit may be copied, cleared, or set, again
   differing from IP-in-IP, which allows only copy or set.

4.2.3. Signaling

   IPsec, like IP-in-IP, relays ICMP "message too big" signals from the
   ingress back to the source. The size indicated is adjusted to account
   for the space needed for both encapsulation and security information.
   Further, it allows that any ICMP message may be blocked, on a
   per-security association basis; this filtering is for security
   reasons, but also can directly result in "black holing".

5. Issues

   As has been shown in only two examples, even similar mechanisms for
   encapsulation can result in very different approaches to tunneling.
   Beyond their different MTU discovery, fragmentation, and signaling
   mechanisms, these approaches reflect different architectural
   perspectives on the role of tunnels in the Internet. This section
   discusses these more fundamental perspectives, and their impact on
   the mechanisms.

5.1. Tunnel model

   The Internet architecture is composed of hosts, gateways (i.e.,
   routers), and links [Cl88]. A host is a source or sink of network
   packet traffic, a router redirects packets from one set of links to
   another, and links interconnect hosts and routers.
Although originally described for the Internet's network layer, this
architecture, with a bit of renaming (e.g., routers become bridges),
applies equally well to link layers.

Tunnels could, in principle, be related to this basic model in one of
three ways:

o Tunnel as a link

o Tunnel as a router/bridge

o Tunnel as invisible

Tunnels require distinct ingress and egress addresses, to use during
encapsulation, and to direct encapsulated traffic from the ingress to
the egress. As a result, a tunnel is most usefully considered a link
in the architecture in which it is deployed. Tunnel designers should
therefore consider and apply link design issues [RFC3819]. This also
implies that operating system designers should represent tunnels as
links; these may be conveniently represented as virtual interfaces.

[this includes tunnel as point-point vs. tunnel as multipoint]

5.2. Parties participating

The description of a tunnel focuses on the functions of the ingress
and egress, but not all functions need be located at one of these two
points. Recall inner fragmentation, in which fragment reassembly
occurs at the destination, not the egress; this imposes load on the
destination as a result of the behavior of the ingress.

Containing all tunnel functions solely inside the tunnel endpoints,
as with outer fragmentation, is architecturally clean. It also obeys
the 'clean up your own mess' principle; the impact of encapsulation
and fragmentation caused by the ingress is then handled by the
egress, without imposing load on the destination.

Distributing tunnel functions across both egress and destination, as
with inner fragmentation, can be more efficient. The impact of the
limited IPv4 IP ID space is more prominent in the outer header, due
to aggregation of traffic at the ingress.
Using the inner header for fragmentation allows use of a larger
effective IP ID space because of the additional IP source/destination
addresses present there. Reassembly can be distributed among a large
number of destinations (where present), and the impact of reassembly
can be isolated to only affected destinations. Further, fragmenting
once at the ingress can avoid repeated fragmentation/reassembly steps
when packets traverse multiple tunnels in succession.

The primary case in favor of distributed tunnel functions, and thus
inner fragmentation, is that high-speed ingress devices can be
implemented, but corresponding high-speed egresses are difficult or
costly. Unfortunately, network operators cannot always know in
advance whether high-speed ingresses are being deployed where the
destination traffic is sufficiently diffuse; deploying such a device
where the traffic focuses on a single destination puts an undue
burden on that destination.

6. Potential Ways Forward

There are a number of issues that may benefit from a coordinated
review. These include unification of various tunneling standards, and
revision of tunnel standards to address:

o Relation of inner/outer headers (i.e., which fields are copied,
  derived, etc.)

o MTU discovery

o Fragmentation

o Signaling

This revision may suggest the utility of a single, configurable
tunnel mechanism that includes various solutions as alternatives,
rather than developing custom tunnel solutions on demand. It may also
suggest the development of new solutions, such as:

o The use of PLPMTUD for tunnels

o Addressing the IP ID issue and fragmentation

o New ICMP signals

o Optimization solutions, such as packing

SEAL addresses a few of these issues, notably the first two
[RFC5320].
It adds an active signal exchange between ingress and egress for
intra-tunnel MTU discovery, and an extension to the IP ID space to
detect collisions.

Tunnels are further evidence that the current requirements for IPv4
ID uniqueness may need revision. In particular, it is clear that even
moderate speed transport connections already violate these
requirements. We recommend revisiting the requirements as suggested
in [To10].

Note that this document does not argue for a single, generic
tunneling protocol or mechanism. Such a mechanism is no more likely
to be useful than a 'one size fits all' transport protocol would be.
It does argue, however, for consistency in tunnel design, and
abstraction and reuse of mechanisms where possible.

7. Notes for future updates

[This area includes notes for future updates which have been reported
but not yet fully included - it represents a holding area for
comments, and should not appear in the final document.]

tunnel as virtualization - Stewart Bryant (SB)

tunnel as endpoint only, not on-path (not MPLS, e.g.) - JT/coauthor

gigE packing like PWE3 ATM packing - SB

PPP chopping and coalescing - MT/coauthor

end sec 2 "we need large seq num and to frag at the tunnel" / maybe,
but do we want recommendations? - SB

security should add addr management and ACLs (?) - SB

MTU as part of BGP? - SB (Will this even work - JT)

section 2 it says: "The IPv6 fragment header is present only when a
packet has been fragmented", but I know of at least one effort in
MANET that is proposing to include the fragment header even for
unfragmented IPv6 packets. That would seem to bend the rules set
forth in RFC2460, but I just thought it might be worth pointing out
that some people are considering bending them.
- Fred Templin

NATs - i.e., One other thought; where the IP ID problem becomes truly
pathological is for tunnels that traverse IPv4 NATs. First, the NATs
could rewrite the ID to something the ingress tunnel endpoint never
intended. Secondly, multiple ingress tunnel endpoints that traverse
the same NAT could have IP ID "collisions" from the perspective of
the outside world. This may deserve a section unto itself? - FT

NAT as half-tunnel - JT

tunnel endpoint as following host rules - JT (as with ECN in CAPWAP,
per Magnus' email of 10/10/08)

the need for larger min MTU - FT (see SEAL)

describe relationship to [Ho08] - JT (as per INTAREA meeting notes,
don't cover Teredo-specific issues in Ho08, but include generic
issues here)

8. Security Considerations

Tunnels may introduce vulnerabilities, or add to the potential for
receiver overload and thus DoS attacks. These issues are primarily
related to the fact that a tunnel is a link that traverses a network
path, and to fragmentation and reassembly. Regarding ICMP signals,
tunnels have similar security issues to routers, in that they SHOULD
throttle ICMPs sent to a given source, and SHOULD send ICMPs that
correspond to events inside the tunnel. Such ICMPs MUST have the
tunnel ingress IP address as the source IP, because IP addresses
inside a tunnel path may have no meaning outside the tunnel.

Tunnels traverse multiple hops of a network path from ingress to
egress. Traffic along such tunnels may be susceptible to on-path and
off-path attacks, including fragment injection, reassembly buffer
overload, and ICMP attacks. Some of these attacks may not be as
visible to the endpoints of the architecture into which tunnels are
deployed, and may thus be more difficult to detect.
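
As a non-normative illustration of the ICMP throttling recommended
earlier in this section, the following Python sketch limits the ICMPs
sent to each source with a token bucket and stamps each generated
ICMP with the tunnel ingress address. This document does not specify
a throttling algorithm; the token bucket, the names IcmpThrottle and
build_icmp, the rate and burst values, and the dictionary used to
stand in for an ICMP message are all assumptions made for this
example.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class IcmpThrottle:
    """Per-source token bucket (hypothetical helper, not from any RFC)."""
    rate: float = 1.0    # ICMPs replenished per second (assumed value)
    burst: float = 5.0   # maximum bucket depth (assumed value)
    # Maps source address -> (tokens remaining, time of last update).
    _buckets: Dict[str, Tuple[float, float]] = field(default_factory=dict)

    def allow(self, source: str, now: float) -> bool:
        """Return True if one more ICMP may be sent to `source` at time `now`."""
        tokens, last = self._buckets.get(source, (self.burst, now))
        # Refill in proportion to elapsed time, capped at the burst size.
        tokens = min(self.burst, tokens + (now - last) * self.rate)
        allowed = tokens >= 1.0
        if allowed:
            tokens -= 1.0
        self._buckets[source] = (tokens, now)
        return allowed


def build_icmp(throttle: IcmpThrottle, ingress_addr: str,
               source: str, now: float) -> Optional[dict]:
    """Return a stand-in ICMP record, or None if the source is throttled.

    The ICMP source address is the tunnel ingress, because addresses
    inside the tunnel path may have no meaning outside the tunnel.
    """
    if not throttle.allow(source, now):
        return None
    return {"src": ingress_addr, "dst": source}


if __name__ == "__main__":
    throttle = IcmpThrottle(rate=1.0, burst=2.0)
    for t in (0.0, 0.0, 0.0, 1.0):
        print(t, build_icmp(throttle, "192.0.2.1", "198.51.100.7", t))
```

Passing the clock value in explicitly keeps the sketch deterministic;
an implementation would more likely read a monotonic clock and expire
idle per-source entries so that the table itself cannot be used for
state exhaustion.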

Inner fragmentation can present an undue burden on destinations where
traffic is not sufficiently diffuse; tunnels SHOULD NOT employ inner
fragmentation except where such diffusion is confirmed either by the
tunnel mechanism or network designer. All tunnel fragmentation,
inner and outer, MUST obey all existing fragmentation requirements,
i.e., IPv6 tunnels MUST NOT employ inner fragmentation, and IPv4
tunnels MUST NOT use inner fragmentation where the inner header has
DF=1.

Tunnels MUST obey all existing IP requirements, such as the
uniqueness of the IP ID field, unless an exception is granted or the
requirement is revoked. Failure to either limit encapsulation
traffic, or use additional ingress/egress IP addresses, can result in
high speed traffic fragments being incorrectly reassembled.

9. IANA Considerations

This document has no IANA considerations.

The RFC Editor should remove this section prior to publication.

10. References

10.1. Normative References

[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
          Requirement Levels", BCP 14, RFC 2119, March 1997.

10.2. Informative References

[Cl88]    Clark, D., "The design philosophy of the DARPA internet
          protocols," Proc. Sigcomm 1988, p.106-114, 1988.

[Er94]    Eriksson, H., "MBone: The Multicast Backbone,"
          Communications of the ACM, Aug. 1994, pp.54-60.

[Fa10]    Farinacci, D., V. Fuller, D. Meyer, D. Lewis, "Locator/ID
          Separation Protocol (LISP)," (work in progress),
          draft-ietf-lisp-06, Jan. 2010.

[Ho08]    Hoagland, J., S. Krishnan, D. Thaler, "Security Concerns
          With IP Tunneling," (work in progress),
          draft-ietf-v6ops-tunnel-security-concerns-01, Oct. 2008.

[Pe10]    Perlman, R., D. Eastlake, D. Dutt, S. Gai, A. Ghanwani,
          "RBridges: Base Protocol Specification," (work in
          progress), draft-ietf-trill-rbridge-protocol-15, Jan.
          2010.

[RFC791]  Postel, J., "Internet Protocol," RFC 791 / STD 5, September
          1981.

[RFC1122] Braden, R., Ed., "Requirements for Internet Hosts -
          Communication Layers," RFC 1122 / STD 3, October 1989.

[RFC1191] Mogul, J., S. Deering, "Path MTU discovery," RFC 1191,
          November 1990.

[RFC2003] Perkins, C., "IP Encapsulation within IP," RFC 2003,
          October 1996.

[RFC2923] Lahey, K., "TCP Problems with Path MTU Discovery," RFC
          2923, September 2000.

[RFC3344] Perkins, C., Ed., "IP Mobility Support for IPv4," RFC 3344,
          August 2002.

[RFC3819] Karn, P., Ed., C. Bormann, G. Fairhurst, D. Grossman, R.
          Ludwig, J. Mahdavi, G. Montenegro, J. Touch, L. Wood,
          "Advice for Internet Subnetwork Designers," RFC 3819 / BCP
          89, July 2004.

[RFC3884] Touch, J., L. Eggert, Y. Wang, "Use of IPsec Transport Mode
          for Dynamic Routing," RFC 3884, September 2004.

[RFC3931] Lau, J., Ed., M. Townsley, Ed., I. Goyret, Ed., "Layer Two
          Tunneling Protocol - Version 3 (L2TPv3)," RFC 3931, March
          2005.

[RFC4176] El Mghazli, Y., Ed., T. Nadeau, M. Boucadair, K. Chan, A.
          Gonguet, "Framework for Layer 3 Virtual Private Networks
          (L3VPN) Operations and Management," RFC 4176, October 2005.

[RFC4301] Kent, S., and K. Seo, "Security Architecture for the
          Internet Protocol," RFC 4301, December 2005.

[RFC4459] Savola, P., "MTU and Fragmentation Issues with
          In-the-Network Tunneling," RFC 4459, April 2006.

[RFC4664] Andersson, L., Ed., E. Rosen, Ed., "Framework for Layer 2
          Virtual Private Networks (L2VPNs)," RFC 4664, September
          2006.

[RFC4821] Mathis, M., J. Heffner, "Packetization Layer Path MTU
          Discovery," RFC 4821, March 2007.

[RFC4963] Heffner, J., M. Mathis, B. Chandler, "IPv4 Reassembly
          Errors at High Data Rates," RFC 4963, July 2007.

[RFC5320] Templin, F., Ed., "The Subnetwork Encapsulation and
          Adaptation Layer (SEAL)," RFC 5320, Feb. 2010.

[RFC5556] Touch, J., R.
          Perlman, "Transparently Interconnecting Lots
          of Links (TRILL): Problem and Applicability Statement," RFC
          5556, May 2009.

[To01]    Touch, J., "Dynamic Internet Overlay Deployment and
          Management Using the X-Bone," Computer Networks, July 2001,
          pp. 117-135.

[To10]    Touch, J., "Updated Specification of the IPv4 ID Field,"
          (work in progress), draft-touch-intarea-ipv4-id-update,
          Feb. 2010.

11. Acknowledgments

This document originated as the result of numerous discussions among
the authors, Jari Arkko, Stewart Bryant, Lars Eggert, Dino Farinacci,
Matt Mathis, and Fred Templin, as well as members participating in
the Internet Area Working Group.

This document was prepared using 2-Word-v2.0.template.dot.

Authors' Addresses

Joe Touch
USC/ISI
4676 Admiralty Way
Marina del Rey, CA 90292-6695
U.S.A.

Phone: +1 (310) 448-9151
Email: touch@isi.edu

W. Mark Townsley
Cisco
L'Atlantis, 11, Rue Camille Desmoulins
Issy Les Moulineaux, ILE DE FRANCE 92782

Email: townsley@cisco.com