idnits 2.17.1

draft-anderson-siit-dc-00.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  == There are 6 instances of lines with non-RFC2606-compliant FQDNs in the
     document.

  == There are 15 instances of lines with non-RFC6890-compliant IPv4 addresses
     in the document.  If these are example addresses, they should be changed.

  == There are 2 instances of lines with non-RFC3849-compliant IPv6 addresses
     in the document.  If these are example addresses, they should be changed.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (November 9, 2012) is 4180 days in the past.  Is this
     intentional?

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  ** Obsolete normative reference: RFC 6145 (Obsoleted by RFC 7915)

  -- Obsolete informational reference (is this intentional?): RFC 2373
     (Obsoleted by RFC 3513)

  -- Obsolete informational reference (is this intentional?): RFC 2460
     (Obsoleted by RFC 8200)

  -- Obsolete informational reference (is this intentional?): RFC 2616
     (Obsoleted by RFC 7230, RFC 7231, RFC 7232, RFC 7233, RFC 7234, RFC 7235)

     Summary: 1 error (**), 0 flaws (~~), 4 warnings (==), 4 comments (--).
     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

Network Working Group                                        T. Anderson
Internet-Draft                                            Redpill Linpro
Intended status: Standards Track                        November 9, 2012
Expires: May 13, 2013

     Stateless IP/ICMP Translation in IPv6 Data Centre Environments
                        draft-anderson-siit-dc-00

Abstract

   This document describes the use of Stateless IP/ICMP Translation
   (SIIT) in data centre environments in order to simultaneously
   facilitate IPv6 deployment and IPv4 address conservation.  It
   describes the overall architecture, and provides guidelines for both
   operators and implementers.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on May 13, 2013.

Copyright Notice

   Copyright (c) 2012 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.
   Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Motivation and Goals
       1.1.1.  Single Stack IPv6 Operation
       1.1.2.  Stateless Operation
       1.1.3.  No Loss of End User's Source Address
       1.1.4.  No Forklift Upgrades Required
       1.1.5.  No Architectural Dependency on IPv4
     1.2.  Comparison to Other IPv6 Migration Strategies
       1.2.1.  IPv4-only Service with Translation for IPv6 Users
       1.2.2.  Dual Stack
   2.  Architectural Overview
     2.1.  DNS Configuration
     2.2.  Example Packet Flow
   3.  Deployment Guidelines for Operators
     3.1.  Choice of Application
     3.2.  Choice of Translation Prefix
     3.3.  Routing Considerations
     3.4.  Location of the Translators
     3.5.  Migration from Dual Stack
     3.6.  Packet Size and Fragmentation Considerations
       3.6.1.  IP Header Size Difference
       3.6.2.  Minimum Path MTU Difference
       3.6.3.  "Atomic Fragments"
   4.  Implementation Requirements
     4.1.  Basic Requirements
     4.2.  Static Address Mapping Function
     4.3.  Support for Increasing the IPv6 Path MTU
     4.4.  Support for Disabling "Atomic Fragments"
     4.5.  Feature for Handling IPv4 Path MTUs Lower than 1260
     4.6.  Loop Prevention Mechanism
   5.  Acknowledgements
   6.  Requirements Language
   7.  IANA Considerations
   8.  Security Considerations
     8.1.  Mistaking the Translation Prefix for a Trusted Network
     8.2.  Packets Looping Through the SIIT Function
   9.  References
     9.1.  Normative References
     9.2.  Informative References
   Author's Address

1.  Introduction

   This document describes deploying SIIT [RFC6145] as a network-centric
   stateless translation service that allows a data centre operator or
   Internet content provider to run his data centre network, servers,
   and applications using exclusively IPv6, while at the same time
   ensuring that end users that have only IPv4 connectivity will be able
   to continue to access the services and applications.

1.1.  Motivation and Goals

   Historically, dual stack [RFC4213] has been the recommended way to
   transition from an IPv4-only environment to one capable of serving
   IPv6 users.
   For data centre and Internet content providers, however,
   dual stack operation has a number of disadvantages compared to single
   stack operation, in particular increased complexity and operational
   overhead, and very low expected return on investment in the short to
   medium term, as there are practically no end users who have
   connectivity only to the IPv6 Internet.  Furthermore, the dual stack
   approach does not in any way help with the depletion of the IPv4
   address space.

   Therefore, a better approach was needed.  The design goals were, in
   no particular order:

   o  To promote the deployment of native IPv6 services

   o  To provide IPv4 service availability for legacy users with no loss
      of performance or functionality

   o  To ensure that the legacy users' IPv4 addresses remain available
      to the servers and applications

   o  To conserve and maximise the utilisation of IPv4 addresses

   o  To avoid introducing more complexity than absolutely necessary,
      especially on the servers and applications

   o  To be easy to scale and deploy in a fault-tolerant manner

   SIIT meets all of these requirements, which will be elaborated on in
   the following subsections.

1.1.1.  Single Stack IPv6 Operation

   SIIT allows an operator to build their applications on an IPv6-only
   foundation.  IPv4 end-user connectivity becomes a service provided by
   the network, which systems administration and application development
   staff do not need to concern themselves with.

   Obviously, this will promote universal IPv6 deployment for all the
   provider's services and applications.

1.1.2.  Stateless Operation

   Unlike other solutions that provide either dual stack availability to
   single-stack services (e.g., Stateful NAT64 [RFC6146] and Layer-4/7
   proxies), or that provide conservation of IPv4 addresses (e.g., NAT44
   [RFC3022]), an SIIT gateway does not keep any state between the
   packets in a single connection/flow.  In this sense it operates
   exactly like a normal IP router, and has similar scaling properties -
   the limiting factors are packets per second and bandwidth.  The
   number of concurrent flows and flow initiation rates are irrelevant
   for performance.

   This not only allows individual SIIT gateways to easily attain "line
   rate" performance, it also allows for per-packet load balancing
   between multiple gateways using Equal-Cost Multipath Routing
   [RFC2991].  Asymmetric routing is also unproblematic, which makes it
   easy to avoid traffic trampolines, as the prefixes involved may be
   anycasted from all the SIIT gateways in the provider's network.

   Finally, stateless operation means that high availability is easily
   achieved.  If an SIIT gateway should fail, its traffic can be re-
   routed onto another SIIT gateway using completely standard IP routing
   protocols.  This will not impact existing flows any more than what
   any other IP re-routing event would.

1.1.3.  No Loss of End User's Source Address

   SIIT will map the end user's entire source address into a predefined
   IPv6 translation prefix.  This allows the application server to
   identify the user by his IPv4 address, which is useful for performing
   tasks like Geo-Location, logging, abuse handling, and so forth.

1.1.4.  No Forklift Upgrades Required

   Except for the introduction of the SIIT gateways themselves, there is
   no change required in the network, servers, applications, or anywhere
   else to specifically support SIIT, compared to a dual stack
   deployment.
   From the clients', the servers', the IPv6 data centre
   network's, and the IPv4 Internet's point of view, SIIT is practically
   invisible.  It will work with any standards-compliant IPv4 or IPv6
   stack.

1.1.5.  No Architectural Dependency on IPv4

   SIIT will allow an Internet content provider (ICP) or data centre
   operator to build their infrastructure and applications entirely on
   IPv6.  This means that when the day comes to discontinue support for
   IPv4, no change needs to be made to the overall architecture - it's
   only a matter of shutting off the SIIT gateways.  Therefore, by
   deploying native IPv6 along with SIIT, operators will avoid future
   migration or deployment projects relating to IPv6 roll-out and/or
   IPv4 sun-setting.

1.2.  Comparison to Other IPv6 Migration Strategies

1.2.1.  IPv4-only Service with Translation for IPv6 Users

   Typically, this migration strategy involves having an IPv4-only
   application stack, with some device in front that the IPv6 clients
   connect to, which will then translate or proxy the traffic to the
   IPv4-only system.  This approach is probably the easiest to retrofit
   to an existing IPv4 service environment; however, it does have a few
   shortcomings not shared by SIIT.  In particular:

   o  No conservation of IPv4 addresses

   o  The translator/proxy must be a stateful device, requiring traffic
      to flow symmetrically across a single instance, in turn giving the
      solution poor scaling properties and routing flexibility

   o  A fail-over event will disrupt all active flows, unless there is
      some state replication mechanism (which would likely increase
      complexity and hurt performance and scaling properties)

   o  Loss of the client's source IP address, if it cannot be injected
      into application-layer headers such as HTTP's X-Forwarded-For
      (which is impossible if the application layer is using encryption)

1.2.2.  Dual Stack

   Dual stack, unlike SIIT, considerably increases complexity and
   operational overhead compared to single stack operation for a number
   of reasons.  Some examples of this include:

   o  Duplicate work for design, set-up, documentation, and monitoring

   o  Duplicate ACLs in both network components and applications

   o  An exponential increase in possible failure scenarios

   o  Increased application development and maintenance costs

   o  Increased need for staff training and competency

   Furthermore, dual stack does not help conserve IPv4 addresses.

2.  Architectural Overview

   This section attempts to explain the basic SIIT architecture by
   describing an example topology of a data centre hosting two IPv6-
   only customers:

   o  Alice, operating a publicly available web service.

   o  Bob, operating publicly available DNS and MX services.

   Since both Alice's and Bob's server installations contain other
   servers that provide internal services, if they had used IPv4, each
   of their server LANs would have needed to be provisioned with at
   minimum a /29, thereby consuming 16 IPv4 addresses in total.  With
   SIIT, the IPv4 address consumption is reduced to 3 - the same as the
   number of publicly available services.
                         Example SIIT Topology

         +------------------+       +------------------+
         | IPv4-only user 1 |       | IPv4-only user 2 |
         |   203.0.113.50   |       |    192.0.2.10    |
         +--------+---------+       +--------+---------+
                  |                          |
                  \-----------\   /----------/
                              |   |
                              |   |
   (The IPv4 Internet)        |   |
   +---------------------[ IPv4 interface ]-----------------------+
   |                                                              |
   |                         SIIT Gateway                         |
   |                                                              |
   | IPv4 service address pool: 198.51.0.0/24                     |
   | Static address mapping 1: 198.51.0.1 <=> 2001:db8:12:34::c   |
   | Static address mapping 2: 198.51.0.2 <=> 2001:db8:12:34::1   |
   | Static address mapping 3: 198.51.0.3 <=> 2001:db8:ab:cd::f   |
   | Translation prefix: 2001:db8:46::/96                         |
   |                                                              |
   +---------------------[ IPv6 interface ]-----------------------+
   (IPv6-only data centre)    |   |
                              |   |
                              |   |
        Server LAN Alice      |   |      Server LAN Bob
      2001:db8:12:34::/64     |   |     2001:db8:ab:cd::/64
      +-------+-------+-------+   +---+-------+--------+--------+
      |       |       |               |       |        |        |
   +--+--+ +--+--+ +--+--+         +--+--+ +--+--+ +---+---+ +--+--+
   | www | | nfs | | sql |         | mta | | a/v | | iscsi | | dns |
   | ::1 | | ::2 | | ::3 |         | ::f | | ::e | |  ::d  | | ::c |
   +-----+ +-----+ +-----+         +-----+ +-----+ +-------+ +-----+

                                Figure 1

   198.51.0.0/24 is allocated as a pool from which individual IPv4
   service addresses are drawn.  The provider must route this prefix to
   the SIIT gateway's IPv4 interface.  Note that there are no
   restrictions on how many IPv4 service address pools are used or their
   prefix length, as long as they are all routed to the SIIT gateway's
   IPv4 interface.

   The static address mappings are used for translating the service's
   IPv6 address into IPv4 and vice versa.  When translating from IPv4 to
   IPv6, any IPv4 address found in the list of static mappings will be
   rewritten to its corresponding IPv6 address, and vice versa when
   translating from IPv6 to IPv4.
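
   As an illustration only, the address handling described above can be
   sketched in Python.  The sketch assumes the translation prefix and
   the third static mapping listed in Figure 1.  For addresses without a
   static mapping it assumes the /96 embedding of [RFC6052], in which
   the IPv4 address occupies the low-order 32 bits of the IPv6 address.
   The function and table names are hypothetical, and the header
   translation of [RFC6145] is not shown.

```python
import ipaddress

# Assumed values, taken from the example topology in Figure 1.
TRANSLATION_PREFIX = ipaddress.IPv6Network("2001:db8:46::/96")

# Hypothetical static mapping table (IPv4 service address -> IPv6 server).
STATIC_4TO6 = {
    ipaddress.IPv4Address("198.51.0.3"): ipaddress.IPv6Address("2001:db8:ab:cd::f"),
}
# Reverse table, so that one bidirectional mapping serves both directions.
STATIC_6TO4 = {v6: v4 for v4, v6 in STATIC_4TO6.items()}


def v4_to_v6(addr: str) -> ipaddress.IPv6Address:
    """Translate one IPv4 address (source or destination) to IPv6."""
    v4 = ipaddress.IPv4Address(addr)
    # Static mappings take precedence over the algorithmic mapping.
    if v4 in STATIC_4TO6:
        return STATIC_4TO6[v4]
    # /96 case: place the 32-bit IPv4 address in the low-order bits.
    return ipaddress.IPv6Address(int(TRANSLATION_PREFIX.network_address) | int(v4))


def v6_to_v4(addr: str) -> ipaddress.IPv4Address:
    """Translate one IPv6 address (source or destination) to IPv4."""
    v6 = ipaddress.IPv6Address(addr)
    if v6 in STATIC_6TO4:
        return STATIC_6TO4[v6]
    if v6 not in TRANSLATION_PREFIX:
        raise ValueError(f"{v6} has no static mapping and is outside the translation prefix")
    # /96 case: the IPv4 address is the low-order 32 bits.
    return ipaddress.IPv4Address(int(v6) & 0xFFFFFFFF)
```

   Under these assumptions, v4_to_v6("198.51.0.3") returns the
   statically mapped server address 2001:db8:ab:cd::f, while
   v4_to_v6("192.0.2.10") falls through to the algorithmic mapping and
   returns an address inside 2001:db8:46::/96.  Neither function keeps
   any per-flow state.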

   2001:db8:46::/96 is the IPv6 prefix into which the entire IPv4
   address space is mapped.  It is used for translation of the end
   user's IPv4 address to IPv6 and vice versa according to the algorithm
   defined in section 2.2 of [RFC6052].  This algorithmic mapping has a
   lower precedence than the static mappings.

   The SIIT gateway itself can be either a separate device or a logical
   function in another multi-purpose device, for example an IP router.
   Any number of SIIT gateways may exist simultaneously in an operator's
   infrastructure, as long as they all have the same translation prefix
   and list of static mappings configured.

2.1.  DNS Configuration

   The native IPv6 address of the publicly available services should be
   registered in DNS using AAAA records, while the corresponding IPv4
   address (according to the static mapping) should be registered using
   an A record.  This results in the following DNS records:

   www.alice.tld.  IN  AAAA  2001:db8:12:34::1
   www.alice.tld.  IN  A     198.51.0.2

   mta.bob.tld.    IN  AAAA  2001:db8:ab:cd::f
   mta.bob.tld.    IN  A     198.51.0.3

   dns.bob.tld.    IN  AAAA  2001:db8:12:34::c
   dns.bob.tld.    IN  A     198.51.0.1

2.2.  Example Packet Flow

   In this example, "IPv4-only user 2" initiates a request to Alice's
   web server.  He starts by looking up the IPv4 address of
   "www.alice.tld" in DNS, and attempts to connect to this address on
   port 80 by transmitting the following IPv4 packet:

   +-----------------------------------------------+
   | IP Version:          4                        |
   | Source Address:      192.0.2.10               |
   | Destination Address: 198.51.0.2               |
   | Protocol:            TCP                      |
   |-----------------------------------------------|
   | TCP SYN [...]                                 |
   +-----------------------------------------------+

   This packet is then routed over the Internet to the (nearest) SIIT
   gateway, which will translate it into the following IPv6 packet and
   forward it into the IPv6 network:

   +-----------------------------------------------+
   | IP Version:          6                        |
   | Source Address:      2001:db8:46::192.0.2.10  |
   | Destination Address: 2001:db8:12:34::1        |
   | Next Header:         TCP                      |
   |-----------------------------------------------|
   | TCP SYN [...]                                 |
   +-----------------------------------------------+

   The destination address was translated according to the configured
   static mapping, while the source address was translated according to
   the [RFC6052] mapping (because it did not match any static mappings).
   The rest of the IP header was translated according to [RFC6145].  The
   Layer 4 payload is copied verbatim, with the exception of the TCP
   checksum being recalculated.

   Note that the IPv6 address 2001:db8:46::192.0.2.10 may also be
   expressed as 2001:db8:46::c000:20a, cf. section 2.2 of [RFC2373].

   Next, Alice's web server receives this IPv6 packet and responds to it
   like it would with any other IPv6 packet:

   +-----------------------------------------------+
   | IP Version:          6                        |
   | Source Address:      2001:db8:12:34::1        |
   | Destination Address: 2001:db8:46::192.0.2.10  |
   | Next Header:         TCP                      |
   |-----------------------------------------------|
   | TCP SYN+ACK [...]                             |
   +-----------------------------------------------+

   The response packet is routed to the (nearest) SIIT gateway's IPv6
   interface, which will translate it back to IPv4 as follows:

   +-----------------------------------------------+
   | IP Version:          4                        |
   | Source Address:      198.51.0.2               |
   | Destination Address: 192.0.2.10               |
   | Protocol:            TCP                      |
   |-----------------------------------------------|
   | TCP SYN+ACK [...]                             |
   +-----------------------------------------------+

   This time, the source address matched the static mapping, while the
   destination address was translated according to [RFC6052].  The rest
   of the packet was translated according to [RFC6145].

   The resulting IPv4 packet is transmitted back to the end user over
   the IPv4 Internet.  Subsequent packets in the flow will follow the
   exact same translation pattern.  They may or may not cross the same
   translators as earlier packets in the same flow.

   The end user's IPv4 stack has no idea that it is communicating with
   an IPv6 server, nor does the server's IPv6 stack have any idea that
   it is communicating with an IPv4 client.  To them, it's just plain
   IPv4 or IPv6, respectively.  However, the applications running on the
   server may optionally be updated to recognise and strip the
   translation prefix, so that the end user's IPv4 address may be used
   for logging, Geo-Location, abuse handling, and so forth.

3.  Deployment Guidelines for Operators

3.1.  Choice of Application

   As noted in [RFC2663], [RFC2993], and [RFC3022], higher-level
   protocols that embed addresses as part of their payload will most
   likely not work through any form of address translation, including
   SIIT.  As a general rule, if an application layer protocol does work
   through standard NAT44 (see [RFC3235]), it will most likely work
   through SIIT as well.

   It is recommended that an initial deployment of SIIT is used for
   applications where IPv4-only nodes on the Internet initiate traffic
   towards the IPv6-only services.
   While it is possible to combine SIIT
   with DNS64 [RFC6147] or similar mechanisms in order to allow an IPv6-
   only server to initiate communication with IPv4 nodes through an SIIT
   gateway, this may be more complicated to implement, as the server
   must ensure that it always uses the address statically mapped on the
   SIIT gateway as its source address when initiating communication.

   In particular, HTTP [RFC2616] is a good choice of application
   protocol to start deploying SIIT with, as it is both ubiquitous and
   known to work very well through address translation.

   Note that implementations of SIIT may bundle Application Level
   Gateways (ALGs) to add specific support for certain application
   protocols that would otherwise break, similar to what is commonly
   done with NAT44 implementations.  If ALGs are being used, care must
   be taken to ensure that all the translators in the network have
   compatible ALGs.

3.2.  Choice of Translation Prefix

   Either a Network-Specific Prefix (NSP) from the provider's own IPv6
   address space or the IANA-allocated Well-Known Prefix 64:ff9b::/96
   (WKP) may be used.  From a technical point of view, both should work
   equally well.  However, as only a single WKP exists, a provider that
   would like to deploy more than one instance of SIIT in his network,
   or Stateful NAT64 [RFC6146] alongside SIIT, must use an NSP for all
   but one of those deployments anyway.

   Furthermore, the WKP cannot be used in inter-domain routing.  By
   using an NSP, a provider will have the possibility to sell SIIT
   service to other operators.

   For these reasons, this document recommends that an NSP is used.
   Section 3.3 of [RFC6052] discusses the choice of translation prefix
   in more detail.

   The prefix may use any of the lengths described in section 2.2 of
   [RFC6052], but /96 has two distinct advantages over the others.
   First, converting it to IPv4 can be done in a single operation by
   simply stripping off the first 96 bits; second, it allows for IPv4
   addresses to be embedded directly into the text representation of an
   IPv6 address using the familiar dotted quad notation, e.g.,
   "2001:db8::192.0.2.10" (see section 2.4 of [RFC6052]), instead of
   being converted to hexadecimal notation.  This makes it easier to
   write IPv6 ACLs and similar rules that match translated endpoints in
   the IPv4 Internet.  Use of a /96 prefix length is therefore
   recommended.

3.3.  Routing Considerations

   The IPv4 service address prefix(es) and the IPv6 translation prefix
   may be routed to the SIIT gateway(s) as any other IPv4 or IPv6 route
   in the provider's network.

   If more than one SIIT gateway is being deployed, it is recommended
   that a dynamic routing protocol (such as BGP, IS-IS, or OSPF) is
   used to advertise the routes within the provider's network.
   This will ensure that the traffic that is to be translated will reach
   the closest translator, reducing or eliminating traffic trampolines,
   as well as provide high availability - if one translator fails, the
   dynamic routing protocol will automatically redirect the traffic to
   the next-best translator.

3.4.  Location of the Translators

   In order to prevent traffic trampolines, it is optimal to place the
   translators as close as possible to the direct path between the
   servers and the end users.

   Ideally, they are implemented as a logical function within the IP
   routers that would handle the traffic anyway (if it wasn't to be
   translated).  This way, the translation service would not need
   separate network ports to be assigned (which might become saturated
   and impact the service), nor would it need extra rack space or
   energy.  Some good choices of the location could be within a data
   centre's access routers, or inside the provider's border routers.
   If every single application in the data centre or the provider's
   network eventually gets single-stacked, there would be no need to run
   IPv4 on the inside of the translators - thus allowing the operator to
   reclaim IPv4 addresses from the network infrastructure that may
   instead be used for translated services.

3.5.  Migration from Dual Stack

   While this document discusses the use of IPv6-only servers and
   applications, there is no technical requirement that the servers are
   IPv4 free.  SIIT works equally well for dual stacked servers, which
   makes migration easy - after setting up the translation function, the
   DNS A record for the service is updated to point to the IPv4 address
   that will be translated to IPv6, while the previously used IPv4
   service address may continue to be assigned to the server.  This
   makes roll-back to dual stack easy, as it is only a matter of
   changing the DNS record back to what it was before.

   For high-volume services migrating to SIIT from dual stack, DNS Round
   Robin may be used to gradually migrate the service's IPv4 traffic
   from its native IPv4 address(es) to the translated one(s).

3.6.  Packet Size and Fragmentation Considerations

   There are two key differences between IPv4 and IPv6 relating to
   packet sizes that one should consider when deploying SIIT.  They
   result in a few problematic corner cases, which can be dealt with in
   a few different ways.

   The operator may find that relying on fragmentation in the IPv6
   domain is undesired or even operationally impossible [FRAGDROP].  For
   this reason, the recommendations in this section seek to minimise
   the use of IPv6 fragmentation.

   Unless otherwise stated, this section assumes that the MTU in both
   the IPv4 and IPv6 domains is 1500 bytes.

3.6.1.  IP Header Size Difference

   The IPv6 header is up to 20 bytes larger than the IPv4 header.
   This means that a full-size 1500-byte IPv4 packet cannot be
   translated to IPv6 without being fragmented, as it would otherwise
   likely result in a 1520-byte IPv6 packet.

   If the transport protocol used is TCP, this is generally not a
   problem, as the IPv6 server will advertise a TCP MSS of 1440 bytes.
   This causes the client to never send larger packets than what can be
   translated to a single full-size IPv6 packet, eliminating any need
   for fragmentation.

   For other transport protocols, full-size IPv4 packets with the DF
   flag cleared will need to be fragmented by the SIIT gateway.  The
   only way to avoid this is to increase the Path MTU between the SIIT
   gateway and the servers to 1520 bytes.  Note that the servers' MTU
   SHOULD NOT be increased accordingly, as that would cause them to
   undergo Path MTU Discovery for most native IPv6 destinations.
   The servers would nonetheless need to be able to accept and process
   incoming packets larger than their own MTU.  However, if the server's
   IPv6 implementation allows the MTU to be set differently for specific
   destinations, it MAY be increased to 1520 for destinations within the
   translation prefix specifically.

3.6.2.  Minimum Path MTU Difference

   The minimum allowed MTU in IPv6 is 1280 bytes, while no such
   restriction exists in IPv4.  This means that a 1280-byte IPv6
   packet sent to an IPv4 client may need to be fragmented by a router
   in the IPv4 network.

   By default, an SIIT gateway will set the DF flag when translating
   from IPv6 to IPv4, resulting in a situation where the IPv6 server may
   receive an ICMPv6 Packet Too Big where the indicated MTU value is
   less than the IPv6 minimum of 1280.
   In this situation, the IPv6 server has two choices on how to proceed,
   according to the last paragraph of section 5 of [RFC2460]:

   o  It may reduce its Path MTU value to the value indicated in the
      Packet Too Big.  This causes no problems for the SIIT function.

   o  It may reduce its Path MTU value to 1280, and also include a
      Fragmentation header in each subsequent packet sent to that
      destination.  This instructs the SIIT gateway to clear the DF flag
      in the resulting IPv4 packet, and also provides the Identification
      value.

   If the use of the IPv6 Fragmentation header is problematic, and the
   operator has IPv6 servers that implement the second option above, the
   operator should enable a feature on the SIIT gateways which ensures
   that the resulting MTU field is always set to 1280 or higher when
   translating ICMPv4 Need to Fragment into ICMPv6 Packet Too Big, and
   that when translating IPv6 packets smaller than or equal to 1280
   bytes the resulting IPv4 packets will have the DF flag cleared and an
   Identification value generated, cf. Section 4.5.

3.6.3.  "Atomic Fragments"

   By default, an SIIT gateway will include a Fragmentation header in
   the resulting IPv6 packet when translating from an IPv4 packet with
   the DF flag cleared, cf. section 4 of [RFC6145].

   This happens even though the resulting IPv6 packets aren't actually
   fragmented into several pieces, resulting in "Atomic Fragments"
   [ATOMFRAG].  This is generally not useful in a data centre
   environment, and it is therefore recommended that this behaviour is
   disabled at the SIIT gateways.  See Section 4.4.

4.  Implementation Requirements

   [RFC6145] and [RFC6052] specify the basic SIIT gateway.  However,
   they specify some optional features that are very desirable when
   deploying SIIT in a data centre environment.
   This section lists the additional features that are required for an
   SIIT gateway optimised for a data centre environment.

4.1.  Basic Requirements

   The implementation MUST implement [RFC6145] with the algorithmic
   address mapping defined in [RFC6052].  It MUST NOT create any
   per-session state under any circumstance.

4.2.  Static Address Mapping Function

   The implementation MUST allow the operator to configure an arbitrary
   number of static mappings which override the default [RFC6052]
   algorithm.  It SHOULD be possible to specify a single bidirectional
   mapping that will be used in both the IPv4=>IPv6 and IPv6=>IPv4
   directions, but the implementation MAY additionally (or
   alternatively) support unidirectional mappings.

   An example of such a bidirectional static mapping would be:

   o  198.51.0.1 <=> 2001:db8:12:34::c

   To accomplish the same using unidirectional mappings, the following
   two mappings must instead be configured:

   o  198.51.0.1 => 2001:db8:12:34::c

   o  2001:db8:12:34::c => 198.51.0.1

   In both cases, if the gateway receives an IPv6 packet that has
   2001:db8:12:34::c in either the source or destination field of the
   IP header, it MUST rewrite this field to 198.51.0.1 when translating
   to IPv4.  Similarly, if the gateway receives an IPv4 packet that has
   198.51.0.1 in either the source or destination field of the IP
   header, it MUST rewrite this field to 2001:db8:12:34::c when
   translating to IPv6.  For all IPv4 or IPv6 source or destination
   field values for which there is no static mapping, [RFC6052] mapping
   MUST be used.

4.3.  Support for Increasing the IPv6 Path MTU

   In order to prevent unnecessary use of the IPv6 Fragmentation
   header, the implementation MUST support increasing the IPv6 Path MTU
   from its default value of 1280, as described in section 4 of
   [RFC6145].

4.4.  Support for Disabling "Atomic Fragments"

   The translator MUST provide a configuration function that allows the
   translator not to include the Fragment Header for non-fragmented
   IPv6 packets, cf. section 4 of [RFC6145].

4.5.  Feature for Handling IPv4 Path MTUs Lower than 1260

   In order to prevent unnecessary fragments, the implementation MUST
   support a feature which, if enabled by the operator, changes the
   translator's default behaviour as follows:

   o  When translating an ICMPv4 Need to Fragment packet indicating a
      Path MTU smaller than or equal to 1260, the MTU field in the
      resulting ICMPv6 Packet Too Big is set to 1280.

   o  When translating an IPv6 packet that is smaller than or equal to
      1280 bytes, the DF flag in the resulting IPv4 packet is cleared,
      and an Identification value is generated.  The translator MUST
      NOT generate any state as a result of this.

   This is a modified version of the second approach described in
   section 6 of [RFC6145].  The default state of the feature SHOULD be
   disabled.

   For the definition of an "Atomic Fragment", see [ATOMFRAG].

4.6.  Loop Prevention Mechanism

   As noted in Section 8.2, there is a potential for packets looping
   through the SIIT function if it receives an IPv4 packet for which
   there is no static mapping.  It is therefore RECOMMENDED that the
   implementation has a mechanism that automatically prevents this
   behaviour.  One way this could be accomplished would be to discard
   any IPv4 packets that would be translated into an IPv6 packet that
   would be routed straight back into the SIIT function.

   If such a mechanism isn't provided, the implementation MUST provide
   a way to manually filter or null-route the destination addresses
   that would otherwise cause loops.

5.  Acknowledgements

   TBD

6.  Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

7.  IANA Considerations

   This draft makes no request of the IANA.

8.  Security Considerations

8.1.  Mistaking the Translation Prefix for a Trusted Network

   If a Network-Specific Prefix from the provider's own address space
   is chosen for the translation prefix, as is recommended, care must
   be taken if the translation service is used in front of services
   that have application-level ACLs that distinguish between the
   operator's own networks and the Internet at large.  Translated IPv4
   end users on the Internet will appear to come from within the
   provider's own IPv6 address space.  It is therefore important that
   the translation prefix is treated the same as the Internet at large,
   rather than as a trusted network.

8.2.  Packets Looping Through the SIIT Function

   If the SIIT gateway receives an IPv4 packet destined to an address
   for which there is no static mapping, its destination address will
   be rewritten according to [RFC6052], making the resulting IPv6
   packet have a destination address within the translation prefix,
   which is likely routed back to the SIIT function.  This will cause
   the packet to loop until its Time To Live / Hop Limit reaches zero,
   potentially creating a Denial Of Service vulnerability.

   To avoid this, it should be ensured that packets sent to IPv4
   destination addresses for which there are no static mappings, or
   whose resulting IPv6 address does not have a more-specific route to
   the IPv6 network, are immediately discarded.

9.  References

9.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC6052]  Bao, C., Huitema, C., Bagnulo, M., Boucadair, M., and X.
              Li, "IPv6 Addressing of IPv4/IPv6 Translators", RFC 6052,
              October 2010.

   [RFC6145]  Li, X., Bao, C., and F. Baker, "IP/ICMP Translation
              Algorithm", RFC 6145, April 2011.

9.2.  Informative References

   [ATOMFRAG]
              Gont, F., "Processing of IPv6 "atomic" fragments",
              December 2011.

   [FRAGDROP]
              Jaeggli, J., Colitti, L., Kumari, W., Vyncke, E., Kaeo,
              M., and T. Taylor, "Why Operators Filter Fragments and
              What It Implies", October 2012.

   [RFC2373]  Hinden, R. and S. Deering, "IP Version 6 Addressing
              Architecture", RFC 2373, July 1998.

   [RFC2460]  Deering, S. and R. Hinden, "Internet Protocol, Version 6
              (IPv6) Specification", RFC 2460, December 1998.

   [RFC2616]  Fielding, R., Gettys, J., Mogul, J., Frystyk, H.,
              Masinter, L., Leach, P., and T. Berners-Lee, "Hypertext
              Transfer Protocol -- HTTP/1.1", RFC 2616, June 1999.

   [RFC2663]  Srisuresh, P. and M. Holdrege, "IP Network Address
              Translator (NAT) Terminology and Considerations",
              RFC 2663, August 1999.

   [RFC2991]  Thaler, D. and C. Hopps, "Multipath Issues in Unicast and
              Multicast Next-Hop Selection", RFC 2991, November 2000.

   [RFC2993]  Hain, T., "Architectural Implications of NAT", RFC 2993,
              November 2000.

   [RFC3022]  Srisuresh, P. and K. Egevang, "Traditional IP Network
              Address Translator (Traditional NAT)", RFC 3022,
              January 2001.

   [RFC3235]  Senie, D., "Network Address Translator (NAT)-Friendly
              Application Design Guidelines", RFC 3235, January 2002.

   [RFC4213]  Nordmark, E. and R. Gilligan, "Basic Transition
              Mechanisms for IPv6 Hosts and Routers", RFC 4213,
              October 2005.

   [RFC6146]  Bagnulo, M., Matthews, P., and I. van Beijnum, "Stateful
              NAT64: Network Address and Protocol Translation from IPv6
              Clients to IPv4 Servers", RFC 6146, April 2011.

   [RFC6147]  Bagnulo, M., Sullivan, A., Matthews, P., and I. van
              Beijnum, "DNS64: DNS Extensions for Network Address
              Translation from IPv6 Clients to IPv4 Servers", RFC 6147,
              April 2011.

Author's Address

   Tore Anderson
   Redpill Linpro
   Herregaardsveien 8B
   NO-1168 Oslo
   NORWAY

   Phone: +47 959 31 212
   Email: tore.anderson@redpill-linpro.com
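
   As a non-normative illustration of the static address mapping in
   Section 4.2 and the loop check suggested in Section 4.6, the
   following Python sketch may be helpful.  It is not part of the
   specification.  The translation prefix 2001:db8:46::/96, the
   function names, and the simplification that everything inside the
   translation prefix is routed back to the SIIT function are all
   assumptions made for this example only.  The static mapping is the
   one used as an example in Section 4.2.

```python
import ipaddress

# Hypothetical /96 translation prefix, chosen only for this sketch.
TRANSLATION_PREFIX = ipaddress.IPv6Network("2001:db8:46::/96")

# Bidirectional static mapping from the example in Section 4.2.
STATIC_MAP = {
    ipaddress.IPv4Address("198.51.0.1"):
        ipaddress.IPv6Address("2001:db8:12:34::c"),
}
# Derived reverse table, so one entry serves both directions.
REVERSE_MAP = {v6: v4 for v4, v6 in STATIC_MAP.items()}


def v4_to_v6(addr):
    """Translate one IPv4 source or destination address to IPv6."""
    if addr in STATIC_MAP:
        return STATIC_MAP[addr]
    # No static mapping: embed the 32 address bits in the low-order
    # bits of the /96 prefix (the /96 case of the RFC 6052 format).
    prefix_bits = int(TRANSLATION_PREFIX.network_address)
    return ipaddress.IPv6Address(prefix_bits | int(addr))


def v6_to_v4(addr):
    """Translate one IPv6 source or destination address to IPv4."""
    if addr in REVERSE_MAP:
        return REVERSE_MAP[addr]
    if addr not in TRANSLATION_PREFIX:
        raise ValueError("no static mapping and not in the prefix")
    # Extract the embedded IPv4 address from the low-order 32 bits.
    return ipaddress.IPv4Address(int(addr) & 0xFFFFFFFF)


def would_loop(ipv4_dst):
    """Simplified loop check: the translated destination lands inside
    the translation prefix, which this sketch assumes is routed back
    to the SIIT function."""
    return v4_to_v6(ipv4_dst) in TRANSLATION_PREFIX
```

   Both lookups are plain table reads followed by a fixed arithmetic
   fallback, so no per-session state is created.  A gateway following
   this sketch would discard an IPv4 packet whose destination makes
   would_loop() return true instead of translating it.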