idnits 2.17.1 draft-ietf-v6ops-siit-dc-01.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- == There are 2 instances of lines with non-RFC3849-compliant IPv6 addresses in the document. If these are example addresses, they should be changed. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year -- The document date (June 28, 2015) is 3225 days in the past. Is this intentional? Checking references for intended status: Informational ---------------------------------------------------------------------------- == Missing Reference: 'BGP' is mentioned on line 896, but not defined == Outdated reference: A later version (-03) exists of draft-ietf-v6ops-siit-eam-00 ** Obsolete normative reference: RFC 6145 (Obsoleted by RFC 7915) == Outdated reference: A later version (-08) exists of draft-ietf-6man-deprecate-atomfrag-generation-01 == Outdated reference: A later version (-02) exists of draft-ietf-v6ops-siit-dc-2xlat-00 -- Obsolete informational reference (is this intentional?): RFC 1981 (Obsoleted by RFC 8201) -- Obsolete informational reference (is this intentional?): RFC 2460 (Obsoleted by RFC 8200) -- Obsolete informational reference (is this intentional?): RFC 7230 (Obsoleted by RFC 9110, RFC 9112) Summary: 1 error (**), 0 flaws (~~), 6 warnings (==), 4 comments (--). Run idnits with the --verbose option for more detailed information about the items above. 
-------------------------------------------------------------------------------- 2 IPv6 Operations T. Anderson 3 Internet-Draft Redpill Linpro 4 Intended status: Informational June 28, 2015 5 Expires: December 30, 2015 7 SIIT-DC: Stateless IP/ICMP Translation for IPv6 Data Centre Environments 8 draft-ietf-v6ops-siit-dc-01 10 Abstract 12 This document describes the use of the Stateless IP/ICMP Translation 13 (SIIT) algorithm in an IPv6 Internet Data Centre (IDC). In this 14 deployment model, traffic from legacy IPv4-only clients on the 15 Internet is translated to IPv6 when it reaches the IDC operator's 16 network infrastructure. From that point on, it is treated just as if 17 it were traffic from any other IPv6-capable end user. This 18 facilitates a single-stack IPv6-only network infrastructure, as well 19 as efficient utilisation of public IPv4 addresses. 21 The primary audience is IDC operators who are deploying IPv6, running 22 out of available IPv4 addresses, and/or feel that dual stack causes 23 undesirable operational complexity. 25 Status of This Memo 27 This Internet-Draft is submitted in full conformance with the 28 provisions of BCP 78 and BCP 79. 30 Internet-Drafts are working documents of the Internet Engineering 31 Task Force (IETF). Note that other groups may also distribute 32 working documents as Internet-Drafts. The list of current Internet- 33 Drafts is at http://datatracker.ietf.org/drafts/current/. 35 Internet-Drafts are draft documents valid for a maximum of six months 36 and may be updated, replaced, or obsoleted by other documents at any 37 time. It is inappropriate to use Internet-Drafts as reference 38 material or to cite them other than as "work in progress." 40 This Internet-Draft will expire on December 30, 2015. 42 Copyright Notice 44 Copyright (c) 2015 IETF Trust and the persons identified as the 45 document authors. All rights reserved. 
47 This document is subject to BCP 78 and the IETF Trust's Legal 48 Provisions Relating to IETF Documents 49 (http://trustee.ietf.org/license-info) in effect on the date of 50 publication of this document. Please review these documents 51 carefully, as they describe your rights and restrictions with respect 52 to this document. Code Components extracted from this document must 53 include Simplified BSD License text as described in Section 4.e of 54 the Trust Legal Provisions and are provided without warranty as 55 described in the Simplified BSD License.
57 Table of Contents
59 1. Introduction
60 1.1. Single Stack IPv6 Operation
61 1.2. Stateless Operation
62 1.3. IPv4 Address Conservation
63 1.4. Clients' IPv4 Source Addresses Visible to Applications
64 1.5. Compatible with Standard IPv4 and IPv6 Stacks
65 2. Terminology
66 3. Architectural Overview
67 3.1. Packet Flow
68 4. Deployment Considerations and Guidelines
69 4.1. Application/Device Support for IPv6
70 4.2. Application Support for NAT
71 4.3. Application Communication Pattern
72 4.4. Choice of Translation Prefix
73 4.5. Routing Considerations
74 4.6. Location of the SIIT-DC Border Relays
75 4.7. Migration from Dual Stack
76 4.8. Translation of ICMPv6 Errors to IPv4
77 4.9. MTU and Fragmentation
78 4.9.1. IPv4/IPv6 Header Size Difference
79 4.9.2. IPv6 Atomic Fragments
80 4.9.3. Minimum Path MTU Difference Between IPv4 and IPv6
81 4.10. IPv4-translatable IPv6 Service Addresses
82 5. Acknowledgements
83 6. IANA Considerations
84 7. Security Considerations
85 7.1. Mistaking the Translation Prefix for a Trusted Network
86 8. References
87 8.1. Normative References
88 8.2. Informative References
89 Appendix A. Complete SIIT-DC IDC topology example
90 Author's Address
92 1. Introduction 94 Historically, dual stack [RFC4213] [RFC6883] has been the recommended 95 way to transition from a legacy IPv4-only environment to one capable 96 of serving IPv6 users. However, for IDC operators, dual stack 97 operation has a number of disadvantages compared to single stack 98 operation. In particular, running two protocols rather than one 99 results in increased complexity and operational overhead, with little 100 return on investment for as long as large parts of the public 101 Internet remain predominantly IPv4-only. Furthermore, the dual 102 stack approach does not in any way help with the depletion of the 103 IPv4 address space, which at the time of writing is a pressing 104 concern in most parts of the world. 106 Therefore, some IDC operators may instead prefer an approach in which 107 they only need to operate one protocol in the data centre as they 108 prepare for the future. SIIT-DC is one such approach. Its design 109 goals include: 111 o Promote the deployment of native IPv6 services (cf. [RFC6540]). 113 o Provide IPv4 service availability for legacy users with no loss of 114 performance or functionality. 
116 o Ensure that the legacy users' IPv4 addresses remain 117 visible to the nodes and applications. 119 o Conserve and maximise the utilisation of the operator's public 120 IPv4 addresses. 122 o Avoid introducing more complexity than absolutely necessary, 123 especially on the nodes and applications. 125 o Be easy to scale and deploy in a fault-tolerant manner. 127 The following subsections elaborate on how SIIT-DC meets these 128 goals. 130 1.1. Single Stack IPv6 Operation 132 SIIT-DC allows IDC operators to build their infrastructure and 133 applications on an IPv6-only foundation. IPv4 end-user connectivity 134 becomes a service provided by the network, which systems 135 administration and application development staff do not need to 136 concern themselves with. This promotes universal IPv6 deployment for 137 the IDC operator's services and applications. 139 SIIT-DC requires no special support or change from the underlying 140 IPv6 infrastructure; it is compatible with all standard IPv6 141 networks. Traffic between IPv6-enabled end users and IPv6-enabled 142 services will always be transported natively end-to-end; SIIT-DC does 143 not intercept or handle native IPv6 traffic at all. 145 When the day comes to discontinue all support for IPv4, no change 146 needs to be made to the overall architecture - it's only a matter of 147 shutting off the BRs. Operators who deploy native IPv6 along with 148 SIIT-DC will thus avoid requiring any future migration or deployment 149 projects relating to IPv6 deployment and/or IPv4 sun-setting. 151 1.2. Stateless Operation 153 Unlike other solutions that provide either dual stack availability to 154 single-stack services (e.g., Stateful NAT64 [RFC6146] and Layer-4/7 155 proxies), or that provide conservation of IPv4 addresses (e.g., 156 NAPT44 [RFC3022]), SIIT-DC does not keep any state between the 157 packets of a single connection or flow. 
In this sense it operates 158 exactly like a regular IP router, and has similar scaling properties 159 - the limiting factors are packets per second and bandwidth. The 160 number of concurrent flows and flow initiation rates are irrelevant 161 for performance. 163 This not only allows individual BRs to easily attain "line rate" 164 performance, it also allows for per-packet load balancing between 165 multiple BRs using Equal-Cost Multipath Routing [RFC2991]. 166 Asymmetric routing is also acceptable, which makes it easy to avoid 167 sub-optimal traffic patterns; the prefixes involved may be anycasted 168 from all the BRs in the provider's network, thus ensuring that the 169 optimal path through the network is used, even where the optimal 170 path in one direction differs from the optimal path in the opposite 171 direction. 173 Finally, stateless operation means that high availability is easily 174 achieved. If a BR should fail, its traffic can be re-routed onto 175 another BR using a standard IP routing protocol. This does not 176 impact existing flows any more than what any other IP re-routing 177 event would. 179 1.3. IPv4 Address Conservation 181 In most parts of the world, it is difficult or even impossible to 182 obtain a generously sized IPv4 delegation from the Internet Numbers 183 Registry System [RFC7020]. The resulting scarcity in turn impacts 184 individual end users and operators, who might be forced to purchase 185 IPv4 addresses from other operators in order to cover their needs. 186 This process can put business continuity at risk if no 187 suitable block for sale can be located, and/or turn out to be 188 prohibitively expensive. In spite of this, an IDC operator will find 189 that providing IPv4 service remains essential, as a large share of 190 the Internet end users still do not have IPv6 connectivity. 
192 A key goal of SIIT-DC is to help reduce a data centre operator's IPv4 193 address requirement to the absolute minimum, by allowing the operator 194 to remove them entirely from nodes and applications that do not need 195 to communicate with endpoints in the IPv4 Internet. One example 196 would be servers that are operating in a supporting/back-end role and 197 only communicate with other servers (database servers, file 198 servers, and so on). Another example would be the network 199 infrastructure itself (router-to-router links, loopback addresses, 200 and so on). Furthermore, as LAN prefix sizes must always be rounded 201 up to the nearest power of two (or larger, if one reserves space for 202 future growth), even more IPv4 addresses will often end up being 203 wasted without even being used. 205 With SIIT-DC, the operator can remove these valuable IPv4 addresses 206 from his back-end servers and network infrastructure, and reassign 207 them to the SIIT-DC service as IPv4 Service Addresses. There exists 208 no requirement that IPv4 Service Addresses are assigned in an 209 aggregated manner, so there is nothing lost due to infrastructure 210 overhead; every single IPv4 address assigned to SIIT-DC can be used 211 as an IPv4 Service Address. 213 1.4. Clients' IPv4 Source Addresses Visible to Applications 215 SIIT-DC uses the [RFC6052] algorithm to map the end user's entire 216 IPv4 source address into a predefined IPv6 Translation Prefix. This 217 ensures that there is no loss of information; the end user's IPv4 218 source address remains available to the application, allowing it to 219 perform tasks like Geo-Location, logging, abuse handling, and so 220 forth. 222 1.5. Compatible with Standard IPv4 and IPv6 Stacks 224 Except for the introduction of the BRs themselves, no change to the 225 network, nodes, applications, or anything else is required in order 226 to support SIIT-DC. 
SIIT-DC is practically invisible from the point 227 of view of the IPv4 clients, the IPv6 nodes, the IPv6 data centre 228 network, and the IPv4 Internet. SIIT-DC interoperates with all 229 standards-compliant IPv4 or IPv6 stacks. 231 2. Terminology 233 This document makes use of the following terms: 235 SIIT-DC Border Relay (BR) 236 A device or a logical function that performs stateless protocol 237 translation between IPv4 and IPv6 in accordance with [RFC6145] and 238 [I-D.ietf-v6ops-siit-eam]. 240 SIIT-DC Edge Relay (ER) 241 A device or logical function that provides "native" IPv4 242 connectivity to IPv4-only devices or application software. It is 243 very similar in function to a BR, but is typically located close 244 to the IPv4-only component(s) it is supporting rather than on the 245 IDC's outer network border. The ER is an optional component of 246 SIIT-DC. It is discussed in more detail in 247 [I-D.ietf-v6ops-siit-dc-2xlat]. 249 IPv4 Service Address 250 A public IPv4 address with which IPv4-only clients communicate. 251 This communication will be translated to IPv6 by a BR. The 252 service's "IN A" DNS record will typically point to the IPv4 253 Service Address. 255 IPv4 Service Address Pool 256 One or more IPv4 prefixes routed to the BR's IPv4 interface. IPv4 257 Service Addresses are allocated from this pool. Note that this does 258 not necessarily have to be a "pool" per se, as it could also be 259 one or more host routes (whose prefix length is equal to /32). 260 The purpose of using a pool rather than host routes is to 261 facilitate IPv4 route aggregation and ease provisioning of new 262 IPv4 Service Addresses. 264 IPv6 Service Address 265 A public IPv6 address assigned to a node (such as a server or 266 load-balancer) or an individual application in the IPv6 network. 267 IPv6-capable clients communicate directly with the IPv6 Service 268 Address using native IPv6. The service's "IN AAAA" DNS record 269 will typically point to the IPv6 Service Address. 
IPv4-only 270 clients indirectly communicate with the IPv6 Service Address 271 through SIIT-DC. 273 Explicit Address Mapping (EAM) 274 A bi-directional coupling between an IPv4 Service Address and an 275 IPv6 Service Address configured in a BR or ER. When translating 276 between IPv4 and IPv6, the BR/ER changes the address fields in the 277 translated packet's IP header according to any matching EAM. See 278 [I-D.ietf-v6ops-siit-eam]. 280 Translation Prefix 281 An IPv6 prefix into which the entire IPv4 address space is mapped. 282 This prefix is routed to the BR's IPv6 interface. It is either a 283 Network-Specific Prefix or the Well-Known Prefix 64:ff9b::/96, cf. 284 [RFC6052]. When translating between IPv4 and IPv6, a BR/ER 285 inserts or removes the Translation Prefix from the address fields 286 in the translated packet's IP header, unless an EAM for the IP 287 address being translated exists. 289 IPv4-translatable IPv6 addresses 290 As defined in Section 1.3 of [RFC6052]. 292 IDC 293 Short for "Internet Data Centre"; a data centre whose main purpose 294 is to deliver services to the public Internet, the use case SIIT- 295 DC is primarily targeted at. IDCs are typically operated by 296 Internet Content Providers or Managed Services Providers. 298 SIIT 299 The Stateless IP/ICMP Translation algorithm, as specified in 300 [RFC6145]. 302 XLAT 303 Short for "translation". 305 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 306 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 307 document are to be interpreted as described in [RFC2119]. 309 3. Architectural Overview 311 This section describes the basic SIIT-DC architecture.
313                        SIIT-DC Architecture
315   IPv6-capable user          IPv4-only user
316   <2001:db8::ab:cd>          <203.0.113.50>
317           |                         |
318  (the IPv6 Internet)       (the IPv4 Internet)
319           |                         |
320           |    +-[BR]---------<192.0.2.0/24>--------------+
321           |    |                                          |
322           |    |  EAM #1: 192.0.2.1,2001:db8:12:34::1     |
323           |    |  EAM #2..#n: [...]                       |
324           |    |  XLAT Prefix: 2001:db8:46::/96           |
325           |    |                                          |
326           |    +------------<2001:db8:46::/96>------------+
327           |                         |
328      (the IPv6-only data centre network)
329                     |
330           +--<2001:db8:12:34::1>--[v6-only server]-+
331           |         |                              |
332           | +-[2001:db8:12:34::1]--[v6-only app]-+ |
333           | |                    AF_INET6 socket | |
334           | +------------------------------------+ |
335           +----------------------------------------+
337                              Figure 1
339 In Figure 1, 192.0.2.0/24 is the IPv4 Service Address Pool. 340 Individual IPv4 Service Addresses are assigned from this prefix, and 341 traffic destined for it is routed to the BR's IPv4-facing network 342 interface. There are no restrictions on how many IPv4 Service 343 Address Pools are used or their prefix length, as long as they are 344 all routed to the BR's IPv4-facing network interface. 346 When translating packets between IPv4 and IPv6, the BR uses the EAM 347 to replace any occurrence of the IPv4 Service Address (192.0.2.1) 348 with its corresponding IPv6 Service Address (2001:db8:12:34::1). 349 Addresses that do not match any EAM configured in the BR are 350 translated by inserting or removing the Translation Prefix 351 (2001:db8:46::/96), cf. Section 2.2 of RFC6052 [RFC6052]. 353 The BR can be deployed as a separate device or as a logical function 354 in another multi-purpose device, such as an IP router. Any number of 355 BRs may exist simultaneously in the IDC's network infrastructure, as 356 long as they are all configured with the same Translation Prefix and an 357 identical EAM Table. 359 The IPv6 Service Address should be registered in DNS using an "IN 360 AAAA" record, while its corresponding IPv4 Service Address should be 361 registered using an "IN A" record. This ensures that IPv6-capable 362 clients access the application/service directly using native IPv6 363 end-to-end, while IPv4-only clients will access it through SIIT-DC. 365 3.1. 
Packet Flow 367 In this example, the "IPv4-only user" from Figure 1 initiates a 368 connection to the application running on the IPv6-only server. After 369 first having looked up the "IN A" record in DNS, the user starts by 370 transmitting a TCP SYN packet to the IPv4 Service Address. This 371 IPv4 packet is routed to the BR, and is there translated to IPv6 as 372 follows:
374                   IPv4 to IPv6 translation
376   +--[IPv4]----------+       +--[IPv6]-----------------------+
377   | SRC 203.0.113.50 |       | SRC 2001:db8:46::203.0.113.50 |
378   | DST 192.0.2.1    |  -->  | DST 2001:db8:12:34::1         |
379   | TCP SYN [..]     |       | TCP SYN [..]                  |
380   +------------------+       +-------------------------------+
382                           Figure 2
384 The resulting IPv6 packet is routed to the IPv6-only server, which 385 processes and responds to it as if it had been a native IPv6 packet 386 all along. The server's IPv6 response packet is then routed back to 387 the BR, where it is translated back to IPv4 as follows:
389                   IPv6 to IPv4 translation
391   +--[IPv6]-----------------------+       +--[IPv4]----------+
392   | SRC 2001:db8:12:34::1         |       | SRC 192.0.2.1    |
393   | DST 2001:db8:46::203.0.113.50 |  -->  | DST 203.0.113.50 |
394   | TCP SYN/ACK [..]              |       | TCP SYN/ACK [..] |
395   +-------------------------------+       +------------------+
397                           Figure 3
399 It is important to note that neither the IPv4 client nor the IPv6 400 server/application needs any special support to participate in SIIT- 401 DC. However, the application may optionally be taught to extract the 402 embedded IPv4 source address from incoming IPv6 packets with source 403 addresses within the Translation Prefix. This will allow it to 404 perform IPv4-specific tasks such as Geo-Location, logging, abuse 405 handling, and so on. 407 4. Deployment Considerations and Guidelines 409 4.1. 
Application/Device Support for IPv6 411 SIIT-DC as described in this document requires that the application 412 (and/or the node the application is located on) supports IPv6 413 networking, and that it has no dependency on local IPv4 network 414 connectivity. However, SIIT-DC supports IPv4-dependent applications 415 and nodes through the introduction of an ER. The ER provides the 416 application or node with seemingly native IPv4 connectivity, by 417 translating the packets (that were previously translated from IPv4 to 418 IPv6 by the BR) back to IPv4 before passing them to the 419 IPv4-dependent application or node. This approach is described in 420 more detail in [I-D.ietf-v6ops-siit-dc-2xlat]. 422 4.2. Application Support for NAT 424 The operator should carefully examine whether or not the application 425 protocols he would like to use SIIT-DC with are able to operate in a 426 network environment where rewriting of IP addresses occurs. In 427 general, if an application layer protocol works correctly through 428 standard NAT44 (see [RFC3235]), it will most likely work correctly 429 through SIIT-DC as well. 431 Higher-level protocols that embed IP addresses as part of their 432 payload are particularly problematic [RFC2663] [RFC2993] [RFC3022]. 433 One well-known example of such a protocol is FTP [RFC0959]. Such 434 protocols can be made to work with SIIT-DC through the introduction 435 of an ER, which provides end-to-end IPv4 address transparency by 436 reversing the translations performed by the BR before passing the 437 packets to the NAT-incompatible application. This approach is 438 described in more detail in [I-D.ietf-v6ops-siit-dc-2xlat]. 440 4.3. Application Communication Pattern 442 SIIT-DC is best suited for traditional client/server applications 443 where IPv4-only clients on the Internet initiate traffic towards an 444 IPv6-only service, which in turn is passively listening for inbound 445 traffic and responding as necessary. 
In this case, an IPv4 client 446 looks exactly like a native IPv6 client from the IPv6 service's 447 point of view, and thus does not require any special treatment. One 448 particularly common application protocol that follows this client/ 449 server communication pattern, and thus is ideally suited for use with 450 SIIT-DC, is HTTP [RFC7230]. 452 It is also possible to combine SIIT-DC with DNS64 [RFC6147] in order 453 to allow an IPv6-only application to initiate communication with 454 IPv4-only nodes through SIIT-DC. However, in this case, care must be 455 taken so that all outgoing communication is sourced from an IPv6 456 Service Address that is found in an EAM configured in the BR. If 457 another address is used, the BR will most likely be unable to 458 translate it to IPv4, causing the packet to be discarded. This could 459 be prevented by altering the Default Address Selection Policy Table 460 [RFC6724] on the IPv6 node. 462 An alternative approach to the above would be to place an ER in front 463 of the application in question, as described in 464 [I-D.ietf-v6ops-siit-dc-2xlat]. This provides the application with 465 seemingly native IPv4 connectivity, which it may use freely for bi- 466 directional communication with the IPv4 Internet. An application or 467 node located behind an ER does not need to worry about selecting a 468 specific source address, as it will only have valid options 469 available. 471 4.4. Choice of Translation Prefix 473 Either a Network-Specific Prefix (NSP) from the provider's own IPv6 474 address space or the IANA-allocated Well-Known Prefix 64:ff9b::/96 475 (WKP) may be used. From a technical point of view, both work equally 476 well. However, only a single WKP exists, so if a provider would like 477 to deploy more than one instance of SIIT-DC in his network, or 478 another translation technology such as Stateful NAT64 [RFC6146], the 479 operator will be forced to use an NSP for all but one of those 480 deployments. 
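As a rough illustration of why an NSP and the WKP behave the same from a technical point of view, the following Python sketch maps an IPv4 address into a /96 Translation Prefix and back, following the /96 address format of [RFC6052]. The function names embed_ipv4 and extract_ipv4 are hypothetical and not taken from any SIIT-DC implementation; only /96 prefixes are handled, and EAMs are ignored entirely.

```python
import ipaddress


def embed_ipv4(prefix: str, ipv4: str) -> ipaddress.IPv6Address:
    """Map an IPv4 address into a /96 Translation Prefix (sketch only)."""
    net = ipaddress.IPv6Network(prefix)
    if net.prefixlen != 96:
        raise ValueError("this sketch only handles /96 prefixes")
    # With a /96 prefix, the IPv4 address occupies the low 32 bits.
    return ipaddress.IPv6Address(
        int(net.network_address) | int(ipaddress.IPv4Address(ipv4))
    )


def extract_ipv4(prefix: str, ipv6: str) -> ipaddress.IPv4Address:
    """Recover the IPv4 address embedded in a /96 Translation Prefix."""
    net = ipaddress.IPv6Network(prefix)
    addr = ipaddress.IPv6Address(ipv6)
    if net.prefixlen != 96 or addr not in net:
        raise ValueError("address is not inside a /96 Translation Prefix")
    # Stripping the first 96 bits leaves the original IPv4 address.
    return ipaddress.IPv4Address(int(addr) & 0xFFFFFFFF)


# The example NSP used in this document, and the WKP.
print(embed_ipv4("2001:db8:46::/96", "203.0.113.50"))
print(embed_ipv4("64:ff9b::/96", "203.0.113.50"))
print(extract_ipv4("2001:db8:46::/96", "2001:db8:46::203.0.113.50"))
```

The only input that differs between the two deployments is the prefix itself; the mapping logic is identical.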
482 Another consideration is that the WKP cannot be used in inter-domain 483 routing. By using an NSP instead, SIIT-DC will support a deployment 484 where the BR and the IPv6 Service Address are located in different 485 Autonomous Systems. 487 The Translation Prefix may use any of the lengths described in 488 Section 2.2 of RFC6052 [RFC6052], but /96 has two distinct advantages 489 over the others. First, converting it to IPv4 can be done in a 490 single operation by simply stripping off the first 96 bits; second, 491 it allows for IPv4 addresses to be embedded directly into the text 492 representation of an IPv6 address using the familiar dotted quad 493 notation, e.g., "2001:db8::198.51.100.10" (cf. Section 2.4 of RFC6052 494 [RFC6052]), instead of being converted to hexadecimal notation. 495 This makes it easier to write IPv6 ACLs and similar that match 496 translated endpoints in the IPv4 Internet. 498 For the reasons discussed above, this document recommends that an NSP 499 with a prefix length of /96 is used. Section 3.3 of [RFC6052] 500 discusses the choice of translation prefix in more detail. 502 4.5. Routing Considerations 504 The prefixes that constitute the IPv4 Service Address Pool and the 505 IPv6 Translation Prefix may be routed to the BRs as any other IPv4 or 506 IPv6 route in the provider's network. If more than one BR is being 507 deployed, it is recommended that a routing protocol (IGP) is used to 508 advertise the routes within the provider's network. This will ensure 509 that the traffic that is to be translated will reach the closest BR, 510 reducing or eliminating sub-optimal traffic patterns, as well as 511 providing high availability: Should one BR fail, the IGP will 512 automatically redirect the traffic to the closest alternate BR. 514 4.6. 
Location of the SIIT-DC Border Relays 516 The goal of SIIT-DC is to facilitate a true IPv6-only application and 517 network architecture, with the sole exception being the IPv4 518 interfaces of the BRs and the network infrastructure required to 519 connect the BRs to the IPv4 Internet. Therefore, the BRs must be 520 located somewhere between the IPv4 Internet and the application 521 delivery stack. This should be understood to include all servers, 522 load balancers, firewalls, intrusion detection systems, and similar 523 devices that are processing traffic to a greater extent than merely 524 forwarding it. 526 It is optimal to place the BRs as close as possible to the direct 527 path between the location of the IPv6 Service Address and the end 528 users. If the closest BR was located a long way from the direct 529 path, all packets in both directions must make a detour in order to 530 traverse the BR. This would increase the RTT between the service and 531 the end user by two times the extra latency incurred by the 532 detour, as well as cause unnecessary load on the network links on the 533 detour path. 535 Where possible, it is beneficial to implement the BRs as a logical 536 function within the routers that would have handled the traffic anyway, 537 had the topology been dual stacked. This way, a SIIT-DC deployment 538 does not require separate network ports (which might become 539 saturated and impact the service quality), nor will it require extra 540 rack space and energy. Some particularly good choices of the 541 location could be within an IDC's access routers, or within the 542 Autonomous System's border routers. 544 Finally, another possibility is that the IDC operator outsources the 545 SIIT-DC service to another entity, for example his upstream ISP. 546 Doing so allows the IDC operator to build a true IPv6-only 547 infrastructure. 549 4.7. 
Migration from Dual Stack 551 While this document mainly discusses the use of IPv6-only nodes and 552 applications, it is important to note that SIIT-DC is fully 553 compatible with dual stack infrastructures, including dual stack 554 nodes and applications. 556 Thus, migrating a dual-stacked service to an IPv6-only one where 557 SIIT-DC provides the IPv4 Internet connectivity is easy. The 558 operator would start out by designating the service's current native 559 IPv6 address as the IPv6 Service Address, and assign it a 560 corresponding IPv4 Service Address. At this point, the service will 561 respond on both its old (native) IPv4 address, and the SIIT-DC IPv4 562 Service Address. The operator may now move traffic from the former 563 to the latter by changing the service's "IN A" DNS record. Once all 564 IPv4 traffic has been successfully moved to SIIT-DC, the old IPv4 565 address may be reclaimed. 567 4.8. Translation of ICMPv6 Errors to IPv4 569 In response to an IPv4 packet subsequently translated to IPv6 by the 570 BR, an IPv6 router in the IDC network may need to transmit an ICMPv6 571 error back to the origin IPv4 node. By default, such an ICMPv6 error 572 will most likely be discarded by the BR, unless the source address of 573 the ICMPv6 error happens to be an IPv4-translatable IPv6 address or 574 covered by an EAM. 576 To facilitate reliable delivery of such ICMPv6 errors, an SIIT-DC 577 operator SHOULD implement the recommendations in [RFC6791] in the 578 BRs. 580 4.9. MTU and Fragmentation 582 There are some key differences between IPv4 and IPv6 relating to 583 packet sizes and fragmentation that one should consider when 584 deploying SIIT-DC. They result in a few problematic corner cases, 585 which can be dealt with in a few different ways. The following 586 subsections will discuss these in detail, and provide operational 587 guidance. 
589 In particular, the operator may find that relying on fragmentation in 590 the IPv6 domain is undesirable or even operationally impossible 591 [I-D.taylor-v6ops-fragdrop]. For this reason, the recommendations in 592 this section seek to minimise the use of IPv6 fragmentation. 594 Unless otherwise stated, the following subsections assume that the 595 MTU in both the IPv4 and IPv6 domains is 1500 bytes. 597 4.9.1. IPv4/IPv6 Header Size Difference 599 The IPv6 header is up to 20 bytes larger than the IPv4 header. This 600 means that a full-size 1500-byte IPv4 packet cannot be 601 translated to IPv6 without being fragmented, as it would otherwise 602 result in an IPv6 packet of up to 1520 bytes. 604 If the transport protocol used is TCP, this is generally not a 605 problem, as the IPv6 node will advertise a TCP MSS of 1440 bytes during 606 the initial TCP handshake. This causes the IPv4 clients to never 607 send larger packets than what can be translated to a single full-size 608 IPv6 packet, eliminating any need for fragmentation. 610 For other transport protocols, full-size IPv4 packets with the DF 611 flag cleared will need to be fragmented by the BR. This may be 612 avoided by increasing the Path MTU between the BR and the IPv6 nodes 613 to 1520 bytes or greater. If this is done, the MTU on the IPv6 nodes 614 themselves SHOULD NOT be increased accordingly, as doing so would 615 cause them to undergo Path MTU Discovery for all destinations on the 616 IPv6 Internet. The nodes MUST however be able to accept and process 617 incoming packets larger than their own MTU. If the nodes' IPv6 618 implementation allows the initial Path MTU to be set differently for 619 specific destinations, it MAY be increased to 1520 for destinations 620 within the Translation Prefix specifically. 622 4.9.2. 
IPv6 Atomic Fragments 624 In keeping with the fifth paragraph of Section 4 of RFC6145 625 [RFC6145], a stateless translator like a BR will by default add an 626 IPv6 Fragmentation header to the resulting IPv6 packet when 627 translating an IPv4 packet with the Don't Fragment flag set to 0. 628 This happens even though the resulting IPv6 packet isn't actually 629 fragmented into several pieces, resulting in an IPv6 Atomic Fragment 630 [RFC6946]. These Atomic Fragments are generally not useful in an IDC 631 environment, and it is therefore recommended that this behaviour is 632 disabled in the BRs. To this end, Section 4 of RFC6145 [RFC6145] 633 notes that the "translator MAY provide a configuration function that 634 allows the translator not to include the Fragment Header for the non- 635 fragmented IPv6 packets". 637 Note that [I-D.ietf-6man-deprecate-atomfrag-generation] seeks to 638 update [RFC6145], making the functionality described above as the 639 standard and only mode of operation. 641 In IPv6, the Identification value is located inside the Fragmentation 642 header. That means that if the generation of IPv6 Atomic Fragments 643 is disabled, the IPv4 Identification value will be lost during 644 translation to IPv6. This could potentially confuse some diagnostic 645 tools. 647 4.9.3. Minimum Path MTU Difference Between IPv4 and IPv6 649 Section 5 of RFC2460 [RFC2460] specifies that the minimum IPv6 link 650 MTU is 1280 bytes. Therefore, an IPv6 node can reasonably assume 651 that if it transmits an IPv6 packet that is 1280 bytes or smaller, it 652 is guaranteed to reach its destination without requiring 653 fragmentation or invoking the Path MTU Discovery algorithm [RFC1981]. 654 However, this assumption might prove false if the destination is an 655 IPv4 node reached through a protocol translator such as a BR, as the 656 minimum IPv4 link MTU is 68 bytes. See Section 3.2 of RFC791 657 [RFC0791]. 

   Section 5.1 of RFC6145 [RFC6145] specifies that a stateless
   translator should set the IPv4 Don't Fragment flag to 1 when it
   translates a non-fragmented IPv6 packet to IPv4. This means that
   when the path to the destination IPv4 node contains an IPv4 link with
   an MTU smaller than 1260 bytes (which corresponds to an IPv6 MTU
   smaller than 1280 bytes, cf. Section 4.9.1), the Path MTU Discovery
   algorithm will be invoked, even if the original IPv6 packet was no
   larger than 1280 bytes. This happens because the IPv4 router
   connected to the IPv4 link with the small MTU returns an ICMPv4 Need
   To Fragment error with an MTU value smaller than 1260, which in turn
   is translated by the BR to an ICMPv6 Packet Too Big error with an MTU
   value smaller than 1280, which is then transmitted to the origin IPv6
   node.

   When an IPv6 node receives an ICMPv6 Packet Too Big error indicating
   an MTU value smaller than 1280, the last paragraph of Section 5 of
   RFC2460 [RFC2460] gives it two choices on how to proceed:

   o  It may reduce its Path MTU value to the value indicated in the
      Packet Too Big, i.e., limit the size of subsequent packets
      transmitted to that destination to the indicated value. This
      approach causes no problems for the SIIT-DC function, as it simply
      allows Path MTU Discovery to work transparently across the BR.

   o  It may reduce its Path MTU value to exactly 1280, and in addition
      include a Fragmentation header in subsequent packets sent to that
      destination. In other words, the IPv6 node will start emitting
      Atomic Fragments. The Fragmentation header signals to the BR
      that the Don't Fragment flag should be set to 0 in the resulting
      IPv4 packet, and it also provides the Identification value.
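
   The arithmetic behind the 1260/1280 thresholds, and the two choices
   listed above, can be illustrated with a small Python sketch. It is a
   simplification and not part of any specification: the function names
   are hypothetical, IPv4 headers are assumed to carry no options, and
   the MTU is adjusted only by the 20-byte header difference (the
   interface MTU limits a real translator would also apply are
   ignored).

```python
from typing import Tuple

IPV4_HEADER = 20       # bytes, assuming an IPv4 header without options
IPV6_HEADER = 40       # bytes, fixed IPv6 header
HEADER_DELTA = IPV6_HEADER - IPV4_HEADER   # the 20-byte difference
IPV6_MIN_MTU = 1280    # minimum IPv6 link MTU


def translate_ptb_mtu(icmpv4_mtu: int) -> int:
    """MTU reported in the ICMPv6 Packet Too Big that results from
    translating an ICMPv4 Need To Fragment error (simplified model)."""
    return icmpv4_mtu + HEADER_DELTA


def node_reaction(ptb_mtu: int, emits_atomic_fragments: bool) -> Tuple[int, bool]:
    """Return (new Path MTU, Fragmentation header included?) for an
    IPv6 node that receives a Packet Too Big with the given MTU.

    emits_atomic_fragments is a hypothetical switch selecting the
    second of the two choices described in the text above.
    """
    if ptb_mtu >= IPV6_MIN_MTU:
        # Ordinary Path MTU Discovery; no special handling needed.
        return ptb_mtu, False
    if emits_atomic_fragments:
        # Second choice: stay at 1280 and add a Fragmentation header.
        return IPV6_MIN_MTU, True
    # First choice: simply honour the reported value.
    return ptb_mtu, False


if __name__ == "__main__":
    reported = translate_ptb_mtu(1000)   # IPv4 link with a 1000-byte MTU
    print(reported)
    print(node_reaction(reported, emits_atomic_fragments=False))
    print(node_reaction(reported, emits_atomic_fragments=True))
```

   In this model, an IPv4 link MTU of 1259 bytes is the largest value
   that still yields a Packet Too Big below 1280, which is why the text
   above speaks of IPv4 MTUs "smaller than 1260".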

   If the use of the IPv6 Fragmentation header is problematic, and the
   operator has IPv6 nodes that implement the second option above, the
   operator should consider enabling the functionality described as the
   "second approach" in Section 6 of RFC6145 [RFC6145]. This
   functionality changes the BR's behaviour as follows:

   o  When translating ICMPv4 Need To Fragment to ICMPv6 Packet Too Big,
      the resulting packet will never contain an MTU value lower than
      1280. This prevents the IPv6 nodes from generating Atomic
      Fragments.

   o  When translating IPv6 packets smaller than or equal to 1280 bytes,
      the Don't Fragment flag in the resulting IPv4 packet will be set
      to 0. This ensures that if the path contains an IPv4 link with an
      MTU smaller than 1260, the IPv4 router connected to that link will
      be responsible for fragmenting the packet before forwarding it
      towards its destination.

   In summary, this approach could be seen as prompting the IPv4
   protocol itself to provide the "link-specific fragmentation and
   reassembly at a layer below IPv6" required for links that "cannot
   convey a 1280-octet packet in one piece", to paraphrase Section 5 of
   RFC2460 [RFC2460]. Note that
   [I-D.ietf-6man-deprecate-atomfrag-generation] seeks to update
   [RFC6145], making the approach described above the standard and
   only mode of operation.

4.10. IPv4-translatable IPv6 Service Addresses

   SIIT-DC is designed so that the IPv6 Service Addresses are not
   required to be IPv4-translatable IPv6 addresses. Section 2 of
   I-D.ietf-v6ops-siit-eam [I-D.ietf-v6ops-siit-eam] discusses why it is
   desirable to avoid requiring the use of IPv4-translatable IPv6
   addresses.

   It is however quite possible to deploy SIIT-DC in combination with
   IPv4-translatable IPv6 Service Addresses.
   The primary benefits in doing so are:

   o  The operator is not required to provision EAMs for
      IPv4-translatable IPv6 Service Addresses onto the BR/ERs.

   o  [RFC6145] translation can be performed in a checksum-neutral
      manner, cf. Section 4.1 of RFC6052 [RFC6052].

   The trade-off is that the IPv4-translatable IPv6 Service Addresses
   must be configured on the IPv6 nodes, and the applications must be
   set up to use them, likely in addition to their primary
   (non-IPv4-translatable) IPv6 addresses. The IPv4-translatable IPv6
   Service Addresses must also be routed from the BR through the IDC's
   IPv6 network infrastructure to the nodes on which they are assigned.
   This essentially requires the entire IPv6 infrastructure to be made
   aware of and handle translated IPv4 traffic as a special case, which
   significantly increases complexity. Avoiding such drawbacks is a
   design goal of SIIT-DC, cf. Section 1.1; the use of
   IPv4-translatable IPv6 Service Addresses is therefore discouraged.

5. Acknowledgements

   The author would like to thank the following individuals for their
   contributions, suggestions, corrections, and criticisms: Fred Baker,
   Cameron Byrne, Brian E Carpenter, Ross Chandler, Dagfinn Ilmari
   Mannsaaker, Lars Olafsen, Stig Sandbeck Mathisen, Knut A. Syed,
   Andrew Yourtchenko.

6. IANA Considerations

   This draft makes no request of the IANA. The RFC Editor may remove
   this section prior to publication.

7. Security Considerations

7.1. Mistaking the Translation Prefix for a Trusted Network

   If a Network-Specific Prefix from the provider's own address space is
   chosen for the Translation Prefix, as recommended in Section 4.4,
   care must be taken if the translation service is used in front of
   services that have application-level ACLs that distinguish between
   the operator's own networks and the Internet at large. Traffic from
   translated IPv4 end users on the Internet might otherwise appear to
   originate from the provider's own network. It is therefore
   important that the Translation Prefix is treated the same as the
   Internet at large, rather than as a trusted network.

   In order to alleviate this problem, the operator may opt to use a
   Translation Prefix that is distinct from and not a subset of the IPv6
   prefixes used elsewhere in the network infrastructure.

8. References

8.1. Normative References

   [I-D.ietf-v6ops-siit-eam]
              Anderson, T. and A. Leiva, "Explicit Address Mappings for
              Stateless IP/ICMP Translation",
              draft-ietf-v6ops-siit-eam-00 (work in progress), May 2015.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC6052]  Bao, C., Huitema, C., Bagnulo, M., Boucadair, M., and X.
              Li, "IPv6 Addressing of IPv4/IPv6 Translators", RFC 6052,
              October 2010.

   [RFC6145]  Li, X., Bao, C., and F. Baker, "IP/ICMP Translation
              Algorithm", RFC 6145, April 2011.

   [RFC6791]  Li, X., Bao, C., Wing, D., Vaithianathan, R., and G.
              Huston, "Stateless Source Address Mapping for ICMPv6
              Packets", RFC 6791, November 2012.

8.2. Informative References

   [I-D.ietf-6man-deprecate-atomfrag-generation]
              Gont, F., LIU, S., and T. Anderson, "Deprecating the
              Generation of IPv6 Atomic Fragments",
              draft-ietf-6man-deprecate-atomfrag-generation-01 (work in
              progress), April 2015.

   [I-D.ietf-v6ops-siit-dc-2xlat]
              Anderson, T., "SIIT-DC: Dual Translation Mode",
              draft-ietf-v6ops-siit-dc-2xlat-00 (work in progress),
              January 2015.

   [I-D.taylor-v6ops-fragdrop]
              Jaeggli, J., Colitti, L., Kumari, W., Vyncke, E., Kaeo,
              M., and T. Taylor, "Why Operators Filter Fragments and
              What It Implies", draft-taylor-v6ops-fragdrop-02 (work in
              progress), December 2013.

   [RFC0791]  Postel, J., "Internet Protocol", STD 5, RFC 791, September
              1981.

   [RFC0959]  Postel, J. and J. Reynolds, "File Transfer Protocol", STD
              9, RFC 959, October 1985.

   [RFC1981]  McCann, J., Deering, S., and J. Mogul, "Path MTU Discovery
              for IP version 6", RFC 1981, August 1996.

   [RFC2460]  Deering, S. and R. Hinden, "Internet Protocol, Version 6
              (IPv6) Specification", RFC 2460, December 1998.

   [RFC2663]  Srisuresh, P. and M. Holdrege, "IP Network Address
              Translator (NAT) Terminology and Considerations", RFC
              2663, August 1999.

   [RFC2991]  Thaler, D. and C. Hopps, "Multipath Issues in Unicast and
              Multicast Next-Hop Selection", RFC 2991, November 2000.

   [RFC2993]  Hain, T., "Architectural Implications of NAT", RFC 2993,
              November 2000.

   [RFC3022]  Srisuresh, P. and K. Egevang, "Traditional IP Network
              Address Translator (Traditional NAT)", RFC 3022, January
              2001.

   [RFC3235]  Senie, D., "Network Address Translator (NAT)-Friendly
              Application Design Guidelines", RFC 3235, January 2002.

   [RFC4213]  Nordmark, E. and R. Gilligan, "Basic Transition Mechanisms
              for IPv6 Hosts and Routers", RFC 4213, October 2005.

   [RFC6146]  Bagnulo, M., Matthews, P., and I. van Beijnum, "Stateful
              NAT64: Network Address and Protocol Translation from IPv6
              Clients to IPv4 Servers", RFC 6146, April 2011.

   [RFC6147]  Bagnulo, M., Sullivan, A., Matthews, P., and I. van
              Beijnum, "DNS64: DNS Extensions for Network Address
              Translation from IPv6 Clients to IPv4 Servers", RFC 6147,
              April 2011.

   [RFC6540]  George, W., Donley, C., Liljenstolpe, C., and L. Howard,
              "IPv6 Support Required for All IP-Capable Nodes", BCP 177,
              RFC 6540, April 2012.

   [RFC6724]  Thaler, D., Draves, R., Matsumoto, A., and T. Chown,
              "Default Address Selection for Internet Protocol Version 6
              (IPv6)", RFC 6724, September 2012.

   [RFC6883]  Carpenter, B. and S. Jiang, "IPv6 Guidance for Internet
              Content Providers and Application Service Providers", RFC
              6883, March 2013.

   [RFC6946]  Gont, F., "Processing of IPv6 "Atomic" Fragments", RFC
              6946, May 2013.

   [RFC7020]  Housley, R., Curran, J., Huston, G., and D. Conrad, "The
              Internet Numbers Registry System", RFC 7020, August 2013.

   [RFC7230]  Fielding, R. and J. Reschke, "Hypertext Transfer Protocol
              (HTTP/1.1): Message Syntax and Routing", RFC 7230, June
              2014.

Appendix A. Complete SIIT-DC IDC topology example

   Figure 4 attempts to "tie it all together" and show a more complete
   SIIT-DC topology, in order to better demonstrate its advantageous
   properties discussed in Section 1. These are discussed in more
   detail below.

                      Example SIIT-DC IDC topology

                /--------------------------------\ /---------------\
                |          IPv4 Internet         | | IPv6 Internet |
                \-+----------------------------+-/ \--------+------/
                  |                            |            |
                  | <----------[BGP]---------> |          [BGP]
                  |                            |            |
   +-------<192.0.2.0/24>---------+ +---<192.0.2.0/24>---+  |
   |            BR #1             | |       BR #2        |  |
   | EAM Table:                   | |                    |  |
   | ==========                   | |                    |  |
   | 192.0.2.1,2001:db8:12:34::1  | |                    |  |
   | 192.0.2.2,2001:db8:12:34::2  | | Exactly the same   |  |
   | 192.0.2.3,2001:db8:fe:dc::1  | | configuration as   |  |
   | 192.0.2.4,2001:db8:12:34::4  | | BR #1 has          |  |
   | 192.0.2.5,2001:db8:fe:dc::e  | |                    |  |
   |                              | |                    |  |
   | XLAT Prefix 2001:db8:46::/96 | |                    |  |
   |                              | |                    |  |
   +--------<2001:db8:46::/96>----+ +-<2001:db8:46::/96>-+  |
                     |                      |               |
                     | <------[ECMP]------> |               |
                     |                      |               |
   /-----------------+----------------------+--\            |
   |         IPv6 IDC network w/OSPFv3         +------------/
   \-+--------------------------------+--------/
     |                                |
     | Tenant A's server LAN          | Tenant B's server LAN
     | 2001:db8:12:34::/64            | 2001:db8:fe:dc::/64
     |                                |
     +-- www   ::1    (IPv6+SIIT-DC)  +-- www-lb ::1     (IPv6+SIIT-DC)
     |                                |
     +-- mta   ::2    (IPv6+SIIT-DC)  +-- web    ::80:01 (IPv6-only)
     |                                |          [...]
     +-- ftp   ::3    (IPv6)          +-- web    ::80:99 (IPv6-only)
     |         ::4    (IPv4, via ER)  |
     |                                |         +----+
     +-- app01 ::a:01 (IPv6-only)     \---- ::e | ER | --\
     |         [...]                            +----+   |
     +-- app99 ::a:99 (IPv6-only)                          |
     |                                  ftp 192.0.2.5 ---/
     +-- db01  ::d:01 (IPv6-only)
     |         [...]
     \-- db99  ::d:99 (IPv6-only)

                                Figure 4

   Single Stack IPv6 Operation
      As discussed in Section 1.1, SIIT-DC facilitates an IPv6-only IDC
      network infrastructure. The only places where IPv4 is absolutely
      required are between the BRs and the IPv4 Internet, and between
      any ERs and the IPv4-only applications or devices they are
      serving (illustrated here as the two tenants' FTP servers).
      The figure also illustrates how SIIT-DC does not interfere with
      native IPv6; when there is no longer a need to support IPv4
      clients, the BRs may be decommissioned without causing any impact
      to native IPv6 traffic.

   Stateless Operation
      As discussed in Section 1.2, SIIT-DC operates in a stateless
      fashion. In the illustration, both BRs are simultaneously
      advertising (i.e., anycasting) the IPv4 Service Address Pool and
      the IPv6 Translation Prefix, so incoming traffic from the IPv4
      Internet may arrive at either of the BRs, while outgoing IPv6
      traffic destined for IPv4 endpoints is load balanced between them
      using Equal-Cost Multipath Routing. No continuous state
      synchronisation between the two BRs occurs. Should one of the BRs
      fail, the BGP and OSPF protocols will ensure that traffic
      converges on the remaining BR. Existing sessions will not be
      disrupted, beyond any disruption caused by the BGP/OSPF
      convergence process itself.

   IPv4 Address Conservation
      As discussed in Section 1.3, SIIT-DC conserves the IDC operator's
      IPv4 address space. Even though the two customers in the example
      above have several hundred servers, the majority of them are not
      used to run services made available directly from the Internet,
      and therefore do not need to consume IPv4 addresses. The IDC
      network infrastructure consumes no IPv4 addresses, either.
      Finally, the IPv4 addresses that are assigned to the SIIT-DC
      function as IPv4 Service Address Pools may be assigned with 100%
      efficiency, one address at a time; there is no requirement to
      assign multiple addresses to a single customer in a contiguous
      block.

   Application Support
      As discussed in Section 1.5, as long as the application protocol
      is translation-friendly (illustrated here with HTTP and SMTP), it
      will work with SIIT-DC without requiring any special adaptation.
      Furthermore, translation-unfriendly applications (illustrated here
      with FTP) will also work when located behind an ER
      [I-D.ietf-v6ops-siit-dc-2xlat]. Tenant A's FTP server illustrates
      how an ER may be located in the networking stack of a node, while
      Tenant B's FTP server illustrates how the ER may be deployed as a
      network service. The latter approach enables SIIT-DC to support
      IPv4-only nodes/devices.

Author's Address

   Tore Anderson
   Redpill Linpro
   Vitaminveien 1A
   0485 Oslo
   Norway

   Phone: +47 959 31 212
   Email: tore@redpill-linpro.com
   URI:   http://www.redpill-linpro.com