IPv6 Operations                                              T. Anderson
Internet-Draft                                            Redpill Linpro
Intended status: Informational                          October 12, 2015
Expires: April 14, 2016


SIIT-DC: Stateless IP/ICMP Translation for IPv6 Data Centre Environments
                      draft-ietf-v6ops-siit-dc-03

Abstract

   This document describes the use of the Stateless IP/ICMP Translation
   (SIIT) algorithm in an IPv6 Internet Data Centre (IDC). In this
   deployment model, traffic from legacy IPv4-only clients on the
   Internet is translated to IPv6 upon reaching the IDC operator's
   network infrastructure. From that point on, it may be treated the
   same as traffic from native IPv6 end users. The IPv6 endpoints may
   be numbered using arbitrary (non-IPv4-translatable) IPv6 addresses.
   This facilitates a single-stack IPv6-only network infrastructure, as
   well as efficient utilisation of public IPv4 addresses.

   The primary audience is IDC operators who are deploying IPv6, running
   out of available IPv4 addresses, and/or feel that dual stack causes
   undesirable operational complexity.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on April 14, 2016.

Copyright Notice

   Copyright (c) 2015 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Single Stack IPv6 Operation
     1.2.  Stateless Operation
     1.3.  IPv4 Address Conservation
     1.4.  Clients' IPv4 Source Addresses Visible to Applications
     1.5.  Compatible with Standard IPv4 and IPv6 Stacks
   2.  Terminology
   3.  Architectural Overview
     3.1.  Packet Flow
   4.  Deployment Considerations and Guidelines
     4.1.  Application/Device Support for IPv6
     4.2.  Application Support for NAT
     4.3.  Application Communication Pattern
     4.4.  Choice of Translation Prefix
     4.5.  Routing Considerations
     4.6.  Location of the SIIT-DC Border Relays
     4.7.  Migration from Dual Stack
     4.8.  Translation of ICMPv6 Errors to IPv4
     4.9.  MTU and Fragmentation
       4.9.1.  IPv4/IPv6 Header Size Difference
       4.9.2.  IPv6 Atomic Fragments
       4.9.3.  Minimum Path MTU Difference Between IPv4 and IPv6
     4.10. IPv4-translatable IPv6 Service Addresses
   5.  Acknowledgements
   6.  IANA Considerations
   7.  Security Considerations
     7.1.  Mistaking the Translation Prefix for a Trusted Network
   8.  References
     8.1.  Normative References
     8.2.  Informative References
   Appendix A.  Complete SIIT-DC IDC topology example
   Author's Address

1.  Introduction

   Historically, dual stack [RFC4213] [RFC6883] has been the recommended
   way to transition from a legacy IPv4-only environment to one capable
   of serving IPv6 users. However, for IDC operators, dual stack
   operation has a number of disadvantages compared to single stack
   operation. In particular, running two protocols rather than one
   results in increased complexity and operational overhead, with little
   return on investment for as long as large parts of the public
   Internet remain predominantly IPv4-only. Furthermore, the dual
   stack approach does not in any way help with the depletion of the
   IPv4 address space, which at the time of writing is a pressing
   concern in most parts of the world.

   Therefore, some IDC operators may instead prefer an approach in which
   they only need to operate one protocol in the data centre as they
   prepare for the future. SIIT-DC is one such approach. Its design
   goals include:

   o  Promote the deployment of native IPv6 services (cf. [RFC6540]).

   o  Provide IPv4 service availability for legacy users with no loss of
      performance or functionality.

   o  Ensure that the legacy users' IPv4 addresses remain visible to
      the nodes and applications located in the IPv6 network.

   o  Conserve and maximise the utilisation of the operator's public
      IPv4 addresses.

   o  Avoid introducing more complexity than absolutely necessary,
      especially on the nodes and applications.

   o  Be easy to scale and deploy in a fault-tolerant manner.

   The following subsections elaborate on how SIIT-DC meets these
   goals.

1.1.  Single Stack IPv6 Operation

   SIIT-DC allows IDC operators to build their infrastructure and
   applications on an IPv6-only foundation. IPv4 end-user connectivity
   becomes a service provided by the network, which systems
   administration and application development staff do not need to
   concern themselves with. This promotes universal IPv6 deployment for
   the IDC operator's services and applications.

   SIIT-DC requires no special support or change from the underlying
   IPv6 infrastructure; it is compatible with all standard IPv6
   networks. Traffic between IPv6-enabled end users and IPv6-enabled
   services will always be transported natively end-to-end; SIIT-DC
   does not intercept or handle native IPv6 traffic at all.

   When the day comes to discontinue all support for IPv4, no change
   needs to be made to the overall architecture; it is only a matter of
   shutting off the SIIT-DC Border Relays (BRs). Operators who deploy
   native IPv6 along with SIIT-DC will thus avoid requiring any future
   migration or deployment projects relating to IPv6 deployment and/or
   IPv4 sun-setting.

1.2.  Stateless Operation

   Unlike other solutions that provide either dual stack availability to
   single-stack services (e.g., Stateful NAT64 [RFC6146] and Layer-4/7
   proxies), or that provide conservation of IPv4 addresses (e.g.,
   NAPT44 [RFC3022]), SIIT-DC does not maintain any state associated
   with individual connections or flows. In this sense it operates
   exactly like a regular IP router, and has similar scaling
   properties: the limiting factors are packets per second and
   bandwidth. The number of concurrent flows and flow initiation rates
   are irrelevant for performance.

   This not only allows individual BRs to easily attain "line rate"
   performance, it also allows for per-packet load balancing between
   multiple BRs using Equal-Cost Multipath Routing [RFC2991].
   Asymmetric routing is also acceptable, which makes it easy to avoid
   sub-optimal traffic patterns; the prefixes involved may be anycasted
   from all the BRs in the provider's network, thus ensuring that the
   optimal path through the network is used, even where the optimal
   path in one direction differs from the optimal path in the opposite
   direction.

   Finally, stateless operation means that high availability is easily
   achieved. If a BR should fail, its traffic can be re-routed onto
   another BR using a standard IP routing protocol. This does not
   impact existing flows any more than any other IP re-routing event
   would.

1.3.  IPv4 Address Conservation

   In most parts of the world, it is difficult or even impossible to
   obtain a generously sized IPv4 delegation from the Internet Numbers
   Registry System [RFC7020]. The resulting scarcity in turn impacts
   individual end users and operators, who might be forced to purchase
   IPv4 addresses from other operators in order to cover their needs.
   This process can be risky to business continuity, in case no
   suitable block for sale can be located, and/or turn out to be
   prohibitively expensive. In spite of this, an IDC operator will find
   that providing IPv4 service remains essential, as a large share of
   the Internet end users still do not have IPv6 connectivity.

   A key goal of SIIT-DC is to help reduce a data centre operator's IPv4
   address requirement to the absolute minimum, by allowing the operator
   to remove them entirely from nodes and applications that do not need
   to communicate with endpoints in the IPv4 Internet. One example
   would be servers that operate in a supporting/back-end role and
   only communicate with other servers (database servers, file servers,
   and so on). Another example would be the network infrastructure
   itself (router-to-router links, loopback addresses, and so on).
   Furthermore, as LAN prefix sizes must always be rounded up to the
   nearest power of two (or larger, if one reserves space for future
   growth), even more IPv4 addresses will often end up being wasted
   without even being used.

   With SIIT-DC, the operator can remove these valuable IPv4 addresses
   from his back-end servers and network infrastructure, and reassign
   them to the SIIT-DC service as IPv4 Service Addresses. There exists
   no requirement that IPv4 Service Addresses are assigned in an
   aggregated manner, so there is nothing lost due to infrastructure
   overhead; every single IPv4 address assigned to SIIT-DC can be used
   as an IPv4 Service Address.

1.4.  Clients' IPv4 Source Addresses Visible to Applications

   SIIT-DC uses the [RFC6052] algorithm to map the end user's entire
   IPv4 source address into a predefined IPv6 Translation Prefix. This
   ensures that there is no loss of information; the end user's IPv4
   source address remains available to the application located in the
   IPv6 network, allowing it to perform tasks like Geo-Location,
   logging, abuse handling, and so forth.

1.5.  Compatible with Standard IPv4 and IPv6 Stacks

   Except for the introduction of the BRs themselves, no change to the
   network, nodes, applications, or anything else is required in order
   to support SIIT-DC. SIIT-DC is practically invisible from the point
   of view of the IPv4 clients, the IPv6 nodes, the IPv6 data centre
   network, and the IPv4 Internet. SIIT-DC interoperates with all
   standards-compliant IPv4 or IPv6 stacks.

2.  Terminology

   This document makes use of the following terms:

   SIIT-DC Border Relay (BR)
      A device or a logical function that performs stateless protocol
      translation between IPv4 and IPv6. It MUST do so in accordance
      with [RFC6145] and [I-D.ietf-v6ops-siit-eam].

   SIIT-DC Edge Relay (ER)
      A device or logical function that provides "native" IPv4
      connectivity to IPv4-only devices or application software. It is
      very similar in function to a BR, but is typically located close
      to the IPv4-only component(s) it is supporting rather than on the
      IDC's outer network border. The ER is an optional component of
      SIIT-DC. It is discussed in more detail in
      [I-D.ietf-v6ops-siit-dc-2xlat].

   IPv4 Service Address
      An IPv4 address representing a node or service located in an IPv6
      network. It is coupled with an IPv6 Service Address using an EAM.
      Packets sent to this address are translated to IPv6 by the BR, and
      possibly back to IPv4 by an ER, before reaching the node or
      service.

   IPv4 Service Address Pool
      One or more IPv4 prefixes routed to the BR's IPv4 interface. IPv4
      Service Addresses are allocated from this pool. Note that this
      does not necessarily have to be a "pool" per se, as it could also
      be one or more host routes (whose prefix length is equal to /32).
      The purpose of using a pool rather than host routes is to
      facilitate IPv4 route aggregation and ease provisioning of new
      IPv4 Service Addresses.

   IPv6 Service Address
      An IPv6 address assigned to an application, node, or service,
      either directly or indirectly (through an ER). It is coupled with
      an IPv4 Service Address using an EAM. IPv4-only clients
      communicate with the IPv6 Service Address through SIIT-DC.

   Explicit Address Mapping (EAM)
      A bi-directional coupling between an IPv4 Service Address and an
      IPv6 Service Address configured in a BR or ER. When translating
      between IPv4 and IPv6, the BR/ER changes the address fields in the
      translated packet's IP header according to any matching EAM. The
      EAM algorithm is specified in [I-D.ietf-v6ops-siit-eam].

   Translation Prefix
      An IPv6 prefix into which the entire IPv4 address space is mapped,
      according to the algorithm in [RFC6052]. The Translation Prefix
      is routed to the BR's IPv6 interface. When translating between
      IPv4 and IPv6, a BR/ER will insert/remove the Translation Prefix
      into/from the address fields in the translated packet's IP header,
      unless an EAM exists for the IP address that is being translated.

   IPv4-translatable IPv6 addresses
      As defined in Section 1.3 of [RFC6052].

   IDC
      Short for "Internet Data Centre"; a data centre whose main purpose
      is to deliver services to the public Internet, which is the use
      case SIIT-DC primarily targets. IDCs are typically operated by
      Internet Content Providers or Managed Services Providers.

   SIIT
      The Stateless IP/ICMP Translation algorithm, as specified in
      [RFC6145].

   XLAT
      Short for "Translation". Used in figures to indicate where a
      BR/ER uses SIIT [RFC6145] to translate IPv4 packets to IPv6 and
      vice versa.

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

3.  Architectural Overview

   This section describes the basic SIIT-DC architecture.

                          SIIT-DC Architecture

        IPv6-capable user          IPv4-only user
        <2001:db8::ab:cd>          <203.0.113.50>
                |                         |
       (the IPv6 Internet)       (the IPv4 Internet)
                |                         |
                |    +-[BR]---------<192.0.2.0/24>--------------+
                |    |                                          |
                |    |   EAM #1: 192.0.2.1,2001:db8:12:34::1    |
                |    |   EAM #2..#n: [...]                      |
                |    |   XLAT Prefix: 2001:db8:46::/96          |
                |    |                                          |
                |    +------------<2001:db8:46::/96>------------+
                |                         |
            (the IPv6-only data centre network)
                |
                +--<2001:db8:12:34::1>--[v6-only server]-+
                |           |                            |
                | +-[2001:db8:12:34::1]--[v6-only app]-+ |
                | | AF_INET6 socket                    | |
                | +------------------------------------+ |
                +----------------------------------------+

                                Figure 1

   In Figure 1, 192.0.2.0/24 is the IPv4 Service Address Pool.
   Individual IPv4 Service Addresses are assigned from this prefix, and
   traffic destined for it is routed to the BR's IPv4-facing network
   interface. There are no restrictions on how many IPv4 Service
   Address Pools are used or their prefix length, as long as they are
   all routed to the BR's IPv4-facing network interface.

   When translating packets between IPv4 and IPv6, the BR uses the EAM
   to replace any occurrence of the IPv4 Service Address (192.0.2.1)
   with its corresponding IPv6 Service Address (2001:db8:12:34::1).
   Addresses that do not match any EAM configured in the BR are
   translated by inserting or removing the Translation Prefix
   (2001:db8:46::/96), cf. Section 2.2 of [RFC6052].

   The BR can be deployed as a separate device or as a logical function
   in another multi-purpose device, such as an IP router. Any number of
   BRs may exist simultaneously in the IDC's network infrastructure, as
   long as they are all configured with the same Translation Prefix and
   an identical EAM Table.

   The IPv6 Service Address should be registered in DNS using an "IN
   AAAA" record, while its corresponding IPv4 Service Address should be
   registered using an "IN A" record. This ensures that IPv6-capable
   clients access the application/service directly using native IPv6
   end-to-end, while IPv4-only clients will access it through SIIT-DC.

3.1.  Packet Flow

   In this example, the "IPv4-only user" from Figure 1 initiates a
   connection to the application running on the IPv6-only server. After
   first having looked up the "IN A" record in DNS, the user starts by
   transmitting a TCP SYN packet to the IPv4 Service Address. This
   IPv4 packet is routed to the BR, and is there translated to IPv6 as
   follows:

                        IPv4 to IPv6 translation

       +--[IPv4]----------+       +--[IPv6]-----------------------+
       | SRC 203.0.113.50 |       | SRC 2001:db8:46::203.0.113.50 |
       | DST 192.0.2.1    |  -->  | DST 2001:db8:12:34::1         |
       | TCP SYN [..]     |       | TCP SYN [..]                  |
       +------------------+       +-------------------------------+

                                Figure 2

   The resulting IPv6 packet is routed to the IPv6-only server, which
   processes and responds to it as if it had been a native IPv6 packet
   all along. The server's IPv6 response packet is then routed back to
   the BR, where it is translated back to IPv4 as follows:

                        IPv6 to IPv4 translation

       +--[IPv6]-----------------------+       +--[IPv4]----------+
       | SRC 2001:db8:12:34::1         |       | SRC 192.0.2.1    |
       | DST 2001:db8:46::203.0.113.50 |  -->  | DST 203.0.113.50 |
       | TCP SYN/ACK [..]              |       | TCP SYN/ACK [..] |
       +-------------------------------+       +------------------+

                                Figure 3

   It is important to note that neither the IPv4 client nor the IPv6
   server/application needs any special support to participate in
   SIIT-DC. However, the application may optionally be taught to
   extract the embedded IPv4 source address from incoming IPv6 packets
   with source addresses within the Translation Prefix. This will allow
   it to perform IPv4-specific tasks such as Geo-Location, logging,
   abuse handling, and so on.

4.  Deployment Considerations and Guidelines

4.1.  Application/Device Support for IPv6

   SIIT-DC as described in this document requires that the application
   (and/or the node the application is located on) supports IPv6
   networking, and that it has no dependency on local IPv4 network
   connectivity.

   SIIT-DC can however support legacy IPv4-dependent applications and
   nodes through the introduction of an ER. The ER provides the legacy
   application or node with seemingly native IPv4 Internet connectivity,
   so that it may operate correctly in an otherwise IPv6-only network
   environment. This approach is described in more detail in
   [I-D.ietf-v6ops-siit-dc-2xlat].

4.2.  Application Support for NAT

   The operator should carefully examine whether or not the application
   protocols he would like to use SIIT-DC with are able to operate in a
   network environment where rewriting of IP addresses occurs. In
   general, if an application layer protocol works correctly through
   standard NAT44 (see [RFC3235]), it will most likely work correctly
   through SIIT-DC as well.

   Higher-level protocols that embed IP addresses as part of their
   payload are particularly problematic [RFC2663] [RFC2993] [RFC3022].
   One well-known example of such a protocol is FTP [RFC0959]. Such
   protocols can be made to work with SIIT-DC through the introduction
   of an ER, which provides end-to-end IPv4 address transparency by
   reversing the translations performed by the BR before passing the
   packets to the NAT-incompatible application. This approach is
   described in more detail in [I-D.ietf-v6ops-siit-dc-2xlat].

4.3.  Application Communication Pattern

   SIIT-DC is best suited for traditional client/server applications
   where IPv4-only clients on the Internet initiate traffic towards an
   IPv6-only service, which in turn is passively listening for inbound
   traffic and responding as necessary. In this case, an IPv4 client
   looks exactly like a native IPv6 client from the IPv6 service's
   point of view, and thus does not require any special treatment. One
   particularly common application protocol that follows this
   client/server communication pattern, and thus is ideally suited for
   use with SIIT-DC, is HTTP [RFC7230].

   It is also possible to combine SIIT-DC with DNS64 [RFC6147] in order
   to allow an IPv6-only application to initiate communication with
   IPv4-only nodes through SIIT-DC. However, in this case, care must be
   taken so that all outgoing communication is sourced from an IPv6
   Service Address that is found in an EAM configured in the BR. If
   another address is used, the BR will most likely be unable to
   translate it to IPv4, causing the packet to be discarded. This could
   be prevented by altering the Default Address Selection Policy Table
   [RFC6724] on the IPv6 node.

   An alternative approach to the above would be to place an ER in front
   of the application in question, as described in
   [I-D.ietf-v6ops-siit-dc-2xlat]. This provides the application with
   seemingly native IPv4 connectivity, which it may use freely for
   bi-directional communication with the IPv4 Internet. An application
   or node located behind an ER does not need to worry about selecting a
   specific source address, as it will only have valid options
   available.

4.4.  Choice of Translation Prefix

   Either a Network-Specific Prefix (NSP) from the provider's own IPv6
   address space or the IANA-allocated Well-Known Prefix 64:ff9b::/96
   (WKP) may be used. From a technical point of view, both work equally
   well. However, only a single WKP exists, so if a provider would like
   to deploy more than one instance of SIIT-DC in his network, or
   another translation technology such as Stateful NAT64 [RFC6146], the
   operator will be forced to use an NSP for all but one of those
   deployments.
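
   To make the address handling concrete, the following Python sketch
   shows how addresses might be mapped with either an NSP or the WKP,
   using the example values from Figure 1. It is an illustration under
   stated assumptions, not the normative algorithm: the names
   EAM_TABLE, v4_to_v6 and v6_to_v4 are hypothetical, the EAM table
   holds single host addresses only (the EAM specification also allows
   prefixes), and only /96 Translation Prefixes are handled.

```python
import ipaddress

# Example values from Figure 1. The table layout and all names below are
# hypothetical; a real BR would read its EAMs and prefix from configuration.
TRANSLATION_PREFIX = ipaddress.IPv6Network("2001:db8:46::/96")  # an NSP
EAM_TABLE = {
    ipaddress.IPv4Address("192.0.2.1"): ipaddress.IPv6Address("2001:db8:12:34::1"),
}


def v4_to_v6(addr, prefix=TRANSLATION_PREFIX, eam=EAM_TABLE):
    """Map an IPv4 address to IPv6: EAM first, else embed it in the prefix."""
    addr = ipaddress.IPv4Address(addr)
    if addr in eam:
        return eam[addr]
    if prefix.prefixlen != 96:
        raise ValueError("this sketch only handles /96 prefixes")
    # With a /96 prefix the IPv4 address occupies the low 32 bits.
    return ipaddress.IPv6Address(int(prefix.network_address) | int(addr))


def v6_to_v4(addr, prefix=TRANSLATION_PREFIX, eam=EAM_TABLE):
    """Map an IPv6 address to IPv4, or return None if it is untranslatable."""
    addr = ipaddress.IPv6Address(addr)
    for v4, v6 in eam.items():
        if v6 == addr:
            return v4
    if prefix.prefixlen == 96 and addr in prefix:
        # Strip the first 96 bits, keeping the embedded IPv4 address.
        return ipaddress.IPv4Address(int(addr) & 0xFFFFFFFF)
    return None


if __name__ == "__main__":
    wkp = ipaddress.IPv6Network("64:ff9b::/96")
    print(v4_to_v6("203.0.113.50"))        # client address, via the NSP
    print(v4_to_v6("192.0.2.1"))           # service address, via the EAM
    print(v4_to_v6("203.0.113.50", wkp))   # same client, via the WKP
```

   Switching between an NSP and the WKP only changes the prefix
   argument in this sketch, which mirrors the observation that both
   work equally well from a technical point of view.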

   Another consideration is that the WKP cannot be used in inter-domain
   routing. By using an NSP instead, SIIT-DC will support a deployment
   where the BR and the IPv6 Service Address are located in different
   Autonomous Systems.

   The Translation Prefix may use any of the lengths described in
   Section 2.2 of [RFC6052], but /96 has two distinct advantages over
   the others. First, converting a translated address to IPv4 can be
   done in a single operation by simply stripping off the first 96
   bits; second, it allows for IPv4 addresses to be embedded directly
   into the text representation of an IPv6 address using the familiar
   dotted quad notation, e.g., "2001:db8::198.51.100.10" (cf.
   Section 2.4 of [RFC6052]), instead of being converted to hexadecimal
   notation. This makes it easier to write IPv6 ACLs and similar that
   match translated endpoints in the IPv4 Internet.

   For the reasons discussed above, this document recommends that an NSP
   with a prefix length of /96 is used. Section 3.3 of [RFC6052]
   discusses the choice of translation prefix in more detail.

4.5.  Routing Considerations

   The prefixes that constitute the IPv4 Service Address Pool and the
   IPv6 Translation Prefix may be routed to the BRs as any other IPv4 or
   IPv6 route in the provider's network. If more than one BR is being
   deployed, it is recommended that a routing protocol (IGP) is used to
   advertise the routes within the provider's network. This will ensure
   that the traffic that is to be translated will reach the closest BR,
   reducing or eliminating sub-optimal traffic patterns, as well as
   providing high availability: should one BR fail, the IGP will
   automatically redirect the traffic to the closest alternate BR.

4.6.  Location of the SIIT-DC Border Relays

   The goal of SIIT-DC is to facilitate a true IPv6-only application and
   network architecture, with the sole exception being the IPv4
   interfaces of the BRs and the network infrastructure required to
   connect the BRs to the IPv4 Internet. Therefore, the BRs must be
   located somewhere between the IPv4 Internet and the application
   delivery stack, which includes all servers, load balancers,
   firewalls, intrusion detection systems, and similar devices that are
   processing traffic to a greater extent than merely forwarding it.

   It is optimal to place the BRs as close as possible to the direct
   path between the location of the IPv6 Service Address and the end
   users. If the closest BR were located a long way from the direct
   path, all packets in both directions would have to make a detour in
   order to traverse the BR. This would increase the RTT between the
   service and the end user by two times the extra latency incurred by
   the detour, as well as cause unnecessary load on the network links
   on the detour path.

   Where possible, it is beneficial to implement the BRs as a logical
   function within the routers that would have handled the traffic
   anyway, had the topology been dual stacked. This way, a SIIT-DC
   deployment does not require separate network ports (which might
   become saturated and impact the service quality), nor will it
   require extra rack space and energy. Some particularly good choices
   of location could be within an IDC's access routers, or within the
   Autonomous System's border routers.

   Finally, another possibility is that the IDC operator outsources the
   SIIT-DC service to another entity, for example his upstream ISP.
   Doing so allows the IDC operator to build a true IPv6-only
   infrastructure.

4.7.  Migration from Dual Stack

   While this document mainly discusses the use of IPv6-only nodes and
   applications, it is important to note that SIIT-DC is fully
   compatible with dual stack infrastructures, including dual stack
   nodes and applications.

   Thus, migrating a dual-stacked service to an IPv6-only one where
   SIIT-DC provides the IPv4 Internet connectivity is easy. The
   operator would start out by designating the service's current native
   IPv6 address as the IPv6 Service Address, and assign it a
   corresponding IPv4 Service Address. At this point, the service will
   respond on both its old (native) IPv4 address and the SIIT-DC IPv4
   Service Address. The operator may now move traffic from the former
   to the latter by changing the service's "IN A" DNS record. Once all
   IPv4 traffic has been successfully moved to SIIT-DC, the old IPv4
   address may be reclaimed.

4.8.  Translation of ICMPv6 Errors to IPv4

   In response to an IPv4 packet subsequently translated to IPv6 by the
   BR, an IPv6 router in the IDC network may need to transmit an ICMPv6
   error back to the origin IPv4 node. By default, such an ICMPv6 error
   will most likely be discarded by the BR, unless the source address of
   the ICMPv6 error happens to be an IPv4-translatable IPv6 address or
   covered by an EAM.

   To facilitate reliable delivery of such ICMPv6 errors, an SIIT-DC
   operator SHOULD implement the recommendations in [RFC6791] in the
   BRs.

4.9.  MTU and Fragmentation

   There are some key differences between IPv4 and IPv6 relating to
   packet sizes and fragmentation that one MUST consider when deploying
   SIIT-DC. They result in a few problematic corner cases, which can be
   dealt with in a few different ways. The following subsections will
   discuss these in detail, and provide operational guidance.
587 In particular, the operator may find that relying on fragmentation in 588 the IPv6 domain is undesired or even operationally impossible 589 [I-D.taylor-v6ops-fragdrop]. For this reason, the recommendations in 590 this section seeks to minimise the use of IPv6 fragmentation. 592 Unless otherwise stated, the following subsections assume that the 593 MTU in both the IPv4 and IPv6 domains is 1500 bytes. 595 4.9.1. IPv4/IPv6 Header Size Difference 596 The IPv6 header is up to 20 bytes larger than the IPv4 header. This 597 means that a full-size 1500 bytes large IPv4 packet cannot be 598 translated to IPv6 without being fragmented, otherwise it would 599 likely have resulted in a 1520 bytes large IPv6 packet. 601 If the transport protocol used is TCP, this is generally not a 602 problem, the IPv6 node will advertise a TCP MSS of 1440 bytes during 603 the initial TCP handshake. This causes the IPv4 clients to never 604 send larger packets than what can be translated to a single full-size 605 IPv6 packet, eliminating any need for fragmentation. 607 For other transport protocols, full-size IPv4 packets with the DF 608 flag cleared will need to be fragmented by the BR. This may be 609 avoided by increasing the Path MTU between the BR and the IPv6 nodes 610 to 1520 bytes or greater. If this is done, the MTU on the IPv6 nodes 611 themselves SHOULD NOT be increased accordingly, as doing so would 612 cause them to undergo Path MTU Discovery for all destinations on the 613 IPv6 Internet. The nodes MUST however be able to accept and process 614 incoming packets larger than their own MTU. If the nodes' IPv6 615 implementation allows the initial Path MTU to be set differently for 616 specific destinations, it MAY be increased to 1520 for destinations 617 within the Translation Prefix specifically. 619 4.9.2. 
IPv6 Atomic Fragments

   In keeping with the fifth paragraph of Section 4 of [RFC6145], a
   stateless translator like a BR will by default add an IPv6
   Fragmentation header to the resulting IPv6 packet when translating an
   IPv4 packet with the Don't Fragment flag set to 0. This happens even
   though the resulting IPv6 packet isn't actually fragmented into
   several pieces, resulting in an IPv6 Atomic Fragment [RFC6946].
   These Atomic Fragments are generally not useful in an IDC
   environment, and it is therefore recommended that this behaviour be
   disabled in the BRs. To this end, Section 4 of [RFC6145] notes that
   the "translator MAY provide a configuration function that allows the
   translator not to include the Fragment Header for the non-fragmented
   IPv6 packets".

   Note that IPv6 Atomic Fragments are currently being deprecated by
   RFC6145bis [I-D.bao-v6ops-rfc6145bis]. As a result, a BR that
   conforms to the updated standard is required to behave as recommended
   above.

   In IPv6, the Identification value is located inside the Fragmentation
   header. That means that if the generation of IPv6 Atomic Fragments
   is disabled, the IPv4 Identification value will be lost during
   translation to IPv6. This could potentially confuse some diagnostic
   tools.

4.9.3. Minimum Path MTU Difference Between IPv4 and IPv6

   Section 5 of [RFC2460] specifies that the minimum IPv6 link MTU is
   1280 bytes. Therefore, an IPv6 node can reasonably assume that if it
   transmits an IPv6 packet that is 1280 bytes or smaller, it is
   guaranteed to reach its destination without requiring fragmentation
   or invoking the Path MTU Discovery algorithm [RFC1981]. However,
   this assumption might prove false if the destination is an IPv4 node
   reached through a protocol translator such as a BR, as the minimum
   IPv4 link MTU is 68 bytes. See Section 3.2 of [RFC0791].
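
   The packet-size arithmetic used in Sections 4.9.1 and 4.9.3 can be
   illustrated with a short, non-normative Python sketch. It assumes an
   IPv4 header without options (20 bytes), the fixed 40-byte IPv6 header
   without extension headers, and a TCP header without options. All
   constant and function names are hypothetical and chosen for
   illustration only; they do not correspond to any implementation.

```python
# Non-normative sketch of the size arithmetic in Section 4.9.
# Assumptions: no IPv4 options, no IPv6 extension headers, no TCP options.
IPV4_HEADER = 20   # bytes
IPV6_HEADER = 40   # bytes
TCP_HEADER = 20    # bytes
HEADER_DIFF = IPV6_HEADER - IPV4_HEADER  # 20 bytes


def size_after_v4_to_v6(ipv4_packet_size):
    """Size of the IPv6 packet resulting from translating an IPv4 packet."""
    return ipv4_packet_size + HEADER_DIFF


def size_after_v6_to_v4(ipv6_packet_size):
    """Size of the IPv4 packet resulting from translating an IPv6 packet."""
    return ipv6_packet_size - HEADER_DIFF


def tcp_mss(mtu, ip_header):
    """TCP MSS a node would advertise for the given MTU and IP header size."""
    return mtu - ip_header - TCP_HEADER


def fits_ipv4_path(ipv6_packet_size, ipv4_path_mtu):
    """True if the translated packet fits the IPv4 path without fragmentation."""
    return size_after_v6_to_v4(ipv6_packet_size) <= ipv4_path_mtu


if __name__ == "__main__":
    print(size_after_v4_to_v6(1500))    # full-size IPv4 packet after translation
    print(tcp_mss(1500, IPV6_HEADER))   # MSS advertised by the IPv6 node
    print(fits_ipv4_path(1280, 576))    # minimum-size IPv6 MTU vs. small IPv4 link
```

   Under these assumptions, an IPv4 client honouring the advertised MSS
   sends IPv4 packets of at most 1480 bytes, which translate to IPv6
   packets of at most 1500 bytes.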

   Section 5.1 of [RFC6145] specifies that a stateless translator should
   set the IPv4 Don't Fragment flag to 1 when it translates a
   non-fragmented IPv6 packet to IPv4. This means that when the path to
   the destination IPv4 node contains an IPv4 link with an MTU smaller
   than 1260 bytes (which corresponds to an IPv6 MTU smaller than 1280
   bytes, cf. Section 4.9.1), the Path MTU Discovery algorithm will be
   invoked, even if the original IPv6 packet was only 1280 bytes in
   size. This happens because the IPv4 router connected to the IPv4
   link with the small MTU returns an ICMPv4 Need To Fragment error
   with an MTU value smaller than 1260. The BR in turn translates this
   to an ICMPv6 Packet Too Big error with an MTU value smaller than
   1280, which is then transmitted to the origin IPv6 node.

   When an IPv6 node receives an ICMPv6 Packet Too Big error indicating
   an MTU value smaller than 1280, the last paragraph of Section 5 of
   [RFC2460] gives it two choices on how to proceed:

   o It may reduce its Path MTU value to the value indicated in the
     Packet Too Big, i.e., limit the size of subsequent packets
     transmitted to that destination to the indicated value. This
     approach causes no problems for the SIIT-DC function, as it simply
     allows Path MTU Discovery to work transparently across the BR.

   o It may reduce its Path MTU value to exactly 1280, and in addition
     include a Fragmentation header in subsequent packets sent to that
     destination. In other words, the IPv6 node will start emitting
     Atomic Fragments. The Fragmentation header signals to the BR
     that the Don't Fragment flag should be set to 0 in the resulting
     IPv4 packet, and it also provides the Identification value.
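
   The two choices listed above can be sketched as follows. This is a
   minimal, non-normative Python illustration. It simplifies the MTU
   translation to the 20-byte header-size adjustment discussed in
   Section 4.9.1 and ignores the additional next-hop MTU bounds that
   [RFC6145] applies. The function names and the boolean parameter are
   hypothetical.

```python
# Non-normative sketch: an IPv6 node reacting to a translated
# ICMPv6 Packet Too Big error. Names are illustrative assumptions.
IPV6_MIN_MTU = 1280
HEADER_DIFF = 20  # 40-byte IPv6 header minus 20-byte option-less IPv4 header


def ptb_mtu_from_icmpv4(icmpv4_mtu):
    """MTU placed in the ICMPv6 Packet Too Big error (simplified: +20 only)."""
    return icmpv4_mtu + HEADER_DIFF


def node_reaction(ptb_mtu, emits_atomic_fragments):
    """Return (new Path MTU, whether later packets carry a Fragmentation header).

    emits_atomic_fragments selects the second choice in the list above;
    otherwise the first choice is modelled.
    """
    if ptb_mtu >= IPV6_MIN_MTU:
        # Ordinary Path MTU Discovery; neither special case applies.
        return ptb_mtu, False
    if emits_atomic_fragments:
        # Second choice: stay at 1280 and add a Fragmentation header.
        return IPV6_MIN_MTU, True
    # First choice: honour the indicated value, even though it is below 1280.
    return ptb_mtu, False


if __name__ == "__main__":
    mtu = ptb_mtu_from_icmpv4(1000)    # IPv4 link with a 1000-byte MTU
    print(mtu)
    print(node_reaction(mtu, False))   # first choice
    print(node_reaction(mtu, True))    # second choice
```

   An IPv4 link MTU of exactly 1260 bytes maps to an indicated MTU of
   1280 in this sketch, so neither special case is triggered.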

   If the use of the IPv6 Fragmentation header is problematic, and the
   operator has IPv6 nodes that implement the second option above, the
   operator should consider enabling the functionality described as the
   "second approach" in Section 6 of [RFC6145]. This functionality
   changes the BR's behaviour as follows:

   o When translating ICMPv4 Need To Fragment to ICMPv6 Packet Too Big,
     the resulting packet will never contain an MTU value lower than
     1280. This prevents the IPv6 nodes from generating Atomic
     Fragments.

   o When translating IPv6 packets smaller than or equal to 1280 bytes,
     the Don't Fragment flag in the resulting IPv4 packet will be set
     to 0. This ensures that if the path contains an IPv4 link with an
     MTU smaller than 1260, the IPv4 router connected to that link will
     have the responsibility to fragment the packet before forwarding
     it towards its destination.

   In summary, this approach could be seen as prompting the IPv4
   protocol itself to provide the "link-specific fragmentation and
   reassembly at a layer below IPv6" required for links that "cannot
   convey a 1280-octet packet in one piece", to paraphrase Section 5 of
   [RFC2460].

   Note that IPv6 Atomic Fragments are currently being deprecated by
   RFC6145bis [I-D.bao-v6ops-rfc6145bis]. As a result, a BR that
   conforms to the updated standard is required to behave as suggested
   above.

4.10. IPv4-translatable IPv6 Service Addresses

   SIIT-DC is designed so that the IPv6 Service Addresses are not
   required to be IPv4-translatable IPv6 addresses. Section 2 of
   [I-D.ietf-v6ops-siit-eam] discusses why it is desirable to avoid
   requiring the use of IPv4-translatable IPv6 addresses.

   It is however quite possible to deploy SIIT-DC in combination with
   IPv4-translatable IPv6 Service Addresses.
   The primary benefits in doing so are:

   o The operator is not required to provision EAMs for
     IPv4-translatable IPv6 Service Addresses onto the BR/ERs.

   o [RFC6145] translation can be performed in a checksum-neutral
     manner, cf. Section 4.1 of [RFC6052].

   The trade-off is that the IPv4-translatable IPv6 Service Addresses
   must be configured on the IPv6 nodes, and the applications must be
   set up to use them - likely in addition to their primary
   (non-IPv4-translatable) IPv6 addresses. The IPv4-translatable IPv6
   Service Addresses must also be routed from the BR through the IDC's
   IPv6 network infrastructure to the nodes on which they are assigned.
   This essentially requires the entire IPv6 infrastructure to be made
   aware of and handle translated IPv4 traffic as a special case, which
   significantly increases complexity. As previously described in
   Section 1.1, avoiding such drawbacks is a design goal of SIIT-DC.
   The use of IPv4-translatable IPv6 Service Addresses is therefore
   discouraged.

5. Acknowledgements

   The author would like to thank the following individuals for their
   contributions, suggestions, corrections, and criticisms: Fred Baker,
   Cameron Byrne, Brian E Carpenter, Ross Chandler, Tobias Gondrom,
   Christer Holmberg, Dagfinn Ilmari Mannsaaker, Lars Olafsen, Stig
   Sandbeck Mathisen, Knut A. Syed, Qin Wu, Andrew Yourtchenko.

6. IANA Considerations

   This draft makes no request of the IANA.

7. Security Considerations

7.1.
Mistaking the Translation Prefix for a Trusted Network

   If a Network-Specific Prefix from the provider's own address space is
   chosen for the translation prefix, as recommended in Section 4.4,
   care MUST be taken if the translation service is used in front of
   services that have application-level ACLs that distinguish between
   the operator's own networks and the Internet at large. Traffic
   from translated IPv4 end users on the Internet might otherwise
   appear to originate from the provider's own network. It is therefore
   important that the translation prefix is treated the same as the
   Internet at large, rather than as a trusted network.

   In order to alleviate this problem, the operator may opt to use a
   Translation Prefix that is distinct from and not a subset of the IPv6
   prefixes used elsewhere in the network infrastructure.

8. References

8.1. Normative References

   [I-D.ietf-v6ops-siit-eam]
              Anderson, T. and A. Leiva, "Explicit Address Mappings for
              Stateless IP/ICMP Translation",
              draft-ietf-v6ops-siit-eam-01 (work in progress), June
              2015.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC6052]  Bao, C., Huitema, C., Bagnulo, M., Boucadair, M., and X.
              Li, "IPv6 Addressing of IPv4/IPv6 Translators", RFC 6052,
              DOI 10.17487/RFC6052, October 2010.

   [RFC6145]  Li, X., Bao, C., and F. Baker, "IP/ICMP Translation
              Algorithm", RFC 6145, DOI 10.17487/RFC6145, April 2011.

   [RFC6791]  Li, X., Bao, C., Wing, D., Vaithianathan, R., and G.
              Huston, "Stateless Source Address Mapping for ICMPv6
              Packets", RFC 6791, DOI 10.17487/RFC6791, November 2012.

8.2. Informative References

   [I-D.bao-v6ops-rfc6145bis]
              Bao, C., Li, X., Baker, F., Anderson, T., and F.
              Gont, "IP/ICMP Translation Algorithm (rfc6145bis)",
              draft-bao-v6ops-rfc6145bis-02 (work in progress), October
              2015.

   [I-D.ietf-v6ops-siit-dc-2xlat]
              Anderson, T. and S. Steffann, "SIIT-DC: Dual Translation
              Mode", draft-ietf-v6ops-siit-dc-2xlat-01 (work in
              progress), June 2015.

   [I-D.taylor-v6ops-fragdrop]
              Jaeggli, J., Colitti, L., Kumari, W., Vyncke, E., Kaeo,
              M., and T. Taylor, "Why Operators Filter Fragments and
              What It Implies", draft-taylor-v6ops-fragdrop-02 (work in
              progress), December 2013.

   [RFC0791]  Postel, J., "Internet Protocol", STD 5, RFC 791,
              DOI 10.17487/RFC0791, September 1981.

   [RFC0959]  Postel, J. and J. Reynolds, "File Transfer Protocol",
              STD 9, RFC 959, DOI 10.17487/RFC0959, October 1985.

   [RFC1981]  McCann, J., Deering, S., and J. Mogul, "Path MTU Discovery
              for IP version 6", RFC 1981, DOI 10.17487/RFC1981, August
              1996.

   [RFC2460]  Deering, S. and R. Hinden, "Internet Protocol, Version 6
              (IPv6) Specification", RFC 2460, DOI 10.17487/RFC2460,
              December 1998.

   [RFC2663]  Srisuresh, P. and M. Holdrege, "IP Network Address
              Translator (NAT) Terminology and Considerations",
              RFC 2663, DOI 10.17487/RFC2663, August 1999.

   [RFC2991]  Thaler, D. and C. Hopps, "Multipath Issues in Unicast and
              Multicast Next-Hop Selection", RFC 2991,
              DOI 10.17487/RFC2991, November 2000.

   [RFC2993]  Hain, T., "Architectural Implications of NAT", RFC 2993,
              DOI 10.17487/RFC2993, November 2000.

   [RFC3022]  Srisuresh, P. and K. Egevang, "Traditional IP Network
              Address Translator (Traditional NAT)", RFC 3022,
              DOI 10.17487/RFC3022, January 2001.

   [RFC3235]  Senie, D., "Network Address Translator (NAT)-Friendly
              Application Design Guidelines", RFC 3235,
              DOI 10.17487/RFC3235, January 2002.

   [RFC4213]  Nordmark, E. and R.
              Gilligan, "Basic Transition Mechanisms for IPv6 Hosts and
              Routers", RFC 4213, DOI 10.17487/RFC4213, October 2005.

   [RFC6146]  Bagnulo, M., Matthews, P., and I. van Beijnum, "Stateful
              NAT64: Network Address and Protocol Translation from IPv6
              Clients to IPv4 Servers", RFC 6146, DOI 10.17487/RFC6146,
              April 2011.

   [RFC6147]  Bagnulo, M., Sullivan, A., Matthews, P., and I. van
              Beijnum, "DNS64: DNS Extensions for Network Address
              Translation from IPv6 Clients to IPv4 Servers", RFC 6147,
              DOI 10.17487/RFC6147, April 2011.

   [RFC6540]  George, W., Donley, C., Liljenstolpe, C., and L. Howard,
              "IPv6 Support Required for All IP-Capable Nodes", BCP 177,
              RFC 6540, DOI 10.17487/RFC6540, April 2012.

   [RFC6724]  Thaler, D., Ed., Draves, R., Matsumoto, A., and T. Chown,
              "Default Address Selection for Internet Protocol Version 6
              (IPv6)", RFC 6724, DOI 10.17487/RFC6724, September 2012.

   [RFC6883]  Carpenter, B. and S. Jiang, "IPv6 Guidance for Internet
              Content Providers and Application Service Providers",
              RFC 6883, DOI 10.17487/RFC6883, March 2013.

   [RFC6946]  Gont, F., "Processing of IPv6 "Atomic" Fragments",
              RFC 6946, DOI 10.17487/RFC6946, May 2013.

   [RFC7020]  Housley, R., Curran, J., Huston, G., and D. Conrad, "The
              Internet Numbers Registry System", RFC 7020,
              DOI 10.17487/RFC7020, August 2013.

   [RFC7230]  Fielding, R., Ed. and J. Reschke, Ed., "Hypertext Transfer
              Protocol (HTTP/1.1): Message Syntax and Routing",
              RFC 7230, DOI 10.17487/RFC7230, June 2014.

Appendix A. Complete SIIT-DC IDC topology example

   Figure 4 attempts to "tie it all together" and show a more complete
   SIIT-DC topology, in order to better demonstrate the advantageous
   properties discussed in Section 1. These are discussed in more
   detail below.

                     Example SIIT-DC IDC topology

            /--------------------------------\     /---------------\
            |          IPv4 Internet         |     | IPv6 Internet |
            \-+----------------------------+-/     \--------+------/
              |                            |                |
              | <----------[BGP]---------> |              [BGP]
              |                            |                |
    +-------<192.0.2.0/24>---------+ +---<192.0.2.0/24>---+ |
    |            BR #1             | |       BR #2        | |
    | EAM Table:                   | |                    | |
    | ==========                   | |                    | |
    | 192.0.2.1,2001:db8:12:34::1  | |                    | |
    | 192.0.2.2,2001:db8:12:34::2  | |  Exactly the same  | |
    | 192.0.2.3,2001:db8:fe:dc::1  | |  configuration as  | |
    | 192.0.2.4,2001:db8:12:34::4  | |     BR #1 has      | |
    | 192.0.2.5,2001:db8:fe:dc::e  | |                    | |
    |                              | |                    | |
    | XLAT Prefix 2001:db8:46::/96 | |                    | |
    |                              | |                    | |
    +--------<2001:db8:46::/96>----+ +-<2001:db8:46::/96>-+ |
                     |                      |               |
                     | <------[ECMP]------> |               |
                     |                      |               |
   /-----------------+----------------------+--\            |
   |         IPv6 IDC network w/OSPFv3         +------------/
   \-+--------------------------------+--------/
     |                                |
     | Tenant A's server LAN          | Tenant B's server LAN
     | 2001:db8:12:34::/64            | 2001:db8:fe:dc::/64
     |                                |
     +-- www   ::1 (IPv6+SIIT-DC)     +-- www-lb ::1 (IPv6+SIIT-DC)
     |                                |
     +-- mta   ::2 (IPv6+SIIT-DC)     +-- web    ::80:01 (IPv6-only)
     |                                |   [...]
     +-- ftp   ::3 (IPv6)             +-- web    ::80:99 (IPv6-only)
     |         ::4 (IPv4, via ER)     |
     |                                |         +----+
     +-- app01 ::a:01 (IPv6-only)     \---- ::e | ER | --\
     |   [...]                                  +----+   |
     +-- app99 ::a:99 (IPv6-only)                        |
     |                                  ftp 192.0.2.5 ---/
     +-- db01  ::d:01 (IPv6-only)
     |   [..]
     \-- db99  ::d:99 (IPv6-only)

                               Figure 4

   Single Stack IPv6 Operation
      As discussed in Section 1.1, SIIT-DC facilitates an IPv6-only IDC
      network infrastructure. The only places where IPv4 is absolutely
      required are between the BRs and the IPv4 Internet, and between
      any ERs and the IPv4-only applications or devices they are serving
      (illustrated here as the two tenants' FTP servers).
      The figure also illustrates how SIIT-DC does not interfere with
      native IPv6; when there is no longer a need to support IPv4
      clients, the BRs may be decommissioned without causing any impact
      to native IPv6 traffic.

   Stateless Operation
      As discussed in Section 1.2, SIIT-DC operates in a stateless
      fashion. In the illustration, both BRs are simultaneously
      advertising (i.e., anycasting) the IPv4 Service Address Pool and
      the IPv6 Translation Prefix, so incoming traffic from the IPv4
      Internet may arrive at either of the BRs, while outgoing IPv6
      traffic destined for IPv4 endpoints is load balanced between them
      using Equal-Cost Multipath Routing. No continuous state
      synchronisation between the two BRs occurs. Should one of the BRs
      fail, the BGP and OSPF protocols will ensure that traffic
      converges on the remaining BR. Existing sessions will not be
      disrupted, beyond any disruption caused by the BGP/OSPF
      convergence process itself.

   IPv4 Address Conservation
      As discussed in Section 1.3, SIIT-DC conserves the IDC operator's
      IPv4 address space. Even though the two customers in the example
      above have several hundred servers, the majority of them are not
      used to run services made available directly from the Internet,
      and therefore do not need to consume IPv4 addresses. The IDC
      network infrastructure consumes no IPv4 addresses, either.
      Finally, the IPv4 addresses that are assigned to the SIIT-DC
      function as IPv4 Service Address Pools may be assigned with 100%
      efficiency, one address at a time; there is no requirement to
      assign multiple addresses to a single customer in a contiguous
      block.

   Application support
      As discussed in Section 1.5, as long as the application protocol
      is translation-friendly (illustrated here with HTTP and SMTP), it
      will work with SIIT-DC without requiring any special adaptation.
      Furthermore, translation-unfriendly applications (illustrated here
      with FTP) will also work when located behind an ER
      [I-D.ietf-v6ops-siit-dc-2xlat]. Tenant A's FTP server illustrates
      how an ER may be located in the networking stack of a node, while
      Tenant B's FTP server illustrates how the ER may be deployed as a
      network service. The latter approach enables SIIT-DC to support
      IPv4-only nodes/devices.

Author's Address

   Tore Anderson
   Redpill Linpro
   Vitaminveien 1A
   0485 Oslo
   Norway

   Phone: +47 959 31 212
   Email: tore@redpill-linpro.com
   URI:   http://www.redpill-linpro.com